Applying Time Series for Background User Identification Based on Their Text Data Analysis


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2018-09

AUTHORS

V. Yu. Korolev, A. Yu. Korchagin, I. V. Mashechkin, M. I. Petrovskii, D. V. Tsarev

ABSTRACT

An approach to user identification is presented that is based on deviations in the topic trends of a user’s work with text information. The approach involves topic analysis of the user’s past behavior when working with text content of various categories (including confidential ones) and a forecast of their future behavior. Topic analysis here means determining the principal topics of the user’s text content and calculating the weight of each topic at given instants. Deviations of the user’s observed behavior from the forecast are then used to identify the user. Within this approach, our original time series forecasting method based on orthogonal non-negative matrix factorization (ONMF) is proposed. Note that ONMF has not previously been used for time series forecasting problems. Experiments on real-world corporate email drawn from the Enron data set showed the proposed user identification approach to be applicable.
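The identification step described in the abstract (compare observed topic weights against a forecast and treat large deviations as a sign of a different user) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the moving-average forecast is only a stand-in for the ONMF-based forecasting method, which this record does not describe, and the function names, the Euclidean distance, the threshold, and the sample weights are hypothetical.

```python
import math

def forecast_next(history, window=3):
    """Placeholder forecast: mean of the last `window` topic-weight vectors.
    Stand-in only; NOT the ONMF-based method named in the abstract."""
    recent = history[-window:]
    n_topics = len(recent[0])
    return [sum(vec[k] for vec in recent) / len(recent) for k in range(n_topics)]

def deviation(forecast, observed):
    """Euclidean distance between forecast and observed topic weights."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)))

def is_same_user(history, observed, threshold=0.2):
    """Accept the user when observed behavior stays close to the forecast."""
    return deviation(forecast_next(history), observed) <= threshold

# Hypothetical weights of three topics at four consecutive instants.
history = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
    [0.7, 0.2, 0.1],
]
typical = [0.6, 0.3, 0.1]   # close to the recent trend
unusual = [0.1, 0.1, 0.8]   # dominated by a rarely used topic

print(is_same_user(history, typical))
print(is_same_user(history, unusual))
```

In practice the threshold would have to be tuned on held-out data for each user; the value here is arbitrary.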

PAGES

353-362

References to SciGraph publications

  • 2008. Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds in INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING – IDEAL 2008
  • 1999-10. Learning the parts of objects by non-negative matrix factorization in NATURE
  • 2011-11. Automatic text summarization using latent semantic analysis in PROGRAMMING AND COMPUTER SOFTWARE
  • 2015-01. Applying text mining methods for data loss prevention in PROGRAMMING AND COMPUTER SOFTWARE

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1134/s0361768818050055

    DOI

    http://dx.doi.org/10.1134/s0361768818050055

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1107186640



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information Systems", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Moscow State University", 
              "id": "https://www.grid.ac/institutes/grid.14476.30", 
              "name": [
                "Faculty of Computational Mathematics and Cybernetics, Lomonosov State University, 119991, Moscow, Russia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Korolev", 
            "givenName": "V. Yu.", 
            "id": "sg:person.014166423003.73", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014166423003.73"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Moscow State University", 
              "id": "https://www.grid.ac/institutes/grid.14476.30", 
              "name": [
                "Faculty of Computational Mathematics and Cybernetics, Lomonosov State University, 119991, Moscow, Russia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Korchagin", 
            "givenName": "A. Yu.", 
            "id": "sg:person.010170417462.26", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010170417462.26"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Moscow State University", 
              "id": "https://www.grid.ac/institutes/grid.14476.30", 
              "name": [
                "Faculty of Computational Mathematics and Cybernetics, Lomonosov State University, 119991, Moscow, Russia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Mashechkin", 
            "givenName": "I. V.", 
            "id": "sg:person.012306073373.52", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012306073373.52"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Moscow State University", 
              "id": "https://www.grid.ac/institutes/grid.14476.30", 
              "name": [
                "Faculty of Computational Mathematics and Cybernetics, Lomonosov State University, 119991, Moscow, Russia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Petrovskii", 
            "givenName": "M. I.", 
            "id": "sg:person.011760557107.63", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011760557107.63"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Moscow State University", 
              "id": "https://www.grid.ac/institutes/grid.14476.30", 
              "name": [
                "Faculty of Computational Mathematics and Cybernetics, Lomonosov State University, 119991, Moscow, Russia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Tsarev", 
            "givenName": "D. V.", 
            "id": "sg:person.07501646545.34", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07501646545.34"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1134/s0361768815010041", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000862321", 
              "https://doi.org/10.1134/s0361768815010041"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-88906-9_18", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1024582924", 
              "https://doi.org/10.1007/978-3-540-88906-9_18"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/17.6.520", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1024880743"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1134/s0361768811060041", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035808583", 
              "https://doi.org/10.1134/s0361768811060041"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.csda.2006.11.006", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1046812635"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/44565", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052721759", 
              "https://doi.org/10.1038/44565"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/his.2011.6122102", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1094443487"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1017/cbo9780511809071", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1098672059"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2018-09", 
        "datePublishedReg": "2018-09-01", 
        "description": "An approach to user identification based on deviations of their topic trends in operation with text information is presented. An approach is proposed to solve this problem; the approach implies topic analysis of the user\u2019s past trends (behavior) in operation with text content of various (including confidential) categories and forecast of their future behavior. The topic analysis of user\u2019s operation implies determining the principal topics of their text content and calculating their respective weights at the given instants. Deviations in the behavior in the user\u2019s operation with the content from the forecast are used to identify this user. In the framework of this approach, our own original time series forecasting method is proposed based on orthogonal non-negative matrix factorization (ONMF). Note that ONMF has not been used to solve time series forecasting problems before. The experimental research held on the example of real-world corporate emailing formed out of the Enron data set showed the proposed user identification approach to be applicable.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1134/s0361768818050055", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": [
          {
            "id": "sg:journal.1136711", 
            "issn": [
              "0361-7688", 
              "1608-3261"
            ], 
            "name": "Programming and Computer Software", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "5", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "44"
          }
        ], 
        "name": "Applying Time Series for Background User Identification Based on Their Text Data Analysis", 
        "pagination": "353-362", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "c3c6d13be91b254b78b15f44b89784e16124b170b7ca062dd74b3a875844fc8f"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1134/s0361768818050055"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1107186640"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1134/s0361768818050055", 
          "https://app.dimensions.ai/details/publication/pub.1107186640"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-10T13:22", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8659_00000536.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "http://link.springer.com/10.1134%2FS0361768818050055"
      }
    ]
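Once a record like this is loaded, the author names and cited DOIs can be read from it with the standard library alone. The sketch below is an assumption-laden illustration: `RECORD` is a heavily trimmed copy of the record (two authors, three citation entries, one of them repeated deliberately to exercise the de-duplication), and the helper names `author_names` and `citation_dois` are hypothetical.

```python
import json

# Trimmed copy of the record: only the fields this sketch reads.
RECORD = json.loads("""
[
  {
    "id": "sg:pub.10.1134/s0361768818050055",
    "author": [
      {"familyName": "Korolev", "givenName": "V. Yu."},
      {"familyName": "Korchagin", "givenName": "A. Yu."}
    ],
    "citation": [
      {"id": "sg:pub.10.1038/44565",
       "sameAs": ["https://app.dimensions.ai/details/publication/pub.1052721759",
                  "https://doi.org/10.1038/44565"]},
      {"id": "sg:pub.10.1038/44565",
       "sameAs": ["https://app.dimensions.ai/details/publication/pub.1052721759",
                  "https://doi.org/10.1038/44565"]},
      {"id": "https://doi.org/10.1017/cbo9780511809071",
       "sameAs": ["https://app.dimensions.ai/details/publication/pub.1098672059"]}
    ]
  }
]
""")

DOI_PREFIX = "https://doi.org/"

def author_names(article):
    """Return 'given family' strings in the order the record lists them."""
    return [f'{a["givenName"]} {a["familyName"]}' for a in article.get("author", [])]

def citation_dois(article):
    """Return cited DOIs in first-seen order, without repeats."""
    seen, dois = set(), []
    for cited in article.get("citation", []):
        # The DOI link is either the id itself or one of the sameAs links.
        for uri in [cited.get("id", "")] + cited.get("sameAs", []):
            if uri.startswith(DOI_PREFIX):
                doi = uri[len(DOI_PREFIX):]
                if doi not in seen:
                    seen.add(doi)
                    dois.append(doi)
                break
    return dois

article = RECORD[0]
print(author_names(article))
print(citation_dois(article))
```

Keying the de-duplication on the DOI rather than on the `id` field means an entry identified by a `sg:pub.` id and one identified by a `https://doi.org/` id collapse to the same citation.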
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1134/s0361768818050055'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1134/s0361768818050055'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1134/s0361768818050055'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1134/s0361768818050055'
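The same content negotiation can be done from Python with the standard library. This is a sketch under stated assumptions: the URL and media types are copied from the curl commands above, the short format labels and the function names `build_request` and `fetch` are hypothetical, and whether the endpoint still responds is not something this record confirms.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1134/s0361768818050055"

# Media types taken from the curl commands above; the keys are illustrative labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}

def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})

def fetch(fmt, url=RECORD_URL, timeout=10):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")

# Inspect the request without touching the network.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Calling `fetch("json-ld")` would perform the actual download; it is kept separate from `build_request` so the headers can be checked offline.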


     

    This table displays all metadata directly associated with this object as RDF triples.

    117 TRIPLES      21 PREDICATES      35 URIs      19 LITERALS      7 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1134/s0361768818050055 schema:about anzsrc-for:08
    2 anzsrc-for:0806
    3 schema:author Nf71f9ac9f9f5448e99c539e76905cceb
    4 schema:citation sg:pub.10.1007/978-3-540-88906-9_18
    5 sg:pub.10.1038/44565
    6 sg:pub.10.1134/s0361768811060041
    7 sg:pub.10.1134/s0361768815010041
    8 https://doi.org/10.1016/j.csda.2006.11.006
    9 https://doi.org/10.1017/cbo9780511809071
    10 https://doi.org/10.1093/bioinformatics/17.6.520
    11 https://doi.org/10.1109/his.2011.6122102
    12 schema:datePublished 2018-09
    13 schema:datePublishedReg 2018-09-01
    14 schema:description An approach to user identification based on deviations of their topic trends in operation with text information is presented. An approach is proposed to solve this problem; the approach implies topic analysis of the user’s past trends (behavior) in operation with text content of various (including confidential) categories and forecast of their future behavior. The topic analysis of user’s operation implies determining the principal topics of their text content and calculating their respective weights at the given instants. Deviations in the behavior in the user’s operation with the content from the forecast are used to identify this user. In the framework of this approach, our own original time series forecasting method is proposed based on orthogonal non-negative matrix factorization (ONMF). Note that ONMF has not been used to solve time series forecasting problems before. The experimental research held on the example of real-world corporate emailing formed out of the Enron data set showed the proposed user identification approach to be applicable.
    15 schema:genre research_article
    16 schema:inLanguage en
    17 schema:isAccessibleForFree false
    18 schema:isPartOf N40c93912a108408a8a779f8c9ac46012
    19 Nd7cbdaa78eea4329b3cd4f7892feb1b4
    20 sg:journal.1136711
    21 schema:name Applying Time Series for Background User Identification Based on Their Text Data Analysis
    22 schema:pagination 353-362
    23 schema:productId N1f2d4fda83c840e09512274b74070047
    24 N458efba2872a43aaa0e907762256207c
    25 Na734a33f1a9a4b47a377c22e1fa14895
    26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107186640
    27 https://doi.org/10.1134/s0361768818050055
    28 schema:sdDatePublished 2019-04-10T13:22
    29 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    30 schema:sdPublisher N675de3f255d04c04a51604d9b27c06a1
    31 schema:url http://link.springer.com/10.1134%2FS0361768818050055
    32 sgo:license sg:explorer/license/
    33 sgo:sdDataset articles
    34 rdf:type schema:ScholarlyArticle
    35 N1f2d4fda83c840e09512274b74070047 schema:name readcube_id
    36 schema:value c3c6d13be91b254b78b15f44b89784e16124b170b7ca062dd74b3a875844fc8f
    37 rdf:type schema:PropertyValue
    38 N363f8a9ecd0f4d01a82ca293d46fd5f0 rdf:first sg:person.011760557107.63
    39 rdf:rest N63ae57b23e93412e8808f69f3715148f
    40 N40c93912a108408a8a779f8c9ac46012 schema:issueNumber 5
    41 rdf:type schema:PublicationIssue
    42 N458efba2872a43aaa0e907762256207c schema:name doi
    43 schema:value 10.1134/s0361768818050055
    44 rdf:type schema:PropertyValue
    45 N63ae57b23e93412e8808f69f3715148f rdf:first sg:person.07501646545.34
    46 rdf:rest rdf:nil
    47 N675de3f255d04c04a51604d9b27c06a1 schema:name Springer Nature - SN SciGraph project
    48 rdf:type schema:Organization
    49 N8e58296ebe434c009ee0fdb56508233e rdf:first sg:person.010170417462.26
    50 rdf:rest N8ec87926404d498b8322cb8d0b2a4f6c
    51 N8ec87926404d498b8322cb8d0b2a4f6c rdf:first sg:person.012306073373.52
    52 rdf:rest N363f8a9ecd0f4d01a82ca293d46fd5f0
    53 Na734a33f1a9a4b47a377c22e1fa14895 schema:name dimensions_id
    54 schema:value pub.1107186640
    55 rdf:type schema:PropertyValue
    56 Nd7cbdaa78eea4329b3cd4f7892feb1b4 schema:volumeNumber 44
    57 rdf:type schema:PublicationVolume
    58 Nf71f9ac9f9f5448e99c539e76905cceb rdf:first sg:person.014166423003.73
    59 rdf:rest N8e58296ebe434c009ee0fdb56508233e
    60 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    61 schema:name Information and Computing Sciences
    62 rdf:type schema:DefinedTerm
    63 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
    64 schema:name Information Systems
    65 rdf:type schema:DefinedTerm
    66 sg:journal.1136711 schema:issn 0361-7688
    67 1608-3261
    68 schema:name Programming and Computer Software
    69 rdf:type schema:Periodical
    70 sg:person.010170417462.26 schema:affiliation https://www.grid.ac/institutes/grid.14476.30
    71 schema:familyName Korchagin
    72 schema:givenName A. Yu.
    73 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010170417462.26
    74 rdf:type schema:Person
    75 sg:person.011760557107.63 schema:affiliation https://www.grid.ac/institutes/grid.14476.30
    76 schema:familyName Petrovskii
    77 schema:givenName M. I.
    78 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011760557107.63
    79 rdf:type schema:Person
    80 sg:person.012306073373.52 schema:affiliation https://www.grid.ac/institutes/grid.14476.30
    81 schema:familyName Mashechkin
    82 schema:givenName I. V.
    83 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012306073373.52
    84 rdf:type schema:Person
    85 sg:person.014166423003.73 schema:affiliation https://www.grid.ac/institutes/grid.14476.30
    86 schema:familyName Korolev
    87 schema:givenName V. Yu.
    88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014166423003.73
    89 rdf:type schema:Person
    90 sg:person.07501646545.34 schema:affiliation https://www.grid.ac/institutes/grid.14476.30
    91 schema:familyName Tsarev
    92 schema:givenName D. V.
    93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07501646545.34
    94 rdf:type schema:Person
    95 sg:pub.10.1007/978-3-540-88906-9_18 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024582924
    96 https://doi.org/10.1007/978-3-540-88906-9_18
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1038/44565 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052721759
    99 https://doi.org/10.1038/44565
    100 rdf:type schema:CreativeWork
    101 sg:pub.10.1134/s0361768811060041 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035808583
    102 https://doi.org/10.1134/s0361768811060041
    103 rdf:type schema:CreativeWork
    104 sg:pub.10.1134/s0361768815010041 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000862321
    105 https://doi.org/10.1134/s0361768815010041
    106 rdf:type schema:CreativeWork
    107 https://doi.org/10.1016/j.csda.2006.11.006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046812635
    108 rdf:type schema:CreativeWork
    109 https://doi.org/10.1017/cbo9780511809071 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098672059
    110 rdf:type schema:CreativeWork
    111 https://doi.org/10.1093/bioinformatics/17.6.520 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024880743
    112 rdf:type schema:CreativeWork
    113 https://doi.org/10.1109/his.2011.6122102 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094443487
    114 rdf:type schema:CreativeWork
    115 https://www.grid.ac/institutes/grid.14476.30 schema:alternateName Moscow State University
    116 schema:name Faculty of Computational Mathematics and Cybernetics, Lomonosov State University, 119991, Moscow, Russia
    117 rdf:type schema:Organization
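In the table, the author order is encoded as an RDF collection: `schema:author` points at a blank node, and each blank node carries an `rdf:first` (the person) and an `rdf:rest` (the next node, ending in `rdf:nil`). The sketch below walks that chain to recover the ordered author list. The triples are copied from the table rows; the tuple representation and the function name `rdf_list` are assumptions made for illustration, not part of any SciGraph tooling.

```python
RDF_NIL = "rdf:nil"
PUB = "sg:pub.10.1134/s0361768818050055"

# (subject, predicate, object) triples copied from the table above.
TRIPLES = [
    (PUB, "schema:author", "Nf71f9ac9f9f5448e99c539e76905cceb"),
    ("Nf71f9ac9f9f5448e99c539e76905cceb", "rdf:first", "sg:person.014166423003.73"),
    ("Nf71f9ac9f9f5448e99c539e76905cceb", "rdf:rest", "N8e58296ebe434c009ee0fdb56508233e"),
    ("N8e58296ebe434c009ee0fdb56508233e", "rdf:first", "sg:person.010170417462.26"),
    ("N8e58296ebe434c009ee0fdb56508233e", "rdf:rest", "N8ec87926404d498b8322cb8d0b2a4f6c"),
    ("N8ec87926404d498b8322cb8d0b2a4f6c", "rdf:first", "sg:person.012306073373.52"),
    ("N8ec87926404d498b8322cb8d0b2a4f6c", "rdf:rest", "N363f8a9ecd0f4d01a82ca293d46fd5f0"),
    ("N363f8a9ecd0f4d01a82ca293d46fd5f0", "rdf:first", "sg:person.011760557107.63"),
    ("N363f8a9ecd0f4d01a82ca293d46fd5f0", "rdf:rest", "N63ae57b23e93412e8808f69f3715148f"),
    ("N63ae57b23e93412e8808f69f3715148f", "rdf:first", "sg:person.07501646545.34"),
    ("N63ae57b23e93412e8808f69f3715148f", "rdf:rest", RDF_NIL),
    ("sg:person.014166423003.73", "schema:familyName", "Korolev"),
    ("sg:person.010170417462.26", "schema:familyName", "Korchagin"),
    ("sg:person.012306073373.52", "schema:familyName", "Mashechkin"),
    ("sg:person.011760557107.63", "schema:familyName", "Petrovskii"),
    ("sg:person.07501646545.34", "schema:familyName", "Tsarev"),
]

def rdf_list(triples, head):
    """Follow rdf:first / rdf:rest from `head` until rdf:nil; return the items in order."""
    index = {(s, p): o for s, p, o in triples}
    items, node, visited = [], head, set()
    while node != RDF_NIL:
        if node in visited:
            raise ValueError("cycle in RDF list")
        visited.add(node)
        items.append(index[(node, "rdf:first")])
        node = index[(node, "rdf:rest")]
    return items

index = {(s, p): o for s, p, o in TRIPLES}
authors = rdf_list(TRIPLES, index[(PUB, "schema:author")])
family_names = [index[(person, "schema:familyName")] for person in authors]
print(family_names)
```

The recovered order matches the AUTHORS line at the top of the record, which a plain set of `schema:author` triples could not guarantee.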
     



