Structure in the Enron Email Dataset


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2005-10

AUTHORS

P. S. Keila, D. B. Skillicorn

ABSTRACT

We investigate the structures present in the Enron email dataset using singular value decomposition and semidiscrete decomposition. Using word frequency profiles, we show that messages fall into two distinct groups, whose extrema are characterized by short messages and rare words versus long messages and common words. It is surprising that length of message and word use pattern should be related in this way. We also investigate relationships among individuals based on their patterns of word use in email. We show that word use is correlated to function within the organization, as expected. Lastly, we show that relative changes to individuals' word usage over time can be used to identify key players in major company events.

PAGES

183-199

References to SciGraph publications

  • 2005. Beyond Keyword Filtering for Message and Conversation Detection in INTELLIGENCE AND SECURITY INFORMATICS

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10588-005-5379-y

    DOI

    http://dx.doi.org/10.1007/s10588-005-5379-y

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1052987434



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1701", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Psychology", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/17", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Psychology and Cognitive Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Queen's University", 
              "id": "https://www.grid.ac/institutes/grid.410356.5", 
              "name": [
                "School of Computing, Queen's University, K7L 3N6, Kingston, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Keila", 
            "givenName": "P. S.", 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Queen's University", 
              "id": "https://www.grid.ac/institutes/grid.410356.5", 
              "name": [
                "School of Computing, Queen's University, K7L 3N6, Kingston, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Skillicorn", 
            "givenName": "D. B.", 
            "id": "sg:person.015501033775.06", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015501033775.06"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1145/291128.291131", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028464808"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11427995_19", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041837329", 
              "https://doi.org/10.1007/11427995_19"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/pan/mph004", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059968923"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2005-10", 
        "datePublishedReg": "2005-10-01", 
        "description": "We investigate the structures present in the Enron email dataset using singular value decomposition and semidiscrete decomposition. Using word frequency profiles, we show that messages fall into two distinct groups, whose extrema are characterized by short messages and rare words versus long messages and common words. It is surprising that length of message and word use pattern should be related in this way. We also investigate relationships among individuals based on their patterns of word use in email. We show that word use is correlated to function within the organization, as expected. Lastly, we show that relative changes to individuals' word usage over time can be used to identify key players in major company events.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1007/s10588-005-5379-y", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": [
          {
            "id": "sg:journal.1048813", 
            "issn": [
              "1381-298X", 
              "1572-9346"
            ], 
            "name": "Computational and Mathematical Organization Theory", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "3", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "11"
          }
        ], 
        "name": "Structure in the Enron Email Dataset", 
        "pagination": "183-199", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "fc64fe195514e8f5faf86fbba63c3919785d318d6384b8ae3ee16a6a517ef307"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10588-005-5379-y"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1052987434"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10588-005-5379-y", 
          "https://app.dimensions.ai/details/publication/pub.1052987434"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-11T12:44", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000363_0000000363/records_70068_00000002.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "http://link.springer.com/10.1007%2Fs10588-005-5379-y"
      }
    ]
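Because the record above is ordinary JSON, its fields can be read with any JSON parser. The following is a minimal sketch in Python, assuming a trimmed copy of the record is embedded as a string. The helper name `summarize_record` is hypothetical and is not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only a few fields are kept for illustration.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Keila", "givenName": "P. S.", "type": "Person"},
      {"familyName": "Skillicorn", "givenName": "D. B.", "type": "Person"}
    ],
    "datePublished": "2005-10",
    "name": "Structure in the Enron Email Dataset",
    "pagination": "183-199",
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/s10588-005-5379-y"]},
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1052987434"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def summarize_record(record):
    """Return title, author names, and DOI from one SciGraph-style record dict.

    Hypothetical helper for illustration only.
    """
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # productId is a list of name/value pairs; pick the entry named "doi", if any.
    doi = next(
        (p["value"][0] for p in record.get("productId", []) if p.get("name") == "doi"),
        None,
    )
    return {"title": record.get("name"), "authors": authors, "doi": doi}


records = json.loads(RECORD_JSON)  # the top level is a list of records
summary = summarize_record(records[0])
print(summary)
```

Looking the DOI up by its `name` key, rather than by position in `productId`, keeps the sketch independent of the order in which the identifiers are listed.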
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10588-005-5379-y'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10588-005-5379-y'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10588-005-5379-y'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10588-005-5379-y'
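The four curl commands above differ only in the `Accept` header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same idea, using only the standard library, might look like this. The URL and media types are copied from the commands above; the names `ACCEPT_TYPES`, `build_request`, and `fetch` are hypothetical, and whether the endpoint still responds is not verified here.

```python
import urllib.request

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s10588-005-5379-y"

# Media types taken from the curl commands above, keyed by a short label
# (the labels themselves are an assumption made for this sketch).
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=10):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to inspect.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Calling `fetch("json-ld")` would perform the equivalent of the first curl command. It is kept separate from `build_request` so the header logic can be checked offline.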


     

    This table displays all metadata directly associated to this object as RDF triples.

    77 TRIPLES      21 PREDICATES      30 URIs      19 LITERALS      7 BLANK NODES

    #  Subject  Predicate  Object  (a blank subject or predicate repeats the one from the row above)
    1 sg:pub.10.1007/s10588-005-5379-y schema:about anzsrc-for:17
    2 anzsrc-for:1701
    3 schema:author Nbd2e696172c64e92983bec91b513211e
    4 schema:citation sg:pub.10.1007/11427995_19
    5 https://doi.org/10.1093/pan/mph004
    6 https://doi.org/10.1145/291128.291131
    7 schema:datePublished 2005-10
    8 schema:datePublishedReg 2005-10-01
    9 schema:description We investigate the structures present in the Enron email dataset using singular value decomposition and semidiscrete decomposition. Using word frequency profiles, we show that messages fall into two distinct groups, whose extrema are characterized by short messages and rare words versus long messages and common words. It is surprising that length of message and word use pattern should be related in this way. We also investigate relationships among individuals based on their patterns of word use in email. We show that word use is correlated to function within the organization, as expected. Lastly, we show that relative changes to individuals' word usage over time can be used to identify key players in major company events.
    10 schema:genre research_article
    11 schema:inLanguage en
    12 schema:isAccessibleForFree false
    13 schema:isPartOf N3bb8a987f5bc49e98c799f813ef82d19
    14 N5e3f167837184b8f94dd85478d50b668
    15 sg:journal.1048813
    16 schema:name Structure in the Enron Email Dataset
    17 schema:pagination 183-199
    18 schema:productId N2f9a56e20e4e4b3fb04f8b4f0ac758ac
    19 Na0522b489b1341cd972f78ca89dd2f76
    20 Nfc825709801443fd90a05ff8ed5e03df
    21 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052987434
    22 https://doi.org/10.1007/s10588-005-5379-y
    23 schema:sdDatePublished 2019-04-11T12:44
    24 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    25 schema:sdPublisher N9bf7478df19b464994bfe81d67e344b2
    26 schema:url http://link.springer.com/10.1007%2Fs10588-005-5379-y
    27 sgo:license sg:explorer/license/
    28 sgo:sdDataset articles
    29 rdf:type schema:ScholarlyArticle
    30 N0b58bade2e86422c828a5263acf157d1 rdf:first sg:person.015501033775.06
    31 rdf:rest rdf:nil
    32 N2f9a56e20e4e4b3fb04f8b4f0ac758ac schema:name doi
    33 schema:value 10.1007/s10588-005-5379-y
    34 rdf:type schema:PropertyValue
    35 N3bb8a987f5bc49e98c799f813ef82d19 schema:issueNumber 3
    36 rdf:type schema:PublicationIssue
    37 N5e3f167837184b8f94dd85478d50b668 schema:volumeNumber 11
    38 rdf:type schema:PublicationVolume
    39 N9bf7478df19b464994bfe81d67e344b2 schema:name Springer Nature - SN SciGraph project
    40 rdf:type schema:Organization
    41 Na0522b489b1341cd972f78ca89dd2f76 schema:name dimensions_id
    42 schema:value pub.1052987434
    43 rdf:type schema:PropertyValue
    44 Nbd2e696172c64e92983bec91b513211e rdf:first Nd2261f0614314ba1ae7b0b312974019d
    45 rdf:rest N0b58bade2e86422c828a5263acf157d1
    46 Nd2261f0614314ba1ae7b0b312974019d schema:affiliation https://www.grid.ac/institutes/grid.410356.5
    47 schema:familyName Keila
    48 schema:givenName P. S.
    49 rdf:type schema:Person
    50 Nfc825709801443fd90a05ff8ed5e03df schema:name readcube_id
    51 schema:value fc64fe195514e8f5faf86fbba63c3919785d318d6384b8ae3ee16a6a517ef307
    52 rdf:type schema:PropertyValue
    53 anzsrc-for:17 schema:inDefinedTermSet anzsrc-for:
    54 schema:name Psychology and Cognitive Sciences
    55 rdf:type schema:DefinedTerm
    56 anzsrc-for:1701 schema:inDefinedTermSet anzsrc-for:
    57 schema:name Psychology
    58 rdf:type schema:DefinedTerm
    59 sg:journal.1048813 schema:issn 1381-298X
    60 1572-9346
    61 schema:name Computational and Mathematical Organization Theory
    62 rdf:type schema:Periodical
    63 sg:person.015501033775.06 schema:affiliation https://www.grid.ac/institutes/grid.410356.5
    64 schema:familyName Skillicorn
    65 schema:givenName D. B.
    66 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015501033775.06
    67 rdf:type schema:Person
    68 sg:pub.10.1007/11427995_19 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041837329
    69 https://doi.org/10.1007/11427995_19
    70 rdf:type schema:CreativeWork
    71 https://doi.org/10.1093/pan/mph004 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059968923
    72 rdf:type schema:CreativeWork
    73 https://doi.org/10.1145/291128.291131 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028464808
    74 rdf:type schema:CreativeWork
    75 https://www.grid.ac/institutes/grid.410356.5 schema:alternateName Queen's University
    76 schema:name School of Computing, Queen's University, K7L 3N6, Kingston, Canada
    77 rdf:type schema:Organization
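The summary counts shown above the table (triples, predicates, and so on) are simple aggregates over the rows. As a rough sketch, assuming a handful of rows from the table are written out as (subject, predicate, object) tuples with omitted subjects and predicates filled in from the preceding row, such counts could be computed as follows. The function name `summarize` is hypothetical, and the sample covers only 8 rows, so its counts are not the full-table figures.

```python
# A few rows from the table above, with the prefixed names kept exactly as shown.
TRIPLES = [
    ("sg:pub.10.1007/s10588-005-5379-y", "schema:about", "anzsrc-for:17"),
    ("sg:pub.10.1007/s10588-005-5379-y", "schema:about", "anzsrc-for:1701"),
    ("sg:pub.10.1007/s10588-005-5379-y", "schema:datePublished", "2005-10"),
    ("sg:pub.10.1007/s10588-005-5379-y", "schema:pagination", "183-199"),
    ("sg:pub.10.1007/s10588-005-5379-y", "rdf:type", "schema:ScholarlyArticle"),
    ("sg:journal.1048813", "schema:issn", "1381-298X"),
    ("sg:journal.1048813", "schema:issn", "1572-9346"),
    ("sg:journal.1048813", "rdf:type", "schema:Periodical"),
]


def summarize(triples):
    """Count triples, distinct predicates, and distinct subjects in a triple list."""
    predicates = {p for _, p, _ in triples}
    subjects = {s for s, _, _ in triples}
    return {
        "triples": len(triples),
        "predicates": len(predicates),
        "subjects": len(subjects),
    }


counts = summarize(TRIPLES)
print(counts)
```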
     



