Development and validation of a classification approach for extracting severity automatically from electronic health records


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2015-04-06

AUTHORS

Mary Regina Boland, Nicholas P Tatonetti, George Hripcsak

ABSTRACT

BACKGROUND: Electronic Health Records (EHRs) contain a wealth of information useful for studying clinical phenotype-genotype relationships. Severity is important for distinguishing among phenotypes; however, other severity indices classify patient-level severity (e.g., mild vs. acute dermatitis) rather than phenotype-level severity (e.g., acne vs. myocardial infarction). Phenotype-level severity is independent of the individual patient's state and is relative to other phenotypes. Further, phenotype-level severity does not change based on the individual patient. For example, acne is mild at the phenotype level and relative to other phenotypes. Therefore, a given patient may have a severe form of acne (this is the patient-level severity), but this does not affect its overall designation as a mild phenotype at the phenotype level.

METHODS: We present a method for classifying severity at the phenotype level that uses the Systematized Nomenclature of Medicine - Clinical Terms. Our method is called the Classification Approach for Extracting Severity Automatically from Electronic Health Records (CAESAR). CAESAR combines multiple severity measures: number of comorbidities, medications, procedures, cost, treatment time, and a proportional index term. CAESAR employs a random forest algorithm and these severity measures to discriminate between severe and mild phenotypes.

RESULTS: Using a random forest algorithm and these severity measures as input, CAESAR differentiates between severe and mild phenotypes (sensitivity = 91.67, specificity = 77.78) when compared to a manually evaluated reference standard (k = 0.716).

CONCLUSIONS: CAESAR enables researchers to measure phenotype severity from EHRs to identify phenotypes that are important for comparative effectiveness research.
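
For readers less familiar with the metrics quoted under RESULTS, the standard textbook definitions are sketched below. Two points are assumptions rather than statements from the abstract: that "severe" is treated as the positive class (so TP and FN count severe phenotypes in the reference standard, TN and FP count mild ones), and that k denotes Cohen's kappa, with p_o the observed agreement and p_e the agreement expected by chance. The abstract does not name the statistic, and the sensitivity and specificity values appear to be percentages.

```latex
\[
\text{sensitivity} = \frac{TP}{TP + FN},
\qquad
\text{specificity} = \frac{TN}{TN + FP},
\qquad
\kappa = \frac{p_o - p_e}{1 - p_e}
\]
```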

PAGES

14

References to SciGraph publications

  • 2013-12-27. Fast time intervals mining using the transitivity of temporal relations in KNOWLEDGE AND INFORMATION SYSTEMS
  • 2013-12-01. Mining the ultimate phenome repository in NATURE BIOTECHNOLOGY

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8

    DOI

    http://dx.doi.org/10.1186/s13326-015-0010-8

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1017511191

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/25848530



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Observational Health Data Sciences and Informatics (OHDSI), Columbia University, 622 West 168th Street, PH-20, New York, NY USA", 
              "id": "http://www.grid.ac/institutes/grid.21729.3f", 
              "name": [
                "Department of Biomedical Informatics, Columbia University, New York, NY USA", 
                "Observational Health Data Sciences and Informatics (OHDSI), Columbia University, 622 West 168th Street, PH-20, New York, NY USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Boland", 
            "givenName": "Mary Regina", 
            "id": "sg:person.0770233234.36", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0770233234.36"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Medicine, Columbia University, New York, NY USA", 
              "id": "http://www.grid.ac/institutes/grid.21729.3f", 
              "name": [
                "Department of Biomedical Informatics, Columbia University, New York, NY USA", 
                "Observational Health Data Sciences and Informatics (OHDSI), Columbia University, 622 West 168th Street, PH-20, New York, NY USA", 
                "Department of Systems Biology, Columbia University, New York, NY USA", 
                "Department of Medicine, Columbia University, New York, NY USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Tatonetti", 
            "givenName": "Nicholas P", 
            "id": "sg:person.0651210417.29", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0651210417.29"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Observational Health Data Sciences and Informatics (OHDSI), Columbia University, 622 West 168th Street, PH-20, New York, NY USA", 
              "id": "http://www.grid.ac/institutes/grid.21729.3f", 
              "name": [
                "Department of Biomedical Informatics, Columbia University, New York, NY USA", 
                "Observational Health Data Sciences and Informatics (OHDSI), Columbia University, 622 West 168th Street, PH-20, New York, NY USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Hripcsak", 
            "givenName": "George", 
            "id": "sg:person.01304304572.48", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01304304572.48"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/s10115-013-0707-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047035835", 
              "https://doi.org/10.1007/s10115-013-0707-x"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.2757", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041461250", 
              "https://doi.org/10.1038/nbt.2757"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2015-04-06", 
        "datePublishedReg": "2015-04-06", 
        "description": "BACKGROUND: Electronic Health Records (EHRs) contain a wealth of information useful for studying clinical phenotype-genotype relationships. Severity is important for distinguishing among phenotypes; however other severity indices classify patient-level severity (e.g., mild vs. acute dermatitis) rather than phenotype-level severity (e.g., acne vs. myocardial infarction). Phenotype-level severity is independent of the individual patient's state and is relative to other phenotypes. Further, phenotype-level severity does not change based on the individual patient. For example, acne is mild at the phenotype-level and relative to other phenotypes. Therefore, a given patient may have a severe form of acne (this is the patient-level severity), but this does not effect its overall designation as a mild phenotype at the phenotype-level.\nMETHODS: We present a method for classifying severity at the phenotype-level that uses the Systemized Nomenclature of Medicine - Clinical Terms. Our method is called the Classification Approach for Extracting Severity Automatically from Electronic Health Records (CAESAR). CAESAR combines multiple severity measures - number of comorbidities, medications, procedures, cost, treatment time, and a proportional index term. CAESAR employs a random forest algorithm and these severity measures to discriminate between severe and mild phenotypes.\nRESULTS: Using a random forest algorithm and these severity measures as input, CAESAR differentiates between severe and mild phenotypes (sensitivity\u2009=\u200991.67, specificity\u2009=\u200977.78) when compared to a manually evaluated reference standard (k\u2009=\u20090.716).\nCONCLUSIONS: CAESAR enables researchers to measure phenotype severity from EHRs to identify phenotypes that are important for comparative effectiveness research.", 
        "genre": "article", 
        "id": "sg:pub.10.1186/s13326-015-0010-8", 
        "inLanguage": "en", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.3801821", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2545469", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1043573", 
            "issn": [
              "2041-1480"
            ], 
            "name": "Journal of Biomedical Semantics", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "6"
          }
        ], 
        "keywords": [
          "electronic health records", 
          "health records", 
          "mild phenotype", 
          "severity measures", 
          "comparative effectiveness research", 
          "individual patients", 
          "severe form", 
          "severity", 
          "Medicine Clinical Terms", 
          "Severity Index", 
          "individual patient's state", 
          "patient's state", 
          "phenotype severity", 
          "patients", 
          "effectiveness research", 
          "acne", 
          "reference standard", 
          "multiple severities", 
          "phenotype", 
          "phenotype-genotype relationship", 
          "Systemized Nomenclature", 
          "treatment time", 
          "comorbidities", 
          "medications", 
          "records", 
          "measures", 
          "index", 
          "index terms", 
          "procedure", 
          "differentiates", 
          "wealth of information", 
          "development", 
          "relationship", 
          "nomenclature", 
          "time", 
          "standards", 
          "method", 
          "random forest algorithm", 
          "validation", 
          "designation", 
          "approach", 
          "forest algorithm", 
          "information", 
          "state", 
          "terms", 
          "research", 
          "form", 
          "cost", 
          "researchers", 
          "input", 
          "classification approach", 
          "wealth", 
          "example", 
          "algorithm", 
          "Caesar", 
          "clinical phenotype-genotype relationships", 
          "patient-level severity", 
          "phenotype-level severity", 
          "overall designation", 
          "Extracting Severity", 
          "proportional index term", 
          "CAESAR differentiates", 
          "evaluated reference standard"
        ], 
        "name": "Development and validation of a classification approach for extracting severity automatically from electronic health records", 
        "pagination": "14", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1017511191"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1186/s13326-015-0010-8"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "25848530"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1186/s13326-015-0010-8", 
          "https://app.dimensions.ai/details/publication/pub.1017511191"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2021-12-01T19:32", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20211201/entities/gbq_results/article/article_651.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1186/s13326-015-0010-8"
      }
    ]
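
    Because the record is plain JSON, it can be read with a standard JSON parser without any JSON-LD tooling. The sketch below is illustrative only: `RECORD_JSON` is a hand-trimmed copy of a few fields from the record above, and `summarize` is a hypothetical helper name, not part of any SciGraph library. It does not expand the `@context`, so keys are used exactly as they appear in the record.

```python
import json

# Hand-trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1186/s13326-015-0010-8",
    "name": "Development and validation of a classification approach for extracting severity automatically from electronic health records",
    "datePublished": "2015-04-06",
    "author": [
      {"familyName": "Boland", "givenName": "Mary Regina", "type": "Person"},
      {"familyName": "Tatonetti", "givenName": "Nicholas P", "type": "Person"},
      {"familyName": "Hripcsak", "givenName": "George", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1017511191"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/s13326-015-0010-8"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["25848530"]}
    ]
  }
]
"""


def summarize(record: dict) -> dict:
    """Collect the title, date, author names, and identifiers of one record."""
    authors = [
        f"{person['givenName']} {person['familyName']}"
        for person in record.get("author", [])
    ]
    # Each productId entry carries a name and a list of values; keep the first value.
    identifiers = {
        entry["name"]: entry["value"][0] for entry in record.get("productId", [])
    }
    return {
        "title": record["name"],
        "date": record["datePublished"],
        "authors": authors,
        "identifiers": identifiers,
    }


# The top-level JSON value is a list holding a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary["authors"])
print(summary["identifiers"]["doi"])
```

    Working with the keys directly is enough for a quick lookup; a full JSON-LD processor would only be needed to resolve the keys against the `@context` vocabulary.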
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8'
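
    To illustrate why a line-based format suits batch work, the sketch below tallies predicates by reading one statement per line. `SAMPLE_NT` is hand-written for illustration from values shown in this record (it is not captured server output), and it assumes the `sg:` and `schema:` prefixes expand to `http://scigraph.springernature.com/` and `http://schema.org/`. `count_predicates` is a hypothetical helper, and the whitespace split is a shortcut that works because subjects and predicates never contain spaces; it is not a full N-Triples parser.

```python
from collections import Counter

# Hand-written sample statements, one triple per line, ending with " ."
SAMPLE_NT = """\
<http://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8> <http://schema.org/pagination> "14" .
<http://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8> <http://schema.org/genre> "article" .
<http://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8> <http://schema.org/keywords> "acne" .
<http://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8> <http://schema.org/keywords> "random forest algorithm" .
"""


def count_predicates(nt_text: str) -> Counter:
    """Count how many statements use each predicate IRI."""
    counts = Counter()
    for line in nt_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Subject and predicate contain no whitespace, so two splits isolate them;
        # the remainder holds the object and the closing dot.
        _subject, predicate, _rest = line.split(None, 2)
        counts[predicate.strip("<>")] += 1
    return counts


predicate_counts = count_predicates(SAMPLE_NT)
for predicate, n in predicate_counts.most_common():
    print(n, predicate)
```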

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8'
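
    The four curl commands above differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The URL and media types are taken from the commands above; the short format keys, `build_request`, and `fetch` are hypothetical names chosen for this example. Whether the endpoint still responds is not something this page confirms, so `fetch` is defined but not called here.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/s13326-015-0010-8"

# Media types copied from the curl examples; the dictionary keys are illustrative.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> Request:
    """Build a GET request asking for the record in the given RDF serialization."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT)}")
    return Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 30.0) -> str:
    """Send the request and return the response body decoded as UTF-8 text."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request performs no network access, so it is safe to inspect.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

    Keeping request construction separate from the network call makes the header logic easy to check offline; calling `fetch("json-ld")` would perform the actual download.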


     

    This table displays all metadata directly associated to this object as RDF triples.

    154 TRIPLES      22 PREDICATES      91 URIs      81 LITERALS      7 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1186/s13326-015-0010-8 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author Nc7285fa09f7b42c59b2257379195bd1f
    4 schema:citation sg:pub.10.1007/s10115-013-0707-x
    5 sg:pub.10.1038/nbt.2757
    6 schema:datePublished 2015-04-06
    7 schema:datePublishedReg 2015-04-06
    8 schema:description BACKGROUND: Electronic Health Records (EHRs) contain a wealth of information useful for studying clinical phenotype-genotype relationships. Severity is important for distinguishing among phenotypes; however other severity indices classify patient-level severity (e.g., mild vs. acute dermatitis) rather than phenotype-level severity (e.g., acne vs. myocardial infarction). Phenotype-level severity is independent of the individual patient's state and is relative to other phenotypes. Further, phenotype-level severity does not change based on the individual patient. For example, acne is mild at the phenotype-level and relative to other phenotypes. Therefore, a given patient may have a severe form of acne (this is the patient-level severity), but this does not effect its overall designation as a mild phenotype at the phenotype-level. METHODS: We present a method for classifying severity at the phenotype-level that uses the Systemized Nomenclature of Medicine - Clinical Terms. Our method is called the Classification Approach for Extracting Severity Automatically from Electronic Health Records (CAESAR). CAESAR combines multiple severity measures - number of comorbidities, medications, procedures, cost, treatment time, and a proportional index term. CAESAR employs a random forest algorithm and these severity measures to discriminate between severe and mild phenotypes. RESULTS: Using a random forest algorithm and these severity measures as input, CAESAR differentiates between severe and mild phenotypes (sensitivity = 91.67, specificity = 77.78) when compared to a manually evaluated reference standard (k = 0.716). CONCLUSIONS: CAESAR enables researchers to measure phenotype severity from EHRs to identify phenotypes that are important for comparative effectiveness research.
    9 schema:genre article
    10 schema:inLanguage en
    11 schema:isAccessibleForFree true
    12 schema:isPartOf N026d93d335d544b8b2f065ee0080e0da
    13 Nc8bdbf4ebd3a4912978b451431be9488
    14 sg:journal.1043573
    15 schema:keywords CAESAR differentiates
    16 Caesar
    17 Extracting Severity
    18 Medicine Clinical Terms
    19 Severity Index
    20 Systemized Nomenclature
    21 acne
    22 algorithm
    23 approach
    24 classification approach
    25 clinical phenotype-genotype relationships
    26 comorbidities
    27 comparative effectiveness research
    28 cost
    29 designation
    30 development
    31 differentiates
    32 effectiveness research
    33 electronic health records
    34 evaluated reference standard
    35 example
    36 forest algorithm
    37 form
    38 health records
    39 index
    40 index terms
    41 individual patient's state
    42 individual patients
    43 information
    44 input
    45 measures
    46 medications
    47 method
    48 mild phenotype
    49 multiple severities
    50 nomenclature
    51 overall designation
    52 patient's state
    53 patient-level severity
    54 patients
    55 phenotype
    56 phenotype severity
    57 phenotype-genotype relationship
    58 phenotype-level severity
    59 procedure
    60 proportional index term
    61 random forest algorithm
    62 records
    63 reference standard
    64 relationship
    65 research
    66 researchers
    67 severe form
    68 severity
    69 severity measures
    70 standards
    71 state
    72 terms
    73 time
    74 treatment time
    75 validation
    76 wealth
    77 wealth of information
    78 schema:name Development and validation of a classification approach for extracting severity automatically from electronic health records
    79 schema:pagination 14
    80 schema:productId N7d3c054d0e904ef196885ed929e002a1
    81 N85138d2dac424d2891adba6569fbec0e
    82 Nb9a9b5f006ed4e06b04d2722da087412
    83 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017511191
    84 https://doi.org/10.1186/s13326-015-0010-8
    85 schema:sdDatePublished 2021-12-01T19:32
    86 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    87 schema:sdPublisher N5f57c9bef35540b781c03800bb98dc93
    88 schema:url https://doi.org/10.1186/s13326-015-0010-8
    89 sgo:license sg:explorer/license/
    90 sgo:sdDataset articles
    91 rdf:type schema:ScholarlyArticle
    92 N026d93d335d544b8b2f065ee0080e0da schema:volumeNumber 6
    93 rdf:type schema:PublicationVolume
    94 N5ac0d52c0209430bafbc4fe433465da8 rdf:first sg:person.01304304572.48
    95 rdf:rest rdf:nil
    96 N5f57c9bef35540b781c03800bb98dc93 schema:name Springer Nature - SN SciGraph project
    97 rdf:type schema:Organization
    98 N64813d22d4d040c5903fa7090945380e rdf:first sg:person.0651210417.29
    99 rdf:rest N5ac0d52c0209430bafbc4fe433465da8
    100 N7d3c054d0e904ef196885ed929e002a1 schema:name dimensions_id
    101 schema:value pub.1017511191
    102 rdf:type schema:PropertyValue
    103 N85138d2dac424d2891adba6569fbec0e schema:name pubmed_id
    104 schema:value 25848530
    105 rdf:type schema:PropertyValue
    106 Nb9a9b5f006ed4e06b04d2722da087412 schema:name doi
    107 schema:value 10.1186/s13326-015-0010-8
    108 rdf:type schema:PropertyValue
    109 Nc7285fa09f7b42c59b2257379195bd1f rdf:first sg:person.0770233234.36
    110 rdf:rest N64813d22d4d040c5903fa7090945380e
    111 Nc8bdbf4ebd3a4912978b451431be9488 schema:issueNumber 1
    112 rdf:type schema:PublicationIssue
    113 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    114 schema:name Information and Computing Sciences
    115 rdf:type schema:DefinedTerm
    116 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    117 schema:name Artificial Intelligence and Image Processing
    118 rdf:type schema:DefinedTerm
    119 sg:grant.2545469 http://pending.schema.org/fundedItem sg:pub.10.1186/s13326-015-0010-8
    120 rdf:type schema:MonetaryGrant
    121 sg:grant.3801821 http://pending.schema.org/fundedItem sg:pub.10.1186/s13326-015-0010-8
    122 rdf:type schema:MonetaryGrant
    123 sg:journal.1043573 schema:issn 2041-1480
    124 schema:name Journal of Biomedical Semantics
    125 schema:publisher Springer Nature
    126 rdf:type schema:Periodical
    127 sg:person.01304304572.48 schema:affiliation grid-institutes:grid.21729.3f
    128 schema:familyName Hripcsak
    129 schema:givenName George
    130 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01304304572.48
    131 rdf:type schema:Person
    132 sg:person.0651210417.29 schema:affiliation grid-institutes:grid.21729.3f
    133 schema:familyName Tatonetti
    134 schema:givenName Nicholas P
    135 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0651210417.29
    136 rdf:type schema:Person
    137 sg:person.0770233234.36 schema:affiliation grid-institutes:grid.21729.3f
    138 schema:familyName Boland
    139 schema:givenName Mary Regina
    140 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0770233234.36
    141 rdf:type schema:Person
    142 sg:pub.10.1007/s10115-013-0707-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1047035835
    143 https://doi.org/10.1007/s10115-013-0707-x
    144 rdf:type schema:CreativeWork
    145 sg:pub.10.1038/nbt.2757 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041461250
    146 https://doi.org/10.1038/nbt.2757
    147 rdf:type schema:CreativeWork
    148 grid-institutes:grid.21729.3f schema:alternateName Department of Medicine, Columbia University, New York, NY USA
    149 Observational Health Data Sciences and Informatics (OHDSI), Columbia University, 622 West 168th Street, PH-20, New York, NY USA
    150 schema:name Department of Biomedical Informatics, Columbia University, New York, NY USA
    151 Department of Medicine, Columbia University, New York, NY USA
    152 Department of Systems Biology, Columbia University, New York, NY USA
    153 Observational Health Data Sciences and Informatics (OHDSI), Columbia University, 622 West 168th Street, PH-20, New York, NY USA
    154 rdf:type schema:Organization
     



