Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2007-12

AUTHORS

Tobias Kind, Oliver Fiehn

ABSTRACT

BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas.
RESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the complement of all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Third, 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit with 80-99% probability, assuming data acquisition with complete resolution of unique compounds, 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time-of-flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries.
CONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain an unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for download from the authors' website.
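
As a rough illustration of two checks named in the abstract (the 3 ppm mass accuracy window and the element-to-carbon ratio rules), the following Python sketch computes a neutral monoisotopic mass, its deviation from a measured mass in ppm, and a simple ratio test. All function names are hypothetical, and the ratio bounds are passed in by the caller because this record does not state the thresholds the authors used. It is a sketch under those assumptions, not the authors' software.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "P": 30.9737620,
    "S": 31.9720707,
}


def monoisotopic_mass(counts):
    """Neutral monoisotopic mass for element counts such as {"C": 6, "H": 12, "O": 6}."""
    return sum(MONOISOTOPIC_MASS[element] * n for element, n in counts.items())


def ppm_error(measured, theoretical):
    """Relative mass deviation in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def within_mass_tolerance(measured, counts, tol_ppm=3.0):
    """True if the candidate formula lies within tol_ppm of the measured neutral mass."""
    return abs(ppm_error(measured, monoisotopic_mass(counts))) <= tol_ppm


def passes_ratio_rule(counts, element, low, high):
    """True if count(element) / count(C) lies in [low, high].

    The bounds are caller-supplied assumptions, not values from this record.
    """
    carbon = counts.get("C", 0)
    if carbon == 0:
        return False  # ratio undefined; this sketch simply rejects the formula
    ratio = counts.get(element, 0) / carbon
    return low <= ratio <= high


glucose = {"C": 6, "H": 12, "O": 6}
print(round(monoisotopic_mass(glucose), 4))
print(within_mass_tolerance(180.0634, glucose))
# Illustrative placeholder bounds for the H/C ratio.
print(passes_ratio_rule(glucose, "H", 0.2, 3.1))
```

The sketch works on neutral masses; a measured ion mass would first need the adduct and electron mass accounted for, which is left out here.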

PAGES

105

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-8-105

DOI

http://dx.doi.org/10.1186/1471-2105-8-105

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1022661426

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/17389044



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0301", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Analytical Chemistry", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/03", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Chemical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Biopolymers", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computer Simulation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Mass Spectrometry", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Models, Chemical", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Organic Chemicals", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of California, Davis", 
          "id": "https://www.grid.ac/institutes/grid.27860.3b", 
          "name": [
            "University of California Davis, Genome Center, 451 E. Health Sci. Dr., 95616, Davis, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kind", 
        "givenName": "Tobias", 
        "id": "sg:person.0604176630.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0604176630.24"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of California, Davis", 
          "id": "https://www.grid.ac/institutes/grid.27860.3b", 
          "name": [
            "University of California Davis, Genome Center, 451 E. Health Sci. Dr., 95616, Davis, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Fiehn", 
        "givenName": "Oliver", 
        "id": "sg:person.0615142477.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0615142477.79"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1002/0471720895.ch3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000097356"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0968-0896(96)00081-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000262559"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0169-7439(92)80037-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001590621"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-6-180", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003217589", 
          "https://doi.org/10.1186/1471-2105-6-180"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkj158", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003849523"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.phytochem.2004.08.027", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013227242"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/jxb/eri069", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013571412"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac0518811", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015938063"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/2001202a0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017184614", 
          "https://doi.org/10.1038/2001202a0"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/j.jasms.2005.12.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019417430", 
          "https://doi.org/10.1016/j.jasms.2005.12.001"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/j.jasms.2004.10.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020784349", 
          "https://doi.org/10.1016/j.jasms.2004.10.001"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1351/pac200375060683", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023581713"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac035426l", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1026700796"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bti683", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034729911"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0378-4347(00)00320-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035606893"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkj067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035890801"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/s1044-0305(99)00089-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038779198", 
          "https://doi.org/10.1016/s1044-0305(99)00089-6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/mas.20061", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038884794"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/anie.200462457", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039714105"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0031-9422(95)90037-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041942163"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0169-7439(93)80021-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042779349"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/s1044-0305(99)00047-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042808879", 
          "https://doi.org/10.1016/s1044-0305(99)00047-1"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-7-234", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045172077", 
          "https://doi.org/10.1186/1471-2105-7-234"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0079-6565(79)80006-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045444296"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/oms.1210120115", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045775748"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01569759", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045792473", 
          "https://doi.org/10.1007/bf01569759"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac50021a024", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055007827"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac960435y", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055072793"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac971132m", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055074470"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac991142i", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055077469"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ar00012a003", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055148242"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci000135o", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055399965"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci010056s", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401173"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci030404l", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401552"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci0341060", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401645"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci049714+", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401846"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci980171b", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055405518"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci9902696", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055405716"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ed049p613", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055447511"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ic011003v", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055553971"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ja00436a017", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055732347"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1103/physrev.11.316", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1060420208"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tcbb.2005.43", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061540468"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.127.3303.880", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062471218"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2174/138161206777585274", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069165901"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/2372318", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069898996"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1077246590", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2007-12", 
    "datePublishedReg": "2007-12-01", 
    "description": "BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas.\nRESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphor, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reducing the complement all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Thirdly 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80-99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries.\nCONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for downloads from the authors' website.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1471-2105-8-105", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2503754", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2439931", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "8"
      }
    ], 
    "name": "Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry", 
    "pagination": "105", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "6bf6178eb75cfa6c5232e34f688838dc5501cc773cfc3e456e11e8a814133a07"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "17389044"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100965194"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-8-105"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1022661426"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-8-105", 
      "https://app.dimensions.ai/details/publication/pub.1022661426"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T23:22", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8693_00000505.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2F1471-2105-8-105"
  }
]
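
A record shaped like the JSON-LD above can be read with any JSON parser. The Python sketch below assumes the layout shown here (a top-level list holding one object with "productId" and "citation" arrays) and uses a shortened stand-in for the record. The helper names are hypothetical. It looks up the DOI and collects citation identifiers in order, keeping each identifier once in case a record lists the same citation more than once.

```python
import json

# Shortened stand-in for the record; field names follow the JSON-LD shown above.
SAMPLE = """
[
  {
    "id": "sg:pub.10.1186/1471-2105-8-105",
    "productId": [
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["17389044"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-8-105"]}
    ],
    "citation": [
      {"id": "https://doi.org/10.1021/ac0518811", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1021/ac0518811", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/2001202a0", "type": "CreativeWork"}
    ]
  }
]
"""


def product_id(record, name):
    """Return the first value of the productId entry called `name`, or None."""
    for item in record.get("productId", []):
        if item.get("name") == name:
            values = item.get("value", [])
            return values[0] if values else None
    return None


def unique_citation_ids(record):
    """Citation ids in document order, each kept once."""
    seen = set()
    ordered = []
    for citation in record.get("citation", []):
        cid = citation.get("id")
        if cid and cid not in seen:
            seen.add(cid)
            ordered.append(cid)
    return ordered


record = json.loads(SAMPLE)[0]
print(product_id(record, "doi"))
print(unique_citation_ids(record))
```

Plain JSON parsing ignores the "@context"; resolving terms to full IRIs would need a JSON-LD processor, which this sketch does not attempt.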
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'
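
The four curl calls differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The short format keys and the function names below are hypothetical, and whether the endpoint still answers is not something this record confirms, so the network call is left as a commented usage line.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl examples above; the short keys are made up here.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build a GET request asking for the record in the chosen RDF serialization."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (performs a network call):
# text = fetch("pub.10.1186/1471-2105-8-105", "turtle")
```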


 

This table displays all metadata directly associated with this object as RDF triples.

251 TRIPLES      21 PREDICATES      82 URIs      27 LITERALS      15 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-8-105 schema:about N02423ea9f73847f79e79707b8c3da4a1
2 N216bbe9cf76549678fc469efd0437f8b
3 N4782bf7b9a3347188902a2e211de958f
4 N48e8b3867c28463ab1e0bca4821f44f6
5 N4e43730560e44800ad70aa97866ab8eb
6 Ndcd3c16bb088493c8a91c9d56ae0d4b1
7 anzsrc-for:03
8 anzsrc-for:0301
9 schema:author Nee428b182df84ce98bf7dafbf9cf3985
10 schema:citation sg:pub.10.1007/bf01569759
11 sg:pub.10.1016/j.jasms.2004.10.001
12 sg:pub.10.1016/j.jasms.2005.12.001
13 sg:pub.10.1016/s1044-0305(99)00047-1
14 sg:pub.10.1016/s1044-0305(99)00089-6
15 sg:pub.10.1038/2001202a0
16 sg:pub.10.1186/1471-2105-6-180
17 sg:pub.10.1186/1471-2105-7-234
18 https://app.dimensions.ai/details/publication/pub.1077246590
19 https://doi.org/10.1002/0471720895.ch3
20 https://doi.org/10.1002/anie.200462457
21 https://doi.org/10.1002/mas.20061
22 https://doi.org/10.1002/oms.1210120115
23 https://doi.org/10.1016/0031-9422(95)90037-3
24 https://doi.org/10.1016/0079-6565(79)80006-0
25 https://doi.org/10.1016/0169-7439(92)80037-5
26 https://doi.org/10.1016/0169-7439(93)80021-9
27 https://doi.org/10.1016/0968-0896(96)00081-8
28 https://doi.org/10.1016/j.phytochem.2004.08.027
29 https://doi.org/10.1016/s0378-4347(00)00320-0
30 https://doi.org/10.1021/ac035426l
31 https://doi.org/10.1021/ac0518811
32 https://doi.org/10.1021/ac50021a024
33 https://doi.org/10.1021/ac960435y
34 https://doi.org/10.1021/ac971132m
35 https://doi.org/10.1021/ac991142i
36 https://doi.org/10.1021/ar00012a003
37 https://doi.org/10.1021/ci000135o
38 https://doi.org/10.1021/ci010056s
39 https://doi.org/10.1021/ci030404l
40 https://doi.org/10.1021/ci0341060
41 https://doi.org/10.1021/ci049714+
42 https://doi.org/10.1021/ci980171b
43 https://doi.org/10.1021/ci9902696
44 https://doi.org/10.1021/ed049p613
45 https://doi.org/10.1021/ic011003v
46 https://doi.org/10.1021/ja00436a017
47 https://doi.org/10.1093/bioinformatics/bti683
48 https://doi.org/10.1093/jxb/eri069
49 https://doi.org/10.1093/nar/gkj067
50 https://doi.org/10.1093/nar/gkj158
51 https://doi.org/10.1103/physrev.11.316
52 https://doi.org/10.1109/tcbb.2005.43
53 https://doi.org/10.1126/science.127.3303.880
54 https://doi.org/10.1351/pac200375060683
55 https://doi.org/10.2174/138161206777585274
56 https://doi.org/10.2307/2372318
57 schema:datePublished 2007-12
58 schema:datePublishedReg 2007-12-01
59 schema:description BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. RESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphor, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reducing the complement all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Thirdly 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80-99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries. 
CONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for downloads from the authors' website.
60 schema:genre research_article
61 schema:inLanguage en
62 schema:isAccessibleForFree true
63 schema:isPartOf N180952bfa9834991adc4980a0e5cc5af
64 N8d9dcfb5522944189d5413f6a95108b7
65 sg:journal.1023786
66 schema:name Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry
67 schema:pagination 105
68 schema:productId N51b6145eca44449493034863a0b563b1
69 N7d440ba71c4e4c28b81e0da4a90a7037
70 Nd323805e9094464db90df915a0226332
71 Nd3c48c3d100d4bb381f4310cfac5df8b
72 Nd879bee198c742f9bc33bbe7897e6f75
73 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022661426
74 https://doi.org/10.1186/1471-2105-8-105
75 schema:sdDatePublished 2019-04-10T23:22
76 schema:sdLicense https://scigraph.springernature.com/explorer/license/
77 schema:sdPublisher Ncfebac86aae449fa8f8cc17395aa3d23
78 schema:url http://link.springer.com/10.1186%2F1471-2105-8-105
79 sgo:license sg:explorer/license/
80 sgo:sdDataset articles
81 rdf:type schema:ScholarlyArticle
82 N02423ea9f73847f79e79707b8c3da4a1 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
83 schema:name Models, Chemical
84 rdf:type schema:DefinedTerm
85 N180952bfa9834991adc4980a0e5cc5af schema:issueNumber 1
86 rdf:type schema:PublicationIssue
87 N216bbe9cf76549678fc469efd0437f8b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
88 schema:name Computer Simulation
89 rdf:type schema:DefinedTerm
90 N4782bf7b9a3347188902a2e211de958f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
91 schema:name Algorithms
92 rdf:type schema:DefinedTerm
93 N48e8b3867c28463ab1e0bca4821f44f6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
94 schema:name Biopolymers
95 rdf:type schema:DefinedTerm
96 N4e43730560e44800ad70aa97866ab8eb schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
97 schema:name Organic Chemicals
98 rdf:type schema:DefinedTerm
99 N51b6145eca44449493034863a0b563b1 schema:name readcube_id
100 schema:value 6bf6178eb75cfa6c5232e34f688838dc5501cc773cfc3e456e11e8a814133a07
101 rdf:type schema:PropertyValue
102 N7d440ba71c4e4c28b81e0da4a90a7037 schema:name pubmed_id
103 schema:value 17389044
104 rdf:type schema:PropertyValue
105 N8d9dcfb5522944189d5413f6a95108b7 schema:volumeNumber 8
106 rdf:type schema:PublicationVolume
107 Ncfebac86aae449fa8f8cc17395aa3d23 schema:name Springer Nature - SN SciGraph project
108 rdf:type schema:Organization
109 Nd323805e9094464db90df915a0226332 schema:name doi
110 schema:value 10.1186/1471-2105-8-105
111 rdf:type schema:PropertyValue
112 Nd3c48c3d100d4bb381f4310cfac5df8b schema:name nlm_unique_id
113 schema:value 100965194
114 rdf:type schema:PropertyValue
115 Nd879bee198c742f9bc33bbe7897e6f75 schema:name dimensions_id
116 schema:value pub.1022661426
117 rdf:type schema:PropertyValue
118 Ndcd3c16bb088493c8a91c9d56ae0d4b1 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
119 schema:name Mass Spectrometry
120 rdf:type schema:DefinedTerm
121 Nea1b2f8138aa4d878e0eb8dcc2a71f61 rdf:first sg:person.0615142477.79
122 rdf:rest rdf:nil
123 Nee428b182df84ce98bf7dafbf9cf3985 rdf:first sg:person.0604176630.24
124 rdf:rest Nea1b2f8138aa4d878e0eb8dcc2a71f61
125 anzsrc-for:03 schema:inDefinedTermSet anzsrc-for:
126 schema:name Chemical Sciences
127 rdf:type schema:DefinedTerm
128 anzsrc-for:0301 schema:inDefinedTermSet anzsrc-for:
129 schema:name Analytical Chemistry
130 rdf:type schema:DefinedTerm
131 sg:grant.2439931 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2105-8-105
132 rdf:type schema:MonetaryGrant
133 sg:grant.2503754 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2105-8-105
134 rdf:type schema:MonetaryGrant
135 sg:journal.1023786 schema:issn 1471-2105
136 schema:name BMC Bioinformatics
137 rdf:type schema:Periodical
138 sg:person.0604176630.24 schema:affiliation https://www.grid.ac/institutes/grid.27860.3b
139 schema:familyName Kind
140 schema:givenName Tobias
141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0604176630.24
142 rdf:type schema:Person
143 sg:person.0615142477.79 schema:affiliation https://www.grid.ac/institutes/grid.27860.3b
144 schema:familyName Fiehn
145 schema:givenName Oliver
146 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0615142477.79
147 rdf:type schema:Person
148 sg:pub.10.1007/bf01569759 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045792473
149 https://doi.org/10.1007/bf01569759
150 rdf:type schema:CreativeWork
151 sg:pub.10.1016/j.jasms.2004.10.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020784349
152 https://doi.org/10.1016/j.jasms.2004.10.001
153 rdf:type schema:CreativeWork
154 sg:pub.10.1016/j.jasms.2005.12.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019417430
155 https://doi.org/10.1016/j.jasms.2005.12.001
156 rdf:type schema:CreativeWork
157 sg:pub.10.1016/s1044-0305(99)00047-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042808879
158 https://doi.org/10.1016/s1044-0305(99)00047-1
159 rdf:type schema:CreativeWork
160 sg:pub.10.1016/s1044-0305(99)00089-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038779198
161 https://doi.org/10.1016/s1044-0305(99)00089-6
162 rdf:type schema:CreativeWork
163 sg:pub.10.1038/2001202a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017184614
164 https://doi.org/10.1038/2001202a0
165 rdf:type schema:CreativeWork
166 sg:pub.10.1186/1471-2105-6-180 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003217589
167 https://doi.org/10.1186/1471-2105-6-180
168 rdf:type schema:CreativeWork
169 sg:pub.10.1186/1471-2105-7-234 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045172077
170 https://doi.org/10.1186/1471-2105-7-234
171 rdf:type schema:CreativeWork
172 https://app.dimensions.ai/details/publication/pub.1077246590 rdf:type schema:CreativeWork
173 https://doi.org/10.1002/0471720895.ch3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000097356
174 rdf:type schema:CreativeWork
175 https://doi.org/10.1002/anie.200462457 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039714105
176 rdf:type schema:CreativeWork
177 https://doi.org/10.1002/mas.20061 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038884794
178 rdf:type schema:CreativeWork
179 https://doi.org/10.1002/oms.1210120115 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045775748
180 rdf:type schema:CreativeWork
181 https://doi.org/10.1016/0031-9422(95)90037-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041942163
182 rdf:type schema:CreativeWork
183 https://doi.org/10.1016/0079-6565(79)80006-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045444296
184 rdf:type schema:CreativeWork
185 https://doi.org/10.1016/0169-7439(92)80037-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001590621
186 rdf:type schema:CreativeWork
187 https://doi.org/10.1016/0169-7439(93)80021-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042779349
188 rdf:type schema:CreativeWork
189 https://doi.org/10.1016/0968-0896(96)00081-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000262559
190 rdf:type schema:CreativeWork
191 https://doi.org/10.1016/j.phytochem.2004.08.027 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013227242
192 rdf:type schema:CreativeWork
193 https://doi.org/10.1016/s0378-4347(00)00320-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035606893
194 rdf:type schema:CreativeWork
195 https://doi.org/10.1021/ac035426l schema:sameAs https://app.dimensions.ai/details/publication/pub.1026700796
196 rdf:type schema:CreativeWork
197 https://doi.org/10.1021/ac0518811 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015938063
198 rdf:type schema:CreativeWork
199 https://doi.org/10.1021/ac50021a024 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055007827
200 rdf:type schema:CreativeWork
201 https://doi.org/10.1021/ac960435y schema:sameAs https://app.dimensions.ai/details/publication/pub.1055072793
202 rdf:type schema:CreativeWork
203 https://doi.org/10.1021/ac971132m schema:sameAs https://app.dimensions.ai/details/publication/pub.1055074470
204 rdf:type schema:CreativeWork
205 https://doi.org/10.1021/ac991142i schema:sameAs https://app.dimensions.ai/details/publication/pub.1055077469
206 rdf:type schema:CreativeWork
207 https://doi.org/10.1021/ar00012a003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055148242
208 rdf:type schema:CreativeWork
209 https://doi.org/10.1021/ci000135o schema:sameAs https://app.dimensions.ai/details/publication/pub.1055399965
210 rdf:type schema:CreativeWork
211 https://doi.org/10.1021/ci010056s schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401173
212 rdf:type schema:CreativeWork
213 https://doi.org/10.1021/ci030404l schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401552
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1021/ci0341060 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401645
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1021/ci049714+ schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401846
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1021/ci980171b schema:sameAs https://app.dimensions.ai/details/publication/pub.1055405518
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1021/ci9902696 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055405716
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1021/ed049p613 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055447511
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1021/ic011003v schema:sameAs https://app.dimensions.ai/details/publication/pub.1055553971
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1021/ja00436a017 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055732347
228 rdf:type schema:CreativeWork
229 https://doi.org/10.1093/bioinformatics/bti683 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034729911
230 rdf:type schema:CreativeWork
231 https://doi.org/10.1093/jxb/eri069 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013571412
232 rdf:type schema:CreativeWork
233 https://doi.org/10.1093/nar/gkj067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035890801
234 rdf:type schema:CreativeWork
235 https://doi.org/10.1093/nar/gkj158 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003849523
236 rdf:type schema:CreativeWork
237 https://doi.org/10.1103/physrev.11.316 schema:sameAs https://app.dimensions.ai/details/publication/pub.1060420208
238 rdf:type schema:CreativeWork
239 https://doi.org/10.1109/tcbb.2005.43 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061540468
240 rdf:type schema:CreativeWork
241 https://doi.org/10.1126/science.127.3303.880 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062471218
242 rdf:type schema:CreativeWork
243 https://doi.org/10.1351/pac200375060683 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023581713
244 rdf:type schema:CreativeWork
245 https://doi.org/10.2174/138161206777585274 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069165901
246 rdf:type schema:CreativeWork
247 https://doi.org/10.2307/2372318 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069898996
248 rdf:type schema:CreativeWork
249 https://www.grid.ac/institutes/grid.27860.3b schema:alternateName University of California, Davis
250 schema:name University of California Davis, Genome Center, 451 E. Health Sci. Dr., 95616, Davis, CA, USA
251 rdf:type schema:Organization
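
The description above mentions a 3 ppm mass accuracy window and element-ratio rules (hydrogen/carbon, and N, O, P, S versus carbon). The following is a minimal Python sketch of how such a filter could look. It is not the authors' software: the function names are hypothetical, and the ratio limits used as defaults are illustrative assumptions that do not come from this record and should be replaced with the values from the article itself.

```python
# Monoisotopic masses (Da) of the most abundant isotope of each element.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "P": 30.97376163,
    "S": 31.97207100,
}

# ASSUMPTION: illustrative limits only, not taken from the record above.
DEFAULT_HC_RANGE = (0.2, 3.1)
DEFAULT_MAX_HETERO_TO_C = {"N": 1.3, "O": 1.2, "P": 0.3, "S": 0.8}


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def passes_ratio_checks(formula, hc_range=DEFAULT_HC_RANGE,
                        max_hetero_to_c=DEFAULT_MAX_HETERO_TO_C):
    """Check the H/C ratio and the heteroatom/C ratios against the limits."""
    carbon = formula.get("C", 0)
    if carbon == 0:
        # Ratios are undefined without carbon; this sketch simply rejects.
        return False
    hc = formula.get("H", 0) / carbon
    if not (hc_range[0] <= hc <= hc_range[1]):
        return False
    return all(formula.get(el, 0) / carbon <= limit
               for el, limit in max_hetero_to_c.items())


def filter_candidates(measured_mass, candidates, tol_ppm=3.0):
    """Keep candidates within the ppm window that also pass the ratio checks."""
    kept = []
    for formula in candidates:
        error = ppm_error(measured_mass, monoisotopic_mass(formula))
        if abs(error) <= tol_ppm and passes_ratio_checks(formula):
            kept.append(formula)
    return kept
```

For example, calling `filter_candidates(180.0639, [{"C": 6, "H": 12, "O": 6}])` keeps the C6H12O6 candidate, since its monoisotopic mass lies within 3 ppm of the measured value and its ratios fall inside the assumed limits. The sketch covers only the mass window and two of the ratio rules; the isotopic pattern ranking and database lookup described above are left out.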
 



