Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2007-12

AUTHORS

Tobias Kind, Oliver Fiehn

ABSTRACT

BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas.
RESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphorus, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reduce the full complement of eight billion theoretically possible C, H, N, S, O, P formulas up to 2000 Da to only 623 million most probable elemental compositions. Third, 6,000 pharmaceutical, toxic and natural compounds were selected from the DrugBank, TSCA and DNP databases. The correct formulas were retrieved as the top hit with 80-99% probability when assuming data acquisition with complete resolution of unique compounds, 5% absolute isotope ratio deviation, and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time-of-flight mass spectrometry. In each case, the correct formula was ranked as the top hit when combining the seven rules with database queries.
CONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain an improbably high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for download from the authors' website.
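
To make the filtering idea in the abstract concrete, the following is a minimal Python sketch of two of the listed checks (the SENIOR valence rule and a hydrogen/carbon ratio window), plus the ppm mass-error calculation behind a figure such as "3 ppm mass accuracy". It is not the authors' software. The valence table (lowest common valences only), the ratio bounds, and all function names are assumptions made for illustration; this record does not state the published thresholds, so substitute those before any real use.

```python
import re
from collections import Counter

# Assumption: lowest common valences only. P and S also occur in higher
# valence states, which this sketch ignores.
VALENCE = {"C": 4, "H": 1, "N": 3, "O": 2, "P": 3, "S": 2}


def parse_formula(formula: str) -> Counter:
    """Parse a flat formula such as 'C6H12O6' into element counts."""
    counts: Counter = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def passes_senior(counts: Counter) -> bool:
    """SENIOR-style check: the total valence must be even, at least twice
    the largest valence, and large enough to connect all atoms."""
    unknown = set(counts) - set(VALENCE)
    if unknown:
        raise ValueError(f"no valence assumed for: {sorted(unknown)}")
    atoms = sum(counts.values())
    total = sum(VALENCE[el] * n for el, n in counts.items())
    largest = max(VALENCE[el] for el in counts)
    return total % 2 == 0 and total >= 2 * largest and total >= 2 * (atoms - 1)


def passes_hc_ratio(counts: Counter, low: float = 0.2, high: float = 3.1) -> bool:
    """Hydrogen/carbon ratio window. The default bounds are illustrative
    placeholders, not values taken from this record."""
    carbon = counts.get("C", 0)
    if carbon == 0:
        return False  # the ratio is undefined without carbon
    return low <= counts.get("H", 0) / carbon <= high


def ppm_error(measured: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    for candidate in ("C6H12O6", "CH5", "CH6"):
        counts = parse_formula(candidate)
        print(candidate, passes_senior(counts), passes_hc_ratio(counts))
```

In this sketch a candidate formula would be kept only if every check returns True; a ranking step based on isotopic patterns, as described in the abstract, would come after this filtering and is not shown.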

PAGES

105

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-8-105

DOI

http://dx.doi.org/10.1186/1471-2105-8-105

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1022661426

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/17389044



JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground or Google SDTT.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0301", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Analytical Chemistry", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/03", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Chemical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Biopolymers", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computer Simulation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Mass Spectrometry", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Models, Chemical", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Organic Chemicals", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of California, Davis", 
          "id": "https://www.grid.ac/institutes/grid.27860.3b", 
          "name": [
            "University of California Davis, Genome Center, 451 E. Health Sci. Dr., 95616, Davis, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kind", 
        "givenName": "Tobias", 
        "id": "sg:person.0604176630.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0604176630.24"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of California, Davis", 
          "id": "https://www.grid.ac/institutes/grid.27860.3b", 
          "name": [
            "University of California Davis, Genome Center, 451 E. Health Sci. Dr., 95616, Davis, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Fiehn", 
        "givenName": "Oliver", 
        "id": "sg:person.0615142477.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0615142477.79"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1002/0471720895.ch3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000097356"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0968-0896(96)00081-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000262559"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0169-7439(92)80037-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001590621"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-6-180", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003217589", 
          "https://doi.org/10.1186/1471-2105-6-180"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkj158", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003849523"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.phytochem.2004.08.027", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013227242"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/jxb/eri069", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013571412"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac0518811", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015938063"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/2001202a0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017184614", 
          "https://doi.org/10.1038/2001202a0"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/j.jasms.2005.12.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019417430", 
          "https://doi.org/10.1016/j.jasms.2005.12.001"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/j.jasms.2004.10.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020784349", 
          "https://doi.org/10.1016/j.jasms.2004.10.001"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1351/pac200375060683", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023581713"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac035426l", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1026700796"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bti683", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034729911"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0378-4347(00)00320-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035606893"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkj067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035890801"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/s1044-0305(99)00089-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038779198", 
          "https://doi.org/10.1016/s1044-0305(99)00089-6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/mas.20061", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038884794"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/anie.200462457", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039714105"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0031-9422(95)90037-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041942163"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0169-7439(93)80021-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042779349"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1016/s1044-0305(99)00047-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042808879", 
          "https://doi.org/10.1016/s1044-0305(99)00047-1"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-7-234", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045172077", 
          "https://doi.org/10.1186/1471-2105-7-234"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0079-6565(79)80006-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045444296"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/oms.1210120115", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045775748"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01569759", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045792473", 
          "https://doi.org/10.1007/bf01569759"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac50021a024", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055007827"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac960435y", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055072793"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac971132m", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055074470"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ac991142i", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055077469"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ar00012a003", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055148242"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci000135o", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055399965"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci010056s", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401173"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci030404l", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401552"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci0341060", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401645"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci049714+", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055401846"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci980171b", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055405518"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci9902696", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055405716"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ed049p613", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055447511"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ic011003v", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055553971"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ja00436a017", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055732347"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1103/physrev.11.316", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1060420208"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tcbb.2005.43", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061540468"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.127.3303.880", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062471218"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2174/138161206777585274", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069165901"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/2372318", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069898996"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1077246590", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2007-12", 
    "datePublishedReg": "2007-12-01", 
    "description": "BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas.\nRESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphor, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reducing the complement all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Thirdly 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80-99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries.\nCONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for downloads from the authors' website.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1471-2105-8-105", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2503754", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2439931", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "8"
      }
    ], 
    "name": "Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry", 
    "pagination": "105", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "6bf6178eb75cfa6c5232e34f688838dc5501cc773cfc3e456e11e8a814133a07"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "17389044"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100965194"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-8-105"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1022661426"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-8-105", 
      "https://app.dimensions.ai/details/publication/pub.1022661426"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T23:22", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8693_00000505.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2F1471-2105-8-105"
  }
]
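
Because JSON-LD is plain JSON, a record shaped like the one above can be read with a standard JSON parser. The sketch below is a minimal illustration in Python, not an official SciGraph client. `SAMPLE` is a hand-trimmed excerpt that keeps only a few keys, with one citation entry repeated on purpose to exercise the de-duplication step, and `summarize` is a hypothetical helper name.

```python
import json

# Hand-trimmed excerpt of a record with the same shape as the one above.
# One citation is repeated deliberately so the de-duplication is visible.
SAMPLE = """
[
  {
    "id": "sg:pub.10.1186/1471-2105-8-105",
    "author": [
      {"familyName": "Kind", "givenName": "Tobias", "type": "Person"},
      {"familyName": "Fiehn", "givenName": "Oliver", "type": "Person"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1021/ac0518811", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1021/ac0518811", "type": "CreativeWork"},
      {"id": "sg:pub.10.1186/1471-2105-6-180", "type": "CreativeWork"}
    ]
  }
]
"""


def summarize(record_json: str) -> dict:
    """Return the record id, author names, and unique citation ids."""
    record = json.loads(record_json)[0]  # the record is a one-element array
    authors = [
        f'{person["givenName"]} {person["familyName"]}'
        for person in record.get("author", [])
    ]
    # dict.fromkeys drops repeated ids while preserving first-seen order.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {"id": record["id"], "authors": authors, "citations": citations}


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(summary["authors"])
    print(len(summary["citations"]), "unique citations")
```

A plain JSON parser treats keys such as `id` and `type` as ordinary strings; resolving them against the `@context` would need a JSON-LD processor, which this sketch does not use.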
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105'
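
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is an illustration under assumptions: the short format labels in `ACCEPT_TYPES` and the helper names `build_request` and `fetch` are invented for the example, and whether the endpoint still answers these requests is not something this record confirms.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2105-8-105"

# Media types copied from the curl commands above; the short keys are
# labels chosen for this example.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Build a GET request that asks for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Performs a real network call; it fails if the service is unreachable.
    print(fetch("turtle")[:500])
```

Keeping `build_request` separate from `fetch` makes it possible to inspect or test the headers without touching the network.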


 

This table displays all metadata directly associated with this object as RDF triples.

251 TRIPLES      21 PREDICATES      82 URIs      27 LITERALS      15 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-8-105 schema:about N4615139bd38a4aedb9d30353056569ab
2 N7e30b88a7d3c4ef4903106089dd140b7
3 N94467ad8535d4fa896bbd871e27b0f35
4 N96cda24036c947558a2ace3e364a07ff
5 Nc96e90ca974049edae696c35fd03dfb8
6 Nd378797f043343549b43265375fb4009
7 anzsrc-for:03
8 anzsrc-for:0301
9 schema:author N6dd8151b2ef04bc5a7133457a47e40e3
10 schema:citation sg:pub.10.1007/bf01569759
11 sg:pub.10.1016/j.jasms.2004.10.001
12 sg:pub.10.1016/j.jasms.2005.12.001
13 sg:pub.10.1016/s1044-0305(99)00047-1
14 sg:pub.10.1016/s1044-0305(99)00089-6
15 sg:pub.10.1038/2001202a0
16 sg:pub.10.1186/1471-2105-6-180
17 sg:pub.10.1186/1471-2105-7-234
18 https://app.dimensions.ai/details/publication/pub.1077246590
19 https://doi.org/10.1002/0471720895.ch3
20 https://doi.org/10.1002/anie.200462457
21 https://doi.org/10.1002/mas.20061
22 https://doi.org/10.1002/oms.1210120115
23 https://doi.org/10.1016/0031-9422(95)90037-3
24 https://doi.org/10.1016/0079-6565(79)80006-0
25 https://doi.org/10.1016/0169-7439(92)80037-5
26 https://doi.org/10.1016/0169-7439(93)80021-9
27 https://doi.org/10.1016/0968-0896(96)00081-8
28 https://doi.org/10.1016/j.phytochem.2004.08.027
29 https://doi.org/10.1016/s0378-4347(00)00320-0
30 https://doi.org/10.1021/ac035426l
31 https://doi.org/10.1021/ac0518811
32 https://doi.org/10.1021/ac50021a024
33 https://doi.org/10.1021/ac960435y
34 https://doi.org/10.1021/ac971132m
35 https://doi.org/10.1021/ac991142i
36 https://doi.org/10.1021/ar00012a003
37 https://doi.org/10.1021/ci000135o
38 https://doi.org/10.1021/ci010056s
39 https://doi.org/10.1021/ci030404l
40 https://doi.org/10.1021/ci0341060
41 https://doi.org/10.1021/ci049714+
42 https://doi.org/10.1021/ci980171b
43 https://doi.org/10.1021/ci9902696
44 https://doi.org/10.1021/ed049p613
45 https://doi.org/10.1021/ic011003v
46 https://doi.org/10.1021/ja00436a017
47 https://doi.org/10.1093/bioinformatics/bti683
48 https://doi.org/10.1093/jxb/eri069
49 https://doi.org/10.1093/nar/gkj067
50 https://doi.org/10.1093/nar/gkj158
51 https://doi.org/10.1103/physrev.11.316
52 https://doi.org/10.1109/tcbb.2005.43
53 https://doi.org/10.1126/science.127.3303.880
54 https://doi.org/10.1351/pac200375060683
55 https://doi.org/10.2174/138161206777585274
56 https://doi.org/10.2307/2372318
57 schema:datePublished 2007-12
58 schema:datePublishedReg 2007-12-01
59 schema:description BACKGROUND: Structure elucidation of unknown small molecules by mass spectrometry is a challenge despite advances in instrumentation. The first crucial step is to obtain correct elemental compositions. In order to automatically constrain the thousands of possible candidate structures, rules need to be developed to select the most likely and chemically correct molecular formulas. RESULTS: An algorithm for filtering molecular formulas is derived from seven heuristic rules: (1) restrictions for the number of elements, (2) LEWIS and SENIOR chemical rules, (3) isotopic patterns, (4) hydrogen/carbon ratios, (5) element ratio of nitrogen, oxygen, phosphor, and sulphur versus carbon, (6) element ratio probabilities and (7) presence of trimethylsilylated compounds. Formulas are ranked according to their isotopic patterns and subsequently constrained by presence in public chemical databases. The seven rules were developed on 68,237 existing molecular formulas and were validated in four experiments. First, 432,968 formulas covering five million PubChem database entries were checked for consistency. Only 0.6% of these compounds did not pass all rules. Next, the rules were shown to effectively reducing the complement all eight billion theoretically possible C, H, N, S, O, P-formulas up to 2000 Da to only 623 million most probable elemental compositions. Thirdly 6,000 pharmaceutical, toxic and natural compounds were selected from DrugBank, TSCA and DNP databases. The correct formulas were retrieved as top hit at 80-99% probability when assuming data acquisition with complete resolution of unique compounds and 5% absolute isotope ratio deviation and 3 ppm mass accuracy. Last, some exemplary compounds were analyzed by Fourier transform ion cyclotron resonance mass spectrometry and by gas chromatography-time of flight mass spectrometry. In each case, the correct formula was ranked as top hit when combining the seven rules with database queries. 
CONCLUSION: The seven rules enable an automatic exclusion of molecular formulas which are either wrong or which contain unlikely high or low number of elements. The correct molecular formula is assigned with a probability of 98% if the formula exists in a compound database. For truly novel compounds that are not present in databases, the correct formula is found in the first three hits with a probability of 65-81%. Corresponding software and supplemental data are available for downloads from the authors' website.
60 schema:genre research_article
61 schema:inLanguage en
62 schema:isAccessibleForFree true
63 schema:isPartOf N87edd6ae84aa4207bc0fed5ffd27af47
64 N984cf01171bc458b96472fe369d0a710
65 sg:journal.1023786
66 schema:name Seven Golden Rules for heuristic filtering of molecular formulas obtained by accurate mass spectrometry
67 schema:pagination 105
68 schema:productId N27ef597c47c8495fbaa8ad3224b61bdf
69 N3c50587e373b495294330fa2b3e4aec1
70 N5c88e51503944a74a7e8998d74e46214
71 Nb3d8950f93b54bf29301a3e8243ef330
72 Nf88e92ea7a454691bd863d9880ef430b
73 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022661426
74 https://doi.org/10.1186/1471-2105-8-105
75 schema:sdDatePublished 2019-04-10T23:22
76 schema:sdLicense https://scigraph.springernature.com/explorer/license/
77 schema:sdPublisher Nd035d0bbda684c2388ee968cb07ea8ef
78 schema:url http://link.springer.com/10.1186%2F1471-2105-8-105
79 sgo:license sg:explorer/license/
80 sgo:sdDataset articles
81 rdf:type schema:ScholarlyArticle
82 N27ef597c47c8495fbaa8ad3224b61bdf schema:name readcube_id
83 schema:value 6bf6178eb75cfa6c5232e34f688838dc5501cc773cfc3e456e11e8a814133a07
84 rdf:type schema:PropertyValue
85 N3c50587e373b495294330fa2b3e4aec1 schema:name doi
86 schema:value 10.1186/1471-2105-8-105
87 rdf:type schema:PropertyValue
88 N4615139bd38a4aedb9d30353056569ab schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
89 schema:name Models, Chemical
90 rdf:type schema:DefinedTerm
91 N5c88e51503944a74a7e8998d74e46214 schema:name pubmed_id
92 schema:value 17389044
93 rdf:type schema:PropertyValue
94 N6dd8151b2ef04bc5a7133457a47e40e3 rdf:first sg:person.0604176630.24
95 rdf:rest Nf9f00f01e84448d0b554f1e775a40783
96 N7e30b88a7d3c4ef4903106089dd140b7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
97 schema:name Computer Simulation
98 rdf:type schema:DefinedTerm
99 N87edd6ae84aa4207bc0fed5ffd27af47 schema:volumeNumber 8
100 rdf:type schema:PublicationVolume
101 N94467ad8535d4fa896bbd871e27b0f35 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
102 schema:name Organic Chemicals
103 rdf:type schema:DefinedTerm
104 N96cda24036c947558a2ace3e364a07ff schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
105 schema:name Mass Spectrometry
106 rdf:type schema:DefinedTerm
107 N984cf01171bc458b96472fe369d0a710 schema:issueNumber 1
108 rdf:type schema:PublicationIssue
109 Nb3d8950f93b54bf29301a3e8243ef330 schema:name dimensions_id
110 schema:value pub.1022661426
111 rdf:type schema:PropertyValue
112 Nc96e90ca974049edae696c35fd03dfb8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
113 schema:name Algorithms
114 rdf:type schema:DefinedTerm
115 Nd035d0bbda684c2388ee968cb07ea8ef schema:name Springer Nature - SN SciGraph project
116 rdf:type schema:Organization
117 Nd378797f043343549b43265375fb4009 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
118 schema:name Biopolymers
119 rdf:type schema:DefinedTerm
120 Nf88e92ea7a454691bd863d9880ef430b schema:name nlm_unique_id
121 schema:value 100965194
122 rdf:type schema:PropertyValue
123 Nf9f00f01e84448d0b554f1e775a40783 rdf:first sg:person.0615142477.79
124 rdf:rest rdf:nil
125 anzsrc-for:03 schema:inDefinedTermSet anzsrc-for:
126 schema:name Chemical Sciences
127 rdf:type schema:DefinedTerm
128 anzsrc-for:0301 schema:inDefinedTermSet anzsrc-for:
129 schema:name Analytical Chemistry
130 rdf:type schema:DefinedTerm
131 sg:grant.2439931 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2105-8-105
132 rdf:type schema:MonetaryGrant
133 sg:grant.2503754 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2105-8-105
134 rdf:type schema:MonetaryGrant
135 sg:journal.1023786 schema:issn 1471-2105
136 schema:name BMC Bioinformatics
137 rdf:type schema:Periodical
138 sg:person.0604176630.24 schema:affiliation https://www.grid.ac/institutes/grid.27860.3b
139 schema:familyName Kind
140 schema:givenName Tobias
141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0604176630.24
142 rdf:type schema:Person
143 sg:person.0615142477.79 schema:affiliation https://www.grid.ac/institutes/grid.27860.3b
144 schema:familyName Fiehn
145 schema:givenName Oliver
146 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0615142477.79
147 rdf:type schema:Person
148 sg:pub.10.1007/bf01569759 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045792473
149 https://doi.org/10.1007/bf01569759
150 rdf:type schema:CreativeWork
151 sg:pub.10.1016/j.jasms.2004.10.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020784349
152 https://doi.org/10.1016/j.jasms.2004.10.001
153 rdf:type schema:CreativeWork
154 sg:pub.10.1016/j.jasms.2005.12.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019417430
155 https://doi.org/10.1016/j.jasms.2005.12.001
156 rdf:type schema:CreativeWork
157 sg:pub.10.1016/s1044-0305(99)00047-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042808879
158 https://doi.org/10.1016/s1044-0305(99)00047-1
159 rdf:type schema:CreativeWork
160 sg:pub.10.1016/s1044-0305(99)00089-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038779198
161 https://doi.org/10.1016/s1044-0305(99)00089-6
162 rdf:type schema:CreativeWork
163 sg:pub.10.1038/2001202a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017184614
164 https://doi.org/10.1038/2001202a0
165 rdf:type schema:CreativeWork
166 sg:pub.10.1186/1471-2105-6-180 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003217589
167 https://doi.org/10.1186/1471-2105-6-180
168 rdf:type schema:CreativeWork
169 sg:pub.10.1186/1471-2105-7-234 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045172077
170 https://doi.org/10.1186/1471-2105-7-234
171 rdf:type schema:CreativeWork
172 https://app.dimensions.ai/details/publication/pub.1077246590 rdf:type schema:CreativeWork
173 https://doi.org/10.1002/0471720895.ch3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000097356
174 rdf:type schema:CreativeWork
175 https://doi.org/10.1002/anie.200462457 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039714105
176 rdf:type schema:CreativeWork
177 https://doi.org/10.1002/mas.20061 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038884794
178 rdf:type schema:CreativeWork
179 https://doi.org/10.1002/oms.1210120115 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045775748
180 rdf:type schema:CreativeWork
181 https://doi.org/10.1016/0031-9422(95)90037-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041942163
182 rdf:type schema:CreativeWork
183 https://doi.org/10.1016/0079-6565(79)80006-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045444296
184 rdf:type schema:CreativeWork
185 https://doi.org/10.1016/0169-7439(92)80037-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001590621
186 rdf:type schema:CreativeWork
187 https://doi.org/10.1016/0169-7439(93)80021-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042779349
188 rdf:type schema:CreativeWork
189 https://doi.org/10.1016/0968-0896(96)00081-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000262559
190 rdf:type schema:CreativeWork
191 https://doi.org/10.1016/j.phytochem.2004.08.027 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013227242
192 rdf:type schema:CreativeWork
193 https://doi.org/10.1016/s0378-4347(00)00320-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035606893
194 rdf:type schema:CreativeWork
195 https://doi.org/10.1021/ac035426l schema:sameAs https://app.dimensions.ai/details/publication/pub.1026700796
196 rdf:type schema:CreativeWork
197 https://doi.org/10.1021/ac0518811 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015938063
198 rdf:type schema:CreativeWork
199 https://doi.org/10.1021/ac50021a024 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055007827
200 rdf:type schema:CreativeWork
201 https://doi.org/10.1021/ac960435y schema:sameAs https://app.dimensions.ai/details/publication/pub.1055072793
202 rdf:type schema:CreativeWork
203 https://doi.org/10.1021/ac971132m schema:sameAs https://app.dimensions.ai/details/publication/pub.1055074470
204 rdf:type schema:CreativeWork
205 https://doi.org/10.1021/ac991142i schema:sameAs https://app.dimensions.ai/details/publication/pub.1055077469
206 rdf:type schema:CreativeWork
207 https://doi.org/10.1021/ar00012a003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055148242
208 rdf:type schema:CreativeWork
209 https://doi.org/10.1021/ci000135o schema:sameAs https://app.dimensions.ai/details/publication/pub.1055399965
210 rdf:type schema:CreativeWork
211 https://doi.org/10.1021/ci010056s schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401173
212 rdf:type schema:CreativeWork
213 https://doi.org/10.1021/ci030404l schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401552
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1021/ci0341060 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401645
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1021/ci049714+ schema:sameAs https://app.dimensions.ai/details/publication/pub.1055401846
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1021/ci980171b schema:sameAs https://app.dimensions.ai/details/publication/pub.1055405518
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1021/ci9902696 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055405716
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1021/ed049p613 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055447511
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1021/ic011003v schema:sameAs https://app.dimensions.ai/details/publication/pub.1055553971
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1021/ja00436a017 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055732347
228 rdf:type schema:CreativeWork
229 https://doi.org/10.1093/bioinformatics/bti683 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034729911
230 rdf:type schema:CreativeWork
231 https://doi.org/10.1093/jxb/eri069 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013571412
232 rdf:type schema:CreativeWork
233 https://doi.org/10.1093/nar/gkj067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035890801
234 rdf:type schema:CreativeWork
235 https://doi.org/10.1093/nar/gkj158 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003849523
236 rdf:type schema:CreativeWork
237 https://doi.org/10.1103/physrev.11.316 schema:sameAs https://app.dimensions.ai/details/publication/pub.1060420208
238 rdf:type schema:CreativeWork
239 https://doi.org/10.1109/tcbb.2005.43 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061540468
240 rdf:type schema:CreativeWork
241 https://doi.org/10.1126/science.127.3303.880 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062471218
242 rdf:type schema:CreativeWork
243 https://doi.org/10.1351/pac200375060683 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023581713
244 rdf:type schema:CreativeWork
245 https://doi.org/10.2174/138161206777585274 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069165901
246 rdf:type schema:CreativeWork
247 https://doi.org/10.2307/2372318 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069898996
248 rdf:type schema:CreativeWork
249 https://www.grid.ac/institutes/grid.27860.3b schema:alternateName University of California, Davis
250 schema:name University of California Davis, Genome Center, 451 E. Health Sci. Dr., 95616, Davis, CA, USA
251 rdf:type schema:Organization
 



