Impact of training sets on classification of high-throughput bacterial 16s rRNA gene surveys


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2011-06-30

AUTHORS

Jeffrey J Werner, Omry Koren, Philip Hugenholtz, Todd Z DeSantis, William A Walters, J Gregory Caporaso, Largus T Angenent, Rob Knight, Ruth E Ley

ABSTRACT

Taxonomic classification of the thousands–millions of 16S rRNA gene sequences generated in microbiome studies is often achieved using a naïve Bayesian classifier (for example, the Ribosomal Database Project II (RDP) classifier), due to favorable trade-offs among automation, speed and accuracy. The resulting classification depends on the reference sequences and taxonomic hierarchy used to train the model; although the influence of primer sets and classification algorithms have been explored in detail, the influence of training set has not been characterized. We compared classification results obtained using three different publicly available databases as training sets, applied to five different bacterial 16S rRNA gene pyrosequencing data sets generated (from human body, mouse gut, python gut, soil and anaerobic digester samples). We observed numerous advantages to using the largest, most diverse training set available, that we constructed from the Greengenes (GG) bacterial/archaeal 16S rRNA gene sequence database and the latest GG taxonomy. Phylogenetic clusters of previously unclassified experimental sequences were identified with notable improvements (for example, 50% reduction in reads unclassified at the phylum level in mouse gut, soil and anaerobic digester samples), especially for phylotypes belonging to specific phyla (Tenericutes, Chloroflexi, Synergistetes and Candidate phyla TM6, TM7). Trimming the reference sequences to the primer region resulted in systematic improvements in classification depth, and greatest gains at higher confidence thresholds. Phylotypes unclassified at the genus level represented a greater proportion of the total community variation than classified operational taxonomic units in mouse gut and anaerobic digester samples, underscoring the need for greater diversity in existing reference databases.

PAGES

94-103

Identifiers

URI

http://scigraph.springernature.com/pub.10.1038/ismej.2011.82

DOI

http://dx.doi.org/10.1038/ismej.2011.82

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1033211760

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/21716311



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Animals", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Archaea", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Bacteria", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Bayes Theorem", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "DNA, Archaeal", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "DNA, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Gastrointestinal Tract", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "High-Throughput Nucleotide Sequencing", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Humans", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Metagenome", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Mice", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Phylogeny", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "RNA, Ribosomal, 16S", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Ribotyping", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY, USA", 
          "id": "http://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Werner", 
        "givenName": "Jeffrey J", 
        "id": "sg:person.0603323015.78", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0603323015.78"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Microbiology, Cornell University, Ithaca, NY, USA", 
          "id": "http://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "Department of Microbiology, Cornell University, Ithaca, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Koren", 
        "givenName": "Omry", 
        "id": "sg:person.01152611031.15", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01152611031.15"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia", 
          "id": "http://www.grid.ac/institutes/grid.1003.2", 
          "name": [
            "Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hugenholtz", 
        "givenName": "Philip", 
        "id": "sg:person.01055510700.73", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055510700.73"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Center for Environmental Biotechnology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.184769.5", 
          "name": [
            "Center for Environmental Biotechnology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "DeSantis", 
        "givenName": "Todd Z", 
        "id": "sg:person.01333365763.05", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01333365763.05"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biochemistry and Chemistry, University of Colorado, Boulder, CO, USA", 
          "id": "http://www.grid.ac/institutes/grid.266190.a", 
          "name": [
            "Department of Biochemistry and Chemistry, University of Colorado, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Walters", 
        "givenName": "William A", 
        "id": "sg:person.015342544657.49", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015342544657.49"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biochemistry and Chemistry, University of Colorado, Boulder, CO, USA", 
          "id": "http://www.grid.ac/institutes/grid.266190.a", 
          "name": [
            "Department of Biochemistry and Chemistry, University of Colorado, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Caporaso", 
        "givenName": "J Gregory", 
        "id": "sg:person.0624224157.70", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0624224157.70"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY, USA", 
          "id": "http://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Angenent", 
        "givenName": "Largus T", 
        "id": "sg:person.0724210263.91", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0724210263.91"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Howard Hughes Medical Institute, University of Colorado, Boulder, CO, USA", 
          "id": "http://www.grid.ac/institutes/grid.266190.a", 
          "name": [
            "Department of Biochemistry and Chemistry, University of Colorado, Boulder, CO, USA", 
            "Howard Hughes Medical Institute, University of Colorado, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Knight", 
        "givenName": "Rob", 
        "id": "sg:person.016311745377.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016311745377.96"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Microbiology, Cornell University, Ithaca, NY, USA", 
          "id": "http://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "Department of Microbiology, Cornell University, Ithaca, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ley", 
        "givenName": "Ruth E", 
        "id": "sg:person.0725775131.01", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0725775131.01"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1038/nmeth.1184", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042740345", 
          "https://doi.org/10.1038/nmeth.1184"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1472-6785-11-11", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000431206", 
          "https://doi.org/10.1186/1472-6785-11-11"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ismej.2010.71", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004755333", 
          "https://doi.org/10.1038/ismej.2010.71"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth.f.303", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009032055", 
          "https://doi.org/10.1038/nmeth.f.303"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature03959", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021574562", 
          "https://doi.org/10.1038/nature03959"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth0910-668b", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017522391", 
          "https://doi.org/10.1038/nmeth0910-668b"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2011-06-30", 
    "datePublishedReg": "2011-06-30", 
    "description": "Taxonomic classification of the thousands\u2013millions of 16S rRNA gene sequences generated in microbiome studies is often achieved using a na\u00efve Bayesian classifier (for example, the Ribosomal Database Project II (RDP) classifier), due to favorable trade-offs among automation, speed and accuracy. The resulting classification depends on the reference sequences and taxonomic hierarchy used to train the model; although the influence of primer sets and classification algorithms have been explored in detail, the influence of training set has not been characterized. We compared classification results obtained using three different publicly available databases as training sets, applied to five different bacterial 16S rRNA gene pyrosequencing data sets generated (from human body, mouse gut, python gut, soil and anaerobic digester samples). We observed numerous advantages to using the largest, most diverse training set available, that we constructed from the Greengenes (GG) bacterial/archaeal 16S rRNA gene sequence database and the latest GG taxonomy. Phylogenetic clusters of previously unclassified experimental sequences were identified with notable improvements (for example, 50% reduction in reads unclassified at the phylum level in mouse gut, soil and anaerobic digester samples), especially for phylotypes belonging to specific phyla (Tenericutes, Chloroflexi, Synergistetes and Candidate phyla TM6, TM7). Trimming the reference sequences to the primer region resulted in systematic improvements in classification depth, and greatest gains at higher confidence thresholds. Phylotypes unclassified at the genus level represented a greater proportion of the total community variation than classified operational taxonomic units in mouse gut and anaerobic digester samples, underscoring the need for greater diversity in existing reference databases.", 
    "genre": "article", 
    "id": "sg:pub.10.1038/ismej.2011.82", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2705031", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2684042", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2705095", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2691272", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.8775219", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.8775717", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1038436", 
        "issn": [
          "1751-7362", 
          "1751-7370"
        ], 
        "name": "The ISME Journal: Multidisciplinary Journal of Microbial Ecology", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "6"
      }
    ], 
    "keywords": [
      "training set", 
      "na\u00efve Bayesian classifier", 
      "classification algorithms", 
      "classification results", 
      "Bayesian classifier", 
      "available databases", 
      "data sets", 
      "sequence databases", 
      "classification", 
      "set", 
      "gene sequence database", 
      "database", 
      "high confidence", 
      "reference database", 
      "reference sequence", 
      "classifier", 
      "automation", 
      "taxonomic hierarchy", 
      "rRNA gene sequence database", 
      "algorithm", 
      "numerous advantages", 
      "diverse training", 
      "notable improvement", 
      "accuracy", 
      "hierarchy", 
      "Greengenes", 
      "taxonomy", 
      "advantages", 
      "taxonomic classification", 
      "sequence", 
      "speed", 
      "improvement", 
      "training", 
      "systematic improvement", 
      "model", 
      "clusters", 
      "need", 
      "confidence", 
      "detail", 
      "results", 
      "gain", 
      "units", 
      "microbiome studies", 
      "diversity", 
      "great diversity", 
      "impact", 
      "survey", 
      "experimental sequence", 
      "greater gains", 
      "rRNA gene sequences", 
      "depth", 
      "levels", 
      "total community variation", 
      "operational taxonomic units", 
      "rRNA gene surveys", 
      "gene sequences", 
      "rRNA gene", 
      "phylogenetic clusters", 
      "specific phyla", 
      "community variation", 
      "taxonomic units", 
      "anaerobic digester samples", 
      "gene surveys", 
      "study", 
      "primer sets", 
      "phylotypes", 
      "region", 
      "genus level", 
      "variation", 
      "mouse gut", 
      "digester samples", 
      "primer regions", 
      "influence", 
      "phyla", 
      "genes", 
      "gut", 
      "samples", 
      "greater proportion", 
      "proportion", 
      "latest GG taxonomy", 
      "GG taxonomy", 
      "unclassified experimental sequences", 
      "classification depth", 
      "high-throughput bacterial 16s rRNA gene surveys", 
      "bacterial 16s rRNA gene surveys"
    ], 
    "name": "Impact of training sets on classification of high-throughput bacterial 16s rRNA gene surveys", 
    "pagination": "94-103", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1033211760"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1038/ismej.2011.82"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "21716311"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1038/ismej.2011.82", 
      "https://app.dimensions.ai/details/publication/pub.1033211760"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2021-11-01T18:16", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211101/entities/gbq_results/article/article_538.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1038/ismej.2011.82"
  }
]
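As a rough illustration of how the JSON-LD record above can be consumed, the following Python sketch pulls out the title, author names and identifiers. The embedded string is an abbreviated copy of the record (only two of the nine authors are kept for brevity), and the `summarize` helper and its output keys are hypothetical names chosen for this example, not part of SciGraph.

```python
import json

# Abbreviated copy of the record shown above; only the fields used below are
# kept, and the author list is shortened to the first two entries.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1038/ismej.2011.82",
    "name": "Impact of training sets on classification of high-throughput bacterial 16s rRNA gene surveys",
    "datePublished": "2011-06-30",
    "author": [
      {"familyName": "Werner", "givenName": "Jeffrey J", "type": "Person"},
      {"familyName": "Koren", "givenName": "Omry", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1033211760"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1038/ismej.2011.82"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["21716311"]}
    ]
  }
]
"""


def summarize(record):
    """Collect a few human-readable fields from one SciGraph article record.

    `summarize` is a hypothetical helper for this sketch.
    """
    # Authors appear in publication order in the "author" array.
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # Each productId entry holds a one-element "value" list; take the first item.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", []) if p.get("value")}
    return {
        "title": record["name"],
        "date": record.get("datePublished"),
        "authors": authors,
        "ids": ids,
    }


# The top-level JSON value is a list containing a single record.
record = json.loads(RECORD_JSON)[0]
summary = summarize(record)
print(summary["authors"])
print(summary["ids"]["doi"])
```

Treating the document as plain JSON is enough here because the record is already in compacted form; a full JSON-LD processor would only be needed to expand the `@context`.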
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular linked data format that is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.82'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.82'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.82'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.82'
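The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python using the standard library. The short format labels (`json-ld`, `nt`, `turtle`, `xml`) and the helper names `build_request` and `fetch` are assumptions made for this example. Whether the endpoint still responds, and that its response is UTF-8 text, are also assumptions.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl examples above; the short keys are
# hypothetical labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build a GET request that asks for one RDF serialization of a record."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the body as text (assumes UTF-8 and a live endpoint)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request needs no network access; fetch() performs the actual call.
req = build_request("pub.10.1038/ismej.2011.82", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch("pub.10.1038/ismej.2011.82", "nt")` would then correspond to the N-Triples curl command.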


 

This table displays all metadata directly associated with this object as RDF triples.

312 TRIPLES      22 PREDICATES      132 URIs      118 LITERALS      22 BLANK NODES
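In the table that follows, `schema:author` points to a blank node rather than to the people themselves: author order is stored as an RDF list, a chain of blank nodes linked by `rdf:first` and `rdf:rest` and ending in `rdf:nil`. The Python sketch below walks that chain to recover the order. The node labels and person identifiers are copied from the table rows, while the dictionary layout and the `walk_rdf_list` helper are assumptions made for this example.

```python
RDF_NIL = "rdf:nil"

# blank node -> (rdf:first, rdf:rest), copied from the triples table.
AUTHOR_LIST = {
    "N62399e3b454f4b37bec0dddb946991e0": ("sg:person.0603323015.78", "N6d5e7a4bfdb549c0a719922007b8db13"),
    "N6d5e7a4bfdb549c0a719922007b8db13": ("sg:person.01152611031.15", "Nc62e2638e6d14ae9b07997f667cdf7ac"),
    "Nc62e2638e6d14ae9b07997f667cdf7ac": ("sg:person.01055510700.73", "N461a97f943f641348314f183edc66f07"),
    "N461a97f943f641348314f183edc66f07": ("sg:person.01333365763.05", "N6daf54860aed4c99a2129597708f96e2"),
    "N6daf54860aed4c99a2129597708f96e2": ("sg:person.015342544657.49", "N31eaad9cb0a7473c94413836644a9dac"),
    "N31eaad9cb0a7473c94413836644a9dac": ("sg:person.0624224157.70", "Nba5f40111fe64a9f8fe294450d876515"),
    "Nba5f40111fe64a9f8fe294450d876515": ("sg:person.0724210263.91", "N7c18268e0d264a1a992da42a379ae9d7"),
    "N7c18268e0d264a1a992da42a379ae9d7": ("sg:person.016311745377.96", "Nbc34eec1c7ef40efb368c6bab3120254"),
    "Nbc34eec1c7ef40efb368c6bab3120254": ("sg:person.0725775131.01", RDF_NIL),
}

# Object of the schema:author triple, i.e. the head of the list.
AUTHOR_HEAD = "N62399e3b454f4b37bec0dddb946991e0"


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from `head` and collect each rdf:first value in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that would otherwise loop forever.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


author_ids = walk_rdf_list(AUTHOR_HEAD, AUTHOR_LIST)
for position, person in enumerate(author_ids, start=1):
    print(position, person)
```

The recovered sequence starts with `sg:person.0603323015.78` (Werner) and ends with `sg:person.0725775131.01` (Ley), matching the author order given in the article info.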

Subject Predicate Object
1 sg:pub.10.1038/ismej.2011.82 schema:about N04244f165269479aba48f0afcb15c0eb
2 N062fffd520bb447cb2d435f189193756
3 N1c7dfbbdd9d24b5bbd2d667980d1ff42
4 N2b48458f80614c43bee97224b583c224
5 N34f8675d8f0f4a4bbda1c80a56e41b66
6 N4d6b20fc8a7f412fb3f4f9a515937691
7 N52dde72cd1cc40d3ae6a3432f99d3134
8 N61bb60e5a60d4e2997f58434ac5f5dce
9 N79365be9de0b4610a8ee0753f7cae8c0
10 N8541ddfbe380460e8b2c47d8c1d31567
11 Na060c11a8a7247f991a852cc749b1ed0
12 Na0f86ecf54d041bc9031c68ecdec56ee
13 Ncaa5773229c9454dbae27a6eaff8bad5
14 Ndd9b1cdadfd5408da29a8a3fcda1f4c5
15 Nf393be6ca5484011ab2275e862da4e81
16 anzsrc-for:06
17 anzsrc-for:0604
18 schema:author N62399e3b454f4b37bec0dddb946991e0
19 schema:citation sg:pub.10.1038/ismej.2010.71
20 sg:pub.10.1038/nature03959
21 sg:pub.10.1038/nmeth.1184
22 sg:pub.10.1038/nmeth.f.303
23 sg:pub.10.1038/nmeth0910-668b
24 sg:pub.10.1186/1472-6785-11-11
25 schema:datePublished 2011-06-30
26 schema:datePublishedReg 2011-06-30
27 schema:description Taxonomic classification of the thousands–millions of 16S rRNA gene sequences generated in microbiome studies is often achieved using a naïve Bayesian classifier (for example, the Ribosomal Database Project II (RDP) classifier), due to favorable trade-offs among automation, speed and accuracy. The resulting classification depends on the reference sequences and taxonomic hierarchy used to train the model; although the influence of primer sets and classification algorithms have been explored in detail, the influence of training set has not been characterized. We compared classification results obtained using three different publicly available databases as training sets, applied to five different bacterial 16S rRNA gene pyrosequencing data sets generated (from human body, mouse gut, python gut, soil and anaerobic digester samples). We observed numerous advantages to using the largest, most diverse training set available, that we constructed from the Greengenes (GG) bacterial/archaeal 16S rRNA gene sequence database and the latest GG taxonomy. Phylogenetic clusters of previously unclassified experimental sequences were identified with notable improvements (for example, 50% reduction in reads unclassified at the phylum level in mouse gut, soil and anaerobic digester samples), especially for phylotypes belonging to specific phyla (Tenericutes, Chloroflexi, Synergistetes and Candidate phyla TM6, TM7). Trimming the reference sequences to the primer region resulted in systematic improvements in classification depth, and greatest gains at higher confidence thresholds. Phylotypes unclassified at the genus level represented a greater proportion of the total community variation than classified operational taxonomic units in mouse gut and anaerobic digester samples, underscoring the need for greater diversity in existing reference databases.
28 schema:genre article
29 schema:inLanguage en
30 schema:isAccessibleForFree true
31 schema:isPartOf N06fe238c3db8457ab150e8d268b0f713
32 N217b2ba0b55e44f0a6dc44294438f078
33 sg:journal.1038436
34 schema:keywords Bayesian classifier
35 GG taxonomy
36 Greengenes
37 accuracy
38 advantages
39 algorithm
40 anaerobic digester samples
41 automation
42 available databases
43 bacterial 16s rRNA gene surveys
44 classification
45 classification algorithms
46 classification depth
47 classification results
48 classifier
49 clusters
50 community variation
51 confidence
52 data sets
53 database
54 depth
55 detail
56 digester samples
57 diverse training
58 diversity
59 experimental sequence
60 gain
61 gene sequence database
62 gene sequences
63 gene surveys
64 genes
65 genus level
66 great diversity
67 greater gains
68 greater proportion
69 gut
70 hierarchy
71 high confidence
72 high-throughput bacterial 16s rRNA gene surveys
73 impact
74 improvement
75 influence
76 latest GG taxonomy
77 levels
78 microbiome studies
79 model
80 mouse gut
81 naïve Bayesian classifier
82 need
83 notable improvement
84 numerous advantages
85 operational taxonomic units
86 phyla
87 phylogenetic clusters
88 phylotypes
89 primer regions
90 primer sets
91 proportion
92 rRNA gene
93 rRNA gene sequence database
94 rRNA gene sequences
95 rRNA gene surveys
96 reference database
97 reference sequence
98 region
99 results
100 samples
101 sequence
102 sequence databases
103 set
104 specific phyla
105 speed
106 study
107 survey
108 systematic improvement
109 taxonomic classification
110 taxonomic hierarchy
111 taxonomic units
112 taxonomy
113 total community variation
114 training
115 training set
116 unclassified experimental sequences
117 units
118 variation
119 schema:name Impact of training sets on classification of high-throughput bacterial 16s rRNA gene surveys
120 schema:pagination 94-103
121 schema:productId N4f8f98b1ec364911a165424afcb0a07f
122 N7a09e53bc2ce485bb64bb7fce2df10ab
123 N7a34dc6382f948d092b97b1524842a51
124 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033211760
125 https://doi.org/10.1038/ismej.2011.82
126 schema:sdDatePublished 2021-11-01T18:16
127 schema:sdLicense https://scigraph.springernature.com/explorer/license/
128 schema:sdPublisher Na0a7c775cefa49609c73c919d59768ae
129 schema:url https://doi.org/10.1038/ismej.2011.82
130 sgo:license sg:explorer/license/
131 sgo:sdDataset articles
132 rdf:type schema:ScholarlyArticle
133 N04244f165269479aba48f0afcb15c0eb schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
134 schema:name Archaea
135 rdf:type schema:DefinedTerm
136 N062fffd520bb447cb2d435f189193756 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
137 schema:name Mice
138 rdf:type schema:DefinedTerm
139 N06fe238c3db8457ab150e8d268b0f713 schema:issueNumber 1
140 rdf:type schema:PublicationIssue
141 N1c7dfbbdd9d24b5bbd2d667980d1ff42 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
142 schema:name Phylogeny
143 rdf:type schema:DefinedTerm
144 N217b2ba0b55e44f0a6dc44294438f078 schema:volumeNumber 6
145 rdf:type schema:PublicationVolume
146 N2b48458f80614c43bee97224b583c224 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
147 schema:name Bayes Theorem
148 rdf:type schema:DefinedTerm
149 N31eaad9cb0a7473c94413836644a9dac rdf:first sg:person.0624224157.70
150 rdf:rest Nba5f40111fe64a9f8fe294450d876515
151 N34f8675d8f0f4a4bbda1c80a56e41b66 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
152 schema:name RNA, Ribosomal, 16S
153 rdf:type schema:DefinedTerm
154 N461a97f943f641348314f183edc66f07 rdf:first sg:person.01333365763.05
155 rdf:rest N6daf54860aed4c99a2129597708f96e2
156 N4d6b20fc8a7f412fb3f4f9a515937691 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
157 schema:name Bacteria
158 rdf:type schema:DefinedTerm
159 N4f8f98b1ec364911a165424afcb0a07f schema:name doi
160 schema:value 10.1038/ismej.2011.82
161 rdf:type schema:PropertyValue
162 N52dde72cd1cc40d3ae6a3432f99d3134 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
163 schema:name High-Throughput Nucleotide Sequencing
164 rdf:type schema:DefinedTerm
165 N61bb60e5a60d4e2997f58434ac5f5dce schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
166 schema:name Algorithms
167 rdf:type schema:DefinedTerm
168 N62399e3b454f4b37bec0dddb946991e0 rdf:first sg:person.0603323015.78
169 rdf:rest N6d5e7a4bfdb549c0a719922007b8db13
170 N6d5e7a4bfdb549c0a719922007b8db13 rdf:first sg:person.01152611031.15
171 rdf:rest Nc62e2638e6d14ae9b07997f667cdf7ac
172 N6daf54860aed4c99a2129597708f96e2 rdf:first sg:person.015342544657.49
173 rdf:rest N31eaad9cb0a7473c94413836644a9dac
174 N79365be9de0b4610a8ee0753f7cae8c0 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
175 schema:name Humans
176 rdf:type schema:DefinedTerm
177 N7a09e53bc2ce485bb64bb7fce2df10ab schema:name pubmed_id
178 schema:value 21716311
179 rdf:type schema:PropertyValue
180 N7a34dc6382f948d092b97b1524842a51 schema:name dimensions_id
181 schema:value pub.1033211760
182 rdf:type schema:PropertyValue
183 N7c18268e0d264a1a992da42a379ae9d7 rdf:first sg:person.016311745377.96
184 rdf:rest Nbc34eec1c7ef40efb368c6bab3120254
185 N8541ddfbe380460e8b2c47d8c1d31567 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
186 schema:name DNA, Bacterial
187 rdf:type schema:DefinedTerm
188 Na060c11a8a7247f991a852cc749b1ed0 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
189 schema:name DNA, Archaeal
190 rdf:type schema:DefinedTerm
191 Na0a7c775cefa49609c73c919d59768ae schema:name Springer Nature - SN SciGraph project
192 rdf:type schema:Organization
193 Na0f86ecf54d041bc9031c68ecdec56ee schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
194 schema:name Ribotyping
195 rdf:type schema:DefinedTerm
196 Nba5f40111fe64a9f8fe294450d876515 rdf:first sg:person.0724210263.91
197 rdf:rest N7c18268e0d264a1a992da42a379ae9d7
198 Nbc34eec1c7ef40efb368c6bab3120254 rdf:first sg:person.0725775131.01
199 rdf:rest rdf:nil
200 Nc62e2638e6d14ae9b07997f667cdf7ac rdf:first sg:person.01055510700.73
201 rdf:rest N461a97f943f641348314f183edc66f07
202 Ncaa5773229c9454dbae27a6eaff8bad5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
203 schema:name Gastrointestinal Tract
204 rdf:type schema:DefinedTerm
205 Ndd9b1cdadfd5408da29a8a3fcda1f4c5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
206 schema:name Metagenome
207 rdf:type schema:DefinedTerm
208 Nf393be6ca5484011ab2275e862da4e81 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
209 schema:name Animals
210 rdf:type schema:DefinedTerm
211 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
212 schema:name Biological Sciences
213 rdf:type schema:DefinedTerm
214 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
215 schema:name Genetics
216 rdf:type schema:DefinedTerm
217 sg:grant.2684042 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.82
218 rdf:type schema:MonetaryGrant
219 sg:grant.2691272 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.82
220 rdf:type schema:MonetaryGrant
221 sg:grant.2705031 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.82
222 rdf:type schema:MonetaryGrant
223 sg:grant.2705095 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.82
224 rdf:type schema:MonetaryGrant
225 sg:grant.8775219 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.82
226 rdf:type schema:MonetaryGrant
227 sg:grant.8775717 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.82
228 rdf:type schema:MonetaryGrant
229 sg:journal.1038436 schema:issn 1751-7362
230 1751-7370
231 schema:name The ISME Journal: Multidisciplinary Journal of Microbial Ecology
232 schema:publisher Springer Nature
233 rdf:type schema:Periodical
234 sg:person.01055510700.73 schema:affiliation grid-institutes:grid.1003.2
235 schema:familyName Hugenholtz
236 schema:givenName Philip
237 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055510700.73
238 rdf:type schema:Person
239 sg:person.01152611031.15 schema:affiliation grid-institutes:grid.5386.8
240 schema:familyName Koren
241 schema:givenName Omry
242 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01152611031.15
243 rdf:type schema:Person
244 sg:person.01333365763.05 schema:affiliation grid-institutes:grid.184769.5
245 schema:familyName DeSantis
246 schema:givenName Todd Z
247 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01333365763.05
248 rdf:type schema:Person
249 sg:person.015342544657.49 schema:affiliation grid-institutes:grid.266190.a
250 schema:familyName Walters
251 schema:givenName William A
252 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015342544657.49
253 rdf:type schema:Person
254 sg:person.016311745377.96 schema:affiliation grid-institutes:grid.266190.a
255 schema:familyName Knight
256 schema:givenName Rob
257 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016311745377.96
258 rdf:type schema:Person
259 sg:person.0603323015.78 schema:affiliation grid-institutes:grid.5386.8
260 schema:familyName Werner
261 schema:givenName Jeffrey J
262 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0603323015.78
263 rdf:type schema:Person
264 sg:person.0624224157.70 schema:affiliation grid-institutes:grid.266190.a
265 schema:familyName Caporaso
266 schema:givenName J Gregory
267 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0624224157.70
268 rdf:type schema:Person
269 sg:person.0724210263.91 schema:affiliation grid-institutes:grid.5386.8
270 schema:familyName Angenent
271 schema:givenName Largus T
272 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0724210263.91
273 rdf:type schema:Person
274 sg:person.0725775131.01 schema:affiliation grid-institutes:grid.5386.8
275 schema:familyName Ley
276 schema:givenName Ruth E
277 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0725775131.01
278 rdf:type schema:Person
279 sg:pub.10.1038/ismej.2010.71 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004755333
280 https://doi.org/10.1038/ismej.2010.71
281 rdf:type schema:CreativeWork
282 sg:pub.10.1038/nature03959 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021574562
283 https://doi.org/10.1038/nature03959
284 rdf:type schema:CreativeWork
285 sg:pub.10.1038/nmeth.1184 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042740345
286 https://doi.org/10.1038/nmeth.1184
287 rdf:type schema:CreativeWork
288 sg:pub.10.1038/nmeth.f.303 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009032055
289 https://doi.org/10.1038/nmeth.f.303
290 rdf:type schema:CreativeWork
291 sg:pub.10.1038/nmeth0910-668b schema:sameAs https://app.dimensions.ai/details/publication/pub.1017522391
292 https://doi.org/10.1038/nmeth0910-668b
293 rdf:type schema:CreativeWork
294 sg:pub.10.1186/1472-6785-11-11 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000431206
295 https://doi.org/10.1186/1472-6785-11-11
296 rdf:type schema:CreativeWork
297 grid-institutes:grid.1003.2 schema:alternateName Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia
298 schema:name Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD, Australia
299 rdf:type schema:Organization
300 grid-institutes:grid.184769.5 schema:alternateName Center for Environmental Biotechnology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
301 schema:name Center for Environmental Biotechnology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
302 rdf:type schema:Organization
303 grid-institutes:grid.266190.a schema:alternateName Department of Biochemistry and Chemistry, University of Colorado, Boulder, CO, USA
304 Howard Hughes Medical Institute, University of Colorado, Boulder, CO, USA
305 schema:name Department of Biochemistry and Chemistry, University of Colorado, Boulder, CO, USA
306 Howard Hughes Medical Institute, University of Colorado, Boulder, CO, USA
307 rdf:type schema:Organization
308 grid-institutes:grid.5386.8 schema:alternateName Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY, USA
309 Department of Microbiology, Cornell University, Ithaca, NY, USA
310 schema:name Department of Biological and Environmental Engineering, Cornell University, Ithaca, NY, USA
311 Department of Microbiology, Cornell University, Ithaca, NY, USA
312 rdf:type schema:Organization
 



