An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2012-03

AUTHORS

Daniel McDonald, Morgan N Price, Julia Goodrich, Eric P Nawrocki, Todd Z DeSantis, Alexander Probst, Gary L Andersen, Rob Knight, Philip Hugenholtz

ABSTRACT

Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree), are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/.

PAGES

610

Identifiers

URI

http://scigraph.springernature.com/pub.10.1038/ismej.2011.139

DOI

http://dx.doi.org/10.1038/ismej.2011.139

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1051863807

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/22134646



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Archaea", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Bacteria", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Classification", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Genetic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Metagenomics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Phylogeny", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "RNA, Ribosomal, 16S", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Colorado Boulder", 
          "id": "https://www.grid.ac/institutes/grid.266190.a", 
          "name": [
            "Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "McDonald", 
        "givenName": "Daniel", 
        "id": "sg:person.01324411177.44", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324411177.44"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Lawrence Berkeley National Laboratory", 
          "id": "https://www.grid.ac/institutes/grid.184769.5", 
          "name": [
            "Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Price", 
        "givenName": "Morgan N", 
        "id": "sg:person.01370663340.52", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01370663340.52"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Colorado Boulder", 
          "id": "https://www.grid.ac/institutes/grid.266190.a", 
          "name": [
            "Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Goodrich", 
        "givenName": "Julia", 
        "id": "sg:person.01216375001.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01216375001.41"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Howard Hughes Medical Institute", 
          "id": "https://www.grid.ac/institutes/grid.413575.1", 
          "name": [
            "Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Nawrocki", 
        "givenName": "Eric P", 
        "id": "sg:person.01303716266.13", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01303716266.13"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Second Genome (United States)", 
          "id": "https://www.grid.ac/institutes/grid.452682.f", 
          "name": [
            "Department of Bioinformatics, Second Genome Inc., San Bruno, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "DeSantis", 
        "givenName": "Todd Z", 
        "id": "sg:person.01333365763.05", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01333365763.05"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Lawrence Berkeley National Laboratory", 
          "id": "https://www.grid.ac/institutes/grid.184769.5", 
          "name": [
            "Lawrence Berkeley National Laboratory, Center for Environmental Biotechnology, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Probst", 
        "givenName": "Alexander", 
        "id": "sg:person.0757405471.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757405471.24"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Lawrence Berkeley National Laboratory", 
          "id": "https://www.grid.ac/institutes/grid.184769.5", 
          "name": [
            "Lawrence Berkeley National Laboratory, Center for Environmental Biotechnology, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Andersen", 
        "givenName": "Gary L", 
        "id": "sg:person.01160741113.38", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01160741113.38"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Howard Hughes Medical Institute", 
          "id": "https://www.grid.ac/institutes/grid.413575.1", 
          "name": [
            "Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA", 
            "Howard Hughes Medical Institute, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Knight", 
        "givenName": "Rob", 
        "id": "sg:person.016311745377.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016311745377.96"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences and Institute for Molecular Bioscience, St Lucia, Queensland, Australia"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hugenholtz", 
        "givenName": "Philip", 
        "id": "sg:person.01055510700.73", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055510700.73"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1371/journal.pone.0009490", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000778834"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.112730.110", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007579011"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature06244", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009917183", 
          "https://doi.org/10.1038/nature06244"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkm864", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010797283"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.mib.2008.09.011", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011594842"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1099/ijs.0.64915-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013602089"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature08656", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013886837", 
          "https://doi.org/10.1038/nature08656"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.72.5.3685-3695.2006", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014115558"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkq1172", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015262597"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-0-387-21609-6_8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015420409", 
          "https://doi.org/10.1007/978-0-387-21609-6_8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-3-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017682287", 
          "https://doi.org/10.1186/1471-2105-3-2"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp157", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018224286"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrmicro2119", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030726668", 
          "https://doi.org/10.1038/nrmicro2119"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1574-6941.2001.tb00791.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032787651"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ismej.2011.82", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033211760", 
          "https://doi.org/10.1038/ismej.2011.82"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-8-402", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033706648", 
          "https://doi.org/10.1186/1471-2105-8-402"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp636", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034512291"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.03006-05", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034568952"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.096651.109", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034877175"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2007-8-8-r171", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036049247", 
          "https://doi.org/10.1186/gb-2007-8-8-r171"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0004192", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1037765545"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.1123061", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043495787"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkn491", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044732652"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkn879", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044918953"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.00062-07", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045980007"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkh293", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047454621"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.syapm.2008.08.003", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053173751"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083201581", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083324436", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2012-03", 
    "datePublishedReg": "2012-03-01", 
    "description": "Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1038/ismej.2011.139", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2705031", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2529347", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2691272", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1038436", 
        "issn": [
          "1751-7362", 
          "1751-7370"
        ], 
        "name": "The ISME Journal", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "3", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "6"
      }
    ], 
    "name": "An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea", 
    "pagination": "610", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "f70cb4f18da9d34128b19e8c69fbd8cacfbb1e12afa84328fe7e3c698dd1e6ca"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "22134646"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101301086"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1038/ismej.2011.139"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1051863807"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1038/ismej.2011.139", 
      "https://app.dimensions.ai/details/publication/pub.1051863807"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T14:48", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8663_00000426.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://www.nature.com/articles/ismej2011139"
  }
]
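
As a minimal sketch of how a record like the one above might be read, the Python below pulls out the title, DOI, author names and citation ids, dropping any repeated citation ids. The function name summarize_record and the abridged SAMPLE payload are illustrative assumptions, not part of SciGraph; only the field names (name, author, citation, productId) come from the record itself.

```python
import json


def summarize_record(jsonld_text):
    """Return title, DOI, author names and unique citation ids from a SciGraph-style JSON-LD payload."""
    record = json.loads(jsonld_text)[0]  # the payload is a one-element JSON array
    doi = next(
        (p["value"][0] for p in record.get("productId", []) if p.get("name") == "doi"),
        None,
    )
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {"title": record.get("name"), "doi": doi, "authors": authors, "citations": citations}


# Abridged stand-in for the record above (assumption: same field layout).
# One citation is repeated on purpose to exercise the de-duplication step.
SAMPLE = json.dumps([{
    "name": "An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea",
    "author": [
        {"familyName": "McDonald", "givenName": "Daniel"},
        {"familyName": "Price", "givenName": "Morgan N"},
    ],
    "citation": [
        {"id": "sg:pub.10.1038/nature06244"},
        {"id": "sg:pub.10.1038/nature08656"},
        {"id": "sg:pub.10.1038/nature08656"},
    ],
    "productId": [
        {"name": "pubmed_id", "value": ["22134646"]},
        {"name": "doi", "value": ["10.1038/ismej.2011.139"]},
    ],
}])

summary = summarize_record(SAMPLE)
print(summary["doi"], summary["authors"], len(summary["citations"]))
```

The DOI is looked up by the "name" key rather than by position, since the order of the productId entries is not something the record guarantees.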
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'
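
The four curl calls above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The helper names build_request and fetch and the short format keys are assumptions made for illustration; the URL and media types are the ones shown in the curl commands. Whether the endpoint still responds is not checked here.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1038/ismej.2011.139"

# Media types taken from the curl examples; the dictionary keys are hypothetical labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to inspect offline.
req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```

Calling fetch("json-ld") would correspond to the first curl command; it is left uncalled here because it depends on the remote service.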


 

This table displays all metadata directly associated with this object as RDF triples.

270 TRIPLES      21 PREDICATES      66 URIs      29 LITERALS      17 BLANK NODES
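
In the triples listed here, schema:author points at a blank node, and author order is stored as an RDF collection: each blank node carries an rdf:first (the author) and an rdf:rest (the next node), ending at rdf:nil. A minimal sketch of walking that chain follows; the dictionary layout and the function name walk_rdf_list are assumptions for illustration, while the node and person identifiers are copied from the table.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), as given in the triples table.
AUTHOR_NODES = {
    "N2be7c77943984226b7a5bb67f1be0db6": ("sg:person.01324411177.44", "N8795c9a8755e4dc68cef1f57e7b3ff17"),
    "N8795c9a8755e4dc68cef1f57e7b3ff17": ("sg:person.01370663340.52", "Na7ed259b536c4d128ff0b3365c8cbc83"),
    "Na7ed259b536c4d128ff0b3365c8cbc83": ("sg:person.01216375001.41", "N1800729b5e954ade89a0dff19b5efc8b"),
    "N1800729b5e954ade89a0dff19b5efc8b": ("sg:person.01303716266.13", "N1ec1f8f672a647aa80278af8a4e1e0bd"),
    "N1ec1f8f672a647aa80278af8a4e1e0bd": ("sg:person.01333365763.05", "Nb987d9d2e4ee4e3093d2f5bb9e7f33a4"),
    "Nb987d9d2e4ee4e3093d2f5bb9e7f33a4": ("sg:person.0757405471.24", "N76e1c616cfb3475f90ed1341059d5216"),
    "N76e1c616cfb3475f90ed1341059d5216": ("sg:person.01160741113.38", "Nfa0b53a339454f50bf7c115452fd8b05"),
    "Nfa0b53a339454f50bf7c115452fd8b05": ("sg:person.016311745377.96", "N2a3cc6ab7a6e4a5baf787a31257157fd"),
    "N2a3cc6ab7a6e4a5baf787a31257157fd": ("sg:person.01055510700.73", RDF_NIL),
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head, collecting each rdf:first until rdf:nil."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# The head node is the object of the schema:author triple.
authors = walk_rdf_list("N2be7c77943984226b7a5bb67f1be0db6", AUTHOR_NODES)
print(len(authors), authors[0], authors[-1])
```

The blank node labels are arbitrary and sort in no meaningful order in the table, which is why the list has to be followed link by link rather than read top to bottom.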

Subject Predicate Object
1 sg:pub.10.1038/ismej.2011.139 schema:about N049c6c44ad88454a9a9a4ba6ab3bae05
2 N55165d5f776c45829f14693d95d48f0d
3 N75a165dd0642479cb0f9523830d7d041
4 N842ff2675c04452d8fda8ae5830d49de
5 Nb7a29ecc20ee4445bd7b5a9af26ec6c5
6 Nbcc336490b7c4f768b1c9475edcff955
7 Ne04ba3b5f5094e3fb91eb88494f90349
8 Nfd1fe70b87594127ae613d904927e59f
9 anzsrc-for:08
10 anzsrc-for:0806
11 schema:author N2be7c77943984226b7a5bb67f1be0db6
12 schema:citation sg:pub.10.1007/978-0-387-21609-6_8
13 sg:pub.10.1038/ismej.2011.82
14 sg:pub.10.1038/nature06244
15 sg:pub.10.1038/nature08656
16 sg:pub.10.1038/nrmicro2119
17 sg:pub.10.1186/1471-2105-3-2
18 sg:pub.10.1186/1471-2105-8-402
19 sg:pub.10.1186/gb-2007-8-8-r171
20 https://app.dimensions.ai/details/publication/pub.1083201581
21 https://app.dimensions.ai/details/publication/pub.1083324436
22 https://doi.org/10.1016/j.mib.2008.09.011
23 https://doi.org/10.1016/j.syapm.2008.08.003
24 https://doi.org/10.1093/bioinformatics/btp157
25 https://doi.org/10.1093/bioinformatics/btp636
26 https://doi.org/10.1093/nar/gkh293
27 https://doi.org/10.1093/nar/gkm864
28 https://doi.org/10.1093/nar/gkn491
29 https://doi.org/10.1093/nar/gkn879
30 https://doi.org/10.1093/nar/gkq1172
31 https://doi.org/10.1099/ijs.0.64915-0
32 https://doi.org/10.1101/gr.096651.109
33 https://doi.org/10.1101/gr.112730.110
34 https://doi.org/10.1111/j.1574-6941.2001.tb00791.x
35 https://doi.org/10.1126/science.1123061
36 https://doi.org/10.1128/aem.00062-07
37 https://doi.org/10.1128/aem.03006-05
38 https://doi.org/10.1128/aem.72.5.3685-3695.2006
39 https://doi.org/10.1371/journal.pone.0004192
40 https://doi.org/10.1371/journal.pone.0009490
41 schema:datePublished 2012-03
42 schema:datePublishedReg 2012-03-01
43 schema:description Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/.
44 schema:genre research_article
45 schema:inLanguage en
46 schema:isAccessibleForFree true
47 schema:isPartOf N62e0724b1c694fe183427d8bed5fa826
48 Ne8914aaa23be4469954f9da2dc2e873c
49 sg:journal.1038436
50 schema:name An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea
51 schema:pagination 610
52 schema:productId N32fea1171779456297c292ed266ae3aa
53 N4cb1bd06c1134942a38d5939ecbf1b22
54 Na611178e3f53450e97c101f78c5a991f
55 Nba5c62c007cb4d5f9bd0190b55892b13
56 Ndad89fa9cc8f426cb17e5dd11a177167
57 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051863807
58 https://doi.org/10.1038/ismej.2011.139
59 schema:sdDatePublished 2019-04-10T14:48
60 schema:sdLicense https://scigraph.springernature.com/explorer/license/
61 schema:sdPublisher Nca07da014bf74faca101cf88c3b117d9
62 schema:url https://www.nature.com/articles/ismej2011139
63 sgo:license sg:explorer/license/
64 sgo:sdDataset articles
65 rdf:type schema:ScholarlyArticle
66 N049c6c44ad88454a9a9a4ba6ab3bae05 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
67 schema:name Classification
68 rdf:type schema:DefinedTerm
69 N1800729b5e954ade89a0dff19b5efc8b rdf:first sg:person.01303716266.13
70 rdf:rest N1ec1f8f672a647aa80278af8a4e1e0bd
71 N1ec1f8f672a647aa80278af8a4e1e0bd rdf:first sg:person.01333365763.05
72 rdf:rest Nb987d9d2e4ee4e3093d2f5bb9e7f33a4
73 N2a3cc6ab7a6e4a5baf787a31257157fd rdf:first sg:person.01055510700.73
74 rdf:rest rdf:nil
75 N2be7c77943984226b7a5bb67f1be0db6 rdf:first sg:person.01324411177.44
76 rdf:rest N8795c9a8755e4dc68cef1f57e7b3ff17
77 N32fea1171779456297c292ed266ae3aa schema:name doi
78 schema:value 10.1038/ismej.2011.139
79 rdf:type schema:PropertyValue
80 N4cb1bd06c1134942a38d5939ecbf1b22 schema:name nlm_unique_id
81 schema:value 101301086
82 rdf:type schema:PropertyValue
83 N55165d5f776c45829f14693d95d48f0d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
84 schema:name RNA, Ribosomal, 16S
85 rdf:type schema:DefinedTerm
86 N62e0724b1c694fe183427d8bed5fa826 schema:volumeNumber 6
87 rdf:type schema:PublicationVolume
88 N75a165dd0642479cb0f9523830d7d041 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
89 schema:name Archaea
90 rdf:type schema:DefinedTerm
91 N76e1c616cfb3475f90ed1341059d5216 rdf:first sg:person.01160741113.38
92 rdf:rest Nfa0b53a339454f50bf7c115452fd8b05
93 N842ff2675c04452d8fda8ae5830d49de schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
94 schema:name Metagenomics
95 rdf:type schema:DefinedTerm
96 N8795c9a8755e4dc68cef1f57e7b3ff17 rdf:first sg:person.01370663340.52
97 rdf:rest Na7ed259b536c4d128ff0b3365c8cbc83
98 N9e63b32ac93c4096addd2dde1ffaad38 schema:name Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences and Institute for Molecular Bioscience, St Lucia, Queensland, Australia
99 rdf:type schema:Organization
100 Na611178e3f53450e97c101f78c5a991f schema:name readcube_id
101 schema:value f70cb4f18da9d34128b19e8c69fbd8cacfbb1e12afa84328fe7e3c698dd1e6ca
102 rdf:type schema:PropertyValue
103 Na7ed259b536c4d128ff0b3365c8cbc83 rdf:first sg:person.01216375001.41
104 rdf:rest N1800729b5e954ade89a0dff19b5efc8b
105 Nb7a29ecc20ee4445bd7b5a9af26ec6c5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
106 schema:name Phylogeny
107 rdf:type schema:DefinedTerm
108 Nb987d9d2e4ee4e3093d2f5bb9e7f33a4 rdf:first sg:person.0757405471.24
109 rdf:rest N76e1c616cfb3475f90ed1341059d5216
110 Nba5c62c007cb4d5f9bd0190b55892b13 schema:name pubmed_id
111 schema:value 22134646
112 rdf:type schema:PropertyValue
113 Nbcc336490b7c4f768b1c9475edcff955 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
114 schema:name Software
115 rdf:type schema:DefinedTerm
116 Nca07da014bf74faca101cf88c3b117d9 schema:name Springer Nature - SN SciGraph project
117 rdf:type schema:Organization
118 Ndad89fa9cc8f426cb17e5dd11a177167 schema:name dimensions_id
119 schema:value pub.1051863807
120 rdf:type schema:PropertyValue
121 Ne04ba3b5f5094e3fb91eb88494f90349 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
122 schema:name Bacteria
123 rdf:type schema:DefinedTerm
124 Ne8914aaa23be4469954f9da2dc2e873c schema:issueNumber 3
125 rdf:type schema:PublicationIssue
126 Nfa0b53a339454f50bf7c115452fd8b05 rdf:first sg:person.016311745377.96
127 rdf:rest N2a3cc6ab7a6e4a5baf787a31257157fd
128 Nfd1fe70b87594127ae613d904927e59f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
129 schema:name Databases, Genetic
130 rdf:type schema:DefinedTerm
131 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
132 schema:name Information and Computing Sciences
133 rdf:type schema:DefinedTerm
134 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
135 schema:name Information Systems
136 rdf:type schema:DefinedTerm
137 sg:grant.2529347 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.139
138 rdf:type schema:MonetaryGrant
139 sg:grant.2691272 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.139
140 rdf:type schema:MonetaryGrant
141 sg:grant.2705031 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.139
142 rdf:type schema:MonetaryGrant
143 sg:journal.1038436 schema:issn 1751-7362
144 1751-7370
145 schema:name The ISME Journal
146 rdf:type schema:Periodical
147 sg:person.01055510700.73 schema:affiliation N9e63b32ac93c4096addd2dde1ffaad38
148 schema:familyName Hugenholtz
149 schema:givenName Philip
150 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055510700.73
151 rdf:type schema:Person
152 sg:person.01160741113.38 schema:affiliation https://www.grid.ac/institutes/grid.184769.5
153 schema:familyName Andersen
154 schema:givenName Gary L
155 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01160741113.38
156 rdf:type schema:Person
157 sg:person.01216375001.41 schema:affiliation https://www.grid.ac/institutes/grid.266190.a
158 schema:familyName Goodrich
159 schema:givenName Julia
160 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01216375001.41
161 rdf:type schema:Person
162 sg:person.01303716266.13 schema:affiliation https://www.grid.ac/institutes/grid.413575.1
163 schema:familyName Nawrocki
164 schema:givenName Eric P
165 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01303716266.13
166 rdf:type schema:Person
167 sg:person.01324411177.44 schema:affiliation https://www.grid.ac/institutes/grid.266190.a
168 schema:familyName McDonald
169 schema:givenName Daniel
170 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324411177.44
171 rdf:type schema:Person
172 sg:person.01333365763.05 schema:affiliation https://www.grid.ac/institutes/grid.452682.f
173 schema:familyName DeSantis
174 schema:givenName Todd Z
175 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01333365763.05
176 rdf:type schema:Person
177 sg:person.01370663340.52 schema:affiliation https://www.grid.ac/institutes/grid.184769.5
178 schema:familyName Price
179 schema:givenName Morgan N
180 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01370663340.52
181 rdf:type schema:Person
182 sg:person.016311745377.96 schema:affiliation https://www.grid.ac/institutes/grid.413575.1
183 schema:familyName Knight
184 schema:givenName Rob
185 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016311745377.96
186 rdf:type schema:Person
187 sg:person.0757405471.24 schema:affiliation https://www.grid.ac/institutes/grid.184769.5
188 schema:familyName Probst
189 schema:givenName Alexander
190 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757405471.24
191 rdf:type schema:Person
192 sg:pub.10.1007/978-0-387-21609-6_8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015420409
193 https://doi.org/10.1007/978-0-387-21609-6_8
194 rdf:type schema:CreativeWork
195 sg:pub.10.1038/ismej.2011.82 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033211760
196 https://doi.org/10.1038/ismej.2011.82
197 rdf:type schema:CreativeWork
198 sg:pub.10.1038/nature06244 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009917183
199 https://doi.org/10.1038/nature06244
200 rdf:type schema:CreativeWork
201 sg:pub.10.1038/nature08656 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013886837
202 https://doi.org/10.1038/nature08656
203 rdf:type schema:CreativeWork
204 sg:pub.10.1038/nrmicro2119 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030726668
205 https://doi.org/10.1038/nrmicro2119
206 rdf:type schema:CreativeWork
207 sg:pub.10.1186/1471-2105-3-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017682287
208 https://doi.org/10.1186/1471-2105-3-2
209 rdf:type schema:CreativeWork
210 sg:pub.10.1186/1471-2105-8-402 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033706648
211 https://doi.org/10.1186/1471-2105-8-402
212 rdf:type schema:CreativeWork
213 sg:pub.10.1186/gb-2007-8-8-r171 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036049247
214 https://doi.org/10.1186/gb-2007-8-8-r171
215 rdf:type schema:CreativeWork
216 https://app.dimensions.ai/details/publication/pub.1083201581 rdf:type schema:CreativeWork
217 https://app.dimensions.ai/details/publication/pub.1083324436 rdf:type schema:CreativeWork
218 https://doi.org/10.1016/j.mib.2008.09.011 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011594842
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1016/j.syapm.2008.08.003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053173751
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1093/bioinformatics/btp157 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018224286
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1093/bioinformatics/btp636 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034512291
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1093/nar/gkh293 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047454621
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1093/nar/gkm864 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010797283
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1093/nar/gkn491 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044732652
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1093/nar/gkn879 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044918953
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1093/nar/gkq1172 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015262597
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1099/ijs.0.64915-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013602089
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1101/gr.096651.109 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034877175
239 rdf:type schema:CreativeWork
240 https://doi.org/10.1101/gr.112730.110 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007579011
241 rdf:type schema:CreativeWork
242 https://doi.org/10.1111/j.1574-6941.2001.tb00791.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1032787651
243 rdf:type schema:CreativeWork
244 https://doi.org/10.1126/science.1123061 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043495787
245 rdf:type schema:CreativeWork
246 https://doi.org/10.1128/aem.00062-07 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045980007
247 rdf:type schema:CreativeWork
248 https://doi.org/10.1128/aem.03006-05 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034568952
249 rdf:type schema:CreativeWork
250 https://doi.org/10.1128/aem.72.5.3685-3695.2006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014115558
251 rdf:type schema:CreativeWork
252 https://doi.org/10.1371/journal.pone.0004192 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037765545
253 rdf:type schema:CreativeWork
254 https://doi.org/10.1371/journal.pone.0009490 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000778834
255 rdf:type schema:CreativeWork
256 https://www.grid.ac/institutes/grid.184769.5 schema:alternateName Lawrence Berkeley National Laboratory
257 schema:name Lawrence Berkeley National Laboratory, Center for Environmental Biotechnology, Berkeley, CA, USA
258 Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, CA, USA
259 rdf:type schema:Organization
260 https://www.grid.ac/institutes/grid.266190.a schema:alternateName University of Colorado Boulder
261 schema:name Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA
262 rdf:type schema:Organization
263 https://www.grid.ac/institutes/grid.413575.1 schema:alternateName Howard Hughes Medical Institute
264 schema:name Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA
265 Howard Hughes Medical Institute, Boulder, CO, USA
266 Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
267 rdf:type schema:Organization
268 https://www.grid.ac/institutes/grid.452682.f schema:alternateName Second Genome (United States)
269 schema:name Department of Bioinformatics, Second Genome Inc., San Bruno, CA, USA
270 rdf:type schema:Organization
 



