An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2012-03

AUTHORS

Daniel McDonald, Morgan N Price, Julia Goodrich, Eric P Nawrocki, Todd Z DeSantis, Alexander Probst, Gary L Andersen, Rob Knight, Philip Hugenholtz

ABSTRACT

Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/.

PAGES

610

Identifiers

URI

http://scigraph.springernature.com/pub.10.1038/ismej.2011.139

DOI

http://dx.doi.org/10.1038/ismej.2011.139

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1051863807

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/22134646



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Archaea", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Bacteria", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Classification", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Genetic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Metagenomics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Phylogeny", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "RNA, Ribosomal, 16S", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Colorado Boulder", 
          "id": "https://www.grid.ac/institutes/grid.266190.a", 
          "name": [
            "Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "McDonald", 
        "givenName": "Daniel", 
        "id": "sg:person.01324411177.44", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324411177.44"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Lawrence Berkeley National Laboratory", 
          "id": "https://www.grid.ac/institutes/grid.184769.5", 
          "name": [
            "Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Price", 
        "givenName": "Morgan N", 
        "id": "sg:person.01370663340.52", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01370663340.52"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Colorado Boulder", 
          "id": "https://www.grid.ac/institutes/grid.266190.a", 
          "name": [
            "Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Goodrich", 
        "givenName": "Julia", 
        "id": "sg:person.01216375001.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01216375001.41"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Howard Hughes Medical Institute", 
          "id": "https://www.grid.ac/institutes/grid.413575.1", 
          "name": [
            "Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Nawrocki", 
        "givenName": "Eric P", 
        "id": "sg:person.01303716266.13", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01303716266.13"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Second Genome (United States)", 
          "id": "https://www.grid.ac/institutes/grid.452682.f", 
          "name": [
            "Department of Bioinformatics, Second Genome Inc., San Bruno, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "DeSantis", 
        "givenName": "Todd Z", 
        "id": "sg:person.01333365763.05", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01333365763.05"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Lawrence Berkeley National Laboratory", 
          "id": "https://www.grid.ac/institutes/grid.184769.5", 
          "name": [
            "Lawrence Berkeley National Laboratory, Center for Environmental Biotechnology, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Probst", 
        "givenName": "Alexander", 
        "id": "sg:person.0757405471.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757405471.24"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Lawrence Berkeley National Laboratory", 
          "id": "https://www.grid.ac/institutes/grid.184769.5", 
          "name": [
            "Lawrence Berkeley National Laboratory, Center for Environmental Biotechnology, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Andersen", 
        "givenName": "Gary L", 
        "id": "sg:person.01160741113.38", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01160741113.38"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Howard Hughes Medical Institute", 
          "id": "https://www.grid.ac/institutes/grid.413575.1", 
          "name": [
            "Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA", 
            "Howard Hughes Medical Institute, Boulder, CO, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Knight", 
        "givenName": "Rob", 
        "id": "sg:person.016311745377.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016311745377.96"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences and Institute for Molecular Bioscience, St Lucia, Queensland, Australia"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hugenholtz", 
        "givenName": "Philip", 
        "id": "sg:person.01055510700.73", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055510700.73"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1371/journal.pone.0009490", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000778834"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.112730.110", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007579011"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature06244", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009917183", 
          "https://doi.org/10.1038/nature06244"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkm864", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010797283"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.mib.2008.09.011", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011594842"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1099/ijs.0.64915-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013602089"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature08656", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013886837", 
          "https://doi.org/10.1038/nature08656"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.72.5.3685-3695.2006", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014115558"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkq1172", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015262597"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-0-387-21609-6_8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015420409", 
          "https://doi.org/10.1007/978-0-387-21609-6_8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-3-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017682287", 
          "https://doi.org/10.1186/1471-2105-3-2"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp157", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018224286"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrmicro2119", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030726668", 
          "https://doi.org/10.1038/nrmicro2119"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1574-6941.2001.tb00791.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032787651"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ismej.2011.82", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033211760", 
          "https://doi.org/10.1038/ismej.2011.82"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-8-402", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033706648", 
          "https://doi.org/10.1186/1471-2105-8-402"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp636", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034512291"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.03006-05", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034568952"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.096651.109", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034877175"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2007-8-8-r171", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036049247", 
          "https://doi.org/10.1186/gb-2007-8-8-r171"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0004192", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1037765545"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.1123061", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043495787"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkn491", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044732652"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkn879", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044918953"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.00062-07", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045980007"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkh293", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047454621"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.syapm.2008.08.003", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053173751"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083201581", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083324436", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2012-03", 
    "datePublishedReg": "2012-03-01", 
    "description": "Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1038/ismej.2011.139", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2705031", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2529347", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2691272", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1038436", 
        "issn": [
          "1751-7362", 
          "1751-7370"
        ], 
        "name": "The ISME Journal", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "3", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "6"
      }
    ], 
    "name": "An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea", 
    "pagination": "610", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "f70cb4f18da9d34128b19e8c69fbd8cacfbb1e12afa84328fe7e3c698dd1e6ca"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "22134646"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101301086"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1038/ismej.2011.139"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1051863807"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1038/ismej.2011.139", 
      "https://app.dimensions.ai/details/publication/pub.1051863807"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T14:48", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8663_00000426.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://www.nature.com/articles/ismej2011139"
  }
]
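
Reading fields out of this record needs nothing beyond a JSON parser. The following is a minimal Python sketch, not part of SciGraph itself: the embedded record is a trimmed copy of the one above (two authors and three citations only), and the helper names `author_names` and `citation_dois` are hypothetical. It collects author names and the DOI of each cited work, looking first at `id` and then at `sameAs`, and skips citations that carry no DOI link.

```python
import json

# Trimmed copy of the record above; most fields are omitted for brevity.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1038/ismej.2011.139",
    "author": [
      {"familyName": "McDonald", "givenName": "Daniel"},
      {"familyName": "Price", "givenName": "Morgan N"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1371/journal.pone.0009490"},
      {"id": "sg:pub.10.1038/nature06244",
       "sameAs": ["https://app.dimensions.ai/details/publication/pub.1009917183",
                  "https://doi.org/10.1038/nature06244"]},
      {"id": "https://app.dimensions.ai/details/publication/pub.1083201581"}
    ]
  }
]
"""

DOI_PREFIX = "https://doi.org/"


def author_names(record):
    """Return 'givenName familyName' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def citation_dois(record):
    """Return the DOI of every citation that exposes one, without repeats."""
    dois = []
    for cit in record.get("citation", []):
        # The DOI link may be the citation's own id or one of its sameAs links.
        candidates = [cit.get("id", "")] + list(cit.get("sameAs", []))
        for url in candidates:
            if url.startswith(DOI_PREFIX):
                doi = url[len(DOI_PREFIX):]
                if doi not in dois:
                    dois.append(doi)
                break
    return dois


# The top-level JSON value is a one-element array holding the article record.
record = json.loads(RECORD_JSON)[0]
print(author_names(record))
print(citation_dois(record))
```

The third citation in the sample has only a Dimensions link, so it contributes no DOI.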
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/ismej.2011.139'
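
The four curl calls differ only in their Accept header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same requests using only the standard library follows; the names `ACCEPT_TYPES`, `build_request` and `fetch` are hypothetical, and whether the endpoint still answers is not something this sketch checks.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1038/ismej.2011.139"

# Media types copied from the curl examples; the short keys are illustrative.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking the server for the given serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch("turtle")` would correspond to the Turtle curl command; `build_request` alone can be inspected without any network traffic.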


 

This table displays all metadata directly associated with this object as RDF triples.

270 TRIPLES      21 PREDICATES      66 URIs      29 LITERALS      17 BLANK NODES
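
In the triples listed in this section, the schema:author value is a blank node that heads an RDF collection: each node carries rdf:first (one author) and rdf:rest (the next node), and the chain ends at rdf:nil. The Python sketch below walks that chain to recover the author order. The node identifiers and family names are copied from the table rows; the dictionary layout and the function name `walk_rdf_list` are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"
AUTHOR_HEAD = "Nfa11ab427e434d849c2c9ae1c7e344d3"  # object of schema:author

# Blank node -> (rdf:first, rdf:rest), copied from the table rows.
AUTHOR_LIST = {
    "Nfa11ab427e434d849c2c9ae1c7e344d3": ("sg:person.01324411177.44", "N253371d3fee14c32b55151b0667156e2"),
    "N253371d3fee14c32b55151b0667156e2": ("sg:person.01370663340.52", "Nd0174e70f59449a882a8b77b6eefc0ac"),
    "Nd0174e70f59449a882a8b77b6eefc0ac": ("sg:person.01216375001.41", "N20a6d34e6b52456ba44566402948fecf"),
    "N20a6d34e6b52456ba44566402948fecf": ("sg:person.01303716266.13", "Ne1a2fb7224f94a05a1d4b63ecf23f95b"),
    "Ne1a2fb7224f94a05a1d4b63ecf23f95b": ("sg:person.01333365763.05", "Ne0808f83468b42c2bb5648447305c501"),
    "Ne0808f83468b42c2bb5648447305c501": ("sg:person.0757405471.24", "N2204341764444be5936c07b2e41dafc4"),
    "N2204341764444be5936c07b2e41dafc4": ("sg:person.01160741113.38", "N02ef480013564eea94a7d87cf767d7a1"),
    "N02ef480013564eea94a7d87cf767d7a1": ("sg:person.016311745377.96", "N86d96d7ac20e469287b0cf90c67e83bc"),
    "N86d96d7ac20e469287b0cf90c67e83bc": ("sg:person.01055510700.73", RDF_NIL),
}

# schema:familyName values for the same person identifiers.
FAMILY_NAMES = {
    "sg:person.01324411177.44": "McDonald",
    "sg:person.01370663340.52": "Price",
    "sg:person.01216375001.41": "Goodrich",
    "sg:person.01303716266.13": "Nawrocki",
    "sg:person.01333365763.05": "DeSantis",
    "sg:person.0757405471.24": "Probst",
    "sg:person.01160741113.38": "Andersen",
    "sg:person.016311745377.96": "Knight",
    "sg:person.01055510700.73": "Hugenholtz",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head to rdf:nil, collecting rdf:first values."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle in RDF list at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


ordered_authors = [FAMILY_NAMES[p] for p in walk_rdf_list(AUTHOR_HEAD, AUTHOR_LIST)]
print(ordered_authors)
```

The recovered sequence can be compared against the AUTHORS line at the top of the record, since the table rows themselves are sorted by node identifier rather than by author position.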

Subject Predicate Object
1 sg:pub.10.1038/ismej.2011.139 schema:about N218cfa68d38a4c9b90623411ba433814
2 N3bcaf5f3bba54ad3ad847a64b7baf624
3 N64e197b293724b9d9eafb18f16200b3a
4 N9905d3af829944e697b797b77e1441b3
5 Ncbf2c15a658d4dfda85e573b133e45f1
6 Ncd0aaa35e9f041b8aa712e983a218e48
7 Ne8efa397da1f42f6b9c7af6d8c97c922
8 Nf46329a4360142eeaa970766a3e2d3d6
9 anzsrc-for:08
10 anzsrc-for:0806
11 schema:author Nfa11ab427e434d849c2c9ae1c7e344d3
12 schema:citation sg:pub.10.1007/978-0-387-21609-6_8
13 sg:pub.10.1038/ismej.2011.82
14 sg:pub.10.1038/nature06244
15 sg:pub.10.1038/nature08656
16 sg:pub.10.1038/nrmicro2119
17 sg:pub.10.1186/1471-2105-3-2
18 sg:pub.10.1186/1471-2105-8-402
19 sg:pub.10.1186/gb-2007-8-8-r171
20 https://app.dimensions.ai/details/publication/pub.1083201581
21 https://app.dimensions.ai/details/publication/pub.1083324436
22 https://doi.org/10.1016/j.mib.2008.09.011
23 https://doi.org/10.1016/j.syapm.2008.08.003
24 https://doi.org/10.1093/bioinformatics/btp157
25 https://doi.org/10.1093/bioinformatics/btp636
26 https://doi.org/10.1093/nar/gkh293
27 https://doi.org/10.1093/nar/gkm864
28 https://doi.org/10.1093/nar/gkn491
29 https://doi.org/10.1093/nar/gkn879
30 https://doi.org/10.1093/nar/gkq1172
31 https://doi.org/10.1099/ijs.0.64915-0
32 https://doi.org/10.1101/gr.096651.109
33 https://doi.org/10.1101/gr.112730.110
34 https://doi.org/10.1111/j.1574-6941.2001.tb00791.x
35 https://doi.org/10.1126/science.1123061
36 https://doi.org/10.1128/aem.00062-07
37 https://doi.org/10.1128/aem.03006-05
38 https://doi.org/10.1128/aem.72.5.3685-3695.2006
39 https://doi.org/10.1371/journal.pone.0004192
40 https://doi.org/10.1371/journal.pone.0009490
41 schema:datePublished 2012-03
42 schema:datePublishedReg 2012-03-01
43 schema:description Reference phylogenies are crucial for providing a taxonomic framework for interpretation of marker gene and metagenomic surveys, which continue to reveal novel species at a remarkable rate. Greengenes is a dedicated full-length 16S rRNA gene database that provides users with a curated taxonomy based on de novo tree inference. We developed a 'taxonomy to tree' approach for transferring group names from an existing taxonomy to a tree topology, and used it to apply the Greengenes, National Center for Biotechnology Information (NCBI) and cyanoDB (Cyanobacteria only) taxonomies to a de novo tree comprising 408,315 sequences. We also incorporated explicit rank information provided by the NCBI taxonomy to group names (by prefixing rank designations) for better user orientation and classification consistency. The resulting merged taxonomy improved the classification of 75% of the sequences by one or more ranks relative to the original NCBI taxonomy with the most pronounced improvements occurring in under-classified environmental sequences. We also assessed candidate phyla (divisions) currently defined by NCBI and present recommendations for consolidation of 34 redundantly named groups. All intermediate results from the pipeline, which includes tree inference, jackknifing and transfer of a donor taxonomy to a recipient tree (tax2tree) are available for download. The improved Greengenes taxonomy should provide important infrastructure for a wide range of megasequencing projects studying ecosystems on scales ranging from our own bodies (the Human Microbiome Project) to the entire planet (the Earth Microbiome Project). The implementation of the software can be obtained from http://sourceforge.net/projects/tax2tree/.
44 schema:genre research_article
45 schema:inLanguage en
46 schema:isAccessibleForFree true
47 schema:isPartOf N8ad5961af8a64669a8c5f76c53a5dd35
48 N8f5c32ebba30474d891ecd546e60468c
49 sg:journal.1038436
50 schema:name An improved Greengenes taxonomy with explicit ranks for ecological and evolutionary analyses of bacteria and archaea
51 schema:pagination 610
52 schema:productId N0bf8c2a027194470a496853f2e899cce
53 N517344b41fd042b2ada1361d0b36f184
54 N72632de0baec42efa141f77b6fa3aaa1
55 N894ec98053d149bba27d8453def5e1e6
56 Ne8572d75f9794410a85c150ee86b7f5e
57 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051863807
58 https://doi.org/10.1038/ismej.2011.139
59 schema:sdDatePublished 2019-04-10T14:48
60 schema:sdLicense https://scigraph.springernature.com/explorer/license/
61 schema:sdPublisher N7d078595c6e7407abff897cf985430bc
62 schema:url https://www.nature.com/articles/ismej2011139
63 sgo:license sg:explorer/license/
64 sgo:sdDataset articles
65 rdf:type schema:ScholarlyArticle
66 N02ef480013564eea94a7d87cf767d7a1 rdf:first sg:person.016311745377.96
67 rdf:rest N86d96d7ac20e469287b0cf90c67e83bc
68 N0bf8c2a027194470a496853f2e899cce schema:name nlm_unique_id
69 schema:value 101301086
70 rdf:type schema:PropertyValue
71 N20a6d34e6b52456ba44566402948fecf rdf:first sg:person.01303716266.13
72 rdf:rest Ne1a2fb7224f94a05a1d4b63ecf23f95b
73 N218cfa68d38a4c9b90623411ba433814 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
74 schema:name Classification
75 rdf:type schema:DefinedTerm
76 N2204341764444be5936c07b2e41dafc4 rdf:first sg:person.01160741113.38
77 rdf:rest N02ef480013564eea94a7d87cf767d7a1
78 N253371d3fee14c32b55151b0667156e2 rdf:first sg:person.01370663340.52
79 rdf:rest Nd0174e70f59449a882a8b77b6eefc0ac
80 N3bcaf5f3bba54ad3ad847a64b7baf624 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
81 schema:name Bacteria
82 rdf:type schema:DefinedTerm
83 N517344b41fd042b2ada1361d0b36f184 schema:name doi
84 schema:value 10.1038/ismej.2011.139
85 rdf:type schema:PropertyValue
86 N64e197b293724b9d9eafb18f16200b3a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
87 schema:name Archaea
88 rdf:type schema:DefinedTerm
89 N72632de0baec42efa141f77b6fa3aaa1 schema:name readcube_id
90 schema:value f70cb4f18da9d34128b19e8c69fbd8cacfbb1e12afa84328fe7e3c698dd1e6ca
91 rdf:type schema:PropertyValue
92 N7d078595c6e7407abff897cf985430bc schema:name Springer Nature - SN SciGraph project
93 rdf:type schema:Organization
94 N86d96d7ac20e469287b0cf90c67e83bc rdf:first sg:person.01055510700.73
95 rdf:rest rdf:nil
96 N894ec98053d149bba27d8453def5e1e6 schema:name dimensions_id
97 schema:value pub.1051863807
98 rdf:type schema:PropertyValue
99 N8ad5961af8a64669a8c5f76c53a5dd35 schema:volumeNumber 6
100 rdf:type schema:PublicationVolume
101 N8f5c32ebba30474d891ecd546e60468c schema:issueNumber 3
102 rdf:type schema:PublicationIssue
103 N9905d3af829944e697b797b77e1441b3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
104 schema:name Databases, Genetic
105 rdf:type schema:DefinedTerm
106 Ncbf2c15a658d4dfda85e573b133e45f1 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
107 schema:name Software
108 rdf:type schema:DefinedTerm
109 Ncd0aaa35e9f041b8aa712e983a218e48 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
110 schema:name RNA, Ribosomal, 16S
111 rdf:type schema:DefinedTerm
112 Nd0174e70f59449a882a8b77b6eefc0ac rdf:first sg:person.01216375001.41
113 rdf:rest N20a6d34e6b52456ba44566402948fecf
114 Ne0808f83468b42c2bb5648447305c501 rdf:first sg:person.0757405471.24
115 rdf:rest N2204341764444be5936c07b2e41dafc4
116 Ne1a2fb7224f94a05a1d4b63ecf23f95b rdf:first sg:person.01333365763.05
117 rdf:rest Ne0808f83468b42c2bb5648447305c501
118 Ne8572d75f9794410a85c150ee86b7f5e schema:name pubmed_id
119 schema:value 22134646
120 rdf:type schema:PropertyValue
121 Ne8efa397da1f42f6b9c7af6d8c97c922 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
122 schema:name Metagenomics
123 rdf:type schema:DefinedTerm
124 Nf46329a4360142eeaa970766a3e2d3d6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
125 schema:name Phylogeny
126 rdf:type schema:DefinedTerm
127 Nfa11ab427e434d849c2c9ae1c7e344d3 rdf:first sg:person.01324411177.44
128 rdf:rest N253371d3fee14c32b55151b0667156e2
129 Nff712e3f517647c982badd079422061b schema:name Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences and Institute for Molecular Bioscience, St Lucia, Queensland, Australia
130 rdf:type schema:Organization
131 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
132 schema:name Information and Computing Sciences
133 rdf:type schema:DefinedTerm
134 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
135 schema:name Information Systems
136 rdf:type schema:DefinedTerm
137 sg:grant.2529347 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.139
138 rdf:type schema:MonetaryGrant
139 sg:grant.2691272 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.139
140 rdf:type schema:MonetaryGrant
141 sg:grant.2705031 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2011.139
142 rdf:type schema:MonetaryGrant
143 sg:journal.1038436 schema:issn 1751-7362
144 1751-7370
145 schema:name The ISME Journal
146 rdf:type schema:Periodical
147 sg:person.01055510700.73 schema:affiliation Nff712e3f517647c982badd079422061b
148 schema:familyName Hugenholtz
149 schema:givenName Philip
150 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055510700.73
151 rdf:type schema:Person
152 sg:person.01160741113.38 schema:affiliation https://www.grid.ac/institutes/grid.184769.5
153 schema:familyName Andersen
154 schema:givenName Gary L
155 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01160741113.38
156 rdf:type schema:Person
157 sg:person.01216375001.41 schema:affiliation https://www.grid.ac/institutes/grid.266190.a
158 schema:familyName Goodrich
159 schema:givenName Julia
160 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01216375001.41
161 rdf:type schema:Person
162 sg:person.01303716266.13 schema:affiliation https://www.grid.ac/institutes/grid.413575.1
163 schema:familyName Nawrocki
164 schema:givenName Eric P
165 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01303716266.13
166 rdf:type schema:Person
167 sg:person.01324411177.44 schema:affiliation https://www.grid.ac/institutes/grid.266190.a
168 schema:familyName McDonald
169 schema:givenName Daniel
170 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324411177.44
171 rdf:type schema:Person
172 sg:person.01333365763.05 schema:affiliation https://www.grid.ac/institutes/grid.452682.f
173 schema:familyName DeSantis
174 schema:givenName Todd Z
175 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01333365763.05
176 rdf:type schema:Person
177 sg:person.01370663340.52 schema:affiliation https://www.grid.ac/institutes/grid.184769.5
178 schema:familyName Price
179 schema:givenName Morgan N
180 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01370663340.52
181 rdf:type schema:Person
182 sg:person.016311745377.96 schema:affiliation https://www.grid.ac/institutes/grid.413575.1
183 schema:familyName Knight
184 schema:givenName Rob
185 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016311745377.96
186 rdf:type schema:Person
187 sg:person.0757405471.24 schema:affiliation https://www.grid.ac/institutes/grid.184769.5
188 schema:familyName Probst
189 schema:givenName Alexander
190 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757405471.24
191 rdf:type schema:Person
192 sg:pub.10.1007/978-0-387-21609-6_8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015420409
193 https://doi.org/10.1007/978-0-387-21609-6_8
194 rdf:type schema:CreativeWork
195 sg:pub.10.1038/ismej.2011.82 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033211760
196 https://doi.org/10.1038/ismej.2011.82
197 rdf:type schema:CreativeWork
198 sg:pub.10.1038/nature06244 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009917183
199 https://doi.org/10.1038/nature06244
200 rdf:type schema:CreativeWork
201 sg:pub.10.1038/nature08656 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013886837
202 https://doi.org/10.1038/nature08656
203 rdf:type schema:CreativeWork
204 sg:pub.10.1038/nrmicro2119 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030726668
205 https://doi.org/10.1038/nrmicro2119
206 rdf:type schema:CreativeWork
207 sg:pub.10.1186/1471-2105-3-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017682287
208 https://doi.org/10.1186/1471-2105-3-2
209 rdf:type schema:CreativeWork
210 sg:pub.10.1186/1471-2105-8-402 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033706648
211 https://doi.org/10.1186/1471-2105-8-402
212 rdf:type schema:CreativeWork
213 sg:pub.10.1186/gb-2007-8-8-r171 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036049247
214 https://doi.org/10.1186/gb-2007-8-8-r171
215 rdf:type schema:CreativeWork
216 https://app.dimensions.ai/details/publication/pub.1083201581 rdf:type schema:CreativeWork
217 https://app.dimensions.ai/details/publication/pub.1083324436 rdf:type schema:CreativeWork
218 https://doi.org/10.1016/j.mib.2008.09.011 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011594842
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1016/j.syapm.2008.08.003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053173751
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1093/bioinformatics/btp157 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018224286
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1093/bioinformatics/btp636 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034512291
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1093/nar/gkh293 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047454621
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1093/nar/gkm864 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010797283
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1093/nar/gkn491 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044732652
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1093/nar/gkn879 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044918953
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1093/nar/gkq1172 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015262597
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1099/ijs.0.64915-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013602089
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1101/gr.096651.109 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034877175
239 rdf:type schema:CreativeWork
240 https://doi.org/10.1101/gr.112730.110 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007579011
241 rdf:type schema:CreativeWork
242 https://doi.org/10.1111/j.1574-6941.2001.tb00791.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1032787651
243 rdf:type schema:CreativeWork
244 https://doi.org/10.1126/science.1123061 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043495787
245 rdf:type schema:CreativeWork
246 https://doi.org/10.1128/aem.00062-07 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045980007
247 rdf:type schema:CreativeWork
248 https://doi.org/10.1128/aem.03006-05 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034568952
249 rdf:type schema:CreativeWork
250 https://doi.org/10.1128/aem.72.5.3685-3695.2006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014115558
251 rdf:type schema:CreativeWork
252 https://doi.org/10.1371/journal.pone.0004192 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037765545
253 rdf:type schema:CreativeWork
254 https://doi.org/10.1371/journal.pone.0009490 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000778834
255 rdf:type schema:CreativeWork
256 https://www.grid.ac/institutes/grid.184769.5 schema:alternateName Lawrence Berkeley National Laboratory
257 schema:name Lawrence Berkeley National Laboratory, Center for Environmental Biotechnology, Berkeley, CA, USA
258 Lawrence Berkeley National Laboratory, Physical Biosciences Division, Berkeley, CA, USA
259 rdf:type schema:Organization
260 https://www.grid.ac/institutes/grid.266190.a schema:alternateName University of Colorado Boulder
261 schema:name Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA
262 rdf:type schema:Organization
263 https://www.grid.ac/institutes/grid.413575.1 schema:alternateName Howard Hughes Medical Institute
264 schema:name Department of Chemistry & Biochemistry and Biofrontiers Institute, University of Colorado, Boulder, CO, USA
265 Howard Hughes Medical Institute, Boulder, CO, USA
266 Janelia Farm Research Campus, Howard Hughes Medical Institute, Ashburn, VA, USA
267 rdf:type schema:Organization
268 https://www.grid.ac/institutes/grid.452682.f schema:alternateName Second Genome (United States)
269 schema:name Department of Bioinformatics, Second Genome Inc., San Bruno, CA, USA
270 rdf:type schema:Organization
 



