dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2017-07-25

AUTHORS

Matthew R Olm, Christopher T Brown, Brandon Brooks, Jillian F Banfield

ABSTRACT

The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that reduces the computational time for pairwise genome comparisons by sequentially applying a fast, inaccurate estimation of genome distance, and a slow, accurate measure of average nucleotide identity. dRep achieves a 28 × increase in speed with perfect recall and precision when benchmarked against previously developed algorithms. We demonstrate the use of dRep for genome recovery from time-series datasets. Each metagenome was assembled separately, and dRep was used to identify groups of essentially identical genomes and select the best genome from each replicate set. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using co-assembly.
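The two-step idea in the abstract (a cheap, rough distance first, then an expensive, accurate comparison only where it is needed) can be illustrated with a toy sketch. This is not dRep's implementation: the k-mer Jaccard distance, the position-wise identity, the cutoffs, and every function name below are assumptions chosen only to show why the second stage runs on far fewer pairs.

```python
from itertools import combinations


def kmers(seq, k=4):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def fast_distance(a, b, k=4):
    """Cheap stand-in for a rough genome distance: Jaccard distance of k-mer sets."""
    ka, kb = kmers(a, k), kmers(b, k)
    return 1 - len(ka & kb) / len(ka | kb)


def slow_identity(a, b):
    """Toy stand-in for an accurate identity measure: fraction of matching positions."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n


def two_stage_compare(genomes, coarse_cutoff=0.5, identity_cutoff=0.9):
    """Group genomes with the cheap distance, then run the slow measure within groups only."""
    names = sorted(genomes)
    parent = {n: n for n in names}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    # Stage 1: single-linkage grouping on the cheap distance (all pairs).
    for a, b in combinations(names, 2):
        if fast_distance(genomes[a], genomes[b]) <= coarse_cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)

    # Stage 2: the expensive comparison, restricted to pairs inside one group.
    matches, slow_calls = [], 0
    for members in groups.values():
        for a, b in combinations(members, 2):
            slow_calls += 1
            if slow_identity(genomes[a], genomes[b]) >= identity_cutoff:
                matches.append((a, b))
    return matches, slow_calls


# Hypothetical toy sequences: two near-identical pairs from unrelated "organisms".
GENOMES = {
    "A1": "ACGTACGTACGTACGTACGT",
    "A2": "ACGTACGTACGTACGTACGA",
    "B1": "TTGGCCAATTGGCCAATTGG",
    "B2": "TTGGCCAATTGGCCAATTGC",
}

matches, slow_calls = two_stage_compare(GENOMES)
```

With four genomes there are six possible pairs, but the slow measure runs on only the two pairs that share a coarse group. The saving grows with the number of distinct groups in the input.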

PAGES

2864-2868

Identifiers

URI

http://scigraph.springernature.com/pub.10.1038/ismej.2017.126

DOI

http://dx.doi.org/10.1038/ismej.2017.126

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1090891867

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/28742071



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Bacteria", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Metagenome", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Metagenomics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.47840.3f", 
          "name": [
            "Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Olm", 
        "givenName": "Matthew R", 
        "id": "sg:person.010531110205.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010531110205.41"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.47840.3f", 
          "name": [
            "Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Brown", 
        "givenName": "Christopher T", 
        "id": "sg:person.01236044570.46", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01236044570.46"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.47840.3f", 
          "name": [
            "Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Brooks", 
        "givenName": "Brandon", 
        "id": "sg:person.01234072512.13", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01234072512.13"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Earth and Planetary Science, University of California, Berkeley, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.47840.3f", 
          "name": [
            "Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA, USA", 
            "Department of Earth and Planetary Science, University of California, Berkeley, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Banfield", 
        "givenName": "Jillian F", 
        "id": "sg:person.01350542775.47", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01350542775.47"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1186/s13059-016-0997-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050687712", 
          "https://doi.org/10.1186/s13059-016-0997-x"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s40168-017-0270-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085135267", 
          "https://doi.org/10.1186/s40168-017-0270-x"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth.4458", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1092065319", 
          "https://doi.org/10.1038/nmeth.4458"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ismej.2015.241", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032868433", 
          "https://doi.org/10.1038/ismej.2015.241"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature02340", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023089166", 
          "https://doi.org/10.1038/nature02340"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ismej.2016.83", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008606938", 
          "https://doi.org/10.1038/ismej.2016.83"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmicrobiol.2016.24", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052776941", 
          "https://doi.org/10.1038/nmicrobiol.2016.24"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2017-07-25", 
    "datePublishedReg": "2017-07-25", 
    "description": "The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that reduces the computational time for pairwise genome comparisons by sequentially applying a fast, inaccurate estimation of genome distance, and a slow, accurate measure of average nucleotide identity. dRep achieves a 28 \u00d7 increase in speed with perfect recall and precision when benchmarked against previously developed algorithms. We demonstrate the use of dRep for genome recovery from time-series datasets. Each metagenome was assembled separately, and dRep was used to identify groups of essentially identical genomes and select the best genome from each replicate set. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using co-assembly.", 
    "genre": "article", 
    "id": "sg:pub.10.1038/ismej.2017.126", 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.3000627", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2459162", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1038436", 
        "issn": [
          "1751-7362", 
          "1751-7370"
        ], 
        "name": "The ISME Journal: Multidisciplinary Journal of Microbial Ecology", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "12", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "11"
      }
    ], 
    "keywords": [
      "time series datasets", 
      "computational time", 
      "perfect recall", 
      "algorithm", 
      "DREP", 
      "rapid algorithm", 
      "pairwise genome comparisons", 
      "draft-quality genomes", 
      "best genome", 
      "inaccurate estimation", 
      "set", 
      "dataset", 
      "\u00d7 increase", 
      "microbial genomes", 
      "recall", 
      "tool", 
      "precision", 
      "speed", 
      "estimation", 
      "hundreds", 
      "metagenomic studies", 
      "genome comparison", 
      "accurate measure", 
      "number", 
      "genome sets", 
      "program", 
      "time", 
      "distance", 
      "use", 
      "comparison", 
      "part", 
      "genome distance", 
      "high-quality genomes", 
      "measures", 
      "metagenomes", 
      "identity", 
      "genome recovery", 
      "recovery", 
      "years", 
      "study", 
      "genome", 
      "group", 
      "increase", 
      "genomic comparison", 
      "replicates", 
      "identical genomes", 
      "average nucleotide identity", 
      "nucleotide identity"
    ], 
    "name": "dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication", 
    "pagination": "2864-2868", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1090891867"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1038/ismej.2017.126"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "28742071"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1038/ismej.2017.126", 
      "https://app.dimensions.ai/details/publication/pub.1090891867"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-12-01T06:36", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_732.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1038/ismej.2017.126"
  }
]
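As a minimal sketch of how a record like the one above can be read, the following Python snippet parses a trimmed copy of the JSON-LD and pulls out the title, author names, and identifiers. The trimmed `RECORD_JSON` string and the `summarize` helper are illustrative assumptions, not part of SciGraph. Only the standard `json` module is used.

```python
import json

# Trimmed copy of the record above, keeping only the fields read below.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1038/ismej.2017.126",
    "name": "dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication",
    "pagination": "2864-2868",
    "author": [
      {"familyName": "Olm", "givenName": "Matthew R", "type": "Person"},
      {"familyName": "Brown", "givenName": "Christopher T", "type": "Person"},
      {"familyName": "Brooks", "givenName": "Brandon", "type": "Person"},
      {"familyName": "Banfield", "givenName": "Jillian F", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1090891867"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1038/ismej.2017.126"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["28742071"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return the title, ordered author names, and identifiers of the first record."""
    record = json.loads(record_json)[0]  # the payload is a one-element JSON array
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # Each productId entry carries its value as a one-element list.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {"title": record["name"], "authors": authors, "ids": ids}


summary = summarize(RECORD_JSON)
```

In the JSON-LD form the `author` array keeps publication order, so the author list can be read directly without following RDF list nodes.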
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/ismej.2017.126'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/ismej.2017.126'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/ismej.2017.126'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/ismej.2017.126'
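The four curl commands above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The helper names `build_request` and `fetch` and the short format keys are assumptions made for this example, and whether the endpoint still responds has not been verified here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are illustrative.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (without sending) a GET request for one record in the given RDF format."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request is side-effect free; nothing is sent until fetch() is called.
request = build_request("pub.10.1038/ismej.2017.126", "turtle")
```

Calling `fetch("pub.10.1038/ismej.2017.126", "nt")` would correspond to the N-Triples curl command, assuming the service is reachable.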


 

This table displays all metadata directly associated with this object as RDF triples.

189 TRIPLES      21 PREDICATES      86 URIs      71 LITERALS      13 BLANK NODES

Row Subject Predicate Object (a row that omits the subject, or the subject and predicate, repeats them from the row above)
1 sg:pub.10.1038/ismej.2017.126 schema:about N3666a9d891bc4d93af5d9fb51b742d5d
2 N3ea6e1ca6ea745279a504b5bafe03c9e
3 N76b57c1b2bfa4a85a0b2022aa82a3594
4 N9f96a26db3244c23abd4d83f57c9c5a4
5 Nc973f05f503443d091ea4fb776c418c5
6 Ne5ab1cf16bc6496b9df01dc376c21f73
7 anzsrc-for:06
8 anzsrc-for:0604
9 schema:author N6d82047fe985482b9e4baf53caed6366
10 schema:citation sg:pub.10.1038/ismej.2015.241
11 sg:pub.10.1038/ismej.2016.83
12 sg:pub.10.1038/nature02340
13 sg:pub.10.1038/nmeth.4458
14 sg:pub.10.1038/nmicrobiol.2016.24
15 sg:pub.10.1186/s13059-016-0997-x
16 sg:pub.10.1186/s40168-017-0270-x
17 schema:datePublished 2017-07-25
18 schema:datePublishedReg 2017-07-25
19 schema:description The number of microbial genomes sequenced each year is expanding rapidly, in part due to genome-resolved metagenomic studies that routinely recover hundreds of draft-quality genomes. Rapid algorithms have been developed to comprehensively compare large genome sets, but they are not accurate with draft-quality genomes. Here we present dRep, a program that reduces the computational time for pairwise genome comparisons by sequentially applying a fast, inaccurate estimation of genome distance, and a slow, accurate measure of average nucleotide identity. dRep achieves a 28 × increase in speed with perfect recall and precision when benchmarked against previously developed algorithms. We demonstrate the use of dRep for genome recovery from time-series datasets. Each metagenome was assembled separately, and dRep was used to identify groups of essentially identical genomes and select the best genome from each replicate set. This resulted in recovery of significantly more and higher-quality genomes compared to the set recovered using co-assembly.
20 schema:genre article
21 schema:isAccessibleForFree true
22 schema:isPartOf Nb1f5e9a11982400b86fd85e214112400
23 Nf57731d293b04b5aa7e2bec80e1fc1b2
24 sg:journal.1038436
25 schema:keywords DREP
26 accurate measure
27 algorithm
28 average nucleotide identity
29 best genome
30 comparison
31 computational time
32 dataset
33 distance
34 draft-quality genomes
35 estimation
36 genome
37 genome comparison
38 genome distance
39 genome recovery
40 genome sets
41 genomic comparison
42 group
43 high-quality genomes
44 hundreds
45 identical genomes
46 identity
47 inaccurate estimation
48 increase
49 measures
50 metagenomes
51 metagenomic studies
52 microbial genomes
53 nucleotide identity
54 number
55 pairwise genome comparisons
56 part
57 perfect recall
58 precision
59 program
60 rapid algorithm
61 recall
62 recovery
63 replicates
64 set
65 speed
66 study
67 time
68 time series datasets
69 tool
70 use
71 years
72 × increase
73 schema:name dRep: a tool for fast and accurate genomic comparisons that enables improved genome recovery from metagenomes through de-replication
74 schema:pagination 2864-2868
75 schema:productId N69be6d294860482ca05da34ae53440cb
76 N8435f8fd618d454da46a10d168ae92bb
77 Nd716b43ddd1e47dd9c96f875a7ffadcb
78 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090891867
79 https://doi.org/10.1038/ismej.2017.126
80 schema:sdDatePublished 2022-12-01T06:36
81 schema:sdLicense https://scigraph.springernature.com/explorer/license/
82 schema:sdPublisher N68cf2a2d27004bcbbeecb76dce3e0e0d
83 schema:url https://doi.org/10.1038/ismej.2017.126
84 sgo:license sg:explorer/license/
85 sgo:sdDataset articles
86 rdf:type schema:ScholarlyArticle
87 N3666a9d891bc4d93af5d9fb51b742d5d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
88 schema:name Metagenome
89 rdf:type schema:DefinedTerm
90 N395ac40afec1459aa0f8b7369a02de1c rdf:first sg:person.01350542775.47
91 rdf:rest rdf:nil
92 N3ea6e1ca6ea745279a504b5bafe03c9e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
93 schema:name Genome, Bacterial
94 rdf:type schema:DefinedTerm
95 N68cf2a2d27004bcbbeecb76dce3e0e0d schema:name Springer Nature - SN SciGraph project
96 rdf:type schema:Organization
97 N69be6d294860482ca05da34ae53440cb schema:name pubmed_id
98 schema:value 28742071
99 rdf:type schema:PropertyValue
100 N6d82047fe985482b9e4baf53caed6366 rdf:first sg:person.010531110205.41
101 rdf:rest N90e525bc2ffc487191aa9c3fd485768b
102 N76b57c1b2bfa4a85a0b2022aa82a3594 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
103 schema:name Software
104 rdf:type schema:DefinedTerm
105 N8435f8fd618d454da46a10d168ae92bb schema:name dimensions_id
106 schema:value pub.1090891867
107 rdf:type schema:PropertyValue
108 N90e525bc2ffc487191aa9c3fd485768b rdf:first sg:person.01236044570.46
109 rdf:rest Ne71fd643b0344a7fb0885caa6c337b28
110 N9f96a26db3244c23abd4d83f57c9c5a4 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
111 schema:name Algorithms
112 rdf:type schema:DefinedTerm
113 Nb1f5e9a11982400b86fd85e214112400 schema:volumeNumber 11
114 rdf:type schema:PublicationVolume
115 Nc973f05f503443d091ea4fb776c418c5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
116 schema:name Metagenomics
117 rdf:type schema:DefinedTerm
118 Nd716b43ddd1e47dd9c96f875a7ffadcb schema:name doi
119 schema:value 10.1038/ismej.2017.126
120 rdf:type schema:PropertyValue
121 Ne5ab1cf16bc6496b9df01dc376c21f73 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
122 schema:name Bacteria
123 rdf:type schema:DefinedTerm
124 Ne71fd643b0344a7fb0885caa6c337b28 rdf:first sg:person.01234072512.13
125 rdf:rest N395ac40afec1459aa0f8b7369a02de1c
126 Nf57731d293b04b5aa7e2bec80e1fc1b2 schema:issueNumber 12
127 rdf:type schema:PublicationIssue
128 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
129 schema:name Biological Sciences
130 rdf:type schema:DefinedTerm
131 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
132 schema:name Genetics
133 rdf:type schema:DefinedTerm
134 sg:grant.2459162 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2017.126
135 rdf:type schema:MonetaryGrant
136 sg:grant.3000627 http://pending.schema.org/fundedItem sg:pub.10.1038/ismej.2017.126
137 rdf:type schema:MonetaryGrant
138 sg:journal.1038436 schema:issn 1751-7362
139 1751-7370
140 schema:name The ISME Journal: Multidisciplinary Journal of Microbial Ecology
141 schema:publisher Springer Nature
142 rdf:type schema:Periodical
143 sg:person.010531110205.41 schema:affiliation grid-institutes:grid.47840.3f
144 schema:familyName Olm
145 schema:givenName Matthew R
146 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010531110205.41
147 rdf:type schema:Person
148 sg:person.01234072512.13 schema:affiliation grid-institutes:grid.47840.3f
149 schema:familyName Brooks
150 schema:givenName Brandon
151 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01234072512.13
152 rdf:type schema:Person
153 sg:person.01236044570.46 schema:affiliation grid-institutes:grid.47840.3f
154 schema:familyName Brown
155 schema:givenName Christopher T
156 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01236044570.46
157 rdf:type schema:Person
158 sg:person.01350542775.47 schema:affiliation grid-institutes:grid.47840.3f
159 schema:familyName Banfield
160 schema:givenName Jillian F
161 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01350542775.47
162 rdf:type schema:Person
163 sg:pub.10.1038/ismej.2015.241 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032868433
164 https://doi.org/10.1038/ismej.2015.241
165 rdf:type schema:CreativeWork
166 sg:pub.10.1038/ismej.2016.83 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008606938
167 https://doi.org/10.1038/ismej.2016.83
168 rdf:type schema:CreativeWork
169 sg:pub.10.1038/nature02340 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023089166
170 https://doi.org/10.1038/nature02340
171 rdf:type schema:CreativeWork
172 sg:pub.10.1038/nmeth.4458 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092065319
173 https://doi.org/10.1038/nmeth.4458
174 rdf:type schema:CreativeWork
175 sg:pub.10.1038/nmicrobiol.2016.24 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052776941
176 https://doi.org/10.1038/nmicrobiol.2016.24
177 rdf:type schema:CreativeWork
178 sg:pub.10.1186/s13059-016-0997-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1050687712
179 https://doi.org/10.1186/s13059-016-0997-x
180 rdf:type schema:CreativeWork
181 sg:pub.10.1186/s40168-017-0270-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1085135267
182 https://doi.org/10.1186/s40168-017-0270-x
183 rdf:type schema:CreativeWork
184 grid-institutes:grid.47840.3f schema:alternateName Department of Earth and Planetary Science, University of California, Berkeley, CA, USA
185 Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
186 schema:name Department of Earth and Planetary Science, University of California, Berkeley, CA, USA
187 Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA, USA
188 Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
189 rdf:type schema:Organization
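In the triples above, the ordered author list is stored as an RDF collection: `schema:author` (row 9) points at a blank node, and each node carries an `rdf:first` item and an `rdf:rest` pointer to the next node until `rdf:nil`. The sketch below walks that chain using triples copied from rows 90-91, 100-101, 108-109, and 124-125. The `walk_rdf_list` helper and the tuple representation are assumptions for illustration. A real application would more likely use an RDF library.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) triples copied from the table above.
TRIPLES = [
    ("N395ac40afec1459aa0f8b7369a02de1c", "rdf:first", "sg:person.01350542775.47"),
    ("N395ac40afec1459aa0f8b7369a02de1c", "rdf:rest", "rdf:nil"),
    ("N6d82047fe985482b9e4baf53caed6366", "rdf:first", "sg:person.010531110205.41"),
    ("N6d82047fe985482b9e4baf53caed6366", "rdf:rest", "N90e525bc2ffc487191aa9c3fd485768b"),
    ("N90e525bc2ffc487191aa9c3fd485768b", "rdf:first", "sg:person.01236044570.46"),
    ("N90e525bc2ffc487191aa9c3fd485768b", "rdf:rest", "Ne71fd643b0344a7fb0885caa6c337b28"),
    ("Ne71fd643b0344a7fb0885caa6c337b28", "rdf:first", "sg:person.01234072512.13"),
    ("Ne71fd643b0344a7fb0885caa6c337b28", "rdf:rest", "N395ac40afec1459aa0f8b7369a02de1c"),
]

# schema:familyName values from rows 144, 149, 154, and 159.
FAMILY_NAME = {
    "sg:person.010531110205.41": "Olm",
    "sg:person.01234072512.13": "Brooks",
    "sg:person.01236044570.46": "Brown",
    "sg:person.01350542775.47": "Banfield",
}


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest from head until rdf:nil and return the items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of schema:author in row 9.
author_ids = walk_rdf_list("N6d82047fe985482b9e4baf53caed6366", TRIPLES)
author_names = [FAMILY_NAME[a] for a in author_ids]
```

The table lists the nodes sorted by identifier, not by position, so the publication order (Olm, Brown, Brooks, Banfield) only emerges by following the `rdf:rest` pointers.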
 



