Updating microbial genomic sequences: improving accuracy & innovation


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2014-11-07

AUTHORS

Hongseok Tae, Enusha Karunasena, Jasmin H Bavarva, Harold R Garner

ABSTRACT

Background: Many bacterial genome sequences completed using the Sanger method may contain assembly errors, due in part to low sequence coverage driven by cost. Findings: To illustrate the need for re-sequencing of pre-nextgen genomes and to validate sequenced genomes, we conducted a series of experiments, using high-coverage sequencing data generated by an Illumina MiSeq sequencer to sequence genomic DNAs of Bacteroides fragilis NCTC 9343, Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150, Vibrio cholerae O1 biovar El Tor str. N16961, Bacillus halodurans C-125 and Caulobacter crescentus CB15, which had previously been sequenced by the Sanger method during the early 2000s. Conclusions: This study revealed a number of discrepancies between the published assemblies and sequence read alignments for all five bacterial species, suggesting that the continued use of these error-containing genomes and their genetic information may contribute to false conclusions and/or incorrect future discoveries.

PAGES

25

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1756-0381-7-25

DOI

http://dx.doi.org/10.1186/1756-0381-7-25

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1007938583



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/11", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Medical and Health Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/13", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Education", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1101", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Medical Biochemistry and Metabolomics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1303", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Specialist Studies In Education", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Tae", 
        "givenName": "Hongseok", 
        "id": "sg:person.01353355575.39", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01353355575.39"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Karunasena", 
        "givenName": "Enusha", 
        "id": "sg:person.01002436364.92", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01002436364.92"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Bavarva", 
        "givenName": "Jasmin H", 
        "id": "sg:person.01175530247.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01175530247.24"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Garner", 
        "givenName": "Harold R", 
        "id": "sg:person.0613230631.83", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613230631.83"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1038/ng1470", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006877156", 
          "https://doi.org/10.1038/ng1470"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/35020000", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021753571", 
          "https://doi.org/10.1038/35020000"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014-11-07", 
    "datePublishedReg": "2014-11-07", 
    "description": "BackgroundMany bacterial genome sequences completed using the Sanger method may contain assembly errors due in-part to low sequence coverage driven by cost.FindingsTo illustrate the need for re-sequencing of pre-nextgen genomes and to validate sequenced genomes, we conducted a series of experiments, using high coverage sequencing data generated by a Illumina Miseq sequencer to sequence genomic DNAs of Bacteroides fragilis NCTC 9343, Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150, Vibrio cholerae O1 biovar El Tor str. N16961, Bacillus halodurans C-125 and Caulobacter crescentus CB15, which had previously been sequenced by the Sanger method during the early 2000\u2019s.ConclusionsThis study revealed a number of discrepancies between the published assemblies and sequence read alignments for all five bacterial species, suggesting that the continued use of these error-containing genomes and their genetic information may contribute to false conclusions and/or incorrect future discoveries when they are used.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/1756-0381-7-25", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1039156", 
        "issn": [
          "1756-0381"
        ], 
        "name": "BioData Mining", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "7"
      }
    ], 
    "keywords": [
      "microbial genomic sequences", 
      "bacterial genome sequences", 
      "coverage sequencing data", 
      "Caulobacter crescentus CB15", 
      "Sanger method", 
      "Bacillus halodurans C-125", 
      "high coverage sequencing data", 
      "low sequence coverage", 
      "Illumina MiSeq sequencer", 
      "sequenced genome", 
      "genome sequence", 
      "genomic sequences", 
      "crescentus CB15", 
      "halodurans C-125", 
      "sequencing data", 
      "genomic DNA", 
      "genetic information", 
      "bacterial species", 
      "genome", 
      "MiSeq sequencer", 
      "sequence coverage", 
      "NCTC 9343", 
      "future discoveries", 
      "sequence", 
      "Salmonella enterica subsp", 
      "enterica subsp", 
      "C-125", 
      "STR", 
      "DNA", 
      "species", 
      "CB15", 
      "subsp", 
      "enterica", 
      "assembly errors", 
      "sequencer", 
      "assembly", 
      "Bacteroides", 
      "discovery", 
      "number of discrepancies", 
      "false conclusions", 
      "alignment", 
      "number", 
      "series of experiments", 
      "experiments", 
      "part", 
      "study", 
      "data", 
      "information", 
      "ConclusionsThis study", 
      "use", 
      "conclusion", 
      "method", 
      "series", 
      "coverage", 
      "discrepancy", 
      "need", 
      "cost", 
      "innovation", 
      "accuracy", 
      "error", 
      "BackgroundMany bacterial genome sequences", 
      "FindingsTo", 
      "pre-nextgen genomes", 
      "Paratyphi A str", 
      "A str", 
      "ATCC 9150", 
      "Vibrio cholerae O1 biovar El Tor str", 
      "cholerae O1 biovar El Tor str", 
      "O1 biovar El Tor str", 
      "biovar El Tor str", 
      "El Tor str", 
      "Tor str", 
      "error-containing genomes"
    ], 
    "name": "Updating microbial genomic sequences: improving accuracy & innovation", 
    "pagination": "25", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1007938583"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1756-0381-7-25"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1756-0381-7-25", 
      "https://app.dimensions.ai/details/publication/pub.1007938583"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-01-01T18:33", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_633.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/1756-0381-7-25"
  }
]
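Because JSON-LD is plain JSON, the record above can be read with a standard JSON parser. The sketch below is illustrative only: it embeds a trimmed excerpt of the record (only the fields it uses), and the helper name `summarize` and the shape of its return value are assumptions made for this example, not part of SciGraph.

```python
import json

# Trimmed excerpt of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Tae", "givenName": "Hongseok"},
      {"familyName": "Karunasena", "givenName": "Enusha"},
      {"familyName": "Bavarva", "givenName": "Jasmin H"},
      {"familyName": "Garner", "givenName": "Harold R"}
    ],
    "datePublished": "2014-11-07",
    "isPartOf": [
      {"id": "sg:journal.1039156", "name": "BioData Mining", "type": "Periodical"},
      {"issueNumber": "1", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "7"}
    ],
    "name": "Updating microbial genomic sequences: improving accuracy & innovation",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1007938583"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1756-0381-7-25"]}
    ]
  }
]
"""


def summarize(record: dict) -> dict:
    """Collect basic bibliographic fields from one record (hypothetical helper)."""
    # productId is a list of name/value pairs; index it by name.
    product_ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    # isPartOf mixes journal, issue and volume entries; index them by type.
    parts = {p["type"]: p for p in record.get("isPartOf", [])}
    return {
        "title": record["name"],
        "doi": product_ids.get("doi"),
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])],
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "date": record.get("datePublished"),
    }


# The top-level JSON value is a list containing a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary)
```

Indexing `productId` and `isPartOf` by name or type avoids relying on the order of the list entries, which the record does not guarantee.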
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1756-0381-7-25'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1756-0381-7-25'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1756-0381-7-25'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1756-0381-7-25'
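The four curl commands differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The function names `build_request` and `fetch` and the short format labels are assumptions made for this example; the media types and the URL are taken from the commands above. The request is only built here, not sent, and whether the endpoint still responds is not something this page confirms.

```python
from urllib.request import Request, urlopen

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are
# labels chosen for this sketch.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id: str, fmt: str) -> Request:
    """Build (but do not send) a GET request asking for one RDF serialization."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(MEDIA_TYPES)}")
    return Request(BASE_URL + record_id, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(request: Request, timeout: float = 30.0) -> str:
    """Send the request and return the body as text (performs network I/O)."""
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


req = build_request("pub.10.1186/1756-0381-7-25", "turtle")
print(req.full_url, req.get_header("Accept"))
# body = fetch(req)  # uncomment to perform the actual HTTP request
```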


 

This table displays all metadata directly associated with this object as RDF triples.

175 TRIPLES      22 PREDICATES      104 URIs      90 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1756-0381-7-25 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 anzsrc-for:11
4 anzsrc-for:1101
5 anzsrc-for:13
6 anzsrc-for:1303
7 schema:author N1cc2eb9b62e247a1942ac2ea398fcf5e
8 schema:citation sg:pub.10.1038/35020000
9 sg:pub.10.1038/ng1470
10 schema:datePublished 2014-11-07
11 schema:datePublishedReg 2014-11-07
12 schema:description BackgroundMany bacterial genome sequences completed using the Sanger method may contain assembly errors due in-part to low sequence coverage driven by cost.FindingsTo illustrate the need for re-sequencing of pre-nextgen genomes and to validate sequenced genomes, we conducted a series of experiments, using high coverage sequencing data generated by a Illumina Miseq sequencer to sequence genomic DNAs of Bacteroides fragilis NCTC 9343, Salmonella enterica subsp. enterica serovar Paratyphi A str. ATCC 9150, Vibrio cholerae O1 biovar El Tor str. N16961, Bacillus halodurans C-125 and Caulobacter crescentus CB15, which had previously been sequenced by the Sanger method during the early 2000’s.ConclusionsThis study revealed a number of discrepancies between the published assemblies and sequence read alignments for all five bacterial species, suggesting that the continued use of these error-containing genomes and their genetic information may contribute to false conclusions and/or incorrect future discoveries when they are used.
13 schema:genre article
14 schema:inLanguage en
15 schema:isAccessibleForFree true
16 schema:isPartOf N2abaf65d575c4a53828dd5eeb3311564
17 N52aac36565724a21862ddb258893e4a5
18 sg:journal.1039156
19 schema:keywords A str
20 ATCC 9150
21 Bacillus halodurans C-125
22 BackgroundMany bacterial genome sequences
23 Bacteroides
24 C-125
25 CB15
26 Caulobacter crescentus CB15
27 ConclusionsThis study
28 DNA
29 El Tor str
30 FindingsTo
31 Illumina MiSeq sequencer
32 MiSeq sequencer
33 NCTC 9343
34 O1 biovar El Tor str
35 Paratyphi A str
36 STR
37 Salmonella enterica subsp
38 Sanger method
39 Tor str
40 Vibrio cholerae O1 biovar El Tor str
41 accuracy
42 alignment
43 assembly
44 assembly errors
45 bacterial genome sequences
46 bacterial species
47 biovar El Tor str
48 cholerae O1 biovar El Tor str
49 conclusion
50 cost
51 coverage
52 coverage sequencing data
53 crescentus CB15
54 data
55 discovery
56 discrepancy
57 enterica
58 enterica subsp
59 error
60 error-containing genomes
61 experiments
62 false conclusions
63 future discoveries
64 genetic information
65 genome
66 genome sequence
67 genomic DNA
68 genomic sequences
69 halodurans C-125
70 high coverage sequencing data
71 information
72 innovation
73 low sequence coverage
74 method
75 microbial genomic sequences
76 need
77 number
78 number of discrepancies
79 part
80 pre-nextgen genomes
81 sequence
82 sequence coverage
83 sequenced genome
84 sequencer
85 sequencing data
86 series
87 series of experiments
88 species
89 study
90 subsp
91 use
92 schema:name Updating microbial genomic sequences: improving accuracy & innovation
93 schema:pagination 25
94 schema:productId N21feeff6c34649c68ac3cbfa637fe167
95 N2e58006b6ffb422c8c63422b39b5dbed
96 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007938583
97 https://doi.org/10.1186/1756-0381-7-25
98 schema:sdDatePublished 2022-01-01T18:33
99 schema:sdLicense https://scigraph.springernature.com/explorer/license/
100 schema:sdPublisher N550c7a06b1714e2baf1f2cbada1c2baf
101 schema:url https://doi.org/10.1186/1756-0381-7-25
102 sgo:license sg:explorer/license/
103 sgo:sdDataset articles
104 rdf:type schema:ScholarlyArticle
105 N1cc2eb9b62e247a1942ac2ea398fcf5e rdf:first sg:person.01353355575.39
106 rdf:rest N5d6accfff6094dacab224364f0333182
107 N21feeff6c34649c68ac3cbfa637fe167 schema:name doi
108 schema:value 10.1186/1756-0381-7-25
109 rdf:type schema:PropertyValue
110 N260f8a7ae74744e09a99c66f662272a1 rdf:first sg:person.01175530247.24
111 rdf:rest Ne5c4035031f54ea39547335fce683a89
112 N2abaf65d575c4a53828dd5eeb3311564 schema:volumeNumber 7
113 rdf:type schema:PublicationVolume
114 N2e58006b6ffb422c8c63422b39b5dbed schema:name dimensions_id
115 schema:value pub.1007938583
116 rdf:type schema:PropertyValue
117 N52aac36565724a21862ddb258893e4a5 schema:issueNumber 1
118 rdf:type schema:PublicationIssue
119 N550c7a06b1714e2baf1f2cbada1c2baf schema:name Springer Nature - SN SciGraph project
120 rdf:type schema:Organization
121 N5d6accfff6094dacab224364f0333182 rdf:first sg:person.01002436364.92
122 rdf:rest N260f8a7ae74744e09a99c66f662272a1
123 Ne5c4035031f54ea39547335fce683a89 rdf:first sg:person.0613230631.83
124 rdf:rest rdf:nil
125 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
126 schema:name Information and Computing Sciences
127 rdf:type schema:DefinedTerm
128 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
129 schema:name Artificial Intelligence and Image Processing
130 rdf:type schema:DefinedTerm
131 anzsrc-for:11 schema:inDefinedTermSet anzsrc-for:
132 schema:name Medical and Health Sciences
133 rdf:type schema:DefinedTerm
134 anzsrc-for:1101 schema:inDefinedTermSet anzsrc-for:
135 schema:name Medical Biochemistry and Metabolomics
136 rdf:type schema:DefinedTerm
137 anzsrc-for:13 schema:inDefinedTermSet anzsrc-for:
138 schema:name Education
139 rdf:type schema:DefinedTerm
140 anzsrc-for:1303 schema:inDefinedTermSet anzsrc-for:
141 schema:name Specialist Studies In Education
142 rdf:type schema:DefinedTerm
143 sg:journal.1039156 schema:issn 1756-0381
144 schema:name BioData Mining
145 schema:publisher Springer Nature
146 rdf:type schema:Periodical
147 sg:person.01002436364.92 schema:affiliation grid-institutes:grid.438526.e
148 schema:familyName Karunasena
149 schema:givenName Enusha
150 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01002436364.92
151 rdf:type schema:Person
152 sg:person.01175530247.24 schema:affiliation grid-institutes:grid.438526.e
153 schema:familyName Bavarva
154 schema:givenName Jasmin H
155 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01175530247.24
156 rdf:type schema:Person
157 sg:person.01353355575.39 schema:affiliation grid-institutes:grid.438526.e
158 schema:familyName Tae
159 schema:givenName Hongseok
160 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01353355575.39
161 rdf:type schema:Person
162 sg:person.0613230631.83 schema:affiliation grid-institutes:grid.438526.e
163 schema:familyName Garner
164 schema:givenName Harold R
165 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613230631.83
166 rdf:type schema:Person
167 sg:pub.10.1038/35020000 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021753571
168 https://doi.org/10.1038/35020000
169 rdf:type schema:CreativeWork
170 sg:pub.10.1038/ng1470 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006877156
171 https://doi.org/10.1038/ng1470
172 rdf:type schema:CreativeWork
173 grid-institutes:grid.438526.e schema:alternateName Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA
174 schema:name Virginia Bioinformatics Institute at Virginia Polytechnic Institute and State University, 1015 Life Sciences Circle, 24061, Blacksburg, VA, USA
175 rdf:type schema:Organization
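In the table, schema:author (row 7) points at a blank node rather than at the people themselves: the authors are stored as an RDF list, where each node carries one rdf:first item and an rdf:rest link to the next node, ending at rdf:nil. A minimal sketch of recovering the author order from those rows follows. The dictionary layout and the function name `walk_rdf_list` are assumptions made for this example; the node identifiers and family names are copied from the table.

```python
RDF_NIL = "rdf:nil"

# node -> (rdf:first, rdf:rest), copied from rows 105-106, 110-111 and 121-124.
LIST_NODES = {
    "N1cc2eb9b62e247a1942ac2ea398fcf5e": ("sg:person.01353355575.39", "N5d6accfff6094dacab224364f0333182"),
    "N260f8a7ae74744e09a99c66f662272a1": ("sg:person.01175530247.24", "Ne5c4035031f54ea39547335fce683a89"),
    "N5d6accfff6094dacab224364f0333182": ("sg:person.01002436364.92", "N260f8a7ae74744e09a99c66f662272a1"),
    "Ne5c4035031f54ea39547335fce683a89": ("sg:person.0613230631.83", RDF_NIL),
}

# schema:familyName values from rows 148, 153, 158 and 163.
FAMILY_NAMES = {
    "sg:person.01002436364.92": "Karunasena",
    "sg:person.01175530247.24": "Bavarva",
    "sg:person.01353355575.39": "Tae",
    "sg:person.0613230631.83": "Garner",
}


def walk_rdf_list(head: str, nodes: dict) -> list:
    """Follow rdf:rest links from head until rdf:nil, collecting rdf:first items."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# Row 7 gives the head of the list as the object of schema:author.
author_ids = walk_rdf_list("N1cc2eb9b62e247a1942ac2ea398fcf5e", LIST_NODES)
authors = [FAMILY_NAMES[person] for person in author_ids]
print(authors)
```

The table sorts blank nodes by identifier, so the row order does not reflect the author order; only the rdf:rest chain does.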
 



