Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2012-12

AUTHORS

Yajun Wang, Yao Yu, Bohu Pan, Pei Hao, Yixue Li, Zhifeng Shao, Xiaogang Xu, Xuan Li

ABSTRACT

BACKGROUND: Sequencing of bacterial genomes has become an essential approach for studying pathogen virulence and the phylogenetic relationships among closely related strains. The bacterium Enterococcus faecium has emerged as an important nosocomial pathogen that is often associated with resistance to common antibiotics in hospitals. With its highly divergent gene content, it presents a challenge to next-generation sequencing (NGS) technologies, which feature high throughput and shorter read lengths. This study was designed to investigate the properties and systematic biases of NGS technologies and to evaluate critical parameters influencing the outcomes of hybrid assemblies using combinations of NGS data.

RESULTS: A hospital strain of E. faecium was sequenced using three different NGS platforms: 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth, respectively. We built a pipeline that merged contigs from each NGS data set into hybrid assemblies. The results revealed that each single NGS assembly had a ceiling in continuity that could not be overcome by simply increasing coverage depth. Each NGS technology displayed some intrinsic properties, e.g. base-calling errors and systematic biases. The gaps and low-coverage regions of each NGS assembly were associated with lower GC content. To optimize the hybrid assembly approach, we tested varying amounts and different combinations of NGS data, and obtained optimal conditions for assembly continuity. We also showed, for the first time, that SOLiD data could help produce much improved assemblies of the E. faecium genome using the hybrid approach when combined with other types of NGS data.

CONCLUSIONS: The current study addressed the difficult issue of how to most effectively construct a complete microbial genome using today's state-of-the-art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technology, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving the most cost-efficient assembly. Our study helped form some guidelines to direct genomic work on other microorganisms, and thus has important practical implications.

PAGES

s21

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21

DOI

http://dx.doi.org/10.1186/1752-0509-6-s3-s21

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1031324990

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/23282199



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Cross Infection", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "DNA, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Genetic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Enterococcus faecium", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Gastrointestinal Tract", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genomics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "High-Throughput Nucleotide Sequencing", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Humans", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Nucleic Acid Hybridization", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Phylogeny", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Analysis, DNA", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Systems Biology", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Shanghai Jiao Tong University", 
          "id": "https://www.grid.ac/institutes/grid.16821.3c", 
          "name": [
            "Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Yajun", 
        "id": "sg:person.01143645130.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01143645130.48"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Institutes for Biological Sciences", 
          "id": "https://www.grid.ac/institutes/grid.419092.7", 
          "name": [
            "Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yu", 
        "givenName": "Yao", 
        "id": "sg:person.01100523570.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01100523570.79"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Institutes for Biological Sciences", 
          "id": "https://www.grid.ac/institutes/grid.419092.7", 
          "name": [
            "Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Pan", 
        "givenName": "Bohu", 
        "id": "sg:person.01341361016.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01341361016.33"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Chinese Academy of Sciences", 
          "id": "https://www.grid.ac/institutes/grid.9227.e", 
          "name": [
            "Institut Pasteur of Shanghai, Chinese Academy of Science, 200025, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hao", 
        "givenName": "Pei", 
        "id": "sg:person.01030424735.37", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01030424735.37"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Jiao Tong University", 
          "id": "https://www.grid.ac/institutes/grid.16821.3c", 
          "name": [
            "Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Yixue", 
        "id": "sg:person.012163147207.05", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012163147207.05"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Jiao Tong University", 
          "id": "https://www.grid.ac/institutes/grid.16821.3c", 
          "name": [
            "Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shao", 
        "givenName": "Zhifeng", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Huashan Hospital", 
          "id": "https://www.grid.ac/institutes/grid.411405.5", 
          "name": [
            "Institute of Antibiotics, Huashan Hospital of Fudan University, 200040, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Xu", 
        "givenName": "Xiaogang", 
        "id": "sg:person.01065067750.80", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065067750.80"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Institutes for Biological Sciences", 
          "id": "https://www.grid.ac/institutes/grid.419092.7", 
          "name": [
            "Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Xuan", 
        "id": "sg:person.0611615065.28", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0611615065.28"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.3201/eid1106.041204", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001835942"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.083311.108", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011112451"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2009-10-9-r94", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013301186", 
          "https://doi.org/10.1186/gb-2009-10-9-r94"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr026", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018328066"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.mib.2006.07.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020606795"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.1162986", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021160342"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature03959", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021574562", 
          "https://doi.org/10.1038/nature03959"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2004-5-2-r12", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022585853", 
          "https://doi.org/10.1186/gb-2004-5-2-r12"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/jb.00259-12", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023155566"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg2626", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023911485", 
          "https://doi.org/10.1038/nrg2626"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-9-603", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028935538", 
          "https://doi.org/10.1186/1471-2164-9-603"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.6435207", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033023123"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr319", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033856182"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.ppat.0030007", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034259269"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp324", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038266369"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1590/s0100-879x2000001100002", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040792980"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-11-239", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042212259", 
          "https://doi.org/10.1186/1471-2164-11-239"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature08696", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044030989", 
          "https://doi.org/10.1038/nature08696"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkn425", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044990606"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/25.17.3389", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047265454"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.8.3.175", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048253030"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0017915", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050529926"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.0010043", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050629286"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.074492.107", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051720574"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0030187", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052473877"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ygeno.2010.03.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052838811"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2012-12", 
    "datePublishedReg": "2012-12-01", 
    "description": "BACKGROUND: Sequencing of bacterial genomes became an essential approach to study pathogen virulence and the phylogenetic relationship among close related strains. Bacterium Enterococcus faecium emerged as an important nosocomial pathogen that were often associated with resistance to common antibiotics in hospitals. With highly divergent gene contents, it presented a challenge to the next generation sequencing (NGS) technologies featuring high-throughput and shorter read-length. This study was designed to investigate the properties and systematic biases of NGS technologies and evaluate critical parameters influencing the outcomes of hybrid assemblies using combinations of NGS data.\nRESULTS: A hospital strain of E. faecium was sequenced using three different NGS platforms: 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth. We built a pipeline that merged contigs from each NGS data into hybrid assemblies. The results revealed that each single NGS assembly had a ceiling in continuity that could not be overcome by simply increasing data coverage depth. Each NGS technology displayed some intrinsic properties, i.e. base calling error, systematic bias, etc. The gaps and low coverage regions of each NGS assembly were associated with lower GC contents. In order to optimize the hybrid assembly approach, we tested with varying amount and different combination of NGS data, and obtained optimal conditions for assembly continuity. We also, for the first time, showed that SOLiD data could help make much improved assemblies of E. faecium genome using the hybrid approach when combined with other type of NGS data.\nCONCLUSIONS: The current study addressed the difficult issue of how to most effectively construct a complete microbial genome using today's state of the art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technologies, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving most cost-efficiency assembly. Our study helped form some guidelines to direct genomic work on other microorganisms, thus have important practical implications.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1752-0509-6-s3-s21", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.7021539", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1327442", 
        "issn": [
          "1752-0509"
        ], 
        "name": "BMC Systems Biology", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "Suppl 3", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "6"
      }
    ], 
    "name": "Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome", 
    "pagination": "s21", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "953a01267c10ffd2c4edcf376dc6b8c8f4147ffeee1cfbcdd92ecebd8709f94c"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "23282199"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101301827"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1752-0509-6-s3-s21"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1031324990"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1752-0509-6-s3-s21", 
      "https://app.dimensions.ai/details/publication/pub.1031324990"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T13:17", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8659_00000513.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2F1752-0509-6-S3-S21"
  }
]
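
As a rough illustration of how a record shaped like the one above could be read, the following Python sketch pulls out the author names, one identifier from `productId`, and the distinct citation ids. It is a minimal sketch under assumptions: `RECORD_JSON` is a hand-trimmed copy of the record (two authors, a few citations), and the helper names `author_names`, `product_id`, and `unique_citation_ids` are hypothetical, not part of any SciGraph tooling.

```python
import json

# Hand-trimmed copy of the record above; only the fields this sketch reads.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1186/1752-0509-6-s3-s21",
    "author": [
      {"familyName": "Wang", "givenName": "Yajun", "type": "Person"},
      {"familyName": "Yu", "givenName": "Yao", "type": "Person"}
    ],
    "citation": [
      {"id": "sg:pub.10.1038/nature03959", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/nature03959", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1093/nar/gkn425", "type": "CreativeWork"}
    ],
    "productId": [
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["23282199"]},
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1186/1752-0509-6-s3-s21"]}
    ]
  }
]
"""


def author_names(record):
    """Return 'Given Family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def product_id(record, name):
    """Return the first value of the productId entry called `name`, or None."""
    for entry in record.get("productId", []):
        if entry.get("name") == name:
            values = entry.get("value", [])
            return values[0] if values else None
    return None


def unique_citation_ids(record):
    """Return citation ids with repeats dropped, keeping first-seen order."""
    return list(dict.fromkeys(c["id"] for c in record.get("citation", [])))


# The top level of the document is a list holding a single record.
record = json.loads(RECORD_JSON)[0]
print(author_names(record))
print(product_id(record, "doi"))
print(unique_citation_ids(record))
```

Plain `json` is enough here because the sketch only reads literal keys; resolving prefixes such as `sg:` against the `@context` would need a JSON-LD processor instead.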
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'
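
The four curl calls differ only in the `Accept` header, so the same content negotiation could be scripted. The Python sketch below uses only the standard library; the format labels in `ACCEPT_TYPES` and the helper names `build_request` and `fetch` are assumptions made for illustration, and whether the endpoint still answers is not something this sketch can guarantee.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl examples above; the short keys are
# labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(record_id, fmt, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request performs no network access; fetch() does.
request = build_request("pub.10.1186/1752-0509-6-s3-s21", "turtle")
print(request.full_url, request.get_header("Accept"))
```

Calling `fetch("pub.10.1186/1752-0509-6-s3-s21", "json-ld")` would then correspond to the first curl command.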


 

This table displays all metadata directly associated to this object as RDF triples.

264 TRIPLES      21 PREDICATES      68 URIs      34 LITERALS      22 BLANK NODES
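
One detail of the triples is easy to miss: the ordered author list is stored as an RDF collection. `schema:author` points to a blank node, each node carries `rdf:first` (the item) and `rdf:rest` (the next node), and the chain ends at `rdf:nil`. The Python sketch below walks that chain to recover the author order. The triples are copied by hand from the table, and the function name `walk_rdf_list` is hypothetical.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

# (subject, predicate, object) triples for the author list, copied from the table.
TRIPLES = [
    ("N2f0edb27ffe94cc7ad13d3c9d45fbaef", RDF_FIRST, "sg:person.01030424735.37"),
    ("N2f0edb27ffe94cc7ad13d3c9d45fbaef", RDF_REST, "Nf9613821dd214aa3b13f49076a272e64"),
    ("N38b7e19e43004634a2a90557eddaf5a3", RDF_FIRST, "sg:person.01143645130.48"),
    ("N38b7e19e43004634a2a90557eddaf5a3", RDF_REST, "Nc35797dd19394ea0a2b7936f0fb68fb7"),
    ("Nc35797dd19394ea0a2b7936f0fb68fb7", RDF_FIRST, "sg:person.01100523570.79"),
    ("Nc35797dd19394ea0a2b7936f0fb68fb7", RDF_REST, "Nfe45a1fe2131435e96398cf95ee39a76"),
    ("Nd92561f937fc4b559cc8e92a4aa11fb1", RDF_FIRST, "N29dc416f5837464a8b903d5e21ae195e"),
    ("Nd92561f937fc4b559cc8e92a4aa11fb1", RDF_REST, "Neeb87becce6e4b55b0bd338c26b5441b"),
    ("Neeb87becce6e4b55b0bd338c26b5441b", RDF_FIRST, "sg:person.01065067750.80"),
    ("Neeb87becce6e4b55b0bd338c26b5441b", RDF_REST, "Nff716f3146cf4d2987a1256a5830ea41"),
    ("Nf9613821dd214aa3b13f49076a272e64", RDF_FIRST, "sg:person.012163147207.05"),
    ("Nf9613821dd214aa3b13f49076a272e64", RDF_REST, "Nd92561f937fc4b559cc8e92a4aa11fb1"),
    ("Nfe45a1fe2131435e96398cf95ee39a76", RDF_FIRST, "sg:person.01341361016.33"),
    ("Nfe45a1fe2131435e96398cf95ee39a76", RDF_REST, "N2f0edb27ffe94cc7ad13d3c9d45fbaef"),
    ("Nff716f3146cf4d2987a1256a5830ea41", RDF_FIRST, "sg:person.0611615065.28"),
    ("Nff716f3146cf4d2987a1256a5830ea41", RDF_REST, RDF_NIL),
]


def walk_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at node {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The object of schema:author in the table is the head of the list.
authors_in_order = walk_rdf_list(TRIPLES, "N38b7e19e43004634a2a90557eddaf5a3")
print(authors_in_order)
```

The sixth item is itself a blank node rather than an `sg:person` identifier, which matches the one author in the JSON-LD record (Zhifeng Shao) who has no `id`.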

Subject Predicate Object
1 sg:pub.10.1186/1752-0509-6-s3-s21 schema:about N011c33538b8c47bbbd22c77fd073901b
2 N194ab3d0307c432fbd32e52d0c9f160a
3 N2250ce3908b044348262799eb82deebb
4 N5244f561a30b443ea3a2a0cc616ccae5
5 N5470ce014eec4fbf9bd9be254e6632a3
6 N5920a873a7c242d2b1e9339febe9e36d
7 N6e6c84de717d4402863395ef702b685c
8 N6fa281e53294432695df99810e11ca4f
9 N7332d66228c240948e66f2b0d7ff9975
10 N8f34cbcda2eb4ccbb1b0df1f88839acc
11 Nbe64895f80a148c48f5f0403886a8d88
12 Nca422b9ce41f4f238b620b147e8d24de
13 Nf46a498a3582468093938843ca1b82a5
14 anzsrc-for:06
15 anzsrc-for:0604
16 schema:author N38b7e19e43004634a2a90557eddaf5a3
17 schema:citation sg:pub.10.1038/nature03959
18 sg:pub.10.1038/nature08696
19 sg:pub.10.1038/nrg2626
20 sg:pub.10.1186/1471-2164-11-239
21 sg:pub.10.1186/1471-2164-9-603
22 sg:pub.10.1186/gb-2004-5-2-r12
23 sg:pub.10.1186/gb-2009-10-9-r94
24 https://doi.org/10.1016/j.mib.2006.07.001
25 https://doi.org/10.1016/j.ygeno.2010.03.001
26 https://doi.org/10.1093/bioinformatics/btp324
27 https://doi.org/10.1093/bioinformatics/btr026
28 https://doi.org/10.1093/bioinformatics/btr319
29 https://doi.org/10.1093/nar/25.17.3389
30 https://doi.org/10.1093/nar/gkn425
31 https://doi.org/10.1101/gr.074492.107
32 https://doi.org/10.1101/gr.083311.108
33 https://doi.org/10.1101/gr.6435207
34 https://doi.org/10.1101/gr.8.3.175
35 https://doi.org/10.1126/science.1162986
36 https://doi.org/10.1128/jb.00259-12
37 https://doi.org/10.1371/journal.pcbi.0010043
38 https://doi.org/10.1371/journal.pone.0017915
39 https://doi.org/10.1371/journal.pone.0030187
40 https://doi.org/10.1371/journal.ppat.0030007
41 https://doi.org/10.1590/s0100-879x2000001100002
42 https://doi.org/10.3201/eid1106.041204
43 schema:datePublished 2012-12
44 schema:datePublishedReg 2012-12-01
45 schema:description BACKGROUND: Sequencing of bacterial genomes became an essential approach to study pathogen virulence and the phylogenetic relationship among close related strains. Bacterium Enterococcus faecium emerged as an important nosocomial pathogen that were often associated with resistance to common antibiotics in hospitals. With highly divergent gene contents, it presented a challenge to the next generation sequencing (NGS) technologies featuring high-throughput and shorter read-length. This study was designed to investigate the properties and systematic biases of NGS technologies and evaluate critical parameters influencing the outcomes of hybrid assemblies using combinations of NGS data. RESULTS: A hospital strain of E. faecium was sequenced using three different NGS platforms: 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth. We built a pipeline that merged contigs from each NGS data into hybrid assemblies. The results revealed that each single NGS assembly had a ceiling in continuity that could not be overcome by simply increasing data coverage depth. Each NGS technology displayed some intrinsic properties, i.e. base calling error, systematic bias, etc. The gaps and low coverage regions of each NGS assembly were associated with lower GC contents. In order to optimize the hybrid assembly approach, we tested with varying amount and different combination of NGS data, and obtained optimal conditions for assembly continuity. We also, for the first time, showed that SOLiD data could help make much improved assemblies of E. faecium genome using the hybrid approach when combined with other type of NGS data. CONCLUSIONS: The current study addressed the difficult issue of how to most effectively construct a complete microbial genome using today's state of the art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technologies, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving most cost-efficiency assembly. Our study helped form some guidelines to direct genomic work on other microorganisms, thus have important practical implications.
46 schema:genre research_article
47 schema:inLanguage en
48 schema:isAccessibleForFree true
49 schema:isPartOf N3f5747a70dd34ceda5890da4c58e9bee
50 Nb6fc66f9e20a42c7aa2f9a6a6bc3fc9c
51 sg:journal.1327442
52 schema:name Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome
53 schema:pagination s21
54 schema:productId N2a23e04f5769469887d26871e4600432
55 N3ab6e88f9d35491a9341244cf84d8dae
56 N7a33c8984c8947c8abfd22a4c8a496cf
57 N8858727182a243eb80202a49bc505eca
58 Nc69d71b385a0459d80fbbebd002357f5
59 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031324990
60 https://doi.org/10.1186/1752-0509-6-s3-s21
61 schema:sdDatePublished 2019-04-10T13:17
62 schema:sdLicense https://scigraph.springernature.com/explorer/license/
63 schema:sdPublisher Nb357d13bc0b248c988c2b5d84062ce6e
64 schema:url http://link.springer.com/10.1186%2F1752-0509-6-S3-S21
65 sgo:license sg:explorer/license/
66 sgo:sdDataset articles
67 rdf:type schema:ScholarlyArticle
68 N011c33538b8c47bbbd22c77fd073901b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
69 schema:name Cross Infection
70 rdf:type schema:DefinedTerm
71 N194ab3d0307c432fbd32e52d0c9f160a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
72 schema:name High-Throughput Nucleotide Sequencing
73 rdf:type schema:DefinedTerm
74 N2250ce3908b044348262799eb82deebb schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
75 schema:name Genome, Bacterial
76 rdf:type schema:DefinedTerm
77 N29dc416f5837464a8b903d5e21ae195e schema:affiliation https://www.grid.ac/institutes/grid.16821.3c
78 schema:familyName Shao
79 schema:givenName Zhifeng
80 rdf:type schema:Person
81 N2a23e04f5769469887d26871e4600432 schema:name readcube_id
82 schema:value 953a01267c10ffd2c4edcf376dc6b8c8f4147ffeee1cfbcdd92ecebd8709f94c
83 rdf:type schema:PropertyValue
84 N2f0edb27ffe94cc7ad13d3c9d45fbaef rdf:first sg:person.01030424735.37
85 rdf:rest Nf9613821dd214aa3b13f49076a272e64
86 N38b7e19e43004634a2a90557eddaf5a3 rdf:first sg:person.01143645130.48
87 rdf:rest Nc35797dd19394ea0a2b7936f0fb68fb7
88 N3ab6e88f9d35491a9341244cf84d8dae schema:name pubmed_id
89 schema:value 23282199
90 rdf:type schema:PropertyValue
91 N3f5747a70dd34ceda5890da4c58e9bee schema:issueNumber Suppl 3
92 rdf:type schema:PublicationIssue
93 N5244f561a30b443ea3a2a0cc616ccae5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
94 schema:name Systems Biology
95 rdf:type schema:DefinedTerm
96 N5470ce014eec4fbf9bd9be254e6632a3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
97 schema:name DNA, Bacterial
98 rdf:type schema:DefinedTerm
99 N5920a873a7c242d2b1e9339febe9e36d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
100 schema:name Gastrointestinal Tract
101 rdf:type schema:DefinedTerm
102 N6e6c84de717d4402863395ef702b685c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
103 schema:name Nucleic Acid Hybridization
104 rdf:type schema:DefinedTerm
105 N6fa281e53294432695df99810e11ca4f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
106 schema:name Databases, Genetic
107 rdf:type schema:DefinedTerm
108 N7332d66228c240948e66f2b0d7ff9975 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
109 schema:name Phylogeny
110 rdf:type schema:DefinedTerm
111 N7a33c8984c8947c8abfd22a4c8a496cf schema:name nlm_unique_id
112 schema:value 101301827
113 rdf:type schema:PropertyValue
114 N8858727182a243eb80202a49bc505eca schema:name doi
115 schema:value 10.1186/1752-0509-6-s3-s21
116 rdf:type schema:PropertyValue
117 N8f34cbcda2eb4ccbb1b0df1f88839acc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
118 schema:name Humans
119 rdf:type schema:DefinedTerm
120 Nb357d13bc0b248c988c2b5d84062ce6e schema:name Springer Nature - SN SciGraph project
121 rdf:type schema:Organization
122 Nb6fc66f9e20a42c7aa2f9a6a6bc3fc9c schema:volumeNumber 6
123 rdf:type schema:PublicationVolume
124 Nbe64895f80a148c48f5f0403886a8d88 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
125 schema:name Sequence Analysis, DNA
126 rdf:type schema:DefinedTerm
127 Nc35797dd19394ea0a2b7936f0fb68fb7 rdf:first sg:person.01100523570.79
128 rdf:rest Nfe45a1fe2131435e96398cf95ee39a76
129 Nc69d71b385a0459d80fbbebd002357f5 schema:name dimensions_id
130 schema:value pub.1031324990
131 rdf:type schema:PropertyValue
132 Nca422b9ce41f4f238b620b147e8d24de schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
133 schema:name Genomics
134 rdf:type schema:DefinedTerm
135 Nd92561f937fc4b559cc8e92a4aa11fb1 rdf:first N29dc416f5837464a8b903d5e21ae195e
136 rdf:rest Neeb87becce6e4b55b0bd338c26b5441b
137 Neeb87becce6e4b55b0bd338c26b5441b rdf:first sg:person.01065067750.80
138 rdf:rest Nff716f3146cf4d2987a1256a5830ea41
139 Nf46a498a3582468093938843ca1b82a5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
140 schema:name Enterococcus faecium
141 rdf:type schema:DefinedTerm
142 Nf9613821dd214aa3b13f49076a272e64 rdf:first sg:person.012163147207.05
143 rdf:rest Nd92561f937fc4b559cc8e92a4aa11fb1
144 Nfe45a1fe2131435e96398cf95ee39a76 rdf:first sg:person.01341361016.33
145 rdf:rest N2f0edb27ffe94cc7ad13d3c9d45fbaef
146 Nff716f3146cf4d2987a1256a5830ea41 rdf:first sg:person.0611615065.28
147 rdf:rest rdf:nil
148 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
149 schema:name Biological Sciences
150 rdf:type schema:DefinedTerm
151 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
152 schema:name Genetics
153 rdf:type schema:DefinedTerm
154 sg:grant.7021539 http://pending.schema.org/fundedItem sg:pub.10.1186/1752-0509-6-s3-s21
155 rdf:type schema:MonetaryGrant
156 sg:journal.1327442 schema:issn 1752-0509
157 schema:name BMC Systems Biology
158 rdf:type schema:Periodical
159 sg:person.01030424735.37 schema:affiliation https://www.grid.ac/institutes/grid.9227.e
160 schema:familyName Hao
161 schema:givenName Pei
162 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01030424735.37
163 rdf:type schema:Person
164 sg:person.01065067750.80 schema:affiliation https://www.grid.ac/institutes/grid.411405.5
165 schema:familyName Xu
166 schema:givenName Xiaogang
167 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065067750.80
168 rdf:type schema:Person
169 sg:person.01100523570.79 schema:affiliation https://www.grid.ac/institutes/grid.419092.7
170 schema:familyName Yu
171 schema:givenName Yao
172 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01100523570.79
173 rdf:type schema:Person
174 sg:person.01143645130.48 schema:affiliation https://www.grid.ac/institutes/grid.16821.3c
175 schema:familyName Wang
176 schema:givenName Yajun
177 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01143645130.48
178 rdf:type schema:Person
179 sg:person.012163147207.05 schema:affiliation https://www.grid.ac/institutes/grid.16821.3c
180 schema:familyName Li
181 schema:givenName Yixue
182 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012163147207.05
183 rdf:type schema:Person
184 sg:person.01341361016.33 schema:affiliation https://www.grid.ac/institutes/grid.419092.7
185 schema:familyName Pan
186 schema:givenName Bohu
187 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01341361016.33
188 rdf:type schema:Person
189 sg:person.0611615065.28 schema:affiliation https://www.grid.ac/institutes/grid.419092.7
190 schema:familyName Li
191 schema:givenName Xuan
192 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0611615065.28
193 rdf:type schema:Person
194 sg:pub.10.1038/nature03959 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021574562
195 https://doi.org/10.1038/nature03959
196 rdf:type schema:CreativeWork
197 sg:pub.10.1038/nature08696 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044030989
198 https://doi.org/10.1038/nature08696
199 rdf:type schema:CreativeWork
200 sg:pub.10.1038/nrg2626 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023911485
201 https://doi.org/10.1038/nrg2626
202 rdf:type schema:CreativeWork
203 sg:pub.10.1186/1471-2164-11-239 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042212259
204 https://doi.org/10.1186/1471-2164-11-239
205 rdf:type schema:CreativeWork
206 sg:pub.10.1186/1471-2164-9-603 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028935538
207 https://doi.org/10.1186/1471-2164-9-603
208 rdf:type schema:CreativeWork
209 sg:pub.10.1186/gb-2004-5-2-r12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022585853
210 https://doi.org/10.1186/gb-2004-5-2-r12
211 rdf:type schema:CreativeWork
212 sg:pub.10.1186/gb-2009-10-9-r94 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013301186
213 https://doi.org/10.1186/gb-2009-10-9-r94
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1016/j.mib.2006.07.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020606795
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1016/j.ygeno.2010.03.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052838811
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1093/bioinformatics/btp324 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038266369
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1093/bioinformatics/btr026 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018328066
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1093/bioinformatics/btr319 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033856182
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1093/nar/25.17.3389 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047265454
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1093/nar/gkn425 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044990606
228 rdf:type schema:CreativeWork
229 https://doi.org/10.1101/gr.074492.107 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051720574
230 rdf:type schema:CreativeWork
231 https://doi.org/10.1101/gr.083311.108 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011112451
232 rdf:type schema:CreativeWork
233 https://doi.org/10.1101/gr.6435207 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033023123
234 rdf:type schema:CreativeWork
235 https://doi.org/10.1101/gr.8.3.175 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048253030
236 rdf:type schema:CreativeWork
237 https://doi.org/10.1126/science.1162986 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021160342
238 rdf:type schema:CreativeWork
239 https://doi.org/10.1128/jb.00259-12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023155566
240 rdf:type schema:CreativeWork
241 https://doi.org/10.1371/journal.pcbi.0010043 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050629286
242 rdf:type schema:CreativeWork
243 https://doi.org/10.1371/journal.pone.0017915 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050529926
244 rdf:type schema:CreativeWork
245 https://doi.org/10.1371/journal.pone.0030187 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052473877
246 rdf:type schema:CreativeWork
247 https://doi.org/10.1371/journal.ppat.0030007 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034259269
248 rdf:type schema:CreativeWork
249 https://doi.org/10.1590/s0100-879x2000001100002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040792980
250 rdf:type schema:CreativeWork
251 https://doi.org/10.3201/eid1106.041204 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001835942
252 rdf:type schema:CreativeWork
253 https://www.grid.ac/institutes/grid.16821.3c schema:alternateName Shanghai Jiao Tong University
254 schema:name Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China
255 rdf:type schema:Organization
256 https://www.grid.ac/institutes/grid.411405.5 schema:alternateName Huashan Hospital
257 schema:name Institute of Antibiotics, Huashan Hospital of Fudan University, 200040, Shanghai, China
258 rdf:type schema:Organization
259 https://www.grid.ac/institutes/grid.419092.7 schema:alternateName Shanghai Institutes for Biological Sciences
260 schema:name Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China
261 rdf:type schema:Organization
262 https://www.grid.ac/institutes/grid.9227.e schema:alternateName Chinese Academy of Sciences
263 schema:name Institut Pasteur of Shanghai, Chinese Academy of Science, 200025, Shanghai, China
264 rdf:type schema:Organization
 




...