Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2012-12

AUTHORS

Yajun Wang, Yao Yu, Bohu Pan, Pei Hao, Yixue Li, Zhifeng Shao, Xiaogang Xu, Xuan Li

ABSTRACT

BACKGROUND: Sequencing of bacterial genomes became an essential approach to study pathogen virulence and the phylogenetic relationships among closely related strains. The bacterium Enterococcus faecium emerged as an important nosocomial pathogen that was often associated with resistance to common antibiotics in hospitals. With its highly divergent gene content, it presented a challenge to next-generation sequencing (NGS) technologies, which feature high throughput and shorter read lengths. This study was designed to investigate the properties and systematic biases of NGS technologies and to evaluate critical parameters influencing the outcomes of hybrid assemblies using combinations of NGS data.

RESULTS: A hospital strain of E. faecium was sequenced using three different NGS platforms: 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth, respectively. We built a pipeline that merged contigs from each NGS data set into hybrid assemblies. The results revealed that each single NGS assembly had a ceiling in continuity that could not be overcome by simply increasing data coverage depth. Each NGS technology displayed some intrinsic properties, e.g. base-calling errors and systematic bias. The gaps and low-coverage regions of each NGS assembly were associated with lower GC content. To optimize the hybrid assembly approach, we tested varying amounts and different combinations of NGS data, and obtained optimal conditions for assembly continuity. We also showed, for the first time, that SOLiD data could help produce much improved assemblies of the E. faecium genome using the hybrid approach when combined with other types of NGS data.

CONCLUSIONS: The current study addressed the difficult issue of how to most effectively construct a complete microbial genome using today's state-of-the-art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technology, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving the most cost-efficient assembly. Our study helped form some guidelines to direct genomic work on other microorganisms, and thus has important practical implications.

PAGES

s21

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21

DOI

http://dx.doi.org/10.1186/1752-0509-6-s3-s21

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1031324990

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/23282199



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Cross Infection", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "DNA, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Genetic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Enterococcus faecium", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Gastrointestinal Tract", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genomics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "High-Throughput Nucleotide Sequencing", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Humans", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Nucleic Acid Hybridization", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Phylogeny", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Analysis, DNA", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Systems Biology", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Shanghai Jiao Tong University", 
          "id": "https://www.grid.ac/institutes/grid.16821.3c", 
          "name": [
            "Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Yajun", 
        "id": "sg:person.01143645130.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01143645130.48"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Institutes for Biological Sciences", 
          "id": "https://www.grid.ac/institutes/grid.419092.7", 
          "name": [
            "Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yu", 
        "givenName": "Yao", 
        "id": "sg:person.01100523570.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01100523570.79"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Institutes for Biological Sciences", 
          "id": "https://www.grid.ac/institutes/grid.419092.7", 
          "name": [
            "Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Pan", 
        "givenName": "Bohu", 
        "id": "sg:person.01341361016.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01341361016.33"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Chinese Academy of Sciences", 
          "id": "https://www.grid.ac/institutes/grid.9227.e", 
          "name": [
            "Institut Pasteur of Shanghai, Chinese Academy of Science, 200025, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hao", 
        "givenName": "Pei", 
        "id": "sg:person.01030424735.37", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01030424735.37"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Jiao Tong University", 
          "id": "https://www.grid.ac/institutes/grid.16821.3c", 
          "name": [
            "Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Yixue", 
        "id": "sg:person.012163147207.05", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012163147207.05"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Jiao Tong University", 
          "id": "https://www.grid.ac/institutes/grid.16821.3c", 
          "name": [
            "Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shao", 
        "givenName": "Zhifeng", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Huashan Hospital", 
          "id": "https://www.grid.ac/institutes/grid.411405.5", 
          "name": [
            "Institute of Antibiotics, Huashan Hospital of Fudan University, 200040, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Xu", 
        "givenName": "Xiaogang", 
        "id": "sg:person.01065067750.80", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065067750.80"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanghai Institutes for Biological Sciences", 
          "id": "https://www.grid.ac/institutes/grid.419092.7", 
          "name": [
            "Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Xuan", 
        "id": "sg:person.0611615065.28", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0611615065.28"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.3201/eid1106.041204", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001835942"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.083311.108", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011112451"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2009-10-9-r94", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013301186", 
          "https://doi.org/10.1186/gb-2009-10-9-r94"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr026", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018328066"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.mib.2006.07.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020606795"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.1162986", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021160342"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature03959", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021574562", 
          "https://doi.org/10.1038/nature03959"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2004-5-2-r12", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022585853", 
          "https://doi.org/10.1186/gb-2004-5-2-r12"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/jb.00259-12", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023155566"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg2626", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023911485", 
          "https://doi.org/10.1038/nrg2626"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-9-603", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028935538", 
          "https://doi.org/10.1186/1471-2164-9-603"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.6435207", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033023123"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr319", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033856182"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.ppat.0030007", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034259269"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp324", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038266369"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1590/s0100-879x2000001100002", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040792980"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-11-239", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042212259", 
          "https://doi.org/10.1186/1471-2164-11-239"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature08696", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044030989", 
          "https://doi.org/10.1038/nature08696"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkn425", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044990606"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/25.17.3389", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047265454"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.8.3.175", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048253030"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0017915", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050529926"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.0010043", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050629286"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.074492.107", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051720574"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0030187", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052473877"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ygeno.2010.03.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052838811"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2012-12", 
    "datePublishedReg": "2012-12-01", 
    "description": "BACKGROUND: Sequencing of bacterial genomes became an essential approach to study pathogen virulence and the phylogenetic relationship among close related strains. Bacterium Enterococcus faecium emerged as an important nosocomial pathogen that were often associated with resistance to common antibiotics in hospitals. With highly divergent gene contents, it presented a challenge to the next generation sequencing (NGS) technologies featuring high-throughput and shorter read-length. This study was designed to investigate the properties and systematic biases of NGS technologies and evaluate critical parameters influencing the outcomes of hybrid assemblies using combinations of NGS data.\nRESULTS: A hospital strain of E. faecium was sequenced using three different NGS platforms: 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth. We built a pipeline that merged contigs from each NGS data into hybrid assemblies. The results revealed that each single NGS assembly had a ceiling in continuity that could not be overcome by simply increasing data coverage depth. Each NGS technology displayed some intrinsic properties, i.e. base calling error, systematic bias, etc. The gaps and low coverage regions of each NGS assembly were associated with lower GC contents. In order to optimize the hybrid assembly approach, we tested with varying amount and different combination of NGS data, and obtained optimal conditions for assembly continuity. We also, for the first time, showed that SOLiD data could help make much improved assemblies of E. faecium genome using the hybrid approach when combined with other type of NGS data.\nCONCLUSIONS: The current study addressed the difficult issue of how to most effectively construct a complete microbial genome using today's state of the art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technologies, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving most cost-efficiency assembly. Our study helped form some guidelines to direct genomic work on other microorganisms, thus have important practical implications.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1752-0509-6-s3-s21", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.7021539", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1327442", 
        "issn": [
          "1752-0509"
        ], 
        "name": "BMC Systems Biology", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "Suppl 3", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "6"
      }
    ], 
    "name": "Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome", 
    "pagination": "s21", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "953a01267c10ffd2c4edcf376dc6b8c8f4147ffeee1cfbcdd92ecebd8709f94c"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "23282199"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101301827"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1752-0509-6-s3-s21"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1031324990"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1752-0509-6-s3-s21", 
      "https://app.dimensions.ai/details/publication/pub.1031324990"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T13:17", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8659_00000513.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2F1752-0509-6-S3-S21"
  }
]
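
As a rough illustration of how a record shaped like the one above might be consumed, the following sketch parses a trimmed, hand-made excerpt with Python's standard json module. The helper names author_names and unique_citation_ids are hypothetical and not part of SciGraph; the excerpt keeps only two authors and three citation entries, and the second helper simply collapses any repeated citation ids.

```python
import json

# Trimmed, hand-made excerpt in the same shape as the record above.
# It is illustrative only: most fields and most entries are omitted.
RECORD = """
[
  {
    "id": "sg:pub.10.1186/1752-0509-6-s3-s21",
    "author": [
      {"familyName": "Wang", "givenName": "Yajun", "type": "Person"},
      {"familyName": "Yu", "givenName": "Yao", "type": "Person"}
    ],
    "citation": [
      {"id": "sg:pub.10.1038/nature03959", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/nature03959", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1093/nar/gkn425", "type": "CreativeWork"}
    ]
  }
]
"""


def author_names(record):
    """Return 'Given Family' strings in the order the authors are listed."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def unique_citation_ids(record):
    """Return citation ids with repeats removed, keeping first-seen order."""
    seen = set()
    ids = []
    for entry in record.get("citation", []):
        cid = entry["id"]
        if cid not in seen:
            seen.add(cid)
            ids.append(cid)
    return ids


# The top-level JSON value is a list holding a single record object.
record = json.loads(RECORD)[0]
print(author_names(record))
print(unique_citation_ids(record))
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field access; a full JSON-LD processor would only matter if the @context had to be expanded.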
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1752-0509-6-s3-s21'
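
The same content negotiation can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the short format keys (json-ld, nt, turtle, xml) and the function names build_request and fetch are hypothetical, the MIME types are copied from the curl commands above, and whether the endpoint still answers is an assumption that should be checked before relying on it.

```python
import urllib.request

# Base URL taken from the curl commands above.
BASE = "https://scigraph.springernature.com/"

# Hypothetical short names mapped to the MIME types used in the curl commands.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(BASE + record_id, headers={"Accept": ACCEPT[fmt]})


def fetch(record_id, fmt="json-ld", timeout=30):
    """Download the record as text. Needs network access; not called below."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to inspect.
req = build_request("pub.10.1186/1752-0509-6-s3-s21", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling fetch("pub.10.1186/1752-0509-6-s3-s21", "nt") would be the equivalent of the N-Triples curl command, provided the service is reachable.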


 

This table displays all metadata directly associated with this object as RDF triples.

264 TRIPLES      21 PREDICATES      68 URIs      34 LITERALS      22 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1752-0509-6-s3-s21 schema:about N06d6c1de33184da48a6a6290b88e2578
2 N16aec09094704fa7967dfe36b2bb29c8
3 N48e44d2e744e4000a9663cee733e5cc6
4 N51c74c79c3644cd7aec8be8031525be7
5 N602574e55abc4bb195c6f57624867b33
6 N7405dd4264614545931b0d58f2505bec
7 N8b2e85febab6484fb8647dc19c1e1329
8 Na6e72ed7c09d45ddbd1cb9df9ad57860
9 Na777948983554d93bf7f933675f53a43
10 Nb04cc3a6d60b46619a65815e16028df7
11 Nfa85cdf87b4c497394004209f60afb71
12 Nfb504293ce5f408983aacbdd86d25a57
13 Nfbcbe58294cc42428274dba2f066f280
14 anzsrc-for:06
15 anzsrc-for:0604
16 schema:author Nb60d86c3dc114603af9f104dd1aa57b9
17 schema:citation sg:pub.10.1038/nature03959
18 sg:pub.10.1038/nature08696
19 sg:pub.10.1038/nrg2626
20 sg:pub.10.1186/1471-2164-11-239
21 sg:pub.10.1186/1471-2164-9-603
22 sg:pub.10.1186/gb-2004-5-2-r12
23 sg:pub.10.1186/gb-2009-10-9-r94
24 https://doi.org/10.1016/j.mib.2006.07.001
25 https://doi.org/10.1016/j.ygeno.2010.03.001
26 https://doi.org/10.1093/bioinformatics/btp324
27 https://doi.org/10.1093/bioinformatics/btr026
28 https://doi.org/10.1093/bioinformatics/btr319
29 https://doi.org/10.1093/nar/25.17.3389
30 https://doi.org/10.1093/nar/gkn425
31 https://doi.org/10.1101/gr.074492.107
32 https://doi.org/10.1101/gr.083311.108
33 https://doi.org/10.1101/gr.6435207
34 https://doi.org/10.1101/gr.8.3.175
35 https://doi.org/10.1126/science.1162986
36 https://doi.org/10.1128/jb.00259-12
37 https://doi.org/10.1371/journal.pcbi.0010043
38 https://doi.org/10.1371/journal.pone.0017915
39 https://doi.org/10.1371/journal.pone.0030187
40 https://doi.org/10.1371/journal.ppat.0030007
41 https://doi.org/10.1590/s0100-879x2000001100002
42 https://doi.org/10.3201/eid1106.041204
43 schema:datePublished 2012-12
44 schema:datePublishedReg 2012-12-01
45 schema:description BACKGROUND: Sequencing of bacterial genomes became an essential approach to study pathogen virulence and the phylogenetic relationship among close related strains. Bacterium Enterococcus faecium emerged as an important nosocomial pathogen that were often associated with resistance to common antibiotics in hospitals. With highly divergent gene contents, it presented a challenge to the next generation sequencing (NGS) technologies featuring high-throughput and shorter read-length. This study was designed to investigate the properties and systematic biases of NGS technologies and evaluate critical parameters influencing the outcomes of hybrid assemblies using combinations of NGS data. RESULTS: A hospital strain of E. faecium was sequenced using three different NGS platforms: 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth. We built a pipeline that merged contigs from each NGS data into hybrid assemblies. The results revealed that each single NGS assembly had a ceiling in continuity that could not be overcome by simply increasing data coverage depth. Each NGS technology displayed some intrinsic properties, i.e. base calling error, systematic bias, etc. The gaps and low coverage regions of each NGS assembly were associated with lower GC contents. In order to optimize the hybrid assembly approach, we tested with varying amount and different combination of NGS data, and obtained optimal conditions for assembly continuity. We also, for the first time, showed that SOLiD data could help make much improved assemblies of E. faecium genome using the hybrid approach when combined with other type of NGS data. CONCLUSIONS: The current study addressed the difficult issue of how to most effectively construct a complete microbial genome using today's state of the art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technologies, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving most cost-efficiency assembly. Our study helped form some guidelines to direct genomic work on other microorganisms, thus have important practical implications.
46 schema:genre research_article
47 schema:inLanguage en
48 schema:isAccessibleForFree true
49 schema:isPartOf Nb035097a0f6d45269e783fecd6d32a7b
50 Nd924567605bc460e922505bf5ed0713e
51 sg:journal.1327442
52 schema:name Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome
53 schema:pagination s21
54 schema:productId N54174f3e15004c119798d1dd764012d7
55 N720cbdac7fee49c5bc692f6fa6e8b46a
56 N82753c897a494de3b4799bf5c19703e7
57 Na9dd7f6a63ad4ec190618d1628e70e30
58 Nfff0f1f5d56c44d8bc6b2ed54d36f3ef
59 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031324990
60 https://doi.org/10.1186/1752-0509-6-s3-s21
61 schema:sdDatePublished 2019-04-10T13:17
62 schema:sdLicense https://scigraph.springernature.com/explorer/license/
63 schema:sdPublisher N19debff95c794bdf94b804c1c9ff68c3
64 schema:url http://link.springer.com/10.1186%2F1752-0509-6-S3-S21
65 sgo:license sg:explorer/license/
66 sgo:sdDataset articles
67 rdf:type schema:ScholarlyArticle
68 N06d6c1de33184da48a6a6290b88e2578 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
69 schema:name Cross Infection
70 rdf:type schema:DefinedTerm
71 N118080a129604b5cb2aaac476efc50c6 schema:affiliation https://www.grid.ac/institutes/grid.16821.3c
72 schema:familyName Shao
73 schema:givenName Zhifeng
74 rdf:type schema:Person
75 N16aec09094704fa7967dfe36b2bb29c8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
76 schema:name Systems Biology
77 rdf:type schema:DefinedTerm
78 N19debff95c794bdf94b804c1c9ff68c3 schema:name Springer Nature - SN SciGraph project
79 rdf:type schema:Organization
80 N26b5d13a78e04960b595f3ca3312b48a rdf:first sg:person.012163147207.05
81 rdf:rest Nbcb9f431a8fe44fe8e034801a125a0f4
82 N31499cbb95384d6da409c576d40b0272 rdf:first sg:person.01341361016.33
83 rdf:rest N36ab354a294d4f729b551501e0c80a18
84 N36ab354a294d4f729b551501e0c80a18 rdf:first sg:person.01030424735.37
85 rdf:rest N26b5d13a78e04960b595f3ca3312b48a
86 N48e44d2e744e4000a9663cee733e5cc6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
87 schema:name Gastrointestinal Tract
88 rdf:type schema:DefinedTerm
89 N51c74c79c3644cd7aec8be8031525be7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
90 schema:name Genomics
91 rdf:type schema:DefinedTerm
92 N54174f3e15004c119798d1dd764012d7 schema:name dimensions_id
93 schema:value pub.1031324990
94 rdf:type schema:PropertyValue
95 N602574e55abc4bb195c6f57624867b33 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
96 schema:name Nucleic Acid Hybridization
97 rdf:type schema:DefinedTerm
98 N720cbdac7fee49c5bc692f6fa6e8b46a schema:name pubmed_id
99 schema:value 23282199
100 rdf:type schema:PropertyValue
101 N7349d15a7a374f66914739d65e4ce416 rdf:first sg:person.01065067750.80
102 rdf:rest N84351761733f4aa19725176157f32e9d
103 N7405dd4264614545931b0d58f2505bec schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
104 schema:name Phylogeny
105 rdf:type schema:DefinedTerm
106 N82753c897a494de3b4799bf5c19703e7 schema:name readcube_id
107 schema:value 953a01267c10ffd2c4edcf376dc6b8c8f4147ffeee1cfbcdd92ecebd8709f94c
108 rdf:type schema:PropertyValue
109 N84351761733f4aa19725176157f32e9d rdf:first sg:person.0611615065.28
110 rdf:rest rdf:nil
111 N8b2e85febab6484fb8647dc19c1e1329 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
112 schema:name Databases, Genetic
113 rdf:type schema:DefinedTerm
114 Na0ebaa92c3a64e45b96a1ab9ec30fe31 rdf:first sg:person.01100523570.79
115 rdf:rest N31499cbb95384d6da409c576d40b0272
116 Na6e72ed7c09d45ddbd1cb9df9ad57860 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
117 schema:name DNA, Bacterial
118 rdf:type schema:DefinedTerm
119 Na777948983554d93bf7f933675f53a43 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
120 schema:name High-Throughput Nucleotide Sequencing
121 rdf:type schema:DefinedTerm
122 Na9dd7f6a63ad4ec190618d1628e70e30 schema:name doi
123 schema:value 10.1186/1752-0509-6-s3-s21
124 rdf:type schema:PropertyValue
125 Nb035097a0f6d45269e783fecd6d32a7b schema:issueNumber Suppl 3
126 rdf:type schema:PublicationIssue
127 Nb04cc3a6d60b46619a65815e16028df7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
128 schema:name Humans
129 rdf:type schema:DefinedTerm
130 Nb60d86c3dc114603af9f104dd1aa57b9 rdf:first sg:person.01143645130.48
131 rdf:rest Na0ebaa92c3a64e45b96a1ab9ec30fe31
132 Nbcb9f431a8fe44fe8e034801a125a0f4 rdf:first N118080a129604b5cb2aaac476efc50c6
133 rdf:rest N7349d15a7a374f66914739d65e4ce416
134 Nd924567605bc460e922505bf5ed0713e schema:volumeNumber 6
135 rdf:type schema:PublicationVolume
136 Nfa85cdf87b4c497394004209f60afb71 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
137 schema:name Enterococcus faecium
138 rdf:type schema:DefinedTerm
139 Nfb504293ce5f408983aacbdd86d25a57 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
140 schema:name Genome, Bacterial
141 rdf:type schema:DefinedTerm
142 Nfbcbe58294cc42428274dba2f066f280 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
143 schema:name Sequence Analysis, DNA
144 rdf:type schema:DefinedTerm
145 Nfff0f1f5d56c44d8bc6b2ed54d36f3ef schema:name nlm_unique_id
146 schema:value 101301827
147 rdf:type schema:PropertyValue
148 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
149 schema:name Biological Sciences
150 rdf:type schema:DefinedTerm
151 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
152 schema:name Genetics
153 rdf:type schema:DefinedTerm
154 sg:grant.7021539 http://pending.schema.org/fundedItem sg:pub.10.1186/1752-0509-6-s3-s21
155 rdf:type schema:MonetaryGrant
156 sg:journal.1327442 schema:issn 1752-0509
157 schema:name BMC Systems Biology
158 rdf:type schema:Periodical
159 sg:person.01030424735.37 schema:affiliation https://www.grid.ac/institutes/grid.9227.e
160 schema:familyName Hao
161 schema:givenName Pei
162 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01030424735.37
163 rdf:type schema:Person
164 sg:person.01065067750.80 schema:affiliation https://www.grid.ac/institutes/grid.411405.5
165 schema:familyName Xu
166 schema:givenName Xiaogang
167 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065067750.80
168 rdf:type schema:Person
169 sg:person.01100523570.79 schema:affiliation https://www.grid.ac/institutes/grid.419092.7
170 schema:familyName Yu
171 schema:givenName Yao
172 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01100523570.79
173 rdf:type schema:Person
174 sg:person.01143645130.48 schema:affiliation https://www.grid.ac/institutes/grid.16821.3c
175 schema:familyName Wang
176 schema:givenName Yajun
177 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01143645130.48
178 rdf:type schema:Person
179 sg:person.012163147207.05 schema:affiliation https://www.grid.ac/institutes/grid.16821.3c
180 schema:familyName Li
181 schema:givenName Yixue
182 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012163147207.05
183 rdf:type schema:Person
184 sg:person.01341361016.33 schema:affiliation https://www.grid.ac/institutes/grid.419092.7
185 schema:familyName Pan
186 schema:givenName Bohu
187 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01341361016.33
188 rdf:type schema:Person
189 sg:person.0611615065.28 schema:affiliation https://www.grid.ac/institutes/grid.419092.7
190 schema:familyName Li
191 schema:givenName Xuan
192 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0611615065.28
193 rdf:type schema:Person
194 sg:pub.10.1038/nature03959 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021574562
195 https://doi.org/10.1038/nature03959
196 rdf:type schema:CreativeWork
197 sg:pub.10.1038/nature08696 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044030989
198 https://doi.org/10.1038/nature08696
199 rdf:type schema:CreativeWork
200 sg:pub.10.1038/nrg2626 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023911485
201 https://doi.org/10.1038/nrg2626
202 rdf:type schema:CreativeWork
203 sg:pub.10.1186/1471-2164-11-239 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042212259
204 https://doi.org/10.1186/1471-2164-11-239
205 rdf:type schema:CreativeWork
206 sg:pub.10.1186/1471-2164-9-603 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028935538
207 https://doi.org/10.1186/1471-2164-9-603
208 rdf:type schema:CreativeWork
209 sg:pub.10.1186/gb-2004-5-2-r12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022585853
210 https://doi.org/10.1186/gb-2004-5-2-r12
211 rdf:type schema:CreativeWork
212 sg:pub.10.1186/gb-2009-10-9-r94 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013301186
213 https://doi.org/10.1186/gb-2009-10-9-r94
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1016/j.mib.2006.07.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020606795
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1016/j.ygeno.2010.03.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052838811
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1093/bioinformatics/btp324 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038266369
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1093/bioinformatics/btr026 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018328066
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1093/bioinformatics/btr319 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033856182
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1093/nar/25.17.3389 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047265454
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1093/nar/gkn425 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044990606
228 rdf:type schema:CreativeWork
229 https://doi.org/10.1101/gr.074492.107 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051720574
230 rdf:type schema:CreativeWork
231 https://doi.org/10.1101/gr.083311.108 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011112451
232 rdf:type schema:CreativeWork
233 https://doi.org/10.1101/gr.6435207 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033023123
234 rdf:type schema:CreativeWork
235 https://doi.org/10.1101/gr.8.3.175 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048253030
236 rdf:type schema:CreativeWork
237 https://doi.org/10.1126/science.1162986 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021160342
238 rdf:type schema:CreativeWork
239 https://doi.org/10.1128/jb.00259-12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023155566
240 rdf:type schema:CreativeWork
241 https://doi.org/10.1371/journal.pcbi.0010043 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050629286
242 rdf:type schema:CreativeWork
243 https://doi.org/10.1371/journal.pone.0017915 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050529926
244 rdf:type schema:CreativeWork
245 https://doi.org/10.1371/journal.pone.0030187 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052473877
246 rdf:type schema:CreativeWork
247 https://doi.org/10.1371/journal.ppat.0030007 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034259269
248 rdf:type schema:CreativeWork
249 https://doi.org/10.1590/s0100-879x2000001100002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040792980
250 rdf:type schema:CreativeWork
251 https://doi.org/10.3201/eid1106.041204 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001835942
252 rdf:type schema:CreativeWork
253 https://www.grid.ac/institutes/grid.16821.3c schema:alternateName Shanghai Jiao Tong University
254 schema:name Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, 200240, Shanghai, China
255 rdf:type schema:Organization
256 https://www.grid.ac/institutes/grid.411405.5 schema:alternateName Huashan Hospital
257 schema:name Institute of Antibiotics, Huashan Hospital of Fudan University, 200040, Shanghai, China
258 rdf:type schema:Organization
259 https://www.grid.ac/institutes/grid.419092.7 schema:alternateName Shanghai Institutes for Biological Sciences
260 schema:name Key Laboratory of Synthetic Biology, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 200032, Shanghai, China
261 rdf:type schema:Organization
262 https://www.grid.ac/institutes/grid.9227.e schema:alternateName Chinese Academy of Sciences
263 schema:name Institut Pasteur of Shanghai, Chinese Academy of Science, 200025, Shanghai, China
264 rdf:type schema:Organization
 





