Rapid quantification of sequence repeats to resolve the size, structure and contents of bacterial genomes


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2013-12

AUTHORS

David Williams, William L Trimble, Meghan Shilts, Folker Meyer, Howard Ochman

ABSTRACT

BACKGROUND: The numerous classes of repeats often impede the assembly of genome sequences from the short reads provided by new sequencing technologies. We demonstrate a simple and rapid means to ascertain the repeat structure and total size of a bacterial or archaeal genome without the need for assembly by directly analyzing the abundances of distinct k-mers among reads. RESULTS: The sensitivity of this procedure to resolve variation within a bacterial species is demonstrated: genome sizes and repeat structure of five environmental strains of E. coli from short Illumina reads were estimated by this method, and total genome sizes corresponded well with those obtained for the same strains by pulsed-field gel electrophoresis. In addition, this approach was applied to read-sets for completed genomes and shown to be accurate over a wide range of microbial genome sizes. CONCLUSIONS: Application of these procedures, based solely on k-mer abundances in short read data sets, allows aspects of genome structure to be resolved that are not apparent from conventional short read assemblies. This knowledge of the repetitive content of genomes provides insights into genome evolution and diversity.

PAGES

537

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2164-14-537

DOI

http://dx.doi.org/10.1186/1471-2164-14-537

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1044673702

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/23924250



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Escherichia coli", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Gene Dosage", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome Size", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genomics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Repetitive Sequences, Nucleic Acid", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Analysis", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Time Factors", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Yale University", 
          "id": "https://www.grid.ac/institutes/grid.47100.32", 
          "name": [
            "Department of Ecology & Evolutionary Biology, Yale University, New Haven, 06520, Connecticut, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Williams", 
        "givenName": "David", 
        "id": "sg:person.01050471232.64", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01050471232.64"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "Institute for Genomics & Systems Biology, University of Chicago, 5800 S Ellis Ave, 60637, Chicago, Illinois, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Trimble", 
        "givenName": "William L", 
        "id": "sg:person.01116103751.19", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01116103751.19"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Yale University", 
          "id": "https://www.grid.ac/institutes/grid.47100.32", 
          "name": [
            "Department of Ecology & Evolutionary Biology, Yale University, New Haven, 06520, Connecticut, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shilts", 
        "givenName": "Meghan", 
        "id": "sg:person.01056757425.91", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01056757425.91"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Argonne National Laboratory", 
          "id": "https://www.grid.ac/institutes/grid.187073.a", 
          "name": [
            "Institute for Genomics & Systems Biology, University of Chicago, 5800 S Ellis Ave, 60637, Chicago, Illinois, USA", 
            "Mathematics & Computer Science Division, Argonne National Laboratory, 9700 South Cass Avenue, 60439, Argonne, Illinois, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Meyer", 
        "givenName": "Folker", 
        "id": "sg:person.0623154651.88", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0623154651.88"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Yale University", 
          "id": "https://www.grid.ac/institutes/grid.47100.32", 
          "name": [
            "Department of Ecology & Evolutionary Biology, Yale University, New Haven, 06520, Connecticut, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ochman", 
        "givenName": "Howard", 
        "id": "sg:person.0600333516.52", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0600333516.52"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1371/journal.pgen.1000344", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002519668"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.277.5331.1453", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003267570"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/298760a0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003893697", 
          "https://doi.org/10.1038/298760a0"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-13-92", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1005137064", 
          "https://doi.org/10.1186/1471-2164-13-92"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/molbev/msj125", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1005759885"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.1121464109", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006108084"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth.1358", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008886215", 
          "https://doi.org/10.1038/nmeth.1358"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-8-456", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009030018", 
          "https://doi.org/10.1186/1471-2164-8-456"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrmicro2850", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010614211", 
          "https://doi.org/10.1038/nrmicro2850"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1365-2958.2006.05172.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1012203977"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0022-2836(05)80360-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013618994"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10295-011-1052-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013989162", 
          "https://doi.org/10.1007/s10295-011-1052-2"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0913033107", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016559401"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0813249106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017699788"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bti039", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020577976"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.126953.111", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022021574"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/30.8.1826", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022213293"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.1350803", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023905199"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1365-2958.1996.tb02481.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028122134"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-11-485", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030070956", 
          "https://doi.org/10.1186/1471-2105-11-485"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pgen.0030142", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030776001"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0923-2508(91)90033-7", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031733560"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.1000593", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032365203"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr011", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032486702"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0903585106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034071251"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ygeno.2012.06.009", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036608653"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/molbev/msj085", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1037702207"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/17.suppl_1.s225", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040137959"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btm632", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040151889"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/jb.00619-08", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040327491"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-8-321", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040671622", 
          "https://doi.org/10.1186/1471-2164-8-321"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0092-8674(84)90436-7", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042178000"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2010-11-11-r116", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042567408", 
          "https://doi.org/10.1186/gb-2010-11-11-r116"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.plasmid.2004.12.006", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043138519"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1089/cmb.2010.0245", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043925121"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2164-9-517", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045248549", 
          "https://doi.org/10.1186/1471-2164-9-517"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-11-48", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045472514", 
          "https://doi.org/10.1186/1471-2105-11-48"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gks828", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047396934"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bth205", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053359138"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1089/cmb.1995.2.291", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059245099"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1099/13500872-142-8-2087", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1060381262"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/jb.174.14.4525-4529.1992", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062720657"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/jb.177.20.5784-5789.1995", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062724228"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1075327647", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1075517019", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1082784499", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2013-12", 
    "datePublishedReg": "2013-12-01", 
    "description": "BACKGROUND: The numerous classes of repeats often impede the assembly of genome sequences from the short reads provided by new sequencing technologies. We demonstrate a simple and rapid means to ascertain the repeat structure and total size of a bacterial or archaeal genome without the need for assembly by directly analyzing the abundances of distinct k-mers among reads.\nRESULTS: The sensitivity of this procedure to resolve variation within a bacterial species is demonstrated: genome sizes and repeat structure of five environmental strains of E. coli from short Illumina reads were estimated by this method, and total genome sizes corresponded well with those obtained for the same strains by pulsed-field gel electrophoresis. In addition, this approach was applied to read-sets for completed genomes and shown to be accurate over a wide range of microbial genome sizes.\nCONCLUSIONS: Application of these procedures, based solely on k-mer abundances in short read data sets, allows aspects of genome structure to be resolved that are not apparent from conventional short read assemblies. This knowledge of the repetitive content of genomes provides insights into genome evolution and diversity.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1471-2164-14-537", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.6540342", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1023790", 
        "issn": [
          "1471-2164"
        ], 
        "name": "BMC Genomics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "14"
      }
    ], 
    "name": "Rapid quantification of sequence repeats to resolve the size, structure and contents of bacterial genomes", 
    "pagination": "537", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "d4cc47b92f3b99aef80a356b5827fa0889065aad3afcd47dede3c20d368d9abe"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "23924250"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100965258"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2164-14-537"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1044673702"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2164-14-537", 
      "https://app.dimensions.ai/details/publication/pub.1044673702"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T14:08", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8660_00000507.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2F1471-2164-14-537"
  }
]
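
A record shaped like the JSON-LD above can be read with any JSON parser. The following is a minimal sketch, assuming the record is available as text; the `summarize` helper and the trimmed `SAMPLE` string are illustrative names introduced here and are not part of SciGraph. It extracts the title, the author names in listed order, and the distinct citation ids (a repeated id is kept once).

```python
import json

# Trimmed, hypothetical sample that mirrors the shape of the record above.
SAMPLE = """
[
  {
    "name": "Rapid quantification of sequence repeats to resolve the size, structure and contents of bacterial genomes",
    "author": [
      {"familyName": "Williams", "givenName": "David", "type": "Person"},
      {"familyName": "Trimble", "givenName": "William L", "type": "Person"}
    ],
    "citation": [
      {"id": "sg:pub.10.1038/nmeth.1358", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/nmeth.1358", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1093/nar/gks828", "type": "CreativeWork"}
    ]
  }
]
"""


def summarize(jsonld_text):
    """Return title, ordered author names, and distinct citation ids."""
    # The record is a one-element JSON array, as in the dump above.
    record = json.loads(jsonld_text)[0]
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # dict.fromkeys drops repeated ids while preserving first-seen order.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {"title": record["name"], "authors": authors, "citations": citations}


summary = summarize(SAMPLE)
print(summary["authors"])
print(len(summary["citations"]))
```

The same function should work on the full record, since it only relies on the `name`, `author`, and `citation` keys shown above.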
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2164-14-537'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2164-14-537'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2164-14-537'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2164-14-537'
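
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be scripted. The following is a minimal sketch using only the Python standard library; the format keys (`json-ld`, `nt`, `turtle`, `xml`) and the helper names `build_request` and `fetch` are assumptions made for illustration, and the endpoint may not respond the way it did when this record was published.

```python
import urllib.request

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2164-14-537"

# Hypothetical short keys mapped to the media types used in the curl commands.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the given serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building a request does not touch the network; fetch("turtle") would.
req = build_request("turtle")
print(req.get_header("Accept"))
```

Keeping request construction separate from `fetch` makes it possible to inspect the headers without contacting the server.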


 

This table displays all metadata directly associated with this object as RDF triples.

282 TRIPLES      21 PREDICATES      83 URIs      29 LITERALS      17 BLANK NODES
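
Counts like those in the summary line above could be recomputed from the N-Triples serialization. The following is a rough sketch, not the method the page itself uses: the `count_terms` helper and the three-line `SAMPLE` are hypothetical, the parser handles only simple one-triple-per-line input, and it is an assumption that predicates are also counted as URIs.

```python
# Hypothetical three-triple sample in N-Triples syntax.
SAMPLE = """
<http://example.org/pub> <http://schema.org/name> "A title" .
<http://example.org/pub> <http://schema.org/author> _:b0 .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> <http://example.org/person1> .
"""


def count_terms(ntriples_text):
    """Count triples and distinct predicates, URIs, literals, and blank nodes."""
    triples = 0
    predicates, uris, literals, blanks = set(), set(), set(), set()
    for line in ntriples_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate contain no spaces; the object may (literals).
        subject, predicate, rest = line.split(" ", 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the statement terminator
        triples += 1
        predicates.add(predicate)
        for term in (subject, predicate, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals.add(term)
    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": len(literals),
        "blank_nodes": len(blanks),
    }


counts = count_terms(SAMPLE)
print(counts)
```

For real data, a dedicated RDF parser would be a safer choice than this line splitter, which ignores escapes and other corner cases of the format.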

Subject Predicate Object
1 sg:pub.10.1186/1471-2164-14-537 schema:about N238517e5c1d44648b640316f04e22853
2 N2b9411e56fff4ffc8144abc04aeefce2
3 N553c5dbbd41241378ac0fc50c95a3fbb
4 N5fbe1bfb68144484a5eca80099f1fdbe
5 N64d4d406cd61412987e346cb1d7e4942
6 Nde4a992c9160409bb2f818e27f90f19a
7 Ne7c39acc2604494f95abdc66c1be0af7
8 Nf724d42cfa064fb39593e2fde566625c
9 anzsrc-for:06
10 anzsrc-for:0604
11 schema:author N2fc853434b8e41e1bdfa571ff9eed4c8
12 schema:citation sg:pub.10.1007/s10295-011-1052-2
13 sg:pub.10.1038/298760a0
14 sg:pub.10.1038/nmeth.1358
15 sg:pub.10.1038/nrmicro2850
16 sg:pub.10.1186/1471-2105-11-48
17 sg:pub.10.1186/1471-2105-11-485
18 sg:pub.10.1186/1471-2164-13-92
19 sg:pub.10.1186/1471-2164-8-321
20 sg:pub.10.1186/1471-2164-8-456
21 sg:pub.10.1186/1471-2164-9-517
22 sg:pub.10.1186/gb-2010-11-11-r116
23 https://app.dimensions.ai/details/publication/pub.1075327647
24 https://app.dimensions.ai/details/publication/pub.1075517019
25 https://app.dimensions.ai/details/publication/pub.1082784499
26 https://doi.org/10.1016/0092-8674(84)90436-7
27 https://doi.org/10.1016/0923-2508(91)90033-7
28 https://doi.org/10.1016/j.plasmid.2004.12.006
29 https://doi.org/10.1016/j.ygeno.2012.06.009
30 https://doi.org/10.1016/s0022-2836(05)80360-2
31 https://doi.org/10.1073/pnas.0813249106
32 https://doi.org/10.1073/pnas.0903585106
33 https://doi.org/10.1073/pnas.0913033107
34 https://doi.org/10.1073/pnas.1121464109
35 https://doi.org/10.1089/cmb.1995.2.291
36 https://doi.org/10.1089/cmb.2010.0245
37 https://doi.org/10.1093/bioinformatics/17.suppl_1.s225
38 https://doi.org/10.1093/bioinformatics/bth205
39 https://doi.org/10.1093/bioinformatics/bti039
40 https://doi.org/10.1093/bioinformatics/btm632
41 https://doi.org/10.1093/bioinformatics/btr011
42 https://doi.org/10.1093/molbev/msj085
43 https://doi.org/10.1093/molbev/msj125
44 https://doi.org/10.1093/nar/30.8.1826
45 https://doi.org/10.1093/nar/gks828
46 https://doi.org/10.1099/13500872-142-8-2087
47 https://doi.org/10.1101/gr.126953.111
48 https://doi.org/10.1101/gr.1350803
49 https://doi.org/10.1111/j.1365-2958.1996.tb02481.x
50 https://doi.org/10.1111/j.1365-2958.2006.05172.x
51 https://doi.org/10.1126/science.277.5331.1453
52 https://doi.org/10.1128/jb.00619-08
53 https://doi.org/10.1128/jb.174.14.4525-4529.1992
54 https://doi.org/10.1128/jb.177.20.5784-5789.1995
55 https://doi.org/10.1371/journal.pcbi.1000593
56 https://doi.org/10.1371/journal.pgen.0030142
57 https://doi.org/10.1371/journal.pgen.1000344
58 schema:datePublished 2013-12
59 schema:datePublishedReg 2013-12-01
60 schema:description BACKGROUND: The numerous classes of repeats often impede the assembly of genome sequences from the short reads provided by new sequencing technologies. We demonstrate a simple and rapid means to ascertain the repeat structure and total size of a bacterial or archaeal genome without the need for assembly by directly analyzing the abundances of distinct k-mers among reads. RESULTS: The sensitivity of this procedure to resolve variation within a bacterial species is demonstrated: genome sizes and repeat structure of five environmental strains of E. coli from short Illumina reads were estimated by this method, and total genome sizes corresponded well with those obtained for the same strains by pulsed-field gel electrophoresis. In addition, this approach was applied to read-sets for completed genomes and shown to be accurate over a wide range of microbial genome sizes. CONCLUSIONS: Application of these procedures, based solely on k-mer abundances in short read data sets, allows aspects of genome structure to be resolved that are not apparent from conventional short read assemblies. This knowledge of the repetitive content of genomes provides insights into genome evolution and diversity.
61 schema:genre research_article
62 schema:inLanguage en
63 schema:isAccessibleForFree true
64 schema:isPartOf Na079a9417f4c48f89f38af03cb73b98f
65 Nb8c2203b3cf649389303407fa2c52bb7
66 sg:journal.1023790
67 schema:name Rapid quantification of sequence repeats to resolve the size, structure and contents of bacterial genomes
68 schema:pagination 537
69 schema:productId N1a2ef74d4cac44d8b63c46ce5c9f059e
70 N30dda46cad6c48fab5a534d94b3c28a3
71 N4b594d9fcf1a40988bc4274919789fda
72 N64651187360a4c8ea6fb4714127cbba3
73 Nb82f461ea79f48aaa8b861d3d6184e4d
74 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044673702
75 https://doi.org/10.1186/1471-2164-14-537
76 schema:sdDatePublished 2019-04-10T14:08
77 schema:sdLicense https://scigraph.springernature.com/explorer/license/
78 schema:sdPublisher N818af50aaad14b67bc031bba95eebf1e
79 schema:url http://link.springer.com/10.1186%2F1471-2164-14-537
80 sgo:license sg:explorer/license/
81 sgo:sdDataset articles
82 rdf:type schema:ScholarlyArticle
83 N1a2ef74d4cac44d8b63c46ce5c9f059e schema:name doi
84 schema:value 10.1186/1471-2164-14-537
85 rdf:type schema:PropertyValue
86 N238517e5c1d44648b640316f04e22853 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
87 schema:name Escherichia coli
88 rdf:type schema:DefinedTerm
89 N248913818c84403ebbf80389dc4fb19e rdf:first sg:person.01056757425.91
90 rdf:rest N46611c7a97a64a238e0726c0ae53a0d6
91 N267e292c644c4b15aa146ce9036f5960 rdf:first sg:person.01116103751.19
92 rdf:rest N248913818c84403ebbf80389dc4fb19e
93 N2b9411e56fff4ffc8144abc04aeefce2 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
94 schema:name Genomics
95 rdf:type schema:DefinedTerm
96 N2fc853434b8e41e1bdfa571ff9eed4c8 rdf:first sg:person.01050471232.64
97 rdf:rest N267e292c644c4b15aa146ce9036f5960
98 N30dda46cad6c48fab5a534d94b3c28a3 schema:name dimensions_id
99 schema:value pub.1044673702
100 rdf:type schema:PropertyValue
101 N46611c7a97a64a238e0726c0ae53a0d6 rdf:first sg:person.0623154651.88
102 rdf:rest Nd1b60e8cb2bb4a088df96bb553e38638
103 N4b594d9fcf1a40988bc4274919789fda schema:name pubmed_id
104 schema:value 23924250
105 rdf:type schema:PropertyValue
106 N553c5dbbd41241378ac0fc50c95a3fbb schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
107 schema:name Time Factors
108 rdf:type schema:DefinedTerm
109 N5fbe1bfb68144484a5eca80099f1fdbe schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
110 schema:name Genome, Bacterial
111 rdf:type schema:DefinedTerm
112 N64651187360a4c8ea6fb4714127cbba3 schema:name nlm_unique_id
113 schema:value 100965258
114 rdf:type schema:PropertyValue
115 N64d4d406cd61412987e346cb1d7e4942 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
116 schema:name Sequence Analysis
117 rdf:type schema:DefinedTerm
118 N818af50aaad14b67bc031bba95eebf1e schema:name Springer Nature - SN SciGraph project
119 rdf:type schema:Organization
120 Na079a9417f4c48f89f38af03cb73b98f schema:volumeNumber 14
121 rdf:type schema:PublicationVolume
122 Nb82f461ea79f48aaa8b861d3d6184e4d schema:name readcube_id
123 schema:value d4cc47b92f3b99aef80a356b5827fa0889065aad3afcd47dede3c20d368d9abe
124 rdf:type schema:PropertyValue
125 Nb8c2203b3cf649389303407fa2c52bb7 schema:issueNumber 1
126 rdf:type schema:PublicationIssue
127 Nd1b60e8cb2bb4a088df96bb553e38638 rdf:first sg:person.0600333516.52
128 rdf:rest rdf:nil
129 Nde4a992c9160409bb2f818e27f90f19a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
130 schema:name Repetitive Sequences, Nucleic Acid
131 rdf:type schema:DefinedTerm
132 Ne7c39acc2604494f95abdc66c1be0af7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
133 schema:name Genome Size
134 rdf:type schema:DefinedTerm
135 Nf6f8a6246d044e4083cb6119bcb2d21c schema:name Institute for Genomics & Systems Biology, University of Chicago, 5800 S Ellis Ave, 60637, Chicago, Illinois, USA
136 rdf:type schema:Organization
137 Nf724d42cfa064fb39593e2fde566625c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
138 schema:name Gene Dosage
139 rdf:type schema:DefinedTerm
140 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
141 schema:name Biological Sciences
142 rdf:type schema:DefinedTerm
143 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
144 schema:name Genetics
145 rdf:type schema:DefinedTerm
146 sg:grant.6540342 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2164-14-537
147 rdf:type schema:MonetaryGrant
148 sg:journal.1023790 schema:issn 1471-2164
149 schema:name BMC Genomics
150 rdf:type schema:Periodical
151 sg:person.01050471232.64 schema:affiliation https://www.grid.ac/institutes/grid.47100.32
152 schema:familyName Williams
153 schema:givenName David
154 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01050471232.64
155 rdf:type schema:Person
156 sg:person.01056757425.91 schema:affiliation https://www.grid.ac/institutes/grid.47100.32
157 schema:familyName Shilts
158 schema:givenName Meghan
159 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01056757425.91
160 rdf:type schema:Person
161 sg:person.01116103751.19 schema:affiliation Nf6f8a6246d044e4083cb6119bcb2d21c
162 schema:familyName Trimble
163 schema:givenName William L
164 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01116103751.19
165 rdf:type schema:Person
166 sg:person.0600333516.52 schema:affiliation https://www.grid.ac/institutes/grid.47100.32
167 schema:familyName Ochman
168 schema:givenName Howard
169 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0600333516.52
170 rdf:type schema:Person
171 sg:person.0623154651.88 schema:affiliation https://www.grid.ac/institutes/grid.187073.a
172 schema:familyName Meyer
173 schema:givenName Folker
174 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0623154651.88
175 rdf:type schema:Person
176 sg:pub.10.1007/s10295-011-1052-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013989162
177 https://doi.org/10.1007/s10295-011-1052-2
178 rdf:type schema:CreativeWork
179 sg:pub.10.1038/298760a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003893697
180 https://doi.org/10.1038/298760a0
181 rdf:type schema:CreativeWork
182 sg:pub.10.1038/nmeth.1358 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008886215
183 https://doi.org/10.1038/nmeth.1358
184 rdf:type schema:CreativeWork
185 sg:pub.10.1038/nrmicro2850 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010614211
186 https://doi.org/10.1038/nrmicro2850
187 rdf:type schema:CreativeWork
188 sg:pub.10.1186/1471-2105-11-48 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045472514
189 https://doi.org/10.1186/1471-2105-11-48
190 rdf:type schema:CreativeWork
191 sg:pub.10.1186/1471-2105-11-485 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030070956
192 https://doi.org/10.1186/1471-2105-11-485
193 rdf:type schema:CreativeWork
194 sg:pub.10.1186/1471-2164-13-92 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005137064
195 https://doi.org/10.1186/1471-2164-13-92
196 rdf:type schema:CreativeWork
197 sg:pub.10.1186/1471-2164-8-321 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040671622
198 https://doi.org/10.1186/1471-2164-8-321
199 rdf:type schema:CreativeWork
200 sg:pub.10.1186/1471-2164-8-456 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009030018
201 https://doi.org/10.1186/1471-2164-8-456
202 rdf:type schema:CreativeWork
203 sg:pub.10.1186/1471-2164-9-517 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045248549
204 https://doi.org/10.1186/1471-2164-9-517
205 rdf:type schema:CreativeWork
206 sg:pub.10.1186/gb-2010-11-11-r116 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042567408
207 https://doi.org/10.1186/gb-2010-11-11-r116
208 rdf:type schema:CreativeWork
209 https://app.dimensions.ai/details/publication/pub.1075327647 schema:CreativeWork
210 https://app.dimensions.ai/details/publication/pub.1075517019 schema:CreativeWork
211 https://app.dimensions.ai/details/publication/pub.1082784499 schema:CreativeWork
212 https://doi.org/10.1016/0092-8674(84)90436-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042178000
213 rdf:type schema:CreativeWork
214 https://doi.org/10.1016/0923-2508(91)90033-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031733560
215 rdf:type schema:CreativeWork
216 https://doi.org/10.1016/j.plasmid.2004.12.006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043138519
217 rdf:type schema:CreativeWork
218 https://doi.org/10.1016/j.ygeno.2012.06.009 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036608653
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1016/s0022-2836(05)80360-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013618994
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1073/pnas.0813249106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017699788
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1073/pnas.0903585106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034071251
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1073/pnas.0913033107 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016559401
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1073/pnas.1121464109 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006108084
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1089/cmb.1995.2.291 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245099
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1089/cmb.2010.0245 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043925121
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1093/bioinformatics/17.suppl_1.s225 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040137959
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1093/bioinformatics/bth205 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053359138
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1093/bioinformatics/bti039 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020577976
239 rdf:type schema:CreativeWork
240 https://doi.org/10.1093/bioinformatics/btm632 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040151889
241 rdf:type schema:CreativeWork
242 https://doi.org/10.1093/bioinformatics/btr011 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032486702
243 rdf:type schema:CreativeWork
244 https://doi.org/10.1093/molbev/msj085 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037702207
245 rdf:type schema:CreativeWork
246 https://doi.org/10.1093/molbev/msj125 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005759885
247 rdf:type schema:CreativeWork
248 https://doi.org/10.1093/nar/30.8.1826 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022213293
249 rdf:type schema:CreativeWork
250 https://doi.org/10.1093/nar/gks828 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047396934
251 rdf:type schema:CreativeWork
252 https://doi.org/10.1099/13500872-142-8-2087 schema:sameAs https://app.dimensions.ai/details/publication/pub.1060381262
253 rdf:type schema:CreativeWork
254 https://doi.org/10.1101/gr.126953.111 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022021574
255 rdf:type schema:CreativeWork
256 https://doi.org/10.1101/gr.1350803 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023905199
257 rdf:type schema:CreativeWork
258 https://doi.org/10.1111/j.1365-2958.1996.tb02481.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1028122134
259 rdf:type schema:CreativeWork
260 https://doi.org/10.1111/j.1365-2958.2006.05172.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1012203977
261 rdf:type schema:CreativeWork
262 https://doi.org/10.1126/science.277.5331.1453 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003267570
263 rdf:type schema:CreativeWork
264 https://doi.org/10.1128/jb.00619-08 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040327491
265 rdf:type schema:CreativeWork
266 https://doi.org/10.1128/jb.174.14.4525-4529.1992 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062720657
267 rdf:type schema:CreativeWork
268 https://doi.org/10.1128/jb.177.20.5784-5789.1995 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062724228
269 rdf:type schema:CreativeWork
270 https://doi.org/10.1371/journal.pcbi.1000593 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032365203
271 rdf:type schema:CreativeWork
272 https://doi.org/10.1371/journal.pgen.0030142 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030776001
273 rdf:type schema:CreativeWork
274 https://doi.org/10.1371/journal.pgen.1000344 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002519668
275 rdf:type schema:CreativeWork
276 https://www.grid.ac/institutes/grid.187073.a schema:alternateName Argonne National Laboratory
277 schema:name Institute for Genomics & Systems Biology, University of Chicago, 5800 S Ellis Ave, 60637, Chicago, Illinois, USA
278 Mathematics & Computer Science Division, Argonne National Laboratory, 9700 South Cass Avenue, 60439, Argonne, Illinois, USA
279 rdf:type schema:Organization
280 https://www.grid.ac/institutes/grid.47100.32 schema:alternateName Yale University
281 schema:name Department of Ecology & Evolutionary Biology, Yale University, New Haven, 06520, Connecticut, USA
282 rdf:type schema:Organization
 





