Tigmint: correcting assembly errors using linked reads from large molecules


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2018-12

AUTHORS

Shaun D. Jackman, Lauren Coombe, Justin Chu, Rene L. Warren, Benjamin P. Vandervalk, Sarah Yeo, Zhuyi Xue, Hamid Mohamadi, Joerg Bohlmann, Steven J.M. Jones, Inanc Birol

ABSTRACT

BACKGROUND: Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity. As a result, assembly errors are common. In the absence of a reference genome, these misassemblies may be identified by comparing the sequencing data to the assembly and looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembled sequence. Although tools exist to identify and correct misassemblies using Illumina paired-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint to address this gap.

RESULTS: To demonstrate the effectiveness of Tigmint, we applied it to assemblies of a human genome using short reads assembled with ABySS 2.0 and other assemblers. Tigmint reduced the number of misassemblies identified by QUAST in the ABySS assembly by 216 (27%). While scaffolding with ARCS alone more than doubled the scaffold NGA50 of the assembly from 3 to 8 Mbp, the combination of Tigmint and ARCS improved the scaffold NGA50 of the assembly over five-fold to 16.4 Mbp. This notable improvement in contiguity highlights the utility of assembly correction in refining assemblies. We demonstrate the utility of Tigmint in correcting the assemblies of multiple tools, as well as in using Chromium reads to correct and scaffold assemblies of long single-molecule sequencing.

CONCLUSIONS: Scaffolding an assembly that has been corrected with Tigmint yields a final assembly that is both more correct and substantially more contiguous than an assembly that has not been corrected. Using single-molecule sequencing in combination with linked reads enables a genome sequence assembly that achieves both a high sequence contiguity as well as high scaffold contiguity, a feat not currently achievable with either technology alone.

PAGES

393

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/s12859-018-2425-6

DOI

http://dx.doi.org/10.1186/s12859-018-2425-6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1107860833

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/30367597



JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service such as the JSON-LD Playground or Google SDTT.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Jackman", 
        "givenName": "Shaun D.", 
        "id": "sg:person.0674075626.77", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0674075626.77"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Coombe", 
        "givenName": "Lauren", 
        "id": "sg:person.07651147251.54", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07651147251.54"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chu", 
        "givenName": "Justin", 
        "id": "sg:person.0643374476.44", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0643374476.44"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Warren", 
        "givenName": "Rene L.", 
        "id": "sg:person.01020727037.00", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01020727037.00"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vandervalk", 
        "givenName": "Benjamin P.", 
        "id": "sg:person.01055630120.47", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055630120.47"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yeo", 
        "givenName": "Sarah", 
        "id": "sg:person.01262035753.51", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01262035753.51"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Xue", 
        "givenName": "Zhuyi", 
        "id": "sg:person.0615663405.00", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0615663405.00"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Mohamadi", 
        "givenName": "Hamid", 
        "id": "sg:person.01210300076.55", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01210300076.55"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of British Columbia", 
          "id": "https://www.grid.ac/institutes/grid.17091.3e", 
          "name": [
            "University of British Columbia, Michael Smith Laboratories, V6T 1Z4, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Bohlmann", 
        "givenName": "Joerg", 
        "id": "sg:person.01173572437.14", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01173572437.14"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Jones", 
        "givenName": "Steven J.M.", 
        "id": "sg:person.011076371162.80", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011076371162.80"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Birol", 
        "givenName": "Inanc", 
        "id": "sg:person.01120102220.26", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01120102220.26"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1093/bioinformatics/btt086", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000520351"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nbt.3432", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006254156", 
          "https://doi.org/10.1038/nbt.3432"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth.4035", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019059120", 
          "https://doi.org/10.1038/nmeth.4035"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0112963", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019307347"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.092759.109", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020163602"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btv057", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021229711"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp352", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023014918"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btw267", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023931122"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth.3865", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029741262", 
          "https://doi.org/10.1038/nmeth.3865"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.178319.114", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030597566"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature15394", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031274231", 
          "https://doi.org/10.1038/nature15394"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btq033", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036892131"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/sdata.2016.25", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041924932", 
          "https://doi.org/10.1038/sdata.2016.25"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.7717/peerj.996", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046215289"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s13742-015-0076-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050360222", 
          "https://doi.org/10.1186/s13742-015-0076-3"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btw064", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059414662"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.213652.116", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1083534713"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.214346.116", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1083862088"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.214874.116", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1084603641"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/103549", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085103757"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/007666", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085108680"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth.4366", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090738388", 
          "https://doi.org/10.1038/nmeth.4366"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/190454", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1091915038"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btx675", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1092359446"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btx712", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1092524579"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.csbj.2017.10.002", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1092597458"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nbt.4060", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1100685340", 
          "https://doi.org/10.1038/nbt.4060"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/s41592-018-0046-7", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1105107323", 
          "https://doi.org/10.1038/s41592-018-0046-7"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2018-12", 
    "datePublishedReg": "2018-12-01", 
    "description": "BACKGROUND: Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity. As a result, assembly errors are common. In the absence of a reference genome, these misassemblies may be identified by comparing the sequencing data to the assembly and looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembled sequence. Although tools exist to identify and correct misassemblies using Illumina paired-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint to address this gap.\nRESULTS: To demonstrate the effectiveness of Tigmint, we applied it to assemblies of a human genome using short reads assembled with ABySS 2.0 and other assemblers. Tigmint reduced the number of misassemblies identified by QUAST in the ABySS assembly by 216 (27%). While scaffolding with ARCS alone more than doubled the scaffold NGA50 of the assembly from 3 to 8 Mbp, the combination of Tigmint and ARCS improved the scaffold NGA50 of the assembly over five-fold to 16.4 Mbp. This notable improvement in contiguity highlights the utility of assembly correction in refining assemblies. We demonstrate the utility of Tigmint in correcting the assemblies of multiple tools, as well as in using Chromium reads to correct and scaffold assemblies of long single-molecule sequencing.\nCONCLUSIONS: Scaffolding an assembly that has been corrected with Tigmint yields a final assembly that is both more correct and substantially more contiguous than an assembly that has not been corrected. Using single-molecule sequencing in combination with linked reads enables a genome sequence assembly that achieves both a high sequence contiguity as well as high scaffold contiguity, a feat not currently achievable with either technology alone.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/s12859-018-2425-6", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2407571", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "19"
      }
    ], 
    "name": "Tigmint: correcting assembly errors using linked reads from large molecules", 
    "pagination": "393", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "1ad9056f96f00158f193a93824a4b2cd044b3f9ca7c3e71c03f43cc26eed3920"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "30367597"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100965194"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/s12859-018-2425-6"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1107860833"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/s12859-018-2425-6", 
      "https://app.dimensions.ai/details/publication/pub.1107860833"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T21:48", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8687_00000575.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1186%2Fs12859-018-2425-6"
  }
]
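Because the record above is plain JSON, it can be read with any JSON parser. The following is a minimal sketch, not an official SciGraph client: it parses a trimmed-down subset of the record (only a few of the fields shown above) and pulls out author names, citation ids, and the DOI. The helper names `author_names`, `unique_citation_ids`, and `product_id` are hypothetical and chosen only for this illustration.

```python
import json

# Illustrative subset of the JSON-LD record shown above; most fields are omitted.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1186/s12859-018-2425-6",
    "name": "Tigmint: correcting assembly errors using linked reads from large molecules",
    "author": [
      {"familyName": "Jackman", "givenName": "Shaun D.", "type": "Person"},
      {"familyName": "Coombe", "givenName": "Lauren", "type": "Person"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1101/103549", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1101/103549", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/nmeth.4366", "type": "CreativeWork"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/s12859-018-2425-6"]}
    ]
  }
]
"""


def author_names(rec):
    """Return 'givenName familyName' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in rec.get("author", [])]


def unique_citation_ids(rec):
    """Return citation ids with repeats dropped, keeping first-seen order."""
    return list(dict.fromkeys(c["id"] for c in rec.get("citation", [])))


def product_id(rec, name):
    """Return the first value of the productId entry called `name`, or None."""
    for entry in rec.get("productId", []):
        if entry.get("name") == name:
            values = entry.get("value", [])
            return values[0] if values else None
    return None


# The top-level JSON value is a list holding a single record object.
record = json.loads(RECORD_JSON)[0]

print(author_names(record))
print(unique_citation_ids(record))
print(product_id(record, "doi"))
```

The citation ids are deduplicated defensively, in case the same id is listed more than once in a record.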
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s12859-018-2425-6'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s12859-018-2425-6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s12859-018-2425-6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s12859-018-2425-6'
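The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a hedged illustration rather than a supported client: the format labels (`json-ld`, `nt`, `turtle`, `xml`) and the function names `build_request` and `fetch` are assumptions made for this sketch, and whether the endpoint still answers is not something the sketch can guarantee.

```python
import urllib.request

# Base URL and MIME types are taken from the curl commands above.
BASE_URL = "https://scigraph.springernature.com/"

# Hypothetical short labels mapped to the Accept header each format needs.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build (but do not send) a GET request asking for `record_id` in format `fmt`."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT[fmt]},
    )


def fetch(record_id, fmt, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request is side-effect free; call fetch(...) to actually download.
req = build_request("pub.10.1186/s12859-018-2425-6", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check offline; `fetch("pub.10.1186/s12859-018-2425-6", "json-ld")` would then correspond to the first curl command.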


 

This table displays all metadata directly associated with this object as RDF triples.

253 TRIPLES      21 PREDICATES      57 URIs      21 LITERALS      9 BLANK NODES
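In the table that follows, author order is not stored as a plain array. The `schema:author` predicate points at a blank node, and each blank node carries an `rdf:first` (one author) and an `rdf:rest` (the next node), ending at `rdf:nil`. The sketch below shows one way to walk such a list. The node labels and person ids are copied from the `rdf:first`/`rdf:rest` rows of the table; the dictionary layout and the function name `walk_rdf_list` are assumptions made for this illustration, not part of SciGraph.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from the table rows.
AUTHOR_LIST = {
    "N47b074e92cbc4cb3a752d72653f04890": ("sg:person.0674075626.77", "Ne90262aea1bf4e6aa789f345ccd7d74e"),
    "Ne90262aea1bf4e6aa789f345ccd7d74e": ("sg:person.07651147251.54", "N0c28296d8c5f41a5a508b11ba8441543"),
    "N0c28296d8c5f41a5a508b11ba8441543": ("sg:person.0643374476.44", "N46bbd61dfff148c3bd160b6da0656c7c"),
    "N46bbd61dfff148c3bd160b6da0656c7c": ("sg:person.01020727037.00", "N0f8616614c8a4117a40b60fd299eed7f"),
    "N0f8616614c8a4117a40b60fd299eed7f": ("sg:person.01055630120.47", "N5776d7370b3b46a6b1f2785bd34cb5d9"),
    "N5776d7370b3b46a6b1f2785bd34cb5d9": ("sg:person.01262035753.51", "Nd0bdb144d3e94c8eac9891cd921b8a42"),
    "Nd0bdb144d3e94c8eac9891cd921b8a42": ("sg:person.0615663405.00", "N30755739854e40bfa7bf8fb16e43a4e2"),
    "N30755739854e40bfa7bf8fb16e43a4e2": ("sg:person.01210300076.55", "N2b80e21a573649efa5ce90bd58a7fe10"),
    "N2b80e21a573649efa5ce90bd58a7fe10": ("sg:person.01173572437.14", "N40da37a8064d46aea77d2b9bc0b29388"),
    "N40da37a8064d46aea77d2b9bc0b29388": ("sg:person.011076371162.80", "Nb5d5690e3ca04302b0886cf156faba62"),
    "Nb5d5690e3ca04302b0886cf156faba62": ("sg:person.01120102220.26", RDF_NIL),
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from `head` and collect each rdf:first value in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# The head node is the object of the schema:author triple in the table.
authors = walk_rdf_list("N47b074e92cbc4cb3a752d72653f04890", AUTHOR_LIST)
for person in authors:
    print(person)
```

Walking the list this way recovers the same author order as the `author` array in the JSON-LD record, starting with Jackman and ending with Birol.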

Subject Predicate Object
1 sg:pub.10.1186/s12859-018-2425-6 schema:about anzsrc-for:06
2 anzsrc-for:0604
3 schema:author N47b074e92cbc4cb3a752d72653f04890
4 schema:citation sg:pub.10.1038/nature15394
5 sg:pub.10.1038/nbt.3432
6 sg:pub.10.1038/nbt.4060
7 sg:pub.10.1038/nmeth.3865
8 sg:pub.10.1038/nmeth.4035
9 sg:pub.10.1038/nmeth.4366
10 sg:pub.10.1038/s41592-018-0046-7
11 sg:pub.10.1038/sdata.2016.25
12 sg:pub.10.1186/s13742-015-0076-3
13 https://doi.org/10.1016/j.csbj.2017.10.002
14 https://doi.org/10.1093/bioinformatics/btp352
15 https://doi.org/10.1093/bioinformatics/btq033
16 https://doi.org/10.1093/bioinformatics/btt086
17 https://doi.org/10.1093/bioinformatics/btv057
18 https://doi.org/10.1093/bioinformatics/btw064
19 https://doi.org/10.1093/bioinformatics/btw267
20 https://doi.org/10.1093/bioinformatics/btx675
21 https://doi.org/10.1093/bioinformatics/btx712
22 https://doi.org/10.1101/007666
23 https://doi.org/10.1101/103549
24 https://doi.org/10.1101/190454
25 https://doi.org/10.1101/gr.092759.109
26 https://doi.org/10.1101/gr.178319.114
27 https://doi.org/10.1101/gr.213652.116
28 https://doi.org/10.1101/gr.214346.116
29 https://doi.org/10.1101/gr.214874.116
30 https://doi.org/10.1371/journal.pone.0112963
31 https://doi.org/10.7717/peerj.996
32 schema:datePublished 2018-12
33 schema:datePublishedReg 2018-12-01
34 schema:description BACKGROUND: Genome sequencing yields the sequence of many short snippets of DNA (reads) from a genome. Genome assembly attempts to reconstruct the original genome from which these reads were derived. This task is difficult due to gaps and errors in the sequencing data, repetitive sequence in the underlying genome, and heterozygosity. As a result, assembly errors are common. In the absence of a reference genome, these misassemblies may be identified by comparing the sequencing data to the assembly and looking for discrepancies between the two. Once identified, these misassemblies may be corrected, improving the quality of the assembled sequence. Although tools exist to identify and correct misassemblies using Illumina paired-end and mate-pair sequencing, no such tool yet exists that makes use of the long distance information of the large molecules provided by linked reads, such as those offered by the 10x Genomics Chromium platform. We have developed the tool Tigmint to address this gap. RESULTS: To demonstrate the effectiveness of Tigmint, we applied it to assemblies of a human genome using short reads assembled with ABySS 2.0 and other assemblers. Tigmint reduced the number of misassemblies identified by QUAST in the ABySS assembly by 216 (27%). While scaffolding with ARCS alone more than doubled the scaffold NGA50 of the assembly from 3 to 8 Mbp, the combination of Tigmint and ARCS improved the scaffold NGA50 of the assembly over five-fold to 16.4 Mbp. This notable improvement in contiguity highlights the utility of assembly correction in refining assemblies. We demonstrate the utility of Tigmint in correcting the assemblies of multiple tools, as well as in using Chromium reads to correct and scaffold assemblies of long single-molecule sequencing. 
CONCLUSIONS: Scaffolding an assembly that has been corrected with Tigmint yields a final assembly that is both more correct and substantially more contiguous than an assembly that has not been corrected. Using single-molecule sequencing in combination with linked reads enables a genome sequence assembly that achieves both a high sequence contiguity as well as high scaffold contiguity, a feat not currently achievable with either technology alone.
35 schema:genre research_article
36 schema:inLanguage en
37 schema:isAccessibleForFree true
38 schema:isPartOf N50f38971a1eb4e2da74051002df5121c
39 Nc96246df15ab4f4d89d48ac9f6685cef
40 sg:journal.1023786
41 schema:name Tigmint: correcting assembly errors using linked reads from large molecules
42 schema:pagination 393
43 schema:productId N34e95f66e7d249c69618c51ab315eb30
44 N36d3aa22a6474bbca6e93461c29fc457
45 N7003aa08ac3f46f8ade57ee1c34c47c5
46 Na6052c4375c44f85b78630283ea45df4
47 Nf1788381bfe8447f981db65458175301
48 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107860833
49 https://doi.org/10.1186/s12859-018-2425-6
50 schema:sdDatePublished 2019-04-10T21:48
51 schema:sdLicense https://scigraph.springernature.com/explorer/license/
52 schema:sdPublisher Nf91b730bccf44b8b91733f15d9d6fba3
53 schema:url https://link.springer.com/10.1186%2Fs12859-018-2425-6
54 sgo:license sg:explorer/license/
55 sgo:sdDataset articles
56 rdf:type schema:ScholarlyArticle
57 N0c28296d8c5f41a5a508b11ba8441543 rdf:first sg:person.0643374476.44
58 rdf:rest N46bbd61dfff148c3bd160b6da0656c7c
59 N0f8616614c8a4117a40b60fd299eed7f rdf:first sg:person.01055630120.47
60 rdf:rest N5776d7370b3b46a6b1f2785bd34cb5d9
61 N2b80e21a573649efa5ce90bd58a7fe10 rdf:first sg:person.01173572437.14
62 rdf:rest N40da37a8064d46aea77d2b9bc0b29388
63 N2e629693d98640e9a77e3207ab40e324 schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
64 rdf:type schema:Organization
65 N2e860c45709a48b7af4f455ec2014757 schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
66 rdf:type schema:Organization
67 N30755739854e40bfa7bf8fb16e43a4e2 rdf:first sg:person.01210300076.55
68 rdf:rest N2b80e21a573649efa5ce90bd58a7fe10
69 N34e95f66e7d249c69618c51ab315eb30 schema:name doi
70 schema:value 10.1186/s12859-018-2425-6
71 rdf:type schema:PropertyValue
72 N36d3aa22a6474bbca6e93461c29fc457 schema:name pubmed_id
73 schema:value 30367597
74 rdf:type schema:PropertyValue
75 N40da37a8064d46aea77d2b9bc0b29388 rdf:first sg:person.011076371162.80
76 rdf:rest Nb5d5690e3ca04302b0886cf156faba62
77 N46bbd61dfff148c3bd160b6da0656c7c rdf:first sg:person.01020727037.00
78 rdf:rest N0f8616614c8a4117a40b60fd299eed7f
79 N47b074e92cbc4cb3a752d72653f04890 rdf:first sg:person.0674075626.77
80 rdf:rest Ne90262aea1bf4e6aa789f345ccd7d74e
81 N50f38971a1eb4e2da74051002df5121c schema:volumeNumber 19
82 rdf:type schema:PublicationVolume
83 N5776d7370b3b46a6b1f2785bd34cb5d9 rdf:first sg:person.01262035753.51
84 rdf:rest Nd0bdb144d3e94c8eac9891cd921b8a42
85 N7003aa08ac3f46f8ade57ee1c34c47c5 schema:name nlm_unique_id
86 schema:value 100965194
87 rdf:type schema:PropertyValue
88 N8abe3b362afd426591e5ce6d152ecc38 schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
89 rdf:type schema:Organization
90 N8eb1e4443e0e48fdb2b1ddc955a3c8b5 schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
91 rdf:type schema:Organization
92 N91e07d2a1ee24d91b8ac9178fa1b5330 schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
93 rdf:type schema:Organization
94 Na6052c4375c44f85b78630283ea45df4 schema:name readcube_id
95 schema:value 1ad9056f96f00158f193a93824a4b2cd044b3f9ca7c3e71c03f43cc26eed3920
96 rdf:type schema:PropertyValue
97 Na9c3bc1c1e7e4612973b27838557e27f schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
98 rdf:type schema:Organization
99 Naf47fda078124e9895920203086f23cb schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
100 rdf:type schema:Organization
101 Nb5d5690e3ca04302b0886cf156faba62 rdf:first sg:person.01120102220.26
102 rdf:rest rdf:nil
103 Nbe2fa545e9ab45809b84943bd3ec883d schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
104 rdf:type schema:Organization
105 Nbed73424ef05423bac778ccd6e40eb01 schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
106 rdf:type schema:Organization
107 Nc96246df15ab4f4d89d48ac9f6685cef schema:issueNumber 1
108 rdf:type schema:PublicationIssue
109 Nd0bdb144d3e94c8eac9891cd921b8a42 rdf:first sg:person.0615663405.00
110 rdf:rest N30755739854e40bfa7bf8fb16e43a4e2
111 Ne90262aea1bf4e6aa789f345ccd7d74e rdf:first sg:person.07651147251.54
112 rdf:rest N0c28296d8c5f41a5a508b11ba8441543
113 Nf1788381bfe8447f981db65458175301 schema:name dimensions_id
114 schema:value pub.1107860833
115 rdf:type schema:PropertyValue
116 Nf422e3e43e794b17887de0e9fa301718 schema:name BC Cancer Genome Sciences Centre, V5Z 4S6, Vancouver, BC, Canada
117 rdf:type schema:Organization
118 Nf91b730bccf44b8b91733f15d9d6fba3 schema:name Springer Nature - SN SciGraph project
119 rdf:type schema:Organization
120 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
121 schema:name Biological Sciences
122 rdf:type schema:DefinedTerm
123 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
124 schema:name Genetics
125 rdf:type schema:DefinedTerm
126 sg:grant.2407571 http://pending.schema.org/fundedItem sg:pub.10.1186/s12859-018-2425-6
127 rdf:type schema:MonetaryGrant
128 sg:journal.1023786 schema:issn 1471-2105
129 schema:name BMC Bioinformatics
130 rdf:type schema:Periodical
131 sg:person.01020727037.00 schema:affiliation N2e629693d98640e9a77e3207ab40e324
132 schema:familyName Warren
133 schema:givenName Rene L.
134 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01020727037.00
135 rdf:type schema:Person
136 sg:person.01055630120.47 schema:affiliation N91e07d2a1ee24d91b8ac9178fa1b5330
137 schema:familyName Vandervalk
138 schema:givenName Benjamin P.
139 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01055630120.47
140 rdf:type schema:Person
141 sg:person.011076371162.80 schema:affiliation Naf47fda078124e9895920203086f23cb
142 schema:familyName Jones
143 schema:givenName Steven J.M.
144 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011076371162.80
145 rdf:type schema:Person
146 sg:person.01120102220.26 schema:affiliation N8eb1e4443e0e48fdb2b1ddc955a3c8b5
147 schema:familyName Birol
148 schema:givenName Inanc
149 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01120102220.26
150 rdf:type schema:Person
151 sg:person.01173572437.14 schema:affiliation https://www.grid.ac/institutes/grid.17091.3e
152 schema:familyName Bohlmann
153 schema:givenName Joerg
154 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01173572437.14
155 rdf:type schema:Person
156 sg:person.01210300076.55 schema:affiliation N2e860c45709a48b7af4f455ec2014757
157 schema:familyName Mohamadi
158 schema:givenName Hamid
159 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01210300076.55
160 rdf:type schema:Person
161 sg:person.01262035753.51 schema:affiliation Nbe2fa545e9ab45809b84943bd3ec883d
162 schema:familyName Yeo
163 schema:givenName Sarah
164 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01262035753.51
165 rdf:type schema:Person
166 sg:person.0615663405.00 schema:affiliation Na9c3bc1c1e7e4612973b27838557e27f
167 schema:familyName Xue
168 schema:givenName Zhuyi
169 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0615663405.00
170 rdf:type schema:Person
171 sg:person.0643374476.44 schema:affiliation Nbed73424ef05423bac778ccd6e40eb01
172 schema:familyName Chu
173 schema:givenName Justin
174 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0643374476.44
175 rdf:type schema:Person
176 sg:person.0674075626.77 schema:affiliation N8abe3b362afd426591e5ce6d152ecc38
177 schema:familyName Jackman
178 schema:givenName Shaun D.
179 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0674075626.77
180 rdf:type schema:Person
181 sg:person.07651147251.54 schema:affiliation Nf422e3e43e794b17887de0e9fa301718
182 schema:familyName Coombe
183 schema:givenName Lauren
184 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07651147251.54
185 rdf:type schema:Person
186 sg:pub.10.1038/nature15394 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031274231
187 https://doi.org/10.1038/nature15394
188 rdf:type schema:CreativeWork
189 sg:pub.10.1038/nbt.3432 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006254156
190 https://doi.org/10.1038/nbt.3432
191 rdf:type schema:CreativeWork
192 sg:pub.10.1038/nbt.4060 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100685340
193 https://doi.org/10.1038/nbt.4060
194 rdf:type schema:CreativeWork
195 sg:pub.10.1038/nmeth.3865 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029741262
196 https://doi.org/10.1038/nmeth.3865
197 rdf:type schema:CreativeWork
198 sg:pub.10.1038/nmeth.4035 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019059120
199 https://doi.org/10.1038/nmeth.4035
200 rdf:type schema:CreativeWork
201 sg:pub.10.1038/nmeth.4366 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090738388
202 https://doi.org/10.1038/nmeth.4366
203 rdf:type schema:CreativeWork
204 sg:pub.10.1038/s41592-018-0046-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105107323
205 https://doi.org/10.1038/s41592-018-0046-7
206 rdf:type schema:CreativeWork
207 sg:pub.10.1038/sdata.2016.25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041924932
208 https://doi.org/10.1038/sdata.2016.25
209 rdf:type schema:CreativeWork
210 sg:pub.10.1186/s13742-015-0076-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050360222
211 https://doi.org/10.1186/s13742-015-0076-3
212 rdf:type schema:CreativeWork
213 https://doi.org/10.1016/j.csbj.2017.10.002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092597458
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1093/bioinformatics/btp352 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023014918
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1093/bioinformatics/btq033 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036892131
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1093/bioinformatics/btt086 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000520351
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1093/bioinformatics/btv057 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021229711
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1093/bioinformatics/btw064 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059414662
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1093/bioinformatics/btw267 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023931122
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1093/bioinformatics/btx675 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092359446
228 rdf:type schema:CreativeWork
229 https://doi.org/10.1093/bioinformatics/btx712 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092524579
230 rdf:type schema:CreativeWork
231 https://doi.org/10.1101/007666 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085108680
232 rdf:type schema:CreativeWork
233 https://doi.org/10.1101/103549 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085103757
234 rdf:type schema:CreativeWork
235 https://doi.org/10.1101/190454 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091915038
236 rdf:type schema:CreativeWork
237 https://doi.org/10.1101/gr.092759.109 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020163602
238 rdf:type schema:CreativeWork
239 https://doi.org/10.1101/gr.178319.114 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030597566
240 rdf:type schema:CreativeWork
241 https://doi.org/10.1101/gr.213652.116 schema:sameAs https://app.dimensions.ai/details/publication/pub.1083534713
242 rdf:type schema:CreativeWork
243 https://doi.org/10.1101/gr.214346.116 schema:sameAs https://app.dimensions.ai/details/publication/pub.1083862088
244 rdf:type schema:CreativeWork
245 https://doi.org/10.1101/gr.214874.116 schema:sameAs https://app.dimensions.ai/details/publication/pub.1084603641
246 rdf:type schema:CreativeWork
247 https://doi.org/10.1371/journal.pone.0112963 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019307347
248 rdf:type schema:CreativeWork
249 https://doi.org/10.7717/peerj.996 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046215289
250 rdf:type schema:CreativeWork
251 https://www.grid.ac/institutes/grid.17091.3e schema:alternateName University of British Columbia
252 schema:name University of British Columbia, Michael Smith Laboratories, V6T 1Z4, Vancouver, BC, Canada
253 rdf:type schema:Organization
 





