Repetitive DNA and next-generation sequencing: computational challenges and solutions


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2011-11-29

AUTHORS

Todd J. Treangen, Steven L. Salzberg

ABSTRACT

Key Points

  • New high-throughput sequencing technologies have spurred explosive growth in the use of sequencing to discover mutations and structural variants in the human genome and in the number of projects to sequence and assemble new genomes.
  • Highly efficient algorithms have been developed to align next-generation sequences to genomes, and these algorithms use a variety of strategies to place repetitive reads.
  • Ambiguous mapping of sequences that are derived from repetitive regions makes it difficult to identify true polymorphisms and to reconstruct transcripts.
  • Short read lengths combined with mapping ambiguities lead to false reports of single-nucleotide polymorphisms, inserts, deletions and other sequence variants.
  • When assembling a genome de novo, repetitive sequences can lead to erroneous rearrangements, deletions, collapsed repeats and other assembly errors.
  • Long-range linking information from paired-end reads can overcome some of the difficulties in short-read assembly.

PAGES

36-46

References to SciGraph publications

  • 2008-05-30. Stem cell transcriptome profiling via massive-scale mRNA sequencing in NATURE METHODS
  • 2008-05-30. Mapping and quantifying mammalian transcriptomes by RNA-Seq in NATURE METHODS
  • 2009-03-04. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome in GENOME BIOLOGY
  • 2009-09-08. ChIP–seq: advantages and challenges of a maturing technology in NATURE REVIEWS GENETICS
  • 2009-08-30. Personalized copy number and segmental duplication maps using next-generation sequencing in NATURE GENETICS
  • 2011-06-20. Sniper: improved SNP discovery by multiply mapping deep sequenced reads in GENOME BIOLOGY
  • 2009-05-27. The 1001 Genomes Project for Arabidopsis thaliana in GENOME BIOLOGY
  • 2011-03-01. Genome structural variation discovery and genotyping in NATURE REVIEWS GENETICS
  • 2011-03-31. A vertebrate case study of the quality of assemblies derived from next-generation sequences in GENOME BIOLOGY
  • 2008-03-14. Genome assembly forensics: finding the elusive mis-assembly in GENOME BIOLOGY
  • 2011-07-27. An Archaeopteryx-like theropod from China and the origin of Avialae in NATURE
  • 2011-01-01. Integrative genomics viewer in NATURE BIOTECHNOLOGY
  • 2009-10-15. Computational methods for discovering structural variation with next-generation sequencing in NATURE METHODS
  • 2010-10-10. De novo assembly and analysis of RNA-seq data in NATURE METHODS
  • 2011-08-11. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts in GENOME BIOLOGY
  • 2011-07-10. Genome sequence and analysis of the tuber crop potato in NATURE
  • 2010-04-01. State of the art de novo assembly of human genomes from massively parallel sequencing data in HUMAN GENOMICS
  • 2002-05. Alu repeats and human genomic diversity in NATURE REVIEWS GENETICS
  • 2011-03-17. The struggle for life of the genome's selfish architects in BIOLOGY DIRECT
  • 2010-10-27. A map of human genome variation from population-scale sequencing in NATURE
  • 2009-02-11. High tandem repeat content in the genome of the short-lived annual fish Nothobranchius furzeri: a new vertebrate model for aging research in GENOME BIOLOGY
  • 2010-09-17. Advances in understanding cancer genomes through second-generation sequencing in NATURE REVIEWS GENETICS
  • 2010-10-21. FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data in GENOME BIOLOGY
  • 2011-05-25. rnaSeqMap: a Bioconductor package for RNA sequencing data exploration in BMC BIOINFORMATICS
  • 2010-05-02. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation in NATURE BIOTECHNOLOGY
  • 2011-05-15. Full-length transcriptome assembly from RNA-Seq data without a reference genome in NATURE BIOTECHNOLOGY
  • 2000-12. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana in NATURE
  • 2011-05-27. Computational methods for transcriptome annotation and quantification using RNA-seq in NATURE METHODS
  • 2010-11-21. Limitations of next-generation genome sequence assembly in NATURE METHODS
  • 2011-04-13. Assessing the benefits of using mate-pairs to resolve repeats in de novo short-read prokaryotic assemblies in BMC BIOINFORMATICS
  • 2011-04-10. A framework for variation discovery and genotyping using next-generation DNA sequencing data in NATURE GENETICS
Journal

    TITLE

    Nature Reviews Genetics

    ISSUE

    1

    VOLUME

    13

    Related Patents

  • Devices With A Fluid Transport Nanochannel Intersected By A Fluid Sensing Nanochannel And Related Methods
  • Accurate And Fast Mapping Of Reads To Genome
  • Triazole-Based Reader Molecules And Methods For Synthesizing And Use Thereof
  • Nanofluidic Devices For The Rapid Mapping Of Whole Genomes And Related Systems And Methods Of Analysis
  • Methods For Storing Digital Data As, And For Transforming Digital Data Into, Synthetic Dna
  • Nanofluidic Devices For The Rapid Mapping Of Whole Genomes And Related Systems And Methods Of Analysis
  • Nanofluidic Devices For The Rapid Mapping Of Whole Genomes And Related Systems And Methods Of Analysis
  • Methods For Maintaining The Integrity And Identification Of A Nucleic Acid Template In A Multiplex Sequencing Reaction
  • Modulating The Cellular Stress Response
  • Nanofluidic Devices For The Rapid Mapping Of Whole Genomes And Related Systems And Methods Of Analysis
  • Transposable Elements, Tdp-43, And Neurodegenerative Disorders
  • Sequence Assembly
  • Devices And Systems With Fluidic Nanofunnels For Processing Single Molecules
  • Biosynthetic Systems Producing Fungal Indole Alkaloids
  • Nanofluidic Devices For The Rapid Mapping Of Whole Genomes And Related Systems And Methods Of Analysis
  • Methods For Selectively Suppressing Non-Target Sequences
  • Nanofluidic Devices For The Rapid Mapping Of Whole Genomes And Related Systems And Methods Of Analysis
  • Nanofluidic Devices With Integrated Components For The Controlled Capture, Trapping, And Transport Of Macromolecules And Related Methods Of Analysis
  • Nanofluidic Devices With Integrated Components For The Controlled Capture, Trapping, And Transport Of Macromolecules And Related Methods Of Analysis
  • Method For Generating A Masked Reference Sequence Of The Y Chromosome
  • Sequence Assembly
  • Accurate And Fast Mapping Of Reads To Genome
  • Methods For Maintaining The Integrity And Identification Of A Nucleic Acid Template In A Multiplex Sequencing Reaction
  • Devices With Fluidic Nanofunnels, Associated Methods, Fabrication And Analysis Systems
  • Nanofluidic Devices For The Rapid Mapping Of Whole Genomes And Related Systems And Methods Of Analysis
  • Transposable Elements, Tdp-43, And Neurodegenerative Disorders
  • Fluidic Devices With Nanoscale Manifolds For Molecular Transport, Related Systems And Methods Of Analysis
  • Accurate And Fast Mapping Of Targeted Sequencing Reads
  • Method For Determining Copy Number Variations In Sex Chromosomes
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1038/nrg3117

    DOI

    http://dx.doi.org/10.1038/nrg3117

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1044335233

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/22124482



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Biological Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Genetics", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Algorithms", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Animals", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computational Biology", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "DNA", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Genome", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Humans", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Molecular Sequence Data", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Plants", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "RNA", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Repetitive Sequences, Nucleic Acid", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Reproducibility of Results", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Sequence Alignment", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Sequence Analysis, DNA", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Sequence Analysis, RNA", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Software", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "McKusick\u2013Nathans Institute for Genetic Medicine, Johns Hopkins University School of Medicine, 21205, Baltimore, Maryland, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "McKusick\u2013Nathans Institute for Genetic Medicine, Johns Hopkins University School of Medicine, 21205, Baltimore, Maryland, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Treangen", 
            "givenName": "Todd J.", 
            "id": "sg:person.01261673253.41", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01261673253.41"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 21205, Baltimore, Maryland, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "McKusick\u2013Nathans Institute for Genetic Medicine, Johns Hopkins University School of Medicine, 21205, Baltimore, Maryland, USA", 
                "Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 21205, Baltimore, Maryland, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Salzberg", 
            "givenName": "Steven L.", 
            "id": "sg:person.01223441713.02", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01223441713.02"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1038/nrg2841", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017741162", 
              "https://doi.org/10.1038/nrg2841"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-12-95", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020547896", 
              "https://doi.org/10.1186/1471-2105-12-95"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1479-7364-4-4-271", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028588465", 
              "https://doi.org/10.1186/1479-7364-4-4-271"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1745-6150-6-19", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000128333", 
              "https://doi.org/10.1186/1745-6150-6-19"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2010-11-10-r104", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1002592427", 
              "https://doi.org/10.1186/gb-2010-11-10-r104"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/35048692", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1044298669", 
              "https://doi.org/10.1038/35048692"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2009-10-3-r25", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1049583368", 
              "https://doi.org/10.1186/gb-2009-10-3-r25"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/ng.437", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035989827", 
              "https://doi.org/10.1038/ng.437"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2009-10-5-107", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043335129", 
              "https://doi.org/10.1186/gb-2009-10-5-107"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1223", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1048586936", 
              "https://doi.org/10.1038/nmeth.1223"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2009-10-2-r16", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032910724", 
              "https://doi.org/10.1186/gb-2009-10-2-r16"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1374", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019637093", 
              "https://doi.org/10.1038/nmeth.1374"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1517", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032102367", 
              "https://doi.org/10.1038/nmeth.1517"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-12-200", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040958434", 
              "https://doi.org/10.1186/1471-2105-12-200"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2011-12-8-r72", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006381787", 
              "https://doi.org/10.1186/gb-2011-12-8-r72"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2011-12-6-r55", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018738511", 
              "https://doi.org/10.1186/gb-2011-12-6-r55"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1527", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1015617800", 
              "https://doi.org/10.1038/nmeth.1527"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature10288", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025276362", 
              "https://doi.org/10.1038/nature10288"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1613", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009798986", 
              "https://doi.org/10.1038/nmeth.1613"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.1621", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031035095", 
              "https://doi.org/10.1038/nbt.1621"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.1754", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019307928", 
              "https://doi.org/10.1038/nbt.1754"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nrg2958", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004346662", 
              "https://doi.org/10.1038/nrg2958"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature09534", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1010608717", 
              "https://doi.org/10.1038/nature09534"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.1883", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1015803168", 
              "https://doi.org/10.1038/nbt.1883"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature10158", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1002525491", 
              "https://doi.org/10.1038/nature10158"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nrg2641", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006115199", 
              "https://doi.org/10.1038/nrg2641"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2011-12-3-r31", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1002683835", 
              "https://doi.org/10.1186/gb-2011-12-3-r31"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nrg798", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025444771", 
              "https://doi.org/10.1038/nrg798"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2008-9-3-r55", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1003390143", 
              "https://doi.org/10.1186/gb-2008-9-3-r55"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1226", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045381177", 
              "https://doi.org/10.1038/nmeth.1226"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/ng.806", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1010244476", 
              "https://doi.org/10.1038/ng.806"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2011-11-29", 
        "datePublishedReg": "2011-11-29", 
        "description": "Key PointsNew high-throughput sequencing technologies have spurred explosive growth in the use of sequencing to discover mutations and structural variants in the human genome and in the number of projects to sequence and assemble new genomes.Highly efficient algorithms have been developed to align next-generation sequences to genomes, and these algorithms use a variety of strategies to place repetitive reads.Ambiguous mapping of sequences that are derived from repetitive regions makes it difficult to identify true polymorphisms and to reconstruct transcripts.Short read lengths combined with mapping ambiguities lead to false reports of single-nucleotide polymorphisms, inserts, deletions and other sequence variants.When assembling a genome de novo, repetitive sequences can lead to erroneous rearrangements, deletions, collapsed repeats and other assembly errors.Long-range linking information from paired-end reads can overcome some of the difficulties in short-read assembly.", 
        "genre": "article", 
        "id": "sg:pub.10.1038/nrg3117", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.2519905", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2529453", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2529425", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1023607", 
            "issn": [
              "1471-0056", 
              "1471-0064"
            ], 
            "name": "Nature Reviews Genetics", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "13"
          }
        ], 
        "keywords": [
          "high-throughput sequencing technology", 
          "paired-end reads", 
          "short read assemblies", 
          "short read lengths", 
          "next-generation sequences", 
          "repetitive DNA", 
          "new genomes", 
          "human genome", 
          "single nucleotide polymorphisms", 
          "next-generation sequencing", 
          "repetitive sequences", 
          "sequencing technologies", 
          "true polymorphism", 
          "use of sequencing", 
          "repetitive regions", 
          "genome", 
          "structural variants", 
          "sequence variants", 
          "read length", 
          "repetitive reads", 
          "sequencing", 
          "sequence", 
          "deletion", 
          "reads", 
          "polymorphism", 
          "repeats", 
          "transcripts", 
          "variants", 
          "DNA", 
          "ambiguous mapping", 
          "mutations", 
          "novo", 
          "variety of strategies", 
          "assembly", 
          "assembly errors", 
          "rearrangement", 
          "growth", 
          "inserts", 
          "computational challenges", 
          "region", 
          "mapping", 
          "variety", 
          "length", 
          "number", 
          "strategies", 
          "lead", 
          "information", 
          "report", 
          "explosive growth", 
          "use", 
          "challenges", 
          "technology", 
          "project", 
          "number of projects", 
          "difficulties", 
          "solution", 
          "efficient algorithm", 
          "algorithm", 
          "error", 
          "false reports"
        ], 
        "name": "Repetitive DNA and next-generation sequencing: computational challenges and solutions", 
        "pagination": "36-46", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1044335233"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1038/nrg3117"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "22124482"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1038/nrg3117", 
          "https://app.dimensions.ai/details/publication/pub.1044335233"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-12-01T06:29", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_551.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1038/nrg3117"
      }
    ]
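
The nested fields of the JSON-LD record above (identifiers under `productId`, journal, volume and issue under `isPartOf`) can be flattened into a simple citation summary. The following is a minimal Python sketch, not part of the SciGraph tooling: `RECORD_JSON` is an abbreviated hand copy of the record above, and `summarize` is a hypothetical helper name.

```python
import json

# Abbreviated copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1038/nrg3117",
    "name": "Repetitive DNA and next-generation sequencing: computational challenges and solutions",
    "datePublished": "2011-11-29",
    "pagination": "36-46",
    "author": [
      {"familyName": "Treangen", "givenName": "Todd J.", "type": "Person"},
      {"familyName": "Salzberg", "givenName": "Steven L.", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1044335233"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1038/nrg3117"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["22124482"]}
    ],
    "isPartOf": [
      {"id": "sg:journal.1023607", "name": "Nature Reviews Genetics", "type": "Periodical"},
      {"issueNumber": "1", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "13"}
    ]
  }
]
"""


def summarize(record):
    """Flatten nested JSON-LD fields into a plain dict (hypothetical helper)."""
    # productId is a list of name/value pairs; index it by name.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    # isPartOf mixes journal, issue and volume; index it by type.
    parts = {p["type"]: p for p in record.get("isPartOf", [])}
    return {
        "title": record["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])],
        "date": record.get("datePublished"),
        "doi": ids.get("doi"),
        "pubmed_id": ids.get("pubmed_id"),
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "pages": record.get("pagination"),
    }


# The top-level JSON value is a list holding a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary)
```

Indexing `productId` by `name` and `isPartOf` by `type` avoids relying on the order of the list entries, which the record does not appear to guarantee.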
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/nrg3117'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/nrg3117'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/nrg3117'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/nrg3117'
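
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is an illustration under assumptions rather than an official client: the format labels in `ACCEPT_HEADERS` and the helper names `build_request` and `fetch` are made up here, and whether the endpoint still responds is not something this page confirms.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/pub.10.1038/nrg3117"

# Format label (our own naming) -> Accept header taken from the curl commands above.
ACCEPT_HEADERS = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Return a urllib Request asking for the given RDF serialization."""
    if fmt not in ACCEPT_HEADERS:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT_HEADERS[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network; only fetch() does.
req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check offline; `fetch("json-ld")` would then be the equivalent of the first curl command.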


     

    This table displays all metadata directly associated with this object as RDF triples.

    320 TRIPLES      21 PREDICATES      131 URIs      92 LITERALS      22 BLANK NODES
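
Summary counts of this kind can be derived from an N-Triples dump. The sketch below is a simplified Python illustration: `SAMPLE` is a hand-written six-triple excerpt (with the `sg:` and `schema:` prefixes expanded on the assumption that they map to `http://scigraph.springernature.com/` and `http://schema.org/`), the regular expression covers only IRIs, blank nodes and plain quoted literals, and the counting rules in `stats` are our own guess, not necessarily how SciGraph computes its figures.

```python
import re

# Simplified N-Triples line: subject, predicate, object, terminating dot.
# Handles IRIs, blank nodes and plain quoted literals only (no escapes,
# language tags or datatypes), which is enough for this illustration.
TRIPLE_RE = re.compile(
    r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(<[^>]*>|_:\S+|"[^"]*")\s*\.\s*$'
)

# Hand-written excerpt; prefix expansions are assumptions (see lead-in).
SAMPLE = """\
<http://scigraph.springernature.com/pub.10.1038/nrg3117> <http://schema.org/about> _:N04837e1f60374bc0afeb6268d40954ce .
<http://scigraph.springernature.com/pub.10.1038/nrg3117> <http://schema.org/genre> "article" .
<http://scigraph.springernature.com/pub.10.1038/nrg3117> <http://schema.org/pagination> "36-46" .
_:N04837e1f60374bc0afeb6268d40954ce <http://schema.org/inDefinedTermSet> <https://www.nlm.nih.gov/mesh/> .
_:N04837e1f60374bc0afeb6268d40954ce <http://schema.org/name> "Software" .
_:N04837e1f60374bc0afeb6268d40954ce <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/DefinedTerm> .
"""


def parse(text):
    """Split N-Triples text into (subject, predicate, object) tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            raise ValueError(f"unsupported N-Triples line: {line!r}")
        triples.append(match.groups())
    return triples


def stats(triples):
    """Count triples, distinct predicates, and distinct terms by kind."""
    terms = {term for triple in triples for term in triple}
    return {
        "triples": len(triples),
        "predicates": len({p for _, p, _ in triples}),
        "uris": sum(t.startswith("<") for t in terms),
        "literals": sum(t.startswith('"') for t in terms),
        "blank_nodes": sum(t.startswith("_:") for t in terms),
    }


print(stats(parse(SAMPLE)))
```

For real data, a full RDF parser would be the safer choice, since N-Triples literals may carry escapes, language tags and datatypes that this pattern rejects.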

    Subject Predicate Object
    1 sg:pub.10.1038/nrg3117 schema:about N04837e1f60374bc0afeb6268d40954ce
    2 N2a03af37276e47dcb38e21f1f1794741
    3 N380bda08ddef4b298b110ad23c28c8d3
    4 N45a3552e06ec4f428beceb32ba87147b
    5 N471da69a507349b0b31480e9fad75afc
    6 N59b5d6fcfee344b9bfa57d697b6ff972
    7 N6adb105d4e2948e4b86890589760867e
    8 N6c056f5da2434b859aeff12655aad6da
    9 N81805ba7eab84505b54029e3ecbb3170
    10 N8c93fda735b940dd8f3a7c35a58b06d7
    11 N986e0288912544b7b2b6ea651f4080be
    12 N9f52f4138b884effa5f981e8532b7f1a
    13 Nc7bc805276d64ee29acc817ead0ea2da
    14 Nd63231d3cfa047909447e9cf0bc5b0d8
    15 Ne3f9ca353255415b97fd174e59d41e31
    16 anzsrc-for:06
    17 anzsrc-for:0604
    18 schema:author N6faaeb44bae74153800fe0285b16d539
    19 schema:citation sg:pub.10.1038/35048692
    20 sg:pub.10.1038/nature09534
    21 sg:pub.10.1038/nature10158
    22 sg:pub.10.1038/nature10288
    23 sg:pub.10.1038/nbt.1621
    24 sg:pub.10.1038/nbt.1754
    25 sg:pub.10.1038/nbt.1883
    26 sg:pub.10.1038/ng.437
    27 sg:pub.10.1038/ng.806
    28 sg:pub.10.1038/nmeth.1223
    29 sg:pub.10.1038/nmeth.1226
    30 sg:pub.10.1038/nmeth.1374
    31 sg:pub.10.1038/nmeth.1517
    32 sg:pub.10.1038/nmeth.1527
    33 sg:pub.10.1038/nmeth.1613
    34 sg:pub.10.1038/nrg2641
    35 sg:pub.10.1038/nrg2841
    36 sg:pub.10.1038/nrg2958
    37 sg:pub.10.1038/nrg798
    38 sg:pub.10.1186/1471-2105-12-200
    39 sg:pub.10.1186/1471-2105-12-95
    40 sg:pub.10.1186/1479-7364-4-4-271
    41 sg:pub.10.1186/1745-6150-6-19
    42 sg:pub.10.1186/gb-2008-9-3-r55
    43 sg:pub.10.1186/gb-2009-10-2-r16
    44 sg:pub.10.1186/gb-2009-10-3-r25
    45 sg:pub.10.1186/gb-2009-10-5-107
    46 sg:pub.10.1186/gb-2010-11-10-r104
    47 sg:pub.10.1186/gb-2011-12-3-r31
    48 sg:pub.10.1186/gb-2011-12-6-r55
    49 sg:pub.10.1186/gb-2011-12-8-r72
    50 schema:datePublished 2011-11-29
    51 schema:datePublishedReg 2011-11-29
    52 schema:description Key PointsNew high-throughput sequencing technologies have spurred explosive growth in the use of sequencing to discover mutations and structural variants in the human genome and in the number of projects to sequence and assemble new genomes.Highly efficient algorithms have been developed to align next-generation sequences to genomes, and these algorithms use a variety of strategies to place repetitive reads.Ambiguous mapping of sequences that are derived from repetitive regions makes it difficult to identify true polymorphisms and to reconstruct transcripts.Short read lengths combined with mapping ambiguities lead to false reports of single-nucleotide polymorphisms, inserts, deletions and other sequence variants.When assembling a genome de novo, repetitive sequences can lead to erroneous rearrangements, deletions, collapsed repeats and other assembly errors.Long-range linking information from paired-end reads can overcome some of the difficulties in short-read assembly.
    53 schema:genre article
    54 schema:isAccessibleForFree true
    55 schema:isPartOf Nb5fd89ba69ff48ca9da21ab74e786cbb
    56 Nd80576e71e7e43f8b89f6d196745e9d8
    57 sg:journal.1023607
    58 schema:keywords DNA
    59 algorithm
    60 ambiguous mapping
    61 assembly
    62 assembly errors
    63 challenges
    64 computational challenges
    65 deletion
    66 difficulties
    67 efficient algorithm
    68 error
    69 explosive growth
    70 false reports
    71 genome
    72 growth
    73 high-throughput sequencing technology
    74 human genome
    75 information
    76 inserts
    77 lead
    78 length
    79 mapping
    80 mutations
    81 new genomes
    82 next-generation sequences
    83 next-generation sequencing
    84 novo
    85 number
    86 number of projects
    87 paired-end reads
    88 polymorphism
    89 project
    90 read length
    91 reads
    92 rearrangement
    93 region
    94 repeats
    95 repetitive DNA
    96 repetitive reads
    97 repetitive regions
    98 repetitive sequences
    99 report
    100 sequence
    101 sequence variants
    102 sequencing
    103 sequencing technologies
    104 short read assemblies
    105 short read lengths
    106 single nucleotide polymorphisms
    107 solution
    108 strategies
    109 structural variants
    110 technology
    111 transcripts
    112 true polymorphism
    113 use
    114 use of sequencing
    115 variants
    116 variety
    117 variety of strategies
    118 schema:name Repetitive DNA and next-generation sequencing: computational challenges and solutions
    119 schema:pagination 36-46
    120 schema:productId N34c75b2400644253b14be31830417c7e
    121 Nee3344ef2a124247b939ff88c95fbea3
    122 Nfeb55ddca0fb491c88411d20060542f6
    123 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044335233
    124 https://doi.org/10.1038/nrg3117
    125 schema:sdDatePublished 2022-12-01T06:29
    126 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    127 schema:sdPublisher N744683df4cc849c2bb1cbada520eb41e
    128 schema:url https://doi.org/10.1038/nrg3117
    129 sgo:license sg:explorer/license/
    130 sgo:sdDataset articles
    131 rdf:type schema:ScholarlyArticle
    132 N04837e1f60374bc0afeb6268d40954ce schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    133 schema:name Software
    134 rdf:type schema:DefinedTerm
    135 N2a03af37276e47dcb38e21f1f1794741 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    136 schema:name Computational Biology
    137 rdf:type schema:DefinedTerm
    138 N34c75b2400644253b14be31830417c7e schema:name pubmed_id
    139 schema:value 22124482
    140 rdf:type schema:PropertyValue
    141 N380bda08ddef4b298b110ad23c28c8d3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    142 schema:name Sequence Analysis, RNA
    143 rdf:type schema:DefinedTerm
    144 N45a3552e06ec4f428beceb32ba87147b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    145 schema:name Genome
    146 rdf:type schema:DefinedTerm
    147 N471da69a507349b0b31480e9fad75afc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    148 schema:name Sequence Alignment
    149 rdf:type schema:DefinedTerm
    150 N59b5d6fcfee344b9bfa57d697b6ff972 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    151 schema:name Molecular Sequence Data
    152 rdf:type schema:DefinedTerm
    153 N6adb105d4e2948e4b86890589760867e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    154 schema:name RNA
    155 rdf:type schema:DefinedTerm
    156 N6c056f5da2434b859aeff12655aad6da schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    157 schema:name Algorithms
    158 rdf:type schema:DefinedTerm
    159 N6faaeb44bae74153800fe0285b16d539 rdf:first sg:person.01261673253.41
    160 rdf:rest Nd36b621fbffc4b2c8ce658a5967b6d35
    161 N744683df4cc849c2bb1cbada520eb41e schema:name Springer Nature - SN SciGraph project
    162 rdf:type schema:Organization
    163 N81805ba7eab84505b54029e3ecbb3170 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    164 schema:name Reproducibility of Results
    165 rdf:type schema:DefinedTerm
    166 N8c93fda735b940dd8f3a7c35a58b06d7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    167 schema:name Repetitive Sequences, Nucleic Acid
    168 rdf:type schema:DefinedTerm
    169 N986e0288912544b7b2b6ea651f4080be schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    170 schema:name DNA
    171 rdf:type schema:DefinedTerm
    172 N9f52f4138b884effa5f981e8532b7f1a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    173 schema:name Animals
    174 rdf:type schema:DefinedTerm
    175 Nb5fd89ba69ff48ca9da21ab74e786cbb schema:volumeNumber 13
    176 rdf:type schema:PublicationVolume
    177 Nc7bc805276d64ee29acc817ead0ea2da schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    178 schema:name Plants
    179 rdf:type schema:DefinedTerm
    180 Nd36b621fbffc4b2c8ce658a5967b6d35 rdf:first sg:person.01223441713.02
    181 rdf:rest rdf:nil
    182 Nd63231d3cfa047909447e9cf0bc5b0d8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    183 schema:name Sequence Analysis, DNA
    184 rdf:type schema:DefinedTerm
    185 Nd80576e71e7e43f8b89f6d196745e9d8 schema:issueNumber 1
    186 rdf:type schema:PublicationIssue
    187 Ne3f9ca353255415b97fd174e59d41e31 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    188 schema:name Humans
    189 rdf:type schema:DefinedTerm
    190 Nee3344ef2a124247b939ff88c95fbea3 schema:name dimensions_id
    191 schema:value pub.1044335233
    192 rdf:type schema:PropertyValue
    193 Nfeb55ddca0fb491c88411d20060542f6 schema:name doi
    194 schema:value 10.1038/nrg3117
    195 rdf:type schema:PropertyValue
    196 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
    197 schema:name Biological Sciences
    198 rdf:type schema:DefinedTerm
    199 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
    200 schema:name Genetics
    201 rdf:type schema:DefinedTerm
    202 sg:grant.2519905 http://pending.schema.org/fundedItem sg:pub.10.1038/nrg3117
    203 rdf:type schema:MonetaryGrant
    204 sg:grant.2529425 http://pending.schema.org/fundedItem sg:pub.10.1038/nrg3117
    205 rdf:type schema:MonetaryGrant
    206 sg:grant.2529453 http://pending.schema.org/fundedItem sg:pub.10.1038/nrg3117
    207 rdf:type schema:MonetaryGrant
    208 sg:journal.1023607 schema:issn 1471-0056
    209 1471-0064
    210 schema:name Nature Reviews Genetics
    211 schema:publisher Springer Nature
    212 rdf:type schema:Periodical
    213 sg:person.01223441713.02 schema:affiliation grid-institutes:grid.21107.35
    214 schema:familyName Salzberg
    215 schema:givenName Steven L.
    216 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01223441713.02
    217 rdf:type schema:Person
    218 sg:person.01261673253.41 schema:affiliation grid-institutes:grid.21107.35
    219 schema:familyName Treangen
    220 schema:givenName Todd J.
    221 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01261673253.41
    222 rdf:type schema:Person
    223 sg:pub.10.1038/35048692 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044298669
    224 https://doi.org/10.1038/35048692
    225 rdf:type schema:CreativeWork
    226 sg:pub.10.1038/nature09534 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010608717
    227 https://doi.org/10.1038/nature09534
    228 rdf:type schema:CreativeWork
    229 sg:pub.10.1038/nature10158 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002525491
    230 https://doi.org/10.1038/nature10158
    231 rdf:type schema:CreativeWork
    232 sg:pub.10.1038/nature10288 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025276362
    233 https://doi.org/10.1038/nature10288
    234 rdf:type schema:CreativeWork
    235 sg:pub.10.1038/nbt.1621 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031035095
    236 https://doi.org/10.1038/nbt.1621
    237 rdf:type schema:CreativeWork
    238 sg:pub.10.1038/nbt.1754 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019307928
    239 https://doi.org/10.1038/nbt.1754
    240 rdf:type schema:CreativeWork
    241 sg:pub.10.1038/nbt.1883 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015803168
    242 https://doi.org/10.1038/nbt.1883
    243 rdf:type schema:CreativeWork
    244 sg:pub.10.1038/ng.437 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035989827
    245 https://doi.org/10.1038/ng.437
    246 rdf:type schema:CreativeWork
    247 sg:pub.10.1038/ng.806 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010244476
    248 https://doi.org/10.1038/ng.806
    249 rdf:type schema:CreativeWork
    250 sg:pub.10.1038/nmeth.1223 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048586936
    251 https://doi.org/10.1038/nmeth.1223
    252 rdf:type schema:CreativeWork
    253 sg:pub.10.1038/nmeth.1226 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045381177
    254 https://doi.org/10.1038/nmeth.1226
    255 rdf:type schema:CreativeWork
    256 sg:pub.10.1038/nmeth.1374 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019637093
    257 https://doi.org/10.1038/nmeth.1374
    258 rdf:type schema:CreativeWork
    259 sg:pub.10.1038/nmeth.1517 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032102367
    260 https://doi.org/10.1038/nmeth.1517
    261 rdf:type schema:CreativeWork
    262 sg:pub.10.1038/nmeth.1527 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015617800
    263 https://doi.org/10.1038/nmeth.1527
    264 rdf:type schema:CreativeWork
    265 sg:pub.10.1038/nmeth.1613 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009798986
    266 https://doi.org/10.1038/nmeth.1613
    267 rdf:type schema:CreativeWork
    268 sg:pub.10.1038/nrg2641 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006115199
    269 https://doi.org/10.1038/nrg2641
    270 rdf:type schema:CreativeWork
    271 sg:pub.10.1038/nrg2841 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017741162
    272 https://doi.org/10.1038/nrg2841
    273 rdf:type schema:CreativeWork
    274 sg:pub.10.1038/nrg2958 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004346662
    275 https://doi.org/10.1038/nrg2958
    276 rdf:type schema:CreativeWork
    277 sg:pub.10.1038/nrg798 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025444771
    278 https://doi.org/10.1038/nrg798
    279 rdf:type schema:CreativeWork
    280 sg:pub.10.1186/1471-2105-12-200 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040958434
    281 https://doi.org/10.1186/1471-2105-12-200
    282 rdf:type schema:CreativeWork
    283 sg:pub.10.1186/1471-2105-12-95 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020547896
    284 https://doi.org/10.1186/1471-2105-12-95
    285 rdf:type schema:CreativeWork
    286 sg:pub.10.1186/1479-7364-4-4-271 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028588465
    287 https://doi.org/10.1186/1479-7364-4-4-271
    288 rdf:type schema:CreativeWork
    289 sg:pub.10.1186/1745-6150-6-19 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000128333
    290 https://doi.org/10.1186/1745-6150-6-19
    291 rdf:type schema:CreativeWork
    292 sg:pub.10.1186/gb-2008-9-3-r55 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003390143
    293 https://doi.org/10.1186/gb-2008-9-3-r55
    294 rdf:type schema:CreativeWork
    295 sg:pub.10.1186/gb-2009-10-2-r16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032910724
    296 https://doi.org/10.1186/gb-2009-10-2-r16
    297 rdf:type schema:CreativeWork
    298 sg:pub.10.1186/gb-2009-10-3-r25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049583368
    299 https://doi.org/10.1186/gb-2009-10-3-r25
    300 rdf:type schema:CreativeWork
    301 sg:pub.10.1186/gb-2009-10-5-107 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043335129
    302 https://doi.org/10.1186/gb-2009-10-5-107
    303 rdf:type schema:CreativeWork
    304 sg:pub.10.1186/gb-2010-11-10-r104 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002592427
    305 https://doi.org/10.1186/gb-2010-11-10-r104
    306 rdf:type schema:CreativeWork
    307 sg:pub.10.1186/gb-2011-12-3-r31 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002683835
    308 https://doi.org/10.1186/gb-2011-12-3-r31
    309 rdf:type schema:CreativeWork
    310 sg:pub.10.1186/gb-2011-12-6-r55 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018738511
    311 https://doi.org/10.1186/gb-2011-12-6-r55
    312 rdf:type schema:CreativeWork
    313 sg:pub.10.1186/gb-2011-12-8-r72 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006381787
    314 https://doi.org/10.1186/gb-2011-12-8-r72
    315 rdf:type schema:CreativeWork
    316 grid-institutes:grid.21107.35 schema:alternateName Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 21205, Baltimore, Maryland, USA
    317 McKusick–Nathans Institute for Genetic Medicine, Johns Hopkins University School of Medicine, 21205, Baltimore, Maryland, USA
    318 schema:name Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 21205, Baltimore, Maryland, USA
    319 McKusick–Nathans Institute for Genetic Medicine, Johns Hopkins University School of Medicine, 21205, Baltimore, Maryland, USA
    320 rdf:type schema:Organization
     



