A general approach to single-nucleotide polymorphism discovery


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

1999-12

AUTHORS

Gabor T. Marth, Ian Korf, Mark D. Yandell, Raymond T. Yeh, Zhijie Gu, Hamideh Zakeri, Nathan O. Stitziel, LaDeana Hillier, Pui-Yan Kwok, Warren R. Gish

ABSTRACT

Single-nucleotide polymorphisms (SNPs) are the most abundant form of human genetic variation and a resource for mapping complex genetic traits. The large volume of data produced by high-throughput sequencing projects is a rich and largely untapped source of SNPs (refs 2, 3, 4, 5). We present here a unified approach to the discovery of variations in genetic sequence data of arbitrary DNA sources. We propose to use the rapidly emerging genomic sequence as a template on which to layer often unmapped, fragmentary sequence data and to use base quality values to discern true allelic variations from sequencing errors. By taking advantage of the genomic sequence we are able to use simpler yet more accurate methods for sequence organization: fragment clustering, paralogue identification and multiple alignment. We analyse these sequences with a novel, Bayesian inference engine, POLYBAYES, to calculate the probability that a given site is polymorphic. Rigorous treatment of base quality permits completely automated evaluation of the full length of all sequences, without limitations on alignment depth. We demonstrate this approach by accurate SNP predictions in human ESTs aligned to finished and working-draft quality genomic sequences, a data set representative of the typical challenges of sequence-based SNP discovery.
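
The abstract describes using base quality values in a Bayesian calculation of the probability that a site is polymorphic. The sketch below illustrates that general idea with a deliberately simplified toy model; it is not the POLYBAYES algorithm, whose details are not given in this record. Assumptions made here for illustration: reads are independent, a polymorphic site carries exactly two alleles sampled with equal probability, all bases are equally likely a priori, quality values are Phred-scaled, and the prior `prior_poly` and all function names are hypothetical.

```python
from itertools import combinations
from math import prod

BASES = "ACGT"


def phred_to_error(q):
    """Convert a Phred-scaled quality value to an error probability."""
    return 10 ** (-q / 10)


def base_likelihood(called, true_base, q):
    """P(called base | true base), spreading errors evenly over the 3 other bases."""
    e = phred_to_error(q)
    return 1 - e if called == true_base else e / 3


def snp_posterior(calls, prior_poly=0.001):
    """Posterior probability that a column of (base, phred) calls is polymorphic.

    Toy model (an assumption, not the published method): the site is either
    monomorphic for one base or polymorphic for one unordered pair of bases,
    with each read drawn from either allele with probability 0.5.
    """
    # Likelihood of the column under each monomorphic hypothesis.
    mono = [
        prod(base_likelihood(b, t, q) for b, q in calls)
        for t in BASES
    ]
    # Likelihood under each two-allele hypothesis.
    poly = [
        prod(
            0.5 * (base_likelihood(b, x, q) + base_likelihood(b, y, q))
            for b, q in calls
        )
        for x, y in combinations(BASES, 2)
    ]
    like_mono = sum(mono) / len(mono)  # uniform prior over the 4 bases
    like_poly = sum(poly) / len(poly)  # uniform prior over the 6 pairs
    numerator = prior_poly * like_poly
    return numerator / (numerator + (1 - prior_poly) * like_mono)
```

In this toy model a single discordant base with a low quality value contributes little evidence for a polymorphism, whereas the same discordant base with a high quality value raises the posterior substantially, which is the intuition behind weighing base quality rather than counting mismatches.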

PAGES

ng1299_452

Journal

TITLE

Nature Genetics

ISSUE

4

VOLUME

23

Related Patents

  • Method And System For Calling Variations In A Sample Polynucleotide Sequence With Respect To A Reference Polynucleotide Sequence
  • Method For High-Throughput Aflp-Based Polymorphism Detection
  • Methods And System For Detecting Sequence Variants
  • Polymorphisms In The Human Gene For The Multidrug Resistance-Associated Protein 1 (Mrp-1) And Their Use In Diagnostic And Therapeutic Applications
  • Systems And Methods For Using Paired-End Data In Directed Acyclic Structure
  • Novel Pharmacogene Single Nucleotide Polymorphisms And Methods Of Detecting Same
  • Method, Computer-Accessible Medium And System For Base-Calling And Alignment

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1038/70570

    DOI

    http://dx.doi.org/10.1038/70570

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1011138564

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/10581034



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Genetics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Biological Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Algorithms", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Alleles", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Bayes Theorem", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Data Interpretation, Statistical", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Expressed Sequence Tags", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Genetic Techniques", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Genetic Variation", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Genome, Human", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Humans", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Polymorphism, Single Nucleotide", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Sequence Alignment", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Software", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA ."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Marth", 
            "givenName": "Gabor T.", 
            "id": "sg:person.0617524562.14", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0617524562.14"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA ."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Korf", 
            "givenName": "Ian", 
            "id": "sg:person.01063041236.78", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01063041236.78"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA ."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Yandell", 
            "givenName": "Mark D.", 
            "id": "sg:person.0604140657.40", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0604140657.40"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA ."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Yeh", 
            "givenName": "Raymond T.", 
            "id": "sg:person.01121405025.22", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01121405025.22"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Division of Dermatology, St. Louis, Missouri, USA."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Gu", 
            "givenName": "Zhijie", 
            "id": "sg:person.01304003176.38", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01304003176.38"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Division of Dermatology, St. Louis, Missouri, USA."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Zakeri", 
            "givenName": "Hamideh", 
            "id": "sg:person.0724166244.21", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0724166244.21"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA ."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Stitziel", 
            "givenName": "Nathan O.", 
            "id": "sg:person.01065735320.09", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065735320.09"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA ."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Hillier", 
            "givenName": "LaDeana", 
            "id": "sg:person.01014313442.46", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01014313442.46"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Division of Dermatology, St. Louis, Missouri, USA."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Kwok", 
            "givenName": "Pui-Yan", 
            "id": "sg:person.01221224024.09", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01221224024.09"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Washington University in St. Louis", 
              "id": "https://www.grid.ac/institutes/grid.4367.6", 
              "name": [
                "Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA ."
              ], 
              "type": "Organization"
            }, 
            "familyName": "Gish", 
            "givenName": "Warren R.", 
            "id": "sg:person.01344711107.89", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344711107.89"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1101/gr.6.9.807", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000850162"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/geno.1994.1469", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008038684"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/geno.1997.5042", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008775422"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/10290", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008973203", 
              "https://doi.org/10.1038/10290"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1098/rstl.1763.0053", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1010033334"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/10297", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011881979", 
              "https://doi.org/10.1038/10297"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.8.3.195", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018564763"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.6.11.1118", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025457008"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1126/science.282.5389.682", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027647963"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.8.3.161", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031785094"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.8.3.186", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1038920266"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/6851", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1039024064", 
              "https://doi.org/10.1038/6851"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/ng0893-373", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1039932477", 
              "https://doi.org/10.1038/ng0893-373"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/907", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1044380493", 
              "https://doi.org/10.1038/907"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.8.3.175", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1048253030"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.6.9.829", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1051679296"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1126/science.270.5244.1945", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062551849"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1126/science.278.5343.1580", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062558819"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1126/science.280.5366.1077", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062561068"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1126/science.280.5369.1540", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062561324"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1074251583", 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.8.7.748", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1083298515"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "1999-12", 
        "datePublishedReg": "1999-12-01", 
        "description": "Single-nucleotide polymorphisms (SNPs) are the most abundant form of human genetic variation and a resource for mapping complex genetic traits. The large volume of data produced by high-throughput sequencing projects is a rich and largely untapped source of SNPs (refs 2, 3, 4, 5). We present here a unified approach to the discovery of variations in genetic sequence data of arbitrary DNA sources. We propose to use the rapidly emerging genomic sequence as a template on which to layer often unmapped, fragmentary sequence data and to use base quality values to discern true allelic variations from sequencing errors. By taking advantage of the genomic sequence we are able to use simpler yet more accurate methods for sequence organization: fragment clustering, paralogue identification and multiple alignment. We analyse these sequences with a novel, Bayesian inference engine, POLYBAYES, to calculate the probability that a given site is polymorphic. Rigorous treatment of base quality permits completely automated evaluation of the full length of all sequences, without limitations on alignment depth. We demonstrate this approach by accurate SNP predictions in human ESTs aligned to finished and working-draft quality genomic sequences, a data set representative of the typical challenges of sequence-based SNP discovery.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1038/70570", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.2529016", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2682319", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2440598", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1103138", 
            "issn": [
              "1061-4036", 
              "1546-1718"
            ], 
            "name": "Nature Genetics", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "4", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "23"
          }
        ], 
        "name": "A general approach to single-nucleotide polymorphism discovery", 
        "pagination": "ng1299_452", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "3511af174d52a28b4cb7e4fa1c976aba8f896fa900f219ab60e87966e6d9ff06"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "10581034"
            ]
          }, 
          {
            "name": "nlm_unique_id", 
            "type": "PropertyValue", 
            "value": [
              "9216904"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1038/70570"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1011138564"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1038/70570", 
          "https://app.dimensions.ai/details/publication/pub.1011138564"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-11T12:26", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000362_0000000362/records_87112_00000000.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "http://www.nature.com/articles/ng1299_452"
      }
    ]
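
A record of this shape can be read with any JSON parser. The following is a minimal sketch of pulling author names and citation identifiers out of it; the `SAMPLE` string is a heavily trimmed excerpt of the record above, and the helper names `author_names` and `unique_citation_ids` are hypothetical, not part of any SciGraph tooling.

```python
import json

# Trimmed excerpt of the JSON-LD record above (most fields omitted).
SAMPLE = """
[
  {
    "id": "sg:pub.10.1038/70570",
    "author": [
      {"familyName": "Marth", "givenName": "Gabor T.", "type": "Person"},
      {"familyName": "Korf", "givenName": "Ian", "type": "Person"}
    ],
    "citation": [
      {"id": "sg:pub.10.1038/10290", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/10290", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1101/gr.8.3.195", "type": "CreativeWork"}
    ]
  }
]
"""


def author_names(record):
    """Return 'given family' strings in the order the authors are listed."""
    return [
        f"{a['givenName']} {a['familyName']}"
        for a in record.get("author", [])
    ]


def unique_citation_ids(record):
    """Return citation ids in first-seen order, dropping any repeats."""
    return list(dict.fromkeys(c["id"] for c in record.get("citation", [])))


# The top-level JSON value is a list holding a single record object.
record = json.loads(SAMPLE)[0]
names = author_names(record)
citations = unique_citation_ids(record)
```

Using `dict.fromkeys` keeps the original citation order while discarding repeated ids, which is convenient if an entry appears more than once in the `citation` array.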
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/70570'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/70570'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/70570'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/70570'
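
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is an illustration under assumptions: the names `MIME_TYPES`, `build_request` and `fetch` are hypothetical, and whether the endpoint still answers at this URL is not something this record confirms.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Format label -> media type, taken from the curl examples above.
MIME_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": MIME_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text.

    Performs a network call; raises urllib.error.URLError on failure.
    """
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `fetch("pub.10.1038/70570", "turtle")` would issue the same request as the Turtle curl command. Keeping `build_request` separate from `fetch` makes the header logic checkable without touching the network.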


     

    This table displays all metadata directly associated to this object as RDF triples.

    257 TRIPLES      21 PREDICATES      63 URIs      33 LITERALS      21 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1038/70570 schema:about N12e95cb7eb4144ea8fa432cf4eea3504
    2 N1c21e178aee04bf2b4d4cdb3a0e5fdee
    3 N2e336f357ceb458d91b975a1733f1946
    4 N3d311dc97f224f3ebb7d9a5357b65fca
    5 N590491b7ff6149a79197e166f8541dc6
    6 N5c62143c5e6d45879ddcf64c0c48e47e
    7 N6163a894393445099e9435e8947ad339
    8 N892212934bff43c786c5d5b7c59f10ef
    9 N9bd768c8f5724e329eece4972db62e8e
    10 N9cca7355a66e4020b1afebbca740d6fc
    11 Ne9914a7aa0924ada936c2a8cf2ea24f3
    12 Nfda74df2f2014dc1b4521985fbb10fea
    13 anzsrc-for:06
    14 anzsrc-for:0604
    15 schema:author N438dedae7ce34c27bde49b2e4914f3a3
    16 schema:citation sg:pub.10.1038/10290
    17 sg:pub.10.1038/10297
    18 sg:pub.10.1038/6851
    19 sg:pub.10.1038/907
    20 sg:pub.10.1038/ng0893-373
    21 https://app.dimensions.ai/details/publication/pub.1074251583
    22 https://doi.org/10.1006/geno.1994.1469
    23 https://doi.org/10.1006/geno.1997.5042
    24 https://doi.org/10.1098/rstl.1763.0053
    25 https://doi.org/10.1101/gr.6.11.1118
    26 https://doi.org/10.1101/gr.6.9.807
    27 https://doi.org/10.1101/gr.6.9.829
    28 https://doi.org/10.1101/gr.8.3.161
    29 https://doi.org/10.1101/gr.8.3.175
    30 https://doi.org/10.1101/gr.8.3.186
    31 https://doi.org/10.1101/gr.8.3.195
    32 https://doi.org/10.1101/gr.8.7.748
    33 https://doi.org/10.1126/science.270.5244.1945
    34 https://doi.org/10.1126/science.278.5343.1580
    35 https://doi.org/10.1126/science.280.5366.1077
    36 https://doi.org/10.1126/science.280.5369.1540
    37 https://doi.org/10.1126/science.282.5389.682
    38 schema:datePublished 1999-12
    39 schema:datePublishedReg 1999-12-01
    40 schema:description Single-nucleotide polymorphisms (SNPs) are the most abundant form of human genetic variation and a resource for mapping complex genetic traits. The large volume of data produced by high-throughput sequencing projects is a rich and largely untapped source of SNPs (refs 2, 3, 4, 5). We present here a unified approach to the discovery of variations in genetic sequence data of arbitrary DNA sources. We propose to use the rapidly emerging genomic sequence as a template on which to layer often unmapped, fragmentary sequence data and to use base quality values to discern true allelic variations from sequencing errors. By taking advantage of the genomic sequence we are able to use simpler yet more accurate methods for sequence organization: fragment clustering, paralogue identification and multiple alignment. We analyse these sequences with a novel, Bayesian inference engine, POLYBAYES, to calculate the probability that a given site is polymorphic. Rigorous treatment of base quality permits completely automated evaluation of the full length of all sequences, without limitations on alignment depth. We demonstrate this approach by accurate SNP predictions in human ESTs aligned to finished and working-draft quality genomic sequences, a data set representative of the typical challenges of sequence-based SNP discovery.
    41 schema:genre research_article
    42 schema:inLanguage en
    43 schema:isAccessibleForFree false
    44 schema:isPartOf N1fbc3511c31b4a8a8ae893f420e0e339
    45 N70939051755d47238d7a2a1cc95113bf
    46 sg:journal.1103138
    47 schema:name A general approach to single-nucleotide polymorphism discovery
    48 schema:pagination ng1299_452
    49 schema:productId N769c0fa5e7b84ff49f98a1b33e514d59
    50 N9d3c7d6d0af54b2bac1e6aa6ebf1a4b8
    51 Na8bbf3b5a2d845399ed164ecf3e01b11
    52 Nda09cd690f294540abe34273b92692ff
    53 Nf38ed0b30542457685f5e57c43a3125e
    54 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011138564
    55 https://doi.org/10.1038/70570
    56 schema:sdDatePublished 2019-04-11T12:26
    57 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    58 schema:sdPublisher N3bcb6d5676b848199a004da4af2a8755
    59 schema:url http://www.nature.com/articles/ng1299_452
    60 sgo:license sg:explorer/license/
    61 sgo:sdDataset articles
    62 rdf:type schema:ScholarlyArticle
    63 N12e95cb7eb4144ea8fa432cf4eea3504 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    64 schema:name Sequence Alignment
    65 rdf:type schema:DefinedTerm
    66 N1b2ca6316d4f4b7098edf43d259a7378 rdf:first sg:person.01063041236.78
    67 rdf:rest Nc4dad5624ca04291b498d0e5e9d7165c
    68 N1c21e178aee04bf2b4d4cdb3a0e5fdee schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    69 schema:name Data Interpretation, Statistical
    70 rdf:type schema:DefinedTerm
    71 N1fbc3511c31b4a8a8ae893f420e0e339 schema:issueNumber 4
    72 rdf:type schema:PublicationIssue
    73 N2e336f357ceb458d91b975a1733f1946 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    74 schema:name Bayes Theorem
    75 rdf:type schema:DefinedTerm
    76 N3bcb6d5676b848199a004da4af2a8755 schema:name Springer Nature - SN SciGraph project
    77 rdf:type schema:Organization
    78 N3d311dc97f224f3ebb7d9a5357b65fca schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    79 schema:name Polymorphism, Single Nucleotide
    80 rdf:type schema:DefinedTerm
    81 N438dedae7ce34c27bde49b2e4914f3a3 rdf:first sg:person.0617524562.14
    82 rdf:rest N1b2ca6316d4f4b7098edf43d259a7378
    83 N590491b7ff6149a79197e166f8541dc6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    84 schema:name Genome, Human
    85 rdf:type schema:DefinedTerm
    86 N5c0eef317315451284a40ae11011cb15 rdf:first sg:person.01344711107.89
    87 rdf:rest rdf:nil
    88 N5c62143c5e6d45879ddcf64c0c48e47e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    89 schema:name Expressed Sequence Tags
    90 rdf:type schema:DefinedTerm
    91 N6163a894393445099e9435e8947ad339 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    92 schema:name Humans
    93 rdf:type schema:DefinedTerm
    94 N64fda45ae4394ce89ed781bf778d2551 rdf:first sg:person.0724166244.21
    95 rdf:rest Neba51b049c8146c7b21171d90549fa33
    96 N650616b2a87f437fb4a2abaf88aa1285 rdf:first sg:person.01221224024.09
    97 rdf:rest N5c0eef317315451284a40ae11011cb15
    98 N70939051755d47238d7a2a1cc95113bf schema:volumeNumber 23
    99 rdf:type schema:PublicationVolume
    100 N769c0fa5e7b84ff49f98a1b33e514d59 schema:name dimensions_id
    101 schema:value pub.1011138564
    102 rdf:type schema:PropertyValue
    103 N7a4be45a077a429e87476a5f858648b7 rdf:first sg:person.01304003176.38
    104 rdf:rest N64fda45ae4394ce89ed781bf778d2551
    105 N892212934bff43c786c5d5b7c59f10ef schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    106 schema:name Genetic Techniques
    107 rdf:type schema:DefinedTerm
    108 N97f0aa15bcc444d9ba50a6b8da2e2de1 rdf:first sg:person.01121405025.22
    109 rdf:rest N7a4be45a077a429e87476a5f858648b7
    110 N9bd768c8f5724e329eece4972db62e8e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    111 schema:name Alleles
    112 rdf:type schema:DefinedTerm
    113 N9cca7355a66e4020b1afebbca740d6fc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    114 schema:name Genetic Variation
    115 rdf:type schema:DefinedTerm
    116 N9d3c7d6d0af54b2bac1e6aa6ebf1a4b8 schema:name nlm_unique_id
    117 schema:value 9216904
    118 rdf:type schema:PropertyValue
    119 Na8bbf3b5a2d845399ed164ecf3e01b11 schema:name pubmed_id
    120 schema:value 10581034
    121 rdf:type schema:PropertyValue
    122 Nc4dad5624ca04291b498d0e5e9d7165c rdf:first sg:person.0604140657.40
    123 rdf:rest N97f0aa15bcc444d9ba50a6b8da2e2de1
    124 Nda09cd690f294540abe34273b92692ff schema:name doi
    125 schema:value 10.1038/70570
    126 rdf:type schema:PropertyValue
    127 Ne9914a7aa0924ada936c2a8cf2ea24f3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    128 schema:name Algorithms
    129 rdf:type schema:DefinedTerm
    130 Neba51b049c8146c7b21171d90549fa33 rdf:first sg:person.01065735320.09
    131 rdf:rest Nfd3285c3a4194981bc946c18091ca38e
    132 Nf38ed0b30542457685f5e57c43a3125e schema:name readcube_id
    133 schema:value 3511af174d52a28b4cb7e4fa1c976aba8f896fa900f219ab60e87966e6d9ff06
    134 rdf:type schema:PropertyValue
    135 Nfd3285c3a4194981bc946c18091ca38e rdf:first sg:person.01014313442.46
    136 rdf:rest N650616b2a87f437fb4a2abaf88aa1285
    137 Nfda74df2f2014dc1b4521985fbb10fea schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    138 schema:name Software
    139 rdf:type schema:DefinedTerm
    140 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
    141 schema:name Biological Sciences
    142 rdf:type schema:DefinedTerm
    143 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
    144 schema:name Genetics
    145 rdf:type schema:DefinedTerm
    146 sg:grant.2440598 http://pending.schema.org/fundedItem sg:pub.10.1038/70570
    147 rdf:type schema:MonetaryGrant
    148 sg:grant.2529016 http://pending.schema.org/fundedItem sg:pub.10.1038/70570
    149 rdf:type schema:MonetaryGrant
    150 sg:grant.2682319 http://pending.schema.org/fundedItem sg:pub.10.1038/70570
    151 rdf:type schema:MonetaryGrant
    152 sg:journal.1103138 schema:issn 1061-4036
    153 1546-1718
    154 schema:name Nature Genetics
    155 rdf:type schema:Periodical
    156 sg:person.01014313442.46 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    157 schema:familyName Hillier
    158 schema:givenName LaDeana
    159 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01014313442.46
    160 rdf:type schema:Person
    161 sg:person.01063041236.78 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    162 schema:familyName Korf
    163 schema:givenName Ian
    164 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01063041236.78
    165 rdf:type schema:Person
    166 sg:person.01065735320.09 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    167 schema:familyName Stitziel
    168 schema:givenName Nathan O.
    169 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065735320.09
    170 rdf:type schema:Person
    171 sg:person.01121405025.22 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    172 schema:familyName Yeh
    173 schema:givenName Raymond T.
    174 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01121405025.22
    175 rdf:type schema:Person
    176 sg:person.01221224024.09 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    177 schema:familyName Kwok
    178 schema:givenName Pui-Yan
    179 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01221224024.09
    180 rdf:type schema:Person
    181 sg:person.01304003176.38 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    182 schema:familyName Gu
    183 schema:givenName Zhijie
    184 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01304003176.38
    185 rdf:type schema:Person
    186 sg:person.01344711107.89 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    187 schema:familyName Gish
    188 schema:givenName Warren R.
    189 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344711107.89
    190 rdf:type schema:Person
    191 sg:person.0604140657.40 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    192 schema:familyName Yandell
    193 schema:givenName Mark D.
    194 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0604140657.40
    195 rdf:type schema:Person
    196 sg:person.0617524562.14 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    197 schema:familyName Marth
    198 schema:givenName Gabor T.
    199 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0617524562.14
    200 rdf:type schema:Person
    201 sg:person.0724166244.21 schema:affiliation https://www.grid.ac/institutes/grid.4367.6
    202 schema:familyName Zakeri
    203 schema:givenName Hamideh
    204 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0724166244.21
    205 rdf:type schema:Person
    206 sg:pub.10.1038/10290 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008973203
    207 https://doi.org/10.1038/10290
    208 rdf:type schema:CreativeWork
    209 sg:pub.10.1038/10297 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011881979
    210 https://doi.org/10.1038/10297
    211 rdf:type schema:CreativeWork
    212 sg:pub.10.1038/6851 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039024064
    213 https://doi.org/10.1038/6851
    214 rdf:type schema:CreativeWork
    215 sg:pub.10.1038/907 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044380493
    216 https://doi.org/10.1038/907
    217 rdf:type schema:CreativeWork
    218 sg:pub.10.1038/ng0893-373 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039932477
    219 https://doi.org/10.1038/ng0893-373
    220 rdf:type schema:CreativeWork
    221 https://app.dimensions.ai/details/publication/pub.1074251583 rdf:type schema:CreativeWork
    222 https://doi.org/10.1006/geno.1994.1469 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008038684
    223 rdf:type schema:CreativeWork
    224 https://doi.org/10.1006/geno.1997.5042 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008775422
    225 rdf:type schema:CreativeWork
    226 https://doi.org/10.1098/rstl.1763.0053 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010033334
    227 rdf:type schema:CreativeWork
    228 https://doi.org/10.1101/gr.6.11.1118 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025457008
    229 rdf:type schema:CreativeWork
    230 https://doi.org/10.1101/gr.6.9.807 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000850162
    231 rdf:type schema:CreativeWork
    232 https://doi.org/10.1101/gr.6.9.829 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051679296
    233 rdf:type schema:CreativeWork
    234 https://doi.org/10.1101/gr.8.3.161 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031785094
    235 rdf:type schema:CreativeWork
    236 https://doi.org/10.1101/gr.8.3.175 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048253030
    237 rdf:type schema:CreativeWork
    238 https://doi.org/10.1101/gr.8.3.186 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038920266
    239 rdf:type schema:CreativeWork
    240 https://doi.org/10.1101/gr.8.3.195 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018564763
    241 rdf:type schema:CreativeWork
    242 https://doi.org/10.1101/gr.8.7.748 schema:sameAs https://app.dimensions.ai/details/publication/pub.1083298515
    243 rdf:type schema:CreativeWork
    244 https://doi.org/10.1126/science.270.5244.1945 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062551849
    245 rdf:type schema:CreativeWork
    246 https://doi.org/10.1126/science.278.5343.1580 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062558819
    247 rdf:type schema:CreativeWork
    248 https://doi.org/10.1126/science.280.5366.1077 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062561068
    249 rdf:type schema:CreativeWork
    250 https://doi.org/10.1126/science.280.5369.1540 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062561324
    251 rdf:type schema:CreativeWork
    252 https://doi.org/10.1126/science.282.5389.682 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027647963
    253 rdf:type schema:CreativeWork
    254 https://www.grid.ac/institutes/grid.4367.6 schema:alternateName Washington University in St. Louis
    255 schema:name Washington University Department of Genetics and Genome Sequencing Center, St. Louis, Missouri, USA.
    256 Washington University Division of Dermatology, St. Louis, Missouri, USA.
    257 rdf:type schema:Organization
     



