Vector Seeds: An Extension to Spaced Seeds Allows Substantial Improvements in Sensitivity and Specificity


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2003

AUTHORS

Broňa Brejová , Daniel G. Brown , Tomáš Vinař

ABSTRACT

We present improved techniques for finding homologous regions in DNA and protein sequences. Our approach focuses on the core region of a local pairwise alignment; we suggest new ways to characterize these regions that allow marked improvements in both specificity and sensitivity over existing techniques for sequence alignment. For any such characterization, which we call a vector seed, we give an efficient algorithm that estimates the specificity and sensitivity of that seed under reasonable probabilistic models of sequence. We also characterize the probability of a match when an alignment is required to have multiple hits before it is detected. Our extensions fit well with existing approaches to sequence alignment, while still offering substantial improvement in runtime and sensitivity, particularly for the important problem of identifying matches between homologous coding DNA sequences.

PAGES

39-54

Book

TITLE

Algorithms in Bioinformatics

ISBN

978-3-540-20076-5
978-3-540-39763-2

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-39763-2_4

DOI

http://dx.doi.org/10.1007/978-3-540-39763-2_4

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1048701396


JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Waterloo", 
          "id": "https://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Brejov\u00e1", 
        "givenName": "Bro\u0148a", 
        "id": "sg:person.0642141060.90", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642141060.90"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Waterloo", 
          "id": "https://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Brown", 
        "givenName": "Daniel G.", 
        "id": "sg:person.0642727740.54", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Waterloo", 
          "id": "https://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vina\u0159", 
        "givenName": "Tom\u00e1\u0161", 
        "id": "sg:person.01041305251.67", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01041305251.67"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1093/nar/28.1.45", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004742321"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/18.3.440", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006017712"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.229202", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006260064"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.89.22.10915", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010234644"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0022-2836(05)80360-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013618994"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/640075.640083", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018184175"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/25.17.3389", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047265454"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-44888-8_4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047397326", 
          "https://doi.org/10.1007/3-540-44888-8_4"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/17.suppl_1.s140", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051444796"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2003", 
    "datePublishedReg": "2003-01-01", 
    "description": "We present improved techniques for finding homologous regions in DNA and protein sequences. Our approach focuses on the core region of a local pairwise alignment; we suggest new ways to characterize these regions that allow marked improvements in both specificity and sensitivity over existing techniques for sequence alignment. For any such characterization, which we call a vector seed, we give an efficient algorithm that estimates the specificity and sensitivity of that seed under reasonable probabilistic models of sequence. We also characterize the probability of a match when an alignment is required to have multiple hits before it is detected. Our extensions fit well with existing approaches to sequence alignment, while still offering substantial improvement in runtime and sensitivity, particularly for the important problem of identifying matches between homologous coding DNA sequences.", 
    "editor": [
      {
        "familyName": "Benson", 
        "givenName": "Gary", 
        "type": "Person"
      }, 
      {
        "familyName": "Page", 
        "givenName": "Roderic D. M.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-39763-2_4", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-20076-5", 
        "978-3-540-39763-2"
      ], 
      "name": "Algorithms in Bioinformatics", 
      "type": "Book"
    }, 
    "name": "Vector Seeds: An Extension to Spaced Seeds Allows Substantial Improvements in Sensitivity and Specificity", 
    "pagination": "39-54", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1048701396"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-39763-2_4"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "e800ec4e414f26b62c8895ba4e882ffcc7c2427ff651e52368efc98cce9bf0e6"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-39763-2_4", 
      "https://app.dimensions.ai/details/publication/pub.1048701396"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T08:05", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000359_0000000359/records_29219_00000002.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F978-3-540-39763-2_4"
  }
]
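
As a rough illustration of how a record like the one above might be consumed, the following Python sketch reads a trimmed copy of the JSON-LD and pulls out the author names and the DOIs of the cited works. The helper names `author_names` and `cited_dois` are hypothetical, and the sketch assumes that a citation's DOI appears either in its `id` or in its `sameAs` list, as it does in this record.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept,
# and only two of the nine citations are included.
RECORD_JSON = r"""
[
  {
    "id": "sg:pub.10.1007/978-3-540-39763-2_4",
    "pagination": "39-54",
    "author": [
      {"familyName": "Brejov\u00e1", "givenName": "Bro\u0148a"},
      {"familyName": "Brown", "givenName": "Daniel G."},
      {"familyName": "Vina\u0159", "givenName": "Tom\u00e1\u0161"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1093/nar/28.1.45"},
      {"id": "sg:pub.10.1007/3-540-44888-8_4",
       "sameAs": [
         "https://app.dimensions.ai/details/publication/pub.1047397326",
         "https://doi.org/10.1007/3-540-44888-8_4"
       ]}
    ]
  }
]
"""

DOI_PREFIX = "https://doi.org/"


def author_names(record):
    """Return 'given family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def cited_dois(record):
    """Return one bare DOI per citation, looking in 'id' first, then 'sameAs'."""
    dois = []
    for cite in record.get("citation", []):
        candidates = [cite.get("id", "")] + list(cite.get("sameAs", []))
        for uri in candidates:
            if uri.startswith(DOI_PREFIX):
                dois.append(uri[len(DOI_PREFIX):])
                break  # keep at most one DOI per cited work
    return dois


# The top-level JSON value is a list holding a single record.
record = json.loads(RECORD_JSON)[0]
print(author_names(record))
print(cited_dois(record))
```

Because JSON-LD is plain JSON, the standard `json` module is enough here; a dedicated JSON-LD processor would only be needed to expand terms against the `@context`.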
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-39763-2_4'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-39763-2_4'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-39763-2_4'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-39763-2_4'
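
The four curl commands above differ only in their `Accept` header. A minimal Python sketch of the same content negotiation, using only the standard library, might look like the following. The names `ACCEPT_TYPES`, `build_request`, and `fetch` are hypothetical, and the sketch assumes the endpoint still responds the way the curl examples imply.

```python
import urllib.request

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-540-39763-2_4"

# Short format labels (hypothetical) mapped to the media types used above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request whose Accept header selects the given format."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch("turtle")` would then play the role of the third curl command. Keeping `build_request` separate from `fetch` lets the header logic be checked without any network traffic.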


 

This table lists all metadata directly associated with this object, expressed as RDF triples.

112 TRIPLES      23 PREDICATES      36 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-39763-2_4 schema:about anzsrc-for:06
2 anzsrc-for:0604
3 schema:author Nd08ff1eed75b4a54b1fd0e71bf4c22d9
4 schema:citation sg:pub.10.1007/3-540-44888-8_4
5 https://doi.org/10.1016/s0022-2836(05)80360-2
6 https://doi.org/10.1073/pnas.89.22.10915
7 https://doi.org/10.1093/bioinformatics/17.suppl_1.s140
8 https://doi.org/10.1093/bioinformatics/18.3.440
9 https://doi.org/10.1093/nar/25.17.3389
10 https://doi.org/10.1093/nar/28.1.45
11 https://doi.org/10.1101/gr.229202
12 https://doi.org/10.1145/640075.640083
13 schema:datePublished 2003
14 schema:datePublishedReg 2003-01-01
15 schema:description We present improved techniques for finding homologous regions in DNA and protein sequences. Our approach focuses on the core region of a local pairwise alignment; we suggest new ways to characterize these regions that allow marked improvements in both specificity and sensitivity over existing techniques for sequence alignment. For any such characterization, which we call a vector seed, we give an efficient algorithm that estimates the specificity and sensitivity of that seed under reasonable probabilistic models of sequence. We also characterize the probability of a match when an alignment is required to have multiple hits before it is detected. Our extensions fit well with existing approaches to sequence alignment, while still offering substantial improvement in runtime and sensitivity, particularly for the important problem of identifying matches between homologous coding DNA sequences.
16 schema:editor Nc4c3437b6e8040ef91cd45625188a44d
17 schema:genre chapter
18 schema:inLanguage en
19 schema:isAccessibleForFree true
20 schema:isPartOf N07154f88d0a4469e996e6070a7cf069b
21 schema:name Vector Seeds: An Extension to Spaced Seeds Allows Substantial Improvements in Sensitivity and Specificity
22 schema:pagination 39-54
23 schema:productId N35b853886e4c4df0bb2aa9b3ae9c755a
24 Nc2404981e55a47a0beb65c2fccb3c2dd
25 Nd91849b7ca4549b49fccdaf510d7af2e
26 schema:publisher N3cfd20d7016040939d694460b3b65c21
27 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048701396
28 https://doi.org/10.1007/978-3-540-39763-2_4
29 schema:sdDatePublished 2019-04-16T08:05
30 schema:sdLicense https://scigraph.springernature.com/explorer/license/
31 schema:sdPublisher N1ad9f0dd660c4ddab697d1c45f66625e
32 schema:url https://link.springer.com/10.1007%2F978-3-540-39763-2_4
33 sgo:license sg:explorer/license/
34 sgo:sdDataset chapters
35 rdf:type schema:Chapter
36 N07154f88d0a4469e996e6070a7cf069b schema:isbn 978-3-540-20076-5
37 978-3-540-39763-2
38 schema:name Algorithms in Bioinformatics
39 rdf:type schema:Book
40 N1ad9f0dd660c4ddab697d1c45f66625e schema:name Springer Nature - SN SciGraph project
41 rdf:type schema:Organization
42 N35b853886e4c4df0bb2aa9b3ae9c755a schema:name readcube_id
43 schema:value e800ec4e414f26b62c8895ba4e882ffcc7c2427ff651e52368efc98cce9bf0e6
44 rdf:type schema:PropertyValue
45 N3cfd20d7016040939d694460b3b65c21 schema:location Berlin, Heidelberg
46 schema:name Springer Berlin Heidelberg
47 rdf:type schema:Organisation
48 N48fbf3974742401b8ea7d872bbf4cc2b rdf:first sg:person.0642727740.54
49 rdf:rest N7d4d1ae16b2441468e90fe0ee887eeaf
50 N7d4d1ae16b2441468e90fe0ee887eeaf rdf:first sg:person.01041305251.67
51 rdf:rest rdf:nil
52 N820f3c604aca450b8f68860359403956 schema:familyName Page
53 schema:givenName Roderic D. M.
54 rdf:type schema:Person
55 Nb5fa3ce43b534ac4aaacdb1f95f2cefe rdf:first N820f3c604aca450b8f68860359403956
56 rdf:rest rdf:nil
57 Nc2404981e55a47a0beb65c2fccb3c2dd schema:name dimensions_id
58 schema:value pub.1048701396
59 rdf:type schema:PropertyValue
60 Nc4c3437b6e8040ef91cd45625188a44d rdf:first Nc931d6471ba742ae989c823714b88335
61 rdf:rest Nb5fa3ce43b534ac4aaacdb1f95f2cefe
62 Nc931d6471ba742ae989c823714b88335 schema:familyName Benson
63 schema:givenName Gary
64 rdf:type schema:Person
65 Nd08ff1eed75b4a54b1fd0e71bf4c22d9 rdf:first sg:person.0642141060.90
66 rdf:rest N48fbf3974742401b8ea7d872bbf4cc2b
67 Nd91849b7ca4549b49fccdaf510d7af2e schema:name doi
68 schema:value 10.1007/978-3-540-39763-2_4
69 rdf:type schema:PropertyValue
70 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
71 schema:name Biological Sciences
72 rdf:type schema:DefinedTerm
73 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
74 schema:name Genetics
75 rdf:type schema:DefinedTerm
76 sg:person.01041305251.67 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
77 schema:familyName Vinař
78 schema:givenName Tomáš
79 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01041305251.67
80 rdf:type schema:Person
81 sg:person.0642141060.90 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
82 schema:familyName Brejová
83 schema:givenName Broňa
84 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642141060.90
85 rdf:type schema:Person
86 sg:person.0642727740.54 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
87 schema:familyName Brown
88 schema:givenName Daniel G.
89 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
90 rdf:type schema:Person
91 sg:pub.10.1007/3-540-44888-8_4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047397326
92 https://doi.org/10.1007/3-540-44888-8_4
93 rdf:type schema:CreativeWork
94 https://doi.org/10.1016/s0022-2836(05)80360-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013618994
95 rdf:type schema:CreativeWork
96 https://doi.org/10.1073/pnas.89.22.10915 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010234644
97 rdf:type schema:CreativeWork
98 https://doi.org/10.1093/bioinformatics/17.suppl_1.s140 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051444796
99 rdf:type schema:CreativeWork
100 https://doi.org/10.1093/bioinformatics/18.3.440 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006017712
101 rdf:type schema:CreativeWork
102 https://doi.org/10.1093/nar/25.17.3389 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047265454
103 rdf:type schema:CreativeWork
104 https://doi.org/10.1093/nar/28.1.45 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004742321
105 rdf:type schema:CreativeWork
106 https://doi.org/10.1101/gr.229202 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006260064
107 rdf:type schema:CreativeWork
108 https://doi.org/10.1145/640075.640083 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018184175
109 rdf:type schema:CreativeWork
110 https://www.grid.ac/institutes/grid.46078.3d schema:alternateName University of Waterloo
111 schema:name School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
112 rdf:type schema:Organization
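
In the table above, author order is not stored as a plain array but as an RDF list: `schema:author` points at a blank node, and each blank node carries an `rdf:first` item and an `rdf:rest` pointer that ends at `rdf:nil`. The Python sketch below walks that chain for the three author nodes. It assumes that rows with an empty subject column inherit the subject of the row before them; `walk_rdf_list` is a hypothetical helper, not part of any SciGraph tooling.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from the rdf:first / rdf:rest rows above.
TRIPLES = {
    ("Nd08ff1eed75b4a54b1fd0e71bf4c22d9", "rdf:first"): "sg:person.0642141060.90",
    ("Nd08ff1eed75b4a54b1fd0e71bf4c22d9", "rdf:rest"): "N48fbf3974742401b8ea7d872bbf4cc2b",
    ("N48fbf3974742401b8ea7d872bbf4cc2b", "rdf:first"): "sg:person.0642727740.54",
    ("N48fbf3974742401b8ea7d872bbf4cc2b", "rdf:rest"): "N7d4d1ae16b2441468e90fe0ee887eeaf",
    ("N7d4d1ae16b2441468e90fe0ee887eeaf", "rdf:first"): "sg:person.01041305251.67",
    ("N7d4d1ae16b2441468e90fe0ee887eeaf", "rdf:rest"): RDF_NIL,
}

# schema:familyName values for the three person nodes, also from the table.
FAMILY_NAMES = {
    "sg:person.0642141060.90": "Brejová",
    "sg:person.0642727740.54": "Brown",
    "sg:person.01041305251.67": "Vinař",
}


def walk_rdf_list(head, triples):
    """Follow rdf:rest from `head` until rdf:nil, collecting each rdf:first item."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle in RDF list at {node}")
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


# The object of schema:author in row 3 is the head of the list.
AUTHOR_LIST_HEAD = "Nd08ff1eed75b4a54b1fd0e71bf4c22d9"
ordered_authors = walk_rdf_list(AUTHOR_LIST_HEAD, TRIPLES)
print([FAMILY_NAMES[person] for person in ordered_authors])
```

The same walk applies to the `schema:editor` list, whose head is the blank node given in row 16.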
 



