New Algorithms for Multiple DNA Sequence Alignment


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2004

AUTHORS

Daniel G. Brown , Alexander K. Hudek

ABSTRACT

We present a mathematical framework for anchoring in global multiple alignment. Our framework uses anchors that are hits to spaced seeds and identifies anchors progressively, using a phylogenetic tree. We compute anchors in the tree starting at the root and going to the leaves, and from the leaves going up. In both cases, we compute thresholds for anchors to minimize errors. One innovative aspect of our approach is the approximate inference of ancestral sequences with accommodation for ambiguity. This, combined with proper scoring techniques and seeding, lets us pick many anchors in homologous positions as we align up a phylogenetic tree, minimizing total work. Our algorithm is reasonably successful in simulations, is comparable to existing software in terms of accuracy, and is substantially more efficient.
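
The abstract builds its anchors from hits to spaced seeds. As a rough illustration of that one idea only (not the authors' implementation, and without their thresholds, ancestral inference, or phylogenetic tree), the sketch below reports position pairs where two sequences agree at every `1` position of a seed pattern while ignoring the `0` positions. The seed `1101`, the function names, and the toy sequences are assumptions chosen for illustration.

```python
from collections import defaultdict


def masked_key(seq, start, seed):
    """Return the characters of seq under the '1' positions of seed, from start."""
    return "".join(seq[start + k] for k, bit in enumerate(seed) if bit == "1")


def spaced_seed_hits(s, t, seed="1101"):
    """Return sorted (i, j) pairs where s[i:] and t[j:] agree at every '1' of seed."""
    span = len(seed)
    # Index every window of s by its masked key.
    index = defaultdict(list)
    for i in range(len(s) - span + 1):
        index[masked_key(s, i, seed)].append(i)
    # Look up every window of t in that index.
    hits = []
    for j in range(len(t) - span + 1):
        for i in index.get(masked_key(t, j, seed), []):
            hits.append((i, j))
    return sorted(hits)


# The two toy sequences differ at index 2, which the seed 1101 does not inspect.
hits = spaced_seed_hits("ACGTAC", "ACTTAC")
print(hits)
```

With the contiguous seed `1111` the same pair of sequences yields no hit, which is the usual motivation for leaving some positions unchecked.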

PAGES

314-325

References to SciGraph publications

  • 2003. Alignment between Two Multiple Alignments in COMBINATORIAL PATTERN MATCHING
  • 2003-12. Fast and sensitive multiple alignment of large genomic sequences in BMC BIOINFORMATICS
  • 1998. Aligning alignments in COMBINATORIAL PATTERN MATCHING
  • 2003. Vector Seeds: An Extension to Spaced Seeds Allows Substantial Improvements in Sensitivity and Specificity in ALGORITHMS IN BIOINFORMATICS
  • 2004. Multiple Vector Seeds for Protein Alignment in ALGORITHMS IN BIOINFORMATICS
  • Book

    TITLE

    Algorithms in Bioinformatics

    ISBN

    978-3-540-23018-2
    978-3-540-30219-3

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27

    DOI

    http://dx.doi.org/10.1007/978-3-540-30219-3_27

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1053411057


    JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0102", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Applied Mathematics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Brown", 
            "givenName": "Daniel G.", 
            "id": "sg:person.0642727740.54", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Hudek", 
            "givenName": "Alexander K.", 
            "id": "sg:person.0702641766.21", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702641766.21"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1073/pnas.93.22.12098", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000622219"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-30219-3_15", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004146474", 
              "https://doi.org/10.1007/978-3-540-30219-3_15"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/18.3.440", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006017712"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-4-66", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008117919", 
              "https://doi.org/10.1186/1471-2105-4-66"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/146637.146656", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009338888"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/3-540-44888-8_19", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011323716", 
              "https://doi.org/10.1007/3-540-44888-8_19"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.1960404", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017077573"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/18.suppl_1.s312", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1023369433"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0166-218x(03)00382-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1023652568"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.1933104", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028112195"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bfb0030790", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035409083", 
              "https://doi.org/10.1007/bfb0030790"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/22.22.4673", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042438223"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.926603", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043392725"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-39763-2_4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1048701396", 
              "https://doi.org/10.1007/978-3-540-39763-2_4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/106652703322756096", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059204983"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1137/0148063", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062840630"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/s0219720004000326", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1063004526"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/s0219720004000661", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1063004556"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.2307/2412116", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1069920601"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/csb.2002.1039337", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1077040258"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2004", 
        "datePublishedReg": "2004-01-01", 
        "description": "We present a mathematical framework for anchoring in global multiple alignment. Our framework uses anchors that are hits to spaced seeds and identifies anchors progressively, using a phylogenetic tree. We compute anchors in the tree starting at the root and going to the leaves, and from the leaves going up. In both cases, we compute thresholds for anchors to minimize errors. One innovative aspect of our approach is the approximate inference of ancestral sequences with accommodation for ambiguity. This, combined with proper scoring techniques and seeding, lets us pick many anchors in homologous positions as we align up a phylogenetic tree, minimizing total work. Our algorithm is reasonably successful in simulations, is comparable to existing software in terms of accuracy and substantially more efficient.", 
        "editor": [
          {
            "familyName": "Jonassen", 
            "givenName": "Inge", 
            "type": "Person"
          }, 
          {
            "familyName": "Kim", 
            "givenName": "Junhyong", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-540-30219-3_27", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-540-23018-2", 
            "978-3-540-30219-3"
          ], 
          "name": "Algorithms in Bioinformatics", 
          "type": "Book"
        }, 
        "name": "New Algorithms for Multiple DNA Sequence Alignment", 
        "pagination": "314-325", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1053411057"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-540-30219-3_27"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "fe672aaed2703beb6820e8f29bad1c0c34d0e569b190f892302bab2fd08dec85"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-540-30219-3_27", 
          "https://app.dimensions.ai/details/publication/pub.1053411057"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T08:26", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000363_0000000363/records_70053_00000002.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-540-30219-3_27"
      }
    ]
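
Because the record is plain JSON, the standard library is enough to read it. The sketch below is a minimal example, assuming a trimmed copy of the record that keeps only the fields it reads; the function name `summarize` and the shape of the returned dictionary are illustrative choices, not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
record_text = """
[
  {
    "name": "New Algorithms for Multiple DNA Sequence Alignment",
    "pagination": "314-325",
    "author": [
      {"familyName": "Brown", "givenName": "Daniel G."},
      {"familyName": "Hudek", "givenName": "Alexander K."}
    ],
    "productId": [
      {"name": "dimensions_id", "value": ["pub.1053411057"]},
      {"name": "doi", "value": ["10.1007/978-3-540-30219-3_27"]}
    ]
  }
]
"""


def summarize(text):
    """Pull title, authors, DOI, and pages out of a SciGraph-style JSON-LD array."""
    record = json.loads(text)[0]  # the document is an array holding one record
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record["pagination"],
    }


summary = summarize(record_text)
print(summary)
```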
     

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'
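
The four commands differ only in the `Accept` header, so the same content negotiation can be sketched from a script. The example below uses Python's standard `urllib.request` and only builds the request; it does not send it, and whether the endpoint still responds is not verified here. The short format labels (`json-ld`, `n-triples`, `turtle`, `rdf-xml`) and the helper name `build_request` are assumptions for illustration.

```python
from urllib.request import Request

BASE = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the dictionary keys are
# hypothetical short labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build (but do not send) a request for one record in the given format."""
    return Request(BASE + record_id, headers={"Accept": ACCEPT[fmt]})


req = build_request("pub.10.1007/978-3-540-30219-3_27", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual fetch.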


     

    This table displays all metadata directly associated to this object as RDF triples.

    142 TRIPLES      23 PREDICATES      47 URIs      20 LITERALS      8 BLANK NODES

    Row Subject Predicate Object (a row without a subject or predicate repeats the one from the row above)
    1 sg:pub.10.1007/978-3-540-30219-3_27 schema:about anzsrc-for:01
    2 anzsrc-for:0102
    3 schema:author N5ff3775fac48426884e942f91753d64b
    4 schema:citation sg:pub.10.1007/3-540-44888-8_19
    5 sg:pub.10.1007/978-3-540-30219-3_15
    6 sg:pub.10.1007/978-3-540-39763-2_4
    7 sg:pub.10.1007/bfb0030790
    8 sg:pub.10.1186/1471-2105-4-66
    9 https://doi.org/10.1016/s0166-218x(03)00382-2
    10 https://doi.org/10.1073/pnas.93.22.12098
    11 https://doi.org/10.1089/106652703322756096
    12 https://doi.org/10.1093/bioinformatics/18.3.440
    13 https://doi.org/10.1093/bioinformatics/18.suppl_1.s312
    14 https://doi.org/10.1093/nar/22.22.4673
    15 https://doi.org/10.1101/gr.1933104
    16 https://doi.org/10.1101/gr.1960404
    17 https://doi.org/10.1101/gr.926603
    18 https://doi.org/10.1109/csb.2002.1039337
    19 https://doi.org/10.1137/0148063
    20 https://doi.org/10.1142/s0219720004000326
    21 https://doi.org/10.1142/s0219720004000661
    22 https://doi.org/10.1145/146637.146656
    23 https://doi.org/10.2307/2412116
    24 schema:datePublished 2004
    25 schema:datePublishedReg 2004-01-01
    26 schema:description We present a mathematical framework for anchoring in global multiple alignment. Our framework uses anchors that are hits to spaced seeds and identifies anchors progressively, using a phylogenetic tree. We compute anchors in the tree starting at the root and going to the leaves, and from the leaves going up. In both cases, we compute thresholds for anchors to minimize errors. One innovative aspect of our approach is the approximate inference of ancestral sequences with accommodation for ambiguity. This, combined with proper scoring techniques and seeding, lets us pick many anchors in homologous positions as we align up a phylogenetic tree, minimizing total work. Our algorithm is reasonably successful in simulations, is comparable to existing software in terms of accuracy and substantially more efficient.
    27 schema:editor Nf293f25b31424a2cb648d14e9842b3cd
    28 schema:genre chapter
    29 schema:inLanguage en
    30 schema:isAccessibleForFree true
    31 schema:isPartOf N4751a4c7b04e4aefb2a3c4d156d786a5
    32 schema:name New Algorithms for Multiple DNA Sequence Alignment
    33 schema:pagination 314-325
    34 schema:productId N0395c42c5edf486b8c5750583bc30a92
    35 N38ead7d860924e768ea0b871ecdf4758
    36 N459c96eacef14b7886b6aaf5643c2487
    37 schema:publisher N636748d7833a493bb3866d29fb041acf
    38 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053411057
    39 https://doi.org/10.1007/978-3-540-30219-3_27
    40 schema:sdDatePublished 2019-04-16T08:26
    41 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    42 schema:sdPublisher N82716638d64144ca8b2c5fd2810ae46a
    43 schema:url https://link.springer.com/10.1007%2F978-3-540-30219-3_27
    44 sgo:license sg:explorer/license/
    45 sgo:sdDataset chapters
    46 rdf:type schema:Chapter
    47 N0395c42c5edf486b8c5750583bc30a92 schema:name readcube_id
    48 schema:value fe672aaed2703beb6820e8f29bad1c0c34d0e569b190f892302bab2fd08dec85
    49 rdf:type schema:PropertyValue
    50 N38ead7d860924e768ea0b871ecdf4758 schema:name dimensions_id
    51 schema:value pub.1053411057
    52 rdf:type schema:PropertyValue
    53 N459c96eacef14b7886b6aaf5643c2487 schema:name doi
    54 schema:value 10.1007/978-3-540-30219-3_27
    55 rdf:type schema:PropertyValue
    56 N4751a4c7b04e4aefb2a3c4d156d786a5 schema:isbn 978-3-540-23018-2
    57 978-3-540-30219-3
    58 schema:name Algorithms in Bioinformatics
    59 rdf:type schema:Book
    60 N5ff3775fac48426884e942f91753d64b rdf:first sg:person.0642727740.54
    61 rdf:rest Nb3ac973666604b56ae0cce9bca8c4c13
    62 N636748d7833a493bb3866d29fb041acf schema:location Berlin, Heidelberg
    63 schema:name Springer Berlin Heidelberg
    64 rdf:type schema:Organisation
    65 N7e53a74f90864084bf70c92b825c6a19 rdf:first N95951364730b4194835f5ddaee8b8628
    66 rdf:rest rdf:nil
    67 N8179ce032581434eb30d2cd3f1742864 schema:familyName Jonassen
    68 schema:givenName Inge
    69 rdf:type schema:Person
    70 N82716638d64144ca8b2c5fd2810ae46a schema:name Springer Nature - SN SciGraph project
    71 rdf:type schema:Organization
    72 N95951364730b4194835f5ddaee8b8628 schema:familyName Kim
    73 schema:givenName Junhyong
    74 rdf:type schema:Person
    75 Nb3ac973666604b56ae0cce9bca8c4c13 rdf:first sg:person.0702641766.21
    76 rdf:rest rdf:nil
    77 Nf293f25b31424a2cb648d14e9842b3cd rdf:first N8179ce032581434eb30d2cd3f1742864
    78 rdf:rest N7e53a74f90864084bf70c92b825c6a19
    79 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Mathematical Sciences
    81 rdf:type schema:DefinedTerm
    82 anzsrc-for:0102 schema:inDefinedTermSet anzsrc-for:
    83 schema:name Applied Mathematics
    84 rdf:type schema:DefinedTerm
    85 sg:person.0642727740.54 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    86 schema:familyName Brown
    87 schema:givenName Daniel G.
    88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
    89 rdf:type schema:Person
    90 sg:person.0702641766.21 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    91 schema:familyName Hudek
    92 schema:givenName Alexander K.
    93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702641766.21
    94 rdf:type schema:Person
    95 sg:pub.10.1007/3-540-44888-8_19 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011323716
    96 https://doi.org/10.1007/3-540-44888-8_19
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1007/978-3-540-30219-3_15 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004146474
    99 https://doi.org/10.1007/978-3-540-30219-3_15
    100 rdf:type schema:CreativeWork
    101 sg:pub.10.1007/978-3-540-39763-2_4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048701396
    102 https://doi.org/10.1007/978-3-540-39763-2_4
    103 rdf:type schema:CreativeWork
    104 sg:pub.10.1007/bfb0030790 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035409083
    105 https://doi.org/10.1007/bfb0030790
    106 rdf:type schema:CreativeWork
    107 sg:pub.10.1186/1471-2105-4-66 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008117919
    108 https://doi.org/10.1186/1471-2105-4-66
    109 rdf:type schema:CreativeWork
    110 https://doi.org/10.1016/s0166-218x(03)00382-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023652568
    111 rdf:type schema:CreativeWork
    112 https://doi.org/10.1073/pnas.93.22.12098 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000622219
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1089/106652703322756096 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059204983
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1093/bioinformatics/18.3.440 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006017712
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1093/bioinformatics/18.suppl_1.s312 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023369433
    119 rdf:type schema:CreativeWork
    120 https://doi.org/10.1093/nar/22.22.4673 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042438223
    121 rdf:type schema:CreativeWork
    122 https://doi.org/10.1101/gr.1933104 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028112195
    123 rdf:type schema:CreativeWork
    124 https://doi.org/10.1101/gr.1960404 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017077573
    125 rdf:type schema:CreativeWork
    126 https://doi.org/10.1101/gr.926603 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043392725
    127 rdf:type schema:CreativeWork
    128 https://doi.org/10.1109/csb.2002.1039337 schema:sameAs https://app.dimensions.ai/details/publication/pub.1077040258
    129 rdf:type schema:CreativeWork
    130 https://doi.org/10.1137/0148063 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062840630
    131 rdf:type schema:CreativeWork
    132 https://doi.org/10.1142/s0219720004000326 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063004526
    133 rdf:type schema:CreativeWork
    134 https://doi.org/10.1142/s0219720004000661 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063004556
    135 rdf:type schema:CreativeWork
    136 https://doi.org/10.1145/146637.146656 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009338888
    137 rdf:type schema:CreativeWork
    138 https://doi.org/10.2307/2412116 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069920601
    139 rdf:type schema:CreativeWork
    140 https://www.grid.ac/institutes/grid.46078.3d schema:alternateName University of Waterloo
    141 schema:name School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
    142 rdf:type schema:Organization
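
In the table, the ordered author and editor lists are stored as RDF collections: each blank node carries an `rdf:first` item and an `rdf:rest` pointer, and the chain ends at `rdf:nil`. The sketch below walks the author list using the values from rows 3, 60-61, and 75-76; the dictionary layout and the function name `walk_list` are assumptions for illustration, not a SciGraph API.

```python
RDF_NIL = "rdf:nil"

# (rdf:first, rdf:rest) for each list node, copied from rows 60-61 and 75-76.
nodes = {
    "N5ff3775fac48426884e942f91753d64b": (
        "sg:person.0642727740.54",
        "Nb3ac973666604b56ae0cce9bca8c4c13",
    ),
    "Nb3ac973666604b56ae0cce9bca8c4c13": (
        "sg:person.0702641766.21",
        RDF_NIL,
    ),
}


def walk_list(head, nodes):
    """Follow rdf:rest pointers from head and collect each rdf:first item in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# Row 3 gives the head of the schema:author list.
authors = walk_list("N5ff3775fac48426884e942f91753d64b", nodes)
print(authors)
```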