New Algorithms for Multiple DNA Sequence Alignment


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2004

AUTHORS

Daniel G. Brown , Alexander K. Hudek

ABSTRACT

We present a mathematical framework for anchoring in global multiple alignment. Our framework uses anchors that are hits to spaced seeds and identifies anchors progressively, using a phylogenetic tree. We compute anchors in the tree starting at the root and going to the leaves, and from the leaves going up. In both cases, we compute thresholds for anchors to minimize errors. One innovative aspect of our approach is the approximate inference of ancestral sequences with accommodation for ambiguity. This, combined with proper scoring techniques and seeding, lets us pick many anchors in homologous positions as we align up a phylogenetic tree, minimizing total work. Our algorithm is reasonably successful in simulations, is comparable to existing software in terms of accuracy, and is substantially more efficient.
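To make the phrase "hits to spaced seeds" concrete, here is a minimal sketch of spaced-seed matching between two sequences. It is an illustration only and not the authors' method: the seed pattern `1101`, the function names, and the exact-match rule at the `1` positions are all assumptions made for this example, and the thresholds, scoring, and phylogenetic tree described in the abstract are omitted.

```python
from collections import defaultdict


def seed_key(seq, start, seed):
    """Return the characters of seq under the seed's '1' (care) positions."""
    return "".join(seq[start + i] for i, c in enumerate(seed) if c == "1")


def spaced_seed_hits(a, b, seed="1101"):
    """Return (i, j) pairs where a[i:] and b[j:] agree at every care position.

    Positions marked '0' in the seed are ignored, so a mismatch there
    still produces a hit. The seed pattern is hypothetical.
    """
    span = len(seed)
    # Index every window of b by the letters under the care positions.
    index = defaultdict(list)
    for j in range(len(b) - span + 1):
        index[seed_key(b, j, seed)].append(j)
    # Look up every window of a in that index.
    hits = []
    for i in range(len(a) - span + 1):
        for j in index.get(seed_key(a, i, seed), []):
            hits.append((i, j))
    return hits


# The two sequences differ at position 2, which the seed does not inspect.
hits = spaced_seed_hits("ACGTAC", "ACTTAC")
print(hits)
```

In an anchoring aligner, hits like these would be candidate anchors that are then filtered by a score threshold before alignment proceeds between them.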

PAGES

314-325

References to SciGraph publications

  • 2003. Alignment between Two Multiple Alignments in COMBINATORIAL PATTERN MATCHING
  • 2003-12. Fast and sensitive multiple alignment of large genomic sequences in BMC BIOINFORMATICS
  • 1998. Aligning alignments in COMBINATORIAL PATTERN MATCHING
  • 2003. Vector Seeds: An Extension to Spaced Seeds Allows Substantial Improvements in Sensitivity and Specificity in ALGORITHMS IN BIOINFORMATICS
  • 2004. Multiple Vector Seeds for Protein Alignment in ALGORITHMS IN BIOINFORMATICS
  • Book

    TITLE

    Algorithms in Bioinformatics

    ISBN

    978-3-540-23018-2
    978-3-540-30219-3

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27

    DOI

    http://dx.doi.org/10.1007/978-3-540-30219-3_27

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1053411057



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0102", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Applied Mathematics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Brown", 
            "givenName": "Daniel G.", 
            "id": "sg:person.0642727740.54", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Hudek", 
            "givenName": "Alexander K.", 
            "id": "sg:person.0702641766.21", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702641766.21"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1073/pnas.93.22.12098", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000622219"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-30219-3_15", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004146474", 
              "https://doi.org/10.1007/978-3-540-30219-3_15"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/18.3.440", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006017712"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-4-66", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008117919", 
              "https://doi.org/10.1186/1471-2105-4-66"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/146637.146656", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009338888"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/3-540-44888-8_19", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011323716", 
              "https://doi.org/10.1007/3-540-44888-8_19"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.1960404", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017077573"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/18.suppl_1.s312", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1023369433"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0166-218x(03)00382-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1023652568"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.1933104", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028112195"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bfb0030790", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035409083", 
              "https://doi.org/10.1007/bfb0030790"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/22.22.4673", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042438223"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.926603", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043392725"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-39763-2_4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1048701396", 
              "https://doi.org/10.1007/978-3-540-39763-2_4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/106652703322756096", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059204983"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1137/0148063", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062840630"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/s0219720004000326", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1063004526"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/s0219720004000661", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1063004556"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.2307/2412116", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1069920601"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/csb.2002.1039337", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1077040258"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2004", 
        "datePublishedReg": "2004-01-01", 
        "description": "We present a mathematical framework for anchoring in global multiple alignment. Our framework uses anchors that are hits to spaced seeds and identifies anchors progressively, using a phylogenetic tree. We compute anchors in the tree starting at the root and going to the leaves, and from the leaves going up. In both cases, we compute thresholds for anchors to minimize errors. One innovative aspect of our approach is the approximate inference of ancestral sequences with accommodation for ambiguity. This, combined with proper scoring techniques and seeding, lets us pick many anchors in homologous positions as we align up a phylogenetic tree, minimizing total work. Our algorithm is reasonably successful in simulations, is comparable to existing software in terms of accuracy, and is substantially more efficient.", 
        "editor": [
          {
            "familyName": "Jonassen", 
            "givenName": "Inge", 
            "type": "Person"
          }, 
          {
            "familyName": "Kim", 
            "givenName": "Junhyong", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-540-30219-3_27", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-540-23018-2", 
            "978-3-540-30219-3"
          ], 
          "name": "Algorithms in Bioinformatics", 
          "type": "Book"
        }, 
        "name": "New Algorithms for Multiple DNA Sequence Alignment", 
        "pagination": "314-325", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1053411057"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-540-30219-3_27"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "fe672aaed2703beb6820e8f29bad1c0c34d0e569b190f892302bab2fd08dec85"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-540-30219-3_27", 
          "https://app.dimensions.ai/details/publication/pub.1053411057"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T08:26", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000363_0000000363/records_70053_00000002.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-540-30219-3_27"
      }
    ]
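As a rough illustration of how a record like the one above can be consumed, the following sketch parses a trimmed copy of the JSON-LD with Python's standard `json` module and pulls out the title, authors, pages, and DOI. The embedded sample keeps only a few fields from the record, and the `summarize` helper is a hypothetical name introduced for this example.

```python
import json

# Trimmed copy of the record shown above; only the fields used here are kept.
record_text = """
[
  {
    "author": [
      {"familyName": "Brown", "givenName": "Daniel G."},
      {"familyName": "Hudek", "givenName": "Alexander K."}
    ],
    "name": "New Algorithms for Multiple DNA Sequence Alignment",
    "pagination": "314-325",
    "productId": [
      {"name": "dimensions_id", "value": ["pub.1053411057"]},
      {"name": "doi", "value": ["10.1007/978-3-540-30219-3_27"]}
    ]
  }
]
"""


def summarize(text):
    """Extract a few human-readable fields from a SciGraph-style JSON-LD array."""
    record = json.loads(text)[0]  # the payload is a one-element array
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "pages": record["pagination"],
        "doi": ids.get("doi"),
    }


summary = summarize(record_text)
print(summary)
```

Because JSON-LD is plain JSON, no linked-data library is needed for simple field access like this; a dedicated JSON-LD processor would only be required to expand terms against the `@context`.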
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27'
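The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The helper names `build_request` and `fetch` and the short format keys are assumptions made for this example; whether the endpoint still answers these requests has not been verified here.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-540-30219-3_27"

# Media types taken from the curl examples above; the keys are hypothetical labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking the server for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building a request does not touch the network; only fetch() does.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch("json-ld")` would correspond to the first curl command; the request construction is kept separate so it can be inspected without a network connection.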


     

    This table displays all metadata directly associated with this object as RDF triples.

    142 triples, 23 predicates, 47 URIs, 20 literals, 8 blank nodes

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-540-30219-3_27 schema:about anzsrc-for:01
    2 anzsrc-for:0102
    3 schema:author N93df581bcfc347b386031b7cdf740019
    4 schema:citation sg:pub.10.1007/3-540-44888-8_19
    5 sg:pub.10.1007/978-3-540-30219-3_15
    6 sg:pub.10.1007/978-3-540-39763-2_4
    7 sg:pub.10.1007/bfb0030790
    8 sg:pub.10.1186/1471-2105-4-66
    9 https://doi.org/10.1016/s0166-218x(03)00382-2
    10 https://doi.org/10.1073/pnas.93.22.12098
    11 https://doi.org/10.1089/106652703322756096
    12 https://doi.org/10.1093/bioinformatics/18.3.440
    13 https://doi.org/10.1093/bioinformatics/18.suppl_1.s312
    14 https://doi.org/10.1093/nar/22.22.4673
    15 https://doi.org/10.1101/gr.1933104
    16 https://doi.org/10.1101/gr.1960404
    17 https://doi.org/10.1101/gr.926603
    18 https://doi.org/10.1109/csb.2002.1039337
    19 https://doi.org/10.1137/0148063
    20 https://doi.org/10.1142/s0219720004000326
    21 https://doi.org/10.1142/s0219720004000661
    22 https://doi.org/10.1145/146637.146656
    23 https://doi.org/10.2307/2412116
    24 schema:datePublished 2004
    25 schema:datePublishedReg 2004-01-01
    26 schema:description We present a mathematical framework for anchoring in global multiple alignment. Our framework uses anchors that are hits to spaced seeds and identifies anchors progressively, using a phylogenetic tree. We compute anchors in the tree starting at the root and going to the leaves, and from the leaves going up. In both cases, we compute thresholds for anchors to minimize errors. One innovative aspect of our approach is the approximate inference of ancestral sequences with accommodation for ambiguity. This, combined with proper scoring techniques and seeding, lets us pick many anchors in homologous positions as we align up a phylogenetic tree, minimizing total work. Our algorithm is reasonably successful in simulations, is comparable to existing software in terms of accuracy, and is substantially more efficient.
    27 schema:editor N26ca684aa03a40ada99fec296e395327
    28 schema:genre chapter
    29 schema:inLanguage en
    30 schema:isAccessibleForFree true
    31 schema:isPartOf Nf4cd0808abe84c539f9c44bdc4e17d95
    32 schema:name New Algorithms for Multiple DNA Sequence Alignment
    33 schema:pagination 314-325
    34 schema:productId N4eaa293c11a640b88aaad730ceeb44b6
    35 N66073716411340cabc6226c860586d08
    36 N936d986f74e6438a8cf1d1135b9903dc
    37 schema:publisher N5475e07771cb4dfb84f2dc8e083f04dd
    38 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053411057
    39 https://doi.org/10.1007/978-3-540-30219-3_27
    40 schema:sdDatePublished 2019-04-16T08:26
    41 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    42 schema:sdPublisher N720baf8e22b44cae83c9a307f1575c9f
    43 schema:url https://link.springer.com/10.1007%2F978-3-540-30219-3_27
    44 sgo:license sg:explorer/license/
    45 sgo:sdDataset chapters
    46 rdf:type schema:Chapter
    47 N115d509ff14549f58b4672e49494302b schema:familyName Jonassen
    48 schema:givenName Inge
    49 rdf:type schema:Person
    50 N1602dfb48b3d4f32b42327c84260f9f2 schema:familyName Kim
    51 schema:givenName Junhyong
    52 rdf:type schema:Person
    53 N26ca684aa03a40ada99fec296e395327 rdf:first N115d509ff14549f58b4672e49494302b
    54 rdf:rest Nc15f124122bb4a82ba218fc49001ed90
    55 N2d4347e598fc40aca14eaf4ce9e250d0 rdf:first sg:person.0702641766.21
    56 rdf:rest rdf:nil
    57 N4eaa293c11a640b88aaad730ceeb44b6 schema:name readcube_id
    58 schema:value fe672aaed2703beb6820e8f29bad1c0c34d0e569b190f892302bab2fd08dec85
    59 rdf:type schema:PropertyValue
    60 N5475e07771cb4dfb84f2dc8e083f04dd schema:location Berlin, Heidelberg
    61 schema:name Springer Berlin Heidelberg
    62 rdf:type schema:Organisation
    63 N66073716411340cabc6226c860586d08 schema:name dimensions_id
    64 schema:value pub.1053411057
    65 rdf:type schema:PropertyValue
    66 N720baf8e22b44cae83c9a307f1575c9f schema:name Springer Nature - SN SciGraph project
    67 rdf:type schema:Organization
    68 N936d986f74e6438a8cf1d1135b9903dc schema:name doi
    69 schema:value 10.1007/978-3-540-30219-3_27
    70 rdf:type schema:PropertyValue
    71 N93df581bcfc347b386031b7cdf740019 rdf:first sg:person.0642727740.54
    72 rdf:rest N2d4347e598fc40aca14eaf4ce9e250d0
    73 Nc15f124122bb4a82ba218fc49001ed90 rdf:first N1602dfb48b3d4f32b42327c84260f9f2
    74 rdf:rest rdf:nil
    75 Nf4cd0808abe84c539f9c44bdc4e17d95 schema:isbn 978-3-540-23018-2
    76 978-3-540-30219-3
    77 schema:name Algorithms in Bioinformatics
    78 rdf:type schema:Book
    79 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Mathematical Sciences
    81 rdf:type schema:DefinedTerm
    82 anzsrc-for:0102 schema:inDefinedTermSet anzsrc-for:
    83 schema:name Applied Mathematics
    84 rdf:type schema:DefinedTerm
    85 sg:person.0642727740.54 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    86 schema:familyName Brown
    87 schema:givenName Daniel G.
    88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
    89 rdf:type schema:Person
    90 sg:person.0702641766.21 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    91 schema:familyName Hudek
    92 schema:givenName Alexander K.
    93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702641766.21
    94 rdf:type schema:Person
    95 sg:pub.10.1007/3-540-44888-8_19 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011323716
    96 https://doi.org/10.1007/3-540-44888-8_19
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1007/978-3-540-30219-3_15 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004146474
    99 https://doi.org/10.1007/978-3-540-30219-3_15
    100 rdf:type schema:CreativeWork
    101 sg:pub.10.1007/978-3-540-39763-2_4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048701396
    102 https://doi.org/10.1007/978-3-540-39763-2_4
    103 rdf:type schema:CreativeWork
    104 sg:pub.10.1007/bfb0030790 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035409083
    105 https://doi.org/10.1007/bfb0030790
    106 rdf:type schema:CreativeWork
    107 sg:pub.10.1186/1471-2105-4-66 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008117919
    108 https://doi.org/10.1186/1471-2105-4-66
    109 rdf:type schema:CreativeWork
    110 https://doi.org/10.1016/s0166-218x(03)00382-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023652568
    111 rdf:type schema:CreativeWork
    112 https://doi.org/10.1073/pnas.93.22.12098 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000622219
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1089/106652703322756096 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059204983
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1093/bioinformatics/18.3.440 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006017712
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1093/bioinformatics/18.suppl_1.s312 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023369433
    119 rdf:type schema:CreativeWork
    120 https://doi.org/10.1093/nar/22.22.4673 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042438223
    121 rdf:type schema:CreativeWork
    122 https://doi.org/10.1101/gr.1933104 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028112195
    123 rdf:type schema:CreativeWork
    124 https://doi.org/10.1101/gr.1960404 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017077573
    125 rdf:type schema:CreativeWork
    126 https://doi.org/10.1101/gr.926603 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043392725
    127 rdf:type schema:CreativeWork
    128 https://doi.org/10.1109/csb.2002.1039337 schema:sameAs https://app.dimensions.ai/details/publication/pub.1077040258
    129 rdf:type schema:CreativeWork
    130 https://doi.org/10.1137/0148063 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062840630
    131 rdf:type schema:CreativeWork
    132 https://doi.org/10.1142/s0219720004000326 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063004526
    133 rdf:type schema:CreativeWork
    134 https://doi.org/10.1142/s0219720004000661 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063004556
    135 rdf:type schema:CreativeWork
    136 https://doi.org/10.1145/146637.146656 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009338888
    137 rdf:type schema:CreativeWork
    138 https://doi.org/10.2307/2412116 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069920601
    139 rdf:type schema:CreativeWork
    140 https://www.grid.ac/institutes/grid.46078.3d schema:alternateName University of Waterloo
    141 schema:name School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
    142 rdf:type schema:Organization
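In the table above, the author order is not stored directly on the chapter: `schema:author` points at a blank node that heads an RDF list built from `rdf:first` and `rdf:rest` links ending in `rdf:nil`. The sketch below walks that list using a handful of triples copied from the table; the `walk_list` helper and the in-memory tuple representation are assumptions made for this example, not part of SciGraph.

```python
RDF_NIL = "rdf:nil"

# A few (subject, predicate, object) rows copied from the table above.
triples = [
    ("sg:pub.10.1007/978-3-540-30219-3_27", "schema:author", "N93df581bcfc347b386031b7cdf740019"),
    ("N93df581bcfc347b386031b7cdf740019", "rdf:first", "sg:person.0642727740.54"),
    ("N93df581bcfc347b386031b7cdf740019", "rdf:rest", "N2d4347e598fc40aca14eaf4ce9e250d0"),
    ("N2d4347e598fc40aca14eaf4ce9e250d0", "rdf:first", "sg:person.0702641766.21"),
    ("N2d4347e598fc40aca14eaf4ce9e250d0", "rdf:rest", "rdf:nil"),
    ("sg:person.0642727740.54", "schema:familyName", "Brown"),
    ("sg:person.0702641766.21", "schema:familyName", "Hudek"),
]


def walk_list(head, triples):
    """Return the items of the RDF list starting at head, in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    # Stop at rdf:nil; the seen set guards against a malformed cyclic list.
    while node != RDF_NIL and node not in seen:
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Find the list head from the chapter's schema:author triple, then walk it.
head = next(o for s, p, o in triples if p == "schema:author")
author_ids = walk_list(head, triples)
family = {s: o for s, p, o in triples if p == "schema:familyName"}
author_names = [family[a] for a in author_ids]
print(author_names)
```

The same walk applies to the `schema:editor` list in the table, which is encoded with the same `rdf:first` and `rdf:rest` pattern.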
     



