Towards a Practical O(n logn) Phylogeny Algorithm View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2011

AUTHORS

Daniel G. Brown , Jakub Truszkowski

ABSTRACT

Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost. More... »

PAGES

14-25

References to SciGraph publications

  • 2009. Phylogenies without Branch Bounds: Contracting the Short, Pruning the Deep in RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY
  • 2011. Fast Error-Tolerant Quartet Phylogeny Algorithms in COMBINATORIAL PATTERN MATCHING
  • 2005-12. Scoredist: A simple and robust protein sequence distance estimator in BMC BIOINFORMATICS
  • 2009. Large-Scale Neighbor-Joining with NINJA in ALGORITHMS IN BIOINFORMATICS
  • 2005. Fast Neighbor Joining in AUTOMATA, LANGUAGES AND PROGRAMMING
  • 2004-02. Computing the Quartet Distance between Evolutionary Trees in Time O(n log n) in ALGORITHMICA
  • Book

    TITLE

    Algorithms in Bioinformatics

    ISBN

    978-3-642-23037-0
    978-3-642-23038-7

    Author Affiliations

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2

    DOI

    http://dx.doi.org/10.1007/978-3-642-23038-7_2

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1050915819


    Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
    Incoming Citations Browse incoming citations for this publication using opencitations.net

    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Brown", 
            "givenName": "Daniel G.", 
            "id": "sg:person.0642727740.54", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Truszkowski", 
            "givenName": "Jakub", 
            "id": "sg:person.01320220640.40", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1093/oxfordjournals.molbev.a025664", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000153997"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/279069.279105", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000943105"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.ipl.2007.07.002", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1002252868"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11523468_102", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004861745", 
              "https://doi.org/10.1007/11523468_102"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11523468_102", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004861745", 
              "https://doi.org/10.1007/11523468_102"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-21458-5_14", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008082437", 
              "https://doi.org/10.1007/978-3-642-21458-5_14"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-21458-5_14", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008082437", 
              "https://doi.org/10.1007/978-3-642-21458-5_14"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s00453-003-1065-y", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1021604411", 
              "https://doi.org/10.1007/s00453-003-1065-y"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/molbev/msp077", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028540883"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/molbev/msp077", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028540883"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0304-3975(99)00028-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030334849"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-04241-6_31", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032768275", 
              "https://doi.org/10.1007/978-3-642-04241-6_31"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/oxfordjournals.molbev.a025756", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1034694173"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-6-108", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037048502", 
              "https://doi.org/10.1186/1471-2105-6-108"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/jagm.1996.0035", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045540105"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-02008-7_32", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050043191", 
              "https://doi.org/10.1007/978-3-642-02008-7_32"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-02008-7_32", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050043191", 
              "https://doi.org/10.1007/978-3-642-02008-7_32"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/oxfordjournals.molbev.a003881", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052052269"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/10665270252935467", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059204929"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/106652702761034136", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059204960"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2007.0103", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059245568"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2011", 
        "datePublishedReg": "2011-01-01", 
        "description": "Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.", 
        "editor": [
          {
            "familyName": "Przytycka", 
            "givenName": "Teresa M.", 
            "type": "Person"
          }, 
          {
            "familyName": "Sagot", 
            "givenName": "Marie-France", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-642-23038-7_2", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-642-23037-0", 
            "978-3-642-23038-7"
          ], 
          "name": "Algorithms in Bioinformatics", 
          "type": "Book"
        }, 
        "name": "Towards a Practical O(n logn) Phylogeny Algorithm", 
        "pagination": "14-25", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1050915819"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-642-23038-7_2"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "f28b5f8050ff284410c0f206111750227f527cd3e4e7278035d8befd7527eb4a"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-642-23038-7_2", 
          "https://app.dimensions.ai/details/publication/pub.1050915819"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T09:10", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000371_0000000371/records_130793_00000004.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-642-23038-7_2"
      }
    ]
     

    Download the RDF metadata as:  json-ld nt turtle xml License info

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'


     

    This table displays all metadata directly associated to this object as RDF triples.

    134 TRIPLES      23 PREDICATES      44 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-642-23038-7_2 schema:about anzsrc-for:01
    2 anzsrc-for:0104
    3 schema:author Nd4218dffa35342729b4dfd999836e38f
    4 schema:citation sg:pub.10.1007/11523468_102
    5 sg:pub.10.1007/978-3-642-02008-7_32
    6 sg:pub.10.1007/978-3-642-04241-6_31
    7 sg:pub.10.1007/978-3-642-21458-5_14
    8 sg:pub.10.1007/s00453-003-1065-y
    9 sg:pub.10.1186/1471-2105-6-108
    10 https://doi.org/10.1006/jagm.1996.0035
    11 https://doi.org/10.1016/j.ipl.2007.07.002
    12 https://doi.org/10.1016/s0304-3975(99)00028-6
    13 https://doi.org/10.1089/10665270252935467
    14 https://doi.org/10.1089/106652702761034136
    15 https://doi.org/10.1089/cmb.2007.0103
    16 https://doi.org/10.1093/molbev/msp077
    17 https://doi.org/10.1093/oxfordjournals.molbev.a003881
    18 https://doi.org/10.1093/oxfordjournals.molbev.a025664
    19 https://doi.org/10.1093/oxfordjournals.molbev.a025756
    20 https://doi.org/10.1145/279069.279105
    21 schema:datePublished 2011
    22 schema:datePublishedReg 2011-01-01
    23 schema:description Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.
    24 schema:editor Nc80c31996cd64c0bba6fa2bf1d485872
    25 schema:genre chapter
    26 schema:inLanguage en
    27 schema:isAccessibleForFree true
    28 schema:isPartOf N66c29524709448fcaf27f570cde2b5c0
    29 schema:name Towards a Practical O(n logn) Phylogeny Algorithm
    30 schema:pagination 14-25
    31 schema:productId N9d99eafc68b04062bed40277ab5f6a07
    32 Nb237c3abcb854a9c9a64b5287c0f575d
    33 Nb7ad1c5cca8f4e24a899623e2fd69591
    34 schema:publisher N5e556f98b8f64523af3507cc9c5deaf1
    35 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050915819
    36 https://doi.org/10.1007/978-3-642-23038-7_2
    37 schema:sdDatePublished 2019-04-16T09:10
    38 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    39 schema:sdPublisher N889897e04d7f4ece96efcb5a1c7ba50a
    40 schema:url https://link.springer.com/10.1007%2F978-3-642-23038-7_2
    41 sgo:license sg:explorer/license/
    42 sgo:sdDataset chapters
    43 rdf:type schema:Chapter
    44 N059fe0cd84304736b8fb04a1d0aa42e0 rdf:first N34b3fcbe925d4990ad2d7be18dfc1247
    45 rdf:rest rdf:nil
    46 N34b3fcbe925d4990ad2d7be18dfc1247 schema:familyName Sagot
    47 schema:givenName Marie-France
    48 rdf:type schema:Person
    49 N5e556f98b8f64523af3507cc9c5deaf1 schema:location Berlin, Heidelberg
    50 schema:name Springer Berlin Heidelberg
    51 rdf:type schema:Organisation
    52 N66c29524709448fcaf27f570cde2b5c0 schema:isbn 978-3-642-23037-0
    53 978-3-642-23038-7
    54 schema:name Algorithms in Bioinformatics
    55 rdf:type schema:Book
    56 N7b6fd3b5969d4f09ba727ab18958eb76 schema:familyName Przytycka
    57 schema:givenName Teresa M.
    58 rdf:type schema:Person
    59 N889897e04d7f4ece96efcb5a1c7ba50a schema:name Springer Nature - SN SciGraph project
    60 rdf:type schema:Organization
    61 N9d99eafc68b04062bed40277ab5f6a07 schema:name dimensions_id
    62 schema:value pub.1050915819
    63 rdf:type schema:PropertyValue
    64 Nad0a5a2e01bc46289f533b51b8222848 rdf:first sg:person.01320220640.40
    65 rdf:rest rdf:nil
    66 Nb237c3abcb854a9c9a64b5287c0f575d schema:name doi
    67 schema:value 10.1007/978-3-642-23038-7_2
    68 rdf:type schema:PropertyValue
    69 Nb7ad1c5cca8f4e24a899623e2fd69591 schema:name readcube_id
    70 schema:value f28b5f8050ff284410c0f206111750227f527cd3e4e7278035d8befd7527eb4a
    71 rdf:type schema:PropertyValue
    72 Nc80c31996cd64c0bba6fa2bf1d485872 rdf:first N7b6fd3b5969d4f09ba727ab18958eb76
    73 rdf:rest N059fe0cd84304736b8fb04a1d0aa42e0
    74 Nd4218dffa35342729b4dfd999836e38f rdf:first sg:person.0642727740.54
    75 rdf:rest Nad0a5a2e01bc46289f533b51b8222848
    76 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    77 schema:name Mathematical Sciences
    78 rdf:type schema:DefinedTerm
    79 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Statistics
    81 rdf:type schema:DefinedTerm
    82 sg:person.01320220640.40 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    83 schema:familyName Truszkowski
    84 schema:givenName Jakub
    85 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40
    86 rdf:type schema:Person
    87 sg:person.0642727740.54 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    88 schema:familyName Brown
    89 schema:givenName Daniel G.
    90 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
    91 rdf:type schema:Person
    92 sg:pub.10.1007/11523468_102 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004861745
    93 https://doi.org/10.1007/11523468_102
    94 rdf:type schema:CreativeWork
    95 sg:pub.10.1007/978-3-642-02008-7_32 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050043191
    96 https://doi.org/10.1007/978-3-642-02008-7_32
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1007/978-3-642-04241-6_31 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032768275
    99 https://doi.org/10.1007/978-3-642-04241-6_31
    100 rdf:type schema:CreativeWork
    101 sg:pub.10.1007/978-3-642-21458-5_14 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008082437
    102 https://doi.org/10.1007/978-3-642-21458-5_14
    103 rdf:type schema:CreativeWork
    104 sg:pub.10.1007/s00453-003-1065-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1021604411
    105 https://doi.org/10.1007/s00453-003-1065-y
    106 rdf:type schema:CreativeWork
    107 sg:pub.10.1186/1471-2105-6-108 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037048502
    108 https://doi.org/10.1186/1471-2105-6-108
    109 rdf:type schema:CreativeWork
    110 https://doi.org/10.1006/jagm.1996.0035 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045540105
    111 rdf:type schema:CreativeWork
    112 https://doi.org/10.1016/j.ipl.2007.07.002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002252868
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1016/s0304-3975(99)00028-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030334849
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1089/10665270252935467 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059204929
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1089/106652702761034136 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059204960
    119 rdf:type schema:CreativeWork
    120 https://doi.org/10.1089/cmb.2007.0103 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245568
    121 rdf:type schema:CreativeWork
    122 https://doi.org/10.1093/molbev/msp077 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028540883
    123 rdf:type schema:CreativeWork
    124 https://doi.org/10.1093/oxfordjournals.molbev.a003881 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052052269
    125 rdf:type schema:CreativeWork
    126 https://doi.org/10.1093/oxfordjournals.molbev.a025664 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000153997
    127 rdf:type schema:CreativeWork
    128 https://doi.org/10.1093/oxfordjournals.molbev.a025756 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034694173
    129 rdf:type schema:CreativeWork
    130 https://doi.org/10.1145/279069.279105 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000943105
    131 rdf:type schema:CreativeWork
    132 https://www.grid.ac/institutes/grid.46078.3d schema:alternateName University of Waterloo
    133 schema:name David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
    134 rdf:type schema:Organization
     




    Preview window. Press ESC to close (or click here)


    ...