Towards a Practical O(n logn) Phylogeny Algorithm View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2011

AUTHORS

Daniel G. Brown , Jakub Truszkowski

ABSTRACT

Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost. More... »

PAGES

14-25

References to SciGraph publications

  • 2009. Phylogenies without Branch Bounds: Contracting the Short, Pruning the Deep in RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY
  • 2011. Fast Error-Tolerant Quartet Phylogeny Algorithms in COMBINATORIAL PATTERN MATCHING
  • 2005-12. Scoredist: A simple and robust protein sequence distance estimator in BMC BIOINFORMATICS
  • 2009. Large-Scale Neighbor-Joining with NINJA in ALGORITHMS IN BIOINFORMATICS
  • 2005. Fast Neighbor Joining in AUTOMATA, LANGUAGES AND PROGRAMMING
  • 2004-02. Computing the Quartet Distance between Evolutionary Trees in Time O(n log n) in ALGORITHMICA
  • Book

    TITLE

    Algorithms in Bioinformatics

    ISBN

    978-3-642-23037-0
    978-3-642-23038-7

    Author Affiliations

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2

    DOI

    http://dx.doi.org/10.1007/978-3-642-23038-7_2

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1050915819


    Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
    Incoming Citations Browse incoming citations for this publication using opencitations.net

    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Brown", 
            "givenName": "Daniel G.", 
            "id": "sg:person.0642727740.54", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Waterloo", 
              "id": "https://www.grid.ac/institutes/grid.46078.3d", 
              "name": [
                "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Truszkowski", 
            "givenName": "Jakub", 
            "id": "sg:person.01320220640.40", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1093/oxfordjournals.molbev.a025664", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000153997"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/279069.279105", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000943105"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.ipl.2007.07.002", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1002252868"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11523468_102", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004861745", 
              "https://doi.org/10.1007/11523468_102"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11523468_102", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004861745", 
              "https://doi.org/10.1007/11523468_102"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-21458-5_14", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008082437", 
              "https://doi.org/10.1007/978-3-642-21458-5_14"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-21458-5_14", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008082437", 
              "https://doi.org/10.1007/978-3-642-21458-5_14"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s00453-003-1065-y", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1021604411", 
              "https://doi.org/10.1007/s00453-003-1065-y"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/molbev/msp077", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028540883"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/molbev/msp077", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028540883"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0304-3975(99)00028-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030334849"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-04241-6_31", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032768275", 
              "https://doi.org/10.1007/978-3-642-04241-6_31"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/oxfordjournals.molbev.a025756", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1034694173"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-6-108", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037048502", 
              "https://doi.org/10.1186/1471-2105-6-108"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/jagm.1996.0035", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045540105"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-02008-7_32", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050043191", 
              "https://doi.org/10.1007/978-3-642-02008-7_32"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-02008-7_32", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050043191", 
              "https://doi.org/10.1007/978-3-642-02008-7_32"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/oxfordjournals.molbev.a003881", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052052269"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/10665270252935467", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059204929"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/106652702761034136", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059204960"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2007.0103", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059245568"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2011", 
        "datePublishedReg": "2011-01-01", 
        "description": "Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.", 
        "editor": [
          {
            "familyName": "Przytycka", 
            "givenName": "Teresa M.", 
            "type": "Person"
          }, 
          {
            "familyName": "Sagot", 
            "givenName": "Marie-France", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-642-23038-7_2", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-642-23037-0", 
            "978-3-642-23038-7"
          ], 
          "name": "Algorithms in Bioinformatics", 
          "type": "Book"
        }, 
        "name": "Towards a Practical O(n logn) Phylogeny Algorithm", 
        "pagination": "14-25", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1050915819"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-642-23038-7_2"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "f28b5f8050ff284410c0f206111750227f527cd3e4e7278035d8befd7527eb4a"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-642-23038-7_2", 
          "https://app.dimensions.ai/details/publication/pub.1050915819"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T09:10", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000371_0000000371/records_130793_00000004.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-642-23038-7_2"
      }
    ]
     

    Download the RDF metadata as:  json-ld nt turtle xml License info

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'


     

    This table displays all metadata directly associated to this object as RDF triples.

    134 TRIPLES      23 PREDICATES      44 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-642-23038-7_2 schema:about anzsrc-for:01
    2 anzsrc-for:0104
    3 schema:author Nb617181f32534f33b71d26a54c67c526
    4 schema:citation sg:pub.10.1007/11523468_102
    5 sg:pub.10.1007/978-3-642-02008-7_32
    6 sg:pub.10.1007/978-3-642-04241-6_31
    7 sg:pub.10.1007/978-3-642-21458-5_14
    8 sg:pub.10.1007/s00453-003-1065-y
    9 sg:pub.10.1186/1471-2105-6-108
    10 https://doi.org/10.1006/jagm.1996.0035
    11 https://doi.org/10.1016/j.ipl.2007.07.002
    12 https://doi.org/10.1016/s0304-3975(99)00028-6
    13 https://doi.org/10.1089/10665270252935467
    14 https://doi.org/10.1089/106652702761034136
    15 https://doi.org/10.1089/cmb.2007.0103
    16 https://doi.org/10.1093/molbev/msp077
    17 https://doi.org/10.1093/oxfordjournals.molbev.a003881
    18 https://doi.org/10.1093/oxfordjournals.molbev.a025664
    19 https://doi.org/10.1093/oxfordjournals.molbev.a025756
    20 https://doi.org/10.1145/279069.279105
    21 schema:datePublished 2011
    22 schema:datePublishedReg 2011-01-01
    23 schema:description Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.
    24 schema:editor N67643fa9673a46199eb39e846e7abb44
    25 schema:genre chapter
    26 schema:inLanguage en
    27 schema:isAccessibleForFree true
    28 schema:isPartOf N0125c8d3fa0e4d14a815eed93e6f55c0
    29 schema:name Towards a Practical O(n logn) Phylogeny Algorithm
    30 schema:pagination 14-25
    31 schema:productId N1d37f7ee8a4b4ce7b708fc29c5f2d276
    32 Nb20e424c84724a749c88f3e6c1ecce80
    33 Nd4965690248a423c89d1c668d98c9a95
    34 schema:publisher N789e809474aa468abaf5e1b966e95d14
    35 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050915819
    36 https://doi.org/10.1007/978-3-642-23038-7_2
    37 schema:sdDatePublished 2019-04-16T09:10
    38 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    39 schema:sdPublisher N387d5341b74340c1881333f5583e4b36
    40 schema:url https://link.springer.com/10.1007%2F978-3-642-23038-7_2
    41 sgo:license sg:explorer/license/
    42 sgo:sdDataset chapters
    43 rdf:type schema:Chapter
    44 N0125c8d3fa0e4d14a815eed93e6f55c0 schema:isbn 978-3-642-23037-0
    45 978-3-642-23038-7
    46 schema:name Algorithms in Bioinformatics
    47 rdf:type schema:Book
    48 N1d37f7ee8a4b4ce7b708fc29c5f2d276 schema:name doi
    49 schema:value 10.1007/978-3-642-23038-7_2
    50 rdf:type schema:PropertyValue
    51 N387d5341b74340c1881333f5583e4b36 schema:name Springer Nature - SN SciGraph project
    52 rdf:type schema:Organization
    53 N47ad957118394e459512f5b939e6ebf3 rdf:first sg:person.01320220640.40
    54 rdf:rest rdf:nil
    55 N49fbed2b842d4797aaa6e75034456f59 rdf:first Nc1a025f4383b438d8f519f7b04b7f60b
    56 rdf:rest rdf:nil
    57 N67643fa9673a46199eb39e846e7abb44 rdf:first Nf5e9ed25086549a0a596b13995c00571
    58 rdf:rest N49fbed2b842d4797aaa6e75034456f59
    59 N789e809474aa468abaf5e1b966e95d14 schema:location Berlin, Heidelberg
    60 schema:name Springer Berlin Heidelberg
    61 rdf:type schema:Organisation
    62 Nb20e424c84724a749c88f3e6c1ecce80 schema:name dimensions_id
    63 schema:value pub.1050915819
    64 rdf:type schema:PropertyValue
    65 Nb617181f32534f33b71d26a54c67c526 rdf:first sg:person.0642727740.54
    66 rdf:rest N47ad957118394e459512f5b939e6ebf3
    67 Nc1a025f4383b438d8f519f7b04b7f60b schema:familyName Sagot
    68 schema:givenName Marie-France
    69 rdf:type schema:Person
    70 Nd4965690248a423c89d1c668d98c9a95 schema:name readcube_id
    71 schema:value f28b5f8050ff284410c0f206111750227f527cd3e4e7278035d8befd7527eb4a
    72 rdf:type schema:PropertyValue
    73 Nf5e9ed25086549a0a596b13995c00571 schema:familyName Przytycka
    74 schema:givenName Teresa M.
    75 rdf:type schema:Person
    76 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    77 schema:name Mathematical Sciences
    78 rdf:type schema:DefinedTerm
    79 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Statistics
    81 rdf:type schema:DefinedTerm
    82 sg:person.01320220640.40 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    83 schema:familyName Truszkowski
    84 schema:givenName Jakub
    85 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40
    86 rdf:type schema:Person
    87 sg:person.0642727740.54 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
    88 schema:familyName Brown
    89 schema:givenName Daniel G.
    90 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
    91 rdf:type schema:Person
    92 sg:pub.10.1007/11523468_102 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004861745
    93 https://doi.org/10.1007/11523468_102
    94 rdf:type schema:CreativeWork
    95 sg:pub.10.1007/978-3-642-02008-7_32 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050043191
    96 https://doi.org/10.1007/978-3-642-02008-7_32
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1007/978-3-642-04241-6_31 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032768275
    99 https://doi.org/10.1007/978-3-642-04241-6_31
    100 rdf:type schema:CreativeWork
    101 sg:pub.10.1007/978-3-642-21458-5_14 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008082437
    102 https://doi.org/10.1007/978-3-642-21458-5_14
    103 rdf:type schema:CreativeWork
    104 sg:pub.10.1007/s00453-003-1065-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1021604411
    105 https://doi.org/10.1007/s00453-003-1065-y
    106 rdf:type schema:CreativeWork
    107 sg:pub.10.1186/1471-2105-6-108 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037048502
    108 https://doi.org/10.1186/1471-2105-6-108
    109 rdf:type schema:CreativeWork
    110 https://doi.org/10.1006/jagm.1996.0035 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045540105
    111 rdf:type schema:CreativeWork
    112 https://doi.org/10.1016/j.ipl.2007.07.002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002252868
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1016/s0304-3975(99)00028-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030334849
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1089/10665270252935467 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059204929
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1089/106652702761034136 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059204960
    119 rdf:type schema:CreativeWork
    120 https://doi.org/10.1089/cmb.2007.0103 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245568
    121 rdf:type schema:CreativeWork
    122 https://doi.org/10.1093/molbev/msp077 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028540883
    123 rdf:type schema:CreativeWork
    124 https://doi.org/10.1093/oxfordjournals.molbev.a003881 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052052269
    125 rdf:type schema:CreativeWork
    126 https://doi.org/10.1093/oxfordjournals.molbev.a025664 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000153997
    127 rdf:type schema:CreativeWork
    128 https://doi.org/10.1093/oxfordjournals.molbev.a025756 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034694173
    129 rdf:type schema:CreativeWork
    130 https://doi.org/10.1145/279069.279105 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000943105
    131 rdf:type schema:CreativeWork
    132 https://www.grid.ac/institutes/grid.46078.3d schema:alternateName University of Waterloo
    133 schema:name David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
    134 rdf:type schema:Organization
     




    Preview window. Press ESC to close (or click here)


    ...