Towards a Practical O(n logn) Phylogeny Algorithm View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2011

AUTHORS

Daniel G. Brown , Jakub Truszkowski

ABSTRACT

Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost. More... »

PAGES

14-25

Book

TITLE

Algorithms in Bioinformatics

ISBN

978-3-642-23037-0
978-3-642-23038-7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2

DOI

http://dx.doi.org/10.1007/978-3-642-23038-7_2

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1050915819


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada", 
          "id": "http://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Brown", 
        "givenName": "Daniel G.", 
        "id": "sg:person.0642727740.54", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada", 
          "id": "http://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Truszkowski", 
        "givenName": "Jakub", 
        "id": "sg:person.01320220640.40", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2011", 
    "datePublishedReg": "2011-01-01", 
    "description": "Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.", 
    "editor": [
      {
        "familyName": "Przytycka", 
        "givenName": "Teresa M.", 
        "type": "Person"
      }, 
      {
        "familyName": "Sagot", 
        "givenName": "Marie-France", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-23038-7_2", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-642-23037-0", 
        "978-3-642-23038-7"
      ], 
      "name": "Algorithms in Bioinformatics", 
      "type": "Book"
    }, 
    "keywords": [
      "phylogeny algorithms", 
      "quartet errors", 
      "true topology", 
      "variety of extensions", 
      "unknown probabilities", 
      "theoretical setting", 
      "constant factor", 
      "guide tree", 
      "high probability", 
      "algorithm", 
      "probability", 
      "speed improvement", 
      "error", 
      "correct phylogeny", 
      "accuracy cost", 
      "topology", 
      "runtime", 
      "assumption", 
      "extension", 
      "direction", 
      "new directions", 
      "performance", 
      "trees", 
      "work", 
      "results", 
      "cost", 
      "reconstruction", 
      "Practical", 
      "variety", 
      "phylogenetic reconstruction", 
      "improvement", 
      "setting", 
      "factors", 
      "practice", 
      "phylogeny", 
      "taxa"
    ], 
    "name": "Towards a Practical O(n logn) Phylogeny Algorithm", 
    "pagination": "14-25", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1050915819"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-23038-7_2"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-23038-7_2", 
      "https://app.dimensions.ai/details/publication/pub.1050915819"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-12-01T06:55", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/chapter/chapter_99.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-642-23038-7_2"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-23038-7_2'


 

This table displays all metadata directly associated to this object as RDF triples.

107 TRIPLES      22 PREDICATES      61 URIs      54 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-23038-7_2 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author N54e95f9da98f4224b401405fe0326f37
4 schema:datePublished 2011
5 schema:datePublishedReg 2011-01-01
6 schema:description Recently, we have identified a quartet phylogeny algorithm with O(n logn) expected runtime, which is asymptotically optimal. Regardless of the true topology, our algorithm has high probability of returning the correct phylogeny when quartet errors are independent and occur with known probability, and when the algorithm uses a guide tree on O( loglogn) taxa that is correct with high probability. In practice, none of these assumptions is correct: quartet errors are positively correlated and occur with unknown probability, and the guide tree is often error prone. Here, we bring our work out of the purely theoretical setting. We present a variety of extensions which, while only slowing the algorithm down by a constant factor, make its performance nearly comparable to that of neighbour-joining, which requires O(n3) runtime. Our results suggest a new direction for quartet-based phylogenetic reconstruction that may yield striking speed improvements at minimal accuracy cost.
7 schema:editor N73831ef73c294d72a1fa6c543bf715d1
8 schema:genre chapter
9 schema:isAccessibleForFree true
10 schema:isPartOf N1ae9d20247db4217a5c6d50cf2f730bf
11 schema:keywords Practical
12 accuracy cost
13 algorithm
14 assumption
15 constant factor
16 correct phylogeny
17 cost
18 direction
19 error
20 extension
21 factors
22 guide tree
23 high probability
24 improvement
25 new directions
26 performance
27 phylogenetic reconstruction
28 phylogeny
29 phylogeny algorithms
30 practice
31 probability
32 quartet errors
33 reconstruction
34 results
35 runtime
36 setting
37 speed improvement
38 taxa
39 theoretical setting
40 topology
41 trees
42 true topology
43 unknown probabilities
44 variety
45 variety of extensions
46 work
47 schema:name Towards a Practical O(n logn) Phylogeny Algorithm
48 schema:pagination 14-25
49 schema:productId N2c92361031a142f0bd499f8014eac178
50 Na2456eefb8684b88a3edbdffe69ad098
51 schema:publisher N0dc259154f1340678abeca9c68e6eade
52 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050915819
53 https://doi.org/10.1007/978-3-642-23038-7_2
54 schema:sdDatePublished 2022-12-01T06:55
55 schema:sdLicense https://scigraph.springernature.com/explorer/license/
56 schema:sdPublisher Ncf1fc34b63ff4086b1658c9e35e972cb
57 schema:url https://doi.org/10.1007/978-3-642-23038-7_2
58 sgo:license sg:explorer/license/
59 sgo:sdDataset chapters
60 rdf:type schema:Chapter
61 N0dc259154f1340678abeca9c68e6eade schema:name Springer Nature
62 rdf:type schema:Organisation
63 N15dd9e6bbd2d4c0e8025d9ff00261069 schema:familyName Sagot
64 schema:givenName Marie-France
65 rdf:type schema:Person
66 N1ae9d20247db4217a5c6d50cf2f730bf schema:isbn 978-3-642-23037-0
67 978-3-642-23038-7
68 schema:name Algorithms in Bioinformatics
69 rdf:type schema:Book
70 N2c92361031a142f0bd499f8014eac178 schema:name doi
71 schema:value 10.1007/978-3-642-23038-7_2
72 rdf:type schema:PropertyValue
73 N54e95f9da98f4224b401405fe0326f37 rdf:first sg:person.0642727740.54
74 rdf:rest N987f57c60e594710b2c6a608fcb6f4b0
75 N73831ef73c294d72a1fa6c543bf715d1 rdf:first Na91b4284d0dd44c9b1cc697a81fb17be
76 rdf:rest Na61c7eb51e1e4f5cbd8d4a19bbd704b2
77 N987f57c60e594710b2c6a608fcb6f4b0 rdf:first sg:person.01320220640.40
78 rdf:rest rdf:nil
79 Na2456eefb8684b88a3edbdffe69ad098 schema:name dimensions_id
80 schema:value pub.1050915819
81 rdf:type schema:PropertyValue
82 Na61c7eb51e1e4f5cbd8d4a19bbd704b2 rdf:first N15dd9e6bbd2d4c0e8025d9ff00261069
83 rdf:rest rdf:nil
84 Na91b4284d0dd44c9b1cc697a81fb17be schema:familyName Przytycka
85 schema:givenName Teresa M.
86 rdf:type schema:Person
87 Ncf1fc34b63ff4086b1658c9e35e972cb schema:name Springer Nature - SN SciGraph project
88 rdf:type schema:Organization
89 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
90 schema:name Mathematical Sciences
91 rdf:type schema:DefinedTerm
92 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
93 schema:name Statistics
94 rdf:type schema:DefinedTerm
95 sg:person.01320220640.40 schema:affiliation grid-institutes:grid.46078.3d
96 schema:familyName Truszkowski
97 schema:givenName Jakub
98 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40
99 rdf:type schema:Person
100 sg:person.0642727740.54 schema:affiliation grid-institutes:grid.46078.3d
101 schema:familyName Brown
102 schema:givenName Daniel G.
103 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
104 rdf:type schema:Person
105 grid-institutes:grid.46078.3d schema:alternateName David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
106 schema:name David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
107 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...