Decoding HMMs using the k best paths: algorithms and applications View Full Text


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2010-01

AUTHORS

Daniel G Brown, Daniil Golod

ABSTRACT

BACKGROUND: Traditional algorithms for hidden Markov model decoding seek to maximize either the probability of a state path or the number of positions of a sequence assigned to the correct state. These algorithms provide only a single answer and in practice do not produce good results. RESULTS: We explore an alternative approach, where we efficiently compute the k paths of highest probability to explain a sequence and then either use those paths to explore alternative explanations for a sequence or to combine them into a single explanation. Our procedure uses an online pruning technique to reduce usage of primary memory. CONCLUSION: Out algorithm uses much less memory than naive approach. For membrane proteins, even simple path combination algorithms give good explanations, and if we look at the paths we are combining, we can give a sense of confidence in the explanation as well. For proteins with two topologies, the k best paths can give insight into both correct explanations of a sequence, a feature lacking from traditional algorithms in this domain. More... »

PAGES

s28

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s28

DOI

http://dx.doi.org/10.1186/1471-2105-11-s1-s28

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1031992141

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/20122200


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0802", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computation Theory and Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Protein", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Markov Chains", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Proteins", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Waterloo", 
          "id": "https://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "Cheriton School of Computer Science, University of Waterloo, 200 University Avenue W., N2L 3G1, Waterloo, Ontario, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Brown", 
        "givenName": "Daniel G", 
        "id": "sg:person.0642727740.54", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Waterloo", 
          "id": "https://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "Cheriton School of Computer Science, University of Waterloo, 200 University Avenue W., N2L 3G1, Waterloo, Ontario, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Golod", 
        "givenName": "Daniil", 
        "id": "sg:person.01301341662.63", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01301341662.63"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/978-3-540-30219-3_36", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000399524", 
          "https://doi.org/10.1007/978-3-540-30219-3_36"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-30219-3_36", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000399524", 
          "https://doi.org/10.1007/978-3-540-30219-3_36"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bti1014", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008444097"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.jmb.2004.03.016", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021467321"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gki070", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022267696"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-74126-8_23", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034400198", 
          "https://doi.org/10.1007/978-3-540-74126-8_23"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-74126-8_23", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034400198", 
          "https://doi.org/10.1007/978-3-540-74126-8_23"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-6-s4-s12", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039318188", 
          "https://doi.org/10.1186/1471-2105-6-s4-s12"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nsmb1057", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048693867", 
          "https://doi.org/10.1038/nsmb1057"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nsmb1057", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048693867", 
          "https://doi.org/10.1038/nsmb1057"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bth340", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049699987"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/s0097539795290477", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062880056"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1142/s0219720009004242", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1063004912"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083156239", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2010-01", 
    "datePublishedReg": "2010-01-01", 
    "description": "BACKGROUND: Traditional algorithms for hidden Markov model decoding seek to maximize either the probability of a state path or the number of positions of a sequence assigned to the correct state. These algorithms provide only a single answer and in practice do not produce good results.\nRESULTS: We explore an alternative approach, where we efficiently compute the k paths of highest probability to explain a sequence and then either use those paths to explore alternative explanations for a sequence or to combine them into a single explanation. Our procedure uses an online pruning technique to reduce usage of primary memory.\nCONCLUSION: Out algorithm uses much less memory than naive approach. For membrane proteins, even simple path combination algorithms give good explanations, and if we look at the paths we are combining, we can give a sense of confidence in the explanation as well. For proteins with two topologies, the k best paths can give insight into both correct explanations of a sequence, a feature lacking from traditional algorithms in this domain.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1471-2105-11-s1-s28", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "Suppl 1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "11"
      }
    ], 
    "name": "Decoding HMMs using the k best paths: algorithms and applications", 
    "pagination": "s28", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "9867931e6b6d6e8fa4a3f30949fa2792e8acfa43e62bc21a7fecb42dabecfd5a"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "20122200"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100965194"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-11-s1-s28"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1031992141"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-11-s1-s28", 
      "https://app.dimensions.ai/details/publication/pub.1031992141"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T20:53", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8684_00000550.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186/1471-2105-11-S1-S28"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s28'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s28'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s28'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s28'


 

This table displays all metadata directly associated to this object as RDF triples.

127 TRIPLES      21 PREDICATES      44 URIs      25 LITERALS      13 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-11-s1-s28 schema:about N5b4f109f19684eb1a939501298b17b59
2 N934f76935204458a902cbd40f63bf6f8
3 N987dc0a4e7cb485085950fe9f4083700
4 Nf1d8172babe84efbb75ac14580ccf4ca
5 anzsrc-for:08
6 anzsrc-for:0802
7 schema:author N2ce95253b54f49d38c6d868310aff1f5
8 schema:citation sg:pub.10.1007/978-3-540-30219-3_36
9 sg:pub.10.1007/978-3-540-74126-8_23
10 sg:pub.10.1038/nsmb1057
11 sg:pub.10.1186/1471-2105-6-s4-s12
12 https://app.dimensions.ai/details/publication/pub.1083156239
13 https://doi.org/10.1016/j.jmb.2004.03.016
14 https://doi.org/10.1093/bioinformatics/bth340
15 https://doi.org/10.1093/bioinformatics/bti1014
16 https://doi.org/10.1093/nar/gki070
17 https://doi.org/10.1137/s0097539795290477
18 https://doi.org/10.1142/s0219720009004242
19 schema:datePublished 2010-01
20 schema:datePublishedReg 2010-01-01
21 schema:description BACKGROUND: Traditional algorithms for hidden Markov model decoding seek to maximize either the probability of a state path or the number of positions of a sequence assigned to the correct state. These algorithms provide only a single answer and in practice do not produce good results. RESULTS: We explore an alternative approach, where we efficiently compute the k paths of highest probability to explain a sequence and then either use those paths to explore alternative explanations for a sequence or to combine them into a single explanation. Our procedure uses an online pruning technique to reduce usage of primary memory. CONCLUSION: Out algorithm uses much less memory than naive approach. For membrane proteins, even simple path combination algorithms give good explanations, and if we look at the paths we are combining, we can give a sense of confidence in the explanation as well. For proteins with two topologies, the k best paths can give insight into both correct explanations of a sequence, a feature lacking from traditional algorithms in this domain.
22 schema:genre research_article
23 schema:inLanguage en
24 schema:isAccessibleForFree true
25 schema:isPartOf N558e006dd1cf412b86934c8125dded7d
26 N662da6e51aa242e8a15f0ce2506ae76b
27 sg:journal.1023786
28 schema:name Decoding HMMs using the k best paths: algorithms and applications
29 schema:pagination s28
30 schema:productId N6f28919f01b54e62835db1125de7b118
31 N84d59f68bb36415799807a21ae512d1e
32 Naa11b1823dfe4f6381bace322b158b6c
33 Nd14068d6aa0543928840de32e1de3d22
34 Nf75d6fe26ed5487d81f67c28b21c0c6c
35 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031992141
36 https://doi.org/10.1186/1471-2105-11-s1-s28
37 schema:sdDatePublished 2019-04-10T20:53
38 schema:sdLicense https://scigraph.springernature.com/explorer/license/
39 schema:sdPublisher N26e5b9d34b18438ba32a77276c4badd5
40 schema:url http://link.springer.com/10.1186/1471-2105-11-S1-S28
41 sgo:license sg:explorer/license/
42 sgo:sdDataset articles
43 rdf:type schema:ScholarlyArticle
44 N26e5b9d34b18438ba32a77276c4badd5 schema:name Springer Nature - SN SciGraph project
45 rdf:type schema:Organization
46 N2ce95253b54f49d38c6d868310aff1f5 rdf:first sg:person.0642727740.54
47 rdf:rest Ne1a420c4c17549dabe4505072a694d1b
48 N558e006dd1cf412b86934c8125dded7d schema:issueNumber Suppl 1
49 rdf:type schema:PublicationIssue
50 N5b4f109f19684eb1a939501298b17b59 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
51 schema:name Proteins
52 rdf:type schema:DefinedTerm
53 N662da6e51aa242e8a15f0ce2506ae76b schema:volumeNumber 11
54 rdf:type schema:PublicationVolume
55 N6f28919f01b54e62835db1125de7b118 schema:name nlm_unique_id
56 schema:value 100965194
57 rdf:type schema:PropertyValue
58 N84d59f68bb36415799807a21ae512d1e schema:name dimensions_id
59 schema:value pub.1031992141
60 rdf:type schema:PropertyValue
61 N934f76935204458a902cbd40f63bf6f8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
62 schema:name Databases, Protein
63 rdf:type schema:DefinedTerm
64 N987dc0a4e7cb485085950fe9f4083700 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
65 schema:name Markov Chains
66 rdf:type schema:DefinedTerm
67 Naa11b1823dfe4f6381bace322b158b6c schema:name pubmed_id
68 schema:value 20122200
69 rdf:type schema:PropertyValue
70 Nd14068d6aa0543928840de32e1de3d22 schema:name readcube_id
71 schema:value 9867931e6b6d6e8fa4a3f30949fa2792e8acfa43e62bc21a7fecb42dabecfd5a
72 rdf:type schema:PropertyValue
73 Ne1a420c4c17549dabe4505072a694d1b rdf:first sg:person.01301341662.63
74 rdf:rest rdf:nil
75 Nf1d8172babe84efbb75ac14580ccf4ca schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
76 schema:name Algorithms
77 rdf:type schema:DefinedTerm
78 Nf75d6fe26ed5487d81f67c28b21c0c6c schema:name doi
79 schema:value 10.1186/1471-2105-11-s1-s28
80 rdf:type schema:PropertyValue
81 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
82 schema:name Information and Computing Sciences
83 rdf:type schema:DefinedTerm
84 anzsrc-for:0802 schema:inDefinedTermSet anzsrc-for:
85 schema:name Computation Theory and Mathematics
86 rdf:type schema:DefinedTerm
87 sg:journal.1023786 schema:issn 1471-2105
88 schema:name BMC Bioinformatics
89 rdf:type schema:Periodical
90 sg:person.01301341662.63 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
91 schema:familyName Golod
92 schema:givenName Daniil
93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01301341662.63
94 rdf:type schema:Person
95 sg:person.0642727740.54 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
96 schema:familyName Brown
97 schema:givenName Daniel G
98 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
99 rdf:type schema:Person
100 sg:pub.10.1007/978-3-540-30219-3_36 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000399524
101 https://doi.org/10.1007/978-3-540-30219-3_36
102 rdf:type schema:CreativeWork
103 sg:pub.10.1007/978-3-540-74126-8_23 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034400198
104 https://doi.org/10.1007/978-3-540-74126-8_23
105 rdf:type schema:CreativeWork
106 sg:pub.10.1038/nsmb1057 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048693867
107 https://doi.org/10.1038/nsmb1057
108 rdf:type schema:CreativeWork
109 sg:pub.10.1186/1471-2105-6-s4-s12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039318188
110 https://doi.org/10.1186/1471-2105-6-s4-s12
111 rdf:type schema:CreativeWork
112 https://app.dimensions.ai/details/publication/pub.1083156239 schema:CreativeWork
113 https://doi.org/10.1016/j.jmb.2004.03.016 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021467321
114 rdf:type schema:CreativeWork
115 https://doi.org/10.1093/bioinformatics/bth340 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049699987
116 rdf:type schema:CreativeWork
117 https://doi.org/10.1093/bioinformatics/bti1014 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008444097
118 rdf:type schema:CreativeWork
119 https://doi.org/10.1093/nar/gki070 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022267696
120 rdf:type schema:CreativeWork
121 https://doi.org/10.1137/s0097539795290477 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062880056
122 rdf:type schema:CreativeWork
123 https://doi.org/10.1142/s0219720009004242 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063004912
124 rdf:type schema:CreativeWork
125 https://www.grid.ac/institutes/grid.46078.3d schema:alternateName University of Waterloo
126 schema:name Cheriton School of Computer Science, University of Waterloo, 200 University Avenue W., N2L 3G1, Waterloo, Ontario, Canada
127 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...