New decoding algorithms for Hidden Markov Models using distance measures on labellings


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2010-01-18

AUTHORS

Daniel G Brown, Jakub Truszkowski

ABSTRACT

Background: Existing hidden Markov model decoding algorithms do not focus on approximately identifying the sequence feature boundaries. Results: We give a set of algorithms to compute the conditional probability of all labellings "near" a reference labelling λ for a sequence y for a variety of definitions of "near". In addition, we give optimization algorithms to find the best labelling for a sequence in the robust sense of having all of its feature boundaries nearly correct. Natural problems in this domain are NP-hard to optimize. For membrane proteins, our algorithms find the approximate topology of such proteins with comparable success to existing programs, while being substantially more accurate in estimating the positions of transmembrane helix boundaries. Conclusion: More robust HMM decoding may allow for better analysis of sequence features, in reasonable runtimes.

PAGES

s40

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40

DOI

http://dx.doi.org/10.1186/1471-2105-11-s1-s40

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1052005567

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/20122214



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0102", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Applied Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Protein", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Markov Chains", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Membrane Proteins", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada", 
          "id": "http://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Brown", 
        "givenName": "Daniel G", 
        "id": "sg:person.0642727740.54", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada", 
          "id": "http://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Truszkowski", 
        "givenName": "Jakub", 
        "id": "sg:person.01320220640.40", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/978-3-540-30219-3_36", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000399524", 
          "https://doi.org/10.1007/978-3-540-30219-3_36"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-11-s1-s28", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031992141", 
          "https://doi.org/10.1186/1471-2105-11-s1-s28"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-6-s4-s12", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039318188", 
          "https://doi.org/10.1186/1471-2105-6-s4-s12"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2010-01-18", 
    "datePublishedReg": "2010-01-18", 
    "description": "BackgroundExisting hidden Markov model decoding algorithms do not focus on approximately identifying the sequence feature boundaries.ResultsWe give a set of algorithms to compute the conditional probability of all labellings \"near\" a reference labelling \u03bb for a sequence y for a variety of definitions of \"near\". In addition, we give optimization algorithms to find the best labelling for a sequence in the robust sense of having all of its feature boundaries nearly correct. Natural problems in this domain are NP-hard to optimize. For membrane proteins, our algorithms find the approximate topology of such proteins with comparable success to existing programs, while being substantially more accurate in estimating the positions of transmembrane helix boundaries.ConclusionMore robust HMM decoding may allow for better analysis of sequence features, in reasonable runtimes.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/1471-2105-11-s1-s40", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "Suppl 1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "11"
      }
    ], 
    "keywords": [
      "decoding algorithm", 
      "feature boundaries", 
      "set of algorithms", 
      "new decoding algorithm", 
      "Hidden Markov Model", 
      "approximate topology", 
      "HMM Decoding", 
      "optimization algorithm", 
      "natural problem", 
      "reasonable runtime", 
      "distance measure", 
      "algorithm", 
      "conditional probability", 
      "sequence Y", 
      "Markov model", 
      "better analysis", 
      "sequence features", 
      "runtime", 
      "helix boundaries", 
      "robust sense", 
      "decoding", 
      "boundaries", 
      "topology", 
      "NPs", 
      "comparable success", 
      "better labeling", 
      "probability", 
      "problem", 
      "set", 
      "variety of definitions", 
      "features", 
      "domain", 
      "model", 
      "sense", 
      "definition", 
      "labeling", 
      "variety", 
      "success", 
      "program", 
      "sequence", 
      "position", 
      "analysis", 
      "measures", 
      "addition", 
      "ResultsWe", 
      "such proteins", 
      "protein", 
      "membrane proteins"
    ], 
    "name": "New decoding algorithms for Hidden Markov Models using distance measures on labellings", 
    "pagination": "s40", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1052005567"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-11-s1-s40"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "20122214"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-11-s1-s40", 
      "https://app.dimensions.ai/details/publication/pub.1052005567"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-11-24T20:54", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221124/entities/gbq_results/article/article_503.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/1471-2105-11-s1-s40"
  }
]
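Because the record is plain JSON, its fields can be read with any JSON parser. The following is a minimal sketch, not part of the SciGraph tooling: it embeds a trimmed copy of the record above (only the fields it uses), and the helper name `summarize` is a hypothetical name chosen for illustration.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD = json.loads("""
[
  {
    "name": "New decoding algorithms for Hidden Markov Models using distance measures on labellings",
    "datePublished": "2010-01-18",
    "author": [
      {"familyName": "Brown", "givenName": "Daniel G", "type": "Person"},
      {"familyName": "Truszkowski", "givenName": "Jakub", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1052005567"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-11-s1-s40"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["20122214"]}
    ]
  }
]
""")


def summarize(record):
    """Return title, date, author names, and identifiers from one record.

    `summarize` is a hypothetical helper, not a SciGraph API.
    """
    article = record[0]  # the top-level array holds a single article object
    authors = [f"{a['givenName']} {a['familyName']}" for a in article["author"]]
    # Each productId entry carries a one-element list of values.
    ids = {p["name"]: p["value"][0] for p in article["productId"]}
    return {
        "title": article["name"],
        "date": article["datePublished"],
        "authors": authors,
        "ids": ids,
    }


summary = summarize(RECORD)
print(summary["authors"], summary["ids"]["doi"])
```

The same function should work on the full record, since it only reads keys that appear there; a general JSON-LD consumer would more likely expand the `@context` with a dedicated library rather than rely on the compact key names.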
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40'
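The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch under stated assumptions: the names `MEDIA_TYPES`, `build_request`, and `fetch` are hypothetical, the media types and URL are copied from the commands above, and nothing here confirms that the endpoint still responds.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are
# illustrative labels, not part of any SciGraph API.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    return urllib.request.Request(
        BASE + record_id,
        headers={"Accept": MEDIA_TYPES[fmt]},
    )


def fetch(record_id, fmt, timeout=30):
    """Send the request and return the response body as text.

    Requires network access; raises urllib.error.URLError on failure.
    """
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


req = build_request("pub.10.1186/1471-2105-11-s1-s40", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Building the request separately from sending it keeps the header logic checkable without a network connection; calling `fetch("pub.10.1186/1471-2105-11-s1-s40", "nt")` would perform the equivalent of the N-Triples curl command.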


 

This table displays all metadata directly associated to this object as RDF triples.

143 TRIPLES      21 PREDICATES      80 URIs      69 LITERALS      11 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-11-s1-s40 schema:about N1ac2b26c99a84fc1a31462a334f33aff
2 N6e95d419dd42481097ae049d10658369
3 N7dcc74e89df5491aa472b256f6f5d427
4 N9af4c057fe0b4aa4992d3baf293f657b
5 anzsrc-for:01
6 anzsrc-for:0102
7 schema:author Ne956eae208d3416285dcd573d2f28b2f
8 schema:citation sg:pub.10.1007/978-3-540-30219-3_36
9 sg:pub.10.1186/1471-2105-11-s1-s28
10 sg:pub.10.1186/1471-2105-6-s4-s12
11 schema:datePublished 2010-01-18
12 schema:datePublishedReg 2010-01-18
13 schema:description BackgroundExisting hidden Markov model decoding algorithms do not focus on approximately identifying the sequence feature boundaries.ResultsWe give a set of algorithms to compute the conditional probability of all labellings "near" a reference labelling λ for a sequence y for a variety of definitions of "near". In addition, we give optimization algorithms to find the best labelling for a sequence in the robust sense of having all of its feature boundaries nearly correct. Natural problems in this domain are NP-hard to optimize. For membrane proteins, our algorithms find the approximate topology of such proteins with comparable success to existing programs, while being substantially more accurate in estimating the positions of transmembrane helix boundaries.ConclusionMore robust HMM decoding may allow for better analysis of sequence features, in reasonable runtimes.
14 schema:genre article
15 schema:isAccessibleForFree true
16 schema:isPartOf N084c41ad25794d348ed7a6ad5ab4315a
17 Nabca004b92cd440186d1ccbecf9ed299
18 sg:journal.1023786
19 schema:keywords HMM Decoding
20 Hidden Markov Model
21 Markov model
22 NPs
23 ResultsWe
24 addition
25 algorithm
26 analysis
27 approximate topology
28 better analysis
29 better labeling
30 boundaries
31 comparable success
32 conditional probability
33 decoding
34 decoding algorithm
35 definition
36 distance measure
37 domain
38 feature boundaries
39 features
40 helix boundaries
41 labeling
42 measures
43 membrane proteins
44 model
45 natural problem
46 new decoding algorithm
47 optimization algorithm
48 position
49 probability
50 problem
51 program
52 protein
53 reasonable runtime
54 robust sense
55 runtime
56 sense
57 sequence
58 sequence Y
59 sequence features
60 set
61 set of algorithms
62 success
63 such proteins
64 topology
65 variety
66 variety of definitions
67 schema:name New decoding algorithms for Hidden Markov Models using distance measures on labellings
68 schema:pagination s40
69 schema:productId N478e74d20efa453a9785b7d206e0ab31
70 N8f1522ca94e24a959d58a89a1d213a7b
71 Ndf6a00fabe8848cd996f52504ed32cf4
72 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052005567
73 https://doi.org/10.1186/1471-2105-11-s1-s40
74 schema:sdDatePublished 2022-11-24T20:54
75 schema:sdLicense https://scigraph.springernature.com/explorer/license/
76 schema:sdPublisher Nfccbec49231b4a888f29e3f8588c7735
77 schema:url https://doi.org/10.1186/1471-2105-11-s1-s40
78 sgo:license sg:explorer/license/
79 sgo:sdDataset articles
80 rdf:type schema:ScholarlyArticle
81 N084c41ad25794d348ed7a6ad5ab4315a schema:issueNumber Suppl 1
82 rdf:type schema:PublicationIssue
83 N1ac2b26c99a84fc1a31462a334f33aff schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
84 schema:name Algorithms
85 rdf:type schema:DefinedTerm
86 N478e74d20efa453a9785b7d206e0ab31 schema:name doi
87 schema:value 10.1186/1471-2105-11-s1-s40
88 rdf:type schema:PropertyValue
89 N6e95d419dd42481097ae049d10658369 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
90 schema:name Membrane Proteins
91 rdf:type schema:DefinedTerm
92 N7dcc74e89df5491aa472b256f6f5d427 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
93 schema:name Databases, Protein
94 rdf:type schema:DefinedTerm
95 N8f1522ca94e24a959d58a89a1d213a7b schema:name pubmed_id
96 schema:value 20122214
97 rdf:type schema:PropertyValue
98 N9af4c057fe0b4aa4992d3baf293f657b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
99 schema:name Markov Chains
100 rdf:type schema:DefinedTerm
101 Nabca004b92cd440186d1ccbecf9ed299 schema:volumeNumber 11
102 rdf:type schema:PublicationVolume
103 Ndb369ddf391b456bbee32df544a3039c rdf:first sg:person.01320220640.40
104 rdf:rest rdf:nil
105 Ndf6a00fabe8848cd996f52504ed32cf4 schema:name dimensions_id
106 schema:value pub.1052005567
107 rdf:type schema:PropertyValue
108 Ne956eae208d3416285dcd573d2f28b2f rdf:first sg:person.0642727740.54
109 rdf:rest Ndb369ddf391b456bbee32df544a3039c
110 Nfccbec49231b4a888f29e3f8588c7735 schema:name Springer Nature - SN SciGraph project
111 rdf:type schema:Organization
112 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
113 schema:name Mathematical Sciences
114 rdf:type schema:DefinedTerm
115 anzsrc-for:0102 schema:inDefinedTermSet anzsrc-for:
116 schema:name Applied Mathematics
117 rdf:type schema:DefinedTerm
118 sg:journal.1023786 schema:issn 1471-2105
119 schema:name BMC Bioinformatics
120 schema:publisher Springer Nature
121 rdf:type schema:Periodical
122 sg:person.01320220640.40 schema:affiliation grid-institutes:grid.46078.3d
123 schema:familyName Truszkowski
124 schema:givenName Jakub
125 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320220640.40
126 rdf:type schema:Person
127 sg:person.0642727740.54 schema:affiliation grid-institutes:grid.46078.3d
128 schema:familyName Brown
129 schema:givenName Daniel G
130 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642727740.54
131 rdf:type schema:Person
132 sg:pub.10.1007/978-3-540-30219-3_36 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000399524
133 https://doi.org/10.1007/978-3-540-30219-3_36
134 rdf:type schema:CreativeWork
135 sg:pub.10.1186/1471-2105-11-s1-s28 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031992141
136 https://doi.org/10.1186/1471-2105-11-s1-s28
137 rdf:type schema:CreativeWork
138 sg:pub.10.1186/1471-2105-6-s4-s12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039318188
139 https://doi.org/10.1186/1471-2105-6-s4-s12
140 rdf:type schema:CreativeWork
141 grid-institutes:grid.46078.3d schema:alternateName David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
142 schema:name David R. Cheriton School of Computer Science, University of Waterloo, N2L 3G1, Waterloo, ON, Canada
143 rdf:type schema:Organization
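Summary counts like the ones shown above the table can be approximated from an N-Triples download. The sketch below is a rough illustration only: the four sample lines are hand-written from rows of the table, they assume that `schema:` expands to `http://schema.org/` and `sg:` to `http://scigraph.springernature.com/`, and the blank node label `_:b0` and the function name `tally` are invented for the example. It also assumes one triple per line, and it may not count literals or URIs the same way the page does.

```python
# Hand-written sample in N-Triples syntax (assumed prefix expansions,
# invented blank node label); not the actual download for this record.
SAMPLE = """\
<http://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40> <http://schema.org/genre> "article" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40> <http://schema.org/pagination> "s40" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-11-s1-s40> <http://schema.org/isPartOf> _:b0 .
_:b0 <http://schema.org/volumeNumber> "11" .
"""


def tally(ntriples):
    """Count triples, distinct predicates, literal objects, and blank nodes."""
    triples = 0
    predicates = set()
    literals = 0
    blank_nodes = set()
    for line in ntriples.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain spaces, so two splits suffice.
        subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the terminating " ."
        triples += 1
        predicates.add(predicate)
        if obj.startswith('"'):
            literals += 1
        for term in (subject, obj):
            if term.startswith("_:"):
                blank_nodes.add(term)
    return {
        "triples": triples,
        "predicates": len(predicates),
        "literals": literals,
        "blank_nodes": len(blank_nodes),
    }


counts = tally(SAMPLE)
print(counts)
```

For anything beyond a quick count, a full RDF parser is the safer choice, since this line splitter does not validate IRIs or handle escapes inside literals.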
 



