Regular Expression Constrained Sequence Alignment View Full Text


Ontology type: schema:Chapter     


Chapter Info

DATE

2005

AUTHORS

Abdullah N. Arslan

ABSTRACT

Given strings S1, S2, and a regular expression R, we introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between S1 and S2 over all alignments such that in these alignments there exists a segment where some substring s1 of S1 is aligned with some substring s2 of S2, and both s1 and s2 match R, i.e. s1,s2 ∈ L(R) where L(R) is the regular language described by R. A motivation for the problem is that protein sequences can be aligned in a way that known motifs guide the alignments. We present an O(nmr) time algorithm for the regular expression constrained sequence alignment problem where n, and m are the lengths of S1, and S2, respectively, and r is in the order of the size of the transition function of a finite automaton M that we create from a nondeterministic finite automaton N accepting L(R). M contains O(t2) states if N has t states. More... »

PAGES

322-333

References to SciGraph publications

Book

TITLE

Combinatorial Pattern Matching

ISBN

978-3-540-26201-5
978-3-540-31562-9

Author Affiliations

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/11496656_28

DOI

http://dx.doi.org/10.1007/11496656_28

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1000307633


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0802", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computation Theory and Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Vermont", 
          "id": "https://www.grid.ac/institutes/grid.59062.38", 
          "name": [
            "Department of Computer Science, The University of Vermont, 05405, Burlington, VT, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Arslan", 
        "givenName": "Abdullah N.", 
        "id": "sg:person.01031672451.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01031672451.96"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1016/j.ipl.2004.02.008", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001651261"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bth220", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007615319"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0959-440x(96)80057-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042163926"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0097-8485(02)00005-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046266378"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ipl.2003.07.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048731588"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ipl.2003.07.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048731588"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/30.1.235", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048836665"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.7280687", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062645221"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/csb.2002.1039336", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1077040257"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/csb.2003.1227334", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1077183472"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/j.1460-2075.1982.tb01276.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1081682169"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4899-6846-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1109706103", 
          "https://doi.org/10.1007/978-1-4899-6846-3"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4899-6846-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1109706103", 
          "https://doi.org/10.1007/978-1-4899-6846-3"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2005", 
    "datePublishedReg": "2005-01-01", 
    "description": "Given strings S1, S2, and a regular expression R, we introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between S1 and S2 over all alignments such that in these alignments there exists a segment where some substring s1 of S1 is aligned with some substring s2 of S2, and both s1 and s2 match R, i.e. s1,s2 \u2208 L(R) where L(R) is the regular language described by R. A motivation for the problem is that protein sequences can be aligned in a way that known motifs guide the alignments. We present an O(nmr) time algorithm for the regular expression constrained sequence alignment problem where n, and m are the lengths of S1, and S2, respectively, and r is in the order of the size of the transition function of a finite automaton M that we create from a nondeterministic finite automaton N accepting L(R). M contains O(t2) states if N has t states.", 
    "editor": [
      {
        "familyName": "Apostolico", 
        "givenName": "Alberto", 
        "type": "Person"
      }, 
      {
        "familyName": "Crochemore", 
        "givenName": "Maxime", 
        "type": "Person"
      }, 
      {
        "familyName": "Park", 
        "givenName": "Kunsoo", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/11496656_28", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-540-26201-5", 
        "978-3-540-31562-9"
      ], 
      "name": "Combinatorial Pattern Matching", 
      "type": "Book"
    }, 
    "name": "Regular Expression Constrained Sequence Alignment", 
    "pagination": "322-333", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1000307633"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/11496656_28"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "01765e44bd8352b6271584b6481e08e583d911d279dc3be8a05a253278a653ad"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/11496656_28", 
      "https://app.dimensions.ai/details/publication/pub.1000307633"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T08:26", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000363_0000000363/records_70058_00000000.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F11496656_28"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'


 

This table displays all metadata directly associated to this object as RDF triples.

109 TRIPLES      23 PREDICATES      38 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/11496656_28 schema:about anzsrc-for:08
2 anzsrc-for:0802
3 schema:author N790055bcd4904b608aad19e1197d8a0d
4 schema:citation sg:pub.10.1007/978-1-4899-6846-3
5 https://doi.org/10.1002/j.1460-2075.1982.tb01276.x
6 https://doi.org/10.1016/j.ipl.2003.07.001
7 https://doi.org/10.1016/j.ipl.2004.02.008
8 https://doi.org/10.1016/s0097-8485(02)00005-0
9 https://doi.org/10.1016/s0959-440x(96)80057-1
10 https://doi.org/10.1093/bioinformatics/bth220
11 https://doi.org/10.1093/nar/30.1.235
12 https://doi.org/10.1109/csb.2002.1039336
13 https://doi.org/10.1109/csb.2003.1227334
14 https://doi.org/10.1126/science.7280687
15 schema:datePublished 2005
16 schema:datePublishedReg 2005-01-01
17 schema:description Given strings S1, S2, and a regular expression R, we introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between S1 and S2 over all alignments such that in these alignments there exists a segment where some substring s1 of S1 is aligned with some substring s2 of S2, and both s1 and s2 match R, i.e. s1,s2 ∈ L(R) where L(R) is the regular language described by R. A motivation for the problem is that protein sequences can be aligned in a way that known motifs guide the alignments. We present an O(nmr) time algorithm for the regular expression constrained sequence alignment problem where n, and m are the lengths of S1, and S2, respectively, and r is in the order of the size of the transition function of a finite automaton M that we create from a nondeterministic finite automaton N accepting L(R). M contains O(t2) states if N has t states.
18 schema:editor Ne8ce9e76d0574352bab7eb1464c7f3da
19 schema:genre chapter
20 schema:inLanguage en
21 schema:isAccessibleForFree false
22 schema:isPartOf Nec68866505fd44f9993e6ecaad203ca2
23 schema:name Regular Expression Constrained Sequence Alignment
24 schema:pagination 322-333
25 schema:productId Nb4e093546d6d4a289c78a81dc6e6662e
26 Nc489799747d84157916e383d75a96087
27 Nd2896fefd2a84cfbb43a8cf459a16cae
28 schema:publisher N1356344900cb4756a5d33b15fd9625be
29 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000307633
30 https://doi.org/10.1007/11496656_28
31 schema:sdDatePublished 2019-04-16T08:26
32 schema:sdLicense https://scigraph.springernature.com/explorer/license/
33 schema:sdPublisher N28a280043a0843e3a40bb623602d0c83
34 schema:url https://link.springer.com/10.1007%2F11496656_28
35 sgo:license sg:explorer/license/
36 sgo:sdDataset chapters
37 rdf:type schema:Chapter
38 N00ac40e657dc4caaab0a3fbfb79691ea rdf:first N0d72423826514856a0ea8293335219b9
39 rdf:rest rdf:nil
40 N0d72423826514856a0ea8293335219b9 schema:familyName Park
41 schema:givenName Kunsoo
42 rdf:type schema:Person
43 N1356344900cb4756a5d33b15fd9625be schema:location Berlin, Heidelberg
44 schema:name Springer Berlin Heidelberg
45 rdf:type schema:Organisation
46 N23d68539ba534a0fbd6e821d5956325d rdf:first N3e0581d2cd4a4342aeb8c58fc380f998
47 rdf:rest N00ac40e657dc4caaab0a3fbfb79691ea
48 N28a280043a0843e3a40bb623602d0c83 schema:name Springer Nature - SN SciGraph project
49 rdf:type schema:Organization
50 N3e0581d2cd4a4342aeb8c58fc380f998 schema:familyName Crochemore
51 schema:givenName Maxime
52 rdf:type schema:Person
53 N790055bcd4904b608aad19e1197d8a0d rdf:first sg:person.01031672451.96
54 rdf:rest rdf:nil
55 N9fad9f2d03a743cfa0de3a0a2ac9c0bd schema:familyName Apostolico
56 schema:givenName Alberto
57 rdf:type schema:Person
58 Nb4e093546d6d4a289c78a81dc6e6662e schema:name dimensions_id
59 schema:value pub.1000307633
60 rdf:type schema:PropertyValue
61 Nc489799747d84157916e383d75a96087 schema:name doi
62 schema:value 10.1007/11496656_28
63 rdf:type schema:PropertyValue
64 Nd2896fefd2a84cfbb43a8cf459a16cae schema:name readcube_id
65 schema:value 01765e44bd8352b6271584b6481e08e583d911d279dc3be8a05a253278a653ad
66 rdf:type schema:PropertyValue
67 Ne8ce9e76d0574352bab7eb1464c7f3da rdf:first N9fad9f2d03a743cfa0de3a0a2ac9c0bd
68 rdf:rest N23d68539ba534a0fbd6e821d5956325d
69 Nec68866505fd44f9993e6ecaad203ca2 schema:isbn 978-3-540-26201-5
70 978-3-540-31562-9
71 schema:name Combinatorial Pattern Matching
72 rdf:type schema:Book
73 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
74 schema:name Information and Computing Sciences
75 rdf:type schema:DefinedTerm
76 anzsrc-for:0802 schema:inDefinedTermSet anzsrc-for:
77 schema:name Computation Theory and Mathematics
78 rdf:type schema:DefinedTerm
79 sg:person.01031672451.96 schema:affiliation https://www.grid.ac/institutes/grid.59062.38
80 schema:familyName Arslan
81 schema:givenName Abdullah N.
82 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01031672451.96
83 rdf:type schema:Person
84 sg:pub.10.1007/978-1-4899-6846-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109706103
85 https://doi.org/10.1007/978-1-4899-6846-3
86 rdf:type schema:CreativeWork
87 https://doi.org/10.1002/j.1460-2075.1982.tb01276.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1081682169
88 rdf:type schema:CreativeWork
89 https://doi.org/10.1016/j.ipl.2003.07.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048731588
90 rdf:type schema:CreativeWork
91 https://doi.org/10.1016/j.ipl.2004.02.008 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001651261
92 rdf:type schema:CreativeWork
93 https://doi.org/10.1016/s0097-8485(02)00005-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046266378
94 rdf:type schema:CreativeWork
95 https://doi.org/10.1016/s0959-440x(96)80057-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042163926
96 rdf:type schema:CreativeWork
97 https://doi.org/10.1093/bioinformatics/bth220 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007615319
98 rdf:type schema:CreativeWork
99 https://doi.org/10.1093/nar/30.1.235 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048836665
100 rdf:type schema:CreativeWork
101 https://doi.org/10.1109/csb.2002.1039336 schema:sameAs https://app.dimensions.ai/details/publication/pub.1077040257
102 rdf:type schema:CreativeWork
103 https://doi.org/10.1109/csb.2003.1227334 schema:sameAs https://app.dimensions.ai/details/publication/pub.1077183472
104 rdf:type schema:CreativeWork
105 https://doi.org/10.1126/science.7280687 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062645221
106 rdf:type schema:CreativeWork
107 https://www.grid.ac/institutes/grid.59062.38 schema:alternateName University of Vermont
108 schema:name Department of Computer Science, The University of Vermont, 05405, Burlington, VT, USA
109 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...