Regular Expression Constrained Sequence Alignment View Full Text


Ontology type: schema:Chapter     


Chapter Info

DATE

2005

AUTHORS

Abdullah N. Arslan

ABSTRACT

Given strings S1, S2, and a regular expression R, we introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between S1 and S2 over all alignments such that in these alignments there exists a segment where some substring s1 of S1 is aligned with some substring s2 of S2, and both s1 and s2 match R, i.e. s1,s2 ∈ L(R) where L(R) is the regular language described by R. A motivation for the problem is that protein sequences can be aligned in a way that known motifs guide the alignments. We present an O(nmr) time algorithm for the regular expression constrained sequence alignment problem where n, and m are the lengths of S1, and S2, respectively, and r is in the order of the size of the transition function of a finite automaton M that we create from a nondeterministic finite automaton N accepting L(R). M contains O(t2) states if N has t states. More... »

PAGES

322-333

References to SciGraph publications

Book

TITLE

Combinatorial Pattern Matching

ISBN

978-3-540-26201-5
978-3-540-31562-9

Author Affiliations

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/11496656_28

DOI

http://dx.doi.org/10.1007/11496656_28

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1000307633


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0802", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computation Theory and Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Vermont", 
          "id": "https://www.grid.ac/institutes/grid.59062.38", 
          "name": [
            "Department of Computer Science, The University of Vermont, 05405, Burlington, VT, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Arslan", 
        "givenName": "Abdullah N.", 
        "id": "sg:person.01031672451.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01031672451.96"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1016/j.ipl.2004.02.008", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001651261"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bth220", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007615319"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0959-440x(96)80057-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042163926"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0097-8485(02)00005-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046266378"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ipl.2003.07.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048731588"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ipl.2003.07.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048731588"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/30.1.235", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048836665"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.7280687", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062645221"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/csb.2002.1039336", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1077040257"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/csb.2003.1227334", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1077183472"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/j.1460-2075.1982.tb01276.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1081682169"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4899-6846-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1109706103", 
          "https://doi.org/10.1007/978-1-4899-6846-3"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4899-6846-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1109706103", 
          "https://doi.org/10.1007/978-1-4899-6846-3"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2005", 
    "datePublishedReg": "2005-01-01", 
    "description": "Given strings S1, S2, and a regular expression R, we introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between S1 and S2 over all alignments such that in these alignments there exists a segment where some substring s1 of S1 is aligned with some substring s2 of S2, and both s1 and s2 match R, i.e. s1,s2 \u2208 L(R) where L(R) is the regular language described by R. A motivation for the problem is that protein sequences can be aligned in a way that known motifs guide the alignments. We present an O(nmr) time algorithm for the regular expression constrained sequence alignment problem where n, and m are the lengths of S1, and S2, respectively, and r is in the order of the size of the transition function of a finite automaton M that we create from a nondeterministic finite automaton N accepting L(R). M contains O(t2) states if N has t states.", 
    "editor": [
      {
        "familyName": "Apostolico", 
        "givenName": "Alberto", 
        "type": "Person"
      }, 
      {
        "familyName": "Crochemore", 
        "givenName": "Maxime", 
        "type": "Person"
      }, 
      {
        "familyName": "Park", 
        "givenName": "Kunsoo", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/11496656_28", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-540-26201-5", 
        "978-3-540-31562-9"
      ], 
      "name": "Combinatorial Pattern Matching", 
      "type": "Book"
    }, 
    "name": "Regular Expression Constrained Sequence Alignment", 
    "pagination": "322-333", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1000307633"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/11496656_28"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "01765e44bd8352b6271584b6481e08e583d911d279dc3be8a05a253278a653ad"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/11496656_28", 
      "https://app.dimensions.ai/details/publication/pub.1000307633"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T08:26", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000363_0000000363/records_70058_00000000.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F11496656_28"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11496656_28'


 

This table displays all metadata directly associated to this object as RDF triples.

109 TRIPLES      23 PREDICATES      38 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/11496656_28 schema:about anzsrc-for:08
2 anzsrc-for:0802
3 schema:author Nc69cfe1e865e4c1c80be6f4e0cc55dfd
4 schema:citation sg:pub.10.1007/978-1-4899-6846-3
5 https://doi.org/10.1002/j.1460-2075.1982.tb01276.x
6 https://doi.org/10.1016/j.ipl.2003.07.001
7 https://doi.org/10.1016/j.ipl.2004.02.008
8 https://doi.org/10.1016/s0097-8485(02)00005-0
9 https://doi.org/10.1016/s0959-440x(96)80057-1
10 https://doi.org/10.1093/bioinformatics/bth220
11 https://doi.org/10.1093/nar/30.1.235
12 https://doi.org/10.1109/csb.2002.1039336
13 https://doi.org/10.1109/csb.2003.1227334
14 https://doi.org/10.1126/science.7280687
15 schema:datePublished 2005
16 schema:datePublishedReg 2005-01-01
17 schema:description Given strings S1, S2, and a regular expression R, we introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between S1 and S2 over all alignments such that in these alignments there exists a segment where some substring s1 of S1 is aligned with some substring s2 of S2, and both s1 and s2 match R, i.e. s1,s2 ∈ L(R) where L(R) is the regular language described by R. A motivation for the problem is that protein sequences can be aligned in a way that known motifs guide the alignments. We present an O(nmr) time algorithm for the regular expression constrained sequence alignment problem where n, and m are the lengths of S1, and S2, respectively, and r is in the order of the size of the transition function of a finite automaton M that we create from a nondeterministic finite automaton N accepting L(R). M contains O(t2) states if N has t states.
18 schema:editor N2403db6478d34160b8f300ddb262c6e1
19 schema:genre chapter
20 schema:inLanguage en
21 schema:isAccessibleForFree false
22 schema:isPartOf Na0e466d56e824d808032c26a4d7e911a
23 schema:name Regular Expression Constrained Sequence Alignment
24 schema:pagination 322-333
25 schema:productId N092a3f19969b4bbbb627c04ef12e02af
26 N24321dc1ba784ae6849604d1a44822f7
27 N673ce21c7c1947cc896624aa36aedd71
28 schema:publisher N513c3ae5a055499bacf0625dcb27a334
29 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000307633
30 https://doi.org/10.1007/11496656_28
31 schema:sdDatePublished 2019-04-16T08:26
32 schema:sdLicense https://scigraph.springernature.com/explorer/license/
33 schema:sdPublisher N91a7b3622a054221b5dc9dfefbd44294
34 schema:url https://link.springer.com/10.1007%2F11496656_28
35 sgo:license sg:explorer/license/
36 sgo:sdDataset chapters
37 rdf:type schema:Chapter
38 N092a3f19969b4bbbb627c04ef12e02af schema:name doi
39 schema:value 10.1007/11496656_28
40 rdf:type schema:PropertyValue
41 N0d492a1ee332445bbaef31f8fccb4f35 rdf:first N193c8b91ff1e46a1a313f02854daea21
42 rdf:rest rdf:nil
43 N193c8b91ff1e46a1a313f02854daea21 schema:familyName Park
44 schema:givenName Kunsoo
45 rdf:type schema:Person
46 N2403db6478d34160b8f300ddb262c6e1 rdf:first Nac49f11c10694188be9f2f9ebbcfbb8f
47 rdf:rest N4c9991c57873477bb69e7d8204b70db8
48 N24321dc1ba784ae6849604d1a44822f7 schema:name readcube_id
49 schema:value 01765e44bd8352b6271584b6481e08e583d911d279dc3be8a05a253278a653ad
50 rdf:type schema:PropertyValue
51 N4c9991c57873477bb69e7d8204b70db8 rdf:first Nb34318bf0b9047db93632e411ddf1667
52 rdf:rest N0d492a1ee332445bbaef31f8fccb4f35
53 N513c3ae5a055499bacf0625dcb27a334 schema:location Berlin, Heidelberg
54 schema:name Springer Berlin Heidelberg
55 rdf:type schema:Organisation
56 N673ce21c7c1947cc896624aa36aedd71 schema:name dimensions_id
57 schema:value pub.1000307633
58 rdf:type schema:PropertyValue
59 N91a7b3622a054221b5dc9dfefbd44294 schema:name Springer Nature - SN SciGraph project
60 rdf:type schema:Organization
61 Na0e466d56e824d808032c26a4d7e911a schema:isbn 978-3-540-26201-5
62 978-3-540-31562-9
63 schema:name Combinatorial Pattern Matching
64 rdf:type schema:Book
65 Nac49f11c10694188be9f2f9ebbcfbb8f schema:familyName Apostolico
66 schema:givenName Alberto
67 rdf:type schema:Person
68 Nb34318bf0b9047db93632e411ddf1667 schema:familyName Crochemore
69 schema:givenName Maxime
70 rdf:type schema:Person
71 Nc69cfe1e865e4c1c80be6f4e0cc55dfd rdf:first sg:person.01031672451.96
72 rdf:rest rdf:nil
73 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
74 schema:name Information and Computing Sciences
75 rdf:type schema:DefinedTerm
76 anzsrc-for:0802 schema:inDefinedTermSet anzsrc-for:
77 schema:name Computation Theory and Mathematics
78 rdf:type schema:DefinedTerm
79 sg:person.01031672451.96 schema:affiliation https://www.grid.ac/institutes/grid.59062.38
80 schema:familyName Arslan
81 schema:givenName Abdullah N.
82 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01031672451.96
83 rdf:type schema:Person
84 sg:pub.10.1007/978-1-4899-6846-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109706103
85 https://doi.org/10.1007/978-1-4899-6846-3
86 rdf:type schema:CreativeWork
87 https://doi.org/10.1002/j.1460-2075.1982.tb01276.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1081682169
88 rdf:type schema:CreativeWork
89 https://doi.org/10.1016/j.ipl.2003.07.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048731588
90 rdf:type schema:CreativeWork
91 https://doi.org/10.1016/j.ipl.2004.02.008 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001651261
92 rdf:type schema:CreativeWork
93 https://doi.org/10.1016/s0097-8485(02)00005-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046266378
94 rdf:type schema:CreativeWork
95 https://doi.org/10.1016/s0959-440x(96)80057-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042163926
96 rdf:type schema:CreativeWork
97 https://doi.org/10.1093/bioinformatics/bth220 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007615319
98 rdf:type schema:CreativeWork
99 https://doi.org/10.1093/nar/30.1.235 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048836665
100 rdf:type schema:CreativeWork
101 https://doi.org/10.1109/csb.2002.1039336 schema:sameAs https://app.dimensions.ai/details/publication/pub.1077040257
102 rdf:type schema:CreativeWork
103 https://doi.org/10.1109/csb.2003.1227334 schema:sameAs https://app.dimensions.ai/details/publication/pub.1077183472
104 rdf:type schema:CreativeWork
105 https://doi.org/10.1126/science.7280687 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062645221
106 rdf:type schema:CreativeWork
107 https://www.grid.ac/institutes/grid.59062.38 schema:alternateName University of Vermont
108 schema:name Department of Computer Science, The University of Vermont, 05405, Burlington, VT, USA
109 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...