On the k-Closest Substring and k-Consensus Pattern Problems View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2004

AUTHORS

Yishan Jiao , Jingyi Xu , Ming Li

ABSTRACT

Given a set S ={s1,s2,...,sn } of strings each of length m, and an integer L, we study the following two problems. k-Closest Substring problem: find k center strings c1,c2 ,...,ck of length L minimizing d such that for each sj ∈ S, there is a length-L substring tj (closest substring) of sj with min 1 ≤ i ≤ kd(ci,tj) ≤ d. We give a PTAS for this problem, for k=O(1). k-Consensus Pattern problem: find k median strings c1,c2 ,...,ck of length L and a substring tj (consensus pattern) of length L from each sj minimizing the total cost . We give a PTAS for this problem, for k=O(1). Our results improve recent results of [10] and [16] both of which depended on the random linear transformation technique in [16]. As for general k case, we give an alternative and direct proof of the NP-hardness of (2-ε)-approximation of the Hamming radius k-clustering problem, a special case of the k-Closest Substring problem restricted to L=m. More... »

PAGES

130-144

Book

TITLE

Combinatorial Pattern Matching

ISBN

978-3-540-22341-2
978-3-540-27801-6

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-27801-6_10

DOI

http://dx.doi.org/10.1007/978-3-540-27801-6_10

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1042631196


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0802", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computation Theory and Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Institute Of Computing Technology", 
          "id": "https://www.grid.ac/institutes/grid.424936.e", 
          "name": [
            "Bioinformatics lab, Institute of Computing Technology, Chinese Academy of Sciences, 6#, South Road, Kexueyuan, Zhongguancun, Beijing, P.R.China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Jiao", 
        "givenName": "Yishan", 
        "id": "sg:person.015365622527.91", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015365622527.91"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute Of Computing Technology", 
          "id": "https://www.grid.ac/institutes/grid.424936.e", 
          "name": [
            "Bioinformatics lab, Institute of Computing Technology, Chinese Academy of Sciences, 6#, South Road, Kexueyuan, Zhongguancun, Beijing, P.R.China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Xu", 
        "givenName": "Jingyi", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Waterloo", 
          "id": "https://www.grid.ac/institutes/grid.46078.3d", 
          "name": [
            "University of Waterloo"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Ming", 
        "id": "sg:person.0621576316.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621576316.79"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1145/301250.301376", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002920963"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/276698.276876", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004824658"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/62212.62255", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006272306"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-45123-4_10", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007323304", 
          "https://doi.org/10.1007/3-540-45123-4_10"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-45123-4_10", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007323304", 
          "https://doi.org/10.1007/3-540-45123-4_10"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/179812.179818", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007410209"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0304-3975(85)90224-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010675302"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0304-3975(85)90224-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010675302"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-45123-4_11", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017652076", 
          "https://doi.org/10.1007/3-540-45123-4_11"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-45123-4_11", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017652076", 
          "https://doi.org/10.1007/3-540-45123-4_11"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0020-0190(81)90111-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023699298"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/17.5.419", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027200294"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/506147.506150", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044001636"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/506147.506149", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051646016"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/0132071", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062839609"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1017/cbo9780511814075", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1098701235"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2004", 
    "datePublishedReg": "2004-01-01", 
    "description": "Given a set S ={s1,s2,...,sn } of strings each of length m, and an integer L, we study the following two problems. k-Closest Substring problem: find k center strings c1,c2 ,...,ck of length L minimizing d such that for each sj \u2208 S, there is a length-L substring tj (closest substring) of sj with min 1 \u2264 i \u2264 kd(ci,tj) \u2264 d. We give a PTAS for this problem, for k=O(1). k-Consensus Pattern problem: find k median strings c1,c2 ,...,ck of length L and a substring tj (consensus pattern) of length L from each sj minimizing the total cost . We give a PTAS for this problem, for k=O(1). Our results improve recent results of [10] and [16] both of which depended on the random linear transformation technique in [16]. As for general k case, we give an alternative and direct proof of the NP-hardness of (2-\u03b5)-approximation of the Hamming radius k-clustering problem, a special case of the k-Closest Substring problem restricted to L=m.", 
    "editor": [
      {
        "familyName": "Sahinalp", 
        "givenName": "Suleyman Cenk", 
        "type": "Person"
      }, 
      {
        "familyName": "Muthukrishnan", 
        "givenName": "S.", 
        "type": "Person"
      }, 
      {
        "familyName": "Dogrusoz", 
        "givenName": "Ugur", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-27801-6_10", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-22341-2", 
        "978-3-540-27801-6"
      ], 
      "name": "Combinatorial Pattern Matching", 
      "type": "Book"
    }, 
    "name": "On the k-Closest Substring and k-Consensus Pattern Problems", 
    "pagination": "130-144", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1042631196"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-27801-6_10"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "cb1208bd0f07217026c43a7f8e142a026190ba6d4bb571e4393517b308fb512a"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-27801-6_10", 
      "https://app.dimensions.ai/details/publication/pub.1042631196"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T08:18", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000362_0000000362/records_87097_00000001.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F978-3-540-27801-6_10"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-27801-6_10'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-27801-6_10'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-27801-6_10'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-27801-6_10'


 

This table displays all metadata directly associated to this object as RDF triples.

132 TRIPLES      23 PREDICATES      40 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-27801-6_10 schema:about anzsrc-for:08
2 anzsrc-for:0802
3 schema:author N3c34d3f17cce418eaefaacc86f548f18
4 schema:citation sg:pub.10.1007/3-540-45123-4_10
5 sg:pub.10.1007/3-540-45123-4_11
6 https://doi.org/10.1016/0020-0190(81)90111-3
7 https://doi.org/10.1016/0304-3975(85)90224-5
8 https://doi.org/10.1017/cbo9780511814075
9 https://doi.org/10.1093/bioinformatics/17.5.419
10 https://doi.org/10.1137/0132071
11 https://doi.org/10.1145/179812.179818
12 https://doi.org/10.1145/276698.276876
13 https://doi.org/10.1145/301250.301376
14 https://doi.org/10.1145/506147.506149
15 https://doi.org/10.1145/506147.506150
16 https://doi.org/10.1145/62212.62255
17 schema:datePublished 2004
18 schema:datePublishedReg 2004-01-01
19 schema:description Given a set S ={s1,s2,...,sn } of strings each of length m, and an integer L, we study the following two problems. k-Closest Substring problem: find k center strings c1,c2 ,...,ck of length L minimizing d such that for each sj ∈ S, there is a length-L substring tj (closest substring) of sj with min 1 ≤ i ≤ kd(ci,tj) ≤ d. We give a PTAS for this problem, for k=O(1). k-Consensus Pattern problem: find k median strings c1,c2 ,...,ck of length L and a substring tj (consensus pattern) of length L from each sj minimizing the total cost . We give a PTAS for this problem, for k=O(1). Our results improve recent results of [10] and [16] both of which depended on the random linear transformation technique in [16]. As for general k case, we give an alternative and direct proof of the NP-hardness of (2-ε)-approximation of the Hamming radius k-clustering problem, a special case of the k-Closest Substring problem restricted to L=m.
20 schema:editor N27bfde49c3114d8c960c177e99b21a4e
21 schema:genre chapter
22 schema:inLanguage en
23 schema:isAccessibleForFree true
24 schema:isPartOf Nacdf5fd432944223b60986772b3a4131
25 schema:name On the k-Closest Substring and k-Consensus Pattern Problems
26 schema:pagination 130-144
27 schema:productId N7419e6c68d8e4b06b36926c4fef9e805
28 Na0f18bbf7ad94121b33921307ae74648
29 Ne88e32bfda9f4ec6b072fc65a4bd2178
30 schema:publisher Naa058e2b7d4241c788bcee098fc1d0f0
31 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042631196
32 https://doi.org/10.1007/978-3-540-27801-6_10
33 schema:sdDatePublished 2019-04-16T08:18
34 schema:sdLicense https://scigraph.springernature.com/explorer/license/
35 schema:sdPublisher N092b41144be2452b911bde7121d1b684
36 schema:url https://link.springer.com/10.1007%2F978-3-540-27801-6_10
37 sgo:license sg:explorer/license/
38 sgo:sdDataset chapters
39 rdf:type schema:Chapter
40 N092b41144be2452b911bde7121d1b684 schema:name Springer Nature - SN SciGraph project
41 rdf:type schema:Organization
42 N0f6012c6038242a584ba3c28e1b55b34 schema:familyName Muthukrishnan
43 schema:givenName S.
44 rdf:type schema:Person
45 N27bfde49c3114d8c960c177e99b21a4e rdf:first N64ebd52b653a4d28ba8a4845fd9e6a99
46 rdf:rest Nc4492b1e95ad48009fce8ce71cb1036a
47 N3c34d3f17cce418eaefaacc86f548f18 rdf:first sg:person.015365622527.91
48 rdf:rest N9014cd76de2540d5912138268f2f875f
49 N64ebd52b653a4d28ba8a4845fd9e6a99 schema:familyName Sahinalp
50 schema:givenName Suleyman Cenk
51 rdf:type schema:Person
52 N7419e6c68d8e4b06b36926c4fef9e805 schema:name dimensions_id
53 schema:value pub.1042631196
54 rdf:type schema:PropertyValue
55 N74207a97821540479fd0125c4bee6179 schema:familyName Dogrusoz
56 schema:givenName Ugur
57 rdf:type schema:Person
58 N860736d763e5493fa17293b2b3eebf96 rdf:first sg:person.0621576316.79
59 rdf:rest rdf:nil
60 N9014cd76de2540d5912138268f2f875f rdf:first Nbe8e49dcc7c64dfa86d77b00463f6071
61 rdf:rest N860736d763e5493fa17293b2b3eebf96
62 Na0f18bbf7ad94121b33921307ae74648 schema:name doi
63 schema:value 10.1007/978-3-540-27801-6_10
64 rdf:type schema:PropertyValue
65 Naa058e2b7d4241c788bcee098fc1d0f0 schema:location Berlin, Heidelberg
66 schema:name Springer Berlin Heidelberg
67 rdf:type schema:Organisation
68 Nacdf5fd432944223b60986772b3a4131 schema:isbn 978-3-540-22341-2
69 978-3-540-27801-6
70 schema:name Combinatorial Pattern Matching
71 rdf:type schema:Book
72 Nbe8e49dcc7c64dfa86d77b00463f6071 schema:affiliation https://www.grid.ac/institutes/grid.424936.e
73 schema:familyName Xu
74 schema:givenName Jingyi
75 rdf:type schema:Person
76 Nc4492b1e95ad48009fce8ce71cb1036a rdf:first N0f6012c6038242a584ba3c28e1b55b34
77 rdf:rest Nd3839830e3aa4e6e96027ec9ac33c555
78 Nd3839830e3aa4e6e96027ec9ac33c555 rdf:first N74207a97821540479fd0125c4bee6179
79 rdf:rest rdf:nil
80 Ne88e32bfda9f4ec6b072fc65a4bd2178 schema:name readcube_id
81 schema:value cb1208bd0f07217026c43a7f8e142a026190ba6d4bb571e4393517b308fb512a
82 rdf:type schema:PropertyValue
83 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
84 schema:name Information and Computing Sciences
85 rdf:type schema:DefinedTerm
86 anzsrc-for:0802 schema:inDefinedTermSet anzsrc-for:
87 schema:name Computation Theory and Mathematics
88 rdf:type schema:DefinedTerm
89 sg:person.015365622527.91 schema:affiliation https://www.grid.ac/institutes/grid.424936.e
90 schema:familyName Jiao
91 schema:givenName Yishan
92 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015365622527.91
93 rdf:type schema:Person
94 sg:person.0621576316.79 schema:affiliation https://www.grid.ac/institutes/grid.46078.3d
95 schema:familyName Li
96 schema:givenName Ming
97 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621576316.79
98 rdf:type schema:Person
99 sg:pub.10.1007/3-540-45123-4_10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007323304
100 https://doi.org/10.1007/3-540-45123-4_10
101 rdf:type schema:CreativeWork
102 sg:pub.10.1007/3-540-45123-4_11 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017652076
103 https://doi.org/10.1007/3-540-45123-4_11
104 rdf:type schema:CreativeWork
105 https://doi.org/10.1016/0020-0190(81)90111-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023699298
106 rdf:type schema:CreativeWork
107 https://doi.org/10.1016/0304-3975(85)90224-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010675302
108 rdf:type schema:CreativeWork
109 https://doi.org/10.1017/cbo9780511814075 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098701235
110 rdf:type schema:CreativeWork
111 https://doi.org/10.1093/bioinformatics/17.5.419 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027200294
112 rdf:type schema:CreativeWork
113 https://doi.org/10.1137/0132071 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062839609
114 rdf:type schema:CreativeWork
115 https://doi.org/10.1145/179812.179818 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007410209
116 rdf:type schema:CreativeWork
117 https://doi.org/10.1145/276698.276876 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004824658
118 rdf:type schema:CreativeWork
119 https://doi.org/10.1145/301250.301376 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002920963
120 rdf:type schema:CreativeWork
121 https://doi.org/10.1145/506147.506149 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051646016
122 rdf:type schema:CreativeWork
123 https://doi.org/10.1145/506147.506150 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044001636
124 rdf:type schema:CreativeWork
125 https://doi.org/10.1145/62212.62255 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006272306
126 rdf:type schema:CreativeWork
127 https://www.grid.ac/institutes/grid.424936.e schema:alternateName Institute Of Computing Technology
128 schema:name Bioinformatics lab, Institute of Computing Technology, Chinese Academy of Sciences, 6#, South Road, Kexueyuan, Zhongguancun, Beijing, P.R.China
129 rdf:type schema:Organization
130 https://www.grid.ac/institutes/grid.46078.3d schema:alternateName University of Waterloo
131 schema:name University of Waterloo
132 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...