# Designing and Implementing Algorithms for the Closest String Problem

Ontology type: schema:Chapter

### Chapter Info

DATE

2017-05-23

AUTHORS ABSTRACT

Given a set of n strings of length L and a radius d, the closest string problem (CSP for short) asks for a string \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$t_{sol}$$\end{document} that is within a Hamming distance of d to each of the given strings. It is known that the problem is NP-hard and its optimization version admits a polynomial time approximation scheme (PTAS). A number of parameterized algorithms have been then developed to solve the problem when d is small. Among them, the relatively new ones have not been implemented before and their performance in practice was unknown. In this study, we implement all of them by careful engineering. For those that have been implemented before, our implementation is much faster. For some of those that have not been implemented before, our experimental results show that there exist huge gaps between their theoretical and practical performances. We also design a new parameterized algorithm for the binary case of CSP. The algorithm is deterministic and runs in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$O\left( nL + n^2d\cdot 6.16^d\right)$$\end{document} time, while the previously best deterministic algorithm runs in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$O\left( nL + nd^3\cdot 6.731^d\right)$$\end{document} time. More... »

PAGES

79-90

### Book

TITLE

Frontiers in Algorithmics

ISBN

978-3-319-59604-4
978-3-319-59605-1

### Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-59605-1_8

DOI

http://dx.doi.org/10.1007/978-3-319-59605-1_8

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1086509850

Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service:

[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0802",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Computation Theory and Mathematics",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Division of Information System Design, Tokyo Denki University, 350-0394, Hatoyama, Saitama, Japan",
"id": "http://www.grid.ac/institutes/grid.412773.4",
"name": [
"Division of Information System Design, Tokyo Denki University, 350-0394, Hatoyama, Saitama, Japan"
],
"type": "Organization"
},
"familyName": "Yuasa",
"givenName": "Shota",
"id": "sg:person.010511171047.52",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010511171047.52"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Division of Information System Design, Tokyo Denki University, 350-0394, Hatoyama, Saitama, Japan",
"id": "http://www.grid.ac/institutes/grid.412773.4",
"name": [
"Division of Information System Design, Tokyo Denki University, 350-0394, Hatoyama, Saitama, Japan"
],
"type": "Organization"
},
"familyName": "Chen",
"givenName": "Zhi-Zhong",
"id": "sg:person.015654265661.98",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015654265661.98"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "School of Computer Science, University of Waterloo, 200 University Ave. W, N2L3G1, Waterloo, ON, Canada",
"id": "http://www.grid.ac/institutes/grid.46078.3d",
"name": [
"School of Computer Science, University of Waterloo, 200 University Ave. W, N2L3G1, Waterloo, ON, Canada"
],
"type": "Organization"
},
"familyName": "Ma",
"givenName": "Bin",
"id": "sg:person.01221430663.16",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01221430663.16"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR",
"id": "http://www.grid.ac/institutes/None",
"name": [
"Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR"
],
"type": "Organization"
},
"familyName": "Wang",
"givenName": "Lusheng",
"id": "sg:person.01105113721.52",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01105113721.52"
],
"type": "Person"
}
],
"datePublished": "2017-05-23",
"datePublishedReg": "2017-05-23",
"description": "Given a set of n strings of length L and a radius d, the closest string problem (CSP for short) asks for a string \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$$t_{sol}$$\\end{document} that is within a Hamming distance of d to each of the given strings. It is known that the problem is NP-hard and its optimization version admits a polynomial time approximation scheme (PTAS). A number of parameterized algorithms have been then developed to solve the problem when d is small. Among them, the relatively new ones have not been implemented before and their performance in practice was unknown. In this study, we implement all of them by careful engineering. For those that have been implemented before, our implementation is much faster. For some of those that have not been implemented before, our experimental results show that there exist huge gaps between their theoretical and practical performances. We also design a new parameterized algorithm for the binary case of CSP. The algorithm is deterministic and runs in \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$$O\\left( nL + n^2d\\cdot 6.16^d\\right)$$\\end{document} time, while the previously best deterministic algorithm runs in \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$$O\\left( nL + nd^3\\cdot 6.731^d\\right)$$\\end{document} time.",
"editor": [
{
"familyName": "Xiao",
"givenName": "Mingyu",
"type": "Person"
},
{
"familyName": "Rosamond",
"givenName": "Frances",
"type": "Person"
}
],
"genre": "chapter",
"id": "sg:pub.10.1007/978-3-319-59605-1_8",
"inLanguage": "en",
"isAccessibleForFree": false,
"isPartOf": {
"isbn": [
"978-3-319-59604-4",
"978-3-319-59605-1"
],
"name": "Frontiers in Algorithmics",
"type": "Book"
},
"keywords": [
"study",
"cases",
"time",
"practice",
"huge gap",
"number",
"results",
"problem",
"version",
"implementation",
"gap",
"new ones",
"one",
"performance",
"run",
"distance",
"CSP",
"NPs",
"set",
"polynomial time approximation scheme",
"strings",
"string problem",
"time approximation scheme",
"closest string problem",
"Hamming distance",
"optimization version",
"algorithm",
"algorithm runs",
"Closest String Problem",
"careful engineering",
"engineering",
"experimental results",
"practical performance",
"approximation scheme",
"binary case",
"scheme",
"length L"
],
"name": "Designing and Implementing Algorithms for the Closest String Problem",
"pagination": "79-90",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1086509850"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/978-3-319-59605-1_8"
]
}
],
"publisher": {
"name": "Springer Nature",
"type": "Organisation"
},
"sameAs": [
"https://doi.org/10.1007/978-3-319-59605-1_8",
"https://app.dimensions.ai/details/publication/pub.1086509850"
],
"sdDataset": "chapters",
"sdDatePublished": "2022-05-20T07:43",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/chapter/chapter_222.jsonl",
"type": "Chapter",
"url": "https://doi.org/10.1007/978-3-319-59605-1_8"
}
]

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-59605-1_8'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-59605-1_8'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-59605-1_8'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-59605-1_8'

This table displays all metadata directly associated to this object as RDF triples.

130 TRIPLES      23 PREDICATES      63 URIs      56 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-59605-1_8 schema:about anzsrc-for:08
2 anzsrc-for:0802
3 schema:author N8e0ca7c86eb647df8f0512d127ef984a
4 schema:datePublished 2017-05-23
5 schema:datePublishedReg 2017-05-23
6 schema:description Given a set of n strings of length L and a radius d, the closest string problem (CSP for short) asks for a string \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$t_{sol}$$\end{document} that is within a Hamming distance of d to each of the given strings. It is known that the problem is NP-hard and its optimization version admits a polynomial time approximation scheme (PTAS). A number of parameterized algorithms have been then developed to solve the problem when d is small. Among them, the relatively new ones have not been implemented before and their performance in practice was unknown. In this study, we implement all of them by careful engineering. For those that have been implemented before, our implementation is much faster. For some of those that have not been implemented before, our experimental results show that there exist huge gaps between their theoretical and practical performances. We also design a new parameterized algorithm for the binary case of CSP. The algorithm is deterministic and runs in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$O\left( nL + n^2d\cdot 6.16^d\right)$$\end{document} time, while the previously best deterministic algorithm runs in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$O\left( nL + nd^3\cdot 6.731^d\right)$$\end{document} time.
7 schema:editor N2910b43ba4b84eb9995fcd1fdb814b78
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N3519826f3d994d399a376c0381ced0f8
12 schema:keywords CSP
13 Closest String Problem
14 Hamming distance
15 NPs
16 algorithm
17 algorithm runs
18 approximation scheme
19 binary case
20 careful engineering
21 cases
22 closest string problem
23 distance
24 engineering
25 experimental results
26 gap
27 huge gap
28 implementation
29 length L
30 new ones
31 number
32 one
33 optimization version
34 performance
35 polynomial time approximation scheme
36 practical performance
37 practice
38 problem
40 results
41 run
42 scheme
43 set
44 string problem
45 strings
46 study
47 time
48 time approximation scheme
49 version
50 schema:name Designing and Implementing Algorithms for the Closest String Problem
51 schema:pagination 79-90
53 Ndce20be9b7ba4a6a8b6487f867b57c0a
54 schema:publisher N6670bffe9b294789b06d36f18bdf8123
55 schema:sameAs https://app.dimensions.ai/details/publication/pub.1086509850
56 https://doi.org/10.1007/978-3-319-59605-1_8
57 schema:sdDatePublished 2022-05-20T07:43
59 schema:sdPublisher Ne12a760fc8db4cf784aa70ab716d62c4
60 schema:url https://doi.org/10.1007/978-3-319-59605-1_8
62 sgo:sdDataset chapters
63 rdf:type schema:Chapter
64 N084077e1cf444a6e9a402504b3600cd0 rdf:first N9c479584829a4b478b4742582896e551
65 rdf:rest rdf:nil
66 N2910b43ba4b84eb9995fcd1fdb814b78 rdf:first Nd0ac982685b242a28e1a5c037e38bc06
67 rdf:rest N084077e1cf444a6e9a402504b3600cd0
68 N33c1c243d20649e4b4f63484ab3ae869 rdf:first sg:person.01221430663.16
69 rdf:rest Nbf6a48dd29b74128a6770880f206b31b
70 N3519826f3d994d399a376c0381ced0f8 schema:isbn 978-3-319-59604-4
71 978-3-319-59605-1
72 schema:name Frontiers in Algorithmics
73 rdf:type schema:Book
74 N6670bffe9b294789b06d36f18bdf8123 schema:name Springer Nature
75 rdf:type schema:Organisation
76 N74ea4b5fdce54c3a951a9f1d1d33e652 rdf:first sg:person.015654265661.98
77 rdf:rest N33c1c243d20649e4b4f63484ab3ae869
78 N8e0ca7c86eb647df8f0512d127ef984a rdf:first sg:person.010511171047.52
79 rdf:rest N74ea4b5fdce54c3a951a9f1d1d33e652
80 N9c479584829a4b478b4742582896e551 schema:familyName Rosamond
81 schema:givenName Frances
82 rdf:type schema:Person
83 N9ea4301938dc4a4e973add5856b96829 schema:name doi
84 schema:value 10.1007/978-3-319-59605-1_8
85 rdf:type schema:PropertyValue
86 Nbf6a48dd29b74128a6770880f206b31b rdf:first sg:person.01105113721.52
87 rdf:rest rdf:nil
88 Nd0ac982685b242a28e1a5c037e38bc06 schema:familyName Xiao
89 schema:givenName Mingyu
90 rdf:type schema:Person
91 Ndce20be9b7ba4a6a8b6487f867b57c0a schema:name dimensions_id
92 schema:value pub.1086509850
93 rdf:type schema:PropertyValue
94 Ne12a760fc8db4cf784aa70ab716d62c4 schema:name Springer Nature - SN SciGraph project
95 rdf:type schema:Organization
96 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
97 schema:name Information and Computing Sciences
98 rdf:type schema:DefinedTerm
99 anzsrc-for:0802 schema:inDefinedTermSet anzsrc-for:
100 schema:name Computation Theory and Mathematics
101 rdf:type schema:DefinedTerm
102 sg:person.010511171047.52 schema:affiliation grid-institutes:grid.412773.4
103 schema:familyName Yuasa
104 schema:givenName Shota
105 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010511171047.52
106 rdf:type schema:Person
107 sg:person.01105113721.52 schema:affiliation grid-institutes:None
108 schema:familyName Wang
109 schema:givenName Lusheng
110 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01105113721.52
111 rdf:type schema:Person
112 sg:person.01221430663.16 schema:affiliation grid-institutes:grid.46078.3d
113 schema:familyName Ma
114 schema:givenName Bin
115 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01221430663.16
116 rdf:type schema:Person
117 sg:person.015654265661.98 schema:affiliation grid-institutes:grid.412773.4
118 schema:familyName Chen
119 schema:givenName Zhi-Zhong
120 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015654265661.98
121 rdf:type schema:Person
122 grid-institutes:None schema:alternateName Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR
123 schema:name Department of Computer Science, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR
124 rdf:type schema:Organization
125 grid-institutes:grid.412773.4 schema:alternateName Division of Information System Design, Tokyo Denki University, 350-0394, Hatoyama, Saitama, Japan
126 schema:name Division of Information System Design, Tokyo Denki University, 350-0394, Hatoyama, Saitama, Japan
127 rdf:type schema:Organization
128 grid-institutes:grid.46078.3d schema:alternateName School of Computer Science, University of Waterloo, 200 University Ave. W, N2L3G1, Waterloo, ON, Canada
129 schema:name School of Computer Science, University of Waterloo, 200 University Ave. W, N2L3G1, Waterloo, ON, Canada
130 rdf:type schema:Organization

Preview window. Press ESC to close (or click here)

...