Faster Fully Compressed Pattern Matching by Recompression

Ontology type: schema:Chapter      Open Access: True

Chapter Info

DATE

2012

AUTHORS ABSTRACT

In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e. a context-free grammar generating exactly one string; the term fully means that both the pattern and the text are given in the compressed form. The problem is approached using a recently developed technique of local recompression: the SLPs are refactored, so that substrings of the pattern and text are encoded in both SLPs in the same way. To this end, the SLPs are locally decompressed and then recompressed in a uniform way.This technique yields an \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}((n+m)\log M \log(n+m))$\end{document} algorithm for compressed pattern matching, where n (m) is the size of the compressed representation of the text (pattern, respectively), while M is the size of the decompressed pattern. Since M ≤ 2m, this substantially improves the previously best \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}(m^2n)$\end{document} algorithm.Since LZ compression standard reduces to SLP with log( N / n) overhead and in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}(n \log(N/n))$\end{document} time, the presented algorithm can be applied also to the fully LZ-compressed pattern matching problem, yielding an \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}(s \log s \log M)$\end{document} running time, where s = n log(N/n) + m log(M/m). More... »

PAGES

533-544

Book

TITLE

Automata, Languages, and Programming

ISBN

978-3-642-31593-0
978-3-642-31594-7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-31594-7_45

DOI

http://dx.doi.org/10.1007/978-3-642-31594-7_45

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1026589735

Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service:

[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Artificial Intelligence and Image Processing",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Institute of Computer Science, University of Wroc\u0142aw, Poland",
"id": "http://www.grid.ac/institutes/grid.8505.8",
"name": [
"Institute of Computer Science, University of Wroc\u0142aw, Poland"
],
"type": "Organization"
},
"familyName": "Je\u017c",
"givenName": "Artur",
"id": "sg:person.015252371751.76",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015252371751.76"
],
"type": "Person"
}
],
"datePublished": "2012",
"datePublishedReg": "2012-01-01",
"description": "In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e. a context-free grammar generating exactly one string; the term fully means that both the pattern and the text are given in the compressed form. The problem is approached using a recently developed technique of local recompression: the SLPs are refactored, so that substrings of the pattern and text are encoded in both SLPs in the same way. To this end, the SLPs are locally decompressed and then recompressed in a uniform way.This technique yields an \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$\\mathcal{O}((n+m)\\log M \\log(n+m))$\\end{document} algorithm for compressed pattern matching, where n (m) is the size of the compressed representation of the text (pattern, respectively), while M is the size of the decompressed pattern. Since M\u2009\u2264\u20092m, this substantially improves the previously best \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$\\mathcal{O}(m^2n)$\\end{document} algorithm.Since LZ compression standard reduces to SLP with log( N / n) overhead and in \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$\\mathcal{O}(n \\log(N/n))$\\end{document} time, the presented algorithm can be applied also to the fully LZ-compressed pattern matching problem, yielding an \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$\\mathcal{O}(s \\log s \\log M)$\\end{document} running time, where s\u2009=\u2009n log(N/n)\u2009+\u2009m log(M/m).",
"editor": [
{
"familyName": "Czumaj",
"givenName": "Artur",
"type": "Person"
},
{
"familyName": "Mehlhorn",
"givenName": "Kurt",
"type": "Person"
},
{
"familyName": "Pitts",
"givenName": "Andrew",
"type": "Person"
},
{
"familyName": "Wattenhofer",
"givenName": "Roger",
"type": "Person"
}
],
"genre": "chapter",
"id": "sg:pub.10.1007/978-3-642-31594-7_45",
"inLanguage": "en",
"isAccessibleForFree": true,
"isPartOf": {
"isbn": [
"978-3-642-31593-0",
"978-3-642-31594-7"
],
"name": "Automata, Languages, and Programming",
"type": "Book"
},
"keywords": [
"compression standard",
"size",
"compressed pattern matching",
"technique",
"algorithm",
"pattern matching",
"matching",
"matching problem",
"paper",
"standards",
"time",
"way",
"uniform way",
"problem",
"end",
"compression",
"recompression",
"running time",
"terms",
"patterns",
"same way",
"form",
"LZ",
"strings",
"logs",
"program",
"substrings",
"pattern matching problem",
"straight-line programs",
"context-free grammars",
"representation",
"text",
"grammar"
],
"name": "Faster Fully Compressed Pattern Matching by Recompression",
"pagination": "533-544",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1026589735"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/978-3-642-31594-7_45"
]
}
],
"publisher": {
"name": "Springer Nature",
"type": "Organisation"
},
"sameAs": [
"https://doi.org/10.1007/978-3-642-31594-7_45",
"https://app.dimensions.ai/details/publication/pub.1026589735"
],
"sdDataset": "chapters",
"sdDatePublished": "2022-05-20T07:47",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/chapter/chapter_398.jsonl",
"type": "Chapter",
"url": "https://doi.org/10.1007/978-3-642-31594-7_45"
}
]

Download the RDF metadata as:

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-31594-7_45'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-31594-7_45'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-31594-7_45'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-31594-7_45'

This table displays all metadata directly associated to this object as RDF triples.

108 TRIPLES      23 PREDICATES      59 URIs      52 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-31594-7_45 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nae40c4d63e704bafb650d69102bd38a4
4 schema:datePublished 2012
5 schema:datePublishedReg 2012-01-01
6 schema:description In this paper, a fully compressed pattern matching problem is studied. The compression is represented by straight-line programs (SLPs), i.e. a context-free grammar generating exactly one string; the term fully means that both the pattern and the text are given in the compressed form. The problem is approached using a recently developed technique of local recompression: the SLPs are refactored, so that substrings of the pattern and text are encoded in both SLPs in the same way. To this end, the SLPs are locally decompressed and then recompressed in a uniform way.This technique yields an \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}((n+m)\log M \log(n+m))$\end{document} algorithm for compressed pattern matching, where n (m) is the size of the compressed representation of the text (pattern, respectively), while M is the size of the decompressed pattern. Since M ≤ 2m, this substantially improves the previously best \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}(m^2n)$\end{document} algorithm.Since LZ compression standard reduces to SLP with log( N / n) overhead and in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}(n \log(N/n))$\end{document} time, the presented algorithm can be applied also to the fully LZ-compressed pattern matching problem, yielding an \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$\mathcal{O}(s \log s \log M)$\end{document} running time, where s = n log(N/n) + m log(M/m).
7 schema:editor Nbb2141920c7b439e8df2fc85653bac42
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf N1be13cf3336344d0a26c67c7b5388702
12 schema:keywords LZ
13 algorithm
14 compressed pattern matching
15 compression
16 compression standard
17 context-free grammars
18 end
19 form
20 grammar
21 logs
22 matching
23 matching problem
24 paper
25 pattern matching
26 pattern matching problem
27 patterns
28 problem
29 program
30 recompression
31 representation
32 running time
33 same way
34 size
35 standards
36 straight-line programs
37 strings
38 substrings
39 technique
40 terms
41 text
42 time
43 uniform way
44 way
45 schema:name Faster Fully Compressed Pattern Matching by Recompression
46 schema:pagination 533-544
47 schema:productId N38615502b12a4159b904a5d2bd96784f
48 N96c677560a9c46778325f714722257e3
49 schema:publisher Nee20b70b52304459969616ff8a733a2b
50 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026589735
51 https://doi.org/10.1007/978-3-642-31594-7_45
52 schema:sdDatePublished 2022-05-20T07:47
53 schema:sdLicense https://scigraph.springernature.com/explorer/license/
54 schema:sdPublisher N31384b8d70d34b11a5039c3a9a51e665
55 schema:url https://doi.org/10.1007/978-3-642-31594-7_45
56 sgo:license sg:explorer/license/
57 sgo:sdDataset chapters
58 rdf:type schema:Chapter
59 N0cee23944f9b4e5495dddacc3abd80ad schema:familyName Czumaj
60 schema:givenName Artur
61 rdf:type schema:Person
62 N1be13cf3336344d0a26c67c7b5388702 schema:isbn 978-3-642-31593-0
63 978-3-642-31594-7
64 schema:name Automata, Languages, and Programming
65 rdf:type schema:Book
66 N25f2a45b38ed465583ce7ef1dec71407 schema:familyName Mehlhorn
67 schema:givenName Kurt
68 rdf:type schema:Person
69 N31384b8d70d34b11a5039c3a9a51e665 schema:name Springer Nature - SN SciGraph project
70 rdf:type schema:Organization
71 N38615502b12a4159b904a5d2bd96784f schema:name doi
72 schema:value 10.1007/978-3-642-31594-7_45
73 rdf:type schema:PropertyValue
74 N532f1039fa6448b58357ad5f55adabbc rdf:first N25f2a45b38ed465583ce7ef1dec71407
75 rdf:rest N76b0996833ba44cabc0b8fecbf698172
76 N6f93bb324c2b48ea8be994e837a7cc7e schema:familyName Wattenhofer
77 schema:givenName Roger
78 rdf:type schema:Person
79 N76b0996833ba44cabc0b8fecbf698172 rdf:first N8fa2f986ff504750b62ba67f282655c1
80 rdf:rest Nb2e1e9b52fea4496b9615775b331723e
81 N8fa2f986ff504750b62ba67f282655c1 schema:familyName Pitts
82 schema:givenName Andrew
83 rdf:type schema:Person
84 N96c677560a9c46778325f714722257e3 schema:name dimensions_id
85 schema:value pub.1026589735
86 rdf:type schema:PropertyValue
87 Nae40c4d63e704bafb650d69102bd38a4 rdf:first sg:person.015252371751.76
88 rdf:rest rdf:nil
89 Nb2e1e9b52fea4496b9615775b331723e rdf:first N6f93bb324c2b48ea8be994e837a7cc7e
90 rdf:rest rdf:nil
91 Nbb2141920c7b439e8df2fc85653bac42 rdf:first N0cee23944f9b4e5495dddacc3abd80ad
92 rdf:rest N532f1039fa6448b58357ad5f55adabbc
93 Nee20b70b52304459969616ff8a733a2b schema:name Springer Nature
94 rdf:type schema:Organisation
95 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
96 schema:name Information and Computing Sciences
97 rdf:type schema:DefinedTerm
98 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
99 schema:name Artificial Intelligence and Image Processing
100 rdf:type schema:DefinedTerm
101 sg:person.015252371751.76 schema:affiliation grid-institutes:grid.8505.8
102 schema:familyName Jeż
103 schema:givenName Artur
104 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015252371751.76
105 rdf:type schema:Person
106 grid-institutes:grid.8505.8 schema:alternateName Institute of Computer Science, University of Wrocław, Poland
107 schema:name Institute of Computer Science, University of Wrocław, Poland
108 rdf:type schema:Organization

Preview window. Press ESC to close (or click here)

...