Sublinear Algorithms for Approximating String Compressibility View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2007

AUTHORS

Sofya Raskhodnikova , Dana Ron , Ronitt Rubinfeld , Adam Smith

ABSTRACT

We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and Lempel-Ziv (LZ), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly.Our investigation of LZ yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to Lempel-Ziv to the number of distinct short substrings contained in it. In addition, we show that approximating the compressibility with respect to LZ is related to approximating the support size of a distribution. More... »

PAGES

609-623

Book

TITLE

Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques

ISBN

978-3-540-74207-4
978-3-540-74208-1

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-74208-1_44

DOI

http://dx.doi.org/10.1007/978-3-540-74208-1_44

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1037212078


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Pennsylvania State University, USA", 
          "id": "http://www.grid.ac/institutes/grid.29857.31", 
          "name": [
            "Pennsylvania State University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Raskhodnikova", 
        "givenName": "Sofya", 
        "id": "sg:person.07502404747.45", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07502404747.45"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Tel Aviv University, Israel", 
          "id": "http://www.grid.ac/institutes/grid.12136.37", 
          "name": [
            "Tel Aviv University, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ron", 
        "givenName": "Dana", 
        "id": "sg:person.015036626555.78", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015036626555.78"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "MIT, Cambridge MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.116068.8", 
          "name": [
            "MIT, Cambridge MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Rubinfeld", 
        "givenName": "Ronitt", 
        "id": "sg:person.01300130330.47", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01300130330.47"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Pennsylvania State University, USA", 
          "id": "http://www.grid.ac/institutes/grid.29857.31", 
          "name": [
            "Pennsylvania State University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Smith", 
        "givenName": "Adam", 
        "id": "sg:person.013307226666.21", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013307226666.21"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2007", 
    "datePublishedReg": "2007-01-01", 
    "description": "We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and Lempel-Ziv (LZ), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly.Our investigation of LZ yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to Lempel-Ziv to the number of distinct short substrings contained in it. In addition, we show that approximating the compressibility with respect to LZ is related to approximating the support size of a distribution.", 
    "editor": [
      {
        "familyName": "Charikar", 
        "givenName": "Moses", 
        "type": "Person"
      }, 
      {
        "familyName": "Jansen", 
        "givenName": "Klaus", 
        "type": "Person"
      }, 
      {
        "familyName": "Reingold", 
        "givenName": "Omer", 
        "type": "Person"
      }, 
      {
        "familyName": "Rolim", 
        "givenName": "Jos\u00e9 D. P.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-74208-1_44", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-74207-4", 
        "978-3-540-74208-1"
      ], 
      "name": "Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques", 
      "type": "Book"
    }, 
    "keywords": [
      "run-length encoding", 
      "Lempel-Ziv", 
      "compression scheme", 
      "sublinear algorithms", 
      "lossless compression scheme", 
      "sublinear time", 
      "short substrings", 
      "algorithm", 
      "scheme", 
      "structural lemma", 
      "lower bounds", 
      "strings", 
      "encoding", 
      "substrings", 
      "support size", 
      "compressibility", 
      "bounds", 
      "lemma", 
      "respect", 
      "detail", 
      "interest", 
      "questions", 
      "time", 
      "initial question", 
      "number", 
      "distribution", 
      "size", 
      "addition", 
      "investigation", 
      "yield"
    ], 
    "name": "Sublinear Algorithms for Approximating String Compressibility", 
    "pagination": "609-623", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1037212078"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-74208-1_44"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-74208-1_44", 
      "https://app.dimensions.ai/details/publication/pub.1037212078"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-06-01T22:34", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220601/entities/gbq_results/chapter/chapter_409.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-540-74208-1_44"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74208-1_44'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74208-1_44'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74208-1_44'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74208-1_44'


 

This table displays all metadata directly associated to this object as RDF triples.

132 TRIPLES      23 PREDICATES      56 URIs      49 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-74208-1_44 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N3ee55477a7614d1c8f3c14413943e03d
4 schema:datePublished 2007
5 schema:datePublishedReg 2007-01-01
6 schema:description We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and Lempel-Ziv (LZ), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly.Our investigation of LZ yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to Lempel-Ziv to the number of distinct short substrings contained in it. In addition, we show that approximating the compressibility with respect to LZ is related to approximating the support size of a distribution.
7 schema:editor N16895df53d2e4904a2369c0bb4b7b143
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf Nd9c0296271ba4c989d8e7c3764ec0130
12 schema:keywords Lempel-Ziv
13 addition
14 algorithm
15 bounds
16 compressibility
17 compression scheme
18 detail
19 distribution
20 encoding
21 initial question
22 interest
23 investigation
24 lemma
25 lossless compression scheme
26 lower bounds
27 number
28 questions
29 respect
30 run-length encoding
31 scheme
32 short substrings
33 size
34 strings
35 structural lemma
36 sublinear algorithms
37 sublinear time
38 substrings
39 support size
40 time
41 yield
42 schema:name Sublinear Algorithms for Approximating String Compressibility
43 schema:pagination 609-623
44 schema:productId N195a1c7530d042c7be5dd209fa89e91b
45 N55202b797c93459d81f1995a411f8230
46 schema:publisher Nc06a518ff1de4282a8115f6f40d718f0
47 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037212078
48 https://doi.org/10.1007/978-3-540-74208-1_44
49 schema:sdDatePublished 2022-06-01T22:34
50 schema:sdLicense https://scigraph.springernature.com/explorer/license/
51 schema:sdPublisher N712cca2cc7e54beea007efb288176182
52 schema:url https://doi.org/10.1007/978-3-540-74208-1_44
53 sgo:license sg:explorer/license/
54 sgo:sdDataset chapters
55 rdf:type schema:Chapter
56 N16895df53d2e4904a2369c0bb4b7b143 rdf:first N7463b2a05e8044a8a6b36bd0c6a5b2f1
57 rdf:rest Nc3098893bc1b492997d9d1bb49dd93e0
58 N195a1c7530d042c7be5dd209fa89e91b schema:name dimensions_id
59 schema:value pub.1037212078
60 rdf:type schema:PropertyValue
61 N38d41e630b7c4448a920a151b5365115 schema:familyName Jansen
62 schema:givenName Klaus
63 rdf:type schema:Person
64 N3ee55477a7614d1c8f3c14413943e03d rdf:first sg:person.07502404747.45
65 rdf:rest N5f17394e92c24aa9ab55e8149743dad2
66 N41be4c6c07e049f597a69d330aa215b4 rdf:first N8362509b2638467389b4763b2645ae66
67 rdf:rest Nf7f7a9a35bd040f3a7a489a683a1fe3e
68 N55202b797c93459d81f1995a411f8230 schema:name doi
69 schema:value 10.1007/978-3-540-74208-1_44
70 rdf:type schema:PropertyValue
71 N5f17394e92c24aa9ab55e8149743dad2 rdf:first sg:person.015036626555.78
72 rdf:rest Ne7542e190b21496080d31fcf0527a8b8
73 N712cca2cc7e54beea007efb288176182 schema:name Springer Nature - SN SciGraph project
74 rdf:type schema:Organization
75 N7463b2a05e8044a8a6b36bd0c6a5b2f1 schema:familyName Charikar
76 schema:givenName Moses
77 rdf:type schema:Person
78 N7cc679864e63436ebd04e62e3fe8ccf6 rdf:first sg:person.013307226666.21
79 rdf:rest rdf:nil
80 N8362509b2638467389b4763b2645ae66 schema:familyName Reingold
81 schema:givenName Omer
82 rdf:type schema:Person
83 Nc06a518ff1de4282a8115f6f40d718f0 schema:name Springer Nature
84 rdf:type schema:Organisation
85 Nc3098893bc1b492997d9d1bb49dd93e0 rdf:first N38d41e630b7c4448a920a151b5365115
86 rdf:rest N41be4c6c07e049f597a69d330aa215b4
87 Nd9c0296271ba4c989d8e7c3764ec0130 schema:isbn 978-3-540-74207-4
88 978-3-540-74208-1
89 schema:name Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques
90 rdf:type schema:Book
91 Ne1e9c83b592344a89f25dd24f6b7924a schema:familyName Rolim
92 schema:givenName José D. P.
93 rdf:type schema:Person
94 Ne7542e190b21496080d31fcf0527a8b8 rdf:first sg:person.01300130330.47
95 rdf:rest N7cc679864e63436ebd04e62e3fe8ccf6
96 Nf7f7a9a35bd040f3a7a489a683a1fe3e rdf:first Ne1e9c83b592344a89f25dd24f6b7924a
97 rdf:rest rdf:nil
98 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
99 schema:name Information and Computing Sciences
100 rdf:type schema:DefinedTerm
101 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
102 schema:name Artificial Intelligence and Image Processing
103 rdf:type schema:DefinedTerm
104 sg:person.01300130330.47 schema:affiliation grid-institutes:grid.116068.8
105 schema:familyName Rubinfeld
106 schema:givenName Ronitt
107 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01300130330.47
108 rdf:type schema:Person
109 sg:person.013307226666.21 schema:affiliation grid-institutes:grid.29857.31
110 schema:familyName Smith
111 schema:givenName Adam
112 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013307226666.21
113 rdf:type schema:Person
114 sg:person.015036626555.78 schema:affiliation grid-institutes:grid.12136.37
115 schema:familyName Ron
116 schema:givenName Dana
117 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015036626555.78
118 rdf:type schema:Person
119 sg:person.07502404747.45 schema:affiliation grid-institutes:grid.29857.31
120 schema:familyName Raskhodnikova
121 schema:givenName Sofya
122 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07502404747.45
123 rdf:type schema:Person
124 grid-institutes:grid.116068.8 schema:alternateName MIT, Cambridge MA, USA
125 schema:name MIT, Cambridge MA, USA
126 rdf:type schema:Organization
127 grid-institutes:grid.12136.37 schema:alternateName Tel Aviv University, Israel
128 schema:name Tel Aviv University, Israel
129 rdf:type schema:Organization
130 grid-institutes:grid.29857.31 schema:alternateName Pennsylvania State University, USA
131 schema:name Pennsylvania State University, USA
132 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...