On prediction by data compression


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

1997

AUTHORS

Paul Vitányi , Ming Li

ABSTRACT

Traditional wisdom has it that the better a theory compresses the learning data concerning some phenomenon under investigation, the better we learn, generalize, and the better the theory predicts unknown data. This belief is vindicated in practice but apparently has not been rigorously proved in a general setting. Making these ideas rigorous involves the length of the shortest effective description of an individual object: its Kolmogorov complexity. In a previous paper we have shown that optimal compression is almost always a best strategy in hypotheses identification (an ideal form of the minimum description length (MDL) principle). Whereas the single best hypothesis does not necessarily give the best prediction, we demonstrate that nonetheless compression is almost always the best strategy in prediction methods in the style of R. Solomonoff.

PAGES

14-30

Book

TITLE

Machine Learning: ECML-97

ISBN

978-3-540-62858-3
978-3-540-68708-5

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69

DOI

http://dx.doi.org/10.1007/3-540-62858-4_69

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1052481256



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Centrum Wiskunde and Informatica", 
          "id": "https://www.grid.ac/institutes/grid.6054.7", 
          "name": [
            "CWI, Kruislaan 413, 1098\u00a0SJ Amsterdam, The Netherlands"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vit\u00e1nyi", 
        "givenName": "Paul", 
        "id": "sg:person.014213763741.01", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014213763741.01"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "City University of Hong Kong", 
          "id": "https://www.grid.ac/institutes/grid.35030.35", 
          "name": [
            "Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Ming", 
        "id": "sg:person.0621576316.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621576316.79"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "1997", 
    "datePublishedReg": "1997-01-01", 
    "description": "Traditional wisdom has it that the better a theory compresses the learning data concerning some phenomenon under investigation, the better we learn, generalize, and the better the theory predicts unknown data. This belief is vindicated in practice but apparently has not been rigorously proved in a general setting. Making these ideas rigorous involves the length of the shortest effective description of an individual object: its Kolmogorov complexity. In a previous paper we have shown that optimal compression is almost always a best strategy in hypotheses identification (an ideal form of the minimum description length (MDL) principle). Whereas the single best hypothesis does not necessarily give the best prediction, we demonstrate that nonetheless compression is almost always the best strategy in prediction methods in the style of R. Solomonoff.", 
    "editor": [
      {
        "familyName": "Someren", 
        "givenName": "Maarten", 
        "type": "Person"
      }, 
      {
        "familyName": "Widmer", 
        "givenName": "Gerhard", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/3-540-62858-4_69", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-62858-3", 
        "978-3-540-68708-5"
      ], 
      "name": "Machine Learning: ECML-97", 
      "type": "Book"
    }, 
    "name": "On prediction by data compression", 
    "pagination": "14-30", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/3-540-62858-4_69"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "f9fc7322906d2748a777f76cd17a4dda1dd6ada93eb8e0209a59517767e841b4"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1052481256"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/3-540-62858-4_69", 
      "https://app.dimensions.ai/details/publication/pub.1052481256"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T17:03", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8678_00000090.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/3-540-62858-4_69"
  }
]
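
Because the record is plain JSON, its fields can be read with any JSON parser. The following is a minimal sketch, not part of the SciGraph tooling: it embeds a trimmed copy of the record above (only a few of its keys) and pulls out the title, authors, and DOI. The function name `summarize` and the constant `RECORD` are hypothetical names chosen for illustration.

```python
import json

# Trimmed copy of the record above; only a few keys are kept for brevity.
# The raw string keeps the JSON escape \u00e1 intact for json.loads to decode.
RECORD = r'''
[
  {
    "name": "On prediction by data compression",
    "datePublished": "1997",
    "pagination": "14-30",
    "author": [
      {"familyName": "Vit\u00e1nyi", "givenName": "Paul"},
      {"familyName": "Li", "givenName": "Ming"}
    ],
    "productId": [
      {"name": "doi", "value": ["10.1007/3-540-62858-4_69"]},
      {"name": "dimensions_id", "value": ["pub.1052481256"]}
    ]
  }
]
'''


def summarize(record_json: str) -> dict:
    """Return a small summary of the first object in a SciGraph-style JSON-LD array."""
    record = json.loads(record_json)[0]
    # Each author entry carries separate given and family names.
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "year": record["datePublished"],
        "pages": record["pagination"],
        "doi": ids.get("doi"),
    }


if __name__ == "__main__":
    print(summarize(RECORD))
```

The sketch treats the record as ordinary JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand terms into IRIs.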
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'
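
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch under assumptions: `build_request`, `BASE`, and `ACCEPT` are hypothetical names, and whether the endpoint still answers these requests is not confirmed here.

```python
from urllib.request import Request

# Base URL taken from the curl examples above.
BASE = "https://scigraph.springernature.com/"

# Media types copied from the curl examples, keyed by a short format name.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id: str, fmt: str = "json-ld") -> Request:
    """Build (but do not send) a GET request asking for the given RDF format."""
    return Request(BASE + record_id, headers={"Accept": ACCEPT[fmt]})


if __name__ == "__main__":
    req = build_request("pub.10.1007/3-540-62858-4_69", "turtle")
    print(req.full_url, req.get_header("Accept"))
    # To actually fetch the data, pass the request to urllib.request.urlopen(req).
```

Building the request separately from sending it keeps the format selection easy to inspect without any network access.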


 

This table lists all metadata directly associated with this object as RDF triples.

80 TRIPLES      22 PREDICATES      27 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/3-540-62858-4_69 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author Nd960f2c0aa694d9a863a7a08afc8c9a4
4 schema:datePublished 1997
5 schema:datePublishedReg 1997-01-01
6 schema:description Traditional wisdom has it that the better a theory compresses the learning data concerning some phenomenon under investigation, the better we learn, generalize, and the better the theory predicts unknown data. This belief is vindicated in practice but apparently has not been rigorously proved in a general setting. Making these ideas rigorous involves the length of the shortest effective description of an individual object: its Kolmogorov complexity. In a previous paper we have shown that optimal compression is almost always a best strategy in hypotheses identification (an ideal form of the minimum description length (MDL) principle). Whereas the single best hypothesis does not necessarily give the best prediction, we demonstrate that nonetheless compression is almost always the best strategy in prediction methods in the style of R. Solomonoff.
7 schema:editor N241a3448dbd1481db944a5bafb7418eb
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf N46a17c4ed8f14bf68164bc010d7f8012
12 schema:name On prediction by data compression
13 schema:pagination 14-30
14 schema:productId N23c7d55a4c63438f80263a7c8022a2d5
15 N34f29aed61804dddbeb940ddc28b6209
16 Ne727e5258348444eafbbeec67fac6f50
17 schema:publisher N700213a7dd8a4435944e807c0e1c8973
18 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052481256
19 https://doi.org/10.1007/3-540-62858-4_69
20 schema:sdDatePublished 2019-04-15T17:03
21 schema:sdLicense https://scigraph.springernature.com/explorer/license/
22 schema:sdPublisher N56f81719f1f8497d9202605939ca0a58
23 schema:url http://link.springer.com/10.1007/3-540-62858-4_69
24 sgo:license sg:explorer/license/
25 sgo:sdDataset chapters
26 rdf:type schema:Chapter
27 N13d88aacbf944eb3a36798acb0db9f0a rdf:first sg:person.0621576316.79
28 rdf:rest rdf:nil
29 N23c7d55a4c63438f80263a7c8022a2d5 schema:name doi
30 schema:value 10.1007/3-540-62858-4_69
31 rdf:type schema:PropertyValue
32 N241a3448dbd1481db944a5bafb7418eb rdf:first N654a75b3db44446da3535efd15a6f9be
33 rdf:rest Nff2a6e12aa0c46a997179917d4785420
34 N34f29aed61804dddbeb940ddc28b6209 schema:name readcube_id
35 schema:value f9fc7322906d2748a777f76cd17a4dda1dd6ada93eb8e0209a59517767e841b4
36 rdf:type schema:PropertyValue
37 N46a17c4ed8f14bf68164bc010d7f8012 schema:isbn 978-3-540-62858-3
38 978-3-540-68708-5
39 schema:name Machine Learning: ECML-97
40 rdf:type schema:Book
41 N56f81719f1f8497d9202605939ca0a58 schema:name Springer Nature - SN SciGraph project
42 rdf:type schema:Organization
43 N654a75b3db44446da3535efd15a6f9be schema:familyName Someren
44 schema:givenName Maarten
45 rdf:type schema:Person
46 N700213a7dd8a4435944e807c0e1c8973 schema:location Berlin, Heidelberg
47 schema:name Springer Berlin Heidelberg
48 rdf:type schema:Organisation
49 Nd960f2c0aa694d9a863a7a08afc8c9a4 rdf:first sg:person.014213763741.01
50 rdf:rest N13d88aacbf944eb3a36798acb0db9f0a
51 Ne727e5258348444eafbbeec67fac6f50 schema:name dimensions_id
52 schema:value pub.1052481256
53 rdf:type schema:PropertyValue
54 Nedea6dfdbe47426a9649dee0f0aa2d28 schema:familyName Widmer
55 schema:givenName Gerhard
56 rdf:type schema:Person
57 Nff2a6e12aa0c46a997179917d4785420 rdf:first Nedea6dfdbe47426a9649dee0f0aa2d28
58 rdf:rest rdf:nil
59 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
60 schema:name Mathematical Sciences
61 rdf:type schema:DefinedTerm
62 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
63 schema:name Statistics
64 rdf:type schema:DefinedTerm
65 sg:person.014213763741.01 schema:affiliation https://www.grid.ac/institutes/grid.6054.7
66 schema:familyName Vitányi
67 schema:givenName Paul
68 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014213763741.01
69 rdf:type schema:Person
70 sg:person.0621576316.79 schema:affiliation https://www.grid.ac/institutes/grid.35030.35
71 schema:familyName Li
72 schema:givenName Ming
73 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621576316.79
74 rdf:type schema:Person
75 https://www.grid.ac/institutes/grid.35030.35 schema:alternateName City University of Hong Kong
76 schema:name Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
77 rdf:type schema:Organization
78 https://www.grid.ac/institutes/grid.6054.7 schema:alternateName Centrum Wiskunde and Informatica
79 schema:name CWI, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
80 rdf:type schema:Organization
 



