On prediction by data compression


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

1997

AUTHORS

Paul Vitányi, Ming Li

ABSTRACT

Traditional wisdom has it that the better a theory compresses the learning data concerning some phenomenon under investigation, the better we learn, generalize, and the better the theory predicts unknown data. This belief is vindicated in practice but apparently has not been rigorously proved in a general setting. Making these ideas rigorous involves the length of the shortest effective description of an individual object: its Kolmogorov complexity. In a previous paper we have shown that optimal compression is almost always a best strategy in hypotheses identification (an ideal form of the minimum description length (MDL) principle). Whereas the single best hypothesis does not necessarily give the best prediction, we demonstrate that nonetheless compression is almost always the best strategy in prediction methods in the style of R. Solomonoff.

PAGES

14-30

Book

TITLE

Machine Learning: ECML-97

ISBN

978-3-540-62858-3
978-3-540-68708-5

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69

DOI

http://dx.doi.org/10.1007/3-540-62858-4_69

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1052481256



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Centrum Wiskunde and Informatica", 
          "id": "https://www.grid.ac/institutes/grid.6054.7", 
          "name": [
            "CWI, Kruislaan 413, 1098\u00a0SJ Amsterdam, The Netherlands"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vit\u00e1nyi", 
        "givenName": "Paul", 
        "id": "sg:person.014213763741.01", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014213763741.01"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "City University of Hong Kong", 
          "id": "https://www.grid.ac/institutes/grid.35030.35", 
          "name": [
            "Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Ming", 
        "id": "sg:person.0621576316.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621576316.79"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "1997", 
    "datePublishedReg": "1997-01-01", 
    "description": "Traditional wisdom has it that the better a theory compresses the learning data concerning some phenomenon under investigation, the better we learn, generalize, and the better the theory predicts unknown data. This belief is vindicated in practice but apparently has not been rigorously proved in a general setting. Making these ideas rigorous involves the length of the shortest effective description of an individual object: its Kolmogorov complexity. In a previous paper we have shown that optimal compression is almost always a best strategy in hypotheses identification (an ideal form of the minimum description length (MDL) principle). Whereas the single best hypothesis does not necessarily give the best prediction, we demonstrate that nonetheless compression is almost always the best strategy in prediction methods in the style of R. Solomonoff.", 
    "editor": [
      {
        "familyName": "Someren", 
        "givenName": "Maarten", 
        "type": "Person"
      }, 
      {
        "familyName": "Widmer", 
        "givenName": "Gerhard", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/3-540-62858-4_69", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-62858-3", 
        "978-3-540-68708-5"
      ], 
      "name": "Machine Learning: ECML-97", 
      "type": "Book"
    }, 
    "name": "On prediction by data compression", 
    "pagination": "14-30", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/3-540-62858-4_69"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "f9fc7322906d2748a777f76cd17a4dda1dd6ada93eb8e0209a59517767e841b4"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1052481256"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/3-540-62858-4_69", 
      "https://app.dimensions.ai/details/publication/pub.1052481256"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T17:03", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8678_00000090.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/3-540-62858-4_69"
  }
]
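
As a rough illustration of how the record above can be consumed, the following Python sketch parses an abbreviated copy of the JSON-LD (only the fields it reads are reproduced) and pulls out the author names and the DOI. The helper names `author_names` and `product_id` are hypothetical and chosen for this example; they are not part of SciGraph.

```python
import json

# Abbreviated copy of the record shown above; only the fields read below are kept.
RECORD_JSON = r"""
[
  {
    "author": [
      {"familyName": "Vit\u00e1nyi", "givenName": "Paul", "type": "Person"},
      {"familyName": "Li", "givenName": "Ming", "type": "Person"}
    ],
    "name": "On prediction by data compression",
    "pagination": "14-30",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/3-540-62858-4_69"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1052481256"]}
    ]
  }
]
"""


def author_names(record):
    """Return 'givenName familyName' for each entry in the author list."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def product_id(record, name):
    """Return the first value of the productId entry called `name`, or None."""
    for entry in record.get("productId", []):
        if entry.get("name") == name:
            values = entry.get("value", [])
            return values[0] if values else None
    return None


# The page serves a one-element JSON array, so take the first item.
record = json.loads(RECORD_JSON)[0]

if __name__ == "__main__":
    print(record["name"])
    print(", ".join(author_names(record)))
    print(product_id(record, "doi"))
```

Because JSON-LD is plain JSON, the standard `json` module is enough here; a dedicated RDF library would only be needed to expand the `@context` or to work with the triples directly.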
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69'
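
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch: the format keys (`json-ld`, `nt`, `turtle`, `xml`) and the function names `build_request` and `fetch` are assumptions made for this example, and whether the endpoint still responds is not something the page guarantees.

```python
import urllib.request

# Record URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/3-540-62858-4_69"

# Short format keys (hypothetical) mapped to the MIME types used in the curl examples.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, timeout=10.0):
    """Send the request and return the response body as text.

    Requires network access; raises urllib.error.URLError on failure.
    """
    with urllib.request.urlopen(build_request(fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Equivalent to the Turtle curl command above.
    print(fetch("turtle"))
```

Keeping request construction separate from sending makes the header logic easy to check without touching the network.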


 

This table displays all metadata directly associated with this object as RDF triples.

80 TRIPLES      22 PREDICATES      27 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/3-540-62858-4_69 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author N1c89aab2ba4e40f789e6ff876d96ad09
4 schema:datePublished 1997
5 schema:datePublishedReg 1997-01-01
6 schema:description Traditional wisdom has it that the better a theory compresses the learning data concerning some phenomenon under investigation, the better we learn, generalize, and the better the theory predicts unknown data. This belief is vindicated in practice but apparently has not been rigorously proved in a general setting. Making these ideas rigorous involves the length of the shortest effective description of an individual object: its Kolmogorov complexity. In a previous paper we have shown that optimal compression is almost always a best strategy in hypotheses identification (an ideal form of the minimum description length (MDL) principle). Whereas the single best hypothesis does not necessarily give the best prediction, we demonstrate that nonetheless compression is almost always the best strategy in prediction methods in the style of R. Solomonoff.
7 schema:editor Nee9ab2b850414277afb4fe9b2ec934f9
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf Nf056bb80bdb24b0099cc61f61c52290e
12 schema:name On prediction by data compression
13 schema:pagination 14-30
14 schema:productId N75ae5e4a0b7a4f75bb51da671a95da57
15 Ncf1a65bc9db54cb9a8f20d821cc46d12
16 Nf2812bb8dabd4601837017e62f159a18
17 schema:publisher N14855d8abb6448a6b24c83ea35cc875c
18 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052481256
19 https://doi.org/10.1007/3-540-62858-4_69
20 schema:sdDatePublished 2019-04-15T17:03
21 schema:sdLicense https://scigraph.springernature.com/explorer/license/
22 schema:sdPublisher N8c2ccf32f37d4c9c8497adef263636f3
23 schema:url http://link.springer.com/10.1007/3-540-62858-4_69
24 sgo:license sg:explorer/license/
25 sgo:sdDataset chapters
26 rdf:type schema:Chapter
27 N14855d8abb6448a6b24c83ea35cc875c schema:location Berlin, Heidelberg
28 schema:name Springer Berlin Heidelberg
29 rdf:type schema:Organisation
30 N1c89aab2ba4e40f789e6ff876d96ad09 rdf:first sg:person.014213763741.01
31 rdf:rest Nf099ec7495ec4383821bdda4107ed380
32 N5c2642daef53401eb496d0072ac5b8a7 schema:familyName Someren
33 schema:givenName Maarten
34 rdf:type schema:Person
35 N5c4443ff194f4e299ca17274b2303082 schema:familyName Widmer
36 schema:givenName Gerhard
37 rdf:type schema:Person
38 N75ae5e4a0b7a4f75bb51da671a95da57 schema:name dimensions_id
39 schema:value pub.1052481256
40 rdf:type schema:PropertyValue
41 N8c2ccf32f37d4c9c8497adef263636f3 schema:name Springer Nature - SN SciGraph project
42 rdf:type schema:Organization
43 Nc074bab536264455851edfe4735f93f8 rdf:first N5c4443ff194f4e299ca17274b2303082
44 rdf:rest rdf:nil
45 Ncf1a65bc9db54cb9a8f20d821cc46d12 schema:name doi
46 schema:value 10.1007/3-540-62858-4_69
47 rdf:type schema:PropertyValue
48 Nee9ab2b850414277afb4fe9b2ec934f9 rdf:first N5c2642daef53401eb496d0072ac5b8a7
49 rdf:rest Nc074bab536264455851edfe4735f93f8
50 Nf056bb80bdb24b0099cc61f61c52290e schema:isbn 978-3-540-62858-3
51 978-3-540-68708-5
52 schema:name Machine Learning: ECML-97
53 rdf:type schema:Book
54 Nf099ec7495ec4383821bdda4107ed380 rdf:first sg:person.0621576316.79
55 rdf:rest rdf:nil
56 Nf2812bb8dabd4601837017e62f159a18 schema:name readcube_id
57 schema:value f9fc7322906d2748a777f76cd17a4dda1dd6ada93eb8e0209a59517767e841b4
58 rdf:type schema:PropertyValue
59 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
60 schema:name Mathematical Sciences
61 rdf:type schema:DefinedTerm
62 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
63 schema:name Statistics
64 rdf:type schema:DefinedTerm
65 sg:person.014213763741.01 schema:affiliation https://www.grid.ac/institutes/grid.6054.7
66 schema:familyName Vitányi
67 schema:givenName Paul
68 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014213763741.01
69 rdf:type schema:Person
70 sg:person.0621576316.79 schema:affiliation https://www.grid.ac/institutes/grid.35030.35
71 schema:familyName Li
72 schema:givenName Ming
73 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621576316.79
74 rdf:type schema:Person
75 https://www.grid.ac/institutes/grid.35030.35 schema:alternateName City University of Hong Kong
76 schema:name Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
77 rdf:type schema:Organization
78 https://www.grid.ac/institutes/grid.6054.7 schema:alternateName Centrum Wiskunde and Informatica
79 schema:name CWI, Kruislaan 413, 1098 SJ Amsterdam, The Netherlands
80 rdf:type schema:Organization
 



