Learning constraints in spreadsheets and tabular data


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2017-10

AUTHORS

Samuel Kolb, Sergey Paramonov, Tias Guns, Luc De Raedt

ABSTRACT

Spreadsheets, comma separated value files and other tabular data representations are in wide use today. However, writing, maintaining and identifying good formulas for tabular data and spreadsheets can be time-consuming and error-prone. We investigate the automatic learning of constraints (formulas and relations) in raw tabular data in an unsupervised way. We represent common spreadsheet formulas and relations through predicates and expressions whose arguments must satisfy the inherent properties of the constraint. The challenge is to automatically infer the set of constraints present in the data, without labeled examples or user feedback. We propose a two-stage generate and test method where the first stage uses constraint solving techniques to efficiently reduce the number of candidates, based on the predicate signatures. Our approach takes inspiration from inductive logic programming, constraint learning and constraint satisfaction. We show that we are able to accurately discover constraints in spreadsheets from various sources.

PAGES

1441-1468

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s10994-017-5640-x

DOI

http://dx.doi.org/10.1007/s10994-017-5640-x

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1085869747



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "KU Leuven", 
          "id": "https://www.grid.ac/institutes/grid.5596.f", 
          "name": [
            "KU Leuven, Leuven, Belgium"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kolb", 
        "givenName": "Samuel", 
        "id": "sg:person.016707313732.04", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016707313732.04"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "KU Leuven", 
          "id": "https://www.grid.ac/institutes/grid.5596.f", 
          "name": [
            "KU Leuven, Leuven, Belgium"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Paramonov", 
        "givenName": "Sergey", 
        "id": "sg:person.07443771633.15", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07443771633.15"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Vrije Universiteit Brussel", 
          "id": "https://www.grid.ac/institutes/grid.8767.e", 
          "name": [
            "Vrije Universiteit Brussel, Brussels, Belgium"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Guns", 
        "givenName": "Tias", 
        "id": "sg:person.015074144413.77", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015074144413.77"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "KU Leuven", 
          "id": "https://www.grid.ac/institutes/grid.5596.f", 
          "name": [
            "KU Leuven, Leuven, Belgium"
          ], 
          "type": "Organization"
        }, 
        "familyName": "De Raedt", 
        "givenName": "Luc", 
        "id": "sg:person.015333627665.77", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015333627665.77"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1016/0169-023x(94)90023-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009065409"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-68856-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014103340", 
          "https://doi.org/10.1007/978-3-540-68856-3"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-642-33558-7", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017971821", 
          "https://doi.org/10.1007/978-3-642-33558-7"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-0-387-30164-8_258", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030513665", 
          "https://doi.org/10.1007/978-0-387-30164-8_258"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/2535838.2535850", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033234568"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1926385.1926423", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034087112"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1007361123060", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034235085", 
          "https://doi.org/10.1023/a:1007361123060"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/11564096_8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041371725", 
          "https://doi.org/10.1007/11564096_8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/195705.195708", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042038622"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/cje/bet075", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045048920"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/comjnl/42.2.100", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059479210"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1257/aer.100.2.573", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1064525201"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3233/ida-2000-43-403", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1107708734"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2017-10", 
    "datePublishedReg": "2017-10-01", 
    "description": "Spreadsheets, comma separated value files and other tabular data representations are in wide use today. However, writing, maintaining and identifying good formulas for tabular data and spreadsheets can be time-consuming and error-prone. We investigate the automatic learning of constraints (formulas and relations) in raw tabular data in an unsupervised way. We represent common spreadsheet formulas and relations through predicates and expressions whose arguments must satisfy the inherent properties of the constraint. The challenge is to automatically infer the set of constraints present in the data, without labeled examples or user feedback. We propose a two-stage generate and test method where the first stage uses constraint solving techniques to efficiently reduce the number of candidates, based on the predicate signatures. Our approach takes inspiration from inductive logic programming, constraint learning and constraint satisfaction. We show that we are able to accurately discover constraints in spreadsheets from various sources.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1007/s10994-017-5640-x", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1125588", 
        "issn": [
          "0885-6125", 
          "1573-0565"
        ], 
        "name": "Machine Learning", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "9-10", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "106"
      }
    ], 
    "name": "Learning constraints in spreadsheets and tabular data", 
    "pagination": "1441-1468", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "ee8e771cea98aa210a69c8b7776605c7e30a1159ee7311eea3dd1acefe24174d"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s10994-017-5640-x"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1085869747"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s10994-017-5640-x", 
      "https://app.dimensions.ai/details/publication/pub.1085869747"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T10:38", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000349_0000000349/records_113677_00000004.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1007%2Fs10994-017-5640-x"
  }
]
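
As a rough illustration of how a record shaped like the one above could be consumed, the sketch below parses a JSON-LD array with Python's standard `json` module and pulls out the title, the author names, and the citation ids. The embedded `RECORD` string is a hypothetical, heavily trimmed input written for this example (one citation id is deliberately repeated), and `summarize` is an illustrative helper name, not part of any SciGraph tooling.

```python
import json

# Hypothetical trimmed input modeled on the record's field names.
# One citation id is repeated on purpose to exercise the de-duplication.
RECORD = """
[
  {
    "id": "sg:pub.10.1007/s10994-017-5640-x",
    "name": "Learning constraints in spreadsheets and tabular data",
    "author": [
      {"familyName": "Kolb", "givenName": "Samuel"},
      {"familyName": "Paramonov", "givenName": "Sergey"},
      {"familyName": "Guns", "givenName": "Tias"},
      {"familyName": "De Raedt", "givenName": "Luc"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1016/0169-023x(94)90023-x"},
      {"id": "https://doi.org/10.1016/0169-023x(94)90023-x"},
      {"id": "sg:pub.10.1007/11564096_8"}
    ],
    "pagination": "1441-1468"
  }
]
"""


def summarize(record_text):
    """Return the title, author names, and de-duplicated citation ids."""
    article = json.loads(record_text)[0]  # the record is a one-element array
    authors = [f"{a['givenName']} {a['familyName']}" for a in article["author"]]
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in article.get("citation", [])))
    return {"title": article["name"], "authors": authors, "citations": citations}


summary = summarize(RECORD)
print(summary["title"])
print(", ".join(summary["authors"]))
print(len(summary["citations"]), "distinct citation ids in this excerpt")
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field lookup; a JSON-LD processor would only be required to expand the `@context` into full IRIs.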
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10994-017-5640-x'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10994-017-5640-x'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10994-017-5640-x'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10994-017-5640-x'
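
The four curl commands differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the helper names `build_request` and `fetch` are assumptions made for illustration, and `fetch` presumes the endpoint is reachable. Only `build_request` is exercised below, so the example runs without network access.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/"

# Media types copied from the curl commands; the short keys are
# hypothetical labels chosen for this example.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a GET request for one serialization."""
    return urllib.request.Request(BASE + record_id, headers={"Accept": ACCEPT[fmt]})


def fetch(record_id, fmt="json-ld", timeout=10):
    """Send the request and return the body as text (may raise URLError)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


req = build_request("pub.10.1007/s10994-017-5640-x", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch("pub.10.1007/s10994-017-5640-x", "nt")` would then correspond to the N-Triples curl command, provided the service responds.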


 

This table displays all metadata directly associated with this object as RDF triples.

129 TRIPLES      21 PREDICATES      40 URIs      19 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s10994-017-5640-x schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N29dd1c21d15644b09c64cf8d96c42853
4 schema:citation sg:pub.10.1007/11564096_8
5 sg:pub.10.1007/978-0-387-30164-8_258
6 sg:pub.10.1007/978-3-540-68856-3
7 sg:pub.10.1007/978-3-642-33558-7
8 sg:pub.10.1023/a:1007361123060
9 https://doi.org/10.1016/0169-023x(94)90023-x
10 https://doi.org/10.1093/cje/bet075
11 https://doi.org/10.1093/comjnl/42.2.100
12 https://doi.org/10.1145/1926385.1926423
13 https://doi.org/10.1145/195705.195708
14 https://doi.org/10.1145/2535838.2535850
15 https://doi.org/10.1257/aer.100.2.573
16 https://doi.org/10.3233/ida-2000-43-403
17 schema:datePublished 2017-10
18 schema:datePublishedReg 2017-10-01
19 schema:description Spreadsheets, comma separated value files and other tabular data representations are in wide use today. However, writing, maintaining and identifying good formulas for tabular data and spreadsheets can be time-consuming and error-prone. We investigate the automatic learning of constraints (formulas and relations) in raw tabular data in an unsupervised way. We represent common spreadsheet formulas and relations through predicates and expressions whose arguments must satisfy the inherent properties of the constraint. The challenge is to automatically infer the set of constraints present in the data, without labeled examples or user feedback. We propose a two-stage generate and test method where the first stage uses constraint solving techniques to efficiently reduce the number of candidates, based on the predicate signatures. Our approach takes inspiration from inductive logic programming, constraint learning and constraint satisfaction. We show that we are able to accurately discover constraints in spreadsheets from various sources.
20 schema:genre research_article
21 schema:inLanguage en
22 schema:isAccessibleForFree true
23 schema:isPartOf Nca75c590f15649c6a81e6d8960a35a0f
24 Ne0815fc0e1aa4140b0433fafd4f72aab
25 sg:journal.1125588
26 schema:name Learning constraints in spreadsheets and tabular data
27 schema:pagination 1441-1468
28 schema:productId N129714c511e846fab963960bf4476e84
29 N14b971cdd6a1455f876550d2044fbe0d
30 N96608861281640e68924c73b60b61db5
31 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085869747
32 https://doi.org/10.1007/s10994-017-5640-x
33 schema:sdDatePublished 2019-04-11T10:38
34 schema:sdLicense https://scigraph.springernature.com/explorer/license/
35 schema:sdPublisher Ne00bd80dee4b477c847473449e80d11a
36 schema:url https://link.springer.com/10.1007%2Fs10994-017-5640-x
37 sgo:license sg:explorer/license/
38 sgo:sdDataset articles
39 rdf:type schema:ScholarlyArticle
40 N129714c511e846fab963960bf4476e84 schema:name doi
41 schema:value 10.1007/s10994-017-5640-x
42 rdf:type schema:PropertyValue
43 N14b971cdd6a1455f876550d2044fbe0d schema:name readcube_id
44 schema:value ee8e771cea98aa210a69c8b7776605c7e30a1159ee7311eea3dd1acefe24174d
45 rdf:type schema:PropertyValue
46 N29dd1c21d15644b09c64cf8d96c42853 rdf:first sg:person.016707313732.04
47 rdf:rest Ne8e65b192a034f639dae4ae457839d71
48 N3c5b4c9f13aa446eb782f761ac7d345e rdf:first sg:person.015074144413.77
49 rdf:rest N3d47e123cdc541d08cafada85c4dba7d
50 N3d47e123cdc541d08cafada85c4dba7d rdf:first sg:person.015333627665.77
51 rdf:rest rdf:nil
52 N96608861281640e68924c73b60b61db5 schema:name dimensions_id
53 schema:value pub.1085869747
54 rdf:type schema:PropertyValue
55 Nca75c590f15649c6a81e6d8960a35a0f schema:volumeNumber 106
56 rdf:type schema:PublicationVolume
57 Ne00bd80dee4b477c847473449e80d11a schema:name Springer Nature - SN SciGraph project
58 rdf:type schema:Organization
59 Ne0815fc0e1aa4140b0433fafd4f72aab schema:issueNumber 9-10
60 rdf:type schema:PublicationIssue
61 Ne8e65b192a034f639dae4ae457839d71 rdf:first sg:person.07443771633.15
62 rdf:rest N3c5b4c9f13aa446eb782f761ac7d345e
63 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
64 schema:name Information and Computing Sciences
65 rdf:type schema:DefinedTerm
66 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
67 schema:name Artificial Intelligence and Image Processing
68 rdf:type schema:DefinedTerm
69 sg:journal.1125588 schema:issn 0885-6125
70 1573-0565
71 schema:name Machine Learning
72 rdf:type schema:Periodical
73 sg:person.015074144413.77 schema:affiliation https://www.grid.ac/institutes/grid.8767.e
74 schema:familyName Guns
75 schema:givenName Tias
76 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015074144413.77
77 rdf:type schema:Person
78 sg:person.015333627665.77 schema:affiliation https://www.grid.ac/institutes/grid.5596.f
79 schema:familyName De Raedt
80 schema:givenName Luc
81 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015333627665.77
82 rdf:type schema:Person
83 sg:person.016707313732.04 schema:affiliation https://www.grid.ac/institutes/grid.5596.f
84 schema:familyName Kolb
85 schema:givenName Samuel
86 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016707313732.04
87 rdf:type schema:Person
88 sg:person.07443771633.15 schema:affiliation https://www.grid.ac/institutes/grid.5596.f
89 schema:familyName Paramonov
90 schema:givenName Sergey
91 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07443771633.15
92 rdf:type schema:Person
93 sg:pub.10.1007/11564096_8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041371725
94 https://doi.org/10.1007/11564096_8
95 rdf:type schema:CreativeWork
96 sg:pub.10.1007/978-0-387-30164-8_258 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030513665
97 https://doi.org/10.1007/978-0-387-30164-8_258
98 rdf:type schema:CreativeWork
99 sg:pub.10.1007/978-3-540-68856-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014103340
100 https://doi.org/10.1007/978-3-540-68856-3
101 rdf:type schema:CreativeWork
102 sg:pub.10.1007/978-3-642-33558-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017971821
103 https://doi.org/10.1007/978-3-642-33558-7
104 rdf:type schema:CreativeWork
105 sg:pub.10.1023/a:1007361123060 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034235085
106 https://doi.org/10.1023/a:1007361123060
107 rdf:type schema:CreativeWork
108 https://doi.org/10.1016/0169-023x(94)90023-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1009065409
109 rdf:type schema:CreativeWork
110 https://doi.org/10.1093/cje/bet075 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045048920
111 rdf:type schema:CreativeWork
112 https://doi.org/10.1093/comjnl/42.2.100 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059479210
113 rdf:type schema:CreativeWork
114 https://doi.org/10.1145/1926385.1926423 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034087112
115 rdf:type schema:CreativeWork
116 https://doi.org/10.1145/195705.195708 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042038622
117 rdf:type schema:CreativeWork
118 https://doi.org/10.1145/2535838.2535850 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033234568
119 rdf:type schema:CreativeWork
120 https://doi.org/10.1257/aer.100.2.573 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064525201
121 rdf:type schema:CreativeWork
122 https://doi.org/10.3233/ida-2000-43-403 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107708734
123 rdf:type schema:CreativeWork
124 https://www.grid.ac/institutes/grid.5596.f schema:alternateName KU Leuven
125 schema:name KU Leuven, Leuven, Belgium
126 rdf:type schema:Organization
127 https://www.grid.ac/institutes/grid.8767.e schema:alternateName Vrije Universiteit Brussel
128 schema:name Vrije Universiteit Brussel, Brussels, Belgium
129 rdf:type schema:Organization
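
In the table above, `schema:author` points to a blank node rather than to the people directly: the authors are stored as an RDF list, where each node carries one `rdf:first` (the item) and one `rdf:rest` (the next node, or `rdf:nil` at the end). The sketch below shows one way to walk such a list and recover the author order. The node ids and person ids are copied from rows 46 to 51 and 61 to 62, and the family names from rows 74, 79, 84 and 89; the dictionary layout and the function name `walk_rdf_list` are assumptions made for this example.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from the triples table.
LIST_NODES = {
    "N29dd1c21d15644b09c64cf8d96c42853": (
        "sg:person.016707313732.04", "Ne8e65b192a034f639dae4ae457839d71"),
    "Ne8e65b192a034f639dae4ae457839d71": (
        "sg:person.07443771633.15", "N3c5b4c9f13aa446eb782f761ac7d345e"),
    "N3c5b4c9f13aa446eb782f761ac7d345e": (
        "sg:person.015074144413.77", "N3d47e123cdc541d08cafada85c4dba7d"),
    "N3d47e123cdc541d08cafada85c4dba7d": (
        "sg:person.015333627665.77", RDF_NIL),
}

# schema:familyName values for each person, also from the table.
FAMILY_NAME = {
    "sg:person.016707313732.04": "Kolb",
    "sg:person.07443771633.15": "Paramonov",
    "sg:person.015074144413.77": "Guns",
    "sg:person.015333627665.77": "De Raedt",
}

# Object of the schema:author triple (row 3).
HEAD = "N29dd1c21d15644b09c64cf8d96c42853"


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head to rdf:nil, collecting rdf:first values."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


ordered_authors = [FAMILY_NAME[p] for p in walk_rdf_list(HEAD, LIST_NODES)]
print(ordered_authors)
```

The linked-list encoding is what preserves author order, since a plain set of `schema:author` triples would carry no ordering.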
 



