Classification Techniques and Error Control in Logic Mining


Ontology type: schema:Chapter     


Chapter Info

DATE

2009-10-15

AUTHORS

Giovanni Felici , Bruno Simeone , Vincenzo Spinelli

ABSTRACT

In this chapter we consider box clustering, a method for supervised classification that partitions the feature space with particularly simple convex sets (boxes). Box clustering produces systems of logic rules obtained from data in numerical form. Such rules explicitly represent the logic relations hidden in the data w.r.t. a target class. The algorithm adopted to solve the box clustering problem is based on a simple and fast agglomerative method which can be affected by the initial choice of the starting point and by the rules adopted by the method. In this chapter we propose and motivate a randomized approach that generates a large number of candidate models using different data samples and then chooses the best candidate model according to two criteria: model size, as expressed by the number of boxes of the model, and model precision, as expressed by the error on the test split. We adopt a Pareto-optimal strategy for the choice of the solution, under the hypothesis that such a choice would identify simple models with good predictive power. This procedure has been applied to a wide range of well-known data sets to evaluate to what extent our results confirm this hypothesis; its performances are then compared with those of competing methods.
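
The two-criteria selection described in the abstract can be illustrated with a small Pareto filter. This is only a sketch under stated assumptions: the `Candidate` class, the function names, and all numbers are hypothetical and are not taken from the chapter, and the abstract does not say how a single model is picked from the Pareto-optimal set, so the sketch stops at computing that set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate box model (hypothetical container for this sketch)."""
    name: str
    n_boxes: int       # model size: number of boxes in the model
    test_error: float  # model precision: error on the test split


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both criteria and strictly better on one."""
    no_worse = a.n_boxes <= b.n_boxes and a.test_error <= b.test_error
    strictly_better = a.n_boxes < b.n_boxes or a.test_error < b.test_error
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Made-up candidates, standing in for models built from different data samples.
candidates = [
    Candidate("m1", n_boxes=12, test_error=0.08),
    Candidate("m2", n_boxes=5, test_error=0.10),
    Candidate("m3", n_boxes=5, test_error=0.15),   # dominated by m2
    Candidate("m4", n_boxes=20, test_error=0.08),  # dominated by m1
    Candidate("m5", n_boxes=3, test_error=0.22),
]
front = pareto_front(candidates)
print([c.name for c in front])  # ['m1', 'm2', 'm5']
```

Both criteria are minimized here; a model stays on the front when no other model is at least as small and at least as accurate while being strictly better on one of the two.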

PAGES

99-119

Book

TITLE

Data Mining

ISBN

978-1-4419-1279-4
978-1-4419-1280-0

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-1-4419-1280-0_5

DOI

http://dx.doi.org/10.1007/978-1-4419-1280-0_5

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1004841740



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1402", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Applied Economics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/14", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Economics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Istituto di Analisi dei Sistemi ed Informatica Antonio Ruberti", 
          "id": "https://www.grid.ac/institutes/grid.419461.f", 
          "name": [
            "Istituto di Analisi dei Sistemi ed Informatica \u2018Antonio Ruberti\u2019, Consiglio Nazionale delle Ricerche, Viale Manzoni, 30, 00185, Rome, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Felici", 
        "givenName": "Giovanni", 
        "id": "sg:person.0711271572.73", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0711271572.73"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Sapienza University of Rome", 
          "id": "https://www.grid.ac/institutes/grid.7841.a", 
          "name": [
            "Dipartimento di Statistica, Probabilit\u00e0 e Statistiche Applicate, Universit\u00e0 \u2018La Sapienza\u2019, Piazzale Aldo Moro 5, 00185, Rome, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Simeone", 
        "givenName": "Bruno", 
        "id": "sg:person.012600006066.78", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012600006066.78"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "ISTAT \u2013 Istituto Nazionale di Statistica, Via Tuscolana, 1788, 00173, Rome, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Spinelli", 
        "givenName": "Vincenzo", 
        "id": "sg:person.012112744414.39", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012112744414.39"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/bf02283750", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009714991", 
          "https://doi.org/10.1007/bf02283750"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.dam.2003.08.013", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019342117"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.dam.2004.05.002", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019668322"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf02614316", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032847525", 
          "https://doi.org/10.1007/bf02614316"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/0-387-34296-6_5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038738290", 
          "https://doi.org/10.1007/0-387-34296-6_5"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1020546910706", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045163626", 
          "https://doi.org/10.1023/a:1020546910706"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/69.842268", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061213820"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tkde.2005.50", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061661459"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icassp.2006.1661344", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093537920"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/mlsp.2006.275565", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094365064"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2009-10-15", 
    "datePublishedReg": "2009-10-15", 
    "description": "In this chapter we consider box clustering, a method for supervised classification that partitions the feature space with particularly simple convex sets (boxes). Box clustering produces systems of logic rules obtained from data in numerical form. Such rules explicitly represent the logic relations hidden in the data w.r.t. a target class. The algorithm adopted to solve the box clustering problem is based on a simple and fast agglomerative method which can be affected by the initial choice of the starting point and by the rules adopted by the method. In this chapter we propose and motivate a randomized approach that generates a large number of candidate models using different data samples and then chooses the best candidate model according to two criteria: model size, as expressed by the number of boxes of the model, and model precision, as expressed by the error on the test split. We adopt a Pareto-optimal strategy for the choice of the solution, under the hypothesis that such a choice would identify simple models with good predictive power. This procedure has been applied to a wide range of well-known data sets to evaluate to what extent our results confirm this hypothesis; its performances are then compared with those of competing methods.", 
    "editor": [
      {
        "familyName": "Stahlbock", 
        "givenName": "Robert", 
        "type": "Person"
      }, 
      {
        "familyName": "Crone", 
        "givenName": "Sven F.", 
        "type": "Person"
      }, 
      {
        "familyName": "Lessmann", 
        "givenName": "Stefan", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-1-4419-1280-0_5", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-1-4419-1279-4", 
        "978-1-4419-1280-0"
      ], 
      "name": "Data Mining", 
      "type": "Book"
    }, 
    "name": "Classification Techniques and Error Control in Logic Mining", 
    "pagination": "99-119", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1004841740"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-1-4419-1280-0_5"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "17125227966ef62501575fdae9f2602a24bf5534b494a4f64eef2d8594dc26a7"
        ]
      }
    ], 
    "publisher": {
      "location": "Boston, MA", 
      "name": "Springer US", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-1-4419-1280-0_5", 
      "https://app.dimensions.ai/details/publication/pub.1004841740"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T07:26", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000355_0000000355/records_52987_00000000.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F978-1-4419-1280-0_5"
  }
]
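
Because the record is plain JSON, it can be read with a standard JSON parser. The following is a minimal sketch, assuming the record has the shape shown above (a list holding one object with `name`, `author`, and `citation` keys). The `summarize` helper and the trimmed `RECORD_JSON` excerpt are hypothetical; the excerpt deliberately repeats one citation id to exercise the de-duplication.

```python
import json

# Trimmed excerpt in the shape of the record above; only the fields used here.
RECORD_JSON = """
[
  {
    "name": "Classification Techniques and Error Control in Logic Mining",
    "author": [
      {"familyName": "Felici", "givenName": "Giovanni"},
      {"familyName": "Simeone", "givenName": "Bruno"},
      {"familyName": "Spinelli", "givenName": "Vincenzo"}
    ],
    "citation": [
      {"id": "sg:pub.10.1007/bf02283750"},
      {"id": "sg:pub.10.1007/bf02283750"},
      {"id": "https://doi.org/10.1016/j.dam.2003.08.013"}
    ]
  }
]
"""


def summarize(record: dict) -> dict:
    """Pull the title, author names, and unique citation ids from one record."""
    authors = [f"{a['givenName']} {a['familyName']}"
               for a in record.get("author", [])]
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {"title": record["name"], "authors": authors, "citations": citations}


records = json.loads(RECORD_JSON)
summary = summarize(records[0])
print(summary["title"])
print(", ".join(summary["authors"]))
print(len(summary["citations"]), "unique citations")
```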
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-1-4419-1280-0_5'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-1-4419-1280-0_5'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-1-4419-1280-0_5'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-1-4419-1280-0_5'
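
The same content negotiation can be sketched in Python with the standard library. The URL and the four `Accept` values are copied from the curl commands above; the names `ACCEPT_BY_FORMAT`, `build_request`, and `fetch` are hypothetical, and the sketch does not verify that the endpoint still responds, so nothing is fetched unless `fetch` is called explicitly.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-1-4419-1280-0_5"

# Media types taken from the curl examples; the short keys are our own labels.
ACCEPT_BY_FORMAT = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Build a GET request whose Accept header selects the serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_BY_FORMAT[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network.
req = build_request("turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch("json-ld")` would be the counterpart of the first curl command; an unknown format label raises `KeyError` before any request is sent.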


 

This table displays all metadata directly associated with this object as RDF triples.

128 TRIPLES      23 PREDICATES      36 URIs      19 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-1-4419-1280-0_5 schema:about anzsrc-for:14
2 anzsrc-for:1402
3 schema:author N50af729f0a5641f9af238e89a60f7a72
4 schema:citation sg:pub.10.1007/0-387-34296-6_5
5 sg:pub.10.1007/bf02283750
6 sg:pub.10.1007/bf02614316
7 sg:pub.10.1023/a:1020546910706
8 https://doi.org/10.1016/j.dam.2003.08.013
9 https://doi.org/10.1016/j.dam.2004.05.002
10 https://doi.org/10.1109/69.842268
11 https://doi.org/10.1109/icassp.2006.1661344
12 https://doi.org/10.1109/mlsp.2006.275565
13 https://doi.org/10.1109/tkde.2005.50
14 schema:datePublished 2009-10-15
15 schema:datePublishedReg 2009-10-15
16 schema:description In this chapter we consider box clustering, a method for supervised classification that partitions the feature space with particularly simple convex sets (boxes). Box clustering produces systems of logic rules obtained from data in numerical form. Such rules explicitly represent the logic relations hidden in the data w.r.t. a target class. The algorithm adopted to solve the box clustering problem is based on a simple and fast agglomerative method which can be affected by the initial choice of the starting point and by the rules adopted by the method. In this chapter we propose and motivate a randomized approach that generates a large number of candidate models using different data samples and then chooses the best candidate model according to two criteria: model size, as expressed by the number of boxes of the model, and model precision, as expressed by the error on the test split. We adopt a Pareto-optimal strategy for the choice of the solution, under the hypothesis that such a choice would identify simple models with good predictive power. This procedure has been applied to a wide range of well-known data sets to evaluate to what extent our results confirm this hypothesis; its performances are then compared with those of competing methods.
17 schema:editor Nb807d9c1c4eb4659a990e02349f5c9fc
18 schema:genre chapter
19 schema:inLanguage en
20 schema:isAccessibleForFree false
21 schema:isPartOf N6ebc60cc3c414fb0880778ecaf2a851c
22 schema:name Classification Techniques and Error Control in Logic Mining
23 schema:pagination 99-119
24 schema:productId N4190a020ef074f30a2795b84cd8e4793
25 N68baa1113f03414599ddd95b6fb56170
26 N94ce1ccb374f43dc90143445bebe5c5c
27 schema:publisher Nb7c8dd530c1445fc93da67d469955431
28 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004841740
29 https://doi.org/10.1007/978-1-4419-1280-0_5
30 schema:sdDatePublished 2019-04-16T07:26
31 schema:sdLicense https://scigraph.springernature.com/explorer/license/
32 schema:sdPublisher Nfb95177665c44b439e363bad15fa7777
33 schema:url https://link.springer.com/10.1007%2F978-1-4419-1280-0_5
34 sgo:license sg:explorer/license/
35 sgo:sdDataset chapters
36 rdf:type schema:Chapter
37 N0f5dbe08ff894dbbb03873691c603f79 schema:familyName Lessmann
38 schema:givenName Stefan
39 rdf:type schema:Person
40 N1922991921684c6d93410cdd838a354d schema:familyName Crone
41 schema:givenName Sven F.
42 rdf:type schema:Person
43 N4190a020ef074f30a2795b84cd8e4793 schema:name doi
44 schema:value 10.1007/978-1-4419-1280-0_5
45 rdf:type schema:PropertyValue
46 N4f0ec45b9214459a8e2a844e1cdd9812 rdf:first sg:person.012600006066.78
47 rdf:rest N55665420b6704578861ae4ab7eeebd82
48 N50af729f0a5641f9af238e89a60f7a72 rdf:first sg:person.0711271572.73
49 rdf:rest N4f0ec45b9214459a8e2a844e1cdd9812
50 N55665420b6704578861ae4ab7eeebd82 rdf:first sg:person.012112744414.39
51 rdf:rest rdf:nil
52 N68baa1113f03414599ddd95b6fb56170 schema:name dimensions_id
53 schema:value pub.1004841740
54 rdf:type schema:PropertyValue
55 N6ebc60cc3c414fb0880778ecaf2a851c schema:isbn 978-1-4419-1279-4
56 978-1-4419-1280-0
57 schema:name Data Mining
58 rdf:type schema:Book
59 N94ce1ccb374f43dc90143445bebe5c5c schema:name readcube_id
60 schema:value 17125227966ef62501575fdae9f2602a24bf5534b494a4f64eef2d8594dc26a7
61 rdf:type schema:PropertyValue
62 N9dd0fff75393415fa7bb214b9945d42f rdf:first N1922991921684c6d93410cdd838a354d
63 rdf:rest Nb2d8464f09c641bd8e496bff7d11da84
64 Nb03e0d2730154ba8ad4570eef17cb8df schema:name ISTAT – Istituto Nazionale di Statistica, Via Tuscolana, 1788, 00173, Rome, Italy
65 rdf:type schema:Organization
66 Nb2d8464f09c641bd8e496bff7d11da84 rdf:first N0f5dbe08ff894dbbb03873691c603f79
67 rdf:rest rdf:nil
68 Nb7c8dd530c1445fc93da67d469955431 schema:location Boston, MA
69 schema:name Springer US
70 rdf:type schema:Organisation
71 Nb807d9c1c4eb4659a990e02349f5c9fc rdf:first Nff90771a8eaf4c499ffae57aa6c2dfe9
72 rdf:rest N9dd0fff75393415fa7bb214b9945d42f
73 Nfb95177665c44b439e363bad15fa7777 schema:name Springer Nature - SN SciGraph project
74 rdf:type schema:Organization
75 Nff90771a8eaf4c499ffae57aa6c2dfe9 schema:familyName Stahlbock
76 schema:givenName Robert
77 rdf:type schema:Person
78 anzsrc-for:14 schema:inDefinedTermSet anzsrc-for:
79 schema:name Economics
80 rdf:type schema:DefinedTerm
81 anzsrc-for:1402 schema:inDefinedTermSet anzsrc-for:
82 schema:name Applied Economics
83 rdf:type schema:DefinedTerm
84 sg:person.012112744414.39 schema:affiliation Nb03e0d2730154ba8ad4570eef17cb8df
85 schema:familyName Spinelli
86 schema:givenName Vincenzo
87 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012112744414.39
88 rdf:type schema:Person
89 sg:person.012600006066.78 schema:affiliation https://www.grid.ac/institutes/grid.7841.a
90 schema:familyName Simeone
91 schema:givenName Bruno
92 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012600006066.78
93 rdf:type schema:Person
94 sg:person.0711271572.73 schema:affiliation https://www.grid.ac/institutes/grid.419461.f
95 schema:familyName Felici
96 schema:givenName Giovanni
97 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0711271572.73
98 rdf:type schema:Person
99 sg:pub.10.1007/0-387-34296-6_5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038738290
100 https://doi.org/10.1007/0-387-34296-6_5
101 rdf:type schema:CreativeWork
102 sg:pub.10.1007/bf02283750 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009714991
103 https://doi.org/10.1007/bf02283750
104 rdf:type schema:CreativeWork
105 sg:pub.10.1007/bf02614316 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032847525
106 https://doi.org/10.1007/bf02614316
107 rdf:type schema:CreativeWork
108 sg:pub.10.1023/a:1020546910706 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045163626
109 https://doi.org/10.1023/a:1020546910706
110 rdf:type schema:CreativeWork
111 https://doi.org/10.1016/j.dam.2003.08.013 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019342117
112 rdf:type schema:CreativeWork
113 https://doi.org/10.1016/j.dam.2004.05.002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019668322
114 rdf:type schema:CreativeWork
115 https://doi.org/10.1109/69.842268 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061213820
116 rdf:type schema:CreativeWork
117 https://doi.org/10.1109/icassp.2006.1661344 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093537920
118 rdf:type schema:CreativeWork
119 https://doi.org/10.1109/mlsp.2006.275565 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094365064
120 rdf:type schema:CreativeWork
121 https://doi.org/10.1109/tkde.2005.50 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061661459
122 rdf:type schema:CreativeWork
123 https://www.grid.ac/institutes/grid.419461.f schema:alternateName Istituto di Analisi dei Sistemi ed Informatica Antonio Ruberti
124 schema:name Istituto di Analisi dei Sistemi ed Informatica ‘Antonio Ruberti’, Consiglio Nazionale delle Ricerche, Viale Manzoni, 30, 00185, Rome, Italy
125 rdf:type schema:Organization
126 https://www.grid.ac/institutes/grid.7841.a schema:alternateName Sapienza University of Rome
127 schema:name Dipartimento di Statistica, Probabilità e Statistiche Applicate, Università ‘La Sapienza’, Piazzale Aldo Moro 5, 00185, Rome, Italy
128 rdf:type schema:Organization
 



