Levelwise Search and Borders of Theories in Knowledge Discovery View Full Text


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

1997-09

AUTHORS

Heikki Mannila, Hannu Toivonen

ABSTRACT

One of the basic problems in knowledge discovery in databases (KDD) is the following: given a data set r, a class L of sentences for defining subgroups of r, and a selection predicate, find all sentences of L deemed interesting by the selection predicate. We analyze the simple levelwise algorithm for finding all such descriptions. We give bounds for the number of database accesses that the algorithm makes. For this, we introduce the concept of the border of a theory, a notion that turns out to be surprisingly powerful in analyzing the algorithm. We also consider the verification problem of a KDD process: given r and a set of sentences S L determine whether S is exactly the set of interesting statements about r. We show strong connections between the verification problem and the hypergraph transversal problem. The verification problem arises in a natural way when using sampling to speed up the pattern discovery step in KDD. More... »

PAGES

241-258

References to SciGraph publications

  • 1995-01. Efficient discovery of interesting statements in databases in JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
  • 1997-09. Discovery of Frequent Episodes in Event Sequences in DATA MINING AND KNOWLEDGE DISCOVERY
  • Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1023/a:1009796218281

    DOI

    http://dx.doi.org/10.1023/a:1009796218281

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1011497512


    Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
    Incoming Citations Browse incoming citations for this publication using opencitations.net

    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information Systems", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Helsinki", 
              "id": "https://www.grid.ac/institutes/grid.7737.4", 
              "name": [
                "Department of Computer Science, University of Helsinki, P.O. Box 26, FIN\u201300014, Finland", 
                "Department of Computer Science, University of Helsinki, P.O. Box 26, FIN\u201300014, Finland"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Mannila", 
            "givenName": "Heikki", 
            "id": "sg:person.01065176745.00", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065176745.00"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Helsinki", 
              "id": "https://www.grid.ac/institutes/grid.7737.4", 
              "name": [
                "Department of Computer Science, University of Helsinki, P.O. Box 26, FIN\u201300014, Finland", 
                "Department of Computer Science, University of Helsinki, P.O. Box 26, FIN\u201300014, Finland"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Toivonen", 
            "givenName": "Hannu", 
            "id": "sg:person.014126306365.50", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014126306365.50"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1023/a:1009748302351", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1005237640", 
              "https://doi.org/10.1023/a:1009748302351"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/240455.240472", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006761471"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0169-023x(94)90023-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009065409"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0169-023x(94)90023-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009065409"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0166-218x(92)90031-5", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1012092840"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bf00962822", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017336206", 
              "https://doi.org/10.1007/bf00962822"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bf00962822", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017336206", 
              "https://doi.org/10.1007/bf00962822"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/233269.233311", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019413618"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/int.4550070703", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1026054357"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/237661.237708", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027093368"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/170035.170072", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028726331"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/263661.263684", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031112924"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0022-0000(86)90015-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042700576"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/253260.253327", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045333834"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0004-3702(94)90112-0", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1046632075"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0004-3702(94)90112-0", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1046632075"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0004-3702(82)90040-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1049130265"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0004-3702(82)90040-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1049130265"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/jagm.1996.0062", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1051116080"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/191246.191314", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052019070"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1137/s0097539793250299", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062879828"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "1997-09", 
        "datePublishedReg": "1997-09-01", 
        "description": "One of the basic problems in knowledge discovery in databases (KDD) is the following: given a data set r, a class L of sentences for defining subgroups of r, and a selection predicate, find all sentences of L deemed interesting by the selection predicate. We analyze the simple levelwise algorithm for finding all such descriptions. We give bounds for the number of database accesses that the algorithm makes. For this, we introduce the concept of the border of a theory, a notion that turns out to be surprisingly powerful in analyzing the algorithm. We also consider the verification problem of a KDD process: given r and a set of sentences S L determine whether S is exactly the set of interesting statements about r. We show strong connections between the verification problem and the hypergraph transversal problem. The verification problem arises in a natural way when using sampling to speed up the pattern discovery step in KDD.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1023/a:1009796218281", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": [
          {
            "id": "sg:journal.1041853", 
            "issn": [
              "1384-5810", 
              "1573-756X"
            ], 
            "name": "Data Mining and Knowledge Discovery", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "3", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "1"
          }
        ], 
        "name": "Levelwise Search and Borders of Theories in Knowledge Discovery", 
        "pagination": "241-258", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "36c797cc5ba1f4a8928a6b89aba1df33bf1d5ef7455929d4355244e057acc3d0"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1023/a:1009796218281"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1011497512"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1023/a:1009796218281", 
          "https://app.dimensions.ai/details/publication/pub.1011497512"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-10T13:23", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8659_00000536.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "http://link.springer.com/10.1023%2FA%3A1009796218281"
      }
    ]
     

    Download the RDF metadata as:  json-ld nt turtle xml License info

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1009796218281'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1009796218281'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1009796218281'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1009796218281'


     

    This table displays all metadata directly associated to this object as RDF triples.

    121 TRIPLES      21 PREDICATES      44 URIs      19 LITERALS      7 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1023/a:1009796218281 schema:about anzsrc-for:08
    2 anzsrc-for:0806
    3 schema:author N234b82f1a2dc4b4d9cdb67dda621b5a8
    4 schema:citation sg:pub.10.1007/bf00962822
    5 sg:pub.10.1023/a:1009748302351
    6 https://doi.org/10.1002/int.4550070703
    7 https://doi.org/10.1006/jagm.1996.0062
    8 https://doi.org/10.1016/0004-3702(82)90040-6
    9 https://doi.org/10.1016/0004-3702(94)90112-0
    10 https://doi.org/10.1016/0022-0000(86)90015-2
    11 https://doi.org/10.1016/0166-218x(92)90031-5
    12 https://doi.org/10.1016/0169-023x(94)90023-x
    13 https://doi.org/10.1137/s0097539793250299
    14 https://doi.org/10.1145/170035.170072
    15 https://doi.org/10.1145/191246.191314
    16 https://doi.org/10.1145/233269.233311
    17 https://doi.org/10.1145/237661.237708
    18 https://doi.org/10.1145/240455.240472
    19 https://doi.org/10.1145/253260.253327
    20 https://doi.org/10.1145/263661.263684
    21 schema:datePublished 1997-09
    22 schema:datePublishedReg 1997-09-01
    23 schema:description One of the basic problems in knowledge discovery in databases (KDD) is the following: given a data set r, a class L of sentences for defining subgroups of r, and a selection predicate, find all sentences of L deemed interesting by the selection predicate. We analyze the simple levelwise algorithm for finding all such descriptions. We give bounds for the number of database accesses that the algorithm makes. For this, we introduce the concept of the border of a theory, a notion that turns out to be surprisingly powerful in analyzing the algorithm. We also consider the verification problem of a KDD process: given r and a set of sentences S L determine whether S is exactly the set of interesting statements about r. We show strong connections between the verification problem and the hypergraph transversal problem. The verification problem arises in a natural way when using sampling to speed up the pattern discovery step in KDD.
    24 schema:genre research_article
    25 schema:inLanguage en
    26 schema:isAccessibleForFree false
    27 schema:isPartOf N57ffbc2f5d29432981d66f95d63e7c6a
    28 N6325350e09d145bbb7affca1ed9a2e60
    29 sg:journal.1041853
    30 schema:name Levelwise Search and Borders of Theories in Knowledge Discovery
    31 schema:pagination 241-258
    32 schema:productId N03974e8422ad4e2e9cb8a6d63b01e058
    33 N77a905d4b51f485c8e07760f620bbe01
    34 Ncbfa81b09c84471d8a3167356e476d02
    35 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011497512
    36 https://doi.org/10.1023/a:1009796218281
    37 schema:sdDatePublished 2019-04-10T13:23
    38 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    39 schema:sdPublisher N4a60abb3e1964e5fa5bd0932ad21a5e7
    40 schema:url http://link.springer.com/10.1023%2FA%3A1009796218281
    41 sgo:license sg:explorer/license/
    42 sgo:sdDataset articles
    43 rdf:type schema:ScholarlyArticle
    44 N03974e8422ad4e2e9cb8a6d63b01e058 schema:name doi
    45 schema:value 10.1023/a:1009796218281
    46 rdf:type schema:PropertyValue
    47 N234b82f1a2dc4b4d9cdb67dda621b5a8 rdf:first sg:person.01065176745.00
    48 rdf:rest N62427fe79656460c9a9c311c03f2a469
    49 N4a60abb3e1964e5fa5bd0932ad21a5e7 schema:name Springer Nature - SN SciGraph project
    50 rdf:type schema:Organization
    51 N57ffbc2f5d29432981d66f95d63e7c6a schema:volumeNumber 1
    52 rdf:type schema:PublicationVolume
    53 N62427fe79656460c9a9c311c03f2a469 rdf:first sg:person.014126306365.50
    54 rdf:rest rdf:nil
    55 N6325350e09d145bbb7affca1ed9a2e60 schema:issueNumber 3
    56 rdf:type schema:PublicationIssue
    57 N77a905d4b51f485c8e07760f620bbe01 schema:name dimensions_id
    58 schema:value pub.1011497512
    59 rdf:type schema:PropertyValue
    60 Ncbfa81b09c84471d8a3167356e476d02 schema:name readcube_id
    61 schema:value 36c797cc5ba1f4a8928a6b89aba1df33bf1d5ef7455929d4355244e057acc3d0
    62 rdf:type schema:PropertyValue
    63 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    64 schema:name Information and Computing Sciences
    65 rdf:type schema:DefinedTerm
    66 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
    67 schema:name Information Systems
    68 rdf:type schema:DefinedTerm
    69 sg:journal.1041853 schema:issn 1384-5810
    70 1573-756X
    71 schema:name Data Mining and Knowledge Discovery
    72 rdf:type schema:Periodical
    73 sg:person.01065176745.00 schema:affiliation https://www.grid.ac/institutes/grid.7737.4
    74 schema:familyName Mannila
    75 schema:givenName Heikki
    76 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065176745.00
    77 rdf:type schema:Person
    78 sg:person.014126306365.50 schema:affiliation https://www.grid.ac/institutes/grid.7737.4
    79 schema:familyName Toivonen
    80 schema:givenName Hannu
    81 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014126306365.50
    82 rdf:type schema:Person
    83 sg:pub.10.1007/bf00962822 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017336206
    84 https://doi.org/10.1007/bf00962822
    85 rdf:type schema:CreativeWork
    86 sg:pub.10.1023/a:1009748302351 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005237640
    87 https://doi.org/10.1023/a:1009748302351
    88 rdf:type schema:CreativeWork
    89 https://doi.org/10.1002/int.4550070703 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026054357
    90 rdf:type schema:CreativeWork
    91 https://doi.org/10.1006/jagm.1996.0062 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051116080
    92 rdf:type schema:CreativeWork
    93 https://doi.org/10.1016/0004-3702(82)90040-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049130265
    94 rdf:type schema:CreativeWork
    95 https://doi.org/10.1016/0004-3702(94)90112-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046632075
    96 rdf:type schema:CreativeWork
    97 https://doi.org/10.1016/0022-0000(86)90015-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042700576
    98 rdf:type schema:CreativeWork
    99 https://doi.org/10.1016/0166-218x(92)90031-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012092840
    100 rdf:type schema:CreativeWork
    101 https://doi.org/10.1016/0169-023x(94)90023-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1009065409
    102 rdf:type schema:CreativeWork
    103 https://doi.org/10.1137/s0097539793250299 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062879828
    104 rdf:type schema:CreativeWork
    105 https://doi.org/10.1145/170035.170072 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028726331
    106 rdf:type schema:CreativeWork
    107 https://doi.org/10.1145/191246.191314 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052019070
    108 rdf:type schema:CreativeWork
    109 https://doi.org/10.1145/233269.233311 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019413618
    110 rdf:type schema:CreativeWork
    111 https://doi.org/10.1145/237661.237708 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027093368
    112 rdf:type schema:CreativeWork
    113 https://doi.org/10.1145/240455.240472 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006761471
    114 rdf:type schema:CreativeWork
    115 https://doi.org/10.1145/253260.253327 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045333834
    116 rdf:type schema:CreativeWork
    117 https://doi.org/10.1145/263661.263684 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031112924
    118 rdf:type schema:CreativeWork
    119 https://www.grid.ac/institutes/grid.7737.4 schema:alternateName University of Helsinki
    120 schema:name Department of Computer Science, University of Helsinki, P.O. Box 26, FIN–00014, Finland
    121 rdf:type schema:Organization
     




    Preview window. Press ESC to close (or click here)


    ...