CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees View Full Text


Ontology type: schema:Chapter     


Chapter Info

DATE

2004

AUTHORS

Yun Chi , Yirong Yang , Yi Xia , Richard R. Muntz

ABSTRACT

Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. One important problem in mining databases of trees is to find frequently occurring subtrees. However, because of the combinatorial explosion, the number of frequent subtrees usually grows exponentially with the size of the subtrees. In this paper, we present CMTreeMiner, a computationally efficient algorithm that discovers all closed and maximal frequent subtrees in a database of rooted unordered trees. The algorithm mines both closed and maximal frequent subtrees by traversing an enumeration tree that systematically enumerates all subtrees, while using an enumeration DAG to prune the branches of the enumeration tree that do not correspond to closed or maximal frequent subtrees. The enumeration tree and the enumeration DAG are defined based on a canonical form for rooted unordered trees–the depth-first canonical form (DFCF). We compare the performance of our algorithm with that of PathJoin, a recently published algorithm that mines maximal frequent subtrees. More... »

PAGES

63-73

References to SciGraph publications

Book

TITLE

Advances in Knowledge Discovery and Data Mining

ISBN

978-3-540-22064-0
978-3-540-24775-3

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-24775-3_9

DOI

http://dx.doi.org/10.1007/978-3-540-24775-3_9

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1038916179


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of California Los Angeles", 
          "id": "https://www.grid.ac/institutes/grid.19006.3e", 
          "name": [
            "University of California, 90095, Los Angeles, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chi", 
        "givenName": "Yun", 
        "id": "sg:person.011112563323.65", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011112563323.65"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of California Los Angeles", 
          "id": "https://www.grid.ac/institutes/grid.19006.3e", 
          "name": [
            "University of California, 90095, Los Angeles, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yang", 
        "givenName": "Yirong", 
        "id": "sg:person.07700022521.38", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07700022521.38"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of California Los Angeles", 
          "id": "https://www.grid.ac/institutes/grid.19006.3e", 
          "name": [
            "University of California, 90095, Los Angeles, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Xia", 
        "givenName": "Yi", 
        "id": "sg:person.012444621406.88", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012444621406.88"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of California Los Angeles", 
          "id": "https://www.grid.ac/institutes/grid.19006.3e", 
          "name": [
            "University of California, 90095, Los Angeles, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Muntz", 
        "givenName": "Richard R.", 
        "id": "sg:person.01332504062.09", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01332504062.09"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1145/775047.775058", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013503742"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-39644-4_6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025309521", 
          "https://doi.org/10.1007/978-3-540-39644-4_6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-39644-4_6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025309521", 
          "https://doi.org/10.1007/978-3-540-39644-4_6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/956750.956787", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040031905"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-49257-7_25", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048579740", 
          "https://doi.org/10.1007/3-540-49257-7_25"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-49257-7_25", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048579740", 
          "https://doi.org/10.1007/3-540-49257-7_25"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/1.9781611972726.10", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1088799860"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icdm.2003.1250943", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095357941"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icdm.2003.1250964", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095548226"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2004", 
    "datePublishedReg": "2004-01-01", 
    "description": "Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. One important problem in mining databases of trees is to find frequently occurring subtrees. However, because of the combinatorial explosion, the number of frequent subtrees usually grows exponentially with the size of the subtrees. In this paper, we present CMTreeMiner, a computationally efficient algorithm that discovers all closed and maximal frequent subtrees in a database of rooted unordered trees. The algorithm mines both closed and maximal frequent subtrees by traversing an enumeration tree that systematically enumerates all subtrees, while using an enumeration DAG to prune the branches of the enumeration tree that do not correspond to closed or maximal frequent subtrees. The enumeration tree and the enumeration DAG are defined based on a canonical form for rooted unordered trees\u2013the depth-first canonical form (DFCF). We compare the performance of our algorithm with that of PathJoin, a recently published algorithm that mines maximal frequent subtrees.", 
    "editor": [
      {
        "familyName": "Dai", 
        "givenName": "Honghua", 
        "type": "Person"
      }, 
      {
        "familyName": "Srikant", 
        "givenName": "Ramakrishnan", 
        "type": "Person"
      }, 
      {
        "familyName": "Zhang", 
        "givenName": "Chengqi", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-24775-3_9", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-540-22064-0", 
        "978-3-540-24775-3"
      ], 
      "name": "Advances in Knowledge Discovery and Data Mining", 
      "type": "Book"
    }, 
    "name": "CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees", 
    "pagination": "63-73", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1038916179"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-24775-3_9"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "b960710a48df9e0d8546f4405b11516df57c151bfd107db534cc129001b78bb0"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-24775-3_9", 
      "https://app.dimensions.ai/details/publication/pub.1038916179"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T08:24", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000363_0000000363/records_70043_00000001.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F978-3-540-24775-3_9"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-24775-3_9'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-24775-3_9'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-24775-3_9'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-24775-3_9'


 

This table displays all metadata directly associated to this object as RDF triples.

119 TRIPLES      23 PREDICATES      34 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-24775-3_9 schema:about anzsrc-for:08
2 anzsrc-for:0806
3 schema:author Nb51fcf774a8449aa97f7f4995ede6a41
4 schema:citation sg:pub.10.1007/3-540-49257-7_25
5 sg:pub.10.1007/978-3-540-39644-4_6
6 https://doi.org/10.1109/icdm.2003.1250943
7 https://doi.org/10.1109/icdm.2003.1250964
8 https://doi.org/10.1137/1.9781611972726.10
9 https://doi.org/10.1145/775047.775058
10 https://doi.org/10.1145/956750.956787
11 schema:datePublished 2004
12 schema:datePublishedReg 2004-01-01
13 schema:description Tree structures are used extensively in domains such as computational biology, pattern recognition, XML databases, computer networks, and so on. One important problem in mining databases of trees is to find frequently occurring subtrees. However, because of the combinatorial explosion, the number of frequent subtrees usually grows exponentially with the size of the subtrees. In this paper, we present CMTreeMiner, a computationally efficient algorithm that discovers all closed and maximal frequent subtrees in a database of rooted unordered trees. The algorithm mines both closed and maximal frequent subtrees by traversing an enumeration tree that systematically enumerates all subtrees, while using an enumeration DAG to prune the branches of the enumeration tree that do not correspond to closed or maximal frequent subtrees. The enumeration tree and the enumeration DAG are defined based on a canonical form for rooted unordered trees–the depth-first canonical form (DFCF). We compare the performance of our algorithm with that of PathJoin, a recently published algorithm that mines maximal frequent subtrees.
14 schema:editor N57eb71f0867f4b2481d757bf24963f0c
15 schema:genre chapter
16 schema:inLanguage en
17 schema:isAccessibleForFree false
18 schema:isPartOf Nfad03495cdad4562a4adccf0e486c100
19 schema:name CMTreeMiner: Mining Both Closed and Maximal Frequent Subtrees
20 schema:pagination 63-73
21 schema:productId N19e1e0ee5063444ea4257cb0f34c708a
22 N586a3df46f7e472290190f8abb47d19f
23 N6103188176824e87a1ea39f8d62dbf6a
24 schema:publisher N08109745d48649baba4a00b08c998d43
25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038916179
26 https://doi.org/10.1007/978-3-540-24775-3_9
27 schema:sdDatePublished 2019-04-16T08:24
28 schema:sdLicense https://scigraph.springernature.com/explorer/license/
29 schema:sdPublisher N85eb67f7f6984d4990c7987bbe47bd3d
30 schema:url https://link.springer.com/10.1007%2F978-3-540-24775-3_9
31 sgo:license sg:explorer/license/
32 sgo:sdDataset chapters
33 rdf:type schema:Chapter
34 N05558e7e29374855b8f6436fb98c412d rdf:first N32bf09f4627643f9a4ef8bc680487981
35 rdf:rest rdf:nil
36 N08109745d48649baba4a00b08c998d43 schema:location Berlin, Heidelberg
37 schema:name Springer Berlin Heidelberg
38 rdf:type schema:Organisation
39 N19e1e0ee5063444ea4257cb0f34c708a schema:name dimensions_id
40 schema:value pub.1038916179
41 rdf:type schema:PropertyValue
42 N1c7682b7508e4042a2f3d430c0441836 schema:familyName Srikant
43 schema:givenName Ramakrishnan
44 rdf:type schema:Person
45 N32bf09f4627643f9a4ef8bc680487981 schema:familyName Zhang
46 schema:givenName Chengqi
47 rdf:type schema:Person
48 N3b0a3c69d7d14d3f8ae99166ccacc321 rdf:first sg:person.07700022521.38
49 rdf:rest N945e061dbce34dceb026e48dec3203ff
50 N57eb71f0867f4b2481d757bf24963f0c rdf:first Naea3fd7376c249bc8e4247ab0fa8ca6e
51 rdf:rest Nd4e753c8ee8745199cfeebdc1fa6ef8e
52 N586a3df46f7e472290190f8abb47d19f schema:name doi
53 schema:value 10.1007/978-3-540-24775-3_9
54 rdf:type schema:PropertyValue
55 N6103188176824e87a1ea39f8d62dbf6a schema:name readcube_id
56 schema:value b960710a48df9e0d8546f4405b11516df57c151bfd107db534cc129001b78bb0
57 rdf:type schema:PropertyValue
58 N85eb67f7f6984d4990c7987bbe47bd3d schema:name Springer Nature - SN SciGraph project
59 rdf:type schema:Organization
60 N945e061dbce34dceb026e48dec3203ff rdf:first sg:person.012444621406.88
61 rdf:rest Ndfce6f0625464b1bb32d5f834f3ba20e
62 Naea3fd7376c249bc8e4247ab0fa8ca6e schema:familyName Dai
63 schema:givenName Honghua
64 rdf:type schema:Person
65 Nb51fcf774a8449aa97f7f4995ede6a41 rdf:first sg:person.011112563323.65
66 rdf:rest N3b0a3c69d7d14d3f8ae99166ccacc321
67 Nd4e753c8ee8745199cfeebdc1fa6ef8e rdf:first N1c7682b7508e4042a2f3d430c0441836
68 rdf:rest N05558e7e29374855b8f6436fb98c412d
69 Ndfce6f0625464b1bb32d5f834f3ba20e rdf:first sg:person.01332504062.09
70 rdf:rest rdf:nil
71 Nfad03495cdad4562a4adccf0e486c100 schema:isbn 978-3-540-22064-0
72 978-3-540-24775-3
73 schema:name Advances in Knowledge Discovery and Data Mining
74 rdf:type schema:Book
75 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
76 schema:name Information and Computing Sciences
77 rdf:type schema:DefinedTerm
78 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
79 schema:name Information Systems
80 rdf:type schema:DefinedTerm
81 sg:person.011112563323.65 schema:affiliation https://www.grid.ac/institutes/grid.19006.3e
82 schema:familyName Chi
83 schema:givenName Yun
84 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011112563323.65
85 rdf:type schema:Person
86 sg:person.012444621406.88 schema:affiliation https://www.grid.ac/institutes/grid.19006.3e
87 schema:familyName Xia
88 schema:givenName Yi
89 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012444621406.88
90 rdf:type schema:Person
91 sg:person.01332504062.09 schema:affiliation https://www.grid.ac/institutes/grid.19006.3e
92 schema:familyName Muntz
93 schema:givenName Richard R.
94 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01332504062.09
95 rdf:type schema:Person
96 sg:person.07700022521.38 schema:affiliation https://www.grid.ac/institutes/grid.19006.3e
97 schema:familyName Yang
98 schema:givenName Yirong
99 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07700022521.38
100 rdf:type schema:Person
101 sg:pub.10.1007/3-540-49257-7_25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048579740
102 https://doi.org/10.1007/3-540-49257-7_25
103 rdf:type schema:CreativeWork
104 sg:pub.10.1007/978-3-540-39644-4_6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025309521
105 https://doi.org/10.1007/978-3-540-39644-4_6
106 rdf:type schema:CreativeWork
107 https://doi.org/10.1109/icdm.2003.1250943 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095357941
108 rdf:type schema:CreativeWork
109 https://doi.org/10.1109/icdm.2003.1250964 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095548226
110 rdf:type schema:CreativeWork
111 https://doi.org/10.1137/1.9781611972726.10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1088799860
112 rdf:type schema:CreativeWork
113 https://doi.org/10.1145/775047.775058 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013503742
114 rdf:type schema:CreativeWork
115 https://doi.org/10.1145/956750.956787 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040031905
116 rdf:type schema:CreativeWork
117 https://www.grid.ac/institutes/grid.19006.3e schema:alternateName University of California Los Angeles
118 schema:name University of California, 90095, Los Angeles, CA, USA
119 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...