Bulk Loading the MKL-Tree


Ontology type: schema:Chapter     


Chapter Info

DATE

2003

AUTHORS

Annalisa Franco , Alessandra Lumini , Dario Maio

ABSTRACT

MKL-tree is a hierarchical, height-balanced structure for high-dimensional data indexing. This structure is based on data representation in a lower-dimensional space by means of the MKL transform, a multi-space generalization of the KL transform. A local dimensionality reduction is performed at each node of the tree, allowing more selective features to be extracted and thus increasing the discriminating power of the index. The dynamical version of MKL-tree presents two main drawbacks: first, the incremental loading of data points can produce very different structures and, as a consequence, different query performance, depending on the insertion order; second, the creation of the index can be very expensive, due to the high number of updates required. Since, in real applications, a large dataset is usually available at tree creation time, we propose a new bulk loading technique for MKL-tree, based on a recursive clustering of data objects. The new algorithm searches for an optimal partitioning of data points, in order to calculate the most suitable KL-subspaces to represent the dataset. Experimental results show that bulk loading can significantly improve the index performance with respect to the incremental insertion procedure, both in terms of effectiveness of similarity searches and of efficiency of the loading procedure.
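
The abstract describes the technique only at a high level. As a rough illustration, and not the authors' algorithm, the sketch below builds a balanced tree top-down by recursive partitioning and attaches a local KL (principal) axis to every node. Everything in it is an assumption made for illustration: a median split along the first principal axis stands in for the paper's clustering and optimal-partitioning search, a single axis stands in for a full KL subspace, and the names `principal_axis`, `bulk_load`, `leaves`, and `leaf_size` are hypothetical.

```python
import random


def principal_axis(points, iters=50, seed=0):
    """Return (mean, unit vector) for the first KL axis, found by power iteration."""
    dim = len(points[0])
    n = len(points)
    mu = [sum(p[i] for p in points) / n for i in range(dim)]
    centered = [[x - m for x, m in zip(p, mu)] for p in points]

    # Random unit start vector (seeded so the sketch is reproducible).
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]

    for _ in range(iters):
        # w is proportional to (covariance matrix) * v, computed without
        # forming the matrix: sum over points of (p . v) * p.
        w = [0.0] * dim
        for p in centered:
            dot = sum(a * b for a, b in zip(p, v))
            for i in range(dim):
                w[i] += dot * p[i]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # no variance along v (e.g. a single point)
            break
        v = [x / norm for x in w]
    return mu, v


def bulk_load(points, leaf_size=4):
    """Build a tree top-down; every node keeps its own local mean and axis."""
    mu, axis = principal_axis(points)
    node = {"mean": mu, "axis": axis}
    if len(points) <= leaf_size:
        node["points"] = points
        return node

    def projection(p):
        return sum((a - m) * b for a, m, b in zip(p, mu, axis))

    # Assumption: split at the median projection so both halves have
    # (nearly) equal size, which keeps the tree balanced.
    ordered = sorted(points, key=projection)
    half = len(ordered) // 2
    node["children"] = [
        bulk_load(ordered[:half], leaf_size),
        bulk_load(ordered[half:], leaf_size),
    ]
    return node


def leaves(node):
    """Collect leaf nodes from left to right."""
    if "points" in node:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# Sixteen points on the line y = 2x, loaded in one pass.
tree = bulk_load([[float(i), 2.0 * i] for i in range(16)], leaf_size=4)
print([len(leaf["points"]) for leaf in leaves(tree)])
```

Because the whole dataset is available up front, the resulting structure does not depend on any insertion order, which is the property the abstract contrasts with incremental loading.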

PAGES

119-128

Book

TITLE

Database and Expert Systems Applications

ISBN

978-3-540-40806-2
978-3-540-45227-0

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-45227-0_13

DOI

http://dx.doi.org/10.1007/978-3-540-45227-0_13

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1018631716



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "DEIS Universit\u00e0 di Bologna, viale Risorgimento 2, 40136, Bologna, Italy", 
          "id": "http://www.grid.ac/institutes/grid.6292.f", 
          "name": [
            "DEIS Universit\u00e0 di Bologna, viale Risorgimento 2, 40136, Bologna, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Franco", 
        "givenName": "Annalisa", 
        "id": "sg:person.011002501427.92", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011002501427.92"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "DEIS Universit\u00e0 di Bologna, viale Risorgimento 2, 40136, Bologna, Italy", 
          "id": "http://www.grid.ac/institutes/grid.6292.f", 
          "name": [
            "DEIS Universit\u00e0 di Bologna, viale Risorgimento 2, 40136, Bologna, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lumini", 
        "givenName": "Alessandra", 
        "id": "sg:person.01230440001.42", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01230440001.42"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "DEIS Universit\u00e0 di Bologna, viale Risorgimento 2, 40136, Bologna, Italy", 
          "id": "http://www.grid.ac/institutes/grid.6292.f", 
          "name": [
            "DEIS Universit\u00e0 di Bologna, viale Risorgimento 2, 40136, Bologna, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Maio", 
        "givenName": "Dario", 
        "id": "sg:person.013075040365.65", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013075040365.65"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2003", 
    "datePublishedReg": "2003-01-01", 
    "description": "MKL-tree is a hierarchical, height-balanced structure for high dimensional data indexing. This structure is based on data representation in a lower dimensional space by means of the MKL transform, a multi-space generalization of the KL transform. A local dimensionality reduction is performed at each node of the tree, allowing more selective features to be extracted and thus increasing the discriminating power of the index. The dynamical version of MKL-tree presents two main drawbacks: first, the incremental loading of data points can determine very different structures and, as a consequence, different query performance, depending on the insertion order; second, the creation of the index can be very expensive, due to the high number of updating required. Since, in real applications, a large dataset is usually available at the tree creation time, we propose a new bulk loading technique for MKL-tree, based on a recursive clustering of data objects. The new algorithm searches for an optimal partitioning of data points, in order to calculate the most suitable KL-subspaces to represent the dataset.Experimental results show that bulk loading can significantly improve the index performance with respect to the incremental insertion procedure, both in terms of effectiveness of similarity searches and of efficiency of the loading procedure.", 
    "editor": [
      {
        "familyName": "Ma\u0159\u00edk", 
        "givenName": "Vladim\u00edr", 
        "type": "Person"
      }, 
      {
        "familyName": "Retschitzegger", 
        "givenName": "Werner", 
        "type": "Person"
      }, 
      {
        "familyName": "\u0160t\u011bp\u00e1nkov\u00e1", 
        "givenName": "Olga", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-45227-0_13", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-540-40806-2", 
        "978-3-540-45227-0"
      ], 
      "name": "Database and Expert Systems Applications", 
      "type": "Book"
    }, 
    "keywords": [
      "high-dimensional data indexing", 
      "bulk loading techniques", 
      "new algorithm searches", 
      "data indexing", 
      "local dimensionality reduction", 
      "query performance", 
      "data objects", 
      "bulk loading", 
      "data representation", 
      "recursive clustering", 
      "data points", 
      "creation time", 
      "similarity search", 
      "terms of effectiveness", 
      "large datasets", 
      "dimensionality reduction", 
      "real applications", 
      "algorithm searches", 
      "insertion order", 
      "optimal partitioning", 
      "KL transform", 
      "dimensional space", 
      "selective features", 
      "experimental results", 
      "dataset", 
      "main drawback", 
      "indexing", 
      "search", 
      "clustering", 
      "performance", 
      "nodes", 
      "insertion procedure", 
      "transform", 
      "objects", 
      "index performance", 
      "representation", 
      "drawbacks", 
      "order", 
      "applications", 
      "partitioning", 
      "creation", 
      "incremental loading", 
      "effectiveness", 
      "trees", 
      "features", 
      "version", 
      "space", 
      "technique", 
      "point", 
      "efficiency", 
      "generalization", 
      "loading technique", 
      "different structures", 
      "terms", 
      "power", 
      "number", 
      "time", 
      "structure", 
      "procedure", 
      "results", 
      "higher number", 
      "means", 
      "dynamical version", 
      "respect", 
      "index", 
      "loading procedure", 
      "reduction", 
      "consequences", 
      "loading"
    ], 
    "name": "Bulk Loading the MKL-Tree", 
    "pagination": "119-128", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1018631716"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-45227-0_13"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-45227-0_13", 
      "https://app.dimensions.ai/details/publication/pub.1018631716"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-12-01T06:48", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/chapter/chapter_216.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-540-45227-0_13"
  }
]
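
As a minimal sketch of how a record shaped like the one above might be consumed, the following Python reads an abridged copy of the JSON-LD and pulls out the title, author names, DOI, and page range. The field names come from the record itself; the abridged `RECORD` string and the `summarize` helper are assumptions made for illustration.

```python
import json

# Abridged copy of the record above; only the fields used below are kept.
RECORD = """
[
  {
    "name": "Bulk Loading the MKL-Tree",
    "pagination": "119-128",
    "author": [
      {"familyName": "Franco", "givenName": "Annalisa"},
      {"familyName": "Lumini", "givenName": "Alessandra"},
      {"familyName": "Maio", "givenName": "Dario"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1018631716"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-540-45227-0_13"]}
    ]
  }
]
"""


def summarize(record):
    """Flatten one chapter record into a small citation-style dict."""
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    # productId is a list of name/value pairs; index it by name.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record.get("pagination"),
    }


summary = summarize(json.loads(RECORD)[0])
print(summary)
```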
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-45227-0_13'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-45227-0_13'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-45227-0_13'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-45227-0_13'
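
The same content negotiation can be sketched in Python with the standard library. The URL and media types are the ones used in the curl commands above; the names `MEDIA_TYPES`, `build_request`, and `fetch` are hypothetical, and the sketch assumes the endpoint is still reachable and honours the `Accept` header.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-540-45227-0_13"

# Short format labels (hypothetical) mapped to the media types shown above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Prepare a GET request that asks for one serialization via Accept."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch("turtle")` would then correspond to the Turtle curl command; only `build_request` can be exercised without network access.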


 

This table displays all metadata directly associated with this object as RDF triples.

152 TRIPLES      22 PREDICATES      94 URIs      87 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-45227-0_13 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nedc85d09bcd048dc9ce5c89e29cdd497
4 schema:datePublished 2003
5 schema:datePublishedReg 2003-01-01
6 schema:description MKL-tree is a hierarchical, height-balanced structure for high dimensional data indexing. This structure is based on data representation in a lower dimensional space by means of the MKL transform, a multi-space generalization of the KL transform. A local dimensionality reduction is performed at each node of the tree, allowing more selective features to be extracted and thus increasing the discriminating power of the index. The dynamical version of MKL-tree presents two main drawbacks: first, the incremental loading of data points can determine very different structures and, as a consequence, different query performance, depending on the insertion order; second, the creation of the index can be very expensive, due to the high number of updating required. Since, in real applications, a large dataset is usually available at the tree creation time, we propose a new bulk loading technique for MKL-tree, based on a recursive clustering of data objects. The new algorithm searches for an optimal partitioning of data points, in order to calculate the most suitable KL-subspaces to represent the dataset.Experimental results show that bulk loading can significantly improve the index performance with respect to the incremental insertion procedure, both in terms of effectiveness of similarity searches and of efficiency of the loading procedure.
7 schema:editor Nf047843bb4d741c2af144f4a92b51ce6
8 schema:genre chapter
9 schema:isAccessibleForFree false
10 schema:isPartOf N9093b1ff379a45d4a8cb0983fc1b4021
11 schema:keywords KL transform
12 algorithm searches
13 applications
14 bulk loading
15 bulk loading techniques
16 clustering
17 consequences
18 creation
19 creation time
20 data indexing
21 data objects
22 data points
23 data representation
24 dataset
25 different structures
26 dimensional space
27 dimensionality reduction
28 drawbacks
29 dynamical version
30 effectiveness
31 efficiency
32 experimental results
33 features
34 generalization
35 high-dimensional data indexing
36 higher number
37 incremental loading
38 index
39 index performance
40 indexing
41 insertion order
42 insertion procedure
43 large datasets
44 loading
45 loading procedure
46 loading technique
47 local dimensionality reduction
48 main drawback
49 means
50 new algorithm searches
51 nodes
52 number
53 objects
54 optimal partitioning
55 order
56 partitioning
57 performance
58 point
59 power
60 procedure
61 query performance
62 real applications
63 recursive clustering
64 reduction
65 representation
66 respect
67 results
68 search
69 selective features
70 similarity search
71 space
72 structure
73 technique
74 terms
75 terms of effectiveness
76 time
77 transform
78 trees
79 version
80 schema:name Bulk Loading the MKL-Tree
81 schema:pagination 119-128
82 schema:productId N88b15eade4d6450a8d4893ea4c938517
83 Nde1d34085ad04d2b91e5e101ddd680f9
84 schema:publisher N8c09d583442d4987bb3a3c621da655c1
85 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018631716
86 https://doi.org/10.1007/978-3-540-45227-0_13
87 schema:sdDatePublished 2022-12-01T06:48
88 schema:sdLicense https://scigraph.springernature.com/explorer/license/
89 schema:sdPublisher N39cbb0518c2f4511b2c5f501e5a2cfda
90 schema:url https://doi.org/10.1007/978-3-540-45227-0_13
91 sgo:license sg:explorer/license/
92 sgo:sdDataset chapters
93 rdf:type schema:Chapter
94 N39cbb0518c2f4511b2c5f501e5a2cfda schema:name Springer Nature - SN SciGraph project
95 rdf:type schema:Organization
96 N5019e588273f4c629a522aef1ffc4462 rdf:first sg:person.013075040365.65
97 rdf:rest rdf:nil
98 N656438d33e2c4e9ab9f5e22402d3ec44 schema:familyName Retschitzegger
99 schema:givenName Werner
100 rdf:type schema:Person
101 N88b15eade4d6450a8d4893ea4c938517 schema:name doi
102 schema:value 10.1007/978-3-540-45227-0_13
103 rdf:type schema:PropertyValue
104 N8c09d583442d4987bb3a3c621da655c1 schema:name Springer Nature
105 rdf:type schema:Organisation
106 N9093b1ff379a45d4a8cb0983fc1b4021 schema:isbn 978-3-540-40806-2
107 978-3-540-45227-0
108 schema:name Database and Expert Systems Applications
109 rdf:type schema:Book
110 N91a8fc57d53e444f99001de84d9cb76a rdf:first N656438d33e2c4e9ab9f5e22402d3ec44
111 rdf:rest Ncbb474dba26d497f9bfa63e27fc18cf5
112 Nc97fb14f99504725829216f195228c1d schema:familyName Štěpánková
113 schema:givenName Olga
114 rdf:type schema:Person
115 Ncbb474dba26d497f9bfa63e27fc18cf5 rdf:first Nc97fb14f99504725829216f195228c1d
116 rdf:rest rdf:nil
117 Ncd2df8f4b08d44e69f8221e2ccb7b4bd schema:familyName Mařík
118 schema:givenName Vladimír
119 rdf:type schema:Person
120 Nde1d34085ad04d2b91e5e101ddd680f9 schema:name dimensions_id
121 schema:value pub.1018631716
122 rdf:type schema:PropertyValue
123 Nedc85d09bcd048dc9ce5c89e29cdd497 rdf:first sg:person.011002501427.92
124 rdf:rest Nfc04190ded014284a1823daae8f5747e
125 Nf047843bb4d741c2af144f4a92b51ce6 rdf:first Ncd2df8f4b08d44e69f8221e2ccb7b4bd
126 rdf:rest N91a8fc57d53e444f99001de84d9cb76a
127 Nfc04190ded014284a1823daae8f5747e rdf:first sg:person.01230440001.42
128 rdf:rest N5019e588273f4c629a522aef1ffc4462
129 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
130 schema:name Information and Computing Sciences
131 rdf:type schema:DefinedTerm
132 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
133 schema:name Artificial Intelligence and Image Processing
134 rdf:type schema:DefinedTerm
135 sg:person.011002501427.92 schema:affiliation grid-institutes:grid.6292.f
136 schema:familyName Franco
137 schema:givenName Annalisa
138 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011002501427.92
139 rdf:type schema:Person
140 sg:person.01230440001.42 schema:affiliation grid-institutes:grid.6292.f
141 schema:familyName Lumini
142 schema:givenName Alessandra
143 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01230440001.42
144 rdf:type schema:Person
145 sg:person.013075040365.65 schema:affiliation grid-institutes:grid.6292.f
146 schema:familyName Maio
147 schema:givenName Dario
148 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013075040365.65
149 rdf:type schema:Person
150 grid-institutes:grid.6292.f schema:alternateName DEIS Università di Bologna, viale Risorgimento 2, 40136, Bologna, Italy
151 schema:name DEIS Università di Bologna, viale Risorgimento 2, 40136, Bologna, Italy
152 rdf:type schema:Organization
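
In the table above, the author order is encoded as an RDF collection: `schema:author` points at a blank node, and each blank node carries one `rdf:first` item and an `rdf:rest` link, ending at `rdf:nil`. The sketch below walks that chain using the node identifiers copied from rows 3, 96-97, 123-124, and 127-128; the dictionary layout and the `walk_rdf_list` helper are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"

# blank node -> (rdf:first, rdf:rest), copied from the triple table above.
LIST_NODES = {
    "Nedc85d09bcd048dc9ce5c89e29cdd497": (
        "sg:person.011002501427.92",
        "Nfc04190ded014284a1823daae8f5747e",
    ),
    "Nfc04190ded014284a1823daae8f5747e": (
        "sg:person.01230440001.42",
        "N5019e588273f4c629a522aef1ffc4462",
    ),
    "N5019e588273f4c629a522aef1ffc4462": (
        "sg:person.013075040365.65",
        RDF_NIL,
    ),
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head and return the rdf:first items in order."""
    items = []
    seen = set()
    while head != RDF_NIL:
        if head in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {head}")
        seen.add(head)
        first, rest = nodes[head]
        items.append(first)
        head = rest
    return items


# Row 3 of the table: schema:author points at this blank node.
authors_in_order = walk_rdf_list("Nedc85d09bcd048dc9ce5c89e29cdd497", LIST_NODES)
print(authors_in_order)
```

The editor list in the table (`schema:editor`, row 7) uses the same `rdf:first` / `rdf:rest` pattern and could be walked the same way.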
 



