Heterogeneous Replicas for Multi-dimensional Data Management


Ontology type: schema:Chapter     


Chapter Info

DATE

2020-09-18

AUTHORS

Jialin Qiao, Yuyuan Kang, Xiangdong Huang, Lei Rui, Tian Jiang, Jianmin Wang, Philip S. Yu

ABSTRACT

Multi-dimensional data is widely used in different scenarios, such as cluster monitoring and user behavior analysis for web services. The data is usually managed by distributed databases with a replication strategy, which enhances the availability, fault-tolerance, and I/O throughput. Normally, these replicas share the same physical layout on the disk, which is designed by database administrators according to the target workload. However, it is critical to derive an optimal layout that benefits as many queries as possible, because a layout that accommodates only some queries can negatively impact the others. To tackle this limitation, we propose heterogeneous replicas for multi-dimensional data that provide a higher query throughput without additional disk occupation and without slowing down the writing speed, while still ensuring high availability and load balance. The proposed replication method allows different replicas to be logically identical while having different physical data layouts on the disk. We verified the efficiency of our method in a NoSQL system, Cassandra, with the TPC-H dataset and with a synthetically generated dataset. The results show that our method outperforms state-of-the-art solutions.

PAGES

20-36

Book

TITLE

Database Systems for Advanced Applications

ISBN

978-3-030-59409-1
978-3-030-59410-7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-030-59410-7_2

DOI

http://dx.doi.org/10.1007/978-3-030-59410-7_2

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1131073835



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Research Center for Big Data, Tsinghua University, Beijing, China", 
          "id": "http://www.grid.ac/institutes/grid.12527.33", 
          "name": [
            "KLiss, MOE; BNRist; School of Software, Tsinghua University, Beijing, China", 
            "Research Center for Big Data, Tsinghua University, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Qiao", 
        "givenName": "Jialin", 
        "id": "sg:person.013540351275.06", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013540351275.06"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Research Center for Big Data, Tsinghua University, Beijing, China", 
          "id": "http://www.grid.ac/institutes/grid.12527.33", 
          "name": [
            "KLiss, MOE; BNRist; School of Software, Tsinghua University, Beijing, China", 
            "Research Center for Big Data, Tsinghua University, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kang", 
        "givenName": "Yuyuan", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Research Center for Big Data, Tsinghua University, Beijing, China", 
          "id": "http://www.grid.ac/institutes/grid.12527.33", 
          "name": [
            "KLiss, MOE; BNRist; School of Software, Tsinghua University, Beijing, China", 
            "Research Center for Big Data, Tsinghua University, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Huang", 
        "givenName": "Xiangdong", 
        "id": "sg:person.011010233413.90", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011010233413.90"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Research Center for Big Data, Tsinghua University, Beijing, China", 
          "id": "http://www.grid.ac/institutes/grid.12527.33", 
          "name": [
            "KLiss, MOE; BNRist; School of Software, Tsinghua University, Beijing, China", 
            "Research Center for Big Data, Tsinghua University, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Rui", 
        "givenName": "Lei", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Research Center for Big Data, Tsinghua University, Beijing, China", 
          "id": "http://www.grid.ac/institutes/grid.12527.33", 
          "name": [
            "KLiss, MOE; BNRist; School of Software, Tsinghua University, Beijing, China", 
            "Research Center for Big Data, Tsinghua University, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Jiang", 
        "givenName": "Tian", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Research Center for Big Data, Tsinghua University, Beijing, China", 
          "id": "http://www.grid.ac/institutes/grid.12527.33", 
          "name": [
            "KLiss, MOE; BNRist; School of Software, Tsinghua University, Beijing, China", 
            "Research Center for Big Data, Tsinghua University, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Jianmin", 
        "id": "sg:person.012303351315.43", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012303351315.43"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Illinois, Champaign, IL, USA", 
          "id": "http://www.grid.ac/institutes/grid.35403.31", 
          "name": [
            "University of Illinois, Champaign, IL, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yu", 
        "givenName": "Philip S.", 
        "id": "sg:person.011016356115.95", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011016356115.95"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2020-09-18", 
    "datePublishedReg": "2020-09-18", 
    "description": "Multi-dimensional data is widely used in different scenarios, such as cluster monitoring and user behavior analysis for web services. The data is usually managed by distributed databases with a replication strategy, which enhances the availability, fault-tolerance, and I/O throughput. Normally, these replicas share the same physical layout on the disk, which is designed by database administrators according to the target workload. However, it is critical to derive an optimal layout that benefits as many queries as possible, because a layout that accommodates only some queries can negatively impact the others. To tackle this limitation, we propose heterogeneous replicas for multi-dimensional data that provide a higher query throughput without additional disk occupation and without slowing down the writing speed, while still ensuring high availability and load balance. The proposed replication method allows different replicas to be logically identical while having different physical data layouts on the disk. We verified the efficiency of our method in a NoSQL system, Cassandra, with the TPC-H dataset and with a synthetically generated dataset. The results show that our method outperforms state-of-the-art solutions.", 
    "editor": [
      {
        "familyName": "Nah", 
        "givenName": "Yunmook", 
        "type": "Person"
      }, 
      {
        "familyName": "Cui", 
        "givenName": "Bin", 
        "type": "Person"
      }, 
      {
        "familyName": "Lee", 
        "givenName": "Sang-Won", 
        "type": "Person"
      }, 
      {
        "familyName": "Yu", 
        "givenName": "Jeffrey Xu", 
        "type": "Person"
      }, 
      {
        "familyName": "Moon", 
        "givenName": "Yang-Sae", 
        "type": "Person"
      }, 
      {
        "familyName": "Whang", 
        "givenName": "Steven Euijong", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-030-59410-7_2", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-030-59409-1", 
        "978-3-030-59410-7"
      ], 
      "name": "Database Systems for Advanced Applications", 
      "type": "Book"
    }, 
    "keywords": [
      "multi-dimensional data", 
      "heterogeneous replicas", 
      "multi-dimensional data management", 
      "high query throughput", 
      "user behavior analysis", 
      "TPC-H dataset", 
      "physical data layout", 
      "web services", 
      "query throughput", 
      "NoSQL systems", 
      "same physical layout", 
      "database administrators", 
      "data layout", 
      "O throughput", 
      "data management", 
      "load balance", 
      "art solutions", 
      "cluster monitoring", 
      "target workload", 
      "different replicas", 
      "high availability", 
      "replication strategy", 
      "queries", 
      "behavior analysis", 
      "different scenarios", 
      "replication method", 
      "dataset", 
      "physical layout", 
      "throughput", 
      "replicas", 
      "layout", 
      "optimal layout", 
      "Cassandra", 
      "workload", 
      "services", 
      "scenarios", 
      "database", 
      "data", 
      "method", 
      "availability", 
      "administrators", 
      "writing speed", 
      "system", 
      "speed", 
      "monitoring", 
      "efficiency", 
      "solution", 
      "limitations", 
      "management", 
      "strategies", 
      "state", 
      "results", 
      "disk", 
      "analysis", 
      "balance", 
      "occupation"
    ], 
    "name": "Heterogeneous Replicas for Multi-dimensional Data Management", 
    "pagination": "20-36", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1131073835"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-030-59410-7_2"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-030-59410-7_2", 
      "https://app.dimensions.ai/details/publication/pub.1131073835"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-05-20T07:47", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/chapter/chapter_384.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-030-59410-7_2"
  }
]
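
The record above is plain JSON, so the standard library is enough to read it. The following is a minimal Python sketch, assuming the record has been saved as text; `RECORD` is an abbreviated excerpt of the JSON-LD above (most fields and authors omitted), and `summarize` is a hypothetical helper name, not part of any SciGraph tooling.

```python
import json

# Abbreviated excerpt of the record above; only the fields used here are kept.
RECORD = """
[
  {
    "id": "sg:pub.10.1007/978-3-030-59410-7_2",
    "name": "Heterogeneous Replicas for Multi-dimensional Data Management",
    "pagination": "20-36",
    "author": [
      {"familyName": "Qiao", "givenName": "Jialin", "id": "sg:person.013540351275.06"},
      {"familyName": "Kang", "givenName": "Yuyuan"}
    ],
    "productId": [
      {"name": "dimensions_id", "value": ["pub.1131073835"]},
      {"name": "doi", "value": ["10.1007/978-3-030-59410-7_2"]}
    ]
  }
]
"""


def summarize(record_text):
    """Return title, author names, and DOI from a SciGraph-style JSON-LD array."""
    chapter = json.loads(record_text)[0]  # the record is a one-element array
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in chapter["author"]]
    # productId is a list of name/value pairs; pick the entry named "doi".
    doi = next(p["value"][0] for p in chapter["productId"] if p["name"] == "doi")
    return {"title": chapter["name"], "authors": authors, "doi": doi}


summary = summarize(RECORD)
```

Note that not every author object carries an `id` or `sameAs` field, so code that reads those keys should treat them as optional.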
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-59410-7_2'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-59410-7_2'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-59410-7_2'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-59410-7_2'
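
The four curl commands above differ only in the `Accept` header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same idea follows, using only the standard library. The URL and media types are copied from the commands above; the short format labels, `build_request`, and `fetch` are illustrative names, and whether the service still answers these requests is not something this sketch can guarantee.

```python
import urllib.request

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-030-59410-7_2"

# Media types from the commands above, keyed by illustrative short labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(RECORD_URL, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, timeout=10.0):
    """Send the request and return the body as text (requires network access)."""
    with urllib.request.urlopen(build_request(fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request is side-effect free; call fetch("turtle") to download.
request = build_request("turtle")
```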


 

This table displays all metadata directly associated with this object as RDF triples.

184 TRIPLES      23 PREDICATES      81 URIs      74 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-030-59410-7_2 schema:about anzsrc-for:08
2 anzsrc-for:0806
3 schema:author Nead8598e40c54f28b0f5a50ecbe54e4a
4 schema:datePublished 2020-09-18
5 schema:datePublishedReg 2020-09-18
6 schema:description Multi-dimensional data is widely used in different scenarios, such as cluster monitoring and user behavior analysis for web services. The data is usually managed by distributed databases with a replication strategy, which enhances the availability, fault-tolerance, and I/O throughput. Normally, these replicas share the same physical layout on the disk, which is designed by database administrators according to the target workload. However, it is critical to derive an optimal layout that benefits as many queries as possible, because a layout that accommodates only some queries can negatively impact the others. To tackle this limitation, we propose heterogeneous replicas for multi-dimensional data that provide a higher query throughput without additional disk occupation and without slowing down the writing speed, while still ensuring high availability and load balance. The proposed replication method allows different replicas to be logically identical while having different physical data layouts on the disk. We verified the efficiency of our method in a NoSQL system, Cassandra, with the TPC-H dataset and with a synthetically generated dataset. The results show that our method outperforms state-of-the-art solutions.
7 schema:editor Neb39c59065ba49eba77a739ac1d21d47
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N7544453a782147d4b4f55b325be0576d
12 schema:keywords Cassandra
13 NoSQL systems
14 O throughput
15 TPC-H dataset
16 administrators
17 analysis
18 art solutions
19 availability
20 balance
21 behavior analysis
22 cluster monitoring
23 data
24 data layout
25 data management
26 database
27 database administrators
28 dataset
29 different replicas
30 different scenarios
31 disk
32 efficiency
33 heterogeneous replicas
34 high availability
35 high query throughput
36 layout
37 limitations
38 load balance
39 management
40 method
41 monitoring
42 multi-dimensional data
43 multi-dimensional data management
44 occupation
45 optimal layout
46 physical data layout
47 physical layout
48 queries
49 query throughput
50 replicas
51 replication method
52 replication strategy
53 results
54 same physical layout
55 scenarios
56 services
57 solution
58 speed
59 state
60 strategies
61 system
62 target workload
63 throughput
64 user behavior analysis
65 web services
66 workload
67 writing speed
68 schema:name Heterogeneous Replicas for Multi-dimensional Data Management
69 schema:pagination 20-36
70 schema:productId N28aaa2976db9405bac1d18dfb21091e8
71 Nd6d6cd3997ff4cc69f03c2f025b6f2eb
72 schema:publisher Na95315237498492995db9b340fbf3407
73 schema:sameAs https://app.dimensions.ai/details/publication/pub.1131073835
74 https://doi.org/10.1007/978-3-030-59410-7_2
75 schema:sdDatePublished 2022-05-20T07:47
76 schema:sdLicense https://scigraph.springernature.com/explorer/license/
77 schema:sdPublisher N88ae78637678456da17177fdebd8ad4d
78 schema:url https://doi.org/10.1007/978-3-030-59410-7_2
79 sgo:license sg:explorer/license/
80 sgo:sdDataset chapters
81 rdf:type schema:Chapter
82 N0bfe97b6c4e2427bbec6843991afa67b schema:familyName Cui
83 schema:givenName Bin
84 rdf:type schema:Person
85 N12e2785e0c3f47fca558ea6f6cecfb8f schema:familyName Lee
86 schema:givenName Sang-Won
87 rdf:type schema:Person
88 N1871ee6829624959a34e2abbfef5bd92 rdf:first sg:person.011010233413.90
89 rdf:rest N544ab960383b469c9131a7320860c924
90 N225c80a9fa914ad4bd1f5ac0a368b3d9 rdf:first Nb97622ce08de4dc99eff3795e41d0103
91 rdf:rest N1871ee6829624959a34e2abbfef5bd92
92 N28aaa2976db9405bac1d18dfb21091e8 schema:name dimensions_id
93 schema:value pub.1131073835
94 rdf:type schema:PropertyValue
95 N2f6f6829b91c43569a712e08c5a66a15 schema:familyName Nah
96 schema:givenName Yunmook
97 rdf:type schema:Person
98 N331c0d88146d40a6ae6746f738e5944b rdf:first N12e2785e0c3f47fca558ea6f6cecfb8f
99 rdf:rest Nd10940f3266c4475b2a2f01cd319db4e
100 N33670ddd56a14410a163214c3dc97b4d rdf:first Nbf1f3ae8cd84413cab03f1337df2c181
101 rdf:rest Na0c54abd965b4ade8032a4b61a7fb286
102 N342f5ff1485c4d0fbc1636a628226ce8 rdf:first N54af28e05a9f4724811e15f20a634e86
103 rdf:rest N4b7f137de0c5453cad9c439d4ce26c27
104 N3e56a691146b4a4a8f10f5982db6c408 rdf:first N0bfe97b6c4e2427bbec6843991afa67b
105 rdf:rest N331c0d88146d40a6ae6746f738e5944b
106 N4b7f137de0c5453cad9c439d4ce26c27 rdf:first Nb841e3a5f7dc4882a97c881b5c8d6a89
107 rdf:rest rdf:nil
108 N544ab960383b469c9131a7320860c924 rdf:first Nca74f8a50436437dbe46f3b31c64c35f
109 rdf:rest N33670ddd56a14410a163214c3dc97b4d
110 N54af28e05a9f4724811e15f20a634e86 schema:familyName Moon
111 schema:givenName Yang-Sae
112 rdf:type schema:Person
113 N5859d21638e243e48980108130587c12 schema:familyName Yu
114 schema:givenName Jeffrey Xu
115 rdf:type schema:Person
116 N7544453a782147d4b4f55b325be0576d schema:isbn 978-3-030-59409-1
117 978-3-030-59410-7
118 schema:name Database Systems for Advanced Applications
119 rdf:type schema:Book
120 N88ae78637678456da17177fdebd8ad4d schema:name Springer Nature - SN SciGraph project
121 rdf:type schema:Organization
122 Na0c54abd965b4ade8032a4b61a7fb286 rdf:first sg:person.012303351315.43
123 rdf:rest Nf51261d81b594b2c9339fac76659a89c
124 Na95315237498492995db9b340fbf3407 schema:name Springer Nature
125 rdf:type schema:Organisation
126 Nb841e3a5f7dc4882a97c881b5c8d6a89 schema:familyName Whang
127 schema:givenName Steven Euijong
128 rdf:type schema:Person
129 Nb97622ce08de4dc99eff3795e41d0103 schema:affiliation grid-institutes:grid.12527.33
130 schema:familyName Kang
131 schema:givenName Yuyuan
132 rdf:type schema:Person
133 Nbf1f3ae8cd84413cab03f1337df2c181 schema:affiliation grid-institutes:grid.12527.33
134 schema:familyName Jiang
135 schema:givenName Tian
136 rdf:type schema:Person
137 Nca74f8a50436437dbe46f3b31c64c35f schema:affiliation grid-institutes:grid.12527.33
138 schema:familyName Rui
139 schema:givenName Lei
140 rdf:type schema:Person
141 Nd10940f3266c4475b2a2f01cd319db4e rdf:first N5859d21638e243e48980108130587c12
142 rdf:rest N342f5ff1485c4d0fbc1636a628226ce8
143 Nd6d6cd3997ff4cc69f03c2f025b6f2eb schema:name doi
144 schema:value 10.1007/978-3-030-59410-7_2
145 rdf:type schema:PropertyValue
146 Nead8598e40c54f28b0f5a50ecbe54e4a rdf:first sg:person.013540351275.06
147 rdf:rest N225c80a9fa914ad4bd1f5ac0a368b3d9
148 Neb39c59065ba49eba77a739ac1d21d47 rdf:first N2f6f6829b91c43569a712e08c5a66a15
149 rdf:rest N3e56a691146b4a4a8f10f5982db6c408
150 Nf51261d81b594b2c9339fac76659a89c rdf:first sg:person.011016356115.95
151 rdf:rest rdf:nil
152 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
153 schema:name Information and Computing Sciences
154 rdf:type schema:DefinedTerm
155 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
156 schema:name Information Systems
157 rdf:type schema:DefinedTerm
158 sg:person.011010233413.90 schema:affiliation grid-institutes:grid.12527.33
159 schema:familyName Huang
160 schema:givenName Xiangdong
161 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011010233413.90
162 rdf:type schema:Person
163 sg:person.011016356115.95 schema:affiliation grid-institutes:grid.35403.31
164 schema:familyName Yu
165 schema:givenName Philip S.
166 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011016356115.95
167 rdf:type schema:Person
168 sg:person.012303351315.43 schema:affiliation grid-institutes:grid.12527.33
169 schema:familyName Wang
170 schema:givenName Jianmin
171 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012303351315.43
172 rdf:type schema:Person
173 sg:person.013540351275.06 schema:affiliation grid-institutes:grid.12527.33
174 schema:familyName Qiao
175 schema:givenName Jialin
176 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013540351275.06
177 rdf:type schema:Person
178 grid-institutes:grid.12527.33 schema:alternateName Research Center for Big Data, Tsinghua University, Beijing, China
179 schema:name KLiss, MOE; BNRist; School of Software, Tsinghua University, Beijing, China
180 Research Center for Big Data, Tsinghua University, Beijing, China
181 rdf:type schema:Organization
182 grid-institutes:grid.35403.31 schema:alternateName University of Illinois, Champaign, IL, USA
183 schema:name University of Illinois, Champaign, IL, USA
184 rdf:type schema:Organization
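
In the table above, the ordered author list is encoded as an RDF collection: each blank node holds one item in `rdf:first` and a pointer to the next node in `rdf:rest`, ending at `rdf:nil`. The following Python sketch walks that chain to recover the author order. The node identifiers and family names are copied from the table; the dictionary layout and the function name `walk_rdf_list` are assumptions made for illustration, not a SciGraph API.

```python
RDF_NIL = "rdf:nil"

# node -> (rdf:first, rdf:rest), copied from the author-list rows of the table.
AUTHOR_LIST = {
    "Nead8598e40c54f28b0f5a50ecbe54e4a": ("sg:person.013540351275.06", "N225c80a9fa914ad4bd1f5ac0a368b3d9"),
    "N225c80a9fa914ad4bd1f5ac0a368b3d9": ("Nb97622ce08de4dc99eff3795e41d0103", "N1871ee6829624959a34e2abbfef5bd92"),
    "N1871ee6829624959a34e2abbfef5bd92": ("sg:person.011010233413.90", "N544ab960383b469c9131a7320860c924"),
    "N544ab960383b469c9131a7320860c924": ("Nca74f8a50436437dbe46f3b31c64c35f", "N33670ddd56a14410a163214c3dc97b4d"),
    "N33670ddd56a14410a163214c3dc97b4d": ("Nbf1f3ae8cd84413cab03f1337df2c181", "Na0c54abd965b4ade8032a4b61a7fb286"),
    "Na0c54abd965b4ade8032a4b61a7fb286": ("sg:person.012303351315.43", "Nf51261d81b594b2c9339fac76659a89c"),
    "Nf51261d81b594b2c9339fac76659a89c": ("sg:person.011016356115.95", RDF_NIL),
}

# schema:familyName for each person node, also copied from the table.
FAMILY_NAME = {
    "sg:person.013540351275.06": "Qiao",
    "Nb97622ce08de4dc99eff3795e41d0103": "Kang",
    "sg:person.011010233413.90": "Huang",
    "Nca74f8a50436437dbe46f3b31c64c35f": "Rui",
    "Nbf1f3ae8cd84413cab03f1337df2c181": "Jiang",
    "sg:person.012303351315.43": "Wang",
    "sg:person.011016356115.95": "Yu",
}


def walk_rdf_list(head, cells):
    """Follow rdf:rest links from head until rdf:nil, collecting rdf:first values."""
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


# The head node is the object of schema:author in row 3 of the table.
authors = [FAMILY_NAME[p] for p in walk_rdf_list("Nead8598e40c54f28b0f5a50ecbe54e4a", AUTHOR_LIST)]
```

The same traversal applies to the editor list, whose head node is the object of `schema:editor` in row 7.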
 



