Advances in Large-Scale RDF Data Management


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2014

AUTHORS

Peter Boncz , Orri Erling , Minh-Duc Pham

ABSTRACT

One of the prime goals of the LOD2 project is improving the performance and scalability of RDF storage solutions so that the increasing amount of Linked Open Data (LOD) can be efficiently managed. Virtuoso has been chosen as the basic RDF store for the LOD2 project, and during the project it has been significantly improved by incorporating advanced relational database techniques from MonetDB and Vectorwise, turning it into a compressed column store with vectored execution. This has reduced the performance gap (“RDF tax”) between Virtuoso’s SQL and SPARQL query performance in a way that still respects the “schema-last” nature of RDF. However, by lacking schema information, RDF database systems such as Virtuoso still cannot use advanced relational storage optimizations such as table partitioning or clustered indexes and have to execute SPARQL queries with many self-joins to a triple table, which leads to more join effort than needed in SQL systems. In this chapter, we first discuss the new column store techniques applied to Virtuoso, the enhancements in its cluster parallel version, and show its performance using the popular BSBM benchmark at the unsurpassed scale of 150 billion triples. We finally describe ongoing work in deriving an “emergent” relational schema from RDF data, which can help to close the performance gap between relational-based and RDF-based storage solutions.
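
The self-join cost mentioned in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical three-column `triples` table in SQLite and made-up product data; it is not Virtuoso's storage layout or the chapter's method. Fetching two properties of the same subject takes a self-join on the triple table, while a relational table with one column per property returns the same rows without a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical triple table: every fact is one (subject, predicate, object) row.
cur.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
cur.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("product1", "label", "Widget"),
        ("product1", "price", "10"),
        ("product2", "label", "Gadget"),
        ("product2", "price", "25"),
    ],
)

# Hypothetical relational table holding the same facts, one column per property.
cur.execute("CREATE TABLE product (id TEXT, label TEXT, price TEXT)")
cur.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("product1", "Widget", "10"), ("product2", "Gadget", "25")],
)

# Triple table: one self-join per additional property requested.
triple_rows = cur.execute(
    """
    SELECT t1.s, t1.o, t2.o
    FROM triples AS t1
    JOIN triples AS t2 ON t1.s = t2.s
    WHERE t1.p = 'label' AND t2.p = 'price'
    ORDER BY t1.s
    """
).fetchall()

# Relational table: the same answer from a single scan, with no join.
table_rows = cur.execute(
    "SELECT id, label, price FROM product ORDER BY id"
).fetchall()

print(triple_rows == table_rows)
```

Each further property requested from the triple table would add another join on `s`, which is the extra join effort the abstract refers to.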

PAGES

21-44

References to SciGraph publications

Book

TITLE

Linked Open Data -- Creating Knowledge Out of Interlinked Data

ISBN

978-3-319-09845-6
978-3-319-09846-3

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2

DOI

http://dx.doi.org/10.1007/978-3-319-09846-3_2

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1021315309



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0804", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Data Format", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Centrum Wiskunde and Informatica", 
          "id": "https://www.grid.ac/institutes/grid.6054.7", 
          "name": [
            "CWI, Amsterdam, The Netherlands"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Boncz", 
        "givenName": "Peter", 
        "id": "sg:person.015341641231.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015341641231.33"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "OpenLink Software (United Kingdom)", 
          "id": "https://www.grid.ac/institutes/grid.426164.3", 
          "name": [
            "OpenLink Software, Burlington, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Erling", 
        "givenName": "Orri", 
        "id": "sg:person.011715705740.17", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011715705740.17"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Centrum Wiskunde and Informatica", 
          "id": "https://www.grid.ac/institutes/grid.6054.7", 
          "name": [
            "CWI, Amsterdam, The Netherlands"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Pham", 
        "givenName": "Minh-Duc", 
        "id": "sg:person.015106130107.21", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015106130107.21"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/978-3-642-28997-2_59", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004189052", 
          "https://doi.org/10.1007/978-3-642-28997-2_59"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/2247596.2247635", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018368791"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-319-04936-6_5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033535672", 
          "https://doi.org/10.1007/978-3-319-04936-6_5"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.4018/jswis.2009040101", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050425370"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.14778/1687553.1687625", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1067367540"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.14778/2367502.2367518", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1067368064"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icde.2011.5767868", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093532757"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1201/b16859", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095906249"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014", 
    "datePublishedReg": "2014-01-01", 
    "description": "One of the prime goals of the LOD2 project is improving the performance and scalability of RDF storage solutions so that the increasing amount of Linked Open Data (LOD) can be efficiently managed. Virtuoso has been chosen as the basic RDF store for the LOD2 project, and during the project it has been significantly improved by incorporating advanced relational database techniques from MonetDB and Vectorwise, turning it into a compressed column store with vectored execution. This has reduced the performance gap (\u201cRDF tax\u201d) between Virtuoso\u2019s SQL and SPARQL query performance in a way that still respects the \u201cschema-last\u201d nature of RDF. However, by lacking schema information, RDF database systems such as Virtuoso still cannot use advanced relational storage optimizations such as table partitioning or clustered indexes and have to execute SPARQL queries with many self-joins to a triple table, which leads to more join effort than needed in SQL systems. In this chapter, we first discuss the new column store techniques applied to Virtuoso, the enhancements in its cluster parallel version, and show its performance using the popular BSBM benchmark at the unsurpassed scale of 150\u00a0billion triples. We finally describe ongoing work in deriving an \u201cemergent\u201d relational schema from RDF data, which can help to close the performance gap between relational-based and RDF-based storage solutions.", 
    "editor": [
      {
        "familyName": "Auer", 
        "givenName": "S\u00f6ren", 
        "type": "Person"
      }, 
      {
        "familyName": "Bryl", 
        "givenName": "Volha", 
        "type": "Person"
      }, 
      {
        "familyName": "Tramp", 
        "givenName": "Sebastian", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-09846-3_2", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-319-09845-6", 
        "978-3-319-09846-3"
      ], 
      "name": "Linked Open Data -- Creating Knowledge Out of Interlinked Data", 
      "type": "Book"
    }, 
    "name": "Advances in Large-Scale RDF Data Management", 
    "pagination": "21-44", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-09846-3_2"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "658e93d4f521806b78c2c801e4dc9585bab7a982dc48234570867b51e8da2ad6"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1021315309"
        ]
      }
    ], 
    "publisher": {
      "location": "Cham", 
      "name": "Springer International Publishing", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-09846-3_2", 
      "https://app.dimensions.ai/details/publication/pub.1021315309"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T11:34", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8660_00000256.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-319-09846-3_2"
  }
]
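
As a minimal sketch of how a record like the one above might be read, the following uses only Python's standard `json` module on a trimmed excerpt of the record. The `summarize` helper and the `record_text` variable are hypothetical names introduced for illustration. It treats the record as plain JSON and does not perform JSON-LD context expansion.

```python
import json

# Trimmed excerpt of the record above; the full record has more fields.
record_text = """
[
  {
    "author": [
      {"familyName": "Boncz", "givenName": "Peter"},
      {"familyName": "Erling", "givenName": "Orri"},
      {"familyName": "Pham", "givenName": "Minh-Duc"}
    ],
    "name": "Advances in Large-Scale RDF Data Management",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-319-09846-3_2"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1021315309"]}
    ]
  }
]
"""


def summarize(record):
    """Return (title, author names, DOI or None) for one record object."""
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    doi = None
    for product in record.get("productId", []):
        # Each productId entry carries a name and a list of values.
        if product.get("name") == "doi" and product.get("value"):
            doi = product["value"][0]
    return record.get("name"), authors, doi


records = json.loads(record_text)  # the top level is a JSON array
title, authors, doi = summarize(records[0])
print(title, authors, doi)
```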
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'
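
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard `urllib.request` module. The format keys (`json-ld`, `n-triples`, `turtle`, `rdf-xml`) and the helper names `build_request` and `fetch` are hypothetical. Calling `fetch` needs network access, and this record does not confirm whether the endpoint still responds.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2"

# Media types taken from the curl commands above; the keys are illustrative.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the record in the given format."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to inspect.
request = build_request("turtle")
print(request.get_header("Accept"))
```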


 

This table displays all metadata directly associated with this object as RDF triples. A row that omits the subject or the predicate inherits it from the nearest row above.

118 TRIPLES      23 PREDICATES      35 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-09846-3_2 schema:about anzsrc-for:08
2 anzsrc-for:0804
3 schema:author Na6f033fce701435483698fcc585b6172
4 schema:citation sg:pub.10.1007/978-3-319-04936-6_5
5 sg:pub.10.1007/978-3-642-28997-2_59
6 https://doi.org/10.1109/icde.2011.5767868
7 https://doi.org/10.1145/2247596.2247635
8 https://doi.org/10.1201/b16859
9 https://doi.org/10.14778/1687553.1687625
10 https://doi.org/10.14778/2367502.2367518
11 https://doi.org/10.4018/jswis.2009040101
12 schema:datePublished 2014
13 schema:datePublishedReg 2014-01-01
14 schema:description One of the prime goals of the LOD2 project is improving the performance and scalability of RDF storage solutions so that the increasing amount of Linked Open Data (LOD) can be efficiently managed. Virtuoso has been chosen as the basic RDF store for the LOD2 project, and during the project it has been significantly improved by incorporating advanced relational database techniques from MonetDB and Vectorwise, turning it into a compressed column store with vectored execution. This has reduced the performance gap (“RDF tax”) between Virtuoso’s SQL and SPARQL query performance in a way that still respects the “schema-last” nature of RDF. However, by lacking schema information, RDF database systems such as Virtuoso still cannot use advanced relational storage optimizations such as table partitioning or clustered indexes and have to execute SPARQL queries with many self-joins to a triple table, which leads to more join effort than needed in SQL systems. In this chapter, we first discuss the new column store techniques applied to Virtuoso, the enhancements in its cluster parallel version, and show its performance using the popular BSBM benchmark at the unsurpassed scale of 150 billion triples. We finally describe ongoing work in deriving an “emergent” relational schema from RDF data, which can help to close the performance gap between relational-based and RDF-based storage solutions.
15 schema:editor N4a9fa4dd95c94129a605c914cf05fa30
16 schema:genre chapter
17 schema:inLanguage en
18 schema:isAccessibleForFree true
19 schema:isPartOf N1cddb83f8b1f41b7ad9d547fa2da88b2
20 schema:name Advances in Large-Scale RDF Data Management
21 schema:pagination 21-44
22 schema:productId N2a09ffc5f4094a4da7b34e8fa9b5eb25
23 N4a716ea441304b8ca8325c2029061c5d
24 N6efd68a7531f40f2aca6e8fe2ab84d8c
25 schema:publisher Nc346c36bbe2f499d90e2266c9b7cb981
26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021315309
27 https://doi.org/10.1007/978-3-319-09846-3_2
28 schema:sdDatePublished 2019-04-15T11:34
29 schema:sdLicense https://scigraph.springernature.com/explorer/license/
30 schema:sdPublisher Ne912bc7d87764ded87a58deb23347e37
31 schema:url http://link.springer.com/10.1007/978-3-319-09846-3_2
32 sgo:license sg:explorer/license/
33 sgo:sdDataset chapters
34 rdf:type schema:Chapter
35 N12b5667fbf2d41da9b6ea237eec0c4bf rdf:first sg:person.015106130107.21
36 rdf:rest rdf:nil
37 N1cddb83f8b1f41b7ad9d547fa2da88b2 schema:isbn 978-3-319-09845-6
38 978-3-319-09846-3
39 schema:name Linked Open Data -- Creating Knowledge Out of Interlinked Data
40 rdf:type schema:Book
41 N22e7022c9ae443c18bc840477da6a3f6 schema:familyName Auer
42 schema:givenName Sören
43 rdf:type schema:Person
44 N2a09ffc5f4094a4da7b34e8fa9b5eb25 schema:name doi
45 schema:value 10.1007/978-3-319-09846-3_2
46 rdf:type schema:PropertyValue
47 N4a716ea441304b8ca8325c2029061c5d schema:name dimensions_id
48 schema:value pub.1021315309
49 rdf:type schema:PropertyValue
50 N4a9fa4dd95c94129a605c914cf05fa30 rdf:first N22e7022c9ae443c18bc840477da6a3f6
51 rdf:rest Ne461ccc8628349ca849759002d29a6f0
52 N6efd68a7531f40f2aca6e8fe2ab84d8c schema:name readcube_id
53 schema:value 658e93d4f521806b78c2c801e4dc9585bab7a982dc48234570867b51e8da2ad6
54 rdf:type schema:PropertyValue
55 N8a591b613af746938384143040e6b01d rdf:first Ne0383b1f7c744876a8ebe1420f978e2b
56 rdf:rest rdf:nil
57 N8e62ee7cdd7a4c73b91ead825a4f58da rdf:first sg:person.011715705740.17
58 rdf:rest N12b5667fbf2d41da9b6ea237eec0c4bf
59 N9290d83796bf44eda1780385c783e2c7 schema:familyName Bryl
60 schema:givenName Volha
61 rdf:type schema:Person
62 Na6f033fce701435483698fcc585b6172 rdf:first sg:person.015341641231.33
63 rdf:rest N8e62ee7cdd7a4c73b91ead825a4f58da
64 Nc346c36bbe2f499d90e2266c9b7cb981 schema:location Cham
65 schema:name Springer International Publishing
66 rdf:type schema:Organisation
67 Ne0383b1f7c744876a8ebe1420f978e2b schema:familyName Tramp
68 schema:givenName Sebastian
69 rdf:type schema:Person
70 Ne461ccc8628349ca849759002d29a6f0 rdf:first N9290d83796bf44eda1780385c783e2c7
71 rdf:rest N8a591b613af746938384143040e6b01d
72 Ne912bc7d87764ded87a58deb23347e37 schema:name Springer Nature - SN SciGraph project
73 rdf:type schema:Organization
74 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
75 schema:name Information and Computing Sciences
76 rdf:type schema:DefinedTerm
77 anzsrc-for:0804 schema:inDefinedTermSet anzsrc-for:
78 schema:name Data Format
79 rdf:type schema:DefinedTerm
80 sg:person.011715705740.17 schema:affiliation https://www.grid.ac/institutes/grid.426164.3
81 schema:familyName Erling
82 schema:givenName Orri
83 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011715705740.17
84 rdf:type schema:Person
85 sg:person.015106130107.21 schema:affiliation https://www.grid.ac/institutes/grid.6054.7
86 schema:familyName Pham
87 schema:givenName Minh-Duc
88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015106130107.21
89 rdf:type schema:Person
90 sg:person.015341641231.33 schema:affiliation https://www.grid.ac/institutes/grid.6054.7
91 schema:familyName Boncz
92 schema:givenName Peter
93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015341641231.33
94 rdf:type schema:Person
95 sg:pub.10.1007/978-3-319-04936-6_5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033535672
96 https://doi.org/10.1007/978-3-319-04936-6_5
97 rdf:type schema:CreativeWork
98 sg:pub.10.1007/978-3-642-28997-2_59 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004189052
99 https://doi.org/10.1007/978-3-642-28997-2_59
100 rdf:type schema:CreativeWork
101 https://doi.org/10.1109/icde.2011.5767868 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093532757
102 rdf:type schema:CreativeWork
103 https://doi.org/10.1145/2247596.2247635 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018368791
104 rdf:type schema:CreativeWork
105 https://doi.org/10.1201/b16859 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095906249
106 rdf:type schema:CreativeWork
107 https://doi.org/10.14778/1687553.1687625 schema:sameAs https://app.dimensions.ai/details/publication/pub.1067367540
108 rdf:type schema:CreativeWork
109 https://doi.org/10.14778/2367502.2367518 schema:sameAs https://app.dimensions.ai/details/publication/pub.1067368064
110 rdf:type schema:CreativeWork
111 https://doi.org/10.4018/jswis.2009040101 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050425370
112 rdf:type schema:CreativeWork
113 https://www.grid.ac/institutes/grid.426164.3 schema:alternateName OpenLink Software (United Kingdom)
114 schema:name OpenLink Software, Burlington, UK
115 rdf:type schema:Organization
116 https://www.grid.ac/institutes/grid.6054.7 schema:alternateName Centrum Wiskunde and Informatica
117 schema:name CWI, Amsterdam, The Netherlands
118 rdf:type schema:Organization
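
In the table above, the `schema:author` value (row 3) is a blank node that heads an RDF list, with `rdf:first` holding an item and `rdf:rest` pointing to the next node until `rdf:nil`. A minimal sketch of walking that list to recover the author order follows. The dictionaries are copied by hand from rows 35-36, 57-58, 62-63 and the `schema:familyName` rows; `walk_list` is a hypothetical helper, not part of any RDF library.

```python
RDF_NIL = "rdf:nil"

# rdf:first values, keyed by list node (rows 35, 57, 62 of the table).
first = {
    "Na6f033fce701435483698fcc585b6172": "sg:person.015341641231.33",
    "N8e62ee7cdd7a4c73b91ead825a4f58da": "sg:person.011715705740.17",
    "N12b5667fbf2d41da9b6ea237eec0c4bf": "sg:person.015106130107.21",
}

# rdf:rest values, keyed by list node (rows 36, 58, 63 of the table).
rest = {
    "Na6f033fce701435483698fcc585b6172": "N8e62ee7cdd7a4c73b91ead825a4f58da",
    "N8e62ee7cdd7a4c73b91ead825a4f58da": "N12b5667fbf2d41da9b6ea237eec0c4bf",
    "N12b5667fbf2d41da9b6ea237eec0c4bf": RDF_NIL,
}

# schema:familyName values for the three person URIs (rows 81, 86, 91).
family_name = {
    "sg:person.015341641231.33": "Boncz",
    "sg:person.011715705740.17": "Erling",
    "sg:person.015106130107.21": "Pham",
}


def walk_list(head):
    """Follow rdf:rest links from head, collecting rdf:first items in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at list node {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Row 3 gives the head of the author list.
author_uris = walk_list("Na6f033fce701435483698fcc585b6172")
author_names = [family_name[uri] for uri in author_uris]
print(author_names)
```

The recovered order matches the AUTHORS line at the top of the record.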
 



