Advances in Large-Scale RDF Data Management


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2014

AUTHORS

Peter Boncz , Orri Erling , Minh-Duc Pham

ABSTRACT

One of the prime goals of the LOD2 project is improving the performance and scalability of RDF storage solutions so that the increasing amount of Linked Open Data (LOD) can be efficiently managed. Virtuoso has been chosen as the basic RDF store for the LOD2 project, and during the project it has been significantly improved by incorporating advanced relational database techniques from MonetDB and Vectorwise, turning it into a compressed column store with vectored execution. This has reduced the performance gap (“RDF tax”) between Virtuoso’s SQL and SPARQL query performance in a way that still respects the “schema-last” nature of RDF. However, by lacking schema information, RDF database systems such as Virtuoso still cannot use advanced relational storage optimizations such as table partitioning or clustered indexes and have to execute SPARQL queries with many self-joins to a triple table, which leads to more join effort than needed in SQL systems. In this chapter, we first discuss the new column store techniques applied to Virtuoso, the enhancements in its cluster parallel version, and show its performance using the popular BSBM benchmark at the unsurpassed scale of 150 billion triples. We finally describe ongoing work in deriving an “emergent” relational schema from RDF data, which can help to close the performance gap between relational-based and RDF-based storage solutions.
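
The "many self-joins to a triple table" point in the abstract can be illustrated with a small sketch. It is not taken from the chapter and does not use Virtuoso: it assumes Python's built-in sqlite3 module, and the table names, column names, and product rows are made up for illustration. The same three-property lookup needs two self-joins against a triple table and none against a conventional relational table.

```python
import sqlite3

# Hypothetical sample data: three properties for each of two subjects.
TRIPLES = [
    ("product1", "label", "Widget"),
    ("product1", "price", "10"),
    ("product1", "producer", "Acme"),
    ("product2", "label", "Gadget"),
    ("product2", "price", "25"),
    ("product2", "producer", "Acme"),
]
PRODUCTS = [
    ("product1", "Widget", "10", "Acme"),
    ("product2", "Gadget", "25", "Acme"),
]


def fetch_products():
    """Return the same result read from a triple table and from a wide table."""
    con = sqlite3.connect(":memory:")

    # Layout 1: one row per (subject, predicate, object).
    con.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
    con.executemany("INSERT INTO triples VALUES (?, ?, ?)", TRIPLES)
    # Three properties per subject: the triple table is scanned three times,
    # which means two self-joins on the subject column.
    from_triples = con.execute(
        """
        SELECT l.s, l.o, pr.o, pd.o
        FROM triples AS l
        JOIN triples AS pr ON pr.s = l.s AND pr.p = 'price'
        JOIN triples AS pd ON pd.s = l.s AND pd.p = 'producer'
        WHERE l.p = 'label'
        ORDER BY l.s
        """
    ).fetchall()

    # Layout 2: one column per property, so no join is needed.
    con.execute(
        "CREATE TABLE product (s TEXT, label TEXT, price TEXT, producer TEXT)"
    )
    con.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", PRODUCTS)
    from_table = con.execute(
        "SELECT s, label, price, producer FROM product ORDER BY s"
    ).fetchall()

    con.close()
    return from_triples, from_table
```

Both queries return identical rows; only the join effort differs, and it grows with the number of properties requested.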

PAGES

21-44

Book

TITLE

Linked Open Data -- Creating Knowledge Out of Interlinked Data

ISBN

978-3-319-09845-6
978-3-319-09846-3

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2

DOI

http://dx.doi.org/10.1007/978-3-319-09846-3_2

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1021315309



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0804", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Data Format", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Centrum Wiskunde and Informatica", 
          "id": "https://www.grid.ac/institutes/grid.6054.7", 
          "name": [
            "CWI, Amsterdam, The Netherlands"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Boncz", 
        "givenName": "Peter", 
        "id": "sg:person.015341641231.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015341641231.33"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "OpenLink Software (United Kingdom)", 
          "id": "https://www.grid.ac/institutes/grid.426164.3", 
          "name": [
            "OpenLink Software, Burlington, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Erling", 
        "givenName": "Orri", 
        "id": "sg:person.011715705740.17", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011715705740.17"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Centrum Wiskunde and Informatica", 
          "id": "https://www.grid.ac/institutes/grid.6054.7", 
          "name": [
            "CWI, Amsterdam, The Netherlands"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Pham", 
        "givenName": "Minh-Duc", 
        "id": "sg:person.015106130107.21", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015106130107.21"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/978-3-642-28997-2_59", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004189052", 
          "https://doi.org/10.1007/978-3-642-28997-2_59"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/2247596.2247635", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018368791"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-319-04936-6_5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033535672", 
          "https://doi.org/10.1007/978-3-319-04936-6_5"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.4018/jswis.2009040101", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050425370"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.14778/1687553.1687625", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1067367540"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.14778/2367502.2367518", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1067368064"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icde.2011.5767868", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093532757"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1201/b16859", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095906249"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014", 
    "datePublishedReg": "2014-01-01", 
    "description": "One of the prime goals of the LOD2 project is improving the performance and scalability of RDF storage solutions so that the increasing amount of Linked Open Data (LOD) can be efficiently managed. Virtuoso has been chosen as the basic RDF store for the LOD2 project, and during the project it has been significantly improved by incorporating advanced relational database techniques from MonetDB and Vectorwise, turning it into a compressed column store with vectored execution. This has reduced the performance gap (\u201cRDF tax\u201d) between Virtuoso\u2019s SQL and SPARQL query performance in a way that still respects the \u201cschema-last\u201d nature of RDF. However, by lacking schema information, RDF database systems such as Virtuoso still cannot use advanced relational storage optimizations such as table partitioning or clustered indexes and have to execute SPARQL queries with many self-joins to a triple table, which leads to more join effort than needed in SQL systems. In this chapter, we first discuss the new column store techniques applied to Virtuoso, the enhancements in its cluster parallel version, and show its performance using the popular BSBM benchmark at the unsurpassed scale of 150\u00a0billion triples. We finally describe ongoing work in deriving an \u201cemergent\u201d relational schema from RDF data, which can help to close the performance gap between relational-based and RDF-based storage solutions.", 
    "editor": [
      {
        "familyName": "Auer", 
        "givenName": "S\u00f6ren", 
        "type": "Person"
      }, 
      {
        "familyName": "Bryl", 
        "givenName": "Volha", 
        "type": "Person"
      }, 
      {
        "familyName": "Tramp", 
        "givenName": "Sebastian", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-09846-3_2", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-319-09845-6", 
        "978-3-319-09846-3"
      ], 
      "name": "Linked Open Data -- Creating Knowledge Out of Interlinked Data", 
      "type": "Book"
    }, 
    "name": "Advances in Large-Scale RDF Data Management", 
    "pagination": "21-44", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-09846-3_2"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "658e93d4f521806b78c2c801e4dc9585bab7a982dc48234570867b51e8da2ad6"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1021315309"
        ]
      }
    ], 
    "publisher": {
      "location": "Cham", 
      "name": "Springer International Publishing", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-09846-3_2", 
      "https://app.dimensions.ai/details/publication/pub.1021315309"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T11:34", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8660_00000256.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-319-09846-3_2"
  }
]
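
As a minimal sketch of how a record like the one above might be consumed, the following Python reads a shortened excerpt of it with the standard json module and pulls out the title, author names, DOI, and page range. The excerpt keeps only the fields it uses, and the summarize helper is a hypothetical name, not part of any SciGraph tooling.

```python
import json

# Shortened excerpt of the record above; only the fields used here are kept.
RECORD_EXCERPT = """
[
  {
    "author": [
      {"familyName": "Boncz", "givenName": "Peter", "type": "Person"},
      {"familyName": "Erling", "givenName": "Orri", "type": "Person"},
      {"familyName": "Pham", "givenName": "Minh-Duc", "type": "Person"}
    ],
    "name": "Advances in Large-Scale RDF Data Management",
    "pagination": "21-44",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-319-09846-3_2"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1021315309"]}
    ]
  }
]
"""


def summarize(record_json):
    """Extract a few human-readable fields from a one-record JSON-LD array."""
    record = json.loads(record_json)[0]
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of name/value pairs; index it by name.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record["pagination"],
    }
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field access; a JSON-LD processor only becomes necessary when the @context has to be expanded.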

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-09846-3_2'
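
The same content negotiation can be sketched in Python with the standard urllib module. This is a sketch under assumptions: the format keys in ACCEPT and the build_request helper are hypothetical names, and the code only constructs the request. Whether the endpoint answers is not verified here.

```python
from urllib.request import Request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the keys are made-up labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build (but do not send) a GET request for one record in one format."""
    return Request(BASE_URL + record_id, headers={"Accept": ACCEPT[fmt]})
```

Passing the returned object to urllib.request.urlopen would send it, for example with build_request("pub.10.1007/978-3-319-09846-3_2", "turtle").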


 

This table displays all metadata directly associated with this object as RDF triples.

118 TRIPLES      23 PREDICATES      35 URIs      20 LITERALS      8 BLANK NODES

Row Subject Predicate Object (an omitted subject or predicate repeats the one from the row above)
1 sg:pub.10.1007/978-3-319-09846-3_2 schema:about anzsrc-for:08
2 anzsrc-for:0804
3 schema:author N51f755eefb0444a6a3d9a5247bd28596
4 schema:citation sg:pub.10.1007/978-3-319-04936-6_5
5 sg:pub.10.1007/978-3-642-28997-2_59
6 https://doi.org/10.1109/icde.2011.5767868
7 https://doi.org/10.1145/2247596.2247635
8 https://doi.org/10.1201/b16859
9 https://doi.org/10.14778/1687553.1687625
10 https://doi.org/10.14778/2367502.2367518
11 https://doi.org/10.4018/jswis.2009040101
12 schema:datePublished 2014
13 schema:datePublishedReg 2014-01-01
14 schema:description One of the prime goals of the LOD2 project is improving the performance and scalability of RDF storage solutions so that the increasing amount of Linked Open Data (LOD) can be efficiently managed. Virtuoso has been chosen as the basic RDF store for the LOD2 project, and during the project it has been significantly improved by incorporating advanced relational database techniques from MonetDB and Vectorwise, turning it into a compressed column store with vectored execution. This has reduced the performance gap (“RDF tax”) between Virtuoso’s SQL and SPARQL query performance in a way that still respects the “schema-last” nature of RDF. However, by lacking schema information, RDF database systems such as Virtuoso still cannot use advanced relational storage optimizations such as table partitioning or clustered indexes and have to execute SPARQL queries with many self-joins to a triple table, which leads to more join effort than needed in SQL systems. In this chapter, we first discuss the new column store techniques applied to Virtuoso, the enhancements in its cluster parallel version, and show its performance using the popular BSBM benchmark at the unsurpassed scale of 150 billion triples. We finally describe ongoing work in deriving an “emergent” relational schema from RDF data, which can help to close the performance gap between relational-based and RDF-based storage solutions.
15 schema:editor N23b081033fd24fd2b7ca36ba545149d0
16 schema:genre chapter
17 schema:inLanguage en
18 schema:isAccessibleForFree true
19 schema:isPartOf Nce2dc0c0c5db4d6ca968d69e65a0d5de
20 schema:name Advances in Large-Scale RDF Data Management
21 schema:pagination 21-44
22 schema:productId N06cfca7b3fb4404a93e25bfd97f8323b
23 N92badf0a6bba4619b57a1f072804ce68
24 Ne5a3cb93e9f746ec99ffa53bc6ab47c6
25 schema:publisher Nda68fee576b44a008f08946a01ceae9a
26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021315309
27 https://doi.org/10.1007/978-3-319-09846-3_2
28 schema:sdDatePublished 2019-04-15T11:34
29 schema:sdLicense https://scigraph.springernature.com/explorer/license/
30 schema:sdPublisher N995fca93653543cf82bcf579bb558f4d
31 schema:url http://link.springer.com/10.1007/978-3-319-09846-3_2
32 sgo:license sg:explorer/license/
33 sgo:sdDataset chapters
34 rdf:type schema:Chapter
35 N06cfca7b3fb4404a93e25bfd97f8323b schema:name dimensions_id
36 schema:value pub.1021315309
37 rdf:type schema:PropertyValue
38 N0fe3bd0f4a934df095c0697c54151ebe rdf:first sg:person.011715705740.17
39 rdf:rest Ne87892a9b50d47f291209864c693d2e1
40 N1a12ba6148894b60bc9a97ba213abafc schema:familyName Auer
41 schema:givenName Sören
42 rdf:type schema:Person
43 N23aa232ecaf241339153686e09240416 schema:familyName Tramp
44 schema:givenName Sebastian
45 rdf:type schema:Person
46 N23b081033fd24fd2b7ca36ba545149d0 rdf:first N1a12ba6148894b60bc9a97ba213abafc
47 rdf:rest Ne65edef0e2ea4d8d976fd18c3be46468
48 N2a657b0553a742e389129aa45d317bbf rdf:first N23aa232ecaf241339153686e09240416
49 rdf:rest rdf:nil
50 N51f755eefb0444a6a3d9a5247bd28596 rdf:first sg:person.015341641231.33
51 rdf:rest N0fe3bd0f4a934df095c0697c54151ebe
52 N92badf0a6bba4619b57a1f072804ce68 schema:name readcube_id
53 schema:value 658e93d4f521806b78c2c801e4dc9585bab7a982dc48234570867b51e8da2ad6
54 rdf:type schema:PropertyValue
55 N92f120b832ed40b3bfee1b55ba04872b schema:familyName Bryl
56 schema:givenName Volha
57 rdf:type schema:Person
58 N995fca93653543cf82bcf579bb558f4d schema:name Springer Nature - SN SciGraph project
59 rdf:type schema:Organization
60 Nce2dc0c0c5db4d6ca968d69e65a0d5de schema:isbn 978-3-319-09845-6
61 978-3-319-09846-3
62 schema:name Linked Open Data -- Creating Knowledge Out of Interlinked Data
63 rdf:type schema:Book
64 Nda68fee576b44a008f08946a01ceae9a schema:location Cham
65 schema:name Springer International Publishing
66 rdf:type schema:Organisation
67 Ne5a3cb93e9f746ec99ffa53bc6ab47c6 schema:name doi
68 schema:value 10.1007/978-3-319-09846-3_2
69 rdf:type schema:PropertyValue
70 Ne65edef0e2ea4d8d976fd18c3be46468 rdf:first N92f120b832ed40b3bfee1b55ba04872b
71 rdf:rest N2a657b0553a742e389129aa45d317bbf
72 Ne87892a9b50d47f291209864c693d2e1 rdf:first sg:person.015106130107.21
73 rdf:rest rdf:nil
74 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
75 schema:name Information and Computing Sciences
76 rdf:type schema:DefinedTerm
77 anzsrc-for:0804 schema:inDefinedTermSet anzsrc-for:
78 schema:name Data Format
79 rdf:type schema:DefinedTerm
80 sg:person.011715705740.17 schema:affiliation https://www.grid.ac/institutes/grid.426164.3
81 schema:familyName Erling
82 schema:givenName Orri
83 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011715705740.17
84 rdf:type schema:Person
85 sg:person.015106130107.21 schema:affiliation https://www.grid.ac/institutes/grid.6054.7
86 schema:familyName Pham
87 schema:givenName Minh-Duc
88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015106130107.21
89 rdf:type schema:Person
90 sg:person.015341641231.33 schema:affiliation https://www.grid.ac/institutes/grid.6054.7
91 schema:familyName Boncz
92 schema:givenName Peter
93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015341641231.33
94 rdf:type schema:Person
95 sg:pub.10.1007/978-3-319-04936-6_5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033535672
96 https://doi.org/10.1007/978-3-319-04936-6_5
97 rdf:type schema:CreativeWork
98 sg:pub.10.1007/978-3-642-28997-2_59 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004189052
99 https://doi.org/10.1007/978-3-642-28997-2_59
100 rdf:type schema:CreativeWork
101 https://doi.org/10.1109/icde.2011.5767868 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093532757
102 rdf:type schema:CreativeWork
103 https://doi.org/10.1145/2247596.2247635 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018368791
104 rdf:type schema:CreativeWork
105 https://doi.org/10.1201/b16859 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095906249
106 rdf:type schema:CreativeWork
107 https://doi.org/10.14778/1687553.1687625 schema:sameAs https://app.dimensions.ai/details/publication/pub.1067367540
108 rdf:type schema:CreativeWork
109 https://doi.org/10.14778/2367502.2367518 schema:sameAs https://app.dimensions.ai/details/publication/pub.1067368064
110 rdf:type schema:CreativeWork
111 https://doi.org/10.4018/jswis.2009040101 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050425370
112 rdf:type schema:CreativeWork
113 https://www.grid.ac/institutes/grid.426164.3 schema:alternateName OpenLink Software (United Kingdom)
114 schema:name OpenLink Software, Burlington, UK
115 rdf:type schema:Organization
116 https://www.grid.ac/institutes/grid.6054.7 schema:alternateName Centrum Wiskunde and Informatica
117 schema:name CWI, Amsterdam, The Netherlands
118 rdf:type schema:Organization
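
In the table above, schema:author points to a blank node that heads an RDF list, so author order has to be recovered by following rdf:first and rdf:rest until rdf:nil. A minimal sketch of that traversal, using node identifiers and family names copied from the table and plain dictionaries instead of an RDF library (read_list is a hypothetical helper name):

```python
RDF_NIL = "rdf:nil"

# list node -> (rdf:first, rdf:rest), copied from rows 38-39, 50-51 and 72-73.
LIST_NODES = {
    "N51f755eefb0444a6a3d9a5247bd28596": (
        "sg:person.015341641231.33", "N0fe3bd0f4a934df095c0697c54151ebe"),
    "N0fe3bd0f4a934df095c0697c54151ebe": (
        "sg:person.011715705740.17", "Ne87892a9b50d47f291209864c693d2e1"),
    "Ne87892a9b50d47f291209864c693d2e1": (
        "sg:person.015106130107.21", RDF_NIL),
}

# schema:familyName values from rows 81, 86 and 91.
FAMILY_NAME = {
    "sg:person.015341641231.33": "Boncz",
    "sg:person.011715705740.17": "Erling",
    "sg:person.015106130107.21": "Pham",
}


def read_list(head, nodes):
    """Follow rdf:rest links from head and collect the rdf:first values."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that links a list back onto itself.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items
```

Starting from the object of row 3, read_list("N51f755eefb0444a6a3d9a5247bd28596", LIST_NODES) yields the three person identifiers in author order, which FAMILY_NAME maps to Boncz, Erling, Pham.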