Streaming Partitioning of RDF Graphs for Datalog Reasoning


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2021-05-31

AUTHORS

Temitope Ajileye , Boris Motik , Ian Horrocks

ABSTRACT

A cluster of servers is often used to reason over RDF graphs whose size exceeds the capacity of a single server. While many distributed approaches to reasoning have been proposed, the problem of data partitioning has received little attention thus far. In practice, data is usually partitioned by a variant of hashing, which is very simple, but it does not pay attention to data locality. Locality-aware partitioning approaches have been considered, but they usually process the entire dataset on a single server. In this paper, we present two new RDF partitioning strategies. Both are inspired by recent streaming graph partitioning algorithms, which partition a graph while keeping only a small subset of the graph in memory. We have evaluated our approaches empirically against hash and min-cut partitioning. Our results suggest that our approaches can significantly improve reasoning performance, but without unrealistic demands on the memory of the servers used for partitioning.
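To make the hashing baseline mentioned in the abstract concrete, here is a minimal sketch of subject-hash partitioning. It is an illustration only, not the chapter's implementation: the function names, the choice of SHA-256, and the `ex:` triples are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict

# Hypothetical sample triples (subject, predicate, object); not taken from the chapter.
TRIPLES = [
    ("ex:alice", "ex:worksAt", "ex:oxford"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:worksAt", "ex:oxford"),
    ("ex:oxford", "ex:locatedIn", "ex:uk"),
]


def partition_of(subject: str, num_servers: int) -> int:
    """Map a subject to a server index with a stable hash.

    Python's built-in hash() is salted per process for strings,
    so a cryptographic digest is used to keep the mapping repeatable.
    """
    digest = hashlib.sha256(subject.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


def hash_partition(triples, num_servers: int) -> dict:
    """Assign each triple to the server chosen by hashing its subject."""
    parts = defaultdict(list)
    for s, p, o in triples:
        parts[partition_of(s, num_servers)].append((s, p, o))
    return dict(parts)


parts = hash_partition(TRIPLES, num_servers=4)
for server, assigned in sorted(parts.items()):
    print(server, assigned)
```

Triples that share a subject always land on the same server, but a triple whose object is another triple's subject may land elsewhere, which is the lack of data locality the abstract points to.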

PAGES

3-22

Book

TITLE

The Semantic Web

ISBN

978-3-030-77384-7
978-3-030-77385-4

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-030-77385-4_1

DOI

http://dx.doi.org/10.1007/978-3-030-77385-4_1

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1138468547



JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service such as the JSON-LD Playground or Google SDTT.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0803", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computer Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of Oxford, Oxford, UK", 
          "id": "http://www.grid.ac/institutes/grid.4991.5", 
          "name": [
            "Department of Computer Science, University of Oxford, Oxford, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ajileye", 
        "givenName": "Temitope", 
        "id": "sg:person.010564701705.97", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010564701705.97"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of Oxford, Oxford, UK", 
          "id": "http://www.grid.ac/institutes/grid.4991.5", 
          "name": [
            "Department of Computer Science, University of Oxford, Oxford, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Motik", 
        "givenName": "Boris", 
        "id": "sg:person.07401076267.36", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07401076267.36"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of Oxford, Oxford, UK", 
          "id": "http://www.grid.ac/institutes/grid.4991.5", 
          "name": [
            "Department of Computer Science, University of Oxford, Oxford, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Horrocks", 
        "givenName": "Ian", 
        "id": "sg:person.013100561643.19", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013100561643.19"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2021-05-31", 
    "datePublishedReg": "2021-05-31", 
    "description": "A cluster of servers is often used to reason over RDF graphs whose size exceeds the capacity of a single server. While many distributed approaches to reasoning have been proposed, the problem of data partitioning has received little attention thus far. In practice, data is usually partitioned by a variant of hashing, which is very simple, but it does not pay attention to data locality. Locality-aware partitioning approaches have been considered, but they usually process the entire dataset on a single server. In this paper, we present two new RDF partitioning strategies. Both are inspired by recent streaming graph partitioning algorithms, which partition a graph while keeping only a small subset of the graph in memory. We have evaluated our approaches empirically against hash and min-cut partitioning. Our results suggest that our approaches can significantly improve reasoning performance, but without unrealistic demands on the memory of the servers used for partitioning.", 
    "editor": [
      {
        "familyName": "Verborgh", 
        "givenName": "Ruben", 
        "type": "Person"
      }, 
      {
        "familyName": "Hose", 
        "givenName": "Katja", 
        "type": "Person"
      }, 
      {
        "familyName": "Paulheim", 
        "givenName": "Heiko", 
        "type": "Person"
      }, 
      {
        "familyName": "Champin", 
        "givenName": "Pierre-Antoine", 
        "type": "Person"
      }, 
      {
        "familyName": "Maleshkova", 
        "givenName": "Maria", 
        "type": "Person"
      }, 
      {
        "familyName": "Corcho", 
        "givenName": "Oscar", 
        "type": "Person"
      }, 
      {
        "familyName": "Ristoski", 
        "givenName": "Petar", 
        "type": "Person"
      }, 
      {
        "familyName": "Alam", 
        "givenName": "Mehwish", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-030-77385-4_1", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-030-77384-7", 
        "978-3-030-77385-4"
      ], 
      "name": "The Semantic Web", 
      "type": "Book"
    }, 
    "keywords": [
      "RDF graphs", 
      "single server", 
      "cluster of servers", 
      "data partitioning", 
      "data locality", 
      "server", 
      "partitioning strategies", 
      "partitioning approach", 
      "entire dataset", 
      "graph", 
      "reasoning", 
      "hashing", 
      "hash", 
      "small subset", 
      "partitioning", 
      "algorithm", 
      "dataset", 
      "memory", 
      "performance", 
      "unrealistic demands", 
      "demand", 
      "attention", 
      "little attention", 
      "clusters", 
      "data", 
      "subset", 
      "localities", 
      "strategies", 
      "results", 
      "variants", 
      "practice", 
      "size", 
      "capacity", 
      "approach", 
      "min", 
      "paper", 
      "problem", 
      "variant of hashing", 
      "Locality-aware partitioning approaches", 
      "new RDF partitioning strategies", 
      "RDF partitioning strategies", 
      "Datalog Reasoning"
    ], 
    "name": "Streaming Partitioning of RDF Graphs for Datalog Reasoning", 
    "pagination": "3-22", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1138468547"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-030-77385-4_1"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-030-77385-4_1", 
      "https://app.dimensions.ai/details/publication/pub.1138468547"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-01-01T19:17", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/chapter/chapter_30.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-030-77385-4_1"
  }
]
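Because JSON-LD is ordinary JSON, the record can be read with a standard JSON parser. The sketch below pulls the title, authors, publication date, and DOI out of a trimmed copy of the record; the `summarize` helper and the trimmed string are assumptions made for this example, not part of SciGraph.

```python
import json

# Trimmed copy of the record; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "name": "Streaming Partitioning of RDF Graphs for Datalog Reasoning",
    "datePublished": "2021-05-31",
    "author": [
      {"familyName": "Ajileye", "givenName": "Temitope", "type": "Person"},
      {"familyName": "Motik", "givenName": "Boris", "type": "Person"},
      {"familyName": "Horrocks", "givenName": "Ian", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1138468547"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-030-77385-4_1"]}
    ]
  }
]
"""


def summarize(record_json: str) -> dict:
    """Extract a few bibliographic fields from a SciGraph-style JSON-LD array."""
    record = json.loads(record_json)[0]  # the document is a one-element array
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "published": record["datePublished"],
        "doi": ids.get("doi"),
    }


summary = summarize(RECORD_JSON)
print(summary)
```

This treats the record as plain JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand the terms into IRIs.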
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-77385-4_1'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-77385-4_1'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-77385-4_1'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-77385-4_1'
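The same content negotiation can be sketched in Python with the standard library. The short format labels, `build_request`, and `fetch` are names assumed for this example; the URL and `Accept` values are the ones used in the curl commands above. Whether the endpoint still responds is not something this sketch can guarantee.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-030-77385-4_1"

# Short labels (an assumption of this sketch) mapped to the Accept values above.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request that asks for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt: str, url: str = BASE_URL, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network; fetch() does.
request = build_request("turtle")
print(request.full_url, request.get_header("Accept"))
```

Calling `fetch("nt")` would perform the equivalent of the N-Triples curl command; it is left uncalled here so the sketch runs offline.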


 

This table displays all metadata directly associated with this object as RDF triples.

151 TRIPLES      23 PREDICATES      67 URIs      60 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-030-77385-4_1 schema:about anzsrc-for:08
2 anzsrc-for:0803
3 schema:author N361f0252c6934386a284accd171e3492
4 schema:datePublished 2021-05-31
5 schema:datePublishedReg 2021-05-31
6 schema:description A cluster of servers is often used to reason over RDF graphs whose size exceeds the capacity of a single server. While many distributed approaches to reasoning have been proposed, the problem of data partitioning has received little attention thus far. In practice, data is usually partitioned by a variant of hashing, which is very simple, but it does not pay attention to data locality. Locality-aware partitioning approaches have been considered, but they usually process the entire dataset on a single server. In this paper, we present two new RDF partitioning strategies. Both are inspired by recent streaming graph partitioning algorithms, which partition a graph while keeping only a small subset of the graph in memory. We have evaluated our approaches empirically against hash and min-cut partitioning. Our results suggest that our approaches can significantly improve reasoning performance, but without unrealistic demands on the memory of the servers used for partitioning.
7 schema:editor Nec8e7a96a408491bb6c0858d7d26e448
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf N2f8a72bde28445e98346822136b4b71b
12 schema:keywords Datalog Reasoning
13 Locality-aware partitioning approaches
14 RDF graphs
15 RDF partitioning strategies
16 algorithm
17 approach
18 attention
19 capacity
20 cluster of servers
21 clusters
22 data
23 data locality
24 data partitioning
25 dataset
26 demand
27 entire dataset
28 graph
29 hash
30 hashing
31 little attention
32 localities
33 memory
34 min
35 new RDF partitioning strategies
36 paper
37 partitioning
38 partitioning approach
39 partitioning strategies
40 performance
41 practice
42 problem
43 reasoning
44 results
45 server
46 single server
47 size
48 small subset
49 strategies
50 subset
51 unrealistic demands
52 variant of hashing
53 variants
54 schema:name Streaming Partitioning of RDF Graphs for Datalog Reasoning
55 schema:pagination 3-22
56 schema:productId N82d70042d4ab4531af009f1475cba629
57 Nd1cc7eca218649c79ff61aa62e34681f
58 schema:publisher N31c7746108d94bc78b19d591b9897e73
59 schema:sameAs https://app.dimensions.ai/details/publication/pub.1138468547
60 https://doi.org/10.1007/978-3-030-77385-4_1
61 schema:sdDatePublished 2022-01-01T19:17
62 schema:sdLicense https://scigraph.springernature.com/explorer/license/
63 schema:sdPublisher Nde42110c818f4cdd88d241dc137b1bfa
64 schema:url https://doi.org/10.1007/978-3-030-77385-4_1
65 sgo:license sg:explorer/license/
66 sgo:sdDataset chapters
67 rdf:type schema:Chapter
68 N0907a18f529d43578cb1a9cffa139ae6 rdf:first N3d22d3cbcd9d4c619d1617e8aa77f8f7
69 rdf:rest N3005b6fdafd943f796515cd00d628633
70 N0f337f0838714e999873c197b3d96ccc schema:familyName Ristoski
71 schema:givenName Petar
72 rdf:type schema:Person
73 N1aeef874943c4ed1b7c3db7e1ec52f20 rdf:first sg:person.07401076267.36
74 rdf:rest Ne1e8cd185f5c435d8d8f8ed7d35fb1b0
75 N2f8a72bde28445e98346822136b4b71b schema:isbn 978-3-030-77384-7
76 978-3-030-77385-4
77 schema:name The Semantic Web
78 rdf:type schema:Book
79 N3005b6fdafd943f796515cd00d628633 rdf:first N0f337f0838714e999873c197b3d96ccc
80 rdf:rest Nc99b1a39d154437da44c66f361234cfa
81 N31c7746108d94bc78b19d591b9897e73 schema:name Springer Nature
82 rdf:type schema:Organisation
83 N361f0252c6934386a284accd171e3492 rdf:first sg:person.010564701705.97
84 rdf:rest N1aeef874943c4ed1b7c3db7e1ec52f20
85 N385d9c82291d41a998bb6d09e1ac21cf schema:familyName Paulheim
86 schema:givenName Heiko
87 rdf:type schema:Person
88 N3d22d3cbcd9d4c619d1617e8aa77f8f7 schema:familyName Corcho
89 schema:givenName Oscar
90 rdf:type schema:Person
91 N764dbff4d2d54befb2ace80feb6c8431 rdf:first N8380cc389b4d4a949ac3356d4d60db19
92 rdf:rest Nfeee0c937b324c45b369b15b360904ff
93 N7830124640bf414b8fccd2b684a04a04 schema:familyName Alam
94 schema:givenName Mehwish
95 rdf:type schema:Person
96 N82d70042d4ab4531af009f1475cba629 schema:name doi
97 schema:value 10.1007/978-3-030-77385-4_1
98 rdf:type schema:PropertyValue
99 N8380cc389b4d4a949ac3356d4d60db19 schema:familyName Champin
100 schema:givenName Pierre-Antoine
101 rdf:type schema:Person
102 N8fdd0c31bcf0439eb0a07077d6a01eab schema:familyName Hose
103 schema:givenName Katja
104 rdf:type schema:Person
105 Nc99b1a39d154437da44c66f361234cfa rdf:first N7830124640bf414b8fccd2b684a04a04
106 rdf:rest rdf:nil
107 Nca982661aba94cde9050a867355c7a91 rdf:first N8fdd0c31bcf0439eb0a07077d6a01eab
108 rdf:rest Nedc0ef70d6144bd49841bec5dd30f8af
109 Nd1cc7eca218649c79ff61aa62e34681f schema:name dimensions_id
110 schema:value pub.1138468547
111 rdf:type schema:PropertyValue
112 Nd50afd442db34251b7c05f20484aea86 schema:familyName Verborgh
113 schema:givenName Ruben
114 rdf:type schema:Person
115 Nde42110c818f4cdd88d241dc137b1bfa schema:name Springer Nature - SN SciGraph project
116 rdf:type schema:Organization
117 Ndf88e5d4b25845279e1febdf2380dc3a schema:familyName Maleshkova
118 schema:givenName Maria
119 rdf:type schema:Person
120 Ne1e8cd185f5c435d8d8f8ed7d35fb1b0 rdf:first sg:person.013100561643.19
121 rdf:rest rdf:nil
122 Nec8e7a96a408491bb6c0858d7d26e448 rdf:first Nd50afd442db34251b7c05f20484aea86
123 rdf:rest Nca982661aba94cde9050a867355c7a91
124 Nedc0ef70d6144bd49841bec5dd30f8af rdf:first N385d9c82291d41a998bb6d09e1ac21cf
125 rdf:rest N764dbff4d2d54befb2ace80feb6c8431
126 Nfeee0c937b324c45b369b15b360904ff rdf:first Ndf88e5d4b25845279e1febdf2380dc3a
127 rdf:rest N0907a18f529d43578cb1a9cffa139ae6
128 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
129 schema:name Information and Computing Sciences
130 rdf:type schema:DefinedTerm
131 anzsrc-for:0803 schema:inDefinedTermSet anzsrc-for:
132 schema:name Computer Software
133 rdf:type schema:DefinedTerm
134 sg:person.010564701705.97 schema:affiliation grid-institutes:grid.4991.5
135 schema:familyName Ajileye
136 schema:givenName Temitope
137 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010564701705.97
138 rdf:type schema:Person
139 sg:person.013100561643.19 schema:affiliation grid-institutes:grid.4991.5
140 schema:familyName Horrocks
141 schema:givenName Ian
142 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013100561643.19
143 rdf:type schema:Person
144 sg:person.07401076267.36 schema:affiliation grid-institutes:grid.4991.5
145 schema:familyName Motik
146 schema:givenName Boris
147 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07401076267.36
148 rdf:type schema:Person
149 grid-institutes:grid.4991.5 schema:alternateName Department of Computer Science, University of Oxford, Oxford, UK
150 schema:name Department of Computer Science, University of Oxford, Oxford, UK
151 rdf:type schema:Organization
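In the table, the ordered author and editor lists are encoded as RDF collections: each blank node carries an `rdf:first` item and an `rdf:rest` pointer to the next node, ending at `rdf:nil`. The sketch below walks the author list starting from the node given for `schema:author` in row 3; the dictionary layout and the `walk_list` helper are assumptions made for this example, while the node identifiers are copied from rows 73-74, 83-84, and 120-121.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from the author-list rows of the table.
TRIPLES = {
    ("N361f0252c6934386a284accd171e3492", "rdf:first"): "sg:person.010564701705.97",
    ("N361f0252c6934386a284accd171e3492", "rdf:rest"): "N1aeef874943c4ed1b7c3db7e1ec52f20",
    ("N1aeef874943c4ed1b7c3db7e1ec52f20", "rdf:first"): "sg:person.07401076267.36",
    ("N1aeef874943c4ed1b7c3db7e1ec52f20", "rdf:rest"): "Ne1e8cd185f5c435d8d8f8ed7d35fb1b0",
    ("Ne1e8cd185f5c435d8d8f8ed7d35fb1b0", "rdf:first"): "sg:person.013100561643.19",
    ("Ne1e8cd185f5c435d8d8f8ed7d35fb1b0", "rdf:rest"): RDF_NIL,
}


def walk_list(head: str, triples: dict) -> list:
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


authors = walk_list("N361f0252c6934386a284accd171e3492", TRIPLES)
print(authors)
```

The resulting order (Ajileye, Motik, Horrocks) matches the author order in the JSON-LD record; the editor list can be walked the same way from the node given for `schema:editor`.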
 



