When Distributed Computation Is Communication Expensive View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2013

AUTHORS

David P. Woodruff , Qin Zhang

ABSTRACT

We consider a number of fundamental statistical and graph problems in the message-passing model, where we have k machines (sites), each holding a piece of data, and the machines want to jointly solve a problem defined on the union of the k data sets. The communication is point-to-point, and the goal is to minimize the total communication among the k machines. This model captures all point-to-point distributed computational models with respect to minimizing communication costs. Our analysis shows that exact computation of many statistical and graph problems in this distributed setting requires a prohibitively large amount of communication, and often one cannot improve upon the communication of the simple protocol in which all machines send their data to a centralized server. Thus, in order to obtain protocols that are communication-efficient, one has to allow approximation, or investigate the distribution or layout of the data sets. More... »

PAGES

16-30

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-41527-2_2

DOI

http://dx.doi.org/10.1007/978-3-642-41527-2_2

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1046994347


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0802", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computation Theory and Mathematics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "IBM Research Almaden, USA", 
          "id": "http://www.grid.ac/institutes/grid.481551.c", 
          "name": [
            "IBM Research Almaden, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Woodruff", 
        "givenName": "David P.", 
        "id": "sg:person.012727410605.86", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012727410605.86"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Indiana University Bloomington, USA", 
          "id": "http://www.grid.ac/institutes/grid.411377.7", 
          "name": [
            "Indiana University Bloomington, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhang", 
        "givenName": "Qin", 
        "id": "sg:person.011214363755.14", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011214363755.14"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2013", 
    "datePublishedReg": "2013-01-01", 
    "description": "We consider a number of fundamental statistical and graph problems in the message-passing model, where we have k machines (sites), each holding a piece of data, and the machines want to jointly solve a problem defined on the union of the k data sets. The communication is point-to-point, and the goal is to minimize the total communication among the k machines. This model captures all point-to-point distributed computational models with respect to minimizing communication costs. Our analysis shows that exact computation of many statistical and graph problems in this distributed setting requires a prohibitively large amount of communication, and often one cannot improve upon the communication of the simple protocol in which all machines send their data to a centralized server. Thus, in order to obtain protocols that are communication-efficient, one has to allow approximation, or investigate the distribution or layout of the data sets.", 
    "editor": [
      {
        "familyName": "Afek", 
        "givenName": "Yehuda", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-41527-2_2", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-642-41526-5", 
        "978-3-642-41527-2"
      ], 
      "name": "Distributed Computing", 
      "type": "Book"
    }, 
    "keywords": [
      "graph problems", 
      "k machines", 
      "exact computation", 
      "data sets", 
      "message-passing model", 
      "computation", 
      "pieces of data", 
      "centralized server", 
      "communication cost", 
      "K data set", 
      "computational model", 
      "problem", 
      "approximation", 
      "machine", 
      "model", 
      "set", 
      "point", 
      "total communication", 
      "communication", 
      "large amount", 
      "server", 
      "protocol", 
      "distribution", 
      "order", 
      "number", 
      "respect", 
      "layout", 
      "simple protocol", 
      "data", 
      "cost", 
      "goal", 
      "analysis", 
      "pieces", 
      "amount", 
      "setting", 
      "Union"
    ], 
    "name": "When Distributed Computation Is Communication Expensive", 
    "pagination": "16-30", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1046994347"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-41527-2_2"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-41527-2_2", 
      "https://app.dimensions.ai/details/publication/pub.1046994347"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-12-01T06:51", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/chapter/chapter_312.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-642-41527-2_2"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-41527-2_2'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-41527-2_2'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-41527-2_2'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-41527-2_2'


 

This table displays all metadata directly associated to this object as RDF triples.

113 TRIPLES      22 PREDICATES      63 URIs      54 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-41527-2_2 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 anzsrc-for:08
4 anzsrc-for:0802
5 schema:author Nbf4d38a9fb1f4b05ad9adaf7b458aa0b
6 schema:datePublished 2013
7 schema:datePublishedReg 2013-01-01
8 schema:description We consider a number of fundamental statistical and graph problems in the message-passing model, where we have k machines (sites), each holding a piece of data, and the machines want to jointly solve a problem defined on the union of the k data sets. The communication is point-to-point, and the goal is to minimize the total communication among the k machines. This model captures all point-to-point distributed computational models with respect to minimizing communication costs. Our analysis shows that exact computation of many statistical and graph problems in this distributed setting requires a prohibitively large amount of communication, and often one cannot improve upon the communication of the simple protocol in which all machines send their data to a centralized server. Thus, in order to obtain protocols that are communication-efficient, one has to allow approximation, or investigate the distribution or layout of the data sets.
9 schema:editor N79fde544df1b4747a01020123edd3293
10 schema:genre chapter
11 schema:isAccessibleForFree true
12 schema:isPartOf N24acf98c50954057b1635d00eaaad2a1
13 schema:keywords K data set
14 Union
15 amount
16 analysis
17 approximation
18 centralized server
19 communication
20 communication cost
21 computation
22 computational model
23 cost
24 data
25 data sets
26 distribution
27 exact computation
28 goal
29 graph problems
30 k machines
31 large amount
32 layout
33 machine
34 message-passing model
35 model
36 number
37 order
38 pieces
39 pieces of data
40 point
41 problem
42 protocol
43 respect
44 server
45 set
46 setting
47 simple protocol
48 total communication
49 schema:name When Distributed Computation Is Communication Expensive
50 schema:pagination 16-30
51 schema:productId N2d8666f53c1f45dda313d64c762dc05b
52 Na53897f7ab5d492abee3a165ded4a717
53 schema:publisher N9977dc0b235d4c03a28cbe7ab9a0c2f6
54 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046994347
55 https://doi.org/10.1007/978-3-642-41527-2_2
56 schema:sdDatePublished 2022-12-01T06:51
57 schema:sdLicense https://scigraph.springernature.com/explorer/license/
58 schema:sdPublisher N4ea69d2a930f4e53ac4c5cef7bdd87c3
59 schema:url https://doi.org/10.1007/978-3-642-41527-2_2
60 sgo:license sg:explorer/license/
61 sgo:sdDataset chapters
62 rdf:type schema:Chapter
63 N24acf98c50954057b1635d00eaaad2a1 schema:isbn 978-3-642-41526-5
64 978-3-642-41527-2
65 schema:name Distributed Computing
66 rdf:type schema:Book
67 N2d8666f53c1f45dda313d64c762dc05b schema:name doi
68 schema:value 10.1007/978-3-642-41527-2_2
69 rdf:type schema:PropertyValue
70 N4ea69d2a930f4e53ac4c5cef7bdd87c3 schema:name Springer Nature - SN SciGraph project
71 rdf:type schema:Organization
72 N574143afa2264b6fa3f6e14925cae72c schema:familyName Afek
73 schema:givenName Yehuda
74 rdf:type schema:Person
75 N79fde544df1b4747a01020123edd3293 rdf:first N574143afa2264b6fa3f6e14925cae72c
76 rdf:rest rdf:nil
77 N9977dc0b235d4c03a28cbe7ab9a0c2f6 schema:name Springer Nature
78 rdf:type schema:Organisation
79 Na01ec853d06d43559cd3775b03d46390 rdf:first sg:person.011214363755.14
80 rdf:rest rdf:nil
81 Na53897f7ab5d492abee3a165ded4a717 schema:name dimensions_id
82 schema:value pub.1046994347
83 rdf:type schema:PropertyValue
84 Nbf4d38a9fb1f4b05ad9adaf7b458aa0b rdf:first sg:person.012727410605.86
85 rdf:rest Na01ec853d06d43559cd3775b03d46390
86 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
87 schema:name Mathematical Sciences
88 rdf:type schema:DefinedTerm
89 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
90 schema:name Statistics
91 rdf:type schema:DefinedTerm
92 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
93 schema:name Information and Computing Sciences
94 rdf:type schema:DefinedTerm
95 anzsrc-for:0802 schema:inDefinedTermSet anzsrc-for:
96 schema:name Computation Theory and Mathematics
97 rdf:type schema:DefinedTerm
98 sg:person.011214363755.14 schema:affiliation grid-institutes:grid.411377.7
99 schema:familyName Zhang
100 schema:givenName Qin
101 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011214363755.14
102 rdf:type schema:Person
103 sg:person.012727410605.86 schema:affiliation grid-institutes:grid.481551.c
104 schema:familyName Woodruff
105 schema:givenName David P.
106 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012727410605.86
107 rdf:type schema:Person
108 grid-institutes:grid.411377.7 schema:alternateName Indiana University Bloomington, USA
109 schema:name Indiana University Bloomington, USA
110 rdf:type schema:Organization
111 grid-institutes:grid.481551.c schema:alternateName IBM Research Almaden, USA
112 schema:name IBM Research Almaden, USA
113 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...