NearBucket-LSH: Efficient Similarity Search in P2P Networks View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2016-09-27

AUTHORS

Naama Kraus , David Carmel , Idit Keidar , Meni Orenbach

ABSTRACT

We present NearBucket-LSH, an effective algorithm for similarity search in large-scale distributed online social networks organized as peer-to-peer overlays. As communication is a dominant consideration in distributed systems, we focus on minimizing the network cost while guaranteeing good search quality. Our algorithm is based on Locality Sensitive Hashing (LSH), which limits the search to collections of objects, called buckets, that have a high probability to be similar to the query. More specifically, NearBucket-LSH employs an LSH extension that searches in near buckets, and improves search quality but also significantly increases the network cost. We decrease the network cost by considering the internals of both LSH and the P2P overlay, and harnessing their properties to our needs. We show that our NearBucket-LSH increases search quality for a given network cost compared to previous art. In many cases, the search quality increases by more than 50%\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$50\,\%$$\end{document}. More... »

PAGES

236-249

Book

TITLE

Similarity Search and Applications

ISBN

978-3-319-46758-0
978-3-319-46759-7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-46759-7_18

DOI

http://dx.doi.org/10.1007/978-3-319-46759-7_18

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1024167017


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/10", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Technology", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1005", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Communications Technologies", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Viterbi EE Technion, Haifa, Israel", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Viterbi EE Technion, Haifa, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kraus", 
        "givenName": "Naama", 
        "id": "sg:person.011543157475.92", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011543157475.92"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Yahoo Research, Haifa, Israel", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Yahoo Research, Haifa, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Carmel", 
        "givenName": "David", 
        "id": "sg:person.015176477151.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015176477151.41"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Yahoo Research, Haifa, Israel", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Viterbi EE Technion, Haifa, Israel", 
            "Yahoo Research, Haifa, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Keidar", 
        "givenName": "Idit", 
        "id": "sg:person.07674464077.03", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07674464077.03"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Viterbi EE Technion, Haifa, Israel", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Viterbi EE Technion, Haifa, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Orenbach", 
        "givenName": "Meni", 
        "id": "sg:person.016361010307.49", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016361010307.49"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2016-09-27", 
    "datePublishedReg": "2016-09-27", 
    "description": "We present NearBucket-LSH, an effective algorithm for similarity search in large-scale distributed online social networks organized as peer-to-peer overlays. As communication is a dominant consideration in distributed systems, we focus on minimizing the network cost while guaranteeing good search quality. Our algorithm is based on Locality Sensitive Hashing (LSH), which limits the search to collections of objects, called buckets, that have a high probability to be similar to the query. More specifically, NearBucket-LSH employs an LSH extension that searches in near buckets, and improves search quality but also significantly increases the network cost. We decrease the network cost by considering the internals of both LSH and the P2P overlay, and harnessing their properties to our needs. We show that our NearBucket-LSH increases search quality for a given network cost compared to previous art. In many cases, the search quality increases by more than 50%\\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$$50\\,\\%$$\\end{document}.", 
    "editor": [
      {
        "familyName": "Amsaleg", 
        "givenName": "Laurent", 
        "type": "Person"
      }, 
      {
        "familyName": "Houle", 
        "givenName": "Michael E.", 
        "type": "Person"
      }, 
      {
        "familyName": "Schubert", 
        "givenName": "Erich", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-46759-7_18", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-319-46758-0", 
        "978-3-319-46759-7"
      ], 
      "name": "Similarity Search and Applications", 
      "type": "Book"
    }, 
    "keywords": [
      "Locality Sensitive Hashing", 
      "search quality", 
      "network cost", 
      "similarity search", 
      "better search quality", 
      "efficient similarity search", 
      "online social networks", 
      "collection of objects", 
      "Sensitive Hashing", 
      "peer overlay", 
      "P2P overlay", 
      "P2P networks", 
      "effective algorithm", 
      "social networks", 
      "previous art", 
      "algorithm", 
      "network", 
      "overlay", 
      "search", 
      "queries", 
      "hashing", 
      "cost", 
      "quality increases", 
      "bucket", 
      "objects", 
      "high probability", 
      "communication", 
      "quality", 
      "dominant consideration", 
      "peers", 
      "art", 
      "system", 
      "collection", 
      "extension", 
      "internals", 
      "need", 
      "probability", 
      "consideration", 
      "cases", 
      "properties", 
      "increase", 
      "NearBucket-LSH", 
      "LSH extension", 
      "near buckets", 
      "search quality increases"
    ], 
    "name": "NearBucket-LSH: Efficient Similarity Search in P2P Networks", 
    "pagination": "236-249", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1024167017"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-46759-7_18"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-46759-7_18", 
      "https://app.dimensions.ai/details/publication/pub.1024167017"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-01-01T19:14", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/chapter/chapter_245.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-319-46759-7_18"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-46759-7_18'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-46759-7_18'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-46759-7_18'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-46759-7_18'


 

This table displays all metadata directly associated to this object as RDF triples.

146 TRIPLES      23 PREDICATES      72 URIs      63 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-46759-7_18 schema:about anzsrc-for:08
2 anzsrc-for:0806
3 anzsrc-for:10
4 anzsrc-for:1005
5 schema:author N73985a52fd694b56bb57c02d8ec6acaa
6 schema:datePublished 2016-09-27
7 schema:datePublishedReg 2016-09-27
8 schema:description We present NearBucket-LSH, an effective algorithm for similarity search in large-scale distributed online social networks organized as peer-to-peer overlays. As communication is a dominant consideration in distributed systems, we focus on minimizing the network cost while guaranteeing good search quality. Our algorithm is based on Locality Sensitive Hashing (LSH), which limits the search to collections of objects, called buckets, that have a high probability to be similar to the query. More specifically, NearBucket-LSH employs an LSH extension that searches in near buckets, and improves search quality but also significantly increases the network cost. We decrease the network cost by considering the internals of both LSH and the P2P overlay, and harnessing their properties to our needs. We show that our NearBucket-LSH increases search quality for a given network cost compared to previous art. In many cases, the search quality increases by more than 50%\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$50\,\%$$\end{document}.
9 schema:editor N4d7934db163c401093da97d64ee106b5
10 schema:genre chapter
11 schema:inLanguage en
12 schema:isAccessibleForFree true
13 schema:isPartOf N2fb24e6bbfca4a76b443cfaae6cbbd19
14 schema:keywords LSH extension
15 Locality Sensitive Hashing
16 NearBucket-LSH
17 P2P networks
18 P2P overlay
19 Sensitive Hashing
20 algorithm
21 art
22 better search quality
23 bucket
24 cases
25 collection
26 collection of objects
27 communication
28 consideration
29 cost
30 dominant consideration
31 effective algorithm
32 efficient similarity search
33 extension
34 hashing
35 high probability
36 increase
37 internals
38 near buckets
39 need
40 network
41 network cost
42 objects
43 online social networks
44 overlay
45 peer overlay
46 peers
47 previous art
48 probability
49 properties
50 quality
51 quality increases
52 queries
53 search
54 search quality
55 search quality increases
56 similarity search
57 social networks
58 system
59 schema:name NearBucket-LSH: Efficient Similarity Search in P2P Networks
60 schema:pagination 236-249
61 schema:productId N37bd25b8dd4140f98b141f56763cc68a
62 Nad543b3366d442c0a5ef1e1d219439e8
63 schema:publisher Neda97522e8c7449daef118da6d147d4d
64 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024167017
65 https://doi.org/10.1007/978-3-319-46759-7_18
66 schema:sdDatePublished 2022-01-01T19:14
67 schema:sdLicense https://scigraph.springernature.com/explorer/license/
68 schema:sdPublisher Ne7c9c5c1add74668ae58f5dae3dcc9f1
69 schema:url https://doi.org/10.1007/978-3-319-46759-7_18
70 sgo:license sg:explorer/license/
71 sgo:sdDataset chapters
72 rdf:type schema:Chapter
73 N0c4346d12b6644f088067d4ab814413b schema:familyName Houle
74 schema:givenName Michael E.
75 rdf:type schema:Person
76 N2fb24e6bbfca4a76b443cfaae6cbbd19 schema:isbn 978-3-319-46758-0
77 978-3-319-46759-7
78 schema:name Similarity Search and Applications
79 rdf:type schema:Book
80 N30b3bc08ef99428bae3b46ac1feebf1d rdf:first sg:person.016361010307.49
81 rdf:rest rdf:nil
82 N37bd25b8dd4140f98b141f56763cc68a schema:name doi
83 schema:value 10.1007/978-3-319-46759-7_18
84 rdf:type schema:PropertyValue
85 N4d7934db163c401093da97d64ee106b5 rdf:first N9d25ef15ed4a498e82ff1de6367c698b
86 rdf:rest N95cf84e616ca470da390885923aef789
87 N5b887fca2f13402e9a228bd6aee5ee18 rdf:first sg:person.07674464077.03
88 rdf:rest N30b3bc08ef99428bae3b46ac1feebf1d
89 N73985a52fd694b56bb57c02d8ec6acaa rdf:first sg:person.011543157475.92
90 rdf:rest N7a059d22948c4ea49ab8c8c5d4eeb029
91 N7a059d22948c4ea49ab8c8c5d4eeb029 rdf:first sg:person.015176477151.41
92 rdf:rest N5b887fca2f13402e9a228bd6aee5ee18
93 N81d15f9243d1489497138ca2cfcbbf78 rdf:first Nd16e7f7bb66a4ce4a897d4fa9cf7e147
94 rdf:rest rdf:nil
95 N95cf84e616ca470da390885923aef789 rdf:first N0c4346d12b6644f088067d4ab814413b
96 rdf:rest N81d15f9243d1489497138ca2cfcbbf78
97 N9d25ef15ed4a498e82ff1de6367c698b schema:familyName Amsaleg
98 schema:givenName Laurent
99 rdf:type schema:Person
100 Nad543b3366d442c0a5ef1e1d219439e8 schema:name dimensions_id
101 schema:value pub.1024167017
102 rdf:type schema:PropertyValue
103 Nd16e7f7bb66a4ce4a897d4fa9cf7e147 schema:familyName Schubert
104 schema:givenName Erich
105 rdf:type schema:Person
106 Ne7c9c5c1add74668ae58f5dae3dcc9f1 schema:name Springer Nature - SN SciGraph project
107 rdf:type schema:Organization
108 Neda97522e8c7449daef118da6d147d4d schema:name Springer Nature
109 rdf:type schema:Organisation
110 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
111 schema:name Information and Computing Sciences
112 rdf:type schema:DefinedTerm
113 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
114 schema:name Information Systems
115 rdf:type schema:DefinedTerm
116 anzsrc-for:10 schema:inDefinedTermSet anzsrc-for:
117 schema:name Technology
118 rdf:type schema:DefinedTerm
119 anzsrc-for:1005 schema:inDefinedTermSet anzsrc-for:
120 schema:name Communications Technologies
121 rdf:type schema:DefinedTerm
122 sg:person.011543157475.92 schema:affiliation grid-institutes:None
123 schema:familyName Kraus
124 schema:givenName Naama
125 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011543157475.92
126 rdf:type schema:Person
127 sg:person.015176477151.41 schema:affiliation grid-institutes:None
128 schema:familyName Carmel
129 schema:givenName David
130 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015176477151.41
131 rdf:type schema:Person
132 sg:person.016361010307.49 schema:affiliation grid-institutes:None
133 schema:familyName Orenbach
134 schema:givenName Meni
135 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016361010307.49
136 rdf:type schema:Person
137 sg:person.07674464077.03 schema:affiliation grid-institutes:None
138 schema:familyName Keidar
139 schema:givenName Idit
140 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07674464077.03
141 rdf:type schema:Person
142 grid-institutes:None schema:alternateName Viterbi EE Technion, Haifa, Israel
143 Yahoo Research, Haifa, Israel
144 schema:name Viterbi EE Technion, Haifa, Israel
145 Yahoo Research, Haifa, Israel
146 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...