Summarization Graph Indexing: Beyond Frequent Structure-Based Approach View Full Text


Ontology type: schema:Chapter     


Chapter Info

DATE

2008-01-01

AUTHORS

Lei Zou , Lei Chen , Huaming Zhang , Yansheng Lu , Qiang Lou

ABSTRACT

Graph is an important data structure to model complex structural data, such as chemical compounds, proteins, and XML documents. Among many graph data-based applications, sub-graph search is a key problem, which is defined as given a query Q, retrieving all graphs containing Q as a sub-graph in the graph database. Most existing sub-graph search methods try to filter out false positives (graphs that are not possible in the results) as many as possible by indexing some frequent sub-structures in graph database, such as [20,22,4,23]. However, due to ignoring the relationships between sub-structures, these methods still admit a high percentage of false positives. In this paper, we propose a novel concept, Summarization Graph, which is a complete graph and captures most topology information of the original graph, such as sub-structures and their relationships. Based on Summarization Graphs, we convert the filtering problem into retrieving objects with set-valued attributes. Moreover, we build an efficient signature file-based index to improve the filtering process. We prove theoretically that the pruning power of our method is larger than existing structure-based approaches. Finally, we show by extensive experimental study on real and synthetic data sets that the size of candidate set generated by Summarization Graph-based approach is only about 50% of that left by existing graph indexing methods, and the total response time of our method is reduced 2-10 times. More... »

PAGES

141-155

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-78568-2_13

DOI

http://dx.doi.org/10.1007/978-3-540-78568-2_13

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1038116125


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Huazhong University of Science and Technology, Wuhan, China", 
          "id": "http://www.grid.ac/institutes/grid.33199.31", 
          "name": [
            "Huazhong University of Science and Technology, Wuhan, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zou", 
        "givenName": "Lei", 
        "id": "sg:person.010105712435.00", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010105712435.00"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Hong Kong of Science and Technology, Hong Kong, China", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Hong Kong of Science and Technology, Hong Kong, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chen", 
        "givenName": "Lei", 
        "id": "sg:person.010574746005.15", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010574746005.15"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "The University of Alabama in Huntsville, 35899, Huntsville, AL, USA", 
          "id": "http://www.grid.ac/institutes/grid.265893.3", 
          "name": [
            "The University of Alabama in Huntsville, 35899, Huntsville, AL, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhang", 
        "givenName": "Huaming", 
        "id": "sg:person.012041227127.88", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012041227127.88"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Huazhong University of Science and Technology, Wuhan, China", 
          "id": "http://www.grid.ac/institutes/grid.33199.31", 
          "name": [
            "Huazhong University of Science and Technology, Wuhan, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lu", 
        "givenName": "Yansheng", 
        "id": "sg:person.010073477267.89", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010073477267.89"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "The Temple University, USA", 
          "id": "http://www.grid.ac/institutes/grid.264727.2", 
          "name": [
            "The Temple University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lou", 
        "givenName": "Qiang", 
        "id": "sg:person.010052456257.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010052456257.41"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2008-01-01", 
    "datePublishedReg": "2008-01-01", 
    "description": "Graph is an important data structure to model complex structural data, such as chemical compounds, proteins, and XML documents. Among many graph data-based applications, sub-graph search is a key problem, which is defined as given a query Q, retrieving all graphs containing Q as a sub-graph in the graph database. Most existing sub-graph search methods try to filter out false positives (graphs that are not possible in the results) as many as possible by indexing some frequent sub-structures in graph database, such as [20,22,4,23]. However, due to ignoring the relationships between sub-structures, these methods still admit a high percentage of false positives. In this paper, we propose a novel concept, Summarization Graph, which is a complete graph and captures most topology information of the original graph, such as sub-structures and their relationships. Based on Summarization Graphs, we convert the filtering problem into retrieving objects with set-valued attributes. Moreover, we build an efficient signature file-based index to improve the filtering process. We prove theoretically that the pruning power of our method is larger than existing structure-based approaches. Finally, we show by extensive experimental study on real and synthetic data sets that the size of candidate set generated by Summarization Graph-based approach is only about 50% of that left by existing graph indexing methods, and the total response time of our method is reduced 2-10 times.", 
    "editor": [
      {
        "familyName": "Haritsa", 
        "givenName": "Jayant R.", 
        "type": "Person"
      }, 
      {
        "familyName": "Kotagiri", 
        "givenName": "Ramamohanarao", 
        "type": "Person"
      }, 
      {
        "familyName": "Pudi", 
        "givenName": "Vikram", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-78568-2_13", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-540-78567-5", 
        "978-3-540-78568-2"
      ], 
      "name": "Database Systems for Advanced Applications", 
      "type": "Book"
    }, 
    "keywords": [
      "graph database", 
      "set-valued attributes", 
      "data base applications", 
      "important data structures", 
      "size of candidate", 
      "synthetic data sets", 
      "extensive experimental study", 
      "XML documents", 
      "query Q", 
      "false positives", 
      "retrieving objects", 
      "topology information", 
      "indexing method", 
      "data structure", 
      "total response time", 
      "original graph", 
      "search method", 
      "key problem", 
      "graph", 
      "data sets", 
      "filtering process", 
      "response time", 
      "novel concept", 
      "database", 
      "filtering problem", 
      "complete graph", 
      "structure-based approach", 
      "positives", 
      "objects", 
      "documents", 
      "method", 
      "information", 
      "attributes", 
      "set", 
      "search", 
      "applications", 
      "concept", 
      "time", 
      "data", 
      "power", 
      "process", 
      "problem", 
      "experimental study", 
      "approach", 
      "structural data", 
      "structure", 
      "candidates", 
      "size", 
      "relationship", 
      "chemical compounds", 
      "index", 
      "study", 
      "percentage", 
      "high percentage", 
      "compounds", 
      "paper", 
      "protein"
    ], 
    "name": "Summarization Graph Indexing: Beyond Frequent Structure-Based Approach", 
    "pagination": "141-155", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1038116125"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-78568-2_13"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-78568-2_13", 
      "https://app.dimensions.ai/details/publication/pub.1038116125"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-05-20T07:41", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/chapter/chapter_110.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-540-78568-2_13"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-78568-2_13'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-78568-2_13'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-78568-2_13'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-78568-2_13'


 

This table displays all metadata directly associated to this object as RDF triples.

164 TRIPLES      23 PREDICATES      82 URIs      75 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-78568-2_13 schema:about anzsrc-for:08
2 anzsrc-for:0806
3 schema:author N6c97c411acd44d29a907a8424a4c21c5
4 schema:datePublished 2008-01-01
5 schema:datePublishedReg 2008-01-01
6 schema:description Graph is an important data structure to model complex structural data, such as chemical compounds, proteins, and XML documents. Among many graph data-based applications, sub-graph search is a key problem, which is defined as given a query Q, retrieving all graphs containing Q as a sub-graph in the graph database. Most existing sub-graph search methods try to filter out false positives (graphs that are not possible in the results) as many as possible by indexing some frequent sub-structures in graph database, such as [20,22,4,23]. However, due to ignoring the relationships between sub-structures, these methods still admit a high percentage of false positives. In this paper, we propose a novel concept, Summarization Graph, which is a complete graph and captures most topology information of the original graph, such as sub-structures and their relationships. Based on Summarization Graphs, we convert the filtering problem into retrieving objects with set-valued attributes. Moreover, we build an efficient signature file-based index to improve the filtering process. We prove theoretically that the pruning power of our method is larger than existing structure-based approaches. Finally, we show by extensive experimental study on real and synthetic data sets that the size of candidate set generated by Summarization Graph-based approach is only about 50% of that left by existing graph indexing methods, and the total response time of our method is reduced 2-10 times.
7 schema:editor Ndb0bc7c5b02648fd9360a87195da3b2f
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N04425794d8cb452891093226a0a7ee7c
12 schema:keywords XML documents
13 applications
14 approach
15 attributes
16 candidates
17 chemical compounds
18 complete graph
19 compounds
20 concept
21 data
22 data base applications
23 data sets
24 data structure
25 database
26 documents
27 experimental study
28 extensive experimental study
29 false positives
30 filtering problem
31 filtering process
32 graph
33 graph database
34 high percentage
35 important data structures
36 index
37 indexing method
38 information
39 key problem
40 method
41 novel concept
42 objects
43 original graph
44 paper
45 percentage
46 positives
47 power
48 problem
49 process
50 protein
51 query Q
52 relationship
53 response time
54 retrieving objects
55 search
56 search method
57 set
58 set-valued attributes
59 size
60 size of candidate
61 structural data
62 structure
63 structure-based approach
64 study
65 synthetic data sets
66 time
67 topology information
68 total response time
69 schema:name Summarization Graph Indexing: Beyond Frequent Structure-Based Approach
70 schema:pagination 141-155
71 schema:productId N6442e497e96d4d7fbe9e74426de7feaf
72 Nd48a2ac15b5c4956980c428a9257f216
73 schema:publisher N716ccdb286414bb8aceba488a861bc6e
74 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038116125
75 https://doi.org/10.1007/978-3-540-78568-2_13
76 schema:sdDatePublished 2022-05-20T07:41
77 schema:sdLicense https://scigraph.springernature.com/explorer/license/
78 schema:sdPublisher Ncce60e9c01a44b5b9e22924d85c3aac0
79 schema:url https://doi.org/10.1007/978-3-540-78568-2_13
80 sgo:license sg:explorer/license/
81 sgo:sdDataset chapters
82 rdf:type schema:Chapter
83 N04425794d8cb452891093226a0a7ee7c schema:isbn 978-3-540-78567-5
84 978-3-540-78568-2
85 schema:name Database Systems for Advanced Applications
86 rdf:type schema:Book
87 N2b939780be3d48c981a444a4c3f3e65f rdf:first sg:person.010574746005.15
88 rdf:rest Na838fddc301848e6b17d6e850b07a7c6
89 N4ed2fbe3b378423e958e0b1dd167a2ef rdf:first sg:person.010073477267.89
90 rdf:rest Ncee84a8160fa4707817b965449ad613b
91 N6442e497e96d4d7fbe9e74426de7feaf schema:name doi
92 schema:value 10.1007/978-3-540-78568-2_13
93 rdf:type schema:PropertyValue
94 N6c97c411acd44d29a907a8424a4c21c5 rdf:first sg:person.010105712435.00
95 rdf:rest N2b939780be3d48c981a444a4c3f3e65f
96 N6fbf7948dfcc405d967b72683b91dc95 schema:familyName Haritsa
97 schema:givenName Jayant R.
98 rdf:type schema:Person
99 N716ccdb286414bb8aceba488a861bc6e schema:name Springer Nature
100 rdf:type schema:Organisation
101 N8508735cadb249d0b76db64a0faab541 schema:familyName Kotagiri
102 schema:givenName Ramamohanarao
103 rdf:type schema:Person
104 Na838fddc301848e6b17d6e850b07a7c6 rdf:first sg:person.012041227127.88
105 rdf:rest N4ed2fbe3b378423e958e0b1dd167a2ef
106 Nac2fd091e53f446dac077d03c4659a13 rdf:first N8508735cadb249d0b76db64a0faab541
107 rdf:rest Nd68714151b8b4a4bbb1aa8f7616a190e
108 Ncce60e9c01a44b5b9e22924d85c3aac0 schema:name Springer Nature - SN SciGraph project
109 rdf:type schema:Organization
110 Ncee84a8160fa4707817b965449ad613b rdf:first sg:person.010052456257.41
111 rdf:rest rdf:nil
112 Nd48a2ac15b5c4956980c428a9257f216 schema:name dimensions_id
113 schema:value pub.1038116125
114 rdf:type schema:PropertyValue
115 Nd68714151b8b4a4bbb1aa8f7616a190e rdf:first Nf33fbef9a0e24278933a629cbb1156a0
116 rdf:rest rdf:nil
117 Ndb0bc7c5b02648fd9360a87195da3b2f rdf:first N6fbf7948dfcc405d967b72683b91dc95
118 rdf:rest Nac2fd091e53f446dac077d03c4659a13
119 Nf33fbef9a0e24278933a629cbb1156a0 schema:familyName Pudi
120 schema:givenName Vikram
121 rdf:type schema:Person
122 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
123 schema:name Information and Computing Sciences
124 rdf:type schema:DefinedTerm
125 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
126 schema:name Information Systems
127 rdf:type schema:DefinedTerm
128 sg:person.010052456257.41 schema:affiliation grid-institutes:grid.264727.2
129 schema:familyName Lou
130 schema:givenName Qiang
131 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010052456257.41
132 rdf:type schema:Person
133 sg:person.010073477267.89 schema:affiliation grid-institutes:grid.33199.31
134 schema:familyName Lu
135 schema:givenName Yansheng
136 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010073477267.89
137 rdf:type schema:Person
138 sg:person.010105712435.00 schema:affiliation grid-institutes:grid.33199.31
139 schema:familyName Zou
140 schema:givenName Lei
141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010105712435.00
142 rdf:type schema:Person
143 sg:person.010574746005.15 schema:affiliation grid-institutes:None
144 schema:familyName Chen
145 schema:givenName Lei
146 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010574746005.15
147 rdf:type schema:Person
148 sg:person.012041227127.88 schema:affiliation grid-institutes:grid.265893.3
149 schema:familyName Zhang
150 schema:givenName Huaming
151 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012041227127.88
152 rdf:type schema:Person
153 grid-institutes:None schema:alternateName Hong Kong of Science and Technology, Hong Kong, China
154 schema:name Hong Kong of Science and Technology, Hong Kong, China
155 rdf:type schema:Organization
156 grid-institutes:grid.264727.2 schema:alternateName The Temple University, USA
157 schema:name The Temple University, USA
158 rdf:type schema:Organization
159 grid-institutes:grid.265893.3 schema:alternateName The University of Alabama in Huntsville, 35899, Huntsville, AL, USA
160 schema:name The University of Alabama in Huntsville, 35899, Huntsville, AL, USA
161 rdf:type schema:Organization
162 grid-institutes:grid.33199.31 schema:alternateName Huazhong University of Science and Technology, Wuhan, China
163 schema:name Huazhong University of Science and Technology, Wuhan, China
164 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...