Efficient Memory Representation of XML Documents


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2005

AUTHORS

Giorgio Busatto , Markus Lohrey , Sebastian Maneth

ABSTRACT

Implementations that load XML documents and give access to them via, e.g., the DOM suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document itself. A considerable amount of memory is needed to store the tree structure of the XML document. Here a technique is presented that represents the tree structure of an XML document efficiently. The representation exploits the high regularity in XML documents by “compressing” their tree structure, that is, by detecting and removing repetitions of tree patterns. The functionality of basic tree operations, like traversal along edges, is preserved in the compressed representation. This makes it possible to execute queries (and in particular, bulk operations) directly, without prior decompression. For certain tasks, like validation against an XML type or checking equality of documents, the representation allows for provably more efficient algorithms than those running on conventional representations.
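
The idea of removing repeated tree patterns can be illustrated with a much simpler relative of the technique: sharing identical subtrees so that the tree becomes a DAG. The sketch below is not the authors' algorithm (the abstract speaks of tree patterns, which are more general than complete subtrees); the function names `share_subtrees` and `count_tree_nodes` are hypothetical, and text and attributes are ignored on the assumption that only the element structure matters.

```python
import xml.etree.ElementTree as ET


def share_subtrees(root):
    """Build a DAG of the element structure by sharing identical subtrees.

    Returns (root_id, nodes), where nodes[i] = (tag, tuple_of_child_ids).
    Only element tags and child order are considered.
    """
    ids = {}    # (tag, child_ids) -> node id
    nodes = []  # node id -> (tag, child_ids)

    def visit(elem):
        # Children are numbered first, so equal subtrees yield equal keys.
        key = (elem.tag, tuple(visit(child) for child in elem))
        if key not in ids:
            ids[key] = len(nodes)
            nodes.append(key)
        return ids[key]

    return visit(root), nodes


def count_tree_nodes(root_id, nodes):
    """Count the nodes of the original tree without unfolding the DAG."""
    sizes = [0] * len(nodes)
    # Children always have smaller ids than their parents, so a single
    # left-to-right pass sees every child before its parent.
    for i, (_tag, children) in enumerate(nodes):
        sizes[i] = 1 + sum(sizes[c] for c in children)
    return sizes[root_id]


doc = ET.fromstring(
    "<books>"
    "<book><title/><author/></book>"
    "<book><title/><author/></book>"
    "<book><title/><author/></book>"
    "</books>"
)
root_id, nodes = share_subtrees(doc)
print(len(nodes), "DAG nodes for", count_tree_nodes(root_id, nodes), "tree nodes")
```

`count_tree_nodes` is a small instance of an operation that runs on the shared form directly, in time proportional to the DAG rather than to the unfolded tree. The recursive `visit` is fine for a sketch but would hit Python's recursion limit on very deep documents.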

PAGES

199-216

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/11601524_13

DOI

http://dx.doi.org/10.1007/11601524_13

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1014657474



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0804", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Data Format", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department f\u00fcr Informatik, Universit\u00e4t Oldenburg, Germany", 
          "id": "http://www.grid.ac/institutes/grid.5560.6", 
          "name": [
            "Department f\u00fcr Informatik, Universit\u00e4t Oldenburg, Germany"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Busatto", 
        "givenName": "Giorgio", 
        "id": "sg:person.010376065260.68", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010376065260.68"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "FMI, Universit\u00e4t Stuttgart, Germany", 
          "id": "http://www.grid.ac/institutes/grid.5719.a", 
          "name": [
            "FMI, Universit\u00e4t Stuttgart, Germany"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lohrey", 
        "givenName": "Markus", 
        "id": "sg:person.015611460437.34", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015611460437.34"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Facult\u00e9 I & C, EPFL, Switzerland", 
          "id": "http://www.grid.ac/institutes/grid.5333.6", 
          "name": [
            "Facult\u00e9 I & C, EPFL, Switzerland"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Maneth", 
        "givenName": "Sebastian", 
        "id": "sg:person.016240662443.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016240662443.33"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2005", 
    "datePublishedReg": "2005-01-01", 
    "description": "Implementations that load XML documents and give access to them via, e.g., the DOM, suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document. A considerable amount of memory is needed to store the tree structure of the XML document. Here a technique is presented that allows to represent the tree structure of an XML document in an efficient way. The representation exploits the high regularity in XML documents by \u201ccompressing\u201d their tree structure; the latter means to detect and remove repetitions of tree patterns. The functionality of basic tree operations, like traversal along edges, is preserved in the compressed representation. This allows to directly execute queries (and in particular, bulk operations) without prior decompression. For certain tasks like validation against an XML type or checking equality of documents, the representation allows for provably more efficient algorithms than those running on conventional representations.", 
    "editor": [
      {
        "familyName": "Bierman", 
        "givenName": "Gavin", 
        "type": "Person"
      }, 
      {
        "familyName": "Koch", 
        "givenName": "Christoph", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/11601524_13", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-30951-2", 
        "978-3-540-31445-5"
      ], 
      "name": "Database Programming Languages", 
      "type": "Book"
    }, 
    "keywords": [
      "XML documents", 
      "tree structure", 
      "huge memory demands", 
      "basic tree operations", 
      "efficient memory representations", 
      "XML types", 
      "efficient algorithm", 
      "tree operations", 
      "prior decompression", 
      "certain tasks", 
      "tree patterns", 
      "efficient way", 
      "memory demands", 
      "documents", 
      "representation", 
      "queries", 
      "traversal", 
      "algorithm", 
      "conventional representation", 
      "implementation", 
      "task", 
      "functionality", 
      "access", 
      "considerable amount", 
      "memory", 
      "operation", 
      "high regularity", 
      "space", 
      "technique", 
      "demand", 
      "validation", 
      "way", 
      "edge", 
      "memory representations", 
      "structure", 
      "time", 
      "DOM", 
      "regularity", 
      "amount", 
      "size", 
      "types", 
      "repetition", 
      "patterns", 
      "equality", 
      "decompression", 
      "equality of documents"
    ], 
    "name": "Efficient Memory Representation of XML Documents", 
    "pagination": "199-216", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1014657474"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/11601524_13"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/11601524_13", 
      "https://app.dimensions.ai/details/publication/pub.1014657474"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2021-11-01T18:54", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211101/entities/gbq_results/chapter/chapter_297.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/11601524_13"
  }
]
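
Because JSON-LD is plain JSON, the record above can be read with any JSON parser. The following is a minimal sketch that pulls a few citation fields out of a trimmed copy of the record; the `summarize` helper and the shape of its result are illustrative assumptions, not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "Busatto", "givenName": "Giorgio"},
      {"familyName": "Lohrey", "givenName": "Markus"},
      {"familyName": "Maneth", "givenName": "Sebastian"}
    ],
    "datePublished": "2005",
    "name": "Efficient Memory Representation of XML Documents",
    "pagination": "199-216",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1014657474"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/11601524_13"]}
    ]
  }
]
"""


def summarize(record_text):
    """Return a small dict of citation fields from a record shaped like the one above."""
    entry = json.loads(record_text)[0]  # the record is a one-element list
    # Each productId entry carries its identifier as a one-element list.
    product_ids = {p["name"]: p["value"][0] for p in entry.get("productId", [])}
    return {
        "title": entry["name"],
        "year": entry["datePublished"],
        "pages": entry.get("pagination"),
        "doi": product_ids.get("doi"),
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in entry["author"]],
    }


summary = summarize(RECORD)
print(summary)
```

This treats the record as ordinary JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand the short property names into URIs.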
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11601524_13'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11601524_13'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11601524_13'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11601524_13'
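
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in a script. The example below uses Python's standard `urllib.request`; the helper names `build_request` and `fetch` and the `MEDIA_TYPES` table are assumptions made for illustration, and the sketch assumes the endpoint is still reachable and honours the header as the curl commands imply.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/11601524_13"

# Short format name -> media type, taken from the curl commands above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Create a GET request that asks for the given serialization."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Download the record in the given format (performs a network call)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network; fetch("turtle") would.
request = build_request("turtle")
print(request.full_url, request.get_header("Accept"))
```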


 

This table displays all metadata directly associated with this object as RDF triples.

131 TRIPLES      23 PREDICATES      72 URIs      65 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/11601524_13 schema:about anzsrc-for:08
2 anzsrc-for:0804
3 schema:author Nfbcabae296c74688a646286d8396e5da
4 schema:datePublished 2005
5 schema:datePublishedReg 2005-01-01
6 schema:description Implementations that load XML documents and give access to them via, e.g., the DOM, suffer from huge memory demands: the space needed to load an XML document is usually many times larger than the size of the document. A considerable amount of memory is needed to store the tree structure of the XML document. Here a technique is presented that allows to represent the tree structure of an XML document in an efficient way. The representation exploits the high regularity in XML documents by “compressing” their tree structure; the latter means to detect and remove repetitions of tree patterns. The functionality of basic tree operations, like traversal along edges, is preserved in the compressed representation. This allows to directly execute queries (and in particular, bulk operations) without prior decompression. For certain tasks like validation against an XML type or checking equality of documents, the representation allows for provably more efficient algorithms than those running on conventional representations.
7 schema:editor Nb252130a02784f01a618c7223242af93
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf N41d44ff13e7f400e8071cb870aee4c57
12 schema:keywords DOM
13 XML documents
14 XML types
15 access
16 algorithm
17 amount
18 basic tree operations
19 certain tasks
20 considerable amount
21 conventional representation
22 decompression
23 demand
24 documents
25 edge
26 efficient algorithm
27 efficient memory representations
28 efficient way
29 equality
30 equality of documents
31 functionality
32 high regularity
33 huge memory demands
34 implementation
35 memory
36 memory demands
37 memory representations
38 operation
39 patterns
40 prior decompression
41 queries
42 regularity
43 repetition
44 representation
45 size
46 space
47 structure
48 task
49 technique
50 time
51 traversal
52 tree operations
53 tree patterns
54 tree structure
55 types
56 validation
57 way
58 schema:name Efficient Memory Representation of XML Documents
59 schema:pagination 199-216
60 schema:productId Na07e8c08c0b346fc90eacd9f4231fb63
61 Nc6a79c51f1a3497f9b786577527ec2e0
62 schema:publisher Ne48208a780cb4289a1e12ec9e859b1a6
63 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014657474
64 https://doi.org/10.1007/11601524_13
65 schema:sdDatePublished 2021-11-01T18:54
66 schema:sdLicense https://scigraph.springernature.com/explorer/license/
67 schema:sdPublisher Nb4637866f92f4aa68314eccb23e97ab7
68 schema:url https://doi.org/10.1007/11601524_13
69 sgo:license sg:explorer/license/
70 sgo:sdDataset chapters
71 rdf:type schema:Chapter
72 N0b3bc14ed54747bc9a69ecdaefbeafa3 schema:familyName Bierman
73 schema:givenName Gavin
74 rdf:type schema:Person
75 N41d44ff13e7f400e8071cb870aee4c57 schema:isbn 978-3-540-30951-2
76 978-3-540-31445-5
77 schema:name Database Programming Languages
78 rdf:type schema:Book
79 N78aa7351425447ad8c3c4fafe7b7b78f rdf:first sg:person.015611460437.34
80 rdf:rest N99c8c91ef892478581ee1df85ac2d03e
81 N93271332247b4e079658cded40bb94a7 schema:familyName Koch
82 schema:givenName Christoph
83 rdf:type schema:Person
84 N99c8c91ef892478581ee1df85ac2d03e rdf:first sg:person.016240662443.33
85 rdf:rest rdf:nil
86 Na07e8c08c0b346fc90eacd9f4231fb63 schema:name doi
87 schema:value 10.1007/11601524_13
88 rdf:type schema:PropertyValue
89 Nb252130a02784f01a618c7223242af93 rdf:first N0b3bc14ed54747bc9a69ecdaefbeafa3
90 rdf:rest Nd221fb87927a4a1ea54d351a83d50ecc
91 Nb4637866f92f4aa68314eccb23e97ab7 schema:name Springer Nature - SN SciGraph project
92 rdf:type schema:Organization
93 Nc6a79c51f1a3497f9b786577527ec2e0 schema:name dimensions_id
94 schema:value pub.1014657474
95 rdf:type schema:PropertyValue
96 Nd221fb87927a4a1ea54d351a83d50ecc rdf:first N93271332247b4e079658cded40bb94a7
97 rdf:rest rdf:nil
98 Ne48208a780cb4289a1e12ec9e859b1a6 schema:name Springer Nature
99 rdf:type schema:Organisation
100 Nfbcabae296c74688a646286d8396e5da rdf:first sg:person.010376065260.68
101 rdf:rest N78aa7351425447ad8c3c4fafe7b7b78f
102 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
103 schema:name Information and Computing Sciences
104 rdf:type schema:DefinedTerm
105 anzsrc-for:0804 schema:inDefinedTermSet anzsrc-for:
106 schema:name Data Format
107 rdf:type schema:DefinedTerm
108 sg:person.010376065260.68 schema:affiliation grid-institutes:grid.5560.6
109 schema:familyName Busatto
110 schema:givenName Giorgio
111 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010376065260.68
112 rdf:type schema:Person
113 sg:person.015611460437.34 schema:affiliation grid-institutes:grid.5719.a
114 schema:familyName Lohrey
115 schema:givenName Markus
116 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015611460437.34
117 rdf:type schema:Person
118 sg:person.016240662443.33 schema:affiliation grid-institutes:grid.5333.6
119 schema:familyName Maneth
120 schema:givenName Sebastian
121 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016240662443.33
122 rdf:type schema:Person
123 grid-institutes:grid.5333.6 schema:alternateName Faculté I & C, EPFL, Switzerland
124 schema:name Faculté I & C, EPFL, Switzerland
125 rdf:type schema:Organization
126 grid-institutes:grid.5560.6 schema:alternateName Department für Informatik, Universität Oldenburg, Germany
127 schema:name Department für Informatik, Universität Oldenburg, Germany
128 rdf:type schema:Organization
129 grid-institutes:grid.5719.a schema:alternateName FMI, Universität Stuttgart, Germany
130 schema:name FMI, Universität Stuttgart, Germany
131 rdf:type schema:Organization
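
In the table, the author order is not stored as a flat list. `schema:author` points to a blank node (row 3), and the order is encoded as an RDF collection chained through `rdf:first` and `rdf:rest` (rows 100-101, 79-80, 84-85). A minimal sketch of recovering the ordered authors follows; the dictionary layout and the `rdf_list_items` helper are illustrative assumptions, with the node identifiers copied from the table.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from rows 79-80, 84-85 and 100-101.
triples = {
    ("Nfbcabae296c74688a646286d8396e5da", "rdf:first"): "sg:person.010376065260.68",
    ("Nfbcabae296c74688a646286d8396e5da", "rdf:rest"): "N78aa7351425447ad8c3c4fafe7b7b78f",
    ("N78aa7351425447ad8c3c4fafe7b7b78f", "rdf:first"): "sg:person.015611460437.34",
    ("N78aa7351425447ad8c3c4fafe7b7b78f", "rdf:rest"): "N99c8c91ef892478581ee1df85ac2d03e",
    ("N99c8c91ef892478581ee1df85ac2d03e", "rdf:first"): "sg:person.016240662443.33",
    ("N99c8c91ef892478581ee1df85ac2d03e", "rdf:rest"): RDF_NIL,
}


def rdf_list_items(head, triples):
    """Follow rdf:first / rdf:rest from head until rdf:nil and return the items in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


# Row 3 gives the head of the author list.
authors = rdf_list_items("Nfbcabae296c74688a646286d8396e5da", triples)
print(authors)
```

The same walk applied to the `schema:editor` node from row 7 would yield the two editors in order.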
 



