Characterizing Web Document Change


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2001

AUTHORS

Lipyeow Lim , Min Wang , Sriram Padmanabhan , Jeffrey Scott Vitter , Ramesh Agarwal

ABSTRACT

The World Wide Web is growing and changing at an astonishing rate. For the information in the web to be useful, web information systems such as search engines have to keep up with the growth and change of the web. In this paper we study how web documents change. In particular, we study two important characteristics of web document change that are directly related to keeping web information systems up-to-date: the degree of the change and the clusteredness of the change. We analyze the evolution of web documents with respect to these two measures and discuss the implications for web information systems update.

PAGES

133-144

Book

TITLE

Advances in Web-Age Information Management

ISBN

978-3-540-42298-3
978-3-540-47714-3

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/3-540-47714-4_13

DOI

http://dx.doi.org/10.1007/3-540-47714-4_13

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1029669526



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Duke University", 
          "id": "https://www.grid.ac/institutes/grid.26009.3d", 
          "name": [
            "Dept. of Computer Science, Duke University, Durham, NC, 27708-0129"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lim", 
        "givenName": "Lipyeow", 
        "id": "sg:person.015270452377.19", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015270452377.19"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "IBM Research \u2013 Thomas J. Watson Research Center", 
          "id": "https://www.grid.ac/institutes/grid.481554.9", 
          "name": [
            "IBM T. J. Watson Research Ctr., Hawthorne, NY, 10532"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Min", 
        "id": "sg:person.012657435165.75", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012657435165.75"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "IBM Research \u2013 Thomas J. Watson Research Center", 
          "id": "https://www.grid.ac/institutes/grid.481554.9", 
          "name": [
            "IBM T. J. Watson Research Ctr., Hawthorne, NY, 10532"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Padmanabhan", 
        "givenName": "Sriram", 
        "id": "sg:person.015175246447.02", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015175246447.02"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Duke University", 
          "id": "https://www.grid.ac/institutes/grid.26009.3d", 
          "name": [
            "Dept. of Computer Science, Duke University, Durham, NC, 27708-0129"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vitter", 
        "givenName": "Jeffrey Scott", 
        "id": "sg:person.0613677314.28", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613677314.28"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "IBM Research \u2013 Thomas J. Watson Research Center", 
          "id": "https://www.grid.ac/institutes/grid.481554.9", 
          "name": [
            "IBM T. J. Watson Research Ctr., Hawthorne, NY, 10532"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Agarwal", 
        "givenName": "Ramesh", 
        "id": "sg:person.013776432617.04", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013776432617.04"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1038/21987", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034820983", 
          "https://doi.org/10.1038/21987"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2001", 
    "datePublishedReg": "2001-01-01", 
    "description": "The World Wide Web is growing and changing at an astonishing rate. For the information in the web to be useful, web information systems such as search engines have to keep up with the growth and change of the web. In this paper we study how web documents change. In particular, we study two important characteristics of web document change that are directly related to keeping web information systems up-to-date: the degree of the change and the clusteredness of the change. We analyze the evolution of web documents with respect to these two measures and discuss the implications for web information systems update.", 
    "editor": [
      {
        "familyName": "Wang", 
        "givenName": "X. Sean", 
        "type": "Person"
      }, 
      {
        "familyName": "Yu", 
        "givenName": "Ge", 
        "type": "Person"
      }, 
      {
        "familyName": "Lu", 
        "givenName": "Hongjun", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/3-540-47714-4_13", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-42298-3", 
        "978-3-540-47714-3"
      ], 
      "name": "Advances in Web-Age Information Management", 
      "type": "Book"
    }, 
    "name": "Characterizing Web Document Change", 
    "pagination": "133-144", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/3-540-47714-4_13"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "8d56c2f30c8ccbadda02e90efed8e6c3af7ae0d89355ca2349f43bbd718c8b77"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1029669526"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/3-540-47714-4_13", 
      "https://app.dimensions.ai/details/publication/pub.1029669526"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T16:17", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8675_00000261.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/3-540-47714-4_13"
  }
]
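As a rough illustration of how the record above might be consumed, the following Python sketch loads a trimmed excerpt of it with the standard `json` module and pulls out the title, DOI, and author names. The excerpt keeps only a few fields copied from the record; the helper names `find_product_id` and `author_names` are invented for this example and are not part of SciGraph.

```python
import json

# Trimmed excerpt of the record above; only a handful of fields are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Lim", "givenName": "Lipyeow", "type": "Person"},
      {"familyName": "Wang", "givenName": "Min", "type": "Person"},
      {"familyName": "Padmanabhan", "givenName": "Sriram", "type": "Person"},
      {"familyName": "Vitter", "givenName": "Jeffrey Scott", "type": "Person"},
      {"familyName": "Agarwal", "givenName": "Ramesh", "type": "Person"}
    ],
    "name": "Characterizing Web Document Change",
    "pagination": "133-144",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/3-540-47714-4_13"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1029669526"]}
    ]
  }
]
"""


def find_product_id(record, name):
    """Return the first value of the productId entry called `name`, or None."""
    for entry in record.get("productId", []):
        if entry.get("name") == name and entry.get("value"):
            return entry["value"][0]
    return None


def author_names(record):
    """Return 'given family' strings in the order the record lists them."""
    return [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]


# The top-level JSON value is a list holding a single record object.
record = json.loads(RECORD_JSON)[0]
title = record["name"]
doi = find_product_id(record, "doi")
authors = author_names(record)

print(title)
print(doi)
print(", ".join(authors))
```

Looking identifiers up by their `name` field, rather than by position in `productId`, avoids depending on the order in which the record happens to list them.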
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/3-540-47714-4_13'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/3-540-47714-4_13'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/3-540-47714-4_13'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/3-540-47714-4_13'
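The four curl commands differ only in the `Accept` header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same idea, using only the standard library, is shown here. The short format labels and the `build_request` helper are assumptions made for this example; the URL and media types are copied from the commands above. The sketch only builds the request and does not send it.

```python
from urllib.request import Request

# URL copied from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/3-540-47714-4_13"

# Media types copied from the curl commands; the keys are labels
# chosen for this example only.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for serialization `fmt`."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


req = build_request("turtle")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

To actually fetch the data, the request object could be passed to `urllib.request.urlopen`; whether the endpoint still answers is not something this page confirms.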


 

This table displays all metadata directly associated with this object as RDF triples.

110 TRIPLES      23 PREDICATES      28 URIs      20 LITERALS      8 BLANK NODES

Row Subject Predicate Object (a row that omits the subject or predicate reuses the one from the row before it)
1 sg:pub.10.1007/3-540-47714-4_13 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nb55da7cebe554c188db853f6d1f5bac0
4 schema:citation sg:pub.10.1038/21987
5 schema:datePublished 2001
6 schema:datePublishedReg 2001-01-01
7 schema:description The World Wide Web is growing and changing at an astonishing rate. For the information in the web to be useful, web information systems such as search engines have to keep up with the growth and change of the web. In this paper we study how web documents change. In particular, we study two important characteristics of web document change that are directly related to keeping web information systems up-to-date: the degree of the change and the clusteredness of the change. We analyze the evolution of web documents with respect to these two measures and discuss the implications for web information systems update.
8 schema:editor N8aa5451ff13b454eba95dfad52d436d2
9 schema:genre chapter
10 schema:inLanguage en
11 schema:isAccessibleForFree true
12 schema:isPartOf N63dcbb59b34646aeb46f6c184184444e
13 schema:name Characterizing Web Document Change
14 schema:pagination 133-144
15 schema:productId N744b94b69b5148d585675bf38a3e3610
16 N9b6b6c6827624717be5fae9665bf0390
17 Nf883db4d9de548ee849bfc7c05c928c6
18 schema:publisher N2c1cfaabdfce445f913f9046b213b4c7
19 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029669526
20 https://doi.org/10.1007/3-540-47714-4_13
21 schema:sdDatePublished 2019-04-15T16:17
22 schema:sdLicense https://scigraph.springernature.com/explorer/license/
23 schema:sdPublisher Nb935d76827294bcfbc31a3823b3d2239
24 schema:url http://link.springer.com/10.1007/3-540-47714-4_13
25 sgo:license sg:explorer/license/
26 sgo:sdDataset chapters
27 rdf:type schema:Chapter
28 N1899ad26118e4a379b7d6d01599cf926 rdf:first sg:person.013776432617.04
29 rdf:rest rdf:nil
30 N1ae3c8d6322c43faa3032344027ddd6e schema:familyName Yu
31 schema:givenName Ge
32 rdf:type schema:Person
33 N24c47760e41840c095c32040ff750993 rdf:first sg:person.015175246447.02
34 rdf:rest Ncdaaa4ef744248a4a61c3a84ebc06907
35 N2c1cfaabdfce445f913f9046b213b4c7 schema:location Berlin, Heidelberg
36 schema:name Springer Berlin Heidelberg
37 rdf:type schema:Organisation
38 N63dcbb59b34646aeb46f6c184184444e schema:isbn 978-3-540-42298-3
39 978-3-540-47714-3
40 schema:name Advances in Web-Age Information Management
41 rdf:type schema:Book
42 N744b94b69b5148d585675bf38a3e3610 schema:name readcube_id
43 schema:value 8d56c2f30c8ccbadda02e90efed8e6c3af7ae0d89355ca2349f43bbd718c8b77
44 rdf:type schema:PropertyValue
45 N8083d2740a994d0cb565415ce549ede7 schema:familyName Lu
46 schema:givenName Hongjun
47 rdf:type schema:Person
48 N8aa5451ff13b454eba95dfad52d436d2 rdf:first Nba5bfbd692194d81b0bf8ef45ad2bfcd
49 rdf:rest Ndb892ef9682b49c3bdd1de1eceb2c1a3
50 N9b6b6c6827624717be5fae9665bf0390 schema:name dimensions_id
51 schema:value pub.1029669526
52 rdf:type schema:PropertyValue
53 Nb2cfb85766ff474d813c1be21ec5ad21 rdf:first sg:person.012657435165.75
54 rdf:rest N24c47760e41840c095c32040ff750993
55 Nb55da7cebe554c188db853f6d1f5bac0 rdf:first sg:person.015270452377.19
56 rdf:rest Nb2cfb85766ff474d813c1be21ec5ad21
57 Nb935d76827294bcfbc31a3823b3d2239 schema:name Springer Nature - SN SciGraph project
58 rdf:type schema:Organization
59 Nba5bfbd692194d81b0bf8ef45ad2bfcd schema:familyName Wang
60 schema:givenName X. Sean
61 rdf:type schema:Person
62 Ncdaaa4ef744248a4a61c3a84ebc06907 rdf:first sg:person.0613677314.28
63 rdf:rest N1899ad26118e4a379b7d6d01599cf926
64 Ndb892ef9682b49c3bdd1de1eceb2c1a3 rdf:first N1ae3c8d6322c43faa3032344027ddd6e
65 rdf:rest Nf8cac41eb2a04605b19aecaf2f464f63
66 Nf883db4d9de548ee849bfc7c05c928c6 schema:name doi
67 schema:value 10.1007/3-540-47714-4_13
68 rdf:type schema:PropertyValue
69 Nf8cac41eb2a04605b19aecaf2f464f63 rdf:first N8083d2740a994d0cb565415ce549ede7
70 rdf:rest rdf:nil
71 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
72 schema:name Information and Computing Sciences
73 rdf:type schema:DefinedTerm
74 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
75 schema:name Artificial Intelligence and Image Processing
76 rdf:type schema:DefinedTerm
77 sg:person.012657435165.75 schema:affiliation https://www.grid.ac/institutes/grid.481554.9
78 schema:familyName Wang
79 schema:givenName Min
80 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012657435165.75
81 rdf:type schema:Person
82 sg:person.013776432617.04 schema:affiliation https://www.grid.ac/institutes/grid.481554.9
83 schema:familyName Agarwal
84 schema:givenName Ramesh
85 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013776432617.04
86 rdf:type schema:Person
87 sg:person.015175246447.02 schema:affiliation https://www.grid.ac/institutes/grid.481554.9
88 schema:familyName Padmanabhan
89 schema:givenName Sriram
90 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015175246447.02
91 rdf:type schema:Person
92 sg:person.015270452377.19 schema:affiliation https://www.grid.ac/institutes/grid.26009.3d
93 schema:familyName Lim
94 schema:givenName Lipyeow
95 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015270452377.19
96 rdf:type schema:Person
97 sg:person.0613677314.28 schema:affiliation https://www.grid.ac/institutes/grid.26009.3d
98 schema:familyName Vitter
99 schema:givenName Jeffrey Scott
100 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613677314.28
101 rdf:type schema:Person
102 sg:pub.10.1038/21987 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034820983
103 https://doi.org/10.1038/21987
104 rdf:type schema:CreativeWork
105 https://www.grid.ac/institutes/grid.26009.3d schema:alternateName Duke University
106 schema:name Dept. of Computer Science, Duke University, Durham, NC, 27708-0129
107 rdf:type schema:Organization
108 https://www.grid.ac/institutes/grid.481554.9 schema:alternateName IBM Research – Thomas J. Watson Research Center
109 schema:name IBM T. J. Watson Research Ctr., Hawthorne, NY, 10532
110 rdf:type schema:Organization
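In the table above, the ordered author list is stored as an RDF collection: `schema:author` points at a blank node, and each blank node carries an `rdf:first` item and an `rdf:rest` pointer to the next node until `rdf:nil`. The Python sketch below walks that chain to recover the author order. The node identifiers and family names are copied from the table rows; the function name `walk_rdf_list` and the dictionary layout are assumptions for illustration, not a SciGraph API.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from the rdf:first / rdf:rest
# rows of the table for the schema:author list.
LIST_NODES = {
    "Nb55da7cebe554c188db853f6d1f5bac0":
        ("sg:person.015270452377.19", "Nb2cfb85766ff474d813c1be21ec5ad21"),
    "Nb2cfb85766ff474d813c1be21ec5ad21":
        ("sg:person.012657435165.75", "N24c47760e41840c095c32040ff750993"),
    "N24c47760e41840c095c32040ff750993":
        ("sg:person.015175246447.02", "Ncdaaa4ef744248a4a61c3a84ebc06907"),
    "Ncdaaa4ef744248a4a61c3a84ebc06907":
        ("sg:person.0613677314.28", "N1899ad26118e4a379b7d6d01599cf926"),
    "N1899ad26118e4a379b7d6d01599cf926":
        ("sg:person.013776432617.04", RDF_NIL),
}

# schema:familyName for each person, copied from the table.
FAMILY_NAMES = {
    "sg:person.015270452377.19": "Lim",
    "sg:person.012657435165.75": "Wang",
    "sg:person.015175246447.02": "Padmanabhan",
    "sg:person.0613677314.28": "Vitter",
    "sg:person.013776432617.04": "Agarwal",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest from `head` and collect each rdf:first item in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# The head node is the object of schema:author in row 3 of the table.
ordered_people = walk_rdf_list("Nb55da7cebe554c188db853f6d1f5bac0", LIST_NODES)
ordered_names = [FAMILY_NAMES[p] for p in ordered_people]
print(ordered_names)
```

The recovered order matches the author order given in the chapter information and in the JSON-LD `author` array.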
 



