Ontology type: schema:ScholarlyArticle
Date published: 2011-10
Authors: Emmanuel Bruno, Nicolas Faessel, Hervé Glotin, Jacques Le Maitre, Michel Scholl
Abstract: We present in this paper a model for indexing and querying web pages, based on the hierarchical decomposition of pages into blocks. Splitting up a page into blocks has several advantages in terms of page design, indexing and querying such as (i) blocks of a page most similar to a query may be returned instead of the page as a whole (ii) the importance of a block can be taken into account, as well as (iii) the permeability of the blocks to neighbor blocks: a block b is said to be permeable to a block b′ in the same page if b′ content (text, image, etc.) can be (partially) inherited by b upon indexing. An engine implementing this model is described including: the transformation of web pages into blocks hierarchies, the definition of a dedicated language to express indexing rules and the storage of indexed blocks into an XML repository. The model is assessed on a dataset of electronic news, and a dataset drawn from web pages of the ImagEval campaign where it improves by 16% the mean average precision of the baseline.
Pages: 623-649
http://scigraph.springernature.com/pub.10.1007/s11280-011-0124-6
DOI: http://dx.doi.org/10.1007/s11280-011-0124-6
Dimensions: https://app.dimensions.ai/details/publication/pub.1009855802
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Artificial Intelligence and Image Processing",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Laboratoire des Sciences de l'Information et des Syst\u00e8mes",
"id": "https://www.grid.ac/institutes/grid.462878.7",
"name": [
"LSIS, Universit\u00e9 du Sud Toulon-Var, BP 20132, 83957, La Garde Cedex, France"
],
"type": "Organization"
},
"familyName": "Bruno",
"givenName": "Emmanuel",
"id": "sg:person.011735523635.44",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011735523635.44"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Laboratoire des Sciences de l'Information et des Syst\u00e8mes",
"id": "https://www.grid.ac/institutes/grid.462878.7",
"name": [
"LSIS, Universit\u00e9 Paul C\u00e9zanne, Avenue Escadrille Normandie-Niemen, 13397, Marseille Cedex 20, France"
],
"type": "Organization"
},
"familyName": "Faessel",
"givenName": "Nicolas",
"id": "sg:person.010714732035.74",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010714732035.74"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Laboratoire des Sciences de l'Information et des Syst\u00e8mes",
"id": "https://www.grid.ac/institutes/grid.462878.7",
"name": [
"LSIS, Universit\u00e9 du Sud Toulon-Var, BP 20132, 83957, La Garde Cedex, France"
],
"type": "Organization"
},
"familyName": "Glotin",
"givenName": "Herv\u00e9",
"id": "sg:person.016622300103.82",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016622300103.82"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Laboratoire des Sciences de l'Information et des Syst\u00e8mes",
"id": "https://www.grid.ac/institutes/grid.462878.7",
"name": [
"LSIS, Universit\u00e9 du Sud Toulon-Var, BP 20132, 83957, La Garde Cedex, France"
],
"type": "Organization"
},
"familyName": "Le Maitre",
"givenName": "Jacques",
"id": "sg:person.015627601511.39",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015627601511.39"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Conservatoire National des Arts et M\u00e9tiers",
"id": "https://www.grid.ac/institutes/grid.36823.3c",
"name": [
"Cedric/Wisdom, CNAM, 292 Rue St Martin, 75141, Paris Cedex 03, France"
],
"type": "Organization"
},
"familyName": "Scholl",
"givenName": "Michel",
"id": "sg:person.016173501771.43",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016173501771.43"
],
"type": "Person"
}
],
"citation": [
{
"id": "https://doi.org/10.1145/361219.361220",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1004270480"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/3-540-36618-0_6",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1008624834",
"https://doi.org/10.1007/3-540-36618-0_6"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/956750.956785",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1016384532"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/1141753.1141777",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1016797502"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/1282280.1282289",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1018248675"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/1027527.1027747",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1018537760"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/s11280-007-0021-1",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1029731426",
"https://doi.org/10.1007/s11280-007-0021-1"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/988672.988700",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1032277700"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/1600193.1600209",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1041648376"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1109/tkde.2005.138",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1061661387"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1109/cbmi.2009.36",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1093930639"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1109/icassp.2008.4517838",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1095351518"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1109/icdar.1995.602059",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1095715659"
],
"type": "CreativeWork"
}
],
"datePublished": "2011-10",
"datePublishedReg": "2011-10-01",
"description": "We present in this paper a model for indexing and querying web pages, based on the hierarchical decomposition of pages into blocks. Splitting up a page into blocks has several advantages in terms of page design, indexing and querying such as (i) blocks of a page most similar to a query may be returned instead of the page as a whole (ii) the importance of a block can be taken into account, as well as (iii) the permeability of the blocks to neighbor blocks: a block b is said to be permeable to a block b\u2032 in the same page if b\u2032 content (text, image, etc.) can be (partially) inherited by b upon indexing. An engine implementing this model is described including: the transformation of web pages into blocks hierarchies, the definition of a dedicated language to express indexing rules and the storage of indexed blocks into an XML repository. The model is assessed on a dataset of electronic news, and a dataset drawn from web pages of the ImagEval campaign where it improves by 16% the mean average precision of the baseline.",
"genre": "research_article",
"id": "sg:pub.10.1007/s11280-011-0124-6",
"inLanguage": [
"en"
],
"isAccessibleForFree": false,
"isPartOf": [
{
"id": "sg:journal.1136663",
"issn": [
"1386-145X",
"1573-1413"
],
"name": "World Wide Web",
"type": "Periodical"
},
{
"issueNumber": "5-6",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "14"
}
],
"name": "Indexing and querying segmented web pages: the BlockWeb Model",
"pagination": "623-649",
"productId": [
{
"name": "readcube_id",
"type": "PropertyValue",
"value": [
"2ef3d16ec7b72638fee6463ef216d7727a021a66494fb97a25ac6ddfbfcef3e0"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/s11280-011-0124-6"
]
},
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1009855802"
]
}
],
"sameAs": [
"https://doi.org/10.1007/s11280-011-0124-6",
"https://app.dimensions.ai/details/publication/pub.1009855802"
],
"sdDataset": "articles",
"sdDatePublished": "2019-04-11T00:18",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8695_00000520.jsonl",
"type": "ScholarlyArticle",
"url": "http://link.springer.com/10.1007%2Fs11280-011-0124-6"
}
]
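As a rough illustration of how a record like the one above can be read programmatically, the sketch below parses a trimmed copy of it with Python's standard `json` module and pulls out the author names and product identifiers. The helper names `author_names` and `product_ids` are hypothetical and not part of SciGraph; the embedded sample keeps only the fields the helpers touch.

```python
import json

# Trimmed copy of the record above (only the fields used here). A full
# record would normally be loaded from a file or an HTTP response.
SAMPLE = """
[
  {
    "author": [
      {"familyName": "Bruno", "givenName": "Emmanuel", "type": "Person"},
      {"familyName": "Faessel", "givenName": "Nicolas", "type": "Person"}
    ],
    "name": "Indexing and querying segmented web pages: the BlockWeb Model",
    "pagination": "623-649",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/s11280-011-0124-6"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1009855802"]}
    ]
  }
]
"""


def author_names(record):
    """Return 'Given Family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}"
            for a in record.get("author", [])]


def product_ids(record):
    """Map each productId name (e.g. 'doi') to its first listed value."""
    return {p["name"]: p["value"][0]
            for p in record.get("productId", [])
            if p.get("value")}


# The top-level JSON value is a list holding a single record object.
record = json.loads(SAMPLE)[0]
print(author_names(record))
print(product_ids(record)["doi"])
```

Because the record is plain JSON, no JSON-LD library is needed for simple lookups like these; resolving the `@context` into full IRIs would require a dedicated JSON-LD processor.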
Download the RDF metadata as: json-ld, nt, turtle, xml
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s11280-011-0124-6'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s11280-011-0124-6'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s11280-011-0124-6'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s11280-011-0124-6'
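The four curl commands above differ only in their `Accept` header. A minimal Python sketch of the same content negotiation, using only the standard library, might look like this. The function names `build_request` and `fetch` and the short format labels are assumptions made for illustration, and the record itself does not confirm whether the endpoint still responds.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s11280-011-0124-6"

# Short labels (chosen for this sketch) mapped to the MIME types used in
# the curl commands above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the given serialization."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the raw response body as bytes."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read()


# Building a request does not touch the network; only fetch() does.
req = build_request("turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check without depending on the remote service.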
This table displays all metadata directly associated with this object as RDF triples.
134 TRIPLES
21 PREDICATES
40 URIs
19 LITERALS
7 BLANK NODES
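Summary counts of this kind can be approximated from an N-Triples serialization, since each non-empty line holds exactly one triple. The sketch below is an assumption-laden illustration: the page does not state how SciGraph computes its figures, so `summarize` simply counts triples and distinct terms, and the sample triples (including the `example.org` subject and the schema.org predicate IRIs) are invented for the example rather than taken from the actual serialization.

```python
# Illustrative triples only; not the real N-Triples output for this record.
SAMPLE_NT = """
<http://example.org/pub> <http://schema.org/name> "Indexing and querying segmented web pages: the BlockWeb Model" .
<http://example.org/pub> <http://schema.org/pagination> "623-649" .
<http://example.org/pub> <http://schema.org/isPartOf> _:b0 .
_:b0 <http://schema.org/volumeNumber> "14" .
"""


def summarize(ntriples_text):
    """Count triples and distinct predicates, URIs, literals, blank nodes."""
    triples = 0
    predicates, uris, literals, blanks = set(), set(), set(), set()
    for line in ntriples_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain whitespace in N-Triples,
        # so the remainder of the line is the object plus the final dot.
        subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()
        triples += 1
        predicates.add(predicate)
        for term in (subject, predicate, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals.add(term)
    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": len(literals),
        "blank_nodes": len(blanks),
    }


print(summarize(SAMPLE_NT))
```

For anything beyond a quick count, a full RDF parser is the safer choice, since it also handles escapes, datatypes, and language tags properly.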