2015
AUTHORS Ran Yu, Ujwal Gadiraju, Besnik Fetahu, Stefan Dietze
ABSTRACT Given the evolution of publicly available Linked Data, crawling and preservation have become increasingly important challenges. Due to the scale of available data on the Web, efficient focused crawling approaches which are able to capture the relevant semantic neighborhood of seed entities are required. Here, determining relevant entities for a given set of seed entities is a crucial problem. While the weight of seeds within a seed list varies significantly with respect to the crawl intent, we argue that an adaptive crawler is required, which considers such characteristics when configuring the crawling and relevance detection approach. To address this problem, we introduce a crawling configuration, which considers seed list-specific features as part of its crawling and ranking algorithm. We evaluate it through extensive experiments in comparison to a number of baseline methods and crawling parameters. We demonstrate that configurations which consider seed list features outperform the baselines and present further insights gained from our experiments.
PAGES 554-569
Web Information Systems Engineering – WISE 2015
ISBN
978-3-319-26189-8
978-3-319-26190-4
http://scigraph.springernature.com/pub.10.1007/978-3-319-26190-4_37
DOI http://dx.doi.org/10.1007/978-3-319-26190-4_37
DIMENSIONS https://app.dimensions.ai/details/publication/pub.1003210262
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Artificial Intelligence and Image Processing",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "University of Hannover",
"id": "https://www.grid.ac/institutes/grid.9122.8",
"name": [
"L3S Research Center, Leibniz Universit\u00e4t Hannover"
],
"type": "Organization"
},
"familyName": "Yu",
"givenName": "Ran",
"id": "sg:person.014350275752.88",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014350275752.88"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "University of Hannover",
"id": "https://www.grid.ac/institutes/grid.9122.8",
"name": [
"L3S Research Center, Leibniz Universit\u00e4t Hannover"
],
"type": "Organization"
},
"familyName": "Gadiraju",
"givenName": "Ujwal",
"id": "sg:person.015354112341.71",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015354112341.71"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "University of Hannover",
"id": "https://www.grid.ac/institutes/grid.9122.8",
"name": [
"L3S Research Center, Leibniz Universit\u00e4t Hannover"
],
"type": "Organization"
},
"familyName": "Fetahu",
"givenName": "Besnik",
"id": "sg:person.014510614501.15",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014510614501.15"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "University of Hannover",
"id": "https://www.grid.ac/institutes/grid.9122.8",
"name": [
"L3S Research Center, Leibniz Universit\u00e4t Hannover"
],
"type": "Organization"
},
"familyName": "Dietze",
"givenName": "Stefan",
"id": "sg:person.015741663301.42",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015741663301.42"
],
"type": "Person"
}
],
"citation": [
{
"id": "sg:pub.10.1007/978-3-319-25007-6_28",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1000792368",
"https://doi.org/10.1007/978-3-319-25007-6_28"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-642-38288-8_37",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1002846055",
"https://doi.org/10.1007/978-3-642-38288-8_37"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/2702123.2702443",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1009845147"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/s1389-1286(99)00052-3",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1013401015"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/1772690.1772769",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1019680571"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/511446.511466",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1024836575"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-540-76298-0_52",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1028565626",
"https://doi.org/10.1007/978-3-540-76298-0_52"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/s0169-7552(98)00110-x",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1035913093"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/2661829.2661902",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1046988485"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/bf02289026",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1051171362",
"https://doi.org/10.1007/bf02289026"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1109/mis.2015.66",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1061406538"
],
"type": "CreativeWork"
}
],
"datePublished": "2015",
"datePublishedReg": "2015-01-01",
"description": "Given the evolution of publicly available Linked Data, crawling and preservation have become increasingly important challenges. Due to the scale of available data on the Web, efficient focused crawling approaches which are able to capture the relevant semantic neighborhood of seed entities are required. Here, determining relevant entities for a given set of seed entities is a crucial problem. While the weight of seeds within a seed list vary significantly with respect to the crawl intent, we argue that an adaptive crawler is required, which considers such characteristics when configuring the crawling and relevance detection approach. To address this problem, we introduce a crawling configuration, which considers seed list-specific features as part of its crawling and ranking algorithm. We evaluate it through extensive experiments in comparison to a number of baseline methods and crawling parameters. We demonstrate that, configurations which consider seed list features outperform the baselines and present further insights gained from our experiments.",
"editor": [
{
"familyName": "Wang",
"givenName": "Jianyong",
"type": "Person"
},
{
"familyName": "Cellary",
"givenName": "Wojciech",
"type": "Person"
},
{
"familyName": "Wang",
"givenName": "Dingding",
"type": "Person"
},
{
"familyName": "Wang",
"givenName": "Hua",
"type": "Person"
},
{
"familyName": "Chen",
"givenName": "Shu-Ching",
"type": "Person"
},
{
"familyName": "Li",
"givenName": "Tao",
"type": "Person"
},
{
"familyName": "Zhang",
"givenName": "Yanchun",
"type": "Person"
}
],
"genre": "chapter",
"id": "sg:pub.10.1007/978-3-319-26190-4_37",
"inLanguage": [
"en"
],
"isAccessibleForFree": false,
"isPartOf": {
"isbn": [
"978-3-319-26189-8",
"978-3-319-26190-4"
],
"name": "Web Information Systems Engineering \u2013 WISE 2015",
"type": "Book"
},
"name": "Adaptive Focused Crawling of Linked Data",
"pagination": "554-569",
"productId": [
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/978-3-319-26190-4_37"
]
},
{
"name": "readcube_id",
"type": "PropertyValue",
"value": [
"0f4a9740732b1a8521e58b84faacc3859759123e1d8d1eb540ec5f3603f9a5e3"
]
},
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1003210262"
]
}
],
"publisher": {
"location": "Cham",
"name": "Springer International Publishing",
"type": "Organisation"
},
"sameAs": [
"https://doi.org/10.1007/978-3-319-26190-4_37",
"https://app.dimensions.ai/details/publication/pub.1003210262"
],
"sdDataset": "chapters",
"sdDatePublished": "2019-04-15T13:25",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8664_00000245.jsonl",
"type": "Chapter",
"url": "http://link.springer.com/10.1007/978-3-319-26190-4_37"
}
]
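As a rough illustration of how a record shaped like the one above might be consumed, the following Python sketch pulls out the title, author names, DOI, and citation ids, dropping any repeated citation ids along the way. The function name `summarize_record` and the trimmed `SAMPLE` excerpt are hypothetical and only for illustration; the sketch assumes the record is a JSON list holding a single object with the `name`, `author`, `productId`, and `citation` keys seen above.

```python
import json


def summarize_record(jsonld_text):
    """Summarize a SciGraph-style JSON-LD record (hypothetical helper)."""
    records = json.loads(jsonld_text)
    record = records[0]  # assumption: the list wraps exactly one record

    # Join given and family names in the order the record lists them.
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]

    # The DOI is stored as a productId entry whose name is "doi".
    doi = next(
        (p["value"][0] for p in record.get("productId", []) if p.get("name") == "doi"),
        None,
    )

    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))

    return {
        "title": record.get("name"),
        "authors": authors,
        "doi": doi,
        "citations": citations,
    }


# Trimmed excerpt of the record above, used only to exercise the helper.
SAMPLE = json.dumps([
    {
        "name": "Adaptive Focused Crawling of Linked Data",
        "author": [
            {"givenName": "Ran", "familyName": "Yu"},
            {"givenName": "Ujwal", "familyName": "Gadiraju"},
        ],
        "citation": [
            {"id": "sg:pub.10.1007/bf02289026"},
            {"id": "sg:pub.10.1007/bf02289026"},
            {"id": "https://doi.org/10.1109/mis.2015.66"},
        ],
        "productId": [
            {"name": "doi", "value": ["10.1007/978-3-319-26190-4_37"]},
        ],
    }
])

if __name__ == "__main__":
    print(summarize_record(SAMPLE))
```

This treats the record as plain JSON rather than expanding it against the `@context`; a full JSON-LD processor would be needed to resolve the terms to complete IRIs.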
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-26190-4_37'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-26190-4_37'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-26190-4_37'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-26190-4_37'
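The four curl commands differ only in the Accept header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same idea follows, using only the standard library. The short format labels, the `ACCEPT_HEADERS` table, and the `build_request` helper are assumptions made for illustration; the media types and URL are the ones shown in the commands above. The sketch only builds the request object and does not contact the server.

```python
import urllib.request

# Media types taken from the curl commands above; the short keys are
# illustrative labels, not part of any SciGraph API.
ACCEPT_HEADERS = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-26190-4_37"


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a request asking for one RDF serialization."""
    if fmt not in ACCEPT_HEADERS:
        raise ValueError(f"unknown format label: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT_HEADERS[fmt]})


if __name__ == "__main__":
    for label in ACCEPT_HEADERS:
        req = build_request(label)
        print(label, "->", req.get_header("Accept"))
```

Passing the returned object to `urllib.request.urlopen` would send the request; whether the endpoint still responds is not something this sketch assumes.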
This table displays all metadata directly associated with this object as RDF triples.
153 TRIPLES
23 PREDICATES
38 URIs
20 LITERALS
8 BLANK NODES