When Simple is (more than) Good Enough: Effective Semantic Search with (almost) no Semantics


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2012

AUTHORS

Robert Neumayer, Krisztian Balog, Kjetil Nørvåg

ABSTRACT

Using keyword queries to find entities has emerged as one of the major search types on the Web. In this paper, we study the task of ad-hoc entity retrieval: keyword search in a collection of structured data. We start with a baseline retrieval system that constructs pseudo documents from RDF triples and introduce three extensions: preprocessing of URIs, using two-fielded retrieval models, and boosting popular domains. Using the query sets of the 2010 and 2011 Semantic Search Challenge, we show that our straightforward approach outperforms all previously reported results, some generated by far more complex systems.
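
The abstract mentions pseudo documents built from RDF triples, URI preprocessing, and two-fielded retrieval, but gives no implementation details. The following is only a minimal sketch of what such a step could look like, not the authors' method. The field names `title` and `content`, the `NAME_PREDICATES` set, the tokenization rules, and the `example.org` URIs are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical choice: predicates whose last term marks a name-like value.
NAME_PREDICATES = {"label", "name", "title"}


def uri_to_terms(uri):
    """Turn a URI into lowercase keyword terms from its last segment."""
    local = re.split(r"[/#]", uri.rstrip("/#"))[-1]
    # Split camelCase, then break on any non-alphanumeric character.
    local = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", local)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", local) if t]


def build_pseudo_documents(triples):
    """Group triples by subject into two-field pseudo documents."""
    docs = defaultdict(lambda: {"title": [], "content": []})
    for subject, predicate, obj in triples:
        is_uri = obj.startswith(("http://", "https://"))
        # URI objects are tokenized; literals are split on whitespace.
        terms = uri_to_terms(obj) if is_uri else obj.lower().split()
        pred_terms = uri_to_terms(predicate)
        is_name = bool(pred_terms) and pred_terms[-1] in NAME_PREDICATES
        docs[subject]["title" if is_name else "content"].extend(terms)
    return dict(docs)


triples = [
    ("http://example.org/e/Trondheim",
     "http://www.w3.org/2000/01/rdf-schema#label", "Trondheim"),
    ("http://example.org/e/Trondheim",
     "http://example.org/p/locatedIn", "http://example.org/e/Central_Norway"),
]
docs = build_pseudo_documents(triples)
```

A fielded retrieval model would then score the two fields separately and combine them with field weights; that scoring step is left out here because the abstract does not specify it.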

PAGES

540-543

References to SciGraph publications

  • 2011. Effective and Efficient Entity Search in RDF Data in THE SEMANTIC WEB – ISWC 2011
Book

    TITLE

    Advances in Information Retrieval

    ISBN

    978-3-642-28996-5
    978-3-642-28997-2

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-642-28997-2_59

    DOI

    http://dx.doi.org/10.1007/978-3-642-28997-2_59

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1004189052



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information Systems", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Norwegian University of Science and Technology", 
              "id": "https://www.grid.ac/institutes/grid.5947.f", 
              "name": [
                "Norwegian University of Science and Technology, Trondheim, Norway"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Neumayer", 
            "givenName": "Robert", 
            "id": "sg:person.016430411346.51", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016430411346.51"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Norwegian University of Science and Technology", 
              "id": "https://www.grid.ac/institutes/grid.5947.f", 
              "name": [
                "Norwegian University of Science and Technology, Trondheim, Norway"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Balog", 
            "givenName": "Krisztian", 
            "id": "sg:person.016051147465.50", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016051147465.50"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Norwegian University of Science and Technology", 
              "id": "https://www.grid.ac/institutes/grid.5947.f", 
              "name": [
                "Norwegian University of Science and Technology, Trondheim, Norway"
              ], 
              "type": "Organization"
            }, 
            "familyName": "N\u00f8rv\u00e5g", 
            "givenName": "Kjetil", 
            "id": "sg:person.015174447107.16", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015174447107.16"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1145/1772690.1772769", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019680571"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/860435.860463", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035548445"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1031171.1031181", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1036793541"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-25073-6_6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037405303", 
              "https://doi.org/10.1007/978-3-642-25073-6_6"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2012", 
        "datePublishedReg": "2012-01-01", 
        "description": "Using keyword queries to find entities has emerged as one of the major search types on the Web. In this paper, we study the task of ad-hoc entity retrieval: keyword search in a collection of structured data. We start with a baseline retrieval system that constructs pseudo documents from RDF triples and introduce three extensions: preprocessing of URIs, using two-fielded retrieval models, and boosting popular domains. Using the query sets of the 2010 and 2011 Semantic Search Challenge, we show that our straightforward approach outperforms all previously reported results, some generated by far more complex systems.", 
        "editor": [
          {
            "familyName": "Baeza-Yates", 
            "givenName": "Ricardo", 
            "type": "Person"
          }, 
          {
            "familyName": "de Vries", 
            "givenName": "Arjen P.", 
            "type": "Person"
          }, 
          {
            "familyName": "Zaragoza", 
            "givenName": "Hugo", 
            "type": "Person"
          }, 
          {
            "familyName": "Cambazoglu", 
            "givenName": "B. Barla", 
            "type": "Person"
          }, 
          {
            "familyName": "Murdock", 
            "givenName": "Vanessa", 
            "type": "Person"
          }, 
          {
            "familyName": "Lempel", 
            "givenName": "Ronny", 
            "type": "Person"
          }, 
          {
            "familyName": "Silvestri", 
            "givenName": "Fabrizio", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-642-28997-2_59", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-642-28996-5", 
            "978-3-642-28997-2"
          ], 
          "name": "Advances in Information Retrieval", 
          "type": "Book"
        }, 
        "name": "When Simple is (more than) Good Enough: Effective Semantic Search with (almost) no Semantics", 
        "pagination": "540-543", 
        "productId": [
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-642-28997-2_59"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "13f64582d76fd89ba35cc854e44e93c8d42725c234c6a2fde02767310710415b"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1004189052"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-642-28997-2_59", 
          "https://app.dimensions.ai/details/publication/pub.1004189052"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-15T20:03", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8687_00000245.jsonl", 
        "type": "Chapter", 
        "url": "http://link.springer.com/10.1007/978-3-642-28997-2_59"
      }
    ]
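
Because the record is plain JSON, it can be read with a standard JSON parser without any RDF tooling. The sketch below works on an abridged copy of the record above and pulls out the author names and product identifiers. The helper names `author_names` and `product_ids` are hypothetical, and the code assumes each `productId` entry carries exactly one value, as it does in this record.

```python
import json

# Abridged copy of the record above; only the fields used below are kept.
RECORD = r"""
[
  {
    "id": "sg:pub.10.1007/978-3-642-28997-2_59",
    "author": [
      {"familyName": "Neumayer", "givenName": "Robert"},
      {"familyName": "Balog", "givenName": "Krisztian"},
      {"familyName": "N\u00f8rv\u00e5g", "givenName": "Kjetil"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-642-28997-2_59"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1004189052"]}
    ]
  }
]
"""


def author_names(record):
    """Return 'given family' names in the order listed in the record."""
    return [f"{a['givenName']} {a['familyName']}"
            for a in record.get("author", [])]


def product_ids(record):
    """Map each identifier name (e.g. 'doi') to its first value."""
    return {p["name"]: p["value"][0] for p in record.get("productId", [])}


# The top-level JSON value is a list holding a single record.
record = json.loads(RECORD)[0]
names = author_names(record)
ids = product_ids(record)
```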
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-28997-2_59'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-28997-2_59'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-28997-2_59'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-28997-2_59'
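
The four curl commands differ only in their `Accept` header, which is HTTP content negotiation. A rough Python equivalent using only the standard library might look like the sketch below. The function names `build_request` and `fetch` and the short format keys are hypothetical, and whether the endpoint still responds is not verified here.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-28997-2_59"

# Media types taken from the curl commands above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request that asks for the given RDF serialization."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=10):
    """Send the request and return the response body as text."""
    request = build_request(fmt, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch("turtle")` would perform the network request; `build_request` is kept separate so the header logic can be checked without network access.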


     

    This table displays all metadata directly associated with this object as RDF triples.

    122 TRIPLES      23 PREDICATES      31 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-642-28997-2_59 schema:about anzsrc-for:08
    2 anzsrc-for:0806
    3 schema:author N15bc346c954b4ab2a7f682592efa8c85
    4 schema:citation sg:pub.10.1007/978-3-642-25073-6_6
    5 https://doi.org/10.1145/1031171.1031181
    6 https://doi.org/10.1145/1772690.1772769
    7 https://doi.org/10.1145/860435.860463
    8 schema:datePublished 2012
    9 schema:datePublishedReg 2012-01-01
    10 schema:description Using keyword queries to find entities has emerged as one of the major search types on the Web. In this paper, we study the task of ad-hoc entity retrieval: keyword search in a collection of structured data. We start with a baseline retrieval system that constructs pseudo documents from RDF triples and introduce three extensions: preprocessing of URIs, using two-fielded retrieval models, and boosting popular domains. Using the query sets of the 2010 and 2011 Semantic Search Challenge, we show that our straightforward approach outperforms all previously reported results, some generated by far more complex systems.
    11 schema:editor N7b211805af1a42a5bf1744fea0aea7d1
    12 schema:genre chapter
    13 schema:inLanguage en
    14 schema:isAccessibleForFree true
    15 schema:isPartOf N736a81a048084a5cb474abc02fce13f6
    16 schema:name When Simple is (more than) Good Enough: Effective Semantic Search with (almost) no Semantics
    17 schema:pagination 540-543
    18 schema:productId N6e4f74856c2c461eb6838b47ac01e6f8
    19 Nacd35229d660432e9d0d0cab9790663e
    20 Nc6578939c9f9400d9c1c599c56bee94c
    21 schema:publisher N4a19c4201c9f4e20a639815eb791f218
    22 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004189052
    23 https://doi.org/10.1007/978-3-642-28997-2_59
    24 schema:sdDatePublished 2019-04-15T20:03
    25 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    26 schema:sdPublisher N580f686f6fed4a6787c9f85e00d01006
    27 schema:url http://link.springer.com/10.1007/978-3-642-28997-2_59
    28 sgo:license sg:explorer/license/
    29 sgo:sdDataset chapters
    30 rdf:type schema:Chapter
    31 N05bed93b5dac411e9bb95e43b5917625 rdf:first N5df856eabd7b4956b5c912ac21c015ca
    32 rdf:rest N1aa5a28924ed44f2a5db3d24545f2921
    33 N05de71bc725243efa82b51423e6f63b7 schema:familyName Cambazoglu
    34 schema:givenName B. Barla
    35 rdf:type schema:Person
    36 N068ee3b9e30a43a28b15615d5e733107 schema:familyName Zaragoza
    37 schema:givenName Hugo
    38 rdf:type schema:Person
    39 N0e9fc1164d034654a52c0607e633b3b9 schema:familyName Murdock
    40 schema:givenName Vanessa
    41 rdf:type schema:Person
    42 N15bc346c954b4ab2a7f682592efa8c85 rdf:first sg:person.016430411346.51
    43 rdf:rest N2735a5e7a7184b138c2fa792ddaaeb4e
    44 N1aa5a28924ed44f2a5db3d24545f2921 rdf:first N068ee3b9e30a43a28b15615d5e733107
    45 rdf:rest Nd9e2c84b115c46cea3c5007868e5047d
    46 N2735a5e7a7184b138c2fa792ddaaeb4e rdf:first sg:person.016051147465.50
    47 rdf:rest N283a4fa17e884bad93531c9e8758cb7f
    48 N283a4fa17e884bad93531c9e8758cb7f rdf:first sg:person.015174447107.16
    49 rdf:rest rdf:nil
    50 N2b33c8ff34ce4555831a687fe74cce37 schema:familyName Baeza-Yates
    51 schema:givenName Ricardo
    52 rdf:type schema:Person
    53 N4a19c4201c9f4e20a639815eb791f218 schema:location Berlin, Heidelberg
    54 schema:name Springer Berlin Heidelberg
    55 rdf:type schema:Organisation
    56 N580f686f6fed4a6787c9f85e00d01006 schema:name Springer Nature - SN SciGraph project
    57 rdf:type schema:Organization
    58 N5df856eabd7b4956b5c912ac21c015ca schema:familyName de Vries
    59 schema:givenName Arjen P.
    60 rdf:type schema:Person
    61 N68b3ab25fcd7425abd2761bb3b007488 schema:familyName Lempel
    62 schema:givenName Ronny
    63 rdf:type schema:Person
    64 N6e4f74856c2c461eb6838b47ac01e6f8 schema:name dimensions_id
    65 schema:value pub.1004189052
    66 rdf:type schema:PropertyValue
    67 N736a81a048084a5cb474abc02fce13f6 schema:isbn 978-3-642-28996-5
    68 978-3-642-28997-2
    69 schema:name Advances in Information Retrieval
    70 rdf:type schema:Book
    71 N7b211805af1a42a5bf1744fea0aea7d1 rdf:first N2b33c8ff34ce4555831a687fe74cce37
    72 rdf:rest N05bed93b5dac411e9bb95e43b5917625
    73 Na7b796adfa994c388519299664511619 rdf:first N68b3ab25fcd7425abd2761bb3b007488
    74 rdf:rest Nc969760fbaec4c2aaabbb2890259648d
    75 Nacd35229d660432e9d0d0cab9790663e schema:name readcube_id
    76 schema:value 13f64582d76fd89ba35cc854e44e93c8d42725c234c6a2fde02767310710415b
    77 rdf:type schema:PropertyValue
    78 Nbb7c9eb1f62f43d5ad97a3c9c314b4ed rdf:first N0e9fc1164d034654a52c0607e633b3b9
    79 rdf:rest Na7b796adfa994c388519299664511619
    80 Nc6578939c9f9400d9c1c599c56bee94c schema:name doi
    81 schema:value 10.1007/978-3-642-28997-2_59
    82 rdf:type schema:PropertyValue
    83 Nc969760fbaec4c2aaabbb2890259648d rdf:first Nd9ebaefb0e1442769ed580603d75eb6f
    84 rdf:rest rdf:nil
    85 Nd9e2c84b115c46cea3c5007868e5047d rdf:first N05de71bc725243efa82b51423e6f63b7
    86 rdf:rest Nbb7c9eb1f62f43d5ad97a3c9c314b4ed
    87 Nd9ebaefb0e1442769ed580603d75eb6f schema:familyName Silvestri
    88 schema:givenName Fabrizio
    89 rdf:type schema:Person
    90 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    91 schema:name Information and Computing Sciences
    92 rdf:type schema:DefinedTerm
    93 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
    94 schema:name Information Systems
    95 rdf:type schema:DefinedTerm
    96 sg:person.015174447107.16 schema:affiliation https://www.grid.ac/institutes/grid.5947.f
    97 schema:familyName Nørvåg
    98 schema:givenName Kjetil
    99 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015174447107.16
    100 rdf:type schema:Person
    101 sg:person.016051147465.50 schema:affiliation https://www.grid.ac/institutes/grid.5947.f
    102 schema:familyName Balog
    103 schema:givenName Krisztian
    104 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016051147465.50
    105 rdf:type schema:Person
    106 sg:person.016430411346.51 schema:affiliation https://www.grid.ac/institutes/grid.5947.f
    107 schema:familyName Neumayer
    108 schema:givenName Robert
    109 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016430411346.51
    110 rdf:type schema:Person
    111 sg:pub.10.1007/978-3-642-25073-6_6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037405303
    112 https://doi.org/10.1007/978-3-642-25073-6_6
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1145/1031171.1031181 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036793541
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1145/1772690.1772769 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019680571
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1145/860435.860463 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035548445
    119 rdf:type schema:CreativeWork
    120 https://www.grid.ac/institutes/grid.5947.f schema:alternateName Norwegian University of Science and Technology
    121 schema:name Norwegian University of Science and Technology, Trondheim, Norway
    122 rdf:type schema:Organization
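
In the table above, the ordered author and editor lists are encoded as RDF collections: each blank node points to one item via `rdf:first` and to the next node via `rdf:rest`, ending at `rdf:nil`. The sketch below shows one way to recover the author order from rows 3 and 42 to 49, written out as full triples. The function name `read_rdf_list` is hypothetical, and the prefixed names are treated as plain strings rather than expanded URIs.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

# Rows 3 and 42-49 of the table above, with omitted subjects filled in.
TRIPLES = [
    ("sg:pub.10.1007/978-3-642-28997-2_59", "schema:author",
     "N15bc346c954b4ab2a7f682592efa8c85"),
    ("N15bc346c954b4ab2a7f682592efa8c85", RDF_FIRST, "sg:person.016430411346.51"),
    ("N15bc346c954b4ab2a7f682592efa8c85", RDF_REST,
     "N2735a5e7a7184b138c2fa792ddaaeb4e"),
    ("N2735a5e7a7184b138c2fa792ddaaeb4e", RDF_FIRST, "sg:person.016051147465.50"),
    ("N2735a5e7a7184b138c2fa792ddaaeb4e", RDF_REST,
     "N283a4fa17e884bad93531c9e8758cb7f"),
    ("N283a4fa17e884bad93531c9e8758cb7f", RDF_FIRST, "sg:person.015174447107.16"),
    ("N283a4fa17e884bad93531c9e8758cb7f", RDF_REST, RDF_NIL),
]


def read_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest links from head and return the items."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        # Guard against cycles and nodes missing either link.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Find the head of the author list, then walk it.
head = next(o for s, p, o in TRIPLES if p == "schema:author")
authors = read_rdf_list(TRIPLES, head)
```

The resulting order matches the `author` array in the JSON-LD record: Neumayer, Balog, Nørvåg.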
     



