What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2007

AUTHORS

Sören Auer, Jens Lehmann

ABSTRACT

Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.

PAGES

503-517

References to SciGraph publications

  • 2006. Information Integration Via an End-to-End Distributed Semantic Web System in THE SEMANTIC WEB - ISWC 2006
  • 2002-03. Evaluating the performance of table processing algorithms in INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (IJDAR)
  • 2002. Automatically Extracting Ontologically Specified Data from HTML Tables of Unknown Structure in CONCEPTUAL MODELING — ER 2002
  • 2006. OntoWiki – A Tool for Social, Semantic Collaboration in THE SEMANTIC WEB - ISWC 2006
  • 2004-03. A survey of table recognition in INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (IJDAR)
Book

    TITLE

    The Semantic Web: Research and Applications

    ISBN

    978-3-540-72666-1
    978-3-540-72667-8

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36

    DOI

    http://dx.doi.org/10.1007/978-3-540-72667-8_36

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1046332012



    JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information Systems", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "name": [
                "Universit\u00e4t Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Auer", 
            "givenName": "S\u00f6ren", 
            "id": "sg:person.07377234042.68", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07377234042.68"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "name": [
                "Universit\u00e4t Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Lehmann", 
            "givenName": "Jens", 
            "id": "sg:person.016325666777.06", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016325666777.06"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/11926078_53", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019685949", 
              "https://doi.org/10.1007/11926078_53"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.websem.2005.06.003", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020843849"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.patcog.2004.01.012", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1022346240"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11926078_55", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030414405", 
              "https://doi.org/10.1007/11926078_55"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s100320200074", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031692172", 
              "https://doi.org/10.1007/s100320200074"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1146/annurev.cs.04.060190.002221", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1036425691"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s10032-004-0120-9", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1038982559", 
              "https://doi.org/10.1007/s10032-004-0120-9"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/860435.860479", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040183355"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/3-540-45816-6_32", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041368308", 
              "https://doi.org/10.1007/3-540-45816-6_32"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1135777.1135863", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043205484"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.3115/990820.990869", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1053690603"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2007", 
        "datePublishedReg": "2007-01-01", 
        "description": "Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.", 
        "editor": [
          {
            "familyName": "Franconi", 
            "givenName": "Enrico", 
            "type": "Person"
          }, 
          {
            "familyName": "Kifer", 
            "givenName": "Michael", 
            "type": "Person"
          }, 
          {
            "familyName": "May", 
            "givenName": "Wolfgang", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-540-72667-8_36", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-540-72666-1", 
            "978-3-540-72667-8"
          ], 
          "name": "The Semantic Web: Research and Applications", 
          "type": "Book"
        }, 
        "name": "What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content", 
        "pagination": "503-517", 
        "productId": [
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-540-72667-8_36"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "dcb456b159775c72be31fa19eb2701c9f74a3866fd319d18f375a09e4daa9c3b"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1046332012"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-540-72667-8_36", 
          "https://app.dimensions.ai/details/publication/pub.1046332012"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-15T19:11", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8684_00000272.jsonl", 
        "type": "Chapter", 
        "url": "http://link.springer.com/10.1007/978-3-540-72667-8_36"
      }
    ]
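
A minimal sketch of reading the record above with Python's standard `json` module, assuming the JSON-LD has been saved or pasted as a string. The embedded record is trimmed to the fields used here, and `author_names` and `product_ids` are hypothetical helper names, not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Auer", "givenName": "Sören",
       "id": "sg:person.07377234042.68", "type": "Person"},
      {"familyName": "Lehmann", "givenName": "Jens",
       "id": "sg:person.016325666777.06", "type": "Person"}
    ],
    "name": "What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content",
    "pagination": "503-517",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-540-72667-8_36"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1046332012"]}
    ]
  }
]
"""


def author_names(record):
    """Return 'given family' strings in the order listed in the record."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def product_ids(record):
    """Map each productId name (e.g. 'doi') to its first value."""
    ids = {}
    for entry in record.get("productId", []):
        values = entry.get("value", [])
        if values:
            ids[entry["name"]] = values[0]
    return ids


# The top level of the document is a list holding a single record.
record = json.loads(RECORD_JSON)[0]
print(author_names(record))
print(product_ids(record)["doi"])
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field lookup; a full RDF parser only becomes necessary when the `@context` has to be expanded into complete URIs.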
     

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'
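
The four curl calls above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The format keys (`json-ld`, `nt`, `turtle`, `xml`) and the helpers `build_request` and `fetch` are hypothetical names chosen for this example; whether the endpoint still answers is not something this sketch can guarantee, so the network call is defined but not run.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36"

# Media types taken from the curl commands above; the short keys are assumptions.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the serialization named by fmt."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Inspect the request without touching the network.
request = build_request("turtle")
print(request.get_header("Accept"))  # text/turtle
```

Calling `fetch("nt")` would then return the N-Triples text, provided the server is reachable and honours the header.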


     

    This table displays all metadata directly associated with this object as RDF triples.

    121 TRIPLES      23 PREDICATES      38 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-540-72667-8_36 schema:about anzsrc-for:08
    2 anzsrc-for:0806
    3 schema:author Nb9ed9482056d4fc9a30ab406ed972786
    4 schema:citation sg:pub.10.1007/11926078_53
    5 sg:pub.10.1007/11926078_55
    6 sg:pub.10.1007/3-540-45816-6_32
    7 sg:pub.10.1007/s10032-004-0120-9
    8 sg:pub.10.1007/s100320200074
    9 https://doi.org/10.1016/j.patcog.2004.01.012
    10 https://doi.org/10.1016/j.websem.2005.06.003
    11 https://doi.org/10.1145/1135777.1135863
    12 https://doi.org/10.1145/860435.860479
    13 https://doi.org/10.1146/annurev.cs.04.060190.002221
    14 https://doi.org/10.3115/990820.990869
    15 schema:datePublished 2007
    16 schema:datePublishedReg 2007-01-01
    17 schema:description Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.
    18 schema:editor N3ebe15ed866645f19ff2a322e65452db
    19 schema:genre chapter
    20 schema:inLanguage en
    21 schema:isAccessibleForFree true
    22 schema:isPartOf N5d5948f973744c748aadcb47dcbfe3b7
    23 schema:name What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content
    24 schema:pagination 503-517
    25 schema:productId Nb454ce2131584ba495cd2fc1b3f21edd
    26 Nb7736228a4c74f57bc5355fdf22a27f3
    27 Nf1324f8dc40b47ec9c818cd6289bf110
    28 schema:publisher N512eb89399f644b9bd3c2a82bbda7ada
    29 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046332012
    30 https://doi.org/10.1007/978-3-540-72667-8_36
    31 schema:sdDatePublished 2019-04-15T19:11
    32 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    33 schema:sdPublisher Ned590b8bed6e4a699e3f9cb74c5801c0
    34 schema:url http://link.springer.com/10.1007/978-3-540-72667-8_36
    35 sgo:license sg:explorer/license/
    36 sgo:sdDataset chapters
    37 rdf:type schema:Chapter
    38 N23e41ca06f174d2780a1fb26e55df74c schema:familyName Kifer
    39 schema:givenName Michael
    40 rdf:type schema:Person
    41 N35eb033372f94d2bbf0da51fd5af4f12 schema:familyName May
    42 schema:givenName Wolfgang
    43 rdf:type schema:Person
    44 N3ebe15ed866645f19ff2a322e65452db rdf:first Nadc722206aef42239110c0d5641043fd
    45 rdf:rest N5b3bccfca4bd482e86415676c62d549d
    46 N4662b3424a8348259de1be081362c50a rdf:first sg:person.016325666777.06
    47 rdf:rest rdf:nil
    48 N512eb89399f644b9bd3c2a82bbda7ada schema:location Berlin, Heidelberg
    49 schema:name Springer Berlin Heidelberg
    50 rdf:type schema:Organisation
    51 N5b3bccfca4bd482e86415676c62d549d rdf:first N23e41ca06f174d2780a1fb26e55df74c
    52 rdf:rest Nad9704c4ece44baf9796a2f49fef3c94
    53 N5d5948f973744c748aadcb47dcbfe3b7 schema:isbn 978-3-540-72666-1
    54 978-3-540-72667-8
    55 schema:name The Semantic Web: Research and Applications
    56 rdf:type schema:Book
    57 Naaf13a4edff449899eb29f7c7adb2bda schema:name Universität Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany
    58 rdf:type schema:Organization
    59 Nad9704c4ece44baf9796a2f49fef3c94 rdf:first N35eb033372f94d2bbf0da51fd5af4f12
    60 rdf:rest rdf:nil
    61 Nadc722206aef42239110c0d5641043fd schema:familyName Franconi
    62 schema:givenName Enrico
    63 rdf:type schema:Person
    64 Nb454ce2131584ba495cd2fc1b3f21edd schema:name dimensions_id
    65 schema:value pub.1046332012
    66 rdf:type schema:PropertyValue
    67 Nb7736228a4c74f57bc5355fdf22a27f3 schema:name readcube_id
    68 schema:value dcb456b159775c72be31fa19eb2701c9f74a3866fd319d18f375a09e4daa9c3b
    69 rdf:type schema:PropertyValue
    70 Nb9ed9482056d4fc9a30ab406ed972786 rdf:first sg:person.07377234042.68
    71 rdf:rest N4662b3424a8348259de1be081362c50a
    72 Nd463b7f8e9ed48cd92681d02f8f0b7fd schema:name Universität Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany
    73 rdf:type schema:Organization
    74 Ned590b8bed6e4a699e3f9cb74c5801c0 schema:name Springer Nature - SN SciGraph project
    75 rdf:type schema:Organization
    76 Nf1324f8dc40b47ec9c818cd6289bf110 schema:name doi
    77 schema:value 10.1007/978-3-540-72667-8_36
    78 rdf:type schema:PropertyValue
    79 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Information and Computing Sciences
    81 rdf:type schema:DefinedTerm
    82 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
    83 schema:name Information Systems
    84 rdf:type schema:DefinedTerm
    85 sg:person.016325666777.06 schema:affiliation Naaf13a4edff449899eb29f7c7adb2bda
    86 schema:familyName Lehmann
    87 schema:givenName Jens
    88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016325666777.06
    89 rdf:type schema:Person
    90 sg:person.07377234042.68 schema:affiliation Nd463b7f8e9ed48cd92681d02f8f0b7fd
    91 schema:familyName Auer
    92 schema:givenName Sören
    93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07377234042.68
    94 rdf:type schema:Person
    95 sg:pub.10.1007/11926078_53 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019685949
    96 https://doi.org/10.1007/11926078_53
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1007/11926078_55 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030414405
    99 https://doi.org/10.1007/11926078_55
    100 rdf:type schema:CreativeWork
    101 sg:pub.10.1007/3-540-45816-6_32 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041368308
    102 https://doi.org/10.1007/3-540-45816-6_32
    103 rdf:type schema:CreativeWork
    104 sg:pub.10.1007/s10032-004-0120-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038982559
    105 https://doi.org/10.1007/s10032-004-0120-9
    106 rdf:type schema:CreativeWork
    107 sg:pub.10.1007/s100320200074 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031692172
    108 https://doi.org/10.1007/s100320200074
    109 rdf:type schema:CreativeWork
    110 https://doi.org/10.1016/j.patcog.2004.01.012 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022346240
    111 rdf:type schema:CreativeWork
    112 https://doi.org/10.1016/j.websem.2005.06.003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020843849
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1145/1135777.1135863 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043205484
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1145/860435.860479 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040183355
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1146/annurev.cs.04.060190.002221 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036425691
    119 rdf:type schema:CreativeWork
    120 https://doi.org/10.3115/990820.990869 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053690603
    121 rdf:type schema:CreativeWork
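
In the table above, the author order is not stored as repeated `schema:author` values but as an RDF collection: row 3 points at a blank node, and rows 70-71 and 46-47 chain the two authors together with `rdf:first` and `rdf:rest` until `rdf:nil`. A minimal sketch of walking such a list, using plain tuples copied from those rows (the function name `walk_list` is hypothetical):

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) rows copied from the table: 3, 70-71, 46-47.
TRIPLES = [
    ("sg:pub.10.1007/978-3-540-72667-8_36", "schema:author",
     "Nb9ed9482056d4fc9a30ab406ed972786"),
    ("Nb9ed9482056d4fc9a30ab406ed972786", "rdf:first", "sg:person.07377234042.68"),
    ("Nb9ed9482056d4fc9a30ab406ed972786", "rdf:rest",
     "N4662b3424a8348259de1be081362c50a"),
    ("N4662b3424a8348259de1be081362c50a", "rdf:first", "sg:person.016325666777.06"),
    ("N4662b3424a8348259de1be081362c50a", "rdf:rest", RDF_NIL),
]


def walk_list(head, triples):
    """Follow rdf:first / rdf:rest from head and return the items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        # Guard against cycles and against nodes missing either link.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The object of schema:author is the head of the list.
head = next(o for s, p, o in TRIPLES if p == "schema:author")
print(walk_list(head, TRIPLES))
```

The editor list (rows 18, 44-45, 51-52 and 59-60) is encoded the same way and could be walked with the same function.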
     



