What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2007

AUTHORS

Sören Auer, Jens Lehmann

ABSTRACT

Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.

PAGES

503-517

References to SciGraph publications

  • 2006. Information Integration Via an End-to-End Distributed Semantic Web System in THE SEMANTIC WEB - ISWC 2006
  • 2002-03. Evaluating the performance of table processing algorithms in INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (IJDAR)
  • 2002. Automatically Extracting Ontologically Specified Data from HTML Tables of Unknown Structure in CONCEPTUAL MODELING — ER 2002
  • 2006. OntoWiki – A Tool for Social, Semantic Collaboration in THE SEMANTIC WEB - ISWC 2006
  • 2004-03. A survey of table recognition in INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (IJDAR)

Book

    TITLE

    The Semantic Web: Research and Applications

    ISBN

    978-3-540-72666-1
    978-3-540-72667-8

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36

    DOI

    http://dx.doi.org/10.1007/978-3-540-72667-8_36

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1046332012



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information Systems", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "name": [
                "Universit\u00e4t Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Auer", 
            "givenName": "S\u00f6ren", 
            "id": "sg:person.07377234042.68", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07377234042.68"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "name": [
                "Universit\u00e4t Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Lehmann", 
            "givenName": "Jens", 
            "id": "sg:person.016325666777.06", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016325666777.06"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/11926078_53", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019685949", 
              "https://doi.org/10.1007/11926078_53"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.websem.2005.06.003", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020843849"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.patcog.2004.01.012", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1022346240"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11926078_55", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030414405", 
              "https://doi.org/10.1007/11926078_55"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s100320200074", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031692172", 
              "https://doi.org/10.1007/s100320200074"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1146/annurev.cs.04.060190.002221", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1036425691"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s10032-004-0120-9", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1038982559", 
              "https://doi.org/10.1007/s10032-004-0120-9"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/860435.860479", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040183355"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/3-540-45816-6_32", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041368308", 
              "https://doi.org/10.1007/3-540-45816-6_32"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1135777.1135863", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043205484"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.3115/990820.990869", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1053690603"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2007", 
        "datePublishedReg": "2007-01-01", 
        "description": "Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.", 
        "editor": [
          {
            "familyName": "Franconi", 
            "givenName": "Enrico", 
            "type": "Person"
          }, 
          {
            "familyName": "Kifer", 
            "givenName": "Michael", 
            "type": "Person"
          }, 
          {
            "familyName": "May", 
            "givenName": "Wolfgang", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-540-72667-8_36", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-540-72666-1", 
            "978-3-540-72667-8"
          ], 
          "name": "The Semantic Web: Research and Applications", 
          "type": "Book"
        }, 
        "name": "What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content", 
        "pagination": "503-517", 
        "productId": [
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-540-72667-8_36"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "dcb456b159775c72be31fa19eb2701c9f74a3866fd319d18f375a09e4daa9c3b"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1046332012"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-540-72667-8_36", 
          "https://app.dimensions.ai/details/publication/pub.1046332012"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-15T19:11", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8684_00000272.jsonl", 
        "type": "Chapter", 
        "url": "http://link.springer.com/10.1007/978-3-540-72667-8_36"
      }
    ]
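
A minimal sketch of how a record shaped like the one above could be read in Python. The `SAMPLE` string is a trimmed, hand-copied excerpt rather than the full payload, and the `summarize` helper is a hypothetical name introduced only for illustration. Citation ids are de-duplicated defensively while keeping their order, in case an entry appears more than once.

```python
import json

# Trimmed, hand-copied excerpt of the record (assumption: same field names
# as the full JSON-LD payload; most fields omitted for brevity).
SAMPLE = """
[
  {
    "id": "sg:pub.10.1007/978-3-540-72667-8_36",
    "name": "What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content",
    "author": [
      {"familyName": "Auer", "givenName": "Sören", "type": "Person"},
      {"familyName": "Lehmann", "givenName": "Jens", "type": "Person"}
    ],
    "citation": [
      {"id": "sg:pub.10.1007/11926078_53", "type": "CreativeWork"},
      {"id": "sg:pub.10.1007/11926078_53", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1016/j.websem.2005.06.003", "type": "CreativeWork"}
    ],
    "pagination": "503-517"
  }
]
"""


def summarize(record_json: str) -> dict:
    """Return a small summary of the first record in a JSON-LD array."""
    record = json.loads(record_json)[0]
    authors = [
        f"{person['givenName']} {person['familyName']}"
        for person in record.get("author", [])
    ]
    # dict.fromkeys keeps first-seen order and drops repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {
        "id": record["id"],
        "title": record["name"],
        "authors": authors,
        "citations": citations,
        "pages": record.get("pagination"),
    }


summary = summarize(SAMPLE)
print(summary["authors"])
print(len(summary["citations"]), "distinct citations in the excerpt")
```

Because JSON-LD is plain JSON, the standard `json` module is enough for this kind of field access; a dedicated JSON-LD processor would only be needed to expand terms against the `@context`.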
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-72667-8_36'
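
The four curl calls above differ only in their `Accept` header. As a hedged sketch, the same content negotiation could be expressed in Python with the standard library. The `ACCEPT` mapping and the `build_request` helper are hypothetical names; the media types and the base URL are taken from the commands above. The request is only constructed here, not sent, and whether the endpoint still answers is not assumed.

```python
from urllib.request import Request

BASE_URL = "https://scigraph.springernature.com/"

# Format label -> media type, copied from the curl examples.
ACCEPT = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(record_id: str, fmt: str) -> Request:
    """Build (but do not send) a GET request for one serialization."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format: {fmt!r}")
    return Request(BASE_URL + record_id, headers={"Accept": ACCEPT[fmt]})


req = build_request("pub.10.1007/978-3-540-72667-8_36", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
# To actually fetch, one would pass `req` to urllib.request.urlopen(...).
```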


     

    This table displays all metadata directly associated with this object as RDF triples.

    121 TRIPLES      23 PREDICATES      38 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-540-72667-8_36 schema:about anzsrc-for:08
    2 anzsrc-for:0806
    3 schema:author N40e7f96550ac400a9b93e6d80309c56b
    4 schema:citation sg:pub.10.1007/11926078_53
    5 sg:pub.10.1007/11926078_55
    6 sg:pub.10.1007/3-540-45816-6_32
    7 sg:pub.10.1007/s10032-004-0120-9
    8 sg:pub.10.1007/s100320200074
    9 https://doi.org/10.1016/j.patcog.2004.01.012
    10 https://doi.org/10.1016/j.websem.2005.06.003
    11 https://doi.org/10.1145/1135777.1135863
    12 https://doi.org/10.1145/860435.860479
    13 https://doi.org/10.1146/annurev.cs.04.060190.002221
    14 https://doi.org/10.3115/990820.990869
    15 schema:datePublished 2007
    16 schema:datePublishedReg 2007-01-01
    17 schema:description Wikis are established means for the collaborative authoring, versioning and publishing of textual articles. The Wikipedia project, for example, succeeded in creating the by far largest encyclopedia just on the basis of a wiki. Recently, several approaches have been proposed on how to extend wikis to allow the creation of structured and semantically enriched content. However, the means for creating semantically enriched structured content are already available and are, although unconsciously, even used by Wikipedia authors. In this article, we present a method for revealing this structured content by extracting information from template instances. We suggest ways to efficiently query the vast amount of extracted information (e.g. more than 8 million RDF statements for the English Wikipedia version alone), leading to astonishing query answering possibilities (such as for the title question). We analyze the quality of the extracted content, and propose strategies for quality improvements with just minor modifications of the wiki systems being currently used.
    18 schema:editor Naf18148681644c1fa1950da074e77763
    19 schema:genre chapter
    20 schema:inLanguage en
    21 schema:isAccessibleForFree true
    22 schema:isPartOf N6d2d27a7212641d18783368ffa7cca77
    23 schema:name What Have Innsbruck and Leipzig in Common? Extracting Semantics from Wiki Content
    24 schema:pagination 503-517
    25 schema:productId N454531bd28364e73aa6db2d3ea2d6892
    26 N5a25e0f62ad94c0a87c650591fde1b5c
    27 N607328e9f9874c4c924f4bcd28e0a931
    28 schema:publisher N9f070a6ba51d4978a53c53850e782549
    29 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046332012
    30 https://doi.org/10.1007/978-3-540-72667-8_36
    31 schema:sdDatePublished 2019-04-15T19:11
    32 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    33 schema:sdPublisher Nc44d86738e1e47cdb25ed5c9fd093456
    34 schema:url http://link.springer.com/10.1007/978-3-540-72667-8_36
    35 sgo:license sg:explorer/license/
    36 sgo:sdDataset chapters
    37 rdf:type schema:Chapter
    38 N1b05e6133e4a41c69f62119b1edaa284 rdf:first Nafd598b18a244b5892f8c5651b381569
    39 rdf:rest rdf:nil
    40 N40e7f96550ac400a9b93e6d80309c56b rdf:first sg:person.07377234042.68
    41 rdf:rest Na907693b599847f4914624e35677cf0d
    42 N454531bd28364e73aa6db2d3ea2d6892 schema:name doi
    43 schema:value 10.1007/978-3-540-72667-8_36
    44 rdf:type schema:PropertyValue
    45 N5a25e0f62ad94c0a87c650591fde1b5c schema:name readcube_id
    46 schema:value dcb456b159775c72be31fa19eb2701c9f74a3866fd319d18f375a09e4daa9c3b
    47 rdf:type schema:PropertyValue
    48 N607328e9f9874c4c924f4bcd28e0a931 schema:name dimensions_id
    49 schema:value pub.1046332012
    50 rdf:type schema:PropertyValue
    51 N6d2d27a7212641d18783368ffa7cca77 schema:isbn 978-3-540-72666-1
    52 978-3-540-72667-8
    53 schema:name The Semantic Web: Research and Applications
    54 rdf:type schema:Book
    55 N7c78e02a1d3842c9bc4cfc7cd9141839 schema:familyName Franconi
    56 schema:givenName Enrico
    57 rdf:type schema:Person
    58 N82ef477a06d34d1183ac886f6deb6804 schema:familyName Kifer
    59 schema:givenName Michael
    60 rdf:type schema:Person
    61 N8ab10e901be9490c93640938f492c717 rdf:first N82ef477a06d34d1183ac886f6deb6804
    62 rdf:rest N1b05e6133e4a41c69f62119b1edaa284
    63 N9f070a6ba51d4978a53c53850e782549 schema:location Berlin, Heidelberg
    64 schema:name Springer Berlin Heidelberg
    65 rdf:type schema:Organisation
    66 Na907693b599847f4914624e35677cf0d rdf:first sg:person.016325666777.06
    67 rdf:rest rdf:nil
    68 Naf18148681644c1fa1950da074e77763 rdf:first N7c78e02a1d3842c9bc4cfc7cd9141839
    69 rdf:rest N8ab10e901be9490c93640938f492c717
    70 Nafd598b18a244b5892f8c5651b381569 schema:familyName May
    71 schema:givenName Wolfgang
    72 rdf:type schema:Person
    73 Nb2ff078d7c0641a2aa9b7df179333565 schema:name Universität Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany
    74 rdf:type schema:Organization
    75 Nc44d86738e1e47cdb25ed5c9fd093456 schema:name Springer Nature - SN SciGraph project
    76 rdf:type schema:Organization
    77 Nfc22658c2c1d4aac8ad4e67ff5c51c8a schema:name Universität Leipzig, Department of Computer Science, Johannisgasse 26, D-04103 Leipzig, Germany
    78 rdf:type schema:Organization
    79 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Information and Computing Sciences
    81 rdf:type schema:DefinedTerm
    82 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
    83 schema:name Information Systems
    84 rdf:type schema:DefinedTerm
    85 sg:person.016325666777.06 schema:affiliation Nfc22658c2c1d4aac8ad4e67ff5c51c8a
    86 schema:familyName Lehmann
    87 schema:givenName Jens
    88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016325666777.06
    89 rdf:type schema:Person
    90 sg:person.07377234042.68 schema:affiliation Nb2ff078d7c0641a2aa9b7df179333565
    91 schema:familyName Auer
    92 schema:givenName Sören
    93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07377234042.68
    94 rdf:type schema:Person
    95 sg:pub.10.1007/11926078_53 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019685949
    96 https://doi.org/10.1007/11926078_53
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1007/11926078_55 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030414405
    99 https://doi.org/10.1007/11926078_55
    100 rdf:type schema:CreativeWork
    101 sg:pub.10.1007/3-540-45816-6_32 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041368308
    102 https://doi.org/10.1007/3-540-45816-6_32
    103 rdf:type schema:CreativeWork
    104 sg:pub.10.1007/s10032-004-0120-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038982559
    105 https://doi.org/10.1007/s10032-004-0120-9
    106 rdf:type schema:CreativeWork
    107 sg:pub.10.1007/s100320200074 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031692172
    108 https://doi.org/10.1007/s100320200074
    109 rdf:type schema:CreativeWork
    110 https://doi.org/10.1016/j.patcog.2004.01.012 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022346240
    111 rdf:type schema:CreativeWork
    112 https://doi.org/10.1016/j.websem.2005.06.003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020843849
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1145/1135777.1135863 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043205484
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1145/860435.860479 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040183355
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1146/annurev.cs.04.060190.002221 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036425691
    119 rdf:type schema:CreativeWork
    120 https://doi.org/10.3115/990820.990869 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053690603
    121 rdf:type schema:CreativeWork
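
The `rdf:first` / `rdf:rest` rows in the table encode ordered lists (here, the editor list) as linked blank nodes ending in `rdf:nil`. A minimal sketch of walking such a list follows. The triples are hand-copied from the table on the assumption that rows without a subject inherit the subject of the preceding row; `walk_list` is a hypothetical helper, not part of any SciGraph tooling.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) rows copied from the table for the editor list.
TRIPLES = [
    ("Naf18148681644c1fa1950da074e77763", "rdf:first", "N7c78e02a1d3842c9bc4cfc7cd9141839"),
    ("Naf18148681644c1fa1950da074e77763", "rdf:rest", "N8ab10e901be9490c93640938f492c717"),
    ("N8ab10e901be9490c93640938f492c717", "rdf:first", "N82ef477a06d34d1183ac886f6deb6804"),
    ("N8ab10e901be9490c93640938f492c717", "rdf:rest", "N1b05e6133e4a41c69f62119b1edaa284"),
    ("N1b05e6133e4a41c69f62119b1edaa284", "rdf:first", "Nafd598b18a244b5892f8c5651b381569"),
    ("N1b05e6133e4a41c69f62119b1edaa284", "rdf:rest", RDF_NIL),
    ("N7c78e02a1d3842c9bc4cfc7cd9141839", "schema:familyName", "Franconi"),
    ("N82ef477a06d34d1183ac886f6deb6804", "schema:familyName", "Kifer"),
    ("Nafd598b18a244b5892f8c5651b381569", "schema:familyName", "May"),
]


def walk_list(head: str, triples: list) -> list:
    """Follow rdf:rest links from `head`, collecting each rdf:first object."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, visited = [], head, set()
    while node != RDF_NIL:
        if node in visited:  # guard against malformed, cyclic lists
            raise ValueError(f"cycle detected at {node}")
        visited.add(node)
        items.append(first[node])
        node = rest[node]
    return items


family_name = {s: o for s, p, o in TRIPLES if p == "schema:familyName"}
editors = [family_name[n] for n in walk_list("Naf18148681644c1fa1950da074e77763", TRIPLES)]
print(editors)
```

The head node used here is the object of the `schema:editor` row; the same walk applies to the `schema:author` list starting from its own head node.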
     



