Machine Learning Architectures for Scalable and Reliable Subject Indexing


Ontology type: schema:Chapter     


Chapter Info

DATE

2017-09-02

AUTHORS

Martin Toepfer

ABSTRACT

Digital libraries desire automatic subject indexing as a scalable provider of high-quality semantic document representations. The task is, however, complex and challenging, thus many issues are still unsolved. For instance, certain concepts are not detected accurately, and confidence estimates are often unreliable. Accurate quality estimates are, however, crucial in practice, for example, to filter results and ensure highest standards before subsequent use. The proposed thesis studies applications of machine learning for automatic subject indexing, which faces considerable challenges like class imbalance, concept drift, and zero-shot learning. Special attention will be paid to architecture design and automatic quality estimation, with experiments on scholarly publications in economics and business studies. First results indicate the importance of knowledge transfer between concepts and point out the value of so-called fusion approaches that carefully combine lexical and associative subsystems. This extended abstract summarizes the main topic and status of the thesis and provides an outlook on future directions.

PAGES

644-647

Book

TITLE

Research and Advanced Technology for Digital Libraries

ISBN

978-3-319-67007-2
978-3-319-67008-9

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-67008-9_61

DOI

http://dx.doi.org/10.1007/978-3-319-67008-9_61

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1091456244



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "German National Library of Economics", 
          "id": "https://www.grid.ac/institutes/grid.461649.8", 
          "name": [
            "ZBW \u2013 Leibniz Information Centre for Economics, D\u00fcsternbrooker Weg 120, 24105, Kiel, Germany"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Toepfer", 
        "givenName": "Martin", 
        "id": "sg:person.07462236272.50", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07462236272.50"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1145/2716262", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015712391"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1136/amiajnl-2010-000055", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018410533"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1141753.1141816", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022391156"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/505282.505283", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023316280"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.5626/jcse.2012.6.2.151", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1026789273"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/coli_a_00239", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030937853"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/asi.20790", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033027459"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1079084402", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1609/aimag.v31i3.2303", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090968457"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/jcdl.2017.7991557", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094377822"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2017-09-02", 
    "datePublishedReg": "2017-09-02", 
    "description": "Digital libraries desire automatic subject indexing as a scalable provider of high-quality semantic document representations. The task is, however, complex and challenging, thus many issues are still unsolved. For instance, certain concepts are not detected accurately, and confidence estimates are often unreliable. Accurate quality estimates are, however, crucial in practice, for example, to filter results and ensure highest standards before subsequent use. The proposed thesis studies applications of machine learning for automatic subject indexing, which faces considerable challenges like class imbalance, concept drift, and zero-shot learning. Special attention will be paid to architecture design and automatic quality estimation, with experiments on scholarly publications in economics and business studies. First results indicate the importance of knowledge transfer between concepts and point out the value of so-called fusion approaches that carefully combine lexical and associative subsystems. This extended abstract summarizes the main topic and status of the thesis and provides an outlook on future directions.", 
    "editor": [
      {
        "familyName": "Kamps", 
        "givenName": "Jaap", 
        "type": "Person"
      }, 
      {
        "familyName": "Tsakonas", 
        "givenName": "Giannis", 
        "type": "Person"
      }, 
      {
        "familyName": "Manolopoulos", 
        "givenName": "Yannis", 
        "type": "Person"
      }, 
      {
        "familyName": "Iliadis", 
        "givenName": "Lazaros", 
        "type": "Person"
      }, 
      {
        "familyName": "Karydis", 
        "givenName": "Ioannis", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-67008-9_61", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-319-67007-2", 
        "978-3-319-67008-9"
      ], 
      "name": "Research and Advanced Technology for Digital Libraries", 
      "type": "Book"
    }, 
    "name": "Machine Learning Architectures for Scalable and Reliable Subject Indexing", 
    "pagination": "644-647", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-67008-9_61"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "c9869a0369ca2d75f2cd4fa8376100d4d399c2981e64234dbd23b318ffc37c52"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1091456244"
        ]
      }
    ], 
    "publisher": {
      "location": "Cham", 
      "name": "Springer International Publishing", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-67008-9_61", 
      "https://app.dimensions.ai/details/publication/pub.1091456244"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T04:59", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000325_0000000325/records_100783_00000000.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F978-3-319-67008-9_61"
  }
]
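The identifiers in the record above sit inside the "productId" array, each as a name plus a list of values. As a minimal sketch, assuming the record has already been parsed into Python objects, they can be flattened into a simple lookup table. The helper name product_ids and the trimmed sample are illustrative assumptions, not part of SciGraph.

```python
import json

# Trimmed excerpt of the record above; only the fields used here are kept.
SAMPLE = json.loads("""
[
  {
    "id": "sg:pub.10.1007/978-3-319-67008-9_61",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-319-67008-9_61"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1091456244"]}
    ]
  }
]
""")


def product_ids(record: dict) -> dict:
    """Map each productId name to its first value (hypothetical helper)."""
    ids = {}
    for entry in record.get("productId", []):
        values = entry.get("value", [])
        if values:
            ids[entry["name"]] = values[0]
    return ids


if __name__ == "__main__":
    # The top-level JSON-LD document is an array holding one record.
    print(product_ids(SAMPLE[0]))
```

Taking only the first value is a simplification; each "value" list in this record happens to hold a single entry.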
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-67008-9_61'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-67008-9_61'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-67008-9_61'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-67008-9_61'
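The four curl commands differ only in the Accept header. A minimal Python sketch of the same content negotiation, using only the standard library, might look as follows. The function name build_request and the short format keys are assumptions made for illustration; the block only constructs the request and does not check whether the endpoint still responds.

```python
from urllib.request import Request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-67008-9_61"

# Media types taken from the curl examples above; the keys are illustrative.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> Request:
    """Build a GET request asking for the given serialization (hypothetical helper)."""
    try:
        media_type = ACCEPT_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown format: {fmt!r}") from None
    return Request(url, headers={"Accept": media_type})


if __name__ == "__main__":
    req = build_request("turtle")
    print(req.full_url, req.get_header("Accept"))
```

To actually fetch the data, the request can be passed to urllib.request.urlopen and the response body decoded as UTF-8.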


 

This table displays all metadata directly associated with this object as RDF triples.

114 TRIPLES      23 PREDICATES      36 URIs      19 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-67008-9_61 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N357ea5a1ef954bd3af209cc932a4ccbb
4 schema:citation https://app.dimensions.ai/details/publication/pub.1079084402
5 https://doi.org/10.1002/asi.20790
6 https://doi.org/10.1109/jcdl.2017.7991557
7 https://doi.org/10.1136/amiajnl-2010-000055
8 https://doi.org/10.1145/1141753.1141816
9 https://doi.org/10.1145/2716262
10 https://doi.org/10.1145/505282.505283
11 https://doi.org/10.1162/coli_a_00239
12 https://doi.org/10.1609/aimag.v31i3.2303
13 https://doi.org/10.5626/jcse.2012.6.2.151
14 schema:datePublished 2017-09-02
15 schema:datePublishedReg 2017-09-02
16 schema:description Digital libraries desire automatic subject indexing as a scalable provider of high-quality semantic document representations. The task is, however, complex and challenging, thus many issues are still unsolved. For instance, certain concepts are not detected accurately, and confidence estimates are often unreliable. Accurate quality estimates are, however, crucial in practice, for example, to filter results and ensure highest standards before subsequent use. The proposed thesis studies applications of machine learning for automatic subject indexing, which faces considerable challenges like class imbalance, concept drift, and zero-shot learning. Special attention will be paid to architecture design and automatic quality estimation, with experiments on scholarly publications in economics and business studies. First results indicate the importance of knowledge transfer between concepts and point out the value of so-called fusion approaches that carefully combine lexical and associative subsystems. This extended abstract summarizes the main topic and status of the thesis and provides an outlook on future directions.
17 schema:editor Na899e302d6544a43986ca9ee8c053529
18 schema:genre chapter
19 schema:inLanguage en
20 schema:isAccessibleForFree false
21 schema:isPartOf N552e712ea9a643a790fc08a39c4e364f
22 schema:name Machine Learning Architectures for Scalable and Reliable Subject Indexing
23 schema:pagination 644-647
24 schema:productId N35b5dadadbe64222ba0894620a0821e5
25 N3f9ff69ae83a4dd7bd1538d20d5f4d46
26 N9f6e451949ff42f9bc43e807c11e549e
27 schema:publisher N5f4230320f9c437fa0a17330b59ba3df
28 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091456244
29 https://doi.org/10.1007/978-3-319-67008-9_61
30 schema:sdDatePublished 2019-04-16T04:59
31 schema:sdLicense https://scigraph.springernature.com/explorer/license/
32 schema:sdPublisher N67f65b64c2f44e979e1b05f38bbfa09a
33 schema:url https://link.springer.com/10.1007%2F978-3-319-67008-9_61
34 sgo:license sg:explorer/license/
35 sgo:sdDataset chapters
36 rdf:type schema:Chapter
37 N045bf534f2494d1288cd4c87fb259d9a schema:familyName Iliadis
38 schema:givenName Lazaros
39 rdf:type schema:Person
40 N0465f9aebbbc4bada36006894b2ecfda schema:familyName Kamps
41 schema:givenName Jaap
42 rdf:type schema:Person
43 N12a0431c5ce14d6198deb0185a930109 rdf:first Nf6bd911919da4852afc74498f45be1eb
44 rdf:rest rdf:nil
45 N357ea5a1ef954bd3af209cc932a4ccbb rdf:first sg:person.07462236272.50
46 rdf:rest rdf:nil
47 N35b5dadadbe64222ba0894620a0821e5 schema:name doi
48 schema:value 10.1007/978-3-319-67008-9_61
49 rdf:type schema:PropertyValue
50 N3f9ff69ae83a4dd7bd1538d20d5f4d46 schema:name dimensions_id
51 schema:value pub.1091456244
52 rdf:type schema:PropertyValue
53 N552e712ea9a643a790fc08a39c4e364f schema:isbn 978-3-319-67007-2
54 978-3-319-67008-9
55 schema:name Research and Advanced Technology for Digital Libraries
56 rdf:type schema:Book
57 N5f4230320f9c437fa0a17330b59ba3df schema:location Cham
58 schema:name Springer International Publishing
59 rdf:type schema:Organisation
60 N6356ef3945c045fdb6c4f98c050d82e9 schema:familyName Tsakonas
61 schema:givenName Giannis
62 rdf:type schema:Person
63 N63fd6e05c01148be817eeb54e6604073 rdf:first N6356ef3945c045fdb6c4f98c050d82e9
64 rdf:rest Nd9f9d405f29e4e948535aef0396ab3a1
65 N67f65b64c2f44e979e1b05f38bbfa09a schema:name Springer Nature - SN SciGraph project
66 rdf:type schema:Organization
67 N9f6e451949ff42f9bc43e807c11e549e schema:name readcube_id
68 schema:value c9869a0369ca2d75f2cd4fa8376100d4d399c2981e64234dbd23b318ffc37c52
69 rdf:type schema:PropertyValue
70 Na899e302d6544a43986ca9ee8c053529 rdf:first N0465f9aebbbc4bada36006894b2ecfda
71 rdf:rest N63fd6e05c01148be817eeb54e6604073
72 Ncab2aecb592c41c0b4cb21c5c5b50908 rdf:first N045bf534f2494d1288cd4c87fb259d9a
73 rdf:rest N12a0431c5ce14d6198deb0185a930109
74 Nd9f9d405f29e4e948535aef0396ab3a1 rdf:first Ndb8be291ca044aa1adc0ccee8de4d7de
75 rdf:rest Ncab2aecb592c41c0b4cb21c5c5b50908
76 Ndb8be291ca044aa1adc0ccee8de4d7de schema:familyName Manolopoulos
77 schema:givenName Yannis
78 rdf:type schema:Person
79 Nf6bd911919da4852afc74498f45be1eb schema:familyName Karydis
80 schema:givenName Ioannis
81 rdf:type schema:Person
82 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
83 schema:name Information and Computing Sciences
84 rdf:type schema:DefinedTerm
85 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
86 schema:name Artificial Intelligence and Image Processing
87 rdf:type schema:DefinedTerm
88 sg:person.07462236272.50 schema:affiliation https://www.grid.ac/institutes/grid.461649.8
89 schema:familyName Toepfer
90 schema:givenName Martin
91 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07462236272.50
92 rdf:type schema:Person
93 https://app.dimensions.ai/details/publication/pub.1079084402 rdf:type schema:CreativeWork
94 https://doi.org/10.1002/asi.20790 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033027459
95 rdf:type schema:CreativeWork
96 https://doi.org/10.1109/jcdl.2017.7991557 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094377822
97 rdf:type schema:CreativeWork
98 https://doi.org/10.1136/amiajnl-2010-000055 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018410533
99 rdf:type schema:CreativeWork
100 https://doi.org/10.1145/1141753.1141816 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022391156
101 rdf:type schema:CreativeWork
102 https://doi.org/10.1145/2716262 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015712391
103 rdf:type schema:CreativeWork
104 https://doi.org/10.1145/505282.505283 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023316280
105 rdf:type schema:CreativeWork
106 https://doi.org/10.1162/coli_a_00239 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030937853
107 rdf:type schema:CreativeWork
108 https://doi.org/10.1609/aimag.v31i3.2303 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090968457
109 rdf:type schema:CreativeWork
110 https://doi.org/10.5626/jcse.2012.6.2.151 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026789273
111 rdf:type schema:CreativeWork
112 https://www.grid.ac/institutes/grid.461649.8 schema:alternateName German National Library of Economics
113 schema:name ZBW – Leibniz Information Centre for Economics, Düsternbrooker Weg 120, 24105, Kiel, Germany
114 rdf:type schema:Organization
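In the table above, the editor list is stored as an RDF collection: schema:editor points at a blank node, and each blank node carries an rdf:first (the item) and an rdf:rest (the next cell, or rdf:nil at the end). The following sketch, which assumes the relevant rows have been copied by hand into plain dictionaries, walks that chain to recover the editors in order. The function name walk_list is an illustrative assumption; an RDF library would normally handle this.

```python
RDF_NIL = "rdf:nil"

# Head of the list, taken from the schema:editor row (row 17).
HEAD = "Na899e302d6544a43986ca9ee8c053529"

# List cells copied from the table: node -> (rdf:first, rdf:rest).
CELLS = {
    "Na899e302d6544a43986ca9ee8c053529": (
        "N0465f9aebbbc4bada36006894b2ecfda", "N63fd6e05c01148be817eeb54e6604073"),
    "N63fd6e05c01148be817eeb54e6604073": (
        "N6356ef3945c045fdb6c4f98c050d82e9", "Nd9f9d405f29e4e948535aef0396ab3a1"),
    "Nd9f9d405f29e4e948535aef0396ab3a1": (
        "Ndb8be291ca044aa1adc0ccee8de4d7de", "Ncab2aecb592c41c0b4cb21c5c5b50908"),
    "Ncab2aecb592c41c0b4cb21c5c5b50908": (
        "N045bf534f2494d1288cd4c87fb259d9a", "N12a0431c5ce14d6198deb0185a930109"),
    "N12a0431c5ce14d6198deb0185a930109": (
        "Nf6bd911919da4852afc74498f45be1eb", RDF_NIL),
}

# schema:familyName of each person node, also copied from the table.
FAMILY_NAMES = {
    "N0465f9aebbbc4bada36006894b2ecfda": "Kamps",
    "N6356ef3945c045fdb6c4f98c050d82e9": "Tsakonas",
    "Ndb8be291ca044aa1adc0ccee8de4d7de": "Manolopoulos",
    "N045bf534f2494d1288cd4c87fb259d9a": "Iliadis",
    "Nf6bd911919da4852afc74498f45be1eb": "Karydis",
}


def walk_list(head: str, cells: dict) -> list:
    """Follow rdf:rest links from head until rdf:nil, collecting rdf:first items."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


if __name__ == "__main__":
    print([FAMILY_NAMES[n] for n in walk_list(HEAD, CELLS)])
    # → ['Kamps', 'Tsakonas', 'Manolopoulos', 'Iliadis', 'Karydis']
```

The recovered order matches the "editor" array in the JSON-LD record, which is why the collection encoding is needed: plain triples carry no ordering by themselves.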
 



