A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network


Ontology type: schema:Chapter     


Chapter Info

DATE

2018-09-18

AUTHORS

Lorenzo Canale , Pasquale Lisena , Raphaël Troncy

ABSTRACT

Named entity recognition (NER) and disambiguation (NED) are subtasks of information extraction that aim to recognize named entities mentioned in text, to assign them pre-defined types, and to link them with their matching entities in a knowledge base. Many approaches, often exposed as web APIs, have been proposed to solve these tasks during the last years. These APIs classify entities using different taxonomies and disambiguate them with different knowledge bases. In this paper, we describe Ensemble Nerd, a framework that collects numerous extractors responses, normalizes them and combines them in order to produce a final entity list according to the pattern (surface form, type, link). The presented approach is based on representing the extractors responses as real-value vectors and on using them as input samples for two Deep Learning networks: ENNTR (Ensemble Neural Network for Type Recognition) and ENND (Ensemble Neural Network for Disambiguation). We train these networks using specific gold standards. We show that the models produced outperform each single extractor responses in terms of micro and macro F1 measures computed by the GERBIL framework.

PAGES

91-107

Book

TITLE

The Semantic Web – ISWC 2018

ISBN

978-3-030-00670-9
978-3-030-00671-6

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6

DOI

http://dx.doi.org/10.1007/978-3-030-00671-6_6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1107054154



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Politecnico di Torino, Turin, Italy", 
          "id": "http://www.grid.ac/institutes/grid.4800.c", 
          "name": [
            "EURECOM, Sophia Antipolis, France", 
            "Politecnico di Torino, Turin, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Canale", 
        "givenName": "Lorenzo", 
        "id": "sg:person.014452763710.07", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014452763710.07"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lisena", 
        "givenName": "Pasquale", 
        "id": "sg:person.014343777056.38", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014343777056.38"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Troncy", 
        "givenName": "Rapha\u00ebl", 
        "id": "sg:person.011023401073.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011023401073.96"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2018-09-18", 
    "datePublishedReg": "2018-09-18", 
    "description": "Named entity recognition (NER) and disambiguation (NED) are subtasks of information extraction that aim to recognize named entities mentioned in text, to assign them pre-defined types, and to link them with their matching entities in a knowledge base. Many approaches, often exposed as web APIs, have been proposed to solve these tasks during the last years. These APIs classify entities using different taxonomies and disambiguate them with different knowledge bases. In this paper, we describe Ensemble Nerd, a framework that collects numerous extractors responses, normalizes them and combines them in order to produce a final entity list according to the pattern (surface form, type, link). The presented approach is based on representing the extractors responses as real-value vectors and on using them as input samples for two Deep Learning networks: ENNTR (Ensemble Neural Network for Type Recognition) and ENND (Ensemble Neural Network for Disambiguation). We train these networks using specific gold standards. We show that the models produced outperform each single extractor responses in terms of micro and macro F1 measures computed by the GERBIL framework.", 
    "editor": [
      {
        "familyName": "Vrande\u010di\u0107", 
        "givenName": "Denny", 
        "type": "Person"
      }, 
      {
        "familyName": "Bontcheva", 
        "givenName": "Kalina", 
        "type": "Person"
      }, 
      {
        "familyName": "Su\u00e1rez-Figueroa", 
        "givenName": "Mari Carmen", 
        "type": "Person"
      }, 
      {
        "familyName": "Presutti", 
        "givenName": "Valentina", 
        "type": "Person"
      }, 
      {
        "familyName": "Celino", 
        "givenName": "Irene", 
        "type": "Person"
      }, 
      {
        "familyName": "Sabou", 
        "givenName": "Marta", 
        "type": "Person"
      }, 
      {
        "familyName": "Kaffee", 
        "givenName": "Lucie-Aim\u00e9e", 
        "type": "Person"
      }, 
      {
        "familyName": "Simperl", 
        "givenName": "Elena", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-030-00671-6_6", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-030-00670-9", 
        "978-3-030-00671-6"
      ], 
      "name": "The Semantic Web \u2013 ISWC 2018", 
      "type": "Book"
    }, 
    "keywords": [
      "entity recognition", 
      "pre-defined types", 
      "deep learning network", 
      "Named Entity Recognition", 
      "real-value vectors", 
      "novel ensemble method", 
      "Web APIs", 
      "different knowledge bases", 
      "entity lists", 
      "information extraction", 
      "neural network", 
      "knowledge bases", 
      "F1 measure", 
      "learning network", 
      "ensemble method", 
      "knowledge base", 
      "input samples", 
      "presented approach", 
      "APIs", 
      "different taxonomies", 
      "network", 
      "disambiguation", 
      "framework", 
      "subtasks", 
      "recognition", 
      "last years", 
      "entities", 
      "task", 
      "text", 
      "extraction", 
      "taxonomy", 
      "vector", 
      "standards", 
      "list", 
      "order", 
      "base", 
      "method", 
      "model", 
      "terms", 
      "measures", 
      "basis", 
      "types", 
      "patterns", 
      "NERD", 
      "gold standard", 
      "years", 
      "response", 
      "samples", 
      "approach", 
      "paper", 
      "Ensemble Nerd", 
      "numerous extractors responses", 
      "extractors responses", 
      "final entity list", 
      "ENNTR", 
      "ENND", 
      "specific gold standards", 
      "single extractor responses", 
      "macro F1 measures", 
      "GERBIL framework"
    ], 
    "name": "A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network", 
    "pagination": "91-107", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1107054154"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-030-00671-6_6"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-030-00671-6_6", 
      "https://app.dimensions.ai/details/publication/pub.1107054154"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-01-01T19:27", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/chapter/chapter_85.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-030-00671-6_6"
  }
]
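
Because the record above is plain JSON, it can be read with any JSON parser before reaching for a full JSON-LD toolkit. The following is a minimal sketch that pulls the title, author names, DOI and page range out of a trimmed copy of the record. The `summarize` helper and the `SAMPLE` constant are illustrative names, not part of SciGraph, and the sample keeps only the fields the sketch reads.

```python
import json

# Trimmed copy of the record shown above; only the fields used below are kept.
SAMPLE = r"""
[
  {
    "name": "A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network",
    "pagination": "91-107",
    "author": [
      {"familyName": "Canale", "givenName": "Lorenzo", "type": "Person"},
      {"familyName": "Lisena", "givenName": "Pasquale", "type": "Person"},
      {"familyName": "Troncy", "givenName": "Rapha\u00ebl", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1107054154"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-030-00671-6_6"]}
    ]
  }
]
"""


def summarize(record):
    """Return a small dict with the title, authors, DOI and pages of one record."""
    authors = [
        f'{person["givenName"]} {person["familyName"]}'
        for person in record.get("author", [])
    ]
    # productId is a list of name/value pairs; each value is itself a list.
    identifiers = {
        item["name"]: item["value"][0]
        for item in record.get("productId", [])
        if item.get("value")
    }
    return {
        "title": record.get("name"),
        "authors": authors,
        "doi": identifiers.get("doi"),
        "pages": record.get("pagination"),
    }


# The top-level JSON value is a list holding a single record.
record = json.loads(SAMPLE)[0]
summary = summarize(record)
```

The sketch treats the keys as ordinary dictionary keys and ignores the `@context`, which is enough for reading fields but does not expand them to full IRIs.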
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'
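
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The helper names `build_request` and `fetch` and the short format keys are assumptions made for this illustration; the URL and media types are the ones used in the curl commands. Whether the endpoint still answers is not something this sketch can confirm.

```python
from urllib.request import Request, urlopen

# Record URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6"

# Short format key -> media type sent in the Accept header.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    try:
        accept = ACCEPT_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown format: {fmt!r}") from None
    return Request(url, headers={"Accept": accept})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch("turtle")` would then correspond to the Turtle curl command, and `fetch("nt")` to the N-Triples one.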


 

This table displays all metadata directly associated with this object as RDF triples.

173 TRIPLES      23 PREDICATES      85 URIs      78 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-030-00671-6_6 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N3efdfefd0ea9404c8a554e1494e40810
4 schema:datePublished 2018-09-18
5 schema:datePublishedReg 2018-09-18
6 schema:description Named entity recognition (NER) and disambiguation (NED) are subtasks of information extraction that aim to recognize named entities mentioned in text, to assign them pre-defined types, and to link them with their matching entities in a knowledge base. Many approaches, often exposed as web APIs, have been proposed to solve these tasks during the last years. These APIs classify entities using different taxonomies and disambiguate them with different knowledge bases. In this paper, we describe Ensemble Nerd, a framework that collects numerous extractors responses, normalizes them and combines them in order to produce a final entity list according to the pattern (surface form, type, link). The presented approach is based on representing the extractors responses as real-value vectors and on using them as input samples for two Deep Learning networks: ENNTR (Ensemble Neural Network for Type Recognition) and ENND (Ensemble Neural Network for Disambiguation). We train these networks using specific gold standards. We show that the models produced outperform each single extractor responses in terms of micro and macro F1 measures computed by the GERBIL framework.
7 schema:editor Necc82fe2e1bb4a5b83ce151a8d37f7e6
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N8e3c410b53d045159186c842446937b1
12 schema:keywords APIs
13 ENND
14 ENNTR
15 Ensemble Nerd
16 F1 measure
17 GERBIL framework
18 NERD
19 Named Entity Recognition
20 Web APIs
21 approach
22 base
23 basis
24 deep learning network
25 different knowledge bases
26 different taxonomies
27 disambiguation
28 ensemble method
29 entities
30 entity lists
31 entity recognition
32 extraction
33 extractors responses
34 final entity list
35 framework
36 gold standard
37 information extraction
38 input samples
39 knowledge base
40 knowledge bases
41 last years
42 learning network
43 list
44 macro F1 measures
45 measures
46 method
47 model
48 network
49 neural network
50 novel ensemble method
51 numerous extractors responses
52 order
53 paper
54 patterns
55 pre-defined types
56 presented approach
57 real-value vectors
58 recognition
59 response
60 samples
61 single extractor responses
62 specific gold standards
63 standards
64 subtasks
65 task
66 taxonomy
67 terms
68 text
69 types
70 vector
71 years
72 schema:name A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network
73 schema:pagination 91-107
74 schema:productId N384397d3d1e5402889487ee24f41381e
75 Nbd3a9b51c2d042d09bcf552be55b1c96
76 schema:publisher N9a8f4776f24d4cf4be98a2d3bce555e5
77 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107054154
78 https://doi.org/10.1007/978-3-030-00671-6_6
79 schema:sdDatePublished 2022-01-01T19:27
80 schema:sdLicense https://scigraph.springernature.com/explorer/license/
81 schema:sdPublisher N5aa2669f41c546f5ba8d2f059987cb1d
82 schema:url https://doi.org/10.1007/978-3-030-00671-6_6
83 sgo:license sg:explorer/license/
84 sgo:sdDataset chapters
85 rdf:type schema:Chapter
86 N06f89ec2305d4317b43273505517e8cc schema:familyName Sabou
87 schema:givenName Marta
88 rdf:type schema:Person
89 N241d79639e0b49099ab45fb5129d74cb schema:familyName Kaffee
90 schema:givenName Lucie-Aimée
91 rdf:type schema:Person
92 N2a1039b0a2a04be9a7675b4000f04db5 rdf:first N5feca1142dc648db979bb0071afed4c6
93 rdf:rest Necd733a6dbbe4f9eb09639f0989df489
94 N2e637437eb2c4f5eb35eeb8a5884c68b rdf:first N37d08553946c4dc9bd4cf548c6725181
95 rdf:rest N2a1039b0a2a04be9a7675b4000f04db5
96 N37d08553946c4dc9bd4cf548c6725181 schema:familyName Suárez-Figueroa
97 schema:givenName Mari Carmen
98 rdf:type schema:Person
99 N384397d3d1e5402889487ee24f41381e schema:name doi
100 schema:value 10.1007/978-3-030-00671-6_6
101 rdf:type schema:PropertyValue
102 N3efa72f44fd64e8d91ad7b6c9c035d9f schema:familyName Vrandečić
103 schema:givenName Denny
104 rdf:type schema:Person
105 N3efdfefd0ea9404c8a554e1494e40810 rdf:first sg:person.014452763710.07
106 rdf:rest Nd90440e9e03940dca81eadf2e563aac5
107 N5aa2669f41c546f5ba8d2f059987cb1d schema:name Springer Nature - SN SciGraph project
108 rdf:type schema:Organization
109 N5feca1142dc648db979bb0071afed4c6 schema:familyName Presutti
110 schema:givenName Valentina
111 rdf:type schema:Person
112 N73c967e39ca5460abba5c1c426bdee40 schema:familyName Bontcheva
113 schema:givenName Kalina
114 rdf:type schema:Person
115 N7de330a8188446f1bca44eb8377cba41 rdf:first N73c967e39ca5460abba5c1c426bdee40
116 rdf:rest N2e637437eb2c4f5eb35eeb8a5884c68b
117 N8e3c410b53d045159186c842446937b1 schema:isbn 978-3-030-00670-9
118 978-3-030-00671-6
119 schema:name The Semantic Web – ISWC 2018
120 rdf:type schema:Book
121 N95fdd0f913b44ba9a690d023fec8be71 rdf:first N06f89ec2305d4317b43273505517e8cc
122 rdf:rest Necd85f9d260b4b6c979a0f9cc32366a4
123 N9a8f4776f24d4cf4be98a2d3bce555e5 schema:name Springer Nature
124 rdf:type schema:Organisation
125 N9f20bdd52b3d4883a5c9ada9d66e0ea1 schema:familyName Simperl
126 schema:givenName Elena
127 rdf:type schema:Person
128 Nbd3a9b51c2d042d09bcf552be55b1c96 schema:name dimensions_id
129 schema:value pub.1107054154
130 rdf:type schema:PropertyValue
131 Nc80e1788f9434eb9853f586246ac0f3d rdf:first sg:person.011023401073.96
132 rdf:rest rdf:nil
133 Ncecc5294c981457c9c2d21881173f268 schema:familyName Celino
134 schema:givenName Irene
135 rdf:type schema:Person
136 Nd90440e9e03940dca81eadf2e563aac5 rdf:first sg:person.014343777056.38
137 rdf:rest Nc80e1788f9434eb9853f586246ac0f3d
138 Ne79ffc7b159e4899b2c2d10f73cb4fab rdf:first N9f20bdd52b3d4883a5c9ada9d66e0ea1
139 rdf:rest rdf:nil
140 Necc82fe2e1bb4a5b83ce151a8d37f7e6 rdf:first N3efa72f44fd64e8d91ad7b6c9c035d9f
141 rdf:rest N7de330a8188446f1bca44eb8377cba41
142 Necd733a6dbbe4f9eb09639f0989df489 rdf:first Ncecc5294c981457c9c2d21881173f268
143 rdf:rest N95fdd0f913b44ba9a690d023fec8be71
144 Necd85f9d260b4b6c979a0f9cc32366a4 rdf:first N241d79639e0b49099ab45fb5129d74cb
145 rdf:rest Ne79ffc7b159e4899b2c2d10f73cb4fab
146 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
147 schema:name Information and Computing Sciences
148 rdf:type schema:DefinedTerm
149 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
150 schema:name Artificial Intelligence and Image Processing
151 rdf:type schema:DefinedTerm
152 sg:person.011023401073.96 schema:affiliation grid-institutes:grid.28848.3e
153 schema:familyName Troncy
154 schema:givenName Raphaël
155 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011023401073.96
156 rdf:type schema:Person
157 sg:person.014343777056.38 schema:affiliation grid-institutes:grid.28848.3e
158 schema:familyName Lisena
159 schema:givenName Pasquale
160 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014343777056.38
161 rdf:type schema:Person
162 sg:person.014452763710.07 schema:affiliation grid-institutes:grid.4800.c
163 schema:familyName Canale
164 schema:givenName Lorenzo
165 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014452763710.07
166 rdf:type schema:Person
167 grid-institutes:grid.28848.3e schema:alternateName EURECOM, Sophia Antipolis, France
168 schema:name EURECOM, Sophia Antipolis, France
169 rdf:type schema:Organization
170 grid-institutes:grid.4800.c schema:alternateName Politecnico di Torino, Turin, Italy
171 schema:name EURECOM, Sophia Antipolis, France
172 Politecnico di Torino, Turin, Italy
173 rdf:type schema:Organization
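
In the table above, the ordered author list is stored as an RDF collection: `schema:author` points to a blank node, and each node carries one `rdf:first` (the item) and one `rdf:rest` (the next node) until `rdf:nil`. The sketch below walks that chain to recover the author order. The node identifiers and family names are copied from rows 105-106, 131-132, 136-137, 153, 158 and 163; the function name `walk_rdf_list` and the plain-dictionary representation are assumptions made for this illustration, not a SciGraph or RDF library API.

```python
RDF_NIL = "rdf:nil"

# rdf:first and rdf:rest for the author list, copied from the table above.
FIRST = {
    "N3efdfefd0ea9404c8a554e1494e40810": "sg:person.014452763710.07",
    "Nd90440e9e03940dca81eadf2e563aac5": "sg:person.014343777056.38",
    "Nc80e1788f9434eb9853f586246ac0f3d": "sg:person.011023401073.96",
}
REST = {
    "N3efdfefd0ea9404c8a554e1494e40810": "Nd90440e9e03940dca81eadf2e563aac5",
    "Nd90440e9e03940dca81eadf2e563aac5": "Nc80e1788f9434eb9853f586246ac0f3d",
    "Nc80e1788f9434eb9853f586246ac0f3d": RDF_NIL,
}
FAMILY_NAME = {
    "sg:person.014452763710.07": "Canale",
    "sg:person.014343777056.38": "Lisena",
    "sg:person.011023401073.96": "Troncy",
}


def walk_rdf_list(head, first, rest):
    """Follow rdf:rest links from head and collect each rdf:first value in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # A well-formed RDF collection never loops back on itself.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of schema:author in row 3 of the table.
author_ids = walk_rdf_list("N3efdfefd0ea9404c8a554e1494e40810", FIRST, REST)
author_names = [FAMILY_NAME[person] for person in author_ids]
```

The same walk applies to the `schema:editor` list, whose head node appears in row 7.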
 



