A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network


Ontology type: schema:Chapter     


Chapter Info

DATE

2018-09-18

AUTHORS

Lorenzo Canale , Pasquale Lisena , Raphaël Troncy

ABSTRACT

Named entity recognition (NER) and disambiguation (NED) are subtasks of information extraction that aim to recognize named entities mentioned in text, to assign them pre-defined types, and to link them with their matching entities in a knowledge base. Many approaches, often exposed as web APIs, have been proposed to solve these tasks in recent years. These APIs classify entities using different taxonomies and disambiguate them with different knowledge bases. In this paper, we describe Ensemble Nerd, a framework that collects the responses of numerous extractors, normalizes them, and combines them to produce a final entity list according to the pattern (surface form, type, link). The presented approach is based on representing the extractor responses as real-value vectors and on using them as input samples for two Deep Learning networks: ENNTR (Ensemble Neural Network for Type Recognition) and ENND (Ensemble Neural Network for Disambiguation). We train these networks using specific gold standards. We show that the resulting models outperform the responses of each individual extractor in terms of micro and macro F1 measures computed by the GERBIL framework.

PAGES

91-107

Book

TITLE

The Semantic Web – ISWC 2018

ISBN

978-3-030-00670-9
978-3-030-00671-6

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6

DOI

http://dx.doi.org/10.1007/978-3-030-00671-6_6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1107054154



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Politecnico di Torino, Turin, Italy", 
          "id": "http://www.grid.ac/institutes/grid.4800.c", 
          "name": [
            "EURECOM, Sophia Antipolis, France", 
            "Politecnico di Torino, Turin, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Canale", 
        "givenName": "Lorenzo", 
        "id": "sg:person.014452763710.07", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014452763710.07"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lisena", 
        "givenName": "Pasquale", 
        "id": "sg:person.014343777056.38", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014343777056.38"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Troncy", 
        "givenName": "Rapha\u00ebl", 
        "id": "sg:person.011023401073.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011023401073.96"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2018-09-18", 
    "datePublishedReg": "2018-09-18", 
    "description": "Named entity recognition (NER) and disambiguation (NED) are subtasks of information extraction that aim to recognize named entities mentioned in text, to assign them pre-defined types, and to link them with their matching entities in a knowledge base. Many approaches, often exposed as web APIs, have been proposed to solve these tasks during the last years. These APIs classify entities using different taxonomies and disambiguate them with different knowledge bases. In this paper, we describe Ensemble Nerd, a framework that collects numerous extractors responses, normalizes them and combines them in order to produce a final entity list according to the pattern (surface form, type, link). The presented approach is based on representing the extractors responses as real-value vectors and on using them as input samples for two Deep Learning networks: ENNTR (Ensemble Neural Network for Type Recognition) and ENND (Ensemble Neural Network for Disambiguation). We train these networks using specific gold standards. We show that the models produced outperform each single extractor responses in terms of micro and macro F1 measures computed by the GERBIL framework.", 
    "editor": [
      {
        "familyName": "Vrande\u010di\u0107", 
        "givenName": "Denny", 
        "type": "Person"
      }, 
      {
        "familyName": "Bontcheva", 
        "givenName": "Kalina", 
        "type": "Person"
      }, 
      {
        "familyName": "Su\u00e1rez-Figueroa", 
        "givenName": "Mari Carmen", 
        "type": "Person"
      }, 
      {
        "familyName": "Presutti", 
        "givenName": "Valentina", 
        "type": "Person"
      }, 
      {
        "familyName": "Celino", 
        "givenName": "Irene", 
        "type": "Person"
      }, 
      {
        "familyName": "Sabou", 
        "givenName": "Marta", 
        "type": "Person"
      }, 
      {
        "familyName": "Kaffee", 
        "givenName": "Lucie-Aim\u00e9e", 
        "type": "Person"
      }, 
      {
        "familyName": "Simperl", 
        "givenName": "Elena", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-030-00671-6_6", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-030-00670-9", 
        "978-3-030-00671-6"
      ], 
      "name": "The Semantic Web \u2013 ISWC 2018", 
      "type": "Book"
    }, 
    "keywords": [
      "entity recognition", 
      "pre-defined types", 
      "deep learning network", 
      "Named Entity Recognition", 
      "real-value vectors", 
      "novel ensemble method", 
      "Web APIs", 
      "different knowledge bases", 
      "entity lists", 
      "information extraction", 
      "neural network", 
      "knowledge bases", 
      "F1 measure", 
      "learning network", 
      "ensemble method", 
      "knowledge base", 
      "input samples", 
      "presented approach", 
      "APIs", 
      "different taxonomies", 
      "network", 
      "disambiguation", 
      "framework", 
      "subtasks", 
      "recognition", 
      "last years", 
      "entities", 
      "task", 
      "text", 
      "extraction", 
      "taxonomy", 
      "vector", 
      "standards", 
      "list", 
      "order", 
      "base", 
      "method", 
      "model", 
      "terms", 
      "measures", 
      "basis", 
      "types", 
      "patterns", 
      "NERD", 
      "gold standard", 
      "years", 
      "response", 
      "samples", 
      "approach", 
      "paper", 
      "Ensemble Nerd", 
      "numerous extractors responses", 
      "extractors responses", 
      "final entity list", 
      "ENNTR", 
      "ENND", 
      "specific gold standards", 
      "single extractor responses", 
      "macro F1 measures", 
      "GERBIL framework"
    ], 
    "name": "A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network", 
    "pagination": "91-107", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1107054154"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-030-00671-6_6"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-030-00671-6_6", 
      "https://app.dimensions.ai/details/publication/pub.1107054154"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2021-12-01T20:00", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211201/entities/gbq_results/chapter/chapter_222.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-030-00671-6_6"
  }
]
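
As a rough illustration of how the JSON-LD record above can be consumed, the following sketch pulls out a few bibliographic fields with the Python standard library. The function name `summarize_record` and the trimmed `SAMPLE` string are assumptions made for this example; only the key names (`name`, `author`, `productId`, `pagination`, `isPartOf`) come from the record itself.

```python
import json


def summarize_record(jsonld_text: str) -> dict:
    """Extract a few bibliographic fields from a SciGraph-style JSON-LD array.

    Hypothetical helper: it assumes the top-level value is a list whose
    first element is the chapter record, as in the record shown above.
    """
    record = json.loads(jsonld_text)[0]
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # The DOI is stored as a PropertyValue entry under "productId".
    dois = [
        value
        for prop in record.get("productId", [])
        if prop.get("name") == "doi"
        for value in prop.get("value", [])
    ]
    return {
        "title": record.get("name"),
        "authors": authors,
        "doi": dois[0] if dois else None,
        "pages": record.get("pagination"),
        "book": record.get("isPartOf", {}).get("name"),
    }


# Trimmed copy of the record above, kept only to make the sketch runnable.
SAMPLE = r"""
[
  {
    "author": [
      {"familyName": "Canale", "givenName": "Lorenzo", "type": "Person"},
      {"familyName": "Lisena", "givenName": "Pasquale", "type": "Person"},
      {"familyName": "Troncy", "givenName": "Rapha\u00ebl", "type": "Person"}
    ],
    "isPartOf": {"name": "The Semantic Web \u2013 ISWC 2018", "type": "Book"},
    "name": "A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network",
    "pagination": "91-107",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1107054154"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-030-00671-6_6"]}
    ],
    "type": "Chapter"
  }
]
"""

if __name__ == "__main__":
    for key, value in summarize_record(SAMPLE).items():
        print(f"{key}: {value}")
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field lookup; a dedicated JSON-LD processor would only be required to expand the `@context`.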
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular linked data format that is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6'
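
The four curl commands above differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The names `MEDIA_TYPES`, `build_request`, and `fetch`, and the short format labels used as dictionary keys, are assumptions made for this illustration; the URL and media types are copied from the commands above. Whether the endpoint still answers is not something this sketch can guarantee.

```python
import urllib.request

# Record URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-030-00671-6_6"

# Illustrative short labels mapped to the media types used above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Build a GET request that asks for one RDF serialization via Accept."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(MEDIA_TYPES)}")
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text.

    Performs a network call, so it is only invoked under __main__ below.
    """
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Print the start of the Turtle serialization, if the service responds.
    print(fetch("turtle")[:500])
```

Separating `build_request` from `fetch` keeps the header logic checkable without network access.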


 

This table displays all metadata directly associated with this object as RDF triples.

173 TRIPLES      23 PREDICATES      85 URIs      78 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-030-00671-6_6 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N0787721d517b4f26be85881936af9670
4 schema:datePublished 2018-09-18
5 schema:datePublishedReg 2018-09-18
6 schema:description Named entity recognition (NER) and disambiguation (NED) are subtasks of information extraction that aim to recognize named entities mentioned in text, to assign them pre-defined types, and to link them with their matching entities in a knowledge base. Many approaches, often exposed as web APIs, have been proposed to solve these tasks during the last years. These APIs classify entities using different taxonomies and disambiguate them with different knowledge bases. In this paper, we describe Ensemble Nerd, a framework that collects numerous extractors responses, normalizes them and combines them in order to produce a final entity list according to the pattern (surface form, type, link). The presented approach is based on representing the extractors responses as real-value vectors and on using them as input samples for two Deep Learning networks: ENNTR (Ensemble Neural Network for Type Recognition) and ENND (Ensemble Neural Network for Disambiguation). We train these networks using specific gold standards. We show that the models produced outperform each single extractor responses in terms of micro and macro F1 measures computed by the GERBIL framework.
7 schema:editor Nbc086dd0fe574c5fb9500c2b6181aa6d
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N7f6d386149194909a2398ff74566a599
12 schema:keywords APIs
13 ENND
14 ENNTR
15 Ensemble Nerd
16 F1 measure
17 GERBIL framework
18 NERD
19 Named Entity Recognition
20 Web APIs
21 approach
22 base
23 basis
24 deep learning network
25 different knowledge bases
26 different taxonomies
27 disambiguation
28 ensemble method
29 entities
30 entity lists
31 entity recognition
32 extraction
33 extractors responses
34 final entity list
35 framework
36 gold standard
37 information extraction
38 input samples
39 knowledge base
40 knowledge bases
41 last years
42 learning network
43 list
44 macro F1 measures
45 measures
46 method
47 model
48 network
49 neural network
50 novel ensemble method
51 numerous extractors responses
52 order
53 paper
54 patterns
55 pre-defined types
56 presented approach
57 real-value vectors
58 recognition
59 response
60 samples
61 single extractor responses
62 specific gold standards
63 standards
64 subtasks
65 task
66 taxonomy
67 terms
68 text
69 types
70 vector
71 years
72 schema:name A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network
73 schema:pagination 91-107
74 schema:productId N886b4aa8b1154debbb43cefbe83c4f21
75 Nbb0d3d9ff2b34d6bac2fc716328bce00
76 schema:publisher N750d1e8765ac49eaaa9abee0947171bf
77 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107054154
78 https://doi.org/10.1007/978-3-030-00671-6_6
79 schema:sdDatePublished 2021-12-01T20:00
80 schema:sdLicense https://scigraph.springernature.com/explorer/license/
81 schema:sdPublisher Ne1418a9a47fd4490acc81495e1569ca0
82 schema:url https://doi.org/10.1007/978-3-030-00671-6_6
83 sgo:license sg:explorer/license/
84 sgo:sdDataset chapters
85 rdf:type schema:Chapter
86 N0787721d517b4f26be85881936af9670 rdf:first sg:person.014452763710.07
87 rdf:rest N97d21e0c40cb49e98d2c325ae0745953
88 N0e1244a01109405ca2339dab45cefee5 rdf:first N64a74cbcda6e4411aa3f8b11c72274f9
89 rdf:rest Ne43b12bfc96c4ce6b0964666621e1374
90 N31d37b3cb8b342e1aa9f91cded2d66a8 rdf:first Ndda0ef0ea3c4445bba28d5339c19d565
91 rdf:rest N816cbea7d6fa405cba248b1020bfd72c
92 N476deb02edf74e9dab824912e7c3c02f schema:familyName Celino
93 schema:givenName Irene
94 rdf:type schema:Person
95 N54f3a6ca0e8e494381acfd344662446a schema:familyName Suárez-Figueroa
96 schema:givenName Mari Carmen
97 rdf:type schema:Person
98 N64a74cbcda6e4411aa3f8b11c72274f9 schema:familyName Sabou
99 schema:givenName Marta
100 rdf:type schema:Person
101 N720f56ee877d4901919ea99d508221fe rdf:first N8d31538ef55e473c88919eced8e1ec5a
102 rdf:rest rdf:nil
103 N750d1e8765ac49eaaa9abee0947171bf schema:name Springer Nature
104 rdf:type schema:Organisation
105 N7f6d386149194909a2398ff74566a599 schema:isbn 978-3-030-00670-9
106 978-3-030-00671-6
107 schema:name The Semantic Web – ISWC 2018
108 rdf:type schema:Book
109 N816cbea7d6fa405cba248b1020bfd72c rdf:first N476deb02edf74e9dab824912e7c3c02f
110 rdf:rest N0e1244a01109405ca2339dab45cefee5
111 N886b4aa8b1154debbb43cefbe83c4f21 schema:name dimensions_id
112 schema:value pub.1107054154
113 rdf:type schema:PropertyValue
114 N89b60d07a24f487183060aeb4929251b rdf:first N54f3a6ca0e8e494381acfd344662446a
115 rdf:rest N31d37b3cb8b342e1aa9f91cded2d66a8
116 N8d31538ef55e473c88919eced8e1ec5a schema:familyName Simperl
117 schema:givenName Elena
118 rdf:type schema:Person
119 N97d21e0c40cb49e98d2c325ae0745953 rdf:first sg:person.014343777056.38
120 rdf:rest Ncb0d1d6b420a440488fd2cc1fac9eef3
121 Nbb0d3d9ff2b34d6bac2fc716328bce00 schema:name doi
122 schema:value 10.1007/978-3-030-00671-6_6
123 rdf:type schema:PropertyValue
124 Nbc086dd0fe574c5fb9500c2b6181aa6d rdf:first Neb545d43cacf406db7b957591b2af4b6
125 rdf:rest Ne5cf40ef5fd746e7aebfdbfad5eb444e
126 Ncb0d1d6b420a440488fd2cc1fac9eef3 rdf:first sg:person.011023401073.96
127 rdf:rest rdf:nil
128 Ndda0ef0ea3c4445bba28d5339c19d565 schema:familyName Presutti
129 schema:givenName Valentina
130 rdf:type schema:Person
131 Ne1418a9a47fd4490acc81495e1569ca0 schema:name Springer Nature - SN SciGraph project
132 rdf:type schema:Organization
133 Ne43b12bfc96c4ce6b0964666621e1374 rdf:first Nef333a695b434787954dcc48c64c89bf
134 rdf:rest N720f56ee877d4901919ea99d508221fe
135 Ne5cf40ef5fd746e7aebfdbfad5eb444e rdf:first Nf3e7dab68ea04259a2f9c1815b31b53b
136 rdf:rest N89b60d07a24f487183060aeb4929251b
137 Neb545d43cacf406db7b957591b2af4b6 schema:familyName Vrandečić
138 schema:givenName Denny
139 rdf:type schema:Person
140 Nef333a695b434787954dcc48c64c89bf schema:familyName Kaffee
141 schema:givenName Lucie-Aimée
142 rdf:type schema:Person
143 Nf3e7dab68ea04259a2f9c1815b31b53b schema:familyName Bontcheva
144 schema:givenName Kalina
145 rdf:type schema:Person
146 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
147 schema:name Information and Computing Sciences
148 rdf:type schema:DefinedTerm
149 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
150 schema:name Artificial Intelligence and Image Processing
151 rdf:type schema:DefinedTerm
152 sg:person.011023401073.96 schema:affiliation grid-institutes:grid.28848.3e
153 schema:familyName Troncy
154 schema:givenName Raphaël
155 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011023401073.96
156 rdf:type schema:Person
157 sg:person.014343777056.38 schema:affiliation grid-institutes:grid.28848.3e
158 schema:familyName Lisena
159 schema:givenName Pasquale
160 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014343777056.38
161 rdf:type schema:Person
162 sg:person.014452763710.07 schema:affiliation grid-institutes:grid.4800.c
163 schema:familyName Canale
164 schema:givenName Lorenzo
165 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014452763710.07
166 rdf:type schema:Person
167 grid-institutes:grid.28848.3e schema:alternateName EURECOM, Sophia Antipolis, France
168 schema:name EURECOM, Sophia Antipolis, France
169 rdf:type schema:Organization
170 grid-institutes:grid.4800.c schema:alternateName Politecnico di Torino, Turin, Italy
171 schema:name EURECOM, Sophia Antipolis, France
172 Politecnico di Torino, Turin, Italy
173 rdf:type schema:Organization
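
In the table above, the ordered author and editor lists are encoded as RDF collections: each blank node carries an `rdf:first` item and an `rdf:rest` pointer to the next node, ending at `rdf:nil`. The sketch below shows one way to recover the order by following those pointers. The function `walk_rdf_list` and the tuple representation of triples are assumptions for illustration; the node identifiers are copied from the author list in the table.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
RDF_NIL = "rdf:nil"


def walk_rdf_list(triples, head):
    """Return the items of an RDF collection in order, starting at `head`.

    `triples` is an iterable of (subject, predicate, object) string tuples.
    """
    triples = list(triples)
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at list node {node}")
        seen.add(node)
        items.append(first[node])  # the element stored at this list node
        node = rest[node]          # advance to the next list node
    return items


# Author-list triples copied from the table (rows 86-87, 119-120, 126-127),
# deliberately listed out of order to show that row order does not matter.
AUTHOR_LIST_TRIPLES = [
    ("Ncb0d1d6b420a440488fd2cc1fac9eef3", RDF_FIRST, "sg:person.011023401073.96"),
    ("Ncb0d1d6b420a440488fd2cc1fac9eef3", RDF_REST, RDF_NIL),
    ("N0787721d517b4f26be85881936af9670", RDF_FIRST, "sg:person.014452763710.07"),
    ("N0787721d517b4f26be85881936af9670", RDF_REST, "N97d21e0c40cb49e98d2c325ae0745953"),
    ("N97d21e0c40cb49e98d2c325ae0745953", RDF_FIRST, "sg:person.014343777056.38"),
    ("N97d21e0c40cb49e98d2c325ae0745953", RDF_REST, "Ncb0d1d6b420a440488fd2cc1fac9eef3"),
]

if __name__ == "__main__":
    # The head node is the object of schema:author in row 3 of the table.
    for person in walk_rdf_list(AUTHOR_LIST_TRIPLES, "N0787721d517b4f26be85881936af9670"):
        print(person)
```

The same traversal applies to the editor list, whose head node is the object of `schema:editor` in row 7.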
 



