Ontology type: schema:ScholarlyArticle
Open Access: True
2018-12-06
AUTHORS Peter Corbett, John Boyle
ABSTRACT Chemical named entity recognition (NER) has traditionally been dominated by conditional random fields (CRF)-based approaches, but given the success of the artificial neural network techniques known as “deep learning”, we decided to examine them as an alternative to CRFs. We present here several chemical named entity recognition systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using rich per-token features and neural word embeddings, and producing a sequence of tags using bidirectional long short-term memory (LSTM) networks, a type of recurrent neural net. The second system eschews the rich feature set, and even tokenisation, in favour of character labelling using neural character embeddings and multiple LSTM layers. The third system is an ensemble that combines the results of the first two systems. Our original BioCreative V.5 competition entry was placed in the top group with the highest F scores, and subsequent work using transfer learning has achieved a final F score of 90.33% on the test data (precision 91.47%, recall 89.21%).
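The figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below assumes the reported F score is the usual balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F1: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
f = f1_score(91.47, 89.21)
print(round(f, 2))  # 90.33, matching the reported final F score
```

Under that assumption, the three reported numbers are mutually consistent.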
PAGES 59
http://scigraph.springernature.com/pub.10.1186/s13321-018-0313-8
DOI http://dx.doi.org/10.1186/s13321-018-0313-8
DIMENSIONS https://app.dimensions.ai/details/publication/pub.1110434409
PUBMED https://www.ncbi.nlm.nih.gov/pubmed/30523437
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/03",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Chemical Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0303",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Macromolecular and Materials Chemistry",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Data Science Group, Technology Department, The Royal Society of Chemistry, Cambridge, UK",
"id": "http://www.grid.ac/institutes/grid.431456.1",
"name": [
"Data Science Group, Technology Department, The Royal Society of Chemistry, Cambridge, UK"
],
"type": "Organization"
},
"familyName": "Corbett",
"givenName": "Peter",
"id": "sg:person.010641656713.56",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010641656713.56"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Data Science Group, Technology Department, The Royal Society of Chemistry, Cambridge, UK",
"id": "http://www.grid.ac/institutes/grid.431456.1",
"name": [
"Data Science Group, Technology Department, The Royal Society of Chemistry, Cambridge, UK"
],
"type": "Organization"
},
"familyName": "Boyle",
"givenName": "John",
"id": "sg:person.01110033460.10",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01110033460.10"
],
"type": "Person"
}
],
"citation": [
{
"id": "sg:pub.10.1038/nature14539",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1010020120",
"https://doi.org/10.1038/nature14539"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1758-2946-7-s1-s3",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1043621969",
"https://doi.org/10.1186/1758-2946-7-s1-s3"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/s40537-016-0043-6",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1046078126",
"https://doi.org/10.1186/s40537-016-0043-6"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1471-2105-9-s11-s4",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1009242068",
"https://doi.org/10.1186/1471-2105-9-s11-s4"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1758-2946-3-41",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1012459844",
"https://doi.org/10.1186/1758-2946-3-41"
],
"type": "CreativeWork"
}
],
"datePublished": "2018-12-06",
"datePublishedReg": "2018-12-06",
"description": "Chemical named entity recognition (NER) has traditionally been dominated by conditional random fields (CRF)-based approaches but given the success of the artificial neural network techniques known as \u201cdeep learning\u201d we decided to examine them as an alternative to CRFs. We present here several chemical named entity recognition systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using rich per-token features and neural word embeddings, and producing a sequence of tags using bidirectional long short term memory (LSTM) networks\u2014a type of recurrent neural net. The second system eschews the rich feature set\u2014and even tokenisation\u2014in favour of character labelling using neural character embeddings and multiple LSTM layers. The third system is an ensemble that combines the results of the first two systems. Our original BioCreative V.5 competition entry was placed in the top group with the highest F scores, and subsequent using transfer learning have achieved a final F score of 90.33% on the test data (precision 91.47%, recall 89.21%).",
"genre": "article",
"id": "sg:pub.10.1186/s13321-018-0313-8",
"inLanguage": "en",
"isAccessibleForFree": true,
"isPartOf": [
{
"id": "sg:journal.1042252",
"issn": [
"1758-2946"
],
"name": "Journal of Cheminformatics",
"publisher": "Springer Nature",
"type": "Periodical"
},
{
"issueNumber": "1",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "10"
}
],
"keywords": [
"entity recognition",
"F-score",
"bidirectional long short-term memory network",
"long short-term memory network",
"multiple LSTM layers",
"deep learning framework",
"short-term memory network",
"artificial neural network technique",
"entity recognition system",
"recurrent neural network",
"Conditional Random Fields",
"highest F-score",
"neural network technique",
"term memory network",
"recurrent neural nets",
"neural word embeddings",
"character labeling",
"deep learning",
"transfer learning",
"sequence of tags",
"learning framework",
"recognition system",
"LSTM layers",
"neural network",
"network technique",
"character embeddings",
"word embeddings",
"rich features",
"neural nets",
"token features",
"memory network",
"traditional CRF",
"random fields",
"first system",
"embedding",
"network",
"learning",
"test data",
"recognition",
"tokenisation",
"system",
"second system",
"competition entry",
"features",
"framework",
"third system",
"nets",
"scores",
"tags",
"CRF",
"ensemble",
"technique",
"idioms",
"data",
"success",
"top group",
"field",
"sequence",
"favor",
"results",
"group",
"alternative",
"labeling",
"entry",
"layer",
"types",
"approach",
"chemicals"
],
"name": "Chemlistem: chemical named entity recognition using recurrent neural networks",
"pagination": "59",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1110434409"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1186/s13321-018-0313-8"
]
},
{
"name": "pubmed_id",
"type": "PropertyValue",
"value": [
"30523437"
]
}
],
"sameAs": [
"https://doi.org/10.1186/s13321-018-0313-8",
"https://app.dimensions.ai/details/publication/pub.1110434409"
],
"sdDataset": "articles",
"sdDatePublished": "2022-06-01T22:19",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220601/entities/gbq_results/article/article_758.jsonl",
"type": "ScholarlyArticle",
"url": "https://doi.org/10.1186/s13321-018-0313-8"
}
]
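Because JSON-LD is plain JSON, the record above can be read with any JSON parser. The following is a minimal sketch, not an official SciGraph client: it embeds a trimmed excerpt of the record (only the fields it uses) and pulls out the title, the author names, and the DOI. The helper name `summarise` is hypothetical.

```python
import json

# Trimmed excerpt of the record above; only the fields used here are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "Corbett", "givenName": "Peter", "type": "Person"},
      {"familyName": "Boyle", "givenName": "John", "type": "Person"}
    ],
    "name": "Chemlistem: chemical named entity recognition using recurrent neural networks",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1110434409"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/s13321-018-0313-8"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["30523437"]}
    ]
  }
]
"""


def summarise(record_json: str) -> dict:
    """Return title, author names and DOI from a SciGraph-style JSON-LD record."""
    article = json.loads(record_json)[0]  # the record is a one-element list
    # Each productId entry is a PropertyValue with a name and a list of values.
    ids = {p["name"]: p["value"][0] for p in article["productId"]}
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in article["author"]]
    return {"title": article["name"], "authors": authors, "doi": ids.get("doi")}


summary = summarise(RECORD)
print(summary["authors"], summary["doi"])
```

The sketch treats the document as ordinary JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand terms into complete IRIs.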
The RDF metadata can be downloaded as JSON-LD, N-Triples, Turtle, or RDF/XML.
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s13321-018-0313-8'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s13321-018-0313-8'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s13321-018-0313-8'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s13321-018-0313-8'
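The four curl commands differ only in the `Accept` header, which is ordinary HTTP content negotiation. A rough Python equivalent using the standard library is sketched below. The short format keys and the `build_request` helper are illustrative names, not part of any SciGraph API. The request is only constructed here, not sent, since whether the endpoint still responds is not something this page confirms.

```python
from urllib.request import Request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/s13321-018-0313-8"

# Illustrative short keys mapped to the MIME types used in the curl commands.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> Request:
    """Build a GET request asking for the record in the given RDF serialisation."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


req = build_request("turtle")
print(req.get_header("Accept"))
# To fetch the data (requires network access):
#   from urllib.request import urlopen
#   body = urlopen(req).read()
```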
This table displays all metadata directly associated with this object as RDF triples.
156 TRIPLES
22 PREDICATES
99 URIs
86 LITERALS
7 BLANK NODES
# | Subject | Predicate | Object |
---|---|---|---|
1 | sg:pub.10.1186/s13321-018-0313-8 | schema:about | anzsrc-for:03 |
2 | ″ | ″ | anzsrc-for:0303 |
3 | ″ | schema:author | Nfd6668d4675a408a80fe97b71bd61b7e |
4 | ″ | schema:citation | sg:pub.10.1038/nature14539 |
5 | ″ | ″ | sg:pub.10.1186/1471-2105-9-s11-s4 |
6 | ″ | ″ | sg:pub.10.1186/1758-2946-3-41 |
7 | ″ | ″ | sg:pub.10.1186/1758-2946-7-s1-s3 |
8 | ″ | ″ | sg:pub.10.1186/s40537-016-0043-6 |
9 | ″ | schema:datePublished | 2018-12-06 |
10 | ″ | schema:datePublishedReg | 2018-12-06 |
11 | ″ | schema:description | Chemical named entity recognition (NER) has traditionally been dominated by conditional random fields (CRF)-based approaches but given the success of the artificial neural network techniques known as “deep learning” we decided to examine them as an alternative to CRFs. We present here several chemical named entity recognition systems. The first system translates the traditional CRF-based idioms into a deep learning framework, using rich per-token features and neural word embeddings, and producing a sequence of tags using bidirectional long short term memory (LSTM) networks—a type of recurrent neural net. The second system eschews the rich feature set—and even tokenisation—in favour of character labelling using neural character embeddings and multiple LSTM layers. The third system is an ensemble that combines the results of the first two systems. Our original BioCreative V.5 competition entry was placed in the top group with the highest F scores, and subsequent using transfer learning have achieved a final F score of 90.33% on the test data (precision 91.47%, recall 89.21%). |
12 | ″ | schema:genre | article |
13 | ″ | schema:inLanguage | en |
14 | ″ | schema:isAccessibleForFree | true |
15 | ″ | schema:isPartOf | N8cfcda5b988d4f5d9d8150c412722aef |
16 | ″ | ″ | Nbcff4a17c5a943439fee24a79553281f |
17 | ″ | ″ | sg:journal.1042252 |
18 | ″ | schema:keywords | CRF |
19 | ″ | ″ | Conditional Random Fields |
20 | ″ | ″ | F-score |
21 | ″ | ″ | LSTM layers |
22 | ″ | ″ | alternative |
23 | ″ | ″ | approach |
24 | ″ | ″ | artificial neural network technique |
25 | ″ | ″ | bidirectional long short-term memory network |
26 | ″ | ″ | character embeddings |
27 | ″ | ″ | character labeling |
28 | ″ | ″ | chemicals |
29 | ″ | ″ | competition entry |
30 | ″ | ″ | data |
31 | ″ | ″ | deep learning |
32 | ″ | ″ | deep learning framework |
33 | ″ | ″ | embedding |
34 | ″ | ″ | ensemble |
35 | ″ | ″ | entity recognition |
36 | ″ | ″ | entity recognition system |
37 | ″ | ″ | entry |
38 | ″ | ″ | favor |
39 | ″ | ″ | features |
40 | ″ | ″ | field |
41 | ″ | ″ | first system |
42 | ″ | ″ | framework |
43 | ″ | ″ | group |
44 | ″ | ″ | highest F-score |
45 | ″ | ″ | idioms |
46 | ″ | ″ | labeling |
47 | ″ | ″ | layer |
48 | ″ | ″ | learning |
49 | ″ | ″ | learning framework |
50 | ″ | ″ | long short-term memory network |
51 | ″ | ″ | memory network |
52 | ″ | ″ | multiple LSTM layers |
53 | ″ | ″ | nets |
54 | ″ | ″ | network |
55 | ″ | ″ | network technique |
56 | ″ | ″ | neural nets |
57 | ″ | ″ | neural network |
58 | ″ | ″ | neural network technique |
59 | ″ | ″ | neural word embeddings |
60 | ″ | ″ | random fields |
61 | ″ | ″ | recognition |
62 | ″ | ″ | recognition system |
63 | ″ | ″ | recurrent neural nets |
64 | ″ | ″ | recurrent neural network |
65 | ″ | ″ | results |
66 | ″ | ″ | rich features |
67 | ″ | ″ | scores |
68 | ″ | ″ | second system |
69 | ″ | ″ | sequence |
70 | ″ | ″ | sequence of tags |
71 | ″ | ″ | short-term memory network |
72 | ″ | ″ | success |
73 | ″ | ″ | system |
74 | ″ | ″ | tags |
75 | ″ | ″ | technique |
76 | ″ | ″ | term memory network |
77 | ″ | ″ | test data |
78 | ″ | ″ | third system |
79 | ″ | ″ | token features |
80 | ″ | ″ | tokenisation |
81 | ″ | ″ | top group |
82 | ″ | ″ | traditional CRF |
83 | ″ | ″ | transfer learning |
84 | ″ | ″ | types |
85 | ″ | ″ | word embeddings |
86 | ″ | schema:name | Chemlistem: chemical named entity recognition using recurrent neural networks |
87 | ″ | schema:pagination | 59 |
88 | ″ | schema:productId | N5f74f6cce80a4a6bb3b9f8d41aa7ab49 |
89 | ″ | ″ | Nb24896cc884e480b86f90ee1fca72902 |
90 | ″ | ″ | Nc91d9a26baca4891aebb1f28f82a9f8b |
91 | ″ | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1110434409 |
92 | ″ | ″ | https://doi.org/10.1186/s13321-018-0313-8 |
93 | ″ | schema:sdDatePublished | 2022-06-01T22:19 |
94 | ″ | schema:sdLicense | https://scigraph.springernature.com/explorer/license/ |
95 | ″ | schema:sdPublisher | N2f637145e4724ed083a289747ebea73d |
96 | ″ | schema:url | https://doi.org/10.1186/s13321-018-0313-8 |
97 | ″ | sgo:license | sg:explorer/license/ |
98 | ″ | sgo:sdDataset | articles |
99 | ″ | rdf:type | schema:ScholarlyArticle |
100 | N2f637145e4724ed083a289747ebea73d | schema:name | Springer Nature - SN SciGraph project |
101 | ″ | rdf:type | schema:Organization |
102 | N5f74f6cce80a4a6bb3b9f8d41aa7ab49 | schema:name | pubmed_id |
103 | ″ | schema:value | 30523437 |
104 | ″ | rdf:type | schema:PropertyValue |
105 | N72233708e7054159bfd77869251b0e53 | rdf:first | sg:person.01110033460.10 |
106 | ″ | rdf:rest | rdf:nil |
107 | N8cfcda5b988d4f5d9d8150c412722aef | schema:volumeNumber | 10 |
108 | ″ | rdf:type | schema:PublicationVolume |
109 | Nb24896cc884e480b86f90ee1fca72902 | schema:name | doi |
110 | ″ | schema:value | 10.1186/s13321-018-0313-8 |
111 | ″ | rdf:type | schema:PropertyValue |
112 | Nbcff4a17c5a943439fee24a79553281f | schema:issueNumber | 1 |
113 | ″ | rdf:type | schema:PublicationIssue |
114 | Nc91d9a26baca4891aebb1f28f82a9f8b | schema:name | dimensions_id |
115 | ″ | schema:value | pub.1110434409 |
116 | ″ | rdf:type | schema:PropertyValue |
117 | Nfd6668d4675a408a80fe97b71bd61b7e | rdf:first | sg:person.010641656713.56 |
118 | ″ | rdf:rest | N72233708e7054159bfd77869251b0e53 |
119 | anzsrc-for:03 | schema:inDefinedTermSet | anzsrc-for: |
120 | ″ | schema:name | Chemical Sciences |
121 | ″ | rdf:type | schema:DefinedTerm |
122 | anzsrc-for:0303 | schema:inDefinedTermSet | anzsrc-for: |
123 | ″ | schema:name | Macromolecular and Materials Chemistry |
124 | ″ | rdf:type | schema:DefinedTerm |
125 | sg:journal.1042252 | schema:issn | 1758-2946 |
126 | ″ | schema:name | Journal of Cheminformatics |
127 | ″ | schema:publisher | Springer Nature |
128 | ″ | rdf:type | schema:Periodical |
129 | sg:person.010641656713.56 | schema:affiliation | grid-institutes:grid.431456.1 |
130 | ″ | schema:familyName | Corbett |
131 | ″ | schema:givenName | Peter |
132 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010641656713.56 |
133 | ″ | rdf:type | schema:Person |
134 | sg:person.01110033460.10 | schema:affiliation | grid-institutes:grid.431456.1 |
135 | ″ | schema:familyName | Boyle |
136 | ″ | schema:givenName | John |
137 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01110033460.10 |
138 | ″ | rdf:type | schema:Person |
139 | sg:pub.10.1038/nature14539 | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1010020120 |
140 | ″ | ″ | https://doi.org/10.1038/nature14539 |
141 | ″ | rdf:type | schema:CreativeWork |
142 | sg:pub.10.1186/1471-2105-9-s11-s4 | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1009242068 |
143 | ″ | ″ | https://doi.org/10.1186/1471-2105-9-s11-s4 |
144 | ″ | rdf:type | schema:CreativeWork |
145 | sg:pub.10.1186/1758-2946-3-41 | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1012459844 |
146 | ″ | ″ | https://doi.org/10.1186/1758-2946-3-41 |
147 | ″ | rdf:type | schema:CreativeWork |
148 | sg:pub.10.1186/1758-2946-7-s1-s3 | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1043621969 |
149 | ″ | ″ | https://doi.org/10.1186/1758-2946-7-s1-s3 |
150 | ″ | rdf:type | schema:CreativeWork |
151 | sg:pub.10.1186/s40537-016-0043-6 | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1046078126 |
152 | ″ | ″ | https://doi.org/10.1186/s40537-016-0043-6 |
153 | ″ | rdf:type | schema:CreativeWork |
154 | grid-institutes:grid.431456.1 | schema:alternateName | Data Science Group, Technology Department, The Royal Society of Chemistry, Cambridge, UK |
155 | ″ | schema:name | Data Science Group, Technology Department, The Royal Society of Chemistry, Cambridge, UK |
156 | ″ | rdf:type | schema:Organization |
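
In the table, the ordered author list is encoded as an RDF collection: `schema:author` (row 3) points at a blank node, and `rdf:first` / `rdf:rest` links (rows 105-106 and 117-118) chain the entries until `rdf:nil`. The sketch below walks that chain to recover the author order. The triples are copied from those rows; the function name `walk_rdf_list` is hypothetical.

```python
RDF_NIL = "rdf:nil"

# Rows 105-106 and 117-118 of the table, as (subject, predicate, object).
TRIPLES = [
    ("N72233708e7054159bfd77869251b0e53", "rdf:first", "sg:person.01110033460.10"),
    ("N72233708e7054159bfd77869251b0e53", "rdf:rest", RDF_NIL),
    ("Nfd6668d4675a408a80fe97b71bd61b7e", "rdf:first", "sg:person.010641656713.56"),
    ("Nfd6668d4675a408a80fe97b71bd61b7e", "rdf:rest", "N72233708e7054159bfd77869251b0e53"),
]


def walk_rdf_list(head: str, triples: list) -> list:
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of schema:author in row 3.
authors = walk_rdf_list("Nfd6668d4675a408a80fe97b71bd61b7e", TRIPLES)
print(authors)  # ['sg:person.010641656713.56', 'sg:person.01110033460.10']
```

The recovered order (Corbett, then Boyle) matches the `author` array in the JSON-LD record.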