Ontology type: schema:Chapter
2014
AUTHORS: Alexey Cheptsov, Axel Tenschert, Paul Schmidt, Birte Glimm, Mauricio Matthesius, Thorsten Liebig
ABSTRACT: A good deal of digital data produced in academia, commerce and industry is made up of a raw, unstructured text, such as Word documents, Excel tables, emails, web pages, etc., which are also often represented in a natural language. An important analytical task in a number of scientific and technological domains is to retrieve information from text data, aiming to get a deeper insight into the content represented by the data in order to obtain some useful, often not explicitly stated knowledge and facts, related to a particular domain of interest. The major challenge is the size, structural complexity, and frequency of the analysed text sets’ updates (i.e., the ‘big data’ aspect), which makes the use of traditional analysis techniques and tools impossible. We introduce an innovative approach to analyse unstructured text data. This allows for improving traditional data mining techniques by adopting algorithms from ontological domain modelling, natural language processing, and machine learning. The technique is inherently designed with parallelism in mind, which allows for high performance on large-scale Cloud computing infrastructures.
PAGES: 62-74
Web Information Systems Engineering – WISE 2013 Workshops
ISBN
978-3-642-54369-2
978-3-642-54370-8
http://scigraph.springernature.com/pub.10.1007/978-3-642-54370-8_6
DOI: http://dx.doi.org/10.1007/978-3-642-54370-8_6
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1001886785
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Artificial Intelligence and Image Processing",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information Systems",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "High-Performance Computing Center Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany",
"id": "http://www.grid.ac/institutes/None",
"name": [
"High-Performance Computing Center Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany"
],
"type": "Organization"
},
"familyName": "Cheptsov",
"givenName": "Alexey",
"id": "sg:person.010137572622.04",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010137572622.04"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "High-Performance Computing Center Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany",
"id": "http://www.grid.ac/institutes/None",
"name": [
"High-Performance Computing Center Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany"
],
"type": "Organization"
},
"familyName": "Tenschert",
"givenName": "Axel",
"type": "Person"
},
{
"affiliation": {
"alternateName": "Institute of the Society for the Promotion of Applied Information Sciences, Saarland University, Martin-Luther-Str. 14, 66111, Saarbr\u00fccken, Germany",
"id": "http://www.grid.ac/institutes/grid.11749.3a",
"name": [
"Institute of the Society for the Promotion of Applied Information Sciences, Saarland University, Martin-Luther-Str. 14, 66111, Saarbr\u00fccken, Germany"
],
"type": "Organization"
},
"familyName": "Schmidt",
"givenName": "Paul",
"type": "Person"
},
{
"affiliation": {
"alternateName": "Institute of Artificial Intelligence, University of Ulm, 89069, Ulm, Germany",
"id": "http://www.grid.ac/institutes/grid.6582.9",
"name": [
"Institute of Artificial Intelligence, University of Ulm, 89069, Ulm, Germany"
],
"type": "Organization"
},
"familyName": "Glimm",
"givenName": "Birte",
"id": "sg:person.015234565343.35",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015234565343.35"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Objectivity, Inc., 3099 North First Street, Suite 200, 95134, San Jose, CA, USA",
"id": "http://www.grid.ac/institutes/None",
"name": [
"Objectivity, Inc., 3099 North First Street, Suite 200, 95134, San Jose, CA, USA"
],
"type": "Organization"
},
"familyName": "Matthesius",
"givenName": "Mauricio",
"type": "Person"
},
{
"affiliation": {
"alternateName": "derivo GmbH, James-Franck-Ring, 89081, Ulm, Germany",
"id": "http://www.grid.ac/institutes/None",
"name": [
"derivo GmbH, James-Franck-Ring, 89081, Ulm, Germany"
],
"type": "Organization"
},
"familyName": "Liebig",
"givenName": "Thorsten",
"id": "sg:person.014437204743.19",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014437204743.19"
],
"type": "Person"
}
],
"datePublished": "2014",
"datePublishedReg": "2014-01-01",
"description": "A good deal of digital data produced in academia, commerce and industry is made up of a raw, unstructured text, such as Word documents, Excel tables, emails, web pages, etc., which are also often represented in a natural language. An important analytical task in a number of scientific and technological domains is to retrieve information from text data, aiming to get a deeper insight into the content represented by the data in order to obtain some useful, often not explicitly stated knowledge and facts, related to a particular domain of interest. The major challenge is the size, structural complexity, and frequency of the analysed text sets\u2019 updates (i.e., the \u2018big data\u2019 aspect), which makes the use of traditional analysis techniques and tools impossible. We introduce an innovative approach to analyse unstructured text data. This allows for improving traditional data mining techniques by adopting algorithms from ontological domain modelling, natural language processing, and machine learning. The technique is inherently designed with parallelism in mind, which allows for high performance on large-scale Cloud computing infrastructures.",
"editor": [
{
"familyName": "Huang",
"givenName": "Zhisheng",
"type": "Person"
},
{
"familyName": "Liu",
"givenName": "Chengfei",
"type": "Person"
},
{
"familyName": "He",
"givenName": "Jing",
"type": "Person"
},
{
"familyName": "Huang",
"givenName": "Guangyan",
"type": "Person"
}
],
"genre": "chapter",
"id": "sg:pub.10.1007/978-3-642-54370-8_6",
"inLanguage": "en",
"isAccessibleForFree": false,
"isPartOf": {
"isbn": [
"978-3-642-54369-2",
"978-3-642-54370-8"
],
"name": "Web Information Systems Engineering \u2013 WISE 2013 Workshops",
"type": "Book"
},
"keywords": [
"natural language processing",
"mining techniques",
"text data",
"language processing",
"traditional data mining techniques",
"traditional text mining techniques",
"cloud computing infrastructures",
"service cloud platform",
"unstructured text data",
"data mining techniques",
"text mining techniques",
"computing infrastructures",
"cloud platform",
"unstructured text",
"ontology modelling",
"web pages",
"scalable data",
"machine learning",
"traditional analysis techniques",
"natural language",
"Word documents",
"digital data",
"particular domain",
"text sets",
"domain modelling",
"analytical tasks",
"Excel tables",
"high performance",
"technological domains",
"important analytical task",
"analysis techniques",
"major challenge",
"processing",
"parallelism",
"commerce",
"algorithm",
"infrastructure",
"innovative approach",
"pages",
"technique",
"platform",
"email",
"task",
"learning",
"complexity",
"documents",
"language",
"domain",
"data",
"modelling",
"academia",
"information",
"set",
"update",
"text",
"tool",
"deeper insight",
"performance",
"table",
"structural complexity",
"challenges",
"knowledge",
"industry",
"good deal",
"deal",
"order",
"interest",
"number",
"use",
"fact",
"mind",
"content",
"insights",
"size",
"frequency",
"approach"
],
"name": "Introducing a New Scalable Data-as-a-Service Cloud Platform for Enriching Traditional Text Mining Techniques by Integrating Ontology Modelling and Natural Language Processing",
"pagination": "62-74",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1001886785"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/978-3-642-54370-8_6"
]
}
],
"publisher": {
"name": "Springer Nature",
"type": "Organisation"
},
"sameAs": [
"https://doi.org/10.1007/978-3-642-54370-8_6",
"https://app.dimensions.ai/details/publication/pub.1001886785"
],
"sdDataset": "chapters",
"sdDatePublished": "2022-06-01T22:31",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220601/entities/gbq_results/chapter/chapter_292.jsonl",
"type": "Chapter",
"url": "https://doi.org/10.1007/978-3-642-54370-8_6"
}
]
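As a minimal sketch of how a record like the one above could be consumed, the following Python snippet reads the author names and the DOI from a trimmed copy of the JSON-LD. The `RECORD` excerpt and the helper names `author_names` and `doi_of` are illustrative assumptions, not part of SciGraph.

```python
import json

# Trimmed, hand-copied excerpt of the record above (illustrative only).
RECORD = json.loads("""
[
  {
    "author": [
      {"familyName": "Cheptsov", "givenName": "Alexey", "type": "Person"},
      {"familyName": "Tenschert", "givenName": "Axel", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1001886785"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-642-54370-8_6"]}
    ],
    "type": "Chapter"
  }
]
""")


def author_names(record):
    """Return 'Given Family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def doi_of(record):
    """Return the first DOI found under productId, or None if absent."""
    for product in record.get("productId", []):
        if product.get("name") == "doi" and product.get("value"):
            return product["value"][0]
    return None


chapter = RECORD[0]  # the top-level JSON array holds a single record
print(author_names(chapter))
print(doi_of(chapter))
```

The DOI is looked up by the `name` field rather than by position, since the order of the `productId` entries is not something the record guarantees.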
The RDF metadata can be downloaded as JSON-LD, N-Triples, Turtle, or RDF/XML.
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-54370-8_6'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-54370-8_6'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-54370-8_6'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-54370-8_6'
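The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The names `MEDIA_TYPES`, `build_request`, and `fetch` are assumptions made for this example, and whether the endpoint still answers these requests has not been verified here.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-54370-8_6"

# Media types taken from the curl commands above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch("turtle")` would then correspond to the Turtle curl command; `build_request` is kept separate so the headers can be inspected without touching the network.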
This table displays all metadata directly associated with this object as RDF triples.
197 TRIPLES
23 PREDICATES
103 URIs
95 LITERALS
7 BLANK NODES
# | Subject | Predicate | Object |
---|---|---|---|
1 | sg:pub.10.1007/978-3-642-54370-8_6 | schema:about | anzsrc-for:08 |
2 | ″ | ″ | anzsrc-for:0801 |
3 | ″ | ″ | anzsrc-for:0806 |
4 | ″ | schema:author | Nda4db0c9e36c4d05976dad932eedeb4e |
5 | ″ | schema:datePublished | 2014 |
6 | ″ | schema:datePublishedReg | 2014-01-01 |
7 | ″ | schema:description | A good deal of digital data produced in academia, commerce and industry is made up of a raw, unstructured text, such as Word documents, Excel tables, emails, web pages, etc., which are also often represented in a natural language. An important analytical task in a number of scientific and technological domains is to retrieve information from text data, aiming to get a deeper insight into the content represented by the data in order to obtain some useful, often not explicitly stated knowledge and facts, related to a particular domain of interest. The major challenge is the size, structural complexity, and frequency of the analysed text sets’ updates (i.e., the ‘big data’ aspect), which makes the use of traditional analysis techniques and tools impossible. We introduce an innovative approach to analyse unstructured text data. This allows for improving traditional data mining techniques by adopting algorithms from ontological domain modelling, natural language processing, and machine learning. The technique is inherently designed with parallelism in mind, which allows for high performance on large-scale Cloud computing infrastructures. |
8 | ″ | schema:editor | N7bf4bdc11ef14466a2fda3334d88f051 |
9 | ″ | schema:genre | chapter |
10 | ″ | schema:inLanguage | en |
11 | ″ | schema:isAccessibleForFree | false |
12 | ″ | schema:isPartOf | N481f2ac6604c430788e3c201662121d6 |
13 | ″ | schema:keywords | Excel tables |
14 | ″ | ″ | Word documents |
15 | ″ | ″ | academia |
16 | ″ | ″ | algorithm |
17 | ″ | ″ | analysis techniques |
18 | ″ | ″ | analytical tasks |
19 | ″ | ″ | approach |
20 | ″ | ″ | challenges |
21 | ″ | ″ | cloud computing infrastructures |
22 | ″ | ″ | cloud platform |
23 | ″ | ″ | commerce |
24 | ″ | ″ | complexity |
25 | ″ | ″ | computing infrastructures |
26 | ″ | ″ | content |
27 | ″ | ″ | data |
28 | ″ | ″ | data mining techniques |
29 | ″ | ″ | deal |
30 | ″ | ″ | deeper insight |
31 | ″ | ″ | digital data |
32 | ″ | ″ | documents |
33 | ″ | ″ | domain |
34 | ″ | ″ | domain modelling |
35 | ″ | ″ | email |
36 | ″ | ″ | fact |
37 | ″ | ″ | frequency |
38 | ″ | ″ | good deal |
39 | ″ | ″ | high performance |
40 | ″ | ″ | important analytical task |
41 | ″ | ″ | industry |
42 | ″ | ″ | information |
43 | ″ | ″ | infrastructure |
44 | ″ | ″ | innovative approach |
45 | ″ | ″ | insights |
46 | ″ | ″ | interest |
47 | ″ | ″ | knowledge |
48 | ″ | ″ | language |
49 | ″ | ″ | language processing |
50 | ″ | ″ | learning |
51 | ″ | ″ | machine learning |
52 | ″ | ″ | major challenge |
53 | ″ | ″ | mind |
54 | ″ | ″ | mining techniques |
55 | ″ | ″ | modelling |
56 | ″ | ″ | natural language |
57 | ″ | ″ | natural language processing |
58 | ″ | ″ | number |
59 | ″ | ″ | ontology modelling |
60 | ″ | ″ | order |
61 | ″ | ″ | pages |
62 | ″ | ″ | parallelism |
63 | ″ | ″ | particular domain |
64 | ″ | ″ | performance |
65 | ″ | ″ | platform |
66 | ″ | ″ | processing |
67 | ″ | ″ | scalable data |
68 | ″ | ″ | service cloud platform |
69 | ″ | ″ | set |
70 | ″ | ″ | size |
71 | ″ | ″ | structural complexity |
72 | ″ | ″ | table |
73 | ″ | ″ | task |
74 | ″ | ″ | technique |
75 | ″ | ″ | technological domains |
76 | ″ | ″ | text |
77 | ″ | ″ | text data |
78 | ″ | ″ | text mining techniques |
79 | ″ | ″ | text sets |
80 | ″ | ″ | tool |
81 | ″ | ″ | traditional analysis techniques |
82 | ″ | ″ | traditional data mining techniques |
83 | ″ | ″ | traditional text mining techniques |
84 | ″ | ″ | unstructured text |
85 | ″ | ″ | unstructured text data |
86 | ″ | ″ | update |
87 | ″ | ″ | use |
88 | ″ | ″ | web pages |
89 | ″ | schema:name | Introducing a New Scalable Data-as-a-Service Cloud Platform for Enriching Traditional Text Mining Techniques by Integrating Ontology Modelling and Natural Language Processing |
90 | ″ | schema:pagination | 62-74 |
91 | ″ | schema:productId | N2bdc267580c742bda741a345224552c4 |
92 | ″ | ″ | Nea4ec086184649dab48fe30a6b42db4a |
93 | ″ | schema:publisher | N467b2f6f18244617869a6db27aa18189 |
94 | ″ | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1001886785 |
95 | ″ | ″ | https://doi.org/10.1007/978-3-642-54370-8_6 |
96 | ″ | schema:sdDatePublished | 2022-06-01T22:31 |
97 | ″ | schema:sdLicense | https://scigraph.springernature.com/explorer/license/ |
98 | ″ | schema:sdPublisher | N2cf6167d1ed24d72b636c0e240cbc5cc |
99 | ″ | schema:url | https://doi.org/10.1007/978-3-642-54370-8_6 |
100 | ″ | sgo:license | sg:explorer/license/ |
101 | ″ | sgo:sdDataset | chapters |
102 | ″ | rdf:type | schema:Chapter |
103 | N07c83a5a06a9432588809f7dafbefc28 | schema:familyName | Huang |
104 | ″ | schema:givenName | Guangyan |
105 | ″ | rdf:type | schema:Person |
106 | N16161394677c4536bddaf5d084a7fa03 | schema:familyName | Liu |
107 | ″ | schema:givenName | Chengfei |
108 | ″ | rdf:type | schema:Person |
109 | N22e6b55d1a8d458f9adc3ce51a364e7e | schema:familyName | He |
110 | ″ | schema:givenName | Jing |
111 | ″ | rdf:type | schema:Person |
112 | N2a0ceecdc4b24b6f9dd984dbe9167167 | rdf:first | sg:person.015234565343.35 |
113 | ″ | rdf:rest | N91dc5495e0bb4aa288413102a548815b |
114 | N2bdc267580c742bda741a345224552c4 | schema:name | doi |
115 | ″ | schema:value | 10.1007/978-3-642-54370-8_6 |
116 | ″ | rdf:type | schema:PropertyValue |
117 | N2c05e72d50554fa0977f03320e2a3e23 | schema:affiliation | grid-institutes:None |
118 | ″ | schema:familyName | Tenschert |
119 | ″ | schema:givenName | Axel |
120 | ″ | rdf:type | schema:Person |
121 | N2cf6167d1ed24d72b636c0e240cbc5cc | schema:name | Springer Nature - SN SciGraph project |
122 | ″ | rdf:type | schema:Organization |
123 | N3a853fa9cacd4856b3645356f4f5567c | rdf:first | N22e6b55d1a8d458f9adc3ce51a364e7e |
124 | ″ | rdf:rest | Ne48f93e732f24037a6e3d32507f59bfe |
125 | N467b2f6f18244617869a6db27aa18189 | schema:name | Springer Nature |
126 | ″ | rdf:type | schema:Organisation |
127 | N481f2ac6604c430788e3c201662121d6 | schema:isbn | 978-3-642-54369-2 |
128 | ″ | ″ | 978-3-642-54370-8 |
129 | ″ | schema:name | Web Information Systems Engineering – WISE 2013 Workshops |
130 | ″ | rdf:type | schema:Book |
131 | N7bf4bdc11ef14466a2fda3334d88f051 | rdf:first | N92fa07f2af254279b6aadf78cf22e363 |
132 | ″ | rdf:rest | Nce26721eb99e4a6097f590b15df4b59b |
133 | N91dc5495e0bb4aa288413102a548815b | rdf:first | Nc4518be2e2d54b21bbafd8673b55aa26 |
134 | ″ | rdf:rest | Ncbbfe2a67a4d44ad832c5905c316ce91 |
135 | N92fa07f2af254279b6aadf78cf22e363 | schema:familyName | Huang |
136 | ″ | schema:givenName | Zhisheng |
137 | ″ | rdf:type | schema:Person |
138 | Na890d94d076641c38ab135ae4b78f2ca | rdf:first | N2c05e72d50554fa0977f03320e2a3e23 |
139 | ″ | rdf:rest | Nd9e98d2887cb4ba39f346ca7430db64b |
140 | Nbf13a8a89cc742b193db1e8310992451 | schema:affiliation | grid-institutes:grid.11749.3a |
141 | ″ | schema:familyName | Schmidt |
142 | ″ | schema:givenName | Paul |
143 | ″ | rdf:type | schema:Person |
144 | Nc4518be2e2d54b21bbafd8673b55aa26 | schema:affiliation | grid-institutes:None |
145 | ″ | schema:familyName | Matthesius |
146 | ″ | schema:givenName | Mauricio |
147 | ″ | rdf:type | schema:Person |
148 | Ncbbfe2a67a4d44ad832c5905c316ce91 | rdf:first | sg:person.014437204743.19 |
149 | ″ | rdf:rest | rdf:nil |
150 | Nce26721eb99e4a6097f590b15df4b59b | rdf:first | N16161394677c4536bddaf5d084a7fa03 |
151 | ″ | rdf:rest | N3a853fa9cacd4856b3645356f4f5567c |
152 | Nd9e98d2887cb4ba39f346ca7430db64b | rdf:first | Nbf13a8a89cc742b193db1e8310992451 |
153 | ″ | rdf:rest | N2a0ceecdc4b24b6f9dd984dbe9167167 |
154 | Nda4db0c9e36c4d05976dad932eedeb4e | rdf:first | sg:person.010137572622.04 |
155 | ″ | rdf:rest | Na890d94d076641c38ab135ae4b78f2ca |
156 | Ne48f93e732f24037a6e3d32507f59bfe | rdf:first | N07c83a5a06a9432588809f7dafbefc28 |
157 | ″ | rdf:rest | rdf:nil |
158 | Nea4ec086184649dab48fe30a6b42db4a | schema:name | dimensions_id |
159 | ″ | schema:value | pub.1001886785 |
160 | ″ | rdf:type | schema:PropertyValue |
161 | anzsrc-for:08 | schema:inDefinedTermSet | anzsrc-for: |
162 | ″ | schema:name | Information and Computing Sciences |
163 | ″ | rdf:type | schema:DefinedTerm |
164 | anzsrc-for:0801 | schema:inDefinedTermSet | anzsrc-for: |
165 | ″ | schema:name | Artificial Intelligence and Image Processing |
166 | ″ | rdf:type | schema:DefinedTerm |
167 | anzsrc-for:0806 | schema:inDefinedTermSet | anzsrc-for: |
168 | ″ | schema:name | Information Systems |
169 | ″ | rdf:type | schema:DefinedTerm |
170 | sg:person.010137572622.04 | schema:affiliation | grid-institutes:None |
171 | ″ | schema:familyName | Cheptsov |
172 | ″ | schema:givenName | Alexey |
173 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010137572622.04 |
174 | ″ | rdf:type | schema:Person |
175 | sg:person.014437204743.19 | schema:affiliation | grid-institutes:None |
176 | ″ | schema:familyName | Liebig |
177 | ″ | schema:givenName | Thorsten |
178 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014437204743.19 |
179 | ″ | rdf:type | schema:Person |
180 | sg:person.015234565343.35 | schema:affiliation | grid-institutes:grid.6582.9 |
181 | ″ | schema:familyName | Glimm |
182 | ″ | schema:givenName | Birte |
183 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015234565343.35 |
184 | ″ | rdf:type | schema:Person |
185 | grid-institutes:None | schema:alternateName | High-Performance Computing Center Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany |
186 | ″ | ″ | Objectivity, Inc., 3099 North First Street, Suite 200, 95134, San Jose, CA, USA |
187 | ″ | ″ | derivo GmbH, James-Franck-Ring, 89081, Ulm, Germany |
188 | ″ | schema:name | High-Performance Computing Center Stuttgart, Nobelstr. 19, 70569, Stuttgart, Germany |
189 | ″ | ″ | Objectivity, Inc., 3099 North First Street, Suite 200, 95134, San Jose, CA, USA |
190 | ″ | ″ | derivo GmbH, James-Franck-Ring, 89081, Ulm, Germany |
191 | ″ | rdf:type | schema:Organization |
192 | grid-institutes:grid.11749.3a | schema:alternateName | Institute of the Society for the Promotion of Applied Information Sciences, Saarland University, Martin-Luther-Str. 14, 66111, Saarbrücken, Germany |
193 | ″ | schema:name | Institute of the Society for the Promotion of Applied Information Sciences, Saarland University, Martin-Luther-Str. 14, 66111, Saarbrücken, Germany |
194 | ″ | rdf:type | schema:Organization |
195 | grid-institutes:grid.6582.9 | schema:alternateName | Institute of Artificial Intelligence, University of Ulm, 89069, Ulm, Germany |
196 | ″ | schema:name | Institute of Artificial Intelligence, University of Ulm, 89069, Ulm, Germany |
197 | ″ | rdf:type | schema:Organization |
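The table stores the ordered author list as an RDF collection: `schema:author` points at a blank node, and each node carries one `rdf:first` (the item) and one `rdf:rest` (the next node, or `rdf:nil` at the end). A minimal sketch of walking that chain follows; the node identifiers and family names are copied from the rows above, while the dictionary layout and the function name `walk_rdf_list` are assumptions for illustration.

```python
RDF_NIL = "rdf:nil"

# node -> (rdf:first, rdf:rest), copied from the rdf:first / rdf:rest rows of the table.
AUTHOR_LIST = {
    "Nda4db0c9e36c4d05976dad932eedeb4e": ("sg:person.010137572622.04", "Na890d94d076641c38ab135ae4b78f2ca"),
    "Na890d94d076641c38ab135ae4b78f2ca": ("N2c05e72d50554fa0977f03320e2a3e23", "Nd9e98d2887cb4ba39f346ca7430db64b"),
    "Nd9e98d2887cb4ba39f346ca7430db64b": ("Nbf13a8a89cc742b193db1e8310992451", "N2a0ceecdc4b24b6f9dd984dbe9167167"),
    "N2a0ceecdc4b24b6f9dd984dbe9167167": ("sg:person.015234565343.35", "N91dc5495e0bb4aa288413102a548815b"),
    "N91dc5495e0bb4aa288413102a548815b": ("Nc4518be2e2d54b21bbafd8673b55aa26", "Ncbbfe2a67a4d44ad832c5905c316ce91"),
    "Ncbbfe2a67a4d44ad832c5905c316ce91": ("sg:person.014437204743.19", RDF_NIL),
}

# schema:familyName of each list item, copied from the table.
FAMILY_NAME = {
    "sg:person.010137572622.04": "Cheptsov",
    "N2c05e72d50554fa0977f03320e2a3e23": "Tenschert",
    "Nbf13a8a89cc742b193db1e8310992451": "Schmidt",
    "sg:person.015234565343.35": "Glimm",
    "Nc4518be2e2d54b21bbafd8673b55aa26": "Matthesius",
    "sg:person.014437204743.19": "Liebig",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head until rdf:nil, collecting rdf:first values."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:  # guard against malformed, cyclic lists
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# Row 4 of the table gives the head node of the schema:author list.
authors = [FAMILY_NAME[item] for item in walk_rdf_list("Nda4db0c9e36c4d05976dad932eedeb4e", AUTHOR_LIST)]
print(authors)
```

The recovered order matches the author order in the JSON-LD, which is the reason the list is encoded as a chain rather than as unordered `schema:author` triples.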