Ontology type: schema:Chapter
2002-12-16
AUTHORS: Jen-Yuan Yeh, Hao-Ren Ke, Wei-Pang Yang
ABSTRACT: In this paper, two novel approaches are proposed to extract important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It brings up three new ideas: 1) to employ ranked position to emphasize the significance of sentence position, 2) to reshape word unit to achieve higher accuracy of keyword importance, and 3) to train a score function by the genetic algorithm for obtaining a suitable combination of feature weights. The second approach combines the ideas of latent semantic analysis and text relationship maps to interpret conceptual structures of a document. Both approaches are applied to Chinese text summarization. The two approaches were evaluated by using a data corpus composed of 100 articles about politics from New Taiwan Weekly, and when the compression ratio was 30%, average recalls of 52.0% and 45.6% were achieved respectively.
PAGES: 76-87
Digital Libraries: People, Knowledge, and Technology
ISBN
978-3-540-00261-1
978-3-540-36227-2
http://scigraph.springernature.com/pub.10.1007/3-540-36227-4_8
DOI: http://dx.doi.org/10.1007/3-540-36227-4_8
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1010939703
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Artificial Intelligence and Image Processing",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Department of Computer & Information Science, National Chiao-Tung University, 1001 Ta Hsueh Rd., 30050, Hsinchu, Taiwan, R.O.C.",
"id": "http://www.grid.ac/institutes/grid.260539.b",
"name": [
"Department of Computer & Information Science, National Chiao-Tung University, 1001 Ta Hsueh Rd., 30050, Hsinchu, Taiwan, R.O.C."
],
"type": "Organization"
},
"familyName": "Yeh",
"givenName": "Jen-Yuan",
"id": "sg:person.013047064577.06",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013047064577.06"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Digital Library & Information Section of Library, National Chiao-Tung University, 1001 Ta Hsueh Rd., 30050, Hsinchu, Taiwan, R.O.C.",
"id": "http://www.grid.ac/institutes/grid.260539.b",
"name": [
"Digital Library & Information Section of Library, National Chiao-Tung University, 1001 Ta Hsueh Rd., 30050, Hsinchu, Taiwan, R.O.C."
],
"type": "Organization"
},
"familyName": "Ke",
"givenName": "Hao-Ren",
"id": "sg:person.015237406177.37",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015237406177.37"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Department of Computer & Information Science, National Chiao-Tung University, 1001 Ta Hsueh Rd., 30050, Hsinchu, Taiwan, R.O.C.",
"id": "http://www.grid.ac/institutes/grid.260539.b",
"name": [
"Department of Computer & Information Science, National Chiao-Tung University, 1001 Ta Hsueh Rd., 30050, Hsinchu, Taiwan, R.O.C."
],
"type": "Organization"
},
"familyName": "Yang",
"givenName": "Wei-Pang",
"id": "sg:person.014374171260.51",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014374171260.51"
],
"type": "Person"
}
],
"datePublished": "2002-12-16",
"datePublishedReg": "2002-12-16",
"description": "In this paper, two novel approaches are proposed to extract important sentences from a document to create its summary. The first is a corpus-based approach using feature analysis. It brings up three new ideas: 1) to employ ranked position to emphasize the significance of sentence position, 2) to reshape word unit to achieve higher accuracy of keyword importance, and 3) to train a score function by the genetic algorithm for obtaining a suitable combination of feature weights. The second approach combines the ideas of latent semantic analysis and text relationship maps to interpret conceptual structures of a document. Both approaches are applied to Chinese text summarization. The two approaches were evaluated by using a data corpus composed of 100 articles about politics from New Taiwan Weekly, and when the compression ratio was 30%, average recalls of 52.0% and 45.6% were achieved respectively.",
"editor": [
{
"familyName": "Lim",
"givenName": "Ee- Peng",
"type": "Person"
},
{
"familyName": "Foo",
"givenName": "Schubert",
"type": "Person"
},
{
"familyName": "Khoo",
"givenName": "Chris",
"type": "Person"
},
{
"familyName": "Chen",
"givenName": "Hsinchun",
"type": "Person"
},
{
"familyName": "Fox",
"givenName": "Edward",
"type": "Person"
},
{
"familyName": "Urs",
"givenName": "Shalini",
"type": "Person"
},
{
"familyName": "Costantino",
"givenName": "Thanos",
"type": "Person"
}
],
"genre": "chapter",
"id": "sg:pub.10.1007/3-540-36227-4_8",
"inLanguage": "en",
"isAccessibleForFree": false,
"isPartOf": {
"isbn": [
"978-3-540-00261-1",
"978-3-540-36227-2"
],
"name": "Digital Libraries: People, Knowledge, and Technology",
"type": "Book"
},
"keywords": [
"corpus-based approach",
"Latent Semantic Analysis",
"semantic analysis",
"Chinese text summarization",
"text summarization",
"important sentences",
"sentence position",
"word units",
"conceptual structure",
"sentences",
"documents",
"idea",
"summarization",
"politics",
"new ideas",
"keyword importance",
"article",
"summarizer",
"novel approach",
"feature analysis",
"high accuracy",
"score function",
"genetic algorithm",
"feature weights",
"relationship map",
"compression ratio",
"average recall",
"trainable summarizer",
"approach",
"position",
"significance",
"second approach",
"paper",
"analysis",
"accuracy",
"importance",
"algorithm",
"suitable combination",
"weekly",
"recall",
"summary",
"maps",
"data",
"units",
"function",
"combination",
"structure",
"weight",
"ratio"
],
"name": "Chinese Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis",
"pagination": "76-87",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1010939703"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/3-540-36227-4_8"
]
}
],
"publisher": {
"name": "Springer Nature",
"type": "Organisation"
},
"sameAs": [
"https://doi.org/10.1007/3-540-36227-4_8",
"https://app.dimensions.ai/details/publication/pub.1010939703"
],
"sdDataset": "chapters",
"sdDatePublished": "2022-05-20T07:49",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/chapter/chapter_49.jsonl",
"type": "Chapter",
"url": "https://doi.org/10.1007/3-540-36227-4_8"
}
]
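As a minimal sketch of how the JSON-LD record above can be read programmatically, the following Python snippet parses a trimmed excerpt of the record and pulls out the title, authors, pages, and DOI. The function name `summarize_record` and the constant `RECORD_JSON` are illustrative names, not part of SciGraph; the field names (`author`, `name`, `pagination`, `productId`) are taken from the record itself.

```python
import json

# Excerpt of the record above, trimmed to the fields used in this sketch.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Yeh", "givenName": "Jen-Yuan", "type": "Person"},
      {"familyName": "Ke", "givenName": "Hao-Ren", "type": "Person"},
      {"familyName": "Yang", "givenName": "Wei-Pang", "type": "Person"}
    ],
    "name": "Chinese Text Summarization Using a Trainable Summarizer and Latent Semantic Analysis",
    "pagination": "76-87",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1010939703"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/3-540-36227-4_8"]}
    ],
    "type": "Chapter"
  }
]
"""


def summarize_record(text):
    """Return title, author names, pages, and DOI from a SciGraph JSON-LD dump.

    Hypothetical helper: assumes the dump is a one-element JSON array,
    as in the record shown above.
    """
    record = json.loads(text)[0]
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # productId is a list of name/value pairs; index it by name for lookup.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": authors,
        "pages": record.get("pagination"),
        "doi": ids.get("doi"),
    }


if __name__ == "__main__":
    for key, value in summarize_record(RECORD_JSON).items():
        print(f"{key}: {value}")
```

Indexing `productId` by its `name` field avoids relying on the order of the identifier entries, which the record does not guarantee.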
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/3-540-36227-4_8'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/3-540-36227-4_8'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/3-540-36227-4_8'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/3-540-36227-4_8'
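The four curl commands above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is a sketch under assumptions: the short format labels, `build_request`, and `fetch` are illustrative names, and `fetch` only works if the endpoint still responds as the curl commands describe.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/3-540-36227-4_8"

# Media types copied from the curl commands above; the keys are
# illustrative short labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a request asking for one RDF serialization."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the body as text.

    Assumes the server is reachable and honours the Accept header.
    """
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Show the header each format would send, without touching the network.
    for fmt in ACCEPT_TYPES:
        req = build_request(fmt)
        print(fmt, "->", req.get_header("Accept"))
```

Keeping `build_request` separate from `fetch` lets the header mapping be checked offline before any request is made.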
Metadata directly associated with this object, expressed as RDF triples:
155 triples
23 predicates
74 URIs
67 literals
7 blank nodes