Research on multi-feature fusion algorithm for subject words extraction and summary generation of text


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2017-10-16

AUTHORS

Gui-Xian Xu, Hai-Shen Yao, Changzhi Wang

ABSTRACT

Subject words represent the brief information of the text. Text automatic summary reflects its theme and core content. In this paper, the research is conducted on multi-feature fusion algorithm on subject words extraction and summary generation of Tibetan network text. Firstly, Tibetan web pages are collected and preprocessing is conducted to extract the useful information from web pages. Secondly, BCCF algorithm of word segmentation is utilized to cut the text’s words. Then multi-feature fusion algorithm is proposed to extract the subject words of the text. The algorithm takes into account the multi-factors such as the word’s frequency, length, type to calculate the words’ weight and effectively select the text’s subject words. For text summary generation, the algorithm of the sentence weight calculation is designed in terms of the word frequency, position and so on. The algorithm of text summary generation is to compute the sentences’ weight, remove the redundant sentences and form the text summary. The experiments show that multi-feature fusion algorithm of the subject words extraction and the summary generation have reached the better achievement. The research is useful and helpful to the study of Tibetan information processing.

PAGES

1-13

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s10586-017-1219-3

DOI

http://dx.doi.org/10.1007/s10586-017-1219-3

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1092245993



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Minzu University of China", 
          "id": "https://www.grid.ac/institutes/grid.411077.4", 
          "name": [
            "Information Engineering College, Minzu University of China, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Xu", 
        "givenName": "Gui-Xian", 
        "id": "sg:person.011747311776.65", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011747311776.65"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Minzu University of China", 
          "id": "https://www.grid.ac/institutes/grid.411077.4", 
          "name": [
            "Information Engineering College, Minzu University of China, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yao", 
        "givenName": "Hai-Shen", 
        "id": "sg:person.010021477227.81", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010021477227.81"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Minzu University of China", 
          "id": "https://www.grid.ac/institutes/grid.411077.4", 
          "name": [
            "Information Engineering College, Minzu University of China, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Changzhi", 
        "id": "sg:person.015727170027.14", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015727170027.14"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/s10586-016-0561-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014713134", 
          "https://doi.org/10.1007/s10586-016-0561-1"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1639714.1639726", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032910111"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1177/0164027506296758", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043133955"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10586-016-0603-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047211259", 
          "https://doi.org/10.1007/s10586-016-0603-8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/319950.319957", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053570663"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1147/rd.22.0159", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1063180666"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1147/rd.24.0354", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1063180879"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/adl.1998.670375", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093890882"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icdm.2009.121", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094082510"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1017/cbo9780511809071", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1098672059"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2017-10-16", 
    "datePublishedReg": "2017-10-16", 
    "description": "Subject words represent the brief information of the text. Text automatic summary reflects its theme and core content. In this paper, the research is conducted on multi-feature fusion algorithm on subject words extraction and summary generation of Tibetan network text. Firstly, Tibetan web pages are collected and preprocessing is conducted to extract the useful information from web pages. Secondly, BCCF algorithm of word segmentation is utilized to cut the text\u2019s words. Then multi-feature fusion algorithm is proposed to extract the subject words of the text. The algorithm takes into account the multi-factors such as the word\u2019s frequency, length, type to calculate the words\u2019 weight and effectively select the text\u2019s subject words. For text summary generation, the algorithm of the sentence weight calculation is designed in terms of the word frequency, position and so on. The algorithm of text summary generation is to compute the sentences\u2019 weight, remove the redundant sentences and form the text summary. The experiments show that multi-feature fusion algorithm of the subject words extraction and the summary generation have reached the better achievement. The research is useful and helpful to the study of Tibetan information processing.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1007/s10586-017-1219-3", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.7183189", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1046649", 
        "issn": [
          "1386-7857", 
          "1573-7543"
        ], 
        "name": "Cluster Computing", 
        "type": "Periodical"
      }
    ], 
    "name": "Research on multi-feature fusion algorithm for subject words extraction and summary generation of text", 
    "pagination": "1-13", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "f357533b40083928601ad1ca968247cd211a425e10417add8b74efbf2bb2b4ca"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s10586-017-1219-3"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1092245993"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s10586-017-1219-3", 
      "https://app.dimensions.ai/details/publication/pub.1092245993"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T13:31", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8659_00000601.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1007%2Fs10586-017-1219-3"
  }
]
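
As a rough illustration of how a record like this might be consumed, the following Python sketch pulls the author names, the DOI, and the distinct cited identifiers out of a JSON-LD record. The `summarize` helper and the trimmed `RECORD` sample are assumptions made for this example and are not part of SciGraph; the sample repeats one citation on purpose to show the defensive deduplication.

```python
import json

# Trimmed, hand-copied sample of the record; only the fields used below are kept.
# One citation is repeated deliberately to exercise the deduplication step.
RECORD = json.loads("""
[
  {
    "id": "sg:pub.10.1007/s10586-017-1219-3",
    "author": [
      {"familyName": "Xu", "givenName": "Gui-Xian"},
      {"familyName": "Yao", "givenName": "Hai-Shen"},
      {"familyName": "Wang", "givenName": "Changzhi"}
    ],
    "citation": [
      {"id": "sg:pub.10.1007/s10586-016-0561-1"},
      {"id": "sg:pub.10.1007/s10586-016-0561-1"},
      {"id": "https://doi.org/10.1145/1639714.1639726"}
    ],
    "productId": [
      {"name": "dimensions_id", "value": ["pub.1092245993"]},
      {"name": "doi", "value": ["10.1007/s10586-017-1219-3"]}
    ]
  }
]
""")


def summarize(record):
    """Return author names, the DOI, and distinct cited ids (hypothetical helper)."""
    article = record[0]  # the record is a one-element JSON array
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in article["author"]]
    # productId is a list of name/value pairs; pick the entry named "doi".
    doi = next(
        v
        for p in article["productId"]
        if p["name"] == "doi"
        for v in p["value"]
    )
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in article["citation"]))
    return {"authors": authors, "doi": doi, "citations": citations}


summary = summarize(RECORD)
print(summary["authors"])
print(summary["doi"])
print(len(summary["citations"]))
```

Deduplicating on `id` rather than on the whole object is a design choice for this sketch: it treats two entries as the same cited work whenever their identifiers match, even if their `sameAs` lists differ.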
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10586-017-1219-3'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10586-017-1219-3'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10586-017-1219-3'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10586-017-1219-3'
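
The same content negotiation can be sketched in Python with the standard library. The media types and the URL are taken from the curl commands above; the `MEDIA_TYPES` table and the `build_request` and `fetch` helpers are names invented for this example, and whether the endpoint still answers is not something this page confirms.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s10586-017-1219-3"

# Short format names (assumed labels) mapped to the Accept values used by curl above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build, but do not send, a GET request for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

If the service is reachable, `fetch("turtle")` would return the Turtle serialization as a string, and `json.loads(fetch("json-ld"))` would give a structure like the JSON-LD record shown earlier on this page.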


 

This table displays all metadata directly associated with this object as RDF triples.

103 TRIPLES      21 PREDICATES      34 URIs      16 LITERALS      5 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s10586-017-1219-3 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N578865e1f0704b23937a456d3c12e5c7
4 schema:citation sg:pub.10.1007/s10586-016-0561-1
5 sg:pub.10.1007/s10586-016-0603-8
6 https://doi.org/10.1017/cbo9780511809071
7 https://doi.org/10.1109/adl.1998.670375
8 https://doi.org/10.1109/icdm.2009.121
9 https://doi.org/10.1145/1639714.1639726
10 https://doi.org/10.1145/319950.319957
11 https://doi.org/10.1147/rd.22.0159
12 https://doi.org/10.1147/rd.24.0354
13 https://doi.org/10.1177/0164027506296758
14 schema:datePublished 2017-10-16
15 schema:datePublishedReg 2017-10-16
16 schema:description Subject words represent the brief information of the text. Text automatic summary reflects its theme and core content. In this paper, the research is conducted on multi-feature fusion algorithm on subject words extraction and summary generation of Tibetan network text. Firstly, Tibetan web pages are collected and preprocessing is conducted to extract the useful information from web pages. Secondly, BCCF algorithm of word segmentation is utilized to cut the text’s words. Then multi-feature fusion algorithm is proposed to extract the subject words of the text. The algorithm takes into account the multi-factors such as the word’s frequency, length, type to calculate the words’ weight and effectively select the text’s subject words. For text summary generation, the algorithm of the sentence weight calculation is designed in terms of the word frequency, position and so on. The algorithm of text summary generation is to compute the sentences’ weight, remove the redundant sentences and form the text summary. The experiments show that multi-feature fusion algorithm of the subject words extraction and the summary generation have reached the better achievement. The research is useful and helpful to the study of Tibetan information processing.
17 schema:genre research_article
18 schema:inLanguage en
19 schema:isAccessibleForFree false
20 schema:isPartOf sg:journal.1046649
21 schema:name Research on multi-feature fusion algorithm for subject words extraction and summary generation of text
22 schema:pagination 1-13
23 schema:productId N2968bf5ae7654b559d4050f56cac5c66
24 Nad3bb25044384183bcc590fe4b8f652a
25 Nd5a171b9b34647f2b79437bf55658fe0
26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092245993
27 https://doi.org/10.1007/s10586-017-1219-3
28 schema:sdDatePublished 2019-04-10T13:31
29 schema:sdLicense https://scigraph.springernature.com/explorer/license/
30 schema:sdPublisher Ne75bfe40039643bca144205e91c107e3
31 schema:url https://link.springer.com/10.1007%2Fs10586-017-1219-3
32 sgo:license sg:explorer/license/
33 sgo:sdDataset articles
34 rdf:type schema:ScholarlyArticle
35 N2577e356f0354fcd921dbb6448999507 rdf:first sg:person.010021477227.81
36 rdf:rest Na03ca9cdc1ae4b90811294780d477850
37 N2968bf5ae7654b559d4050f56cac5c66 schema:name dimensions_id
38 schema:value pub.1092245993
39 rdf:type schema:PropertyValue
40 N578865e1f0704b23937a456d3c12e5c7 rdf:first sg:person.011747311776.65
41 rdf:rest N2577e356f0354fcd921dbb6448999507
42 Na03ca9cdc1ae4b90811294780d477850 rdf:first sg:person.015727170027.14
43 rdf:rest rdf:nil
44 Nad3bb25044384183bcc590fe4b8f652a schema:name readcube_id
45 schema:value f357533b40083928601ad1ca968247cd211a425e10417add8b74efbf2bb2b4ca
46 rdf:type schema:PropertyValue
47 Nd5a171b9b34647f2b79437bf55658fe0 schema:name doi
48 schema:value 10.1007/s10586-017-1219-3
49 rdf:type schema:PropertyValue
50 Ne75bfe40039643bca144205e91c107e3 schema:name Springer Nature - SN SciGraph project
51 rdf:type schema:Organization
52 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
53 schema:name Information and Computing Sciences
54 rdf:type schema:DefinedTerm
55 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
56 schema:name Artificial Intelligence and Image Processing
57 rdf:type schema:DefinedTerm
58 sg:grant.7183189 http://pending.schema.org/fundedItem sg:pub.10.1007/s10586-017-1219-3
59 rdf:type schema:MonetaryGrant
60 sg:journal.1046649 schema:issn 1386-7857
61 1573-7543
62 schema:name Cluster Computing
63 rdf:type schema:Periodical
64 sg:person.010021477227.81 schema:affiliation https://www.grid.ac/institutes/grid.411077.4
65 schema:familyName Yao
66 schema:givenName Hai-Shen
67 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010021477227.81
68 rdf:type schema:Person
69 sg:person.011747311776.65 schema:affiliation https://www.grid.ac/institutes/grid.411077.4
70 schema:familyName Xu
71 schema:givenName Gui-Xian
72 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011747311776.65
73 rdf:type schema:Person
74 sg:person.015727170027.14 schema:affiliation https://www.grid.ac/institutes/grid.411077.4
75 schema:familyName Wang
76 schema:givenName Changzhi
77 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015727170027.14
78 rdf:type schema:Person
79 sg:pub.10.1007/s10586-016-0561-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014713134
80 https://doi.org/10.1007/s10586-016-0561-1
81 rdf:type schema:CreativeWork
82 sg:pub.10.1007/s10586-016-0603-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047211259
83 https://doi.org/10.1007/s10586-016-0603-8
84 rdf:type schema:CreativeWork
85 https://doi.org/10.1017/cbo9780511809071 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098672059
86 rdf:type schema:CreativeWork
87 https://doi.org/10.1109/adl.1998.670375 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093890882
88 rdf:type schema:CreativeWork
89 https://doi.org/10.1109/icdm.2009.121 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094082510
90 rdf:type schema:CreativeWork
91 https://doi.org/10.1145/1639714.1639726 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032910111
92 rdf:type schema:CreativeWork
93 https://doi.org/10.1145/319950.319957 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053570663
94 rdf:type schema:CreativeWork
95 https://doi.org/10.1147/rd.22.0159 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063180666
96 rdf:type schema:CreativeWork
97 https://doi.org/10.1147/rd.24.0354 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063180879
98 rdf:type schema:CreativeWork
99 https://doi.org/10.1177/0164027506296758 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043133955
100 rdf:type schema:CreativeWork
101 https://www.grid.ac/institutes/grid.411077.4 schema:alternateName Minzu University of China
102 schema:name Information Engineering College, Minzu University of China, Beijing, China
103 rdf:type schema:Organization
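
In the table, the author order is encoded as an RDF list: `schema:author` points at a blank node, and each blank node carries an `rdf:first` item plus an `rdf:rest` link to the next cell until `rdf:nil`. A minimal Python sketch of walking that chain follows. The blank-node ids and person ids are copied from rows 3 and 35 to 43 and the family names from rows 65, 70, and 75; the `walk_rdf_list` helper and the dictionary layout are assumptions made for this example.

```python
# Each list cell maps to (rdf:first, rdf:rest), copied from the table rows.
CELLS = {
    "N578865e1f0704b23937a456d3c12e5c7": (
        "sg:person.011747311776.65",
        "N2577e356f0354fcd921dbb6448999507",
    ),
    "N2577e356f0354fcd921dbb6448999507": (
        "sg:person.010021477227.81",
        "Na03ca9cdc1ae4b90811294780d477850",
    ),
    "Na03ca9cdc1ae4b90811294780d477850": (
        "sg:person.015727170027.14",
        "rdf:nil",
    ),
}

# schema:familyName values for the three person nodes.
FAMILY_NAMES = {
    "sg:person.011747311776.65": "Xu",
    "sg:person.010021477227.81": "Yao",
    "sg:person.015727170027.14": "Wang",
}


def walk_rdf_list(head, cells):
    """Follow rdf:rest links from head and collect the rdf:first items in order."""
    items = []
    seen = set()
    node = head
    while node != "rdf:nil":
        if node in seen:
            # A well-formed RDF list never revisits a cell.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


# The head cell is the object of schema:author in row 3.
ordered_people = walk_rdf_list("N578865e1f0704b23937a456d3c12e5c7", CELLS)
authors = [FAMILY_NAMES[p] for p in ordered_people]
print(authors)
```

The order of the rows in the table does not matter here: the sequence comes only from the `rdf:rest` links, which is why the list representation preserves author order even though RDF triples themselves are unordered.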
 



