A Lexicon-Constrained Character Model for Chinese Morphological Analysis


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2005

AUTHORS

Yao Meng , Hao Yu , Fumihito Nishino

ABSTRACT

This paper proposes a lexicon-constrained character model that combines both word and character features to solve complicated issues in Chinese morphological analysis. A Chinese character-based model constrained by a lexicon is built to acquire word building rules. Each character in a Chinese sentence is assigned a tag by the proposed model. The word segmentation and part-of-speech tagging results are then generated based on the character tags. The proposed method solves such problems as unknown word identification, data sparseness, and estimation bias in an integrated, unified framework. Preliminary experiments indicate that the proposed method outperforms the best SIGHAN word segmentation systems in the open track on 3 out of the 4 test corpora. Additionally, our method can be conveniently integrated with any other Chinese morphological systems as a post-processing module leading to significant improvement in performance.
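
To illustrate the step in which segmentation and part-of-speech results are derived from per-character tags, the Python sketch below decodes a tagged character sequence into words. The tag format (a position marker B/M/E/S joined to a POS label by a hyphen), the function name `decode_tags`, and the example sentence are assumptions made for illustration; the abstract does not specify the tag set the authors actually use.

```python
from typing import List, Tuple


def decode_tags(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group characters into (word, pos) pairs from per-character tags.

    Assumed tag format: "<position>-<pos>", where position is
    B (begin), M (middle), E (end) or S (single-character word).
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words: List[Tuple[str, str]] = []
    buffer = ""
    buffer_pos = ""
    for ch, tag in zip(chars, tags):
        position, _, pos = tag.partition("-")
        if position in ("B", "S"):
            # A new word starts here; flush any unfinished word first.
            if buffer:
                words.append((buffer, buffer_pos))
            buffer, buffer_pos = ch, pos
        else:
            # M or E continues the current word.
            buffer += ch
            if not buffer_pos:
                buffer_pos = pos
        if position in ("E", "S"):
            # The word is complete; emit it and reset the buffer.
            words.append((buffer, buffer_pos))
            buffer, buffer_pos = "", ""
    if buffer:
        # Tolerate a sequence that ends without an E or S tag.
        words.append((buffer, buffer_pos))
    return words


# Hypothetical example: four characters, three words.
chars = list("我爱北京")
tags = ["S-r", "S-v", "B-ns", "E-ns"]
result = decode_tags(chars, tags)
print(result)
```

Because a word boundary and a POS label are both read off the same tag, segmentation and tagging come out of a single pass over the characters, which is the sense in which the two tasks are handled together in this sketch.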

PAGES

542-552

Book

TITLE

Natural Language Processing – IJCNLP 2005

ISBN

978-3-540-29172-5
978-3-540-31724-1

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/11562214_48

DOI

http://dx.doi.org/10.1007/11562214_48

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1004572348



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "name": [
            "Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Beijing, P. R. China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Meng", 
        "givenName": "Yao", 
        "id": "sg:person.015016035647.71", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015016035647.71"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Beijing, P. R. China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yu", 
        "givenName": "Hao", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Beijing, P. R. China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Nishino", 
        "givenName": "Fumihito", 
        "id": "sg:person.015634767150.63", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015634767150.63"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.3115/1220355.1220422", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004023044"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119355.1119380", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031461469"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/795458.795460", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032918837"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1072228.1072373", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047034253"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119254", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221105"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119261", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221112"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119269", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221121"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119273", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221125"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119277", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221129"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119278", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221130"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119280", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221132"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1218955.1219014", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221243"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2005", 
    "datePublishedReg": "2005-01-01", 
    "description": "This paper proposes a lexicon-constrained character model that combines both word and character features to solve complicated issues in Chinese morphological analysis. A Chinese character-based model constrained by a lexicon is built to acquire word building rules. Each character in a Chinese sentence is assigned a tag by the proposed model. The word segmentation and part-of-speech tagging results are then generated based on the character tags. The proposed method solves such problems as unknown word identification, data sparseness, and estimation bias in an integrated, unified framework. Preliminary experiments indicate that the proposed method outperforms the best SIGHAN word segmentation systems in the open track on 3 out of the 4 test corpora. Additionally, our method can be conveniently integrated with any other Chinese morphological systems as a post-processing module leading to significant improvement in performance.", 
    "editor": [
      {
        "familyName": "Dale", 
        "givenName": "Robert", 
        "type": "Person"
      }, 
      {
        "familyName": "Wong", 
        "givenName": "Kam-Fai", 
        "type": "Person"
      }, 
      {
        "familyName": "Su", 
        "givenName": "Jian", 
        "type": "Person"
      }, 
      {
        "familyName": "Kwong", 
        "givenName": "Oi Yee", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/11562214_48", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-29172-5", 
        "978-3-540-31724-1"
      ], 
      "name": "Natural Language Processing \u2013 IJCNLP 2005", 
      "type": "Book"
    }, 
    "name": "A Lexicon-Constrained Character Model for Chinese Morphological Analysis", 
    "pagination": "542-552", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1004572348"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/11562214_48"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "15c9bf08ec02d062735687f3a771c86d3db465d0f9e4a559c85d0ec521ed33bf"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/11562214_48", 
      "https://app.dimensions.ai/details/publication/pub.1004572348"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T08:10", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000360_0000000360/records_118346_00000000.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F11562214_48"
  }
]
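
Since the record above is plain JSON, any JSON parser can read it. The Python sketch below works on an abridged copy of the record (only a few keys are kept so that the example is self-contained) and pulls out the title, the author names, the DOI, and the citation ids, dropping any id that is repeated. The helper name `summarize` and the abridged literal are illustrative assumptions, not part of SciGraph.

```python
import json

# Abridged copy of the record above; most keys are omitted for brevity.
RECORD_JSON = """
[
  {
    "name": "A Lexicon-Constrained Character Model for Chinese Morphological Analysis",
    "author": [
      {"familyName": "Meng", "givenName": "Yao"},
      {"familyName": "Yu", "givenName": "Hao"},
      {"familyName": "Nishino", "givenName": "Fumihito"}
    ],
    "citation": [
      {"id": "https://doi.org/10.3115/1220355.1220422"},
      {"id": "https://doi.org/10.3115/1119250.1119254"},
      {"id": "https://doi.org/10.3115/1119250.1119254"}
    ],
    "productId": [
      {"name": "dimensions_id", "value": ["pub.1004572348"]},
      {"name": "doi", "value": ["10.1007/11562214_48"]}
    ]
  }
]
"""


def summarize(record: dict) -> dict:
    """Extract a few human-readable fields from one record object."""
    authors = [
        f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])
    ]
    # The DOI is stored as a PropertyValue entry whose name is "doi".
    doi = next(
        (p["value"][0] for p in record.get("productId", []) if p.get("name") == "doi"),
        None,
    )
    # dict.fromkeys keeps the first occurrence of each id, in order.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {
        "title": record.get("name"),
        "authors": authors,
        "doi": doi,
        "citations": citations,
    }


# The top-level JSON value is a list holding a single record object.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["doi"], len(summary["citations"]))
```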
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'
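
The same content negotiation can be sketched in Python with the standard library. The example below mirrors the four curl commands by setting the `Accept` header on a request; the names `MEDIA_TYPES`, `build_request`, and `fetch` are hypothetical helpers introduced here, and `fetch` is defined but not called, so nothing is assumed about whether the service currently responds.

```python
import urllib.request

# Record URL taken from the curl commands above.
BASE_URL = "https://scigraph.springernature.com/pub.10.1007/11562214_48"

# Short format names (an assumption of this sketch) mapped to the
# media types used in the curl commands.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = BASE_URL) -> urllib.request.Request:
    """Create a GET request asking for the given serialization."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt: str, url: str = BASE_URL, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Build (but do not send) a request for the Turtle serialization.
request = build_request("turtle")
print(request.full_url)
print(request.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check offline; calling `fetch("json-ld")` would perform the actual download.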


 

This table displays all metadata directly associated with this object as RDF triples.

132 TRIPLES      23 PREDICATES      39 URIs      20 LITERALS      8 BLANK NODES

Row Subject Predicate Object
1 sg:pub.10.1007/11562214_48 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N4cc7be687d60478089394fae840e15e4
4 schema:citation https://doi.org/10.1145/795458.795460
5 https://doi.org/10.3115/1072228.1072373
6 https://doi.org/10.3115/1119250.1119254
7 https://doi.org/10.3115/1119250.1119261
8 https://doi.org/10.3115/1119250.1119269
9 https://doi.org/10.3115/1119250.1119273
10 https://doi.org/10.3115/1119250.1119277
11 https://doi.org/10.3115/1119250.1119278
12 https://doi.org/10.3115/1119250.1119280
13 https://doi.org/10.3115/1119355.1119380
14 https://doi.org/10.3115/1218955.1219014
15 https://doi.org/10.3115/1220355.1220422
16 schema:datePublished 2005
17 schema:datePublishedReg 2005-01-01
18 schema:description This paper proposes a lexicon-constrained character model that combines both word and character features to solve complicated issues in Chinese morphological analysis. A Chinese character-based model constrained by a lexicon is built to acquire word building rules. Each character in a Chinese sentence is assigned a tag by the proposed model. The word segmentation and part-of-speech tagging results are then generated based on the character tags. The proposed method solves such problems as unknown word identification, data sparseness, and estimation bias in an integrated, unified framework. Preliminary experiments indicate that the proposed method outperforms the best SIGHAN word segmentation systems in the open track on 3 out of the 4 test corpora. Additionally, our method can be conveniently integrated with any other Chinese morphological systems as a post-processing module leading to significant improvement in performance.
19 schema:editor N8486b954152d45efb347608cb05571c8
20 schema:genre chapter
21 schema:inLanguage en
22 schema:isAccessibleForFree true
23 schema:isPartOf Nb10c43577b824b0f869381eee89544d4
24 schema:name A Lexicon-Constrained Character Model for Chinese Morphological Analysis
25 schema:pagination 542-552
26 schema:productId Ne337c4abd1564a3f827e22c88001d008
27 Nf40ecadeafe742188f5c8638470045ac
28 Nf6590cdbdfa64ea58c3a5cc7ec16d17c
29 schema:publisher Nf36d0f33cb6647a39a3eaf3a498657d6
30 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004572348
31 https://doi.org/10.1007/11562214_48
32 schema:sdDatePublished 2019-04-16T08:10
33 schema:sdLicense https://scigraph.springernature.com/explorer/license/
34 schema:sdPublisher N3958ec5c7cc2499db320b5bebe4ac57e
35 schema:url https://link.springer.com/10.1007%2F11562214_48
36 sgo:license sg:explorer/license/
37 sgo:sdDataset chapters
38 rdf:type schema:Chapter
39 N02a9b46c5df24055bb481e744d116815 schema:name Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Beijing, P. R. China
40 rdf:type schema:Organization
41 N12d361de430342878dc46933e3a55c08 rdf:first Na321d56c6ac04369b9524e799c32c65e
42 rdf:rest N415f12e20f2e4cf2afc533981fede421
43 N1d7ea0d4101c4282837b4f14d4318594 schema:affiliation N02a9b46c5df24055bb481e744d116815
44 schema:familyName Yu
45 schema:givenName Hao
46 rdf:type schema:Person
47 N3958ec5c7cc2499db320b5bebe4ac57e schema:name Springer Nature - SN SciGraph project
48 rdf:type schema:Organization
49 N3a9166322b1644dba79382343d24f869 rdf:first N1d7ea0d4101c4282837b4f14d4318594
50 rdf:rest N79ff0ba9edac49019f3ab5f8cdc9d996
51 N415f12e20f2e4cf2afc533981fede421 rdf:first Nc5984f2282f545b39daf4d4c18a36646
52 rdf:rest rdf:nil
53 N47e732894018403f9d59159a77eab17c schema:familyName Dale
54 schema:givenName Robert
55 rdf:type schema:Person
56 N4cc7be687d60478089394fae840e15e4 rdf:first sg:person.015016035647.71
57 rdf:rest N3a9166322b1644dba79382343d24f869
58 N79ff0ba9edac49019f3ab5f8cdc9d996 rdf:first sg:person.015634767150.63
59 rdf:rest rdf:nil
60 N8486b954152d45efb347608cb05571c8 rdf:first N47e732894018403f9d59159a77eab17c
61 rdf:rest Nb22f8e1eb4484caba41c9d2f36181d36
62 N8a1abb875ed240268492316aa5911fc6 schema:familyName Wong
63 schema:givenName Kam-Fai
64 rdf:type schema:Person
65 N9b0b01e4ff7e474c83c1fdf0367f8a57 schema:name Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Beijing, P. R. China
66 rdf:type schema:Organization
67 Na321d56c6ac04369b9524e799c32c65e schema:familyName Su
68 schema:givenName Jian
69 rdf:type schema:Person
70 Nb10c43577b824b0f869381eee89544d4 schema:isbn 978-3-540-29172-5
71 978-3-540-31724-1
72 schema:name Natural Language Processing – IJCNLP 2005
73 rdf:type schema:Book
74 Nb22f8e1eb4484caba41c9d2f36181d36 rdf:first N8a1abb875ed240268492316aa5911fc6
75 rdf:rest N12d361de430342878dc46933e3a55c08
76 Nc5984f2282f545b39daf4d4c18a36646 schema:familyName Kwong
77 schema:givenName Oi Yee
78 rdf:type schema:Person
79 Ne337c4abd1564a3f827e22c88001d008 schema:name doi
80 schema:value 10.1007/11562214_48
81 rdf:type schema:PropertyValue
82 Nf36d0f33cb6647a39a3eaf3a498657d6 schema:location Berlin, Heidelberg
83 schema:name Springer Berlin Heidelberg
84 rdf:type schema:Organisation
85 Nf40ecadeafe742188f5c8638470045ac schema:name readcube_id
86 schema:value 15c9bf08ec02d062735687f3a771c86d3db465d0f9e4a559c85d0ec521ed33bf
87 rdf:type schema:PropertyValue
88 Nf45c30d7e4934fdda45b1e721c26e5d9 schema:name Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Beijing, P. R. China
89 rdf:type schema:Organization
90 Nf6590cdbdfa64ea58c3a5cc7ec16d17c schema:name dimensions_id
91 schema:value pub.1004572348
92 rdf:type schema:PropertyValue
93 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
94 schema:name Information and Computing Sciences
95 rdf:type schema:DefinedTerm
96 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
97 schema:name Artificial Intelligence and Image Processing
98 rdf:type schema:DefinedTerm
99 sg:person.015016035647.71 schema:affiliation Nf45c30d7e4934fdda45b1e721c26e5d9
100 schema:familyName Meng
101 schema:givenName Yao
102 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015016035647.71
103 rdf:type schema:Person
104 sg:person.015634767150.63 schema:affiliation N9b0b01e4ff7e474c83c1fdf0367f8a57
105 schema:familyName Nishino
106 schema:givenName Fumihito
107 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015634767150.63
108 rdf:type schema:Person
109 https://doi.org/10.1145/795458.795460 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032918837
110 rdf:type schema:CreativeWork
111 https://doi.org/10.3115/1072228.1072373 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047034253
112 rdf:type schema:CreativeWork
113 https://doi.org/10.3115/1119250.1119254 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221105
114 rdf:type schema:CreativeWork
115 https://doi.org/10.3115/1119250.1119261 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221112
116 rdf:type schema:CreativeWork
117 https://doi.org/10.3115/1119250.1119269 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221121
118 rdf:type schema:CreativeWork
119 https://doi.org/10.3115/1119250.1119273 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221125
120 rdf:type schema:CreativeWork
121 https://doi.org/10.3115/1119250.1119277 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221129
122 rdf:type schema:CreativeWork
123 https://doi.org/10.3115/1119250.1119278 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221130
124 rdf:type schema:CreativeWork
125 https://doi.org/10.3115/1119250.1119280 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221132
126 rdf:type schema:CreativeWork
127 https://doi.org/10.3115/1119355.1119380 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031461469
128 rdf:type schema:CreativeWork
129 https://doi.org/10.3115/1218955.1219014 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221243
130 rdf:type schema:CreativeWork
131 https://doi.org/10.3115/1220355.1220422 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004023044
132 rdf:type schema:CreativeWork
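
In the table above, the ordered author list is encoded as an RDF collection: each blank node carries an `rdf:first` item and an `rdf:rest` pointer to the next node, ending at `rdf:nil`. The Python sketch below walks that chain to recover the author order. The triples are copied from the author-related rows of the table; the function name `walk_list` and the tuple representation are assumptions of this example.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]
RDF_NIL = "rdf:nil"

# (subject, predicate, object) triples copied from the table above
# (author list and family names only).
TRIPLES: List[Triple] = [
    ("sg:pub.10.1007/11562214_48", "schema:author", "N4cc7be687d60478089394fae840e15e4"),
    ("N4cc7be687d60478089394fae840e15e4", "rdf:first", "sg:person.015016035647.71"),
    ("N4cc7be687d60478089394fae840e15e4", "rdf:rest", "N3a9166322b1644dba79382343d24f869"),
    ("N3a9166322b1644dba79382343d24f869", "rdf:first", "N1d7ea0d4101c4282837b4f14d4318594"),
    ("N3a9166322b1644dba79382343d24f869", "rdf:rest", "N79ff0ba9edac49019f3ab5f8cdc9d996"),
    ("N79ff0ba9edac49019f3ab5f8cdc9d996", "rdf:first", "sg:person.015634767150.63"),
    ("N79ff0ba9edac49019f3ab5f8cdc9d996", "rdf:rest", "rdf:nil"),
    ("sg:person.015016035647.71", "schema:familyName", "Meng"),
    ("N1d7ea0d4101c4282837b4f14d4318594", "schema:familyName", "Yu"),
    ("sg:person.015634767150.63", "schema:familyName", "Nishino"),
]


def walk_list(head: str, triples: List[Triple]) -> List[str]:
    """Follow rdf:first / rdf:rest from head until rdf:nil."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items: List[str] = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError("cycle in RDF list")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The schema:author object is the head node of the list.
head = next(o for s, p, o in TRIPLES if p == "schema:author")
family_name = {s: o for s, p, o in TRIPLES if p == "schema:familyName"}
authors = [family_name[node] for node in walk_list(head, TRIPLES)]
print(authors)
```

The editor list in the table is encoded the same way, so the same walk applies to it with the `schema:editor` object as the head node.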
 



