A Lexicon-Constrained Character Model for Chinese Morphological Analysis


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2005

AUTHORS

Yao Meng , Hao Yu , Fumihito Nishino

ABSTRACT

This paper proposes a lexicon-constrained character model that combines both word and character features to solve complicated issues in Chinese morphological analysis. A Chinese character-based model constrained by a lexicon is built to acquire word building rules. Each character in a Chinese sentence is assigned a tag by the proposed model. The word segmentation and part-of-speech tagging results are then generated based on the character tags. The proposed method solves such problems as unknown word identification, data sparseness, and estimation bias in an integrated, unified framework. Preliminary experiments indicate that the proposed method outperforms the best SIGHAN word segmentation systems in the open track on 3 out of the 4 test corpora. Additionally, our method can be conveniently integrated with any other Chinese morphological systems as a post-processing module leading to significant improvement in performance.
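
As a rough illustration of the last step the abstract describes (deriving a segmentation from per-character tags), the sketch below assumes a position-in-word tag scheme with the labels B, M, E and S. The abstract does not state the paper's actual tag set, so the labels, the function name `tags_to_words` and the example sentence are hypothetical, and part-of-speech information is left out entirely.

```python
from typing import List


def tags_to_words(chars: str, tags: List[str]) -> List[str]:
    """Group characters into words from per-character position tags.

    Assumed (hypothetical) scheme: B = word start, M = word middle,
    E = word end, S = single-character word.
    """
    if len(chars) != len(tags):
        raise ValueError("need exactly one tag per character")
    words: List[str] = []
    current = ""
    for ch, tag in zip(chars, tags):
        if tag in ("B", "S") and current:
            # A new word starts here, so flush the unfinished one.
            words.append(current)
            current = ""
        current += ch
        if tag in ("E", "S"):
            # The word ends on this character.
            words.append(current)
            current = ""
    if current:
        # Tolerate a tag sequence that stops without a closing E.
        words.append(current)
    return words


print(tags_to_words("我喜欢北京", ["S", "B", "E", "B", "E"]))
```

The decoder is deliberately tolerant of inconsistent tag sequences (for example two B tags in a row), since a character tagger is not guaranteed to emit a well-formed sequence.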

PAGES

542-552

Book

TITLE

Natural Language Processing – IJCNLP 2005

ISBN

978-3-540-29172-5
978-3-540-31724-1

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/11562214_48

DOI

http://dx.doi.org/10.1007/11562214_48

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1004572348


JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "name": [
            "Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Bejing, P. R. China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Meng", 
        "givenName": "Yao", 
        "id": "sg:person.015016035647.71", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015016035647.71"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Bejing, P. R. China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yu", 
        "givenName": "Hao", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "name": [
            "Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Bejing, P. R. China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Nishino", 
        "givenName": "Fumihito", 
        "id": "sg:person.015634767150.63", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015634767150.63"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.3115/1220355.1220422", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004023044"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119355.1119380", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031461469"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/795458.795460", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032918837"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1072228.1072373", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047034253"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119254", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221105"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119261", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221112"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119269", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221121"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119273", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221125"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119277", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221129"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119278", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221130"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1119250.1119280", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221132"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1218955.1219014", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099221243"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2005", 
    "datePublishedReg": "2005-01-01", 
    "description": "This paper proposes a lexicon-constrained character model that combines both word and character features to solve complicated issues in Chinese morphological analysis. A Chinese character-based model constrained by a lexicon is built to acquire word building rules. Each character in a Chinese sentence is assigned a tag by the proposed model. The word segmentation and part-of-speech tagging results are then generated based on the character tags. The proposed method solves such problems as unknown word identification, data sparseness, and estimation bias in an integrated, unified framework. Preliminary experiments indicate that the proposed method outperforms the best SIGHAN word segmentation systems in the open track on 3 out of the 4 test corpora. Additionally, our method can be conveniently integrated with any other Chinese morphological systems as a post-processing module leading to significant improvement in performance.", 
    "editor": [
      {
        "familyName": "Dale", 
        "givenName": "Robert", 
        "type": "Person"
      }, 
      {
        "familyName": "Wong", 
        "givenName": "Kam-Fai", 
        "type": "Person"
      }, 
      {
        "familyName": "Su", 
        "givenName": "Jian", 
        "type": "Person"
      }, 
      {
        "familyName": "Kwong", 
        "givenName": "Oi Yee", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/11562214_48", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-29172-5", 
        "978-3-540-31724-1"
      ], 
      "name": "Natural Language Processing \u2013 IJCNLP 2005", 
      "type": "Book"
    }, 
    "name": "A Lexicon-Constrained Character Model for Chinese Morphological Analysis", 
    "pagination": "542-552", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1004572348"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/11562214_48"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "15c9bf08ec02d062735687f3a771c86d3db465d0f9e4a559c85d0ec521ed33bf"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/11562214_48", 
      "https://app.dimensions.ai/details/publication/pub.1004572348"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-16T08:10", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000360_0000000360/records_118346_00000000.jsonl", 
    "type": "Chapter", 
    "url": "https://link.springer.com/10.1007%2F11562214_48"
  }
]
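
A minimal sketch of reading a record of this shape with Python's standard `json` module. The embedded sample is a hand-trimmed stand-in that keeps only a few fields, and one citation id is repeated on purpose to exercise the de-duplication; the helper name `summarize` is hypothetical.

```python
import json

# Hand-trimmed sample in the shape of the record above (not the full record).
RECORD_JSON = """
[
  {
    "name": "A Lexicon-Constrained Character Model for Chinese Morphological Analysis",
    "author": [
      {"familyName": "Meng", "givenName": "Yao"},
      {"familyName": "Yu", "givenName": "Hao"},
      {"familyName": "Nishino", "givenName": "Fumihito"}
    ],
    "citation": [
      {"id": "https://doi.org/10.3115/1220355.1220422"},
      {"id": "https://doi.org/10.3115/1119250.1119254"},
      {"id": "https://doi.org/10.3115/1119250.1119254"}
    ]
  }
]
"""


def summarize(records):
    """Return (title, author names, citation ids without repeats)."""
    rec = records[0]  # the payload is a one-element list
    authors = [f"{a['givenName']} {a['familyName']}" for a in rec.get("author", [])]
    # dict.fromkeys drops repeated ids while keeping first-seen order.
    citations = list(dict.fromkeys(c["id"] for c in rec.get("citation", [])))
    return rec["name"], authors, citations


title, authors, citations = summarize(json.loads(RECORD_JSON))
print(title)
print(authors)
print(len(citations))
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field access; a JSON-LD processor only becomes necessary when the `@context` has to be expanded.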
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11562214_48'
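
The same content negotiation can be sketched in Python with the standard library. The media types and URL are taken from the curl commands above; the short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the function names are hypothetical, and the sketch does not check whether the endpoint still responds.

```python
import urllib.request

# Media types copied from the curl commands above; the keys are illustrative labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/11562214_48"


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request asking for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 10.0) -> str:
    """Send the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspect the request without touching the network.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Keeping `build_request` separate from `fetch` lets the header logic be checked offline; only `fetch` actually contacts the server.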


 

This table displays all metadata directly associated to this object as RDF triples.

132 TRIPLES      23 PREDICATES      39 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/11562214_48 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N8c5bbd2cb35b4ae39c2264cacbb5d23c
4 schema:citation https://doi.org/10.1145/795458.795460
5 https://doi.org/10.3115/1072228.1072373
6 https://doi.org/10.3115/1119250.1119254
7 https://doi.org/10.3115/1119250.1119261
8 https://doi.org/10.3115/1119250.1119269
9 https://doi.org/10.3115/1119250.1119273
10 https://doi.org/10.3115/1119250.1119277
11 https://doi.org/10.3115/1119250.1119278
12 https://doi.org/10.3115/1119250.1119280
13 https://doi.org/10.3115/1119355.1119380
14 https://doi.org/10.3115/1218955.1219014
15 https://doi.org/10.3115/1220355.1220422
16 schema:datePublished 2005
17 schema:datePublishedReg 2005-01-01
18 schema:description This paper proposes a lexicon-constrained character model that combines both word and character features to solve complicated issues in Chinese morphological analysis. A Chinese character-based model constrained by a lexicon is built to acquire word building rules. Each character in a Chinese sentence is assigned a tag by the proposed model. The word segmentation and part-of-speech tagging results are then generated based on the character tags. The proposed method solves such problems as unknown word identification, data sparseness, and estimation bias in an integrated, unified framework. Preliminary experiments indicate that the proposed method outperforms the best SIGHAN word segmentation systems in the open track on 3 out of the 4 test corpora. Additionally, our method can be conveniently integrated with any other Chinese morphological systems as a post-processing module leading to significant improvement in performance.
19 schema:editor N1b91d49bff23491a81402af39bd92e94
20 schema:genre chapter
21 schema:inLanguage en
22 schema:isAccessibleForFree true
23 schema:isPartOf Nafe55a49220a458d92d95dea5b28a6df
24 schema:name A Lexicon-Constrained Character Model for Chinese Morphological Analysis
25 schema:pagination 542-552
26 schema:productId N0b35a596bfba4eb0b429fd2d99709657
27 N491513808a0c43a8bcf5d519b63de9af
28 Nb735316a9200404c9967c70b28177b24
29 schema:publisher N761732b6652b4a54a94261a3401b2af5
30 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004572348
31 https://doi.org/10.1007/11562214_48
32 schema:sdDatePublished 2019-04-16T08:10
33 schema:sdLicense https://scigraph.springernature.com/explorer/license/
34 schema:sdPublisher N9662fa10e8044391ba9d3447ac3e2766
35 schema:url https://link.springer.com/10.1007%2F11562214_48
36 sgo:license sg:explorer/license/
37 sgo:sdDataset chapters
38 rdf:type schema:Chapter
39 N0b35a596bfba4eb0b429fd2d99709657 schema:name dimensions_id
40 schema:value pub.1004572348
41 rdf:type schema:PropertyValue
42 N1b91d49bff23491a81402af39bd92e94 rdf:first Ned9c77e8a8384dec908541333fa82e6a
43 rdf:rest Nf805f9149d7b4275a3d2dd47a460c308
44 N36ec6721b6a04ea09f31a7e42b135979 schema:familyName Kwong
45 schema:givenName Oi Yee
46 rdf:type schema:Person
47 N4354b20d1643425b99f12f12036a308d schema:name Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Bejing, P. R. China
48 rdf:type schema:Organization
49 N491513808a0c43a8bcf5d519b63de9af schema:name doi
50 schema:value 10.1007/11562214_48
51 rdf:type schema:PropertyValue
52 N522a1fa03d274d5aa6e6e59888e02c58 rdf:first Nfb46bbab82884d7ebcc0cc19ebd86ef7
53 rdf:rest Nb75eb11a235e415eae68fc5fbc0f44cd
54 N6a64552ed4b24d8a98982d9c0f78c618 schema:familyName Wong
55 schema:givenName Kam-Fai
56 rdf:type schema:Person
57 N761732b6652b4a54a94261a3401b2af5 schema:location Berlin, Heidelberg
58 schema:name Springer Berlin Heidelberg
59 rdf:type schema:Organisation
60 N8c5bbd2cb35b4ae39c2264cacbb5d23c rdf:first sg:person.015016035647.71
61 rdf:rest N91698de2171941e2998425df344a96cc
62 N91698de2171941e2998425df344a96cc rdf:first Nf0a019dfaa6241ec95a77bda9f61ce29
63 rdf:rest Nc670719962d34a16ad6238eb80bb712e
64 N937862c0195541ecaebf1b5c065c5160 schema:name Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Bejing, P. R. China
65 rdf:type schema:Organization
66 N9662fa10e8044391ba9d3447ac3e2766 schema:name Springer Nature - SN SciGraph project
67 rdf:type schema:Organization
68 N9feaf21e0e144743bb4ade1ae9928f08 schema:name Fujitsu R&D Center Co., Ltd, Room B1003, Eagle Run Plaza, No. 26 Xiaoyun Road, Chaoyang District, 100016, Bejing, P. R. China
69 rdf:type schema:Organization
70 Nafe55a49220a458d92d95dea5b28a6df schema:isbn 978-3-540-29172-5
71 978-3-540-31724-1
72 schema:name Natural Language Processing – IJCNLP 2005
73 rdf:type schema:Book
74 Nb735316a9200404c9967c70b28177b24 schema:name readcube_id
75 schema:value 15c9bf08ec02d062735687f3a771c86d3db465d0f9e4a559c85d0ec521ed33bf
76 rdf:type schema:PropertyValue
77 Nb75eb11a235e415eae68fc5fbc0f44cd rdf:first N36ec6721b6a04ea09f31a7e42b135979
78 rdf:rest rdf:nil
79 Nc670719962d34a16ad6238eb80bb712e rdf:first sg:person.015634767150.63
80 rdf:rest rdf:nil
81 Ned9c77e8a8384dec908541333fa82e6a schema:familyName Dale
82 schema:givenName Robert
83 rdf:type schema:Person
84 Nf0a019dfaa6241ec95a77bda9f61ce29 schema:affiliation N937862c0195541ecaebf1b5c065c5160
85 schema:familyName Yu
86 schema:givenName Hao
87 rdf:type schema:Person
88 Nf805f9149d7b4275a3d2dd47a460c308 rdf:first N6a64552ed4b24d8a98982d9c0f78c618
89 rdf:rest N522a1fa03d274d5aa6e6e59888e02c58
90 Nfb46bbab82884d7ebcc0cc19ebd86ef7 schema:familyName Su
91 schema:givenName Jian
92 rdf:type schema:Person
93 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
94 schema:name Information and Computing Sciences
95 rdf:type schema:DefinedTerm
96 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
97 schema:name Artificial Intelligence and Image Processing
98 rdf:type schema:DefinedTerm
99 sg:person.015016035647.71 schema:affiliation N9feaf21e0e144743bb4ade1ae9928f08
100 schema:familyName Meng
101 schema:givenName Yao
102 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015016035647.71
103 rdf:type schema:Person
104 sg:person.015634767150.63 schema:affiliation N4354b20d1643425b99f12f12036a308d
105 schema:familyName Nishino
106 schema:givenName Fumihito
107 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015634767150.63
108 rdf:type schema:Person
109 https://doi.org/10.1145/795458.795460 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032918837
110 rdf:type schema:CreativeWork
111 https://doi.org/10.3115/1072228.1072373 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047034253
112 rdf:type schema:CreativeWork
113 https://doi.org/10.3115/1119250.1119254 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221105
114 rdf:type schema:CreativeWork
115 https://doi.org/10.3115/1119250.1119261 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221112
116 rdf:type schema:CreativeWork
117 https://doi.org/10.3115/1119250.1119269 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221121
118 rdf:type schema:CreativeWork
119 https://doi.org/10.3115/1119250.1119273 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221125
120 rdf:type schema:CreativeWork
121 https://doi.org/10.3115/1119250.1119277 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221129
122 rdf:type schema:CreativeWork
123 https://doi.org/10.3115/1119250.1119278 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221130
124 rdf:type schema:CreativeWork
125 https://doi.org/10.3115/1119250.1119280 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221132
126 rdf:type schema:CreativeWork
127 https://doi.org/10.3115/1119355.1119380 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031461469
128 rdf:type schema:CreativeWork
129 https://doi.org/10.3115/1218955.1219014 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099221243
130 rdf:type schema:CreativeWork
131 https://doi.org/10.3115/1220355.1220422 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004023044
132 rdf:type schema:CreativeWork
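
The author and editor lists in the table are stored as RDF collections: each blank node carries an `rdf:first` item and an `rdf:rest` pointer, and the chain ends at `rdf:nil`. The sketch below recovers the author order from the author-list rows of the table (rows 60-63 and 79-80); the tuple representation and the function name `walk_list` are assumptions made for illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Author-list rows copied from the table above.
TRIPLES: List[Triple] = [
    ("N8c5bbd2cb35b4ae39c2264cacbb5d23c", "rdf:first", "sg:person.015016035647.71"),
    ("N8c5bbd2cb35b4ae39c2264cacbb5d23c", "rdf:rest", "N91698de2171941e2998425df344a96cc"),
    ("N91698de2171941e2998425df344a96cc", "rdf:first", "Nf0a019dfaa6241ec95a77bda9f61ce29"),
    ("N91698de2171941e2998425df344a96cc", "rdf:rest", "Nc670719962d34a16ad6238eb80bb712e"),
    ("Nc670719962d34a16ad6238eb80bb712e", "rdf:first", "sg:person.015634767150.63"),
    ("Nc670719962d34a16ad6238eb80bb712e", "rdf:rest", "rdf:nil"),
]


def walk_list(head: str, triples: List[Triple]) -> List[str]:
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items: List[str] = []
    seen = set()
    node = head
    while node != "rdf:nil":
        if node in seen:
            raise ValueError("cycle in RDF list")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of schema:author in row 3 of the table.
print(walk_list("N8c5bbd2cb35b4ae39c2264cacbb5d23c", TRIPLES))
```

The result lists the author nodes in publication order (Meng, Yu, Nishino), which the flat triple listing does not make obvious at a glance.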
 



