Experiments on unsupervised Chinese word segmentation and classification (2002)

Abstract
Chinese language processing faces several problems because Chinese is written without word delimiters, and the difficulty of defining a word makes them harder still. This paper explores the possibility of automatically segmenting Chinese character sequences into words and classifying these words through distributional analysis, in contrast with the usual approaches that depend on dictionaries.

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.95.1011
Source http://members.dodo.com.au/~powers/Research/AI/papers/200203-SWC-EUCWSC.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Keywords Unsupervised learning, Word segmentation, Word classification
Type text
Language English
Relation 10.1.1.13.9919, 10.1.1.45.3348, 10.1.1.13.8615, 10.1.1.29.5592, 10.1.1.16.1275, 10.1.1.16.3884, 10.1.1.28.7852, 10.1.1.40.5252, 10.1.1.26.6472, 10.1.1.12.8796, 10.1.1.2.172, 10.1.1.58.3283