Abstract
Each semantic category in the Tongyici Cilin (A Dictionary of Synonyms) corresponds to exactly one semantic code. The basic hypothesis of this paper is that, when a semantic category occurs in a text, the content words co-occurring with it are statistically similar. Our preliminary experiments show that the consistency between the Cilin categories and the derived clusters is over 80%. The significance of this research is that it makes possible a method for the quantitative analysis of linguistic word classification, and that it lays a foundation for word sense disambiguation using the thesaurus provided by linguistics.
Keywords: word sense disambiguation; language model
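As a rough illustration of the hypothesis stated in the abstract, the sketch below compares the content words that co-occur with different target words. It is not the paper's method: the toy corpus, the window size, the stopword list, the cosine measure, and the function names `context_vector` and `cosine` are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def context_vector(tokens, target, window=2, stopwords=frozenset()):
    """Count content words within `window` positions of each `target` occurrence."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            # Skip the target itself and any function words.
            if j != i and tokens[j] not in stopwords:
                counts[tokens[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy corpus; "doctor" and "physician" stand in for two words
# that a thesaurus might place in the same semantic category.
tokens = ("the doctor treats the patient the physician treats the patient "
          "the river floods the valley").split()
stop = frozenset({"the"})

doctor = context_vector(tokens, "doctor", stopwords=stop)
physician = context_vector(tokens, "physician", stopwords=stop)
river = context_vector(tokens, "river", stopwords=stop)

# Under the hypothesis, same-category words should have more similar contexts.
print(cosine(doctor, physician) > cosine(doctor, river))
```

In this toy setting the same-category pair scores higher than the cross-category pair. A real experiment would need a large corpus and the actual Cilin category assignments.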
Applied Linguistics (Yuyan Wenzi Yingyong)