Applied Linguistics (Yuyan Wenzi Yingyong)


No. 2, Pages 99-108, 2004

The Limitations of the Statistically-based NLP Models (Article written in Chinese)

YUAN Yulin

Abstract

This paper examines the limitations of statistically-based natural language processing (NLP) models from the perspective of linguistic theory by introducing and commenting on the mechanism of statistical language models (SLMs) and cases of their application. First, it reviews studies of the statistical structure of language carried out under the influence of information theory, especially Chomsky’s demonstration that finite state grammar (FSG), which is based on the Markov process, is not suited to the description of natural language. Then, it explains the mechanism and possible application fields of SLMs by discussing the N-gram model and its use in part-of-speech tagging. It goes on to discuss the recursive property of linguistic structure and the structure-dependent property of linguistic knowledge, arguing that recursively nested constructions disrupt statistical regularity and that structure dependence undermines the independence assumption on which SLMs rely. Finally, it suggests that the right track for NLP may be the integration of rule-based and statistics-based approaches, because natural language is a heterogeneous system.
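As a rough illustration of the independence assumption mentioned above, the following minimal sketch estimates bigram (N = 2) probabilities by relative frequency. It is not taken from the article: the toy corpus, the boundary markers `<s>` and `</s>`, and the function names `train_bigram` and `bigram_prob` are all assumptions made up for this example. Because each word is conditioned only on its immediate predecessor, the model ties the verb form to the adjacent noun rather than to the structural subject.

```python
from collections import Counter

def train_bigram(sentences):
    """Count contexts and bigrams over tokenized sentences (hypothetical helper)."""
    context_counts = Counter()
    bigram_counts = Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]  # assumed boundary markers
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts

def bigram_prob(prev, cur, context_counts, bigram_counts):
    """Relative-frequency estimate of P(cur | prev), with no smoothing."""
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, cur)] / context_counts[prev]

# Invented toy corpus: the verb agrees with the head noun ("dog"/"dogs"),
# not with the noun inside the nested phrase ("cats"/"cat").
corpus = [
    "the dog barks".split(),
    "the dogs bark".split(),
    "the dog near the cats barks".split(),
    "the dogs near the cat bark".split(),
]
context_counts, bigram_counts = train_bigram(corpus)

# The model only sees the adjacent word, so it associates "barks" with "cats".
p_cats_barks = bigram_prob("cats", "barks", context_counts, bigram_counts)
p_cats_bark = bigram_prob("cats", "bark", context_counts, bigram_counts)
print(p_cats_barks, p_cats_bark)
```

In this sketch a grammatical sentence such as "the dogs near the cats bark" would receive zero probability at the step from "cats" to "bark", since the dependency between "dogs" and "bark" lies outside the one-word window. Widening the window helps only up to a fixed distance, whereas nested constructions can separate the dependent words by an arbitrary number of tokens.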

Keywords: language processing; statistical models; finite state grammar; Markov process; recursion; structure-dependent property


