Scientific report: "Fundamentals of Chinese Language Processing"


Fundamentals of Chinese Language Processing

Chu-Ren Huang
Dept. of Chinese and Bilingual Studies
Hong Kong Polytechnic University
Churen.huang@inet.polyu.edu.hk

Qin Lu
Department of Computing
Hong Kong Polytechnic University
csluqin@comp.polyu.edu.hk

1 Introduction

This tutorial gives an introduction to the fundamentals of Chinese language processing for text processing. Today, more and more Chinese information is available in electronic form and over the internet. Computer processing of Chinese text requires an understanding of both the language itself and the technology to handle it. This tutorial is targeted at both Chinese linguists who are interested in computational linguistics and computer scientists who are interested in research on processing Chinese.

2 Content Overview

This tutorial consists of two parts. The first part gives an overview of the grammar of the Chinese language from a language processing perspective, based on naturally occurring data. The second part gives an overview of Chinese-specific processing issues and the corresponding computational technologies.

The grammar introduced is a descriptive grammar of general-purpose, present-day standard Mandarin Chinese, which is fast becoming an internationally spoken language. Real examples of actual language use are presented following a data-driven, corpus-based approach, so that the links to computational linguistic approaches for computer processing are bridged naturally. A number of important Chinese NLP resources are also presented. On the technology side, the tutorial mainly covers Chinese word segmentation and Part-of-Speech tagging. Word segmentation has to deal with some problems unique to the Chinese language, such as unknown word detection and named entity recognition, which are the emphasis of this tutorial.

3 Tutorial Outline

Part 1: Highlights of Chinese Grammar for NLP
1.1 Preliminaries: Orthography and writing conventions
1.2 Basic unit of processing: word or character?
    a. Word-forms vs. character forms
    b. Word-senses vs. character-senses
1.3 Part-of-Speech: important issues in defining word classes
1.4 Word formation: from affixation to compounding
1.5 Unique constructions and challenges
    a. Classifier-noun agreement
    b. Separable compounds (or ionization)
    c. 'Verbless' Constructions
1.6 Chinese NLP resources

Part 2: Text Processing
2.1 Lexical processing
    a. Segmentation
    b. Disambiguation
    c. Unknown word detection
    d. Named Entity Recognition
2.2 Syntactic processing
    a. Issues in PoS tagging
    b. Hidden Markov Models
2.3 NLP Applications

References

Academia Sinica Balanced Corpus of Mandarin Chinese. http://www.sinica.edu.tw/SinicaCorpus/

Chao, Y. R. 1968. A Grammar of Spoken Chinese. Berkeley: University of California Press.

Huang, C.-R., K.-j. Chen and B. K. T'sou. 1996. Readings in Chinese Natural Language Processing. Journal of Chinese Linguistics Monograph Series No. 9. Berkeley: POLA.

T'sou, B. K. 2004. Chinese Language Processing at the Dawn of the 21st Century. In C.-R. Huang and W. Lenders, Eds. Computational Linguistics and Beyond. Pp. 189-206. Taipei: Academia Sinica.

Miao, S. Q. and Z. H. Wei. 2007. Chinese Text Information Processing Principles and Applications (in Chinese). Tsinghua University Press.

Tutorial Abstracts of ACL-IJCNLP 2009, page 1, Suntec, Singapore, 2 August 2009. © 2009 ACL and AFNLP