Báo cáo khoa học: "Supervised Grammar Induction using Training Data with Limited Constituent Information *"
36
lượt xem 3
download
lượt xem 3
download
Download
Vui lòng tải xuống để xem tài liệu đầy đủ
Corpus-based grammar induction generally relies on hand-parsed training data to learn the structure of the language. Unfortunately, the cost of building large annotated corpora is prohibitively expensive. This work aims to improve the induction strategy when there are few labels in the training data. We show that the most informative linguistic constituents are the higher nodes in the parse trees, typically denoting complex noun phrases and sentential clauses. They account for only 20% of all constituents. ...
Chủ đề:
Bình luận(0) Đăng nhập để gửi bình luận!
CÓ THỂ BẠN MUỐN DOWNLOAD