Báo cáo khoa học: "User Edits Classification Using Document Revision Histories"
54
lượt xem 2
download
lượt xem 2
download
Download
Vui lòng tải xuống để xem tài liệu đầy đủ
Document revision histories are a useful and abundant source of data for natural language processing, but selecting relevant data for the task at hand is not trivial. In this paper we introduce a scalable approach for automatically distinguishing between factual and fluency edits in document revision histories. The approach is based on supervised machine learning using language model probabilities, string similarity measured over different representations of user edits, comparison of part-of-speech tags and named entities, and a set of adaptive features extracted from large amounts of unlabeled user edits. ...
Chủ đề:
Bình luận(0) Đăng nhập để gửi bình luận!
CÓ THỂ BẠN MUỐN DOWNLOAD