Báo cáo khoa học: "A Morphologically Sensitive Clustering Algorithm for Identifying Arabic Roots"
36
lượt xem 1
download
lượt xem 1
download
Download
Vui lòng tải xuống để xem tài liệu đầy đủ
We present a clustering algorithm for Arabic words sharing the same root. Root based clusters can substitute dictionaries in indexing for IR. Modifying Adamson and Boreham (1974), our Two-stage algorithm applies light stemming before calculating word pair similarity coefficients using techniques sensitive to Arabic morphology. Tests show a successful treatment of infixes and accurate clustering to up to 94.06% for unedited Arabic text samples, without the use of dictionaries.
Chủ đề:
Bình luận(0) Đăng nhập để gửi bình luận!
CÓ THỂ BẠN MUỐN DOWNLOAD