intTypePromotion=1
zunia.vn Tuyển sinh 2024 dành cho Gen-Z zunia.vn zunia.vn
ADSENSE

Báo cáo khoa học: "Analysing Wikipedia and Gold-Standard Corpora for NER Training"

Chia sẻ: Nhung Nhung | Ngày: | Loại File: PDF | Số trang:9

39
lượt xem
2
download
 
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

Named entity recognition (NER) for English typically involves one of three gold standards: MUC, CoNLL, or BBN, all created by costly manual annotation. Recent work has used Wikipedia to automatically create a massive corpus of named entity annotated text. We present the first comprehensive crosscorpus evaluation of NER. We identify the causes of poor cross-corpus performance and demonstrate ways of making them more compatible. Using our process, we develop a Wikipedia corpus which outperforms gold standard corpora on crosscorpus evaluation by up to 11%. ...

Chủ đề:
Lưu

Nội dung Text: Báo cáo khoa học: "Analysing Wikipedia and Gold-Standard Corpora for NER Training"

ADSENSE

CÓ THỂ BẠN MUỐN DOWNLOAD

 

Đồng bộ tài khoản
2=>2