Lecturer:
Hoàng Anh Việt
hoanganhviet@gmail.com
Understanding – the Big Picture
Morphology
POS Tagging
Syntax
Semantics
Discourse Integration
Generation goes backwards. For this reason, we generally want
declarative representations of the facts. POS tagging is an
exception to this.
Overview
¨The tagging problem
¤Linguistics: classifying words into syntactic categories
¤Mathematics: assigning labels to a sequence of symbols
¨Part-of-speech tagging
¤Statistical approach: Hidden Markov Model and the Viterbi algorithm
¤Transformation rules (Brill's tagger)
Tagging problems
¨Input: a sequence of symbols
¤x1 x2 ... xn
¨Output: a label for each of these symbols
¤y1 y2 ... yn
a b c c a d e -> a/C b/D c/E c/E a/D d/C e/C
¨Tasks:
¤Part-of-speech (POS) tagging
¤Named Entity Recognition
¤....
Part-Of-Speech tagging
INPUT:
Profits soared at Boeing Co., easily topping forecasts on Wall Street, as
their CEO Alan Mulally announced first quarter results.
OUTPUT:
Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V
forecasts/N on/P Wall/N Street/N ,/, as/P their/POSS CEO/N Alan/N
Mulally/N announced/V first/ADJ quarter/N results/N ./.
N = Noun
V = Verb
P = Preposition
ADV = Adverb
ADJ = Adjective
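A minimal sketch of reading the word/TAG notation used in the OUTPUT above back into (word, tag) pairs. The function name parse_tagged is an assumption made for illustration. The split is done on the last slash, so that tokens such as ,/, and ./. are handled the same way as ordinary words.

```python
from collections import Counter
from typing import List, Tuple


def parse_tagged(text: str) -> List[Tuple[str, str]]:
    """Split 'word/TAG' tokens into (word, tag) pairs, using the last '/'."""
    pairs = []
    for token in text.split():
        word, _, tag = token.rpartition("/")
        pairs.append((word, tag))
    return pairs


# The tagged sentence from this slide, copied verbatim.
tagged = (
    "Profits/N soared/V at/P Boeing/N Co./N ,/, easily/ADV topping/V "
    "forecasts/N on/P Wall/N Street/N ,/, as/P their/POSS CEO/N Alan/N "
    "Mulally/N announced/V first/ADJ quarter/N results/N ./."
)

pairs = parse_tagged(tagged)
tag_counts = Counter(tag for _, tag in pairs)

print(pairs[:3])  # [('Profits', 'N'), ('soared', 'V'), ('at', 'P')]
print(tag_counts["N"], tag_counts["V"], tag_counts["P"])  # 11 3 3
```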