A collaborative filtering algorithm for sparse data
Tạp chí Tin học và Điều khiển học (Journal of Computer Science and Cybernetics), Vol. 24, No. 1 (2008), 60–72
A COLLABORATIVE FILTERING ALGORITHM FOR SPARSE DATA
NGUYỄN DUY PHƯƠNG, TỪ MINH PHƯƠNG

Posts and Telecommunications Institute of Technology (Học viện Công nghệ Bưu chính Viễn thông); phuong.ptit@yahoo.com
Abstract. Collaborative filtering is a technique for predicting the utility of items for a particular user by exploiting the behavior patterns of a group of users with similar preferences. This technique has been widely used in recommender systems and has a number of useful applications in e-commerce. In this paper, we present a collaborative filtering method based on a multi-task learning algorithm that was designed for pattern recognition. The method formulates the collaborative filtering problem as a set of classification problems and performs classification for all users simultaneously by using a modified boosting algorithm. This allows common features to be shared among different classification tasks and thus reduces the negative effect of data sparseness. Experimental results show the effectiveness of the proposed method in comparison with other methods, especially when data are sparse.
Tóm tắt (Summary). Collaborative filtering is a method for predicting the items a user is interested in, based on information from users with the same preferences. The method is widely used in recommender systems and has many applications in e-commerce. In this paper, we propose a collaborative filtering method based on a multi-task learning technique that has been used in image recognition. The method uses boosting to perform classification for many different users at the same time, which allows common features to be shared among the classification problems and reduces the effect of having little data. Experiments and comparisons with other filtering techniques show that the method gives good results, especially when data are sparse.
1. INTRODUCTION
Internet users often struggle because they must process too much information before finding what interests them. One way to support users in this situation is to use recommender systems. A recommender system is a system that can automatically select and deliver to a user the information that user is interested in. Here, information may be textual, such as web pages and news items, or may describe goods, products, and so on. For simplicity, in what follows we refer to both information and goods as items. Recommender systems already have many successful commercial applications, such as purchase recommendations on the world's largest online retail website, http://www.amazon.com, and websites that recommend music discs and films.
Recommender systems fall into two types: systems that use collaborative filtering and systems that use content-based filtering [2]. Content-based filtering compares the content of information, or the descriptions of goods, in order to find items similar to those the user has been interested in before, and recommends these items to the user.
Unlike content-based filtering, collaborative filtering does not rely on the content of the information. Instead, it identifies people who share the preferences of the user seeking a recommendation, and then recommends to that user the items that these like-minded people have bought or rated highly [13]. Two people are considered to share preferences if they have bought the same kinds of goods or accessed the same information in the past. Compared with content-based filtering, collaborative filtering has several advantages; for example, it can be used with any type of information or goods without requiring a textual description. Experimental results also show that collaborative filtering gives better recommendations in many cases [1, 10]. In this paper we focus only on collaborative filtering.
Despite its advantages over content-based filtering, collaborative filtering has a difficulty: each user usually rates or buys only relatively few items, and a newly introduced item may have no ratings from anyone. This situation is called the case of little data, or sparse data, and it is the focus of this study.
The collaborative filtering problem can be stated as an automatic classification problem in machine learning. Based on users' ratings of different items, a classification model is built and trained for each user; this model is then used to assign a new item to one of several classes, for example "like" and "dislike". Similarly, the roles of users and items can be swapped, and a classifier can be built to predict whether a particular item will be "liked" or "disliked" by a user. This approach has been used in several studies on collaborative filtering [3, 5] and gives good results compared with traditional collaborative filtering.
The most distinctive characteristic of the classification approach above is that each user, or each item, is treated as a separate classification problem, independent of the others. However, it is easy to see that different users are not completely independent of one another and that their preferences are correlated to some degree. Sharing this correlation information among the classification problems of different users can improve classification performance, especially when each classification problem has little training data, which corresponds to each user having rated very few items. Using information from one classification problem for another is called multitask learning or transfer learning and has been addressed in many studies, for example [4, 7].
In this paper, we propose a multitask learning technique for collaborative filtering. Unlike the classification technique presented in [5], the multitask method trains the classifiers for all users simultaneously, using boosting combined with decision stumps. Simultaneous training makes it possible to discover features that are common to different users. These common features supply additional information across the individual classification problems and improve classification performance. The method has been used quite successfully in image recognition to discover common features among the objects to be classified [14].
The proposed method was tested on two real data sets of user ratings of films. The experimental results show that, when data are sparse, multitask learning with common features gives better filtering results than learning each task separately and than traditional collaborative filtering based on correlation between users.
2. COLLABORATIVE FILTERING BY CLASSIFICATION
To make the proposed method easier to present, this section briefly reviews the classification approach to collaborative filtering of Billsus and Pazzani [5].
Let $U = \{u_1, \dots, u_K\}$ be the set of $K$ users and $G = \{g_1, \dots, g_N\}$ the set of $N$ items. The rating of user $u_i$ for item $g_j$ is denoted $r_{ij}$. The ratings $r_{ij}$ thus form the data matrix of the collaborative filtering problem, in which rows correspond to users and columns to items. A value $r_{ij}$ can be collected directly by asking the user for an opinion, or indirectly: for example, when a user buys item $g_j$ or watches film $g_j$, that user's rating is taken to be "like". In general, $r_{ij}$ can take values from an ordered set of rating levels. For simplicity, we assume here that $r_{ij}$ takes the value "like" or "dislike", that is, $+1$ or $-1$. Typically each user rates only a very small subset of the items, so most values are left empty, $r_{ij} = \phi$. The problem is then to predict these empty entries of the matrix $r_{ij}$.
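As a minimal sketch of the notation above, the rating matrix can be held as a list of rows (one per user) with `None` standing for an empty entry. The matrix values and the helper names `missing_cells` and `sparsity` are hypothetical illustrations and do not come from the paper.

```python
# Hypothetical 3-user x 4-item rating matrix: +1 = "like", -1 = "dislike",
# None = empty entry that collaborative filtering has to predict.
ratings = [
    [+1, None, -1, None],
    [None, +1, None, None],
    [-1, None, None, +1],
]


def missing_cells(r):
    """Return the (user, item) index pairs whose rating is still empty."""
    return [(i, j) for i, row in enumerate(r) for j, v in enumerate(row) if v is None]


def sparsity(r):
    """Return the fraction of cells that carry no rating."""
    total = sum(len(row) for row in r)
    return len(missing_cells(r)) / total


print(missing_cells(ratings))
print(round(sparsity(ratings), 3))
```

Every pair returned by `missing_cells` is one prediction target; a high `sparsity` value is the situation the paper refers to as sparse data.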
To determine the empty values, [5] treats the prediction of each user's ratings as a separate classification problem. The method therefore requires training $K$ separate classifiers, one per user, each of which predicts the empty values in one row of the matrix $r_{ij}$.
Consider the example in Table 1, which contains 4 users and 5 items. Suppose we need to predict the rating of user 4 for item 5. Since user 4 has already rated the three items 1, 2 and 3, these ratings are used as training examples for the classifier. Each training example is a feature vector in which each feature corresponds to a user other than user 4, and the value of the feature is the value of the corresponding cell of the matrix. The class labels of the training examples are the ratings that user 4 gave to items 1, 2 and 3.
Table 1

Users (rows): u1, u2, u3, u4
Items (columns), with the recorded ratings listed in row order; empty cells are not shown, and "?" marks the rating to be predicted:
g1: +1, +1, +1
g2: −1, +1, +1, −1
g3: +1, −1, +1
g4: +1, −1
g5: +1, −1, ?
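The construction of training examples for one user can be sketched as follows. The matrix `R` is a hypothetical 4-user by 5-item example (it is not the paper's Table 1), the function name `training_set` is made up for illustration, and encoding another user's missing rating as 0 is an assumption of this sketch, since the text does not say how empty feature values are handled.

```python
def training_set(r, target):
    """Build (feature vectors, labels) for the classifier of user `target`.

    Each item the target user has rated yields one training example. Its
    features are the other users' ratings of that item; a missing rating is
    encoded as 0 (an assumption of this sketch). Its label is the target
    user's own rating of the item.
    """
    others = [u for u in range(len(r)) if u != target]
    features, labels = [], []
    for j, label in enumerate(r[target]):
        if label is None:
            continue  # unrated items are prediction targets, not training data
        features.append([r[u][j] if r[u][j] is not None else 0 for u in others])
        labels.append(label)
    return features, labels


# Hypothetical rating matrix: rows are users, columns are items.
R = [
    [+1, -1, +1, None, +1],
    [+1, +1, -1, +1, -1],
    [None, +1, +1, -1, None],
    [+1, -1, +1, None, None],
]

X, y = training_set(R, target=3)  # the fourth user (index 3)
print(X)
print(y)
```

With this encoding the fourth user contributes three training examples, one per rated item, each described by three features (the ratings of the other three users).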
With training examples of this form, classification can be carried out with common classification methods, for example artificial neural networks, decision trees, or support vector machines. However, before the data are used directly for training and classification, the problem of feature selection must be solved. In the setting described here, each feature is the rating of a user other than the user under consideration (in the example of Table 1, the classification problem for user 4 has 3 features, namely the ratings of users 1, 2 and 3). In practice, the number of features is very large, and not every feature is relevant to the ratings of the user under consideration. Using irrelevant features increases the computational cost and at the same time reduces classification accuracy.
To solve the feature selection problem, the authors of [5] use singular value decomposition (SVD). This method factorizes the data matrix into the product of matrices of singular vectors and a diagonal matrix of singular values, and then reduces the size of the matrices by keeping only the singular vectors that correspond to the largest singular values. The original features are thereby transformed into new features. The new features are fewer in number, yet the data projected onto them have larger variance than when projected onto the original features, which makes the data easier to classify.
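The dimensionality reduction described above can be written compactly as follows. This is a sketch of standard truncated SVD, not a formula taken from the paper: the symbols $R$ (the data matrix with empty entries already filled in by some convention), $P$, $Q$, $\Sigma$, $\sigma_l$ and the rank $k$ are notation assumed here for illustration.

```latex
% Full decomposition: columns of P and Q are the left and right singular
% vectors, and Sigma holds the singular values in decreasing order.
R = P \,\Sigma\, Q^{\top}, \qquad
\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \dots), \quad
\sigma_1 \ge \sigma_2 \ge \dots \ge 0 .

% Truncation: keep only the k largest singular values and their vectors.
R_k = P_k \,\Sigma_k\, Q_k^{\top}, \qquad
\Sigma_k = \operatorname{diag}(\sigma_1, \dots, \sigma_k) .

% Among all matrices of rank at most k, R_k is the closest to R in the
% Frobenius norm (Eckart--Young theorem).
\lVert R - R_k \rVert_F \;=\; \min_{\operatorname{rank}(B) \le k} \lVert R - B \rVert_F .
```

Choosing a small $k$ gives each example a short vector of new features in place of one feature per user.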
3. CLASSIFICATION WITH COMMON FEATURES
With the method presented above, feature selection and classifier training for user $u_i$ are carried out on data formed from the items that this user has already rated. Each user typically rates only a very small subset of the items, so each classifier is trained on only a small amount of data. This leads to low classification performance.
To overcome this weakness, this section presents a new method in which training and feature selection are carried out simultaneously for all users, instead of separately for each user as just described. Simultaneous training makes it possible to combine information and training data from other users, which reduces the number of previously rated items required for each user. This technique is commonly called multitask learning.
Feature selection and simultaneous training for all users are carried out with a boosting algorithm combined with decision stumps [9, 12, 14].
3.1. The boosting method
Boosting is a machine learning method that builds a highly accurate classifier by combining many less accurate classifiers, called weak classifiers [12]. Based on this general principle, many variants of boosting have been proposed and used [8, 9, 12]. In this study we use the Gentle AdaBoost variant (gentleboost for short) proposed in [9], because it is simple, stable, and gives good classification results in many applications.
The gentleboost method for two-class classification can be summarized as follows. We are given training data consisting of $N$ examples $(x_1, y_1), \dots, (x_N, y_N)$, where $x_i$ is a feature vector and $y_i$ is the class label, $y_i = +1$ or $-1$ (corresponding to "like" and "dislike"). The classifier $F(x)$ is formed as the combination $F(x) = \sum_{m=1}^{M} f_m(x)$, where each $f_m(x)$ is a weak classifier that can predict the class label of an input vector $x$. The final classification is obtained by computing $\mathrm{sign}(F(x))$. The algorithm runs for $M$ rounds. In round $m$, the training examples are reweighted so that the examples misclassified in the previous round receive higher weights and therefore get more attention from the classifier. The classifier $f_m(x)$ is trained on the weighted data of round $m$. The gentleboost algorithm is shown in the figure below.
Gentleboost algorithm
1. Initialize the weights $w_i = 1/N$, $i = 1, \dots, N$, where $w_i$ is the weight of the $i$-th training example. Initialize $F(x) = 0$.
2. Repeat for $m = 1, 2, \dots, M$:
   a. Train $f_m(x)$ on the weighted training data.
   b. Update $F(x) \leftarrow F(x) + f_m(x)$.
   c. Update the weights $w_i \leftarrow w_i e^{-y_i f_m(x_i)}$ and normalize the weights.
3. Return the classifier $\mathrm{sign}[F(x)] = \mathrm{sign}\left[\sum_{m=1}^{M} f_m(x)\right]$.
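The weight update in step 2c can be illustrated with a minimal Python sketch. The function name `reweight` and the toy values are assumptions made for illustration only; they do not come from the paper.

```python
import math

def reweight(weights, labels, weak_outputs):
    """Step 2c: w_i <- w_i * exp(-y_i * f_m(x_i)), then normalize to sum to 1."""
    updated = [w * math.exp(-y * f)
               for w, y, f in zip(weights, labels, weak_outputs)]
    total = sum(updated)
    return [w / total for w in updated]

# Toy round (illustrative values): four examples with uniform starting weights.
labels = [1, 1, -1, -1]
# Hypothetical weak-classifier outputs; examples 2 and 4 fall on the wrong side.
weak_outputs = [0.5, -0.5, -0.5, 0.5]
weights = reweight([0.25] * 4, labels, weak_outputs)
```

Examples whose label disagrees in sign with the weak classifier's output end up with larger weights, which is what makes the next round focus on them.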
In step (a) of each iteration, the algorithm selects $f_m(x)$ so that the following classification error is minimized:

$$J = \sum_{i=1}^{N} w_i \left(y_i - f_m(x_i)\right)^2. \tag{1}$$
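As a small sketch of how the error (1) ranks candidate weak classifiers, the Python snippet below evaluates $J$ for two hypothetical sets of outputs. The helper name `weighted_squared_error` and both candidates are assumptions for illustration.

```python
def weighted_squared_error(weights, labels, predictions):
    """J = sum_i w_i * (y_i - f_m(x_i))**2, as in equation (1)."""
    return sum(w * (y - p) ** 2
               for w, y, p in zip(weights, labels, predictions))

weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, -1, -1]
# Hypothetical candidate outputs f_m(x_i) for the four examples.
candidate_a = [1.0, 1.0, -1.0, 1.0]    # confident, but wrong on the last example
candidate_b = [0.5, 0.5, -0.5, -0.5]   # correct sign everywhere, less confident
j_a = weighted_squared_error(weights, labels, candidate_a)
j_b = weighted_squared_error(weights, labels, candidate_b)
```

Under this criterion the second candidate would be preferred, since a single confident mistake is penalized quadratically.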
To find a classifier that minimizes (1), we need to determine the weak classifier $f_m(x)$ that minimizes the weighted squared classification error. Here we use a decision stump as the weak classifier. A decision stump is a simplified version of a decision tree with a single node. It selects one feature of the training example and then, depending on the value of that feature, assigns the label the value $1$ or $-1$. The label assignment is expressed by the formula

$$f_m(x) = a\,\delta(x^f > t) + b\,\delta(x^f \le t) \tag{2}$$
where $\delta(e) = 1$ if $e$ is true and $\delta(e) = 0$ otherwise, $t$ is a threshold, $a$ and $b$ are parameters, and $x^f$ is the value of the $f$-th feature of the vector $x$. When the rating data contain only the values $1$ and $0$, or $1$ and $-1$, the threshold can be set to $t = 0$. Thus, besides classification, the decision stump also performs feature selection, since each stump picks exactly one feature. Training to find the best stump (the one that minimizes (1)) is done by trying all features $f$. For each $f$, the optimal values of $a$ and $b$ are computed as follows (using least squares estimation, which amounts to evaluating the parameters at the point where the derivative is zero):
$$a = \frac{\sum_i w_i y_i\,\delta(x_i^f > 0)}{\sum_i w_i\,\delta(x_i^f > 0)}, \tag{3}$$

$$b = \frac{\sum_i w_i y_i\,\delta(x_i^f \le 0)}{\sum_i w_i\,\delta(x_i^f \le 0)}. \tag{4}$$
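The zero-derivative step behind (3) and (4) can be written out as follows. This is a reconstruction from (1) and (2) with $t = 0$, added as a worked illustration rather than taken from the paper.

```latex
% With t = 0, the stump (2) outputs a when x_i^f > 0 and b otherwise,
% so the error (1) splits into two independent sums:
J = \sum_{i} w_i\,\delta(x_i^f > 0)\,(y_i - a)^2
  + \sum_{i} w_i\,\delta(x_i^f \le 0)\,(y_i - b)^2 .

% Setting the derivative with respect to a to zero gives (3):
\frac{\partial J}{\partial a}
  = -2 \sum_{i} w_i\,\delta(x_i^f > 0)\,(y_i - a) = 0
\;\Longrightarrow\;
a = \frac{\sum_i w_i y_i\,\delta(x_i^f > 0)}{\sum_i w_i\,\delta(x_i^f > 0)} .

% The same step with respect to b gives (4):
\frac{\partial J}{\partial b}
  = -2 \sum_{i} w_i\,\delta(x_i^f \le 0)\,(y_i - b) = 0
\;\Longrightarrow\;
b = \frac{\sum_i w_i y_i\,\delta(x_i^f \le 0)}{\sum_i w_i\,\delta(x_i^f \le 0)} .
```

In other words, $a$ and $b$ are the weighted mean labels on each side of the threshold, so with $y_i \in \{-1, 1\}$ both lie in $[-1, 1]$.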
The feature $f$ and the corresponding values of $a$ and $b$ that give the smallest prediction error (1) are chosen to form the classifier $f_m(x)$ for round $m$. $f_m(x)$ is then added to the main classifier $F(x)$ (step b).
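The whole procedure (stump selection by (1), parameters from (3) and (4), and the update steps) can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the threshold is fixed at $t = 0$, an empty side of a stump is given output 0 (a case the text does not cover), and the toy data (a majority vote over three features valued $\pm 1$) are invented for the example.

```python
import math
from itertools import product

def stump_output(x, f, a, b):
    """Equation (2) with threshold t = 0."""
    return a if x[f] > 0 else b

def fit_stump(X, y, w, f):
    """Equations (3)-(4) for feature index f; returns (J, a, b)."""
    pos = [(wi, yi) for xi, yi, wi in zip(X, y, w) if xi[f] > 0]
    neg = [(wi, yi) for xi, yi, wi in zip(X, y, w) if xi[f] <= 0]

    def weighted_mean(side):
        total = sum(wi for wi, _ in side)
        # Assumption: an empty side outputs 0 to avoid dividing by zero.
        return sum(wi * yi for wi, yi in side) / total if total > 0 else 0.0

    a, b = weighted_mean(pos), weighted_mean(neg)
    J = sum(wi * (yi - stump_output(xi, f, a, b)) ** 2
            for xi, yi, wi in zip(X, y, w))
    return J, a, b

def gentleboost(X, y, rounds):
    n = len(X)
    w = [1.0 / n] * n                      # step 1: uniform weights
    stumps = []
    for _ in range(rounds):
        # Step 2a: try every feature and keep the stump with the smallest J.
        J, a, b, f = min((fit_stump(X, y, w, f) + (f,)
                          for f in range(len(X[0]))),
                         key=lambda cand: cand[0])
        stumps.append((f, a, b))           # step 2b: F(x) <- F(x) + f_m(x)
        # Step 2c: reweight and normalize.
        w = [wi * math.exp(-yi * stump_output(xi, f, a, b))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps

def predict(stumps, x):
    """Step 3: sign of the summed stump outputs (ties mapped to +1)."""
    F = sum(stump_output(x, f, a, b) for f, a, b in stumps)
    return 1 if F >= 0 else -1

# Toy data: all eight +/-1 vectors of length 3, labeled by majority vote.
X = [list(p) for p in product([1, -1], repeat=3)]
y = [1 if sum(x) > 0 else -1 for x in X]
stumps = gentleboost(X, y, rounds=3)
```

On this toy set no single stump separates the classes, but the sum of three stumps, one per feature, reproduces the majority vote on all eight training vectors.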