
Data Mining (Khai phá dữ liệu) Lecture: Ensemble Models - Trịnh Tấn Đạt

Shared by: _ _ | Date: | File type: PDF | Pages: 90


This chapter of the Data Mining (Khai phá dữ liệu) lecture series, Ensemble Models, covers: introduction; voting; bagging; boosting; stacking and blending; learning ensembles; methods of constructing ensembles; the bias-variance tradeoff; simple ensemble techniques; and more. Please see the lecture for the full details.


Text content: Data Mining (Khai phá dữ liệu) Lecture: Ensemble Models - Trịnh Tấn Đạt

  1. Trịnh Tấn Đạt, Faculty of Information Technology (Khoa CNTT), Đại Học Sài Gòn. Email: trinhtandat@sgu.edu.vn. Website: https://sites.google.com/site/ttdat88/
  2. Contents  Introduction  Voting  Bagging  Boosting  Stacking and Blending
  3. Introduction
  4. Definition  An ensemble of classifiers is a set of classifiers whose individual decisions are combined in some way (typically, by weighted or unweighted voting) to classify new examples.  Ensembles are often much more accurate than the individual classifiers that make them up.
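A minimal sketch of an unweighted (hard) voting ensemble, assuming scikit-learn; the dataset and base models are illustrative choices, not ones prescribed by the lecture:

```python
# Hard-voting ensemble: combine three different classifiers by unweighted
# majority vote. Dataset and base models are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=5000)),
        ("tree", DecisionTreeClassifier(random_state=0)),
        ("nb", GaussianNB()),
    ],
    voting="hard",  # unweighted majority vote over the predicted labels
)
ensemble.fit(X_train, y_train)
print("voting ensemble accuracy:", ensemble.score(X_test, y_test))
```

Passing a weights list to VotingClassifier instead gives the weighted variant mentioned in the definition.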
  5. Learning Ensembles  Learn multiple alternative definitions of a concept using different training data or different learning algorithms.  Combine decisions of multiple definitions, e.g. using voting.  [Diagram: Training Data is split into Data 1, Data 2, …, Data K; each set is given to Learner 1, Learner 2, …, Learner K, producing Model 1, Model 2, …, Model K, which a Model Combiner merges into the Final Model]
  6. Necessary and Sufficient Condition  For the idea to work, the classifiers should be  Accurate  Diverse  Accurate: Has an error rate better than random guessing on new instances  Diverse: They make different errors on new data points
  7. Why do they Work?  Suppose there are 25 base classifiers  Each classifier has an error rate ε = 0.35  Assume the classifiers are independent  Probability that the ensemble classifier makes a wrong prediction (i.e. that a majority of the 25 votes are wrong): $\sum_{i=13}^{25} \binom{25}{i} \varepsilon^{i} (1-\varepsilon)^{25-i} \approx 0.06$  Marquis de Condorcet (1785): the majority vote is wrong with exactly this probability.
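The 0.06 figure can be reproduced with a short computation using only Python's standard library (under the slide's independence assumption):

```python
# Probability that a majority vote of 25 independent classifiers is wrong,
# each base classifier having error rate eps = 0.35 (the sum from the slide).
from math import comb

n, eps = 25, 0.35
p_wrong = sum(comb(n, i) * eps**i * (1 - eps)**(n - i) for i in range(13, n + 1))
print(round(p_wrong, 3))  # -> 0.06
```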
  8. Value of Ensembles  When combining multiple independent and diverse decisions, each of which is more accurate than random guessing, random errors cancel each other out and correct decisions are reinforced.  Human ensembles are demonstrably better  How many jelly beans are in the jar?: individual estimates vs. the group average.
  9. A Motivating Example  Suppose that you are a patient with a set of symptoms  Instead of taking the opinion of just one doctor (classifier), you decide to take the opinions of several doctors!  Is this a good idea? Indeed it is.  Consult many doctors and then, based on their combined diagnoses, you can get a fairly accurate idea of the true diagnosis.
  10. The Wisdom of Crowds  The collective knowledge of a diverse and independent body of people typically exceeds the knowledge of any single individual and can be harnessed by voting
  11. When do Ensembles Work?  Ensemble methods work better with ‘unstable’ classifiers  Classifiers that are sensitive to minor perturbations in the training set  Examples:  Decision trees  Rule-based classifiers  Artificial neural networks
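A small demonstration of this instability, under assumed choices of dataset and model: two decision trees trained on training sets that differ by only a few examples can already disagree on new points.

```python
# Two decision trees trained on nearly identical training sets can still
# disagree on new examples. Dataset choice is an illustrative assumption.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

tree_a = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Perturb the training set slightly: drop five examples and refit.
keep = np.ones(len(X_train), dtype=bool)
keep[:5] = False
tree_b = DecisionTreeClassifier(random_state=0).fit(X_train[keep], y_train[keep])

disagree = np.mean(tree_a.predict(X_test) != tree_b.predict(X_test))
print(f"test points where the two trees disagree: {disagree:.1%}")
```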
  12. Ensembles  Homogeneous ensembles: all individual models are obtained with the same learning algorithm, on slightly different datasets  Use a single, arbitrary learning algorithm but manipulate the training data to make it learn multiple models:  Data1  Data2  …  Data K  Learner1 = Learner2 = … = Learner K  Different methods of changing the training data:  Bagging: resample the training data  Boosting: reweight the training data  Heterogeneous ensembles: individual models are obtained with different algorithms  Stacking and Blending  The combining mechanism: the outputs of the classifiers (level-0 classifiers) are used as training data for another classifier (the level-1 classifier)
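A minimal stacking sketch with scikit-learn, in which the level-0 classifiers' outputs become training data for a level-1 combiner; the particular models and dataset are illustrative assumptions:

```python
# Stacking: level-0 classifiers' cross-validated predictions become the
# training features of a level-1 combiner. Models and dataset are
# illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

stack = StackingClassifier(
    estimators=[  # level-0 classifiers (different algorithms)
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svc", SVC(random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=5000),  # level-1 classifier
    cv=5,  # level-0 outputs for the level-1 model come from cross-validation
)
print("stacking CV accuracy:", cross_val_score(stack, X, y, cv=5).mean())
```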
  13. Methods of Constructing Ensembles 1. Manipulate training data set 2. Cross-validated Committees 3. Weighted Training Examples 4. Manipulating Input Features 5. Manipulating Output Targets 6. Injecting Randomness
  14. Methods of Constructing Ensembles - 1 1. Manipulate the training data set  Bagging (bootstrap aggregation)  On each run, bagging presents the learning algorithm with a training set drawn randomly, with replacement, from the original training data. This process is called bootstrapping.  Each bootstrap sample contains, on average, about 63.2% of the original training examples, with several examples appearing multiple times
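The 63.2% figure comes from each example being missed by a bootstrap sample with probability (1 - 1/n)^n, which tends to 1/e. A quick empirical check with NumPy (sample size and trial count are arbitrary):

```python
# Empirical check: a bootstrap sample of size n (drawn with replacement)
# covers on average about 1 - 1/e ~ 63.2% of the distinct original examples.
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 200
coverage = [len(np.unique(rng.integers(0, n, size=n))) / n for _ in range(trials)]
print("empirical coverage:", np.mean(coverage))    # ~0.632
print("theoretical value:", 1 - (1 - 1 / n) ** n)  # -> 1 - 1/e as n grows
```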
  15. Methods of Constructing Ensembles - 2 2. Cross-validated Committees  Construct training sets by leaving out disjoint subsets of the training data  Idea similar to k-fold cross validation 3. Weighted Training Examples  Maintain a set of weights over the training examples. At each iteration the weights are changed to place more emphasis on misclassified examples (AdaBoost)
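A minimal AdaBoost sketch with scikit-learn; the dataset and hyperparameters are illustrative assumptions, and the library's default weak learner is a depth-1 decision stump:

```python
# AdaBoost: after each round the training-example weights are increased on
# the points the current weak learner misclassified, so the next learner
# focuses on them. Dataset and hyperparameters are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Default weak learner: a depth-1 decision stump.
ada = AdaBoostClassifier(n_estimators=100, random_state=0)
print("AdaBoost CV accuracy:", cross_val_score(ada, X, y, cv=5).mean())
```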
  16. Methods of Constructing Ensembles - 3 4. Manipulating Input Features  Works if the input features are highly redundant (e.g., down sampling FFT bins) 5. Manipulating Output Targets 6. Injecting Randomness
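One common way to manipulate input features is the random-subspace idea: train each base model on a random subset of the features. A sketch using scikit-learn's BaggingClassifier (parameter values are illustrative assumptions):

```python
# Random-subspace style ensemble: every base tree is trained on all examples
# but only a random half of the input features. Parameter values are
# illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

subspace = BaggingClassifier(
    n_estimators=50,
    max_features=0.5,  # each base tree sees a random 50% of the features
    bootstrap=False,   # keep all training examples; vary only the features
    random_state=0,
)
print("random-subspace CV accuracy:", cross_val_score(subspace, X, y, cv=5).mean())
```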
  17. Variance and Bias  Bias is due to systematic differences between the model and the true function.  Variance represents the sensitivity of the model to the individual data points it was trained on.
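A rough simulation of this behaviour under an assumed setup (noisy samples of a known function, polynomial fits with NumPy): low-degree fits show high bias and low variance, while high-degree fits show the reverse.

```python
# Bias-variance illustration under an assumed setup: fit polynomials of
# several degrees to many noisy samples of a known function and compare the
# squared bias and variance of their predictions on a fixed test grid.
import numpy as np

rng = np.random.default_rng(0)
def true_f(x):
    return np.sin(2 * np.pi * x)
x_test = np.linspace(0, 1, 50)

def bias_variance(degree, n_datasets=200, n_points=50, noise=0.3):
    preds = []
    for _ in range(n_datasets):
        x = rng.uniform(0, 1, n_points)
        y = true_f(x) + rng.normal(0, noise, n_points)
        preds.append(np.polyval(np.polyfit(x, y, degree), x_test))
    preds = np.array(preds)
    bias2 = np.mean((preds.mean(axis=0) - true_f(x_test)) ** 2)  # systematic error
    variance = np.mean(preds.var(axis=0))                        # spread across datasets
    return bias2, variance

for degree in (1, 3, 9):
    b2, var = bias_variance(degree)
    print(f"degree {degree}: bias^2 = {b2:.3f}, variance = {var:.3f}")
```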
  18.-20. Variance and Bias  [Figure-only slides illustrating the bias-variance tradeoff; no recoverable text]