Lecture Applied Data Science: Learning Process and Bias–Variance Tradeoff
Lecture "Applied data science: Learning process and Bias – variance tradeoff" includes content: learning process, bias – variance tradeoff, variance tradeoff, bias variance,... We invite you to consult!
Text content: Lecture Applied Data Science: Learning Process and Bias–Variance Tradeoff
- Applied Data Science (Sonpvh, 2022)
- Course outline:
1. Introduction
2. Application
3. EDA
4. Learning Process
5. Bias – Variance TradeOff
6. Regression
7. Classification
8. Validation
9. Regularization
10. Clustering
11. Evaluation
12. Deployment
13. Ethics
- [Diagram: the basic learning setup, Learning From Data – Yaser [1]] An UNKNOWN TARGET FUNCTION f: 𝒳 → Υ generates the training sample (x₁, y₁), (x₂, y₂), …. A learning algorithm 𝓐 searches a hypothesis set ℋ and selects a final hypothesis g: 𝒳 → Υ such that g(x) ≈ f(x).
- [Diagram: the learning setup applied to a lending funnel: AWARENESS → INTEREST → LEAD FORM → TELESALE → ELIGIBILITY (GOOD vs BAD) → DISBURSED] Inputs x₁, x₂, …, x_N are drawn from a probability distribution P on 𝒳, with features such as age, salary, job status, household size, etc. The target f: 𝒳 → Υ labels each customer GOOD vs BAD (y = 1/0); a key question is what eligibility means, i.e. the label definition. From the training sample (x₁, y₁), (x₂, y₂), … the learning algorithm 𝓐 selects from the hypothesis set ℋ a final hypothesis g: 𝒳 → Υ with g(x) ≈ f(x).
- Learning purpose: g(x) ≈ f(x). But what does "g(x) ≈ f(x)" mean? ⟹ we need an ERROR MEASURE e(f, g):
✓ Binary error: e(f, g) = ⟦f(x) ≠ g(x)⟧
✓ Squared error: e(f, g) = (f(x) − g(x))²
The right error measure depends on the use case, e.g. a supermarket verifying customers for a discount vs. the CIA verifying identity for security. Learning From Data – Yaser [1]
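A minimal sketch (my own example, not from the slides; the function names are hypothetical) of computing the two error measures over a sample:

```python
import numpy as np

def binary_error(f_x, g_x):
    """Average binary error: fraction of points where the prediction disagrees with the target."""
    return np.mean(np.asarray(f_x) != np.asarray(g_x))

def squared_error(f_x, g_x):
    """Average squared error between target values and predictions."""
    f_x = np.asarray(f_x, dtype=float)
    g_x = np.asarray(g_x, dtype=float)
    return np.mean((f_x - g_x) ** 2)

# Toy example: target values vs. predictions
f_vals = [1, 0, 1, 1]
g_vals = [1, 1, 1, 0]
print(binary_error(f_vals, g_vals))   # 0.5
print(squared_error(f_vals, g_vals))  # 0.5
```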
- The target is an UNKNOWN TARGET DISTRIBUTION: a target function f: 𝒳 → Υ plus noise, with inputs drawn from a probability distribution P on 𝒳.
✓ In-sample error: E_in = 𝔼[e(f, g)] for x in the training set (x_train)
✓ Out-of-sample error: E_out = 𝔼[e(f, g)] for x in the test set (x_test)
g(x) ≈ f(x) ⟹ E_in ≈ 0; learning is feasible when E_in ≈ E_out. Learning From Data – Yaser [1]
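A small sketch (my own example, not from the slides) of estimating E_in and E_out by holding out a test set and fitting a simple least-squares line:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unknown target plus noise: f(x) = sin(pi * x) + Gaussian noise
X = rng.uniform(-1, 1, size=200)
y = np.sin(np.pi * X) + rng.normal(scale=0.1, size=X.shape)

# Training sample and held-out test set
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Hypothesis set H1: h(x) = a*x + b, fitted by least squares
a, b = np.polyfit(X_train, y_train, deg=1)
g = lambda x: a * x + b

E_in = np.mean((g(X_train) - y_train) ** 2)   # squared error on the training set
E_out = np.mean((g(X_test) - y_test) ** 2)    # squared error on the held-out test set
print(f"E_in  = {E_in:.3f}")
print(f"E_out = {E_out:.3f}")
```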
- Questions to answer before learning: What are the purposes and measurement metrics? What is the target population? How are target / non-target defined? What are the use cases and exclusions? What/where/when/how should data be collected?
[Diagram: the learning-from-data setup (unknown target distribution, target function f: 𝒳 → Υ plus noise, probability distribution P on 𝒳, training sample, error measure e(), hypothesis set ℋ, learning algorithm 𝓐, final hypothesis g) overlaid on the process stages Business Understanding, Data Understanding, Data Preparation, Data Unify, Modeling, Evaluation, Deployment, with Data Governance underneath: the learning process.]
- Learning purpose:
✓ E_in ≈ 0 (Approximation)
✓ E_in ≈ E_out (Generalization)
With probability ≥ 1 − δ: E_out(g) − E_in(g) ≤ Ω(ℋ, N, δ), where ℋ reflects model complexity, N is the sample size, and 1 − δ is the confidence requirement.
Approximation–generalization trade-off: a more complex ℋ gives a better chance of approximating f; a less complex ℋ gives a better chance of generalizing out of sample. Learning From Data – Yaser [1]
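For reference, one concrete form of Ω(ℋ, N, δ) is the VC generalization bound from Learning From Data [1]; it is restated here from memory as a sketch, so check it against the book:

$$
E_{\text{out}}(g) \;\le\; E_{\text{in}}(g) + \sqrt{\frac{8}{N}\,\ln\frac{4\,m_{\mathcal{H}}(2N)}{\delta}} \quad \text{with probability} \ge 1-\delta,
$$

where $m_{\mathcal{H}}$ is the growth function of the hypothesis set: richer hypothesis sets have a larger growth function, hence a looser bound.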
- Toy example: target y = f(x) = sin(πx). Compare two hypothesis sets:
H0: h(x) = b (a constant) vs H1: h(x) = ax + b (a line).
Which is better, given the learning purpose E_in ≈ 0 and E_in ≈ E_out? (In the textbook example each training dataset contains just two points.) Learning From Data – Yaser [1]
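A small sketch (my own, using the two-point sample size of the textbook example in [1]) of fitting both hypothesis sets to one dataset:

```python
import numpy as np

rng = np.random.default_rng(1)

# One dataset D: two points sampled from the target f(x) = sin(pi * x)
x = rng.uniform(-1, 1, size=2)
y = np.sin(np.pi * x)

# H0: h(x) = b  -> the best constant under squared error is the mean of the targets
b0 = y.mean()

# H1: h(x) = a*x + b  -> the line through the two points
a1, b1 = np.polyfit(x, y, deg=1)

print(f"H0 fit: h(x) = {b0:.3f}")
print(f"H1 fit: h(x) = {a1:.3f} * x + {b1:.3f}")
```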
- [Figure: for H0 (h(x) = b) and H1 (h(x) = ax + b), two different datasets k1 and k2 produce two different fitted hypotheses g1 and g2; the final hypothesis depends on which dataset was drawn. Learning From Data – Yaser [1]]
- "Approximation" of ℋ: BIAS. [Figure: for H0 (h(x) = b) and H1 (h(x) = ax + b), averaging the fitted hypotheses over many datasets gives the average hypothesis ḡ; its error against f is E0 = 0.5 for H0 and E1 = 0.2 for H1. Learning From Data – Yaser [1]]
- Who is the winner? Learning From Data – Yaser [1]
- Bias–variance decomposition (squared error, noiseless target):
E_out = 𝔼_D[(g^(D) − f)²] = 𝔼[((g^(D) − ḡ) + (ḡ − f))²] = …
E_out = 𝔼[(g^(D) − ḡ)²] + 𝔼[(ḡ − f)²] = VARIANCE + BIAS
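The elided step is standard bias–variance algebra (not shown on the slide): expanding the square, the cross term vanishes because ḡ = 𝔼_D[g^(D)],

$$
\mathbb{E}_D\big[(g^{(D)} - f)^2\big]
= \mathbb{E}_D\big[(g^{(D)} - \bar g)^2\big] + (\bar g - f)^2
+ 2(\bar g - f)\,\underbrace{\mathbb{E}_D\big[g^{(D)} - \bar g\big]}_{=\,0},
$$

and taking the expectation over x then gives E_out = variance + bias.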
- Learning purpose: E_in ≈ 0 and E_in ≈ E_out. ERROR = BIAS + VARIANCE, and the balance depends on the model complexity of ℋ.
[Figure: learning curves of expected error vs. number of data points. SIMPLE MODEL: E_in and E_out start close together and converge quickly. COMPLEX MODEL: E_in is lower, but the gap between E_out and E_in is larger and closes more slowly. Learning From Data – Yaser [1]]
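A sketch of how such learning curves can be produced empirically (my own example, assuming scikit-learn is available; the slide itself shows the textbook figure):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(300, 1))
y = np.sin(np.pi * X).ravel() + rng.normal(scale=0.2, size=300)

for name, degree in [("simple (degree 1)", 1), ("complex (degree 10)", 10)]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    sizes, train_scores, test_scores = learning_curve(
        model, X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5,
        scoring="neg_mean_squared_error")
    E_in = -train_scores.mean(axis=1)    # in-sample error vs. training-set size
    E_out = -test_scores.mean(axis=1)    # cross-validated out-of-sample error
    print(name, "E_in:", np.round(E_in, 3), "E_out:", np.round(E_out, 3))
```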
- With a noisy target, y = f(x) + noise where f(x) = sin(πx):
ERROR = BIAS + VARIANCE + NOISE
E_out = 𝔼[(ḡ − f)²] + 𝔼_D[(g^(D) − ḡ)²] + 𝔼[(y − f)²]
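A Monte Carlo sketch (my own, following the two-point textbook experiment; the noise level sigma is an assumed value, not given on the slide) that estimates bias, variance, and noise for the two hypothesis sets:

```python
import numpy as np

rng = np.random.default_rng(3)
f = lambda x: np.sin(np.pi * x)
sigma = 0.1                                  # assumed noise level
n_datasets, n_points = 5000, 2
x_grid = np.linspace(-1, 1, 201)             # points where errors are measured

fits = {"H0": [], "H1": []}
for _ in range(n_datasets):
    x = rng.uniform(-1, 1, size=n_points)
    y = f(x) + rng.normal(scale=sigma, size=n_points)
    fits["H0"].append(np.full_like(x_grid, y.mean()))   # constant fit h(x) = b
    a, b = np.polyfit(x, y, deg=1)
    fits["H1"].append(a * x_grid + b)                   # line through the two points

for name, g_d in fits.items():
    g_d = np.array(g_d)                  # shape: (n_datasets, len(x_grid))
    g_bar = g_d.mean(axis=0)             # average hypothesis g-bar(x)
    bias = np.mean((g_bar - f(x_grid)) ** 2)
    variance = np.mean((g_d - g_bar) ** 2)
    noise = sigma ** 2                   # E[(y - f)^2] for Gaussian noise
    print(f"{name}: bias={bias:.3f}  variance={variance:.3f}  noise={noise:.3f}")
```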
- Methodology? Hypotheses and suggested proxy variables; which kind of data should be collected?
[Diagram: the learning-from-data setup (unknown target distribution, target function f: 𝒳 → Υ plus noise, probability distribution P on 𝒳, training sample, error measure e(), hypothesis set ℋ, learning algorithm 𝓐, final hypothesis g) overlaid on the process stages Business Understanding, Data Understanding, Data Preparation, Data Unify, Data Analysis, Modeling, Evaluation, Deployment: the learning process.]
- [Figure: two process diagrams compared side by side: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment vs. a variant that adds Data Unify and Data Analysis stages. Data Science for Business book, 2013 [2]]
- Key takeaways:
1. The business team should be involved in almost every part of the ML lifecycle.
2. Business problems vs. data problems vs. data unification.
3. Data unification provides flexibility.
4. Non-hypothesis-driven data analysis ("boiling the ocean") vs. hypothesis-driven analysis; Hypothesis ⇌ Data analysis.
5. EDA is conducted at each stage of the ML lifecycle.
6. Big data improves learning quality, but business understanding is the key.
- References:
[1] Yaser S. Abu-Mostafa, Learning From Data, California Institute of Technology.
[2] Foster Provost & Tom Fawcett, Data Science for Business, 2013.