Applied Data Science
Sonpvh, 2022
1. Introduction
2. Application
3. EDA
4. Learning Process
5. Bias-Variance Tradeoff
6. Regression
7. Classification
8. Validation
9. Regularization
10. Clustering
11. Evaluation
12. Deployment
13. Ethics
[Diagram: Good/Bad user classification, loan for a specific purpose]
Loan factors (signs as on the slide): - Loan amount, + Interest, +/- Duration
Diagram elements: Label Definition, Label Collection, Data Modeling, Test, Evaluation Metrics, Benchmarking, Deployment - Monitor
DATA MODELING "BLACKBOX": Hypothesis, Algorithm, Data (Labels, Features)
Output: GOOD / BAD
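
The loan example above can be sketched end to end in a few lines. This is a minimal illustration, not the deck's method. The loans are made-up toy data, the `days_past_due` field and the 90-day label rule are assumptions, and a single cut-off on loan amount stands in for the "black box" model.

```python
from dataclasses import dataclass


@dataclass
class Loan:
    amount: float        # loan amount, arbitrary units
    interest: float      # yearly interest rate
    duration: int        # months
    days_past_due: int   # worst delinquency observed (hypothetical field)


def define_label(loan, bad_after_days=90):
    """Label definition (assumed rule): BAD if ever 90+ days past due."""
    return "BAD" if loan.days_past_due >= bad_after_days else "GOOD"


def predict(cut, loan):
    """Stand-in model: flag loans at or above the amount cut-off as BAD."""
    return "BAD" if loan.amount >= cut else "GOOD"


def fit_threshold(labelled):
    """Pick the amount cut-off with the best accuracy on the training set."""
    best_cut, best_acc = None, -1.0
    for cut in sorted({loan.amount for loan, _ in labelled}):
        acc = sum(predict(cut, loan) == y for loan, y in labelled) / len(labelled)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut


def evaluate(cut, labelled):
    """Evaluation metrics on held-out data, treating BAD as the positive class."""
    tp = fp = fn = tn = 0
    for loan, y in labelled:
        p = predict(cut, loan)
        if p == "BAD" and y == "BAD":
            tp += 1
        elif p == "BAD" and y == "GOOD":
            fp += 1
        elif p == "GOOD" and y == "BAD":
            fn += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / len(labelled),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy data, invented for illustration only.
train_loans = [
    Loan(2.0, 0.10, 12, 0), Loan(3.0, 0.12, 12, 10), Loan(4.0, 0.15, 24, 0),
    Loan(7.0, 0.20, 24, 120), Loan(8.0, 0.22, 36, 95), Loan(9.0, 0.25, 36, 30),
]
test_loans = [
    Loan(2.5, 0.10, 12, 0), Loan(7.5, 0.20, 24, 100),
    Loan(8.5, 0.20, 24, 0), Loan(5.0, 0.18, 18, 91),
]

train_set = [(loan, define_label(loan)) for loan in train_loans]  # label collection
test_set = [(loan, define_label(loan)) for loan in test_loans]

cut = fit_threshold(train_set)      # data modeling
metrics = evaluate(cut, test_set)   # test + evaluation metrics
print(cut, metrics)
```

In practice the label rule comes from the business definition of a bad user, and the same metrics would keep being tracked after deployment as part of monitoring.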
[Diagram: data science process]
Process steps: Business Understanding, Data Understanding, Data Unify, Data Analysis, Data Preparation, Modeling, Evaluation, Deployment
BUSINESS UNDERSTANDING: Purposes, Target distribution, Target/non-target definition, Use-case constraints
DATA MODELING "BLACKBOX": Hypothesis, Algorithm, Data (Labels, Features)
EVALUATION and DEPLOYMENT: TESTs, MONITORs
Validation System
[Diagram, after Studies on the Validation of Internal Rating Systems [1]]
Supervisory Examination
Validation of Output: Model Design, Backtest, Benchmarking
Validation of Process: Data Quality, Problems & reports, Business use case
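
As a rough illustration of the backtest item, the sketch below compares the mean predicted default probability with the observed default rate per rating grade. The record layout `(grade, predicted_pd, defaulted)`, the grade names, and the numbers are all assumptions made up for this example and are not taken from [1].

```python
def backtest_by_grade(records):
    """Summarise predicted vs. observed default rates per rating grade.

    records: iterable of (grade, predicted_pd, defaulted) tuples,
    a hypothetical layout chosen for this sketch.
    """
    totals = {}
    for grade, predicted_pd, defaulted in records:
        t = totals.setdefault(grade, {"n": 0, "pd_sum": 0.0, "defaults": 0})
        t["n"] += 1
        t["pd_sum"] += predicted_pd
        t["defaults"] += int(defaulted)
    return {
        grade: {
            "n": t["n"],
            "mean_predicted_pd": t["pd_sum"] / t["n"],
            "observed_rate": t["defaults"] / t["n"],
        }
        for grade, t in totals.items()
    }


# Toy records, invented for illustration only.
records = [
    ("A", 0.05, False), ("A", 0.05, False), ("A", 0.05, False), ("A", 0.05, False),
    ("B", 0.25, False), ("B", 0.25, True), ("B", 0.25, False), ("B", 0.25, False),
]

report = backtest_by_grade(records)
for grade, row in report.items():
    print(grade, row)
```

A large gap between `mean_predicted_pd` and `observed_rate` for a grade is the kind of finding a backtest is meant to surface. Benchmarking would run the same comparison against an external or alternative model.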