
Applied Data Science
Sonpvh, 2022

1. Introduction
2. Application
3. EDA
4. Learning Process
5. Bias-Variance Tradeoff
6. Regression
7. Classification
8. Validation
9. Regularization
10. Clustering
11. Evaluation
12. Deployment
13. Ethics

[Figure: loan for a specific purpose, framed as Good/Bad user classification]
•Loan terms: - Loan amount, + Interest, +/- Duration
•Stages shown: Label Definition, Label Collection, Data Modeling, Test, Evaluation Metrics, Deployment - Monitor, Benchmarking
•DATA MODELING "BLACKBOX": Hypothesis, Algorithm, Data (Labels – Features)
•Output: GOOD / BAD
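
The "Test" and "Evaluation Metrics" stages can be illustrated with a minimal sketch. It assumes the model has already labelled each user GOOD or BAD, and that BAD is treated as the positive class. The function names and the toy labels below are hypothetical, chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive="BAD"):
    """Count true/false positives and negatives, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def evaluation_metrics(y_true, y_pred, positive="BAD"):
    """Accuracy, precision and recall; a metric is 0.0 when its denominator is empty."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy test set (made up for illustration, not real loan data).
y_true = ["GOOD", "GOOD", "BAD", "BAD", "GOOD", "BAD"]
y_pred = ["GOOD", "BAD", "BAD", "GOOD", "GOOD", "BAD"]
metrics = evaluation_metrics(y_true, y_pred)
print(metrics)
```

Which metric matters most depends on the business case: rejecting a GOOD user loses interest income, while accepting a BAD user risks the loan amount.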

[Figure: data science process]
•Steps: Business Understanding, Data Understanding, Data Unify, Data Analysis, Data Preparation, Modeling, Evaluation, Deployment
•BUSINESS UNDERSTANDING: Purposes, Target distribution, Target/nonTarget definition, Use cases – constraints
•DATA MODELING "BLACKBOX": Hypothesis, Algorithm, Data (Labels – Features)
•EVALUATION
•DEPLOYMENT
•TESTs, MONITORs
•Evaluation, …

Validation System
[Figure: validation system, after Studies on the Validation of Internal Rating Systems [1]]
•Components: Model Design, Backtest, Benchmarking, Data Quality, Problems & reports, Business Use case
•Supervisory Examination
•Validation of Output
•Validation of Process
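
One way to read the "Backtest" component is as a comparison between what the model predicted and what was later observed. The sketch below assumes each record carries a rating grade, a predicted probability of being a BAD user, and the realized outcome. The record layout, function name, and numbers are hypothetical and are not taken from [1].

```python
def backtest_by_grade(records):
    """Compare mean predicted BAD probability with the observed BAD rate per grade.

    records: iterable of (grade, predicted_probability, was_bad) tuples.
    """
    totals = {}
    for grade, predicted, was_bad in records:
        t = totals.setdefault(grade, {"n": 0, "pred_sum": 0.0, "bad": 0})
        t["n"] += 1
        t["pred_sum"] += predicted
        t["bad"] += int(was_bad)
    return {
        grade: {
            "n": t["n"],
            "mean_predicted": t["pred_sum"] / t["n"],
            "observed_rate": t["bad"] / t["n"],
        }
        for grade, t in totals.items()
    }


# Toy history (made up for illustration, not real data).
history = [
    ("A", 0.05, False), ("A", 0.05, False), ("A", 0.05, True), ("A", 0.05, False),
    ("B", 0.50, True), ("B", 0.50, False),
]
report = backtest_by_grade(history)
for grade, row in sorted(report.items()):
    print(grade, row)
```

A large gap between `mean_predicted` and `observed_rate` in a grade is a signal to investigate, although a sample this small would not support any conclusion on its own.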