Lecture Applied data science: Evaluation, deployment, ethics

Shared by: _ _ | Date: | File type: PDF | Pages: 19


The lecture "Applied data science: Evaluation, deployment, ethics" covers evaluation and validation components, evaluation datasets, Simpson's paradox, and some ethical issues in ML.


Text content: Lecture Applied data science: Evaluation, deployment, ethics

  1. Applied Data Science Sonpvh, 2022
  2. Outline: 1. Introduction; 2. Application; 3. EDA; 4. Learning Process; 5. Bias-Variance Trade-off; 6. Regression; 7. Classification; 8. Validation; 9. Regularization; 10. Clustering; 11. Evaluation; 12. Deployment; 13. Ethics
  3. [Diagram: good/bad user classification for a loan with a specific purpose, with GOOD/BAD labels defined from loan amount, interest and duration. Stages shown: label definition, label collection, data, modeling "black box" (hypothesis, algorithm, data as labels and features), evaluation (metrics, test, benchmarking), deployment, monitoring.]
  4. [Diagram: project cycle with the stages Business Understanding, Data Understanding, Data Unify, Data Preparation, Data Analysis, Modeling, Evaluation, Deployment.]
     BUSINESS UNDERSTANDING: • Purposes • Target distribution • Target/non-target definition • Use cases and constraints • …
     DATA
     MODELING "BLACKBOX": • Hypothesis • Algorithm • Data (labels and features) • Evaluation
     EVALUATION: tests
     DEPLOYMENT: monitors
  5. [Diagram: validation components. Supervisory examination sits above the validation system. The validation system covers model design, validation of output (backtest, benchmarking) and validation of process (data quality, problems and reports, business use case).]
     Src: Studies on the Validation of Internal Rating Systems [1]
  6. BACKTEST DATASET
     • Time-based evaluation
     • Slice-based evaluation
     • Product-based evaluation
     • Perturbation evaluation
     • What-if evaluation
     • …
     DATA QUALITY METRICS (Dr. Manjunath T.N, 2011 [3])
     • Definition conformance
     • Uniqueness
     • Completeness
     • Derivation integrity
     • Validity (consistency)
     • Accuracy
     • Accessibility
     • Timeliness
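Two of the data quality metrics listed above, completeness and uniqueness, can be sketched in a few lines. This is a minimal illustration and not the definitions used in [3]: the function names, the record layout and the sample loan records are all assumptions invented for the example.

```python
from typing import Any


def completeness(records: list[dict[str, Any]], field: str) -> float:
    """Share of records whose `field` is present and neither None nor empty."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def uniqueness(records: list[dict[str, Any]], key: str) -> float:
    """Share of distinct values among the non-missing values of `key`."""
    values = [r[key] for r in records if r.get(key) is not None]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


# Hypothetical loan records, invented for illustration only.
records = [
    {"loan_id": 1, "income": 1200, "age": 31},
    {"loan_id": 2, "income": None, "age": 45},
    {"loan_id": 2, "income": 900, "age": 28},   # duplicated loan_id
    {"loan_id": 4, "income": 1500},             # age missing
]

income_completeness = completeness(records, "income")
loan_id_uniqueness = uniqueness(records, "loan_id")
```

A value well below 1.0 on either metric would be a reason to inspect the data before trusting any backtest built on it.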
  7. Time-based evaluation: train on an earlier period, test on a later one (TRAIN, then TEST, along the time axis).
     Product-based evaluation: test by product (Milk A, Milk B, Milk C, Milk D).
     Slice-based evaluation:
     • Important segmentations: age, gender, location, …
     • Models perform differently on different times/products/segments
     • Models perform the same with different cost
     • Simpson's paradox
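A time-based split followed by a per-slice score could look like the sketch below. It is only an illustration under stated assumptions: the row layout (`date`, `product`, `label`, `pred`), the cutoff date and the sample rows are hypothetical, and accuracy stands in for whatever metric the use case calls for.

```python
from collections import defaultdict
from datetime import date


def time_based_split(rows, cutoff):
    """Train on rows strictly before `cutoff`, test on rows at or after it."""
    train = [r for r in rows if r["date"] < cutoff]
    test = [r for r in rows if r["date"] >= cutoff]
    return train, test


def accuracy_by_slice(rows, slice_key):
    """Accuracy computed separately for each value of `slice_key`."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for r in rows:
        totals[r[slice_key]] += 1
        hits[r[slice_key]] += int(r["label"] == r["pred"])
    return {k: hits[k] / totals[k] for k in totals}


# Hypothetical scored rows, invented for illustration only.
rows = [
    {"date": date(2021, 11, 5), "product": "Milk A", "label": 1, "pred": 1},
    {"date": date(2021, 12, 20), "product": "Milk B", "label": 0, "pred": 1},
    {"date": date(2022, 1, 10), "product": "Milk A", "label": 1, "pred": 1},
    {"date": date(2022, 2, 3), "product": "Milk B", "label": 0, "pred": 1},
    {"date": date(2022, 2, 17), "product": "Milk B", "label": 1, "pred": 1},
    {"date": date(2022, 3, 1), "product": "Milk A", "label": 0, "pred": 0},
]

train, test = time_based_split(rows, cutoff=date(2022, 1, 1))
per_product = accuracy_by_slice(test, "product")
```

Reporting the metric per slice rather than only in aggregate is what exposes a model that looks fine overall but fails on one product or segment.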
  8.              Treatment 1      Treatment 2
     Group A     93% (81/87)      87% (234/270)
     Group B     73% (192/263)    69% (55/80)
     Overall     78% (273/350)    83% (289/350)
     Numbers from a kidney stone treatment study (Charig et al., 1986).
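The reversal in these numbers can be checked directly. The sketch below uses the counts from the slide and assumes that in Group B the 73% figure belongs to 192/263 and the 69% figure to 55/80, which is the only pairing consistent with the overall totals of 273/350 and 289/350. Variable and function names are illustrative.

```python
# (successes, patients) per group and treatment, taken from the slide.
counts = {
    "Group A": {"Treatment 1": (81, 87), "Treatment 2": (234, 270)},
    "Group B": {"Treatment 1": (192, 263), "Treatment 2": (55, 80)},
}


def success_rate(successes: int, total: int) -> float:
    return successes / total


def overall(treatment: str) -> tuple[int, int]:
    """Pool successes and patients for one treatment across all groups."""
    successes = sum(group[treatment][0] for group in counts.values())
    total = sum(group[treatment][1] for group in counts.values())
    return successes, total


# Treatment 1 has the higher success rate inside every group...
wins_each_group = all(
    success_rate(*g["Treatment 1"]) > success_rate(*g["Treatment 2"])
    for g in counts.values()
)
# ...yet the lower success rate once the groups are pooled.
wins_overall = (
    success_rate(*overall("Treatment 1")) > success_rate(*overall("Treatment 2"))
)
```

The pooled comparison flips because the treatments were applied to very different mixes of cases: 270 of the 350 Treatment 2 patients are in Group A, where success rates are higher for both treatments.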
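  9. [Formula: f(·) = Prob(·), combining bank data with 3rd-party data; the arguments were images on the slide.]
     Criteria for 3rd-party data:
     ▪ Coverage
     ▪ Regulatory compliance
     ▪ Stability
     ▪ Timeliness
     ▪ Predictive power
     ▪ Orthogonality
     ▪ …
     ICCR: International Committee on Credit Reporting, 2018 [5]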
  10. Internal benchmarking [diagram: performance against resources (time/infrastructure)]: do nothing; heuristic rules / simple model; Model A; Model B; Model C.
      External benchmarking: Model A rating against an external rating or an "expert" model.
      • Common users across platforms
      • Mapping onto a master scale
      • …
      Src [1]
  11. Rolling out a new model [diagram: performance over time]:
      1. Baseline model (A)
      2. Model B passes the backtest
      3. A percentage of traffic is routed to the new model
      4. Always keep a percentage of traffic on A
      5. Monitor for any statistically significant difference
      Shut down B and debug.
      • A/A testing and A/B/C … testing
      • A model cannot improve performance from nothing to perfect, but it can make it better
      • Distribution drift
      Stages on the timeline: shadow, canary testing, A/B testing. Traffic split over time: A 90% / B 10%, then A 50% / B 50%, then B 90%.
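One common way to check for a statistically significant difference between A and B is a two-proportion z-test on a success metric such as conversion or repayment. The slide does not name a test, so this choice, the function name and the traffic counts below are assumptions made for illustration.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference in success rates B - A."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # All outcomes identical in both arms: no evidence of a difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical canary stage: 90% of traffic on A, 10% on B (made-up counts).
z_ab, p_ab = two_proportion_z_test(success_a=450, n_a=9000, success_b=70, n_b=1000)

# A/A sanity check: two arms served by the same model with identical outcomes.
z_aa, p_aa = two_proportion_z_test(success_a=450, n_a=9000, success_b=450, n_b=9000)
```

With these made-up counts the A/B comparison falls below the usual 0.05 threshold while the A/A comparison shows no difference at all. Running the A/A check first helps confirm that the routing and measurement pipeline does not itself create spurious differences.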
  12. 1. The model user has primary responsibility for validation.
      2. Validation is fundamentally about assessing the predictive ability and the use of the model.
      3. Validation is an iterative process.
      4. There is no single validation method; it depends …
      5. Validation should encompass both quantitative and qualitative elements.
      6. Don't benchmark against "do nothing"; benchmark against "do something".
      7. A/A testing and A/B testing: always keep a percentage of traffic on the baseline model A.
      Src: Studies on the Validation of Internal Rating Systems [1]
  13. 1. Explainable AI (XAI)
      2. Biased AI
      3. Algorithm-driven decision-making systems [6]
      4. Trust and transparency [7]
      5. Data sovereignty [8]
      Ethics is a complex and difficult topic; the purpose here is only to raise the issues.
  14. Inputs (demographics, income, credit history, job status, …) feed an AI appraisal that decides: approve or not.
      • Why was a loan approved or denied?
      • Why was a resume selected or not?
      • Why was a transaction flagged as suspicious or not?
      • …
      Anand S. Rao, 2020 [9]
  15. Gender classification via face recognition. Google Photos, 2015.
      Word2vec learns semantic/syntactic relationships:
      • King - man + woman = queen
      • Bananas - banana + apple = apples
      What if words also keep company with stereotypes and biases?
      • Doctor - man + woman = nurse
      • Computer programmer - man + woman = homemaker
      • Occupations most similar to "he": maestro, skipper, philosopher, captain, architect, financier, warrior, magician
      • Occupations most similar to "she": homemaker, nurse, receptionist, librarian, hairdresser, bookkeeper, housekeeper
      Src: http://gendershades.org
      Should read: Bias in the Vision and Language of AI, Margaret Mitchell, Senior Research Scientist, Google AI, 2019
  16. • Self-driving cars. Src: The ethics of self-driving cars - what would you do?, WeForum, 2016 [12]. Src: When self-driving cars drive the ethical questions, 2015 [13]
      • AI in the court of law, Unesco [9]
      • AI-created art
      • AI doctor
      • …
  17. Src: customer-data-designing-for-transparency-and-trust, 2015
  19. References:
      1. Studies on the Validation of Internal Rating Systems, Bank for International Settlements, May 2005
      2. https://www.shutterstock.com/search/pregnancy+trimester+icon
      3. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.206.2673&rep=rep1&type=pdf
      4. http://dinus.ac.id/repository/docs/ajar/Kenneth_C._Laudon,Jane_P_._Laudon_--_Management_Information_System_12th_Edition_.pdf
      5. International Committee on Credit Reporting (ICCR), "Use of Alternative Data to Enhance Credit Reporting to Enable Access to Digital Financial Services by Individuals and SMEs Operating in the Informal Economy", June 2018
      6. https://doi.org/10.1057/s41599-020-0501-9
      7. https://hbr.org/2015/05/customer-data-designing-for-transparency-and-trust
      8. https://www.linkedin.com/pulse/current-issues-data-sovereignty-its-impact-security-jeannot-phd/?_ga=2.212370708.1069714253.1648324667-840471894.1648324667
      9. https://en.unesco.org/artificial-intelligence/ethics/cases
      10. https://towardsdatascience.com/five-critical-questions-to-explain-explainable-ai-e0c40bdca368
      11. https://web.stanford.edu/class/archive/cs/cs224n/cs224n.1194/slides/cs224n-2019-lecture19-bias.pdf
      12. https://www.weforum.org/agenda/2016/08/the-ethics-of-self-driving-cars-what-would-you-do/
      13. https://techxplore.com/news/2015-10-self-driving-cars-ethical.html