Design cycle
Data -> Feature selection -> Model selection -> Learning -> Evaluation
The selection steps require prior knowledge.

Model selection
- What is the right model to learn? E.g., which polynomial to use?
- Prior knowledge helps a lot, but there is still a lot of guessing.
- Initial data analysis and visualization: we can make a good guess about the form of the distribution or the shape of the function.
- Overfitting problem: take into account the bias and variance of error estimates.
  - Simpler (more biased) model: parameters can be estimated more reliably (smaller variance of estimates).
  - Complex model with many parameters: parameter estimates are less reliable (large variance of the estimates).

Solutions for overfitting
How to make the learner avoid overfitting?
- Ensure a sufficient number of samples in the training set.
  - May not be possible (small number of examples).
- Hold some data out of the training set (the validation set).
  - Train (fit) on the training set without the held-out data.
  - Check the generalization error on the validation set and choose the model based on the validation set error (random resampling validation techniques).
- Regularization (Occam's Razor).
  - Penalize model complexity (number of parameters).
  - Explicit preference towards simple models.
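The hold-out idea above can be sketched in a few lines. This is a minimal illustration, not the lecture's own code: the noisy cubic data, the 40/20 split, and the range of polynomial degrees are all assumptions made for the example, and it relies on NumPy's `polyfit` and `polyval`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a noisy cubic. The learner does not know the true form.
x = rng.uniform(-1.0, 1.0, size=60)
y = 1.0 - 2.0 * x + 3.0 * x**3 + rng.normal(scale=0.2, size=x.size)

# Hold some data out of the training set to form the validation set.
idx = rng.permutation(x.size)
train, valid = idx[:40], idx[40:]


def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))


degrees = range(1, 9)
train_err, valid_err = {}, {}
for d in degrees:
    # Fit on the training set only (without the held-out data).
    coeffs = np.polyfit(x[train], y[train], deg=d)
    train_err[d] = mse(coeffs, x[train], y[train])
    valid_err[d] = mse(coeffs, x[valid], y[valid])

# Choose the model based on the validation set error.
best_degree = min(valid_err, key=valid_err.get)
print("degree chosen by validation error:", best_degree)
```

The training error can only shrink as the degree grows, so it cannot be used to pick the model; the validation error is what exposes an overfit.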

Parameter estimation. Coin example.
Coin example: we have a coin that can be biased.
- Outcomes: two possible values, head or tail.
- Data: D, a sequence of outcomes $x_i$ such that $x_i = 1$ for a head and $x_i = 0$ for a tail.
- Model: the probability of a head is $\theta$; the probability of a tail is $(1 - \theta)$.
- Objective: we would like to estimate the probability of a head, $\hat{\theta}$, from data.

Parameter estimation. Example.
Assume an unknown and possibly biased coin. The probability of a head is $\theta$.
Data: H H T T H H T H T H T T T H T H H H H T H H H H T
Heads: 15, Tails: 10
What would be your estimate of the probability of a head? $\tilde{\theta} = ?$

Parameter estimation. Example (solution).
With 15 heads and 10 tails in 25 flips, what would be your choice of the probability of a head?
Solution: use the frequencies of occurrences to do the estimate:
$\tilde{\theta} = \frac{15}{25} = 0.6$
This is the maximum likelihood estimate of the parameter $\theta$.

Probability of an outcome
With $x_i = 1$ for a head, $x_i = 0$ for a tail, and $\theta$ the probability of a head:
- Assume: we know the probability $\theta$.
- Probability of an outcome $x_i$ of a coin flip:
  $P(x_i \mid \theta) = \theta^{x_i} (1 - \theta)^{(1 - x_i)}$ (Bernoulli distribution)
- The formula combines the probability of a head and a tail so that $x_i$ picks its correct probability:
  - it gives $\theta$ for $x_i = 1$;
  - it gives $(1 - \theta)$ for $x_i = 0$.
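As a quick check of the frequency estimate and the Bernoulli formula, the sketch below encodes the flip sequence from the example as 0/1 values. The names `flips`, `theta_ml`, and `bernoulli_pmf` are illustrative and not part of the lecture.

```python
# Flip sequence from the example, encoded as head = 1, tail = 0.
flips = "H H T T H H T H T H T T T H T H H H H T H H H H T".split()
x = [1 if f == "H" else 0 for f in flips]

n_heads = sum(x)
n_tails = len(x) - n_heads

# Frequency-based (maximum likelihood) estimate of the probability of a head.
theta_ml = n_heads / len(x)


def bernoulli_pmf(x_i, theta):
    """P(x_i | theta) = theta^x_i * (1 - theta)^(1 - x_i)."""
    return theta**x_i * (1 - theta) ** (1 - x_i)


print(n_heads, n_tails, theta_ml)
print(bernoulli_pmf(1, theta_ml), bernoulli_pmf(0, theta_ml))
```

The exponents act as a switch: for $x_i = 1$ the second factor becomes 1, and for $x_i = 0$ the first factor becomes 1.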

Probability of a sequence of outcomes.
With $x_i = 1$ for a head, $x_i = 0$ for a tail, and $\theta$ the probability of a head:
- Assume: a sequence of independent coin flips, D = H H T H T H (encoded as D = 110101).
- What is the probability of observing the data sequence D?
  $P(D \mid \theta) = \theta \, \theta \, (1 - \theta) \, \theta \, (1 - \theta) \, \theta$
Because the flips are independent, the probability of the sequence is the product of the probabilities of the individual outcomes.
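The product over independent flips can be written out directly. This is a small sketch under the independence assumption stated above; the function name `sequence_likelihood` and the trial values of theta are chosen only for illustration.

```python
import math

# D = H H T H T H, encoded as head = 1, tail = 0.
sequence = "HHTHTH"
encoded = [1 if c == "H" else 0 for c in sequence]


def sequence_likelihood(xs, theta):
    """P(D | theta) for independent flips: product of Bernoulli terms."""
    return math.prod(theta**x_i * (1 - theta) ** (1 - x_i) for x_i in xs)


# Four heads and two tails, so P(D | theta) = theta^4 * (1 - theta)^2.
for theta in (0.3, 0.5, 0.7):
    print(theta, sequence_likelihood(encoded, theta))
```

Only the counts of heads and tails matter for the value of the product, not the order of the flips.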

The goodness of fit to the data.
- Learning: we do not know the value of the parameter $\theta$.
- Our learning goal: find the parameter $\theta$ that fits the data D the best.
- One solution to the "best": maximize the likelihood
  $P(D \mid \theta) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{(1 - x_i)}$
- Intuition: the more likely the data are given the model, the better the fit.
- Note: instead of an error function that measures how badly the data fit the model, we have a measure that tells us how well the data fit:
  $\mathrm{Error}(D, \theta) = -P(D \mid \theta)$

Example: Bernoulli distribution.
Coin example: we have a coin that can be biased.
- Outcomes: two possible values, head or tail.
- Data: D, a sequence of outcomes $x_i$ such that $x_i = 1$ for a head and $x_i = 0$ for a tail.
- Model: the probability of a head is $\theta$; the probability of a tail is $(1 - \theta)$.
- Objective: we would like to estimate the probability of a head, $\hat{\theta}$.
- Probability of an outcome $x_i$:
  $P(x_i \mid \theta) = \theta^{x_i} (1 - \theta)^{(1 - x_i)}$ (Bernoulli distribution)
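The "maximize the likelihood" criterion can be checked numerically for the coin data with 15 heads and 10 tails. The sketch below is an illustration, not the lecture's derivation: it searches a grid of candidate values and maximizes the logarithm of the likelihood, which is a swapped-in convenience that has the same maximizer as the likelihood itself. The grid resolution is an arbitrary choice.

```python
import math

# Counts from the coin example: 15 heads and 10 tails.
n_heads, n_tails = 15, 10


def log_likelihood(theta):
    """log P(D | theta) = n_heads * log(theta) + n_tails * log(1 - theta)."""
    return n_heads * math.log(theta) + n_tails * math.log(1 - theta)


# Candidate values strictly inside (0, 1), so both logarithms are defined.
grid = [k / 1000 for k in range(1, 1000)]
theta_best = max(grid, key=log_likelihood)

# The corresponding error, Error(D, theta) = -P(D | theta), is minimized there.
error_at_best = -math.exp(log_likelihood(theta_best))

print(theta_best, error_at_best)
```

On this grid the maximizer coincides with the frequency estimate 15/25, which is what the maximum likelihood solution for the Bernoulli model predicts.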
