The Optimal Algorithm. 7. Algorithm-Independent Learning. No Free Lunch theorem. Aleix M. Martinez
The Optimal Algorithm
7. Algorithm-Independent Learning
Aleix M. Martinez
Handouts for ECE 874

In this course we have defined a large number of PR algorithms. The obvious question to ask next is: which one is best? Of course, to properly answer this, we would need to define what "best" means. Let us assume we agree on that; e.g., according to the Bayes error. Even then, no PR algorithm is inherently superior to any other.

No Free Lunch theorem

Recall that we do not know the function to be learned. We do not even know if it is parametric or not, linearly separable, etc. We usually use test data to estimate the performance of an algorithm. If the training and testing sets are independent, then (averaged over all possible targets) all algorithms will perform equally poorly. If the training set is very large, then the testing set will overlap with it and we will merely be testing what we had learned (not the generalization). We will return to this key point later.

Consider the two-class problem where $D = \{x_i\}$ is the training set with labels $y_i = \pm 1$, generated by the unknown target function $F(x)$. If $F$ incorporates noise (error), then the Bayes error will be different from zero. Now, let $H$ be the (discrete) set of hypotheses and $h(x) \in H$ with prior $P(h)$. The probability for an algorithm to yield hypothesis $h$ using $D$ is given by $P(h \mid D)$. The natural error measure is the expected value of the error given $D$, summed over all possible $h$ and $F$:

$$E[\mathrm{error} \mid D] = \sum_{h,F} \sum_{x \notin D} P(x)\,\big[1 - \delta(F(x), h(x))\big]\, P(h \mid D)\, P(F \mid D),$$

where $\delta(\cdot,\cdot)$ is the Kronecker delta function. And the expected off-training-set classification error for algorithm $k$ is:

$$E_k[\mathrm{error} \mid F, n] = \sum_{x \notin D} P(x)\,\big[1 - \delta(F(x), h(x))\big]\, P_k(h(x) \mid D).$$

Theorem: No Free Lunch

For any two learning algorithms $P_1(h \mid D)$ and $P_2(h \mid D)$, the following are true, independently of the sampling distribution $P(x)$ and the number $n$ of training samples:
1. Uniformly averaged over all target functions $F$, $E_1[\mathrm{error} \mid F, n] - E_2[\mathrm{error} \mid F, n] = 0$.
2. For any fixed training set $D$, uniformly averaged over $F$, $E_1[\mathrm{error} \mid F, D] - E_2[\mathrm{error} \mid F, D] = 0$.
3. Uniformly averaged over all priors $P(F)$, $E_1[\mathrm{error} \mid n] - E_2[\mathrm{error} \mid n] = 0$.
4. For any fixed training set $D$, uniformly averaged over $P(F)$, $E_1[\mathrm{error} \mid D] - E_2[\mathrm{error} \mid D] = 0$.
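Part 1 of the theorem can be checked by brute force on a toy problem. The sketch below (my own illustration, assuming a four-point input domain and two arbitrarily chosen deterministic learners) enumerates all $2^4$ binary target functions, trains on two fixed points, and averages each learner's off-training-set error uniformly over the targets; both averages come out to exactly 1/2.

```python
from itertools import product

X = range(4)            # tiny input domain
train = [0, 1]          # points seen in training
test = [2, 3]           # off-training-set points

def alg_majority(F, x):
    # predict the majority training label (ties -> 1)
    labels = [F[i] for i in train]
    return 1 if 2 * sum(labels) >= len(labels) else 0

def alg_constant(F, x):
    # always predict 0, ignoring the data entirely
    return 0

def avg_ots_error(alg):
    # uniform average of off-training-set error over all 2**4 targets
    errs = []
    for F in product([0, 1], repeat=len(X)):
        e = sum(alg(F, x) != F[x] for x in test) / len(test)
        errs.append(e)
    return sum(errs) / len(errs)

print(avg_ots_error(alg_majority))  # 0.5
print(avg_ots_error(alg_constant))  # 0.5
```

Both a "sensible" and a "silly" learner average to chance, exactly as statement 1 predicts.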
Statement 1 says that, uniformly averaged over all target functions, the expected off-training-set error for all learning algorithms is the same. I.e., if all target functions are equally likely, then good algorithms will not outperform bad ones. Statement 2 says that even if we know $D$, the off-training-set error averaged over all target functions is the same. Statements 3 & 4 concern nonuniform target function distributions.

Problem space

No system can perform well for all possible problems. There is always a trade-off. While our hope is that we will never have to use algorithm A for certain problems, we can only hope, not guarantee, that this will be so. This stresses the point that the assumptions and prior knowledge we have about our problem are what make the difference. This is what a paper should be all about.

(Figure: an example training set of points x, the target function F, and two candidate hypotheses.)

Ugly Duckling Theorem

Similar to the No Free Lunch theorem, but for features. In the absence of assumptions, there is no privileged or "best" set of features. The notion of similarity between patterns is also determined by the assumptions, which may or may not be correct.

Predicates: any logical combination of patterns (built with AND, OR, NOT).
Venn diagrams

Each pattern represents a d-tuple of binary features $f_i$.

Rank

The rank r of a predicate is the number of the simplest (indivisible) elements it contains; e.g., $x_1$; $x_1 \vee x_2$; $x_1 \vee x_2 \vee x_3$; etc. For two features $f_1, f_2$ there are four indivisible regions:

Rank r = 1:
  $x_1 = f_1 \wedge \neg f_2$
  $x_2 = f_1 \wedge f_2$
  $x_3 = f_2 \wedge \neg f_1$
  $x_4 = \neg(f_1 \vee f_2)$

Rank r = 2:
  $x_1 \vee x_2 = f_1$
  $x_1 \vee x_3 = f_1 \oplus f_2$ (XOR)
  $x_1 \vee x_4 = \neg f_2$
  $x_2 \vee x_3 = f_2$
  $x_2 \vee x_4 = \neg(f_1 \oplus f_2)$
  $x_3 \vee x_4 = \neg f_1$

Total # of predicates over the 4 patterns: $\sum_{r=0}^{4} \binom{4}{r} = 1 + 4 + 6 + 4 + 1 = 2^4 = 16$.

Example

$f_1$ = blind in the right eye; $x_1 = \{1, 0\}$. $f_2$ = blind in the left eye; $x_2 = \{0, 1\}$. Under this representation, a totally blind person is equally dissimilar to a normally sighted one as $x_1$ is to $x_2$. Without prior knowledge, we obtain unsatisfactory results. Choosing instead $f_1'$ = blind in the right eye and $f_2'$ = same condition in both eyes encodes the same four people differently, and the pattern of similarities changes with the representation.

Similarity

We can use the number of shared predicates rather than the number of shared features. With d distinct patterns in total, the number of predicates shared by any two of them is:

$$\sum_{r=2}^{d} \binom{d-2}{r-2} = 2^{d-2}.$$

Note that this result is independent of the choice of the two $x_i$'s: any two distinct patterns are equally similar.

Theorem: Ugly Duckling

Given that we use a finite set of predicates that enables us to distinguish any two patterns under consideration, the number of predicates shared by any two such patterns is constant and independent of the choice of those patterns. Furthermore, if pattern similarity is based on the total number of predicates shared by two patterns, then any two patterns are equally similar.
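The constant $2^{d-2}$ is easy to verify by brute force. In this sketch (my own illustration, assuming four distinguishable patterns and treating every subset of patterns as one predicate, i.e., the disjunction of its members) we count, for each pair of patterns, the predicates containing both:

```python
from itertools import chain, combinations

patterns = ['x1', 'x2', 'x3', 'x4']  # d = 4 distinguishable patterns

def predicates(items):
    # every subset of the patterns is a predicate (the disjunction of its members)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def shared(a, b):
    # number of predicates satisfied by both pattern a and pattern b
    return sum(1 for p in predicates(patterns) if a in p and b in p)

counts = {(a, b): shared(a, b) for a, b in combinations(patterns, 2)}
print(counts)  # every pair shares 2**(4-2) = 4 predicates
```

Only predicates of rank $r \ge 2$ can contain two patterns, which is why the sum in the formula starts at r = 2; the count is the same for every pair, regardless of which two patterns we pick.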
Similarity

This theorem tells us that even the apparently simple notion of similarity between patterns is fundamentally based on implicit assumptions about the problem domain.

Minimum Description Length

Obviously, we can (efficiently) represent our data in many ways. Nevertheless, one is always interested in finding the smallest representation. Occam's razor: entities should not be multiplied beyond necessity => in PR: one should not use classifiers that are more complicated than necessary (where "necessary" is measured by the quality of fit).

Overfitting avoidance

If there are no problem-dependent reasons to prefer one algorithm over another, why should we prefer the simplest one? Obviously, there are problems for which overfitting avoidance is bad. In general, though, the fewer the features (or parameters), the lower the probability of error. It can be seen as eliminating the noisy features.

Algorithmic complexity

The inherent complexity of a binary string. If sender and receiver share a common language L with L(y) = x, then the cost to transmit x is (the length of) y. In general, an abstract computer takes y and outputs x. The algorithmic complexity of x is defined via the shortest program y that computes x and halts:

$$K(x) = \min_{y:\, U(y) = x} |y|,$$

with U a Turing machine.

Example: if x consists of n 1s, we need $K(x) = O(\log_2 n)$ bits (enough to represent n). To represent an arbitrary string of n bits, $K(x) = O(n)$.

MDL

In general, the members of a class have common and different features. We want to learn the essential ones. We do not want to keep redundant (overfitted) or noisy features. Minimize the sum of the model's algorithmic complexity and the description length of the training data given the model:

$$K(h, D) = K(h) + K(D \text{ using } h).$$
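$K(x)$ itself is uncomputable, but the compressed size produced by an ordinary compressor is a crude upper bound on it and makes the $O(\log n)$ vs. $O(n)$ gap tangible. A small sketch (my own illustration, using zlib's output length as the proxy; not part of the notes' formal definition):

```python
import random
import zlib

random.seed(0)
n = 10_000
regular = b"1" * n                                   # highly regular string
rand = bytes(random.getrandbits(8) for _ in range(n))  # incompressible-looking string

# compressed length serves as an upper bound on descriptive complexity
len_reg = len(zlib.compress(regular, 9))
len_rand = len(zlib.compress(rand, 9))
print(len_reg, len_rand)  # the regular string compresses dramatically better
```

The run of n identical symbols shrinks to a few dozen bytes (essentially "repeat '1' n times"), while the random string cannot be described much more compactly than by listing its bits.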
When $n \to \infty$, it can be shown that MDL converges to the ideal (true) model. However, we may not be clever enough to find that best representation in the finite case. Variations of MDL use a weighted version of the equation shown above.

Relation to Bayes:

$$p(h \mid D) = \frac{P(h)\, P(D \mid h)}{P(D)}.$$

The optimal hypothesis $h^*$ is:

$$h^* = \arg\max_h \big[P(h)\, P(D \mid h)\big] = \arg\max_h \big[\log P(h) + \log P(D \mid h)\big].$$

Mixture Models and MDL

The mixture of models introduced in section 5 requires us to define the number of clusters, C. The a posteriori probability cannot be used to find C, because it increases with it. MDL can help us select the best C. To do this we need to incorporate a penalty term, which prevents us from increasing C to unnecessary values. One early approach was based on the idea of an information criterion:

$$AIC(C, \hat{\theta}) = -\log p(y \mid C, \hat{\theta}) + L,$$

where L is the number of free parameters in the mixture model (e.g., for C components with a mean and covariance matrix in M dimensions, $L = C\,(M + M(M+1)/2)$). Unfortunately, AIC does not lead to a consistent estimator. To solve this we use MDL. Rissanen developed an approximate expression:

$$MDL(C, \hat{\theta}) = -\sum_{n=1}^{N} \log p(y_n \mid C, \hat{\theta}) + \frac{L}{2}\,\log(NM),$$

which gives us the term to be minimized over C.

Bias and Variance

How can we evaluate the quality and precision of a classifier in a particular problem?
- Bias: measures the accuracy; high bias implies a poor match.
- Variance: measures the precision; high variance implies a weak match.
It can be studied in regression and classification. The regression case is much simpler, though.
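A rough sketch of MDL-based selection of C, with several simplifications relative to the notes (all my own choices): hard k-means assignments instead of full EM, one-dimensional data (M = 1), equal mixing weights, a variance floor, and L = 2C free parameters (one mean and one variance per component). On this toy data, the two-component model minimizes the criterion:

```python
import math

def kmeans_1d(data, C, iters=25):
    # deterministic 1-D k-means: initialize means at evenly spaced quantiles
    xs = sorted(data)
    n = len(xs)
    means = [xs[int((j + 0.5) * n / C)] for j in range(C)]
    for _ in range(iters):
        clusters = [[] for _ in range(C)]
        for x in xs:
            nearest = min(range(C), key=lambda c: abs(x - means[c]))
            clusters[nearest].append(x)
        means = [sum(cl) / len(cl) if cl else means[j]
                 for j, cl in enumerate(clusters)]
    return means, clusters

def mdl_score(data, C, var_floor=0.01):
    # MDL(C) = -log-likelihood + (L/2) log N, with L = 2C parameters
    means, clusters = kmeans_1d(data, C)
    variances = [max(var_floor, sum((x - m) ** 2 for x in cl) / len(cl))
                 if cl else var_floor
                 for m, cl in zip(means, clusters)]
    def pdf(x):  # equal-weight Gaussian mixture density
        return sum(math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                   for m, v in zip(means, variances)) / C
    loglik = sum(math.log(pdf(x)) for x in data)
    L = 2 * C
    return -loglik + 0.5 * L * math.log(len(data))

# two tight clumps, around 0 and around 50
data = [c + d for c in (0.0, 50.0) for d in (0.0, 0.05, 0.10, 0.15, 0.20)]
scores = {C: mdl_score(data, C) for C in (1, 2, 3)}
best = min(scores, key=scores.get)
print(best)  # 2
```

C = 1 pays a huge likelihood cost, while C = 3 improves the fit only marginally and loses to the penalty term, which is exactly the trade-off the criterion encodes.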
Bias & variance for regression

We create an estimate $g(x; D)$ of the true but unknown function $F(x)$. For some D this approximation will be excellent, while for others it will be poor. The mean-square deviation from the true value decomposes as:

$$E_D\big[(g(x; D) - F(x))^2\big] = \underbrace{\big(E_D[g(x; D)] - F(x)\big)^2}_{\mathrm{bias}^2} + \underbrace{E_D\big[(g(x; D) - E_D[g(x; D)])^2\big]}_{\mathrm{variance}}.$$

Algorithms with more parameters (i.e., more flexibility) tend to have lower bias but higher variance => they fit the data very well, but unpredictably so. Simpler algorithms tend to have high bias but lower variance (they are more predictable). It is not always so simple, though. The best way to get both low bias and low variance is to have some knowledge of the problem (i.e., of the target function).

Resampling for Estimation

How can one calculate the bias and variance of an estimator when the target function is unknown? There are two general methods to do this:
- the leave-one-out procedure (also known as the jackknife), and
- the bootstrap.

Jackknife

It is easy to calculate the sample mean and sample variance:

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \hat{\mu})^2.$$

But this cannot be done analytically for other estimates like the median or the mode. To proceed, we compute the mean leaving the j-th sample out:

$$\mu_{(j)} = \frac{1}{n-1}\sum_{i \neq j} x_i.$$

The jackknife mean is then

$$\mu_{(\cdot)} = \frac{1}{n}\sum_{j=1}^{n} \mu_{(j)},$$

and the variance estimate is

$$\mathrm{Var}[\hat{\mu}] = \frac{n-1}{n}\sum_{j=1}^{n} \big(\mu_{(j)} - \mu_{(\cdot)}\big)^2.$$

In general, for any estimator $\hat{\theta}$:

Jackknife bias estimate: $\mathrm{bias}_{jack} = (n-1)\big(\hat{\theta}_{(\cdot)} - \hat{\theta}\big)$.

Jackknife variance estimate: $\mathrm{Var}_{jack}[\hat{\theta}] = \frac{n-1}{n}\sum_{j=1}^{n} \big(\hat{\theta}_{(j)} - \hat{\theta}_{(\cdot)}\big)^2$.
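The general formulas translate directly into code. The sketch below (my own illustration, with made-up data) applies them to the biased plug-in variance estimator, for which the jackknife bias correction is known to recover the unbiased sample variance exactly:

```python
from statistics import mean, variance  # variance() is the unbiased s**2

def plugin_var(xs):
    # biased maximum-likelihood variance estimate (divides by n)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def jackknife(xs, stat):
    # jackknife bias and variance estimates for an arbitrary statistic
    n = len(xs)
    loo = [stat(xs[:j] + xs[j + 1:]) for j in range(n)]  # leave-one-out values
    loo_mean = mean(loo)
    bias = (n - 1) * (loo_mean - stat(xs))
    var = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return bias, var

data = [3.1, 4.2, 1.7, 5.0, 2.4, 3.8]
bias, var = jackknife(data, plugin_var)
corrected = plugin_var(data) - bias
print(corrected, variance(data))  # the two values agree
```

The same `jackknife` function works unchanged for the median or the mode, which is the point: no analytic formula for the estimator's bias or variance is needed.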
Example: the mode

$D = \{0, 10, 10, 10, 20, 20\}$, $n = 6$, and $\hat{\theta} = \mathrm{mode} = 10$ (when the mode is tied, take the mean of the tied values). Deleting the 0 or a 20 leaves the leave-one-out mode at 10; deleting one of the 10s ties 10 and 20, giving a leave-one-out mode of 15. Hence:

$$\hat{\theta}_{(\cdot)} = \tfrac{1}{6}(10 + 3 \cdot 15 + 2 \cdot 10) = 12.5,$$
$$\mathrm{bias}_{jack} = 5\,(12.5 - 10) = 12.5,$$
$$\mathrm{Var}_{jack}[\hat{\theta}] = \tfrac{5}{6}\big[(10 - 12.5)^2 + 3\,(15 - 12.5)^2 + 2\,(10 - 12.5)^2\big] = 31.25,$$

i.e., a jackknife standard deviation of $\sqrt{31.25} \approx 5.6$.

Bootstrap

It is similar to the previous method, only that we randomly select n samples (with replacement) B times:

$$\hat{\theta}^{(\cdot)} = \frac{1}{B}\sum_{b=1}^{B} \hat{\theta}^{(b)}.$$

Bootstrap bias estimate: $\mathrm{bias}_{boot} = \hat{\theta}^{(\cdot)} - \hat{\theta}$.

Bootstrap variance estimate: $\mathrm{Var}_{boot}[\hat{\theta}] = \frac{1}{B}\sum_{b=1}^{B} \big(\hat{\theta}^{(b)} - \hat{\theta}^{(\cdot)}\big)^2$.

Stability of the Classifier

A classifier is unstable if small changes in D result in different classifiers with large differences in classification accuracy. The idea of resampling (to be discussed next) is to use several datasets to find (select) a more stable solution. We do this by combining several component classifiers. There are no convincing theoretical results to prove this; a recent result relates stability to the generalization of an algorithm.

Generalization

Def: the empirical error (i.e., the performance on the training examples) must be a good indicator of the expected error (i.e., the performance on future samples).

Empirical error: $E_S(f) = \frac{1}{n}\sum_{i=1}^{n} V(f, z_i)$.

Expected error: $E(f) = \int_z V(f, z)\, d\mu(z)$, where V is the loss function.

Mapping: a learning algorithm is a mapping $L : Z^n \to H$, where H is the hypothesis space (the set of possible functions). We have seen that without restrictions on H, it is impossible to guarantee generalization. Stability of an algorithm is a way to see this.

Cross-validation leave-one-out ($CV_{loo}$) stability: L is distribution-independent, $CV_{loo}$ stable if, uniformly over all pdfs,

$$\lim_{n \to \infty}\, \sup_i \big|V(f_S, z_i) - V(f_{S^i}, z_i)\big| = 0 \text{ in probability},$$

where $S^i$ is S leaving the i-th sample out.

Sufficient condition ($CVEEE_{loo}$): L is distribution-independent, $CVEEE_{loo}$ stable if for all f:
1. it is $CV_{loo}$ stable,
2. $\lim_{n \to \infty} \sup \big|E(f_S) - E(f_{S^i})\big| = 0$ in probability,
3. $\lim_{n \to \infty} \sup \big|E_S(f_S) - E_{S^i}(f_{S^i})\big| = 0$ in probability.

If these conditions hold and the loss function is bounded, then $f_S$ generalizes. (The probability of those samples where the equations do not hold is approximately 0.)
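The bootstrap formulas above also translate directly. This sketch (my own illustration with made-up data) estimates the bias and variance of the median using B = 2000 resamples and a fixed seed:

```python
import random
from statistics import median

def bootstrap(xs, stat, B=2000, seed=0):
    # bootstrap bias and variance estimates for an arbitrary statistic
    rng = random.Random(seed)
    n = len(xs)
    reps = [stat([rng.choice(xs) for _ in range(n)]) for _ in range(B)]
    boot_mean = sum(reps) / B
    bias = boot_mean - stat(xs)                          # bias_boot
    var = sum((t - boot_mean) ** 2 for t in reps) / B    # Var_boot
    return bias, var

data = [3.1, 4.2, 1.7, 5.0, 2.4, 3.8]
bias, var = bootstrap(data, median)
print(bias, var)
```

Unlike the jackknife, which uses exactly the n leave-one-out subsets, the bootstrap's accuracy can be improved simply by increasing B, at a proportional computational cost.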
Resampling for Classification

Arcing (adaptive reweighting and combining): the idea of reusing (or selecting specific) data to improve upon the current classification results.

Bagging (bootstrap aggregation): uses many subsets of n' samples (n' < n) drawn from D with replacement to train different component classifiers. Classification is based on the votes given by the component classifiers, which are usually of the same form; e.g., Perceptrons.

Boosting:
1. We randomly select n1 (n1 < n) samples from D; we call this D1.
2. We train the first classifier, C1, with D1.
3. We create a second training set, D2, disjoint from D1, where half of the samples are correctly classified by C1 and half are not.
4. Train C2 with D2. (Note that D2 is complementary to D1 in this sense.)
5. Classification can already be based on the votes given by C1 and C2.
6. We now create D3 with those n3 samples that are not consistently classified (i.e., that receive different votes) by C1 and C2.
7. Train C3 with D3.

Classification: if C1 and C2 agree on the classification of our test vector, we output that class; otherwise, we select the class given by C3.

Notes on boosting

The component classifier, C_i, need only be a weak learner: an algorithm with results above chance. The classification improves as we add component classifiers. There is no general way to select n1. Ideally one might want n1 = n2 = n3 = n/3, but there is no way to guarantee this. In practice we might need to run the algorithm several times to optimize n1.

AdaBoost ("adaptive boosting")

Allows us to keep adding (weak) learners. In this case, each sample in D receives a weighting factor $W_k(i)$. At each iteration k we select samples according to W, train a component classifier $h_k$ with weighted error $E_k$, set

$$\alpha_k = \frac{1}{2}\ln\frac{1 - E_k}{E_k},$$

and update

$$W_{k+1}(i) = \frac{W_k(i)}{Z_k} \times \begin{cases} e^{-\alpha_k} & \text{if correctly classified,} \\ e^{+\alpha_k} & \text{otherwise,} \end{cases}$$

where $Z_k$ is a normalizing constant. The weight of each sample thus decreases or increases according to the results.
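Bagging is straightforward to sketch. Assuming decision stumps as the component classifiers and a toy one-dimensional two-class sample (both my own choices, not from the notes), each stump is trained on a bootstrap replicate of D and the ensemble classifies by majority vote:

```python
import random

def train_stump(points):
    # best threshold/polarity stump for 1-D labeled points [(x, y)], y in {-1, +1}
    best = None
    xs = sorted(x for x, _ in points)
    thresholds = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    for thr in thresholds:
        for pol in (+1, -1):
            err = sum(1 for x, y in points if pol * (1 if x > thr else -1) != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    _, thr, pol = best
    return lambda x: pol * (1 if x > thr else -1)

def bagging(points, n_components=11, seed=0):
    # train each component on a bootstrap replicate, then vote
    rng = random.Random(seed)
    stumps = [train_stump([rng.choice(points) for _ in range(len(points))])
              for _ in range(n_components)]
    def vote(x):
        return 1 if sum(h(x) for h in stumps) >= 0 else -1
    return vote

# toy data: class -1 below 0, class +1 above
points = [(-3, -1), (-2, -1), (-1, -1), (1, 1), (2, 1), (3, 1)]
clf = bagging(points)
print([clf(x) for x in (-2.5, 2.5)])  # [-1, 1]
```

An odd number of components avoids ties in the vote; individual stumps trained on unlucky replicates can be wrong, but the majority is far more stable than any single component, which is the motivation given above.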
AdaBoost focuses on the most informative or most difficult patterns. For each group of samples, we train a component classifier. We can keep learning until the training error is below a pre-defined threshold. The final classification function is

$$g(x) = \sum_{k=1}^{k_{max}} \alpha_k h_k(x),$$

and the classification decision is given by sgn[g(x)]. The ensemble training error is bounded by

$$E \le \prod_{k=1}^{k_{max}} 2\sqrt{E_k (1 - E_k)} = \exp\Big(-2 \sum_{k=1}^{k_{max}} G_k^2\Big),$$

where $G_k = 1/2 - E_k$ measures how much better than chance classifier k performs.

AdaBoost-based Feature Selection

The same idea defined above can now be used to select those features that best classify (discriminate) our sample vectors. It is only applicable to the two-class problem, though. The idea is that at each iteration we select the feature $f_j$ associated with the smallest classification error.

Initialize: $w_{1,i} = \frac{1}{2m}$ if $y_i = 1$ and $w_{1,i} = \frac{1}{2l}$ if $y_i = 0$ (m and l are the number of samples in each class).

For t = 1, ..., T:
- Normalize the weights: $w_{t,i} \leftarrow w_{t,i} \big/ \sum_j w_{t,j}$.
- Train a classifier $h_j$ for each of the p features.
- Error: $E_j = \sum_i w_{t,i}\, \big|h_j(x_i) - y_i\big|$.
- Choose the classifier $h_t$ with the smallest error $E_t$.
- Update the weights: $w_{t+1,i} = w_{t,i}\, \beta_t^{1 - e_i}$, where $e_i = 0$ if $x_i$ is successfully classified, $e_i = 1$ otherwise, and $\beta_t = \frac{E_t}{1 - E_t}$.

The final strong classifier is

$$h(x) = \begin{cases} 1 & \text{if } \sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2}\sum_{t=1}^{T} \alpha_t, \\ 0 & \text{otherwise,} \end{cases}$$

where $\alpha_t = \log \frac{1}{\beta_t}$.

Learning with queries

The previous methods are supervised (i.e., the data is labeled). Many applications require unsupervised techniques, though. We assume that our data can be labeled (by an oracle), but that there is a cost associated with this; e.g., handwritten text.

Cost-based learning: the goal is to minimize the overall cost: classification accuracy & labeling.

Confidence-based query selection: select those patterns whose discriminant value is closest to the decision boundary (~1/2).
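A compact sketch of the AdaBoost loop (my own illustration: weighted-error decision stumps on a made-up one-dimensional sample, with deterministic reweighting rather than resampling) that also computes the product bound on the training error stated above:

```python
import math

def stump_preds(xs, thr, pol):
    return [pol * (1 if x > thr else -1) for x in xs]

def best_stump(xs, ys, w):
    # stump minimizing the weighted error E_k
    best = None
    srt = sorted(xs)
    for thr in [(a + b) / 2 for a, b in zip(srt, srt[1:])]:
        for pol in (+1, -1):
            preds = stump_preds(xs, thr, pol)
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, pol, preds)
    return best

def adaboost(xs, ys, T=6):
    n = len(xs)
    w = [1.0 / n] * n
    ensemble, bound = [], 1.0
    for _ in range(T):
        E, thr, pol, preds = best_stump(xs, ys, w)
        if E <= 1e-12:                       # perfect stump: stop early
            ensemble.append((10.0, thr, pol))
            break
        alpha = 0.5 * math.log((1 - E) / E)  # alpha_k
        ensemble.append((alpha, thr, pol))
        bound *= 2 * math.sqrt(E * (1 - E))  # product of Z_k terms
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        Z = sum(w)                           # normalizing constant Z_k
        w = [wi / Z for wi in w]
    return ensemble, bound

def predict(ensemble, x):
    g = sum(a * pol * (1 if x > thr else -1) for a, thr, pol in ensemble)
    return 1 if g >= 0 else -1

xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]                    # not separable by a single stump
ensemble, bound = adaboost(xs, ys)
train_err = sum(predict(ensemble, x) != y for x, y in zip(xs, ys)) / len(xs)
print(train_err, bound)                      # the theorem guarantees train_err <= bound
```

Since each weighted error satisfies $E_k \le 1/2$, every factor $2\sqrt{E_k(1-E_k)}$ is at most 1, so the bound shrinks (and the guaranteed training error falls) as components are added.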
Voting-based or committee-based query selection (multiclass): select the patterns that yield the greatest disagreement among the discriminant functions.

Advantages: we need not guess (or learn) the form of the underlying distribution of the data; instead, we only need to estimate the classification boundary. And we need not test our classifier with data drawn from the same distribution.
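Confidence-based query selection is a one-liner once a discriminant is available. In this sketch (my own illustration, with a sigmoid around an assumed current boundary standing in for the classifier's discriminant), we query the pool points whose discriminant value is closest to 1/2:

```python
import math

def discriminant(x, boundary=2.0, scale=1.0):
    # stand-in posterior-style confidence that x belongs to class 1
    return 1.0 / (1.0 + math.exp(-(x - boundary) / scale))

def select_queries(pool, k=2):
    # confidence-based selection: patterns with discriminant closest to 1/2
    return sorted(pool, key=lambda x: abs(discriminant(x) - 0.5))[:k]

pool = [-4.0, -1.0, 1.5, 1.9, 2.2, 5.0, 8.0]
print(select_queries(pool))  # [1.9, 2.2]
```

The selected points straddle the assumed boundary at x = 2, which is exactly where a label from the oracle is most informative; points far from the boundary would add little.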
More informationChapter 3 Sampling For Proportions and Percentages
Chapter 3 Samplg For Proportos ad Percetages I may stuatos, the characterstc uder study o whch the observatos are collected are qualtatve ature For example, the resposes of customers may marketg surveys
More informationLecture 8: Linear Regression
Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE
More informationSupport vector machines
CS 75 Mache Learg Lecture Support vector maches Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 75 Mache Learg Outle Outle: Algorthms for lear decso boudary Support vector maches Mamum marg hyperplae.
More informationHomework 1: Solutions Sid Banerjee Problem 1: (Practice with Asymptotic Notation) ORIE 4520: Stochastics at Scale Fall 2015
Fall 05 Homework : Solutos Problem : (Practce wth Asymptotc Notato) A essetal requremet for uderstadg scalg behavor s comfort wth asymptotc (or bg-o ) otato. I ths problem, you wll prove some basc facts
More informationPseudo-random Functions
Pseudo-radom Fuctos Debdeep Mukhopadhyay IIT Kharagpur We have see the costructo of PRG (pseudo-radom geerators) beg costructed from ay oe-way fuctos. Now we shall cosder a related cocept: Pseudo-radom
More informationThe Mathematical Appendix
The Mathematcal Appedx Defto A: If ( Λ, Ω, where ( λ λ λ whch the probablty dstrbutos,,..., Defto A. uppose that ( Λ,,..., s a expermet type, the σ-algebra o λ λ λ are defed s deoted by ( (,,...,, σ Ω.
More informationChapter 8: Statistical Analysis of Simulated Data
Marquette Uversty MSCS600 Chapter 8: Statstcal Aalyss of Smulated Data Dael B. Rowe, Ph.D. Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 08 by Marquette Uversty MSCS600 Ageda 8. The Sample
More informationTo use adaptive cluster sampling we must first make some definitions of the sampling universe:
8.3 ADAPTIVE SAMPLING Most of the methods dscussed samplg theory are lmted to samplg desgs hch the selecto of the samples ca be doe before the survey, so that oe of the decsos about samplg deped ay ay
More informationFor combinatorial problems we might need to generate all permutations, combinations, or subsets of a set.
Addtoal Decrease ad Coquer Algorthms For combatoral problems we mght eed to geerate all permutatos, combatos, or subsets of a set. Geeratg Permutatos If we have a set f elemets: { a 1, a 2, a 3, a } the
More informationModule 7: Probability and Statistics
Lecture 4: Goodess of ft tests. Itroducto Module 7: Probablty ad Statstcs I the prevous two lectures, the cocepts, steps ad applcatos of Hypotheses testg were dscussed. Hypotheses testg may be used to
More informationDr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur
Aalyss of Varace ad Desg of Exermets-I MODULE II LECTURE - GENERAL LINEAR HYPOTHESIS AND ANALYSIS OF VARIANCE Dr Shalabh Deartmet of Mathematcs ad Statstcs Ida Isttute of Techology Kaur Tukey s rocedure
More informationTHE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA
THE ROYAL STATISTICAL SOCIETY EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA PAPER II STATISTICAL THEORY & METHODS The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for
More informationAlgorithms Design & Analysis. Hash Tables
Algorthms Desg & Aalyss Hash Tables Recap Lower boud Order statstcs 2 Today s topcs Drect-accessble table Hash tables Hash fuctos Uversal hashg Perfect Hashg Ope addressg 3 Symbol-table problem Symbol
More information8.1 Hashing Algorithms
CS787: Advaced Algorthms Scrbe: Mayak Maheshwar, Chrs Hrchs Lecturer: Shuch Chawla Topc: Hashg ad NP-Completeess Date: September 21 2007 Prevously we looked at applcatos of radomzed algorthms, ad bega
More informationLecture 9: Tolerant Testing
Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have
More informationL5 Polynomial / Spline Curves
L5 Polyomal / Sple Curves Cotets Coc sectos Polyomal Curves Hermte Curves Bezer Curves B-Sples No-Uform Ratoal B-Sples (NURBS) Mapulato ad Represetato of Curves Types of Curve Equatos Implct: Descrbe a
More information6.867 Machine Learning
6.867 Mache Learg Problem set Due Frday, September 9, rectato Please address all questos ad commets about ths problem set to 6.867-staff@a.mt.edu. You do ot eed to use MATLAB for ths problem set though
More informationLECTURE - 4 SIMPLE RANDOM SAMPLING DR. SHALABH DEPARTMENT OF MATHEMATICS AND STATISTICS INDIAN INSTITUTE OF TECHNOLOGY KANPUR
amplg Theory MODULE II LECTURE - 4 IMPLE RADOM AMPLIG DR. HALABH DEPARTMET OF MATHEMATIC AD TATITIC IDIA ITITUTE OF TECHOLOGY KAPUR Estmato of populato mea ad populato varace Oe of the ma objectves after
More informationParametric Density Estimation: Bayesian Estimation. Naïve Bayes Classifier
arametrc Dest Estmato: Baesa Estmato. Naïve Baes Classfer Baesa arameter Estmato Suppose we have some dea of the rage where parameters θ should be Should t we formalze such pror owledge hopes that t wll
More informationSTA 105-M BASIC STATISTICS (This is a multiple choice paper.)
DCDM BUSINESS SCHOOL September Mock Eamatos STA 0-M BASIC STATISTICS (Ths s a multple choce paper.) Tme: hours 0 mutes INSTRUCTIONS TO CANDIDATES Do ot ope ths questo paper utl you have bee told to do
More informationInvestigation of Partially Conditional RP Model with Response Error. Ed Stanek
Partally Codtoal Radom Permutato Model 7- vestgato of Partally Codtoal RP Model wth Respose Error TRODUCTO Ed Staek We explore the predctor that wll result a smple radom sample wth respose error whe a
More informationMidterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes
coometrcs, CON Sa Fracsco State Uverst Mchael Bar Sprg 5 Mdterm xam, secto Soluto Thursda, Februar 6 hour, 5 mutes Name: Istructos. Ths s closed book, closed otes exam.. No calculators of a kd are allowed..
More informationIntroduction to Probability
Itroducto to Probablty Nader H Bshouty Departmet of Computer Scece Techo 32000 Israel e-mal: bshouty@cstechoacl 1 Combatorcs 11 Smple Rules I Combatorcs The rule of sum says that the umber of ways to choose
More informationMean is only appropriate for interval or ratio scales, not ordinal or nominal.
Mea Same as ordary average Sum all the data values ad dvde by the sample sze. x = ( x + x +... + x Usg summato otato, we wrte ths as x = x = x = = ) x Mea s oly approprate for terval or rato scales, ot
More informationQR Factorization and Singular Value Decomposition COS 323
QR Factorzato ad Sgular Value Decomposto COS 33 Why Yet Aother Method? How do we solve least-squares wthout currg codto-squarg effect of ormal equatos (A T A A T b) whe A s sgular, fat, or otherwse poorly-specfed?
More informationNonparametric Techniques
Noparametrc Techques Noparametrc Techques w/o assumg ay partcular dstrbuto the uderlyg fucto may ot be kow e.g. mult-modal destes too may parameters Estmatg desty dstrbuto drectly Trasform to a lower-dmesoal
More information