Lecture 9: Regression: Regressogram and Kernel Regression
STAT 425: Introduction to Nonparametric Statistics                                Winter 2018

Lecture 9: Regression: Regressogram and Kernel Regression

Instructor: Yen-Chi Chen
Reference: Chapter 5 of "All of Nonparametric Statistics"

9.1 Introduction

Let (X_1, Y_1), \ldots, (X_n, Y_n) be a bivariate random sample. In regression analysis, we are often interested in the regression function

    m(x) = E(Y \mid X = x).

Sometimes we will write

    Y_i = m(X_i) + \epsilon_i,

where \epsilon_i is mean-0 noise. The simple linear regression model assumes that m(x) = \beta_0 + \beta_1 x, where \beta_0 and \beta_1 are the intercept and slope parameters. In this lecture, we will talk about methods that directly estimate the regression function m(x) without imposing any parametric form on m(x).

9.2 Regressogram (Binning)

We start with a very simple but extremely popular method called the regressogram, although people often call it the binning approach. You can view it as

    regressogram = regression + histogram.

For simplicity, we assume that the covariates X_i are from a distribution over [0, 1]. Similar to the histogram, we first choose M, the number of bins. Then we partition the interval [0, 1] into M equal-width bins:

    B_1 = \left[0, \frac{1}{M}\right),\ B_2 = \left[\frac{1}{M}, \frac{2}{M}\right),\ \ldots,\ B_{M-1} = \left[\frac{M-2}{M}, \frac{M-1}{M}\right),\ B_M = \left[\frac{M-1}{M}, 1\right].

When x \in B_\ell, we estimate m(x) by

    \widehat{m}_M(x) = \frac{\sum_{i=1}^n Y_i \, I(X_i \in B_\ell)}{\sum_{i=1}^n I(X_i \in B_\ell)} = \text{average of the responses whose covariates are in the same bin as } x.

Bias. The bias of the regressogram estimator is

    \mathrm{bias}(\widehat{m}_M(x)) = O\left(\frac{1}{M}\right).

Variance. The variance of the regressogram estimator is

    \mathrm{Var}(\widehat{m}_M(x)) = O\left(\frac{M}{n}\right).
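As a concrete illustration, the binning estimator above can be sketched in a few lines of NumPy. The simulated model, sample size, and function names below are my own illustrative choices, not part of the notes:

```python
import numpy as np

def regressogram(x, X, Y, M):
    """Regressogram estimate of m(x) for covariates X in [0, 1].

    Averages the responses Y_i whose covariates fall into the same
    equal-width bin as x; returns NaN if that bin is empty.
    """
    # Bin index: values in [l/M, (l+1)/M) get index l; x = 1 goes to the last bin.
    bin_idx = lambda t: np.minimum((np.asarray(t) * M).astype(int), M - 1)
    in_bin = bin_idx(X) == bin_idx(x)
    return Y[in_bin].mean() if np.any(in_bin) else np.nan

# Toy data: m(x) = sin(2*pi*x) with Gaussian noise (illustrative choices).
rng = np.random.default_rng(0)
n = 500
M = 8                      # roughly n^(1/3), balancing the O(1/M^2) bias vs O(M/n) variance
X = rng.uniform(0, 1, n)
Y = np.sin(2 * np.pi * X) + 0.3 * rng.normal(size=n)

est = regressogram(0.3, X, Y, M)   # estimate of m(0.3)
```

Choosing M on the order of n^(1/3) reflects the bias-variance trade-off described above; in practice M would be picked by cross-validation.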
Therefore, the MSE and MISE will be at rate

    \mathrm{MSE} = O\left(\frac{1}{M^2}\right) + O\left(\frac{M}{n}\right), \qquad \mathrm{MISE} = O\left(\frac{1}{M^2}\right) + O\left(\frac{M}{n}\right),

leading to the optimal number of bins M \asymp n^{1/3} and the optimal convergence rate O(n^{-2/3}), the same as the histogram. Similar to the histogram, the regressogram has a slower convergence rate than many other competitors (we will introduce several other candidates). However, the histogram and regressogram remain very popular because the construction of the estimator is simple and intuitive; practitioners with little mathematical training can easily master these approaches.

9.3 Kernel Regression

Given a point x_0, assume that we are interested in the value m(x_0). Here is a simple method to estimate that value. When m is smooth, an observation X_i \approx x_0 implies m(X_i) \approx m(x_0). Thus, the response value Y_i = m(X_i) + \epsilon_i \approx m(x_0) + \epsilon_i. Using this observation, to reduce the noise \epsilon_i, we can use the sample average. Thus, an estimator of m(x_0) is the average of those responses whose covariates are close to x_0. To make this concrete, let h > 0 be a threshold. The above procedure suggests using

    \widehat{m}_{\mathrm{loc}}(x_0) = \frac{1}{n_h(x_0)} \sum_{i: |X_i - x_0| \le h} Y_i = \frac{\sum_{i=1}^n Y_i \, I(|X_i - x_0| \le h)}{\sum_{i=1}^n I(|X_i - x_0| \le h)},    (9.1)

where n_h(x_0) = \sum_{i=1}^n I(|X_i - x_0| \le h) is the number of observations whose covariates satisfy |X_i - x_0| \le h. This estimator, \widehat{m}_{\mathrm{loc}}, is called the local average estimator. Indeed, to estimate m(x) at any given point x, we are using a local average as an estimator. The local average estimator can be rewritten as

    \widehat{m}_{\mathrm{loc}}(x_0) = \sum_{i=1}^n \frac{Y_i \, I(|X_i - x_0| \le h)}{\sum_{\ell=1}^n I(|X_\ell - x_0| \le h)} = \sum_{i=1}^n W_i(x_0) Y_i,    (9.2)

where

    W_i(x_0) = \frac{I(|X_i - x_0| \le h)}{\sum_{\ell=1}^n I(|X_\ell - x_0| \le h)}

is a weight for each observation. Note that \sum_{i=1}^n W_i(x_0) = 1 and W_i(x_0) \ge 0 for all i = 1, \ldots, n; this implies that the W_i(x_0)'s are indeed weights. Equation (9.2) shows that the local average estimator can be written as a weighted average, so the i-th weight W_i(x_0) determines the contribution of the response Y_i to the estimator \widehat{m}_{\mathrm{loc}}(x_0).

In constructing the local average estimator, we place a hard threshold on the neighboring points: those within a distance h are given equal weight, but those outside the threshold are ignored completely. This hard thresholding leads to an estimator that is not continuous. To avoid this problem, we consider another construction of the weights. Ideally, we want to give more weight to those observations that are close to x_0, and we want the weights to vary smoothly. The Gaussian function

    G(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}

seems to be a good candidate. We now use the Gaussian function to construct an estimator. We first construct the weights

    W_i^G(x_0) = \frac{G\left(\frac{x_0 - X_i}{h}\right)}{\sum_{\ell=1}^n G\left(\frac{x_0 - X_\ell}{h}\right)}.    (9.3)
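The hard-threshold weights of (9.1) and the smooth Gaussian weights of (9.3) can be compared numerically. The sketch below uses an illustrative simulated dataset; all names and constants are hypothetical choices:

```python
import numpy as np

def local_average(x0, X, Y, h):
    """Local average estimator (9.1): equal weight for |X_i - x0| <= h."""
    inside = np.abs(X - x0) <= h
    return Y[inside].mean()

def gaussian_weights(x0, X, h):
    """Smooth weights (9.3) built from the Gaussian G(x) = exp(-x^2/2)/sqrt(2*pi).

    The normalizing constant of G cancels in the ratio, so it is omitted.
    """
    g = np.exp(-0.5 * ((x0 - X) / h) ** 2)
    return g / g.sum()

# Illustrative data: m(x) = sin(2*pi*x), so the true value m(0.5) = 0.
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 400)
Y = np.sin(2 * np.pi * X) + 0.2 * rng.normal(size=400)

h = 0.05
m_loc = local_average(0.5, X, Y, h)        # hard-threshold estimate (9.1)
w = gaussian_weights(0.5, X, h)            # weights sum to 1 and are all positive
m_gauss = np.sum(w * Y)                    # smooth-weight estimate (9.4)
```

Evaluating `m_gauss` on a grid of x0 values produces a continuous curve, whereas `m_loc` jumps whenever a point enters or leaves the window.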
The quantity h > 0 is analogous to the threshold in the local average, but now it acts as the smoothing bandwidth of the Gaussian. After constructing the weights, our new estimator is

    \widehat{m}_G(x_0) = \sum_{i=1}^n W_i^G(x_0) Y_i = \sum_{i=1}^n \frac{Y_i \, G\left(\frac{x_0 - X_i}{h}\right)}{\sum_{\ell=1}^n G\left(\frac{x_0 - X_\ell}{h}\right)}.    (9.4)

This new estimator has weights that change more smoothly than those of the local average, and it is smooth, as we desired. Comparing equations (9.1) and (9.4), one may notice that these local estimators are all of a similar form:

    \widehat{m}_h(x_0) = \sum_{i=1}^n \frac{Y_i \, K\left(\frac{x_0 - X_i}{h}\right)}{\sum_{\ell=1}^n K\left(\frac{x_0 - X_\ell}{h}\right)} = \sum_{i=1}^n W_i(x_0) Y_i, \qquad W_i(x_0) = \frac{K\left(\frac{x_0 - X_i}{h}\right)}{\sum_{\ell=1}^n K\left(\frac{x_0 - X_\ell}{h}\right)},    (9.5)

where K is some function. When K is a Gaussian, we obtain estimator (9.4); when K is uniform over [-1, 1], we obtain the local average (9.1). The estimator in equation (9.5) is called the kernel regression estimator or the Nadaraya-Watson estimator (see https://en.wikipedia.org/wiki/Kernel_regression). The function K plays a role similar to the kernel function in the KDE and thus is also called the kernel function, and the quantity h > 0 is similar to the smoothing bandwidth in the KDE, so it is also called the smoothing bandwidth.

9.3.1 Theory

Now we study some statistical properties of the estimator \widehat{m}_h. We skip the details of the derivations (if you are interested, check http://www.ssc.wisc.edu/~bhansen/718/NonParametrics2.pdf and http://www.maths.manchester.ac.uk/~peterf/MATH38011/NPR%20N-W%20Estimator.pdf).

Bias. The bias of the kernel regression at a point x is

    \mathrm{bias}(\widehat{m}_h(x)) = \frac{h^2}{2} \mu_K \left( m''(x) + 2 \frac{m'(x) p'(x)}{p(x)} \right) + o(h^2),

where p(x) is the probability density function of the covariates X_1, \ldots, X_n and \mu_K = \int x^2 K(x) \, dx is the same constant of the kernel function as in the KDE. The bias has two components: a curvature component m''(x) and a design component 2 m'(x) p'(x) / p(x). The curvature component is similar to the one in the KDE: when the regression function curves a lot, kernel smoothing will smooth out the structure, introducing some bias. The second component, also known as the design bias, is a new component compared to the bias of the KDE. This component depends on the density of the covariates, p(x). Note that in some studies we can choose the values of the covariates, so the density p(x) is also called the design; this is why this component is known as the design bias.

Variance. The variance of the estimator is

    \mathrm{Var}(\widehat{m}_h(x)) = \frac{\sigma^2 \sigma_K^2}{n h \, p(x)} + o\left(\frac{1}{nh}\right),

where \sigma^2 = \mathrm{Var}(\epsilon_i) is the error variance of the regression model and \sigma_K^2 = \int K^2(x) \, dx is a constant of the kernel function, the same as in the KDE. This expression tells us the possible sources of variance. First, the variance increases when \sigma^2 increases. This makes perfect sense because \sigma^2 is the noise level: when the noise level is large, we expect the estimation error to increase. Second, the density of the covariates, p(x), is inversely related to the variance. This is also very reasonable, because when p(x) is large there tend to be more data points
around x, increasing the size of the sample that we are averaging over. Last, the convergence rate of the variance is O\left(\frac{1}{nh}\right), the same as the KDE.

MSE and MISE. Using the expressions for the bias and variance, the MSE at a point x is

    \mathrm{MSE}(\widehat{m}_h(x)) = \frac{h^4}{4} \mu_K^2 \left( m''(x) + 2 \frac{m'(x) p'(x)}{p(x)} \right)^2 + \frac{\sigma^2 \sigma_K^2}{n h \, p(x)} + o(h^4) + o\left(\frac{1}{nh}\right)

and the MISE is

    \mathrm{MISE}(\widehat{m}_h) = \frac{h^4}{4} \mu_K^2 \int \left( m''(x) + 2 \frac{m'(x) p'(x)}{p(x)} \right)^2 dx + \frac{\sigma^2 \sigma_K^2}{n h} \int \frac{dx}{p(x)} + o(h^4) + o\left(\frac{1}{nh}\right).    (9.6)

Optimizing the major components of equation (9.6) (the AMISE), we obtain the optimal value of the smoothing bandwidth

    h_{\mathrm{opt}} = C \cdot n^{-1/5},

where C is a constant depending on p and K.

9.3.2 Uncertainty and Confidence Intervals

How do we assess the quality of our estimator \widehat{m}_h(x)? We can use the bootstrap. In this case, the empirical bootstrap, residual bootstrap, and wild bootstrap can all be applied, but note that each of them relies on slightly different assumptions. Let (X_1^*, Y_1^*), \ldots, (X_n^*, Y_n^*) be a bootstrap sample. Applying the bootstrap sample to equation (9.5), we obtain a bootstrap kernel regression, denoted \widehat{m}_h^*. Repeating the bootstrap procedure B times yields \widehat{m}_h^{*(1)}, \ldots, \widehat{m}_h^{*(B)}, B bootstrap kernel regression estimators. We can then estimate the variance of \widehat{m}_h(x) by the sample variance

    \widehat{\mathrm{Var}}_B(\widehat{m}_h(x)) = \frac{1}{B-1} \sum_{\ell=1}^B \left( \widehat{m}_h^{*(\ell)}(x) - \bar{m}_{h,B}^*(x) \right)^2, \qquad \bar{m}_{h,B}^*(x) = \frac{1}{B} \sum_{\ell=1}^B \widehat{m}_h^{*(\ell)}(x).

Similarly, we can estimate the MSE as we did in Lectures 5 and 6. However, when using the bootstrap to estimate uncertainty, one has to be careful: when h is either too small or too large, the bootstrap estimate may fail to converge to its target. When we choose h = O(n^{-1/5}), the bootstrap estimate of the variance is consistent, but the bootstrap estimate of the MSE might not be. The main reason is that it is easier for the bootstrap to estimate the variance than the bias. When we choose h in this way, both the bias and the variance contribute substantially to the MSE, so we cannot ignore the bias; but in this case the bootstrap cannot estimate the bias consistently, so the estimate of the MSE is not consistent.

Confidence interval. To construct a confidence interval for m(x), we use the following property of the kernel regression:

    \sqrt{nh}\left( \widehat{m}_h(x) - E(\widehat{m}_h(x)) \right) \xrightarrow{D} N\left(0, \frac{\sigma^2 \sigma_K^2}{p(x)}\right), \qquad \text{i.e.,} \qquad \frac{\widehat{m}_h(x) - E(\widehat{m}_h(x))}{\sqrt{\mathrm{Var}(\widehat{m}_h(x))}} \xrightarrow{D} N(0, 1).
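A minimal sketch of the empirical bootstrap variance estimate for the Nadaraya-Watson estimator, paired with a normal-approximation confidence interval based on the asymptotic normality above. The simulated model, bandwidth, and number of bootstrap replicates B are illustrative choices:

```python
import numpy as np

def nw(x0, X, Y, h):
    """Nadaraya-Watson estimator (9.5) with a Gaussian kernel."""
    k = np.exp(-0.5 * ((x0 - X) / h) ** 2)
    return np.sum(k * Y) / np.sum(k)

# Illustrative data; h and B are arbitrary choices for this sketch.
rng = np.random.default_rng(2)
n, h, B = 300, 0.08, 200
X = rng.uniform(0, 1, n)
Y = np.sin(2 * np.pi * X) + 0.3 * rng.normal(size=n)
x0 = 0.5

# Empirical bootstrap: resample (X_i, Y_i) pairs with replacement.
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = nw(x0, X[idx], Y[idx], h)

var_boot = boot.var(ddof=1)                # bootstrap estimate of Var(m_hat(x0))

# Normal-approximation 95% CI for m(x0), ignoring the smoothing bias.
center = nw(x0, X, Y, h)
half = 1.96 * np.sqrt(var_boot)
ci = (center - half, center + half)
```

As the notes caution, this interval only accounts for the variance; with h of order n^(-1/5), the bias is of the same order as the standard error and is not captured by the bootstrap.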
The variance depends on three quantities: \sigma^2, \sigma_K^2, and p(x). The quantity \sigma_K^2 is known because it is just a characteristic of the kernel function. The density of the covariates, p(x), can be estimated using a KDE. So what remains unknown is the noise level \sigma^2. The good news is that we can estimate it using the residuals. Recall that the residuals are

    e_i = Y_i - \widehat{Y}_i = Y_i - \widehat{m}_h(X_i).

When \widehat{m}_h \approx m, the residual becomes an approximation to the noise \epsilon_i. Since \sigma^2 = \mathrm{Var}(\epsilon_i), we can use the sample variance of the residuals to estimate it (note that the average of the residuals is 0):

    \widehat{\sigma}^2 = \frac{1}{n - 2\nu + \tilde{\nu}} \sum_{i=1}^n e_i^2,    (9.7)

where \nu and \tilde{\nu} are quantities acting as degrees of freedom, which we will explain later. Thus, a 1 - \alpha confidence interval can be constructed as

    \widehat{m}_h(x) \pm z_{1-\alpha/2} \cdot \frac{\widehat{\sigma} \, \sigma_K}{\sqrt{n h \, \widehat{p}_h(x)}},

where \widehat{p}_h(x) is the KDE of the covariates.

Bias issue.

9.3.3 Resampling Techniques

Cross-validation. Bootstrap approach. (See also http://faculty.washington.edu/yenchic/17Sp_403/Lec8-NPreg.pdf.)

9.3.4 Relation to KDE

Many theoretical results of the KDE apply to nonparametric regression. For instance, we can generalize the MISE to other types of error measurement between \widehat{m} and m. We can also use derivatives of \widehat{m} as estimators of the corresponding derivatives of m. Moreover, when we have a multivariate covariate, we can use either a radial-basis kernel or a product kernel to generalize the kernel regression to the multivariate case.

The KDE and the kernel regression have a very interesting relationship. Using the given bivariate random sample (X_1, Y_1), \ldots, (X_n, Y_n), we can estimate the joint PDF p(x, y) as

    \widehat{p}_h(x, y) = \frac{1}{n h^2} \sum_{i=1}^n K\left(\frac{X_i - x}{h}\right) K\left(\frac{Y_i - y}{h}\right).

This joint density estimator also leads to a marginal density estimator of X:

    \widehat{p}_h(x) = \int \widehat{p}_h(x, y) \, dy = \frac{1}{n h} \sum_{i=1}^n K\left(\frac{X_i - x}{h}\right).

Now recall that the regression function is the conditional expectation

    m(x) = E(Y \mid X = x) = \int y \, p(y \mid x) \, dy = \int y \, \frac{p(x, y)}{p(x)} \, dy = \frac{\int y \, p(x, y) \, dy}{p(x)}.
Replacing p(x, y) and p(x) by their corresponding estimators \widehat{p}_h(x, y) and \widehat{p}_h(x), we obtain an estimate of m(x):

    \widehat{m}(x) = \frac{\int y \, \widehat{p}_h(x, y) \, dy}{\widehat{p}_h(x)}
                   = \frac{\int y \, \frac{1}{n h^2} \sum_{i=1}^n K\left(\frac{X_i - x}{h}\right) K\left(\frac{Y_i - y}{h}\right) dy}{\frac{1}{n h} \sum_{i=1}^n K\left(\frac{X_i - x}{h}\right)}
                   = \frac{\sum_{i=1}^n K\left(\frac{X_i - x}{h}\right) \cdot \frac{1}{h} \int y \, K\left(\frac{Y_i - y}{h}\right) dy}{\sum_{i=1}^n K\left(\frac{X_i - x}{h}\right)}
                   = \frac{\sum_{i=1}^n Y_i \, K\left(\frac{X_i - x}{h}\right)}{\sum_{i=1}^n K\left(\frac{X_i - x}{h}\right)}
                   = \widehat{m}_h(x).

Note that when K(x) is symmetric, \frac{1}{h} \int y \, K\left(\frac{Y_i - y}{h}\right) dy = Y_i. Namely, we may understand the kernel regression as an estimator obtained by converting the KDE of the joint PDF into a regression estimator.

9.4 Linear Smoother

Now we introduce a very important notion called the linear smoother. Linear smoothers form a collection of many regression estimators that have nice properties. A linear smoother is an estimator of the regression function of the form

    \widehat{m}(x) = \sum_{i=1}^n l_i(x) Y_i,    (9.8)

where l_i(x) is some function depending on X_1, \ldots, X_n but not on any of Y_1, \ldots, Y_n. The residual for the j-th observation can be written as

    e_j = Y_j - \widehat{m}(X_j) = Y_j - \sum_{i=1}^n l_i(X_j) Y_i.

Let e = (e_1, \ldots, e_n)^T be the vector of residuals and define a matrix L with entries L_{ij} = l_j(X_i):

    L = \begin{pmatrix}
        l_1(X_1) & l_2(X_1) & l_3(X_1) & \cdots & l_n(X_1) \\
        l_1(X_2) & l_2(X_2) & l_3(X_2) & \cdots & l_n(X_2) \\
        \vdots   & \vdots   & \vdots   & \ddots & \vdots   \\
        l_1(X_n) & l_2(X_n) & l_3(X_n) & \cdots & l_n(X_n)
        \end{pmatrix}.

Then the vector of predicted values \widehat{Y} = (\widehat{Y}_1, \ldots, \widehat{Y}_n)^T = L Y, where Y = (Y_1, \ldots, Y_n)^T is the vector of observed responses, and

    e = Y - \widehat{Y} = Y - L Y = (I - L) Y.

Example: linear regression. For linear regression, let \mathbb{X} denote the data matrix (the first column is all 1's and the second column is X_1, \ldots, X_n). We know that \widehat{\beta} = (\mathbb{X}^T \mathbb{X})^{-1} \mathbb{X}^T Y and \widehat{Y} = \mathbb{X} \widehat{\beta} = \mathbb{X} (\mathbb{X}^T \mathbb{X})^{-1} \mathbb{X}^T Y. This implies that

    L = \mathbb{X} (\mathbb{X}^T \mathbb{X})^{-1} \mathbb{X}^T,

which is also the projection (hat) matrix in linear regression. Thus, linear regression is a linear smoother.

Example: regressogram. The regressogram is also a linear smoother. Let B_1, \ldots, B_M be the bins of the covariate and let B(x) be the bin that x belongs to. Then

    l_j(x) = \frac{I(X_j \in B(x))}{\sum_{i=1}^n I(X_i \in B(x))}.

Example: kernel regression. As you may expect, the kernel regression is also a linear smoother. Recall from equation (9.5) that

    \widehat{m}_h(x_0) = \sum_{i=1}^n \frac{Y_i \, K\left(\frac{x_0 - X_i}{h}\right)}{\sum_{\ell=1}^n K\left(\frac{x_0 - X_\ell}{h}\right)} = \sum_{i=1}^n W_i(x_0) Y_i,

so

    l_j(x) = \frac{K\left(\frac{x - X_j}{h}\right)}{\sum_{\ell=1}^n K\left(\frac{x - X_\ell}{h}\right)}.

9.4.1 Variance of a Linear Smoother

A linear smoother admits an (approximately) unbiased estimator of the underlying noise level \sigma^2. Recall that the noise level is \sigma^2 = \mathrm{Var}(\epsilon_i). We need one fact about covariance matrices: for a fixed matrix A and a random vector X,

    \mathrm{Cov}(A X) = A \, \mathrm{Cov}(X) \, A^T.

Thus, the covariance matrix of the residual vector is

    \mathrm{Cov}(e) = \mathrm{Cov}((I - L) Y) = (I - L) \, \mathrm{Cov}(Y) \, (I - L)^T.

Because the errors \epsilon_1, \ldots, \epsilon_n are IID, \mathrm{Cov}(Y) = \sigma^2 I, where I is the identity matrix. This implies

    \mathrm{Cov}(e) = (I - L) \, \mathrm{Cov}(Y) \, (I - L)^T = \sigma^2 (I - L - L^T + L L^T).

Now taking the matrix trace on both sides,

    \mathrm{Tr}(\mathrm{Cov}(e)) = \sum_{i=1}^n \mathrm{Var}(e_i) = \sigma^2 \, \mathrm{Tr}(I - L - L^T + L L^T) = \sigma^2 (n - 2\nu + \tilde{\nu}),

where \nu = \mathrm{Tr}(L) and \tilde{\nu} = \mathrm{Tr}(L L^T). Because the expected squared residual is approximately \mathrm{Var}(e_i), we have

    E\left( \sum_{i=1}^n e_i^2 \right) \approx \sum_{i=1}^n \mathrm{Var}(e_i) = \sigma^2 (n - 2\nu + \tilde{\nu}).

Thus, we can estimate \sigma^2 by

    \widehat{\sigma}^2 = \frac{1}{n - 2\nu + \tilde{\nu}} \sum_{i=1}^n e_i^2,    (9.9)

which is what we did in equation (9.7). The quantity \nu is called the degrees of freedom. In the linear regression case, \nu = \tilde{\nu} = p + 1, where p is the number of covariates, so the variance estimator becomes \widehat{\sigma}^2 = \frac{1}{n - p - 1} \sum_{i=1}^n e_i^2. If you have learned the variance estimator of linear regression, you should be familiar with this estimator. The degrees of freedom \nu is easy to interpret in linear regression, and the power of equation (9.9) is that it works for every linear smoother as long as the errors \epsilon_i are IID. It shows how we can define an effective degrees of freedom for other, more complicated regression estimators.
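To make the linear-smoother machinery concrete, the sketch below builds the smoothing matrix L for the Gaussian kernel regression, computes nu = Tr(L) and nu_tilde = Tr(L L^T), and forms the variance estimator (9.9). The simulated model and all names are illustrative choices:

```python
import numpy as np

# Illustrative model: m(x) = sin(2*pi*x) with noise sd 0.3 (so sigma^2 = 0.09).
rng = np.random.default_rng(3)
n, h = 200, 0.1
X = rng.uniform(0, 1, n)
Y = np.sin(2 * np.pi * X) + 0.3 * rng.normal(size=n)

# Smoothing matrix of the kernel regression:
# L[i, j] = l_j(X_i) = K((X_i - X_j)/h) / sum_l K((X_i - X_l)/h).
K = np.exp(-0.5 * ((X[:, None] - X[None, :]) / h) ** 2)
L = K / K.sum(axis=1, keepdims=True)       # each row sums to 1

Y_hat = L @ Y                              # fitted values: Y_hat = L Y
e = Y - Y_hat                              # residuals: e = (I - L) Y

nu = np.trace(L)                           # nu = Tr(L), the effective degrees of freedom
nu_tilde = np.trace(L @ L.T)               # nu_tilde = Tr(L L^T)
sigma2_hat = np.sum(e**2) / (n - 2 * nu + nu_tilde)    # estimator (9.9)
```

The same code works unchanged for any linear smoother once L is built, which is exactly the appeal of equation (9.9): only the traces of L and L L^T are needed to calibrate the residual sum of squares.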
Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationMA Advanced Econometrics: Properties of Least Squares Estimators
MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample
More informationThe variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.
SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample
More informationECE 901 Lecture 13: Maximum Likelihood Estimation
ECE 90 Lecture 3: Maximum Likelihood Estimatio R. Nowak 5/7/009 The focus of this lecture is to cosider aother approach to learig based o maximum likelihood estimatio. Ulike earlier approaches cosidered
More information1.010 Uncertainty in Engineering Fall 2008
MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval
More informationLecture 15: Density estimation
Lecture 15: Desity estimatio Why do we estimate a desity? Suppose that X 1,...,X are i.i.d. radom variables from F ad that F is ukow but has a Lebesgue p.d.f. f. Estimatio of F ca be doe by estimatig f.
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 12
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture Tolstikhi Ilya Abstract I this lecture we derive risk bouds for kerel methods. We will start by showig that Soft Margi kerel SVM correspods to miimizig
More informationProblem Set 4 Due Oct, 12
EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios
More informationLECTURE 14 NOTES. A sequence of α-level tests {ϕ n (x)} is consistent if
LECTURE 14 NOTES 1. Asymptotic power of tests. Defiitio 1.1. A sequece of -level tests {ϕ x)} is cosistet if β θ) := E θ [ ϕ x) ] 1 as, for ay θ Θ 1. Just like cosistecy of a sequece of estimators, Defiitio
More informationSOME THEORY AND PRACTICE OF STATISTICS by Howard G. Tucker
SOME THEORY AND PRACTICE OF STATISTICS by Howard G. Tucker CHAPTER 9. POINT ESTIMATION 9. Covergece i Probability. The bases of poit estimatio have already bee laid out i previous chapters. I chapter 5
More informationMaximum Likelihood Estimation
Chapter 9 Maximum Likelihood Estimatio 9.1 The Likelihood Fuctio The maximum likelihood estimator is the most widely used estimatio method. This chapter discusses the most importat cocepts behid maximum
More information10/ Statistical Machine Learning Homework #1 Solutions
Caregie Mello Uiversity Departet of Statistics & Data Sciece 0/36-70 Statistical Macie Learig Hoework # Solutios Proble [40 pts.] DUE: February, 08 Let X,..., X P were X i [0, ] ad P as desity p. Let p
More informationDiscrete Mathematics for CS Spring 2008 David Wagner Note 22
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig
More informationLecture 7 Testing Nonlinear Inequality Restrictions 1
Eco 75 Lecture 7 Testig Noliear Iequality Restrictios I Lecture 6, we discussed te testig problems were te ull ypotesis is de ed by oliear equality restrictios: H : ( ) = versus H : ( ) 6= : () We sowed
More informationLecture 12: September 27
36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.
More informationThe standard deviation of the mean
Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider
More informationKernel density estimator
Jauary, 07 NONPARAMETRIC ERNEL DENSITY ESTIMATION I this lecture, we discuss kerel estimatio of probability desity fuctios PDF Noparametric desity estimatio is oe of the cetral problems i statistics I
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More informationLecture 3. Properties of Summary Statistics: Sampling Distribution
Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary
More informationSimulation. Two Rule For Inverting A Distribution Function
Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump
More informationRecurrence Relations
Recurrece Relatios Aalysis of recursive algorithms, such as: it factorial (it ) { if (==0) retur ; else retur ( * factorial(-)); } Let t be the umber of multiplicatios eeded to calculate factorial(). The
More informationDifferentiation Techniques 1: Power, Constant Multiple, Sum and Difference Rules
Differetiatio Teciques : Power, Costat Multiple, Sum ad Differece Rules 97 Differetiatio Teciques : Power, Costat Multiple, Sum ad Differece Rules Model : Fidig te Equatio of f '() from a Grap of f ()
More informationLecture 7: Properties of Random Samples
Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ
More informationTopic 9: Sampling Distributions of Estimators
Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be
More information