Local Polynomial Regression
John Hughes
October 2, 2013

Recall that the nonparametric regression model is $Y_i = f(x_i) + \varepsilon_i$, where $f$ is the regression function and the $\varepsilon_i$ are errors such that $E\varepsilon_i = 0$.

The Nadaraya-Watson Kernel Estimator

The Nadaraya-Watson kernel estimator offers what is probably the simplest approach to nonparametric regression. The kernel estimator is an example of a linear smoother. The estimator is linear in the sense that it is given by a linear transformation of the response. Specifically, let $s(x) = (s_1(x), \dots, s_n(x))'$, where
\[ s_i(x) = \frac{w_i(x)}{\sum_{j=1}^n w_j(x)}, \]
with $w_i(x) = K\{(x - x_i)/h\}$. Now, if $Y = (Y_1, \dots, Y_n)'$, the kernel estimator of $f(x)$ is
\[ \hat f(x) = s(x)'Y = \sum_{i=1}^n s_i(x) Y_i = \sum_{i=1}^n \frac{w_i(x)}{\sum_{j=1}^n w_j(x)}\, Y_i = \sum_{i=1}^n \frac{K\{(x - x_i)/h\}}{\sum_{j=1}^n K\{(x - x_j)/h\}}\, Y_i. \]
This shows that $\hat f(x)$ is a weighted average of the observations, where the weights $s(x)$ are normalized kernel weights. This formulation can easily be extended to handle a grid of estimation points $z = (z_1, \dots, z_g)'$. Form the $g \times n$ matrix $S$, the $k$th row of which is $s(z_k)'$. Then
\[ \hat f(z) = (\hat f(z_1), \dots, \hat f(z_g))' = SY. \]
The matrix $S$ is called the smoothing matrix. It is analogous to the hat matrix from linear regression.
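As a quick numerical illustration, the smoothing matrix $S$ and the fit $SY$ can be computed in a few lines. This is only a sketch: the notes leave $K$ generic, so a Gaussian kernel is assumed here, and the function and variable names are illustrative.

```python
import numpy as np

def nw_smoother(x_train, x_grid, h):
    """Return the g-by-n smoothing matrix S whose kth row is s(z_k)'."""
    # w_{ki} = K{(z_k - x_i)/h} with a Gaussian kernel (an assumption)
    W = np.exp(-0.5 * ((x_grid[:, None] - x_train[None, :]) / h) ** 2)
    return W / W.sum(axis=1, keepdims=True)  # normalize each row to sum to 1

# Noisy observations of f(x) = sin(2*pi*x)
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + 0.1 * rng.normal(size=200)

z = np.linspace(0.1, 0.9, 50)       # estimation grid, away from the boundary
S = nw_smoother(x, z, h=0.05)
f_hat = S @ y                       # the estimator is linear in the response
```

Because each row of $S$ sums to one, $\hat f(z_k)$ is exactly the weighted average of the $Y_i$ described above.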
Theorem 1. The risk (assuming the $L_2$ loss) of the Nadaraya-Watson kernel estimator is
\[ R(\hat f, f) = \frac{h^4}{4} \left\{ \int x^2 K(x)\,dx \right\}^2 \int \left\{ f''(x) + 2 f'(x) \frac{\dot g(x)}{g(x)} \right\}^2 dx + \frac{\sigma^2 \int K^2(x)\,dx}{nh} \int \frac{1}{g(x)}\,dx + o\{(nh)^{-1}\} + o(h^4) \tag{1} \]
as $h \to 0$ and $nh \to \infty$, where $g$ is the density from which the $x_i$ are drawn, and $\sigma^2 = V\varepsilon_i$.

If we set the derivative of (1) equal to zero and solve for $h$, we get the optimal bandwidth
\[ h_{opt} = n^{-1/5} \left( \frac{\sigma^2 \int K^2(x)\,dx \int \frac{1}{g(x)}\,dx}{\left\{ \int x^2 K(x)\,dx \right\}^2 \int \left\{ f''(x) + 2 f'(x) \frac{\dot g(x)}{g(x)} \right\}^2 dx} \right)^{1/5}, \]
which implies that $h_{opt} = O(n^{-1/5})$. If we plug $h_{opt}$ into (1), we see that the risk decreases at the rate $O(n^{-4/5})$. For most parametric models, the risk of the MLE decreases at the rate $O(n^{-1})$. The moral of this story is that we pay a price for using a nonparametric approach. We gain flexibility, but we may sacrifice statistical power to get it.

[Figure 1: The price we pay for adopting a nonparametric approach, as measured by the rate at which risk decreases with increasing sample size. The solid curve is $n^{-4/5}$; the dashed curve is $n^{-1}$.]

Local Polynomial Regression

A kernel estimator suffers from design bias (a bias that depends on the distribution of the $x_i$) and boundary bias (a bias near the endpoints of the $x_i$). These biases can be reduced by using local polynomial regression.

Consider choosing an estimator that minimizes $\sum_{i=1}^n (Y_i - \beta_0)^2$. Note that this is equivalent to minimizing the squared length of $Y - \beta_0 \mathbf{1}$, where length is defined as the ordinary Euclidean norm
\[ \lVert v \rVert = \left( \sum_{i=1}^n v_i^2 \right)^{1/2}, \]
which is in turn defined in terms of the usual inner product, the dot product:
\[ \langle u, v \rangle = u'v = \sum_{i=1}^n u_i v_i. \]
That is, $\lVert v \rVert^2 = \langle v, v \rangle = v'v$. Recall that the solution to this estimation problem is $\hat\beta_0 = \bar Y$. The vector $\bar Y \mathbf{1}$ is the vector in $\mathrm{span}\{\mathbf{1}\}$ that is closest to $Y$ with respect to the ordinary norm. You may also recall that $\bar Y \mathbf{1}$ is the orthogonal
projection of $Y$ onto $\mathrm{span}\{\mathbf{1}\}$, where our notion of perpendicularity is given by the dot product: $u \perp v$ iff $u'v = 0$. To see this, observe that the orthogonal projection of $Y$ onto $\mathrm{span}\{\mathbf{1}\}$ is $\mathbf{1}(\mathbf{1}'\mathbf{1})^{-1}\mathbf{1}'Y$. (This is just a special case of $X(X'X)^{-1}X'Y$ from ordinary linear regression.) Now, $\mathbf{1}'\mathbf{1} = n$, and $\mathbf{1}'Y = \sum_{i=1}^n Y_i$. Thus
\[ \mathbf{1}(\mathbf{1}'\mathbf{1})^{-1}\mathbf{1}'Y = \mathbf{1}\,\frac{1}{n} \sum_{i=1}^n Y_i = \bar Y \mathbf{1}. \]

Now change the scenario slightly by changing the inner product from $u'v$ to $u'W_x v$, where $W_x = \mathrm{diag}\{w_i(x)\}$, with $w_i(x) = K\{(x - x_i)/h\}$. The analogous estimation problem is to minimize $\sum_{i=1}^n w_i(x)(Y_i - \beta_0)^2$, but now the relevant projection is orthogonal with respect to this new inner product. Hence,
\[ \hat\beta_0 = (\mathbf{1}'W_x\mathbf{1})^{-1}\mathbf{1}'W_x Y. \]
This implies that
\[ \hat f(x) = \hat\beta_0 = \frac{\sum_{i=1}^n w_i(x) Y_i}{\sum_{i=1}^n w_i(x)}, \]
the kernel estimator. And so we see that the kernel estimator results from introducing kernel weights in an intercept-only linear model. The weights localize the estimator in the sense that more distant observations are down-weighted. Since the kernel estimator is local and uses only an intercept, the kernel estimator is sometimes called a locally constant estimator.

Local polynomial regression is based on the idea that we might improve the estimator by using a higher-order polynomial as a local approximation to $f$. Taylor's theorem tells us this is a sensible idea. According to Taylor's theorem,
\[ f(z) \approx f(x) + f^{(1)}(x)(z - x) + \frac{f^{(2)}(x)}{2!}(z - x)^2 + \dots + \frac{f^{(p)}(x)}{p!}(z - x)^p = \beta_0 + \beta_1 (z - x) + \frac{\beta_2}{2!}(z - x)^2 + \dots + \frac{\beta_p}{p!}(z - x)^p \equiv P_x(z, \beta) \]
for $z$ in a neighborhood of $x$, where $f^{(m)}$ denotes the $m$th derivative of $f$. The kernel estimator takes $p = 0$. More generally, local polynomial regression of order $p$ minimizes
\[ \sum_{i=1}^n w_i(x)\{Y_i - P_x(x_i, \beta)\}^2. \tag{2} \]
Note that the minimizer of (2) is
\[ \hat\beta(x) = (X_x' W_x X_x)^{-1} X_x' W_x Y, \]
where
\[ X_x = \begin{pmatrix} 1 & x_1 - x & \frac{(x_1 - x)^2}{2!} & \cdots & \frac{(x_1 - x)^p}{p!} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_n - x & \frac{(x_n - x)^2}{2!} & \cdots & \frac{(x_n - x)^p}{p!} \end{pmatrix}. \]
This yields the local estimate
\[ \hat f(x) = P_x(x, \hat\beta) = \hat\beta_0(x). \]
This implies that $\hat f(x) = \hat\beta_0(x)$ is the inner product of $Y$ with the first row of $(X_x' W_x X_x)^{-1} X_x' W_x$, and so $\hat f(x)$ is a linear smoother. The estimator has mean and variance
\[ E\hat f(x) = s(x)'f, \qquad V\hat f(x) = \sigma^2 \lVert s(x) \rVert^2, \]
where $s(x)'$ is the first row of $(X_x' W_x X_x)^{-1} X_x' W_x$ and $f = (f(x_1), \dots, f(x_n))'$.

Why p Should Be Odd

The case $p = 1$ is called local linear regression. Local linear regression eliminates design bias and alleviates boundary bias.

Theorem 2. Let $Y_i = f(X_i) + \sigma(X_i)\varepsilon_i$ for $i \in \{1, \dots, n\}$ and $X_i \in [a, b]$. Assume that the $X_i$ were drawn from density $g$. Suppose that $g$ is positive; $g$, $f''$, and $\sigma$ are continuous in a neighborhood of $x$; and $h \to 0$ and $nh \to \infty$. Let $x \in (a, b)$. Then the local constant estimator and the local linear estimator both have variance
\[ \frac{\sigma^2(x)}{g(x)\,nh} \int K^2(u)\,du + o\{(nh)^{-1}\}. \]
The local constant estimator has bias
\[ h^2 \left( \frac{1}{2} f''(x) + \frac{f'(x)\,\dot g(x)}{g(x)} \right) \int u^2 K(u)\,du + o(h^2), \]
and the local linear estimator has bias
\[ \frac{h^2}{2} f''(x) \int u^2 K(u)\,du + o(h^2). \]
At the endpoints of $[a, b]$, the local constant estimator has bias of order $h$, and the local linear estimator has bias of order $h^2$. More generally, let $p$ be even. Then local polynomial regression of order $p + 1$ reduces design bias and boundary bias relative to local polynomial regression of order $p$, without increasing the variance.
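The closed-form minimizer $\hat\beta(x) = (X_x' W_x X_x)^{-1} X_x' W_x Y$ translates directly into code. Below is a minimal sketch, again assuming a Gaussian kernel; the function name and the test function $f(x) = x^2$ are illustrative, not part of the notes.

```python
import numpy as np
from math import factorial

def local_poly(x_train, y, x0, h, p=1):
    """Local polynomial estimate of f(x0): first entry of (X'WX)^{-1} X'W y."""
    d = x_train - x0
    # Design matrix X_x with columns (x_i - x0)^m / m!, m = 0, ..., p
    X = np.column_stack([d**m / factorial(m) for m in range(p + 1)])
    w = np.exp(-0.5 * (d / h) ** 2)        # Gaussian kernel weights (assumed)
    XtW = X.T * w                          # X' W_x, with W_x = diag{w_i(x0)}
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta[0]                         # f_hat(x0) = beta_0(x0)

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 300))
y = x**2 + 0.05 * rng.normal(size=300)
f_hat = local_poly(x, y, 0.5, h=0.1, p=1)  # local linear fit at x = 0.5
```

Setting `p=0` recovers the locally constant (Nadaraya-Watson) estimator, and `p=1` gives local linear regression.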
Variance Estimation

Homoscedasticity

Until the previous theorem we had been assuming homoscedasticity, i.e., $Y_i = f(x_i) + \sigma\varepsilon_i$ for all $i$, where $V\varepsilon_i = 1$. In this case, we can estimate $\sigma^2$ in a simple and familiar way, namely, as the sum of the squared residuals divided by the residual degrees of freedom. More specifically, the estimator is
\[ \hat\sigma^2 = \frac{\sum_{i=1}^n \{Y_i - \hat f(x_i)\}^2}{n - 2\nu + \tilde\nu} = \frac{e'e}{n - 2\nu + \tilde\nu} = \frac{\lVert e \rVert^2}{n - 2\nu + \tilde\nu}, \]
where $\nu = \mathrm{tr}(S)$ and $\tilde\nu = \mathrm{tr}(S'S) = \sum_i \lVert s(x_i) \rVert^2$. Recall that $S$ is the smoothing matrix.

The estimator $\hat\sigma^2$ is consistent for $\sigma^2$. To see this, first observe that $e = Y - SY = (I - S)Y$, which implies that $\hat\sigma^2 = Y'\Lambda Y / \mathrm{tr}(\Lambda)$, where $\Lambda = (I - S)'(I - S)$. A well-known fact about quadratic forms is $E(Y'AY) = \mathrm{tr}(A\Sigma) + \mu'A\mu$, where $\Sigma = VY$ and $\mu = EY$. Thus
\[ E\hat\sigma^2 = \frac{E(Y'\Lambda Y)}{\mathrm{tr}(\Lambda)} = \frac{\mathrm{tr}(\Lambda\,\sigma^2 I)}{\mathrm{tr}(\Lambda)} + \frac{f'\Lambda f}{n - 2\nu + \tilde\nu} = \sigma^2 + \frac{f'\Lambda f}{n - 2\nu + \tilde\nu}. \tag{3} \]
Under mild conditions, the second term in (3) will go to zero as $n \to \infty$. The appearance of $n - 2\nu + \tilde\nu$ may seem mysterious, but this quantity is in fact analogous to the residual degrees of freedom $n - p$ in ordinary linear regression. In that setting, $n - p = \mathrm{tr}(I - H) = \mathrm{tr}\{(I - H)'(I - H)\}$, where $H$ is the hat matrix. In the current setting,
$I - H$ is replaced by $I - S$, and we have
\[ \mathrm{tr}\{(I - S)'(I - S)\} = \mathrm{tr}(I - S - S' + S'S) = \mathrm{tr}(I) - \mathrm{tr}(S) - \mathrm{tr}(S') + \mathrm{tr}(S'S) = n - 2\,\mathrm{tr}(S) + \mathrm{tr}(S'S) = n - 2\nu + \tilde\nu. \]

Heteroscedasticity

Now suppose that $Y_i = f(x_i) + \sigma(x_i)\varepsilon_i$. Since this implies that $\sigma$ is a (presumably non-constant) function, estimating it requires a second regression. The second regression is for the model
\[ Z_i = \log\{Y_i - f(x_i)\}^2 = \log \sigma^2(x_i)\varepsilon_i^2 = \log \sigma^2(x_i) + \log \varepsilon_i^2 = \log \sigma^2(x_i) + \delta_i. \]
This model suggests that we could estimate $\log \sigma^2(x)$ by doing a regression with the log squared residuals from the first regression as the response. Specifically, we do the following.

1. Estimate $f(x)$ to arrive at $\hat f(x)$.
2. Let $Z_i = \log\{Y_i - \hat f(x_i)\}^2$.
3. Regress the $Z_i$ on the $x_i$ to get an estimate $\hat g(x)$ of $\log \sigma^2(x)$.
4. Let $\hat\sigma^2(x) = \exp \hat g(x)$.

Confidence Bands

We would of course like to construct confidence bands for $f$. A confidence interval for $f(x)$ usually has the form $\hat f(x) \pm c\,\mathrm{se}(x)$, where $c > 0$ is a constant and $\mathrm{se}(x)$ is an estimate of the standard deviation of $\hat f(x)$. Perhaps counterintuitively, such a confidence interval is not truly an interval for $f(x)$, but is instead an interval for
\[ \bar f(x) \equiv E\hat f(x) = s(x)'f. \]
This is because there is a bias that does not disappear as the sample size becomes large.
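The four-step variance procedure can be sketched in a few lines. This is only an illustration: Nadaraya-Watson fits with a Gaussian kernel are assumed for both regressions, and the bandwidths and true $\sigma(x)$ used here are made up for the example.

```python
import numpy as np

def nw_fit(x, y, x_eval, h):
    """Nadaraya-Watson fit (Gaussian kernel, an assumption) at x_eval."""
    W = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2)
    return (W / W.sum(axis=1, keepdims=True)) @ y

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(0, 1, 500))
sigma = 0.1 + 0.4 * x                     # true non-constant sigma(x)
y = np.sin(2 * np.pi * x) + sigma * rng.normal(size=500)

# Step 1: estimate f.  Step 2: form the log squared residuals Z_i.
f_hat = nw_fit(x, y, x, h=0.05)
z = np.log((y - f_hat) ** 2)
# Step 3: regress Z on x to estimate log sigma^2(x).  Step 4: exponentiate.
g_hat = nw_fit(x, z, x, h=0.1)
sigma2_hat = np.exp(g_hat)
```

The fitted $\hat\sigma^2(x)$ is guaranteed positive by construction and should track the trend in the true variance function.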
Let $s^*(x)$ be the standard deviation of $\hat f(x)$. Then
\[ \frac{\hat f(x) - f(x)}{s^*(x)} = \frac{\hat f(x) - \bar f(x)}{s^*(x)} + \frac{\bar f(x) - f(x)}{s^*(x)} = Z_n(x) + \frac{\mathrm{bias}\{\hat f(x)\}}{\sqrt{V\hat f(x)}}. \]
Typically, $Z_n(x) \approx N(0, 1)$. In a nonparametric setting, the second term does not go to zero as the sample size increases. This means the bias is present in the limit, which implies that the resulting confidence interval is not centered around $f(x)$. We might respond to this by

1. accepting that our confidence interval is for $\bar f(x)$ rather than $f(x)$;
2. attempting to correct the bias by estimating the bias function $\bar f(x) - f(x)$; or
3. minimizing the bias by undersmoothing.

The second option is perhaps the most tempting but is considerably more difficult than estimating $f(x)$ since the bias involves $f''(x)$. This fact makes the first and third options more appealing. Most people go with the first option because it is difficult to choose the right amount of undersmoothing.

Pointwise Bands

We can construct a pointwise band by invoking asymptotic normality or by using the bootstrap. In the former case, the interval is
\[ \hat f(x) \pm \Phi^{-1}(1 - \alpha/2)\,\mathrm{se}(x). \]
As for the bootstrap, how we should resample depends on whether we assume homoscedasticity. If we do assume constant variance, i.e., $\sigma(x) = \sigma$, the $k$th bootstrap dataset is
\[ Y_i^{(k)} = \hat f(x_i) + e_i^{(k)} \qquad (i = 1, \dots, n), \]
where $e^{(k)} = (e_1^{(k)}, \dots, e_n^{(k)})'$ is a sample (with replacement) of size $n$ from the vector of residuals $e = (Y_1 - \hat f(x_1), \dots, Y_n - \hat f(x_n))'$. The endpoints of the resulting interval at $x_i$ are the $\alpha/2$ and $1 - \alpha/2$ quantiles of the bootstrap sample $\hat f^{(1)}(x_i), \dots, \hat f^{(b)}(x_i)$.

If we assume that $\sigma(x)$ is a non-constant function, we can still do a bootstrap, but we must modify the resampling procedure. Here is the algorithm in detail.

1. Estimate $\sigma(x_i)$ to arrive at $\hat\sigma(x_i)$ for $i \in \{1, \dots, n\}$.
2. Studentize the vector of residuals $(Y_1 - \hat f(x_1), \dots, Y_n - \hat f(x_n))'$ by dividing the $i$th element by $\hat\sigma(x_i)$:
\[ e_i = \frac{Y_i - \hat f(x_i)}{\hat\sigma(x_i)}. \]
3. Compute the $k$th bootstrap dataset as
\[ Y_i^{(k)} = \hat f(x_i) + \hat\sigma(x_i)\, e_i^{(k)} \qquad (i = 1, \dots, n), \]
where $e^{(k)} = (e_1^{(k)}, \dots, e_n^{(k)})'$ is a sample (with replacement) of size $n$ from the vector of Studentized residuals.
4. Compute $\hat f^{(k)} = SY^{(k)}$ for $k \in \{1, \dots, b\}$.
5. The endpoints of the confidence interval at $x_i$ are again the $\alpha/2$ and $1 - \alpha/2$ quantiles of the bootstrap sample $\hat f^{(1)}(x_i), \dots, \hat f^{(b)}(x_i)$.

Simultaneous Bands

To construct a simultaneous band we use the so-called tube formula. Suppose that $\sigma$ is known, and let $I(x)$ be an interval. Then
\[ P\{\bar f(x) \notin I(x) \text{ for some } x \in [a, b]\} = P\left( \max_{x \in [a,b]} \frac{|\hat f(x) - \bar f(x)|}{\sigma \lVert s(x) \rVert} > c \right) = P\left( \max_{x \in [a,b]} \frac{\left|\sum_i \varepsilon_i s_i(x)\right|}{\sigma \lVert s(x) \rVert} > c \right) = P\left( \max_{x \in [a,b]} |W(x)| > c \right), \]
where $W(x) = \sum_i Z_i T_i(x)$, with $Z_i = \varepsilon_i/\sigma \sim N(0, 1)$ and $T_i(x) = s_i(x)/\lVert s(x) \rVert$. It turns out that
\[ P\left( \max_{x \in [a,b]} |W(x)| > c \right) \approx 2\{1 - \Phi(c)\} + \frac{\kappa}{\pi} \exp(-c^2/2) \]
for large $c$, where $\kappa = \int_a^b \lVert \dot T(x) \rVert\,dx$ with $\dot T(x) = (\dot T_1(x), \dots, \dot T_n(x))'$. Choosing $c$ to solve
\[ 2\{1 - \Phi(c)\} + \frac{\kappa}{\pi} \exp(-c^2/2) = \alpha \]
yields the desired band $\hat f(x) \pm c\,\mathrm{se}(x)$.
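The critical value $c$ has no closed form, but the left-hand side of the tube equation is decreasing in $c$, so bisection works. A small sketch (the value of $\kappa$ here is made up for illustration; in practice it comes from $\int_a^b \lVert \dot T(x) \rVert\,dx$):

```python
import math

def tube_critical_value(kappa, alpha=0.05):
    """Solve 2*(1 - Phi(c)) + (kappa/pi)*exp(-c^2/2) = alpha by bisection."""
    def excess(c):
        tail = math.erfc(c / math.sqrt(2.0))          # equals 2*(1 - Phi(c))
        return tail + (kappa / math.pi) * math.exp(-0.5 * c * c) - alpha
    lo, hi = 0.0, 10.0        # excess(0) > 0 > excess(10), and excess decreases
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c = tube_critical_value(kappa=5.0, alpha=0.05)
```

The resulting $c$ exceeds the pointwise critical value $\Phi^{-1}(1 - \alpha/2)$, which is what makes the band simultaneous rather than pointwise, and it grows with $\kappa$.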
Choosing the Right Bandwidth

We want to choose $h$ to minimize the risk
\[ R(h) = E\left( \frac{1}{n} \sum_{i=1}^n \{\hat f(x_i) - f(x_i)\}^2 \right). \]
Since $R(h)$ depends on the unknown function $f$, we will instead minimize an estimate $\hat R(h)$ of $R(h)$. It might seem sensible to estimate $R(h)$ using
\[ \hat R(h) = \frac{1}{n} \sum_{i=1}^n \{Y_i - \hat f(x_i)\}^2, \]
the so-called training error. But this estimator is biased downward and usually leads to undersmoothing. A better risk estimator is the leave-one-out cross-validation score:
\[ \mathrm{CV}(h) = \hat R(h) = \frac{1}{n} \sum_{i=1}^n \{Y_i - \hat f_{(-i)}(x_i)\}^2, \]
where $\hat f_{(-i)}(x_i)$ is the estimate obtained by leaving out the $i$th observation. Intuitively, we are asking, "How well can we predict $Y_i$ if we do not use $Y_i$ in the estimation procedure?" For linear smoothers, computing this score is not as burdensome as it may seem because we do not have to recompute the estimate with each observation left out. Instead, we have
\[ \mathrm{CV}(h) = \hat R(h) = \frac{1}{n} \sum_{i=1}^n \left\{ \frac{Y_i - \hat f(x_i)}{1 - S_{ii}} \right\}^2, \]
where $S_{ii}$ is the $i$th diagonal element of $S$. An alternative is the generalized cross-validation score:
\[ \mathrm{GCV}(h) = \frac{1}{n} \sum_{i=1}^n \left\{ \frac{Y_i - \hat f(x_i)}{1 - \frac{1}{n}\mathrm{tr}(S)} \right\}^2, \]
which replaces the $S_{ii}$ with their average. Usually CV and GCV lead to bandwidths that are close to one another.
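For a linear smoother the shortcut formula makes the CV and GCV scores cheap to evaluate over a grid of candidate bandwidths. A sketch using the Nadaraya-Watson smoothing matrix; the Gaussian kernel and the candidate grid are illustrative assumptions:

```python
import numpy as np

def cv_scores(x, y, h):
    """Leave-one-out CV and GCV scores for the Nadaraya-Watson smoother."""
    n = len(x)
    W = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    S = W / W.sum(axis=1, keepdims=True)   # n-by-n smoothing matrix
    resid = y - S @ y                      # training residuals Y - SY
    S_ii = np.diag(S)
    cv = np.mean((resid / (1.0 - S_ii)) ** 2)             # shortcut formula
    gcv = np.mean((resid / (1.0 - np.trace(S) / n)) ** 2)  # average S_ii
    return cv, gcv

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 200))
y = np.sin(2 * np.pi * x) + 0.2 * rng.normal(size=200)

bandwidths = [0.01, 0.03, 0.05, 0.1, 0.2]      # hypothetical candidate grid
cv_vals = [cv_scores(x, y, h)[0] for h in bandwidths]
h_best = bandwidths[int(np.argmin(cv_vals))]
```

For the Nadaraya-Watson smoother the shortcut is exact: dropping the $i$th observation simply renormalizes the kernel weights, and $(Y_i - \hat f(x_i))/(1 - S_{ii})$ equals $Y_i - \hat f_{(-i)}(x_i)$.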
1 Ecoomics 400 -- Sprig 015 /17/015 pp. 30-38; Ch. 7.1.4-7. New Stata Assigmet ad ew MyStatlab assigmet, both due Feb 4th Midterm Exam Thursday Feb 6th, Chapters 1-7 of Groeber text ad all relevat lectures
More informationCircle the single best answer for each multiple choice question. Your choice should be made clearly.
TEST #1 STA 4853 March 6, 2017 Name: Please read the followig directios. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directios This exam is closed book ad closed otes. There are 32 multiple choice questios.
More informationEcon 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.
Eco 325/327 Notes o Sample Mea, Sample Proportio, Cetral Limit Theorem, Chi-square Distributio, Studet s t distributio 1 Sample Mea By Hiro Kasahara We cosider a radom sample from a populatio. Defiitio
More information6.3 Testing Series With Positive Terms
6.3. TESTING SERIES WITH POSITIVE TERMS 307 6.3 Testig Series With Positive Terms 6.3. Review of what is kow up to ow I theory, testig a series a i for covergece amouts to fidig the i= sequece of partial
More informationECO 312 Fall 2013 Chris Sims LIKELIHOOD, POSTERIORS, DIAGNOSING NON-NORMALITY
ECO 312 Fall 2013 Chris Sims LIKELIHOOD, POSTERIORS, DIAGNOSING NON-NORMALITY (1) A distributio that allows asymmetry differet probabilities for egative ad positive outliers is the asymmetric double expoetial,
More informationOpen book and notes. 120 minutes. Cover page and six pages of exam. No calculators.
IE 330 Seat # Ope book ad otes 120 miutes Cover page ad six pages of exam No calculators Score Fial Exam (example) Schmeiser Ope book ad otes No calculator 120 miutes 1 True or false (for each, 2 poits
More informationLesson 10: Limits and Continuity
www.scimsacademy.com Lesso 10: Limits ad Cotiuity SCIMS Academy 1 Limit of a fuctio The cocept of limit of a fuctio is cetral to all other cocepts i calculus (like cotiuity, derivative, defiite itegrals
More informationThe picture in figure 1.1 helps us to see that the area represents the distance traveled. Figure 1: Area represents distance travelled
1 Lecture : Area Area ad distace traveled Approximatig area by rectagles Summatio The area uder a parabola 1.1 Area ad distace Suppose we have the followig iformatio about the velocity of a particle, how
More informationTMA4245 Statistics. Corrected 30 May and 4 June Norwegian University of Science and Technology Department of Mathematical Sciences.
Norwegia Uiversity of Sciece ad Techology Departmet of Mathematical Scieces Corrected 3 May ad 4 Jue Solutios TMA445 Statistics Saturday 6 May 9: 3: Problem Sow desity a The probability is.9.5 6x x dx
More informationENGI 4421 Confidence Intervals (Two Samples) Page 12-01
ENGI 44 Cofidece Itervals (Two Samples) Page -0 Two Sample Cofidece Iterval for a Differece i Populatio Meas [Navidi sectios 5.4-5.7; Devore chapter 9] From the cetral limit theorem, we kow that, for sufficietly
More informationDirection: This test is worth 150 points. You are required to complete this test within 55 minutes.
Term Test 3 (Part A) November 1, 004 Name Math 6 Studet Number Directio: This test is worth 10 poits. You are required to complete this test withi miutes. I order to receive full credit, aswer each problem
More informationFinal Review for MATH 3510
Fial Review for MATH 50 Calculatio 5 Give a fairly simple probability mass fuctio or probability desity fuctio of a radom variable, you should be able to compute the expected value ad variace of the variable
More informationMOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.
XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber
More informationCorrelation Regression
Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother
More informationInterval Estimation (Confidence Interval = C.I.): An interval estimate of some population parameter is an interval of the form (, ),
Cofidece Iterval Estimatio Problems Suppose we have a populatio with some ukow parameter(s). Example: Normal(,) ad are parameters. We eed to draw coclusios (make ifereces) about the ukow parameters. We
More informationJournal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula
Joural of Multivariate Aalysis 102 (2011) 1315 1319 Cotets lists available at ScieceDirect Joural of Multivariate Aalysis joural homepage: www.elsevier.com/locate/jmva Superefficiet estimatio of the margials
More informationThe Method of Least Squares. To understand least squares fitting of data.
The Method of Least Squares KEY WORDS Curve fittig, least square GOAL To uderstad least squares fittig of data To uderstad the least squares solutio of icosistet systems of liear equatios 1 Motivatio Curve
More informationMath 61CM - Solutions to homework 3
Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig
More informationMath 113, Calculus II Winter 2007 Final Exam Solutions
Math, Calculus II Witer 7 Fial Exam Solutios (5 poits) Use the limit defiitio of the defiite itegral ad the sum formulas to compute x x + dx The check your aswer usig the Evaluatio Theorem Solutio: I this
More informationProbability 2 - Notes 10. Lemma. If X is a random variable and g(x) 0 for all x in the support of f X, then P(g(X) 1) E[g(X)].
Probability 2 - Notes 0 Some Useful Iequalities. Lemma. If X is a radom variable ad g(x 0 for all x i the support of f X, the P(g(X E[g(X]. Proof. (cotiuous case P(g(X Corollaries x:g(x f X (xdx x:g(x
More informationMath 10A final exam, December 16, 2016
Please put away all books, calculators, cell phoes ad other devices. You may cosult a sigle two-sided sheet of otes. Please write carefully ad clearly, USING WORDS (ot just symbols). Remember that the
More information