Economics 241B: Relation to Method of Moments and Maximum Likelihood
OLSE as a Maximum Likelihood Estimator

Under Assumption 1.5 we have specified the distribution of the error, so we can estimate the model parameters $\theta = (\beta, \sigma^2)$ with the principle of maximum likelihood. Under the assumption that the error is Gaussian, we will see that the OLS estimator $B$ is equivalent to the MLE and the OLS estimator of $\sigma^2$ differs only slightly from its ML counterpart. Further, $B$ achieves the Cramer-Rao lower bound.

ML Principle

The intuitive idea of the ML principle is to choose the value of the parameter that is most likely to have generated the data. Precisely, we assume that the probability distribution of a sample $(Y_1, \ldots, Y_n)$ is a member of a family of functions indexed by $\theta$ (this is described as parameterizing the distribution). This function, viewed as a function of the parameter vector $\theta$, is called the likelihood function. In general, the likelihood function has the form of the joint density function

$$L(\theta \mid Y_1 = y_1, \ldots, Y_n = y_n) = f_{Y_1 \cdots Y_n}(y_1, \ldots, y_n; \theta).$$

For an i.i.d. sample of a continuous random variable, we form the likelihood function as

$$L(\theta \mid Y_1 = y_1, \ldots, Y_n = y_n) = \prod_{t=1}^{n} f_Y(y_t; \theta).$$

Definition. The maximum likelihood estimator (MLE) of $\theta$, $\hat\theta_{ML}$, is the value of $\theta$ (in the parameter space) that maximizes $L(\theta \mid Y_1 = y_1, \ldots, Y_n = y_n)$.

Conditional versus Unconditional Likelihood

For the regression model, we have a sample $(Y, X)$, whose joint density we parameterize. Because the joint density is the product of a marginal density and a conditional density, we can write the joint density of the data as

$$f(y, x; \theta, \psi) = f(y \mid x; \theta) \, f(x; \psi).$$

The parameter vector of interest is $\theta$. If we knew the parametric form of $f(x; \psi)$, then we could maximize the joint likelihood function. We cannot do this, as the classic model does not specify $f(x; \psi)$. However, if there is no functional relation between $\theta$ and $\psi$ (such as
the value of an element of $\theta$ depending on an element of $\psi$), then maximizing the joint likelihood is achieved by separately maximizing the conditional and marginal likelihoods. In such a case, the ML estimate of $\theta$ is obtained by maximizing the conditional likelihood alone.

Log-Likelihood for the Regression Model

As we have already seen, Assumption 1.2 (strict exogeneity), Assumption 1.4 (spherical error variance) and Assumption 1.5 (Gaussian) together imply $U \mid X \sim N(0, \sigma^2 I_n)$. Because $Y = X\beta + U$, we have

$$Y \mid X \sim N(X\beta, \sigma^2 I_n).$$

The log-likelihood function, which is simpler to maximize, is

$$\ln L\big(\tilde\beta, \tilde\sigma^2 \mid (Y_1, X_1) = (y_1, x_1), \ldots, (Y_n, X_n) = (y_n, x_n)\big) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\tilde\sigma^2 - \frac{1}{2\tilde\sigma^2}\,(Y - X\tilde\beta)'(Y - X\tilde\beta).$$

(Because the likelihood function has the form of a joint density function and here takes values on the unit interval, the log-likelihood function is negative.)

ML via Concentrated Likelihood

We could maximize the log likelihood in two stages. First, maximize over $\tilde\beta$ for any given $\tilde\sigma^2$. The $\tilde\beta$ that maximizes the objective function could (but in this case, does not) depend on $\tilde\sigma^2$. Second, maximize over $\tilde\sigma^2$, taking into account that the $\tilde\beta$ from the first stage could depend on $\tilde\sigma^2$. The log likelihood function in which $\tilde\beta$ is constrained to be the value from the first stage is called the concentrated log likelihood (concentrated with respect to $\tilde\beta$).

Because the first stage for the Gaussian log-likelihood amounts to minimizing the sum of squares $(Y - X\tilde\beta)'(Y - X\tilde\beta)$, the value of $\tilde\beta$ is simply the OLS estimator $B$ (so $B_{ML}$ and $B_{OLS}$ are identical if the regression error is Gaussian). In consequence, the minimized sum of squares is $\hat U'\hat U$, so the concentrated log likelihood is

$$\ln L_C\big(\tilde\sigma^2 \mid (Y_1, X_1) = (y_1, x_1), \ldots\big) = -\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\tilde\sigma^2 - \frac{1}{2\tilde\sigma^2}\,\hat U'\hat U.$$

This is a function of $\tilde\sigma^2$ alone and, because $\hat U'\hat U$ is not a function of $\tilde\sigma^2$, one can simply take the derivative with respect
to $\tilde\sigma^2$ (taking the derivative with respect to $\tilde\sigma^2$, rather than $\tilde\sigma$, can be tricky; replace $\tilde\sigma^2$ with $\tilde\gamma$). If we set this derivative equal to zero, we obtain

Proposition (ML Estimator of $(\beta, \sigma^2)$). Suppose Assumptions 1.1-1.5 hold. Then the ML estimator of $\beta$ is the OLS estimator and the ML estimator of $\sigma^2$ is

$$\hat\sigma^2_{ML} = \frac{\hat U'\hat U}{n} = \frac{n-K}{n}\,S^2.$$

As $S^2$ is an unbiased estimator of the variance, the ML estimator of $\sigma^2$ is biased, which indicates that a best estimator of the variance does not exist. The resultant maximized log likelihood is

$$-\frac{n}{2}\ln(2\pi) - \frac{n}{2}\ln\!\left(\frac{\hat U'\hat U}{n}\right) - \frac{n}{2}.$$

Cramer-Rao Bound for the Classic Regression Model

Recall from 241A the Cramer-Rao inequality for the covariance matrix of any unbiased estimator. Let $S(\tilde\theta)$ be the score vector, which is the gradient (vector of partial derivatives) of the log likelihood:

$$S(\tilde\theta) = \frac{\partial \ln L(\tilde\theta)}{\partial \tilde\theta}.$$

Cramer-Rao Inequality.
1. Let $Z$ be a vector of random variables (not necessarily independent) with joint density $f(z; \theta)$.
2. Let $\theta$ be an $m$-dimensional vector of parameters, defined in a parameter space $\Theta$.
3. Let $L(\tilde\theta)$ be the likelihood and let $\hat\theta(z)$ be an unbiased estimator of $\theta$ with finite covariance matrix.

Under certain regularity conditions on $f(z; \theta)$,

$$\mathrm{Var}\big[\hat\theta(z)\big] \geq \underset{m \times m}{I(\theta)^{-1}} \quad \text{(Cramer-Rao Lower Bound)},$$
where $I(\theta)$ is the information matrix defined by

$$I(\theta) = E\big[S(\theta)\,S(\theta)'\big].$$

(Note that the score is evaluated at the true parameter value.) Also under the regularity conditions, the information matrix equals the negative of the expected value of the Hessian (matrix of second partial derivatives) of the log likelihood:

$$I(\theta) = -E\left[\frac{\partial^2 \ln L(\theta)}{\partial\tilde\theta\,\partial\tilde\theta'}\right].$$

This is called the information matrix equality. The regularity conditions guarantee that the operations of differentiation and taking expectations can be interchanged:

$$E\left[\frac{\partial L(\theta)}{\partial\tilde\theta}\right] = \frac{\partial}{\partial\tilde\theta}\,E\big[L(\theta)\big].$$

For the classic regression model, the Cramer-Rao bound for $\beta$ is (derivation in Hayashi)

$$\sigma^2 (X'X)^{-1}.$$

Therefore the OLS estimator, which is equivalent to the MLE, achieves the Cramer-Rao bound and is the best unbiased estimator. What about the estimator of $\sigma^2$? We have already seen that the MLE for $\sigma^2$ is biased, so the Cramer-Rao bound does not apply. But $S^2$ is unbiased; does it achieve the bound? It can be shown that

$$\mathrm{Var}\big(S^2 \mid X\big) = \frac{2\sigma^4}{n-K},$$

so the estimator does not achieve the bound. However, it can also be shown that an unbiased estimator with lower variance does not exist, so the bound is not attainable.

Quasi-Maximum Likelihood

Of course, if the Gaussian assumption is incorrect, then the resultant estimator is not the MLE. Rather, as the likelihood is misspecified, the resultant estimator is the quasi-MLE. In many cases the Gaussian quasi-MLE performs well. Unfortunately, in general a quasi-MLE performs quite poorly.
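The two-stage (concentrated-likelihood) argument can be illustrated numerically. The sketch below is a minimal pure-Python check on toy data (all numbers hypothetical, not from the notes): for any trial value of $\tilde\sigma^2$, the OLS coefficients maximize the Gaussian log-likelihood because they minimize the sum of squares, and the ML variance estimator equals $((n-K)/n)S^2$, understating $S^2$.

```python
import math

# Toy data (hypothetical values, for illustration only): y_t = a + b*x_t + u_t.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n, K = len(x), 2  # K regressors: the intercept and x

xbar, ybar = sum(x) / n, sum(y) / n
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
    / sum((xi - xbar) ** 2 for xi in x)          # OLS slope
a = ybar - b * xbar                              # OLS intercept

def ssr(a_, b_):
    """Sum of squared residuals (Y - X beta)'(Y - X beta)."""
    return sum((yi - a_ - b_ * xi) ** 2 for xi, yi in zip(x, y))

def log_lik(a_, b_, s2):
    """Gaussian log-likelihood: -(n/2)ln(2 pi) - (n/2)ln(s2) - SSR/(2 s2)."""
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * n * math.log(s2) \
           - ssr(a_, b_) / (2 * s2)

# First stage: for ANY sigma^2, the OLS coefficients maximize the
# log-likelihood, because they minimize the sum of squares.
for s2 in (0.5, 1.0, 2.0):
    assert log_lik(a, b, s2) > log_lik(a, b + 0.1, s2)
    assert log_lik(a, b, s2) > log_lik(a + 0.2, b, s2)

# Second stage: the ML variance estimator divides by n, the unbiased S^2
# by n - K, so sigma2_ml = ((n - K)/n) * S^2 < S^2.
sigma2_ml = ssr(a, b) / n
s2_unbiased = ssr(a, b) / (n - K)
```

The first-stage loop mirrors the observation in the notes that the maximizing $\tilde\beta$ does not depend on $\tilde\sigma^2$ here, so concentrating the likelihood is harmless.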
OLSE as a Method of Moments Estimator

The OLS estimators are constructed so that the population moments hold in the sample and so are method of moments estimators. An assumption of the classic model is that each regressor is uncorrelated with the error term (captured in Assumption 2, where the regressors are assumed exogenous and measured without error). To understand the mathematical implications of the assumption, recall that two random variables are uncorrelated if they have zero covariance, which in turn implies

$$\mathrm{Cov}(X_t, U_t) = E(X_t U_t) - E(X_t)\,E(U_t) = 0.$$

Under Assumption 3, $E(U_t) = 0$, so a zero covariance implies $E(X_t U_t) = 0$. The two population moments used to construct the estimators are

$$E(U_t) = 0,$$

which can be viewed as $E(X_{t,0} U_t) = 0$ where $X_{t,0} = 1$ is the intercept regressor, and

$$E(X_t U_t) = 0.$$

The method of moments sets sample moments equal to population moments. To construct sample analogs of these moments, we need a sample value of the unobserved error $U_t$. For a given estimator, the residual (prediction of $U_t$) is observed:

$$U_t^P = Y_t - Y_t^P = Y_t - B_0 - B_1 X_t.$$

Equality of sample and population moments yields

$$\frac{1}{n}\sum_t U_t^P = 0 \quad \text{and} \quad \frac{1}{n}\sum_t X_t U_t^P = 0.$$

From the definition of $U_t^P$, $\sum_t U_t^P = 0$ implies $\bar Y = \bar Y^P$. One can readily verify that the OLS residuals do satisfy the population moments, as asserted above, by replacing the OLS estimators with their data formulae. For the first moment, using $B_0 = \bar Y - B_1 \bar X$,

$$\frac{1}{n}\sum_t U_t^P = \frac{1}{n}\sum_t (Y_t - B_0 - B_1 X_t) = \bar Y - B_0 - B_1 \bar X = \bar Y - (\bar Y - B_1 \bar X) - B_1 \bar X = 0.$$
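The two sample moment conditions can be checked numerically. The sketch below (toy data, all values hypothetical) computes the OLS intercept and slope from their data formulae and verifies that the residuals sum to zero and are orthogonal to the regressor:

```python
# Toy data (hypothetical): check that OLS residuals satisfy the two sample
# moment conditions sum(u) = 0 and sum(x*u) = 0.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.8, 4.1, 5.9, 8.2, 9.8]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
     / sum((xi - xbar) ** 2 for xi in x)   # slope B_1
b0 = ybar - b1 * xbar                      # intercept B_0

u = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]   # residuals U_t^P

# Sample analogs of the population moments E(U_t) = 0 and E(X_t U_t) = 0.
moment0 = sum(u)
moment1 = sum(xi * ui for xi, ui in zip(x, u))
```

Both moments come out as numerical zeros, which is exactly the statement that OLS is the method of moments estimator for these two moment conditions.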
Similarly, for the second moment,

$$\sum_t X_t U_t^P = \sum_t X_t (Y_t - B_0 - B_1 X_t) = \sum_t X_t Y_t - B_0 \sum_t X_t - B_1 \sum_t X_t^2 = \sum_t X_t Y_t - n\bar X \bar Y - B_1\left(\sum_t X_t^2 - n\bar X^2\right),$$

where the last equality uses $B_0 = \bar Y - B_1 \bar X$. Because $\sum_t (X_t - \bar X)(Y_t - \bar Y) = \sum_t X_t Y_t - n\bar X\bar Y$ and $\sum_t (X_t - \bar X)^2 = \sum_t X_t^2 - n\bar X^2$, the above displayed equation becomes

$$\sum_t X_t U_t^P = \sum_t (X_t - \bar X)(Y_t - \bar Y) - B_1 \sum_t (X_t - \bar X)^2.$$

Because

$$B_1 = \frac{\sum_t (X_t - \bar X)(Y_t - \bar Y)}{\sum_t (X_t - \bar X)^2},$$

the above expression equals

$$\sum_t (X_t - \bar X)(Y_t - \bar Y) - \sum_t (X_t - \bar X)(Y_t - \bar Y) = 0.$$

Finally, note that orthogonality between $U_t^P$ and the regressors implies orthogonality between $U_t^P$ and $Y_t^P$, which is a linear combination of the regressors. In detail:

The conditional expectations used to define the model contain important information. If we treat the regressor as a random variable, then we must distinguish between conditional and unconditional expectations. For example, the conditional expectation of $Y_t^P$ is

$$E(Y_t^P \mid X_t) = E(A_{OLS} + B_{OLS} X_t \mid X_t) = \alpha + \beta X_t = E(Y_t \mid X_t).$$

The unconditional expectation of $Y_t^P$ is

$$E(Y_t^P) = E(A_{OLS} + B_{OLS} X_t) = \alpha + \beta E(X_t),$$

which is constant if the expectation of the regressor is constant across observations. While the conditional and unconditional expectations of $Y_t^P$ differ, the conditional and unconditional expectations of $U_t^P$ are the same:

$$E(U_t^P \mid X_t) = E(Y_t - Y_t^P \mid X_t) = \alpha + \beta X_t - E(Y_t^P \mid X_t) = 0,$$
and

$$E(U_t^P) = E(Y_t - Y_t^P) = \alpha + \beta E(X_t) - E(Y_t^P) = 0.$$

The (unconditional) covariance between $Y_t^P$ and $U_t^P$ is

$$E\big[(A_{OLS} + B_{OLS} X_t - E(Y_t^P))\,U_t^P\big] = E\big[(A_{OLS} + B_{OLS} X_t)\,U_t^P\big] = 0,$$

where the second equality follows because $E(Y_t^P)$ is not random and $E(U_t^P) = 0$, and the third follows because $A_{OLS} + B_{OLS} X_t$ is uncorrelated with $U_t^P$ by construction (recall, if $X$ and $Y$ are uncorrelated, then $E(XY) = E(X)\,E(Y)$). Clearly, if the predicted values of the dependent variable were correlated with the estimated residuals, then the predicted values could be improved, so we expect zero covariance.

To show that the sample estimate is always zero, write the sample estimate of the covariance between $Y_t^P$ and $U_t^P$ as

$$\frac{1}{n}\sum_t (y_t^P - \bar y^P)\,u_t^P = \frac{1}{n}\sum_t (b x_t - b \bar x)\,u_t^P = b\left(\frac{1}{n}\sum_t x_t u_t^P - \bar x\,\frac{1}{n}\sum_t u_t^P\right) = 0,$$

where the final equality follows from the normal equations, which state $\sum_t u_t^P = \sum_t x_t u_t^P = 0$. Of course the normal equations ensure that the sample analogs equal the population moments. The relevant population moments are $E(U_t \mid X_t) = 0$ (the residuals are mean zero) and $E(X_t U_t \mid X_t) = 0$ (the residuals are uncorrelated with the regressors). Recall Assumption 2.

Issues of identification are in play here. To make the issues clear, consider the model

$$Y_t = \alpha_0 + X_t'\beta_0 + U_t,$$

in which $X_t$ is the $k \times 1$ vector of regressors that does not include the intercept. We now ask, under what conditions are the coefficients identified? If the covariance matrix of $X_t$ is nonsingular and $X_t$ is independent of $U_t$, then $\beta_0$ is identified. An additional assumption is needed to identify $\alpha_0$. Two alternative assumptions that identify
$\alpha_0$ are $E(U_t) = 0$ and $\mathrm{Med}(U_t) = 0$. The only difference is in the interpretation of $\alpha_0 + X_t'\beta_0$, as discussed above. Alternatively, we could assume that $U_t$ is symmetrically distributed around 0, conditional on $X_t$. Then $\alpha_0$ and $\beta_0$ are identified and $\alpha_0 + X_t'\beta_0$ is both the conditional mean and the conditional median, as well as being equal to other location measures. Both $\alpha_0$ and $\beta_0$ are identified under a conditional location restriction that is weaker than either the assumption of independence (between the regressor and the error) or the assumption of conditional symmetry. Further, each conditional location restriction is associated with a conditional moment restriction

$$E\big[f(U_t) \mid X_t\big] = 0$$

for some function $f(U_t)$, from which an estimator is constructed. Consider the two location assumptions introduced earlier. If $E(U_t \mid X_t) = 0$, then $f(U_t) = U_t$ and the resultant estimator is OLS (and, again, $\alpha_0 + X_t'\beta_0$ is the conditional mean of $Y_t$). If $\mathrm{Med}(U_t \mid X_t) = 0$, the corresponding moment condition is $E[\mathrm{sgn}(U_t) \mid X_t] = 0$ and the resulting estimator is least absolute deviations (and, again, $\alpha_0 + X_t'\beta_0$ is the conditional median of $Y_t$).

To derive the moment condition for OLS, note that $E(U_t \mid X_t) = 0$ is clearly a moment condition that can be used for estimation. The OLSE $B$ thus satisfies

$$\sum_t X_t U_t(B) = 0.$$

While $\mathrm{Med}(U_t \mid X_t) = 0$ is a moment condition, it may not be as clear how it can be used to form an estimator. Consider first the case in which $U_t$ is continuous. The assumption $\mathrm{Med}(U_t \mid X_t) = 0$ implies

$$P(U_t < 0 \mid X_t) = P(U_t > 0 \mid X_t) = \tfrac{1}{2},$$

which implies $E[\mathrm{sgn}(U_t) \mid X_t] = 0$, which in turn implies $E[X_t\,\mathrm{sgn}(U_t)] = 0$. The signum, or sign, function is defined as

$$\mathrm{sgn}(u) = \begin{cases} 1 & \text{if } u > 0 \\ 0 & \text{if } u = 0 \\ -1 & \text{if } u < 0. \end{cases}$$
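The signum function and the moment condition it induces can be sketched directly. In the snippet below (a hypothetical, symmetric stand-in sample, chosen for illustration), the sample analog of $E[\mathrm{sgn}(U_t)] = 0$ holds exactly:

```python
def sgn(u):
    """Signum function: 1 for u > 0, 0 for u = 0, -1 for u < 0."""
    return (u > 0) - (u < 0)  # booleans subtract to -1, 0, or 1

# A median-zero, symmetric sample: equal mass above and below zero,
# so the sample analog of E[sgn(U_t)] = 0 is exactly zero.
u_sample = [-2.0, -0.5, 0.0, 0.5, 2.0]
moment = sum(sgn(u) for u in u_sample)
```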
The LAD estimator $B_L$ satisfies the sample analog

$$\sum_t X_t\,\mathrm{sgn}\big(U_t(B_L)\big) = 0.$$

There are two problems here. First, it may not be apparent that the sample analog with $B_L$ admits a unique solution. In fact, in Powell's symmetrically trimmed LAD paper in Econometrica, his conditional moment equation has many solutions. Also, if $U_t$ is not distributed symmetrically, then the assumption $\mathrm{Med}(U_t \mid X_t) = 0$ does not necessarily lead to a simple moment condition for estimation. The problem is that if $U_t$ does not have a continuous distribution, then it is possible that there is positive point mass at the median, so it is possible that $E[\mathrm{sgn}(U_t) \mid X_t] \neq 0$.

The alternative is to return to the loss function (also termed the objective function). The loss function approach solves both problems. First, there is clearly a unique solution (as Powell shows in the appendix to the above mentioned paper). Second, the loss function approach works well even if $U_t$ does not have a continuous distribution.
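The loss-function approach can be sketched on toy data. The snippet below (hypothetical numbers; a crude grid search is used purely for illustration, not Powell's estimator; serious LAD computation uses linear programming) minimizes the least-absolute-deviations objective $\sum_t |y_t - b\,x_t|$ for a one-parameter, no-intercept model:

```python
# Minimal sketch (toy, hypothetical data): LAD via the loss function
# sum |y_t - b*x_t| for a one-parameter (no-intercept) model.
x = [1.0, 2.0, 3.0, 4.0]
y = [1.2, 1.9, 3.3, 3.8]

def lad_loss(b):
    """Least-absolute-deviations objective at candidate slope b."""
    return sum(abs(yi - b * xi) for xi, yi in zip(x, y))

# Crude grid search over candidate slopes 0.000, 0.001, ..., 2.000.
grid = [i / 1000 for i in range(2001)]
b_lad = min(grid, key=lad_loss)
```

Because the objective is piecewise linear in $b$ with kinks at the ratios $y_t/x_t$, the minimizer sits at one of those kinks; the loss function remains well defined, and its minimizer well behaved, even when the residual distribution has point mass, which is exactly the advantage claimed above.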
More information1.010 Uncertainty in Engineering Fall 2008
MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval
More informationThis exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.
Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the
More informationResampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.
Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator
More information6.867 Machine learning, lecture 7 (Jaakkola) 1
6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit
More informationLecture 2: Monte Carlo Simulation
STAT/Q SCI 43: Itroductio to Resamplig ethods Sprig 27 Istructor: Ye-Chi Che Lecture 2: ote Carlo Simulatio 2 ote Carlo Itegratio Assume we wat to evaluate the followig itegratio: e x3 dx What ca we do?
More informationEcon 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara
Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio
More informationLECTURE 11 LINEAR PROCESSES III: ASYMPTOTIC RESULTS
PRIL 7, 9 where LECTURE LINER PROCESSES III: SYMPTOTIC RESULTS (Phillips ad Solo (99) ad Phillips Lecture Notes o Statioary ad Nostatioary Time Series) I this lecture, we discuss the LLN ad CLT for a liear
More informationCSE 527, Additional notes on MLE & EM
CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be
More informationIntroduction to Machine Learning DIS10
CS 189 Fall 017 Itroductio to Machie Learig DIS10 1 Fu with Lagrage Multipliers (a) Miimize the fuctio such that f (x,y) = x + y x + y = 3. Solutio: The Lagragia is: L(x,y,λ) = x + y + λ(x + y 3) Takig
More informationConvergence of random variables. (telegram style notes) P.J.C. Spreij
Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space
More informationSolution to Chapter 2 Analytical Exercises
Nov. 25, 23, Revised Dec. 27, 23 Hayashi Ecoometrics Solutio to Chapter 2 Aalytical Exercises. For ay ε >, So, plim z =. O the other had, which meas that lim E(z =. 2. As show i the hit, Prob( z > ε =
More informationMA Advanced Econometrics: Properties of Least Squares Estimators
MA Advaced Ecoometrics: Properties of Least Squares Estimators Karl Whela School of Ecoomics, UCD February 5, 20 Karl Whela UCD Least Squares Estimators February 5, 20 / 5 Part I Least Squares: Some Fiite-Sample
More information5. Likelihood Ratio Tests
1 of 5 7/29/2009 3:16 PM Virtual Laboratories > 9. Hy pothesis Testig > 1 2 3 4 5 6 7 5. Likelihood Ratio Tests Prelimiaries As usual, our startig poit is a radom experimet with a uderlyig sample space,
More informationQuick Review of Probability
Quick Review of Probability Berli Che Departmet of Computer Sciece & Iformatio Egieerig Natioal Taiwa Normal Uiversity Refereces: 1. W. Navidi. Statistics for Egieerig ad Scietists. Chapter 2 & Teachig
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece
More informationBinomial Distribution
0.0 0.5 1.0 1.5 2.0 2.5 3.0 0 1 2 3 4 5 6 7 0.0 0.5 1.0 1.5 2.0 2.5 3.0 Overview Example: coi tossed three times Defiitio Formula Recall that a r.v. is discrete if there are either a fiite umber of possible
More informationMATHEMATICAL SCIENCES PAPER-II
MATHEMATICAL SCIENCES PAPER-II. Let {x } ad {y } be two sequeces of real umbers. Prove or disprove each of the statemets :. If {x y } coverges, ad if {y } is coverget, the {x } is coverget.. {x + y } coverges
More informationGeometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT
OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca
More informationAlgorithms for Clustering
CR2: Statistical Learig & Applicatios Algorithms for Clusterig Lecturer: J. Salmo Scribe: A. Alcolei Settig: give a data set X R p where is the umber of observatio ad p is the umber of features, we wat
More informationLecture 10 October Minimaxity and least favorable prior sequences
STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least
More informationLecture 7: Density Estimation: k-nearest Neighbor and Basis Approach
STAT 425: Itroductio to Noparametric Statistics Witer 28 Lecture 7: Desity Estimatio: k-nearest Neighbor ad Basis Approach Istructor: Ye-Chi Che Referece: Sectio 8.4 of All of Noparametric Statistics.
More informationReview Questions, Chapters 8, 9. f(y) = 0, elsewhere. F (y) = f Y(1) = n ( e y/θ) n 1 1 θ e y/θ = n θ e yn
Stat 366 Lab 2 Solutios (September 2, 2006) page TA: Yury Petracheko, CAB 484, yuryp@ualberta.ca, http://www.ualberta.ca/ yuryp/ Review Questios, Chapters 8, 9 8.5 Suppose that Y, Y 2,..., Y deote a radom
More informationOutline. Linear regression. Regularization functions. Polynomial curve fitting. Stochastic gradient descent for regression. MLE for regression
REGRESSION 1 Outlie Liear regressio Regularizatio fuctios Polyomial curve fittig Stochastic gradiet descet for regressio MLE for regressio Step-wise forward regressio Regressio methods Statistical techiques
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationn outcome is (+1,+1, 1,..., 1). Let the r.v. X denote our position (relative to our starting point 0) after n moves. Thus X = X 1 + X 2 + +X n,
CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 9 Variace Questio: At each time step, I flip a fair coi. If it comes up Heads, I walk oe step to the right; if it comes up Tails, I walk oe
More information