Regression #3: Properties of OLS Estimator

Econ 671, Purdue University. Justin L. Tobias.

Introduction

In this lecture, we establish some desirable properties of the OLS estimator. These include proofs of unbiasedness and consistency for both $\hat{\beta}$ and $\hat{\sigma}^2$, and a derivation of the conditional and unconditional variance-covariance matrix of $\hat{\beta}$.

Unbiasedness

The model is $y_i = x_i\beta + \epsilon_i$, or $y = X\beta + \epsilon$ in matrix form, and the OLS estimator is
$$\hat{\beta} = (X'X)^{-1}X'y.$$
We continue with our standard set of regression assumptions, including $E(\epsilon|X) = 0$ and $E(\epsilon\epsilon'|X) = \sigma^2 I_n$.

Theorem. Under these assumptions, $E(\hat{\beta}|X) = \beta$ and $E(\hat{\beta}) = \beta$; that is, the OLS estimator is unbiased.

What does this actually mean? Can you think of a situation where an unbiased estimator might not be preferred over a biased alternative?

Unbiasedness

Proof. First, consider $E(\hat{\beta}|X)$. To this end, we note:
$$\hat{\beta} = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + \epsilon) = \beta + (X'X)^{-1}X'\epsilon,$$
so that
$$E(\hat{\beta}|X) = \beta + (X'X)^{-1}X'E(\epsilon|X) = \beta.$$
Therefore, by the law of iterated expectations,
$$E(\hat{\beta}) = E_X\left[E(\hat{\beta}|X)\right] = E_X[\beta] = \beta.$$
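
To make unbiasedness concrete, here is a minimal Monte Carlo sketch (not from the original slides; the design, true coefficients, and error scale are illustrative choices) that holds $X$ fixed, redraws $\epsilon$ many times, and averages the resulting OLS estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 3
X = rng.normal(size=(n, k))          # fixed design: we condition on X
beta = np.array([1.0, -2.0, 0.5])    # true coefficients (illustrative)
sigma = 2.0

bhats = np.empty((5000, k))
for r in range(bhats.shape[0]):
    eps = rng.normal(scale=sigma, size=n)         # E(eps | X) = 0
    y = X @ beta + eps
    bhats[r] = np.linalg.solve(X.T @ X, X.T @ y)  # (X'X)^{-1} X'y

print(bhats.mean(axis=0))  # close to [1.0, -2.0, 0.5]: E(beta-hat | X) = beta
```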

Variance-Covariance Matrix

We now seek to obtain the variance-covariance matrix of the OLS estimator. To this end, we note, using $\hat{\beta} - \beta = (X'X)^{-1}X'\epsilon$ from the previous slide:
$$\mathrm{Var}(\hat{\beta}|X) = E\left[(\hat{\beta}-\beta)(\hat{\beta}-\beta)'|X\right] = (X'X)^{-1}X'E(\epsilon\epsilon'|X)X(X'X)^{-1} = \sigma^2(X'X)^{-1}.$$

Variance-Covariance Matrix

Another way to get this same result is as follows: write $\hat{\beta} = Ay$ with $A = (X'X)^{-1}X'$, and recall that $\mathrm{Var}(Ay|X) = A\,\mathrm{Var}(y|X)\,A'$. Since $\mathrm{Var}(y|X) = \sigma^2 I_n$,
$$\mathrm{Var}(\hat{\beta}|X) = (X'X)^{-1}X'(\sigma^2 I_n)X(X'X)^{-1} = \sigma^2(X'X)^{-1}.$$
So, what do the elements of this $k \times k$ matrix represent? Why are they useful?

Variance-Covariance Matrix

To obtain an unconditional variance-covariance matrix, i.e., $\mathrm{Var}(\hat{\beta})$, we note that, in general,
$$\mathrm{Var}(\hat{\beta}) = E_X\left[\mathrm{Var}(\hat{\beta}|X)\right] + \mathrm{Var}_X\left[E(\hat{\beta}|X)\right].$$
Thus,
$$\mathrm{Var}(\hat{\beta}) = \sigma^2 E\left[(X'X)^{-1}\right]$$
(why?). In practice, we evaluate this at the observed $X$ values:
$$\widehat{\mathrm{Var}}(\hat{\beta}) = \sigma^2 (X'X)^{-1}.$$
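
As a quick check on the formula $\mathrm{Var}(\hat{\beta}|X) = \sigma^2(X'X)^{-1}$, the following sketch (illustrative values, not from the slides) compares the Monte Carlo covariance matrix of $\hat{\beta}$ across replications with the theoretical one:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, sigma = 200, 2, 1.5
X = rng.normal(size=(n, k))                     # fixed design
beta = np.array([0.5, 1.0])

draws = np.empty((20000, k))
for r in range(draws.shape[0]):
    y = X @ beta + rng.normal(scale=sigma, size=n)
    draws[r] = np.linalg.solve(X.T @ X, X.T @ y)

print(np.cov(draws, rowvar=False))        # Monte Carlo Var(beta-hat | X)
print(sigma**2 * np.linalg.inv(X.T @ X))  # theoretical sigma^2 (X'X)^{-1}
```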

Variance-Covariance Matrix

Another issue that arises is that the variance parameter $\sigma^2$ is also unknown and must be estimated. A natural estimator arises upon considering its definition:
$$\sigma^2 = E(\epsilon_i^2).$$
Replacing the population expectation with its sample counterpart, and using $\hat{\beta}$ instead of $\beta$, we obtain an intuitive estimator:
$$\tilde{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n \hat{\epsilon}_i^2, \qquad \hat{\epsilon}_i = y_i - x_i\hat{\beta}.$$

Variance-Covariance Matrix

Though this estimator is widely used, it turns out to be a biased estimator of $\sigma^2$. An unbiased estimator can be obtained by incorporating the degrees-of-freedom correction:
$$\hat{\sigma}^2 = \frac{1}{n-k}\sum_{i=1}^n \hat{\epsilon}_i^2,$$
where $k$ represents the number of explanatory variables included in the model. In the following slides, we show that $\hat{\sigma}^2$ is indeed unbiased.

We seek to show $E(\hat{\sigma}^2|X) = \sigma^2$.

Proof. Write the residual vector as
$$\hat{\epsilon} = y - X\hat{\beta} = My = M(X\beta + \epsilon) = M\epsilon,$$
where $M = I_n - X(X'X)^{-1}X'$ is symmetric and idempotent, and the last result follows since $X'M = MX = 0$. Hence
$$\sum_{i=1}^n \hat{\epsilon}_i^2 = \hat{\epsilon}'\hat{\epsilon} = \epsilon' M \epsilon.$$

Proof (continued). Using the trace trick and $E(\epsilon\epsilon'|X) = \sigma^2 I_n$,
$$E(\epsilon'M\epsilon|X) = E\left[\mathrm{tr}(M\epsilon\epsilon')|X\right] = \mathrm{tr}\left[M\,E(\epsilon\epsilon'|X)\right] = \sigma^2\,\mathrm{tr}(M) = \sigma^2(n-k),$$
since $\mathrm{tr}(M) = n - \mathrm{tr}\left[(X'X)^{-1}X'X\right] = n - k$. It follows that
$$\hat{\sigma}^2 = \frac{\epsilon'M\epsilon}{n-k}$$
is an unbiased estimator of $\sigma^2$, as claimed.
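
The bias of the $1/n$ estimator and the effect of the degrees-of-freedom correction are easy to see numerically. This sketch (illustrative values, not from the slides; a small $n$ relative to $k$ makes the bias visible) compares the Monte Carlo means of the two estimators:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, sigma2 = 30, 5, 4.0          # small n relative to k exaggerates the bias
X = rng.normal(size=(n, k))
beta = np.ones(k)

naive, corrected = [], []
for _ in range(20000):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    resid = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    ss = resid @ resid                 # epsilon' M epsilon
    naive.append(ss / n)               # biased: E = sigma^2 (n-k)/n
    corrected.append(ss / (n - k))     # unbiased after the correction

print(np.mean(naive))      # about 4.0 * 25/30 = 3.33
print(np.mean(corrected))  # about 4.0
```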

Consistency

Recall the definition of a consistent estimator $\hat{\theta}(x^n) = \hat{\theta}_n$ of $\theta$. We say $\hat{\theta}_n$ is consistent if, for any $\epsilon > 0$,
$$\lim_{n\to\infty} \Pr\left\{|\hat{\theta}_n - \theta| > \epsilon\right\} = 0.$$
Relatedly, we say that $\hat{\theta}_n$ converges in mean square to $\theta$ if
$$\lim_{n\to\infty} E(\hat{\theta}_n - \theta)^2 = 0.$$
The MSE criterion can also be written as the bias squared plus the variance, whence $\hat{\theta}_n \xrightarrow{m.s.} \theta$ iff $\mathrm{Bias}(\hat{\theta}_n) \to 0$ and $\mathrm{Var}(\hat{\theta}_n) \to 0$.

Consistency

We will prove that the MSE can be written as the square of the bias plus the variance:
$$\begin{aligned}
E\left[(\hat{\theta}_n - \theta)^2\right] &= E\left[\left(\hat{\theta}_n - E(\hat{\theta}_n) + E(\hat{\theta}_n) - \theta\right)^2\right] \\
&= E\left[(\hat{\theta}_n - E(\hat{\theta}_n))^2\right] + 2E\left[(\hat{\theta}_n - E(\hat{\theta}_n))(E(\hat{\theta}_n) - \theta)\right] + E\left[(E(\hat{\theta}_n) - \theta)^2\right] \\
&= E\left[(\hat{\theta}_n - E(\hat{\theta}_n))^2\right] + \left[E(\hat{\theta}_n) - \theta\right]^2 \\
&= \mathrm{Variance} + \mathrm{Bias}^2,
\end{aligned}$$
where the cross term vanishes because $E(\hat{\theta}_n) - \theta$ is nonrandom and $E[\hat{\theta}_n - E(\hat{\theta}_n)] = 0$.
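
The decomposition is easy to verify numerically. As an illustration (not from the slides), take the deliberately biased estimator $\hat{\theta}_n = 0.9\,\bar{x}$ of a normal mean and compare the simulated MSE with $\mathrm{Bias}^2 + \mathrm{Variance}$:

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma, n = 2.0, 1.0, 25
c = 0.9   # shrinkage factor; c != 1 makes the estimator biased

# 200,000 replications of theta-hat = c * sample mean
thetahat = c * rng.normal(mu, sigma, size=(200_000, n)).mean(axis=1)

mse = np.mean((thetahat - mu) ** 2)
bias2 = (thetahat.mean() - mu) ** 2   # analytically ((c-1)*mu)^2 = 0.04
var = thetahat.var()                  # analytically c^2 sigma^2 / n = 0.0324
print(mse, bias2 + var)               # the two agree up to Monte Carlo noise
```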

Consistency

Convergence in mean square is also a stronger condition than convergence in probability:

Proof. Fix $\epsilon > 0$ and note:
$$\begin{aligned}
E\left[(\hat{\theta}(x^n) - \theta)^2\right] &= \int_{\mathcal{X}^n} (\hat{\theta}(x^n) - \theta)^2 f_n(x^n)\,dx^n \\
&\geq \int_{\{x^n : |\hat{\theta}(x^n) - \theta| > \epsilon\}} (\hat{\theta}(x^n) - \theta)^2 f_n(x^n)\,dx^n \\
&\geq \epsilon^2 \int_{\{x^n : |\hat{\theta}(x^n) - \theta| > \epsilon\}} f_n(x^n)\,dx^n \\
&= \epsilon^2 \Pr\left\{|\hat{\theta}(x^n) - \theta| > \epsilon\right\}.
\end{aligned}$$

Consistency

Thus,
$$0 \leq \Pr\left\{|\hat{\theta}(x^n) - \theta| > \epsilon\right\} \leq \frac{1}{\epsilon^2} E\left[(\hat{\theta}(x^n) - \theta)^2\right].$$
For fixed $\epsilon$, taking limits as $n \to \infty$ gives the result. The assumption of convergence in mean square therefore guarantees that the estimator converges in probability.

Consistency

Now, let us revisit $\hat{\beta}$. To show that $\hat{\beta} \xrightarrow{p} \beta$ [or $\mathrm{plim}(\hat{\beta}) = \beta$], it is enough to show that the bias and variance of $\hat{\beta}$ go to zero. The estimator has already been shown to be unbiased. As for the variance, note:
$$\mathrm{Var}(\hat{\beta}|X) = \sigma^2(X'X)^{-1} = \frac{\sigma^2}{n}\left(\frac{X'X}{n}\right)^{-1}.$$

Consistency

Consider the matrix $X'X/n$. A typical element of this matrix is a sample average of the form
$$\frac{1}{n}\sum_{i=1}^n x_{ij}x_{il}.$$
Provided these averages settle down to finite population means, it is reasonable to assume
$$\mathrm{plim}\left(\frac{X'X}{n}\right) = Q,$$
where $Q$ has finite elements and is nonsingular.
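
The "settling down" of $X'X/n$ can be seen directly. In this sketch (illustrative; rows of $X$ are drawn i.i.d. with a known second-moment matrix $Q$), the sample average converges to $Q$ as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(4)
Q = np.array([[1.0, 0.5],
              [0.5, 1.0]])      # population second-moment matrix E(x_i x_i')
L = np.linalg.cholesky(Q)

for n in (100, 10_000, 1_000_000):
    X = rng.normal(size=(n, 2)) @ L.T   # rows satisfy E(x_i x_i') = L L' = Q
    print(n)
    print(X.T @ X / n)                  # approaches Q as n grows
```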

Consistency

Since the inverse is a continuous function, we have:
$$\mathrm{plim}\left[\left(\frac{X'X}{n}\right)^{-1}\right] = Q^{-1}.$$
Thus,
$$\mathrm{Var}(\hat{\beta}|X) = \frac{\sigma^2}{n}\left(\frac{X'X}{n}\right)^{-1} \to 0,$$
whence
$$\mathrm{plim}(\hat{\beta}) = \beta,$$
as needed.
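
Consistency in the sense of the definition above can also be illustrated directly: for a fixed tolerance $\epsilon$, the fraction of samples with $|\hat{\beta} - \beta| > \epsilon$ shrinks toward zero as $n$ grows. A minimal single-regressor sketch (illustrative values, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(5)
beta, sigma, tol = 1.0, 3.0, 0.1

for n in (50, 500, 5000):
    bhats = np.empty(2000)
    for r in range(bhats.size):
        x = rng.normal(size=n)
        y = beta * x + rng.normal(scale=sigma, size=n)
        bhats[r] = (x @ y) / (x @ x)   # OLS with a single regressor
    # estimated Pr(|beta-hat - beta| > tol): shrinks toward 0 as n grows
    print(n, np.mean(np.abs(bhats - beta) > tol))
```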

Consistency

Let us now investigate the consistency of $\hat{\sigma}^2$. From before, we can write:
$$\hat{\sigma}^2 = \frac{\epsilon'M\epsilon}{n-k} = \frac{n}{n-k}\left[\frac{\epsilon'\epsilon}{n} - \left(\frac{\epsilon'X}{n}\right)\left(\frac{X'X}{n}\right)^{-1}\left(\frac{X'\epsilon}{n}\right)\right].$$
We can now use some properties of plims to simplify this result. First, note that
$$\mathrm{plim}\left(\frac{\epsilon'\epsilon}{n}\right) = \mathrm{plim}\left(\frac{1}{n}\sum_{i=1}^n \epsilon_i^2\right) = \sigma^2$$
by Chebyshev's LLN. Similarly, note
$$\frac{n}{n-k} \to 1.$$

Consistency

By assumption, we have $(X'X/n)^{-1} \xrightarrow{p} Q^{-1}$, and we also note
$$\mathrm{plim}\left(\frac{X'\epsilon}{n}\right) = 0,$$
given that $E(\epsilon|X) = 0$. Putting all of this together, we have
$$\mathrm{plim}(\hat{\sigma}^2) = 1 \cdot \left[\sigma^2 - 0' \, Q^{-1} \, 0\right] = \sigma^2,$$
as needed.
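
A final sketch (illustrative values, not from the slides) shows $\hat{\sigma}^2$ approaching $\sigma^2$ as the sample grows:

```python
import numpy as np

rng = np.random.default_rng(6)
k, sigma2 = 3, 2.0
beta = np.ones(k)

for n in (50, 500, 5000, 50000):
    X = rng.normal(size=(n, k))
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    resid = y - X @ np.linalg.solve(X.T @ X, X.T @ y)
    print(n, resid @ resid / (n - k))  # sigma-hat^2 approaches sigma^2 = 2.0
```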