
Econ 39 - Statistical Properties of the OLS Estimator
Sanjaya DeSilva
September 2008

1 Overview

Recall that the true regression model is

    Y_i = β_0 + β_1 X_i + u_i                                (1)

Applying the OLS method to a sample of data, we estimate the sample regression function

    Y_i = b_0 + b_1 X_i + e_i                                (2)

where the OLS estimators are

    b_1 = Σ_{i=1}^n x_i y_i / Σ_{i=1}^n x_i^2,    b_0 = Ȳ - b_1 X̄

and x_i = X_i - X̄, y_i = Y_i - Ȳ denote deviations from the sample means.
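To make these formulas concrete, here is a minimal sketch in Python with NumPy; the parameter values, sample size, and seed are invented for the illustration.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate a sample from the true model (1); values are assumptions
    beta0, beta1, sigma = 2.0, 0.5, 1.0
    n = 50
    X = rng.uniform(0, 10, size=n)
    u = rng.normal(0, sigma, size=n)
    Y = beta0 + beta1 * X + u

    # Deviations from the sample means: x_i = X_i - Xbar, y_i = Y_i - Ybar
    x = X - X.mean()
    y = Y - Y.mean()

    # OLS estimators of the slope and intercept
    b1 = np.sum(x * y) / np.sum(x ** 2)
    b0 = Y.mean() - b1 * X.mean()
    print(b1, b0)   # close to, but not equal to, beta1 and beta0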

2 Unbiasedness

The OLS estimate b_1 is simply a sample estimate of the population parameter β_1. For every random sample we draw from the population, we will get a different b_1. What, then, is the relationship between the b_1 we obtain from a random sample and the underlying β_1 of the population?

To see this, start by rewriting the OLS estimator as follows:

    b_1 = Σ x_i y_i / Σ x_i^2                                (3)
        = Σ x_i (Y_i - Ȳ) / Σ x_i^2                          (4)
        = (Σ x_i Y_i - Σ x_i Ȳ) / Σ x_i^2                    (5)
        = (Σ x_i Y_i - Ȳ Σ x_i) / Σ x_i^2                    (6)
        = Σ x_i Y_i / Σ x_i^2                                (7)

where the last step uses the fact that deviations from the mean sum to zero, Σ x_i = 0.

For Y_i, we can substitute the expression for the true regression line in order to obtain a relationship between b_1 and β_1:

    b_1 = Σ x_i (β_0 + β_1 X_i + u_i) / Σ x_i^2              (8)
        = (β_0 Σ x_i + β_1 Σ x_i X_i + Σ x_i u_i) / Σ x_i^2  (9)
        = β_1 + Σ k_i u_i                                    (10)

where

    k_i = x_i / Σ x_i^2                                      (11)

and we have used Σ x_i = 0 and Σ x_i X_i = Σ x_i^2 once more.

From this expression, we see that b_1 and β_1 are in fact different. However, we can demonstrate that, under certain assumptions, the average b_1 in repeated sampling would equal β_1. To see this, take the expectation of both sides of the above expression:

    E(b_1) = β_1 + E(Σ k_i u_i)                              (12)

If we assume that X_i, and therefore k_i, is non-stochastic, we can rewrite this as

    E(b_1) = β_1 + Σ k_i E(u_i)                              (13)
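Before taking expectations, it may help to check the algebra of (10)-(11) numerically. This sketch (again with invented parameter values) confirms that the sampling error b_1 - β_1 coincides with Σ k_i u_i in a simulated sample.

    import numpy as np

    rng = np.random.default_rng(0)
    beta0, beta1, sigma, n = 2.0, 0.5, 1.0, 50
    X = rng.uniform(0, 10, size=n)
    u = rng.normal(0, sigma, size=n)
    Y = beta0 + beta1 * X + u

    x = X - X.mean()
    k = x / np.sum(x ** 2)          # the weights k_i of equation (11)
    b1 = np.sum(x * (Y - Y.mean())) / np.sum(x ** 2)

    # Decomposition (10): sampling error equals the weighted sum of errors
    print(b1 - beta1)
    print(np.sum(k * u))            # same number, up to floating-point error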

If we also assume that E(u_i) = 0, we get

    E(b_1) = β_1                                             (14)

When the expectation of the sample estimate equals the true parameter, we say that the estimator is unbiased. To recap, we find that if X is non-stochastic and E(u_i) = 0, the OLS estimator is unbiased.

However, note that these two conditions are not necessary for unbiasedness. Suppose k_i is stochastic. Then, b_1 is still unbiased if

    E(Σ k_i u_i) = Σ E(k_i u_i) = 0                          (15)

That is, if X and u are uncorrelated, the OLS estimator is unbiased.
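A small Monte Carlo experiment makes the unbiasedness claim concrete: holding X fixed and redrawing the errors many times, the average of the b_1 estimates settles near β_1. A minimal sketch, with invented parameter values:

    import numpy as np

    rng = np.random.default_rng(0)
    beta0, beta1, sigma, n = 2.0, 0.5, 1.0, 50
    X = rng.uniform(0, 10, size=n)   # X is held fixed in repeated sampling
    x = X - X.mean()

    draws = []
    for _ in range(10_000):          # each pass is one "random sample": new u
        u = rng.normal(0, sigma, size=n)
        Y = beta0 + beta1 * X + u
        draws.append(np.sum(x * (Y - Y.mean())) / np.sum(x ** 2))

    print(np.mean(draws))            # approximately beta1 = 0.5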

3 Variance of the Coefficient Estimate

The variance of the b_1 sampling distribution is, by definition,

    Var(b_1) = E[b_1 - E(b_1)]^2                             (16)

We showed in the previous section that, under certain classical assumptions, E(b_1) = β_1. Then,

    Var(b_1) = E[b_1 - β_1]^2 = E[Σ k_i u_i]^2               (17)

Expanding terms, we get

    Var(b_1) = E[Σ k_i^2 u_i^2 + Σ_{i≠j} k_i k_j u_i u_j]    (18)
             = Σ k_i^2 E(u_i^2) + Σ_{i≠j} k_i k_j E(u_i u_j) (19)

If we make the following two additional assumptions:

1. The variance of the error term is constant, i.e. Var(u_i) = E[u_i^2] = σ^2
2. The error terms of different observations are not correlated with each other, i.e. the covariance between any two error terms is zero: E(u_i u_j) = 0 for all i ≠ j

the expression for the variance of b_1 reduces to the following elegant form:

    Var(b_1) = σ^2 Σ k_i^2 = σ^2 / Σ x_i^2                   (20)

Note that the variance of the slope coefficient depends on two things. The variance of the slope coefficient increases as

1. the variance of the error term increases;
2. the sum of squared variation in the independent variable decreases, i.e. as the X variable is clustered around its mean.
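Extending the Monte Carlo sketch above, the empirical variance of b_1 across repeated samples can be compared with the formula in (20). Parameter values are again invented for the illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    beta0, beta1, sigma, n = 2.0, 0.5, 1.0, 50
    X = rng.uniform(0, 10, size=n)
    x = X - X.mean()

    draws = []
    for _ in range(10_000):
        Y = beta0 + beta1 * X + rng.normal(0, sigma, size=n)
        draws.append(np.sum(x * (Y - Y.mean())) / np.sum(x ** 2))

    print(np.var(draws))                 # empirical sampling variance of b1
    print(sigma ** 2 / np.sum(x ** 2))   # the formula sigma^2 / sum of x_i^2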

3.1 Estimate of the Variance of the Error Term

Even though the above expression is elegant, it is impossible to compute the variance of the slope estimate because we do not know the variance of the underlying error term. We get around this problem by estimating the variance of the error term σ^2 using the residuals obtained from OLS. It can be shown that, under certain classical assumptions,

    σ̂^2 = Σ e_i^2 / (n - 2)                                  (21)

is an unbiased estimator of σ^2, i.e.

    E[σ̂^2] = E[Σ e_i^2 / (n - 2)] = σ^2                      (22)

For the formal proof, see the Gujarati appendix. Note that this proof also depends crucially on the classical assumptions.

Note that the numerator of this unbiased estimator is the sum of squared residuals (SSR). The estimator itself is often called the mean square residual. The square root of the estimator is called the standard error of the regression (SER) and is typically used as an estimate of the standard deviation of the error term.
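In practice these quantities are computed from a single sample. A minimal sketch (Python/NumPy, invented data-generating values) of σ̂^2, the SER, and the resulting estimated standard deviation of b_1:

    import numpy as np

    rng = np.random.default_rng(0)
    beta0, beta1, sigma, n = 2.0, 0.5, 1.0, 50
    X = rng.uniform(0, 10, size=n)
    Y = beta0 + beta1 * X + rng.normal(0, sigma, size=n)

    x = X - X.mean()
    b1 = np.sum(x * (Y - Y.mean())) / np.sum(x ** 2)
    b0 = Y.mean() - b1 * X.mean()

    e = Y - b0 - b1 * X                    # OLS residuals
    sigma2_hat = np.sum(e ** 2) / (n - 2)  # SSR/(n-2): the mean square residual
    ser = np.sqrt(sigma2_hat)              # standard error of the regression
    se_b1 = np.sqrt(sigma2_hat / np.sum(x ** 2))  # estimated s.d. of b1, cf. (20)
    print(sigma2_hat, ser, se_b1)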

4 The Efficiency of the OLS Estimator

Under the classical assumptions, the OLS estimator b_1 can be written as a linear function of Y:

    b_1 = Σ k_i Y_i                                          (23)

where

    k_i = x_i / Σ x_i^2                                      (24)

Our goal now is to show that this OLS estimator has a lower variance than any other linear estimator, i.e. that the OLS estimator is efficient or "best". To do so, consider any other linear unbiased estimator,

    b_1* = Σ w_i Y_i                                         (25)

where w_i is some other function of the X variables.

The expected value of this estimator is

    E(b_1*) = Σ w_i E(Y_i) = β_0 Σ w_i + β_1 Σ w_i X_i       (26)

Because b_1* is unbiased,

    E(b_1*) = β_1                                            (27)

For this to be the case, it follows that

    Σ w_i = 0                                                (28)
    Σ w_i X_i = 1                                            (29)

It follows from these two identities that

    Σ w_i x_i = Σ w_i (X_i - X̄) = Σ w_i X_i - X̄ Σ w_i = 1    (30)

The variance of b_1* is

    Var(b_1*) = Var(Σ w_i Y_i) = Σ w_i^2 Var(Y_i) = σ^2 Σ w_i^2   (31)

If we rewrite the variance as

    Var(b_1*) = σ^2 Σ (w_i - k_i + k_i)^2                    (32)

and expand this expression,

    Var(b_1*) = σ^2 ( Σ (w_i - k_i)^2 + Σ k_i^2 + 2 Σ k_i (w_i - k_i) )   (33)

Note that

    Σ k_i w_i = Σ w_i x_i / Σ x_i^2 = 1 / Σ x_i^2            (34)

by the unbiasedness conditions derived above.

In addition,

    Σ k_i^2 = Σ x_i^2 / (Σ x_i^2)^2 = 1 / Σ x_i^2            (35)

so the cross term in (33) vanishes: Σ k_i (w_i - k_i) = Σ k_i w_i - Σ k_i^2 = 0. Therefore, the variance of b_1* simplifies to

    Var(b_1*) = σ^2 ( Σ (w_i - k_i)^2 + Σ k_i^2 )            (36)

This expression is minimized when

    w_i = k_i                                                (37)

and the minimum variance is

    Var(b_1*) = σ^2 Σ k_i^2                                  (38)

This completes the proof that, under the classical assumptions, the OLS estimator has the least variance among all linear unbiased estimators.
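The decomposition in (36) can be illustrated numerically: any other unbiased weight vector w differs from the OLS weights k by a component d with Σ d_i = 0 and Σ d_i X_i = 0, and that component only adds variance. In the sketch below (Python/NumPy), the particular perturbation d is an arbitrary invented choice.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 50
    X = rng.uniform(0, 10, size=n)
    x = X - X.mean()
    k = x / np.sum(x ** 2)                 # OLS weights, equation (24)

    # Build alternative weights w = k + d with d orthogonal to the constant
    # and to X, so that sum(w) = 0 and sum(w * X) = 1 continue to hold.
    Z = np.column_stack([np.ones(n), X])
    z = X ** 2
    d = z - Z @ np.linalg.lstsq(Z, z, rcond=None)[0]  # residual of z on (1, X)
    d /= 100 * np.linalg.norm(d)                      # keep the perturbation small
    w = k + d

    print(np.sum(w), np.sum(w * X))        # approximately 0 and 1: w is unbiased
    print(np.sum(k ** 2), np.sum(w ** 2))  # sum(w^2) > sum(k^2): higher variance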

4.1 Consistency

We established that the OLS estimator is unbiased and efficient under the classical assumptions. We can also show easily that the OLS estimator is consistent under the same assumptions. An unbiased estimator is consistent if its variance approaches zero as the sample size increases. To see this, start with the expression for the variance,

    Var(b_1) = σ^2 / Σ x_i^2                                 (39)

Divide both the numerator and the denominator by n:

    Var(b_1) = (σ^2 / n) / (Σ x_i^2 / n)                     (40)

As n → ∞, the numerator approaches zero whereas the denominator, the sample variance of X, remains positive. Therefore,

    lim_{n→∞} Var(b_1) = 0                                   (41)
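A quick numerical check of (40)-(41), with an invented design for X: as the sample grows, Σ x_i^2 / n settles near the population variance of X, so the variance formula collapses toward zero.

    import numpy as np

    rng = np.random.default_rng(0)
    sigma = 1.0
    for n in [10, 100, 1_000, 10_000]:
        X = rng.uniform(0, 10, size=n)
        x = X - X.mean()
        # Var(b1) = (sigma^2 / n) / (sum(x_i^2) / n): the numerator shrinks
        # while the denominator stays near Var(X) = 100/12.
        print(n, sigma ** 2 / np.sum(x ** 2))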

5 Gauss-Markov Theorem and the Classical Assumptions

To recap, we have demonstrated that the OLS estimator,

    b_1 = Σ x_i y_i / Σ x_i^2 = Σ k_i Y_i                    (42)

has the following properties:

1. It is unbiased, i.e.

    E(b_1) = β_1                                             (43)

2. Its variance is

    Var(b_1) = σ^2 / Σ x_i^2 = σ^2 Σ k_i^2                   (44)

3. It is best, or efficient, i.e. it has a lower variance than any other linear unbiased estimator:

    Var(b_1) < Var(b_1*)                                     (45)

where b_1* = Σ w_i Y_i and w_i is any other function of the X variables.

4. It is consistent, i.e.

    lim_{n→∞} Var(b_1) = 0                                   (46)

These properties hold if the following classical assumptions are satisfied:

1. The underlying regression model is linear in parameters, has an additive error, and is correctly specified, i.e.

    Y_i = β_0 + β_1 f(X_i) + u_i                             (47)

2. The X variable is non-stochastic, i.e. fixed in repeated sampling.

3. The expected value of the error term is zero, i.e.

    E(u_i) = 0                                               (48)

Note that the intercept term β_0 ensures that this condition is met. Consider

    Y_i = β_0 + β_1 X_i + u_i                                (49)

with

    E(u_i) = c ≠ 0                                           (50)

This is equivalent to a model

    Y_i = β_0* + β_1 X_i + u_i*                              (51)

where

    β_0* = β_0 + c,    u_i* = u_i - c                        (52)
    E(u_i*) = 0                                              (53)

Note also that the first three conditions are sufficient for OLS to be unbiased.

4. The explanatory variable X is uncorrelated with the error term u, i.e.

    Cov(X_i, u_i) = E[x_i u_i] = 0                           (54)

Note that this assumption is necessary for OLS to be unbiased. Even if X is stochastic, we can obtain unbiased coefficients as long as X is uncorrelated with the error term. Such correlation typically arises when X is endogenous, i.e. determined by other variables. If both X and Y are determined by the same unobserved variables, this assumption is violated. If X and Y are determined by each other, i.e. in a system of simultaneous equations, this assumption is also violated. For example, if

    Y_i = β_0 + β_1 X_i + u_i                                (55)
    X_i = δ_0 + δ_1 Y_i + ε_i                                (56)

then Cov(X_i, u_i) ≠ 0 if δ_1 ≠ 0 and/or Cov(u_i, ε_i) ≠ 0. (A simulated illustration of this case follows the list below.)

5. The error term is homoskedastic, i.e. its conditional variance is constant:

    Var(u_i | X_i) = E(u_i^2 | X_i) = σ^2                    (57)

6. The error term is serially uncorrelated, i.e. the error term of one observation is not correlated with the error term of any other observation:

    Cov(u_i, u_j | X_i, X_j) = E(u_i u_j | X_i, X_j) = 0 for all i ≠ j   (58)

The assumptions of serially uncorrelated and homoskedastic errors allow us to obtain an unbiased estimator of the variance of the error term, and the simple OLS formula for the variance of the coefficient estimate. In addition, we need these two assumptions to demonstrate that OLS is efficient. In fact, we will see later that GLS methods are efficient when these assumptions are violated.
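The simultaneous-equations case in (55)-(56) can be simulated by solving the two equations for their reduced form. In this sketch (Python/NumPy, all parameter values invented), δ_1 ≠ 0 makes X correlated with u, and OLS is biased even in a very large sample:

    import numpy as np

    rng = np.random.default_rng(0)
    beta0, beta1 = 2.0, 0.5
    delta0, delta1 = 1.0, 0.4          # nonzero delta1: Y feeds back into X
    n = 100_000

    u = rng.normal(0, 1, size=n)
    eps = rng.normal(0, 1, size=n)

    # Reduced form of the system (55)-(56), solved for X and then Y:
    X = (delta0 + delta1 * beta0 + delta1 * u + eps) / (1 - delta1 * beta1)
    Y = beta0 + beta1 * X + u

    x = X - X.mean()
    b1 = np.sum(x * (Y - Y.mean())) / np.sum(x ** 2)
    print(b1)   # noticeably above beta1 = 0.5, and more data does not help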

There are a few other assumptions that are necessary to obtain OLS coefficients and standard errors:

1. At least one degree of freedom, i.e. the number of observations must exceed the number of parameters (n > k + 1, where k is the number of X variables). In a simple regression with one X variable, this means there should be at least three observations.

2. No X variable should be a deterministic linear function of the other X variables, i.e. no perfect multicollinearity. This condition applies only to multiple regressions, where there is more than one X variable, and is discussed later.

3. There should be some variation in the X variable. If the X variable does not vary, it is impossible to estimate the slope of the regression line.
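As a practical footnote, a hand-rolled OLS routine can verify these requirements before computing anything. A minimal sketch (the function name and error messages are invented):

    import numpy as np

    def ols_slope(X, Y):
        """Slope of a simple regression, with guard checks on the data."""
        X = np.asarray(X, dtype=float)
        Y = np.asarray(Y, dtype=float)
        if len(X) < 3:                 # need n > k + 1 with k = 1, i.e. n >= 3
            raise ValueError("need at least three observations")
        x = X - X.mean()
        sxx = np.sum(x ** 2)
        if sxx == 0.0:                 # no variation in X
            raise ValueError("X has no variation; the slope is not identified")
        return np.sum(x * (Y - Y.mean())) / sxx

    print(ols_slope([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8]))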