Day 1A Ordinary Least Squares and GLS

© A. Colin Cameron, Univ. of Calif.-Davis
Frontiers in Econometrics, Bavarian Graduate Program in Economics (BGPE), March 21-25, 2011

Based on A. Colin Cameron and Pravin K. Trivedi (2009, 2010), Microeconometrics Using Stata (MUS), Stata Press, and A. Colin Cameron and Pravin K. Trivedi (2005), Microeconometrics: Methods and Applications (MMA), C.U.P.

1. Introduction

OLS for the linear model is the building block for other regression methods. Here we provide
- the model in matrix notation
- statistical properties
- hypothesis testing
- simulations to show consistency and asymptotic normality.

Additionally
- more efficient FGLS with heteroskedastic data.

1. Introduction: Overview

1. Introduction
2. OLS: Data example
3. OLS: Matrix notation
4. OLS: Properties
5. GLS: Generalized least squares
6. Tests of linear hypotheses (Wald tests)
7. Simulations: OLS consistency and asymptotic normality
8. Stata commands
9. Appendix: OLS in matrix notation example

2. Data Example: OLS for doctor visits

Cross-section data on individuals (from MUS chapter 10). The dependent variable docvis is a count. Here we do OLS (later Poisson). Begin with data description and summary statistics.

    . use mus10data.dta, clear
    . quietly keep if year02==1
    . describe docvis private chronic female income

                   storage  display   value
    variable name  type     format    label    variable label
    docvis         int      %8.0g              number of doctor visits
    private        byte     %8.0g              = 1 if private insurance
    chronic        byte     %8.0g              = 1 if a chronic condition
    female         byte     %8.0g              = 1 if female
    income         float    %9.0g              Income in $ / 1000

    . summarize docvis private chronic female income

    Variable    Obs     Mean       Std. Dev.   Min       Max
    docvis      4412    3.957389   7.947601    0         134
    private     4412    .7853581   .4106202    0         1
    chronic     4412    .3263826   .4689423    0         1
    female      4412    .4718948   .4992661    0         1
    income      4412    34.34018   29.03987    -49.999   280.777

2. Data Example: OLS regression with default standard errors

OLS regression with default standard errors assumes an i.i.d. error.

    . * OLS regression with default standard errors
    . regress docvis private chronic female income

    Source      SS           df     MS             Number of obs =   4412
    Model       35771.7188   4      8942.92971     F(4, 4407)    = 162.29
    Residual    242846.27    4407   55.1046676     Prob > F      = 0.0000
    Total       278617.989   4411   63.1643594     R-squared     = 0.1284
                                                   Adj R-squared = 0.1276
                                                   Root MSE      = 7.4233

    docvis     Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
    private    1.916263    .2881911    6.65    0.000    1.351264    2.481263
    chronic    4.826799    .2419767    19.95   0.000    4.352404    5.301195
    female     1.889675    .2286615    8.26    0.000    1.441384    2.337967
    income     .016018     .004071     3.93    0.000    .0080367    .0239993
    _cons      -.5647368   .2746696    -2.06   0.040   -1.103227   -.0262465

Overall fit is poor as $R^2 = 0.13$. This is often the case for cross-section data. Yet all regressors are statistically significant and have a large impact.

For income: annual income up by $10,000 ⇒ income up by 10 units ⇒ docvis up by $10 \times 0.016 = 0.16$.

2. Data Example: OLS regression with robust standard errors

Robust standard errors for the OLS estimator are preferred, as they permit the model error to be heteroskedastic.

    . * OLS regression with robust standard errors
    . regress docvis private chronic female income, vce(robust)

    Linear regression                              Number of obs =   4412
                                                   F(4, 4407)    = 107.01
                                                   Prob > F      = 0.0000
                                                   R-squared     = 0.1284
                                                   Root MSE      = 7.4233

                           Robust
    docvis     Coef.       Std. Err.   t       P>|t|   [95% Conf. Interval]
    private    1.916263    .2347443    8.16    0.000    1.456047    2.37648
    chronic    4.826799    .3001866    16.08   0.000    4.238283    5.415316
    female     1.889675    .2154463    8.77    0.000    1.467292    2.312058
    income     .016018     .005606     2.86    0.004    .0050275    .0270085
    _cons      -.5647368   .2069188    -2.73   0.006   -.9704017   -.159072

Same coefficient estimates. Different standard errors.

2. Data Example: Comparison of default and robust standard errors

    . * Comparison of standard errors
    . quietly regress docvis private chronic female income
    . estimates store DEFAULT
    . quietly regress docvis private chronic female income, vce(robust)
    . estimates store ROBUST
    . estimates table DEFAULT ROBUST, b(%9.4f) se stats(N r2 F)

    Variable     DEFAULT      ROBUST
    private       1.9163      1.9163
                  0.2882      0.2347
    chronic       4.8268      4.8268
                  0.2420      0.3002
    female        1.8897      1.8897
                  0.2287      0.2154
    income        0.0160      0.0160
                  0.0041      0.0056
    _cons        -0.5647     -0.5647
                  0.2747      0.2069
    N          4412.0000   4412.0000
    r2            0.1284      0.1284
    F           162.2899    107.0104
    legend: b/se

The preferred heteroskedastic-robust standard errors are mostly within 25% of the default ones (the exception is income, where the robust standard error is about 38% larger). They are sometimes larger and sometimes smaller.

2. Data Example: Hypothesis tests

Hypothesis tests can be implemented using the Stata command test.

$H_0: \beta_{private} = 0, \ \beta_{chronic} = 0$
$H_a$: at least one of $\beta_{private} \neq 0$, $\beta_{chronic} \neq 0$.

The Stata post-estimation command test yields

    . * Wald test of restrictions
    . quietly regress docvis private chronic female income, vce(robust) noheader
    . test (private = 0) (chronic = 0)

     ( 1)  private = 0
     ( 2)  chronic = 0

           F(2, 4407) = 165.11
             Prob > F = 0.0000

Reject $H_0$ at level 0.05 since $p < 0.05$, or equivalently since $165.11 > F_{.05}(2, 4407) = 3.00$, using invFtail(2,4407,.05).

3. Ordinary Least Squares: Definition in matrix notation

For the $i$-th observation
$$y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki} + u_i.$$
Usually $x_{1i} = 1$ (an intercept).

Introduce vector and matrix representation. The regressor vector $x_i$ and parameter vector $\beta$ are $K \times 1$ column vectors:
$$\underset{(K \times 1)}{x_i} = \begin{bmatrix} x_{1i} \\ \vdots \\ x_{Ki} \end{bmatrix} \quad \text{and} \quad \underset{(K \times 1)}{\beta} = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_K \end{bmatrix}.$$
Then
$$x_i'\beta = \begin{bmatrix} x_{1i} & \cdots & x_{Ki} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_K \end{bmatrix} = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_K x_{Ki}.$$
Note that all vectors are defined to be column vectors.

For the $i$-th observation
$$y_i = x_i'\beta + u_i.$$

3. Ordinary Least Squares: Definition (continued)

Now combine all $N$ observations from the sample $\{(y_i, x_i), \ i = 1, \ldots, N\}$. The linear regression model is
$$\begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} x_1'\beta \\ \vdots \\ x_N'\beta \end{bmatrix} + \begin{bmatrix} u_1 \\ \vdots \\ u_N \end{bmatrix}.$$
This is
$$y = X\beta + u$$
where
$$\underset{(N \times 1)}{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix}, \quad \underset{(N \times K)}{X} = \begin{bmatrix} x_1' \\ \vdots \\ x_N' \end{bmatrix}, \quad \underset{(N \times 1)}{u} = \begin{bmatrix} u_1 \\ \vdots \\ u_N \end{bmatrix}.$$
The OLS estimator derived below is
$$\hat{\beta}_{OLS} = (X'X)^{-1}X'y.$$

3. Ordinary Least Squares: Matrix notation example

Example: $N = 4$ with $(x, y)$ equal to $(1, 1)$, $(2, 3)$, $(2, 4)$, and $(3, 4)$.

Then $y$ is $4 \times 1$ and $X$ is $4 \times 2$ with
$$y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 4 \\ 4 \end{bmatrix}; \quad X = \begin{bmatrix} x_1' \\ x_2' \\ x_3' \\ x_4' \end{bmatrix} = \begin{bmatrix} x_{11} & x_{21} \\ x_{12} & x_{22} \\ x_{13} & x_{23} \\ x_{14} & x_{24} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}.$$
So (see appendix for detailed computation)
$$\hat{\beta}_{OLS} = (X'X)^{-1}X'y = \begin{bmatrix} 4 & 8 \\ 8 & 18 \end{bmatrix}^{-1} \begin{bmatrix} 12 \\ 27 \end{bmatrix} = \begin{bmatrix} 0 \\ 1.5 \end{bmatrix}.$$
Intercept $\hat{\beta}_1 = 0$ and slope coefficient $\hat{\beta}_2 = 1.5$.
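
The four-observation example can be checked by hand or with a few lines of code. The following is a minimal Python sketch (the slides themselves use Stata); the function name ols_simple is hypothetical, and it solves the 2x2 normal equations $X'X\beta = X'y$ directly for a model with an intercept and one regressor.

```python
# OLS for the four-observation example, using only the standard library.
x = [1.0, 2.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 4.0]

def ols_simple(x, y):
    """Return (intercept, slope) by solving the 2x2 normal equations X'X b = X'y."""
    n = len(x)
    sx = sum(x)                                  # sum of x_i
    sxx = sum(xi * xi for xi in x)               # sum of x_i^2
    sy = sum(y)                                  # sum of y_i
    sxy = sum(xi * yi for xi, yi in zip(x, y))   # sum of x_i * y_i
    det = n * sxx - sx * sx                      # determinant of X'X
    b1 = (sxx * sy - sx * sxy) / det             # intercept
    b2 = (n * sxy - sx * sy) / det               # slope
    return b1, b2

b1, b2 = ols_simple(x, y)
print(b1, b2)  # prints 0.0 1.5
```

Here $X'X$ has entries 4, 8, 8, 18 and $X'y$ has entries 12 and 27, matching the matrices in the example.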

3. Ordinary Least Squares: Derivation of formula for OLS estimator

The OLS estimator minimizes the sum of squared errors
$$Q(\beta) = \sum_{i=1}^N u_i^2 = \sum_{i=1}^N (y_i - x_i'\beta)^2.$$
The first-order conditions (f.o.c.) are
$$\frac{\partial Q(\beta)}{\partial \beta} = -2 \sum_{i=1}^N x_i (y_i - x_i'\beta) = -2X'(y - X\beta) = 0.$$
Then
- $X'(y - X\beta) = 0$ from the f.o.c.
- ⇒ $X'y = X'X\beta$, which is $K$ linear equations in the $K$ unknowns $\beta$
- ⇒ $\beta = (X'X)^{-1}X'y$ if the inverse exists (i.e. $\text{rank}[X] = K$).

So
$$\hat{\beta}_{OLS} = (X'X)^{-1}X'y = \left( \sum_{i=1}^N x_i x_i' \right)^{-1} \sum_{i=1}^N x_i y_i.$$

4. OLS Properties: Summary

$\hat{\beta}_{OLS}$ is always estimable, provided $\text{rank}[X] = K$. But the properties of $\hat{\beta}_{OLS}$ depend on the true model, called the data generating process (d.g.p.).

Essential result: if the d.g.p. is correctly specified and the error $u_i$ is uncorrelated with the regressors $x_i$, then
(1) $\hat{\beta}$ is consistent for $\beta$
(2) $\hat{\beta}$ is normally distributed in large samples ("asymptotically")
(3) the variance of $\hat{\beta}$ varies with assumptions on the error $u_i$:
    - default: $u_i$ are independent $(0, \sigma^2)$
    - heteroskedastic: $u_i$ are independent $(0, \sigma_i^2)$
    - clustered: $u_i$ are correlated within cluster, uncorrelated across clusters
    - HAC: $u_i$ are serially correlated ($u_i$ is correlated with $u_{i-1}$).

4. OLS Properties: Sampling error

If the d.g.p. is $y = X\beta + u$ then
$$\begin{aligned} \hat{\beta}_{OLS} &= (X'X)^{-1}X'y \\ &= (X'X)^{-1}X'(X\beta + u) \\ &= (X'X)^{-1}X'X\beta + (X'X)^{-1}X'u \\ &= \beta + (X'X)^{-1}X'u \\ &= \beta + \left( \sum_i x_i x_i' \right)^{-1} \sum_i x_i u_i. \end{aligned}$$
So assumptions on $x_i$ and $u_i$ are crucial.
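
The decomposition $\hat{\beta} = \beta + (X'X)^{-1}X'u$ is an exact algebraic identity, which a small simulation can confirm. The sketch below is in Python rather than Stata and is illustrative only: the helper names solve2 and cross, the seed, the sample size of 50, and the normal error are all assumptions chosen for the example, not taken from the slides.

```python
import random

def solve2(a, b):
    """Solve the 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def cross(x, v):
    """Return X'X and X'v for the regressor matrix X = [1, x]."""
    n = len(x)
    xtx = [[n, sum(x)], [sum(x), sum(xi * xi for xi in x)]]
    xtv = [sum(v), sum(xi * vi for xi, vi in zip(x, v))]
    return xtx, xtv

rng = random.Random(12345)            # arbitrary seed, for reproducibility only
beta = [1.0, 2.0]                     # assumed true intercept and slope
x = [rng.gauss(0, 1) ** 2 for _ in range(50)]
u = [rng.gauss(0, 1) for _ in range(50)]
y = [beta[0] + beta[1] * xi + ui for xi, ui in zip(x, u)]

xtx, xty = cross(x, y)
_, xtu = cross(x, u)
beta_hat = solve2(xtx, xty)           # (X'X)^{-1} X'y
sampling_error = solve2(xtx, xtu)     # (X'X)^{-1} X'u

for k in range(2):
    # beta_hat equals beta plus the sampling error, up to rounding
    print(beta_hat[k], beta[k] + sampling_error[k])
```

In real data $u$ is unobserved, so the sampling error cannot be computed; the simulation only works because the d.g.p. is known.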

4. OLS Properties: Finite sample properties

If $u \sim N[0, \Omega]$ and the regressors $X$ are fixed (nonstochastic) then
$$\begin{aligned} \hat{\beta} &= \beta + (X'X)^{-1}X'u \\ &\sim \beta + (X'X)^{-1}X' \, N[0, \Omega] \\ &\sim N[\beta, \ (X'X)^{-1}X'\Omega X(X'X)^{-1}] \end{aligned}$$
using the result that a linear transformation of a normal is normal:
$$z \sim N[\mu, \Omega] \implies Az + b \sim N[A\mu + b, \ A\Omega A'].$$
We instead use asymptotic theory
- this permits $u$ to be nonnormally distributed
- but does require a large sample, so $N \to \infty$.

4. OLS Properties: Consistency

Consistency means that the probability limit (plim) of $\hat{\beta}$ equals $\beta$. That is:
$$\lim_{N \to \infty} \Pr[\,|\hat{\beta} - \beta| < \varepsilon\,] = 1 \quad \text{for any } \varepsilon > 0.$$
We have (using the results below)
$$\begin{aligned} \text{plim}\, \hat{\beta} &= \text{plim}\{\beta + (X'X)^{-1}X'u\} \\ &= \text{plim}\, \beta + \text{plim}\left\{ \left( \sum_i x_i x_i' \right)^{-1} \sum_i x_i u_i \right\} \\ &= \text{plim}\, \beta + \left( \text{plim}\, \frac{1}{N} \sum_i x_i x_i' \right)^{-1} \text{plim}\, \frac{1}{N} \sum_i x_i u_i \\ &= \beta + \left( \text{plim}\, \frac{1}{N} \sum_i x_i x_i' \right)^{-1} \times 0 \\ &= \beta. \end{aligned}$$
Results used:
- $\text{plim}\{A_N b_N\} = \text{plim}\, A_N \times \text{plim}\, b_N$ if the plims are constants
- the plims exist using laws of large numbers (as averages)
- for $\text{plim}\, N^{-1} \sum_i x_i u_i = 0$ the key assumption is $E[u_i \mid x_i] = 0$.

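4. OLS Properties: Limit distribution

$\hat{\beta}$ has a limit distribution with all mass at $\beta$ (since $\hat{\beta} \overset{p}{\to} \beta$). To get a nondegenerate distribution, inflate $\hat{\beta}$ by $\sqrt{N}$. Then the limit normal distribution is
$$\begin{aligned} \sqrt{N}(\hat{\beta} - \beta) &= \left( \frac{1}{N} \sum_i x_i x_i' \right)^{-1} \frac{1}{\sqrt{N}} \sum_i x_i u_i \\ &\overset{d}{\to} \left( \text{plim}\, \frac{1}{N} \sum_i x_i x_i' \right)^{-1} \times N[0, B] \quad \text{for some } B \\ &\overset{d}{\to} N\left[ 0, \ \left( \text{plim}\, \frac{1}{N} \sum_i x_i x_i' \right)^{-1} B \left( \text{plim}\, \frac{1}{N} \sum_i x_i x_i' \right)^{-1} \right]. \end{aligned}$$
Results used:
- if $H_N \overset{p}{\to} H$ and $b_N \overset{d}{\to} N[\mu, \Omega]$ then $H_N b_N \overset{d}{\to} N[H\mu, \ H\Omega H']$
- $\frac{1}{\sqrt{N}} \sum_i x_i u_i \overset{d}{\to} N[0, B]$ by a central limit theorem, where
$$B = \text{plim} \left( \frac{1}{\sqrt{N}} \sum_i x_i u_i \right) \left( \frac{1}{\sqrt{N}} \sum_j x_j u_j \right)' = \text{plim}\, \frac{1}{N} \sum_i \sum_j u_i u_j x_i x_j'.$$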

4. OLS Properties: Asymptotic distribution

All we need for theory is the previous result. But for friendlier looking results
- rescale from $\sqrt{N}(\hat{\beta} - \beta)$ to $\hat{\beta}$
- drop plims and replace $B$ by a consistent estimate $\hat{B}$.

The so-called asymptotic distribution is
$$\hat{\beta} \overset{a}{\sim} N\left[ \beta, \ \left( \sum_{i=1}^N x_i x_i' \right)^{-1} N\hat{B} \left( \sum_{i=1}^N x_i x_i' \right)^{-1} \right].$$
Usually $B = \text{Var}\left[ \frac{1}{\sqrt{N}} X'u \right] = \text{Var}\left[ \frac{1}{\sqrt{N}} \sum_i x_i u_i \right]$.

For independent heteroskedastic errors $\hat{B} = \frac{1}{N} \sum_i \hat{u}_i^2 x_i x_i'$.

4. OLS Properties: White estimate of VCE

Most often used: requires data to be independent over $i$. Then
$$B = \text{plim}\, \frac{1}{N} \sum_i \sum_j u_i u_j x_i x_j' = \text{plim}\, \frac{1}{N} \sum_i u_i^2 x_i x_i'.$$
White (1980) showed that we can use $\hat{B} = \frac{1}{N} \sum_i \hat{u}_i^2 x_i x_i'$.

This yields the heteroskedastic-consistent estimate of the variance-covariance matrix of the OLS estimator (VCE)
$$\hat{V}_{robust}[\hat{\beta}] = \left( \sum_{i=1}^N x_i x_i' \right)^{-1} \left( \sum_{i=1}^N \hat{u}_i^2 x_i x_i' \right) \left( \sum_{i=1}^N x_i x_i' \right)^{-1}, \quad \hat{u}_i = y_i - x_i'\hat{\beta}.$$
This leads to "heteroskedastic-robust" or "robust" standard errors. In Stata this is option vce(robust) for cross-section commands.
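
The sandwich formula can be evaluated by hand on the four-observation example from section 3. The Python sketch below is illustrative: the helper names (inv2, matmul2, white_vce) are hypothetical, and it implements the formula exactly as written above, with no degrees-of-freedom adjustment. Stata's regress with vce(robust) additionally scales by $N/(N-K)$, so its reported standard errors would be somewhat larger than the ones computed here.

```python
import math

x = [1.0, 2.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 4.0]

def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def white_vce(x, y):
    """OLS coefficients and the unadjusted White VCE for y on [1, x]."""
    n = len(x)
    xtx = [[n, sum(x)], [sum(x), sum(v * v for v in x)]]
    xty = [sum(y), sum(a * b for a, b in zip(x, y))]
    bread = inv2(xtx)                                   # (sum x_i x_i')^{-1}
    b = [bread[0][0] * xty[0] + bread[0][1] * xty[1],
         bread[1][0] * xty[0] + bread[1][1] * xty[1]]
    u2 = [(yi - b[0] - b[1] * xi) ** 2 for xi, yi in zip(x, y)]  # squared residuals
    m01 = sum(w * xi for w, xi in zip(u2, x))
    meat = [[sum(u2), m01],                             # sum u_i^2 x_i x_i'
            [m01, sum(w * xi * xi for w, xi in zip(u2, x))]]
    return b, matmul2(matmul2(bread, meat), bread)

b, v = white_vce(x, y)
se_slope = math.sqrt(v[1][1])
print(b, v, se_slope)
```

For this example the residuals are $-0.5, 0, 1, -0.5$, and the robust variance of the slope works out to $0.125$.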

4. OLS Properties: Other estimates of VCE

Default: independent homoskedastic errors, $V[u_i \mid x_i] = \sigma^2$.
$$\hat{V}[\hat{\beta}] = s^2 \left( \sum_{i=1}^N x_i x_i' \right)^{-1}; \quad s^2 = \frac{1}{N - K} \sum_i \hat{u}_i^2.$$
This simplification arises as then $B = \text{plim}\, N^{-1} \sum_i u_i^2 x_i x_i' = \sigma^2 \, \text{plim}\, N^{-1} \sum_i x_i x_i'$.

Cluster robust: errors correlated within cluster but independent across clusters.
$$\hat{V}[\hat{\beta}] = \left( \sum_{g=1}^G X_g'X_g \right)^{-1} \left( \sum_{g=1}^G X_g'\hat{u}_g \hat{u}_g' X_g \right) \left( \sum_{g=1}^G X_g'X_g \right)^{-1}.$$
Here observations are stacked in cluster $g$ as $y_g = X_g\beta + u_g$. In Stata this is option vce(cluster id) for cross-section commands and is option vce(robust) for most xt panel commands.

Heteroskedasticity and autocorrelation (HAC) robust: time series. Not covered here, but it extends White to an MA(q) error.
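
For comparison with the robust estimate, the default VCE $s^2 (X'X)^{-1}$ can also be computed for the four-observation example from section 3. This is a minimal Python sketch with hypothetical variable names; it hard-codes the intercept-plus-one-regressor case, so $K = 2$.

```python
import math

x = [1.0, 2.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 4.0]
n, k = len(x), 2                      # k = 2: intercept and slope

sx, sxx = sum(x), sum(v * v for v in x)
sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
det = n * sxx - sx * sx               # determinant of X'X

b1 = (sxx * sy - sx * sxy) / det      # OLS intercept
b2 = (n * sxy - sx * sy) / det        # OLS slope

resid = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
s2 = sum(e * e for e in resid) / (n - k)   # s^2 with degrees-of-freedom correction

# s^2 (X'X)^{-1}, using the closed-form inverse of the 2x2 matrix X'X
v_default = [[s2 * sxx / det, -s2 * sx / det],
             [-s2 * sx / det, s2 * n / det]]
se_slope_default = math.sqrt(v_default[1][1])
print(s2, v_default, se_slope_default)
```

Here $s^2 = 1.5/2 = 0.75$ and the default variance of the slope is $0.375$, so with only four observations the default and unadjusted robust estimates differ considerably.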

5. Generalized least squares (GLS): Overview

OLS is efficient (best linear unbiased estimator) if errors are i.i.d., so that $V[u \mid X] = \sigma^2 I$.

In practice errors are rarely i.i.d. So we usually do OLS and obtain a robust VCE that permits $V[u \mid X] \neq \sigma^2 I$
- this could be heteroskedastic-robust, cluster-robust, HAC, ...

More efficient feasible GLS (FGLS)
- assumes a model for $V[u \mid X]$
- yields more precise estimates (smaller standard errors and bigger t-statistics)
- but then obtain a robust VCE that allows for a misspecified model for $V[u \mid X]$
- this is called weighted LS or working matrix LS.

5. Generalized least squares: GLS

Suppose $V[u \mid X] = \Omega$ where $\Omega$ is known, and $y = X\beta + u$, $E[u \mid X] = 0$ as before.

The generalized least squares estimator is efficient:
$$\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y.$$
Derivation: premultiply $y = X\beta + u$ by $\Omega^{-1/2}$, so
$$\Omega^{-1/2}y = \Omega^{-1/2}X\beta + \Omega^{-1/2}u.$$
This model has i.i.d. errors since
$$V[\Omega^{-1/2}u \mid X] = E[(\Omega^{-1/2}u)(\Omega^{-1/2}u)' \mid X] = \Omega^{-1/2}\Omega\,\Omega^{-1/2} = I_N.$$
Then GLS is OLS in this transformed model:
$$\hat{\beta}_{GLS} = [(\Omega^{-1/2}X)'(\Omega^{-1/2}X)]^{-1}(\Omega^{-1/2}X)'(\Omega^{-1/2}y) = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y.$$
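
The equivalence between the GLS formula and OLS on transformed data is easy to verify numerically when $\Omega$ is diagonal. The Python sketch below reuses the four-observation example from section 3; the error variances in sigma2 are made-up values for illustration, and the function names are hypothetical.

```python
import math

x = [1.0, 2.0, 2.0, 3.0]
y = [1.0, 3.0, 4.0, 4.0]
sigma2 = [1.0, 4.0, 4.0, 9.0]   # assumed known error variances (hypothetical values)

def ols_two_columns(c1, c2, y):
    """OLS of y on two regressor columns (no extra intercept), via 2x2 normal equations."""
    a11 = sum(p * p for p in c1)
    a12 = sum(p * q for p, q in zip(c1, c2))
    a22 = sum(q * q for q in c2)
    r1 = sum(p * v for p, v in zip(c1, y))
    r2 = sum(q * v for q, v in zip(c2, y))
    det = a11 * a22 - a12 * a12
    return ((a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det)

def gls_diag(x, y, sigma2):
    """(X' Omega^{-1} X)^{-1} X' Omega^{-1} y for X = [1, x] and Omega = diag(sigma2)."""
    w = [1.0 / s for s in sigma2]                     # diagonal of Omega^{-1}
    a11 = sum(w)
    a12 = sum(wi * xi for wi, xi in zip(w, x))
    a22 = sum(wi * xi * xi for wi, xi in zip(w, x))
    r1 = sum(wi * yi for wi, yi in zip(w, y))
    r2 = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = a11 * a22 - a12 * a12
    return ((a22 * r1 - a12 * r2) / det, (a11 * r2 - a12 * r1) / det)

gls_direct = gls_diag(x, y, sigma2)

# Premultiplying by Omega^{-1/2} divides each observation by its standard deviation,
# including the column of ones that represents the intercept.
sd = [math.sqrt(s) for s in sigma2]
gls_transformed = ols_two_columns([1.0 / s for s in sd],
                                  [xi / s for xi, s in zip(x, sd)],
                                  [yi / s for yi, s in zip(y, sd)])
print(gls_direct, gls_transformed)
```

Note that the transformed regression has no separate constant: the intercept column becomes $1/\sigma_i$.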

5. Generalized least squares: Feasible generalized least squares (FGLS)

To implement GLS we need a consistent estimate of $\Omega$.

Assume a model $\Omega = \Omega(\gamma)$, estimate $\hat{\gamma} \overset{p}{\to} \gamma$, and form $\hat{\Omega} = \Omega(\hat{\gamma}) \overset{p}{\to} \Omega$.

The feasible GLS estimator (FGLS) is
$$\hat{\beta}_{FGLS} = (X'\hat{\Omega}^{-1}X)^{-1}X'\hat{\Omega}^{-1}y,$$
and then
$$\hat{\beta}_{FGLS} \overset{a}{\sim} N\left[ \beta, \ (X'\hat{\Omega}^{-1}X)^{-1} \right].$$
Examples:
- Heteroskedasticity: $V[u_i \mid x_i] = \exp(z_i'\gamma)$.
- Seemingly unrelated equations: $y_{ig} = x_{ig}'\beta_g + u_{ig}$, $g = 1, \ldots, G$, with $u_{ig}$ independent over $i$ and homoskedastic with $\text{Cov}[u_{ig}, u_{ih}] = \sigma_{gh}$.
- Systems of equations: SUR with $\beta_g = \beta$.
- Panel data: random effects estimator.

5. Generalized least squares: Weighted least squares (WLS)

Now do FGLS but allow for the possibility that the model for $V[u \mid X]$ is incorrectly specified. So then obtain a robust VCE for FGLS.

Distinguish between
- the assumed (working) error variance matrix, denoted $\Sigma = \Sigma(\gamma)$ with estimate $\hat{\Sigma} = \Sigma(\hat{\gamma})$
- the true (unknown) error variance matrix $\Omega$.

The weighted least squares (WLS) estimator is
$$\hat{\beta}_{WLS} = (X'\hat{\Sigma}^{-1}X)^{-1}X'\hat{\Sigma}^{-1}y.$$
Asymptotically $\hat{\beta}_{WLS} \overset{a}{\sim} N[\beta, V[\hat{\beta}]]$ where the robust VCE has the usual sandwich form (the middle term is not inverted)
$$\hat{V}[\hat{\beta}] = (X'\hat{\Sigma}^{-1}X)^{-1} \left( X'\hat{\Sigma}^{-1}\hat{\Omega}\hat{\Sigma}^{-1}X \right) (X'\hat{\Sigma}^{-1}X)^{-1},$$
and for cross-section data $\hat{\Omega} = \text{Diag}[(y_i - x_i'\hat{\beta}_{WLS})^2]$.

6. Wald tests and confidence intervals: Hypothesis test of a single restriction

Consider a test of a single restriction on one coefficient (the $j$-th), written $\beta$ for notational simplicity, with hypothesized value $\beta^*$:
$$H_0: \beta = \beta^* \qquad H_a: \beta \neq \beta^*.$$
A Wald test rejects $H_0$ if $\hat{\beta}$ differs greatly from $\beta^*$.

Define $\sigma_{\hat{\beta}}$ to be the asymptotic standard deviation of $\hat{\beta}$. Then
- $\hat{\beta} \overset{a}{\sim} N[\beta, \sigma^2_{\hat{\beta}}]$ for unknown $\beta$
- ⇒ $\dfrac{\hat{\beta} - \beta}{\sigma_{\hat{\beta}}} \overset{a}{\sim} N[0, 1]$, standardizing
- ⇒ $z_j = \dfrac{\hat{\beta} - \beta^*}{\sigma_{\hat{\beta}}} \overset{a}{\sim} N[0, 1]$ under $H_0: \beta = \beta^*$.

To implement this, replace $\sigma_{\hat{\beta}}$ by $s_{\hat{\beta}}$, the standard error of $\hat{\beta}$. This makes no difference asymptotically (so still $N[0, 1]$).

6. Wald tests and confidence intervals: The Wald z-statistic

The Wald z-statistic is
$$z_j = \frac{\hat{\beta} - \beta^*}{s_{\hat{\beta}}} \overset{a}{\sim} N[0, 1] \quad \text{under } H_0: \beta = \beta^*,$$
where $\beta^*$ is the hypothesized value and $s_{\hat{\beta}}$ is the standard error of $\hat{\beta}$.

Implementation by two equivalent methods
- Test using p-values: reject $H_0$ at level 0.05 if $p = \Pr[|Z| > |z_j|] < 0.05$, where $Z \sim N[0, 1]$.
- Test using critical values: reject $H_0$ at level 0.05 if $|z_j| > z_{.025} = 1.96$.

Many packages such as Stata use $T(N - k)$ rather than $N[0, 1]$
- more conservative (less likely to reject $H_0$)
- exact in the unlikely special case that $u_i \sim N[0, \sigma^2]$.
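
As a worked instance, the sketch below applies the z-test to the income coefficient from the robust regression in section 2 (coefficient .016018, robust standard error .005606), testing $H_0$: coefficient equals zero. It is written in Python with hypothetical function names and uses the standard normal rather than the $T(N-k)$ distribution that Stata reports, so the p-value is close to, but not exactly, Stata's.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def wald_z(b, se, b0=0.0):
    """Wald z-statistic and two-sided p-value for H0: beta = b0."""
    z = (b - b0) / se
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p

z, p = wald_z(0.016018, 0.005606)   # income coefficient and robust s.e. from section 2
reject = abs(z) > 1.96              # critical-value method at level 0.05
print(round(z, 3), round(p, 4), reject)
```

The statistic is about 2.86, in line with the t value of 2.86 in the Stata output, and both methods reject $H_0$ at level 0.05.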

6. Wald tests and confidence intervals: Confidence interval

A $100(1 - \alpha)\%$ confidence interval for $\beta$ is
$$\hat{\beta} \pm z_{\alpha/2} \times s_{\hat{\beta}}.$$
- In particular, a 95% confidence interval is $\hat{\beta} \pm 1.96 \, s_{\hat{\beta}}$.
- One can replace $z_{\alpha/2}$ by $T_{N-k;\alpha/2}$ for better finite sample performance.
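
A short Python sketch of the interval formula, again using the income coefficient and robust standard error from section 2; the function name wald_ci is hypothetical.

```python
def wald_ci(b, se, crit=1.96):
    """Confidence interval b +/- crit * se (crit = 1.96 gives a 95% interval)."""
    half_width = crit * se
    return b - half_width, b + half_width

lo, hi = wald_ci(0.016018, 0.005606)   # income coefficient and robust s.e.
print(round(lo, 6), round(hi, 6))
```

This gives roughly (0.00503, 0.02701). The Stata output in section 2 reports (.0050275, .0270085), which is marginally wider because Stata uses the $T(4407)$ critical value instead of 1.96.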

6. Wald tests and confidence intervals: Hypothesis test of multiple linear restrictions

Now consider a test of several restrictions, e.g. test $H_0: \beta_2 = 0, \ \beta_3 = 0$ against $H_a$: at least one $\neq 0$.

In matrix algebra we test $H_0: R\beta = r$ against $H_a: R\beta \neq r$.

Example: test $H_0: \beta_2 = 0, \ \beta_3 = 0$ against $H_a$: at least one $\neq 0$.
$$\begin{bmatrix} \beta_2 \\ \beta_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \beta_3 \\ \vdots \\ \beta_K \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$
or
$$\underset{(2 \times K)}{R} \ \underset{(K \times 1)}{\beta} = \underset{(2 \times 1)}{r}.$$

6. Wald tests and confidence intervals: Multiple linear restrictions (Wald test)

A Wald test rejects $H_0: R\beta = r$ if $R\hat{\beta} - r$ differs greatly from $0$.

Now $R\hat{\beta} - r$ is normal, as a linear combination of normals is normal. With $h$ denoting the number of restrictions (rows of $R$):
- $\hat{\beta} \overset{a}{\sim} N[\beta, V[\hat{\beta}]]$
- ⇒ $R\hat{\beta} - r \overset{a}{\sim} N[R\beta - r, \ RV[\hat{\beta}]R']$
- ⇒ $R\hat{\beta} - r \overset{a}{\sim} N[0, \ RV[\hat{\beta}]R']$ under $H_0$
- ⇒ $(R\hat{\beta} - r)'[RV[\hat{\beta}]R']^{-1}(R\hat{\beta} - r) \overset{a}{\sim} \chi^2(h)$ under $H_0$.

The last step converts to chi-square using the result
$$z \sim N[0, \Omega] \implies z'\Omega^{-1}z \sim \chi^2(\dim[\Omega]).$$
To implement this test, replace $V[\hat{\beta}]$ by $\hat{V}[\hat{\beta}]$. This makes no difference asymptotically.
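
The quadratic form can be computed directly for a small case. The Python sketch below is restricted to two parameters and two restrictions; the helper names are hypothetical. The inputs b_hat and V_hat are the OLS estimates and the default VCE $s^2 (X'X)^{-1}$ worked out by hand for the four-observation example in section 3, and the hypothesis tested (intercept and slope both zero) is chosen purely for illustration.

```python
def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def matvec2(a, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]

def transpose2(a):
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]

def wald(R, r, b, V):
    """W = (Rb - r)' [R V R']^{-1} (Rb - r) for two restrictions on two parameters."""
    d = [rb - ri for rb, ri in zip(matvec2(R, b), r)]
    middle = inv2(matmul2(matmul2(R, V), transpose2(R)))
    md = matvec2(middle, d)
    return d[0] * md[0] + d[1] * md[1]

b_hat = [0.0, 1.5]                            # OLS estimates from the example
V_hat = [[1.6875, -0.75], [-0.75, 0.375]]     # default VCE, 0.75 * (X'X)^{-1}
R = [[1.0, 0.0], [0.0, 1.0]]                  # tests both coefficients jointly
r = [0.0, 0.0]

W = wald(R, r, b_hat, V_hat)
print(W)  # prints 54.0
```

With $h = 2$ restrictions, $W$ would be compared with a $\chi^2(2)$ critical value.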

6. Wald tests and confidence intervals: Multiple linear restrictions (test statistics)

The Wald chi-squared statistic is
$$W = (R\hat{\beta} - r)'[R\hat{V}[\hat{\beta}]R']^{-1}(R\hat{\beta} - r) \overset{a}{\sim} \chi^2(h) \quad \text{under } H_0.$$
Implementation by two equivalent methods
- Test using p-values: reject $H_0$ at level 0.05 if $p = \Pr[\chi^2(h) > W] < 0.05$.
- Test using critical values: reject $H_0$ at level 0.05 if $W > \chi^2_{.05}(h)$.

The alternative Wald F-test statistic is
$$F = \frac{W}{h} \sim F(h, N - k) \quad \text{under } H_0.$$
- Makes no difference asymptotically, as $F(h, N) \to \chi^2(h)/h$ as $N \to \infty$.
- More conservative (less likely to reject $H_0$).
- Exact in the unlikely special case that $u_i \sim N[0, \sigma^2]$.
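
For $h = 2$ the chi-squared distribution has a closed-form tail probability, $\Pr[\chi^2(2) > w] = e^{-w/2}$, so the relation between the F and chi-squared versions can be illustrated without a statistics library. The Python sketch below uses the value $F = 165.11$ reported by the Stata test command in section 2; the function names are hypothetical and the formulas are valid only for two degrees of freedom.

```python
import math

def chi2_2_sf(w):
    """Pr[chi2(2) > w] = exp(-w/2); closed form valid only for 2 degrees of freedom."""
    return math.exp(-w / 2.0)

def chi2_2_crit(alpha):
    """Critical value c with Pr[chi2(2) > c] = alpha."""
    return -2.0 * math.log(alpha)

h = 2
F = 165.11                 # F statistic reported for the test in section 2
W = h * F                  # corresponding Wald chi-squared statistic
p = chi2_2_sf(W)           # asymptotic p-value
crit = chi2_2_crit(0.05)   # chi-squared critical value at level 0.05

print(W, p, round(crit, 4), round(crit / h, 4))
```

The chi-squared critical value is about 5.99, and dividing by $h = 2$ gives about 3.00, which agrees to two decimals with the $F_{.05}(2, 4407) = 3.00$ used in section 2. This is the sense in which the two versions of the test coincide in large samples.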

6. Wald tests and confidence intervals: Further test details

The Wald test is the commonly-used method to test $H_0$ against $H_a$.
- Estimate $\beta$ without imposing $H_0$.
- Then ask: does $\hat{\beta}$ approximately satisfy $H_0$?

The other two test methods used at times are
- Likelihood ratio test: estimate under both $H_0$ and $H_a$ and compare $\ln L$.
- Lagrange multiplier or score test: estimate under $H_0$ only.
- Both are asymptotically equivalent to Wald under $H_0$ and local alternatives.
- The choice is mainly one of convenience, though Wald does have the weakness of lack of invariance to reparameterization.

Also, as already noted for the Wald test
- asymptotic theory: use $Z$ and $\chi^2(h)$
- better finite sample approximation: use $T(N - k)$ and $F(h, N - k)$
- even better still: bootstrap with asymptotic refinement.

7. Simulations: OLS consistency and asymptotic normality

D.g.p.: $y_i = \beta_1 + \beta_2 x_i + u_i$ where $x_i \sim \chi^2(1)$ and $\beta_1 = 1$, $\beta_2 = 2$.
Error: $u_i \sim \chi^2(1) - 1$ is skewed with mean 0 and variance 2.

    . * Small sample: parameters differ from dgp values
    . clear all
    . quietly set obs 30
    . set seed 10101
    . quietly generate double x = rchi2(1)
    . quietly generate y = 1 + 2*x + rchi2(1)-1   // demeaned chi^2 error
    . regress y x, noheader

    y        Coef.      Std. Err.   t      P>|t|   [95% Conf. Interval]
    x        2.713073   .5743189    4.72   0.000    1.536634   3.889512
    _cons    1.150439   .6148461    1.87   0.072   -.1090161   2.409894

For $N = 30$: $\hat{\beta}_2 = 2.713$ differs appreciably from $\beta_2 = 2.000$. This is due to sampling error, as $\text{se}[\hat{\beta}_2] = 0.574$.

7. Simulations: How to verify consistency

Set $N$ very large.

    . * Consistency: Large sample: parameters are very close to dgp values
    . clear all
    . quietly set obs 100000
    . set seed 10101
    . quietly generate double x = rchi2(1)
    . quietly generate y = 1 + 2*x + rchi2(1)-1   // demeaned chi^2 error
    . regress y x, noheader

    y        Coef.      Std. Err.   t        P>|t|   [95% Conf. Interval]
    x        1.998675   .0031725    630.00   0.000   1.992457   2.004893
    _cons    1.005819   .0054945    183.06   0.000   .9950495   1.016588

For $N = 100{,}000$: $\hat{\beta}_2 = 1.999$ is very close to $\beta_2 = 2.000$.
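
The same consistency check can be sketched outside Stata. The Python version below follows the d.g.p. stated above ($x_i \sim \chi^2(1)$, $u_i \sim \chi^2(1) - 1$, $\beta_1 = 1$, $\beta_2 = 2$), generating each $\chi^2(1)$ draw as a squared standard normal. The function name simulate_slope is hypothetical, and although the seed value is copied from the slides, Python's random number stream is unrelated to Stata's, so the estimates will not match the Stata output digit for digit.

```python
import random

def simulate_slope(n, rng):
    """Draw one sample of size n from the d.g.p. and return the OLS slope estimate."""
    sx = sxx = sy = sxy = 0.0
    for _ in range(n):
        x = rng.gauss(0, 1) ** 2            # chi-squared(1) regressor
        u = rng.gauss(0, 1) ** 2 - 1.0      # demeaned chi-squared(1) error
        y = 1.0 + 2.0 * x + u
        sx += x
        sxx += x * x
        sy += y
        sxy += x * y
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

rng = random.Random(10101)
b2_small = simulate_slope(30, rng)          # small sample: can be far from 2
b2_large = simulate_slope(100_000, rng)     # large sample: should be close to 2
print(b2_small, b2_large)
```

A single large-sample estimate close to 2 is only suggestive; consistency is a statement about what happens as $N$ grows without bound.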

7. Simulations: How to check asymptotic results

Compute $\hat{\beta}$ many times.

    . * Central limit theorem
    . * Write program to obtain betas for one sample of size numobs (= 150)
    . program chi2data, rclass
          version 10.1
          drop _all
          set obs $numobs
          generate double x = rchi2(1)
          generate y = 1 + 2*x + rchi2(1)-1   // demeaned chi^2 error
          regress y x
          return scalar b2 = _b[x]
          return scalar se2 = _se[x]
          return scalar t2 = (_b[x]-2)/_se[x]
          return scalar r2 = abs(return(t2))>invttail($numobs-2,.025)
          return scalar p2 = 2*ttail($numobs-2,abs(return(t2)))
      end

    . * Run this program 1,000 times to get 1,000 betas etcetera
    . * Results differ from MUS (2008) as MUS did not reset the seed to 10101
    . * First define global macro numobs for sample size
    . global numobs 150
    . set seed 10101
    . quietly simulate b2f=r(b2) se2f=r(se2) t2f=r(t2) reject2f=r(r2) p2f=r(p2), ///
          reps(1000) saving(chi2datares, replace) nolegend nodots: chi2data

Then look at the distribution of these $\hat{\beta}$'s and test statistics.

. * Summarize the 1,000 sample means
. summarize b2f se2f t2f reject2f p2f

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         b2f |      1000    2.000506      .08427   1.719513    2.40565
        se2f |      1000    .0839776    .0172588   .0415919    .145264
         t2f |      1000    .0028714    .9932668  -2.824061   4.556576
    reject2f |      1000        .046    .2095899          0          1
         p2f |      1000    .5175818    .2890325   .0000108   .9997772

. mean b2f se2f t2f reject2f p2f

Mean estimation                     Number of obs    =    1000

             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
         b2f |   2.000506   .0026649      1.995277    2.005735
        se2f |   .0839776   .0005458      .0829066    .0850486
         t2f |   .0028714   .0314099     -.0587655    .0645082
    reject2f |       .046   .0066278       .032994     .059006
         p2f |   .5175818     .00914       .499646    .5355177

For $S = 1{,}000$ simulations, each with sample size $N = 150$:
$\hat\beta_2^{(1)}, \hat\beta_2^{(2)}, \ldots, \hat\beta_2^{(1000)}$ have a distribution with mean 2.001, close to $\beta_2 = 2.000$, and standard deviation 0.084, close to $\sqrt{1/150} = 0.082$,
using $V[\hat\beta_2] \simeq (\sigma_u^2 / V[x_i])/N = (2/2)/150 = 1/150$.
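The benchmark $1/150$ and the Std. Err. reported by mean for b2f can both be reproduced by hand. A short worked calculation, relying only on the fact that a $\chi^2(k)$ variable has variance $2k$ and on the homoskedastic variance formula quoted above:

$$
\sigma_u^2 = V[\chi^2(1) - 1] = 2, \qquad V[x_i] = V[\chi^2(1)] = 2,
$$

$$
V[\hat\beta_2] \simeq \frac{\sigma_u^2 / V[x_i]}{N} = \frac{2/2}{150} = \frac{1}{150}, \qquad \sqrt{1/150} \simeq 0.0816,
$$

$$
\text{simulation s.e. of the mean of } \hat\beta_2 = \frac{\text{Std. Dev. of b2f}}{\sqrt{S}} = \frac{0.08427}{\sqrt{1000}} \simeq 0.00266 .
$$

The last figure matches the Std. Err. of .0026649 shown for b2f, and the dgp value 2 lies inside the resulting interval (1.995, 2.006).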

Test $\beta_2 = 2$: use $z = (\hat\beta_2 - \beta_2)/\mathrm{se}[\hat\beta_2] = (\hat\beta_2 - 2.0)/\mathrm{se}[\hat\beta_2]$ to test $H_0: \beta_2 = 2$.

Histogram and kernel density estimate for $z_1, z_2, \ldots, z_{1000}$.

[Figure: histogram with kernel density estimate. Vertical axis: Density. Horizontal axis: t-statistic for slope coeff from many samples.]

Not quite standard normal: $N = 150$ is still not large enough for CLT.
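A figure of this kind can be produced along the following lines. This is a hedged, self-contained sketch rather than the code behind the original figure: the program name tsim is hypothetical, the sample size is hard-coded at 150, and the normal option overlays a normal density for comparison (a choice made here, not taken from the slides).

```stata
* Sketch: simulate the t-statistic for H0: beta2 = 2 and plot its distribution
* tsim is a hypothetical helper name introduced for this illustration
capture program drop tsim
program tsim, rclass
    drop _all
    set obs 150
    generate double x = rchi2(1)
    generate double y = 1 + 2*x + rchi2(1) - 1   // demeaned chi^2 error
    regress y x
    return scalar t2 = (_b[x] - 2)/_se[x]
end

set seed 10101
quietly simulate t2f = r(t2), reps(1000) nodots: tsim

* Histogram with kernel density estimate and a normal density overlay
histogram t2f, kdensity normal ///
    xtitle("t-statistic for slope coeff from many samples")
```

Running summarize t2f, detail afterwards reports the skewness and kurtosis of the simulated statistics, which gives a numerical counterpart to the visual comparison with the normal curve.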

How to verify that standard errors are correctly estimated.
The average of the computed standard errors of $\hat\beta_2$ is 0.0840 (see mean of se2f).
This is close to the simulation estimate of $\mathrm{se}[\hat\beta_2]$ of 0.0843 (see Std. Dev. of b2f).
Aside: for this dgp we actually expect $\sqrt{1/150} \simeq 0.082$, using $V[\hat\beta_2] \simeq (\sigma_u^2 / V[x_i])/N = (2/2)/150 = 1/150$.

How to verify that test has correct size.
The Wald test of $H_0: \beta_2 = 2$ at level 0.05 has actual size 0.046 (see mean of reject2f).
This is close enough, as a 95% simulation interval when $S = 1000$ is $0.05 \pm 1.96 \times \sqrt{0.05 \times 0.95/1000} = 0.05 \pm 1.96 \times 0.007 = (0.036, 0.064)$.
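The simulation interval is simple arithmetic on the binomial standard error of a rejection rate. A minimal sketch of that calculation follows; the scalar names are illustrative, and scalar() is used so that the names cannot be confused with variables in memory such as se2f.

```stata
* Sketch: 95% simulation interval for the rejection rate of a nominal 5% test
scalar nsim   = 1000                                   // number of simulations S
scalar level  = 0.05                                   // nominal test size
scalar sim_se = sqrt(scalar(level)*(1 - scalar(level))/scalar(nsim))
display "simulation s.e. = " %6.4f scalar(sim_se)
display "95% interval    = (" %5.3f scalar(level) - 1.96*scalar(sim_se) ///
        ", " %5.3f scalar(level) + 1.96*scalar(sim_se) ")"
```

With $S = 1000$ the half-width is about 0.0135, so rejection rates between roughly 0.036 and 0.064 are consistent with a true size of 0.05.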

8. Stata commands

Command regress does OLS
  option vce(robust) for heteroskedastic-robust standard errors
  option vce(cluster clid) for cluster-robust standard errors (with cluster on clid)

For feasible GLS
  command regress [aweight= ] for known or estimated heteroskedasticity
  command sureg for systems of linear equations
  command nlsur for systems of nonlinear equations
  command xtreg, re for panel random effects

For hypothesis tests
  command test (and testnl for nonlinear hypotheses)
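Several of these commands can be tried on a small example. The sketch below uses Stata's bundled auto dataset; the choice of variables, the clustering variable rep78, and the hypotheses tested are purely illustrative and carry no substantive meaning.

```stata
* Sketch on the bundled auto data; variables and hypotheses are illustrative only
sysuse auto, clear

* OLS with heteroskedastic-robust standard errors
regress price mpg weight, vce(robust)

* OLS with cluster-robust standard errors, clustering on rep78
* (observations with missing rep78 are dropped)
regress price mpg weight, vce(cluster rep78)

* Joint linear hypothesis: both slope coefficients equal zero
test mpg weight

* Nonlinear hypothesis on a ratio of coefficients
testnl _b[mpg]/_b[weight] = 1
```

The test and testnl commands use the variance estimate from the most recent estimation command, so here both are based on the cluster-robust results.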

9. Appendix: OLS matrix notation example

Example: $N = 4$ with $(x, y)$ equal to $(1, 1)$, $(2, 3)$, $(2, 4)$, and $(3, 4)$.

Vector $y$ and matrix $X$ are

$$
\underset{(4 \times 1)}{y} = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 4 \\ 4 \end{bmatrix}
\quad \text{and} \quad
\underset{(4 \times 2)}{X} = \begin{bmatrix} x_1' \\ x_2' \\ x_3' \\ x_4' \end{bmatrix} = \begin{bmatrix} x_{11} & x_{21} \\ x_{12} & x_{22} \\ x_{13} & x_{23} \\ x_{14} & x_{24} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}.
$$

Compute $\hat\beta_{OLS} = (X'X)^{-1}X'y$:

$$
X'X = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 4 & 8 \\ 8 & 18 \end{bmatrix}
$$

$$
(X'X)^{-1} = \frac{1}{72 - 64} \begin{bmatrix} 18 & -8 \\ -8 & 4 \end{bmatrix} = \begin{bmatrix} 9/4 & -1 \\ -1 & 1/2 \end{bmatrix}
$$

$$
X'y = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 3 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 4 \\ 4 \end{bmatrix} = \begin{bmatrix} 12 \\ 27 \end{bmatrix}
$$

$$
(X'X)^{-1}X'y = \begin{bmatrix} 9/4 & -1 \\ -1 & 1/2 \end{bmatrix} \begin{bmatrix} 12 \\ 27 \end{bmatrix} = \begin{bmatrix} 108/4 - 27 \\ -12 + 54/4 \end{bmatrix} = \begin{bmatrix} 0 \\ 1.5 \end{bmatrix}.
$$

OLS estimates: intercept $\hat\beta_1 = 0$ and slope coefficient $\hat\beta_2 = 1.5$.
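The matrix arithmetic can be checked in Mata, Stata's matrix language. This is a small sketch, not part of the original slides; cross(X, X) computes $X'X$ and invsym() inverts a symmetric matrix.

```stata
* Sketch: verify the N = 4 matrix example in Mata
mata:
y  = (1 \ 3 \ 4 \ 4)
X  = (1, 1 \ 1, 2 \ 1, 2 \ 1, 3)
XX = cross(X, X)            // X'X
Xy = cross(X, y)            // X'y
b  = invsym(XX) * Xy        // (X'X)^(-1) X'y
XX
Xy
b
end
```

The three displayed objects correspond to $X'X$, $X'y$, and $\hat\beta_{OLS}$ in the example.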

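OLS on intercept and single regressor: $y_i = \beta_1 + \beta_2 x_i + u_i$.

$$
X'X = \begin{bmatrix} 1 & \cdots & 1 \\ x_1 & \cdots & x_N \end{bmatrix} \begin{bmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_N \end{bmatrix} = \begin{bmatrix} N & \sum_i x_i \\ \sum_i x_i & \sum_i x_i^2 \end{bmatrix}
$$

$$
(X'X)^{-1} = \frac{1}{N \sum_i x_i^2 - (\sum_i x_i)^2} \begin{bmatrix} \sum_i x_i^2 & -\sum_i x_i \\ -\sum_i x_i & N \end{bmatrix} = \frac{1}{\sum_i x_i^2 - N \bar{x}^2} \begin{bmatrix} N^{-1} \sum_i x_i^2 & -\bar{x} \\ -\bar{x} & 1 \end{bmatrix}
$$

$$
X'y = \begin{bmatrix} 1 & \cdots & 1 \\ x_1 & \cdots & x_N \end{bmatrix} \begin{bmatrix} y_1 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} \sum_i y_i \\ \sum_i x_i y_i \end{bmatrix}
$$

$$
(X'X)^{-1}X'y = \frac{1}{\sum_i x_i^2 - N \bar{x}^2} \begin{bmatrix} \bar{y} \sum_i x_i^2 - \bar{x} \sum_i x_i y_i \\ -\bar{x} N \bar{y} + \sum_i x_i y_i \end{bmatrix} = \frac{1}{\sum_i (x_i - \bar{x})^2} \begin{bmatrix} \bar{y} \sum_i x_i^2 - \bar{x} \sum_i x_i y_i \\ \sum_i (x_i - \bar{x})(y_i - \bar{y}) \end{bmatrix} = \begin{bmatrix} \bar{y} - \hat\beta_2 \bar{x} \\ \dfrac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2} \end{bmatrix}
$$

So $\hat\beta_1 = \bar{y} - \hat\beta_2 \bar{x}$ and $\hat\beta_2 = \sum_i (x_i - \bar{x})(y_i - \bar{y}) / \sum_i (x_i - \bar{x})^2$, as in an introductory course.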
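The scalar formulas can be checked against regress on the four-observation example from this appendix. A minimal sketch, with illustrative names for the generated variables and scalars:

```stata
* Sketch: scalar OLS formulas on the N = 4 example, compared with regress
clear
input x y
1 1
2 3
2 4
3 4
end

quietly summarize x
scalar xbar = r(mean)
quietly summarize y
scalar ybar = r(mean)

* Cross-product and squared deviations from the means
generate double sxy = (x - scalar(xbar))*(y - scalar(ybar))
generate double sxx = (x - scalar(xbar))^2
quietly summarize sxy
scalar num = r(sum)
quietly summarize sxx
scalar den = r(sum)

scalar b2 = scalar(num)/scalar(den)                  // slope
scalar b1 = scalar(ybar) - scalar(b2)*scalar(xbar)   // intercept
display "b1 = " scalar(b1) "   b2 = " scalar(b2)

regress y x, noheader
```

Here $\bar{x} = 2$ and $\bar{y} = 3$, so the numerator is 3 and the denominator is 2, giving $\hat\beta_2 = 1.5$ and $\hat\beta_1 = 3 - 1.5 \times 2 = 0$, the same values as the matrix calculation.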