GLS
Miguel Sarzosa
Department of Economics, University of Maryland
Econ626: Empirical Microeconomics, 2012

1. When any of the i's fail
2. Feasibility
3. Now we go to Stata!

GLS Fixes i's Failure

Remember that for OLS estimates to work, the main assumptions are summarized by the term i.i.d. In particular, the two i's are crucial and easily violated assumptions:
- Identically distributed (related to homoskedasticity)
- Independently distributed (related to correlation between regression errors)

GLS allows you to get appropriate estimates when the i's fail. The idea is simple and closely related to what we saw in robust estimation:
1. Construct a positive definite matrix $\Omega$ (the covariance matrix of the errors)
2. Use $\Omega$ to transform the data
3. Run the regression on the transformed data

Why GLS and not Robust Estimation?

In the presence of heteroskedasticity the researcher can take two approaches.

First, use robust standard error estimation:
- Only cares about getting correct standard errors, p-values and t-stats
- Easily implemented in Stata: vce(robust)

Second, use GLS estimation:
- Need to model heteroskedasticity, that is, estimate a skedasticity function
- Get more efficient estimates (provided the skedasticity function is correctly specified)
- More precise estimation of parameters and marginal effects

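How does it work?

As an example we begin with a simple model of the form
$$y = X\beta + \varepsilon$$
Now suppose that the errors have a structure such that
$$E[\varepsilon\varepsilon' \mid X] = \Omega, \quad \text{where } \Omega \neq \sigma^2 I$$
Note that this is a general case that can account for heteroskedasticity or clustered data.

Note that $\Omega^{-1/2}\,\Omega\,(\Omega^{-1/2})' = I$. So if we transform our model to
$$\Omega^{-1/2} y = \Omega^{-1/2} X\beta + \Omega^{-1/2}\varepsilon$$
the transformed error $\Omega^{-1/2}\varepsilon$ will be distributed $(0, I)$.

In the case where $E[\varepsilon\varepsilon' \mid X] = \Omega = \mathrm{diag}(\sigma_i^2)$, we have $\Omega^{-1/2} = \mathrm{diag}(1/\sigma_i)$, so $\Omega^{-1/2}\varepsilon$ will be distributed $(0, I)$.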
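The diagonal case above can be sketched in a few lines of plain Python. This is only an illustration under the assumption that the $\sigma_i$ are known; the function names `ols_two_columns` and `gls_known_diag` and the toy numbers are made up for this example and are not part of the course material.

```python
def ols_two_columns(X, y):
    """OLS for a design matrix with exactly two columns, via the normal equations."""
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    g0 = sum(r[0] * yi for r, yi in zip(X, y))
    g1 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a * d - b * b
    # Closed-form inverse of the symmetric 2x2 matrix [[a, b], [b, d]].
    return [(d * g0 - b * g1) / det, (a * g1 - b * g0) / det]


def gls_known_diag(x, y, sigma):
    """GLS of y on (1, x) when Omega = diag(sigma_i^2) is known (an assumption)."""
    # Premultiplying by Omega^{-1/2} = diag(1/sigma_i) divides every row,
    # including the constant, by sigma_i.
    X_t = [[1.0 / s, xi / s] for xi, s in zip(x, sigma)]
    y_t = [yi / s for yi, s in zip(y, sigma)]
    # OLS on the transformed data is the GLS estimator.
    return ols_two_columns(X_t, y_t)


# Hypothetical toy data: noisier observations get larger sigma_i.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sigma = [1.0, 1.0, 2.0, 2.0, 4.0, 4.0]
y = [2.1, 2.9, 4.4, 3.2, 8.0, 1.5]
beta_gls = gls_known_diag(x, y, sigma)
print(beta_gls)
```

Dividing the constant by $\sigma_i$ as well is the step that is easiest to forget: the transformed model no longer has a column of ones.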

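From Theory to Reality

$\Omega$ is unknown, and we do not know the real disturbances $\varepsilon$. The best we can do is to get an $\hat\varepsilon$ from a consistent first step, which will depend on a parameter vector $\hat\gamma$ and our data. That is, we will estimate $\hat\Omega = \Omega(\hat\gamma)$.

Different situations require different $\Omega(\gamma)$ and $\hat\Omega$. For example:
$$\hat\Omega = N^{-1}\sum_{i=1}^{N} \hat\varepsilon_i \hat\varepsilon_i'$$
$$\hat\Omega = \mathrm{Diag}\left[\sigma^2(z_i)\right]$$
Then our Feasible GLS estimator is
$$\hat\beta_{FGLS} = \left[X'\hat\Omega^{-1}X\right]^{-1} X'\hat\Omega^{-1}y$$
Assuming that we have a consistent first step,
$$\mathrm{Var}\left[\hat\beta_{FGLS}\right] = \left[X'\hat\Omega^{-1}X\right]^{-1}$$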
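As a rough illustration of the second example (a diagonal $\hat\Omega$ built from a skedasticity function), here is a two-step FGLS sketch in plain Python. The exponential form $\sigma_i^2 = \exp(\gamma_0 + \gamma_1 x_i)$, the simulated data, and every function name are assumptions chosen for this example only.

```python
import math
import random


def ols_two_columns(X, y):
    """OLS for a design matrix with exactly two columns, via the normal equations."""
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    d = sum(r[1] * r[1] for r in X)
    g0 = sum(r[0] * yi for r, yi in zip(X, y))
    g1 = sum(r[1] * yi for r, yi in zip(X, y))
    det = a * d - b * b
    return [(d * g0 - b * g1) / det, (a * g1 - b * g0) / det]


def fgls_exponential_variance(x, y):
    """Two-step FGLS assuming sigma_i^2 = exp(g0 + g1 * x_i) (illustrative choice)."""
    X = [[1.0, xi] for xi in x]
    # Step 1: consistent first step (plain OLS) to obtain residuals.
    b_ols = ols_two_columns(X, y)
    resid = [yi - b_ols[0] - b_ols[1] * xi for xi, yi in zip(x, y)]
    # Step 2: estimate the skedasticity function by regressing log(e^2) on (1, x).
    # (Fails if a residual is exactly zero, which continuous noise rules out.)
    g = ols_two_columns(X, [math.log(e * e) for e in resid])
    sigma_hat = [math.exp(0.5 * (g[0] + g[1] * xi)) for xi in x]
    # Step 3: reweight by 1/sigma_hat and run OLS on the transformed data.
    X_t = [[1.0 / s, xi / s] for xi, s in zip(x, sigma_hat)]
    y_t = [yi / s for yi, s in zip(y, sigma_hat)]
    return ols_two_columns(X_t, y_t), g


# Simulated data: true intercept 2, true slope 0.5, variance exp(-1 + 0.4 x).
random.seed(0)
x = [random.uniform(0.0, 10.0) for _ in range(2000)]
y = [2.0 + 0.5 * xi + math.exp(0.5 * (-1.0 + 0.4 * xi)) * random.gauss(0.0, 1.0)
     for xi in x]
b_fgls, g_hat = fgls_exponential_variance(x, y)
print(b_fgls, g_hat)
```

The intercept of the auxiliary regression in step 2 is biased, because the mean of the log of a squared standard normal is not zero. That bias only rescales all weights by a common factor, which leaves $\hat\beta_{FGLS}$ unchanged.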

Weighted Least Squares

The FGLS estimator requires us to specify $\Omega(\gamma)$. If $\Omega(\gamma)$ is misspecified, FGLS is no longer efficient and the var-cov matrix of $\hat\beta_{FGLS}$ will be incorrect.

Not surprisingly, the answer is to use a robust estimator of the var-cov matrix. We will still use a weighting matrix $\hat\Omega = \Omega(\hat\gamma)$, but obtain a robust var-cov matrix of $\hat\beta_{FGLS}$.

Assuming that we have a consistent first step,
$$\mathrm{Var}\left[\hat\beta_{FGLS}\right] = \left[X'\hat\Omega^{-1}X\right]^{-1}\left[\sum_{i=1}^{N} x_i'\hat\Omega^{-1}\tilde\varepsilon_i\tilde\varepsilon_i'\hat\Omega^{-1}x_i\right]\left[X'\hat\Omega^{-1}X\right]^{-1}$$
This is an extension of the White heteroskedasticity-robust variance estimator.
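For a diagonal weighting matrix the robust formula reduces to a familiar sandwich, which the following plain-Python sketch computes. It assumes weights $w_i = 1/\hat\sigma_i^2$ are already available, takes $\tilde\varepsilon_i$ to be the residuals from the weighted fit, and applies no finite-sample correction; the function name and toy data are illustrative only.

```python
def wls_with_robust_variance(x, y, w):
    """WLS of y on (1, x) with weights w_i = 1/sigma_hat_i^2, plus a
    White-type sandwich variance that stays valid if the weights are wrong."""
    # Bread: A = X' W X, a symmetric 2x2 matrix.
    a00 = sum(w)
    a01 = sum(wi * xi for wi, xi in zip(w, x))
    a11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    g0 = sum(wi * yi for wi, yi in zip(w, y))
    g1 = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = a00 * a11 - a01 * a01
    inv = [[a11 / det, -a01 / det], [-a01 / det, a00 / det]]
    beta = [inv[0][0] * g0 + inv[0][1] * g1,
            inv[1][0] * g0 + inv[1][1] * g1]
    # Meat: sum_i w_i^2 * e_i^2 * x_i x_i', with e_i from the weighted fit.
    m = [[0.0, 0.0], [0.0, 0.0]]
    for wi, xi, yi in zip(w, x, y):
        e = yi - beta[0] - beta[1] * xi
        s = (wi * e) ** 2
        row = (1.0, xi)
        for j in range(2):
            for k in range(2):
                m[j][k] += s * row[j] * row[k]
    # Sandwich: A^{-1} M A^{-1}.
    tmp = [[sum(inv[j][l] * m[l][k] for l in range(2)) for k in range(2)]
           for j in range(2)]
    var = [[sum(tmp[j][l] * inv[l][k] for l in range(2)) for k in range(2)]
           for j in range(2)]
    return beta, var


# Hypothetical toy data and weights.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.9, 4.4, 3.2, 8.0, 1.5]
w = [1.0, 1.0, 0.25, 0.25, 0.0625, 0.0625]
beta_wls, var_robust = wls_with_robust_variance(x, y, w)
print(beta_wls, var_robust)
```

Setting all weights to 1 gives back the usual White estimator for OLS, and multiplying every weight by the same constant changes neither the coefficients nor the robust variance.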

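GLS in Many Settings

Panel data
$$y_{st} = x_{st}'\beta_1 + z_{st}'\beta_2^{s} + \nu_{st} \qquad (1)$$
where $z_{st} = 1$ in a standard fixed effects model or $z_{st} = [1, t]$ in a model with individual effects and individual time trends.

1. Obtain $\hat\Omega$ by running a fixed-effects regression.
2. Reweight the data in (1) using $\hat\Omega^{-1/2}$, creating $\tilde y_s = \hat\Omega^{-1/2} y_s$, $\tilde X_s = \hat\Omega^{-1/2} X_s$ and $\tilde Z_s = \hat\Omega^{-1/2} Z_s$.
3. Residualize $\tilde y_s$ and $\tilde X_s$ by partialling out $\tilde Z_s$. Let $M_{\tilde Z_s} = I_T - \tilde Z_s(\tilde Z_s'\tilde Z_s)^{-1}\tilde Z_s'$, and form $\ddot y_s = M_{\tilde Z_s}\tilde y_s$ and $\ddot X_s = M_{\tilde Z_s}\tilde X_s$.
4. Obtain $\hat\beta_1$ by OLS regression of $\ddot y_s$ on $\ddot X_s$.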
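Steps 3 and 4 can be sketched in plain Python for the simplest case of one regressor and a single column in $Z_s$. The sketch assumes the data have already been reweighted (here with identity weights, so step 2 is a no-op); the function names and toy data are hypothetical.

```python
def partial_out(v, z):
    """Return M_z v = v - z (z'z)^{-1} z'v for a single column z."""
    coef = sum(zi * vi for zi, vi in zip(z, v)) / sum(zi * zi for zi in z)
    return [vi - coef * zi for vi, zi in zip(v, z)]


def pooled_slope_after_partialling(groups):
    """groups: list of (y_s, x_s, z_s), one tuple of lists per unit s,
    assumed already reweighted. Implements steps 3 and 4 for one x and one z."""
    num = 0.0
    den = 0.0
    for y_s, x_s, z_s in groups:
        # Step 3: residualize y_s and x_s on z_s, unit by unit.
        y_r = partial_out(y_s, z_s)
        x_r = partial_out(x_s, z_s)
        # Step 4: accumulate the pooled OLS cross-products.
        num += sum(a * b for a, b in zip(x_r, y_r))
        den += sum(a * a for a in x_r)
    return num / den


# Toy data: unit-specific intercepts (5 and -3), common slope 2, no noise.
ones = [1.0, 1.0, 1.0, 1.0]
x1 = [0.0, 1.0, 2.0, 4.0]
x2 = [1.0, 3.0, 3.0, 5.0]
groups = [
    ([5.0 + 2.0 * v for v in x1], x1, ones),
    ([-3.0 + 2.0 * v for v in x2], x2, ones),
]
beta_1 = pooled_slope_after_partialling(groups)
print(beta_1)
```

With a column of ones for $z$, partialling out is just demeaning within each unit, so the example recovers the common slope of 2 up to floating-point error. After a non-trivial reweighting the transformed $\tilde Z_s$ is no longer constant, which is why the projection is written for a general column.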

GLS in Panel Data

Stata uses the command xtgls. This command allows estimation in the presence of:
- heteroskedasticity across panels (panels(heteroskedastic))
- heteroskedasticity and cross-sectional correlation across panels (panels(correlated))

The default, panels(iid), assumes neither. It also allows for AR(1) autocorrelation within panels, with either a common coefficient (corr(ar1)) or a panel-specific coefficient (corr(psar1)).

Where is xtgls in Trouble?

Sometimes there is reason to believe the autocorrelation is of order greater than 1. Then xtgls is no longer useful.

Suppose that $\nu_{st}$ follows an AR(p) process, i.e.
$$\nu_{st} = \alpha_1\nu_{s,t-1} + \dots + \alpha_p\nu_{s,t-p} + \varepsilon_{st}$$
$$E\left[\nu_{s,t}\,\nu_{s,(t+k)}\right] = \gamma_s(k) \neq 0$$
Then the matrix $\Omega$ needs to incorporate this in order to be able to transform the data in the required way.

Hansen (2007) provides a bias-corrected estimator of the autocorrelation parameters in fixed effects panel data models and comes up with a way to create the correct $\hat\Omega$, with which the estimation procedure detailed above can be used.

Hansen's estimator is implemented in Stata by the xtargls command (if you need it for your research, ask me for it).

Now we go to Stata!