Outline of Linear Systems of Equations: POLS, GLS, FGLS, GMM. Common Coefficients, Panel Data Model.


Preliminaries

The linear panel data model is a static model because all explanatory variables are dated contemporaneously with the dependent variable. It is also considered a common coefficient model because β is the same for all individuals across time:

    y_it = x_it'β + u_it

where x_it is K×1; i = 1,...,N and t = 1,...,T; N is large and T is small. We assume observations i ≠ j are all independent. Want time heterogeneity? Then use time dummies or the Seemingly Unrelated Regression (SUR) model. Want individual heterogeneity? Fixed Effects (FE) and/or Random Effects (RE), or something more general such as Random Coefficients (RC). For right now, there is no individual or time heterogeneity present in the model. We will include unobserved individual heterogeneity in the panel data model later. We will also discuss multivariate linear systems with time heterogeneity, i.e., the SUR model, at another time.

To simplify the notation, we can stack the model over time:

    y_i = x_i β + u_i

where y_i is T×1, u_i is T×1, and x_i = (x_i1, x_i2, ..., x_iT)' is the T×K matrix whose t-th row is x_it'.

POLS

Identification Assumptions

Assumption POLS.1: E(x_it u_it) = 0 for all i, t (within-equation, or contemporaneous, exogeneity).

For most applications, x_it has a sufficient number of elements equal to unity, so that Assumption POLS.1 implies that E(u_it) = 0. This is the weakest assumption we can impose in a regression framework to get consistent estimators of β, and it can hold when some elements of x_i are correlated with some elements of u_i. For example, it allows x_is and u_it to be correlated when s ≠ t.

Under Assumption POLS.1, the vector β satisfies

    E[x_i'(y_i − x_i β)] = 0,  or equivalently  E(x_i'x_i)β = E(x_i'y_i).

For each i, x_i'y_i is a K×1 vector and x_i'x_i is a K×K symmetric, positive semidefinite random matrix. Therefore, E(x_i'x_i) is always a K×K symmetric, positive semidefinite nonrandom matrix (the expectation here is defined over the population distribution of x_i). To be able to estimate β, we need to assume that it is the only K×1 vector that satisfies this equation.

Assumption POLS.2: rank[Σ_{t=1}^T E(x_it x_it')] = K.

Under Assumptions POLS.1 and POLS.2, we can write β = [E(x_i'x_i)]^{−1} E(x_i'y_i), which shows that the two assumptions identify the vector β.

Estimator

Define the Pooled Ordinary Least Squares (POLS) estimator as:

    β̂_POLS = (Σ_{i=1}^N Σ_{t=1}^T x_it x_it')^{−1} (Σ_{i=1}^N Σ_{t=1}^T x_it y_it) = (Σ_{i=1}^N x_i'x_i)^{−1} (Σ_{i=1}^N x_i'y_i)

For computing β̂_POLS using matrix language programming, it is sometimes useful to write β̂ = (X'X)^{−1} X'Y, where X = (x_1', ..., x_N')' is NT×K and Y = (y_1', ..., y_N')' is NT×1. This estimator is called the pooled ordinary least squares (POLS) estimator because it corresponds to running OLS on the observations pooled across i and t.
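As a concrete illustration, here is a minimal numpy sketch of the POLS formula β̂ = (X'X)^{−1}X'Y on simulated data; the data-generating process (N, T, K, the value of β) is invented for illustration only.

```python
import numpy as np

# Hypothetical balanced panel: N individuals, T periods, K regressors,
# with a common coefficient vector beta for all i and t.
rng = np.random.default_rng(0)
N, T, K = 500, 4, 3
beta = np.array([1.0, -0.5, 2.0])

# Rows of X are ordered (i=1,t=1),...,(i=1,t=T),(i=2,t=1),...  (NT x K)
X = rng.normal(size=(N * T, K))
u = rng.normal(size=N * T)
Y = X @ beta + u

# beta_hat = (X'X)^{-1} X'Y, i.e., OLS on the observations pooled across i and t
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
print(beta_hat)  # close to beta under POLS.1-POLS.2
```

Because the simulated regressors satisfy the rank condition and are contemporaneously exogenous, the estimate lands near the true β for moderate N.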

Asymptotic Properties

Consistency. Since E(x_it u_it) = 0 by assumption,

    β̂_POLS − β = (N^{−1} Σ_i Σ_t x_it x_it')^{−1} (N^{−1} Σ_i Σ_t x_it u_it) →p [Σ_t E(x_it x_it')]^{−1} · 0 = 0.

Asymptotic Normality. With y_i = x_i β + u_i,

    √N (β̂_POLS − β) = (N^{−1} Σ_i x_i'x_i)^{−1} (N^{−1/2} Σ_i x_i'u_i) →d N(0, V_R)

where

    V_R = [E(x_i'x_i)]^{−1} E(x_i'u_i u_i'x_i) [E(x_i'x_i)]^{−1}

with the consistent (robust) estimator

    V̂_R = (N^{−1} Σ_i x_i'x_i)^{−1} (N^{−1} Σ_i x_i'û_i û_i'x_i) (N^{−1} Σ_i x_i'x_i)^{−1},  where û_i = y_i − x_i β̂_POLS.

For H_0: Rβ = r with q restrictions, the Wald statistic is

    N (Rβ̂ − r)' (R V̂_R R')^{−1} (Rβ̂ − r) →d χ²_q.

System Conditional Homoskedasticity (SCH) Assumption: E(u_i u_i' | x_i) = E(u_i u_i').

By the law of iterated expectations, the SCH assumption implies that E(x_i'u_i u_i'x_i) = E(x_i'Ωx_i), where Ω ≡ E(u_i u_i'). Then

    V_NR = [E(x_i'x_i)]^{−1} E(x_i'Ωx_i) [E(x_i'x_i)]^{−1}

    V̂_NR = (N^{−1} Σ_i x_i'x_i)^{−1} (N^{−1} Σ_i x_i'Ω̂x_i) (N^{−1} Σ_i x_i'x_i)^{−1},  where Ω̂ = N^{−1} Σ_i û_i û_i' →p E(u_i u_i') = Ω.
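The robust variance V̂_R and the Wald statistic above can be sketched in numpy; the estimator clusters the score x_i'û_i at the individual level exactly as in the formula. The data and the tested hypothesis (that the second coefficient is zero, which holds in this simulated DGP) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, K = 800, 4, 2
beta = np.array([1.0, 0.0])
X = rng.normal(size=(N, T, K))          # x_it stored as X[i, t, :]
u = rng.normal(size=(N, T))
Y = X @ beta + u                        # (N, T)

Sxx = np.einsum('itk,itl->kl', X, X) / N        # N^{-1} sum_i x_i' x_i
Sxy = np.einsum('itk,it->k', X, Y) / N
b = np.linalg.solve(Sxx, Sxy)                   # POLS estimate

uhat = Y - X @ b                                 # residuals u_hat_i, (N, T)
g = np.einsum('itk,it->ik', X, uhat)             # scores x_i' u_hat_i, (N, K)
meat = g.T @ g / N                               # N^{-1} sum_i x_i' u_hat_i u_hat_i' x_i
Sxx_inv = np.linalg.inv(Sxx)
V_R = Sxx_inv @ meat @ Sxx_inv                   # robust asymptotic variance

# Wald test of H0: beta_2 = 0 (q = 1 restriction, true in this DGP)
R = np.array([[0.0, 1.0]])
r = np.array([0.0])
wald = N * (R @ b - r) @ np.linalg.solve(R @ V_R @ R.T, R @ b - r)
print(float(wald))   # compare to the chi^2_1 critical value 3.84
```

Since H_0 is true here, the statistic is usually below the 5% critical value; under a false null it would diverge with N.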

Homoskedasticity and No Serial Correlation

To apply the usual OLS statistics from the pooled OLS regression across i and t, and for pooled OLS to be relatively efficient, we require that u_it be homoskedastic across t and serially uncorrelated. The weakest forms of these conditions are the following:

Assumption POLS.3: (a) E(u_it² x_it x_it') = σ² E(x_it x_it'), t = 1,...,T, where σ² = E(u_it²) for all t; (b) E(u_it u_is x_it x_is') = 0, t ≠ s, t,s = 1,...,T.

The first part of Assumption POLS.3 is a fairly strong homoskedasticity assumption; sufficient is E(u_it² | x_it) = E(u_it²) = σ² for all t. This means not only that the conditional variance does not depend on x_it, but also that the unconditional variance is the same in every time period. Assumption POLS.3(b) essentially restricts the conditional covariances of the errors across different time periods to be zero. In fact, since x_it almost always contains a constant, POLS.3(b) requires at a minimum that E(u_it u_is) = 0, t ≠ s. Sufficient for POLS.3(b) is E(u_it u_is | x_it, x_is) = E(u_it u_is) = 0, t ≠ s, t,s = 1,...,T.

It is important to remember that Assumption POLS.3 implies more than just a certain form of the unconditional variance matrix of u_i. Assumption POLS.3 implies E(u_i u_i') = σ² I_T, which means that the unconditional variances are constant and the unconditional covariances are zero, but it also effectively restricts the conditional variances and covariances. If Assumption POLS.3 holds, then Avar(β̂_POLS) = σ² [E(x_i'x_i)]^{−1}/N, so its appropriate estimator is

    σ̂² (X'X)^{−1} = σ̂² (Σ_i Σ_t x_it x_it')^{−1}

where σ̂² is the usual OLS variance estimator from the pooled regression of y_it on x_it.

GLS

Identification Assumptions

Assumption SGLS.1: E(x_it u_is) = 0, t,s = 1,...,T (cross-equation exogeneity, i.e., strict exogeneity).

This assumption is more easily stated using the Kronecker product: E(x_i ⊗ u_i) = 0. Typically, at least one element of x_i is unity, so in practice Assumption SGLS.1 implies that E(u_i) = 0. SGLS.1 is stronger than POLS.1, i.e., SGLS.1 implies POLS.1. This stronger assumption is needed for GLS to be consistent. Note, GLS is less robust than POLS, but it is more efficient than POLS if SGLS.1 holds and we add assumptions on the conditional variance matrix of u_i. A sufficient condition for Assumption SGLS.1 is the zero conditional mean assumption, i.e., E(u_i | x_i) = 0.

The second moment matrix of u_i, which is necessarily constant across i by the random sampling assumption, plays a critical role for GLS estimation of systems of equations. Define the T×T positive semidefinite matrix Ω ≡ E(u_i u_i'). Because E(u_i) = 0 in the vast majority of applications,

we will refer to Ω as the unconditional variance matrix of u_i. Sometimes, an equation must be dropped to ensure that Ω is nonsingular. Here, we assume Ω is nonsingular, so Assumption SGLS.1 implies that

    E(x_i'Ω^{−1}u_i) = 0.

In place of Assumption POLS.2, we assume that a weighted expected outer product of x_i is nonsingular. Here we insert the assumption of a nonsingular variance matrix for completeness.

Assumption SGLS.2: Ω is positive definite and E(x_i'Ω^{−1}x_i) is nonsingular.

Estimator

Write Ω = Ω^{1/2}Ω^{1/2'} using the Cholesky (triangular) decomposition, which we can do for any symmetric positive semidefinite matrix. Since Ω is invertible, Ω^{−1} = Ω^{−1/2'}Ω^{−1/2}. The usual motivation for the GLS estimator is to transform a system of equations where the error has a nonscalar variance-covariance matrix into a system where the error vector has a scalar variance-covariance matrix. We obtain this by premultiplying the stacked equation by Ω^{−1/2}:

    ỹ_i = x̃_i β + ũ_i

where ỹ_i = Ω^{−1/2} y_i, x̃_i = Ω^{−1/2} x_i, and ũ_i = Ω^{−1/2} u_i. Simple algebra shows that E(ũ_i ũ_i') = I_T. The generalized least squares (GLS) estimator of β is obtained by performing POLS of ỹ_i on x̃_i:

    β̂_GLS = (Σ_i x̃_i'x̃_i)^{−1} (Σ_i x̃_i'ỹ_i) = (Σ_i x_i'Ω^{−1}x_i)^{−1} (Σ_i x_i'Ω^{−1}y_i) = [X'(I_N ⊗ Ω^{−1})X]^{−1} [X'(I_N ⊗ Ω^{−1})Y]

Asymptotic Properties

Consistency. Since E(x_i'Ω^{−1}u_i) = 0,

    β̂_GLS − β = (N^{−1} Σ_i x_i'Ω^{−1}x_i)^{−1} (N^{−1} Σ_i x_i'Ω^{−1}u_i) →p A^{−1} E(x_i'Ω^{−1}u_i) = 0

where A ≡ E(x_i'Ω^{−1}x_i). If we are willing to make the zero conditional mean assumption, β̂_GLS can be shown to be unbiased conditional on X. Note, consistency fails if we only make Assumption POLS.1: E(x_it u_it) = 0 does not imply E(x_i'Ω^{−1}u_i) = 0. If Assumption POLS.1 holds but Assumption SGLS.1 fails, the transformed equation ỹ_i = x̃_i β + ũ_i generally induces correlation between x̃_i and ũ_i.

Asymptotic Normality

    √N (β̂_GLS − β) = (N^{−1} Σ_i x_i'Ω^{−1}x_i)^{−1} (N^{−1/2} Σ_i x_i'Ω^{−1}u_i)
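The whitening transformation can be sketched directly: factor a known Ω with Cholesky, premultiply each y_i and x_i by the inverse factor, and run POLS on the transformed data. The AR(1)-style Ω and the DGP below are assumptions made purely for illustration; in practice Ω is rarely known (which is what motivates FGLS).

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, K = 1000, 4, 2
beta = np.array([1.0, -1.0])
rho = 0.6
# Known (assumed) T x T error covariance with AR(1)-style decay; positive definite
Omega = rho ** np.abs(np.subtract.outer(np.arange(T), np.arange(T)))

C = np.linalg.cholesky(Omega)        # Omega = C C'
Cinv = np.linalg.inv(C)              # premultiplying by C^{-1} whitens u_i

X = rng.normal(size=(N, T, K))
u = rng.multivariate_normal(np.zeros(T), Omega, size=N)   # E(u_i u_i') = Omega
Y = X @ beta + u

Xt = np.einsum('st,itk->isk', Cinv, X)   # x_tilde_i = C^{-1} x_i
Yt = Y @ Cinv.T                          # y_tilde_i = C^{-1} y_i

# GLS = POLS on the transformed system
Sxx = np.einsum('itk,itl->kl', Xt, Xt)
Sxy = np.einsum('itk,it->k', Xt, Yt)
beta_gls = np.linalg.solve(Sxx, Sxy)
print(beta_gls)
```

Since Var(C^{−1}u_i) = C^{−1}ΩC^{−1'} = I_T, POLS on the transformed data is exactly the GLS formula (Σ x_i'Ω^{−1}x_i)^{−1}(Σ x_i'Ω^{−1}y_i).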

By the CLT,

    N^{−1/2} Σ_i x_i'Ω^{−1}u_i →d N(0, B),  where B ≡ E(x_i'Ω^{−1}u_i u_i'Ω^{−1}x_i).

Since N^{−1/2} Σ_i x_i'Ω^{−1}u_i = O_p(1) and N^{−1} Σ_i x_i'Ω^{−1}x_i − A = o_p(1), we can write

    √N (β̂_GLS − β) = A^{−1} (N^{−1/2} Σ_i x_i'Ω^{−1}u_i) + o_p(1).

It follows from the asymptotic equivalence lemma that √N (β̂_GLS − β) →d N(0, A^{−1}BA^{−1}). In terms of the transformed variables,

    V_R = [E(x̃_i'x̃_i)]^{−1} E(x̃_i'ũ_i ũ_i'x̃_i) [E(x̃_i'x̃_i)]^{−1}

so Avar(β̂_GLS) = A^{−1}BA^{−1}/N. SE: use the robust standard errors from POLS of ỹ_i on x̃_i.

Feasible Generalized Least Squares (FGLS)

Asymptotic Properties

Obtaining the GLS estimator β̂_GLS requires knowing Ω up to scale. That is, we must be able to write Ω = σ²C, where C is a known positive definite matrix and σ² is allowed to be an unknown constant. Sometimes C is known, but more often it is unknown. Therefore, we now turn to the analysis of feasible GLS (FGLS) estimation. In FGLS estimation, we replace the unknown matrix Ω with a consistent estimator. Because the estimator of Ω appears highly nonlinearly in the expression for the FGLS estimator, deriving finite-sample properties of FGLS is generally difficult. The asymptotic properties of the FGLS estimator are easily established because its first-order asymptotic properties are identical to those of the GLS estimator under Assumptions SGLS.1 and SGLS.2.

We initially assume we have a consistent estimator Ω̂ of Ω: plim Ω̂ = Ω. When Ω is allowed to be a general positive definite matrix, the following estimation approach can be used. First, obtain the POLS estimator of β, which we denote β̌. We already showed that β̌ is consistent for β under Assumptions POLS.1 and POLS.2, and therefore under Assumptions SGLS.1 and POLS.2. So a natural estimator of Ω is

    Ω̂ ≡ N^{−1} Σ_i ǔ_i ǔ_i'

where ǔ_i ≡ y_i − x_i β̌ are the POLS residuals. We can show that this estimator is consistent for Ω under Assumptions SGLS.1 and POLS.2 and standard moment conditions. Given Ω̂, the feasible GLS (FGLS) estimator of β is

    β̂_FGLS = (Σ_i x_i'Ω̂^{−1}x_i)^{−1} (Σ_i x_i'Ω̂^{−1}y_i) = [X'(I_N ⊗ Ω̂^{−1})X]^{−1} [X'(I_N ⊗ Ω̂^{−1})Y]

We already know that GLS is consistent and asymptotically normal. Because Ω̂ converges to Ω, it is not surprising that FGLS is consistent, and we can also verify that FGLS has the same limiting distribution as GLS, i.e., they are √N-equivalent. This asymptotic equivalence is important because we do not have to worry that Ω̂ is an estimator when performing asymptotic inference about β using β̂_FGLS.

In the FGLS context, a consistent estimator of A is

    Â ≡ N^{−1} Σ_i x_i'Ω̂^{−1}x_i

A consistent estimator of B is also readily available after FGLS estimation. Define the FGLS residuals by û_i ≡ y_i − x_i β̂_FGLS. Using standard arguments, a consistent estimator of B is

    B̂ ≡ N^{−1} Σ_i x_i'Ω̂^{−1}û_i û_i'Ω̂^{−1}x_i

The estimator of Avar(β̂) can then be written as Â^{−1}B̂Â^{−1}/N. This is the extension of the White heteroskedasticity-robust asymptotic variance estimator, and it is robust under Assumptions SGLS.1 and SGLS.2.

System Conditional Homoskedasticity (SCH) Assumption

Under the assumptions so far, FGLS has nothing to offer over POLS, and it is less robust. However, under an additional assumption, FGLS is asymptotically more efficient than POLS and other estimators.

Assumption SGLS.3: E(u_i u_i' | x_i) = E(u_i u_i') = Ω.

The SCH assumption puts restrictions on the conditional variances and covariances of elements of u_i. If E(u_i | x_i) = 0, then this assumption is the same as assuming Var(u_i | x_i) = Var(u_i) = Ω. Another way to state this assumption is B = A, which simplifies the asymptotic variance. By the law of iterated expectations, the SCH assumption implies that

    E(x_i'Ω^{−1}u_i u_i'Ω^{−1}x_i) = E(x_i'Ω^{−1}x_i)

where Ω ≡ E(u_i u_i'). Note, we only need this weaker condition to determine the usual variance matrix for FGLS. Under this weaker assumption, along with Assumptions SGLS.1 and SGLS.2, the asymptotic variance of the FGLS estimator is Avar(β̂) = A^{−1}/N. We obtain an estimator of this variance matrix by using our consistent estimator of A, so the estimated Avar(β̂) = Â^{−1}/N. This is the usual formula for the asymptotic variance of FGLS. It is nonrobust in the sense that it relies on the homoskedasticity assumption. If heteroskedasticity in u_i is suspected, then the robust estimator, which was derived earlier, should be used.
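The two-step FGLS procedure (POLS residuals, then Ω̂, then weighted estimation) can be sketched end to end; the true Ω and the DGP below are hypothetical, chosen so that Ω̂ has something nontrivial to recover.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, K = 1000, 3, 2
beta = np.array([0.5, 2.0])
Omega = np.array([[1.0, 0.4, 0.2],     # true (unknown to the estimator) error covariance
                  [0.4, 1.0, 0.4],
                  [0.2, 0.4, 1.0]])
X = rng.normal(size=(N, T, K))
Y = X @ beta + rng.multivariate_normal(np.zeros(T), Omega, size=N)

# Step 1: POLS to get beta_check
Sxx = np.einsum('itk,itl->kl', X, X)
b_pols = np.linalg.solve(Sxx, np.einsum('itk,it->k', X, Y))

# Step 2: Omega_hat = N^{-1} sum_i u_check_i u_check_i'
res = Y - X @ b_pols                    # POLS residuals, (N, T)
Omega_hat = res.T @ res / N

# Step 3: FGLS = (sum_i x_i' Omega_hat^{-1} x_i)^{-1} (sum_i x_i' Omega_hat^{-1} y_i)
Oinv = np.linalg.inv(Omega_hat)
Axx = np.einsum('itk,ts,isl->kl', X, Oinv, X)
Axy = np.einsum('itk,ts,is->k', X, Oinv, Y)
beta_fgls = np.linalg.solve(Axx, Axy)
print(beta_fgls)
```

With N large, Ω̂ is close to Ω and β̂_FGLS behaves like infeasible GLS, in line with the √N-equivalence discussed above.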

Under Assumptions SGLS.1, POLS.2, SGLS.2, and SGLS.3, the FGLS estimator is more efficient than the POLS estimator. We can actually say much more: FGLS is more efficient than any other estimator that uses the orthogonality conditions E(x_i ⊗ u_i) = 0.

Summary of the Various System GMM Estimators

Preliminaries

    y_it = x_it'β + u_it

For all t, x_it is a K×1 vector. Suppose we have an L_t×1 vector of instruments z_it, so the number of instruments can vary with time. The instruments must satisfy E(z_it u_it) = 0 for all t. Stacking the equations over t, we have y_i = x_i β + u_i, which is the same setup as in (2), and z_i has the structure of (4). Thus, the moment conditions are given by:

    E(z_i'[y_i − x_i β]) = E(z_i'u_i) = E(g_i) = 0  (L×1)

The efficient GMM estimator that uses only the moments E(z_it u_it) = 0 for all t is the GMM estimator with optimal weighting matrix. However, the choice of instrument matrix in (5) means we are only using the moment conditions aggregated across time, Σ_{t=1}^T E(z_it u_it) = 0. Thus, to obtain the efficient GMM estimator, the matrix of instruments should be as in (4), because this expresses the full set of moment conditions. The estimators available to deal with endogeneity are: system GMM, 3SLS, S2SLS, P2SLS, SIV, and PIV.

1. GMM Estimator

    β̂_GMM = argmin_β [Σ_i z_i'(y_i − x_i β)]' Ŵ [Σ_i z_i'(y_i − x_i β)]
           = [(Σ_i x_i'z_i) Ŵ (Σ_i z_i'x_i)]^{−1} (Σ_i x_i'z_i) Ŵ (Σ_i z_i'y_i)

To obtain the optimal GMM estimator, we choose Ŵ such that plim Ŵ = W ≡ [E(z_i'u_i u_i'z_i)]^{−1}. Thus, the weighting matrix for the optimal GMM estimator is

    Ŵ = (N^{−1} Σ_i z_i'û_i û_i'z_i)^{−1}
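A minimal two-step efficient GMM sketch for a scalar endogenous regressor: a first step with identity weighting gives residuals, which yield the optimal Ŵ for the second step. The DGP (endogeneity through a shared error component e, two valid instruments per period) is entirely hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 2000, 2
z = rng.normal(size=(N, T, 2))            # L_t = 2 instruments each period
e = rng.normal(size=(N, T))
v = rng.normal(size=(N, T))
x = z[..., 0] + z[..., 1] + e + v         # scalar regressor (K = 1), endogenous via e
u = e + rng.normal(size=(N, T))           # cov(x_it, u_it) != 0
y = 1.5 * x + u

# Block-diagonal instrument structure: g_i = (z_i1 u_i1, ..., z_iT u_iT), length T*2
zx = (z * x[..., None]).reshape(N, -1)    # rows are z_i' x_i
zy = (z * y[..., None]).reshape(N, -1)    # rows are z_i' y_i
d = zx.mean(axis=0)                       # N^{-1} sum_i z_i' x_i
p = zy.mean(axis=0)                       # N^{-1} sum_i z_i' y_i

b1 = (d @ p) / (d @ d)                    # first step: W = I (consistent, inefficient)
g = zy - b1 * zx                          # moment contributions z_i' u_hat_i
W_opt = np.linalg.inv(g.T @ g / N)        # optimal weighting matrix estimate
b_gmm = (d @ W_opt @ p) / (d @ W_opt @ d) # second step
print(b_gmm)                              # close to 1.5; pooled OLS is biased upward here
```

For comparison, pooled OLS on these data converges to roughly 1.5 + cov(x,u)/var(x), so the consistency of GMM is visible directly.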

2. 3SLS

The weighting matrix used by the 3SLS estimator is

    Ŵ = (N^{−1} Σ_i z_i'Ω̂z_i)^{−1},  where Ω̂ = N^{−1} Σ_i û_i û_i'.

The procedure for obtaining the 3SLS estimator is: First two stages: run P2SLS to get û_i. Third stage: obtain Ŵ and perform system GMM estimation. The 3SLS estimator is efficient under the conditional homoskedasticity assumption: E(u_i u_i' | z_i) = E(u_i u_i') = Ω.

3. S2SLS

The weighting matrix used by the S2SLS estimator is

    Ŵ = (N^{−1} Σ_i z_i'z_i)^{−1}

The S2SLS estimator is efficient under the conditional homoskedasticity assumption and when Ω is spherical, i.e., Ω = σ²I_T.

4. P2SLS

If L_t is the same for all t, i.e., L_t = L for all t, then z_i has the structure of (5). The P2SLS estimator exploits the orthogonality condition

    E(z_i'u_i) = E(z_i1 u_i1 + ... + z_iT u_iT) = 0

and the conditional homoskedasticity assumption. So, when z_i has the structure of (5), the weighting matrix used by the P2SLS estimator is

    Ŵ = (N^{−1} Σ_i Σ_t z_it z_it')^{−1}

and the P2SLS estimator is given by

    β̂ = [(Σ_i Σ_t x_it z_it') (Σ_i Σ_t z_it z_it')^{−1} (Σ_i Σ_t z_it x_it')]^{−1} (Σ_i Σ_t x_it z_it') (Σ_i Σ_t z_it z_it')^{−1} (Σ_i Σ_t z_it y_it)

P2SLS is efficient under the conditional homoskedasticity assumption. Note, when z_it = x_it, this estimator reduces to the POLS estimator.
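The P2SLS formula can be sketched by pooling all (i,t) observations, since the same L instruments enter every period. The one-endogenous-regressor DGP below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 2000, 3
z = rng.normal(size=(N * T, 2))               # pooled z_it, L = 2
e = rng.normal(size=N * T)
x = z @ np.array([1.0, 0.5]) + e + rng.normal(size=N * T)
u = e + rng.normal(size=N * T)                # endogeneity through e
y = 2.0 * x + u
X = x[:, None]                                 # K = 1

# beta_hat = [ (x'z)(z'z)^{-1}(z'x) ]^{-1} (x'z)(z'z)^{-1}(z'y)
Szz = z.T @ z
Szx = z.T @ X
Szy = z.T @ y
A = Szx.T @ np.linalg.solve(Szz, Szx)
b_p2sls = np.linalg.solve(A, Szx.T @ np.linalg.solve(Szz, Szy))
print(b_p2sls)                                 # close to 2.0
```

This is numerically identical to running 2SLS on the pooled sample, which is exactly how the estimator earned its name.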

5. SIV

If z_i has the structure of (4) and L = K, then we have exactly enough IVs for the explanatory variables in the system. Thus, the SIV estimator is given by

    β̂ = (N^{−1} Σ_i z_i'x_i)^{−1} (N^{−1} Σ_i z_i'y_i)

6. PIV

If z_i has the structure of (5) and L = K, then we have exactly enough IVs for the explanatory variables in the system. Thus, the pooled instrumental variables (PIV) estimator is given by

    β̂ = (Σ_i Σ_t z_it x_it')^{−1} (Σ_i Σ_t z_it y_it)

Note, when z_it = x_it, this estimator reduces to the POLS estimator.
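The final remark is easy to verify numerically: with z_it = x_it, the just-identified PIV formula collapses to the POLS formula. A short sketch on simulated (hypothetical) pooled data:

```python
import numpy as np

rng = np.random.default_rng(6)
NT, K = 3000, 2
X = rng.normal(size=(NT, K))                  # pooled x_it, exogenous here
y = X @ np.array([1.0, -2.0]) + rng.normal(size=NT)
Z = X.copy()                                  # take z_it = x_it (L = K)

b_piv = np.linalg.solve(Z.T @ X, Z.T @ y)     # (sum z_it x_it')^{-1} sum z_it y_it
b_pols = np.linalg.solve(X.T @ X, X.T @ y)    # (X'X)^{-1} X'y
print(np.allclose(b_piv, b_pols))             # True
```

Both solve the same linear system when Z = X, so the two estimates agree to machine precision.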