Lecture: Simultaneous Equation Model (Wooldridge's Book Chapter 16)

2 Model
Consider a system of two regressions
$$y_1 = \beta_1 y_2 + u_1 \tag{1}$$
$$y_2 = \beta_2 y_1 + u_2 \tag{2}$$
This is a simultaneous equation model (SEM) since $y_1$ and $y_2$ are determined simultaneously. Both variables are determined within the model, so they are endogenous and are denoted by the letter $y$.

3 Example
The demand-supply model in microeconomics includes a demand function and a supply function: $y_1$ is the quantity of the good and $y_2$ is its price. If $\beta_1 < 0$ and $\beta_2 > 0$, then (1) is the demand function while (2) is the supply function; $u_1$ is the demand shock and $u_2$ is the supply shock. Another example is the Keynesian cross (45-degree line) model, in which $y_1$ is national income and $y_2$ is total consumption.

4 Structural Form
(1) and (2) are structural in the sense that they are directly implied by economic theory. We assume
$$E(u_1 u_2) = 0 \tag{3}$$
so the structural errors are uncorrelated (orthogonal). Our goal is to estimate the structural coefficients $\beta$, which measure the causal effect of one endogenous variable on the other endogenous variable.

5 Simultaneity Bias
Plugging (1) into the right-hand side of (2) leads to $y_2 = \beta_2(\beta_1 y_2 + u_1) + u_2$. After collecting terms, we have
$$y_2 = \frac{\beta_2 u_1 + u_2}{1 - \beta_2\beta_1}, \tag{4}$$
which, together with (3), indicates that (whenever $\beta_2 \neq 0$)
$$E(y_2 u_1) = \frac{\beta_2 E(u_1^2)}{1 - \beta_2\beta_1} \neq 0 \;\Rightarrow\; \mathrm{cov}(y_2, u_1) \neq 0. \tag{5}$$
So structural model (1) suffers from an endogeneity issue (simultaneity bias): the regressor in (1) is correlated with the error term. Consequently, OLS applied to (1) gives a biased and inconsistent estimate. The same holds for (2).
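
The bias in (5) can be seen in a small simulation. The sketch below is illustrative only: the parameter values, the standard normal shocks, and the function names are assumptions made here, not part of the lecture. It draws data from (1) and (2), runs OLS on (1), and compares the estimate with the probability limit implied by (4) and (5) under unit shock variances.

```python
import random

def simulate_sem(n, beta1, beta2, seed=0):
    """Draw n observations of (y1, y2) from equations (1)-(2).

    Assumes independent standard normal shocks u1 and u2, so (3) holds.
    """
    rng = random.Random(seed)
    denom = 1.0 - beta2 * beta1
    y1, y2 = [], []
    for _ in range(n):
        u1 = rng.gauss(0.0, 1.0)
        u2 = rng.gauss(0.0, 1.0)
        y2_i = (beta2 * u1 + u2) / denom   # equation (4)
        y1_i = beta1 * y2_i + u1           # equation (1)
        y1.append(y1_i)
        y2.append(y2_i)
    return y1, y2

def ols_slope(y, x):
    """OLS slope of y on x without an intercept (all variables have mean zero)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

beta1, beta2 = -0.5, 1.0   # hypothetical demand and supply slopes
y1, y2 = simulate_sem(100_000, beta1, beta2)
beta1_ols = ols_slope(y1, y2)

# Probability limit of OLS with unit shock variances:
# beta1 + E(y2*u1) / E(y2^2) = beta1 + beta2*(1 - beta2*beta1) / (beta2^2 + 1)
beta1_plim = beta1 + beta2 * (1.0 - beta2 * beta1) / (beta2 ** 2 + 1.0)

print(f"true beta1 = {beta1}, OLS = {beta1_ols:.3f}, plim = {beta1_plim:.3f}")
```

With these particular values the OLS estimate settles near its probability limit rather than near the true $\beta_1$, and increasing the sample size does not close the gap.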

6 Example
Suppose there is a demand shock $u_1$. $u_1$ affects the quantity $y_1$ through the demand function (1). Next, quantity $y_1$ affects the price $y_2$ through the supply function (2). Some people call this reverse causation. So $u_1$ affects $y_2$, and the two variables are correlated. In other words, the regressor in (1) is endogenous.

7 Endogeneity
Typically an economic theory implies an SEM, so several variables are determined simultaneously within the model. Those variables are endogenous from the economics perspective. We have just shown that an SEM suffers from simultaneity bias. Therefore those variables are correlated with the error, so they are endogenous from the econometrics perspective. In short, economic endogeneity is closely related to econometric (statistical) endogeneity.

8 Graph
We cannot identify either the demand curve or the supply curve from the scatter plot of quantity versus price. However, if there are some exogenous variables, say, an input price that can shift the supply curve, then we can identify the demand curve.

9 SEM with Exogenous Regressors
Now augment the structural model with exogenous regressors
$$y_1 = \beta_1 y_2 + c_1 z_1 + u_1 \tag{6}$$
$$y_2 = \beta_2 y_1 + c_2 z_2 + u_2 \tag{7}$$
For instance, $z_1$ is income and $z_2$ is input price. $z_1$ and $z_2$ are determined outside the model, so they are exogenous (pre-determined). Statistically, exogeneity means that
$$E(z_1 u_1) = 0;\quad E(z_1 u_2) = 0;\quad E(z_2 u_1) = 0;\quad E(z_2 u_2) = 0 \tag{8}$$

10 Reduced Form
The reduced form expresses the endogenous variables in terms of exogenous variables only
$$y_1 = \frac{c_1 z_1 + \beta_1 c_2 z_2 + e_1}{1 - \beta_2\beta_1} \tag{9}$$
$$y_2 = \frac{\beta_2 c_1 z_1 + c_2 z_2 + e_2}{1 - \beta_2\beta_1} \tag{10}$$
$$e_1 = u_1 + \beta_1 u_2 \tag{11}$$
$$e_2 = \beta_2 u_1 + u_2 \tag{12}$$
(9) and (10) are reduced forms, and only the exogenous variables $z_1$ and $z_2$ appear on the right-hand side (RHS).
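
The step from the structural form to (9) is a direct substitution. As a worked derivation, plug (7) into the right-hand side of (6) and collect the $y_1$ terms:

$$y_1 = \beta_1(\beta_2 y_1 + c_2 z_2 + u_2) + c_1 z_1 + u_1 \;\Rightarrow\; (1 - \beta_2\beta_1)\,y_1 = c_1 z_1 + \beta_1 c_2 z_2 + (u_1 + \beta_1 u_2).$$

Dividing by $1 - \beta_2\beta_1$ (assumed nonzero) gives (9) with $e_1$ as in (11). Plugging (6) into (7) gives (10) and (12) in the same way.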

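11 Reduced Form Continued
Let
$$\pi_{11} = \frac{c_1}{1 - \beta_2\beta_1};\quad \pi_{12} = \frac{\beta_1 c_2}{1 - \beta_2\beta_1};\quad \tilde e_1 = \frac{e_1}{1 - \beta_2\beta_1}$$
$$\pi_{21} = \frac{\beta_2 c_1}{1 - \beta_2\beta_1};\quad \pi_{22} = \frac{c_2}{1 - \beta_2\beta_1};\quad \tilde e_2 = \frac{e_2}{1 - \beta_2\beta_1}$$
The reduced form can be rewritten as
$$y_1 = \pi_{11} z_1 + \pi_{12} z_2 + \tilde e_1 \tag{13}$$
$$y_2 = \pi_{21} z_1 + \pi_{22} z_2 + \tilde e_2 \tag{14}$$
where $\tilde e$ is the reduced-form error, which is a linear function of the structural errors. Note that the reduced-form errors are correlated, $\mathrm{cov}(\tilde e_1, \tilde e_2) \neq 0$, whereas the structural errors are uncorrelated.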
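
The correlation between the reduced-form errors follows from (11), (12), and (3). A short derivation, writing $\sigma_1^2 = E(u_1^2)$ and $\sigma_2^2 = E(u_2^2)$ for zero-mean structural errors (this notation is introduced here for illustration):

$$\mathrm{cov}(e_1, e_2) = E\big[(u_1 + \beta_1 u_2)(\beta_2 u_1 + u_2)\big] = \beta_2\sigma_1^2 + \beta_1\sigma_2^2.$$

Dividing each error by $1 - \beta_2\beta_1$ scales this covariance by $1/(1 - \beta_2\beta_1)^2$, so the reduced-form errors are correlated except in the knife-edge case $\beta_2\sigma_1^2 = -\beta_1\sigma_2^2$.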

12 Reduced Form Continued
Note that all exogenous variables appear on the right-hand side of each reduced form; by contrast, the structural form has an endogenous variable and some exogenous variables on the right-hand side. As a result, OLS applied to the structural form is inconsistent, whereas OLS applied to the reduced form is consistent. Reduced form (14) is the first-stage regression if we want to use the 2SLS estimator to obtain the causal effect of $y_2$ on $y_1$. Notice that all exogenous variables are used as regressors in the first-stage regression.

13 Indirect Least Squares Estimator (ILS)
OLS applied to (13) and (14) separately gives consistent estimates of the $\pi$s. However, we are interested in the coefficients in the structural form. The indirect least squares estimator (ILS) estimates the structural-form coefficients $\beta$ based on the estimated reduced-form coefficients $\hat\pi$.

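14 Indirect Least Squares Estimator Continued
The ILS estimators for the structural-form coefficients $\beta$ are
$$\hat\beta_1^{ILS} = \frac{\hat\pi_{12}}{\hat\pi_{22}} \tag{15}$$
$$\hat\beta_2^{ILS} = \frac{\hat\pi_{21}}{\hat\pi_{11}} \tag{16}$$
provided that
$$c_2 \neq 0 \tag{17}$$
$$c_1 \neq 0 \tag{18}$$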
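
The ratios in (15) and (16) can be checked numerically. The following is a minimal sketch under assumed values (the coefficients, standard normal draws, and helper names are hypothetical choices made here): it simulates the augmented model through its reduced form, estimates (13) and (14) by OLS, and forms the ILS ratios.

```python
import random

def ols2(y, x1, x2):
    """OLS of y on x1 and x2 without intercept, via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def simulate_sem_with_z(n, beta1, beta2, c1, c2, seed=1):
    """Draw (y1, y2, z1, z2) from (6)-(7) using the reduced forms (9)-(10)."""
    rng = random.Random(seed)
    denom = 1.0 - beta2 * beta1
    y1, y2, z1, z2 = [], [], [], []
    for _ in range(n):
        z1_i, z2_i = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1.append((c1 * z1_i + beta1 * c2 * z2_i + u1 + beta1 * u2) / denom)
        y2.append((beta2 * c1 * z1_i + c2 * z2_i + beta2 * u1 + u2) / denom)
        z1.append(z1_i)
        z2.append(z2_i)
    return y1, y2, z1, z2

# Hypothetical structural parameters: beta1, beta2, c1, c2
y1, y2, z1, z2 = simulate_sem_with_z(50_000, -0.5, 1.0, 1.0, 1.0)

pi11, pi12 = ols2(y1, z1, z2)   # reduced form (13)
pi21, pi22 = ols2(y2, z1, z2)   # reduced form (14)

beta1_ils = pi12 / pi22         # equation (15)
beta2_ils = pi21 / pi11         # equation (16)
print(f"ILS: beta1 = {beta1_ils:.3f}, beta2 = {beta2_ils:.3f}")
```

All variables are generated with mean zero, which is why the regressions omit an intercept.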

15 Identification and Exclusion Restrictions
$\beta_1$ cannot be identified if $c_2 = 0$. So identification of $\beta_1$ requires $c_2 \neq 0$. $c_2 \neq 0$ indicates that there is an exogenous variable $z_2$ which is excluded from the first structural equation (order condition) but appears in the second structural equation with a non-zero coefficient (rank condition). For the demand-and-supply example, the demand function can be identified if input price is present in the supply function. Graphically, the demand curve can be traced out (identified) when the supply curve shifts due to varying input price.
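
To see why identification fails, set $c_2 = 0$ in the reduced-form coefficients. A short worked case:

$$c_2 = 0 \;\Rightarrow\; \pi_{12} = \frac{\beta_1 \cdot 0}{1 - \beta_2\beta_1} = 0, \qquad \pi_{22} = \frac{0}{1 - \beta_2\beta_1} = 0,$$

so the ratio $\pi_{12}/\pi_{22}$ in (15) is $0/0$ and carries no information about $\beta_1$. The remaining ratio $\pi_{21}/\pi_{11} = \beta_2$ is unaffected, so $\beta_2$ is still identified as long as $c_1 \neq 0$.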

16 Remarks
$\beta_1$ is over-identified if more than one exogenous variable is excluded from the first equation and appears in the second equation with a non-zero coefficient. In that case, the ILS estimator for $\beta_1$ is not unique (Exercise), a big disadvantage of ILS. Another disadvantage of ILS is that $\hat\beta^{ILS}$ is a nonlinear function of $\hat\pi$, so deriving the variance entails the delta method.

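17 Delta Method
For simplicity, let $\hat\pi$ and $\hat\beta = f(\hat\pi)$ be scalars. Consider the first-order Taylor expansion
$$\hat\beta = f(\hat\pi) \approx f(\pi) + f'(\pi)(\hat\pi - \pi),$$
which implies that
$$\mathrm{var}(\hat\beta) = (f'(\pi))^2\,\mathrm{var}(\hat\pi).$$
More generally, for vectors we have
$$\text{var-covariance}(\hat\beta) = \left(\frac{\partial\hat\beta}{\partial\hat\pi}\right)' \text{var-covariance}(\hat\pi) \left(\frac{\partial\hat\beta}{\partial\hat\pi}\right),$$
where $\partial\hat\beta/\partial\hat\pi$ is called the gradient (column) vector.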
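
As a worked illustration, the vector formula can be applied to the ILS estimator (15), where $f(\pi_{12}, \pi_{22}) = \pi_{12}/\pi_{22}$. The gradient is

$$\frac{\partial f}{\partial(\pi_{12}, \pi_{22})'} = \left(\frac{1}{\pi_{22}},\; -\frac{\pi_{12}}{\pi_{22}^2}\right)',$$

so to first order

$$\mathrm{var}(\hat\beta_1^{ILS}) \approx \frac{\mathrm{var}(\hat\pi_{12})}{\pi_{22}^2} - \frac{2\pi_{12}\,\mathrm{cov}(\hat\pi_{12}, \hat\pi_{22})}{\pi_{22}^3} + \frac{\pi_{12}^2\,\mathrm{var}(\hat\pi_{22})}{\pi_{22}^4}.$$

In practice the unknown $\pi$s are replaced by their estimates. The covariance term is generally non-zero because $\hat\pi_{12}$ and $\hat\pi_{22}$ come from two reduced-form equations with correlated errors.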

18 2SLS Estimator
A nice by-product of the structural model is that instrumental variables are readily available. The reduced forms (9) and (10) clearly show that the exogenous variables $z_1$ and $z_2$ are correlated with the endogenous regressors $y_1$ and $y_2$. Moreover, we assume $z_1$ and $z_2$ are uncorrelated with the errors $u$; see (8). So $z_1$ and $z_2$ are instrumental variables for $y_1$ and $y_2$ if the exogeneity assumption (8) holds. Essentially, the 2SLS estimator replaces the endogenous regressors with their exogenous parts, and we use instrumental variables to isolate those exogenous parts.

19 2SLS Estimator Continued
Step 1: Estimate the reduced forms (13) and (14) (first stage) using OLS and keep the fitted values
$$\hat y_1 = \hat\pi_{11} z_1 + \hat\pi_{12} z_2 \tag{19}$$
$$\hat y_2 = \hat\pi_{21} z_1 + \hat\pi_{22} z_2 \tag{20}$$
Step 2: Replace the endogenous regressors with the fitted values, and fit the second-stage regressions using OLS
$$y_1 = \beta_1 \hat y_2 + c_1 z_1 + u_1 \tag{21}$$
$$y_2 = \beta_2 \hat y_1 + c_2 z_2 + u_2 \tag{22}$$
$\hat y_1$ is the exogenous part of $y_1$; $\hat y_2$ is the exogenous part of $y_2$. Both are linear combinations of the exogenous $z_1$ and $z_2$. (Where are the endogenous parts?)
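
The two steps above can be sketched by hand for equation (6). This is an illustrative simulation only: the parameter values, the standard normal draws, and the helper names are assumptions made here. It runs the first stage (20) by OLS, then the second stage (21) with the fitted values in place of $y_2$.

```python
import random

def ols2(y, x1, x2):
    """OLS of y on x1 and x2 without intercept, via the 2x2 normal equations."""
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

def simulate_sem_with_z(n, beta1, beta2, c1, c2, seed=1):
    """Draw (y1, y2, z1, z2) from (6)-(7) using the reduced forms (9)-(10)."""
    rng = random.Random(seed)
    denom = 1.0 - beta2 * beta1
    y1, y2, z1, z2 = [], [], [], []
    for _ in range(n):
        z1_i, z2_i = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1.append((c1 * z1_i + beta1 * c2 * z2_i + u1 + beta1 * u2) / denom)
        y2.append((beta2 * c1 * z1_i + c2 * z2_i + beta2 * u1 + u2) / denom)
        z1.append(z1_i)
        z2.append(z2_i)
    return y1, y2, z1, z2

# Hypothetical structural parameters: beta1, beta2, c1, c2
y1, y2, z1, z2 = simulate_sem_with_z(50_000, -0.5, 1.0, 1.0, 1.0)

# Step 1: first stage (20), regress y2 on all exogenous variables
pi21, pi22 = ols2(y2, z1, z2)
y2_hat = [pi21 * a + pi22 * b for a, b in zip(z1, z2)]

# Step 2: second stage (21), regress y1 on the fitted y2 and z1
beta1_2sls, c1_2sls = ols2(y1, y2_hat, z1)
print(f"2SLS: beta1 = {beta1_2sls:.3f}, c1 = {c1_2sls:.3f}")
```

This reproduces the point estimates only. The standard errors reported by a manually run second stage are not the correct 2SLS standard errors, which is one reason to use a dedicated command in practice.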

20 2SLS Estimator Stata
For example, to get $\hat\beta_1^{2SLS}$ in (6), use
ivreg y1 (y2 = z2) z1
where y1 is the dependent variable, y2 is the endogenous regressor, z2 is the excluded exogenous variable, and z1 is the included exogenous variable (control variable). This command first runs the first-stage regression (20), then the second-stage regression (21).
Exercise: what is the Stata command to get $\hat\beta_2^{2SLS}$ in (7)? You need to think carefully about which variable is which.

21 (Optional) Seemingly Unrelated Regression (SUR)
Reduced forms (13) and (14) are examples of seemingly unrelated regressions. They have different LHS variables, so they seem unrelated. They are in fact related because the reduced-form errors are correlated across equations, i.e., $\mathrm{cov}(e_1, e_2) \neq 0$; see (11) and (12).

22 (Optional) SUR Continued
Generally, the optimal estimator for the SUR model is the generalized least squares (GLS) estimator, due to the correlation between errors across regressions. However, if each equation in the SUR model has identical RHS variables, GLS becomes equation-by-equation OLS. The Stata command to estimate the SUR model using the GLS estimator is
sureg (y1 x1)(y2 x2)

23 (Optional) Gauss-Markov Theorem
If the error is homoskedastic and uncorrelated, then the OLS estimator is the best linear unbiased estimator (BLUE) conditional on the regressors. See Theorem 10.4 in the textbook for details.

24 (Optional) Generalized Least Squares Estimator I
There is an estimator better than OLS if the error is heteroskedastic. Suppose the model is
$$y_i = \beta x_i + u_i,\quad E(u_i^2) = \sigma_i^2 \quad \text{(Heteroskedasticity)} \tag{23}$$
Consider the transformed model
$$y_i^* = \beta x_i^* + u_i^* \tag{24}$$
$$y_i^* = \frac{y_i}{\sigma_i},\quad x_i^* = \frac{x_i}{\sigma_i},\quad u_i^* = \frac{u_i}{\sigma_i} \tag{25}$$
where the transformed error is homoskedastic: $E(u_i^{*2}) = 1$. The estimator better than OLS is GLS, which is OLS applied to the transformed regression (24).
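
The efficiency gain from the transformation (25) can be illustrated with a small Monte Carlo experiment. The sketch below rests on assumptions chosen here for illustration: the error standard deviation is taken to be known and equal to $x_i$, and the sample size, number of replications, and function names are hypothetical. It compares the sampling variance of OLS on (23) with that of OLS on the transformed regression (24).

```python
import random
import statistics

def ols_slope(y, x):
    """OLS slope of y on x without an intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def one_replication(rng, n, beta):
    """Return (OLS, GLS) estimates of beta from one simulated sample."""
    x = [rng.uniform(1.0, 5.0) for _ in range(n)]
    sigma = list(x)   # assumed known: sd of u_i equals x_i
    y = [beta * xi + si * rng.gauss(0.0, 1.0) for xi, si in zip(x, sigma)]
    b_ols = ols_slope(y, x)
    # Transformation (25): divide each observation by sigma_i
    y_star = [yi / si for yi, si in zip(y, sigma)]
    x_star = [xi / si for xi, si in zip(x, sigma)]
    b_gls = ols_slope(y_star, x_star)
    return b_ols, b_gls

rng = random.Random(2)
beta = 2.0
draws = [one_replication(rng, 50, beta) for _ in range(2000)]
ols_draws = [d[0] for d in draws]
gls_draws = [d[1] for d in draws]

mean_ols, mean_gls = statistics.mean(ols_draws), statistics.mean(gls_draws)
var_ols, var_gls = statistics.pvariance(ols_draws), statistics.pvariance(gls_draws)
print(f"OLS: mean = {mean_ols:.3f}, variance = {var_ols:.4f}")
print(f"GLS: mean = {mean_gls:.3f}, variance = {var_gls:.4f}")
```

Both estimators are centered on the true coefficient, but the GLS draws are less dispersed. In applied work $\sigma_i$ is rarely known and has to be modeled or estimated.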

25 (Optional) Generalized Least Squares Estimator II
Consider a model with correlated error
$$y_i = \beta x_i + u_i,\quad u_i = \rho u_{i-1} + v_i \quad \text{(Correlation)} \tag{26}$$
Consider the transformed model
$$y_i^* = \beta x_i^* + u_i^* \tag{27}$$
$$y_i^* = y_i - \rho y_{i-1},\quad x_i^* = x_i - \rho x_{i-1} \tag{28}$$
so that $u_i^* = u_i - \rho u_{i-1} = v_i$, and the transformed error is uncorrelated: $E(u_i^* u_{i-1}^*) = 0$. GLS is OLS applied to the transformed regression (27).
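
The quasi-differencing in (28) can be checked numerically. This is a minimal sketch assuming $\rho$ is known and $v_i$ is standard normal white noise; the parameter values and helper names are hypothetical. It compares the first-order autocorrelation of the original and transformed errors, then estimates $\beta$ from the transformed regression (27).

```python
import random

def ols_slope(y, x):
    """OLS slope of y on x without an intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def autocorr1(v):
    """First-order sample autocorrelation of a mean-zero series."""
    return sum(a * b for a, b in zip(v[1:], v[:-1])) / sum(a * a for a in v)

rng = random.Random(3)
n, beta, rho = 20_000, 2.0, 0.8   # hypothetical values; rho treated as known

x = [rng.gauss(0.0, 1.0) for _ in range(n)]
u = [rng.gauss(0.0, 1.0)]          # start-up value for the AR(1) error
for i in range(1, n):
    u.append(rho * u[i - 1] + rng.gauss(0.0, 1.0))   # equation (26)
y = [beta * xi + ui for xi, ui in zip(x, u)]

# Transformation (28); the first observation is dropped
y_star = [y[i] - rho * y[i - 1] for i in range(1, n)]
x_star = [x[i] - rho * x[i - 1] for i in range(1, n)]
u_star = [u[i] - rho * u[i - 1] for i in range(1, n)]

rho_u = autocorr1(u)
rho_u_star = autocorr1(u_star)
beta_gls = ols_slope(y_star, x_star)   # OLS on the transformed regression (27)
print(f"autocorr(u) = {rho_u:.3f}, autocorr(u*) = {rho_u_star:.3f}, beta_gls = {beta_gls:.3f}")
```

When $\rho$ is unknown it must be estimated first, which gives a feasible version of this estimator.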

26 (Optional) Matrix Algebra for GLS
We use GLS when $E(UU' \mid X) = \Omega \neq \sigma^2 I$. Because $\Omega$ is symmetric and positive definite, there is a decomposition $\Omega = AA'$ (obtained, for example, from the spectral decomposition). Now consider the transformed model
$$Y^* = X^*\beta + U^*$$
where $Y^* = A^{-1}Y$, $X^* = A^{-1}X$, $U^* = A^{-1}U$. It follows that the GLS estimator is OLS applied to the transformed regression, which satisfies the conditions of the Gauss-Markov Theorem. That is, $E(U^*U^{*\prime} \mid X^*) = I$. In short,
$$\hat\beta^{GLS} = (X^{*\prime}X^*)^{-1}(X^{*\prime}Y^*) = (X'\Omega^{-1}X)^{-1}(X'\Omega^{-1}Y)$$
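
Both claims can be verified in two lines, using only $\Omega = AA'$ and the definitions of the transformed variables. For the error covariance,

$$E(U^*U^{*\prime} \mid X) = A^{-1}E(UU' \mid X)(A^{-1})' = A^{-1}AA'(A')^{-1} = I.$$

For the estimator, note that $(A^{-1})'A^{-1} = (AA')^{-1} = \Omega^{-1}$, so

$$X^{*\prime}X^* = X'(A^{-1})'A^{-1}X = X'\Omega^{-1}X, \qquad X^{*\prime}Y^* = X'(A^{-1})'A^{-1}Y = X'\Omega^{-1}Y.$$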

27 (Optional) Stata and GLS
For instance, the Stata command
prais y x
reports the GLS estimator assuming the error is an AR(1) process, and so is serially correlated. You use the command sureg to obtain the GLS estimator for the SUR model. Alternatively, you can generate the transformed variables and fit the transformed regression using OLS.

OLS is inconsistent when applied to a simultaneous equation model (SEM). The 2SLS estimator is consistent. The ILS estimator is also consistent, but requires the delta method to obtain the variance and standard error. The benefit of considering SEM is tremendous in that instrumental variables are readily indicated by the structural model!