Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each of these statements, indicate whether the statement is true or false. 1. If x and y are independent, then cov(x, y) = 0. 2. If x and y are independent, then E(y x) = E(y). 3. If cov(x, y) = 0 then x and y are independent. 4. If cov(x, y) = 0 then E(y x) = E(y). 5. If E(y x) = E(y) then cov(x, y) = 0. 6. If E(y x) = E(y), then x and y are independent. No need to prove, just identify which statements are true and which are false. b) Which of the following problems lead to OLS being inconsistent? If the problem does not affect consistency, name one other negative consequence. 1. Strong correlation between the explanatory variables. 2. Endogeneity of one or more explanatory variables. 3. Heteroskedasticity. c) Let a and b be two independent random variables. Prove 1 that cov(a, ab) = E(b)var(a) 1 Your proof can use any of the following intermediate results: 1. If a and b are independent, g(a) and h(b) are independent for any functions g and h. 2. If a and b are independent, then E(ab) = E(a)E(b). 3. For any x and y, cov(x, y) = E(xy) E(x)E(y). 1
ECON 835, Fall 2010 2 d) Let Y be an n L matrix of outcomes, let X an n k matrix of explanatory variables, and suppose we are interested in estimating a model of the form: where B is a k L matrix of coefficients and Y = XB + U E(U X) = 0 This is called a system regression model. Let the system OLS estimator of B be: ˆB = (X X) 1 X Y (where we assume X X has rank k). Prove that E( ˆB) = B. 2 Instrumental variables with heterogeneity Suppose we are interested in measuring the effect of some scalar variable x i on some scalar outcome y i. We are concerned that x i is endogenous, so we have found some exogenous and relevant scalar instrument z i. We estimate two regressions from a random sample. Let: ˆβ OLS = cov(x ˆ i, y i ) var(x ˆ i ) be the usual OLS regression coefficient and let: be the usual IV/2SLS regression coefficient. p cov(x i, y i ) var(x i ) ˆβ IV = cov(z ˆ i, y i ) cov(z i, y i ) cov(z ˆ i, x i ) p cov(z i, x i ) However, the structural model includes a degree of heterogeneity. Suppose that individuals fall into one of two categories, which we ll call compliers and non-compliers. { 1 for compliers c i = 0 for non-compliers A person s value of c i matters in two ways. First, the instrument z i affects the value of x i only for compliers. That is: x i = α 0 + α 1 c i + α 2 c i z i + v i where E(v i c i, z i ) = 0. Second, the effect of x i on y i varies across compliers and non-compliers: y i = β 0 + β 1 c i + β 2 x i c i + β 3 x i (1 c i ) + u i where the β s are causal and so u i is not necessarily unrelated to x i. We will make several assumptions: The instrument is exogenous, i.e., E(u i z i ) = 0. The instrument is relevant, i.e., α 2 0.
ECON 835, Fall 2010 3 A person s type is independent of both x i and z i, i.e., Pr(c i = 1 z i, x i ) = Pr(c i = 1) = p This will enable you to use the result established in part (c) of question (1) of this exam. Finally, I ll point out a result that might be useful at some point. If c {0, 1}, then c 2 = c and c(1 c) = 0. Let θ = (α 0, α 1, α 2, β 0, β 1, β 2, β 3, var(z i ), var(x i ), cov(x i, u i ), p). a) Find cov(z i, x i ) in terms of the elements of θ. b) Find cov(z i, y i ) in terms of the elements of θ. c) Find cov(x i, y i ) in terms of the elements of θ. d) Find plim ˆβ IV in terms of the elements of θ. e) Find plim ˆβ OLS in terms of the elements of θ. f) The marginal effect of x i on y i is defined as the partial derivative: ME i = y i x i = β 2 c i + β 3 (1 c i ) IV Show that ˆβ consistently estimates E(ME i c i = 1), the average marginal effect among compliers (when z and x are binary, this is called the local average treatment effect or LATE). g) Show that if we assume cov(x i, u i ) = 0, then ˆβ OLS consistently averages the E(ME i ), the average marginal effect in the population. 3 Panel data with large T and small n Suppose we have a panel data set on a small number of individuals numbered i = 1,..., n observed over many time periods t = 1,..., T. In this case, we use asymptotics based on T going to infinity instead of n. This requires we deal with time series issues that could be ignored when n was large. We have a standard fixed effects model: y it = a i + βx it + u it for all i and t where we assume strict exogeneity: E(u it a i, x i1, x i2,...) = 0 for all i and t a) Prove that: β = cov( y it, x it ) var( x it ) b) The result in part (a) suggests that we can consistently estimate β by an OLS regression of y it on x it, just like in the usual large n, small T case. But in order for OLS to be consistent in our small n, large T case, we also need for y it and x it to be stationary and ergodic, so that cov( y, ˆ x) p cov( y, x) and var( x) ˆ p var( x). For the remainder of this problem, assume that: u it is a stationary and ergodic stochastic process with mean zero (we already assumed that) and finite second moments (variances and covariances). Note that this implies:
ECON 835, Fall 2010 4 u it is covariance stationary, i.e.: u it is asymptotically uncorrelated, i.e.: E(u it ) = µ u (in this case µ u = 0) cov(u it, u it+k ) = σ u (k) where σ u (k) < lim cov(u it, u it+k ) = 0 k You can take these implications as given for the rest of the problem. We can write: x it = x i + ɛ it where x i is a random variable and ɛ it is a stationary and ergodic stochastic process with finite second moments, and mean zero conditional on (a i, x i ): E(ɛ it a i, x i ) = 0 Again, this will imply that ɛ it is covariance stationary and asymptotically uncorrelated. All variances are finite and strictly positive, and a i is not an exact linear function of x i (these assumptions are made mostly for convenience). Prove that under these assumptions x it and y it are covariance stationary. c) Prove that under these assumptions neither y it nor x it is asymptotically uncorrelated. d) Prove that under these assumptions both x it and y it are asymptotically uncorrelated. e) Under our assumptions about stationarity and ergodicity (along with a few additional conditions) we will have: Take those results 2 as given, and let: Find plim T (â i a i ). plim T ˆβ = β plim T ū i = 0 â i = ȳ i ˆβ x i 4 Plausibly exogenous instruments The standard theory justifying the use of instrumental variables to estimate structural or causal models requires the assumption that we are certain that the instruments are exogenous. In practice, researchers often use instruments that are only what some have called plausibly exogenous. That is, we have no particular reason to believe the instruments are not exogenous, so we hope any deviation from exogeneity is probably small enough to ignore. We will analyze this situation in a simple setting. Suppose we have a random sample of size n on a scalar outcome y i, a scalar explanatory variable x i, and a scalar instrument z i. Suppose that our structural model is: y i = β 0 + β 1 x i + u i 2 Note that I did not tell you that ȳ i p E(y it ) or x i p E(x it ). In fact, we we proved in (c) that y it and x it were not ergodic, so there is no reason to expect those to hold.
ECON 835, Fall 2010 5 where β 1 has a causal interpretation and so x i and u i are potentially correlated. instrumental variable z i that is relevant: cov(x i, z i ) 0 but not necessarily exogenous. Use the following notation in answering this question: We have a candidate ρ x,u = corr(x i, u i ) ρ z,u = corr(z i, u i ) ρ x,z = corr(x i, z i ) σ 2 u = var(u i ) σ 2 x = var(x i ) σ 2 z = var(z i ) To keep the problem simple, assume ρ x,u 0, ρ z,u 0, and 0 ρ x,z < 1. a) Let 1 = cov(x ˆ i, y i ) var(x ˆ i ) ˆβ OLS OLS be the coefficient from the OLS regression of y i on x i. Find plim ( ˆβ 1 β 1 ) in terms of (ρ x,u, ρ z,u, ρ x,z, σ u, σ x, σ z ). b) Let 1 = cov(z ˆ i, y i ) cov(z ˆ i, x i ) ˆβ IV IV be the coefficient from the IV regression. Find plim ˆβ 1 β 1 in terms of (ρ x,u, ρ z,u, ρ x,z, σ u, σ x, σ z ). c) Find a condition on (ρ x,u, ρ z,u, ρ x,z ) under which the inconsistency (i.e. the difference between the probability limit of an estimator and the quantity it is estimating) of IV is less than that of OLS. d) Although ρ x,u and ρ z,u are not identified, we can estimate ρ x,z. Suppose you estimate of ρ x,z is about 0.2, and you would like to put the following sentence into your paper: Our IV regression will be inconsistent if our instrument fails to be exogenous, but it will be less inconsistent than the OLS regression as long as the instrument s correlation with unobservables is less than times the explanatory variable s correlation with unobservables Fill in the blank.