Instrumental Variables Again 1 STA431 Winter/Spring 2017 1 See last slide for copyright information. 1 / 15
Overview 1 2 2 / 15
Remember the problem of omitted variables Example: is income, is credit card debt. Omitted explanatory variables are part of the error term. Usually they are correlated with explanatory variables that are in the model. This makes the error term correlated with. ε β c Parameters are not identifiable. Estimation and inference fail. 3 / 15
Instrumental variable method saved the day Phillip Wright, 1928 An instrumental variable (for an explanatory variable) Is related to the explanatory variable in question. Is unrelated to any error term in the model. Is connected to the response variable only through. ε 2 β 2 ψ 12 β 1 ε 1 Real estate agents: is income, is credit card debt, is median home price. Interest is in β 2. β 4 / 15
Technically everything worked great i = α 1 + β 1W i + ɛ i1 and i = α 2 + β 2 i + ɛ i2 Σ = σ 2 z β 1σ 2 z β 1β 2σ 2 z β 2 1σ 2 z + σ 2 1 β 2(β 2 1σ 2 z + σ 2 1) + c β 2 1β 2 2σ 2 z + β 2 2σ 2 1 + 2β 2c + σ 2 2 Nine moment structure equations in 9 unknown parameters. β 2 = σ 13 σ 12. All the other parameters are identifiable too. But of course there is measurement error. 5 / 15
The model needs improvement is income, is credit card debt, is median home price. Same picture: ε 2 β 2 ψ 12 β 1 ε 1 = Income is measured with error. So is = Debt. There are βstill unmeasured variables that impact them c both. 6 / 15
An improved Model is income, is credit card debt, is median home price. ω 12 e 1 e 2 Common omitted variables are affecting true and true. β 1 λ 1 Tx β 2 λ 2 Ty Common omitted variables are affecting measurement of and measurement of. Factor loadings are realistic: Positive but not = 1. Six covariance structure equations in 11 unknowns. And it s still not realistic enough. ε 1 ε 2 Housing prices are only estimated. ψ 12 7 / 15
Easier to defend, but impossible to estimate is income, is credit card debt, is median home price. ω 12 e 1 e 2? λ 1 Tx β 2 λ 2 Ty Fortunately the instrumental variable only has to be correlated with the explanatory variable. ε 1 ε 2 ψ 12 8 / 15
Here s the Model is reported income, is reported credit card debt, is estimated median resale home price. ω 12 Fairly realistic. e 1 e 2 Still six covariance structure equations in 11 unknowns (poison). Explanatory variable correlated with the error term (poison). φ 12 λ 1 λ 2 Correlated measurement errors (poison). Tx β Ty But we have an instrumental variable. Calculate the covariance matrix. c ε 9 / 15
Show part of the calculation (centered model) is estimated median resale home price, is reported credit card debt ω 12 e 1 e 2 λ 1 λ 2 φ 12 Tx β Ty c ε ψ ε Cov(, ) 1 12 = Cov(, λ 2T y + e 2) c β 3 β 1 = Cov(, λ 2(βT x + ɛ) + e 2) = E ( (λ 2βT x + λ 2ɛ + e 2) ) ε 1 β = λ 2βE(T x) + λ 2E()E(ɛ) + E()E(e 2)) = λ 2βφ 12 10 / 15
Covariance matrix of the observable data is estimated median resale home price, is reported income, is reported credit card debt cov = φ 11 λ 1 φ 12 βλ 2 φ 12 λ 2 1 φ 22 + ω 11 βλ 1 λ 2 φ 22 + cλ 1 λ 2 + ω 12 β 2 λ 2 2 φ 22 + 2 βcλ 2 2 + λ2 2 ψ + ω 22 ω 12 e 1 e 2 β is not identifiable. But φ 12 > 0 and λ 2 > 0. So the sign of β is identifiable from σ 13. λ 1 λ 2 H 0 : β = 0 is testable. φ 12 Tx β Ty It s possible to answer the basic question of the study. c ε 11 / 15
It s a miracle Instrumental variables can help with measurement error and omitted variables at the same time. If there is measurement error, regression coefficients of interest are not identifiable and cannot be estimated consistently, but their signs can. Often, that s all you really want to know. Matrix version is available. The usual rule in Econometrics is (at least) one instrumental variable for each explanatory variable. 12 / 15
Independence of the instrumental variable and error terms is critical. φ 12 e 1 λ 1 Tx c β ω 12 e 2 λ 2 Ty ε Instrumental variables need to come from another world. For example, does academic ability contribute to higher salary? Study adults who were adopted as children. is academic ability. is salary at age 40. W is measured IQ at 40. is birth mother s IQ score. ε 13 / 15
It s a partial solution Good instrumental variables are not easy to find. They will not be in a data set casually collected for other purposes. Advance planning is needed. The ultimate instrumental variable is randomly assigned. 14 / 15
Copyright Information This slide show was prepared by Jerry Brunner, Department of Statistics, University of Toronto. It is licensed under a Creative Commons Attribution - ShareAlike 3.0 Unported License. Use any part of it as you like and share the result freely. The L A TE source code is available from the course website: http://www.utstat.toronto.edu/ brunner/oldclass/431s17 15 / 15