Instrumental Variables Regression, GMM, and Weak Instruments in Time Series

Similar documents
Weak Identification & Many Instruments in IV Regression and GMM, Part I

Weak Instruments, Weak Identification, and Many Instruments, Part II

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Lecture 11 Weak IV. Econ 715

What s New in Econometrics. Lecture 13

GMM, Weak Instruments, and Weak Identification

Imbens/Wooldridge, Lecture Notes 13, Summer 07 1

Inference with Weak Instruments

Identification of the New Keynesian Phillips Curve: Problems and Consequences

Asymptotic Distributions of Instrumental Variables Statistics with Many Instruments

A Robust Test for Weak Instruments in Stata

1 Procedures robust to weak instruments

Approximate Distributions of the Likelihood Ratio Statistic in a Structural Equation with Many Instruments

A CONDITIONAL LIKELIHOOD RATIO TEST FOR STRUCTURAL MODELS. By Marcelo J. Moreira 1

Weak Instruments in IV Regression: Theory and Practice

Instrumental Variables Estimation in Stata

Central Bank of Chile October 29-31, 2013 Bruce Hansen (University of Wisconsin) Structural Breaks October 29-31, / 91. Bruce E.

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Testing for Weak Instruments in Linear IV Regression

Likelihood Ratio Based Test for the Exogeneity and the Relevance of Instrumental Variables

OPTIMAL TWO-SIDED INVARIANT SIMILAR TESTS FOR INSTRUMENTAL VARIABLES REGRESSION DONALD W. K. ANDREWS, MARCELO J. MOREIRA AND JAMES H.

What s New in Econometrics. Lecture 15

ROBUST CONFIDENCE SETS IN THE PRESENCE OF WEAK INSTRUMENTS By Anna Mikusheva 1, MIT, Department of Economics. Abstract

ROBUST CONFIDENCE SETS IN THE PRESENCE OF WEAK INSTRUMENTS By Anna Mikusheva 1, MIT, Department of Economics. Abstract

Testing the New Keynesian Phillips Curve without assuming. identification

Conditional Linear Combination Tests for Weakly Identified Models

Improved Inference in Weakly Identified Instrumental Variables Regression

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Likelihood Ratio based Joint Test for the Exogeneity and. the Relevance of Instrumental Variables

Location Properties of Point Estimators in Linear Instrumental Variables and Related Models

A more powerful subvector Anderson Rubin test in linear instrumental variables regression

Specification Test for Instrumental Variables Regression with Many Instruments

1 Motivation for Instrumental Variable (IV) Regression

Exponential Tilting with Weak Instruments: Estimation and Testing

Department of Econometrics and Business Statistics

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Weak Instruments and the First-Stage Robust F-statistic in an IV regression with heteroskedastic errors

A more powerful subvector Anderson and Rubin test in linear instrumental variables regression. Patrik Guggenberger Pennsylvania State University

Econometrics Summary Algebraic and Statistical Preliminaries

Chapter 1. GMM: Basic Concepts

Instrumental Variables Estimation and Weak-Identification-Robust. Inference Based on a Conditional Quantile Restriction

Econometrics. Week 4. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Comments on Weak instrument robust tests in GMM and the new Keynesian Phillips curve by F. Kleibergen and S. Mavroeidis

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Regression. Saraswata Chaudhuri, Thomas Richardson, James Robins and Eric Zivot. Working Paper no. 73. Center for Statistics and the Social Sciences

A New Paradigm: A Joint Test of Structural and Correlation Parameters in Instrumental Variables Regression When Perfect Exogeneity is Violated

Symmetric Jackknife Instrumental Variable Estimation

Exogeneity tests and weak identification

Non-linear panel data modeling

Volume 30, Issue 1. Measuring the Intertemporal Elasticity of Substitution for Consumption: Some Evidence from Japan

Economic modelling and forecasting

Instrument endogeneity, weak identification, and inference in IV regressions

Consistency without Inference: Instrumental Variables in Practical Application *

Generalized Method of Moment

8. Instrumental variables regression

Increasing the Power of Specification Tests. November 18, 2018

Unbiased Instrumental Variables Estimation Under Known First-Stage Sign

Econometrics of Panel Data

Robust Two Step Confidence Sets, and the Trouble with the First Stage F Statistic

Introduction to Econometrics

IV and IV-GMM. Christopher F Baum. EC 823: Applied Econometrics. Boston College, Spring 2014

Economics 308: Econometrics Professor Moody

Subsampling Tests of Parameter Hypotheses and Overidentifying Restrictions with Possible Failure of Identification

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

Testing Overidentifying Restrictions with Many Instruments and Heteroskedasticity

Introductory Econometrics

ECON Introductory Econometrics. Lecture 16: Instrumental variables

Joint Estimation of Risk Preferences and Technology: Further Discussion

Implementing Conditional Tests with Correct Size in the Simultaneous Equations Model

Partial Identification and Confidence Intervals

Understanding Regressions with Observations Collected at High Frequency over Long Span

A CONDITIONAL-HETEROSKEDASTICITY-ROBUST CONFIDENCE INTERVAL FOR THE AUTOREGRESSIVE PARAMETER. Donald Andrews and Patrik Guggenberger

Econ 510 B. Brown Spring 2014 Final Exam Answers

Weak Identification in Maximum Likelihood: A Question of Information

ECON3327: Financial Econometrics, Spring 2016

Econometric Analysis of Cross Section and Panel Data

New Keynesian Phillips Curves, structural econometrics and weak identification

Instrumental Variables, Simultaneous and Systems of Equations

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Econ 583 Final Exam Fall 2008

Online Appendix. j=1. φ T (ω j ) vec (EI T (ω j ) f θ0 (ω j )). vec (EI T (ω) f θ0 (ω)) = O T β+1/2) = o(1), M 1. M T (s) exp ( isω)

Testing for Weak Identification in Possibly Nonlinear Models

Applied Statistics and Econometrics. Giuseppe Ragusa Lecture 15: Instrumental Variables

Implementing weak-instrument robust tests for a general class of instrumental-variables models

A more powerful subvector Anderson Rubin test in linear instrumental variable regression

Subsets Tests in GMM without assuming identi cation

Just How Sensitive are Instrumental Variable Estimates?

Birkbeck Working Papers in Economics & Finance

On the finite-sample theory of exogeneity tests with possibly non-gaussian errors and weak identification

Econometrics Honor s Exam Review Session. Spring 2012 Eunice Han

G. S. Maddala Kajal Lahiri. WILEY A John Wiley and Sons, Ltd., Publication

Chapter 7. Hypothesis Tests and Confidence Intervals in Multiple Regression

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Split-Sample Score Tests in Linear Instrumental Variables Regression

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

LECTURE 10: NEYMAN-PEARSON LEMMA AND ASYMPTOTIC TESTING. The last equality is provided so this can look like a more familiar parametric test.

The Exact Distribution of the t-ratio with Robust and Clustered Standard Errors

This chapter reviews properties of regression estimators and test statistics based on

Advanced Econometrics

Transcription:

009 Granger Lecture Granger Center for Time Series The University of Nottingham Instrumental Variables Regression, GMM, and Weak Instruments in Time Series James H. Stock Department of Economics Harvard University June 11, 009 1

NBER-NSF Time Series Conference, SMU, Sept. 005

Outline 1) IV regression and GMM ) Problems posed by weak instruments 3) Detection of weak instruments 4) Some solutions to weak instruments 5) Current research issues & literature 3

1) IV Regression and GMM Philip Wright (198) needed to estimate the supply equation for butter: Or: Q t butter butter ln( ) = β 0 + β 1 ln( ) + u t P t y t = Y t β + u t, (generic notation) β 1 = price elasticity of supply of butter OLS is inconsistent because ln( butter P t ) is endogenous: price and quantity are determined simultaneously by supply and demand Philip Wright (1861-1934), MA Harvard, Econ, 1887 Lecturer, Harvard, 1913-1917 4

Figure 4, p. 96, Wright (196), Appendix B 5

Derivation of the IV estimator in P. Wright (198, p. 314) Now multiply each term in this equation by A (the corresponding deviation in the price of a substitute) and we shall have: ea P = A O A S 1. Suppose this multiplication to be performed for every pair of price-output deviations and the results added, then: e A P = A O A A O A S S1 or e = 1. A P But A was a factor which did not affect supply conditions; hence it is uncorrelated A O with S 1 ; hence A S1 = 0; and hence e =. A P In modern notation: for an exogenous instrument, i.e. Z t s.t. EZ t u t = 0, Z (y Yβ) = Z y Z Yβ = z u, but Ez u = 0, so EZ y EZ Yβ = 0, which suggests: ˆ Zy β TSLS = ZY 6

Classical IV regression model & notation Equation of interest: y t = Y t β + u t, m = dim(y t ) k exogenous instruments Z t : E(u t Z t ) = 0, k = dim(z t ) Auxiliary equations: Y t = Π Z t + v t, corr(u t,v t ) = ρ (vector) Sampling assumption: (y t, Y t, Z t ) are i.i.d. (for now) Equations in matrix form: y = Yβ + u ( second stage ) Y = ZΠ + v ( first stage ) Comments: We assume throughout the instrument is exogenous (E(u t Z t ) = 0) Included exogenous regressors have been omitted without loss of generality Auxiliary equation is just the projection of Y on Z 7

Generalized Method of Moments GMM notation and estimator: GMM error term (G equations): h(y t ;θ); θ 0 = true value Note: In the linear model, h(y t ;θ) = y t θ Y t Errors times k instruments: φ t (θ) = G 1 k 1 hy (, θ ) Z Moment conditions - k instruments: Eφ t (θ) = E[ hy (, θ ) Z] = 0 GMM objective function: S T (θ) = t 0 t G 1 k 1 t T T 1/ 1/ T φt( θ) WT T φt( θ) t= 1 t= 1 GMM estimator: ˆ θ minimizes S T (θ) Efficient (infeasible) GMM: W T = Ω 1, Ω = π S φ ( ) (0) t θ CUE (Hansen, Heaton, Yaron 1996): W T = ˆΩ (θ) 1 (each θ) 0 t 8

Weak instruments: four examples Example #1 (time-series IV): Estimating the elasticity of intertemporal substitution, linearized Euler equation e.g. Campbell (003), Handbook of Economics of Finance Δc t+1 = consumption growth, t to t+1 r i,t+1 = return on i th asset, t to t+1 Log-linearized Euler equation moment condition: E t (Δc t+1 τ i ψr i,t+1 ) = 0 where ψ = elasticity of intertemporal substitution (EIS) 1/ψ = coeff. of relative risk aversion under power utility Resulting IV estimating equation: E[(Δc t+1 τ i ψr i,t+1 )Z t ] = 0 (or use Z t 1 because of temporal aggregation) 9

EIS estimating equations: Δc t+1 = τ i + ψr i,t+1 + u i,t+1 (a) or r i,t+1 = μ i + (1/ψ)Δc t+1 + η i,t+1 (b) Under homoskedasticity, standard estimation is by the TSLS estimator in (a) or by the inverse of the TSLS estimator in (b). Findings in literature (e.g. Campbell (003), US data): regression (a): 95% TSLS CI for ψ is (-.14,.8) regression (b): 95% TSLS CI for 1/ψ is (-.73,.14) What is going on? 10

Example # (cross-section IV): Angrist-Kreuger (1991), What are the returns to education? Example #3 (linear GMM): Hybrid New Keynesian Phillips Curve e.g. Gali and Gertler (1999), where x t = labor share; see survey by Kleibergen and Mavroeidis (008). Example #4 (nonlinear GMM): Estimating the elasticity of intertemporal substitution, nonlinear Euler equation Hansen, Heaton, Yaron (1996), Stock & Wright (000), Neely, Roy, & Whiteman (001) 11

Working definition of weak identification θ is weakly identified if the distributions of GMM or IV estimators and test statistics are not well approximated by their standard asymptotic normal or chi-squared limits because of limited information in the data. Departures from standard asymptotics are what matters in practice The source of the failures is limited information, not (for example) heavy tailed distributions, near-unit roots, unmodeled breaks, etc. The focus is on large T. Throughout, we assume instrument exogeneity 1

Why do weak instruments cause problems? IV regression with one Y and a single irrelevant instrument βˆ TSLS = Zy ZY = Z ( Y β +u) ZY = β + Zu ZY If Z is irrelevant (as in Bound et. al. (1995)), then Y = ZΠ + v = v, so T 1 ˆ Zu Ztut β TSLS β = Zv = d T t= 1 z u 1 T z, where zu ~ σu σuv N 0, v z v Ztv σuv σv t T t= 1 Comments: ˆ TSLS β isn t consistent (nor should it be!) Distribution of ˆ β TSLS is Cauchy-like (ratio of correlated normals) (Choi & Phillips (199)) 13

The distribution of ˆ β TSLS is a mixture of normals with nonzero mean: write z u = δz v + η, η z, where δ = σ uv / σ. Then z z u v δ z + η z = v v = δ + v η η, and zv ~ N(0, z v zv σ ) so the asymptotic distribution of βˆ TSLS β 0 is the mixture of normals, ˆ TSLS β (β 0 + δ) d η v η z v σ N(0, ) f z ( z ) v v dzv (1 irrelevant instrument) z heavy tails (mixture is based on inverse chi-squared) center of distribution of ˆ TSLS β is β 0 + δ. But ˆ OLS β β 0 = so Zu ZZ ˆ β TSLS plim( ˆ β OLS ) / T / T d = vu vv / / T T σ η zv σ p uv σ v = δ, so plim( ˆ OLS β ) = β 0 + δ, N(0, ) f z ( z ) v v dzv (1 irrelevant instrument) 14

The unidentified and strong-instrument distributions are two ends of a spectrum. Distribution of the TSLS t-statistic (Nelson-Startz (1990a,b)): Dark line = irrelevant instruments; dashed light line = strong instruments; intermediate cases: weak instruments. The key parameter is: μ = Π Z ZΠ/ σ (concentration parameter) v = k (numerator) noncentrality parameter of first-stage F statistic 15

Weak instrument asymptotics bridges this spectrum. Adopt nesting that makes the concentration parameter tend to a constant as the sample size increases by setting Π = C/ T (weak instrument asymptotics) This is the Pitman drift for obtaining the local power function of the first-stage F. This nesting holds Eμ constant as T. Under this nesting, F d noncentral χ k /k with noncentrality parameter Eμ /k (so F = O p (1)) Letting the parameter depend on the sample size is a common ways to obtain good approximations e.g. local to unit roots (Bobkoski 1983, Cavanagh 1985, Chan and Wei 1987, and Phillips 1987) 16

Weak IV asymptotics for TSLS estimator, 1 included end ogenous vble: ˆ β TSLS β 0 = (Y P Z u)/(y P Z Y) Now where Y P Y = Z λ = = = d 1 ( ZΠ+ v) Z ZZ Z ( ZΠ+ v) T T T 1/ 1/ ΠZZ vz ZZ ZZ ZZ Π Zv + + T T T T T T 1/ 1/ 1/ ZZ vz ZZ ZZ ZZ C + C+ T T T T T (λ + z v) (λ + z v), 1/ CQ ZZ, Q ZZ = EZ t Z t, and z σ u z ~ u N 0, v σ σ uv uv σv 1/ Zv T 17

Similarly, so Y P Z u = = ˆ TSLS β 1 ( ZΠ+ v) Z Z Z Zu ) T T T 1 ZZ vz ZZ Zu C + T T T T d (λ + z v ) z u, β 0 d ( λ + zv) zu ( λ + z )( λ + z ) Under weak instrument asymptotics, μ p C Q ZZ C/ Unidentified special case: ˆ β TSLS β 0 Strong IVs: λ λ ( ˆ β TSLS β 0 ) v d v λ zu λ λ z z z z d v v u v σ = λ λ/ σ v (obtained earlier) ~ N(0, σ ) (standard limit) u v 18

Summary o f weak IV asymptotic results: Resulting asymptotic distributions are the same as in the exact normal classical model with fixed Z but with known covariance matrices. Weak IV asymptotics yields good approximations to sampling distributions uniformly in μ for T moderate or large. Under this nestin g: o IV estimators are not consistent, are nonnormal o Test statistics (including the J-test of overidentifying restrictions) do not have normal or chi-squared distributions o Conventional confidence intervals do not have correct coverage Because μ is unknown, these distributions can t be used directly in practice to obt ain a corrected distribution for purposes of inference 19

3) Detection of weak instruments How weak is weak? Need a cutoff value for μ ˆ TSLS β β 0 d ( λ + zv) zu, ( λ+ z )( λ+ z ) where μ = Π Z ZΠ/ σ v (concentration parameter) p v v = k (numerator) noncentrality parameter of first-stage F λ λ/ σ For what values of μ does Various procedures: v ( λ + zv) zu ( λ+ z )( λ+ z ) v v λ zu λ λ First stage F > 10 rule of thumb (Staiger-Stock (1997)) Stock-Yogo (005a) relative bias method (approximately yields F>10) Stock-Yogo (005a) size method Hahn-Hausman (003) test Other methods (R, partial R, Shea (1997), etc.)? 0

TSLS relative bias cutoff method (Stock-Yogo (005a)) Some background: The relative squared normalized bias of TSLS to OLS is, ( Eβˆ β)'σ ( Eβˆ β) IV IV YY B n = OLS Eˆ YY OLS Eˆ ( β β)'σ ( β β) The square root of the maximal relative squared asymptotic bias is: max B = max ρ: 0 < ρ ρ 1 lim n B n, where ρ = corr(u t,v t ) This maximization problem is a ratio of quadratic forms so it turns into a (generalized) eigenvalue problem; algebra reveals that the solution to this eigenvalues problem depends only on μ /k and k; this yields the cutoff μbias. 1

Critical values One included endogenous regressor The 5% critical value of the test is the 95% percentile value of the noncentral χ k /k distribution, with noncentrality parameter μ bias/k Multiple included endogenous regressors The Cragg-Donald (1993) statistic is: ˆΣ VV ˆΣ VV / 1/ 1/ g min = mineval(g T ), where G T = Y P Z Y k, G T is essentially a matrix first stage F statistic Critical values are given in Stock-Yogo (005a) Software STATA (ivreg),

5% critical value of F to ensure indicated maximal bias (Stock-Yogo, 005a) To ensure 10% maximal bias, need F < 11.5; F < 10 is a rule of thumb 3

Other methods for detecting weak instruments R, partial R, or adjusted R None of these are a good idea, more precisely, what needs to be large is the concentration parameter, not the R. An R =.10 is small if T = 50 but is large if T = 5000. The first-stage R is especially uninformative if the first stage regression has included exogenous regressors (W s) because it is the marginal explanatory content of the Z s, given the W s, that matters. Hahn-Hausman (003) test Idea is to test the null of strong instruments, under which the TSLS estimator, and the inverse of the TSLS estimator from the reverse regression, should be the same Unfortunately the HH test is not consistent against weak instruments (power of 5% level test depends on parameters, is typically 15-0% (Hausman, Stock, Yogo (005)) 4

4) Some Solutions to Weak Instruments There are two approaches to improving inference (providing tools): Fully robust methods: Inference that is valid for any value of the concentration parameter, including zero, at least if the sample size is large, under weak instrument asymptotics o For tests: asymptotically correct size (and good power!) o For confidence intervals: asymptotically correct coverage rates o For estimators: asymptotically unbiased (or median-unbiased) Partially robust methods: Methods are less sensitive to weak instruments than TSLS e.g. bias is small for a large range of μ 5

Fully Robust Testing Approach #1: use a worst case (over all possible values of μ ) critical value for the TSLS t-stat o leads to low-power procedures Approach #: use a statistic whose distribution does not depend on (two such statistics are known) Approach #3: use statistics whose distribution depends on μ, but compute the critical values as a function of another statistic that is sufficient for μ under the null hypothesis. Approach #4: optimal nonsimilar tests (subsumes 1-3) μ 6

Approach #: Tests that are valid unconditionally linear IV case (that is, the distribution of the test statistic does not depend on μ ) The Anderson-Rubin (1949) test Consider H 0 : β = β 0 in y = Yβ + u, Y = ZΠ + v The Anderson-Rubin (1949) statistic is the F-statistic in the regression y Yβ 0 on Z. ( y Yβ0) PZ ( y Yβ0)/ k AR(β 0 ) = ( y Yβ ) M ( y Yβ )/( T k) 0 Z 0 of 7

Comments ( y Yβ0) PZ ( y Yβ0)/ k AR(β 0 ) = ( y Yβ ) M ( y Yβ )/( T k) ˆ TSLS 0 Z 0 AR( β ) = Hansen s (003) J-statistic Null distribution doesn t depend on μ : Under the null, y Yβ0 = u, so u PZ u/ k AR = ~ F k,n k if u t is normal u u/( ) d AR M T k Z χ k /k if u t is i.i.d. and Z t u t has moments (CLT) The distribution of AR under the alternative depends on μ more information, more power (of course) Difficult to interpret: rejection arises for two reasons: β 0 is false or Z is endogenous Power loss relative to other tests; inefficient under strong instruments 8

Kleibergen s (00) LM test Kleibergen developed an LM test that has a null distribution that is doesn t depend on μ. χ 1 - Fairly easy to implement Is efficient if instruments are strong Has very strange power properties (we shall see) Its power is dominated by the conditional likelihood ratio test 9

Approach #3: Conditional tes ts Conditional tests have rejection rate 5% for all points under the null (β 0, μ ) ( similar tests ). Moreira (003): LR tests β = β u ing the LIML likelihood: 0 s LR = max β log-likelihood( β) log-likelihood(β 0 ) Q T is sufficient for μ under the null Thus the distribution of LR Q n μ T does not depend o under the null Thus valid inference can be conducted using the quantiles of that is, use critical values which are a function of Q T LR Q T, 30

Moreira s (003) conditional lik elihood ratio (CLR) test LR = max β log-likelihood(β) log-likelihood(β 0 ) After some algebra, this becomes: = ½{ ˆQ ˆ + [( ˆQ ˆQ ) + 4 ˆQ ] 1/ } LR S QT S T ST where Ĵ 0 = ˆQ = ˆΩ = Ωˆ Qˆ ˆ S Q ST = Ĵ 0 Ω ˆ ˆ ˆ 1/ Y + P Z Y + ˆΩ 1/ Ĵ 0 QST QT M Y Y /(T k), Y + = (y Y) + + Z b Ωˆ a 1/ 1/ 0 0 b Ωˆb a Ωˆ a 1 0 0 0 0, b 0 = 1 β 0 a 0 = β0 1. 31

CLR test: Comments More powerful than AR or LM In fact, effectively uniformly most powerful among asymptotically efficient similar tests that are invariant to rotations of the instruments ( Andrews, Moreira, Stock (006)) STATA (condivreg), Gauss code for computing LR and conditional p- values exists but.. Only developed (so far) for a single included endogenous regressor As written here, the software requires homoskedastic errors; extensions to heteroskedasticity and serial correlation have been developed but are not in common statistical software 3

Approach #4: Nonsimilar tests (Andrews, Moreira, Stock (008)) Polar coordinate transform (Hillier (1990), Chamberlain (005)): Mapping: r = λ λh h, h = c d β β = ( ββ - 0)/ b0 Ωb0 1 1 a a / a Ω Ω a sinθ x(θ) = cosθ = h/ hh ' (so x(θ 0) = β θ β 0 0 < β 0 (> β 0 ) < 0 (> 0) 0 0 0 0 1 ) θ = lim β cos 1 [d β /(h h) 1/ ] - θ = θ π 33

Compound null hypothesis and two-sided alternative: H 0 : 0 r <, θ = 0 vs. H 1 : r = r 1, θ = ±θ 1 (*) Strategy (follows Lehmann (1986)) 1. Null: transform compound null into point null via weighs Λ: h (q) = f ( ;, ) ( ) Q q r θ dλ r Λ 0. Alterna tive: transform into point alternative via equal weighting of (r 1, ±θ 1 ) (this is a necessary but not sufficient condition for nonsimilar tests to be AE): g(q) = 1 fq( q; r, θ ) + fq( q; r, θ ). 34

3. Point optimal invariant test of h Λ vs. g: from Neyman -Pearson Lemma, reject if NP ( q) = Λ, r, θ 1 1 gq ( ) h ( q ) = 1 fq( q; r, θ ) + fq( q; r, θ ) h ( q) Λ Λ > κ Λ, r, θ ; α 1 1 4. Least favorable distribution Λ: NP ( q) Λ, r, θ 1 1 distribution if Λ is least favorable, that is, if sup Pr NP ( q) 0 r, θ = 0 Λ, r1, θ1 Λ, r1, θ1; α < r > κ = α is POINS for the original 5. POINS Power envelope. The PE of POINS tests of (*) is the envelope of power functions of NP ( q ), where Λ LF is the least favorable distribution Λ LF, r, θ 1 1 35

A closed form, POINS test of θ = 0 (using theoretical results on one-point least favorable distributions + Bessel function approximations): * P θ = r1, 1 cosh r ( D ) r, θ 1 * sin 1 1 θ 1 1 where ( ) ν 1 ν + z z ( ν + z ) z z z z z 1/4 / 1/4 ν / 0 0 φ φ 0 0 φ1 φ0 e 1/4 1/4 ν / 1 1 1 1 * 1 0 Dr 1, θ = e + 1 ( ) ν / ν + ( ν + ) φ 0 = ν z + z + ν ln ν + ν + z 0 0 0 ( e tc.), ν = (k )/, Numerical search over r 1, θ 1 resulted in r = 0k and θ 1 = π/4 1 36

Andrews, Moreira and Stock (008), Figures /3 Upper and lower bound on power envelope for nonsimilar invariant tests against (r, θ ) and power envelope for similar invariant tests against (r, θ ), 0 θ π/, r / k = 0.5, 1,, 4, 8, 16, 3, 64; k = 5 37

38

39

40

41

4

43

44

45

Figure 4 Power envelope for similar invariant tests against (r, θ ) and power functions of the CLR, LM, and AR tests, 0 θ π/, r / k = 1, 4, 8, 3, k = 5 46

Figure 5 Power functions of the CLR, P *B, and P * tests (in which π/4), for 0 θ π/, r / k = 1, 4, 8, 3, and k = 5 r 1 = 0k and θ 1 = 47

Confidence Intervals linear IV case Dufour (1997) impossibility result for Wald intervals Valid intervals come from inverting valid tests (1) Inversion of AR test: AR Confidence Intervals 95% CI = {β 0 : AR(β 0 ) < F k,t k;.05 } For m = 1, this entails solving a quadratic equation: ( y Yβ0) PZ ( y Yβ0)/ k AR(β 0 ) = < F k,t k;.05 ( y Yβ ) M ( y Yβ )/( T k) 0 Z 0 For m > 1, solution can be done by grid search or using methods in Dufour and Taamouti (005) Sets for a single coefficient can be computed by projecting the larger set onto the space of the single coefficient (see Dufour and Taamouti (005)), also see recent work by Kleibergen (008) Intervals can be empty, unbounded, disjoint, or convex 48

() Inversion of CLR test: CLR Confidence Intervals 95% CI = {β 0 : LR(β 0 ) < cv.05 (Q T )} where cv.05 (Q T ) = 5% conditional critical value Comments: Efficient GAUSS and STATA (condivreg) software Will contain the LIML estimator (Mikusheva (005)) Has certain optimality properties: nearly uniformly most accurate invariant; also minimum expected length in polar coordinates (Mikusheva (005)) Only available for m = 1 49

What about the bootstrap or subsampling? A straightforward bootstrap algorithm for TSLS: y t = β Y t + u Y t = Π Z t + v t i) Estimate β, Π by ˆ β TSLS t, ˆΠ ii) Compute the residuals u ˆt, v ˆt iii) Draw T errors and exogenous variables from { u ˆt, v ˆt, Z t }, and construct bootstrap data y t, Y t using ˆ β TSLS, ˆΠ iv) Compute TSLS estimator (and t-statistic, etc.) using bootstrap data v) Repeat, and compute bias-adjustments and quantiles from the boostrap distribution, e.g. bias = bootstrap mean of ˆ β TSLS ˆ β using actual data Under strong instruments, this algorithm works (provides second-order improvements). TSLS 50

Bootstrap, ctd. Under weak instruments, this algorithm (or variants) does not even provide first-order valid inference The reason the bootstrap fails here is that ˆΠ is used to compute the bootstrap distributio n. The true pdf depen ds on μ, say f TSLS ( β ˆ TSLS ;μ ) (e.g. Rothenberg (1984 exposition abov e, or weak instrument asymptotics). By using ˆΠ, μ is estimated, say by ˆ μ. The bootstrap correctly estimates f ˆ β TSLS TSLS TSLS ( ; ˆμ ), but f TSLS ( ˆ β ; ˆ μ ) TSLS f TSLS ( ˆ β ;μ ) because ˆ μ is not consistent for μ. This story might sound familiar it is the same reason the bootstrap fails in the unit root model, and in the local-to-unity model. Subsampling for these (non-pivotal) statistics doesn t work either; see Andrews and Guggenberger (007a,b). 51

Some remarks about estimation It isn t possible to have an estimator that is completely robust to weak instruments TSLS (-step GMM) is about the worst thing you can do from a bias perspective LIML (in GMM, CUE) has much better median bias properties but can have large (infinite) variance In the linear case, there are alternative estimators, e.g. Fuller s estimator; see Hahn, Hausman, and Kuersteine r ( JAE, 006) for an extensive MC comparison 5

Example #1: Consumption CAPM and the EIS Yogo (REStat, 004) Δc t+1 = consumption growth, t to t+1 r i,t+1 = return on i th asset, t to t+1 Moment conditions: E t (Δc t+1 τ i ψr i,t+1 ) = 0 EIS estimating equations: Δc t+1 = τ i + ψr i,t+1 + u i,t+1 ( forwards ) or r i,t+1 = μ i + (1/ψ)Δc t+1 + η i,t+1 ( backwards ) Under homoskedasticity, standard estimation is by TSLS or by the inverse of the TSLS estimator (remember Hahn-Hausman (003) test?); but with weak instruments, the normalization matters 53

First stage F-statistics for EIS (Yogo (004)): 54

Various estimates of the EIS, forward and backward 55

AR, LM, and CLR confidence intervals for ψ: 56

What about stock returns should they work? 57

Extensions to >1 included endogenous regressor CLR exists in theory, but difficult computational issues because the conditioning statistic has dimension m(m+1)/ (AMS (006), Kleibergen (007)) Can test joint hypothesis H 0 : β = β 0 using the AR statistic: ( y Yβ0) PZ ( y Yβ0)/ k AR(β 0 ) = ( y Yβ ) M ( y Yβ )/( T k) 0 Z 0 under H 0, AR d χ k /k Subsets by projection (Dufour-Taamouti (005)) or by concentration + bounds (Kleibergen and Mavroeidis (008, 009) 009 is GMM treatment) 58

Extensions to GMM (1) The GMM-Anderson Rubin statistic (Kocherlakota (1990); Burnside (1994), Stock and Wright (000)) The extension of the AR stat evaluated at θ 0 : S ( θ ) = CUE T 0 istic to GMM is the CUE objective function T T 1/ ˆ 1 1/ T φt( θ0) Ω( θ0) T φt( θ0) t= 1 t= 1 d ψ(θ 0 ) Ω(θ 0 ) 1 Ψ(θ 0 ) ~ Thus a valid test of Η 0 : θ = θ 0 can be undertaken by rejecting if S T (θ 0 ) > 5% critical value of χ k. χ k 59

GMM-Anderson-Rub in, ctd. In the homoskedastic/uncorrelated linear IV model, the GMM-AR statistic simplifies to the AR statistic (up to a degrees of freedom correction): S ( θ ) = CUE T 0 = = T T 1/ ˆ 1 1/ T φt( θ0) Ω( θ0) T φt( θ0) t= 1 t= 1 T 1 T 1/ ' 1/ T ( yt θ ZZ 0Yt) Zt sv T ( yt θ 0Yt) Zt t 1 T = t= 1 ( y Yθ0) PZ ( y Yθ0) = k AR(θ 0 ) ( y Yθ ) M ( y Yθ )/( T k) 0 Z 0 The GMM-AR statistic has the same issues of interpretation issues as the AR, specifically, the GMM-AR rejects because of endogenous instruments and/or incorrect θ The GMM-AR can fail to reject any values of θ (remember the Dufour (1997) critique of Wald tests) 60

() GMM-LM Kleibergen (005) develops score statistic (based on CUE objective function details of construction matter) that provides weakidentification valid hypothesis testing for sets of variables (3) GMM-CLR Andrews, Moreira, Stock (006) extension of CLR to linear GMM with a single included endogenous regressor, also see Kleibergen (007). Very limited evidence on performance exists; also problem of dimension of conditioning vector ( 4) Other methods Guggenberger-Smith (005) objective-function based tests based on Generalized Empirical Likelihood (GEL) objective function (Newey and Smith (004)); Guggenberger-Smith (008) generalize these to time series data. Performance is similar to CUE (asymptotically equivalent under weak instruments) 61

Confidence set s Fully-robust 95% confidence sets are obtained by inverting (are the acceptance region of) fully-robust 5% hypothesis tests Computation is by grid search in general: collect all the points θ which, when treated as the null, are not rejected by the GMM-AR statistic. Subsets by projection or by concentration + bounds (see Kleibergen and Mavroeidis (008, 009) for an application of GMM-AR confidence sets and subsets) Valid tests must be unbounded (contain Θ) with finite probability with weak instruments 6

Many instruments: a solution to weak instruments? T he appeal of using many instruments Under standard IV asymptotics, more instruments means greater efficiency. This story is not very credible because (a) the instruments you are adding might well be weak (you already have used the first two lags, say) and ( b) even if they are strong, this require s consistent estimation of increasingly many parameter to obtain the efficient projection hence slow rates of growth of the number of instruments in efficient GMM literature. 63

Example of problems with many weak instruments TS LS Recall the TSLS weak instrument asymptotic limit: ˆ β TSLS β d ( λ + z v) zu 0 ( λ + z )( λ + z ) v with the decomposition, z u = δz v + η. Suppose that k is large, and that λ λ/k Λ (one way to implement many asymptotics ). Then as k, λ z v /k p 0 and λ z u /k p 0 v weak instrument z v z v /k p 1 and z v η/k p 0 (z v and η are independent by construction) Putting these limits together, we have, as k, ( λ z ) p + v zu δ ( λ + zv)( λ + zv) 1 + Λ In the limit that Λ = 0, TSLS is consistent for the plim of OLS! 64

Comments Strictly this calculation isn t right it uses sequential asymptotics (T, then k ). However the sequential asymptotics is justified under certain (restrictive) conditions on K/T (specifically, k 4 /T 0) Typical conditions on k are k 3 /T 0 (e.g. Newey and Windmeijer (004)) Many instruments can be turned into a blessing (if they are not too weak! They can t push the scaled concentration parameter to zero) by exploiting the additional convergence across instruments. This can lead to bias corrections and corrected standard errors. There is no single best method at this point but there is promising research, e.g. Newey and Windmeijer (004), Chao and Swanson (005), and Hansen, Hausman, and Newey (006)) 65

Efficient testing in GMM with weak instruments Andrews, Moreira, Stock (006), Kleibergen (008) Many instruments Breaks in GMM with weak instruments Caner (008) Connection between weak ID, HAC estimation, and SVAR identification using long run restrictions Gospodinov (008) Weak set identification? 5) Current Research Issues & Literature Detection of weak instruments in general nonlinear GMM model Subset testing Kleibergen and Mavroides (008, 009) Improved estimation in GMM Guggenberger-Smith (005) (GEL) 66