Discussion of Sensitivity and Informativeness under Local Misspecification


Jinyong Hahn

April 4, 2019

Sensitivity

In Hahn and Hausman (2005), we consider a linear IV model
$$y_i = x_i \theta + \varepsilon_i, \qquad x_i = z_i'\pi + v_i,$$
with a local violation of the exclusion restriction:
$$\varepsilon_i = \frac{1}{\sqrt{n}} z_i'\gamma + \tilde{\varepsilon}_i.$$
The example is (a violation of) the moment problem for $0 = E[z_i (y_i - x_i \theta)]$:
$$E[z_i (y_i - x_i \theta)] = \frac{1}{\sqrt{n}} E[z_i z_i']\, \gamma.$$
(I will use 3 different interpretations of 2SLS in this discussion. The GMM interpretation is the first.) Linearization aside, it is an identical problem. (You may want to adopt the normalization $E[z_i z_i'] = I$ if you want to see a 1-1 mapping with the Andrews et al. paper.)
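A minimal simulation sketch of this setup (my own, not from the slides, with illustrative values for $\pi$ and $\gamma$ and $\Phi = I$): under the $1/\sqrt{n}$ violation, the Monte Carlo mean of $\sqrt{n}(\hat{\theta}_{2SLS} - \theta)$ should approach $(\pi'\pi)^{-1}\pi'\gamma$.

```python
# Sketch: simulate the locally misspecified IV model and compare the Monte
# Carlo asymptotic bias of 2SLS with (pi'pi)^{-1} pi'gamma.
import numpy as np

rng = np.random.default_rng(0)
n, R, theta = 2_000, 2_000, 1.0
pi = np.array([1.0, 0.5])        # illustrative first-stage coefficients
gamma = np.array([0.3, -0.2])    # illustrative direction of the violation

err = np.empty(R)
for r in range(R):
    z = rng.normal(size=(n, 2))                        # Phi = E[z z'] = I
    x = z @ pi + rng.normal(size=n)
    eps = rng.normal(size=n) + z @ gamma / np.sqrt(n)  # local violation
    y = x * theta + eps
    x_hat = z @ np.linalg.lstsq(z, x, rcond=None)[0]   # first-stage fitted values
    err[r] = (x_hat @ y) / (x_hat @ x) - theta         # 2SLS estimation error

print(np.sqrt(n) * err.mean())   # Monte Carlo asymptotic bias
print((pi @ gamma) / (pi @ pi))  # predicted bias (pi'pi)^{-1} pi'gamma = 0.16
```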

Sensitivity

The asymptotic distribution of the IV estimator using $A'z_i$ as an instrument, i.e.,
$$\hat{\theta}_A = \left( \frac{1}{n} \sum_{i=1}^n A'z_i\, x_i \right)^{-1} \left( \frac{1}{n} \sum_{i=1}^n A'z_i\, y_i \right),$$
is asymptotically biased:
$$\sqrt{n}\,(\hat{\theta}_A - \theta) \Rightarrow N\Big( \underbrace{(A'\Phi\pi)^{-1} A'\Phi\gamma}_{\text{Bias}},\; \sigma_\varepsilon^2\, (A'\Phi\pi)^{-1} (A'\Phi A) (\pi'\Phi A)^{-1} \Big),$$
where $\Phi = E[z_i z_i']$.
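The same simulation design can be used to check both moments of this limit for a deliberately non-2SLS weight $A$ (again my sketch, with $\Phi = I$ and $\sigma_\varepsilon = 1$):

```python
# Sketch: Monte Carlo mean and variance of sqrt(n)(theta_A - theta) versus the
# stated limit for an arbitrary instrument weight A (Phi = I, sigma_eps = 1).
import numpy as np

rng = np.random.default_rng(1)
n, R, theta = 2_000, 4_000, 1.0
pi, gamma = np.array([1.0, 0.5]), np.array([0.3, -0.2])
A = np.array([1.0, -1.0])        # a deliberately non-2SLS instrument weight

draws = np.empty(R)
for r in range(R):
    z = rng.normal(size=(n, 2))
    x = z @ pi + rng.normal(size=n)
    y = x * theta + rng.normal(size=n) + z @ gamma / np.sqrt(n)
    w = z @ A                                    # the instrument A'z_i
    draws[r] = np.sqrt(n) * ((w @ y) / (w @ x) - theta)

print(draws.mean(), (A @ gamma) / (A @ pi))      # bias (A'Phi pi)^{-1} A'Phi gamma = 1.0
print(draws.var(), (A @ A) / (A @ pi) ** 2)      # variance: 8.0 here
```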

Sensitivity

2SLS is the special case where $A = \pi$: if $\gamma = 0$, we usually want to minimize the variance $(A'\Phi\pi)^{-1} (A'\Phi A) (\pi'\Phi A)^{-1}$, which is achieved by choosing $A \propto \pi$, i.e., 2SLS. Hahn and Hausman (2005) consider sensitivity analysis based on
$$\frac{\partial\, \text{bias}}{\partial \gamma} = \frac{\partial}{\partial \gamma} \left[ (A'\Phi\pi)^{-1} A'\Phi\gamma \right] = (A'\Phi\pi)^{-1} A'\Phi,$$
which is in fact identical to the sensitivity in the current paper. Hahn and Hausman (2005) go further and propose to minimize $\left\| \partial\, \text{bias} / \partial \gamma \right\|^2$. Under the normalization $\Phi = I$, it is minimized when $A \propto \pi$, i.e., 2SLS.
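Because the bias is linear in $\gamma$, the sensitivity can be confirmed by brute-force differentiation; a small check with an illustrative $\Phi$ and $A$ of my choosing:

```python
# Sketch: the bias (A'Phi pi)^{-1} A'Phi gamma is linear in gamma, so numerical
# differentiation recovers the sensitivity (A'Phi pi)^{-1} A'Phi exactly.
import numpy as np

Phi = np.array([[1.0, 0.3],      # an illustrative E[z z']
                [0.3, 1.0]])
pi, A = np.array([1.0, 0.5]), np.array([1.0, -1.0])

def bias(gamma):
    return (A @ Phi @ gamma) / (A @ Phi @ pi)

sens = A @ Phi / (A @ Phi @ pi)            # (A'Phi pi)^{-1} A'Phi
h = 1e-6
num = np.array([(bias(h * e) - bias(np.zeros(2))) / h for e in np.eye(2)])
print(sens)
print(num)                                  # matches sens up to rounding
```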

Sensitivity

Although not explicitly stated in Hahn and Hausman (2005), this exercise can be (should have been) based on some (asymptotic) MSE minimization problem. With the normalization $\Phi = I$, the asymptotic distribution is
$$N\Big( (A'\pi)^{-1} A'\gamma,\; \sigma_\varepsilon^2\, (A'\pi)^{-1} (A'A) (\pi'A)^{-1} \Big),$$
with MSE equal to
$$(A'\pi)^{-1} A'\gamma\gamma' A (\pi'A)^{-1} + \sigma_\varepsilon^2\, (A'\pi)^{-1} (A'A) (\pi'A)^{-1}.$$
We can put weights (a prior) on $\gamma$ such that $E[\gamma\gamma'] \propto I$, i.e., a spherical distribution, and we would minimize
$$(A'\pi)^{-1} A'A (\pi'A)^{-1} + \sigma_\varepsilon^2\, (A'\pi)^{-1} (A'A) (\pi'A)^{-1} = C\, (A'\pi)^{-1} (A'A) (\pi'A)^{-1},$$
with solution equal to $A \propto \pi$.
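With scalar $\theta$ the prior-averaged objective collapses to $C\,(A'A)/(A'\pi)^2$, so the claim can be checked by random search (my sketch, illustrative $\pi$ and $\sigma_\varepsilon^2 = 1$):

```python
# Sketch: random search over A never beats A = pi for the prior-averaged MSE
# C * (A'A)/(A'pi)^2 (scalar-theta case, E[gamma gamma'] = I, Phi = I).
import numpy as np

rng = np.random.default_rng(3)
pi, sigma2 = np.array([1.0, 0.5]), 1.0

def avg_mse(A):
    return (1.0 + sigma2) * (A @ A) / (A @ pi) ** 2

vals = np.array([avg_mse(a) for a in rng.normal(size=(20_000, 2))])
print(vals.min())      # slightly above the bound
print(avg_mse(pi))     # = (1 + sigma2)/(pi'pi) = 1.6, the Cauchy-Schwarz bound
```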

Sensitivity

When we depart from the normalization, we would want to use a different prior (proportional to the inverse of $\Phi$), but this is similar to the various tricks used to justify the $\chi^2$-statistic in multiple hypothesis testing: a prior/weight of convenience. In order to relate the asymptotic risk to the limit of finite sample risk, one may want to use the truncation $\lim_{\zeta \to \infty} \liminf_{n \to \infty} E[\min(n\ell, \zeta)]$ discussed in Lehmann and Casella (1998) and Hansen (2016, 2017). This sort of idea would extend to the heteroscedastic model, i.e., we should expect the usual GMM to be optimal for this alternative purpose. Because everything is based on linearization, nonlinearity is just a minor complication of notation.

Informativeness

Lemma 1 is Le Cam's Third Lemma. Letting $\hat{b} = (\hat{c}, \hat{\gamma})$, which is assumed to be asymptotically linear with influence function $\phi$, it shows what will happen under the local deviation characterized by the score $s$. Under the local alternative, the asymptotic bias is $\tilde{c} = E[\phi s]$, and the asymptotic variance remains $\Sigma = E[\phi\phi']$. Lemma 2 uses the fact that $R^2 \le 1$ and the assumption $E[s^2] = 1$: because
$$R^2 = \frac{\tilde{c}'\Sigma^{-1}\tilde{c}}{E[s^2]} = \tilde{c}'\Sigma^{-1}\tilde{c} \le 1,$$
Lemma 2 establishes that Condition 2 is satisfied.
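A toy population check of this inequality (my construction: $\phi$ and $s$ built from a common Gaussian vector, with $E[s^2] = 1$):

```python
# Toy check of the R^2 <= 1 argument: with c~ = E[phi s], Sigma = E[phi phi'],
# and E[s^2] = 1, Cauchy-Schwarz forces c~' Sigma^{-1} c~ <= 1.
import numpy as np

rng = np.random.default_rng(4)
u = rng.normal(size=(1_000_000, 3))
phi = u[:, :2]                          # toy influence function of (c_hat, gamma_hat)
s = u.sum(axis=1) / np.sqrt(3)          # a score with E[s^2] = 1

c = phi.T @ s / len(s)                  # c~ = E[phi s], here (1/sqrt(3), 1/sqrt(3))
Sigma = phi.T @ phi / len(s)            # Sigma = E[phi phi'], here ~I
print(c @ np.linalg.solve(Sigma, c))    # ~2/3, below the bound of 1
```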

Informativeness

Lemma 2 uses the idea that
$$\begin{pmatrix} \tilde{c} \\ \tilde{\gamma} \end{pmatrix} = \begin{pmatrix} E[\phi_c s] \\ E[\phi_\gamma s] \end{pmatrix},$$
so the problem
$$\max \tilde{c}^2 \quad \text{s.t.} \quad \begin{pmatrix} \tilde{c} \\ \tilde{\gamma} \end{pmatrix}' \Sigma^{-1} \begin{pmatrix} \tilde{c} \\ \tilde{\gamma} \end{pmatrix} \le 1 \qquad (*)$$
can be reinterpreted as
$$\max_s\, \big( E[\phi_c s] \big)^2 \quad \text{s.t.} \quad E[s^2] = 1, \qquad (**)$$
where $s$ denotes the score representing the misspecification. The imposition of $\gamma = 0$ is equivalent to the additional constraint
$$E[\phi_\gamma s] = 0. \qquad (***)$$
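Both constrained maxima have closed forms, which is worth recording since they reappear at the end of this discussion (my sketch, with an illustrative $\Sigma$): without $(***)$ the maximum of $\tilde{c}^2$ is $\operatorname{Var}(\hat{c})$; with it, $1/(\Sigma^{-1})_{cc} = \operatorname{Var}(\hat{c})(1 - \Delta)$.

```python
# Sketch of the two closed-form maxima with an illustrative Sigma: the
# unrestricted maximum of c~^2 is Var(c_hat); imposing (***) shrinks it to
# 1/(Sigma^{-1})_cc = Var(c_hat) - Cov(c_hat, gamma_hat)^2 / Var(gamma_hat).
import numpy as np

Sigma = np.array([[2.0, 0.8],    # Var(c_hat), Cov(c_hat, gamma_hat)
                  [0.8, 1.0]])   # Cov(c_hat, gamma_hat), Var(gamma_hat)

unrestricted = Sigma[0, 0]                               # 2.0
restricted = 1.0 / np.linalg.inv(Sigma)[0, 0]            # 2.0 - 0.64 = 1.36
Delta = Sigma[0, 1] ** 2 / (Sigma[0, 0] * Sigma[1, 1])   # informativeness, 0.32
print(restricted, unrestricted * (1.0 - Delta))          # both 1.36
```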

Informativeness

It is proposed to measure the benefit of the additional constraint $(***)$ in terms of the risk $\big( E[\phi_c s] \big)^2$; more precisely, by a comparison of the maximum risks under the two different sets of constraints.

Informativeness - Observation 1

The meaning of the constraint/normalization $(**)$ is unclear to me. Consider the model
$$y_i = x_i c_0 + \varepsilon_i, \qquad x_i = z_i'\gamma_0 + v_i,$$
where $\gamma_0$ is estimated by the first stage OLS, and $c_0$ is estimated by the moment
$$E\big[ (z_i'\gamma_0)(y_i - x_i c_0) \big] = 0. \qquad (\dagger)$$
Here, I am viewing $c_0$ as a parameter identified by the second step moment $(\dagger)$ using the first step estimate of the reduced form parameter $\gamma_0$. In other words, I am now viewing 2SLS as a plug-in estimator.
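A sketch of this plug-in reading (my illustrative numbers): estimate $\gamma_0$ by first stage OLS, then solve the sample analogue of $(\dagger)$; with a scalar endogenous regressor this reproduces 2SLS exactly.

```python
# Plug-in sketch: first-stage OLS for gamma_0, then solve the sample analogue
# of (dagger) for c; this is exactly 2SLS here.
import numpy as np

rng = np.random.default_rng(7)
n, c0 = 50_000, 0.7
z = rng.normal(size=(n, 2))
x = z @ np.array([1.0, 0.5]) + rng.normal(size=n)
y = x * c0 + rng.normal(size=n)

gamma_hat = np.linalg.lstsq(z, x, rcond=None)[0]   # first step: reduced form OLS
w = z @ gamma_hat                                  # plug-in instrument z'gamma_hat
c_hat = (w @ y) / (w @ x)                          # solves (1/n) sum w_i (y_i - x_i c) = 0
print(c_hat)                                       # ~0.7
```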

Informativeness - Observation 1

The informativeness notion, i.e., $\Delta$, seems to be based on some sort of trade-off (for lack of better words, I am just referring to the ellipsoid) between violations of this second stage moment $(\dagger)$, i.e., the exclusion restriction, and the first stage moment
$$E\big[ z_i (x_i - z_i'\gamma_0) \big] = 0, \qquad (\dagger\dagger)$$
e.g., measurement error in $z_i$. Why should the degree of violation of the exclusion restriction have anything to do with the amount of measurement error? (Why should it depend on the covariance between the two errors?) It does not seem natural... If you are like me, you may like a rectangle, not an ellipse, to represent the violations/uncertainty... In any case, it is based on the normalization $(**)$ that $E[s^2] = 1$.

Informativeness - Observation 2

The paper is a (local) partial identification exercise with a comparison of maximum risks, and some care is needed for interpretation. It may give the impression that if $\Delta = 1$, and if there is misspecification in $\hat{\gamma}$, then $\hat{c}$ should be misspecified. For this purpose, it may be useful to consider an alternative representation of the LSEM:
$$y_i = z_i \gamma_{20} + u_i, \qquad x_i = z_i \gamma_{10} + v_i,$$
with the understanding that $c_0 = \gamma_{20} / \gamma_{10}$, so that $\Delta = 1$ now. (I am now viewing the IV as an indirect least squares estimator.) Now, let's introduce a twist and assume that $z_i$ is subject to a (classical) measurement error, under which $\hat{c}$ is still fine. Note that $\hat{\gamma}_1$ and $\hat{\gamma}_2$ are biased toward zero, so one may be worried because $\Delta = 1$. (To be fair, the paper states that they do not exclude the possibility that $c_0 = c(\eta)$ for some $\eta \ne \eta_0$.)
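A simulation sketch of the twist (my illustrative numbers, independent errors for simplicity): classical measurement error attenuates both reduced form slopes by the same factor, so the indirect least squares ratio is untouched.

```python
# Sketch: classical measurement error in z biases both reduced-form slopes
# toward zero by the same factor, so c_hat = gamma2_hat / gamma1_hat remains
# consistent even though each slope estimate is badly biased.
import numpy as np

rng = np.random.default_rng(5)
n, g1, g2 = 1_000_000, 1.0, 0.7          # c0 = g2/g1 = 0.7
z = rng.normal(size=n)
x = g1 * z + rng.normal(size=n)
y = g2 * z + rng.normal(size=n)
z_obs = z + rng.normal(size=n)           # classical measurement error

g1_hat = (z_obs @ x) / (z_obs @ z_obs)   # ~ g1/2: attenuated
g2_hat = (z_obs @ y) / (z_obs @ z_obs)   # ~ g2/2: attenuated
print(g1_hat, g2_hat, g2_hat / g1_hat)   # ratio still ~0.7
```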

Informativeness - Observation 2: Another Analogy

Consider
$$y_i = x_i \theta + \varepsilon_i, \qquad x_i = z_i'\pi + v_i,$$
with $0 = E[z_i (y_i - x_i \theta)]$ and the normalization $E[z_i z_i'] = I_2$, $E[\varepsilon_i^2] = 1$. With the local violation $E[z_i (y_i - x_i \theta)] = \frac{1}{\sqrt{n}} \gamma$, the asymptotic bias for the 2SLS is $(\pi'\pi)^{-1} \pi'\gamma$. If we impose the restriction that the second component $\gamma_2$ of $\gamma$ is zero, the asymptotic bias changes to $(\pi'\pi)^{-1} \pi_1 \gamma_1$, so the square of the ratio is
$$\frac{(\pi_1 \gamma_1)^2 / (\pi'\pi)^2}{(\pi'\gamma)^2 / (\pi'\pi)^2} = \frac{(\pi_1 \gamma_1)^2}{(\pi_1 \gamma_1 + \pi_2 \gamma_2)^2}.$$
It may be sensible to take the average of the above ratio with respect to some weight on $\gamma$ to summarize the benefit of the restriction $\gamma_2 = 0$?

Informativeness - Observation 2: Another Analogy

On the other hand, we can adopt the idea in the paper and consider
$$\max_\gamma\, \big( (\pi'\pi)^{-1} \pi'\gamma \big)^2 \quad \text{s.t.} \quad \gamma' \Big( E\big[ (z_i \varepsilon_i)(z_i \varepsilon_i)' \big] \Big)^{-1} \gamma = \gamma'\gamma = 1,$$
where the constrained maximum is
$$\max\, \big( (\pi'\pi)^{-1} \pi'\gamma \big)^2 = \frac{(\pi'\gamma)^2}{(\pi'\pi)^2} \le \frac{(\pi'\pi)(\gamma'\gamma)}{(\pi'\pi)^2} = \frac{1}{\pi'\pi}.$$
If we impose an additional constraint that $\gamma_2 = 0$, while maintaining the normalization $\gamma'\gamma = 1$, we get the constrained maximum equal to $\pi_1^2 / (\pi'\pi)^2$. The ratio of the two maxima is
$$\frac{\pi_1^2 / (\pi'\pi)^2}{1 / (\pi'\pi)} = \frac{\pi_1^2}{\pi_1^2 + \pi_2^2}.$$
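Both maxima can be checked by scanning the unit circle $\gamma'\gamma = 1$ (my sketch, illustrative $\pi$):

```python
# Check of the two constrained maxima by scanning the unit circle, pi = (2, 1):
# the unrestricted max is 1/(pi'pi); with gamma_2 = 0 it drops to
# pi_1^2/(pi'pi)^2, giving the ratio pi_1^2/(pi_1^2 + pi_2^2).
import numpy as np

pi = np.array([2.0, 1.0])
t = np.linspace(0.0, 2.0 * np.pi, 100_000)
gam = np.column_stack([np.cos(t), np.sin(t)])    # the circle gamma'gamma = 1

obj = (gam @ pi) ** 2 / (pi @ pi) ** 2
print(obj.max(), 1.0 / (pi @ pi))                # both ~0.2
restricted = pi[0] ** 2 / (pi @ pi) ** 2         # gamma = (+-1, 0): 0.16
print(restricted / obj.max())                    # ~0.8 = pi_1^2/(pi_1^2 + pi_2^2)
```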

Informativeness - Observation 2: Another Analogy

Is
$$\frac{\pi_1^2}{\pi_1^2 + \pi_2^2}$$
a good summary of
$$\frac{(\pi_1 \gamma_1)^2}{(\pi_1 \gamma_1 + \pi_2 \gamma_2)^2}\,?$$
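One way to look at the question (my sketch): draw $\gamma$ from the spherical weight and inspect the distribution of the random ratio. Its mean does not appear to be finite (the denominator vanishes on the line $\pi'\gamma = 0$), so quantiles are reported and set against the single summary number.

```python
# Sketch: distribution of the random ratio under a spherical weight on gamma,
# compared with the single summary pi_1^2/(pi_1^2 + pi_2^2).
import numpy as np

rng = np.random.default_rng(6)
pi = np.array([1.0, 0.5])
g = rng.normal(size=(1_000_000, 2))              # spherical weight on gamma
ratio = (pi[0] * g[:, 0]) ** 2 / (g @ pi) ** 2
print(np.quantile(ratio, [0.25, 0.5, 0.75]))     # wide spread, heavy right tail
print(pi[0] ** 2 / (pi @ pi))                    # the summary: 0.8
```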

Informativeness - Observation 3

IIA (for lack of a better description) may be desired. Consider for simplicity the case where $\dim(\gamma) = 2$, and we impose the restriction that $\gamma_2 = 0$. We can proceed in two different ways (including or excluding $\gamma_1$ in the optimization problem), and it would be nice to get the same answer. (Andrews et al. consider an all-or-nothing option on $\gamma$, by the way, so it only applies to an analysis that does not involve $\gamma_1$.)

Informativeness - Observation 3

First, we can recognize the fact that $\dim(\gamma) = 2$ and consider a full information calculation: we want to solve
$$\max \tilde{c}^2 \quad \text{s.t.} \quad \begin{pmatrix} \tilde{c} \\ \tilde{\gamma}_1 \\ \tilde{\gamma}_2 \end{pmatrix}' \Sigma^{-1} \begin{pmatrix} \tilde{c} \\ \tilde{\gamma}_1 \\ \tilde{\gamma}_2 \end{pmatrix} \le 1,$$
solve the same problem subject to
$$\begin{pmatrix} \tilde{c} \\ \tilde{\gamma}_1 \\ 0 \end{pmatrix}' \Sigma^{-1} \begin{pmatrix} \tilde{c} \\ \tilde{\gamma}_1 \\ 0 \end{pmatrix} \le 1, \qquad (\ddagger)$$
and compare the ratio of the two answers. Here, $\tilde{\gamma}_1$ plays the role of an irrelevant alternative, so we can repeat the exercise now deleting $\gamma_1$.

Informativeness

Second, we may want to use a limited information calculation: we want to solve
$$\max \tilde{c}^2 \quad \text{subject to} \quad \begin{pmatrix} \tilde{c} \\ \tilde{\gamma}_2 \end{pmatrix}' \tilde{\Sigma}^{-1} \begin{pmatrix} \tilde{c} \\ \tilde{\gamma}_2 \end{pmatrix} \le 1,$$
solve the same problem subject to
$$\begin{pmatrix} \tilde{c} \\ 0 \end{pmatrix}' \tilde{\Sigma}^{-1} \begin{pmatrix} \tilde{c} \\ 0 \end{pmatrix} \le 1,$$
and compare the ratio of the two answers. Here, $\tilde{\Sigma}$ denotes the appropriate $2 \times 2$ submatrix of $\Sigma$. The second approach can be solved by the result in the paper, and the ratio is $1 - \Delta$.

Informativeness

As for the first approach, it seems that the maximum of $\tilde{c}^2$ subject to $(\ddagger)$ is
$$\operatorname{Var}(\hat{c}) - \frac{\operatorname{Cov}(\hat{c}, \hat{\gamma}_2)^2}{\operatorname{Var}(\hat{\gamma}_2)} = \operatorname{Var}(\hat{c})\,(1 - \Delta),$$
so the desired IIA holds.
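A final numerical sketch of the IIA claim (my illustrative $\Sigma$): the full information maximum of $\tilde{c}^2$ under $(\ddagger)$ and the limited information answer $\operatorname{Var}(\hat{c})(1 - \Delta)$ coincide.

```python
# IIA sketch with an illustrative Sigma for (c_hat, gamma1_hat, gamma2_hat):
# the full information maximum of c~^2 under (double-dagger) equals the
# limited information answer Var(c_hat) - Cov(c_hat, gamma2_hat)^2 / Var(gamma2_hat).
import numpy as np

Sigma = np.array([[2.0, 0.5, 0.8],
                  [0.5, 1.5, 0.2],
                  [0.8, 0.2, 1.0]])   # positive definite, illustrative

# Full information: setting gamma~_2 = 0 leaves a quadratic constraint in
# (c~, gamma~_1) with matrix equal to the corresponding block of Sigma^{-1}.
P = np.linalg.inv(Sigma)[np.ix_([0, 1], [0, 1])]
full_info = np.linalg.inv(P)[0, 0]               # max c~^2 = (P^{-1})_cc

# Limited information: use only the (c_hat, gamma2_hat) submatrix of Sigma.
lim_info = Sigma[0, 0] - Sigma[0, 2] ** 2 / Sigma[2, 2]
print(full_info, lim_info)                        # both 1.36: IIA holds
```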