When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Similar documents
When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

On the Use of Linear Fixed Effects Regression Models for Causal Inference

When Should We Use Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Econometric Analysis of Cross Section and Panel Data

Experimental Designs for Identifying Causal Mechanisms

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Covariate Balancing Propensity Score for General Treatment Regimes

Applied Microeconometrics (L5): Panel Data-Basics

Causal Mechanisms Short Course Part II:

Statistical Analysis of Causal Mechanisms

Flexible Estimation of Treatment Effect Parameters

Statistical Analysis of Causal Mechanisms

Final Exam. Economics 835: Econometrics. Fall 2010

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

Causal Mediation Analysis in R. Quantitative Methodology and Causal Mechanisms

Gov 2002: 4. Observational Studies and Confounding

Matching Methods for Causal Inference with Time-Series Cross-Section Data

1 Estimation of Persistent Dynamic Panel Data. Motivation

Statistical Models for Causal Analysis

Applied Econometrics Lecture 1

Quantitative Economics for the Evaluation of the European Policy

Difference-in-Differences Methods

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

Notes on causal effects

Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Instrumental Variables

Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

Statistical Analysis of Causal Mechanisms

Identification and Inference in Causal Mediation Analysis

Controlling for Time Invariant Heterogeneity

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Non-linear panel data modeling

Econometrics of Panel Data

Linear dynamic panel data models

EMERGING MARKETS - Lecture 2: Methodology refresher

Fixed Effects Models for Panel Data. December 1, 2014

Matching and Weighting Methods for Causal Inference

Selection on Observables: Propensity Score Matching.

Difference-in-Differences Estimation

The Economics of European Regions: Theory, Empirics, and Policy

Lecture 9: Panel Data Model (Chapter 14, Wooldridge Textbook)

Simple Linear Regression

Identification and Sensitivity Analysis for Multiple Causal Mechanisms: Revisiting Evidence from Framing Experiments

Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

Panel Data Models. Chapter 5. Financial Econometrics. Michael Hauser WS17/18 1 / 63

EC327: Advanced Econometrics, Spring 2007

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Discussion of Papers on the Extensions of Propensity Score

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Short T Panels - Review

Applied Microeconometrics Chapter 8 Regression Discontinuity (RD)

1 Motivation for Instrumental Variable (IV) Regression

Front-Door Adjustment

High Dimensional Propensity Score Estimation via Covariate Balancing

Lecture 8 Panel Data

Statistical Analysis of Causal Mechanisms for Randomized Experiments

New Developments in Econometrics Lecture 16: Quantile Estimation

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Causal Inference Lecture Notes: Selection Bias in Observational Studies

Telescope Matching: A Flexible Approach to Estimating Direct Effects

Week 2: Pooling Cross Section across Time (Wooldridge Chapter 13)

Regression Discontinuity Designs

Panel Data Seminar. Discrete Response Models. Crest-Insee. 11 April 2008

Advanced Econometrics

Syllabus. By Joan Llull. Microeconometrics. IDEA PhD Program. Fall Chapter 1: Introduction and a Brief Review of Relevant Tools

Extending causal inferences from a randomized trial to a target population

Missing dependent variables in panel data models

PSC 504: Instrumental Variables

Comments on: Panel Data Analysis Advantages and Challenges. Manuel Arellano CEMFI, Madrid November 2006

Causal Interaction in Factorial Experiments: Application to Conjoint Analysis

Interaction Effects and Causal Heterogeneity

Next, we discuss econometric methods that can be used to estimate panel data models.

Chapter 6. Panel Data. Joan Llull. Quantitative Statistical Methods II Barcelona GSE

Post-Instrument Bias

Corporate Finance Data & The Role of Dynamic Panels. Mark Flannery, University of Florida Kristine W. Hankins, University of Kentucky

Review of Panel Data Model Types Next Steps. Panel GLMs. Department of Political Science and Government Aarhus University.

Applied Economics. Panel Data. Department of Economics Universidad Carlos III de Madrid

What s New in Econometrics? Lecture 14 Quantile Methods

Econ 673: Microeconometrics Chapter 12: Estimating Treatment Effects. The Problem

The returns to schooling, ability bias, and regression

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Specification Tests in Unbalanced Panels with Endogeneity.

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Discussion: Can We Get More Out of Experiments?

Econometrics of Panel Data

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Causal Interaction in High Dimension

PS 271B: Quantitative Methods II Lecture Notes

Transcription:

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Princeton University Asian Political Methodology Conference University of Sydney Joint work with In Song Kim (MIT) January 10, 2017 Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 1 / 36

Fixed Effects Regressions in Causal Inference Linear fixed effects regression models are the primary workhorse for causal inference with longitudinal/panel data Researchers use them to adjust for unobserved time-invariant confounders (omitted variables, endogeneity, selection bias,...): Good instruments are hard to find..., so we d like to have other tools to deal with unobserved confounders. This chapter considers... strategies that use data with a time or cohort dimension to control for unobserved but fixed omitted variables (Angrist & Pischke, Mostly Harmless Econometrics) fixed effects regression can scarcely be faulted for being the bearer of bad tidings (Green et al., Dirty Pool) When should we use linear FE regression models for causal inference? Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 2 / 36

Linear Regression with Unit Fixed Effects Balanced panel data with N units and T time periods Y it : outcome variable X it : binary causal or treatment variable of interest Assumption 1 (Linearity) Y it = α i + βx it + ɛ it U i : a vector of unobserved time-invariant confounders α i = h(u i ) for any function h( ) A flexible way to adjust for unobservables Average contemporaneous treatment effect: β = E(Y it (1) Y it (0)) Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 3 / 36

Strict Exogeneity and Least Squares Estimator Assumption 2 (Strict Exogeneity) ɛ it {X i, U i } Mean independence is sufficient: E(ɛ it X i, U i ) = E(ɛ it ) = 0 Least squares estimator based on de-meaning: ˆβ FE = arg min β N i=1 t=1 T {(Y it Y i ) β(x it X i )} 2 where X i and Y i are unit-specific sample means ATE among those units with variation in treatment: τ = E(Y it (1) Y it (0) C it = 1) where C it = 1{0 < T t=1 X it < T }. Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 4 / 36

Causal Directed Acyclic Graph (DAG) Y i1 Y i2 Y i3 X i1 X i2 X i3 U i arrow = direct causal effect absence of arrows causal assumptions Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 5 / 36

Nonparametric Structural Equation Model (NPSEM) One-to-one correspondence with a DAG: Y it = g 1 (X it, U i, ɛ it ) X it = g 2 (X i1,..., X i,t 1, U i, η it ) Nonparametric generalization of linear unit fixed effects model: Allows for nonlinear relationships, effect heterogeneity Strict exogeneity holds (ɛ it Y it {X i, U i }) No arrows can be added without violating Assumptions 1 and 2 Causal assumptions: 1 No unobserved time-varying confounders 2 Past outcomes do not directly affect current outcome 3 Past outcomes do not directly affect current treatment 4 Past treatments do not directly affect current outcome Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 6 / 36

Potential Outcomes Framework DAG causal structure Potential outcomes treatment assignment mechanism Assumption 3 (No carryover effect) Past treatments do not directly affect current outcome Y it (X i1, X i2,..., X i,t 1, X it ) = Y it (X it ) What randomized experiment satisfies unit fixed effects model? 1 randomize X i1 given U i 2 randomize X i2 given X i1 and U i 3 randomize X i3 given X i2, X i1, and U i 4 and so on Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 7 / 36

Assumption 4 (Sequential Ignorability with Unobservables) {Y it (1), Y it (0)} T t=1 X i1 U i. {Y it (1), Y it (0)} T t=1 X it X i1,..., X i,t 1, U i. {Y it (1), Y it (0)} T t=1 X it X i1,..., X i,t 1, U i as-if random assumption without conditioning on past outcomes Past outcomes cannot directly affect current treatment Says nothing about whether past outcomes can directly affect current outcome Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 8 / 36

Past Outcomes Directly Affect Current Outcome Y i1 Y i2 Y i3 X i1 X i2 X i3 U i Strict exogeneity still holds Past outcomes do not confound X it Y it given U i No need to adjust for past outcomes Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 9 / 36

Past Treatments Directly Affect Current Outcome Y i1 Y i2 Y i3 X i1 X i2 X i3 U i Past treatments as confounders Need to adjust for past treatments Strict exogeneity holds given past treatments and U i Impossible to adjust for an entire treatment history and U i at the same time Adjust for a small number of past treatments often arbitrary Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 10 / 36

Past Outcomes Directly Affect Current Treatment Y i1 Y i2 Y i3 X i1 X i2 X i3 Correlation between error term and future treatments Violation of strict exogeneity No adjustment is sufficient U i Together with the previous assumption no feedback effect over time Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 11 / 36

Instrumental Variables Approach Y i1 Y i2 Y i3 X i1 X i2 X i3 U i Instruments: X i1, X i2, and Y i1 GMM: Arellano and Bond (1991) Exclusion restrictions Arbitrary choice of instruments Substantive justification rarely given Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 12 / 36

An Alternative Selection-on-Observables Approach Y i1 Y i2 Y i3 Absence of unobserved time-invariant confounders U i past treatments can directly affect current outcome X i1 X i2 X i3 past outcomes can directly affect current treatment Comparison across units within the same time rather than across different time periods within the same unit Marginal structural models can identify the average effect of an entire treatment sequence Trade-off no free lunch Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 13 / 36

Adjusting for Observed Time-varying Confounders Y i1 Y i2 Y i3 X i1 X i2 X i3 Z i1 Z i2 Z i3 U i past treatments cannot directly affect current outcome past outcomes cannot directly affect current treatment adjusting for Z it does not relax these assumptions past outcomes cannot indirectly affect current treatment through Z it Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 14 / 36

Linear Fixed Effects Estimator Even if these assumptions are satisfied, the the unit fixed effects estimator is inconsistent for the ATE: { ( T t=1 E C X ) } it Y T it t=1 i T (1 X it )Y it p t=1 ˆβ FE X T S 2 it t=1 1 X i it E(C i Si 2 τ ) where S 2 i = T t=1 (X it X i ) 2 /(T 1) is the unit-specific variance We show how to eliminate this bias using a general matching framework Equivalence between matching and weighted fixed effects estimators Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 15 / 36

Linear Regression with Unit and Time Fixed Effects Model: Y it = α i + γ t + βx it + ɛ it where γ t flexibly adjusts for a vector of unobserved unit-invariant time effects V t, i.e., γ t = f (V t ) Estimator: ˆβ FE2 = arg min β N i=1 t=1 T {(Y it Y i Y t + Y ) β(x it X i X t + X)} 2 where Y t and X t are time-specific means, and Y and X are overall means Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 16 / 36

Understanding the Two-way Fixed Effects Estimator The two-way FE estimator combines three biased estimators! 1 β FE : bias due to time effects 2 β FEtime : bias due to unit effects 3 β pool : bias due to both time and unit effects ˆβ FE2 = ω FE ˆβ FE + ω FEtime ˆβ FEtime ω pool ˆβ pool w FE + w FEtime w pool with sufficiently large N and T, the weights are given by, ω FE E(S 2 i ) = average unit-specific variance ω FEtime E(S 2 t ) = average time-specific variance ω pool S 2 = overall variance We consider various matching estimators including difference-in-differences and synthetic control Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 17 / 36

Before-and-After Design Accommodating various causal quantity of interest No time trend for the average potential outcomes: E(Y it (x) Y i,t 1 (x) X it X i,t 1 ) = 0 for x = 0, 1 Average Outcome 0.0 0.2 0.4 0.6 0.8 1.0 control group treatment group counterfactual time t time t+1 Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 18 / 36

This is a matching estimator with the following matched set: M(i, t) = {(i, t ) : i = i, t {t 1, t + 1}, X i t = 1 X it} It is also the first differencing estimator: ˆβ FD = arg min β N i=1 t=2 T {(Y it Y i,t 1 ) β(x it X i,t 1 )} 2 We emphasize that the model and the interpretation of β are exactly as in [the linear fixed effects model]. What differs is our method for estimating β (Wooldridge; italics original). The identification assumptions is very different But, still requires the assumption that past outcomes do not affect current treatment (Regression towards the mean) Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 19 / 36

Difference-in-Differences Design Parallel trend assumption: E(Y it (0) Y i,t 1 (0) X it = 1, X i,t 1 = 0) = E(Y it (0) Y i,t 1 (0) X it = X i,t 1 = 0) Average Outcome 0.0 0.2 0.4 0.6 0.8 1.0 control group treatment group counterfactual time t time t+1 Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 20 / 36

General DiD = Weighted Two-Way FE Effects 2 2: equivalent to linear two-way fixed effects regression General setting: Multiple time periods, repeated treatments Units Time periods 4 3 2 1 C T C C T T C T T C C C T C C T T T C T Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 21 / 36

Units Time periods 4 3 2 1 0 0 0 0 0 0 0 1 2 0 1 1 2 0 1 1 2 1 2 0 0 0 0 0 Fast computation, standard error, specification test Still assumes that past outcomes don t affect current treatment Baseline outcome difference caused by unobserved time-invariant confounders It should not reflect causal effect of baseline outcome on treatment assignment Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 22 / 36

Synthetic Control Method (Abadie et al. 2010) One treated unit i receiving the treatment at time T Quantity of interest: Y i T Y i T (0) Create a synthetic control using past outcomes Weighted average: Yi T (0) = i i ŵ i Y it Estimate weights to balance past outcomes and past time-varying covariates A motivating autoregressive model: Y it (0) = ρ T Y i,t 1 (0) + δt Z it + ɛ it Z it = λ T 1 Y i,t 1 (0) + T Z i,t 1 + ν it Past outcomes can affect current treatment No unobserved time-invariant confounders Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 23 / 36

Causal Effect of ETA s Terrorism THE AMERICAN ECONOMIC REVIEW MARCH 2003 11- -5 / s / a / 210-5 0 9-, -Actual with terrorism - -Synthetic without terrorism 1955 1 1960 1965 1970 1975 1980 1985 1990 year 1995 2000 FIGURE1. PERCAPITA GDP FOR THE BASQUECOUNTRI Abadie and Gardeazabal (2003, AER) 4 Imai (Princeton) and Kim -- \ (MIT) Fixed Effects for Causal Inference GDt' gap Sydney (January 10, 2017) 24 / 36

The main motivating model: Y it (0) = γ t + δ t Z it + ξ U i + ɛ it A generalization of the linear two-way fixed effects model How is it possible to adjust for unobserved time-invariant confounders by adjusting for past outcomes? The key assumption: there exist weights such that w i Z it = Z i t for all t T 1 and w i U i = U i i i i i In general, adjusting for observed confounders does not adjust for unobserved confounders The same tradeoff as before Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 25 / 36

Concluding Remarks When should we use linear fixed effects models? Key tradeoff: 1 unobserved time-invariant confounders fixed effects 2 causal dynamics between treatment and outcome selection-on-observables Two key (under-appreciated) causal assumptions of fixed effects: 1 past treatments do not directly affect current outcome 2 past outcomes do not directly affect current treatment A new matching estimator: 1 Within-unit matching estimator no linearity assumption 2 Various causal identification strategies can be incorporated including the before-and-after and difference-in-differences designs 3 Equivalent representation as a weighted linear fixed effects regression estimator R package wfe is available at CRAN Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 26 / 36

Send comments and suggestions to: kimai@princeton.edu insong@mit.edu More information about this and other research: http://imai.princeton.edu http://web.mit.edu/insong/www Imai (Princeton) and Kim (MIT) Fixed Effects for Causal Inference Sydney (January 10, 2017) 27 / 36