Matching and Weighting Methods for Causal Inference

Similar documents
On the Use of Linear Fixed Effects Regression Models for Causal Inference

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Covariate Balancing Propensity Score for General Treatment Regimes

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

High Dimensional Propensity Score Estimation via Covariate Balancing

Causal Inference Lecture Notes: Selection Bias in Observational Studies

Introduction to Statistical Inference

What s New in Econometrics. Lecture 1

Discussion of Papers on the Extensions of Propensity Score

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Stratified Randomized Experiments

Matching for Causal Inference Without Balance Checking

arxiv: v1 [stat.me] 15 May 2011

Propensity Score Weighting with Multilevel Data

The Essential Role of Pair Matching in. Cluster-Randomized Experiments. with Application to the Mexican Universal Health Insurance Evaluation

When Should We Use Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Combining multiple observational data sources to estimate causal eects

Primal-dual Covariate Balance and Minimal Double Robustness via Entropy Balancing

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

Robust Estimation of Inverse Probability Weights for Marginal Structural Models

Propensity Score Methods for Causal Inference

An Introduction to Causal Analysis on Observational Data using Propensity Scores

Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

Propensity Score Matching

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Simple Linear Regression

Selection on Observables: Propensity Score Matching.

Causal Mechanisms Short Course Part II:

EMERGING MARKETS - Lecture 2: Methodology refresher

Estimating the Marginal Odds Ratio in Observational Studies

Statistical Analysis of Causal Mechanisms

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Statistical Analysis of Causal Mechanisms

A Sampling of IMPACT Research:

Flexible Estimation of Treatment Effect Parameters

Weighting. Homework 2. Regression. Regression. Decisions Matching: Weighting (0) W i. (1) -å l i. )Y i. (1-W i 3/5/2014. (1) = Y i.

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference in Observational Studies with Non-Binary Treatments. David A. van Dyk

The propensity score with continuous treatments

Quantitative Economics for the Evaluation of the European Policy

Propensity Score Analysis with Hierarchical Data

Statistical Analysis of Causal Mechanisms

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Instrumental Variables

PROPENSITY SCORE MATCHING. Walter Leite

Achieving Optimal Covariate Balance Under General Treatment Regimes

Gov 2002: 5. Matching

Identification and Inference in Causal Mediation Analysis

Gov 2002: 4. Observational Studies and Confounding

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments

Propensity Score Methods for Estimating Causal Effects from Complex Survey Data

Regression Discontinuity Designs

Notes on causal effects

Applied Microeconometrics (L5): Panel Data-Basics

Robustness to Parametric Assumptions in Missing Data Models

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

A Theory of Statistical Inference for Matching Methods in Applied Causal Research

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Statistical Analysis of List Experiments

Econometrics of causal inference. Throughout, we consider the simplest case of a linear outcome equation, and homogeneous

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

Data Integration for Big Data Analysis for finite population inference

Statistical Models for Causal Analysis

For more information about how to cite these materials visit

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

Applied Statistics Lecture Notes

Extending causal inferences from a randomized trial to a target population

University of California, Berkeley

Causal Inference Basics

Propensity-Score Based Methods for Causal Inference in Observational Studies with Fixed Non-Binary Treatments

Econometric Analysis of Cross Section and Panel Data

Difference-in-Differences Methods

Noncompliance in Randomized Experiments

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

Applied Econometrics Lecture 1

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores

Empirical Likelihood Methods for Two-sample Problems with Data Missing-by-Design

Dynamics in Social Networks and Causality

Empirical approaches in public economics

Since the seminal paper by Rosenbaum and Rubin (1983b) on propensity. Propensity Score Analysis. Concepts and Issues. Chapter 1. Wei Pan Haiyan Bai

Propensity score modelling in observational studies using dimension reduction methods

Clustering as a Design Problem

A Course in Applied Econometrics. Lecture 2 Outline. Estimation of Average Treatment Effects. Under Unconfoundedness, Part II

Comment: Understanding OR, PS and DR

Matching Methods for Causal Inference with Time-Series Cross-Section Data

Econometrics of Panel Data

Statistical Analysis of Causal Mechanisms for Randomized Experiments

Supporting Information. Controlling the Airwaves: Incumbency Advantage and. Community Radio in Brazil

A Measure of Robustness to Misspecification

Imbens/Wooldridge, IRP Lecture Notes 2, August 08 1

AGEC 661 Note Fourteen

Gov 2002: 13. Dynamic Causal Inference

Variable selection and machine learning methods in causal inference

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Transcription:

Matching and Weighting Methods for ausal Inference Kosuke Imai Princeton University Methods Workshop, Duke University Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) / 57

References to Relevant Papers Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric ausal Inference. Political Analysis (27) Misunderstandings among Experimentalists and Observationalists about ausal Inference. Journal of the Royal Statistical Society, Series A (28) he Essential Role of Pair Matching in luster-randomized Experiments, with Application to the Mexican Universal Health Insurance Evaluation. Statistical Science (29) ovariate Balancing Propensity Score. Working paper On the Use of Linear Fixed Effects Regression Models for ausal Inference. Working paper All papers are available at http://imai.princeton.edu/research Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 2 / 57

Software Implementation ausal inference with regression: Zelig: Everyone s Statistical Software ausal inference with matching: MatchIt: Nonparametric Preprocessing for Parametric ausal Inference ausal inference with propensity score: BPS: ovariate Balancing Propensity Score ausal inference with fixed effects: wfe: Weighted Fixed Effects Regressions for ausal Inference All software is available at http://imai.princeton.edu/software Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 3 / 57

Matching and Weighting What is matching? Grouping observations based on their observed characteristics pairing 2 subclassification 3 subsetting What is weighting? Replicating observations based on their observed characteristics All types of matching are special cases with discrete weights What matching and weighting methods can do: flexible and robust causal modeling under selection on observables What they cannot do: eliminate bias due to unobserved confounding Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 4 / 57

Matching for Randomized Experiments Matching can be used for randomized experiments too! Randomization of treatment unbiased estimates Improving efficiency reducing variance Why care about efficiency? You care about your results! Randomized matched-pair design Randomized block design Intuition: estimation uncertainty comes from pre-treatment differences between treatment and control groups Mantra (Box, Hunter, and Hunter): Block what you can and randomize what you cannot Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 5 / 57

luster Randomized Experiments Units: i =, 2,..., n j lusters of units: j =, 2,..., m reatment at cluster level: j {, } Outcome: Y ij = Y ij ( j ) Random assignment: (Y ij (), Y ij ()) j Estimands at unit level: SAE m j= n j n m j (Y ij () Y ij ()) j= i= PAE E(Y ij () Y ij ()) Random sampling of clusters and units Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 6 / 57

Merits and Limitations of REs Interference between units within a cluster is allowed Assumption: No interference between units of different clusters Often easier to implement: Mexican health insurance experiment Opportunity to estimate the spill-over effects D. W. Nickerson. Spill-over effect of get-out-the-vote canvassing within household (APSR, 28) Limitations: A large number of possible treatment assignments 2 Loss of statistical power Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 7 / 57

Design-Based Inference For simplicity, assume equal cluster size, i.e., n j = n for all j he difference-in-means estimator: ˆτ m j Y j m m j= m j= ( j )Y j where Y j n j i= Y ij/n j Easy to show E(ˆτ O) = SAE and thus E(ˆτ) = PAE Exact population variance: Var(ˆτ) = Var(Y j()) m Intracluster correlation coefficient ρ t : + Var(Y j()) m Var(Y j (t)) = σ2 t n { + (n )ρ t} σ 2 t Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 8 / 57

luster Standard Error luster robust sandwich variance estimator: Var((ˆα, m m m ˆβ) ) = X j ˆɛ jˆɛ j X j j= X j j= X j where in this case X j = [ j ] is an n j 2 matrix and ˆɛ j = (ˆɛ j,..., ˆɛ nj j) is a column vector of length n j Design-based evaluation (assume n j = n for all j): ( V(Yj ()) Finite Sample Bias = m 2 + V(Y ) j()) m 2 Bias vanishes asymptotically as m with n fixed j= X j Implication: cluster standard errors by the unit of treatment assignment Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 9 / 57 X j

Example: Seguro Popular de Salud (SPS) Evaluation of the Mexican universal health insurance program Aim: provide social protection in health to the 5 million uninsured Mexicans A key goal: reduce out-of-pocket health expenditures Sounds obvious but not easy to achieve in developing countries Individuals must affiliate in order to receive SPS services health clusters non-randomly chosen for evaluation Matched-pair design: based on population, socio-demographics, poverty, education, health infrastructure etc. reatment clusters : encouragement for people to affiliate Data: aggregate characteristics, surveys of 32, individuals Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) / 57

Matching and Blocking for Randomized Experiments Okay, but how should I match/block without the treatment group? Goal: match/block well on powerful predictors of outcome (prognostic factors) (oarsened) Exact matching Matching based on a similarity measure: Mahalanobis distance = (X i X j ) Σ (X i X j ) ould combine the two Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) / 57

Relative Efficiency of Matched-Pair Design (MPD) ompare with completely-randomized design Greater (positive) correlation within pair greater efficiency PAE: MPD is between.8 and 38.3 times more efficient! Relative Efficiency, PAE 2 3..5..5 2. 2.5 3. Relative Efficiency, UAE Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 2 / 57

hallenges of Observational Studies Randomized experiments vs. Observational studies radeoff between internal and external validity Endogeneity: selection bias Generalizability: sample selection, Hawthorne effects, realism Statistical methods cannot replace good research design Designing observational studies Natural experiments (haphazard treatment assignment) Examples: birthdays, weather, close elections, arbitrary administrative rules and boundaries Replicating randomized experiments Key Questions: Where are the counterfactuals coming from? 2 Is it a credible comparison? Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 3 / 57

Identification of the Average reatment Effect Assumption : Overlap (i.e., no extrapolation) < Pr( i = X i = x) < for any x X Assumption 2: Ignorability (exogeneity, unconfoundedness, no omitted variable, selection on observables, etc.) {Y i (), Y i ()} i X i = x for any x X onditional expectation function: µ(t, x) = E(Y i (t) i = t, X i = x) Regression-based estimator: ˆτ = n n {ˆµ(, X i ) ˆµ(, X i )} i= Delta method is pain, but simulation is easy via Zelig Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 4 / 57

Matching as Nonparametric Preprocessing READING: Ho et al. Political Analysis (27) Assume exogeneity holds: matching does NO solve endogeneity Need to model E(Y i i, X i ) Parametric regression functional-form/distributional assumptions = model dependence Non-parametric regression = curse of dimensionality Preprocess the data so that treatment and control groups are similar to each other w.r.t. the observed pre-treatment covariates Goal of matching: achieve balance = independence between and X Replicate randomized treatment w.r.t. observed covariates Reduced model dependence: minimal role of statistical modeling Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 5 / 57

Sensitivity Analysis onsider a simple pair-matching of treated and control units Assumption: treatment assignment is random Difference-in-means estimator Question: How large a departure from the key (untestable) assumption must occur for the conclusions to no longer hold? Rosenbaum s sensitivity analysis: for any pair j, Γ Pr( j = )/ Pr( j = ) Pr( 2j = )/ Pr( 2j = ) Γ Under ignorability, Γ = for all j How do the results change as you increase Γ? Limitations of sensitivity analysis FURHER READING: P. Rosenbaum. Observational Studies. Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 6 / 57

he Role of Propensity Score he probability of receiving the treatment: π(x i ) Pr( i = X i ) he balancing property (no assumption): i X i π(x i ) Exogeneity given the propensity score (under exogeneity given covariates): (Y i (), Y i ()) i π(x i ) Dimension reduction But, true propensity score is unknown: propensity score tautology (more later) Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 7 / 57

lassical Matching echniques Exact matching Mahalanobis distance matching: Propensity score matching (X i X j ) Σ (X i X j ) One-to-one, one-to-many, and subclassification Matching with caliper Which matching method to choose? Whatever gives you the best balance! Importance of substantive knowledge: propensity score matching with exact matching on key confounders FURHER READING: Rubin (26). Matched Sampling for ausal Effects (ambridge UP) Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 8 / 57

How to heck Balance Success of matching method depends on the resulting balance How should one assess the balance of matched data? Ideally, compare the joint distribution of all covariates for the matched treatment and control groups In practice, this is impossible when X is high-dimensional heck various lower-dimensional summaries; (standardized) mean difference, variance ratio, empirical DF, etc. Frequent use of balance test t test for difference in means for each variable of X other test statistics; e.g., χ 2, F, Kolmogorov-Smirnov tests statistically insignificant test statistics as a justification for the adequacy of the chosen matching method and/or a stopping rule for maximizing balance Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 9 / 57

An Illustration of Balance est Fallacy t statistic 2 3 4 "Statistical insignificance" region Math test score 2 4 6 8 QQ Plot Mean Deviation Difference in Means 2 3 4 2 3 4 Number of ontrols Randomly Dropped Number of ontrols Randomly Dropped Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 2 / 57

Problems with Hypothesis ests as Stopping Rules Balance test is a function of both balance and statistical power he more observations dropped, the less power the tests have t-test is affected by factors other than balance, nm (X mt X mc ) s 2 mt r m + s2 mc r m X mt and X mc are the sample means s 2 mt and s 2 mc are the sample variances n m is the total number of remaining observations r m is the ratio of remaining treated units to the total number of remaining observations Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 2 / 57

Recent Advances in Matching Methods he main problem of matching: balance checking Skip balance checking all together Specify a balance metric and optimize it Optimal matching: minimize sum of distances Full matching: subclassification with variable strata size Genetic matching: maximize minimum p-value oarsened exact matching: exact match on binned covariates SVM subsetting: find the largest, balanced subset for general treatment regimes Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 22 / 57

Inverse Propensity Score Weighting Matching is inefficient because it throws away data Matching is a special case of weighting Weighting by inverse propensity score (Horvitz-hompson): n n i= ( i Y i ˆπ(X i ) ( ) i)y i ˆπ(X i ) Unstable when some weights are extremely small An improved weighting scheme: n i= { iy i /ˆπ(X i )} n i= { i/ˆπ(x i )} n i= {( i)y i /( ˆπ(X i ))} n i= {( i)/( ˆπ(X i ))} Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 23 / 57

Weighting Both Groups to Balance ovariates { Balancing condition: E i X i π(x i ) ( i )X i π(x i ) } = 2 3 4 5 6 AE weighted ontrol units AE weighted reated units..2.4.6.8. Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 24 / 57

Weighting ontrol Group to Balance ovariates { } Balancing condition: E i X i π(x i )( i )X i π(x i ) = 2 3 4 5 6 A weighted ontrol units reated units..2.4.6.8. Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 25 / 57

Efficient Doubly-Robust Estimators he estimator by Robins et al. : { n ˆτ DR ˆµ(, X i ) + n n n i= i= { n ˆµ(, X i ) + n n n i= i= } i (Y i ˆµ(, X i )) ˆπ(X i ) ( i )(Y i ˆµ(, X i )) ˆπ(X i ) onsistent if either the propensity score model or the outcome model is correct (Semiparametrically) Efficient FURHER READING: Lunceford and Davidian (24, Stat. in Med.) } Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 26 / 57

Marginal Structural Models for Longitudinal Data Units i =,..., N and time j =,..., J Eventual outcome Y i measured at time J reatment and covariate history: ij and X ij Quantity of interest: (marginal) AE = E{Y i ( t)} Sequential ignorability assumption: Y i (t) ij X ij, i,j Inverse-probability-of-treatment weight: w i = P( ij X ij ) = J j= P( ij i,j, X ij ) Stabilized weight: multiply w i by P( ij ) Analysis: weighted regression of Y i on ij FURHER READINGS: Robins et al. (2), Blackwell (23) Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 27 / 57

Propensity Score autology Propensity score is unknown Dimension reduction is purely theoretical: must model i given X i Diagnostics: covariate balance checking In practice, adhoc specification searches are conducted Model misspecification is always possible autology: propensity score works only when you get it right! In fact, estimated propensity score works even better than true propensity score when the model is correct heory (Rubin et al.): ellipsoidal covariate distributions = equal percent bias reduction Skewed covariates are common in applied settings Propensity score methods can be sensitive to misspecification Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 28 / 57

Kang and Schafer (27, Statistical Science) Simulation study: the deteriorating performance of propensity score weighting methods when the model is misspecified Setup: 4 covariates Xi : all are i.i.d. standard normal Outcome model: linear model Propensity score model: logistic model with linear predictors Misspecification induced by measurement error: X i = exp(xi/2) X i2 = Xi2/( + exp(xi) + ) X i3 = (XiX i3/25 +.6) 3 X i4 = (Xi + Xi4 + 2) 2 Weighting estimators to be evaluated: Horvitz-hompson 2 Inverse-probability weighting with normalized weights 3 Weighted least squares regression 4 Doubly-robust least squares regression Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 29 / 57

Weighting Estimators Do Great If the Model is orrect Bias RMSE Sample size Estimator GLM rue GLM rue () Both models correct H.33.9 2.6 23.93 n = 2 IPW.3.3 3.98 5.3 WLS.4.4 2.58 2.58 DR.4.4 2.58 2.58 H..8 4.92.47 n = IPW..5.75 2.22 WLS...4.4 DR...4.4 (2) Propensity score model correct H.32.7 2.49 23.49 n = 2 IPW.27.35 3.94 4.9 WLS.7.7 2.59 2.59 DR.7.7 2.59 2.59 H.3. 4.93.62 n = IPW.2.4.76 2.26 WLS...4.4 DR...4.4 Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 3 / 57

Weighting Estimators Are Sensitive to Misspecification Bias RMSE Sample size Estimator GLM rue GLM rue (3) Outcome model correct H 24.25.8 94.58 23.24 n = 2 IPW.7.26 9.75 4.93 WLS 2.29.4 4.3 3.3 DR.8. 2.67 2.58 H 4.4.23 238.4.42 n = IPW 4.93.2.44 2.2 WLS 2.94.2 3.29.47 DR.2..89.3 (4) Both models incorrect H 3.32.38 266.3 23.86 n = 2 IPW.93.9.5 5.8 WLS 2.3.55 3.87 3.29 DR 7.46.37 5.3 3.74 H.47. 237.8.53 n = IPW 5.6.2 2.7 2.25 WLS 2.95.9 3.3.47 DR 48.66.8 37.9.8 Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 3 / 57

ovariate Balancing Propensity Score Recall the dual characteristics of propensity score onditional probability of treatment assignment 2 ovariate balancing score Implied moment conditions: Score equation: { i π β E (X i) ( i)π β (X } i) π β (X i ) π β (X i ) = 2 Balancing condition: { } i Xi E π β (X i ) ( i) X i π β (X i ) = where X i = f (X i ) is any vector-valued function Score condition is a particular covariate balancing condition! Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 32 / 57

Estimation and Inference Just-identified BPS: Find the values of model parameters that satisfy covariate balancing conditions in the sample Method of moments: # of parameters = # of balancing conditions Over-identified BPS: # of parameters < # of balancing conditions Generalized method of moments (GMM): ˆβ = argmin β Θ ḡ β (, X) Σ β ḡβ(, X) where ḡ β (, X) = N N i= i π β (X i ) π β (X i ) ( i )π β (X i ) π β (X i ) i Xi π β (X i ) ( i ) X i π β (X i ) and Σ β is the covariance of moment conditions Enables misspecification test Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 33 / 57

Revisiting Kang and Schafer (27) Bias RMSE Sample size Estimator GLM BPS BPS2 rue GLM BPS BPS2 rue () Both models correct H.33 2.6 4.74.9 2.6 4.68 9.33 23.93 n = 2 IPW.3.5.2.3 3.98 3.22 3.5 5.3 WLS.4.4.4.4 2.58 2.58 2.58 2.58 DR.4.4.4.4 2.58 2.58 2.58 2.58 H..44.59.8 4.92.76 4.8.47 n = IPW..3.32.5.75.44.6 2.22 WLS.....4.4.4.4 DR.....4.4.4.4 (2) Propensity score model correct H.5.99 4.94.4 4.39 4.57 9.39 24.28 n = 2 IPW.3.2.3.8 4.8 3.22 3.55 4.97 WLS.4.4.4.4 2.5 2.5 2.5 2.5 DR.4.4.4.4 2.5 2.5 2.5 2.5 H.2.44.67.29 4.85.77 4.22.62 n = IPW.2.5.3.3.75.45.6 2.27 WLS.4.4.4.4.4.4.4.4 DR.4.4.4.4.4.4.4.4 Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 34 / 57

BPS Makes Weighting Methods More Robust Bias RMSE Sample size Estimator GLM BPS BPS2 rue GLM BPS BPS2 rue (3) Outcome model correct H 24.25.9 5.42.8 94.58 5.4.7 23.24 n = 2 IPW.7.37 2.84.26 9.75 3.42 4.74 4.93 WLS 2.29 2.37 2.9.4 4.3 4.6 3.96 3.3 DR.8... 2.67 2.58 2.58 2.58 H 4.4 2.2 2.8.23 238.4 2.97 6.65.42 n = IPW 4.93.39.82.2.44 2. 2.26 2.2 WLS 2.94 2.99 2.95.2 3.29 3.37 3.33.47 DR.2....89.3.3.3 (4) Both models incorrect H 3.32.27 5.3.38 266.3 5.2.62 23.86 n = 2 IPW.93.26 2.77.9.5 3.37 4.67 5.8 WLS 2.3 2.2 2.4.55 3.87 3.9 3.8 3.29 DR 7.46 2.59 2.3.37 5.3 4.27 3.99 3.74 H.47 2.5.9. 237.8 3.2 6.75.53 n = IPW 5.6.44.92.2 2.7 2.6 2.39 2.25 WLS 2.95 3. 2.98.9 3.3 3.4 3.36.47 DR 48.66 3.59 3.79.8 37.9 4.2 4.25.8 Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 35 / 57

BPS Sacrifices Likelihood for Better Balance 64 6 58 55 52 64 6 58 55 52 GLM BPS Log Likelihood 64 6 58 55 52 64 6 58 55 52 GLM BPS 2 3 2 3 ovariate Imbalance GLM BPS 2 3 2 3 GLM BPS 2 3.. 5 2 Log Likelihood (GLM BPS) Log Imbalance (GLM BPS) Likelihood Balance radeoff 2 3.. 5 2 Log Likelihood (GLM BPS) Log Imbalance (GLM BPS) Neither Model Both Models Specified orrectly Specified orrectly Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 36 / 57

A lose Look at Fixed Effects Regression Fixed effects models are a primary workhorse for causal inference Used for stratified experimental and observational data Also used to adjust for unobservables in observational studies: Good instruments are hard to find..., so we d like to have other tools to deal with unobserved confounders. his chapter considers... strategies that use data with a time or cohort dimension to control for unobserved but fixed omitted variables (Angrist & Pischke, Mostly Harmless Econometrics) fixed effects regression can scarcely be faulted for being the bearer of bad tidings (Green et al., Dirty Pool) ommon claim: Fixed effects models are superior to matching estimators because the latter can only adjust for observables Question: What are the exact causal assumptions underlying fixed effects regression models? Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 37 / 57

Matching and Regression in ross-section Settings Units 2 3 4 5 reatment status Outcome Y Y 2 Y 3 Y 4 Y 5 Estimating the Average reatment Effect (AE) via matching: Y 2 (Y 3 + Y 4 ) Y 2 2 (Y 3 + Y 4 ) 3 (Y + Y 2 + Y 5 ) Y 3 3 (Y + Y 2 + Y 5 ) Y 4 Y 5 2 (Y 3 + Y 4 ) Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 38 / 57

Matching Representation of Simple Regression ross-section simple linear regression model: Y i = α + βx i + ɛ i Binary treatment: X i {, } Equivalent matching estimator: where Ŷ i () = Ŷ i () = ˆβ = N N (Ŷi Ŷi()) () i= { Y i if X i = N N i = X i i = X i Y i if X i = { N N i = ( X i ) i = ( X i )Y i if X i = Y i if X i = reated units matched with the average of non-treated units Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 39 / 57

One-Way Fixed Effects Regression Simple (one-way) FE model: Y it = α i + βx it + ɛ it ommonly used by applied researchers: Stratified randomized experiments (Duflo et al. 27) Stratification and matching in observational studies Panel data, both experimental and observational ˆβ FE may be biased for the AE even if X it is exogenous within each unit It converges to the weighted average of conditional AEs: where σ 2 i ˆβ FE = t= (X it X i ) 2 / p E{AE i σi 2 } E(σi 2 ) How are counterfactual outcomes estimated under the FE model? Unit fixed effects = within-unit comparison Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 4 / 57

Mismatches in One-Way Fixed Effects Model ime periods Units : treated observations : control observations ircles: Proper matches riangles: Mismatches = attenuation bias Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 4 / 57

Matching Representation of Fixed Effects Regression Proposition Ŷ it (x) = K = ˆβ FE = K { N { N Y it t t Y it N X it i= t= N i= t= if X it = x if X it = x ) } (Ŷit () Ŷit(), for x =, ( X it ) + ( X it ) t t t t X it. K : average proportion of proper matches across all observations More mismatches = larger adjustment Adjustment is required except very special cases Fixes attenuation bias but this adjustment is not sufficient Fixed effects estimator is a special case of matching estimators Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 42 / 57

Unadjusted Matching Estimator ime periods Units onsistent if the treatment is exogenous within each unit Only equal to fixed effects estimator if heterogeneity in either treatment assignment or treatment effect is non-existent Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 43 / 57

Unadjusted Matching = Weighted FE Estimator Proposition 2 he unadjusted matching estimator where Ŷ it () = ˆβ M = N N i= Y it if X it = t = X it Y it t = X it if X it = (Ŷit Ŷit()) () t= and Ŷ it () = is equivalent to the weighted fixed effects model (ˆα M, ˆβ M ) = argmin W it (α,β) N i= W it (Y it α i βx it ) 2 t= t = X it if X it =, t = ( X it ) if X it =. t = ( X it )Y it t = ( X it ) if X it = Y it if X it = Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 44 / 57

Equal Weights reatment 2 2 2 2 Weights Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 45 / 57

Different Weights reatment 2 3 3 3 4 4 Weights Any within-unit matching estimator leads to weighted fixed effects regression with particular weights We derive regression weights given any matching estimator for various quantities (AE, A, etc.) Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 46 / 57

First Difference = Matching = Weighted One-Way FE Y it = β X it + ɛ it where Y it = Y it Y i,t, X it = X it X i,t reatment Weights Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 47 / 57

Mismatches in wo-way FE Model ime periods Y it = α i + γ t + βx it + ɛ it Units riangles: wo kinds of mismatches Same treatment status Neither same unit nor same time Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 48 / 57

Mismatches in Weighted wo-way FE Model Units ime periods Some mismatches can be eliminated You can NEVER eliminate them all Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 49 / 57

ross Section Analysis = Weighted ime FE Model Average Outcome..2.4.6.8. control group treatment group counterfactual time t time t+ Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 5 / 57

First Difference = Weighted Unit FE Model Average Outcome..2.4.6.8. control group treatment group counterfactual time t time t+ Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 5 / 57

What about Difference-in-Differences (DiD)? Average Outcome..2.4.6.8. control group treatment group counterfactual time t time t+ Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 52 / 57

General DiD = Weighted wo-way (Unit and ime) FE 2 2: standard two-way fixed effects General setting: Multiple time periods, repeated treatments reatment Weights 2 2 2 2 Weights can be negative = the method of moments estimator Fast computation is available Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 53 / 57

Effects of GA Membership on International rade ontroversy Rose (24): No effect of GA membership on trade omz et al. (27): Significant effect with non-member participants 2 he central role of fixed effects models: Rose (24): one-way (year) fixed effects for dyadic data omz et al. (27): two-way (year and dyad) fixed effects Rose (25): I follow the profession in placing most confidence in the fixed effects estimators; I have no clear ranking between country-specific and country pair-specific effects. omz et al. (27): We, too, prefer FE estimates over OLS on both theoretical and statistical ground Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 54 / 57

Data and Methods Data Data set from omz et al. (27) Effect of GA: 948 994 62 countries, and 96,27 (dyad-year) observations Year fixed effects model: standard and weighted ln Y it = α t + βx it + δ Z it + ɛ it X it : Formal membership/participant () Both vs. One, (2) One vs. None, (3) Both vs. One/None Z it : 5 dyad-varying covariates (e.g., log product GDP) Year fixed effects: standard, weighted, and first difference wo-way fixed effects: standard and difference-in-differences Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 55 / 57

Empirical Results Year Fixed Effects Dyad Fixed Effects Year and Dyad Fixed Effects omparison Membership Standard Weighted Standard Weighted First Diff. Standard Diff.-in-Diff. Formal.4.2.48.69.75.98.9 (N=96,27) (.3) (.3) (.25) (.23) (.54) (.28) (.33) White s p-value..64..58 Both vs. Mix Participants.99.93.47..96.32. (N=96,27) (.34) (.35) (.3) (.29) (.3) (.34) (.28) White s p-value.998..2. Formal.6.5.34.6.76.5.6 (N=75,84) (.3) (.3) (.25) (.23) (.55) (.28) (.33) White s p-value..3..34 Both vs. One Participants.8.74.6.2.99.332.9 (N=87,65) (.35) (.36) (.3) (.29) (.3) (.34) (.29) White s p-value.999..86. Formal.7.46..94.3.82.2 (N=9,72) (.53) (.56) (.4) (.4) (.67) (.43) (.378) White s p-value.276.58..789 One vs. None Participants.63.7.8.34.53.244.7 (N=7,298) (.72) (.79) (.62) (.58) (.63) (.66) (.85) White s p-value.46.4..26 covariates dyad-varying covariates year-varying covariates no covariate able : Estimated Effects of GA on the Logarithm of Bilateral rade based on Various Fixed Effects Models: For models with either year or dyad fixed effects, the Weighted columns present the estimates based on the regression weights given in Proposition 2, which yield the estimated average treatment effects. Other weighted fixed effects estimators, i.e., first differences and difference-in-differences, Kosuke Imai (Princeton) are also presented. Matching Bothand vs. Weighting Mix ( Both Methods vs. One ) represents Duke (January the comparison 8 9, 23) between dyads 56 / 57of

oncluding Remarks Matching methods do: make causal assumptions transparent by identifying counterfactuals make regression models robust by reducing model dependence But they cannot solve endogeneity Only good research design can overcome endogeneity Recent advances in matching methods directly optimize balance the same idea applied to propensity score Weighting methods generalize matching methods Sensitive to propensity score model specification Robust estimation of propensity score model Next methodological challenges for causal inference: temporal and spatial dynamics, networks effects Kosuke Imai (Princeton) Matching and Weighting Methods Duke (January 8 9, 23) 57 / 57