High Dimensional Propensity Score Estimation via Covariate Balancing


High Dimensional Propensity Score Estimation via Covariate Balancing

Kosuke Imai, Princeton University
Talk at Columbia University, May 13, 2017
Joint work with Yang Ning and Sida Peng

Kosuke Imai (Princeton) Covariate Balancing Propensity Score Columbia (May 13, 2017) 1 / 20

Motivation

Covariate adjustment in observational studies:
1. Identify and measure a set of potential confounders
2. Weight or match the treated and control units to make them similar

Practical questions:
1. Which covariates should we adjust for?
2. How should we adjust for them?
3. What should we do if we have many potential confounders?

Major advances over the last several years in the literature: Tan, Hainmueller, Graham et al., Zubizarreta, Belloni et al., Chan et al., Farrell, Chernozhukov et al., Athey et al., Zhao, etc.

Covariate Balancing Propensity Score (CBPS; Imai & Ratkovic, JRSSB; Fan et al.):
- Estimate the propensity score such that covariates are balanced
- Extensions: continuous treatment (Fong et al.), dynamic treatment (Imai & Ratkovic, JASA), and the high-dimensional setting (Ning et al.)

Overview

1. Review of propensity score and CBPS
   - Inverse probability weighting (IPW)
   - Propensity score tautology -> covariate balancing
   - Optimal covariate balancing for CBPS
2. High-dimensional CBPS (HD-CBPS)
   - Impossible to balance all covariates -> trade-off between covariates
   - Sparsity assumption + regularization
   - Balance the covariates that are predictive of the outcome
   - Remove the bias (due to regularization) by covariate balancing

Propensity Score and CBPS

T_i ∈ {0, 1}: binary treatment
X_i: d-dimensional pre-treatment covariates
Identification assumptions: SUTVA, overlap, unconfoundedness

Dual characteristics of the propensity score:
1. It predicts treatment assignment: \pi(X_i) = \Pr(T_i = 1 \mid X_i)
2. It balances covariates:

   E\left\{ \frac{T_i}{\pi(X_i)} f(X_i) \right\} = E\left\{ \frac{1 - T_i}{1 - \pi(X_i)} f(X_i) \right\} = E\{f(X_i)\}

Propensity score tautology
CBPS: estimate the propensity score such that balance is optimized
Covariate balancing as estimating equations
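The balancing property above is easy to check by simulation. This illustrative Python sketch (the one-covariate model and its coefficient are invented for the example) weights an arbitrary function f(X) = X^2 by the true propensity score and compares the three expectations:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(size=n)                        # a single covariate, for simplicity
pi = 1 / (1 + np.exp(-0.5 * X))               # true propensity score Pr(T = 1 | X)
T = rng.binomial(1, pi)

f = X ** 2                                    # any function f of the covariates
treated = np.mean(T / pi * f)                 # E{ T / pi(X) * f(X) }
control = np.mean((1 - T) / (1 - pi) * f)     # E{ (1 - T) / (1 - pi(X)) * f(X) }
target = np.mean(f)                           # E{ f(X) }
print(treated, control, target)               # all three agree up to Monte Carlo error
```

In practice the true propensity score is unknown, which is exactly why CBPS turns these balance conditions into estimating equations for it.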

Optimal Covariate Balancing Equations

Which covariates, and what functions of them, should we balance? The outcome model matters.

Outcome model: E(Y_i(t) \mid X_i) = K_t(X_i) for t = 0, 1
Possibly misspecified propensity score: \pi_\beta(X_i) with limit \pi_{\beta^*}(X_i)

Inverse probability weighting (IPW) estimator:

   \frac{1}{n} \sum_{i=1}^{n} \left\{ \frac{T_i Y_i}{\pi_{\hat\beta}(X_i)} - \frac{(1 - T_i) Y_i}{1 - \pi_{\hat\beta}(X_i)} \right\}

Asymptotic bias for the ATE:

   E\left[ \left( \frac{T_i}{\pi_{\beta^*}(X_i)} - \frac{1 - T_i}{1 - \pi_{\beta^*}(X_i)} \right) \underbrace{\left\{ \pi_{\beta^*}(X_i) K_0(X_i) + (1 - \pi_{\beta^*}(X_i)) K_1(X_i) \right\}}_{\text{optimal } f(X_i)} \right]
   = E\left[ \underbrace{\left( 1 - \frac{1 - T_i}{1 - \pi_{\beta^*}(X_i)} \right) K_0(X_i)}_{\text{balance control group}} + \underbrace{\left( \frac{T_i}{\pi_{\beta^*}(X_i)} - 1 \right) K_1(X_i)}_{\text{balance treatment group}} \right]
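The equality between the two bias expressions is in fact a pointwise algebraic identity, not just an equality of expectations. A few lines of Python with arbitrary placeholder values for \pi_{\beta^*}, K_0, and K_1 confirm it:

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.binomial(1, 0.5, size=10).astype(float)
pi = rng.uniform(0.2, 0.8, size=10)   # arbitrary pi_{beta*}(X_i) values
K0 = rng.normal(size=10)              # arbitrary K_0(X_i) values
K1 = rng.normal(size=10)              # arbitrary K_1(X_i) values

lhs = (T / pi - (1 - T) / (1 - pi)) * (pi * K0 + (1 - pi) * K1)
rhs = (1 - (1 - T) / (1 - pi)) * K0 + (T / pi - 1) * K1
print(np.allclose(lhs, rhs))          # True: the identity holds term by term
```

Setting T = 1 and T = 0 in turn shows both sides reduce to K_0 (T - \pi)/(1 - \pi) + K_1 (T - \pi)/\pi, which is why balancing K_0 over the control group and K_1 over the treatment group removes the bias.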

Covariate Balancing in High Dimension

We focus on the estimation of E(Y_i(1)); a similar estimation strategy can be applied to E(Y_i(0)).

Low dimension -> strong covariate balancing:

   \frac{1}{n} \sum_{i=1}^{n} \left( \frac{T_i}{\hat\pi_i} - 1 \right) X_i = 0

-> asymptotic normality, semiparametric efficiency, double robustness (Fan et al.)

High dimension -> weak covariate balancing:

   \frac{1}{n} \sum_{i=1}^{n} \left( \frac{T_i}{\hat\pi_i} - 1 \right) \alpha^\top X_i = 0

Assumptions for Estimating \mu_1 = E(Y_i(1))

1. Linear outcome model: K_1(X_i) = \alpha^\top X_i
2. Logistic propensity score model:

   \Pr(T_i = 1 \mid X_i) = \pi(\beta^\top X_i) = \frac{\exp(\beta^\top X_i)}{1 + \exp(\beta^\top X_i)}

3. Sparsity of both models: \max(s_1, s_2) \log d \, \sqrt{\log n / n} = o(1), where s_1 = \#\{j : \beta_j \neq 0\} and s_2 = \#\{j : \alpha_j \neq 0\}
4. Tuning parameters for the Lasso: \lambda = a \sqrt{\log(d)/n} and \lambda' = a' \sqrt{\log(d)/n} for some unknown constants a and a'

The Proposed Methodology

1. Fit the penalized logistic regression model for the propensity score:

   \hat\beta = \operatorname{argmin}_{\beta \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} \left\{ -T_i (\beta^\top X_i) + \log(1 + \exp(\beta^\top X_i)) \right\} + \lambda \|\beta\|_1

2. Fit the penalized linear regression model for the outcome:

   \tilde\alpha = \operatorname{argmin}_{\alpha \in \mathbb{R}^d} \; \frac{1}{n} \sum_{i=1}^{n} T_i (Y_i - \alpha^\top X_i)^2 + \lambda' \|\alpha\|_1

3. Calibrate the estimated propensity score by balancing covariates:

   \tilde\gamma = \operatorname{argmin}_{\gamma \in \mathbb{R}^{|S|}} \; \left\| \frac{1}{n} \sum_{i=1}^{n} \left\{ \frac{T_i}{\pi(\gamma^\top X_{i,S} + \hat\beta_{S^c}^\top X_{i,S^c})} - 1 \right\} X_{i,S} \right\|_2^2

   where S = \{j : \tilde\alpha_j \neq 0\}

4. Use the calibrated propensity score \tilde\pi_i = \pi(\tilde\gamma^\top X_{i,S} + \hat\beta_{S^c}^\top X_{i,S^c}) in the IPW estimator \hat\mu_1 = \frac{1}{n} \sum_{i=1}^{n} T_i Y_i / \tilde\pi_i
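The four steps above can be sketched in Python. This is a minimal illustration under simplifying assumptions, not the authors' implementation (that is the R package CBPS): it uses scikit-learn's L1-penalized fits with arbitrary default penalties instead of the theory-driven \lambda, omits the intercept (assuming roughly centered covariates), and solves step 3 by generic numerical minimization of the squared balance conditions.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.linear_model import Lasso, LogisticRegression

def hd_cbps_mu1(X, T, Y, lam_ps=0.001, lam_out=0.01):
    """Minimal sketch of the four-step HD-CBPS estimator of mu_1 = E[Y(1)]."""
    n, d = X.shape
    # Step 1: L1-penalized logistic regression for the propensity score.
    # (liblinear minimizes ||b||_1 + C * sum of logistic losses, so C ~ 1/(n*lambda))
    beta = LogisticRegression(penalty="l1", C=1.0 / (n * lam_ps),
                              solver="liblinear", fit_intercept=False
                              ).fit(X, T).coef_.ravel()
    # Step 2: L1-penalized linear regression for the outcome, treated units only.
    alpha = Lasso(alpha=lam_out, fit_intercept=False).fit(X[T == 1], Y[T == 1]).coef_
    S = np.flatnonzero(alpha)                # covariates selected by the outcome model
    Sc = np.setdiff1d(np.arange(d), S)
    offset = X[:, Sc] @ beta[Sc]             # keep beta-hat on the unselected covariates
    # Step 3: recalibrate the selected coefficients by covariate balancing,
    # minimizing the squared norm of the conditions (T_i / pi_i - 1) X_iS.
    def imbalance(gamma):
        pi = 1.0 / (1.0 + np.exp(-(X[:, S] @ gamma + offset)))
        m = ((T / pi - 1.0)[:, None] * X[:, S]).mean(axis=0)
        return m @ m
    gamma = minimize(imbalance, beta[S], method="BFGS").x
    # Step 4: IPW with the calibrated propensity score.
    pi = 1.0 / (1.0 + np.exp(-(X[:, S] @ gamma + offset)))
    return np.mean(T * Y / pi)
```

On data where both working models hold, the estimate lands near the true mean of Y(1); the calibration in step 3 is what removes the regularization bias left by steps 1 and 2.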

Theoretical Properties

1. \sqrt{n}-consistency:

   \hat\mu_1 - \mu_1 = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{T_i}{\tilde\pi_i} (Y_i(1) - \alpha^\top X_i) + \alpha^\top X_i - \mu_1 \right] + o_p(n^{-1/2})

2. Asymptotic normality and semiparametric efficiency:

   \sqrt{n} (\hat\mu_1 - \mu_1) \stackrel{D}{\to} N\left( 0, \; E\left[ \frac{E(\epsilon_{i1}^2 \mid X_i)}{\pi(X_i)} + (\alpha^\top X_i - \mu_1)^2 \right] \right)

   valid under K-fold cross-validation

3. Sample boundedness:

   \hat\mu_1 \geq \min_{i: T_i = 1} Y_i \cdot \frac{1}{n} \sum_{i=1}^{n} \frac{T_i}{\tilde\pi_i} = \min_{i: T_i = 1} Y_i

   and similarly \hat\mu_1 \leq \max_{i: T_i = 1} Y_i
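Sample boundedness holds because, once the weights are calibrated so that (1/n) \sum_i T_i / \tilde\pi_i = 1, the estimator is a convex weighted average of the treated outcomes. A toy Python check (the "estimated" propensity scores here are arbitrary placeholders, rescaled by hand to satisfy that constraint):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
T = rng.binomial(1, 0.5, size=n)
Y = rng.normal(size=n)
raw = rng.uniform(0.1, 0.9, size=n)           # placeholder estimated propensity scores
pi = raw * np.sum(T / raw) / n                # rescale so (1/n) * sum_i T_i / pi_i = 1
mu1 = np.mean(T * Y / pi)                     # IPW estimate with calibrated weights

treated = Y[T == 1]
print(treated.min() <= mu1 <= treated.max())  # True: mu1 is a convex average of treated Y
```

Estimators without this normalization (plain IPW or AIPW with unnormalized weights) can land outside the range of the observed outcomes.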

Robustness and Generalizations

1. Multi-valued treatment regimes: use penalized multinomial logistic regression and balance the selected covariates for each treatment group.
2. Generalized linear models for the outcome (exponential family):

   p(y \mid X) = h(y, \psi) \exp[\{y \, \alpha^\top X - b(\alpha^\top X)\} / a(\psi)]

   Balance f(X_i) = (b'(\tilde\alpha^\top X_i), \; b''(\tilde\alpha^\top X_i) X_{i,S}), motivated by the expansion b'(\alpha^\top X_i) \approx b'(\tilde\alpha^\top X_i) + b''(\tilde\alpha^\top X_i) (\alpha - \tilde\alpha)^\top X_i.
3. Misspecified propensity score model: the theoretical properties hold so long as the outcome model is correct; the sure screening property must hold for S in the outcome model.

Comparison with Other Methods

1. Augmented inverse probability weighting (AIPW): Belloni et al. (2014), Farrell (2015), Chernozhukov et al. (2016). Extends Robins's doubly robust estimator to high dimensions:

   \frac{1}{n} \sum_{i=1}^{n} \left\{ \hat{m}_1(X_i) + \frac{T_i (Y_i - \hat{m}_1(X_i))}{\hat\pi(X_i)} \right\}

   Double selection method.
2. Residual balancing (RB): Athey, Imbens, and Wager (2016). Assumes sparsity only for the outcome model, but critically relies on the linearity assumption.
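The AIPW moment above is short to state in code. This hedged Python sketch (the data-generating model is invented for illustration) also demonstrates the double-robustness intuition: with the outcome model set to zero everywhere (badly misspecified) but the propensity score correct, the estimate still recovers E[Y(1)]:

```python
import numpy as np

def aipw_mu1(Y, T, m1_hat, pi_hat):
    """AIPW estimate of mu_1 = E[Y(1)] given fitted outcome and propensity values."""
    return np.mean(m1_hat + T * (Y - m1_hat) / pi_hat)

rng = np.random.default_rng(0)
n = 100_000
X = rng.normal(size=n)
pi = 1 / (1 + np.exp(-X))                  # true propensity score
T = rng.binomial(1, pi)
Y = np.where(T == 1, 1 + X + rng.normal(size=n), 0.0)

# Misspecified outcome model, correct propensity score: still near E[Y(1)] = 1
print(aipw_mu1(Y, T, m1_hat=np.zeros(n), pi_hat=pi))
```

In high dimensions both nuisances are fit with regularization, and the cited papers supply the conditions (e.g., double selection, sample splitting) under which the plug-in version retains valid inference.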

Simulation Studies

Design inspired by Kang and Schafer (2007).

Covariates: X_i \stackrel{i.i.d.}{\sim} N(0, \Sigma), where \Sigma_{jk} = \rho^{|j - k|}
Propensity score model:

   \Pr(T_i = 1 \mid X_i) = \mathrm{logit}^{-1}(-X_{i1} + 0.5 X_{i2} - 0.25 X_{i3} - 0.1 X_{i4})

Outcome models:

   Y_i(1) = 2 + 0.137 X_{i7} + 0.137 X_{i8} + 0.137 X_{i9} + \epsilon_{1i}
   Y_i(0) = 1 + 0.291 X_{i5} + 0.291 X_{i6} + 0.291 X_{i7} + 0.291 X_{i8} + 0.291 X_{i9} + \epsilon_{0i}

where \epsilon_{ti} \stackrel{i.i.d.}{\sim} N(0, 1) for t = 0, 1.

Irrelevant covariates are added so that the total number of covariates, d, varies from 10 to 2000.
Comparison with residual balancing (RB) and regularized augmented inverse probability weighting (AIPW).
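A design of this form can be generated with a short script. This sketch is not the authors' simulation code: the minus signs in the propensity score are reconstructed from the Kang and Schafer design, and the correlation parameter rho (unspecified on the slide) is set to an arbitrary default.

```python
import numpy as np

def simulate(n=1000, d=100, rho=0.5, seed=0):
    """One draw from a Kang-and-Schafer-inspired design like the one above."""
    rng = np.random.default_rng(seed)
    # Covariance Sigma_jk = rho^|j - k|; rho = 0.5 is an arbitrary default
    idx = np.arange(d)
    Sigma = rho ** np.abs(np.subtract.outer(idx, idx))
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    # Propensity score (signs reconstructed from the Kang-Schafer design)
    lin = -X[:, 0] + 0.5 * X[:, 1] - 0.25 * X[:, 2] - 0.1 * X[:, 3]
    T = rng.binomial(1, 1 / (1 + np.exp(-lin)))
    # Outcomes: Y(1) depends on X7-X9, Y(0) on X5-X9 (1-based indexing on the slide)
    Y1 = 2 + 0.137 * X[:, 6:9].sum(axis=1) + rng.normal(size=n)
    Y0 = 1 + 0.291 * X[:, 4:9].sum(axis=1) + rng.normal(size=n)
    Y = np.where(T == 1, Y1, Y0)
    return X, T, Y
```

Only X_1 through X_9 matter; the remaining d - 9 columns are the irrelevant covariates whose number is varied in the tables that follow.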

Both Models are Correct

Standardized root-mean-squared error: \sqrt{E(\hat\mu - \mu)^2} / \mu

         n = 100                         n = 1000
d        HD-CBPS   RB       AIPW        HD-CBPS   RB       AIPW
10       0.2163    0.2301   0.2209      0.0707    0.0720   0.0950
100      0.2272    0.2421   0.2273      0.0765    0.0787   0.0692
500      0.3262    0.3132   0.3859      0.0729    0.0810   0.1009
1000     0.2083    0.2413   0.2072      0.0817    0.0894   0.0992
2000     0.2327    0.2212   0.2058      0.0716    0.0760   0.0903

Propensity Score Model is Misspecified

Misspecification through a transformation of the covariates:

   X^{mis} = (\exp(X_1/2), \; X_2 (1 + \exp(X_1))^{-1} + 10, \; (X_1 X_3 / 25 + 0.6)^3, \; (X_2 + X_4 + 20)^2, \; X_5, \ldots, X_d)

         n = 100                         n = 1000
d        HD-CBPS   RB       AIPW        HD-CBPS   RB       AIPW
10       0.2230    0.2348   0.2174      0.0714    0.0734   0.0934
100      0.2163    0.2185   0.2104      0.0720    0.0784   0.0688
500      0.3255    0.3256   0.3742      0.0731    0.0852   0.1063
1000     0.2273    0.2462   0.2239      0.0892    0.0981   0.1074
2000     0.2232    0.2348   0.2061      0.0764    0.0844   0.0924

Both Models are Misspecified

Outcome model misspecification through a further transformation:

   X^{mis} = (\exp(X_1/2), \; X_2 (1 + \exp(X_1))^{-1} + 10, \; (X_1 X_3 / 25 + 0.6)^3, \; (X_2 + X_4 + 20)^2, \; X_6, \; \exp(X_6 + X_7), \; X_9^2, \; X_7^3 - 20, \; X_9, \ldots, X_d)

         n = 100                         n = 1000
d        HD-CBPS   RB       AIPW        HD-CBPS   RB       AIPW
10       0.2386    0.2462   0.2614      0.0655    0.0661   0.0724
100      0.2385    0.2556   0.2422      0.0647    0.0680   0.0693
500      0.3324    0.3400   0.3696      0.0790    0.0861   0.0765
1000     0.2226    0.2369   0.2209      0.0725    0.0822   0.0765
2000     0.2108    0.2157   0.1974      0.0843    0.0878   0.0972

Coverage of 95% Confidence Intervals

Both models are correctly specified; the average length of the confidence intervals is in parentheses.

d        n = 100            n = 500            n = 1000
10       0.9450 (0.8550)    0.9500 (0.3674)    0.9550 (0.2766)
100      0.9000 (0.7649)    0.9450 (0.3767)    0.9700 (0.2747)
500      0.9000 (0.8027)    0.9300 (0.3800)    0.9650 (0.2617)
1000     0.9350 (0.7268)    0.9700 (0.3718)    0.9350 (0.2697)
2000     0.9050 (0.7882)    0.9450 (0.4000)    0.9350 (0.2609)

Logistic Outcome Model

   \Pr(Y_i(1) = 1 \mid X_i) = \mathrm{logit}^{-1}(2 + 0.137 X_{i1} + 0.137 X_{i2} + 0.137 X_{i3})
   \Pr(Y_i(0) = 1 \mid X_i) = \mathrm{logit}^{-1}(1 + 0.291 X_{i1} + 0.291 X_{i2} + 0.291 X_{i3} + 0.291 X_{i4} + 0.291 X_{i5})

         n = 100             n = 500             n = 1000
d        HD-CBPS   AIPW     HD-CBPS   AIPW     HD-CBPS   AIPW
10       0.0947    0.0908   0.0398    0.0441   0.0252    0.0297
100      0.0745    0.0759   0.0352    0.0367   0.0239    0.0252
500      0.1075    0.1082   0.0351    0.0354   0.0303    0.0367
1000     0.0729    0.0730   0.0350    0.0358   0.0294    0.0320
2000     0.2113    0.2144   0.0357    0.0378   0.0232    0.0255

Empirical Illustration

Effect of college attendance on political participation (Kam and Palmer 2008, JOP).

Data: Youth-Parent Socialization Panel Study (n = 1051)
Outcome: the total number of (up to 8) participatory acts
Covariates: 81 pre-adult characteristics
Propensity score: a total of 204 predictors

The authors found little effect of college attendance. Henderson and Chatfield (2011) and Mayer (2011) use genetic matching and find a positive and statistically significant effect.

Empirical Results

Propensity score: logistic regression with 204 predictors
Outcome model: binomial logistic regression with 204 predictors
27 covariates are selected by HD-CBPS

Standard errors in parentheses:

                 HD-CBPS           CBPS              AIPW
Overall (ATE)    0.8293 (0.1247)   1.0163 (0.2380)   0.8796 (0.1043)
Overall (ATT)    0.8439 (0.1420)   1.1232 (0.3094)
Whites (ATE)     0.8445 (0.1279)                     0.8977 (0.1089)

Concluding Remarks

Covariate balancing is an efficient and robust way to estimate the propensity score.
The challenge in high dimensions: we cannot balance all covariates.

High-dimensional covariate balancing propensity score (HD-CBPS):
- sparsity assumption + weak covariate balancing
- efficient and robust to misspecification of the propensity score model
- greater reliance on the outcome model specification -> loss of double robustness

HD-CBPS is implemented in the R package CBPS.