Discussion: Can We Get More Out of Experiments?

Similar documents
Advanced Quantitative Research Methodology, Lecture Notes: Research Designs for Causal Inference 1

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Use of Matching Methods for Causal Inference in Experimental and Observational Studies. This Talk Draws on the Following Papers:

Statistical Analysis of Causal Mechanisms

Discussion of Papers on the Extensions of Propensity Score

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Covariate Balancing Propensity Score for General Treatment Regimes

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Statistical Analysis of Causal Mechanisms

Gov 2002: 5. Matching

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

An Introduction to Causal Mediation Analysis. Xu Qin University of Chicago Presented at the Central Iowa R User Group Meetup Aug 10, 2016

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Statistical Analysis of Causal Mechanisms

Adjusting treatment effect estimates by post-stratification in randomized experiments

Unpacking the Black-Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

arxiv: v2 [math.st] 18 Oct 2017

Causal Mediation Analysis in R. Quantitative Methodology and Causal Mechanisms

Achieving Optimal Covariate Balance Under General Treatment Regimes

Entropy Balancing for Causal Effects: A Multivariate Reweighting Method to Produce Balanced Samples in Observational Studies

Gov 2002: 4. Observational Studies and Confounding

High Dimensional Propensity Score Estimation via Covariate Balancing

On the Use of Linear Fixed Effects Regression Models for Causal Inference

Advanced Quantitative Methods: Causal inference

Weighting Methods. Harvard University STAT186/GOV2002 CAUSAL INFERENCE. Fall Kosuke Imai

Casual Mediation Analysis

AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M.

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

Inference for Average Treatment Effects

Causal Mechanisms Short Course Part II:

Introduction to Statistical Inference

Unpacking the Black Box of Causality: Learning about Causal Mechanisms from Experimental and Observational Studies

Simple Linear Regression

Research Design: Causal inference and counterfactuals

Variable selection and machine learning methods in causal inference

Lecture 3: Statistical Decision Theory (Part II)

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

What s New in Econometrics. Lecture 1

Propensity Score Matching and Genetic Matching : Monte Carlo Results

Identification and Inference in Causal Mediation Analysis

Noncompliance in Randomized Experiments

Matching for Causal Inference Without Balance Checking

Propensity Score Matching

University of California, Berkeley

Difference-in-Differences Methods

Selection on Observables: Propensity Score Matching.

Propensity Score Weighting with Multilevel Data

Econometric Analysis of Cross Section and Panel Data

Notes on causal effects

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

A Theory of Statistical Inference for Matching Methods in Applied Causal Research

Covariate Selection for Generalizing Experimental Results

Quantitative Economics for the Evaluation of the European Policy

Flexible Estimation of Treatment Effect Parameters

Interaction Effects and Causal Heterogeneity

Independent and conditionally independent counterfactual distributions

Nonconvex penalties: Signal-to-noise ratio and algorithms

Propensity Score Methods for Causal Inference

The Essential Role of Pair Matching in. Cluster-Randomized Experiments. with Application to the Mexican Universal Health Insurance Evaluation

Genetic Matching for Estimating Causal Effects:

Combining multiple observational data sources to estimate causal eects

Dynamics in Social Networks and Causality

Forecasting in the presence of recent structural breaks

Unpacking the Black-Box: Learning about Causal Mechanisms from Experimental and Observational Studies

Package IUPS. February 19, 2015

Experimental Designs for Identifying Causal Mechanisms

Bayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects

Survivor Bias and Effect Heterogeneity

Potential Outcomes Model (POM)

arxiv: v1 [stat.me] 15 May 2011

An Introduction to Causal Analysis on Observational Data using Propensity Scores

On the efficiency of two-stage adaptive designs

Additive Isotonic Regression

Bayesian causal forests: dealing with regularization induced confounding and shrinking towards homogeneous effects

Regression Discontinuity Designs

New Developments in Econometrics Lecture 11: Difference-in-Differences Estimation

Machine Learning Basics: Maximum Likelihood Estimation

Principles Underlying Evaluation Estimators

Generated Covariates in Nonparametric Estimation: A Short Review.

To Hold Out or Not. Frank Schorfheide and Ken Wolpin. April 4, University of Pennsylvania

Alternative Balance Metrics for Bias Reduction in. Matching Methods for Causal Inference

Logistic regression: Why we often can do what we think we can do. Maarten Buis 19 th UK Stata Users Group meeting, 10 Sept. 2015

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Statistical Analysis of Causal Mechanisms for Randomized Experiments

Matching Methods for Causal Inference with Time-Series Cross-Section Data

Balancing Covariates via Propensity Score Weighting

Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity

Bayesian Causal Forests

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

Double Robustness. Bang and Robins (2005) Kang and Schafer (2007)

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Job Training Partnership Act (JTPA)

Overview. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda

Extending causal inferences from a randomized trial to a target population

Econometrics of Policy Evaluation (Geneva summer school)

Why experimenters should not randomize, and what they should do instead

SREE WORKSHOP ON PRINCIPAL STRATIFICATION MARCH Avi Feller & Lindsay C. Page

Transcription:

Discussion: Can We Get More Out of Experiments? Kosuke Imai Princeton University September 4, 2010 Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 1 / 9

Keele, McConnaughy, and White Question: Can we gain efficiency by adjusting experimental data after the experiment is done? KMW s Answer: Yes, use matching rather than regression 1 Much weaker functional-form assumption 2 Can detect the lack of common support 3 Less data snooping Disadvantages (acknowledged by KMW): 1 May create imbalance in unobservables 2 No design-based variance calculation KMW s proposal: report both unadjusted and adjusted estimates Adjust or not Adjust?: contribution to the important but controversial debate in the literature Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 2 / 9

Covariate Adjustments in Experiments Pre-randomization adjustments are gold standard Blocking never hurts (Imai, King & Stuart, 2008) Matching can hurt, but in practice it seems to work very well When post-randomization adjustments are desirable? Covariates are unavailable before randomization AND low power Model-based variance calculation: this may be fine but not clear how to compare it with design-based variance Risk of data snooping is always there Which one do you trust if adjusted and unadjusted estimates are different? Some comments about details: 1 Asymptotics: T 0? maybe just refer to Freedman 2 Simulation: Need to account for randomization? 3 Randomization test: broken randomization? 4 Empirical results: unadjusted 0.00(0.822), with replacement 1.25(0.039), without replacement 0.25(0.803) Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 3 / 9

Another Motivation for Covariate Adjustments Quantities of interest go beyond ATE Heterogenous treatment effects 1 Useful for testing substantive theory 2 Useful for policy-makers Growing methodological literature: 1 Tree-based methods (Imai and Strauss) 2 Generalized additive models (Feller and Holmes) 3 Bayesian Additive Regression Trees (Green and Kern) Key challenge: avoid post-hoc subgroup analysis problem Regularization is required 1 Cross-validation 2 Bayesian prior 3 Penalty function Using treatment effect heterogeneity to generalize experimental results to a larger population Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 4 / 9

Hartman, Grieve, and Sekhon Disadvantage of randomized experiments: external validity Question: How do we extrapolate from SATT to PATT? HGS s solution: 1 Estimate heterogenous treatment effects via matching 2 Weight pairs to match the population distribution 3 Use placebo tests if possible Application to Pulmonary Artery Catheterization (PAC) Overall, a nice idea with an interesting application Some remaining issues: 1 Variable selection problem: How should one choose variables to include in matching/weighting? 2 Multiple testing problem with placebo tests 3 Variance calculation is no longer randomization-based Suggestion: Use HGS s method with pre-randomization matching Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 5 / 9

Some Comments about Details Clarifying the identifying assumption: Sample selection based on observables Possibilities of unobserved confounders Bias decomposition: Maybe helpful to decompose them into sample selection bias due to observables and unobservables Should be expressed using potential outcomes, not E(Y i W, T i = 1, I = 1) etc. Variance calculation: Abadie & Imbens standard errors for SATE/SATT What about PATT? Sometimes PATT has smaller standard error than SATT. Additional uncertainty due to sampling from population Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 6 / 9

Green and Kern Goal: Evaluate the performance of several competing estimators for generalizing SATE to PATE using Monte Carlo simulations Six methods 1 Difference-in-means 2 Linear regression with step-wise variable selection 3 Inverse probability weighting (IPW) 4 Genetic matching with maximum entropy weighting 5 Bayesian Additive Regression Trees (BART) Use of realistic simulation settings based on GSS Linear, nonlinear response surfaces, confounded and unconfounded Findings: 1 The difference-in-means is the worst 2 BART often does better than the others Important contribution given the growing interest in the topic (Stuart et al.; Hartman et al.) Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 7 / 9

What Does Explain the Findings? No surprise that the diff-in-means performs badly No surprise that linear regression does badly Why does IPW do worse than BART? IPW used here is parametric Stabilized weights could be used Why does MaxEnt do worse then BART? Common support assumption is satisfied No variable selection for MaxEnt? Need for theoretical understanding about the conditions under which each model does and does not work well Report bias and efficiency rather than MSE Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 8 / 9

Back to the Common Theme Original question: Can we get more out of experiments? Yes, but be careful and use appropriate statistical tools Efficiency gain by pre-treatment covariate adjustments Post-treatment covariate adjustments require a greater care Avoid post-hoc adjustment Variable and model selection issues Variance calculation Going beyond the SATE Heterogenous treatment effects and Extrapolation Avoid post-hoc subgroup analysis problem Variable and model selection Sample selection based on unobservables Experiments vs. observational studies and central role of statistics Internal vs. external validity Small vs. large data sets Kosuke Imai (Princeton) Discussion APSA 2010 (Washington D.C.) 9 / 9