Small-sample cluster-robust variance estimators for two-stage least squares models

Similar documents
Small-Sample Methods for Cluster-Robust Inference in School-Based Experiments

Small-sample adjustments for multiple-contrast hypothesis tests of meta-regressions using robust variance estimation

Small sample methods for cluster-robust variance estimation and hypothesis testing in fixed effects models

Small sample methods for cluster-robust variance estimation and hypothesis testing in fixed effects models

A Practitioner s Guide to Cluster-Robust Inference

The Exact Distribution of the t-ratio with Robust and Clustered Standard Errors

Online Appendix to Yes, But What s the Mechanism? (Don t Expect an Easy Answer) John G. Bullock, Donald P. Green, and Shang E. Ha

WISE International Masters

Bootstrap-Based Improvements for Inference with Clustered Errors

Clustering as a Design Problem

The Exact Distribution of the t-ratio with Robust and Clustered Standard Errors

Analysis of propensity score approaches in difference-in-differences designs

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

AGEC 661 Note Fourteen

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

Wild Cluster Bootstrap Confidence Intervals

WILD CLUSTER BOOTSTRAP CONFIDENCE INTERVALS*

Instrumental Variables

Statistical Analysis of Randomized Experiments with Nonignorable Missing Binary Outcomes

NBER WORKING PAPER SERIES ROBUST STANDARD ERRORS IN SMALL SAMPLES: SOME PRACTICAL ADVICE. Guido W. Imbens Michal Kolesar

On Variance Estimation for 2SLS When Instruments Identify Different LATEs

Econometrics Summary Algebraic and Statistical Preliminaries

150C Causal Inference

Econometric Analysis of Cross Section and Panel Data

Lecture: Simultaneous Equation Model (Wooldridge s Book Chapter 16)

Robust Standard Errors in Small Samples: Some Practical Advice

Lecture 8. Roy Model, IV with essential heterogeneity, MTE

Selection on Observables: Propensity Score Matching.

ESTIMATION OF TREATMENT EFFECTS VIA MATCHING

AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M.

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Supplement to Randomization Tests under an Approximate Symmetry Assumption

Instrumental Variables

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

Exact Nonparametric Inference for a Binary. Endogenous Regressor

A Robust Test for Weak Instruments in Stata

Sensitivity checks for the local average treatment effect

Econometrics. Week 8. Fall Institute of Economic Studies Faculty of Social Sciences Charles University in Prague

Using Instrumental Variables to Find Causal Effects in Public Health

arxiv: v3 [stat.me] 20 Feb 2016

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Groupe de lecture. Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings. Abadie, Angrist, Imbens

Quantitative Economics for the Evaluation of the European Policy

Econometrics II. Nonstandard Standard Error Issues: A Guide for the. Practitioner

The Essential Role of Pair Matching in. Cluster-Randomized Experiments. with Application to the Mexican Universal Health Insurance Evaluation

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Panel Data?

Measuring Social Influence Without Bias

A Practical Comparison of the Bivariate Probit and Linear IV Estimators

Statistical Power and Autocorrelation for Short, Comparative Interrupted Time Series Designs with Aggregate Data

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

Noncompliance in Randomized Experiments

The Case Against JIVE

ECO375 Tutorial 8 Instrumental Variables

Exogeneity tests and weak-identification

Wild Bootstrap Inference for Wildly Dierent Cluster Sizes

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Essential of Simple regression

A Course on Advanced Econometrics

1 Motivation for Instrumental Variable (IV) Regression

Casuality and Programme Evaluation

Instrumental Variables and Two-Stage Least Squares

Economics 308: Econometrics Professor Moody

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models

Causal Inference Lecture Notes: Causal Inference with Repeated Measures in Observational Studies

EMERGING MARKETS - Lecture 2: Methodology refresher

The problem of causality in microeconometrics.

Heteroskedasticity. Part VII. Heteroskedasticity

What s New in Econometrics. Lecture 13

IsoLATEing: Identifying Heterogeneous Effects of Multiple Treatments

IDENTIFICATION OF TREATMENT EFFECTS WITH SELECTIVE PARTICIPATION IN A RANDOMIZED TRIAL

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

IV Quantile Regression for Group-level Treatments, with an Application to the Distributional Effects of Trade

Increasing the Power of Specification Tests. November 18, 2018

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

A Course in Applied Econometrics. Lecture 5. Instrumental Variables with Treatment Effect. Heterogeneity: Local Average Treatment Effects.

Michael Lechner Causal Analysis RDD 2014 page 1. Lecture 7. The Regression Discontinuity Design. RDD fuzzy and sharp

Introductory Econometrics

Economics 241B Estimation with Instruments

Heteroskedasticity ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

PSC 504: Instrumental Variables

Supplementary material to: Tolerating deance? Local average treatment eects without monotonicity.

12E016. Econometric Methods II 6 ECTS. Overview and Objectives

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity


Inference in Differences-in-Differences with Few Treated Groups and Heteroskedasticity

Estimation of the Conditional Variance in Paired Experiments

A Simple, Graphical Procedure for Comparing. Multiple Treatment Effects

Recitation Notes 5. Konrad Menzel. October 13, 2006

Econometrics Master in Business and Quantitative Methods

Missing dependent variables in panel data models

Propensity Score Weighting with Multilevel Data

The Simple Linear Regression Model

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems

ECON Introductory Econometrics. Lecture 17: Experiments

Rewrap ECON November 18, () Rewrap ECON 4135 November 18, / 35

Friday, March 15, 13. Mul$ple Regression

Inference in difference-in-differences approaches

DOCUMENTS DE TRAVAIL CEMOI / CEMOI WORKING PAPERS. A SAS macro to estimate Average Treatment Effects with Matching Estimators

Transcription:

Small-sample cluster-robust variance estimators for two-stage least squares models ames E. Pustejovsky The University of Texas at Austin Context In randomized field trials of educational interventions, participants do not always comply with their assigned treatment condition. Under these circumstances, researchers might adopt an intentto-treat (ITT) perspective and focus on the average effect of being assigned to intervention. Alternately, researchers might focus on the complier average causal effect (CACE), which is the average effect of receiving the intervention for the subgroup of participants for whom treatment assignment induces treatment receipt (Angrist, Imbens, & Rubin, 1996). There is an inherent interpretive trade-off between these two perspectives: whereas ITT analysis has strong internal validity, CACE analysis may have clearer construct validity because it relates directly to treatment receipt. Two-stage least squares (2SLS) estimation is a standard approach for CACE analysis indeed, it is the only estimation method accepted in the most recent What Works Clearinghouse Standards (What Works Clearinghouse, 2018). Inference in ITT or CACE analysis is often based on cluster-robust variance estimation (CRVE) (Cameron & Miller, 2015). CRVE is used for analyzing cluster-randomized trials in order to account for within-cluster dependence. CRVE may also be used in analysis of multi-site trials (clustering by site) when the goal is to generalize to a larger population of sites (Abadie, Athey, Imbens, & Wooldridge, 2017). In both settings, CRVE provides a way to avoid imposing strong distributional assumptions about the error structure in ITT or CACE estimating equations. Conventional CRVE methods are require a large number of clusters to obtain properly calibrated hypothesis tests and confidence intervals. Further, rejection/coverage rates from CRVE depend on features of the study design beyond just the total number of clusters (Bell & McCaffrey, 2002; Pustejovsky & Tipton, 2016). Bell and McCaffrey introduced modifications of CRVE, known as bias-reduced linearization estimators, which have improved rejection/coverage rates even with a small number of clusters. Their initial work was limited to ordinary and weighted least squares regression (Bell & McCaffrey, 2002; McCaffrey, Bell, & Botts, 2001), but the methods have subsequently been extended to generalized estimating equations (McCaffrey & Bell, 2006) and fixed effects regression (Pustejovsky & Tipton, 2016). Aims Small-sample CRVE methods can readily be applied to ITT analysis in multi-site and clusterrandomized trials. However, similar methods are not available for CACE analysis. Researchers

aiming to estimate CACE thus face another trade-off: should they use conventional CRVE, despite the possibility of obtaining mis-calibrated tests and confidence intervals, or turn to methods that impose stronger distributional assumptions? To help resolve this conundrum, we propose bias-reduced linearization estimators for 2SLS models and evaluate the type-i error rates of the methods in Monte Carlo simulations. Methods Let us first consider a linear regression model for data from a sample of clusters, where cluster j includes n j individual observations: y j = X j β + e j (1) For simplicity, we focus on the ordinary least squares estimator, β = M X j y j, where M X = ( X j X j ) 1. Let e j = y j X j β. The bias-reduced linearization estimator of Var(β ) is given by V OLS = M X ( X j A j e je j A j X j ) M X. (2) The A j are symmetric matrices defined such that E(V OLS ) = Var(β ) under an analyst-specified working model for Var(e j ), j = 1,,. For instance, the analyst might use a working independence model with Var(e j ) = σ 2 I nj ; Imbens and Kolesar (2015) propose some alternative working models. Inference on the regression coefficient β k is based on the statistic OLS β k / V kk with a t(η) reference distribution, where η = 2E 2 (V OLS kk )/V(V OLS kk ) is a Satterthwaitetype degrees of freedom approximation that is derived under the working model for Var(ε j ). Let us now consider the model y j = Z j δ + u j Z j = X j γ + ν j (3) where Z j includes the endogenous regressor (i.e., the compliance indicator) and additional covariates and X j includes the instrument (i.e., treatment assignment indicator) and additional covariates. The 2SLS estimator of δ is δ = M Z ( Z j y j ), (4)

where M Z = ( Z j Z j ) 1 and Z j = X j ( X j X j ) 1 ( X j Z j ). Let u j = y j Z j δ. We propose a bias-reduced linearization estimator of Var(δ ) as V 2SLS = M Z ( Z j A j u ju j A j Z j ) M Z, (5) where the adjustment matrices (and Satterthwaite degrees of freedom for hypothesis tests) are calculated as for an ordinary linear regression model of the second stage, y j = Z jδ + u j, under a working model for Var(u j), j = 1,,. For models with a single instrument, the proposed adjustment yields an exactly unbiased estimator of the delta-method approximation to the Wald representation of the 2SLS estimator. Simulation Design and Results We designed simulations to evaluate the performance of the proposed method compared to the conventional CRVE for 2SLS. The conventional CRVE is calculated as in (5) but omits the adjustment matrices and uses a standard normal reference distribution. Initial simulations emulated a cluster-randomized trial with cluster-level non-compliance that was correlated with cluster-level mean outcomes to varying degrees, such that naïve OLS estimation yields biased estimates of the CACE. We also varied the total number of clusters in the trial, the fraction of clusters allocated to treatment, and the overall compliance rate. The intra-class correlation of the outcome was set to 0.2 throughout, and cluster sizes were simulated as n j 1 + Pois(10). We estimate the CACE using unweighted 2SLS and compare the Type I error rates of hypothesis tests based on the conventional CRVE and the proposed bias-reduced linearization estimator. Figures 1 and 2 depict the Type I error rates of tests for the CACE estimated by 2SLS using either the conventional or modified test, at α-levels of.05 and.01, respectively. Each line represents a different degree of compliance-outcome correlation. At both α-levels, the conventional test has excessive Type-I error when the total number of clusters is small or treatment allocation is imbalanced particularly for higher compliance rates. In contrast, the modified method yields error rates at or below the nominal level, with as few as 6 treated and 14 control clusters. The tests become more conservative when compliance levels are lower. Conclusions These simulation results provide initial evidence that the proposed small-sample adjustments improve Type I error rates of hypothesis tests for the CACE in the context of cluster-randomized trials with cluster-level non-compliance. In ongoing work, we are examining the performance of these methods in individually-randomized multi-site trials, with individual-level non-compliance that varies across sites. In this context, we will examine performance of the conventional and

modified tests for 2SLS with a single instrument (treatment assignment) and instrument-by-site interactions. We expect that all methods will perform poorly in the latter case due to the inconsistency of 2SLS as the number of instruments increases (Bekker, 1994). References Abadie, A., Athey, S., Imbens, G. W., & Wooldridge,. (2017). When Should You Adjust Standard Errors for Clustering? https://doi.org/10.3386/w24003 Angrist,. D., Imbens, G. W., & Rubin, D. B. (1996). Identification of causal effects using instrumental variables. ournal of the American Statistical Association, 91(434), 444 455. Retrieved from http://www.jstor.org/stable/2291629 Bekker, P. A. (1994). Alternative approximations to the distributions of instrumental variable estimators. Econometrica, 62(3), 657 681. https://doi.org/10.2307/2951662 Bell, R. M., & McCaffrey, D. F. (2002). Bias reduction in standard errors for linear regression with multistage samples. Survey Methodology, 28(2), 169 181. Retrieved from http://www.statcan.gc.ca/pub/12-001-x/2002002/article/9058-eng.pdf Cameron, R. C., & Miller, D. L. (2015). A Practitioner s Guide to Cluster-Robust Inference. ournal of Human Resources, 50(2), 317 372. https://doi.org/10.3368/jhr.50.2.317 Imbens, G. W., & Kolesar, M. (2015). Robust standard errors in small samples: some practical advice. McCaffrey, D. F., & Bell, R. M. (2006). Improved hypothesis testing for coefficients in generalized estimating equations with small samples of clusters. Statistics in Medicine, 25(23), 4081 98. https://doi.org/10.1002/sim.2502 McCaffrey, D. F., Bell, R. M., & Botts, C. H. (2001). Generalizations of biased reduced linearization. In Proceedings of the Annual Meeting of the American Statistical Association. Pustejovsky,. E., & Tipton, E. (2016). Small-sample methods for cluster-robust variance estimation and hypothesis testing in fixed effects models. ournal of Business & Economic Statistics, 1 12. https://doi.org/10.1080/07350015.2016.1247004 What Works Clearinghouse. (2018). What Works Clearinghouse Standards Handbook, Version 4. Washington, DC.

Allocation fraction: 0.3 Allocation fraction: 0.5 Figure 1 Type I error rates of tests based on conventional and modified cluster-robust variance estimators at α =. 05 clusters: 20 clusters: 30 clusters: 40 clusters: 50 clusters: 60 0.09 0.06 Rejection rate at alpha =.05 0.09 0.06 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 Compliance rate CRVE estimator Conventional Modified

Allocation fraction: 0.3 Allocation fraction: 0.5 Figure 2 Type I error rates of tests based on conventional and modified cluster-robust variance estimators at α =. 01 clusters: 20 clusters: 30 clusters: 40 clusters: 50 clusters: 60 0.04 Rejection rate at alpha =.01 0.02 0.01 0.04 0.02 0.01 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 0.5 0.6 0.7 0.8 0.9 1.0 Compliance rate CRVE estimator Conventional Modified