Bootstrap-Based Improvements for Inference with Clustered Errors

Similar documents
Wild Bootstrap Inference for Wildly Dierent Cluster Sizes

Robust Inference with Multi-way Clustering

11. Bootstrap Methods

A Practitioner s Guide to Cluster-Robust Inference

A better way to bootstrap pairs

Heteroskedasticity-Robust Inference in Finite Samples

Empirical Methods in Applied Microeconomics

Bootstrapping Heteroskedasticity Consistent Covariance Matrix Estimator

The Exact Distribution of the t-ratio with Robust and Clustered Standard Errors

WILD CLUSTER BOOTSTRAP CONFIDENCE INTERVALS*

Robust Inference with Clustered Data

Casuality and Programme Evaluation

Inference in difference-in-differences approaches

The Exact Distribution of the t-ratio with Robust and Clustered Standard Errors

TECHNICAL WORKING PAPER SERIES ROBUST INFERENCE WITH MULTI-WAY CLUSTERING. A. Colin Cameron Jonah B. Gelbach Douglas L. Miller

Wild Cluster Bootstrap Confidence Intervals

Bootstrapping heteroskedastic regression models: wild bootstrap vs. pairs bootstrap

Econometrics II. Nonstandard Standard Error Issues: A Guide for the. Practitioner

A Course in Applied Econometrics Lecture 7: Cluster Sampling. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

Inference Based on the Wild Bootstrap

Lecture 4: Linear panel models

Small-Sample Methods for Cluster-Robust Inference in School-Based Experiments

Inference about Clustering and Parametric. Assumptions in Covariance Matrix Estimation

Bootstrap Testing in Econometrics

Small sample methods for cluster-robust variance estimation and hypothesis testing in fixed effects models

Robust Inference with Clustered Errors

The Wild Bootstrap, Tamed at Last. Russell Davidson. Emmanuel Flachaire

Clustering as a Design Problem

Robust Inference for Regression with Clustered Data

Robust Standard Errors in Small Samples: Some Practical Advice

INFERENCE WITH LARGE CLUSTERED DATASETS*

Small sample methods for cluster-robust variance estimation and hypothesis testing in fixed effects models

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

NBER WORKING PAPER SERIES ROBUST STANDARD ERRORS IN SMALL SAMPLES: SOME PRACTICAL ADVICE. Guido W. Imbens Michal Kolesar

12E016. Econometric Methods II 6 ECTS. Overview and Objectives

Cluster-Robust Inference

Finite Sample Performance of A Minimum Distance Estimator Under Weak Instruments

FNCE 926 Empirical Methods in CF

BOOTSTRAPPING DIFFERENCES-IN-DIFFERENCES ESTIMATES

Small-sample cluster-robust variance estimators for two-stage least squares models

More efficient tests robust to heteroskedasticity of unknown form

A Practitioner s Guide to Cluster-Robust Inference

8. Nonstandard standard error issues 8.1. The bias of robust standard errors

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

Day 1A Ordinary Least Squares and GLS

Modified Variance Ratio Test for Autocorrelation in the Presence of Heteroskedasticity

Environmental Econometrics

Inference for Clustered Data

A Course on Advanced Econometrics

Bootstrap Methods in Econometrics

Robust covariance estimator for small-sample adjustment in the generalized estimating equations: A simulation study

Diagnostics for the Bootstrap and Fast Double Bootstrap

Preliminaries The bootstrap Bias reduction Hypothesis tests Regression Confidence intervals Time series Final remark. Bootstrap inference

Size and Power of the RESET Test as Applied to Systems of Equations: A Bootstrap Approach

Notes on Generalized Method of Moments Estimation

On GMM Estimation and Inference with Bootstrap Bias-Correction in Linear Panel Data Models

Chapter 2. Dynamic panel data models

Day 3B Nonparametrics and Bootstrap

Econometrics - 30C00200

4.8 Instrumental Variables

Markov-Switching Models with Endogenous Explanatory Variables. Chang-Jin Kim 1

Recitation 5. Inference and Power Calculations. Yiqing Xu. March 7, 2014 MIT

What s New in Econometrics? Lecture 14 Quantile Methods

Difference-in-Differences Estimation

Supplement to Randomization Tests under an Approximate Symmetry Assumption

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

An Encompassing Test for Non-Nested Quantile Regression Models

Discussion Paper Series

Volume 30, Issue 1. The finite-sample properties of bootstrap tests in multiple structural change models

Lecture Notes on Measurement Error

Bootstrap Tests: How Many Bootstraps?

Review of Econometrics

QED. Queen s Economics Department Working Paper No Bootstrap and Asymptotic Inference with Multiway Clustering

Econ 273B Advanced Econometrics Spring

The Case Against JIVE

QED. Queen s Economics Department Working Paper No. 1355

Inference in Differences-in-Differences with Few Treated Groups and Heteroskedasticity

A Practitioner's Guide to Cluster-Robust Inference

econstor Make Your Publications Visible.

Bootstrapping Long Memory Tests: Some Monte Carlo Results

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

Econometrics of Panel Data

Birkbeck Working Papers in Economics & Finance

Ordinary Least Squares Regression

Day 2A Instrumental Variables, Two-stage Least Squares and Generalized Method of Moments

Wooldridge, Introductory Econometrics, 4th ed. Chapter 15: Instrumental variables and two stage least squares

Testing Linear Restrictions: cont.

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

LECTURE 13: TIME SERIES I

The Number of Bootstrap Replicates in Bootstrap Dickey-Fuller Unit Root Tests

DOCUMENTS DE TRAVAIL CEMOI / CEMOI WORKING PAPERS

AN EVALUATION OF PARAMETRIC AND NONPARAMETRIC VARIANCE ESTIMATORS IN COMPLETELY RANDOMIZED EXPERIMENTS. Stanley A. Lubanski. and. Peter M.

Some Monte Carlo Evidence for Adaptive Estimation of Unit-Time Varying Heteroscedastic Panel Data Models

QED. Queen s Economics Department Working Paper No The Size Distortion of Bootstrap Tests

A Simple, Graphical Procedure for Comparing Multiple Treatment Effects

Econometric Analysis of Cross Section and Panel Data

x i = 1 yi 2 = 55 with N = 30. Use the above sample information to answer all the following questions. Show explicitly all formulas and calculations.

Least Squares Estimation-Finite-Sample Properties

Warwick Economics Summer School Topics in Microeconometrics Instrumental Variables Estimation

Regression with a Single Regressor: Hypothesis Tests and Confidence Intervals

Transcription:

Bootstrap-Based Improvements for Inference with Clustered Errors Colin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, U. Maryland, U.C. - Davis May, 2008 May, 2008 1 / 41

1. Introduction OLS regression with individual-level data (i) with cluster or grouping (g) y ig = x 0 ig β + u ig, i = 1,..., N g, g = 1,..., G. E.g. y ig is (log) wage of ith individual who lives in state g and x ig includes regressor correlated within state g and u ig is error correlated within state g. May, 2008 2 / 41

Default OLS standard errors based on s 2 (X 0 X) 1 that ignore clustering can greatly understate true standard errors. See Moulton (1986, 1990). olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 3 / 41

Default OLS standard errors based on s 2 (X 0 X) 1 that ignore clustering can greatly understate true standard errors. See Moulton (1986, 1990). Standard solution is to get cluster-robust (CR) standard errors. This is the cluster option in Stata. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 3 / 41

Default OLS standard errors based on s 2 (X 0 X) 1 that ignore clustering can greatly understate true standard errors. See Moulton (1986, 1990). Standard solution is to get cluster-robust (CR) standard errors. This is the cluster option in Stata. If the number of clusters G is small then: 1. CR standard errors are downwards biased (too small) 2. t statistic is not standard normal distributed 3. t-tests using standard normal critical values over-reject. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 3 / 41

Default OLS standard errors based on s 2 (X 0 X) 1 that ignore clustering can greatly understate true standard errors. See Moulton (1986, 1990). Standard solution is to get cluster-robust (CR) standard errors. This is the cluster option in Stata. If the number of clusters G is small then: 1. CR standard errors are downwards biased (too small) 2. t statistic is not standard normal distributed 3. t-tests using standard normal critical values over-reject. This paper is concerned with better inference when there are few clusters. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 3 / 41

Wrong standard errors and critical values: 5% May, 2008 4 / 41

Outline of Talk 1 Introduction 2 Cluster-Robust Inference 3 Cluster Bootstrap (without and with re nement) 4 Monte Carlo Simulations 5 Bertrand, Du o and Mullainathan (2004) Simulations 6 Gruber and Poterba (1994) Application 7 Conclusion May, 2008 5 / 41

2.1 OLS with Cluster Errors Model for G clusters with N g individuals per cluster: OLS estimator y ig = x 0 ig β + u ig, i = 1,..., N g, g = 1,..., G, y g = X g β + u g, g = 1,..., G, y = Xβ + u, bβ = ( G g =1 N g i=1 x ig x 0 ig ) 1 ( G g =1 N g i=1 x ig y ig ) = ( G g =1 X0 g X g ) 1 ( G g =1 X g y g ) = (X 0 X) 1 X 0 y. May, 2008 6 / 41

OLS with Clustered Errors As usual bβ = β + (X 0 X) 1 X 0 u = β + (X 0 X) 1 ( G g =1 X g u g ). olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 7 / 41

OLS with Clustered Errors As usual bβ = β + (X 0 X) 1 X 0 u = β + (X 0 X) 1 ( G g =1 X g u g ). Assume independence over g with u g [0, Σ g = E[u g ug 0 ]]. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 7 / 41

OLS with Clustered Errors As usual bβ = β + (X 0 X) 1 X 0 u = β + (X 0 X) 1 ( G g =1 X g u g ). Assume independence over g with u g [0, Σ g = E[u g ug 0 ]]. Then β b a N [β, V[ β]] b with V[ b β] = (X 0 X) 1 ( G g =1 X g Σ g X 0 g )(X 0 X) 1. May, 2008 7 / 41

OLS with Clustered Errors If ignore clustering the default OLS variance estimate should be in ated by approximately where τ j ' 1 + ρ xj ρ u ( N g 1), ρ xj is the within cluster correlation of x j ρ u is the within cluster error correlation N g is the average cluster size. Moulton (1986, 1990) showed that could be large even if ρ u small. e.g. N G = 81, ρ x = 1 and ρ u = 0.1 then ρ = 9. Should correct for clustering but Σ g is unknown. May, 2008 8 / 41

2.2 Moulton-type Standard Errors Assume a random e ects (RE) model u ig = α g + ε ig α g iid[0, σ 2 α] ε ig iid[0, σ 2 ε ]. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 9 / 41

2.2 Moulton-type Standard Errors Assume a random e ects (RE) model u ig = α g + ε ig α g iid[0, σ 2 α] ε ig iid[0, σ 2 ε ]. Then Σ g = σ 2 ui Ng + σ 2 αe Ng e 0 N g and we use bv RE [ b β] = (X 0 X) 1 ( G g =1 X g b Σ g X 0 g )(X 0 X) 1, where bσ g = bσ 2 ui Ng + bσ 2 αe Ng e 0 N g, and bσ 2 ε and bσ 2 α are consistent. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 9 / 41

2.2 Moulton-type Standard Errors Assume a random e ects (RE) model u ig = α g + ε ig α g iid[0, σ 2 α] ε ig iid[0, σ 2 ε ]. Then Σ g = σ 2 ui Ng + σ 2 αe Ng e 0 N g and we use bv RE [ b β] = (X 0 X) 1 ( G g =1 X g b Σ g X 0 g )(X 0 X) 1, where bσ g = bσ 2 ui Ng + bσ 2 αe Ng e 0 N g, and bσ 2 ε and bσ 2 α are consistent. Weakness is strong distributional assumptions. May, 2008 9 / 41

2.3 Cluster-Robust Variance Estimates The cluster-robust variance estimate (CRVE) bv CR [ b β] = (X 0 X) 1 ( G g =1 X g eu g eu 0 g X 0 g )(X 0 X) 1, provides a consistent estimator of V CR [ b β] if plim 1 G G g =1 X g eu g eu g 0 X 0 g = plim 1 G G g =1 X g g X 0 g. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 10 / 41

2.3 Cluster-Robust Variance Estimates The cluster-robust variance estimate (CRVE) bv CR [ b β] = (X 0 X) 1 ( G g =1 X g eu g eu 0 g X 0 g )(X 0 X) 1, provides a consistent estimator of V CR [ b β] if plim 1 G G g =1 X g eu g eu g 0 X 0 g = plim 1 G G g =1 X g g X 0 g. bv CR [ b β] yields cluster-robust standard errors. May, 2008 10 / 41

2.3 Cluster-Robust Variance Estimates The cluster-robust variance estimate (CRVE) bv CR [ b β] = (X 0 X) 1 ( G g =1 X g eu g eu 0 g X 0 g )(X 0 X) 1, provides a consistent estimator of V CR [ b β] if plim 1 G G g =1 X g eu g eu g 0 X 0 g = plim 1 G G g =1 X g g X 0 g. bv CR [ b β] yields cluster-robust standard errors. If OLS residuals are used then eu g = bu g = y g X g b β. Then bv CR [ b β] is downwards-biased for V[ b β]. May, 2008 10 / 41

2.4 Alternative Cluster-Robust Variance Estimates Stata cluster options uses eu g = p cbu g where c = G G 1 N 1 N k ' G G 1. Bell and McCa rey (2002) instead use r G 1 eu g = G [I N g H gg ] 1 bu g. Then bv CR [ b β] is the jackknife estimate of the variance of the OLS estimator. This leads to downwards-biased CRVE if in fact Σ g = σ 2 I. Angrist and Lavy (2002) use this. We call this CR3 as generalizes HC3 of MacKinnon and White (1985). May, 2008 11 / 41

2.5 Cluster-Robust Wald Tests Two-sided Wald tests of H 0 : β 1 = β 0 1 against H a : β 1 6= β 0 1 where β 1 is a scalar component of β. Use the Wald test statistic w = b β 1 β 0 1 s b β 1, where s b β is standard error from bv CR [ β]. b 1 Asymptotically N [0, 1] and reject at α = 0.05 if jwj > 1.960. In nite samples w has much fatter tails than standard normal. Stata uses T (G 1) for critical values. And even this may over-reject. Donald and Lang (2004) propose alternative two-step estimator and T (G 2). May, 2008 12 / 41

3. Cluster Bootstraps Bootstrap due to Efron (1979) Alternative asymptotic approximation for distribution of a statistic. View the sample as the population and obtaining B resamples leading to B realizations of the statistic. There are many, many ways to bootstrap. A bootstrap without asymptotic re nement e.g. bootstrap estimate of standard error. No better than regular asymptotic theory. Popular as it may be simpler to implement. A bootstrap with asymptotic re nement e.g. bootstrap-t method. Asymptotically better than regular asymptotic theory. Hopefully better in nite samples (use Monte Carlos to con rm). Applied microeconometricians rarely use asymptotic re nement. May, 2008 13 / 41

3.1 Pairs Cluster Bootstrap-T Procedure 1 Do B iterations of this step. On the b th iteration: 1 Form a sample of G clusters f(y1, X 1 ),..., (y G, X G )g by resampling with replacement G times from the original sample. 2 Calculate the Wald test statistic w b = b β 1,b s b β 1,b bβ 1, where - bβ 1,b is OLS estimate using the bth pseudo-sample - s b β is its standard error estimated using CR method 1,b - bβ 1 is the original sample OLS estimate. 2 The empirical distribution of w1,..., w B estimates distribution of w. Reject H 0 at level α if and only if w < w [α/2] or w > w [1 α/2], where w[q] denotes the qth quantile of w1,..., w B. [e.g. w = 2.10, w[.025] = 2.32, w [.975] = 1.87 do not reject.] May, 2008 14 / 41

3.2 Asymptotic Re nement Consider two-sided nonsymmetric test of nominal size 0.5. Actual size = 0.05 + O(G 0.5 ) by conventional asymptotics. Actual size = 0.05 + O(G 1 ) using preceding bootstrap -t. So bootstrap-t o ers an asymptotic re nement. Key is to bootstrap an asymptotically pivotal statistic, one whose distribution does not depend on unknown parameters. w is asymptotically pivotal but bβ is not. May, 2008 15 / 41

3.3 Wild Cluster Bootstrap-T Procedure 1 Obtain the OLS estimator b β and OLS residuals bu g, g = 1,..., G. [Best to use residuals that impose H 0 ]. 2 Do B iterations of this step. On the b th iteration: 1 For each cluster g = 1,..., G, form bu g = bu g or bu g = bu g each with probability 0.5 and hence form by g = X 0 g b β + bu g. This yields wild cluster bootstrap resample f(by 1, X 1),..., (by G, X G )g. 2 Calculate the OLS estimate bβ 1,b and its standard error s bβ and given 1,b these form the Wald test statistic wb = ( bβ 1,b bβ 1 )/s b β. 1,b 3 Reject H 0 at level α if and only if w < w [α/2] or w > w [1 α/2], where w [q] denotes the qth quantile of w 1,..., w B. May, 2008 16 / 41

Wild bootstrap proposed by Wu (1986) for regression in the nonclustered case (heteroskedastic errors). Theory by Liu (1988) and Mammen (1993). Horowitz (1997, 2001) provides a Monte Carlo demonstrating good size properties. We use Rademacher weights that provide re nement if b β is symmetrically distributed Mammen weights can be used if b β is asymmetrically distributed. Davidson and Flachaire (2001) provide theory and simulation to support using Rademacher weights even in the asymmetric case. Here we have extended the wild bootstrap to a clustered setting. A brief application by Brownstone and Valletta (2001) also does this. Davidson and MacKinnon (1999) advocate imposing the null hypothesis. May, 2008 17 / 41

3.4 Bootstraps with Few Clusters - For cluster pairs bootstrap-t possible unordered recombinations of the data 2G 1 G 1-126 combinations for G = 5 and 92, 378 combinations for G = 10 - implementation problems can still arise e.g. with binary regressor. For wild cluster bootstrap-t - 2 G possible recombinations - 32 combinations for G = 5 and 1, 024 combinations for G = 10 - but implementation problems less likely to arise e.g. with binary regressor. May, 2008 18 / 41

3.6 Test Methods Used in This Paper Method Bootstrap? Re nement? H 0 impos 1. Assume iid (default se s) No 2. Moulton-type correction se No 3. Cluster-robust se No 4. CR3 cluster-robust se No 5. Pairs Cluster Bootstrap-se Yes No 6. Residual Cluster Bootstrap se Yes No 7. Wild Cluster Bootstrap-se Yes No 8. Pairs Cluster BCA Yes Yes 9. BDM Bootstrap-t Yes No No 10. Pairs Cluster Bootstrap-t Yes Yes No 11. Pairs Cluster CR3 Bootstrap-t Yes Yes No 12. Residual Cluster Bootstrap-t Yes No Yes 13. Wild Cluster Bootstrap-t Yes Yes Yes May, 2008 19 / 41

4. Monte Carlo Simulations Dgp has β 0 = 1 and β 1 = 1 in y ig = β 0 + β 1 x ig + u ig, i = 1,..., 30, g = 1,..., G. We consider G = 5, 10, 15, 20, 25 and 30. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 20 / 41

4. Monte Carlo Simulations Dgp has β 0 = 1 and β 1 = 1 in y ig = β 0 + β 1 x ig + u ig, i = 1,..., 30, g = 1,..., G. We consider G = 5, 10, 15, 20, 25 and 30. Test H 0 : β 1 = 1 vs H a : β 1 6= 1 at 5%: Wald test statistic: t = b β 1. s b β olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 20 / 41

4. Monte Carlo Simulations Dgp has β 0 = 1 and β 1 = 1 in y ig = β 0 + β 1 x ig + u ig, i = 1,..., 30, g = 1,..., G. We consider G = 5, 10, 15, 20, 25 and 30. Test H 0 : β 1 = 1 vs H a : β 1 6= 1 at 5%: Wald test statistic: t = b β 1. s b β Actual rejection rate computed using S = 1, 000 Monte Carlo simulations. Monte Carlo se = p ba(1 ba)/s = 0.009 if ba = 0.10. May, 2008 20 / 41

4.1 Monte Carlo: Homoskedastic Clustered Errors Table 1: Errors correlated within group but homoskedastic. Dgp is y ig = α + βx ig + u ig = α + β (z g + z ig ) + (ε g + ε ig ) z g N [0, 1], z ig N [0, 1] ) ρ x = 0.5 ε g N [0, 1], ε ig N [0, 1] ) ρ u = 0.5. This is classic random e ects model for both the error and the regressor. Expect Moulton correction and residual bootstrap to do well. May, 2008 21 / 41

Monte Carlo: Homoskedastic Clustered Errors May, 2008 22 / 41

Monte Carlo Simulations 1 iid errors: p 1 + (N G 1)ρ x ρ u = p 8.25 = 2.87 and Pr[jzj > 1.96/2.87] = 0.50 as G!. 2 Moulton: rejection rate that if t T (G 2). 3 CR: Improvement, but not as good as Moulton. 4 CR3: Better still. 5 Pairs cluster bootstrap-se: similar to CR. 6 Residual cluster bootstrap-se: close to 0.5 (why?) 7 Wild cluster bootstrap-se: close to 0.5 (why?) May, 2008 23 / 41

Monte Carlo: Homoskedastic Clustered Errors May, 2008 24 / 41

Monte Carlo Simulations 8. Pairs-cluster bootstrap BCA - similar to CR in row 3. So no re nement. 9. BDM bootstrap-t uses iid s b β not cluster-robust s bβ. So no re nement. But still big improvement on 1. 10. Pairs cluster bootstrap-t. Good. Some over-rejection. 11. Pairs CR3 bootstrap-t. Similar to 10. 12. Residual cluster bootstrap-t. Works well. 13. Wild cluster bootstrap-t. Works well. May, 2008 25 / 41

Monte Carlo: Heteroskedastic Clustered Errors Table 3: Errors correlated within group and heteroskedastic. Dgp is y ig = α + βx ig + u ig = α + β (z g + z ig ) + (ε g + ε ig ) z g N [0, 1], z ig N [0, 1] ) ρ x = 0.5 ε g N [0, 1], ε ig N [0, 9 (z g + z ig ) 2 ] ) ρ u < 0.5. No longer classic random e ects model for both the error and the regressor. Find that Moulton-type correction breaks down. The bootstrap-t methods work well (surprising for residual bootstrap). May, 2008 26 / 41

Monte Carlo: Heteroskedastic Clustered Errors May, 2008 27 / 41

Monte Carlo: Heteroskedastic Clustered Errors May, 2008 28 / 41

Monte Carlo: Variations for G=10 If reject when jtj > t 8;.025 = 2.306 rather than 1.96 then methods 2-7 do better. As increase cluster size then as expected default OLS does much worse while methods 2-13 change little. If have four regressors rather than one (each 0.5 (z g + z ig )) then expect similar results though more noise. May, 2008 29 / 41

Monte Carlo: Variations for G=10 May, 2008 30 / 41

Monte Carlo: Variations for G=10 May, 2008 31 / 41

Bootstrap for BDM Study Use BDM data on 21 years and up to G = 50 states. y ig = α g + γ i + β 1 I ig + u ig : i is year; g is state. y ig is a year-state measure of excess earnings (after control for age and education). olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 32 / 41

Bootstrap for BDM Study Use BDM data on 21 years and up to G = 50 states. y ig = α g + γ i + β 1 I ig + u ig : i is year; g is state. y ig is a year-state measure of excess earnings (after control for age and education). I ig is a binary policy change indicator. Randomly occurs in half the states. When occurs between sixth and fteenth year. Once only change from 0 to 1. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 32 / 41

Bootstrap for BDM Study Use BDM data on 21 years and up to G = 50 states. y ig = α g + γ i + β 1 I ig + u ig : i is year; g is state. y ig is a year-state measure of excess earnings (after control for age and education). I ig is a binary policy change indicator. Randomly occurs in half the states. When occurs between sixth and fteenth year. Once only change from 0 to 1. Actual rejection rate: fraction of times jwj = j(bβ 1 0)/s b β 1 j > 1.96. olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 32 / 41

Bootstrap for BDM Study Use BDM data on 21 years and up to G = 50 states. y ig = α g + γ i + β 1 I ig + u ig : i is year; g is state. y ig is a year-state measure of excess earnings (after control for age and education). I ig is a binary policy change indicator. Randomly occurs in half the states. When occurs between sixth and fteenth year. Once only change from 0 to 1. Actual rejection rate: fraction of times jwj = j(bβ 1 0)/s b β 1 j > 1.96. Power against β 1 = 0.02: fraction of 400 times jwj > 1.96. [Do OLS of y ig + (0.02I ig ) on I tg.] olin Cameron, Jonah Gelbach, Doug Miller U.C. - Davis, Cluster U. Maryland, Bootstrap U.C. - Davis () May, 2008 32 / 41

Bootstrap for BDM Study Get BDM s result that assuming iid errors leads to massive over-rejection. Get BDM s result that cluster robust with G = 6 has over-rejection. Moulton-type correction does surprisingly poorly. Problems with pairs cluster bootstrap-t. For 3 out of 6 states there was no policy change and I g = 0: Pr[all 6 I g = 0] = (1/2) 6 = 1/64. Find that wild cluster bootstrap-t does well. May, 2008 33 / 41

Bootstrap for BDM Study May, 2008 34 / 41

Bootstrap for BDM Study May, 2008 35 / 41

Bootstrap for BDM Study: With individual data Similar to with aggregate data. May, 2008 36 / 41

6. Gruber and Poterba Model for whether have private insurance is y ijt = α 1 + α 2 SELF ijt + α 3 POST ijt + β 1 SELF ijt POST ijt + u jt, where i denotes individual, j denotes employer type, t denotes year SELF ijt = 1 if individual i is self-employed at time t POST ijt = 1 if the year is 1987, 1988 or 1999. Di erence-in-di erence analysis controlling for clustered errors. Treat years as clusters G = 8 (following Donald and Lang (2004). Aggregated employment-year data. Then CR se is 0.0074 compared to 0.0044 without clustering. Individual-level data results similar to aggregate. May, 2008 37 / 41

7. Conclusion At the least should have small-sample correction of standard errors and use a T distribution with G or fewer degrees of freedom. Cluster pairs bootstrap-t is better than Cluster pairs BCA. But can run into problems implementing if binary cluster-invariant regressor and few clusters. Cluster Wild bootstrap-t works well, though is designed just for OLS. May, 2008 38 / 41

Some References Angrist and Lavy (2002), The E ect of High School Matriculation Awards: Evidence from Randomized Trials, NBER Working Paper Number 9389. Arellano, M. (1987), Computing Robust Standard Errors for Within-Group Estimators, Oxford Bulletin of Economics and Statistics, 49, 431-434. Bell, R.M. and D.F. McCa rey (2002), Bias Reduction in Standard Errors for Linear Regression with Multi-Stage Samples, Survey Methodology, 169-179. Bertrand, M., E. Du o, and S. Mullainathan (2004), How Much Should we Trust Di erences-in-di erences Estimates, Quarterly Journal of Economics, 249-275. Davison, A.C. and D. Hinkley (1997), Bootstrap methods and their Applications, Cambridge. Donald, S. and K. Lang (2004), Inferences with Di erences in Di erences and Other Panel Data, October 22, 2004. Efron, B. and R. J. Tibsharani (1993), An Introduction to the Bootstrap, Chapman and Hall. May, 2008 39 / 41

Some References Hall, P. (1992), The Bootstrap and Edgeworth Expansion, New York: Springer-Verlag. Horowitz, J. L. (1997), Bootstrap Methods in Econometrics: Theory and Numerical Performance, in Kreps and Wallis eds., Advances in Econometrics, Vol. 7 Kauermann, G. and R.J. Carroll (2001), A Note on the E ciency of Sandwich Covariance Matrix Estimation, Journal of the American Statistical Association, 96, 1387-1396. Liang, K.-Y., and S.L. Zeger (1986), Longitudinal Data Analysis Using Generalized Linear Models, Biometrika, 73, 13-22. MacKinnon, J.G. and H. White (1985), Some Heteroskedasticity-Consistent Covariance Matrix Estimators with Improved Finite Sample Properties, Journal of Econometrics, Vol. 29, 305-325. Mancl, L.A. and T.A. DeRouen, A Covariance estimator for GEE with Improved Finite- Sample Properties, Biometrics, 57, 126-134. May, 2008 40 / 41

Some References Moulton, B.R. (1986), Random Group E ects and the Precision of Regression Estimates, Journal of Econometrics, 32, 385-97. Moulton, B.R. (1990), An Illustration of a Pitfall in Estimating the E ects of Aggregate Variables on Micro Units, Review of Economics and Statistics, 72, 334-38. Sherman, M. and S. le Cressie (1997), A Comparison Between Bootstrap Methods and Generalized Estimating Equations for Correlated Outcomes in Generalized Linear Models, Communications in Statistics, White, H. (1984), Asymptotic Theory for Econometricians, San Diego, Academic Press. Wooldridge, J.M. (2003), Cluster-Sample Methods in Applied Econometrics, American Economic Review, 93, 133-138. May, 2008 41 / 41