Rerandomization to Balance Covariates

Rerandomization to Balance Covariates. Kari Lock Morgan, Department of Statistics, Penn State University. Joint work with Don Rubin. University of Minnesota Biostatistics, 4/27/16.

The Gold Standard Randomized experiments are the gold standard for estimating causal effects WHY? 1. They yield unbiased estimates 2. They eliminate confounding variables

[Diagram: a pool of experimental units (R's) randomized into two groups.]

Covariate Balance - Gender. Suppose you get a bad randomization: one group has 512 females and 158 males, the other 158 females and 512 males. What would you do? Can you rerandomize?

The Origin of the Idea Rubin: What if, in a randomized experiment, the chosen randomized allocation exhibited substantial imbalance on a prognostically important baseline covariate? Cochran: Why didn't you block on that variable? Rubin: Well, there were many baseline covariates, and the correct blocking wasn't obvious; and I was lazy at that time. Cochran: This is a question that I once asked Fisher, and his reply was unequivocal: Fisher (recreated via Cochran): Of course, if the experiment had not been started, I would rerandomize.

Mind-Set Matters. In 2007, Dr. Ellen Langer tested her hypothesis that mind-set matters with a randomized experiment. She recruited 84 maids working at 7 different hotels, and randomly assigned half to a treatment group and half to control. The treatment was simply informing the maids that their work satisfies the Surgeon General's recommendations for an active lifestyle. Crum, A.J. and Langer, E.J. (2007). Mind-Set Matters: Exercise and the Placebo Effect, Psychological Science, 18:165-171.

Mind-Set Matters. [Figure: treatment vs. control outcome comparisons; p-value = .01, p-value = .001.]

Covariate Imbalance. The more covariates, the more likely at least one covariate will be imbalanced across treatment groups. With just 10 independent covariates, the probability of a significant difference (α = .05) for at least one covariate is 1 − (1 − .05)^10 ≈ 40%! Covariate imbalance is not limited to rare unlucky randomizations.
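
To make the arithmetic concrete, here is a small Python sketch; the simulated covariates and the per-covariate two-sample z-type check are my own illustrative assumptions, not part of the talk.

```python
# A minimal check of the 1 - (1 - .05)^10 figure, plus a quick simulation
import numpy as np

k, alpha = 10, 0.05
print("Analytic:", 1 - (1 - alpha) ** k)          # ~0.401

rng = np.random.default_rng(0)
n, reps, hits = 100, 5000, 0
for _ in range(reps):
    X = rng.standard_normal((n, k))               # k independent covariates
    treat = rng.permutation(n) < n // 2           # random 50/50 split
    diff = X[treat].mean(axis=0) - X[~treat].mean(axis=0)
    se = np.sqrt(X[treat].var(axis=0, ddof=1) / (n // 2)
                 + X[~treat].var(axis=0, ddof=1) / (n - n // 2))
    hits += np.any(np.abs(diff / se) > 1.96)      # any covariate "significant"?
print("Simulated:", hits / reps)
```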

The Gold Standard Randomized experiments are the gold standard for estimating causal effects WHY? 1. They yield unbiased estimates 2. They eliminate confounding variables on average! For any particular experiment, covariate imbalance is possible (and likely!), and conditional bias exists

RERANDOMIZATION: 1) Collect covariate data; specify criteria determining when a randomization is unacceptable, based on covariate balance. 2) (Re)randomize subjects to treated and control; check covariate balance: if unacceptable, rerandomize; if acceptable, proceed. 3) Conduct experiment. 4) Analyze results with a randomization test.

Unbiased? To maintain an unbiased estimate of the treatment effect, the decision to rerandomize or not should be (1) automatic and specified in advance, and (2) blind to which group is treated. Theorem: If the treated and control groups are the same size, and if for every unacceptable randomization the exact opposite randomization is also unacceptable, then rerandomization yields an unbiased estimate of the treatment effect.

Unbiased? Equal sized treatment groups are not actually required, but that's the easiest case to consider. Theorem: If the rerandomization criterion is affinely invariant and if the covariates, x, are ellipsoidally symmetric, then rerandomization leads to unbiased estimates of any linear function of x.

Criteria for Acceptable Balance. We recommend using the Mahalanobis distance, M, to represent multivariate distance between group means: M = (X̄_T − X̄_C)' [cov(X̄_T − X̄_C)]^(−1) (X̄_T − X̄_C). Under adequate sample sizes and pure randomization, M ~ χ²_k. Choose a and rerandomize when M > a.
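
As a concrete illustration of the rerandomization procedure above, here is a minimal sketch in Python. It is not the authors' implementation: the simulated covariates are my own, and cov(X̄_T − X̄_C) is estimated as (1/n_T + 1/n_C) times the sample covariance of the covariates, so that M is approximately χ²_k under pure randomization.

```python
import numpy as np
from scipy import stats

def mahalanobis_M(X, treat):
    """Mahalanobis distance between treated and control covariate means."""
    diff = X[treat].mean(axis=0) - X[~treat].mean(axis=0)
    cov_diff = (1 / treat.sum() + 1 / (~treat).sum()) * np.cov(X, rowvar=False)
    return float(diff @ np.linalg.solve(cov_diff, diff))

def rerandomize(X, p_a=0.001, rng=None):
    """Redraw 50/50 assignments until M <= a, with a the p_a quantile of chi^2_k."""
    if rng is None:
        rng = np.random.default_rng()
    n, k = X.shape
    a = stats.chi2.ppf(p_a, df=k)            # acceptance threshold
    while True:
        treat = np.zeros(n, dtype=bool)
        treat[rng.choice(n, size=n // 2, replace=False)] = True
        if mahalanobis_M(X, treat) <= a:     # acceptable balance: stop
            return treat

# Illustrative data: 84 units, 10 covariates (sizes chosen for the example only)
rng = np.random.default_rng(1)
X = rng.standard_normal((84, 10))
treat = rerandomize(X, p_a=0.001, rng=rng)
print("Accepted M =", round(mahalanobis_M(X, treat), 3))
```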

Distribution of M. [Figure: sampling distribution of M under pure randomization; p_a = probability of accepting a randomization. Randomizations with M > a are rerandomized; those with M ≤ a are acceptable.]

Choosing p_a. Choosing the acceptance probability is a tradeoff between a desire for better balance and computational time. The number of randomizations needed to get one acceptable one is Geometric(p_a), so the expected number needed is 1/p_a. Computational time must be considered in advance, since many simulated acceptable randomizations are needed to conduct the randomization test.
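
A quick empirical check of the 1/p_a claim, using simulated covariates of my own; because the threshold is the p_a quantile of χ²_k, the acceptance rate per draw is approximately p_a.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k, p_a = 84, 10, 0.01
X = rng.standard_normal((n, k))
a = stats.chi2.ppf(p_a, df=k)
cov_inv = np.linalg.inv((1 / (n // 2) + 1 / (n - n // 2)) * np.cov(X, rowvar=False))

draws_needed = []
for _ in range(200):                      # 200 accepted randomizations
    draws = 0
    while True:
        draws += 1
        treat = np.zeros(n, dtype=bool)
        treat[rng.choice(n, size=n // 2, replace=False)] = True
        d = X[treat].mean(axis=0) - X[~treat].mean(axis=0)
        if d @ cov_inv @ d <= a:
            break
    draws_needed.append(draws)
print("mean draws per acceptance:", np.mean(draws_needed), "vs 1/p_a =", 1 / p_a)
```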

Another Measure of Balance. The obvious choice may be to set limits on the acceptable balance for each covariate individually. This destroys the joint distribution of the covariates. [Figure: scatterplot of two covariates, D1 and D2, showing how per-covariate limits distort their joint distribution.]

Rerandomization Based on M. Since M follows a known distribution, it is easy to specify the proportion of accepted randomizations. M is affinely invariant (unaffected by affine transformations of the covariates). Correlations between covariates are maintained. The balance improvement for each covariate is the same (and known), and is the same for any linear combination of the covariates.

Covariates After Rerandomization. Theorem: If n_T = n_C, the covariate means are normally distributed, and rerandomization occurs when M > a, then E(X̄_T − X̄_C | M ≤ a) = 0 and cov(X̄_T − X̄_C | M ≤ a) = v_a cov(X̄_T − X̄_C), where v_a = [2 γ(k/2 + 1, a/2)] / [k γ(k/2, a/2)] and γ is the incomplete gamma function: γ(b, c) = ∫₀^c y^(b−1) e^(−y) dy.

Mind-Set Matters Difference in Means = -8.3

1000 Randomizations. [Figure: distributions of the difference in mean age across 1000 randomizations, under pure randomization vs. rerandomization with p_a = .001.]

Percent Reduction in Variance. Define the percent reduction in variance for each covariate to be the percentage by which rerandomization reduces the randomization variance of the difference in means: [var(X̄_{j,T} − X̄_{j,C}) − var(X̄_{j,T} − X̄_{j,C} | rerandomization)] / var(X̄_{j,T} − X̄_{j,C}). For rerandomization accepting when M ≤ a, the percent reduction in variance for each covariate is 1 − v_a.

Percent Reduction in Variance. k = 10, p_a = .001, a = 1.48: v_a = [2 γ(k/2 + 1, a/2)] / [k γ(k/2, a/2)] = [2 γ(10/2 + 1, 1.48/2)] / [10 γ(10/2, 1.48/2)] = .12, so the percent reduction in variance is 88%. Increase p_a: how does the percent reduction in variance change? k = 10, p_a = .1, a = 4.87: v_a = [2 γ(10/2 + 1, 4.87/2)] / [10 γ(10/2, 4.87/2)] = .38, so the percent reduction in variance is 62%. [Figure: distributions of the difference in mean age under pure randomization vs. rerandomization, for each choice of p_a.]
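
These numbers are easy to verify numerically. The sketch below (my own) evaluates the slide's v_a formula via the incomplete gamma function and cross-checks it against the equivalent ratio of chi-square CDFs, P(χ²_{k+2} ≤ a) / P(χ²_k ≤ a).

```python
import numpy as np
from scipy import stats, special

def threshold_and_va(k, p_a):
    """Threshold a (the p_a quantile of chi^2_k) and v_a from the slide formula."""
    a = stats.chi2.ppf(p_a, df=k)
    # unnormalized lower incomplete gamma: gamma(b, c) = Gamma(b) * gammainc(b, c)
    lig = lambda b, c: special.gamma(b) * special.gammainc(b, c)
    v = 2 * lig(k / 2 + 1, a / 2) / (k * lig(k / 2, a / 2))
    # equivalent form as a ratio of chi-square CDFs (sanity check)
    assert np.isclose(v, stats.chi2.cdf(a, df=k + 2) / stats.chi2.cdf(a, df=k))
    return a, v

for p_a in (0.001, 0.1):
    a, v = threshold_and_va(10, p_a)
    print(f"p_a={p_a}: a={a:.2f}, v_a={v:.2f}, percent reduction={100 * (1 - v):.0f}%")
# Expected: a = 1.48, v_a = .12 (88%) and a = 4.87, v_a = .38 (62%)
```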

Percent Reduction in Variance. [Figure: percent reduction in variance as a function of the number of covariates k and the acceptance probability p_a (log10 scale); panels for R² = 1 and R² = .8.]

Estimated Treatment Effect. Theorem: If n_T = n_C and rerandomization occurs when M > a, then E(Ȳ_T − Ȳ_C | M ≤ a) = τ, and if the covariate means are normally distributed and the treatment effect, τ, is additive, the percent reduction in variance for the outcome difference in means is (1 − v_a)R², where R² is the coefficient of determination between Y and x.

Outcome Variance Reduction. [Figure: percent reduction in variance of the outcome difference in means vs. the number of covariates k and the acceptance probability p_a (log10 scale); panels for R² = 1, .8, .5, and .2.]

Outcome Variance Reduction. k = 10, p_a = .001, v_a = .12. Outcome: weight change, R² = .1: percent reduction in variance = (1 − v_a)R² = (1 − .12)(.1) = 8.8%. Outcome: change in diastolic blood pressure, R² = .64: percent reduction in variance = (1 − v_a)R² = (1 − .12)(.64) = 56%. Equivalent to increasing the sample size by 1/(1 − .56) = 2.27.
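
A couple of lines of arithmetic reproduce these figures (note the slide rounds 56.3% to 56% before inverting, hence 2.27 rather than 2.29).

```python
# Plain arithmetic from the slide: v_a = 0.12 for k = 10, p_a = .001
v_a = 0.12
for outcome, r2 in [("weight change", 0.10), ("diastolic BP change", 0.64)]:
    prv = (1 - v_a) * r2             # percent reduction in outcome variance
    n_factor = 1 / (1 - prv)         # equivalent sample-size multiplier
    print(f"{outcome}: PRV = {prv:.1%}, sample-size factor = {n_factor:.2f}")
```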

Analysis. Rerandomization can drastically improve power! BUT to take advantage of this, the analysis has to account for the rerandomization. Typical distribution-based methods (such as a t-test) will overestimate the standard error and will be too conservative (p-values larger than they should be). To get an accurate p-value and Type I error rate, use a randomization test, where each simulated randomization follows the same rerandomization procedure used in the actual experiment.
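
A sketch of such a randomization test (my own simulated data and helper names; the essential point is that every simulated assignment is drawn with the same M ≤ a criterion used in the actual experiment):

```python
import numpy as np
from scipy import stats

def rerandomized_assignment(X, a, rng):
    """Draw 50/50 assignments until the balance criterion M <= a is satisfied."""
    n = X.shape[0]
    cov_inv = np.linalg.inv((1 / (n // 2) + 1 / (n - n // 2)) * np.cov(X, rowvar=False))
    while True:
        treat = np.zeros(n, dtype=bool)
        treat[rng.choice(n, size=n // 2, replace=False)] = True
        d = X[treat].mean(axis=0) - X[~treat].mean(axis=0)
        if d @ cov_inv @ d <= a:
            return treat

def rerandomization_test(y, X, treat_obs, p_a=0.001, n_sim=200, seed=0):
    """Fisher randomization test p-value; each simulated draw respects the rerandomization."""
    rng = np.random.default_rng(seed)
    a = stats.chi2.ppf(p_a, df=X.shape[1])
    obs = y[treat_obs].mean() - y[~treat_obs].mean()
    sims = np.empty(n_sim)
    for i in range(n_sim):            # slow for tiny p_a: ~1/p_a draws per simulation
        t = rerandomized_assignment(X, a, rng)
        sims[i] = y[t].mean() - y[~t].mean()
    return np.mean(np.abs(sims) >= abs(obs))   # two-sided p-value

# Illustrative use with simulated data and no true treatment effect
rng = np.random.default_rng(1)
X = rng.standard_normal((84, 10))
y = X @ rng.normal(size=10) + rng.standard_normal(84)
treat_obs = rerandomized_assignment(X, stats.chi2.ppf(0.001, df=10), rng)
print("p-value:", rerandomization_test(y, X, treat_obs))
```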

Extensions: tiers of covariate importance; improving computational efficiency; sequential assignment; block experiments; unequal treatment group sizes; multiple treatment groups; choosing the best from a fixed number of randomizations.

Conclusion. Rerandomization improves covariate balance between the treatment groups, giving the researcher more faith that an observed effect is really due to the treatment. If the covariates are correlated with the outcome, rerandomization also increases precision in estimating the treatment effect, giving the researcher more power to detect a significant result.

Lock Morgan, K. and Rubin, D.B. (2012). Rerandomization to Improve Covariate Balance in Experiments, Annals of Statistics, 40(2): 1262-1282. Lock Morgan, K. and Rubin, D.B. (2015). Rerandomization to Balance Tiers of Covariates, JASA, 110(512): 1412-1421. klm47@psu.edu

Tiers of Covariates. Place covariates into tiers of varying importance. 1) Assess balance for the top tier as usual (calculate M_1, rerandomize if M_1 > a_1). 2) For each lower tier, regress each covariate on all covariates in more important tiers and balance the residuals (calculate M_t for the residuals of tier t, rerandomize if M_t > a_t). Balance criteria can be more stringent for more important tiers.
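
A minimal sketch of this tiered criterion (my own illustrative data, tier sizes, and thresholds; the residualization of the lower tier on the top tier is the key step):

```python
import numpy as np
from scipy import stats

def mahalanobis_M(Z, treat):
    d = Z[treat].mean(axis=0) - Z[~treat].mean(axis=0)
    cov_d = (1 / treat.sum() + 1 / (~treat).sum()) * np.cov(Z, rowvar=False)
    return float(d @ np.linalg.solve(cov_d, d))

def tiered_acceptable(X1, X2, treat, a1, a2):
    """Accept only if tier-1 covariates and tier-2 residuals are both balanced."""
    Z1 = np.column_stack([np.ones(len(X1)), X1])
    beta, *_ = np.linalg.lstsq(Z1, X2, rcond=None)   # regress tier 2 on tier 1
    resid2 = X2 - Z1 @ beta
    return mahalanobis_M(X1, treat) <= a1 and mahalanobis_M(resid2, treat) <= a2

rng = np.random.default_rng(3)
X1, X2 = rng.standard_normal((84, 3)), rng.standard_normal((84, 7))
a1 = stats.chi2.ppf(0.01, df=3)    # stricter threshold for the more important tier
a2 = stats.chi2.ppf(0.10, df=7)
while True:
    treat = np.zeros(84, dtype=bool)
    treat[rng.choice(84, size=42, replace=False)] = True
    if tiered_acceptable(X1, X2, treat, a1, a2):
        break
print("accepted assignment; tier-1 M =", round(mahalanobis_M(X1, treat), 3))
```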

Computational Efficiency. First, orthogonalize the covariates: regress each covariate on all previous covariates, and use the residuals as the new covariate. The Mahalanobis distance then simplifies to a sum of squared differences in covariate means. If the squared difference in means for any covariate exceeds the threshold, rerandomize. The result is the same as if we had used the original covariates, because the Mahalanobis distance is affinely invariant.
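
A sketch of this orthogonalize-then-exit-early idea (my own implementation and names; here the early exit compares the running sum against the overall threshold a, which is one way to realize the per-covariate check):

```python
import numpy as np

def orthogonalize(X):
    """Regress each covariate on all previous ones; use the residuals as new covariates."""
    n, k = X.shape
    Z = np.empty_like(X, dtype=float)
    Z[:, 0] = X[:, 0] - X[:, 0].mean()
    for j in range(1, k):
        A = np.column_stack([np.ones(n), X[:, :j]])
        beta, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        Z[:, j] = X[:, j] - A @ beta
    return Z

def M_early_exit(Z, treat, a):
    """With orthogonalized covariates, M is a sum of squared standardized mean
    differences; stop (and rerandomize) as soon as the running sum exceeds a."""
    n_t, n_c = treat.sum(), (~treat).sum()
    total = 0.0
    for j in range(Z.shape[1]):
        d = Z[treat, j].mean() - Z[~treat, j].mean()
        var_d = (1 / n_t + 1 / n_c) * Z[:, j].var(ddof=1)
        total += d * d / var_d
        if total > a:
            return total, False    # unacceptable: rerandomize
    return total, True             # acceptable
```

Because the orthogonalization is an affine transformation of the covariates, the set of accepted randomizations is the same as with the original Mahalanobis distance.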

Sequential Assignment Rerandomization used in the design of a clinical trial proposed to the FDA for a drug to improve female sexual satisfaction Patients enroll sequentially: rerandomize in blocks of B women at a time (B = 6 for this example) 1. Randomize the first B women to enroll 2. Treat these women with placebo or active treatment 3. Wait for B more women to enroll 4. Use rerandomization for these next B women, measuring overall covariate balance for all 2B women enrolled 5. Treat these new B women with placebo or active 6. Wait for B more women to enroll For each rerandomization, generate a fixed number of randomizations and choose the best
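
A sketch of this block-wise scheme (my own illustration; earlier blocks' assignments stay fixed, and each new block's assignment is chosen as the best of a fixed number of candidate randomizations by overall balance):

```python
import numpy as np

def assign_next_block(X_all, treat_so_far, B, n_candidates, rng):
    """Among n_candidates 50/50 splits of the new block of B patients, pick the
    one giving the smallest overall M across everyone enrolled so far."""
    cov_X = np.cov(X_all, rowvar=False)
    best_M, best_new = np.inf, None
    for _ in range(n_candidates):
        new = np.zeros(B, dtype=bool)
        new[rng.choice(B, size=B // 2, replace=False)] = True
        treat = np.concatenate([treat_so_far, new])
        d = X_all[treat].mean(axis=0) - X_all[~treat].mean(axis=0)
        cov_d = (1 / treat.sum() + 1 / (~treat).sum()) * cov_X
        M = float(d @ np.linalg.solve(cov_d, d))
        if M < best_M:
            best_M, best_new = M, new
    return best_new

# Illustrative: B = 6, first block purely randomized, later blocks best-of-100
rng = np.random.default_rng(4)
B, k = 6, 4
X_all = rng.standard_normal((B, k))
treat = np.zeros(B, dtype=bool)
treat[rng.choice(B, size=B // 2, replace=False)] = True
for _ in range(5):                                  # five more blocks enroll
    X_all = np.vstack([X_all, rng.standard_normal((B, k))])
    treat = np.concatenate([treat, assign_next_block(X_all, treat, B, 100, rng)])
print("treated", treat.sum(), "of", len(treat))
```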

Blocking. This is not meant to replace blocking; blocking and rerandomization can be used together. Blocking is great for balancing a few very important covariates, but is not easily used for many covariates. Block what you can, rerandomize what you can't.

Unequal Group Sizes. If unequal group sizes are necessary as an incentive to participate, one option is to rerandomize into multiple equal-sized groups, then combine groups after the fact. Example: for a treatment group twice as big as the control, rerandomize into 3 balanced groups, then combine two to form the treatment group. If there are only enough resources to support a certain number in one group, it can be worth reducing the sample size to get equal group sizes (if you care about unbiasedness).

Multiple Treatment Groups. Rerandomization is easily extended to multiple treatment groups; we just need a balance measure. To measure balance, one of the MANOVA test statistics (such as Wilks' Λ) can be used, measuring the ratio of within-group variability to between-group variability for multiple covariates. This is equivalent to rerandomizing based on the Mahalanobis distance if used on only two groups.
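
One way such a criterion could look, sketched with my own illustrative data and cutoff rule. Here Wilks' Λ = det(W)/det(T), the within-group over the total sum-of-squares-and-cross-products matrix; values near 1 mean the group covariate means are close, i.e., well balanced.

```python
import numpy as np

def wilks_lambda(X, groups):
    """Wilks' Lambda = det(W) / det(T): within-group SSCP over total SSCP."""
    centered = X - X.mean(axis=0)
    T = centered.T @ centered
    W = np.zeros_like(T)
    for g in np.unique(groups):
        Xg = X[groups == g]
        Cg = Xg - Xg.mean(axis=0)
        W += Cg.T @ Cg
    return np.linalg.det(W) / np.linalg.det(T)

# Rerandomize three equal groups until Lambda exceeds an empirical cutoff
rng = np.random.default_rng(5)
X = rng.standard_normal((90, 5))
def random_groups():
    return rng.permutation(np.repeat([0, 1, 2], 30))
sims = np.array([wilks_lambda(X, random_groups()) for _ in range(2000)])
cutoff = np.quantile(sims, 0.999)        # accept only the ~0.1% best-balanced draws
groups = random_groups()
while wilks_lambda(X, groups) < cutoff:
    groups = random_groups()
print("accepted Lambda:", round(wilks_lambda(X, groups), 4))
```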

Small Samples or Non-Normality. The theoretical results given depend on M ~ χ²_k. This does not hold if n is not large enough for the CLT to make the covariate means normal. That is fine! Rerandomization does NOT depend on asymptotics. The threshold for a desired p_a and the percent reduction in variance for the covariates can be determined empirically through simulation. The variance of the estimate is used nowhere in the analysis. The randomization test will work no matter how non-normal the covariate means are.
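
A sketch of the empirical route (my own, with deliberately skewed covariates and small n to make the point that no normality is required):

```python
import numpy as np

def empirical_threshold_and_prv(X, p_a=0.01, n_sim=20_000, seed=0):
    """Simulate pure randomizations of the actual covariates; take the p_a quantile
    of M as the threshold, and estimate each covariate's percent reduction in
    variance among the accepted randomizations."""
    rng = np.random.default_rng(seed)
    n, k = X.shape
    cov_inv = np.linalg.inv((1 / (n // 2) + 1 / (n - n // 2)) * np.cov(X, rowvar=False))
    Ms, diffs = np.empty(n_sim), np.empty((n_sim, k))
    for i in range(n_sim):
        treat = np.zeros(n, dtype=bool)
        treat[rng.choice(n, size=n // 2, replace=False)] = True
        d = X[treat].mean(axis=0) - X[~treat].mean(axis=0)
        diffs[i], Ms[i] = d, d @ cov_inv @ d
    a = np.quantile(Ms, p_a)
    accepted = Ms <= a
    prv = 1 - diffs[accepted].var(axis=0) / diffs.var(axis=0)
    return a, prv

rng = np.random.default_rng(6)
X = rng.exponential(size=(30, 4)) ** 2       # heavily skewed covariates, n = 30
a, prv = empirical_threshold_and_prv(X, p_a=0.01)
print("empirical threshold a:", round(float(a), 2))
print("percent reduction in variance per covariate:", np.round(100 * prv, 1))
```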
