Accounting for Baseline Observations in Randomized Clinical Trials


Scott S. Emerson, MD, PhD
Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
August 5, 0

Abstract

In clinical trials investigating the treatment of chronic, progressive diseases, a common design is to make some physiologic measurement of the disease state at the start of the trial, randomize patients to either the experimental treatment or some control treatment (placebo), follow the patients over some period of time, and then measure the physiologic state of the disease once again. The question then arises: how should the baseline measurements made prior to randomization be used? In this manuscript, I consider several common strategies that might be used in this setting, including analyzing only the follow-up measurement, analyzing the change in measurements, and the analysis of covariance (ANCOVA) approach of adjusting for the baseline measurement in a linear regression model. I illustrate several different ways to motivate the use of ANCOVA, and examine the settings in which not measuring baseline might be preferred. I conclude with a discussion of how these results might change when using a randomized crossover design.

1 Introduction

In clinical trials investigating the treatment of chronic, progressive diseases, a common design is to make some physiologic measurement of the disease state at the start of the trial, randomize patients to either the experimental treatment or some control treatment (placebo), follow the patients over some period of time, and then measure the physiologic state of the disease once again. Take for instance a double-blind, placebo-controlled, randomized clinical trial (RCT) of a new drug in the setting of hypertension. Upon recruiting a patient into the RCT, we would typically measure the patient's systolic blood pressure (SBP) for the purposes of determining eligibility for the trial. We might further decide to stratify randomization of patients on SBP in order to ensure balance across treatment arms with respect to severity of hypertension. Then, after treating the patients with either the experimental treatment or placebo for a period of, say, a year, we again measure the SBP of all patients, and analyze the data to decide whether the experimental treatment is associated with a scientifically important and statistically significant lower SBP when compared to placebo.

The question is: how do we use the baseline (pre-randomization) measurement of SBP in the data analysis? Owing to the randomized nature of the study, there are several choices that will each estimate the causal effect of the treatment in an unbiased fashion:

1. Ignore baseline and analyze the final measurements. By virtue of randomization, the distribution of SBP at the start of the study is equal in each treatment arm. Hence, any difference in the distribution of SBP at the end of the trial is directly attributable to the effect of the experimental treatment.

2. Analyze the change in measurements over the course of the study. For each patient, we compute the difference between the final SBP and the baseline SBP, and then compare the treatment arms with respect to the change in measurements.

3. Adjust for the baseline measurement as a covariate in a linear regression model of the final measurement for each patient by treatment arm.

4. Adjust for the baseline measurement as a covariate in a linear regression model of the change in measurements for each patient by treatment arm.

In my experience, the second of these methods is the one most commonly advocated by clinical investigators. It seems only natural to them: we are interested in an intervention that would either change the pathophysiologic process for the better or at least lessen the change in the pathophysiologic process for the worse. In this manuscript, I explore the settings in which one of the approaches might be preferable to the others. In this exploration, I illustrate the seemingly paradoxical result that, in an RCT, the second method (analyzing the change in measurements) is never an optimal choice, while there are situations in which any of the other three is optimal. It should be noted that the results presented herein are specific to RCTs, because we make extensive use of the fact that the distribution of baseline measurements is known to be the same in both treatment groups.

2 Notation in the Homoscedastic, Independent Sample Setting

Let $Y_{ktj}$ be the SBP measurement in the $k$th treatment group ($k = 1$ for experimental treatment, $k = 0$ for placebo) at time $t$ ($t = 0$ for the baseline, pre-randomization measurement; $t = 1$ for the end-of-treatment measurement) in the $j$th patient ($j = 1, \ldots, n_k$). We assume all subjects are independent and that the subjects in the experimental treatment group are different from those in the placebo group. Suppose $Y_{ktj} \sim (\mu_{kt}, \sigma^2)$, meaning that for a population receiving treatment $k$, the average measurement pre-randomization would be $\mu_{k0}$ with a standard deviation of $\sigma$, and the average measurement post-treatment would be $\mu_{k1}$ with a standard deviation of $\sigma$. We presume that individuals are independent, and that repeat measurements made on the same subject in the $k$th group have correlation $\rho$. Thus $corr(Y_{k1j}, Y_{k0j}) = \rho$, $corr(Y_{ktj}, Y_{kt'j'}) = 0$ for $j \ne j'$, and $corr(Y_{ktj}, Y_{k't'j'}) = 0$ for $k \ne k'$. (This homoscedastic model presumes not only that the treatment does not affect the variability of the measurements, but also that the treatment does not affect the correlation of the baseline and follow-up measurements.)

We presume a randomized study, so we have $\mu_{10} = \mu_{00}$. We are ultimately interested in estimating $\theta = \mu_{11} - \mu_{01}$.

In order to derive general results, we consider the distribution of linear combinations of the form
$$\Delta_{kj} = Y_{k1j} - a_k Y_{k0j}.$$
Using simple properties of expectations and covariances, we find
$$E[\Delta_{kj}] = E[Y_{k1j} - a_k Y_{k0j}] = \mu_{k1} - a_k \mu_{k0}$$
$$Var(\Delta_{kj}) = Var(Y_{k1j} - a_k Y_{k0j}) = \sigma^2 + a_k^2\sigma^2 - 2a_k\rho\sigma^2 = (1 + a_k^2 - 2a_k\rho)\,\sigma^2,$$
so letting
$$\bar\Delta_k = \frac{1}{n_k}\sum_{j=1}^{n_k} \Delta_{kj}$$
we can derive moments
$$E[\bar\Delta_k] = \mu_{k1} - a_k\mu_{k0}, \qquad Var(\bar\Delta_k) = \frac{(1 + a_k^2 - 2a_k\rho)\,\sigma^2}{n_k}.$$
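The moment calculations above can be checked with a quick Monte Carlo simulation. The sketch below uses illustrative values of $\mu_{k0}$, $\mu_{k1}$, $\sigma$, $\rho$, and $a_k$ (all hypothetical, not taken from the text) and compares the empirical mean and variance of $\Delta_{kj}$ against the closed forms:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values for illustration (not from the text):
mu_k0, mu_k1 = 150.0, 140.0   # baseline and follow-up means
sigma, rho, a = 10.0, 0.6, 0.4
n = 200_000                   # many subjects so sample moments are stable

# Draw correlated (baseline, follow-up) pairs for one treatment group.
cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
y = rng.multivariate_normal([mu_k0, mu_k1], cov, size=n)
delta = y[:, 1] - a * y[:, 0]          # Delta_kj = Y_k1j - a * Y_k0j

# Compare empirical moments with the closed forms derived above.
mean_theory = mu_k1 - a * mu_k0
var_theory = (1 + a**2 - 2 * a * rho) * sigma**2

print(delta.mean(), mean_theory)   # close for large n
print(delta.var(), var_theory)
```

Normality is used here only for convenience in generating correlated pairs; the moment formulas themselves require only the first two moments and the correlation structure.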

Now, for a specified value of $a$, we define $\hat\theta(a) = \bar\Delta_1 - \bar\Delta_0$, and find distribution moments
$$E[\hat\theta(a)] = (\mu_{11} - a_1\mu_{10}) - (\mu_{01} - a_0\mu_{00}) = \theta - (a_1\mu_{10} - a_0\mu_{00}) = \theta - (a_1 - a_0)\mu_{00},$$
and $\hat\theta$ is unbiased for $\theta$ for arbitrary $\mu_{00}$ if and only if $a_1 = a_0 = a$. Now we can find distribution variance
$$Var(\hat\theta(a)) = (1 + a^2 - 2a\rho)\,\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_0}\right).$$

Special cases of interest are:

1. Ignore baseline and analyze the final measurements. This corresponds to $a = 0$, in which case
$$Var(\hat\theta(0)) = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_0}\right).$$

2. Analyze the change in measurements over the course of the study. This corresponds to $a = 1$, in which case
$$Var(\hat\theta(1)) = 2\sigma^2(1 - \rho)\left(\frac{1}{n_1} + \frac{1}{n_0}\right).$$

By differentiating the above expression with respect to $a$, we can find the form of $\hat\theta$ with minimal variability as having $a = \rho$, in which case
$$Var(\hat\theta(\rho)) = \sigma^2(1 - \rho^2)\left(\frac{1}{n_1} + \frac{1}{n_0}\right).$$

Note that:

1. For $\rho < 0.5$, $Var(\hat\theta(0)) < Var(\hat\theta(1))$, and throwing away the baseline value is more efficient than comparing the change in measurements across treatment arms.

2. For $\rho = 0.5$, $Var(\hat\theta(0)) = Var(\hat\theta(1))$, and throwing away the baseline value is just as efficient as comparing the change in measurements across treatment arms. (This is the only point of equality between these two estimators.)

3. For $\rho > 0.5$, $Var(\hat\theta(0)) > Var(\hat\theta(1))$, and throwing away the baseline value is less efficient than comparing the change in measurements across treatment arms.

4. For all $\rho$, $Var(\hat\theta(\rho)) \le Var(\hat\theta(0))$ and $Var(\hat\theta(\rho)) \le Var(\hat\theta(1))$.

5. For $\rho = 0$, $Var(\hat\theta(0)) = Var(\hat\theta(\rho))$. (This is the only point of equality between these two estimators.)

6. For $\rho = 1$, $Var(\hat\theta(1)) = Var(\hat\theta(\rho))$. (This is the only point of equality between these two estimators.)

Note that the fourth observation above shows that when $\rho$ is known, $\hat\theta(\rho)$ is the best linear estimator, as it is always at least as good as either of the other two approaches based on $a = 0$ or $a = 1$.
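These variance orderings can be verified numerically; a minimal sketch, with $\sigma^2 = 1$ and $n_0 = n_1 = 100$ as arbitrary illustrative choices:

```python
import numpy as np

def var_theta_hat(a, rho, sigma2=1.0, n1=100, n0=100):
    # Var(theta-hat(a)) = (1 + a^2 - 2*a*rho) * sigma^2 * (1/n1 + 1/n0)
    return (1 + a**2 - 2 * a * rho) * sigma2 * (1 / n1 + 1 / n0)

for rho in np.linspace(0.0, 0.99, 12):
    v0 = var_theta_hat(0.0, rho)   # a = 0: ignore baseline
    v1 = var_theta_hat(1.0, rho)   # a = 1: change from baseline
    vr = var_theta_hat(rho, rho)   # a = rho: optimal linear combination
    assert vr <= v0 + 1e-12 and vr <= v1 + 1e-12   # observation 4
    if rho < 0.5:
        assert v0 < v1                             # observation 1
    elif rho > 0.5:
        assert v0 > v1                             # observation 3

# Exact crossover at rho = 0.5 (observation 2): both variances are equal.
print(var_theta_hat(0.0, 0.5), var_theta_hat(1.0, 0.5))
```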

3 An Alternative Derivation When Measurements Are Normally Distributed

Consider the same homoscedastic setting as in section 2 with $\sigma$ and $\rho$ known constants, but now suppose that it is known that the vectors $(Y_{k0j}, Y_{k1j})$ are bivariate normal. We can then write the density for our data as a function of $\vec\theta = (\mu_{00}, \mu_{01}, \mu_{11})$ as
$$f(\vec Y; \vec\theta) = \prod_{k=0}^{1}\prod_{j=1}^{n_k} \frac{1}{2\pi\sigma^2\sqrt{1-\rho^2}} \exp\left\{-\frac{(Y_{k0j}-\mu_{00})^2 - 2\rho(Y_{k0j}-\mu_{00})(Y_{k1j}-\mu_{k1}) + (Y_{k1j}-\mu_{k1})^2}{2\sigma^2(1-\rho^2)}\right\}$$
$$= g(\vec\theta)\, h(\vec Y)\, \exp\left\{\sum_{l=1}^{3} c_l(\vec\theta)\, T_l(\vec Y)\right\},$$
where the familiar form of an exponential family density is found by expanding the product and grouping terms, with
$$g(\vec\theta) = \left(2\pi\sigma^2\sqrt{1-\rho^2}\right)^{-(n_0+n_1)} \exp\left\{-\frac{(n_0+n_1)\mu_{00}^2 - 2\rho(n_0\mu_{01} + n_1\mu_{11})\mu_{00} + n_1\mu_{11}^2 + n_0\mu_{01}^2}{2\sigma^2(1-\rho^2)}\right\}$$
$$h(\vec Y) = \exp\left\{-\sum_{k=0}^{1}\sum_{j=1}^{n_k} \frac{Y_{k0j}^2 - 2\rho Y_{k0j}Y_{k1j} + Y_{k1j}^2}{2\sigma^2(1-\rho^2)}\right\}$$
$$c_1(\vec\theta) = \frac{\mu_{00}}{\sigma^2(1-\rho^2)}, \qquad c_2(\vec\theta) = \frac{\mu_{01}}{\sigma^2(1-\rho^2)}, \qquad c_3(\vec\theta) = \frac{\mu_{11}}{\sigma^2(1-\rho^2)}$$
$$T_1(\vec Y) = \sum_{k=0}^{1}\sum_{j=1}^{n_k}(Y_{k0j} - \rho Y_{k1j}), \qquad T_2(\vec Y) = \sum_{j=1}^{n_0}(Y_{01j} - \rho Y_{00j}), \qquad T_3(\vec Y) = \sum_{j=1}^{n_1}(Y_{11j} - \rho Y_{10j}).$$

Note that in this exponential family density, $\vec\theta$ is a three-dimensional vector, and the range of $\vec\theta$ contains an open rectangle in 3 dimensions. Hence, we know that $\vec T = (T_1, T_2, T_3)$ is a complete sufficient statistic. Furthermore, because $\hat\theta(\rho) = T_3(\vec Y)/n_1 - T_2(\vec Y)/n_0$ is unbiased for $\theta$ and a function of the complete sufficient statistic, we know that $\hat\theta(\rho)$ is the uniform minimum variance unbiased estimator (UMVUE) of $\theta$.

4 Settings Where Not Measuring Baseline Is Optimal

Even when $\rho$ is known, there is a setting in which a better approach would be to use only the final measurement.

Suppose the measurement of $Y_{ktj}$ is extremely expensive. Then the limiting factor in our experimental design might be the number of measurements we have to make. In order to estimate $\hat\theta(\rho)$, we must make $2(n_0 + n_1)$ measurements: one for each subject at baseline and one for each subject at the end of treatment. As noted above, the variance of $\hat\theta(\rho)$ is given by
$$Var(\hat\theta(\rho)) = \sigma^2(1-\rho^2)\left(\frac{1}{n_1} + \frac{1}{n_0}\right).$$
But suppose that we doubled the number of subjects to $n_0^* = 2n_0$ and $n_1^* = 2n_1$ and did not measure baseline values at all. In this case, we would use $\hat\theta(0)$, which has variance
$$Var(\hat\theta(0)) = \sigma^2\left(\frac{1}{2n_1} + \frac{1}{2n_0}\right) = \frac{\sigma^2}{2}\left(\frac{1}{n_1} + \frac{1}{n_0}\right).$$
We can then characterize the setting in which $Var(\hat\theta(0)) < Var(\hat\theta(\rho))$ as
$$\frac{\sigma^2}{2}\left(\frac{1}{n_1} + \frac{1}{n_0}\right) < \sigma^2(1-\rho^2)\left(\frac{1}{n_1} + \frac{1}{n_0}\right) \iff \frac{1}{2} < 1 - \rho^2 \iff \rho < \frac{1}{\sqrt{2}} \approx 0.707.$$
Hence, if there is no other need to measure baseline values, and if it is relatively easier and cheaper to accrue patients than to make measurements on them, then the most efficient design is to measure and use only the follow-up values at the end of treatment, unless the correlation within subjects is extremely high ($\rho > 0.707$). Of course, it is not unusual for patient eligibility to be based on the pre-randomization value of the measurement, in which case the advantage of ignoring the baseline measurement is gone.

5 Settings Where ρ Is Unknown: ANCOVA

The above derivations assumed that both $\rho$ and $\sigma$ were known in a homoscedastic setting. It is most often the case, however, that neither $\rho$ nor $\sigma$ is known, and we must use estimates. One approach to such estimation is to use a linear model in which we adjust for the baseline value as a covariate.

To further explore this approach, it is useful to modify our notation. Define outcome vector $Z$, treatment vector $X$, and baseline vector $W$ for $i = 1, \ldots, n = n_0 + n_1$ by
$$Z_i = \begin{cases} Y_{01i} & 1 \le i \le n_0 \\ Y_{11j} & i = n_0 + j \end{cases} \qquad X_i = \begin{cases} 0 & 1 \le i \le n_0 \\ 1 & i > n_0 \end{cases} \qquad W_i = \begin{cases} Y_{00i} & 1 \le i \le n_0 \\ Y_{10j} & i = n_0 + j \end{cases}$$
We then fit the ordinary least squares (OLS) model
$$E[Z_i \mid X_i, W_i] = \beta_0 + X_i\beta_1 + W_i\beta_2,$$
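This ANCOVA model can be sketched numerically. The simulation below (all parameter values are illustrative, not from the text) generates bivariate normal data in the homoscedastic setting and fits the regression by least squares; as the text derives, in large samples the coefficient on treatment estimates $\theta$ and the coefficient on baseline estimates $\rho$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical homoscedastic setting (values chosen only for illustration):
n0 = n1 = 50_000
mu00, theta = 150.0, -8.0          # common baseline mean; treatment effect
sigma, rho = 12.0, 0.7
cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])

# Both arms drift by -2 on average; the treated arm shifts by theta as well.
y0 = rng.multivariate_normal([mu00, mu00 - 2.0], cov, size=n0)
y1 = rng.multivariate_normal([mu00, mu00 - 2.0 + theta], cov, size=n1)

# Outcome Z, treatment indicator X, and baseline W, as defined in the text.
Z = np.concatenate([y0[:, 1], y1[:, 1]])
X = np.concatenate([np.zeros(n0), np.ones(n1)])
W = np.concatenate([y0[:, 0], y1[:, 0]])

# OLS fit of E[Z | X, W] = b0 + X*b1 + W*b2
design = np.column_stack([np.ones_like(Z), X, W])
b0, b1, b2 = np.linalg.lstsq(design, Z, rcond=None)[0]

print(b1)   # consistent for theta
print(b2)   # consistent for rho in the homoscedastic case
```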

This fit yields OLS estimates
$$\hat\beta_1 = \frac{1}{1 - r_{XW}^2}\left(\frac{V_{XZ}}{V_{XX}} - \frac{V_{XW}V_{WZ}}{V_{XX}V_{WW}}\right), \qquad \hat\beta_2 = \frac{1}{1 - r_{XW}^2}\left(\frac{V_{WZ}}{V_{WW}} - \frac{V_{XW}V_{XZ}}{V_{WW}V_{XX}}\right),$$
where, for sample means $\bar Z$, $\bar X$, and $\bar W$,
$$r_{WX} = \frac{S_{XW}}{\sqrt{S_{XX}S_{WW}}}, \qquad S_{XX} = \sum_{i=1}^{n}(X_i - \bar X)^2, \qquad V_{XX} = S_{XX}/n,$$
$$S_{XW} = \sum_{i=1}^{n}(X_i - \bar X)(W_i - \bar W), \qquad V_{XW} = S_{XW}/n,$$
with similar definitions for $V_{WW}$, $V_{XZ}$, and $V_{WZ}$.

Now, by randomization, we know that $W$ and $X$ are independent. Thus, under the assumption that the randomization ratio approaches some constant (i.e., $n_1/n \to \lambda \in (0,1)$ as $n \to \infty$), the consistency of sample moments for population moments and the properties of convergence in probability provide that
$$V_{XW} \to_p 0, \qquad r_{XW} \to_p 0, \qquad V_{XX} \to_p \lambda(1-\lambda), \qquad V_{WW} \to_p \sigma^2,$$
$$V_{ZZ} \to_p \sigma^2 + \lambda(1-\lambda)(\mu_{11}-\mu_{01})^2, \qquad V_{WZ} \to_p \rho\sigma^2, \qquad V_{XZ} \to_p \lambda(1-\lambda)(\mu_{11}-\mu_{01}) = \lambda(1-\lambda)\theta,$$
$$\hat\beta_1 \to_p \theta, \qquad \hat\beta_2 \to_p \rho.$$
Thus, this analysis of covariance (ANCOVA) model is essentially using a consistent estimate of $\rho$ as the coefficient for the baseline value.

Further statements can be made if we know that our statistical model is an accurate representation of the data-generating mechanism. Because the treatment assignment is binary, the only issue is whether some linear relationship exists between the baseline and follow-up measurements within each treatment group. If this does hold, then in the homoscedastic setting (irrespective of the distribution of the errors, including irrespective of whether the errors are identically distributed), we know by the Gauss-Markov Theorem that $\hat\beta_1$ is the best linear unbiased estimator (BLUE) of $\theta$.

Sometimes an investigator is extremely insistent on analyzing the change in measurements. They then

fit a model in which they define
$$Z_i = \begin{cases} Y_{01i} - Y_{00i} & 1 \le i \le n_0 \\ Y_{11j} - Y_{10j} & i = n_0 + j \end{cases} \qquad X_i = \begin{cases} 0 & 1 \le i \le n_0 \\ 1 & i > n_0 \end{cases} \qquad W_i = \begin{cases} Y_{00i} & 1 \le i \le n_0 \\ Y_{10j} & i = n_0 + j \end{cases}$$
and fit the ordinary least squares (OLS) model
$$E[Z_i \mid X_i, W_i] = \alpha_0 + X_i\alpha_1 + W_i\alpha_2.$$
This approach provides exactly the same inference about the treatment effect, because $\alpha_0 = \beta_0$, $\alpha_1 = \beta_1$, and $\alpha_2 = \beta_2 - 1$, with OLS estimates $\hat\alpha_0 = \hat\beta_0$, $\hat\alpha_1 = \hat\beta_1$, and $\hat\alpha_2 = \hat\beta_2 - 1$.

6 ANCOVA in the General Heteroscedastic Setting

6.1 Notation

Let $Y_{ktj}$ be the SBP measurement in the $k$th treatment group ($k = 1$ for experimental treatment, $k = 0$ for placebo) at time $t$ ($t = 0$ for the baseline, pre-randomization measurement; $t = 1$ for the end-of-treatment measurement) in the $j$th patient ($j = 1, \ldots, n_k$). Suppose $Y_{ktj} \sim (\mu_{kt}, \sigma_{kt}^2)$, meaning that for a population receiving treatment $k$, the average measurement pre-randomization would be $\mu_{k0}$ with a standard deviation of $\sigma_{k0}$, and the average measurement post-treatment would be $\mu_{k1}$ with a standard deviation of $\sigma_{k1}$. We presume that individuals are independent, and that repeat measurements made on the same subject in the $k$th group have correlation $\rho_k$. Thus $corr(Y_{k1j}, Y_{k0j}) = \rho_k$, $corr(Y_{ktj}, Y_{kt'j'}) = 0$ for $j \ne j'$, and $corr(Y_{ktj}, Y_{k't'j'}) = 0$ for $k \ne k'$. (In this heteroscedastic setting, we allow the treatment to affect both the variability of the follow-up measurements and the correlation between the baseline and follow-up measurements.)

6.2 Optimal Linear Combination of Measurements

We presume a randomized study, so we have $\mu_{10} = \mu_{00}$ and $\sigma_{10} = \sigma_{00}$. We are ultimately interested in estimating $\theta = \mu_{11} - \mu_{01}$. In order to derive general results, we consider the distribution of linear combinations of the form $\Delta_{kj} = Y_{k1j} - a_k Y_{k0j}$. Using simple properties of expectations and covariances, we find
$$E[\Delta_{kj}] = E[Y_{k1j} - a_k Y_{k0j}] = \mu_{k1} - a_k\mu_{k0}$$
$$Var(\Delta_{kj}) = Var(Y_{k1j} - a_k Y_{k0j}) = \sigma_{k1}^2 + a_k^2\sigma_{k0}^2 - 2a_k\rho_k\sigma_{k1}\sigma_{k0},$$
so letting
$$\bar\Delta_k = \frac{1}{n_k}\sum_{j=1}^{n_k}\Delta_{kj}$$
we can derive moments
$$E[\bar\Delta_k] = \mu_{k1} - a_k\mu_{k0}, \qquad Var(\bar\Delta_k) = \frac{\sigma_{k1}^2 + a_k^2\sigma_{k0}^2 - 2a_k\rho_k\sigma_{k1}\sigma_{k0}}{n_k}.$$

Now, we define $\hat\theta = \bar\Delta_1 - \bar\Delta_0$, and find distribution moments
$$E[\hat\theta] = (\mu_{11} - a_1\mu_{10}) - (\mu_{01} - a_0\mu_{00}) = \theta - (a_1\mu_{10} - a_0\mu_{00}) = \theta - (a_1 - a_0)\mu_{00},$$
and $\hat\theta$ is unbiased for $\theta$ for arbitrary $\mu_{00}$ if and only if $a_1 = a_0 = a$. Now we can find distribution variance (recalling that $\sigma_{10} = \sigma_{00}$)
$$Var(\hat\theta) = \frac{\sigma_{11}^2 + a^2\sigma_{00}^2 - 2a\rho_1\sigma_{11}\sigma_{00}}{n_1} + \frac{\sigma_{01}^2 + a^2\sigma_{00}^2 - 2a\rho_0\sigma_{01}\sigma_{00}}{n_0}$$
$$= \frac{\sigma_{11}^2}{n_1} + \frac{\sigma_{01}^2}{n_0} + a^2\sigma_{00}^2\left(\frac{1}{n_1} + \frac{1}{n_0}\right) - 2a\sigma_{00}\left(\frac{\rho_1\sigma_{11}}{n_1} + \frac{\rho_0\sigma_{01}}{n_0}\right).$$
By differentiating the above expression with respect to $a$, we can find the form of $\hat\theta$ with minimal variability as having
$$a = \frac{n_0}{n_0+n_1}\,\rho_1\frac{\sigma_{11}}{\sigma_{00}} + \frac{n_1}{n_0+n_1}\,\rho_0\frac{\sigma_{01}}{\sigma_{00}} = (1-\lambda)\,\rho_1\frac{\sigma_{11}}{\sigma_{00}} + \lambda\,\rho_0\frac{\sigma_{01}}{\sigma_{00}},$$
where $\lambda = n_1/(n_0+n_1)$ reflects the randomization ratio. Note that this is a weighted average of the slope parameter from a regression of $Y_{11j}$ on $Y_{10j}$ and the slope parameter from a regression of $Y_{01j}$ on $Y_{00j}$.

6.3 An Alternative Derivation When Measurements Are Normally Distributed

Consider the same heteroscedastic setting as in section 6.1 with $\sigma_{kt}$ and $\rho_k$ known constants, but now suppose that it is known that the vectors $(Y_{k0j}, Y_{k1j})$ are bivariate normal. We can then write the density for our data as a function of $\vec\mu = (\mu_{00}, \mu_{01}, \mu_{11})$ as
$$f(\vec Y; \vec\mu) = \prod_{k=0}^{1}\prod_{j=1}^{n_k} \frac{1}{2\pi\sigma_{00}\sigma_{k1}\sqrt{1-\rho_k^2}} \exp\left\{-\frac{(Y_{k0j}-\mu_{00})^2}{2\sigma_{00}^2(1-\rho_k^2)} + \frac{\rho_k(Y_{k0j}-\mu_{00})(Y_{k1j}-\mu_{k1})}{\sigma_{00}\sigma_{k1}(1-\rho_k^2)} - \frac{(Y_{k1j}-\mu_{k1})^2}{2\sigma_{k1}^2(1-\rho_k^2)}\right\}$$
$$= g(\vec\mu)\, h(\vec Y)\, \exp\left\{\sum_{l=1}^{3} c_l(\vec\mu)\, T_l(\vec Y)\right\},$$

where the familiar form of an exponential family density is found by expanding the product and grouping terms, with
$$g(\vec\mu) = \prod_{k=0}^{1}\left(2\pi\sigma_{00}\sigma_{k1}\sqrt{1-\rho_k^2}\right)^{-n_k} \exp\left\{-\sum_{k=0}^{1}\left[\frac{n_k\mu_{00}^2}{2\sigma_{00}^2(1-\rho_k^2)} - \frac{n_k\rho_k\mu_{k1}\mu_{00}}{\sigma_{00}\sigma_{k1}(1-\rho_k^2)} + \frac{n_k\mu_{k1}^2}{2\sigma_{k1}^2(1-\rho_k^2)}\right]\right\}$$
$$h(\vec Y) = \exp\left\{-\sum_{k=0}^{1}\sum_{j=1}^{n_k}\left[\frac{Y_{k0j}^2}{2\sigma_{00}^2(1-\rho_k^2)} - \frac{\rho_k Y_{k0j}Y_{k1j}}{\sigma_{00}\sigma_{k1}(1-\rho_k^2)} + \frac{Y_{k1j}^2}{2\sigma_{k1}^2(1-\rho_k^2)}\right]\right\}$$
$$c_1(\vec\mu) = \frac{\mu_{00}}{\sigma_{00}^2}, \qquad c_2(\vec\mu) = \frac{\mu_{01}}{\sigma_{01}^2(1-\rho_0^2)}, \qquad c_3(\vec\mu) = \frac{\mu_{11}}{\sigma_{11}^2(1-\rho_1^2)}$$
$$T_1(\vec Y) = \sum_{k=0}^{1}\frac{1}{1-\rho_k^2}\sum_{j=1}^{n_k}\left(Y_{k0j} - \rho_k\frac{\sigma_{00}}{\sigma_{k1}}Y_{k1j}\right) = \frac{n_0}{1-\rho_0^2}\bar Y_{00} - \frac{n_0\rho_0}{1-\rho_0^2}\frac{\sigma_{00}}{\sigma_{01}}\bar Y_{01} + \frac{n_1}{1-\rho_1^2}\bar Y_{10} - \frac{n_1\rho_1}{1-\rho_1^2}\frac{\sigma_{00}}{\sigma_{11}}\bar Y_{11}$$
$$T_2(\vec Y) = \sum_{j=1}^{n_0}\left(Y_{01j} - \rho_0\frac{\sigma_{01}}{\sigma_{00}}Y_{00j}\right) = n_0\bar Y_{01} - n_0\rho_0\frac{\sigma_{01}}{\sigma_{00}}\bar Y_{00}$$
$$T_3(\vec Y) = \sum_{j=1}^{n_1}\left(Y_{11j} - \rho_1\frac{\sigma_{11}}{\sigma_{00}}Y_{10j}\right) = n_1\bar Y_{11} - n_1\rho_1\frac{\sigma_{11}}{\sigma_{00}}\bar Y_{10}.$$

Note that in this exponential family density, $\vec\mu$ is a three-dimensional vector, and the range of $\vec\mu$ contains an open rectangle in 3 dimensions. Hence, we know that $\vec T = (T_1, T_2, T_3)$ is a complete sufficient statistic, and any unbiased estimator of a function of $\vec\mu$ that is a function of the complete sufficient statistic is the uniform minimum variance unbiased estimator. Thus for unbiased estimator $\hat{\vec\mu} = (\hat\mu_{00}, \hat\mu_{01}, \hat\mu_{11})^T$ with
$$\hat\mu_{00} = \frac{1}{n_0+n_1}\left[T_1(\vec Y) + \frac{\rho_0}{1-\rho_0^2}\frac{\sigma_{00}}{\sigma_{01}}T_2(\vec Y) + \frac{\rho_1}{1-\rho_1^2}\frac{\sigma_{00}}{\sigma_{11}}T_3(\vec Y)\right] = \frac{n_0}{n_0+n_1}\bar Y_{00} + \frac{n_1}{n_0+n_1}\bar Y_{10}$$
$$\hat\mu_{01} = \frac{T_2(\vec Y)}{n_0} + \rho_0\frac{\sigma_{01}}{\sigma_{00}}\hat\mu_{00} = \bar Y_{01} - \rho_0\frac{n_1}{n_0+n_1}\frac{\sigma_{01}}{\sigma_{00}}\bar Y_{00} + \rho_0\frac{n_1}{n_0+n_1}\frac{\sigma_{01}}{\sigma_{00}}\bar Y_{10}$$
$$\hat\mu_{11} = \frac{T_3(\vec Y)}{n_1} + \rho_1\frac{\sigma_{11}}{\sigma_{00}}\hat\mu_{00} = \bar Y_{11} + \rho_1\frac{n_0}{n_0+n_1}\frac{\sigma_{11}}{\sigma_{00}}\bar Y_{00} - \rho_1\frac{n_0}{n_0+n_1}\frac{\sigma_{11}}{\sigma_{00}}\bar Y_{10},$$
it is easily seen that this estimator is the UMVUE of $\vec\mu$. It similarly follows that $\hat\theta = \hat\mu_{11} - \hat\mu_{01}$ is the UMVUE of

$\theta = \mu_{11} - \mu_{01}$, with
$$\hat\theta = \left[\bar Y_{11} - \left(\rho_0\frac{n_1}{n_0+n_1}\frac{\sigma_{01}}{\sigma_{00}} + \rho_1\frac{n_0}{n_0+n_1}\frac{\sigma_{11}}{\sigma_{00}}\right)\bar Y_{10}\right] - \left[\bar Y_{01} - \left(\rho_0\frac{n_1}{n_0+n_1}\frac{\sigma_{01}}{\sigma_{00}} + \rho_1\frac{n_0}{n_0+n_1}\frac{\sigma_{11}}{\sigma_{00}}\right)\bar Y_{00}\right].$$
We can thus see that the UMVUE corresponds exactly to the optimal choice of $a$ found when reducing the data on each subject to a single measurement.

6.4 Use of the OLS ANCOVA Model

In the realistic setting in which we do not know $\rho_k$ or $\sigma_{kt}^2$, we could again consider the ANCOVA model $E[Z_i \mid X_i, W_i] = \beta_0 + X_i\beta_1 + W_i\beta_2$, with OLS estimates
$$\hat\beta_1 = \frac{1}{1 - r_{XW}^2}\left(\frac{V_{XZ}}{V_{XX}} - \frac{V_{XW}V_{WZ}}{V_{XX}V_{WW}}\right), \qquad \hat\beta_2 = \frac{1}{1 - r_{XW}^2}\left(\frac{V_{WZ}}{V_{WW}} - \frac{V_{XW}V_{XZ}}{V_{WW}V_{XX}}\right).$$
In the heteroscedastic setting in which the randomization ratio is kept constant (so again assuming $n_1/n \to \lambda \in (0,1)$ as $n \to \infty$), we have
$$V_{XW} \to_p 0, \qquad r_{XW} \to_p 0, \qquad V_{XX} \to_p \lambda(1-\lambda), \qquad V_{WW} \to_p \sigma_{00}^2,$$
$$V_{ZZ} \to_p (1-\lambda)\sigma_{01}^2 + \lambda\sigma_{11}^2 + \lambda(1-\lambda)(\mu_{11}-\mu_{01})^2, \qquad V_{WZ} \to_p (1-\lambda)\rho_0\sigma_{01}\sigma_{00} + \lambda\rho_1\sigma_{11}\sigma_{00},$$
$$V_{XZ} \to_p \lambda(1-\lambda)(\mu_{11}-\mu_{01}) = \lambda(1-\lambda)\theta, \qquad \hat\beta_1 \to_p \theta, \qquad \hat\beta_2 \to_p \lambda\,\rho_1\frac{\sigma_{11}}{\sigma_{00}} + (1-\lambda)\,\rho_0\frac{\sigma_{01}}{\sigma_{00}}.$$
Note that in this case, the OLS coefficient for the baseline measurement is consistent for the optimal value of $a$ only when the randomization ratio is 1:1 (i.e., $\lambda = 0.5$, a very common setting) or $\rho_1\sigma_{11} = \rho_0\sigma_{01}$ (something that is rather hard to ascertain). Otherwise, the OLS coefficient is not consistent for the optimal value of $a$.

Even if we know that our statistical model is an accurate representation of the data-generating mechanism, we know by the Gauss-Markov Theorem that the OLS estimator $\hat\beta_1$ is not the best linear unbiased estimator (BLUE) of $\theta$ unless $Var(Y_{k1j} \mid Y_{k0j})$ is a constant independent of $k$ and $j$. Otherwise, the BLUE would be the weighted least squares estimate with each observation weighted by the inverse variance of the follow-up measurement conditional on its treatment group and its baseline measurement.

As noted above, the optimal linear combination of the baseline and follow-up observations based on $a$ is of a form that would suggest performing linear regressions of the follow-up observations on the baseline

observations within each group, and then using the estimated coefficients for the baseline values to adjust the follow-up measurements. That is, we could fit linear regression models
$$E[Y_{k1j} \mid Y_{k0j}] = \gamma_{k0} + Y_{k0j}\gamma_{k1},$$
then define variables
$$\tilde Z_i = Z_i - \left(\frac{n_0}{n_0+n_1}\hat\gamma_{11} + \frac{n_1}{n_0+n_1}\hat\gamma_{01}\right)W_i,$$
and then regress $\tilde Z_i$ on $X_i$. It should be noted, however, that the $\tilde Z_i$ are now (very slightly) correlated within treatment groups, and thus the inference obtained from classical OLS on that simple linear regression model would not be entirely correct.
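As a numeric sketch of the points in section 6 (all parameter values below are hypothetical, chosen only for illustration), the code computes the optimal baseline coefficient $a$ and the large-sample limit of the OLS ANCOVA coefficient under a 3:1 randomization ratio, and confirms that the optimal $a$ gives the smaller variance:

```python
import numpy as np

# Hypothetical heteroscedastic parameters (illustrative only):
sigma00, sigma01, sigma11 = 10.0, 9.0, 14.0   # sd: baseline, control f/u, treated f/u
rho0, rho1 = 0.5, 0.8                          # within-subject correlations by arm
n0, n1 = 300, 100                              # 3:1 randomization, so lam = 1/4
lam = n1 / (n0 + n1)

# Optimal coefficient on baseline (from the variance minimization in section 6.2):
a_opt = (1 - lam) * rho1 * sigma11 / sigma00 + lam * rho0 * sigma01 / sigma00

# Large-sample limit of the OLS ANCOVA coefficient on baseline (section 6.4);
# note the weights on the two within-group slopes are swapped relative to a_opt.
b2_limit = lam * rho1 * sigma11 / sigma00 + (1 - lam) * rho0 * sigma01 / sigma00

def var_theta_hat(a):
    """Var of theta-hat(a) in the heteroscedastic setting (sigma_10 = sigma_00)."""
    v1 = (sigma11**2 + a**2 * sigma00**2 - 2 * a * rho1 * sigma11 * sigma00) / n1
    v0 = (sigma01**2 + a**2 * sigma00**2 - 2 * a * rho0 * sigma01 * sigma00) / n0
    return v1 + v0

print(a_opt, b2_limit)                                   # differ when lam != 1/2
print(var_theta_hat(a_opt) <= var_theta_hat(b2_limit))   # True: a_opt minimizes
```

With a 1:1 randomization ratio (`n0 == n1`), the two coefficients coincide, matching the consistency condition stated in section 6.4.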