Accounting for Baseline Observations in Randomized Clinical Trials

Size: px

Start display at page:

Download "Accounting for Baseline Observations in Randomized Clinical Trials"

Debra Murphy
6 years ago
Views:

1 Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA October 6, 0 Abstract In clinical trials investigating the treatment of chronic, progressive diseases, a common design is to make some physiologic measurement of the disease state at the start of the trial, randomize patients to either the experimental treatment or some control treatment (placebo), follow the patients over some period of time, and then measure once again the physiologic measurement of the disease state The question then arises: How should the baseline measurements made prior to randomization be used? In this manuscript, I consider several common strategies that might be used in this setting, including analyzing only the follow-up measurement, analyzing the change in measurements, and the analysis of covariance (ANCOVA) approach of adjusting for the baseline measurement in a linear regression model I illustrate several different ways to motivate the use of ANCOVA, as well as examining the settings in which not measuring baseline might be preferred Introduction In clinical trials investigating the treatment of chronic, progressive diseases, a common design is to make some physiologic measurement of the disease state at the start of the trial, randomize patients to either the experimental treatment or some control treatment (placebo), follow the patients over some period of time, and then measure once again the physiologic measurement of the disease state Take for instance a doubleblind, placebo controlled, randomized clinical trial (RCT) of a new drug in the setting of hypertension Upon recruiting a patient into the RCT, we would typically measure the patient s systolic blood pressure (SBP) for the purposes of determining eligibility for the trial We might further decide to stratify randomization of patients on SBP in order to ensure balance across treatment arms with respect to severity of hypertension Then after treating the patients with either the experimental treatment or placebo for the period of, say, a year, we again measure the SBP of all patients, and analyze the data to decide whether the experimental treatment is associated with a scientifically important and statistically significant lower SBP when compared to placebo The question is: How do we use the baseline (pre-randomization) measurement of SBP in the data analysis? Owing to the randomized nature of the study, there are several choices that will each estimate the causal effect of the treatment in an unbiased fashion: Ignore baseline and analyze the final measurements By virtue of randomization, the distribution of SBP at the start of the study is equal in each treatment arm Hence, any differences in the distribution of SBP at the end of the trial is directly attributable to the effect of the experimental treatment Analyze the change in measurements over the course of the study For each patient, we compute the difference between the final SBP and their baseline SBP, and then compare the treatment arms with respect to the change in measurements

2 Accounting for Baseline Measurements in RCT Emerson, Page 3 Adjust for the baseline measurement as a covariate in a linear regression model of final measurement for each patient by treatment arm 4 Adjust for the baseline measurement as a covariate in a linear regression model of the change in measures for each patient by treatment arm In my experience, the second of these methods is the one most commonly advocated by clinical investigators It seems only natural to them: we are interested in an intervention that would either change the pathophysiologic process for the better or (at least) lessen the change in the pathophysiologic process for the worse In this manuscript, I explore the settings in which one of the approaches might be preferable to the others In this exploration, I illustrate the seemingly paradoxical result that suggests that in RCT that second method (analyzing the change in measurements) is never an optimal choice, while there are situations that any of the other three is optimal It should be noted that the results presented herein are specific to RCT, because we make extensive use of the fact that the distribution of baseline measurements is known to be the same in both treatment groups Notation in the Homoscedastic Setting Let Y itj be the SBP measurement in the ith treatment group (i = for experimental treatment, i = 0 for placebo) at time t (t = 0 for the baseline (pre-randomization) measurement, t = for the end of treatment measurement) in the jth patient (j =,, n i ) Suppose Y itj (µ it, σ ), meaning that for a population receiving treatment i, the average measurement pre-randomization would be µ i0 with a standard deviation of σ, and the average measurement post-treatment would be µ i with a standard deviation of σ We presume that individuals are independent, and that repeat measurements made on the same subject in the ith group have correlation ρ Thus corr(y ij, Y i0j ) = ρ, corr(y itj, Y it j ) = 0 for j j, and corr(y itj, Y i t j ) = 0 for i i (This homoscedastic model presumes not only that the treatment does not affect the variability of the measurements, but also that the treatment does not affect the correlation of the baseline and follow-up measurements) We presume a randomized study, so we have µ 0 = µ 00 θ = µ µ 0 We are ultimately interested in estimating In order to derive general results, we consider the distribution of linear combinations of the form ij = Y ij a i Y i0j Using simple properties of expectation and covariances, we find E ij = E Y ij a i Y i0j = µ i a i µ i0 V ar ( ij ) = V ar (Y ij a i Y i0j ) = σ + a i σ a i ρσ = ( a i ρ + a ) i σ so letting we can derive moments i = n i n i ij i= E i = µi a i µ i0 V ar ( ) ai ρ + a i = i σ n i

3 Accounting for Baseline Measurements in RCT Emerson, Page 3 Now, for a specified value of a, we define ˆθ (a) = 0, and find distribution moments E ˆθ(a) = (µ a µ 0 ) (µ 0 a 0 µ 00 ) = θ (a µ 0 a 0 µ 00 ) = θ (a a 0 )µ 00 and ˆθ is unbiased for θ for arbitrary µ 00 if and only if a = a 0 = a Now we can find distribution variance ˆθ(a) V ar = ( aρ + a ) ( σ + ) n n 0 Special cases of interest are Ignore baseline and analyze the final measurements This corresponds to a = 0, in which case V ar(ˆθ (0) ) = σ ( n + n 0 ) Analyze the change in measurements over the course of the study This corresponds to a =, in which case ( V ar(ˆθ () ) = σ ( ρ) + ) n n 0 By differentiating the above expression with respect to a, we can find the form of ˆθ with minimal variability as having a = ρ, in which case V ar(ˆθ (ρ) ) = σ ( ρ ) ( + ) n n 0 Note that For ρ < 05, V ar(ˆθ (0) ) < V ar(ˆθ () ), and throwing away the baseline value is more efficient than comparing the change in measurements across treatment arms For ρ = 05, V ar(ˆθ (0) ) = V ar(ˆθ () ), and throwing away the baseline value is just as efficient as comparing the change in measurements across treatment arms (This is the only point of equality between these two estimators) 3 For ρ > 05, V ar(ˆθ (0) ) > V ar(ˆθ () ), and throwing away the baseline value is less efficient than comparing the change in measurements across treatment arms 4 For all ρ, V ar(ˆθ (ρ) ) V ar(ˆθ (0) ) and V ar(ˆθ (ρ) ) V ar(ˆθ () ) 5 For ρ = 0, V ar(ˆθ (0) ) = V ar(ˆθ (ρ) ) (This is the only point of equality between these two estimators) 6 For ρ =, V ar(ˆθ () ) = V ar(ˆθ (ρ) ) (This is the only point of equality between these two estimators) Note that the third observation above suggests that when ρ is known, the use of ˆθ (ρ) is the best linear estimator, as it is always at least as good as either of the other two approaches based on a = 0 or a =

4 Accounting for Baseline Measurements in RCT Emerson, Page 4 3 An Alternative Derivation When Measurements are Normally Distributed Consider the same homoscedastic setting as in section with σ and ρ known constants, but now suppose that it is known that vector (Y i0j, Y ij ) are bivariate normal We can then write the density for our data as a function of θ = (µ 00, µ 0, µ ) as f( Y ; θ) n i = πσ ( ρ i=0 j= ) exp (Y i0j µ 00 ) ρ(y i0j µ 00 )(Y ij µ i ) + (Y ij µ i ) σ ( ρ ) = g(θ)h( Y 3 ) exp c k (θ)t k ( Y ), k= where the familiar form of an exponential family density is found by expanding the product and grouping terms, with n0+n g( θ) = πσ exp (n 0 + n )µ 00 ρ(n 0 µ 0 + n µ )µ 00 + n µ + n 0 µ 0 ( ρ ) σ ( ρ ) ni ( ) h( Y i=0 j= Y i0j ρy i0j Y ij + Yij ) = exp σ ( ρ ) µ 00 c ( θ) = σ ( ρ ) µ 0 c ( θ) = σ ( ρ ) µ c 3 ( θ) = σ ( ρ ) T ( Y n i ) = (Y i0j ρy ij ) i=0 j= T ( Y n 0 ) = (Y 0j ρy 00j ) j= T 3 ( Y n ) = (Y j ρy 0j ) j= Note that in that exponential family density, θ is a three dimensional vector, and the range of θ contains an open rectangle in 3 dimensions Hence, we know that T is a complete sufficient statistic Furthermore, because ˆθ (ρ) = T 3 ( Y )/n T ( Y )/n 0 is unbiased for θ and a function of the complete sufficient statistic, we know that ˆθ (ρ) is the uniform minimum variance unbiased estimator of θ 4 Settings Where Not Measuring Baseline is Optimal Even when ρ is known, there is a setting in which a better approach would be to use only the final measurement

5 Accounting for Baseline Measurements in RCT Emerson, Page 5 Suppose the measurement of Y itj is extremely expensive Then the limiting factor in our experimental design might be the number of measurements we have to make In order to estimate ˆθ (ρ), we must make (n 0 + n ) measurements one for each subject at baseline and one for each subject at the end of treatment As noted above, the variance of ˆθ (ρ) is given by V ar(ˆθ (ρ) ) = σ ( ρ ) ( + ) n n 0 But suppose that we doubled the number of subjects to n 0 = n 0 and n = n and we did not measure baseline values at all In this case, we would use ˆθ (0) which has variance ( V ar(ˆθ (0) ) = σ n + ) ( n = σ + ) 0 n n 0 We can then consider the setting in which V ar(ˆθ (0) ) < V ar(ˆθ (ρ) ) as σ V ar(ˆθ (0) ) < V ar(ˆθ (ρ) ) ( + ) < σ ( ρ ) ( + ) n n 0 n n 0 < ( ρ ) ρ < = 0707 Hence, if there is no other need to measure baseline values, and if it is relatively easier and cheaper to accrue patients than to make measurements on them, then unless the correlation within subjects is extremely high (ρ > 0707), the most efficient design is to measure and use only the follow-up values at the end of treatment Of course, it is not unusual for patient eligibility to be based on the pre-randomization value of the measurement, in which case, the advantage of ignoring the baseline measurement is gone 5 Settings Where ρ is Unknown: ANCOVA The above derivations assumed that both ρ and σ were known in a homoscedastic setting It is most often the case, however, that neither ρ nor σ are known, and we must use estimates One approach to such estimation is to use a linear model in which we adjust for the baseline value as a covariate To further explore this approach, it is useful to modify our notation Define outcome vector Z, treatment vector X, and baseline vector W for k =,, n = n 0 + n by Y 0k k n 0 0 k n 0 Y 00k k n 0 Z k = X k = W k = Y j k = n 0 + j k > n 0 Y 0j k = n 0 + j We then fit ordinary least squares (OLS) model EZ k X k, W k = β 0 + X k β + W k β

6 Accounting for Baseline Measurements in RCT Emerson, Page 6 obtaining OLS estimates ˆβ = rxw ˆβ = rxw VXZ V XW V W Z V XX V XX V W W VW Z V XW V XZ V W W V XX V W W, where for sample means Z, X, and W S XW r W X = SXX S W W n S XX = (X k X ), k= V XX = S XX /n, n S XW = (X k X )(W k W ), k= V XW = S XW /n, with similar definitions for V W W, V XZ, and V W Z Now by randomization, we know that W k and X k are independent Thus under the assumption that the randomization ratio approaches some constant (ie, n /n λ (0, ) as n ), the consistency of sample moments for population moments and the properties of convergence in probability then provide that V XW p 0, r XW p 0, V XX p λ( λ), V W W p σ, V ZZ p σ + λ( λ)(µ µ 0 ), V W Z ρσ, V XZ p λ( λ)(µ µ 0 ) = λ( λ)θ, ˆβ p θ, ˆβ p ρ Thus, this analysis of covariance (ANCOVA) model is essentially using a consistent estimate for ρ as the coefficient for the baseline value Further statements can be made if we know that our statistical model is an accurate representation of the data generation mechanism Because the treatment assignment is binary, the only issue is whether some linear relationship exists between the baseline and follow-up measurement within each dose group But if this does hold, then in the homoscedastic setting (irrespective of the distribution of the errors including irrespective of whether the errors are identically distributed) we know by the Gauss-Markov Theorem that ˆβ is the best linear unbiased estimator (BLUE) for θ Sometimes an investigator is extremely insistent on analyzing the change in measurements They then

7 Accounting for Baseline Measurements in RCT Emerson, Page 7 fit a model in which they define Y 0k Y 00k k n 0 Z k = Y j Y 0k k = n 0 + j X k = 0 k n 0 Y 00k k n 0 W k = k > n 0 Y 0j k = n 0 + j and fit ordinary least squares (OLS) model EZ k X k, W k = α 0 + X k α + W k α This approach provides the exact same inference about the treatment effect, because α 0 = β 0, α = β, and α = β and OLS estimates ˆα 0 = ˆβ 0, ˆα = ˆβ, and ˆα = ˆβ 6 Notation in the General Heteroscedastic Setting Let Y itj be the SBP measurement in the ith treatment group (i = for experimental treatment, i = 0 for placebo) at time t (t = 0 for the baseline (pre-randomization) measurement, t = for the end of treatment measurement) in the jth patient (j =,, n i ) Suppose Y itj (µ it, σit ), meaning that for a population receiving treatment i, the average measurement pre-randomization would be µ i0 with a standard deviation of σ i0, and the average measurement post-treatment would be µ i with a standard deviation of σ i We presume that individuals are independent, and that repeat measurements made on the same subject in the ith group have correlation ρ i Thus corr(y ij, Y i0j ) = ρ i, corr(y itj, Y it j ) = 0 for j j, and corr(y itj, Y i t j ) = 0 for i i (In this heteroscedastic setting, we consider that the treatment can affect both the variability of the follow-up measurements, as well as the correlation between the baseline and follow-up measurements) We presume a randomized study, so we have µ 0 = µ 00 and σ 0 = σ 00 We are ultimately interested in estimating θ = µ µ 0 In order to derive general results, we consider the distribution of linear combinations of the form ij = Y ij a i Y i0j Using simple properties of expectation and covariances, we find E ij = E Y ij a i Y i0j = µ i a i µ i0 V ar ( ij ) = V ar (Y ij a i Y i0j ) = σ i + a i σ i0 a i ρ i σ i σ i0 so letting we can derive moments i = n i n i ij i= E i = µi a i µ i0 V ar i = σ i + a i σ i0 a iρ i σ i σ i0 n i Now, we define ˆθ = 0, and find distribution moments E ˆθ = (µ a µ 0 ) (µ 0 a 0 µ 00 ) = θ (a µ 0 a 0 µ 00 ) = θ (a a 0 )µ 00

8 Accounting for Baseline Measurements in RCT Emerson, Page 8 and ˆθ is unbiased for θ for arbitrary µ 00 if and only if a = a 0 = a Now we can find distribution variance V ar ˆθ = σ + a σ0 aρ σ σ 0 + σ 0 + a σ00 aρ 0 σ 0 σ 00 n n ( 0 = σ + σ 0 + a σ00 + ) ( ρ σ aσ 00 + ρ ) 0σ 0 n n 0 n n 0 n n 0 By differentiating the above expression with respect to a, we can find the form of ˆθ with minimal variability as having ( ) ( ) n0 σ n σ 0 σ σ 0 a = ρ + ρ 0 = ( λ) ρ + λρ 0, n 0 + n σ 0 n 0 + n σ 00 σ 00 σ 00 where λ = n /(n 0 + n ) reflects the randomization ratio Note that this is a weighted average of the slope parameter from a regression of Y j on Y 0j and the slope parameter from a regression of Y 0j on Y 00j In the realistic setting in which we do not know ρ i or σit, we would again consider the ANCOVA model EZ k X k, W k = β 0 + X k β + W k β with OLS estimates ˆβ = rxw ˆβ = rxw VXZ V XW V W Z V XX V XX V W W VW Z V XW V XZ V W W V XX V W W In the heteroscedastic setting in which the randomization ratio is kept constant (so again assuming n /n λ (0, ) as n ), we have V XW p 0, r XW p 0, V XX p λ( λ), V W W p σ 00, V ZZ p ( λ)σ 0 + λσ + λ( λ)(µ µ 0 ), V W Z ( λ)ρ 0 σ 00 σ 0 + λρ σ 00 σ, V XZ p λ( λ)(µ µ 0 ) = λ( λ)θ, ˆβ p θ, ˆβ p λρ σ σ 0 + ( λ) ρ 0 σ 0 σ 00 Note that in this case, the OLS coefficient for the baseline measurement is not consistent for the optimal value of a, unless the randomization ratio is : (ie, λ = 05) or ρ σ = ρ 0 σ 0 (something that is rather hard to ascertain) If we know that our statistical model is an accurate representation of the data generation mechanism, we know by the Gauss-Markov Theorem that the OLS estimator ˆβ is not the best linear unbiased estimator

9 Accounting for Baseline Measurements in RCT Emerson, Page 9 (BLUE) for θ unless V ar(y ij Y i0j ) is a constant independent of i and j Otherwise, the BLUE would be the weighted least squares estimate with each observation weighted by the inverse variance of the follow-up measurement conditional on its treatment group and its baseline measurement As noted above, the optimal linear combination of the baseline and follow-up observations based on a is of a form that would suggest performing linear regressions of the follow-up observations on the baseline observations within each group, and then using the estimated coefficients for the baseline values to adjust the follow-up measurements That is, we could fit linear regression models E Y ij Y i0j = γ 0i + Y i0j γ i, and then define variables (( ) ( ) ) Zk n0 n = Z k ˆγ 0 + ˆγ W k, n 0 + n n 0 + n and then regress Zk on X k It should be noted, however, that the Zk are now correlated within treatment groups, and thus the inference obtained from classical OLS on that simple linear regression model would not be correct

Accounting for Baseline Observations in Randomized Clinical Trials

Accounting for Baseline Observations in Randomized Clinical Trials Scott S Emerson, MD, PhD Department of Biostatistics, University of Washington, Seattle, WA 9895, USA August 5, 0 Abstract In clinical