Power analyses for longitudinal trials and other clustered designs


STATISTICS IN MEDICINE
Statist. Med. 2004; 23 (DOI: 10.1002/sim.1869)

X. M. Tu (1, 2), J. Kowalski (3), J. Zhang (4), K. G. Lynch (4) and P. Crits-Christoph (5)

(1) Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Rochester, NY 14642, USA
(2) Department of Psychiatry, University of Rochester, 601 Elmwood Avenue, Rochester, NY 14642, USA
(3) Department of Oncology and Biostatistics, Johns Hopkins University, USA
(4) Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, USA
(5) Department of Psychiatry, University of Pennsylvania School of Medicine, USA

Correspondence to: X. M. Tu, Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Box 630, Rochester, NY 14642, USA. E-mail: xin_tu@urmc.rochester.edu
Author positions, in author order: Professor; Assistant Professor; Graduate Student; Assistant Professor; Professor.
Contract/grant sponsor: NIH/MINH; contract/grant number: P50-MH
Contract/grant sponsor: NIH/NIAD; contract/grant number: K22AI 586
Received January 2004; Accepted March 2004
Copyright © 2004 John Wiley & Sons, Ltd.

SUMMARY

Existing methods for power and sample size estimation for longitudinal and other clustered study designs have limited applications. In this paper, we review and extend existing approaches to address these limitations. In particular, we focus on power analysis for the two most popular approaches for clustered data analysis, the generalized estimating equations and the linear mixed-effects models. By basing the derivation of the power function on the asymptotic distribution of the model estimates, the proposed approach provides estimates of power that are consistent with the methods of inference for data analysis. The proposed methodology is illustrated with numerous examples that are motivated by real study designs.

KEY WORDS: epidemiological study; GEE; HIV; linear mixed-effects models; intraclass correlation; psychosocial and survey research

1. INTRODUCTION

Power and sample size estimation constitutes an important component in the design and planning of modern clinical trials. It provides information for assessing the feasibility of a study to

detect some pre-specified effect size and for estimating the amount of resources necessary for its execution in both efficacy and effectiveness research [1–3]. Although the past two decades have witnessed major advances in the statistical methods for clustered data analysis, particularly in longitudinal studies, most research efforts have been centred around data analysis, with little attention paid to power and sample-size estimation. As a result, the development of methods for the latter has been evolving at a much slower pace. Often, methods developed for cross-sectional study designs are used to provide power and sample-size estimates for longitudinal studies. Only recently has attention been paid to the effect of within-subject correlations in such study designs [4–10]. Despite these new developments, current methods still have limited applications, especially in the presence of continuous covariates (predictors).

In this paper, we review and extend current work on power analysis for the two most popular clustered data approaches, the generalized estimating equations (GEE) and the linear mixed-effects models (LMM). Existing GEE- and LMM-based methods only apply to group comparisons and relatively simple study designs. By extending these approaches to more complex designs and developing new analytic methods based on the asymptotic distribution of estimates, we are able to provide more accurate power estimates. In Section 2, we develop the general approach and highlight its differences with respect to existing alternatives. In Section 3, we illustrate the approach with examples based on real design considerations for biomedical, epidemiological and psychosocial research studies. In Section 4, we discuss limitations of the proposed approach and directions for future research.

2. POWER ANALYSIS FOR CLUSTERED DATA

Consider a study of $n$ clusters, indexed by $i$ ($1 \le i \le n$), each of size $m$. Let $y_{it}$ denote the response and $x_{it}$ the vector of predictors (or covariates) of the $t$th member from the $i$th cluster. Note that we have assumed a common cluster size. We discuss the implications of this assumption and extensions to accommodate varying cluster sizes in Discussion. Note also that $m$ is considered to be fixed throughout the development so that asymptotic methods apply for large $n$.

The two most popular approaches for regression analysis with clustered data, especially for data from longitudinal studies, are the GEE [11, 12] and the mixed-effects models (MM) [13–23]. For data analysis, the latter has the advantage of being able to tease apart the between- and within-cluster variability, while the former class of models has the much desired distribution-free property for robust inference. However, in most applications, power estimates are typically desired for detecting differences in the (marginal) mean response or fixed effects. Thus, the advantage of MM in obtaining cluster-specific variance estimates is no longer an important consideration. On the other hand, the distribution-free property of the GEE becomes especially desirable for power analysis. Without real data, it would be impossible to verify a parametric distribution model, and the robust property of GEE will ensure reliable estimates regardless of the data distribution. However, the final decision rests upon which approach will be used for data analysis. For example, it makes little sense to compute power based on an MM model when the study data will be analysed by the GEE.

Regardless of which approach is used, we consider regression models that link $x_{it}$ to $y_{it}$ through a linear predictor $x_{it}^{\top}\beta$ or a function of such a predictor, where $\beta$ is the vector of parameters of interest. Under GEE, such a regression for a continuous or binary response is

defined by the marginal mean and variance of the response in terms of the generalized linear models [11, 12]:

$$\mu_{it} = E[y_{it} \mid x_{it}] = h(x_{it}^{\top}\beta), \quad \mathrm{Var}(y_{it} \mid x_{it}) = \sigma^2 v(\mu_{it}), \quad 1 \le i \le n, \ 1 \le t \le m \qquad (1)$$

where $h(\cdot)$ is a known link function, $\sigma^2$ a scale parameter and $v(\cdot)$ a known function. Under MM, the serial correlations of the joint distribution of the clustered responses are explicitly accounted for by introducing some latent variables (random effects). For a continuous response, the LMM is defined by:

$$y_{it} = x_{it}^{\top}\beta + z_{it}^{\top} b_i + \epsilon_{it} \quad \text{or} \quad y_i = X_i^{\top}\beta + Z_i^{\top} b_i + \epsilon_i, \qquad b_i \sim N(0, D), \ \epsilon_i \sim N(0, \sigma^2 I_m), \ 1 \le t \le m \qquad (2)$$

where $D$ denotes the variance of the random effect $b_i$, $\sigma^2$ the variance of the model error $\epsilon_i$, $I_m$ the $m \times m$ identity matrix, and $X_i = (x_{i1}, \ldots, x_{im})$ and $Z_i = (z_{i1}, \ldots, z_{im})$ the design matrices for the fixed and random effects, respectively. In most applications, $Z_i = X_i$.

In this paper, we limit our discussion of the MM-based approach to LMM. A primary reason for focusing on LMM rather than more general mixed-effects models, such as the generalized linear mixed-effects models for discrete responses, is the large number of different approaches proposed for inference and the complexity in the asymptotic distributions of model estimates [14–23]. Also, for GEE, we limit our consideration to (1) for continuous and binary responses. However, the proposed approach is readily generalized for GEE models for categorical responses [24].

For power analysis, we consider the class of general linear hypotheses of the form:

$$H_0: K\beta = b, \qquad H_a: K\beta = d \ne b \qquad (3)$$

where $K$ is a known full rank $s \times p$ matrix ($p$ is the dimension of $\beta$), and $b$ and $d$ are $s \times 1$ vectors of known constants. The linear hypothesis (3) is the most general class of hypotheses that has been systematically studied for both the classic linear and generalized linear models, and it applies to virtually all types of hypotheses concerning $\beta$ arising in practical studies. When $b = 0$, $H_0$ is known as a linear contrast. For non-zero $b$, we can re-express (3) in terms of a contrast by performing the parameter transformation $\gamma = \beta - K^{\top}(K K^{\top})^{-1} b$ (e.g. Reference [25]), so that $K\gamma = 0$ under $H_0$. When expressed in terms of $\gamma$, (1) becomes:

$$\mu_{it} = h(c_{it} + x_{it}^{\top}\gamma), \quad \mathrm{Var}(y_{it} \mid x_{it}) = \sigma^2 v(\mu_{it}), \quad 1 \le i \le n, \ 1 \le t \le m \qquad (4)$$

where $c_{it} = x_{it}^{\top} K^{\top}(K K^{\top})^{-1} b$ is a known constant. For linear regressions, $c_{it}$ is often absorbed into the response by defining a new response variable (e.g. Reference [25]). For non-linear models, $c_{it}$ is often called an offset (e.g. Reference [26]). Since $c_{it}$ has no effect on power, we focus on the class of linear contrasts, i.e. $b = 0$ in (3). We develop power functions for such contrasts under each approach. We start with our considerations for GEE.

2.1. Generalized estimating equation

Under GEE, estimates of $\beta$ are obtained as solutions to a generalized estimating equation [11]. Since the GEE procedure has been well documented and extensively discussed in the literature, we only highlight the results that are most relevant to the current development.
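The reparameterization that turns a hypothesis with non-zero $b$ into a linear contrast can be checked numerically. The following is a minimal sketch, assuming NumPy; the matrix `K`, the vector `beta` and the covariate vector `x` are made-up illustrative values, and the function name `to_contrast` is hypothetical rather than taken from the paper.

```python
import numpy as np


def to_contrast(K, b, beta):
    """Return (gamma, shift) with shift = K^T (K K^T)^{-1} b and gamma = beta - shift.

    If K beta = b, then K gamma = 0, i.e. the hypothesis becomes a linear contrast.
    """
    K = np.asarray(K, dtype=float)
    b = np.asarray(b, dtype=float)
    beta = np.asarray(beta, dtype=float)
    shift = K.T @ np.linalg.solve(K @ K.T, b)
    return beta - shift, shift


# Illustrative (assumed) values: s = 2 constraints on p = 3 parameters.
K = np.array([[0.0, 1.0, -1.0],
              [1.0, 0.0, 0.0]])
beta = np.array([2.0, 1.5, 0.5])
b = K @ beta                      # choose b so that H0: K beta = b holds exactly

gamma, shift = to_contrast(K, b, beta)

x = np.array([1.0, 0.3, 2.0])     # one covariate vector x_it
c_it = x @ shift                  # the known constant (offset) c_it in (4)
print(K @ gamma)                  # numerically zero
print(x @ beta, c_it + x @ gamma) # the linear predictor is unchanged
```

The linear predictor is split into a known offset and a part governed by $\gamma$, which is why the remaining development can restrict attention to $b = 0$.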

Let $h_{it} = h(x_{it}^{\top}\beta)$ and

$$A_i = \mathrm{diag}[v(h_{it})], \quad y_i = (y_{i1}, \ldots, y_{im})^{\top}, \quad h_i = (h_{i1}, \ldots, h_{im})^{\top},$$
$$V_i = \sigma^2 A_i^{1/2} W(\rho) A_i^{1/2}, \quad D_i = \frac{\partial h_i^{\top}}{\partial \beta}, \quad S_i = y_i - h_i, \quad U_i = D_i V_i^{-1} S_i \qquad (5)$$

where $\mathrm{diag}[v(h_{it})]$ denotes a diagonal matrix with $v(h_{it})$ on the $t$th diagonal and $W(\rho)$ a working correlation matrix modelled by the parameter vector $\rho$. The working correlation matrix, $W(\rho)$, is generally not equal to the true within-cluster correlation of $y$, and its role is to increase efficiency (or decrease standard errors) [11, 27]. A GEE estimate, $\hat\beta$, obtained by solving the GEE, is both consistent and asymptotically normal, i.e.

$$\sqrt{n}(\hat\beta - \beta) \to_d N(0, \Sigma_\beta = B^{-1} \Sigma_U B^{-1}), \qquad B = E(D V^{-1} D^{\top}), \quad \Sigma_U = E[D V^{-1} S S^{\top} V^{-1} D^{\top}] \qquad (6)$$

where $\to_d$ denotes convergence in distribution [28, 29]. Thus, for large sample size, $\hat\beta$ has an approximate normal distribution, $N(\beta, \Sigma_\beta / n)$, which is the basis for inference about $\beta$. For data analysis, the unknown quantities in (6), including $B$ and $\Sigma_U$, are estimated from the observed data. By substituting these estimates into (6), we obtain a robust asymptotic variance estimate of $\hat\beta$. In other words, with real data, we can obtain an estimate of $\Sigma_\beta$, regardless of the choice of $W(\rho)$.

For power analysis, however, the situation is quite different. Prior to data collection, estimated values are not available for any of these parameters. In this case, we set $W(\rho)$ equal to the true correlation matrix, from which it follows that:

$$\Sigma_U = E[D V^{-1} E(S S^{\top} \mid x) V^{-1} D^{\top}] = B, \qquad \Sigma_\beta = E^{-1}(D V^{-1} D^{\top}) \qquad (7)$$

In most applications of power analysis, this true correlation matrix is often modelled as a function of the time lag $t - s$ [7, 30]. For example, under the uniform compound symmetry structure (e.g. Reference [7]), the correlation between $y_s$ and $y_t$ is modelled as a constant, $\rho_{st} = \rho$ ($s \ne t$). This assumption may oversimplify the structure of the correlation of $y$, but other alternatives may be used to reach a compromise between analytic simplicity and reality. Modelling such correlation structures has been extensively discussed in the literature and is not repeated here.

It follows from standard asymptotic theory (e.g. Reference [31]) and (6) that $K\hat\beta$ has a limiting normal distribution under either $H_0$ or $H_a$, i.e.

$$H_0: \sqrt{n} K\hat\beta \to_d N(0, K \Sigma_\beta K^{\top}); \qquad H_a: \sqrt{n}(K\hat\beta - d) \to_d N(0, K \Sigma_\beta K^{\top}) \qquad (8)$$

In addition, the centred quadratic statistic, $Q_{n0}^2 = n[K(\hat\beta - \beta)]^{\top}(K \Sigma_\beta K^{\top})^{-1}[K(\hat\beta - \beta)]$, has an asymptotic central $\chi^2$ distribution, $Q_{n0}^2 \to_d \chi_s^2(0)$, where $\chi_s^2(c)$ denotes a $\chi^2$ distribution with $s$ degrees of freedom and non-centrality parameter $c$. In other words, under $H_0$ and $H_a$, the quadratic or Wald statistic, $Q_n^2 = n(K\hat\beta)^{\top}(K \Sigma_\beta K^{\top})^{-1}(K\hat\beta)$, has approximately a central and non-central $\chi_s^2$ distribution, respectively:

$$H_0: Q_n^2 \sim \chi_s^2(0); \qquad H_a: Q_n^2 \sim \chi_s^2(c) \qquad (9)$$

where $c = n d^{\top}(K \Sigma_\beta K^{\top})^{-1} d$. Let $F_{\chi_s^2(c)}$ denote the cdf of $\chi_s^2(c)$. For a given level of type I error $\alpha$, let $p_\alpha$ denote the $(1 - \alpha)$th percentile of $\chi_s^2(0)$. The power function, $\Psi$, for the linear contrasts (3) based on the Wald statistic is given by:

$$\Psi(n, c, \alpha) = 1 - F_{\chi_s^2(c)}(p_\alpha), \qquad 1 - \alpha = F_{\chi_s^2(0)}(p_\alpha) \qquad (10)$$

For a given sample size $n$, $\Psi$ above provides the power for detecting $H_a$ in (3). Alternatively, (10) can be used to estimate the required sample size to achieve a pre-specified power. In this case, $\Psi$ is given and the minimum sample size required to reach $\Psi$ is the solution to the first equation in (10). This non-linear equation is numerically solved by root-finding methods such as Newton's algorithm (e.g. Reference [32]). Most commercially available software packages implement one or more such root-finding methods. For example, in SAS, the function, NLPFDD, computes the root of a general non-linear function using Newton's method, where the required first-order derivatives are either analytically calculated or numerically approximated, depending on the complexity of the non-linear function. Sample size estimates are obtained by rounding off the solutions using the largest integer function.

Note that the power function (10) is derived based on the asymptotic distribution of the GEE estimates. As a result, the power function depends on $x_{it}$ through its distribution, rather than a set of specific values, as in other existing methods for linear and generalized linear models [4, 8]. Within our context, $x_{it}$ can contain both discrete and continuous variables. Although for discrete $x_{it}$ both the proposed and existing approaches yield the same power functions, they are derived based on different principles. We highlight this difference with some specific models below.

2.1.1. Repeated analysis of variance models

First, consider the class of repeated analysis of variance (RANOVA) models. Such models generalize the traditional analysis of variance (ANOVA) models to a longitudinal setting with repeated assessments [8]. However, as the repeated measures give rise to correlated outcomes, methods for clustered data must be used to address the within-cluster correlations. Let $y_{kit}$ denote the $t$th repeated measure from the $i$th subject within the $k$th group for $1 \le i \le n_k$, $1 \le k \le g$ and $1 \le t \le m$. By identifying the repeated responses of the same subject as a cluster, (1) for modelling the group means over time is given by:

$$E(y_{kit}) = \mu_{kt}, \quad \mathrm{Var}(y_{kit}) = \sigma_t^2, \quad 1 \le i \le n_k, \ 1 \le k \le g, \ 1 \le t \le m \qquad (11)$$

or, in a matrix form, as:

$$E(y_{ki}) = \mu_k, \quad V = \mathrm{Var}(y_{ki}) = \mathrm{diag}(\sigma_s)(\rho_{st})\,\mathrm{diag}(\sigma_t), \quad 1 \le i \le n_k, \ 1 \le k \le g \qquad (12)$$

where $\mathrm{diag}(\sigma_s)$ denotes a diagonal matrix with $\sigma_s$ on the $s$th diagonal, $(\rho_{st})$ the correlation matrix between the within-cluster responses and

$$y_{ki} = (y_{ki1}, \ldots, y_{kim})^{\top}, \quad \mu_k = (\mu_{k1}, \ldots, \mu_{km})^{\top}, \quad \mu = (\mu_1^{\top}, \ldots, \mu_g^{\top})^{\top} \qquad (13)$$

Note that implicit in (11) or (12) is the assumption that the variance matrix, $\mathrm{Var}(y_{ki})$, is the same across all groups. To compute power and sample size estimates, $(\rho_{st})$ must be specified. For example, $(\rho_{st})$ may be assumed to follow the compound symmetry model, $C(\rho)$, where $\rho$ denotes the within-subject correlation. In this case, $V = \mathrm{diag}(\sigma_s) C(\rho)\,\mathrm{diag}(\sigma_t)$.
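The power function (10) and the sample-size search it implies can be sketched in a few lines. This is a minimal sketch, assuming SciPy; it uses `scipy.optimize.brentq` (a bracketing root finder) in place of the Newton iteration mentioned in the text, rounds the solution up to the next integer, and all function names and numerical inputs below are hypothetical choices for illustration. The toy design is a single group ($g = 1$) whose time-averaged mean is tested against zero under a compound symmetry $V$.

```python
import math

import numpy as np
from scipy.optimize import brentq
from scipy.stats import chi2, ncx2


def compound_symmetry(sd, rho):
    """V = diag(sd) C(rho) diag(sd); C(rho) has 1 on the diagonal, rho elsewhere."""
    sd = np.asarray(sd, dtype=float)
    m = sd.size
    C = (1.0 - rho) * np.eye(m) + rho * np.ones((m, m))
    return np.diag(sd) @ C @ np.diag(sd)


def wald_power(n, K, d, Sigma, alpha=0.05):
    """Power (10): 1 - F_{chi2_s(c)}(p_alpha), with c = n d^T (K Sigma K^T)^{-1} d."""
    K = np.atleast_2d(np.asarray(K, dtype=float))
    d = np.atleast_1d(np.asarray(d, dtype=float))
    s = K.shape[0]
    c = float(n * d @ np.linalg.solve(K @ Sigma @ K.T, d))
    p_alpha = chi2.ppf(1.0 - alpha, df=s)
    if c == 0.0:                       # central case: power equals alpha
        return float(chi2.sf(p_alpha, df=s))
    return float(ncx2.sf(p_alpha, df=s, nc=c))


def required_n(target, K, d, Sigma, alpha=0.05, n_max=1e5):
    """Smallest integer n whose power reaches `target` (power is increasing in n)."""
    root = brentq(lambda n: wald_power(n, K, d, Sigma, alpha) - target, 1e-8, n_max)
    return math.ceil(root)


# Assumed inputs: m = 3 assessments, sd 4 at each time, within-subject correlation 0.5.
V = compound_symmetry([4.0, 4.0, 4.0], rho=0.5)
K = np.ones((1, 3)) / 3.0              # contrast: mean response averaged over time
d = [1.0]                              # alternative value of K mu

print(wald_power(70, K, d, V))
print(required_n(0.80, K, d, V))
```

A bracketing method is used here only because power is monotone in $n$, which makes the bracket easy to state; any root finder, including Newton's method, solves the same equation.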

Let

$$\bar{y}_k = \frac{1}{n_k}\sum_{i=1}^{n_k} y_{ki}, \quad \hat\mu = (\bar{y}_1^{\top}, \ldots, \bar{y}_g^{\top})^{\top}, \quad n = \sum_{k=1}^{g} n_k, \quad p_k = \lim_{n \to \infty} \frac{n_k}{n}, \quad D = \mathrm{diag}(p_k) \qquad (14)$$

Then, it follows from (6) that the GEE estimate, $\hat\mu$, has the asymptotic distribution:

$$\sqrt{n}(\hat\mu - \mu) \to_d N(0, \Sigma_\mu = D^{-1} \otimes V) \qquad (15)$$

where $\otimes$ denotes the Kronecker product (e.g. Reference [33]). The power function for the linear contrasts (3) (with $\beta$ replaced by $\mu$) is given by (9) and (10) after substituting $\Sigma_\mu$ for $\Sigma_\beta$. Note that $\Sigma_\mu$ in (15) is the asymptotic variance. Thus, $p_k$ refers to the limiting proportion of group $k$ in the study sample. When used for a finite sample size, $p_k$ is simply set to the estimate $\hat{p}_k = n_k / n$, which is the proportion of group $k$ in the sample. Thus, $\Sigma_\mu$ is determined by the distribution of $\hat{p}_k$, rather than conditioned upon a particular sample, as in other existing methods (e.g. Reference [8]). This subtle difference is best illustrated with continuous predictors (covariates) as discussed in the next section.

Note also that by ignoring the ordered structure of the repeated assessments (over time), the RANOVA can be applied to cross-sectional clustered study designs in survey and epidemiological research [10, 34]. The only difference when applying to such a context is that in most applications, the ordered nature of the repeated assessments no longer exists, i.e. all elements of $y_{ki}$ are exchangeable, and thus hypotheses of interest only concern group means rather than mean vectors defined for each of the assessment times as in the context of longitudinal studies. For example, if we want to compare $g$ groups each with mean $\mu_k$ ($1 \le k \le g$), we simply set $\mu_k = (\mu_k, \ldots, \mu_k)^{\top}$ in (13) and proceed otherwise as above. As a special case, for $g = 2$ and under the uniform compound symmetry correlation structure, this yields the procedure proposed by Manatunga et al. [10] in the case of a common across-group cluster size. Note that the authors also considered a general case with varying cluster sizes. We discuss extensions of our approach to this more general setting in Discussion.

2.1.2. Linear regression models with continuous covariates

By setting $h_{it}$ to the identity link, we obtain from (1) the class of linear regression models for repeated measures:

$$E(y_{it} \mid x_{it}) = x_{it}^{\top}\beta, \quad \mathrm{Var}(y_{it} \mid x_{it}) = \sigma_t^2, \quad 1 \le t \le m, \ 1 \le i \le n \qquad (16)$$

This class of models is quite popular in growth curve analysis. As in the case of RANOVA, the GEE estimate $\hat\beta$ again can be expressed in closed form:

$$\hat\beta = \Big(\sum_{i=1}^{n} X_i V^{-1} X_i^{\top}\Big)^{-1}\Big(\sum_{i=1}^{n} X_i V^{-1} y_i\Big)$$

where $X_i = (x_{i1}, \ldots, x_{im})$. For the class of linear contrasts (3), the power is given by (10), with $\Sigma_\beta = E^{-1}(X V^{-1} X^{\top})$. To compute power, we must evaluate $E(X V^{-1} X^{\top})$. To this end, note that

$$E(X V^{-1} X^{\top}) = E[(x_1, \ldots, x_m) V^{-1} (x_1, \ldots, x_m)^{\top}] = \Big[c_{jk} = \sum_{l, r} v^{lr} E(x_{jl} x_{kr})\Big] \qquad (17)$$

where $v^{lr}$ denotes the $lr$th element of $V^{-1}$ and $E(x_{jl} x_{kr})$ the $jk$th element of $E(x_l x_r^{\top})$. Thus, $\Sigma_\beta$ is a function of $V$ and $E(x_s x_t^{\top})$ ($1 \le s, t \le m$). Given $V$, $E(X)$ and $\mathrm{Var}(X)$, $E(X V^{-1} X^{\top})$ is readily evaluated. Like RANOVA, $\Sigma_\beta$ is determined by the distribution of $x_{it}$, or more precisely, the mean and variance of $x_{it}$ in this case. In comparison, existing methods condition upon a particular sample of $x_{it}$ and thus require that the values of $x_{it}$ be known for each subject in the entire sample [4]. This is quite unrealistic in practice. The current approach only requires the mean and variance of $x_{it}$ and in most applications, it is possible to obtain estimates of these parameters from other similar studies, especially when $x_{it}$ only contains one or two covariates.

Note that for regression analysis with non-clustered study designs, a common approach is to use the expectation of $R^2 = SSR/SSTO$ as the effect size for power and sample size estimation (e.g. Reference [35, Chapter 9]), where SSR and SSTO denote the regression and total sums of squares, respectively (e.g. Reference [36, Chapter 7]). Such an approach does not require the distribution of the predictors. A primary disadvantage of this approach is that it does not provide power for specific contrasts of interest. In addition, such an approach requires a normal data distribution and may not be readily generalized to the GEE setting.

2.1.3. Generalized linear models for binary response

For a binary response $y_{it}$, the model in (1) has the following form:

$$E(y_{it} \mid x_{it}) = h_{it} = h(x_{it}^{\top}\beta), \quad \mathrm{Var}(y_{it} \mid x_{it}) = h_{it}(1 - h_{it}), \quad 1 \le t \le m, \ 1 \le i \le n \qquad (18)$$

For binary data, the within-cluster correlation does not have as straightforward an interpretation as for continuous responses. If estimates of such correlations are available from other similar studies, then they can be used for the variance matrix of the responses. Otherwise, it is probably more convenient to express the correlations as a function of transition probabilities as in one-step Markov models. For example, it is readily shown that the within-cluster correlation between $y_{is}$ and $y_{it}$ is given by:

$$\mathrm{Corr}(y_{is}, y_{it} \mid x_{is}, x_{it}) = \frac{\Pr[y_{it} = 1 \mid y_{is} = 1, x_{is}, x_{it}] - h_{it}}{\sqrt{(1 - h_{is})(1 - h_{it})}} \sqrt{\frac{h_{is}}{h_{it}}}, \quad s \ne t \qquad (19)$$

Here, the one-step transition probability, $\Pr[y_{it} = 1 \mid y_{is} = 1, x_{is}, x_{it}]$, is generally easier to interpret than the correlation $\mathrm{Corr}(y_{is}, y_{it} \mid x_{is}, x_{it})$. In addition, for most applications, we may also want to approximate $\Pr[y_{it} = 1 \mid y_{is} = 1, x_{is}, x_{it}]$ with $\Pr[y_{it} = 1 \mid y_{is} = 1]$, which is even easier to specify. For example, $\Pr[y_{it} = 1 \mid y_{is} = 1]$ is simply the transition probability of observing a response at time $t$, given the same response at time $s$.

In general, for non-linear links such as the logit link, $\Sigma_\beta = E^{-1}(D V^{-1} D^{\top})$ is not in closed form. In addition, unlike linear models, $E^{-1}(D V^{-1} D^{\top})$ no longer depends on $x_t$ only through its first two moments, and a distribution of $x_t$ must be assumed to evaluate this quantity. Let $F(x)$ denote the probability distribution function of $X$. Then, we have:

$$\Sigma_\beta = E^{-1}(D V^{-1} D^{\top}) = \Big[\int D(x) V^{-1}(x) D^{\top}(x)\, dF(x)\Big]^{-1} \qquad (20)$$

For discrete $x_t$, the above is readily expressed in closed form (see also Example 3 in Section 3). For continuous $x_t$, (20) is generally not in closed form. One way to approximate it is through Monte Carlo (MC) simulations. For example, by generating a sample of size $M$ from the distribution of $X$, we can approximate $\Sigma_\beta$ by the sample average:

$$\Big[\frac{1}{M}\sum_{k=1}^{M} D_k V_k^{-1} D_k^{\top}\Big]^{-1} \qquad (21)$$

The accuracy of the MC approximation improves as $M$ increases. In addition, the MC sample size $M$ can even be selected to ensure that the MC approximation achieves required accuracies [37]. These are well known facts and are not further discussed.

Note that for logistic regression with non-clustered data, Whittemore [38] proposed an approach to approximate the asymptotic variance of parameter estimates when the response probability is small. However, even in this special case, the approach only works for some special types of distributions. The MC-based approach above is more flexible in terms of accommodating a mixture of both continuous and discrete predictors. In addition, the accuracy of the MC approximation (21) is only a function of the MC sample size and, unlike Whittemore's approximation, does not depend on the magnitude of the response probability.

2.2. Linear mixed-effects model

Published studies have considered relatively simple designs such as RANOVA with two groups [9] and growth curve models with no predictors [5]. We generalize these approaches to accommodate multiple groups in the former and predictors in the latter.

It follows from (2) that the conditional variance of $y_i$ given $X_i$ is given by $V_i = Z_i^{\top} D Z_i + \sigma^2 I_m$. Thus, the maximum likelihood estimate (MLE) of $\beta$ is given by:

$$\hat\beta = \Big[\frac{1}{n}\sum_{i=1}^{n} X_i \hat{V}_i^{-1} X_i^{\top}\Big]^{-1}\Big[\frac{1}{n}\sum_{i=1}^{n} X_i \hat{V}_i^{-1} y_i\Big] \qquad (22)$$

where $\hat{V}_i$ is estimated by the MLEs of $D$ and $\sigma^2$ [3]. Following the law of large numbers and the central limit theorem [28, 29], $\hat\beta$ is both consistent and asymptotically normal:

$$\sqrt{n}(\hat\beta - \beta) \to_d N(0, \Sigma_\beta = E^{-1}[X (Z^{\top} D Z + \sigma^2 I_m)^{-1} X^{\top}]) \qquad (23)$$

For the linear contrast (3), the Wald statistic defined in Section 2.1 again has an asymptotic $\chi_s^2$ distribution (9), except for a redefined $\Sigma_\beta$ given by (23). The power function is again given by (10).

In most growth-curve analysis, $z_{it}$ is a function of time $t$ only and is functionally independent of any baseline covariates. In this case, $V_i$ becomes functionally independent of $x_{it}$, and $\Sigma_\beta$ in (23) is calculated in the same way as in (17). Thus, the power function depends on $X$ only through its first two moments. In some other applications, however, $z_{it}$ may depend on some baseline predictors (covariates), in which case the distribution of $X$ must be known in order to compute $\Sigma_\beta$. The considerations are similar to the binary model under GEE discussed above. For example, if $Z_i$ contains part or whole of $X_i$, i.e. $Z_i = Z_i(X_i)$, we may estimate $\Sigma_\beta$ with a Monte Carlo approximation given by:

$$\hat\Sigma_\beta = \Big\{\frac{1}{M}\sum_{k=1}^{M} [X_k (Z_k^{\top} D Z_k + \sigma^2 I_m)^{-1} X_k^{\top}]\Big\}^{-1} \qquad (24)$$

where $X_k$ denotes a random sample from the distribution of $X$ and $Z_k$ the corresponding matrix based on this sample. Note that in (2), we have assumed that $\epsilon_{it}$ has a constant variance $\sigma^2$ over time. If this variance changes over time, the discussion above still applies, with $\sigma^2 I_m$ replaced by the variance of $\epsilon_i$.

3. ILLUSTRATION

In this section, we illustrate the general approach with several examples. All these examples were motivated by real study designs from grant preparations within the School of Medicine at University of Pennsylvania and the School of Medicine and Dentistry at University of Rochester. For illustration purposes, real study sizes are not used. Instead a sample size of 70 (per group for group comparisons) is used throughout the examples, with the type I error fixed at $\alpha = 0.05$.

Example 1 (Repeated analysis of variance)
In a longitudinal study to assess depressive symptoms, chronic pain, and interpersonal functioning (defined as social role performance and attachment to others) among women presenting to a low-income women's public health clinic, it is of interest to compare patients with co-morbid depressive symptoms and chronic pain to those with chronic pain only and to a control group with respect to some outcome of interest such as interpersonal functioning. The design calls for an RANOVA model with three groups. For illustration purposes, we assume three follow-up assessments, $t = 1, 2, 3$ (excluding baseline, denoted by $t = 0$). Thus, $g = 3$, $m = 3$ and $n_k = 70$ ($1 \le k \le g$) in (11). Under a common variance and a uniform compound symmetry correlation structure across all three groups, we have from (12) that $V = \sigma^2 C(\rho)$. Now, consider testing the hypothesis of a constant mean difference between group $(k - 1)$ and $k$ over the three visits, i.e.

$$H_0: \mu_{kt} - \mu_{(k-1)t} = 0 \quad \text{versus} \quad H_a: \mu_{kt} - \mu_{(k-1)t} = a, \quad k = 2, 3; \ 1 \le t \le 3 \qquad (25)$$

To express the above in the form of (3), let

$$K = (-c(1, 2), -c(3, 4), -c(5, 6), c(1), c(3), c(5), c(2), c(4), c(6)), \qquad d = a \mathbf{1}_6 \qquad (26)$$

where $c(i, \ldots, j)$ denotes a $6 \times 1$ vector with 1 in the rows $i, \ldots, j$ and 0 elsewhere, and $\mathbf{1}_6$ a $6 \times 1$ vector of 1s. Power estimates for the hypothesis (25) are readily computed by (10) with $s = 6$. Shown in Table I are power estimates for a series of values of the input parameters, $a$, $\sigma$ and $\rho$. As expected, increasing $a$ or decreasing $\sigma$ has a positive effect on power. In addition, power decreases as the within-cluster correlation $\rho$ becomes larger. The latter is also readily demonstrated analytically. For example, let $\hat\Delta_{k,(k-1)} = m^{-1}\sum_{t=1}^{m}(\hat\mu_{kt} - \hat\mu_{(k-1)t})$, with $\hat\mu_{kt}$ defined in (14). Then, under $H_a$ in (25) and with equal group sizes, it is readily shown that $\hat\Delta_{k,(k-1)}$ has the following asymptotic distribution:

$$\sqrt{n_k}(\hat\Delta_{k,(k-1)} - a) \to_d N\Big(0, \frac{2\sigma^2}{m}[1 + (m - 1)\rho]\Big), \quad k = 2, 3$$

The asymptotic variance of the difference statistic $\hat\Delta_{k,(k-1)}$ is an increasing function of $\rho$. Thus, power is inversely related to the within-cluster correlation.
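The Example 1 calculation combines (15), (25) and (10), and its dependence on $\rho$ can be examined numerically. The following is a minimal sketch, assuming NumPy and SciPy. The contrast matrix is built as `A ⊗ I_m`, with `A` holding the successive group differences of (25) and $\mu$ ordered group by group; this is one possible encoding chosen for the sketch, and the row ordering, function name and input values ($a = 0.8$, $\sigma = 4$) are assumptions for illustration rather than a reproduction of Table I.

```python
import numpy as np
from scipy.stats import chi2, ncx2


def ranova_power(a, sigma, rho, n_per_group=70, g=3, m=3, alpha=0.05):
    """Return (power, c) for H_a: mu_kt - mu_(k-1)t = a, k = 2..g, t = 1..m."""
    n = g * n_per_group                               # total sample size
    V = sigma**2 * ((1.0 - rho) * np.eye(m) + rho * np.ones((m, m)))
    D = np.eye(g) / g                                 # diag(p_k) with p_k = 1/g
    Sigma_mu = np.kron(np.linalg.inv(D), V)           # (15): D^{-1} (x) V

    A = np.diff(np.eye(g), axis=0)                    # rows e_k - e_(k-1)
    K = np.kron(A, np.eye(m))                         # s = (g - 1) * m contrasts
    s = K.shape[0]
    d = np.full(s, float(a))

    c = float(n * d @ np.linalg.solve(K @ Sigma_mu @ K.T, d))
    p_alpha = chi2.ppf(1.0 - alpha, df=s)
    power = chi2.sf(p_alpha, df=s) if c == 0.0 else ncx2.sf(p_alpha, df=s, nc=c)
    return float(power), c


rhos = [0.1, 0.3, 0.5, 0.7]
powers = [ranova_power(a=0.8, sigma=4.0, rho=r)[0] for r in rhos]
for r, pw in zip(rhos, powers):
    print(f"rho = {r:.1f}  power = {pw:.3f}")
```

Under this encoding the non-centrality parameter reduces to $c = 2 n a^2 / \{\sigma^2 [1 + (m - 1)\rho]\}$ with $n = 210$, so the computed power falls as $\rho$ grows, in line with the argument above.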

Table I. Estimates of power for detecting a constant between-group mean difference over time among three groups in an RANOVA model with three assessments.
Within-subject over time correlation Standard deviation =4=3:5 Between-group mean difference a =08 046= = =038 02= = = = = = = = =094 07=087 06=078

It is interesting to compare this example with the paired t-test. The latter is widely used for comparing two groups involving changes between two time points, often termed pre- and post-treatment. It is well known that for the paired t-test power is an increasing function of the within-subject correlation (between pre- and post-treatment measures), in contrast to the behaviour of the power function for hypothesis (25). To explain the difference, let us formulate the paired t-test using the set-up in this paper. Let $m = 2$ and $g = 1$. Let $y_{it}$ denote the paired pre- ($t = 1$) and post-responses ($t = 2$), with mean $\mu_t$ for $t = 1, 2$. The paired t-test is designed to detect a difference between the $\mu_t$'s, i.e.

$$H_0: \mu_2 - \mu_1 = 0 \quad \text{versus} \quad H_a: \mu_2 - \mu_1 = a \ne 0 \qquad (27)$$

By comparing the above to (25), it is seen that the paired t-test detects differences in the mean response between two assessment points, while the hypotheses in (25) concern between-group differences within each of the assessment points. Note that power for the paired t-test is usually computed based on the t-distribution [35]. The alternative based on (27) is asymptotically equivalent to this procedure, with the advantage of not requiring the normal assumption.

Example 2 (Linear growth curve with a continuous baseline covariate under GEE)
In a sleep study, it is of interest to model the change of some measure of sleep disturbance (total sleep time, averaged cortisol or melatonin levels, etc.) over three assessment points. Since sleep is a function of age, it is important to control for its effect in power analysis. In this example, we assume that the change pattern over the period of study is independent of age so that there is no age by time interaction (see also Example 4 for a different analysis). Assume a linear growth curve model, with one baseline covariate $x$ and three assessments, $t_1 = 0$, $t_2$ and $t_3$. Let $x_{it} = (x_{i1t}, x_{i2t}, x_{i3t})^{\top} = (1, x_i, t)^{\top}$. Then, under the GEE approach, it follows from Section 2.1.2 that

$$E(y_{it} \mid x_i) = \beta_0 + x_i \beta_1 + t \beta_2, \quad \mathrm{Var}(y_{it} \mid x_i) = \sigma^2 \qquad (28)$$

Consider testing a non-zero slope for the linear growth:

$$H_0: \beta_2 = 0 \quad \text{versus} \quad H_a: \beta_2 = a$$
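Because the covariate in (28) is time-invariant, the expectation in (17) can be assembled from the mean and variance of $x_i$ alone. The following is a minimal sketch of that assembly, assuming NumPy and a compound symmetry $V$; the assessment times $(0, 1, 2)$, the covariate mean and standard deviation, and the values of $\sigma$, $\rho$ and $a$ are hypothetical inputs chosen for illustration (the text fixes only $t_1 = 0$ and $n = 70$), and the function names are not from the paper.

```python
import numpy as np


def compound_symmetry(m, sigma, rho):
    """sigma^2 * C(rho) for m assessments."""
    return sigma**2 * ((1.0 - rho) * np.eye(m) + rho * np.ones((m, m)))


def expected_information(times, mean_x, sd_x, V):
    """E(X V^{-1} X^T) for x_it = (1, x_i, t)^T, evaluated element-wise as in (17)."""
    times = np.asarray(times, dtype=float)
    V_inv = np.linalg.inv(V)
    ex2 = mean_x**2 + sd_x**2                 # E(x^2) from the mean and variance
    info = np.zeros((3, 3))
    for l, t_l in enumerate(times):
        for r, t_r in enumerate(times):
            # E(x_l x_r^T): entry (j, k) is E(x_jl x_kr)
            second_moment = np.array([
                [1.0,    mean_x,       t_r],
                [mean_x, ex2,          mean_x * t_r],
                [t_l,    t_l * mean_x, t_l * t_r],
            ])
            info += V_inv[l, r] * second_moment
    return info


# Assumed inputs for illustration only.
times = [0.0, 1.0, 2.0]
V = compound_symmetry(m=3, sigma=5.0, rho=0.5)
info = expected_information(times, mean_x=50.0, sd_x=10.0, V=V)
Sigma_beta = np.linalg.inv(info)              # Sigma_beta = E^{-1}(X V^{-1} X^T)

K = np.array([[0.0, 0.0, 1.0]])               # picks out the slope beta_2
n, a = 70, 0.5
c = float(n * a**2 / (K @ Sigma_beta @ K.T)[0, 0])   # non-centrality for (9)
print(Sigma_beta[2, 2], c)
```

The value `c` can then be passed to the non-central $\chi^2$ power function (10) with $s = 1$.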

Table II. Estimates of power for detecting a slope of 0.5 in a GEE-based growth-curve model with sample size equal to 70.
Within-subject over time correlation Standard deviation of model error =5=7 Standard deviation of covariate x i = =045 06= = = = = = = = = = = = =059

Then, $K = (0, 0, 1)$ and the non-centrality parameter for (9) is $c = n a^2 (K \Sigma_\beta K^{\top})^{-1}$. To compute $\Sigma_\beta$, note that

$$E(x_{1t_l} x_{1t_m}) = 1, \quad E(x_{1t_l} x_{2t_m}) = E(x), \quad E(x_{1t_l} x_{3t_m}) = t_m,$$
$$E(x_{2t_l} x_{2t_m}) = E(x^2), \quad E(x_{2t_l} x_{3t_m}) = t_m E(x), \quad E(x_{3t_l} x_{3t_m}) = t_l t_m \qquad (29)$$

Given the first two moments, or the mean and variance, of $x_i$, $\Sigma_\beta = E^{-1}(X V^{-1} X^{\top})$ is readily computed using (17). Shown in Table II are power estimates under a compound symmetry correlation assumption for a range of values of the key parameters, $\rho$, $\sigma$ and the standard deviation of the covariate. As with Example 1, power is a decreasing function of the within-subject correlation. In addition, power increases as the variance of the covariate gets larger.

Example 3 (Clustered binary responses)
As an example of a clustered cross-sectional study design, consider a study testing a new behavioural therapy for reducing sexually transmitted disease (STD) due to HIV. The goal of the study is to determine the efficacy of the therapy at post-treatment by testing for a differential STD rate between the treated and a control group. A total of 140 HIV serodiscordant heterosexual couples are targeted, evenly split between the two groups. Although assessment is cross-sectional, partners from each couple form clustered responses. To address this between-partner correlation, we model the binary responses from partners using the model in (1), with cluster size $m = 2$ and a logit link, $h_{it}(x_i) = \exp(\beta_0 + \beta_1 x_i) / [1 + \exp(\beta_0 + \beta_1 x_i)]$, where $i$ indexes couple, $t$ indexes partner ($t = 0, 1$), and the binary covariate $x_i$ indicates treatment condition: $x_i = 1$ for the treated and $x_i = 0$ for the control group. Under this model, the STD rates for the control and treated groups can be expressed as $p_0 = h_{it}(\beta_0)$ and $p_1 = h_{it}(\beta_0 + \beta_1)$, and a differential STD rate can be expressed in terms of $\beta_1$ as follows:

$$H_0: \beta_1 = 0, \qquad H_a: \beta_1 = a = h_{it}^{-1}(p_1) - h_{it}^{-1}(p_0) \qquad (30)$$

To compute $\Sigma_\beta$, note that it follows from (19) that

$$\rho_i = \mathrm{Corr}(y_{it}, y_{is} \mid x_i) = \frac{\Pr[y_{it} = 1 \mid y_{is} = 1, x_i] - h_i}{1 - h_i},$$

$$D_i(x_i) = h_i(1 - h_i)\begin{pmatrix} 1 & 1 \\ x_i & x_i \end{pmatrix}, \quad V_i(x_i) = h_i(1 - h_i)\begin{pmatrix} 1 & \rho_i \\ \rho_i & 1 \end{pmatrix},$$
$$\Sigma_\beta = E^{-1}(D V^{-1} D^{\top}) = \Big[\sum_{l = 0, 1} D(x_l) V^{-1}(x_l) D^{\top}(x_l) \Pr[x = l]\Big]^{-1} \qquad (31)$$

where $\Pr[x = l]$ denotes the proportion of group $l$ in the total study population ($l = 0, 1$). For equal group size as in this example, $\Pr[x = l] = 1/2$. For this application, we also set the within-couple correlation to a constant, $\rho_i = \rho$. This is a reasonable assumption in most such studies, since it is unlikely that the between-partner correlation changes across treatments. Shown in Table III are power estimates for detecting a differential STD rate for a range of values of the between-partner correlation. As expected, power decreases as this correlation increases. Note that unlike linear models, power also depends on the actual rates of the two groups, in addition to being a function of their difference.

Table III. Estimates of power for detecting a between-group difference in incidence of STD based on a sample size of 70 couples per group.
STD incidence rate Between-partner correlation Control Treatment

Example 4 (Linear growth curve with a binary baseline covariate under LMM)
Consider again the sleep disturbance study in Example 2. Now, suppose that the change pattern varies with age. For illustration purposes, we assume that we can group subjects into two age groups. In addition, we assume an LMM for modelling the change pattern over time. Note that applications of LMM require that the response variable follow a normal distribution. For most biological measures such as averaged cortisol levels, the normal assumption may approximately apply.

Let $x_i$ denote a baseline binary covariate, indicating the two age groups. As in Example 2, assume three assessment times $t_1 = 0$, $t_2$ and $t_3$, and set $x_{it} = (1, x_i, t, t x_i)^{\top}$ and $z_{it} = (1, t)^{\top}$. The LMM in (2) becomes:

$$y_{it} = \beta_0 + x_i \beta_1 + t \beta_2 + t x_i \beta_3 + b_{0i} + t b_{1i} + \epsilon_{it}, \qquad b_i \sim N(0, D), \quad \epsilon_i \sim N(0, \sigma^2 I_m) \qquad (32)$$

Unlike Example 2, the above model includes a time by covariate interaction, which in this case accounts for a differential linear trend between the two groups defined by $x_i = 0$ and $x_i = 1$. Note that as special cases, (32) reduces to a two-group RANOVA if $\beta_2 = \beta_3 = 0$ and $z_{it} = 1$ [9], and to a growth curve model without covariate if $\beta_1 = \beta_3 = 0$ [5]. For this two-group growth curve model, hypotheses of interest include non-zero growth rate for one or both groups, differential growth rates, etc. For illustration purposes, consider testing

13 POWER ANALYSES FOR LONGITUDINAL TRIALS 28 Table IV Estimates of power for detecting a dierential slope in a LMM-based growth-curve model with a sample size 70 Ratio of between- to within-cluster standard deviation b Between-group slope dierence a =0:5=:0 Within-cluster std dev :99=:0 0:9=0:99 0:62=0:99 0:40=0:92 0:28=0:79 0:=0:29 4 0:99=:0 0:69=0:99 0:40=0:92 0:25=0:73 0:8=0:54 0:08=0:8 5 0:93=0:99 0:5=0:98 0:27=0:77 0:8=0:54 0:3=0:38 0:07=0:3 for a dierential growth rate, ie The power function is given by (0), with H 0 : 3 = 0 versus H a : 3 = a 0 (33) = E [X V X ]; V = Z DZ + 2 I 3 ; K =(0; 0; 0; ) (34) By identifying V above as the V in Example 2, it is readily seen that depends on x through its mean and variance the same way as in that example In comparison, the conditional variance of y i given x i here is explicitly partitioned into two parts; one accounts for the random eect (between-cluster), Z DZ, and the other for model (within-cluster) error, 2 I 3 Let D = b 2( r2 )=2 bg(; ) Note that since the intercept and slope may have quite different units, the uniform compound symmetry correlation structure is not appropriate for the random eect and the presence of is to account for such dierential units between the intercept and slope Then, it follows from (34) that = 2 E [X V X ]; V = 2 b 2 Z G(; )Z + I 3 Thus, is a function of 2 ; 2 b =2, and Shown in Table IV are power estimates for a range of values of the input parameters, 2 and 2 b =2, with =0:5 and = Note that for the particular hypothesis in (33), we found that power estimates did not change when was varied and for this reason, was set to in the calculations As in Example 2, power decreases as the variance ratio, 2 b =2, increases It is somewhat surprising to see the drastic eect of this parameter on power In addition, Table IV also shows a strong dependence of power on the slope dierence Example 5 (Intraclass correlation) We now illustrate a cross-sectional clustered design in psychotherapy research An 
important consideration in designing psychosocial studies is the so-called therapists eect Because therapists may dier in their skill or ability to form a therapeutic bond, there are often real dierences between therapists in their average outcomes [39] Thus, studies of the ecacy or eectiveness of therapy often have a built-in component for testing this eect The LMM is often used for this purpose
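Looking back at Example 4, the calculation in (33) and (34) can be sketched numerically. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the assessment times (0, 1, 2), the entries of D, the error variance, and the total sample size of 140 in two equal groups are illustrative values, and a two-sided Wald test with a normal approximation stands in for the power function (10), which is not reproduced in this section. The names `slope_difference_power`, `inverse`, `matmul` and `transpose` are hypothetical helpers.

```python
from statistics import NormalDist


def inverse(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def slope_difference_power(beta_a, n_total, times, D, sigma2, alpha=0.05):
    """Approximate two-sided Wald power for H0: beta_3 = 0 in model (32).

    Returns (power, variance of the beta_3 estimate for n_total subjects).
    Assumes two equally sized groups, i.e. Pr[x = 0] = Pr[x = 1] = 1/2.
    """
    Z = [[1.0, t] for t in times]
    m = len(times)
    # V = Z D Z' + sigma^2 I, as in (34)
    ZDZt = matmul(matmul(Z, D), transpose(Z))
    V = [[ZDZt[i][j] + (sigma2 if i == j else 0.0) for j in range(m)]
         for i in range(m)]
    V_inv = inverse(V)
    # E[X' V^{-1} X], averaging over the two groups
    info = [[0.0] * 4 for _ in range(4)]
    for x in (0.0, 1.0):
        X = [[1.0, x, t, t * x] for t in times]
        XtVX = matmul(matmul(transpose(X), V_inv), X)
        for i in range(4):
            for j in range(4):
                info[i][j] += 0.5 * XtVX[i][j]
    # K = (0, 0, 0, 1)' picks out the interaction coefficient
    var_beta3 = inverse(info)[3][3] / n_total
    z = abs(beta_a) / var_beta3 ** 0.5
    norm = NormalDist()
    crit = norm.inv_cdf(1 - alpha / 2)
    power = norm.cdf(z - crit) + norm.cdf(-z - crit)
    return power, var_beta3


# Illustrative inputs only (assumptions, not values from the paper)
power, var_beta3 = slope_difference_power(
    beta_a=0.5, n_total=140, times=(0.0, 1.0, 2.0),
    D=[[4.0, 1.0], [1.0, 1.0]], sigma2=4.0)
```

In this sketch the variance of the interaction estimate is unchanged when the intercept variance and the intercept-slope covariance in D are altered while the slope variance is held fixed, which appears to be in line with the remark in Example 4 that varying one of the parameters of G left the power estimates unchanged.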

Let $n$ denote the number of therapists and $m$ the number of patients seen by each therapist. By treating $n$ as the number of clusters and $m$ as the size of each cluster, patients' responses, $y_{it}$, can be modelled using the following LMM:

$$y_{it} = \beta_0 + x_i \beta_1 + b_i + \varepsilon_{it}; \qquad b_i \sim N(0, \sigma_b^2), \quad \varepsilon_{it} \sim N(0, \sigma^2), \quad 1 \le i \le n, \quad 1 \le t \le m \tag{35}$$

where $x_i$ is a predictor of interest. In (35), the therapist effect is explicitly modelled by the random effect $b_i$. It follows from Section 2.2 that the asymptotic variance of the MLE of $\beta = (\beta_0, \beta_1)^{\top}$ is given by:

$$\Sigma_\beta = E^{-1}[X^{\top} V^{-1} X]; \qquad V = \sigma_b^2 J_m + \sigma^2 I_m = (\sigma_b^2 + \sigma^2)\, C(\rho) \tag{36}$$

where $J_m$ is an $m \times m$ matrix of 1s and $C(\rho)$ is the uniform compound symmetry correlation matrix with $\rho = \sigma_b^2 / (\sigma_b^2 + \sigma^2)$. The within-cluster correlation $\rho$ is widely known as the intraclass correlation. Thus, the asymptotic variance is a function of the first two moments of $x_i$, the intraclass correlation $\rho$ and the total variance $\sigma_b^2 + \sigma^2$. Consider the hypothesis:

$$H_0: \beta_1 = 0 \quad \text{versus} \quad H_a: \beta_1 = \beta_a \neq 0 \tag{37}$$

If $x_i$ is binary, indicating say two treatment conditions, the above tests for a differential treatment effect. If $x_i$ is a continuous covariate, then (37) tests for a linear relationship between $y_{it}$ and $x_i$. Shown in Table V are power estimates for detecting a non-zero treatment difference between two samples ($x_i = 0, 1$). As in all previous examples, power decreases as the intraclass correlation increases. Like Example 4, power also depends heavily on the size of the treatment difference $\beta_a$.

Table V. Estimates of power for detecting a treatment difference between two groups, with a sample of 70 therapists per group and 4 patients per therapist. Rows are indexed by the total variance $\sigma_b^2 + \sigma^2$ and columns by the intraclass correlation $\rho = \sigma_b^2/(\sigma_b^2 + \sigma^2)$. Cells are pairs of power estimates under the heading 'Treatment difference $\beta_a = 1.0/1.5$'. (The column values, two row labels and one cell entry are illegible in the source.)

Total
variance       col 1      col 2      col 3      col 4      col 5
[illegible]    0.99/0.93  0.99/0.82  0.96/0.73  0.92/0.61  0.87/0.54
16             0.97/0.74  0.90/0.57  0.80/0.46  0.71/0.40  0.64/[illegible]
[illegible]    0.88/0.55  0.73/0.40  0.61/0.32  0.52/0.27  0.45/0.23

As noted in Section 2, cross-sectional clustered study designs also often arise in survey and epidemiological research [10, 34]. However, unlike psychosocial applications, where the intraclass correlation is also of interest for assessing the therapist effect, survey and epidemiological studies are mostly interested in estimating population means (or fixed effects), in which case one can apply either GEE or LMM. Thus, LMM is more appropriate for psychosocial applications involving inference for intraclass correlations.
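For a binary cluster-level covariate with two equally sized groups, the variance in (36) has a simple closed form, so the power for (37) can be sketched in a few lines. The following Python sketch is illustrative rather than the authors' implementation: it uses a two-sided Wald test with a normal approximation as a stand-in for the paper's power function, and the function name `icc_power` and the numeric inputs (total variance 9, intraclass correlation 0.1, treatment difference 1.0) are assumptions chosen for illustration.

```python
from statistics import NormalDist


def icc_power(beta_a, n_per_group, m, total_var, icc, alpha=0.05):
    """Approximate two-sided Wald power for H0: beta_1 = 0 in model (35).

    Assumes a binary cluster-level covariate x_i with two equally sized
    groups of n_per_group clusters, each cluster of size m.
    """
    # With V = total_var * C(icc), 1' V^{-1} 1 = m / (total_var * (1 + (m - 1) * icc)),
    # so clustering inflates the variance by the factor 1 + (m - 1) * icc.
    design_effect = 1.0 + (m - 1) * icc
    # Variance of the estimated group difference from (36) with
    # E[x] = E[x^2] = 1/2 and 2 * n_per_group clusters in total.
    var_beta1 = 2.0 * total_var * design_effect / (n_per_group * m)
    z = abs(beta_a) / var_beta1 ** 0.5
    norm = NormalDist()
    crit = norm.inv_cdf(1 - alpha / 2)
    return norm.cdf(z - crit) + norm.cdf(-z - crit)


# Illustrative inputs only: 70 therapists per group, 4 patients per therapist
power = icc_power(beta_a=1.0, n_per_group=70, m=4, total_var=9.0, icc=0.1)
print(round(power, 2))  # about 0.93
```

Under these assumptions the sketch shows the behaviour described in the text: raising the intraclass correlation increases the variance inflation factor and lowers power, while a larger treatment difference raises it.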

Figure 1. Three missing-data patterns (Pattern 1, Pattern 2, Pattern 3; horizontal axis: time of assessment) for a longitudinal study with six subjects and three assessment points ($y_{it}$ denotes the response of the $i$th subject at time $t$, with dots denoting missing data).

4 DISCUSSION

In this paper, we have developed a systematic approach to power analysis for the two most popular clustered data approaches, the GEE and LMM. By extending existing methods to accommodate more practical considerations, this unified approach addresses the limitations of these methods and provides power and sample size estimation for quite general study designs under both modelling paradigms.

One major limitation of the proposed approach is the assumption of a common, constant cluster size across clusters. In survey research and most epidemiological studies, cluster sizes often vary [10, 34]. This issue of varying cluster sizes also arises in longitudinal studies as the result of missing data. In the latter case, we must also address the order structure among the repeated assessments in addition to differences in cluster sizes. For data analysis, both the varying cluster size and the order structure are readily addressed by applying GEE or LMM to the observed data, provided that missing data follow the missing completely at random assumption (for using GEE) or the missing at random assumption (for using LMM) [3, 40, 41]. For power analysis, both become important considerations, since cluster size and order structure are a function of the sampling process and change dynamically under replications of the same study design.

For example, shown in Figure 1 are three possible missing data patterns for a hypothetical longitudinal study with six subjects and three assessments. For a real study, only one of the patterns is observed, and inference for model parameters of interest is performed by conditioning on the observed missing data pattern using either GEE and/or LMM. However, for power analysis, we must consider all three plus many more potential missing data patterns as realizations of a random process.

Currently, there is no general approach to addressing the random nature of the missing data patterns. Existing methods provide power estimates either conditional on a particular missing data pattern or based on certain marginal distributions of the missing data pattern [5, 6, 9, 10]. Since different missing data patterns generally give rise to different power functions, the former approach does not address the effect of the dynamic missing data patterns on power estimates. The latter approach is also ineffective in addressing the different missing data patterns. For example, to account for missing data in longitudinal study designs, one such method is to condition on the available sample size at each assessment point. However, this method does not distinguish the first two patterns in Figure 1. Methods used in survey research and epidemiological studies only account for differences in cluster sizes [10] and as such cannot distinguish the last two missing data patterns in the figure. Thus, to fundamentally address the missing data issue, it seems necessary to model the missing data process and incorporate this model into the power function. Developing such an approach will be pursued in our future research.

ACKNOWLEDGEMENTS

This research is supported in part by an NIH/MINH Grant P50-MH (Crits-Christoph and Tu) and by an NIH/NIAD Grant K22AI 586 (Kowalski). We especially thank two anonymous reviewers for bringing to our attention many important references and for numerous valuable comments that greatly improved the presentation of the material.

REFERENCES

1. Clarke GN. Improving the transition from basic efficacy research to effectiveness studies: methodological issues and procedures. Journal of Consulting and Clinical Psychology 1995; 63.
2. Hoagwood K, Hibbs E, Brent D, Jensen P. Introduction to the special section: efficacy and effectiveness in studies of child and adolescent psychotherapy. Journal of Consulting and Clinical Psychology 1995; 63.
3. Hogarty GE, Schooler NR, Baker RW. Efficacy versus effectiveness. Psychiatric Services 1997; 48:1107.
4. Muller KE, LaVange LM, Ramey SL, Ramey CT. Power calculations for general linear multivariate models including repeated measures applications. Journal of the American Statistical Association 1992; 87.
5. Wu M. Sample size for comparison of changes in the presence of right censoring caused by death, withdrawal, and staggered entry. Controlled Clinical Trials 1988; 9.
6. Lee JW, DeMets DL. Sequential comparison of changes with repeated measurements data. Journal of the American Statistical Association 1991; 86.
7. Diggle PJ, Liang KY, Zeger SL. Analysis of Longitudinal Data. Oxford University Press: New York.
8. Rochon J. Application of GEE procedures for sample size calculations in repeated measures experiments. Statistics in Medicine 1998; 17.
9. Hedeker D, Gibbons RD, Waternaux C. Sample size estimation for longitudinal designs with attrition: comparing time-related contrasts between two groups. Journal of Educational and Behavioral Statistics 1999; 24.
10. Manatunga AK, Hudgens MG, Chen S. Sample size estimation in cluster randomized studies with varying cluster size. Biometrical Journal 2001; 43:75-86.
11. Liang KY, Zeger SL. Longitudinal data analysis using generalized linear models. Biometrika 1986; 73.
12. Zeger SL, Liang KY. Longitudinal data analysis for discrete and continuous outcomes. Biometrics 1986; 42.
13. Laird N, Ware J. Random-effects models for longitudinal data. Biometrics 1982; 38.
14. Stiratelli R, Laird N, Ware JH. Random-effects models for serial observations with binary response. Biometrics 1984; 40.
15. Gilmour AR, Anderson RD, Rae AL. The analysis of binomial data by a generalized linear mixed model. Biometrika 1985; 72.
16. Breslow NE, Clayton DG. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association 1993; 88.
17. Davidian M, Gallant AR. The nonlinear mixed effects model with a smooth random effects density. Biometrika 1993; 80.
18. Wolfinger R. Laplace's approximation for nonlinear mixed models. Biometrika 1993; 80.
19. Pinheiro JC, Bates DM. Approximations to the log-likelihood function in the non-linear mixed-effects model. Journal of Computational and Graphical Statistics 1995; 4.
20. Goldstein H, Rasbash J. Improved approximations for multilevel models with binary responses. Journal of the Royal Statistical Society Series A 1996; 159:505-513.

21. Lin X, Breslow NE. Bias correction in generalized linear mixed models with multiple components of dispersion. Journal of the American Statistical Association 1996; 91.
22. Wang N, Lin X, Gutierrez G, Carroll RJ. Bias analysis and SIMEX approach in generalized linear mixed measurement error models. Journal of the American Statistical Association 1998; 93.
23. Agresti A, Booth JG, Hobert JP, Cao B. Random effects modelling of categorical response data. Technical Report, Department of Statistics, University of Florida, Gainesville, FL.
24. Lipsitz SR, Kyungmann K, Zhao L. Analysis of repeated categorical data using generalized estimating equations. Statistics in Medicine 1994; 13.
25. Searle SR. Linear Models. Wiley: New York.
26. McCullagh P, Nelder JA. Generalized Linear Models (2nd edn). Chapman & Hall: London.
27. Pepe MS, Anderson GL. A cautionary note on inferences for marginal regression models with longitudinal data and general correlated response. Communications in Statistics Part A Theory and Methods 1994; 23.
28. Billingsley P. Probability and Measure (2nd edn). Wiley: New York.
29. Chung KL. A Course in Probability Theory (2nd edn). Academic Press: CA.
30. Jennrich RI, Schluchter MD. Unbalanced repeated-measures models with structured covariance matrices. Biometrics 1986; 42.
31. Serfling RJ. Approximation Theorems of Mathematical Statistics. Wiley: New York.
32. Seber GAF, Wild CJ. Nonlinear Regression. Wiley: New York.
33. Seber GAF. Multivariate Observations. Wiley: New York.
34. Donner A, Klar N. Cluster randomization trials in epidemiology: theory and applications. Journal of Statistical Planning and Inference 1994; 42.
35. Cohen J. Statistical Power Analysis for the Behavioral Sciences (2nd edn). Lawrence Erlbaum Associates: New Jersey.
36. Neter J, Wasserman W, Kutner MH. Applied Linear Statistical Models (3rd edn). Irwin: Illinois.
37. Geweke J. Bayesian inference in econometric models using Monte Carlo integration. Econometrica 1989; 57.
38. Whittemore AS. Sample size for logistic regression with small response probability. Journal of the American Statistical Association 1981; 76.
39. Crits-Christoph P, Mintz J. Implications of therapist effects for the design and analysis of comparative studies of psychotherapy. Journal of Consulting and Clinical Psychology 1991; 59.
40. Little RJA, Rubin DB. Statistical Analysis with Missing Data. Wiley: New York.
41. Robins J, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association 1995; 90:106-121.
