Group Sequential Procedures for Poisson Process Data with Frailty

Wenxin Jiang
Department of Statistics, Northwestern University, Evanston, Illinois 60208, U.S.A.
January 7, 1998

Summary

Consider studies with recurrent events data modeled by local Poisson processes with frailty, or random effects. In a group sequential design, the increments of the test statistics are no longer independent. We explore the influences of the frailty, and tabulate the stopping boundaries and sample size ratios that control the overall type-I and type-II error rates. Applications to an animal trial with recurrent tumors are discussed, using procedures such as group sequential tests and repeated confidence intervals. Minimal cost analysis is considered for determining the optimal combination of study duration and sample size.

Key words: Frailty; Independent increments; Interim analyses; Poisson process; Minimal cost analysis; Multiplicative intensity model; Recurrent events.

1. Introduction

This paper investigates group sequential procedures for recurrent events data, allowing frailty (see Oakes, 1992), or the random heterogeneity of event frequencies among different subjects (see Lawless, 1987; Turnbull, Jiang, and Clark, 1997). Recurrent events data consist of individuals each being able to develop a number of events over time. Examples include data from medical studies of epileptic seizures, asthmatic attacks, infections, etc. In this context, "frailty" means that
different individuals in the data set can have different "baseline" event rates (on entry to the study), which may be regarded as sampled from an imaginary distribution. As we show in Section 2, when frailty is present, the increments of the sequentially calculated partial-likelihood score statistics no longer have the convenient independence property (e.g., Tsiatis, Boucher, and Kim, 1995; Lan and DeMets, 1983), implying that the usual stopping rules or exit boundaries are no longer valid in general.

Sequential analyses with a fixed number of subjects but with variable follow-up times have been discussed in a different context of repeated measurements data by Armitage, Stratton and Worthington (1985); Geary (1988); Lee and DeMets (1991); Lee and DeMets (1992). Cook (1995) discussed the fixed design of clinical trials for recurrent events data in the same context as the present paper. Few works exist on sequential analysis of recurrent events data. In a recent paper, Cook and Lawless (1996) used robust pseudo-score test statistics, which do not necessarily have independent increment structures, and considered the evaluation of stopping boundaries of various types. The important idea of using robust score-type tests, stemming from Lawless and Nadeau (1995) and Cook, Lawless and Nadeau (1996) and applied to sequential analyses in Cook and Lawless (1996), has the following virtues: (i) no distributional assumption is needed for the event process beyond the assumption on the mean process; (ii) the score statistic does not require obtaining the parameter estimate to start with; (iii) the temporal trend of the mean process is modeled non-parametrically.

The current paper is intended to focus more on the design issues, and to directly model and investigate the effects of frailty, by focusing on the specific situation in which the recurrent events data can be modeled by local Poisson processes with frailty. Here, "local" means that the event rate of the same individual is allowed to change over time. When transformed to the process of the cumulative count of events, this model is essentially a generalization of the Andersen-Gill model (Andersen et al., 1993) incorporating frailty. A "frailty parameter" is introduced in Section 2, which is proportional to the between-subject variance of the baseline event rate.
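To make the idea of a local Poisson process with frailty concrete, the sketch below simulates event counts for subjects whose time-varying baseline rates are multiplied by a random, subject-specific frailty factor. It is an illustration only: the gamma frailty distribution, the function names and all parameter values are assumptions made here, since the text only requires frailty to be a random between-subject effect. For a frailty with mean 1 and variance `theta`, and expected counts m1 and m2 in two disjoint periods, such a model has total-count variance m + theta*m^2 (with m = m1 + m2) and between-period covariance theta*m1*m2, which the simulation checks empirically.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate (Knuth's method; adequate for small means)."""
    limit = math.exp(-mean)
    count, prod = 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def simulate_two_periods(n, rates_first, rates_second, theta, seed=1):
    """Simulate per-subject event totals in two consecutive periods.

    rates_first / rates_second: baseline rates per time unit (they may vary
    over time, which is the "local" part). theta: frailty variance
    (0 means no frailty). Gamma frailty with mean 1 is an assumption here.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        frailty = rng.gammavariate(1.0 / theta, theta) if theta > 0 else 1.0
        first = sum(poisson_draw(rng, frailty * r) for r in rates_first)
        second = sum(poisson_draw(rng, frailty * r) for r in rates_second)
        pairs.append((first, second))
    return pairs


def covariance(xs, ys):
    """Sample covariance (sample variance when xs is ys)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def summarize(pairs):
    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    total = [a + b for a, b in pairs]
    return {
        "mean_total": sum(total) / len(total),
        "var_total": covariance(total, total),
        "cov_periods": covariance(first, second),
    }


if __name__ == "__main__":
    rates_a, rates_b = [0.4, 0.6], [0.7, 0.3]   # hypothetical time-varying rates
    for theta in (0.0, 0.5):
        print(theta, summarize(simulate_two_periods(20000, rates_a, rates_b, theta)))
```

With no frailty the two periods are uncorrelated and the total count is not overdispersed; with positive frailty variance both the variance and the between-period covariance increase, which is the mechanism behind the dependent increments studied in this paper.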

In particular, we address the following aspects, which differ from Cook and Lawless (1996), in an attempt to provide some guidelines for study design.

(i) We consider the Wald statistics, which are often used in biomedical studies.

(ii) We consider the asymptotic joint distribution of the sequential test statistics under local alternatives (Section 2) to derive the formulas for obtaining stopping boundaries and sample size planning (Section 3), and repeated confidence intervals (Section 4) that allow extra flexibility for interim analyses (Jennison and Turnbull, 1984; Lai, 1984). For uniform follow-up plans (Section 3), we transform the test statistics and derive an iterative algorithm in the form of Armitage, McPherson and Rowe (1969) for calculating stopping boundaries and planning sample size. For 2- to 5-stage interim analyses with equal increments procedures (Section 3), we present concise tables (Tables 3, 4, 5) for the stopping boundaries and sample size planning, labeled by one parameter related to frailty.

(iii) We illustrate the calculation of a correlation parameter expressed in (8), to see how the dependence of the increments of the test statistic is induced by frailty. Two properties of the test statistics caused by frailty are discussed (P1 and P2 in Section 2), with their implications for study planning discussed in Section 3.

(iv) For studies with recurrent events data, designers often have their own choice between increasing the sample size and increasing the length of the study period to achieve certain error rates. We present a minimal cost analysis (Section 5) for determining the optimal combination of sample size and study duration.

(v) Cook and Lawless (1996) use the error spending function approach, and estimate the covariance matrix of the sequential statistics to obtain the stopping boundaries stage by stage from data. This approach has the virtue of robustness under model misspecifications. We consider the original Pocock (Pocock, 1977) and OF (O'Brien and Fleming, 1979) designs and summarize the covariance matrix in terms of the frailty parameter, under the local Poisson process model with frailty. This approach is more model-specific, but is especially suitable for design problems, due to the ease of summarizing the procedures in terms of pre-planned stopping boundaries and sample
sizes using the frailty parameter.

(vi) All the stopping criteria and sample size calculations are based on the joint asymptotic normality of the sequentially calculated score-type statistics. However, since the independent increment property fails, the joint normality is not automatic, even if under mild regularity conditions the sequential score statistics are marginally asymptotically normal. We outline a proof of the joint asymptotic normality in Lemma 1 in Appendix A.

In Section 2 the robust test statistics are introduced to account for frailty. Joint distributions of the test statistics evaluated at different interim analysis dates are derived and expressed in terms of certain correlation coefficients, which can be calculated for different models of follow-up processes. From the joint distributions, we calculate the adjusted stopping boundaries of the Pocock and the OF types, and tabulate them (Tables 1, 2, 3) for different correlation coefficients (Section 3). Recipes are also introduced for calculating sample sizes (Section 3) and constructing repeated confidence intervals (Section 4). The optimal choice between increasing sample size and study duration is discussed in Section 5. We then retrospectively use data from a rat experiment to illustrate the methods of this paper (Section 6), followed by a brief discussion (Section 7). All the tables (1 to 5) are in Appendix B.

2. Model and notation

First, let us consider a trial with a fixed design, with $n$ independent subjects labeled $i = 1, \ldots, n$. Each subject is randomly assigned to receive a certain treatment ($Z = 1$) or a placebo ($Z = 0$) with equal probabilities. The study duration is parameterized as Day 0 ($t = 0$) to Day $K - 1$ ($t = K - 1$). During the study period, subject $i$ enters at $t = t_e$ and exits either at the end of study $K$ or at a censoring time $c$ due to a loss to follow-up. The purpose of the trial is to detect the treatment effect in reducing the frequency of certain outcome events. For subject $i$ at time $t$, the response variable $\tilde{Y}_{it}$ is the number of events, observed or not. Let $Y_{it} = H_{it}\tilde{Y}_{it}$ be the number of
observed events, where we introduce an indicator $H_{it}$ which takes value 1 if subject $i$ is observed at $t$, and 0 otherwise. We assume that the follow-up process $\{H_{it}\}$ is independent of the event process $\{\tilde{Y}_{it}\}$. A semi-parametric model in the multiplicative form is specified as

$$E_c(Y_{it}) = H_{it}\,\nu_i\,\lambda_t\,e^{Z_i\beta}, \qquad (1)$$

where $\beta$ is the parameter for treatment effect (log risk ratio), $\lambda_t$ is a discrete baseline intensity function which represents the natural trend of disease progression, and $E_c$ represents the expectation conditional on the follow-up process, treatment assignment and a "frailty" factor $\nu_i$. Note that in this model, even if treatment assignments are the same for two patients, the event frequencies can differ due to different frailties $\nu_i$, which can come from all the different personal attributes (age, gender, genetic factors, family history, etc.). We simplistically assume that the $\nu_i$'s are independent and identically distributed (i.i.d.) random variables with mean 1 and variance $\theta$ (the frailty parameter). In the following we will assume a local Poisson process regression model, where the $Y_{it}$'s, conditional on the $Z_i$'s, $H_{it}$'s and the frailties $\nu_i$, are independent Poisson random variables with mean expressed by (1). We use this model as an example to formulate our method, which itself allows other models as well. For the model described above, it is known (Lawless and Nadeau, 1995; Jiang, 1996; Jiang and Turnbull, 1997) that the usual partial likelihood estimate $\hat\beta$ is consistent for the treatment effect, and is asymptotically normal with variance estimable by a robust sandwich-type estimator, despite the existence of frailty. The partial likelihood score test is also valid, after using a robust estimate of the variance, for testing the null hypothesis $H_0: \beta = \beta_0$. The results can be summarized as follows.
Let the partial likelihood estimator be $\hat\beta = \arg\max_{b \in \mathbb{R}} L(b)$, where

$$L(b) = \log \prod_{i=1}^{n} \prod_{t=0}^{K-1} \left( \frac{e^{Z_i b}}{\sum_{j=1}^{n} H_{jt}\, e^{Z_j b}} \right)^{Y_{it}}.$$

Denoting $U(b) := \nabla_b L(b)$, the partial likelihood score statistic is $U(\beta_0)$. We have, under mild
regularity conditions, as $n \to \infty$, that $n^{1/2}(\hat\beta - \beta_0) \to \mathrm{Normal}\{0, n\,\mathrm{var}(\hat\beta)\}$ in distribution, and $n^{-1/2}U(\beta_0) \to \mathrm{Normal}[0, n^{-1}\mathrm{var}\{U(\beta_0)\}]$ in distribution, under the null hypothesis $H_0: \beta = \beta_0$. Robust estimates of the asymptotic variances $\mathrm{var}(\hat\beta)$ and $\mathrm{var}\{U(\beta_0)\}$ are discussed in Lawless and Nadeau (1995) and Jiang (1996), for example. In the present notation, they can be expressed as

$$\widehat{\mathrm{var}}(\hat\beta) = (n\hat{I}^{-1})(n^{-1}\hat{V})(n\hat{I}^{-1}) \quad \text{and} \quad \widehat{\mathrm{var}}\{U(\beta_0)\} = (n^{-1}\hat{V}), \qquad (2)$$

where $\hat{I} = -\nabla_b^2 L(b)|_{b=\hat\beta}$ and

$$\hat{V} = \sum_{i=1}^{n} \left[ \sum_{t=0}^{K-1} H_{it} \left\{ Y_{it} - \left( \frac{\sum_{j=1}^{n} Y_{jt}}{\sum_{j=1}^{n} H_{jt}\, e^{Z_j\hat\beta}} \right) e^{Z_i\hat\beta} \right\} \left\{ Z_i - \left( \frac{\sum_{j=1}^{n} H_{jt} Z_j\, e^{Z_j\hat\beta}}{\sum_{j=1}^{n} H_{jt}\, e^{Z_j\hat\beta}} \right) \right\} \right]^2.$$

Denote $I(b) := -\nabla_b^2 \ell(b)$, where $\ell(b) := \sum_{t=0}^{K-1} E[Y_{it}\{Z_i b - \log E(H_{it} e^{Z_i b})\}]$ is the uniform strong asymptotic limit of $n^{-1}L$, up to a constant independent of $b$. When the true parameter is $\beta$, the (strong) asymptotic limit of $n^{-1}\hat{I}$ is just $I(\beta)$, which will be used in the discussion below.

We now come to a sequential design, where we divide the entire study period into $[0, K) = [K_0, K_1) \cup [K_1, K_2) \cup \cdots \cup [K_{Q-1}, K_Q)$, where $K_0 = 0$ and $K_Q = K$. $Q$ analyses are scheduled at $t = K_1, \ldots, K_Q$, to perform the Wald test or the score test by examining $\hat\beta_q$ or $U_q(\beta_0)$ based on data in $t \in [0, K_q)$, $q = 1, \ldots, Q$, allowing early termination of the trial. From now on, a subscript $q$ represents the quantity evaluated from data in $t \in [0, K_q)$, $q = 1, \ldots, Q$. To establish the stopping rules to control the overall type-I error rate, we need to know the joint distributions of $(\hat\beta_q - \beta_0)$, $q = 1, \ldots, Q$, and of $U_q(\beta_0)$, $q = 1, \ldots, Q$. Under a sequence of local alternatives, where the true parameter is $\beta = \beta_0 + n^{-1/2}\Delta$, the joint distribution of the $n^{-1/2}U_q(\beta_0)$'s turns out to be asymptotically normal under mild regularity conditions. See Lemma 1 in Appendix A. The joint asymptotic distribution of the $n^{1/2}(\hat\beta_q - \beta_0)$'s can be obtained by noticing that $n^{1/2}(\hat\beta_q - \beta_0)$ is $\{I_q(\beta_0)\}^{-1}\{n^{-1/2}U_q(\beta_0)\} + o_p(1)$ as $n \to \infty$. When the test statistics $U_q(\beta_0)$'s and
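$(\hat\beta_q - \beta_0)$'s are normalized by being divided by their asymptotic standard errors, they both have the same asymptotic distribution. Denote the normalized statistics as $s_q(\beta_0) = U_q(\beta_0)[\widehat{\mathrm{var}}\{U_q(\beta_0)\}]^{-1/2}$ and $w_q(\beta_0) = (\hat\beta_q - \beta_0)\{\widehat{\mathrm{var}}(\hat\beta_q)\}^{-1/2}$, $q = 1, \ldots, Q$. Then we have the following theorem.

Theorem 1. If the true parameter is $\beta = \beta_0 + n^{-1/2}\Delta$ (local alternative), then under mild regularity conditions,

$$\begin{pmatrix} w_1(\beta_0) \\ \vdots \\ w_Q(\beta_0) \end{pmatrix} \text{ and } \begin{pmatrix} s_1(\beta_0) \\ \vdots \\ s_Q(\beta_0) \end{pmatrix} \to \mathrm{Normal}\left\{ \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_Q \end{pmatrix}, \begin{pmatrix} \rho_{11} & \cdots & \rho_{1Q} \\ \vdots & \ddots & \vdots \\ \rho_{Q1} & \cdots & \rho_{QQ} \end{pmatrix} \right\} \qquad (3)$$

in distribution, as $n \to \infty$. Here for $q, q' \in \{1, \ldots, Q\}$, $\mu_q = \Delta\, I_q(\beta)[n^{-1}\mathrm{var}\{U_q(\beta)\}]^{-1/2} + o(1)$, and $\rho_{qq'} = \mathrm{corr}\{U_q(\beta), U_{q'}(\beta)\} + o(1)$, as $n \to \infty$.

For the rest of this section, we derive expressions for the $\mu_q$'s and $\rho_{qq'}$'s. We introduce the notation

$$\varepsilon = \varepsilon(\beta) = e^{\beta}(1 + e^{\beta})^{-1} \qquad (4)$$

and

$$\Lambda(T_{i;qq'}) = \sum_{t=K_q}^{K_{q'}-1} H_{it}\lambda_t \quad \text{for } q, q' \in \{0, 1, \ldots, Q\} \text{ and } q < q'. \qquad (5)$$

We will also often omit the subscript $i$ for a generic subject. In the present randomization set-up, the treatment variable $Z = 0, 1$ with equal probability and is assumed independent of the follow-up process. Straightforward calculation leads to

$$I_q(\beta) = 2^{-1}\varepsilon(\beta)\, E\{\Lambda(T_{i;0q})\}.$$

In order to evaluate the $\mathrm{var}\{U_q(\beta)\}$'s and the correlations $\rho_{qq'}$, we need to evaluate covariances of the following form: $\mathrm{cov}\{U_q(\beta), U_{q'}(\beta)\}$, $q, q' \in \{1, \ldots, Q\}$. Notice that the score $U_q(\beta)$ is linear in the outcome variable $Y_{it}$. Under the local Poisson process regression model with frailty parameterized by variance $\theta$, the covariance of the $Y_{it}$'s conditional on the $Z_i$'s and the $H_{it}$'s becomes

$$\mathrm{cov}\{(Y_{it}, Y_{i't'}) \mid Z_i\text{'s} \;\&\; H_{it}\text{'s}\} = \delta_{ii'}\{\delta_{tt'} H_{it}\lambda_t e^{Z_i\beta} + \theta (H_{it}\lambda_t e^{Z_i\beta})(H_{i't'}\lambda_{t'} e^{Z_{i'}\beta})\}, \qquad (6)$$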
by first conditioning on the frailties $\nu_i$. Here $\delta_{ab} = 1$ if $a = b$, and 0 otherwise. Next, note that the expectations of the $U_q(\beta)$'s conditional on the $Z_i$'s and $H_{it}$'s are 0. Hence $\mathrm{cov}\{U_q(\beta), U_{q'}(\beta)\}$ is the same as $E\,\mathrm{cov}\{U_q(\beta), U_{q'}(\beta) \mid Z_i\text{'s} \;\&\; H_{it}\text{'s}\}$. Then, using (6), we obtain, for $q, q' \in \{1, \ldots, Q\}$,

$$n^{-1}\mathrm{cov}\{U_q(\beta), U_{q'}(\beta)\} = 2^{-1}\varepsilon(\beta)\, E\left( \sum_{t=0}^{K_q-1} \sum_{t'=0}^{K_{q'}-1} H_{it}H_{it'}\lambda_t\delta_{tt'} \right) + \theta\{\varepsilon(\beta)\}^2 E\left( \sum_{t=0}^{K_q-1} \sum_{t'=0}^{K_{q'}-1} H_{it}H_{it'}\lambda_t\lambda_{t'} \right) + o(1). \qquad (7)$$

The convergence (the $o(1)$ term) was proved by first showing that the conditional covariance converges almost surely, and then applying the dominated convergence theorem.

One thing worth mentioning is that the existence of frailty implies that the score statistics $U_q(\beta_0)$ no longer have independent increments. Using notation (5), equation (7) implies that

$$\mathrm{cov}\{U_1(\beta_0), U_2(\beta_0) - U_1(\beta_0)\} = \theta\{\varepsilon(\beta_0)\}^2 E\{\Lambda(T_{i;01})\Lambda(T_{i;12})\},$$

which is positive if the frailty parameter $\theta$ is positive. The independent increment structure has been a basis for much work in sequential analysis, e.g., Tsiatis, Boucher and Kim (1995) and Jennison and Turnbull (1989). Its failure means that we cannot use the usual test criteria, of the Pocock or the OF types for example.

The joint distribution of the test statistics depends on the $\mu_q$'s and the $\rho_{qq'}$'s. In the present model, these parameters depend on the frailty as follows:

$$\mu_q = n^{1/2}(\beta - \beta_0)\left[ 2\varepsilon^{-1}\{E(\Lambda(T_{0q}))\}^{-1} + 4\theta\, E\{(\Lambda(T_{0q}))^2\}\{E(\Lambda(T_{0q}))\}^{-2} \right]^{-1/2},$$

$$\rho_{qq'} = \frac{E\Lambda(T_{0q}) + 2\theta\varepsilon\, E\{\Lambda(T_{0q})\Lambda(T_{0q'})\}}{[E\{\Lambda(T_{0q})\} + 2\theta\varepsilon\, E\{(\Lambda(T_{0q}))^2\}]^{1/2}\,[E\{\Lambda(T_{0q'})\} + 2\theta\varepsilon\, E\{(\Lambda(T_{0q'}))^2\}]^{1/2}}, \qquad (8)$$

for $q, q' \in \{1, \ldots, Q\}$ and $q < q'$, where $\varepsilon$ is defined in (4). The elements $\rho_{q'q}$ are required to be the same as the $\rho_{qq'}$'s, so as to make the variance-covariance matrix in (3) symmetric. In the case when $Q = 2$, a single parameter $\rho = \rho_{12}$ determines the whole variance-covariance matrix.
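Formula (8) can be evaluated directly once the first two moments of the cumulative intensities are available. The sketch below is one hedged way to do so. The function names are introduced here for illustration, `theta` stands for the frailty variance, `eps` for the quantity defined in (4), and the numerical inputs are hypothetical. The helper `halftime_correlation` additionally assumes that every subject is fully followed, so that the cumulative intensities are constants and their moments are simple products.

```python
import math


def epsilon(beta):
    """Quantity defined in (4): e^beta / (1 + e^beta)."""
    return math.exp(beta) / (1.0 + math.exp(beta))


def mean_shift(delta, theta, eps, m1, m2):
    """mu_q in (8), with m1 = E{Lambda(T_0q)}, m2 = E{Lambda(T_0q)^2}
    and delta = n^{1/2} (beta - beta_0)."""
    return delta / math.sqrt(2.0 / (eps * m1) + 4.0 * theta * m2 / m1 ** 2)


def correlation(theta, eps, m1_q, m2_q, m1_r, m2_r, m_cross):
    """rho_{qq'} in (8) for q < q'.

    m1_q, m2_q: first and second moments of Lambda(T_0q);
    m1_r, m2_r: the same for Lambda(T_0q');
    m_cross: E{Lambda(T_0q) Lambda(T_0q')}.
    """
    num = m1_q + 2.0 * theta * eps * m_cross
    den_q = m1_q + 2.0 * theta * eps * m2_q
    den_r = m1_r + 2.0 * theta * eps * m2_r
    return num / math.sqrt(den_q * den_r)


def halftime_correlation(theta, beta, lam_total):
    """Two analyses, the first at half of the total cumulative intensity
    lam_total, with non-random follow-up (an assumption of this sketch)."""
    eps = epsilon(beta)
    l1, l2 = lam_total / 2.0, lam_total
    return correlation(theta, eps, l1, l1 * l1, l2, l2 * l2, l1 * l2)


if __name__ == "__main__":
    for theta in (0.0, 0.25, 1.0):            # hypothetical frailty variances
        rho = halftime_correlation(theta, beta=0.0, lam_total=6.0)
        print(theta, round(rho, 4))
```

In this setting the correlation equals $2^{-1/2}$ when `theta` is 0 and grows towards 1 as `theta` increases, while the mean shift for a given `delta` shrinks.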

Note that the $\rho_{qq'}$'s depend on the moments

$$M_{qq';d_1 d_2} = E[\{\Lambda(T_{0q})\}^{d_1}\{\Lambda(T_{0q'})\}^{d_2}],$$

where $d_1, d_2 \in \{0, 1, 2\}$ and $q, q' \in \{1, \ldots, Q\}$. They depend on the form of the baseline intensity function $\lambda_t$, as well as the details of the follow-up process. While various models are possible, in the following we focus on the simplest situation, where $\mathrm{cov}\{\Lambda(T_{0q}), \Lambda(T_{0q'})\} = 0$, $1 \le q, q' \le Q$, which could be assumed when the lengths of the follow-up periods in each stage are nearly the same for all individuals (uniform follow-up plans). In this situation,

$$\mu_q = 2^{-1}\Delta A_q^{1/2}, \quad 1 \le q \le Q, \qquad (9)$$

$$\rho_{qq'} = (A_q / A_{q'})^{1/2}, \quad \rho_{q'q} = \rho_{qq'}, \quad 1 \le q \le q' \le Q, \qquad (10)$$

where $A_q = (\theta + (2\varepsilon\Lambda_q)^{-1})^{-1}$, $\Lambda_q$ is a shorthand notation for $E\Lambda(T_{0q})$, $1 \le q \le Q$, and $\Delta = n^{1/2}(\beta - \beta_0)$. Obviously, the correlation $\rho_{qq'}$ is an increasing function of $\theta$ (non-negative). Hence, we have

Property 1 (P1) (Increased Correlations): For uniform follow-up plans, the existence of frailty leads to bigger pairwise correlations between the sequentially calculated test statistics (Wald or score) asymptotically.

Another implication of the frailty is the following:

Property 2 (P2) (Standard Error Inflation): The sequentially calculated test statistics (Wald or score), at each stage, have bigger asymptotic standard errors, due to frailty.

This is because (7) implies that $n^{-1}\mathrm{var}\{U_q(\beta)\}$ is an increasing function of $\theta$ (up to a term of $o(1)$), and the asymptotic variance of the test statistics (Wald or score) is proportional to $n^{-1}\mathrm{var}\{U_q(\beta)\}$. Note that (P2) is not restricted to uniform follow-up plans.

In the following section, we will investigate the stopping boundaries and the sample size planning. The implications of (P1) and (P2) for these design aspects will then be made clear. From now
on we will concentrate only on the (normalized) Wald test statistic. However, up to the leading order of large sample size, things are the same for the normalized score test, since the two test statistics are asymptotically equivalent.

3. Stopping rules and sample size planning

Consider now the following stopping rule for the $Q$-stage analysis: For $q = 1, \ldots, Q - 1$, if $|w_q(\beta_0)| \ge c_q^{(Q)}$ then stop the trial and reject $H_0$ at time $K_q$; otherwise continue the trial and perform a test at time $K_{q+1}$. At time $K_Q$, if $|w_Q(\beta_0)| \ge c_Q^{(Q)}$ then reject $H_0$; otherwise retain $H_0$. An exit boundary $\{c_1^{(Q)}, \ldots, c_Q^{(Q)}\}$ is needed to preserve the overall type-I error rate, say, $\alpha_I$. A sample size is also required to achieve a certain power, say, $1 - \alpha_{II}$, at the alternative hypothesis $H_a: \beta = \beta_a$. Suppose the true parameter is parameterized in $\Delta$ as $\beta(\Delta) = \beta_0 + n^{-1/2}\Delta$. Define

$$\Pi(\Delta) = \mathrm{pr}_{\beta(\Delta)}[\,|w_q(\beta_0)| < c_q^{(Q)},\; q = 1, \ldots, Q\,]. \qquad (11)$$

The error rate constraints are then simply

$$\Pi(0) = 1 - \alpha_I, \qquad (12)$$

$$\Pi(\Delta_a) = \alpha_{II}, \qquad (13)$$

with an alternative hypothesis $H_a: \beta = \beta_a = \beta(\Delta_a)$. The above two equations can be used to find the stopping boundary and the sample size (or the study duration, if the sample size is fixed). The integration involved in calculating the probability $\Pi(\Delta)$ can be performed by using a multivariate normal integration package such as MULNOR (Schervish, 1984).

In the following, we consider the Pocock type boundary (Pocock, 1977), where all $c_q^{(Q)}$'s are assumed equal to some constant $c_P^{(Q)}$, as well as the OF type boundary (O'Brien and Fleming, 1979), where $c_q^{(Q)} = c_{OF}^{(Q)}(Q/q)^{1/2}$ is assumed for some constant $c_{OF}^{(Q)}$ and $q = 1, \ldots, Q$. For the most commonly used type-I error rate $\alpha_I = 0.05$, we obtained from (12) the constants $c_P^{(Q)}$ and $c_{OF}^{(Q)}$
for $Q = 2$, as a function of the correlation $\rho = \rho_{12}$, by numerical integration using the Gaussian quadrature method with 48 nodes. The results are tabulated (Tables 1 and 2) for correlations ranging from 0.00 to 0.99, in regular increments. The value of $c_P^{(Q)}$ or $c_{OF}^{(Q)}$ for a negative correlation is the same as the value for a positive correlation with the same magnitude, due to the symmetry of the integration region.

In the following we illustrate the calculation of $\rho$ based on a uniform follow-up plan defined at the end of the last section. Assume a constant intensity rate $\lambda_t = r_0$. Then $\Lambda(T_{0q}) = r_0 T_{0q}$, where $T_{0q} := \sum_{t=0}^{K_q - 1} H_t$. Assume that all subjects enter the study at time 0, there is no loss to follow-up, and one interim analysis is scheduled at $t = K_1 = K_2/2$. In this case $\Lambda(T_{01}) = r_0 K_1$ and $\Lambda(T_{02}) = r_0 K_2$. Then the parameters for the asymptotic distribution of the Wald statistic are

$$\mu_q = n^{1/2}(\beta - \beta_0)\{2\varepsilon^{-1}(r_0 K_q)^{-1} + 4\theta\}^{-1/2}, \quad q = 1, 2, \quad \text{and} \quad \rho = 2^{-1/2}\left( \frac{1 + 2\theta\varepsilon r_0 K_2}{1 + \theta\varepsilon r_0 K_2} \right)^{1/2}, \qquad (14)$$

where $\varepsilon$ is defined in (4). Strictly speaking, the parameters $r_0$, $\beta$ (in $\varepsilon$) and the frailty parameter $\theta$ in (14) need to be estimated before the study. They could be estimated by the method of moments, or by a method of negative binomial regression (Lawless, 1987; Abu-Libdeh, Turnbull and Clark, 1990; Turnbull et al., 1997), based on some pilot study data set, or on results from previous studies of a similar nature.

One thing to notice in (14) is that the correlation $\rho$, as a function of the frailty parameter $\theta$, is never smaller than the correlation without frailty ($\theta = 0$), which is $\rho(\theta = 0) = 2^{-1/2} \approx 0.7071$. Comparing with Tables 1 and 2, we see that it will always be conservative to use $c_P^{(Q)} \approx 2.18$ and $c_{OF}^{(Q)} \approx 1.98$ in testing $H_0$, which corresponds to neglecting the frailty by taking $\theta = 0$ and $\rho = 2^{-1/2}$. If frailty is taken into account, the rejection critical values $c_P^{(Q)}$ and $c_{OF}^{(Q)}$ will be smaller, making it easier to reject $H_0$ for the same value of the Wald statistic.
This shows a major implication of property (P1) in the last section for the stopping boundaries. (P1) implies, by Slepian's inequality (see, e.g., Tong, 1980), that the existence of frailty allows the use of narrower stopping boundaries. The usual stopping boundaries obtained under the independent increments assumption, neglecting frailty, are conservative.

In the rest of this section, we consider extensions to multi-stage analyses ($Q > 2$). Multidimensional numerical integration algorithms have decreasing accuracy and reliability as $Q$ increases. However, a convenient alternative algorithm is available for uniform follow-up plans. In this situation, (9), (10) and Theorem 1 imply that

$$(A_q - A_{q-1})^{-1/2}\left[ A_q^{1/2}\{w_q(\beta_0) - \mu_q\} - A_{q-1}^{1/2}\{w_{q-1}(\beta_0) - \mu_{q-1}\} \right], \quad 1 \le q \le Q,$$

are independent standard normal random variables ($A_0 = 0$). This leads to the following iterative (1-dimensional) integration algorithm, in the same form as Armitage et al. (1969), for $\Pi(\Delta)$ defined in (11):

$$\Pi(\Delta) = \int_{-c_Q^{(Q)}}^{c_Q^{(Q)}} g(Q, z; \Delta)\,dz,$$

where $g(q, z; \Delta)$ is iteratively defined as follows:

$$g(1, z; 0) = \varphi(z) := (2\pi)^{-1/2}\exp(-z^2/2);$$

for $1 \le q \le Q - 1$,

$$g(q + 1, z; 0) = \left( \frac{A_{q+1}}{A_{q+1} - A_q} \right)^{1/2} \int_{-c_q^{(Q)}}^{c_q^{(Q)}} g(q, u; 0)\, \varphi\left\{ \frac{A_{q+1}^{1/2} z - A_q^{1/2} u}{(A_{q+1} - A_q)^{1/2}} \right\} du;$$

and

$$g(Q, z; \Delta) = g(Q, z; 0)\exp\{-8^{-1}A_Q\Delta^2 + 2^{-1}\Delta z (A_Q)^{1/2}\}.$$

This facilitates calculation of the stopping boundaries, powers and sample sizes for multi-stage designs.

When $Q = 2$, the single-parameter ($\rho$) parameterization was very convenient for setting up concise tables for stopping boundaries. For $Q > 2$, the stopping boundary will in general depend on a $Q$-dimensional symmetric matrix determined by the recruitment/follow-up model. This will
lead to difficulties in summarizing the design aspects of the interim analyses. However, in a class of recruiting schemes a simpler description is achievable for the sequential designs. Suppose, in addition to the uniform follow-up, we have an equal increments procedure, where $\Lambda_q = q\Lambda$, $1 \le q \le Q$, for some constant $\Lambda$. The parameterization of the $\mu_q$'s and $\rho_{qq'}$'s can then be further simplified:

$$\mu_q = D h_q(\eta), \quad \rho_{qq'} = h_q(\eta)/h_{q'}(\eta), \quad \rho_{q'q} = \rho_{qq'}, \quad 1 \le q \le q' \le Q, \qquad (15)$$

where $D = 2^{-1}\Delta(2\varepsilon\Lambda)^{1/2}$, $h_q(\eta) = ((\eta - 0.5)(1 - \eta)^{-1} + q^{-1})^{-1/2}$ and

$$\eta = \eta(\Lambda) := \rho_{12}^2 = (2\theta\varepsilon\Lambda + 0.5)(2\theta\varepsilon\Lambda + 1)^{-1}. \qquad (16)$$

(15) is easily obtained from (9) and (10) by noting that $A_q^{1/2} = (2\varepsilon\Lambda)^{1/2} h_q(\eta)$, $1 \le q \le Q$. Hence we can in this case parameterize the power function by $D$ and $\eta$ ($\eta \in [0.5, 1]$). The parameter $\eta$ contains the input of the frailty parameter $\theta$. When there is no frailty, $\eta = 0.5$.

Consider now the null hypothesis $\beta = \beta_0$. Note that $\Delta = n^{1/2}(\beta - \beta_0)$ in $D$ is 0, and so is $D$. The stopping boundaries can then be solved from (12), labeled by one parameter $\eta$ only. Table 3 lists $c_P^{(Q)}$ and $c_{OF}^{(Q)}$ for $Q = 2, 3, 4, 5$ and $\eta$ ranging from 0.50 to 0.90 in increments of 0.01, when $\alpha_I = 0.05$. When $Q = 1$, $c_P^{(Q)}$ and $c_{OF}^{(Q)}$ are simply $z_{\alpha_I/2} \approx 1.96$, where $z_\alpha$ is the $100(1 - \alpha)$th percentile of the standard normal cumulative distribution function.

Consider now the alternative hypothesis $\beta = \beta_a$. To solve for the sample size from (13), note that (13) involves the two variables $D$ and $\eta$ only through parameterization (15), and the sample size enters $D$ through $\Delta = n^{1/2}(\beta_a - \beta_0)$. We therefore solve $D$ from (13) as a function of $\eta$ to obtain $D = D(\eta)$, which also (implicitly) depends on $Q$, $\alpha_I$ and $\alpha_{II}$ (the error rates required), as well as the type $T$ of stopping boundaries ($T = P$ or $OF$). Then, the definition of $D$ below (15) leads to

$$n = 4\{D(\eta)\}^2(\beta_a - \beta_0)^{-2}(2\varepsilon\Lambda)^{-1}, \qquad (17)$$

where $\varepsilon = \varepsilon(\beta_a)$, and $\varepsilon(\cdot)$ is defined in (4). If $Q = 1$ and $\eta = 0.5$ (no frailty), $D(\eta)$ can be solved directly as $(z_{\alpha_I/2} + z_{\alpha_{II}})$. Hence, in a fixed design without frailty, the sample size needed for a study
length $Q\Lambda$ is

$$n_0 = 4(z_{\alpha_I/2} + z_{\alpha_{II}})^2(\beta_a - \beta_0)^{-2}(2\varepsilon\Lambda Q)^{-1}. \qquad (18)$$

$n_0$ is to be used as a reference sample size. The sample size required in (17) can then be expressed as $n = R_T^{(Q)}(\eta)\,n_0$, where the ratio $R_T^{(Q)}(\eta) = Q D(\eta)^2(z_{\alpha_I/2} + z_{\alpha_{II}})^{-2}$ is tabulated at $\alpha_I = 0.05$, and $\alpha_{II} = 0.20$ (Table 4) and $0.10$ (Table 5), for $Q = 2$ to 5 and $\eta = 0.50$ to $0.90$ in regular increments. In practice, the sample size could be obtained by first finding the fixed-design "frailtiless" sample size $n_0$ from (18), then multiplying by a ratio $R_T^{(Q)}(\eta)$ read from Table 4 or 5, if there is an initial estimate of the frailty parameter $\theta$ to determine $\eta$. We find it very convenient to base the sample size planning on these tables. For comparison, the first row $R^{(1)}$ in those tables lists the sample size inflation ratio required in fixed designs with $\eta > 0.5$, induced by the existence of frailty.

Now we comment on the implications of properties (P1) and (P2) of the last section for sample size planning. (P1) and (P2) have opposite implications. With fixed type-I and type-II error rates, (P1) alone would imply that a smaller sample size could be used in the presence of frailty. However, in reality the increase of standard error, due to (P2), is often overwhelming, and the net result is that a much bigger sample size is required, due to the existence of frailty. The increase in the required sample size can be seen from Tables 4 and 5, where the ratios $R_T^{(Q)}(\eta)$ increase with $\eta$, and hence also increase with the frailty parameter $\theta$.

4. Repeated confidence intervals

Repeated confidence intervals (RCIs) are a method which allows the study results to be evaluated flexibly at interim analyses without depending on rigid stopping criteria (Jennison and Turnbull, 1984; Lai, 1984; Coe and Tamhane, 1993). Here we construct the RCIs for recurrent events data with frailty.

Note first that the boundary $\{c_q^{(Q)}\}$ in the previous section is dependent on $\beta_0$ (the value of $\beta$ under $H_0$) through the correlations $\rho_{qq'}$, which depend on $\varepsilon(\beta_0)$. We in this section explicitly
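express such relation as $c_q^{(Q)} = c_q^{(Q)}(\beta_0)$. Equation (12) can then be rewritten as

$$\mathrm{pr}_{\beta_0}\{|w_q(\beta_0)| < c_q^{(Q)}(\beta_0),\; q = 1, \ldots, Q\} = 1 - \alpha_I.$$

Let the RCIs be

$$I_q^{(Q)} = \{\beta_0 : |w_q(\beta_0)| < c_q^{(Q)}(\beta_0)\}, \quad q = 1, \ldots, Q.$$

Then

$$\mathrm{pr}_{\beta_0}\{\beta_0 \in I_q^{(Q)},\; q = 1, \ldots, Q\} = 1 - \alpha_I.$$

Then $\mathrm{pr}_{\beta_0}\{\beta_0 \in I_\tau^{(Q)}\} \ge 1 - \alpha_I$ for any stopping rule, where $\tau$ denotes the stage at which the rule stops. Note that $w_q(\beta_0) = (\hat\beta_q - \beta_0)\{\widehat{\mathrm{var}}(\hat\beta_q)\}^{-1/2} = (\hat\beta_q - \beta_0)/\widehat{\mathrm{se}}(\hat\beta_q)$. When the sample size $n$ is large, the leading order approximation to the RCI $I_q^{(Q)}$ is determined by

$$I_q^{(Q)} = \{\beta_0 : |\hat\beta_q - \beta_0|/\widehat{\mathrm{se}}(\hat\beta_q) = |w_q(\beta_0)| < c_q^{(Q)}(\hat\beta_q)\}, \quad q = 1, \ldots, Q,$$

replacing $c(\beta_0)$ with $c(\hat\beta_q)$. The solution becomes

$$I_q^{(Q)} = \left( \hat\beta_q - c_q^{(Q)}(\hat\beta_q)\,\widehat{\mathrm{se}}(\hat\beta_q),\; \hat\beta_q + c_q^{(Q)}(\hat\beta_q)\,\widehat{\mathrm{se}}(\hat\beta_q) \right), \quad q = 1, \ldots, Q. \qquad (19)$$

5. Minimal cost analysis

For recurrent events study planning, we often have the choice of increasing the sample size or the study duration (or the expected number of events per subject) to achieve a certain power. The optimal combination of sample size and duration could be determined by a minimax-type cost analysis. This is made possible by the sample size calculation method presented in Section 3. We assume an equal increments procedure where $\Lambda_q = q\Lambda$, $1 \le q \le Q$.

Suppose the maximal cost of a study is $C_Q(n, \Lambda) = C_0 n(Q\Lambda + \Lambda_0)$, where $C_0$ is a constant, and $C_0\Lambda_0$ is the starting cost for recruiting one subject. $\Lambda_0$, termed the initial duration, is the expected number of events to be observed that would cost the same as recruiting one extra subject. Note that $n = n_0 R_T^{(Q)}(\eta) \propto R_T^{(Q)}(\eta)(Q\Lambda)^{-1}$ from Section 3, where $\eta = \eta(\Lambda)$ is defined in (16). We have

$$C_Q(n, \Lambda) \propto R_T^{(Q)}\{\eta(\Lambda_0 x)\}\{1 + (Qx)^{-1}\}$$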
where $x := \Lambda/\Lambda_0$. An alternative expression in terms of $\eta$ is

$$C_Q(n, \Lambda) \propto R_T^{(Q)}(\eta)\{1 + 2\theta\varepsilon\Lambda_0 Q^{-1}(1 - \eta)(\eta - 0.5)^{-1}\}. \qquad (20)$$

Up to a multiplicative constant, this could be calculated for each $\eta$ from the tables of $R_T^{(Q)}(\eta)$, to search for the minimizer $\eta_{op}$. Then the minimizer in $\Lambda$ is $\Lambda_{op} = (2\theta\varepsilon)^{-1}(\eta_{op} - 0.5)(1 - \eta_{op})^{-1}$. The optimal sample size is then

$$n_{op} = n_0 R_T^{(Q)}(\eta_{op}) = R_T^{(Q)}(\eta_{op})\,4(z_{\alpha_I/2} + z_{\alpha_{II}})^2(\beta_a - \beta_0)^{-2}(2\varepsilon Q\Lambda_{op})^{-1}.$$

In a fixed design ($Q = 1$), $R^{(1)}(\eta)$ can be analytically evaluated to be $0.5(1 - \eta)^{-1}$; the optimal points could then be solved as $\Lambda_{op} = (2\theta\varepsilon\Lambda_0)^{-1/2}\Lambda_0$, $R^{(1)}(\eta_{op}) = 1 + 2\theta\varepsilon\Lambda_{op}$ and

$$n_{op} = (1 + 2\theta\varepsilon\Lambda_{op})\,4(z_{\alpha_I/2} + z_{\alpha_{II}})^2(\beta_a - \beta_0)^{-2}(2\varepsilon Q\Lambda_{op})^{-1}.$$

6. An example

Let us use a data set from a rat experiment (Gail, Santner and Brown, 1980; Thompson et al., 1978) retrospectively to illustrate our method. The data set itself was obtained from a fixed design. However, we imagine that it was designed to have an interim analysis at halftime, and see what information a halftime analysis can provide in deciding whether we need to carry out the other half of the experiment. The data set has a relatively small sample size (48) and a relatively large average number of events (about 6). Further simulations will be required to test whether these conditions are good enough for our proposed asymptotic method to work satisfactorily. Here the main motivation is to use this example to illustrate the methodology.

Forty-eight female rats that remained tumor-free after sixty days of pre-treatment with a prevention drug (retinyl acetate) were randomized into two groups. In Group 1 (23 rats) they continue to receive
treatment ($Z = 1$); in Group 2 (25 rats) they receive placebo ($Z = 0$). Rats were palpated for tumors twice a week. For details see Thompson et al. (1978). Times of mammary tumor diagnoses were recorded, from which our response variables $Y_{it}$ are constructed. The objective of the study was to see if discontinuation of treatment leads to more tumors being diagnosed. Formally, we would like to test the null hypothesis $H_0: \beta = 0$ at level $\alpha_I = 0.05$. The original design was to follow all rats for a fixed length of time (122 days). However, imagine that an interim analysis was planned on the data gathered up to the 61st day (halftime), and we will see how the method described above can be applied.

We will use the local Poisson process model with frailty, as introduced in Section 2, and the uniform follow-up plan as described in Section 3, to calculate the correlation (14) and perform the analysis. Based on the halftime data ($q = 1$), we get the maximum partial likelihood estimate $\hat\beta_1 = -0.7549$, with the robust estimate of standard error $\widehat{\mathrm{se}}(\hat\beta_1) = 0.2427$. Here the subscript 1 denotes the first interim analysis. Note that the Z-value (or the Wald statistic) is the ratio $-0.7549/0.2427 = -3.1104$. If we decided in advance to use Pocock's boundary, then $c_1^{(2)} = c_P^{(2)}$. We notice from Table 1 that $c_P^{(2)}$ for any correlation $\rho$ is not as large in magnitude as our test statistic. So $|w_1(0)| = |\hat\beta_1|/\widehat{\mathrm{se}}(\hat\beta_1) \ge c_1^{(2)}$. Then we could stop the trial and reject $H_0$ at halftime. If the OF procedure was decided beforehand, we would be comparing the Z-value with $c_1^{(2)} = 2^{1/2}c_{OF}^{(2)}$. From Table 2 we again see that even the largest possible $c_{OF}^{(2)}$ will make $c_1^{(2)}$ smaller than the Z-value in magnitude. Therefore we decide to reject $H_0$ and discontinue the experiment at halftime. The actual analysis at Day 122 gives $\hat\beta_2$ = [value missing in the source], $\widehat{\mathrm{se}}(\hat\beta_2) = 0.1968$, and a Z-value of $-4.1819$.
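The boundary constants used in these comparisons can be approximated numerically. The sketch below is an illustration, not the author's program: it evaluates the probability of staying inside a two-sided boundary at every stage with the one-dimensional recursion described in Section 3, and solves for the Pocock and OF constants by bisection. The grid-based trapezoidal integration, the function names, the grid size, and the encoding of a two-stage correlation `rho` as `A = [rho**2, 1.0]` (so that the correlation is the square root of the ratio of the two entries) are all choices made here; the recursion requires `rho < 1`.

```python
import math


def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def trapezoid(ys, xs):
    """Trapezoidal rule for values ys on the grid xs."""
    return sum(0.5 * (xs[k + 1] - xs[k]) * (ys[k + 1] + ys[k])
               for k in range(len(xs) - 1))


def grid(c, m):
    """m equally spaced points on [-c, c]."""
    return [-c + 2.0 * c * k / (m - 1) for k in range(m)]


def stay_probability(bounds, A, m=121):
    """P(|w_q| < bounds[q] at every stage) under H0, for statistics with
    corr(w_q, w_q') = sqrt(A[q] / A[q']) and A strictly increasing."""
    u = grid(bounds[0], m)
    g = [phi(x) for x in u]                      # g(1, u; 0)
    for q in range(len(bounds) - 1):
        z = grid(bounds[q + 1], m)
        sd = math.sqrt(A[q + 1] - A[q])
        a_new, a_old = math.sqrt(A[q + 1]), math.sqrt(A[q])
        g = [a_new / sd * trapezoid(
                 [gk * phi((a_new * zj - a_old * uk) / sd)
                  for gk, uk in zip(g, u)], u)
             for zj in z]                        # g(q + 1, z; 0)
        u = z
    return trapezoid(g, u)


def boundary_constant(shape, A, alpha=0.05, steps=22):
    """Bisection for c such that bounds = c * shape give overall level alpha."""
    lo, hi = 1.0, 5.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if 1.0 - stay_probability([mid * s for s in shape], A) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    z_value = -0.7549 / 0.2427                   # interim Wald statistic above
    for rho in (0.7071, 0.85):                   # 0.85 is a hypothetical value
        A = [rho ** 2, 1.0]
        c_p = boundary_constant([1.0, 1.0], A)
        c_of = boundary_constant([math.sqrt(2.0), 1.0], A)
        print(rho, round(c_p, 3), round(c_of, 3),
              abs(z_value) > c_p, abs(z_value) > math.sqrt(2.0) * c_of)
```

For `rho = 0.7071` the constants should come out close to the values 2.18 and 1.98 quoted in Section 3, and a larger correlation gives smaller constants, in line with property (P1).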
RCIs with an overall confidence level of 95% ($\alpha_I = 0.05$) can provide a range of plausible values for $\beta$ at the interim analysis, without conforming to a rigid stopping rule. In order to calculate the critical value $c_1^{(2)}$, we need to find the correlation $\rho$ from (14). The constants $r_0$ and $\theta$ are needed from some pilot study. Since we do not have such information, two approaches could be used. One is to simply use the lower bound of $\rho$, which is 0.7071. Another is to estimate $r_0$ and $\theta$ by a negative
18 binomial regression or a method of moment estimation from the data set accumulated up to the time of the interim analysis. The second approach, unlike in the case of tests where boundaries need to be pre-specied bef ore the experiment, is legitimate at present for obtaining RCIs up to the leading order of large sample sizes. This is because the estimates of r 0 and are used only as approximations to their true values. The rst approach ( = 0:7071) is conservative but is simpler, and will result in a slightly wider RCI. We did perform the second approach for the half-time data set and found that the resulting RCIs are very close to the conservative ones. Here we decide only to report the conservative results, corresponding to using = 0:7071. For RCI derived from the ocock's procedure, we use c (2) 1 = c (2) = 2:18 from = 0:7071, in place of the coecient c (2) 1 ( ^ 1 ) in (19). We obtain I 1 = (?1:28;?0:226). For OF's procedure we use c (2) 1 = 2 1=2 c (2) O = (1:41)(1:98) and get (?1:43; 0:0763). These, in the scale of of the risk ratio (e ), become the intervals (0.277, 0.797) and (0.239, 0.927) respectively. These imply that the treatment eect could range from roughly none, to reducing the freuency of tumors to about a uarter. The exibility of RCI approach allows the decision on the continuation of the experiment being independent of the rigid stopping rules. If we decide to carry on the experiment, we can obtain a second RCI at Day 122 based on the complete data. Using a correlation of 0:7071 to obtain a most conservative interval, we get the following intervals in the scale of risk ratio (e ). For ocock type RCI we obtain (0.286, 0.674), while for OF type we obtain (0.298, 0.648). Notice that the OF type gives a wider RCI for the rst interim analysis, but gives a narrower RCI for the later one. In conclusion, we nd that the treatment eect reduces the tumor freuency to about one-third to two-thirds of the control group rats. 
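The halftime intervals above can be reproduced approximately. This is a minimal sketch, assuming the repeated confidence interval takes the usual form of estimate plus or minus critical value times standard error on the log risk ratio scale; equation (19) itself is not reproduced here, and the helper names are hypothetical.

```python
import math

def repeated_ci(estimate, std_err, critical_value):
    """Interval estimate +/- critical_value * std_err (log risk ratio scale)."""
    half_width = critical_value * std_err
    return (estimate - half_width, estimate + half_width)

def to_risk_ratio(interval):
    """Map a log-scale interval to the risk ratio scale by exponentiating."""
    lower, upper = interval
    return (math.exp(lower), math.exp(upper))

# Halftime estimate and robust standard error reported in the text.
beta_hat_1, se_1 = -0.7549, 0.2427

# Conservative critical values at correlation 0.7071, as quoted in the text.
pocock_ci = repeated_ci(beta_hat_1, se_1, 2.18)
of_ci = repeated_ci(beta_hat_1, se_1, 1.41 * 1.98)

pocock_rr = to_risk_ratio(pocock_ci)
of_rr = to_risk_ratio(of_ci)

for name, ci, rr in (("Pocock", pocock_ci, pocock_rr), ("OF", of_ci, of_rr)):
    print(f"{name}: log scale ({ci[0]:.3f}, {ci[1]:.3f}), "
          f"risk ratio ({rr[0]:.3f}, {rr[1]:.3f})")
```

Small differences from the reported endpoints are expected, since the critical values entered here are rounded to two decimals.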
Now imagine that the present study is a pilot study for the purpose of estimating the sample size for the study of another drug, with the same timetable of scheduled analyses, i.e., an interim analysis at Day 61 ($K_1$), plus a possible final analysis at Day 122 ($K_2$). The present pilot study provides an estimate of $r_0 = 6.04/122$ (or about 1 tumor every 20 days), as well as a frailty parameter estimate of 0.2665, from a negative binomial regression (see, e.g., Abu-Libdeh et al., 1990; or Turnbull et al., 1997). Therefore we can use the rough values $r_0 = 6/122$ and a frailty parameter of 0.3 in our sample size calculation. Suppose we are interested in detecting a drug effect of 80% in risk ratio, which corresponds to an alternative hypothesis $H_a: \beta = \beta_a = \log(0.80) = -0.2231$, with power $1 - \alpha_{II} = 0.80$. Then the quantity defined in (4), evaluated at $\beta_a$, equals 0.4444. These lead to a correlation $\rho = 0.8498$. From Tables 1 and 2 we obtain $c_1^{(2)} = c_2^{(2)} = c_P^{(2)} = 2.133$ for Pocock's boundary; and $c_1^{(2)} = 2^{1/2} c_{OF}^{(2)} = 2.780$, $c_2^{(2)} = c_{OF}^{(2)} = 1.966$ for the OF boundary. Then we can use equation (13) to estimate the sample size. Alternatively we could use Table 4. Note that $\rho_{12} = 0.85$; the corresponding parameter value used to enter Table 4 is 0.73 (rounded up). From (18), we obtain a reference sample size $n_0$ of 119. According to Table 4, sample size ratios of 2.8817 and 2.7086 are needed for power 0.80, leading to adjusted sample sizes $n$ of 343 and 323 for the Pocock and OF designs, respectively. If frailty were neglected (zero frailty, with the frailty-dependent parameter equal to 0.5), however, the sample size ratios for the Pocock and OF designs with $Q = 2$ would lead to corresponding sample sizes of 132 and 120 that would be naively planned. The correct sample sizes are more than twice the naive ones. This increase in sample size is required, however, to achieve the desired power. If the naive sample sizes were used, the real power would fall well short of the desired 0.80 (0.416 for the OF procedure).

It is also interesting to look at the sample size needed for a fixed design, if the error rates are specified to be the same as before. Suppose we plan to observe the subjects for 122 days, and assume that $r_0 K_Q$ is about 6, and the frailty is taken as 0.3 as before. In the previous formalism, we substitute $r_0 K_Q = 6$ when carrying out the calculation in (16). The result is 0.81.
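The arithmetic behind the sequential-design numbers quoted above can be sketched as follows. The rounding-up rule is an assumption that happens to match the reported sample sizes; the tabulated ratios and the reference size are simply copied from the text, and all names are hypothetical.

```python
import math

def adjusted_sample_size(reference_n, ratio):
    """Scale the reference sample size by a tabulated ratio and round up."""
    return math.ceil(reference_n * ratio)

# Alternative hypothesis on the log risk ratio scale (80% risk ratio).
beta_a = math.log(0.80)

# Reference sample size and Table 4 ratios as quoted in the text.
n0 = 119
ratios = {"Pocock": 2.8817, "OF": 2.7086}

sizes = {design: adjusted_sample_size(n0, r) for design, r in ratios.items()}

# First-stage OF boundary implied by the final-stage constant 1.966.
of_c1 = math.sqrt(2) * 1.966

print(f"beta_a = {beta_a:.4f}")
print(f"adjusted sample sizes: {sizes}")
print(f"OF first-stage boundary = {of_c1:.3f}")
```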
The corresponding sample size ratio $R^{(1)}$, applied to the previously obtained reference sample size $n_0 = 119$, gives an adjusted sample size $n \approx 314$. Note that when frailty exists (frailty parameter 0.3), the sample size required for a fixed design (314) is only slightly smaller than that required for a sequential design (e.g., 323 for the OF design from the last paragraph). On the other hand, a sequential design also allows the possibility of a shorter study period through possible stopping at halftime. These two observations suggest that in trials with recurrent events data with frailty, sequential designs are to be recommended.

Now we consider a cost analysis. The Type-I and Type-II error rates are required to be 0.05 and 0.20, respectively. Suppose the initial duration described in Section 5 is 4, with $\beta_a = \log(0.8)$ and a frailty parameter of 0.3 as before. We used (20) and Table 4 to obtain optimal values of 0.68 and 2.11 (the latter corresponding to about 40 days of follow-up at each stage) and $n_{op} = 385$ for the Pocock design; and optimal values of 0.67 and 1.93 (the latter again corresponding to about 40 days of follow-up at each stage) and $n_{op} = 374$ for the OF design. Using these optimal sample sizes and durations, it is straightforward to check that the OF design has the smaller optimal maximal cost $C_Q$ (93% of that of the Pocock design).

7. Discussion

This paper is an attempt to prescribe how to perform interim analyses for studies with recurrent events data with frailty. At its most general level, our approach does not impose independent increment structures on the sequential test statistics. As a result, there may be other problems with non-independent increments of test statistics to which the present method could be applied. For example, Tables 1 and 2 depend only on the correlation $\rho$, rather than on the underlying mechanism by which the correlation is induced. However, computation is most convenient when follow-up variation is negligible and the iterative one-dimensional integration technique can be applied. For equal-increment procedures, the parameterization can be further simplified, and tables for stopping boundaries and sample size planning are provided for 2- to 5-stage analyses. The failure of the independent increment structure leads, in this case, to a one-parameter family of stopping boundaries labeled by a parameter depending on the frailty, which reduce to the usual stopping boundaries when this parameter equals 0.5.
The impact of the frailty on the design aspects is explicitly investigated. The existence of frailty induces extra correlation between the sequentially calculated statistics, leading to smaller stopping boundary constants $c_P^{(Q)}$ and $c_{OF}^{(Q)}$ for a pre-specified type-I error rate. However, due to the inflation of the standard error caused by the frailty, the sample size needed to achieve a certain power will often be larger.

Our methods are asymptotic and depend on the large sample approximation. The agreement of the empirical and nominal type-I error rates is studied in Cook and Lawless (1996), in the situation of constant recruitment rates. Further such studies under different parameter ranges and different recruitment plans will be helpful. Our method will probably be useful when the initial-stage analysis already involves a relatively large number of subjects (e.g., > 100), and a decision on whether or not to continue the follow-up or recruitment is to be made at each stage.

There are still many open questions left in this direction. Various recruitment procedures of practical interest may be considered in calculating the correlations. A different problem is to look at sequential bioequivalence tests involving recurrent events outcomes with frailty. When multivariate asymptotic normality does not hold but marginal asymptotic normality for each sequentially calculated score-type statistic does, conservative stopping boundaries may be obtained from Bonferroni-type probability inequalities, which should work well for a small number of interim analyses such as 5 or 6. When marginal asymptotic normality fails as well, e.g., due to a very small sample size at the initial stage, Chebyshev-type inequalities (or their correlation-adjusted versions) may provide conservative stopping boundaries; the drawback is that they may be very conservative. This is the price to be paid for being robust in terms of the distributions of the test statistics.
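To give a feel for how conservative such inequality-based boundaries are, here is a minimal sketch of one common construction, not taken from the paper: a constant two-sided boundary from the Bonferroni (union) bound under marginal normality, and one from Chebyshev's inequality assuming only that each standardized statistic has mean zero and unit variance under the null. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def bonferroni_boundary(num_analyses, alpha=0.05):
    """Constant two-sided boundary: each look is tested at level alpha / Q."""
    per_look = alpha / num_analyses
    return NormalDist().inv_cdf(1.0 - per_look / 2.0)

def chebyshev_boundary(num_analyses, alpha=0.05):
    """Constant boundary from P(|W| >= c) <= 1 / c^2 plus the union bound."""
    return math.sqrt(num_analyses / alpha)

for q in (2, 5):
    print(f"Q = {q}: Bonferroni {bonferroni_boundary(q):.3f}, "
          f"Chebyshev {chebyshev_boundary(q):.3f}")
```

For two analyses the Bonferroni constant is only slightly above the Pocock value of 2.18 quoted for correlation 0.7071, whereas the Chebyshev constant is several times larger, which illustrates the conservativeness noted above.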
Acknowledgment

The author is deeply indebted to Professor Bruce Turnbull for helpful discussions and teaching on sequential analysis. He is also grateful to Professor Ajit Tamhane for commenting on the manuscript. This research is partly supported by the URGC Award of Northwestern University, and by a U.S. National Science Foundation grant (DMS).

References

Abu-Libdeh, H., Turnbull, B. W., and Clark, L. C. (1990). Analysis of multi-type recurrent events in longitudinal studies; application to a skin cancer prevention trial. Biometrics 46.

Andersen, P. K., Borgan, O., Gill, R. D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. New York: Springer-Verlag.

Armitage, P., McPherson, C. K., and Rowe, B. C. (1969). Repeated significance tests on accumulating data. Journal of the Royal Statistical Society, Series A 132.

Armitage, P., Stratton, I. M. and Worthington, H. V. (1985). Repeated significance tests for clinical trials with a fixed number of patients and variable follow-up. Biometrics 41.

Coe, P. R. and Tamhane, A. C. (1993). Exact repeated confidence intervals for Bernoulli parameters in a group sequential clinical trial. Controlled Clinical Trials 14.

Cook, R. (1995). The design and analysis of randomized trials with recurrent events. Statistics in Medicine 14.

Cook, R. and Lawless, J. F. (1996). Interim monitoring of longitudinal comparative studies with recurrent event responses. Biometrics 52.

Cook, R. J., Lawless, J. F. and Nadeau, C. (1996). Robust tests for treatment comparisons based on recurrent event responses. Biometrics 52.

Gail, M. H., Santner, T. J. and Brown, C. C. (1980). An analysis of comparative carcinogenesis experiments based on multiple times to tumor. Biometrics 36.

Geary, D. N. (1988). Sequential testing in clinical trials with repeated measurements. Biometrika 75.

Jennison, C. and Turnbull, B. W. (1984). Repeated confidence intervals for group sequential clinical trials. Controlled Clinical Trials 5.

Jennison, C. and Turnbull, B. W. (1989). Interim analyses: the repeated confidence interval approach (with discussion). Journal of the Royal Statistical Society, Series B 51.

Jiang, W. (1996). Aspects of Misspecification in Statistical Models: Applications to Latent Variables, Measurement Error, Random Effects, Omitted Covariates and Incomplete Data. Ph.D. Thesis, Cornell University.

Jiang, W. and Turnbull, B. W. (1997). Semiparametric regression models for repeated events with random effects and measurement error. (Submitted to Journal of the American Statistical Association.)

Lai, T. L. (1984). Incorporating scientific, ethical and economic considerations into the design of clinical trials in the pharmaceutical industry: a sequential approach. Communications in Statistics (A), Theory and Methods 13.

Lan, K. K. and DeMets, D. L. (1983). Discrete sequential boundaries for clinical trials. Biometrika 70.

Lawless, J. F. (1987). Regression methods for Poisson process data. Journal of the American Statistical Association 82.

Lawless, J. F. and Nadeau, C. (1995). Some simple robust methods for the analysis of recurrent events. Technometrics 37.

Lee, J. W. and DeMets, D. L. (1991). Sequential comparison of changes with repeated measurements data. Journal of the American Statistical Association 86.

Lee, J. W. and DeMets, D. L. (1992). Sequential rank tests with repeated measurements in clinical trials. Journal of the American Statistical Association 87.

O'Brien, P. C. and Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics 35.

Oakes, D. (1992). Frailty models for multiple event times. In Survival Analysis: State of the Art, Ed. J. P. Klein and P. K. Goel. Netherlands: Kluwer Academic Publishers.

Pocock, S. J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika 64.

Schervish, M. J. (1984). Multivariate normal probabilities with error bound (with corrections in 1985). Applied Statistics 33.

Thompson, H. F., Grubbs, C. J., Moon, R. C. and Sporn, M. B. (1978). Continual requirement of retinoid for maintenance of mammary cancer inhibition. Proceedings of the Annual Meeting of the American Association for Cancer Research 19, 74.

Tong, Y. L. (1980). Probability Inequalities in Multivariate Distributions. New York: Academic Press.

Tsiatis, A. A., Boucher, H. and Kim, K. (1995). Sequential methods for parametric survival models. Biometrika 82.

Turnbull, B. W., Jiang, W. and Clark, L. C. (1997). Regression models for recurrent event data: parametric random effects models with measurement error. Statistics in Medicine 16.

Appendix A

Lemma 1. If the true parameter is $\beta = \beta_0 + n^{-1/2}\delta$ (local alternative), then under mild regularity conditions, $n^{-1/2}\tilde{U}(\beta_0) \to \mathrm{Normal}\{\tilde{I}(\beta)\delta,\; n^{-1}V\}$ in distribution as $n \to \infty$. Here $\tilde{U}(b)$ is the column of the $U_q(b)$'s for each $b \in R$; $V$ is the (asymptotic) variance-covariance matrix of $\tilde{U}(\beta)$; and $\tilde{I}(\beta)$ is the column of the partial likelihood informations $I_q(\beta)$, where $I_q(\beta)$ can be obtained from the asymptotic limit of $n^{-1}\hat{I}$ defined by (2), when restricted to the data in $t \in [0, K_q)$.

We give here only an outline of the proof. We claim that $n^{-1/2}\tilde{U}(\beta)$ is asymptotically normal with mean 0 if the true parameter is $\beta$. Then the Taylor expansion $n^{-1/2}\tilde{U}(\beta_0) = n^{-1/2}\tilde{U}(\beta) + \tilde{I}(\beta)\delta + o_p(1)$ is used to obtain Lemma 1, using Slutsky's theorem.

To prove the claimed asymptotic normality of $n^{-1/2}\tilde{U}(\beta)$, it suffices to show that $n^{-1/2}U^{-}(\beta)$ is asymptotically normal, where $U^{-}(\beta)$ is the column of the differences $\{U_q(\beta) - U_{q-1}(\beta)\}$ (with $U_0(\beta) = 0$). We show this by noting that $U^{-}(\beta)$ can be regarded as a usual score vector of dimension $Q$. This follows by constructing a $Q$-dimensional time-dependent covariate vector $\tilde{Z}_{it} = (\tilde{Z}_{it1}, \ldots, \tilde{Z}_{itQ})'$, where $\tilde{Z}_{itq} = Z_i 1_{qt}$, with $1_{qt} = 1$ if $t \in [K_{q-1}, K_q)$ and 0 otherwise, for $q = 1, \ldots, Q$. Construct also a $Q$-dimensional vector $\tilde{b} = (b_1, \ldots, b_Q)'$. Then $U^{-}(\beta) = \nabla_{\tilde{b}} \tilde{L}(\tilde{b})|_{\tilde{b} = \tilde{\beta}}$, which is the score vector for the log partial likelihood

$$\tilde{L}(\tilde{b}) = \log \prod_{i=1}^{n} \prod_{t=0}^{K_Q - 1} \left( \frac{e^{\tilde{Z}_{it}' \tilde{b}}}{\sum_{j=1}^{n} H_{jt}\, e^{\tilde{Z}_{jt}' \tilde{b}}} \right)^{Y_{it}},$$

taking value at $\tilde{b} = \tilde{\beta} = (\beta, \ldots, \beta)'$. Note that $\tilde{\beta}$ corresponds to the true parameter in this formalism, since $E\,\nabla_{\tilde{b}} \tilde{L}(\tilde{b})|_{\tilde{b} = \tilde{\beta}} = 0$.

Finally, since $U^{-}(\beta) = \nabla_{\tilde{b}} \tilde{L}(\tilde{b})|_{\tilde{b} = \tilde{\beta}}$ has the form of a usual score vector, there are several different methods to prove that it is asymptotically normal with mean 0. A proof can be made using a multivariate central limit theorem, by first showing that $U^{-}(\beta)$ is a sum of $n$ i.i.d. random vectors plus a term of order $o_p(1)$; or using a martingale central limit theorem; or alternatively by recognizing $\tilde{L}(\tilde{b})$ as the profile likelihood for a system of $n$ independent Poisson processes.

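Appendix B

Five Useful Tables

Table 1. $c_P^{(2)}$ as a function of $\rho$ ($\alpha_I = 0.05$).

Table 2. $c_{OF}^{(2)}$ as a function of $\rho$ ($\alpha_I = 0.05$).

Table 3. $c_P^{(Q)}$ and $c_{OF}^{(Q)}$ as a function of the frailty-dependent parameter; $Q = 2, 3, 4, 5$ ($\alpha_I = 0.05$). Columns: $c_P^{(2)}$, $c_{OF}^{(2)}$, $c_P^{(3)}$, $c_{OF}^{(3)}$, $c_P^{(4)}$, $c_{OF}^{(4)}$, $c_P^{(5)}$, $c_{OF}^{(5)}$.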

Independent Increments in Group Sequential Tests: A Review

Independent Increments in Group Sequential Tests: A Review Independent Increments in Group Sequential Tests: A Review KyungMann Kim kmkim@biostat.wisc.edu University of Wisconsin-Madison, Madison, WI, USA July 13, 2013 Outline Early Sequential Analysis Independent

More information

Group Sequential Designs: Theory, Computation and Optimisation

Group Sequential Designs: Theory, Computation and Optimisation Group Sequential Designs: Theory, Computation and Optimisation Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj 8th International Conference

More information

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj

More information

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,

More information

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping : Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj InSPiRe Conference on Methodology

More information

Published online: 10 Apr 2012.

Published online: 10 Apr 2012. This article was downloaded by: Columbia University] On: 23 March 215, At: 12:7 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 172954 Registered office: Mortimer

More information

Biometrika Trust. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika.

Biometrika Trust. Biometrika Trust is collaborating with JSTOR to digitize, preserve and extend access to Biometrika. Biometrika Trust Discrete Sequential Boundaries for Clinical Trials Author(s): K. K. Gordon Lan and David L. DeMets Reviewed work(s): Source: Biometrika, Vol. 70, No. 3 (Dec., 1983), pp. 659-663 Published

More information

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data

Efficiency Comparison Between Mean and Log-rank Tests for. Recurrent Event Time Data Efficiency Comparison Between Mean and Log-rank Tests for Recurrent Event Time Data Wenbin Lu Department of Statistics, North Carolina State University, Raleigh, NC 27695 Email: lu@stat.ncsu.edu Summary.

More information

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs Christopher Jennison Department of Mathematical Sciences, University of Bath http://people.bath.ac.uk/mascj

More information

SAMPLE SIZE RE-ESTIMATION FOR ADAPTIVE SEQUENTIAL DESIGN IN CLINICAL TRIALS

SAMPLE SIZE RE-ESTIMATION FOR ADAPTIVE SEQUENTIAL DESIGN IN CLINICAL TRIALS Journal of Biopharmaceutical Statistics, 18: 1184 1196, 2008 Copyright Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543400802369053 SAMPLE SIZE RE-ESTIMATION FOR ADAPTIVE

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

The Design of a Survival Study

The Design of a Survival Study The Design of a Survival Study The design of survival studies are usually based on the logrank test, and sometimes assumes the exponential distribution. As in standard designs, the power depends on The

More information

ingestion of selenium tablets for plasma levels to rise; another explanation may be that selenium aects only initiation not promotion of tumors, so tu

ingestion of selenium tablets for plasma levels to rise; another explanation may be that selenium aects only initiation not promotion of tumors, so tu Likelihood Ratio Tests for a Change Point with Survival Data By XIAOLONG LUO Department of Biostatistics, St. Jude Children's Research Hospital, P.O. Box 318, Memphis, Tennessee 38101, U.S.A. BRUCE W.

More information

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016 Statistics 255 - Survival Analysis Presented March 8, 2016 Dan Gillen Department of Statistics University of California, Irvine 12.1 Examples Clustered or correlated survival times Disease onset in family

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Goodness-of-Fit Tests With Right-Censored Data by Edsel A. Pe~na Department of Statistics University of South Carolina Colloquium Talk August 31, 2 Research supported by an NIH Grant 1 1. Practical Problem

More information

Statistics and Probability Letters. Using randomization tests to preserve type I error with response adaptive and covariate adaptive randomization

Statistics and Probability Letters. Using randomization tests to preserve type I error with response adaptive and covariate adaptive randomization Statistics and Probability Letters ( ) Contents lists available at ScienceDirect Statistics and Probability Letters journal homepage: wwwelseviercom/locate/stapro Using randomization tests to preserve

More information

Adaptive Extensions of a Two-Stage Group Sequential Procedure for Testing a Primary and a Secondary Endpoint (II): Sample Size Re-estimation

Adaptive Extensions of a Two-Stage Group Sequential Procedure for Testing a Primary and a Secondary Endpoint (II): Sample Size Re-estimation Research Article Received XXXX (www.interscience.wiley.com) DOI: 10.100/sim.0000 Adaptive Extensions of a Two-Stage Group Sequential Procedure for Testing a Primary and a Secondary Endpoint (II): Sample

More information

Symmetric Tests and Condence Intervals for Survival Probabilities and Quantiles of Censored Survival Data Stuart Barber and Christopher Jennison Depar

Symmetric Tests and Condence Intervals for Survival Probabilities and Quantiles of Censored Survival Data Stuart Barber and Christopher Jennison Depar Symmetric Tests and Condence Intervals for Survival Probabilities and Quantiles of Censored Survival Data Stuart Barber and Christopher Jennison Department of Mathematical Sciences, University of Bath,

More information

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work SSUI: Presentation Hints 1 Comparing Marginal and Random Eects (Frailty) Models Terry M. Therneau Mayo Clinic April 1998 SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that

More information

Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models

Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models Augustin: Some Basic Results on the Extension of Quasi-Likelihood Based Measurement Error Correction to Multivariate and Flexible Structural Models Sonderforschungsbereich 386, Paper 196 (2000) Online

More information

Adaptive Designs: Why, How and When?

Adaptive Designs: Why, How and When? Adaptive Designs: Why, How and When? Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj ISBS Conference Shanghai, July 2008 1 Adaptive designs:

More information

Group-Sequential Tests for One Proportion in a Fleming Design

Group-Sequential Tests for One Proportion in a Fleming Design Chapter 126 Group-Sequential Tests for One Proportion in a Fleming Design Introduction This procedure computes power and sample size for the single-arm group-sequential (multiple-stage) designs of Fleming

More information

Multi-state Models: An Overview

Multi-state Models: An Overview Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed

More information

Models for Multivariate Panel Count Data

Models for Multivariate Panel Count Data Semiparametric Models for Multivariate Panel Count Data KyungMann Kim University of Wisconsin-Madison kmkim@biostat.wisc.edu 2 April 2015 Outline 1 Introduction 2 3 4 Panel Count Data Motivation Previous

More information

Rene Tabanera y Palacios 4. Danish Epidemiology Science Center. Novo Nordisk A/S Gentofte. September 1, 1995

Rene Tabanera y Palacios 4. Danish Epidemiology Science Center. Novo Nordisk A/S Gentofte. September 1, 1995 Estimation of variance in Cox's regression model with gamma frailties. Per Kragh Andersen 2 John P. Klein 3 Kim M. Knudsen 2 Rene Tabanera y Palacios 4 Department of Biostatistics, University of Copenhagen,

More information

Mantel-Haenszel Test Statistics. for Correlated Binary Data. Department of Statistics, North Carolina State University. Raleigh, NC

Mantel-Haenszel Test Statistics. for Correlated Binary Data. Department of Statistics, North Carolina State University. Raleigh, NC Mantel-Haenszel Test Statistics for Correlated Binary Data by Jie Zhang and Dennis D. Boos Department of Statistics, North Carolina State University Raleigh, NC 27695-8203 tel: (919) 515-1918 fax: (919)

More information

An Adaptive Futility Monitoring Method with Time-Varying Conditional Power Boundary

An Adaptive Futility Monitoring Method with Time-Varying Conditional Power Boundary An Adaptive Futility Monitoring Method with Time-Varying Conditional Power Boundary Ying Zhang and William R. Clarke Department of Biostatistics, University of Iowa 200 Hawkins Dr. C-22 GH, Iowa City,

More information

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis. Estimating the Mean Response of Treatment Duration Regimes in an Observational Study Anastasios A. Tsiatis http://www.stat.ncsu.edu/ tsiatis/ Introduction to Dynamic Treatment Regimes 1 Outline Description

More information

ADJUSTED POWER ESTIMATES IN. Ji Zhang. Biostatistics and Research Data Systems. Merck Research Laboratories. Rahway, NJ

ADJUSTED POWER ESTIMATES IN. Ji Zhang. Biostatistics and Research Data Systems. Merck Research Laboratories. Rahway, NJ ADJUSTED POWER ESTIMATES IN MONTE CARLO EXPERIMENTS Ji Zhang Biostatistics and Research Data Systems Merck Research Laboratories Rahway, NJ 07065-0914 and Dennis D. Boos Department of Statistics, North

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information

Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim

Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim Tests for trend in more than one repairable system. Jan Terje Kvaly Department of Mathematical Sciences, Norwegian University of Science and Technology, Trondheim ABSTRACT: If failure time data from several

More information

The SEQDESIGN Procedure

The SEQDESIGN Procedure SAS/STAT 9.2 User s Guide, Second Edition The SEQDESIGN Procedure (Book Excerpt) This document is an individual chapter from the SAS/STAT 9.2 User s Guide, Second Edition. The correct bibliographic citation

More information

Inverse Sampling for McNemar s Test

Inverse Sampling for McNemar s Test International Journal of Statistics and Probability; Vol. 6, No. 1; January 27 ISSN 1927-7032 E-ISSN 1927-7040 Published by Canadian Center of Science and Education Inverse Sampling for McNemar s Test

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

A Sampling of IMPACT Research:

A Sampling of IMPACT Research: A Sampling of IMPACT Research: Methods for Analysis with Dropout and Identifying Optimal Treatment Regimes Marie Davidian Department of Statistics North Carolina State University http://www.stat.ncsu.edu/

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;

More information
