Sample Size Calculation Based on the Semiparametric Analysis of Short-term and Long-term Hazard Ratios. Yi Wang

Sample Size Calculation Based on the Semiparametric Analysis of Short-term and Long-term Hazard Ratios

Yi Wang

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy under the Executive Committee of the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2013

© 2013 Yi Wang
All Rights Reserved

ABSTRACT

Sample Size Calculation Based on the Semiparametric Analysis of Short-term and Long-term Hazard Ratios

Yi Wang

We derive sample size formulae for survival data with non-proportional hazard functions under both fixed and contiguous alternatives. Sample size determination has been widely discussed in the literature for studies with failure-time endpoints. Many researchers have developed methods with the assumption of proportional hazards under contiguous alternatives. Without covariate adjustment, the logrank test statistic is often used for the sample size and power calculation. With covariate adjustment, the approaches are often based on the score test statistic for the Cox proportional hazards model. Such methods, however, are inappropriate when the proportional hazards assumption is violated. We develop methods to calculate the sample size based on the semiparametric analysis of short-term and long-term hazard ratios. The methods are built on a semiparametric model by Yang and Prentice (2005). The model accommodates a wide range of patterns of hazard ratios, and includes the Cox proportional hazards model and the proportional odds model as its special cases. Therefore, the proposed methods can be used for survival data with proportional or non-proportional hazard functions. In particular, the sample size formula by Schoenfeld (1983) and Hsieh and Lavori (2000) can be obtained as a special case of our methods under contiguous alternatives.

KEY WORDS: Accrual and follow-up; Contiguous alternatives; Cox model; Crossing hazards; Fixed alternative; Non-proportional hazard functions; Sample size; Short-term and long-term hazard ratios; Survival analysis.

Table of Contents

1 Introduction
2 General procedure of sample size calculation
    2.1 Example
    2.2 General procedure of sample size and power calculations
    2.3 The fixed alternative and the contiguous alternative hypotheses
3 Sample size calculation for the Cox proportional hazards model
    3.1 Notations and assumptions
    3.2 Model specification
    3.3 Sample size formula under fixed alternative
    3.4 Sample size formula under contiguous alternatives
4 Sample size calculation with Yang and Prentice's semiparametric model
    4.1 Notations and model specification
    4.2 Parameter estimation
    4.3 Sample size formula under fixed alternative
    4.4 Sample size formula under contiguous alternatives
    4.5 Simulation studies
        4.5.1 Simulation studies to evaluate sample size formula derived under contiguous alternatives
        4.5.2 Simulation studies to evaluate sample size formula derived under fixed alternative
5 Accrual and follow-up times in sample size calculation
    5.1 Accrual and follow-up in sample size calculation
    5.2 Example
6 Discussion
    6.1 Concluding remarks
    6.2 Future work
7 Proofs
    7.1 Proofs of the theorems and corollaries in Chapter 3
        Proof of Theorem
        Proof of Corollary
        Proof of Theorem
        Proof of Corollary
    7.2 Proofs of the theorems in Chapter 4
        Lemma and proof
        Proof of Theorem 4.3.1 (i)
        Proof of Theorem
Bibliography

List of Figures

1.1 Kaplan-Meier estimates for the VA lung cancer data
Proportional hazards: Cox model (γ1 = γ2 = γ)
Non-proportional hazards: proportional odds model (γ2 = )
Non-proportional hazards: long-term effect model (γ1 = )
Non-proportional hazards: crossing hazards (γ1 < and γ2 >, or, γ1 > and γ2 < )
Sample size for accrual and follow-ups up to 3 months
Sample size for 3 months accrual (a = 3)
Sample size for 2 months follow-up (f = 2)

List of Tables

2.1 Comparison of η and (z_{α/2} + z_β)² for commonly assumed α's and β's
4.1 Values of η for different α's and β's
Empirical power of calculated sample size for α = 0.05, 1 − β = 0.9 and θ = ( , )
Empirical power of calculated sample size for α = 0.05, 1 − β = 0.9 and θ = ( , )
Empirical power of calculated sample size for α = 0.05, 1 − β = 0.9 and θ = (.4, .4)
Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ =
Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ =
Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ =
Empirical power of calculated sample size for the Cox model. α = 0.05, 1 − β = 0.9 and θ =
Different accrual distributions as in Maki (2006) and Wang et al. (2012)

Acknowledgments

I would like to express my deepest gratitude to my advisor, Dr. Zhezhen Jin, for his supervision and support. I would also like to thank Dr. Bin Cheng, Dr. Robert Taub, Dr. Wei-Yann Tsai and Dr. Antai Wang for serving on my dissertation committee. Last but not least, many thanks to my friends Huaihou Chen, Wei Xiong, Wenfei Zhang and Ziqiang Zhao, who have encouraged and supported me throughout the entire process.

To my family

Chapter 1

Introduction

Sample size determination is important in clinical studies, especially at the design stage of a trial when researchers want to address some scientific hypotheses. The sample size calculation procedure consists of two important elements: the hypotheses of interest and the test statistic. For the sample size calculation in a study, the null and alternative hypotheses need to be determined first. Then, a proper test statistic must be identified along with the distributions of the test statistic under the null and alternative hypotheses. The sample size can then be evaluated with the pre-specified type I error, power, effect size, and design effect(s). When the exact distributions of the test statistic are not available, the asymptotic distributions are often used instead.

Type I error, power, sample size, effect size, and design effect(s) are related to one another for the pre-specified hypotheses of interest and test statistic. Therefore, power can be calculated if the sample size, type I error, effect size and design effect(s) are known. Sample size and power calculations follow basically the same procedure. Reviews of the general procedure for sample size determination and power analysis in clinical trials are given by many investigators (e.g., Lachin 1981 [25]; Donner 1984 [2]; Dupont and Plummer 1990 []). The general procedure is reviewed and discussed in detail in Chapter 2. Although sample size determination is the focus of this dissertation, power analysis for the same type of studies can be conducted in a similar manner.

Sample size formulae can differ with different choices of either the hypotheses of interest or the test statistic. In sample size calculation, two types of alternative hypotheses are usually considered for the hypotheses of interest: the fixed alternative and contiguous alternatives. An introduction to these alternatives can be found in Chapter 2. For any specific hypotheses of interest, the choice of the test statistic also makes a difference when the sample size is calculated for a study. There are model-based and non-model-based test statistics. The test statistic should be model-based when some non-binary effect(s) is (are) of interest, and/or there is a need for covariate adjustment. When only a categorical effect (often binary), for example, treatment effect, is of interest, one can calculate the sample size with a non-model-based test statistic.

The methods for sample size calculation in survival analysis have been discussed extensively in the literature. Many authors have focused on developing methods for survival data with treatment as the effect of interest (e.g., Halperin et al. 1968 [2]; Pasternack and Gilbert 1971 [38]; Pasternack 1972 [37]; George and Desu 1974 [7]; Palta and McHugh 1979 [35]; Palta and McHugh 1980 [36]; Wu et al. 1980 [5]; Lachin 1981 [25]; Rubinstein et al. 1981 [4]; Schoenfeld 1981 [4]; Makuch and Simon 1982 [32]; Schoenfeld and Richter 1982 [43]; Freedman 1982 [5]; Schoenfeld 1983 [42]; Gail 1985 [6]; Palta and Amini 1985 [34]; Lachin and Foulkes 1986 [26]; Lakatos 1988 [28]; Gu and Lai 1999 [9]; Chen et al. 2011 [5]; Wang et al. 2012 [49]). Collett (2003) [9] has discussed and reviewed sample size calculation methods in survival analysis. Most of these methods focused on the simple two-sample problem and assumed proportional hazards. In particular, early works assumed an exponential distribution for survival times (e.g., Pasternack and Gilbert 1971 [38]; Pasternack 1972 [37]; George and Desu 1974 [7]; Palta and McHugh 1980 [36]; Lachin 1981 [25]; Rubinstein et al. 1981 [4]; Makuch and Simon 1982 [32]; Schoenfeld and Richter 1982 [43]; Lachin and Foulkes 1986 [26]). Among these methods, there are some that are not limited to the two-sample problem or to the assumption of proportional hazards.

For example, Makuch and Simon (1982) [32] provided the sample size requirement for more than two treatment groups; Lakatos (1988) [28] and Lakatos and Lan (1992) [29] derived methods that can be used for non-proportional hazard functions (a SAS macro for the approaches is given by Shih 1995 [45]); Wu et al. (1980) [5] also employed an approach that allows for time-dependent dropout and event rates as an extension of the approach by Halperin et al. (1968) [2]; Gu and Lai (1999) [9] derived a formula for clinical studies with interim analyses; Chen et al. (2011) [5] discussed sample size determination in joint modeling of longitudinal and survival data; Wang et al. (2012) [49] derived a formula based on the cure rate proportional hazards model, which includes the Cox proportional hazards model as a special case. Some researchers developed approaches that can be used for non-binary covariate(s). For example, Zhen and Murphy (1994) [53] presented their approach based on the exponential model; Hsieh and Lavori (2000) [2] derived a method based on the Cox proportional hazards model.

The sample size formula by Schoenfeld (1983) [42] is commonly used in practice. The treatment effect is handled as a binary covariate in the Cox proportional hazards model. The formula by Schoenfeld (1983) [42] (based on the score test statistic for the Cox proportional hazards model) is the same as that of Schoenfeld (1981) [4] (based on the logrank test statistic) when there are no ties. This is because, in the absence of ties, the logrank test statistic coincides with the score test statistic based on the partial likelihood for the Cox proportional hazards model when the only covariate is the treatment indicator. Many authors derived the same formula under different assumptions, among them Schoenfeld and Richter (1982) [43] and Collett (2003) [9]. Hsieh and Lavori (2000) [2] extended Schoenfeld's (1983) [42] result to the case of a non-binary covariate. All of these methods are derived with the proportional hazards assumption under local alternatives.

Therefore, the methods are inappropriate when the proportional hazards assumption is violated. Even if the proportional hazards assumption holds, the validity of the formula may

be questionable when the results under the fixed alternative are more desired.

Figure 1.1 presents the Kaplan-Meier estimates for the VA lung cancer trial (Kalbfleisch and Prentice 2002 [23]). This is a classic example of a possible violation of the proportional hazards assumption. If the survival functions cross, the hazard functions must cross as well, although crossing hazards do not by themselves force the survival functions to cross. In practice, people often check the Kaplan-Meier curves to see whether the survival functions cross. When we observe Kaplan-Meier curves instead of the true survival functions, questions are raised as to whether the underlying survival functions cross and whether the underlying hazards cross. Appropriate methods for crossing hazards can help diagnose the problem. The reasons behind crossing can be complicated. Conventional methods such as the logrank test and the score test for the Cox proportional hazards model perform poorly in the presence of crossing hazards. Consequently, sample size calculation based on these test statistics will be inappropriate.

Figure 1.1: Kaplan-Meier estimates for the VA lung cancer data. (Survival function versus time in days; curves for the treatment and control groups.)

We develop methods to calculate the sample size for survival data with non-proportional hazard functions. Our methods are based on the pseudo score test for a semiparametric model by Yang and Prentice (2005) [5]. This semiparametric model accommodates a wide range of hazard ratio patterns. It models a hazard ratio that changes monotonically over time. For a two-sample problem, this model can be used for survival data with proportional or non-proportional (even crossing) hazard functions. There are three submodels of specific interest: the Cox proportional hazards model, the proportional odds model and the long-term effect model. These submodels are discussed in detail as special cases of our methods. We obtain the sample size formulae based on Yang and Prentice's method under different alternatives (fixed and contiguous alternatives). Sample size calculation for the Cox model is reviewed in this dissertation, with generalization to the result under the fixed alternative. It can be shown that our sample size formula for the semiparametric short-term and long-term hazard ratios model reduces to the formula developed by Schoenfeld (1983) [42] and Hsieh and Lavori (2000) [2] under contiguous alternatives for the Cox proportional hazards model.

Accrual and follow-up times can be incorporated in sample size calculation. Some authors included accrual and follow-up times in their sample size formulae (e.g., Pasternack and Gilbert 1971 [38]; Pasternack 1972 [37]; George and Desu 1974 [7]; Lachin 1981 [25]; Rubinstein et al. 1981 [4]; Schoenfeld and Richter 1982 [43]; Schoenfeld 1983 [42]; Donner 1984 [2]; Gail 1985 [6]; Lachin and Foulkes 1986 [26]; Lakatos 1986 [27]; Lakatos 1988 [28]; Wang et al. 2012 [49]). We will adopt the assumptions in Wang et al. (2012) [49] for accrual and follow-up times. The sample size formulae with accrual and follow-up are derived in a similar way. Details will be discussed in Chapter 5.

Two important techniques used to prove the results of this work are counting processes and empirical processes. These are useful tools for evaluating asymptotic properties. Some books discuss the application of counting processes in survival analysis (e.g., Fleming and Harrington 1991 [4]; Andersen et al. 1993 []; Kalbfleisch and Prentice

2002 [23]). The large sample properties of empirical processes were presented in Van der Vaart (1998) [47] and Van der Vaart and Wellner (1996) [48]. Although different techniques may be used to derive the results, counting processes and empirical processes are powerful tools in survival analysis. We will use these techniques in our derivations.

This dissertation is structured as follows. In Chapter 2, the general procedure of sample size estimation is illustrated with examples. The sample size formulae are derived for the Cox proportional hazards model under both fixed and contiguous alternatives in Chapter 3. The formula by Schoenfeld (1983) [42] and Hsieh and Lavori (2000) [2] is shown to be a special case of our formula under contiguous alternatives. In Chapter 4, the sample size formulae are developed based on the semiparametric model by Yang and Prentice (2005) [5] for non-proportional hazard functions. Under certain conditions, the sample size formula by Schoenfeld (1983) [42] and Hsieh and Lavori (2000) [2] again turns out to be a special case of our formula. Simulation studies are summarized in the chapter. In Chapter 5, the proposed methods are further generalized to incorporate accrual and follow-up times. Chapter 6 gives a summary of the findings and discussions, along with possible alternative approaches for sample size determination with a failure-time endpoint. All the proofs are presented in Chapter 7.

Chapter 2

General procedure of sample size calculation

This chapter begins with a simple example to illustrate how the sample size can be calculated. Similar discussions appeared in many works, including Lachin (1981) [25], Donner (1984) [2] and Dupont and Plummer (1990) []. The general procedures for sample size calculation and power analysis are summarized in Section 2.2. It is important to specify the hypotheses of interest and identify a proper test statistic in sample size and power calculation. In this dissertation, sample size formulae are derived under both fixed and contiguous alternatives. We give a brief introduction to the two types of alternative hypotheses (fixed and contiguous alternatives).

2.1 Example

Let $\{W_1, \ldots, W_n\}$ be a random sample of size $n$, where the $W_i$'s are independent and identically distributed (i.i.d.) random variables with unknown mean $\mu$ and known finite variance $\sigma^2$, for $i = 1, \ldots, n$. We consider the null hypothesis $H_0: \mu = \mu_0$ versus the alternative hypothesis $H_1: \mu = \mu_1$, where $\mu_0, \mu_1 \in \mathbb{R}$ are known and $\mu_0 \neq \mu_1$. In this example, the Wald test statistic is used to calculate the sample size. We assume

that the type I error is $\alpha$ and the desired power is $1 - \beta$ in the study. We present the sample size formula for the following two cases: random normal sample and random non-normal sample. The major difference between the two is that the asymptotic distributions, instead of the exact distributions of the test statistic, are used for the calculation in the case of a non-normal random sample.

Case 1: Random normal sample

If the $W_i$'s are i.i.d. random variables from a normal distribution with unknown mean $\mu$ and known finite variance $\sigma^2$, for $i = 1, \ldots, n$, the Wald test statistic can be used to test the hypotheses of interest. The test statistic is

$$T_n^w = \frac{\sqrt{n}\,(\bar{W}_n - \mu_0)}{\sigma}, \qquad (2.1)$$

where $\bar{W}_n = \sum_{i=1}^{n} W_i / n$ is the sample mean of $\{W_1, \ldots, W_n\}$. The distributions of the test statistic $T_n^w$ (2.1) can be derived under the null and alternative hypotheses.

Under $H_0: \mu = \mu_0$,

$$T_n^w \sim N(0, 1). \qquad (2.2)$$

Under $H_1: \mu = \mu_1$,

$$T_n^w - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} \sim N(0, 1). \qquad (2.3)$$

The results can be used to calculate the sample size with pre-specified type I error $\alpha$, power $1 - \beta$, effect size $\mu_1 - \mu_0$, and design effect $\sigma^2$. By (2.2) and the definition of type I error,

$$\Pr_{H_0}\left(|T_n^w| > c\right) = \alpha,$$

where the critical value $c$ is equal to $z_{\alpha/2}$ for the given type I error $\alpha$, with $z_{\alpha/2}$ being the upper $\alpha/2$-th percentile of the standard normal distribution.

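By the definition of power,

$$\Pr_{H_1}\left(|T_n^w| > c\right) = 1 - \beta,$$

which implies

$$\Pr_{H_1}\left(T_n^w > c\right) + \Pr_{H_1}\left(T_n^w < -c\right) = 1 - \beta. \qquad (2.4)$$

For small $\alpha$ and $\beta$, one of the terms on the left-hand side of equation (2.4) is small and can thus be omitted. Then, the sample size can be calculated based on the previous results.

If $\mu_0 > \mu_1$, it follows from (2.4) that

$$\Pr_{H_1}\left(T_n^w < -c\right) = 1 - \beta.$$

With some algebra, the following equation can be obtained:

$$\Pr_{H_1}\left(T_n^w - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} < -c - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma}\right) = 1 - \beta.$$

By (2.3), $z_\beta$ can be approximated:

$$-c - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} = z_\beta.$$

The sample size is obtained by substituting $z_{\alpha/2}$ for $c$ in the above equation:

$$n = \frac{(z_{\alpha/2} + z_\beta)^2}{(\mu_1 - \mu_0)^2 / \sigma^2}, \qquad (2.5)$$

where $z_\beta$ is the upper $\beta$-th percentile of the standard normal distribution.

If $\mu_0 < \mu_1$, it follows from (2.4) that

$$\Pr_{H_1}\left(T_n^w > c\right) = 1 - \beta.$$

With some algebra, the following equation can be obtained:

$$\Pr_{H_1}\left(T_n^w - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} > c - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma}\right) = 1 - \beta.$$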

By (2.3), $z_{1-\beta}$ can be approximated:

$$c - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} = z_{1-\beta}.$$

The sample size is

$$n = \frac{(z_{\alpha/2} - z_{1-\beta})^2}{(\mu_1 - \mu_0)^2 / \sigma^2}, \qquad (2.6)$$

where $z_{1-\beta}$ is the upper $(1-\beta)$-th percentile of the standard normal distribution.

Remark 2.1.1. In fact, (2.5) and (2.6) are equivalent because $z_{1-\beta} = -z_\beta$. Regardless of the ordering of $\mu_0$ and $\mu_1$, the sample size formula is

$$n = \frac{(z_{\alpha/2} + z_\beta)^2}{(\mu_1 - \mu_0)^2 / \sigma^2}. \qquad (2.7)$$

Remark. An alternative way to calculate the sample size is to consider the distributions of $(T_n^w)^2$. The advantage of this method is that it uses all the terms involved, without omitting one of the terms on the left-hand side of equation (2.4).

Under $H_0: \mu = \mu_0$,

$$(T_n^w)^2 \sim \chi^2_1, \qquad (2.8)$$

where $\chi^2_1$ is the Chi-squared distribution with 1 degree of freedom.

Under $H_1: \mu = \mu_1$,

$$(T_n^w)^2 \sim \chi^2_1(\eta), \qquad (2.9)$$

where $\chi^2_1(\eta)$ is the non-central Chi-squared distribution with 1 degree of freedom and noncentrality parameter $\eta$. The noncentrality parameter is given by

$$\eta = \frac{n\,(\mu_1 - \mu_0)^2}{\sigma^2}.$$

The sample size formula can be derived in a similar manner without making the approximation in (2.4). Following the procedure shown in this section, the sample size is given by

$$n = \frac{\eta}{(\mu_1 - \mu_0)^2 / \sigma^2}, \qquad (2.10)$$

where $\eta$ is derived such that $\chi^2_{1,\alpha} = \chi^2_{1,1-\beta}(\eta)$. Here $\chi^2_{1,\alpha}$ is the upper $\alpha$-th percentile of the Chi-squared distribution with 1 degree of freedom, and $\chi^2_{1,1-\beta}(\eta)$ is the upper $(1-\beta)$-th percentile of the non-central Chi-squared distribution with 1 degree of freedom and noncentrality parameter $\eta$.

Note that (2.7) and (2.10) differ only in the numerator. If the approximation in (2.4) is appropriate, $(z_{\alpha/2} + z_\beta)^2$ should be close to $\eta$. We compare these values for small $\alpha$'s and $\beta$'s in Table 2.1. The two values are almost identical for different small values of $\alpha$ and $\beta$. Thus, we conclude that the sample size formula (2.7) (with normal approximation) can be used instead of (2.10).

Table 2.1: Comparison of $\eta$ and $(z_{\alpha/2} + z_\beta)^2$ for commonly assumed $\alpha$'s and $\beta$'s.

$\alpha$    $\beta$    $\eta$    $(z_{\alpha/2} + z_\beta)^2$

Remark. When the variance $\sigma^2$ is unknown, the sample variance $\hat{\sigma}^2$ can be used instead in the sample size calculation. Consequently, $\sigma$ is substituted with $\hat{\sigma}$ in the test statistic $T_n^w$ (2.1). The sample size formula has the form of (2.7) with $\sigma$ being replaced by $\hat{\sigma}$, and $z_{\alpha/2}$ and $z_\beta$ being replaced by $t_{\alpha/2}(n-1)$ and $t_\beta(n-1)$, the upper $\alpha/2$-th and $\beta$-th percentiles of the t-distribution with $n - 1$ degrees of freedom.

Case 2: Random non-normal sample

Suppose that the $W_i$'s are i.i.d. random variables from a non-normal distribution with unknown mean $\mu$ and known finite variance $\sigma^2$, for $i = 1, \ldots, n$. The sample size

calculation procedure remains largely the same as in Case 1. The difference is that the asymptotic distributions should be used instead of the exact distributions under the null and alternative hypotheses. The test statistic in (2.1) can still be used in this case. The asymptotic distributions differ from (2.2) and (2.3) only in the asymptotic context; i.e., the results are valid when $n \to \infty$. Because the formula obtained is based on large-sample theory, the method may not perform well for small sample sizes. We present the asymptotic distributions of the test statistic $T_n^w$ (2.1) under the null and alternative hypotheses.

By the classic central limit theorem, under $H_0: \mu = \mu_0$,

$$T_n^w \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty. \qquad (2.11)$$

Under $H_1: \mu = \mu_1$,

$$T_n^w - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} \xrightarrow{d} N(0, 1) \quad \text{as } n \to \infty. \qquad (2.12)$$

For a given type I error $\alpha$, by definition we have

$$\Pr_{H_0}\left(|T_n^w| > c\right) = \alpha,$$

where the critical value $c$ is equal to $z_{\alpha/2}$. By the definition of power, we derive the following:

$$\Pr_{H_1}\left(|T_n^w| > c\right) = 1 - \beta,$$

which implies

$$\Pr_{H_1}\left(T_n^w > c\right) + \Pr_{H_1}\left(T_n^w < -c\right) = 1 - \beta. \qquad (2.13)$$

We adopt the same normal approximation method as in the random normal case. Without loss of generality, we assume $\mu_0 > \mu_1$. Then (2.13) can be simplified as

$$\Pr_{H_1}\left(T_n^w < -c\right) = 1 - \beta.$$

It follows that

$$\Pr_{H_1}\left(T_n^w - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} < -c - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma}\right) = 1 - \beta.$$

By (2.12),

$$-c - \frac{\sqrt{n}\,(\mu_1 - \mu_0)}{\sigma} = z_\beta.$$

The sample size is obtained by replacing $c$ with $z_{\alpha/2}$ in the above equation:

$$n = \frac{(z_{\alpha/2} + z_\beta)^2}{(\mu_1 - \mu_0)^2 / \sigma^2}. \qquad (2.14)$$

Remark. The sample size formula (2.14) is derived based on the asymptotic distributions of the test statistic (2.1). As a result, it may not be valid for small samples. Although (2.7) and (2.14) have the same expression, the method needs to be used with caution when $n$ is small.

2.2 General procedure of sample size and power calculations

The example in the previous section illustrates the general procedure in sample size calculation. There are several important steps in sample size calculation. The key is to choose the appropriate hypotheses of interest and test statistic. Sample size is related to a number of factors such as type I error, power, effect size, and design effect(s). To make a connection among these factors, we need to investigate the distributions or asymptotic distributions of the test statistic under the null and alternative hypotheses. Specifically, the sample size calculation procedure can be summarized as follows.

Sample size calculation procedure:

Step 1. Specify the hypotheses of interest (null and alternative hypotheses).

Step 2. Choose a test statistic.

Step 3. Derive the (asymptotic) distributions of the test statistic under the null and alternative hypotheses.

Step 4. Link the sample size to type I error, power, effect size, and design effect(s) by Step 3.

Type I error, power, sample size, effect size, and design effect(s) are related. If only one of the elements is unknown, then it can be determined from the other quantities. Power can be calculated in a similar way as sample size. In general, power can be calculated by the following steps.

Power calculation procedure:

Step 1. Specify the hypotheses of interest (null and alternative hypotheses).

Step 2. Choose a test statistic for the hypotheses of interest.

Step 3. Derive the (asymptotic) distributions of the test statistic under the null and alternative hypotheses.

Step 4. Link the power to type I error, sample size, effect size, and design effect(s) by Step 3.

Again, the two key components here are the hypotheses of interest and the test statistic. Different formulae may be derived if we choose a different alternative hypothesis or test statistic. In this dissertation, we derive methods under different alternative hypotheses. The following section introduces two commonly used alternative hypotheses in sample size calculation and power analysis.
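As an illustration of the two procedures above for the one-sample Wald test of Section 2.1, the following is a minimal Python sketch, not part of the original derivation. It assumes a known variance and a two-sided test; the function names (`upper_z`, `sample_size_normal`, `power_two_sided`, `eta_exact`) are hypothetical. The last function solves for the noncentrality parameter η of the Chi-squared approach by bisection, using the fact that a non-central Chi-squared variable with 1 degree of freedom can be written as (Z + √η)² with Z standard normal, so that η can be compared numerically with (z_{α/2} + z_β)².

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def upper_z(p):
    """Upper p-th percentile z_p of N(0, 1), i.e. Pr(Z > z_p) = p."""
    return _STD.inv_cdf(1.0 - p)


def sample_size_normal(mu0, mu1, sigma, alpha=0.05, power=0.90):
    """Formula (2.7): n = (z_{alpha/2} + z_beta)^2 / ((mu1 - mu0)^2 / sigma^2)."""
    beta = 1.0 - power
    effect = (mu1 - mu0) ** 2 / sigma ** 2
    return (upper_z(alpha / 2) + upper_z(beta)) ** 2 / effect


def power_two_sided(n, mu0, mu1, sigma, alpha=0.05):
    """Power of the two-sided Wald test, keeping both terms of (2.4)."""
    c = upper_z(alpha / 2)
    shift = n ** 0.5 * (mu1 - mu0) / sigma  # mean of T_n^w under H1
    return (1.0 - _STD.cdf(c - shift)) + _STD.cdf(-c - shift)


def eta_exact(alpha=0.05, power=0.90, tol=1e-10):
    """Noncentrality eta with Pr(chi2_1(eta) > chi2_{1,alpha}) = power."""
    c = upper_z(alpha / 2)  # square root of the upper alpha percentile of chi2_1

    def rejection_prob(eta):
        r = eta ** 0.5
        return (1.0 - _STD.cdf(c - r)) + _STD.cdf(-c - r)

    lo, hi = 0.0, 100.0  # rejection_prob is increasing in eta on this bracket
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rejection_prob(mid) < power:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    n = sample_size_normal(mu0=0.0, mu1=0.5, sigma=1.0)
    print("n from (2.7):", n)
    print("power at n rounded up:", power_two_sided(int(n) + 1, 0.0, 0.5, 1.0))
    print("eta:", eta_exact(), "vs (z_a/2 + z_b)^2:",
          (upper_z(0.025) + upper_z(0.10)) ** 2)
```

In practice the computed n would be rounded up to the next integer. The bisection bracket of [0, 100] is an arbitrary choice that is wide enough for commonly used α and power.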

2.3 The fixed alternative and the contiguous alternative hypotheses

We adopt the notations in the example of Section 2.1. Although the example shows the derivation of the sample size formula under the fixed alternative hypothesis, other alternative hypotheses can also be considered, especially when the random sample is non-normal and the asymptotic distribution is difficult to obtain under the fixed alternative. For the null hypothesis of interest $H_0: \mu = \mu_0$, the following alternative hypotheses are often considered in sample size and power calculation:

1. Fixed alternative: $H_1: \mu = \mu_1$.

2. Contiguous alternatives: $H_{1n}: \mu = \mu_n = \mu_0 + h/\sqrt{n}$, where $h \in \mathbb{R}$.

When the class of contiguous alternatives is considered in sample size determination, the asymptotic distribution of the test statistic needs to be derived under the contiguous alternatives. Then, the sample size can be determined by letting $\mu_1 = \mu_n$. This actually assumes a small effect size (i.e., $\mu_0$ and $\mu_1$ are close). By setting $\mu_1 = \mu_n$, the sample size $n$ is a function of a real-valued $h$ and the effect size $\mu_1 - \mu_0$. This is useful when the sample size formula is derived based on the contiguous alternatives.

Although sample size derivation based on the fixed alternative is often more desirable, many researchers use the methods under the contiguous alternatives when the alternative distribution of the test statistic is difficult to obtain. If the assumption of the closeness of $\mu_0$ and $\mu_1$ is made, one should be cautious when using the corresponding formula. It is difficult to actually quantify the adequacy of the closeness assumption in practice. Some would use the formulae obtained under the contiguous alternatives without checking the appropriateness of the assumption. In Chapters 3 and 4, we show in detail the derivations of the sample size formulae under these two types of alternatives for the Cox model and for the semiparametric model by Yang and Prentice (2005) [5].
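To make the role of $h$ concrete for the one-sample example of Section 2.1, here is a small Python sketch under stated assumptions: known variance, a two-sided test, and contiguous alternatives of the form µ_n = µ_0 + h/√n. Along such a sequence the mean of the Wald statistic is h/σ for every n, so the limiting power depends on h only; equating µ_1 with µ_n then gives n = (h/(µ_1 − µ_0))². The function names are hypothetical and the code illustrates the idea only; it is not the method developed in later chapters.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def local_power(h, sigma, alpha=0.05):
    """Limiting power of the two-sided test along mu_n = mu0 + h / sqrt(n)."""
    c = STD.inv_cdf(1.0 - alpha / 2)  # critical value z_{alpha/2}
    drift = h / sigma                 # mean of T_n^w under the contiguous alternative
    return (1.0 - STD.cdf(c - drift)) + STD.cdf(-c - drift)


def h_for_power(sigma, alpha=0.05, power=0.90):
    """Local parameter h reaching the target power (small tail term dropped)."""
    z_alpha_half = STD.inv_cdf(1.0 - alpha / 2)
    z_beta = STD.inv_cdf(power)       # upper beta-th percentile, beta = 1 - power
    return sigma * (z_alpha_half + z_beta)


def sample_size_from_h(h, mu0, mu1):
    """Solve mu1 = mu0 + h / sqrt(n) for n."""
    return (h / (mu1 - mu0)) ** 2


if __name__ == "__main__":
    h = h_for_power(sigma=1.0)
    print("h:", h)
    print("limiting power at h:", local_power(h, sigma=1.0))
    print("n for mu1 - mu0 = 0.5:", sample_size_from_h(h, 0.0, 0.5))
```

For this simple example the result coincides with formula (2.7), which is why the same expression appears under both types of alternatives here; for other test statistics the two derivations need not agree.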

Chapter 3

Sample size calculation for the Cox proportional hazards model

3.1 Notations and assumptions

In this chapter, we generalize the sample size calculation methods for the Cox proportional hazards model under the fixed and the contiguous alternative hypotheses. The test statistic used for the Cox model is the score statistic based on the partial likelihood. The formula for the fixed alternative is complicated. We propose a method that simplifies the calculation under the fixed alternative. This method is evaluated through simulation studies in Chapter 4. It is shown that our formula for the contiguous alternatives reduces to that derived by Schoenfeld (1983) [42] and Hsieh and Lavori (2000) [2] under certain conditions.

We define notations for right-censored failure time data as follows. Let $T$ be the survival time, $C$ be the censoring variable, $X = \min(T, C)$ be the observed time and $\Delta = I(T \le C)$ be the event indicator, where $I(\cdot)$ is the indicator function taking value 1 if the condition is satisfied and 0 otherwise. For right-censored survival data, the observed time is either the survival time or the censoring time, whichever occurs first. For simplicity, only one covariate $Z$ is considered. Suppose that the observed data are

$(X_i, Z_i, \Delta_i)$, for $i = 1, \ldots, n$. We denote $f(\cdot \mid Z)$ as the probability density function (pdf) or probability mass function (pmf), $F(\cdot \mid Z)$ as the cumulative distribution function (cdf), $S(\cdot \mid Z)$ as the survival function, $\lambda(\cdot \mid Z)$ as the hazard function and $\Lambda(\cdot \mid Z)$ as the cumulative hazard function of the survival time $T$. In a two-sample problem, with $Z$ being binary, taking value 0 for the control group and 1 for the treatment group, $S_1(\cdot)$ denotes the survival function of the treatment group and $S_0(\cdot)$ denotes that of the control group. Accordingly, $\lambda_1(\cdot)$ and $\lambda_0(\cdot)$ are the hazard functions of the treatment group and control group, respectively.

Let $\int_a^b$ denote $\int_{(a,b]}$. Let $N_i(t) = \Delta_i I(X_i \le t)$ be the counting process of the number of observed events on $(0, t]$, and $Y_i(t) = I(X_i \ge t)$ be the at-risk process at $t$. If the $i$th individual is still at risk and has not yet failed at time $t$, then $Y_i(t) = 1$, for $i = 1, \ldots, n$ and $t \in (0, \tau]$, where $\tau > 0$ is a finite time point at which the study ends. Let $t-$ denote the left-continuous point of $t$.

Denote $\chi^2_k$ as the Chi-squared distribution with $k$ degrees of freedom, and $\chi^2_{k,\omega}$ as the upper $\omega$th percentile of the Chi-squared distribution $\chi^2_k$. Also let $\chi^2_k(\eta)$ be the non-central Chi-squared distribution with $k$ degrees of freedom and noncentrality parameter $\eta$, and $\chi^2_{k,\omega}(\eta)$ be the upper $\omega$th percentile of the non-central Chi-squared distribution $\chi^2_k(\eta)$.

We consider the following regularity conditions:

(A1) Conditioning on $Z$, $T$ is independent of $C$.

(A2) $\theta \in \Theta$, where $\Theta$ is compact and $\Theta \subset \mathbb{R}^p$.

(A3) $Z$ has bounded support.

(A4) $S_{T|Z}(t)$ and $S_{C|Z}(t)$ are continuously differentiable in $t \in (0, \tau]$.

(A5) $f_{T|Z}(t)$ and $f_{C|Z}(t)$ are uniformly bounded in $t \in (0, \tau]$.

(A6) $\Pr(C \ge \tau) = \Pr(C = \tau) > 0$.

(A7) $\Pr(T > \tau) > 0$.

(A8) $\Pr(T \le C \mid Z) > 0$ almost surely under $F_Z$.

Non-informative censoring is assumed (condition (A1)). Note that $\theta = (\theta_1, \ldots, \theta_p)$, the parameter of interest, is a $p$-dimensional vector, and $\Theta$ is its parameter space. In this chapter, $p = 1$; that is, there is only one parameter of interest $\theta$ in the Cox proportional hazards model, based on which we conduct sample size calculation. Denote $S_{C|Z}(\cdot)$ and $f_{C|Z}(\cdot)$ as the survival function and the pdf, respectively, of the censoring time $C$ conditioning on $Z$. Denote $S_{T|Z}(\cdot)$ and $f_{T|Z}(\cdot)$ as the survival function and the pdf, respectively, of $T$ conditioning on $Z$. The conditions (A2)-(A5) are technical assumptions, which are needed in the derivation of the asymptotic properties. The condition (A6) implies that any patients alive at the end of the study at time $\tau$ are considered to be censored. The condition (A7) indicates that the probability of an individual surviving beyond $\tau$ is positive. The condition (A8) implies that there is a positive probability of observing an event for any possible value of $Z$. All the conditions discussed in this chapter are adopted throughout this dissertation unless otherwise specified.

The following additional conditions are considered in some of the results in this dissertation.

(C1) $C$ is independent of $Z$.

(C2) $Z$ is binary and $\Pr(Z = 1) = \rho$, where $\rho \in (0, 1)$.

Conditions (C1)-(C2) are used wherever specified. The condition (C2) applies to the case when $Z$ is the treatment group indicator (e.g., $Z = 1$ if in the treatment arm and $Z = 0$ if in the control arm), and $\rho$ is the proportion of individuals in the treatment arm.

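3.2 Model specification

For simplicity, we consider the case with one covariate $Z$ in the Cox proportional hazards model

$$\lambda_{T|Z}(t \mid Z, \theta) = \exp\{Z\theta\}\,\lambda_0(t).$$

For all $\theta \in \Theta \subset \mathbb{R}$, the partial likelihood for the Cox model is

$$L_p(\theta) = \prod_{i=1}^{n} \left( \frac{\exp\{Z_i\theta\}}{\sum_{l \in R(X_i)} \exp\{Z_l\theta\}} \right)^{\Delta_i} = \prod_{i=1}^{n} \left( \frac{\exp\{Z_i\theta\}}{\sum_{j=1}^{n} Y_j(X_i) \exp\{Z_j\theta\}} \right)^{\Delta_i},$$

where $R(X_i)$ is the risk set at time $X_i$, for $i = 1, \ldots, n$. The corresponding log partial likelihood is

$$\log L_p(\theta) = \sum_{i=1}^{n} \int_0^\tau \left[ Z_i\theta - \log\left( \sum_{j=1}^{n} Y_j(t) \exp\{Z_j\theta\} \right) \right] dN_i(t).$$

The score function $U_n$ with respect to $\theta$ and the corresponding observed information matrix $V_n^c$ based on the partial likelihood are derived as follows:

$$U_n(\theta) = \frac{\partial \log L_p(\theta)}{\partial \theta} = \sum_{i=1}^{n} \int_0^\tau \left[ Z_i - \frac{\sum_{j=1}^{n} Z_j Y_j(t) \exp\{Z_j\theta\}}{\sum_{j=1}^{n} Y_j(t) \exp\{Z_j\theta\}} \right] dN_i(t), \qquad (3.1)$$

$$V_n^c(\theta) = -\frac{\partial^2 \log L_p(\theta)}{\partial \theta^2} = \sum_{i=1}^{n} \int_0^\tau \left[ \frac{\sum_{j=1}^{n} Z_j^2 Y_j(t) \exp\{Z_j\theta\}}{\sum_{j=1}^{n} Y_j(t) \exp\{Z_j\theta\}} - \left( \frac{\sum_{j=1}^{n} Z_j Y_j(t) \exp\{Z_j\theta\}}{\sum_{j=1}^{n} Y_j(t) \exp\{Z_j\theta\}} \right)^2 \right] dN_i(t). \qquad (3.2)$$

3.3 Sample size formula under fixed alternative

In this section, we derive the sample size formula based on the score test statistic under the fixed alternative. We consider the following hypotheses of interest: the null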

30 CHAPTER 3. SAMPLE SIZE CALCULATION FOR THE COX PROPORTIONAL HAZARDS MODEL 2 hypothess H : θ = θ and the alternatve hypothess H : θ = θ, where θ, θ Θ and θ θ. The score test statstc based on the partal lkelhood s T c n = U nθ ) V c n θ ). 3.3) To determne the asymptotc propertes of the test statstc T c n, the asymptotc dstrbutons of U n θ ), the score functon evaluated at θ should be derved frst. Theorem 3.3. gves the asymptotc dstrbutons of U n θ ) under the null and alternatve hypotheses. We also defne the followng: v c θ θ ) = lm n E θ v c θ θ ) = lm n E θ e c θ θ ) = lm n E θ v c θ θ ) = lm n Var θ [ ] n V n c θ ), [ ] n V n c θ ), [ ] n U nθ ), [ n U n θ ) ]. The expressons of v c θ θ ), e c θ θ ) and v c θ θ ) can be found n Secton 7... That of v c θ θ ) can be found n Secton Theorem Under condtons A) A8), the followng results hold: ) ) Under H : θ = θ, n U n θ ) d N, vθ c θ ) as n. ) Under H : θ = θ, n U n θ ) ) ne c θ θ ) d N, vθ c θ ) as n. Now we need to derve the asymptotc propertes of the test statstc to calculate the sample sze. Ths s not dffcult snce we have already derved the asymptotc propertes of U n θ ) n Theorem We wll only need to know the lmt of n V c n θ ) under the alternatve hypothess. Ths s shown n the proof of Corollary n Secton Corollary gves the correspondng results for the test statstc T c n. Corollary Under condtons A) A8), the results follow: ) Under H : θ = θ, Tn c N, ) as n. ) Under H : θ = θ, Tn c n ec θ θ ) v c θ θ ) d d N, vc θ θ ) v c θ θ ) ) as n.
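The score, the observed information and the test statistic in (3.1)-(3.3) can be evaluated directly from data. The following is a minimal sketch for a single covariate, assuming untied event times; the function name `cox_score_test` and the toy data are hypothetical and serve only as an illustration.

```python
from math import exp, sqrt


def cox_score_test(times, events, z, theta0=0.0):
    """Return (U_n, V_n, T_n) evaluated at theta0 for one covariate.

    times  : observed follow-up times X_i
    events : event indicators (1 = event, 0 = censored)
    z      : covariate values Z_i
    """
    weights = [exp(zj * theta0) for zj in z]
    u = 0.0
    v = 0.0
    for i, (x_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute no jump of N_i(t)
        # Risk set at X_i: subjects with Y_j(X_i) = 1, i.e. X_j >= X_i.
        risk = [j for j, x_j in enumerate(times) if x_j >= x_i]
        s0 = sum(weights[j] for j in risk)
        s1 = sum(z[j] * weights[j] for j in risk)
        s2 = sum(z[j] ** 2 * weights[j] for j in risk)
        z_bar = s1 / s0
        u += z[i] - z_bar            # integrand of (3.1)
        v += s2 / s0 - z_bar ** 2    # integrand of (3.2)
    return u, v, u / sqrt(v)         # last entry is (3.3)


# Toy data, made up for illustration only.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 0, 1]
arm = [1, 0, 1, 0]
u, v, t = cox_score_test(times, events, arm, theta0=0.0)
print(u, v, t)
```

At $\theta_0 = 0$ the weights are all equal to one, so the statistic compares each event's covariate value with the risk-set average.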

Now we have obtained the asymptotic properties of the test statistic $T_n^c$ under both the null and alternative hypotheses. The sample size can be linked to type I error, power, effect size, and design effect based on the above results. The following result gives the sample size formula for the Cox model under the fixed alternative hypothesis.

Result. The hypotheses of interest are $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$, with type I error $\alpha$ and power $1 - \beta$. Under conditions (A1)-(A8), by Corollary 3.3.2 (i), the following relationship can be established:
$$\Pr_{H_0}\left(|T_n^c| > c\right) = \alpha,$$
where $c = z_{\alpha/2}$ is the critical value. The derivation of the sample size is similar to that shown in the example of Chapter 2. By the definition of power, we derive the following:
$$\Pr_{H_1}\left(|T_n^c| > c\right) = 1 - \beta,$$
which implies
$$\Pr_{H_1}\left(T_n^c > c\right) + \Pr_{H_1}\left(T_n^c < -c\right) = 1 - \beta.$$
Without loss of generality, we assume that the mean of the test statistic $T_n^c$ is negative under the alternative hypothesis $H_1$. The following result follows with asymptotic normal approximation:
$$\Pr_{H_1}\left(T_n^c < -c\right) = 1 - \beta.$$
Then, with $\tilde{v}^c_{\theta_1}(\theta_0)$ denoting the limiting variance of $n^{-1/2} U_n(\theta_0)$ under $H_1$,
$$\Pr_{H_1}\left\{\sqrt{\frac{v^c_{\theta_1}(\theta_0)}{\tilde{v}^c_{\theta_1}(\theta_0)}}\left(T_n^c - \sqrt{n}\,\frac{e^c_{\theta_1}(\theta_0)}{\sqrt{v^c_{\theta_1}(\theta_0)}}\right) < \sqrt{\frac{v^c_{\theta_1}(\theta_0)}{\tilde{v}^c_{\theta_1}(\theta_0)}}\left(-c - \sqrt{n}\,\frac{e^c_{\theta_1}(\theta_0)}{\sqrt{v^c_{\theta_1}(\theta_0)}}\right)\right\} = 1 - \beta.$$
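As a numerical aside, the step that drops $\Pr_{H_1}(T_n^c > c)$ can be checked under the normal approximation. The sketch below assumes $T_n^c$ is exactly normal with a given mean and standard deviation; the function name `two_sided_power` and the chosen mean are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def two_sided_power(mean, sd, alpha=0.05):
    """Return (Pr(T < -c), Pr(T > c)) for T ~ N(mean, sd^2),
    where c is the upper alpha/2 quantile of the standard normal."""
    c = STD_NORMAL.inv_cdf(1 - alpha / 2)
    lower_tail = STD_NORMAL.cdf((-c - mean) / sd)
    upper_tail = 1 - STD_NORMAL.cdf((c - mean) / sd)
    return lower_tail, upper_tail


# Hypothetical alternative: a negative mean chosen so that the lower tail is 0.80.
mean_h1 = -(STD_NORMAL.inv_cdf(0.975) + STD_NORMAL.inv_cdf(0.80))
lower, upper = two_sided_power(mean_h1, sd=1.0, alpha=0.05)
print(f"Pr(T < -c) = {lower:.4f}, Pr(T > c) = {upper:.2e}")
```

With a negative mean of this size the upper tail is several orders of magnitude smaller than the lower tail, which is why only the lower tail is retained in the derivation.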

Based on Corollary 3.3.2 (ii), $z_\beta$ can be approximated:
$$\sqrt{\frac{v^c_{\theta_1}(\theta_0)}{\tilde{v}^c_{\theta_1}(\theta_0)}}\left(-c - \sqrt{n}\,\frac{e^c_{\theta_1}(\theta_0)}{\sqrt{v^c_{\theta_1}(\theta_0)}}\right) = z_\beta.$$
The sample size is obtained by replacing $c$ with $z_{\alpha/2}$ in the above equation:
$$n = \frac{\left(z_{\alpha/2} + \sqrt{\dfrac{\tilde{v}^c_{\theta_1}(\theta_0)}{v^c_{\theta_1}(\theta_0)}}\; z_\beta\right)^{2}}{\left(e^c_{\theta_1}(\theta_0)\right)^{2} / v^c_{\theta_1}(\theta_0)}. \tag{3.4}$$
The formula (3.4) can be used for the sample size calculation for the Cox proportional hazards model under the fixed alternative. It is a function of type I error, power, and the design effects $e^c_{\theta_1}(\theta_0)$, $v^c_{\theta_1}(\theta_0)$ and $\tilde{v}^c_{\theta_1}(\theta_0)$, the last being the limiting variance of $n^{-1/2} U_n(\theta_0)$ under $H_1$. The formula derived here is general, as it is based on the score statistic of the Cox model under the fixed alternative. However, the form of $\tilde{v}^c_{\theta_1}(\theta_0)$ is usually complicated and may be difficult to derive. In the next section, we consider a method that simplifies the expression of $\tilde{v}^c_{\theta_1}(\theta_0)$ under contiguous alternatives.

3.4 Sample size formula under contiguous alternatives

The sample size formula in Section 3.3 is derived for the null hypothesis $H_0: \theta = \theta_0$ versus the fixed alternative hypothesis $H_1: \theta = \theta_1$. The limiting variance of the test statistic $T_n^c$ has a rather complex form. In this section, we discuss the formula derived based on a sequence of alternatives, the contiguous alternatives. The asymptotic distribution of the test statistic will be evaluated under the contiguous alternatives $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$, where $h \in \mathbb{R}$ is some fixed real value. The power under the fixed alternative is approximated by that under the contiguous alternatives. Let $\theta_1 = \theta_{1n} = \theta_0 + h/\sqrt{n}$; then the sample size $n$ is related to $h$, $\theta_0$ and $\theta_1$, since $h = \sqrt{n}(\theta_1 - \theta_0)$. By doing so, it is implicitly assumed that $\theta_1$ is close to $\theta_0$; i.e., the effect size is small. When a sequence of contiguous alternatives is considered, the limiting distribution of $T_n^c$ has a simpler form. We show that the sample size formula by Schoenfeld (1983) [42] and Hsieh and Lavori (2000) [2] can be obtained as a special case of our method under certain conditions when $\theta_0 = 0$.

Theorem 3.4.1 gives the asymptotic distribution of $U_n(\theta_0)$ under the contiguous alternatives $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$.

Theorem 3.4.1. Under conditions (A1)-(A8), the following result holds. Under $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$,
$$\frac{1}{\sqrt{n}} U_n(\theta_0) \xrightarrow{d} N\left(h\, v^c_{\theta_0}(\theta_0),\; v^c_{\theta_0}(\theta_0)\right), \quad \text{as } n \to \infty.$$
The proof of Theorem 3.4.1 can be found in Section 7.1.3. The following corollary gives the asymptotic distribution of the test statistic $T_n^c$ under the contiguous alternatives. The proof is straightforward by applying Slutsky's theorem.

Corollary 3.4.2. Under conditions (A1)-(A8), the following result holds. Under $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$,
$$T_n^c \xrightarrow{d} N\left(h\sqrt{v^c_{\theta_0}(\theta_0)},\; 1\right), \quad \text{as } n \to \infty.$$
Corollary 3.4.2 is proved in Section 7.1. Now we have obtained the asymptotic distributions of the test statistic under both the null and alternative hypotheses. The sample size can be derived from known type I error, power, effect size, and design effect based on the above results. The following result gives the sample size formula for the Cox model based on the contiguous alternatives.

Result. The hypotheses of interest are $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$, with type I error $\alpha$ and power $1 - \beta$. Under conditions (A1)-(A8), by Corollary 3.3.2 (i), the following relationship can be established:
$$\Pr_{H_0}\left(|T_n^c| > c\right) = \alpha,$$
where $c = z_{\alpha/2}$ is the critical value. Without loss of generality, we assume that the mean of the test statistic $T_n^c$ is negative under $H_1: \theta = \theta_1$. By Corollary 3.4.2 and the definition of power, we derive the following:
$$\Pr_{H_1}\left(|T_n^c| > c\right) = 1 - \beta,$$
which is equivalent to
$$\Pr_{H_1}\left(T_n^c > c\right) + \Pr_{H_1}\left(T_n^c < -c\right) = 1 - \beta.$$
With asymptotic normal approximation,
$$\Pr_{H_1}\left(T_n^c < -c\right) = 1 - \beta.$$
Based on Corollary 3.4.2, we assume that $T_n^c$ follows an asymptotic normal distribution with mean approximated by $h\sqrt{v^c_{\theta_0}(\theta_0)}$, where $h = \sqrt{n}(\theta_1 - \theta_0)$, and asymptotic variance equal to 1. Thus, we approximate the distribution of $T_n^c$ under $H_1$ by that under $H_{1n}$. It follows that
$$\Pr_{H_1}\left(T_n^c - h\sqrt{v^c_{\theta_0}(\theta_0)} < -c - h\sqrt{v^c_{\theta_0}(\theta_0)}\right) = 1 - \beta,$$
and
$$-c - h\sqrt{v^c_{\theta_0}(\theta_0)} = z_\beta.$$
Based on the fact that $h = \sqrt{n}(\theta_1 - \theta_0)$, the sample size formula can be obtained by substituting $c$ with $z_{\alpha/2}$ in the above equation:
$$n = \frac{\left(z_{\alpha/2} + z_\beta\right)^{2}}{v^c_{\theta_0}(\theta_0)\,(\theta_1 - \theta_0)^{2}}. \tag{3.5}$$
The formula (3.5) can be used for sample size calculation for the hypotheses of interest $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$, with type I error $\alpha$ and power $1 - \beta$. It is calculated based on the contiguous alternatives $H_{1n}: \theta = \theta_{1n} = \theta_0 + h/\sqrt{n}$. The formula is a function of $\alpha$, $\beta$, $\theta_0$, $\theta_1$, and the design effect $v^c_{\theta_0}(\theta_0)$. We notice that the limiting variances of the score statistic under the null and contiguous alternative hypotheses are both $v^c_{\theta_0}(\theta_0)$. Therefore, the sample size formula under the fixed alternative can be simplified. We discuss the method in the following remark.

Remark. It is shown that the asymptotic variance of $n^{-1/2} U_n(\theta_0)$ is $v^c_{\theta_0}(\theta_0)$ under the contiguous alternatives (see Theorem 3.4.1 and its proof in Section 7.1.3). This result indicates that the limiting variance of $n^{-1/2} U_n(\theta_0)$ under the alternative hypothesis is close to $v^c_{\theta_0}(\theta_0)$ when the alternative is close to the null hypothesis. Let us assume a simple linear relationship between the two limiting variances $\tilde{v}^c_{\theta_1}(\theta_0)$ and $v^c_{\theta_0}(\theta_0)$; that is, $\tilde{v}^c_{\theta_1}(\theta_0) = \phi\, v^c_{\theta_0}(\theta_0)$, where $\phi > 0$. When the effect size is small, that is, when $\theta_1$ is close to $\theta_0$, $\tilde{v}^c_{\theta_1}(\theta_0) \approx v^c_{\theta_0}(\theta_0)$. Thus, it is expected that $\phi$ is close to 1 for small effect size. If we replace $\tilde{v}^c_{\theta_1}(\theta_0)$ with $\phi\, v^c_{\theta_0}(\theta_0)$ in (3.4), the sample size formula becomes:
$$n = \frac{\left(z_{\alpha/2} + \sqrt{\dfrac{\phi\, v^c_{\theta_0}(\theta_0)}{v^c_{\theta_1}(\theta_0)}}\; z_\beta\right)^{2}}{\left(e^c_{\theta_1}(\theta_0)\right)^{2} / v^c_{\theta_1}(\theta_0)}. \tag{3.6}$$
Different values of $\phi$ can be investigated to obtain the most appropriate sample size if we use (3.6) instead of (3.4). The value of $\phi$ can be determined through a series of simulation studies. In this way, the calculation is greatly simplified, since the explicit form of the limiting variance of the score statistic under the fixed alternative need not be derived.

In practice, it is often of interest to derive the sample size for the null hypothesis of no effect ($\theta_0 = 0$) versus some small effect under the alternative hypothesis. The following result gives the sample size formula for this special case.

Result. The hypotheses of interest are $H_0: \theta = 0$ versus $H_1: \theta = \theta_1$, with type I error $\alpha$ and power $1 - \beta$. Under conditions (A1)-(A8) and (C1), the sample size formula based on the contiguous alternatives $H_{1n}: \theta = \theta_{1n} = h/\sqrt{n}$ is
$$n = \frac{\left(z_{\alpha/2} + z_\beta\right)^{2}}{\mathrm{Var}(Z)\,\Pr_{\theta=0}(\delta = 1)\,\theta_1^{2}}. \tag{3.7}$$
Here $\delta$ denotes the event indicator, and $\Pr_{\theta=0}(\delta = 1)$ is the event probability under the null hypothesis of $\theta = 0$. This result is a direct application of the sample size formula (3.5). Note that with the additional condition (C1), $v^c_{\theta_0}(\theta_0)$ can be simplified when $\theta_0 = 0$. The formula (3.7) is thus the same as the sample size formula by Hsieh and Lavori (2000) [2]. The following equation can be obtained from (3.7):
$$n \Pr_{\theta=0}(\delta = 1) = \frac{\left(z_{\alpha/2} + z_\beta\right)^{2}}{\mathrm{Var}(Z)\,\theta_1^{2}}.$$
Let $D_{\theta=0}$ denote the expected number of events under the null hypothesis $H_0: \theta = 0$. Therefore,
$$D_{\theta=0} = n \Pr_{\theta=0}(\delta = 1) = \frac{\left(z_{\alpha/2} + z_\beta\right)^{2}}{\mathrm{Var}(Z)\,\theta_1^{2}}.$$
In particular, when the covariate is binary (condition (C2)), $\mathrm{Var}(Z) = \rho(1 - \rho)$. Then the sample size formula (3.7) becomes
$$n = \frac{\left(z_{\alpha/2} + z_\beta\right)^{2}}{\rho(1 - \rho)\,\Pr_{\theta=0}(\delta = 1)\,\theta_1^{2}}.$$
Then $D_{\theta=0}$, the expected number of events under the null hypothesis $H_0: \theta = 0$, can be calculated by
$$D_{\theta=0} = \frac{\left(z_{\alpha/2} + z_\beta\right)^{2}}{\rho(1 - \rho)\,\theta_1^{2}}.$$
These results are the same as those derived by Schoenfeld (1983) [42]. This representation of $D_{\theta=0}$ has appeared in many works, including Schoenfeld (1981) [4], Schoenfeld and Richter (1982) [43] and Collett (2003) [9].
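The sample size formulae of this chapter are simple to evaluate once the design quantities are supplied. The sketch below codes the fixed-alternative formula (3.4), the contiguous-alternative formula (3.5), and the no-effect special case (3.7) with its expected number of events. All function and argument names are hypothetical, and the design effects are assumed to be given as inputs rather than derived.

```python
from math import ceil, log, sqrt
from statistics import NormalDist


def upper_quantile(p):
    """Upper p-quantile z_p of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - p)


def n_fixed(e1, v1, v1_tilde, alpha=0.05, beta=0.20):
    """Formula (3.4). e1 and v1 are the limits of U_n/n and V_n/n under H1;
    v1_tilde is the limiting variance of U_n/sqrt(n) under H1."""
    z_sum = upper_quantile(alpha / 2) + sqrt(v1_tilde / v1) * upper_quantile(beta)
    return z_sum ** 2 / (e1 ** 2 / v1)


def n_contiguous(theta0, theta1, v0, alpha=0.05, beta=0.20):
    """Formula (3.5), with v0 the limiting information under H0."""
    z_sum = upper_quantile(alpha / 2) + upper_quantile(beta)
    return z_sum ** 2 / (v0 * (theta1 - theta0) ** 2)


def n_no_effect(theta1, var_z, event_prob, alpha=0.05, beta=0.20):
    """Formula (3.7): null hypothesis theta = 0."""
    z_sum = upper_quantile(alpha / 2) + upper_quantile(beta)
    return z_sum ** 2 / (var_z * event_prob * theta1 ** 2)


def events_two_arm(theta1, rho, alpha=0.05, beta=0.20):
    """Expected number of events for a binary covariate with Pr(Z = 1) = rho."""
    z_sum = upper_quantile(alpha / 2) + upper_quantile(beta)
    return z_sum ** 2 / (rho * (1 - rho) * theta1 ** 2)


# Illustrative design (assumed values): hazard ratio 0.5, equal allocation,
# two-sided alpha = 0.05, power 0.80, event probability 0.6 under the null.
theta1 = log(0.5)
d = events_two_arm(theta1, rho=0.5)
n = n_no_effect(theta1, var_z=0.25, event_prob=0.6)
print(ceil(d), ceil(n))
```

Setting `v1_tilde = v1 = v0` and `e1 = v0 * (theta1 - theta0)` in `n_fixed` reproduces `n_contiguous`, which mirrors the way the contiguous-alternative formula simplifies the fixed-alternative one.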

Chapter 4

Sample size calculation with Yang and Prentice's semiparametric model

The methods discussed in Chapter 3 rely on the proportional hazards assumption. When the hazard functions are non-proportional, an appropriate model should be considered for sample size calculation. We derive the sample size formulae based on the semiparametric model by Yang and Prentice (2005) [5]. This model can be used for non-proportional hazard functions, with the Cox proportional hazards model and the proportional odds model being two special cases. Thus, the sample size calculation based on this model can be used for survival data when the proportional hazards assumption does not hold.

The first section of this chapter introduces the semiparametric model for short-term and long-term hazard ratios (Yang and Prentice 2005 [5]). The sample size formulae based on this model under fixed and contiguous alternatives follow. The special cases for testing the null hypothesis of no treatment effect are discussed for the three submodels: the Cox proportional hazards model, the proportional odds model, and the long-term effect model. The last section provides simulation results for the proposed methods under various scenarios.

4.1 Notations and model specification

Yang and Prentice (2005) [5] developed a semiparametric model that can accommodate different short-term and long-term hazard ratios. In the simple two-sample case, the model has the following form:
$$\frac{\lambda_1(t)}{\lambda_0(t)} = \frac{\gamma_1\gamma_2}{\gamma_1 + (\gamma_2 - \gamma_1)\,S_0(t)}. \tag{4.1}$$
The hazard ratio in (4.1) is a function of two positive parameters $\gamma_1$ and $\gamma_2$ and the baseline survival function $S_0$. It is monotone in $t$ for fixed $\gamma_1$ and $\gamma_2$, since $S_0(t)$ is nonincreasing in $t$. Notice that
$$\gamma_1 = \lim_{t \to 0}\frac{\lambda_1(t)}{\lambda_0(t)} \quad \text{and} \quad \gamma_2 = \lim_{t \to \infty}\frac{\lambda_1(t)}{\lambda_0(t)}.$$
This is because $S_0(t)$ goes to 1 when $t$ goes to 0, and $S_0(t)$ goes to 0 when $t$ goes to $\infty$. Therefore, $\gamma_1$ can be interpreted as the short-term hazard ratio and $\gamma_2$ as the long-term hazard ratio.

The model reduces to the Cox proportional hazards model when $\gamma_1 = \gamma_2 = \gamma$. The hazard ratio for the Cox model is constant over time, and the corresponding hazard functions are proportional. When $\gamma_2 = 1$, the model becomes the proportional odds model. The proportional odds model has a non-constant hazard ratio, and the corresponding hazard functions are not proportional. The treatment effect fades away over time for the proportional odds model. Another special case of interest is what we call the long-term effect model, in which $\gamma_1 = 1$. This means that there is no treatment effect at the beginning, but the effect shows up later on. The long-term effect model has a non-constant hazard ratio, and the corresponding hazard functions are non-proportional. The model (4.1) can be used for a wide range of hazard patterns, including crossing hazards. When there is crossing in the hazard functions, $\gamma_1 < 1$ and $\gamma_2 > 1$, or $\gamma_1 > 1$ and $\gamma_2 < 1$.

Examples with baseline hazard function $\lambda_0(t) = 1/(1 + t)$ (corresponding baseline survival $S_0(t) = 1/(1 + t)$ and odds of baseline survival $\{1 - S_0(t)\}/S_0(t) = t$) are shown in the following figures. The hazard ratio, corresponding hazard functions and survival functions are plotted under different scenarios. Figure 4.1 gives two examples of constant hazard ratio. These are examples of the Cox proportional hazards model. When the common hazard ratio is less than 1, the treatment is more favorable. When the common hazard ratio is greater than 1, the treatment is not effective. Figure 4.2 shows examples of the proportional odds model, in which $\gamma_2$, the long-term effect, is 1. Figure 4.3 shows two examples of the long-term effect model, where $\gamma_1$, the short-term effect, is 1. In each scenario of the long-term effect model, the survival functions of the treatment and the control are initially close to each other, but start to deviate from each other after a while. Figure 4.4 gives examples of crossing hazard functions. These are the cases when the treatment effect is different at different stages of a study. The treatment may be beneficial at the beginning but turn out to do harm later on, or the treatment may not be favorable at the beginning but appear to be more effective as the study continues. Crossing in hazard functions occurs when $\gamma_1 < 1$ and $\gamma_2 > 1$, or $\gamma_1 > 1$ and $\gamma_2 < 1$. Notice that in each of the crossing hazards examples, both the hazard functions and the survival functions cross, but the crossing points are different.

We will use the same notations and regularity conditions as in Section 3.1. The regularity conditions are (A1)-(A8). Let $\theta = \log\gamma$ be the parameter of interest. We suppose that $\theta \in \Theta$, where $\Theta$ is a compact subset of $\mathbb{R}^p$. In the model by Yang and Prentice (2005) [5], $p = 2$. The parameter of interest is $\theta = (\theta_1, \theta_2)$, which is a real-valued vector.
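The behaviour of the hazard ratio in (4.1) can be explored numerically. The following sketch uses the baseline survival $S_0(t) = 1/(1 + t)$ from the examples above; the function names are hypothetical, and the crossing-time expression is obtained here by setting the hazard ratio in (4.1) equal to 1 and solving for $S_0(t)$, so it is specific to this baseline.

```python
def baseline_survival(t):
    """Baseline survival S_0(t) = 1 / (1 + t) used in the illustrative examples."""
    return 1.0 / (1.0 + t)


def hazard_ratio(t, gamma1, gamma2, s0=baseline_survival):
    """Hazard ratio of model (4.1) at time t."""
    return gamma1 * gamma2 / (gamma1 + (gamma2 - gamma1) * s0(t))


def crossing_time(gamma1, gamma2):
    """Time at which the hazard ratio equals 1 under S_0(t) = 1 / (1 + t).

    Setting (4.1) equal to 1 gives S_0(t) = gamma1 (gamma2 - 1) / (gamma2 - gamma1).
    Returns None when gamma1 and gamma2 lie on the same side of 1 (no crossing).
    """
    if (gamma1 - 1.0) * (gamma2 - 1.0) >= 0:
        return None
    s_star = gamma1 * (gamma2 - 1.0) / (gamma2 - gamma1)
    return 1.0 / s_star - 1.0


# Crossing-hazards scenario with short-term ratio 0.6 and long-term ratio 2.5.
for t in (0.0, 0.5, 1.0, 2.0, 10.0, 1000.0):
    print(t, round(hazard_ratio(t, 0.6, 2.5), 4))
print("hazards cross at t =", crossing_time(0.6, 2.5))
```

The printed ratios start at the short-term value, increase monotonically, and approach the long-term value, which is the pattern described for the crossing-hazards case.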

Figure 4.1: Proportional hazards: Cox model ($\gamma_1 = \gamma_2 = \gamma$). Two sets of panels, for $\gamma_1 = .5, \gamma_2 = .5$ and for $\gamma_1 = 2, \gamma_2 = 2$, each showing the hazard ratio, the hazard functions (treatment and control), and the survival functions (treatment and control) against time.

Figure 4.2: Non-proportional hazards: proportional odds model ($\gamma_2 = 1$). Two sets of panels, for $\gamma_1 = .8, \gamma_2 = 1$ and for $\gamma_1 = 2, \gamma_2 = 1$, each showing the hazard ratio, the hazard functions (treatment and control), and the survival functions (treatment and control) against time.

Figure 4.3: Non-proportional hazards: long-term effect model ($\gamma_1 = 1$). Two sets of panels, for $\gamma_1 = 1, \gamma_2 = .5$ and for $\gamma_1 = 1, \gamma_2 = .8$, each showing the hazard ratio, the hazard functions (treatment and control), and the survival functions (treatment and control) against time.

Figure 4.4: Non-proportional hazards: crossing hazards ($\gamma_1 < 1$ and $\gamma_2 > 1$, or $\gamma_1 > 1$ and $\gamma_2 < 1$). Two sets of panels, for $\gamma_1 = .6, \gamma_2 = 2.5$ and for $\gamma_1 = 2.5, \gamma_2 = .5$, each showing the hazard ratio, the hazard functions (treatment and control), and the survival functions (treatment and control) against time.

Parameter estimation

The model has the following form after re-parameterization and the incorporation of a covariate $Z$:
$$\lambda_{T|Z}(t \mid Z, \theta) = \frac{1}{e^{-Z\theta_1} + e^{-Z\theta_2} R_0(t)}\,\frac{dR_0(t)}{dt}, \tag{4.2}$$
where $\theta_j = \log(\gamma_j)$, for $j = 1, 2$. Note that $\theta = (\theta_1, \theta_2) \in \Theta \subset \mathbb{R}^2$ is a real-valued vector of two dimensions. Now $\theta_1 = \log(\gamma_1)$ can be interpreted as the logarithm of the short-term hazard ratio and $\theta_2 = \log(\gamma_2)$ as the logarithm of the long-term hazard ratio. The true parameter $\theta$ is a vector that consists of two elements, $\theta_1$ and $\theta_2$, i.e., $\theta = (\theta_1, \theta_2)$. The compact parameter space $\Theta$ is a subset of $\mathbb{R}^2$. We also define $R_0$ as the odds of the baseline survival function, i.e.,
$$R_0(t) = \frac{1 - S_0(t)}{S_0(t)}, \quad \text{for } t \in (0, \tau].$$
The hazard ratio is then a function of $\theta_1$, $\theta_2$ and $R_0$. The model reduces to the Cox proportional hazards model when $\theta_1 = \theta_2$, to the proportional odds model when $\theta_2 = 0$, and to the long-term effect model when $\theta_1 = 0$. If $\theta_1$ and $\theta_2$ have different signs (i.e., one is positive and the other is negative), then there is crossing in the hazard functions.

If $R_0(t)$ in the model (4.2) is known, the likelihood and log likelihood of $\theta = (\theta_1, \theta_2)$ can be written as follows under non-informative censoring, with $\delta_i$ denoting the event indicator of subject $i$:
$$L_n(\theta, R_0) = \prod_{i=1}^{n}\left\{\lambda_{T|Z}(X_i \mid Z_i, \theta)\right\}^{\delta_i} S_{T|Z}(X_i \mid Z_i, \theta),$$
$$\log L_n(\theta, R_0) = \sum_{i=1}^{n}\int_0^{\tau}\log\lambda_{T|Z}(t \mid Z_i, \theta)\,dN_i(t) - \sum_{i=1}^{n}\int_0^{\tau}\lambda_{T|Z}(t \mid Z_i, \theta)\,Y_i(t)\,dt.$$
The score function with respect to $\theta$ is $\partial \log L_n(\theta, R_0)/\partial\theta$. If $R_0$ is unknown, it can be consistently estimated by $\hat{R}_n$, where $\hat{R}_n(t, \theta)$ has the following closed-form expression (Yang and Prentice 2005 [5]):
$$\hat{R}_n(t, \theta) = \frac{1}{\hat{P}_n(t, \theta)}\int_0^{t}\hat{P}_n(s-, \theta)\,d\hat{\Lambda}_{n,1}(s, \theta), \tag{4.3}$$


More information

Marginal Effects in Probit Models: Interpretation and Testing. 1. Interpreting Probit Coefficients

Marginal Effects in Probit Models: Interpretation and Testing. 1. Interpreting Probit Coefficients ECON 5 -- NOE 15 Margnal Effects n Probt Models: Interpretaton and estng hs note ntroduces you to the two types of margnal effects n probt models: margnal ndex effects, and margnal probablty effects. It

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information
