Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency
|
|
- Lauren Brown
- 5 years ago
- Views:
Transcription
1 Smooth Goodness-of-Fit Tests in Hazard-Based Models by Edsel A. Pe~na Department of Statistics University of South Carolina at Columbia Univ. of Virginia Colloquium Talk Department of Statistics December 1, 2 Research supported by an NIH Grant 1
2 1. Practical Problem ffl Right-censored survival data for lung cancer patients from Gatsonis, Hsieh and Korwar (1985). ffl Survival times (in months): n = 86 with 23 right-censored..99, 1.28, 1.77, 1.97, 2.17, 2.63, 2.66, 2.76, 2.79, 2.86, 2.99, 3.6, 3.15, 3.45, 3.71, 3.75, 3.81, 4.11, 4.27, 4.34, 4.4, 4.63, 4.73, 4.93, 4.93, 5.3, 5.16, 5.17, 5.49, 5.68, 5.72, 5.85, 5.98, 8.15, 8.26, 8.48, 8.61, 9.46, 9.53, 1.5, 1.15, 1.94, 1.94, 11.4+, 11.24, 11.63, 12.26, 12.65, 12.78, 13.18, 13.47, , 13.96, , , 14.88, , 15.5, 15.31, , 16.13, 16.46, , 17.5+, , 17.45, 17.61, , , 18.2, , 2.24+, 2.7, 2.73+, , , 22.54, ffl Probability histograms of the Complete and Censored Values. Relative Frequency Relative Frequency Survival Times (in months) Survival Times (in months) 2
3 ffl NPMLE of Survivor Function: Product-Limit Estimator ^μf (t) = Y 8 < : 1 N(s) 9 = Y (s) ; ; fs»tg where N(s) = No. of uncensored times up to time s; Y (s) = No. at risk at time s: ffl Product-limit estimator and best-fitting exponential survivor function. Survivor Function Estimates Survival Time (in months) ffl Question: Did the survival data come from the family of exponential distributions? Or was it from the family of Weibull distributions? 3
4 ffl Another Data Set: Times to withdrawal (in hours) of 171 car tires in Davis and Lawrance (1989, Scand. J. Statist.) from a car-tire testing study, with withdrawal either due to failure or right-censoring. ffl The pneumatic tires were subjected to laboratory testing by rotating each tire against a steel drum until either failure [actually, there were several competing causes] or removal (right-censoring). ffl Plots of the product-limit estimator, best-fitting exponential, and bestfitting Weibull distribution. Survivor Function Estimates Removal Times (in hours) ffl Question: Is this failure-time car tires data consistent with an underlying Weibull distribution? ffl Both situations point to a goodness-of-fit problem with right-censored data. 4
5 2. Densities and Hazards ffl T = a positive-valued continuous failure-time variable, e.g., Π time-to-failure of a mechanical or electronic system Π time-to-occurrence of an event Π survival time of a patient in a clinical trial ffl f(t) =density function of T. f(t) t ß P ft 2 [t; t + t)g : ffl F (t) = PfT» tg = distribution function ffl μ F (t) = 1 F (t) =survivor function ffl (t) = f(t)= μ F (t) =hazard rate function. (t) t ß P ft 2 [t; t + t)jt tg : ffl Λ(t) = R t (w)dw = log[ μf (t)] = (cumulative) hazard function ffl Equivalences: μf (t) = expf Λ(t)g f(t) = (t)expf Λ(t)g 5
6 ffl Two Common Examples: Π Exponential: f(t; ) = expf tg μf (t; ) =expf tg (t; ) = Λ(t; ) = t Π Two-Parameter Weibull: f(t; ff; ) =(ff )( t) ff 1 expf ( t) ff g μf (t; ff; ) = expf ( t) ff g (t; ff; ) =(ff )( t) ff 1 Λ(t; ff; ) =( t) ff Π Qualitative Aspects from Plots of Hazards Weibull Hazard Plots Hazard Rate Time 6
7 3. Hazard-Based Models Π IID Model: T 1 ;T 2 ;:::;T n IID with common unknown hazard rate function (t). Observable vectors: (Z 1 ;ffi 1 ); (Z 2 ;ffi 2 );:::;(Z n ;ffi n ) with ffi i = 1 ) T i = Z i ffi i = ) T i > Z i : Π Cox PH Model (also Andersen and Gill Model): (T 1 ;X 1 ); (T 2 ;X 2 );:::;(T n ;X n ) with T jx (tjx) = (t)expffi t Xg ( ) an unknown hazard rate function, and fi a regression coefficient vector. Observable vectors: (Z 1 ;ffi 1 ;X 1 ); (Z 2 ;ffi 2 ;X 2 );:::;(Z n ;ffi n ;X n ) with ffi i = 1 ) T i = Z i ffi i = ) T i > Z i : 7
8 4. Problems and Issues ffl Problem (Goodness-of-Fit): Given f(z i ;ffi i );i= 1; 2;:::;ng in the IID model, or f(z i ;ffi i ;X i );i= 1; 2;:::;ng in the Cox model, decide whether ( ) 2 C = f ( ; ) : 2 g where is a nuisance parameter vector. Π C could be: Exponential, Weibull, Gamma; or IFRA class. Π Importance? Π Previous works on GOF problem with censored data: Akritas (88, JASA), Hjort (9, AS), Hollander and Pe~na (92, JASA), Li and Doss (93, AS), and others. Π Generalize Pearson's: χ 2 P = X (O j ^E j ) 2 ^E j? Π Difficulty: O j 's not computable with censored data. Also, any optimality properties of existing tests? 8
9 ffl Problem (Model Validation): Given f(z i ;ffi i );i = 1; 2;:::;ng or f(z i ;ffi i ;X i );i = 1; 2;:::;ng, how to assess model assumptions (e.g., IID assumption, model structure)? Π Unit Exponentiality Property (UEP) T ο Λ( ) ) Λ(T ) ο EXP(1) Π IID model: If Λ ( ) is the true hazard function, then with R i = Λ (Z i ), (R1;ffi 1 ); (R2;ffi 2 );:::;(Rn;ffi n ) is a right-censored sample from EXP(1). Π Since Λ ( ) is not known, the R i 's are estimated by R i 's with R i = ^Λ(Z i ); i = 1; 2;:::;n; ^Λ( ) is an estimator of Λ ( ) based on the (Z i ;ffi i )'s. Π Idea: (R i ;ffi i )'s assumed to form an approximate right-censored sample from EXP(1), so to validate model, test whether (R i ;ffi i )'s is a right-censored sample from EXP(1). Π Question: How good is this approximation, even in the limit??? 9
10 Π For Cox PH model, the analogous expressions for R i and R i are: R i = Λ (Z i ) expffix i g; i = 1; 2;:::;n; R i = ^Λ(Z i )expf ^fix i g; i = 1; 2;:::;n: Λ ^Λ( ) is an estimator of Λ ( ) [e.g., Aalen-Breslow estimator]. Λ ^fi is an estimator of fi [partial likelihood MLE]. Π R i 's are true generalized residuals (Cox and Snell (68, JRSS)). Π R i 's are estimated generalized residuals. Π Generalized residuals are analogs of linear model residuals: (Observed Value) minus (Fitted Value)." Π Question: Effects of substituting estimators for the unknown parameters, even when the estimators are consistent?? 1
11 5. Class of Smooth GOF Tests ffl Convert observed data (Z 1 ;ffi 1 ;X 1 ); (Z 2 ;ffi 2 ;X 2 );:::;(Z n ;ffi n ;X n ) into stochastic processes. ffl For i = 1; 2;:::;n, and t, let N i (t) = IfZ i» t; ffi i = 1g; Y i (t) =IfZ i tg: ffl A i (t; ( );fi)= so Z t Y i (w) (w)expffi t X i gdw; Pf N i (t) = 1jF t g = Y i (t) (t) expffi t X i gdt: ffl M i (t; ( );fi)=n i (t) A i (t; ( );fi): ffl If ( ) and fi are the true parameters, M (t) = (M 1 (t; ( );fi );:::;M n (t; ( );fi )) are orthogonal sq-int zero-mean martingales with predictable quadratic variation processes hm i ;M i i(t) = A i (t; ( );fi ): 11
12 ffl Problem: Test H : ( ) 2 C = f ( ; ) : 2 g versus H 1 : ( ) =2 C: ffl Idea: If ( ) is the true hazard rate function, then under H there is some 2 such that ( ) = ( ; ): ffl Define 2»(t; ) = log 4 3 (t) 5 : (t; ) ffl K = collection of such f»( ; ) : 2 g. ffl Basis set (e.g., trigonometric, polynomial, wavelet, etc.) for K: Ψ = fψ 1 ( ; );ψ 2 ( ; );:::g; X»(t; ) = 1 j ψ(t; ) = Ψ: j=1 ffl For smoothing order K, approximate»( ; ) by»(t; ) = K X j=1 j ψ(t; ) = KΨ K : ffl Equivalently, 8 9 < KX = (t) ß (t; )exp: j ψ(t; ); : j=1 ffl Define: < C K = : < KX = K( ; ; ) = ( ; )exp: j ψ j ( ; ); : = K 2 < K ; 2 ; : ffl Embedding: H ρ C K : j=1 12
13 ffl Goodness-of-Fit Tests: Score tests for H : K = ;fi 2B versus H 1 : K 6= ;fi 2 B: ffl Tests introduced in Pe~na (1998, JASA; 1998, AS). ffl Since score tests, they possess local optimality properties. ffl Likelihood function: obtained via product integration. ffl Score function for K at K = : Q( ; fi) = n X Z fi Ψ K ( ) n dn i Y i ( )expffi t X i gdt o : ffl Not a statistic since and fi are unknown. ffl Plug-in estimators for and fi under the restriction K =. ffl Estimate fi by ^fi which solves S(fi) n X S (m) (t; fi) = n X ffl Estimate by ^ which solves R( ; ^fi) n X Z fi Z fi [X i E(fi)]dN i = ; E(t; fi) = S(1) (t; fi) S () (t; fi) ; X Ωm i Y i expffi t X i g; m = ; 1; 2: ρ ρ( ) dn i Y i ( )expf ^fi ff t X i gdt = ; ρ(t; log (t; ): 13
14 ffl Test statistic: S K = 1 n ρ ffρ^ξ 1 ffρ^qff ^Qt ; with ^Q = Q(^ ; ^fi) = n X Z fi ρ Ψ K (^ ) dn i Y i (^ )expf ^fi ff t X i gdt ffl ^Ξ is an estimator of the limiting covariance matrix of 1 p n ^Q. ffl S K is a function of the generalized residuals (R 1 ;ffi 1 );:::;(R n ;ffi n ): 6. Main Asymptotic Results ffl Proposition: If the parameters are known, Q( ;fi ) p 6 n 4 R( ;fi ) d 7 5! N 6 4 C A ; ± = S(fi ) so 1 p n Q( ;fi ) d! N(; ± 11 ): ± 11 ± 12 ± 21 ± 22 ± C A 7 5 ; ffl Theorem: With estimated parameters, 1 1 p ^Q pn Q(^ ; ^fi) n d! N K (; Ξ) where Ξ = ± 11 ± 12 ± 1 22 ± 21 +( 1 ± 12 ± )± 1 33 ( 1 ± 12 ± ) t : ffl Proof: Relies on Rebolledo's MCT. 14
15 7. Effects of the Plug-In Procedure Π From Ξ = ± 11 ± 12 ± 1 22 ± 21 +( 1 ± 12 ± )± 1 33 ( 1 ± 12 ± ) t ; replacing ( ; fi) by(^ ; ^fi) to obtain the statistic ^Q has no asymptotic effect if ± 12 = and 1 = ; since ± 11 is the limiting covariance for 1 p n Q( ;fi ). Π Essence of adaptiveness:" it does not matter that nuisance parameters ( ; fi) are unknown in Q( ; fi) since replacing them by their estimators does not make theasymptotic distribution of Q(^ ; ^fi) different from Q( ;fi ). Π ± 12 = is an orthogonality condition between Ψ K ( ) and ρ( ). Π 1 = is an orthogonality condition between ρ( ) and e( ;fi ). Π Orthogonality defined in a Hilbert space whose inner product is hf;gi = Z fi fgν (dt); with Z ν (A) = A s() ( ;fi ) ( )dt; s () (t; ;fi ) = plim 1 n S() (t; ;fi ): 15
16 Π Can we always choose Ψ K to satisfy orthogonality conditions? Yes, via a Gram-Schmidt process... but hard to implement! Π If orthogonality conditions are not satisfied, substituting (^ ; ^fi) for ( ;fi ) in Q( ;fi ) impacts on the asymptotic distribution of ^Q, even if the estimators are consistent. Π Second term in Ξ: effect of estimating by ^. Π Third term in Ξ: effect of estimating fi by the partial MLE ^fi. Π Estimating fi by ^fi leads to an increase in variance because this estimator is less efficient than the full MLE of fi. Π Ignoring effect on variance could have dire consequences in the goodness-of-fit testing. Π If overall effect is a variance reduction, ignoring effect may result in a highly conservative test and lead into concluding model appropriateness when model is inappropriate. 16
17 8. Form of Test Procedure ffl Recall: ^Q = Q(^ ; ^fi) X = n Z fi ρ Ψ K (^ ) dn i Y i (^ )expf ^fi ff t X i gdt : ffl Estimating Limiting Covariance Matrix, Ξ: With ^± = 1 2n nx ^ 1 = 1 2n ^ 2 = 1 2n Z fi nx nx Z fi an estimator of Ξ is Ψ K (^ ) ρ(^ ) X i E( ^fi) Z fi 3Ω2 ρ 7 dni 5 + Yi (^ )expf ^fi ff t Xigdt ; Ψ K (^ )E( ^fi) ρdn t i + Y i (^ ) expf ^fi ff t X i gdt ; ρ(^ )E( ^fi) ρdn t i + Y i (^ ) expf ^fi ff t X i gdt ; ^Ξ = ^± 11 ^± ^± 1 ^± (^ 1 ^± ^± ^ 2)^± 33 ( ^ 1 ^± ^± 1 ^ 2) t : ffl Asymptotic ff-level Test: Reject H : ( ) 2 C; fi 2 B whenever S K = 1 ρ ^Q ffρ^ξ t ffρ^q ff χ 2 K n ;ff; Λ ^Ξ = Moore-Penrose generalized inverse of ^Ξ and K Λ = rank(^ξ): 17
18 9. Choices for Ψ K ffl Partition [;fi]: = a < a 1 < ::: < a K 1 < a K = fi ffl Let Ψ K (t) = h I [;a1 ](t);i (a1 ;a 2 ](t);:::;i (ak 1 ;fi ](t) i t : ffl Then ^Q = [O 1 E 1 ;O 2 E 2 ;:::;O K E K ] t where E j = n X O j = n X Z aj Z aj a j 1 dn i (t) a j 1 Y i (t) expf ^fi t X i g (t;^ )dt: ffl O j 's are observed frequencies. ffl E j 's are estimated dynamic expected frequencies. ffl Extends Pearson's statistic, but `counts are in a dynamic fashion.' ffl Resulting test statistic not of form KX j=1 (O j E j ) 2 E j because correction terms in variance destroys diagonal nature of covariance matrix. ffl If adjustments are ignored, distribution is not chi-squared. ffl Procedure extends Akritas (1988) and Hjort (199). 18
19 ffl Example: Consider no covariates (so X i = ); C = f (t; ) = g, that is, Exponential distribution. ffl For j = 1; 2;:::;K, ^ = ffl Resulting test statistic: O j = n X E j = ^ n X P n R fi dn i(t) P n R fi Y i(t)dt Z aj a j 1 dn i (t); Z aj a j 1 Y i (t)dt; = Number of events Total exposure : S K = E ffl [p ^ß] t h Dg(^ß) ^ß^ß ti [p ^ß] ; where X E ffl = K E j ; j=1 p = 1 E ffl (O 1 ;O 2 ;:::;O K ) t ; ^ß = 1 E ffl (E 1 ;E 2 ;:::;E K ) t : ffl E ffl 6= n. ffl Appropriate degrees-of-freedom for the statistic is K 1 since 1 t^ß = 1: 19
20 ffl Polynomial-type: Ψ K (t; ) = h 1; Λ (t; ); Λ (t; ) 2 ; :::; Λ (t; ) K 1i t : ffl Resulting ^Q vector has components ^Q j ;j = 1; 2;:::;K given by ^Q j = n X where, with expf (j 1) ^fix Z fi i g w n j 1 dn R (w) Y R i i (w)dw o ; R i = Λ (Z i ;^ )expf ^fix i g being the estimated residuals, N R i (w) =IfR i» w; ffi i = 1g; Y R i (w) =IfR i wg: ffl N R i 's and Y R i 's are generalized residual processes. ffl Resulting test generalizes test proposed by Hyde's (mid '7's). 2
21 1. Levels of Tests for Different K's ffl Polynomial: Ψ K (t; ) = 1; Λ (t; );:::;Λ (t; ) K 1 t ffl K 2 f1; 2; 3; 4g; censoring proportion 2 f25%; 5%g and true failure time model were exponential and 2-Weibull. # of Reps = 2 Null Dist. Exponential( ) Parameters = 2 = 5 % Uncensored 75% 5% 75% 5% Level 5% 5% 5% 5% n K Null Dist. Weibull(ff; ) Parameters (ff; ) =(2; 1) (ff; ) = (3; 2) % Uncensored 75% 5% 75% 5% Level 5% 5% 5% 5% n K
22 11. Power Function of Tests Simulation Parameters: n = 1; 25% censoring. Legend: Solid line (K = 2); Dots (K = 3); Short dashes (K = 4); Long dashes (K = 5). Figure 1: Simulated Powers for Null: Exponential vs. Alt: 2-Weibull Power Alpha (Weibull Shape Parameter) Figure 2: Simulated Powers for Null: Exponential vs. Alt: 2-Gamma Power Alpha (Gamma Shape Parameter) 22
23 Figure 3: Simulated Powers for Null: 2-Weibull vs. Alt: 2-Gamma Power Alpha (Gamma Shape Parameter) Some Observations: ffl Appropriate smoothing order K depends on alternative considered. ffl Not always necessary to have large K! ffl Tests based on S 3 and S 4 could serve as omnibus tests, at least for the models in these simulations. ffl Calls for a formal method to dynamically determine K. Plan: to utilize information-based criteria to determine appropriate smoothing order. 23
24 12. Back to the Lung Cancer Data ffl Product-Limit Estimator (PLE) of survival curve together with confidence band, and the best fitting exponential survival curve. Survivor Function Estimates Survival Time (in months) ffl Testing Exponentiality: Values of test statistics using the polynomialtype specification, together with their p-values are: S 2 = 1:92(p = :1661); S 3 = 1:94(p = :3788); S 4 = 7:56(p = :561); S 5 = 12:85(p = :121): ffl Close to a constant function, but with high frequency terms. ffl Testing Two-Parameter Weibull: Value of S 3, together with its p-value, is S 3 = 8:35 (p = :153) Values of S 4 and S 5 both indicate rejection of two-parameter Weibull model. 24
25 13. Back to the Car Tire Example Survivor Function Estimates Removal Times (in hours) Results for Testing Exponentiality Summary of Results of Smooth Goodness-of-Fit Test Testing for Exponential Distribution Input FileName: C: Talks TalkATUG davislawrancecartiredata.txt Output FileName: C: Talks TalkATUG cartire.out Sample Size = 171 Sum of Z_i = Sum of Delta_i = 15 Estimate of Eta = E-3 Estimate of Mean = 243 k S_k DF P-Value Conclusion: Results very consistent with the graphical display: the exponential model does not hold. 25
26 Results for Testing the Weibull Class Summary of Results of Smooth Goodness-of-Fit Test Testing for the Two-Parameter Weibull Distribution Using the Polynomial Specification for Psi Input FileName: C: Talks TalkATUG davislawrancecartiredata.txt Output FileName: C: Talks TalkATUG cartireweibull.txt Sample Size = 171 # of Uncensored Values = 15. Estimate of alpha = Estimate of eta = E-3 k S_k DF P-Value Conclusions From Analysis ffl Cannot reject the hypothesis that the Weibull class holds for this car tire data. ffl Notice in particular the p-values associated with the different smoothing order. ffl Result of test consistent with the plots of the survivor estimates! ffl Are they, really? Results Tests Using Larger Smoothing Orders k S_k DF P-Value
27 14. Extensions and Open Problems ffl When failure times are discrete or interval=censored. Paper with V. Nair now in draft form. New twist is that embedding of the discrete hazards is through their odds! ffl How about if the failure times are of a mixed-type, i.e., with continuous and discrete components? ffl Problem of adaptively determining the smoothing order to be addressed. ffl Promising possibility for Ψ K : Wavelets as Basis??! 15. Conclusions ffl Presented a formal approach for developing GOF tests with censored data, and extends in a more natural way Neyman's smooth gof tests (cf., Rayner and Best (1989), Smooth Goodness of Fit Tests). ffl Resulting tests possess optimality conditions by virtue of being score tests. ffl Able to see effects of estimating nuisance parameters: (Morale: Don't Ignore!) ffl Potential problems when using generalized residuals in model validation: (Morale: Intuitive considerations may Fail!) 27
Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency
Smooth Goodness-of-Fit Tests in Hazard-Based Models by Edsel A. Pe~na Department of Statistics University of South Carolina at Columbia E-Mail: pena@stat.sc.edu Univ. of Georgia Colloquium Talk Department
More informationGoodness-of-Fit Tests With Right-Censored Data by Edsel A. Pe~na Department of Statistics University of South Carolina Colloquium Talk August 31, 2 Research supported by an NIH Grant 1 1. Practical Problem
More informationTMA 4275 Lifetime Analysis June 2004 Solution
TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,
More informationCHAPTER 1 CLASSES OF FIXED-ORDER AND ADAPTIVE SMOOTH GOODNESS-OF-FIT TESTS WITH DISCRETE RIGHT-CENSORED DATA
CHAPTER 1 CLASSES OF FIXED-ORDER AND ADAPTIVE SMOOTH GOODNESS-OF-FIT TESTS WITH DISCRETE RIGHT-CENSORED DATA Edsel A. Peña Department of Statistics University of South Carolina Columbia, SC 29208 USA E-mail:
More informationGoodness-of-Fit Testing with. Discrete Right-Censored Data
Goodness-of-Fit Testing with Discrete Right-Censored Data Edsel A. Pe~na Department of Statistics University of South Carolina June 2002 Research support from NIH and NSF grants. The Problem T1, T2,...,
More informationStatistics 262: Intermediate Biostatistics Non-parametric Survival Analysis
Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve
More informationPhD course in Advanced survival analysis. One-sample tests. Properties. Idea: (ABGK, sect. V.1.1) Counting process N(t)
PhD course in Advanced survival analysis. (ABGK, sect. V.1.1) One-sample tests. Counting process N(t) Non-parametric hypothesis tests. Parametric models. Intensity process λ(t) = α(t)y (t) satisfying Aalen
More informationSurvival Analysis Math 434 Fall 2011
Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup
More informationMAS3301 / MAS8311 Biostatistics Part II: Survival
MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the
More informationPower and Sample Size Calculations with the Additive Hazards Model
Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine
More informationChapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample
Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 7 Fall 2012 Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample H 0 : S(t) = S 0 (t), where S 0 ( ) is known survival function,
More informationFULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
More informationSemiparametric Regression
Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under
More informationUNIVERSITY OF CALIFORNIA, SAN DIEGO
UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department
More informationOther Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model
Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);
More informationModelling and Analysis of Recurrent Event Data
Modelling and Analysis of Recurrent Event Data Edsel A. Peña Department of Statistics University of South Carolina Research support from NIH, NSF, and USC/MUSC Collaborative Grants Joint work with Prof.
More information4. Comparison of Two (K) Samples
4. Comparison of Two (K) Samples K=2 Problem: compare the survival distributions between two groups. E: comparing treatments on patients with a particular disease. Z: Treatment indicator, i.e. Z = 1 for
More informationGROUPED SURVIVAL DATA. Florida State University and Medical College of Wisconsin
FITTING COX'S PROPORTIONAL HAZARDS MODEL USING GROUPED SURVIVAL DATA Ian W. McKeague and Mei-Jie Zhang Florida State University and Medical College of Wisconsin Cox's proportional hazard model is often
More informationDATA IN SERIES AND TIME I. Several different techniques depending on data and what one wants to do
DATA IN SERIES AND TIME I Several different techniques depending on data and what one wants to do Data can be a series of events scaled to time or not scaled to time (scaled to space or just occurrence)
More informationEMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky
EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are
More informationEfficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models
NIH Talk, September 03 Efficient Semiparametric Estimators via Modified Profile Likelihood in Frailty & Accelerated-Failure Models Eric Slud, Math Dept, Univ of Maryland Ongoing joint project with Ilia
More informationSTAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where
STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht
More informationSurvival Distributions, Hazard Functions, Cumulative Hazards
BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution
More informationAnalysis of Time-to-Event Data: Chapter 6 - Regression diagnostics
Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Residuals for the
More informationOutline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data
Outline Frailty modelling of Multivariate Survival Data Thomas Scheike ts@biostat.ku.dk Department of Biostatistics University of Copenhagen Marginal versus Frailty models. Two-stage frailty models: copula
More informationHypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations
Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate
More informationLog-linearity for Cox s regression model. Thesis for the Degree Master of Science
Log-linearity for Cox s regression model Thesis for the Degree Master of Science Zaki Amini Master s Thesis, Spring 2015 i Abstract Cox s regression model is one of the most applied methods in medical
More informationof the 7 stations. In case the number of daily ozone maxima in a month is less than 15, the corresponding monthly mean was not computed, being treated
Spatial Trends and Spatial Extremes in South Korean Ozone Seokhoon Yun University of Suwon, Department of Applied Statistics Suwon, Kyonggi-do 445-74 South Korea syun@mail.suwon.ac.kr Richard L. Smith
More information4 Testing Hypotheses. 4.1 Tests in the regression setting. 4.2 Non-parametric testing of survival between groups
4 Testing Hypotheses The next lectures will look at tests, some in an actuarial setting, and in the last subsection we will also consider tests applied to graduation 4 Tests in the regression setting )
More informationSurvival Analysis. STAT 526 Professor Olga Vitek
Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9 Survival Data and Survival Functions Statistical analysis of time-to-event data Lifetime of machines and/or parts (called failure time analysis
More information11 Survival Analysis and Empirical Likelihood
11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with
More informationInvestigation of goodness-of-fit test statistic distributions by random censored samples
d samples Investigation of goodness-of-fit test statistic distributions by random censored samples Novosibirsk State Technical University November 22, 2010 d samples Outline 1 Nonparametric goodness-of-fit
More informationTwo-stage Adaptive Randomization for Delayed Response in Clinical Trials
Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal
More informationREGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520
REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU
More informationTypical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction
Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing
More information1 Introduction. 2 Residuals in PH model
Supplementary Material for Diagnostic Plotting Methods for Proportional Hazards Models With Time-dependent Covariates or Time-varying Regression Coefficients BY QIQING YU, JUNYI DONG Department of Mathematical
More informationNonparametric Inference In Functional Data
Nonparametric Inference In Functional Data Zuofeng Shang Purdue University Joint work with Guang Cheng from Purdue Univ. An Example Consider the functional linear model: Y = α + where 1 0 X(t)β(t)dt +
More informationCox s proportional hazards/regression model - model assessment
Cox s proportional hazards/regression model - model assessment Rasmus Waagepetersen September 27, 2017 Topics: Plots based on estimated cumulative hazards Cox-Snell residuals: overall check of fit Martingale
More informationSTAT331. Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model
STAT331 Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model Because of Theorem 2.5.1 in Fleming and Harrington, see Unit 11: For counting process martingales with
More informationChapter 2 Inference on Mean Residual Life-Overview
Chapter 2 Inference on Mean Residual Life-Overview Statistical inference based on the remaining lifetimes would be intuitively more appealing than the popular hazard function defined as the risk of immediate
More informationMAS3301 / MAS8311 Biostatistics Part II: Survival
MAS330 / MAS83 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-0 8 Parametric models 8. Introduction In the last few sections (the KM
More informationStatistical Inference and Methods
Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and
More informationExercises Chapter 4 Statistical Hypothesis Testing
Exercises Chapter 4 Statistical Hypothesis Testing Advanced Econometrics - HEC Lausanne Christophe Hurlin University of Orléans December 5, 013 Christophe Hurlin (University of Orléans) Advanced Econometrics
More informationPart III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data
1 Part III. Hypothesis Testing III.1. Log-rank Test for Right-censored Failure Time Data Consider a survival study consisting of n independent subjects from p different populations with survival functions
More informationEstimating Load-Sharing Properties in a Dynamic Reliability System. Paul Kvam, Georgia Tech Edsel A. Peña, University of South Carolina
Estimating Load-Sharing Properties in a Dynamic Reliability System Paul Kvam, Georgia Tech Edsel A. Peña, University of South Carolina Modeling Dependence Between Components Most reliability methods are
More informationUSING MARTINGALE RESIDUALS TO ASSESS GOODNESS-OF-FIT FOR SAMPLED RISK SET DATA
USING MARTINGALE RESIDUALS TO ASSESS GOODNESS-OF-FIT FOR SAMPLED RISK SET DATA Ørnulf Borgan Bryan Langholz Abstract Standard use of Cox s regression model and other relative risk regression models for
More informationGoodness-of-fit tests for randomly censored Weibull distributions with estimated parameters
Communications for Statistical Applications and Methods 2017, Vol. 24, No. 5, 519 531 https://doi.org/10.5351/csam.2017.24.5.519 Print ISSN 2287-7843 / Online ISSN 2383-4757 Goodness-of-fit tests for randomly
More informationDistribution Fitting (Censored Data)
Distribution Fitting (Censored Data) Summary... 1 Data Input... 2 Analysis Summary... 3 Analysis Options... 4 Goodness-of-Fit Tests... 6 Frequency Histogram... 8 Comparison of Alternative Distributions...
More informationInstitute of Actuaries of India
Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the
More informationSurvival Analysis. Lu Tian and Richard Olshen Stanford University
1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival
More informationLikelihood Construction, Inference for Parametric Survival Distributions
Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make
More informationIntroduction to repairable systems STK4400 Spring 2011
Introduction to repairable systems STK4400 Spring 2011 Bo Lindqvist http://www.math.ntnu.no/ bo/ bo@math.ntnu.no Bo Lindqvist Introduction to repairable systems Definition of repairable system Ascher and
More informationLecture 7 Time-dependent Covariates in Cox Regression
Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the
More informationLecture 22 Survival Analysis: An Introduction
University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which
More informationLecture 2: Martingale theory for univariate survival analysis
Lecture 2: Martingale theory for univariate survival analysis In this lecture T is assumed to be a continuous failure time. A core question in this lecture is how to develop asymptotic properties when
More informationYou may use your calculator and a single page of notes.
LAST NAME (Please Print): KEY FIRST NAME (Please Print): HONOR PLEDGE (Please Sign): Statistics 111 Midterm 4 This is a closed book exam. You may use your calculator and a single page of notes. The room
More informationUniversity of California, Berkeley
University of California, Berkeley U.C. Berkeley Division of Biostatistics Working Paper Series Year 24 Paper 153 A Note on Empirical Likelihood Inference of Residual Life Regression Ying Qing Chen Yichuan
More informationA COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky
A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),
More informationQuantile Regression for Residual Life and Empirical Likelihood
Quantile Regression for Residual Life and Empirical Likelihood Mai Zhou email: mai@ms.uky.edu Department of Statistics, University of Kentucky, Lexington, KY 40506-0027, USA Jong-Hyeon Jeong email: jeong@nsabp.pitt.edu
More informationReliability Engineering I
Happiness is taking the reliability final exam. Reliability Engineering I ENM/MSC 565 Review for the Final Exam Vital Statistics What R&M concepts covered in the course When Monday April 29 from 4:30 6:00
More informationSTAT Sample Problem: General Asymptotic Results
STAT331 1-Sample Problem: General Asymptotic Results In this unit we will consider the 1-sample problem and prove the consistency and asymptotic normality of the Nelson-Aalen estimator of the cumulative
More informationTUTORIAL 8 SOLUTIONS #
TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level
More informationDefinitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen
Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation
More informationST495: Survival Analysis: Hypothesis testing and confidence intervals
ST495: Survival Analysis: Hypothesis testing and confidence intervals Eric B. Laber Department of Statistics, North Carolina State University April 3, 2014 I remember that one fateful day when Coach took
More informationSurvival Regression Models
Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant
More informationSTAT 6350 Analysis of Lifetime Data. Failure-time Regression Analysis
STAT 6350 Analysis of Lifetime Data Failure-time Regression Analysis Explanatory Variables for Failure Times Usually explanatory variables explain/predict why some units fail quickly and some units survive
More informationModeling and Analysis of Recurrent Event Data
Edsel A. Peña (pena@stat.sc.edu) Department of Statistics University of South Carolina Columbia, SC 29208 New Jersey Institute of Technology Conference May 20, 2012 Historical Perspective: Random Censorship
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationLecture 5 Models and methods for recurrent event data
Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.
More informationAsymptotic Statistics-VI. Changliang Zou
Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous
More informationSTAT 331. Martingale Central Limit Theorem and Related Results
STAT 331 Martingale Central Limit Theorem and Related Results In this unit we discuss a version of the martingale central limit theorem, which states that under certain conditions, a sum of orthogonal
More informationLecture 3. Truncation, length-bias and prevalence sampling
Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in
More informationTESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip
TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississippi University, MS38677 K-sample location test, Koziol-Green
More informationMeei Pyng Ng 1 and Ray Watson 1
Aust N Z J Stat 444), 2002, 467 478 DEALING WITH TIES IN FAILURE TIME DATA Meei Pyng Ng 1 and Ray Watson 1 University of Melbourne Summary In dealing with ties in failure time data the mechanism by which
More informationFollow this and additional works at: Part of the Applied Mathematics Commons
Louisiana State University LSU Digital Commons LSU Master's Theses Graduate School 2004 Cox Regression Model Lindsay Sarah Smith Louisiana State University and Agricultural and Mechanical College, lsmit51@lsu.edu
More informationIntroduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview
Introduction to Empirical Processes and Semiparametric Inference Lecture 01: Introduction and Overview Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations
More informationExam Applied Statistical Regression. Good Luck!
Dr. M. Dettling Summer 2011 Exam Applied Statistical Regression Approved: Tables: Note: Any written material, calculator (without communication facility). Attached. All tests have to be done at the 5%-level.
More information(a) (3 points) Construct a 95% confidence interval for β 2 in Equation 1.
Problem 1 (21 points) An economist runs the regression y i = β 0 + x 1i β 1 + x 2i β 2 + x 3i β 3 + ε i (1) The results are summarized in the following table: Equation 1. Variable Coefficient Std. Error
More informationJoint Modeling of Longitudinal Item Response Data and Survival
Joint Modeling of Longitudinal Item Response Data and Survival Jean-Paul Fox University of Twente Department of Research Methodology, Measurement and Data Analysis Faculty of Behavioural Sciences Enschede,
More information( t) Cox regression part 2. Outline: Recapitulation. Estimation of cumulative hazards and survival probabilites. Ørnulf Borgan
Outline: Cox regression part 2 Ørnulf Borgan Department of Mathematics University of Oslo Recapitulation Estimation of cumulative hazards and survival probabilites Assumptions for Cox regression and check
More informationAnalysis of Time-to-Event Data: Chapter 4 - Parametric regression models
Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored
More informationDynamic Prediction of Disease Progression Using Longitudinal Biomarker Data
Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Xuelin Huang Department of Biostatistics M. D. Anderson Cancer Center The University of Texas Joint Work with Jing Ning, Sangbum
More informationMaster s Written Examination
Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth
More information1 Glivenko-Cantelli type theorems
STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then
More informationNow consider the case where E(Y) = µ = Xβ and V (Y) = σ 2 G, where G is diagonal, but unknown.
Weighting We have seen that if E(Y) = Xβ and V (Y) = σ 2 G, where G is known, the model can be rewritten as a linear model. This is known as generalized least squares or, if G is diagonal, with trace(g)
More informationST745: Survival Analysis: Nonparametric methods
ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate
More informationBeyond GLM and likelihood
Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence
More informationPASS Sample Size Software. Poisson Regression
Chapter 870 Introduction Poisson regression is used when the dependent variable is a count. Following the results of Signorini (99), this procedure calculates power and sample size for testing the hypothesis
More informationSTAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationEmpirical Processes & Survival Analysis. The Functional Delta Method
STAT/BMI 741 University of Wisconsin-Madison Empirical Processes & Survival Analysis Lecture 3 The Functional Delta Method Lu Mao lmao@biostat.wisc.edu 3-1 Objectives By the end of this lecture, you will
More informationStatistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions
Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS040) p.4828 Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions
More informationCox s proportional hazards model and Cox s partial likelihood
Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.
More informationModelling geoadditive survival data
Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model
More informationEfficiency of Profile/Partial Likelihood in the Cox Model
Efficiency of Profile/Partial Likelihood in the Cox Model Yuichi Hirose School of Mathematics, Statistics and Operations Research, Victoria University of Wellington, New Zealand Summary. This paper shows
More informationLecture 15. Hypothesis testing in the linear model
14. Lecture 15. Hypothesis testing in the linear model Lecture 15. Hypothesis testing in the linear model 1 (1 1) Preliminary lemma 15. Hypothesis testing in the linear model 15.1. Preliminary lemma Lemma
More informationCHAPTER 3 ANALYSIS OF RELIABILITY AND PROBABILITY MEASURES
27 CHAPTER 3 ANALYSIS OF RELIABILITY AND PROBABILITY MEASURES 3.1 INTRODUCTION The express purpose of this research is to assimilate reliability and its associated probabilistic variables into the Unit
More informatione 4β e 4β + e β ˆβ =0.765
SIMPLE EXAMPLE COX-REGRESSION i Y i x i δ i 1 5 12 0 2 10 10 1 3 40 3 0 4 80 5 0 5 120 3 1 6 400 4 1 7 600 1 0 Model: z(t x) =z 0 (t) exp{βx} Partial likelihood: L(β) = e 10β e 10β + e 3β + e 5β + e 3β
More informationThe Design of a Survival Study
The Design of a Survival Study The design of survival studies are usually based on the logrank test, and sometimes assumes the exponential distribution. As in standard designs, the power depends on The
More information" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2
Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the
More informationHypothesis Test. The opposite of the null hypothesis, called an alternative hypothesis, becomes
Neyman-Pearson paradigm. Suppose that a researcher is interested in whether the new drug works. The process of determining whether the outcome of the experiment points to yes or no is called hypothesis
More information