Goodness-of-fit test for the Cox Proportional Hazard Model
Rui Cui
Department of Economics, UC3M

Abstract

In this paper, we develop new goodness-of-fit tests for the Cox proportional hazard model. We derive the principal component decomposition of the cumulative martingale residual process and construct new tests based on its estimated components, which outperform the corresponding omnibus test. The omnibus test, consistent against nonparametric alternatives, is in fact a weighted average of all components, whereas each of our tests is based on a single component: it cannot detect all possible alternatives, but it is very powerful in some high-frequency directions. Smooth tests, which are unweighted averages of a few components, are also constructed. The finite sample performance of the tests is illustrated by means of a Monte Carlo experiment.

JEL: C12; C52
Keywords: Duration analysis; Goodness-of-fit; Principal component decomposition; Right-censorship

Acknowledgment: I very much appreciate the help and support from Miguel Delgado and Winfried Stute. All errors belong to me.

1 Introduction

The Cox Proportional Hazard model has been widely used in many fields, including economics, since it was proposed by David Cox in 1972. The model specifies the duration of interest through its hazard rate, which is a natural candidate for describing a dynamic, time-dependent phenomenon. It also introduces covariable effects,
which make regression analysis possible for duration data under censorship.

The estimation of the Cox model was studied by Cox (1972, 1975) through a partial likelihood approach. The large sample properties have been studied by Tsiatis (1981) and Andersen and Gill (1982), among others. Andersen and Gill (1982) adopted a counting process approach and extended the results to recurrent event processes. The counting process approach, which is equivalent to the hazard approach, became popular with the introduction of martingale theory, which greatly facilitates duration analysis. A comprehensive review is in Fleming and Harrington (1991).

The Cox model might fail in two ways. On the one hand, the Cox model assumes that the hazard rates of different individuals are proportional, i.e., the hazard ratio is time invariant. This proportionality assumption might fail. On the other hand, the covariable effect might be misspecified, either in the functional form of the covariables or in the exponential form of the link function.

For model checking, various graphical methods and goodness-of-fit tests have been proposed in the literature. The most common method uses the martingale residuals defined by Barlow and Prentice (1988). The martingale residuals, which come from the Doob-Meyer decomposition of the counting process, provide a basis for goodness-of-fit tests of hazard models, e.g. Lin, Wei and Ying (1993) and Martinussen and Scheike (2007). For the Cox model, the landmark paper is Lin, Wei and Ying (1993). They developed a class of goodness-of-fit tests, including an omnibus test and special tests for the proportional hazard assumption, the functional form of the covariables and the form of the link function. Their method is based on the cumulative sum of martingale residuals.
The principal component decomposition approach, commonly used in functional analysis, has been used to develop more powerful goodness-of-fit tests for different models. The landmark paper is Durbin, Knott and Taylor (1975). They studied the standard empirical process with estimated parameters to test the specification of the distribution function and derived its principal components. These components not only help to solve the problem caused by estimation, but also provide a basis for more powerful tests in certain directions. Stute (1997) studied the marked residual process and its principal components to test the specification of a nonparametric regression model. Anh and Stute (2012) studied the principal component analysis of the martingale part of the empirical process to test a parametric hazard model. In all these cases, the obtained components serve as special experts for detecting certain deviations.

In this paper, more powerful tests are developed for the Cox model using a conditional principal component decomposition approach. I consider the CUSUM of martingale residuals as in Lin, Wei and Ying (1993), and derive its principal component decomposition in the time dimension. The obtained components are sensitive to certain deviations from the proportional hazard assumption; for instance, higher-frequency deviations are reflected more strongly in later components. The decomposition method in this paper is applicable to any model that has a martingale interpretation, including hazard models and transformation models. However, we focus on the Cox model in the present paper.

A brief introduction to the Cox model, together with some other important models in duration analysis and the omnibus test proposed by Lin, Wei and Ying (1993), is given in Section 2. Section 3 contains the main results: the principal component decomposition, the asymptotic results for the component processes and the test statistics based on the components. Simulation studies illustrating the finite sample performance of our tests are presented in Section 4.

2 Omnibus Test for the Cox Model

2.1 The Cox Proportional Hazard Model

In the framework of regression analysis with right-censored duration data, consider a sample $\{Z_i, \delta_i, X_i\}$, $i = 1, \dots, n$, of i.i.d. realizations of $\{Z, \delta, X\}$. Here $Z$ is the minimum of the non-negative failure time $T$ and censoring time $C$, i.e., $Z = \min(T, C)$. The indicator $\delta = 1_{\{T \le C\}}$ records which of $T$ and $C$ is actually observed, and $X$ is the covariable vector.

The conditional distribution of the failure time is usually better described through its hazard function than through its density. The conditional cumulative hazard function is given by
\[ \Lambda(t|X) = \int_0^t \frac{dF(u|X)}{1 - F(u|X)}, \]
where $F$ is the conditional distribution function of the failure time. If $F$ admits a density $f$, we have
\[ d\Lambda(t|X) = \lambda(t|X)\,dt = \frac{f(t|X)}{1 - F(t|X)}\,dt. \]
The function
\[ \lambda(t|X) = \frac{f(t|X)}{1 - F(t|X)} \]
is called the hazard function. It also has a conditional probability expression,
\[ \lambda(t|X) = \lim_{h \downarrow 0} h^{-1} P(t \le T < t + h \mid T \ge t, X). \]
In the Cox proportional hazard model, the hazard rate is assumed to have the multiplicative form
\[ \lambda(t|X) = \lambda_0(t)\exp(\beta^\top X), \]
where $\lambda_0(t)$ is an unspecified baseline hazard function.

Another approach to censored data regression models is based on the analysis of counting processes. Define the following two processes:
\[ N(t) = 1_{\{Z \le t,\ \delta = 1\}}, \qquad Y(t) = 1_{\{Z \ge t\}}. \]
Here $N$ is the counting process and $Y$ is the at-risk process. By the Doob-Meyer decomposition, there is a unique predictable process $A$ such that $N - A$ is a martingale; $A$ is called the compensator of $N$. In the counting process approach, instead of modeling the conditional hazard rate of $T$, the compensator process is modeled. Notice that the information contained in $\{Z, \delta\}$ is equivalent to that contained in $\{N, Y\}$. In fact, the two approaches are equivalent when $T$ and $C$ are conditionally independent given $X$. To be more specific, the process given by
\[ M(t) = N(t) - \int_0^t Y(u)\,d\Lambda(u|X) \]
is a martingale with respect to the filtration $\mathcal{F}_t = \sigma\{X, N(u), Y(u+) : 0 \le u \le t\}$. Modeling the compensator $\int_0^t Y(s)\,d\Lambda(s|X)$ is then equivalent to modeling the conditional hazard.

Hence, if the Cox specification is correct for a given sample, there exist $\beta_0$ and $\lambda_0(t)$ such that
\[ M_i(t) = N_i(t) - \int_0^t Y_i(s)\exp(\beta_0^\top X_i)\lambda_0(s)\,ds, \qquad i = 1, \dots, n, \]
are martingales. The corresponding martingale residuals are defined as
\[ \hat M_i(t) = N_i(t) - \int_0^t Y_i(s)\exp(\hat\beta^\top X_i)\,d\hat\Lambda_0(s), \tag{2.1} \]
where $\hat\beta$ is an estimator of $\beta_0$ and $\hat\Lambda_0(t)$ is an estimator of the cumulative baseline hazard function $\Lambda_0(t) = \int_0^t \lambda_0(s)\,ds$. These martingale residuals provide a basis for goodness-of-fit tests for the Cox model.

The estimation was suggested by Cox (1972, 1975) using partial likelihood inference. The partial likelihood score function for $\beta$ is
\[ U(\beta) = \sum_{i=1}^{n} \int_0^\infty \bigl(X_i - \bar X(\beta, t)\bigr)\,dN_i(t), \]
where
\[ \bar X(\beta, t) = \frac{\sum_{i=1}^{n} Y_i(t)e^{\beta^\top X_i} X_i}{\sum_{i=1}^{n} Y_i(t)e^{\beta^\top X_i}}. \]
The partial likelihood estimator $\hat\beta$ is the solution to $U(\beta) = 0$. Under some mild regularity conditions, $n^{1/2}(\hat\beta - \beta_0)$ converges in distribution to a centered Gaussian variable with covariance matrix $\Sigma(\beta_0)^{-1}$. The matrix $\Sigma(\beta)$ is defined as
\[ \Sigma(\beta) = E\left[\int_0^\infty \bigl(X_i - \bar x(\beta, s)\bigr)^{\otimes 2} Y_i(s)e^{\beta^\top X_i}\lambda_0(s)\,ds\right], \]
with
\[ \bar x(\beta, t) = \frac{E[Y(t)e^{\beta^\top X} X]}{E[Y(t)e^{\beta^\top X}]} \]
being the limit of $\bar X(\beta, t)$.

The cumulative baseline hazard is estimated by the Breslow (1974) estimator
\[ \hat\Lambda_0(t) = \int_0^t \frac{\sum_{i=1}^{n} dN_i(u)}{\sum_{i=1}^{n} Y_i(u)e^{\hat\beta^\top X_i}}. \]

2.2 Other Important Models in Duration Analysis

The Cox proportional hazard model assumes that the conditional hazard rate of the duration time is the product of a baseline hazard and the covariable effect. In this sense, it is also called the multiplicative hazard model. Another important hazard model is Aalen's additive hazard model, proposed by Aalen (1980), in which the hazard rate is assumed to be a sum of covariable effects. The multiplicative and additive hazard models are suitable for regression analysis of duration data; however, they are not the only important models in duration analysis. There are two general classes of regression models in duration analysis: the transformation model and the accelerated failure time model. In fact, the Cox model is a special case of the transformation model.
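Returning to the estimators of Section 2.1, the score equation, the Breslow estimator and the martingale residuals evaluated at the end of follow-up can be sketched numerically. This is a minimal illustration, not the paper's code: it assumes a scalar covariable and no tied event times, and the function names `cox_fit_univariate` and `breslow_and_residuals`, as well as the Newton-Raphson solver, are hypothetical choices that the paper does not specify.

```python
import numpy as np

def cox_fit_univariate(z, delta, x, n_iter=50, tol=1e-10):
    """Solve the partial likelihood score equation U(beta) = 0 by Newton-Raphson.

    Assumes a scalar covariable and no ties among the observed times z.
    """
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    x = np.asarray(x, dtype=float)
    # at_risk[i, k] = Y_k(Z_i) = 1 if subject k is still at risk at time Z_i
    at_risk = (z[None, :] >= z[:, None]).astype(float)
    beta = 0.0
    for _ in range(n_iter):
        w = np.exp(beta * x)
        s0 = at_risk @ w                 # sum_k Y_k(Z_i) exp(beta X_k)
        s1 = at_risk @ (w * x)
        s2 = at_risk @ (w * x ** 2)
        xbar = s1 / s0                   # weighted covariable mean at Z_i
        score = np.sum(delta * (x - xbar))
        info = np.sum(delta * (s2 / s0 - xbar ** 2))
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

def breslow_and_residuals(z, delta, x, beta):
    """Breslow estimate of Lambda_0 at each Z_i and residuals M_i-hat(infinity)."""
    z = np.asarray(z, dtype=float)
    delta = np.asarray(delta, dtype=float)
    x = np.asarray(x, dtype=float)
    w = np.exp(beta * x)
    at_risk = (z[None, :] >= z[:, None]).astype(float)
    jumps = delta / (at_risk @ w)        # jump of Lambda_0-hat at Z_i
    # Lambda_0-hat(Z_i) adds up all jumps at times Z_k <= Z_i
    cum_hazard = (z[None, :] <= z[:, None]).astype(float) @ jumps
    residuals = delta - w * cum_hazard   # N_i(inf) - exp(beta X_i) Lambda_0-hat(Z_i)
    return cum_hazard, residuals
```

With the Breslow estimator plugged in, the residuals sum to zero for any value of beta, and at the partial likelihood estimator they are also orthogonal to the covariable, which gives a quick sanity check on an implementation.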
A transformation model is
\[ H(T) = -\beta^\top X + \varepsilon, \tag{2.2} \]
with $H(\cdot)$ an unknown monotone transformation and $\varepsilon$ an error term with a known distribution. The transformation model has a martingale interpretation: if we denote by $\Lambda_\varepsilon$ the known cumulative hazard function of $\varepsilon$, then
\[ M(t) = N(t) - \int_0^t Y(u)\,d\Lambda_\varepsilon\bigl(\beta^\top X + H(u)\bigr) \]
is a martingale. One special case is the Cox model, in which $\varepsilon$ follows the extreme-value distribution with $\Lambda_\varepsilon(t) = e^t$ and the transformation is $H(\cdot) = \ln(\Lambda_0(\cdot))$. Another special case is the proportional odds model, in which $\varepsilon$ follows the standard logistic distribution.

The accelerated failure time model assumes
\[ \log(T) = \beta^\top X + \varepsilon, \tag{2.3} \]
with the distribution of $\varepsilon$ unspecified. It is just a transformed version of the ordinary linear model. Inference for the accelerated failure time model is not as easy as for the Cox model because of censorship: there is no direct martingale structure in (2.3). Although the parameter is easily interpreted as the effect on the mean of $\log(T)$ in the standard linear regression model, its interpretation is less clear when $T$ is subject to censorship.

For the transformation model, Chen et al. (2002) proposed an estimating equation approach based on the martingale structure. The resulting estimator coincides with the partial likelihood estimator in the special case of the Cox model. A brief review of the transformation model and the accelerated failure time model can be found in Martinussen and Scheike (2007).

The method used in this paper to construct goodness-of-fit tests is applicable to models that have a martingale structure, e.g. hazard models and transformation models. The tests we propose are therefore helpful for model selection in the analysis of duration data. We demonstrate the method under the Cox model, and generate data from transformation models and accelerated failure time models as alternatives in the simulation, to study the power of our tests.
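As a quick check of the claim that the Cox model is a special case of the transformation model, one can substitute the extreme-value cumulative hazard and the logarithmic transformation into the compensator of the martingale displayed in this subsection. The lines below are only a sketch; they assume that $\Lambda_0$ is differentiable with derivative $\lambda_0$.

```latex
\begin{aligned}
\Lambda_\varepsilon\bigl(\beta^\top X + H(u)\bigr)
  &= \exp\bigl(\beta^\top X + \ln \Lambda_0(u)\bigr)
   = e^{\beta^\top X}\,\Lambda_0(u), \\
d\Lambda_\varepsilon\bigl(\beta^\top X + H(u)\bigr)
  &= e^{\beta^\top X}\,\lambda_0(u)\,du, \\
M(t) &= N(t) - \int_0^t Y(u)\,e^{\beta^\top X}\,\lambda_0(u)\,du .
\end{aligned}
```

The last line is the Cox martingale of Section 2.1, so under these two choices the compensators of the two models coincide.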
2.3 Omnibus Test

To test the specification of the Cox model, i.e., to test
\[ H_0 : \lambda(t|X) = \lambda_0(t)\exp(\beta_0^\top X) \quad \text{for some } \beta_0 \text{ and } \lambda_0(t), \]
we could consider the CUSUM of the martingales
\[ R_n(t, x) = n^{-1/2} \sum_{i=1}^{n} 1_{\{X_i \le x\}} M_i(t), \tag{2.4} \]
where $M_i(t) = N_i(t) - \int_0^t Y_i(s)\exp(\beta_0^\top X_i)\lambda_0(s)\,ds$, $i = 1, \dots, n$, are martingales under $H_0$. Lin, Wei and Ying (1993) proposed an omnibus test for the Cox model by considering the process with estimated $\beta_0$ and $\Lambda_0$,
\[ \hat R_n(t, x) = n^{-1/2} \sum_{i=1}^{n} 1_{\{X_i \le x\}} \hat M_i(t), \tag{2.5} \]
where $\hat M_i(t) = N_i(t) - \int_0^t Y_i(s)\exp(\hat\beta^\top X_i)\,d\hat\Lambda_0(s)$, $i = 1, \dots, n$, are the martingale residuals. They showed that, under the null hypothesis, the process $\hat R_n(t, x)$ converges weakly to a centered Gaussian process $R_\infty(t, x)$ in the space $D([0, \infty) \times [-1, 1])$. A Kolmogorov-type statistic is constructed from this process.

To simplify the notation, we only consider the univariate case, i.e., real-valued $X$. In the next section, we decompose $R_n$ into a countable sum of component processes, and use these component processes to construct new test statistics.

3 Tests based on Component Processes

3.1 Conditional Principal Component Analysis

Notice that the process $R_n$ in (2.4) is bivariate with non-independent components $x$ and $t$. Hence, the direct Karhunen-Loève representation is not available in this case. Instead, I adopt a conditional principal component decomposition, i.e., I decompose the process conditional on $X$.

From now on, we impose the following assumptions.

(A1). $T$ and $C$ are independent conditional on $X$.
(A2). $X$ is bounded, without loss of generality by 1.
(A3). $C$ is independent of $X$.
(A4). For each $\tau < \infty$, $P\{Y(\tau) = 1\} > 0$.
(A5). $\Sigma(\beta_0) = E\left[\int_0^\infty (X_i - \bar x(\beta_0, s))^{2}\, Y_i(s)e^{\beta_0 X_i}\lambda_0(s)\,ds\right]$ is positive definite.

The first two assumptions are standard in the Cox model. The third one is needed to justify consistency of the martingale conditional variance. The last two assumptions are needed to get the asymptotic distribution of the partial likelihood estimator $\hat\beta$,
see Andersen and Gill (1982), Theorem 4.2.

Let us begin with the decomposition of the martingale $M_i(t)$ conditional on $X_i$. The conditional covariance of $M_i$ given $X_i$ is
\[
\begin{aligned}
E(M_i(s)M_i(t) \mid X_i)
 &= E\left[\int_0^{s \wedge t} Y_i(u)e^{\beta_0 X_i}\lambda_0(u)\,du \,\Big|\, X_i\right] \\
 &= \int_0^{s \wedge t} E[Y_i(u) \mid X_i]\,e^{\beta_0 X_i}\lambda_0(u)\,du \\
 &= \int_0^{s \wedge t} P(T_i \ge u, C_i \ge u \mid X_i)\,e^{\beta_0 X_i}\lambda_0(u)\,du \\
 &= \int_0^{s \wedge t} P(T_i \ge u \mid X_i)\,P(C_i \ge u \mid X_i)\,e^{\beta_0 X_i}\lambda_0(u)\,du \\
 &= \int_0^{s \wedge t} \exp\bigl(-\Lambda_0(u)e^{\beta_0 X_i}\bigr)\,P(C_i \ge u)\,e^{\beta_0 X_i}\lambda_0(u)\,du.
\end{aligned}
\]
The first equation follows from martingale properties (Fleming and Harrington (1991), Theorem 2.5.1). The last two equations follow from assumptions (A1) and (A3), respectively.

Let us denote the conditional covariance function by
\[ T(s \wedge t, x) = E(M_i(s)M_i(t) \mid X_i = x), \]
with
\[ T(t, x) := E(M^2(t) \mid X = x) = \int_0^t \exp\bigl(-\Lambda_0(u)e^{\beta_0 x}\bigr)\,P(C \ge u)\,e^{\beta_0 x}\lambda_0(u)\,du. \]
Notice that the function $T$ is non-decreasing in $t$, with $T(0, x) = 0$ and $T(\infty, x) \le 1$.

Remark. Suppose there is no censorship. Then
\[ T(t, x) = 1 - \exp\bigl(-\Lambda_0(t)e^{\beta_0 x}\bigr) = F_T(t \mid X = x), \]
i.e., the conditional covariance function of the martingale equals the conditional distribution function of $T$. In this case, $T(\infty, x) = 1$.

Let
\[ \mu_j = \frac{4}{\pi^2(2j - 1)^2}, \qquad \varphi_j(t) = \sqrt{2}\,\sin\frac{(2j - 1)\pi t}{2}, \qquad j = 1, 2, \dots, \]
be the eigenvalues and eigenfunctions of the standard Brownian motion on $[0, 1]$, with covariance structure $K(s, t) = s \wedge t$. For each $x$, let $f_j$ be the transformation
\[ f_j(t, x) = \varphi_j\bigl(T(t, x)/T(\infty, x)\bigr). \]
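The eigenpairs of the Brownian motion covariance used above can be checked numerically. The sketch below is illustrative only (the function names are hypothetical and the quadrature rule is a choice made here, not taken from the paper): it verifies that the $\varphi_j$ are orthonormal on $[0, 1]$ and that the truncated sum $\sum_{j \le J} \mu_j \varphi_j(s)\varphi_j(t)$ approaches $s \wedge t$.

```python
import numpy as np

def mu(j):
    """Eigenvalue mu_j = 4 / (pi^2 (2j - 1)^2) of the covariance K(s, t) = min(s, t)."""
    return 4.0 / (np.pi ** 2 * (2 * j - 1) ** 2)

def phi(j, t):
    """Eigenfunction phi_j(t) = sqrt(2) sin((2j - 1) pi t / 2) on [0, 1]."""
    return np.sqrt(2.0) * np.sin((2 * j - 1) * np.pi * t / 2.0)

def gram_matrix(m, n_grid=20001):
    """Inner products of the first m eigenfunctions, by the trapezoidal rule."""
    u = np.linspace(0.0, 1.0, n_grid)
    h = u[1] - u[0]
    basis = np.array([phi(j, u) for j in range(1, m + 1)])
    weights = np.full(n_grid, h)
    weights[0] = weights[-1] = h / 2.0   # trapezoidal end-point weights
    return (basis * weights) @ basis.T

def mercer_partial_sum(s, t, n_terms):
    """Truncated expansion sum_{j <= n_terms} mu_j phi_j(s) phi_j(t)."""
    j = np.arange(1, n_terms + 1)
    return float(np.sum(mu(j) * phi(j, s) * phi(j, t)))
```

For example, `gram_matrix(5)` should be close to the 5 x 5 identity matrix, and `mercer_partial_sum(0.3, 0.7, 2000)` should be close to 0.3. The slow $1/j^2$ decay of the eigenvalues is visible in how many terms are needed for the truncated sum to settle.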
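Then $\{f_j(\cdot, x)\}$ form an orthonormal basis of a subspace of $L^2(\mathbb{R}_+, T(\cdot, x)/T(\infty, x))$, the Hilbert space of all square integrable functions on $\mathbb{R}_+$ with the inner product
\[ \langle \rho, g \rangle_x = \int_{\mathbb{R}_+} \rho(t)g(t)\,\frac{T(dt, x)}{T(\infty, x)}. \]
Indeed,
\[
\langle f_j, f_h \rangle_x
 = \int_{\mathbb{R}_+} \varphi_j\!\left(\frac{T(t, x)}{T(\infty, x)}\right)\varphi_h\!\left(\frac{T(t, x)}{T(\infty, x)}\right)\frac{T(dt, x)}{T(\infty, x)}
 = \int_0^1 \varphi_j(u)\varphi_h(u)\,du
 = \begin{cases} 1, & j = h, \\ 0, & j \ne h. \end{cases}
\]
Moreover, $\{f_j(\cdot, x)\}$ are the eigenfunctions of the covariance structure $T(s \wedge t, x)/T(\infty, x)$ with associated eigenvalues $\{\mu_j\}$, i.e.,
\[ \int_{\mathbb{R}_+} \frac{T(s \wedge t, x)}{T(\infty, x)}\,f_j(s, x)\,\frac{T(ds, x)}{T(\infty, x)} = \mu_j f_j(t, x). \]
By Mercer's theorem, the covariance function can be decomposed as
\[ \frac{T(s \wedge t, x)}{T(\infty, x)} = \sum_{j=1}^{\infty} \mu_j f_j(s, x)f_j(t, x). \]
Since $T(s \wedge t, x)/T(\infty, x)$ is the conditional covariance function of the process $M_i(t)(T(\infty, x))^{-1/2}$ given $X_i = x$, we have the decomposition
\[ M_i(t)\bigl(T(\infty, X_i)\bigr)^{-1/2} = \sum_{j=1}^{\infty} \mu_j^{1/2} z_{ij} f_j(t, X_i) \quad \text{a.s.}, \tag{3.1} \]
where
\[
z_{ij} := \mu_j^{-1/2}\bigl(T(\infty, X_i)\bigr)^{-1/2}\langle M_i, f_j(X_i) \rangle_{X_i}
 = \mu_j^{-1/2}\bigl(T(\infty, X_i)\bigr)^{-3/2}\int_{\mathbb{R}_+} M_i(t)f_j(t, X_i)\,T(dt, X_i).
\]
The variable $z_{ij}$ is the $j$-th principal component of $M_i(t)(T(\infty, X_i))^{-1/2}$ conditional on $X_i$. For each $j$ and each $h \ne j$, it has the following properties:
\[ E(z_{ij} \mid X_i) = 0, \qquad E(z_{ij}^2 \mid X_i) = 1, \qquad E(z_{ij}z_{ih} \mid X_i) = 0. \tag{3.2} \]
Hence, from (3.1), the Karhunen-Loève representation of $R_n$ can be written as
\[
\begin{aligned}
R_n(t, x) &= n^{-1/2}\sum_{i=1}^{n} 1_{\{X_i \le x\}}\Bigl[\bigl(T(\infty, X_i)\bigr)^{1/2}\sum_{j=1}^{\infty}\mu_j^{1/2} z_{ij} f_j(t, X_i)\Bigr] \\
 &= \sum_{j=1}^{\infty}\mu_j^{1/2}\Bigl[n^{-1/2}\sum_{i=1}^{n} z_{ij}\,1_{\{X_i \le x\}}\bigl(T(\infty, X_i)\bigr)^{1/2} f_j(t, X_i)\Bigr].
\end{aligned}
\]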
I call the term in brackets the $j$-th component process of $R_n$ and denote it by
\[ c_{n,j}(t, x) := n^{-1/2}\sum_{i=1}^{n} z_{ij}\,1_{\{X_i \le x\}}\bigl(T(\infty, X_i)\bigr)^{1/2} f_j(t, X_i). \tag{3.3} \]
Thus we have the following proposition.

Proposition 1. Under the null hypothesis and (A1)-(A5), the CUSUM of martingale processes (2.4) can be decomposed into a weighted sum of component processes,
\[ R_n(t, x) = \sum_{j=1}^{\infty}\mu_j^{1/2} c_{n,j}(t, x). \tag{3.4} \]
The weights are the square roots of the eigenvalues of the standard Brownian motion.

3.2 Asymptotic Theory of Component Processes

The expression for the component process (3.3) looks complicated. However, it can be simplified considerably. Let us define another function $g_j$ corresponding to $f_j$ as
\[ g_j(t, x) := \phi_j\bigl(T(t, x)/T(\infty, x)\bigr) = \sqrt{2}\,\cos\frac{(2j - 1)\pi\,T(t, x)/T(\infty, x)}{2}. \]
By changing the order of integration and a change of variables, the component processes can be rewritten as
\[ c_{n,j}(t, x) = n^{-1/2}\sum_{i=1}^{n} 1_{\{X_i \le x\}} f_j(t, X_i)\int_0^\infty g_j(s, X_i)\,dM_i(s). \tag{3.5} \]
The following theorem shows the convergence of the component process, which follows from a tightness result and the fact that it is a sum of i.i.d. centered random functions with variance
\[ H_j(t, x) := E\bigl[1_{\{X_i \le x\}} T(\infty, X_i) f_j^2(t, X_i)\bigr] = \int_{-1}^{x} T(\infty, s) f_j^2(t, s)\,F(ds). \]
Here $F(\cdot)$ is the distribution function of $X$.

Theorem 1. Under the null hypothesis and (A1)-(A5), for each $j$, the process $c_{n,j}(t, x)$ converges weakly to a centered Gaussian process in the space $D([0, \infty) \times [-1, 1])$,
\[ c_{n,j} \to_d c_{\infty,j}. \]
The limit Gaussian process $c_{\infty,j}$ has covariance structure
\[ K(t_1, t_2, x_1, x_2) = \int_{-1}^{x_1 \wedge x_2} T(\infty, s) f_j(t_1, s) f_j(t_2, s)\,F(ds). \]
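Moreover, $c_{\infty,j}$ and $c_{\infty,h}$ are independent for $j \ne h$.

In order to obtain test statistics, we need to consider the component process after estimation,
\[ \hat c_{n,j}(t, x) := n^{-1/2}\sum_{i=1}^{n} 1_{\{X_i \le x\}}\hat f_j(t, X_i)\int_0^\infty \hat g_j(s, X_i)\,d\hat M_i(s). \tag{3.6} \]
Here
\[ \hat M_i(t) = N_i(t) - \int_0^t Y_i(s)\exp(\hat\beta X_i)\,d\hat\Lambda_0(s), \]
\[ \hat f_j(t, x) = \varphi_j\bigl(\hat T(t, x)/\hat T(\infty, x)\bigr), \qquad \hat g_j(t, x) = \phi_j\bigl(\hat T(t, x)/\hat T(\infty, x)\bigr). \]
As remarked earlier, this involves the estimators of $\beta_0$ and $\Lambda_0$ and an estimator of the conditional covariance function $T(t, x)$. For $\hat T(t, x)$, recall that
\[ T(t, x) = \int_0^t \exp\bigl(-\Lambda_0(u)e^{\beta_0 x}\bigr)\,P(C \ge u)\,e^{\beta_0 x}\lambda_0(u)\,du. \]
A natural consistent estimator is
\[ \hat T(t, x) = \int_0^t \exp\bigl(-\hat\Lambda_0(u)e^{\hat\beta x}\bigr)\bigl(1 - \hat G(u-)\bigr)e^{\hat\beta x}\,d\hat\Lambda_0(u), \]
where $\hat G$ is the Kaplan-Meier estimator of the distribution function of $C$.

In the appendix, it is shown that $\hat c_{n,j}(t, x)$ has the same asymptotic distribution as
\[
\begin{aligned}
\tilde c_{n,j}(t, x) := {} & n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl[1_{\{X_i \le x\}} f_j(t, X_i) g_j(s, X_i) - l_j(\beta_0, t, x, s)\bigr]\,dM_i(s) \\
 & - A_j(t, x)\,\Sigma(\beta_0)^{-1}\,n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl(X_i - \bar x(\beta_0, s)\bigr)\,dM_i(s),
\end{aligned}
\]
with $\bar x(\beta, t)$ and $\Sigma(\beta)$ defined in Section 2.1, and
\[ l_j(\beta, t, x, s) = \frac{E\bigl[Y(s)e^{\beta X}1_{\{X \le x\}} f_j(t, X) g_j(s, X)\bigr]}{E\bigl[Y(s)e^{\beta X}\bigr]}, \]
\[ A_j(t, x) = E\left[\int_0^\infty Y(s)e^{\beta_0 X}\bigl(X - \bar x(\beta_0, s)\bigr)\lambda_0(s)\,1_{\{X \le x\}} f_j(t, X) g_j(s, X)\,ds\right]. \]
The process $\tilde c_{n,j}(t, x)$ can be written in the form
\[ \tilde c_{n,j}(t, x) = n^{-1/2}\sum_{i=1}^{n}\int_0^\infty h_{ij}(\beta_0, t, x, s)\,dM_i(s), \tag{3.7} \]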
with
\[ h_{ij}(\beta, t, x, s) = 1_{\{X_i \le x\}} f_j(t, X_i) g_j(s, X_i) - l_j(\beta, t, x, s) - A_j(t, x)\,\Sigma(\beta)^{-1}\bigl(X_i - \bar x(\beta, s)\bigr). \]
The asymptotic distribution of $\hat c_{n,j}(t, x)$ is given by the following theorem.

Theorem 2. Under the null hypothesis and (A1)-(A5), for each $j = 1, 2, \dots$, the process $\hat c_{n,j}(t, x)$ converges weakly to a centered Gaussian process in the space $D([0, \infty) \times [-1, 1])$,
\[ \hat c_{n,j} \to_d \tilde c_{\infty,j}. \]
The limit Gaussian process $\tilde c_{\infty,j}(t, x)$ has covariance structure
\[ K(t_1, t_2, x_1, x_2) = E\left[\int_0^\infty h_j(\beta_0, t_1, x_1, s)\,h_j(\beta_0, t_2, x_2, s)\,Y(s)e^{\beta_0 X}\lambda_0(s)\,ds\right]. \]

In addition to a single component process, a finite weighted sum of component processes can also be used for model checking. In this way, we combine information from different components. Consider the first $m$ component processes with weights $w = \{w_j\}_{j=1}^{m}$, i.e., the process $\sum_{j=1}^{m} w_j \hat c_{n,j}(t, x)$. It has the same asymptotic distribution as the process
\[ \sum_{j=1}^{m} w_j \tilde c_{n,j}(t, x) = n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\Bigl(\sum_{j=1}^{m} w_j h_{ij}(\beta_0, t, x, s)\Bigr)\,dM_i(s). \]
The asymptotic distribution is given in the following theorem.

Theorem 3. Under the null hypothesis and (A1)-(A5), for any given weights $w = \{w_j\}_{j=1}^{m}$, the process $\sum_{j=1}^{m} w_j \hat c_{n,j}(t, x)$ converges weakly to a centered Gaussian process in the space $D([0, \infty) \times [-1, 1])$,
\[ \sum_{j=1}^{m} w_j \hat c_{n,j} \to_d \tilde c_w. \]
The limit Gaussian process $\tilde c_w(t, x)$ has covariance structure
\[ K(t_1, t_2, x_1, x_2) = E\left[\int_0^\infty\Bigl(\sum_{j=1}^{m} w_j h_j(\beta_0, t_1, x_1, s)\Bigr)\Bigl(\sum_{j=1}^{m} w_j h_j(\beta_0, t_2, x_2, s)\Bigr)Y(s)e^{\beta_0 X}\lambda_0(s)\,ds\right]. \]

3.3 Test Statistics

The omnibus tests are based on the original CUSUM of the martingales. By the continuous mapping theorem, we have the following asymptotic distributions of the Kolmogorov-Smirnov and Cramér-von Mises type statistics:
\[ KS_o = \sup_{t,x}\bigl|\hat R_n(t, x)\bigr| \to_d \sup_{t,x}\bigl|R_\infty(t, x)\bigr|, \]
\[ CvM_o = \iint\bigl[\hat R_n(t, x)\bigr]^2 F_n(dx)\,dt \to_d \iint\bigl[R_\infty(t, x)\bigr]^2 F(dx)\,dt. \]
Here $F_n(\cdot)$ is the empirical distribution of $X$.

The component processes we derived provide a basis for new specification tests for the Cox model. I propose to construct Kolmogorov-Smirnov and Cramér-von Mises type statistics based on each component process, i.e., for each $j = 1, 2, \dots$, we have the following statistics, which I call component tests:
\[ KS_{nj} = \sup_{t,x}\bigl|\hat c_{n,j}(t, x)\bigr| \to_d \sup_{t,x}\bigl|\tilde c_{\infty,j}(t, x)\bigr|, \]
\[ CvM_{nj} = \iint\bigl[\hat c_{n,j}(t, x)\bigr]^2 F_n(dx)\,dt \to_d \iint\bigl[\tilde c_{\infty,j}(t, x)\bigr]^2 F(dx)\,dt. \]
Note that in (3.4), the weight of the $j$-th component process is $\mu_j^{1/2}$, which decreases rapidly in $j$. In consequence, the later components are down-weighted in the original process. In fact, each component reflects a certain aspect of a deviation from the null hypothesis. For example, high-frequency deviations are reflected more strongly in later components. Therefore, the omnibus test, which gives low weights to later components, has low power against such deviations, while the tests based on later components are specially designed for high-frequency alternatives. Since different aspects of a deviation are distinguished through its components and it is difficult to decide which component to use before model checking, we should construct tests based on each component and reject the null hypothesis if any of them rejects. In practice, deviations are unlikely to be of very high frequency, hence we can focus on the first few components, say no more than ten in general.

In addition, smooth test statistics based on a reweighted sum of component processes can be constructed. If we give the components equal weights and consider the sum of the first $m$ components, the Kolmogorov-Smirnov and Cramér-von Mises type statistics, for some fixed $m$, can be constructed as
\[ KS_{nm} = \sup_{t,x}\Bigl|\sum_{j=1}^{m} w_j \hat c_{n,j}(t, x)\Bigr| \to_d \sup_{t,x}\bigl|\tilde c_w(t, x)\bigr|, \]
\[ CvM_{nm} = \iint\Bigl[\sum_{j=1}^{m} w_j \hat c_{n,j}(t, x)\Bigr]^2 F_n(dx)\,dt \to_d \iint\bigl[\tilde c_w(t, x)\bigr]^2 F(dx)\,dt. \]
The smooth tests provide a compromise between the omnibus tests and the tests based on one component.
The smooth test is the one that takes $w = (1, \dots, 1)$. The test based on the $j$-th component process is the one that takes $w$ as the $j$-th unit vector, i.e., $w = (0, \dots, 1, \dots, 0)$. However, the problem is that one has to choose a suitable $w$ before model checking. Actually, we can take into account the information from all the component processes by considering a new test that behaves as an intersection of the component tests. The idea is based on the Bonferroni method: we run the first $m$ component tests and record the decision of each one. Then we accept $H_0$ if all $m$ tests accept, and reject $H_0$ if any of them rejects. Specifically, let $T_1, T_2, \dots, T_m$ be the first $m$ component tests with common size $x$. The Bonferroni test $T$ is
\[ T = \begin{cases} 0 & \text{if } T_1 = \dots = T_m = 0, \\ 1 & \text{otherwise.} \end{cases} \]
The probability that our test accepts under $H_0$ is $P(T_1 = 0, \dots, T_m = 0)$, and it satisfies the inequality
\[
P(T_1 = 0, \dots, T_m = 0) \ge P(T_1 = 0) + \dots + P(T_m = 0) - (m - 1) = (1 - x) + \dots + (1 - x) - (m - 1) = 1 - mx.
\]
For a significance level $\alpha$, we can choose $x = \alpha/m$; then the size of our test is
\[ 1 - P(T_1 = 0, \dots, T_m = 0) \le mx = \alpha, \]
i.e., the size of the Bonferroni test is bounded by $\alpha$.

To approximate the distribution of the limit process $\tilde c_{\infty,j}(t, x)$, we follow the suggestion of Lin, Wei and Ying (1993) and use Monte Carlo simulation. Recall from expression (3.7) that $\tilde c_{n,j}(t, x)$ is a martingale integral. To approximate its asymptotic distribution, the integrand $h_{ij}(\beta_0, t, x, s)$ can be replaced by a consistent estimator, but we do not know the distributional form of the martingale $M_i(t)$. Lin, Wei and Ying (1993) suggested replacing $M_i(t)$ by a similar process with a known distribution. The candidate is $N_i(t)G_i$, where $N_i$ is the observed counting process and $\{G_i;\ i = 1, \dots, n\}$ is a random sample of standard normal variables. By the martingale property $E[M^2(t)] = E[N(t)]$, the processes $M_i(t)$ and $N_i(t)G_i$ have the same variance function.
Finally, replace all the unknown quantities in $h_{ij}(\beta_0, t, x, s)$ by their consistent estimators, i.e., replace $\beta_0$, $\Lambda_0(t)$, $f_j(t, x)$, $g_j(t, x)$ by $\hat\beta$, $\hat\Lambda_0(t)$, $\varphi_j(\hat T(t, x)/\hat T(\infty, x))$, $\phi_j(\hat T(t, x)/\hat T(\infty, x))$, and replace $\bar x(\beta, t)$ and $l_j(\beta, t, x, s)$ by their sample analogues. Conditional on the observed data, the process obtained after these replacements has the same limiting distribution as $\tilde c_{n,j}(t, x)$.
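The resampling step and the Bonferroni decision rule of this section can be sketched as follows. This is a hedged illustration rather than the implementation used for the simulations: it assumes that the estimated integrand has already been evaluated on a finite grid of $(t, x)$ points, so that `contrib[i, k]` holds $\delta_i\,\hat h_{ij}$ at grid point $k$ and at subject $i$'s observed time (replacing $dM_i$ by $G_i\,dN_i$ leaves exactly this term). The names `multiplier_pvalue`, `bonferroni_reject` and `contrib` are hypothetical.

```python
import numpy as np

def multiplier_pvalue(obs_stat, contrib, n_draws=1000, seed=0):
    """Approximate the p-value of a Kolmogorov-Smirnov type statistic.

    obs_stat : observed value of sup |c_hat_{n,j}| over the grid.
    contrib  : array of shape (n, n_grid); row i is subject i's contribution
               delta_i * h_hat_ij evaluated on the (t, x) grid (assumed given).
    """
    rng = np.random.default_rng(seed)
    n = contrib.shape[0]
    g = rng.standard_normal((n_draws, n))        # multipliers G_i, one row per draw
    paths = g @ contrib / np.sqrt(n)             # simulated processes on the grid
    sup_draws = np.abs(paths).max(axis=1)        # sup statistic of each draw
    return float(np.mean(sup_draws >= obs_stat))

def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H0 if any of the m component tests rejects at level alpha / m."""
    p = np.asarray(pvalues, dtype=float)
    return bool(np.any(p <= alpha / p.size))
```

For instance, with three component p-values `[0.2, 0.004, 0.5]` and `alpha=0.05`, each test is run at level 0.05/3, so the second component triggers a rejection. A Cramér-von Mises version would replace the `max` over the grid by a weighted sum of squares.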
4 Simulation study

As discussed earlier, the accelerated failure time model and the transformation model provide general frameworks for studying covariable effects on duration data. In our simulation study, we take several alternatives from these models to study the power of our tests. We consider the following DGPs, with explanations afterwards.

Cox: $\lambda(t|X) = \lambda_0(t)\exp(\beta_0 X)$.

DGP1: Weibull hazard rate, $\lambda(t|X) = (0.2X)\,t^{0.2X - 1}$.

DGP2: Log-normal model, $\ln(T) = \beta_0 X + \epsilon$, where $\epsilon$ is a standard normal variable. This model is a special case of the accelerated failure time models.

DGP3: Transformation model, $\Lambda_0(T)e^{\beta_0 X} = \text{Pareto}$, where Pareto is a standard Pareto variable, which has hazard rate $x^{-1}$ for $x > 1$.

DGP4: Transformation model, $\Lambda_0(T)e^{\beta_0 X} = A_1$, where $A_1$ is a positive random variable with hazard rate $\lambda(t) = 1 + \sin(3\pi t/2)$.

DGP5: Transformation model, $\Lambda_0(T)e^{\beta_0 X} = A_2$, where $A_2$ is a positive random variable with hazard rate $\lambda(t) = 1 + \cos(3\pi t/2)$.

DGP6: Transformation model, $\Lambda_0(T)e^{\beta_0 X} = A_3$, where $A_3$ is a positive random variable with hazard rate $\lambda(t) = 1 + \sin(5\pi t/2)$.
DGP7: Transformation model, $\Lambda_0(T)e^{\beta_0 X} = A_4$, where $A_4$ is a positive random variable with hazard rate $\lambda(t) = 1 + \cos(5\pi t/2)$.

DGP1 is the Weibull hazard model, in which the hazards for different values of the covariable are non-proportional. DGP2 is a commonly used model in economics, and it belongs to the accelerated failure time models. DGP3-7 are transformation models with unspecified transformation $\ln(\Lambda_0(\cdot))$. The Cox model, as a special case of a transformation model, can be expressed as $\Lambda_0(T)e^{\beta_0 X} = E$, where $E$ is a standard exponential variable, which has a constant hazard rate. In DGP3, we replace the exponential by a Pareto variable, which has a decreasing hazard rate. We call DGP4-7 high-frequency alternatives, in the sense that the variables $A_1, A_2, A_3, A_4$ have periodic hazard rates rather than constant ones.

We take $\beta_0 = 0.2$, $\lambda_0(t) = 1$, $\Lambda_0(t) = t$, and $X = 0, 1, \dots, 9$ with equal proportions. The censoring variable in each case is drawn from a uniform distribution such that the percentage of censorship is around 3%. We run the experiment for sample sizes n = 5, 1, 15, and use 1 realizations of the Gaussian process to estimate the distribution of each statistic. We run 1 replications for each DGP.

The results are shown in Tables 1 and 2. The omnibus test is based on the original process $\hat R_n(t, x)$. The smooth test is based on the reweighted sum of the first five component processes, as discussed in the previous section. The last five lines in the tables are for tests based on the first five component processes. I use bold type to indicate the test that has the largest power. Since the Weibull hazard rate data are highly non-proportional, the omnibus test works well, but the test based on the second component has larger power. The log-normal data seem to fit the Cox model well, but the test based on the second component still has the largest power. The same holds for the Pareto alternative.
Finally, the results for the high-frequency alternatives DGP4-7 show clearly how these components serve as special experts for certain deviations from the proportional hazard assumption. As the alternative becomes more frequent in the time domain, the tests based on later components perform better.
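For readers who want to reproduce a high-frequency alternative, the sketch below draws data from a DGP4-type transformation model by inverting the cumulative hazard of $A_1$. It is an illustration under stated assumptions, not the paper's simulation code: the covariable values $\{0, \dots, 9\}$, the censoring bound `c_max`, the bisection solver and all function names are choices made here; only the hazard $1 + \sin(3\pi t/2)$, $\Lambda_0(t) = t$ and the value 0.2 for the coefficient come from the text.

```python
import numpy as np

def cum_hazard_sin(a, freq=3.0):
    """Cumulative hazard of A when its hazard rate is 1 + sin(freq * pi * t / 2)."""
    c = freq * np.pi / 2.0
    return a + (1.0 - np.cos(c * a)) / c

def draw_A(e, freq=3.0, n_bisect=60):
    """Solve cum_hazard_sin(A) = e for A by bisection.

    If e is standard exponential, A then has the required hazard rate.
    The root lies in [0, e] because the cumulative hazard is at least a.
    """
    e = np.asarray(e, dtype=float)
    lo = np.zeros_like(e)
    hi = e.copy()
    for _ in range(n_bisect):
        mid = 0.5 * (lo + hi)
        below = cum_hazard_sin(mid, freq) < e
        lo = np.where(below, mid, lo)
        hi = np.where(below, hi, mid)
    return 0.5 * (lo + hi)

def simulate_dgp4(n, beta=0.2, c_max=3.0, seed=0):
    """Draw (Z, delta, X) with Lambda_0(T) exp(beta X) = A_1 and Lambda_0(t) = t."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 10, size=n).astype(float)   # assumed covariable values 0..9
    a = draw_A(rng.exponential(size=n))
    t = a * np.exp(-beta * x)                       # T = A_1 exp(-beta X)
    c = rng.uniform(0.0, c_max, size=n)             # assumed uniform censoring bound
    return np.minimum(t, c), (t <= c).astype(int), x
```

In practice `c_max` would be tuned until the observed censoring rate `1 - delta.mean()` matches the target percentage; replacing the sine by a cosine or changing `freq` to 5 gives the analogues of DGP5-7, with the cumulative hazard adjusted accordingly.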
Table 1: Estimated size and power of KS tests at 5%. Columns: Cox, DGP1, DGP2, DGP3, DGP4, DGP5, DGP6, DGP7, each reported for the different sample sizes n. Rows: omnibus, smooth, Bonferroni. [Numerical entries not recoverable from this transcription.]

Table 2: Estimated size and power of CvM tests at 5%. Columns: Cox, DGP1, DGP2, DGP3, DGP4, DGP5, DGP6, DGP7, each reported for the different sample sizes n. Rows: omnibus, smooth, Bonferroni. [Numerical entries not recoverable from this transcription.]
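Table 3: Estimated size and power of KS component tests at 5%. Columns: Cox, DGP1, DGP2, DGP3, DGP4, DGP5, DGP6, DGP7, each reported for the different sample sizes n. Rows: 1st, 2nd, 3rd, 4th, 5th component. [Numerical entries not recoverable from this transcription.]

Table 4: Estimated size and power of CvM component tests at 5%. Columns: Cox, DGP1, DGP2, DGP3, DGP4, DGP5, DGP6, DGP7, each reported for the different sample sizes n. Rows: 1st, 2nd, 3rd, 4th, 5th component. [Numerical entries not recoverable from this transcription.]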
5 Conclusion

We have used a conditional principal component decomposition method to decompose the martingale process in hazard models with regression. The component processes provide a basis for more powerful specification tests. The decomposition is in the time domain, and each component process reflects certain deviations from the proportional hazard assumption. The method is applicable to any hazard model with regression and to transformation models.

However, these components do not help when the deviations come from misspecification of the covariable effect, for example, a missing variable or a wrong link function. To obtain more powerful tests against these deviations, we need the decomposition of the bivariate process $R_n(t, x)$ in $x$. Since $x$ and $t$ play different roles in $R_n(t, x)$, the decomposition method should be different. The decomposition in $x$ will be discussed in a subsequent paper.

6 Appendix: Proofs

Proof of Theorem 1: Note that each $f_j$ and $g_j$ is bounded and differentiable. The tightness of $c_{n,j}$ follows from Lemma 1 in Lin, Wei and Ying (1993). It then follows from the multivariate CLT that the process converges weakly to a centered Gaussian process. The independence between $c_{\infty,j}$ and $c_{\infty,h}$ comes from the Gaussian property and the conditional uncorrelatedness of $z_{ij}$ and $z_{ih}$.

Proof of the asymptotic equivalence of $\hat c_{n,j}(t, x)$ and $\tilde c_{n,j}(t, x)$: The asymptotic properties of $\hat\beta$ and $\hat\Lambda_0$ are given by Tsiatis (1981) and Andersen and Gill (1982). By taking the Taylor expansion of $\hat c_{n,j}(t, x)$ and the score function
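$U(\beta)$ at $\beta_0$, we have
\[
\begin{aligned}
\hat c_{n,j}(t, x) = {} & n^{-1/2}\sum_{i=1}^{n}\int_0^\infty 1_{\{X_i \le x\}}\hat f_j(t, X_i)\hat g_j(s, X_i)\,dM_i(s) \\
 & - n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\frac{\sum_{k=1}^{n} Y_k(s)e^{\beta_0 X_k}1_{\{X_k \le x\}}\hat f_j(t, X_k)\hat g_j(s, X_k)}{\sum_{k=1}^{n} Y_k(s)e^{\beta_0 X_k}}\,dM_i(s) \\
 & - \Bigl[n^{-1}\sum_{i=1}^{n}\int_0^\infty Y_i(s)e^{\beta_0 X_i}\bigl(X_i - \bar X(\beta_0, s)\bigr)\lambda_0(s)\,1_{\{X_i \le x\}}\hat f_j(t, X_i)\hat g_j(s, X_i)\,ds\Bigr] \\
 & \quad \times \Sigma(\beta_0)^{-1}\,n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl(X_i - \bar X(\beta_0, s)\bigr)\,dM_i(s) + o_p(1).
\end{aligned}
\]
By the strong consistency of $\hat\beta$, $\hat\Lambda_0$ and the Kaplan-Meier estimator, together with the continuous mapping theorem, $\hat f_j$ and $\hat g_j$ are strongly consistent. Hence, for the first term on the right-hand side of the above equation, by the martingale property and the strong consistency and boundedness of $\hat f_j$ and $\hat g_j$, we have
\[
\begin{aligned}
 & E\Bigl[n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl(1_{\{X_i \le x\}}\hat f_j(t, X_i)\hat g_j(s, X_i) - 1_{\{X_i \le x\}} f_j(t, X_i) g_j(s, X_i)\bigr)\,dM_i(s)\Bigr]^2 \\
 &\quad = E\Bigl[n^{-1}\sum_{i=1}^{n}\int_0^\infty\bigl(1_{\{X_i \le x\}}\hat f_j(t, X_i)\hat g_j(s, X_i) - 1_{\{X_i \le x\}} f_j(t, X_i) g_j(s, X_i)\bigr)^2 Y_i(s)e^{\beta_0 X_i}\lambda_0(s)\,ds\Bigr] \\
 &\quad = E\Bigl[\int_0^\infty\bigl(1_{\{X_i \le x\}}\hat f_j(t, X_i)\hat g_j(s, X_i) - 1_{\{X_i \le x\}} f_j(t, X_i) g_j(s, X_i)\bigr)^2 Y_i(s)e^{\beta_0 X_i}\lambda_0(s)\,ds\Bigr] \to 0,
\end{aligned}
\]
thus
\[ n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl(1_{\{X_i \le x\}}\hat f_j(t, X_i)\hat g_j(s, X_i) - 1_{\{X_i \le x\}} f_j(t, X_i) g_j(s, X_i)\bigr)\,dM_i(s) = o_p(1). \]
The same argument applies to the second term, since from the strong consistency of $\hat f_j$ and $\hat g_j$ and the uniform SLLN, we have
\[ n^{-1}\sum_{i=1}^{n} Y_i(s)e^{\beta_0 X_i}1_{\{X_i \le x\}}\bigl(\hat f_j(t, X_i)\hat g_j(s, X_i) - f_j(t, X_i) g_j(s, X_i)\bigr) = o_p(1). \]
For the third term, we have
\[ n^{-1}\sum_{i=1}^{n}\int_0^\infty Y_i(s)e^{\beta_0 X_i}\bigl(X_i - \bar X(\beta_0, s)\bigr)\lambda_0(s)\,1_{\{X_i \le x\}}\bigl(\hat f_j(t, X_i)\hat g_j(s, X_i) - f_j(t, X_i) g_j(s, X_i)\bigr)\,ds = o_p(1), \]
and
\[ n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl(X_i - \bar X(\beta_0, s)\bigr)\,dM_i(s) \to_d N(0, \Sigma(\beta_0)). \]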
Thus, $\hat c_{n,j}(t, x)$ and $\tilde c_{n,j}(t, x)$ have the same asymptotic distribution.

Proof of Theorem 2: To show the tightness of $\hat c_{n,j}(t, x)$, it suffices to show the tightness of $\tilde c_{n,j}(t, x)$. Recall that
\[
\begin{aligned}
\tilde c_{n,j}(t, x) = {} & n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl[1_{\{X_i \le x\}} f_j(t, X_i) g_j(s, X_i) - l_j(\beta_0, t, x, s)\bigr]\,dM_i(s) \\
 & - A_j(t, x)\,\Sigma(\beta_0)^{-1}\,n^{-1/2}\sum_{i=1}^{n}\int_0^\infty\bigl(X_i - \bar x(\beta_0, s)\bigr)\,dM_i(s).
\end{aligned}
\]
By Lemma 1 in Lin, Wei and Ying (1993), the first term is tight. The second term is tight since $n^{-1/2}\sum_{i=1}^{n}\int_0^\infty (X_i - \bar x(\beta_0, s))\,dM_i(s)$ converges in distribution. It then follows from the multivariate CLT that $\hat c_{n,j}(t, x)$ converges weakly to a centered Gaussian process.
References

[1] Aalen, O. O. (1980). A model for non-parametric regression analysis of life times. Mathematical Statistics and Probability Theory (eds W. Klonecki, A. Kozek and J. Rosinski), Lecture Notes in Statistics, vol. 2, Springer-Verlag, New York.
[2] Andersen, P. K. and Gill, R. D. (1982). Cox's regression model for counting processes: a large sample study. The Annals of Statistics.
[3] Anh, T. L. and Stute, W. (2012). Principal component analysis of martingale residuals. Indian Statist. Assoc.
[4] Barlow, W. E. and Prentice, R. L. (1988). Residuals for relative risk regression. Biometrika.
[5] Bickel, P. J. and Wichura, M. J. (1971). Convergence criteria for multiparameter stochastic processes and some applications. The Annals of Mathematical Statistics.
[6] Billingsley, P. (2013). Convergence of Probability Measures. John Wiley & Sons.
[7] Breslow, N. (1974). Covariance analysis of censored survival data. Biometrics.
[8] Chen, K., Jin, Z. and Ying, Z. (2002). Semiparametric analysis of transformation models with censored data. Biometrika.
[9] Cheng, S. C., Wei, L. J. and Ying, Z. (1995). Analysis of transformation models with censored data. Biometrika.
[10] Cox, D. R. (1972). Regression models and life-tables. Journal of the Royal Statistical Society. Series B (Methodological).
[11] Cox, D. R. (1975). Partial likelihood. Biometrika, 62(2).
[12] Dabrowska, D. M. and Doksum, K. A. (1988). Partial likelihood in transformation models with censored data. Scandinavian Journal of Statistics.
[13] Delgado, M. A. and Stute, W. (2008). Distribution-free specification tests of conditional models. Journal of Econometrics.
[14] Durbin, J., Knott, M. and Taylor, C. C. (1975). Components of Cramér-von Mises statistics. II. Journal of the Royal Statistical Society. Series B (Methodological).
[15] Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. Wiley, New York.
[16] Lin, D. Y., Wei, L. J. and Ying, Z. (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika, 80(3).
[17] Martinussen, T. and Scheike, T. H. (2007). Dynamic Regression Models for Survival Data. Springer Science & Business Media.
[18] Schoenfeld, D. (1980). Chi-squared goodness-of-fit tests for the proportional hazards regression model. Biometrika, 67(1).
[19] Schoenfeld, D. (1982). Partial residuals for the proportional hazards regression model. Biometrika.
[20] Stute, W. (1993). Consistent estimation under random censorship when covariables are present. Journal of Multivariate Analysis, 45(1).
[21] Stute, W. (1997). Nonparametric model checks for regression. The Annals of Statistics.
[22] Therneau, T. M., Grambsch, P. M. and Fleming, T. R. (1990). Martingale-based residuals for survival models. Biometrika.
[23] Tsiatis, A. A. (1981). A large sample study of Cox's regression model. The Annals of Statistics.