PhD course in Advanced survival analysis.
Non-parametric hypothesis tests. Parametric models.
1 October 2012
Per Kragh Andersen

(ABGK, sect. V.1.1) One-sample tests.
Counting process $N(t)$. Intensity process $\lambda(t) = \alpha(t)Y(t)$ satisfying Aalen's multiplicative intensity model. Usually, $N(t)$ is obtained by aggregating individual counting processes, $\alpha(t)$ is the hazard function, and $Y(t)$ is the number at risk.
We will test the null hypothesis
$$H_0: \alpha(t) = \alpha_0(t),$$
with $\alpha_0(t)$ a known hazard function, e.g. a population mortality.

Idea: Compare Nelson-Aalen increments
$$d\hat{A}(s) = \frac{dN(s)}{Y(s)}$$
with $\alpha_0(s)\,ds$.
Formally: introduce predictable, non-negative weights, $K(s)$, and consider the process:
$$Z(t) = \int_0^t K(s)\left(d\hat{A}(s) - \alpha_0(s)\,ds\right).$$
Then $Z(t)$ tends to be large (small) if $\alpha(s)$ is larger (smaller) than $\alpha_0(s)$ for all $s \in [0, t]$.
Note that we always assume that $K(s) = 0$ whenever $Y(s) = 0$.

Properties.
If $H_0$ is true then $dN(s) = \alpha_0(s)Y(s)\,ds + dM(s)$ and $Z(t)$ equals the stochastic integral
$$Z(t) = \int_0^t \frac{K(s)}{Y(s)}\,dM(s).$$
In particular, $Z(t)$ is a mean zero martingale with (under $H_0$)
$$\langle Z\rangle(t) = \int_0^t \left(\frac{K(s)}{Y(s)}\right)^2 d\langle M\rangle(s) = \int_0^t \frac{K^2(s)}{Y(s)}\,\alpha_0(s)\,ds.$$
Conditions can be found under which, by the martingale CLT, the test statistic $Z(t)/\sqrt{\langle Z\rangle(t)}$ is asymptotically standard normal under $H_0$.
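As a rough numerical illustration of the one-sample statistic, the sketch below evaluates $Z(t)$, $\langle Z\rangle(t)$ and their standardized ratio at the end of follow-up. It makes simplifying assumptions that are not in the slides: everyone is at risk from time 0, the null hazard $\alpha_0$ is constant, and the weight is a function of $Y(s)$ only. The function name `one_sample_test` and the data are hypothetical.

```python
def one_sample_test(exit_times, events, alpha0, weight=lambda y: y):
    """Return (Z, <Z>, Z / sqrt(<Z>)) at t = the largest exit time.

    Assumptions of this sketch: everyone is at risk from time 0, the null
    hazard alpha0 is constant, and the weight is K(s) = weight(Y(s)).
    """
    z, var, prev = 0.0, 0.0, 0.0
    for s in sorted(set(exit_times)):
        # Y is constant on (prev, s]: nobody leaves strictly inside it.
        y = sum(1 for t in exit_times if t >= s)
        k = weight(y)
        d = sum(e for t, e in zip(exit_times, events) if t == s)  # dN(s)
        z += k * d / y - k * alpha0 * (s - prev)   # K (dA_hat - alpha0 ds)
        var += k * k / y * alpha0 * (s - prev)     # (K^2 / Y) alpha0 ds
        prev = s
    return z, var, z / var ** 0.5


# Hypothetical data: four subjects, one censored at time 3.
z, var, u = one_sample_test([2, 3, 5, 7], [1, 0, 1, 1], alpha0=0.1)
print(z, var, u)
```

With the default weight $K(s) = Y(s)$ the first value reduces to the number of events minus $\alpha_0$ times the total time at risk, and the second to $\alpha_0$ times the total time at risk.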
Choice of weights.
Common choice is $K(t) = Y(t)$, cf. Ex. V.1.1. Then
$$Z(t) = \int_0^t Y(s)\left(\frac{dN(s)}{Y(s)} - \alpha_0(s)\,ds\right) = N(t) - E(t), \qquad E(t) = \int_0^t Y(s)\,\alpha_0(s)\,ds,$$
with
$$\langle Z\rangle(t) = \int_0^t \frac{Y^2(s)}{Y(s)}\,\alpha_0(s)\,ds = E(t).$$
The one-sample logrank test statistic is then
$$\frac{N(t) - E(t)}{\sqrt{E(t)}}$$
and $E(t)$ is the expected number of events in $[0, t]$ under $H_0$ (since then $N(t) - E(t)$ is a martingale).

Example: Fyn diabetics.
For female diabetics (Ex. V.1.3) we have, for t = 1 years, $N(t) = 237$ and using published Danish life-tables we get $E(t) = 80.1$. The one-sample logrank statistic takes the (highly significant) value:
$$\frac{237 - 80.1}{\sqrt{80.1}} = 17.5.$$

(ABGK sec. V.2.1) k-sample tests.
$k$-variate counting process: $(N_1(t), \ldots, N_k(t))$ where, typically, each component, $N_h(t)$, is obtained by aggregating individual processes.
Intensity process $(\lambda_1(t), \ldots, \lambda_k(t))$ of the form:
$$\lambda_h(t) = \alpha_h(t)Y_h(t), \quad h = 1, \ldots, k.$$
We will test the null hypothesis:
$$H_0: \alpha_1(t) = \cdots = \alpha_k(t) \text{ for all } t,$$
e.g., if mortality is the same in $k$ groups.

Properties of test statistic.
If $H_0$ is true then $N_\bullet(t) = \sum_{h=1}^k N_h(t)$ is a univariate counting process with intensity process:
$$\lambda_\bullet(t) = \sum_{h=1}^k \lambda_h(t) = \alpha(t)Y_\bullet(t), \qquad Y_\bullet(t) = \sum_{h=1}^k Y_h(t),$$
where $\alpha(t)$ is the common value of the $\alpha_h(t)$'s under $H_0$.
The idea is now to compare the Nelson-Aalen increments a priori:
$$d\hat{A}_h(t) = \frac{dN_h(t)}{Y_h(t)}$$
with those under $H_0$:
$$d\hat{A}(t) = \frac{dN_\bullet(t)}{Y_\bullet(t)}.$$
Properties of test statistic.
Formally, follow the approach from the one-sample situation: introduce non-negative weights, $K(s)$, and consider the processes (for $h = 1, \ldots, k$):
$$Z_h(t) = \int_0^t K(s)Y_h(s)\left(d\hat{A}_h(s) - d\hat{A}(s)\right) = \underbrace{\int_0^t K(s)\,dN_h(s)}_{\text{observed (weighted)}} - \underbrace{\int_0^t K(s)\,\frac{Y_h(s)}{Y_\bullet(s)}\,dN_\bullet(s)}_{\text{expected (weighted)}}.$$
Note that $\sum_{h=1}^k Z_h(t) = 0$.
We assume that $K(s) = 0$ whenever $Y_\bullet(s) = 0$.

Properties of test statistic.
Under $H_0$, the $Z_h(t)$'s are mean zero martingales. From their predictable (co-)variance processes, we obtain the (co-)variance estimators:
$$\hat{\sigma}_{hj}(t) = \int_0^t K^2(s)\,\frac{Y_h(s)}{Y_\bullet(s)}\left(\delta_{hj} - \frac{Y_j(s)}{Y_\bullet(s)}\right)dN_\bullet(s).$$
Introduce $Z(t) = (Z_1(t), \ldots, Z_{k-1}(t))^T$ and $\hat{\Sigma}(t) = (\hat{\sigma}_{hj}(t))_{h,j=1}^{k-1}$ and use the quadratic form as test statistic:
$$X^2(t) = Z(t)^T\,\hat{\Sigma}(t)^{-1} Z(t),$$
which is asymptotically $\chi^2_{k-1}$ under $H_0$ under suitable regularity conditions (Th. V.2.1).

Choices of weights.
General choices:
Log-rank (Savage): $K(s) = I(Y_\bullet(s) > 0)$.
Gehan-Breslow (Wilcoxon, Kruskal-Wallis): $K(s) = Y_\bullet(s)$.
Tarone-Ware: $K(s) = Y_\bullet(s)^\rho$.
Survival data:
Peto-Prentice (Wilcoxon, Kruskal-Wallis): $K(s) = \prod_{u<s}\left(1 - \frac{\Delta N_\bullet(u)}{Y_\bullet(u)+1}\right)\frac{Y_\bullet(s)}{Y_\bullet(s)+1}$.
Harrington-Fleming: $K(s) = I(Y_\bullet(s) > 0)\,\hat{S}(s-)^\rho$.
The log-rank test is optimal for proportional hazards alternatives.

[Figure: Nelson-Aalen estimates by tumor thickness. Categories: 0-2, 2-5, 5+ mm.]
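The weighted k-sample statistic can be sketched in a few lines of Python for right-censored survival data. This is a minimal illustration under stated assumptions, not ABGK's implementation: everyone is at risk from time 0, the weight depends on $Y_\bullet(s)$ only (so it covers the log-rank, Gehan-Breslow and Tarone-Ware choices, but not Peto-Prentice or Harrington-Fleming), and the names `k_sample_test` and `solve` are hypothetical. The sums run over the distinct event times and follow the formulas for $Z_h$ and $\hat{\sigma}_{hj}$ term by term.

```python
from collections import Counter

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for q in range(c, n + 1):
                M[r][q] -= f * M[c][q]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][q] * x[q] for q in range(r + 1, n))) / M[r][r]
    return x

def k_sample_test(times, events, groups, weight=lambda y: 1.0):
    """Return (Z, Sigma, X2); weight maps Y.(s) to K(s), default log-rank."""
    labels = sorted(set(groups))
    k = len(labels)
    Z = [0.0] * k
    Sigma = [[0.0] * k for _ in range(k)]
    for s in sorted({t for t, d in zip(times, events) if d}):
        # Y_h(s): at risk just before s; dN_h(s): events at s, per group
        at_risk = Counter(g for t, g in zip(times, groups) if t >= s)
        died = Counter(g for t, d, g in zip(times, events, groups) if d and t == s)
        y_tot, d_tot = sum(at_risk.values()), sum(died.values())
        K = weight(y_tot)
        for h, gh in enumerate(labels):
            frac_h = at_risk[gh] / y_tot
            Z[h] += K * (died[gh] - frac_h * d_tot)       # observed - expected
            for j, gj in enumerate(labels):
                delta = 1.0 if h == j else 0.0
                Sigma[h][j] += K * K * frac_h * (delta - at_risk[gj] / y_tot) * d_tot
    # Quadratic form in the first k-1 components
    x = solve([row[:k - 1] for row in Sigma[:k - 1]], Z[:k - 1])
    X2 = sum(zh * xh for zh, xh in zip(Z, x))
    return Z, Sigma, X2


# Hypothetical two-group data, no censoring.
Z, Sigma, X2 = k_sample_test([1, 2, 3, 4], [1, 1, 1, 1], ["A", "B", "A", "B"])
print(Z, X2)
```

Passing `weight=lambda y: y` gives the Gehan-Breslow version. The variance estimator used here is the one stated on the slide, without any additional correction for tied event times.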
Melanoma data.
Compare mortality for tumor thickness groups: below 2mm, 2-5mm, above 5mm.
Log-rank test (at $t = 14$ years): $X^2(t) = 31.6$, d.f. = 2.
Harrington-Fleming test (at $t = 14$ years, $\rho = 1$): $X^2(t) = 33.9$, d.f. = 2.

The two-sample case.
When $k = 2$ the test statistic takes the form
$$X^2(t) = \frac{(Z_1(t))^2}{\hat{\sigma}_{11}(t)}$$
and an alternative, standard normally distributed (under $H_0$), statistic is:
$$U(t) = \frac{Z_1(t)}{\sqrt{\hat{\sigma}_{11}(t)}}.$$
Note that $U(t)$ has a direction.
Example: melanoma data, compare males and females.
Log-rank: $U(t) = 2.54$, $P = .011$,
Gehan-Breslow: $U(t) = 2.72$, $P = .0065$,
Peto-Prentice: $U(t) = 2.66$, $P = .0077$.

[Figure: Nelson-Aalen estimates by sex. Males: dashed, females: solid.]

Exercise.
1. Show that the 2-sample logrank test (i.e., $K(s) = I(Y_\bullet(s) > 0)$) can be written in the form
$$Z_1(t) = \int_0^t \frac{Y_1(s)Y_2(s)}{Y_1(s) + Y_2(s)}\left(d\hat{A}_1(s) - d\hat{A}_2(s)\right).$$
2. Show that, under $H_0: \alpha_1(t) = \alpha_2(t)$, $Z_1$ is a martingale.
3. Find the predictable variation process $\langle Z_1\rangle$ and the variance estimate $\hat{\sigma}_{11}(t)$.
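The P-values quoted for the two-sample melanoma comparison can be checked from the standard normal approximation to $U(t)$. The short Python sketch below assumes two-sided tests, which is an assumption about how the slide's P-values were computed; the helper name `two_sided_p` is hypothetical.

```python
import math

def two_sided_p(u):
    """Two-sided P-value for a standard normal test statistic u."""
    return math.erfc(abs(u) / math.sqrt(2.0))

for name, u in [("Log-rank", 2.54), ("Gehan-Breslow", 2.72), ("Peto-Prentice", 2.66)]:
    print(f"{name}: U = {u}, P = {two_sided_p(u):.4f}")
```

The results come out near .011, .0065 and .008. Small differences in the last digit are to be expected because the $U(t)$ values used here are already rounded to two decimals.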
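Parametric models.
Set-up: $N_i(t)$, $i = 1, \ldots, n$ are counting processes with intensity processes $\lambda_i(t)$, $i = 1, \ldots, n$ w.r.t. filtration $(\mathcal{F}_t)$. Recall that
$$\lambda_i(t)\,dt \approx P(dN_i(t) = 1 \mid \mathcal{F}_{t-}).$$
Examples (mainly survival data) of parametric models (where $Y_i(t)$ is at-risk indicator):
$\lambda_i(t) = \alpha Y_i(t)$, constant hazard
$\lambda_i(t) = \alpha\rho(\alpha t)^{\rho-1} Y_i(t)$, Weibull
$\lambda_i(t) = \theta\mu_i(t)Y_i(t)$, relative mortality
$\lambda_i(t) = (\gamma + \mu_i(t))Y_i(t)$, excess mortality
$\lambda_i(t) = \alpha_j Y_i(t)$, $t_j < t \le t_{j+1}$, piecewise constant hazard

Likelihood: Jacod's formula. ABGK; p.58.
Consider a general parametric model $\lambda_i(t) = \alpha_i(t; \theta)Y_i(t)$, where $\theta = (\theta_1, \ldots, \theta_p)$ is the parameter vector. To derive the likelihood for $\theta$, let $N_\bullet = \sum_i N_i$, $Y_\bullet = \sum_i Y_i$, $\lambda_\bullet = \sum_i \lambda_i$ and note that
$$P(dN_\bullet(t) = 0 \mid \mathcal{F}_{t-}) \approx 1 - \lambda_\bullet(t; \theta)\,dt.$$
With $\tau$ being the terminal time of observation we have
$$P(\text{data}) = \prod_{0 < t \le \tau} P(dN_1(t), \ldots, dN_n(t) \mid \mathcal{F}_{t-})\,P(\text{other events at } t \mid dN(t), \mathcal{F}_{t-}) \propto \prod_{0 < t \le \tau} P(dN_1(t), \ldots, dN_n(t) \mid \mathcal{F}_{t-}),$$
where the products are product-integrals.

Likelihood.
$$L(\theta) = \prod_{0 < t \le \tau}\left(\lambda_1(t, \theta)^{dN_1(t)} \cdots \lambda_n(t, \theta)^{dN_n(t)}\,(1 - \lambda_\bullet(t, \theta)\,dt)^{1 - dN_\bullet(t)}\right) = \left(\prod_{i=1}^n \prod_{0 < t \le \tau} \lambda_i(t, \theta)^{dN_i(t)}\right)\exp\left(-\int_0^\tau \lambda_\bullet(u, \theta)\,du\right).$$
Here, $dN_i(t) = 1$ if $N_i$ jumps at $t$ and $0$ otherwise.
One may show that the scores $U_j(\theta) = \frac{\partial}{\partial\theta_j}\log L(\theta)$ are stochastic integrals w.r.t. the counting process martingales when evaluated at the true parameter $\theta_0$.

Score:
$$\log L(\theta) = \sum_{i=1}^n \int_0^\tau \log\lambda_i(t, \theta)\,dN_i(t) - \int_0^\tau \lambda_\bullet(t, \theta)\,dt,$$
$$\frac{\partial}{\partial\theta_j}\log L(\theta) = \sum_{i=1}^n \int_0^\tau \frac{\frac{\partial}{\partial\theta_j}\lambda_i(t, \theta)}{\lambda_i(t, \theta)}\,dN_i(t) - \int_0^\tau \frac{\partial}{\partial\theta_j}\lambda_\bullet(t, \theta)\,dt.$$
Use the Doob-Meyer decomposition $dN_i(t) = \lambda_i(t, \theta_0)\,dt + dM_i(t)$ to show that the score evaluated at $\theta_0$ equals:
$$\frac{\partial}{\partial\theta_j}\log L(\theta_0) = \sum_{i=1}^n \int_0^\tau \frac{\frac{\partial}{\partial\theta_j}\lambda_i(t, \theta_0)}{\lambda_i(t, \theta_0)}\,dM_i(t).$$
Thereby, using the martingale CLT, asymptotic normality of the scores may be established leading to a proof that the MLE for $\theta = (\theta_1, \ldots, \theta_p)$ enjoys the usual large sample properties.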
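For right-censored survival data the log-likelihood from Jacod's formula takes a simple form: each subject contributes the log hazard at its event time (if it has an event) minus its cumulative hazard up to the exit time. The Python sketch below writes this out for the Weibull model listed among the examples, assuming everyone is at risk from time 0 until the observed exit time; the function name and the data are hypothetical.

```python
import math

def weibull_loglik(alpha, rho, times, events):
    """Log-likelihood for the hazard alpha * rho * (alpha * t) ** (rho - 1).

    times  : observed exit times (event or censoring), all > 0
    events : 1 if the exit is an event, 0 if censored
    The cumulative hazard up to t is (alpha * t) ** rho.
    """
    ll = 0.0
    for t, d in zip(times, events):
        if d:
            # log lambda_i at the jump time
            ll += math.log(alpha * rho) + (rho - 1.0) * math.log(alpha * t)
        # minus the integrated intensity over the time at risk
        ll -= (alpha * t) ** rho
    return ll


# Hypothetical data: four subjects, one censored at time 3.
times, events = [2, 3, 5, 7], [1, 0, 1, 1]
print(weibull_loglik(0.1, 1.0, times, events))
print(weibull_loglik(0.1, 1.2, times, events))
```

For `rho = 1` the function reduces to the constant hazard log-likelihood, the number of events times $\log\alpha$ minus $\alpha$ times the total time at risk. In practice the maximization over $(\alpha, \rho)$ would be handed to a numerical optimizer.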
Sufficiency.
In most of the examples above, $\lambda_i(t; \theta) = \alpha(t; \theta)Y_i(t)$ in which case the likelihood is (proportional to):
$$L(\theta) = \left(\prod_{0 < t \le \tau} \alpha(t; \theta)^{dN_\bullet(t)}\right)\exp\left(-\int_0^\tau \alpha(u; \theta)Y_\bullet(u)\,du\right).$$
It follows that $(N_\bullet, Y_\bullet)$ is sufficient for $\theta$.
NOTE: The Nelson-Aalen estimator (and thereby, by properties of the product-integral, also the Kaplan-Meier and Aalen-Johansen estimators, to be defined later) may be given a MLE interpretation.

Constant hazard.
$(N_\bullet, Y_\bullet)$ sufficient when the hazard is assumed constant, i.e. $\lambda_i(t; \alpha) = \alpha Y_i(t)$. If we let $R(t) = \int_0^t Y_\bullet(u)\,du$ be the total time at risk in $[0, t]$ then
$$L(\alpha) = \alpha^{N_\bullet(\tau)}\exp(-\alpha R(\tau))$$
and the MLE is the occurrence/exposure rate
$$\hat{\alpha} = \frac{N_\bullet(\tau)}{R(\tau)}$$
and
$$\widehat{SD}(\hat{\alpha}) = \left(-\frac{\partial^2}{\partial\alpha^2}\log L(\hat{\alpha})\right)^{-1/2} = \sqrt{\frac{\hat{\alpha}}{R(\tau)}}.$$

Example: melanoma data.
Here we have, for females: $\hat{\alpha} = 28/786.9 = .0356$ per year with $\widehat{SD} = .0067$ while, for males, we have $\hat{\alpha} = 29/420.8 = .0689$ per year with $\widehat{SD} = .0128$. We can compare the two using the approximately $\chi^2_1$ LR statistic which takes the value 6.15 ($P = .01$).
The constant hazard hypothesis may be evaluated using confidence bands for the Nelson-Aalen estimates or by embedding the model in the larger Weibull model.
For the Weibull model the estimated shape parameter for females is $\hat{\rho} = 1.197$ (.22) while for males it is $\hat{\rho} = 1.17$ (.166) and for both sexes the Wald statistic $(\hat{\rho} - 1)/\widehat{SD}$ is quite insignificant.

[Figure: Nelson-Aalen estimates for male and female melanoma patients. Males: dashed, females: solid.]
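The occurrence/exposure rates for the melanoma data, and the likelihood ratio comparison of the two sexes, can be recomputed with a few lines of Python. This is a sketch under assumptions: the inputs are taken as 28 events in 786.9 person-years for females and 29 events in 420.8 person-years for males (the exposure that is consistent with the quoted male rate of .0689 per year), and the function names are hypothetical.

```python
import math

def occurrence_exposure(d, r):
    """MLE d / r of a constant hazard and its SD sqrt(rate / r)."""
    rate = d / r
    return rate, math.sqrt(rate / r)

def lr_two_rates(d1, r1, d2, r2):
    """Likelihood ratio statistic for equal constant hazards in two groups."""
    def loglik(d, r, a):
        return d * math.log(a) - a * r
    a1, a2 = d1 / r1, d2 / r2
    a0 = (d1 + d2) / (r1 + r2)          # MLE under a common hazard
    return 2.0 * (loglik(d1, r1, a1) + loglik(d2, r2, a2)
                  - loglik(d1, r1, a0) - loglik(d2, r2, a0))

females = occurrence_exposure(28, 786.9)
males = occurrence_exposure(29, 420.8)
lr = lr_two_rates(28, 786.9, 29, 420.8)
p = math.erfc(math.sqrt(lr / 2.0))      # chi-square(1) tail probability
print(females, males, lr, p)
```

The rates and standard deviations agree with the quoted values after rounding. The likelihood ratio statistic comes out slightly below the quoted 6.15, presumably because the inputs used here are rounded, and the P-value is close to .01.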
[Figure: Nelson-Aalen estimate for male melanoma patients with bands.]

Piecewise constant hazards.
The constant hazard model is often quite restrictive. However, a simple device offers a lot of flexibility: relax the assumption to piecewise constant hazards:
$$\lambda_i(t) = \alpha_j Y_i(t), \quad t_j < t \le t_{j+1},$$
where $0 = t_0 < t_1 < \cdots < t_{k-1} < t_k = +\infty$ are pre-specified interval cut-points.
Define occurrence, $D_j = N_\bullet(t_{j+1}) - N_\bullet(t_j)$ and exposure, $R_j = \int_{t_j}^{t_{j+1}} Y_\bullet(u)\,du$. Then
$$L(\alpha) = \left(\prod_j \alpha_j^{D_j}\right)\exp\left(-\sum_j \alpha_j R_j\right),$$
and the MLE simply becomes
$$\hat{\alpha}_j = \frac{D_j}{R_j}, \qquad \widehat{SD}(\hat{\alpha}_j) = \sqrt{\frac{\hat{\alpha}_j}{R_j}},$$
with the $\hat{\alpha}_j$ asymptotically independent.

Example: suicides among Danish non-manual workers. Ex. I.3.3, p.17.
Four categories of non-manual workers:
1. academics
2. advanced non-academic training
3. extensive practical training
4. other non-manual workers
Suicides and person-years tabulated by age group and work category (excerpt):

Suicides
          Age      Group 1   Group 2   Group 3   Group 4
Males     20-24          2        14        31        51
          25-29         10        48        46        46
          ...
          60-64          6         1        21        17
Females   20-24          0        12        29        61
          25-29          5         1        27        65
          ...
          60-64          2         3         7        17

Person-years
          Age      Group 1   Group 2   Group 3   Group 4
Males     20-24      13439     67623    153241     22563
          25-29      52787       ...       ...       ...
          ...
          60-64      22965      3716      8327     45255
Females   20-24       2366     79773    199778      6134
          25-29      14047       ...       ...       ...
          ...
          60-64       4866     18858      3275     69418
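Tables of occurrences $D_j$ and exposures $R_j$ such as the ones above arise by splitting each subject's follow-up at the cut-points. The Python sketch below shows one way to do this for right-censored data, assuming everyone is at risk from time 0 until the observed exit time; the function name `piecewise_rates` and the data are hypothetical.

```python
def piecewise_rates(exit_times, events, cuts):
    """Occurrences, exposures, rates and SDs per interval.

    cuts holds the interior cut-points t_1 < ... < t_{k-1};
    t_0 = 0 and t_k = +infinity are added here.
    """
    edges = [0.0] + list(cuts) + [float("inf")]
    k = len(edges) - 1
    D = [0] * k
    R = [0.0] * k
    for t, d in zip(exit_times, events):
        for j in range(k):
            lo, hi = edges[j], edges[j + 1]
            R[j] += max(0.0, min(t, hi) - lo)   # time at risk in (lo, hi]
            if d and lo < t <= hi:
                D[j] += 1                       # event falls in (lo, hi]
    nan = float("nan")
    rates = [dj / rj if rj > 0 else nan for dj, rj in zip(D, R)]
    sds = [(a / rj) ** 0.5 if rj > 0 else nan for a, rj in zip(rates, R)]
    return D, R, rates, sds


# Hypothetical data: four subjects, one censored at time 3, one cut-point at 4.
D, R, rates, sds = piecewise_rates([2, 3, 5, 7], [1, 0, 1, 1], cuts=[4])
print(D, R, rates, sds)
```

With age as the time scale and delayed entry, as in the suicide data, the exposure term would start at each subject's entry age rather than at 0.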
Example: suicides among Danish non-manual workers.
Age-specific suicide rates for academics (e.g., male academics, 20-24, 2/13439 = .00015; 25-29, 10/52787 = .00019):

Suicide rates (per 10,000 years) (SD)
Age      Males        Females
20-24    1.5 (1.1)    0 (0)
25-29    1.9 (.6)     3.6 (1.6)
30-34    4.9 (1.0)    4.9 (1.9)
35-39    4.0 (.9)     7.6 (2.9)
40-44    5.2 (1.0)    6.4 (2.6)
45-49    4.1 (.9)     4.9 (2.2)
50-54    5.1 (1.1)    4.6 (2.3)
55-59    5.8 (1.3)    9.0 (3.7)
60-64    2.6 (1.1)    4.1 (2.9)

The standardized mortality ratio.
Recall the model for relative mortality
$$\lambda_i(t) = \theta\mu_i(t)Y_i(t)$$
where $\mu_i(t)$ is a known mortality rate, e.g. an age-, sex-, and calendar time specific population mortality for individual $i$ at time $t$.
If $\theta = 1$ we have
$$N_\bullet(\tau) = \sum_{i=1}^n \int_0^\tau \mu_i(u)Y_i(u)\,du + M_\bullet(\tau),$$
and thereby
$$E(N_\bullet(\tau)) = E\left(\sum_{i=1}^n \int_0^\tau \mu_i(u)Y_i(u)\,du\right).$$
Therefore,
$$E(\tau) = \sum_{i=1}^n \int_0^\tau \mu_i(u)Y_i(u)\,du$$
can be thought of as the (under $H_0: \theta = 1$) expected number of deaths.

The MLE is
$$\hat{\theta} = \frac{N_\bullet(\tau)}{E(\tau)},$$
the standardized mortality ratio, with
$$\widehat{SD}(\hat{\theta}) = \sqrt{\frac{\hat{\theta}}{E(\tau)}}.$$
Example: suicides (continued), let $\mu_i(t) = \mu_{\mathrm{sex}(i)}(t)$ at age $t$ be defined as occ/exp rates from combined data (treated as known). Then we get for the academics
Males: $\hat{\theta} = 1.20$ (.10)
Females: $\hat{\theta} = 1.57$ (.25)
both marginally significantly (judged from a Wald test) larger than 1.
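The standardized mortality ratio is straightforward to compute once person-years and reference rates are tabulated by stratum, since the expected number of deaths is then a sum of rate times exposure. The Python sketch below uses made-up numbers purely for illustration (they are not the suicide data), and the function name `smr` is hypothetical.

```python
import math

def smr(observed, person_years, reference_rates):
    """Expected deaths, SMR, its SD sqrt(SMR / expected), and the Wald statistic.

    person_years and reference_rates are listed per stratum (e.g. age group);
    the reference rates are treated as known.
    """
    expected = sum(mu * r for mu, r in zip(reference_rates, person_years))
    theta = observed / expected
    sd = math.sqrt(theta / expected)
    wald = (theta - 1.0) / sd           # test of H0: theta = 1
    return expected, theta, sd, wald


# Hypothetical input: 40 observed deaths, two age strata.
expected, theta, sd, wald = smr(40, [10000, 20000], [0.0005, 0.001])
print(expected, theta, sd, wald)
```

The Wald statistic is compared with the standard normal distribution, which is how "marginally significantly larger than 1" would be judged.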