Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data


Consider a survival study consisting of $n$ independent subjects from $p$ different populations with survival functions $S_1(t), \ldots, S_p(t)$. Suppose that the goal is to test the hypothesis

\[ H_0 : S_1(t) = \cdots = S_p(t) \]

based on right-censored failure time data

\[ \{\, X_i = \min(T_i, C_i),\ \delta_i = I(X_i = T_i);\ i = 1, \ldots, n \,\}. \]

Let $t_1 < t_2 < \cdots < t_k$ denote the distinct observed failure times and, for $j = 1, \ldots, k$ and $i = 1, \ldots, p$, let

$d_{ij}$ = # of failures at $t_j$ from the $i$th population,
$r_{ij}$ = # of subjects at risk at $t_j$ from the $i$th population,
$d_j$ = # of failures at $t_j$ ($= d_{1j} + \cdots + d_{pj}$),
$r_j$ = # of subjects at risk at $t_j$ ($= r_{1j} + \cdots + r_{pj}$).

To construct a test statistic, consider what happens at time $t_j$. Conditional on the failure and censoring experience up to time $t_j$, under $H_0$ the conditional distribution of $d_{1j}, \ldots, d_{pj}$ given $d_j$ is the (multivariate) hypergeometric distribution

\[ P(d_{1j}, \ldots, d_{pj} \mid d_j, r_{1j}, \ldots, r_{pj}) = \frac{\prod_{i=1}^{p} \binom{r_{ij}}{d_{ij}}}{\binom{r_j}{d_j}}. \]

Thus we have

\[ w_{ij} = E[\, d_{ij} \mid d_j \,] = r_{ij}\, d_j\, r_j^{-1}, \]

\[ V^j_{ii} = \mathrm{Var}[\, d_{ij} \mid d_j \,] = r_{ij} (r_j - r_{ij})\, d_j (r_j - d_j)\, r_j^{-2} (r_j - 1)^{-1}, \]

\[ V^j_{i_1 i_2} = \mathrm{cov}[\, d_{i_1 j}, d_{i_2 j} \mid d_j \,] = - r_{i_1 j}\, r_{i_2 j}\, d_j (r_j - d_j)\, r_j^{-2} (r_j - 1)^{-1}, \quad i_1 \neq i_2. \]

Define the statistic

\[ \nu_j = ( d_{1j} - w_{1j}, \ldots, d_{pj} - w_{pj} )' \]

at $t_j$, which has (conditional) mean zero and covariance matrix $V_j = ( V^j_{i_1 i_2} )$. The log-rank statistic is defined as the simple summation over failure times

\[ \nu = \sum_{j=1}^{k} \nu_j = ( D_1 - E_1, \ldots, D_p - E_p )', \]

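the vector of the observed numbers of failures in each population minus the corresponding vector of the expected numbers of failures, where

\[ D_i = \sum_{j=1}^{k} d_{ij}, \qquad E_i = \sum_{j=1}^{k} w_{ij}. \]

Equivalently, the statistic can be written as $\nu = D - E$, where $D = (D_1, \ldots, D_p)'$ and $E = (E_1, \ldots, E_p)'$.

If the $\nu_j$'s are independent, then

\[ E[\, \nu \,] = 0, \qquad \mathrm{Var}[\, \nu \,] = V = V_1 + \cdots + V_k. \]

The hypothesis $H_0$ can be tested using the statistic

\[ \chi^2 = \nu' V^{-1} \nu \]

based on a $\chi^2_{p-1}$ distribution for large samples. Because the components of $\nu$ sum to zero, $V$ is singular; in practice $V^{-1}$ is taken to be a generalized inverse, or equivalently one component of $\nu$ and the matching row and column of $V$ are dropped.

If $p = 2$, the test of $H_0$ can be based on the statistic

\[ Z = \frac{\sum_{j=1}^{k} ( d_{1j} - r_{1j} d_j / r_j )}{\left[ \sum_{j=1}^{k} r_{1j} (r_j - r_{1j})\, d_j (r_j - d_j)\, r_j^{-2} (r_j - 1)^{-1} \right]^{1/2}}, \]

which has the standard normal distribution for large samples.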
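The construction above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `logrank_components` and the toy data are assumptions made for the example, and the code simply accumulates the hypergeometric means and covariances over the distinct failure times.

```python
from math import sqrt

def logrank_components(times, events, groups, p):
    """Return (nu, V) for the p-sample log-rank test.

    times  : observed times X_i = min(T_i, C_i)
    events : delta_i, 1 for an observed failure and 0 for censoring
    groups : population labels in 1..p
    (Hypothetical helper written for this illustration.)
    """
    nu = [0.0] * p
    V = [[0.0] * p for _ in range(p)]
    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    for tj in failure_times:
        r = [0] * p  # r_ij: at risk at t_j in population i
        d = [0] * p  # d_ij: failures at t_j in population i
        for t, e, g in zip(times, events, groups):
            if t >= tj:
                r[g - 1] += 1
            if t == tj and e == 1:
                d[g - 1] += 1
        rj, dj = sum(r), sum(d)
        for i in range(p):
            nu[i] += d[i] - r[i] * dj / rj  # d_ij - w_ij
        if rj > 1:  # when r_j = 1 the variance term is zero; skip to avoid 0/0
            c = dj * (rj - dj) / (rj ** 2 * (rj - 1))
            for a in range(p):
                for b in range(p):
                    V[a][b] += c * (r[a] * (rj - r[a]) if a == b else -r[a] * r[b])
    return nu, V

# Toy data (made up): four uncensored subjects alternating between two groups.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
groups = [1, 2, 1, 2]
nu, V = logrank_components(times, events, groups, p=2)
z = nu[0] / sqrt(V[0][0])  # two-sample Z statistic
```

For $p = 2$ the first component of `nu` and the entry `V[0][0]` give the numerator and squared denominator of $Z$; for general $p$ the quadratic form needs a generalized inverse of `V`, which is left out of this sketch.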

Comments:

1. The log-rank test can be seen as a censored-data generalization of linear rank statistics such as the Wilcoxon test and the Savage exponential scores test. It is also referred to as the generalized Savage test.

2. The log-rank test can also be derived as a score test from the marginal or partial likelihood under the proportional hazards model, in which the hazard functions are proportional to each other. In this case, it can be shown that the log-rank test is the optimal (most efficient) test.

3. The log-rank test is derived from large-sample theory under the assumption that the censoring distribution is independent of the failure distributions.

4. The log-rank test statistic can be rewritten as

\[ \nu = \sum_{j=1}^{k} \nu_j = ( D_1 - E_1, \ldots, D_p - E_p )' \]

with

\[ D_i - E_i = \sum_{j=1}^{k} r_{ij} \left( \frac{d_{ij}}{r_{ij}} - \frac{d_j}{r_j} \right) = \sum_{j=1}^{k} r_{ij} \left( \hat\lambda_{ij} - \hat\lambda_j \right) = \int_0^\infty w_i(t) \left[ d\hat\Lambda_i(t) - d\hat\Lambda(t) \right], \]

where $\hat\lambda_{ij} = d_{ij}/r_{ij}$ and $\hat\lambda_j = d_j/r_j$. That is, it is a sum of weighted differences between the estimated hazard functions of the individual populations and that of the common population under $H_0$.

III.2. Other Tests for Right-censored Failure Time Data

As in the previous section, consider the problem of comparing $p = 2$ survival functions based on right-censored data from $n$ independent subjects, that is, testing $H_0 : S_1(t) = S_2(t)$.

III.2.1. Weighted log-rank tests

Note that we can rewrite the log-rank statistic $\nu_1$ as

\[ \nu_1 = D_1 - E_1 = \sum_{j=1}^{k} r_{1j} \left( \frac{d_{1j}}{r_{1j}} - \frac{d_j}{r_j} \right) = \sum_{j=1}^{k} \frac{r_{1j}\, r_{2j}}{r_{1j} + r_{2j}} \left( \frac{d_{1j}}{r_{1j}} - \frac{d_{2j}}{r_{2j}} \right) = \int_0^\infty \frac{\bar Y_1(t)\, \bar Y_2(t)}{\bar Y_1(t) + \bar Y_2(t)} \left\{ d\hat\Lambda_1(t) - d\hat\Lambda_2(t) \right\}, \]

where

$\bar Y_1(t)$ = # of subjects from population 1 at risk at $t$,
$\bar Y_2(t)$ = # of subjects from population 2 at risk at $t$.
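The second equality in the rewriting of $\nu_1$ uses only $d_j = d_{1j} + d_{2j}$ and $r_j = r_{1j} + r_{2j}$. Spelled out term by term, for a failure time at which both $r_{1j}$ and $r_{2j}$ are positive:

```latex
\begin{aligned}
r_{1j}\left(\frac{d_{1j}}{r_{1j}} - \frac{d_j}{r_j}\right)
  &= d_{1j} - \frac{r_{1j}\,(d_{1j} + d_{2j})}{r_{1j} + r_{2j}} \\
  &= \frac{d_{1j}\,(r_{1j} + r_{2j}) - r_{1j}\,(d_{1j} + d_{2j})}{r_{1j} + r_{2j}} \\
  &= \frac{r_{2j}\, d_{1j} - r_{1j}\, d_{2j}}{r_{1j} + r_{2j}} \\
  &= \frac{r_{1j}\, r_{2j}}{r_{1j} + r_{2j}}
     \left(\frac{d_{1j}}{r_{1j}} - \frac{d_{2j}}{r_{2j}}\right).
\end{aligned}
```

The last form is the one that exhibits the statistic as a weighted difference of the two hazard increments $d_{1j}/r_{1j}$ and $d_{2j}/r_{2j}$.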

This motivates the weighted log-rank test statistics

\[ WLR = \int_0^\infty K(t) \left\{ d\hat\Lambda_1(t) - d\hat\Lambda_2(t) \right\} = \int_0^\infty W(t)\, \frac{\bar Y_1(t)\, \bar Y_2(t)}{\bar Y_1(t) + \bar Y_2(t)} \left\{ d\hat\Lambda_1(t) - d\hat\Lambda_2(t) \right\} = \sum_{j=1}^{k} W(t_j)\, \frac{r_{1j}\, r_{2j}}{r_{1j} + r_{2j}} \left( \frac{d_{1j}}{r_{1j}} - \frac{d_{2j}}{r_{2j}} \right), \]

where $K(s)$ or $W(s)$ is a weight process. It can be shown that under $H_0$ and some regularity conditions, $WLR$ has an asymptotic normal distribution, as $n \to \infty$, with mean zero and variance that can be estimated by

\[ \hat\sigma^2 = \sum_{j=1}^{k} K^2(t_j)\, \frac{1}{r_{1j}\, r_{2j}}\, \frac{r_j - d_j}{r_j - 1}\, d_j = \sum_{j=1}^{k} W^2(t_j)\, \frac{r_{1j}\, r_{2j}}{r_j^2}\, \frac{r_j - d_j}{r_j - 1}\, d_j. \]

Let $\hat S$ denote the Kaplan-Meier estimator of the survival function under $H_0$ based on the pooled samples. A common class of weight processes is given by

\[ W(t) = \{ \hat S(t-) \}^{\rho}\, \{ 1 - \hat S(t-) \}^{\gamma} \]

(Harrington and Fleming, 1982), where $\rho$ and $\gamma$ are non-negative constants. In this case, the test statistics $WLR$ are referred to as $G^{\rho,\gamma}$ statistics.
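As a hedged illustration of the $G^{\rho,\gamma}$ family, the sketch below accumulates the weighted statistic and its variance estimate over the failure times, using the pooled Kaplan-Meier estimate just before each $t_j$ as the weight. The function name `weighted_logrank` and the toy data are assumptions for this example. Each summand is computed in the algebraically equivalent form $W(t_j)(d_{1j} - r_{1j} d_j / r_j)$ so that no division by an empty risk set in one group occurs.

```python
from math import sqrt

def weighted_logrank(times, events, groups, rho=0.0, gamma=0.0):
    """Return (WLR, sigma2) for the two-sample G^{rho,gamma} statistic.

    groups holds labels 1 or 2. (Hypothetical helper for illustration.)
    """
    failure_times = sorted({t for t, e in zip(times, events) if e == 1})
    s_left = 1.0  # pooled Kaplan-Meier estimate S(t_j-), updated after each t_j
    stat, var = 0.0, 0.0
    for tj in failure_times:
        r1 = sum(1 for t, g in zip(times, groups) if t >= tj and g == 1)
        r2 = sum(1 for t, g in zip(times, groups) if t >= tj and g == 2)
        d1 = sum(1 for t, e, g in zip(times, events, groups)
                 if t == tj and e == 1 and g == 1)
        d2 = sum(1 for t, e, g in zip(times, events, groups)
                 if t == tj and e == 1 and g == 2)
        rj, dj = r1 + r2, d1 + d2
        w = s_left ** rho * (1.0 - s_left) ** gamma  # W(t_j)
        stat += w * (d1 - r1 * dj / rj)
        if rj > 1:  # the variance term vanishes when r_j = 1
            var += w ** 2 * (r1 * r2 / rj ** 2) * ((rj - dj) / (rj - 1)) * dj
        s_left *= 1.0 - dj / rj
    return stat, var

# Toy data (made up): four uncensored subjects alternating between two groups.
times = [1, 2, 3, 4]
events = [1, 1, 1, 1]
groups = [1, 2, 1, 2]
lr_stat, lr_var = weighted_logrank(times, events, groups)           # rho = gamma = 0
g1_stat, g1_var = weighted_logrank(times, events, groups, rho=1.0)  # early-weighted
```

With $\rho = \gamma = 0$ the weight is identically one and the ordinary log-rank statistic is recovered; $\rho > 0$ puts more weight on early differences and $\gamma > 0$ on late ones. The standardized statistic is `stat / sqrt(var)` whenever `var` is positive.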

III.2.2. Weighted Kaplan-Meier statistics

To test $H_0$, we could also employ the weighted Kaplan-Meier statistics

\[ WKM = \sqrt{\frac{n_1 n_2}{n}} \int_0^{\tau} W(t) \left[ \hat S_1(t) - \hat S_2(t) \right] dt, \]

where $\tau$ is the largest observation time, $W(t)$ is a weight process, and $\hat S_1$ and $\hat S_2$ are the Kaplan-Meier estimators of the survival functions $S_1$ and $S_2$ based on the separate samples, respectively. Suppose that the weight process $W(t)$ is small when $t$ is close to $\tau$. Then it can be shown that as $n \to \infty$, the distribution of the statistics $WKM$ can be approximated by a normal distribution with mean zero and variance

\[ \hat\sigma^2 = - \int_0^{\tau} \frac{\left[ \int_t^{\tau} W(u)\, \hat S(u)\, du \right]^2}{\hat S^2(t)\, \hat C(t)}\, d\hat S(t), \]

where $\hat S$ and $\hat C$ are the Kaplan-Meier estimators of the common survival function under $H_0$ and of the survival function of the censoring variable, respectively, both based on the pooled samples.

Pepe and Fleming (1989), Biometrics, 497-507.
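A minimal sketch of the $WKM$ statistic follows, under two stated assumptions: the weight is treated as constant between consecutive observation times (so the integral of the step functions is exact), and the default weight is $W \equiv 1$, which is only a placeholder and does not satisfy the "small near $\tau$" condition. The names `km_curve` and `wkm_statistic` are hypothetical, and the variance estimate is not computed here.

```python
from math import sqrt

def km_curve(times, events):
    """Return a function t -> Kaplan-Meier estimate S(t), right-continuous."""
    steps, s = [], 1.0
    for tj in sorted({t for t, e in zip(times, events) if e == 1}):
        r = sum(1 for t in times if t >= tj)
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        s *= 1.0 - d / r
        steps.append((tj, s))

    def surv(t):
        value = 1.0
        for tj, sj in steps:
            if tj <= t:
                value = sj
            else:
                break
        return value

    return surv

def wkm_statistic(times1, events1, times2, events2, weight=lambda t: 1.0):
    """sqrt(n1*n2/n) * integral over [0, tau] of W(t) [S1(t) - S2(t)] dt."""
    n1, n2 = len(times1), len(times2)
    tau = max(list(times1) + list(times2))
    s1, s2 = km_curve(times1, events1), km_curve(times2, events2)
    grid = sorted({0.0, tau, *times1, *times2})
    integral = 0.0
    for a, b in zip(grid[:-1], grid[1:]):
        # Both estimators are constant on [a, b); the weight is evaluated at a.
        integral += weight(a) * (s1(a) - s2(a)) * (b - a)
    return sqrt(n1 * n2 / (n1 + n2)) * integral

# Toy data (made up): two uncensored subjects per sample.
wkm = wkm_statistic([2, 4], [1, 1], [1, 3], [1, 1])
```

Because the statistic integrates over time, rescaling the time axis changes its value, unlike the rank-based weighted log-rank statistics.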

Comments:

1. The test statistics $WLR$, the integrated weighted differences of the estimated hazard functions, are most sensitive to the alternative of ordered hazard functions

\[ H_a^1 : \lambda_2(t) \ge \lambda_1(t) \ \text{for all } t. \]

In contrast, the test statistics $WKM$, the integrated weighted differences between the Kaplan-Meier estimates of the survival functions, are most sensitive to the alternative of ordered survival functions

\[ H_a^2 : S_2(t) \le S_1(t) \ \text{for all } t. \]

$H_a^1$ implies $H_a^2$, but $H_a^2$ does not imply $H_a^1$.

2. The test statistics $WLR$ are constructed from ranks and are thus invariant under all monotone increasing transformations of time. That is, they do not depend on the scale in which time is measured. This is not true for $WKM$.

III.3. Log-rank Test for Interval-censored Data

As in the previous sections, consider a survival study which involves $n$ independent subjects from $p$ populations and in which the goal is to test the hypothesis $H_0$. Instead of right-censored data, suppose that only interval-censored data are available. Also suppose that the survival time takes discrete values $0 = t_0 < t_1 < \cdots < t_k < t_{k+1} = \infty$.

For subject $i$, let

\[ A_i = \{ L_i, L_i + 1, \ldots, U_i \} \subseteq \{ t_1, \ldots, t_{k+1} \} \]

denote the interval within which the $i$th individual fails. Then the observed data have the form $\{ A_i ;\ i = 1, \ldots, n \}$. Also let $0 = s_0 < \cdots < s_{m+1} = k + 1$ denote the smallest subset of $\{ t_0, t_1, \ldots, t_{k+1} \}$ such that each $L_i$ and $U_i$ is contained in the subset, and let

\[ B_j = \{ s_{j-1} + 1, \ldots, s_j \}, \quad j = 1, \ldots, m + 1. \]

Define $\alpha_{ij}$ as the indicator of the event $B_j \subseteq A_i$.

Note that if (i) the intervals not including $k + 1$ do not overlap and (ii) for each interval with $U_i = k + 1$, its left endpoint coincides with a left endpoint of an interval that does not include $k + 1$, then the observed data can be treated as right-censored data by treating each interval as a single point.

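To test $H_0$, we follow the idea behind the log-rank test for right-censored data and determine the death and risk numbers. Let $S = (S_0, \ldots, S_m)$ denote the common survival function of the $p$ populations under $H_0$ ($S_j = \Pr\{ T > s_j \}$) and $\hat S = (\hat S_0, \ldots, \hat S_m)$ the maximum likelihood estimator of $S$, with the convention $\hat S_{m+1} = 0$. Define

\[ d_j = \sum_{i=1}^{n} \frac{\alpha_{ij}\, [ \hat S_{j-1} - \hat S_j ]}{\sum_{u=1}^{m+1} \alpha_{iu}\, [ \hat S_{u-1} - \hat S_u ]} \]

and

\[ r_j = \sum_{r=j}^{m+1} \sum_{i=1}^{n} \frac{\alpha_{ir}\, [ \hat S_{r-1} - \hat S_r ]}{\sum_{u=1}^{m+1} \alpha_{iu}\, [ \hat S_{u-1} - \hat S_u ]}. \]

Also define

\[ d_{jl} = {\sum_i}' \frac{\alpha_{ij}\, [ \hat S_{j-1} - \hat S_j ]}{\sum_{u=1}^{m+1} \alpha_{iu}\, [ \hat S_{u-1} - \hat S_u ]} \]

and

\[ r_{jl} = \sum_{r=j}^{m+1} {\sum_i}' \frac{\alpha_{ir}\, [ \hat S_{r-1} - \hat S_r ]}{\sum_{u=1}^{m+1} \alpha_{iu}\, [ \hat S_{u-1} - \hat S_u ]}, \]

where ${\sum_i}'$ denotes summation over the subjects $i$ in population $l$. The $d_j$'s, $r_j$'s, $d_{jl}$'s and $r_{jl}$'s play the same roles as the numbers of failures and the numbers at risk $d_j$, $r_j$, $d_{ij}$ and $r_{ij}$ of Section III.1.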

Motivated by the log-rank test statistic for right-censored data, we can construct a test statistic $T = (T_1, \ldots, T_p)^t$ for testing $H_0$, where

\[ T_l = \sum_{j=1}^{m} \left( d_{jl} - \frac{r_{jl}\, d_j}{r_j} \right). \]

If an estimate $V$ of the covariance matrix of $T$ is available, then the test of $H_0$ can be based on the approximation

\[ T^t V^{-1} T \sim \chi^2_{p-1}. \]

For estimates of the covariance matrix of $T$, see

Sun (1996), Statistics in Medicine, Vol. 15, 1387-1395.
Zhao and Sun (2004), Statistics in Medicine, Vol. 23, 1621-1629.
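The pseudo death and risk numbers and the statistic $T$ can be sketched as follows. This is an illustration under stated assumptions: the estimate $\hat S$ is supplied by the caller rather than computed here, the indicator matrix `alpha` is given directly, and the function name `interval_logrank` is hypothetical. No covariance estimate is produced.

```python
def interval_logrank(alpha, pops, s_hat, p):
    """Pseudo failures d_j, risk numbers r_j and statistic T.

    alpha : alpha[i][u-1] is the indicator alpha_{iu}, u = 1..m+1
    pops  : population label (1..p) of each subject
    s_hat : [S_0, ..., S_m], an estimate supplied by the caller (assumption)
    """
    s = list(s_hat) + [0.0]  # append S_{m+1} = 0
    cells = len(s) - 1       # m + 1
    mass = [s[u - 1] - s[u] for u in range(1, cells + 1)]

    d = [0.0] * cells
    d_pop = [[0.0] * cells for _ in range(p)]
    for a_i, l in zip(alpha, pops):
        denom = sum(a * q for a, q in zip(a_i, mass))
        for j in range(cells):
            share = a_i[j] * mass[j] / denom  # subject i's share of cell j
            d[j] += share
            d_pop[l - 1][j] += share

    def tail_sums(x):
        """r_j = sum of x_r over r >= j."""
        out, acc = [0.0] * len(x), 0.0
        for j in range(len(x) - 1, -1, -1):
            acc += x[j]
            out[j] = acc
        return out

    r = tail_sums(d)
    r_pop = [tail_sums(x) for x in d_pop]
    # T_l sums over j = 1..m, i.e. all cells except the last.
    T = [sum(d_pop[l][j] - r_pop[l][j] * d[j] / r[j]
             for j in range(cells - 1) if r[j] > 0)
         for l in range(p)]
    return d, r, T

# Toy data (made up): every subject falls in exactly one cell, m = 2.
alpha = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
pops = [1, 2, 1, 2]
d, r, T = interval_logrank(alpha, pops, s_hat=[1.0, 0.5, 0.25], p=2)
```

When every subject occupies a single cell, as in the toy data, each share equals one and the quantities reduce to ordinary failure and at-risk counts, which gives a quick sanity check. The components of `T` sum to zero, consistent with the $p - 1$ degrees of freedom.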

III.4. Weighted Survival Test for Interval-censored Data

In this section, we consider the two-sample comparison problem ($p = 2$) and use the notation of the previous section. To test $H_0$, similarly to the weighted Kaplan-Meier test statistics for right-censored data, we can construct a class of test statistics as

\[ W = \sum_{j=1}^{k} w(t_j) \left[ \hat S_1(t_j) - \hat S_2(t_j) \right] \Delta_j, \]

where $w$ is a weight function, $\hat S_1$ and $\hat S_2$ are the maximum likelihood estimates of the two survival functions $S_1$ and $S_2$ based on the separate samples, and $\Delta_j = t_j - t_{j-1}$. The statistic $W$ can be rewritten as

\[ W = \sum_{j=1}^{k} w(t_j) \left[ \sum_{l=1}^{j} \hat p^{(2)}_l - \sum_{l=1}^{j} \hat p^{(1)}_l \right] \Delta_j, \]

where $p^{(i)}_l = S_i(t_{l-1}) - S_i(t_l)$, $i = 1, 2$. That is, $W$ is a function of estimates of the parameters $\{ p^{(1)}_l, p^{(2)}_l ;\ l = 1, \ldots, k \}$, whose covariance can be estimated using the Fisher information matrix. Also, under $H_0$, the distribution of $W$ can be approximated by the normal distribution with mean zero.

Petroni and Wolfe (1994), Biometrics, 77-87.
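The two expressions for $W$ can be checked against each other numerically. The sketch below assumes the survival estimates on the grid $t_0, \ldots, t_k$ are already available (the numbers used are made up, not estimates from data), and the function names are hypothetical.

```python
def weighted_survival_stat(t, s1, s2, w):
    """W = sum_j w(t_j) [S1(t_j) - S2(t_j)] (t_j - t_{j-1}), j = 1..k.

    t, s1, s2, w are lists indexed 0..k; index 0 is the origin t_0.
    """
    return sum(w[j] * (s1[j] - s2[j]) * (t[j] - t[j - 1])
               for j in range(1, len(t)))

def weighted_survival_stat_from_masses(t, s1, s2, w):
    """Same statistic via the masses p_l = S(t_{l-1}) - S(t_l)."""
    p1 = [s1[l - 1] - s1[l] for l in range(1, len(t))]
    p2 = [s2[l - 1] - s2[l] for l in range(1, len(t))]
    total, c1, c2 = 0.0, 0.0, 0.0
    for j in range(1, len(t)):
        c1 += p1[j - 1]  # cumulative mass of sample 1 up to t_j
        c2 += p2[j - 1]  # cumulative mass of sample 2 up to t_j
        total += w[j] * (c2 - c1) * (t[j] - t[j - 1])
    return total

# Made-up survival values on a grid with k = 3.
t = [0, 1, 2, 3]
s1 = [1.0, 0.8, 0.5, 0.2]
s2 = [1.0, 0.6, 0.4, 0.1]
w = [1.0, 1.0, 1.0, 1.0]
W_direct = weighted_survival_stat(t, s1, s2, w)
W_masses = weighted_survival_stat_from_masses(t, s1, s2, w)
```

The agreement of the two values reflects $S_i(t_j) = 1 - \sum_{l \le j} p^{(i)}_l$ when $S_i(t_0) = 1$, which is why the order of the two samples flips inside the bracket of the second form.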