Bios 323: Applied Survival Analysis
Qingxia (Cindy) Chen
Chapter 7, Fall 2012

Chapter 7 Hypothesis testing

Hypotheses of interest:

(A) 1-sample
$H_0: S(t) = S_0(t)$ for all $t \le \tau$, where $S_0(\cdot)$ is a known survival function;
$H_0: H(t) = H_0(t)$, where $H_0(\cdot)$ is known;
$H_0: h(t) = h_0(t)$, where $h_0(\cdot)$ is known.

(B) 2-sample, no known forms
$H_0: S_1(t) = S_2(t)$, $H_1(t) = H_2(t)$, $h_1(t) = h_2(t)$. These are equivalent hypotheses. Global null: $H_0$ holds for all $t \in [0, \tau]$.

(C) K-sample, obvious extension of (B)
$H_0: S_1(t) = S_2(t) = \cdots = S_K(t)$, $t \in [0, \tau]$.

How to test? Difficulty: censoring (and truncation). Our focus: censoring.

7.2 One sample tests with right censoring

Data: $\{(T_i, \delta_i),\ i = 1, \ldots, n\}$, with $T_i = \min(X_i, C_i)$ and $\delta_i = I(T_i = X_i)$, where $X_i$ has hazard $h(t)$.

$H_0: h(t) = h_0(t)$, $t \in [0, \tau]$, which is equivalent to $H_0: H(t) = H_0(t)$, $t \in [0, \tau]$.

Notation: $t_1 < t_2 < \cdots < t_D$ are the $D$ unique death times.

$d_j$ = # deaths at $t_j$ $= \sum_{i=1}^{n} I(T_i = t_j, \delta_i = 1) = \sum_{i=1}^{n} dN_i(t_j)$

$Y_j$ = # at risk / alive at $t_j$ $= \sum_{i=1}^{n} I(T_i \ge t_j) = Y(t_j)$, $j = 1, \ldots, D$

Recall the N-A estimator

\[
\tilde{H}(t) =
\begin{cases}
0, & t < t_1, \\
\sum_{t_j \le t} d_j / Y_j, & t_1 \le t.
\end{cases}
\]

Let $W(t)$ be a weight function such that $W(t) = 0$ whenever $Y(t) = 0$. Define the test statistic

\[
Z(\tau) = O(\tau) - E(\tau) = \sum_{j=1}^{D} W(t_j) \frac{d_j}{Y(t_j)} - \int_0^{\tau} W(s) h_0(s)\, ds. \tag{7.2.1}
\]

When the null hypothesis is true, the variance of this statistic is given by

\[
V[Z(\tau)] = \int_0^{\tau} W^2(s) \frac{h_0(s)}{Y(s)}\, ds. \tag{7.2.2}
\]

For large samples, the statistic $Z(\tau)^2 / V[Z(\tau)]$ has a central chi-squared distribution (with one degree of freedom) when the null hypothesis is true.

Choices of weight functions:

The most popular choice is the weight $W(t) = Y(t)$, which yields the one-sample log-rank test. For this weight, $O(\tau)$ is the observed number of deaths and

\[
E(\tau) = V[Z(\tau)] = \sum_{i=1}^{n} \left[ H_0(T_i) - H_0(L_i) \right], \tag{7.2.4}
\]

where $L_i$ is the delayed entry time for the $i$th patient and $H_0$ is the cumulative hazard under $H_0$.

Harrington and Fleming (1982) proposed

\[
W_{HF}(t) = Y(t)\, S_0(t)^p\, [1 - S_0(t)]^q, \quad p \ge 0,\ q \ge 0,
\]

where $S_0(t) = \exp[-H_0(t)]$ is the hypothesized survival function.

Example: IBCSG Trial 6, a multicenter trial of adjuvant chemotherapy after surgical removal of the tumor.
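The one-sample log-rank test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `one_sample_logrank`, the toy data, and the null hazard $h_0(t) = 0.1$ are hypothetical and do not come from the notes. It uses (7.2.4) for $E(\tau) = V[Z(\tau)]$ and the identity $P(\chi^2_1 > x) = \operatorname{erfc}(\sqrt{x/2})$ for the p-value.

```python
import math

def one_sample_logrank(times, events, cum_hazard0, entries=None):
    """One-sample log-rank test, i.e. weight W(t) = Y(t).

    times       : observed times T_i = min(X_i, C_i)
    events      : indicators delta_i (1 = death, 0 = censored)
    cum_hazard0 : function t -> H_0(t), the hypothesized cumulative hazard
    entries     : delayed entry times L_i (assumed 0 for everyone if omitted)
    """
    if entries is None:
        entries = [0.0] * len(times)
    observed = sum(events)  # O(tau): total number of observed deaths
    # E(tau) = V[Z(tau)] = sum_i [H_0(T_i) - H_0(L_i)], equation (7.2.4)
    expected = sum(cum_hazard0(t) - cum_hazard0(l)
                   for t, l in zip(times, entries))
    z = observed - expected
    chi2 = z * z / expected
    # Upper tail of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return observed, expected, chi2, p_value

# Hypothetical data (not from the notes); null hazard 0.1, so H_0(t) = 0.1 * t
times = [2.0, 3.0, 5.0, 7.0, 8.0, 10.0]
events = [1, 1, 0, 1, 1, 0]
obs, exp_, chi2, p = one_sample_logrank(times, events, lambda t: 0.1 * t)
print(obs, round(exp_, 3), round(chi2, 4), round(p, 3))
```

With other weights, such as the Harrington and Fleming family, the integral in (7.2.1) generally has to be evaluated numerically; the closed form used here is specific to $W(t) = Y(t)$.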

7.3 Tests for two and more samples

We shall test the following set of hypotheses:

$H_0: h_1(t) = h_2(t) = \cdots = h_K(t)$ for all $t \le \tau$, versus
$H_A$: at least one of the $h_k(t)$ is different for some $t \le \tau$,

where $\tau$ is the largest time at which all of the groups have at least one subject at risk.

Right-censored data: $\{(T_{ik}, \delta_{ik}),\ k = 1, \ldots, K,\ i = 1, \ldots, n_k\}$

Let $0 < t_1 < \cdots < t_D$ be the ordered times at which deaths occur in the pooled sample. Define:

$d_j$ = # deaths in pooled sample at $t_j$, $j = 1, \ldots, D$
$d_{jk}$ = # deaths in group $k$ at $t_j$, $j = 1, \ldots, D$
$Y_j$ = # at risk in pooled sample at $t_j$ (alive and uncensored)
$Y_{jk}$ = # at risk in group $k$ at $t_j$

Basic idea: Under $H_0: h_1(t) = \cdots = h_K(t)$, the pooled N-A estimator of $H(t)$ and the within-group N-A estimators should be estimating the same quantity. So we can construct a test based on a weighted average of the differences between the pooled estimator $\tilde{H}$ and the within-group estimators $\tilde{H}_k$.

Let $W_k(t)$ be a nonnegative random function, equal to 0 when $Y_j = 0$ or $Y_{jk} = 0$, $k = 1, \ldots, K$. For example, $W_k(t_j) = Y_{jk} W(t_j)$, where $W$ is common to all the groups $k = 1, \ldots, K$. With this choice, define

\[
Z_k(\tau) = \sum_{j=1}^{D} W(t_j) \left\{ d_{jk} - Y_{jk} \frac{d_j}{Y_j} \right\}, \quad k = 1, \ldots, K.
\]

Test statistic for $H_0$:

\[
\chi^2 = (Z_1(\tau), \ldots, Z_{K-1}(\tau))\, \hat{\Sigma}^{-1}_{(K-1) \times (K-1)}\, (Z_1(\tau), \ldots, Z_{K-1}(\tau))^t,
\]

which has a chi-squared distribution with df = $K - 1$ under $H_0$,

where $\hat{\Sigma} = (\hat{\sigma}_{kk'})_{(K-1) \times (K-1)}$,

\[
\hat{\sigma}_{kk} = \sum_{j=1}^{D} W(t_j)^2\, \frac{Y_{jk}}{Y_j} \left( 1 - \frac{Y_{jk}}{Y_j} \right) \frac{Y_j - d_j}{Y_j - 1}\, d_j, \quad k = 1, \ldots, K, \tag{7.3.4}
\]

and

\[
\hat{\sigma}_{kk'} = - \sum_{j=1}^{D} W(t_j)^2\, \frac{Y_{jk} Y_{jk'}}{Y_j^2}\, \frac{Y_j - d_j}{Y_j - 1}\, d_j, \quad 1 \le k \ne k' \le K. \tag{7.3.5}
\]

Choice of weight functions with general form $W_k(t_j) = Y_{jk} W(t_j)$, $k = 1, \ldots, K$:

(1) $W(t) = 1$: log-rank test, optimal against proportional hazards alternatives, where $h_1(t) = c_2 h_2(t) = \cdots = c_K h_K(t)$, $t \le \tau$.

(2) $W(t) = Y(t) = \sum_{k=1}^{K} \sum_{i=1}^{n_k} I(T_{ik} \ge t)$: Gehan's test, a simple generalization of the Wilcoxon rank sum / Mann-Whitney test when $K = 2$.

(3) Tarone-Ware: $W(t) = \sqrt{Y(t)}$.

(4) Peto-Peto: $W(t) = \tilde{S}(t)$, where the survival estimator $\tilde{S}(t) = \prod_{t_j \le t} \left( 1 - \frac{d_j}{Y_j + 1} \right)$ is based on the pooled sample.

(5) Modified Peto-Peto: $W(t) = \tilde{S}(t)\, \frac{Y(t)}{Y(t) + 1}$.

(6) Fleming-Harrington test: $W(t_j) = \hat{S}(t_{j-1})^p\, (1 - \hat{S}(t_{j-1}))^q$, $p, q \ge 0$, where $\hat{S}$ is the KM estimator based on the combined sample.

Example 7.2: We are interested in testing whether there is a difference in the time to cutaneous exit-site infection between patients whose catheter was placed surgically (group 1) and patients whose catheter was placed percutaneously (group 2).

Example 7.4: Comparison of disease-free survival in three groups of leukemia patients given a bone marrow transplantation. The three groups are: Group 1: 38 ALL patients, Group 2: 54 AML low-risk patients, and Group 3: 45 AML high-risk patients. We want to test $H_0: S_1(t) = S_2(t) = S_3(t)$, $t \le \tau = 2204$.

7.4 Tests for trend

We shall test the following set of hypotheses:

$H_0: h_1(t) = h_2(t) = \cdots = h_K(t)$ for all $t \le \tau$, versus
$H_A: h_1(t) \le h_2(t) \le \cdots \le h_K(t)$ for all $t \le \tau$, with at least one strict inequality.

It is equivalent to testing $H_A: S_1(t) \ge S_2(t) \ge \cdots \ge S_K(t)$ for all $t \le \tau$.

Right-censored data: $\{(T_{ik}, \delta_{ik}),\ k = 1, \ldots, K,\ i = 1, \ldots, n_k\}$

Let $0 < t_1 < \cdots < t_D$ be the ordered times at which deaths occur in the pooled sample. Define:

$d_j$ = # deaths in pooled sample at $t_j$, $j = 1, \ldots, D$
$d_{jk}$ = # deaths in group $k$ at $t_j$, $j = 1, \ldots, D$
$Y_j$ = # at risk in pooled sample at $t_j$ (alive and uncensored)
$Y_{jk}$ = # at risk in group $k$ at $t_j$

Let $W$ be a weight function common to all the groups $k = 1, \ldots, K$ and define

\[
Z_k(\tau) = \sum_{j=1}^{D} W(t_j) \left\{ d_{jk} - Y_{jk} \frac{d_j}{Y_j} \right\}.
\]

Objective: increase power by constructing a statistic that is sensitive to the stochastic ordering. General form:

\[
Z = \frac{\sum_{k=1}^{K} a_k Z_k(\tau)}{\sqrt{\sum_{k=1}^{K} \sum_{k'=1}^{K} a_k a_{k'} \hat{\sigma}_{kk'}}}, \tag{7.4.2}
\]

where $\hat{\sigma}_{kk'}$ is from $\hat{\Sigma}$, the estimated covariance matrix of $(Z_1(\tau), \ldots, Z_K(\tau))$:

\[
\hat{\sigma}_{kk} = \sum_{j=1}^{D} W(t_j)^2\, \frac{Y_{jk}}{Y_j} \left( 1 - \frac{Y_{jk}}{Y_j} \right) \frac{Y_j - d_j}{Y_j - 1}\, d_j, \quad k = 1, \ldots, K, \tag{7.3.4}
\]

and

\[
\hat{\sigma}_{kk'} = - \sum_{j=1}^{D} W(t_j)^2\, \frac{Y_{jk} Y_{jk'}}{Y_j^2}\, \frac{Y_j - d_j}{Y_j - 1}\, d_j, \quad 1 \le k \ne k' \le K. \tag{7.3.5}
\]
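The quantities $Z_k(\tau)$, (7.3.4), (7.3.5) and the trend statistic (7.4.2) can be computed directly from the risk-set counts. The Python sketch below is illustrative only: the function names, the toy three-group data, and the scores $a_k = k$ are assumptions, and for simplicity it sums over every pooled death time (that is, it assumes $\tau$ lies beyond the last death time).

```python
import math

def k_sample_stats(times, events, groups, K, weight=lambda y: 1.0):
    """Return (Z, Sigma): Z_k(tau) and the covariance (7.3.4)-(7.3.5).

    groups holds labels 0, ..., K-1; weight maps the pooled risk-set
    size Y_j to W(t_j) (the constant 1 gives the log-rank test).
    """
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    Z = [0.0] * K
    Sigma = [[0.0] * K for _ in range(K)]
    for tj in death_times:
        Yk = [0] * K  # Y_jk: number at risk in group k at t_j
        dk = [0] * K  # d_jk: number of deaths in group k at t_j
        for t, e, g in zip(times, events, groups):
            if t >= tj:
                Yk[g] += 1
                if t == tj and e == 1:
                    dk[g] += 1
        Y, d = sum(Yk), sum(dk)
        w = weight(Y)
        for k in range(K):
            Z[k] += w * (dk[k] - Yk[k] * d / Y)
        if Y > 1:  # with a single subject at risk the variance term is 0
            c = w * w * d * (Y - d) / (Y - 1)
            for k in range(K):
                for m in range(K):
                    if k == m:
                        Sigma[k][m] += c * (Yk[k] / Y) * (1 - Yk[k] / Y)
                    else:
                        Sigma[k][m] -= c * Yk[k] * Yk[m] / (Y * Y)
    return Z, Sigma

def trend_statistic(Z, Sigma, scores):
    """Trend statistic (7.4.2) for scores a_1 < ... < a_K."""
    K = len(scores)
    num = sum(a * z for a, z in zip(scores, Z))
    var = sum(scores[k] * scores[m] * Sigma[k][m]
              for k in range(K) for m in range(K))
    return num / math.sqrt(var)

# Hypothetical data (not from the notes): hazards increase with the group label
times = [6, 9, 12, 15, 4, 7, 10, 13, 2, 3, 5, 8]
events = [1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
groups = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
Z, Sigma = k_sample_stats(times, events, groups, K=3)
z_trend = trend_statistic(Z, Sigma, scores=[1, 2, 3])
p_upper = 0.5 * math.erfc(z_trend / math.sqrt(2))  # one-sided normal p-value
print(round(z_trend, 3), round(p_upper, 4))
```

Because the $Z_k(\tau)$ sum to zero and each row of $\hat{\Sigma}$ sums to zero, replacing the scores by `[10, 12, 14]` (an increasing linear transformation of `[1, 2, 3]`) returns the same value of the statistic.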

How to choose $a_k$ to increase the power of detecting $H_A$?

Let $a_1 < a_2 < \cdots < a_K$ (e.g., $a_j = j$ is the standard approach).

We reject $H_0$ in favor of $H_A$ at Type I error rate $\alpha$ when the test statistic is larger than the $\alpha$th upper quantile of a standard normal distribution.

Exercise: Show that the test is invariant under linear transformation of the scores.

Other choices of $a_k$ may have better properties.

Example 7.6: Testing survival difference in patients with larynx cancer across disease stage groups (I, II, III, IV).

The trend test should be used only if there is strong prior information about the group ordering; otherwise, the trend test results in a potential loss of power.

7.5 Stratified tests

Suppose one wishes to adjust for other covariates. E.g.: Compare the survival rates of patients receiving 3 months versus 6 months of chemotherapy in IBCSG with adjustment for tumor stage (I, II, III, IV).

$H_0: h_{1s}(t) = h_{2s}(t) = \cdots = h_{Ks}(t)$, $t \in [0, \tau]$, $s = 1, \ldots, M$ strata

Compute $Z_{js}(\tau)$, $j = 1, \ldots, K$, $s = 1, \ldots, M$, and $\hat{\Sigma}_s = \widehat{\mathrm{Cov}}[Z_{1s}(\tau), \ldots, Z_{Ks}(\tau)]$ separately within each stratum. The global test uses

\[
Z_{j\cdot}(\tau) = \sum_{s=1}^{M} Z_{js}(\tau)
\quad \text{and} \quad
\hat{\Sigma}_{\cdot} = \widehat{\mathrm{Cov}}[Z_{1\cdot}(\tau), \ldots, Z_{K\cdot}(\tau)] = \sum_{s=1}^{M} \hat{\Sigma}_s,
\]

\[
\chi^2 = [Z_{1\cdot}(\tau), \ldots, Z_{(K-1)\cdot}(\tau)]\, \hat{\Sigma}^{-1}_{\cdot\,(K-1) \times (K-1)}\, [Z_{1\cdot}(\tau), \ldots, Z_{(K-1)\cdot}(\tau)]^t \sim \chi^2_{K-1} \quad \text{under } H_0.
\]

Example 7.4 (Bone marrow transplant data)

We previously compared the disease-free survival rates of patients in the three groups ALL, AML low risk, and AML high risk. The subjects were also divided into those who used a graft-versus-host prophylactic (MTX) and those who didn't (NOMTX). We want to perform a stratified log-rank test for differences in the hazard rates of the three disease states. To illustrate the stratified test procedure, we first perform 2 separate log-rank tests using R.

R call and output for the stratified test:

    library(survival)
    # Log-rank test (rho = 0) across disease groups g, stratified by mtx
    fit1 <- survdiff(Surv(t2, dfree) ~ g + strata(mtx), data = bmt, rho = 0)
    print(fit1)

    Call:
    survdiff(formula = Surv(t2, dfree) ~ g + strata(mtx), data = bmt, rho = 0)

          N Observed Expected (O-E)^2/E (O-E)^2/V
    g=1  38       24     23.2    0.0261     0.038
    g=2  54       25     38.7    4.8663     9.621
    g=3  45       34     21.0    7.9673    10.796

     Chisq= 13.2  on 2 degrees of freedom, p= 0.00136

Continued example, IBCSG data: Recall $H_0: h_{3mon}(t) = h_{6mon}(t)$. Stratify on tumor stage = 1, 2, 3, 4.

In Splus/R:

    library(survival)
    # Log-rank test for treatment, stratified by tumor stage
    survdiff(Surv(t, ind) ~ treatment + strata(stage), rho = 0)

Matched pairs

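Here we have paired event times $(T_{1i}, T_{2i})$ and event indicators $(\delta_{1i}, \delta_{2i})$, for $i = 1, \ldots, M$. We wish to test $H_0: h_{1i}(t) = h_{2i}(t)$, $i = 1, \ldots, M$. Computing the statistics (7.3.3) and (7.3.4) within each pair, we have

\[
Z_{1i}(\tau) =
\begin{cases}
W(T_{1i})/2 & \text{if } T_{1i} < T_{2i},\ \delta_{1i} = 1, \text{ or } T_{1i} = T_{2i},\ \delta_{1i} = 1,\ \delta_{2i} = 0, \\
-W(T_{2i})/2 & \text{if } T_{2i} < T_{1i},\ \delta_{2i} = 1, \text{ or } T_{1i} = T_{2i},\ \delta_{1i} = 0,\ \delta_{2i} = 1, \\
0 & \text{otherwise,}
\end{cases}
\]

\[
\hat{\sigma}_{11i}(\tau) =
\begin{cases}
W^2(T_{1i})/4 & \text{in the first case,} \\
W^2(T_{2i})/4 & \text{in the second case,} \\
0 & \text{otherwise.}
\end{cases}
\tag{7.5.5}
\]

If the weight function does not depend on time, then $Z_{1\cdot}(\tau) = \frac{W}{2}(D_1 - D_2)$, where $D_1$ is the number of pairs in which the group 1 member is observed to fail first (the first case of (7.5.5)) and $D_2$ is the number of pairs in which the group 2 member is observed to fail first (the second case). Similarly, $\hat{\sigma}_{11\cdot}(\tau) = \sum_{i=1}^{M} \hat{\sigma}_{11i}(\tau) = \frac{W^2}{4}(D_1 + D_2)$. So the test statistic is

\[
\frac{D_1 - D_2}{\sqrt{D_1 + D_2}} \sim N(0, 1) \quad \text{under } H_0.
\]

Example 7.8: Matched pairs of leukemia patients receiving 6-MP and placebo treatment.

7.6 Renyi type tests

The Renyi type tests are preferable when hazard rates cross. We test $H_0: h_1(t) = h_2(t)$, $t \le \tau$, against $H_A: h_1(t) \ne h_2(t)$ for some $t \le \tau$.

Consider two samples of sizes $n_1$ and $n_2$ with $n = n_1 + n_2$. Let $t_1 < t_2 < \cdots < t_D$ be the distinct failure times of the pooled sample. Let $d_{j1}, d_{j2}$ be the numbers of events at $t_j$ and $Y_{j1}, Y_{j2}$ be the numbers at risk at time $t_j$ in the two samples. Also let $d_j = d_{j1} + d_{j2}$ and $Y_j = Y_{j1} + Y_{j2}$. Let $W$ be the weight function and let

\[
Z(t_i) = \sum_{t_j \le t_i} W(t_j) \left[ d_{j1} - Y_{j1} \frac{d_j}{Y_j} \right], \quad i = 1, \ldots, D. \tag{7.6.1}
\]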

Let the estimated variance of $Z(\tau)$ be

\[
\sigma^2(\tau) = \sum_{t_j \le \tau} W(t_j)^2\, \frac{Y_{j1}}{Y_j}\, \frac{Y_{j2}}{Y_j}\, \frac{Y_j - d_j}{Y_j - 1}\, d_j, \tag{7.6.2}
\]

where $\tau$ is the largest $t_j$ with $Y_{j1}, Y_{j2} > 0$. The test statistic for a two-sided alternative is given by

\[
Q = \sup\{ |Z(t)|,\ t \le \tau \} / \sigma(\tau). \tag{7.6.3}
\]

When $H_0$ is true, the distribution of $Q$ can be approximated by the distribution of $\sup\{ |B(x)|,\ 0 \le x \le 1 \}$, where $B$ is a standard Brownian motion process. Critical values of $Q$ are found in Table C.5 in Appendix C.

When the hazards cross, the supremum test should have greater power to detect such differences between the hazard rates.

Example 7.9: A clinical trial of chemotherapy against chemotherapy combined with radiotherapy in the treatment of gastric cancer.
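As a rough illustration of (7.6.1) to (7.6.3), the following Python sketch tracks the running statistic $Z(t_i)$, takes its largest absolute value, and standardizes by $\sigma(\tau)$. The function names and the toy data are hypothetical. The tail probability uses the classical series for the supremum of the absolute value of Brownian motion on $[0, 1]$, $P(\sup|B(x)| > y) = 1 - \frac{4}{\pi} \sum_{k \ge 0} \frac{(-1)^k}{2k+1} \exp\{-\pi^2 (2k+1)^2 / (8 y^2)\}$; this formula is not stated in the notes, which refer to a table of critical values instead.

```python
import math

def renyi_test(times, events, groups, weight=lambda y: 1.0):
    """Return (Q, standardized Z(tau)) for two samples labelled 1 and 2.

    Assumes at least one death time at which both samples are at risk
    and a nonzero variance estimate.
    """
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    z, var, sup_abs = 0.0, 0.0, 0.0
    for tj in death_times:
        y1 = sum(1 for t, g in zip(times, groups) if t >= tj and g == 1)
        y2 = sum(1 for t, g in zip(times, groups) if t >= tj and g == 2)
        if y1 == 0 or y2 == 0:
            break  # beyond tau: one sample has nobody left at risk
        y = y1 + y2
        d1 = sum(1 for t, e, g in zip(times, events, groups)
                 if t == tj and e == 1 and g == 1)
        d = sum(1 for t, e in zip(times, events) if t == tj and e == 1)
        w = weight(y)
        z += w * (d1 - y1 * d / y)                                  # (7.6.1)
        var += w * w * (y1 / y) * (y2 / y) * (y - d) / (y - 1) * d  # (7.6.2)
        sup_abs = max(sup_abs, abs(z))
    sigma = math.sqrt(var)
    return sup_abs / sigma, z / sigma                               # (7.6.3)

def sup_abs_brownian_tail(y, terms=50):
    """Approximate P(sup |B(x)| > y, 0 <= x <= 1) by a truncated series."""
    s = sum((-1) ** k / (2 * k + 1)
            * math.exp(-math.pi ** 2 * (2 * k + 1) ** 2 / (8 * y * y))
            for k in range(terms))
    return 1.0 - 4.0 / math.pi * s

# Hypothetical crossing-hazards data (not from the notes):
# sample 1 has early deaths and late censoring, sample 2 has later deaths.
times = [1, 2, 3, 10, 11, 12, 4, 5, 6, 7, 8, 9]
events = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
q, z_end = renyi_test(times, events, groups)
print(round(q, 3), round(z_end, 3), round(sup_abs_brownian_tail(q), 3))
```

In this toy data set the running statistic first rises and then falls, so $Q$ exceeds the absolute value of the standardized end-point statistic, which is the situation the supremum test is designed for.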