Linear rank statistics

Comparison of two groups. Consider the failure time $T_{ij}$ of the $j$-th subject in the $i$-th group for $i = 1$ or $2$; the first group is often called control, and the second treatment. Let $n_i$ be the size of the $i$-th group. We consider the respective censoring time $U_{ij}$, and assume that $T_{ij}$ and $U_{ij}$ are independent. Within a group we introduce the number of observed events by

$$N_i(t) = \sum_{j=1}^{n_i} N_{ij}(t) = \sum_{j=1}^{n_i} I_{\{X_{ij} \le t,\ \delta_{ij} = 1\}} \tag{5.1}$$

where $X_{ij} = \min(T_{ij}, U_{ij})$, $i = 1, 2$ and $j = 1, \ldots, n_i$, are the observed times along with the indicator $\delta_{ij} = I_{\{T_{ij} \le U_{ij}\}}$.

Martingales for grouped processes. Similarly we can construct the number at risk by

$$Y_i(t) = \sum_{j=1}^{n_i} Y_{ij}(t) = \sum_{j=1}^{n_i} I_{\{X_{ij} \ge t\}} \tag{5.2}$$

A natural filtration $\mathcal{F}_t$ is also introduced so that the above processes are $\mathcal{F}_t$-measurable and

$$M_i(t) = N_i(t) - \int_0^t Y_i(u)\lambda_i(u)\,du \tag{5.3}$$

becomes an $\mathcal{F}_t$-martingale with the respective hazard function $\lambda_i(t)$ for each $i = 1, 2$.

Mantel's logrank statistic. We introduce the increment $\Delta N_i(t) = N_i(t) - N_i(t-)$, and $\Delta Y_i(t)$ analogously for $Y_i$, for $i = 1, 2$, and

$$N(t) = N_1(t) + N_2(t); \quad Y(t) = Y_1(t) + Y_2(t); \quad n = n_1 + n_2.$$

The standardized sum of differences between observed and expected numbers of events

$$L_M = \left(\frac{n}{n_1 n_2}\right)^{1/2} \int_0^\infty \left[dN_1(t) - \frac{Y_1(t)}{Y(t)}\,dN(t)\right]$$

is called Mantel's logrank statistic.

Heuristic interpretation of test. For a moment we assume that all the data are uncensored; thus, $X_{ij} = T_{ij}$ and $\delta_{ij} = 1$. Then the following table summarizes the data observed at time $t$:

                  Group 1                      Group 2                      Total
Observed at t     $\Delta N_1(t)$              $\Delta N_2(t)$              $\Delta N(t)$
Not observed      $Y_1(t) - \Delta N_1(t)$     $Y_2(t) - \Delta N_2(t)$     $Y(t) - \Delta N(t)$
Risk at t         $Y_1(t)$                     $Y_2(t)$                     $Y(t)$

Page 1 Special lecture/july 16

Under the null hypothesis that there is no difference between the two groups, the number $\Delta N_1(t)$ of observed events in the first group must be predicted by $\Delta N(t)\,Y_1(t)/Y(t)$.

Problem 1. Mantel's logrank statistic can be written in the form

$$L_M = \int_0^\infty K_M(t)\left[\frac{dN_1(t)}{Y_1(t)} - \frac{dN_2(t)}{Y_2(t)}\right] \tag{5.4}$$

with

$$K_M(t) = \left(\frac{n}{n_1 n_2}\right)^{1/2} \frac{Y_1(t)Y_2(t)}{Y(t)}.$$

Here data are generally censored, and the formulation of (5.4) is called a linear rank statistic.

Wilcoxon rank-sum statistic. Let $T_{(1)} < \cdots < T_{(n)}$ be the order statistics of the pooled failure times $T_{ij}$. Then we can define the rank $R_j$ of the $j$-th subject in the first group by setting $R_j = k$ if $T_{1j} = T_{(k)}$. Let $S_i(t)$ be the survival function of the failure time from the $i$-th group. Under the null hypothesis $H_0: S_1 = S_2$, the Wilcoxon rank-sum statistic

$$W = \sum_{j=1}^{n_1} R_j$$

has the mean $\dfrac{n_1(n+1)}{2}$ and the variance $\dfrac{n_1 n_2 (n+1)}{12}$.

Gehan-Wilcoxon statistic. Assuming that the observed data are uncensored, the Wilcoxon rank-sum statistic $W$ can be related to a linear rank statistic

$$L_G = \frac{n_1(n+1) - 2W}{(n n_1 n_2)^{1/2}} = \int_0^\infty K_G(t)\left[\frac{dN_1(t)}{Y_1(t)} - \frac{dN_2(t)}{Y_2(t)}\right] \tag{5.5}$$

with

$$K_G(t) = \frac{Y_1(t)Y_2(t)}{(n n_1 n_2)^{1/2}}.$$

This formulation allows us to incorporate censored data into the hypothesis test, and it is called the Gehan-Wilcoxon statistic.

Problem 2. We can verify (5.5) by completing the following questions.

(a) Show that

$$\int_0^\infty Y_1(t)Y_2(t)\left[\frac{dN_1(t)}{Y_1(t)} - \frac{dN_2(t)}{Y_2(t)}\right] = \int_0^\infty \left[Y(t)\,dN_1(t) - Y_1(t)\,dN(t)\right]$$

(b) Show that

$$\int_0^\infty Y(t)\,dN_1(t) = n_1(n+1) - W$$

(c) Show that

$$\int_0^\infty Y_1(t)\,dN(t) = \sum_{j=1}^{n_1} R_{(j)} = W.$$
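The identities in Problems 1 and 2 can be checked numerically on a small sample. The following is a minimal Python sketch, not part of the original notes: the function names and the toy data are hypothetical, ties are assumed absent, and a term is skipped whenever $Y_1(t) = 0$ or $Y_2(t) = 0$, because both $K_M(t)$ and $K_G(t)$ vanish there.

```python
import math

def risk_table(times, status, group):
    """Return [(dN1, dN2, Y1, Y2)] at each distinct observed event time."""
    rows = []
    for t in sorted({x for x, d in zip(times, status) if d == 1}):
        dn, y = [0, 0], [0, 0]
        for x, d, g in zip(times, status, group):
            if x >= t:
                y[g - 1] += 1            # at risk: X_ij >= t
            if x == t and d == 1:
                dn[g - 1] += 1           # observed event at t
        rows.append((dn[0], dn[1], y[0], y[1]))
    return rows

def mantel_logrank(times, status, group):
    """L_M as the standardized sum of observed minus expected events."""
    n1, n2 = group.count(1), group.count(2)
    total = sum(dn1 - y1 / (y1 + y2) * (dn1 + dn2)
                for dn1, dn2, y1, y2 in risk_table(times, status, group))
    return math.sqrt((n1 + n2) / (n1 * n2)) * total

def linear_rank(times, status, group, weight):
    """Sum of K(t) [dN1/Y1 - dN2/Y2] with K(t) = weight(Y1, Y2, n1, n2)."""
    n1, n2 = group.count(1), group.count(2)
    total = 0.0
    for dn1, dn2, y1, y2 in risk_table(times, status, group):
        if y1 == 0 or y2 == 0:
            continue                     # K_M and K_G are zero here
        total += weight(y1, y2, n1, n2) * (dn1 / y1 - dn2 / y2)
    return total

def k_mantel(y1, y2, n1, n2):
    return math.sqrt((n1 + n2) / (n1 * n2)) * y1 * y2 / (y1 + y2)

def k_gehan(y1, y2, n1, n2):
    return y1 * y2 / math.sqrt((n1 + n2) * n1 * n2)

def rank_sum(times, group):
    """Wilcoxon rank-sum W of group 1 (uncensored data without ties)."""
    pooled = sorted(times)
    return sum(pooled.index(t) + 1 for t, g in zip(times, group) if g == 1)

# Hypothetical toy sample: status 1 = event, 0 = censored.
times = [2.1, 5.3, 7.7, 1.0, 3.4, 4.2, 9.9]
status = [1, 0, 1, 1, 1, 0, 1]
group = [1, 1, 1, 2, 2, 2, 2]
n1, n2 = group.count(1), group.count(2)
n = n1 + n2

# Problem 1: both forms of L_M agree, also with censoring.
lm_direct = mantel_logrank(times, status, group)
lm_linear = linear_rank(times, status, group, k_mantel)

# Problem 2: with all data uncensored, (5.5) links L_G to W.
all_events = [1] * n
w = rank_sum(times, group)
lg_from_ranks = (n1 * (n + 1) - 2 * w) / math.sqrt(n * n1 * n2)
lg_linear = linear_rank(times, all_events, group, k_gehan)

print(lm_direct, lm_linear)
print(w, lg_from_ranks, lg_linear)
```

The two printed values on each line should agree up to floating-point rounding. The quadratic-time `risk_table` is chosen for readability on small samples, not for efficiency.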

Asymptotic properties. A linear rank statistic

$$L_K = \int_0^\infty K(t)\left[\frac{dN_1(t)}{Y_1(t)} - \frac{dN_2(t)}{Y_2(t)}\right] \tag{5.6}$$

is characterized by an $\mathcal{F}_t$-measurable process $K(t)$. Under the null hypothesis $H_0: \lambda_1 = \lambda_2 = \lambda$, the test statistic $L_K$ has asymptotically a normal distribution with mean $0$. The variance $\sigma^2$ of the asymptotic normal distribution can be obtained as the limit of

$$\operatorname{Var}(L_K(t)) = E\left[\int_0^t K(u)^2\,\frac{Y(u)}{Y_1(u)Y_2(u)}\,\lambda(u)\,du\right] \tag{5.7}$$

where $L_K(t)$ denotes the statistic (5.6) with the integral taken over $[0, t]$.

Consistency. Let $z_\alpha$ be the critical value with level $\alpha$ from the standard normal distribution. Then the critical value of the linear rank statistic can be obtained from the asymptotic normality so that $P(L_K > \sigma z_\alpha) \approx \alpha$ under $H_0$. Consider the alternative hypothesis $H_A: \lambda_1(t) \ge \lambda_2(t)$ for all $t$. The linear rank statistic (5.6) is consistent in the sense that

$$\lim_{n_1, n_2 \to \infty} P(L_K > \sigma z_\alpha) = 1$$

if $H_A$ is true.

Problem 3. We derive the variance formula (5.7) of the linear rank statistic by completing the following questions.

(a) Show that

$$E[L_K] = \int_0^\infty E[K(u)]\,[\lambda_1(u) - \lambda_2(u)]\,du$$

(b) It is known that

$$\operatorname{Cov}\left(\int_0^t \frac{K(u)}{Y_1(u)}\,dM_1(u),\ \int_0^t \frac{K(u)}{Y_2(u)}\,dM_2(u)\right) = 0.$$

Assuming the null hypothesis $H_0: \lambda_1 = \lambda_2 = \lambda$, verify (5.7).

Problem 4. Assume that the data are all uncensored, and that the null hypothesis $H_0: S_1 = S_2 = S$ is true. Then we have $Y_i(t) \approx n_i S(t)$, $i = 1, 2$, and $Y(t) \approx n S(t)$ as $n_1, n_2 \to \infty$.

(a) Show that the limit $\sigma_M^2$ of $\operatorname{Var}(L_M)$ for Mantel's logrank statistic is obtained by

$$\sigma_M^2 = \int_0^\infty S(u)\lambda(u)\,du = 1.$$

(b) Show that the limit $\sigma_G^2$ of $\operatorname{Var}(L_G)$ for the Gehan-Wilcoxon statistic is obtained by

$$\sigma_G^2 = \int_0^\infty S(u)^3\lambda(u)\,du = \frac{1}{3}.$$
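The limiting variances stated in Problem 4 can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not a proof: it draws uncensored Exponential(1) failure times in both groups (so that $H_0$ holds), evaluates $L_M$ and $L_G$ through the linear rank form, and compares their sample variances with $1$ and $1/3$. The sample sizes, the seed, the number of replications and all function names are arbitrary choices made for this example.

```python
import math
import random

def simulate_statistics(n1, n2, rng):
    """Draw one uncensored sample under H0 and return (L_M, L_G)."""
    n = n1 + n2
    # Exponential(1) failure times in both groups, so S_1 = S_2.
    sample = [(rng.expovariate(1.0), 1) for _ in range(n1)]
    sample += [(rng.expovariate(1.0), 2) for _ in range(n2)]
    sample.sort()
    y1, y2 = n1, n2                  # numbers at risk at the first event
    sum_m = sum_g = 0.0
    for _, g in sample:
        if y1 > 0 and y2 > 0:        # otherwise K_M = K_G = 0
            jump = 1.0 / y1 if g == 1 else -1.0 / y2   # dN1/Y1 - dN2/Y2
            sum_m += y1 * y2 / (y1 + y2) * jump        # K_M without its constant
            sum_g += y1 * y2 * jump                    # K_G without its constant
        if g == 1:
            y1 -= 1                  # the subject leaves the risk set
        else:
            y2 -= 1
    l_m = math.sqrt(n / (n1 * n2)) * sum_m
    l_g = sum_g / math.sqrt(n * n1 * n2)
    return l_m, l_g

def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(2016)            # fixed seed so the run is reproducible
draws = [simulate_statistics(100, 100, rng) for _ in range(2000)]
var_m = sample_variance([m for m, _ in draws])
var_g = sample_variance([g for _, g in draws])
print(f"Var(L_M) ~ {var_m:.3f} (limit 1), Var(L_G) ~ {var_g:.3f} (limit 1/3)")
```

With finite samples the estimates are only expected to be near the limits; they carry both Monte Carlo error and a small finite-sample bias.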

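Problem solutions

Problem 1. We obtain algebraically

$$dN_1(t) - \frac{Y_1(t)}{Y(t)}\,dN(t) = dN_1(t) - \frac{Y_1(t)}{Y(t)}\,[dN_1(t) + dN_2(t)] = \frac{Y_2(t)}{Y(t)}\,dN_1(t) - \frac{Y_1(t)}{Y(t)}\,dN_2(t) = \frac{Y_1(t)Y_2(t)}{Y(t)}\left[\frac{dN_1(t)}{Y_1(t)} - \frac{dN_2(t)}{Y_2(t)}\right]$$

Thus, $L_M$ has the form of (5.4), as desired.

Problem 2. (a) We can show that

$$\int_0^\infty Y_1(t)Y_2(t)\left[\frac{dN_1(t)}{Y_1(t)} - \frac{dN_2(t)}{Y_2(t)}\right] = \int_0^\infty \left[Y_2(t)\,dN_1(t) - Y_1(t)\,dN_2(t)\right] = \int_0^\infty \left[Y_2(t)\,dN_1(t) - Y_1(t)\,[dN(t) - dN_1(t)]\right] = \int_0^\infty \left[Y(t)\,dN_1(t) - Y_1(t)\,dN(t)\right]$$

(b) Observe that $Y(T_{(k)})$ counts all the cases at risk at the time $T_{(k)}$ of the $k$-th event, and therefore, that $Y(T_{(k)}) = n + 1 - k$. Then we can verify that

$$\int_0^\infty Y(t)\,dN_1(t) = \sum_{j=1}^{n_1} Y(T_{1j}) = \sum_{j=1}^{n_1} Y(T_{(R_j)}) = \sum_{j=1}^{n_1} (n + 1 - R_j) = n_1(n+1) - W$$

(c) Consider the order statistics $R_{(1)} < \cdots < R_{(n_1)}$ of the rank statistics $R_j$, $j = 1, \ldots, n_1$. Observe that $Y_1(T_{(R_{(j)})})$ counts the cases at risk in group 1 at the time $T_{(R_{(j)})}$ of the $j$-th event from group 1; thus, we have $Y_1(T_{(R_{(j)})}) = n_1 + 1 - j$. The same value is taken by $Y_1$ at every pooled event time whose rank lies between $R_{(j-1)} + 1$ and $R_{(j)}$, and $Y_1$ vanishes after the last event from group 1. By setting $R_{(0)} = 0$ and $T_{(0)} = 0$ for convenience, we can verify that

$$\int_0^\infty Y_1(t)\,dN(t) = \sum_{i=1}^{2}\sum_{j=1}^{n_i} Y_1(T_{ij}) = \sum_{j=1}^{n_1} Y_1(T_{(R_{(j)})})\,[R_{(j)} - R_{(j-1)}] = \sum_{j=1}^{n_1} (n_1 + 1 - j)\,[R_{(j)} - R_{(j-1)}]$$

$$= n_1 R_{(1)} + (n_1 - 1)\,[R_{(2)} - R_{(1)}] + \cdots + [R_{(n_1)} - R_{(n_1 - 1)}] = \sum_{j=1}^{n_1} R_{(j)} = W.$$

Problem 3. (a) By using the martingales $M_1(t)$ and $M_2(t)$, we introduce the stochastic process

$$L_K(t) = \int_0^t \frac{K(u)}{Y_1(u)}\,dM_1(u) - \int_0^t \frac{K(u)}{Y_2(u)}\,dM_2(u) + \int_0^t K(u)\,[\lambda_1(u) - \lambda_2(u)]\,du$$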

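Then we can show that

$$E[L_K(t)] = \int_0^t E[K(u)]\,[\lambda_1(u) - \lambda_2(u)]\,du$$

(b) By applying the martingale variance formula, we have

$$\operatorname{Var}\left(\int_0^t \frac{K(u)}{Y_i(u)}\,dM_i(u)\right) = E\left[\int_0^t \frac{K(u)^2}{Y_i(u)}\,\lambda_i(u)\,du\right]$$

Under the null hypothesis $H_0: \lambda_1 = \lambda_2 = \lambda$, we can show that

$$\operatorname{Var}(L_K(t)) = \operatorname{Var}\left(\int_0^t \frac{K(u)}{Y_1(u)}\,dM_1(u) - \int_0^t \frac{K(u)}{Y_2(u)}\,dM_2(u)\right) = E\left[\int_0^t K(u)^2\left(\frac{1}{Y_1(u)} + \frac{1}{Y_2(u)}\right)\lambda(u)\,du\right]$$

which is (5.7), since $\dfrac{1}{Y_1(u)} + \dfrac{1}{Y_2(u)} = \dfrac{Y(u)}{Y_1(u)Y_2(u)}$.

Problem 4. (a) Observe that

$$K_M(t)^2\,\frac{Y(t)}{Y_1(t)Y_2(t)} = \frac{n}{n_1 n_2}\,\frac{Y_1(t)Y_2(t)}{Y(t)} \approx S(t)$$

Therefore, by taking the limit of (5.7) we obtain

$$\int_0^\infty S(u)\lambda(u)\,du = \int_0^\infty d[-S(u)] = \big[-S(u)\big]_{u=0}^{u=\infty} = 1$$

(b) Observe that

$$K_G(t)^2\,\frac{Y(t)}{Y_1(t)Y_2(t)} = \frac{Y(t)Y_1(t)Y_2(t)}{n n_1 n_2} \approx S(t)^3$$

Therefore, by taking the limit of (5.7) we obtain

$$\int_0^\infty S(u)^3\lambda(u)\,du = \int_0^\infty S(u)^2\,d[-S(u)] = \left[-\frac{S(u)^3}{3}\right]_{u=0}^{u=\infty} = \frac{1}{3}$$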
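As a cross-check of Problem 4(b), the limit $1/3$ can also be recovered from the exact moments of the Wilcoxon rank-sum statistic. The short derivation below is an added illustration that assumes uncensored data without ties under $H_0$, and uses only the mean $n_1(n+1)/2$ and the variance $n_1 n_2(n+1)/12$ of $W$ together with the relation $L_G = [n_1(n+1) - 2W]/(n n_1 n_2)^{1/2}$ from (5.5).

```latex
\begin{align*}
E[L_G] &= \frac{n_1(n+1) - 2E[W]}{(n n_1 n_2)^{1/2}}
        = \frac{n_1(n+1) - n_1(n+1)}{(n n_1 n_2)^{1/2}} = 0, \\
\operatorname{Var}(L_G) &= \frac{4\operatorname{Var}(W)}{n n_1 n_2}
        = \frac{4}{n n_1 n_2} \cdot \frac{n_1 n_2 (n+1)}{12}
        = \frac{n+1}{3n} \longrightarrow \frac{1}{3}
        \quad \text{as } n \to \infty.
\end{align*}
```

The finite-sample variance $(n+1)/(3n)$ is slightly above $1/3$, which agrees with the martingale-based limit $\sigma_G^2 = 1/3$.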