A Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes

Similar documents
The Design of a Survival Study

Biost 518 Applied Biostatistics II. Purpose of Statistics. First Stage of Scientific Investigation. Further Stages of Scientific Investigation

4. Comparison of Two (K) Samples

Joint Modeling of Longitudinal Item Response Data and Survival

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

Statistical Methods for Alzheimer s Disease Studies

Tutorial 6: Tutorial on Translating between GLIMMPSE Power Analysis and Data Analysis. Acknowledgements:

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Lecture 6: Single-classification multivariate ANOVA (k-group( MANOVA)

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Sample Size and Power Considerations for Longitudinal Studies

A Regression Model For Recurrent Events With Distribution Free Correlation Structure

A Bayesian Nonparametric Approach to Causal Inference for Semi-competing risks

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Large sample inference for a win ratio analysis of a composite outcome based on prioritized components

Multi-state Models: An Overview

β j = coefficient of x j in the model; β = ( β1, β2,

Introduction to Statistical Analysis

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping

Estimating Optimal Dynamic Treatment Regimes from Clustered Data

Lecture 5 Models and methods for recurrent event data

Models for Multivariate Panel Count Data

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

STA6938-Logistic Regression Model

SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)

Adaptive Designs: Why, How and When?

A3. Statistical Inference Hypothesis Testing for General Population Parameters

Factor Analytic Models of Clustered Multivariate Data with Informative Censoring (refer to Dunson and Perreault, 2001, Biometrics 57, )

Overrunning in Clinical Trials: a Methodological Review

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Group sequential designs with negative binomial data

Extensions of Cox Model for Non-Proportional Hazards Purpose

Power and Sample Size Calculations for the Wilcoxon Mann Whitney Test in the Presence of Missing Observations due to Death

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Comparison of Two Samples

Evidence synthesis for a single randomized controlled trial and observational data in small populations

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Survival Analysis for Case-Cohort Studies

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Lectures of STA 231: Biostatistics

Effect of investigator bias on the significance level of the Wilcoxon rank-sum test

Adjusting for possibly misclassified causes of death in cause-specific survival analysis

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

A superiority-equivalence approach to one-sided tests on multiple endpoints in clinical trials

Comparative effectiveness of dynamic treatment regimes

Interim Monitoring of Clinical Trials: Decision Theory, Dynamic Programming. and Optimal Stopping

Statistics in medicine

Directionally Sensitive Multivariate Statistical Process Control Methods

Statistical Methods in Clinical Trials Categorical Data

Bayesian Nonparametric Accelerated Failure Time Models for Analyzing Heterogeneous Treatment Effects

Power and Sample Size Calculations with the Additive Hazards Model

Group sequential designs for negative binomial outcomes

Chapter 7, continued: MANOVA

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

DEALING WITH MULTIVARIATE OUTCOMES IN STUDIES FOR CAUSAL EFFECTS

Multivariate Survival Analysis

Describing Contingency tables

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity

CTDL-Positive Stable Frailty Model

Instrumental variables estimation in the Cox Proportional Hazard regression model

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison

METHODOLOGIES FOR IDENTIFYING SUBSETS OF THE POPULATION WHERE TWO TREATMENTS DIFFER. Gregory A. Yothers

Group Sequential Trial with a Biomarker Subpopulation

Independent Increments in Group Sequential Tests: A Review

Likelihood Construction, Inference for Parametric Survival Distributions

Review. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 770: Categorical Data Analysis

Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation

Session 9 Power and sample size

General Principles Within-Cases Factors Only Within and Between. Within Cases ANOVA. Part One

Testing Non-Linear Ordinal Responses in L2 K Tables

Testing a secondary endpoint after a group sequential test. Chris Jennison. 9th Annual Adaptive Designs in Clinical Trials

Extensions of Cox Model for Non-Proportional Hazards Purpose

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data

This paper has been submitted for consideration for publication in Biometrics

Sample Size Calculations for Group Randomized Trials with Unequal Sample Sizes through Monte Carlo Simulations

ST4241 Design and Analysis of Clinical Trials Lecture 4: 2 2 factorial experiments, a special cases of parallel groups study

,..., θ(2),..., θ(n)

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina

NONPARAMETRIC ADJUSTMENT FOR MEASUREMENT ERROR IN TIME TO EVENT DATA: APPLICATION TO RISK PREDICTION MODELS

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution

Impact of approximating or ignoring within-study covariances in multivariate meta-analyses

Survival models and health sequences

Analysis of Multiple Endpoints in Clinical Trials

Comparison of Two Population Means

Estimating the Mean Response of Treatment Duration Regimes in an Observational Study. Anastasios A. Tsiatis.

Inference in Randomized Trials with Death and Missingness

Rerandomization to Balance Covariates

Analysing Survival Endpoints in Randomized Clinical Trials using Generalized Pairwise Comparisons

Mixtures of multiple testing procedures for gatekeeping applications in clinical trials

Score test for random changepoint in a mixed model

Integrated likelihoods in survival models for highlystratified

Bios 6648: Design & conduct of clinical research

Statistics in medicine

Semiparametric Regression

Multistate models and recurrent event models

Robustifying Trial-Derived Treatment Rules to a Target Population

Transcription:

A Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes Ritesh Ramchandani Harvard School of Public Health August 5, 2014 Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 1 / 20

Motivation Multiple Outcomes in clinical studies when a single endpoint is not adequate. ALS: ALSFRS-R and mortality. Heart Disease: Rehospitalization and mortality. Stroke: Different outcome scales to assess recovery Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 2 / 20

Methods for multivariate outcomes Multiple comparisons Examples: Bonferroni; Hochberg s step-up procedure (1988) Composite outcomes for time-to-event (e.g. progression free survival) Global Tests Hotelling s (1934) T 2 : Assumes normality O Brien s (1984) nonparametric rank-sum test We are interested in a nonparametric global test of the overall efficacy of a treatment over multiple outcomes. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 3 / 20

Some joint tests based on ranks O Brien (1984): Rank subjects with respect to each outcome; average each subject s ranks, and perform rank-sum test. Finkelstein-Schoenfeld Joint Rank Test (1999): Survival and longitudinal outcome. First rank pairs of subjects on survival; if tied or incomparable then rank on longitudinal outcome. Moye (1992,2011) proposed a related test. Wittkowski (2004): Patient A is ranked higher than B if all outcomes for patient A are better or equal to those of patient B. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 4 / 20

Examples of scoring pairs A B Time ALSFRS-R Alive? ALSFRS-R Alive? 0 32 Yes 32 Yes 1 35 Yes 31 Yes 2 37 Censored 25 Yes 3 NA NA NA No A is censored at time 2, B dies at time 3. Survival outcome score is 0 since indeterminate At the last common time they were observed, A had a higher ALSFRS-R score than B did, so the ALSFRS-R outcome score is 1. 1 O Brien: Overall Score of 0.5 (average of both outcome scores) 2 Finkelstein-Schoenfeld: Score of 1 (survival indeterminate, so use ALSFRS-R) 3 Wittkowski: Score of 1 (ALSFRS-R better, and survival not known to be worse) Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 5 / 20

General Procedure For each pair of subjects i and j in different groups, score for each outcome k: 1, if i did better than j r ijk = 1, if j did better than i 0, if indeterminate Assign a univariate score for the pair i and j that is a function of each the scores for the p outcomes. The test statistic will be based on the sum of the univariate scores over one of the groups. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 6 / 20

General Procedure Suppose we have two groups, and p outcomes. Denote E[r ijk ] = θ k. Can be thought of as treatment effect for outcome k. Constructed so that θ k = 0 when the treatment has no effect. Define a composite scoring function φ(r) that maps the individual outcome ranks to a univariate score. Examples: O Brien: φ(r) = p k r k Joint rank test: φ(r) = r 1 + I (r 1 = 0)r 2 +... + I (r 1 =... = r p 1 = 0)r p Wittkowski: φ(r) = I (max k {r k } > 0) I (min k {r k } < 0) Combination: φ(r) = r 1 + I (r 1 = 0) 1 p p 1 k=2 r k Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 7 / 20

Test Statistic Let n, m be the sample sizes in groups 1 and 2 respectively, and N = n + m is the total sample size Test Statistic: U = 1 nm n i m φ(r ij ) U-statistic that estimates the parameter θ φ = E[φ(r 1 (X 1, Y 1 ),..., r p (X p, Y p ))], for a given choice of φ Global treatment effect Under H 0 and some regularity conditions, NU N(0, σ 2 ) Can estimate σ 2 consistently from the data. Can also estimate power for different values of θ φ under H 1. j Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 8 / 20

Weights In addition, we can choose a function φ that assigns different weights to each outcome. For Example, if we have weights (w 1,...w p ): Weighted O Brien: φ(r 1,..., r p ) = p k w kr k Weighted Joint Rank: φ(r 1,..., r p ) = w 1 r 1 + I (r 1 = 0)w 2 r 2 +... + I (r 1 =... = r p 1 = 0)w p r p How to choose weights? Importance of outcomes (utility). Minimize variance Optimal: Maximize power under a particular alternative hypothesis. Adaptively choose optimal weights by estimating from other strata. Should restrict weights to the positive quadrant, i.e. w k 0 for all k. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 9 / 20

O Brien as sum of correlated U-Statistics We can write O Brien s global test as a sum of outcome-specific U-statistics. The weighted O Brien statistic is given by w U = w 1 U 1 + w 2 U 2 +...w p U p Then Nw U N(0, w Λw), where Λ = cov(u) Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 10 / 20

Optimal Weights Solution that maximizes the power function yields w = Λ 1 θ, where θ = (θ 1,..., θ p ), the marginal treatment effects for each outcome. Note: this may give negative weights; we can use numerical optimization to restrict weights to be positive. Minimum variance weights correspond to the case where θ = (1,..., 1) Choose θ for a particular alternative hypothesis, estimate Λ from the data. Similarly, can write Joint-Rank test as sum of correlated U-statistics, and optimal solution follows. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 11 / 20

Simulations: Mortality and ALSFRS-R Simulations based on a trial of Celebrex for ALS. Data generated from a shared parameter model with patient-specific random effects for ALSFRS-R functional rating scores, and hazard for survival is a function of ALSFRS trajectory and treatment effect (Healy 2012). First column denotes treatment effect for survival and ALS, respectively: None, mild, or moderate (mod). (Surv,ALS) O Brien O Brien w ( w opt ) Joint-Rank JR w ( w opt ) (Mod,Mild).52.59 (90,10).49.59 (93,7) (Mild,Mod).61.72 (3,97).52.57 (32,68) (Mod,None).25.61 (1,0).31.61 (1,0) (None,Mod).35.74 (0,1).21.62 (0.1,1)* Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 12 / 20

Adaptive Weights Optimal weight requires knowledge of the parameter values we expect to see under an alternative hypothesis. Can be estimated from previous studies, but in general unknown. With stratified data (e.g. different centers in a clinical trial), we can estimate the necessary weights in each of the previous strata, and then apply the optimal weights to the next stratum, and continue in this manner (Fisher 1998) Disadvantages: Can make interpretation difficult. Can yield erroneous weights, particularly when there is a treatment by strata interaction. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 13 / 20

Adaptive Weights: Simulations 2 outcomes, uncensored; generated from multivariate normal with variance 1, correlation ρ. 2 Strata; Relative Effect Size: 1/5 4 Strata; Relative Effect Size: 1/10 Power 0.5 0.6 0.7 0.8 0.9 1.0 Equal weights Adaptive Weights Optimal Weights Power 0.5 0.6 0.7 0.8 0.9 1.0 Equal weights Adaptive Weights Optimal Weights 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 rho rho Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 14 / 20

Simulations Summary Type 1 error controlled at the nominal level for small samples, unequal variances, unequal sample sizes, and unequal censoring distributions. Optimal weight improves efficiency of existing tests, but weights in general are unknown. May be estimated from prior data. Adaptive weighting can improve power, but only if effect sizes vary significantly between outcomes and correlation is moderate to high. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 15 / 20

Example Data Analysis: Ceftriaxone ALS Trial Stage 3 Double Blind Clinical Trial of Ceftriaxone in Subjects with ALS 513 Subjects monitored for rate of decline in ALSFRS-R scores over time, and survival 340 Subjects administered Ceftriaxone; 173 placebo Average follow up time of 1.6 years, maximum 6 years Conclusion: Survival and rate of decline were not signifcantly different between Ceftriaxone and placebo Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 16 / 20

Ceftriaxone Trial: Comparison of Joint Tests Test Statistic p-value Survival only 1.13.26 ALS only -0.39.70 O Brien 0.16.87 O Brien w 0.85.40 Joint-Rank 0.71.48 Joint-Rank w 0.31.76 Covariance matrices ˆΛ: ( ).37.06 O Brien:.06 1.40 (.37.006 Joint-Rank:.006.17 ) Minimum variance weights: w = (.815,.185) w = (.31,.69) Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 17 / 20

Some things to consider Global Test weak under alternatives where treatment affects outcomes in different directions. That is okay for our purposes. Results of any given test should be carefully interpreted. A positive result does not necessarily mean treatment is best for all outcomes. Make use of descriptive statistics and plots, or combine with closed testing procedures. Not necessarily the right test if main interest in isolating which specific outcomes are meaningfully different. Use multiple comparisons. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 18 / 20

Strengths and Limitations Flexible: Allows investigator to choose test based on the context of study, estimate of interest, and required sample size/power. Variance can easily be estimated from the data for a given composite function of the scores. Framework for constructing optimal weights under specific alternatives, for some tests. Covariate adjustment only available through stratification. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 19 / 20

Acknowledgements Dianne Finkelstein and David Schoenfeld. Harvard School of Public Health and Massachusetts General Hospital, Department of Biostatistics. Supported by NIH grant T32NS048005. Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes August 5, 2014 20 / 20