Censoring mechanisms

Similar documents
Continuous case Discrete case General case. Hazard functions. Patrick Breheny. August 27. Patrick Breheny Survival Data Analysis (BIOS 7210) 1/21

Multistate models and recurrent event models

Proportional hazards regression

Multistate models and recurrent event models

Semiparametric Regression

ST495: Survival Analysis: Maximum likelihood

Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation

Accelerated Failure Time Models

Residuals and model diagnostics

One-parameter models

The Weibull Distribution

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

Exercises. (a) Prove that m(t) =

Nonconvex penalties: Signal-to-noise ratio and algorithms

Covariance test Selective inference. Selective inference. Patrick Breheny. April 18. Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/20

Cox regression: Estimation

Likelihood Construction, Inference for Parametric Survival Distributions

Joint Modeling of Longitudinal Item Response Data and Survival

Empirical Likelihood

Lecture 22 Survival Analysis: An Introduction

Lecture 3. Truncation, length-bias and prevalence sampling

Tied survival times; estimation of survival probabilities

Time-dependent coefficients

Censoring and Truncation - Highlighting the Differences

Estimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Theoretical results for lasso, MCP, and SCAD

Frailty Modeling for clustered survival data: a simulation study

Introduction to Reliability Theory (part 2)

UNIVERSITY OF CALIFORNIA, SAN DIEGO

Survival Distributions, Hazard Functions, Cumulative Hazards

STAT 6385 Survey of Nonparametric Statistics. Order Statistics, EDF and Censoring

Stability and the elastic net

Lecture 4 - Survival Models

Temporal point processes: the conditional intensity function

Multistate Modeling and Applications

MAS3301 / MAS8311 Biostatistics Part II: Survival

Cox s proportional hazards model and Cox s partial likelihood

The normal distribution

Week 5: Logistic Regression & Neural Networks

18.465, further revised November 27, 2012 Survival analysis and the Kaplan Meier estimator

ST495: Survival Analysis: Hypothesis testing and confidence intervals

Lecture 20. Poisson Processes. Text: A Course in Probability by Weiss STAT 225 Introduction to Probability Models March 26, 2014

Modelling geoadditive survival data

For right censored data with Y i = T i C i and censoring indicator, δ i = I(T i < C i ), arising from such a parametric model we have the likelihood,

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

The Central Limit Theorem

Permutation tests. Patrick Breheny. September 25. Conditioning Nonparametric null hypotheses Permutation testing

Chapter 9: Sampling Distributions

The lasso. Patrick Breheny. February 15. The lasso Convex optimization Soft thresholding

CSC321 Lecture 18: Learning Probabilistic Models

4. Comparison of Two (K) Samples

Joint Probability Distributions and Random Samples (Devore Chapter Five)

Survival Analysis for Case-Cohort Studies

Bootstrap Confidence Intervals

On Measurement Error Problems with Predictors Derived from Stationary Stochastic Processes and Application to Cocaine Dependence Treatment Data

Nuisance parameters and their treatment

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

Optimization Methods II. EM algorithms.

Examination paper for TMA4275 Lifetime Analysis

SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)

3003 Cure. F. P. Treasure

Statistical Data Analysis Stat 3: p-values, parameter estimation

Lecture 5 Models and methods for recurrent event data

MAS223 Statistical Inference and Modelling Exercises

Standard Error of Technical Cost Incorporating Parameter Uncertainty

Reliability Monitoring Using Log Gaussian Process Regression

Relative efficiency. Patrick Breheny. October 9. Theoretical framework Application to the two-group problem

One-sample categorical data: approximate inference

The Design of a Survival Study

Integrated likelihoods in survival models for highlystratified

Analysing geoadditive regression data: a mixed model approach

CTDL-Positive Stable Frailty Model

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

Economics 520. Lecture Note 19: Hypothesis Testing via the Neyman-Pearson Lemma CB 8.1,

Multi-state Models: An Overview

Lecture 7 Time-dependent Covariates in Cox Regression

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n

Stat 451 Lecture Notes Simulating Random Variables

1 Glivenko-Cantelli type theorems

A Joint Model with Marginal Interpretation for Longitudinal Continuous and Time-to-event Outcomes

MAS3301 / MAS8311 Biostatistics Part II: Survival

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

The t-test Pivots Summary. Pivots and t-tests. Patrick Breheny. October 15. Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/18

Department of Statistical Science FIRST YEAR EXAM - SPRING 2017

Parameter Estimation

Outline of GLMs. Definitions

11 Survival Analysis and Empirical Likelihood

Lecture 12: Application of Maximum Likelihood Estimation:Truncation, Censoring, and Corner Solutions

Describing Contingency tables

A hidden semi-markov model for the occurrences of water pipe bursts

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FAILURE-TIME WITH DELAYED ONSET

Lecture 12 November 3

Lab #11. Variable B. Variable A Y a b a+b N c d c+d a+c b+d N = a+b+c+d

IE 303 Discrete-Event Simulation

Statistics 300B Winter 2018 Final Exam Due 24 Hours after receiving it

STAT331 Lebesgue-Stieltjes Integrals, Martingales, Counting Processes

A New Two Sample Type-II Progressive Censoring Scheme

Transcription:

Censoring mechanisms Patrick Breheny September 3 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23

Fixed vs. random censoring In the previous lecture, we derived the contribution to the likelihood from fully observed, censored, and truncated observations under a variety of situations (left/right/interval) We didn t explicitly state it as an assumption, but our derivations treated the censoring time c i for ith individual as a fixed quantity, known in advance In most real settings, however, censoring times are random variables, not fixed constants Patrick Breheny Survival Data Analysis (BIOS 7210) 2/23

Fixed vs. random censoring (cont d) Even in settings where the date of censoring is fixed in advance, the time on study until censoring is random because it depends on when the subject entered the study To realistically account for the effect of censoring on inference, then, we must consider censoring as a random variable, meaning that we must consider the distributions of two random variables (time to event and time to censoring) and how they relate to one another Patrick Breheny Survival Data Analysis (BIOS 7210) 3/23

Random censorship model Let T i denote the true survival time and C i denote the censoring time, with T i x i S(θ, xi ) C i x i G(η, xi ) T i C i x i, where θ is the parameter(s) of interest and x i is a potentially vector-valued covariate We observe {t i, d i, x i } n i=1, where t i = t i c i = min( t i, c i ) d i = 1{ t i c i } Patrick Breheny Survival Data Analysis (BIOS 7210) 4/23

Random censoring: Likelihood Under the assumptions on the previous slide, which collectively are known as random censoring, each observation is independent and therefore each observation makes an independent contribution L i (θ) to the likelihood: Thus, for T i = t, D i = 1 : L i (θ) = f(t x i, θ)g(t x i, η) for T i = t, D i = 0 : L i (θ) = g(t x i, η)s(t x i, θ) L(θ) = i L i (θ) i f(t i x i, θ) d i S(t i x i, θ) 1 d i Patrick Breheny Survival Data Analysis (BIOS 7210) 5/23

Remarks on random censoring Thus, we arrive at the same likelihood we derived previously in the case of fixed censoring: under the random censorship model, the likelihood contributions from censoring amount to a constant with respect to θ and do not affect inference This is very convenient since, in practice, one generally does not care about the censoring mechanism or wish to spend any effort modeling it Patrick Breheny Survival Data Analysis (BIOS 7210) 6/23

Random entry Random Censorship It is worth noting that the random censorship model covers the important special case of random entry into the study, with a fixed censoring date at the end of the study: Entry date: A i iid G (η) Censoring date: C i = c True failure date: T i = A i + T i, where T i S(θ) is the true failure time, as discussed earlier Thus, in terms of time on study, under the assumption A i T i, we have T i F (θ) C i G(η) T i C i Patrick Breheny Survival Data Analysis (BIOS 7210) 7/23

Random censoring: Example To illustrate the assumptions of the random censorship model and the bias that can result when those assumptions are violated, we ll now go through a few examples The examples will generally follow the random entry scenario just described, in which subjects enter the model randomly over the interval (0, 1) and then the analysis is carried out at time t = 1 To begin with, let s just satisfy ourselves that what we just derived does, in fact, work: T i iid Exp(1) A i Unif(0, 1), so that T i C i Patrick Breheny Survival Data Analysis (BIOS 7210) 8/23

Random censoring: Simulated results (n = 500) 1.0 0.8 0.6 L(λ) 0.4 0.2 0.0 0.5 1.0 1.5 2.0 λ Patrick Breheny Survival Data Analysis (BIOS 7210) 9/23

Example: Entry and failure dependent Now, let s introduce some dependence between the entry time and the failure time, so that the random censorship assumption of T i C i is violated In particular, let s let T i follow an Exp(1) distribution as before, but A i Unif(0, 1) if T < 1 A i Unif(0, 1 2 ) if T 1 Thus, there is a systematic bias in which patients who live longer tend to enter the study earlier Patrick Breheny Survival Data Analysis (BIOS 7210) 10/23

Dependent entry and failure (n = 500) 1.0 0.8 0.6 L(λ) 0.4 0.2 0.0 0.5 1.0 1.5 2.0 λ Patrick Breheny Survival Data Analysis (BIOS 7210) 11/23

Example: Conditional independence Note, however, that C i and T i do not have to be strictly independent; they can be conditionally independent given x i To see how this distinction matters, let s introduce a covariate such that C i and T i are marginally dependent, but conditionally independent given x i : X i Bern( 1 2 ) T i x i = 1 Exp(λ) T i x i = 0 Exp(2λ) A i x i = 1 Unif(0, 1 2 ) A i x i = 0 Unif(0, 1) Patrick Breheny Survival Data Analysis (BIOS 7210) 12/23

Conditional independence: Remarks Note that, as in the first example, there is a systematic bias in which patients who live longer tend to enter the study earlier In this example, however, there is a covariate, x i, that can explain and account for this phenomenon Let s see what happens to the likelihood in this second example when we use the information from the covariate, and also what happens when we ignore x i (although admittedly, this is a somewhat flawed comparison, as we ll discuss in class) Patrick Breheny Survival Data Analysis (BIOS 7210) 13/23

Using the covariate (n = 500) 1.0 0.8 0.6 L(λ) 0.4 0.2 0.0 0.5 1.0 1.5 2.0 λ Patrick Breheny Survival Data Analysis (BIOS 7210) 14/23

Ignoring the covariate (n = 500; λ 1 = 0.4, λ 2 = 2) 1.0 0.8 0.6 L(λ) 0.4 0.2 0.0 0.5 1.0 1.5 2.0 λ Patrick Breheny Survival Data Analysis (BIOS 7210) 15/23

Generality of the RC likelihood We have seen that the likelihood L(θ) = i f(t i x i, θ) d i S(t i x i, θ) 1 d i is correct under the random censorship model However, it is also the correct likelihood in many other situations that do not fall under random censorship In other words, random censorship is a sufficient condition for the above likelihood, but not a necessary one Patrick Breheny Survival Data Analysis (BIOS 7210) 16/23

Type II censoring Random Censorship For example, suppose we enrolled n subjects in a study and then continued the study until a prespecified number d of events occurred (this is known as Type II censoring ) This is clearly outside the earlier framework; in particular, the censoring times depend on the failure times for other subjects Nevertheless, it can be shown that one still arrives at the same likelihood as under the random censorship model Patrick Breheny Survival Data Analysis (BIOS 7210) 17/23

Censoring mechanisms for which the aforementioned likelihood remains correct are called independent censoring mechanisms; random censoring is a special case of independent censoring A detailed description of the conditions under which independent censoring holds is beyond the scope of this course, but the general idea to consider the likelihood as being built up by a collection of stochastic processes unfolding in time, leading to { } L(θ) = λ(t i θ, x i ) d i exp λ(u θ, x j ) du, i 0 j R(u) where R(u) is the risk set at time u, consisting of all the individuals still alive and uncensored at time u Patrick Breheny Survival Data Analysis (BIOS 7210) 18/23

Finally, let us briefly revisit the assumption of the random censorship model that C i G(η, x i ) Specifically, we are assuming here that the distribution of the censoring time does not depend on θ, the parameter of interest This assumption is referred to as noninformative censoring; in other words, that the censoring mechanism does not contain any information about the parameter we are studying Patrick Breheny Survival Data Analysis (BIOS 7210) 19/23

Violations of noninformative censoring We have already examined the consequences when the assumption that C i T i is violated; what if the assumption of noninformative censoring is violated? Generally speaking, if the other assumptions of the random censorship model hold, informative censoring does not necessarily introduce any bias Instead, its main consequence is a loss of efficiency with regard to the estimation of θ Patrick Breheny Survival Data Analysis (BIOS 7210) 20/23

Informative censoring: Example As an example of informative censoring, suppose we have: T i Exp(λ) C i Exp(λ) T i C i In this case, as you showed in homework, T i Exp(2λ) and no observations are censored with respect to θ Let s compare this likelihood with the likelihood we get without making any assumptions concerning G (i.e., assuming random censorship) Patrick Breheny Survival Data Analysis (BIOS 7210) 21/23

Informative censoring (n = 100) 1.0 Uninformative Using cens information 0.8 0.6 L(λ) 0.4 0.2 0.0 0.5 1.0 1.5 2.0 λ Patrick Breheny Survival Data Analysis (BIOS 7210) 22/23

Final remarks Random Censorship This week, we addressed the question of constructing likelihoods in the presence of censoring (and truncation), and examined various censoring mechanisms with respect to how they influence the likelihood and potentially introduce bias Our illustrations have come from the exponential distribution so far out of convenience, but the main points have broad applicability Next week, we will discuss nonparametric approaches to estimating survival distributions Patrick Breheny Survival Data Analysis (BIOS 7210) 23/23