ST745: Survival Analysis: Nonparametric methods


1 ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015

2 "The KM estimator is used ubiquitously in medical studies to estimate and depict the fraction of patients living for a certain amount of time after treatment. It has since been applied to data from clinical trials of therapies for every disease from cancer to cardiology to concussion." (Science Life)
"Paul Meier's work and the KM analysis have been responsible for saving millions of lives." (Significance)

3 Then and now
Last time we discussed
- Maximum likelihood with censoring
- Right-censoring schemes
- Left-truncation
- Interval-censored data
- Current status data
- Estimating parametric models in R
- Large-sample theory and inference
Today we'll discuss
- Kaplan-Meier estimator and inference
- Nelson-Aalen estimator and inference
- Using R for nonparametric estimation

4 Warm-up
Explain to your stat buddy
1. What's the difference between left-censoring and left-truncation?
2. Give two examples of nonparametric estimators
3. Pros and cons of nonparametric methods relative to parametric methods
4. What is a confidence interval?
True or false:
(T/F) Paul Meier is still alive
(T/F) The bootstrap is an asymptotic approximation
(T/F) The integral symbol was invented by Gottfried Wilhelm Leibniz III

5 Things to recall
For a discrete distribution with failure times $t_1, \ldots$
$$S(t) = \prod_{j:\, t_j < t} [1 - h(t_j)],$$
where $h(t_j) = P(T = t_j \mid T \ge t_j)$

6 Family feud! I surveyed statisticians in SAS hall for the five most important steps in an applied statistical analysis. What are they?

7 Complications due to censoring
Consider making a simple visual display of lifetime data subject to right-censoring
- Why is this important?
- Consider making a histogram, what goes wrong?
- What about plotting the empirical CDF?
Today we'll see how to make these plots (and more!)

8 Product limit estimator: warm-up
Let $T_1, \ldots, T_n$ denote an iid sample (no censoring)
Empirical CDF
$$\hat F(t) = \frac{1}{n} \sum_{i=1}^n 1_{T_i \le t}$$
Empirical survival function (ESF)
$$\hat S(t) = \frac{1}{n} \sum_{i=1}^n 1_{T_i \ge t}$$
Does $\hat F(t) = 1 - \hat S(t)$ everywhere?

9 Ex. ECDF and ESF
[Figure: two panels, the ECDF $\hat F(x)$ and the ESF $\hat S(x)$, each plotted against $x$.]
How big are the steps above?

10 Ex. ECDF and ESF cont'd
n = 100;
x = rchisq(n, df=4);
# two panels side by side
par(list(mar=c(5,5,4,1) + 0.1, mfrow=c(1,2)));
# ECDF: step function jumping by 1/n at each sorted observation
plot(stepfun(sort(x), c(0, (1:n)/n)), xlab="x", ylab=expression(hat(F)(x)), main="", lwd=3);
# ESF: one minus the ECDF
plot(stepfun(sort(x), c(1, 1-(1:n)/n)), xlab="x", ylab=expression(hat(S)(x)), main="", lwd=3);

11 Ex. ECDF and ESF cont'd
If $t_1 < t_2 < \cdots < t_k$ are the distinct failure times
$$\hat S(t) = \frac{1}{n} \sum_{j=1}^k d_j 1_{t_j \ge t},$$
where $d_j$ is the number of observations equal to $t_j$
Why?
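A small numerical check of the grouped form of the ESF, as a sketch only. The sample `x` and the evaluation grid are made up for illustration, and the ESF is taken to be the fraction of observations greater than or equal to $t$, which is an assumption about the convention used on these slides.

```r
# Hypothetical complete (uncensored) sample with a tie at 2.
x <- c(1, 2, 2, 4, 7)
n <- length(x)

# ESF by definition: fraction of observations >= t.
esf_direct <- function(t) mean(x >= t)

# ESF via the distinct values t_j and their multiplicities d_j.
tj <- sort(unique(x))
dj <- as.vector(table(x))   # counts, ordered like sort(unique(x))
esf_grouped <- function(t) sum(dj * (tj >= t)) / n

grid <- c(0, 1, 1.5, 2, 3, 4, 7, 8)
cbind(t = grid,
      direct  = sapply(grid, esf_direct),
      grouped = sapply(grid, esf_grouped))
```

The two columns agree because each distinct value $t_j$ contributes $d_j$ identical indicators to the sum over observations.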

12 ECDF and ESF under censoring
When there is censoring
- Number of points in an interval [a, b] is unknown
- Cannot compute ESF or ECDF
Kaplan-Meier (KM) estimator (aka product limit estimator) is an analog of the ESF for right-censored data
The original KM paper is the most highly cited statistics paper to date. What is the second most highly cited?

13 KM estimator
Let $\{(t_i, \delta_i)\}_{i=1}^n$ denote the observed data, with distinct failure times $t_1 < t_2 < \cdots < t_k$ (these DO NOT include censoring times)
Define
- $d_j = \sum_{i=1}^n 1_{t_i = t_j,\, \delta_i = 1}$ to be the number of failures at $t_j$
- $n_j = \sum_{i=1}^n 1_{t_i \ge t_j}$ to be the number at risk at $t_j$
The KM estimator of $S(t)$ is
$$\hat S(t) = \prod_{j:\, t_j < t} \left( \frac{n_j - d_j}{n_j} \right)$$
Explain $\hat S(t)$ intuitively to your stat buddy

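14 Why does KM make sense?
Given $\{(t_i, \delta_i)\}_{i=1}^n$ how can we estimate $h(t_j)$? (Assume discrete for now)
$$h(t_j) = P(T = t_j \mid T \ge t_j) \approx \frac{\#\text{fail at } t_j}{\#\text{at risk at } t_j} = \frac{d_j}{n_j}$$
Apply
$$S(t) = \prod_{j:\, t_j < t} [1 - h(t_j)] \approx \prod_{j:\, t_j < t} \left( 1 - \frac{d_j}{n_j} \right) = \hat S(t)$$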

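15 Ex. compute the KM estimator
[Table: data columns $t$, $\delta$; worksheet columns $t_j$, $n_j$, $d_j$, $(n_j - d_j)/n_j$, $\hat S(t_j+)$. Entries not reproduced in this transcript.]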

16 Code break I: Computing KM in R See file firstkm.r
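The file firstkm.r is not reproduced here. As a stand-in, the following is a minimal base-R sketch of the product-limit computation on a made-up data set; the vectors `time` and `delta` are hypothetical and the code is not the course's own script.

```r
# Hypothetical right-censored sample.
time  <- c(2, 3, 3, 5, 6, 8, 8, 9)   # observed times
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)   # 1 = failure, 0 = right-censored

tj <- sort(unique(time[delta == 1]))                       # distinct failure times
nj <- sapply(tj, function(s) sum(time >= s))               # number at risk at t_j
dj <- sapply(tj, function(s) sum(time == s & delta == 1))  # failures at t_j

# Running product of (n_j - d_j)/n_j: the KM estimate just after each t_j.
km <- cumprod((nj - dj) / nj)
data.frame(tj, nj, dj, factor = (nj - dj) / nj, km)
```

If the survival package is installed, `survival::survfit(survival::Surv(time, delta) ~ 1)` should report the same survival values at the failure times, since it tabulates the estimate just after each event.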

17 Sanity check Claim: The KM estimator reduces to the ESF when there is no censoring. Why? Answer on board.
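One possible version of the board argument, offered as a sketch rather than the lecture's own derivation. It assumes the slides' definitions of $n_j$ and $d_j$, sets $n_{k+1} = 0$ for convenience, and writes $m$ for the largest index with $t_m < t$.

```latex
% With no censoring, everyone at risk at t_j who does not fail at t_j
% is still at risk at t_{j+1}:
%   n_{j+1} = n_j - d_j,  j = 1, ..., k,   with n_1 = n and n_{k+1} = 0.
\hat S(t)
  = \prod_{j=1}^{m} \frac{n_j - d_j}{n_j}
  = \prod_{j=1}^{m} \frac{n_{j+1}}{n_j}
  = \frac{n_{m+1}}{n_1}
  = \frac{1}{n} \sum_{i=1}^{n} 1_{T_i \ge t_{m+1}}
  = \frac{1}{n} \sum_{i=1}^{n} 1_{T_i \ge t}.
% The last step uses t_m < t <= t_{m+1}: no observation falls in [t, t_{m+1}).
% If m = k the product contains the factor (n_k - d_k)/n_k = 0, matching the ESF.
```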

18 Code break II: Example from Lawless See file ex321.r

19 Variance estimation
A consistent estimator of the variance of $\hat S(t)$ is given by Greenwood's formula:
$$\hat\sigma_S^2(t) = \hat S^2(t) \sum_{j:\, t_j < t} \frac{d_j}{n_j (n_j - d_j)}$$
When there is no censoring, this reduces to $\hat S(t)(1 - \hat S(t))/n$. Why is this the right quantity?
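Greenwood's formula is a running sum, so it can be evaluated alongside the KM product. The sketch below uses a made-up data set (`time`, `delta` are hypothetical) and reports the variance just after each failure time.

```r
# Hypothetical right-censored sample.
time  <- c(2, 3, 3, 5, 6, 8, 8, 9)
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)

tj <- sort(unique(time[delta == 1]))
nj <- sapply(tj, function(s) sum(time >= s))
dj <- sapply(tj, function(s) sum(time == s & delta == 1))

km     <- cumprod((nj - dj) / nj)               # KM just after each t_j
gw_sum <- cumsum(dj / (nj * (nj - dj)))         # Greenwood running sum
var_km <- km^2 * gw_sum                         # estimated Var of S-hat
data.frame(tj, km, var = var_km, se = sqrt(var_km))
```

Note that the term for $t_j$ is undefined when $n_j = d_j$, that is, when the estimate drops to zero; the example data avoid that case.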

20 KM as nonparametric MLE
Recall our counting process notation
$$Y_i(t) = 1_{T_i \ge t,\ i\text{th subj not cens at } t}, \quad dN_i(t) = Y_i(t)\, 1_{T_i = t}, \quad dC_i(t) = Y_i(t)\, 1_{i\text{th subj cens at } t},$$
we'll assume a discrete distribution with potential failure times $t = 0, 1, \ldots$
With your stat buddy prove $\sum_{i=1}^n dN_i(t) = \sum_{i=1}^n Y_i(t)\, dN_i(t)$

21 KM as nonparametric MLE cont'd
Recall from our work on non-informative censoring that
$$L \propto \prod_{i=1}^n \prod_{t=0}^\infty h(t)^{dN_i(t)} [1 - h(t)]^{Y_i(t)(1 - dN_i(t))}$$
Note*: We saw this en route to simplifying to an expression involving $f(t)$ and $S(t)$; for our purposes it will be convenient to use the above form.

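22 KM as nonparametric MLE cont'd
The LH simplifies to
$$L \propto \prod_{t=0}^\infty h(t)^{d_t} [1 - h(t)]^{n_t - d_t},$$
where $d_t = \sum_{i=1}^n dN_i(t)$, $n_t = \sum_{i=1}^n Y_i(t)$
Why? Interchange products to obtain
$$\prod_{t=0}^\infty \prod_{i=1}^n h(t)^{dN_i(t)} [1 - h(t)]^{Y_i(t)(1 - dN_i(t))} = \prod_{t=0}^\infty h(t)^{\sum_{i=1}^n dN_i(t)} [1 - h(t)]^{\sum_{i=1}^n Y_i(t)(1 - dN_i(t))},$$
and use $\sum_{i=1}^n Y_i(t)\, dN_i(t) = \sum_{i=1}^n dN_i(t) = d_t$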

23 KM as nonparametric MLE cont'd
To obtain the nonparametric MLE we view $(h(0), h(1), \ldots)$ as our parameter and maximize $L$
If $n_t = 0$ then there is no information about $h(t)$; let $\tau$ denote the largest $t$ s.t. $n_t > 0$, then
$$L \propto \prod_{t=0}^{\tau} h(t)^{d_t} [1 - h(t)]^{n_t - d_t},$$
and the log-lh is
$$\ell = \sum_{t=0}^{\tau} \left\{ d_t \log h(t) + (n_t - d_t) \log (1 - h(t)) \right\}$$

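24 KM as nonparametric MLE cont'd
Differentiate $\ell$ with respect to $h(t)$ to obtain
$$\frac{\partial \ell}{\partial h(t)} = \frac{d_t}{h(t)} - \frac{n_t - d_t}{1 - h(t)},$$
set this to zero and solve for $h(t)$ to obtain $\hat h(t) = d_t / n_t$
Then
$$\hat S(t) = \prod_{j:\, t_j < t} \left[ 1 - \hat h(t_j) \right] = \prod_{j:\, t_j < t} \left[ 1 - \frac{d_j}{n_j} \right]$$
is the MLE for $S(t)$ by the invariance property of the MLE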
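The closed form $\hat h(t) = d_t / n_t$ can be checked numerically for a single time point. This is only an illustrative sketch: the counts `d` and `n` are made up, and `optimize()` is used as a generic one-dimensional maximizer.

```r
# Hypothetical counts at one time point: d failures among n at risk.
d <- 1
n <- 5

# Contribution of this time point to the log-likelihood.
loglik <- function(h) d * log(h) + (n - d) * log(1 - h)

# Maximize over h in (0, 1), staying away from the endpoints.
opt <- optimize(loglik, interval = c(1e-6, 1 - 1e-6), maximum = TRUE)
c(numerical = opt$maximum, closed_form = d / n)
```

Because the log-likelihood is a sum of such terms, one per time point, each $h(t)$ can be maximized separately in this way.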

25 KM as nonpar MLE, enough already!
Some things to note
1. If the last obs time $\tau$ is a failure then $\hat S(t) = 0$ for all $t > \tau$
2. If the last obs time $\tau$ is a censoring time then $\hat S(t)$ is not defined for $t > \tau$
3. MLE formulation is powerful since large sample theory can be used to study efficiency and conduct statistical inference

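26 Fact from your past
Let $g$ be a smooth function from $\mathbb{R}$ into $\mathbb{R}$, then
$$g(\hat\theta_n) \approx g(\theta) + g'(\theta)(\hat\theta_n - \theta)$$
so that
$$\mathrm{Var}\, g(\hat\theta_n) \approx g'(\theta)^2\, \mathrm{Var}\, \hat\theta_n,$$
thus we can approximate the variance of $\hat\theta_n$ via
$$\mathrm{Var}\, \hat\theta_n \approx \frac{1}{g'(\theta)^2}\, \mathrm{Var}\, g(\hat\theta_n)$$
Ex. Let $g(u) = \log u$ to obtain
$$\mathrm{Var}\, \hat S(t) \approx S^2(t)\, \mathrm{Var} \log \hat S(t)$$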

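27 Computing Greenwood's formula
If we can approximate the variance of $\log \hat S(t)$ then we can use the preceding expansion to approximate $\mathrm{Var}\, \hat S(t)$
Recall the score function (derivative of log-lh) is
$$u(h(t)) = \frac{d_t}{h(t)} - \frac{n_t - d_t}{1 - h(t)},$$
so that
$$u'(h(t)) = -\frac{d_t}{h^2(t)} - \frac{n_t - d_t}{(1 - h(t))^2} = -n_t \left[ \frac{1}{h(t)} + \frac{1}{1 - h(t)} \right] = -\frac{n_t}{h(t)(1 - h(t))},$$
where the second equality substitutes $d_t = n_t h(t)$, which holds at $h(t) = \hat h(t)$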

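28 Computing Greenwood's formula cont'd
Observed Fisher info is a diagonal matrix with entries
$$I_t = \frac{n_t}{h(t)(1 - h(t))}$$
Thus $(\hat h(0), \hat h(1), \ldots, \hat h(\tau))$ are asymptotically independent s.t.
$$\mathrm{Var} \log \hat S(t) = \mathrm{Var} \log \prod_{j:\, t_j < t} \left[ 1 - \hat h(t_j) \right] = \mathrm{Var} \sum_{j:\, t_j < t} \log \left[ 1 - \hat h(t_j) \right] \approx \sum_{j:\, t_j < t} \mathrm{Var} \log \left[ 1 - \hat h(t_j) \right]$$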

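29 Computing Greenwood's formula cont'd
We can estimate $\mathrm{Var} \log [1 - \hat h(t_j)]$ using our approx
$$\mathrm{Var} \log \left[ 1 - \hat h(t_j) \right] \approx \frac{\mathrm{Var}\, \hat h(t_j)}{(1 - \hat h(t_j))^2} \approx \frac{1}{I_{t_j} (1 - \hat h(t_j))^2} = \frac{\hat h(t_j)(1 - \hat h(t_j))}{n_{t_j} (1 - \hat h(t_j))^2} = \frac{\hat h(t_j)}{n_{t_j} (1 - \hat h(t_j))}$$
Putting it all together
$$\mathrm{Var}(\hat S(t)) \approx \hat S^2(t) \sum_{j:\, t_j < t} \frac{\hat h(t_j)}{n_{t_j} (1 - \hat h(t_j))} = \hat S^2(t) \sum_{j:\, t_j < t} \frac{d_j}{n_j (n_j - d_j)},$$
where we have used $n_{t_j} = n_j$ and $\hat h(t_j) = d_j / n_j$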

30 Computing Greenwood's formula: epilogue
We glossed over some slippery technical details; for a rigorous treatment see advanced survival texts (e.g., Fleming and Harrington, 2005).
For a treatment of infinite-dimensional parameter spaces see Butch's semiparametrics course.

31 Nelson-Aalen estimator
One could obtain an estimator of the cumulative hazard via $-\log \hat S(t)$ (why?) but the following estimator is typically preferred
$$\hat H(t) = \sum_{j:\, t_j \le t} \frac{d_j}{n_j},$$
this is called the Nelson-Aalen (pronounced OH-len) estimator

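32 Ex. compute the NA estimator
[Table: data columns $t$, $\delta$; worksheet columns $t_j$, $n_j$, $d_j$, $d_j / n_j$, $\hat H(t_j)$. Entries not reproduced in this transcript.]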

33 Code break III: Computing NA in R See file firstna.r
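The file firstna.r is not reproduced here. As a stand-in, this base-R sketch computes the Nelson-Aalen estimate on a made-up data set (`time`, `delta` are hypothetical) and places it next to the alternative $-\log \hat S$ mentioned on the Nelson-Aalen slide.

```r
# Hypothetical right-censored sample.
time  <- c(2, 3, 3, 5, 6, 8, 8, 9)
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)

tj <- sort(unique(time[delta == 1]))
nj <- sapply(tj, function(s) sum(time >= s))
dj <- sapply(tj, function(s) sum(time == s & delta == 1))

na_hat <- cumsum(dj / nj)              # Nelson-Aalen: running sum of d_j / n_j
km     <- cumprod((nj - dj) / nj)      # KM just after each t_j
data.frame(tj, nj, dj, na_hat, neg_log_km = -log(km))
```

Since $-\log(1 - x) \ge x$ for $0 \le x < 1$, the Nelson-Aalen value never exceeds $-\log$ of the KM value at the same failure time; the two are close while $d_j / n_j$ is small.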

34 Plotting the NA estimator
Plot of $\hat H(t)$ is informative for the shape of the hazard function
- $H(t)$ linear implies constant hazard
- $H(t)$ convex implies monotone (nondecreasing) hazard
- Slope of $H(t)$ approximates $h(t)$

35 Match the NA estimator with the true hazard
[Figure: three panels of $\hat H(t)$ against time and three panels of $h(t)$ against time, to be matched.]

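36 Variance estimation
NA estimator is an MLE just like KM
Variance estimator for $\hat H(t)$ is
$$\hat\sigma_H^2(t) = \sum_{j:\, t_j \le t} \frac{d_j (n_j - d_j)}{n_j^3},$$
which can be derived using large-sample approximations (each term estimates $\mathrm{Var}\, \hat h(t_j) \approx \hat h(t_j)(1 - \hat h(t_j))/n_j$)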

37 Code break IV: NA on Example from Lawless
See file ex321na.r

38 Confidence interval for S(t)
Fact: For any fixed $t > 0$
$$\frac{\hat S(t) - S(t)}{\hat\sigma_S(t)} \xrightarrow{d} N(0, 1)$$
Stronger convergence results (simultaneous over all $t$) exist
$(1 - \alpha) \times 100\%$ CI based on Greenwood's formula
$$\hat S(t) \pm z_{1 - \alpha/2}\, \hat\sigma_S(t)$$
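A sketch of the plain Greenwood interval on a made-up data set (`time`, `delta` are hypothetical), computed just after each failure time with $\alpha = 0.05$.

```r
# Hypothetical right-censored sample.
time  <- c(2, 3, 3, 5, 6, 8, 8, 9)
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)

tj <- sort(unique(time[delta == 1]))
nj <- sapply(tj, function(s) sum(time >= s))
dj <- sapply(tj, function(s) sum(time == s & delta == 1))

km <- cumprod((nj - dj) / nj)                      # KM just after each t_j
se <- km * sqrt(cumsum(dj / (nj * (nj - dj))))     # Greenwood standard error
z  <- qnorm(0.975)                                 # z_{1 - alpha/2}, alpha = 0.05

ci <- data.frame(tj, km, lower = km - z * se, upper = km + z * se)
ci
```

With so few subjects the interval at the last failure time has a negative lower endpoint, which is the kind of behavior that motivates transformed intervals.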

39 Alternative confidence intervals
Greenwood's formula is intuitive but has drawbacks
- CI generally does not perform well in small samples
- Can generate a CI with endpoints outside of (0, 1)
Recall our general strategy for modeling probabilities
1. Transform to take values in $\mathbb{R}$
2. Conduct estimation/inference on transformed scale
3. Transform back to (0, 1)

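40 Transformed confidence interval
Let $g(s)$ be a decreasing cts function from (0, 1) onto $\mathbb{R}$, construct a CI for $g(S(t))$ then transform back
Via a Taylor approximation, define $\hat\psi(t) = g(\hat S(t))$, then
$$\hat\sigma_\psi^2(t) = \left[ g'\{\hat S(t)\} \right]^2 \hat\sigma_S^2(t)$$
Taylor series arguments show
$$P\left( -z_{1 - \alpha/2} \le \frac{\hat\psi(t) - \psi(t)}{\hat\sigma_\psi(t)} \le z_{1 - \alpha/2} \right) \approx 1 - \alpha$$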

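41 Transformed confidence interval cont'd
Rearrange terms to obtain
$$P\left( \hat\psi(t) - z_{1 - \alpha/2}\, \hat\sigma_\psi(t) \le \psi(t) \le \hat\psi(t) + z_{1 - \alpha/2}\, \hat\sigma_\psi(t) \right) \approx 1 - \alpha$$
Solve for $S(t)$ using $\psi(t) = g(S(t))$
$$P\left( g^{-1}\{\hat\psi(t) + z_{1 - \alpha/2}\, \hat\sigma_\psi(t)\} \le S(t) \le g^{-1}\{\hat\psi(t) - z_{1 - \alpha/2}\, \hat\sigma_\psi(t)\} \right) \approx 1 - \alpha$$
Note the arguments within $g^{-1}$ have flipped
Question: How do we know $g^{-1}$ exists and is decreasing?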

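42 Transformed confidence interval cont'd
If $g(s) = \log(-\log(s))$ the CI is
$$\left[ \exp\{-\exp(\hat\psi(t) + z_{1 - \alpha/2}\, \hat\sigma_\psi)\},\ \exp\{-\exp(\hat\psi(t) - z_{1 - \alpha/2}\, \hat\sigma_\psi)\} \right]$$
Variance is
$$\hat\sigma_\psi^2(t) = \frac{\hat\sigma_S^2(t)}{\left[ \hat S(t) \log \hat S(t) \right]^2}$$
Another common choice is $g(s) = \log(s)$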
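The log-log interval can be sketched in a few lines of base R. The data (`time`, `delta`) are made up, $\alpha = 0.05$ is assumed, and the interval is evaluated just after each failure time.

```r
# Hypothetical right-censored sample.
time  <- c(2, 3, 3, 5, 6, 8, 8, 9)
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)

tj <- sort(unique(time[delta == 1]))
nj <- sapply(tj, function(s) sum(time >= s))
dj <- sapply(tj, function(s) sum(time == s & delta == 1))

km <- cumprod((nj - dj) / nj)                      # KM just after each t_j
se <- km * sqrt(cumsum(dj / (nj * (nj - dj))))     # Greenwood standard error
z  <- qnorm(0.975)

psi    <- log(-log(km))                # psi-hat = g(S-hat), g(s) = log(-log s)
se_psi <- se / abs(km * log(km))       # sigma_psi = sigma_S / |S log S|

# g is decreasing, so the larger argument gives the lower endpoint.
lower <- exp(-exp(psi + z * se_psi))
upper <- exp(-exp(psi - z * se_psi))
data.frame(tj, km, lower, upper)
```

Unlike the untransformed interval, both endpoints stay inside (0, 1) by construction, because $\exp\{-\exp(\cdot)\}$ maps the real line into (0, 1).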

43 Bootstrap: AKA the boostarp
- Eric draws a brilliant depiction of the bootstrap on the board
- Applause subsides
- A quiet moment of reflection reveals a new appreciation for the beauty of statistics in each of us

44 The boostarp cont'd
Let $D = \{(T_i, \delta_i)\}_{i=1}^n$ denote the observed data and $P_n$ the empirical distribution
A (nonparametric) bootstrap sample is a sample of size $n$, say $D^{(b)}$, drawn uniformly (with replacement) from $D$
- $D^{(b)}$ is an i.i.d. draw of size $n$ from $P_n$
- Other resample sizes are possible
Standard percentile bootstrap CI for $S(t)$
1. Draw $B$ nonparametric bootstrap samples, $D^{(1)}, \ldots, D^{(B)}$
2. Compute $\hat S^{(b)}(t)$, KM on $D^{(b)}$, $b = 1, \ldots, B$
3. Let $\hat l_{\alpha/2}$ and $\hat u_{1 - \alpha/2}$ be the $(\alpha/2) \times 100$ and $(1 - \alpha/2) \times 100$ percentiles of $\hat S^{(1)}(t), \ldots, \hat S^{(B)}(t)$
4. Final $(1 - \alpha) \times 100\%$ CI is $[\hat l_{\alpha/2}, \hat u_{1 - \alpha/2}]$
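The four steps above can be sketched in base R as follows. Everything specific here is an assumption made for illustration: the helper `km_at()`, the simulated log-normal failure times and exponential censoring times, the evaluation point `t0`, and the number of resamples `B`.

```r
# KM estimate of S(t0): product over distinct failure times strictly before t0.
km_at <- function(time, delta, t0) {
  tj <- sort(unique(time[delta == 1 & time < t0]))
  if (length(tj) == 0) return(1)
  nj <- sapply(tj, function(s) sum(time >= s))
  dj <- sapply(tj, function(s) sum(time == s & delta == 1))
  prod((nj - dj) / nj)
}

set.seed(745)
n     <- 200
fail  <- rlnorm(n, meanlog = 1, sdlog = 1)   # hypothetical failure times
cens  <- rexp(n, rate = 0.2)                 # hypothetical censoring times
time  <- pmin(fail, cens)
delta <- as.numeric(fail <= cens)

t0 <- 2; B <- 500; alpha <- 0.05
boot <- replicate(B, {
  idx <- sample.int(n, n, replace = TRUE)    # step 1: resample subjects
  km_at(time[idx], delta[idx], t0)           # step 2: KM on the resample
})
ci <- quantile(boot, c(alpha / 2, 1 - alpha / 2))   # steps 3 and 4
ci
```

Subjects are resampled as pairs $(T_i, \delta_i)$, so each bootstrap data set keeps every time together with its censoring indicator.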

45 Simulated experiment: coverage probabilities
- T ~ log-normal(1, 2), C ~ exp(1.75)
- Sample size of n = 200 and 10K MC replications
- Compare coverage of Greenwood's formula with log-log transform
[Figure: coverage of the Greenwood and log-log intervals plotted against t.]
See coverageexample.r

46 Confidence intervals for quantiles
In some settings a quantile is of interest
- E.g., the median
- Quantiles are often easier to estimate than moments
Recall $t_p$ is the $p$th quantile of $T$
$$t_p = \inf\{t : 1 - S(t) \ge p\}$$
Given an estimator $\hat S(t)$ of $S(t)$ we obtain
$$\hat t_p = \inf\{t : 1 - \hat S(t) \ge p\}$$
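Because the KM estimate only changes at failure times, the plug-in quantile is the first failure time at which the estimate has dropped to $1 - p$ or below. The sketch below uses a made-up data set (`time`, `delta` are hypothetical) and $p = 0.5$.

```r
# Hypothetical right-censored sample.
time  <- c(2, 3, 3, 5, 6, 8, 8, 9)
delta <- c(1, 1, 0, 1, 0, 1, 1, 0)

tj <- sort(unique(time[delta == 1]))
nj <- sapply(tj, function(s) sum(time >= s))
dj <- sapply(tj, function(s) sum(time == s & delta == 1))
km <- cumprod((nj - dj) / nj)          # KM just after each t_j

p   <- 0.5                             # the median
idx <- which(1 - km >= p)              # failure times where 1 - S-hat reaches p
tp_hat <- if (length(idx) > 0) tj[min(idx)] else NA
tp_hat
```

If the estimate never drops to $1 - p$, for example under heavy censoring, the set is empty and the sketch returns `NA` rather than a quantile.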

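47 Confidence intervals for quantiles cont'd
For continuous $T$, $S(t_p) = 1 - p$
Suppose $t_L = t_L(\text{Data})$ satisfies
$$P\left( S(t_L) \ge 1 - p \right) \ge 1 - \alpha,$$
then $t_L$ is a lower confidence bound for $t_p$ (Why?)
For any fixed $t$
$$P\left( S(t) \ge \hat S(t) - z_{1 - \alpha/2}\, \hat\sigma_S(t) \right) \ge 1 - \alpha,$$
solve $\hat S(t_L) - z_{1 - \alpha/2}\, \hat\sigma_S(t_L) = 1 - p$ for $t_L$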


More information

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

More information

e 4β e 4β + e β ˆβ =0.765

e 4β e 4β + e β ˆβ =0.765 SIMPLE EXAMPLE COX-REGRESSION i Y i x i δ i 1 5 12 0 2 10 10 1 3 40 3 0 4 80 5 0 5 120 3 1 6 400 4 1 7 600 1 0 Model: z(t x) =z 0 (t) exp{βx} Partial likelihood: L(β) = e 10β e 10β + e 3β + e 5β + e 3β

More information

Empirical Likelihood

Empirical Likelihood Empirical Likelihood Patrick Breheny September 20 Patrick Breheny STA 621: Nonparametric Statistics 1/15 Introduction Empirical likelihood We will discuss one final approach to constructing confidence

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods. MIT , Fall Due: Wednesday, 07 November 2007, 5:00 PM

Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods. MIT , Fall Due: Wednesday, 07 November 2007, 5:00 PM Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods MIT 14.385, Fall 2007 Due: Wednesday, 07 November 2007, 5:00 PM 1 Applied Problems Instructions: The page indications given below give you

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

On robust and efficient estimation of the center of. Symmetry.

On robust and efficient estimation of the center of. Symmetry. On robust and efficient estimation of the center of symmetry Howard D. Bondell Department of Statistics, North Carolina State University Raleigh, NC 27695-8203, U.S.A (email: bondell@stat.ncsu.edu) Abstract

More information

FULL LIKELIHOOD INFERENCES IN THE COX MODEL

FULL LIKELIHOOD INFERENCES IN THE COX MODEL October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach

More information

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models

Supporting Information for Estimating restricted mean. treatment effects with stacked survival models Supporting Information for Estimating restricted mean treatment effects with stacked survival models Andrew Wey, David Vock, John Connett, and Kyle Rudser Section 1 presents several extensions to the simulation

More information

Let us use the term failure time to indicate the time of the event of interest in either a survival analysis or reliability analysis.

Let us use the term failure time to indicate the time of the event of interest in either a survival analysis or reliability analysis. 10.2 Product-Limit (Kaplan-Meier) Method Let us use the term failure time to indicate the time of the event of interest in either a survival analysis or reliability analysis. Let T be a continuous random

More information

Product-limit estimators of the survival function with left or right censored data

Product-limit estimators of the survival function with left or right censored data Product-limit estimators of the survival function with left or right censored data 1 CREST-ENSAI Campus de Ker-Lann Rue Blaise Pascal - BP 37203 35172 Bruz cedex, France (e-mail: patilea@ensai.fr) 2 Institut

More information

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU

More information

Survival Regression Models

Survival Regression Models Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information

STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method

STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method STAT 135 Lab 2 Confidence Intervals, MLE and the Delta Method Rebecca Barter February 2, 2015 Confidence Intervals Confidence intervals What is a confidence interval? A confidence interval is calculated

More information

Contents 1. Contents

Contents 1. Contents Contents 1 Contents 1 One-Sample Methods 3 1.1 Parametric Methods.................... 4 1.1.1 One-sample Z-test (see Chapter 0.3.1)...... 4 1.1.2 One-sample t-test................. 6 1.1.3 Large sample

More information

Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes.

Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes. Unit 2: Models, Censoring, and Likelihood for Failure-Time Data Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes. Ramón

More information

The Design of a Survival Study

The Design of a Survival Study The Design of a Survival Study The design of survival studies are usually based on the logrank test, and sometimes assumes the exponential distribution. As in standard designs, the power depends on The

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS330 / MAS83 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-0 8 Parametric models 8. Introduction In the last few sections (the KM

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring

Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring Exact Inference for the Two-Parameter Exponential Distribution Under Type-II Hybrid Censoring A. Ganguly, S. Mitra, D. Samanta, D. Kundu,2 Abstract Epstein [9] introduced the Type-I hybrid censoring scheme

More information

Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions

Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions Int. Statistical Inst.: Proc. 58th World Statistical Congress, 2011, Dublin (Session CPS040) p.4828 Statistical Inference on Constant Stress Accelerated Life Tests Under Generalized Gamma Lifetime Distributions

More information

Likelihood ratio confidence bands in nonparametric regression with censored data

Likelihood ratio confidence bands in nonparametric regression with censored data Likelihood ratio confidence bands in nonparametric regression with censored data Gang Li University of California at Los Angeles Department of Biostatistics Ingrid Van Keilegom Eindhoven University of

More information

Introduction to repairable systems STK4400 Spring 2011

Introduction to repairable systems STK4400 Spring 2011 Introduction to repairable systems STK4400 Spring 2011 Bo Lindqvist http://www.math.ntnu.no/ bo/ bo@math.ntnu.no Bo Lindqvist Introduction to repairable systems Definition of repairable system Ascher and

More information

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

More information

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk

Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Ann Inst Stat Math (0) 64:359 37 DOI 0.007/s0463-00-036-3 Estimators for the binomial distribution that dominate the MLE in terms of Kullback Leibler risk Paul Vos Qiang Wu Received: 3 June 009 / Revised:

More information

Analysis of incomplete data in presence of competing risks

Analysis of incomplete data in presence of competing risks Journal of Statistical Planning and Inference 87 (2000) 221 239 www.elsevier.com/locate/jspi Analysis of incomplete data in presence of competing risks Debasis Kundu a;, Sankarshan Basu b a Department

More information

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants 18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

More information

Tied survival times; estimation of survival probabilities

Tied survival times; estimation of survival probabilities Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation

More information

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 7 Fall 2012 Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample H 0 : S(t) = S 0 (t), where S 0 ( ) is known survival function,

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

More information

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators

A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Statistics Preprints Statistics -00 A Simulation Study on Confidence Interval Procedures of Some Mean Cumulative Function Estimators Jianying Zuo Iowa State University, jiyizu@iastate.edu William Q. Meeker

More information

7 Influence Functions

7 Influence Functions 7 Influence Functions The influence function is used to approximate the standard error of a plug-in estimator. The formal definition is as follows. 7.1 Definition. The Gâteaux derivative of T at F in the

More information