PhD course in Advanced survival analysis. One-sample tests. Properties. Idea: (ABGK, sect. V.1.1) Counting process N(t)

Size: px
Start display at page:

Download "PhD course in Advanced survival analysis. One-sample tests. Properties. Idea: (ABGK, sect. V.1.1) Counting process N(t)"

Transcription

1 PhD course in Advanced survival analysis. (ABGK, sect. V.1.1) One-sample tests. Counting process N(t) Non-parametric hypothesis tests. Parametric models. Intensity process λ(t) = α(t)y (t) satisfying Aalen s multiplicative intensity model. Usually, N(t) is obtained by aggregating individual counting processes, α(t) is the hazard function, and Y (t) is the number at risk. 1 October 212 Per Kragh Andersen We will test the null hypothesis H : α(t) = α (t), with α (t) a known hazard function, e.g. a population mortality. 1 2 Compare Nelson-Aalen increments dn(s) dâ(s) = Y (s) Idea: with α (s)ds. Formally: introduce predictable, non-negative weights, K(s), and consider the process: Z(t) = ( ) K(s) dâ(s) α (s)ds. Then Z(t) tends to be large (small) if α(s) is larger (smaller) than α (s) for all s [, t]. Note that we always assume that K(s) = whenever Y (s) =. Properties. If H is true then dn(s) = α (s)y (s)ds + dm(s) and Z(t) equals the stochastic integral Z(t) = K(s) Y (s) dm(s). In particular, Z(t) is a mean zero martingale with (under H ) ( ) 2 K(s) t K 2 (s) Z (t) = d M (s) = Y (s) Y (s) α (s)ds. Conditions can be found under which, by the martingale CLT, the test statistic Z(t)/ Z (t) is asymptotically standard normal under H. 3 4

2 Choice of weights. Common choice is K(t) = Y (t), cf. Ex. V.1.1. Then with Z(t) = ( ) dn(s) Y (s) Y (s) α (s)ds = N(t) E(t), Z (t) = E(t) = Y 2 (s) Y (s) α (s)ds = E(t) Y (s)α (s)ds. The one-sample logrank test statistic is then Example: Fyn diabetics. For female diabetics (Ex. V.1.3) we have, for t = 1 years, N(t) = 237 and using published Danish life-tables we get E(t) = 8.1. The one-sample logrank statistic takes the (highly significant) value: = (N(t) E(t))/ E(t) and E(t) is the expected number of events in [, t] under H (since then N(t) E(t) is a martingale). 5 6 (ABGK sec. V.2.1) k-sample tests. k-variate counting process: (N 1 (t),...,n k (t)) where, typically, each component, N h (t), is obtained by aggregating individual processes. Intensity process (λ 1 (t),...,λ k (t)) of the form: We will test the null hypothesis: λ h (t) = α h (t)y h (t), h = 1,...,k. H : α 1 (t) =... = α k (t) for all t, e.g., if mortality is the same in k groups. 7 Properties of test statistic. If H is true then N (t) = k h=1 N h(t) is a univariate counting process with intensity process: λ (t) = k λ h (t) = α(t)y (t) h=1 where α(t) is the common value of the α h (t) s under H. The idea is now to compare the Nelson-Aalen increments a priori: with those under H : dâh(t) = dn h(t) Y h (t) dâ(t) = dn (t) Y (t). 8

3 Properties of test statistic. Formally, follow the approach from the one-sample situation: introduce non-negative weights, K(s), and consider the processes (for h = 1,...,k): Z h (t) = = Note that k h=1 Z h(t) =. ( ) K(s)Y h (s) dâ h (s) dâ(s) K(s)dN h (s) (observed (weighted)) K(s) Y h(s) Y (s) dn (s) (expected (weighted)). We assume that K(s) = whenever Y (s) =. Properties of test statistic. Under H, the Z h (t) s are mean zero martingales. From their predictable (co-)variance processes, we obtain the (co-)variance estimators: σ hj (t) = K 2 (s) Y ( h(s) δ hj Y ) j(s) dn (s). Y (s) Y (s) Introduce Z(t) = (Z 1 (t),...,z k 1 (t)) T and Σ(t) = (σ hj (t)) k 1 h,j=1 and use the quadratic form as test statistic: Z(t) T Σ(t) 1 Z(t) which is χ 2 k 1 under H under suitable regularity conditions (Th. V.2.1). 9 1 General choices: Choices of weights. Nelson-Aalen estimates by tumor thickness. Log-rank (Savage): K(s) = I(Y (s) > ). Gehan-Breslow (Wilcoxon, Kruskal-Wallis): K(s) = Y (s). Tarone-Ware: K(s) = Y (s) ρ. Survival data: Peto-Prentice (Wilcoxon, Kruskal-Wallis): K(s) = ( ) u<s 1 N (u) Y (s) Y (u)+1 Y (s)+1 Harrington-Fleming: K(s) = I(Y (s) > )Ŝ(s ) ρ. The log-rank test is optimal for proportional hazards alternatives. Categories: -2, 2-5, 5+ mm 11 12

4 Melanoma data Compare mortality for tumor thickness groups: below 2mm, 2-5mm, above 5mm. Log-rank test (at t =14 years): X 2 (t) = 31.6, d.f.=2 Harrington-Fleming test (at t =14 years, ρ = 1): X 2 (t) = 33.9, d.f.=2 The two-sample case. When k = 2 the test statistic takes the form X 2 (t) = (Z 1(t)) 2 σ 11 (t) and an alternative, standard normally distributed (under H ), statistic is: U(t) = Note that U(t) has a direction. Z 1(t) σ11 (t). Example: melanoma data, compare males and females. Log-rank: U(t) = 2.54, P =.11, Gehan-Breslow: U(t) = 2.72, P =.65, Peto-Prentice: U(t) = 2.66, P = Nelson-Aalen estimates by sex. Exercise 1. Show that the 2-sample logrank test (i.e., K(s) = I(Y. (s) > )) can be written in the form Z 1 (t) = Y 1 (s)y 2 (s) Y 1 (s) + Y 2 (s) (dâ2(s) dâ1(s)). 2. Show that, under H : α 1 (t) = α 2 (t), Z 1 is a martingale. 3. Find the predictable variation process Z 1 and the variance estimate σ 11 (t). Males: dashed, females: solid 15 16

5 Parametric models. Set-up: N i (t), i = 1,...,n are counting processes with intensity processes λ i (t), i = 1,...,n w.r.t. filtration (F t ). Recall that λ i (t)dt P(dN i (t) = 1 F t ). Examples (mainly survival data) of parametric models (where Y i (t) is at-risk indicator): λ i (t) = αy i (t), constant hazard λ i (t) = αρ(αt) ρ 1 Y i (t), Weibull λ i (t) = θµ i (t)y i (t), relative mortality λ i (t) = (γ + µ i (t))y i (t), excess mortality λ i (t) = α j Y i (t), t j < t t j+1 piecewise constant hazard Likelihood: Jacod s formula. ABGK; p.58. Consider a general parametric model λ i (t) = α i (t; θ)y i (t), where θ = (θ 1,...,θ p ) is the parameter vector. To derive the likelihood for θ, let N = i N i, Y = i Y i and note that P(dN (t) = F t ) 1 λ (t; θ)dt. With τ being the terminal time of observation we have P(data) = <t τ π P(dN 1 (t),...,dn n (t) F t )P(other events at t dn(t), F t ) <t τ π P(dN 1 (t),...,dn n (t) F t ) L(θ) = P(data) = ( Likelihood. π <t τ λ 1 (t, θ) dn 1(t) λ n (t, θ) dn n(t) (1 λ (t, θ)dt) 1 dn (t) n <t τ λ i (t, θ) dni(t) ) exp( λ (u, θ)du). Here, dn i (t) = 1 if N i jumps at t and otherwise. One may show that the scores U j (θ) = θ j log L(θ) are stochastic integrals w.r.t. the counting process martingales when evaluated at the true parameter: Score: log L(θ) = θ j log L(θ) = log λ i (t, θ)dn i (t) λ (t)dt. λ ij (t, θ) τ λ i (t, θ) dn i(t) λ j(t)dt. Use the Doob-Meyer decomposition dn i (t) = λ i (t, θ )dt dm i (t) to show that the score evaluated at θ equals: λ ij (t, θ) log L(θ) = θ j λ i (t, θ) dm i(t). Thereby, using the martingale CLT, asymptotic normality of the scores may be established leading to a proof that the MLE for θ = (θ 1,...,θ q ) enjoys the usual large sample properties. 19 2

6 Sufficiency. In most of the examples above, λ i (t; θ) = α(t; θ)y i (t) in which case the likelihood is (proportional to): L(θ) = ( α(t; θ) dn (t) ) exp( <t τ It follows that (N, Y ) is sufficient for θ. α(u; θ)y (u)du). Constant hazard. (N, Y ) sufficient when the hazard is assumed constant, i.e. λ i (t; α) = αy i (t). If we let R(t) = Y (u)du be the total time at risk in [, t] then L(α) = α N (τ) exp( αr(τ)) and the MLE is the occurrence/exposure rate α = N (τ) R(τ) NOTE: The Nelson-Aalen estimator (and thereby, by properties of the product-integral, also the Kaplan-Meier and Aalen-Johansen estimators, to be defined later) may be given a MLE interpretation. and ŜD( α) = ( 2 α 2 log L( α)) 1 2 = α R(τ) Nelson-Aalen estimates for male and female melanoma patients. Example: melanoma data. Here we have, for females: α = =.356 per year with ŜD =.67 while, for males, we have α = =.689 per year with ŜD =.128. We can compare the two using the approximately χ 2 1 LR statistic which takes the value 6.15 (P =.1). The constant hazard hypothesis may be evaluated using confidence bands for the Nelson-Aalen estimates or by embedding the model in the larger Weibull model. For the Weibull model the estimated shape parameter for females is ρ = 1.197(.22) while for males it is ρ = 1.17(.166) and for both sexes the Wald statistic ( ρ 1)/ŜD is quite insignificant. Males: dashed, females: solid 23 24

7 Nelson-Aalen estimate for male melanoma patients with bands. Piecewise constant hazards. The constant hazard model is often quite restrictive, however, a simple device offers a lot of flexibility, relax the assumption to piecewise constant hazards: λ i (t) = α j Y i (t), t j < t t j+1, where = t < t 1 <... < t k 1 < t k = + are pre-specified interval cut-points. Define occurrence, D j = N (t j+1 ) N (t j ) and exposure, R j = j+1 t j Y (u)du. Then L(α) = ( j α D j j ) exp( j α j R j ), and the MLE simply becomes α j = D j R j, ŜD( α j ) = α j R j, as. indep Example: suicides among Danish non-manual workers. Ex. I.3.3, p.17. Four categories of non-manual workers: 1. academics 2. advanced non-academic training 3. extensive practical training 4. other non-manual workers Suicides and person-years tabulated by age group and work category (excerpt): Suicides Age Group 1 Group 2 Group 3 Group Males Females Person-years Age Group 1 Group 2 Group 3 Group Males Females

8 Example: suicides among Danish non-manual workers. Age-specific suicide rates (e.g., male academics, 2-24, 2/13439=.15; 25-29, 1/52787=.19): Suicide rates (per 1 years) (SD) Age Males Females (1.1) () (.6) 3.6 (1.6) (1.) 4.9 (1.9) (.9) 7.6 (2.9) (1.) 6.4 (2.6) (.9) 4.9 (2.2) (1.1) 4.6 (2.3) (1.3) 9. (3.7) (1.1) 4.1 (2.9) The standardized mortality ratio. Recall the model for relative mortality λ i (t) = θµ i (t)y i (t) where µ i (t) is a known mortality rate, e.g. an age-, sex-, and calendar time specific population mortality for invividual i at time t. If θ = 1 we have N (τ) = µ i (u)y i (u)du + M (τ), and thereby E(N (τ)) = E( n µ i(u)y i (u)du). Therefore, E(τ) = µ i (u)y i (u)du can be thought of as the (under H : θ = 1) expected number of deaths The MLE is θ = N (τ) E(τ), the standardized mortailty ratio with θ ŜD( θ) = E(τ). Example: suicides (continued), let µ i (t) = µsex(i)(t) at age t be defined as occ/exp rates from combined data (treated as known). Then we get for the academics Males: θ = 1.2(.1) Females: θ = 1.57(.25) both marginally significantly (judged from a Wald test) larger than 1. 31

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve

More information

Linear rank statistics

Linear rank statistics Linear rank statistics Comparison of two groups. Consider the failure time T ij of j-th subject in the i-th group for i = 1 or ; the first group is often called control, and the second treatment. Let n

More information

STAT331. Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model

STAT331. Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model STAT331 Combining Martingales, Stochastic Integrals, and Applications to Logrank Test & Cox s Model Because of Theorem 2.5.1 in Fleming and Harrington, see Unit 11: For counting process martingales with

More information

STAT 331. Martingale Central Limit Theorem and Related Results

STAT 331. Martingale Central Limit Theorem and Related Results STAT 331 Martingale Central Limit Theorem and Related Results In this unit we discuss a version of the martingale central limit theorem, which states that under certain conditions, a sum of orthogonal

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

STAT Sample Problem: General Asymptotic Results

STAT Sample Problem: General Asymptotic Results STAT331 1-Sample Problem: General Asymptotic Results In this unit we will consider the 1-sample problem and prove the consistency and asymptotic normality of the Nelson-Aalen estimator of the cumulative

More information

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample

Chapter 7 Fall Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 7 Fall 2012 Chapter 7 Hypothesis testing Hypotheses of interest: (A) 1-sample H 0 : S(t) = S 0 (t), where S 0 ( ) is known survival function,

More information

Log-linearity for Cox s regression model. Thesis for the Degree Master of Science

Log-linearity for Cox s regression model. Thesis for the Degree Master of Science Log-linearity for Cox s regression model Thesis for the Degree Master of Science Zaki Amini Master s Thesis, Spring 2015 i Abstract Cox s regression model is one of the most applied methods in medical

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data

Part III. Hypothesis Testing. III.1. Log-rank Test for Right-censored Failure Time Data 1 Part III. Hypothesis Testing III.1. Log-rank Test for Right-censored Failure Time Data Consider a survival study consisting of n independent subjects from p different populations with survival functions

More information

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation

More information

Lecture 2: Martingale theory for univariate survival analysis

Lecture 2: Martingale theory for univariate survival analysis Lecture 2: Martingale theory for univariate survival analysis In this lecture T is assumed to be a continuous failure time. A core question in this lecture is how to develop asymptotic properties when

More information

( t) Cox regression part 2. Outline: Recapitulation. Estimation of cumulative hazards and survival probabilites. Ørnulf Borgan

( t) Cox regression part 2. Outline: Recapitulation. Estimation of cumulative hazards and survival probabilites. Ørnulf Borgan Outline: Cox regression part 2 Ørnulf Borgan Department of Mathematics University of Oslo Recapitulation Estimation of cumulative hazards and survival probabilites Assumptions for Cox regression and check

More information

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations

Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Hypothesis Testing Based on the Maximum of Two Statistics from Weighted and Unweighted Estimating Equations Takeshi Emura and Hisayuki Tsukuma Abstract For testing the regression parameter in multivariate

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

Survival Analysis Math 434 Fall 2011

Survival Analysis Math 434 Fall 2011 Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup

More information

Session 3 The proportional odds model and the Mann-Whitney test

Session 3 The proportional odds model and the Mann-Whitney test Session 3 The proportional odds model and the Mann-Whitney test 3.1 A unified approach to inference 3.2 Analysis via dichotomisation 3.3 Proportional odds 3.4 Relationship with the Mann-Whitney test Session

More information

β j = coefficient of x j in the model; β = ( β1, β2,

β j = coefficient of x j in the model; β = ( β1, β2, Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)

More information

Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

More information

USING MARTINGALE RESIDUALS TO ASSESS GOODNESS-OF-FIT FOR SAMPLED RISK SET DATA

USING MARTINGALE RESIDUALS TO ASSESS GOODNESS-OF-FIT FOR SAMPLED RISK SET DATA USING MARTINGALE RESIDUALS TO ASSESS GOODNESS-OF-FIT FOR SAMPLED RISK SET DATA Ørnulf Borgan Bryan Langholz Abstract Standard use of Cox s regression model and other relative risk regression models for

More information

DAGStat Event History Analysis.

DAGStat Event History Analysis. DAGStat 2016 Event History Analysis Robin.Henderson@ncl.ac.uk 1 / 75 Schedule 9.00 Introduction 10.30 Break 11.00 Regression Models, Frailty and Multivariate Survival 12.30 Lunch 13.30 Time-Variation and

More information

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox s regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53

More information

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip

TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOZIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississip TESTS FOR LOCATION WITH K SAMPLES UNDER THE KOIOL-GREEN MODEL OF RANDOM CENSORSHIP Key Words: Ke Wu Department of Mathematics University of Mississippi University, MS38677 K-sample location test, Koziol-Green

More information

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

ST495: Survival Analysis: Maximum likelihood

ST495: Survival Analysis: Maximum likelihood ST495: Survival Analysis: Maximum likelihood Eric B. Laber Department of Statistics, North Carolina State University February 11, 2014 Everything is deception: seeking the minimum of illusion, keeping

More information

4. Comparison of Two (K) Samples

4. Comparison of Two (K) Samples 4. Comparison of Two (K) Samples K=2 Problem: compare the survival distributions between two groups. E: comparing treatments on patients with a particular disease. Z: Treatment indicator, i.e. Z = 1 for

More information

Goodness-of-Fit Tests With Right-Censored Data by Edsel A. Pe~na Department of Statistics University of South Carolina Colloquium Talk August 31, 2 Research supported by an NIH Grant 1 1. Practical Problem

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Survival Regression Models

Survival Regression Models Survival Regression Models David M. Rocke May 18, 2017 David M. Rocke Survival Regression Models May 18, 2017 1 / 32 Background on the Proportional Hazards Model The exponential distribution has constant

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Exercises. (a) Prove that m(t) =

Exercises. (a) Prove that m(t) = Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for

More information

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What? You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David

More information

Chapter 7: Hypothesis testing

Chapter 7: Hypothesis testing Chapter 7: Hypothesis testing Hypothesis testing is typically done based on the cumulative hazard function. Here we ll use the Nelson-Aalen estimate of the cumulative hazard. The survival function is used

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

4 Testing Hypotheses. 4.1 Tests in the regression setting. 4.2 Non-parametric testing of survival between groups

4 Testing Hypotheses. 4.1 Tests in the regression setting. 4.2 Non-parametric testing of survival between groups 4 Testing Hypotheses The next lectures will look at tests, some in an actuarial setting, and in the last subsection we will also consider tests applied to graduation 4 Tests in the regression setting )

More information

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

More information

Nonparametric two-sample tests of longitudinal data in the presence of a terminal event

Nonparametric two-sample tests of longitudinal data in the presence of a terminal event Nonparametric two-sample tests of longitudinal data in the presence of a terminal event Jinheum Kim 1, Yang-Jin Kim, 2 & Chung Mo Nam 3 1 Department of Applied Statistics, University of Suwon, 2 Department

More information

Statistical Inference and Methods

Statistical Inference and Methods Department of Mathematics Imperial College London d.stephens@imperial.ac.uk http://stats.ma.ic.ac.uk/ das01/ 31st January 2006 Part VI Session 6: Filtering and Time to Event Data Session 6: Filtering and

More information

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Patrick J. Heagerty PhD Department of Biostatistics University of Washington 1 Biomarkers Review: Cox Regression Model

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information

Examination paper for TMA4275 Lifetime Analysis

Examination paper for TMA4275 Lifetime Analysis Department of Mathematical Sciences Examination paper for TMA4275 Lifetime Analysis Academic contact during examination: Ioannis Vardaxis Phone: 95 36 00 26 Examination date: Saturday May 30 2015 Examination

More information

1 Glivenko-Cantelli type theorems

1 Glivenko-Cantelli type theorems STA79 Lecture Spring Semester Glivenko-Cantelli type theorems Given i.i.d. observations X,..., X n with unknown distribution function F (t, consider the empirical (sample CDF ˆF n (t = I [Xi t]. n Then

More information

STAT331 Lebesgue-Stieltjes Integrals, Martingales, Counting Processes

STAT331 Lebesgue-Stieltjes Integrals, Martingales, Counting Processes STAT331 Lebesgue-Stieltjes Integrals, Martingales, Counting Processes This section introduces Lebesgue-Stieltjes integrals, and defines two important stochastic processes: a martingale process and a counting

More information

Statistical Analysis of Competing Risks With Missing Causes of Failure

Statistical Analysis of Competing Risks With Missing Causes of Failure Proceedings 59th ISI World Statistics Congress, 25-3 August 213, Hong Kong (Session STS9) p.1223 Statistical Analysis of Competing Risks With Missing Causes of Failure Isha Dewan 1,3 and Uttara V. Naik-Nimbalkar

More information

11 Survival Analysis and Empirical Likelihood

11 Survival Analysis and Empirical Likelihood 11 Survival Analysis and Empirical Likelihood The first paper of empirical likelihood is actually about confidence intervals with the Kaplan-Meier estimator (Thomas and Grunkmeier 1979), i.e. deals with

More information

Kernel density estimation in R

Kernel density estimation in R Kernel density estimation in R Kernel density estimation can be done in R using the density() function in R. The default is a Guassian kernel, but others are possible also. It uses it s own algorithm to

More information

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data Outline Frailty modelling of Multivariate Survival Data Thomas Scheike ts@biostat.ku.dk Department of Biostatistics University of Copenhagen Marginal versus Frailty models. Two-stage frailty models: copula

More information

Goodness-of-fit test for the Cox Proportional Hazard Model

Goodness-of-fit test for the Cox Proportional Hazard Model Goodness-of-fit test for the Cox Proportional Hazard Model Rui Cui rcui@eco.uc3m.es Department of Economics, UC3M Abstract In this paper, we develop new goodness-of-fit tests for the Cox proportional hazard

More information

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0,

log T = β T Z + ɛ Zi Z(u; β) } dn i (ue βzi ) = 0, Accelerated failure time model: log T = β T Z + ɛ β estimation: solve where S n ( β) = n i=1 { Zi Z(u; β) } dn i (ue βzi ) = 0, Z(u; β) = j Z j Y j (ue βz j) j Y j (ue βz j) How do we show the asymptotics

More information

Smoothing the Nelson-Aalen Estimtor Biostat 277 presentation Chi-hong Tseng

Smoothing the Nelson-Aalen Estimtor Biostat 277 presentation Chi-hong Tseng Smoothing the Nelson-Aalen Estimtor Biostat 277 presentation Chi-hong seng Reference: 1. Andersen, Borgan, Gill, and Keiding (1993). Statistical Model Based on Counting Processes, Springer-Verlag, p.229-255

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Relative-risk regression and model diagnostics. 16 November, 2015

Relative-risk regression and model diagnostics. 16 November, 2015 Relative-risk regression and model diagnostics 16 November, 2015 Relative risk regression More general multiplicative intensity model: Intensity for individual i at time t is i(t) =Y i (t)r(x i, ; t) 0

More information

Power and Sample Size Calculations with the Additive Hazards Model

Power and Sample Size Calculations with the Additive Hazards Model Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine

More information

Resampling methods for randomly censored survival data

Resampling methods for randomly censored survival data Resampling methods for randomly censored survival data Markus Pauly Lehrstuhl für Mathematische Statistik und Wahrscheinlichkeitstheorie Mathematisches Institut Heinrich-Heine-Universität Düsseldorf Wien,

More information

Empirical Processes & Survival Analysis. The Functional Delta Method

Empirical Processes & Survival Analysis. The Functional Delta Method STAT/BMI 741 University of Wisconsin-Madison Empirical Processes & Survival Analysis Lecture 3 The Functional Delta Method Lu Mao lmao@biostat.wisc.edu 3-1 Objectives By the end of this lecture, you will

More information

Survival Analysis for Case-Cohort Studies

Survival Analysis for Case-Cohort Studies Survival Analysis for ase-ohort Studies Petr Klášterecký Dept. of Probability and Mathematical Statistics, Faculty of Mathematics and Physics, harles University, Prague, zech Republic e-mail: petr.klasterecky@matfyz.cz

More information

Modelling the risk process

Modelling the risk process Modelling the risk process Krzysztof Burnecki Hugo Steinhaus Center Wroc law University of Technology www.im.pwr.wroc.pl/ hugo Modelling the risk process 1 Risk process If (Ω, F, P) is a probability space

More information

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Residuals for the

More information

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky

EMPIRICAL ENVELOPE MLE AND LR TESTS. Mai Zhou University of Kentucky EMPIRICAL ENVELOPE MLE AND LR TESTS Mai Zhou University of Kentucky Summary We study in this paper some nonparametric inference problems where the nonparametric maximum likelihood estimator (NPMLE) are

More information

Survival analysis in R

Survival analysis in R Survival analysis in R Niels Richard Hansen This note describes a few elementary aspects of practical analysis of survival data in R. For further information we refer to the book Introductory Statistics

More information

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require Chapter 5 modelling Semi parametric We have considered parametric and nonparametric techniques for comparing survival distributions between different treatment groups. Nonparametric techniques, such as

More information

Follow this and additional works at: Part of the Applied Mathematics Commons

Follow this and additional works at:   Part of the Applied Mathematics Commons Louisiana State University LSU Digital Commons LSU Master's Theses Graduate School 2004 Cox Regression Model Lindsay Sarah Smith Louisiana State University and Agricultural and Mechanical College, lsmit51@lsu.edu

More information

Semiparametric Regression

Semiparametric Regression Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under

More information

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models

Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored

More information

Beyond GLM and likelihood

Beyond GLM and likelihood Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence

More information

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University Survival Analysis: Weeks 2-3 Lu Tian and Richard Olshen Stanford University 2 Kaplan-Meier(KM) Estimator Nonparametric estimation of the survival function S(t) = pr(t > t) The nonparametric estimation

More information

Multivariate Survival Data With Censoring.

Multivariate Survival Data With Censoring. 1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine Huber-Carol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11-220, 1 Baruch way, 10010 NY.

More information

Faculty of Health Sciences. Cox regression. Torben Martinussen. Department of Biostatistics University of Copenhagen. 20. september 2012 Slide 1/51

Faculty of Health Sciences. Cox regression. Torben Martinussen. Department of Biostatistics University of Copenhagen. 20. september 2012 Slide 1/51 Faculty of Health Sciences Cox regression Torben Martinussen Department of Biostatistics University of Copenhagen 2. september 212 Slide 1/51 Survival analysis Standard setup for right-censored survival

More information

Survival analysis in R

Survival analysis in R Survival analysis in R Niels Richard Hansen This note describes a few elementary aspects of practical analysis of survival data in R. For further information we refer to the book Introductory Statistics

More information

CIMPA SCHOOL, 2007 Jump Processes and Applications to Finance Monique Jeanblanc

CIMPA SCHOOL, 2007 Jump Processes and Applications to Finance Monique Jeanblanc CIMPA SCHOOL, 27 Jump Processes and Applications to Finance Monique Jeanblanc 1 Jump Processes I. Poisson Processes II. Lévy Processes III. Jump-Diffusion Processes IV. Point Processes 2 I. Poisson Processes

More information

Lecture 17: Likelihood ratio and asymptotic tests

Lecture 17: Likelihood ratio and asymptotic tests Lecture 17: Likelihood ratio and asymptotic tests Likelihood ratio When both H 0 and H 1 are simple (i.e., Θ 0 = {θ 0 } and Θ 1 = {θ 1 }), Theorem 6.1 applies and a UMP test rejects H 0 when f θ1 (X) f

More information

A note on the decomposition of number of life years lost according to causes of death

A note on the decomposition of number of life years lost according to causes of death A note on the decomposition of number of life years lost according to causes of death Per Kragh Andersen Research Report 12/2 Department of Biostatistics University of Copenhagen A note on the decomposition

More information

9 Estimating the Underlying Survival Distribution for a

9 Estimating the Underlying Survival Distribution for a 9 Estimating the Underlying Survival Distribution for a Proportional Hazards Model So far the focus has been on the regression parameters in the proportional hazards model. These parameters describe the

More information

Multistate models STK4080 H Competing risk setting 2. Illness-Death setting 3. General event histories and Aalen-Johansen estimator

Multistate models STK4080 H Competing risk setting 2. Illness-Death setting 3. General event histories and Aalen-Johansen estimator Multistate models p. 1/36 Multistate models STK4080 H16 1. Competing risk setting 2. Illness-Death setting 3. General event histories and Aalen-Johansen estimator Multistate models p. 2/36 Multistate models

More information

Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency

Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency Smooth Goodness-of-Fit Tests in Hazard-Based Models by Edsel A. Pe~na Department of Statistics University of South Carolina at Columbia E-Mail: pena@stat.sc.edu Univ. of Virginia Colloquium Talk Department

More information

Philosophy and Features of the mstate package

Philosophy and Features of the mstate package Introduction Mathematical theory Practice Discussion Philosophy and Features of the mstate package Liesbeth de Wreede, Hein Putter Department of Medical Statistics and Bioinformatics Leiden University

More information

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007) FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

More information

Lecture 32: Asymptotic confidence sets and likelihoods

Lecture 32: Asymptotic confidence sets and likelihoods Lecture 32: Asymptotic confidence sets and likelihoods Asymptotic criterion In some problems, especially in nonparametric problems, it is difficult to find a reasonable confidence set with a given confidence

More information

Lecture 7 Time-dependent Covariates in Cox Regression

Lecture 7 Time-dependent Covariates in Cox Regression Lecture 7 Time-dependent Covariates in Cox Regression So far, we ve been considering the following Cox PH model: λ(t Z) = λ 0 (t) exp(β Z) = λ 0 (t) exp( β j Z j ) where β j is the parameter for the the

More information

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina

Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes. Donglin Zeng, Department of Biostatistics, University of North Carolina Support Vector Hazard Regression (SVHR) for Predicting Survival Outcomes Introduction Method Theoretical Results Simulation Studies Application Conclusions Introduction Introduction For survival data,

More information

Survival Analysis: Counting Process and Martingale. Lu Tian and Richard Olshen Stanford University

Survival Analysis: Counting Process and Martingale. Lu Tian and Richard Olshen Stanford University Survival Analysis: Counting Process and Martingale Lu Tian and Richard Olshen Stanford University 1 Lebesgue-Stieltjes Integrals G( ) is a right-continuous step function having jumps at x 1, x 2,.. b f(x)dg(x)

More information

MODELING MISSING COVARIATE DATA AND TEMPORAL FEATURES OF TIME-DEPENDENT COVARIATES IN TREE-STRUCTURED SURVIVAL ANALYSIS

MODELING MISSING COVARIATE DATA AND TEMPORAL FEATURES OF TIME-DEPENDENT COVARIATES IN TREE-STRUCTURED SURVIVAL ANALYSIS MODELING MISSING COVARIATE DATA AND TEMPORAL FEATURES OF TIME-DEPENDENT COVARIATES IN TREE-STRUCTURED SURVIVAL ANALYSIS by Meredith JoAnne Lotz B.A., St. Olaf College, 2004 Submitted to the Graduate Faculty

More information

Estimation for Modified Data

Estimation for Modified Data Definition. Estimation for Modified Data 1. Empirical distribution for complete individual data (section 11.) An observation X is truncated from below ( left truncated) at d if when it is at or below d

More information

Survival Analysis. STAT 526 Professor Olga Vitek

Survival Analysis. STAT 526 Professor Olga Vitek Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9 Survival Data and Survival Functions Statistical analysis of time-to-event data Lifetime of machines and/or parts (called failure time analysis

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

Discrete Multivariate Statistics

Discrete Multivariate Statistics Discrete Multivariate Statistics Univariate Discrete Random variables Let X be a discrete random variable which, in this module, will be assumed to take a finite number of t different values which are

More information

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 4 Fall 2012 4.2 Estimators of the survival and cumulative hazard functions for RC data Suppose X is a continuous random failure time with

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

Turning a research question into a statistical question.

Turning a research question into a statistical question. Turning a research question into a statistical question. IGINAL QUESTION: Concept Concept Concept ABOUT ONE CONCEPT ABOUT RELATIONSHIPS BETWEEN CONCEPTS TYPE OF QUESTION: DESCRIBE what s going on? DECIDE

More information

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

More information

Survival Analysis APTS 2016/17. Ingrid Van Keilegom ORSTAT KU Leuven. Glasgow, August 21-25, 2017

Survival Analysis APTS 2016/17. Ingrid Van Keilegom ORSTAT KU Leuven. Glasgow, August 21-25, 2017 Survival Analysis APTS 2016/17 Ingrid Van Keilegom ORSTAT KU Leuven Glasgow, August 21-25, 2017 Basic What is Survival analysis? Survival analysis (or duration analysis) is an area of statistics that and

More information

Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency

Survival Times (in months) Survival Times (in months) Relative Frequency. Relative Frequency Smooth Goodness-of-Fit Tests in Hazard-Based Models by Edsel A. Pe~na Department of Statistics University of South Carolina at Columbia E-Mail: pena@stat.sc.edu Univ. of Georgia Colloquium Talk Department

More information

Modeling Prediction of the Nosocomial Pneumonia with a Multistate model

Modeling Prediction of the Nosocomial Pneumonia with a Multistate model Modeling Prediction of the Nosocomial Pneumonia with a Multistate model M.Nguile Makao 1 PHD student Director: J.F. Timsit 2 Co-Directors: B Liquet 3 & J.F. Coeurjolly 4 1 Team 11 Inserm U823-Joseph Fourier

More information

Tests of independence for censored bivariate failure time data

Tests of independence for censored bivariate failure time data Tests of independence for censored bivariate failure time data Abstract Bivariate failure time data is widely used in survival analysis, for example, in twins study. This article presents a class of χ

More information

Logistic regression: Miscellaneous topics

Logistic regression: Miscellaneous topics Logistic regression: Miscellaneous topics April 11 Introduction We have covered two approaches to inference for GLMs: the Wald approach and the likelihood ratio approach I claimed that the likelihood ratio

More information

A Comparison of Different Approaches to Nonparametric Inference for Subdistributions

A Comparison of Different Approaches to Nonparametric Inference for Subdistributions A Comparison of Different Approaches to Nonparametric Inference for Subdistributions Johannes Mertsching Johannes.Mertsching@gmail.com Master Thesis Supervision: dr. Ronald B. Geskus prof. Chris A.J. Klaassen

More information