Survival Analysis
DZHW Summer School Workshop
Johannes Giesecke, HU Berlin


Program

Block I: Basics
a) Key concepts (censoring, truncation, analysis time and functions thereof, overview of potential models)
b) Data structure and data management (wide and long data format, handling of dates, st-commands in Stata)

Block II: Some Examples Using Different Models
a) Non-parametric approaches
b) Semi-parametric models
c) Parametric models
d) Outlook

Literature
- Allison, Paul D. (2014). Event History and Survival Analysis. Thousand Oaks: Sage Publications.
- Blossfeld, Hans-Peter/Golsch, Katrin/Rohwer, Götz (2007). Event History Analysis with Stata. Mahwah: Lawrence Erlbaum Associates.
- Box-Steffensmeier, Janet M./Jones, Bradford S. (2004). Event History Modeling. A Guide for Social Scientists. Cambridge: Cambridge University Press.
- Cleves, Mario/Gould, William/Gutierrez, Roberto G./Marchenko, Yulia V. (2010). An Introduction to Survival Analysis Using Stata. College Station: Stata Press.
- Hosmer, David W./Lemeshow, Stanley (1999). Applied Survival Analysis. Regression Modeling of Time to Event Data. New York: John Wiley & Sons.
- Kleinbaum, David G./Klein, Mitchel (2011). Survival Analysis: A Self-Learning Text. Berlin: Springer Verlag.

Basics

Introduction
aim: analysis of the time until a (predefined) event occurs
- analysis of the distribution of events
- investigation of group differences in time-to-event
examples: job search duration, employment duration, entry into first job with permanent contract, time until first substantial pay raise etc.
synonyms: hazard models, event history analysis, models for transition data, time-to-event data, survival-time data, duration data, failure time

Introduction
data sources:
- cross-sectional surveys collecting retrospective information (e.g. German Life History Study)
- panel studies collecting prospective information (e.g. German Socio-Economic Panel, SOEP)
- administrative data (e.g. employment information collected by the Federal Employment Agency)

Alternative Models?
Why not use OLS models (dependent variable: time until an event occurs)?
- general problem: how to deal with censored cases? (but: censored regression models exist)
- more important: non-normally distributed dependent variable and time-varying variables
Why not use logit/probit models (dependent variable: event occurred vs. event did not occur)?
- loss of information (chronology is lost)
- time-varying variables cannot be integrated

Censoring and Truncation
censoring: the event occurs, but the observational unit is not observed at that time
- right censoring: the event occurs after the observational period has ended (observational unit no longer observed)
- left censoring: the event occurred before the observational period started (observational unit not yet observed)
- interval censoring: the event occurred between start and end of the observational period, but only the time interval is known, not the exact timing (observational unit was not observed during that interval)

[Figure: timeline from start to end of the observational period, showing left censoring (event X before the start), interval censoring (event X inside a gap) and right censoring (event after the end)]

Censoring and Truncation
truncation: the observational unit is not observed for a certain time span
- left truncation (delayed entry): the observational unit was at risk before it was actually observed for the first time
- right truncation: the observational unit is no longer observed after the end of the observational period, but the event will eventually occur; effectively not distinguishable from right censoring
- interval truncation (gaps): the observational unit is not observed for a certain time span between start and end of the observational period

[Figure: timeline from start to end of the observational period, showing left truncation, interval truncation (gap) and right truncation, with events marked X]

Continuous vs. Discrete Time
continuous time:
- idea: continuous time axis, no time intervals but time points (infinitely small time intervals)
- basis of (almost) all textbooks on survival analysis
discontinuous (discrete) time:
- grouping of originally continuous time (grouping happens during or after the observation phase)
- often neglected in textbooks, but quite important in practice

Types of Variables
- time-constant variables (e.g. sex, social origin, graduation grade)
- time-varying variables (e.g. number of job applications, type of search strategy); splitting of episodes might be necessary

Analysis Time and Functions thereof
$T$: non-negative random variable denoting the time until an event occurs
1. continuous time
a) $F(t) = \Pr(T \le t)$: cumulative distribution function of $T$
- probability that the event has occurred by time $t$
- monotone, nondecreasing function
- also known as failure function

[Figure: cumulative distribution function $F(t)$ plotted against $t$, Weibull distribution with p=2; marked values $F(0.833)=0.5$ and $F(1.517)=0.9$]

Analysis Time and Functions thereof
b) $S(t) = 1 - F(t) = \Pr(T > t)$: survivor function
- probability that there is no event up to time $t$
- monotone, nonincreasing function
- $S(0) = 1$, $S(t) \to 0$ for $t \to \infty$

[Figure: survivor function $S(t)$ plotted against $t$, Weibull distribution with p=2; marked values $S(0.833)=0.5$ and $S(1.517)=0.1$]

Analysis Time and Functions thereof
c) $f(t)$: density function
- measure of the concentration of events at time $t$
- values are densities, not probabilities
- note: $0 \le f(t)$, but not necessarily $f(t) \le 1$
$f(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t)}{\Delta t} = \frac{\partial F(t)}{\partial t} = -\frac{\partial S(t)}{\partial t}$

[Figure: probability density function $f(t)$ plotted against $t$, Weibull distribution with p=2]

Analysis Time and Functions thereof
d) $h(t)$: hazard function
- measure of the concentration of events at time $t$, conditional upon the subject having survived until $t$
$h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}$
- also known as: hazard rate, intensity rate, failure rate, transition rate, transition intensity, risk function, mortality rate
therefore:
- given a certain level of $f(t)$: the smaller $S(t)$, the higher $h(t)$
- given a certain level of $S(t)$: the higher $f(t)$, the higher $h(t)$

Analysis Time and Functions thereof
characteristics of the hazard function:
- values between 0 (no risk) and $\infty$ (certainty of the event occurring at this particular instant)
- time-constant hazard: conditional upon the event not having occurred until $t$, the risk of the event is the same at every instant
- increasing hazard: increasing risk
- falling hazard: falling risk

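[Figure: hazard function $h(t)$ plotted against $t$, Weibull distribution with p=2; annotated at $t=0.209$ and $t=1.393$, where $f(0.209) = f(1.393) = 0.4$ and $h(t) = f(t)/S(t)$; the values of $S$ and $h$ at these points are not preserved in the transcript]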

Analysis Time and Functions thereof
e) $H(t)$: cumulative hazard function
- measure of the total risk a subject has passed through until time $t$
$H(t) = \int_0^t h(u)\,du = -\ln S(t)$

[Figure: cumulative hazard function $H(t)$ plotted against $t$, Weibull distribution with p=2]
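The five functions above can be checked numerically. The following is a minimal sketch, assuming a Weibull distribution with shape p=2 and a scale parameter of 1 in the form $S(t)=\exp(-\lambda t^p)$; the scale value and this parameterization are assumptions (the slides only state p=2), chosen because they reproduce the values that survive in the figures ($S(0.833)=0.5$, $F(1.517)=0.9$, $f=0.4$ at $t=0.209$ and $t=1.393$).

```python
import math

LAM = 1.0  # assumed scale parameter (not stated on the slides)
P = 2.0    # shape parameter named in the figure captions

def cum_hazard(t):
    """Cumulative hazard H(t) = lam * t**p."""
    return LAM * t ** P

def survivor(t):
    """Survivor function S(t) = exp(-H(t))."""
    return math.exp(-cum_hazard(t))

def cdf(t):
    """Cumulative distribution function F(t) = 1 - S(t)."""
    return 1.0 - survivor(t)

def hazard(t):
    """Hazard h(t) = lam * p * t**(p-1)."""
    return LAM * P * t ** (P - 1)

def density(t):
    """Density f(t) = h(t) * S(t)."""
    return hazard(t) * survivor(t)

# Print all five functions at the time points annotated in the figures.
for t in (0.209, 0.833, 1.393, 1.517):
    print(f"t={t:.3f}  F={cdf(t):.3f}  S={survivor(t):.3f}  "
          f"f={density(t):.3f}  h={hazard(t):.3f}  H={cum_hazard(t):.3f}")
```

The two annotated points of the hazard figure illustrate the statement that, for a given level of f(t), the hazard is higher where S(t) is smaller: both points have the same density, but the later one has far fewer survivors.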

Analysis Time and Functions thereof
if one particular function is known, all other functions can be determined
e.g. if $h(t) = \lambda$ for $t \ge 0$ (constant hazard), then:
$H(t) = \int_0^t h(u)\,du = \lambda t$
$S(t) = \exp(-H(t)) = \exp(-\lambda t)$
$F(t) = 1 - S(t) = 1 - \exp(-H(t)) = 1 - \exp(-\lambda t)$
$f(t) = h(t)\,S(t) = h(t)\exp(-H(t)) = \lambda \exp(-\lambda t)$
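The same chain of identities works for a non-constant hazard. A minimal worked example for the Weibull case, assuming the parameterization $h(t) = \lambda p\,t^{p-1}$ (the slides name the Weibull distribution and its shape $p$ but do not state the parameterization, so this form is an assumption):

```latex
\begin{align*}
h(t) &= \lambda p\, t^{p-1}, \qquad \lambda > 0,\ p > 0 \\
H(t) &= \int_0^t \lambda p\, u^{p-1}\,du = \lambda t^{p} \\
S(t) &= \exp(-H(t)) = \exp(-\lambda t^{p}) \\
F(t) &= 1 - S(t) = 1 - \exp(-\lambda t^{p}) \\
f(t) &= h(t)\,S(t) = \lambda p\, t^{p-1}\exp(-\lambda t^{p})
\end{align*}
```

Setting $p = 1$ recovers the constant-hazard case; $p > 1$ gives an increasing hazard and $p < 1$ a falling hazard.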

[Figure: hazard function $h(t) = \lambda$ (constant) plotted against $t$, Weibull distribution with p=1]

[Figure: cumulative hazard function $H(t) = \int_0^t h(u)\,du = \lambda t$ plotted against $t$, Weibull distribution with p=1]

[Figure: survivor function $S(t) = \exp(-H(t)) = \exp(-\lambda t)$ plotted against $t$, Weibull distribution with p=1]

[Figure: cumulative distribution function $F(t) = 1 - \exp(-H(t)) = 1 - \exp(-\lambda t)$ plotted against $t$, Weibull distribution with p=1]

[Figure: probability density function $f(t) = h(t)\exp(-H(t)) = \lambda\exp(-\lambda t)$ plotted against $t$, Weibull distribution with p=1]

Analysis Time and Functions thereof
2. discontinuous time
idea: $T$ is still a continuous random variable, but the time axis is divided into $k$ disjoint intervals
$[0 = a_0, a_1],\ (a_1, a_2],\ (a_2, a_3],\ \ldots,\ (a_{k-1}, a_k = \infty]$
interval $j$: $(a_{j-1}, a_j]$, where "]" means the bound is included and "(" means it is excluded

Analysis Time and Functions thereof
$F(\cdot)$ and $S(\cdot)$ are still defined as
$F(a_j) = \Pr(T \le a_j)$
- probability that an event occurs up to the end of time interval $j$
$S(a_j) = 1 - F(a_j) = \Pr(T > a_j)$
- probability that no event occurs up to the end of time interval $j$ (i.e., that the subject survives at least until interval $j+1$)

Analysis Time and Functions thereof
$f(\cdot)$: density function of $T$, here a probability function
$f(j) = \Pr(a_{j-1} < T \le a_j) = S(a_{j-1}) - S(a_j)$
- probability of the event occurring in time interval $j$

Analysis Time and Functions thereof
interval hazard defined as:
$h(a_j) = \Pr(a_{j-1} < T \le a_j \mid T > a_{j-1}) = \frac{\Pr(a_{j-1} < T \le a_j)}{\Pr(T > a_{j-1})} = \frac{S(a_{j-1}) - S(a_j)}{S(a_{j-1})} = 1 - \frac{S(a_j)}{S(a_{j-1})}$
- probability of the event occurring in time interval $j$, given that the subject has survived to the beginning of that interval
- conditional probability, i.e. $0 \le h(a_j) \le 1$
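The interval probability and interval hazard can be computed directly from the survivor function at the interval bounds. A minimal sketch, assuming unit-length intervals and a Weibull survivor function of the form $S(t) = \exp(-(\lambda t)^p)$ with $\lambda = 3$; both the parameterization and the value of $\lambda$ are assumptions, chosen because with p=0.5 they reproduce the interval hazards shown on the slides (0.82, 0.51, 0.42).

```python
import math

def survivor(t, lam=3.0, p=0.5):
    """Weibull survivor function S(t) = exp(-(lam*t)**p); lam=3 is an assumption."""
    return math.exp(-((lam * t) ** p))

def interval_functions(cutpoints, S=survivor):
    """Return a list of (j, f_j, h_j) for the intervals (a_{j-1}, a_j]."""
    rows = []
    for j in range(1, len(cutpoints)):
        s_prev, s_curr = S(cutpoints[j - 1]), S(cutpoints[j])
        f_j = s_prev - s_curr          # probability of an event in interval j
        h_j = 1.0 - s_curr / s_prev    # same, conditional on surviving to a_{j-1}
        rows.append((j, f_j, h_j))
    return rows

rows = interval_functions([0, 1, 2, 3])
for j, f_j, h_j in rows:
    print(f"interval {j}: f = {f_j:.2f}, h = {h_j:.2f}")
```

In the first interval f and h coincide because everyone is still at risk; afterwards f shrinks much faster than h, since f refers to all subjects and h only to the survivors.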

[Figure: survivor function $S(t)$ plotted against $t$, Weibull distribution with p=1]

[Figure: survivor function $S(t)$, Weibull distribution with p=1, divided into intervals j=1, 2, 3; $S(j=1) = 0.05$, $f(j=1) = 0.95$, and the interval hazard is 0.95 in each of the three intervals; the remaining values of $S$ and $f$ are not preserved in the transcript]

[Figure: survivor function $S(t)$ plotted against $t$, Weibull distribution with p=0.5]

[Figure: survivor function $S(t)$, Weibull distribution with p=0.5, divided into intervals j=1, 2, 3; $h(j=1) = 0.82$, $f(j=1) = 0.82$, $h(j=2) = 0.51$, $f(j=2) = 0.09$, $h(j=3) = 0.42$; the values of $S$ and of $f(j=3)$ are not preserved in the transcript]

Continuous vs. Discrete Time
empirically, every measurement of time is discontinuous
decision guidance:
a) ratio of the unit of time measurement (days, months, years) to the typical length of episodes (e.g. median, mean)
- the smaller this ratio, the more appropriate it is to use models for continuous time
b) frequency of observational units with the same $T$ (time until an event occurs)
- the smaller the number of ties, the more appropriate it is to use models for continuous time

Overview of Potential Models
we distinguish parametric, semi-parametric and non-parametric models for event history data
parametric and semi-parametric models: (strong) a priori assumptions about the distribution of T and/or the effects of covariates
- parametric models: a priori assumptions about both the distribution of T and the effects of covariates
- semi-parametric models: a priori assumptions about the effects of covariates only
non-parametric models: no a priori assumptions about the distribution of T or about the effects of covariates

Overview of Potential Models
1. parametric and semi-parametric models
models have the general form: $h_i(t) = g(t, \beta_0 + x_i\beta)$
a) parametric models: strong a priori assumptions (distribution of T, effects of covariates)
e.g. so-called proportional hazards models: $h_i(t) = h_0(t)\exp(\beta_0 + x_i\beta)$
- Exponential model, Weibull model, Gompertz model

Overview of Potential Models
some popular parametric models
continuous time:
- Exponential model
- Weibull model
- Log-logistic model
- Log-normal model
- Gompertz model
- Generalized Gamma model
discontinuous time:
- Logistic model
- Complementary log-log model

Overview of Potential Models
b) semi-parametric models: a priori assumptions about the effects of covariates, but not about the distribution of T
e.g. Piecewise Constant Exponential Model:
$h_i(t) = \begin{cases} h_{01}\exp(x_i\beta) & t \in (0, \tau_1] \\ h_{02}\exp(x_i\beta) & t \in (\tau_1, \tau_2] \\ \quad\vdots \\ h_{0K}\exp(x_i\beta) & t \in (\tau_{K-1}, \tau_K] \end{cases}$
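Because the hazard is constant within each interval, the cumulative hazard of the piecewise constant exponential model is a sum of rectangle areas, and the survivor function follows from $S(t) = \exp(-H(t))$. A minimal sketch with made-up interval bounds and baseline hazards (all numbers and function names are illustrative assumptions, not taken from the slides):

```python
import math

def pce_cum_hazard(t, cuts, base_hazards, xb=0.0):
    """Cumulative hazard H(t) for piecewise-constant baseline hazards.

    cuts:         increasing upper bounds tau_1 < ... < tau_K (last may be math.inf)
    base_hazards: baseline hazards h_01 ... h_0K, one per interval
    xb:           linear predictor x_i * beta (0.0 gives the baseline)
    """
    total, lower = 0.0, 0.0
    for upper, h0 in zip(cuts, base_hazards):
        if t <= lower:
            break
        # time spent in this interval, multiplied by its constant hazard
        total += h0 * (min(t, upper) - lower)
        lower = upper
    return total * math.exp(xb)

def pce_survivor(t, cuts, base_hazards, xb=0.0):
    """Survivor function S(t) = exp(-H(t))."""
    return math.exp(-pce_cum_hazard(t, cuts, base_hazards, xb))

# Hypothetical example: monthly time axis, hazard drops after 6 and 12 months.
cuts = [6.0, 12.0, math.inf]
base = [0.10, 0.05, 0.02]
print(pce_cum_hazard(9.0, cuts, base))                 # 6 months at 0.10 plus 3 at 0.05
print(pce_survivor(9.0, cuts, base, xb=math.log(2)))   # subject with doubled hazard
```

The covariates shift the whole step function up or down by the factor $\exp(x_i\beta)$, which is what makes the model a proportional hazards model.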

Overview of Potential Models
popular semi-parametric models
continuous time:
- Piecewise Constant Exponential Model
- Cox Model
discontinuous time:
- Piecewise Constant Logistic Model
- Piecewise Constant Complementary log-log Model

Overview of Potential Models
2. non-parametric models: no a priori assumptions about the effects of covariates or about the distribution of T
continuous time:
- Kaplan-Meier estimator
- Nelson-Aalen estimator
discontinuous time:
- Life table

Data Structure and Data Management

Data Structure
intended data format: for each observational unit, information on begin and end, event or censoring, as well as on potential covariates should be stored
[Table: example records with columns PID, $t_0$, $t_1$, Event, $x_1$, ..., $x_k$; cell values not preserved in the transcript]

Data Structure
this data format allows recording of:
a) right censoring
b) left truncation (delayed entry)
c) interval truncation
d) time-varying covariates
e) multiple events
[Tables: each case is illustrated on its own slide by example records with columns PID, $t_0$, $t_1$, Event, $x_1$, ..., $x_k$; cell values not preserved in the transcript]
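Since the example tables are lost, the following sketch shows what such records might look like. All values are hypothetical, and the helper `summarize` is an illustrative name, not part of any library; the records are assumed to be sorted by person and start time.

```python
# Hypothetical episode records in long format: (pid, t0, t1, event)
episodes = [
    (1, 0, 12, 0),   # person 1: episode split at t=12 (e.g. a covariate changes)
    (1, 12, 20, 1),  # person 1: second episode ends with the event
    (2, 0, 30, 0),   # person 2: right-censored at t=30
    (3, 5, 18, 1),   # person 3: delayed entry at t=5 (left truncation)
    (4, 0, 8, 0),    # person 4: observed until t=8 ...
    (4, 14, 25, 1),  # ... then a gap (interval truncation), event at t=25
]

def summarize(records):
    """Per person: total time at risk, number of events, total time on gaps."""
    summary = {}
    for pid, t0, t1, event in records:
        s = summary.setdefault(
            pid, {"time_at_risk": 0, "events": 0, "gap": 0, "last_t1": None}
        )
        # a gap exists if this episode starts after the previous one ended
        if s["last_t1"] is not None and t0 > s["last_t1"]:
            s["gap"] += t0 - s["last_t1"]
        s["time_at_risk"] += t1 - t0
        s["events"] += event
        s["last_t1"] = t1
    return summary

for pid, s in summarize(episodes).items():
    print(pid, s["time_at_risk"], s["events"], s["gap"])
```

The point of the start-stop layout is that time at risk is always the sum of `t1 - t0` over a person's rows, so gaps and delayed entry are handled without special cases.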

Wide vs. Long Data Format
wide format: one row per observational unit
[Table: columns PID, Begin, End, Event, $x_1$, ..., $x_k$; cell values not preserved in the transcript]

Wide vs. Long Data Format
wide format: one row per observational unit, even if there are multiple episodes
[Table: columns PID, Begin1, End1, Event1, $x_{11}$, ..., $x_{k1}$, Begin2, End2, Event2, ...; cell values not preserved in the transcript]

Wide vs. Long Data Format
long format: multiple rows per observational unit are possible
[Table: columns PID, Begin, End, Event, $x_1$, ..., $x_k$; cell values not preserved in the transcript]

Wide vs. Long Data Format
- long format is usually necessary for survival analysis (because of, for example, interval truncation, time-varying covariates, multiple events)
- but also: data management is much easier than in wide format
- switching from long to wide format (and vice versa) is easy in Stata (command reshape)

Wide vs. Long Data Format
. use "$home\suf97_sample.dta", clear
. keep id_suf-job9best einmon einjahr
. reshape long @manf @janf @mend @jend @lnoc @arve @az @std @best, i(id_suf) j(spell_nr) string
(note: j = job1 job2 job3 job4 job5 job6 job7 job8 job9)

Data                       wide -> long
Number of obs.             (values not preserved in the transcript)
Number of variables        92 -> 21
j variable (9 values)      -> spell_nr
xij variables:
  job1manf job2manf ... job9manf -> manf
  job1janf job2janf ... job9janf -> janf
  job1mend job2mend ... job9mend -> mend
  job1jend job2jend ... job9jend -> jend
  job1lnoc job2lnoc ... job9lnoc -> lnoc
  job1arve job2arve ... job9arve -> arve
  job1az job2az ... job9az -> az
  job1std job2std ... job9std -> std
  job1best job2best ... job9best -> best

Handling of Dates
begin and end of an episode should be stored in date format (better interaction with Stata's st-commands)
conversion of existing date information into date format:
- string-to-numeric conversion functions, e.g. via gen datumsvar = date(stringvar, "DMY")
- date-from-numerical-components functions, e.g. gen datumsvar = mdy(m, d, y)
- see help datetime or [D] datetime for more information

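Handling of Dates
. tab1 manf janf
-> tabulation of manf
[Output: frequency table of manf with the categories filter, k.A., Jan, Feb, März, Apr, Mai, Juni, Juli, Aug, Sep, Okt, Nov, Dez and a total row; columns Freq., Percent, Cum.; the counts are truncated in the transcript]

Handling of Dates
-> tabulation of janf
[Output: frequency table of janf with the categories Filter, k. A. and the individual years, plus a total row; columns Freq., Percent, Cum.; the counts are truncated in the transcript]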

Handling of Dates
. gen beginn = ym(janf, manf)
. format beginn %tm
. gen ende = ym(jend, mend)
. format ende %tm
. list id_suf spell_nr manf janf mend jend beginn ende in 1/30, noobs
[Output: listing of the first 30 records with columns id_suf, spell_nr, manf, janf, mend, jend, beginn, ende; unused spells carry the filter code -2; the cell values are not preserved in the transcript]
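To make the result of ym() less opaque: Stata stores monthly dates as the number of months elapsed since January 1960, and the %tm format only changes how that integer is displayed. A minimal Python sketch of this arithmetic; the function names are illustrative, and the treatment of invalid input (returning None where Stata would return missing, for example for the filter code -2) reflects my understanding of Stata's documented argument ranges rather than anything shown on the slides.

```python
def stata_ym(year, month):
    """Months elapsed since January 1960, mirroring Stata's ym(year, month)."""
    if not (1000 <= year <= 9999 and 1 <= month <= 12):
        return None  # stands in for Stata's missing value
    return (year - 1960) * 12 + (month - 1)

def format_tm(months):
    """Render a monthly date the way the %tm display format does, e.g. 1997m6."""
    year_offset, month0 = divmod(months, 12)
    return f"{1960 + year_offset}m{month0 + 1}"

begin = stata_ym(1997, 6)
print(begin, format_tm(begin))
print(stata_ym(-2, -2))  # filter codes do not form a valid date
```

Because the stored value is a plain month count, the difference between two such variables is directly a duration in months, which is what the st-commands use as analysis time.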

st-commands in Stata
- st-commands in Stata are used to analyse survival-time data
- before using other st-commands, we have to stset the data
- see help st or [ST] st for more information

st-commands in Stata
the stset command has three aims:
- definition of onset of risk, definition of event(s), definition of entry and exit of subjects
- various checks whether statements/data cause problems
- statements are kept for subsequent st-commands; the data do not need to be stset again

stset
stset beginn, failure(firstjob=1) id(persnr) origin(time ende_studium)
- origin(time ende_studium) means that analysis time is beginn - ende_studium; onset of risk is at t = ende_studium
- definition of event: firstjob=1
- no multiple events (single-failure data)
- multiple episodes possible (within persnr)

stset
. stset beginn, failure(firstjob=1) id(id_suf) origin(time ende_studium)

                id:  id_suf
     failure event:  firstjob == 1
obs. time interval:  (beginn[_n-1], beginn]
 exit on or before:  failure
    t for analysis:  (time-origin)
            origin:  time ende_studium

   [lost]  total obs.
   [lost]  event time missing (beginn>=.)              PROBABLE ERROR
      103  multiple records at same instant            PROBABLE ERROR
           (beginn[_n-1]==beginn)
      419  obs. end on or before enter()
     4527  obs. begin on or after (first) failure

   [lost]  obs. remaining, representing
     2367  subjects
     2367  failures in single failure-per-subject data
   [lost]  total analysis time at risk, at risk from t = 0
                               earliest observed entry t = 0
                                    last observed exit t = 74

stset
. list id_suf spell_nr end_study begin firstjob _t0 _t _d _st in 1/30, noobs
[Output: one row per id_suf and spell_nr (job1 to job9) with columns end_st~y, begin, firstjob, _t0, _t, _d, _st; for the first person end_study is 1997m6 and for the second 1997m8; the remaining cell values are truncated in the transcript]

stset
after having stset the data, we can use st-commands, for example
- stdes: description of survival-time data
- stvary: information on whether variables vary within subjects

stdes
. stdes

         failure _d:  ereignis == 1
   analysis time _t:  (beginn-origin)
             origin:  time ende_studium
                 id:  id_suf

[Output: table with the rows no. of subjects, no. of records, (first) entry time, (final) exit time, subjects with gap, time on gap if gap, time at risk, failures and the columns total and per subject mean, min, median, max; preserved values: no. of subjects = 2367, subjects with gap = 0; all other values are lost in the transcript]

stvary
. stvary geschl arve

         failure _d:  ereignis == 1
   analysis time _t:  (beginn-origin)
             origin:  time ende_studium
                 id:  id_suf

[Output: table "subjects for whom the variable is" with the columns constant, varying, never missing, always missing, sometimes missing and one row each for geschl and arve; the counts are lost in the transcript]

Some Examples Using Different Models

Non-parametric Approaches
- no a priori assumptions about the distribution of T or about the effects of covariates
- serve mainly descriptive purposes
- estimation of the survivor function: Kaplan-Meier estimator
- estimation of the cumulative hazard function: Nelson-Aalen estimator
- for both estimators: illustration of group differences
- but also: statistical tests of group differences
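The slides name the two estimators without stating their formulas. In their standard textbook form, with $d_j$ events and $n_j$ subjects at risk at the distinct event times $t_j$, the Kaplan-Meier estimate is $\hat S(t) = \prod_{t_j \le t} (1 - d_j/n_j)$ and the Nelson-Aalen estimate is $\hat H(t) = \sum_{t_j \le t} d_j/n_j$. A minimal sketch on made-up data, assuming no delayed entry and no gaps (the function name and the data are illustrative):

```python
def km_and_na(times, events):
    """Return [(t, n_at_risk, d, S_hat, H_hat)] at each distinct event time.

    times:  observed durations
    events: 1 if the duration ended with the event, 0 if right-censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, cumhaz, out = 1.0, 0.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = c = 0
        # count events (d) and censorings (c) tied at time t
        while i < len(data) and data[i][0] == t:
            d += data[i][1]
            c += 1 - data[i][1]
            i += 1
        if d > 0:
            surv *= 1.0 - d / n_at_risk   # Kaplan-Meier product term
            cumhaz += d / n_at_risk       # Nelson-Aalen increment
            out.append((t, n_at_risk, d, surv, cumhaz))
        # subjects censored at t still count as at risk at t itself
        n_at_risk -= d + c
    return out

# Hypothetical durations (e.g. months until first job); 0 marks censoring.
times = [2, 3, 3, 5, 7, 8]
events = [1, 1, 0, 1, 0, 1]
for row in km_and_na(times, events):
    print(row)
```

Censored cases never lower the survivor estimate themselves; they only shrink the risk set for later event times, which is how the estimators use the partial information such cases carry.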

Non-parametric Approaches
. sts list

         failure _d:  firstjob == 1
   analysis time _t:  (begin-origin)
             origin:  time end_study
                 id:  id_suf

[Output: table with the columns Time, Beg. Total, Fail, Net Lost, Survivor Function, Std. Error, [95% Conf. Int.]; the values are lost in the transcript]

. sts graph, graphregion(color(white))
[Figure: Kaplan-Meier survival estimate; x-axis: analysis time]

Non-parametric Approaches
Question: Does the survivor function differ between groups?
1. description: illustration of group differences
2. statistical tests of group differences

Non-parametric Approaches
1. description: illustration of group differences
- sts graph, by(group)
- all Stata graph twoway options can be used
- see help sts graph and [ST] sts graph for more options

. sts graph, by(hsart) plot1opts(recast(line) lcolor(gs12)) graphregion(color(white))
[Figure: Kaplan-Meier survival estimates by hsart (hsart = 1. FH, hsart = 2. UNI); x-axis: analysis time]

Non-parametric Approaches
2. statistical tests of group differences
- basic idea: compare the number of observed events with the number of events to be expected under $H_0$
- $H_0: h_1(t) = h_2(t) = \dots = h_r(t)$
- calculation of a $\chi^2$-distributed test statistic
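The comparison of observed and expected events can be made concrete with the two-group log-rank statistic in its standard textbook form: at each event time, the expected number of events in group 1 is the total number of events times group 1's share of the risk set. A minimal sketch on made-up data, assuming two groups coded 0/1, no delayed entry, and the usual hypergeometric variance; it is an illustration of the idea, not a reimplementation of Stata's sts test.

```python
def logrank(times, events, groups):
    """Two-group log-rank test. Returns (observed_1, expected_1, chi2)."""
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    o1 = e1 = var = 0.0
    for t in event_times:
        # risk set sizes just before t (overall and in group 1)
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        # events at t (overall and in group 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 1)
        o1 += d1
        e1 += d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = (o1 - e1) ** 2 / var
    return o1, e1, chi2

# Hypothetical data: group 1 experiences the event earlier than group 0.
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
groups = [1, 1, 1, 0, 0, 0]
observed, expected, chi2 = logrank(times, events, groups)
print(observed, expected, chi2)
```

With two groups the statistic has one degree of freedom, so values above roughly 3.84 lead to rejecting $H_0$ at the 5% level.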

Non-parametric Approaches
Stata example 7b: sts test
. sts test hsart

Log-rank test for equality of survivor functions
[Output: table of events observed and events expected by hsart (rows FH, UNI, Total), followed by chi2(1) and Pr>chi2; the values are lost in the transcript]

Non-parametric Approaches
tests of group differences can be stratified to account for the fact that risk might also differ between other groups
- sts test varname, strata(varlist)
- reduces risk of confounding
- option detail displays test results for single strata

. replace geschl=. if geschl<1
(36 real changes made, 36 to missing)
. sts test hsart, strata(geschl) detail

Stratified log-rank test for equality of survivor functions
[Output: three tables of events observed and events expected by hsart (rows FH, UNI, Total), each followed by chi2(1) and Pr>chi2: one for geschl = 1, one for geschl = 2, and one for Total, where expected events are summed over the calculations within geschl. The only test statistic preserved in the transcript is chi2(1) = 3.82, which appears after the geschl = 2 table; all other values are lost]

Semi-parametric Models
a priori assumptions about the effects of covariates, but not about the distribution of T
formal model: $h(t \mid x_i) = h_0(t)\exp(x_i\beta)$
- $h(t \mid x_i)$: hazard at time $t$, given $x_i$
- $h_0(t)$: baseline hazard
- $x_i\beta = x_{1i}\beta_1 + x_{2i}\beta_2 + \dots + x_{Ki}\beta_K$
- the baseline hazard does not need to be specified

Semi-parametric Models
- advantage: no assumption about $h_0(t)$, therefore less error-prone
- disadvantage: less efficient than a (correctly specified) parametric model
- in Stata: stcox
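Why the unspecified baseline hazard is harmless for interpreting coefficients can be seen in one line. Under the model $h(t \mid x_i) = h_0(t)\exp(x_i\beta)$, comparing two subjects who differ only in a binary covariate $x_1$ (a hypothetical example; the other covariates are held fixed) gives:

```latex
\begin{align*}
\frac{h(t \mid x_{1i} = 1)}{h(t \mid x_{1i} = 0)}
  &= \frac{h_0(t)\exp(\beta_1 + x_{2i}\beta_2 + \dots + x_{Ki}\beta_K)}
          {h_0(t)\exp(x_{2i}\beta_2 + \dots + x_{Ki}\beta_K)}
   = \exp(\beta_1)
\end{align*}
```

The baseline hazard cancels, so the ratio is the same at every $t$ (the proportional hazards property). This is the quantity stcox reports in its "Haz. Ratio" column; a value of 1.25, for instance, would mean a 25% higher hazard at every point in time.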

. stcox ib2.hsart ib2.geschl

         failure _d:  firstjob == 1
   analysis time _t:  (begin-origin)
             origin:  time end_study
                 id:  id_suf

Cox regression -- Breslow method for ties

No. of subjects = 2,364                 Number of obs = 2,387
No. of failures = 2,364

[Output: iteration log (iterations 0 to 3 plus refining estimates), time at risk, LR chi2(2), log likelihood and Prob > chi2, followed by a coefficient table with the columns Haz. Ratio, Std. Err., z, P>|z|, [95% Conf. Interval] and the rows hsart 1. FH and geschl 1. männlich; the numeric values are lost in the transcript]

Parametric Models
- a priori assumptions about the effects of covariates and about the distribution of T
- fully parameterized model
two variants of parametrization:
- proportional hazards models (PH models)
- accelerated failure time models (AFT models)

Parametric Models
PH models
formal model: $h(t \mid x_i) = h_0(t)\exp(x_i\beta)$
- $h(t \mid x_i)$: hazard at time $t$, given $x_i$
- $h_0(t)$: baseline hazard, needs to be specified, e.g. $h_0(t) = \exp(\beta_0)$ (Exponential Model)
- the baseline hazard cannot be left unspecified

Parametric Models
AFT models
formal model: $\ln t_i = x_i\beta + \ln\tau_i$
- $x_i\beta = x_{1i}\beta_1 + x_{2i}\beta_2 + \dots + x_{Ki}\beta_K$
- $\tau_i$ follows a certain distribution, e.g. Weibull($\beta_0$, $p$)
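For the Weibull distribution the PH and AFT formulations describe the same model, only with differently scaled coefficients. A short derivation sketch, assuming the PH survivor function has the form $S(t \mid x_i) = \exp(-\exp(x_i\beta^{PH})\,t^{p})$ and that AFT covariates rescale time by $\exp(-x_i\beta^{AFT})$ (the superscripts PH and AFT are notation introduced here, not taken from the slides):

```latex
\begin{align*}
S(t \mid x_i) &= \exp\!\bigl(-\exp(x_i\beta^{PH})\, t^{p}\bigr)
               = \exp\!\Bigl(-\bigl[t\,\exp(x_i\beta^{PH}/p)\bigr]^{p}\Bigr) \\
              &= \exp\!\Bigl(-\bigl[t\,\exp(-x_i\beta^{AFT})\bigr]^{p}\Bigr)
\quad\Longrightarrow\quad \beta^{AFT} = -\frac{\beta^{PH}}{p}
\end{align*}
```

The sign flips because a covariate that raises the hazard shortens the expected time to the event. This equivalence does not hold for every distribution; the log-logistic and log-normal models, for example, only have an AFT form.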

. streg ib2.hsart ib2.geschl, distribution(gompertz)

         failure _d:  firstjob == 1
   analysis time _t:  (begin-origin)
             origin:  time end_study
                 id:  id_suf

Gompertz regression -- log relative-hazard form

No. of subjects = 2,364                 Number of obs = 2,387
No. of failures = 2,364

[Output: iteration log, time at risk, LR chi2(2), log likelihood and Prob > chi2, followed by a coefficient table with the columns Haz. Ratio, Std. Err., z, P>|z|, [95% Conf. Interval] and the rows hsart 1. FH, geschl 1. männlich, _cons and /gamma; the numeric values are lost in the transcript]

Outlook
some important things that we did not talk about:
- interactions between covariates
- time-varying effects of covariates
- multiple events
- recurrent events
- competing risks
- diagnostics for (semi-)parametric models


More information

Estimation for Modified Data

Estimation for Modified Data Definition. Estimation for Modified Data 1. Empirical distribution for complete individual data (section 11.) An observation X is truncated from below ( left truncated) at d if when it is at or below d

More information

Cox s proportional hazards model and Cox s partial likelihood

Cox s proportional hazards model and Cox s partial likelihood Cox s proportional hazards model and Cox s partial likelihood Rasmus Waagepetersen October 12, 2018 1 / 27 Non-parametric vs. parametric Suppose we want to estimate unknown function, e.g. survival function.

More information

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis

7/28/15. Review Homework. Overview. Lecture 6: Logistic Regression Analysis Lecture 6: Logistic Regression Analysis Christopher S. Hollenbeak, PhD Jane R. Schubart, PhD The Outcomes Research Toolbox Review Homework 2 Overview Logistic regression model conceptually Logistic regression

More information

A Survival Analysis of GMO vs Non-GMO Corn Hybrid Persistence Using Simulated Time Dependent Covariates in SAS

A Survival Analysis of GMO vs Non-GMO Corn Hybrid Persistence Using Simulated Time Dependent Covariates in SAS Western Kentucky University From the SelectedWorks of Matt Bogard 2012 A Survival Analysis of GMO vs Non-GMO Corn Hybrid Persistence Using Simulated Time Dependent Covariates in SAS Matt Bogard, Western

More information

Hazard Function, Failure Rate, and A Rule of Thumb for Calculating Empirical Hazard Function of Continuous-Time Failure Data

Hazard Function, Failure Rate, and A Rule of Thumb for Calculating Empirical Hazard Function of Continuous-Time Failure Data Hazard Function, Failure Rate, and A Rule of Thumb for Calculating Empirical Hazard Function of Continuous-Time Failure Data Feng-feng Li,2, Gang Xie,2, Yong Sun,2, Lin Ma,2 CRC for Infrastructure and

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used

More information

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Patrick J. Heagerty PhD Department of Biostatistics University of Washington 1 Biomarkers Review: Cox Regression Model

More information

Survival Analysis. 732G34 Statistisk analys av komplexa data. Krzysztof Bartoszek

Survival Analysis. 732G34 Statistisk analys av komplexa data. Krzysztof Bartoszek Survival Analysis 732G34 Statistisk analys av komplexa data Krzysztof Bartoszek (krzysztof.bartoszek@liu.se) 10, 11 I 2018 Department of Computer and Information Science Linköping University Survival analysis

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

especially with continuous

especially with continuous Handling interactions in Stata, especially with continuous predictors Patrick Royston & Willi Sauerbrei UK Stata Users meeting, London, 13-14 September 2012 Interactions general concepts General idea of

More information

Lecture 5 Models and methods for recurrent event data

Lecture 5 Models and methods for recurrent event data Lecture 5 Models and methods for recurrent event data Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis

Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Statistics 262: Intermediate Biostatistics Non-parametric Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Overview of today s class Kaplan-Meier Curve

More information

Homework Solutions Applied Logistic Regression

Homework Solutions Applied Logistic Regression Homework Solutions Applied Logistic Regression WEEK 6 Exercise 1 From the ICU data, use as the outcome variable vital status (STA) and CPR prior to ICU admission (CPR) as a covariate. (a) Demonstrate that

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

1 The problem of survival analysis

1 The problem of survival analysis 1 The problem of survival analysis Survival analysis concerns analyzing the time to the occurrence of an event. For instance, we have a dataset in which the times are 1, 5, 9, 20, and 22. Perhaps those

More information

Case-control studies

Case-control studies Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark b@bxc.dk http://bendixcarstensen.com Department of Biostatistics, University of Copenhagen, 8 November

More information

Simulating complex survival data

Simulating complex survival data Simulating complex survival data Stata Nordic and Baltic Users Group Meeting 11 th November 2011 Michael J. Crowther 1 and Paul C. Lambert 1,2 1 Centre for Biostatistics and Genetic Epidemiology Department

More information

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What? You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David

More information

Frailty Modeling for clustered survival data: a simulation study

Frailty Modeling for clustered survival data: a simulation study Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ - IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ - IHEC University

More information

Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes.

Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes. Unit 2: Models, Censoring, and Likelihood for Failure-Time Data Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes. Ramón

More information

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE Annual Conference 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Author: Jadwiga Borucka PAREXEL, Warsaw, Poland Brussels 13 th - 16 th October 2013 Presentation Plan

More information

Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis

Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis Stock Sampling with Interval-Censored Elapsed Duration: A Monte Carlo Analysis Michael P. Babington and Javier Cano-Urbina August 31, 2018 Abstract Duration data obtained from a given stock of individuals

More information

****Lab 4, Feb 4: EDA and OLS and WLS

****Lab 4, Feb 4: EDA and OLS and WLS ****Lab 4, Feb 4: EDA and OLS and WLS ------- log: C:\Documents and Settings\Default\Desktop\LDA\Data\cows_Lab4.log log type: text opened on: 4 Feb 2004, 09:26:19. use use "Z:\LDA\DataLDA\cowsP.dta", clear.

More information

The influence of categorising survival time on parameter estimates in a Cox model

The influence of categorising survival time on parameter estimates in a Cox model The influence of categorising survival time on parameter estimates in a Cox model Anika Buchholz 1,2, Willi Sauerbrei 2, Patrick Royston 3 1 Freiburger Zentrum für Datenanalyse und Modellbildung, Albert-Ludwigs-Universität

More information

Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods. MIT , Fall Due: Wednesday, 07 November 2007, 5:00 PM

Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods. MIT , Fall Due: Wednesday, 07 November 2007, 5:00 PM Problem Set 3: Bootstrap, Quantile Regression and MCMC Methods MIT 14.385, Fall 2007 Due: Wednesday, 07 November 2007, 5:00 PM 1 Applied Problems Instructions: The page indications given below give you

More information

Beyond GLM and likelihood

Beyond GLM and likelihood Stat 6620: Applied Linear Models Department of Statistics Western Michigan University Statistics curriculum Core knowledge (modeling and estimation) Math stat 1 (probability, distributions, convergence

More information

A Regression Model For Recurrent Events With Distribution Free Correlation Structure

A Regression Model For Recurrent Events With Distribution Free Correlation Structure A Regression Model For Recurrent Events With Distribution Free Correlation Structure J. Pénichoux(1), A. Latouche(2), T. Moreau(1) (1) INSERM U780 (2) Université de Versailles, EA2506 ISCB - 2009 - Prague

More information

Chapter 17. Failure-Time Regression Analysis. William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University

Chapter 17. Failure-Time Regression Analysis. William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University Chapter 17 Failure-Time Regression Analysis William Q. Meeker and Luis A. Escobar Iowa State University and Louisiana State University Copyright 1998-2008 W. Q. Meeker and L. A. Escobar. Based on the authors

More information

Assessing the Calibration of Dichotomous Outcome Models with the Calibration Belt

Assessing the Calibration of Dichotomous Outcome Models with the Calibration Belt Assessing the Calibration of Dichotomous Outcome Models with the Calibration Belt Giovanni Nattino The Ohio Colleges of Medicine Government Resource Center The Ohio State University Stata Conference -

More information

Statistical Modelling with Stata: Binary Outcomes

Statistical Modelling with Stata: Binary Outcomes Statistical Modelling with Stata: Binary Outcomes Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 21/11/2017 Cross-tabulation Exposed Unexposed Total Cases a b a + b Controls

More information

Semiparametric Regression

Semiparametric Regression Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Reliability Growth in JMP 10

Reliability Growth in JMP 10 Reliability Growth in JMP 10 Presented at Discovery Summit 2012 September 13, 2012 Marie Gaudard and Leo Wright Purpose of Talk The goal of this talk is to provide a brief introduction to: The area of

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

Lecture 12: Effect modification, and confounding in logistic regression

Lecture 12: Effect modification, and confounding in logistic regression Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression

More information

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois

Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 2017, Chicago, Illinois Longitudinal Data Analysis Using Stata Paul D. Allison, Ph.D. Upcoming Seminar: May 18-19, 217, Chicago, Illinois Outline 1. Opportunities and challenges of panel data. a. Data requirements b. Control

More information

Key Words: survival analysis; bathtub hazard; accelerated failure time (AFT) regression; power-law distribution.

Key Words: survival analysis; bathtub hazard; accelerated failure time (AFT) regression; power-law distribution. POWER-LAW ADJUSTED SURVIVAL MODELS William J. Reed Department of Mathematics & Statistics University of Victoria PO Box 3060 STN CSC Victoria, B.C. Canada V8W 3R4 reed@math.uvic.ca Key Words: survival

More information

2. We care about proportion for categorical variable, but average for numerical one.

2. We care about proportion for categorical variable, but average for numerical one. Probit Model 1. We apply Probit model to Bank data. The dependent variable is deny, a dummy variable equaling one if a mortgage application is denied, and equaling zero if accepted. The key regressor is

More information

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata

leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata leebounds: Lee s (2009) treatment effects bounds for non-random sample selection for Stata Harald Tauchmann (RWI & CINCH) Rheinisch-Westfälisches Institut für Wirtschaftsforschung (RWI) & CINCH Health

More information

Analysis of Time-to-Event Data: Chapter 2 - Nonparametric estimation of functions of survival time

Analysis of Time-to-Event Data: Chapter 2 - Nonparametric estimation of functions of survival time Analysis of Time-to-Event Data: Chapter 2 - Nonparametric estimation of functions of survival time Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term

More information

Multistate models and recurrent event models

Multistate models and recurrent event models Multistate models Multistate models and recurrent event models Patrick Breheny December 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Multistate models In this final lecture,

More information

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University

Survival Analysis: Weeks 2-3. Lu Tian and Richard Olshen Stanford University Survival Analysis: Weeks 2-3 Lu Tian and Richard Olshen Stanford University 2 Kaplan-Meier(KM) Estimator Nonparametric estimation of the survival function S(t) = pr(t > t) The nonparametric estimation

More information

Lecture 7: OLS with qualitative information

Lecture 7: OLS with qualitative information Lecture 7: OLS with qualitative information Dummy variables Dummy variable: an indicator that says whether a particular observation is in a category or not Like a light switch: on or off Most useful values:

More information

Reliability Engineering I

Reliability Engineering I Happiness is taking the reliability final exam. Reliability Engineering I ENM/MSC 565 Review for the Final Exam Vital Statistics What R&M concepts covered in the course When Monday April 29 from 4:30 6:00

More information

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies. Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark http://staff.pubhealth.ku.dk/~bxc/ Department of Biostatistics, University of Copengen 11 November 2011

More information

Understanding the Cox Regression Models with Time-Change Covariates

Understanding the Cox Regression Models with Time-Change Covariates Understanding the Cox Regression Models with Time-Change Covariates Mai Zhou University of Kentucky The Cox regression model is a cornerstone of modern survival analysis and is widely used in many other

More information

Monday 7 th Febraury 2005

Monday 7 th Febraury 2005 Monday 7 th Febraury 2 Analysis of Pigs data Data: Body weights of 48 pigs at 9 successive follow-up visits. This is an equally spaced data. It is always a good habit to reshape the data, so we can easily

More information

Longitudinal + Reliability = Joint Modeling

Longitudinal + Reliability = Joint Modeling Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,

More information

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous

Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Rank preserving Structural Nested Distribution Model (RPSNDM) for Continuous Y : X M Y a=0 = Y a a m = Y a cum (a) : Y a = Y a=0 + cum (a) an unknown parameter. = 0, Y a = Y a=0 = Y for all subjects Rank

More information

Nonparametric Model Construction

Nonparametric Model Construction Nonparametric Model Construction Chapters 4 and 12 Stat 477 - Loss Models Chapters 4 and 12 (Stat 477) Nonparametric Model Construction Brian Hartman - BYU 1 / 28 Types of data Types of data For non-life

More information

Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA

Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA Technical Report - 7/87 AN APPLICATION OF COX REGRESSION MODEL TO THE ANALYSIS OF GROUPED PULMONARY TUBERCULOSIS SURVIVAL DATA P. VENKATESAN* K. VISWANATHAN + R. PRABHAKAR* * Tuberculosis Research Centre,

More information

4 Testing Hypotheses. 4.1 Tests in the regression setting. 4.2 Non-parametric testing of survival between groups

4 Testing Hypotheses. 4.1 Tests in the regression setting. 4.2 Non-parametric testing of survival between groups 4 Testing Hypotheses The next lectures will look at tests, some in an actuarial setting, and in the last subsection we will also consider tests applied to graduation 4 Tests in the regression setting )

More information

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016

Lecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016 Statistics 255 - Survival Analysis Presented March 8, 2016 Dan Gillen Department of Statistics University of California, Irvine 12.1 Examples Clustered or correlated survival times Disease onset in family

More information

Mixture modelling of recurrent event times with long-term survivors: Analysis of Hutterite birth intervals. John W. Mac McDonald & Alessandro Rosina

Mixture modelling of recurrent event times with long-term survivors: Analysis of Hutterite birth intervals. John W. Mac McDonald & Alessandro Rosina Mixture modelling of recurrent event times with long-term survivors: Analysis of Hutterite birth intervals John W. Mac McDonald & Alessandro Rosina Quantitative Methods in the Social Sciences Seminar -

More information

Instantaneous geometric rates via Generalized Linear Models

Instantaneous geometric rates via Generalized Linear Models The Stata Journal (yyyy) vv, Number ii, pp. 1 13 Instantaneous geometric rates via Generalized Linear Models Andrea Discacciati Karolinska Institutet Stockholm, Sweden andrea.discacciati@ki.se Matteo Bottai

More information

Econometrics II Censoring & Truncation. May 5, 2011

Econometrics II Censoring & Truncation. May 5, 2011 Econometrics II Censoring & Truncation Måns Söderbom May 5, 2011 1 Censored and Truncated Models Recall that a corner solution is an actual economic outcome, e.g. zero expenditure on health by a household

More information

Tied survival times; estimation of survival probabilities

Tied survival times; estimation of survival probabilities Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation

More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

Fundamentals of Reliability Engineering and Applications

Fundamentals of Reliability Engineering and Applications Fundamentals of Reliability Engineering and Applications E. A. Elsayed elsayed@rci.rutgers.edu Rutgers University Quality Control & Reliability Engineering (QCRE) IIE February 21, 2012 1 Outline Part 1.

More information

Longitudinal and Multilevel Methods for Multinomial Logit David K. Guilkey

Longitudinal and Multilevel Methods for Multinomial Logit David K. Guilkey Longitudinal and Multilevel Methods for Multinomial Logit David K. Guilkey Focus of this talk: Unordered categorical dependent variables Models will be logit based Empirical example uses data from the

More information

Multistate models and recurrent event models

Multistate models and recurrent event models and recurrent event models Patrick Breheny December 6 Patrick Breheny University of Iowa Survival Data Analysis (BIOS:7210) 1 / 22 Introduction In this final lecture, we will briefly look at two other

More information

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure).

7. Assumes that there is little or no multicollinearity (however, SPSS will not assess this in the [binary] Logistic Regression procedure). 1 Neuendorf Logistic Regression The Model: Y Assumptions: 1. Metric (interval/ratio) data for 2+ IVs, and dichotomous (binomial; 2-value), categorical/nominal data for a single DV... bear in mind that

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

Meei Pyng Ng 1 and Ray Watson 1

Meei Pyng Ng 1 and Ray Watson 1 Aust N Z J Stat 444), 2002, 467 478 DEALING WITH TIES IN FAILURE TIME DATA Meei Pyng Ng 1 and Ray Watson 1 University of Melbourne Summary In dealing with ties in failure time data the mechanism by which

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information

Introduction to Reliability Theory (part 2)

Introduction to Reliability Theory (part 2) Introduction to Reliability Theory (part 2) Frank Coolen UTOPIAE Training School II, Durham University 3 July 2018 (UTOPIAE) Introduction to Reliability Theory 1 / 21 Outline Statistical issues Software

More information