Survival models and Cox-regression


1 Survival models and Cox-regression. Steno Diabetes Center Copenhagen, Gentofte, Denmark & Department of Biostatistics, University of Copenhagen. IDEG 2017 training day, Abu Dhabi, 11 December. From /home/bendix/sdc/conf/ideg2017/teach/slides/slides.tex.

2 Section: rates (surv-rate). Senior Statistician, Steno Diabetes Center.

3 Survival data. Persons enter the study at some date. Persons exit at a later date, either dead or alive. Observation: either the actual time span to death ("event") or some time alive ("at least this long").

4 Examples of time-to-event measurements: time from diagnosis of cancer to death; time from randomisation to death in a cancer clinical trial; time from HIV infection to AIDS; time from marriage to first childbirth; time from marriage to divorce; time to re-offending after being released from jail.

5 [Figure: follow-up along calendar time.] Each line is a person, each blob a death. Study ended at 31 Dec.

6 [Figure: follow-up along calendar time.] Persons ordered by date of entry; most likely the order in your database.

7 [Figure] Timescale changed to time since diagnosis.

8 [Figure: time since diagnosis.] Patients ordered by survival time.

9 [Figure: year of follow-up.] Survival times grouped into bands of survival.

10 [Figure: year of follow-up.] Patients ordered by survival status within each band.

11 Survival after cervix cancer. [Table: for each year of follow-up, the columns N, D and L for Stage I and for Stage II.] Estimated risk in year 1 for Stage I women is 5/107.5 = 0.0465. Estimated 1-year survival is 1 - 0.0465 = 0.9535. This is the life-table estimator.

12 Survival function. Persons enter at time 0: date of birth, date of randomisation, date of diagnosis. How long do they survive? The survival time $T$ is a stochastic variable. Its distribution is characterized by the survival function: $S(t) = P\{\text{survival at least till } t\} = P\{T > t\} = 1 - P\{T \le t\} = 1 - F(t)$, where $F(t)$ is the cumulative risk of death before time $t$.

13 Intensity / rate / hazard: same, same. The intensity or hazard function is the probability of an event in an interval, relative to the interval length: $\lambda(t) = P\{\text{event in } (t, t+h] \mid \text{alive at } t\}/h$. It characterizes the distribution of survival times, as does $f$ (density) or $F$ (cumulative distribution). It is the theoretical counterpart of a(n empirical) rate.

14 Rate and survival. $S(t) = \exp\left(-\int_0^t \lambda(s)\,ds\right)$ and $\lambda(t) = -\frac{S'(t)}{S(t)}$. Survival is a cumulative measure, the rate is an instantaneous measure. Note: a cumulative measure requires an origin! ... it is always survival since some timepoint.
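The two formulas on this slide follow from each other in a few steps. A short derivation, under the added assumption (not stated on the slide) that $T$ has a density $f = F'$:

```latex
\begin{align*}
\lambda(t) &= \lim_{h \to 0} \frac{P\{t < T \le t+h \mid T > t\}}{h}
            = \frac{f(t)}{S(t)} = -\frac{S'(t)}{S(t)} = -\frac{d}{dt}\log S(t), \\
\log S(t)  &= \log S(0) - \int_0^t \lambda(s)\,ds = -\int_0^t \lambda(s)\,ds
            \qquad (\text{since } S(0) = 1), \\
S(t)       &= \exp\left(-\int_0^t \lambda(s)\,ds\right).
\end{align*}
```

For a constant rate $\lambda$ this reduces to $S(t) = e^{-\lambda t}$.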

15 Observed survival and rate. Survival studies: observation of a (right-censored) survival time: $X = \min(T, Z)$, $\delta = 1\{X = T\}$, sometimes conditional on $T > t_0$ (left truncation, delayed entry). Epidemiological studies: observation of (components of) a rate: $D/Y$, where $D$ is the number of events and $Y$ the number of person-years, in a prespecified time-frame.

16 Empirical rates for individuals. At the individual level we introduce the empirical rate: $(d, y)$, the number of events ($d \in \{0, 1\}$) during $y$ risk time. A person contributes several observations of $(d, y)$, with associated covariate values. Empirical rates are the responses in survival analysis. The timescale $t$ is a covariate that varies within each individual; $t$ can be age, time since diagnosis or calendar time. Don't confuse it with $y$, the difference between two points on any timescale we may choose.

17 [Figure] Empirical rates by calendar time.

18 [Figure] Empirical rates by time since diagnosis.

19 Statistical inference: likelihood. Two things are needed. Data: what did we actually observe? Follow-up for each person: entry time, exit time, exit status, covariates. Model: how was the data generated? Rates as a function of time: the probability machinery that generated the data. The likelihood is the probability of observing the data, assuming the model is correct. Maximum likelihood estimation is choosing the parameters of the model that make the likelihood maximal.

20 Likelihood from one person. The likelihood from several empirical rates from one individual is a product of conditional probabilities: $P\{\text{event at } t_4 \mid \text{alive at } t_0\} = P\{\text{survive } (t_0, t_1) \mid \text{alive at } t_0\} \times P\{\text{survive } (t_1, t_2) \mid \text{alive at } t_1\} \times P\{\text{survive } (t_2, t_3) \mid \text{alive at } t_2\} \times P\{\text{event at } t_4 \mid \text{alive at } t_3\}$. The log-likelihood from one individual is a sum of terms. Each term refers to one empirical rate $(d, y)$ with $y = t_i - t_{i-1}$ and mostly $d = 0$. $t_i$ is the timescale (covariate).

21 Poisson likelihood. The log-likelihood contributions from follow-up of one individual, $d_t \log(\lambda(t)) - \lambda(t) y_t$, $t = t_1, \ldots, t_n$, are also the log-likelihood from several independent Poisson observations with mean $\lambda(t) y_t$, i.e. log-mean $\log(\lambda(t)) + \log(y_t)$. Analysis of the rates $(\lambda)$ can be based on a Poisson model with log-link applied to the empirical rates, where: $d$ is the response variable; $\log(\lambda)$ is modelled by covariates; $\log(y)$ is the offset variable.
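A minimal sketch of this recipe, using made-up empirical rates (the values of `d` and `y` are hypothetical, chosen only for illustration): an intercept-only Poisson model with `log(y)` as offset should return the overall rate, total events over total risk time.

```r
# Hypothetical empirical rates (d, y): pieces of follow-up from a few persons
emp <- data.frame(d = c(0, 0, 1, 0, 1, 0, 0),
                  y = c(1.0, 1.0, 0.4, 1.0, 0.7, 1.0, 0.5))

# d is the response, log(y) the offset, log(lambda) is the linear predictor
m <- glm(d ~ 1, offset = log(y), family = poisson, data = emp)

rate_glm <- unname(exp(coef(m)))       # rate estimated by the Poisson model
rate_raw <- sum(emp$d) / sum(emp$y)    # D / Y computed directly
print(c(glm = rate_glm, raw = rate_raw))
```

The two numbers agree because the Poisson log-likelihood with a constant rate is $D\log(\lambda) - \lambda Y$, which is maximal at $D/Y$.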

22 Likelihood for follow-up of many persons. Adding empirical rates over the follow-up of persons: $D = \sum d$, $Y = \sum y$, giving the log-likelihood $D\log(\lambda) - \lambda Y$. Persons are assumed independent. Contributions from the same person are conditionally independent, hence give separate contributions to the log-likelihood. It is therefore equivalent to the likelihood for independent Poisson variates. No need to correct for dependent observations; the likelihood is a product.

23 Likelihood. Probability of the data given the parameter: assuming the rate (intensity) is constant, $\lambda$, the probability of observing 7 deaths in the course of 500 person-years is $P\{D = 7, Y = 500 \mid \lambda\} = \lambda^D e^{-\lambda Y} \times K = \lambda^7 e^{-\lambda 500} \times K = L(\lambda \mid \text{data})$. The best guess of $\lambda$ is where this function is as large as possible. The confidence interval is where it is not too far from the maximum.
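The example with 7 deaths in 500 person-years can be worked numerically. The sketch below maximizes the log-likelihood and finds where it drops by half the 95% chi-square quantile below the maximum; using that particular cut-off as "not too far from the maximum" is an assumption here, not something the slide specifies.

```r
D <- 7
Y <- 500

# Log-likelihood for a constant rate, dropping the constant K
loglik <- function(lambda) D * log(lambda) - lambda * Y

# Numerical maximum; analytically this is D / Y
mle <- optimize(loglik, interval = c(1e-4, 0.1), maximum = TRUE,
                tol = 1e-10)$maximum

# Likelihood-ratio based 95% interval: loglik within qchisq(0.95, 1) / 2 of the maximum
cutoff <- loglik(D / Y) - qchisq(0.95, df = 1) / 2
lower <- uniroot(function(l) loglik(l) - cutoff, c(1e-4, D / Y))$root
upper <- uniroot(function(l) loglik(l) - cutoff, c(D / Y, 0.1))$root

print(1000 * c(rate = mle, lower = lower, upper = upper))  # per 1000 PY
```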

24 [Figure] Likelihood function: likelihood against the rate parameter, λ.

25 [Figure] Likelihood function: log-likelihood ratio against the rate parameter, λ (per 1000).

26 Example using R. Poisson likelihood, for one rate, based on 17 events in 843.7 PY:
    library( Epi )
    D <- 17 ; Y <- 843.7
    m1 <- glm( D ~ 1, offset=log(Y/1000), family=poisson )
    ci.exp( m1 )
                exp(Est.) 2.5% 97.5%
    (Intercept)

27 Example using R. Poisson likelihood, two rates, or one rate and RR:
    D <- c(17,28) ; Y <- c(843.7,632.3) ; gg <- factor(0:1)
    m2 <- glm( D ~ gg, offset=log(Y/1000), family=poisson )
    ci.exp( m2 )
                exp(Est.) 2.5% 97.5%
    (Intercept)
    gg1
    m3 <- glm( D ~ gg - 1, offset=log(Y/1000), family=poisson )
    ci.exp( m3 )
                exp(Est.) 2.5% 97.5%
    gg0
    gg1

28 Section: life tables (ltab)

29 Survival analysis. Response variable: time to event, $T$. Censoring time, $Z$. We observe $(\min(T, Z), \delta = 1\{T < Z\})$. This gives time a special status, and mixes the response variable (risk) time with the covariate time(scale). It originates from clinical trials where everyone enters at time 0, and therefore $Y = T - 0 = T$.

30 The life table method. The simplest analysis is by the "life-table method". [Table with columns: interval $i$, alive $n_i$, dead $d_i$, censored $l_i$, and $p_i$; the surviving denominators are $(77 - 2/2)$, $(70 - 4/2)$ and $(59 - 1/2)$.] $p_i = P\{\text{death in interval } i\} = d_i/(n_i - l_i/2)$ and $S(t) = (1 - p_1) \times \cdots \times (1 - p_t)$.
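The life-table estimator is a few lines of arithmetic. In the sketch below the numbers at risk and censored are the ones still readable on the slide; the death counts are assumptions made for illustration (the first two are chosen so that $n_{i+1} = n_i - d_i - l_i$, the third is arbitrary).

```r
n <- c(77, 70, 59)   # alive at start of each interval (from the slide)
l <- c(2, 4, 1)      # censored during each interval (from the slide)
d <- c(5, 7, 8)      # deaths: assumed values, not preserved in the transcript

# Censored persons are counted as at risk for half the interval
p <- d / (n - l / 2)

# Survival to the end of each interval is the product of conditional survivals
S <- cumprod(1 - p)

print(round(data.frame(interval = 1:3, n, d, l, p, S), 4))
```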

31 Population life table, DK. [Table: for men and for women, by age $a$: $S(a)$, $\lambda(a)$ and the expected residual lifetime $E[\ell_{\text{res}}(a)]$.]

32 [Figure] Danish life tables: mortality per 100,000 person-years by age, for men and women, with $\log_2[\text{mortality per } 10^5]$ (ages 40 to 85 years) described as a linear function of age.

33 [Figure] Swedish life tables: mortality per 100,000 person-years by age, for men and women, with $\log_2[\text{mortality per } 10^5]$ (ages 40 to 85 years) described as a linear function of age.

34 Observations for the lifetable. [Figure: Lexis diagram, age axis.] The life table is based on person-years and deaths accumulated in a short period. The age-specific rates are cross-sectional! The survival function $S(t) = e^{-\int_0^t \lambda(a)\,da} = e^{-\sum_0^t \lambda(a)}$ assumes stability of the rates to be interpretable for actual persons.

35 [Figure] Observations for the lifetable: this is a Lexis diagram (age axis, 60 to 65).

37 Life table approach. The population experience: $D$: deaths (events); $Y$: person-years (risk time). The classical lifetable analysis compiles these for prespecified intervals of age, and computes age-specific mortality rates. Data are collected cross-sectionally, but interpreted longitudinally. The rates are the basic building blocks used for construction of: RRs; cumulative measures (survival and risk).

38 Section: Kaplan-Meier (km-na)

39 The Kaplan-Meier method. The most common method of estimating the survival function. A non-parametric method. It divides time into small intervals, where the intervals are defined by the unique times of failure (death). It is based on conditional probabilities, as we are interested in the probability of a subject surviving the next time interval given that they have survived so far.

40 Kaplan-Meier method illustrated (failures and censorings marked). [Figure: cumulative survival probability, starting at 1.0, against time.] The steps are caused by multiplying by $(1 - 1/49)$ and $(1 - 1/46)$ respectively. Late entry can also be dealt with.
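The multiplication of conditional survival probabilities can be carried out by hand and compared with `survfit()`. This is a sketch on the `lung` data from the survival package, with no delayed entry; the helper names (`n_risk`, `n_dead`, `km_hand`) are introduced here for illustration only.

```r
library(survival)

time <- lung$time
dead <- lung$status == 2          # status 2 codes death in the lung data

tt <- sort(unique(time[dead]))    # distinct death times

# Number at risk just before, and number of deaths at, each death time
n_risk <- sapply(tt, function(t) sum(time >= t))
n_dead <- sapply(tt, function(t) sum(time == t & dead))

# Kaplan-Meier: running product of (1 - deaths / at risk)
km_hand <- cumprod(1 - n_dead / n_risk)

# The same estimate from the survival package, read off at the death times
fit <- survfit(Surv(time, dead) ~ 1)
km_pkg <- summary(fit, times = tt)$surv

print(max(abs(km_hand - km_pkg)))
```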

41 Using R: Surv()
    library( survival )
    data( lung )
    head( lung, 3 )
      inst time status age sex ph.ecog ph.karno pat.karno meal.cal wt.loss
    with( lung, Surv( time, status==2 ) )[1:10]
    ( s.km <- survfit( Surv( time, status==2 ) ~ 1, data=lung ) )
    Call: survfit(formula = Surv(time, status == 2) ~ 1, data = lung)
      n events median 0.95LCL 0.95UCL
    plot( s.km )
    abline( v=310, h=0.5, col="red" )

44 Section: Cox regression (cox)

45 The proportional hazards model. $\lambda(t, x) = \lambda_0(t) \times \exp(x'\beta)$. The partial log-likelihood for the regression parameters ($\beta$s) is $\ell(\beta) = \sum_{\text{death times}} \log\left( \frac{e^{x_{\text{death}}'\beta}}{\sum_{i \in R_t} e^{x_i'\beta}} \right)$, where $R_t$ is the risk set at time $t$. This is David Cox's invention. It is extremely efficient from a computational point of view. The baseline hazard $\lambda_0(t)$ is bypassed (profiled out).
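The partial log-likelihood can be evaluated directly from its definition and compared with the value reported by `coxph()`. This sketch uses the `lung` data with age as the only covariate and Breslow's handling of ties (each tied death gets the full risk set as denominator); `partial_loglik` is a helper written here for illustration, not a function of the survival package.

```r
library(survival)

time <- lung$time
dead <- lung$status == 2
x    <- lung$age

# Sum over deaths of: linear predictor of the person dying
# minus log of the sum of exp(linear predictor) over the risk set
partial_loglik <- function(beta) {
  eta <- x * beta
  sum(sapply(which(dead), function(i) {
    eta[i] - log(sum(exp(eta[time >= time[i]])))
  }))
}

fit <- coxph(Surv(time, dead) ~ x, method = "breslow")
b   <- unname(coef(fit))

pl_hand <- partial_loglik(b)   # evaluated by hand at the estimate
pl_pkg  <- fit$loglik[2]       # partial log-likelihood reported by coxph
print(c(by_hand = pl_hand, coxph = pl_pkg))
```

Evaluating `partial_loglik()` at values slightly away from the estimate gives smaller values, as expected at a maximum.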

46 Proportional hazards model. The baseline hazard rate, $\lambda_0(t)$, is the hazard rate when all the covariates are 0. The form of the above equation means that covariates act multiplicatively on the baseline hazard rate. Time is a covariate (albeit modeled in a special way). The baseline hazard is a function of time and thus varies with time. There is no assumption about the shape of the underlying hazard function... but you will never see the shape...

47 Interpreting regression coefficients. If $x_j$ is binary, $\exp(\beta_j)$ is the estimated hazard ratio for subjects corresponding to $x_j = 1$ compared to those where $x_j = 0$. If $x_j$ is continuous, $\exp(\beta_j)$ is the estimated relative increase/decrease in the hazard rate for a unit change in $x_j$. With more than one covariate the interpretation is similar, i.e. $\exp(\beta_j)$ is the hazard ratio for subjects who differ only with respect to covariate $x_j$.

48 Fitting a Cox model in R
    library( survival )
    data( bladder )
    bladder <- subset( bladder, enum<2 )
    head( bladder )
      id rx number size stop event enum

49 Fitting a Cox model in R
    c0 <- coxph( Surv(stop,event) ~ number + size, data=bladder )
    c0
    Call: coxph(formula = Surv(stop, event) ~ number + size, data = bladder)
           coef exp(coef) se(coef) z p
    number
    size
    Likelihood ratio test=7.04 on 2 df, p=
    n= 85, number of events= 47

50 Plotting the base survival in R
    plot( survfit(c0) )
    lines( survfit(c0), conf.int=F, lwd=3 )
Plotting survfit(c0) shows the survival curve for a person with an average covariate value, which is not the average survival for the population considered... and not necessarily meaningful.

52 Plotting the base survival in R. You can plot the survival curve for specific values of the covariates, using the newdata= argument:
    plot( survfit(c0) )
    lines( survfit(c0), conf.int=F, lwd=3 )
    lines( survfit(c0, newdata=data.frame(number=1,size=1)),
           lwd=2, col="limegreen" )
    text( par("usr")[2]*0.98, 1.00, "number=1,size=1",
          col="limegreen", font=2, adj=1 )

53 [Figure] Survival curves for selected covariate values, labelled number=1,size=1; number=4,size=1; and a third curve with number=1 and another value of size.

54 Section: KMCox

55 A look at the Cox model. $\lambda(t, x) = \lambda_0(t) \times \exp(x'\beta)$. A model for the rate as a function of $t$ and $x$. The covariate $t$ has a special status. Computationally, because all individuals contribute to (some of) the range of $t$... it is the scale along which time is split (the risk sets). Conceptually, $t$ is just a covariate that varies within the individual. Cox's approach profiles $\lambda_0(t)$ out of the model.

56 The Cox-likelihood as profile likelihood. One parameter per death time to describe the effect of time (i.e. the chosen timescale): $\log(\lambda(t, x_i)) = \log(\lambda_0(t)) + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} = \alpha_t + \eta_i$. Profile likelihood: derive estimates of $\alpha_t$ as a function of data and $\beta$s, assuming a constant rate between death times. Insert these in the likelihood, which is now only a function of data and $\beta$s. This turns out to be Cox's partial likelihood.

57 The Cox-likelihood: mechanics of computing. The likelihood is computed by summing over the risk sets at each event time $t$: $\ell(\eta) = \sum_t \log\left( \frac{e^{\eta_{\text{death}}}}{\sum_{i \in R_t} e^{\eta_i}} \right)$. This is essentially splitting the follow-up time at event (and censoring) times... repeatedly in every cycle of the iteration... simplified by not keeping track of risk time... but it only works along one time scale.

58 $\log(\lambda(t, x_i)) = \log(\lambda_0(t)) + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} = \alpha_t + \eta_i$. Suppose the time scale has been divided into small intervals with at most one death in each. Empirical rates: $(d_{it}, y_{it})$; each $t$ has at most one $d_{it} \neq 0$. Assume w.l.o.g. that the $y$s in the empirical rates all are 1. The log-likelihood contributions that contain information on a specific time-scale parameter $\alpha_t$ will be from: the (only) empirical rate $(1, 1)$ with the death at time $t$; all other empirical rates $(0, 1)$ from those at risk at time $t$.
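The profiling step can be written out for a single death time $t$. A sketch, assuming as on the slide that all $y$s are 1 and that exactly one person dies at $t$; $\ell_t$ is notation introduced here for the part of the Poisson log-likelihood that involves $\alpha_t$:

```latex
\begin{align*}
\ell_t(\alpha_t, \beta)
  &= (\alpha_t + \eta_{\text{death}}) - \sum_{i \in R_t} e^{\alpha_t + \eta_i}, \\
\frac{\partial \ell_t}{\partial \alpha_t} = 0
  &\;\Longrightarrow\;
  e^{\hat\alpha_t} = \frac{1}{\sum_{i \in R_t} e^{\eta_i}}, \\
\ell_t(\hat\alpha_t, \beta)
  &= \eta_{\text{death}} - \log \sum_{i \in R_t} e^{\eta_i} - 1
   = \log\left( \frac{e^{\eta_{\text{death}}}}{\sum_{i \in R_t} e^{\eta_i}} \right) - 1 .
\end{align*}
```

Summing over death times gives the partial log-likelihood up to an additive constant that does not depend on $\beta$.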

59 Splitting the dataset a priori. The Poisson approach needs a dataset of empirical rates $(d, y)$ with suitably small values of $y$. Each individual contributes many empirical rates (one per risk-set contribution in Cox modelling). From each empirical rate we get: the Poisson response $d$; the risk time $y$, with $\log(y)$ as offset; the covariate value for the timescale (time since entry, current age, current date, ...); other covariates.
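The correspondence between risk-set contributions and empirical rates can be checked numerically. The sketch below does not use the Lexis machinery; it builds one record $(d, 1)$ per person per risk set directly in base R (the names `dat`, `rs` and `death_times` are introduced here), fits a Poisson model with one parameter per death time, and compares with a Cox model using Breslow's tie handling.

```r
library(survival)

dat <- data.frame(time = lung$time,
                  dead = as.numeric(lung$status == 2),
                  age  = lung$age,
                  sex  = lung$sex)

death_times <- sort(unique(dat$time[dat$dead == 1]))

# One empirical rate (d, y = 1) per person for every risk set the person is in
rs <- do.call(rbind, lapply(death_times, function(tt) {
  at_risk <- dat[dat$time >= tt, ]
  data.frame(tt  = tt,
             d   = as.numeric(at_risk$time == tt & at_risk$dead == 1),
             age = at_risk$age,
             sex = at_risk$sex)
}))

# Poisson model: time as a factor (one parameter per death time), no offset since y = 1
pois <- glm(d ~ factor(tt) + age + sex, family = poisson, data = rs)

# Cox model with Breslow ties for comparison
cox <- coxph(Surv(time, dead) ~ age + sex, data = dat, method = "breslow")

cmp <- rbind(cox = coef(cox), poisson = coef(pois)[c("age", "sex")])
print(round(cmp, 6))
```

Setting all risk times to 1 is what makes the agreement exact here; with actual risk times and splits only at death times the estimates would be expected to differ slightly.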

60 Example: Mayo Clinic lung cancer. Survival after lung cancer. Covariates: age at diagnosis, sex, time since diagnosis. Models: Cox model; split data and Poisson model with time as factor; split data and Poisson model with time as spline.

61 [Figure] Mayo Clinic lung cancer, 60 year old woman, by days since diagnosis.

62 Example: Mayo Clinic lung cancer I
    > library( survival )
    > library( Epi )
    > Lung <- Lexis( exit = list( tfe=time ),
    +                exit.status = factor(status,labels=c("Alive","Dead")),
    +                data = lung )
    NOTE: entry.status has been set to "Alive" for all.
    NOTE: entry is assumed to be 0 on the tfe timescale.

63 Example: Mayo Clinic lung cancer II
    > mL.cox <- coxph( Surv( tfe, tfe+lex.dur, lex.Xst=="Dead" ) ~
    +                  age + factor( sex ),
    +                  method="breslow", eps=10^-8, iter.max=25, data=Lung )
    > Lung.s <- splitLexis( Lung,
    +                       breaks=c(0,sort(unique(Lung$time))),
    +                       time.scale="tfe" )
    > Lung.S <- splitLexis( Lung,
    +                       breaks=c(0,sort(unique(Lung$time[Lung$lex.Xst=="Dead"]))),
    +                       time.scale="tfe" )
    > summary( Lung.s )
    Transitions: To From Alive Dead Records: Events: Risk time: Persons: Alive
    > summary( Lung.S )

64 Example: Mayo Clinic lung cancer III
    Transitions: To From Alive Dead Records: Events: Risk time: Persons: Alive
    > subset( Lung.s, lex.id==96 )[,1:11]
      lex.id tfe lex.dur lex.Cst lex.Xst inst time status age sex ph.ecog
      [one row per interval; lex.Cst and lex.Xst are Alive in every row except the last, where lex.Xst is Dead]
    > nlevels( factor( Lung.s$tfe ) )
    [1] 186

65 Example: Mayo Clinic lung cancer IV
    > system.time(
    + mLs.pois.fc <- glm( lex.Xst=="Dead" ~ factor( tfe ) +
    +                     age + factor( sex ),
    +                     offset = log(lex.dur),
    +                     family=poisson, data=Lung.s, eps=10^-8, maxit=25 )
    + )
       user system elapsed
    > length( coef(mLs.pois.fc) )
    [1] 188
    > system.time(
    + mLS.pois.fc <- glm( lex.Xst=="Dead" ~ factor( tfe ) +
    +                     age + factor( sex ),
    +                     offset = log(lex.dur),
    +                     family=poisson, data=Lung.S, eps=10^-8, maxit=25 )
    + )

66 Example: Mayo Clinic lung cancer V
       user system elapsed
    > length( coef(mLS.pois.fc) )
    [1] 142
    > t.kn <- c(0,25,100,500,1000)
    > dim( Ns(Lung.s$tfe,knots=t.kn) )
    [1]
    > system.time(
    + mLs.pois.sp <- glm( lex.Xst=="Dead" ~ Ns( tfe, knots=t.kn ) +
    +                     age + factor( sex ),
    +                     offset = log(lex.dur),
    +                     family=poisson, data=Lung.s, eps=10^-8, maxit=25 )
    + )

67 Example: Mayo Clinic lung cancer VI
       user system elapsed
    > ests <-
    + rbind( ci.exp(mL.cox),
    +        ci.exp(mLs.pois.fc,subset=c("age","sex")),
    +        ci.exp(mLS.pois.fc,subset=c("age","sex")),
    +        ci.exp(mLs.pois.sp,subset=c("age","sex")) )
    > cmp <- cbind( ests[c(1,3,5,7)  ,],
    +               ests[c(1,3,5,7)+1,] )
    > rownames( cmp ) <- c("Cox","Poisson-factor","Poisson-factor (D)","Poisson-spline")
    > colnames( cmp )[c(1,4)] <- c("age","sex")
    > round( cmp, 7 )

68 Example: Mayo Clinic lung cancer VII. [Output table: columns age, 2.5%, 97.5%, sex, 2.5%, 97.5%; rows Cox, Poisson-factor, Poisson-factor (D), Poisson-spline.]

69 [Figure, two panels] Mortality rate per year against days since diagnosis.

71 Deriving the survival function
    > mLs.pois.sp <- glm( lex.Xst=="Dead" ~ Ns( tfe, knots=t.kn ) +
    +                     age + factor( sex ),
    +                     offset = log(lex.dur),
    +                     family=poisson, data=Lung.s, eps=10^-8, maxit=25 )
    > CM <- cbind( 1, Ns( seq(10,1000,10)-5, knots=t.kn ), 60, 1 )
    > lambda <- ci.exp( mLs.pois.sp, ctr.mat=CM )
    > Lambda <- ci.cum( mLs.pois.sp, ctr.mat=CM, intl=10 )[,-4]
    > survP <- exp(-rbind(0,Lambda))
Code and output for the entire example available in

72 What the Cox model really is. Taking the life-table approach ad absurdum by: dividing time very finely, and modeling one covariate, the time-scale, with one parameter per distinct value. The model for the time scale is really with exchangeable time-intervals. This makes it difficult to access the baseline hazard (which looks terrible), and the uninitiated are tempted to show survival curves where they are irrelevant.

73 Models of this world. Replace the $\alpha_t$s by a parametric function $f(t)$ with a limited number of parameters, for example: piecewise constant; splines (linear, quadratic or cubic); fractional polynomials. The two latter bring the model into "this world": smoothly varying rates; a parametric closed-form representation of the baseline hazard; a finite number of parameters. This makes it really easy to use rates directly in calculations of expected residual life time, state occupancy probabilities in multistate models, ...

74 Section: crv-mod

75 Testis cancer. Testis cancer in Denmark:
    > options( show.signif.stars=FALSE )
    > library( Epi )
    > data( testisDK )
    > str( testisDK )
    'data.frame': 4860 obs. of 4 variables: $ A: num $ P: num $ D: num $ Y: num
    > head( testisDK )
      A P D Y

76 Cases, PY and rates
    > stat.table( list(A=floor(A/10)*10,
    +                  P=floor(P/10)*10),
    +             list( D=sum(D),
    +                   Y=sum(Y/1000),
    +                   rate=ratio(D,Y,10^5) ),
    +             margins=TRUE, data=testisDK )
    [Output table: age classes A by period classes P, with Total margins.]

77 Linear effects in glm. How do rates depend on age?
    > ml <- glm( D ~ A, offset=log(Y), family=poisson, data=testisDK )
    > round( ci.lin( ml ), 4 )
                Estimate StdErr z P 2.5% 97.5%
    (Intercept)
    A
    > round( ci.exp( ml ), 4 )
                exp(Est.) 2.5% 97.5%
    (Intercept)
    A
Linear increase of log-rates by age.

78 Linear effects in glm
    > nd <- data.frame( A=15:60, Y=10^5 )
    > pr <- ci.pred( ml, newdata=nd )
    > head( pr )
      Estimate 2.5% 97.5%
    > matplot( nd$A, pr,
    +          type="l", lty=1, lwd=c(3,1,1), col="black", log="y" )

79 Linear effects in glm
    > round( ci.lin( ml ), 4 )
                Estimate StdErr z P 2.5% 97.5%
    (Intercept)
    A
    > Cl <- cbind( 1, nd$A )
    > head( Cl )
         [,1] [,2]
    [1,]    1   15
    [2,]    1   16
    [3,]    1   17
    [4,]    1   18
    [5,]    1   19
    [6,]    1   20
    > matplot( nd$A, ci.exp( ml, ctr.mat=Cl ),
    +          type="l", lty=1, lwd=c(3,1,1), col="black", log="y" )

80 Linear effects in glm. [Figure: pr against nd$A.]
    > matplot( nd$A, pr,
    +          type="l", lty=1, lwd=c(3,1,1), col="black", log="y" )

81 Linear effects in glm. [Figure: ci.exp(ml, ctr.mat = Cl) * 10^5 against nd$A.]
    > matplot( nd$A, ci.exp( ml, ctr.mat=Cl )*10^5,
    +          type="l", lty=1, lwd=c(3,1,1), col="black", log="y" )

82 Quadratic effects in glm. How do rates depend on age?
    > mq <- glm( D ~ A + I(A^2),
    +            offset=log(Y), family=poisson, data=testisDK )
    > round( ci.lin( mq ), 4 )
                Estimate StdErr z P 2.5% 97.5%
    (Intercept)
    A
    I(A^2)
    > round( ci.exp( mq ), 4 )
                exp(Est.) 2.5% 97.5%
    (Intercept)
    A
    I(A^2)

83 Quadratic effect in glm
    > round( ci.lin( mq ), 4 )
                Estimate StdErr z P 2.5% 97.5%
    (Intercept)
    A
    I(A^2)
    > Cq <- cbind( 1, 15:60, (15:60)^2 )
    > head( Cq, 4 )
    > matplot( nd$A, ci.exp( mq, ctr.mat=Cq )*10^5,
    +          type="l", lty=1, lwd=c(3,1,1), col="black", log="y" )

84 Quadratic effect in glm. [Figure: ci.exp(mq, ctr.mat = Cq) * 10^5 against nd$A, with the linear fit added in blue.]
    > matplot( nd$A, ci.exp( mq, ctr.mat=Cq )*10^5,
    +          type="l", lty=1, lwd=c(3,1,1), col="black", log="y" )
    > matlines( nd$A, ci.exp( ml, ctr.mat=Cl )*10^5,
    +           type="l", lty=1, lwd=c(3,1,1), col="blue" )

85 Spline effects in glm
    > library( splines )
    > ms <- glm( D ~ Ns(A,knots=seq(15,65,10)),
    +            offset=log(Y), family=poisson, data=testisDK )
    > round( ci.exp( ms ), 3 )
      exp(Est.) 2.5% 97.5%
      [rows: (Intercept) and the five Ns(A, knots = seq(15, 65, 10)) terms]
    > aa <- 15:65
    > As <- Ns( aa, knots=seq(15,65,10) )
    > head( As )

86 Spline effects in glm. [Figure: testis cancer incidence rate per 100,000 PY against age.]
    > matplot( aa, ci.exp( ms, ctr.mat=cbind(1,As) )*10^5,
    +          log="y", xlab="Age", ylab="Testis cancer incidence rate per 100,000 PY",
    +          type="l", lty=1, lwd=c(3,1,1), col="black", ylim=c(2,20) )
    > matlines( nd$A, ci.exp( mq, ctr.mat=Cq )*10^5,
    +           type="l", lty=1, lwd=c(3,1,1), col="blue" )

87 Adding a linear period effect
    > msp <- glm( D ~ Ns(A,knots=seq(15,65,10)) + P,
    +             offset=log(Y), family=poisson, data=testisDK )
    > round( ci.lin( msp ), 3 )
      Estimate StdErr z P 2.5% 97.5%
      [rows: (Intercept), the five Ns(A, knots = seq(15, 65, 10)) terms, and P]
    > Ca <- cbind( 1, Ns( aa, knots=seq(15,65,10) ), 1970 )
    > head( Ca )

88 Adding a linear period effect. [Figure: testis cancer incidence rate per 100,000 PY in 1970 against age.]
    > matplot( aa, ci.exp( msp, ctr.mat=Ca )*10^5,
    +          log="y", xlab="Age",
    +          ylab="Testis cancer incidence rate per 100,000 PY in 1970",
    +          type="l", lty=1, lwd=c(3,1,1), col="black", ylim=c(2,20) )

89 Adding a linear period effect. [Same figure, with the age-only spline model added in blue.]
    > matlines( nd$A, ci.pred( ms, newdata=nd ),
    +           type="l", lty=1, lwd=c(3,1,1), col="blue" )

90 The period effect
    > round( ci.lin( msp ), 3 )
      Estimate StdErr z P 2.5% 97.5%
      [rows: (Intercept), the five Ns(A, knots = seq(15, 65, 10)) terms, and P]
    > pp <- seq(1945,1995,0.2)
    > Cp <- cbind( pp ) - 1970
    > head( Cp )

91 Period effect. [Figure: testis cancer incidence RR against date.]
    > matplot( pp, ci.exp( msp, subset="P", ctr.mat=Cp ),
    +          log="y", ylim=c(0.5,2), xlab="Date",
    +          ylab="Testis cancer incidence RR",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )
    > abline( h=1, v=1970 )

92 A quadratic period effect
    > mspq <- glm( D ~ Ns(A,knots=seq(15,65,10)) + P + I(P^2),
    +              offset=log(Y), family=poisson, data=testisDK )
    > round( ci.exp( mspq ), 3 )
      exp(Est.) 2.5% 97.5%
      [rows: (Intercept), the five Ns(A, knots = seq(15, 65, 10)) terms, P and I(P^2)]
    > Cq <- cbind( pp-1970, pp^2-1970^2 )
    > head( Cq )

93 A quadratic period effect. [Figure: testis cancer incidence RR against date.]
    > matplot( pp, ci.exp( mspq, subset="P", ctr.mat=Cq ),
    +          log="y", ylim=c(0.5,2), xlab="Date",
    +          ylab="Testis cancer incidence RR",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )
    > abline( h=1, v=1970 )

94 A spline period effect. Because we have the age-effect with the rate dimension, the period effect is a RR.
    > msps <- glm( D ~ Ns(A,knots=seq(15,65,10)) +
    +                  Ns(P,knots=seq(1950,1990,10),ref=1970),
    +              offset=log(Y), family=poisson, data=testisDK )
    > round( ci.exp( msps ), 3 )
      exp(Est.) 2.5% 97.5%
      [rows: (Intercept), the five Ns(A, knots = seq(15, 65, 10)) terms, and the four Ns(P, knots = seq(1950, 1990, 10), ref = 1970) terms]

95 A spline period effect
    > Cp <- Ns( pp, knots=seq(1950,1990,10), ref=1970 )
    > head( Cp, 4 )
    > ci.exp( msps, subset="P" )
      exp(Est.) 2.5% 97.5%
      [rows: the four Ns(P, knots = seq(1950, 1990, 10), ref = 1970) terms]
    > matplot( pp, ci.exp( msps, subset="P", ctr.mat=Cp ),
    +          log="y", ylim=c(0.5,2), xlab="Date",
    +          ylab="Testis cancer incidence RR",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )

96 Period effect. [Figure: testis cancer incidence RR against date.]
    > matplot( pp, ci.exp( msps, subset="P", ctr.mat=Cp ),
    +          log="y", ylim=c(0.5,2), xlab="Date",
    +          ylab="Testis cancer incidence RR",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )
    > abline( h=1, v=1970 )

97 Period effect
    > par( mfrow=c(1,2) )
    > matplot( aa, ci.pred( msps, newdata=data.frame(A=aa,P=1970,Y=10^5) ),
    +          log="y", xlab="Age",
    +          ylab="Testis cancer incidence rate per 100,000 PY in 1970",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )
    > matplot( pp, ci.exp( msps, subset="P", ctr.mat=Cp ),
    +          log="y", xlab="Date", ylab="Testis cancer incidence RR",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )
    > abline( h=1, v=1970 )

98 Age and period effect. [Figure, two panels: testis cancer incidence rate per 100,000 PY in 1970 against age; testis cancer incidence RR against date.]

99 Period effect
    > par( mfrow=c(1,2) )
    > matplot( aa, ci.pred( msps, newdata=data.frame(A=aa,P=1970,Y=10^5) ),
    +          log="y", xlab="Age",
    +          ylim=c(2,20), xlim=c(15,65),
    +          ylab="Testis cancer incidence rate per 100,000 PY in 1970",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )
    > matplot( pp, ci.exp( msps, subset="P", ctr.mat=Cp ),
    +          log="y", xlab="Date",
    +          ylim=c(2,20)/sqrt(2*20), xlim=c(15,65)+1930,
    +          ylab="Testis cancer incidence RR",
    +          type="l", lty=1, lwd=c(3,1,1), col="black" )
    > abline( h=1, v=1970 )

100 Age and period effect. [Figure, two panels with matching axis extents: testis cancer incidence rate per 100,000 PY in 1970 against age; testis cancer incidence RR against date.]

101 Age and period effect with ci.exp. In rate models there is always one term with the rate dimension, usually age. But it must refer to a specific reference value for all other variables (P). All parameters must be used in computing rates, at some reference value(s). For the other variables, report the RR relative to some reference point. Only the parameters relevant for the variable (P) are used. The contrast matrix is a difference between (splines at) the prediction points and the reference point.
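The last point can be sketched without the Epi package. The example below uses simulated data (all numbers are made up) and `splines::ns` in place of `Ns`; the contrast matrix `C` is the spline basis at the prediction points minus the basis at the reference point, and only the period parameters enter the RR.

```r
library(splines)
set.seed(1)

# Simulated follow-up: age A, period P, person-years Y, events D (hypothetical)
dat <- data.frame(A = runif(2000, 15, 65),
                  P = runif(2000, 1945, 1995),
                  Y = rexp(2000, rate = 1 / 5000))
dat$D <- rpois(2000, dat$Y * 1e-4 * exp(0.02 * (dat$P - 1970)))

bP <- ns(dat$P, df = 3)   # spline basis for period
m  <- glm(D ~ ns(A, df = 3) + bP, offset = log(Y), family = poisson, data = dat)

pp  <- seq(1945, 1995, 5)  # prediction points
ref <- 1970                # reference point

# Contrast matrix: basis at prediction points minus basis at the reference
C <- sweep(unclass(predict(bP, pp)), 2, as.vector(predict(bP, ref)))

idx <- grep("^bP", names(coef(m)))   # period parameters only
est <- drop(C %*% coef(m)[idx])
se  <- sqrt(pmax(0, diag(C %*% vcov(m)[idx, idx] %*% t(C))))

RR <- exp(cbind(RR = est, lo = est - 1.96 * se, hi = est + 1.96 * se))
rownames(RR) <- pp
print(round(RR, 3))
```

By construction the RR is 1 at the reference point, with a confidence interval of zero width there.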


More information

Survival Analysis I (CHL5209H)

Survival Analysis I (CHL5209H) Survival Analysis Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca January 7, 2015 31-1 Literature Clayton D & Hills M (1993): Statistical Models in Epidemiology. Not really

More information

Lecture 9. Statistics Survival Analysis. Presented February 23, Dan Gillen Department of Statistics University of California, Irvine

Lecture 9. Statistics Survival Analysis. Presented February 23, Dan Gillen Department of Statistics University of California, Irvine Statistics 255 - Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 9.1 Survival analysis involves subjects moving through time Hazard may

More information

Survival analysis in R

Survival analysis in R Survival analysis in R Niels Richard Hansen This note describes a few elementary aspects of practical analysis of survival data in R. For further information we refer to the book Introductory Statistics

More information

Simulation of multistate models with multiple timescales: simlexis in the Epi package

Simulation of multistate models with multiple timescales: simlexis in the Epi package Simulation of multistate models with multiple timescales: simlexis in the Epi package SDC Tuesday 8 th August, 2017 http://bendixcarstensen.com/epi/simlexis.pdf Version 2.3 Compiled Tuesday 8 th August,

More information

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Patrick J. Heagerty PhD Department of Biostatistics University of Washington 1 Biomarkers Review: Cox Regression Model

More information

Case-control studies

Case-control studies Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark b@bxc.dk http://bendixcarstensen.com Department of Biostatistics, University of Copenhagen, 8 November

More information

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require Chapter 5 modelling Semi parametric We have considered parametric and nonparametric techniques for comparing survival distributions between different treatment groups. Nonparametric techniques, such as

More information

Survival analysis in R

Survival analysis in R Survival analysis in R Niels Richard Hansen This note describes a few elementary aspects of practical analysis of survival data in R. For further information we refer to the book Introductory Statistics

More information

Lecture 7. Proportional Hazards Model - Handling Ties and Survival Estimation Statistics Survival Analysis. Presented February 4, 2016

Lecture 7. Proportional Hazards Model - Handling Ties and Survival Estimation Statistics Survival Analysis. Presented February 4, 2016 Proportional Hazards Model - Handling Ties and Survival Estimation Statistics 255 - Survival Analysis Presented February 4, 2016 likelihood - Discrete Dan Gillen Department of Statistics University of

More information

Survival Analysis. 732G34 Statistisk analys av komplexa data. Krzysztof Bartoszek

Survival Analysis. 732G34 Statistisk analys av komplexa data. Krzysztof Bartoszek Survival Analysis 732G34 Statistisk analys av komplexa data Krzysztof Bartoszek (krzysztof.bartoszek@liu.se) 10, 11 I 2018 Department of Computer and Information Science Linköping University Survival analysis

More information

The coxvc_1-1-1 package

The coxvc_1-1-1 package Appendix A The coxvc_1-1-1 package A.1 Introduction The coxvc_1-1-1 package is a set of functions for survival analysis that run under R2.1.1 [81]. This package contains a set of routines to fit Cox models

More information

Introduction to linear algebra with

Introduction to linear algebra with Introduction to linear algebra with March 2015 http://bendixcarstensen/sdc/r-course Version 5.2 Compiled Thursday 26 th March, 2015, 11:31 from: /home/bendix/teach/apc/00/linearalgebra/linalg-notes-bxc

More information

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies. Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark http://staff.pubhealth.ku.dk/~bxc/ Department of Biostatistics, University of Copengen 11 November 2011

More information

Multistate example from Crowther & Lambert with multiple timescales

Multistate example from Crowther & Lambert with multiple timescales Multistate example from Crowther & Lambert with multiple timescales SDCC http://bendixcarstensen.com/advcoh March 2018 Version 6 Compiled Sunday 11 th March, 2018, 14:36 from: /home/bendix/teach/advcoh/00/examples/bcms/bcms.tex

More information

Case-control studies C&H 16

Case-control studies C&H 16 Case-control studies C&H 6 Bendix Carstensen Steno Diabetes Center & Department of Biostatistics, University of Copenhagen bxc@steno.dk http://bendixcarstensen.com PhD-course in Epidemiology, Department

More information

Instantaneous geometric rates via Generalized Linear Models

Instantaneous geometric rates via Generalized Linear Models The Stata Journal (yyyy) vv, Number ii, pp. 1 13 Instantaneous geometric rates via Generalized Linear Models Andrea Discacciati Karolinska Institutet Stockholm, Sweden andrea.discacciati@ki.se Matteo Bottai

More information

Methodological challenges in research on consequences of sickness absence and disability pension?

Methodological challenges in research on consequences of sickness absence and disability pension? Methodological challenges in research on consequences of sickness absence and disability pension? Prof., PhD Hjelt Institute, University of Helsinki 2 Two methodological approaches Lexis diagrams and Poisson

More information

β j = coefficient of x j in the model; β = ( β1, β2,

β j = coefficient of x j in the model; β = ( β1, β2, Regression Modeling of Survival Time Data Why regression models? Groups similar except for the treatment under study use the nonparametric methods discussed earlier. Groups differ in variables (covariates)

More information

Descriptive Epidemiology (a.k.a. Disease Reality)

Descriptive Epidemiology (a.k.a. Disease Reality) Descriptive Epidemiology (a.k.a. Disease Reality) SDCC March 2018 http://bendixcarstensen.com/sdc/daf Version 8 Compiled Sunday 18 th March, 2018, 21:14 from: /home/bendix/sdc/proj/daffodil/disreal/desepi.tex

More information

Building a Prognostic Biomarker

Building a Prognostic Biomarker Building a Prognostic Biomarker Noah Simon and Richard Simon July 2016 1 / 44 Prognostic Biomarker for a Continuous Measure On each of n patients measure y i - single continuous outcome (eg. blood pressure,

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE Annual Conference 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Author: Jadwiga Borucka PAREXEL, Warsaw, Poland Brussels 13 th - 16 th October 2013 Presentation Plan

More information

Lecture 11. Interval Censored and. Discrete-Time Data. Statistics Survival Analysis. Presented March 3, 2016

Lecture 11. Interval Censored and. Discrete-Time Data. Statistics Survival Analysis. Presented March 3, 2016 Statistics 255 - Survival Analysis Presented March 3, 2016 Motivating Dan Gillen Department of Statistics University of California, Irvine 11.1 First question: Are the data truly discrete? : Number of

More information

Multivariable Fractional Polynomials

Multivariable Fractional Polynomials Multivariable Fractional Polynomials Axel Benner September 7, 2015 Contents 1 Introduction 1 2 Inventory of functions 1 3 Usage in R 2 3.1 Model selection........................................ 3 4 Example

More information

Chapter 4 Regression Models

Chapter 4 Regression Models 23.August 2010 Chapter 4 Regression Models The target variable T denotes failure time We let x = (x (1),..., x (m) ) represent a vector of available covariates. Also called regression variables, regressors,

More information

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?

You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What? You know I m not goin diss you on the internet Cause my mama taught me better than that I m a survivor (What?) I m not goin give up (What?) I m not goin stop (What?) I m goin work harder (What?) Sir David

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Lecture 8 Stat D. Gillen

Lecture 8 Stat D. Gillen Statistics 255 - Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 8.1 Example of two ways to stratify Suppose a confounder C has 3 levels

More information

Analysing geoadditive regression data: a mixed model approach

Analysing geoadditive regression data: a mixed model approach Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression

More information

Introduction to cthmm (Continuous-time hidden Markov models) package

Introduction to cthmm (Continuous-time hidden Markov models) package Introduction to cthmm (Continuous-time hidden Markov models) package Abstract A disease process refers to a patient s traversal over time through a disease with multiple discrete states. Multistate models

More information

Multivariable Fractional Polynomials

Multivariable Fractional Polynomials Multivariable Fractional Polynomials Axel Benner May 17, 2007 Contents 1 Introduction 1 2 Inventory of functions 1 3 Usage in R 2 3.1 Model selection........................................ 3 4 Example

More information

cor(dataset$measurement1, dataset$measurement2, method= pearson ) cor.test(datavector1, datavector2, method= pearson )

cor(dataset$measurement1, dataset$measurement2, method= pearson ) cor.test(datavector1, datavector2, method= pearson ) Tutorial 7: Correlation and Regression Correlation Used to test whether two variables are linearly associated. A correlation coefficient (r) indicates the strength and direction of the association. A correlation

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation

More information

Exercises. (a) Prove that m(t) =

Exercises. (a) Prove that m(t) = Exercises 1. Lack of memory. Verify that the exponential distribution has the lack of memory property, that is, if T is exponentially distributed with parameter λ > then so is T t given that T > t for

More information

Multi-state Models: An Overview

Multi-state Models: An Overview Multi-state Models: An Overview Andrew Titman Lancaster University 14 April 2016 Overview Introduction to multi-state modelling Examples of applications Continuously observed processes Intermittently observed

More information

Lecture 14: Introduction to Poisson Regression

Lecture 14: Introduction to Poisson Regression Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why

More information

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview

Modelling counts. Lecture 14: Introduction to Poisson Regression. Overview Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week

More information

Survival Analysis. STAT 526 Professor Olga Vitek

Survival Analysis. STAT 526 Professor Olga Vitek Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9 Survival Data and Survival Functions Statistical analysis of time-to-event data Lifetime of machines and/or parts (called failure time analysis

More information

Estimation for Modified Data

Estimation for Modified Data Definition. Estimation for Modified Data 1. Empirical distribution for complete individual data (section 11.) An observation X is truncated from below ( left truncated) at d if when it is at or below d

More information

7.1 The Hazard and Survival Functions

7.1 The Hazard and Survival Functions Chapter 7 Survival Models Our final chapter concerns models for the analysis of data which have three main characteristics: (1) the dependent variable or response is the waiting time until the occurrence

More information

Frailty Modeling for clustered survival data: a simulation study

Frailty Modeling for clustered survival data: a simulation study Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ - IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ - IHEC University

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

flexsurv: flexible parametric survival modelling in R. Supplementary examples

flexsurv: flexible parametric survival modelling in R. Supplementary examples flexsurv: flexible parametric survival modelling in R. Supplementary examples Christopher H. Jackson MRC Biostatistics Unit, Cambridge, UK chris.jackson@mrc-bsu.cam.ac.uk Abstract This vignette of examples

More information

for Time-to-event Data Mei-Ling Ting Lee University of Maryland, College Park

for Time-to-event Data Mei-Ling Ting Lee University of Maryland, College Park Threshold Regression for Time-to-event Data Mei-Ling Ting Lee University of Maryland, College Park MLTLEE@UMD.EDU Outline The proportional hazards (PH) model is widely used in analyzing time-to-event data.

More information

Joint longitudinal and time-to-event models via Stan

Joint longitudinal and time-to-event models via Stan Joint longitudinal and time-to-event models via Stan Sam Brilleman 1,2, Michael J. Crowther 3, Margarita Moreno-Betancur 2,4,5, Jacqueline Buros Novik 6, Rory Wolfe 1,2 StanCon 2018 Pacific Grove, California,

More information

STAT331. Cox s Proportional Hazards Model

STAT331. Cox s Proportional Hazards Model STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

More information

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics

Faculty of Health Sciences. Regression models. Counts, Poisson regression, Lene Theil Skovgaard. Dept. of Biostatistics Faculty of Health Sciences Regression models Counts, Poisson regression, 27-5-2013 Lene Theil Skovgaard Dept. of Biostatistics 1 / 36 Count outcome PKA & LTS, Sect. 7.2 Poisson regression The Binomial

More information

Time-dependent coefficients

Time-dependent coefficients Time-dependent coefficients Patrick Breheny December 1 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/20 Introduction As we discussed previously, stratification allows one to handle variables that

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

Generalized Linear Models in R

Generalized Linear Models in R Generalized Linear Models in R NO ORDER Kenneth K. Lopiano, Garvesh Raskutti, Dan Yang last modified 28 4 2013 1 Outline 1. Background and preliminaries 2. Data manipulation and exercises 3. Data structures

More information

Package CoxRidge. February 27, 2015

Package CoxRidge. February 27, 2015 Type Package Title Cox Models with Dynamic Ridge Penalties Version 0.9.2 Date 2015-02-12 Package CoxRidge February 27, 2015 Author Aris Perperoglou Maintainer Aris Perperoglou

More information

Instantaneous geometric rates via generalized linear models

Instantaneous geometric rates via generalized linear models Instantaneous geometric rates via generalized linear models Andrea Discacciati Matteo Bottai Unit of Biostatistics Karolinska Institutet Stockholm, Sweden andrea.discacciati@ki.se 1 September 2017 Outline

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used

More information

Multistate Modeling and Applications

Multistate Modeling and Applications Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

More information

STAT 526 Spring Final Exam. Thursday May 5, 2011

STAT 526 Spring Final Exam. Thursday May 5, 2011 STAT 526 Spring 2011 Final Exam Thursday May 5, 2011 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points will

More information

Log-linearity for Cox s regression model. Thesis for the Degree Master of Science

Log-linearity for Cox s regression model. Thesis for the Degree Master of Science Log-linearity for Cox s regression model Thesis for the Degree Master of Science Zaki Amini Master s Thesis, Spring 2015 i Abstract Cox s regression model is one of the most applied methods in medical

More information

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

More information

Introduction to the rstpm2 package

Introduction to the rstpm2 package Introduction to the rstpm2 package Mark Clements Karolinska Institutet Abstract This vignette outlines the methods and provides some examples for link-based survival models as implemented in the R rstpm2

More information

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

Tied survival times; estimation of survival probabilities

Tied survival times; estimation of survival probabilities Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

Relative-risk regression and model diagnostics. 16 November, 2015

Relative-risk regression and model diagnostics. 16 November, 2015 Relative-risk regression and model diagnostics 16 November, 2015 Relative risk regression More general multiplicative intensity model: Intensity for individual i at time t is i(t) =Y i (t)r(x i, ; t) 0

More information

Package JointModel. R topics documented: June 9, Title Semiparametric Joint Models for Longitudinal and Counting Processes Version 1.

Package JointModel. R topics documented: June 9, Title Semiparametric Joint Models for Longitudinal and Counting Processes Version 1. Package JointModel June 9, 2016 Title Semiparametric Joint Models for Longitudinal and Counting Processes Version 1.0 Date 2016-06-01 Author Sehee Kim Maintainer Sehee Kim

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information

Multi-state models: prediction

Multi-state models: prediction Department of Medical Statistics and Bioinformatics Leiden University Medical Center Course on advanced survival analysis, Copenhagen Outline Prediction Theory Aalen-Johansen Computational aspects Applications

More information

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Sarwar I Mozumder 1, Mark J Rutherford 1, Paul C Lambert 1,2 1 Biostatistics

More information

Daffodil project Multiple heart failure events

Daffodil project Multiple heart failure events Daffodil project Multiple heart failure events SDC April 2017 http://bendixcarstensen.com Draft version 1 Compiled Tuesday 25 th April, 2017, 15:17 from: /home/bendix/sdc/proj/daffodil/rep/multhf.tex Bendix

More information

Survival Distributions, Hazard Functions, Cumulative Hazards

Survival Distributions, Hazard Functions, Cumulative Hazards BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

More information

Lecture 10. Diagnostics. Statistics Survival Analysis. Presented March 1, 2016

Lecture 10. Diagnostics. Statistics Survival Analysis. Presented March 1, 2016 Statistics 255 - Survival Analysis Presented March 1, 2016 Dan Gillen Department of Statistics University of California, Irvine 10.1 Are model assumptions correct? Is the proportional hazards assumption

More information

The Weibull Distribution

The Weibull Distribution The Weibull Distribution Patrick Breheny October 10 Patrick Breheny University of Iowa Survival Data Analysis (BIOS 7210) 1 / 19 Introduction Today we will introduce an important generalization of the

More information

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution

Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Semiparametric Mixed Effects Models with Flexible Random Effects Distribution Marie Davidian North Carolina State University davidian@stat.ncsu.edu www.stat.ncsu.edu/ davidian Joint work with A. Tsiatis,

More information

Federated analyses. technical, statistical and human challenges

Federated analyses. technical, statistical and human challenges Federated analyses technical, statistical and human challenges Bénédicte Delcoigne, Statistician, PhD Department of Medicine (Solna), Unit of Clinical Epidemiology, Karolinska Institutet What is it? When

More information

Estimating Causal Effects of Organ Transplantation Treatment Regimes

Estimating Causal Effects of Organ Transplantation Treatment Regimes Estimating Causal Effects of Organ Transplantation Treatment Regimes David M. Vock, Jeffrey A. Verdoliva Boatman Division of Biostatistics University of Minnesota July 31, 2018 1 / 27 Hot off the Press

More information

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates

More information

Package crrsc. R topics documented: February 19, 2015

Package crrsc. R topics documented: February 19, 2015 Package crrsc February 19, 2015 Title Competing risks regression for Stratified and Clustered data Version 1.1 Author Bingqing Zhou and Aurelien Latouche Extension of cmprsk to Stratified and Clustered

More information

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work

SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that need work SSUI: Presentation Hints 1 Comparing Marginal and Random Eects (Frailty) Models Terry M. Therneau Mayo Clinic April 1998 SSUI: Presentation Hints 2 My Perspective Software Examples Reliability Areas that

More information

Survival Analysis. Stat 526. April 13, 2018

Survival Analysis. Stat 526. April 13, 2018 Survival Analysis Stat 526 April 13, 2018 1 Functions of Survival Time Let T be the survival time for a subject Then P [T < 0] = 0 and T is a continuous random variable The Survival function is defined

More information

Generalised linear models. Response variable can take a number of different formats

Generalised linear models. Response variable can take a number of different formats Generalised linear models Response variable can take a number of different formats Structure Limitations of linear models and GLM theory GLM for count data GLM for presence \ absence data GLM for proportion

More information

Consider Table 1 (Note connection to start-stop process).

Consider Table 1 (Note connection to start-stop process). Discrete-Time Data and Models Discretized duration data are still duration data! Consider Table 1 (Note connection to start-stop process). Table 1: Example of Discrete-Time Event History Data Case Event

More information

Package InferenceSMR

Package InferenceSMR Type Package Package InferenceSMR February 19, 2015 Title Inference about the standardized mortality ratio when evaluating the effect of a screening program on survival. Version 1.0 Date 2013-05-22 Author

More information

Variable Selection and Model Choice in Survival Models with Time-Varying Effects

Variable Selection and Model Choice in Survival Models with Time-Varying Effects Variable Selection and Model Choice in Survival Models with Time-Varying Effects Boosting Survival Models Benjamin Hofner 1 Department of Medical Informatics, Biometry and Epidemiology (IMBE) Friedrich-Alexander-Universität

More information

Modelling excess mortality using fractional polynomials and spline functions. Modelling time-dependent excess hazard

Modelling excess mortality using fractional polynomials and spline functions. Modelling time-dependent excess hazard Modelling excess mortality using fractional polynomials and spline functions Bernard Rachet 7 September 2007 Stata workshop - Stockholm 1 Modelling time-dependent excess hazard Excess hazard function λ

More information