Simulating complex survival data

Size: px
Start display at page:

Download "Simulating complex survival data"

Transcription

1 Simulating complex survival data Stata Nordic and Baltic Users Group Meeting 11 th November 2011 Michael J. Crowther 1 and Paul C. Lambert 1,2 1 Centre for Biostatistics and Genetic Epidemiology Department of Health Sciences University of Leicester, UK. 2 Department of Medical Epidemiology and Biostatistics Karolinska Institutet Stockholm, Sweden. michael.crowther@le.ac.uk Michael J. Crowther Stata Users Group Meeting 1 / 20

2 Outline Outline Standard parametric distributions 2-component mixture distributions Cause-specific competing risks Time-dependent effects Extensions Michael J. Crowther Stata Users Group Meeting 2 / 20

3 Background Survival times are often generated using the exponential or Weibull distributions h 0 (t) = λ h 0 (t) = λγt γ 1 Michael J. Crowther Stata Users Group Meeting 3 / 20

4 Background Survival times are often generated using the exponential or Weibull distributions h 0 (t) = λ h 0 (t) = λγt γ 1 Are these distributions biologically realistic? Are they complex enough to fully assess statistical models? Michael J. Crowther Stata Users Group Meeting 3 / 20

5 Parametric distributions Simulating under parametric distributions We can use the method of Bender et al. (2005): h(t) = h 0 (t) exp(x β) Michael J. Crowther Stata Users Group Meeting 4 / 20

6 Parametric distributions Simulating under parametric distributions We can use the method of Bender et al. (2005): h(t) = h 0 (t) exp(x β) H(t) = H 0 (t) exp(x β), S(t) = exp[ H 0 (t) exp(x β)] Michael J. Crowther Stata Users Group Meeting 4 / 20

7 Parametric distributions Simulating under parametric distributions We can use the method of Bender et al. (2005): h(t) = h 0 (t) exp(x β) H(t) = H 0 (t) exp(x β), S(t) = exp[ H 0 (t) exp(x β)] F (t) = 1 exp[ H 0 (t) exp(x β)] Michael J. Crowther Stata Users Group Meeting 4 / 20

8 Parametric distributions Simulating under parametric distributions We can use the method of Bender et al. (2005): h(t) = h 0 (t) exp(x β) H(t) = H 0 (t) exp(x β), S(t) = exp[ H 0 (t) exp(x β)] F (t) = 1 exp[ H 0 (t) exp(x β)] U = exp[ H 0 (t) exp(x β)] U(0, 1) Michael J. Crowther Stata Users Group Meeting 4 / 20

9 Parametric distributions Simulating under parametric distributions We can use the method of Bender et al. (2005): h(t) = h 0 (t) exp(x β) H(t) = H 0 (t) exp(x β), S(t) = exp[ H 0 (t) exp(x β)] F (t) = 1 exp[ H 0 (t) exp(x β)] U = exp[ H 0 (t) exp(x β)] U(0, 1) T = H 1 0 [ log(u) exp( X β)] Michael J. Crowther Stata Users Group Meeting 4 / 20

10 Example Simulate Weibull distributed survival times. set obs gen age = rnormal(50,5). gen trt = rbinomial(1,0.5). survsim stime, dist(weibull) lambda(0.1) gamma(1.5) cov(age 0.2 trt -0.5). gen event = stime<5. replace stime = 5 if event == 0. stset stime, failure(event=1) (output omitted ). streg age trt, dist(w) nohr nolog noheader failure _d: event == 1 analysis time _t: stime _t Coef. Std. Err. z P> z [95% Conf. Interval] age trt _cons /ln_p p /p Michael J. Crowther Stata Users Group Meeting 5 / 20

11 Increasing complexity Increasing complexity We wish to increase the complexity of the baseline hazard function beyond standard and sometimes biologically implausible shapes Michael J. Crowther Stata Users Group Meeting 6 / 20

12 Increasing complexity Increasing complexity We wish to increase the complexity of the baseline hazard function beyond standard and sometimes biologically implausible shapes In many cancer trial datasets a turning point in the hazard function is observed Michael J. Crowther Stata Users Group Meeting 6 / 20

13 Increasing complexity Increasing complexity We wish to increase the complexity of the baseline hazard function beyond standard and sometimes biologically implausible shapes In many cancer trial datasets a turning point in the hazard function is observed We propose to use 2-component mixture distributions (see McLachlan and McGiffin (1994)). For example the Weibull-Weibull mixture distribution: S 0 (t) = p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 ) Michael J. Crowther Stata Users Group Meeting 6 / 20

14 Increasing complexity Simulating survival times S 0 (t) = p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 ) We can induce proportional hazards by: S(t X ) = {p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 exp(x β) )} Michael J. Crowther Stata Users Group Meeting 7 / 20

15 Increasing complexity Simulating survival times S 0 (t) = p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 ) We can induce proportional hazards by: S(t X ) = {p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 exp(x β) )} We therefore have: {p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 )} exp(x β) U(0, 1) Michael J. Crowther Stata Users Group Meeting 7 / 20

16 Increasing complexity Simulating survival times S 0 (t) = p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 ) We can induce proportional hazards by: S(t X ) = {p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 exp(x β) )} We therefore have: {p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 )} exp(x β) U(0, 1) This cannot be directly solved for t... Michael J. Crowther Stata Users Group Meeting 7 / 20

17 Increasing complexity Newton-Raphson A simple solution is to use Newton-Raphson iterations where t n+1 = t n f (t n) f (t n ) f (t n ) = {p exp( λ 1 t γ 1 ) + (1 p) exp( λ 2 t γ 2 )} exp(x β) u and u U(0, 1) Michael J. Crowther Stata Users Group Meeting 8 / 20

18 Increasing complexity λ1 = 1, γ1 = 1.5, λ2 = 1,γ2 = 0.5, p = 0.5 λ1 = 0.1, γ1 = 3, λ2 = 0.1,γ2 = 1.6, p = 0.8 Hazard function Hazard function Follow up time Follow up time λ1 = 1.4, γ1 = 1.3, λ2 = 0.1,γ2 = 0.5, p = 0.9 λ1 = 1.5, γ1 = 0.2, λ2 = 0.5,γ2 = 0.1, p = 0.1 Hazard function Hazard function Follow up time Follow up time Figure: Example baseline mixture-weibull hazard functions. Michael J. Crowther Stata Users Group Meeting 9 / 20

19 Example Fit stmix model using Crowther and Lambert (2011). webuse brcancer, clear (German breast cancer data). stset rectime, failure(censrec==1) scale(365.25). stmix hormon, dist(ww) nolog Mixture Weibull-Weibull proportional hazards regression Log likelihood = Number of obs = 686 Haz. Ratio Std. Err. z P> z [95% Conf. Interval] xb hormon logit_p_mix _cons ln_lambda1 _cons ln_gamma1 _cons ln_lambda2 _cons ln_gamma2 _cons Michael J. Crowther Stata Users Group Meeting 10 / 20

20 Example Weibull model Mixture Weibull model Survival probability 0.6 Survival probability Follow-up (years) Follow-up (years) KM, hormone=0 KM, hormone=1 Predicted survival, hormone==0 Predicted survival, hormone==1 Figure: Fitted survival function. Michael J. Crowther Stata Users Group Meeting 11 / 20

21 Example Simulation study. survsim stime, mixture lambdas(`l1 `l2 ) gammas(`g1 `g2 ) pmix(`pmix ) /// > cov(trt `loghr ). simulate s2w = r(s2w) s2ww = r(s2ww) trt_w = r(trt_w) setrt_w = r(setrt_w) /// > trt_ww = r(trt_ww) setrt_ww = r(setrt_ww), reps(500): /// > simstudy2, pmix(`pmix ) l1(`l1 ) l2(`l2 ) g1(`g1 ) g2(`g2 ) loghr(`loghr ). gen s2 = `pmix * exp(-`l1 * 2^(`g1 )) + (1-`pmix )*exp(-`l2 * 2^(`g2 )). /* Bias */. gen bias_trt_w = trt_w - (`loghr ). gen bias_trt_ww = trt_ww - (`loghr ). su bias* Variable Obs Mean Std. Dev. Min Max bias_trt_w bias_trt_ww su bias_s2* Variable Obs Mean Std. Dev. Min Max bias_s2w bias_s2ww Michael J. Crowther Stata Users Group Meeting 12 / 20

22 Competing risks Simulating competing risks We can use the method of Beyersmann et al. (2009): Specify K cause-specific hazard functions, h k (t) Michael J. Crowther Stata Users Group Meeting 13 / 20

23 Competing risks Simulating competing risks We can use the method of Beyersmann et al. (2009): Specify K cause-specific hazard functions, h k (t) Simulate survival times with all-cause hazard: h all (t) = K h k (t) k=1 Michael J. Crowther Stata Users Group Meeting 13 / 20

24 Competing risks Simulating competing risks We can use the method of Beyersmann et al. (2009): Specify K cause-specific hazard functions, h k (t) Simulate survival times with all-cause hazard: h all (t) = K h k (t) k=1 For a simulated time t we run a multinomial experiment, with probability for each cause j: Prob(Cause = j t) = h j (t) K k=1 h k(t) Michael J. Crowther Stata Users Group Meeting 13 / 20

25 Example Example using Coviello (2008). set obs obs was 0, now gen trt = rbinomial(1,0.5). survsim stime event, dist(weibull) ncr(2) lambdas( ) gammas( ) > cov(trt ). replace event = 0 if stime>15 (78 real changes made). stset stime, failure(event==1) (output omitted ). streg trt, dist(w) nohr nolog noheader failure _d: event == 1 analysis time _t: stime _t Coef. Std. Err. z P> z [95% Conf. Interval] trt _cons /ln_p p /p stcompet ci1 = ci, compet1(2) by(trt) Michael J. Crowther Stata Users Group Meeting 14 / 20

26 Example 1.0 Cause Cause Cumulative Incidence Cumulative Incidence Follow up time Follow up time Treatment = 0 Treatment = 1 Figure: Cumulative incidence. Michael J. Crowther Stata Users Group Meeting 15 / 20

27 Time-dependent effects We want to incorporate time-dependent effects, such as a diminishing treatment effect. Under standard parametric models this can be achieved simply: h(t) = λγt γ 1 exp(βx i + φx i log(t)) = λγt γ 1+φX i exp(βx i ) Michael J. Crowther Stata Users Group Meeting 16 / 20

28 Example Example using Lambert and Royston (2009). set obs obs was 0, now gen trt = rbinomial(1,0.5). survsim stime, dist(weibull) lambdas(0.1) gammas(1.5) /// > cov(trt -0.5) tde(trt 0.15). gen died = stime <= 5. replace stime = 5 if died == 0 (3869 real changes made). stset stime, f(died = 1) (output omitted ). stpm2 trt, scale(h) df(3) tvc(trt) dftvc(1). predict hr, hrnumer(trt 1) ci. stpm2 trt, scale(h) df(3). predict hr2, hrnumer(trt 1) ci Michael J. Crowther Stata Users Group Meeting 17 / 20

29 Example 0.8 Time dependent hazard ratio Hazard Ratio Time 95% Confidence Interval Time dependent HR Constant HR Figure: Time-dependent hazard ratio. Michael J. Crowther Stata Users Group Meeting 18 / 20

30 Cure proportions Frailty distributions Time-dependent effects in mixture distributions Joint longitudinal and survival data Michael J. Crowther Stata Users Group Meeting 19 / 20

31 References I R. Bender, T. Augustin, and M. Blettner. Generating survival times to simulate Cox proportional hazards models. Stat Med, 24(11): , Jan Beyersmann, Aurélien Latouche, Anika Buchholz, and Martin Schumacher. Simulating competing risks data in survival analysis. Stat Med, 28(6): , Mar doi: /sim URL Enzo Coviello. Stcompet: Stata module to generate cumulative incidence in presence of competing events. Statistical Software Components, Boston College Department of Economics, URL Michael J. Crowther and Paul C. Lambert. Stmix: Stata module to fit two-component parametric mixture survival models. Statistical Software Components, Boston College Department of Economics, October URL P. C Lambert and P. Royston. Further development of flexible parametric models for survival analysis. The Stata Journal, 9: , G. J. McLachlan and D. C. McGiffin. On the role of finite mixture models in survival analysis. Stat Methods Med Res, 3(3): , Michael J. Crowther Stata Users Group Meeting 20 / 20

Parametric multistate survival analysis: New developments

Parametric multistate survival analysis: New developments Parametric multistate survival analysis: New developments Michael J. Crowther Biostatistics Research Group Department of Health Sciences University of Leicester, UK michael.crowther@le.ac.uk Victorian

More information

Instantaneous geometric rates via generalized linear models

Instantaneous geometric rates via generalized linear models Instantaneous geometric rates via generalized linear models Andrea Discacciati Matteo Bottai Unit of Biostatistics Karolinska Institutet Stockholm, Sweden andrea.discacciati@ki.se 1 September 2017 Outline

More information

Instantaneous geometric rates via Generalized Linear Models

Instantaneous geometric rates via Generalized Linear Models The Stata Journal (yyyy) vv, Number ii, pp. 1 13 Instantaneous geometric rates via Generalized Linear Models Andrea Discacciati Karolinska Institutet Stockholm, Sweden andrea.discacciati@ki.se Matteo Bottai

More information

Fractional Polynomials and Model Averaging

Fractional Polynomials and Model Averaging Fractional Polynomials and Model Averaging Paul C Lambert Center for Biostatistics and Genetic Epidemiology University of Leicester UK paul.lambert@le.ac.uk Nordic and Baltic Stata Users Group Meeting,

More information

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Sarwar I Mozumder 1, Mark J Rutherford 1, Paul C Lambert 1,2 1 Biostatistics

More information

One-stage dose-response meta-analysis

One-stage dose-response meta-analysis One-stage dose-response meta-analysis Nicola Orsini, Alessio Crippa Biostatistics Team Department of Public Health Sciences Karolinska Institutet http://ki.se/en/phs/biostatistics-team 2017 Nordic and

More information

Flexible parametric alternatives to the Cox model, and more

Flexible parametric alternatives to the Cox model, and more The Stata Journal (2001) 1, Number 1, pp. 1 28 Flexible parametric alternatives to the Cox model, and more Patrick Royston UK Medical Research Council patrick.royston@ctu.mrc.ac.uk Abstract. Since its

More information

Meta-analysis of epidemiological dose-response studies

Meta-analysis of epidemiological dose-response studies Meta-analysis of epidemiological dose-response studies Nicola Orsini 2nd Italian Stata Users Group meeting October 10-11, 2005 Institute Environmental Medicine, Karolinska Institutet Rino Bellocco Dept.

More information

Logit estimates Number of obs = 5054 Wald chi2(1) = 2.70 Prob > chi2 = Log pseudolikelihood = Pseudo R2 =

Logit estimates Number of obs = 5054 Wald chi2(1) = 2.70 Prob > chi2 = Log pseudolikelihood = Pseudo R2 = August 2005 Stata Application Tutorial 4: Discrete Models Data Note: Code makes use of career.dta, and icpsr_discrete1.dta. All three data sets are available on the Event History website. Code is based

More information

Analysis of competing risks data and simulation of data following predened subdistribution hazards

Analysis of competing risks data and simulation of data following predened subdistribution hazards Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

More information

How To Do Piecewise Exponential Survival Analysis in Stata 7 (Allison 1995:Output 4.20) revised

How To Do Piecewise Exponential Survival Analysis in Stata 7 (Allison 1995:Output 4.20) revised WM Mason, Soc 213B, S 02, UCLA Page 1 of 15 How To Do Piecewise Exponential Survival Analysis in Stata 7 (Allison 1995:Output 420) revised 4-25-02 This document can function as a "how to" for setting up

More information

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach

Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling approach Research Article Received XXXX (www.interscience.wiley.com) DOI: 10.1002/sim.0000 Direct likelihood inference on the cause-specific cumulative incidence function: a flexible parametric regression modelling

More information

Estimation of discrete time (grouped duration data) proportional hazards models: pgmhaz

Estimation of discrete time (grouped duration data) proportional hazards models: pgmhaz Estimation of discrete time (grouped duration data) proportional hazards models: pgmhaz Stephen P. Jenkins ESRC Research Centre on Micro-Social Change University of Essex, Colchester

More information

Latent class analysis and finite mixture models with Stata

Latent class analysis and finite mixture models with Stata Latent class analysis and finite mixture models with Stata Isabel Canette Principal Mathematician and Statistician StataCorp LLC 2017 Stata Users Group Meeting Madrid, October 19th, 2017 Introduction Latent

More information

Introduction to the rstpm2 package

Introduction to the rstpm2 package Introduction to the rstpm2 package Mark Clements Karolinska Institutet Abstract This vignette outlines the methods and provides some examples for link-based survival models as implemented in the R rstpm2

More information

A brief note on the simulation of survival data with a desired percentage of right-censored datas

A brief note on the simulation of survival data with a desired percentage of right-censored datas Journal of Data Science 14(2016), 701-712 A brief note on the simulation of survival data with a desired percentage of right-censored datas Edson Zangiacomi Martinez 1*, Jorge Alberto Achcar 1, Marcos

More information

Estimating compound expectation in a regression framework with the new cereg command

Estimating compound expectation in a regression framework with the new cereg command Estimating compound expectation in a regression framework with the new cereg command 2015 Nordic and Baltic Stata Users Group meeting Celia García-Pareja Matteo Bottai Unit of Biostatistics, IMM, KI September

More information

A Journey to Latent Class Analysis (LCA)

A Journey to Latent Class Analysis (LCA) A Journey to Latent Class Analysis (LCA) Jeff Pitblado StataCorp LLC 2017 Nordic and Baltic Stata Users Group Meeting Stockholm, Sweden Outline Motivation by: prefix if clause suest command Factor variables

More information

A comparison of inverse transform and composition methods of data simulation from the Lindley distribution

A comparison of inverse transform and composition methods of data simulation from the Lindley distribution Communications for Statistical Applications and Methods 2016, Vol. 23, No. 6, 517 529 http://dx.doi.org/10.5351/csam.2016.23.6.517 Print ISSN 2287-7843 / Online ISSN 2383-4757 A comparison of inverse transform

More information

especially with continuous

especially with continuous Handling interactions in Stata, especially with continuous predictors Patrick Royston & Willi Sauerbrei UK Stata Users meeting, London, 13-14 September 2012 Interactions general concepts General idea of

More information

Extended multivariate generalised linear and non-linear mixed effects models

Extended multivariate generalised linear and non-linear mixed effects models Extended multivariate generalised linear and non-linear mixed effects models Stata UK Meeting Cass Business School 7th September 2017 Michael J. Crowther Biostatistics Research Group, Department of Health

More information

The Simulation Extrapolation Method for Fitting Generalized Linear Models with Additive Measurement Error

The Simulation Extrapolation Method for Fitting Generalized Linear Models with Additive Measurement Error The Stata Journal (), Number, pp. 1 12 The Simulation Extrapolation Method for Fitting Generalized Linear Models with Additive Measurement Error James W. Hardin Norman J. Arnold School of Public Health

More information

Homework Solutions Applied Logistic Regression

Homework Solutions Applied Logistic Regression Homework Solutions Applied Logistic Regression WEEK 6 Exercise 1 From the ICU data, use as the outcome variable vital status (STA) and CPR prior to ICU admission (CPR) as a covariate. (a) Demonstrate that

More information

Joint longitudinal and time-to-event models via Stan

Joint longitudinal and time-to-event models via Stan Joint longitudinal and time-to-event models via Stan Sam Brilleman 1,2, Michael J. Crowther 3, Margarita Moreno-Betancur 2,4,5, Jacqueline Buros Novik 6, Rory Wolfe 1,2 StanCon 2018 Pacific Grove, California,

More information

Understanding the multinomial-poisson transformation

Understanding the multinomial-poisson transformation The Stata Journal (2004) 4, Number 3, pp. 265 273 Understanding the multinomial-poisson transformation Paulo Guimarães Medical University of South Carolina Abstract. There is a known connection between

More information

Confidence intervals for the variance component of random-effects linear models

Confidence intervals for the variance component of random-effects linear models The Stata Journal (2004) 4, Number 4, pp. 429 435 Confidence intervals for the variance component of random-effects linear models Matteo Bottai Arnold School of Public Health University of South Carolina

More information

Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see Title stata.com stcrreg postestimation Postestimation tools for stcrreg Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Applied Survival Analysis Lab 10: Analysis of multiple failures

Applied Survival Analysis Lab 10: Analysis of multiple failures Applied Survival Analysis Lab 10: Analysis of multiple failures We will analyze the bladder data set (Wei et al., 1989). A listing of the dataset is given below: list if id in 1/9 +---------------------------------------------------------+

More information

Frailty Modeling for clustered survival data: a simulation study

Frailty Modeling for clustered survival data: a simulation study Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ - IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ - IHEC University

More information

Consider Table 1 (Note connection to start-stop process).

Consider Table 1 (Note connection to start-stop process). Discrete-Time Data and Models Discretized duration data are still duration data! Consider Table 1 (Note connection to start-stop process). Table 1: Example of Discrete-Time Event History Data Case Event

More information

Generalized linear models

Generalized linear models Generalized linear models Christopher F Baum ECON 8823: Applied Econometrics Boston College, Spring 2016 Christopher F Baum (BC / DIW) Generalized linear models Boston College, Spring 2016 1 / 1 Introduction

More information

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU

More information

Package survsim. R topics documented: October 16, 2015

Package survsim. R topics documented: October 16, 2015 Package survsim October 16, 2015 Type Package Title Simulation of Simple and Comple Survival Data Version 1.1.4 Date 2015-10-07 Encoding UTF-8 Author David Moriña (Centre for Research in Environmental

More information

The influence of categorising survival time on parameter estimates in a Cox model

The influence of categorising survival time on parameter estimates in a Cox model The influence of categorising survival time on parameter estimates in a Cox model Anika Buchholz 1,2, Willi Sauerbrei 2, Patrick Royston 3 1 Freiburger Zentrum für Datenanalyse und Modellbildung, Albert-Ludwigs-Universität

More information

Using generalized structural equation models to fit customized models without programming, and taking advantage of new features of -margins-

Using generalized structural equation models to fit customized models without programming, and taking advantage of new features of -margins- Using generalized structural equation models to fit customized models without programming, and taking advantage of new features of -margins- Isabel Canette Principal Mathematician and Statistician StataCorp

More information

The Stata Journal. Editor Nicholas J. Cox Geography Department Durham University South Road Durham City DH1 3LE UK

The Stata Journal. Editor Nicholas J. Cox Geography Department Durham University South Road Durham City DH1 3LE UK The Stata Journal Editor H. Joseph Newton Department of Statistics Texas A & M University College Station, Texas 77843 979-845-3142; FAX 979-845-3144 jnewton@stata-journal.com Editor Nicholas J. Cox Geography

More information

Calculating Odds Ratios from Probabillities

Calculating Odds Ratios from Probabillities Arizona State University From the SelectedWorks of Joseph M Hilbe November 2, 2016 Calculating Odds Ratios from Probabillities Joseph M Hilbe Available at: https://works.bepress.com/joseph_hilbe/76/ Calculating

More information

Editor Executive Editor Associate Editors Copyright Statement:

Editor Executive Editor Associate Editors Copyright Statement: The Stata Journal Editor H. Joseph Newton Department of Statistics Texas A & M University College Station, Texas 77843 979-845-3142 979-845-3144 FAX jnewton@stata-journal.com Associate Editors Christopher

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS330 / MAS83 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-0 8 Parametric models 8. Introduction In the last few sections (the KM

More information

e 4β e 4β + e β ˆβ =0.765

e 4β e 4β + e β ˆβ =0.765 SIMPLE EXAMPLE COX-REGRESSION i Y i x i δ i 1 5 12 0 2 10 10 1 3 40 3 0 4 80 5 0 5 120 3 1 6 400 4 1 7 600 1 0 Model: z(t x) =z 0 (t) exp{βx} Partial likelihood: L(β) = e 10β e 10β + e 3β + e 5β + e 3β

More information

The simulation extrapolation method for fitting generalized linear models with additive measurement error

The simulation extrapolation method for fitting generalized linear models with additive measurement error The Stata Journal (2003) 3, Number 4, pp. 373 385 The simulation extrapolation method for fitting generalized linear models with additive measurement error James W. Hardin Arnold School of Public Health

More information

LOGISTIC REGRESSION Joseph M. Hilbe

LOGISTIC REGRESSION Joseph M. Hilbe LOGISTIC REGRESSION Joseph M. Hilbe Arizona State University Logistic regression is the most common method used to model binary response data. When the response is binary, it typically takes the form of

More information

Binary Dependent Variables

Binary Dependent Variables Binary Dependent Variables In some cases the outcome of interest rather than one of the right hand side variables - is discrete rather than continuous Binary Dependent Variables In some cases the outcome

More information

Working with Stata Inference on the mean

Working with Stata Inference on the mean Working with Stata Inference on the mean Nicola Orsini Biostatistics Team Department of Public Health Sciences Karolinska Institutet Dataset: hyponatremia.dta Motivating example Outcome: Serum sodium concentration,

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

JOINT REGRESSION MODELING OF TWO CUMULATIVE INCIDENCE FUNCTIONS UNDER AN ADDITIVITY CONSTRAINT AND STATISTICAL ANALYSES OF PILL-MONITORING DATA

JOINT REGRESSION MODELING OF TWO CUMULATIVE INCIDENCE FUNCTIONS UNDER AN ADDITIVITY CONSTRAINT AND STATISTICAL ANALYSES OF PILL-MONITORING DATA JOINT REGRESSION MODELING OF TWO CUMULATIVE INCIDENCE FUNCTIONS UNDER AN ADDITIVITY CONSTRAINT AND STATISTICAL ANALYSES OF PILL-MONITORING DATA by Martin P. Houze B. Sc. University of Lyon, 2000 M. A.

More information

A Regression Model For Recurrent Events With Distribution Free Correlation Structure

A Regression Model For Recurrent Events With Distribution Free Correlation Structure A Regression Model For Recurrent Events With Distribution Free Correlation Structure J. Pénichoux(1), A. Latouche(2), T. Moreau(1) (1) INSERM U780 (2) Université de Versailles, EA2506 ISCB - 2009 - Prague

More information

Lecture 3.1 Basic Logistic LDA

Lecture 3.1 Basic Logistic LDA y Lecture.1 Basic Logistic LDA 0.2.4.6.8 1 Outline Quick Refresher on Ordinary Logistic Regression and Stata Women s employment example Cross-Over Trial LDA Example -100-50 0 50 100 -- Longitudinal Data

More information

GMM Estimation in Stata

GMM Estimation in Stata GMM Estimation in Stata Econometrics I Department of Economics Universidad Carlos III de Madrid Master in Industrial Economics and Markets 1 Outline Motivation 1 Motivation 2 3 4 2 Motivation 3 Stata and

More information

SOCY5601 Handout 8, Fall DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS

SOCY5601 Handout 8, Fall DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS SOCY5601 DETECTING CURVILINEARITY (continued) CONDITIONAL EFFECTS PLOTS More on use of X 2 terms to detect curvilinearity: As we have said, a quick way to detect curvilinearity in the relationship between

More information

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data

Outline. Frailty modelling of Multivariate Survival Data. Clustered survival data. Clustered survival data Outline Frailty modelling of Multivariate Survival Data Thomas Scheike ts@biostat.ku.dk Department of Biostatistics University of Copenhagen Marginal versus Frailty models. Two-stage frailty models: copula

More information

flexsurv: flexible parametric survival modelling in R. Supplementary examples

flexsurv: flexible parametric survival modelling in R. Supplementary examples flexsurv: flexible parametric survival modelling in R. Supplementary examples Christopher H. Jackson MRC Biostatistics Unit, Cambridge, UK chris.jackson@mrc-bsu.cam.ac.uk Abstract This vignette of examples

More information

Cox regression: Estimation

Cox regression: Estimation Cox regression: Estimation Patrick Breheny October 27 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/19 Introduction The Cox Partial Likelihood In our last lecture, we introduced the Cox partial

More information

CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

More information

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

More information

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1

Case of single exogenous (iv) variable (with single or multiple mediators) iv à med à dv. = β 0. iv i. med i + α 1 Mediation Analysis: OLS vs. SUR vs. ISUR vs. 3SLS vs. SEM Note by Hubert Gatignon July 7, 2013, updated November 15, 2013, April 11, 2014, May 21, 2016 and August 10, 2016 In Chap. 11 of Statistical Analysis

More information

Creating New Distributions

Creating New Distributions Creating New Distributions Section 5.2 Stat 477 - Loss Models Section 5.2 (Stat 477) Creating New Distributions Brian Hartman - BYU 1 / 18 Generating new distributions Some methods to generate new distributions

More information

Extensions of Cox Model for Non-Proportional Hazards Purpose

Extensions of Cox Model for Non-Proportional Hazards Purpose PhUSE Annual Conference 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Author: Jadwiga Borucka PAREXEL, Warsaw, Poland Brussels 13 th - 16 th October 2013 Presentation Plan

More information

Semiparametric Regression

Semiparametric Regression Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS3301 / MAS8311 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-10 1 13 The Cox proportional hazards model 13.1 Introduction In the

More information

Generating survival data for fitting marginal structural Cox models using Stata Stata Conference in San Diego, California

Generating survival data for fitting marginal structural Cox models using Stata Stata Conference in San Diego, California Generating survival data for fitting marginal structural Cox models using Stata 2012 Stata Conference in San Diego, California Outline Idea of MSM Various weights Fitting MSM in Stata using pooled logistic

More information

Problem set - Selection and Diff-in-Diff

Problem set - Selection and Diff-in-Diff Problem set - Selection and Diff-in-Diff 1. You want to model the wage equation for women You consider estimating the model: ln wage = α + β 1 educ + β 2 exper + β 3 exper 2 + ɛ (1) Read the data into

More information

Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation

Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation Part IV Extensions: Competing Risks Endpoints and Non-Parametric AUC(t) Estimation Patrick J. Heagerty PhD Department of Biostatistics University of Washington 166 ISCB 2010 Session Four Outline Examples

More information

Package complex.surv.dat.sim

Package complex.surv.dat.sim Package complex.surv.dat.sim February 15, 2013 Type Package Title Simulation of complex survival data Version 1.1.1 Date 2012-07-09 Encoding latin1 Author David Moriña, Centre Tecnològic de Nutrició i

More information

Pre-test post test, or longitudinal data with baseline

Pre-test post test, or longitudinal data with baseline 1 Pre-test post test, or longitudinal data with baseline Ulrich Halekoh Epidemiology, Biostatistics and Biodemography, SDU February 4, 2014 Pre-post test; two measurements on several subjects Figure 1:

More information

Generalized Survival Models as a Tool for Medical Research

Generalized Survival Models as a Tool for Medical Research From the Department of Medical Epidemiology and Biostatistics Karolinska Institutet, Stockholm, Sweden Generalized Survival Models as a Tool for Medical Research Xingrong Liu Stockholm 2017 All previously

More information

Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data

Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data The 3rd Australian and New Zealand Stata Users Group Meeting, Sydney, 5 November 2009 1 Model and Working Correlation Structure Selection in GEE Analyses of Longitudinal Data Dr Jisheng Cui Public Health

More information

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require

In contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require Chapter 5 modelling Semi parametric We have considered parametric and nonparametric techniques for comparing survival distributions between different treatment groups. Nonparametric techniques, such as

More information

Analysing repeated measurements whilst accounting for derivative tracking, varying within-subject variance and autocorrelation: the xtiou command

Analysing repeated measurements whilst accounting for derivative tracking, varying within-subject variance and autocorrelation: the xtiou command Analysing repeated measurements whilst accounting for derivative tracking, varying within-subject variance and autocorrelation: the xtiou command R.A. Hughes* 1, M.G. Kenward 2, J.A.C. Sterne 1, K. Tilling

More information

Proportional hazards regression

Proportional hazards regression Proportional hazards regression Patrick Breheny October 8 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/28 Introduction The model Solving for the MLE Inference Today we will begin discussing regression

More information

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times

Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Part [1.0] Measures of Classification Accuracy for the Prediction of Survival Times Patrick J. Heagerty PhD Department of Biostatistics University of Washington 1 Biomarkers Review: Cox Regression Model

More information

Survival Analysis. DZHW Summer School Workshop Johannes Giesecke HU Berlin

Survival Analysis. DZHW Summer School Workshop Johannes Giesecke HU Berlin Survival Analysis DZHW Summer School Workshop 6.9.2016 Johannes Giesecke HU Berlin Program Block I Basics a) Key concepts (censoring, truncation, analysis time and functions thereof, overview of potential

More information

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA

PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA PENALIZED LIKELIHOOD PARAMETER ESTIMATION FOR ADDITIVE HAZARD MODELS WITH INTERVAL CENSORED DATA Kasun Rathnayake ; A/Prof Jun Ma Department of Statistics Faculty of Science and Engineering Macquarie University

More information

Modelling geoadditive survival data

Modelling geoadditive survival data Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model

More information

The Stata Journal. Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas

The Stata Journal. Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas The Stata Journal Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas editors@stata-journal.com Nicholas J. Cox Department of Geography Durham University Durham,

More information

The Stata Journal. Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas

The Stata Journal. Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas The Stata Journal Editors H. Joseph Newton Department of Statistics Texas A&M University College Station, Texas editors@stata-journal.com Nicholas J. Cox Department of Geography Durham University Durham,

More information

Dynamic Models Part 1

Dynamic Models Part 1 Dynamic Models Part 1 Christopher Taber University of Wisconsin December 5, 2016 Survival analysis This is especially useful for variables of interest measured in lengths of time: Length of life after

More information

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox s regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53

More information

Multistate models and recurrent event models

Multistate models and recurrent event models Multistate models Multistate models and recurrent event models Patrick Breheny December 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Multistate models In this final lecture,

More information

Introduction to Event History Analysis. Hsueh-Sheng Wu CFDR Workshop Series June 20, 2016

Introduction to Event History Analysis. Hsueh-Sheng Wu CFDR Workshop Series June 20, 2016 Introduction to Event History Analysis Hsueh-Sheng Wu CFDR Workshop Series June 20, 2016 1 What is event history analysis Event history analysis steps Outline Create data for event history analysis Data

More information

Federated analyses. technical, statistical and human challenges

Federated analyses. technical, statistical and human challenges Federated analyses technical, statistical and human challenges Bénédicte Delcoigne, Statistician, PhD Department of Medicine (Solna), Unit of Clinical Epidemiology, Karolinska Institutet What is it? When

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Nonlinear Regression Functions

Nonlinear Regression Functions Nonlinear Regression Functions (SW Chapter 8) Outline 1. Nonlinear regression functions general comments 2. Nonlinear functions of one variable 3. Nonlinear functions of two variables: interactions 4.

More information

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P. Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk

More information

The nltm Package. July 24, 2006

The nltm Package. July 24, 2006 The nltm Package July 24, 2006 Version 1.2 Date 2006-07-17 Title Non-linear Transformation Models Author Gilda Garibotti, Alexander Tsodikov Maintainer Gilda Garibotti Depends

More information

options description set confidence level; default is level(95) maximum number of iterations post estimation results

options description set confidence level; default is level(95) maximum number of iterations post estimation results Title nlcom Nonlinear combinations of estimators Syntax Nonlinear combination of estimators one expression nlcom [ name: ] exp [, options ] Nonlinear combinations of estimators more than one expression

More information

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

More information

Lecture 22 Survival Analysis: An Introduction

Lecture 22 Survival Analysis: An Introduction University of Illinois Department of Economics Spring 2017 Econ 574 Roger Koenker Lecture 22 Survival Analysis: An Introduction There is considerable interest among economists in models of durations, which

More information

Modelling excess mortality using fractional polynomials and spline functions. Modelling time-dependent excess hazard

Modelling excess mortality using fractional polynomials and spline functions. Modelling time-dependent excess hazard Modelling excess mortality using fractional polynomials and spline functions Bernard Rachet 7 September 2007 Stata workshop - Stockholm 1 Modelling time-dependent excess hazard Excess hazard function λ

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Survival Analysis. Lu Tian and Richard Olshen Stanford University

Survival Analysis. Lu Tian and Richard Olshen Stanford University 1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

More information

Censoring. Time to Event (Survival) Data. Special features of time to event (survival) data: Strictly non-negative observations

Censoring. Time to Event (Survival) Data. Special features of time to event (survival) data: Strictly non-negative observations Time to Event (Survival) Data Survival analysis is the analysis of observed times from a well defined origin to the occurrence of a particular event or end-point. Time from entry into a clinical trial

More information

2. We care about proportion for categorical variable, but average for numerical one.

2. We care about proportion for categorical variable, but average for numerical one. Probit Model 1. We apply Probit model to Bank data. The dependent variable is deny, a dummy variable equaling one if a mortgage application is denied, and equaling zero if accepted. The key regressor is

More information

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests

ECON Introductory Econometrics. Lecture 7: OLS with Multiple Regressors Hypotheses tests ECON4150 - Introductory Econometrics Lecture 7: OLS with Multiple Regressors Hypotheses tests Monique de Haan (moniqued@econ.uio.no) Stock and Watson Chapter 7 Lecture outline 2 Hypothesis test for single

More information

Reduced-rank hazard regression

Reduced-rank hazard regression Chapter 2 Reduced-rank hazard regression Abstract The Cox proportional hazards model is the most common method to analyze survival data. However, the proportional hazards assumption might not hold. The

More information

Survival Analysis. Hsueh-Sheng Wu CFDR Workshop Series Spring 2010

Survival Analysis. Hsueh-Sheng Wu CFDR Workshop Series Spring 2010 Survival Analysis Hsueh-Sheng Wu CFDR Workshop Series Spring 2010 1 Outline Survival analysis steps Create data for survival analysis Data for different analyses The dependent variable in Life Table analysis

More information

PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata

PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata PSC 8185: Multilevel Modeling Fitting Random Coefficient Binary Response Models in Stata Consider the following two-level model random coefficient logit model. This is a Supreme Court decision making model,

More information

BIOS 312: Precision of Statistical Inference

BIOS 312: Precision of Statistical Inference and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

More information

Likelihood Construction, Inference for Parametric Survival Distributions

Likelihood Construction, Inference for Parametric Survival Distributions Week 1 Likelihood Construction, Inference for Parametric Survival Distributions In this section we obtain the likelihood function for noninformatively rightcensored survival data and indicate how to make

More information