Lecture 11. Interval Censored and Discrete-Time Data. Statistics 255 - Survival Analysis. Presented March 3, 2016


1 Statistics 255 - Survival Analysis
Presented March 3, 2016
Dan Gillen
Department of Statistics
University of California, Irvine

2 First question: Are the data truly discrete?
Examples:
- Number of attempts at a puzzle before it is solved
- Number of grades completed before dropping out of school
- Number of doses required for a given effect to be observed
- Number of inseminations of cows required to achieve pregnancy
- Years measured by the number of rings on a tree (time is the number of seasonal cycles)
- Number of screening visits to detect recurrence of disease

3 If failures are really instantaneous and the time measure is really continuous, then the data are not truly discrete, but are interval censored.
Examples:
1. Alzheimer's disease cohort study
   - Enter into cohort with normal cognition
   - Annual neuropsychological testing to assess conversion to demented state
2. Breast cancer screening
   - Time from birth until the development of BC
   - Subjects screened every 5 years until age 40, then each year after that
3. Maintenance therapy studies
   - Cohort of cancer patients in remission
   - Time to disease recurrence
   - Patients regularly screened

4 Types of interval censoring
Fixed interval censoring
- Researcher (or some external process) selects a fixed screening interval
- Every subject is screened according to the defined intervals
Random interval censoring
- Screening intervals vary from screening to screening and from person to person
Independent interval censoring
- Occurrence of screening and lengths of intervals are independent of failure times (conditional on covariates)

5 Types of interval censoring
Example of where independence does not hold:
- Relapse prevention for ulcers comparing two treatments
- Regular screening intervals for endoscopy (6 months)
- Additional endoscopies at other visits, due to symptoms or other health problems
- Effective screening intervals are not independent of relapse
We will consider fixed interval censoring.

6 Notation
Suppose patients are followed up or screened at times $t_1, t_2, \ldots, t_j, \ldots, t_J$.
"Complete" right censored data is given by:
\[ (Y_1, \delta_1, x_1), \ldots, (Y_n, \delta_n, x_n) \]
"Observed" interval censored data is given by:
\[ (Y_1^*, \delta_1, x_1), \ldots, (Y_n^*, \delta_n, x_n) \]
where $Y_i^* = j$ if $Y_i \in (t_{j-1}, t_j]$.
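The mapping from an unobserved exact time to the interval index can be sketched in R as follows. This is a minimal illustration, assuming hypothetical screening times and exact times (the names `screen.times`, `y.exact` and `y.star` are not from the lecture); `y.star` plays the role of the interval index $j$ attached to each subject.

```r
# Hypothetical screening times t_1 < ... < t_J (weeks); t_0 = 0 is the study start.
screen.times <- c(4, 8, 12, 16)

# Hypothetical exact times Y_i; under interval censoring these are never seen directly.
y.exact <- c(2.5, 4, 7.9, 12.1, 16)

# The interval index is j when Y_i falls in (t_{j-1}, t_j];
# right = TRUE makes each interval open on the left and closed on the right.
y.star <- as.integer(cut(y.exact, breaks = c(0, screen.times), right = TRUE))

print(data.frame(y.exact, y.star))
```

Note that a time exactly equal to a screening time (here 4 and 16) is assigned to the interval that ends at that screening, which matches the half-open interval convention above.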

7 Analysis Strategies
Univariate Analysis
- Cohort life table analysis
- Assume censoring and death are uniformly distributed within each interval
Regression Analysis
- Fixed interval proportional hazards model
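A cohort life table can be sketched in a few lines of R. This is an illustrative sketch only, assuming made-up counts and the usual actuarial convention in which subjects censored during an interval count as at risk for half of it (one common way of using the uniformity assumption); the variable names are hypothetical.

```r
# Hypothetical counts for J = 3 screening intervals.
n0 <- 100             # subjects entering the first interval
d  <- c(10, 8, 5)     # failures observed in each interval
w  <- c(4, 6, 2)      # subjects censored in each interval
J  <- length(d)

# Number entering each interval: everyone not yet failed or censored.
n.enter <- n0 - c(0, cumsum(d + w)[-J])

# Effective number at risk: censored subjects contribute half an interval.
n.eff <- n.enter - w / 2

# Conditional probability of failing in the interval, and cumulative survival.
q <- d / n.eff
S <- cumprod(1 - q)

print(data.frame(interval = 1:J, n.enter, d, w, n.eff, q, S))
```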

8 Consider the proportional hazards model for right censored data
\[ \lambda_i(t \mid x_i) = \lambda_0(t)\, e^{\beta^T x_i} \]
so that
\[ S_i(t \mid x_i) = [S_0(t)]^{\exp(\beta^T x_i)} \]
Now consider this model in the interval censored setting, so that
\[ S_i(t_j \mid x_i) = [S_0(t_j)]^{\exp(\beta^T x_i)} = \Pr[T > t_j \mid x_i] = \Pr[\text{Surviving } j\text{th interval} \mid x_i] \]

9 Now consider the conditional probability of failing in an interval given survival up to the start of the interval
\[ \Pr[\text{Failing in } j\text{th interval} \mid \text{Survive } (j-1)\text{th interval}, x_i]
   = \frac{S_i(t_{j-1} \mid x_i) - S_i(t_j \mid x_i)}{S_i(t_{j-1} \mid x_i)}
   = 1 - \frac{S_i(t_j \mid x_i)}{S_i(t_{j-1} \mid x_i)}
   = 1 - \left( \frac{S_0(t_j)}{S_0(t_{j-1})} \right)^{\exp(\beta^T x_i)} \]

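10 Now, let
\[ \pi_j = \Pr[\text{Failing in } j\text{th interval} \mid \text{Survive } (j-1)\text{th interval}, x_i] \]
So that
\[ \pi_j = 1 - \left( \frac{S_0(t_j)}{S_0(t_{j-1})} \right)^{\exp(\beta^T x_i)} \]
\[ 1 - \pi_j = \left( \frac{S_0(t_j)}{S_0(t_{j-1})} \right)^{\exp(\beta^T x_i)} \]
\[ \log(1 - \pi_j) = \exp(\beta^T x_i) \log\left( \frac{S_0(t_j)}{S_0(t_{j-1})} \right)
   = -\exp(\beta^T x_i)\, [\Lambda_0(t_j) - \Lambda_0(t_{j-1})] \]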

11 Thus we have
\[ \log[-\log(1 - \pi_j)] = \log[\Lambda_0(t_j) - \Lambda_0(t_{j-1})] + \beta^T x_i \equiv \gamma_j + \beta^T x_i \]
- This is just a binary regression model (i.e., a GLM) with a complementary log-log (cloglog) link and interval-specific intercepts
- Provided that J (the number of intervals) is not too large, this model can be fit with ordinary software
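The identity linking the conditional failure probability to the complementary log-log scale can be checked numerically. The sketch below assumes a hypothetical baseline cumulative hazard and a single covariate (all values are invented for illustration); it verifies that $\log[-\log(1 - \pi_j)]$ equals $\gamma_j + \beta x$ with $\gamma_j = \log[\Lambda_0(t_j) - \Lambda_0(t_{j-1})]$.

```r
# Hypothetical baseline cumulative hazard at t_0, t_1, t_2, t_3.
Lambda0 <- c(0, 0.10, 0.25, 0.45)
beta <- 0.5   # hypothetical log relative hazard
x <- 1        # covariate value for one subject

# Survival under proportional hazards: S_i(t_j | x) = S_0(t_j)^exp(beta * x).
S <- exp(-Lambda0 * exp(beta * x))

# Conditional probability of failing in interval j given survival to t_{j-1}.
pi.j <- 1 - S[-1] / S[-length(S)]

# Interval-specific intercepts and the two sides of the cloglog identity.
gamma.j <- log(diff(Lambda0))
lhs <- log(-log(1 - pi.j))
rhs <- gamma.j + beta * x

print(cbind(pi.j, lhs, rhs))
```

The two columns `lhs` and `rhs` agree for every interval, which is why a binomial GLM with a cloglog link and one intercept per interval recovers $\beta$ on the log relative hazard scale.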

12 (Section 1.14, K & M)
- Outcome: Time to cessation of breast feeding (weeks)
- Covariate of interest: smoking, adjusting for race (race = 1 (White), race = 2 (Black), race = 3 (other))
- This is illustrative, so let's do the Cox model using the Efron approximation for ties (for comparison)
- I cannot compute the exact partial likelihood in R (on my computer)

13 (Section 1.14, K & M)
Cox model with Efron approximation

    > library(survival)   # provides coxph() and Surv()
    > bfeed <- read.table( " STAT255//bfeed.txt" )
    > names( bfeed ) <- c( "duration", "icompbf", "racemom", "poverty",
    +                      "momsmoke", "momdrink", "momage",
    +                      "yob", "momeduc", "precare" )
    > bfeed <- bfeed[ order(bfeed$duration), ]
    > bfeed$id <- 1:dim(bfeed)[1]
    >
    > ##
    > ##### Fit Cox model with Efron adjustment for ties
    > ##
    > fit <- coxph( Surv(duration,icompbf) ~ factor(racemom) + momsmoke,
    +               data=bfeed, method="efron" )
    > summary(fit)

[Output: exp(coef), exp(-coef), lower.95 and upper.95 for the two factor(racemom) indicators and momsmoke; the numeric values were not preserved in this transcript.]

14 (Section 1.14, K & M)
Now recode data to be intervals

    > bfeed$int.durat <- cut( bfeed$duration, c(0,1,2,4,6,10,16,24,36,52,192),
    +                         include.lowest=TRUE, ordered_result=TRUE )
    > xtabs( ~ int.durat + icompbf, data=bfeed )
              icompbf
    int.durat    0   1
      [0,1]      2  77
      (1,2]      3  71
      (2,4]      (counts not preserved)
      (4,6]      9  75
      (6,10]     (counts not preserved)
      (10,16]    (counts not preserved)
      (16,24]    (counts not preserved)
      (24,36]    0  74
      (36,52]    0  85
      (52,192]   0  27

15 (Section 1.14, K & M)
Now, expand the data to set up for the fixed-interval censored proportional hazards model

    > ##
    > ##### Expand dataset to consider interval censoring
    > ##
    > u.evtimes <- as.ordered(unique(bfeed$int.durat[ bfeed$icompbf==1 ]))
    > num.event <- length( u.evtimes )
    > bfeed.texpand <- bfeed[, c("id", "int.durat", "icompbf", "racemom", "momsmoke") ]
    > bfeed.texpand <- bfeed.texpand[ rep(bfeed.texpand$id,each=num.event), ]
    > bfeed.texpand$interval <- rep( u.evtimes, sum(!duplicated(bfeed.texpand$id)) )
    > bfeed.texpand <- bfeed.texpand[ bfeed.texpand$int.durat >= bfeed.texpand$interval, ]
    > bfeed.texpand <- bfeed.texpand[ dim(bfeed.texpand)[1]:1, ]
    > bfeed.texpand$icompbf <- ifelse( !duplicated(bfeed.texpand$id), bfeed.texpand$icompbf, 0 )
    > bfeed.texpand <- bfeed.texpand[ dim(bfeed.texpand)[1]:1, ]
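The expansion step produces one row per subject per interval at risk, with the event indicator switched on only in the subject's last interval. The sketch below shows the same idea on a three-subject toy data set; it is an alternative illustration rather than the lecture's code, and the names `toy`, `int.index`, `event` and `long` are hypothetical.

```r
# Toy data: interval in which follow-up ended, event indicator, one covariate.
toy <- data.frame(id        = 1:3,
                  int.index = c(1, 3, 2),   # index of the last interval observed
                  event     = c(1, 0, 1),   # 1 = failed in that interval, 0 = censored
                  momsmoke  = c(0, 1, 1))

# Repeat each subject once for every interval in which they were at risk.
rows <- rep(seq_len(nrow(toy)), times = toy$int.index)
long <- toy[rows, c("id", "event", "momsmoke")]
long$interval <- sequence(toy$int.index)

# The binary response is 1 only in the final interval of a subject who failed.
long$y <- as.integer(long$interval == toy$int.index[rows] & long$event == 1)
rownames(long) <- NULL

print(long)
```

A data set in this person-period layout can be passed directly to `glm()` with `factor(interval)` supplying the interval-specific intercepts.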

16 (Section 1.14, K & M)
Let's have a look at the data...

    > bfeed.texpand[c(1:5,300:305),]

[Output columns: id, int.durat, icompbf, racemom, momsmoke, interval. The first five rows (row labels 2, 28, 35, 86, 88; id 1 to 5) all have int.durat = [0,1] and interval = [0,1]. Rows 300 to 305 all have int.durat = (2,4], with interval cycling through [0,1], (1,2], (2,4] twice. The icompbf, racemom and momsmoke values were not preserved in this transcript.]

17 (Section 1.14, K & M)
Now, fit the fixed-interval survival model

    > fit.intph <- glm( icompbf ~ factor(as.numeric(interval)) + factor(racemom) + momsmoke,
    +                   data=bfeed.texpand, family=binomial(link="cloglog") )
    > summary(fit.intph)

[Output: coefficient table (Estimate, Std. Error, z value, Pr(>|z|)) with rows for the intercept, nine factor(as.numeric(interval)) indicators, two factor(racemom) indicators and momsmoke; the numeric values were not preserved in this transcript.]

18 (Section 1.14, K & M)
Use glmci() on the course webpage to exponentiate results and produce CIs

    > signif( glmci( fit.intph ), 3 )

[Output: exp( Est ), ci95.lo, ci95.hi, z value and Pr(>|z|) for the intercept, nine factor(as.numeric(interval)) indicators, two factor(racemom) indicators and momsmoke; the numeric values were not preserved in this transcript.]
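The course's glmci() function is not reproduced in this transcript. As a rough stand-in, the sketch below defines a hypothetical helper, `wald.ci()`, that exponentiates GLM coefficients and forms Wald confidence intervals on the linear-predictor scale; it is an assumption about what such a helper does, not the course code. It is demonstrated on R's built-in `mtcars` data purely so the example runs on its own.

```r
# Hypothetical stand-in for glmci(): exponentiated estimates with Wald CIs.
wald.ci <- function(fit, level = 0.95) {
  tab <- coef(summary(fit))          # Estimate, Std. Error, z value, Pr(>|z|)
  z   <- qnorm(1 - (1 - level) / 2)  # normal critical value, 1.96 for 95%
  est <- tab[, "Estimate"]
  se  <- tab[, "Std. Error"]
  cbind(exp.est = exp(est),
        ci.lo   = exp(est - z * se),
        ci.hi   = exp(est + z * se),
        z.value = tab[, "z value"],
        p.value = tab[, "Pr(>|z|)"])
}

# Demonstration on built-in data (unrelated to the breast feeding example).
fit.demo <- glm(am ~ wt, data = mtcars, family = binomial)
print(signif(wald.ci(fit.demo), 3))
```

The helper does not depend on the link function, so the same call would apply to a cloglog fit, where the exponentiated coefficients are read as relative hazards rather than odds ratios.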

19 (Section 1.14, K & M)
Interpretation:
- The estimated relative hazard for cessation of breast feeding for smoking versus non-smoking mothers is exp(0.29) = 1.34 (95% CI: bounds not preserved in this transcript; p-value = 0.0001)
- This compares to an estimate of 1.32 when the complete data were used with the Efron approximation

20 What about situations where time truly is discrete?
- Number of attempts a child takes to solve a puzzle before it is solved
- Number of grades completed before dropping out of school
- Number of doses required for a given effect to be observed
It is natural to think of these settings as continuation trials:
- A child will only attempt to solve a puzzle a second time if they failed to solve it the first time
- A student can only fail the 10th grade if they passed the 9th grade
- A patient is only given dose j + 1 if they did not show benefit at dose j

21 Discrete-time hazard function
Here we may wish to focus on conditional probabilities:
- What is the probability of success on attempt j + 1 given failure on attempt j?
Notice how this is similar to a hazard:
- What is the probability of failure at time t, given survival up to time t?

22 Discrete-time hazard function
The survival function for discrete-time data is defined as:
\[ S(t_j) = \Pr[T > t_j] \]
(same as the survival function for continuous-time data)
The hazard function for discrete-time data is:
\[ \lambda(t_j) = \frac{\Pr[T = t_j]}{\Pr[T \geq t_j]} = \frac{\Pr[T = t_j]}{\Pr[T > t_{j-1}]} = \frac{S(t_{j-1}) - S(t_j)}{S(t_{j-1})} \]
(different from continuous-time data, but a natural extension)
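The relationship between the discrete survival function and the discrete hazard can be illustrated with a short calculation. The survival values below are hypothetical; the sketch computes $\lambda(t_j)$ from $S$ and then recovers $S$ as the running product of $1 - \lambda(t_k)$.

```r
# Hypothetical discrete survival function at t_0, t_1, ..., t_4, with S(t_0) = 1.
S <- c(1, 0.90, 0.72, 0.54, 0.27)
J <- length(S) - 1

# lambda(t_j) = [S(t_{j-1}) - S(t_j)] / S(t_{j-1})
lambda <- (S[1:J] - S[2:(J + 1)]) / S[1:J]

# Survival is recovered as the product of conditional survival probabilities.
S.back <- cumprod(1 - lambda)

print(cbind(j = 1:J, lambda, S = S[-1], S.back))
```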

23 Analysis Strategies
Univariate Analysis
- Ordinary Kaplan-Meier / Nelson-Aalen estimators
Regression Analysis
- Discrete-time proportional hazards model (aka continuation ratio model)

24 Let $\lambda(t_j)$ denote the discrete hazard at time $t_j$.
Therefore $\lambda(t_j)$ is the probability of failure at $t_j$, given survival up to $t_j$ (i.e., past $t_{j-1}$).
The odds of failure at $t_j$, given survival up to $t_j$, are then given by
\[ \frac{\lambda(t_j)}{1 - \lambda(t_j)} \]
Now suppose that
\[ \frac{\lambda_i(t_j \mid x_i)}{1 - \lambda_i(t_j \mid x_i)} = \frac{\lambda_0(t_j)}{1 - \lambda_0(t_j)} \exp(\beta^T x_i) \]

25 This can be thought of as a proportional odds model on the conditional probability of failure at time $t_j$, given survival up to $t_j$.
It is referred to as a continuation ratio model.
We can also rewrite this so that
\[ \log\left( \frac{\lambda_i(t_j \mid x_i)}{1 - \lambda_i(t_j \mid x_i)} \right)
   = \log\left( \frac{\lambda_0(t_j)}{1 - \lambda_0(t_j)} \exp(\beta^T x_i) \right)
   = \gamma_j + \beta^T x_i \]
where $\gamma_j = \log[\lambda_0(t_j) / (1 - \lambda_0(t_j))]$.
This has the form of a logistic regression model with separate intercepts for each follow-up time.
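To make the continuation ratio model concrete, the sketch below turns hypothetical intercepts $\gamma_j$ and a hypothetical coefficient $\beta$ into discrete hazards and survival probabilities for a binary covariate, and confirms that the conditional odds ratio equals $\exp(\beta)$ at every time point. All numbers and function names are invented for illustration.

```r
# Hypothetical time-specific intercepts and a single log odds ratio.
gamma <- c(-3.0, -2.5, -2.0, -1.5)
beta  <- 0.7

# Discrete hazard at each time: inverse logit of gamma_j + beta * x.
haz  <- function(x) plogis(gamma + beta * x)

# Discrete survival: product of (1 - hazard) over the times passed so far.
surv <- function(x) cumprod(1 - haz(x))

# Conditional odds of failure and the odds ratio comparing x = 1 with x = 0.
odds <- function(p) p / (1 - p)
or   <- odds(haz(1)) / odds(haz(0))

print(cbind(haz0 = haz(0), haz1 = haz(1), surv0 = surv(0), surv1 = surv(1), or))
```

The odds ratio column is constant even though the hazards themselves change over time, which is the sense in which the model assumes proportional (conditional) odds.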

26 How do we fit the model?
- If the number of unique failure (follow-up) times $t_1, \ldots, t_J$ is reasonable in size with many ties, we can use ordinary logistic regression with standard software.
- If the number of unique failure (follow-up) times $t_1, \ldots, t_J$ is large with few ties, we can use the Cox PH model with the "exact" ties option.
- What if we have both many ties and a large number of unique failure times? It is usually best to go with the logistic model in most packages.

27 Survey data considering factors that affect high school graduation, available on N = 1,691 9th grade students enrolled in a single school district
Covariates available:
1. Race (White, Black, Hispanic, other)
2. Gender
3. Family income (low, medium, high; lowest and highest 20%)
4. Parents' education (no HS grad, HS grad, some college, college grad)
Goal: Estimate the effect of these covariates on the cumulative probability of HS dropout (we'll focus on mom's education as an example)

28 A brief look at the data...

    > ##
    > ##### Read in HS graduation data
    > ##
    > hsgrad <- read.table( " /hsgrad_comp.txt", header=TRUE )
    > nsubjects <- nrow( hsgrad )
    > nsubjects
    [1] 1691
    > hsgrad[1:5,]

[Output columns: id, race, male, mom.ed, dad.ed, inc, graduate, maxgrade; the values were not preserved in this transcript.]

    > table( hsgrad$maxgrade )

[Output not preserved in this transcript.]

29 Set the data up for a CRM fit

    > ###
    > ### Construct CRM data...
    > ###
    > #
    > ##### STEP (1) construct the pairs (Y,H)
    > #
    > hsgrad$maxgrade <- hsgrad$maxgrade - 7
    > ncuts <- max(hsgrad$maxgrade) - 1
    > print( paste("ncuts =", ncuts) )
    [1] "ncuts = 4"
    > y.crm <- NULL
    > h.crm <- NULL
    > id <- NULL
    > for( j in 1:nsubjects ){
    +   yj <- rep( 0, ncuts )
    +   if( hsgrad$maxgrade[j] <= ncuts ) yj[ hsgrad$maxgrade[j] ] <- 1
    +   hj <- 1 - c(0,cumsum(yj)[1:(ncuts-1)])
    +   y.crm <- c( y.crm, yj )
    +   h.crm <- c( h.crm, hj )
    +   id <- c( id, rep(j,ncuts) )
    + }

30 Set the data up for a CRM fit (continued)

    > ###
    > ### Construct CRM data...
    > ###
    > #
    > ##### STEP (2) construct the intercepts
    > #
    > level <- factor( rep(1:ncuts, nsubjects ), levels=c(1:ncuts),
    +                  labels=paste(": ",(1:ncuts)+7) )
    > int.mat <- NULL
    > for( j in 1:ncuts ){
    +   intj <- rep( 0, ncuts )
    +   intj[ j ] <- 1
    +   int.mat <- cbind( int.mat, rep( intj, nsubjects) )
    + }
    > dimnames(int.mat) <- list( NULL, paste("int",c(1:ncuts),sep="") )

31 Set the data up for a CRM fit (continued)

    > #
    > ##### STEP (3) expand the X's
    > #
    > race <- rep( hsgrad$race, rep(ncuts,nsubjects) )
    > male <- rep( hsgrad$male, rep(ncuts,nsubjects) )
    > mom.ed <- rep( hsgrad$mom.ed, rep(ncuts,nsubjects) )
    > dad.ed <- rep( hsgrad$dad.ed, rep(ncuts,nsubjects) )
    > inc <- rep( hsgrad$inc, rep(ncuts,nsubjects) )
    > #
    > print( cbind( id, y.crm, h.crm, level, int.mat, mom.ed )[1:22,] )

[Output columns: id, y.crm, h.crm, level, Int1, Int2, Int3, Int4, mom.ed; the values were not preserved in this transcript.]

32 Set the data up for a CRM fit (continued)

    > #
    > ##### STEP (4) drop the H=0 and build dataframe
    > #
    > keep <- h.crm==1
    > hsgrad.crm.data <- data.frame(
    +     id = id[keep],
    +     y = y.crm[keep],
    +     level = level[keep],
    +     race = race[keep],
    +     male = male[keep],
    +     mom.ed = mom.ed[keep],
    +     dad.ed = dad.ed[keep],
    +     inc = inc[keep] )
    > print( hsgrad.crm.data[1:15,] )

[Output columns: id, y, level, race, male, mom.ed, dad.ed, inc; the values were not preserved in this transcript.]
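Steps (1) to (4) amount to keeping, for each student, one row per grade level at which they were still at risk, with y = 1 only at the level where dropout occurred. A compact vectorized sketch of the same construction is shown below on four hypothetical students; it is an alternative illustration rather than the lecture's code, and `n.rows` and `crm.toy` are made-up names.

```r
# Hypothetical shifted maxgrade values (1 = grade 8, ..., 5 = grade 12 completed).
maxgrade <- c(1, 3, 5, 4)
ncuts <- 4

# A student is at risk at levels 1, ..., min(maxgrade, ncuts).
n.rows <- pmin(maxgrade, ncuts)
id    <- rep(seq_along(maxgrade), times = n.rows)
level <- sequence(n.rows)

# y = 1 only at the level equal to maxgrade; students with maxgrade > ncuts
# never drop out, so every one of their rows has y = 0.
y <- as.integer(level == maxgrade[id])

crm.toy <- data.frame(id, level, y)
print(crm.toy)
```

Covariates can then be attached by indexing the original data with `id`, which plays the role of Step (3) above.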

33 First consider fitting separate logistic regressions by subsetting on individuals in each grade level
i.e., in 4 separate models, consider the odds of dropping out of school before completing grade j + 1 given that one has passed grade j

    > #
    > #####
    > ##### Fit separate logistic regressions for cumulative
    > ##### probability of dropout
    > #####
    > #
    > table( hsgrad.crm.data$level )

[Output: counts for levels ": 8", ": 9", ": 10" and ": 11"; the values were not preserved in this transcript.]

34 Separate logistic regressions by grade level (continued)

    > fit1 <- glm( y ~ factor( mom.ed ), family=binomial,
    +              subset=(as.integer(level)==1), data = hsgrad.crm.data )
    > summary( fit1 )

[Output: coefficient table (Estimate, Std. Error, z value, Pr(>|z|)) for the intercept and three factor(mom.ed) indicators; the numeric values were not preserved in this transcript.]

    > glmci( fit1 )

[Output: exp( Est ), ci95.lo, ci95.hi, z value and Pr(>|z|) for the same rows; one upper confidence limit is reported as Inf, and the remaining values were not preserved in this transcript.]

35 Separate logistic regressions by grade level (continued)

    > fit2 <- glm( y ~ factor( mom.ed ), family=binomial,
    +              subset=(as.integer(level)==2), data = hsgrad.crm.data )
    > glmci( fit2 )

[Output: exp( Est ), ci95.lo, ci95.hi, z value and Pr(>|z|) for the intercept and three factor(mom.ed) indicators; the numeric values were not preserved in this transcript.]

    > fit3 <- glm( y ~ factor( mom.ed ), family=binomial,
    +              subset=(as.integer(level)==3), data = hsgrad.crm.data )
    > glmci( fit3 )

[Output: same layout as for fit2; the numeric values were not preserved in this transcript.]

36 Separate logistic regressions by grade level (continued)

    > fit4 <- glm( y ~ factor( mom.ed ), family=binomial,
    +              subset=(as.integer(level)==4), data = hsgrad.crm.data )
    > glmci( fit4 )

[Output: exp( Est ), ci95.lo, ci95.hi, z value and Pr(>|z|) for the intercept and three factor(mom.ed) indicators; the numeric values were not preserved in this transcript.]

37 Now let's consider fitting the continuation ratio model, which simultaneously models all grades

    > #
    > #####
    > ##### Fit continuation ratio (logit) model
    > #####
    > #
    > fit5 <- glm( y ~ level + factor(mom.ed), family=binomial,
    +              data = hsgrad.crm.data )
    > glmci( fit5 )

[Output: exp( Est ), ci95.lo, ci95.hi, z value and Pr(>|z|) for the intercept, three level indicators and three factor(mom.ed) indicators; the numeric values were not preserved in this transcript.]

38 Interpretation of coefficients from the CRM model:

39 We can also test the assumption of a common (conditional) odds ratio across grade levels using an LRT
- That is, we wish to test whether the effect of mother's education varies by grade level
- This is called a test of the "proportional odds" assumption
- As with any diagnostic test, be careful of underpowered tests...

    > #
    > #####
    > ##### Test the null hypothesis of a single odds ratio across levels
    > #####
    > #
    > fit6 <- glm( y ~ level * factor( mom.ed ), family=binomial,
    +              data = hsgrad.crm.data )
    > anova( fit5, fit6 )
    Analysis of Deviance Table

    Model 1: y ~ level + factor(mom.ed)
    Model 2: y ~ level * factor(mom.ed)

[Output columns: Resid. Df, Resid. Dev, Df, Deviance; the values were not preserved in this transcript.]

40 Might be better to use my lrtest() function since it automatically computes the p-value...

    > #
    > #####
    > ##### Test the null hypothesis of a single odds ratio across levels
    > #####
    > lrtest( fit5, fit6 )
    Assumption: Model 1 nested within Model 2

[Output columns: Resid. Df, Resid. Dev, Df, Deviance, pvalue; the values were not preserved in this transcript.]
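The lrtest() function is the instructor's own and is not shown in this transcript. A likelihood ratio test for two nested GLMs can be sketched with base R alone: the hypothetical helper `lrt.pvalue()` below takes the difference in deviances and refers it to a chi-squared distribution, and `anova(..., test = "Chisq")` gives the same p-value. The demonstration uses R's built-in `mtcars` data only so that the example is self-contained.

```r
# Hypothetical helper: LRT for two nested GLMs (fit.small nested in fit.big).
lrt.pvalue <- function(fit.small, fit.big) {
  stat <- deviance(fit.small) - deviance(fit.big)        # LR statistic
  df   <- df.residual(fit.small) - df.residual(fit.big)  # extra parameters
  c(statistic = stat, df = df,
    pvalue = pchisq(stat, df = df, lower.tail = FALSE))
}

# Demonstration on built-in data (unrelated to the HS graduation example).
fit.a <- glm(am ~ wt,      data = mtcars, family = binomial)
fit.b <- glm(am ~ wt + hp, data = mtcars, family = binomial)

print(lrt.pvalue(fit.a, fit.b))
print(anova(fit.a, fit.b, test = "Chisq"))
```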

41 Conclusion from the above test:


More information

Proportional hazards regression

Proportional hazards regression Proportional hazards regression Patrick Breheny October 8 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/28 Introduction The model Solving for the MLE Inference Today we will begin discussing regression

More information

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n

Chapter 4 Fall Notations: t 1 < t 2 < < t D, D unique death times. d j = # deaths at t j = n. Y j = # at risk /alive at t j = n Bios 323: Applied Survival Analysis Qingxia (Cindy) Chen Chapter 4 Fall 2012 4.2 Estimators of the survival and cumulative hazard functions for RC data Suppose X is a continuous random failure time with

More information

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data Ronald Heck Class Notes: Week 8 1 Class Notes: Week 8 Probit versus Logit Link Functions and Count Data This week we ll take up a couple of issues. The first is working with a probit link function. While

More information

Single-level Models for Binary Responses

Single-level Models for Binary Responses Single-level Models for Binary Responses Distribution of Binary Data y i response for individual i (i = 1,..., n), coded 0 or 1 Denote by r the number in the sample with y = 1 Mean and variance E(y) =

More information

9 Generalized Linear Models

9 Generalized Linear Models 9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models

More information

Introducing Generalized Linear Models: Logistic Regression

Introducing Generalized Linear Models: Logistic Regression Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and

More information

Logistic Regressions. Stat 430

Logistic Regressions. Stat 430 Logistic Regressions Stat 430 Final Project Final Project is, again, team based You will decide on a project - only constraint is: you are supposed to use techniques for a solution that are related to

More information

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form:

Review: what is a linear model. Y = β 0 + β 1 X 1 + β 2 X 2 + A model of the following form: Outline for today What is a generalized linear model Linear predictors and link functions Example: fit a constant (the proportion) Analysis of deviance table Example: fit dose-response data using logistic

More information

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model

Other Survival Models. (1) Non-PH models. We briefly discussed the non-proportional hazards (non-ph) model Other Survival Models (1) Non-PH models We briefly discussed the non-proportional hazards (non-ph) model λ(t Z) = λ 0 (t) exp{β(t) Z}, where β(t) can be estimated by: piecewise constants (recall how);

More information

Generalized logit models for nominal multinomial responses. Local odds ratios

Generalized logit models for nominal multinomial responses. Local odds ratios Generalized logit models for nominal multinomial responses Categorical Data Analysis, Summer 2015 1/17 Local odds ratios Y 1 2 3 4 1 π 11 π 12 π 13 π 14 π 1+ X 2 π 21 π 22 π 23 π 24 π 2+ 3 π 31 π 32 π

More information

More Statistics tutorial at Logistic Regression and the new:

More Statistics tutorial at  Logistic Regression and the new: Logistic Regression and the new: Residual Logistic Regression 1 Outline 1. Logistic Regression 2. Confounding Variables 3. Controlling for Confounding Variables 4. Residual Linear Regression 5. Residual

More information

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014

LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R. Liang (Sally) Shan Nov. 4, 2014 LISA Short Course Series Generalized Linear Models (GLMs) & Categorical Data Analysis (CDA) in R Liang (Sally) Shan Nov. 4, 2014 L Laboratory for Interdisciplinary Statistical Analysis LISA helps VT researchers

More information

Truck prices - linear model? Truck prices - log transform of the response variable. Interpreting models with log transformation

Truck prices - linear model? Truck prices - log transform of the response variable. Interpreting models with log transformation Background Regression so far... Lecture 23 - Sta 111 Colin Rundel June 17, 2014 At this point we have covered: Simple linear regression Relationship between numerical response and a numerical or categorical

More information

Survival Analysis. STAT 526 Professor Olga Vitek

Survival Analysis. STAT 526 Professor Olga Vitek Survival Analysis STAT 526 Professor Olga Vitek May 4, 2011 9 Survival Data and Survival Functions Statistical analysis of time-to-event data Lifetime of machines and/or parts (called failure time analysis

More information

STAT 526 Spring Midterm 1. Wednesday February 2, 2011

STAT 526 Spring Midterm 1. Wednesday February 2, 2011 STAT 526 Spring 2011 Midterm 1 Wednesday February 2, 2011 Time: 2 hours Name (please print): Show all your work and calculations. Partial credit will be given for work that is partially correct. Points

More information

Multivariate Survival Analysis

Multivariate Survival Analysis Multivariate Survival Analysis Previously we have assumed that either (X i, δ i ) or (X i, δ i, Z i ), i = 1,..., n, are i.i.d.. This may not always be the case. Multivariate survival data can arise in

More information

Ph.D. course: Regression models. Introduction. 19 April 2012

Ph.D. course: Regression models. Introduction. 19 April 2012 Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 19 April 2012 www.biostat.ku.dk/~pka/regrmodels12 Per Kragh Andersen 1 Regression models The distribution of one outcome variable

More information

Joint Modeling of Longitudinal Item Response Data and Survival

Joint Modeling of Longitudinal Item Response Data and Survival Joint Modeling of Longitudinal Item Response Data and Survival Jean-Paul Fox University of Twente Department of Research Methodology, Measurement and Data Analysis Faculty of Behavioural Sciences Enschede,

More information

especially with continuous

especially with continuous Handling interactions in Stata, especially with continuous predictors Patrick Royston & Willi Sauerbrei UK Stata Users meeting, London, 13-14 September 2012 Interactions general concepts General idea of

More information

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key

Statistical Methods III Statistics 212. Problem Set 2 - Answer Key Statistical Methods III Statistics 212 Problem Set 2 - Answer Key 1. (Analysis to be turned in and discussed on Tuesday, April 24th) The data for this problem are taken from long-term followup of 1423

More information

Generalized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model

Generalized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles dose-response example

More information

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data

Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Dynamic Prediction of Disease Progression Using Longitudinal Biomarker Data Xuelin Huang Department of Biostatistics M. D. Anderson Cancer Center The University of Texas Joint Work with Jing Ning, Sangbum

More information

Chapter 20: Logistic regression for binary response variables

Chapter 20: Logistic regression for binary response variables Chapter 20: Logistic regression for binary response variables In 1846, the Donner and Reed families left Illinois for California by covered wagon (87 people, 20 wagons). They attempted a new and untried

More information

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status

Ph.D. course: Regression models. Regression models. Explanatory variables. Example 1.1: Body mass index and vitamin D status Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 25 April 2013 www.biostat.ku.dk/~pka/regrmodels13 Per Kragh Andersen Regression models The distribution of one outcome variable

More information

MODULE 6 LOGISTIC REGRESSION. Module Objectives:

MODULE 6 LOGISTIC REGRESSION. Module Objectives: MODULE 6 LOGISTIC REGRESSION Module Objectives: 1. 147 6.1. LOGIT TRANSFORMATION MODULE 6. LOGISTIC REGRESSION Logistic regression models are used when a researcher is investigating the relationship between

More information

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy

Logistic Regression. Some slides from Craig Burkett. STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Logistic Regression Some slides from Craig Burkett STA303/STA1002: Methods of Data Analysis II, Summer 2016 Michael Guerzhoy Titanic Survival Case Study The RMS Titanic A British passenger liner Collided

More information

Introduction to the Analysis of Tabular Data

Introduction to the Analysis of Tabular Data Introduction to the Analysis of Tabular Data Anthropological Sciences 192/292 Data Analysis in the Anthropological Sciences James Holland Jones & Ian G. Robertson March 15, 2006 1 Tabular Data Is there

More information

Duration of Unemployment - Analysis of Deviance Table for Nested Models

Duration of Unemployment - Analysis of Deviance Table for Nested Models Duration of Unemployment - Analysis of Deviance Table for Nested Models February 8, 2012 The data unemployment is included as a contingency table. The response is the duration of unemployment, gender and

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;

More information

Binary Logistic Regression

Binary Logistic Regression The coefficients of the multiple regression model are estimated using sample data with k independent variables Estimated (or predicted) value of Y Estimated intercept Estimated slope coefficients Ŷ = b

More information

Categorical and Zero Inflated Growth Models

Categorical and Zero Inflated Growth Models Categorical and Zero Inflated Growth Models Alan C. Acock* Summer, 2009 *Alan C. Acock, Department of Human Development and Family Sciences, Oregon State University, Corvallis OR 97331 (alan.acock@oregonstate.edu).

More information

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis

CIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis CIMAT Taller de Modelos de Capture y Recaptura 2010 Known Fate urvival Analysis B D BALANCE MODEL implest population model N = λ t+ 1 N t Deeper understanding of dynamics can be gained by identifying variation

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3

STA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3 STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae

More information

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics

Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Residuals for the

More information

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen

Definitions and examples Simple estimation and testing Regression models Goodness of fit for the Cox model. Recap of Part 1. Per Kragh Andersen Recap of Part 1 Per Kragh Andersen Section of Biostatistics, University of Copenhagen DSBS Course Survival Analysis in Clinical Trials January 2018 1 / 65 Overview Definitions and examples Simple estimation

More information

STAT 7030: Categorical Data Analysis

STAT 7030: Categorical Data Analysis STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012

More information

Sociology 362 Data Exercise 6 Logistic Regression 2

Sociology 362 Data Exercise 6 Logistic Regression 2 Sociology 362 Data Exercise 6 Logistic Regression 2 The questions below refer to the data and output beginning on the next page. Although the raw data are given there, you do not have to do any Stata runs

More information

Introduction to General and Generalized Linear Models

Introduction to General and Generalized Linear Models Introduction to General and Generalized Linear Models Generalized Linear Models - part III Henrik Madsen Poul Thyregod Informatics and Mathematical Modelling Technical University of Denmark DK-2800 Kgs.

More information

12 Modelling Binomial Response Data

12 Modelling Binomial Response Data c 2005, Anthony C. Brooms Statistical Modelling and Data Analysis 12 Modelling Binomial Response Data 12.1 Examples of Binary Response Data Binary response data arise when an observation on an individual

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Generalised linear models. Response variable can take a number of different formats

Generalised linear models. Response variable can take a number of different formats Generalised linear models Response variable can take a number of different formats Structure Limitations of linear models and GLM theory GLM for count data GLM for presence \ absence data GLM for proportion

More information

Multiple Regression: Chapter 13. July 24, 2015

Multiple Regression: Chapter 13. July 24, 2015 Multiple Regression: Chapter 13 July 24, 2015 Multiple Regression (MR) Response Variable: Y - only one response variable (quantitative) Several Predictor Variables: X 1, X 2, X 3,..., X p (p = # predictors)

More information

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

More information

Generalized linear models

Generalized linear models Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models

More information

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression

22s:152 Applied Linear Regression. Example: Study on lead levels in children. Ch. 14 (sec. 1) and Ch. 15 (sec. 1 & 4): Logistic Regression 22s:52 Applied Linear Regression Ch. 4 (sec. and Ch. 5 (sec. & 4: Logistic Regression Logistic Regression When the response variable is a binary variable, such as 0 or live or die fail or succeed then

More information

R Hints for Chapter 10

R Hints for Chapter 10 R Hints for Chapter 10 The multiple logistic regression model assumes that the success probability p for a binomial random variable depends on independent variables or design variables x 1, x 2,, x k.

More information

The coxvc_1-1-1 package

The coxvc_1-1-1 package Appendix A The coxvc_1-1-1 package A.1 Introduction The coxvc_1-1-1 package is a set of functions for survival analysis that run under R2.1.1 [81]. This package contains a set of routines to fit Cox models

More information

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies.

11 November 2011 Department of Biostatistics, University of Copengen. 9:15 10:00 Recap of case-control studies. Frequency-matched studies. Matched and nested case-control studies Bendix Carstensen Steno Diabetes Center, Gentofte, Denmark http://staff.pubhealth.ku.dk/~bxc/ Department of Biostatistics, University of Copengen 11 November 2011

More information

A comparison of 5 software implementations of mediation analysis

A comparison of 5 software implementations of mediation analysis Faculty of Health Sciences A comparison of 5 software implementations of mediation analysis Liis Starkopf, Thomas A. Gerds, Theis Lange Section of Biostatistics, University of Copenhagen Illustrative example

More information

The influence of categorising survival time on parameter estimates in a Cox model

The influence of categorising survival time on parameter estimates in a Cox model The influence of categorising survival time on parameter estimates in a Cox model Anika Buchholz 1,2, Willi Sauerbrei 2, Patrick Royston 3 1 Freiburger Zentrum für Datenanalyse und Modellbildung, Albert-Ludwigs-Universität

More information

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials

Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Two-stage Adaptive Randomization for Delayed Response in Clinical Trials Guosheng Yin Department of Statistics and Actuarial Science The University of Hong Kong Joint work with J. Xu PSI and RSS Journal

More information