ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox's regression analysis. Time-dependent explanatory variables
1 ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox's regression analysis. Time-dependent explanatory variables. Henrik Ravn, Bandim Health Project, Statens Serum Institut, 4 November. 1 / 53
2 Outline
  Survival Data
  Example: Malignant Melanoma Data
  The Cox Model
  Cox in SAS
  Choice of Time-Scale
  Example: Guinea-Bissau Data
  Delayed entries
  Time-dependent explanatory variables
3
$$\prod_{i=1}^{d} \frac{\exp(\beta X_i)}{\sum_{j \in R(t_i)} \exp(\beta X_j)}$$
4 Survival Data
Time to death or other event of interest. One time-scale, including a well-defined starting time (time-origin):
  Time from start of a randomized clinical trial to death.
  Time from first employment to pension.
  Time from the filling of a tooth until the filling falls out.
What is special about survival data?
  Right-skewed: no problem.
  CENSORING: for some individuals we only know a lower bound of the lifetime.
5 Simple data
[Figure: follow-up time for each individual; axes: Individual, Times (months).]
6 Survival and hazard function
Let $T$ be the TIME to the event of interest:
  $S(t) = P(T > t)$ = probability of survival to time $t$ after entry at time 0
  $\lambda(t)$ = incidence, rate, or hazard
Relationship:
$$S(t) = \exp\left(-\int_0^t \lambda(s)\,ds\right) = \exp(-\Lambda(t))$$
$\Lambda(t)$ is called the integrated hazard function.
7 Constant hazard: $\lambda(t) = \lambda$, $S(t) = e^{-\lambda t}$, $\Lambda(t) = \lambda t$.
[Figure: three panels plotted against Time (t): Hazard rate, Survival Function, Integrated hazard.]
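The relation S(t) = exp(-Λ(t)) can be checked numerically for the constant-hazard case. The following is a minimal Python sketch, not part of the original slides; the rate `lam` and the 12-month horizon are hypothetical values chosen only for illustration, and the integral is approximated with a simple midpoint rule.

```python
import math

def cumulative_hazard(hazard, t, steps=10_000):
    """Approximate Lambda(t), the integral of hazard(s) over [0, t], by the midpoint rule."""
    h = t / steps
    return sum(hazard((k + 0.5) * h) for k in range(steps)) * h

def survival(hazard, t):
    """S(t) = exp(-Lambda(t))."""
    return math.exp(-cumulative_hazard(hazard, t))

lam = 0.1  # hypothetical constant rate per month (assumption for illustration)
s_numeric = survival(lambda s: lam, 12.0)  # survival via numerical integration
s_closed = math.exp(-lam * 12.0)           # closed form e^(-lambda * t)
```

Any other hazard function of time can be passed in place of the constant one, in which case only the numerical route is available.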
8 Kaplan-Meier estimate of survival function
Death times $t_1, \dots, t_d$ (ordered). $Y(t_i)$ = number alive just before $t_i$.
$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{1}{Y(t_i)}\right)$$
[Figure: Risk sets; axes: Individual, Times (months).]
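As an illustration of the Kaplan-Meier product, here is a small Python sketch (an addition, not from the slides). The toy follow-up times and event indicators are hypothetical. The function uses the factor 1 - d/Y(t), where d is the number of deaths at t; with no tied death times this reduces to the 1 - 1/Y(t) factor on the slide.

```python
def kaplan_meier(times, events):
    """Return [(t_i, S(t_i))] at each distinct death time.

    times:  follow-up time per individual
    events: 1 = death observed, 0 = censored
    """
    data = list(zip(times, events))
    death_times = sorted({t for t, e in data if e == 1})
    surv = 1.0
    curve = []
    for t in death_times:
        at_risk = sum(1 for u, _ in data if u >= t)            # Y(t): alive just before t
        deaths = sum(1 for u, e in data if u == t and e == 1)  # deaths at t
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve

# Hypothetical toy data (months); individuals 2 and 5 are censored
times = [2, 4, 5, 7, 9]
events = [1, 0, 1, 1, 0]
curve = kaplan_meier(times, events)
```

With these toy data the risk sets at the three death times contain 5, 3 and 2 individuals, so the estimate drops by factors 4/5, 2/3 and 1/2.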
9 [Figure: Kaplan-Meier survival estimate; axes: Time (months), Survival probability; number at risk shown below the plot.]
10 Malignant Melanoma Data
In the study period a total of 205 patients had their tumor removed and were followed up. At the end of 1977:
  57 had died of malignant melanoma (status=1)
  134 were still alive (status=2)
  14 had died of causes not related to malignant melanoma (status=3): competing risk
Purpose: study the effect on survival of sex, age, thickness of tumor, ulceration, etc.
11 Malignant melanoma
[Data listing with columns: N, time, status, sex, age, year, thickness, ulcer. Values missing.]
12 The Cox Model
The Cox model assumes that the rate for the $i$th individual is
$$\lambda_i(t) = \lambda_0(t) \exp(\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip})$$
where $\beta_1, \beta_2, \dots, \beta_p$ are regression parameters, $X_{i1}$ is the value of covariate 1 for individual $i$, etc. Finally, $\lambda_0(t)$ is the baseline hazard.
Time $t$ is the time-scale of choice, e.g. age, time since randomization, or time since operation.
As formulated here, the only quantity on the right-hand side that depends on time is the baseline hazard $\lambda_0(t)$.
If all covariates (the $X$s) are zero we get $\lambda_i(t) = \lambda_0(t)$. The baseline hazard is thus interpreted as the hazard of an individual who has all covariates equal to zero.
13 The Cox model
$$\lambda_i(t) = \lambda_0(t) \exp(\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip})$$
can also be written on the log-scale (natural log):
$$\log(\lambda_i(t)) = \log\big(\lambda_0(t) \exp(\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip})\big) = \log(\lambda_0(t)) + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip}.$$
The Cox model assumes that
  the effects of covariates are additive and linear on the log rate scale, just as in Poisson regression;
  the CORNER, i.e. the baseline hazard, is non-parametric and depends on time, so time is adjusted for.
We now turn to the interpretation of the regression parameters $\beta_1, \beta_2, \dots, \beta_p$.
14 One binary covariate
To keep things simple we study the effect of a single binary covariate, e.g. sex, on the risk of dying:
$$X_i = \begin{cases} 0 & \text{if individual } i \text{ is a female} \\ 1 & \text{if individual } i \text{ is a male} \end{cases}$$
The Cox model is $\lambda_i(t) = \lambda_0(t) \exp(\beta X_i)$. With $X_i$ defined as above we get
$$\lambda_i(t) = \begin{cases} \lambda_0(t) & \text{if individual } i \text{ is a female} \\ \lambda_0(t) \exp(\beta) & \text{if individual } i \text{ is a male} \end{cases}$$
15 Mortality Rate Ratio / Hazard Ratio
If
$$\lambda_i(t) = \begin{cases} \lambda_0(t) & \text{if individual } i \text{ is a female} \\ \lambda_0(t) \exp(\beta) & \text{if individual } i \text{ is a male} \end{cases}$$
then the RATE RATIO (RR) between males and females is
$$RR = \frac{\lambda_0(t) \exp(\beta)}{\lambda_0(t)} = \exp(\beta).$$
Importantly, the ratio is independent of time, i.e. we have PROPORTIONAL HAZARDS over time. The Cox model is also called the proportional hazards model.
How to estimate $\beta$? And what about the baseline hazard $\lambda_0(t)$?
16 Likelihood Function
The baseline hazard is regarded as a nuisance and is in general not estimated, although it is possible to do so.
Let $t_1, \dots, t_d$ be the ordered death times. It can be shown that all we need is to find the $\beta$ that maximizes the following function, called Cox's partial likelihood function:
$$L(\beta) = \prod_{i=1}^{d} \frac{\exp(\beta X_i)}{\sum_{j \in R(t_i)} \exp(\beta X_j)}$$
where $R(t_i)$ is the RISK SET at death time $t_i$, i.e. the set of individuals at risk of dying (under observation) just before time $t_i$.
The resulting estimate $\hat{\beta}$ is called the MAXIMUM LIKELIHOOD ESTIMATE of $\beta$.
17 Likelihood Function: a closer look
Death times $t_1, \dots, t_d$, numbering individuals with deaths first:
  $i = 1, 2, \dots, d, d+1, \dots, n$
with times and covariates
  $t_1, t_2, \dots, t_d, t_{d+1}, \dots, t_n$
  $X_1, X_2, \dots, X_d, X_{d+1}, \dots, X_n$
At each death time we have the RISK SET, the individuals alive and at risk of dying just before the death time:
  $R(t_1), R(t_2), \dots, R(t_d)$
18 Risk sets
[Figure: follow-up time for each individual with the risk sets marked; axes: Individual, Times (months).]
19 For the Cox model $\lambda_i(t) = \lambda_0(t) \exp(\beta X_i)$ we use the Cox likelihood function to estimate $\beta$:
$$L(\beta) = \prod_{i=1}^{d} \frac{\exp(\beta X_i)}{\sum_{j \in R(t_i)} \exp(\beta X_j)} = \frac{\exp(\beta X_1)}{\sum_{j \in R(t_1)} \exp(\beta X_j)} \times \frac{\exp(\beta X_2)}{\sum_{j \in R(t_2)} \exp(\beta X_j)} \times \cdots \times \frac{\exp(\beta X_d)}{\sum_{j \in R(t_d)} \exp(\beta X_j)}$$
We index individuals in the risk sets using the letter $j$. Writing $\sum_{j \in R(t_1)} \exp(\beta X_j)$ means summing over the individuals in the risk set for death time $t_1$.
If we assume that no one was censored before the first death time, all individuals are in the risk set $R(t_1)$ and the sum is
$$\exp(\beta X_1) + \exp(\beta X_2) + \cdots + \exp(\beta X_n).$$
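To make the product over death times and the sum over risk sets concrete, the following Python sketch (an addition, not from the slides) evaluates the log of the partial likelihood for a single covariate. It assumes no tied death times and no delayed entry, and the toy data are hypothetical.

```python
import math

def log_partial_likelihood(beta, times, events, x):
    """Log of Cox's partial likelihood for one covariate (no ties, no delayed entry)."""
    ll = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i != 1:
            continue  # censored individuals only contribute through the risk sets
        # Risk set R(t_i): everyone still under observation just before t_i
        risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
        denom = sum(math.exp(beta * x[j]) for j in risk)
        ll += beta * x[i] - math.log(denom)
    return ll

# Hypothetical toy data: x = 1 for males, 0 for females
times = [2, 4, 5, 7, 9]
events = [1, 0, 1, 1, 0]
x = [1, 0, 1, 0, 0]
ll_at_zero = log_partial_likelihood(0.0, times, events, x)
ll_at_log2 = log_partial_likelihood(math.log(2.0), times, events, x)
```

At beta = 0 every term in a risk-set sum equals 1, so each factor is simply 1 divided by the size of the risk set (here 5, 3 and 2). An estimate of beta would be obtained by maximizing this function, which is what the SAS procedures used in the slides do internally.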
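20 For example, for the Cox model $\lambda_i(t) = \lambda_0(t) \exp(\beta \cdot \text{sex})$, with sex coded 1 = male, 0 = female, the likelihood function has the form
$$\frac{\exp(\beta)}{\sum_{j \in R(t_1)} \exp(\beta X_j)} \times \frac{1}{\sum_{j \in R(t_2)} \exp(\beta X_j)} \times \cdots \times \frac{\exp(\beta)}{\sum_{j \in R(t_d)} \exp(\beta X_j)}.$$
The numerator is $\exp(\beta)$ when the individual dying is a male and 1 when the individual dying is a female.
If we again assume that no one was censored before the first death time, all individuals are in the risk set $R(t_1)$ and the sum is
$$\exp(\beta) + \cdots + \exp(\beta) + 1 + \cdots + 1 = N_M \exp(\beta) + N_F,$$
where $N_M$ and $N_F$ are the numbers of males and females, respectively, in $R(t_1)$.
The risk sets also play a crucial role in nested case-control studies; more on this later in the course.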
21 So far the following assumptions have been made for the Cox model:
  The baseline hazard is assumed non-parametric, i.e. assumed to vary freely.
  The effects of covariates are additive and linear on the log rate scale.
  The ratio of the hazard rates for two subjects is constant over time. In other words, there is no interaction between the covariates and the time variable.
Let us look at the Melanoma data using SAS.
22 [Figure: Kaplan-Meier survival estimates, by sex (female, male); x-axis: Time (years).]
What is the estimate of the RR between males and females?
23 Cox in SAS
In SAS, proc phreg and proc tphreg can be used for estimation in the Cox model. We will use proc tphreg, as this procedure handles categorical variables much more easily than proc phreg.
Using proc tphreg we define the variable sex to be categorical with the class statement. For the variable sex, 1 is males and 0 is females.

  proc tphreg data=melanom;
    class sex;
    model time*status(2,3) = sex;
  run;

Please note that we have two censoring codes, namely 2 and 3.
NB: In SAS 9.2, proc phreg handles class variables and proc tphreg is obsolete.
24 Part of output from proc tphreg:
[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio. Row: sex. Values missing.]
The column Parameter Estimate is $\hat{\beta}$.
For a class variable SAS automatically chooses the highest number (here 1) as the reference. Thus, the rate ratio or hazard ratio is for females compared to males.
There is no estimate statement in proc (t)phreg, but a similar so-called contrast statement exists. Instead we can use the ref option in the class statement.
Note also the option risklimits in the model statement, which calculates the confidence interval for the hazard ratio.
25
  proc tphreg data=melanom;
    class sex(ref="0");
    model time*status(2,3) = sex / risklimits;
  run;

[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Row: sex. Values missing.]
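The link between the parameter estimate, the hazard ratio and its confidence limits can be sketched in a few lines of Python (an addition, not from the slides). The sketch assumes the usual Wald construction exp(estimate ± 1.96 × standard error); the estimate and standard error below are hypothetical and are NOT the melanoma results, which are missing from this transcript.

```python
import math

def hazard_ratio_ci(beta_hat, se, z=1.96):
    """Wald-type hazard ratio and confidence limits from a log-hazard-ratio estimate."""
    hr = math.exp(beta_hat)
    lower = math.exp(beta_hat - z * se)
    upper = math.exp(beta_hat + z * se)
    return hr, lower, upper

def flip_reference(beta_hat):
    """Swapping the reference level of a binary covariate changes the sign of beta."""
    return -beta_hat

# Hypothetical numbers for illustration only
beta_females_vs_males = -0.5
se = 0.2
hr_f, lo_f, hi_f = hazard_ratio_ci(beta_females_vs_males, se)
hr_m, lo_m, hi_m = hazard_ratio_ci(flip_reference(beta_females_vs_males), se)
```

Changing the reference level, as done with the ref option, therefore inverts the hazard ratio and swaps (and inverts) the confidence limits; the standard error is unchanged.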
26 Melanoma data, thickness of tumor given by the variable gtyk:
  gtyk = 1 if < 2 mm
  gtyk = 2 if 2-5 mm
  gtyk = 3 if > 5 mm

  proc tphreg data=melanom;
    class gtyk;
    model time*status(2,3) = gtyk / risklimits;
  run;

[Output, Type 3 Tests. Columns: Effect, DF, Wald Chi-Square, Pr > ChiSq. Row: gtyk, Pr > ChiSq <.0001. Other values missing.]
[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Rows: two gtyk levels. Values missing.]
27 Melanoma data, adding age in years

  proc tphreg data=melanom;
    class gtyk sex;
    model time*status(2,3) = gtyk sex age / risklimits;
  run;

[Output, Type 3 Tests. Columns: Effect, DF, Wald Chi-Square, Pr > ChiSq. Rows: sex, gtyk (Pr > ChiSq <.0001), age. Other values missing.]
[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Rows: sex, two gtyk levels, age. Values missing.]
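28 Likelihood Ratio Test

  proc tphreg data=melanom;
    class gtyk sex;
    model time*status(2,3) = gtyk sex;
  run;

[Output, Model Fit Statistics. Columns: Criterion, Without Covariates, With Covariates. Rows: -2 LOG L, AIC, SBC. Values missing.]

  proc tphreg data=melanom;
    class sex;
    model time*status(2,3) = sex;
  run;

[Output, Model Fit Statistics. Columns: Criterion, Without Covariates, With Covariates. Rows: -2 LOG L, AIC, SBC. Values missing.]

The test statistic is the difference in -2 LOG L (with covariates) between the model with sex only and the model with gtyk and sex:
$$LR = (-2 \log L)_{\text{sex}} - (-2 \log L)_{\text{gtyk, sex}} = 28.0 \sim \chi^2_2 \quad \text{(2 degrees of freedom)}$$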
29 SAS: p-value from chi-square test

  data temp;
    chisquare=28;
    df=2;
    p=1-probchi(chisquare,df);
  run;
  proc print data=temp;
  run;

[Output columns: Obs, chisquare, df, p. Values missing.]
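The same tail probability can be obtained without SAS. The Python sketch below (an addition, not from the slides) uses the closed form of the chi-square upper tail that holds for even degrees of freedom, which covers the df = 2 case used here; for df = 2 it reduces to exp(-x/2). The function name is hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), valid for even df only.

    Uses exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

p = chi2_upper_tail_even_df(28.0, 2)  # likelihood ratio statistic 28.0 on 2 df
```

For the likelihood ratio statistic of 28.0 on 2 degrees of freedom the p-value is exp(-14), i.e. below 0.000001, so the thickness variable clearly improves the model.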
30 Choice of Time-Scale
A study may be conducted over calendar time even though the natural time-scale is time since treatment (the Melanoma study).
Cohort studies are often conducted by recruiting a random sample of the population at the start of the study and then following these subjects for a number of years (Framingham).
A natural time-scale may be age rather than time in study, which most often is an artificial time-scale constructed by the investigators.
What would the time-origin be if age was chosen as the time-scale?
31 Vaccinations in Guinea-Bissau
Rural Guinea-Bissau: 5274 children under 7 months of age were visited twice at home, with an interval of six months.
Information about vaccination (BCG, DTP, measles vaccine) was collected at each visit, and at the second visit deaths during follow-up were registered.
Some children moved away during follow-up (censored) or survived until the next visit (also censored).
Below are some of the variable names from the bissau data.
  fuptime: Follow-up time in days
  dead: 0 = censored, 1 = dead
  bcg: 1 = Yes, 2 = No
  agem: Age at first visit in months
32 Is the risk of dying associated with vaccination?

  Exposure              Died (Outcome)   Survived   Total
  BCG vaccinated        125 (3.8%)
  not BCG vaccinated     97 (4.9%)
  Total                 222 (4.2%)

[Survived and Total counts missing.]
33
  proc tphreg data=bissau;
    class bcg;
    model fuptime*dead(0)=bcg / rl;
  run;

[Output, Testing Global Null Hypothesis: BETA=0. Columns: Test, Chi-Square, DF, Pr > ChiSq. Rows: Likelihood Ratio, Score, Wald. Values missing.]
[Output, Type 3 Tests. Columns: Effect, DF, Wald Chi-Square, Pr > ChiSq. Row: bcg. Values missing.]
[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Row: bcg. Values missing.]
34
  proc tphreg data=bissau;
    class bcg agem;
    model fuptime*dead(0)=bcg agem / rl;
  run;

[Output, Type 3 Tests. Columns: Effect, DF, Wald Chi-Square, Pr > ChiSq. Rows: bcg, agem. Values missing.]
[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Rows: bcg and six agem levels. Values missing.]
35 Delayed entries
[Figure: two panels. "Time in study": Individual against Times (months). "Age as time": Individual against Age (months).]
36 Subjects are only at risk from the age of entry and onwards. They are not at risk in our "world of analysis" before the age of entry!
Delayed entries are easily handled by careful control of the RISK SET $R(t_i)$ at death time $t_i$ in the likelihood function:
$$L(\beta) = \prod_{i=1}^{d} \frac{\exp(\beta X_i)}{\sum_{j \in R(t_i)} \exp(\beta X_j)}$$
Only individuals at risk and under observation are included in the risk set $R(t_i)$ at time $t_i$.
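A risk set with delayed entry can be written down directly. The Python sketch below (an addition, not from the slides) assumes the convention that an individual observed on the interval (entry, exit] is at risk at time t when entry < t <= exit; the ages are hypothetical.

```python
def risk_set(t, entry, exit_):
    """Indices of individuals under observation just before time t.

    An individual with observation interval (entry, exit] is at risk at t
    when entry < t <= exit (convention assumed for this sketch).
    """
    return [j for j, (a, b) in enumerate(zip(entry, exit_)) if a < t <= b]

# Hypothetical ages in months at entry and at exit (death or censoring)
entry = [0, 3, 5]
exit_ = [6, 10, 8]

at_4 = risk_set(4, entry, exit_)  # the third child has not yet entered
at_6 = risk_set(6, entry, exit_)  # all three are under observation
at_7 = risk_set(7, entry, exit_)  # the first child has left at age 6
```

Using these risk sets in the partial likelihood is all that is needed to analyse the data with age as the time-scale.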
37 Delayed entries in SAS

  data bissau2;
    set bissau;
    outage=age+fuptime;
  run;

  proc tphreg data=bissau2;
    class bcg;
    model (age,outage)*dead(0)= bcg / rl;
  run;

[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Row: bcg. Values missing.]
38 Time-dependent explanatory variables
The Cox model can be expanded to include time-varying covariates:
$$\lambda_i(t) = \lambda_0(t) \exp(\beta X_i(t)).$$
The likelihood function for death times $t_1, \dots, t_d$ becomes
$$L(\beta) = \prod_{i=1}^{d} \frac{\exp(\beta X_i(t_i))}{\sum_{j \in R(t_i)} \exp(\beta X_j(t_i))}.$$
From this we can see that we only need to know the values of the covariates at the death times:
$$X_i(t_1), X_i(t_2), \dots, X_i(t_d).$$
The covariate values at any time other than a death time are not used in the likelihood function.
39 The simplest time-varying covariate is a binary variable that is allowed to change once during follow-up, e.g. new BCG vaccinations registered between visits in the Bissau data:
$$X_i(t) = \begin{cases} 0 & \text{if no BCG before time } t \\ 1 & \text{if BCG-time} \le t \end{cases}$$
40 A child being BCG-vaccinated after 3 months of follow-up.
[Figure: BCG status (0/1) against Follow up (months).]
The time-varying covariate is 0 in the time interval 0 to 3 months and 1 for the rest of follow-up.
For a child who was BCG vaccinated before the first visit, the time-varying covariate is 1 during the whole follow-up.
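41 Multi-state Model
[Diagram: three states, 0 = Unexposed, 1 = Exposed, 2 = Dead, with transition rates $\lambda_{01}(t)$ (0 to 1), $\lambda_{02}(t)$ (0 to 2) and $\lambda_{12}(t)$ (1 to 2).]
We want to compare $\lambda_{02}(t)$ and $\lambda_{12}(t)$. The transition $\lambda_{01}(t)$ is not modeled here.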
42 Instead of time of follow-up we will use age as the time-scale to illustrate the use of BCG as a time-varying covariate in the Bissau data. At visit 2 the vaccination cards of the children at home were seen, and the age at BCG vaccination (bcgage) was calculated:
[Data listing with columns: id, fuptime, dead, age, bcg, bcgage, outage. Values missing.]
43 Binary time-varying covariate in SAS (I)

  proc tphreg data=bcg;
    if .<bcgage<outage then bcg_t=1; else bcg_t=0;
    model (age,outage)*dead(0)=bcg_t / rl;
  run;

[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Row: bcg_t. Values missing.]
44 The if-statement

  if .<bcgage<outage then bcg_t=1; else bcg_t=0;

is recalculated at each death time. The outage in the if-statement refers to the death time currently being evaluated (i.e. a $t_i$ in the likelihood).
For the first death time, which is $t_1 = 23$ days of age, the if-statement becomes

  if .<bcgage<23 then bcg_t=1; else bcg_t=0;

and is calculated for all children at risk at age 23 days (those in $R(t_1 = 23)$) with their individual bcgage values.
This is a recalculation of the time-varying covariate at each death time, cf. the likelihood function.
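The recalculation at each death time can be mimicked in Python (an addition, not from the slides). The function `bcg_at` plays the role of the if-statement, with `None` standing in for a missing bcgage; the risk sets use the delayed-entry convention entry < t <= exit. The four children and their ages in days are hypothetical.

```python
import math

def bcg_at(bcgage, t):
    """1 if vaccinated strictly before time t, else 0 (None = no vaccination recorded)."""
    return 1 if bcgage is not None and bcgage < t else 0

def log_partial_likelihood_tv(beta, entry, exit_, dead, bcgage):
    """Log partial likelihood with delayed entry and a binary time-varying covariate."""
    ll = 0.0
    n = len(exit_)
    for i in range(n):
        if dead[i] != 1:
            continue
        t = exit_[i]  # death time currently being evaluated
        risk = [j for j in range(n) if entry[j] < t <= exit_[j]]
        # Covariate values are recomputed for everyone at risk at this death time
        denom = sum(math.exp(beta * bcg_at(bcgage[j], t)) for j in risk)
        ll += beta * bcg_at(bcgage[i], t) - math.log(denom)
    return ll

# Hypothetical children: ages in days at entry, exit and BCG vaccination
entry = [10, 5, 15, 20]
exit_ = [23, 40, 35, 50]
dead = [1, 0, 1, 0]
bcgage = [None, 30, 12, None]
ll0 = log_partial_likelihood_tv(0.0, entry, exit_, dead, bcgage)
ll_log2 = log_partial_likelihood_tv(math.log(2.0), entry, exit_, dead, bcgage)
```

In this toy example the second child counts as unvaccinated at the death time 23 and as vaccinated at the death time 35, which is exactly the switch the if-statement produces.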
45 Binary time-varying covariate in SAS (II)
Split persons whose time-varying covariate changes during follow-up into two records:
  from age to bcgage: bcgvacc=0, status=0
  from bcgage to outage: bcgvacc=1, status=dead
and use delayed entries. Thus, we need to generate a new data set.
46
  data splitbcg;
    set bcg;
    /* never vaccinated, or vaccinated after end of follow-up */
    if bcgage=. or bcgage>outage then do;
      bcgvacc=0; entryage=age; exitage=outage; status=dead; output;
    end;
    /* vaccinated before entry */
    if .<bcgage<=age then do;
      bcgvacc=1; entryage=age; exitage=outage; status=dead; output;
    end;
    /* vaccinated during follow-up: two records */
    if age<bcgage<=outage then do;
      bcgvacc=0; entryage=age;    exitage=bcgage; status=0;    output;
      bcgvacc=1; entryage=bcgage; exitage=outage; status=dead; output;
    end;
  run;

[Data listing with columns: id, fuptime, dead, age, bcg, bcgage, outage, bcgvacc, entryage, exitage, status. Values missing.]
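For readers who want to trace the three cases of the data step by hand, here is a Python sketch of the same splitting logic (an addition, not from the slides). `None` stands in for a missing bcgage, each output record is a tuple (bcgvacc, entryage, exitage, status), and the example children are hypothetical.

```python
def split_record(age, outage, dead, bcgage):
    """Split one child into records (bcgvacc, entryage, exitage, status)."""
    # Never vaccinated, or vaccinated only after the end of follow-up
    if bcgage is None or bcgage > outage:
        return [(0, age, outage, dead)]
    # Vaccinated before (or at) entry: exposed during the whole follow-up
    if bcgage <= age:
        return [(1, age, outage, dead)]
    # Vaccinated during follow-up: unexposed record censored at bcgage,
    # followed by an exposed record with delayed entry at bcgage
    return [(0, age, bcgage, 0), (1, bcgage, outage, dead)]

# Hypothetical children (ages in days)
never = split_record(age=10, outage=23, dead=1, bcgage=None)
before = split_record(age=15, outage=35, dead=1, bcgage=12)
during = split_record(age=5, outage=40, dead=0, bcgage=30)
late = split_record(age=5, outage=40, dead=1, bcgage=45)
```

Only the last of the two records for a split child can carry the death, because the first record always ends in censoring at the age of vaccination.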
47
  proc tphreg data=splitbcg;
    class bcgvacc(ref="0");
    model (entryage,exitage)*status(0)=bcgvacc / rl;
  run;

[Output, Analysis of Maximum Likelihood Estimates. Columns: Parameter, DF, Parameter Estimate, Standard Error, Chi-Square, Pr > ChiSq, Hazard Ratio, 95% Hazard Ratio Confidence Limits. Row: bcgvacc. Values missing.]
48 Other time-varying covariates
The effect of a binary $X$ (0,1) changes at $t_0$:
$$\lambda_i(t) = \lambda_0(t) \exp\big(\beta_1 X_i + \beta_2 X_i I(t \ge t_0)\big),$$
where
$$I(t \ge t_0) = \begin{cases} 1 & \text{if } t \ge t_0 \\ 0 & \text{if } t < t_0 \end{cases}$$
Can be handled by methods I and II.
The effect of a binary $X$ (0,1) decreases or increases with time:
$$\lambda_i(t) = \lambda_0(t) \exp\big(\beta_1 X_i + \beta_2 (X_i \cdot t)\big).$$
Can be handled by method I, by splitting at each failure time, or by special options.
49 Stanford Heart Transplant Data (p. 235)
In a report (Crowley and Hu, J. Amer. Statist. Assoc., 1977) on the Stanford Heart Transplantation Study, patients identified as being eligible for a heart transplant (N=103) were followed until death or censoring. In total 65 received a transplant during follow-up, whereas 38 did not.
Assess whether transplanted patients survive better.
On the next slide you will find the variables in the transplant data set.
Here we will discuss how to analyse the data, and in the exercises we will do some of the analyses.
50 Stanford Heart Transplant Data: variables
  age: age (in years) at entry into the study.
  cens: 0 = censoring, 1 = dead.
  days: number of days from entry to death/censoring.
  trans: 1 = the person had a heart transplantation, 0 = otherwise.
  wait: number of days from entry to transplantation. NB: if trans = 0 then wait = -1.
  mismatch: 1 = mismatch between HLA type in donor and patient, 0 = no mismatch. NB: if trans = 0 then mismatch = [value missing].
51 [Data listing with columns: Obs, age, cens, days, trans, wait, mismatch. Values missing.]
52 Piecewise Constant Hazard Rate = Poisson regression
Divide the time scale into $K$ pieces and assume piecewise constant but different hazard rates in each of the intervals. This may provide a sensible summary of many phenomena and is often used in epidemiology.
[Diagram: Age axis with cut points $c_0 = 0, c_1, c_2, c_3, \dots, c_{K-1}, c_K$ and rates $\lambda_1, \lambda_2, \lambda_3, \dots, \lambda_K$ on the successive intervals.]
Thus
$$\lambda(t) = \lambda_k \quad \text{for } t \in (c_{k-1}, c_k], \quad k = 1, \dots, K.$$
The intervals do not need to be of the same length.
We only need to keep a record of the total number of deaths and the exposure time in each group.
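The bookkeeping of deaths and exposure time per interval can be sketched as follows in Python (an addition, not from the slides). The sketch assumes that everyone enters at time 0, takes the rate estimate in each interval as deaths divided by person-time, and uses hypothetical follow-up times and cut points.

```python
def piecewise_rates(exit_times, events, cuts):
    """Deaths, person-time and rate per interval (c_{k-1}, c_k].

    cuts: [c_0 = 0, c_1, ..., c_K]; everyone is assumed to enter at time 0
    and to exit no later than c_K.
    """
    k_total = len(cuts) - 1
    deaths = [0] * k_total
    ptime = [0.0] * k_total
    for t, e in zip(exit_times, events):
        for k in range(k_total):
            lo, hi = cuts[k], cuts[k + 1]
            if t > lo:
                ptime[k] += min(t, hi) - lo  # time spent in interval k
                if e == 1 and t <= hi:
                    deaths[k] += 1           # death falls in interval k
    rates = [d / p if p > 0 else float("nan") for d, p in zip(deaths, ptime)]
    return deaths, ptime, rates

# Hypothetical toy data (months) and cut points
exit_times = [2, 4, 5, 7, 9]
events = [1, 0, 1, 1, 0]
cuts = [0, 3, 6, 10]
deaths, ptime, rates = piecewise_rates(exit_times, events, cuts)
```

The resulting table of deaths and person-time per interval is exactly the input a Poisson regression with log person-time as offset would use.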
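53 We can further divide each interval into categories of covariates, e.g. sex (F = females, M = males):
[Diagram: Age axis with cut points $c_0 = 0, c_1, c_2, c_3, \dots, c_{K-1}, c_K$, with rates $\lambda_{1F}, \lambda_{2F}, \lambda_{3F}, \dots, \lambda_{KF}$ for females and $\lambda_{1M}, \lambda_{2M}, \lambda_{3M}, \dots, \lambda_{KM}$ for males.]
Splitting the time-scale is not straightforward in SAS, but user-written SAS macros exist. See for example: [link missing]
Stata: use the stsplit command.
R: packages exist (e.g. the Epi package).
SPSS?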
More informationPhD course in Advanced survival analysis. One-sample tests. Properties. Idea: (ABGK, sect. V.1.1) Counting process N(t)
PhD course in Advanced survival analysis. (ABGK, sect. V.1.1) One-sample tests. Counting process N(t) Non-parametric hypothesis tests. Parametric models. Intensity process λ(t) = α(t)y (t) satisfying Aalen
More informationSTAT 7030: Categorical Data Analysis
STAT 7030: Categorical Data Analysis 5. Logistic Regression Peng Zeng Department of Mathematics and Statistics Auburn University Fall 2012 Peng Zeng (Auburn University) STAT 7030 Lecture Notes Fall 2012
More informationIntroduction to Statistical Analysis
Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive
More informationIn contrast, parametric techniques (fitting exponential or Weibull, for example) are more focussed, can handle general covariates, but require
Chapter 5 modelling Semi parametric We have considered parametric and nonparametric techniques for comparing survival distributions between different treatment groups. Nonparametric techniques, such as
More informationChapter 20: Logistic regression for binary response variables
Chapter 20: Logistic regression for binary response variables In 1846, the Donner and Reed families left Illinois for California by covered wagon (87 people, 20 wagons). They attempted a new and untried
More informationAnalysis of Time-to-Event Data: Chapter 4 - Parametric regression models
Analysis of Time-to-Event Data: Chapter 4 - Parametric regression models Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Right censored
More informationAnalysis of Time-to-Event Data: Chapter 6 - Regression diagnostics
Analysis of Time-to-Event Data: Chapter 6 - Regression diagnostics Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/25 Residuals for the
More information3003 Cure. F. P. Treasure
3003 Cure F. P. reasure November 8, 2000 Peter reasure / November 8, 2000/ Cure / 3003 1 Cure A Simple Cure Model he Concept of Cure A cure model is a survival model where a fraction of the population
More informationSTA 303 H1S / 1002 HS Winter 2011 Test March 7, ab 1cde 2abcde 2fghij 3
STA 303 H1S / 1002 HS Winter 2011 Test March 7, 2011 LAST NAME: FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 303 STA 1002 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator. Some formulae
More informationLecture 8 Stat D. Gillen
Statistics 255 - Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 8.1 Example of two ways to stratify Suppose a confounder C has 3 levels
More informationTied survival times; estimation of survival probabilities
Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation
More informationCIMAT Taller de Modelos de Capture y Recaptura Known Fate Survival Analysis
CIMAT Taller de Modelos de Capture y Recaptura 2010 Known Fate urvival Analysis B D BALANCE MODEL implest population model N = λ t+ 1 N t Deeper understanding of dynamics can be gained by identifying variation
More informationCorrelation and regression
1 Correlation and regression Yongjua Laosiritaworn Introductory on Field Epidemiology 6 July 2015, Thailand Data 2 Illustrative data (Doll, 1955) 3 Scatter plot 4 Doll, 1955 5 6 Correlation coefficient,
More informationFULL LIKELIHOOD INFERENCES IN THE COX MODEL
October 20, 2007 FULL LIKELIHOOD INFERENCES IN THE COX MODEL BY JIAN-JIAN REN 1 AND MAI ZHOU 2 University of Central Florida and University of Kentucky Abstract We use the empirical likelihood approach
More informationMultistate models and recurrent event models
and recurrent event models Patrick Breheny December 6 Patrick Breheny University of Iowa Survival Data Analysis (BIOS:7210) 1 / 22 Introduction In this final lecture, we will briefly look at two other
More informationLecture 12. Multivariate Survival Data Statistics Survival Analysis. Presented March 8, 2016
Statistics 255 - Survival Analysis Presented March 8, 2016 Dan Gillen Department of Statistics University of California, Irvine 12.1 Examples Clustered or correlated survival times Disease onset in family
More informationCount data page 1. Count data. 1. Estimating, testing proportions
Count data page 1 Count data 1. Estimating, testing proportions 100 seeds, 45 germinate. We estimate probability p that a plant will germinate to be 0.45 for this population. Is a 50% germination rate
More informationPower and Sample Size Calculations with the Additive Hazards Model
Journal of Data Science 10(2012), 143-155 Power and Sample Size Calculations with the Additive Hazards Model Ling Chen, Chengjie Xiong, J. Philip Miller and Feng Gao Washington University School of Medicine
More informationUNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator
UNIVERSITY OF TORONTO Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS Duration - 3 hours Aids Allowed: Calculator LAST NAME: FIRST NAME: STUDENT NUMBER: There are 27 pages
More informationST3241 Categorical Data Analysis I Generalized Linear Models. Introduction and Some Examples
ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will
More informationPhilosophy and Features of the mstate package
Introduction Mathematical theory Practice Discussion Philosophy and Features of the mstate package Liesbeth de Wreede, Hein Putter Department of Medical Statistics and Bioinformatics Leiden University
More informationMultistate Modeling and Applications
Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)
More informationRelative-risk regression and model diagnostics. 16 November, 2015
Relative-risk regression and model diagnostics 16 November, 2015 Relative risk regression More general multiplicative intensity model: Intensity for individual i at time t is i(t) =Y i (t)r(x i, ; t) 0
More informationLecture 3. Truncation, length-bias and prevalence sampling
Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in
More informationSemiparametric Regression
Semiparametric Regression Patrick Breheny October 22 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/23 Introduction Over the past few weeks, we ve introduced a variety of regression models under
More informationA Survival Analysis of GMO vs Non-GMO Corn Hybrid Persistence Using Simulated Time Dependent Covariates in SAS
Western Kentucky University From the SelectedWorks of Matt Bogard 2012 A Survival Analysis of GMO vs Non-GMO Corn Hybrid Persistence Using Simulated Time Dependent Covariates in SAS Matt Bogard, Western
More informationMultistate models and recurrent event models
Multistate models Multistate models and recurrent event models Patrick Breheny December 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Multistate models In this final lecture,
More informationModel Based Statistics in Biology. Part V. The Generalized Linear Model. Chapter 18.1 Logistic Regression (Dose - Response)
Model Based Statistics in Biology. Part V. The Generalized Linear Model. Logistic Regression ( - Response) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch 9, 10, 11), Part IV
More informationLecture 01: Introduction
Lecture 01: Introduction Dipankar Bandyopadhyay, Ph.D. BMTRY 711: Analysis of Categorical Data Spring 2011 Division of Biostatistics and Epidemiology Medical University of South Carolina Lecture 01: Introduction
More informationLecture 12: Effect modification, and confounding in logistic regression
Lecture 12: Effect modification, and confounding in logistic regression Ani Manichaikul amanicha@jhsph.edu 4 May 2007 Today Categorical predictor create dummy variables just like for linear regression
More informationPerson-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data
Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time
More informationStatistics in medicine
Statistics in medicine Lecture 3: Bivariate association : Categorical variables Proportion in one group One group is measured one time: z test Use the z distribution as an approximation to the binomial
More information9 Estimating the Underlying Survival Distribution for a
9 Estimating the Underlying Survival Distribution for a Proportional Hazards Model So far the focus has been on the regression parameters in the proportional hazards model. These parameters describe the
More informationSurvival Analysis Math 434 Fall 2011
Survival Analysis Math 434 Fall 2011 Part IV: Chap. 8,9.2,9.3,11: Semiparametric Proportional Hazards Regression Jimin Ding Math Dept. www.math.wustl.edu/ jmding/math434/fall09/index.html Basic Model Setup
More informationECONOMETRICS II TERM PAPER. Multinomial Logit Models
ECONOMETRICS II TERM PAPER Multinomial Logit Models Instructor : Dr. Subrata Sarkar 19.04.2013 Submitted by Group 7 members: Akshita Jain Ramyani Mukhopadhyay Sridevi Tolety Trishita Bhattacharjee 1 Acknowledgement:
More informationMethodological challenges in research on consequences of sickness absence and disability pension?
Methodological challenges in research on consequences of sickness absence and disability pension? Prof., PhD Hjelt Institute, University of Helsinki 2 Two methodological approaches Lexis diagrams and Poisson
More informationMeei Pyng Ng 1 and Ray Watson 1
Aust N Z J Stat 444), 2002, 467 478 DEALING WITH TIES IN FAILURE TIME DATA Meei Pyng Ng 1 and Ray Watson 1 University of Melbourne Summary In dealing with ties in failure time data the mechanism by which
More informationInference for Binomial Parameters
Inference for Binomial Parameters Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 58 Inference for
More informationSTA6938-Logistic Regression Model
Dr. Ying Zhang STA6938-Logistic Regression Model Topic 6-Logistic Regression for Case-Control Studies Outlines: 1. Biomedical Designs 2. Logistic Regression Models for Case-Control Studies 3. Logistic
More informationCOMPLEMENTARY LOG-LOG MODEL
COMPLEMENTARY LOG-LOG MODEL Under the assumption of binary response, there are two alternatives to logit model: probit model and complementary-log-log model. They all follow the same form π ( x) =Φ ( α
More informationThe influence of categorising survival time on parameter estimates in a Cox model
The influence of categorising survival time on parameter estimates in a Cox model Anika Buchholz 1,2, Willi Sauerbrei 2, Patrick Royston 3 1 Freiburger Zentrum für Datenanalyse und Modellbildung, Albert-Ludwigs-Universität
More informationssh tap sas913, sas https://www.statlab.umd.edu/sasdoc/sashtml/onldoc.htm
Kedem, STAT 430 SAS Examples: Logistic Regression ==================================== ssh abc@glue.umd.edu, tap sas913, sas https://www.statlab.umd.edu/sasdoc/sashtml/onldoc.htm a. Logistic regression.
More informationClinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.
Introduction to Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca September 18, 2014 38-1 : a review 38-2 Evidence Ideal: to advance the knowledge-base of clinical medicine,
More informationChapter 4 Regression Models
23.August 2010 Chapter 4 Regression Models The target variable T denotes failure time We let x = (x (1),..., x (m) ) represent a vector of available covariates. Also called regression variables, regressors,
More information9 Generalized Linear Models
9 Generalized Linear Models The Generalized Linear Model (GLM) is a model which has been built to include a wide range of different models you already know, e.g. ANOVA and multiple linear regression models
More information