# Survival Regression Models


David M. Rocke, May 18, 2017

## Background on the Proportional Hazards Model

The exponential distribution has constant hazard:

$$f(t) = \lambda e^{-\lambda t}, \qquad S(t) = e^{-\lambda t}, \qquad h(t) = \lambda.$$

Let's make two generalizations. First, let the hazard depend on covariates $x_1, x_2, \ldots, x_p$. Second, let the base hazard depend on $t$. We can do this using either parametric or semi-parametric approaches.

## The Cox Model

The generalization is that the hazard function is

$$\eta = \beta_1 x_1 + \cdots + \beta_p x_p, \qquad h(t \mid \text{covariates}) = h_0(t)\, e^{\eta}.$$

This has a log link, as in a generalized linear model. It is semi-parametric because the linear predictor depends on estimated parameters but the base hazard function is unspecified. There is no constant term because it is absorbed in the base hazard.

Note that for two different individuals with possibly different covariates, the ratio of the hazard functions is

$$\frac{\exp(\eta_1)}{\exp(\eta_2)} = \exp(\eta_1 - \eta_2),$$

which does not depend on $t$.

How do we fit this model? We need to estimate the coefficients of the covariates, and we need to estimate the base hazard $h_0(t)$. For the covariates, supposing for simplicity that there are no tied event times, let the event times for the whole data set be $t_1, t_2, \ldots, t_D$. Let the risk set at time $t_i$ be $R(t_i)$, and for subject $j$ let

$$\eta_j = \beta_1 x_{j1} + \cdots + \beta_p x_{jp}, \qquad \theta_j = e^{\eta_j}, \qquad h(t \mid \text{covariates}) = h_0(t)\, e^{\eta} = \theta\, h_0(t).$$

Conditional on a single failure at time $t_i$, the probability that the event is due to subject $j \in R(t_i)$ is

$$\Pr(j \text{ fails} \mid 1 \text{ failure at } t_i) = \frac{h_0(t_i)\, e^{\eta_j}}{\sum_{k \in R(t_i)} h_0(t_i)\, e^{\eta_k}} = \frac{\theta_j}{\sum_{k \in R(t_i)} \theta_k}.$$

If subject $j(i)$ is the one who fails at time $t_i$, then the partial likelihood is

$$L(\beta \mid T) = \prod_i \frac{\theta_{j(i)}}{\sum_{k \in R(t_i)} \theta_k},$$

and we can numerically maximize this with respect to the coefficients. When there are tied event times, adjustments need to be made, but the likelihood is still similar. Note that we don't need to know the base hazard to solve for the coefficients.
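The partial likelihood above can be sketched directly in base R. This is a minimal illustration, not the algorithm used by any package: the six observations, the single covariate `x`, and the helper name `log_partial_lik` are all made up for the example, and there are no tied event times.

```r
# Hypothetical toy data: distinct times, one covariate, no ties.
time   <- c(2, 4, 5, 7, 9, 11)
status <- c(1, 1, 0, 1, 1, 0)            # 1 = event, 0 = censored
x      <- c(1.2, 0.3, 0.8, -0.5, 0.1, -1.0)

# Log partial likelihood for a single coefficient beta.
log_partial_lik <- function(beta) {
  theta <- exp(beta * x)                 # theta_j = exp(eta_j)
  event_idx <- which(status == 1)
  contrib <- sapply(event_idx, function(i) {
    at_risk <- time >= time[i]           # risk set R(t_i)
    log(theta[i]) - log(sum(theta[at_risk]))
  })
  sum(contrib)
}

# Numerically maximize over beta; the base hazard never appears.
fit <- optimize(log_partial_lik, interval = c(-5, 5), maximum = TRUE)
beta_hat <- fit$maximum
```

At `beta = 0` every subject has $\theta_j = 1$, so each event contributes one over the size of its risk set, which gives a quick check on the function.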

Once we have coefficient estimates $\hat\beta_1, \ldots, \hat\beta_p$, this also defines $\hat\eta_j$ and $\hat\theta_j$, and then the estimated base cumulative hazard function is

$$\hat H(t) = \sum_{t_i < t} \frac{d_i}{\sum_{k \in R(t_i)} \hat\theta_k},$$

where $d_i$ is the number of events at time $t_i$. This reduces to the Nelson-Aalen estimate when there are no covariates. Numerous other estimates have been proposed as well.
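As a rough sketch of this estimator, the following base R code accumulates $d_i / \sum_{k \in R(t_i)} \hat\theta_k$ over the event times. The data are hypothetical, and `theta` is set to 1 for every subject to mimic the no-covariate case, so the result should coincide with the Nelson-Aalen increments.

```r
# Hypothetical toy data with one tied event time (two events at time 4).
time   <- c(2, 4, 4, 5, 7, 9)
status <- c(1, 1, 1, 0, 1, 0)
theta  <- rep(1, length(time))   # stand-in for fitted exp(eta-hat); all 1 = no covariates

event_times <- sort(unique(time[status == 1]))

# d_i: number of events at each event time.
d <- sapply(event_times, function(s) sum(status == 1 & time == s))

# Sum of theta over the risk set at each event time.
denom <- sapply(event_times, function(s) sum(theta[time >= s]))

# Cumulative hazard just after each event time.
H0 <- cumsum(d / denom)
```

With fitted values of $\hat\theta_k$ in place of the ones, the same three lines give the covariate-adjusted baseline estimate.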

## Adjustment for Ties

At each time $t_i$ at which more than one of the subjects has an event, let $d_i$ be the number of events at that time, $D_i$ the set of subjects with events at that time, and let $s_i$ be a covariate vector for an artificial subject obtained by adding up the covariate values for the subjects with an event at time $t_i$. Let

$$\tilde\eta_i = \beta_1 s_{i1} + \cdots + \beta_p s_{ip} \qquad \text{and} \qquad \tilde\theta_i = \exp(\tilde\eta_i).$$

Because $s_i$ adds up the covariate values for the subjects with an event at time $t_i$, note that

$$\tilde\eta_i = \sum_{j \in D_i} \left(\beta_1 x_{j1} + \cdots + \beta_p x_{jp}\right) = \beta_1 s_{i1} + \cdots + \beta_p s_{ip}, \qquad \tilde\theta_i = \exp(\tilde\eta_i) = \prod_{j \in D_i} \theta_j.$$

Breslow's method estimates the partial likelihood as

$$L(\beta \mid T) = \prod_i \frac{\tilde\theta_i}{\left[\sum_{k \in R(t_i)} \theta_k\right]^{d_i}} = \prod_i \prod_{j \in D_i} \frac{\theta_j}{\sum_{k \in R(t_i)} \theta_k}.$$

This method is equivalent to treating each event as distinct and using the non-ties formula. It works best when the number of ties is small. It is the default in many statistical packages, including PROC PHREG in SAS.

The other common method is Efron's, which is the default in R:

$$L(\beta \mid T) = \prod_i \frac{\tilde\theta_i}{\prod_{j=1}^{d_i} \left[\sum_{k \in R(t_i)} \theta_k - \frac{j-1}{d_i} \sum_{k \in D_i} \theta_k\right]}.$$

This is closer to the exact discrete partial likelihood when there are many ties. The third option in R (also available in SAS as `discrete`) is the exact method, which is the same one used for matched logistic regression.

Suppose as an example we have a time $t$ where there are 20 individuals at risk and three failures. Let the three individuals have risk parameters $\theta_1, \theta_2, \theta_3$ and let the sum of the risk parameters of the remaining 17 individuals be $\theta_R$. Then the factor in the partial likelihood at time $t$ using Breslow's method is

$$\left(\frac{\theta_1}{\theta_R + \theta_1 + \theta_2 + \theta_3}\right) \left(\frac{\theta_2}{\theta_R + \theta_1 + \theta_2 + \theta_3}\right) \left(\frac{\theta_3}{\theta_R + \theta_1 + \theta_2 + \theta_3}\right).$$

If, on the other hand, they had died in the order 1, 2, 3, then the contribution to the partial likelihood would be

$$\left(\frac{\theta_1}{\theta_R + \theta_1 + \theta_2 + \theta_3}\right) \left(\frac{\theta_2}{\theta_R + \theta_2 + \theta_3}\right) \left(\frac{\theta_3}{\theta_R + \theta_3}\right)$$

as the risk set got smaller with each failure. But we don't know the order they failed in, so instead of reducing the denominator by one risk coefficient each time, we reduce it by the same fraction each time: one third of $\theta_1 + \theta_2 + \theta_3$. This is Efron's method.

Efron's method:

$$\left(\frac{\theta_1}{\theta_R + \theta_1 + \theta_2 + \theta_3}\right) \left(\frac{\theta_2}{\theta_R + 2(\theta_1 + \theta_2 + \theta_3)/3}\right) \left(\frac{\theta_3}{\theta_R + (\theta_1 + \theta_2 + \theta_3)/3}\right).$$
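To make the three-failure example concrete, here is a small base R sketch that evaluates the Breslow factor, the Efron factor, and the factor for the known order 1, 2, 3. The numeric values of `theta` and `theta_R` are invented for illustration only.

```r
# Hypothetical risk parameters for the three tied failures and the other 17 subjects.
theta   <- c(1.5, 0.8, 2.0)   # theta_1, theta_2, theta_3
theta_R <- 17                 # e.g. 17 remaining subjects each with theta = 1

d     <- length(theta)
total <- theta_R + sum(theta)

# Breslow: the full risk-set total is used in every denominator.
breslow <- prod(theta) / total^d

# Efron: the j-th denominator removes a fraction (j - 1)/d of the tied subjects' total.
j     <- seq_len(d)
efron <- prod(theta) / prod(total - (j - 1) / d * sum(theta))

# Known order 1, 2, 3: each failure leaves the risk set before the next one.
ordered <- prod(theta / (theta_R + rev(cumsum(rev(theta)))))
```

Efron's denominators are never larger than Breslow's, so the Efron factor is the larger of the two whenever there is a tie.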

## The coxph() Function in R

From the R documentation for `coxph {survival}`, "Fit Proportional Hazards Regression Model".

Description: Fits a Cox proportional hazards regression model. Time dependent variables, time dependent strata, multiple events per subject, and other extensions are incorporated using the counting process formulation of Andersen and Gill.

Usage:

    coxph(formula, data=, weights, subset, na.action, init, control,
          ties=c("efron","breslow","exact"), singular.ok=TRUE, robust=FALSE,
          model=FALSE, x=FALSE, y=TRUE, tt, method, ...)

Arguments:

- `formula`: a formula object, with the response on the left of a `~` operator, and the terms on the right. The response must be a survival object as returned by the `Surv` function.
- `data`: a data.frame in which to interpret the variables named in the formula, or in the `subset` and the `weights` argument.
- `weights`: vector of case weights. If `weights` is a vector of integers, then the estimated coefficients are equivalent to estimating the model from data with the individual cases replicated as many times as indicated by `weights`.
- `subset`: expression indicating which subset of the rows of data should be used in the fit. All observations are included by default.

The `ties` argument is a character string specifying the method for tie handling. If there are no tied death times, all the methods are equivalent. Nearly all Cox regression programs use the Breslow method by default, but not this one. The Efron approximation is used as the default here; it is more accurate when dealing with tied death times, and is as efficient computationally. The exact partial likelihood is equivalent to a conditional logistic model, and is appropriate when the times are a small set of discrete values.
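As an illustration of the `ties` argument, the sketch below fits the same model three ways. It uses the `aml` data set that ships with the survival package (not the bmt data used in these slides), chosen only because it is small and contains tied event times.

```r
library(survival)

# aml: small leukemia data set bundled with survival; x is the treatment group.
fit_efron   <- coxph(Surv(time, status) ~ x, data = aml, ties = "efron")
fit_breslow <- coxph(Surv(time, status) ~ x, data = aml, ties = "breslow")
fit_exact   <- coxph(Surv(time, status) ~ x, data = aml, ties = "exact")

# Estimated log hazard ratio under each tie-handling method.
tie_comparison <- c(efron   = unname(coef(fit_efron)),
                    breslow = unname(coef(fit_breslow)),
                    exact   = unname(coef(fit_exact)))
round(tie_comparison, 4)
```

With only a few ties the three estimates are usually close; they would be identical if no event times were tied.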

The proportional hazards model is usually expressed in terms of a single survival time value for each person, with possible censoring. Andersen and Gill reformulated the same problem as a counting process; as time marches onward we observe the events for a subject, rather like watching a Geiger counter. The data for a subject is presented as multiple rows or observations, each of which applies to an interval of observation (start, stop].
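A minimal sketch of the (start, stop] layout, assuming a made-up subject whose treatment indicator `trt` switches from 0 to 1 at time 4 and who has the event at time 10. The column names are illustrative, not taken from any data set in the slides.

```r
library(survival)

# One hypothetical subject split into two observation intervals.
one_subject <- data.frame(
  id    = c(1, 1),
  start = c(0, 4),
  stop  = c(4, 10),
  event = c(0, 1),    # the event is recorded only on the last interval
  trt   = c(0, 1)     # covariate value that applies during each interval
)

# The three-argument form of Surv() builds a counting-process response.
y <- Surv(one_subject$start, one_subject$stop, one_subject$event)
y
```

A response built this way can go on the left side of a `coxph` formula in the same way as the usual `Surv(time, status)`.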

There are special terms that may be used in the model equation. A `strata` term identifies a stratified Cox model; separate baseline hazard functions are fit for each stratum. The `cluster` term is used to compute a robust variance for the model. The term `+ cluster(id)`, where each value of `id` is unique, is equivalent to specifying the `robust=TRUE` argument. If the `id` variable is not unique, it is assumed that it identifies clusters of correlated observations. The robust estimate arises from many different arguments and thus has had many labels. It is variously known as the Huber sandwich estimator, White's estimate (linear models/econometrics), the Horvitz-Thompson estimate (survey sampling), the working independence variance (generalized estimating equations), the infinitesimal jackknife, and the Wei, Lin, Weissfeld (WLW) estimate.

## Cox Model for bmt Survival vs. Group

    > dfsurv <- Surv(bmt$t2, bmt$d3)
    > bmt.cox <- coxph(dfsurv ~ factor(group), data = bmt)
    > summary(bmt.cox)

      n= 137, number of events= 83

                   coef exp(coef) se(coef) z Pr(>|z|)
    factor(group)2                                    *
    factor(group)3
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

                   exp(coef) exp(-coef) lower .95 upper .95
    factor(group)2
    factor(group)3

    Concordance=  (se = )
    Rsquare=  (max possible= )
    Likelihood ratio test=  on 2 df, p=
    Wald test            =  on 2 df, p=
    Score (logrank) test =  on 2 df, p=

The first part of the output gives hypothesis tests for the factor levels, comparing groups 2 and 3 to group 1. Group 3 has the highest hazard, so the most significant comparison, group 3 vs. group 2, is not directly shown. The coefficient is on the log-hazard-ratio scale, as in log-risk-ratio. The next column gives the hazard (risk) ratio, and the remaining columns give a Wald hypothesis test. The (not shown) group 3 vs. group 2 log hazard ratio is the difference of the two coefficients, 0.9576. The hazard ratio is then exp(0.9576), or about 2.605. Inference on all coefficients and combinations can be constructed using `coef(bmt.cox)` and `vcov(bmt.cox)`, as with logistic regression.

The next part of the output gives 95% confidence intervals for the hazard ratio (relative risk). For the difference between groups 2 and 3 we need to use `coef(bmt.cox)` and `vcov(bmt.cox)`.

    > c1 <- coef(bmt.cox)
    > v1 <- vcov(bmt.cox)
    > cont1 <- c(-1, 1)
    > t(cont1) %*% c1                     # log hazard ratio, group 3 to group 2
    > t(cont1) %*% v1 %*% cont1           # variance of the log hazard ratio
    > sqrt(t(cont1) %*% v1 %*% cont1)     # standard error of the log hazard ratio
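The contrast calculation can be carried through to a Wald test and a confidence interval. The sketch below uses invented coefficient and covariance values (`b` and `V` are hypothetical, not the bmt output); with a fitted model they would be replaced by `coef(bmt.cox)` and `vcov(bmt.cox)`.

```r
# Hypothetical estimates for illustration only (not the bmt results).
b <- c(group2 = -0.5, group3 = 0.4)
V <- matrix(c(0.09, 0.04,
              0.04, 0.08), nrow = 2, byrow = TRUE)
cont <- c(-1, 1)                                 # group 3 minus group 2

log_hr <- sum(cont * b)                          # contrast of the coefficients
se     <- sqrt(drop(t(cont) %*% V %*% cont))     # sqrt of c' V c
z      <- log_hr / se                            # Wald z statistic
p      <- 2 * pnorm(-abs(z))                     # two-sided p-value
ci     <- exp(log_hr + c(-1, 1) * qnorm(0.975) * se)   # 95% CI for the hazard ratio
```

The interval is built on the log scale and then exponentiated, which is the same construction `summary()` uses for the individual coefficients.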

Concordance is the proportion of informative pairs of subjects in which the subject with the higher predicted risk fails first; non-informative pairs are omitted. The Rsquare value is Cox and Snell's pseudo R-squared and is not very useful. There are three tests for whether the model with the group covariate is better than the one without:

- The usual likelihood ratio chi-squared.
- The Wald test chi-squared, obtained from the coefficient estimates and their covariance matrix (it equals the sum of the squared z-scores only when the estimates are uncorrelated).
- The score test, which is the log-rank test, as with comparison of survival functions.

The likelihood ratio test is probably best in smaller samples, followed by the Wald test.

The `anova` function performs the likelihood ratio test for comparing models. One can use `drop1()`, `add1()`, `step()`, or compare two explicit models.

    > anova(bmt.cox)
    Analysis of Deviance Table
     Cox model: response is dfsurv
    Terms added sequentially (first to last)

                  loglik Chisq Df Pr(>|Chi|)
    NULL
    factor(group)                            **
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## Survival Curves from the Cox Model

We defined the base hazard but then did not use it to estimate the coefficients. In general the base hazard has no meaningful interpretation of its own; it is the hazard for a subject whose covariates are all zero. In this case, it is the hazard for group 1. We can use `survfit` to get survival functions, but by default it produces the curve for an average individual with average covariates. This is like the family with 1.5 children: it does not exist. It is always best to specify the covariate level(s) for which you want the survival curve(s). In this case, we can plot the Cox model survival curves, which are constrained to have proportional hazards, along with the individual KM curves for the groups.

    plot(survfit(dfsurv ~ group, data = bmt))
    lines(survfit(bmt.cox, data.frame(group = 1:3), conf.int = FALSE), col = "red")
    legend("bottomleft", c("Kaplan-Meier", "Cox Model"), col = c("black", "red"), lwd = 1)
    title("Survival Functions for Three Groups by KM and Cox Model")

When we use `survfit()` with a Cox model, we should include a data frame with the same named columns as the predictors in the Cox model and one or more levels of each. For example:

    > data.frame(group = 1:3, age = 50)
      group age
    1     1  50
    2     2  50
    3     3  50
    > data.frame(group = rep(1:3, 2), age = rep(c(50, 60), each = 3))
      group age
    1     1  50
    2     2  50
    3     3  50
    4     1  60
    5     2  60
    6     3  60

[Figure: Survival Functions for Three Groups by KM and Cox Model. Legend: Kaplan-Meier (black), Cox Model (red).]

## Homework

The addicts data set from the text (see page 91) is from a study by Caplehorn et al. ("Methadone Dosage and Retention of Patients in Maintenance Treatment", Med. J. Aust., 1991). These data comprise the times in days spent by heroin addicts from entry to departure from one of two methadone clinics. There are two further covariates, namely prison record and methadone dose, believed to affect the survival times.

The data set and R input code are on the website. The variables are as follows:

- `id`: Subject ID
- `clinic`: Clinic (1 or 2)
- `status`: Survival status (0 = censored, 1 = departed from clinic)
- `time`: Survival time in days
- `prison`: Prison record (0 = none, 1 = any)
- `methadone`: Methadone dose (mg/day)

1. Plot the Kaplan-Meier survival curves for the two clinics.
2. Test for whether the two survival curves could have come from the same process using `survdiff`.
3. Plot the cumulative hazards from the Nelson-Aalen estimator.

4. A common comparison plot for proportional hazards is the complementary log-log survival plot, which plots $\ln(-\ln[\hat S(t)]) = \ln \hat H(t)$ against $\ln(t)$. If the hazards are proportional, then so are the cumulative hazards, and after taking logs, the curves should be parallel. If the lines are straight, then the Weibull model may be appropriate. Make this plot for the two clinics using the Nelson-Aalen estimator and comment on the results. You will use the `cloglog` option in the `plot` command.

5. Construct a Cox model using only the clinic variable. Is the survival different at the two clinics? What is the estimated hazard (risk) ratio, a test for significance, and a 95% confidence interval?
6. Pick one test for the null hypothesis that the clinics do not differ. Why would you depend on this test more than the others?
7. Consider adding the prison and methadone variables. Which of these covariates will improve the model?
8. Plot the two survival curves from your chosen Cox model and add the two KM survival curves. What do you think?
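The reasoning behind the complementary log-log plot in item 4 can be written out as follows. This is a sketch under two assumptions that are not in the slides: the two clinics are labeled with cumulative hazards $H_1$ and $H_2$ and a constant hazard ratio $\theta$, and the Weibull model is parameterized as $S(t) = \exp\{-(\lambda t)^{\gamma}\}$.

```latex
% Proportional hazards: h_2(t) = \theta h_1(t) for all t
H_2(t) = \int_0^t \theta\, h_1(u)\, du = \theta\, H_1(t)
\quad\Longrightarrow\quad
\ln H_2(t) = \ln\theta + \ln H_1(t)

% Weibull (assumed parameterization): S(t) = \exp\{-(\lambda t)^{\gamma}\}
H(t) = -\ln S(t) = (\lambda t)^{\gamma}
\quad\Longrightarrow\quad
\ln H(t) = \gamma \ln\lambda + \gamma \ln t
```

So under proportional hazards the two curves differ by the constant vertical shift $\ln\theta$, and under a Weibull model each curve is a straight line in $\ln t$ with slope $\gamma$.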


ST3241 Categorical Data Analysis I Generalized Linear Models Introduction and Some Examples 1 Introduction We have discussed methods for analyzing associations in two-way and three-way tables. Now we will

Ninth ARTNeT Capacity Building Workshop for Trade Research "Trade Flows and Trade Policy Analysis" June 2013 Bangkok, Thailand Cosimo Beverelli and Rainer Lanz (World Trade Organization) 1 Selected econometric

### 6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

### Lecture 8. Poisson models for counts

Lecture 8. Poisson models for counts Jesper Rydén Department of Mathematics, Uppsala University jesper.ryden@math.uu.se Statistical Risk Analysis Spring 2014 Absolute risks The failure intensity λ(t) describes

### Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p )

Lab 3: Two levels Poisson models (taken from Multilevel and Longitudinal Modeling Using Stata, p. 376-390) BIO656 2009 Goal: To see if a major health-care reform which took place in 1997 in Germany was

### Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

### Linear Regression Models P8111

Linear Regression Models P8111 Lecture 25 Jeff Goldsmith April 26, 2016 1 of 37 Today s Lecture Logistic regression / GLMs Model framework Interpretation Estimation 2 of 37 Linear regression Course started

### Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent

### Hierarchical Generalized Linear Models. ERSH 8990 REMS Seminar on HLM Last Lecture!

Hierarchical Generalized Linear Models ERSH 8990 REMS Seminar on HLM Last Lecture! Hierarchical Generalized Linear Models Introduction to generalized models Models for binary outcomes Interpreting parameter

### Generalized Linear Modeling - Logistic Regression

1 Generalized Linear Modeling - Logistic Regression Binary outcomes The logit and inverse logit interpreting coefficients and odds ratios Maximum likelihood estimation Problem of separation Evaluating

### ESP 178 Applied Research Methods. 2/23: Quantitative Analysis

ESP 178 Applied Research Methods 2/23: Quantitative Analysis Data Preparation Data coding create codebook that defines each variable, its response scale, how it was coded Data entry for mail surveys and

### Models for Ordinal Response Data

Models for Ordinal Response Data Robin High Department of Biostatistics Center for Public Health University of Nebraska Medical Center Omaha, Nebraska Recommendations Analyze numerical data with a statistical

### SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)

SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000) AMITA K. MANATUNGA THE ROLLINS SCHOOL OF PUBLIC HEALTH OF EMORY UNIVERSITY SHANDE

### Bootstrapping, Randomization, 2B-PLS

Bootstrapping, Randomization, 2B-PLS Statistics, Tests, and Bootstrapping Statistic a measure that summarizes some feature of a set of data (e.g., mean, standard deviation, skew, coefficient of variation,

### Package ICBayes. September 24, 2017

Package ICBayes September 24, 2017 Title Bayesian Semiparametric Models for Interval-Censored Data Version 1.1 Date 2017-9-24 Author Chun Pan, Bo Cai, Lianming Wang, and Xiaoyan Lin Maintainer Chun Pan

### Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

### From semi- to non-parametric inference in general time scale models

From semi- to non-parametric inference in general time scale models Thierry DUCHESNE duchesne@matulavalca Département de mathématiques et de statistique Université Laval Québec, Québec, Canada Research

### On the generalized maximum likelihood estimator of survival function under Koziol Green model

On the generalized maximum likelihood estimator of survival function under Koziol Green model By: Haimeng Zhang, M. Bhaskara Rao, Rupa C. Mitra Zhang, H., Rao, M.B., and Mitra, R.C. (2006). On the generalized

### Analysis of Covariance. The following example illustrates a case where the covariate is affected by the treatments.

Analysis of Covariance In some experiments, the experimental units (subjects) are nonhomogeneous or there is variation in the experimental conditions that are not due to the treatments. For example, a

### Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

### Stat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010

1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of

### Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence

Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence Sunil Kumar Dhar Center for Applied Mathematics and Statistics, Department of Mathematical Sciences, New Jersey

### Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data

International Mathematical Forum, 3, 2008, no. 33, 1643-1654 Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data A. Al-khedhairi Department of Statistics and O.R. Faculty

### STAT 526 Advanced Statistical Methodology

STAT 526 Advanced Statistical Methodology Fall 2017 Lecture Note 10 Analyzing Clustered/Repeated Categorical Data 0-0 Outline Clustered/Repeated Categorical Data Generalized Linear Mixed Models Generalized

### A brief introduction to mixed models

A brief introduction to mixed models University of Gothenburg Gothenburg April 6, 2017 Outline An introduction to mixed models based on a few examples: Definition of standard mixed models. Parameter estimation.

### Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models:

Contrasting Marginal and Mixed Effects Models Recall: two approaches to handling dependence in Generalized Linear Models: Marginal models: based on the consequences of dependence on estimating model parameters.

### Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates Madison January 11, 2011 Contents 1 Definition 1 2 Links 2 3 Example 7 4 Model building 9 5 Conclusions 14

### Models for Binary Outcomes

Models for Binary Outcomes Introduction The simple or binary response (for example, success or failure) analysis models the relationship between a binary response variable and one or more explanatory variables.

### BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

### Lecture 10: Alternatives to OLS with limited dependent variables. PEA vs APE Logit/Probit Poisson

Lecture 10: Alternatives to OLS with limited dependent variables PEA vs APE Logit/Probit Poisson PEA vs APE PEA: partial effect at the average The effect of some x on y for a hypothetical case with sample

### Inference for Binomial Parameters

Inference for Binomial Parameters Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 58 Inference for

### Regression models. Generalized linear models in R. Normal regression models are not always appropriate. Generalized linear models. Examples.

Regression models Generalized linear models in R Dr Peter K Dunn http://www.usq.edu.au Department of Mathematics and Computing University of Southern Queensland ASC, July 00 The usual linear regression