Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction


 Lindsay Hardy
 8 months ago
 Views:
Transcription
1 Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Nonparametric Estimates of Survival Comparing Survival Dalla Lana School of Public Health University of Toronto Measuring Treatment Effect Parametric Estimates of Survival References Defining Survival Data Introduction Definition of Failure Time Interest centres on patients who are at risk for some event and the time to the occurrence of the event. s: Time to death Time to stroke Time to disease recurrence Three requirements to precisely determine failure time: unambiguous time origin scale for measuring the passage of time meaning of the failure event must be entirely clear
2 Censoring Typical Survival Data Arising From a Clinical Trial Patients who are not observed to failure are called censored Some Causes: study concludes before all patients destined to fail have actually failed patient withdraws from study before failure observed Mathematical Definitions Introduction The Survivor Function The survivor function F(t) is defined to be Let the failure time be represented by a nonnegative random variable T. Usually, we consider continuous distributions. Discrete and mixed discretecontinuous distributions can also handled. F(t) = Pr(T t) It is sometimes useful to consider the cumulative risk of an event. The cumulative risk function is defined by R(t) = 1 F(t) Note that the definition of F(t) leads to left continuous CDF rather than the usual right continuity.
3 Probability Density Function Hazard Function Continuous Distributions Continuous T. which gives f (t) = F (t) = F(t) = Pr(t T < t + ) lim 0+ t f (u)du Discrete T. Usually handled by assigning to the density a component f j δ(t a j ) for an atom f j at a j where δ( ) is the Dirac delta function. The hazard function (also called agespecific failure rate or force of mortality is defined by h(t) = Pr(t T < t + t T ) lim 0+ By the definition of conditional probability h(t) = f (t)/f(t) Sometimes the hazard is denoted by λ(t). Hazard Function Continuous Distributions (continued) By the definitions of f (t) and that F(0) = 1 we have and h(t) = F (t)/f(t) = d log F(t)/dt ( t ) F(t) = exp h(u)du 0 = exp[ H(t)] Hazard Function Discrete Distributions If there is an atom f j of probability at time a j then h j = f j /F(a j ) = f j /(f j + f j+1 + ) With this definition it can be shown that F(t) = a j <t(1 h j ) By defining the integrated hazard as where H( ) is called the integrated hazard. Finally, f (t) = h(t)exp[ H(t)] we still have H(t) = log(1 h j ) a j <t F(t) = exp[ H(t)]
4 The Likelihood Function Survival Models We assume each patient has a failure time t i and a censoring time c i associated with them. In practice, we observe x i = min(t i, c i ). L = u f (t i; φ) c F(c i; φ) l = u log f (t i; φ) + c log F(c i; φ) l = u log h(x i; φ) + log F(x i ; φ) Suppose in addition to survival some covariates z are measured with the intent of modeling their association with survival. For some baseline hazard function h 0 (t) (or survivor function F 0 (t) we have the following commonly used models. The Accelerated Life model is defined by F(t; z) = F 0 [tψ(z)] or h(t; z) = h 0 [tψ(z)]ψ(z) The choice of F 0 is dictated by a choice of parametric model. The Proportional Hazard model is defined by F(t; z) = [F 0 (t)] ψ(z) or h(t; z) = ψ(z)h 0 (t) The baseline hazard does not need to be explicitly known for this model. The Likelihood Function We begin by assuming a discrete distribution atoms of probability at g unique failure times t 1, t 2,..., t g. It can be shown that the loglikelihood is l = j [d j log h j + (r j d j ) log(1 h j )] where r j is the number of patients in view at time t j and d j is the number who fail at t j. By convention, r j includes individuals who are censored at t j. This is identical to the log likelihood for g independent binomials with r j trials, d j failures and probability of failure h j. This fact is used in the derivation of the results on the next slide. ProductLimit or KaplanMeier Estimate Suppose g unique failure times are observed. Put them in order from smallest to largest. t 1, t 2,..., t g The survivor function is estimated at time t j by ˆF(t j ) = j i=1 ( 1 d ) i = r i j ( ) ri d i Note the similarity between the first expression above and the expression for a discrete survivor function, F(t) = a j <t (1 h j) given previously. The variance of ˆF(t) at time t j is (Greenwood s formula) V ( ˆF(t j )) = [ ˆF(t j )] 2 j i=1 i=1 r i d i r i (r i d i )
5 Suppose we had the following data (+ indicates censored observations). [1] [12] [23] t j r j d j P(t j ) = 1 d j /r j ˆF(t j ) SE /27 = = /25 = = /21 = = /19 = = /17 = = /13 = = /12 = = /11 = = /10 = = /6 = = /4 = = Proportion Event Free Kaplan Meier Survivor Function with Greenwood 95% Confidence Interval R Code > library(survival) > eg1a.km < survfit(surv(days, status), data = eg1a) > summary(eg1a.km) > plot(eg1a.km, mark.time = FALSE, xlab = "Days", + ylab = "Proportion EventFree", + main = "KaplanMeier Survivor Function\n with Greenwood 95% Confidence Interval") Days
6 Actuarial Estimator Comparing Survival Suppose instead of exact failure/censoring times we know only the numbers of failures and censored observations between time intervals. For each interval (t j 1, t j ] we define r j 1 to be the number at risk at t j 1, d j to be the number who fail during the interval and m j to be the number censored during the interval. Define r j = r j m j. Then, the actuarial estimator of survival is F(t j ) = ( 1 d ) k r k j k There are a variety of methods that can be used to compare survival in various ways. Point in time comparison (two groups) MantelHaenszel or Log Rank Test (two or more survival curves) Modelbased tests (two or more groups) Point in Time Comparison The object is to compare survival between two treatments at a prespecified time point t 0. From the KaplanMeier procedure and Greenwood s formula we have, Treatment : ˆF T (t 0 ) and V [ ˆF T (t 0 )] Control : ˆF C (t 0 ) and V [ ˆF C (t 0 )] Then, to conduct an approximate test of the null hypothesis H 0 : F T (t 0 ) = F C (t 0 ) we calculate the Z statistic, Z = Under H 0, Z N(0, 1). ˆF T (t 0 ) ˆF C (t 0 ) V [ ˆF T (t 0 )] + V [ ˆF C (t 0 )] MantelHaenszel Test Consider the two survival curves as a series of 2x2 contingency tables constructed at each time point where one or more events occur. At a time point t j we have a table of the form: Treatment Control Total Dead d Tj d Cj d j Alive n Tj d Tj n Cj d Cj N j d j Total n Tj n Cj N j Then, from the hypergeometric distribution, E(d Tj ) = n T j d j N j V (d Tj ) = n T j n Cj (N j d j )d j N 2 j (N j 1)
7 MantelHaenszel Test Continued Comparing Survival in R Define the following quantities. O T = d Tj E T = E(d Tj ) O C = d j O T E C = d j E T V = V (d Tj ) Now, the null hypothesis H 0 : F T (t) = F C (t) is usually tested with a χ 2 (1) test, but may also be tested with a Z test. and χ 2 = (O T E T ) 2 V Z = O T E T V In R the survdiff() function is used to compare survival curves. This performs tests in a family described by Harrington and Fleming (Biometrika: 1982) where the weights on each event time are F(t) ρ. The survdiff() function accepts an argument rho. If rho = 0 (the default) you get the MantelHaenszel test and if rho = 1 you get the Peto & Peto modification of the GehanWilcoxon test. This is often called the Log Rank Test. Proportion Event Free > survdiff(surv(days, status) ~ rx, data = eg1b) Call: survdiff(formula = Surv(days, status) ~ rx, data = eg1b) N Observed Expected (OE)^2/E (OE)^2/V rx=control rx=treatment Chisq= 0.9 on 1 degrees of freedom, p= Days
8 Measuring the Treatment Effect Among other things, clinical trials are intended to provide an estimate of the treatment effect. For a survival outcome, two types of estimates are commonly used. Point in time Risk Ratio or Risk Reduction. All time or whole curve effect measures by means of Hazard Ratios. Point in Time Treatment Effect Suppose a trial is comparing a treatment T to a control C. The usual expectation would be that the risk of the event would be lower in the treatment group than the control group. There may be situations where there is some clinically meaningful time point t 0 say, that is of interest. Consider three point in time estimates of risk reduction. Risk Difference = R C (t 0 ) R T (t 0 ) Risk Ratio = R T (t 0 )/R C (t 0 ) Relative Risk Reduction = R C (t 0 ) R T (t 0 ) R C (t 0 ) = 1 R T (t 0 ) R C (t 0 ) Estimating Risk Difference Estimate T & C survival curves using KaplanMeier procedure and compute. RD = ˆR C (t 0 ) ˆR T (t 0 ) = (1 ˆF C (t 0 )) (1 ˆF T (t 0 )) = ˆF T (t 0 ) ˆF C (t 0 ) Calculate V ( ˆF T (t 0 )) and V ( ˆF C (t 0 )) using Greenwood s formula (ie. SE 2 ). Then, 95% CI on the RD is (approximately): RD ± 1.96 V ( ˆF T (t)) + V ( ˆF C (t)) Estimating Risk Ratio Estimate T & C survival curves using KaplanMeier procedure and compute. RR = ˆR T (t 0 )/ˆR C (t 0 ) = 1 ˆF T (t 0 ) 1 ˆF C (t 0 ) Calculate V (log ˆR T (t 0 )) and V (log ˆR C (t 0 )) using the following formula. V (log ˆR(t 0 )) = V ( ˆF(t 0 )) (ˆR(t 0 )) 2 Then, 95% CI on RR is (approximately): RR e ±1.96 V (log ˆR T (t 0 ))+V (log ˆR C (t 0 ))
9 Estimating Relative Risk Reduction Estimating Hazard Ratio Estimate T & C survival curves using KaplanMeier procedure and compute. RRR = 1 ˆR T (t 0 ) ˆR C (t 0 ) = 1 RR Determine the CI for RR. Say this is (RR L, RR U ). The the CI for RRR is (1 RR U, 1 RR L ). The MantelHaenzsel test calculations enable estimation of the hazard ratio. ˆψ = O T /E T O C /E C V [log ˆψ] = E T E C Various modelbased estimates are preferred where appropriate. Introduction to Parametric Survival Distributions The Exponential Distribution We saw earlier that the KaplanMeier estimate of survival is not particularly smooth. By assuming a particular parametric form for the survival distribution, smooth estimates of survival can be obtained. Some commonly used distributions in survival analysis are: Exponential Weibull Lognormal Loglogistic We will look at the exponential and Weibull distributions in more detail. For the exponential distribution with parameter ρ and mean 1/ρ, we have F(t) = e ρt, f (t) = ρe ρt, h(t) = ρ, H(t) = ρt A plot of H(t) = log( ˆF(t)), where ˆF(t) is the KaplanMeier estimate versus t will be linear if the exponential distribution is appropriate. The MLE for ρ is ˆρ = d/ x i where d is the total number of failures.
10 The Exponential Likelihood Likelihood Ratio Based Confidence Interval For exponential failure times with d failures we get l(ρ; x) = u log h(x i; φ) + log F(x i ; φ) = u log ρ ρ x i = d log ρ ρ x i U(ρ) = d/ρ x i I (ρ) = d/ρ 2 It is easy to show that solving U(ρ) = 0 gives ˆρ = d/ x i as the MLE and 1/I (ˆρ) is an estimate of its standard error, although not the best to use in practice. Confidence intervals based on the likelihood ratio statistic or from a fitted model are preferred. The likelihood ratio statistic for the null hypothesis ρ = ρ 0 is W (ρ 0 ) = W = 2[l(ˆρ) l(ρ 0 )] This has a nice simple form in the exponential case and is distributed approximately as a chisquared variate with 1 degree of freedom. A 1 α confidence region is {ρ : W (ρ) c 1,α} where c1,α is the upper α point of the chisquared distribution with 1 degree of freedom. Cumulative Hazard Plot Direct computation of hazard and symmetric 95% CI. > rhohat < sum(eg1a$status)/sum(eg1a$days) > se.rhohat < sqrt(rhohat^2/sum(eg1a$status)) > symci < rhohat + c(1, 1) * qnorm(0.975) * se.rhohat > c(rhohat, symci) [1] H(t) Time
11 log likelihood Exponential Log Likelihood Model based estimate. > eg1a.sr < survreg(surv(days, status) ~ 1, data = eg1a, + dist = "exponential") > summary(eg1a.sr) Call: survreg(formula = Surv(days, status) ~ 1, data = eg1a, dist = "exponenti Value Std. Error z p (Intercept) e63 Scale fixed at 1 Exponential distribution Loglik(model)= 70 Loglik(intercept only)= 70 Number of NewtonRaphson Iterations: 5 n= ρ > exp(coef(eg1a.sr)) (Intercept) The Two Group Problem Comparing confidence intervals: Symmetric: Likelihood Based: Model Based: Suppose you have exponentially distributed failure data for a treatment group T and a control group C. Estimate ˆρ T = d T / x Ti, ˆρ C = d C / x Ci and ˆψ = ˆρ T /ˆρ C. It turns out that V (log ˆψ) 1/d T + 1/d C which permits approximate Ztests and confidence intervals to be calculated.
12 Direct Calculation > rhohats < tapply(eg1b$status, eg1b$rx, sum)/tapply(eg1b$days, + eg1b$rx, sum) > hr < rhohats[2]/rhohats[1] > lhrse < sqrt(sum(1/tapply(eg1b$status, eg1b$rx, + sum))) > c(hr, log(hr), lhrse) Treatment Treatment Model based. > eg1b.sr < survreg(surv(days, status) ~ rx, data = eg1b, + dist = "exponential") > summary(eg1b.sr) Call: survreg(formula = Surv(days, status) ~ rx, data = eg1b, dist = "exponent Value Std. Error z p (Intercept) e63 rxtreatment e01 Scale fixed at 1 Exponential distribution Loglik(model)= Loglik(intercept only)= Chisq= 0.71 on 1 degrees of freedom, p= 0.4 Number of NewtonRaphson Iterations: 5 n= 54 The Weibull Distribution The Weibull distribution has the following: Model based (continued). > exp(coef(eg1b.sr)[2]) rxtreatment F(t) = exp[ (ρt) κ ] f (t) = κρ(ρt) κ 1 exp[ (ρt) κ ] h(t) = κρ(ρt) κ 1 Note that if κ = 1 we are left with an exponential distribution with a hazard of ρ. A plot of log( log( ˆF(t))) versus log(t) will be linear if the Weibull distribution is appropriate. The biggest difficulty is that the coefficient estimates don t have nice straight forward interpretations like other models we ve used. A scale parameter is estimated and e β i /scale gives a hazard ratio.
13 Fitting a Weibull in R log(h(t) log(t) > eg2b.sr < survreg(surv(days, status) ~ rx, data = eg1b, + dist = "weibull") > summary(eg2b.sr) Call: survreg(formula = Surv(days, status) ~ rx, data = eg1b, dist = "weibull" Value Std. Error z p (Intercept) e176 rxtreatment e01 Log(scale) e03 Scale= Weibull distribution Loglik(model)= Loglik(intercept only)= Chisq= 0.83 on 1 degrees of freedom, p= 0.36 Number of NewtonRaphson Iterations: 8 n= 54 The hazard ratio for treatment is given by: > exp(coef(eg2b.sr)[2]/eg2b.sr$scale) rxtreatment Survival Days
14 Loglogistic and Lognormal Distributions Computing log hazard in R These distributions are very similar. Both result in nonproportional hazards models. If log T has a logistic distribution, we say T is loglogistic If log T as a normal distribtion, we say T is lognormal If log h(t) is nonmonotonic in t then these distributions are possible. > kmf < survfit(surv(days, status) ~ rx, data = eg1b) > H1 < log(kmf[1]$surv) > H2 < log(kmf[2]$surv) > Ht1 < eg1b.km[1]$time > Ht2 < eg1b.km[2]$time > dh1 < diff(h1)/diff(ht1) > dh2 < diff(h2)/diff(ht2) > loghaz < data.frame(ldh = log(c(dh1, dh2)), times = c(ht1[1], + Ht2[1]), rx = factor(c(rep(1, length(dh1)), + rep(2, length(dh2))), labels = c("control", + "Treatment"))) 2 Plotting log hazard in R > library(lattice) > xyplot(ldh ~ times, data = loghaz, groups = rx, + type = c("p", "smooth"), ylab = "log hazard", + xlab = "Time") log hazard Time
15 Fitting a loglogistic Model in R > llog.fit < survreg(surv(days, status) ~ rx, data = eg1b, + dist = "loglogistic") > summary(llog.fit) Call: > pct < 0:99/100 survreg(formula = Surv(days, status) ~ rx, data = eg1b, dist = "loglogistic") > llog.pred < predict(llog.fit, newdata = data.frame(rx = factor(1:2, Value Std. Error z p + labels = c("control", "Treatment"))), type = "quantile", (Intercept) e p = pct) rxtreatment e01 > plot(kmf, mark.time = FALSE, lty = 1:2) Log(scale) e05 > matlines(t(llog.pred), cbind(1  pct, 1  pct), + col = "red", lty = 1:2) Scale= 0.48 Log logistic distribution Loglik(model)= Loglik(intercept only)= Chisq= 0.71 on 1 degrees of freedom, p= 0.4 Number of NewtonRaphson Iterations: 5 n= 54 Plotting the Result Adding Weibull > weib.pred < predict(eg2b.sr,, newdata = data.frame(rx = factor(1:2, + labels = c("control", "Treatment"))), type = "quantile", + p = pct) > matlines(t(weib.pred), cbind(1  pct, 1  pct), + col = "blue", lty = 1:2)
16 References Cox D.R. and Oakes D. (1984). Analysis of Survival Data. Chapman and Hall Kalbfleisch J.D. and Prentice R.L. (1980). The Statistical Analysis of Failure Time Data. New York: Wiley
STAT331. Cox s Proportional Hazards Model
STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations
More informationSurvival Distributions, Hazard Functions, Cumulative Hazards
BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution
More informationSurvival Analysis. Lu Tian and Richard Olshen Stanford University
1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival
More informationLogistic regression model for survival time analysis using timevarying coefficients
Logistic regression model for survival time analysis using timevarying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshimau.ac.jp Research
More informationST495: Survival Analysis: Maximum likelihood
ST495: Survival Analysis: Maximum likelihood Eric B. Laber Department of Statistics, North Carolina State University February 11, 2014 Everything is deception: seeking the minimum of illusion, keeping
More informationChapter 4 Regression Models
23.August 2010 Chapter 4 Regression Models The target variable T denotes failure time We let x = (x (1),..., x (m) ) represent a vector of available covariates. Also called regression variables, regressors,
More informationLecture 3. Truncation, lengthbias and prevalence sampling
Lecture 3. Truncation, lengthbias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in
More informationA COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky
A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),
More informationMultistate Modeling and Applications
Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)
More informationBIOS 312: Precision of Statistical Inference
and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample
More informationBIAS OF MAXIMUMLIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY
BIAS OF MAXIMUMLIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca LenzTönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1
More informationSurvival Analysis APTS 2016/17. Ingrid Van Keilegom ORSTAT KU Leuven. Glasgow, August 2125, 2017
Survival Analysis APTS 2016/17 Ingrid Van Keilegom ORSTAT KU Leuven Glasgow, August 2125, 2017 Basic What is Survival analysis? Survival analysis (or duration analysis) is an area of statistics that and
More informationParameters Estimation for a Linear Exponential Distribution Based on Grouped Data
International Mathematical Forum, 3, 2008, no. 33, 16431654 Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data A. Alkhedhairi Department of Statistics and O.R. Faculty
More informationReliability Engineering I
Happiness is taking the reliability final exam. Reliability Engineering I ENM/MSC 565 Review for the Final Exam Vital Statistics What R&M concepts covered in the course When Monday April 29 from 4:30 6:00
More informationSAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTERRANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)
SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTERRANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000) AMITA K. MANATUNGA THE ROLLINS SCHOOL OF PUBLIC HEALTH OF EMORY UNIVERSITY SHANDE
More informationThe coxvc_111 package
Appendix A The coxvc_111 package A.1 Introduction The coxvc_111 package is a set of functions for survival analysis that run under R2.1.1 [81]. This package contains a set of routines to fit Cox models
More informationInvestigation of goodnessoffit test statistic distributions by random censored samples
d samples Investigation of goodnessoffit test statistic distributions by random censored samples Novosibirsk State Technical University November 22, 2010 d samples Outline 1 Nonparametric goodnessoffit
More informationAnalysis of competing risks data and simulation of data following predened subdistribution hazards
Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013
More informationTied survival times; estimation of survival probabilities
Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation
More informationContinuous Time Survival in Latent Variable Models
Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract
More informationExam C Solutions Spring 2005
Exam C Solutions Spring 005 Question # The CDF is F( x) = 4 ( + x) Observation (x) F(x) compare to: Maximum difference 0. 0.58 0, 0. 0.58 0.7 0.880 0., 0.4 0.680 0.9 0.93 0.4, 0.6 0.53. 0.949 0.6, 0.8
More informationAFT Models and Empirical Likelihood
AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t
More informationLoglinearity for Cox s regression model. Thesis for the Degree Master of Science
Loglinearity for Cox s regression model Thesis for the Degree Master of Science Zaki Amini Master s Thesis, Spring 2015 i Abstract Cox s regression model is one of the most applied methods in medical
More informationCOMPLEMENTARY LOGLOG MODEL
COMPLEMENTARY LOGLOG MODEL Under the assumption of binary response, there are two alternatives to logit model: probit model and complementaryloglog model. They all follow the same form π ( x) =Φ ( α
More informationLecture 11. Interval Censored and. DiscreteTime Data. Statistics Survival Analysis. Presented March 3, 2016
Statistics 255  Survival Analysis Presented March 3, 2016 Motivating Dan Gillen Department of Statistics University of California, Irvine 11.1 First question: Are the data truly discrete? : Number of
More informationNotes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes.
Unit 2: Models, Censoring, and Likelihood for FailureTime Data Notes largely based on Statistical Methods for Reliability Data by W.Q. Meeker and L. A. Escobar, Wiley, 1998 and on their class notes. Ramón
More informationSigmaplot di Systat Software
Sigmaplot di Systat Software SigmaPlot Has Extensive Statistical Analysis Features SigmaPlot is now bundled with SigmaStat as an easytouse package for complete graphing and data analysis. The statistical
More informationSTAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where
STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More informationLongitudinal Modeling with Logistic Regression
Newsom 1 Longitudinal Modeling with Logistic Regression Longitudinal designs involve repeated measurements of the same individuals over time There are two general classes of analyses that correspond to
More informationHYPOTHESIS TESTING: FREQUENTIST APPROACH.
HYPOTHESIS TESTING: FREQUENTIST APPROACH. These notes summarize the lectures on (the frequentist approach to) hypothesis testing. You should be familiar with the standard hypothesis testing from previous
More informationMüller: Goodnessoffit criteria for survival data
Müller: Goodnessoffit criteria for survival data Sonderforschungsbereich 386, Paper 382 (2004) Online unter: http://epub.ub.unimuenchen.de/ Projektpartner Goodness of fit criteria for survival data
More informationAccelerated life testing in the presence of dependent competing causes of failure
isid/ms/25/5 April 21, 25 http://www.isid.ac.in/ statmath/eprints Accelerated life testing in the presence of dependent competing causes of failure Isha Dewan S. B. Kulathinal Indian Statistical Institute,
More informationA comparison of inverse transform and composition methods of data simulation from the Lindley distribution
Communications for Statistical Applications and Methods 2016, Vol. 23, No. 6, 517 529 http://dx.doi.org/10.5351/csam.2016.23.6.517 Print ISSN 22877843 / Online ISSN 23834757 A comparison of inverse transform
More informationExtensions of Cox Model for NonProportional Hazards Purpose
PhUSE 2013 Paper SP07 Extensions of Cox Model for NonProportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used
More informationStatistics 262: Intermediate Biostatistics Regression & Survival Analysis
Statistics 262: Intermediate Biostatistics Regression & Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Introduction This course is an applied course,
More informationImplementing ResponseAdaptive Randomization in MultiArmed Survival Trials
Implementing ResponseAdaptive Randomization in MultiArmed Survival Trials BASS Conference 2009 Alex Sverdlov, BristolMyers Squibb A.Sverdlov (BMS) ResponseAdaptive Randomization 1 / 35 Joint work
More informationGeneralized linear models for binary data. A better graphical exploratory data analysis. The simple linear logistic regression model
Stat 3302 (Spring 2017) Peter F. Craigmile Simple linear logistic regression (part 1) [Dobson and Barnett, 2008, Sections 7.1 7.3] Generalized linear models for binary data Beetles doseresponse example
More informationBinomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials
Lecture : Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 27 Binomial Model n independent trials (e.g., coin tosses) p = probability of success on each trial (e.g., p =! =
More informationFrailty Modeling for clustered survival data: a simulation study
Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ  IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ  IHEC University
More informationContinuous case Discrete case General case. Hazard functions. Patrick Breheny. August 27. Patrick Breheny Survival Data Analysis (BIOS 7210) 1/21
Hazard functions Patrick Breheny August 27 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/21 Introduction Continuous case Let T be a nonnegative random variable representing the time to an event
More informationECON 721: Lecture Notes on Duration Analysis. Petra E. Todd
ECON 721: Lecture Notes on Duration Analysis Petra E. Todd Fall, 213 2 Contents 1 Two state Model, possible nonstationary 1 1.1 Hazard function.......................... 1 1.2 Examples.............................
More informationKernel density estimation in R
Kernel density estimation in R Kernel density estimation can be done in R using the density() function in R. The default is a Guassian kernel, but others are possible also. It uses it s own algorithm to
More information18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages
Name No calculators. 18.05 Final Exam Number of problems 16 concept questions, 16 problems, 21 pages Extra paper If you need more space we will provide some blank paper. Indicate clearly that your solution
More informationFrailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.
Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk
More informationStatistical Methods in Clinical Trials Categorical Data
Statistical Methods in Clinical Trials Categorical Data Types of Data quantitative Continuous Blood pressure Time to event Categorical sex qualitative Discrete No of relapses Ordered Categorical Pain level
More informationSemiparametric maximum likelihood estimation in normal transformation models for bivariate survival data
Biometrika (28), 95, 4,pp. 947 96 C 28 Biometrika Trust Printed in Great Britain doi: 1.193/biomet/asn49 Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival
More informationCount and Duration Models
Count and Duration Models Stephen Pettigrew April 2, 2014 Stephen Pettigrew Count and Duration Models April 2, 2014 1 / 61 Outline 1 Logistics 2 Last week s assessment question 3 Counts: Poisson Model
More informationStat 642, Lecture notes for 04/12/05 96
Stat 642, Lecture notes for 04/12/05 96 HosmerLemeshow Statistic The HosmerLemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal
More informationNotes from Survival Analysis
Notes from Survival Analysis Cambridge Part III Mathematical Tripos 20122013 Lecturer: Peter Treasure Vivak Patel March 23, 2013 1 Contents 1 Introduction, Censoring and Hazard 4 1.1 Introduction..............................
More informationHypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2)
Hypothesis Testing, Power, Sample Size and Confidence Intervals (Part 2) B.H. Robbins Scholars Series June 23, 2010 1 / 29 Outline Ztest χ 2 test Confidence Interval Sample size and power Relative effect
More informationUnderstanding the Cox Regression Models with TimeChange Covariates
Understanding the Cox Regression Models with TimeChange Covariates Mai Zhou University of Kentucky The Cox regression model is a cornerstone of modern survival analysis and is widely used in many other
More informationSTAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and QQ plots. March 8, 2015
STAT 135 Lab 6 Duality of Hypothesis Testing and Confidence Intervals, GLRT, Pearson χ 2 Tests and QQ plots March 8, 2015 The duality between CI and hypothesis testing The duality between CI and hypothesis
More informationAnalysis of Cure Rate Survival Data Under Proportional Odds Model
Analysis of Cure Rate Survival Data Under Proportional Odds Model Yu Gu 1,, Debajyoti Sinha 1, and Sudipto Banerjee 2, 1 Department of Statistics, Florida State University, Tallahassee, Florida 32310 5608,
More informationGeneralized linear models
Generalized linear models Douglas Bates November 01, 2010 Contents 1 Definition 1 2 Links 2 3 Estimating parameters 5 4 Example 6 5 Model building 8 6 Conclusions 8 7 Summary 9 1 Generalized Linear Models
More informationMultinomial Logistic Regression Models
Stat 544, Lecture 19 1 Multinomial Logistic Regression Models Polytomous responses. Logistic regression can be extended to handle responses that are polytomous, i.e. taking r>2 categories. (Note: The word
More information6 Single Sample Methods for a Location Parameter
6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually
More informationSTA6938Logistic Regression Model
Dr. Ying Zhang STA6938Logistic Regression Model Topic 2Multiple Logistic Regression Model Outlines:. Model Fitting 2. Statistical Inference for Multiple Logistic Regression Model 3. Interpretation of
More informationReports of the Institute of Biostatistics
Reports of the Institute of Biostatistics No 02 / 2008 Leibniz University of Hannover Natural Sciences Faculty Title: Properties of confidence intervals for the comparison of small binomial proportions
More informationFrom semi to nonparametric inference in general time scale models
From semi to nonparametric inference in general time scale models Thierry DUCHESNE duchesne@matulavalca Département de mathématiques et de statistique Université Laval Québec, Québec, Canada Research
More informationMonitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison
Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs Christopher Jennison Department of Mathematical Sciences, University of Bath http://people.bath.ac.uk/mascj
More informationPromotion Time Cure Rate Model with Random Effects: an Application to a Multicentre Clinical Trial of Carcinoma
www.srljournal.org Statistics Research Letters (SRL) Volume Issue, May 013 Promotion Time Cure Rate Model with Random Effects: an Application to a Multicentre Clinical Trial of Carcinoma Diego I. Gallardo
More informationConstant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with TypeI Censoring
Algorithms Research 013, (): 4349 DOI: 10.593/j.algorithms.01300.0 Constant Stress Partially Accelerated Life Test Design for Mustafa Kamal *, Shazia Zarrin, ArifUlIslam Department of Statistics & Operations
More informationA Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes
A Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes Ritesh Ramchandani Harvard School of Public Health August 5, 2014 Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes
More informationA Very Brief Summary of Statistical Inference, and Examples
A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random
More informationStatistical Prediction Based on Censored Life Data
Editor s Note: Results following from the research reported in this article also appear in Chapter 12 of Meeker, W. Q., and Escobar, L. A. (1998), Statistical Methods for Reliability Data, New York: Wiley.
More informationMultivariate Survival Data With Censoring.
1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine HuberCarol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11220, 1 Baruch way, 10010 NY.
More informationLecture 7: Hypothesis Testing and ANOVA
Lecture 7: Hypothesis Testing and ANOVA Goals Overview of key elements of hypothesis testing Review of common one and two sample tests Introduction to ANOVA Hypothesis Testing The intent of hypothesis
More informationMaximum Likelihood Based Estimation of Hazard Function under Shape. Restrictions and Related Statistical Inference.
Maximum Likelihood Based Estimation of Hazard Function under Shape Restrictions and Related Statistical Inference by Desale Habtzghi (Under the direction of Somnath Datta and Mary Meyer ) Abstract The
More information5 Introduction to the Theory of Order Statistics and Rank Statistics
5 Introduction to the Theory of Order Statistics and Rank Statistics This section will contain a summary of important definitions and theorems that will be useful for understanding the theory of order
More informationE509A: Principle of Biostatistics. (Week 11(2): Introduction to nonparametric. methods ) GY Zou.
E509A: Principle of Biostatistics (Week 11(2): Introduction to nonparametric methods ) GY Zou gzou@robarts.ca Sign test for two dependent samples Ex 12.1 subj 1 2 3 4 5 6 7 8 9 10 baseline 166 135 189
More informationA note on R 2 measures for Poisson and logistic regression models when both models are applicable
Journal of Clinical Epidemiology 54 (001) 99 103 A note on R measures for oisson and logistic regression models when both models are applicable Martina Mittlböck, Harald Heinzl* Department of Medical Computer
More informationJournal of Statistical Software
JSS Journal of Statistical Software January 2011, Volume 38, Issue 2. http://www.jstatsoft.org/ Analyzing Competing Risk Data Using the R timereg Package Thomas H. Scheike University of Copenhagen MeiJie
More informationThe Type I Generalized Half Logistic Survival Model
International Journal of Theoretical and Applied Mathematics 2016; 2(2): 7478 http://www.sciencepublishinggroup.com/j/ijtam doi: 10.11648/j.ijtam.20160202.17 The Type I Generalized Half Logistic urvival
More informationtime = c(4.5, 3.9, 1.0, 2.3, 4.8, 0.1, 0.3, 2.0, 0.8, 3.0, 0.3, 2.1) answer = c( 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0)
Chapter 15. Censored Data Models Suppose you are given a small data set from a call center on how long the customers stayed on the phone, until either hanging up or being answered. The data set looks like
More informationLinear Models: Comparing Variables. Stony Brook University CSE545, Fall 2017
Linear Models: Comparing Variables Stony Brook University CSE545, Fall 2017 Statistical Preliminaries Random Variables Random Variables X: A mapping from Ω to ℝ that describes the question we care about
More informationBinary choice. Michel Bierlaire
Binary choice Michel Bierlaire Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique Fédérale de Lausanne M. Bierlaire (TRANSPOR ENAC EPFL)
More informationMachine Learning Linear Classification. Prof. Matteo Matteucci
Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)
More informationIntegrated likelihoods in survival models for highlystratified
Working Paper Series, N. 1, January 2014 Integrated likelihoods in survival models for highlystratified censored data Giuliana Cortese Department of Statistical Sciences University of Padua Italy Nicola
More informationModeling and inference for an ordinal effect size measure
STATISTICS IN MEDICINE Statist Med 2007; 00:1 15 Modeling and inference for an ordinal effect size measure Euijung Ryu, and Alan Agresti Department of Statistics, University of Florida, Gainesville, FL
More informationRandomization Tests for Regression Models in Clinical Trials
Randomization Tests for Regression Models in Clinical Trials A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at George Mason University By Parwen
More informationEstimation of Conditional Kendall s Tau for Bivariate Interval Censored Data
Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 22877843 / Online ISSN 23834757 Estimation of Conditional
More informationChapter 4 Parametric Families of Lifetime Distributions
Chapter 4 Parametric Families of Lifetime istributions In this chapter, we present four parametric families of lifetime distributions Weibull distribution, gamma distribution, changepoint model, and mixture
More informationFRAILTY MODELS FOR MODELLING HETEROGENEITY
FRAILTY MODELS FOR MODELLING HETEROGENEITY By ULVIYA ABDULKARIMOVA, B.Sc. A Thesis Submitted to the School of Graduate Studies in Partial Fulfillment of the Requirements for the Degree Master of Science
More informationAttributable Risk Function in the Proportional Hazards Model
UW Biostatistics Working Paper Series 5312005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu
More informationSample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA
Sample Size and Power I: Binary Outcomes James Ware, PhD Harvard School of Public Health Boston, MA Sample Size and Power Principles: Sample size calculations are an essential part of study design Consider
More informationLecture 8 Stat D. Gillen
Statistics 255  Survival Analysis Presented February 23, 2016 Dan Gillen Department of Statistics University of California, Irvine 8.1 Example of two ways to stratify Suppose a confounder C has 3 levels
More informationLogistic regression: Miscellaneous topics
Logistic regression: Miscellaneous topics April 11 Introduction We have covered two approaches to inference for GLMs: the Wald approach and the likelihood ratio approach I claimed that the likelihood ratio
More informationYour use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
The Analysis of Exponentially Distributed LifeTimes with Two Types of Failure Author(s): D. R. Cox Source: Journal of the Royal Statistical Society. Series B (Methodological), Vol. 21, No. 2 (1959), pp.
More informationA new strategy for metaanalysis of continuous covariates in observational studies with IPD. Willi Sauerbrei & Patrick Royston
A new strategy for metaanalysis of continuous covariates in observational studies with IPD Willi Sauerbrei & Patrick Royston Overview Motivation Continuous variables functional form Fractional polynomials
More informationDouble Bootstrap Confidence Interval Estimates with Censored and Truncated Data
Journal of Modern Applied Statistical Methods Volume 13 Issue 2 Article 22 112014 Double Bootstrap Confidence Interval Estimates with Censored and Truncated Data Jayanthi Arasan University Putra Malaysia,
More informationANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL
Statistica Sinica 18(28, 219234 ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Wenbin Lu and Yu Liang North Carolina State University and SAS Institute Inc.
More informationStat 475 Life Contingencies I. Chapter 2: Survival models
Stat 475 Life Contingencies I Chapter 2: Survival models The future lifetime random variable Notation We are interested in analyzing and describing the future lifetime of an individual. We use (x) to denote
More informationGlossary for the Triola Statistics Series
Glossary for the Triola Statistics Series Absolute deviation The measure of variation equal to the sum of the deviations of each value from the mean, divided by the number of values Acceptance sampling
More informationMODULE 6 LOGISTIC REGRESSION. Module Objectives:
MODULE 6 LOGISTIC REGRESSION Module Objectives: 1. 147 6.1. LOGIT TRANSFORMATION MODULE 6. LOGISTIC REGRESSION Logistic regression models are used when a researcher is investigating the relationship between
More informationLogistic Regression. Interpretation of linear regression. Other types of outcomes. 01 response variable: Wound infection. Usual linear regression
Logistic Regression Usual linear regression (repetition) y i = b 0 + b 1 x 1i + b 2 x 2i + e i, e i N(0,σ 2 ) or: y i N(b 0 + b 1 x 1i + b 2 x 2i,σ 2 ) Example (DGA, p. 336): E(PEmax) = 47.355 + 1.024
More informationINVERTED KUMARASWAMY DISTRIBUTION: PROPERTIES AND ESTIMATION
Pak. J. Statist. 2017 Vol. 33(1), 3761 INVERTED KUMARASWAMY DISTRIBUTION: PROPERTIES AND ESTIMATION A. M. Abd ALFattah, A.A. ELHelbawy G.R. ALDayian Statistics Department, Faculty of Commerce, ALAzhar
More informationUNIVERSITÄT POTSDAM Institut für Mathematik
UNIVERSITÄT POTSDAM Institut für Mathematik Testing the Acceleration Function in Life Time Models Hannelore Liero Matthias Liero Mathematische Statistik und Wahrscheinlichkeitstheorie Universität Potsdam
More informationThe Accelerated Failure Time Model Under Biased. Sampling
The Accelerated Failure Time Model Under Biased Sampling Micha Mandel and Ya akov Ritov Department of Statistics, The Hebrew University of Jerusalem, Israel July 13, 2009 Abstract Chen (2009, Biometrics)
More information