# Lecture 5 Models and methods for recurrent event data


Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

- Recurrent events (the focus of this lecture)
  - time-to-events model (point process model)
  - time-between-events model (gap times model)
  - e.g. repeated infections, hospitalizations, tumor occurrences
- Ordered multiple events
  - HIV → AIDS → death
  - birth → onset age of a genetic disease → death
  - disease staging I → II → III → IV
- Unordered multiple events

### Time-to-events and time-between-events models

Time-to-events models

- Interest focuses on the occurrence rate of recurrent events over time.
- Time is measured from the time origin to the events.
- The time origin could be a fixed calendar time, the onset of treatment, or a biological event.

Time-between-events models

- The outcome variables of interest are the gap times between events.
- This type of model is more relevant when the cycling pattern of recurrent events is strong; for example, women's menstrual cycles.

### 5.1 Time-to-events models

Consider a continuous point process $N(t)$, where $N(t)$ represents the number of events occurring at or prior to $t$, $0 \le t \le \tau$.

**Intensity function.** The intensity function of a continuous point process in $[0, \tau]$ is conventionally defined as the occurrence rate of events given the event history,

$$\lambda(t \mid N_H(t)) = \lim_{\Delta \to 0^+} \frac{\Pr(N(t + \Delta) - N(t) > 0 \mid N_H(t))}{\Delta},$$

where $N_H(t) = \{N(u) : 0 \le u \le t\}$ represents the history of the point process before or at $t$, $t \in [0, \tau]$.

**Remarks**

- The intensity function uniquely determines the probability structure of the point process under regularity conditions.
- For recurrent events, the so-called conditional regression models are constructed on the basis of the intensity function.

**Rate function.** In contrast with the conditional interpretation of the intensity function, a rate function $\lambda(t)$, $t \in [0, \tau]$, is defined as the average number of events per unit time at $t$ for subjects in the population. More precisely,

$$\lambda(t) = \lim_{\Delta \to 0^+} \frac{\Pr(N(t + \Delta) - N(t) > 0)}{\Delta},$$

namely, the occurrence rate at $t$, unconditional on the event history $N_H(t)$.

**Remarks**

- In general, a rate function by itself does not fully determine the probability structure of the point process.
- The rate function is conceptually and quantitatively different from the intensity function, and it coincides with the intensity function only when the process is memoryless.
- For recurrent events, the so-called marginal regression models are constructed on the basis of the rate function.
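As an illustration of how the two functions can differ, consider a gamma-mixed Poisson process. This worked example is an addition under stated assumptions, not part of the lecture: a latent variable $Z \sim \text{Gamma}(k, r)$ (shape $k$, rate $r$) is assumed, and given $Z = z$ the process is a stationary Poisson process with rate $z\lambda_0$.

```latex
\begin{align*}
\text{rate:}\quad
  \lambda(t) &= E[Z]\,\lambda_0 = \frac{k}{r}\,\lambda_0, \\
\text{intensity:}\quad
  \lambda(t \mid N_H(t)) &= E[Z \mid N_H(t)]\,\lambda_0
     = \frac{k + N(t)}{r + \lambda_0 t}\,\lambda_0 .
\end{align*}
```

The second line uses the fact that, given $N(t) = n$, the conditional distribution of $Z$ is $\text{Gamma}(k + n,\; r + \lambda_0 t)$. The rate is constant in $t$, whereas the intensity grows with the number of past events; the two agree only when $Z$ is degenerate, in which case the process is memoryless.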

Define the cumulative rate function (CRF) as

$$\Lambda(t) = \int_0^t \lambda(u)\, du, \quad t \in [0, T_0].$$

The CRF $\Lambda(t)$ is also the expectation of the number of recurrent events occurring in $[0, t]$. Note that $E[N(t)] = \Lambda(t)$; we frequently write $E[dN(t)] = \lambda(t)\, dt$.

#### 5.1.1 Poisson process models

A Poisson process is a counting process model for multiple events occurring over a fixed time interval $[0, \tau]$, $\tau > 0$. The Poisson distribution is the probability distribution of the total number of events, $M$. The Poisson distribution is sometimes also used for modelling a count variable in other situations.

A point process is a stationary Poisson process if the following three conditions are satisfied (sketch):

1. The probability that exactly one event occurs in a small interval $[t, t + h]$ is approximately $\lambda h$, where $\lambda$ is called the intensity (or rate) of events, $\lambda > 0$.
2. The probability that 2 or more events occur at the same time is approximately 0.
3. The numbers of events in disjoint regions are independent.

Let $\mu = \lambda\tau > 0$. The probability mass function of $M$ is

$$f_M(m) = \frac{e^{-\mu} \mu^m}{m!}, \quad m = 0, 1, 2, \ldots$$

A Poisson process is called a non-stationary Poisson process if the occurrence rate, $\lambda(t)$, is time dependent.
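The three conditions can be checked informally by simulation. The sketch below is an illustration, not part of the lecture: it generates a stationary Poisson process from independent exponential gaps and compares the simulated counts with the Poisson probabilities. The function names and the values of `lam` and `tau` are made up for the example.

```python
import math
import random


def simulate_poisson_process(lam, tau, rng):
    """Return the event times of a stationary Poisson process on [0, tau].

    Gaps between successive events are independent Exponential(lam) draws.
    """
    times = []
    t = rng.expovariate(lam)
    while t <= tau:
        times.append(t)
        t += rng.expovariate(lam)
    return times


def poisson_pmf(m, mu):
    """Poisson probability f_M(m) = exp(-mu) * mu**m / m!."""
    return math.exp(-mu) * mu ** m / math.factorial(m)


rng = random.Random(0)
lam, tau = 2.0, 3.0  # illustrative values, so mu = lam * tau = 6
counts = [len(simulate_poisson_process(lam, tau, rng)) for _ in range(20000)]

mean_count = sum(counts) / len(counts)
frac_six = sum(c == 6 for c in counts) / len(counts)
print(mean_count, lam * tau)                 # sample mean of M versus mu
print(frac_six, poisson_pmf(6, lam * tau))   # empirical versus exact Pr(M = 6)
```

The sample mean of the counts should be close to $\mu = \lambda\tau$, and the empirical frequencies should track $f_M(m)$.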

#### 5.1.2 Nonparametric estimation of the CRF

**Data.** Let $t_{i1} \le \cdots \le t_{i,m_i}$ be the ordered event times of subject $i$, with $m_i$ defined as the index of the last observed event. The observed data include $\{(m_i, c_i, t_{i1}, \ldots, t_{i,m_i}) : i = 1, \ldots, n\}$.

**Population.** For a single event process (univariate survival time), the risk population at $t$ is composed of subjects who have not failed prior to $t$, so the risk population varies with $t$. In contrast, for a recurrent event process, the risk population at every $t$ coincides with the target population defined at time 0.

**Risk set.** Let $C_i$ represent the terminating time (censoring time) for observing $N_i(t)$. The risk set at $t$ is defined as $\{i : C_i \ge t\}$, which includes the subjects who are under observation at $t$. Define $R_i(t) = I(C_i \ge t)$ as the risk-set indicator, and $R(t) = \sum_{i=1}^n R_i(t)$.

**Independent censoring.** If $C_i$ is independent of $N_i(\cdot)$, the risk set forms a random sample from the risk population at $t$.

Under the independent censoring assumption, for $t > 0$ and $\Delta$ positive-valued but small, a crude estimate of the occurrence probability in $(t - \Delta, t]$ can be constructed as

$$\lambda(t)\,\Delta \approx \frac{\sum_{i=1}^{n} \sum_{j=1}^{m_i} I(t_{ij} \in (t - \Delta, t])}{R(t)}, \tag{1}$$

with $I(\cdot)$ representing the indicator function. The estimate is essentially an empirical measure with time-dependent sample size $R(t)$. A nonparametric estimate of the CRF corresponding to (1) can then be constructed as

$$\hat\Lambda(t) = \sum_{i=1}^{n} \sum_{j=1}^{m_i} \frac{I(t_{ij} \le t)}{R(t_{ij})}; \tag{2}$$

see Nelson (1988, JQT; 1995, Technometrics).
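Estimator (2) is simple to compute: every observed event adds one over the number of subjects still under observation at that time. The following is a minimal sketch, assuming each subject's event times all fall before that subject's censoring time; the function names and the toy data are hypothetical.

```python
def nelson_crf(event_times, censor_times):
    """Nonparametric estimate of the cumulative rate function.

    event_times[i] is the list of event times of subject i and
    censor_times[i] is the end of follow-up C_i. Returns (time, value)
    pairs: the ordered event times with the running sum of 1 / R(t_ij),
    where R(t) = #{i : C_i >= t}.
    """
    def at_risk(t):
        return sum(1 for c in censor_times if c >= t)

    pooled = sorted(t for times in event_times for t in times)
    steps, total = [], 0.0
    for t in pooled:
        total += 1.0 / at_risk(t)
        steps.append((t, total))
    return steps


def crf_at(steps, t):
    """Evaluate the step function returned by nelson_crf at time t."""
    value = 0.0
    for time, total in steps:
        if time <= t:
            value = total
    return value


# Toy data, made up for illustration: three subjects.
events = [[1.0, 3.0], [2.0], []]
censor = [4.0, 2.5, 1.5]
steps = nelson_crf(events, censor)
print(steps)
```

In the toy data the risk-set sizes at the three event times are 3, 2 and 1, so the estimate jumps by 1/3, 1/2 and 1.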

#### 5.1.3 Conditional regression models

Andersen and Gill (1982, AS) proposed a time-to-events model (the AG model) which extended Cox's proportional hazards model from single event data to recurrent event data. Suppose the dates of recurrent events are recorded on a continuous scale (e.g., by days or weeks), and the outcome measures of interest are recurrent events occurring in the time interval $[0, \tau]$, where the constant $\tau > 0$ is chosen with the knowledge that recurrent events could potentially be observed up to $\tau$, say 3 years. Let $N_H(t)$ be the recurrent event history and $Z_H(t)$ the possibly time-dependent covariate history prior to $t$. For $t \in [0, \tau]$, the AG model assumes that events occur over time with the occurrence rate

$$\lambda(t \mid N_H(t), Z_H(t)) = \lambda_0(t) \exp\{X(t)\beta\}, \tag{3}$$

where $X(t) = \phi(N_H(t), Z_H(t))$ is a transformation of $(N_H(t), Z_H(t))$.

**Pros and cons of the conditional regression model**

(i) The AG model can be thought of as a predictive model, since the event history is included as part of the conditional statistics in the rate function.

(ii) Use of the AG model to identify treatment effects is subject to constraints, since the model identifies treatment effects adjusted for subject-specific event history. In general, the AG model is not ideal for identifying treatment effects or population risks.

(iii) If the AG model uses a time-independent covariate, $X(t) = X$, the model is then required to be memoryless: two subjects with the same $X$ but different event histories would be predicted to have the same occurrence rate of events. Thus, if $X$ is a treatment indicator, two patients who receive the same treatment but have different hospitalization records would have the same level of risk for rehospitalization according to the AG model.

**Statistical methods for the conditional regression model.** AG extended the partial likelihood methods from univariate survival data to recurrent event data. The partial likelihood score function for $\beta$ can be derived as

$$U(\beta, t) = \sum_{i=1}^{n} \int_0^t \{X_i(u) - \bar X(\beta, u)\}\, dN_i(u), \tag{4}$$

where

$$\bar X(\beta, t) = \frac{\sum_{i=1}^{n} R_i(t) X_i(t) \exp\{X_i(t)\beta\}}{\sum_{i=1}^{n} R_i(t) \exp\{X_i(t)\beta\}}$$

is the risk-set weighted average of the covariates at $t$. Martingale theory was also developed to establish the large sample properties (as an extension of martingale theory for univariate survival analysis).
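To make the score function (4) concrete, the sketch below evaluates it for a scalar, time-independent covariate and solves $U(\beta, \tau) = 0$ by bisection. It is an illustration under simplifying assumptions, not the authors' implementation: the covariate is fixed over time, the risk-set indicator is taken as $I(C_k \ge t)$, and the function names and toy data are hypothetical.

```python
import math


def ag_score(beta, x, event_times, censor_times):
    """Score U(beta, tau) of (4) for a scalar, time-independent covariate."""
    score = 0.0
    for i, times in enumerate(event_times):
        for t in times:
            # Subjects still under observation at t.
            risk = [k for k, c in enumerate(censor_times) if c >= t]
            weights = [math.exp(x[k] * beta) for k in risk]
            # Risk-set weighted average of the covariate at time t.
            xbar = sum(w * x[k] for w, k in zip(weights, risk)) / sum(weights)
            score += x[i] - xbar
    return score


def solve_score(x, event_times, censor_times, lo=-5.0, hi=5.0, tol=1e-10):
    """Find the root of the score by bisection (the score decreases in beta)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ag_score(mid, x, event_times, censor_times) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Toy data, made up for illustration: two controls (x = 0), two treated (x = 1),
# everyone followed up to time 5.
x = [0, 0, 1, 1]
events = [[1.0], [], [0.5, 2.0], [3.0]]
censor = [5.0, 5.0, 5.0, 5.0]
beta_hat = solve_score(x, events, censor)
print(beta_hat, math.log(3.0))
```

In this balanced toy example the score reduces to $3 - 4e^{\beta}/(1 + e^{\beta})$, so the root is $\log 3$: the treated group has three events against one in the control group.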

#### 5.1.4 Marginal regression models

Instead of the conditional regression model, we may consider a marginal model where the event history, $N_H(t)$, is not included as part of the conditional statistics:

$$\lambda(t \mid Z(t)) = \lambda_0(t) \exp\{Z(t)\beta\}.$$

The marginal model is generally ideal for identifying treatment effects and risk factors, but the estimation procedure of Lin, Wei, Yang and Ying (LWYY) depends heavily on the independent censoring assumption. The LWYY estimates could be very biased when the follow-up is terminated for reasons associated with the recurrent events, such as informative drop-out or death. Statistical inferences can be found in the articles of Pepe and Cai (1993, JASA) and Lin et al. (2000, JRSS-B).

#### 5.1.5 Semi-parametric latent variable models

With the intention of dealing with censoring due to death or informative drop-out, Wang et al. (2001, JASA) proposed a semi-parametric latent variable model for time-to-events data:

$$\lambda(t \mid Z, X) = Z\, \lambda_0(t) \exp\{X\beta\}.$$

The model allows for informative censoring through the use of the latent variable $Z$. The model implies the marginal rate model

$$\lambda(t \mid X = x) = \tilde\lambda_0(t) \exp\{x\beta\},$$

where $\tilde\lambda_0(t) = E[Z]\, \lambda_0(t)$. The model has the feature of treating both the censoring and the latent variable distributions as nonparametric components. The approach avoids modeling and estimating these nonparametric components by using suitable conditional likelihood techniques. As a related work, a joint model for recurrent events and a failure time was proposed and studied by Huang and Wang (2004, JASA).
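The step from the latent variable model to the marginal rate model can be spelled out as follows. This short derivation is an added sketch: it assumes that the mean of the latent variable does not depend on the covariates, $E[Z \mid X = x] = E[Z]$, and writes $\tilde\lambda_0(t) = E[Z]\,\lambda_0(t)$ for the marginal baseline rate.

```latex
\begin{align*}
\lambda(t \mid X = x)
  &= E\{\lambda(t \mid Z, X) \mid X = x\} \\
  &= E[Z \mid X = x]\,\lambda_0(t)\exp\{x\beta\} \\
  &= E[Z]\,\lambda_0(t)\exp\{x\beta\}
   = \tilde\lambda_0(t)\exp\{x\beta\}.
\end{align*}
```

The regression parameter $\beta$ therefore keeps the same interpretation in the latent variable model and in the marginal rate model; only the baseline rate is rescaled.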

### 5.2 Time-between-events models

Suppose the outcome measure of interest is the time between successive events (gap time). When the time between events is the variable of interest, the occurrence of each recurrent event is considered as the time origin for the occurrence of the next event. Recurrence times can be considered as a type of correlated failure time data in survival analysis. This type of correlated data is, however, different from the correlated data collected from families (e.g., twin data or sibling data) because of the ordered nature of recurrent events.

#### 5.2.1 Specific features of data

**Informative $m$.** For typical multivariate survival data such as family data, cluster size is usually assumed to be uncorrelated with the failure times of a cluster. For recurrence time data, the number of recurrent events, $m$, is typically correlated with the recurrence times in follow-up studies: a large $m$ is likely to imply shorter times, and vice versa. In some applications, $m$ is even used as the outcome measurement for analysis; e.g., in a Poisson model, $m$ is the Poisson count variable.

**Induced informative censoring.** Induced informative censoring is a special feature of ordered events. When the observation of the recurrent event process is censored at $C$, the censoring time for the gap time $Y_j$ is $\max\{C - \sum_{k=1}^{j-1} Y_k,\; 0\}$, for each $j = 2, 3, \ldots$. Because $\sum_{k=1}^{j-1} Y_k$ is correlated with $Y_j$ for $j \ge 2$, recurrence times of order greater than one are observed subject to informative censoring even if the censoring time $C$ is independent of $N(\cdot)$.
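Induced informative censoring can be seen in a small simulation. The sketch below is illustrative only and rests on assumptions that are not in the lecture: a two-point subject-specific frailty $Z$, exponential gap times with rate $Z$, and a fixed censoring time $C = 3$. It estimates the correlation between the second gap time and its own censoring time $\max\{C - Y_1, 0\}$, with and without frailty.

```python
import random


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


def second_gap_vs_its_censoring(frailty_values, c, n, rng):
    """Simulate (Y_2, max(C - Y_1, 0)) pairs and return their correlation.

    Hypothetical setup: Z is drawn uniformly from frailty_values and, given
    Z = z, the gaps Y_1 and Y_2 are independent Exponential(rate z).
    """
    second_gaps, censoring_for_second = [], []
    for _ in range(n):
        z = rng.choice(frailty_values)
        y1 = rng.expovariate(z)
        y2 = rng.expovariate(z)
        second_gaps.append(y2)
        censoring_for_second.append(max(c - y1, 0.0))
    return correlation(second_gaps, censoring_for_second)


rng = random.Random(1)
corr_frailty = second_gap_vs_its_censoring([0.5, 2.0], c=3.0, n=50000, rng=rng)
corr_no_frailty = second_gap_vs_its_censoring([1.0], c=3.0, n=50000, rng=rng)
print(corr_frailty, corr_no_frailty)
```

With frailty the correlation is clearly negative even though $C$ is a constant; without frailty (independent gaps) it is close to zero.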

**Intercepted sampling.** Intercepted sampling is a well-known probability feature of renewal processes. It is a specific feature of recurrence time data because the sampling scheme used to observe recurrence times in longitudinal studies is similar to the intercepted sampling of renewal processes. For simplicity, assume the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independent and identically distributed (iid). Let $f$, $S$ and $\mu$ represent the density function, survival function and mean of $Y_j$. Let $T = C - T_m$ and $R = T_{m+1} - C$ be the so-called backward and forward recurrence times, where $T_m$ and $T_{m+1}$ are the times of the last event before $C$ and the first event after $C$. When the censoring time, $C$, is sufficiently large so that an equilibrium condition is reached, the joint density of $(T, R)$ can be derived as

$$p_{T,R}(t, r) = f(t + r)\, I(t \ge 0, r \ge 0) / \mu. \tag{5}$$

The marginal density functions of $Y_{m+1}$, $T$ and $R$ can be derived, based on (5), as

$$p_{Y_{m+1}}(y) = y f(y)\, I(y \ge 0) / \mu, \tag{6}$$

$$p_T(t) = S(t)\, I(t \ge 0) / \mu, \tag{7}$$

$$p_R(r) = S(r)\, I(r \ge 0) / \mu. \tag{8}$$

The distribution of $Y_{m+1}$ is referred to as the length-biased distribution. In most longitudinal studies, however, the censoring time is not very large and therefore the equilibrium condition is not satisfied. In these cases, although the above distributional results do not hold, the bias from $Y_{m+1}$ is still significant and one should be careful when conducting statistical analysis. In general, because of these specific data features, standard statistical methods in survival analysis may or may not be appropriate for recurrence time data.
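The length bias of the intercepting gap is easy to see numerically. The sketch below is an illustration under assumed settings (iid Exponential(1) gaps, a renewal at time 0, censoring time $C = 10$), not part of the lecture. Under the length-biased density $y f(y)/\mu$ the mean of $Y_{m+1}$ is $E[Y^2]/\mu = 2$, twice the mean of an ordinary gap.

```python
import random


def intercepting_gap(c, rng):
    """Return the gap covering time c in an iid Exponential(1) renewal process."""
    t = 0.0
    while True:
        y = rng.expovariate(1.0)
        if t + y > c:
            return y  # this gap straddles the censoring time c
        t += y


rng = random.Random(2)
c = 10.0  # assumed large enough to be close to equilibrium
covering = [intercepting_gap(c, rng) for _ in range(20000)]
ordinary = [rng.expovariate(1.0) for _ in range(20000)]

mean_covering = sum(covering) / len(covering)
mean_ordinary = sum(ordinary) / len(ordinary)
print(mean_covering, mean_ordinary)  # roughly 2 versus roughly 1
```

Treating the intercepting gap as if it were an ordinary gap would therefore overstate the typical recurrence time.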

#### 5.2.2 Transitional probability model

Let $f_j(y \mid y_{i1}, \ldots, y_{i,j-1})$ denote the pdf of $Y_{ij}$ conditional on $(Y_{i1}, \ldots, Y_{i,j-1}) = (y_{i1}, \ldots, y_{i,j-1})$, and let $S_j$ denote the corresponding conditional survival function. Suppose the censoring time $C_i$ is independent of the recurrent event process $N_i(\cdot)$. The likelihood function is

$$L \propto \prod_{i=1}^{n} \left\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid y_{i1}, \ldots, y_{i,j-1}) \right\} S_{m_i+1}(y^+_{i,m_i+1} \mid y_{i1}, \ldots, y_{i,m_i}),$$

where $y^+_{i,m_i+1}$ is the censored last gap time. A transitional probability model can be constructed by placing distributional assumptions on the conditional pdf $f_j(y \mid y_{i1}, \ldots, y_{i,j-1})$. In applications, when a transitional probability model is used, it is frequently accompanied by a further first-order (or second-order) Markovian assumption; in the first-order case, the conditional pdf of $Y_{ij}$ depends on $(Y_{i1}, \ldots, Y_{i,j-1})$ only through $Y_{i,j-1}$.

In a regression setting, when covariates $x_i$ are present, we assume that, conditional on $x_i$, the censoring time $C_i$ is independent of $N_i(\cdot)$. The likelihood function is modified as

$$L \propto \prod_{i=1}^{n} \left\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid x_i, y_{i1}, \ldots, y_{i,j-1}) \right\} S_{m_i+1}(y^+_{i,m_i+1} \mid x_i, y_{i1}, \ldots, y_{i,m_i}).$$

#### 5.2.3 Parametric frailty model

Frailty models are basically random-effects or latent-variable models, where the frailty is used to characterize a subject. Assume the following conditions:

(i) Conditional on a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independent.

(ii) (Independent censoring) $C$ and $(N(\cdot), Z)$ are independent.

(iii) (Distributional assumption) Conditional on $Z = z$, $Y_j$ is distributed with pdf $f_j(y \mid z; \theta)$, $\theta \in \Theta$. The latent variable $Z$ is distributed with pdf $h(z; \gamma)$, $\gamma \in \Gamma$.

With Assumptions (i), (ii) and (iii), the likelihood function from the data can be formulated as

$$L \propto \prod_{i=1}^{n} \int \left\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid z_i; \theta) \right\} S_{m_i+1}(y^+_{i,m_i+1} \mid z_i; \theta)\, h(z_i; \gamma)\, dz_i.$$

The likelihood function is then maximized to derive the estimates (MLEs) of $\theta$ and $\gamma$. Large sample distributions of the MLEs can be derived based on normal approximation.
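For some parametric choices the integral over the frailty has a closed form. The sketch below is an added illustration with assumed distributions that are not specified in the lecture: given $Z = z$ the gaps are Exponential with rate $z\theta$, and $Z$ is gamma distributed with mean 1 and variance `gamma_var`. It evaluates one subject's likelihood contribution by numerical integration and compares it with the closed form; all function names are hypothetical.

```python
import math


def gamma_pdf(z, shape, scale):
    """Gamma density h(z) with the given shape and scale."""
    return z ** (shape - 1) * math.exp(-z / scale) / (math.gamma(shape) * scale ** shape)


def subject_likelihood_numeric(gaps, censored_gap, theta, gamma_var,
                               grid=20000, upper=40.0):
    """Integrate the conditional likelihood against h(z) by the midpoint rule."""
    shape, scale = 1.0 / gamma_var, gamma_var
    h = upper / grid
    total = 0.0
    for k in range(grid):
        z = (k + 0.5) * h
        # Survival term for the censored last gap, S(y+ | z).
        cond = math.exp(-z * theta * censored_gap)
        # Density terms for the complete gaps, f(y | z).
        for y in gaps:
            cond *= z * theta * math.exp(-z * theta * y)
        total += cond * gamma_pdf(z, shape, scale) * h
    return total


def subject_likelihood_closed(gaps, censored_gap, theta, gamma_var):
    """Closed form of the same integral for the gamma frailty."""
    shape, scale = 1.0 / gamma_var, gamma_var
    m = len(gaps)
    exposure = sum(gaps) + censored_gap
    log_l = (m * math.log(theta) + math.lgamma(m + shape) - math.lgamma(shape)
             - shape * math.log(scale)
             - (m + shape) * math.log(theta * exposure + 1.0 / scale))
    return math.exp(log_l)


# Toy subject, made up for illustration: two complete gaps and a censored gap.
numeric = subject_likelihood_numeric([0.8, 1.5], 0.7, theta=1.2, gamma_var=0.5)
closed = subject_likelihood_closed([0.8, 1.5], 0.7, theta=1.2, gamma_var=0.5)
print(numeric, closed)
```

The full likelihood is the product of such contributions over subjects, which can then be maximized over $(\theta, \gamma)$ with a general-purpose optimizer.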

In a regression setting, when covariates $x$ are present, Assumptions (i) to (iii) can be modified as follows:

(i) Conditional on $x$ and a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independently distributed.

(ii) (Independent censoring) Conditional on $x$, $C$ and $(N(\cdot), Z)$ are independent.

(iii) (Distributional assumption) Conditional on $x$ and $Z = z$, $Y_j$ is distributed with pdf $f_j(y \mid x, z; \theta)$, $\theta \in \Theta$. The latent variable $Z$ is distributed with pdf $h(z; \gamma)$, $\gamma \in \Gamma$.

With the modified assumptions, the likelihood function is expressed as

$$L \propto \prod_{i=1}^{n} \int \left\{ \prod_{j=1}^{m_i} f_j(y_{ij} \mid x_i, z_i; \theta) \right\} S_{m_i+1}(y^+_{i,m_i+1} \mid x_i, z_i; \theta)\, h(z_i; \gamma)\, dz_i.$$

It is, however, generally difficult to compute the MLE. In the literature, EM algorithms and other computational algorithms have been developed to resolve the problem.

### Appendix (optional reading)

#### A.1 Nonparametric estimation of the survival function

Recurrence times can be treated as a type of correlated survival data in statistical analysis. However, because of the ordered nature of recurrence times, statistical methods which are appropriate for clustered survival data may not be applicable to recurrence time data. In many medical papers, recurrence time data are analyzed by inappropriate methods, as indicated by Aalen and Husebye (1991). Specifically, for estimating the marginal survival function, the Kaplan-Meier estimator derived from the pooled data is frequently used for exploratory analysis, although the estimator is generally inappropriate for such analyses.

Suppose recurrent events are of the same type and consider the problem of how to estimate the marginal survival function from univariate recurrence time data. Assume the following conditions are satisfied.

(i) (Conditional iid assumption) Conditional on a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independently and identically distributed.

(ii) (Independent censoring) $C$ and $(N(\cdot), Z)$ are independent.

Define the univariate recurrence survival function of $Y_j$ as

$$S(y) \equiv \Pr(Y_j > y) = \int S(y \mid z)\, dH(z),$$

where $S(y \mid z)$ is the conditional survival function of $Y_j$ given $Z = z$, and $H$ is the distribution function of $Z$.

Under (i) and (ii), let $\bar S = 1 - S$. The nonparametric likelihood function can be formulated as

$$\prod_{i=1}^{n} \int \left[ \prod_{j=1}^{m_i} d\bar S(u_{ij} \mid z_i) \right] S(u^+_{i,m_i+1} \mid z_i)\, dH(z_i).$$

Conceptually, the likelihood function involves both infinite-dimensional parameters (the conditional cdfs $\bar S(\cdot \mid z_i)$) and a mixing distribution ($H$). With infinite-dimensional parameters, the maximization of the likelihood function could be problematic, and therefore it is not used as the tool for finding an estimator of $S$. Instead of the nonparametric likelihood approach, Wang and Chang (1999, JASA) proposed a class of nonparametric estimators for estimating $S(y)$:

Define the observed recurrence times as

$$u^*_{ij} = \begin{cases} u_{ij} & \text{if } j = 1, \ldots, m_i, \\ u^+_{i,m_i+1} & \text{if } j = m_i + 1, \end{cases}$$

and define

$$m_i^* = \begin{cases} m_i & \text{if } m_i = 0, \\ m_i - 1 & \text{if } m_i \ge 1. \end{cases}$$

Let $w_i = w(c_i)$, where $w(\cdot)$ is a positive-valued function. The total mass of the risk set at $y$ is calculated as

$$R^*(y) = \sum_{i=1}^{n} \left[ \frac{w_i}{m_i^* + 1} \sum_{j=1}^{m_i^* + 1} I(u^*_{ij} \ge y) \right],$$

and the mass evaluated at $y$ is

$$d^*(y) = \sum_{i=1}^{n} \left[ \frac{w_i\, I(m_i \ge 1)}{m_i^* + 1} \sum_{j=1}^{m_i^* + 1} I(u^*_{ij} = y) \right].$$

Let $u_{(1)}, u_{(2)}, \ldots, u_{(K)}$ be the ordered and distinct uncensored times. The estimator takes the product-limit form

$$\hat S_n(y) = \prod_{u_{(i)} \le y} \left\{ 1 - \frac{d^*(u_{(i)})}{R^*(u_{(i)})} \right\},$$

which is non-increasing in $y$ and satisfies $0 \le \hat S_n(y) \le 1$. Further, this estimator also possesses proper large sample properties.
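The product-limit estimator above can be sketched in a few lines. This is a hedged illustration rather than the authors' code. It reads the definitions as follows: a subject with no complete gap contributes its single censored gap to the risk set only, while a subject with at least one complete gap contributes its complete gaps, each with mass $w_i$ divided by the number of gaps used. The function name, the default weights $w_i = 1$ and the toy data are assumptions.

```python
def wang_chang_survival(complete_gaps, censored_gaps, weights=None):
    """Product-limit estimate of the marginal gap-time survival function.

    complete_gaps[i] : the m_i uncensored gap times of subject i
    censored_gaps[i] : the censored last gap of subject i
    weights[i]       : w_i = w(c_i); defaults to 1 for every subject
    Returns (time, survival) pairs at the distinct uncensored times.
    """
    n = len(complete_gaps)
    if weights is None:
        weights = [1.0] * n
    # Times used per subject: the censored gap if there is no complete gap,
    # otherwise only the complete gaps (m_i^* + 1 times in both cases).
    used = [g if g else [c] for g, c in zip(complete_gaps, censored_gaps)]
    has_event = [bool(g) for g in complete_gaps]  # indicator of m_i >= 1

    def risk_mass(y):
        return sum(w * sum(u >= y for u in us) / len(us)
                   for w, us in zip(weights, used))

    def event_mass(y):
        return sum(w * sum(u == y for u in us) / len(us)
                   for w, us, e in zip(weights, used, has_event) if e)

    surv, curve = 1.0, []
    for y in sorted({u for g in complete_gaps for u in g}):
        surv *= 1.0 - event_mass(y) / risk_mass(y)
        curve.append((y, surv))
    return curve


# Toy data, made up for illustration: three subjects.
complete = [[2.0, 4.0], [], [3.0]]
censored = [1.0, 5.0, 2.5]
curve = wang_chang_survival(complete, censored)
print(curve)
```

For the toy data the risk masses at the uncensored times 2, 3 and 4 are 3, 2.5 and 1.5, and the event masses are 1/2, 1 and 1/2, giving survival estimates 5/6, 1/2 and 1/3.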

#### A.2 Semiparametric regression models

**Conditional proportional hazards model.** We now return to the general case in which recurrent events may or may not be of the same type. Prentice, Williams and Peterson (1981, Biometrika) modeled time-between-events data by a conditional proportional hazards model (the PWP model), as an extension of the usual proportional hazards model for univariate failure time data:

$$\lambda(t \mid N(t^-) = j - 1, N_H(t), X_H(t)) = \lambda_{0j}(t - t_{j-1}) \exp\{Z(t)\gamma_j\}, \tag{9}$$

for $t \ge t_{j-1}$. In the model,

- $N_H(t) = \{N(u) : 0 \le u \le t\}$ is the event history up to $t$,
- $X_H(t) = \{X(u) : 0 \le u \le t\}$ is the covariate history up to $t$,
- $\lambda_{0j}(\cdot)$ is the baseline hazard function,
- $\gamma_j$ is the regression parameter for the $j$th recurrence time.

The possibly time-dependent covariate history up to $t$ is denoted by $X_H(t)$. As an important requirement, the event history $N_H(t)$ must be part of the given knowledge (conditional statistics) in the PWP model. The time-dependent covariate vector $Z(t) = \phi(X_H(t), N_H(t))$ is a transformation of $(X_H(t), N_H(t))$. This model serves as a proper model for predicting future events given subject-specific covariates and event history information. However, since the event history is part of the conditional statistics in the model, the PWP model does not serve as an appropriate model for identifying treatment or prevention effects.

The PWP model has been further extended to include both globally defined parameters $\beta$ and episode-specific parameters $\gamma_j$ (Chang and Wang, 1999, JASA):

$$\lambda(t \mid N(t^-) = j - 1, N_H(t), X_H(t)) = \lambda_{0j}(t - t_{j-1}) \exp\{Z(t)\gamma_j + W(t)\beta\}, \tag{10}$$

for $t \ge t_{j-1}$, where $Z(t)$ and $W(t)$ are functions of $(X_H(t), N_H(t))$.

**Marginal regression models.** In contrast with conditional regression models, marginal regression models do not include the event history $N_H(t)$ as part of the covariates and therefore serve as appropriate models for identifying treatment effects or population-based risk factors. Without conditioning on the event history, only limited techniques have been developed for the analysis of marginal regression models. The exceptions include Huang's accelerated failure time model (Y. Huang, 2000, JASA):

$$\log Y_j = \alpha_j + x_j\beta_j + \epsilon_j, \quad j = 1, 2, \ldots$$

Further exceptions are Lin, Wei and Robins' bivariate accelerated failure time model (1998, Biometrika):

$$\log Y_1 = \alpha_1 + x_1\beta_1 + \epsilon_1, \qquad \log Y_2 = \alpha_2 + x_2\beta_2 + \epsilon_2,$$

and Huang and Chen's proportional hazards model for $Y_j$ (2002, LIDA):

$$\lambda(y \mid x) = \lambda_0(y) \exp\{x\beta\},$$

where $x$ denotes the baseline covariates and $\lambda_0$ is the baseline hazard function shared by all the episodes. Note that the first two models only partially depend on $N(t)$, and the third model is essentially a renewal model.

**Trend models.** In many applications the distributional pattern of recurrence times can be used as an index for the progression of a disease. Such a distributional pattern is important for understanding the natural history of a disease or for confirming a long-term treatment effect. Assume that

(i) within each subject, the recurrence times $Y_1, Y_2, \ldots$ are independently distributed with the survival functions $S_0, S_1, S_2, \ldots$, and

(ii) within each subject, the censoring time $C$ is independent of $N(\cdot)$.

Assumption (i) can be viewed as a frailty condition where the conditional independence of recurrence times holds within each subject. Assumption (ii) implies that, within a subject, the censoring mechanism is uninformative for the probability structure of the event process. In applications, one might be interested in testing, under (i), the null hypothesis that the duration distributions of the different episodes $Y_1, Y_2, \ldots$ remain the same, in order to confirm the stability of the pattern of recurrence times or to identify the treatment efficacy over time; see Wang and Chen (2001, Biometrics) for nonparametric and semiparametric approaches to the problem.

### Lecture 3. Truncation, length-bias and prevalence sampling

Lecture 3. Truncation, length-bias and prevalence sampling 3.1 Prevalent sampling Statistical techniques for truncated data have been integrated into survival analysis in last two decades. Truncation in

### STAT331. Cox s Proportional Hazards Model

STAT331 Cox s Proportional Hazards Model In this unit we introduce Cox s proportional hazards (Cox s PH) model, give a heuristic development of the partial likelihood function, and discuss adaptations

### Survival Analysis. Lu Tian and Richard Olshen Stanford University

1 Survival Analysis Lu Tian and Richard Olshen Stanford University 2 Survival Time/ Failure Time/Event Time We will introduce various statistical methods for analyzing survival outcomes What is the survival

### Frailty Modeling for clustered survival data: a simulation study

Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ - IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ - IHEC University

### A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

### Multivariate Survival Data With Censoring.

1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine Huber-Carol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11-220, 1 Baruch way, 10010 NY.

### SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000)

SAMPLE SIZE ESTIMATION FOR SURVIVAL OUTCOMES IN CLUSTER-RANDOMIZED STUDIES WITH SMALL CLUSTER SIZES BIOMETRICS (JUNE 2000) AMITA K. MANATUNGA THE ROLLINS SCHOOL OF PUBLIC HEALTH OF EMORY UNIVERSITY SHANDE

### Multistate Modeling and Applications

Multistate Modeling and Applications Yang Yang Department of Statistics University of Michigan, Ann Arbor IBM Research Graduate Student Workshop: Statistics for a Smarter Planet Yang Yang (UM, Ann Arbor)

### Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

### Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival data

Biometrika (28), 95, 4,pp. 947 96 C 28 Biometrika Trust Printed in Great Britain doi: 1.193/biomet/asn49 Semiparametric maximum likelihood estimation in normal transformation models for bivariate survival

### Survival Distributions, Hazard Functions, Cumulative Hazards

BIO 244: Unit 1 Survival Distributions, Hazard Functions, Cumulative Hazards 1.1 Definitions: The goals of this unit are to introduce notation, discuss ways of probabilistically describing the distribution

### CTDL-Positive Stable Frailty Model

CTDL-Positive Stable Frailty Model M. Blagojevic 1, G. MacKenzie 2 1 Department of Mathematics, Keele University, Staffordshire ST5 5BG,UK and 2 Centre of Biostatistics, University of Limerick, Ireland

### Attributable Risk Function in the Proportional Hazards Model

UW Biostatistics Working Paper Series 5-31-2005 Attributable Risk Function in the Proportional Hazards Model Ying Qing Chen Fred Hutchinson Cancer Research Center, yqchen@u.washington.edu Chengcheng Hu

### Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources

Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources Yi-Hau Chen Institute of Statistical Science, Academia Sinica Joint with Nilanjan

### Logistic regression model for survival time analysis using time-varying coefficients

Logistic regression model for survival time analysis using time-varying coefficients Accepted in American Journal of Mathematical and Management Sciences, 2016 Kenichi SATOH ksatoh@hiroshima-u.ac.jp Research

### The Accelerated Failure Time Model Under Biased. Sampling

The Accelerated Failure Time Model Under Biased Sampling Micha Mandel and Ya akov Ritov Department of Statistics, The Hebrew University of Jerusalem, Israel July 13, 2009 Abstract Chen (2009, Biometrics)

### Estimation of Conditional Kendall s Tau for Bivariate Interval Censored Data

Communications for Statistical Applications and Methods 2015, Vol. 22, No. 6, 599 604 DOI: http://dx.doi.org/10.5351/csam.2015.22.6.599 Print ISSN 2287-7843 / Online ISSN 2383-4757 Estimation of Conditional

### Stat 642, Lecture notes for 04/12/05 96

Stat 642, Lecture notes for 04/12/05 96 Hosmer-Lemeshow Statistic The Hosmer-Lemeshow Statistic is another measure of lack of fit. Hosmer and Lemeshow recommend partitioning the observations into 10 equal

### STAT 331. Accelerated Failure Time Models. Previously, we have focused on multiplicative intensity models, where

STAT 331 Accelerated Failure Time Models Previously, we have focused on multiplicative intensity models, where h t z) = h 0 t) g z). These can also be expressed as H t z) = H 0 t) g z) or S t z) = e Ht

### Tied survival times; estimation of survival probabilities

Tied survival times; estimation of survival probabilities Patrick Breheny November 5 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Tied survival times Introduction Breslow approximation

### Multilevel Statistical Models: 3 rd edition, 2003 Contents

Multilevel Statistical Models: 3 rd edition, 2003 Contents Preface Acknowledgements Notation Two and three level models. A general classification notation and diagram Glossary Chapter 1 An introduction

### Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs. Christopher Jennison

Monitoring clinical trial outcomes with delayed response: incorporating pipeline data in group sequential designs Christopher Jennison Department of Mathematical Sciences, University of Bath http://people.bath.ac.uk/mascj

### From semi- to non-parametric inference in general time scale models

From semi- to non-parametric inference in general time scale models Thierry DUCHESNE duchesne@matulavalca Département de mathématiques et de statistique Université Laval Québec, Québec, Canada Research

### Nonparametric rank based estimation of bivariate densities given censored data conditional on marginal probabilities

Hutson Journal of Statistical Distributions and Applications (26 3:9 DOI.86/s4488-6-47-y RESEARCH Open Access Nonparametric rank based estimation of bivariate densities given censored data conditional

### Modeling and Analysis of Recurrent Event Data

Edsel A. Peña (pena@stat.sc.edu) Department of Statistics University of South Carolina Columbia, SC 29208 New Jersey Institute of Technology Conference May 20, 2012 Historical Perspective: Random Censorship

### Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models

Introduction to Empirical Processes and Semiparametric Inference Lecture 25: Semiparametric Models Michael R. Kosorok, Ph.D. Professor and Chair of Biostatistics Professor of Statistics and Operations

### BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY

BIAS OF MAXIMUM-LIKELIHOOD ESTIMATES IN LOGISTIC AND COX REGRESSION MODELS: A COMPARATIVE SIMULATION STUDY Ingo Langner 1, Ralf Bender 2, Rebecca Lenz-Tönjes 1, Helmut Küchenhoff 2, Maria Blettner 2 1

### Continuous case Discrete case General case. Hazard functions. Patrick Breheny. August 27. Patrick Breheny Survival Data Analysis (BIOS 7210) 1/21

Hazard functions Patrick Breheny August 27 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/21 Introduction Continuous case Let T be a nonnegative random variable representing the time to an event

### Log-linearity for Cox s regression model. Thesis for the Degree Master of Science

Log-linearity for Cox s regression model Thesis for the Degree Master of Science Zaki Amini Master s Thesis, Spring 2015 i Abstract Cox s regression model is one of the most applied methods in medical

### Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

### AFT Models and Empirical Likelihood

AFT Models and Empirical Likelihood Mai Zhou Department of Statistics, University of Kentucky Collaborators: Gang Li (UCLA); A. Bathke; M. Kim (Kentucky) Accelerated Failure Time (AFT) models: Y = log(t

### A Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes

A Generalized Global Rank Test for Multiple, Possibly Censored, Outcomes Ritesh Ramchandani Harvard School of Public Health August 5, 2014 Ritesh Ramchandani (HSPH) Global Rank Test for Multiple Outcomes

### Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

### Recurrent Event Data: Models, Analysis, Efficiency

Recurrent Event Data: Models, Analysis, Efficiency Edsel A. Peña Department of Statistics University of South Carolina Columbia, SC 29208 Talk at Brown University March 10, 2008 Recurrent Event Data:Models,

### Reliability Engineering I

Happiness is taking the reliability final exam. Reliability Engineering I ENM/MSC 565 Review for the Final Exam Vital Statistics What R&M concepts covered in the course When Monday April 29 from 4:30 6:00

### Integrated likelihoods in survival models for highlystratified

Working Paper Series, N. 1, January 2014 Integrated likelihoods in survival models for highlystratified censored data Giuliana Cortese Department of Statistical Sciences University of Padua Italy Nicola

### A Regression Model for the Copula Graphic Estimator

Discussion Papers in Economics Discussion Paper No. 11/04 A Regression Model for the Copula Graphic Estimator S.M.S. Lo and R.A. Wilke April 2011 2011 DP 11/04 A Regression Model for the Copula Graphic

### Müller: Goodness-of-fit criteria for survival data

Müller: Goodness-of-fit criteria for survival data Sonderforschungsbereich 386, Paper 382 (2004) Online unter: http://epub.ub.uni-muenchen.de/ Projektpartner Goodness of fit criteria for survival data

### Rank Regression Analysis of Multivariate Failure Time Data Based on Marginal Linear Models

doi: 10.1111/j.1467-9469.2005.00487.x Published by Blackwell Publishing Ltd, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA Vol 33: 1-23, 2006 Rank Regression Analysis

### Moger, TA; Haugen, M; Yip, BHK; Gjessing, HK; Borgan, Ø. Citation Lifetime Data Analysis, 2010, v. 17, n. 3, p

Title A hierarchical frailty model applied to two-generation melanoma data Author(s) Moger, TA; Haugen, M; Yip, BHK; Gjessing, HK; Borgan, Ø Citation Lifetime Data Analysis, 2010, v. 17, n. 3, p. 445-460

### Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary

### Research Projects. Hanxiang Peng. March 4, Department of Mathematical Sciences Indiana University-Purdue University at Indianapolis

Hanxiang Department of Mathematical Sciences Indiana University-Purdue University at Indianapolis March 4, 2009 Outline Project I: Free Knot Spline Cox Model Project I: Free Knot Spline Cox Model Consider

### ST495: Survival Analysis: Maximum likelihood

ST495: Survival Analysis: Maximum likelihood Eric B. Laber Department of Statistics, North Carolina State University February 11, 2014 Everything is deception: seeking the minimum of illusion, keeping

### Typical Survival Data Arising From a Clinical Trial. Censoring. The Survivor Function. Mathematical Definitions Introduction

Outline CHL 5225H Advanced Statistical Methods for Clinical Trials: Survival Analysis Prof. Kevin E. Thorpe Defining Survival Data Mathematical Definitions Non-parametric Estimates of Survival Comparing

### Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O'Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable

### Lecture 11. Interval Censored and. Discrete-Time Data. Statistics Survival Analysis. Presented March 3, 2016

Statistics 255 - Survival Analysis Presented March 3, 2016 Motivating Dan Gillen Department of Statistics University of California, Irvine 11.1 First question: Are the data truly discrete? : Number of

### Multivariate spatial modeling

Multivariate spatial modeling Point-referenced spatial data often come as multivariate measurements at each location Chapter 7: Multivariate Spatial Modeling p. 1/21 Multivariate spatial modeling Point-referenced

### Generalized Linear Models for Non-Normal Data

Generalized Linear Models for Non-Normal Data Today's Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture

### Full likelihood inferences in the Cox model: an empirical likelihood approach

Ann Inst Stat Math (2011) 63:1005-1018 DOI 10.1007/s10463-010-0272-y Full likelihood inferences in the Cox model: an empirical likelihood approach Jian-Jian Ren Mai Zhou Received: 22 September 2008 / Revised:

### Multivariate Time Series: VAR(p) Processes and Models

Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for $p > 0$, is $X_t = \phi_0 + \Phi_1 X_{t-1} + \cdots + \Phi_p X_{t-p} + A_t$, where $X_t$, $\phi_0$, and $X_{t-i}$ are $k$-vectors, $\Phi_1, \ldots, \Phi_p$ are $k \times k$ matrices, with

### Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

### Statistics 262: Intermediate Biostatistics Regression & Survival Analysis

Statistics 262: Intermediate Biostatistics Regression & Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Introduction This course is an applied course,

### asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data

asymptotic normality of nonparametric M-estimators with applications to hypothesis testing for panel count data Xingqiu Zhao and Ying Zhang The Hong Kong Polytechnic University and Indiana University Abstract:

### ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL

Statistica Sinica 18 (2008), 219-234 ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Wenbin Lu and Yu Liang North Carolina State University and SAS Institute Inc.

### MODELING THE SUBDISTRIBUTION OF A COMPETING RISK

Statistica Sinica 16 (2006), 1367-1385 MODELING THE SUBDISTRIBUTION OF A COMPETING RISK Liuquan Sun 1, Jingxia Liu 2, Jianguo Sun 3 and Mei-Jie Zhang 2 1 Chinese Academy of Sciences, 2 Medical College of

### Accelerated life testing in the presence of dependent competing causes of failure

isid/ms/2005/05 April 21, 2005 http://www.isid.ac.in/ statmath/eprints Accelerated life testing in the presence of dependent competing causes of failure Isha Dewan S. B. Kulathinal Indian Statistical Institute,

### Continuous Time Survival in Latent Variable Models

Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract

### CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity. Outline:

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity Outline: 1. NIEHS Uterine Fibroid Study Design of Study Scientific Questions Difficulties 2.

### CHAPTER 1 A MAINTENANCE MODEL FOR COMPONENTS EXPOSED TO SEVERAL FAILURE MECHANISMS AND IMPERFECT REPAIR

CHAPTER 1 A MAINTENANCE MODEL FOR COMPONENTS EXPOSED TO SEVERAL FAILURE MECHANISMS AND IMPERFECT REPAIR Helge Langseth and Bo Henry Lindqvist Department of Mathematical Sciences Norwegian University of

### FRAILTY MODELS FOR MODELLING HETEROGENEITY

FRAILTY MODELS FOR MODELLING HETEROGENEITY By ULVIYA ABDULKARIMOVA, B.Sc. A Thesis Submitted to the School of Graduate Studies in Partial Fulfillment of the Requirements for the Degree Master of Science

### Estimation with clustered censored survival data with missing covariates in the marginal Cox model

Estimation with clustered censored survival data with missing covariates in the marginal Cox model Michael Parzen 1, Stuart Lipsitz 2,Amy Herring 3, and Joseph G. Ibrahim 3 (1) Emory University (2) Harvard

### Quantile Regression Methods for Reference Growth Charts

Quantile Regression Methods for Reference Growth Charts 1 Roger Koenker University of Illinois at Urbana-Champaign ASA Workshop on Nonparametric Statistics Texas A&M, January 15, 2005 Based on joint work

### Non-parametric Tests for the Comparison of Point Processes Based on Incomplete Data

Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA Vol 28: 725-732, 2001 Non-parametric Tests for the Comparison of Point Processes Based

### Multivariate Non-Normally Distributed Random Variables

Multivariate Non-Normally Distributed Random Variables An Introduction to the Copula Approach Workgroup seminar on climate dynamics Meteorological Institute at the University of Bonn 18 January 2008, Bonn

### STATISTICAL ANALYSIS OF MULTIVARIATE INTERVAL-CENSORED FAILURE TIME DATA

STATISTICAL ANALYSIS OF MULTIVARIATE INTERVAL-CENSORED FAILURE TIME DATA A Dissertation Presented to the Faculty of the Graduate School University of Missouri-Columbia In Partial Fulfillment Of the Requirements

### Statistical inference on the penetrances of rare genetic mutations based on a case family design

Biostatistics (2010), 11, 3, pp. 519-532 doi:10.1093/biostatistics/kxq009 Advance Access publication on February 23, 2010 Statistical inference on the penetrances of rare genetic mutations based on a case

### Semiparametric Estimation for a Generalized KG Model with Recurrent Event Data

Semiparametric Estimation for a Generalized KG Model with Recurrent Event Data Edsel A. Peña (pena@stat.sc.edu) Department of Statistics University of South Carolina Columbia, SC 29208 MMR Conference Beijing,

### Harvard University. Harvard University Biostatistics Working Paper Series. Survival Analysis with Change Point Hazard Functions

Harvard University Harvard University Biostatistics Working Paper Series Year 2006 Paper 40 Survival Analysis with Change Point Hazard Functions Melody S. Goodman Yi Li Ram C. Tiwari Harvard University,

### Failure rate in the continuous sense. Figure. Exponential failure density functions [f(t)] 1

Failure rate (Updated and Adapted from Notes by Dr. A.K. Nema) Part 1: Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is

### Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk

### Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

Title stata.com stcrreg postestimation Postestimation tools for stcrreg Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see

### Chapter 5. Chapter 5 sections

1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions

### A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

### Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai

Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment

### Analysis of competing risks data and simulation of data following predefined subdistribution hazards

Analysis of competing risks data and simulation of data following predefined subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013

### Extensions of Cox Model for Non-Proportional Hazards Purpose

PhUSE 2013 Paper SP07 Extensions of Cox Model for Non-Proportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used

### Journal of Statistical Software

JSS Journal of Statistical Software January 2011, Volume 38, Issue 2. http://www.jstatsoft.org/ Analyzing Competing Risk Data Using the R timereg Package Thomas H. Scheike University of Copenhagen Mei-Jie

### Example: physical systems. If the state space. Example: speech recognition. Context can be. Example: epidemics. Suppose each infected

4. Markov Chains A discrete time process $\{X_n, n = 0, 1, 2, \ldots\}$ with discrete state space $X_n \in \{0, 1, 2, \ldots\}$ is a Markov chain if it has the Markov property: $P[X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0] = P[X$

### Semiparametric Regression Analysis of Bivariate Interval-Censored Data

University of South Carolina Scholar Commons Theses and Dissertations 12-15-2014 Semiparametric Regression Analysis of Bivariate Interval-Censored Data Naichen Wang University of South Carolina - Columbia

### Discussion of Papers on the Extensions of Propensity Score

Discussion of Papers on the Extensions of Propensity Score Kosuke Imai Princeton University August 3, 2010 Kosuke Imai (Princeton) Generalized Propensity Score 2010 JSM (Vancouver) 1 / 11 The Theme and

### Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

### ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox's regression analysis Time dependent explanatory variables

ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox's regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53

### Sample size and robust marginal methods for cluster-randomized trials with censored event times

Published in final edited form as: Statistics in Medicine (2015), 34(6): 901-923 DOI: 10.1002/sim.6395 Sample size and robust marginal methods for cluster-randomized trials with censored event times YUJIE

### Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data

International Mathematical Forum, 3, 2008, no. 33, 1643-1654 Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data A. Al-khedhairi Department of Statistics and O.R. Faculty

### BIOS 312: Precision of Statistical Inference

and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

### Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

### Constant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with Type-I Censoring

Algorithms Research 2013, 2(2): 43-49 DOI: 10.5923/j.algorithms.20130202.02 Constant Stress Partially Accelerated Life Test Design for Mustafa Kamal *, Shazia Zarrin, Arif-Ul-Islam Department of Statistics & Operations

### Efficient Estimation of Censored Linear Regression Model

Biometrika (2), xx, x, pp. 4 C 28 Biometrika Trust Printed in Great Britain Efficient Estimation

### Bayesian Methods for Highly Correlated Data. Exposures: An Application to Disinfection By-products and Spontaneous Abortion

Bayesian Methods for Highly Correlated Exposures: An Application to Disinfection By-products and Spontaneous Abortion November 8, 2007 Outline 1 Introduction

### PROBABILITY DISTRIBUTIONS

Review of PROBABILITY DISTRIBUTIONS Hideaki Shimazaki, Ph.D. http://goo.gl/visng Poisson process 1 Probability distribution Probability that a (continuous) random variable X is in (x,x+dx). ( ) P x < X

### STAT 331. Martingale Central Limit Theorem and Related Results

STAT 331 Martingale Central Limit Theorem and Related Results In this unit we discuss a version of the martingale central limit theorem, which states that under certain conditions, a sum of orthogonal

### On the generalized maximum likelihood estimator of survival function under Koziol-Green model

On the generalized maximum likelihood estimator of survival function under Koziol-Green model By: Haimeng Zhang, M. Bhaskara Rao, Rupa C. Mitra Zhang, H., Rao, M.B., and Mitra, R.C. (2006). On the generalized

### The Analysis of Interval-Censored Survival Data. From a Nonparametric Perspective to a Nonparametric Bayesian Approach

The Analysis of Interval-Censored Survival Data. From a Nonparametric Perspective to a Nonparametric Bayesian Approach M.Luz Calle i Rosingana Dissertation submitted for the degree of Doctor in Mathematics.

### Ridge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation

Patrick Breheny February 8 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Large-scale testing is, of course, a big area and we could keep talking

### Association studies and regression

Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration

### ECON 721: Lecture Notes on Duration Analysis. Petra E. Todd

ECON 721: Lecture Notes on Duration Analysis Petra E. Todd Fall, 2013 Contents 1 Two state Model, possible non-stationary 1.1 Hazard function 1.2 Examples

### Chapter 2. Data Analysis

Chapter 2 Data Analysis 2.1. Density Estimation and Survival Analysis The most straightforward application of BNP priors for statistical inference is in density estimation problems. Consider the generic

### On the relative efficiency of using summary statistics versus individual level data in meta-analysis

On the relative efficiency of using summary statistics versus individual level data in meta-analysis By D. Y. LIN and D. ZENG Department of Biostatistics, CB# 7420, University of North Carolina, Chapel