Lecture 5 Models and methods for recurrent event data


 Juniper Robbins
Recurrent and multiple events are commonly encountered in longitudinal studies. In this chapter we consider ordered recurrent and multiple events.

Recurrent events (the focus of this lecture)
- time-to-events model (point process model)
- time-between-events model (gap times model)
- e.g. repeated infections, hospitalizations, tumor occurrences

Ordered multiple events
- HIV → AIDS → death
- birth → onset age of a genetic disease → death
- disease staging I → II → III → IV

Unordered multiple events
Time-to-events and time-between-events models

Time-to-events models
- Interest focuses on the occurrence rate of recurrent events over time.
- Time is measured from the time origin to the events.
- The time origin could be a fixed calendar time, the onset of treatment, or a biological event.

Time-between-events models
- The outcome variables of interest are the gap times between events.
- Models of this type are more relevant when the cycling pattern of recurrent events is strong; for example, women's menstrual cycles.
5.1 Time-to-events models

Consider a continuous point process $N(t)$, where $N(t)$ represents the number of events occurring at or prior to $t$, $0 \le t \le \tau$.

Intensity function. The intensity function of a continuous point process in $[0, \tau]$ is conventionally defined as the occurrence rate of events given the event history,

$$\lambda(t \mid N^H(t)) = \lim_{\Delta \to 0^+} \frac{\Pr(N(t+\Delta) - N(t) > 0 \mid N^H(t))}{\Delta},$$

where $N^H(t) = \{N(u) : 0 \le u \le t\}$ represents the history of the point process before or at $t$, $t \in [0, \tau]$.
Remarks
- The intensity function uniquely determines the probability structure of the point process under regularity conditions.
- For recurrent events, the so-called conditional regression models are constructed on the basis of the intensity function.
Rate function. In contrast with the conditional interpretation of the intensity function, a rate function $\lambda(t)$, $t \in [0, \tau]$, is defined as the average number of events in unit time at $t$ for subjects in the population. More precisely,

$$\lambda(t) = \lim_{\Delta \to 0^+} \frac{\Pr(N(t+\Delta) - N(t) > 0)}{\Delta},$$

namely, the occurrence rate at $t$ unconditionally on the event history $N^H(t)$.
Remarks
- In general, a rate function by itself does not fully determine the probability structure of the point process.
- The rate function is conceptually and quantitatively different from the intensity function, and it coincides with the intensity function only when the process is memoryless.
- For recurrent events, the so-called marginal regression models are constructed on the basis of the rate function.
Define the cumulative rate function (CRF) as

$$\Lambda(t) = \int_0^t \lambda(u)\,du, \quad t \in [0, T_0].$$

The CRF $\Lambda(t)$ is also the expected number of recurrent events occurring in $[0, t]$. Note that $E[N(t)] = \Lambda(t)$; we frequently write $E[dN(t)] = \lambda(t)\,dt$.
5.1.1 Poisson process models

A Poisson process is a counting process model for multiple events occurring over a fixed time interval $[0, \tau]$, $\tau > 0$. The Poisson distribution is the probability distribution of the total number of events, $M$. The Poisson distribution is sometimes also used for modelling a count variable in other situations.
A point process is a stationary Poisson process if the following three conditions are satisfied (sketch):
1. The probability that exactly one event occurs in a small interval $[t, t+h]$ is approximately $\lambda h$, where $\lambda > 0$ is called the intensity (or rate) of events.
2. The probability that 2 or more events occur at the same time is approximately 0.
3. The numbers of events in disjoint regions are independent.

Let $\mu = \lambda\tau > 0$. The probability function of $M$ is

$$f_M(m) = \frac{e^{-\mu}\mu^m}{m!}, \quad m = 0, 1, 2, \ldots$$

A Poisson process is called a nonstationary Poisson process if the occurrence rate, $\lambda(t)$, is time dependent.
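As a quick numerical check of the Poisson probability function above, the following minimal sketch evaluates $f_M(m)$ for an illustrative rate and follow-up length (the values of `lam` and `tau` are assumptions chosen for the example, not taken from the lecture) and confirms that the probabilities sum to one and that the mean count equals $\mu = \lambda\tau$.

```python
import math

def poisson_pmf(m: int, mu: float) -> float:
    """P(M = m) for M ~ Poisson(mu): exp(-mu) * mu**m / m!."""
    return math.exp(-mu) * mu ** m / math.factorial(m)

# Illustrative (assumed) values: rate per unit time and length of follow-up.
lam, tau = 0.5, 4.0
mu = lam * tau  # expected number of events in [0, tau]

# Truncate the support at 60; the remaining tail mass is negligible for mu = 2.
pmf = [poisson_pmf(m, mu) for m in range(60)]
total = sum(pmf)                               # should be very close to 1
mean = sum(m * p for m, p in enumerate(pmf))   # should be very close to mu

print(f"sum of pmf = {total:.12f}, mean = {mean:.12f}, mu = {mu}")
```

The truncation point of 60 is arbitrary; any value far above $\mu$ gives the same result to floating-point accuracy.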
5.1.2 Nonparametric estimation of CRF

Data. Let $t_{i1} \le \cdots \le t_{i,m_i}$ be the ordered event times, with $m_i$ defined as the index of the last observed event. The observed data include $\{(m_i, c_i, t_{i1}, \ldots, t_{i,m_i}) : i = 1, \ldots, n\}$.

Population. Note that for a single event process (univariate survival time), the risk population at $t$ is composed of subjects who have not failed prior to $t$, so the risk population varies with $t$. In contrast, for a recurrent event process, the risk population at every $t$ coincides with the target population defined at 0.

Risk set. Let $C_i$ represent the terminating time (censoring time) for observing $N(t)$. The risk set at $t$ is defined as $\{i : C_i \ge t\}$, which includes the subjects who are under observation at $t$. Define $R_i(t) = I(C_i \ge t)$ as the risk-set indicator, and $R(t) = \sum_{i=1}^n R_i(t)$.

Independent censoring. If $C_i$ is independent of $N_i(\cdot)$, the risk set forms a random sample from the risk population at $t$.
Under the independent censoring assumption, for $t > 0$ and $\Delta$ positive-valued but small, a crude estimate of the occurrence probability in $(t - \Delta, t]$ can be constructed as

$$\lambda(t)\Delta \approx \frac{\sum_{i=1}^n \sum_{j=1}^{m_i} I(t_{ij} \in (t - \Delta, t])}{R(t)}, \quad (1)$$

with $I(\cdot)$ representing the indicator function. The estimate is essentially an empirical measure with time-dependent sample size $R(t)$. A nonparametric estimate of the CRF corresponding to (1) can then be constructed as

$$\hat\Lambda(t) = \sum_{i=1}^n \sum_{j=1}^{m_i} \frac{I(t_{ij} \le t)}{R(t_{ij})}. \quad (2)$$

Nelson (1988, JQT; 1995, Technometrics)
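The CRF estimate (2) is a step function that jumps by $1/R(t_{ij})$ at each observed event time. The sketch below is one way to compute it, assuming a subject counts as at risk at $t$ when $C_i \ge t$; the function name `crf_estimate` and the toy data are hypothetical and only meant to make the formula concrete.

```python
from typing import List, Sequence, Tuple

def crf_estimate(event_times: Sequence[Sequence[float]],
                 censor_times: Sequence[float]) -> List[Tuple[float, float]]:
    """Return [(t, Lambda_hat(t))] at each distinct observed event time.

    event_times[i] holds subject i's event times (all <= censor_times[i]).
    """
    # Pool the event times of all subjects and process them in time order.
    pooled = sorted(t for times in event_times for t in times)
    steps: List[Tuple[float, float]] = []
    cum = 0.0
    for t in pooled:
        at_risk = sum(1 for c in censor_times if c >= t)  # R(t)
        cum += 1.0 / at_risk                              # jump of size 1/R(t)
        if steps and steps[-1][0] == t:
            steps[-1] = (t, cum)   # tied event times share one step
        else:
            steps.append((t, cum))
    return steps

# Toy data (assumed): three subjects, the third has no observed events.
events = [[1.0, 3.0], [2.0], []]
censor = [4.0, 2.5, 1.5]
for t, value in crf_estimate(events, censor):
    print(f"t = {t:.1f}  Lambda_hat = {value:.4f}")
```

In the toy data the risk set shrinks from 3 to 2 to 1 subjects, so the successive jumps are 1/3, 1/2 and 1.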
5.1.3 Conditional Regression Models

Andersen and Gill (1982, AS) proposed a time-to-events model (the AG model) which extended Cox's proportional hazards model from single event data to recurrent event data. Suppose the dates of recurrent events are recorded on a continuous scale (e.g., by days or weeks), and the outcome measures of interest are recurrent events occurring in the time interval $[0, \tau]$, where the constant $\tau > 0$ is determined with the knowledge that recurrent events could potentially be observed up to $\tau$, say 3 years. Let $N^H(t)$ be the recurrent event history and $Z^H(t)$ the possibly time-dependent covariate history prior to $t$. For $t \in [0, \tau]$, the AG model assumes that events occur over time with the occurrence rate

$$\lambda(t \mid N^H(t), Z^H(t)) = \lambda_0(t)\exp\{X(t)\beta\}, \quad (3)$$

where $X(t) = \phi(N^H(t), Z^H(t))$ is a transformation of $(N^H(t), Z^H(t))$.
Pros and cons of the conditional regression model

(i) The AG model can be thought of as a prediction model, since the event history is included as part of the conditioning information in the rate function.

(ii) Use of the AG model to identify treatment effects is subject to constraints, since the model identifies treatment effects adjusted for subject-specific event history. In general, the AG model is not ideal for identifying treatment effects or population risks.

(iii) If the AG model uses a time-independent covariate, $X(t) = X$, the model is then required to be memoryless. For example, two subjects with the same $X$ but different event histories would be predicted to have the same occurrence rate of events. Thus, if $X$ is a treatment indicator, two patients who receive the same treatment but have different hospitalization records would have the same level of risk for rehospitalization according to the AG model.
Statistical methods for the conditional regression model

AG extended the partial likelihood methods from univariate survival data to recurrent event data. The partial likelihood score function for $\beta$ can be derived as

$$U(\beta, t) = \sum_{i=1}^n \int_0^t \{X_i(u) - \bar X(\beta, u)\}\,dN_i(u), \quad (4)$$

where

$$\bar X(\beta, t) = \frac{\sum_{i=1}^n R_i(t) X_i(t)\exp\{X_i(t)\beta\}}{\sum_{i=1}^n R_i(t)\exp\{X_i(t)\beta\}}.$$

Martingale theory was also developed to establish the large sample properties (as an extension of martingale theory for univariate survival analysis).
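To make the score function (4) concrete, the following minimal sketch evaluates it for the special case of a scalar, time-independent covariate, where the integral reduces to a sum over observed event times. The function name `ag_score`, the at-risk convention $C_k \ge t$, and the toy data are assumptions made for illustration; this is not a full fitting routine.

```python
import math
from typing import Sequence

def ag_score(beta: float,
             x: Sequence[float],
             event_times: Sequence[Sequence[float]],
             censor_times: Sequence[float]) -> float:
    """Score U(beta, tau) for a scalar, time-independent covariate x_i.

    Each event of subject i at time t contributes x_i - Xbar(beta, t),
    where Xbar is the exp(x*beta)-weighted mean of x over the risk set.
    """
    score = 0.0
    for xi, times in zip(x, event_times):
        for t in times:
            num = den = 0.0
            for xk, ck in zip(x, censor_times):
                if ck >= t:                  # subject k is at risk at t
                    w = math.exp(xk * beta)
                    num += xk * w
                    den += w
            score += xi - num / den
    return score

# Toy data (assumed): two treated (x = 1) and two control (x = 0) subjects,
# all followed to time 3; treated subjects have 3 events, controls have 1.
x = [1.0, 0.0, 1.0, 0.0]
events = [[1.0, 2.0], [1.5], [0.5], []]
censor = [3.0, 3.0, 3.0, 3.0]

print(ag_score(0.0, x, events, censor))          # positive: beta = 0 is too small
print(ag_score(math.log(3.0), x, events, censor))  # approximately zero
```

With equal follow-up in both groups, the score vanishes at $\beta = \log 3$, the log of the ratio of event counts (3 versus 1), which matches the intuition that $\exp(\beta)$ is a rate ratio.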
5.1.4 Marginal Regression Models

Instead of the conditional regression model, we may consider a marginal model in which the event history, $N^H(t)$, is not included as part of the conditioning information:

$$\lambda(t \mid Z(t)) = \lambda_0(t)\exp\{Z(t)\beta\}.$$

The marginal model is generally ideal for identifying treatment effects and risk factors, but the estimation procedure of Lin et al. (LWYY) depends heavily on the independent censoring assumption. The LWYY estimates could be very biased when the follow-up is terminated for reasons associated with the recurrent events, such as informative dropout or death.

Statistical inference can be found in the articles of Pepe and Cai (1993, JASA) and Lin et al. (2000, JRSSB).
5.1.5 Semiparametric latent variable models

With the intention of dealing with censoring due to death or informative dropout, Wang et al. (2001, JASA) proposed a semiparametric latent variable model for time-to-events data:

$$\lambda(t \mid Z, X) = Z\,\lambda_0(t)\exp\{X\beta\}.$$

The model allows for informative censoring through the use of a latent variable $Z$. The model implies the marginal rate model

$$\lambda(t \mid X = x) = \lambda_0^*(t)\exp\{x\beta\},$$

where $\lambda_0^*(t) = E[Z]\,\lambda_0(t)$. The model has the feature of treating both the censoring and latent variable distributions as nonparametric components. The approach avoids modeling and estimating these nonparametric components by using proper conditional likelihood techniques. As related work, a joint model for recurrent events and a failure time was proposed and studied by Huang and Wang (2004, JASA).
5.2 Time-between-events models

Suppose the outcome measure of interest is the time between successive events (gap time). When time-between-events is the variable of interest, the occurrence of each recurrent event is considered as the time origin for the occurrence of the next event. Recurrence times could be considered as a type of correlated failure time data in survival analysis. This type of correlated data is, however, different from the correlated data collected from families (e.g., twin data or sibling data) because of the ordered nature of recurrent events.
5.2.1 Specific features of data

Informative $m$. For typical multivariate survival data such as family data, cluster size is usually assumed to be uncorrelated with the failure times of a cluster. For recurrence time data, the number of recurrent events, $m$, is typically correlated with recurrence times in follow-up studies: a large $m$ is likely to imply shorter recurrence times and vice versa. In some applications, $m$ is even used as the outcome measurement for analysis; e.g., in a Poisson model, $m$ is the Poisson count variable.

Induced informative censoring. Induced informative censoring is a special feature of ordered events. When the observation of the recurrent event process is censored at $C$, the censoring time for $Y_j$ is $\max\{C - \sum_{k=1}^{j-1} Y_k, 0\}$, for each $j = 2, 3, \ldots$. Because $\sum_{k=1}^{j-1} Y_k$ is correlated with $Y_j$ for $j \ge 2$, recurrence times of order greater than one are observed subject to informative censoring even if the censoring time $C$ is independent of $N(\cdot)$.
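Induced informative censoring can be illustrated with a small simulation. The sketch below assumes, purely for illustration, a two-point subject-specific frailty $Z \in \{0.5, 2\}$, exponential gap times with rate $Z$, and a fixed censoring time $C = 3$ that is trivially independent of the event process. It then estimates the correlation between the second gap time $Y_2$ and its induced censoring time $\max\{C - Y_1, 0\}$, which is expected to be clearly negative.

```python
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(2024)   # fixed seed so the run is reproducible
C = 3.0                     # censoring time, independent of the event process

second_gap, induced_censor = [], []
for _ in range(20000):
    z = rng.choice([0.5, 2.0])       # assumed two-point frailty
    y1 = rng.expovariate(z)          # first gap time, rate z
    y2 = rng.expovariate(z)          # second gap time, same subject
    second_gap.append(y2)
    induced_censor.append(max(C - y1, 0.0))  # censoring time for Y_2

corr = pearson(second_gap, induced_censor)
print(f"corr(Y_2, induced censoring time) = {corr:.3f}")
```

Subjects with a small frailty tend to have long first and second gaps, so a long $Y_2$ goes together with a short remaining observation window; this is the dependence that standard methods assuming independent censoring ignore.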
Intercepted sampling. Intercepted sampling is a well-known probability feature of renewal processes. It is a specific feature of recurrence time data because the sampling scheme used to observe recurrence times in longitudinal studies is similar to the intercepted sampling of renewal processes. For simplicity, assume the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independent and identically distributed (iid). Let $f$, $S$ and $\mu$ represent the density function, survival function and mean of $Y_j$. Let $T = C - T_m$ and $R = T_{m+1} - C$ be the so-called backward and forward recurrence times. When the censoring time, $C$, is sufficiently large so that an equilibrium condition is reached, the joint density of $(T, R)$ can be derived as

$$p_{T,R}(t, r) = f(t + r)\,I(t \ge 0, r \ge 0)/\mu. \quad (5)$$
The marginal density functions of $Y_{m+1}$, $T$ and $R$ can be derived, based on (5), as

$$p_{Y_{m+1}}(y) = y f(y)\,I(y \ge 0)/\mu, \quad (6)$$
$$p_T(t) = S(t)\,I(t \ge 0)/\mu, \quad (7)$$
$$p_R(r) = S(r)\,I(r \ge 0)/\mu. \quad (8)$$

The distribution of $Y_{m+1}$ is referred to as the length-biased distribution. In most longitudinal studies, however, the censoring time is not very large and therefore the equilibrium condition is not satisfied. In these cases, although the above distributional results do not hold, the bias from $Y_{m+1}$ is still significant and one should be careful when conducting statistical analysis.

In general, because of these specific data features, standard statistical methods in survival analysis may or may not be appropriate for recurrence time data.
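A worked numerical example may help with (6) and (7). The sketch below assumes exponential gap times with mean $\mu = 2$ (an illustrative choice, not from the lecture) and integrates the densities with a simple trapezoidal rule. Under this assumption the length-biased density (6) has mean $2\mu$, twice the mean of an ordinary gap time, while the backward recurrence time (7) again has mean $\mu$ because the exponential distribution is memoryless.

```python
import math

def integrate(g, a: float, b: float, n: int = 100_000) -> float:
    """Composite trapezoidal rule for g on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for k in range(1, n):
        total += g(a + k * h)
    return total * h

mu = 2.0  # assumed mean gap time


def f(y: float) -> float:
    """Density of an exponential gap time with mean mu."""
    return math.exp(-y / mu) / mu


def S(y: float) -> float:
    """Survival function of the same exponential gap time."""
    return math.exp(-y / mu)


def p_len(y: float) -> float:
    """Length-biased density (6): y f(y) / mu."""
    return y * f(y) / mu


def p_back(t: float) -> float:
    """Backward recurrence time density (7): S(t) / mu."""
    return S(t) / mu


upper = 80.0  # effectively infinity for mu = 2
mass_len = integrate(p_len, 0.0, upper)                       # total mass of (6)
mean_len = integrate(lambda y: y * p_len(y), 0.0, upper)      # mean of Y_{m+1}
mean_back = integrate(lambda t: t * p_back(t), 0.0, upper)    # mean of T

print(f"mass of (6) = {mass_len:.4f}")
print(f"mean of length-biased gap = {mean_len:.4f} (2 * mu = {2 * mu})")
print(f"mean of backward recurrence time = {mean_back:.4f} (mu = {mu})")
```

The doubling of the mean is specific to the exponential case; in general the mean of the length-biased distribution is $E[Y^2]/\mu$.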
5.2.2 Transitional Probability Model

Let $f_j(y \mid y_{i1}, \ldots, y_{i,j-1})$ denote the pdf of $Y_{ij}$ conditional on $(Y_{i1}, \ldots, Y_{i,j-1}) = (y_{i1}, \ldots, y_{i,j-1})$. Suppose the censoring time $C_i$ is independent of the recurrent event process $N_i(\cdot)$. Note that the likelihood function is

$$L \propto \prod_{i=1}^n \Big\{\prod_{j=1}^{m_i} f_j(y_{ij} \mid y_{i1}, \ldots, y_{i,j-1})\Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid y_{i1}, \ldots, y_{i,m_i}).$$

A transitional probability model can be constructed by placing distributional assumptions on the conditional probability $f_j(y \mid y_{i1}, \ldots, y_{i,j-1})$. In applications, when a transitional probability model is used, it is frequently accompanied by a further first-order (or second-order) Markovian assumption; in the first-order case, the conditional pdf of $Y_{ij}$ depends on $(Y_{i1}, \ldots, Y_{i,j-1})$ only through $Y_{i,j-1}$.
In a regression setting, when covariates $x_i$ are present, we assume that, conditional on $x_i$, the censoring time $C_i$ is independent of $N_i(\cdot)$. The likelihood function is modified as

$$L \propto \prod_{i=1}^n \Big\{\prod_{j=1}^{m_i} f_j(y_{ij} \mid x_i, y_{i1}, \ldots, y_{i,j-1})\Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid x_i, y_{i1}, \ldots, y_{i,m_i}).$$
5.2.3 Parametric Frailty Model

Frailty models are basically random-effects or latent-variable models, where the frailty is used to characterize a subject. Assume the following conditions:

(i) Conditional on a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independent.

(ii) (Independent censoring) $C$ and $(N(\cdot), Z)$ are independent.

(iii) (Distributional assumption) Conditional on $Z = z$, $Y_j$ is distributed with pdf $f_j(y \mid z; \theta)$, $\theta \in \Theta$. The latent variable $Z$ is distributed with pdf $h(z; \gamma)$, $\gamma \in \Gamma$.
With Assumptions (i), (ii) and (iii), the likelihood function from the data can be formulated as

$$L \propto \prod_{i=1}^n \int \Big\{\prod_{j=1}^{m_i} f_j(y_{ij} \mid z_i; \theta)\Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid z_i; \theta)\, h(z_i; \gamma)\,dz_i.$$

The likelihood function is then maximized to derive estimates (MLEs) of $\theta$ and $\gamma$. Large sample distributions of the MLEs can be derived based on normal approximation.
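For some parametric choices the integral over the frailty has a closed form. As a hedged illustration, assume that conditional on $Z = z$ the gap times are exponential with rate $z$, and that $Z$ follows a gamma distribution with shape $a$ and rate $b$ (both are assumptions for this example; the lecture does not fix a distribution). A subject with $m$ complete gaps and total observed time $s = \sum_j y_j + y^+_{m+1}$ then contributes $\int z^m e^{-zs} h(z)\,dz = b^a\,\Gamma(a+m) / \{\Gamma(a)\,(b+s)^{a+m}\}$. The sketch below computes this contribution and checks it against direct numerical integration.

```python
import math
from typing import Sequence

def subject_loglik(gaps: Sequence[float], censored_gap: float,
                   a: float, b: float) -> float:
    """Closed-form log-likelihood contribution of one subject.

    Assumes Y_j | Z = z ~ Exponential(rate z) and Z ~ Gamma(shape a, rate b).
    """
    m = len(gaps)
    total = sum(gaps) + censored_gap
    return (a * math.log(b) + math.lgamma(a + m) - math.lgamma(a)
            - (a + m) * math.log(b + total))

def subject_lik_numeric(gaps: Sequence[float], censored_gap: float,
                        a: float, b: float,
                        upper: float = 20.0, n: int = 20000) -> float:
    """Same contribution by trapezoidal integration over z (needs a >= 1)."""
    m = len(gaps)
    total = sum(gaps) + censored_gap

    def integrand(z: float) -> float:
        conditional = z ** m * math.exp(-z * total)   # prod f_j times S_{m+1}
        frailty = b ** a * z ** (a - 1) * math.exp(-b * z) / math.gamma(a)
        return conditional * frailty

    h = upper / n
    s = 0.5 * (integrand(0.0) + integrand(upper))
    s += sum(integrand(k * h) for k in range(1, n))
    return s * h

# Toy subject (assumed): two complete gaps and a censored third gap.
gaps, censored_gap = [0.5, 1.0], 0.7
closed = math.exp(subject_loglik(gaps, censored_gap, a=2.0, b=2.0))
numeric = subject_lik_numeric(gaps, censored_gap, a=2.0, b=2.0)
print(f"closed form = {closed:.8f}, numerical = {numeric:.8f}")
```

Summing `subject_loglik` over subjects gives the log-likelihood to be maximized over $(a, b)$ under these assumptions; with less convenient distributions the integral generally has to be evaluated numerically.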
In a regression setting, when covariates $x$ are present, Assumptions (i) to (iii) can be modified as

(i) Conditional on $x$ and a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independently distributed.

(ii) (Independent censoring) Conditional on $x$, $C$ and $(N(\cdot), Z)$ are independent.

(iii) (Distributional assumption) Conditional on $x$ and $Z = z$, $Y_j$ is distributed with pdf $f_j(y \mid x, z; \theta)$, $\theta \in \Theta$. The latent variable $Z$ is distributed with pdf $h(z; \gamma)$, $\gamma \in \Gamma$.
With the modified assumptions, the likelihood function is expressed as

$$L \propto \prod_{i=1}^n \int \Big\{\prod_{j=1}^{m_i} f_j(y_{ij} \mid x_i, z_i; \theta)\Big\}\, S_{m_i+1}(y^+_{i,m_i+1} \mid x_i, z_i; \theta)\, h(z_i; \gamma)\,dz_i.$$

It is, however, generally difficult to compute the MLE. In the literature, EM algorithms and other computational algorithms have been developed to resolve the problem.
Appendix (optional reading)

A.1 Nonparametric estimation of the survival function

Recurrence times can be treated as a type of correlated survival data in statistical analysis. However, because of the ordered nature of recurrence times, statistical methods which are appropriate for clustered survival data may not be applicable to recurrence time data. In many medical papers, recurrence time data are analyzed by inappropriate methods, as indicated by Aalen and Husebye (1991). Specifically, for estimating the marginal survival function, the Kaplan-Meier estimator derived from the pooled data is frequently used for exploratory analysis although the estimator is generally inappropriate for such analyses.

Suppose recurrent events are of the same type and consider the problem of how to estimate the marginal survival function from univariate recurrence time data. Assume the following conditions are satisfied.
(i) (Conditional iid assumption) Conditional on a subject-specific latent variable $Z = z$, the recurrence times $\{Y_j : j = 1, 2, \ldots\}$ are independently and identically distributed.

(ii) (Independent censoring) $C$ and $(N(\cdot), Z)$ are independent.

Define the univariate recurrence survival function of $Y_j$ as

$$S(y) \equiv \Pr(Y_j > y) = \int S(y \mid z)\,dH(z),$$

where $S(y \mid z)$ is the conditional survival function of $Y_j$ given $Z = z$, and $H$ is the distribution function of $Z$.
Under (i) and (ii), let $\bar S = 1 - S$. The nonparametric likelihood function can be formulated as

$$\prod_{i=1}^n \int \Big[\prod_{j=1}^{m_i} d\bar S(u_{ij} \mid z_i)\Big]\, S(u^+_{i,m_i+1} \mid z_i)\,dH(z_i).$$

Conceptually, the likelihood function involves both infinitely many parameters (the conditional cdfs $\bar S(\cdot \mid z_i)$) and a mixing distribution ($H$). With infinitely many parameters, the maximization of the likelihood function could be problematic, and therefore it is not used as the tool for finding an estimator of $S$. Instead of the nonparametric likelihood approach, Wang and Chang (1999, JASA) proposed a class of nonparametric estimators for estimating $S(y)$:
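Define the observed recurrence times as

$$u_{ij}^* = \begin{cases} u_{ij} & \text{if } j = 1, \ldots, m_i \\ u^+_{i,m_i+1} & \text{if } j = m_i + 1 \end{cases}$$

Define

$$m_i^* = \begin{cases} m_i & \text{if } m_i = 0 \\ m_i - 1 & \text{if } m_i \ge 1 \end{cases}$$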
Let $w_i = w(c_i)$, where $w(\cdot)$ is a positive-valued function. The total mass of the risk set at $y$ is calculated as

$$R^*(y) = \sum_{i=1}^n \Big[\frac{w_i}{m_i^* + 1} \sum_{j=1}^{m_i^*+1} I(u_{ij}^* \ge y)\Big],$$

and the mass evaluated at $y$ is

$$d^*(y) = \sum_{i=1}^n \Big[\frac{w_i\, I(m_i \ge 1)}{m_i^* + 1} \sum_{j=1}^{m_i^*+1} I(u_{ij}^* = y)\Big].$$

Let $u^*_{(1)}, u^*_{(2)}, \ldots, u^*_{(K)}$ be the ordered and distinct uncensored times. The estimator takes the product-limit expression

$$\hat S_n(y) = \prod_{u^*_{(i)} \le y} \Big\{1 - \frac{d^*(u^*_{(i)})}{R^*(u^*_{(i)})}\Big\},$$

which is nonincreasing in $y$ and satisfies $0 \le \hat S_n(y) \le 1$. Further, this estimator also possesses proper large sample properties.
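The product-limit estimator above can be sketched in a few lines. A subject with $m_i \ge 1$ contributes its $m_i$ complete gap times, each with mass $w_i/m_i$, while a subject with $m_i = 0$ contributes only its censored first gap with mass $w_i$. The function name `wang_chang`, the default weight $w \equiv 1$, and the toy data are assumptions for illustration.

```python
from typing import List, Optional, Sequence, Tuple

def wang_chang(gaps: Sequence[Sequence[float]],
               censored_gaps: Sequence[float],
               weights: Optional[Sequence[float]] = None
               ) -> List[Tuple[float, float]]:
    """Return [(u, S_hat(u))] at each distinct uncensored gap time u.

    gaps[i]          : complete gap times of subject i (length m_i)
    censored_gaps[i] : the censored (m_i + 1)th gap of subject i
    weights[i]       : w_i = w(c_i); defaults to 1 for every subject
    """
    n = len(gaps)
    if weights is None:
        weights = [1.0] * n

    # One (time, mass, is_uncensored) record per u*_ij, j = 1..m*_i + 1.
    records = []
    for y, y_plus, w in zip(gaps, censored_gaps, weights):
        if len(y) == 0:
            records.append((y_plus, w, False))     # m_i = 0: censored first gap
        else:
            for u in y:                            # m_i >= 1: complete gaps only
                records.append((u, w / len(y), True))

    surv = 1.0
    out: List[Tuple[float, float]] = []
    for u in sorted({t for t, _, uncensored in records if uncensored}):
        risk_mass = sum(mass for t, mass, _ in records if t >= u)           # R*(u)
        event_mass = sum(mass for t, mass, unc in records if unc and t == u)  # d*(u)
        surv *= 1.0 - event_mass / risk_mass
        out.append((u, surv))
    return out

# Toy data (assumed): subject A has two complete gaps, B none, C one.
gaps = [[2.0, 4.0], [], [1.0]]
censored_gaps = [1.0, 3.0, 0.5]
for u, s in wang_chang(gaps, censored_gaps):
    print(f"u = {u:.1f}  S_hat = {s:.4f}")
```

For subjects with at least one complete gap, the censored last gap is deliberately dropped; this is what removes the bias from the induced informative censoring and length-biased last gap.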
A.2 Semiparametric Regression Models

Conditional proportional hazards model. We now return to the general case in which recurrent events may or may not be of the same type. Prentice, Williams and Peterson (1981, Biometrika) modeled time-between-events data by a conditional proportional hazards model (the PWP model) as an extension of the usual proportional hazards model for univariate failure time data:

$$\lambda(t \mid N(t^-) = j - 1, N^H(t), X^H(t)) = \lambda_{0j}(t - t_{j-1})\exp\{Z(t)\gamma_j\}, \quad (9)$$

for $t \ge t_{j-1}$. In the model,
- $N^H(t) = \{N(u) : 0 \le u \le t\}$ is the event history up to $t$
- $X^H(t) = \{X(u) : 0 \le u \le t\}$ is the covariate history up to $t$
- $\lambda_{0j}(\cdot)$ is the baseline hazard function
- $\gamma_j$ is the regression parameter for the $j$th recurrence time
The possibly time-dependent covariate history up to $t$ is denoted by $X^H(t)$. As an important requirement, the event history $N^H(t)$ must be part of the given knowledge (conditioning information) in the PWP model. The time-dependent covariate vector $Z(t) = \phi(X^H(t), N^H(t))$ is a transformation of $(X^H(t), N^H(t))$. This model serves as a proper model for predicting future events given subject-specific covariates and event history information. However, since event history is part of the conditioning information in the model, the PWP model does not serve as an appropriate model for identifying treatment or prevention effects.

The PWP model has been further extended to include both globally defined parameters $\beta$ and episode-specific parameters $\gamma_j$ (Chang and Wang, 1999, JASA):

$$\lambda(t \mid N(t^-) = j - 1, N^H(t), X^H(t)) = \lambda_{0j}(t - t_{j-1})\exp\{Z(t)\gamma_j + W(t)\beta\}, \quad (10)$$

for $t \ge t_{j-1}$, where $Z(t)$ and $W(t)$ are functions of $(X^H(t), N^H(t))$.
Marginal regression models. In contrast with conditional regression models, marginal regression models do not include the event history $N^H(t)$ as part of the covariates and therefore serve as appropriate models for identifying treatment effects or population-based risk factors. Without conditioning on event history, only limited techniques have been developed for the analysis of marginal regression models. The exceptions are:

- Huang's accelerated failure time model (Y. Huang, 2000, JASA): $\log Y_j = \alpha_j + x_j\beta_j + \epsilon_j$, $j = 1, 2, \ldots$
- Lin, Wei and Robins' bivariate accelerated failure time model (1998, Biometrika): $\log Y_1 = \alpha_1 + x_1\beta_1 + \epsilon_1$, $\log Y_2 = \alpha_2 + x_2\beta_2 + \epsilon_2$
- Huang and Chen's proportional hazards model for $Y_j$ (2002, LIDA): $\lambda(y \mid x) = \lambda_0(y)\exp\{x\beta\}$, where $x$ is the vector of baseline covariates and $\lambda_0$ is the baseline hazard function shared by all the episodes.

Note that the first two models only partially depend on $N(t)$, and the third model is essentially a renewal model.
Trend models. In many applications the distributional pattern of recurrence times can be used as an index for the progression of a disease. Such a distributional pattern is important for understanding the natural history of a disease or for confirming a long-term treatment effect. Assume

(i) within each subject, the recurrence times $Y_1, Y_2, \ldots$ are independently distributed with the survival functions $S_0, S_1, S_2, \ldots$, and

(ii) within each subject, the censoring time $C$ is independent of $N(\cdot)$.
Assumption (i) can be viewed as a frailty condition where the conditional independence of recurrence times holds within each subject. Assumption (ii) implies that, within each subject, the censoring mechanism is uninformative for the probability structure of the event process. In applications, one might be interested in testing the null hypothesis (that is, (i)) that the duration distributions of the different episodes $Y_1, Y_2, \ldots$ remain the same, in order to confirm the stability of the pattern of recurrence times or to identify treatment efficacy over time; see Wang and Chen (2001, Biometrics) for nonparametric and semiparametric approaches to the problem.
More informationCausal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD
Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification Todd MacKenzie, PhD Collaborators A. James O Malley Tor Tosteson Therese Stukel 2 Overview 1. Instrumental variable
More informationLecture 11. Interval Censored and. DiscreteTime Data. Statistics Survival Analysis. Presented March 3, 2016
Statistics 255  Survival Analysis Presented March 3, 2016 Motivating Dan Gillen Department of Statistics University of California, Irvine 11.1 First question: Are the data truly discrete? : Number of
More informationMultivariate spatial modeling
Multivariate spatial modeling Pointreferenced spatial data often come as multivariate measurements at each location Chapter 7: Multivariate Spatial Modeling p. 1/21 Multivariate spatial modeling Pointreferenced
More informationGeneralized Linear Models for NonNormal Data
Generalized Linear Models for NonNormal Data Today s Class: 3 parts of a generalized model Models for binary outcomes Complications for generalized multivariate or multilevel models SPLH 861: Lecture
More informationFull likelihood inferences in the Cox model: an empirical likelihood approach
Ann Inst Stat Math 2011) 63:1005 1018 DOI 10.1007/s104630100272y Full likelihood inferences in the Cox model: an empirical likelihood approach JianJian Ren Mai Zhou Received: 22 September 2008 / Revised:
More informationMultivariate Time Series: VAR(p) Processes and Models
Multivariate Time Series: VAR(p) Processes and Models A VAR(p) model, for p > 0 is X t = φ 0 + Φ 1 X t 1 + + Φ p X t p + A t, where X t, φ 0, and X t i are kvectors, Φ 1,..., Φ p are k k matrices, with
More informationProbability and Probability Distributions. Dr. Mohammed Alahmed
Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about
More informationStatistics 262: Intermediate Biostatistics Regression & Survival Analysis
Statistics 262: Intermediate Biostatistics Regression & Survival Analysis Jonathan Taylor & Kristin Cobb Statistics 262: Intermediate Biostatistics p.1/?? Introduction This course is an applied course,
More informationasymptotic normality of nonparametric Mestimators with applications to hypothesis testing for panel count data
asymptotic normality of nonparametric Mestimators with applications to hypothesis testing for panel count data Xingqiu Zhao and Ying Zhang The Hong Kong Polytechnic University and Indiana University Abstract:
More informationANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL
Statistica Sinica 18(28, 219234 ANALYSIS OF COMPETING RISKS DATA WITH MISSING CAUSE OF FAILURE UNDER ADDITIVE HAZARDS MODEL Wenbin Lu and Yu Liang North Carolina State University and SAS Institute Inc.
More informationMODELING THE SUBDISTRIBUTION OF A COMPETING RISK
Statistica Sinica 16(26), 13671385 MODELING THE SUBDISTRIBUTION OF A COMPETING RISK Liuquan Sun 1, Jingxia Liu 2, Jianguo Sun 3 and MeiJie Zhang 2 1 Chinese Academy of Sciences, 2 Medical College of
More informationAccelerated life testing in the presence of dependent competing causes of failure
isid/ms/25/5 April 21, 25 http://www.isid.ac.in/ statmath/eprints Accelerated life testing in the presence of dependent competing causes of failure Isha Dewan S. B. Kulathinal Indian Statistical Institute,
More informationContinuous Time Survival in Latent Variable Models
Continuous Time Survival in Latent Variable Models Tihomir Asparouhov 1, Katherine Masyn 2, Bengt Muthen 3 Muthen & Muthen 1 University of California, Davis 2 University of California, Los Angeles 3 Abstract
More informationCASE STUDY: Bayesian Incidence Analyses from CrossSectional Data with Multiple Markers of Disease Severity. Outline:
CASE STUDY: Bayesian Incidence Analyses from CrossSectional Data with Multiple Markers of Disease Severity Outline: 1. NIEHS Uterine Fibroid Study Design of Study Scientific Questions Difficulties 2.
More informationCHAPTER 1 A MAINTENANCE MODEL FOR COMPONENTS EXPOSED TO SEVERAL FAILURE MECHANISMS AND IMPERFECT REPAIR
CHAPTER 1 A MAINTENANCE MODEL FOR COMPONENTS EXPOSED TO SEVERAL FAILURE MECHANISMS AND IMPERFECT REPAIR Helge Langseth and Bo Henry Lindqvist Department of Mathematical Sciences Norwegian University of
More informationFRAILTY MODELS FOR MODELLING HETEROGENEITY
FRAILTY MODELS FOR MODELLING HETEROGENEITY By ULVIYA ABDULKARIMOVA, B.Sc. A Thesis Submitted to the School of Graduate Studies in Partial Fulfillment of the Requirements for the Degree Master of Science
More informationEstimation with clustered censored survival data with missing covariates in the marginal Cox model
Estimation with clustered censored survival data with missing covariates in the marginal Cox model Michael Parzen 1, Stuart Lipsitz 2,Amy Herring 3, and Joseph G. Ibrahim 3 (1) Emory University (2) Harvard
More informationQuantile Regression Methods for Reference Growth Charts
Quantile Regression Methods for Reference Growth Charts 1 Roger Koenker University of Illinois at UrbanaChampaign ASA Workshop on Nonparametric Statistics Texas A&M, January 15, 2005 Based on joint work
More informationNonparametric Tests for the Comparison of Point Processes Based on Incomplete Data
Published by Blackwell Publishers Ltd, 108 Cowley Road, Oxford OX4 1JF, UK and 350 Main Street, Malden, MA 02148, USA Vol 28: 725±732, 2001 Nonparametric Tests for the Comparison of Point Processes Based
More informationMultivariate NonNormally Distributed Random Variables
Multivariate NonNormally Distributed Random Variables An Introduction to the Copula Approach Workgroup seminar on climate dynamics Meteorological Institute at the University of Bonn 18 January 2008, Bonn
More informationSTATISTICAL ANALYSIS OF MULTIVARIATE INTERVALCENSORED FAILURE TIME DATA
STATISTICAL ANALYSIS OF MULTIVARIATE INTERVALCENSORED FAILURE TIME DATA A Dissertation Presented to the Faculty of the Graduate School University of MissouriColumbia In Partial Fulfillment Of the Requirements
More informationStatistical inference on the penetrances of rare genetic mutations based on a case family design
Biostatistics (2010), 11, 3, pp. 519 532 doi:10.1093/biostatistics/kxq009 Advance Access publication on February 23, 2010 Statistical inference on the penetrances of rare genetic mutations based on a case
More informationSemiparametric Estimation for a Generalized KG Model with R. Model with Recurrent Event Data
Semiparametric Estimation for a Generalized KG Model with Recurrent Event Data Edsel A. Peña (pena@stat.sc.edu) Department of Statistics University of South Carolina Columbia, SC 29208 MMR Conference Beijing,
More informationHarvard University. Harvard University Biostatistics Working Paper Series. Survival Analysis with Change Point Hazard Functions
Harvard University Harvard University Biostatistics Working Paper Series Year 2006 Paper 40 Survival Analysis with Change Point Hazard Functions Melody S. Goodman Yi Li Ram C. Tiwari Harvard University,
More informationFailure rate in the continuous sense. Figure. Exponential failure density functions [f(t)] 1
Failure rate (Updated and Adapted from Notes by Dr. A.K. Nema) Part 1: Failure rate is the frequency with which an engineered system or component fails, expressed for example in failures per hour. It is
More informationFrailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.
Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Melanie M. Wall, Bradley P. Carlin November 24, 2014 Outlines of the talk
More informationDescription Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see
Title stata.com stcrreg postestimation Postestimation tools for stcrreg Description Syntax for predict Menu for predict Options for predict Remarks and examples Methods and formulas References Also see
More informationChapter 5. Chapter 5 sections
1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationA Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness
A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UTAustin SRC 2014, Galveston, TX 1 Background 2 Working model
More informationOptimal Treatment Regimes for Survival Endpoints from a Classification Perspective. Anastasios (Butch) Tsiatis and Xiaofei Bai
Optimal Treatment Regimes for Survival Endpoints from a Classification Perspective Anastasios (Butch) Tsiatis and Xiaofei Bai Department of Statistics North Carolina State University 1/35 Optimal Treatment
More informationAnalysis of competing risks data and simulation of data following predened subdistribution hazards
Analysis of competing risks data and simulation of data following predened subdistribution hazards Bernhard Haller Institut für Medizinische Statistik und Epidemiologie Technische Universität München 27.05.2013
More informationExtensions of Cox Model for NonProportional Hazards Purpose
PhUSE 2013 Paper SP07 Extensions of Cox Model for NonProportional Hazards Purpose Jadwiga Borucka, PAREXEL, Warsaw, Poland ABSTRACT Cox proportional hazard model is one of the most common methods used
More informationJournal of Statistical Software
JSS Journal of Statistical Software January 2011, Volume 38, Issue 2. http://www.jstatsoft.org/ Analyzing Competing Risk Data Using the R timereg Package Thomas H. Scheike University of Copenhagen MeiJie
More informationExample: physical systems. If the state space. Example: speech recognition. Context can be. Example: epidemics. Suppose each infected
4. Markov Chains A discrete time process {X n,n = 0,1,2,...} with discrete state space X n {0,1,2,...} is a Markov chain if it has the Markov property: P[X n+1 =j X n =i,x n 1 =i n 1,...,X 0 =i 0 ] = P[X
More informationSemiparametric Regression Analysis of Bivariate IntervalCensored Data
University of South Carolina Scholar Commons Theses and Dissertations 12152014 Semiparametric Regression Analysis of Bivariate IntervalCensored Data Naichen Wang University of South Carolina  Columbia
More informationDiscussion of Papers on the Extensions of Propensity Score
Discussion of Papers on the Extensions of Propensity Score Kosuke Imai Princeton University August 3, 2010 Kosuke Imai (Princeton) Generalized Propensity Score 2010 JSM (Vancouver) 1 / 11 The Theme and
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES. Cox s regression analysis Time dependent explanatory variables
ADVANCED STATISTICAL ANALYSIS OF EPIDEMIOLOGICAL STUDIES Cox s regression analysis Time dependent explanatory variables Henrik Ravn Bandim Health Project, Statens Serum Institut 4 November 2011 1 / 53
More informationSample size and robust marginal methods for clusterrandomized trials with censored event times
Published in final edited form as: Statistics in Medicine (2015), 34(6): 901 923 DOI: 10.1002/sim.6395 Sample size and robust marginal methods for clusterrandomized trials with censored event times YUJIE
More informationParameters Estimation for a Linear Exponential Distribution Based on Grouped Data
International Mathematical Forum, 3, 2008, no. 33, 16431654 Parameters Estimation for a Linear Exponential Distribution Based on Grouped Data A. Alkhedhairi Department of Statistics and O.R. Faculty
More informationBIOS 312: Precision of Statistical Inference
and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample
More informationIgnoring the matching variables in cohort studies  when is it valid, and why?
Ignoring the matching variables in cohort studies  when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposureoutcome association
More informationConstant Stress Partially Accelerated Life Test Design for Inverted Weibull Distribution with TypeI Censoring
Algorithms Research 013, (): 4349 DOI: 10.593/j.algorithms.01300.0 Constant Stress Partially Accelerated Life Test Design for Mustafa Kamal *, Shazia Zarrin, ArifUlIslam Department of Statistics & Operations
More informationEfficient Estimation of Censored Linear Regression Model
2 3 4 5 6 7 8 9 2 3 4 5 6 7 8 9 2 2 22 23 24 25 26 27 28 29 3 3 32 33 34 35 36 37 38 39 4 4 42 43 44 45 46 47 48 Biometrika (2), xx, x, pp. 4 C 28 Biometrika Trust Printed in Great Britain Efficient Estimation
More informationBayesian Methods for Highly Correlated Data. Exposures: An Application to Disinfection Byproducts and Spontaneous Abortion
Outline Bayesian Methods for Highly Correlated Exposures: An Application to Disinfection Byproducts and Spontaneous Abortion November 8, 2007 Outline Outline 1 Introduction Outline Outline 1 Introduction
More informationPROBABILITY DISTRIBUTIONS
Review of PROBABILITY DISTRIBUTIONS Hideaki Shimazaki, Ph.D. http://goo.gl/visng Poisson process 1 Probability distribution Probability that a (continuous) random variable X is in (x,x+dx). ( ) P x < X
More informationSTAT 331. Martingale Central Limit Theorem and Related Results
STAT 331 Martingale Central Limit Theorem and Related Results In this unit we discuss a version of the martingale central limit theorem, which states that under certain conditions, a sum of orthogonal
More informationOn the generalized maximum likelihood estimator of survival function under Koziol Green model
On the generalized maximum likelihood estimator of survival function under Koziol Green model By: Haimeng Zhang, M. Bhaskara Rao, Rupa C. Mitra Zhang, H., Rao, M.B., and Mitra, R.C. (2006). On the generalized
More informationThe Analysis of IntervalCensored Survival Data. From a Nonparametric Perspective to a Nonparametric Bayesian Approach
The Analysis of IntervalCensored Survival Data. From a Nonparametric Perspective to a Nonparametric Bayesian Approach M.Luz Calle i Rosingana Memòria presentada per a aspirar al grau de Doctor en Matemàtiques.
More informationRidge regression. Patrick Breheny. February 8. Penalized regression Ridge regression Bayesian interpretation
Patrick Breheny February 8 Patrick Breheny HighDimensional Data Analysis (BIOS 7600) 1/27 Introduction Basic idea Standardization Largescale testing is, of course, a big area and we could keep talking
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationECON 721: Lecture Notes on Duration Analysis. Petra E. Todd
ECON 721: Lecture Notes on Duration Analysis Petra E. Todd Fall, 213 2 Contents 1 Two state Model, possible nonstationary 1 1.1 Hazard function.......................... 1 1.2 Examples.............................
More informationChapter 2. Data Analysis
Chapter 2 Data Analysis 2.1. Density Estimation and Survival Analysis The most straightforward application of BNP priors for statistical inference is in density estimation problems. Consider the generic
More informationOn the relative efficiency of using summary statistics versus individual level data in metaanalysis
On the relative efficiency of using summary statistics versus individual level data in metaanalysis By D. Y. LIN and D. ZENG Department of Biostatistics, CB# 7420, University of North Carolina, Chapel
More informationCURE MODEL WITH CURRENT STATUS DATA
Statistica Sinica 19 (2009), 233249 CURE MODEL WITH CURRENT STATUS DATA Shuangge Ma Yale University Abstract: Current status data arise when only random censoring time and event status at censoring are
More informationLatent Variable Methods for the Analysis of Genomic Data
John D. Storey Center for Statistics and Machine Learning & LewisSigler Institute for Integrative Genomics Latent Variable Methods for the Analysis of Genomic Data http://genomine.org/talks/ Data m variables
More information