Regression Modeling of Time to Event Data Using the Ornstein-Uhlenbeck Process


Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Roger Alan Erich, M.S.
Graduate Program in Biostatistics
The Ohio State University
2012

Dissertation Committee:
Professor Michael L. Pennell, Advisor
Professor Thomas J. Santner
Professor Dennis K. Pearl

© Copyright by Roger Alan Erich

Abstract

In this research, we develop innovative regression models for survival analysis that model time to event data using a latent health process which stabilizes around an equilibrium point, a characteristic often observed in biological systems. Regression modeling in survival analysis is typically accomplished using Cox regression, which requires the assumption of proportional hazards. An alternative model, which does not require proportional hazards, is the First Hitting Time (FHT) model, where a subject's health is modeled using a latent stochastic process. In this modeling framework, an event occurs once the process hits a predetermined boundary. The parameters of the process are related to covariates through generalized link functions, thereby providing regression coefficients with clinically meaningful interpretations. In this dissertation, we present an FHT model based on the Ornstein-Uhlenbeck (OU) process, a modified Wiener process which drifts from the starting value of the process toward a state of equilibrium or homeostasis present in many biological applications. We extend previous OU process models to allow the process to change according to covariate values. We also discuss extensions of our methodology to include random effects accounting for unmeasured covariates. In addition, we present a mixture model with a cure rate using the OU process to model the latent health status of those subjects susceptible to experiencing the event under study. We apply these methods to survival data collected on melanoma patients and to another survival data set pertaining to carcinoma of the oropharynx.

This document is dedicated to my family and to those brave men and women of the Armed Forces who gave their lives to protect our country's freedom during the completion of this PhD.

Acknowledgments

Without the support, patience and guidance of the following people, this study would not have been completed. It is to them that I owe my deepest gratitude.

Dr. Michael Pennell, my advisor, who guided me through the entire process of this research. Without his expertise, this would not have been possible.

Dove Erich, my dear wife, without whom this effort would have been worth nothing. Your love, support, and sacrifice helped tremendously during this trying time, and I will be forever grateful.

Ashley and Ellie Erich, my girls, who have sacrificed so much time with me.

Arnold and Doris Erich, my parents, who have always believed in me.

My dissertation committee, which provided me with valuable guidance and feedback to make this research stronger and more viable.

All of the faculty and staff at The Air Force Institute of Technology who supported me through this longer than anticipated PhD program.

Dr. Bill Baker, who provided valuable mathematical insight to help me get unstuck in my research, which allowed me to complete this degree.

Last, but not least, I am ever grateful to God who makes all things possible.

Vita

June St Marys Area High School
B.S. Mathematics, Pennsylvania State University
M.S. Applied Mathematics, A.F. Inst. of Technology
2007 to Graduate Student, Department of Biostatistics, The Ohio State University

Fields of Study

Major Field: Biostatistics

Table of Contents

Abstract
Dedication
Acknowledgments
List of Figures
List of Tables

1. Introduction

2. Threshold Regression: A First Hitting Time Regression Model
    Gamma Process and Inverse Gamma First Hitting Time
    Wiener Process Models
        The Wiener Process
        The Inverse Gaussian Distribution
        The Inverse Gaussian Distribution and Survival Analysis
        Previous Work using the FHT model with an Underlying Wiener Stochastic Process
        Strengths and Limitations of Using the Wiener Process Model
    Survival Models Based on the Ornstein-Uhlenbeck Process
        First Hitting Time for the Ornstein-Uhlenbeck Process
        The Shape of the Hazard Function
        Modeling the Hazard as the Square of an Ornstein-Uhlenbeck Process
        The Ornstein-Uhlenbeck Process in Biostatistical Applications
    Operational Time

3. The Ornstein-Uhlenbeck Model With Initial State Dependent On Covariates
    Proposed OU Threshold Regression Model
    Simulation Study
    Application of OU-TR Model to Overall Survival of Patients with Carcinoma of the Oropharynx
    Discussion

4. The Ornstein-Uhlenbeck Mixture Model
    Proposed OU Threshold Regression Mixture Model
    Simulation Study
    Application of OU-TR Mixture Model to Time to Relapse Data from Patients with Melanoma
    Discussion

5. The Ornstein-Uhlenbeck Random Effects Model for Survival Data with Unmeasured Covariates
    OU-TR Random Effects Model
        Proposed Model
        Simulation Study
        Application of the OU-TR Random Effects Model to Overall Survival of Patients with Carcinoma of the Oropharynx
    OU-TR Random Effects Mixture Model
        Proposed Model
        Simulation Study
        Application of OU-TR Random Effects Mixture Model to Time to Relapse Data from Patients with Melanoma
    Discussion

6. Conclusion

Bibliography

Appendices
    A. Standard Error Derivations for OU-TR Model
    B. Standard Error Derivations for OU-TR Mixture Model
    C. Random Effects Density Function and Survival Function Derivations
    D. Simplification of the Likelihood Function Under the OU-TR Random Effects Model
    E. Standard Error Derivations for OU-TR Random Effects Model
    F. Standard Error Derivations for OU-TR Random Effects Mixture Model
    G. Newly Developed Matlab Functions for Fitting OU-TR Models to Data
        G.1 OU-TR Model
        G.2 OU-TR Mixture Model
        G.3 OU-TR Random Effects Model
        G.4 OU-TR Random Effects Mixture Model

List of Figures

2.1 Inverse Gaussian Densities with τ = 1 for Several Values of λ
2.2 Inverse Gaussian Densities with λ = 1 for Several Values of τ
Sample Paths of OU Process and Wiener Process
Hazard function of time to absorption (parameter values: a = 0, b = 1, σ² = 2)
Goodness of Fit of Best BIC Carcinoma of the Oropharynx Model
Goodness of Fit of Second Best BIC Carcinoma of the Oropharynx Model for Subjects with a Disability
Goodness of Fit of Second Best BIC Carcinoma of the Oropharynx Model for Subjects without a Disability
Comparing Goodness of Fit of Model with Interaction Between Disability Status and Tumor Size (Int) with the best BIC Main Effects Model (No Int)
Goodness of Fit of Best and Second Best Melanoma Models (in terms of BIC) for Nodal Categories 0 and
Goodness of Fit of Best and Second Best Melanoma Models (in terms of BIC) for Nodal Categories 2 and
Estimated Survival Curves for OU-TR and OU-TR Random Effects Models When Psi = 0.25 (Scenario 1 from Table 5.6)
5.2 Estimated Survival Curves for OU-TR and OU-TR Random Effects Models When Psi = 0.5 (Scenario 2 from Table 5.6)
Estimated Survival Curves for OU-TR and OU-TR Random Effects Models When Psi = 1 (Scenario 3 from Table 5.6)
Estimated Survival Curves for OU-TR and OU-TR Random Effects Models When Psi = 2 (Scenario 4 from Table 5.6)
Goodness of Fit of Best BIC Carcinoma of the Oropharynx Model (OU-TR and OU-TR Random Effects) for Subjects with Disability
Goodness of Fit of Best BIC Carcinoma of the Oropharynx Model (OU-TR and OU-TR Random Effects) for Subjects with No Disability
Comparison of Survival Estimates When Bias is Present in ψ̂
Estimated Survival Curves for OU-TR Mixture Model and OU-TR Random Effects Mixture Model When Psi = 0.25 (Scenario 1 from Table 5.13)
Estimated Survival Curves for OU-TR Mixture Model and OU-TR Random Effects Mixture Model When Psi = 0.5 (Scenario 2 from Table 5.13)
Estimated Survival Curves for OU-TR Mixture Model and OU-TR Random Effects Mixture Model When Psi = 1 (Scenario 3 from Table 5.13)
Estimated Survival Curves for OU-TR Mixture Model and OU-TR Random Effects Mixture Model When Psi = 2 (Scenario 4 from Table 5.13)
Goodness of Fit of Best BIC Melanoma Model (Mixture Model and Random Effects Mixture Model) for Nodal Categories 0 and
Goodness of Fit of Best BIC Melanoma Model (Mixture Model and Random Effects Mixture Model) for Nodal Categories 2 and

List of Tables

3.1 Results of Simulation Study Based on 1000 Data Sets of Size
Summary Statistics of Variables Considered in Modeling the Oropharynx Data
Final Model for Death from Carcinoma of the Oropharynx
Results of Mixture Model Simulation Study Based on 1000 Data Sets of Size
Summary Statistics of Variables Considered in Modeling the Melanoma Data
Stage 1 of OU-TR Mixture Model Building for Melanoma Data With Relapse as Event
Stage 2 of OU-TR Mixture Model Building for Melanoma Data With Relapse as Event
Final Model for Relapse from Melanoma
Simulation Results for OU-TR Random Effects Model Based on 1000 Data Sets of Size 200 with Psi =
Simulation Results for OU-TR Random Effects Model Based on 1000 Data Sets of Size 200 with Psi =
Simulation Results for OU-TR Random Effects Model Based on 1000 Data Sets of Size 200 with Psi =
5.4 Simulation Results for OU-TR Random Effects Model Based on 1000 Data Sets of Size 200 with Psi =
Simulation Results for OU-TR RE Model Based on 1000 Data Sets of Size 300 with Psi =
Simulation Results Examining Effect of Ignoring Random Effect (RE) in the OU-TR Model. Results are Based on 1000 Data Sets of Size
OU-TR and OU-TR Random Effects Models for Carcinoma of the Oropharynx Data
Simulation Results Based on 1000 Data Sets of Size 300 with Psi =
Simulation Results Based on 1000 Data Sets of Size 300 with Psi =
Simulation Results Based on 1000 Data Sets of Size 300 with Psi =
Simulation Results Based on 1000 Data Sets of Size 300 with Psi =
Simulation Results Based on 1000 Data Sets of Size 1000 with Varying True Values of Psi
Simulation Results Based on 1000 Data Sets of Size 300 for OU-TR Random Effects Mixture Model Comparison to OU-TR Mixture Model without Random Effects
Final OU-TR Mixture Model and OU-TR Random Effects Mixture Model for the Melanoma Data

Chapter 1: Introduction

In biostatistical research, the goal is often to determine important factors affecting a subject's survival time or time to development of disease. Numerous models are used to identify these factors. One popular choice is the Cox proportional hazards model (Cox, 1972). This model has many attractive features: no distribution needs to be assumed for the baseline hazard function, the regression parameters are interpreted in terms of relative risk, and estimation is based on the partial likelihood function. However, this model provides erroneous results if the proportional hazards assumption is violated due to time varying covariate effects. For example, the effectiveness of a drug treatment may increase or decrease over time. A doctor may prescribe an antibiotic which loses its ability to fight infection over time. Thus, another treatment may be prescribed that builds up in the system, eliminating the infection. If we look at a patient's health post-surgery, there usually is an initial increase in mortality risk immediately following surgery before a beneficial health effect is observed. Unexplained heterogeneity in a subject's risk, or frailty, may also result in non-proportional hazards (Hougaard, 1991; Keiding, 1997). To combat this phenomenon, a shared frailty model may be fit to the data (Vaupel et al., 1979). However, within each cluster of subjects with the shared frailty value, the proportional hazards assumption must still be met in order to draw sound inferences. Other methods are available to remedy problems associated with non-proportional hazards under the Cox model. They include using time dependent covariate effects in the model or simply stratifying on the covariate that is introducing the non-proportional hazards (Klein and Moeschberger, 2003).

Another useful, though less frequently used, approach for identifying important prognostic variables of survival is a First Hitting Time (FHT) model. This approach does not require the proportional hazards assumption. In an FHT model, a stochastic process represents patient health, with failure occurring once the process hits a boundary (Lee and Whitmore, 2006). For example, we may model a subject's health status using a Wiener process, resulting in an FHT with an inverse Gaussian distribution (Chhikara and Folks, 1989). Since death or disease is the outcome of a series of genetic and physiological events where a subject's health deteriorates until it reaches a boundary, the FHT model is theoretically an attractive choice (Pennell et al., 2010). Take, for instance, a subject who has been diagnosed with lung cancer. This cancer has stages 0 through 5 with higher stages indicating more extensive disease. If left untreated, subjects will transition from one stage to the next, until they ultimately die from the disease. In this context, an FHT model would be well suited to analyze time to event data and highlight important variables that have an impact on survival.

In threshold regression, covariate information is integrated into the parameters of FHT models via generalized link functions (Lee and Whitmore, 2006). For example, the initial state and variance of the Wiener process have been related to covariates using a log-link, and an identity link has been used for the drift parameter (cf. Lee et al., 2000, 2004; Aalen and Gjessing, 2001; Aalen et al., 2008). The proportional hazards assumption is avoided in these models given that the effects on the hazard vary with time.

In a 2004 paper, Aalen and Gjessing discuss survival models based on the Ornstein-Uhlenbeck (OU) process. The OU process is a modification of a Wiener process to include drift toward an equilibrium state. Many biological processes have the property, termed homeostasis, of diffusing back and forth while simultaneously tending to stabilize around a certain point. Examples of homeostatic biological processes include body temperature regulation in warm-blooded animals and blood pH regulation in the human body (Blessing, 1997). Also, the urinary system in the human body removes salt, excess ions and waste from plasma, which is vital in the homeostatic regulation of the ionic composition, volume and pH of the internal environment (Chiras, 2005).

In this dissertation, we developed new statistical methodologies for analyzing time to event data based on the OU process. We have extended previous models, based on the OU process, to incorporate available covariate information. To demonstrate the usefulness of these methodologies, we applied the methods to real data from biomedical studies and assessed model fit by comparing our estimated OU survival curve to the Kaplan-Meier curve generated from the same data.

The first data set consists of 192 subjects from a clinical trial in the treatment of carcinoma of the oropharynx found in Kalbfleisch and Prentice (1980). Patients were randomly assigned to one of two treatments, radiation therapy alone or radiation therapy in conjunction with chemotherapy. An objective of this study was to compare the two treatments with respect to patient survival. Covariates considered in this OU model approach included age, sex, treatment, patient physical condition, tumor site, tumor grade, tumor T-stage and tumor N-stage. Time until death from cancer of the oropharynx was recorded.

The second data set comes from a clinical trial which includes 713 melanoma patients who, after definitive surgery, were randomly assigned to treatment or observation groups. We applied our model to data from 315 subjects who did not receive treatment during the study in order to analyze the natural progression of the disease. Covariate information for each subject, including age, sex, treatment, nodal category, and Breslow score, was available for analysis. Time until relapse and/or death from melanoma was recorded.

We also broaden our threshold regression approach to model data using a mixture model which introduces a cure rate. Finally, we extended our approach by including subject specific random effects which account for unexplained heterogeneity in initial health status.

The remainder of this dissertation is organized as follows. First, we describe the concept of the threshold regression model and provide some examples of these models. Next, we detail two specific types of threshold regression models: the Wiener process model and the Ornstein-Uhlenbeck process model. Then, we perform a simulation study using the OU process with covariates incorporated into the initial health status (called the OU-TR model). Following this section, we apply the OU-TR model to the carcinoma of the oropharynx clinical trial data and present results. In the next chapter, we describe the use of a mixture model incorporating the OU process to model those subjects susceptible to experiencing the event under study. A simulation study is conducted using this OU-TR mixture model, and the model is applied to the melanoma study data. Following this chapter, an explanation is given for the OU process models in which random effects are incorporated to capture unexplained heterogeneity between subjects. Simulation studies of this random effect modeling method are conducted for both the OU-TR random effects model and the OU-TR random effects mixture model, and these models are applied to the carcinoma of the oropharynx and the melanoma clinical trial data, respectively. Finally, we highlight future work that may enhance the capabilities of the OU process models described in this dissertation.

Chapter 2: Threshold Regression: A First Hitting Time Regression Model

There are two basic components to the FHT model as described in Lee and Whitmore (2006). The first is a parent stochastic process $\{X_t, t \in \mathcal{T}, X_t = x \in \mathcal{X}\}$ with initial value $X_0 = x_0$, where $\mathcal{T}$ is the time space and $\mathcal{X}$ is the state space of the process. The second component consists of a boundary set or threshold $B$, where $B \subset \mathcal{X}$. $X_t$ may have many different properties such as one or more dimensions, the Markov property, a continuous or discrete state or a monotonic sample path. In the context of medical applications, $X_t$ is often latent and describes the health status of the subject. In epidemiological applications, $X_t$ frequently describes the unobservable status of the disease under investigation.

As described in Lee and Whitmore (2006), if we take $x_0$ to lie outside of $B$, the first hitting time of $B$ is the random variable $S = \inf\{t : X_t \in B\}$. Therefore, the time when the stochastic process first encounters $B$ is the first hitting time. The threshold state is the first state encountered by the process in the boundary set, $X_S \in B$. Thus, a stopping condition is defined by the boundary set. If the parent process is latent, we cannot observe the FHT event in the state space of the process directly. For example, liver transplant patients have several factors used to determine initial health status after transplant. These factors may include type of transplant, age and weight, to name a few. The boundary can be set as death due to complications from the liver transplant. Thus, the process models the decline in health from the initial point to death, when the process hits the threshold.

First hitting time (FHT) models have been applied in an array of fields such as engineering, economics, business and medicine. They have been used to model labour turnover (Whitmore, 1979), the onset time for a cancer induced by occupational exposure (Lee et al., 2004), length of a hospital stay (Eaton and Whitmore, 1977) and strike duration (Linden, 2000). What makes FHT models valuable in applications is the capability to include regression structures. This allows effects of covariates to account for natural dispersion of the data, thereby explaining variability and sharpening inferences. Regression structures also provide scientific insights into potential causal roles of covariates in the underlying processes, boundary sets and time scales (Lee and Whitmore, 2006). As described by Lee and Whitmore (2006), there are several possible choices for the stochastic process $X_t$, including a Bernoulli process, Poisson process, Markov chain, Wiener process, gamma process and an Ornstein-Uhlenbeck (OU) process.

2.1 Gamma Process and Inverse Gamma First Hitting Time

In the gamma process model, described in Lee and Whitmore (2006), the parent process is $\{X_t, t \ge 0\}$ with initial value $X_0 = x_0 > 0$ and $X_t = x_0 - G_t$, where $\{G_t, t \ge 0\}$ is a gamma process with $G_0 = 0$. The gamma process, described in Kyprianou (2006, pages 7-8), has increments $G_{t-s} = G_t - G_s$, where $0 \le s < t < \infty$, that are stationary, independent and gamma distributed with shape parameter $\alpha$ and rate (inverse scale) parameter $\beta$, which are constants. Thus the pdf of the gamma($\alpha$, $\beta$) distribution is

$$f(g_{t-s} \mid \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, g_{t-s}^{\alpha-1} e^{-\beta g_{t-s}},$$

where $g_{t-s} \in (0, \infty)$, $\alpha > 0$ and $\beta > 0$. Some authors have considered generalizations of the gamma process in which the shape or scale parameter varies monotonically with time. For instance, Kalbfleisch (1978) used a gamma process with shape parameter $\alpha(t) \propto \Lambda_0^{*}(t)$ as the prior for the cumulative baseline hazard function ($\Lambda_0(t)$) in a Bayesian analysis of the Cox model, where $\Lambda_0^{*}(t)$ is a parametric cumulative baseline hazard function representing one's best a priori guess at $\Lambda_0(t)$.

Since a gamma process has monotonic sample paths, the first hitting time of the parent process ($X_t = 0$) has an inverse gamma distribution. An advantage of this model is that computational routines for the gamma distribution are readily available. In Singpurwalla (1995), the use of this model is motivated by the fact that item wear is nondecreasing and failure of many components or systems of components is more likely due to wear than a traumatic event. In Lawless and Crowder (2004), Singpurwalla's gamma process model is extended to incorporate covariates to better explain reliability of items with certain characteristics. Random effects are also included to explain heterogeneity between these items not accounted for by the observed covariates. Lawless and Crowder (2004) set up the gamma process model by defining $G_t$ to be gamma($\alpha$, $\eta(t)$), where $\eta(t)$ is a given monotone increasing function of time. Covariates are incorporated in the gamma process by changing $\alpha$ to $\alpha(v)$, where $v$ is a vector of covariate values, which allows rescaling of $G_t$ without changing the shape parameter of its gamma distribution. To incorporate random effects into this model, further alteration of $\alpha$ is accomplished by using $r\alpha(v)$, where $r$ is the random effect. In their paper, an application is presented involving metal fatigue crack growth data with random effects specific to each unit but no covariates. However, it is suggested that $\alpha(v) = \exp(\beta v')$ be used as the regression function specification when covariates are involved. They define the monotone increasing function of time to be η(t) = β 0 (1 y β 2 0 β 1 β 2 t) β 1 2, where $y_0$ is the initial crack length and the $\beta$'s are parameters that vary randomly across units.

A possible use of the gamma process model in a biological application is given in Lee and Whitmore (2006). Here, they define the process as $X_t = x_0$ with probability $1 - p$ and $X_t = x_0 - Z_t$ with probability $p$, where $p$ is a susceptibility probability and $Z_t$ is a gamma process. For example, a patient can have a benign form of disease with probability $1 - p$ or a malignant form with probability $p$. Thus, the malignant form of the disease advances monotonically en route to death from the disease. In contrast, the gamma process model may not be a good choice in applications where health does not decline consistently over time, for example, diseases with long latency periods.

2.2 Wiener Process Models

2.2.1 The Wiener Process

We begin by defining the Wiener process (Prabhu, 1965, Section 3, p. 10), $X_t$, with drift $\mu \in (-\infty, \infty)$ and variance $\sigma^2 > 0$. The process has the following properties for any $t_1 < t_2 < t_3 < t_4$:

1. $X_t$ has independent increments; $X_{t_2} - X_{t_1}$ and $X_{t_4} - X_{t_3}$ are independent.
2. $X_{t_2} - X_{t_1}$ has a normal distribution with mean $\mu(t_2 - t_1)$ and variance $\sigma^2(t_2 - t_1)$, where $t_1 < t_2$.
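The first hitting time definition and the two Wiener process properties above can be illustrated with a small simulation. The following is a minimal sketch, not code from the dissertation: the function names, the step size `dt`, the horizon `t_max` and the parameter values are all illustrative assumptions. It builds a discretized path from independent normal increments and records the first time the path reaches a threshold `a` below the starting value. Because the path is only checked on a grid, the simulated hitting times are slightly biased upward.

```python
import math
import random


def simulate_wiener_fht(x0, a, mu, sigma, dt=0.01, t_max=50.0, rng=None):
    """Return the first grid time at which a discretized Wiener path started
    at x0 reaches the threshold a (a < x0), or None if not hit by t_max.

    dt and t_max are illustrative choices, not values from the source.
    """
    rng = rng or random.Random()
    x = x0
    n_steps = int(round(t_max / dt))
    for step in range(1, n_steps + 1):
        # Independent normal increment with mean mu*dt and variance sigma^2*dt.
        x += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if x <= a:
            return step * dt
    return None


def mean_hitting_time(n_paths, x0, a, mu, sigma, seed=0):
    """Average the simulated hitting times over n_paths independent paths.

    Returns (mean of observed hitting times, number of paths that hit).
    """
    rng = random.Random(seed)
    times = [simulate_wiener_fht(x0, a, mu, sigma, rng=rng) for _ in range(n_paths)]
    observed = [t for t in times if t is not None]
    return sum(observed) / len(observed), len(observed)


if __name__ == "__main__":
    mean_t, n_hit = mean_hitting_time(1000, x0=2.0, a=0.0, mu=-1.0, sigma=1.0, seed=1)
    print(f"{n_hit} of 1000 paths hit the threshold; mean hitting time {mean_t:.3f}")
```

With negative drift, the average simulated hitting time should be close to $(x_0 - a)/|\mu|$, the mean of the inverse Gaussian first hitting time distribution, up to Monte Carlo and discretization error.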

Under these conditions, the probability density function (pdf) for $X_t = x$ given that the process started at $x_0$ is

$$f(x \mid x_0, t) = \frac{1}{\sigma\sqrt{2\pi t}} \exp\left(-\frac{(x - x_0 - \mu t)^2}{2\sigma^2 t}\right). \tag{2.1}$$

Further details on the Wiener process can be found in Chhikara and Folks (1989) and Prabhu (1965). Next we focus on the first passage time $T$ of $X_t$ to $a < x_0$, where $a$ is the predetermined threshold value. The conditions $X_0 = x_0$, $X_t > a$ for $0 < t < T$, and $X_T = a$ are necessary for $T$ to be the first passage time. If $T$ is finite, the density function of $T$ is derived by finding the Laplace transform (Prabhu, 1965). Details of these derivations can be found in Chhikara and Folks (1989). The resulting first hitting time distribution is inverse Gaussian, which is explained in the following section.

2.2.2 The Inverse Gaussian Distribution

The probability density function (pdf) of an inverse Gaussian random variable $X$ is

$$f(x \mid \tau, \lambda) = \sqrt{\frac{\lambda}{2\pi}}\; x^{-3/2} \exp\left[-\frac{\lambda (x - \tau)^2}{2\tau^2 x}\right], \quad x > 0, \tag{2.2}$$

where $\tau$ and $\lambda$ are greater than zero. The mean of the distribution is $\tau$ and the scale parameter is $\lambda$. If we define $\phi = \lambda/\tau$, the shape of the distribution depends on $\phi$ only. The inverse Gaussian distribution represents a broad class of distributions, varying from a highly skewed to a symmetrical distribution as $\phi$ goes from 0 to $\infty$ (Chhikara and Folks, 1989). Since $\phi^{-1} = \tau/\lambda$, the inverse Gaussian distribution moves closer to normal when $\phi$ is increased. As shown in Chhikara and Folks (1989), the density curves in Figures 2.1 and 2.2 illustrate the wide range of shapes possible when using the inverse Gaussian distribution.

[Figure 2.1: Inverse Gaussian Densities with τ = 1 for Several Values of λ (horizontal axis: Time)]

[Figure 2.2: Inverse Gaussian Densities with λ = 1 for Several Values of τ (horizontal axis: Time)]
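As a hedged illustration of the inverse Gaussian density in (2.2), the sketch below evaluates the pdf and checks numerically that it integrates to one and has mean $\tau$. The helper names, the trapezoidal rule, and the integration limits are assumptions made for this example and are not part of the original work.

```python
import math


def inverse_gaussian_pdf(x, tau, lam):
    """Density (2.2) of the inverse Gaussian distribution with mean tau > 0
    and scale lam > 0; returns 0 for x <= 0."""
    if x <= 0.0:
        return 0.0
    return math.sqrt(lam / (2.0 * math.pi * x ** 3)) * math.exp(
        -lam * (x - tau) ** 2 / (2.0 * tau ** 2 * x)
    )


def trapezoid(func, lower, upper, n):
    """Composite trapezoidal rule with n equal subintervals."""
    h = (upper - lower) / n
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, n):
        total += func(lower + i * h)
    return total * h


def check_density(tau, lam, upper=80.0, n=80000):
    """Approximate the total mass and the mean of IG(tau, lam) on (0, upper).

    The upper limit and grid size are illustrative; they need to be enlarged
    when lam / tau**2 is small, because the right tail then decays slowly.
    """
    mass = trapezoid(lambda x: inverse_gaussian_pdf(x, tau, lam), 0.0, upper, n)
    mean = trapezoid(lambda x: x * inverse_gaussian_pdf(x, tau, lam), 0.0, upper, n)
    return mass, mean


if __name__ == "__main__":
    for tau, lam in [(1.0, 1.0), (2.0, 4.0)]:
        mass, mean = check_density(tau, lam)
        print(f"tau={tau}, lambda={lam}: mass={mass:.4f}, mean={mean:.4f}")
```

Evaluating `inverse_gaussian_pdf` on a grid of times for several values of `lam` (with `tau` fixed) or several values of `tau` (with `lam` fixed) gives the kind of curves summarized in Figures 2.1 and 2.2.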

2.2.3 The Inverse Gaussian Distribution and Survival Analysis

When studying subject survival or time to disease occurrence/recurrence, the inverse Gaussian model has some useful properties. The hazard function for the inverse Gaussian tends to initially increase, then decrease, and approach a constant value as the lifetime becomes infinite. This property is frequently found when lifetimes are dominated by early event times (Chhikara and Folks, 1989), such as studies involving organ and bone marrow transplants (Klein and Moeschberger, 2003). Another important property is that the family of inverse Gaussian distributions is rather broad. This distribution can represent a highly skewed to an almost normal distribution (Chhikara and Folks, 1989).

Suppose $F(t)$ denotes the cdf of a subject's survival time. Then, the subject's survival function $S(t)$ at time $t$ is the probability of experiencing the event after time $t$. Therefore, $S(t) = 1 - F(t)$. The cdf of $t$ in terms of the standard normal distribution function, given by Schuster (1968), is

$$F(t) = \Phi\left[\sqrt{\frac{\lambda}{t}}\left(\frac{t}{\tau} - 1\right)\right] + \exp(2\lambda/\tau)\,\Phi\left[-\sqrt{\frac{\lambda}{t}}\left(1 + \frac{t}{\tau}\right)\right]. \tag{2.3}$$

Therefore, the survival function for the inverse Gaussian is

$$S(t) = \Phi\left[\sqrt{\frac{\lambda}{t}}\left(1 - \frac{t}{\tau}\right)\right] - \exp(2\lambda/\tau)\,\Phi\left[-\sqrt{\frac{\lambda}{t}}\left(1 + \frac{t}{\tau}\right)\right]. \tag{2.4}$$

As mentioned in Section 2.2.1, the first hitting time ($T$) of a Wiener process follows an inverse Gaussian distribution. The drift of the Wiener process may be positive, negative or zero. If the Wiener process has negative drift ($\mu < 0$), then there is a propensity to drift toward the threshold ($a$) (Whitmore, 1979); i.e., $S(\infty) = 0$. With $\mu < 0$, we obtain, from derivations explained in Chhikara and Folks (1989), a proper inverse Gaussian distribution IG($\tau$, $\lambda$) of the first hitting times with parameters defined as follows:

$$\tau = \frac{x_0 - a}{|\mu|} \quad \text{and} \quad \lambda = \frac{(a - x_0)^2}{\sigma^2}. \tag{2.5}$$

Thus, the pdf of the inverse Gaussian first hitting time distribution when $\mu < 0$ in terms of the Wiener process parameters is

$$f(t \mid \mu, \sigma^2) = \frac{x_0 - a}{\sqrt{2\pi\sigma^2}}\; t^{-3/2} \exp\left[-\frac{\left[\mu\left(t + \frac{x_0 - a}{\mu}\right)\right]^2}{2\sigma^2 t}\right], \quad t > 0,\ \sigma^2 > 0. \tag{2.6}$$

The corresponding cdf is

$$F(t) = \Phi\left[\sqrt{\frac{(x_0 - a)^2}{\sigma^2 t}}\left(\frac{t|\mu|}{x_0 - a} - 1\right)\right] + \exp\left(\frac{2(x_0 - a)|\mu|}{\sigma^2}\right)\Phi\left[-\sqrt{\frac{(x_0 - a)^2}{\sigma^2 t}}\left(1 + \frac{t|\mu|}{x_0 - a}\right)\right]. \tag{2.7}$$

In a process that has positive drift ($\mu > 0$), hitting a preset threshold ($a$) theoretically may never occur (Whitmore, 1979); i.e., it has a cure rate ($S(\infty)$ is not necessarily 0). For example, in a clinical trial study where a patient's health was modeled using the Wiener process with positive drift, the subject may never experience the event under study (they are cured). With $\mu > 0$, the resulting improper distribution of the first hitting time is inverse Gaussian IG($\tau$, $\lambda$) and the cure rate is $S(\infty) = 1 - \exp(-2x_0\mu/\sigma^2)$. If $\mu = 0$, the process also has a propensity to drift toward $a$ (Whitmore, 1979). However, as seen in Chhikara and Folks (1989), the FHT is not inverse Gaussian when $\mu = 0$, but it is a stable distribution with index 1/2 (see Feller 1966, Section 6.1, p. 170) with probability density function

$$f(t \mid \sigma^2) = \frac{1}{2\sigma\sqrt{2\pi t^3}} \exp\left(-\frac{1}{8t\sigma^2}\right).$$
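The relationships in this section can be sketched in code as follows. This is an illustrative sketch under stated assumptions rather than the dissertation's implementation: the function names are hypothetical, the mapping to ($\tau$, $\lambda$) uses $|\mu|$ so that $\tau$ is positive, and the cure rate is written with the distance $x_0 - a$, which reduces to the expression in the text when the threshold is $a = 0$.

```python
import math


def std_normal_cdf(z):
    """Standard normal cdf, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def wiener_to_inverse_gaussian(x0, a, mu, sigma2):
    """Map Wiener process parameters to (tau, lam) as in (2.5), using |mu|."""
    if mu == 0.0:
        raise ValueError("the first hitting time is not inverse Gaussian when mu = 0")
    distance = x0 - a
    return distance / abs(mu), distance ** 2 / sigma2


def fht_cdf(t, tau, lam):
    """Inverse Gaussian cdf in the form of (2.3).

    exp(2*lam/tau) can overflow for very large lam/tau; this sketch does not
    guard against that case.
    """
    if t <= 0.0:
        return 0.0
    root = math.sqrt(lam / t)
    return std_normal_cdf(root * (t / tau - 1.0)) + math.exp(
        2.0 * lam / tau
    ) * std_normal_cdf(-root * (1.0 + t / tau))


def fht_survival(t, x0, a, mu, sigma2):
    """Survival function S(t) = 1 - F(t) for a Wiener process with mu < 0."""
    if mu >= 0.0:
        raise ValueError("this sketch covers only the proper case mu < 0")
    tau, lam = wiener_to_inverse_gaussian(x0, a, mu, sigma2)
    return 1.0 - fht_cdf(t, tau, lam)


def cure_rate(x0, a, mu, sigma2):
    """Probability of never hitting the threshold: positive only when mu > 0."""
    if mu <= 0.0:
        return 0.0
    return 1.0 - math.exp(-2.0 * (x0 - a) * mu / sigma2)


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 4.0):
        s = fht_survival(t, x0=2.0, a=0.0, mu=-1.0, sigma2=1.0)
        print(f"S({t}) = {s:.4f}")
    print("cure rate with mu = 0.5:", round(cure_rate(1.0, 0.0, 0.5, 1.0), 4))
```

Keeping the parameter mapping separate from the cdf mirrors the text, where (2.7) is (2.3) with $\tau$ and $\lambda$ replaced by their expressions in (2.5).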

28 Inthresholdregression,theprocess{X t }andboundarysetb haveparametersthat are dependent on covariates differing between individuals (Lee and Whitmore 2006). Using appropriate regression link functions, these parameters are joined to linear combinations of covariates, such as g θ (θ i ) = z i γ for θ. Here g θ is the link function, the parameter θ i is the value of the parameter θ for individual i, z i = (1,z i1,z i2,...,z ik ) is the covariate vector of individual i and γ is the associated vector of regression coefficients. Normally, the link function will be chosen to map the parameter space into the real line. Likewise, covariates and their mathematical forms in the regression function zγ must be chosen appropriately, as is the case in a conventional regression analysis. An attractive feature of the threshold regression model is that we are able to relate subject characteristics to clinically meaningful parameters. We illustrate an example of this using the Wiener process FHT model. The Wiener process has mean (drift) parameter µ and variance parameter σ 2 initial process level parameter x 0 and the boundary set that includes the threshold a. However, the survival function only depends on these three parameters via x 0 /σ and µ/σ. Hence, when analyzing right censored data, there are essentially only two free parameters. Thus, we can arbitrarily set σ 2 = 1 without loss of generality (Aalen and Gjessing, 2001). In the railroad worker case-control study presented in Lee et al. (2009), covariates, such as smoking status, asbestos exposure and whether or not the subject worked as an engineer, were incorporated into the Wiener process model to determine their effect on survival. In this case-control study, the following link functions were used: µ = β 0 +β 1 y 1 + +β k y k. ln(x 0 ) = γ 0 +γ 1 y 1 + +γ k y k. 14

29 where y = (y 1,y 2,...,y k ) is a vector of regression covariates. The underlying process was assumed to be a Wiener process with negative drift. Therefore, the first hitting time distribution was IG( τ,λ), with τ and λ as defined in (2.5), and maximum likelihoodtechniqueswereusedtofindtheestimatesofβ andγ fromthelinkfunctions above. In this model setup, β represents the covariate effects on the initial health status and γ represents the covariate effects on the rate of decline in health status Previous Work using the FHT model with an Underlying Wiener Stochastic Process Several authors have utilized the Wiener process FHT model in survival, reliability and economic applications. In this section, we will summarize several specific uses of this model. The first example comes from research conducted in Lee et al. (2009). An FHT model with the Wiener process as the underlying stochastic process was used to analyze data, detailed in Garshick et al. (2004), from a case-control study which includes 3641 railroad workers where 1256 died from lung cancer (cases) and 2385 workers from the same population that did not die of lung cancer, suicide, accident or unknown cause (controls). Since 1959, the rail industry used diesel power for their locomotives. Thus, railroad workers began to be exposed to diesel exhaust. For this case-control study, diesel exhaust exposure was captured by breaking down jobs with the railroad into three categories. The first category contains engineers, brakemen, firemen, conductors and hostlers. The second category consists of railroad shop workers and the third includes all other workers such as ticket and station agents, clerks and rail car repair workers. Since this data comes from a case-control study, each case subject contributes an observed lifetime from the reference date to the year 15

of death and each control subject contributes a censored survival time (censored by some other cause of death) measured from the same reference date (Lee et al., 2009). Covariates were incorporated into the model via the drift parameter and the initial health status. Operational time was also used in this study and is explained at the end of this chapter.

In Pennell et al. (2010), a Bayesian methodology was used in a Wiener process FHT model that accounts for unmeasured covariates in both the initial health status and the drift. To accomplish this, a random effect was included in the drift component and each subject's initial health status, $x_{0i}$, was modeled as a truncated normal random variable. This methodology was applied to data from malignant melanoma patients in which nonproportional hazards and unexplained heterogeneity were present. The results were compared to previous analyses of these data that used Cox regression and a similar FHT model without random effects.

Lee, Whitmore and Rosner (2010) explored threshold regression for survival data in longitudinal studies involving time-varying covariates. To handle this type of data, the authors suggest breaking up the longitudinal data into intervals and modeling time to event over each interval using threshold regression with a latent Wiener process under a Markov assumption. This method was illustrated using data from a nurses' health study of lung cancer risk, with the completion times of surveys defining the different time intervals.

Lee, Chang and Whitmore (2008) used a threshold regression mixture model to assess treatment efficacy in a multiple myeloma clinical trial. The subjects in this study were initially randomized to receive either Velcade or a high-dose Dexamethasone treatment. Based on the subject's response, they were

switched to the other treatment if necessary. A mixture of two Wiener process FHT models was fit to the survival data since there was evidence of a bimodal FHT distribution in each treatment group. The mixing parameter in this model is the proportion of patients receiving one of the two treatments. A composite time scale was used to distinguish the rate of disease progression before and after switching treatments (see Section 2.4 at the end of this chapter). Covariates were incorporated in the model via the drift parameter.

An extension of the univariate Wiener process model can be found in research conducted by Whitmore, Crowder and Lawless (1998). Here, a bivariate Wiener process was used to jointly model a latent process and an observable marker process. This technique was demonstrated with a simulated example and applied to a data set obtained from an aluminum production process. The data set contained the failure age in days of the reduction cells (used to perform electrolysis of molten alumina and cryolite) and data at failure on two markers: the percentage iron contamination level and the horizontal distortion of the cell in inches.

The bivariate Wiener process model of Whitmore et al. (1998) was extended by Tong et al. (2008) to the case when only current status data are available. Current status data are a form of interval-censored data in which only one observation on each subject is available and the failure time is known only to be either smaller or larger than the observed time. This type of data can be found in cross-sectional studies and in animal studies examining time to appearance of internal tumors.

Horrocks and Thompson (2004) proposed a Wiener process model for competing risks data. The model is based on the time, $T$, at which a Wiener process hits one of two boundaries, which represent two possible competing outcomes. Covariates

were incorporated into the model via the drift component of the underlying Wiener process. The model was used on a subset of data from the Utah Department of Health representing all hospital discharges in [...]. The two competing outcomes were healthy discharge and death in hospital. The upper and lower thresholds were modeled as linear functions of covariates. Horrocks and Thompson discussed an extension of their model for length of stay that accounts for the presence of heterogeneity in the population (accomplished through use of a mixture model).

Another competing risks application of the Wiener process FHT model appears in research conducted by Lindqvist and Skogsrud (2009). They considered a competing risks framework for a component that will experience either failure or a preventive maintenance procedure to avoid failure. Their approach models component degradation using the Wiener process, with failure associated with hitting a predetermined threshold. In addition, the model accounts for a potential time for maintenance, associated with hitting a threshold that lies before the failure threshold.

A final example of an application involving the Wiener process comes from research conducted by Saebo et al. (2005) on genetic evaluation of mastitis resistance in cows. In the model setup, it is assumed that each cow is in a unique state of health at any given time that is a certain distance from onset of disease. The latent physiological battle against the disease can be modeled by a Wiener process with drift toward the disease threshold. Two risk patterns are associated with development of mastitis: physical changes known to start in the days leading up to calving, and the cow's environment, such as milking technique and hygiene. These two risk patterns motivate a model setup involving two latent Wiener processes.
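As an illustration of the Wiener process threshold regression summarized in this section, the following Python sketch evaluates the first hitting time density and survival function of a Wiener process with $\sigma^2 = 1$ that starts at $x_0 > 0$ and is absorbed at zero, using the standard closed forms for Brownian motion with drift. It then combines them into a right-censored log-likelihood with an identity link for the drift and a log link for the initial level. The function names and the data layout are illustrative assumptions; this is not the code used in any of the cited studies.

```python
import math


def norm_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def fht_density(t, mu, x0):
    """First hitting time density at t > 0 for a Wiener process with
    drift mu and unit variance, started at x0 > 0 and absorbed at 0."""
    return (x0 / math.sqrt(2.0 * math.pi * t ** 3)
            * math.exp(-(x0 + mu * t) ** 2 / (2.0 * t)))


def fht_survival(t, mu, x0):
    """P(T > t) for the same process."""
    rt = math.sqrt(t)
    return (norm_cdf((x0 + mu * t) / rt)
            - math.exp(-2.0 * mu * x0) * norm_cdf((mu * t - x0) / rt))


def linear_predictor(coefs, covariates):
    """Intercept coefs[0] plus the inner product of slopes and covariates."""
    return coefs[0] + sum(c * y for c, y in zip(coefs[1:], covariates))


def log_likelihood(beta, gamma, data):
    """Right-censored log-likelihood.

    data is an iterable of (time, event, covariates) tuples (a hypothetical
    layout), where event is 1 for an observed event and 0 for censoring.
    """
    total = 0.0
    for t, event, y in data:
        mu = linear_predictor(beta, y)              # identity link for the drift
        x0 = math.exp(linear_predictor(gamma, y))   # log link for the initial level
        if event:
            total += math.log(fht_density(t, mu, x0))
        else:
            total += math.log(fht_survival(t, mu, x0))
    return total


if __name__ == "__main__":
    # Toy records invented for illustration: (time, event indicator, covariates).
    toy = [(3.0, 1, [1.0]), (5.0, 0, [0.0]), (1.5, 1, [1.0])]
    print(log_likelihood([-0.5, -0.2], [0.7, 0.1], toy))
```

In practice the log-likelihood would be passed to a numerical optimizer to obtain estimates of $\beta$ and $\gamma$. With a positive drift the survival function levels off at $1 - \exp(-2\mu x_0)$ instead of falling to zero, which is how the model can represent a cured fraction.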

2.2.5 Strengths and Limitations of Using the Wiener Process Model

The Wiener process is widely used as the underlying stochastic process in research involving first hitting time models in survival and reliability applications. The first hitting time distribution, when using this process, is inverse Gaussian. As explained in Chhikara and Folks (1989), this distribution is very flexible and can represent skewed as well as approximately normal data. Thus, the Wiener process can be an important tool when modeling data characterized by early incidence of events. Also, when using the Wiener process, the inverse Gaussian density and the survival function for the first hitting times are easily computable and provide an efficient means of finding maximum likelihood estimates of model parameters. Another useful feature is the ability to model cure rates, since the drift parameter can be either positive or negative. Finally, the ease of incorporating covariates into the model via the drift parameter or the applicable threshold makes the Wiener process a viable choice in biostatistical research.

A limitation of the Wiener process model, originating in the defining properties of the process, is that increments over disjoint time intervals are independent. This can cause problems when modeling, for example, movements of an organism or a patient's health status, which logically depend on the state in the previous time increment of the process. This virtually eliminates the capability to model homeostasis in the underlying process, as the Wiener process models rapidly fluctuating phenomena (Horsthemke and Lefever, 1984). A possible solution to this deficiency is the incorporation of the Ornstein-Uhlenbeck process, a modification of the Wiener process, to allow adequate modeling

of the homeostatic properties of many biological processes. The OU process is described in the next section.

2.3 Survival Models Based on the Ornstein-Uhlenbeck Process

In a 2004 paper, Aalen and Gjessing discussed survival models based on the Ornstein-Uhlenbeck (OU) process. This process is a mean-reverting modification of a Wiener process in that it has a propensity to drift in the direction of a fixed equilibrium level. Homeostasis, defined as simultaneously diffusing back and forth while stabilizing around a certain point, is a characteristic found in many natural processes. Thus, the OU process is natural to consider in a biological context. An example of a biological process that exhibits homeostasis is kidney function. Kidneys get rid of extra water and ions from blood through passage of urine. Thus, the kidneys carry out homeostatic regulation by removing waste or excess products from the body.

For the purposes of this research, we consider two scenarios when modeling with the OU process. If the event under study is a positive one, such as discharge from the hospital or ICU, the threshold, in the FHT model context, is regarded as a healthy homeostasis. Alternatively, subjects can be pulled from a healthy status toward an unhealthy homeostasis, with the threshold representing death or disease. In the following sections, details of the OU process are explained and an example is given describing a previous use of this model in the literature.

First Hitting Time for the Ornstein-Uhlenbeck Process

Aalen and Gjessing (2004) state that the Wiener process, represented by $W_t$, is well known for modeling random processes with continuous sample paths. Its time steps

over an interval are normally distributed with mean 0 and variance proportional to the interval length. The OU process, represented by $X_t$, is a modified Wiener process with a drift toward a state of equilibrium. The OU process can be defined by the stochastic differential equation (Cox and Miller, 1965, Section 5.8, p. 226)

$$dX_t = (a - bX_t)\,dt + \sigma\,dW_t, \tag{2.8}$$

where $-\infty < a < \infty$, $b > 0$ and $\sigma > 0$. According to Aalen and Gjessing (2004), $X_0$ is typically modeled as Gaussian or treated as a constant. This equation tells us that for small time intervals $(t, t + \Delta t)$, the change in $X_t$ has drift toward $a/b$, but is agitated by the Gaussian noise contained in $dW_t$ (often called white noise). The process is attracted to the equilibrium point $a/b$; this attraction is known as the mean-reverting property of the OU process. As shown in Aalen and Gjessing (2004), $X$ is Gaussian, and

$$E X_t = a/b + (x_0 - a/b)\exp(-bt),$$

which converges to $a/b$ as $t \to \infty$. Also,

$$\mathrm{Var}(X_t) = \frac{\sigma^2}{2b}\left(1 - \exp(-2bt)\right),$$

which converges to $\sigma^2/(2b)$ as $t \to \infty$, and

$$\mathrm{Cov}(X_s, X_t) = \frac{\sigma^2}{2b}\left[\exp(-b|s - t|) - \exp(-b(s + t))\right].$$

If we ignore the initial fluctuations at the start of the process due to $X_0 \neq a/b$, the OU process is stationary and Gaussian with an autocorrelation function that decays exponentially over time (Aalen and Gjessing, 2004). Details regarding the OU process are also available in Aalen et al. (2008).

To demonstrate the tendency of the OU process to reach a state of equilibrium, in contrast to the Wiener process, corresponding sample paths were generated for the two processes and are shown in Figure 2.3. The initial value, $X_0$, of all processes was set to 4, $\sigma^2$ was set to 2, the mean of the OU process was set to 0 ($a = 0$, $b = 1$) and the drift parameter for the Wiener process was set to $-2$, 0 and 2. In this plot we see the Wiener process with positive drift tends to move away from $X_0$ in a positive direction, and the Wiener process with negative drift tends to

move away from $X_0$ in a negative direction. For the Wiener process with zero drift, the path tends to stay close to the starting point of the process. The OU sample path moves toward the process mean of 0 and stabilizes. This behavior exhibits the mean-reverting property of the OU process. The variance parameter of the Wiener processes in this plot is 2, while the variance of the OU process converges to $\sigma^2/(2b) = 1$ as $t$ approaches infinity.

[Figure 2.3: Sample Paths of OU Process and Wiener Process. Plot title: Wiener Process (drift = −2, 0, and 2) and OU Process (mean = 0) Paths with $X_0 = 4$; horizontal axis: Time (years); vertical axis: Health Status.]

From this point on, it is assumed that the process is absorbed once it hits zero. The OU process $X_t$ describes the latent progression of a subject toward a health-related event. Suppose we let $X_t$ correspond to a subject's disease development, which may be latent. Then, at the time the development reaches a particular level, an event occurs for that subject. Therefore, we define the subject's event time as the first time the latent process $X_t$ hits a threshold. We assume the OU process, $X_t$,

starts at a deterministic positive value $x_0$. We define $T = \inf\{t : X_t = 0\}$, where $T \geq 0$, to be the event time (Aalen and Gjessing, 2004). We can define the hazard rate for continuous $T$ by

$$h(t) = -\frac{\frac{d}{dt}P(T > t)}{P(T > t)}. \tag{2.9}$$

It can be shown that as $t$ approaches $\infty$, the hazard rate $h(t)$, the rate of first passage across the boundary, converges to a constant $h_0$ (Aalen and Gjessing, 2004).

The Shape of the Hazard Function

Exact formulas for the hazard rate exist in the symmetric case ($a = 0$ in equation (2.8)). In this situation, we are modeling time to homeostasis since the mean and the threshold of the process are both equal to 0. Unfortunately, under the general OU model, there is no closed form for the hazard rate. In Finch (2004), an attempt was made to find the general formula in closed form, but only a numerical solution is available. Aalen and Gjessing (2004) also state that a closed-form symbolic inversion is hardly possible in general for the Laplace transforms required to find formulas for the density and survival functions. In Ricciardi and Sato (1988, p. 46) the probability density of the time to event when starting at $X_0$ (parameter values $a = 0$, $b = 1$, $\sigma^2 = 2$) is given by

$$f(t) = \sqrt{\frac{2}{\pi}}\,\frac{X_0\, e^{2t}}{(e^{2t} - 1)^{3/2}} \exp\left(-\frac{X_0^2}{2(e^{2t} - 1)}\right) \tag{2.10}$$

and the corresponding survival function is

$$S(t) = 2\Phi\left(\frac{X_0}{\sqrt{e^{2t} - 1}}\right) - 1, \tag{2.11}$$

where $\Phi(\cdot)$ is the standard normal cumulative distribution function. The corresponding hazard rate is calculated as $h(t) = f(t)/S(t)$. For the OU process with parameter values $a = 0$, $b = 1$, $\sigma^2 = 2$, the hazard rate starts at 0 and then converges toward an equilibrium level; for these specific parameter values, the hazard rate converges to 1. According to Aalen and Gjessing (2004), this convergence indicates the advancement of the underlying distribution toward quasi-stationarity on the state space (see also Aalen and Gjessing, 2001).

Aalen and Gjessing (2004) discuss how the shape of the hazard function changes with $X_0$; Figure 2.4 is a redrawing of their Figure 1. The hazard rate is generally increasing if $X_0$ is far from 0. As $X_0$ moves closer to zero, the hazard becomes more unimodal. For $X_0$ close to 0, we obtain a generally decreasing hazard rate. Note that the hazard function corresponding to $X_0 = 0.2$ in Figure 2.4 starts out at zero, has a strong initial increase and then is generally decreasing. Therefore, the shape of the hazard rate is driven by the distance of $X_0$ from the threshold (Aalen and Gjessing, 2001).
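The behavior described above can be checked numerically. The following Python sketch evaluates the density (2.10), the survival function (2.11) and the hazard $h(t) = f(t)/S(t)$ for the symmetric case $a = 0$, $b = 1$, $\sigma^2 = 2$, and prints the hazard at a few time points for several starting values. The function names and the chosen values of $X_0$ and $t$ are illustrative assumptions, and the survival function uses the identity $2\Phi(z) - 1 = \mathrm{erf}(z/\sqrt{2})$ for numerical accuracy at large $t$.

```python
import math


def ou_density(t, x0):
    """First hitting time density (2.10) for a = 0, b = 1, sigma^2 = 2, t > 0."""
    u = math.expm1(2.0 * t)  # e^{2t} - 1
    return (math.sqrt(2.0 / math.pi) * x0 * math.exp(2.0 * t) / u ** 1.5
            * math.exp(-x0 ** 2 / (2.0 * u)))


def ou_survival(t, x0):
    """Survival function (2.11), written as erf(z / sqrt(2)) = 2 * Phi(z) - 1
    with z = x0 / sqrt(e^{2t} - 1)."""
    u = math.expm1(2.0 * t)
    return math.erf(x0 / math.sqrt(2.0 * u))


def ou_hazard(t, x0):
    """Hazard rate h(t) = f(t) / S(t)."""
    return ou_density(t, x0) / ou_survival(t, x0)


if __name__ == "__main__":
    times = (0.05, 0.5, 1.0, 2.0, 5.0)
    for x0 in (0.2, 1.0, 3.0):  # starting values chosen for illustration
        row = ", ".join(f"h({t}) = {ou_hazard(t, x0):.4f}" for t in times)
        print(f"X0 = {x0}: {row}")
```

For a starting value far from the threshold the printed hazards rise toward 1, while for a starting value near zero they fall toward 1 after an early peak, in line with the shapes described by Aalen and Gjessing (2004). The sketch is limited to moderate $t$, since `math.exp(2.0 * t)` overflows for very large arguments.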


More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Ignoring the matching variables in cohort studies - when is it valid, and why?

Ignoring the matching variables in cohort studies - when is it valid, and why? Ignoring the matching variables in cohort studies - when is it valid, and why? Arvid Sjölander Abstract In observational studies of the effect of an exposure on an outcome, the exposure-outcome association

More information

A general mixed model approach for spatio-temporal regression data

A general mixed model approach for spatio-temporal regression data A general mixed model approach for spatio-temporal regression data Thomas Kneib, Ludwig Fahrmeir & Stefan Lang Department of Statistics, Ludwig-Maximilians-University Munich 1. Spatio-temporal regression

More information

Modeling Prediction of the Nosocomial Pneumonia with a Multistate model

Modeling Prediction of the Nosocomial Pneumonia with a Multistate model Modeling Prediction of the Nosocomial Pneumonia with a Multistate model M.Nguile Makao 1 PHD student Director: J.F. Timsit 2 Co-Directors: B Liquet 3 & J.F. Coeurjolly 4 1 Team 11 Inserm U823-Joseph Fourier

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH The First Step: SAMPLE SIZE DETERMINATION THE ULTIMATE GOAL The most important, ultimate step of any of clinical research is to do draw inferences;

More information

2008 Winton. Statistical Testing of RNGs

2008 Winton. Statistical Testing of RNGs 1 Statistical Testing of RNGs Criteria for Randomness For a sequence of numbers to be considered a sequence of randomly acquired numbers, it must have two basic statistical properties: Uniformly distributed

More information

Frailty Models and Copulas: Similarities and Differences

Frailty Models and Copulas: Similarities and Differences Frailty Models and Copulas: Similarities and Differences KLARA GOETHALS, PAUL JANSSEN & LUC DUCHATEAU Department of Physiology and Biometrics, Ghent University, Belgium; Center for Statistics, Hasselt

More information

5. Parametric Regression Model

5. Parametric Regression Model 5. Parametric Regression Model The Accelerated Failure Time (AFT) Model Denote by S (t) and S 2 (t) the survival functions of two populations. The AFT model says that there is a constant c > 0 such that

More information

Statistical Methods for Alzheimer s Disease Studies

Statistical Methods for Alzheimer s Disease Studies Statistical Methods for Alzheimer s Disease Studies Rebecca A. Betensky, Ph.D. Department of Biostatistics, Harvard T.H. Chan School of Public Health July 19, 2016 1/37 OUTLINE 1 Statistical collaborations

More information

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models

System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models System Simulation Part II: Mathematical and Statistical Models Chapter 5: Statistical Models Fatih Cavdur fatihcavdur@uludag.edu.tr March 20, 2012 Introduction Introduction The world of the model-builder

More information

Comparing Group Means When Nonresponse Rates Differ

Comparing Group Means When Nonresponse Rates Differ UNF Digital Commons UNF Theses and Dissertations Student Scholarship 2015 Comparing Group Means When Nonresponse Rates Differ Gabriela M. Stegmann University of North Florida Suggested Citation Stegmann,

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 4: and multivariable regression Fatma Shebl, MD, MS, MPH, PhD Assistant Professor Chronic Disease Epidemiology Department Yale School of Public Health Fatma.shebl@yale.edu

More information

Evaluating the value of structural heath monitoring with longitudinal performance indicators and hazard functions using Bayesian dynamic predictions

Evaluating the value of structural heath monitoring with longitudinal performance indicators and hazard functions using Bayesian dynamic predictions Evaluating the value of structural heath monitoring with longitudinal performance indicators and hazard functions using Bayesian dynamic predictions C. Xing, R. Caspeele, L. Taerwe Ghent University, Department

More information

Evaluation and Comparison of Mixed Effects Model Based Prognosis for Hard Failure

Evaluation and Comparison of Mixed Effects Model Based Prognosis for Hard Failure IEEE TRANSACTIONS ON RELIABILITY, VOL. 62, NO. 2, JUNE 2013 379 Evaluation and Comparison of Mixed Effects Model Based Prognosis for Hard Failure Junbo Son, Qiang Zhou, Shiyu Zhou, Xiaofeng Mao, and Mutasim

More information

Variable Selection in Competing Risks Using the L1-Penalized Cox Model

Variable Selection in Competing Risks Using the L1-Penalized Cox Model Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2008 Variable Selection in Competing Risks Using the L1-Penalized Cox Model XiangRong Kong Virginia Commonwealth

More information

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Random variables. DS GA 1002 Probability and Statistics for Data Science. Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

UNIVERSITY OF CALIFORNIA, SAN DIEGO

UNIVERSITY OF CALIFORNIA, SAN DIEGO UNIVERSITY OF CALIFORNIA, SAN DIEGO Estimation of the primary hazard ratio in the presence of a secondary covariate with non-proportional hazards An undergraduate honors thesis submitted to the Department

More information

Gamma process model for time-dependent structural reliability analysis

Gamma process model for time-dependent structural reliability analysis Gamma process model for time-dependent structural reliability analysis M.D. Pandey Department of Civil Engineering, University of Waterloo, Waterloo, Ontario, Canada J.M. van Noortwijk HKV Consultants,

More information

Multistate models and recurrent event models

Multistate models and recurrent event models Multistate models Multistate models and recurrent event models Patrick Breheny December 10 Patrick Breheny Survival Data Analysis (BIOS 7210) 1/22 Introduction Multistate models In this final lecture,

More information

Frailty Modeling for clustered survival data: a simulation study

Frailty Modeling for clustered survival data: a simulation study Frailty Modeling for clustered survival data: a simulation study IAA Oslo 2015 Souad ROMDHANE LaREMFiQ - IHEC University of Sousse (Tunisia) souad_romdhane@yahoo.fr Lotfi BELKACEM LaREMFiQ - IHEC University

More information

Multistate models and recurrent event models

Multistate models and recurrent event models and recurrent event models Patrick Breheny December 6 Patrick Breheny University of Iowa Survival Data Analysis (BIOS:7210) 1 / 22 Introduction In this final lecture, we will briefly look at two other

More information

3 Continuous Random Variables

3 Continuous Random Variables Jinguo Lian Math437 Notes January 15, 016 3 Continuous Random Variables Remember that discrete random variables can take only a countable number of possible values. On the other hand, a continuous random

More information

Causal Sensitivity Analysis for Decision Trees

Causal Sensitivity Analysis for Decision Trees Causal Sensitivity Analysis for Decision Trees by Chengbo Li A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Computer

More information

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2

Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, Jeffreys priors. exp 1 ) p 2 Stat260: Bayesian Modeling and Inference Lecture Date: February 10th, 2010 Jeffreys priors Lecturer: Michael I. Jordan Scribe: Timothy Hunter 1 Priors for the multivariate Gaussian Consider a multivariate

More information

MAS3301 / MAS8311 Biostatistics Part II: Survival

MAS3301 / MAS8311 Biostatistics Part II: Survival MAS330 / MAS83 Biostatistics Part II: Survival M. Farrow School of Mathematics and Statistics Newcastle University Semester 2, 2009-0 8 Parametric models 8. Introduction In the last few sections (the KM

More information

Part III Measures of Classification Accuracy for the Prediction of Survival Times

Part III Measures of Classification Accuracy for the Prediction of Survival Times Part III Measures of Classification Accuracy for the Prediction of Survival Times Patrick J Heagerty PhD Department of Biostatistics University of Washington 102 ISCB 2010 Session Three Outline Examples

More information

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL

Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL Lecture 6 PREDICTING SURVIVAL UNDER THE PH MODEL The Cox PH model: λ(t Z) = λ 0 (t) exp(β Z). How do we estimate the survival probability, S z (t) = S(t Z) = P (T > t Z), for an individual with covariates

More information

Causality II: How does causal inference fit into public health and what it is the role of statistics?

Causality II: How does causal inference fit into public health and what it is the role of statistics? Causality II: How does causal inference fit into public health and what it is the role of statistics? Statistics for Psychosocial Research II November 13, 2006 1 Outline Potential Outcomes / Counterfactual

More information

TMA 4275 Lifetime Analysis June 2004 Solution

TMA 4275 Lifetime Analysis June 2004 Solution TMA 4275 Lifetime Analysis June 2004 Solution Problem 1 a) Observation of the outcome is censored, if the time of the outcome is not known exactly and only the last time when it was observed being intact,

More information

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes:

Practice Exam 1. (A) (B) (C) (D) (E) You are given the following data on loss sizes: Practice Exam 1 1. Losses for an insurance coverage have the following cumulative distribution function: F(0) = 0 F(1,000) = 0.2 F(5,000) = 0.4 F(10,000) = 0.9 F(100,000) = 1 with linear interpolation

More information

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520

REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 REGRESSION ANALYSIS FOR TIME-TO-EVENT DATA THE PROPORTIONAL HAZARDS (COX) MODEL ST520 Department of Statistics North Carolina State University Presented by: Butch Tsiatis, Department of Statistics, NCSU

More information

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January

More information

Mathematics, Box F, Brown University, Providence RI 02912, Web site:

Mathematics, Box F, Brown University, Providence RI 02912,  Web site: April 24, 2012 Jerome L, Stein 1 In their article Dynamics of cancer recurrence, J. Foo and K. Leder (F-L, 2012), were concerned with the timing of cancer recurrence. The cancer cell population consists

More information

Lecture 2: Probability and Distributions

Lecture 2: Probability and Distributions Lecture 2: Probability and Distributions Ani Manichaikul amanicha@jhsph.edu 17 April 2007 1 / 65 Probability: Why do we care? Probability helps us by: Allowing us to translate scientific questions info

More information

Bayesian Analysis for Partially Complete Time and Type of Failure Data

Bayesian Analysis for Partially Complete Time and Type of Failure Data Bayesian Analysis for Partially Complete Time and Type of Failure Data Debasis Kundu Abstract In this paper we consider the Bayesian analysis of competing risks data, when the data are partially complete

More information

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score

Causal Inference with General Treatment Regimes: Generalizing the Propensity Score Causal Inference with General Treatment Regimes: Generalizing the Propensity Score David van Dyk Department of Statistics, University of California, Irvine vandyk@stat.harvard.edu Joint work with Kosuke

More information

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion

Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Pairwise rank based likelihood for estimating the relationship between two homogeneous populations and their mixture proportion Glenn Heller and Jing Qin Department of Epidemiology and Biostatistics Memorial

More information

Time-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation

Time-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation Time-varying failure rate for system reliability analysis in large-scale railway risk assessment simulation H. Zhang, E. Cutright & T. Giras Center of Rail Safety-Critical Excellence, University of Virginia,

More information

ST745: Survival Analysis: Nonparametric methods

ST745: Survival Analysis: Nonparametric methods ST745: Survival Analysis: Nonparametric methods Eric B. Laber Department of Statistics, North Carolina State University February 5, 2015 The KM estimator is used ubiquitously in medical studies to estimate

More information

Ph.D. course: Regression models. Introduction. 19 April 2012

Ph.D. course: Regression models. Introduction. 19 April 2012 Ph.D. course: Regression models Introduction PKA & LTS Sect. 1.1, 1.2, 1.4 19 April 2012 www.biostat.ku.dk/~pka/regrmodels12 Per Kragh Andersen 1 Regression models The distribution of one outcome variable

More information

Flexible modelling of the cumulative effects of time-varying exposures

Flexible modelling of the cumulative effects of time-varying exposures Flexible modelling of the cumulative effects of time-varying exposures Applications in environmental, cancer and pharmaco-epidemiology Antonio Gasparrini Department of Medical Statistics London School

More information

Survival Analysis using Bivariate Archimedean Copulas. Krishnendu Chandra

Survival Analysis using Bivariate Archimedean Copulas. Krishnendu Chandra Survival Analysis using Bivariate Archimedean Copulas Krishnendu Chandra Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy under the Executive Committee of the

More information

Kernel density estimation in R

Kernel density estimation in R Kernel density estimation in R Kernel density estimation can be done in R using the density() function in R. The default is a Guassian kernel, but others are possible also. It uses it s own algorithm to

More information

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems

MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems MA/ST 810 Mathematical-Statistical Modeling and Analysis of Complex Systems Review of Basic Probability The fundamentals, random variables, probability distributions Probability mass/density functions

More information