Reinforced urns and the subdistribution beta-Stacy process prior for competing risks analysis

Andrea Arfè (1), Stefano Peluso (2), Pietro Muliere (1)
(1) Università Commerciale Luigi Bocconi
(2) Università Cattolica del Sacro Cuore

SISBAYES 2017 workshop, Rome, Italy

Andrea Arfè (Bocconi), Subdistribution beta-Stacy, 8 Feb 2017
Introduction

In clinical prognostic research with a time-to-event outcome, the occurrence of a competing event may preclude the occurrence of the event of interest.

Example: Amsterdam Cohort Study of AIDS progression in HIV-infected men [Geskus et al., 2003].

[Diagram: from the state "At risk", a patient moves to one of two competing events, "Syncytium Inducing (SI) AIDS" or "non-SI AIDS".]
Subdistribution functions

Subdistribution function [Kalbfleisch and Prentice, 2002]:

\[ F(t, c) = P(T \le t, C = c) \]

where:
- $T > 0$ is the time to event onset;
- $C \in \{1, \ldots, K\}$ encodes the type of the occurring event.

Covariates can be considered: $F(t, c \mid x)$.

Note: we will assume that $T \in \{1, 2, 3, \ldots\}$.
Outline

1. Literature review
2. The subdistribution beta-Stacy process
3. Predictive characterization via reinforced urn processes
4. Theoretical properties
5. A semiparametric competing risks regression model
6. Application: analysis of data from the Amsterdam Cohort Studies
7. Future work
Literature review I

Competing risks data have received widespread attention in the classical nonparametric literature, cf. Kalbfleisch and Prentice [2002], Aalen et al. [2008], Andersen et al. [2012], Lawless [2011], Crowder [2012], and Pintilie [2006].

Classical Kalbfleisch and Prentice [2002] estimator of the subdistribution function:

\[ \hat{F}(t, c) = \sum_{u=1}^{t} \hat{S}(u-1) \, \Delta\hat{A}_c(u) \]

where:
- $\hat{S}(t)$ is the Kaplan-Meier estimator of the survival function of $T$;
- $\hat{A}_c(t)$ is the Nelson-Aalen estimator of the cause-specific cumulative hazard function for the event $C = c$.
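The estimator above can be sketched in a few lines of Python for discrete times. This is only an illustrative sketch: the function name `kalbfleisch_prentice` and the data layout (cause `0` marking a censored observation) are assumptions, and subjects censored at time `u` are counted as at risk at `u`, which is the usual convention but is not spelled out on the slide.

```python
from collections import Counter

def kalbfleisch_prentice(times, causes, cause, horizon):
    """Return [F_hat(1, cause), ..., F_hat(horizon, cause)].

    times[i]  : observed discrete time (1, 2, ...) of subject i
    causes[i] : event type in 1..K, or 0 if subject i was censored (assumed coding)
    """
    events_c = Counter(t for t, c in zip(times, causes) if c == cause)
    events_any = Counter(t for t, c in zip(times, causes) if c != 0)
    surv = 1.0   # Kaplan-Meier estimate S_hat(u - 1); S_hat(0) = 1
    cum = 0.0    # running value of F_hat(u, cause)
    out = []
    for u in range(1, horizon + 1):
        at_risk = sum(1 for t in times if t >= u)
        if at_risk > 0:
            # S_hat(u - 1) times the Nelson-Aalen increment for this cause
            cum += surv * events_c[u] / at_risk
            # Kaplan-Meier update uses events of any type
            surv *= 1.0 - events_any[u] / at_risk
        out.append(cum)
    return out
```

For instance, `kalbfleisch_prentice([1, 2, 2, 3], [1, 2, 0, 1], cause=1, horizon=3)` evaluates the estimate for cause 1 at times 1, 2 and 3.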
Literature review II

In contrast with the frequentist literature, the Bayesian nonparametric literature on competing risks is still sparse.

Nonparametric Bayesian inference for competing risks involves a process prior on the space of subdistribution functions.

Current proposals are based on either the gamma process [Ge and Chen, 2012] or the beta process [Hjort, 1990; De Blasi and Hjort, 2007] and its extensions, such as the beta-Dirichlet process [Kim et al., 2012; Chae et al., 2013], for subdistribution or cause-specific hazard functions.
The subdistribution beta-Stacy process prior

Let $\alpha_{t,c} > 0$ for all $t = 1, 2, \ldots$ and $c = 0, 1, \ldots, K$, and suppose that

\[ \lim_{\tau \to +\infty} \prod_{t=1}^{\tau} \frac{\alpha_{t,0}}{\sum_{d=0}^{K} \alpha_{t,d}} = 0. \tag{1} \]

Subdistribution beta-Stacy process: a random subdistribution function $F$ is subdistribution beta-Stacy if:

1. $F(0, c) = 0$ with probability 1 for all $c = 1, \ldots, K$;
2. for all $c = 1, \ldots, K$ and all $t \ge 1$,

\[ \Delta F(t, c) = W_{t,c} \prod_{u=1}^{t-1} \Big( 1 - \sum_{d=1}^{K} W_{u,d} \Big), \]

where $\Delta F(t, c) = F(t, c) - F(t-1, c)$ and

\[ W_t = (W_{t,0}, \ldots, W_{t,K}) \overset{\text{ind}}{\sim} \mathrm{Dirichlet}(\alpha_{t,0}, \ldots, \alpha_{t,K}). \]
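As a rough illustration of this definition, one path of the process can be drawn over a finite horizon by sampling independent Dirichlet vectors and multiplying the event weights by the running product of the "no event yet" weights. The sketch below uses only the standard library; the function names and the list layout `alpha[t-1] = [alpha_{t,0}, ..., alpha_{t,K}]` are assumptions made for the example.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet(alphas) vector by normalising independent gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def sample_sbs_increments(alpha, rng):
    """Sample the increments of a subdistribution beta-Stacy path.

    alpha[t-1] = [alpha_{t,0}, ..., alpha_{t,K}] for t = 1..horizon (assumed layout).
    Returns delta with delta[t-1][c-1] = Delta F(t, c).
    """
    surviving = 1.0  # product of W_{u,0} over u < t
    delta = []
    for alphas_t in alpha:
        w = sample_dirichlet(alphas_t, rng)
        # Delta F(t, c) = W_{t,c} * prod_{u<t} W_{u,0}, since 1 - sum_d W_{u,d} = W_{u,0}
        delta.append([surviving * w_c for w_c in w[1:]])
        surviving *= w[0]
    return delta

# Example: K = 2 event types, horizon of 5 time points, constant parameters.
rng = random.Random(0)
path = sample_sbs_increments([[2.0, 1.0, 1.0] for _ in range(5)], rng)
```

Cumulating `path` over `t` for a fixed cause gives the sampled values of `F(t, c)`; the total mass over all causes stays below 1 on any finite horizon.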
Reinforced Urn Processes

A reinforced urn process (RUP) is a stochastic process $(X_n)_{n \ge 0}$ with countable state space $S$ and $X_0 = x_0 \in S$.

- Each state $x \in S$ is associated with a Pólya urn $U(x)$ with balls of colors represented by the elements of the finite set $E$;
- If $X_n = x$ and a ball of color $c \in E$ is extracted from $U(x)$, it is replaced in $U(x)$ together with another ball of the same color (reinforcement). Then $X_{n+1} = q(x, c) \in S$, where $q(x, c)$ is the law of motion.

Theorem (Muliere et al. [2000]). If $(X_n)_{n \ge 0}$ is recurrent, then there exists a random transition matrix $Q$ on $S$ conditionally on which $(X_n)_{n \ge 0}$ is a Markov chain with transition matrix $Q$. The rows of $Q$ are independent Dirichlet processes with base measure determined by the initial composition of the urns.

RUPs provide a predictive characterization of Pólya trees and discrete-time neutral-to-the-right processes [Muliere et al., 2000].
Construction via reinforced urn processes I

[Figure: two panels, a) Patient 1 and b) Patient 2, showing the path of the reinforced urn process over time on the states $(t, 0)$, $(t, 1)$ and $(t, 2)$, with the colors of the balls drawn along each path.]

Path shown in the figure:
$X_0 = (0, 0)$, $X_1 = (1, 0)$, $X_2 = (2, 0)$, $X_3 = (3, 2)$, $X_4 = (0, 0)$, $X_5 = (1, 0)$, $X_6 = (2, 1)$.

Resulting observations: $(T_1, C_1) = (3, 2)$, $(T_2, C_2) = (2, 1)$.
Construction via reinforced urn processes II

The parameters $\alpha_{t,0}, \alpha_{t,1}, \ldots, \alpha_{t,K}$ determine the initial composition of the urn $U((t-1, 0))$.

Lemma (Recurrency condition). $(X_n)_{n \ge 0}$ is recurrent if and only if

\[ \lim_{\tau \to +\infty} \prod_{t=1}^{\tau} \frac{\alpha_{t,0}}{\sum_{d=0}^{K} \alpha_{t,d}} = 0. \tag{2} \]

Theorem (Predictive characterization). Assume (2) holds. The sequence $((T_i, C_i))_{i \ge 1}$ generated by $(X_n)_{n \ge 0}$ is exchangeable. Its de Finetti measure is the law of a subdistribution beta-Stacy process.
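The urn construction can be simulated directly. The sketch below is one possible reading of the scheme, not the authors' code: color 0 drawn from the urn of state $(t-1, 0)$ moves the patient to $(t, 0)$, while a color $c \ge 1$ moves the patient to $(t, c)$, records $(T_i, C_i) = (t, c)$, and restarts the process at $(0, 0)$ for the next patient. The name `simulate_rup` and the callback `initial_alpha` are hypothetical.

```python
import random

def simulate_rup(initial_alpha, n_patients, rng):
    """Generate (T_i, C_i), i = 1..n_patients, from the reinforced urn scheme.

    initial_alpha(t) -> [alpha_{t,0}, ..., alpha_{t,K}], the initial composition
    of the urn attached to state (t - 1, 0) (hypothetical callback).
    """
    urns = {}   # urns[t] is the current composition of U((t - 1, 0))
    data = []
    for _ in range(n_patients):
        t = 0
        while True:
            t += 1
            urn = urns.setdefault(t, list(initial_alpha(t)))
            color = rng.choices(range(len(urn)), weights=urn)[0]
            urn[color] += 1.0          # reinforcement: one extra ball of that color
            if color != 0:             # event of type `color` at time t
                data.append((t, color))
                break                  # next patient restarts from (0, 0)
    return data

# Example: K = 2 and the same initial composition in every urn.
rng = random.Random(1)
sample = simulate_rup(lambda t: [1.0, 1.0, 1.0], n_patients=20, rng=rng)
```

With a constant initial composition the product in condition (2) decays geometrically, so each patient's path terminates with probability 1.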
Centering the process

Let $F_0$ be a fixed subdistribution function and $\omega_t > 0$ for all $t \ge 1$.

Write $F \sim \mathrm{sbs}(\omega, F_0)$ if $F$ is a subdistribution beta-Stacy process with parameters

\[ \alpha_{t,c} = \omega_t \, \Delta F_0(t, c), \qquad \alpha_{t,0} = \omega_t \Big( 1 - \sum_{d=1}^{K} F_0(t, d) \Big). \]

Moments of $F \sim \mathrm{sbs}(\omega, F_0)$:
- $E[F(t, c)] = F_0(t, c)$;
- $\mathrm{Var}(F(t, c))$ is a decreasing function of $\omega_t$;
- $\mathrm{Var}(F(t, c)) \to 0$ as $\omega_t \to +\infty$.
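The centering can be checked numerically. The sketch below assumes the parametrization $\alpha_{t,c} = \omega_t \, \Delta F_0(t, c)$ for $c \ge 1$ and $\alpha_{t,0} = \omega_t (1 - \sum_d F_0(t, d))$, builds the Dirichlet parameters from a given $F_0$, and then computes the prior mean of $F(t, c)$ using the independence of the Dirichlet vectors across time. Function names and the list layout are illustrative assumptions.

```python
def centered_alpha(omega, F0):
    """Build alpha[t-1] = [alpha_{t,0}, ..., alpha_{t,K}] from omega and F_0.

    omega[t-1] = omega_t;  F0[t-1][c-1] = F_0(t, c)  (assumed layout).
    """
    alpha = []
    prev = [0.0] * len(F0[0])   # F_0(0, c) = 0
    for w, row in zip(omega, F0):
        increments = [w * (f - p) for f, p in zip(row, prev)]
        alpha.append([w * (1.0 - sum(row))] + increments)
        prev = row
    return alpha

def prior_mean(alpha):
    """E[F(t, c)] for t = 1..horizon, from the Dirichlet means."""
    surviving = 1.0                      # E[prod_{u<t} W_{u,0}]
    cum = [0.0] * (len(alpha[0]) - 1)
    out = []
    for a in alpha:
        total = sum(a)
        cum = [m + surviving * a_c / total for m, a_c in zip(cum, a[1:])]
        surviving *= a[0] / total
        out.append(cum)
    return out

# Example with K = 2 and three time points: the prior mean reproduces F_0.
F0 = [[0.1, 0.2], [0.3, 0.3], [0.4, 0.45]]
mean = prior_mean(centered_alpha([2.0, 5.0, 0.5], F0))
```

The values of `omega` do not affect `mean`; they only control how tightly the process concentrates around `F0`.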
Relation with other prior processes

If $F \sim \mathrm{sbs}(\omega, F_0)$, then $\sum_{d=1}^{K} F(t, d)$ is a random distribution function distributed according to the discrete-time beta-Stacy process of Walker and Muliere [1997].

The cumulative hazards $A_c(t)$, with increments $\Delta A_c(t) = \Delta F(t, c) / \big(1 - \sum_{d=1}^{K} F(t-1, d)\big)$, are independent discrete-time beta processes [Hjort, 1990].

The subdistribution beta-Stacy process is therefore also related to the beta-Dirichlet process of Kim et al. [2012], a generalization of Hjort's beta process.
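A short derivation may help to see where these relations come from. It is a sketch that uses only the definition of the process in terms of the independent vectors $W_t \sim \mathrm{Dirichlet}(\alpha_{t,0}, \ldots, \alpha_{t,K})$, the identity $1 - \sum_{d \ge 1} W_{u,d} = W_{u,0}$, and the aggregation property of the Dirichlet distribution; the symbol $V_t$ is introduced here for illustration only.

```latex
\begin{align*}
\sum_{d=1}^{K} \Delta F(t,d)
  &= (1 - W_{t,0}) \prod_{u=1}^{t-1} W_{u,0}, \\
1 - \sum_{d=1}^{K} F(t,d)
  &= \prod_{u=1}^{t} W_{u,0}, \\
V_t := 1 - W_{t,0}
  &\sim \mathrm{Beta}\Big( \sum_{d=1}^{K} \alpha_{t,d},\ \alpha_{t,0} \Big)
  \quad \text{independently over } t, \\
\Delta A_c(t)
  = \frac{\Delta F(t,c)}{1 - \sum_{d=1}^{K} F(t-1,d)}
  &= W_{t,c}.
\end{align*}
```

The first and third lines show that the increments of $\sum_d F(t, d)$ have the stick-breaking form $V_t \prod_{u<t} (1 - V_u)$ with independent beta weights, which is the structure of a discrete-time beta-Stacy process. The last line shows that the cause-specific hazard increments at time $t$ are the components of $W_t$ themselves.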
Posterior computations with censored data

Right-censored data: $T_i^* = \min(T_i, R_i)$, $C_i^* = C_i \, I\{T_i < R_i\}$, where $R_i$ is a random censoring time, $i = 1, \ldots, n$.

Censoring is ignorable if
i) $R_1, \ldots, R_n$ are independent with common distribution function $H$;
ii) conditional on $F$ and $H$, $(T_1, C_1), \ldots, (T_n, C_n)$ and $R_1, \ldots, R_n$ are independent [Heitjan and Rubin, 1991].

Theorem
1. If censoring is ignorable and $F \sim \mathrm{sbs}(\omega, F_0)$ a priori, then conditionally on $(T_1^*, C_1^*), \ldots, (T_n^*, C_n^*)$ the posterior distribution of $F$ is $\mathrm{sbs}(\omega^*, F^*)$, with updated parameters $\omega^*$ and $F^*$.
2. $F^*(t, c) \to \hat{F}(t, c)$ as $\max_t(\omega_t) \to 0$, where $\hat{F}$ is the classical Kalbfleisch-Prentice estimator of $F$.
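The slide does not give the updated parameters explicitly. The sketch below shows one plausible update, assuming ordinary Dirichlet-multinomial conjugacy at each time point: an observed event of type $c$ at time $t$ adds one to $\alpha_{t,c}$, and every time point a subject is known to have survived adds one to the corresponding $\alpha_{u,0}$. A subject censored at time $r$ is treated as known to satisfy $T \ge r$, so it contributes survival only for $u < r$. The function name `posterior_alpha` and the coding of censoring as cause `0` are assumptions of this example.

```python
def posterior_alpha(alpha, times, causes):
    """Update Dirichlet parameters with possibly right-censored data.

    alpha[t-1] = [alpha_{t,0}, ..., alpha_{t,K}] for t = 1..horizon (assumed layout)
    times[i]   : observed time T*_i
    causes[i]  : observed C*_i in 1..K, or 0 if censored (assumed coding)
    """
    post = [list(a) for a in alpha]   # copy so the prior is left untouched
    horizon = len(post)
    for t_obs, c_obs in zip(times, causes):
        # The subject is known to be event-free at times 1..t_obs - 1.
        for u in range(1, min(t_obs - 1, horizon) + 1):
            post[u - 1][0] += 1.0
        # An uncensored subject also contributes its event at time t_obs.
        if c_obs != 0 and t_obs <= horizon:
            post[t_obs - 1][c_obs] += 1.0
    return post

# Example: three subjects, the second one censored at time 3.
prior = [[1.0, 1.0, 1.0] for _ in range(3)]
post = posterior_alpha(prior, times=[2, 3, 1], causes=[1, 0, 2])
```

Under this assumed update, the posterior mean increment at $(t, c)$ is the updated $\alpha_{t,c}$ divided by the total of the updated parameters at $t$, multiplied by the corresponding survival ratios at earlier times; letting the prior parameters shrink to zero leaves only the observed counts, which is the mechanism behind the limit in part 2 of the theorem.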
Semiparametric regression for competing risks

A subdistribution beta-Stacy mixture regression model:

\begin{align*}
(T_i, C_i) \mid F_i &\overset{\text{ind}}{\sim} F_i \quad (i = 1, \ldots, n) \\
F_i \mid \theta &\overset{\text{ind}}{\sim} \mathrm{sbs}\big(\omega(\theta, x_i), F_0(\cdot \mid \theta, x_i)\big) \\
F_0(t, c \mid \theta, x_i) &= F^{(1)}(c \mid \theta) \, F^{(2)}(t \mid c, \theta, x_i) \\
\omega_t(\theta, x_i) &= \frac{1}{\sum_{d=1}^{K} F_0(t, d \mid \theta, x_i)} \\
\theta &\sim \pi(\theta)
\end{align*}

We choose:
- $F^{(1)}$ = multinomial logistic model;
- $F^{(2)}$ = Weibull regression model;
- regression coefficients are assigned diffuse normal priors;
- Weibull scale parameters are assigned non-informative $\mathrm{Gamma}(\epsilon, \epsilon)$ distributions.

Inference can be carried out via a Metropolis-Hastings algorithm after marginalizing the $F_i$ out of the likelihood.
Analysis of Amsterdam Cohort Studies data I

Objective: assessing the long-term prognosis of HIV-infected men with respect to the risk of non-SI AIDS onset or SI onset as a function of CCR5 genotype: WW (wild-type allele on both chromosomes) or WM (mutant allele on one chromosome).

A total of 324 male patients were followed up from the date of HIV infection to the earliest among the dates of AIDS onset, SI phenotype onset, or right censoring, i.e. death, study drop-out, or end of the study period.

A total of 65 (20.1%) patients had a WM genotype, while mean age at HIV infection was about 34.6 years (s.d., 7.2 years).

Overall, the 324 patients accumulated 2262.2 person-years of follow-up (minimum to maximum follow-up: 0.1 years to 13.9 years), generating 117 cases of AIDS onset and 17 cases of SI onset.
Analysis of Amsterdam Cohort Studies data II

[Figure: four panels showing the cumulative probability of each event against years since HIV infection, with separate curves for the WW and WM genotypes. Top row: AIDS and SI, subdistribution beta-Stacy model. Bottom row: AIDS and SI, classical estimates.]
Future developments

- Other reinforced urn schemes could be contemplated. For example, each extracted ball may be reinforced by a general amount $m > 0$ of new similar balls, or $m$ may be a random variable depending on the color of the extracted ball, as in Muliere et al. [2006].
- A continuous-time generalization of the subdistribution beta-Stacy process could be considered. We are currently developing a characterization of such a process from a predictive perspective by means of the continuous-time urn models of Muliere et al. [2003] and Bulla and Muliere [2007].
- A generalization of the reinforced urn process considered in this work could be attempted to characterize a process prior on the space of transition kernels of a Markovian multistate process. Such a process could be useful for the predictive Bayesian nonparametric analysis of event-history data [Aalen et al., 2008].
References I

Odd Aalen, Ornulf Borgan, and Hakon Gjessing. Survival and event history analysis: a process point of view. Springer Science & Business Media, New York, 2008.

Per Kragh Andersen, Ornulf Borgan, Richard D Gill, and Niels Keiding. Statistical models based on counting processes. Springer Science & Business Media, New York, 2012.

Paolo Bulla and Pietro Muliere. Bayesian nonparametric estimation for reinforced Markov renewal processes. Statistical Inference for Stochastic Processes, 10(3):283-303, 2007.

Minwoo Chae, Rafael Weißbach, Kwang Hyun Cho, and Yongdai Kim. A mixture of beta-Dirichlet processes prior for Bayesian analysis of event history data. Journal of the Korean Statistical Society, 42(3):313-321, 2013.

Martin J Crowder. Multivariate survival analysis and competing risks. CRC Press, Boca Raton, Florida, 2012.

Pierpaolo De Blasi and Nils Lid Hjort. Bayesian survival analysis in proportional hazard models with logistic relative risk. Scandinavian Journal of Statistics, 34(1):229-257, 2007.

Miaomiao Ge and Ming-Hui Chen. Bayesian inference of the fully specified subdistribution model for survival data with competing risks. Lifetime Data Analysis, 18(3):339-363, 2012.

Ronald B Geskus, Frank A Miedema, Jaap Goudsmit, Peter Reiss, Hanneke Schuitemaker, and Roel A Coutinho. Prediction of residual time to AIDS and death based on markers and cofactors. Journal of Acquired Immune Deficiency Syndromes, 32(5):514-521, 2003.

Daniel F Heitjan and Donald B Rubin. Ignorability and coarse data. The Annals of Statistics, 19:2244-2253, 1991.

Nils Lid Hjort. Nonparametric Bayes estimators based on beta processes in models for life history data. The Annals of Statistics, 18:1259-1294, 1990.

John D Kalbfleisch and Ross L Prentice. The statistical analysis of failure time data. John Wiley & Sons, Hoboken, New Jersey, 2nd edition, 2002.

Yongdai Kim, Lancelot James, and Rafael Weissbach. Bayesian analysis of multistate event history data: beta-Dirichlet process prior. Biometrika, 99(1):127-140, 2012.
References II

Jerald F Lawless. Statistical models and methods for lifetime data. John Wiley & Sons, Hoboken, New Jersey, 2011.

P Muliere, P Secchi, and SG Walker. Urn schemes and reinforced random walks. Stochastic Processes and their Applications, 88(1):59-78, 2000.

Pietro Muliere, Piercesare Secchi, and Stephen G Walker. Reinforced random processes in continuous time. Stochastic Processes and their Applications, 104(1):117-130, 2003.

Pietro Muliere, Anna Maria Paganoni, and Piercesare Secchi. A randomly reinforced urn. Journal of Statistical Planning and Inference, 136(6):1853-1874, 2006.

Melania Pintilie. Competing risks: a practical perspective. John Wiley & Sons, Chichester, England, 2006.

Stephen Walker and Pietro Muliere. Beta-Stacy processes and a generalization of the Pólya-urn scheme. The Annals of Statistics, 25:1762-1780, 1997.