Spatial Survival Analysis via Copulas

Size: px

Start display at page:

Download "Spatial Survival Analysis via Copulas"

Noel Doyle
5 years ago
Views:

1 1 / 38 Spatial Survival Analysis via Copulas Tim Hanson Department of Statistics University of South Carolina, U.S.A. International Conference on Survival Analysis in Memory of John P. Klein Medical College of Wisconsin June 2014

2 Outline 2 / 38 1 I: Prostate cancer, areally-referenced semiparametric survival 2

3 3 / 38 SCCCR data set on prostate cancer survival Large dataset on prostate cancer survival that does not follow proportional hazards. n = patients from South Carolina Central Cancer Registry (SCCCR) for the period ; each recorded with county, race, marital status, grade of tumor, and SEER summary stage; 72.3% are censored. Need to allow for non-proportional hazards and accommodate correlation of survival times within county. Joint work with Li Li and Jiajia Zhang.

4 4 / 38 Extended hazards model Etezadi-Amoli and Ciampi (1987) propose EH model Say x = (x 1, x 2 ), then EH is λ x (t) = λ 0 (te x β )e x γ. λ x (t) = λ 0 (te β 1x 1 +β 2 x 2 )e γ 1x 1 +γ 2 x 2. γ 1 = β 1 x 1 has AFT interpretation; β 1 = 0 x 1 has PH interpretation; γ 1 = 0 x 1 has AH interpretation. Likelihood for observing {(t i, x i, δ i )} n i=1 { n { δi } ti L(β, γ, λ 0 ( )) = e γ x i λ 0 (e β x i t i )} exp e γ x i λ 0 (te β x i )dt 0 i=1

5 5 / 38 Baseline hazard λ 0 (t) Want to shrink λ 0 (t) toward parametric target λ θ (t): λ 0 (t) = J b j B kj (t) j=1 where B k1 ( ),..., B k,j ( ) are kth order B-spline basis function over knots (s 1,..., s J+k ). Let s j = j+k j=k+1 s k/(k 1) and b j = λ θ ( s j ). Schoenberg s approximation theorem (Marsden 1972) says max 0 t sj+1 λ 0 (t) λ θ (t) ɛ(λ θ, k, J).

6 6 / 38 Posterior updating of model Step 1: Update the blocks {β, γ}, θ, c separately using adaptive Metropolis-Hastings algorithms (Haario, Saksman, and Tamminen 2005). Step 2: Sample g-prior parameters g 1 1 from Gamma(a g + 1, b g + β x xβ/2n + b g ) and g 1 2 from Gamma(a g + 1, b g + γ x xβ/2n + b g ). Step 3: Sample the latent random vectors u i from ( ) b 1 B 1 (e β x i t u i β, γ, b, θ, u i Mult i ) n j=1 b j B j (e β x i t i ),..., b J B J (e β x i t i ) n j=1 b j Bj (e β x i t i ) Step 4: B-spline coefficients are updated by b j β, γ, b j, θ, {u i } Gamma n ti u ij + cλ θ ( s j ), c + e γ x i B j (te β x i )dt i S i=1 0

7 Works great on simulated data Survival function Density function S 0(t) True Mean estimates 2.5%, 97.5% quantiles f 0(t) True Mean estimates 2.5%, 97.5% quantiles t t Hazard function Hazard function λ 0(t) True Mean estimates 2.5%, 97.5% quantiles λ 0(t) True Mean estimates Mean λ θ t t 7 / 38

8 8 / 38 Spatial dependence via frailties impractical PH with frailties: where g ci EH with frailties: λ(t i x) = λ 0 (t i )e γ x i +g ci, are county-level frailties, c i is county subject i in. λ(t i x) = λ 0 {t i e β x i +b ci }e γ x i +g ci, where, for our data, b 1,..., b 46 and g 1,..., g 46 are county-level frailties. Possible but impractical, and hard to interpret.

9 9 / 38 Spatial dependence via copula works great Define Y i = Φ 1 { 1 e Λ i (T i ) } where Φ( ) is the standard normal cumulative distribution function. Under Li and Lin (2006) Y N(0, Γ). Likelihood ffrom data {(t i, x i, δ i )} n i=1 is [ ] [ ] f i (t i ) f i (z i ) L s (β, γ, b, θ, Γ) = φ(y i ) φ(y i ) I(z i > t i ) φ(y; 0, Γ) dz i i S c i S c How to define Γ? i S We consider county-level lattice data; popular correlation model is intrinsic conditional autoregressive (ICAR) prior.

10 10 / 38 ICAR definition α = (α 1,, α m ) is vector of correlated effects on a lattice. ICAR prior on α is p(α) exp{ ϕα (D W)α/2} Implication of ICAR ( prior: n ) α j α j, ϕ N j=1 ω ijα j /ω j+, 1/(ϕω j+ ). Random effects approach Ỹ = (Ỹ1,..., Ỹm) = (Ỹ11,..., Ỹ1n 1,..., Ỹm1,..., Ỹmn m ). Ỹ ij = α i + ɛ ij, α N m (0, Ω), ɛ N n (0, Iσ 2 ). Resulting correlation matrix Γ = corr(ỹ) involves one unknown parameter ϕ.

11 11 / 38 Posterior updating-latent survival approach MCMC sampling steps 1 to 3 are the same as before. Step 4: Propose bj new from ( n ) ti Gamma u ij + cλ θ ( s j ), c + e γ x i B j (te β x i )dt i S c i=1 0 and accept it with probability { } n i=1 min 1, φ(y i)e ynew Γ 1 y new /2 n new i=1 φ(yi )e y Γ 1 y/2 where y new is new transformed failure time vector corresponding to bj new. Step 5: Sample Y i N(y i y i, Γ)I(y i > Φ 1 {F i (t i ))} then set z i = F 1 i {Φ(y i )}. Step 6: Update ϕ using adaptive Metropolis-Hastings.

12 12 / 38 Efficient evaluation of y Γ 1 y Γ is a n n matrix Elements of Γ 1 can be easily computed using SVD, Γ 1 = A 1 U 1 ( (K + σ 2 I m ) 1 σ 2 I m ) U 1 A 1 + σ 2 A 2 where A is a diagonal matrix, U 1 = (u 1,..., u m ), u i is a vector of length n with ones corresponding to county i and zero elsewhere. y Γ 1 y = x ( (K + σ 2 I m ) 1 σ 2 I m ) x + σ 2 y A 2 y

13 13 / 38 Savage-Dickey ratio for global and per-variable tests Example of global test of PH vs. EH BF 12 = π(β = 0 D, EH) π(β = 0 EH). Example of per-variable of PH for x j vs. EH BF 12 = π(β j = 0 D, EH) π(β j = 0 EH).

14 14 / 38 SCCCR data SCCCR prostate cancer data for the period Baseline covariates are county of residence, age, race, marital status, grade of tumor differentiation, and SEER summary stage. n = patients in the dataset after excluding subjects with missing information. 72.3% of the survival times are right-censored. Goal: assess racial disparity in prostate cancer survival, adjusting for the remaining risk factors and accounting for the county the subject lives in.

15 15 / 38 SCCCR data Table: Summary characteristics of prostate cancer patients in SC from Covariate n Sample percentage Race Black White Marital status Non-married Married Grade well or moderately differentiated poorly differentiated or undifferentiated SEER summary stage Localized or regional Distant

16 16 / 38 Non-spatial EH and reduced models Table: Summary of fitting the extended hazard model EH, the reduced model, AFT, and PH; indicates LPML and DIC Covar EH Reduced AFT PH PH+additive age β = γ β = 0 β = 0 Age β (0.48,0.52) 0.48(0.46,0.50) 0.48(0.45,0.51) γ (0.42,0.49) γ 1 = β (0.62,0.68) Race β (0.15,0.21) 0.20(0.16,0.21) 0.18(0.15,0.22) γ (0.12,0.24) γ 2 = β (0.21,0.32) 0.26(0.20,0.31) Marital β (-0.11,-0.02) -0.05(-0.09,-0.00) 0.26(0.21,0.30) status γ (0.29,0.40) 0.33(0.28,0.40) 0.33(0.27,0.39) 0.31(0.26,0.37) Grade β (-0.02,0.08) β 4 = (0.22,0.32) γ (0.29,0.41) 0.37(0.31,0.43) 0.38(0.32,0.44) 0.37(0.33,0.43) SEER β (2.80,3.53) 3.27(2.79,3.57) 1.50(1.41,1.59) stage γ (0.83,1.20) 1.00(0.82,1.19) 1.56(1.47,1.64) 1.57(1.19,1.65) LPML DIC

17 17 / 38 Non-spatial EH and reduced models Table: Bayes factors for comparing EH to PH, AFT, and AH with and without spatial correlation. EH Spatial+EH Covariate PH AFT AH PH AFT AH Age > > 1000 > > 1000 Race > > 1000 > 1000 < 0.01 > 1000 Marital status 1.79 > 1000 > > 1000 > 1000 Grade 0.14 > 1000 > > 1000 > 1000 SEER stage > 1000 > 1000 > 1000 > 1000 > 1000 > 1000

18 18 / 38 Spatial EH and reduced models Table: Summary of spatial models; indicates LPML and DIC Covariates Marginal EH Marginal reduced PH+ICAR+additive age β = 0 Age β (0.47,0.52) 0.47(0.46,0.49) γ (0.43,0.49) γ 1 = β 1 Race β (0.15,0.21) 0.20(0.17,0.22) γ (0.11,0.23) γ 2 = β (0.18,0.30) Marital status β (-0.10,-0.02) -0.02(-0.05,-0.00) γ (0.28,0.41) 0.33(0.27,0.39) 0.32(0.25,0.38) Grade β (-0.01,0.07) β 4 = 0 γ (0.30,0.42) 0.38(0.32,0.43) 0.37(0.32,0.44) SEER stage β (2.86,3.34) 2.77(2.72,2.82) γ (0.94,1.26) 1.21(1.01,1.33) 1.55(1.46,1.64) ϕ 50.1(19.9,113.7) 54.6(22.7,120.8) 33.08(9.2,100.1) LPML DIC

19 < to to to to 0.33 >0.33 < to to to to 0.06 > / 38 Spatial EH and reduced models Mortality rate PH ICAR frailty Marginal reduced model random effects < to to to to 0.04 >0.04 (a) (b) (c) Figure: Map of (a) Mortality rate, (b) ICAR frailties in the PH model and (c) random effects in the marginal reduced model for SC counties.

20 20 / 38 Spatial EH and reduced models Hazard estimates Survival estimates Hazard PH EH Reduced AFT Survival probability PH EH Reduced AFT Time in Years Time in Years Figure: Baseline hazard (left) and survival probabilities (right) estimates.

21 21 / 38 Spatial EH and reduced models Hazard estimates Survival estimates Hazard Posterior mean for blacks Posterior mean for whites Survival probability Posterior mean for blacks Posterior mean for whites 95% CI Time in Years Time in Years Figure: Hazard and survival for black patients (solid line) and white patients; baseline covariates.

22 22 / 38 Interpretation for race effect Based on the fitted results of the reduced models with and without spatial dependence, white South Carolina subjects diagnosed with prostrate cancer in live 22% longer (e ) than black patients (95% CI is 18% to 25%), fixing age, stage, and SEER stage. Cox said...the physical or substantive basis for...proportional hazards models...is one of its weaknesses... and goes on to suggest that...accelerated failure time models are in many ways more appealing because of their quite direct physical interpretation. The SCCCR analysis showed that the main covariate of interest, race, is best modeled as an AFT effect. Survival probabilities for black patients are significantly lower than those for white patients when other factors are fixed at the same levels.

23 23 / 38 More interpretation Decreasing age by one year increases survival time by 5.4%. Hazard of dying increases 46% for poorly or undifferentiated grades vs. well or moderately differentiated, holding age, race, and SEER stage constant. SEER stage has general EH effects, e (AH) and e (PH). Those with distant stage are at least three times worse in one-sixteenth of the time as those with localized or regional. Marital status essentially has PH interpretation; single (including widowed or separated) subjects are e times more likely to die at any instant than married.

24 24 / 38 About this work Analysis of frog population extinction Joint work with Haiming Zhou and Roland Knapp (Sierra Nevada Aquatic Research Laboratory). Frogs and other amphibians have been dying off in large numbers since the 1980s because of a deadly fungus called Batrachochytrium dendrobatidis, also known as Bd. Dr. Knapp has been studying the amphibian declines for the past decade at Sierra Nevada Aquatic Research Laboratory; he has hiked thousands of miles and surveyed hundreds of frog populations in Sequoia-Kings Canyon National Park collecting the data by hand. Again, proportional hazards is grossly violated for these data.

25 / 38 The Frog Data (2000-2011) Analysis of frog population extinction Contains 309 frog populations. Each was followed up until infection or being censored (10% censoring).

25 25 / 38 The Frog Data ( ) Analysis of frog population extinction Contains 309 frog populations. Each was followed up until infection or being censored (10% censoring). The response is the Bd infection time (i.e. Bd arrival year baseline year). Main covariates: bdwater: whether or not Bd has been found in the watershed. bddistance: straight-line distance to the nearest Bd location. Populations near each other tend to become infected at about the same time.

26 26 / 38 Objectives Analysis of frog population extinction T i = T (s i ) is time to local extinction for pop n located at s i. x i = x(s i ) : a p 1 vector of covariates. Goal 1: describe the association between x(s) and T (s) while allowing for spatial dependence. Goal 2: predict T (s 0 ) given x(s 0 ) at any new location s 0.

27 27 / 38 Analysis of frog population extinction LDDPM model and Spatial Extension LDDPM (De Iorio et al., 2009; Jara et al., 2010): Y i = log T i given x i independently follow mixture model ( y x ) F xi (y G) = Φ i β dg{β, σ 2 }, σ where G follows a Dirichlet Process (DP) prior, i.e. G DP(α, G 0 ). the joint distribution of Y = (Y 1,..., Y n ) by F(t 1,..., t n G) = C(F x1 (t 1 G),..., F xn (t n G); θ), where C(u 1,..., u n ; θ) is a spatial copula with parameter θ, which is used for capturing spatial dependence. We use point-referenced Gaussian copula of Li and Lin (2006).

28 28 / 38 Specification of F xi (y G) Analysis of frog population extinction Assume Y i = log T i given x i follows a mixture model ( y x ) F xi (y G) = Φ i β dg{β, σ 2 }, σ where G DP(α, G 0 ) with concentration parameter α. G in truncated stick-breaking form (Sethuraman, 1994) as N G = w k δ (βk,σk 2), k=1 w k = V k (1 V j ), j<k where V k α iid Beta(1, α), (β k, σ 2 k ) iid G 0.

29 29 / 38 The Likelihood Analysis of frog population extinction Observed data {(y o i, δ i, x i, s i ) : i = 1,..., n}. Denote by y i the latent true log event-time corresponding to y o i. Then δ i = I(y i = y o i ), where δ i = 0 implies y i > y o i. The augmented likelihood for parameters {y 1,..., y n, G, θ} is n { } L = f xi (y i G) δ i I(y i = yi o ) + (1 δ i )I(y i > yi o ) i=1 C 1/2 exp { 1 } 2 z (C 1 I n )z where f xi (y i G) is the density w.r.t. F xi (y i G), and z = (z 1,..., z n ) with z i = Φ 1 {F xi (y i G)}.

30 30 / 38 MCMC Overview Analysis of frog population extinction All parameters involved in G are updated based on a modification of the blocked Gibbs sampler (Ishwaran and James, 2001): M-H samplers with independent proposals. The latent y i s are updated via independent M-H sampler. Delayed rejection (Tierney and Mira, 1999) used for several parameters; helps sampler not get stuck. The correlation parameters θ are updated using adaptive M-H (Haario et al., 2001). For large n, the inversion of the n n matrix C substantially sped up using a full scale approximation (FSA) (Sang and Huang, 2012).

31 Analysis of frog population extinction Simulated Data Assume the log T given x follows a mixture of normals f (y x) = 0.4N( x, 1 2 ) + 0.6N(2.5 x, ). Specify covariance matrix C with θ 1 = 0.98 and θ 2 = 0.1. Around 10% of survival times are right-censored Figure: 330 randomly selected sites: the n = 300 subjects for estimation; the 30 subjects for prediction. 31 / 38

32 32 / 38 Analysis of frog population extinction Simulated Data: Inference on Spatial Correlation Table: Posterior summary statistics for the spatial correlation parameters Par. True Mean Median Std. dev. 95% HPD Interval θ (0.933, 1.000) θ (0.056, 0.138)

33 33 / 38 Simulated Data: Estimated Density Analysis of frog population extinction Truth 0.6 De Iorio Ours Figure: Estimated density of log(t ) when x = 1 (left) and x = 1 (right). Recall: f (y x) = 0.4N( x, 1 2 ) + 0.6N(2.5 x, ).

34 34 / 38 Simulated Data: Prediction Analysis of frog population extinction Table: The mean squared prediction error (MSPE) for the held-out 30 locations under two different methods Method MSPE De Iorio Ours 0.269

35 35 / 38 Analysis of frog population extinction Frog Data: Inference on Spatial Correlation FYI: the spatial Gaussian copula involves the n n correlation matrix C(s i, s j ; θ) = θ 1 ρ(s i, s j ) + (1 θ 1 )I(s i = s j ), where θ 1 [0, 1] and ρ(s i, s j ) = exp { θ 2 s i s j }. Posterior mean ˆθ 1 = Posterior mean ˆθ 2 = , indicating the correlation decays by 1 exp( ) = 8% for every 1-km increase in distance and 1 exp( ) = 58% for every 10-km increase in distance. Table: Posterior summary statistics for the spatial correlation parameters Par. Mean Median Std. dev. 95% HPD Interval θ (0.9879, ) θ (0.0493, )

36 36 / 38 Frog Data: Estimated Hazard Analysis of frog population extinction bdwater=0 0.9 bdwater= Figure: Estimated hazard function with 90% point-wise CI for bdwater = 1 vs bdwater = 0 when the bddistance is fixed at its population mean.

Analysis of frog population extinction 0.5 0 0.5 1 1.

37 37 / 38 Frog Data: Spatial Prediction Spatial map for the transformed process z(s) = Φ 1 { F x(s) (log T (s) G) }. Analysis of frog population extinction Figure: Predictive spatial map based on new 2000 random locations in D.

38 38 / 38 Remarks Analysis of frog population extinction Proposed Bayesian spatial copula approaches to estimate survival curves semiparametrically (EH model) and nonparametrically (LDDPM) while allowing for spatial dependence, leading to high predictive accuracy. Adopted the FSA approach to compute the inversion of n-dimensional spatial covariance matrices for georeferenced data. Future research involves use of low-rank approaches to modeling data with lots of spatial locations, e.g. random projections (Banerjee et al., 2012) or predictive processes (Banerjee et al., 2008). Two different Banerjees! Extension to simpler semiparametric models such as proportional odds. Thanks to my colleagues Li Li, Haiming Zhou, Roland Knapp, and Jiajia Zhang. Thanks for the invitation!

Recent Advances in Bayesian Spatial Survival Modeling

1 / 47 Recent Advances in Bayesian Spatial Survival Modeling Tim Hanson Department of Statistics University of South Carolina, U.S.A. University of Alabama Department of Mathematics April 10, 2015 2 /