Bayesian semiparametric analysis of short- and long-term hazard ratios with covariates

Department of Statistics, ITAM, Mexico
9th Conference on Bayesian Nonparametrics, Amsterdam, June 10-14, 2013

Contents
- Motivating example
- Model
- Extension to covariates
- Bayesian inference
- Data analysis

Study
- Survival study: ovarian cancer patients who have undergone treatment with erythropoietin (EPO) stimulating agents (ESA)
- ESA are used to relieve chemotherapy-induced anemia in patients with cancer
- EPO is a glycoprotein hormone that controls red blood cell production
- In contrast, other studies have shown that ESA treatments can compromise survival for some types of cancer
- It was believed that the effects of ESA were induced via the canonical EPO receptor (EpoR)

Study
- n = 174 patients
- Covariates:
  - $X_1$ = ESA treatment indicator (0 or 1); 51% received treatment
  - $X_2$ = EphB4 level of expression (low or high); 39% high expression
  - $X_3$ = EpoR level of expression (low or high)
  - $X_4$ = Cytoreduction (suboptimal or optimal)
  - $X_5$ = Disease stage (I-V)
  - $X_6$ = Age (20-92); average of 58 years

Study
Figure: Kaplan-Meier (KM) survival curves for the ESA treatment indicator (horizontal axis: time).

Definitions
- In survival analysis, let
  - $h_T(t)$ be the hazard rate for the treatment group
  - $h_C(t)$ be the hazard rate for the control group
- We define the hazard ratio as $HR(t) = h_T(t) / h_C(t)$
- The three most common behaviours of HR are
  - $HR(t) = c$: proportional hazards
  - $\lim_{t \to \infty} HR(t) = 1$: proportional odds model
  - $HR(t)$ moves across 1 over time: crossing hazards

Semiparametric model
- Yang and Prentice (2005) proposed a model that accounts for the previous three behaviours of an HR
- Their model is
  $$S_C(t) = \{1 + R(t)\}^{-1} \quad \text{and} \quad S_T(t) = \left\{1 + \frac{\lambda}{\theta} R(t)\right\}^{-\theta}$$
- This implies that
  $$HR(t) = \frac{\lambda\theta}{\lambda + (\theta - \lambda)/\{1 + R(t)\}}$$
  where $\lambda, \theta > 0$ and $R(t)$ is monotone nondecreasing
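As a numerical check of the hazard ratio expression, the sketch below compares the closed form with a finite-difference ratio of hazards computed from the two survival functions. The choice $R(t) = t$ and all function names are assumptions made only for this illustration; the model itself only requires $R$ to be monotone nondecreasing.

```python
import math

def R(t):
    # Illustrative odds function (assumption for this example): R(t) = t
    return t

def surv_control(t):
    # S_C(t) = {1 + R(t)}^{-1}
    return 1.0 / (1.0 + R(t))

def surv_treatment(t, lam, theta):
    # S_T(t) = {1 + (lambda/theta) R(t)}^{-theta}
    return (1.0 + (lam / theta) * R(t)) ** (-theta)

def hazard_ratio(t, lam, theta):
    # Closed form: lambda*theta / [lambda + (theta - lambda)/{1 + R(t)}]
    return lam * theta / (lam + (theta - lam) / (1.0 + R(t)))

def numeric_hazard(surv, t, eps=1e-6):
    # h(t) = -d/dt log S(t), approximated by a central difference
    return -(math.log(surv(t + eps)) - math.log(surv(t - eps))) / (2.0 * eps)

def numeric_hazard_ratio(t, lam, theta):
    h_t = numeric_hazard(lambda u: surv_treatment(u, lam, theta), t)
    h_c = numeric_hazard(surv_control, t)
    return h_t / h_c
```

Evaluating `hazard_ratio` at `t = 0` returns `lam`, and for very large `t` it approaches `theta`, in line with the short- and long-term interpretation of the two parameters.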

Semiparametric model
- The model's name is justified by $\lim_{t \to 0} HR(t) = \lambda$ and $\lim_{t \to \infty} HR(t) = \theta$
- If $\lambda = \theta$: proportional hazards
- If $\theta = 1$: proportional odds model
- If $(\lambda - 1)(\theta - 1) < 0$: hazard functions cross
- The model is semiparametric because it includes the parameters $(\lambda, \theta)$ and the function $R(t) = \{1 - S_C(t)\}/S_C(t)$, which is the odds function of the control group
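The crossing condition can be made concrete. Writing the hazard ratio as $HR = \lambda\theta(1 + R)/(\theta + \lambda R)$ and solving $HR = 1$ gives $R^* = \theta(1 - \lambda)/\{\lambda(\theta - 1)\}$, which is positive exactly when $(\lambda - 1)(\theta - 1) < 0$. The following is a minimal sketch of that calculation; the function names are hypothetical.

```python
def hazard_ratio_from_odds(r, lam, theta):
    # HR written as a function of the control odds r = R(t)
    return lam * theta * (1.0 + r) / (theta + lam * r)

def crossing_odds(lam, theta):
    """Value of R(t) at which HR = 1, or None if the hazards do not cross."""
    if (lam - 1.0) * (theta - 1.0) >= 0:
        return None  # HR stays on one side of 1 (or is identically 1)
    return theta * (1.0 - lam) / (lam * (theta - 1.0))
```

For the pair $(\lambda, \theta) = (2, 0.5)$, the hazards cross when the control odds reach 0.5, that is, when $S_C(t) = 2/3$.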

Semiparametric model
Figure: $S(t)$ against $t$ for varying $(\lambda, \theta)$: (1, 1) (solid line), (2, 0.5) (dashed line), and (0.5, 3) (dotted-dashed line).

Extensions
- Nieto-Barajas (2013) extended the model to include covariates with
  $$S(t \mid x_i) = \left\{1 + \frac{\lambda_i}{\theta_i} R(t)\right\}^{-\theta_i}$$
- $\lambda_i = \exp(\delta' x_i)$ and $\theta_i = \exp(\gamma' x_i)$, where $\delta$ and $\gamma$ do not have an intercept
- Here we can have more than two groups
- $HR_i(t) = h(t \mid x_i)/h(t \mid 0)$ has the same interpretation as in the two-group model
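The covariate extension can be sketched in a few lines. The code below evaluates the survival function for a covariate vector given a value `r` of the baseline odds function $R(t)$; the function names and the example coefficients are hypothetical and chosen only to show that a zero covariate vector recovers the baseline $\{1 + R(t)\}^{-1}$.

```python
import math

def linear_predictor(coef, x):
    # Inner product coef' x (no intercept, as in the model)
    return sum(c * xi for c, xi in zip(coef, x))

def survival_given_x(r, x, delta, gamma):
    """S(t | x) = {1 + (lambda/theta) R(t)}^{-theta}, with r = R(t)."""
    lam = math.exp(linear_predictor(delta, x))    # short-term hazard ratio
    theta = math.exp(linear_predictor(gamma, x))  # long-term hazard ratio
    return (1.0 + (lam / theta) * r) ** (-theta)
```

With a single binary covariate, `x = [1]` reproduces the two-sample treatment curve with $\lambda = e^{\delta}$ and $\theta = e^{\gamma}$, and `x = [0]` reproduces the control curve.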

- Follow a Bayesian approach for inference on $\delta$, $\gamma$ and $R(t)$
- Concentrate first on $R(t)$:
  - it is a monotone nondecreasing function
  - $\lim_{t \to 0} R(t) = 0$ and $\lim_{t \to \infty} R(t) = \infty$
  - $R(t)$ behaves like a cumulative hazard function (c.h.f.)
- There is a vast list of priors for c.h.f.:
  - beta and beta-Stacy (NTR) (Hjort, 1990; Walker and Muliere, 1997)
  - mixtures of gamma and weighted gamma (Dykstra and Laud, 1981; Lo and Weng, 1989)

- Lévy-driven prior: mixture of a kernel $k(t, s)$ with respect to a Lévy process (or IAP) $r(s)$:
  $$R(t) = \int_0^\infty k(t, s)\, r(ds)$$
- $k(t, s)$ is nondecreasing and right-continuous as a function of $t$
- $\lim_{t \to 0} k(t, s) = 0$ for all $s \ge 0$
- This is a general class: if $k(t, s) = h(t, s) I(s \le t) \beta(s)$ and $r(s)$ is a gamma process, we obtain the prior of Lo and Weng (1989)
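To see how the mixture produces a valid $R(t)$, consider a purely atomic $r$ with finitely many jumps, in which case the integral reduces to a weighted sum of kernels. The sketch below is only an illustration under that assumption: the jump locations and sizes are made up, and the location exponential kernel used here is assumed to have the form $\{1 - e^{-a(t - s)}\} I(s \le t)$.

```python
import math

def loc_exp_kernel(t, s, a=1.0):
    # Assumed location exponential kernel: {1 - exp(-a (t - s))} I(s <= t)
    return 1.0 - math.exp(-a * (t - s)) if s <= t else 0.0

def mixture_R(t, locations, jumps, kernel=loc_exp_kernel):
    """R(t) = sum_j J_j k(t, s_j) for an atomic measure r with jumps J_j at s_j."""
    return sum(j * kernel(t, s) for s, j in zip(locations, jumps))

# Hypothetical jump locations and sizes, for illustration only
locations = [0.5, 1.0, 2.0]
jumps = [0.3, 0.2, 0.5]
```

Because each kernel is nondecreasing in $t$ and vanishes at $t = 0$, so does the resulting `mixture_R`. A finite truncation like this one is bounded above by the total jump mass, so it only approximates a process whose $R(t)$ diverges.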

- For the kernel we will take
  - Location Weibull kernel: $k(t, s) = \{1 - e^{-a(t - s)^b}\} I(s \le t)$, with $a, b > 0$
  - Mean Weibull kernel: $k(t, s) = 1 - e^{-\{\Gamma(1 + b^{-1})\, t/s\}^b}$, with $b > 0$
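The two kernels are Weibull distribution functions in $t$: the first is shifted to start at $s$, the second is scaled so that its mean equals $s$. The sketch below implements both and checks the mean property numerically; the function names, the integration grid, and the restriction to $s > 0$ for the mean kernel are assumptions of this example.

```python
import math

def location_weibull_kernel(t, s, a, b):
    # {1 - exp(-a (t - s)^b)} I(s <= t)
    if s > t:
        return 0.0
    return 1.0 - math.exp(-a * (t - s) ** b)

def mean_weibull_kernel(t, s, b):
    # 1 - exp(-{Gamma(1 + 1/b) t / s}^b); requires s > 0
    return 1.0 - math.exp(-((math.gamma(1.0 + 1.0 / b) * t / s) ** b))

def mean_of_mean_weibull(s, b, upper=60.0, step=0.001):
    # Mean of the distribution with CDF k(., s): integral of 1 - k(t, s) over t,
    # approximated with the midpoint rule on [0, upper]
    n = int(upper / step)
    return sum((1.0 - mean_weibull_kernel((i + 0.5) * step, s, b)) * step
               for i in range(n))
```

For example, `mean_of_mean_weibull(3.0, 2.0)` should be numerically close to 3, which is the sense in which $s$ acts as a mean rather than a location.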

Figure: Kernel comparison, $R(t)$ against $t$. Loc. exp. (dotted line), loc. Weib. (solid line), and mean Weib. (dashed line).

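- For the Lévy intensity $\nu(dv, ds) = \rho(dv \mid s)\,\alpha(ds)$ we will take
  - A non-homogeneous version of the generalized gamma (GG) process,
    $$\rho(dv \mid s) = \Gamma(1 - \epsilon)^{-1} v^{-(1 + \epsilon)} e^{-\beta(s) v}\, dv,$$
    where $\epsilon \in \{(0, 1) \cup \{-1\}\}$ and $\beta(s) = c s^d$, with $c > 0$, $d \in \{-1, 0, 1\}$, $s \ge 0$
  - Together with $\alpha(s) = \omega p_0(s)$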
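The jump-size part of the intensity is straightforward to evaluate. The sketch below codes $\beta(s) = c s^d$ and the density of $\rho(dv \mid s)$ with respect to $dv$; the function names are hypothetical, and $s > 0$ is assumed so that $d = -1$ is well defined.

```python
import math

def beta(s, c, d):
    # beta(s) = c * s^d; decreasing (d = -1), constant (d = 0) or increasing (d = 1)
    return c * s ** d

def rho_density(v, s, eps, c, d):
    """Density of rho(dv | s): Gamma(1 - eps)^{-1} v^{-(1 + eps)} exp(-beta(s) v)."""
    return v ** (-(1.0 + eps)) * math.exp(-beta(s, c, d) * v) / math.gamma(1.0 - eps)
```

With $\epsilon = -1$ the density reduces to $e^{-\beta(s) v}$, which integrates to the finite value $1/\beta(s)$ over $v > 0$. For $\epsilon \in (0, 1)$ the factor $v^{-(1 + \epsilon)}$ makes the integral diverge near $v = 0$.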

Figure: Prior $E\{S(t \mid \lambda, \theta)\}$ against $t$. Loc. Weib. $(a, b) = (1, 2)$ (top row), mean Weib. $b = 2$ (bottom row). Decreasing $\beta$ (first column), constant $\beta$ (second column), and increasing $\beta$ (third column). $(\lambda, \theta)$: (1, 1) (solid line), (2, 0.5) (dashed line), and (0.5, 3) (dotted line).

- For the vector parameters $\delta$ and $\gamma$ we will take
  - $\delta_s \sim N(0, \sigma_\delta^2)$, for $s = 1, \ldots, p$
  - $\gamma_s \sim N(0, \sigma_\gamma^2)$, for $s = 1, \ldots, p$
  - $\delta_s$ and $\gamma_s$ independent a priori

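- Posterior characterization: via a Gibbs sampler with $[r \mid t, \delta, \gamma]$, $[\delta \mid t, r, \gamma]$, $[\gamma \mid t, r, \delta]$
- The tricky part is to find $[r \mid t, \delta, \gamma]$. This will be done by computing
  $$E\left\{e^{-\int \phi(s)\, r(ds)} \mid A\right\} = \frac{E\left\{e^{-\int \phi(s)\, r(ds)} P(A \mid Z)\right\}}{E\{P(A \mid Z)\}}$$
  where $A = (A_1, \ldots, A_n)$ is the data, with $A_i$ either an exact or a right-censored observation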

- To find $[r \mid t, \delta, \gamma]$ we require latent variables:
  - a pair $(u_i, w_i)$ for each exact observation, and
  - a single $u_i$ for each right-censored observation
- The $u_i$'s are dependent and the $w_i$'s are independent
- $[r \mid t, w] = \int [r \mid t, w, u]\,[u \mid t, w]\, du$, with $[r \mid t, w, u]$ an updated IAP with fixed jump locations at the $w_i$'s
- To simplify computations on $[\delta \mid t, r, \gamma]$ and $[\gamma \mid t, r, \delta]$ we introduce another latent variable $v_i$ for each observation
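The overall structure of the sampler is a cycle over the three blocks $r$, $\delta$ and $\gamma$. The skeleton below shows only that cycling; the full conditional draws are passed in as user-supplied functions, since the slides do not give them in closed form. Everything here (function name, state layout, the toy updates in the usage note) is a hypothetical placeholder rather than the actual sampler.

```python
def gibbs_sampler(state, updates, n_iter):
    """Cycle through block updates.

    state   : dict holding the current value of every block, e.g. r, delta, gamma
    updates : list of (block_name, draw) pairs; draw(state) returns a new value
              for that block given the current values of all the others
    """
    draws = []
    for _ in range(n_iter):
        for name, draw in updates:
            # Replace one block at a time, conditioning on the latest values
            state = {**state, name: draw(state)}
        draws.append(state)
    return draws
```

A usage sketch with deterministic stand-ins for the conditional draws: `gibbs_sampler({"r": 0, "delta": 0, "gamma": 0}, [("r", lambda s: s["delta"] + 1), ("delta", lambda s: s["r"] * 2), ("gamma", lambda s: s["r"] + s["delta"])], 2)`. In a real implementation each `draw` would sample from the corresponding full conditional, with the latent variables $u$, $w$ and $v$ carried as additional blocks of the state.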

Data analysis
- Implemented our semiparametric model with 6 covariates
- The same covariates were used in $\lambda_i$ and $\theta_i$

Table: LPML statistics for different prior settings.

| $(\omega, \epsilon)$ | $(c, d)$ | Loc. Weib. K. | Mean Weib. K. |
|---|---|---|---|
| (1, 0.5) | (1, -1) | -243.85 | -244.43 |
| (1, 0.5) | (1, 0) | -239.39 | -240.76 |
| (1, 0.5) | (1, 1) | -240.54 | -243.54 |
| (10, 0.5) | (1, -1) | -244.37 | -247.96 |
| (10, 0.5) | (1, 0) | -240.97 | -241.66 |
| (10, 0.5) | (1, 1) | -241.56 | -245.70 |
| (1, -1) | (1, 0) | -239.84 | -241.89 |

Prop. Hazards: -284.48
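The slides do not say how LPML is computed. Assuming the usual definition as the sum of log conditional predictive ordinates (CPO), with each CPO estimated by the harmonic mean of the observation's likelihood contribution across MCMC draws, a minimal sketch is shown below. The input layout and function name are assumptions of this example.

```python
import math

def lpml(likelihood_draws):
    """LPML = sum_i log CPO_i, with CPO_i the harmonic mean over MCMC draws.

    likelihood_draws[m][i] is the likelihood contribution of observation i
    (density for an exact time, survival probability for a right-censored one)
    evaluated at the parameters of MCMC draw m.
    """
    n_draws = len(likelihood_draws)
    n_obs = len(likelihood_draws[0])
    total = 0.0
    for i in range(n_obs):
        # 1 / CPO_i is estimated by the average of inverse likelihoods
        inv_cpo = sum(1.0 / likelihood_draws[m][i] for m in range(n_draws)) / n_draws
        total += -math.log(inv_cpo)
    return total
```

Under this definition larger LPML values indicate better predictive fit, which is how the prior settings and the proportional hazards model are compared in the table.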

Data analysis
Figure: Posterior estimates of $R(t)$ (left) and baseline hazard $h(t \mid 0)$ (right), against $t$. Using mean Weibull kernel (solid) and using location Weibull kernel (dashed).

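Data analysis
Table: Posterior estimates of covariate coefficients $\delta$ and $\gamma$ (Loc. Weib. K.).

| Parameter | Mean | 95% CI | P(< 0) |
|---|---|---|---|
| $\delta_1$ | 0.337 | (-0.160, 0.883) | 0.066 |
| $\delta_2$ | 1.442 | (0.885, 2.060) | 0.000 |
| $\delta_3$ | 0.213 | (-0.401, 0.944) | 0.282 |
| $\delta_4$ | -0.683 | (-1.219, -0.136) | 0.990 |
| $\delta_5$ | 0.195 | (-0.135, 0.494) | 0.114 |
| $\delta_6$ | 1.646 | (-0.617, 3.719) | 0.070 |
| $\gamma_1$ | 1.261 | (0.558, 1.685) | 0.000 |
| $\gamma_2$ | 0.949 | (0.250, 1.591) | 0.010 |
| $\gamma_3$ | -0.356 | (-0.894, 0.049) | 0.955 |
| $\gamma_4$ | 0.500 | (-0.242, 0.952) | 0.080 |
| $\gamma_5$ | 0.032 | (-0.279, 0.338) | 0.421 |
| $\gamma_6$ | 2.453 | (-0.834, 5.750) | 0.091 |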
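Since $\lambda_i = \exp(\delta' x_i)$ and $\theta_i = \exp(\gamma' x_i)$, exponentiating a coefficient gives the multiplicative effect of its covariate on the short-term or long-term hazard ratio, holding the other covariates fixed. The sketch below does this for the ESA treatment indicator using the posterior means from the table. This is a plug-in calculation at the posterior mean, which is not the same as the posterior mean of the hazard ratio itself.

```python
import math

# Posterior means for the ESA treatment indicator (covariate 1), from the table
delta_1 = 0.337   # short-term coefficient
gamma_1 = 1.261   # long-term coefficient

short_term_hr = math.exp(delta_1)  # effect on lambda_i, roughly 1.40
long_term_hr = math.exp(gamma_1)   # effect on theta_i, roughly 3.53

print(f"short-term HR: {short_term_hr:.2f}, long-term HR: {long_term_hr:.2f}")
```

Both plug-in values exceed 1, and the long-term value is the larger of the two, consistent with the small posterior probabilities $P(< 0)$ reported for $\delta_1$ and $\gamma_1$.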

Data analysis
Figure: Posterior predictive hazard rates $h(t)$ against $t$. Hypothetical individuals $x_1 = (0, 0, 0, 1, 3, 92)$, $x_2 = (0, 1, 0, 1, 3, 20)$, $x_3 = (1, 0, 0, 1, 3, 92)$, and $x_4 = (1, 1, 0, 1, 3, 20)$.

References
- Yang, S. and Prentice, R. (2005). Semiparametric analysis of short-term and long-term hazard ratios with two sample survival data. Biometrika 92, 1-17.
- Nieto-Barajas, L.E. (2013). Bayesian semiparametric analysis of short- and long-term hazard ratios with covariates. Computational Statistics and Data Analysis. In press. DOI: 10.1016/j.csda.2013.03.012