

CASE STUDY: Bayesian Incidence Analyses from Cross-Sectional Data with Multiple Markers of Disease Severity

Outline:
1. NIEHS Uterine Fibroid Study
   Design of Study
   Scientific Questions
   Difficulties
2. General Problem and Earlier Approaches
3. Bayesian Modeling Framework
   Stochastic structure
   Prior Elicitation
   MCMC Scheme
4. Application to Fibroid Data & Results
5. Discussion

Studying Uterine Leiomyoma (Fibroids)
Background:
  Uterine fibroids are smooth muscle tumors
  Fibroids → bleeding, pain, infertility, pregnancy complications
  Fibroids → hysterectomy
  Fibroids typically regress after menopause
  African Americans have a higher clinical rate
Interests: Inferences on black-white differences in
  Age-specific rate of preclinical onset
  Rate of progression after onset

NIEHS Uterine Fibroid Study (Donna Baird, PI)
Design: Cross-Sectional Screening Study
Participants:
  Sample from D.C. HMO
  840 African Americans, 524 whites
  Aged 35-49, pre- and postmenopausal
Data:
  Clinical history (age at first diagnosis)
  Current presence of detectable fibroids
  Severity (length of uterus, size/number of tumors)
  Age at myomectomy, hysterectomy or menopause

Goals:
1. Estimate cumulative incidence
2. Compare black & white incidence
3. Assess differences in preclinical progression
Problem:
  Data are cross-sectional
  Age at onset is interval censored
  Current severity depends on age at onset
  Informatively missing data
What can we do?

Multistate Modeling
State 1: Disease-Free (or not detectable) --λ(t)--> State 2: Preclinical Detectable Disease --α(t)--> State 3: Clinical Disease
R = age at entry to State 2, S = age at entry to State 3
Figure 1. Progressive three-state model of the onset and diagnosis process.

Summary of NIEHS uterine fibroid data

                                                Surrogate Data (a)
Race    Current Status          State   Number  Uterus (cm)    Tumor rank (b)
Black   No fibroids             1       130
        Preclinical             2       185     9.07 (2.17)    1.81 (.76)
        History of fibroids     3       420
        No history, missing     ?       105
        All black                       840
White   No fibroids             1       190
        Preclinical             2       140     8.62 (1.48)    1.49 (.65)
        History of fibroids     3       125
        No history, missing     ?       69
        All white                       524

(a) Mean (sd) among women with leiomyoma detected at screening.
(b) Ordinal measure of size/number of tumors was averaged.

Earlier Approaches
Dunson and Baird (2001, Biometrics): Incorporates diagnostic history in estimating incidence.
Ryan and Orav (1988, Biometrika): Tumor severity incorporated as covariates in modeling rate of natural death (tumorigenicity experiments).
Chen et al. (2000, Biometrics): Time-homogeneous Markov modeling approach; allows progression between states within preclinical stage.
Craig et al. (1999, Statistics in Medicine): Bayesian analysis of interval-censored disease data; severity data incorporated as covariates.
New approach is needed for inference on progression.

Discrete-Time Stochastic Modeling Approach
$R$ = Age at Preclinical Onset (Entry into State 2)
$S$ = Age at Clinical Diagnosis (Entry into State 3)
$I_j = (t_{j-1}, t_j]$ for $j = 1, \ldots, J$, with $t_0 = 0$
Transition Rates:
  $\lambda_j = \Pr(R \in I_j \mid R > t_{j-1})$   Onset Rate
  $\alpha_j = \Pr(S \in I_j \mid S > t_{j-1}, R \le t_j)$   Diagnosis Rate
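The onset rates are discrete hazards, so the cumulative incidence at the end of interval $j$ follows as $\Pr(R \le t_j) = 1 - \prod_{h \le j} (1 - \lambda_h)$. A minimal sketch of that conversion is given below; the function name and the numeric rates are hypothetical and chosen only for illustration.

```python
def cumulative_incidence(onset_rates):
    """Convert discrete onset hazards lambda_1..lambda_J into
    cumulative incidence Pr(R <= t_j) for j = 1..J."""
    disease_free = 1.0  # running Pr(R > t_j)
    incidence = []
    for lam in onset_rates:
        disease_free *= 1.0 - lam
        incidence.append(1.0 - disease_free)
    return incidence


# Hypothetical onset rates for three intervals (not estimates from the study).
example_rates = [0.10, 0.20, 0.50]
example_curve = cumulative_incidence(example_rates)
print(example_curve)
```

The curve is non-decreasing by construction, which is one reason for modeling the hazards rather than the incidence directly.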

Marker Process:
  $Z_k$ = $k$th Measure of Severity at $T$ (Screen)
  $Z_k^*$ = Normal variable underlying $Z_k$
  $Z_k = g_k(Z_k^*; \tau_k)$   Link model
Underlying severity model:
  $(Z_k^* \mid R \in I_j,\ T \in I_l,\ j \le l,\ S > t_l) \sim N\left( \sum_{h=1}^{l-j} \mu_{hk},\ 1 \right)$
Conditional expectation of $Z_k^*$ is 0 when $j = l$
Expectation depends on waiting time in disease state
Accommodates discrete and continuous measures
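The severity model can be read as accumulating one increment per interval spent in the preclinical state. The sketch below assumes a single marker with increments stored in a list (`mu[h-1]` holding $\mu_{hk}$); the function names and the increment values are hypothetical.

```python
import random


def latent_marker_mean(onset_interval, screen_interval, mu):
    """Mean of the underlying normal marker: the sum of mu_h for
    h = 1..(l - j), where j is the onset interval and l the screening interval."""
    if onset_interval > screen_interval:
        raise ValueError("onset must not come after screening (need j <= l)")
    waiting = screen_interval - onset_interval
    return sum(mu[:waiting])  # empty sum (0) when j == l


def draw_latent_marker(onset_interval, screen_interval, mu, rng):
    """Draw the underlying marker from N(mean, 1)."""
    mean = latent_marker_mean(onset_interval, screen_interval, mu)
    return rng.gauss(mean, 1.0)


# Hypothetical increments for up to four intervals in the disease state.
mu_example = [0.1, 0.2, 0.3, 0.4]
rng = random.Random(1)
print(latent_marker_mean(2, 5, mu_example), draw_latent_marker(2, 5, mu_example, rng))
```

A woman screened in the same interval as her onset has mean 0, so any observed severity at that point reflects only the link parameters and noise.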

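Bayesian Semiparametric Analysis
Regression models for the rate parameters:
  $\lambda_{ij} = h_1(\omega_j + x_{ij}'\beta)$   Onset Rate
  $\alpha_{ij} = h_2(\nu_j + x_{ij}'\psi)$   Diagnosis Rate
  $\mu_{ihk} = u_h'\upsilon_k + x_{ih}'\kappa_k + \sigma\xi_i$   Progression, $\xi_i \sim N(0, 1)$
  $\omega_j, \nu_j, \upsilon_k$   Baseline parameters
  $\beta, \psi, \kappa_k$   Regression parameters
  $\sigma\xi_i$   Subject-specific effect
Prior distributions:
  Normal for $\beta$, $\psi$, $\{\upsilon_k, \kappa_k\}$ and $\sigma$.
  For the baseline rate parameters, a Gamma prior is placed on the deviation of each parameter from the average of its neighbours:
    deviation of $\omega_j$ from $\tfrac{1}{2}(\omega_{j-1} + \omega_{j+1})$: Gamma$(c_1, d_1)$
    deviation of $\nu_j$ from $\tfrac{1}{2}(\nu_{j-1} + \nu_{j+1})$: Gamma$(c_2, d_2)$
The degree of smoothing towards local linearity depends on $c$, $d$ for small samples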
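To see why a prior built on $\omega_j - \tfrac{1}{2}(\omega_{j-1} + \omega_{j+1})$ smooths towards local linearity, note that this quantity is exactly zero whenever three consecutive baseline parameters lie on a straight line. The sketch below only computes the deviations; it does not implement the prior itself, and the sequences are hypothetical.

```python
def neighbour_deviations(values):
    """Return values[j] - 0.5 * (values[j-1] + values[j+1]) for each interior j."""
    return [
        values[j] - 0.5 * (values[j - 1] + values[j + 1])
        for j in range(1, len(values) - 1)
    ]


# Hypothetical baseline parameters on the linear-predictor scale.
linear_baseline = [-3.0, -2.5, -2.0, -1.5]   # locally linear everywhere
kinked_baseline = [-3.0, -2.5, -1.0, -1.5]   # departs from linearity

print(neighbour_deviations(linear_baseline))
print(neighbour_deviations(kinked_baseline))
```

A prior that concentrates these deviations near zero therefore favours baseline rates that change smoothly with age, while still letting the data pull individual intervals away from the line.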

Posterior Computation using MCMC
1. For subjects with disease, sample interval of entry.
2. Sample underlying variables $\{Z_{ik}^*\}$ and link parameters $\tau$.
3. Update the parameters $\{\omega_j\}$, $\beta$, $\{\nu_j\}$, $\psi$ in onset and diagnosis process.
4. Update the parameters $\{\upsilon_k, \kappa_k\}$, $\sigma$ and latent variables $\{\xi_i\}$ in disease progression process.
5. Repeat steps 1-4.
Conditional on onset interval, likelihood is simple
MCMC algorithm is straightforward to implement
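Step 1 can be sketched for a woman found with preclinical disease at screening interval $l$. Under the model, the probability that onset fell in interval $j \le l$ is proportional to the onset probability for $I_j$, times the probability of remaining undiagnosed through $I_l$, times the normal density of the underlying marker. The code below is an illustrative reading of that step for a single marker, not the authors' implementation; all function names and inputs are hypothetical.

```python
import math
import random


def onset_interval_probs(l, z_star, lam, alpha, mu):
    """Conditional probabilities that onset occurred in interval j = 1..l.

    lam[j-1], alpha[j-1]: onset and diagnosis rates for interval j.
    mu[h-1]: marker increment for the h-th interval in the disease state.
    z_star: current value of the underlying normal marker.
    """
    weights = []
    for j in range(1, l + 1):
        # Pr(R in I_j): disease-free through j-1, then onset in j.
        p_onset = lam[j - 1]
        for h in range(1, j):
            p_onset *= 1.0 - lam[h - 1]
        # Pr(S > t_l | R in I_j): not diagnosed in intervals j..l.
        p_undiagnosed = 1.0
        for h in range(j, l + 1):
            p_undiagnosed *= 1.0 - alpha[h - 1]
        # N(sum of increments, 1) density for the underlying marker.
        mean = sum(mu[: l - j])
        density = math.exp(-0.5 * (z_star - mean) ** 2) / math.sqrt(2.0 * math.pi)
        weights.append(p_onset * p_undiagnosed * density)
    total = sum(weights)
    return [w / total for w in weights]


def sample_onset_interval(probs, rng):
    """Draw an onset interval (1-based) from the conditional probabilities."""
    return rng.choices(range(1, len(probs) + 1), weights=probs, k=1)[0]


# Hypothetical rates and increments for a woman screened in interval 3.
probs = onset_interval_probs(3, 0.4, [0.1, 0.1, 0.1], [0.05, 0.05, 0.05], [0.3, 0.3])
print(probs, sample_onset_interval(probs, random.Random(0)))
```

With several markers, the density term would become a product over markers, which is where the severity data add information about the onset age.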

Application to NIEHS Uterine Fibroid Data
Interest: Black-white differences in onset/progression
Assumption: Preclinical disease menopause
Transition Rate Models:
  $\log\{-\log(1 - \lambda_{ij})\} = \omega_j + x_i \, \big(1(j = 1),\ 1(2 \le j \le 10),\ 1(11 \le j \le 14)\big) \, \beta$
  $\log\{-\log(1 - \alpha_{ij})\} = \nu_j (1 - x_i) + \psi_j x_i$
  $x_i = 1$ for blacks, $x_i = 0$ for whites
Age is divided into $J = 14$ intervals: (0, 36], (36, 37], (37, 38], ..., (48, 49].
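The transition rate models use the complementary log-log link, whose inverse is $\lambda = 1 - \exp\{-\exp(\eta)\}$. Under this link, adding $\beta$ to the linear predictor multiplies $-\log(1 - \lambda)$ by $\exp(\beta)$, which gives $\beta$ a proportional hazards interpretation. A minimal sketch follows; the baseline and coefficient values are hypothetical.

```python
import math


def cloglog(p):
    """Complementary log-log transform: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse complementary log-log: 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def onset_rate(omega_j, beta_component, is_black):
    """Onset rate for interval j; beta_component is the element of beta
    that applies to the age group containing interval j."""
    x = 1.0 if is_black else 0.0
    return inv_cloglog(omega_j + x * beta_component)


# Hypothetical baseline and race effect (not estimates from the study).
rate_white = onset_rate(-2.0, 0.5, is_black=False)
rate_black = onset_rate(-2.0, 0.5, is_black=True)
print(rate_white, rate_black)
```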

Disease Progression Model
$Z_1$ = Length of uterus
$Z_2$ = 1-3 ranking of size/number of tumors
Link Model:
  $Z_{i1} = \tau_{11} + \tau_{21} Z_{i1}^*$
  $Z_{i2} = 1$ if $Z_{i2}^* \le \tau_{12}$; $Z_{i2} = 2$ if $\tau_{12} < Z_{i2}^* \le \tau_{22}$; $Z_{i2} = 3$ if $\tau_{22} < Z_{i2}^*$
  $\tau$ are nuisance link parameters
Underlying Variable Model:
  $\mu_{ihk} = \upsilon_k + x_i \kappa_k + \sigma\xi_i$ for $k = 1, 2$ and $h = 1, \ldots, 13$
  $\upsilon_k$ = Underlying rate of change in marker $k$ for whites
  $\kappa_k$ = Black-white difference in underlying rate
  $\sigma\xi_i$ = Woman-specific factor
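The two link functions map the underlying normal variables to the observed scales: a linear rescaling for uterine length and a pair of cutpoints for the ordinal tumor rank. The sketch below assumes the link parameters are passed in as plain numbers; the function names and the values in the example call are hypothetical.

```python
def uterine_length(z_star, tau11, tau21):
    """Continuous marker: linear link from the underlying normal variable."""
    return tau11 + tau21 * z_star


def tumor_rank(z_star, tau12, tau22):
    """Ordinal marker: rank 1, 2 or 3 according to the two cutpoints."""
    if z_star <= tau12:
        return 1
    if z_star <= tau22:
        return 2
    return 3


# Hypothetical link parameters, used only to exercise the functions.
for z in (-0.5, 1.0, 2.5):
    print(z, uterine_length(z, 8.0, 1.5), tumor_rank(z, 0.0, 1.5))
```

Because both markers are driven by the same kind of underlying variable, a continuous and an ordinal measure can share one progression model.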

Bayesian Analysis
Prior distributions were chosen
30,000 MCMC iterates collected (5000 burn-in)

Posterior Summaries: Progression Parameters

Surrogate                  Parameter      Mean    90% Credible Interval
Uterine Length ($Z_1$)     $\tau_{11}$    8.196   (7.909, 8.495)
                           $\tau_{21}$    1.465   (1.334, 1.607)
                           $\upsilon_1$   0.090   (0.028, 0.155)
                           $\kappa_1$     0.074   (0.014, 0.136)
Tumor Size/Number ($Z_2$)  $\tau_{12}$    0.149   (-0.269, 0.591)
                           $\tau_{22}$    1.399   (0.884, 1.968)
                           $\upsilon_2$   0.030   (0.001, 0.095)
                           $\kappa_2$     0.064   (0.002, 0.138)
Shared Parameter           $\sigma$       0.124   (0.094, 0.155)

Conclusion: African Americans have a higher rate of preclinical growth of uterine fibroids
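As a rough reading of the table, the posterior means can be plugged into the link and progression models to compare expected uterine length after a given number of intervals in the preclinical state. This is only a plug-in illustration for a woman with $\xi_i = 0$; it is not a posterior mean of the expected length, and the function name is hypothetical.

```python
def expected_uterine_length(intervals_since_onset, is_black,
                            tau11=8.196, tau21=1.465,
                            upsilon1=0.090, kappa1=0.074):
    """Plug-in expected uterine length (cm) after a number of intervals in the
    preclinical state, using the tabulated posterior means and xi_i = 0."""
    rate = upsilon1 + (kappa1 if is_black else 0.0)   # increment per interval
    latent_mean = intervals_since_onset * rate        # mean of the underlying variable
    return tau11 + tau21 * latent_mean                # linear link to centimetres


for h in (0, 5, 10):
    print(h, expected_uterine_length(h, False), expected_uterine_length(h, True))
```

At onset both groups start from $\tau_{11}$, and the gap between them widens with time in the disease state because $\kappa_1$ is added to the per-interval rate.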

Higher incidence rate among blacks 35 ($\Pr(\beta_1 > 0) > 0.99$, $\Pr(\beta_2 > 0) = 0.58$, $\Pr(\beta_3 > 0) = 0.32$)

Estimated Cumulative Incidence Curves
[Figure: cumulative incidence of leiomyoma (0.0 to 1.0) against age (36 to 48 years), with separate curves for African American and white women. Estimates not incorporating disease severity data are denoted by +.]

SUMMARY
Bayesian approach for inference on disease incidence and progression from cross-sectional data.
  Three-state model of preclinical onset and clinical diagnosis
  Multiple markers are linked to underlying normal variables
  Accounts for dependency among markers with different scales
  Approach can be adapted for different data structures

Discussion
This type of multistate modeling approach with surrogates for the waiting times in the different states can be adapted for many different applications.
Often, data do not consist of a simple right-censored survival time.
When there are several states an individual can progress through and the exact transition times are unknown, the likelihood can be complex.
The problem can be simplified by working in discrete time and considering the interval-censored transition times as latent variables.
Often, identifiability is in question and one may need some assumptions.

Common Assumptions
Markov: transition rates are independent of history of process
Homogeneity: transition rates are independent of time
Semi-Markov: transition rates are independent of time given waiting time in the current state
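Under the Markov and homogeneity assumptions together, state occupancy after $n$ intervals is obtained by raising a single one-step transition matrix to the $n$th power. The sketch below uses the progressive three-state structure with constant, hypothetical rates, and assumes at most one transition per interval for simplicity.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def n_step_matrix(p, n):
    """n-step transition matrix of a time-homogeneous Markov chain."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


# Hypothetical constant onset rate (0.1) and diagnosis rate (0.2).
onset, diagnosis = 0.1, 0.2
one_step = [
    [1.0 - onset, onset, 0.0],          # State 1: disease-free
    [0.0, 1.0 - diagnosis, diagnosis],  # State 2: preclinical
    [0.0, 0.0, 1.0],                    # State 3: clinical (absorbing)
]
two_step = n_step_matrix(one_step, 2)
print(two_step[0])  # occupancy after two intervals, starting disease-free
```

Relaxing homogeneity means using a different matrix for each interval, and a semi-Markov model lets the second row depend on the time already spent in State 2.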