Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial William R. Gillespie Pharsight Corporation Cary, North Carolina, USA PAGE 2003 Verona, Italy June 12-13, 2003 Oral Session 1: Estimating parameter uncertainty 1

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial Bayesian principles and methods provide a coherent framework for: Quantifying uncertainty, Making inferences in the presence of that uncertainty. Bayesian modeling and simulation are practical options for many applications due to recent advances in hardware, numerical methods and software. This presentation describes an approach used with a recent project to optimize the design and analysis of a Phase II proof-ofconcept (PoC) trial. Bayesian methods used throughout a model-based approach. Model development Trial simulation Trial analysis Focus on technical execution. 2

The scenario Simuzine is a NCE for treatment of a slowly progressive illness. Previous Phase II PoC study of simuzine: Primary efficacy score measured at baseline and 6 months. Results encouraging but inconclusive. Longer duration treatment may be necessary to reach a decisive outcome. Additional longitudinal data available for model development: Efficacy score for patients with observations at various times over durations up to 6 years. Believed to be representative of the placebo group in the new trial. The new trial is already underway, so the M&S effort focuses on optimizing analysis of the trial results to support a PoC decision. 3

The scenario (cont.) New trial design: Parallel, 2 treatment arm trial comparing simuzine 100 mg to placebo. Primary endpoint = efficacy score at 2 years (LOCF imputation). 100 patients per treatment arm. The example compares the use of 3 different trial analyses: Conventional frequentist analysis of endpoint data (ANCOVA) Bayesian longitudinal analysis with use of prior data. Bayesian longitudinal analysis without prior data (non-informative priors). 4

Fully Bayesian approach to trial simulation and analysis The case study illustrates the following 3 applications of Bayesian modeling: Model development A Bayesian longitudinal model of an efficacy score is fitted using Markov chain Monte Carlo simulation (MCMC, WinBUGS). Trial simulation Uncertainty in the model parameters is considered by resampling the MCMC-generated samples from the joint posterior distribution of the model parameters (S-PLUS). Trial analysis The simulated trial data combined with prior data are analyzed with a Bayesian longitudinal model (WinBUGS). The resulting inferences are compared with those obtained with a more conventional endpoint analysis (ANCOVA, S-PLUS) and use of the Bayesian longitudinal model without the prior data. 5

Model for the effect of simuzine on an efficacy score in the target patient population Selected model: Log(score) changes linearly with time. Sex, age and time from disease onset affect the intercept. Dose of simuzine affects the slope. Log(score) at the i th observation time in the j th patient is modeled as: 2 ( scoreij ) N ( α j + β jtij σ ) ( 55) log ~, α = α + θ I + θ age + θ t j 0 j sex female, j age j β onset, j β = θ + θ j β drug j 0 j 2 ( α α) α ~ N θ, ω Prior distributions (chosen to be relatively non-informative) 6 6 12 6 ( ) β ( ) drug ( ) age ( ) θ ~ N 0,10 θ ~ N 0,10 θ ~ N 0,10 θ ~ N 0,10 θ α sex dose 1 1 ~ N 0,10 ~ gamma 10,10 ~ gamma 10,10 6 4 4 4 4 ( ) ( ) ( ) 2 2 σ ω α 6

Model fitting results: Posterior predictive intervals score 600 500 400 300 200 100 0 The model captures the decline in score after disease onset: Observed and model predicted (median and 90% prediction intervals) scores (placebo data) 0 5 10 15 20 time from disease onset (y) score change from baseline at 6 months And the effect of simuzine: Observed and model predicted (median and 90% prediction intervals) change from baseline 60 40 20 0-20 -40-60 0 20 40 60 80 100 dose (mg/d) 7

Model fitting results: Examples of individual fits 76 77 78 79 80 600 500 400 300 200 100 7.2 7.4 7.6 3.1 3.3 3.5 4.2 4.4 4.6 4.4 4.6 2.6 2.8 3.0 600 500 400 300 200 100 71 72 73 74 75 score 6.4 6.6 6.85.1 5.3 5.5 66 67 1.2 1.4 1.63.1 3.3 3.51.6 1.8 2.0 68 69 70 600 500 400 300 200 100 4.2 4.4 4.6 12.2 12.4 12.6 4.2 4.4 4.6 8.2 8.4 8.6 4.4 4.6 4.8 600 500 400 300 200 100 61 62 63 64 65 5.8 6.0 6.2 3.1 3.3 3.5 4.2 4.4 4.6 8.1 8.3 8.5 3.4 3.6 3.8 time (y) 8

Model fitting results: Posterior marginal distributions of parameter estimates Pr(θ drug >0) = 0.94 frequency 0 20 60 100 0 5 10 20 30 theta.beta sd(alpha0) 0.18 0.24 0.0 0.0004 0.0008 theta.drug 0 2 4 6 8 10-0.060-0.040-1000 2000-0.1 0.1 sigma theta.age 0 20 40 60 80 0 50 100 200 0.08 0.10-0.020-0.005 5.50 5.65 value 9 theta.sex 0 5 10 15 theta.alpha

Extending the model for simulating 2 year outcomes The current model is based on data from only 6 months of simuzine administration. Model predictions for 1 and 2 years represent major extrapolations from experience. The magnitude of the response at 1 and 2 years is more uncertain than indicated by simple linear extrapolation. For example, the available clinical evidence is also consistent with a more pessimistic model in which the drug benefit is not sustained beyond 6 months. Extrapolation beyond 6 months is based on expert judgment: Upper bound: Linear extrapolation (constant slope) Lower bound: Slope changes to pretreatment value after 6 months Uncertainty in the post-6 month slope is modeled as a uniform distribution between those extremes. 10

Model for simulating the effect of simuzine over 2 years Log(score) at the i th observation time in the j th patient is modeled as: 2 ( scoreij ) N ( µ ij σ ) log ~, α j + β1 jtij, tij 0.5 µ ij = α j + β1j 0.5 + β2 j ij 0.5, tij > 0.5 ( ) ( t ) ( 55) α = α + θ I + θ age + θ t j 0 j sex female, j age j β onset, j β = θ + θ dose β = θ + θ θ 1j β drug j 2 j β extrap drug j 2 0 j ~ N ( α, α) extrap ~ Uniform( 0,1) 2 2 ( θα, θβ, θdrug, θag e θsex σ ωα ) α θ ω θ dose,,, ~ MCMC estimated joint posterior distribution 11

Model results indicate a highly uncertain but potentially large drug effect on the efficacy score 0.0 1.0 2.0 0 20 40 60 80 100 60 mg 80 mg 100 mg 1.5 years 2 years 230 40 score 230 220 210 200 190 0 mg 20 mg 40 mg 220 210 200 190 score: difference from placebo 40 30 20 10 0 0.5 years 1 years 30 20 10 0 0.0 1.0 2.0 0.0 1.0 2.0 0 20 40 60 80 100 time (y) dose (mg/d) Model-predicted population mean score (median & 90% probability intervals) as a function of dose and time 12

Trial simulation with uncertainty Uncertainty is modeled as inter-trial variation in the model parameters Each trial is simulated using one random draw from the joint posterior distribution of the model parameters. Those parameter values represent the truth for that simulated trial. Each simulated trial outcome is compared to its own unique truth under the model. The basic notion is that you don t know the real truth, so you would like to explore the performance of the trial design over a range of possibilities consistent with your uncertainty. Algorithm: For j = 1 to n.trials Sample parameters from the joint posterior distribution. Simulate the trial. Calculate statistic(s) of interest (e.g., treatment means, hypothesis test results, go/no-go decision, choice of treatment regimen further development, etc.). Assess performance by comparison to model truth. Implemented in S-PLUS 13

Trial performance is measured by the quality of the proof-ofconcept decision Probability of reaching the correct (highest value) decision, i.e., go for a winner drug and no-go for a loser drug. You want to choose a trial design and a go/no-go decision method and criteria that minimizes: Pr(go loser): probability of an incorrect go decision. Pr(stop winner): probability of a lost opportunity. What is a winner or loser drug treatment? The working definition of a winner used for the analyses presented here is a drug treatment that results in at least a 50% reduction in the rate of decline of the efficacy score over 2 years. 14

Prior information indicates a 73% probability that simuzine 100 mg is a winner frequency 0.5 0.4 0.3 0.2 loser Pr(drug effect > 50%) = 0.728 winner 0.1 0.0-2 0 2 4 6 fractional reduction in rate of decline in score over 2 years 15

Current trial design and per protocol analysis results in too many go decisions for losers Pr(go loser) = 0.34 Pr(stop winner) = 0.037 high probability of an incorrect go decision. low probability of a lost opportunity. fraction of simulated trials 0.0 0.05 0.10 0.15 N per treatment = 100-1 0 1 2 3 4 fractional reduction in rate of decline in score 16

Results of Bayesian longitudinal analyses Go criteria: Pr( 50% reduction in the rate of decline over 2 years) p crit Bayesian longitudinal analysis (without prior information) can be calibrated to markedly improve the PoC decision by reducing incorrect go decisions. Analysis criteria Simulated trial results p crit Pr(stop winner) Pr(go loser) Bayesian longitudinal analysis without prior information 0.5 0.065 0.152 0.6 0.085 0.122 0.7 0.097 0.084 0.8 0.126 0.057 0.9 0.182 0.041 0.95 0.216 0.020 ANCOVA results 0.037 0.338 17

Results of Bayesian longitudinal analyses Go criteria: Pr( 50% reduction in the rate of decline over 2 years) p crit Incorporation of the prior data offers little or no additional improvement in the quality of the PoC decision. Analysis criteria Simulated trial results p crit Pr(stop winner) Pr(go loser) Bayesian longitudinal analysis without prior information 0.5 0.065 0.152 0.6 0.085 0.122 0.7 0.097 0.084 0.8 0.126 0.057 0.9 0.182 0.041 0.95 0.216 0.020 Bayesian longitudinal analysis with prior information 0.5 0.078 0.149 0.6 0.111 0.115 0.7 0.126 0.088 0.8 0.145 0.057 0.9 0.180 0.027 0.95 0.223 0.017 ANCOVA results 0.037 0.338 18

What next? The presented approach for assessing trial design performance does not explicitly optimize the tradeoffs between false positives (go loser) and false negatives (stop winner). That may be addressed by associating values (possibly economic) to the losses due to those competing errors and using Bayesian decision analysis to optimize the choice of analysis criteria (p crit ). Alternatively, the go/no-go decision method could be based on Bayesian decision analysis rather than the approach shown here. 19

Take home message: Bayesian modeling and simulation can be done NOW! Recent advances in computer hardware, numerical methods and software make fully Bayesian approaches a practical option for many modeling, simulation and decision analysis applications. Bayesian principles and methods provide a coherent framework for: Quantifying uncertainty, Making inferences in the presence of that uncertainty. 20

Wladimiro Tulli "LE SCARPETTE 1997 21

Additional slides 22

Model development General approach Exploratory data analysis Graphical exploration plus crude regression analyses (primarily S-PLUS). Model exploration and selection Linear and nonlinear mixed effects models fitted by maximum likelihood methods (NONMEM or S-PLUS) Final model parameter estimation Mixed effects models fitted by Bayesian methods (WinBUGS). More rigorously characterizes the correlated uncertainties in the parameter estimates. Posterior distributions of the parameters are used in subsequent clinical trial simulations. 23

Simulations to optimize the Phase II PoC trial analysis Focuses on methods of trial analysis, i.e., something that may still be influenced now that the trial is underway. Conventional endpoint analysis (ANCOVA) versus Bayesian longitudinal analysis with or without use of prior data. Implementation: Simulations performed using S-PLUS. Simulated trials analyzed with S-PLUS (conventional ANCOVA) or WinBUGS (Bayesian longitudinal analyses). 24

Model fitting results: examples of posterior predictive checks Simulated trial results are consistent with the results of the previous trial. -20 0 10 20 30 20 30 40 50 time = 0.44 dose = 20 time = 0.44 dose = 100 time = 0.44 dose = 0 time = 0.44 dose = 20 time = 0.44 dose = 100 Percent of Total 25 20 15 10 Percent of Total 20 15 10 5 5 0 0-20 0 10 20 30 difference from placebo in mean score change from baseline 25 20 30 40 50 20 30 40 50 sd(score change from baseline) Histograms depict the distribution of simulated trial outcomes using the same design and patient covariates as the previous trial. Observed values are shown as vertical dashed lines.

Model results indicate a highly uncertain but potentially large drug effect on the efficacy score 0.5 Uncertainty distribution of the mean decline of the score over 2 years due to simuzine 100 mg 0.4 frequency 0.3 0.2 Pr(drug effect > 0) = 0.94 0.1 0.0-2 0 2 4 6 fractional reduction in rate of decline in score over 2 years 26

Trial analysis methods ANCOVA Dependent variable: percent change in score at endpoint Covariates: baseline score and simuzine dose (as a categorical variable) Go criteria: p < 0.05 for dose effect and percent change in score greater for simuzine 100 mg than for placebo Bayesian longitudinal analysis Dependent variable: score (all observation times) The model used for simulation is fit to the trial data Trial data alone Trial data + prior data Relatively non-informative prior distributions used for the model parameters Go criteria: Pr( 50% reduction in the rate of decline over 2 years) p crit A range of p crit values are explored (0.5, 0.6, 0.7, 0.8, 0.9, 0.95). 27

A Bayesian longitudinal analysis (without prior information) can be calibrated to markedly improve the PoC decision by reducing incorrect go decisions Pr(drug effect > 0.5) = 0.5 Pr(drug effect > 0.5) = 0.6 fraction of simulated trials 0.0 0.10 0.0 0.10-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.7-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.9 0.0 0.10 0.0 0.10-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.8-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.95 Go criteria: Pr( 50% reduction in the rate of decline over 2 years) p crit Analysis criteria Simulated trial results p crit Pr(stop winner) Pr(go loser) 0.5 0.065 0.152 0.6 0.085 0.122 0.7 0.097 0.084 0.8 0.126 0.057 0.9 0.182 0.041 0.95 0.216 0.020 ANCOVA results 0.037 0.338 0.0 0.10 0.0 0.10-1 0 1 2 3 4-1 0 1 2 3 4 fractional reduction in rate of decline in score 28

Incorporation of the prior data offers little or no additional improvement in the quality of the PoC decision (Bayesian longitudinal analysis method) Pr(drug effect > 0.5) = 0.5 Pr(drug effect > 0.5) = 0.6 fraction of simulated trials 0.0 0.10 0.0 0.10 0.0 0.10-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.7-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.9 0.0 0.10 0.0 0.10-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.8-1 0 1 2 3 4 Pr(drug effect > 0.5) = 0.95 0.0 0.10 Go criteria: Pr( 50% reduction in the rate of decline over 2 years) p crit Analysis criteria Simulated trial results p crit Pr(stop winner) Pr(go loser) Bayesian longitudinal analysis without prior information 0.5 0.065 0.152 0.6 0.085 0.122 0.7 0.097 0.084 0.8 0.126 0.057 0.9 0.182 0.041 0.95 0.216 0.020 Bayesian longitudinal analysis with prior information 0.5 0.078 0.149 0.6 0.111 0.115 0.7 0.126 0.088 0.8 0.145 0.057 0.9 0.180 0.027 0.95 0.223 0.017 ANCOVA results 0.037 0.338-1 0 1 2 3 4-1 0 1 2 3 4 fractional reduction in rate of decline in score 29

Summary of key inferences Current trial design and per protocol analysis results in too many go decisions for losers. A Bayesian longitudinal analysis (without prior information) can be calibrated to markedly improve the PoC decision by reducing incorrect go decisions. Incorporation of the prior data offers little or no additional improvement in the quality of the PoC decision (Bayesian longitudinal analysis method). 30