Workshop 12: ARE MISSING DATA PROPERLY ACCOUNTED FOR IN HEALTH ECONOMICS AND OUTCOMES RESEARCH?

Similar documents
L applicazione dei metodi Bayesiani nella Farmacoeconomia

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Nonrespondent subsample multiple imputation in two-phase random sampling for nonresponse

Health utilities' affect you are reported alongside underestimates of uncertainty

Whether to use MMRM as primary estimand.

Basics of Modern Missing Data Analysis

2 Naïve Methods. 2.1 Complete or available case analysis

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang

MISSING or INCOMPLETE DATA

Some methods for handling missing values in outcome variables. Roderick J. Little

Analyzing Pilot Studies with Missing Observations

Bayesian methods for missing data: part 1. Key Concepts. Nicky Best and Alexina Mason. Imperial College London

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Richard D Riley was supported by funding from a multivariate meta-analysis grant from

7 Sensitivity Analysis

Missing Data Issues in the Studies of Neurodegenerative Disorders: the Methodology

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Latent Variable Model for Weight Gain Prevention Data with Informative Intermittent Missingness

Selection on Observables: Propensity Score Matching.

A comparison of fully Bayesian and two-stage imputation strategies for missing covariate data

Methods for estimating complier average causal effects for cost-effectiveness analysis

MISSING or INCOMPLETE DATA

Accepted Manuscript. Comparing different ways of calculating sample size for two independent means: A worked example

Fast approximations for the Expected Value of Partial Perfect Information using R-INLA

Alexina Mason. Department of Epidemiology and Biostatistics Imperial College, London. 16 February 2010

Longitudinal analysis of ordinal data

Comparison of multiple imputation methods for systematically and sporadically missing multilevel data

Shu Yang and Jae Kwang Kim. Harvard University and Iowa State University

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Extending causal inferences from a randomized trial to a target population

A Note on Bayesian Inference After Multiple Imputation

Reconstruction of individual patient data for meta analysis via Bayesian approach

Chapter 4. Parametric Approach. 4.1 Introduction

Fractional Imputation in Survey Sampling: A Comparative Review

ANALYSIS OF ORDINAL SURVEY RESPONSES WITH DON T KNOW

A weighted simulation-based estimator for incomplete longitudinal data models

Modelling Dropouts by Conditional Distribution, a Copula-Based Approach

Mapping non-preference onto preference-based PROMs Patient-reported outcomes measures (PROMs) in health economics

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

PACKAGE LMest FOR LATENT MARKOV ANALYSIS

Individualized Treatment Effects with Censored Data via Nonparametric Accelerated Failure Time Models

Pooling multiple imputations when the sample happens to be the population.

6 Pattern Mixture Models

Estimating and Using Propensity Score in Presence of Missing Background Data. An Application to Assess the Impact of Childbearing on Wellbeing

Multiple Imputation for Missing Data in Repeated Measurements Using MCMC and Copulas

Statistical Practice

Propensity Score Weighting with Multilevel Data

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Structural Nested Mean Models for Assessing Time-Varying Effect Moderation. Daniel Almirall

Variational Bayesian Inference for Parametric and Non-Parametric Regression with Missing Predictor Data

Longitudinal + Reliability = Joint Modeling

Known unknowns : using multiple imputation to fill in the blanks for missing data

Downloaded from:

Bayesian Hierarchical Models

Discussion of Missing Data Methods in Longitudinal Studies: A Review by Ibrahim and Molenberghs

Case Study in the Use of Bayesian Hierarchical Modeling and Simulation for Design and Analysis of a Clinical Trial

A Course in Applied Econometrics Lecture 18: Missing Data. Jeff Wooldridge IRP Lectures, UW Madison, August Linear model with IVs: y i x i u i,

Planned Missingness Designs and the American Community Survey (ACS)

Causal Mediation Analysis in R. Quantitative Methodology and Causal Mechanisms

Time-Invariant Predictors in Longitudinal Models

A Sampling of IMPACT Research:

Multivariate Survival Analysis

A simple bivariate count data regression model. Abstract

Introduction to Statistical Analysis

Bayes methods for categorical data. April 25, 2017

Inferences on missing information under multiple imputation and two-stage multiple imputation

Practical considerations for survival models

Sparse Linear Models (10/7/13)

Bayesian non-parametric model to longitudinally predict churn

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

MODEL ASSESSMENT FOR MODELS WITH MISSING DATA. Xiaolei Zhou. Chapel Hill 2015

Estimating Causal Effects of Organ Transplantation Treatment Regimes

Using modern statistical methodology for validating and reporti. Outcomes

Correction of the likelihood function as an alternative for imputing missing covariates. Wojciech Krzyzanski and An Vermeulen PAGE 2017 Budapest

Discussion of Identifiability and Estimation of Causal Effects in Randomized. Trials with Noncompliance and Completely Non-ignorable Missing Data

Bayesian Methods for Machine Learning

Core Courses for Students Who Enrolled Prior to Fall 2018

Improving Efficiency of Inferences in Randomized Clinical Trials Using Auxiliary Covariates

Sampling bias in logistic models

Multi-state Models: An Overview

MISSING DATA IN REGRESSION MODELS FOR NON-COMMENSURATE MULTIPLE OUTCOMES

Factor Analytic Models of Clustered Multivariate Data with Informative Censoring (refer to Dunson and Perreault, 2001, Biometrics 57, )

PIRLS 2016 Achievement Scaling Methodology 1

Comparison between conditional and marginal maximum likelihood for a class of item response models

Methods for Handling Missing Data

Analysing longitudinal data when the visit times are informative

Time-Invariant Predictors in Longitudinal Models

Stat 542: Item Response Theory Modeling Using The Extended Rank Likelihood

Dose-response modeling with bivariate binary data under model uncertainty

Lecture 7 Time-dependent Covariates in Cox Regression

Multilevel Statistical Models: 3 rd edition, 2003 Contents

Estimation of Optimal Treatment Regimes Via Machine Learning. Marie Davidian

Joint Modeling of Longitudinal Item Response Data and Survival

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

[Part 2] Model Development for the Prediction of Survival Times using Longitudinal Measurements

Pattern Mixture Models for the Analysis of Repeated Attempt Designs

Introduction to mtm: An R Package for Marginalized Transition Models

arxiv: v1 [stat.me] 27 Feb 2017

Bayes: All uncertainty is described using probability.

Transcription:

Workshop 12: ARE MISSING DATA PROPERLY ACCOUNTED FOR IN HEALTH ECONOMICS AND OUTCOMES RESEARCH? ISPOR Europe 2018 Barcelona, Spain Tuesday, 13 November 2018 15:30 16:30

ISPOR Statistical Methods in HEOR Special Interest Group (SIG) Mission: To provide statistical leadership for strengthening the use of appropriate statistical methodology in health economics and outcomes research and improve the analytic techniques used in real world data analysis. Co-Chairs of SIG Rita M. Kristy, MS, Senior Director, Medical Affairs Statistics, Astellas Pharma Global Development, Northbrook, IL, USA David J. Vanness, PhD, Professor, Health Policy and Administration, The Pennsylvania State University, State College, PA, USA Co-Chairs of ISPOR Missing Data in HEOR Working Group Necdet Gunsoy, PhD, MPH, Director of Analytics and Innovation for Value Evidence and Outcomes at GlaxoSmithKline (GSK), England, United Kingdom Gianluca Baio, PhD, MSc, Reader in Statistics and Health Economics, University College London (UCL), England, United Kingdom 2

SECTION Summary of methods David J. Vanness, PhD Professor, Health Policy and Administration The Pennsylvania State University

Notation and Conventions Let U = C 1,, C n, E 1,, E n, X 1,, X n be a vector random variable consisting of the true costs C, effectiveness measures E, and covariates X for individuals i = 1, N. C, E, and X are each possibly multidimensional, comprising, different types j = 1, J, k = 1, K, l = 1, L and time points t = 1, T of cost c jt, outcome e kt and covariate x lt measurements. Let U f θ, where θ parameterizes the joint distribution f of costs, effectiveness and covariates. θ is the key object of inference for cost-effectiveness analysis 4

U f θ 5

Notation and conventions (continued) Let M = (m c111, m cjtn, m e111,, m ektn, m x111,, m xltn ) be a vector variable of binary missing-data indicators, where for example, m cjti = 0 if cost type j is missing for subject i at time t, and = 1 if it is observed. Let M π φ, where φ parameterizes the process that causes missing data (Rubin DB. Inference and missing data. Biometrika 1976;63:581 592.) Let V = M U + 1 M NA be the vector variable of outcomes and covariates as collected, with unobserved values replaced by NA. Given an actual realization of a missing data pattern m, partition each element of U (the true data) and of V (the collected data) into components u 0, u 1 and v 0, v 1 corresponding to values where m = 0 (missing), and m = 1 (observed), respectively. u (1) = v (1) is known, but u (0) is unobserved. 6

V π φ f θ 7

V π φ f θ m = 0 8 v (1) v (0)

U f θ 9 u (1) u (0)

Key Question: what can we infer about the f θ that generates true data U given that we observe π φ f θ that generates the collected data V? Ignorable missingness: π φ f θ : the process that generates the missingness is distinct from the process that generates the true data. (Missing Completely At Random MCAR) π φ f θ v 1 : the process that generates the missingness is distinct from the process that generates the true data, conditional on the observed data. (Missing at Random MAR) Non-ignorable missingness π φ f θ ( v 1, u 0 ) : the process that generates the missingness is only distinct from the process that generates the true data after conditioning on observed and unobserved data. (Missing Not at Random MNAR) Means we cannot ignore the process that generated the missing data. 10

A taxonomy of approaches to missing data: Weighting CCA / Deletion IPW Unconditional LOCF Joint Model Single Multiple Ignorable Imputation Regression Hot-Deck Chained Equations Single Non-Stochastic (Single) Stochastic Single Multiple Nearest- Neighbor Multiple Direct Analysis ACA Maximum Likelihood Full Joint Model EM Algorithm 11 Non- Ignorable Ignorable Methods + Sensitivity Analysis Pattern Mixture Models Additive Proportional Fully Bayesian

SECTION How is missing data accounted for in HEOR? Necdet Gunsoy, PhD, MPH Director, Analytics and Innovation GlaxoSmithKline (GSK)

What is special about HEOR? Regulatory literature is mostly focussed on missing data for clinical outcomes Most HEOR is focussed upon cost, health-related quality of life (HRQoL), and utility data Cost and utility data have very particular distributions (e.g. highly skewed cost data with many 0s, EQ-5D index data with many 1s) Cost and utility/hrqol can be strongly correlated (e.g. 0 cost associated with high utility) HEOR requires more tailored approaches to report and account for missing data 13

What is special about HEOR? HEOR requires more tailored approaches to report and account for missing data But very few studies focus on missing data for HEOR 14

Missing data methods in HEOR A literature review was conducted to understand methodological approaches used to account for missing data in cost-effectiveness analyses (CEA) 16 studies were identified 15

Missing data methods in HEOR 16 studies were identified All data (outcomes and explanatory variables) Generic outcome What type of missing data was addressed? Cost Cost & HRQoL jointly The majority of studies looked at cost and/or HRQoL specifically HRQoL (incl. PROs) 16

Missing data methods in HEOR 16 studies were identified Registries In what context was missing data addressed? RCTs The majority of studies looked at economic evaluations alongside RCTs Not specified 17

Missing data methods in HEOR 16 studies were identified What type of method was used to address missing data? Multiple imputation Other Inverse probability weighting 9 studies used multiple imputation >10 different types of MI approaches were investigated 18 Pattern mixture NOTE: some articles were review articles or provided general guidance

Missing data methods in HEOR 16 studies were identified Was any guidance provided in the use of methods to account for missing data? Guidance 3 studies provided practical guidance 19

Missing data methods in HEOR Few studies have investigated methodological approach to account for missing data specifically in the context of CEA The majority of studies are focussed on outcomes (vs. also explanatory factors), data from RCTs, and on MI method There was a large variety of data manipulation and MI methods used 20

SECTION Overview of Guidance Nneka Onwudiwe, MBA, PharmD, PhD Senior Scientific Reviewer US Food and Drug Administration

22 Professional Biography

Disclaimer The views expressed in this presentation are those of the speaker. Therefore, nothing in this presentation should be construed to represent FDA s views or policies. 23

Terminology Missingness the existence of missing data and the mechanism that explains the reason for the data being missing Missing data mechanisms MCAR MAR MNAR Proportion of missing data directly related to the quality of statistical inferences Missing data occur at two levels Unit level or item level Patterns of missing data Univariate, monotone, arbitrary Statistical methods Direct imputation (LOCF, BOCF), MMRM, MI, weighting, etc. Assumptions and patterns of missingness to determine statistical methods MCAR, MAR, MNAR assumptions of analytic models 24

To provide a practical guidance on how to handle missing data in within-trial CEAs: the analysis should be based on a plausible assumption for the missing data mechanism the method chosen for the basecase should fit with the assumed mechanism sensitivity analysis should be conducted to explore to what extent the results change with the assumption made 25

26 Three stages in the analysis: Stage 1 descriptive analysis to inform the assumption on the missing data mechanism amount of missing data by trial group at each follow-up period missing data patterns association between missingness and baseline variables association between missingness and observed outcomes Stage 2how to choose between alternative methods given their underlying assumptions should fit with the assumption regarding the missing data mechanism and account for the uncertainty handle the particular characteristics of CEA data

Stage 2how to choose, cont d. Missing Baseline Values mean imputation and MI are suggested options Complete Case Analysis, Available Case Analysis and Inverse Probability Weighting CCA are valid under MCAR IPW is suitable for a monotonic pattern of missing data Single Imputation mean imputation valid for missing baseline variables Last-value carried forward (LVCF) can bias parameter estimates single imputation methods are not appropriate to handle missing data on outcomes Multiple Imputation MI can handle both monotonic and nonmonotonic missing data under MAR and can be modified to handle MNAR two approaches to implementing MI: joint modelling (MI-JM) and chained equations (MICE) MI-JM assumes multivariate normal distribution MICE accommodates non-normal distributions 27

28 Stage 2how to choose, cont d. Likelihood-Based Methods Likelihood-based models assume MAR conditional on the variables Likelihood-based methods can produce similar results to MI rely on the correct specification of the model, including its parametric assumptions wrong specification of the model may impact results Stage 3methods for sensitivity analysis to MAR selection models and pattern mixture approaches Selection models using a weighting approach tends to fail for large departures from MAR

29

30

31 This article reviews how missing cost effectiveness data were addressed in trial based CEA by examining: the extent of missing data how these were addressed in the analysis whether sensitivity analyses to different missing data assumptions were performed The article also provides a critical review of findings and recommendations to improve practice

32

33

34

35

36

Recommendations Alternative sources should also be considered to minimize missing information, for example, administrative data or electronic health records Report details of the pattern of missing data The choice of CCA for the primary analysis approach is difficult to justify in the presence of repeated measurements Consider approaches valid under more plausible MAR assumptions and making use of all the observed data, such as MI Likelihood based repeated measures models Bayesian models Appropriate methods for sensitivity analysis Selection models Pattern mixture models 37

Objective: This paper presents practical guidance on the choice of MI models for handling missing PROMs data based on the characteristics of the trial dataset. The comparative performance of complete cases analysis. Methods: Realistic missing at random data were simulated using follow-up data from an RCT considering three different PROMs (Oxford Knee Score (OKS), EuroQoL 5 Dimensions 3 Levels (EQ-5D- 3L), 12-item Short Form Survey (SF-12)). Data were multiply imputed at the item (using ordinal logit and predicted mean matching models), sub-scale and score level; unadjusted mean outcomes, as well as treatment effects from linear regression models were obtained for 1000 simulations. Performance was assessed by root mean square errors (RMSE) and mean absolute errors (MAE). 38

Instruments The 5-year follow-up data for three patient reported outcome measures: The Oxford Knee Score (OKS): an instrument designed to assess outcomes following a knee replacement in RCTs. It consists of 12 five level items, and the composite score ranges from 0 to 48, with higher scores indicating better outcomes. The SF-12: a 12-item generic health measure, with higher scores indicating better outcomes. The SF-12 generates two subscales, the physical component summary score (PCS) and the mental health component summary score (MCS). EQ-5D-3L: a utility questionnaire assessing participants health state based on their mobility, self-care, usual activities, pain/discomfort and anxiety/depression. Scores of 1 indicate full health, 0 indicates a health state equal to death, and scores lower than 0 indicate health states worse than death. 39

40

41

42

43

44

Recommendations Imputation at the item/subscale level may provide more precise estimates of treatment effect compared to the imputation at the composite score level or CCA Imputation at the item/subscale level is often infeasible and prone to convergence Appropriate sensitivity analysis to assess the impact of missing data 45

SECTION Fully Bayesian MI as an Alternative Gianluca Baio, PhD, MSc Professor of Statistics and Health Economics University College London (UCL)

Health technology assessment (HTA) Objective: Combine costs & benefits of a given intervention into a rational scheme for allocating resources, increasingly often under a Bayesian framework 1

Health technology assessment (HTA) Objective: Combine costs & benefits of a given intervention into a rational scheme for allocating resources, increasingly often under a Bayesian framework Statistical model Estimates relevant population parameters θ Varies with the type of available data (& statistical approach!) 1

Health technology assessment (HTA) Objective: Combine costs & benefits of a given intervention into a rational scheme for allocating resources, increasingly often under a Bayesian framework e = fe(θ) c = fc(θ)... Statistical model Economic model Estimates relevant population parameters θ Varies with the type of available data (& statistical approach!) Combines the parameters to obtain a population average measure for costs and clinical benefits Varies with the type of available data & statistical model used 1

Health technology assessment (HTA) Objective: Combine costs & benefits of a given intervention into a rational scheme for allocating resources, increasingly often under a Bayesian framework Statistical model e = fe(θ) c = fc(θ)... Economic model ICER = g( e, c) EIB = h( e, c; k)... Decision analysis 1 Estimates relevant population parameters θ Varies with the type of available data (& statistical approach!) Combines the parameters to obtain a population average measure for costs and clinical benefits Varies with the type of available data & statistical model used Summarises the economic model by computing suitable measures of cost-effectiveness Dictates the best course of actions, given current evidence Standardised process

Health technology assessment (HTA) Objective: Combine costs & benefits of a given intervention into a rational scheme for allocating resources, increasingly often under a Bayesian framework Uncertainty analysis Assesses the impact of uncertainty (eg in parameters or model structure) on the economic results Mandatory in many jurisdictions (including NICE) Fundamentally Bayesian! Statistical model e = fe(θ) c = fc(θ)... Economic model ICER = g( e, c) EIB = h( e, c; k)... Decision analysis 1 Estimates relevant population parameters θ Varies with the type of available data (& statistical approach!) Combines the parameters to obtain a population average measure for costs and clinical benefits Varies with the type of available data & statistical model used Summarises the economic model by computing suitable measures of cost-effectiveness Dictates the best course of actions, given current evidence Standardised process

Standard Statistical modelling Individual level data in HTA Demographics HRQL data Resource use data Clinical outcome ID Trt Sex Age... u 0 u 1... u J c 0 c 1... c J y 0 y 1... y J 1 1 M 23... 0.32 0.66... 0.44 103 241... 80 y 10 y 11... y 1J 2 1 M 21... 0.12 0.16... 0.38 1 204 1 808... 877 y 20 y 21... y 2J 3 2 F 19... 0.49 0.55... 0.88 16 12... 22 y 30 y 31... y 3J................................................... y ij = Survival time, event indicator (eg CVD), number of events, continuous measurement (eg blood pressure),... u ij = Utility-based score to value health (eg EQ-5D, SF-36, Hospital Anxiety & Depression Scale),... c ij = Use of resources (drugs, hospital, GP appointments,... ) 2

Standard Statistical modelling Individual level data in HTA Demographics HRQL data Resource use data Clinical outcome ID Trt Sex Age... u 0 u 1... u J c 0 c 1... c J y 0 y 1... y J 1 1 M 23... 0.32 0.66... 0.44 103 241... 80 y 10 y 11... y 1J 2 1 M 21... 0.12 0.16... 0.38 1 204 1 808... 877 y 20 y 21... y 2J 3 2 F 19... 0.49 0.55... 0.88 16 12... 22 y 30 y 31... y 3J................................................... y ij = Survival time, event indicator (eg CVD), number of events, continuous measurement (eg blood pressure),... u ij = Utility-based score to value health (eg EQ-5D, SF-36, Hospital Anxiety & Depression Scale),... c ij = Use of resources (drugs, hospital, GP appointments,... ) 1 Compute individual QALYs and total costs as e i = J j=1 (u ij + u ij 1 ) δ j 2 J and c i = c ij, j=0 [ ] Timej Timej 1 with: δ j = Unit of time 2

Standard Statistical modelling Individual level data in HTA Demographics HRQL data Resource use data Clinical outcome ID Trt Sex Age... u 0 u 1... u J c 0 c 1... c J y 0 y 1... y J 1 1 M 23... 0.32 0.66... 0.44 103 241... 80 y 10 y 11... y 1J 2 1 M 21... 0.12 0.16... 0.38 1 204 1 808... 877 y 20 y 21... y 2J 3 2 F 19... 0.49 0.55... 0.88 16 12... 22 y 30 y 31... y 3J................................................... y ij = Survival time, event indicator (eg CVD), number of events, continuous measurement (eg blood pressure),... u ij = Utility-based score to value health (eg EQ-5D, SF-36, Hospital Anxiety & Depression Scale),... c ij = Use of resources (drugs, hospital, GP appointments,... ) 1 Compute individual QALYs and total costs as e i = J j=1 (u ij + u ij 1 ) δ j 2 J and c i = c ij, j=0 [ ] Timej Timej 1 with: δ j = Unit of time 2 (Often implicitly) assume normality and linearity and model independently individual QALYs and total costs by controlling for baseline values e i = α e0 + α e1 u 0i + α e2 Trt i + ε ei [+...], ε ei Normal(0, σ e) c i = α c0 + α c1 c 0i + α c2 Trt i + ε ci [+...], ε ci Normal(0, σ c) 2

Standard Statistical modelling Individual level data in HTA Demographics HRQL data Resource use data Clinical outcome ID Trt Sex Age... u 0 u 1... u J c 0 c 1... c J y 0 y 1... y J 1 1 M 23... 0.32 0.66... 0.44 103 241... 80 y 10 y 11... y 1J 2 1 M 21... 0.12 0.16... 0.38 1 204 1 808... 877 y 20 y 21... y 2J 3 2 F 19... 0.49 0.55... 0.88 16 12... 22 y 30 y 31... y 3J................................................... y ij = Survival time, event indicator (eg CVD), number of events, continuous measurement (eg blood pressure),... u ij = Utility-based score to value health (eg EQ-5D, SF-36, Hospital Anxiety & Depression Scale),... c ij = Use of resources (drugs, hospital, GP appointments,... ) 1 Compute individual QALYs and total costs as e i = J j=1 (u ij + u ij 1 ) δ j 2 J and c i = c ij, j=0 [ ] Timej Timej 1 with: δ j = Unit of time 2 (Often implicitly) assume normality and linearity and model independently individual QALYs and total costs by controlling for baseline values e i = α e0 + α e1 u 0i + α e2 Trt i + ε ei [+...], ε ei Normal(0, σ e) c i = α c0 + α c1 c 0i + α c2 Trt i + ε ci [+...], ε ci Normal(0, σ c) 3 Estimate population average cost and effectiveness differentials and use bootstrap to quantify uncertainty 2

What s wrong with this?... Potential correlation between costs & clinical benefits Strong positive correlation effective treatments are innovative and result from intensive and lengthy research are associated with higher unit costs Negative correlation more effective treatments may reduce total care pathway costs e.g. by reducing hospitalisations, side effects, etc. Because of the way in which standard models are set up, bootstrapping generally only approximates the underlying level of correlation Bayesian methods usually do a better job! 3

What s wrong with this?... Potential correlation between costs & clinical benefits Strong positive correlation effective treatments are innovative and result from intensive and lengthy research are associated with higher unit costs Negative correlation more effective treatments may reduce total care pathway costs e.g. by reducing hospitalisations, side effects, etc. Because of the way in which standard models are set up, bootstrapping generally only approximates the underlying level of correlation Bayesian methods usually do a better job! Joint/marginal normality not realistic Particularly for small/pilot studies Costs usually skewed and benefits may be bounded in [0; 1] Can use transformation (e.g. logs) but care is needed when back transforming to the natural scale Should use more suitable models (e.g. Beta, Gamma or log-normal) generally easier under a Bayesian framework 3

What s wrong with this?... Potential correlation between costs & clinical benefits Strong positive correlation effective treatments are innovative and result from intensive and lengthy research are associated with higher unit costs Negative correlation more effective treatments may reduce total care pathway costs e.g. by reducing hospitalisations, side effects, etc. Because of the way in which standard models are set up, bootstrapping generally only approximates the underlying level of correlation Bayesian methods usually do a better job! Joint/marginal normality not realistic Particularly for small/pilot studies Costs usually skewed and benefits may be bounded in [0; 1] Can use transformation (e.g. logs) but care is needed when back transforming to the natural scale Should use more suitable models (e.g. Beta, Gamma or log-normal) generally easier under a Bayesian framework... and of course Partially Observed data Can have item and/or unit non-response Missingness may occur in either or both benefits/costs The missingness mechanisms may also be correlated What exactly to adjust for, at baseline available vs complete cases! Particularly for small/pilot studies 3

Bayesian approach to HTA In general can represent the joint distribution as p(e, c) = p(e)p(c e) = p(c)p(e c) 4

Bayesian approach to HTA In general can represent the joint distribution as p(e, c) = p(e)p(c e) = p(c)p(e c) Marginal model for e µ e [...] φ ie e i p(e φ ei, τ e) g e(φ ei) = α 0 [+...] µ e = g 1 e (α0) e i τ e φ ei = location τ e = ancillary 4

Bayesian approach to HTA In general can represent the joint distribution as p(e, c) = p(e)p(c e) = p(c)p(e c) c i p(c e, φ ci, τ c) g c(φ ci) = β 0 + β 1(e i µ e) [+...] µ c = g 1 c (β0) Conditional model for c [...] τ c µ c φ ic β 1 Marginal model for e µ e [...] φ ie e i p(e φ ei, τ e) g e(φ ei) = α 0 [+...] µ e = g 1 e (α0) φ ci = location τ c = ancillary c i e i τ e φ ei = location τ e = ancillary 4

Bayesian approach to HTA In general can represent the joint distribution as p(e, c) = p(e)p(c e) = p(c)p(e c) c i p(c e, φ ci, τ c) g c(φ ci) = β 0 + β 1(e i µ e) [+...] µ c = g 1 c (β0) Conditional model for c [...] τ c µ c φ ic β 1 Marginal model for e µ e [...] φ ie e i p(e φ ei, τ e) g e(φ ei) = α 0 [+...] µ e = g 1 e (α0) φ ci = conditional mean τ c = conditional variance c i e i τ e φ ei = marginal mean τ e = marginal variance For example: e i Normal(φ ei, τ e), φ ei = α 0 [+... ], µ e = α 0 c i e i Normal(φ ci, τ c), φ ci = β 0 + β 1 (e i µ e) [+...], µ c = β 0 4

Bayesian approach to HTA In general can represent the joint distribution as p(e, c) = p(e)p(c e) = p(c)p(e c) Conditional model for c Marginal model for e c i p(c e, φ ci, τ c) [...] µ c β 1 µ e [...] e i p(e φ ei, τ e) g c(φ ci) = β 0 + β 1(e i µ e) [+...] g e(φ ei) = α 0 [+...] µ c = g 1 c (β0) τ c φ ic φ ie µ e = g 1 e (α0) φ ci = conditional mean τ c = shape τ c/φ ci = rate c i e i τ e φ ei = marginal mean τ e = marginal scale For example: e i Beta(φ ei τ e, (1 φ ei )τ e), logit(φ ei ) = α 0 [+... ], µ e = exp(α 0) 1+exp(α 0 ) c i e i Gamma(τ c, τ c/φ ci ), log(φ ci ) = β 0 + β 1 (e i µ e) [+...], µ c = exp(β 0 ) 4

Bayesian approach to HTA In general can represent the joint distribution as p(e, c) = p(e)p(c e) = p(c)p(e c) Conditional model for c Marginal model for e c i p(c e, φ ci, τ c) [...] µ c β 1 µ e [...] e i p(e φ ei, τ e) g c(φ ci) = β 0 + β 1(e i µ e) [+...] g e(φ ei) = α 0 [+...] µ c = g 1 c (β0) τ c φ ic φ ie µ e = g 1 e (α0) φ ci = conditional mean τ c = shape τ c/φ ci = rate c i e i τ e φ ei = marginal mean τ e = marginal scale For example: e i Beta(φ ei τ e, (1 φ ei )τ e), logit(φ ei ) = α 0 [+... ], µ e = exp(α 0) 1+exp(α 0 ) c i e i Gamma(τ c, τ c/φ ci ), log(φ ci ) = β 0 + β 1 (e i µ e) [+...], µ c = exp(β 0 ) Combining modules and fully characterising uncertainty about deterministic functions of random quantities is relatively straightforward using MCMC 4

Bayesian approach to HTA In general can represent the joint distribution as p(e, c) = p(e)p(c e) = p(c)p(e c) Conditional model for c Marginal model for e c i p(c e, φ ci, τ c) [...] µ c β 1 µ e [...] e i p(e φ ei, τ e) g c(φ ci) = β 0 + β 1(e i µ e) [+...] g e(φ ei) = α 0 [+...] µ c = g 1 c (β0) τ c φ ic φ ie µ e = g 1 e (α0) φ ci = conditional mean τ c = shape τ c/φ ci = rate c i e i τ e φ ei = marginal mean τ e = marginal scale For example: e i Beta(φ ei τ e, (1 φ ei )τ e), logit(φ ei ) = α 0 [+... ], µ e = exp(α 0) 1+exp(α 0 ) c i e i Gamma(τ c, τ c/φ ci ), log(φ ci ) = β 0 + β 1 (e i µ e) [+...], µ c = exp(β 0 ) 4 Combining modules and fully characterising uncertainty about deterministic functions of random quantities is relatively straightforward using MCMC Prior information can help stabilise inference (especially with sparse data!), eg Cancer patients are unlikely to survive as long as the general population ORs are unlikely to be greater than ±5

Known unknowns?... Gabrio et al. (2017). PharmacoEconomics Open, 1(2), 79-97 Missing cost (2003-2009) Missing cost (2009-2015) Missing effectiveness (2003-2009) Missing effectiveness (2009-2015) 5

(Bayesian) Missing data in HTA MCAR (e, c) Model of missingness for c Model of analysis for (c, e) Model of missingness for e Selection models γ c x ci µ c µ e x ei γ e π ci ψ c φ ic φ ie ψ e π ei m ci c i e i m ei Partially observed data Unobservable parameters Deterministic function of random quantities Fully observed, unmodelled data Fully observed, modelled data 6 m ei Bernoulli(π ei ); m ci Bernoulli(π ci ); logit(π ei ) = γ e0 logit(π ci ) = γ c0

(Bayesian) Missing data in HTA MAR (e, c) Model of missingness for c Model of analysis for (c, e) Model of missingness for e Selection models γ c x ci µ c µ e x ei γ e π ci ψ c φ ic φ ie ψ e π ei m ci c i e i m ei Partially observed data Unobservable parameters Deterministic function of random quantities Fully observed, unmodelled data Fully observed, modelled data 6 m ei Bernoulli(π ei ); m ci Bernoulli(π ci ); logit(π ei ) = γ e0 + K k=1 γ ekx eik logit(π ci ) = γ c0 + H h=1 γ chx cih

(Bayesian) Missing data in HTA MNAR (e, c) Model of missingness for c Model of analysis for (c, e) Model of missingness for e Selection models γ c x ci µ c µ e x ei γ e π ci ψ c φ ic φ ie ψ e π ei m ci c i e i m ei Partially observed data Unobservable parameters Deterministic function of random quantities Fully observed, unmodelled data Fully observed, modelled data 6 m ei Bernoulli(π ei ); m ci Bernoulli(π ci ); logit(π ei ) = γ e0 + K k=1 γ ekx eik + γ ek+1 e i logit(π ci ) = γ c0 + H h=1 γ chx cih + γ ch+1 c i

(Bayesian) Missing data in HTA MNAR e; MAR c Model of missingness for c Model of analysis for (c, e) Model of missingness for e Selection models γ c x ci µ c µ e x ei γ e π ci ψ c φ ic φ ie ψ e π ei m ci c i e i m ei Partially observed data Unobservable parameters Deterministic function of random quantities Fully observed, unmodelled data Fully observed, modelled data 6 m ei Bernoulli(π ei ); m ci Bernoulli(π ci ); logit(π ei ) = γ e0 + K k=1 γ ekx eik + γ ek+1 e i logit(π ci ) = γ c0 + H h=1 γ chx cih

(Bayesian) Missing data in HTA MAR e; MNAR c Model of missingness for c Model of analysis for (c, e) Model of missingness for e Selection models γ c x ci µ c µ e x ei γ e π ci ψ c φ ic φ ie ψ e π ei m ci c i e i m ei Partially observed data Unobservable parameters Deterministic function of random quantities Fully observed, unmodelled data Fully observed, modelled data 6 m ei Bernoulli(π ei ); m ci Bernoulli(π ci ); logit(π ei ) = γ e0 + K k=1 γ ekx eik logit(π ci ) = γ c0 + H h=1 γ chx cih + γ ch+1 c i

Motivating example: MenSS trial The MenSS pilot RCT evaluates the cost-effectiveness of a new digital intervention to reduce the incidence of STI in young men with respect to the SOC QALYs calculated from utilities (EQ-5D 3L) Total costs calculated from different components (no baseline) 7

Motivating example: MenSS trial Partially observed data The MenSS pilot RCT evaluates the cost-effectiveness of a new digital intervention to reduce the incidence of STI in young men with respect to the SOC QALYs calculated from utilities (EQ-5D 3L) Total costs calculated from different components (no baseline) Time Type of outcome observed (%) observed (%) Control (n 1=75) Intervention (n 2=84) Baseline utilities 72 (96%) 72 (86%) 3 months utilities and costs 34 (45%) 23 (27%) 6 months utilities and costs 35 (47%) 23 (27%) 12 months utilities and costs 43 (57%) 36 (43%) Complete cases utilities and costs 27 (44%) 19 (23%) 7

Motivating example: MenSS trial Skewness & structural values The MenSS pilot RCT evaluates the cost-effectiveness of a new digital intervention to reduce the incidence of STI in young men with respect to the SOC QALYs calculated from utilities (EQ-5D 3L) Total costs calculated from different components (no baseline) Frequency 0 2 4 6 8 10 n 1 = 27 mean = 0.904 median = 0.931 Frequency 0 2 4 6 8 10 n 1 = 27 mean = 208 median = 123 0.5 0.6 0.7 0.8 0.9 1.0 0 200 400 600 800 1000 QALYs costs ( ) Frequency 0 2 4 6 8 10 n 2 = 19 mean = 0.902 median = 0.943 Frequency 0 2 4 6 8 10 n 2 = 19 mean = 189 median = 183 7 0.5 0.6 0.7 0.8 0.9 1.0 QALYs 0 200 400 600 800 1000 costs ( )

Cost-effectiveness analysis (1) Normal & Independent outcomes Cost Effectiveness Acceptability Curve CC ICER=8368 ICER= 473 1.0 400 Cost Effectiveness Plane e = 0.002 ( 0.045,0.042) 400 0.6 0.2 0.4 Probability of Cost Effectiveness 0 200 Cost differential 200 0.8 c = 18 ( 125,89) AC k=20000 e = 0.041 ( 0.004,0.086) 0.2 0.1 0.0 Effectiveness differential 8 0.1 CC AC 0.0 c = 18 ( 125,89) 0.2 0 5000 10000 15000 Willingness to pay 20000 25000 30000

Modelling 1 Bivariate Normal Simpler and closer to standard frequentist model Account for correlation between QALYs and costs µ ct βt Conditional model for c e c it e it Normal(φ cit, ψ ct ) φ cit = µ ct + β t (e it µ et ) ψ ct φ ict Gabrio et al. (2018). https://arxiv.org/abs/1801.09541 + SiM (in press) µ et u 0it α t Marginal model for e e it Normal(φ eit, ψ et ) φ iet φ eit = µ et + α t (u 0it ū 0t ) φ eit = µ et + α t u 0it c it e it ψ et 9

Modelling 1 Bivariate Normal Simpler and closer to standard frequentist model Account for correlation between QALYs and costs 2 Beta-Gamma Gabrio et al. (2018). https://arxiv.org/abs/1801.09541 + SiM (in press) Account for correlation between outcomes Model the relevant ranges: QALYs (0, 1) and costs (0, ) But: needs to rescale observed data e it = (eit ɛ) to avoid spikes at 1 Conditional model for c e c it e it Gamma(ψ ctφ cit, ψ ct ) µ ct βt µ et u 0it α t log(φ cit ) = µ ct + β t (e it µ et) Marginal model for e ψ ct φ ict φ iet e it Beta (φ eitψ et, (1 φ eit )ψ et ) logit(φ eit ) = µ et + α t (u 0it ū 0t ) φ eit = µ et + α t u 0it c it e it ψ et 9

Modelling 9 1 Bivariate Normal Simpler and closer to standard frequentist model Account for correlation between QALYs and costs 2 Beta-Gamma Account for correlation between outcomes Model the relevant ranges: QALYs (0, 1) and costs (0, ) But: needs to rescale observed data e it = (eit ɛ) to avoid spikes at 1 3 Hurdle model Gabrio et al. (2018). https://arxiv.org/abs/1801.09541 + SiM (in press) Model e it as a mixture to account for correlation between outcomes, model the relevant ranges and account for structural values May expand to account for partially observed baseline utility u 0it µ ct βt µ <1 u et 0it α Conditional model for c e t c it e it Gamma(ψ ctφ cit, ψ ct ) log(φ cit ) = µ ct + β t (e it µ et) ψ ct φ ict µ et φ iet ψ et Mixture model for e e 1 it := 1 e <1 e <1 it e 1 it Beta (φ eit ψ et, (1 φ eit )ψ et ) it logit(φ eit ) = µ <1 et + α t(u 0it ū 0t ) logit(φ eit ) = µ <1 et + α tu 0it π it Model for the structural ones d it := I(e it = 1) Bernoulli(π it ) logit(π it ) = X it η t c it X it d it e it η t e it = π ite 1 it + (1 π it)e <1 it µ et = (1 π t )µ <1 et + π t

Modelling 9 1 Bivariate Normal Simpler and closer to standard frequentist model Account for correlation between QALYs and costs 2 Beta-Gamma Account for correlation between outcomes Model the relevant ranges: QALYs (0, 1) and costs (0, ) But: needs to rescale observed data e it = (eit ɛ) to avoid spikes at 1 3 Hurdle model Gabrio et al. (2018). https://arxiv.org/abs/1801.09541 + SiM (in press) Model e it as a mixture to account for correlation between outcomes, model the relevant ranges and account for structural values May expand to account for partially observed baseline utility u 0it µ ct βt µ <1 u et 0it α Conditional model for c e t c it e it Gamma(ψ ctφ cit, ψ ct ) log(φ cit ) = µ ct + β t (e it µ et) ψ ct φ ict µ et φ iet ψ et Mixture model for e e 1 it := 1 e <1 e <1 it e 1 it Beta (φ eit ψ et, (1 φ eit )ψ et ) it logit(φ eit ) = µ <1 et + α t(u 0it ū 0t ) logit(φ eit ) = µ <1 et + α tu 0it π it Model for the structural ones d it := I(e it = 1) Bernoulli(π it ) logit(π it ) = X it η t c it X it d it e it η t e it = π ite 1 it + (1 π it)e <1 it µ et = (1 π t )µ <1 et + π t

Modelling 9 1 Bivariate Normal Simpler and closer to standard frequentist model Account for correlation between QALYs and costs 2 Beta-Gamma Account for correlation between outcomes Model the relevant ranges: QALYs (0, 1) and costs (0, ) But: needs to rescale observed data e it = (eit ɛ) to avoid spikes at 1 3 Hurdle model Gabrio et al. (2018). https://arxiv.org/abs/1801.09541 + SiM (in press) Model e it as a mixture to account for correlation between outcomes, model the relevant ranges and account for structural values May expand to account for partially observed baseline utility u 0it µ ct βt µ <1 u et 0it α Conditional model for c e t c it e it Gamma(ψ ctφ cit, ψ ct ) log(φ cit ) = µ ct + β t (e it µ et) ψ ct φ ict µ et φ iet ψ et Mixture model for e e 1 it := 1 e <1 e <1 it e 1 it Beta (φ eit ψ et, (1 φ eit )ψ et ) it logit(φ eit ) = µ <1 et + α t(u 0it ū 0t ) logit(φ eit ) = µ <1 et + α tu 0it π it Model for the structural ones d it := I(e it = 1) Bernoulli(π it ) logit(π it ) = X it η t c it X it d it e it η t e it = π ite 1 it + (1 π it)e <1 it µ et = (1 π t )µ <1 et + π t

Results Control Intervention Complete only vs all cases mean (90% HPD) mean (90% HPD) 0.90 (0.88; 0.93) 0.90 (0.87; 0.94) Hurdle Model 0.88 (0.85; 0.91) Hurdle Model 0.90 (0.86; 0.94) 0.88 (0.86; 0.91) 0.88 (0.85; 0.92) Beta Gamma 0.88 (0.85; 0.90) Beta Gamma 0.91 (0.88; 0.94) 0.90 (0.88; 0.93) 0.90 (0.87; 0.94) Bivariate Normal 0.87 (0.85; 0.90) Bivariate Normal 0.92 (0.88; 0.95) 0.75 0.80 0.85 0.90 0.95 1.00 QALYs 0.75 0.80 0.85 0.90 0.95 1.00 QALYs Complete cases only All cases (missing at random, MAR) 10

Results Control Intervention Complete only vs all cases mean (90% HPD) mean (90% HPD) Hurdle Model 220 (118; 329) 198 (111; 282) Hurdle Model 234 (93; 377) 193 (84; 307) Beta Gamma 231 (105; 347) 200 (111; 286) Beta Gamma 228 (91; 363) 189 (83; 303) Bivariate Normal 207 (128; 288) 234 (154; 321) Bivariate Normal 190 (123; 254) 187 (122; 256) 0 200 400 600 costs ( ) 0 200 400 600 costs ( ) Complete cases only All cases (missing at random, MAR) 10

Bivariate Normal Bayesian multiple imputation (under MAR) QALY values 0.0 0.2 0.4 0.6 0.8 1.0 QALY values 0.0 0.2 0.4 0.6 0.8 1.0 Imputed, observed baseline Imputed, missing baseline Observed 11 Individuals (n 1 = 75) Individuals (n 2 = 84)

Bivariate Normal Bayesian multiple imputation (under MAR) QALY values 0.0 0.2 0.4 0.6 0.8 1.0 QALY values 0.0 0.2 0.4 0.6 0.8 1.0 Beta-Gamma QALY values 0.0 0.2 0.4 0.6 0.8 1.0 QALY values 0.0 0.2 0.4 0.6 0.8 1.0 Imputed, observed baseline Imputed, missing baseline Observed 11 Individuals (n 1 = 75) Individuals (n 2 = 84)

Bivariate Normal Bayesian multiple imputation (under MAR) QALY values 0.0 0.2 0.4 0.6 0.8 1.0 QALY values 0.0 0.2 0.4 0.6 0.8 1.0 Beta-Gamma QALY values 0.0 0.2 0.4 0.6 0.8 1.0 QALY values 0.0 0.2 0.4 0.6 0.8 1.0 Hurdle model Imputed, observed baseline Imputed, missing baseline Observed 11 QALY values 0.0 0.2 0.4 0.6 0.8 1.0 Individuals (n 1 = 75) Individuals (n 2 = 84) QALY values 0.0 0.2 0.4 0.6 0.8 1.0

Cost-effectiveness analysis (2) More complex modelling K=20000 1000 500 0 500 1000 0.2 0.1 0.0 0.1 0.2 Effectiveness differential Cost differential Hurdle model Beta Gamma Bivariate Normal ICER 1627 ICER 594 ICER 1067 Cost Effectiveness Plane 0.00 0.25 0.50 0.75 1.00 0 10000 20000 30000 Willingness to pay Probability of Cost Effectiveness Hurdle model Beta Gamma Bivariate Normal MNAR1 MNAR2 MNAR3 MNAR4 Cost Effectiveness Acceptability Curve 12

missinghe: a R package to deal with missing data in HTA Objective: Run a set of complex models to account for different level of complexity & missingness pic diagnostics selection hurdle BCEA summary plot Gabrio et al. (2018). https://arxiv.org/abs/1801.09541 https://github.com/giabaio/missinghe 13

Conclusions A full Bayesian approach to handling missing data extends standard imputation methods Can consider MAR and MNAR with relatively little expansion to the basic model 14

Conclusions A full Bayesian approach to handling missing data extends standard imputation methods Can consider MAR and MNAR with relatively little expansion to the basic model Particularly helpful in cost-effectiveness analysis, to account for Asymmetrical distributions for the main outcomes Correlation between costs & benefits Structural values (eg spikes at 1 for utilities or spikes at 0 for costs) 14

Conclusions A full Bayesian approach to handling missing data extends standard imputation methods Can consider MAR and MNAR with relatively little expansion to the basic model Particularly helpful in cost-effectiveness analysis, to account for Asymmetrical distributions for the main outcomes Correlation between costs & benefits Structural values (eg spikes at 1 for utilities or spikes at 0 for costs) Need specialised software + coding skills R package missinghe under development to implement a set of general models Preliminary work available at https://github.com/giabaio/missinghe prior = list("mu.prior.e"=mu.prior.e,"delta.prior.e"=delta.prior.e) model = run_model(data=data,model.eff=e~1,model.cost=c~1, dist_e="norm",dist_c="norm",type="mnar_eff",stand=false, program="jags",forward=false,prob=c(0.05,0.95),n.chains=2,n.iter=20000, n.burnin=floor(20000/2),inits=null,n.thin=1,save_model=false,prior=prior ) Eventually, will be able to combine with existing packages (eg BCEA: http://www.statistica.it/gianluca/bcea; https://github.com/giabaio/bcea) to perform the whole economic analysis 14

15 Thank you!

Sign up as a Special Interest Group Member Visit ISPOR webpage www.ispor.org Click Member Groups Select ISPOR Special Interest Groups Need ISPOR membership number For more information, e-mail sigs@ispor.org