Meta-Analysis for Diagnostic Test Data: a Bayesian Approach


Meta-Analysis for Diagnostic Test Data: a Bayesian Approach Pablo E. Verde Coordination Centre for Clinical Trials Heinrich Heine Universität Düsseldorf

Preliminaries: motivations for systematic reviews. To generalize a medical result in order to apply it to other populations, subgroups, etc. To collect background information to design a new clinical trial. Costs: it is substantially cheaper to analyze published information than to carry out a new clinical trial. Health care evaluation: to critically evaluate published information.

Running Example. Acute appendicitis is one of the most common acute surgical events (e.g. 250,000 cases per year in the USA). Traditional diagnosis has reported false positive rates of 20% to 30%. Computed tomography (CT) scanning has been advocated as being of high potential diagnostic benefit in suspected appendicitis. Question: what is the diagnostic performance of this technology?

Running Example. A systematic review was performed (Ohmann et al. 2005) to evaluate the diagnostic benefit of this technology. They selected 52 studies for analysis:

    author          country  n    tp   fn  fp   tn   se   sp
    ...
    Applegate2001   USA       96   87   2   4    3   98   43
    Brandt2003      USA      179  168   1   3    7   99   70
    Cho1999         AU        36   21   0   1   14  100   93
    Cole2001        USA       96   40   5   4   43   89   91
    D'Ippolito1998  Brazil    52   40   4   0    8   91  100
    Hershko2002     Israel   197   67   5   7  118   93   94
    ...

Pooled results with a random-effects model: Sensitivity: 94.5% [94.1, 95.8]; Specificity: 94.1% [91.8, 94.5].

Particular features of data coming from systematic reviews. Studies included in a systematic review are unique, i.e., they are NOT a random sample of studies. The data collection is vulnerable to multiple sources of bias: internal study bias, bias to external validity, bias in the inclusion criteria, and publication bias. Usually only a small number of studies is included in the review. Bayesian methods are particularly well suited to this scenario.

The Diagnostic Test Data. Test results are usually summarized in a 2x2 table giving the number of positive and negative test results for patients with and without disease:

                         Reference result
                         Present   Absent    Total
    Test outcome (+)        a         b       a+b
                 (-)        c         d       c+d
    Total                  a+c       b+d    a+b+c+d

The Diagnostic Test Data Summaries.
True positive rate (TPR) or Sensitivity = a/(a+c)
True negative rate (TNR) or Specificity = d/(b+d)
The false positive rate: FPR = b/(b+d)
The false negative rate: FNR = c/(a+c)
The positive predictive value: PPV = a/(a+b)
The negative predictive value: NPV = d/(c+d)
Likelihood ratio for a positive test: LR+ = Sensitivity / (1 - Specificity)
Likelihood ratio for a negative test: LR- = (1 - Sensitivity) / Specificity
Diagnostic odds ratio: DOR = (a d)/(b c) = [Sensitivity/(1 - Sensitivity)] / [(1 - Specificity)/Specificity] = LR+ / LR-
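
These identities can be checked numerically. A minimal sketch (the function name and dictionary keys are mine, not from the talk) computing all the summaries above from a single 2x2 table:

```python
def diagnostic_summaries(a, b, c, d):
    """Standard 2x2 diagnostic-test summaries (a=TP, b=FP, c=FN, d=TN)."""
    sens = a / (a + c)                 # TPR
    spec = d / (b + d)                 # TNR
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return {
        "sens": sens, "spec": spec,
        "FPR": b / (b + d), "FNR": c / (a + c),
        "PPV": a / (a + b), "NPV": d / (c + d),
        "LR+": lr_pos, "LR-": lr_neg,
        "DOR": (a * d) / (b * c),      # equals LR+ / LR-
    }

# Example: the Brandt2003 row of the review (tp=168, fp=3, fn=1, tn=7)
s = diagnostic_summaries(a=168, b=3, c=1, d=7)
```

For Brandt2003 this reproduces the tabulated se = 99 and sp = 70, and the DOR of 392 equals LR+ / LR- as the identity states.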

Test Data Generation Process. Let y_j be a test measurement of patient j and k a threshold value. Then the test outcome is: T_j = test positive if y_j >= k, test negative otherwise. The test threshold is chosen as a trade-off between TPR and FPR. This construction of a diagnostic test in general introduces a strong correlation between TPR and FPR.
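
As a toy numerical illustration of this trade-off (the Gaussian means and threshold values here are my own assumptions, not data from the talk), suppose the measurement is N(0,1) in patients without the disease and N(1.5,1) in patients with it:

```python
import math

def norm_sf(x):
    """P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def rates(k, delta=1.5):
    """(TPR, FPR) when the test calls y >= k positive; diseased mean is delta."""
    return norm_sf(k - delta), norm_sf(k)

# lowering the threshold raises TPR and FPR together
tpr_lo, fpr_lo = rates(0.5)
tpr_hi, fpr_hi = rates(1.0)
```

Because both rates are driven by the same threshold k, studies that happen to use laxer thresholds sit higher on both axes, which is exactly the TPR-FPR correlation mentioned above.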

The Summary ROC curve (SROC). The idea of the SROC curve is to represent the relationship between TPR and FPR across studies, assuming that they may use different threshold values (don't take this literally!). It can be achieved by using a meta-regression model (Moses et al. 1993), which under a random-effects model is:

    D_i ~ Normal(A + B S_i, σ²_Di + τ²),

where D_i = logit(TPR_i) - logit(FPR_i), S_i = logit(TPR_i) + logit(FPR_i), and σ_Di is the asymptotic standard error of D_i. Reversing the transformations yields the relationship between TPR and FPR:

    TPR = exp(A/(1-B)) [FPR/(1-FPR)]^((1+B)/(1-B)) / ( 1 + exp(A/(1-B)) [FPR/(1-FPR)]^((1+B)/(1-B)) )
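
This back-transformation can be transcribed directly into code (a sketch; the function name is mine):

```python
import math

def sroc(fpr, A, B):
    """TPR on the summary ROC curve at a given FPR (requires |B| < 1)."""
    t = math.exp(A / (1 - B)) * (fpr / (1 - fpr)) ** ((1 + B) / (1 - B))
    return t / (1 + t)
```

For B = 0 this reduces to logit(TPR) = A + logit(FPR), i.e. a curve of constant diagnostic odds ratio exp(A); for |B| < 1 the exponent (1+B)/(1-B) stays positive, so the curve is increasing in FPR as an ROC curve should be.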

The SROC curve analysis.

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   5.5783     0.1768  31.554  < 2e-16 ***
    S            -0.3683     0.1227  -3.001  0.00419 **
    ---
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Limitations of the SROC curve. It has been developed essentially as a graphical device: it is not useful for making predictions about future studies; S is assumed fixed, which it is NOT; it is difficult to link explanatory variables. General functional statistics, e.g., the area under the SROC curve,

    AUC = ∫₀¹ exp(A/(1-B)) [x/(1-x)]^((1+B)/(1-B)) / ( 1 + exp(A/(1-B)) [x/(1-x)]^((1+B)/(1-B)) ) dx,

are only analytically tractable for B = 0. A practical Bayesian approach may address these issues.
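
For any |B| < 1 the integral can instead be approximated numerically. A minimal sketch (my own; the grid size is arbitrary) using the trapezoidal rule, with the endpoint limits TPR -> 0 as FPR -> 0 and TPR -> 1 as FPR -> 1 filled in by hand:

```python
import math

def sroc_auc(A, B, n=20_000):
    """Trapezoidal approximation of the area under the SROC curve."""
    expo = (1 + B) / (1 - B)
    c = math.exp(A / (1 - B))
    def tpr(x):
        t = c * (x / (1 - x)) ** expo
        return t / (1 + t)
    h = 1.0 / n
    ys = [tpr(i * h) for i in range(1, n)]      # interior grid points only
    # trapezoid rule with endpoint limits tpr(0)=0 and tpr(1)=1
    return h * (0.5 * (0.0 + 1.0) + sum(ys))
```

A quick sanity check: A = 0, B = 0 gives the diagonal TPR = FPR, whose area is exactly 0.5, and increasing A pushes the AUC toward 1.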

A Basic Bayesian Model. Given the data y of m studies, we can model the number of true positives a_i and false positives b_i directly by a GLMM as follows:

    a_i ~ Binomial(TPR_i, n_i,1), b_i ~ Binomial(FPR_i, n_i - n_i,1),

with

    logit(TPR_i) = (D_i + S_i)/2, logit(FPR_i) = (S_i - D_i)/2,

and

    (D_i, S_i) ~ MNormal(µ, Λ), Λ = Σ⁻¹.

Note: the logit() transformation can be replaced by another link function (e.g. probit, complementary log-log, etc.). The multivariate Gaussian assumption can also be replaced by a multivariate t or another density.

Prior specification. Independent normals for the components of µ = (µ_D, µ_S): µ_D ~ N(m_D, v_D), µ_S ~ N(m_S, v_S). Independent uniforms for the variance-covariance matrix of D_i and S_i, Λ⁻¹ = Σ:

    σ²_D ~ Uniform(0, k_D), σ²_S ~ Uniform(0, k_S),
    σ_D,S = σ_S,D = ρ σ_D σ_S, ρ ~ Uniform(-r, r).

The constants m_D, m_S, v_D, v_S, k_D, k_S and r can be used for prior elicitation and sensitivity analysis.
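
One prior draw of Σ under these uniforms can be sketched as follows (a toy illustration; the function name is mine). Note that any such draw is positive semi-definite by construction, since det(Σ) = σ²_D σ²_S (1 - ρ²) >= 0:

```python
import math
import random

def draw_Sigma(k_D=100.0, k_S=100.0, r=1.0, rng=random):
    """One draw of the 2x2 covariance matrix of (D_i, S_i) from its prior."""
    var_D = rng.uniform(0.0, k_D)            # sigma^2_D ~ Uniform(0, k_D)
    var_S = rng.uniform(0.0, k_S)            # sigma^2_S ~ Uniform(0, k_S)
    rho = rng.uniform(-r, r)                 # rho ~ Uniform(-r, r)
    cov = rho * math.sqrt(var_D * var_S)     # sigma_{D,S} = rho sigma_D sigma_S
    return [[var_D, cov], [cov, var_S]]
```

Parameterizing the covariance through (σ²_D, σ²_S, ρ) rather than element-wise is what guarantees a valid covariance matrix for every draw.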

Inferential Statements. Observed quantities: the data y. Unknown quantities are represented by θ (parameters, missing data, ...). Inference is based on the posterior distribution p(θ | y) ∝ p(θ) p(y | θ). Inference on functional parameters, say g(θ), is based on the marginal posterior p(g(θ) | y). Predictions of new data y* are based on the marginal predictive p(y* | y). These problems are not analytically tractable, so we base our inference on empirical approximations using MCMC (a Gibbs sampler).
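
To make the MCMC step concrete, here is a minimal random-walk Metropolis sketch for a deliberately simplified piece of the problem: the posterior of θ = logit(TPR) for a single study (a = 87 true positives out of n = 89 diseased patients, as in the Applegate2001 row), under a vague N(0, 10²) prior. This is my own toy, not the talk's Gibbs sampler; step size, chain length and burn-in are arbitrary choices:

```python
import math
import random

def log_post(theta, a=87, n=89):
    """Log posterior of theta = logit(TPR): Binomial likelihood + N(0, 100) prior."""
    log_p = -math.log1p(math.exp(-theta))        # log expit(theta), stable form
    log_q = -theta + log_p                       # log(1 - expit(theta))
    return a * log_p + (n - a) * log_q - theta ** 2 / 200.0

def metropolis(n_iter=5000, step=0.5, seed=1):
    random.seed(seed)
    theta, chain = 0.0, []
    for _ in range(n_iter):
        prop = theta + random.gauss(0.0, step)
        # accept with probability min(1, posterior ratio)
        if math.log(random.random()) < log_post(prop) - log_post(theta):
            theta = prop
        chain.append(theta)
    return chain[1000:]                          # discard burn-in

chain = metropolis()
tpr_hat = sum(1.0 / (1.0 + math.exp(-t)) for t in chain) / len(chain)
```

The posterior mean of TPR recovered from the chain should sit close to the study's observed sensitivity of 98%, since the prior is nearly flat on the logit scale.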

The directed acyclic graph (DAG). The posterior density p(θ | y) is factorized using a directed local Markov property:

    p(θ | y) ∝ ∏_v p(v | parents[v]),

where the product runs over the nodes v of the graph. This decomposition defines the relationships between model quantities, programs the kernel of the Markov chain, and gives a non-algebraic description of the model.

The DAG of the model. [Figure: DAG with nodes µ, Λ (Σ), (D_i, S_i), TPR_i, FPR_i, the counts a_i, b_i, and sample sizes n_i,1, n_i,2, for i = 1, ..., m.]

Recovering the SROC. From the conditional distribution of (D_i | S_i = s_i) we can recover the SROC as follows:

    E(D_i | S_i = s_i) = A + B s_i, with A = µ_D - B µ_S and B = ρ σ_D / σ_S,
    var(D_i | S_i = s_i) = σ²_D (1 - ρ²).

Reversing the transformations gives the relationship between TPR and FPR. The AUC can be calculated from the SROC numerically. Operationally, we calculate the required functions (A, B, SROC, AUC) as logical nodes at each iteration of the MCMC and summarize the posterior samples.
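
These conditional-normal formulas translate directly into a small helper (a sketch; the function name is mine). With ρ = 0 the line is flat at µ_D, a quick sanity check:

```python
import math

def sroc_line(mu_D, mu_S, Sigma):
    """Intercept A, slope B and residual variance of E(D | S = s) = A + B*s."""
    sigma_D = math.sqrt(Sigma[0][0])
    sigma_S = math.sqrt(Sigma[1][1])
    rho = Sigma[0][1] / (sigma_D * sigma_S)
    B = rho * sigma_D / sigma_S
    A = mu_D - B * mu_S
    resid_var = Sigma[0][0] * (1.0 - rho ** 2)
    return A, B, resid_var
```

In the MCMC this would be evaluated at every iteration (one posterior draw of µ and Σ in, one draw of A and B out), which is exactly what the "logical node" phrasing above describes.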

Extended DAG for the SROC curve. [Figure: the model DAG extended with logical nodes A, B, SROC(fpr_j) and AUC derived from µ and Σ.]

Making predictions and simulating data from the model. Reasons for predictions and data simulation: predict the results of a new study; assess the results of a study not included in the review; proof-of-concept and planning of a new study; model checking and model selection. To predict the results of a new study we define a stochastic child node (D*, S*) of p(µ, Λ | y) and two logical nodes TPR(D*, S*) and FPR(D*, S*). This results in a posterior predictive distribution of TPR* and FPR*. Given the sample sizes of the study, new data are simulated as stochastic child nodes from Bin(TPR*, n1) and Bin(FPR*, n2).
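
A sketch of one such predictive draw in plain Python (my own illustration, assuming a single posterior draw of (µ, Σ) is available; in practice these nodes live inside the MCMC and are sampled at every iteration):

```python
import math
import random

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_study(mu, Sigma, n1, n2, rng=random):
    """One simulated (a*, b*) for a new study with n1 diseased, n2 healthy."""
    # sample (D*, S*) ~ MNormal(mu, Sigma) via the 2x2 Cholesky factor
    l11 = math.sqrt(Sigma[0][0])
    l21 = Sigma[0][1] / l11
    l22 = math.sqrt(Sigma[1][1] - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    D = mu[0] + l11 * z1
    S = mu[1] + l21 * z1 + l22 * z2
    # logical nodes: back-transform to the predictive rates
    tpr, fpr = expit((D + S) / 2), expit((S - D) / 2)
    a = sum(rng.random() < tpr for _ in range(n1))   # a* ~ Bin(TPR*, n1)
    b = sum(rng.random() < fpr for _ in range(n2))   # b* ~ Bin(FPR*, n2)
    return a, b
```

The hypothetical µ and Σ values used below are illustrative only; repeating the draw many times approximates the predictive distribution of the new study's counts.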

Extended DAG for predictions. [Figure: the model DAG further extended with predictive nodes (D*, S*), TPR*, FPR* and simulated counts a* ~ Bin(TPR*, n1), b* ~ Bin(FPR*, n2).]

Analysis. Computational specification: no prior elicitation was done; we take the hyper-prior constants m_D = 0, m_S = 0, v_D = 0.05, v_S = 0.05, k_D = 100, k_S = 100 and r = 1. MCMC setup: we run a single chain of length 20,000, take one draw in 5, and discard the first 500 values, leaving 3,500 values for inference. Convergence diagnostics were done graphically; no major convergence problems were observed.

Pr(TPR_i | y) and Pr(FPR_i | y). [Figure: study-by-study posterior interval plots of the true positive rate (axis 0.7 to 1.0) and the false positive rate (axis 0.0 to 0.8) for all 52 studies.]

Pr(µ, Σ | y)

Pr(SROC | y), Pr(A | y), Pr(B | y). 95% CI: A: 6.062 [5.654, 6.4775]; this interval is 25% wider than under the RE model. B: -0.571 [-0.953, -0.167]; here the interval is 63% wider!

Pr(AUC | y). 95% CI: AUC: 96.73 [94.75, 97.94]

Predictions: Pr(TPR* | y) and Pr(FPR* | y). 95% CIs: TPR*: 0.954 [0.867, 0.986]; 1 - FPR*: 0.951 [0.670, 0.994]

Predictions: Pr(TP* | y), Pr(FP* | y), n = 100

Predictive Summary Surface: Pr(TPR*, FPR* | y)

Further work. Model diagnostics: sensitivity to priors; residuals at different levels. Generalized evidence synthesis: how to combine studies when they are different. Linking explanatory variables. Modelling study quality. Publication bias.

Conclusions