Disk Diffusion Breakpoint Determination Using a Bayesian Nonparametric Variation of the Errors-in-Variables Model

Glen DePalma (gdepalma@purdue.edu), Bruce A. Craig (bacraig@purdue.edu)
Eastern North American Region/International Biometric Society, March 12, 2013

MIC/DIA Pathogen Susceptibility Tests

Current Practice - Error Rate Bounded Method (ERB)

Concerns with the ERB Method
The ERB method uses only observed results and does not properly account for the measurement error of each test. Repeat runs of the ERB for the same drug can result in very different DIA breakpoints (low precision). DIA breakpoints are biased due to the different rounding in each test (poor accuracy).

Model-Based Approach
Instead of focusing on observed test results, a model-based approach attempts to recover the underlying truth. Our model separates the scatterplot into three components:
1. The test procedures (i.e., rounding) and experimental variability.
2. The drug-specific relationship between the true MICs and DIAs.
3. The underlying distribution of pathogens (or MICs).
The first component links the observed MIC/DIA pair with an underlying true MIC value. The second and third components describe the relationship between the true MIC and its corresponding true DIA.

Probability Model

Previous Work on Model-Based Approaches
The first model-based methods used a linear relationship, based on observed data, to describe the MIC/DIA relationship. Craig (2000) proposed a more reasonable logistic model that takes test characteristics into account. Drawbacks of Craig's model:
1. Some real data sets suggest a poor fit for a logistic relationship.
2. It is difficult for clinicians to implement in practice.
We extend Craig's approach to a flexible nonparametric model. Key to implementation, we provide software that lets clinicians use our nonparametric model in practice.

1. Distribution of Observed Test Results
For pathogen $i$, let $m_i$ and $d_i$ denote the true MIC and DIA. The observed MIC ($x_i$) and observed DIA ($y_i$) are modeled as

$$x_i = \lceil m_i + \epsilon_i \rceil, \qquad y_i = [d_i + \delta_i], \qquad \epsilon_i \sim N(0, \sigma_m^2), \qquad \delta_i \sim N(0, \sigma_d^2),$$

where $\lceil \cdot \rceil$ rounds up, $[\cdot]$ rounds to the nearest integer, and $\sigma_m$ and $\sigma_d$ represent the experimental variability in the MIC and DIA tests.
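The measurement model can be illustrated with a short simulation. This is a minimal sketch, not the authors' software: it assumes the MIC is recorded by rounding up and the DIA by rounding to the nearest integer, and the function name `observe` and all numeric values are hypothetical.

```python
import math
import random


def observe(true_mic, true_dia, sigma_m, sigma_d, rng):
    """Return one observed (MIC, DIA) pair for a pathogen with the given true values."""
    eps = rng.gauss(0.0, sigma_m)    # experimental error of the MIC test
    delta = rng.gauss(0.0, sigma_d)  # experimental error of the DIA test
    x = math.ceil(true_mic + eps)            # assumption: MIC is rounded up
    y = math.floor(true_dia + delta + 0.5)   # assumption: DIA is rounded to nearest
    return x, y


# Hypothetical true values and error standard deviations, for illustration only.
rng = random.Random(0)
pairs = [observe(1.3, 24.6, 0.5, 1.0, rng) for _ in range(1000)]
mean_x = sum(x for x, _ in pairs) / len(pairs)
mean_y = sum(y for _, y in pairs) / len(pairs)
print(mean_x, mean_y)
```

Under these assumptions the observed MICs average noticeably above the true value because of the upward rounding, while the observed DIAs stay centered on the true value. This is the kind of rounding-induced bias the model is meant to account for.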

2. True MIC/DIA Relationship
We link the pair of observed test results by modeling the one-to-one relationship between the true MIC and DIA, $d_i = g(m_i)$. Since the functional form of the relationship is unknown, we use the nonparametric approach of I-splines (Ramsay, 1988). I-spline basis functions are monotone, so constraining the spline coefficients to be positive ensures that the fitted relationship is monotone (here, decreasing).
[Figure: I-spline bases for knots 0, 0.2, 0.4, 0.6, 0.8, 1]
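A monotone decreasing curve of this kind can be sketched in a few lines. The sketch assumes the MIC has been rescaled to [0, 1] and writes g as an intercept minus a positive combination of I-spline bases; that parameterization, the function names, and the coefficient values are illustrative assumptions. Each I-spline is computed with the standard identity that it equals a tail sum of B-splines of one degree higher on the same knots.

```python
def bspline_basis(x, knots, degree):
    """Evaluate every B-spline basis function at x (Cox-de Boor recursion)."""
    n_basis = len(knots) - degree - 1
    if x >= knots[-1]:
        # At the right end of a clamped knot vector only the last basis is active.
        return [0.0] * (n_basis - 1) + [1.0]
    basis = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
             for i in range(len(knots) - 1)]
    for d in range(1, degree + 1):
        updated = []
        for i in range(len(knots) - d - 1):
            left = right = 0.0
            if knots[i + d] > knots[i]:
                left = (x - knots[i]) / (knots[i + d] - knots[i]) * basis[i]
            if knots[i + d + 1] > knots[i + 1]:
                right = ((knots[i + d + 1] - x)
                         / (knots[i + d + 1] - knots[i + 1]) * basis[i + 1])
            updated.append(left + right)
        basis = updated
    return basis


def ispline_basis(x, interior_knots, degree=3, lower=0.0, upper=1.0):
    """I-spline bases at x: each rises monotonically from 0 at lower to 1 at upper."""
    knots = [lower] * (degree + 1) + list(interior_knots) + [upper] * (degree + 1)
    b = bspline_basis(x, knots, degree)
    return [sum(b[j:]) for j in range(1, len(b))]


def g(m, coefs, intercept, interior_knots):
    """Hypothetical decreasing MIC-to-DIA curve; requires nonnegative coefs."""
    basis = ispline_basis(m, interior_knots)
    return intercept - sum(c * v for c, v in zip(coefs, basis))


interior = [0.2, 0.4, 0.6, 0.8]                # interior knots from the slide
coefs = [2.0, 1.0, 3.0, 0.5, 4.0, 1.5, 2.0]    # hypothetical positive coefficients
curve = [g(i / 50, coefs, 40.0, interior) for i in range(51)]
print(curve[0], curve[-1])
```

Because every basis function is nondecreasing, any choice of nonnegative coefficients gives a nonincreasing curve, so monotonicity is guaranteed by the constraint rather than checked after fitting.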

Knot Selection
Because the true $m$ and $d$ values are unknown, knot selection is a difficult problem, and knot selection based on fit statistics will not work. We propose two solutions:
1. Add, remove, or update knot locations using RJMCMC, with updates based on least-squares coefficient estimates.
2. Constrain the coefficients via a random walk prior (Christensen et al. 2012), $\beta_t \mid \beta_0, \ldots, \beta_{t-1} \sim N(\beta_{t-1}, \lambda)$, with non-informative priors on $\beta_0$ and $\lambda$.
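To see how a random walk prior discourages rough coefficient sequences, the log prior density can be written out directly. This is a sketch under the assumption that $\lambda$ is a variance (the slide does not say whether it is a variance or a precision); the function name and the example sequences are hypothetical.

```python
import math


def random_walk_log_prior(beta, lam):
    """Log density of beta[1:] given beta[0] when beta_t ~ N(beta_{t-1}, lam).

    Assumption: lam is the variance of each random walk increment.
    """
    total = 0.0
    for prev, cur in zip(beta, beta[1:]):
        total += -0.5 * math.log(2.0 * math.pi * lam) - (cur - prev) ** 2 / (2.0 * lam)
    return total


smooth = [1.0, 1.1, 1.2, 1.3]   # neighbouring coefficients change slowly
rough = [1.0, 3.0, 0.5, 2.5]    # neighbouring coefficients jump around
print(random_walk_log_prior(smooth, 0.5), random_walk_log_prior(rough, 0.5))
```

A smaller `lam` penalizes jumps between neighbouring coefficients more heavily, so it acts as the smoothness parameter that replaces explicit knot selection.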

3. Underlying Distribution of MICs
The collection of pathogen strains used to generate a scatterplot for the ERB method is commonly considered to be a random sample of the pathogens that would be seen in patients at a hospital or clinic. We use the distribution of observed (MIC, DIA) pairs to estimate this population distribution. To allow for multimodality and skewness, the underlying pathogen (true MIC) distribution is modeled with a Dirichlet process mixture of normals (Ghosh and Ramamoorthi, 2003).
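One common way to picture a Dirichlet process mixture of normals is the truncated stick-breaking construction, sketched below. It is a simplified illustration rather than the model fitted in the talk: the truncation level, the normal base distribution for the component means, the shared component standard deviation, and all names are assumptions.

```python
import math
import random


def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking weights; the last stick takes what remains."""
    weights, remaining = [], 1.0
    for k in range(n_components):
        v = 1.0 if k == n_components - 1 else rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def mixture_density(m, weights, means, sd):
    """Density of the true MIC under one draw of the mixture."""
    return sum(w * normal_pdf(m, mu, sd) for w, mu in zip(weights, means))


rng = random.Random(42)
weights = stick_breaking_weights(alpha=1.0, n_components=20, rng=rng)
means = [rng.gauss(0.0, 3.0) for _ in weights]   # hypothetical base distribution
comp_sd = 0.7                                    # hypothetical shared component sd
print(round(sum(weights), 6), mixture_density(0.0, weights, means, comp_sd))
```

Because a few components usually carry most of the weight, draws like this are typically multimodal or skewed, which is the flexibility the slide asks for.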

Bayesian Inference
To use our model-based breakpoint determination procedure, all the model parameters must be estimated from a scatterplot:
1. Spline coefficients
2. Smoothness parameter, or number and location of knots
3. Mixture-of-normals components
4. True MIC values
Bayesian inference is used to obtain the joint posterior of the parameters, which we approximate with MCMC. Our approach uses the posterior distribution of the model parameters to compute the probabilities of correct classification and determine the DIA breakpoints.
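As a toy illustration of one MCMC update, the sketch below runs a random walk Metropolis sampler for a single true MIC given its observed MIC. It is deliberately simplified and is not the sampler used in the talk: it assumes the MIC is rounded up, ignores the DIA observation, and replaces the mixture prior with a single normal prior. All names and numbers are hypothetical.

```python
import math
import random


def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def log_posterior(m, x, sigma_m, prior_mean, prior_sd):
    """Unnormalized log posterior of the true MIC m given the observed MIC x."""
    # Assumption: x = ceil(m + eps), so x is observed when x - 1 < m + eps <= x.
    like = phi((x - m) / sigma_m) - phi((x - 1 - m) / sigma_m)
    log_like = math.log(max(like, 1e-300))
    log_prior = -0.5 * ((m - prior_mean) / prior_sd) ** 2
    return log_like + log_prior


def metropolis_true_mic(x, sigma_m, prior_mean, prior_sd, n_iter, step, rng):
    m = x - 0.5   # start in the middle of the rounding interval
    current = log_posterior(m, x, sigma_m, prior_mean, prior_sd)
    draws = []
    for _ in range(n_iter):
        proposal = m + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, x, sigma_m, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            m, current = proposal, candidate
        draws.append(m)
    return draws


draws = metropolis_true_mic(x=3, sigma_m=0.5, prior_mean=0.0, prior_sd=3.0,
                            n_iter=5000, step=0.5, rng=random.Random(1))
kept = draws[1000:]   # discard burn-in
print(sum(kept) / len(kept))
```

In the full model the same accept/reject idea would sit inside a larger sampler that also updates the spline coefficients, the smoothness parameter or knots, and the mixture components.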

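Probability of Correct Classification
The probability model links observed MIC results to the true MIC, so we can determine the probability of correct classification. Given MIC breakpoints $M_L$ and $M_U$:

$$p_{MIC}(m) = \begin{cases} \Pr(x \le M_L) = \Phi\left(\dfrac{M_L - m}{\sigma_m}\right), & m \le M_L \\[1ex] \Pr(M_L < x < M_U) = \Phi\left(\dfrac{M_U - 1 - m}{\sigma_m}\right) - \Phi\left(\dfrac{M_L - m}{\sigma_m}\right), & M_L < m < M_U \\[1ex] \Pr(x \ge M_U) = 1 - \Phi\left(\dfrac{M_U - 1 - m}{\sigma_m}\right), & m \ge M_U \end{cases}$$

Similar calculations apply to the DIA test (different rounding).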
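The three cases translate directly into a small function. This is a minimal sketch of the formula on this slide; the function names and the example breakpoints ($M_L = 0$, $M_U = 2$, $\sigma_m = 0.5$) are hypothetical.

```python
import math


def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_mic(m, m_lower, m_upper, sigma_m):
    """Probability that the MIC test puts a pathogen with true MIC m in its true category."""
    if m <= m_lower:    # truly below the lower breakpoint: need x <= M_L
        return phi((m_lower - m) / sigma_m)
    if m >= m_upper:    # truly above the upper breakpoint: need x >= M_U
        return 1.0 - phi((m_upper - 1 - m) / sigma_m)
    # truly between the breakpoints: need M_L < x < M_U
    return phi((m_upper - 1 - m) / sigma_m) - phi((m_lower - m) / sigma_m)


for m in (-3.0, 0.0, 1.0, 2.0, 5.0):
    print(m, round(p_mic(m, 0, 2, 0.5), 4))
```

Far from the breakpoints the probability is close to 1, while a pathogen whose true MIC sits exactly on the lower breakpoint is classified correctly only about half the time.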

Estimating DIA Breakpoints
Calculate the DIA breakpoints based on the loss function

$$L = \int \min\left(0,\; p_{DIA}(g(u)) - p_{MIC}(u)\right)^2 w(u)\, du$$
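The loss can be approximated numerically for any candidate pair of DIA breakpoints. The sketch below rests on several labeled assumptions: the DIA is rounded to the nearest integer, a large zone diameter corresponds to a low MIC, the weight $w(u)$ is uniform, and the curve `g_linear` is a hypothetical stand-in for the fitted $g$. None of the names or numbers come from the talk.

```python
import math


def phi(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_mic(m, m_lo, m_hi, sigma_m):
    """Probability that the MIC test classifies a pathogen with true MIC m correctly."""
    if m <= m_lo:
        return phi((m_lo - m) / sigma_m)
    if m >= m_hi:
        return 1.0 - phi((m_hi - 1 - m) / sigma_m)
    return phi((m_hi - 1 - m) / sigma_m) - phi((m_lo - m) / sigma_m)


def p_dia(d, m, m_lo, m_hi, d_lo, d_hi, sigma_d):
    """Probability that the DIA test classifies correctly (assumed nearest-integer rounding)."""
    below = phi((d_lo + 0.5 - d) / sigma_d)          # Pr(y <= d_lo)
    above = 1.0 - phi((d_hi - 0.5 - d) / sigma_d)    # Pr(y >= d_hi)
    if m <= m_lo:      # low true MIC: correct call is a large zone
        return above
    if m >= m_hi:      # high true MIC: correct call is a small zone
        return below
    return 1.0 - below - above


def loss(d_lo, d_hi, g, m_lo, m_hi, sigma_m, sigma_d, u_min, u_max, n_grid=600):
    """Midpoint-rule approximation of the loss with a uniform weight (an assumption)."""
    width = (u_max - u_min) / n_grid
    w = 1.0 / (u_max - u_min)
    total = 0.0
    for k in range(n_grid):
        u = u_min + (k + 0.5) * width
        gap = min(0.0, p_dia(g(u), u, m_lo, m_hi, d_lo, d_hi, sigma_d)
                  - p_mic(u, m_lo, m_hi, sigma_m))
        total += gap ** 2 * w * width
    return total


def g_linear(u):
    """Hypothetical decreasing MIC-to-DIA curve, for illustration only."""
    return 30.0 - 3.0 * u


good = loss(24, 30, g_linear, 0, 2, 0.5, 1.0, -2.0, 4.0)
bad = loss(10, 40, g_linear, 0, 2, 0.5, 1.0, -2.0, 4.0)
print(good, bad)
```

Only regions where the DIA test is less likely than the MIC test to classify correctly contribute to the loss, so breakpoints far from the values implied by $g$ are penalized heavily. In practice one would evaluate `loss` over a grid of candidate breakpoint pairs and keep the minimizer.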

Simulation Study
We assumed different true relationships between MIC and DIA. For each relationship:
1. Simulated a scatterplot of 500 isolates
2. Calculated DIA breakpoints for the nonparametric and logistic models
We repeated steps 1 and 2 500 times and compared breakpoint accuracy between the models.

[Figures: Simulation 1: Logistic Relationship]

[Figures: Simulation 2: Mild Departure]

[Figures: Simulation 3: Major Departure]

Conclusion
Because of the increasing number of moderately susceptible and resistant isolates, choosing appropriate breakpoints has become more of a statistical problem. We've proposed a flexible nonparametric model-based approach that estimates the diameter breakpoints based on the probability of correct classification instead of minimizing the observed discrepancies between the two tests. We are working with the FDA and CLSI to assess real data. Online software, built with the R package Shiny from RStudio, is available for clinicians to use our model in practice: http://glimmer.rstudio.com/dbets/dbets/

References
Brooks, Steve, et al. Handbook of Markov Chain Monte Carlo. Boca Raton: CRC/Taylor & Francis, 2011.
Craig, Bruce A. "Modeling Approach to Diameter Breakpoint Determination." Diagnostic Microbiology and Infectious Disease 36.3 (2000): 193-202.
Green, P.J. "Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination." Biometrika 82 (1995): 711-732.
Ramsay, J.O. "Monotone Regression Splines in Action." Statistical Science 3.4 (1988): 425-441.
Turnidge, J., and D.L. Paterson. "Setting and Revising Antibacterial Susceptibility Breakpoints." Clinical Microbiology Reviews 20.3 (2007): 391-408.

Thanks! http://glimmer.rstudio.com/dbets/dbets/