Disk Diffusion Breakpoint Determination Using a Bayesian Nonparametric Variation of the Errors-in-Variables Model

Glen DePalma (gdepalma@purdue.edu) and Bruce A. Craig (bacraig@purdue.edu)
Eastern North American Region/International Biometric Society, March 12, 2013

MIC/DIA Pathogen Susceptibility Tests

Current Practice - Error Rate Bounded Method (ERB)

Concerns with the ERB Method
The ERB method uses only the observed results and does not properly account for the measurement error of each test.
Repeat runs of the ERB method for the same drug can result in very different DIA breakpoints (low precision).
DIA breakpoints are biased because the two tests round differently (poor accuracy).

Model-Based Approach
Instead of focusing on the observed test results, a model-based approach attempts to recover the underlying truth. Our model separates the scatterplot into three components:
1. The test procedures (i.e., rounding) and experimental variability.
2. The drug-specific relationship between the true MICs and DIAs.
3. The underlying distribution of pathogens (or MICs).
The first component links each observed MIC/DIA pair to its underlying true values. The second describes the relationship between the true MIC and its corresponding true DIA, and the third describes how the true MICs are distributed across pathogens.

Probability Model

Previous Work on Model-Based Approaches
The first model-based methods used a linear relationship, fitted to the observed data, to describe the MIC/DIA relationship. Craig (2000) proposed a more reasonable logistic model that takes the test characteristics into account.
Drawbacks of Craig's model:
1. Some real data sets suggest a poor fit for a logistic relationship.
2. It is difficult for clinicians to implement in practice.
We extend Craig's approach to a flexible nonparametric model. Key to implementation, we provide software that lets clinicians use our nonparametric model in practice.

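1. Distribution of Observed Test Results
For pathogen i, let $m_i$ and $d_i$ denote the true MIC and DIA. The joint distribution of the observed MIC ($x_i$) and observed DIA ($y_i$) is modeled as
\[ x_i = \lceil m_i + \varepsilon_i \rceil, \qquad y_i = [\, d_i + \delta_i \,], \qquad \varepsilon_i \sim N(0, \sigma_m^2), \qquad \delta_i \sim N(0, \sigma_d^2), \]
where $\lceil \cdot \rceil$ denotes rounding up (the MIC test), $[\, \cdot \,]$ denotes rounding to the nearest integer (the DIA test), and $\sigma_m$ and $\sigma_d$ represent the experimental variability in the MIC and DIA tests.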
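As a rough illustration of this measurement model, the sketch below generates observed results from one pair of true values. It assumes the MIC is rounded up and the DIA is rounded to the nearest integer; the function names, the true values, and the error standard deviations are hypothetical and chosen only for the example.

```python
import math
import random

def observe_mic(true_mic, sigma_m, rng):
    """Observed MIC: true MIC plus normal error, rounded up (assumed rounding rule)."""
    return math.ceil(true_mic + rng.gauss(0.0, sigma_m))

def observe_dia(true_dia, sigma_d, rng):
    """Observed DIA: true DIA plus normal error, rounded to the nearest integer."""
    return math.floor(true_dia + rng.gauss(0.0, sigma_d) + 0.5)

rng = random.Random(1)
true_mic, true_dia = 0.3, 21.4   # hypothetical true values for one pathogen
sigma_m, sigma_d = 0.7, 1.5      # hypothetical experimental variability

# Five repeat tests of the same pathogen give different observed pairs.
pairs = [(observe_mic(true_mic, sigma_m, rng), observe_dia(true_dia, sigma_d, rng))
         for _ in range(5)]
print(pairs)
```

Repeated draws for the same true values scatter across neighbouring integer cells, which is the variability the model is meant to separate from the true MIC/DIA relationship.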

2. True MIC/DIA Relationship
We link the pair of observed test results by modeling the one-to-one relationship between the true MIC and DIA, $d_i = g(m_i)$. Since the functional form of the relationship is unknown, we use the nonparametric approach of I-splines (Ramsay, 1988). I-splines ensure that the relationship is monotonically decreasing provided the spline coefficients are positive.
Figure: I-spline bases for knots at 0, 0.2, 0.4, 0.6, 0.8, 1.
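To show why positive coefficients give a monotone curve, here is a minimal sketch using the simplest I-splines (order 1, the integrals of piecewise-constant M-splines, which are linear ramps between adjacent knots). The form g(m) = beta0 - sum_j beta_j I_j(m), the coefficient values, and the function names are assumptions for illustration; the slides do not state the spline order or the exact parameterization.

```python
def ispline_ramp(x, left, right):
    """Order-1 I-spline on [left, right]: 0 before the interval, linear inside, 1 after."""
    if x <= left:
        return 0.0
    if x >= right:
        return 1.0
    return (x - left) / (right - left)

def g(m, knots, beta0, betas):
    """Assumed form: beta0 minus a positively weighted sum of non-decreasing bases."""
    total = beta0
    for j, beta in enumerate(betas):
        total -= beta * ispline_ramp(m, knots[j], knots[j + 1])
    return total

knots = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]   # knots from the slide
betas = [4.0, 1.0, 6.0, 2.0, 3.0]        # hypothetical positive coefficients
curve = [g(i / 10, knots, 30.0, betas) for i in range(11)]
print(curve)
```

Each basis function never decreases, so subtracting positive multiples of them can only lower g as m grows, whatever the individual coefficient values are.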

Knot Selection
Because the true m and d values are unknown, knot selection is a difficult problem, and knot selection based on fit statistics will not work. We propose two solutions:
1. Add, remove, or update knot locations using reversible jump MCMC (RJMCMC), with updates based on least-squares coefficient estimates.
2. Constrain the coefficients via a random-walk prior (Christensen et al. 2012):
\[ \beta_t \mid \beta_0, \ldots, \beta_{t-1} \sim N(\beta_{t-1}, \lambda), \]
with non-informative priors on $\beta_0$ and $\lambda$.
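The random-walk prior in the second solution can be sketched as follows. This is only an illustration: it treats lambda as a variance, ignores the positivity constraint on the coefficients, and uses hypothetical function names and values.

```python
import math
import random

def sample_random_walk_coefs(n_coefs, beta0, lam, rng):
    """Draw beta_t ~ N(beta_{t-1}, lam) for t = 1..n_coefs-1 (lam assumed to be a variance)."""
    betas = [beta0]
    sd = math.sqrt(lam)
    for _ in range(1, n_coefs):
        betas.append(rng.gauss(betas[-1], sd))
    return betas

def random_walk_log_prior(betas, lam):
    """Log density of beta_1..beta_T given beta_0 under the random-walk prior."""
    lp = 0.0
    for prev, cur in zip(betas, betas[1:]):
        lp += -0.5 * math.log(2.0 * math.pi * lam) - (cur - prev) ** 2 / (2.0 * lam)
    return lp

rng = random.Random(3)
draw = sample_random_walk_coefs(6, beta0=2.0, lam=0.25, rng=rng)
print(draw)
# Smooth coefficient sequences receive higher prior density than rough ones.
print(random_walk_log_prior([1.0, 1.0, 1.0], 0.5), random_walk_log_prior([1.0, 3.0, 1.0], 0.5))
```

A small lambda ties neighbouring coefficients together and smooths the fitted curve, which is how the prior stands in for explicit knot selection.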

3. Underlying Distribution of MICs
The collection of pathogen strains used to generate a scatterplot for the ERB method is commonly considered to be a random sample of the pathogens that would be seen in patients at a hospital or clinic. We use the distribution of observed (MIC, DIA) pairs to estimate this population distribution. To allow for multimodality and skewness, the underlying pathogen (true MIC) distribution is modeled with a Dirichlet process mixture of normals (Ghosh and Ramamoorthi, 2003).
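One common way to picture a Dirichlet process mixture of normals is the truncated stick-breaking construction, sketched below as a draw from the prior (not posterior inference). The truncation level, concentration parameter, base distribution for the component means, and common standard deviation are all hypothetical choices, not values from the talk.

```python
import random

def stick_breaking_weights(alpha, n_components, rng):
    """Truncated stick-breaking weights for a Dirichlet process with concentration alpha."""
    weights, remaining = [], 1.0
    for _ in range(n_components - 1):
        v = rng.betavariate(1.0, alpha)   # fraction of the remaining stick to break off
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)             # last component takes the leftover mass
    return weights

def sample_true_mics(n, weights, means, sds, rng):
    """Draw true MICs from the resulting finite mixture of normals."""
    draws = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(rng.gauss(means[k], sds[k]))
    return draws

rng = random.Random(7)
weights = stick_breaking_weights(alpha=1.0, n_components=10, rng=rng)
means = [rng.gauss(0.0, 3.0) for _ in weights]   # hypothetical base distribution for means
sds = [1.0] * len(weights)                        # hypothetical common standard deviation
mics = sample_true_mics(500, weights, means, sds, rng)
print([round(w, 3) for w in weights])
```

Because a few components usually carry most of the weight, draws like this tend to be multimodal or skewed, which is the flexibility the slide asks for.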

Bayesian Inference
To use our model-based breakpoint determination procedure, all of the model parameters must be estimated from a scatterplot:
1. Spline coefficients
2. Smoothness parameter, or number and location of knots
3. Mixture of normals components
4. True MIC values
Bayesian inference is used to obtain the joint posterior distribution of the parameters, which we approximate with MCMC. Our approach uses this posterior distribution to compute the probabilities of correct classification and to determine the DIA breakpoints.

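Probability of Correct Classification
The probability model links the observed MIC results to the true MIC, so we can determine the probability of correct classification. Given MIC breakpoints $M_L$ and $M_U$:
\[
p_{MIC}(m) =
\begin{cases}
\Pr(x \le M_L) = \Phi\!\left(\dfrac{M_L - m}{\sigma_m}\right), & m \le M_L \\[2ex]
\Pr(M_L < x < M_U) = \Phi\!\left(\dfrac{M_U - 1 - m}{\sigma_m}\right) - \Phi\!\left(\dfrac{M_L - m}{\sigma_m}\right), & M_L < m < M_U \\[2ex]
\Pr(x \ge M_U) = 1 - \Phi\!\left(\dfrac{M_U - 1 - m}{\sigma_m}\right), & m \ge M_U
\end{cases}
\]
Similar calculations apply to the DIA test (with its different rounding).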
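The three cases for the MIC test translate directly into code. The sketch below evaluates the probability of correct classification for a given true MIC; the breakpoints and sigma_m in the usage lines are hypothetical values, and the function names are introduced only for this example.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_mic(m, m_low, m_up, sigma_m):
    """Probability that the observed MIC lands in the same category as the true MIC m."""
    lower = norm_cdf((m_low - m) / sigma_m)      # Pr(x <= M_L)
    upper = norm_cdf((m_up - 1 - m) / sigma_m)   # Pr(x <= M_U - 1) = Pr(x < M_U)
    if m <= m_low:
        return lower
    if m < m_up:
        return upper - lower
    return 1.0 - upper

# Hypothetical breakpoints M_L = 0, M_U = 2 and sigma_m = 0.7.
for m in (-3.0, 0.0, 0.5, 5.0):
    print(m, round(p_mic(m, 0.0, 2.0, 0.7), 4))
```

Far from the breakpoints the probability is close to 1, and it drops toward one half or below as the true MIC approaches a breakpoint, where measurement error is most likely to change the category.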

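Estimating DIA Breakpoints
DIA breakpoints are calculated from the loss function
\[ L = \int \min\bigl(0,\; p_{DIA}(g(u)) - p_{MIC}(u)\bigr)^2 \, w(u) \, du, \]
where $w(u)$ is a weight function. The $\min(0, \cdot)$ term penalizes only those true MICs at which the DIA test is less likely than the MIC test to classify correctly.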
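A minimal numerical sketch of this loss, assuming the integrand is min(0, p_DIA(g(u)) - p_MIC(u)) squared times w(u) and approximating the integral with the trapezoid rule over a finite range. The callables passed in (including the toy constant functions in the usage lines), the integration limits, and the function name are hypothetical placeholders. In practice p_DIA would depend on the candidate DIA breakpoints being evaluated.

```python
def breakpoint_loss(p_dia_of_g, p_mic, weight, lo, hi, n=2000):
    """Trapezoid approximation of the integral of min(0, p_DIA(g(u)) - p_MIC(u))^2 * w(u)."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        u = lo + i * h
        shortfall = min(0.0, p_dia_of_g(u) - p_mic(u))   # nonzero only where DIA does worse
        term = shortfall ** 2 * weight(u)
        total += term * (0.5 if i in (0, n) else 1.0)    # half weight at the two endpoints
    return total * h

# Toy check: DIA is uniformly 0.1 worse than MIC on [0, 1] with unit weight.
worse = breakpoint_loss(lambda u: 0.8, lambda u: 0.9, lambda u: 1.0, 0.0, 1.0)
# No penalty when the DIA test classifies at least as well as the MIC test.
better = breakpoint_loss(lambda u: 0.95, lambda u: 0.9, lambda u: 1.0, 0.0, 1.0)
print(worse, better)
```

Candidate breakpoints could then be compared by evaluating this loss for each candidate and keeping the pair with the smallest value.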

Simulation Study
We assumed different true relationships between MIC and DIA. For each one:
1. Simulated a scatterplot of 500 isolates.
2. Calculated DIA breakpoints for the nonparametric and logistic models.
Steps 1 and 2 were repeated 500 times, and breakpoint accuracy was compared between the models.

Simulation 1: Logistic Relationship

Simulation 2: Mild Departure

Simulation 3: Major Departure

Conclusion
Because of the increasing number of moderately susceptible and resistant isolates, choosing appropriate breakpoints has become more of a statistical problem.
We've proposed a flexible nonparametric model-based approach that estimates the diameter breakpoints based on the probability of correct classification instead of minimizing the observed discrepancies between the two tests.
We are working with the FDA and CLSI to assess real data.
Online software, built with the R package Shiny from RStudio, is available for clinicians to use our model in practice: http://glimmer.rstudio.com/dbets/dbets/

References
Brooks, S., et al. (2011). Handbook of Markov Chain Monte Carlo. Boca Raton: CRC/Taylor & Francis.
Craig, B. A. (2000). Modeling approach to diameter breakpoint determination. Diagnostic Microbiology and Infectious Disease 36(3), 193-202.
Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711-732.
Ramsay, J. O. (1988). Monotone regression splines in action. Statistical Science 3(4), 425-441.
Turnidge, J., and Paterson, D. L. (2007). Setting and revising antibacterial susceptibility breakpoints. Clinical Microbiology Reviews 20(3), 391-408.
Thanks! http://glimmer.rstudio.com/dbets/dbets/