Bayesian Use of Likelihood Ratios in Biostatistics


1 Bayesian Use of Likelihood Ratios in Biostatistics

David Draper
Department of Applied Mathematics and Statistics
University of California, Santa Cruz, USA
draper

JSM 2010, Vancouver, Canada, 4 Aug 2010

2 Case Study: Diagnosing Sepsis in Newborns

(Newman TB, Puopolo KM, Wi S, Draper D, Escobar GE (2010). Interpreting complete blood counts soon after birth in newborns at risk for sepsis. Pediatrics, forthcoming.)

Sepsis is a serious medical condition in which the entire body exhibits an inflammatory response to infection, usually bacterial (e.g., Group B streptococcus (GBS)). It's particularly dangerous in newborns, in whom early-onset sepsis (EOS) usually presents within the first 24 hours after birth.

However, the evaluation of EOS is difficult: risk factors for infection are common, and early signs and symptoms are nonspecific. When newborns are symptomatic or have significant risk factors, a complete blood count (CBC) is usually ordered; for example, CDC guidelines recommend a CBC for high-risk infants (e.g., those with GBS-positive mothers not adequately treated for infection). Unfortunately the CDC recommendations are silent on how to use CBC results to estimate the risk of infection.

3 Use of CBC Components to Diagnose Sepsis

Published reference ranges for components of the CBC, including the absolute neutrophil count (ANC) and the proportion of total neutrophils that are immature (I/T), vary widely, and these variables may be affected by many factors besides infection, including infant age (in hours), the method of delivery, maternal hypertension, and the infant's sex.

Many different values for the sensitivity P(test positive | sepsis) and specificity P(test negative | not sepsis) of CBC components have been published, depending on the population studied and on what levels of these tests were considered abnormal.

Moreover, most previous studies have dichotomized each of the CBC components rather than treating them continuously, which wastes information by failing to quantify the difference between borderline and profoundly abnormal results, and no one previous to our study had tried to evaluate the effects of factors such as infant age and delivery method on diagnostic performance.

4 Study Methods

As part of a larger project based on a $1.35 million NIH grant, we took advantage of the electronic medical record systems at the Northern California Kaiser Permanente Medical Care Program (KPMCP) and Brigham and Women's Hospital (BWH, Boston) to improve on previous practice.

Methods. Retrospective cross-sectional study involving KPMCP and BWH demographic, laboratory, and hospitalization data bases; we queried microbiology data bases to identify all infants for whom a blood culture was obtained at < 72 hours of age; we kept the first positive blood culture for infants with positive cultures (septic), and the first blood culture for other infants, then matched all blood cultures by date and time to the (single) CBC obtained closest in time to the blood culture for each infant.

Study subjects. Newborn infants were eligible for the study if (a) they were born from 1 Jan 1995 through 30 Sep 2007 at a KPMCP hospital that had at least 100 total births in that time period, or at the BWH from 1 Jan 1993 through 31 Dec 2007; (b) their estimated gestational age was ≥ 34 weeks; and (c) they had a CBC and blood culture drawn within 1 hour of one another at < 72 hours of age.

5 The Promise of Electronic Medical Records

Sepsis is rare but deadly: of the 550,367 infants eligible for the study based on their hospital, year of birth, and gestational age, we identified 311 (0.57/1000 live births) with positive blood cultures; we included in this study the subset of 67,623 infants (12.3% of the 550,367 eligible newborns) who had a CBC done within 1 hour of a blood culture, including 245 of the 311 whose blood culture was positive (3.6/1000 infants receiving CBCs): thus 245 sepsis-positive and 67,378 sepsis-negative babies.

Goal of analysis. With sepsis and other diseases, we're working toward a clinical goal in the nascent era of electronic medical records (EMRs), in which current posterior probabilities of disease status and adverse outcomes (e.g., unplanned transfer to the intensive care unit) become prior probabilities for real-time sequential updating as new information (vital signs, laboratory results, signs and symptoms) arrives. As a stepping-stone toward that eventual goal, we're now putting in place at Kaiser a Bayesian system in which

6 Likelihood Ratios

(1) an initial probability of sepsis is estimated based on maternal risk factors up to birth; (2) the probability in (1) is updated at newborn age 12 hours via Bayes's Theorem, based on new infant data from the first 12 hours of life; (3) the probability in (2) is updated at 24 hours via Bayes's Theorem, based on new infant data from hours 12–24; and so on.

A convenient way to do this Bayesian updating is with Bayes's Theorem in odds form: with diagnostic data y and S = true sepsis,

  P(S | y) / P(not S | y) = [ P(S) / P(not S) ] × [ P(y | S) / P(y | not S) ],    (1)

i.e., posterior odds = prior odds × Bayes factor (likelihood ratio).

So how should likelihood ratios be estimated from data?
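As a small illustration of (1), here is a minimal R sketch of the sequential updating, with hypothetical prior-probability and likelihood-ratio values (the numbers are not from the study):

    prob.to.odds <- function(p) p / (1 - p)
    odds.to.prob <- function(o) o / (1 + o)

    p0    <- 0.003    # hypothetical prior P(S) from maternal risk factors
    lr.12 <- 4.2      # hypothetical likelihood ratio from data in hours 0-12
    lr.24 <- 0.8      # hypothetical likelihood ratio from data in hours 12-24

    o.12 <- prob.to.odds(p0) * lr.12    # posterior odds at 12 hours
    o.24 <- o.12 * lr.24                # posterior odds at 24 hours
    c(odds.to.prob(o.12), odds.to.prob(o.24))    # back to probabilities

Each update multiplies the current odds by the likelihood ratio for the newly arrived data, so that yesterday's posterior becomes today's prior.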

7 Estimating Likelihood Ratios

Consider gathering data on a screening test T for a disease, to estimate the test's sensitivity and specificity. For this purpose you would take a random sample, of size (say) n_D > 0, of blood samples that were known (on the basis of a gold-standard test) to contain the disease agent D, of which (say) r_D would register as positive (+) by T, and a parallel and independent random sample, of size (say) n_D̄ > 0, of blood samples that were known not to contain the disease agent (using D̄ to denote absence of the disease), of which (say) r_D̄ would register as not positive (−) by T. The sampling model would be

  (r_D | π_D) ~ Binomial(n_D, π_D)
  (r_D̄ | π_D̄) ~ Binomial(n_D̄, π_D̄),    (2)

in which 0 < π_D < 1 is the underlying probability P(+ | D) of test-positives in the population of all true-positive blood samples, similarly 0 < π_D̄ < 1 is the underlying probability P(− | D̄) of test-negatives in the population of all true-negative blood samples, and

8 Interval Estimation of a Likelihood Ratio

r_D and r_D̄ are independent (given π_D and π_D̄). With a given blood sample of unknown disease status that came out positive (say) on T, in this notation Bayes's Theorem on the odds scale is

  P(D | +) / P(D̄ | +) = [ P(D) / P(D̄) ] × [ P(+ | D) / P(+ | D̄) ],    (3)

in which the second multiplicative factor P(+ | D) / P(+ | D̄) on the right side of (3) is the likelihood ratio based on the screening test T; the population quantity that the likelihood ratio estimates is

  θ = π_D / (1 − π_D̄),    (4)

and the goal of the inference is an interval estimate for θ.

As usual, the frequentist (repeated-sampling) and Bayesian approaches may both be examined as methods for creating such an interval; with little information about θ external to the data set (r_D, n_D, r_D̄, n_D̄) and large values of (n_D, n_D̄), the expectation would be that the two approaches would yield similar findings,

9 Likelihood-Based Inference

but for small (n_D, n_D̄) the Bayesian approach might well be better calibrated (because it involves integrating over a skewed likelihood function instead of maximizing over it).

Approximate likelihood (repeated-sampling) inference. From standard Binomial-sampling results the maximum-likelihood estimates (MLEs) of π_D and π_D̄ are π̂_D = r_D / n_D and π̂_D̄ = r_D̄ / n_D̄, respectively, and by the functional-invariance property of maximum-likelihood estimation the MLE of θ is then

  θ̂ = π̂_D / (1 − π̂_D̄) = r_D n_D̄ / [ n_D (n_D̄ − r_D̄) ],    (5)

in which for sensible behavior (given that 0 < θ < ∞ by assumption) it's evidently necessary to assume that r_D̄ < n_D̄ and r_D > 0.

Standard (Fisherian) maximum-likelihood inference is based on the hope that in repeated sampling θ̂ will be approximately Gaussian, and indeed this will be true for large enough sample sizes; but for moderate values of (n_D, n_D̄), since 0 < θ < ∞, the repeated-sampling distribution of θ̂ will be positively skewed.

10 Transform the Scale

One approach to solving this problem is the bootstrap, which would be straightforward but computationally intensive; another is to do maximum-likelihood inference on a transformed scale (on which the repeated-sampling distribution of the MLE is closer to Gaussian) and back-transform; here I give details on the transformation approach.

The obvious transformation for positive θ is to work with

  η = log(θ) = log(π_D) − log(1 − π_D̄),    (6)

for which the MLE is

  η̂ = log(θ̂) = log(π̂_D) − log(1 − π̂_D̄).    (7)

In repeated sampling the distribution of η̂ should be approximately Gaussian, with mean fairly close to η and variance

  V(η̂) = V[log(π̂_D)] + V[log(1 − π̂_D̄)].    (8)

The variances in (8) can each be approximated by a standard Taylor-series (delta-method) calculation: if in repeated sampling Y has mean E(Y) and

11 Delta Method

variance V(Y), and f is a function whose first derivative exists at E(Y), then

  V[f(Y)] ≐ { f′[E(Y)] }² V(Y).    (9)

With f(y) = log(y) and Y = π̂_D, so that E(Y) = π_D and V(Y) = π_D (1 − π_D) / n_D, this yields

  V[log(π̂_D)] ≐ (1/π_D)² · π_D (1 − π_D) / n_D = (1 − π_D) / (n_D π_D),    (10)

and a similar calculation with f(y) = log(1 − y) and Y = π̂_D̄ gives

  V[log(1 − π̂_D̄)] ≐ π_D̄ / [ n_D̄ (1 − π_D̄) ],    (11)

so that the repeated-sampling variance of η̂ may be approximately estimated by

  V̂(η̂) ≐ (1 − π̂_D) / (n_D π̂_D) + π̂_D̄ / [ n_D̄ (1 − π̂_D̄) ] = (n_D − r_D) / (n_D r_D) + r_D̄ / [ n_D̄ (n_D̄ − r_D̄) ].    (12)

To ensure both sensible estimates of θ in (5) and non-zero variance estimates in (12), it's necessary to assume that 0 < r_D < n_D and 0 < r_D̄ < n_D̄.
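A compact R sketch of (5), (7) and (12), assuming (as in the setup above) that r.D counts test-positives among the n.D diseased samples and r.Dbar counts test-negatives among the n.Dbar non-diseased samples:

    ## MLE of theta and delta-method variance of eta-hat = log(theta-hat);
    ## requires 0 < r.D < n.D and 0 < r.Dbar < n.Dbar.
    lr.mle <- function(r.D, n.D, r.Dbar, n.Dbar) {
      theta.hat <- (r.D * n.Dbar) / (n.D * (n.Dbar - r.Dbar))    # equation (5)
      eta.hat <- log(theta.hat)                                  # equation (7)
      v.eta <- (n.D - r.D) / (n.D * r.D) +
        r.Dbar / (n.Dbar * (n.Dbar - r.Dbar))                    # equation (12)
      list(theta.hat = theta.hat, eta.hat = eta.hat, v.eta = v.eta)
    }

    lr.mle(48, 50, 97, 100)    # e.g., 96% sensitivity, 97% specificity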

12 Bayesian Solution

Based on the above assumption of an approximately Gaussian sampling distribution for η̂, an approximate 100(1 − α)% confidence interval for η would then be of the form

  η̂ ± Φ⁻¹(1 − α/2) √V̂(η̂),    (13)

where Φ is the standard normal CDF; denoting the left and right endpoints of (13) by η̂_L and η̂_R, respectively, the corresponding approximate 100(1 − α)% confidence interval for θ would then be

  [ exp(η̂_L), exp(η̂_R) ].    (14)

Bayesian solution. This is simpler and does not require an appeal to large-sample approximations. If you have little information about the probabilities π_D and π_D̄ external to the data set y = (r_D, n_D, r_D̄, n_D̄), as will often be the case, this can readily be conveyed by augmenting model (2) above with conjugate Beta prior distributions with small values of the hyper-parameters;

13 Bayesian Solution

the prior model is then

  π_D ~ Beta(α_D, β_D)
  π_D̄ ~ Beta(α_D̄, β_D̄),    (15)

with (e.g.) α_D = β_D = α_D̄ = β_D̄ = ε for some small ε > 0. By standard conjugate updating the posterior distributions for π_D and π_D̄ are then (independently) also Beta:

  (π_D | y) ~ Beta(α_D + r_D, β_D + n_D − r_D)
  (π_D̄ | y) ~ Beta(α_D̄ + r_D̄, β_D̄ + n_D̄ − r_D̄).    (16)

The posterior distribution p(θ | y) for θ given the data has no closed-form expression, but may easily be approximated to any desired accuracy by simulation: you simply generate m IID draws from the Beta posterior distribution p(π_D | y) in the first line of (16), for some large value of m, and store the generated draws in a column called π*_D;

14 Bayesian Solution

independently generate m IID draws from the Beta posterior distribution p(π_D̄ | y) in the second line of (16) and store the generated draws in another column called π*_D̄; then create a third column θ* = π*_D / (1 − π*_D̄) and summarize it in all relevant ways (e.g., a density trace provides a visual summary of p(θ | y), the mean or median of the θ* values may be used as a point estimate, and the α/2 and (1 − α/2) quantiles of the θ* distribution provide the left and right endpoints of a 100(1 − α)% interval estimate for θ).

It's also interesting to simulate from the posterior distribution for η given y (by creating a fourth column η* = log(θ*)), to see how close this distribution is to Gaussian form; this examines (via the Bernstein–von Mises Theorem) whether the assumption on which the likelihood approach was based, namely that in repeated sampling η̂ is approximately Gaussian[η, V(η̂)], is reasonable for a given data set.

An example. Consider a test with sensitivity 96% and specificity 97%, and sample sizes ranging from 50 to 2,000.
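A Monte Carlo sketch of the Bayesian solution in (15)–(16), with the Wald interval (13)–(14) alongside for comparison; the counts below are illustrative (96% sensitivity and 97% specificity at (n_D, n_D̄) = (50, 100)), not the exact rows of the table on the next slide:

    set.seed(1)
    m <- 100000; eps <- 0.01; alpha <- 0.05
    r.D <- 48; n.D <- 50; r.Dbar <- 97; n.Dbar <- 100

    pi.D.star <- rbeta(m, eps + r.D, eps + n.D - r.D)                # (16), line 1
    pi.Dbar.star <- rbeta(m, eps + r.Dbar, eps + n.Dbar - r.Dbar)    # (16), line 2
    theta.star <- pi.D.star / (1 - pi.Dbar.star)                     # draws from p(theta | y)

    c(mean = mean(theta.star), median = median(theta.star))          # point estimates
    quantile(theta.star, c(alpha / 2, 1 - alpha / 2))                # 95% posterior interval

    ## Wald interval (13)-(14), computed on the log scale and back-transformed:
    eta.hat <- log((r.D * n.Dbar) / (n.D * (n.Dbar - r.Dbar)))
    v.eta <- (n.D - r.D) / (n.D * r.D) + r.Dbar / (n.Dbar * (n.Dbar - r.Dbar))
    exp(eta.hat + c(-1, 1) * qnorm(1 - alpha / 2) * sqrt(v.eta))

A density trace of log(theta.star), e.g. plot(density(log(theta.star))), gives the Bernstein–von Mises check described above.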

15 An Example

Maximum-likelihood and Bayesian likelihood-ratio point and interval estimates for a moderately accurate screening test; the Bayesian results use ε = 0.01 and m = 100,000.

[Table: for each combination of counts (r_D, n_D, r_D̄, n_D̄), the MLE, the posterior median and mean as point estimates, and the endpoints (L, U) of the 95% likelihood and posterior intervals.]

With small (n_D, n_D̄), the MLE of the likelihood ratio, which corresponds approximately (with little information external to the sample data) to the posterior mode, is substantially smaller than either the posterior median or mean (see the skewness in the posterior distributions for θ in the figures on the next page).

The Bayesian intervals are substantially wider than their likelihood counterparts for small and moderate sample sizes, but by the time (n_D, n_D̄) has reached (1000, 2000) the two methods have yielded similar findings.

16 An Example

[Figure: density traces of the posterior distributions for θ and η.]

Top and bottom panels are posterior distributions for θ and η, respectively (with the Gaussian approximation superimposed for η); left and right columns correspond to (n_D, n_D̄) = (50, 100) and (100, 200), respectively.

The Gaussian approximation for η on which the likelihood method is based is poor with (n_D, n_D̄) = (50, 100), better (but still not good) with (n_D, n_D̄) = (100, 200), and excellent with (n_D, n_D̄) = (1000, 2000) (next page).

17 Simulation Study

[Figure: density traces of the posterior distributions for θ and η at the largest sample sizes.]

Simulation study (joint work with JC LaGuardia). We performed a simulation study to examine the repeated-sampling bias of point estimates of likelihood ratios, and the (repeated-sampling) actual coverage of interval estimates.

18 ML and Bayesian Tuning Constants

A refinement. Whenever you use a screening test in a situation in which the specificity is close to 1,

  π̂_D̄ = r_D̄ / n_D̄ ≈ 1, so that 1 − π̂_D̄ ≈ 0 in the denominator of θ̂ = π̂_D / (1 − π̂_D̄).    (17)

In this case you'll end up with a frequentist likelihood-ratio estimate that's unstable, because its denominator is too close to 0. In the Bayesian approach, in this same situation, if the hyper-parameter values are too close to 0 the posterior estimate of π_D̄ will again be close to 1 and the Bayesian point estimate of θ can be similarly unstable. This can easily happen if the underlying specificity of the screening process is high and/or if the sample sizes are small. The obvious remedies are as follows:

(Bayesian approach) Use hyper-parameter values α_D = α_D̄ = β_D = β_D̄ = C_B that are not too close to 0.

(MLE) Mimic what happens in the Beta-Binomial Bayesian approach by adding a constant C_L to all of the values (r_D, r_D̄, n_D, n_D̄).
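A brief R sketch of the two remedies; the counts are illustrative (a raw specificity estimate of exactly 1), and the precise form of the count-shifted MLE is one natural reading of the Beta-Binomial mimicry, namely the Beta(C, C) posterior-mean shrinkage (r + C) / (n + 2C):

    shrunk <- function(r, n, C) (r + C) / (n + 2 * C)    # Beta(C, C) posterior mean

    r.D <- 48; n.D <- 50; r.Dbar <- 100; n.Dbar <- 100   # r.Dbar = n.Dbar: raw MLE blows up
    C.L <- 1
    shrunk(r.D, n.D, C.L) / (1 - shrunk(r.Dbar, n.Dbar, C.L))    # finite modified estimate

    ## Bayesian analogue: rerun the Beta-posterior simulation above with
    ## hyper-parameters alpha = beta = C.B (e.g., C.B = 1, the Laplace/Uniform prior).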

19 Simulation Study Results

Factorial design of the simulation study. [Table: factorial levels for the variables π_D, π_D̄, n_D, n_D̄, and C_L = C_B.]

We used the full-factorial simulation design summarized in this table, with 2,400 Monte Carlo repetitions in each cell of the factorial. By way of outcomes we monitored the relative bias of each of the point estimates (modified MLE, Bayesian posterior mean, Bayesian posterior mode) and the actual coverage of nominal 90% modified-ML and Bayesian intervals. The simulation conclusions were as follows.

20 Simulation Study Results

Both approaches can be calibrated to obtain approximately unbiased point estimates in almost all scenarios examined, but the Bayesian interval estimates had better actual coverage behavior than the modified-ML interval estimates for small and moderate sample sizes: actual coverage of the Bayesian intervals, when using the C_B value that gave good point estimates, was higher than the coverage of the modified 90% likelihood confidence intervals when using the C_L value that gave good point estimates.

Within the Beta family of prior distributions for a Binomial parameter π, three popular choices to specify diffuseness, when not much is known about π external to the data, are (a) the Jeffreys prior, with (α, β) = (0.5, 0.5); (b) the Laplace (Uniform) prior, with (α, β) = (1.0, 1.0); and (c) (α, β) = (ε, ε) for a value of ε near 0 (such as 0.1 or 0.01). Of these three choices, the Uniform prior performs substantially better than the other two conventional diffuse-prior choices when estimating a likelihood ratio.

21 Results For Sepsis Screening

Broadening the Laplace-Uniform idea, (α, β) values ranging from 0.7 to 1.15 are worth considering; if your sample size in the non-diseased group is small, lean toward lower values from that interval, and if your sample size in the non-diseased group is large, go for higher values.

Results for sepsis screening were as follows.

[Table: likelihood ratios for ANC ranges, stratified by age at the time of the CBC (hours), with the number of infants, the % of those with infection, and the % of those without infection having each result.]

Low ANC values are highly predictive of sepsis, especially if they occur more than 4 hours after birth.

22 Results For Sepsis Screening; The Next Step

[Table: likelihood ratios for I/T ranges, stratified by age at the time of the CBC (hours), with the number of infants, the % of those with infection, and the % of those without infection having each result.]

High values of the I/T ratio are moderately predictive of sepsis, especially if they occur more than 4 hours after birth.

The next step. How would you use both the ANC and I/T values to modify a baseline probability of sepsis from the maternal information? You can only multiply the likelihood ratios if ANC and I/T are independent (given infection status) for both the sepsis and non-sepsis infants, which is not likely to be true; we need to estimate their joint likelihood ratio.
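A toy R illustration (with made-up joint probabilities, not the study data) of why the marginal likelihood ratios cannot simply be multiplied when ANC and I/T are dependent within each group:

    ## Joint probabilities of (ANC low, I/T high) in septic and non-septic infants:
    p.sep <- matrix(c(0.50, 0.15, 0.15, 0.20), 2, 2,
      dimnames = list(anc.low = c("yes", "no"), it.high = c("yes", "no")))
    p.nosep <- matrix(c(0.02, 0.08, 0.08, 0.82), 2, 2,
      dimnames = list(anc.low = c("yes", "no"), it.high = c("yes", "no")))

    joint.lr <- p.sep["yes", "yes"] / p.nosep["yes", "yes"]    # true joint LR = 25
    naive.lr <- (sum(p.sep["yes", ]) / sum(p.nosep["yes", ])) *
      (sum(p.sep[, "yes"]) / sum(p.nosep[, "yes"]))            # product of marginal LRs = 42.25
    c(joint = joint.lr, naive = naive.lr)                      # multiplying overstates the evidence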

23 Bayes's Theorem Backwards

If an accurate method can be found to estimate P(sepsis | ANC, I/T), this can be done by running Bayes's Theorem in odds form backwards: with S = 1 for sepsis and 0 otherwise,

  P(ANC, I/T | S = 1) / P(ANC, I/T | S = 0) = [ P(S = 0) / P(S = 1) ] × [ P(S = 1 | ANC, I/T) / (1 − P(S = 1 | ANC, I/T)) ].    (18)

The first thing that comes to mind in estimating P(S = 1 | ANC, I/T) is logistic regression, but it's important to bring the predictors ANC and I/T into the model in the correct form; what does the surface P(S = 1 | ANC, I/T) look like with our data?

Exploratory tools for generalized linear models are not as abundant as with linear models; I used local regression, via the loess command (followed by predict) in R, to explore this surface; recall that my data set has 245 sepsis-positive and 67,378 sepsis-negative babies.

Actually I really want to look at P(S = 1 | ANC, I/T, age), but this will be difficult to visualize, and my clinician colleagues prefer the ANC and I/T answer to be stratified by age group, so I found age cutpoints that

24 Local Regression

captured approximately equal numbers of sepsis-positive infants:

[Table: numbers of sepsis-positive and sepsis-negative infants in the age groups ≤ 1, (1, 2], (2, 6], and > 6 hours.]

A bit of advice: with up to 25,000 observations in each data set, run the loess command like this:

    case.anc.i2t.age1.loess <- loess(case1 ~ anc1 * i2t1,
      data = case.anc.i2t.age1,
      statistics = "approximate", trace.hat = "approximate")

For several of the age groups the results were remarkable; perspective and contour plots follow; note that the predictions sometimes go negative, because loess doesn't know anything about bounds on the outcome.
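Given a fitted loess surface, the likelihood-ratio surface follows from (18); here is a sketch, assuming the column names from the loess call above and (ANC, I/T) grid ranges that are placeholders:

    fit <- case.anc.i2t.age1.loess    # fitted as above

    grid <- expand.grid(anc1 = seq(0, 30, length = 50),    # illustrative ranges
                        i2t1 = seq(0, 1, length = 50))
    p.hat <- predict(fit, newdata = grid)                  # estimated P(S = 1 | ANC, I/T)
    p.hat <- pmin(pmax(p.hat, 1e-6), 1 - 1e-6)             # crude clamp: loess can stray outside (0, 1)

    p.S <- mean(case.anc.i2t.age1$case1)                   # overall P(S = 1) in this age group
    grid$lr <- ((1 - p.S) / p.S) * (p.hat / (1 - p.hat))   # equation (18)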

25–32 Response Surface Exploration

[Figures: perspective plots of the estimated P(case) surface and contour plots over (ANC, I/T), one pair per age group: ≤ 1 hour, 1–2 hours, 2–6 hours, and > 6 hours.]

33 Fixing the Negative Probabilities

The estimated probabilities from loess are highly suggestive, but sometimes go negative; I see three ways forward (in progress):

Try to come up with a parametric surface in ANC and I/T for use in, e.g., a logistic regression model (challenging for several of the age groups).

Figure out how to scale the estimated probabilities from loess so that they retain fidelity to the correct response surface while not going negative. A variety of reasonable ways of doing this have all led to similar results; one such set is plotted on the following pages (using the overall rate of sepsis, 245/67,623, as P(S = 1)).

Fit a Bayesian nonparametric model to the data, via (e.g.) Gaussian processes (joint work with B Gramacy). The generative, hierarchical GP classification model we use may be described as follows: let C(x) ∈ {0, 1} be the classification label at input x ∈ R^m; let Z ≡ Z(X) ∈ R^N be a vector of N latent variables, one for each row in the N × m design matrix X, each row being x_i for i = 1, ..., N with

34–41 Likelihood Ratio Estimation

[Figures: perspective plots of the estimated likelihood-ratio (LR) surface and contour plots over (ANC, I/T), one pair per age group: ≤ 1 hour, 1–2 hours, 2–6 hours, and > 6 hours.]

42 Gaussian Process Classification

corresponding latent Z_i; we assume that X has been pre-scaled to the unit cube; our generative model is

  C(x_i) ~ independent Bernoulli[p(x_i)], with p(x_i) = exp(−Z_i) / [1 + exp(−Z_i)]
  (Z | σ², K) ~ GP(0, σ², K) ≡ N_N(0, σ² K), where K_{i,j} = K(x_i, x_j)
  K(x_i, x_j | d, g) = exp{ −Σ_{k=1}^m (x_ik − x_jk)² / d_k } + δ_{i,j} g
  σ² ~ IG(5/2, 10/2), d_k ~ IID G(1, 20), g ~ Exp(1).

The priors chosen for the free parameters d = (d_1, ..., d_m), g, and σ² are the defaults in the tgp package (Gramacy and Taddy, 2010) for R.

43 Gaussian Process Classification

The correlation function K is from the separable Gaussian family, and d and g are the range and nugget parameters, respectively; we use the shorthand K ≡ K_{d,g}.

A logit link is implied by the second line of the model when g = 0; freeing g ≥ 0 generalizes the logit to a family of effective links parameterizing a continuum between probit and logit (Neal, 1998); thus by inferring g (in the posterior) we infer the link.

Conditional on the parameters and the settings of the latent Z variables, a sample from the predictive distribution of C(x) at a new input x is obtained via the standard kriging equations and an application of the inverse-logit transformation: Z(x) | σ², K is normally distributed with mean k(x)ᵀ K⁻¹ Z and variance σ² [1 + g − k(x)ᵀ K⁻¹ k(x)], where k(x) = (K(x, x_1), ..., K(x, x_N))ᵀ. Samples from the posterior predictive distribution are obtained by conditioning on samples from the posterior of Z, σ², and (the parameters of) K; these are then mapped to probabilities of the class labels.
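A generic R sketch of this kriging step, with the separable Gaussian correlation coded directly (all parameters treated as known for clarity; this does not use the tgp extension itself):

    ## Predictive mean and variance of Z(x) given the latents Z at the N x m
    ## design X, range vector d, nugget g and scale sigma2, as in the text.
    sep.gauss <- function(x1, x2, d) exp(-sum((x1 - x2)^2 / d))

    krige <- function(x, X, Z, d, g, sigma2) {
      N <- nrow(X)
      K <- matrix(0, N, N)
      for (i in 1:N) for (j in 1:N)
        K[i, j] <- sep.gauss(X[i, ], X[j, ], d) + (i == j) * g
      k <- apply(X, 1, sep.gauss, x2 = x, d = d)      # k(x) = (K(x, x_1), ..., K(x, x_N))
      Kinv.k <- solve(K, k)
      list(mean = sum(Kinv.k * Z),                      # k(x)' K^{-1} Z
           var = sigma2 * (1 + g - sum(k * Kinv.k)))    # sigma^2 [1 + g - k(x)' K^{-1} k(x)]
    }

    p.of.z <- function(z) exp(-z) / (1 + exp(-z))    # maps a draw Z(x) to p(x), as in the model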

44 Gaussian Process Classification

Posterior inference for the parameters of the GP classification model is obtained by MCMC, using Metropolis-within-Gibbs sampling.

Conditional on the latent Z variables, samples for (σ², d, g) may be obtained by following any one of several approaches for inference in regression GPs, treating the latents as real-valued observations at the predictors X; you get an IG full conditional for (σ² | d, g) for a Gibbs update, and (blocked) MH or slice sampling of full conditionals can be used for (d, g | σ²); see Gramacy and Lee (2008) for details.

Conditional on the parameters (σ², d, g), there are two common ways to update the latents Z: Neal (1998) proposes an adaptive rejection sampling approach; we follow Broderick and Gramacy (2010), who proposed a 10-fold randomly-blocked Metropolis-within-Gibbs approach that exploits the convenient factorization of the label part P(C(X) = c(X) | Z(X)) and latent part Z(X) of the prior, and the fact that the kriging equations are easily generalized to the multivariate conditional distribution of one group of the latents given the others; the result is a trivial Metropolis-Hastings acceptance calculation and good mixing properties.

45 Gaussian Process Classification

Software, which is an extension of the tgp package, is available from Bobby Gramacy upon request; see Gramacy (2007) for specific computational details and help with the R interface.

The main computational problem is having to invert matrices on each MCMC iteration that unfortunately grow in size with the number of observations; getting even 10,000 posterior samples with data on 10,000–24,000 infants would take an appallingly long time.

Some idea of what to expect can be found by retaining all of the 245 sepsis-positive babies and sampling (say) 755 sepsis-negative babies in a space-filling way in (ANC, I/T) space, to yield a data set with 1,000 observations; this permits results to be obtained overnight, but biases estimates of P(S = 1 | ANC, I/T) upward by oversampling on the positives; it may be possible to overcome this bias (work in progress).
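One simple way to implement the space-filling subsample is a greedy maximin rule; a sketch, assuming neg is a matrix of pre-scaled (ANC, I/T) values for the sepsis-negative babies:

    maximin.subsample <- function(neg, n.keep) {
      picked <- sample(nrow(neg), 1)    # arbitrary starting point
      d.min <- sqrt(rowSums(sweep(neg, 2, neg[picked, ])^2))
      while (length(picked) < n.keep) {
        cand <- which.max(d.min)        # point farthest from the chosen set
        picked <- c(picked, cand)
        d.min <- pmin(d.min, sqrt(rowSums(sweep(neg, 2, neg[cand, ])^2)))
      }
      picked
    }

    ## e.g., keep <- maximin.subsample(neg, 755), then combine with all 245 positives.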

46–51 loess (Full Data) Versus GP (Subsample)

[Figures: side-by-side perspective plots of the estimated P(case) surface over (ANC, I/T) from loess on the full data and from the GP classifier on the 1,000-observation subsample, with contour plots, for several of the age groups (≤ 1 hour, 1–2 hours, > 6 hours).]
