Bayesian Use of Likelihood Ratios in Biostatistics


1 Bayesian Use of Likelihood Ratios in Biostatistics

David Draper
Department of Applied Mathematics and Statistics
University of California, Santa Cruz, USA
draper

JSM 2010, Vancouver, Canada, 4 Aug 2010

2 Case Study: Diagnosing Sepsis in Newborns

(Newman TB, Puopolo KM, Wi S, Draper D, Escobar GE (2010). Interpreting complete blood counts soon after birth in newborns at risk for sepsis. Pediatrics, forthcoming.)

Sepsis is a serious medical condition in which the entire body exhibits an inflammatory response to infection, usually bacterial (e.g., Group B streptococcus (GBS)). It's particularly dangerous in newborns, in whom early-onset sepsis (EOS) usually presents within the first 24 hours after birth.

However, the evaluation of EOS is difficult: risk factors for infection are common, and early signs and symptoms are nonspecific. When newborns are symptomatic or have significant risk factors, a complete blood count (CBC) is usually ordered; for example, CDC guidelines recommend a CBC for high-risk infants (e.g., those with GBS-positive mothers not adequately treated for infection). Unfortunately the CDC recommendations are silent on how to use CBC results to estimate the risk of infection.

3 Use of CBC Components to Diagnose Sepsis

Published reference ranges for components of the CBC, including the absolute neutrophil count (ANC) and the proportion of total neutrophils that are immature (I/T), vary widely, and these variables may be affected by many factors besides infection, including infant age (in hours), the method of delivery, maternal hypertension, and the infant's sex.

Many different values for the sensitivity P(test positive | sepsis) and specificity P(test negative | not sepsis) of CBC components have been published, depending on the population studied and on what levels of these tests were considered abnormal.

Moreover, most previous studies have dichotomized each of the CBC components rather than treating them continuously, which wastes information by failing to quantify the difference between borderline and profoundly abnormal results, and no one previous to our study had tried to evaluate the effects of factors such as infant age and delivery method on diagnostic performance.

4 Study Methods

As part of a larger project based on a $1.35 million NIH grant, we took advantage of the electronic medical record systems at the Northern California Kaiser Permanente Medical Care Program (KPMCP) and Brigham and Women's Hospital (BWH, Boston) to improve on previous practice.

Methods. Retrospective cross-sectional study involving KPMCP and BWH demographic, laboratory, and hospitalization data bases; we queried microbiology data bases to identify all infants for whom a blood culture was obtained at < 72 hours of age; we kept the first positive blood culture for infants with positive cultures (septic), and the first blood culture for other infants, then matched all blood cultures by date and time to the (single) CBC obtained closest in time to the blood culture for each infant.

Study subjects. Newborn infants were eligible for the study if (a) they were born from 1 Jan 1995 through 30 Sep 2007 at a KPMCP hospital that had at least 100 total births in that time period, or at the BWH from 1 Jan 1993 through 31 Dec 2007; (b) their estimated gestational age was ≥ 34 weeks; and (c) they had a CBC and blood culture drawn within 1 hour of one another at < 72 hours of age.

5 The Promise of Electronic Medical Records

Sepsis is rare but deadly: of the 550,367 infants eligible for the study based on their hospital, year of birth, and gestational age, we identified 311 (0.57/1000 live births) with positive blood cultures; we included in this study the subset of 67,623 infants (12.3% of the 550,367 eligible newborns) who had a CBC done within 1 hour of a blood culture, including 245 of the 311 whose blood culture was positive (3.6/1000 infants receiving CBCs): thus 245 sepsis-positive and 67,378 sepsis-negative babies.

Goal of analysis. With sepsis and other diseases, we're working toward a clinical goal in the nascent era of electronic medical records (EMRs), in which current posterior probabilities of disease status and adverse outcomes (e.g., unplanned transfer to the intensive care unit) become prior probabilities for real-time sequential updating as new information (vital signs, laboratory results, signs and symptoms) arrives. As a stepping-stone toward that eventual goal, we're now putting in place at Kaiser a Bayesian system in which

6 Likelihood Ratios

(1) an initial probability of sepsis is estimated based on maternal risk factors up to birth; (2) the probability in (1) is updated at newborn age 12 hours via Bayes's Theorem, based on new infant data from the first 12 hours of life; (3) the probability in (2) is updated at 24 hours via Bayes's Theorem, based on new infant data from hours 12–24; and so on.

A convenient way to do this Bayesian updating is with Bayes's Theorem in odds form: with diagnostic data y and S = true sepsis,

  P(S | y) / P(not S | y) = [ P(S) / P(not S) ] × [ P(y | S) / P(y | not S) ],    (1)

i.e., posterior odds = prior odds × Bayes factor (likelihood ratio).

So how should likelihood ratios be estimated from data?
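As a small illustration of (1), here is a minimal R sketch of the sequential updating, with hypothetical prior-probability and likelihood-ratio values (the numbers are not from the study):

    prob.to.odds <- function(p) p / (1 - p)
    odds.to.prob <- function(o) o / (1 + o)

    p0    <- 0.003    # hypothetical prior P(S) from maternal risk factors
    lr.12 <- 4.2      # hypothetical likelihood ratio from data in hours 0-12
    lr.24 <- 0.8      # hypothetical likelihood ratio from data in hours 12-24

    o.12 <- prob.to.odds(p0) * lr.12    # posterior odds at 12 hours
    o.24 <- o.12 * lr.24                # posterior odds at 24 hours
    c(odds.to.prob(o.12), odds.to.prob(o.24))    # back to probabilities

Each update multiplies the current odds by the likelihood ratio for the newly arrived data, so that yesterday's posterior becomes today's prior.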

7 Estimating Likelihood Ratios

Consider gathering data on a screening test T for a disease, to estimate the test's sensitivity and specificity. For this purpose you would take a random sample, of size (say) n_D > 0, of blood samples that were known (on the basis of a gold-standard test) to contain the disease agent D, of which (say) r_D would register as positive (+) by T, and a parallel and independent random sample, of size (say) n_D̄ > 0, of blood samples that were known not to contain the disease agent (using D̄ to denote absence of the disease), of which (say) r_D̄ would register as not positive (−) by T. The sampling model would be

  (r_D | π_D) ~ Binomial(n_D, π_D)
  (r_D̄ | π_D̄) ~ Binomial(n_D̄, π_D̄),    (2)

in which 0 < π_D < 1 is the underlying probability P(+ | D) of test-positives in the population of all true-positive blood samples, similarly 0 < π_D̄ < 1 is the underlying probability P(− | D̄) of test-negatives in the population of all true-negative blood samples, and

8 Interval Estimation of a Likelihood Ratio

r_D and r_D̄ are independent (given π_D and π_D̄). With a given blood sample of unknown disease status that came out positive (say) on T, in this notation Bayes's Theorem on the odds scale is

  P(D | +) / P(D̄ | +) = [ P(D) / P(D̄) ] × [ P(+ | D) / P(+ | D̄) ],    (3)

in which the second multiplicative factor P(+ | D) / P(+ | D̄) on the right side of (3) is the likelihood ratio based on the screening test T; the population quantity that the likelihood ratio estimates is

  θ = π_D / (1 − π_D̄),    (4)

and the goal of the inference is an interval estimate for θ.

As usual, the frequentist (repeated-sampling) and Bayesian approaches may both be examined as methods for creating such an interval; with little information about θ external to the data set (r_D, n_D, r_D̄, n_D̄) and large values of (n_D, n_D̄), the expectation would be that the two approaches would yield similar findings,

9 Likelihood-Based Inference

but for small (n_D, n_D̄) the Bayesian approach might well be better calibrated (because it involves integrating over a skewed likelihood function instead of maximizing over it).

Approximate likelihood (repeated-sampling) inference. From standard Binomial-sampling results the maximum-likelihood estimates (MLEs) of π_D and π_D̄ are π̂_D = r_D / n_D and π̂_D̄ = r_D̄ / n_D̄, respectively, and by the functional-invariance property of maximum-likelihood estimation the MLE of θ is then

  θ̂ = π̂_D / (1 − π̂_D̄) = r_D n_D̄ / [ n_D (n_D̄ − r_D̄) ],    (5)

in which for sensible behavior (given that 0 < θ < ∞ by assumption) it's evidently necessary to assume that r_D̄ < n_D̄ and r_D > 0.

Standard (Fisherian) maximum-likelihood inference is based on the hope that in repeated sampling θ̂ will be approximately Gaussian, and indeed this will be true for large enough sample sizes; but for moderate values of (n_D, n_D̄), since 0 < θ < ∞, the repeated-sampling distribution of θ̂ will be positively skewed.

10 Transform the Scale

One approach to solving this problem is the bootstrap, which would be straightforward but computationally intensive; another is to do maximum-likelihood inference on a transformed scale (on which the repeated-sampling distribution of the MLE is closer to Gaussian) and back-transform; here I give details on the transformation approach.

The obvious transformation for positive θ is to work with

  η = log(θ) = log(π_D) − log(1 − π_D̄),    (6)

for which the MLE is

  η̂ = log(θ̂) = log(π̂_D) − log(1 − π̂_D̄).    (7)

In repeated sampling the distribution of η̂ should be approximately Gaussian, with mean fairly close to η and variance

  V(η̂) = V[log(π̂_D)] + V[log(1 − π̂_D̄)].    (8)

The variances in (8) can each be approximated by a standard Taylor-series (delta-method) calculation: if in repeated sampling Y has mean E(Y) and

11 Delta Method

variance V(Y), and f is a function whose first derivative exists at E(Y), then

  V[f(Y)] ≐ { f′[E(Y)] }² V(Y).    (9)

With f(y) = log(y) and Y = π̂_D, so that E(Y) = π_D and V(Y) = π_D (1 − π_D) / n_D, this yields

  V[log(π̂_D)] ≐ (1/π_D)² · π_D (1 − π_D) / n_D = (1 − π_D) / (n_D π_D),    (10)

and a similar calculation with f(y) = log(1 − y) and Y = π̂_D̄ gives

  V[log(1 − π̂_D̄)] ≐ π_D̄ / [ n_D̄ (1 − π_D̄) ],    (11)

so that the repeated-sampling variance of η̂ may be approximately estimated by

  V̂(η̂) ≐ (1 − π̂_D) / (n_D π̂_D) + π̂_D̄ / [ n_D̄ (1 − π̂_D̄) ] = (n_D − r_D) / (n_D r_D) + r_D̄ / [ n_D̄ (n_D̄ − r_D̄) ].    (12)

To ensure both sensible estimates of θ in (5) and non-zero variance estimates in (12), it's necessary to assume that 0 < r_D < n_D and 0 < r_D̄ < n_D̄.
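A compact R sketch of (5), (7) and (12), assuming (as in the setup above) that r.D counts test-positives among the n.D diseased samples and r.Dbar counts test-negatives among the n.Dbar non-diseased samples:

    ## MLE of theta and delta-method variance of eta-hat = log(theta-hat);
    ## requires 0 < r.D < n.D and 0 < r.Dbar < n.Dbar.
    lr.mle <- function(r.D, n.D, r.Dbar, n.Dbar) {
      theta.hat <- (r.D * n.Dbar) / (n.D * (n.Dbar - r.Dbar))    # equation (5)
      eta.hat <- log(theta.hat)                                  # equation (7)
      v.eta <- (n.D - r.D) / (n.D * r.D) +
        r.Dbar / (n.Dbar * (n.Dbar - r.Dbar))                    # equation (12)
      list(theta.hat = theta.hat, eta.hat = eta.hat, v.eta = v.eta)
    }

    lr.mle(48, 50, 97, 100)    # e.g., 96% sensitivity, 97% specificity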

12 Bayesian Solution

Based on the above assumption of an approximately Gaussian sampling distribution for η̂, an approximate 100(1 − α)% confidence interval for η would then be of the form

  η̂ ± Φ⁻¹(1 − α/2) √V̂(η̂),    (13)

where Φ is the standard normal CDF; denoting the left and right endpoints of (13) by η̂_L and η̂_R, respectively, the corresponding approximate 100(1 − α)% confidence interval for θ would then be

  [ exp(η̂_L), exp(η̂_R) ].    (14)

Bayesian solution. This is simpler and does not require an appeal to large-sample approximations. If you have little information about the probabilities π_D and π_D̄ external to the data set y = (r_D, n_D, r_D̄, n_D̄), as will often be the case, this can readily be conveyed by augmenting model (2) above with conjugate Beta prior distributions with small values of the hyper-parameters;

13 Bayesian Solution

the prior model is then

  π_D ~ Beta(α_D, β_D)
  π_D̄ ~ Beta(α_D̄, β_D̄),    (15)

with (e.g.) α_D = β_D = α_D̄ = β_D̄ = ε for some small ε > 0. By standard conjugate updating the posterior distributions for π_D and π_D̄ are then (independently) also Beta:

  (π_D | y) ~ Beta(α_D + r_D, β_D + n_D − r_D)
  (π_D̄ | y) ~ Beta(α_D̄ + r_D̄, β_D̄ + n_D̄ − r_D̄).    (16)

The posterior distribution p(θ | y) for θ given the data has no closed-form expression, but may easily be approximated to any desired accuracy by simulation: you simply generate m IID draws from the Beta posterior distribution p(π_D | y) in the first line of (16), for some large value of m, and store the generated draws in a column called π*_D;

14 Bayesian Solution

independently generate m IID draws from the Beta posterior distribution p(π_D̄ | y) in the second line of (16) and store the generated draws in another column called π*_D̄; then create a third column θ* = π*_D / (1 − π*_D̄) and summarize it in all relevant ways (e.g., a density trace provides a visual summary of p(θ | y), the mean or median of the θ* values may be used as a point estimate, and the α/2 and (1 − α/2) quantiles of the θ* distribution provide the left and right endpoints of a 100(1 − α)% interval estimate for θ).

It's also interesting to simulate from the posterior distribution for η given y (by creating a fourth column η* = log(θ*)), to see how close this distribution is to Gaussian form; this examines (via the Bernstein–von Mises Theorem) whether the assumption on which the likelihood approach was based, namely that in repeated sampling η̂ is approximately Gaussian[η, V(η̂)], is reasonable for a given data set.

An example. Consider a test with sensitivity 96% and specificity 97%, and sample sizes ranging from 50 to 2,000.
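A Monte Carlo sketch of the Bayesian solution in (15)–(16), with the Wald interval (13)–(14) alongside for comparison; the counts below are illustrative (96% sensitivity and 97% specificity at (n_D, n_D̄) = (50, 100)), not the exact rows of the table on the next slide:

    set.seed(1)
    m <- 100000; eps <- 0.01; alpha <- 0.05
    r.D <- 48; n.D <- 50; r.Dbar <- 97; n.Dbar <- 100

    pi.D.star <- rbeta(m, eps + r.D, eps + n.D - r.D)                # (16), line 1
    pi.Dbar.star <- rbeta(m, eps + r.Dbar, eps + n.Dbar - r.Dbar)    # (16), line 2
    theta.star <- pi.D.star / (1 - pi.Dbar.star)                     # draws from p(theta | y)

    c(mean = mean(theta.star), median = median(theta.star))          # point estimates
    quantile(theta.star, c(alpha / 2, 1 - alpha / 2))                # 95% posterior interval

    ## Wald interval (13)-(14), computed on the log scale and back-transformed:
    eta.hat <- log((r.D * n.Dbar) / (n.D * (n.Dbar - r.Dbar)))
    v.eta <- (n.D - r.D) / (n.D * r.D) + r.Dbar / (n.Dbar * (n.Dbar - r.Dbar))
    exp(eta.hat + c(-1, 1) * qnorm(1 - alpha / 2) * sqrt(v.eta))

A density trace of log(theta.star), e.g. plot(density(log(theta.star))), gives the Bernstein–von Mises check described above.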

15 An Example

Maximum-likelihood and Bayesian likelihood-ratio point and interval estimates for a moderately accurate screening test; the Bayesian results use ε = 0.01 and m = 100,000.

[Table: for each combination of counts (r_D, n_D, r_D̄, n_D̄), the MLE, the posterior median and mean as point estimates, and the endpoints (L, U) of the 95% likelihood and posterior intervals.]

With small (n_D, n_D̄), the MLE of the likelihood ratio, which corresponds approximately (with little information external to the sample data) to the posterior mode, is substantially smaller than either the posterior median or mean (see the skewness in the posterior distributions for θ in the figures on the next page).

The Bayesian intervals are substantially wider than their likelihood counterparts for small and moderate sample sizes, but by the time (n_D, n_D̄) has reached (1000, 2000) the two methods have yielded similar findings.

16 An Example

[Figure: density traces of the posterior distributions for θ and η.]

Top and bottom panels are posterior distributions for θ and η, respectively (with the Gaussian approximation superimposed for η); left and right columns correspond to (n_D, n_D̄) = (50, 100) and (100, 200), respectively.

The Gaussian approximation for η on which the likelihood method is based is poor with (n_D, n_D̄) = (50, 100), better (but still not good) with (n_D, n_D̄) = (100, 200), and excellent with (n_D, n_D̄) = (1000, 2000) (next page).

17 Simulation Study

[Figure: density traces of the posterior distributions for θ and η at the largest sample sizes.]

Simulation study (joint work with JC LaGuardia). We performed a simulation study to examine the repeated-sampling bias of point estimates of likelihood ratios, and the (repeated-sampling) actual coverage of interval estimates.

18 ML and Bayesian Tuning Constants

A refinement. Whenever you use a screening test in a situation in which the specificity is close to 1,

  π̂_D̄ = r_D̄ / n_D̄ ≈ 1, so that 1 − π̂_D̄ ≈ 0 in the denominator of θ̂ = π̂_D / (1 − π̂_D̄).    (17)

In this case you'll end up with a frequentist likelihood-ratio estimate that's unstable, because its denominator is too close to 0. In the Bayesian approach, in this same situation, if the hyper-parameter values are too close to 0 the posterior estimate of π_D̄ will again be close to 1 and the Bayesian point estimate of θ can be similarly unstable. This can easily happen if the underlying specificity of the screening process is high and/or if the sample sizes are small. The obvious remedies are as follows:

(Bayesian approach) Use hyper-parameter values α_D = α_D̄ = β_D = β_D̄ = C_B that are not too close to 0.

(MLE) Mimic what happens in the Beta-Binomial Bayesian approach by adding a constant C_L to all of the values (r_D, r_D̄, n_D, n_D̄).
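A brief R sketch of the two remedies; the counts are illustrative (a raw specificity estimate of exactly 1), and the precise form of the count-shifted MLE is one natural reading of the Beta-Binomial mimicry, namely the Beta(C, C) posterior-mean shrinkage (r + C) / (n + 2C):

    shrunk <- function(r, n, C) (r + C) / (n + 2 * C)    # Beta(C, C) posterior mean

    r.D <- 48; n.D <- 50; r.Dbar <- 100; n.Dbar <- 100   # r.Dbar = n.Dbar: raw MLE blows up
    C.L <- 1
    shrunk(r.D, n.D, C.L) / (1 - shrunk(r.Dbar, n.Dbar, C.L))    # finite modified estimate

    ## Bayesian analogue: rerun the Beta-posterior simulation above with
    ## hyper-parameters alpha = beta = C.B (e.g., C.B = 1, the Laplace/Uniform prior).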

19 Simulation Study Results

Factorial design of the simulation study. [Table: factorial levels for the variables π_D, π_D̄, n_D, n_D̄, and C_L = C_B.]

We used the full-factorial simulation design summarized in this table, with 2,400 Monte Carlo repetitions in each cell of the factorial. By way of outcomes we monitored the relative bias of each of the point estimates (modified MLE, Bayesian posterior mean, Bayesian posterior mode) and the actual coverage of nominal 90% modified-ML and Bayesian intervals. The simulation conclusions were as follows.

20 Simulation Study Results

Both approaches can be calibrated to obtain approximately unbiased point estimates in almost all scenarios examined, but the Bayesian interval estimates had better actual coverage behavior than the modified-ML interval estimates for small and moderate sample sizes: actual coverage of the Bayesian intervals, when using the C_B value that gave good point estimates, was higher than the coverage of the modified 90% likelihood confidence intervals when using the C_L value that gave good point estimates.

Within the Beta family of prior distributions for a Binomial parameter π, three popular choices to specify diffuseness, when not much is known about π external to the data, are (a) the Jeffreys prior, with (α, β) = (0.5, 0.5); (b) the Laplace (Uniform) prior, with (α, β) = (1.0, 1.0); and (c) (α, β) = (ε, ε) for a value of ε near 0 (such as 0.1 or 0.01). Of these three choices, the Uniform prior performs substantially better than the other two conventional diffuse-prior choices when estimating a likelihood ratio.

21 Results For Sepsis Screening

Broadening the Laplace-Uniform idea, (α, β) values ranging from 0.7 to 1.15 are worth considering; if your sample size in the non-diseased group is small, lean toward lower values from that interval, and if your sample size in the non-diseased group is large, go for higher values.

Results for sepsis screening were as follows.

[Table: likelihood ratios for ANC ranges, stratified by age at the time of the CBC (hours), with the number of infants, the % of those with infection, and the % of those without infection having each result.]

Low ANC values are highly predictive of sepsis, especially if they occur more than 4 hours after birth.

22 Results For Sepsis Screening; The Next Step

[Table: likelihood ratios for I/T ranges, stratified by age at the time of the CBC (hours), with the number of infants, the % of those with infection, and the % of those without infection having each result.]

High values of the I/T ratio are moderately predictive of sepsis, especially if they occur more than 4 hours after birth.

The next step. How would you use both the ANC and I/T values to modify a baseline probability of sepsis from the maternal information? You can only multiply the likelihood ratios if ANC and I/T are independent (given infection status) for both the sepsis and non-sepsis infants, which is not likely to be true; we need to estimate their joint likelihood ratio.
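A toy R illustration (with made-up joint probabilities, not the study data) of why the marginal likelihood ratios cannot simply be multiplied when ANC and I/T are dependent within each group:

    ## Joint probabilities of (ANC low, I/T high) in septic and non-septic infants:
    p.sep <- matrix(c(0.50, 0.15, 0.15, 0.20), 2, 2,
      dimnames = list(anc.low = c("yes", "no"), it.high = c("yes", "no")))
    p.nosep <- matrix(c(0.02, 0.08, 0.08, 0.82), 2, 2,
      dimnames = list(anc.low = c("yes", "no"), it.high = c("yes", "no")))

    joint.lr <- p.sep["yes", "yes"] / p.nosep["yes", "yes"]    # true joint LR = 25
    naive.lr <- (sum(p.sep["yes", ]) / sum(p.nosep["yes", ])) *
      (sum(p.sep[, "yes"]) / sum(p.nosep[, "yes"]))            # product of marginal LRs = 42.25
    c(joint = joint.lr, naive = naive.lr)                      # multiplying overstates the evidence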

23 Bayes's Theorem Backwards

If an accurate method can be found to estimate P(sepsis | ANC, I/T), this can be done by running Bayes's Theorem in odds form backwards: with S = 1 for sepsis and 0 otherwise,

  P(ANC, I/T | S = 1) / P(ANC, I/T | S = 0) = [ P(S = 0) / P(S = 1) ] × [ P(S = 1 | ANC, I/T) / (1 − P(S = 1 | ANC, I/T)) ].    (18)

The first thing that comes to mind in estimating P(S = 1 | ANC, I/T) is logistic regression, but it's important to bring the predictors ANC and I/T into the model in the correct form; what does the surface P(S = 1 | ANC, I/T) look like with our data?

Exploratory tools for generalized linear models are not as abundant as with linear models; I used local regression, via the loess command (followed by predict) in R, to explore this surface; recall that my data set has 245 sepsis-positive and 67,378 sepsis-negative babies.

Actually I really want to look at P(S = 1 | ANC, I/T, age), but this will be difficult to visualize, and my clinician colleagues prefer the ANC and I/T answer to be stratified by age group, so I found age cutpoints that

24 Local Regression

captured approximately equal numbers of sepsis-positive infants:

[Table: numbers of sepsis-positive and sepsis-negative infants in the age groups ≤ 1, (1, 2], (2, 6], and > 6 hours.]

A bit of advice: with up to 25,000 observations in each data set, run the loess command like this:

    case.anc.i2t.age1.loess <- loess(case1 ~ anc1 * i2t1,
      data = case.anc.i2t.age1,
      statistics = "approximate", trace.hat = "approximate")

For several of the age groups the results were remarkable; perspective and contour plots follow; note that the predictions sometimes go negative, because loess doesn't know anything about bounds on the outcome.
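Given a fitted loess surface, the likelihood-ratio surface follows from (18); here is a sketch, assuming the column names from the loess call above and (ANC, I/T) grid ranges that are placeholders:

    fit <- case.anc.i2t.age1.loess    # fitted as above

    grid <- expand.grid(anc1 = seq(0, 30, length = 50),    # illustrative ranges
                        i2t1 = seq(0, 1, length = 50))
    p.hat <- predict(fit, newdata = grid)                  # estimated P(S = 1 | ANC, I/T)
    p.hat <- pmin(pmax(p.hat, 1e-6), 1 - 1e-6)             # crude clamp: loess can stray outside (0, 1)

    p.S <- mean(case.anc.i2t.age1$case1)                   # overall P(S = 1) in this age group
    grid$lr <- ((1 - p.S) / p.S) * (p.hat / (1 - p.hat))   # equation (18)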

25–32 Response Surface Exploration

[Figures: perspective plots of the estimated P(case) surface and contour plots over (ANC, I/T), one pair per age group: ≤ 1 hour, 1–2 hours, 2–6 hours, and > 6 hours.]

33 Fixing the Negative Probabilities

The estimated probabilities from loess are highly suggestive, but sometimes go negative; I see three ways forward (in progress):

Try to come up with a parametric surface in ANC and I/T for use in, e.g., a logistic regression model (challenging for several of the age groups).

Figure out how to scale the estimated probabilities from loess so that they retain fidelity to the correct response surface while not going negative. A variety of reasonable ways of doing this have all led to similar results; one such set is plotted on the following pages (using the overall rate of sepsis, 245/67,623, as P(S = 1)).

Fit a Bayesian nonparametric model to the data, via (e.g.) Gaussian processes (joint work with B Gramacy). The generative, hierarchical GP classification model we use may be described as follows: let C(x) ∈ {0, 1} be the classification label at input x ∈ R^m; let Z ≡ Z(X) ∈ R^N be a vector of N latent variables, one for each row in the N × m design matrix X, each row being x_i for i = 1, ..., N with

34–41 Likelihood Ratio Estimation

[Figures: perspective plots of the estimated likelihood-ratio (LR) surface and contour plots over (ANC, I/T), one pair per age group: ≤ 1 hour, 1–2 hours, 2–6 hours, and > 6 hours.]

42 Gaussian Process Classification

corresponding latent Z_i; we assume that X has been pre-scaled to the unit cube; our generative model is

  C(x_i) ~ independent Bernoulli[p(x_i)], with p(x_i) = exp(−Z_i) / [1 + exp(−Z_i)]
  (Z | σ², K) ~ GP(0, σ², K) ≡ N_N(0, σ² K), where K_{i,j} = K(x_i, x_j)
  K(x_i, x_j | d, g) = exp{ −Σ_{k=1}^m (x_ik − x_jk)² / d_k } + δ_{i,j} g
  σ² ~ IG(5/2, 10/2), d_k ~ IID G(1, 20), g ~ Exp(1).

The priors chosen for the free parameters d = (d_1, ..., d_m), g, and σ² are the defaults in the tgp package (Gramacy and Taddy, 2010) for R.

43 Gaussian Process Classification

The correlation function K is from the separable Gaussian family, and d and g are the range and nugget parameters, respectively; we use the shorthand K ≡ K_{d,g}.

A logit link is implied by the second line of the model when g = 0; freeing g ≥ 0 generalizes the logit to a family of effective links parameterizing a continuum between probit and logit (Neal, 1998); thus by inferring g (in the posterior) we infer the link.

Conditional on the parameters and the settings of the latent Z variables, a sample from the predictive distribution of C(x) at a new input x is obtained via the standard kriging equations and an application of the inverse-logit transformation: Z(x) | σ², K is normally distributed with mean k(x)ᵀ K⁻¹ Z and variance σ² [1 + g − k(x)ᵀ K⁻¹ k(x)], where k(x) = (K(x, x_1), ..., K(x, x_N))ᵀ. Samples from the posterior predictive distribution are obtained by conditioning on samples from the posterior of Z, σ², and (the parameters of) K; these are then mapped to probabilities of the class labels.
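A generic R sketch of this kriging step, with the separable Gaussian correlation coded directly (all parameters treated as known for clarity; this does not use the tgp extension itself):

    ## Predictive mean and variance of Z(x) given the latents Z at the N x m
    ## design X, range vector d, nugget g and scale sigma2, as in the text.
    sep.gauss <- function(x1, x2, d) exp(-sum((x1 - x2)^2 / d))

    krige <- function(x, X, Z, d, g, sigma2) {
      N <- nrow(X)
      K <- matrix(0, N, N)
      for (i in 1:N) for (j in 1:N)
        K[i, j] <- sep.gauss(X[i, ], X[j, ], d) + (i == j) * g
      k <- apply(X, 1, sep.gauss, x2 = x, d = d)      # k(x) = (K(x, x_1), ..., K(x, x_N))
      Kinv.k <- solve(K, k)
      list(mean = sum(Kinv.k * Z),                      # k(x)' K^{-1} Z
           var = sigma2 * (1 + g - sum(k * Kinv.k)))    # sigma^2 [1 + g - k(x)' K^{-1} k(x)]
    }

    p.of.z <- function(z) exp(-z) / (1 + exp(-z))    # maps a draw Z(x) to p(x), as in the model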

44 Gaussian Process Classification

Posterior inference for the parameters of the GP classification model is obtained by MCMC, using Metropolis-within-Gibbs sampling.

Conditional on the latent Z variables, samples for (σ², d, g) may be obtained by following any one of several approaches for inference in regression GPs, treating the latents as real-valued observations at the predictors X; you get an IG full conditional for (σ² | d, g) for a Gibbs update, and (blocked) MH or slice sampling of full conditionals can be used for (d, g | σ²); see Gramacy and Lee (2008) for details.

Conditional on the parameters (σ², d, g), there are two common ways to update the latents Z: Neal (1998) proposes an adaptive rejection sampling approach; we follow Broderick and Gramacy (2010), who proposed a 10-fold randomly-blocked Metropolis-within-Gibbs approach that exploits the convenient factorization of the label part P(C(X) = c(X) | Z(X)) and latent part Z(X) of the prior, and the fact that the kriging equations are easily generalized to the multivariate conditional distribution of one group of the latents given the others; the result is a trivial Metropolis-Hastings acceptance calculation and good mixing properties.

45 Gaussian Process Classification

Software, which is an extension of the tgp package, is available from Bobby Gramacy upon request; see Gramacy (2007) for specific computational details and help with the R interface.

The main computational problem is having to invert matrices on each MCMC iteration that unfortunately grow in size with the number of observations; getting even 10,000 posterior samples with data on 10,000–24,000 infants would take an appallingly long time.

Some idea of what to expect can be found by retaining all of the 245 sepsis-positive babies and sampling (say) 755 sepsis-negative babies in a space-filling way in (ANC, I/T) space, to yield a data set with 1,000 observations; this permits results to be obtained overnight, but biases estimates of P(S = 1 | ANC, I/T) upward by oversampling on the positives; it may be possible to overcome this bias (work in progress).
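One simple way to implement the space-filling subsample is a greedy maximin rule; a sketch, assuming neg is a matrix of pre-scaled (ANC, I/T) values for the sepsis-negative babies:

    maximin.subsample <- function(neg, n.keep) {
      picked <- sample(nrow(neg), 1)    # arbitrary starting point
      d.min <- sqrt(rowSums(sweep(neg, 2, neg[picked, ])^2))
      while (length(picked) < n.keep) {
        cand <- which.max(d.min)        # point farthest from the chosen set
        picked <- c(picked, cand)
        d.min <- pmin(d.min, sqrt(rowSums(sweep(neg, 2, neg[cand, ])^2)))
      }
      picked
    }

    ## e.g., keep <- maximin.subsample(neg, 755), then combine with all 245 positives.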

46–51 loess (Full Data) Versus GP (Subsample)

[Figures: side-by-side perspective plots of the estimated P(case) surface over (ANC, I/T) from loess on the full data and from the GP classifier on the 1,000-observation subsample, with contour plots, for several of the age groups (≤ 1 hour, 1–2 hours, > 6 hours).]
