A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles


1 A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles

Jeremy Gaskins, Department of Bioinformatics & Biostatistics, University of Louisville
Joint work with Claudio Fuentes (OSU) and Rolando de la Cruz (Universidad Adolfo Ibáñez)
JSM, July 29, 2018

2 The Motivating Question

Women undergoing assisted reproductive therapy (ART) in order to become pregnant are at heightened risk of early pregnancy loss. The concentration of the hormone β-hCG is associated with the growth of the fetus in early pregnancy and may be predictive of an abnormal pregnancy (loss of the fetus or complications leading to a non-term delivery).

[Figure: β-hCG trajectories for normal pregnancies and abnormal pregnancies]

3 The Motivating Question

If we know (a part of) a patient's β-hCG trajectory, can we use this information to predict whether she will go on to have a successful pregnancy?

Let D = 1 denote disease (abnormal pregnancy), and let Y denote the vector of β-hCG values for patient i. Fit a model f(Y | D = 0) for the normal pregnancy patients and a model f(Y | D = 1) for the abnormal pregnancy patients. Use Bayes' theorem to get the probability of an abnormal pregnancy given the β-hCG trajectory:

pr(D = 1 | Y) = f(Y | D = 1) pr(D = 1) / { f(Y | D = 1) pr(D = 1) + f(Y | D = 0) pr(D = 0) }.
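As a minimal sketch of this classification rule (not from the talk; the density functions and prevalence passed in are placeholders), the posterior probability of an abnormal pregnancy is a direct application of Bayes' theorem:

def posterior_disease_prob(y, dens_abnormal, dens_normal, prev_abnormal):
    # Bayes rule: pr(D = 1 | Y = y) from the two class-conditional models.
    # dens_abnormal(y) = f(y | D = 1) and dens_normal(y) = f(y | D = 0) are assumed
    # to be callables returning density values; prev_abnormal = pr(D = 1).
    num = dens_abnormal(y) * prev_abnormal
    den = num + dens_normal(y) * (1.0 - prev_abnormal)
    return num / den

For instance, prev_abnormal could be estimated by the sample proportion of abnormal pregnancies (49/173 ≈ 0.28).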

4 The Motivating Question

[Figure: model fits for normal pregnancies and abnormal pregnancies]

A couple of concerns:
The abnormal pregnancy model doesn't appear to fit very well. I may believe that there are many ways for the pregnancy to fail, and a model that tries to explain all of these in the same way won't work well.
We need a model that allows multiple types of trajectories, but we don't necessarily know how many there should be.

5 Contents

1. Bayesian Nonparametric Model for Longitudinal Trajectory
2. Computational Considerations
3. Simulation Study
4. Data Application: Assisted pregnancy in Chilean women

6 Some Notation

N = number of patients. In the pregnancy data, N = 173.
D_i = disease status (1 for disease, 0 for healthy); 49 (28.3%) abnormal pregnancies.
t_ij = the jth measurement occasion for patient i.
n_i = number of measurement occasions for patient i. Measurement times are not aligned; n_i ranges from 1 to 6, with 30% of patients having only 1 measurement.
Y_ij = longitudinal biomarker measurement for patient i at the jth observation (time t_ij); here Y_ij = log10(β-hCG).
Y_i = (Y_i1, Y_i2, ..., Y_i,n_i) = vector of biomarkers for patient i.

7 2-Component Model

First, let's formalize the 2-component model that we compare our approach to (Marshall and Baron, 2000).

D_i ~ Bern(φ)
Y_ij = f(t_ij; θ_{D_i}) + γ_i + ε_ij,   with f(t; θ) = θ_1 exp{θ_2 t^{θ_3}}
γ_i ~ N(0, σ_γ²)
ε_i ~ MVN_{n_i}(0_{n_i}, σ² R_i(ρ))

This model has two components:
Healthy: probability 1 − φ and parameter θ_0 = (θ_01, θ_02, θ_03)
Disease: probability φ and parameter θ_1 = (θ_11, θ_12, θ_13)
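A brief simulation sketch of this model may help fix notation. It is not the authors' code; the parameter values are arbitrary, and the correlation structure R_i(ρ) is assumed here to be a continuous-time AR(1) with entries ρ^|t_ij − t_ik|, since the slides do not spell it out.

import numpy as np

def mean_traj(t, theta):
    # f(t; theta) = theta1 * exp{ theta2 * t^theta3 }, as reconstructed from the slide.
    th1, th2, th3 = theta
    return th1 * np.exp(th2 * t ** th3)

def simulate_patient(t, theta, sig2, rho, sig2_gamma, rng):
    # Y_i = f(t; theta) + gamma_i + eps_i, with the random intercept gamma_i
    # marginalized into the covariance: sig2 * R(rho) + sig2_gamma * 1 1'.
    R = rho ** np.abs(np.subtract.outer(t, t))            # assumed AR(1)-type correlation
    cov = sig2 * R + sig2_gamma * np.ones((len(t), len(t)))
    return rng.multivariate_normal(mean_traj(t, theta), cov)

rng = np.random.default_rng(0)
t = np.array([5.0, 10.0, 20.0])                           # illustrative measurement days
y = simulate_patient(t, theta=(1.5, 0.8, 0.3), sig2=0.05, rho=0.5, sig2_gamma=0.1, rng=rng)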

8 Bayesian Nonparametric Model for Longitudinal Trajectory

We extend the 2-component model to a Bayesian nonparametric version:

(θ_i, φ_i) ~ DP(α, MVN_3(θ̄, Σ) × Beta(a, b))
D_i ~ Bern(φ_i)
Y_ij = f(t_ij; θ_i) + γ_i + ε_ij
γ_i ~ N(0, σ_γ²)
ε_i ~ MVN_{n_i}(0_{n_i}, σ² R_i(ρ))

Due to the DP choice, there are many ties in the (θ_i, φ_i)s across patients. In essence, we have a small number of clusters with unique values of these parameters.

We will have multiple types of clusters: clusters with φ^(c) near 0 will contain mainly healthy patients, and clusters with φ^(c) near 1 mainly diseased patients.

9 Bayesian Nonparametric Model for Longitudinal Trajectory

Let c_i ∈ {1, 2, 3, ...} be the cluster indicator for patient i. We can equivalently express the model as:

pr(c_i = k) = ψ_k = V_k ∏_{j=1}^{k−1} (1 − V_j),   V_k ~ Beta(1, α)
D_i | c_i = k ~ Bern(φ^(k))
Y_ij | c_i = k = f(t_ij; θ^(k)) + γ_i + ε_ij
γ_i ~ N(0, σ_γ²)
ε_i ~ MVN_{n_i}(0_{n_i}, σ² R_i(ρ))
θ^(k) ~ MVN_3(θ̄, Σ)
φ^(k) ~ Beta(a, b)
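The weights ψ_k above follow the usual stick-breaking construction, which is easy to simulate. The sketch below (illustrative only) truncates the stick at H components, as is commonly done for computation; the truncation level and hyperparameter values are placeholders.

import numpy as np

def stick_breaking_weights(alpha, H, rng):
    # V_k ~ Beta(1, alpha); psi_k = V_k * prod_{j<k} (1 - V_j), truncated at H components.
    V = rng.beta(1.0, alpha, size=H)
    V[-1] = 1.0                                   # truncation: forces the weights to sum to 1
    psi = V * np.concatenate(([1.0], np.cumprod(1.0 - V[:-1])))
    return psi

rng = np.random.default_rng(1)
psi = stick_breaking_weights(alpha=1.0, H=20, rng=rng)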

10 Bayesian Nonparametric Model for Longitudinal Trajectory

If we observe a patient's longitudinal trajectory y, we can predict disease status as follows:

pr(D = 1 | y) = E{I(D = 1) | y} = E[E{I(D = 1) | C = k, y} | y]
             = Σ_{k=1}^H φ^(k) pr(C = k | y)
             = Σ_{k=1}^H φ^(k) ψ_k MVN(y; f(t; θ^(k)), σ² R(ρ) + σ_γ² 1 1ᵀ) / Σ_{h=1}^H ψ_h MVN(y; f(t; θ^(h)), σ² R(ρ) + σ_γ² 1 1ᵀ)
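A sketch of this computation for a single posterior draw of the cluster-specific parameters is below (again illustrative: the AR(1)-type form of R(ρ) is an assumption, and all inputs are placeholders). Averaging the returned probability over MCMC draws gives the model-averaged prediction used later under the BMA approach.

import numpy as np
from scipy.stats import multivariate_normal

def predict_disease(y, t, psi, thetas, phis, sig2, rho, sig2_gamma):
    # pr(D = 1 | y) = sum_k phi^(k) * psi_k * MVN(y; f(t; theta^(k)), Sigma_y) / normalizer,
    # where Sigma_y = sig2 * R(rho) + sig2_gamma * 1 1'.
    R = rho ** np.abs(np.subtract.outer(t, t))            # assumed correlation structure
    cov = sig2 * R + sig2_gamma * np.ones((len(t), len(t)))
    dens = []
    for th1, th2, th3 in thetas:
        mu = th1 * np.exp(th2 * t ** th3)                 # f(t; theta^(k)) from slide 7
        dens.append(multivariate_normal.pdf(y, mean=mu, cov=cov))
    w = np.asarray(psi) * np.asarray(dens)
    return float(np.sum(np.asarray(phis) * w) / np.sum(w))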

11 Bayesian Nonparametric Model for Longitudinal Trajectory

We will also need to specify priors for the remaining parameters:

α ~ Gamma(1, 1)
σ_γ² ~ InvGamma(0.1, 0.1)
σ² ~ InvGamma(0.1, 0.1)
ρ ~ Unif(0, 1)
θ̄ ~ MVN_3(1_3, 10² I_3)
Σ ~ InvWish(5, I_3)
a = b = 0.5

12 Contents

1. Bayesian Nonparametric Model for Longitudinal Trajectory
2. Computational Considerations
3. Simulation Study
4. Data Application: Assisted pregnancy in Chilean women

13 Markov chain Monte Carlo sampling

In order to obtain any posterior inference, we need a Markov chain Monte Carlo (MCMC) sampling algorithm. However, there are some challenges to drawing inference in our model.

Label switching: the likelihood function is invariant to permutations of the cluster labels (Stephens, 2000). For instance, an estimate of φ^(1) from the MCMC sample is nonsensical, because the type of patient who defines the first cluster changes over the course of the sampler.

Obviously, cluster-specific parameters are not identifiable, but these two quantities are estimable:
Probability that two patients are in the same cluster: pr(C_i = C_j | y_i, y_j)
Disease predictions: pr(D = 1 | y) = Σ_{k=1}^H φ^(k) pr(C = k | y)
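Both quantities can be read off the saved cluster labels. The sketch below (illustrative, assuming the sampler stores a label vector per draw) computes the matrix of pairwise co-clustering probabilities, which is invariant to relabeling of the clusters.

import numpy as np

def coclustering_matrix(labels):
    # labels: array of shape (n_draws, N); labels[s, i] = cluster of patient i at draw s.
    # Returns the N x N matrix of pr(C_i = C_j | y), estimated by the fraction of draws
    # in which patients i and j share a label.
    labels = np.asarray(labels)
    n_draws, N = labels.shape
    P = np.zeros((N, N))
    for lab in labels:
        P += (lab[:, None] == lab[None, :])
    return P / n_draws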

14 Markov chain Monte Carlo sampling

If all we need is to make predictions about disease status, then label switching is not an issue. But if we want to draw conclusions about particular trajectories, we will need posterior samples with identifiable parameters.

Based on the posterior pairwise probabilities pr(C_i = C_j | y_i, y_j), we estimate an optimal partition, the most likely clustering configuration.

Given the optimal partition, we can run a Stage II MCMC algorithm without updating the cluster memberships to get a usable posterior sample.

15 Estimating the optimal partition

Dahl's method (Dahl, 2006): find ĉ = (ĉ_1, ĉ_2, ..., ĉ_N) from the MCMC sample that minimizes
Σ_{i=1}^N Σ_{j=1}^N [I(ĉ_i = ĉ_j) − pr(C_i = C_j | y)]².

Hierarchical clustering: using the pairwise clustering probabilities pr(C_i = C_j | y_i, y_j), we can obtain a dendrogram describing the clustering relationships. There are various choices of the linkage criterion; we consider average linkage and Ward's method.
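Dahl's criterion is straightforward to apply once the co-clustering matrix is available. A small sketch (illustrative; labels and P are assumed to come from the MCMC output as above):

import numpy as np

def dahl_partition(labels, P):
    # Return the sampled partition c-hat minimizing sum_ij [I(c_i = c_j) - P_ij]^2 (Dahl, 2006).
    best_loss, best = np.inf, None
    for c in np.asarray(labels):
        A = (c[:, None] == c[None, :]).astype(float)
        loss = np.sum((A - P) ** 2)
        if loss < best_loss:
            best_loss, best = loss, c
    return best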

16 Estimating the optimal partition

Hierarchical clustering gives us a partition for each number of clusters k, but we still have to choose the value of k.

Some automated methods are designed to choose k by optimizing a criterion: the Gamma measure, the Tau measure, the Silhouette index (Charrad et al., 2014). But most approaches to selecting k require an actual (rectangular) data matrix.

Under average linkage, the dendrogram height h represents the fact that patients i and j assigned to different clusters satisfy pr(C_i ≠ C_j | y) ≳ h, so we can instead cut the dendrogram at a specified value of h, such as 0.75 or 0.9.
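One way to implement the height-based cut (a sketch using SciPy's hierarchical clustering on the dissimilarity 1 − pr(C_i = C_j | y); this is our illustration rather than the authors' code):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def cut_coclustering_dendrogram(P, h=0.75, method="average"):
    # Build a dendrogram from the pairwise co-clustering probabilities and cut it at height h.
    D = 1.0 - np.asarray(P)                        # dissimilarity = 1 - pr(C_i = C_j | y)
    np.fill_diagonal(D, 0.0)
    Z = linkage(squareform(D, checks=False), method=method)
    return fcluster(Z, t=h, criterion="distance")  # cluster labels for the chosen cut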

17 Model Choices

Different models under consideration:
1. Two-component model
2. Bayesian nonparametric model with label switching; disease prediction through Bayesian model averaging (BMA)
3. Bayesian nonparametric model with 2-stage estimation; optimal clustering chosen by Dahl's method
4. Bayesian nonparametric model with 2-stage estimation; optimal clustering chosen through hierarchical clustering, under each method for choosing k

Because the number of clusters and the cluster memberships are unknown, the BMA choice most accurately represents what we can learn from the data. Theoretically, predictions under BMA should be best (lowest variance) by Rao-Blackwell considerations.

18 Contents

1. Bayesian Nonparametric Model for Longitudinal Trajectory
2. Computational Considerations
3. Simulation Study
4. Data Application: Assisted pregnancy in Chilean women

19 Simulation Study #1

[Figure: simulated trajectories. All groups (n = 200); Group 1 (n = 80, φ = 0); Group 2 (n = 40, φ = 0.2); Group 3 (n = 30, φ = 0.2); Group 4 (n = 30, φ = 0.9); Group 5 (n = 20, φ = 1)]

20 Simulation Study #2

[Figure: simulated trajectories. All groups (n = 200); Group 1 (n = 150, φ = 0); Group 2 (n = 50, φ = 1)]

In each case, we generate 200 datasets with 200 observations each and apply each of the methods.

21 Simulation Study Results

(Standard errors in parentheses; "..." marks entries that did not transfer.)

Model             Clusters     Out-of-sample % error   Out-of-sample AUC   Within-sample % error   Within-sample AUC

Simulation Study #1 (5 components)
2-component       ...          ...% (0.9)              ... (.013)          25.0% (2.5)             ... (.013)
BMA               8.1 (1.1)    21.7% (0.7)             ... (.007)          20.1% (3.0)             ... (.029)
Dahl              9.0 (2.9)    22.6% (0.9)             ... (.009)          20.2% (2.9)             ... (.030)
Avg(h = .75)      5.0 (1.1)    22.8% (1.5)             ... (.022)          20.7% (3.3)             ... (.035)
Avg(median)       7.7 (1.1)    22.9% (1.8)             ... (.033)          20.6% (3.3)             ... (.043)
Avg(Silhouette)   4.2 (0.9)    23.0% (2.2)             ... (.018)          21.2% (3.2)             ... (.036)

Simulation Study #2 (2 components)
2-component       ...          ...% (1.0)              ... (.007)          16.0% (2.1)             ... (.027)
BMA               3.0 (0.6)    18.0% (1.0)             ... (.008)          16.0% (2.0)             ... (.028)
Dahl              2.5 (0.8)    18.0% (1.0)             ... (.008)          16.1% (2.2)             ... (.027)
Avg(h = .75)      2.1 (0.2)    18.0% (1.1)             ... (.009)          16.1% (2.0)             ... (.028)
Avg(median)       2.7 (0.7)    18.2% (1.6)             ... (.032)          16.3% (2.7)             ... (.045)
Avg(Silhouette)   2.1 (0.4)    18.0% (1.1)             ... (.009)          16.1% (2.2)             ... (.029)

22 Simulation Study Results

Comments:
If the two-component model is not correct, it produces substantially worse predictions; when the two-component model is correct, the BMA and two-stage BNP estimators do just as well.
BMA beats the two-stage BNP predictions, but not by much. The minor loss in prediction accuracy may be justified by the clearer interpretation the two-stage fits provide.
The Dahl method tends to produce too many clusters, even though it is among the best two-stage methods in prediction.
We recommend prediction using the BMA estimates and interpretation through the two-stage procedure under average linkage with fixed h = 0.75, followed by the Silhouette index (either average or Ward linkage) and Dahl's method.

23 Contents

1. Bayesian Nonparametric Model for Longitudinal Trajectory
2. Computational Considerations
3. Simulation Study
4. Data Application: Assisted pregnancy in Chilean women

24 Data Application: Assisted pregnancy in Chilean women

We estimate the accuracy using within-sample prediction from the full data (n = 173) and from a 25-fold cross-validation (test sets of n = 35, 20% of the data).

[Table: number of clusters, % error, and AUC under the full data and under 25-fold CV for the 2-component, BMA, Dahl, Avg(h = 0.75), Avg(Silhouette), and Ward(Silhouette) models; cross-validation standard errors reported in parentheses]

25 Data Application: Assisted pregnancy in Chilean women

We consider the interpretable conclusions based on the Stage II model fits from the Silhouette index with average linkage.

Cluster A: n = 16, disease probability = 0.971 (0.86, 1)
Cluster B: n = 25, disease probability = 0.942 (0.83, 1)
Cluster C: n = 2, disease probability = 0.494 (0.06, 0.93)
Cluster D: n = 13, disease probability = 0.181 (0.03, 0.42)
Cluster E: n = 117, disease probability = 0.056 (0.02, 0.1)

26 References

Gaskins, J., C. Fuentes, and R. de la Cruz (2017). A Bayesian Nonparametric Model for Predicting Pregnancy Outcomes Using Longitudinal Profiles. arXiv:1711.01512.
Charrad, M., N. Ghazzali, V. Boiteau, and A. Niknafs (2014). NbClust: An R package for determining the relevant number of clusters in a data set. Journal of Statistical Software 61, 1-36.
Dahl, D.B. (2006). Model-Based Clustering for Expression Data via a Dirichlet Process Mixture Model. In Bayesian Inference for Gene Expression and Proteomics, K.-A. Do, P. Müller, and M. Vannucci (eds.). Cambridge University Press.
Marshall, G., and A.E. Baron (2000). Linear discriminant models for unbalanced longitudinal data. Statistics in Medicine 19.
Stephens, M. (2000). Dealing with label switching in mixture models. Journal of the Royal Statistical Society, Series B 62.
