An Application of MCMC Methods and of Mixtures of Laws for the Determination of the Purity of a Product.

Julien Cornebise & Myriam Maumy

E.S.I-E-A, 72 avenue Maurice Thorez, Ivry-sur-Seine, France
LSTA, Université Pierre et Marie Curie - Paris VI, 175 rue du Chevaleret, Paris, France
(cornebis@et.esiea.fr)

IRMA, Université Louis Pasteur - Strasbourg I, 7 rue René Descartes, Strasbourg Cedex, France
(mmaumy@math.u-strasbg.fr)

Abstract

The aim of this article is to illustrate practical issues that may arise when applying MCMC methods to a real case. MCMC methods are used on a mixture-of-distributions model in order to determine honesty specifications for a certain kind of product. First, assuming a known number of components, the parameters of each component are estimated using Gibbs sampling; convergence and label switching are taken into account. Then the determination of the number of components is addressed, through model selection with Bayes factors.

Résumé

Le but de cet article est d'illustrer des questions pratiques qui peuvent survenir lorsque l'on applique des méthodes MCMC à un cas réel. Ces dernières sont utilisées sur un modèle de mélange de lois, afin de déterminer des spécifications d'honnêteté sur un certain type de produits. En travaillant dans un premier temps à nombre de composantes fixé, les paramètres de chaque composante sont estimés à l'aide de l'échantillonneur de Gibbs. La convergence et le label-switching sont pris en compte. Enfin, la détermination du nombre de composantes est envisagée, à l'aide des facteurs de Bayes.

1. The Problem and its Modelling

A particular kind of product, available in stores almost everywhere, is subject to different kinds of adulteration (or fraud). Some products are pure and thus have low glucose and xylose rates, while others have been chemically modified (by the addition of extra components), which reduces their quality and increases their glucose and xylose rates. Given a set of 1002 observations of the couple (glucose rate, xylose rate), can we determine: (i) the number K of kinds of production and their parameters (mean, standard deviation), with K - 1 different possible modifications plus one for pure products; (ii) the proportion of each kind of production; and (iii) a way to detect whether a product is pure or not (e.g. the 99% quantile of the glucose distribution for the honest kind of production)? We only consider the univariate case here, i.e. the glucose and xylose rates separately. The approach could later be generalised to the two-dimensional variable.

We model the distribution of the glucose (resp. the xylose) as a mixture of normal laws. We consider that the T = 1002 observations of the sample come from K distinct populations (1 honest and K - 1 modified), each population k ∈ {1, ..., K} following a normal law of density f_k with parameters θ_k = (µ_k, σ_k). The density of an observation x_i, 1 ≤ i ≤ T, is

[x_i | π, θ] = Σ_{k=1}^K π_k f_k(x_i | θ_k),

where f_k(· | θ_k) is the density of a normal law with parameters θ_k = (µ_k, σ_k) and π_k is the probability of belonging to population k, with Σ_{k=1}^K π_k = 1. The parameters of interest are θ = (θ_1, ..., θ_K) and π = (π_1, ..., π_K). The matter of choosing the value of K will be addressed later. The normality assumption may (and should) be reconsidered at a later stage (e.g. in favour of a log-normal distribution). Other parametrisations of mixtures of normal laws have been published, for example by Robert in Droesbeke et al. (2002); they avoid some label-switching troubles (see below) but lose the natural physical interpretation, which is very useful here.

The parameters θ_k of each law are considered as random variables with their own distribution, so the notation [ · | · ] is used both for conditional densities and for parametrised densities. The distribution with density [θ, π] is the prior distribution, and its parameters are the hyperparameters. They make it possible to take into account, mathematically, the prior knowledge of the experts of the field, if available (here, the information possibly held by the chemists). The general aim of the study is to estimate a function F(θ, π) of the true values of the parameters. This can be done by computing

E[F(θ, π) | x] = ∫ F(θ, π) [θ, π | x] d(θ, π),

where [θ, π | x] is the density of the posterior distribution, i.e. the distribution of the parameters conditionally on the observations x = (x_1, ..., x_T). It can be computed via the Bayes formula

[θ, π | x] = [x | θ, π] [θ, π] / [x],

where [x], the marginal density of the observations, does not depend on the parameters and can thus be ignored. The first question is therefore the choice of the prior distribution of the parameters, [θ, π].

Before going any further, we introduce into the model the hidden variables z_i, i ∈ {1, ..., T}, which are not observed and are therefore called latent variables. z_i ∈ {1, ..., K} indicates the population of origin of the observation x_i, and we write z = (z_1, ..., z_T). The z_i are i.i.d., with

[z_i = k | π] = π_k,    [x_i | θ, π, z_i = k] = N(x_i | µ_k, σ_k),

where N(· | µ, σ) is the density of a univariate normal law with mean µ and standard deviation σ. The analysis of mixture distributions by MCMC methods has been the subject of many publications, for example Diebolt and Robert (1990), Richardson and Green (1997), Stephens (1997), and Marin et al. (to appear).
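
To make the latent-variable formulation concrete, the short sketch below simulates a sample from such a mixture by first drawing the allocations z_i and then drawing each x_i from its component. The parameter values are purely illustrative and are not those of the glucose data; this is our didactic example, not part of the original study.

```python
import numpy as np

def simulate_mixture(T, pi, mu, sigma, rng):
    """Draw T observations from a K-component normal mixture
    via the latent allocations z_i."""
    K = len(pi)
    z = rng.choice(K, size=T, p=pi)       # z_i ~ Multinomial(pi)
    x = rng.normal(mu[z], sigma[z])       # x_i | z_i = k ~ N(mu_k, sigma_k)
    return x, z

rng = np.random.default_rng(0)
# Hypothetical values: one "pure" component and two "adulterated" ones.
pi    = np.array([0.6, 0.3, 0.1])
mu    = np.array([2.0, 5.0, 9.0])
sigma = np.array([0.5, 1.0, 1.5])
x, z = simulate_mixture(1002, pi, mu, sigma, rng)
```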

2. Choosing the Prior and its Hyperparameters

The choice of the prior distribution of the parameters is always a much discussed point in any Bayesian analysis. Several cases may arise. Either the experts of the field have valuable information about the distribution of the parameters (for example, the mean of each component is approximately known), or they have no information at all (or it should be ignored, for blind-validation purposes). In the latter case, we can either use an empirical prior, i.e. hyperparameters built from the data, a non-informative prior, i.e. a prior carrying no information at all, or a poorly informative prior, i.e. one carrying as little information as possible. Moreover, the prior is often chosen within a closed-under-sampling, or conjugate, family, i.e. such that conditioning on the sample (passing from the prior to the posterior distribution) only changes the hyperparameters, not the family; this simplifies the implementation. This part of the Bayesian analysis is the most subjective one. A sensitivity analysis should be performed after the study, to test the stability of the results with respect to the choice of the hyperparameters and of the prior.

Here we have chosen the following model, which arises frequently because each law belongs to a closed-under-sampling family:

π = (π_1, ..., π_K) ~ Di(a_1, ..., a_K),    µ_k | σ_k ~ N(m_k, σ_k / √c_k),    σ_k² ~ IG(α_k, β_k),

where Di denotes a Dirichlet law and IG an Inverse Gamma law. The hyperparameters are a_k, m_k, c_k, α_k and β_k, k ∈ {1, ..., K}. For a non-informative prior we would take a uniform law for π, obtained with a_1 = ... = a_K = 1; a Dirac density for σ_k², obtained with limit values of α_k and β_k; and a flat density on R for each µ_k, which is more difficult to obtain with limit values of c_k. In practice, non-informative priors are not so easy to deal with: in an implementation under Matlab or Scilab, for example, they are not always usable, because of infinite values of the parameters or of degenerate densities. Therefore a poorly informative prior, or even an empirical prior, should be used; this is what we have done here. We have chosen, for all k ∈ {1, ..., K}: a_k = 1, c_k = 1, m_k = x̄ (the empirical mean), α_k = K, and β_k = (T(K - 1))⁻¹ Σ_{i=1}^T (x_i - x̄)². Thus the proportions are non-informative, the prior mean of each component equals the sample mean, and the prior mean of σ_k² equals the empirical variance (E[σ_k²] = (α_k - 1) β_k with the Inverse Gamma parametrisation used here).

For reasons that will become clear later (related to label switching and to Bayes factors), we have chosen the same hyperparameters for each of the K components: this maintains the symmetry of the prior density (and therefore of the posterior). The choice of the prior is clearly the weakest, the most arguable, and the most argued point of the analysis. Each of the many possible approaches has its strengths and weaknesses; the one chosen here is neither the most rigorous nor the purest, but it allows an easy implementation. Further discussions about the choice of the prior can be found, for example, in Droesbeke et al. (2002) and Gelman et al. (2003).
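
As a concrete illustration, the sketch below computes these empirical hyperparameters from a data vector x. The function and variable names are ours, introduced only for this example.

```python
import numpy as np

def empirical_hyperparameters(x, K):
    """Poorly informative / empirical prior of Section 2:
    a_k = 1, c_k = 1, m_k = sample mean, alpha_k = K,
    beta_k = sum_i (x_i - xbar)^2 / (T * (K - 1))."""
    T = len(x)
    xbar = np.mean(x)
    a     = np.ones(K)                  # uniform Dirichlet weights
    c     = np.ones(K)
    m     = np.full(K, xbar)            # prior mean = sample mean
    alpha = np.full(K, float(K))
    beta  = np.full(K, np.sum((x - xbar) ** 2) / (T * (K - 1)))
    return a, c, m, alpha, beta
```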

3. Gibbs Sampling and Complete Conditional Laws

The expectation E[F(θ, π) | x] is hard to evaluate, either analytically or numerically, because of its complex expression and its highly multidimensional nature. MCMC methods are a good way to solve this problem. The principle of Monte Carlo methods is to use N independent realisations (θ^(i), π^(i)) of random variables following the posterior distribution [θ, π | x], and to approximate

∫ F(θ, π) [θ, π | x] d(θ, π) ≈ (1/N) Σ_{i=1}^N F(θ^(i), π^(i)).

Here F will first be the identity function, in order to estimate the parameters, and then the 99%-quantile function of the component corresponding to the pure product. In the latter case, in order to lower the producer's risk, some practitioners advise using the empirical 95% quantile of the sampled values F(θ^(i), π^(i)) rather than their empirical mean.

The Markov-chain part of MCMC methods solves the problem of sampling from the posterior distribution. Several algorithms exist; the two main ones are the Metropolis-Hastings algorithm and the Gibbs sampler. We have used the latter, which is quite straightforward to implement. Its aim is to sample from the joint posterior distribution of (θ, π), and it is based on the complete conditional laws. The general principle is the following. Let φ = (φ_1, ..., φ_n) be the vector of the parameters of the model; here φ = (θ, π) = (µ_1, σ_1, ..., µ_K, σ_K, π_1, ..., π_K). Let φ_(-i) = (φ_1, ..., φ_{i-1}, φ_{i+1}, ..., φ_n) be the vector φ without its i-th component. The density of the complete conditional law of φ_i is [φ_i | φ_(-i), x], whose closed form is often known. Assume that all the complete conditional laws [φ_i | φ_(-i), x], i ∈ {1, ..., n}, are known, and take arbitrary initial values φ^(0) = (φ_1^(0), ..., φ_n^(0)). The Gibbs sampling algorithm consists in successively sampling

φ_1^(j) from [φ_1 | φ_2^(j-1), φ_3^(j-1), ..., φ_n^(j-1), x],
φ_2^(j) from [φ_2 | φ_1^(j), φ_3^(j-1), ..., φ_n^(j-1), x],
...
φ_n^(j) from [φ_n | φ_1^(j), φ_2^(j), ..., φ_{n-1}^(j), x],

for j ∈ {1, ..., N + M}, where M is the number of iterations that will be discarded. These are called the burn-in iterations and correspond to the time before convergence. It can be shown that the distribution of φ^(j) = (φ_1^(j), ..., φ_n^(j)) converges, as j increases, to the joint posterior distribution [φ_1, ..., φ_n | x]. The N values following the burn-in are then considered as a sample from this distribution and can be used to approximate E[F(θ, π) | x] by an empirical mean, as mentioned above.

In our model, the complete conditional laws are the following (to simplify the notation, the conditioning on all the other parameters and on the data is abbreviated by "· · ·"):

z_i | · · · ~ Mult(ω_{i1}, ..., ω_{iK}),  with ω_{ik} ∝ π_k N(x_i | µ_k, σ_k),
π | · · · ~ Di(a_1 + n_1, ..., a_K + n_K),  where n_k = #{i : z_i = k},
µ_k | · · · ~ N( (c_k m_k + s_k) / (c_k + n_k), σ_k / √(c_k + n_k) ),  where s_k = Σ_{i : z_i = k} x_i,
σ_k² | · · · ~ IG( α_k + (n_k + 1)/2, β_k + [c_k (µ_k - m_k)² + Σ_{i : z_i = k} (x_i - µ_k)²] / 2 ).

Now the sampling can take place, for example with M = 5000 burn-in iterations followed by N retained iterations. Convergence issues are discussed below. Details on the Metropolis-Hastings algorithm and on variants of the Gibbs sampler can be found in Droesbeke et al. (2002), in Richardson and Green (1997), or in Marin et al. (to appear).
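
As an illustration, here is a minimal sketch of one sweep of this Gibbs sampler using the complete conditional laws above, written with numpy only. It assumes the hyperparameter vectors a, c, m, alpha, beta and the current state (pi, mu, sigma) are available as float arrays; it is a didactic sketch, not the authors' actual implementation.

```python
import numpy as np

def gibbs_sweep(x, pi, mu, sigma, a, c, m, alpha, beta, rng):
    """One sweep of the Gibbs sampler for the univariate normal mixture,
    following the complete conditional laws of Section 3."""
    T, K = len(x), len(pi)
    mu, sigma = mu.copy(), sigma.copy()

    # 1) Allocations: P(z_i = k | ...) proportional to pi_k * N(x_i | mu_k, sigma_k).
    dens = (pi / (sigma * np.sqrt(2 * np.pi))
            * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2))        # shape (T, K)
    probs = dens / dens.sum(axis=1, keepdims=True)
    z = np.array([rng.choice(K, p=p) for p in probs])

    # 2) Weights: pi | z ~ Dirichlet(a_1 + n_1, ..., a_K + n_K).
    n = np.bincount(z, minlength=K)
    pi = rng.dirichlet(a + n)

    # 3) Means and standard deviations, component by component.
    for k in range(K):
        xk = x[z == k]
        s_k = xk.sum()
        post_mean = (c[k] * m[k] + s_k) / (c[k] + n[k])
        mu[k] = rng.normal(post_mean, sigma[k] / np.sqrt(c[k] + n[k]))
        shape = alpha[k] + (n[k] + 1) / 2
        rate = beta[k] + 0.5 * (c[k] * (mu[k] - m[k]) ** 2 + ((xk - mu[k]) ** 2).sum())
        sigma[k] = np.sqrt(1.0 / rng.gamma(shape, 1.0 / rate))         # sigma_k^2 ~ IG(shape, rate)
    return z, pi, mu, sigma
```

A full run would repeat this sweep M + N times from arbitrary starting values, discard the first M states, and keep the remaining N as approximate draws from the joint posterior.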

4. Convergence Issues

Since the historically first convergence diagnostic (known as the "thick pen" diagnostic, Gelfand et al. (1990)), two main kinds of methods have been developed to determine whether the sampler has reached convergence, i.e. whether the values of the current generation can be considered as a sample from the target distribution: single-chain convergence diagnostics, and multi-chain ones. Single-chain diagnostics have been clearly advised against by Gelman and Rubin (1992). The principle of multi-chain diagnostics is to run several independent chains from very different starting points, and to test whether the last values of each chain come from the same distribution or not; if so, we can assume that convergence has been reached. Gelman and Rubin (1992) have proposed a method, which is often used, based on the comparison of the within-chain and between-chain variances. At the beginning of the sampling, the chains are strongly influenced by their starting points, and the between-chain variance is far above the within-chain one; when each chain has reached the target distribution, the ratio of between-chain to within-chain variance should be close to 1. This method has been much improved since then, and more subtle tests are now available. It is quite straightforward to implement (we do not detail it here, to save space) and has proven efficient in many cases. But here we encounter an important problem that makes it almost impossible to use: the label-switching problem.
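
For reference, here is a minimal sketch of the basic between/within-chain comparison just described (the potential scale reduction factor), in a simple form; the refinements mentioned above are not included. Applying it to a permutation-invariant scalar summary of each iteration, such as the log-likelihood, sidesteps the label-switching issue discussed in the next section.

```python
import numpy as np

def gelman_rubin(chains):
    """Basic potential scale reduction factor.
    `chains` has shape (n_chains, n_iterations) and holds a scalar
    summary of each retained iteration of each independent chain."""
    chains = np.asarray(chains, dtype=float)
    n = chains.shape[1]
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    B = n * chain_means.var(ddof=1)            # between-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled variance estimate
    return np.sqrt(var_hat / W)                # values close to 1 suggest convergence
```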

5. The Label Switching Problem

A particularity of mixtures of laws is that the density of the observations, as well as the likelihood and the joint posterior density of the parameters (which is the target distribution of the Gibbs sampler), is symmetric, i.e. invariant under permutation of the components. This posterior density therefore has up to K! replications of each mode, and the sampler can move freely from one mode to another, thereby permuting the components. The absence of label switching means that the sampler is stuck in a local mode, possibly because the modes are well separated (e.g. when K is very low); the parameter space is then not completely explored, which is not healthy. When estimating a function F(θ, π) that is itself invariant under permutation of the components (e.g. the value of the density at a given point), label switching is not a problem. But when trying to estimate the parameters of each component, the label switching has to be undone.

Two approaches come to mind: imposing an identifiability constraint during the sampling, or post-processing the sampled values to undo the permutations. The first solution constrains the exploration of the parameter space by the sampler and may therefore alter the results; see Celeux et al. (2000), Marin et al. (to appear), and Stephens (2000). Forcing the prior to be highly separated (using very different hyperparameters across components) is not a good idea either: label switching arises anyway, and the resulting lack of correspondence between prior and component would corrupt any interpretation relying on the prior (such as Bayes factors).

We applied the second solution here. The post-processing must be done very carefully: ordering on the µ_k, on the σ_k, or on the π_k is not a good idea, since some components may be close in mean but not in variance, and vice versa. Some methods use a loss function and clustering algorithms (e.g. K-means with K! classes) to determine to which mode (i.e. to which permutation) each sampled vector of parameters belongs; the reader is invited to consult the three references above for more details. Label switching is, alas, incompatible with convergence diagnostics such as those mentioned in Section 4: comparing the variance between chains is meaningless when components can swap! Moreover, the clustering assumes that convergence has been reached, so it would make no sense to apply variance-based diagnostics to the post-processed samples.

6. Choosing a Model: Bayes Factors

Until now we have worked with a given number K of components; the question is now to choose between different models. Let M = {M_1, ..., M_K} be a finite set of possible models (each one with k components, up to K; K = 7 in our application) to explain the observations x_i, model M_k being parametrised by θ_k. The best model among the K possible ones is the one with the highest posterior probability. The posterior probability of model M_k is obtained via the Bayes formula

[M_k | x] = [M_k] [x | M_k] / Σ_{M_j ∈ M} [M_j] [x | M_j],

where [M_k] is the prior probability of the validity of M_k, with Σ_{M_k ∈ M} [M_k] = 1 (e.g. [M_k] = 1/K for all k), and [x | M_k], defined by [x | M_k] = ∫ [x | θ_k] [θ_k] dθ_k, is the prior predictive law of x under model M_k. Consider two models M_1 and M_2. The ratio of posterior probabilities, [M_2 | x] / [M_1 | x] = ([M_2] [x | M_2]) / ([M_1] [x | M_1]), is a posterior bet in favour of M_2 against M_1. The ratio B_21 = [x | M_2] / [x | M_1], which modifies the prior bet [M_2] / [M_1], is called the Bayes factor of model M_2 relative to M_1. Kass and Raftery (1995) suggest a scale for interpreting Bayes factors, which can serve as a first indication but is far from being universal.

The evaluation of Bayes factors requires integration, which can be done with MCMC methods. Nevertheless, sampling from the prior distribution [θ_k] would not be efficient, because the prior is often very flat. Kass and Raftery (1995) advise using another method, based on samples from the posterior distribution [θ_k | x] and making once more use of the Bayes formula; it implies continuing the Gibbs sampler while setting the parameters one after the other to their estimates, as described in Chib (1995) and Carlin and Chib (1995). There too, label switching is a problem, which is why other methods such as reversible jump or birth-and-death processes are more and more often used, as in Stephens (2000).
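
To make the model-choice formula concrete, the sketch below turns estimated log marginal likelihoods log [x | M_k] (obtained elsewhere, for instance by the method of Chib (1995)) into posterior model probabilities and Bayes factors under a uniform model prior. The estimation of the log marginal likelihoods themselves is outside the scope of this sketch.

```python
import numpy as np

def posterior_model_probabilities(log_marglik, log_prior=None):
    """[M_k | x] proportional to [M_k] [x | M_k], computed from log marginal
    likelihoods estimated separately (e.g. by Chib's method)."""
    log_marglik = np.asarray(log_marglik, dtype=float)
    if log_prior is None:                            # uniform prior [M_k] = 1/K
        log_prior = np.full_like(log_marglik, -np.log(len(log_marglik)))
    log_post = log_prior + log_marglik
    log_post -= log_post.max()                       # stabilise the exponentials
    post = np.exp(log_post)
    return post / post.sum()

def log_bayes_factor(log_marglik, k2, k1):
    """log B_{k2 k1} = log [x | M_k2] - log [x | M_k1]."""
    return log_marglik[k2] - log_marglik[k1]
```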

7. Possible Improvements, Conclusion

We have tried here to illustrate some basic techniques of MCMC Bayesian analysis. They are quite straightforward to implement and easy to use in a first approach to the problem. Nevertheless, they suffer from specific drawbacks, such as label switching, which hinders convergence diagnostics and even the direct estimation of the parameters of interest. That is why many more efficient developments have been achieved in recent years. However, this kind of algorithm still deals better with this real-case dataset than the other algorithms we have tried on it (such as the EM algorithm).

Many points in our work remain to be improved: the normality hypothesis should be reconsidered (in favour of log-normality, perhaps), as should the choice of the prior, the sensitivity analysis, and the convergence diagnostics. This illustrates the difficulties that practitioners may face before benefitting from the powerful tools that MCMC Bayesian methods are.

The authors would like to thank Philippe Girard (Nestlé Research Center, Lausanne, Switzerland), Eric Parent (ENGREF, Paris, France), and Luc Perrault (Hydro-Québec, Varennes, Québec, Canada) for their help and support.

Bibliography

[1] B.P. Carlin and S. Chib (1995). Bayesian model choice via Markov chain Monte Carlo methods. J. R. Statist. Soc. B, 57.
[2] G. Celeux, M. Hurn, and C. Robert (2000). Computational and inferential difficulties with mixture posterior distributions. J. Am. Stat. Assoc., 95.
[3] S. Chib (1995). Marginal likelihood from the Gibbs output. J. Am. Stat. Assoc., 90.
[4] J. Diebolt and C.P. Robert (1990). Bayesian estimation of finite mixture distributions, Part I: theoretical aspects. Technical Report 110, LSTA, Université Paris VI.
[5] J.J. Droesbeke, J. Fine, and G. Saporta, editors (2002). Méthodes Bayésiennes en statistique. Technip.
[6] A.E. Gelfand, S.E. Hills, A. Racine-Poon, and A.F.M. Smith (1990). Illustration of Bayesian inference in normal data models using Gibbs sampling. J. Am. Stat. Assoc., 85.
[7] A. Gelman and D.B. Rubin (1992). Inference from iterative simulation using multiple sequences (with discussion). Statistical Science, 7.
[8] A. Gelman and D.B. Rubin (1992). A single series from the Gibbs sampler provides a false sense of security (with discussion). In J.M. Bernardo, J.O. Berger, A.P. Dawid, and A.F.M. Smith, editors, Bayesian Statistics 4. Oxford University Press.
[9] A. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin (2003). Bayesian Data Analysis. Chapman and Hall.
[10] R.E. Kass and A.E. Raftery (1995). Bayes factors. J. Am. Stat. Assoc., 90.
[11] J.M. Marin, K. Mengersen, and C.P. Robert (to appear). Bayesian modelling and inference on mixtures of distributions. In D. Dey and C.R. Rao, editors, Handbook of Statistics 25. Elsevier Sciences.
[12] S. Richardson and P.J. Green (1997). On Bayesian analysis of mixtures with an unknown number of components. J. R. Statist. Soc. B, 59(4).
[13] M. Stephens (1997). Bayesian Methods for Mixtures of Normal Distributions. PhD thesis, Magdalen College, Oxford.
[14] M. Stephens (2000). Bayesian analysis of mixtures with an unknown number of components: an alternative to reversible jump methods. Annals of Statistics.
[15] M. Stephens (2000). Dealing with label-switching in mixture models. J. R. Stat. Soc. B, 62.
