
MH I

- the Metropolis-Hastings (MH) algorithm is the most popular method of getting dependent samples from a probability distribution
- a lot of Bayesian methods rely on the use of the MH algorithm and its famous cousin, the Gibbs sampler

MH II

- goal is to sample from the target density: $\pi(x) \propto \exp[-H(x)/\beta]$
- the above form is known as the Boltzmann form of a distribution
- $H(x)$ is called the fitness or energy function
- $\beta$ is called the temperature
- EXAMPLE: target density for $X \sim \mathrm{Normal}_1(\mu, \sigma^2)$:
  $$\pi(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2\sigma^2}(x-\mu)^2\right] \propto \exp\left[-\frac{1}{2\sigma^2}(x-\mu)^2\right]$$
  here $H(x) = \frac{1}{2\sigma^2}(x-\mu)^2$ and $\beta = 1$
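To make the Boltzmann-form correspondence concrete, here is a minimal Python sketch of the energy function $H(x)$ and the resulting unnormalized log target for the Normal example; the values of $\mu$ and $\sigma$ are illustrative assumptions, not from the slides:

```python
# Illustrative values for the Normal(mu, sigma^2) example; not from the slides.
mu, sigma = 0.0, 1.0

def energy(x):
    """Energy / fitness function H(x) = (x - mu)^2 / (2 sigma^2)."""
    return (x - mu) ** 2 / (2.0 * sigma ** 2)

def log_target(x, beta=1.0):
    """log pi(x) up to an additive constant, in Boltzmann form: -H(x) / beta."""
    return -energy(x) / beta
```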

MH III

- going to use a proposal distribution (pdf) to generate guesses or proposals for the draws from the target: $T(x, \cdot)$
- EXAMPLE: given $x$, Normal proposal pdf for $Y \sim \mathrm{Normal}_1(x, \tau^2)$:
  $$T(x, \cdot) \equiv \mathrm{Normal}_1(x, \tau^2;\, \cdot)$$

MH IV

- going to have to evaluate the proposal density (pdf): $T(x, y)$
- make sure $T(y, x) > 0$ whenever $T(x, y) > 0$, otherwise the sampler will not work
- also, don't just assume $T(y, x) = T(x, y)$; this is a very common trap for beginners
- EXAMPLE: Normal proposal density:
  $$T(x, y) \equiv \mathrm{Normal}_1(x, \tau^2;\, y) \propto \exp\left[-\frac{1}{2\tau^2}(y-x)^2\right]$$
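To illustrate the warning, here is a small sketch comparing the symmetric Normal random-walk proposal above with an asymmetric one (a log-normal random walk on a positive parameter); the log-normal example and the values of $\tau$, $x$, $y$ are illustrative assumptions, not from the slides:

```python
import numpy as np

def log_T_normal(x, y, tau=1.0):
    """log T(x, y) for Y | x ~ Normal(x, tau^2), up to a constant; symmetric in (x, y)."""
    return -0.5 * (y - x) ** 2 / tau ** 2

def log_T_lognormal(x, y, tau=0.5):
    """log T(x, y) for Y | x ~ LogNormal(log x, tau^2); the -log(y) Jacobian breaks symmetry."""
    return -0.5 * (np.log(y) - np.log(x)) ** 2 / tau ** 2 - np.log(y)

x, y = 1.0, 2.0
print(log_T_normal(x, y) - log_T_normal(y, x))        # 0.0: T(x, y) = T(y, x)
print(log_T_lognormal(x, y) - log_T_lognormal(y, x))  # nonzero: T(x, y) != T(y, x)
```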

MH V

- acceptance probability used in the Metropolis-Hastings algorithm:
  $$\alpha(x, y) = \min\left\{1,\, \frac{\pi(y)\, T(y, x)}{\pi(x)\, T(x, y)}\right\}$$
- if the proposal is symmetric, i.e., if $T(x, y) = T(y, x)$, then we have:
  $$\alpha(x, y) = \min\left\{1,\, \frac{\pi(y)}{\pi(x)}\right\}$$
  if $\pi(y) \ge \pi(x)$ then $\alpha(x, y) = 1$; if $\pi(y) < \pi(x)$ then $\alpha(x, y) < 1$
- aside: if the proposal is symmetric then the algorithm is called the Metropolis algorithm
- note: since we only deal with ratios above, it is enough to know $\pi(\cdot)$ and $T(\cdot, \cdot)$ up to a proportionality constant
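As a sketch of how this is typically coded (in log scale, anticipating the aside on slide MH VIII), the log acceptance probability can be computed from user-supplied unnormalized log densities; the function names and signature are hypothetical:

```python
def log_accept_prob(x, y, log_pi, log_T):
    """log alpha(x, y) = min(0, log pi(y) + log T(y, x) - log pi(x) - log T(x, y)).

    Only ratios appear, so log_pi and log_T need only be known up to additive constants."""
    return min(0.0, log_pi(y) + log_T(y, x) - log_pi(x) - log_T(x, y))
```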

MH VI

the MH algorithm (for N-many iterations):

1. initialize: set $t = 0$ and get a starting value $x^{(t)}$
2. propose: generate $y$ from $T(x^{(t)}, \cdot)$
3. eval: evaluate the acceptance probability $\alpha(x^{(t)}, y)$
4. move: generate $u$ from $\mathrm{Uniform}(0, 1)$ and set
   $$x^{(t+1)} = \begin{cases} y & \text{if } u \le \alpha(x^{(t)}, y) \\ x^{(t)} & \text{otherwise} \end{cases}$$
5. if $t \ge N$ stop, otherwise set $t = t + 1$ and go to step 2

aside: it's enough to compute $\alpha(\cdot, \cdot)$ without the min part because $u \le 1$ (why?)
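A minimal generic implementation of steps 1-5, working in log scale as recommended later in these notes; the function names and signatures are assumptions for illustration, not code from the slides:

```python
import numpy as np

def metropolis_hastings(log_pi, propose, log_T, x0, n_iter, seed=None):
    """Generic MH sampler sketch following steps 1-5 above.

    log_pi(x)       : log target density, up to an additive constant
    propose(x, rng) : draws y from the proposal T(x, .)
    log_T(x, y)     : log proposal density, up to an additive constant
    """
    rng = np.random.default_rng(seed)
    x = x0
    chain = [x]
    for _ in range(n_iter):
        y = propose(x, rng)                                  # step 2: propose
        log_alpha = (log_pi(y) + log_T(y, x)                 # step 3: eval (log scale;
                     - log_pi(x) - log_T(x, y))              #  taking min with 0 is optional)
        if np.log(rng.uniform()) <= log_alpha:               # step 4: move
            x = y
        chain.append(x)
    return np.array(chain)
```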

MH VII

- process the samples: $\{x^{(t)} : t = 0, 1, \ldots, N\}$
- discard some initial samples, say the first $N/10$, as the burn-in period; for notational ease, reindex the rest as $\{x^{(t)} : t = 1, 2, \ldots, M\}$
- use the rest for inference
- EXAMPLE: to estimate the mean of the target density use the estimator $\frac{1}{M}\sum_{t=1}^{M} x^{(t)}$
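A small post-processing sketch; the `chain` array below is only a stand-in for the output $x^{(0)}, \ldots, x^{(N)}$ of an actual MH run:

```python
import numpy as np

chain = np.random.default_rng(0).normal(size=10_001)   # stand-in for x^(0), ..., x^(N)
N = len(chain) - 1
burn_in = N // 10                                      # discard roughly N/10 draws
kept = chain[burn_in + 1:]                             # reindexed as x^(1), ..., x^(M)
print(kept.mean())                                     # estimator (1/M) sum_t x^(t)
```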

MH VIII

a simple example:

- set up:
  target: $X \sim \mathrm{Normal}_1(\mu, \sigma^2)$
  proposal: $Y \sim \mathrm{Normal}_1(x, \tau^2)$
- so we have:
  $$\pi(x) \propto \exp\left[-\frac{1}{2\sigma^2}(x-\mu)^2\right]$$
  $$T(x, \cdot) \equiv \mathrm{Normal}_1(x, \tau^2;\, \cdot), \qquad T(x, y) \propto \exp\left[-\frac{1}{2\tau^2}(y-x)^2\right], \text{ note it's symmetric!}$$
  $$\alpha(x, y) = \min\left\{1,\, \frac{\pi(y)}{\pi(x)}\right\} = \min\left\{1,\, \exp\left[-\frac{1}{2\sigma^2}(y-\mu)^2 + \frac{1}{2\sigma^2}(x-\mu)^2\right]\right\}$$
- important aside: all the above expressions are nice and fine, but while implementing, do all your computations in log-scale
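A minimal instantiation of this example, reusing the `metropolis_hastings` sketch given after slide MH VI and working entirely in log scale; the values of $\mu$, $\sigma$, $\tau$ and the chain length are illustrative assumptions:

```python
import numpy as np

mu, sigma, tau = 2.0, 1.0, 0.5                                # illustrative values

log_pi = lambda x: -0.5 * (x - mu) ** 2 / sigma ** 2          # log target, up to a constant
log_T = lambda x, y: -0.5 * (y - x) ** 2 / tau ** 2           # symmetric: cancels in alpha
propose = lambda x, rng: rng.normal(x, tau)

chain = metropolis_hastings(log_pi, propose, log_T, x0=0.0, n_iter=20_000, seed=1)
print(chain[2_000:].mean())                                   # should be close to mu
```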

MH IX

some general guidelines while implementing a typical MH sampler:

- tweak your proposal $T(x, y)$ so that the average acceptance rate is (see [2, Gelman et al.]):
  in $[40\%, 50\%]$ if $x, y \in \mathbb{R}^1$
  in $[20\%, 30\%]$ if $x, y \in \mathbb{R}^d$, $d > 1$
- too high (above 70%) or too low (below 10%) acceptance rates are a sign of a bad choice of $T(x, y)$
- start your sampler from dispersed starting values and check that you converge around the same region of the sample space
- propose to move very highly correlated variables together
- do not use very high-dimensional proposals; such proposals are rarely accepted
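One simple way to monitor the first guideline is to track the empirical acceptance rate; the sketch below estimates it for a one-dimensional chain by assuming a repeated value can only come from a rejection (which holds for continuous proposals):

```python
import numpy as np

def acceptance_rate(chain):
    """Fraction of iterations in which the chain moved, i.e. the proposal was accepted."""
    chain = np.asarray(chain, dtype=float)
    return float(np.mean(chain[1:] != chain[:-1]))
```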

EM I

- goal is to find the Maximum Likelihood Estimator (MLE) or the Maximum A Posteriori (MAP) estimator
- the Expectation-Maximization (EM) algorithm is the most popular method for the above
- the above maximization problem involves two steps:
  the Expectation step, or the E-step
  the Maximization step, or the M-step

EM II

set up:

- data: $y := (y_1, y_2, \ldots, y_n)$
- parameter of interest: $\theta$
- nuisance parameter or missing data: $z$

E-step:
$$Q\left(\theta \mid \theta^{(t)}\right) := \begin{cases} E_{\theta^{(t)}}\left[\log p(\theta, z \mid y)\right] = \int \log p(\theta, z \mid y)\, p(z \mid \theta^{(t)}, y)\, dz & \text{for MAP} \\ E_{\theta^{(t)}}\left[\log p(z, y \mid \theta)\right] = \int \log p(z, y \mid \theta)\, p(z \mid \theta^{(t)}, y)\, dz & \text{for MLE} \end{cases}$$

M-step:
$$\theta^{(t+1)} := \arg\max_{\theta} Q\left(\theta \mid \theta^{(t)}\right)$$

EM III

the EM algorithm with $\epsilon$-close stopping:

1. initialize: set $t = 0$ and get a starting value $\theta^{(t)}$
2. E-step: get $Q\left(\theta \mid \theta^{(t)}\right)$
3. M-step: get $\theta^{(t+1)} = \arg\max_{\theta} Q\left(\theta \mid \theta^{(t)}\right)$
4. if $\|\theta^{(t+1)} - \theta^{(t)}\| \le \epsilon$ stop, otherwise set $t = t + 1$ and go to step 2

in some easy cases you can combine the E-step and the M-step, if you have a closed-form expression for $Q\left(\theta \mid \theta^{(t)}\right)$ and (hence) for $\arg\max_{\theta} Q\left(\theta \mid \theta^{(t)}\right)$
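A generic EM driver with the $\epsilon$-close stopping rule can be sketched as below; the callable names `e_step` and `m_step` and the cap on iterations are assumptions for illustration:

```python
import numpy as np

def em(e_step, m_step, theta0, eps=1e-8, max_iter=1000):
    """Generic EM sketch: alternate E- and M-steps until theta stops moving.

    e_step(theta_t) : returns whatever summary of Q(. | theta_t) the M-step needs
    m_step(q)       : returns argmax_theta Q(theta | theta_t)
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        theta_new = np.asarray(m_step(e_step(theta)), dtype=float)
        if np.max(np.abs(theta_new - theta)) <= eps:   # ||theta^(t+1) - theta^(t)|| <= eps
            return theta_new
        theta = theta_new
    return theta
```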

EM IV

EXAMPLE: we want the MAP estimator of $\mu$ from (with $\sigma^2$ unknown):
$$y_i \sim \mathrm{Normal}_1(\mu, \sigma^2), \quad i = 1, 2, \ldots, n$$
$$\mu \sim \mathrm{Normal}_1(\mu_0, \tau_0^2)$$
$$p(\log \sigma) \propto 1$$

so we have:

- data: $y := (y_1, y_2, \ldots, y_n)$
- parameter of interest: $\theta = \mu$
- nuisance parameter: $z = \sigma^2$

EM V

we observe:
$$\log p(\theta, z \mid y) = \log p(\mu, \sigma^2 \mid y) = \text{const} - \frac{1}{2\tau_0^2}(\mu - \mu_0)^2 - (n + 1)\log \sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu)^2$$

we also note:
$$p(z \mid \theta^{(t)}, y) = p(\sigma^2 \mid \mu^{(t)}, y) \sim \text{Inv-}\chi^2\left(n,\, \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \mu^{(t)}\right)^2\right)$$

EM VI

E-step: only compute the expectations of the terms which involve $\theta$, because the other terms are not useful in the M-step

so we note:
$$Q\left(\theta \mid \theta^{(t)}\right) = Q\left(\mu \mid \mu^{(t)}\right) = \text{const} - \frac{1}{2\tau_0^2}(\mu - \mu_0)^2 - E_{\mu^{(t)}}\left[\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i - \mu)^2\right]$$
$$= \text{const} - \frac{1}{2\tau_0^2}(\mu - \mu_0)^2 - \frac{1}{2}\left\{\frac{n}{\sum_{i=1}^{n}\left(y_i - \mu^{(t)}\right)^2}\right\}\sum_{i=1}^{n}(y_i - \mu)^2$$

we are ignoring the following term for the reason mentioned above: $-(n + 1)\,E_{\mu^{(t)}}[\log \sigma]$

EM VII

M-step: note that $Q\left(\theta \mid \theta^{(t)}\right) = Q\left(\mu \mid \mu^{(t)}\right)$ is a quadratic in $\mu$ and hence easy to maximize

taking derivatives once (and then twice, to check it is a maximum) one can show:
$$\theta^{(t+1)} := \arg\max_{\theta} Q\left(\theta \mid \theta^{(t)}\right) = \arg\max_{\mu} Q\left(\mu \mid \mu^{(t)}\right) = \frac{\frac{1}{\tau_0^2}\,\mu_0 + \frac{n}{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \mu^{(t)}\right)^2}\,\bar{y}}{\frac{1}{\tau_0^2} + \frac{n}{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \mu^{(t)}\right)^2}} = \mu^{(t+1)}$$
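Since the E-step and M-step collapse into the single closed-form update above, the whole EM iteration for this example fits in a few lines; the stopping tolerance, starting value, and toy data below are illustrative assumptions:

```python
import numpy as np

def em_map_mu(y, mu0, tau0_sq, mu_init=0.0, eps=1e-10, max_iter=1000):
    """EM for the MAP estimate of mu in the Normal example of slides EM IV-VII."""
    y = np.asarray(y, dtype=float)
    n, ybar = len(y), y.mean()
    mu = mu_init
    for _ in range(max_iter):
        sigma2_hat = np.mean((y - mu) ** 2)                    # (1/n) sum (y_i - mu^(t))^2
        prec_prior, prec_data = 1.0 / tau0_sq, n / sigma2_hat  # the two weights above
        mu_new = (prec_prior * mu0 + prec_data * ybar) / (prec_prior + prec_data)
        if abs(mu_new - mu) <= eps:
            return mu_new
        mu = mu_new
    return mu

# toy usage with made-up data and prior settings
rng = np.random.default_rng(0)
print(em_map_mu(rng.normal(3.0, 2.0, size=200), mu0=0.0, tau0_sq=10.0))
```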

EM VIII

EXAMPLE: we want the MLE for the mixture proportions $(\pi_1, \pi_2, \ldots, \pi_k)$:

- we have $k$-many known densities $f_j(\cdot)$, $j = 1, 2, \ldots, k$
- there are $k$-many unknown proportions $\pi_j$, $j = 1, 2, \ldots, k$, with $\sum_{j=1}^{k}\pi_j = 1$
- $y_i \sim \sum_{j=1}^{k}\pi_j f_j(\cdot)$, $i = 1, 2, \ldots, n$

so we have:

- data: $y := (y_1, y_2, \ldots, y_n)$
- parameter of interest: $\theta := (\pi_1, \pi_2, \ldots, \pi_k)$
- introduce missing data: $z := (z_1, z_2, \ldots, z_n)$ such that $[z_i \mid \theta] \sim \mathrm{Multinomial}(1, \theta)$, $i = 1, 2, \ldots, n$
- note: here we need to cook up the missing data in such a way that integrating / summing it out gives us back our original model, see the next slide

EM IX

now we can rewrite our model as:
$$[y_i \mid z_i = e_j, \theta] \sim f_j(\cdot), \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, k$$
$$p(z_i = e_j \mid \theta) = \pi_j, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, k$$

here $e_j$ is the $j$-th canonical vector for $j = 1, 2, \ldots, k$ (e.g. $e_1 = (1, 0, 0, \ldots, 0)$, etc.)

check that: $\sum_{z} p(y, z \mid \theta) = p(y \mid \theta)$

so we have:
$$\log p(z, y \mid \theta) = \sum_{i=1}^{n}\sum_{j=1}^{k} z_{ij}\log\{\pi_j f_j(y_i)\}$$
and
$$p\left(z_{ij} = 1 \mid y, \theta^{(t)}\right) = p\left(z_i = e_j \mid y, \theta^{(t)}\right) = \frac{\pi_j^{(t)} f_j(y_i)}{\sum_{j'=1}^{k}\pi_{j'}^{(t)} f_{j'}(y_i)} = a_{ij}^{(t)}, \text{ say}$$

EM X

E-step:
$$Q\left(\theta \mid \theta^{(t)}\right) = \sum_{i=1}^{n}\sum_{j=1}^{k} E_{\theta^{(t)}}(z_{ij})\log\{\pi_j f_j(y_i)\} = \sum_{i=1}^{n}\sum_{j=1}^{k} a_{ij}^{(t)}\log\{\pi_j f_j(y_i)\}$$

M-step: it's a constrained maximization problem with $\sum_{j=1}^{k}\pi_j = 1$, which gives:
$$\theta^{(t+1)} := \arg\max_{\theta} Q\left(\theta \mid \theta^{(t)}\right): \qquad \pi_j^{(t+1)} = \frac{\sum_{i=1}^{n} a_{ij}^{(t)}}{\sum_{i=1}^{n}\sum_{j=1}^{k} a_{ij}^{(t)}} = \frac{1}{n}\sum_{i=1}^{n} a_{ij}^{(t)}$$
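A compact sketch of this EM for the mixture proportions, assuming the component densities $f_j(\cdot)$ are supplied as functions; the two-component Normal toy data and the fixed number of iterations are illustrative assumptions:

```python
import numpy as np

def normal_pdf(x, m, s):
    """Normal(m, s^2) density, used here only to build toy component densities."""
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def em_mixture_proportions(y, densities, n_iter=200):
    """EM for the mixture-proportion MLE of slides EM VIII-X (known component densities)."""
    y = np.asarray(y, dtype=float)
    n, k = len(y), len(densities)
    f = np.column_stack([fj(y) for fj in densities])   # n x k matrix of f_j(y_i)
    pi = np.full(k, 1.0 / k)                           # start from equal proportions
    for _ in range(n_iter):
        a = pi * f                                     # unnormalized pi_j^(t) f_j(y_i)
        a /= a.sum(axis=1, keepdims=True)              # E-step: responsibilities a_ij^(t)
        pi = a.mean(axis=0)                            # M-step: pi_j^(t+1) = (1/n) sum_i a_ij^(t)
    return pi

# toy usage: two known Normal components, unknown mixing proportions
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])
print(em_mixture_proportions(y, [lambda x: normal_pdf(x, -2, 1),
                                 lambda x: normal_pdf(x, 3, 1)]))
```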

EM XI

- the tricky (theoretical) part of the EM algorithm is that many missing data schemes may give rise to the same model under consideration, but not all of them are helpful
- EXAMPLE: in the mixture proportions example, defining $z$ the following way is not helpful at all (although it satisfies $\sum_{z} p(y, z \mid \theta) = p(y \mid \theta)$):
  $$p(z_i = j \mid \theta) = \pi_j, \quad i = 1, 2, \ldots, n, \; j = 1, 2, \ldots, k$$
  note: here $z_i$ is of dimension 1, as opposed to $k$ as before
- finding the best missing data scheme is an art, really; check out The Art of Data Augmentation [5, van Dyk et al.]

References

[1] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B, 39:1-22, 1977.

[2] A. Gelman, G. O. Roberts, and W. R. Gilks. Efficient Metropolis jumping rules. In Bayesian Statistics 5: Proceedings of the Fifth Valencia International Meeting, 1996.

[3] A. Gelman and D. B. Rubin. Inference from iterative simulation using multiple sequences (with discussion). Statistical Science, 7:457-472, 1992.

[4] C. J. Geyer. Practical Markov chain Monte Carlo (with discussion). Statistical Science, 7:473-483, 1992.

[5] D. A. van Dyk and X.-L. Meng. The art of data augmentation (with discussion). Journal of Computational and Graphical Statistics, 10(1):1-50, 2001.
