MCMC for big data. Geir Storvik. BigInsight lunch - May

1 MCMC for big data. Geir Storvik. BigInsight lunch - May

2 Outline
- Why ordinary MCMC is not scalable
- Different approaches for making MCMC scalable
- Summary/status

3 Big data and statistics
Jordan et al. (2013), On statistics, computation and scalability: "gatherers of large-scale data are often forced to turn to ad hoc procedures that perhaps do provide algorithmic guarantees but which may provide no statistical guarantees and which in fact may have poor or even disastrous statistical properties"
Statistical "solutions":
- Better algorithms for "standard" optimal solutions
- Embarrassingly parallel methods
- Bootstrapping
- Bagging/random forests
- Divide-and-conquer methods (biglm package in R)
- Dynamic updating (Kalman filtering, particle filters)
- Sub-sampling (stochastic approximation)
- Alternative procedures that are both computationally and statistically efficient
- Bags of Little Bootstraps (Kleiner et al., 2014)
Bayesian (MCMC-based) methods: left behind!

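4 Computing expectations
Complicated expectations are needed in many statistical inference settings.
ML estimation with latent variables:
$$L(\theta) = p(y \mid \theta) = \int_z p(y \mid z; \theta)\, p(z \mid \theta)\, dz = E_{p(z \mid \theta)}[p(y \mid Z; \theta)]$$
Bayesian statistics:
$$\hat\theta = E_{p(\theta \mid y)}[\theta \mid y]$$
Markov chain Monte Carlo (Bayesian setting): simulate a Markov chain $\theta_1, \theta_2, \ldots$
Exact MCMC:
$$\theta_m \xrightarrow{D} p(\theta \mid y) \text{ as } m \to \infty, \qquad \frac{1}{M}\sum_{m=1}^{M} \theta_m \to E_{p(\theta \mid y)}[\theta \mid y] \text{ as } M \to \infty$$
Approximate MCMC:
$$\theta_m \xrightarrow{D} \tilde p(\theta \mid y) \text{ as } m \to \infty, \qquad \|\tilde p(\theta \mid y) - p(\theta \mid y)\| \le \varepsilon$$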

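5 Metropolis-Hastings
Algorithm:
1. Generate $\theta^* \sim q(\cdot \mid \theta_{i-1})$
2. Calculate
$$\alpha = \min\left\{1, \frac{\pi(\theta^* \mid y)\, q(\theta_{i-1} \mid \theta^*)}{\pi(\theta_{i-1} \mid y)\, q(\theta^* \mid \theta_{i-1})}\right\}$$
3. Put $\theta_i = \theta^*$ with probability $\alpha$; $\theta_i = \theta_{i-1}$ otherwise.
For transition density $P(\theta^* \mid \theta)$, detailed balance holds:
$$p(\theta \mid y)\, P(\theta^* \mid \theta) = p(\theta^* \mid y)\, P(\theta \mid \theta^*)$$
Calculation of $\alpha$ (independent case):
$$\frac{\pi(\theta^* \mid y)}{\pi(\theta_{i-1} \mid y)} = \frac{\pi(\theta^*)\, p(y \mid \theta^*)}{\pi(\theta_{i-1})\, p(y \mid \theta_{i-1})} \overset{\text{ind}}{=} \frac{\pi(\theta^*) \prod_{i=1}^{n} p(y_i \mid \theta^*)}{\pi(\theta_{i-1}) \prod_{i=1}^{n} p(y_i \mid \theta_{i-1})}$$
For big data: the product is too time/memory-consuming.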
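
As an illustration of the algorithm on this slide, here is a minimal sketch under assumptions that are not in the talk: a toy model $y_i \sim N(\theta, 1)$, a $N(0, 10^2)$ prior, and a symmetric Gaussian random-walk proposal, so the $q$ terms cancel in $\alpha$. The names `log_post` and `metropolis_hastings` are hypothetical. The point to notice is that every iteration evaluates a sum over all $n$ observations.

```python
import math
import random


def log_post(theta, y, prior_sd=10.0):
    """Unnormalised log posterior: log pi(theta) + sum_i log p(y_i | theta)."""
    log_prior = -0.5 * (theta / prior_sd) ** 2
    # O(n) work: this is the product over all data from the slide, on the log scale.
    log_lik = sum(-0.5 * (yi - theta) ** 2 for yi in y)
    return log_prior + log_lik


def metropolis_hastings(y, n_iter=2000, step=0.1, theta0=0.0, seed=1):
    rng = random.Random(seed)
    theta, lp = theta0, log_post(theta0, y)
    chain = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)       # theta* ~ q(. | theta_{i-1})
        lp_prop = log_post(prop, y)
        log_alpha = min(0.0, lp_prop - lp)        # symmetric q cancels in alpha
        if rng.random() < math.exp(log_alpha):    # accept with probability alpha
            theta, lp = prop, lp_prop
        chain.append(theta)
    return chain


# Toy data (assumed for illustration only).
data_rng = random.Random(0)
y = [data_rng.gauss(2.0, 1.0) for _ in range(500)]
chain = metropolis_hastings(y)
post_mean_estimate = sum(chain[500:]) / len(chain[500:])  # discard burn-in
```

With a nearly flat prior the estimate should land close to the sample mean of `y`; the cost per iteration grows linearly in the data size, which is the scalability problem the slide points to.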

6 Alternatives
- Change estimator
  - Approximate Bayesian Computation (ABC)
  - Variational Bayes
- Alternative MCMC methods (Bardenet et al., 2017)
  - Divide-and-conquer methods
  - Exact sub-sampling methods
  - Approximate sub-sampling methods
  - (Methods dynamically including more data)

7 Divide-and-conquer methods
Procedure:
1. Split the data into a large number of smaller (possibly overlapping) data sets
2. Perform inference on each smaller data set
3. Combine the results
Neiswanger et al. (2013); Scott et al. (2016); Wang and Dunson (2013); Li et al. (2017); Minsker et al. (2014)
Properties:
- Computation only on smaller data sets
- Easy to run in parallel
- Separation leads to inexact results
- Some asymptotic results are available, but in the limit simple Laplace approximations may be better and easier (?)

8 Consensus Monte Carlo (Scott et al., 2016)
Assume independent blocks $y_1, \ldots, y_S$:
$$p(\theta \mid y) \propto \prod_{s=1}^{S} p_s(\theta \mid y_s) \propto \prod_{s=1}^{S} p(y_s \mid \theta)\, p(\theta)^{1/S}$$
Simulate $\theta_{s1}, \ldots, \theta_{sG}$ from $p_s(\theta \mid y_s) \propto p(y_s \mid \theta)\, p(\theta)^{1/S}$.
Combine:
$$\theta_g = \Big(\sum_s W_s\Big)^{-1} \sum_s W_s \theta_{sg}$$
Properties:
- Exact if $p_s(\theta \mid y_s)$, $s = 1, \ldots, S$, are Gaussian (with $W_s = \mathrm{Var}_{p_s(\theta \mid y_s)}[\theta]^{-1}$)
- Approximate in general
- When a choice of model complexity is involved: how do the prior $p(\theta)^{1/S}$ and the subset of data influence complexity?
- Alternative: $p_s(\theta \mid y_s) \propto p(y_s \mid \theta)^S\, p(\theta)$
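
The combination step can be sketched as follows. This is only an illustration under assumptions not in the talk: a scalar toy model $y_i \sim N(\theta, 1)$ with a $N(0, 100)$ prior, so each sub-posterior is Gaussian and can be sampled directly instead of by MCMC, and the weights $W_s$ are the known sub-posterior precisions. The function name `consensus_combine` is hypothetical.

```python
import random


def consensus_combine(draws, weights):
    """draws[s][g] is the g-th draw from sub-posterior s; weights[s] is W_s (scalar case)."""
    total = sum(weights)
    n_draws = len(draws[0])
    return [
        sum(w * d[g] for w, d in zip(weights, draws)) / total
        for g in range(n_draws)
    ]


rng = random.Random(0)
S, n_per_block, G, prior_var = 4, 250, 4000, 100.0
blocks = [[rng.gauss(1.0, 1.0) for _ in range(n_per_block)] for _ in range(S)]

draws, weights = [], []
for y_s in blocks:
    # p(theta)^(1/S) for a N(0, prior_var) prior is proportional to N(0, S * prior_var).
    prec = len(y_s) + 1.0 / (S * prior_var)
    mean = sum(y_s) / prec
    # Direct Gaussian draws stand in for the per-block MCMC output.
    draws.append([rng.gauss(mean, prec ** -0.5) for _ in range(G)])
    weights.append(prec)  # W_s = inverse sub-posterior variance

combined = consensus_combine(draws, weights)
combined_mean = sum(combined) / G
exact_mean = sum(map(sum, blocks)) / (S * n_per_block + 1.0 / prior_var)
```

In this Gaussian toy case the combined draws have the full-data posterior mean and variance; in practice the draws would come from separate MCMC runs and $W_s$ would typically be estimated from their sample variances.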

9 Delayed acceptance (Banterle et al., 2015)
We have $\alpha(\theta, \theta^*) = \min\{1, \rho(\theta, \theta^*)\}$ where
$$\rho(\theta, \theta^*) = \frac{p(\theta^*)}{p(\theta)} \prod_{i=1}^{n} \frac{p(y_i \mid \theta^*)}{p(y_i \mid \theta)} = \frac{p(\theta^*)}{p(\theta)} \prod_{s=1}^{S} \frac{p(y_s \mid \theta^*)}{p(y_s \mid \theta)}$$
Delayed acceptance: accept with probability
$$\prod_{s=1}^{S} \min\left\{1, \left[\frac{p(\theta^*)}{p(\theta)}\right]^{1/S} \frac{p(y_s \mid \theta^*)}{p(y_s \mid \theta)}\right\}$$
- Sequential procedure, evaluating only $p(y_s \mid \theta^*) / p(y_s \mid \theta)$ at each step
- Possible gain when rejecting, not when accepting
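
A minimal sketch of the sequential accept/reject test, assuming (these are not from the talk) a toy Gaussian model, a $N(0, 10^2)$ prior, and a symmetric random-walk proposal, which matches the absence of a proposal ratio in $\rho$ above. The helper names are hypothetical. A proposal is dropped at the first block whose test fails, so the remaining blocks are never evaluated.

```python
import math
import random


def block_loglik(theta, block):
    return sum(-0.5 * (yi - theta) ** 2 for yi in block)


def log_prior(theta, sd=10.0):
    return -0.5 * (theta / sd) ** 2


def delayed_accept(theta, prop, blocks, rng):
    """Return (accepted, number of blocks evaluated) for one proposal."""
    n_blocks = len(blocks)
    log_prior_share = (log_prior(prop) - log_prior(theta)) / n_blocks
    for s, block in enumerate(blocks, start=1):
        log_rho_s = (log_prior_share
                     + block_loglik(prop, block) - block_loglik(theta, block))
        # Pass stage s with probability min(1, rho_s); otherwise reject early.
        if rng.random() >= math.exp(min(0.0, log_rho_s)):
            return False, s
    return True, n_blocks


rng = random.Random(0)
y = [rng.gauss(0.5, 1.0) for _ in range(1000)]
blocks = [y[i::10] for i in range(10)]  # S = 10 blocks of 100 observations

theta, evaluated, chain = 0.0, 0, []
n_iter = 3000
for _ in range(n_iter):
    prop = theta + rng.gauss(0.0, 0.03)
    accepted, used = delayed_accept(theta, prop, blocks, rng)
    evaluated += used
    if accepted:
        theta = prop
    chain.append(theta)

mean_blocks_per_iter = evaluated / n_iter
post_mean_estimate = sum(chain[500:]) / len(chain[500:])
```

`mean_blocks_per_iter` gives a rough measure of the saving: it equals the number of blocks only if every proposal survives all stages, which is the "gain when rejecting" noted on the slide.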

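10 Subsampling-based
Independent data:
$$\log p(y \mid \theta) = \sum_{i=1}^{n} \log p(y_i \mid \theta)$$
ML/gradient methods:
$$\hat\theta_{s+1} = \hat\theta_s + \gamma\left[\nabla \log p(\hat\theta_s) + \nabla \log p(y \mid \hat\theta_s)\right]$$
Big data:
$$\hat\theta_{s+1} = \hat\theta_s + \gamma_s\left[\nabla \log p(\hat\theta_s) + \nabla \widehat{\log p}(y \mid \hat\theta_s)\right], \qquad \widehat{\log p}(y \mid \theta) = \frac{n}{m} \sum_{j=1}^{m} \log p(y_{i_j} \mid \theta)$$
where $i_1, \ldots, i_m$ is a random subsample of $\{1, \ldots, n\}$. This uses an unbiased estimate of $\log p(y \mid \theta)$.
Stochastic gradient descent converges if
$$\sum_{s=1}^{\infty} \gamma_s = \infty, \qquad \sum_{s=1}^{\infty} \gamma_s^2 < \infty$$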
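
The unbiasedness of the subsampled log-likelihood can be checked numerically. The sketch below assumes a toy unit-variance Gaussian likelihood and subsampling without replacement; the function names are hypothetical and nothing here is taken from the talk beyond the $n/m$ scaling.

```python
import math
import random


def log_density(yi, theta):
    """log p(y_i | theta) for an assumed N(theta, 1) model."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * (yi - theta) ** 2


def full_loglik(y, theta):
    return sum(log_density(yi, theta) for yi in y)


def subsampled_loglik(y, theta, m, rng):
    idx = rng.sample(range(len(y)), m)  # i_1, ..., i_m drawn without replacement
    return len(y) / m * sum(log_density(y[i], theta) for i in idx)


rng = random.Random(0)
y = [rng.gauss(0.0, 1.0) for _ in range(2000)]
theta = 0.3

exact = full_loglik(y, theta)
estimates = [subsampled_loglik(y, theta, 100, rng) for _ in range(2000)]
avg_estimate = sum(estimates) / len(estimates)
```

Each single estimate is noisy, but the average over many subsamples settles near the exact value, which is what the stochastic gradient updates rely on.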

11 Pseudo-likelihood approach
Idea: replace $\alpha$ by
$$\hat\alpha = \min\left\{1, \frac{\hat\pi(\theta^* \mid y)\, q(\theta_{i-1} \mid \theta^*)}{\hat\pi(\theta_{i-1} \mid y)\, q(\theta^* \mid \theta_{i-1})}\right\}$$
If $E[\hat\pi(\theta \mid y)] = \pi(\theta \mid y)$ and $\hat\pi$ is positive, convergence properties are preserved! (Beaumont, 2003; Andrieu et al., 2009)
Question: how to construct $\hat\pi(\theta \mid y)$?
Problem with subsampling: $\hat p(y \mid \theta) = \exp(\widehat{\log p}(y \mid \theta))$ is a biased estimate of $p(y \mid \theta)$.
Jacob et al. (2015): without additional knowledge on $\log p(y \mid \theta)$ we cannot obtain positive, unbiased estimates $\hat p(y \mid \theta)$!
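
The bias of the exponentiated estimate follows from Jensen's inequality. As a worked step, write $\ell = \log p(y \mid \theta)$ and let $\hat\ell$ be an unbiased estimate of it; the second line additionally assumes, purely for illustration, that $\hat\ell$ is Gaussian with variance $\sigma^2$ (an assumption not made in the talk).

```latex
\begin{align*}
E\left[\exp(\hat\ell)\right] &\ge \exp\left(E[\hat\ell]\right) = \exp(\ell) = p(y \mid \theta), \\
\hat\ell \sim N(\ell, \sigma^2) \;\Rightarrow\; E\left[\exp(\hat\ell)\right] &= \exp\left(\ell + \sigma^2/2\right) = p(y \mid \theta)\, e^{\sigma^2/2}.
\end{align*}
```

Under the Gaussian assumption the likelihood is overestimated by the factor $e^{\sigma^2/2}$, which depends on $\theta$ through $\sigma^2$ and therefore does not cancel in the acceptance ratio.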

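12 Firefly MCMC (Maclaurin and Adams, 2014)
$p(\theta \mid y) \propto p(\theta) \prod_i p(y_i \mid \theta)$ can be extended to
$$p(\theta, z \mid y) \propto p(\theta) \prod_i p(y_i \mid \theta) \prod_i \left[\frac{p(y_i \mid \theta) - B_i(\theta)}{p(y_i \mid \theta)}\right]^{z_i} \left[\frac{B_i(\theta)}{p(y_i \mid \theta)}\right]^{1 - z_i}$$
where $z_i \in \{0, 1\}$ and $0 < B_i(\theta) \le p(y_i \mid \theta)$.
$p(\theta, z \mid y)$ has $p(\theta \mid y)$ as marginal.
Simulation:
$$p(\theta \mid y, z) \propto p(\theta) \prod_i \left[p(y_i \mid \theta) - B_i(\theta)\right]^{z_i} \left[B_i(\theta)\right]^{1 - z_i}$$
Only requires evaluation of $p(y_i \mid \theta)$ for $z_i = 1$!
$$p(z \mid y, \theta) = \prod_i \left[\frac{p(y_i \mid \theta) - B_i(\theta)}{p(y_i \mid \theta)}\right]^{z_i} \left[\frac{B_i(\theta)}{p(y_i \mid \theta)}\right]^{1 - z_i}$$
Simple binomial sampling.
- Main benefit if $B_i(\theta) \approx p(y_i \mid \theta)$ and $B_i(\theta)$ is simple to calculate
- Enough to resample a (small) fraction of the $z_i$'s at each iteration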
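
As a worked check of the marginal claim, summing the auxiliary factor for a single observation over $z_i \in \{0, 1\}$ gives one, so the extended target integrates back to the original posterior. The last line reads off the conditional probability that an observation is "bright" ($z_i = 1$).

```latex
\begin{align*}
\sum_{z_i \in \{0,1\}}
\left[\frac{p(y_i \mid \theta) - B_i(\theta)}{p(y_i \mid \theta)}\right]^{z_i}
\left[\frac{B_i(\theta)}{p(y_i \mid \theta)}\right]^{1 - z_i}
&= \frac{B_i(\theta)}{p(y_i \mid \theta)} + \frac{p(y_i \mid \theta) - B_i(\theta)}{p(y_i \mid \theta)} = 1, \\
\sum_{z} p(\theta, z \mid y) &\propto p(\theta) \prod_i p(y_i \mid \theta), \\
P(z_i = 1 \mid y, \theta) &= 1 - \frac{B_i(\theta)}{p(y_i \mid \theta)}.
\end{align*}
```

When the bound is tight, $P(z_i = 1 \mid y, \theta)$ is close to zero, so few likelihood terms need to be evaluated per iteration.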

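13 Stochastic Gradient Langevin Dynamics (Welling and Teh, 2001)
Stochastic optimisation (convergence towards the mode):
$$\theta_{t+1} = \theta_t + \frac{\varepsilon_t}{2}\left(\nabla \log p(\theta_t) + \frac{n}{m} \sum_{i=1}^{m} \nabla \log p(y_{ti} \mid \theta_t)\right)$$
Require $\sum_{t=1}^{\infty} \varepsilon_t = \infty$, $\sum_{t=1}^{\infty} \varepsilon_t^2 < \infty$.
Langevin dynamics (convergence towards the posterior distribution):
$$\theta_{t+1} = \theta_t + \frac{\varepsilon}{2}\left(\nabla \log p(\theta_t) + \sum_{i=1}^{n} \nabla \log p(y_i \mid \theta_t)\right) + \eta_t, \qquad \eta_t \sim N(0, \varepsilon)$$
Stochastic gradient Langevin:
$$\theta_{t+1} = \theta_t + \frac{\varepsilon_t}{2}\left(\nabla \log p(\theta_t) + \frac{n}{m} \sum_{i=1}^{m} \nabla \log p(y_{ti} \mid \theta_t)\right) + \eta_t, \qquad \eta_t \sim N(0, \varepsilon_t)$$
Require $\sum_{t=1}^{\infty} \varepsilon_t = \infty$, $\sum_{t=1}^{\infty} \varepsilon_t^2 < \infty$.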
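
A minimal sketch of the stochastic gradient Langevin update, under assumptions that are not from the talk: a toy model $y_i \sim N(\theta, 1)$ with a $N(0, 100)$ prior, minibatches drawn without replacement, and a polynomially decaying step size $\varepsilon_t = a (b + t)^{-\gamma}$ with $\gamma = 0.55$, which satisfies both step-size conditions. The function name `sgld` and all default values are hypothetical.

```python
import math
import random


def sgld(y, n_iter=3000, m=50, a=1e-3, b=1.0, gamma=0.55, prior_var=100.0, seed=0):
    rng = random.Random(seed)
    n, theta, chain = len(y), 0.0, []
    for t in range(n_iter):
        eps = a * (b + t) ** (-gamma)        # decaying step size eps_t
        batch = rng.sample(y, m)             # minibatch y_{t1}, ..., y_{tm}
        # grad log prior + (n/m) * sum of minibatch grad log-likelihood terms
        grad = -theta / prior_var + n / m * sum(yi - theta for yi in batch)
        # Half-step along the noisy gradient plus injected N(0, eps_t) noise.
        theta += 0.5 * eps * grad + rng.gauss(0.0, math.sqrt(eps))
        chain.append(theta)
    return chain


data_rng = random.Random(1)
y = [data_rng.gauss(1.0, 1.0) for _ in range(1000)]
chain = sgld(y)
sgld_mean = sum(chain[1000:]) / len(chain[1000:])
```

Each iteration touches only `m` observations. No accept/reject step is used, so the output is only approximately distributed according to the posterior, and the constants above would need tuning for any real problem.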

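14 Noisy MCMC
Simplifying notation: $\pi(\theta \mid y) = \pi(\theta)$
"Standard" MCMC:
$$\sup_{\theta_0} \|\delta_{\theta_0} P^s - \pi\|_{TV} \le C \rho^s$$
What if we use $\hat P$ instead of $P$, where $\pi \hat P \neq \pi$?
Mitrophanov (2005); Alquier et al. (2016):
$$\|\delta_{\theta_0} P^s - \delta_{\theta_0} \hat P^s\|_{TV} \le \left(\lambda + \frac{C \rho^{\lambda}}{1 - \rho}\right) \|P - \hat P\|_{TV}, \qquad \text{where } \lambda = \left\lceil \frac{\log(1/C)}{\log(\rho)} \right\rceil$$
Note: the bound is independent of $s$!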

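15 Noisy Metropolis-Hastings
- Draw $U \sim F(u \mid \theta^*)$
- Apply acceptance rate $\hat\alpha(\theta, \theta^*, u) \approx \alpha(\theta, \theta^*)$ within M-H
Properties: assume
$$E_{F(u \mid \theta^*)} \left|\hat\alpha(\theta, \theta^*, u) - \alpha(\theta, \theta^*)\right| \le \delta(\theta, \theta^*)$$
Then
$$\|\delta_{\theta_0} P^s - \delta_{\theta_0} \hat P^s\| \le \left(\lambda + \frac{C \rho^{\lambda}}{1 - \rho}\right) \sup_{\theta} \int q(\theta^* \mid \theta)\, \delta(\theta, \theta^*)\, d\theta^*$$
Examples:
- Ignoring discretisation error in Langevin dynamics
- Using pseudo-likelihoods within Gibbs random fields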

16 Summary
- Many interesting approaches
  - Some are exact but can have slow convergence
  - Some are approximate, and their performance is difficult to evaluate
- (Most success is perhaps gained through use of sequential Monte Carlo as well!)
- "Standard" approaches for accelerating MCMC are also useful:
  - Simulated tempering
  - Adaptive MCMC
  - Multiple-try MCMC
  - Rao-Blackwellisation
- More research needed!

17 References
P. Alquier, N. Friel, R. Everitt, and A. Boland. Noisy Monte Carlo: Convergence of Markov chains with approximate transition kernels. Statistics and Computing, 26(1-2):29-47, 2016.
C. Andrieu, G. O. Roberts, et al. The pseudo-marginal approach for efficient Monte Carlo computations. The Annals of Statistics, 37(2), 2009.
M. Banterle, C. Grazian, A. Lee, and C. P. Robert. Accelerating Metropolis-Hastings algorithms by delayed acceptance. arXiv preprint, 2015.
R. Bardenet, A. Doucet, and C. Holmes. On Markov chain Monte Carlo methods for tall data. The Journal of Machine Learning Research, 18(1), 2017.
M. A. Beaumont. Estimation of population growth or decline in genetically monitored populations. Genetics, 164(3), 2003.
P. E. Jacob, A. H. Thiery, et al. On nonnegative unbiased estimators. The Annals of Statistics, 43(2), 2015.
M. I. Jordan et al. On statistics, computation and scalability. Bernoulli, 19(4), 2013.
A. Kleiner, A. Talwalkar, P. Sarkar, and M. I. Jordan. A scalable bootstrap for massive data. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(4), 2014.
C. Li, S. Srivastava, and D. B. Dunson. Simple, scalable and accurate posterior interval estimation. Biometrika, 104(3), 2017.
D. Maclaurin and R. P. Adams. Firefly Monte Carlo: Exact MCMC with subsets of data. In UAI, 2014.
S. Minsker, S. Srivastava, L. Lin, and D. Dunson. Scalable and robust Bayesian inference via the median posterior. In International Conference on Machine Learning, 2014.
A. Y. Mitrophanov. Sensitivity and convergence of uniformly ergodic Markov chains. Journal of Applied Probability, 42(4), 2005.
W. Neiswanger, C. Wang, and E. Xing. Asymptotically exact, embarrassingly parallel MCMC. arXiv preprint, 2013.
S. L. Scott, A. W. Blocker, F. V. Bonassi, H. A. Chipman, E. I. George, and R. E. McCulloch. Bayes and big data: The consensus Monte Carlo algorithm. International Journal of Management Science and Engineering Management, 11(2):78-88, 2016.
X. Wang and D. B. Dunson. Parallelizing MCMC via Weierstrass sampler. arXiv preprint, 2013.
M. Welling and Y. W. Teh. Belief optimization for binary networks: A stable alternative to loopy belief propagation. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 2001.


More information

Bayes Factors, posterior predictives, short intro to RJMCMC. Thermodynamic Integration

Bayes Factors, posterior predictives, short intro to RJMCMC. Thermodynamic Integration Bayes Factors, posterior predictives, short intro to RJMCMC Thermodynamic Integration Dave Campbell 2016 Bayesian Statistical Inference P(θ Y ) P(Y θ)π(θ) Once you have posterior samples you can compute

More information

Learning Static Parameters in Stochastic Processes

Learning Static Parameters in Stochastic Processes Learning Static Parameters in Stochastic Processes Bharath Ramsundar December 14, 2012 1 Introduction Consider a Markovian stochastic process X T evolving (perhaps nonlinearly) over time variable T. We

More information

Bayesian Computations for DSGE Models

Bayesian Computations for DSGE Models Bayesian Computations for DSGE Models Frank Schorfheide University of Pennsylvania, PIER, CEPR, and NBER October 23, 2017 This Lecture is Based on Bayesian Estimation of DSGE Models Edward P. Herbst &

More information

Lecture 2: Statistical Decision Theory (Part I)

Lecture 2: Statistical Decision Theory (Part I) Lecture 2: Statistical Decision Theory (Part I) Hao Helen Zhang Hao Helen Zhang Lecture 2: Statistical Decision Theory (Part I) 1 / 35 Outline of This Note Part I: Statistics Decision Theory (from Statistical

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Bayesian Model Comparison Zoubin Ghahramani zoubin@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept Computer Science University College

More information

Graphical Models for Query-driven Analysis of Multimodal Data

Graphical Models for Query-driven Analysis of Multimodal Data Graphical Models for Query-driven Analysis of Multimodal Data John Fisher Sensing, Learning, & Inference Group Computer Science & Artificial Intelligence Laboratory Massachusetts Institute of Technology

More information

Statistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling

Statistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling 1 / 27 Statistical Machine Learning Lecture 8: Markov Chain Monte Carlo Sampling Melih Kandemir Özyeğin University, İstanbul, Turkey 2 / 27 Monte Carlo Integration The big question : Evaluate E p(z) [f(z)]

More information

Bayesian Calibration of Simulators with Structured Discretization Uncertainty

Bayesian Calibration of Simulators with Structured Discretization Uncertainty Bayesian Calibration of Simulators with Structured Discretization Uncertainty Oksana A. Chkrebtii Department of Statistics, The Ohio State University Joint work with Matthew T. Pratola (Statistics, The

More information

Bayesian inference for multivariate extreme value distributions

Bayesian inference for multivariate extreme value distributions Bayesian inference for multivariate extreme value distributions Sebastian Engelke Clément Dombry, Marco Oesting Toronto, Fields Institute, May 4th, 2016 Main motivation For a parametric model Z F θ of

More information

Computer Practical: Metropolis-Hastings-based MCMC

Computer Practical: Metropolis-Hastings-based MCMC Computer Practical: Metropolis-Hastings-based MCMC Andrea Arnold and Franz Hamilton North Carolina State University July 30, 2016 A. Arnold / F. Hamilton (NCSU) MH-based MCMC July 30, 2016 1 / 19 Markov

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

Riemann Manifold Methods in Bayesian Statistics

Riemann Manifold Methods in Bayesian Statistics Ricardo Ehlers ehlers@icmc.usp.br Applied Maths and Stats University of São Paulo, Brazil Working Group in Statistical Learning University College Dublin September 2015 Bayesian inference is based on Bayes

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

17 : Markov Chain Monte Carlo

17 : Markov Chain Monte Carlo 10-708: Probabilistic Graphical Models, Spring 2015 17 : Markov Chain Monte Carlo Lecturer: Eric P. Xing Scribes: Heran Lin, Bin Deng, Yun Huang 1 Review of Monte Carlo Methods 1.1 Overview Monte Carlo

More information

Patterns of Scalable Bayesian Inference

Patterns of Scalable Bayesian Inference Foundations and Trends R in Machine Learning Vol. 9, No. 1-2 (2016) 1 129 c 2016 E. Angelino, M. J. Johnson, and R. P. Adams DOI: 10.1561/2200000052 Patterns of Scalable Bayesian Inference Elaine Angelino

More information

ST 740: Markov Chain Monte Carlo

ST 740: Markov Chain Monte Carlo ST 740: Markov Chain Monte Carlo Alyson Wilson Department of Statistics North Carolina State University October 14, 2012 A. Wilson (NCSU Stsatistics) MCMC October 14, 2012 1 / 20 Convergence Diagnostics:

More information

Streaming Variational Bayes

Streaming Variational Bayes Streaming Variational Bayes Tamara Broderick, Nicholas Boyd, Andre Wibisono, Ashia C. Wilson, Michael I. Jordan UC Berkeley Discussion led by Miao Liu September 13, 2013 Introduction The SDA-Bayes Framework

More information

MARKOV CHAIN MONTE CARLO

MARKOV CHAIN MONTE CARLO MARKOV CHAIN MONTE CARLO RYAN WANG Abstract. This paper gives a brief introduction to Markov Chain Monte Carlo methods, which offer a general framework for calculating difficult integrals. We start with

More information

Introduction to Probabilistic Machine Learning

Introduction to Probabilistic Machine Learning Introduction to Probabilistic Machine Learning Piyush Rai Dept. of CSE, IIT Kanpur (Mini-course 1) Nov 03, 2015 Piyush Rai (IIT Kanpur) Introduction to Probabilistic Machine Learning 1 Machine Learning

More information

Bayesian Classification and Regression Trees

Bayesian Classification and Regression Trees Bayesian Classification and Regression Trees James Cussens York Centre for Complex Systems Analysis & Dept of Computer Science University of York, UK 1 Outline Problems for Lessons from Bayesian phylogeny

More information

Statistical Tools and Techniques for Solar Astronomers

Statistical Tools and Techniques for Solar Astronomers Statistical Tools and Techniques for Solar Astronomers Alexander W Blocker Nathan Stein SolarStat 2012 Outline Outline 1 Introduction & Objectives 2 Statistical issues with astronomical data 3 Example:

More information

On the flexibility of Metropolis-Hastings acceptance probabilities in auxiliary variable proposal generation

On the flexibility of Metropolis-Hastings acceptance probabilities in auxiliary variable proposal generation On the flexibility of Metropolis-Hastings acceptance probabilities in auxiliary variable proposal generation Geir Storvik Department of Mathematics and (sfi) 2 Statistics for Innovation, University of

More information

The Origin of Deep Learning. Lili Mou Jan, 2015

The Origin of Deep Learning. Lili Mou Jan, 2015 The Origin of Deep Learning Lili Mou Jan, 2015 Acknowledgment Most of the materials come from G. E. Hinton s online course. Outline Introduction Preliminary Boltzmann Machines and RBMs Deep Belief Nets

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond January 18, 2017 Contents 1 Batch and Recursive Estimation 2 Towards Bayesian Filtering 3 Kalman Filter and Bayesian Filtering and Smoothing

More information

Sequential Monte Carlo Methods (for DSGE Models)

Sequential Monte Carlo Methods (for DSGE Models) Sequential Monte Carlo Methods (for DSGE Models) Frank Schorfheide University of Pennsylvania, PIER, CEPR, and NBER October 23, 2017 Some References These lectures use material from our joint work: Tempered

More information

COMP90051 Statistical Machine Learning

COMP90051 Statistical Machine Learning COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Trevor Cohn 2. Statistical Schools Adapted from slides by Ben Rubinstein Statistical Schools of Thought Remainder of lecture is to provide

More information

arxiv: v5 [stat.co] 10 Apr 2018

arxiv: v5 [stat.co] 10 Apr 2018 THE BLOCK-POISSON ESTIMATOR FOR OPTIMALLY TUNED EXACT SUBSAMPLING MCMC MATIAS QUIROZ 1,2, MINH-NGOC TRAN 3, MATTIAS VILLANI 4, ROBERT KOHN 1 AND KHUE-DUNG DANG 1 arxiv:1603.08232v5 [stat.co] 10 Apr 2018

More information