Monte Carlo Dynamically Weighted Importance Sampling for Spatial Models with Intractable Normalizing Constants


1 Monte Carlo Dynamically Weighted Importance Sampling for Spatial Models with Intractable Normalizing Constants
Faming Liang, Texas A&M University
Sooyoung Cheon, Korea University

2 Spatial Model Introduction
Spatial models, e.g., the autologistic model, the Potts model, and the autonormal model, have been used to model many scientific problems:
- Image analysis (Hurn et al., 2003)
- Disease mapping (Green and Richardson, 2002)
- Genetic analysis (Francois et al., 2006)
A major problem with these models is that the normalizing constant is intractable!

3 Spatial Model Introduction
The Problem
Suppose we have data $X$ generated from a statistical model with the likelihood function
$$f(x \mid \theta) = \frac{p(x, \theta)}{Z(\theta)}, \quad x \in \mathcal{X},\ \theta \in \Theta, \tag{1}$$
where $\theta$ is the parameter and $Z(\theta)$ is the normalizing constant, which depends on $\theta$ and is not available in closed form. Let $\pi(\theta)$ denote the prior density of $\theta$. The posterior distribution of $\theta$ given $X = x$ is then given by
$$\pi(\theta \mid x) \propto \frac{1}{Z(\theta)}\, p(x, \theta)\, \pi(\theta). \tag{2}$$

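4 Spatial Model Introduction
Difficulty: The Metropolis-Hastings algorithm cannot be directly applied to simulate from $\pi(\theta \mid x)$, because the acceptance probability involves the unknown ratio $Z(\theta)/Z(\theta')$, where $\theta'$ denotes the proposed value. With $T(\theta' \mid \theta)$ denoting the density of proposing $\theta'$ from $\theta$, the Metropolis-Hastings ratio is given by
$$r = \frac{Z(\theta)}{Z(\theta')} \cdot \frac{p(x, \theta')\, \pi(\theta')}{p(x, \theta)\, \pi(\theta)} \cdot \frac{T(\theta \mid \theta')}{T(\theta' \mid \theta)}.$$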

5 Spatial Model Introduction
Existing approaches to the problem:
- Likelihood approximation-based methods:
  - Pseudo-likelihood (Besag, 1974)
  - MCMLE (Geyer and Thompson, 1992)
  - Stochastic approximation Monte Carlo (Liang, 2007; Liang et al., 2007)
- Auxiliary variable MCMC methods:
  - Møller et al.'s algorithm (2006)
  - Exchange algorithm (Murray et al., 2006)
  - Double MH algorithm (Liang, 2009)

6 Spatial Model Introduction
Algorithm Summary: In MCDWIS, the state space of the Markov chain is augmented to a population, a collection of weighted samples $(\boldsymbol{\theta}, \boldsymbol{w}) = \{\theta_1, w_1; \ldots; \theta_n, w_n\}$, where $n$ is called the population size and $(\theta_i, w_i)$ is called an individual state of the population. Given the current population $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$, an iteration of MCDWIS involves two steps:
1. Monte Carlo dynamic weighting (MCDW): Update each individual state of the current population by an MCDW transition.
2. Population control: Split or replicate the individual states with large weights and discard the individual states with small weights.
MCDWIS removes the need for perfect sampling, and thus can be applied to many statistical models for which perfect sampling is unavailable or very expensive.

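7 Spatial Model Introduction
Figure 1: A diagram of the MCDWIS algorithm. [Diagram: $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$ → MCDW → $(\boldsymbol{\theta}'_t, \boldsymbol{w}'_t)$ → population control → $(\boldsymbol{\theta}_{t+1}, \boldsymbol{w}_{t+1})$. In the population control stage, states are enriched relative to $W_{\text{up},t}$ and either survive or are pruned relative to $W_{\text{low},t}$, with $N_{t+1} \in [N_{\min}, N_{\max}]$.]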

8 Dynamically Weighted Importance Sampling Theory
Let $g_t(\theta, w)$ denote the joint density of $(\theta, w)$, an individual state of $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$, and let $\psi(\theta)$ denote the target distribution. Dynamically weighted importance sampling differs from conventional importance sampling in that, for any given $\theta$, the weight $w$ is a random variable rather than a constant defined as the ratio of the true and trial densities at $\theta$.
Definition 0.1 The distribution $g_t(\theta, w)$ defined on $\Theta \times (0, \infty)$ is called correctly weighted with respect to $\psi(\theta)$ if the following conditions hold:
$$\int w\, g_t(\theta, w)\, dw = c_{t\theta}\, \psi(\theta), \tag{3}$$
$$\frac{\int_A c_{t\theta}\, \psi(\theta)\, d\theta}{\int_\Theta c_{t\theta}\, \psi(\theta)\, d\theta} = \int_A \psi(\theta)\, d\theta, \tag{4}$$
where $A$ is any Borel set, $A \subseteq \Theta$.

9 Dynamically Weighted Importance Sampling Theory
Definition 0.2 If $g_t(\theta, w)$ is correctly weighted with respect to $\psi(\theta)$, and the samples $(\theta_{t,i}, w_{t,i})$ are simulated from $g_t(\theta, w)$ for $i = 1, 2, \ldots, n_t$, then $(\boldsymbol{\theta}_t, \boldsymbol{w}_t) = (\theta_{t,1}, w_{t,1}; \ldots; \theta_{t,n_t}, w_{t,n_t})$ is called a correctly weighted population with respect to $\psi(\theta)$.
Let $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$ be a correctly weighted population with respect to $\psi(\theta)$, and let $\theta'_1, \ldots, \theta'_m$ be the distinct states in $\boldsymbol{\theta}_t$. Generate a random variable/vector $\vartheta$ such that
$$P\{\vartheta = \theta'_i\} = \frac{\sum_{j=1}^{n_t} w_j\, I(\theta_j = \theta'_i)}{\sum_{j=1}^{n_t} w_j}, \quad i = 1, 2, \ldots, m, \tag{5}$$
where $I(\cdot)$ is an indicator function.
Theorem 0.1 As the population size $n_t \to \infty$, the random variable $\vartheta$ generated in (5) converges in distribution to a random variable $\theta$ which is distributed with the pdf $\psi(\theta)$.
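The selection rule in (5) can be illustrated with a minimal Python sketch. The function names `selection_probabilities` and `draw_state` are hypothetical, and states are assumed to be hashable (floats or tuples) so that duplicates can be merged; this is an illustration of the rule, not the authors' code.

```python
import random
from collections import defaultdict


def selection_probabilities(thetas, weights):
    """Return P{vartheta = theta'} for each distinct state, as in (5)."""
    mass = defaultdict(float)
    for theta, w in zip(thetas, weights):
        mass[theta] += w  # accumulate the weight of identical states
    total = sum(weights)
    return {theta: m / total for theta, m in mass.items()}


def draw_state(thetas, weights, rng):
    """Draw one state with probability proportional to its total weight."""
    probs = selection_probabilities(thetas, weights)
    states = list(probs)
    return rng.choices(states, weights=[probs[s] for s in states], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(selection_probabilities([0.1, 0.2, 0.1], [1.0, 2.0, 3.0]))
    print(draw_state([0.1, 0.2, 0.1], [1.0, 2.0, 3.0], rng))
```

Merging duplicates before sampling matters here because population control can place several identical copies of a state in the same population.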

10 Dynamically Weighted Importance Sampling Theory
Let $(\boldsymbol{\theta}_1, \boldsymbol{w}_1), \ldots, (\boldsymbol{\theta}_N, \boldsymbol{w}_N)$ be a series of correctly weighted populations generated by a DWIS algorithm with respect to $\psi(\theta)$. Then the quantity $\mu = E_\psi \rho(\theta)$, assuming it exists, can be estimated by
$$\hat{\mu} = \frac{\sum_{t=1}^{N} \sum_{i=1}^{n_t} w_{t,i}\, \rho(\theta_{t,i})}{\sum_{t=1}^{N} \sum_{i=1}^{n_t} w_{t,i}}, \tag{6}$$
which is consistent and asymptotically normally distributed.
Definition 0.3 A transition rule for a population $(\boldsymbol{\theta}, \boldsymbol{w})$ is said to be invariant with respect to the dynamic importance weights (IDIW) if the joint density of $(\boldsymbol{\theta}, \boldsymbol{w})$ remains correctly weighted whenever the initial joint density is correctly weighted.

11 MCDWIS Methodology
Monte Carlo Dynamic Weighting Sampler
1. Draw $\theta'$ from some proposal distribution $T(\theta' \mid \theta)$.
2. Simulate auxiliary samples $y_1, \ldots, y_m$ from $f(y \mid \theta')$ using an MCMC algorithm, say, the MH algorithm. Estimate the normalizing constant ratio $R_t(\theta, \theta') = Z(\theta)/Z(\theta')$ by
$$\hat{R}_t(\theta, \theta') = \frac{1}{m} \sum_{i=1}^{m} \frac{p(y_i, \theta)}{p(y_i, \theta')}, \tag{7}$$
which is also known as the importance sampling (IS) estimator of $R_t(\theta, \theta')$.
3. Calculate the Monte Carlo dynamic weighting ratio
$$r_d = r_d(\theta, \theta', w) = w\, \hat{R}_t(\theta, \theta')\, \frac{p(x, \theta')}{p(x, \theta)}\, \frac{T(\theta \mid \theta')}{T(\theta' \mid \theta)}.$$

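12 MCDWIS Methodology
4. Choose $\beta_t = \beta_t(\boldsymbol{\theta}_t, \boldsymbol{w}_t) \ge 0$ and draw $U \sim \text{unif}(0, 1)$. Update $(\theta, w)$ as $(\theta'', w'')$, given by
$$(\theta'', w'') = \begin{cases} (\theta', r_d/a), & \text{if } U \le a, \\ (\theta, w/(1 - a)), & \text{otherwise,} \end{cases}$$
where $a = r_d/(r_d + \beta_t)$; $\beta_t$ is a function of $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$, but remains constant across the individual states of the same population.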
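Steps 1 to 4 of the Monte Carlo dynamic weighting sampler can be sketched in Python on a toy model. Everything model-specific here is an assumption made for illustration: the model is a set of independent ±1 spins with $p(x, \theta) = \exp(\theta \sum_i x_i)$, for which exact sampling is trivial and stands in for the MCMC run of step 2; the proposal is a symmetric uniform random walk, so the $T$ ratio equals 1; and the prior is treated as flat, so no prior term appears. The names `log_p`, `sample_spin_sum`, `estimate_ratio` and `mcdw_step` are hypothetical.

```python
import math
import random


def log_p(x_sum, theta):
    """Log of the unnormalized toy density p(x, theta) = exp(theta * sum(x))."""
    return theta * x_sum


def sample_spin_sum(theta, n_spins, rng):
    """Exact draw of sum(x) for independent +/-1 spins (stand-in for MCMC)."""
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * theta))
    return sum(1 if rng.random() < p_plus else -1 for _ in range(n_spins))


def estimate_ratio(theta, theta_new, n_spins, m, rng):
    """IS estimate (7) of Z(theta)/Z(theta_new) from y_i ~ f(. | theta_new)."""
    total = 0.0
    for _ in range(m):
        y_sum = sample_spin_sum(theta_new, n_spins, rng)
        total += math.exp(log_p(y_sum, theta) - log_p(y_sum, theta_new))
    return total / m


def mcdw_step(theta, w, x_sum, n_spins, beta, m, step, rng):
    """One dynamic weighting transition for a single state (theta, w)."""
    # Step 1: symmetric random-walk proposal (assumption), so T ratio = 1.
    theta_new = theta + rng.uniform(-step, step)
    # Step 2: estimate the normalizing constant ratio.
    r_hat = estimate_ratio(theta, theta_new, n_spins, m, rng)
    # Step 3: dynamic weighting ratio r_d.
    r_d = w * r_hat * math.exp(log_p(x_sum, theta_new) - log_p(x_sum, theta))
    # Step 4: accept with probability a and adjust the weight either way.
    a = r_d / (r_d + beta)
    if rng.random() <= a:
        return theta_new, r_d / a
    return theta, w / (1.0 - a)


if __name__ == "__main__":
    rng = random.Random(0)
    print(mcdw_step(0.2, 1.0, x_sum=4, n_spins=10, beta=1.0, m=50, step=0.1, rng=rng))
```

With `beta = 0` the proposal is always accepted and the weight becomes `r_d`; with `beta = 1` and an initial weight of 1, the new weight equals `r_d + 1` on either branch, which shows how the choice of `beta` steers the weights down or up.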

13 MCDWIS Methodology
Theorem 0.2 The Monte Carlo dynamic weighting sampler is IDIW; that is, if the joint distribution $g_t(\theta, w)$ for $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$ is correctly weighted with respect to $\pi(\theta \mid x)$, then after one Monte Carlo dynamic weighting step, the new joint density $g_{t+1}(\theta, w)$ for $(\boldsymbol{\theta}_{t+1}, \boldsymbol{w}_{t+1})$ is also correctly weighted with respect to $\pi(\theta \mid x)$.
Remarks:
- $\hat{R}_t(\theta, \theta')$ is an unbiased estimator of $R_t(\theta, \theta')$.
- To avoid an extremely large weight caused by a nearly zero divisor, both $\Theta$ and $\mathcal{X}$ are assumed to be compact. Then there exists a constant $r_0$ such that for any pair $(\theta, \theta') \in \Theta \times \Theta$,
$$r_0 \le \hat{R}_t(\theta, \theta')\, \frac{p(x, \theta')}{p(x, \theta)}\, \frac{T(\theta \mid \theta')}{T(\theta' \mid \theta)} \le \frac{1}{r_0}. \tag{8}$$
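The unbiasedness remark can be checked in one line. The sketch below assumes a discrete sample space (as for spin models), auxiliary samples $y_i$ that are exact draws from $f(\cdot \mid \theta')$ (for MCMC draws the identity holds once the chain has reached stationarity), and $p(y, \theta') > 0$ wherever $p(y, \theta) > 0$:

$$E\big[\hat{R}_t(\theta, \theta')\big] = \frac{1}{m} \sum_{i=1}^{m} \sum_{y} \frac{p(y, \theta)}{p(y, \theta')} \cdot \frac{p(y, \theta')}{Z(\theta')} = \frac{1}{Z(\theta')} \sum_{y} p(y, \theta) = \frac{Z(\theta)}{Z(\theta')}.$$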

14 MCDWIS Methodology
A Population Control Scheme
Let $(\theta_{t,i}, w_{t,i})$ be the $i$th individual state of the population, let $n_t$ and $n'_t$ denote the current and new population sizes, let $W_{\text{low},t}$ and $W_{\text{up},t}$ denote the lower and upper weight control bounds, let $n_{\min}$ and $n_{\max}$ denote the minimum and maximum population sizes allowed by the user, and let $n_{\text{low}}$ and $n_{\text{up}}$ denote the lower and upper reference bounds of the population size.
1. (Initialization) Initialize the parameters $W_{\text{low},t}$ and $W_{\text{up},t}$ by $W_{\text{low},t} = \sum_{i=1}^{n_t} w_{t,i}/n_{\text{up}}$ and $W_{\text{up},t} = \sum_{i=1}^{n_t} w_{t,i}/n_{\text{low}}$. Set $n'_t = 0$ and $\lambda > 1$. Do steps 2-4 for $i = 1, 2, \ldots, n_t$.
2. (Pruned) If $w_{t,i} < W_{\text{low},t}$, prune the state with probability $q = 1 - w_{t,i}/W_{\text{low},t}$. If it is pruned, drop $(\theta_{t,i}, w_{t,i})$ from $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$; otherwise, update $(\theta_{t,i}, w_{t,i})$ as $(\theta_{t,i}, W_{\text{low},t})$ and set $n'_t = n'_t + 1$.

15 MCDWIS Methodology
3. (Enriched) If $w_{t,i} > W_{\text{up},t}$, set $d = [w_{t,i}/W_{\text{up},t} + 1]$ and $w'_{t,i} = w_{t,i}/d$, replace $(\theta_{t,i}, w_{t,i})$ by $d$ identical states $(\theta_{t,i}, w'_{t,i})$, and set $n'_t = n'_t + d$, where $[z]$ denotes the integer part of $z$.
4. (Unchanged) If $W_{\text{low},t} \le w_{t,i} \le W_{\text{up},t}$, keep $(\theta_{t,i}, w_{t,i})$ unchanged, and set $n'_t = n'_t + 1$.
5. (Checking) If $n'_t > n_{\max}$, set $W_{\text{low},t} \leftarrow \lambda W_{\text{low},t}$, $W_{\text{up},t} \leftarrow \lambda W_{\text{up},t}$ and $n'_t = 0$, and do steps 2-4 again for $i = 1, 2, \ldots, n_t$. If $n'_t < n_{\min}$, set $W_{\text{low},t} \leftarrow W_{\text{low},t}/\lambda$, $W_{\text{up},t} \leftarrow W_{\text{up},t}/\lambda$ and $n'_t = 0$, and do steps 2-4 again for $i = 1, 2, \ldots, n_t$. Otherwise, stop.
In this scheme, $\lambda$ is required to be greater than 1, and $n_{\text{low}}$, $n_{\text{up}}$, $n_{\min}$ and $n_{\max}$ are required to satisfy the constraint $n_{\min} < n_{\text{low}} < n_{\text{up}} < n_{\max}$. With this adaptive pruned-enriched population control scheme (APEPCS), the population size is strictly controlled to the range $[n_{\min}, n_{\max}]$, and the weights are adjusted to the range $[W_{\text{low},t}, W_{\text{up},t}]$. Therefore, the APEPCS avoids the possible overflow or extinction of a population in simulations.
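The five steps of the population control scheme can be sketched as a single Python function. The name `population_control`, the default `lam = 2.0`, and the `max_rounds` safeguard against an endless loop are assumptions added for illustration; the slides only require $\lambda > 1$ and do not cap the number of passes.

```python
import random


def population_control(thetas, weights, n_low, n_up, n_min, n_max,
                       lam=2.0, rng=None, max_rounds=100):
    """Prune, enrich, or keep each state, then check the new population size."""
    rng = rng or random.Random()
    total = sum(weights)
    # Step 1: initial weight control bounds.
    w_low, w_up = total / n_up, total / n_low
    for _ in range(max_rounds):
        new_thetas, new_weights = [], []
        for theta, w in zip(thetas, weights):
            if w < w_low:
                # Step 2: survive with probability w / w_low, at weight w_low.
                if rng.random() < w / w_low:
                    new_thetas.append(theta)
                    new_weights.append(w_low)
            elif w > w_up:
                # Step 3: split into d identical copies of weight w / d.
                d = int(w / w_up + 1)
                new_thetas.extend([theta] * d)
                new_weights.extend([w / d] * d)
            else:
                # Step 4: leave the state unchanged.
                new_thetas.append(theta)
                new_weights.append(w)
        # Step 5: rescale the bounds and redo steps 2-4 if the size is off.
        if len(new_thetas) > n_max:
            w_low, w_up = w_low * lam, w_up * lam
        elif len(new_thetas) < n_min:
            w_low, w_up = w_low / lam, w_up / lam
        else:
            return new_thetas, new_weights, w_low, w_up
    raise RuntimeError("population size not controlled within max_rounds")


if __name__ == "__main__":
    rng = random.Random(1)
    ws = [rng.expovariate(1.0) for _ in range(30)]
    out = population_control(list(range(30)), ws, 20, 50, 10, 100, rng=rng)
    print(len(out[0]), out[2], out[3])
```

Each pass restarts from the original population, mirroring the instruction to reset the new size to zero and repeat steps 2-4. Pruning preserves the expected weight of a state, since a survivor carries weight $W_{\text{low},t}$ with probability $w_{t,i}/W_{\text{low},t}$.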

16 MCDWIS Methodology
Theorem 0.3 The APEPCS is IDIW; that is, if the joint distribution $g_t(\theta, w)$ for $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$ is correctly weighted with respect to $\pi(\theta \mid x)$, then after one run of the scheme, the new joint distribution $g_{t+1}(\theta, w)$ for $(\boldsymbol{\theta}_{t+1}, \boldsymbol{w}_{t+1})$ is also correctly weighted with respect to $\pi(\theta \mid x)$.

17 MCDWIS Methodology
A Monte Carlo Dynamically Weighted Importance Sampler
Let $W_c$ denote a dynamic weighting move switching parameter, which switches the value of $\beta_t$ between 0 and 1 depending on the value of $W_{\text{up},t}$.
- (Move type setting) If $W_{\text{up},t} \le W_c$, then set $\beta_t = 1$. Otherwise, set $\beta_t = 0$.
- (MCDW) Apply the Monte Carlo dynamic weighting move to the population $(\boldsymbol{\theta}_t, \boldsymbol{w}_t)$. The new population is denoted by $(\boldsymbol{\theta}'_{t+1}, \boldsymbol{w}'_{t+1})$.
- (Population Control) Apply APEPCS to $(\boldsymbol{\theta}'_{t+1}, \boldsymbol{w}'_{t+1})$. The new population is denoted by $(\boldsymbol{\theta}_{t+1}, \boldsymbol{w}_{t+1})$.

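18 MCDWIS Methodology
Let $(\boldsymbol{\theta}_1, \boldsymbol{w}_1), \ldots, (\boldsymbol{\theta}_N, \boldsymbol{w}_N)$ denote a series of populations generated by MCDWIS. Then, according to (6), the quantity $\mu = E_\pi \rho(\theta)$ can be estimated by
$$\hat{\mu} = \frac{\sum_{t=N_0+1}^{N} \sum_{i=1}^{n_t} w_{t,i}\, \rho(\theta_{t,i})}{\sum_{t=N_0+1}^{N} \sum_{i=1}^{n_t} w_{t,i}}, \tag{9}$$
where $N_0$ denotes the number of burn-in iterations.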
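The estimator in (9) is a ratio of weighted sums over the post-burn-in populations. A minimal Python sketch, assuming each population is stored as a pair of parallel lists `(thetas, weights)`; the function name `weighted_estimate` and this storage layout are hypothetical:

```python
def weighted_estimate(populations, rho, burn_in=0):
    """Estimate E[rho(theta)] from weighted populations, skipping the burn-in."""
    numerator = 0.0
    denominator = 0.0
    for thetas, weights in populations[burn_in:]:
        for theta, w in zip(thetas, weights):
            numerator += w * rho(theta)
            denominator += w
    return numerator / denominator


if __name__ == "__main__":
    pops = [([100.0], [1.0]), ([1.0, 3.0], [1.0, 1.0]), ([2.0], [2.0])]
    # The first population is discarded as burn-in.
    print(weighted_estimate(pops, rho=lambda theta: theta, burn_in=1))
```

Populations of different sizes are handled naturally, because each state contributes in proportion to its weight rather than its position in a fixed-size array.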

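19 MCDWIS Methodology
Weight Behavior Analysis
Lemma 0.1 Let $f(x \mid \theta) = p(x, \theta)/Z(\theta)$ denote the likelihood function of $x$, let $\pi(\theta)$ denote the prior distribution of $\theta$, and let $T(\cdot \mid \cdot)$ denote a proposal distribution of $\theta$. Define $p(\theta, \theta' \mid x) = p(x, \theta)\, \pi(\theta)\, T(\theta' \mid \theta)$, and define
$$r(\theta, \theta') = \hat{R}(\theta, \theta')\, \frac{p(\theta', \theta \mid x)}{p(\theta, \theta' \mid x)}$$
to be a Monte Carlo MH ratio, where $\hat{R}(\theta, \theta')$ denotes an unbiased estimator of $Z(\theta)/Z(\theta')$. Then $e_0 = E \log r(\theta, \theta') \le 0$, where the expectation is taken with respect to the joint density $\varphi(\hat{R})\, p(\theta, \theta' \mid x)/Z(\theta)$.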

20 MCDWIS Methodology
The weight process of MCDWIS can be characterized by the following process:
$$Z_t = \begin{cases} Z_{t-1} + \log r(\theta_{t-1}, \theta_t) - \log(d_t), & \text{if } Z_{t-1} > 0, \\ 0, & \text{if } Z_{t-1} < 0. \end{cases} \tag{10}$$
Theorem 0.4 Under mild conditions, the MCDWIS almost surely has finite moments of any order.

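21 Spatial Autologistic Models Numerical Results
Let $x = \{x_i : i \in D\}$ denote the observed binary data, where $x_i$ is called a spin and $D$ is the set of indices of the spins. The likelihood function of the model is
$$f(x \mid \theta) = \frac{1}{Z(\theta)} \exp\Big\{ \theta_a \sum_{i \in D} x_i + \frac{\theta_b}{2} \sum_{i \in D} x_i \Big( \sum_{j \in n(i)} x_j \Big) \Big\}, \quad (\theta_a, \theta_b) \in \Theta, \tag{11}$$
where $\theta = (\theta_a, \theta_b)$, $n(i)$ denotes the set of neighbors of spin $i$, the parameter $\theta_a$ determines the overall proportion of $x_i = +1$, the parameter $\theta_b$ determines the intensity of interaction between $x_i$ and its neighbors, and $Z(\theta)$ is the intractable normalizing constant defined by
$$Z(\theta) = \sum_{\text{for all possible } x} \exp\Big\{ \theta_a \sum_{j \in D} x_j + \frac{\theta_b}{2} \sum_{i \in D} x_i \Big( \sum_{j \in n(i)} x_j \Big) \Big\}.$$
The prior is specified by $(\theta_a, \theta_b) \in \Theta = [-1, 1] \times [0, 1]$.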
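To make the intractability of $Z(\theta)$ concrete, the following Python sketch evaluates the exponent in (11) and computes $Z(\theta)$ by brute force on a tiny lattice. The rectangular lattice, the four-nearest-neighbor structure with free boundaries, and the names `log_p` and `brute_force_Z` are assumptions made for illustration; the slides do not specify the neighborhood.

```python
import itertools
import math


def log_p(x, theta_a, theta_b):
    """Exponent of (11) for a grid x of +/-1 spins (list of rows)."""
    rows, cols = len(x), len(x[0])
    field = sum(sum(row) for row in x)
    pair = 0
    for i in range(rows):
        for j in range(cols):
            # Sum of the (up to four) nearest neighbors, free boundary.
            nb = 0
            if i > 0:
                nb += x[i - 1][j]
            if i < rows - 1:
                nb += x[i + 1][j]
            if j > 0:
                nb += x[i][j - 1]
            if j < cols - 1:
                nb += x[i][j + 1]
            pair += x[i][j] * nb
    return theta_a * field + 0.5 * theta_b * pair


def brute_force_Z(rows, cols, theta_a, theta_b):
    """Sum exp(log_p) over all 2**(rows*cols) spin configurations."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=rows * cols):
        x = [list(spins[r * cols:(r + 1) * cols]) for r in range(rows)]
        total += math.exp(log_p(x, theta_a, theta_b))
    return total


if __name__ == "__main__":
    print(brute_force_Z(3, 3, 0.1, 0.2))
```

The sum has $2^{|D|}$ terms, so enumeration is feasible only for a handful of spins; a lattice with a few hundred sites is already far out of reach, which is why the ratio $Z(\theta)/Z(\theta')$ is estimated by Monte Carlo instead.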

22 Spatial Autologistic Models Numerical Results
Estimates comparison:
- MCDWIS estimate: $(\theta_a, \theta_b)$ = ( , ) with the standard error ( , ).
- Exchange algorithm: $(\theta_a, \theta_b)$ = ( , ) with the standard error ( , ).
- Contour Monte Carlo: ( , ) (Liang, 2007)
- Stochastic Approximation Monte Carlo: ( , ) (Liang et al., 2007)
- Monte Carlo MLE: (−0.304, 0.117) (Sherman et al., 2006)

23 Spatial Autologistic Models Numerical Results
Figure 2: US cancer mortality data. Panel titles: (a) True Observations; (b) Fitted mortality rate. (a) The mortality map of liver and gallbladder cancers (including bile ducts) for white males during the decade. Black squares denote counties of high cancer mortality rate, and white squares denote counties of low cancer mortality rate. (b) Fitted cancer mortality rates by the autologistic model with the parameters replaced by their approximate Bayesian estimates. The cancer mortality rate of each county is represented by the gray level of the corresponding square.

24 Spatial Autologistic Models Numerical Results
Figure 3: Simulation results of the MCDWIS for the U.S. Cancer Mortality example: (a) time plot of population size; (b) time plot of $\beta_t$; and (c) time plot of $\log(W_{\text{up},t})$. The dotted line in plot (c) shows the value of $\log(W_c)$. [Axes: iteration (horizontal) against population size, theta, and log(wup) (vertical) in panels (a), (b), and (c).]

25 Spatial Autologistic Models Numerical Results
Columns: true $(\theta_a, \theta_b)$; MCDWIS ($\theta_a$, $\theta_b$, T); Exchange algorithm ($\theta_a$, $\theta_b$, T); MPLE ($\theta_a$, $\theta_b$). Standard errors are in parentheses; entries are listed in column order.
(0, 0.1): (.0025) (.0019) 5.8 (.0024) (.0018) 1.2 (.0024) (.0019)
(0, 0.2): (.0021) (.0019) 5.8 (.0020) (.0019) 2.8 (.0022) (.0022)
(0, 0.3): (.0013) (.0017) 5.8 (.0014) (.0017) 7.9 (.0016) (.0022)
(0, 0.4): (.0012) (.0020) 5.8 (.0005) (.0012) (.0012) (.0020)
(0.1, 0.1): (.0025) (.0023) 5.8 (.0025) (.0022) 1.1 (.0025) (.0023)
(0.3, 0.3): (.0105) (.0045) 5.8 (.0097) (.0043) 3.5 (.0102) (.0046)
(0.5, 0.5): (.0347) (.0122) 5.8 (.0393) (.0123)

26 MCDWIS Discussion
- Unlike other auxiliary variable MCMC algorithms, MCDWIS removes the need for perfect sampling, and thus can be applied to a wide range of problems for which perfect sampling is unavailable or very expensive.
- MCDWIS allows Monte Carlo estimates to be used in MCMC simulations, while still leaving the target distribution invariant under the criterion of dynamically weighted importance sampling.
- MCDWIS can potentially be used for Bayesian inference in missing data problems, which often involve simulating from a posterior distribution with intractable integrals.


Bayesian Image Segmentation Using MRF s Combined with Hierarchical Prior Models Bayesian Image Segmentation Using MRF s Combined with Hierarchical Prior Models Kohta Aoki 1 and Hiroshi Nagahashi 2 1 Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee and Andrew O. Finley 2 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Exact and approximate recursive calculations for binary Markov random fields defined on graphs

Exact and approximate recursive calculations for binary Markov random fields defined on graphs NORGES TEKNISK-NATURVITENSKAPELIGE UNIVERSITET Exact and approximate recursive calculations for binary Markov random fields defined on graphs by Håkon Tjelmeland and Haakon Michael Austad PREPRINT STATISTICS

More information

The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland),

The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland), The University of Auckland Applied Mathematics Bayesian Methods for Inverse Problems : why and how Colin Fox Tiangang Cui, Mike O Sullivan (Auckland), Geoff Nicholls (Statistics, Oxford) fox@math.auckland.ac.nz

More information

Markov chain Monte Carlo

Markov chain Monte Carlo Markov chain Monte Carlo Markov chain Monte Carlo (MCMC) Gibbs and Metropolis Hastings Slice sampling Practical details Iain Murray http://iainmurray.net/ Reminder Need to sample large, non-standard distributions:

More information

PSEUDO-MARGINAL METROPOLIS-HASTINGS APPROACH AND ITS APPLICATION TO BAYESIAN COPULA MODEL

PSEUDO-MARGINAL METROPOLIS-HASTINGS APPROACH AND ITS APPLICATION TO BAYESIAN COPULA MODEL PSEUDO-MARGINAL METROPOLIS-HASTINGS APPROACH AND ITS APPLICATION TO BAYESIAN COPULA MODEL Xuebin Zheng Supervisor: Associate Professor Josef Dick Co-Supervisor: Dr. David Gunawan School of Mathematics

More information

Basic math for biology

Basic math for biology Basic math for biology Lei Li Florida State University, Feb 6, 2002 The EM algorithm: setup Parametric models: {P θ }. Data: full data (Y, X); partial data Y. Missing data: X. Likelihood and maximum likelihood

More information

An introduction to Sequential Monte Carlo

An introduction to Sequential Monte Carlo An introduction to Sequential Monte Carlo Thang Bui Jes Frellsen Department of Engineering University of Cambridge Research and Communication Club 6 February 2014 1 Sequential Monte Carlo (SMC) methods

More information

MCMC for Cut Models or Chasing a Moving Target with MCMC

MCMC for Cut Models or Chasing a Moving Target with MCMC MCMC for Cut Models or Chasing a Moving Target with MCMC Martyn Plummer International Agency for Research on Cancer MCMSki Chamonix, 6 Jan 2014 Cut models What do we want to do? 1. Generate some random

More information

Lecture 6: Markov Chain Monte Carlo

Lecture 6: Markov Chain Monte Carlo Lecture 6: Markov Chain Monte Carlo D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Outline

More information

Theory of Stochastic Processes 8. Markov chain Monte Carlo

Theory of Stochastic Processes 8. Markov chain Monte Carlo Theory of Stochastic Processes 8. Markov chain Monte Carlo Tomonari Sei sei@mist.i.u-tokyo.ac.jp Department of Mathematical Informatics, University of Tokyo June 8, 2017 http://www.stat.t.u-tokyo.ac.jp/~sei/lec.html

More information

Bayesian Estimation with Sparse Grids

Bayesian Estimation with Sparse Grids Bayesian Estimation with Sparse Grids Kenneth L. Judd and Thomas M. Mertens Institute on Computational Economics August 7, 27 / 48 Outline Introduction 2 Sparse grids Construction Integration with sparse

More information

Bayesian computation for statistical models with intractable normalizing constants

Bayesian computation for statistical models with intractable normalizing constants Bayesian computation for statistical models with intractable normalizing constants Yves F. Atchadé, Nicolas Lartillot and Christian Robert (First version March 2008; this version Jan. 2010) Abstract: This

More information

MCMC notes by Mark Holder

MCMC notes by Mark Holder MCMC notes by Mark Holder Bayesian inference Ultimately, we want to make probability statements about true values of parameters, given our data. For example P(α 0 < α 1 X). According to Bayes theorem:

More information

Markov Chain Monte Carlo, Numerical Integration

Markov Chain Monte Carlo, Numerical Integration Markov Chain Monte Carlo, Numerical Integration (See Statistics) Trevor Gallen Fall 2015 1 / 1 Agenda Numerical Integration: MCMC methods Estimating Markov Chains Estimating latent variables 2 / 1 Numerical

More information

Bayesian inference for intractable distributions

Bayesian inference for intractable distributions Bayesian inference for intractable distributions Nial Friel University College Dublin nial.friel@ucd.ie October, 2014 Introduction Main themes: The Bayesian inferential approach has had a profound impact

More information

Machine Learning. Probabilistic KNN.

Machine Learning. Probabilistic KNN. Machine Learning. Mark Girolami girolami@dcs.gla.ac.uk Department of Computing Science University of Glasgow June 21, 2007 p. 1/3 KNN is a remarkably simple algorithm with proven error-rates June 21, 2007

More information

Lecture 2: From Linear Regression to Kalman Filter and Beyond

Lecture 2: From Linear Regression to Kalman Filter and Beyond Lecture 2: From Linear Regression to Kalman Filter and Beyond Department of Biomedical Engineering and Computational Science Aalto University January 26, 2012 Contents 1 Batch and Recursive Estimation

More information

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference

Bayesian Estimation of DSGE Models 1 Chapter 3: A Crash Course in Bayesian Inference 1 The views expressed in this paper are those of the authors and do not necessarily reflect the views of the Federal Reserve Board of Governors or the Federal Reserve System. Bayesian Estimation of DSGE

More information

Convex Optimization CMU-10725

Convex Optimization CMU-10725 Convex Optimization CMU-10725 Simulated Annealing Barnabás Póczos & Ryan Tibshirani Andrey Markov Markov Chains 2 Markov Chains Markov chain: Homogen Markov chain: 3 Markov Chains Assume that the state

More information

Tutorial on Probabilistic Programming with PyMC3

Tutorial on Probabilistic Programming with PyMC3 185.A83 Machine Learning for Health Informatics 2017S, VU, 2.0 h, 3.0 ECTS Tutorial 02-04.04.2017 Tutorial on Probabilistic Programming with PyMC3 florian.endel@tuwien.ac.at http://hci-kdd.org/machine-learning-for-health-informatics-course

More information

ComputationalToolsforComparing AsymmetricGARCHModelsviaBayes Factors. RicardoS.Ehlers

ComputationalToolsforComparing AsymmetricGARCHModelsviaBayes Factors. RicardoS.Ehlers ComputationalToolsforComparing AsymmetricGARCHModelsviaBayes Factors RicardoS.Ehlers Laboratório de Estatística e Geoinformação- UFPR http://leg.ufpr.br/ ehlers ehlers@leg.ufpr.br II Workshop on Statistical

More information

Monte Carlo Methods. Leon Gu CSD, CMU

Monte Carlo Methods. Leon Gu CSD, CMU Monte Carlo Methods Leon Gu CSD, CMU Approximate Inference EM: y-observed variables; x-hidden variables; θ-parameters; E-step: q(x) = p(x y, θ t 1 ) M-step: θ t = arg max E q(x) [log p(y, x θ)] θ Monte

More information

Computer intensive statistical methods

Computer intensive statistical methods Lecture 11 Markov Chain Monte Carlo cont. October 6, 2015 Jonas Wallin jonwal@chalmers.se Chalmers, Gothenburg university The two stage Gibbs sampler If the conditional distributions are easy to sample

More information

Approximate Inference using MCMC

Approximate Inference using MCMC Approximate Inference using MCMC 9.520 Class 22 Ruslan Salakhutdinov BCS and CSAIL, MIT 1 Plan 1. Introduction/Notation. 2. Examples of successful Bayesian models. 3. Basic Sampling Algorithms. 4. Markov

More information

The Metropolis-Hastings Algorithm. June 8, 2012

The Metropolis-Hastings Algorithm. June 8, 2012 The Metropolis-Hastings Algorithm June 8, 22 The Plan. Understand what a simulated distribution is 2. Understand why the Metropolis-Hastings algorithm works 3. Learn how to apply the Metropolis-Hastings

More information

Tutorial on ABC Algorithms

Tutorial on ABC Algorithms Tutorial on ABC Algorithms Dr Chris Drovandi Queensland University of Technology, Australia c.drovandi@qut.edu.au July 3, 2014 Notation Model parameter θ with prior π(θ) Likelihood is f(ý θ) with observed

More information

Answers and expectations

Answers and expectations Answers and expectations For a function f(x) and distribution P(x), the expectation of f with respect to P is The expectation is the average of f, when x is drawn from the probability distribution P E

More information

Approximate Bayesian Computation: a simulation based approach to inference

Approximate Bayesian Computation: a simulation based approach to inference Approximate Bayesian Computation: a simulation based approach to inference Richard Wilkinson Simon Tavaré 2 Department of Probability and Statistics University of Sheffield 2 Department of Applied Mathematics

More information

1. Fisher Information

1. Fisher Information 1. Fisher Information Let f(x θ) be a density function with the property that log f(x θ) is differentiable in θ throughout the open p-dimensional parameter set Θ R p ; then the score statistic (or score

More information

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham

MCMC 2: Lecture 3 SIR models - more topics. Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham MCMC 2: Lecture 3 SIR models - more topics Phil O Neill Theo Kypraios School of Mathematical Sciences University of Nottingham Contents 1. What can be estimated? 2. Reparameterisation 3. Marginalisation

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Evidence estimation for Markov random fields: a triply intractable problem

Evidence estimation for Markov random fields: a triply intractable problem Evidence estimation for Markov random fields: a triply intractable problem January 7th, 2014 Markov random fields Interacting objects Markov random fields (MRFs) are used for modelling (often large numbers

More information

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision

The Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that

More information

Markov chain Monte Carlo

Markov chain Monte Carlo 1 / 26 Markov chain Monte Carlo Timothy Hanson 1 and Alejandro Jara 2 1 Division of Biostatistics, University of Minnesota, USA 2 Department of Statistics, Universidad de Concepción, Chile IAP-Workshop

More information

DAG models and Markov Chain Monte Carlo methods a short overview

DAG models and Markov Chain Monte Carlo methods a short overview DAG models and Markov Chain Monte Carlo methods a short overview Søren Højsgaard Institute of Genetics and Biotechnology University of Aarhus August 18, 2008 Printed: August 18, 2008 File: DAGMC-Lecture.tex

More information

Markov Chain Monte Carlo Methods

Markov Chain Monte Carlo Methods Markov Chain Monte Carlo Methods p. /36 Markov Chain Monte Carlo Methods Michel Bierlaire michel.bierlaire@epfl.ch Transport and Mobility Laboratory Markov Chain Monte Carlo Methods p. 2/36 Markov Chains

More information