Evidence estimation for Markov random fields: a triply intractable problem

Evidence estimation for Markov random fields: a triply intractable problem January 7th, 2014

Markov random fields Interacting objects Markov random fields (MRFs) are used for modelling (often large numbers of) interacting objects, usually with symmetric interactions. Used widely in statistics, physics and computer science, e.g. image analysis; ferromagnetism; geostatistics; point processes; social networks.

Markov random fields Image analysis The log expression of 72 genes on a particular chromosome over 46 hours (from Friel et al. 2009).

Markov random fields Pairwise Markov random fields

Markov random fields Intractable normalising constants Pairwise MRFs correspond to the factorisation $f(y \mid \theta) \propto \gamma(y \mid \theta) = \prod_{(i,j) \in \text{Nei}(Y)} \phi(y_i, y_j \mid \theta)$. We also need to specify the normalising constant $Z(\theta) = \int_{\mathcal{Y}} \prod_{(i,j) \in \text{Nei}(Y)} \phi(y_i, y_j \mid \theta)\, dy$. In general we are interested in models that take the form $f(y \mid \theta) = \gamma(y \mid \theta) / Z(\theta)$.
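
To make the intractability concrete, here is a minimal Python sketch (not from the slides) of a tiny pairwise MRF: an Ising model with $\gamma(y \mid \theta) = \exp\big(\theta \sum_{(i,j) \in \text{Nei}(Y)} y_i y_j\big)$, whose normalising constant is computed by brute-force enumeration. The exponential cost of that enumeration is exactly why $Z(\theta)$ is intractable in practice.

```python
import itertools
import numpy as np

def log_gamma_ising(y, theta):
    """Unnormalised log density of an Ising model on an n x n grid:
    log gamma(y | theta) = theta * sum of y_i * y_j over neighbouring pairs."""
    return theta * (np.sum(y[:-1, :] * y[1:, :]) + np.sum(y[:, :-1] * y[:, 1:]))

def log_Z_brute_force(n, theta):
    """Normalising constant by enumerating all 2^(n*n) spin configurations --
    exactly the computation that becomes infeasible for realistic grid sizes."""
    log_terms = np.array([
        log_gamma_ising(np.array(spins).reshape(n, n), theta)
        for spins in itertools.product([-1, 1], repeat=n * n)
    ])
    m = log_terms.max()
    return m + np.log(np.sum(np.exp(log_terms - m)))   # log-sum-exp

print(log_Z_brute_force(3, theta=0.4))   # 512 terms here; a 100 x 100 grid would need 2^10000
```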

A doubly intractable problem Doubly intractable Suppose we want to estimate parameters θ after observing Y = y. Use Bayesian inference to find $p(\theta \mid y) \propto p(y \mid \theta) p(\theta)$. Could use MCMC, but the acceptance probability in MH is $\min\left\{1, \frac{q(\theta \mid \theta')\, p(\theta')\, \gamma(y \mid \theta')\, \frac{1}{Z(\theta')}}{q(\theta' \mid \theta)\, p(\theta)\, \gamma(y \mid \theta)\, \frac{1}{Z(\theta)}}\right\}$, which involves the intractable ratio $Z(\theta)/Z(\theta')$.

A doubly intractable problem ABC-MCMC Approximate an intractable likelihood at θ with $\frac{1}{R} \sum_{r=1}^{R} \pi_\varepsilon\left(S(x_r) \mid S(y)\right)$, where the $x_r \sim f(\cdot \mid \theta)$ are R simulations from f (originally in Ratmann et al. (2009)). Often $R = 1$ and $\pi_\varepsilon(\cdot \mid S(y)) = \mathcal{U}\left(\cdot;\, (S(y) - \varepsilon, S(y) + \varepsilon)\right)$. Essentially a nonparametric kernel estimator of the conditional distribution of the statistics given θ, based on simulations from f. ABC-MCMC is an MCMC algorithm that targets this approximate posterior.
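
As an illustration only (not from the slides), a minimal ABC-MCMC sketch in Python for the $R = 1$, uniform-kernel version described above; `simulate`, `summary`, `log_prior` and the random-walk standard deviation `prop_sd` are hypothetical placeholders for a specific model.

```python
import numpy as np

def abc_mcmc(y, simulate, summary, log_prior, n_iters, eps, prop_sd, theta0, rng=None):
    """ABC-MCMC with R = 1 and a uniform kernel of half-width eps: a proposal is
    accepted only if its simulated summary lands within eps of the observed summary
    and the usual prior ratio test passes (the random-walk proposal is symmetric)."""
    rng = np.random.default_rng() if rng is None else rng
    s_obs = summary(y)
    theta, samples = theta0, []
    for _ in range(n_iters):
        theta_prop = theta + prop_sd * rng.standard_normal()
        s_sim = summary(simulate(theta_prop, rng))          # one draw from f(. | theta')
        if (np.abs(s_sim - s_obs) < eps
                and np.log(rng.uniform()) < log_prior(theta_prop) - log_prior(theta)):
            theta = theta_prop
        samples.append(theta)
    return np.array(samples)
```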

A doubly intractable problem ABC on ERGMs (figure: true posterior compared with its ABC approximation)

A doubly intractable problem Synthetic likelihood An alternative approximation proposed in Wood (2010). Again take R simulations from f, $x_r \sim f(\cdot \mid \theta)$, and take the summary statistics of each. But instead use a multivariate normal approximation to the distribution of the summary statistics given θ: $\hat{L}(S(y) \mid \theta) = \mathcal{N}\left(S(y);\, \hat{\mu}_\theta, \hat{\Sigma}_\theta\right)$, where $\hat{\mu}_\theta = \frac{1}{R} \sum_{r=1}^{R} S(x_r)$ and $\hat{\Sigma}_\theta = \frac{s s^T}{R - 1}$, with $s = \left(S(x_1) - \hat{\mu}_\theta, \ldots, S(x_R) - \hat{\mu}_\theta\right)$.
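
A minimal sketch of the synthetic log-likelihood just defined, assuming a hypothetical function `simulate_summaries(theta, rng)` that returns the vector of summary statistics from one simulation of the model:

```python
import numpy as np
from scipy.stats import multivariate_normal

def synthetic_log_likelihood(s_obs, theta, simulate_summaries, R, rng):
    """Gaussian approximation to the distribution of the summaries given theta:
    fit a multivariate normal to R simulated summary vectors and evaluate its
    log density at the observed summaries s_obs."""
    S = np.array([simulate_summaries(theta, rng) for _ in range(R)])   # R x d matrix
    mu = S.mean(axis=0)
    Sigma = np.cov(S, rowvar=False)      # divides by R - 1, matching the slide
    return multivariate_normal.logpdf(s_obs, mean=mu, cov=Sigma)
```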

A doubly intractable problem The single auxiliary variable method Møller et al. (2006) augment the target distribution with an extra variable u and use $p(\theta, u \mid y) \propto q_u(u \mid \theta, y)\, f(y \mid \theta)\, p(\theta)$, where $q_u$ is some (normalised) arbitrary distribution and u is on the same space as y. As the MH proposal in (θ, u)-space they use $(\theta', u') \sim f(u' \mid \theta')\, q(\theta' \mid \theta)$. This gives an acceptance probability of $\min\left\{1, \frac{q(\theta \mid \theta')\, p(\theta')\, \gamma(y \mid \theta')\, q_u(u' \mid \theta', y)\, \gamma(u \mid \theta)}{q(\theta' \mid \theta)\, p(\theta)\, \gamma(y \mid \theta)\, q_u(u \mid \theta, y)\, \gamma(u' \mid \theta')}\right\}$, in which the normalising constants cancel.

A doubly intractable problem Exact approximations Note that $q_u(u' \mid \theta', y) / \gamma(u' \mid \theta')$, with $u' \sim f(\cdot \mid \theta')$, is an unbiased importance sampling estimator of $1/Z(\theta')$, so the algorithm still targets the correct distribution! This idea was first seen in the pseudo-marginal methods of Beaumont (2003) and Andrieu and Roberts (2009). Relies on being able to simulate exactly from $f(\cdot \mid \theta')$, which is usually not possible or computationally expensive. Girolami et al. (2013) introduce an approach that does not require exact simulation ("Russian Roulette").

Estimating the marginal likelihood The marginal likelihood (also known as the evidence) is $p(y) = \int_\theta p(\theta) f(y \mid \theta)\, d\theta$. Used in Bayesian model comparison, $p(M \mid y) \propto p(M)\, p(y \mid M)$, most commonly seen in the Bayes factor for comparing models, $p(y \mid M_1) / p(y \mid M_2)$. All commonly used methods require $f(y \mid \theta)$ to be tractable in θ, and usually it can't be estimated from MCMC output: a triply intractable problem - Friel (2013).

Using importance sampling (IS) Importance sampling Returns a weighted sample $\{(\theta^{(p)}, w^{(p)}) : 1 \le p \le P\}$ from $p(\theta \mid y)$. For $p = 1:P$: simulate $\theta^{(p)} \sim q(\cdot)$ and weight $w^{(p)} = \frac{p(\theta^{(p)})\, f(y \mid \theta^{(p)})}{q(\theta^{(p)})}$. Then $\widehat{p}(y) = \frac{1}{P} \sum_{p=1}^{P} w^{(p)}$.
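
A minimal Python sketch of this estimator for a model whose likelihood is tractable (the intractable case is the subject of the remaining slides); `log_likelihood`, `log_prior`, `sample_q` and `log_q` are hypothetical placeholders:

```python
import numpy as np

def is_evidence(log_likelihood, log_prior, sample_q, log_q, P, rng):
    """Importance sampling estimate of the evidence p(y): the average of
    w = p(theta) f(y | theta) / q(theta) over P draws from the proposal q.
    Computed on the log scale for numerical stability."""
    log_w = np.empty(P)
    for p in range(P):
        theta = sample_q(rng)
        log_w[p] = log_prior(theta) + log_likelihood(theta) - log_q(theta)
    m = log_w.max()
    return np.exp(m) * np.mean(np.exp(log_w - m))    # (1/P) * sum of weights
```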

Using ABC-IS Didelot, Everitt, Johansen and Lawson (2011) investigate the use of the ABC approximation when using IS for marginal likelihoods. The weights are $w^{(p)} = \frac{p(\theta^{(p)})\, \frac{1}{R} \sum_{r=1}^{R} \pi_\varepsilon\left(S(x_r^{(p)}) \mid S(y)\right)}{q(\theta^{(p)})}$, where $\{x_r^{(p)}\}_{r=1}^{R} \sim f(\cdot \mid \theta^{(p)})$. This method gives an estimate of $p(S(y))$ rather than of $p(y)$. Didelot et al. (2011), Grelaud et al. (2009), Robert et al. (2011) and Marin et al. (2014) discuss the choice of summary statistics.
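
For scalar summaries and the uniform kernel, a single ABC-IS weight can be sketched as follows (again with hypothetical `simulate`, `summary`, `log_prior` and `log_q` placeholders):

```python
import numpy as np

def abc_is_weight(theta, s_obs, simulate, summary, log_prior, log_q, R, eps, rng):
    """One ABC-IS weight: prior times the uniform-kernel ABC likelihood estimate
    (the fraction of R simulated summaries within eps of S(y), scaled by the
    kernel height 1/(2*eps)), divided by the proposal density."""
    hits = [abs(summary(simulate(theta, rng)) - s_obs) < eps for _ in range(R)]
    abc_likelihood = np.mean(hits) / (2.0 * eps)
    return np.exp(log_prior(theta) - log_q(theta)) * abc_likelihood
```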

Exponential family models Didelot et al. (2011): when comparing two exponential family models, if $S_1(y)$ is sufficient for the parameters in model 1 and $S_2(y)$ is sufficient for the parameters in model 2, then using the vector $S(y) = (S_1(y), S_2(y))$ for both models gives $\frac{p(y \mid M_1)}{p(y \mid M_2)} = \frac{p(S(y) \mid M_1)}{p(S(y) \mid M_2)}$. Marin et al. (2014) gives much more general guidance.

Synthetic likelihood IS We could also use the SL approximation within IS. The weight is then $w^{(p)} = \frac{p(\theta^{(p)})\, \mathcal{N}\left(S(y);\, \hat{\mu}_{\theta^{(p)}}, \hat{\Sigma}_{\theta^{(p)}}\right)}{q(\theta^{(p)})}$, where $\hat{\mu}_{\theta^{(p)}}, \hat{\Sigma}_{\theta^{(p)}}$ are based on $\{x_r^{(p)}\}_{r=1}^{R} \sim f(\cdot \mid \theta^{(p)})$. Does not require choosing ε, but relies on the normality assumption.

Exact methods? Importance sampling: $p(y) = \int_\theta \frac{f(y \mid \theta)\, p(\theta)}{q(\theta)}\, q(\theta)\, d\theta \approx \frac{1}{P} \sum_{p=1}^{P} \frac{f(y \mid \theta^{(p)})\, p(\theta^{(p)})}{q(\theta^{(p)})} = \frac{1}{P} \sum_{p=1}^{P} \frac{\gamma(y \mid \theta^{(p)})\, p(\theta^{(p)})}{q(\theta^{(p)})} \frac{1}{Z(\theta^{(p)})}$. Intractable...

SAV importance sampling Consider the SAV target $p(\theta, u \mid y) \propto q_u(u \mid \theta, y)\, f(y \mid \theta)\, p(\theta)$, noting that it has the same marginal likelihood as $p(\theta \mid y)$. Suppose we do importance sampling on this SAV target, and choose the proposal to be $q(\theta, u) = f(u \mid \theta)\, q(\theta)$. We obtain $\widehat{p}(y) = \frac{1}{P} \sum_{p=1}^{P} \frac{q_u(u^{(p)} \mid \theta^{(p)}, y)\, \gamma(y \mid \theta^{(p)})\, p(\theta^{(p)})\, Z(\theta^{(p)})}{\gamma(u^{(p)} \mid \theta^{(p)})\, q(\theta^{(p)})\, Z(\theta^{(p)})} = \frac{1}{P} \sum_{p=1}^{P} \frac{\gamma(y \mid \theta^{(p)})\, p(\theta^{(p)})}{q(\theta^{(p)})} \frac{q_u(u^{(p)} \mid \theta^{(p)}, y)}{\gamma(u^{(p)} \mid \theta^{(p)})}$.

Exact approximations revisited Using unbiased weight estimates within importance sampling: IS² (Tran et al., 2013); random weight particle filters (Fearnhead et al., 2010); SMC² (Chopin et al., 2011). For each θ, we could use multiple u variables and use the estimate $\frac{1}{\widehat{Z}(\theta)} = \frac{1}{M} \sum_{m=1}^{M} \frac{q_u(u^{(m)} \mid \theta, y)}{\gamma(u^{(m)} \mid \theta)}$. For u the proposal is pre-determined, but we need to choose $q_u(u \mid \theta, y)$. Møller et al. (2006): one possible choice is $q_u(u \mid \theta, y) = \gamma(u \mid \tilde{\theta}) / Z(\tilde{\theta})$, where $\tilde{\theta}$ is an ML estimate (or some other appropriate estimate) of θ.
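
A minimal sketch of this estimate, on the log scale, assuming hypothetical helpers `sample_f(theta, rng)` (ideally an exact draw from $f(\cdot \mid \theta)$), `log_q_u(u, theta, y)` and `log_gamma(u, theta)`:

```python
import numpy as np

def log_inv_Z_estimate(theta, y, sample_f, log_q_u, log_gamma, M, rng):
    """log of (1/M) * sum_m q_u(u_m | theta, y) / gamma(u_m | theta),
    with u_m ~ f(. | theta); unbiased for 1/Z(theta) on the natural scale."""
    log_ratios = np.empty(M)
    for m in range(M):
        u = sample_f(theta, rng)
        log_ratios[m] = log_q_u(u, theta, y) - log_gamma(u, theta)
    c = log_ratios.max()
    return c + np.log(np.mean(np.exp(log_ratios - c)))   # log-sum-exp average
```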

SAVIS / MAVIS Using the suggested $q_u$ gives the following importance sampling estimate of $1/Z(\theta)$: $\frac{1}{\widehat{Z}(\theta)} = \frac{1}{Z(\tilde{\theta})} \frac{1}{M} \sum_{m=1}^{M} \frac{\gamma(u^{(m)} \mid \tilde{\theta})}{\gamma(u^{(m)} \mid \theta)}$, with $u^{(m)} \sim f(\cdot \mid \theta)$. Or, using annealed importance sampling (Neal, 2001) with the sequence of targets $f_k(\cdot \mid \theta, \tilde{\theta}) \propto \gamma_k(\cdot \mid \theta, \tilde{\theta}) = \gamma(\cdot \mid \theta)^{(K+1-k)/(K+1)}\, \gamma(\cdot \mid \tilde{\theta})^{k/(K+1)}$, we obtain $\frac{1}{\widehat{Z}(\theta)} = \frac{1}{Z(\tilde{\theta})} \frac{1}{M} \sum_{m=1}^{M} \prod_{k=0}^{K} \frac{\gamma_{k+1}(u_k^{(m)} \mid \theta, \tilde{\theta})}{\gamma_k(u_k^{(m)} \mid \theta, \tilde{\theta})}$.
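
A sketch of the annealed (MAVIS-style) estimate of $\log(1/Z(\theta))$, under the assumptions that `sample_f(theta, rng)` gives an (ideally exact) draw from $f(\cdot \mid \theta)$, `mcmc_step(u, log_target, rng)` performs one MCMC move leaving `log_target` invariant, and $\log Z(\tilde{\theta})$ is supplied (known or pre-estimated); all of these names are placeholders, not part of the slides:

```python
import numpy as np

def log_gamma_k(u, k, K, log_gamma, theta, theta_tilde):
    """Geometric annealing path: gamma(.|theta) at k = 0, gamma(.|theta_tilde) at k = K + 1."""
    beta = k / (K + 1.0)
    return (1.0 - beta) * log_gamma(u, theta) + beta * log_gamma(u, theta_tilde)

def mavis_log_inv_Z(theta, theta_tilde, log_Z_tilde, log_gamma, sample_f, mcmc_step, M, K, rng):
    """Annealed importance sampling estimate of log(1/Z(theta)): each of the M chains
    starts at a draw from f(.|theta) and accumulates the product of gamma_{k+1}/gamma_k."""
    log_weights = np.empty(M)
    for m in range(M):
        u = sample_f(theta, rng)
        lw = 0.0
        for k in range(K + 1):
            lw += (log_gamma_k(u, k + 1, K, log_gamma, theta, theta_tilde)
                   - log_gamma_k(u, k, K, log_gamma, theta, theta_tilde))
            u = mcmc_step(u, lambda v: log_gamma_k(v, k + 1, K, log_gamma, theta, theta_tilde), rng)
        log_weights[m] = lw          # log AIS weight; E[exp(lw)] = Z(theta_tilde) / Z(theta)
    c = log_weights.max()
    log_mean_w = c + np.log(np.mean(np.exp(log_weights - c)))
    return log_mean_w - log_Z_tilde  # log(1/Z(theta)) = log mean weight - log Z(theta_tilde)
```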

Non-exact approximations... MAVIS is exact only if: exact sampling from $f(\cdot \mid \theta)$ is possible (this also applies to ABC and synthetic likelihood); and $1/Z(\tilde{\theta})$ is known. In practice: use MCMC to simulate from $f(\cdot \mid \theta)$; estimate $1/Z(\tilde{\theta})$ offline in advance of running the IS. In the context of MCMC, one can show that these approximations do not introduce large errors: see the MCWM approach in Andrieu and Roberts (2009) (also Everitt (2012)), and Nial Friel's talk tomorrow ("Monte Carlo methods in network analysis" session).

Toy example: Poisson vs geometric Consider i.i.d. observations $\{y_i\}_{i=1}^n$ of a discrete random variable that takes values in $\mathbb{N}$. We find the Bayes factor for the models: 1. $Y \mid \theta \sim \text{Poisson}(\theta)$, $\theta \sim \text{Exp}(1)$, with $f_1(\{y_i\}_{i=1}^n \mid \lambda) = \prod_i \frac{\lambda^{y_i} \exp(-\lambda)}{y_i!} = \frac{\exp(-n\lambda)\, \lambda^{\sum_i y_i}}{\prod_i y_i!}$; 2. $Y \mid \theta \sim \text{Geometric}(\theta)$, $\theta \sim \text{Unif}(0,1)$, with $f_2(\{y_i\}_{i=1}^n \mid p) = \prod_i p (1-p)^{y_i} = p^n (1-p)^{\sum_i y_i}$.
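
Because both priors are conjugate, this toy example has closed-form evidences, which is what makes it a useful benchmark. The expressions below follow from standard Gamma and Beta integrals (my derivation, not a formula from the slides): $p(y \mid M_1) = \frac{\Gamma(S+1)}{(n+1)^{S+1} \prod_i y_i!}$ and $p(y \mid M_2) = B(n+1, S+1)$, where $S = \sum_i y_i$. A small sketch with hypothetical data:

```python
import numpy as np
from scipy.special import gammaln, betaln

def log_evidence_poisson(y):
    """log p(y | M1): Poisson(lambda) likelihood with an Exp(1) prior on lambda."""
    y = np.asarray(y)
    n, S = y.size, y.sum()
    return gammaln(S + 1) - (S + 1) * np.log(n + 1) - np.sum(gammaln(y + 1))

def log_evidence_geometric(y):
    """log p(y | M2): Geometric(p) likelihood on {0, 1, 2, ...} with a Unif(0,1) prior on p."""
    y = np.asarray(y)
    n, S = y.size, y.sum()
    return betaln(n + 1, S + 1)

y = np.array([2, 0, 1, 3, 1, 0, 2])     # hypothetical data
print(np.exp(log_evidence_poisson(y) - log_evidence_geometric(y)))   # exact Bayes factor
```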

Results: box plots

Results: ABC-IS

Results: SL-IS

Results: MAVIS

Application to social networks Compare the evidence for two alternative exponential random graph models $p(y \mid \theta) \propto \exp\left(\theta^T S(y)\right)$: in model 1, $S(y)$ = number of edges; in model 2, $S(y)$ = (number of edges, number of two-stars), so now θ is 2-dimensional. Use prior $p(\theta) = \mathcal{N}(0, 25 I)$, as in Friel (2013).
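
For concreteness, a small sketch of the two summary statistics for an undirected network given as a 0/1 adjacency matrix, using the usual ERGM definitions (edges = half the degree sum; two-stars = pairs of edges sharing a node); the example matrix is hypothetical:

```python
import numpy as np

def ergm_statistics(A):
    """Edge count and two-star count for an undirected graph with adjacency matrix A."""
    A = np.asarray(A)
    degrees = A.sum(axis=1)
    n_edges = degrees.sum() / 2.0                        # each edge is counted twice in the degree sum
    n_two_stars = np.sum(degrees * (degrees - 1) / 2.0)  # pairs of edges meeting at each node
    return np.array([n_edges, n_two_stars])

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
print(ergm_statistics(A))    # [4. 5.]
```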

Results: social network Friel (2013) finds that the evidence for model 1 is 37.499 times that for model 2. Using 1000 importance points (with 100 simulations from the likelihood for each point)... ABC: ε = 0.1 gives $\widehat{p}(y \mid M_1)/\widehat{p}(y \mid M_2) \approx 4$; ε = 0.05 gives $\widehat{p}(y \mid M_1)/\widehat{p}(y \mid M_2) \approx 20$, but has only 5 points with non-zero weight! Synthetic likelihood obtains $\widehat{p}(y \mid M_1)/\widehat{p}(y \mid M_2) \approx 40$. MAVIS finds $\log[\widehat{p}(y \mid M_1)] = -69.62304$ and $\log[\widehat{p}(y \mid M_2)] = -73.33692$, giving $\widehat{p}(y \mid M_1)/\widehat{p}(y \mid M_2) \approx 41$.

Comparison of methods ABC vs MAVIS: both require the simulation of auxiliary variables, but in ABC/SL the use of summary statistics dramatically reduces the dimension of the space; however, MAVIS only requires the auxiliary variable to look like a good simulation from $f(\cdot \mid \theta)$, not (the different requirement) that it is a good match to y. Plus the standard drawbacks of ABC remain: the choice of tolerance ε; not being able to estimate the evidence, only Bayes factors. SL vs ABC: SL fails when the Gaussian assumption is not appropriate...... but it is surprisingly robust, and there is no need to choose an ε.

Summary We can tackle doubly intractable problems using: ABC; synthetic likelihood; auxiliary variable methods; Russian Roulette. Used in importance sampling, these let us also estimate marginal likelihoods and Bayes factors. For high-dimensional θ, SMC algorithms can be employed; in some cases the Bayes factor can be estimated directly. Thanks to Nial Friel, Melina Evdemon-Hogan and Ellen Rowing.