Computationally Efficient MCMC for Hierarchical Bayesian Inverse Problems


1 Computationally Efficient MCMC for Hierarchical Bayesian Inverse Problems
Andrew Brown 1, Arvind Saibaba 2, Sarah Vallélian 2,3
SAMSI, January 28, 2017
Supported by a SAMSI Visiting Research Fellowship and NSF grant DMS to SAMSI
1 Department of Mathematical Sciences, Clemson University, Clemson, SC 29634, USA
2 Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA
3 Statistical and Applied Mathematical Sciences Institute, Research Triangle Park, NC 27709, USA

2 Inverse Problems (schematic, courtesy of A. Saibaba): Parameters x → Model Ax + noise → Data d.
Fox, Haario, and Christen (2013)

3 Assume the process can be (approximately) represented as a linear operator, e.g., the discretization of an integral equation, the solution to a system of PDEs, or a multiple linear regression model.
In practice, the data are observed with measurement error. With additive noise, the model becomes $d = Ax + e$, where we assume, for instance, that $e \sim N(0, \mu^{-1} I)$.
We are concerned with cases in which this problem is not well behaved, e.g., poorly conditioned or with infinitely many solutions.

4 Regularized Inversion
In general,
$$\hat{x} = \arg\min_x \|Ax - d\|_2^2 + \lambda^2 \|L(x - x_0)\|_2^2 = (A^T A + \lambda^2 Q)^{-1}(A^T d + \lambda^2 Q x_0),$$
where $Q = L^T L$ and $x_0$ is a default solution.
Note:
$$\hat{x} = \arg\max_x \underbrace{k_1 \exp\left(-\tfrac{1}{2}\|Ax - d\|_2^2\right)}_{f(d \mid x)} \, \underbrace{k_2 \exp\left(-\tfrac{\lambda^2}{2}\|L(x - x_0)\|_2^2\right)}_{\pi(x \mid \lambda)}$$
The inverse problem admits a Bayesian interpretation.
Hoerl and Kennard (1970), Tikhonov and Arsenin (1977), Press et al. (2007), Fox et al. (2013)
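
To make the formula concrete, here is a minimal sketch of the regularized solve, assuming small dense matrices; tikhonov_solve and the toy problem below are illustrative and not from the talk.

```python
import numpy as np

def tikhonov_solve(A, d, L, x0, lam):
    """Solve (A^T A + lam^2 L^T L) x = A^T d + lam^2 L^T L x0."""
    Q = L.T @ L
    return np.linalg.solve(A.T @ A + lam**2 * Q, A.T @ d + lam**2 * Q @ x0)

# Toy usage (random forward operator, identity smoothing, zero default solution):
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 40)) / 40.0
x_hat = tikhonov_solve(A, d=A @ np.ones(40), L=np.eye(40), x0=np.zeros(40), lam=0.1)
```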

5 The Bayesian Machinery
Given prior information and a data-generating model, the goal is to update information about the parameters of interest, given the observed data, via Bayes' rule:
Posterior ∝ Likelihood × Prior
MAP estimator = posterior mode: $\arg\max_x \pi(x \mid d, \lambda) = \arg\max_x f(d \mid x)\,\pi(x \mid \lambda)$
The posterior distribution facilitates more complete inferences:
Other point estimators (posterior mean, posterior median, etc.)
In particular, it allows quantification of uncertainty about the estimators.
Berger (1985), Bernardo and Smith (1994), Robert (2007), Carlin and Louis (2009), Gelman et al. (2013), ...

6 Suppose $d \mid x, \mu \sim N(Ax, \mu^{-1} I)$ and $x \mid \sigma \sim N(0, \sigma^{-1} \Gamma)$.
With $\mu$ and $\sigma$ fixed, the posterior is $x \mid d, \sigma, \mu \sim N(m, \Sigma)$, where
$$\Sigma = (\mu A^T A + \sigma \Gamma^{-1})^{-1}, \qquad m = \Sigma \mu A^T d.$$
MAP = posterior mode = posterior mean:
$$\hat{x} = (\mu A^T A + \sigma \Gamma^{-1})^{-1} \mu A^T d = (A^T A + \lambda L^T L)^{-1} A^T d,$$
where $\lambda = \sigma/\mu$ and $\Gamma^{-1} = L^T L$.
Lindley and Smith (1972)
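
A minimal sketch of this conjugate posterior, assuming matrices small enough to form and factor the covariance directly (later slides address the case where this is too expensive); the function name and arguments are illustrative placeholders.

```python
import numpy as np

def gaussian_posterior(A, d, Gamma_inv, mu, sigma, rng):
    """Return posterior mean m, covariance Sigma, and one draw x ~ N(m, Sigma)."""
    Sigma = np.linalg.inv(mu * A.T @ A + sigma * Gamma_inv)
    m = Sigma @ (mu * A.T @ d)
    C = np.linalg.cholesky(Sigma)            # Sigma = C C^T
    x = m + C @ rng.standard_normal(len(m))
    return m, Sigma, x
```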

7 Regularization Parameter?
Often, λ is fixed a priori: cross validation, generalized cross validation, L-curves, etc.
Alternatively, assign µ and σ (or λ = σ/µ) a prior distribution:
The data have more freedom to determine plausible values
Uncertainty associated with the parameters is accounted for in a posteriori inferences
Challenge: the posterior distribution may not be available in closed form; Monte Carlo sampling, approximation methods, etc.
O'Sullivan (1986), Hansen (1992), Hansen and O'Leary (1993)

8 Gibbs Sampling
Suppose we have a k-dimensional parameter space, $\theta = (\theta_1, \ldots, \theta_k)^T$, from which we wish to sample the posterior distribution, $h(\theta) := \pi(\theta \mid y) \propto f(y \mid \theta)\pi(\theta)$.
Assuming the availability of the full conditional distributions, $\{p(\theta_i \mid \theta_{(-i)}, y) : i = 1, \ldots, k\}$, the Gibbs sampling algorithm is as follows:
1. Draw $\theta_1^{(t)} \sim p(\theta_1 \mid \theta_2^{(t-1)}, \theta_3^{(t-1)}, \ldots, \theta_k^{(t-1)})$
2. Draw $\theta_2^{(t)} \sim p(\theta_2 \mid \theta_1^{(t)}, \theta_3^{(t-1)}, \ldots, \theta_k^{(t-1)})$
...
k. Draw $\theta_k^{(t)} \sim p(\theta_k \mid \theta_1^{(t)}, \ldots, \theta_{k-1}^{(t)})$
Repeat for $t = 1, \ldots, T$. Under mild conditions, $\theta^{(t)} = (\theta_1^{(t)}, \ldots, \theta_k^{(t)})^T$ converges to a draw from $\pi(\theta_1, \ldots, \theta_k \mid y)$.
Geman and Geman (1984), Gelfand and Smith (1990), Carlin and Louis (2009)
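
A generic sketch of the sweep just described, assuming the user supplies one full-conditional sampler per component; the interface is illustrative, not from the talk.

```python
import numpy as np

def gibbs_sampler(full_conditionals, theta0, T, rng):
    """full_conditionals[i](theta, rng) draws theta[i] given all other components."""
    theta = np.array(theta0, dtype=float)
    samples = np.empty((T, len(theta)))
    for t in range(T):
        for i in range(len(theta)):
            theta[i] = full_conditionals[i](theta, rng)   # uses the freshest values
        samples[t] = theta
    return samples
```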

9 Metropolis(-Hastings) Sampling
Algorithm: $x^{(l)}$ = current state, $x^*$ = proposed state,
$h(\cdot)$ = (unnormalized) density of the target distribution,
$q(\cdot \mid x^{(l)})$ = (unnormalized) density of the proposal distribution.
Set $x^{(l+1)} = x^*$ with probability $\alpha(x^{(l)}, x^*)$, else $x^{(l+1)} = x^{(l)}$, where
$$\alpha(x^{(l)}, x^*) = \min\left\{1, \frac{h(x^*)\, q(x^{(l)} \mid x^*)}{h(x^{(l)})\, q(x^* \mid x^{(l)})}\right\}$$
Random walk: want approximately 44% (23%) acceptance for univariate (multivariate) parameters.
Proposal distribution closely approximating the target distribution: want the acceptance rate as high as possible.
Gibbs sampling can be shown to be a special case of the Metropolis-Hastings algorithm with acceptance probability equal to 1.
Metropolis et al. (1953), Hastings (1970), Roberts et al. (1997), Rosenthal (2011)
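
A sketch of one Metropolis-Hastings update as stated above, worked in log space for numerical stability; log_target, propose, and log_proposal are user-supplied placeholders, not names from the talk.

```python
import numpy as np

def mh_step(x, log_target, propose, log_proposal, rng):
    """One MH update; log_proposal(a, b) evaluates log q(a | b)."""
    x_star = propose(x, rng)
    log_alpha = (log_target(x_star) + log_proposal(x, x_star)
                 - log_target(x) - log_proposal(x_star, x))
    if np.log(rng.uniform()) < min(0.0, log_alpha):
        return x_star      # accept the proposal
    return x               # reject: chain stays at the current state
```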

10 Independence Sampling
The independence MH sampler proposes states from a density that is independent of the current state of the chain: $q(\theta^* \mid \theta^{(t-1)}) \equiv q(\theta^*) = g(\theta^*)$.
Acceptance ratio:
$$\frac{h(\theta^*)\, g(\theta^{(t-1)})}{h(\theta^{(t-1)})\, g(\theta^*)} = \frac{w(\theta^*)}{w(\theta^{(t-1)})},$$
where $w(\theta) \equiv h(\theta)/g(\theta)$.
Similar to the accept-reject algorithm:
1. Draw a candidate value $X \sim g$, with $h(x) \leq M g(x)$ for some $M \geq 1$ and all $x$
2. Accept the draw with probability $h(X)/(M g(X))$

11 Metropolis-within-Gibbs
Often, one or more of the conditional distributions needed for Gibbs sampling is not available.
Solution: substitute a Metropolis(-Hastings) step to approximate a draw from the conditional distribution.
The MH sub-chain is not irreducible with respect to the full parameter space, but it is a valid Markov chain with stationary distribution equal to the full conditional distribution (remaining parameters held fixed).
Convergence can be improved by running several sub-iterations at the MH step; it is more common to use a single MH draw, which still works.
Müller (1991), Neal (1998), Robert and Casella (2004), Gamerman and Lopes (2006), Carlin and Louis (2009)

12 After a sufficiently large number of iterations (the burn-in period), the simulated draws may be treated as realizations from the posterior of interest.
Common practice is to run multiple chains (3 or 5) starting from different values and compare the draws from each chain (after burn-in) to assess convergence.
Chains can be run in parallel on a computer to save processor time.
Examine trace plots of chain output; one can also use quantitative measures such as the (scalar and multivariate) potential scale reduction factor (PSRF).
The parameters can be partitioned into subvectors $\theta = (\theta_1^T, \theta_2^T, \ldots, \theta_P^T)^T$ and updated in blocks, $p(\theta_1 \mid \theta_2, \ldots, \theta_P, y)$, etc. This can speed up convergence when subsets of the parameters are highly correlated in the posterior.
Gelman and Rubin (1992), Brooks and Gelman (1998), Carlin and Louis (2009)
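
A sketch of the scalar PSRF mentioned above, using the standard between/within-chain variance construction; chains is assumed to be an (m, n) array of m post-burn-in chains of length n for a single parameter.

```python
import numpy as np

def psrf(chains):
    """Scalar potential scale reduction factor for one parameter."""
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    W = chains.var(axis=1, ddof=1).mean()     # within-chain variance
    B = n * chain_means.var(ddof=1)           # between-chain variance
    var_hat = (n - 1) / n * W + B / n         # pooled variance estimate
    return np.sqrt(var_hat / W)               # values near 1 suggest convergence
```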

13 Hierarchical Model
$d \mid x, \mu \sim N_m(Ax, \mu^{-1} I)$
$x \mid \sigma \sim N_n(0, \sigma^{-1} \Gamma)$
$\mu \sim Ga(a_\mu, b_\mu)$
$\sigma \sim Ga(a_\sigma, b_\sigma)$
Full conditional distributions for Gibbs sampling:
$$x \mid \sigma, \mu, d \sim N_n\left((\mu A^T A + \sigma \Gamma^{-1})^{-1} \mu A^T d,\ (\mu A^T A + \sigma \Gamma^{-1})^{-1}\right)$$
$$\mu \mid x, \sigma, d \sim Ga\left(\tfrac{m}{2} + a_\mu,\ \tfrac{1}{2}\|Ax - d\|_2^2 + b_\mu\right)$$
$$\sigma \mid x, \mu \sim Ga\left(\tfrac{n}{2} + a_\sigma,\ \tfrac{1}{2}\|Lx\|_2^2 + b_\sigma\right)$$
where $L^T L = \Gamma^{-1}$.
Geman and Geman (1984), Gelfand and Smith (1990), Gelman and Rubin (1992), Brooks and Gelman (1998), Carlin and Louis (2009)
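
A sketch of one sweep of the block Gibbs sampler defined by these full conditionals, assuming dense matrices that can be factored directly; note that numpy's Gamma sampler takes shape and scale, so the rate parameters above are inverted.

```python
import numpy as np

def block_gibbs_sweep(x, mu, sigma, A, d, L, a_mu, b_mu, a_sig, b_sig, rng):
    """One sweep of the block Gibbs sampler for (x, mu, sigma)."""
    m, n = A.shape
    Gamma_inv = L.T @ L

    # x | mu, sigma, d ~ N(Sigma * mu * A^T d, Sigma) with Sigma^{-1} below.
    Sigma_inv = mu * A.T @ A + sigma * Gamma_inv
    C = np.linalg.cholesky(Sigma_inv)                       # Sigma_inv = C C^T
    mean = np.linalg.solve(Sigma_inv, mu * A.T @ d)
    x = mean + np.linalg.solve(C.T, rng.standard_normal(n)) # covariance Sigma_inv^{-1}

    # mu | x, d ~ Gamma(m/2 + a_mu, rate = 0.5*||Ax - d||^2 + b_mu)
    mu = rng.gamma(m / 2 + a_mu, 1.0 / (0.5 * np.sum((A @ x - d) ** 2) + b_mu))

    # sigma | x ~ Gamma(n/2 + a_sig, rate = 0.5*||Lx||^2 + b_sig)
    sigma = rng.gamma(n / 2 + a_sig, 1.0 / (0.5 * np.sum((L @ x) ** 2) + b_sig))
    return x, mu, sigma
```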

14 Example: 1D Image Restoration
Consider a Fredholm integral equation of the first kind:
$$\int_a^b K(s, t) f(t)\, dt \approx I_n(s) = \sum_{j=1}^n w_j K(s, t_j) f(t_j)$$
Quadrature yields $Ax + \text{noise} = d$, where $d = (I_n(s_1), \ldots, I_n(s_n))^T + \text{noise}$ and the desired solution is $x = (f(t_1), \ldots, f(t_n))^T$.
One-dimensional image restoration: $K$ = point spread function for an infinitely long slit, limits of integration $= \pm\pi/2$.
Shaw (1972), Hansen (2007)
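
A sketch of the midpoint-rule discretization described above; the specific slit kernel is taken from Shaw (1972) as implemented in Hansen's Regularization Tools and is an assumption here, since the slide does not write it out.

```python
import numpy as np

def shaw_forward_matrix(n):
    """Midpoint-rule discretization of the slit PSF kernel on [-pi/2, pi/2].

    Assumed kernel (Shaw, 1972): K(s, t) = (cos s + cos t)^2 (sin u / u)^2,
    with u = pi (sin s + sin t); quadrature weights w_j = pi/n.
    """
    h = np.pi / n
    pts = -np.pi / 2 + (np.arange(n) + 0.5) * h          # midpoints for s_i and t_j
    S, T = np.meshgrid(pts, pts, indexing="ij")
    U = np.pi * (np.sin(S) + np.sin(T))
    K = (np.cos(S) + np.cos(T)) ** 2 * np.sinc(U / np.pi) ** 2   # sinc handles u -> 0
    return h * K                                          # A[i, j] = w_j K(s_i, t_j)
```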

15 Prior on $x$ taken to be a Gaussian process with power exponential covariance function:
$$x \sim GP(0, \sigma^{-1} R(\cdot, \cdot)), \qquad R(x_i, x_j) = \exp\left\{-\frac{2|x_i - x_j|}{\pi}\right\}$$
Regularization determined by $\lambda = \sigma/\mu$, so we take $a_\mu, b_\mu$ to concentrate $\mu$ about 1 and assign $\sigma$ a vague prior.
Three chains run in parallel, 2,000 iterations each.
Convergence checked via trace plots and PSRF.
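
A sketch of building the prior correlation matrix from this covariance function, assuming the quadrature grid points are used as the locations (the slide writes the arguments generically as x_i, x_j).

```python
import numpy as np

def power_exp_correlation(pts):
    """Correlation matrix with R[i, j] = exp(-2 |t_i - t_j| / pi); pts is a 1-D grid."""
    dists = np.abs(pts[:, None] - pts[None, :])
    return np.exp(-2.0 * dists / np.pi)

# The prior precision entering the Gibbs sampler is Gamma^{-1} = R^{-1} (scaled by sigma),
# with L obtained from, e.g., a Cholesky factorization of R^{-1}.
```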

16 Example: 1D Image Restoration
Figure: Observed data (left panel) and true solution (right panel) for the one-dimensional image restoration example. The solid black line in the left panel represents the noise-free observations Ax.
Shaw (1972), Hansen (2007)

17 Estimates compared: MAP with posterior-mean $\hat\lambda$, MAP with LOO-CV-estimated $\lambda$, and the posterior mean.
Table: Relative error (RE) and root mean square error (RMSE) of the MAP and posterior mean estimates for the one-dimensional image restoration example.

18 Figure: Estimated marginal posterior densities for µ (left panel) and σ (right panel) for the one-dimensional image restoration example.

19 Approximate Sampling from Conditional Distributions
Sampling from the full conditional of $x$ requires $(\mu A^T A + \sigma \Gamma^{-1})^{-1/2} z$, $z \sim N(0, I)$.
When $x$ is high-dimensional, this is computationally expensive.
For difficult conditional distributions, it is common to use Metropolis-Hastings with simpler proposal distributions.
Proposed alternative: find a computationally cheap approximation, and correct for the approximation using M-H.
Metropolis et al. (1953), Hastings (1970), Tierney (1994), Rosenthal (2011), Gelman et al. (2013)

20 Note:
$$\Sigma = (\mu A^T A + \sigma L^T L)^{-1} = L^{-1}(\mu L^{-T} A^T A L^{-1} + \sigma I)^{-1} L^{-T}$$
When $A$ is poorly conditioned, we expect the spectrum of $L^{-T} A^T A L^{-1}$ to decay quickly, i.e.,
$$L^{-T} A^T A L^{-1} = V \Lambda V^T \approx V_k \Lambda_k V_k^T$$
$L^{-T} A^T A L^{-1}$ does not need to be explicitly computed, so the $k$ largest eigenvalues can be found relatively quickly.
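
A sketch of obtaining the k largest eigenpairs of the prior-preconditioned Hessian from matrix-vector products only, here via SciPy's Lanczos-based eigsh on a LinearOperator; solve_L and solve_LT stand in for whatever solves with L and L^T are available and are assumptions, not the talk's implementation.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def preconditioned_hessian_eigs(A, solve_L, solve_LT, k):
    """Top-k eigenpairs of H = L^{-T} A^T A L^{-1} without forming H explicitly.

    solve_L(v) returns L^{-1} v; solve_LT(v) returns L^{-T} v.
    """
    n = A.shape[1]

    def matvec(v):
        return solve_LT(A.T @ (A @ solve_L(v)))

    H = LinearOperator((n, n), matvec=matvec, dtype=float)
    vals, vecs = eigsh(H, k=k, which="LM")   # largest-magnitude eigenvalues
    order = np.argsort(vals)[::-1]           # sort descending
    return vals[order], vecs[:, order]
```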

21 Some algebra and the Woodbury formula yield
$$L^{-1}(\mu L^{-T} A^T A L^{-1} + \sigma I)^{-1} L^{-T} \approx \sigma^{-1} L^{-1}\left(I + \frac{\mu}{\sigma} V_k \Lambda_k V_k^T\right)^{-1} L^{-T} = \sigma^{-1} L^{-1}(I - V_k D V_k^T) L^{-T} =: \widehat{\Sigma},$$
where $D = \mathrm{diag}\left(\dfrac{\mu \lambda_j}{\mu \lambda_j + \sigma}\right)$.
Similarly, we can factor $\widehat{\Sigma} = G G^T$ with $G = \sigma^{-1/2} L^{-1}(I - V_k \widehat{D} V_k^T)$, where $\widehat{D} = \mathrm{diag}\left(1 \pm \sqrt{1 - D_{jj}}\right)$.

22 Suggests a Gaussian proposal distribution with an easy-to-compute covariance matrix and factorization:
$$x \mid \mu, \sigma, d \sim N\left(\widehat{\Sigma}(\mu A^T d),\ \widehat{\Sigma}\right)$$
Idea: inside the block Gibbs sampler, use the cheap proposal as an approximation to the target (full conditional) distribution of $x$ in an MH independence sampler.
Fast to sample from this distribution.
Fast to evaluate the likelihood function associated with this distribution.
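
A sketch of sampling from this low-rank proposal given the retained eigenpairs (lam, V); it applies the approximate covariance and its factor G only through solves with L, following the formulas on the previous slide (here the root 1 - sqrt(1 - D_jj) is used in G). The helper names are illustrative.

```python
import numpy as np

def lowrank_proposal_sample(A, d, solve_L, solve_LT, lam, V, mu, sigma, rng):
    """Draw from the approximate conditional N(Sigma_hat * mu * A^T d, Sigma_hat).

    Sigma_hat = sigma^{-1} L^{-1} (I - V D V^T) L^{-T},  D_jj = mu*lam_j / (mu*lam_j + sigma),
    factor  G = sigma^{-1/2} L^{-1} (I - V Dhat V^T),    Dhat_jj = 1 - sqrt(1 - D_jj).
    """
    D = mu * lam / (mu * lam + sigma)
    Dhat = 1.0 - np.sqrt(1.0 - D)

    def sigma_hat_mult(v):
        u = solve_LT(v)
        return solve_L(u - V @ (D * (V.T @ u))) / sigma

    def g_mult(z):
        return solve_L(z - V @ (Dhat * (V.T @ z))) / np.sqrt(sigma)

    mean = sigma_hat_mult(mu * A.T @ d)
    return mean + g_mult(rng.standard_normal(len(mean)))
```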

23 Proposition: The MH acceptance ratio can be computed as $\eta(z, x) = w(z)/w(x)$, where
$$w(x) = \exp\left(-\tfrac{1}{2} x^T (\Sigma^{-1} - \widehat{\Sigma}^{-1}) x\right).$$
Proposition: The MH acceptance ratio can be simplified to
$$\eta(z, x) = \exp\left\{-\frac{\mu}{2} \sum_{j=k+1}^n \lambda_j \left[(v_j^T L z)^2 - (v_j^T L x)^2\right]\right\}.$$
Note: if $z \pm x \perp_L v_j$ for $j > k$, then $\eta(z, x) = 1$.
B et al. (2017+)
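
A sketch of evaluating the simplified acceptance ratio on the log scale; it assumes the trailing eigenpairs (lam_tail, V_tail) for j > k are available, which a real implementation would avoid computing in full (the point of the method is that their contribution is negligible when the spectrum decays fast).

```python
import numpy as np

def log_accept_ratio(z, x, L, lam_tail, V_tail, mu):
    """log eta(z, x) = -(mu/2) * sum_{j>k} lam_j [ (v_j^T L z)^2 - (v_j^T L x)^2 ]."""
    pz = V_tail.T @ (L @ z)      # coefficients v_j^T L z, j = k+1, ..., n
    px = V_tail.T @ (L @ x)
    return -0.5 * mu * np.sum(lam_tail * (pz**2 - px**2))

# Inside the independence MH step: accept z with probability min(1, exp(log_eta)).
```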

24 Proposition: Given the current state of the chain $x$, the expected value and the variance of the acceptance ratio are given by
$$\mu_\eta := E_{z \mid x}[\eta(z, x)] = \frac{1}{N_1 w(x)}, \qquad \sigma_\eta^2 := V_{z \mid x}[\eta(z, x)] = \frac{1}{w^2(x)}\left(\frac{1}{N_2} - \frac{1}{N_1^2}\right),$$
where
$$N_l := \exp\left\{\frac{\mu^2}{2\sigma} \sum_{j=k+1}^n \frac{l\mu\lambda_j}{l\mu\lambda_j + \sigma}\, (d^T A L^{-1} v_j)^2\right\} \prod_{j>k}\left(1 + \frac{l\mu}{\sigma}\lambda_j\right)^{1/2}.$$
B et al. (2017+)

25 Implication: We can bound in probability the deviation of the acceptance ratio from its mean, given the current state $x$. E.g., by Chebyshev's inequality, for any state $x$,
$$\Pr_{z \mid x}\left(\eta(z, x) \in [\mu_\eta \pm 4.47\,\sigma_\eta]\right) \geq 0.95.$$

26 Example: 1-D Image Restoration (again)
Figure: (Left) Spectrum of the prior-preconditioned Hessian matrix for the one-dimensional image restoration example. (Right) Rejection rates computed with different eigenvalue truncations $k$, $\log_{10}$ scale: predicted rate $1 - \min(1, E[\eta])$ versus empirical rate $1 - E[\min(1, \eta)]$. The approximate 95% pointwise error upper bound is given by the dashed line. Note that the lower bound for the pointwise error is negative and therefore not plotted.

27 Proposition: The target density $h(x)$ and the proposal density $q(x)$ can be bounded as $h(x) \leq N_1 q(x)$ for all $x$, where $N_1 \geq 1$ is given by
$$N_1 := \exp\left\{\frac{\mu^2}{2\sigma} \sum_{j=k+1}^n \frac{\mu\lambda_j}{\mu\lambda_j + \sigma}\, (d^T A L^{-1} v_j)^2\right\} \prod_{j>k}\left(1 + \frac{\mu}{\sigma}\lambda_j\right)^{1/2}.$$
B et al. (2017+)

28 Implication: The subchain produced by our proposed sampler has stationary distribution $\pi(\cdot \mid d, \mu, \sigma)$ and is uniformly ergodic:
$$\|K^p(x, \cdot) - \pi(\cdot \mid d, \mu, \sigma)\|_{TV} \leq 2\left(1 - N_1^{-1}\right)^p \quad \forall x \in \operatorname{supp} \pi,$$
where $K^p(x, \cdot)$ is the $p$-step transition kernel starting from $x$ and $\|\cdot\|_{TV}$ denotes the total variation norm.
This suggests also that the approximating distribution $q(\cdot)$ can be used in an accept-reject algorithm instead of the independence MH sampler.
For a given candidate density, an independence MH algorithm is superior to an accept-reject algorithm in terms of sampling efficiency.
Robert and Casella (2004)

29 Example: EEG Source Localization
Distribution of source locations (dipoles) inside the brain generated by neuronal activity.
Current sources create electric potentials that are measured by electrodes on the scalp.
Model: $d = Ax + \text{noise}$
$d \in \mathbb{R}^m$ represents the electrode measurements at different locations along the scalp
$x \in \mathbb{R}^n$ represents the current sources on a discretized grid in the brain
$A \in \mathbb{R}^{m \times n}$, $m < n$, is the leadfield matrix determined by the conductivity and geometry of the head
The bioelectromagnetic inverse problem has no unique solution.
Hauk (2004)

30 Simulate randomly oriented dipoles located on an intracerebral source grid.
A realistic head model is used to account for the physics and create the leadfield matrix using the boundary element method.
Noise is added to the observed scalp potentials.
Software: German Gomez-Herrero EEG Tutorial: io/tutorials/dipoles/tutorial_dipoles.htm; Fieldtrip MATLAB Toolbox for EEG.
Oostendorp and Oosterom (1989), Hauk (2004)


32 dim(d) = 257, dim(x) = 1261; retain 257 eigenvalues.
Hyperparameters on µ set to concentrate about 1.
Hyperparameters on σ taken to be vague.
Prior on $x$ taken to be $N(0, \sigma^{-1} I)$ (the usual ridge penalty).
For both samplers, three chains are run in parallel for 2,000 iterations each.
Discard the first 500 iterations as the burn-in period.
Regularization term for MAP determined from the MCMC output.

33 λ = σ/µ
Parameter  PSRF
µ          1.00
σ
x          1.577

34 Figure: Solutions to the EEG inverse problem using block Gibbs (left panel), Hastings-within-Gibbs (middle panel), and MAP (right panel).

35 Efficiency
Algorithm      Wall Time (s)    CES
Block Gibbs
MHwG
CES = cost per effective sample = (integrated autocorrelation time) × (total computation time) / (length of chain)
Note: effective sample size = (chain length) / (autocorrelation time)
Kass et al. (1998), Carlin and Louis (2009), Fox and Norton (2016)
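
A sketch of the two quantities defined above, with the integrated autocorrelation time estimated from empirical autocorrelations of a scalar summary of the chain; the simple truncation rule is an illustrative choice, not the one used in the talk.

```python
import numpy as np

def integrated_autocorr_time(chain, max_lag=None):
    """IACT = 1 + 2 * sum of autocorrelations, truncated at the first negative value."""
    x = np.asarray(chain, dtype=float) - np.mean(chain)
    n = len(x)
    max_lag = max_lag or n // 3
    acf = np.array([np.dot(x[:n - k], x[k:]) / np.dot(x, x) for k in range(1, max_lag)])
    first_neg = np.argmax(acf < 0) if np.any(acf < 0) else len(acf)
    return 1.0 + 2.0 * np.sum(acf[:first_neg])

def cost_per_effective_sample(chain, wall_time_s):
    """CES = IACT * total computation time / chain length = wall time / ESS."""
    tau = integrated_autocorr_time(chain)
    ess = len(chain) / tau
    return wall_time_s / ess
```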

36 Example: Computed Tomography
An X-ray is passed through a body from a source (s = 0) to a sensor (s = S) along a line $L(x, \omega)$ determined by angle and distance with respect to a fixed origin.
Intensity $I$ is attenuated due to absorption by the tissue through which the X-ray passes:
$$I(S) = I(0) \exp\left\{-\int_0^S \eta(x(s))\, ds\right\}$$
$z = -\log(I(S)/I(0))$ yields the Radon transform model for CT:
$$z(x, \omega) = \int_{L(x, \omega)} \eta(x(s))\, ds$$
Reconstructing $\eta(\cdot)$ creates an image of the body that was scanned.
Discretized version: $z = Ax + \text{noise}$.
Kaipio and Somersalo (2005), Bardsley (2011)

37 Simulated Computed Tomography Data
Use the Shepp-Logan phantom as the true image.
The target image is discretized so that dim(x) = .
Simulate observed data (Radon transform model) over discretized lines and angles so that dim(z) = 5000.
Data-generating model: $z = Ax + e$, where $e \sim N(0, \mu^{-1} I)$, $\mu^{-1/2} = 0.01\,\|Ax\|$.
Kaipio and Somersalo (2005), Rue and Held (2005), Bardsley (2011)


39 Only considering MH-within-Gibbs with 5000 < dim(x) eigenvalues retained (block Gibbs is not feasible).
Hyperparameters on µ set to concentrate about 1.
Hyperparameters on σ taken to be vague.
Prior on $x$ taken to be $N(0, (L^T L)^{-1})$, where $L$ = discrete Laplacian (GMRF), to enforce smoothness a priori.
One chain run for 2,000 iterations.
Discard the first 1000 iterations as the burn-in period.
Regularization term for MAP determined from the MCMC output.
Rue and Held (2005)

40 Figure: MAP estimate (left panel) and posterior mean (right panel).

41 Estimate         Relative Error    RMSE
MAP
Posterior Mean
Wall time for the MHwG sampler ≈ 45 min.
The posterior provides access to almost any estimator we want and quantifies the associated uncertainty.

42 Features of the Approach
The eigenvalue decomposition does not require explicitly computing $L^{-T} A^T A L^{-1}$, but only matrix-vector products.
Finding the eigenvalues is a precomputation before iterating. Once found, the proposal is cheap for any given µ and σ.
For ill-posed problems, the acceptance probability is close to (or exactly equal to) one.
We are still modeling the full dimension of x, not projecting onto a lower-dimensional subspace.
Exploiting the nature of the forward model.
This approach allows incorporation of strong or vague prior information about the solution through specification of the prior covariance (precision) matrix:
Prior information determined from fMRI can help to solve the EEG problem
Prior smoothness assumptions through a GP prior or Laplacian
Dale et al. (2000), Banerjee et al. (2008), Higdon et al. (2008), Banerjee et al. (2012)

43 Q: Which estimator to use?
A: It depends on what you want from your estimate. From a Bayesian (decision-theoretic) perspective, the posterior mode (MAP) is not always the optimal estimator:
Squared error loss (i.e., estimation/prediction) → posterior mean
Absolute error loss → posterior median
0-1 loss (i.e., thresholding) → posterior mode
etc.
The mean will not zero out elements of x in regularized solutions as much as the mode (but we can still determine thresholding rules via, e.g., k-means clustering).
"Estimators of sparse objects need not be sparse themselves in order to yield excellent performance." - Carvalho, Polson, and Scott (2010)
Bhattacharya et al. (2015)

44 Variance Component Priors
We use Gamma priors on µ and σ for convenience.
It has been noted that a Gamma prior for σ can be inappropriate for situations in which x is thought to be sparse: it can artificially force σ away from zero.
Alternatives include:
Jeffreys prior: $\pi(\mu, \sigma) \propto \mu^{-1}(\mu + \sigma)^{-1}$
Proper Jeffreys prior: $\pi(\mu, \sigma) \propto (\mu + \sigma)^{-2}$
Half-Cauchy: $\pi(\sigma^{1/2} \mid \mu^{1/2}) \propto \left(1 + \dfrac{\sigma}{\mu}\right)^{-1}$
Tiao and Tan (1965), Gelman (2006), Scott and Berger (2006), Polson and Scott (2010)

45 Global-Local Scale Mixtures
$d \sim N(Ax, \mu^{-1} I)$, $x_j \sim N(0, \eta^2 \tau_j^2)$, $j = 1, \ldots, n$.
The choice of prior on $\tau_j$ produces different sparsity properties. Examples:
Laplace prior (LASSO): $\tau_j^2 \sim \mathrm{Exp}(\lambda/2)$
t prior: $\tau_j^2 \sim \mathrm{InvGamma}(\lambda/2, \lambda/2)$
Horseshoe prior: $\tau_j \sim \mathrm{HalfCauchy}(0, \lambda)$
Generalized Double Pareto: $\tau_j \sim \mathrm{Exp}(\lambda^2/2)$, $\lambda \sim \mathrm{Gamma}(\alpha, \beta)$
Spike-slab prior: $\tau_j = \gamma_j \lambda$, $\gamma_j \sim \mathrm{Bernoulli}(\pi)$
Etc.
How to extend the low-rank approximation approach to the global-local framework?
George and McCulloch (1993), Tibshirani (1996), Park and Casella (2008), Polson and Scott (2010), Carvalho et al. (2010), Armagan et al. (2013)

46 Brown, D. A., Saibaba, A. K., and Vallélian, S. (2017+), "Computationally efficient Markov chain Monte Carlo methods for hierarchical Bayesian inverse problems," submitted, arXiv.
Thank you!
