Recent Advances in Bayesian Inference for Inverse Problems

Felix Lucka, University College London, UK (f.lucka@ucl.ac.uk)
Applied Inverse Problems, Helsinki, May 25, 2015

Bayesian Inference for Inverse Problems

Noisy, ill-posed inverse problems: $f = \mathcal{N}(A(u), \varepsilon)$, where $A$ is the forward operator and $\varepsilon$ the noise. Example: $f = Au + \varepsilon$ with

$p_{like}(f \mid u) \propto \exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}}\right)$,
$p_{prior}(u) \propto \exp\!\left(-\lambda\,\|D^T u\|_2^2\right)$,
$p_{post}(u \mid f) \propto \exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}} - \lambda\,\|D^T u\|_2^2\right)$.

The probabilistic representation of the solution allows for a rigorous quantification of its uncertainties.
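To fix notation, here is a minimal numerical sketch of these densities in Python/NumPy, using a toy forward matrix A, a first-difference matrix standing in for $D^T$, and arbitrary noise level and $\lambda$; none of these values come from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear problem f = A u + eps; A, L, sigma and lam are illustrative choices.
n = 50
A = rng.normal(size=(30, n))                  # assumed toy forward operator
L = np.diff(np.eye(n), axis=0)                # first-difference matrix, stands in for D^T
u_true = np.zeros(n); u_true[20:35] = 1.0     # piecewise-constant test signal
sigma = 0.05                                  # noise std, i.e. Sigma_eps = sigma^2 * I
f = A @ u_true + sigma * rng.normal(size=30)
lam = 10.0

def neg_log_likelihood(u):
    r = f - A @ u
    return 0.5 * (r @ r) / sigma**2           # 1/2 ||f - A u||^2_{Sigma_eps^{-1}}

def neg_log_prior(u):
    return lam * np.sum((L @ u) ** 2)         # lambda ||D^T u||_2^2 (Gaussian smoothness prior)

def neg_log_posterior(u):
    return neg_log_likelihood(u) + neg_log_prior(u)

# For this Gaussian prior the posterior is Gaussian, so its mode (= mean) solves a linear system:
u_map = np.linalg.solve(A.T @ A / sigma**2 + 2 * lam * L.T @ L, A.T @ f / sigma**2)
print("negative log posterior at the MAP estimate:", neg_log_posterior(u_map))
```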

Increased Interest in Bayesian Inversion

"Inverse problems in the Bayesian framework", edited by Daniela Calvetti, Jari P. Kaipio and Erkki Somersalo. Special issue of Inverse Problems, November 2014.
"UQ and a Model Inverse Problem", Marco Iglesias and Andrew M. Stuart. SIAM News, July/August 2014.

The Bayesian approach is advantageous for problems with high uncertainties:
- Strongly non-linear problems.
- Severely ill-posed problems.
- Little analytical structure.
- Additional model uncertainties.

Recent Trends in Bayesian Inversion (...that I'm aware of)

- Uncertainty quantification of inverse solutions.
- Dynamic Bayesian inversion for prediction or control of dynamical systems.
- Infinite-dimensional Bayesian inversion (M26: Theoretical perspectives in Bayesian inverse problems).
- Incorporating model uncertainties.
- New ways of encoding a-priori information (M29: Priors and SPDEs).
- Large-scale posterior sampling techniques (M23: Sampling methods for high dimensional Bayesian inverse problems).

Sparsity / Compressible Representation

[Figure panels: (a) 100%, (b) 10%, (c) 1%.]

Sparsity a-priori constraints are used in variational regularization, compressed sensing and sparse regression:

$\hat{u}_\lambda = \operatorname*{argmin}_u \left\{ \tfrac{1}{2}\,\|f - Au\|_2^2 + \lambda\,\|D^T u\|_1 \right\}$

(e.g. total variation, wavelet shrinkage, LASSO, ...)

How about sparsity as a-priori information in the Bayesian approach?
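As a concrete illustration of this variational problem, the sketch below applies ISTA (iterative soft-thresholding) to the special case $D = I$, so that the penalty acts directly on u; step size rule, iteration count and the toy data are assumptions for the example, not values from the talk.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t*||.||_1 (componentwise soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(A, f, lam, n_iter=500):
    """Minimize 1/2 ||f - A u||_2^2 + lam ||u||_1 (the case D = I)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1/L, L = Lipschitz constant of the gradient
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ u - f)               # gradient of the data-fit term
        u = soft_threshold(u - step * grad, step * lam)
    return u

# Toy usage with a random forward matrix and a sparse ground truth.
rng = np.random.default_rng(1)
A = rng.normal(size=(40, 100))
u_true = np.zeros(100); u_true[[5, 37, 80]] = [1.0, -2.0, 0.5]
f = A @ u_true + 0.01 * rng.normal(size=40)
u_hat = ista(A, f, lam=0.1)
print("nonzeros in estimate:", np.sum(np.abs(u_hat) > 1e-3))
```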

PhD Thesis: Bayesian Inversion in Biomedical Imaging

Submitted 2014, supervised by Martin Burger and Carsten H. Wolters.
- Linear inverse problems in biomedical imaging applications.
- Simulated data scenarios and experimental CT and EEG/MEG data.
- Sparsity by means of l_p-norm based priors; hierarchical prior modeling.
- Focus on computation and application.

Here: results for l_p priors and CT.

The l_p Approach to Sparse Bayesian Inversion

$p_{prior}(u) \propto \exp\!\left(-\lambda\,\|D^T u\|_p^p\right)$,  $p_{post}(u \mid f) \propto \exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}} - \lambda\,\|D^T u\|_p^p\right)$

Decrease p from 2 to 0 and stop at p = 1 for convenience:

p = 2: $\exp\!\left(-\lambda\,\|D^T u\|_2^2\right)$,  $\exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}} - \lambda\,\|D^T u\|_2^2\right)$
p = 1: $\exp\!\left(-\lambda\,\|D^T u\|_1\right)$,  $\exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}} - \lambda\,\|D^T u\|_1\right)$

Bayesian Inference with l_1 Priors

$p_{post}(u \mid f) \propto \exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}} - \lambda\,\|D^T u\|_1\right)$

Aims: Bayesian inversion in high dimensions (n → ∞). Priors: simple l_1, total variation (TV), Besov space priors.

Starting points:
- M. Lassas, S. Siltanen, 2004. Can one use total variation prior for edge-preserving Bayesian inversion? Inverse Problems, 20.
- M. Lassas, E. Saksman, S. Siltanen, 2009. Discretization invariant Bayesian inversion and Besov space priors. Inverse Problems and Imaging, 3(1).
- V. Kolehmainen, M. Lassas, K. Niinimäki, S. Siltanen, 2012. Sparsity-promoting Bayesian inversion. Inverse Problems, 28(2).

[Figure: 1D results for λ = 10, 40, 160, 640, 2560.]

Efficient MCMC Techniques for l_1 Priors

Task: Monte Carlo integration by samples from $p_{post}(u \mid f) \propto \exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}} - \lambda\,\|D^T u\|_1\right)$

Problem: Standard Markov chain Monte Carlo (MCMC) samplers (Metropolis-Hastings) are inefficient for large n or λ.

Contributions:
- Development of an explicit single component Gibbs sampler.
- Tedious implementation for different scenarios.
- Still efficient in high dimensions (n > 10^6).
- Detailed evaluation and comparison to MH.

F.L., 2012. Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors. Inverse Problems, 28(12):125012.
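The explicit single component Gibbs sampler exploits the fact that each 1D conditional of this posterior is a piecewise Gaussian that can be sampled exactly. The sketch below is a didactic reconstruction of that idea for the simplest possible setting only (denoising, A = I, D = I, i.i.d. noise); in this special case the components are actually independent, so the sweep reduces to direct sampling, whereas for a general A the conditional mean and variance of each component would depend on the current values of the others. It is not the implementation from the paper.

```python
import numpy as np
from scipy.stats import norm, truncnorm

def sample_l1_conditional(m, sigma, lam, rng):
    """Draw x from p(x) ~ exp(-(x - m)^2 / (2 sigma^2) - lam * |x|).

    The density is a Gaussian with shifted mean on each half-axis: pick a
    half-axis according to its probability mass, then draw from the
    corresponding truncated Gaussian.
    """
    m_pos = m - lam * sigma**2            # shifted mean on x >= 0
    m_neg = m + lam * sigma**2            # shifted mean on x < 0
    # Log-masses of the two branches (common factors cancel).
    log_w_pos = m_pos**2 / (2 * sigma**2) + norm.logcdf(m_pos / sigma)
    log_w_neg = m_neg**2 / (2 * sigma**2) + norm.logcdf(-m_neg / sigma)
    p_pos = 1.0 / (1.0 + np.exp(log_w_neg - log_w_pos))
    if rng.random() < p_pos:
        return truncnorm.rvs(-m_pos / sigma, np.inf, loc=m_pos, scale=sigma, random_state=rng)
    return truncnorm.rvs(-np.inf, -m_neg / sigma, loc=m_neg, scale=sigma, random_state=rng)

def gibbs_l1_denoising(f, sigma_noise, lam, n_samples, rng=None):
    """Componentwise Gibbs sweeps for p(u|f) ~ exp(-||f-u||^2/(2 s^2) - lam ||u||_1)."""
    rng = rng or np.random.default_rng()
    u = f.copy()
    samples = np.empty((n_samples, f.size))
    for k in range(n_samples):
        for i in range(f.size):           # one full sweep over all components
            u[i] = sample_l1_conditional(f[i], sigma_noise, lam, rng)
        samples[k] = u
    return samples

# Toy usage: the posterior mean (CM estimate) shrinks the noisy data towards 0.
rng = np.random.default_rng(2)
f = np.array([0.0, 0.5, 3.0]) + 0.1 * rng.normal(size=3)
samples = gibbs_l1_denoising(f, sigma_noise=0.1, lam=5.0, n_samples=2000, rng=rng)
print("CM estimate:", samples[500:].mean(axis=0))
```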

New Theoretical Ideas for an Old Bayesian Debate

$\hat{u}_{MAP} := \operatorname*{argmax}_{u \in \mathbb{R}^n} \, p_{post}(u \mid f)$  vs.  $\hat{u}_{CM} := \int u \, p_{post}(u \mid f) \, du$

- CM preferred in theory, dismissed in practice.
- MAP discredited by theory, chosen in practice.

However:
- MAP results look/perform better than or similar to CM.
- Gaussian priors: MAP = CM. Funny coincidence?
- The theoretical argument has a logical flaw.

[Figure: MAP and CM estimates for n = 63, 255, 1023, 4095, 16383, 65535.]

Contributions:
- Theoretical rehabilitation of the MAP estimate.
- Key: Bayes cost functions based on Bregman distances.
- The Gaussian case is consistent in this framework.

M. Burger, F.L., 2014. Maximum a posteriori estimates in linear inverse problems with log-concave priors are proper Bayes estimators. Inverse Problems, 30(11):114004.
T. Helin, M. Burger, 2015. Maximum a posteriori probability estimates in infinite-dimensional Bayesian inverse problems. arXiv:1412.5816v2. (Talk by Martin Burger in M40-III.)
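To make the MAP-vs-CM distinction tangible, here is a small self-contained sketch for a toy 1D posterior (Gaussian likelihood times Laplace prior; all numbers are arbitrary illustrations, not from the talk) that computes both point estimates on a grid. For this non-Gaussian posterior they differ, whereas for a Gaussian prior they would coincide.

```python
import numpy as np

# Unnormalized 1D posterior: Gaussian likelihood around f = 1, Laplace (l1) prior.
f, sigma, lam = 1.0, 0.5, 3.0
x = np.linspace(-3, 4, 20001)
post = np.exp(-0.5 * (f - x) ** 2 / sigma**2 - lam * np.abs(x))

u_map = x[np.argmax(post)]                 # maximum a posteriori estimate (posterior mode)
u_cm = np.sum(x * post) / np.sum(post)     # conditional mean via simple quadrature on the grid
print(f"MAP = {u_map:.3f}, CM = {u_cm:.3f}")   # the two point estimates do not coincide here
```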

Recent Generalization: Slice-Within-Gibbs Sampling

$p_{prior}(u) \propto \exp\!\left(-\lambda\,\|D^T u\|_1\right)$

Limitations of the explicit sampler: D must be diagonalizable (synthesis priors). What about $\ell_p^q$-priors $\exp\!\left(-\lambda\,\|D^T u\|_p^q\right)$? TV in 2D/3D? Additional hard constraints?

Contributions:
- Replace the explicit single component step by generalized slice sampling.
- Implementation & evaluation for the most common priors.

R.M. Neal, 2003. Slice Sampling. Annals of Statistics, 31(3).
F.L., 2015. Fast Gibbs sampling for high-dimensional Bayesian inversion. (in preparation)
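For reference, the sketch below is a generic univariate slice sampler with stepping-out and shrinkage in the spirit of Neal (2003); within a slice-within-Gibbs scheme it would be applied to each 1D conditional in turn. It is a textbook version under assumed parameters (interval width, step limit, toy target density), not the tailored implementation mentioned above.

```python
import numpy as np

def slice_sample_1d(log_density, x0, w=1.0, max_steps=50, rng=None):
    """One univariate slice-sampling update (stepping-out + shrinkage, Neal 2003)."""
    rng = rng or np.random.default_rng()
    log_y = log_density(x0) + np.log(rng.random())   # slice level under the density at x0
    # Stepping out: place an interval [L, R] of width w around x0 and expand it.
    L = x0 - w * rng.random()
    R = L + w
    steps = max_steps
    while steps > 0 and log_density(L) > log_y:
        L -= w; steps -= 1
    steps = max_steps
    while steps > 0 and log_density(R) > log_y:
        R += w; steps -= 1
    # Shrinkage: sample uniformly from [L, R], shrinking towards x0 on rejection.
    while True:
        x1 = L + (R - L) * rng.random()
        if log_density(x1) > log_y:
            return x1
        if x1 < x0:
            L = x1
        else:
            R = x1

# Toy usage: sample from the 1D density exp(-(x - 1)^2 / 2 - 2 |x|) (Gaussian times Laplace).
rng = np.random.default_rng(3)
log_d = lambda x: -0.5 * (x - 1.0) ** 2 - 2.0 * np.abs(x)
x, chain = 0.0, []
for _ in range(5000):
    x = slice_sample_1d(log_d, x, w=1.0, rng=rng)
    chain.append(x)
print("posterior mean estimate:", np.mean(chain[1000:]))
```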

Application to Experimental Data: Walnut-CT

Cooperation with Samuli Siltanen, Esa Niemi et al.
- Implementation of MCMC methods for fan-beam CT.
- Besov and TV priors; non-negativity constraints.
- Stochastic noise modeling.
- Bayesian perspective on limited-angle CT.

Walnut-CT with TV Prior: Full vs. Limited Angle

[Figure panels: (a) MAP, full; (b) CM, full; (c) CStd, full; (d) MAP, limited; (e) CM, limited; (f) CStd, limited.]

Walnut-CT with TV Prior: Non-Negativity Constraints, Limited Angle

[Figure panels: (a) CM, unconstrained; (b) CM, non-negative; (c) CStd, unconstrained; (d) CStd, non-negative.]

Summary & Conclusions

- Sparsity as a-priori information can be modeled in different ways.
- Elementary MCMC posterior samplers can show very different performance. Contrary to common belief, they are not in general slow and do not necessarily scale badly with increasing dimension.
- Sample-based Bayesian inversion in high dimensions (n > 10^6) is feasible if tailored samplers are developed.
- MAP estimates are proper Bayes estimators. But "MAP or CM?" is NOT the key question in Bayesian inversion: everything beyond point estimates is far more interesting and can really complement variational approaches.

Current Related Work & Outlook

- Fast samplers can be used for simulated annealing.
- The reason for the efficiency of the Gibbs samplers is still unclear.
- Adaptation, parallelization, multimodality, multi-grid.
- Combine l_p-type and hierarchical priors: l_p-hypermodels.
- Application studies have had proof-of-concept character up to now; specific UQ tasks are needed to explore the full potential of the Bayesian approach.

F.L., 2014. Bayesian Inversion in Biomedical Imaging. PhD Thesis, University of Münster.
M. Burger, F.L., 2014. Maximum a posteriori estimates in linear inverse problems with log-concave priors are proper Bayes estimators. Inverse Problems, 30(11):114004.
F.L., 2012. Fast Markov chain Monte Carlo sampling for sparse Bayesian inference in high-dimensional inverse problems using L1-type priors. Inverse Problems, 28(12):125012.

Thank you for your attention!

Efficient MCMC Techniques for l_1 Priors

[Figure panels: (a) Reference; (b) MH-Iso, 1h; (c) MH-Iso, 4h; (d) MH-Iso, 16h; (e) Reference; (f) SC Gibbs, 1h; (g) SC Gibbs, 4h; (h) SC Gibbs, 16h.]

Deconvolution, simple l_1 prior, n = 513 × 513 = 263 169.

Verification of Theoretical Predictions by MCMC

Numerical verification of the discretization dilemma of the TV prior (Lassas & Siltanen, 2004): for λ_n = const and n → ∞, the TV prior diverges; the CM estimate diverges; the MAP estimate converges to an edge-preserving limit.

[Figure: (a) CM by our Gibbs sampler, (b) MAP by ADMM, for n = 63, 255, 1023, 4095, 16383, 65535.]

[Figure: (a) zoom into the CM estimates, (b) MCMC convergence check (û^a_CM vs. û^b_CM).]

Verification of Theoretical Predictions by MCMC

For λ_n ∝ (n + 1) and n → ∞, the TV prior converges to a smoothness prior; the CM estimate converges to a smooth limit; the MAP estimate converges to a constant.

[Figure: (a) CM by our Gibbs sampler, (b) MAP by ADMM, for n = 63, 255, 1023, 4095, 16383, 65535.]
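As a rough illustration of this dilemma, the sketch below draws realizations from the discretized 1D TV prior, using the fact that (with u_0 fixed to 0) its increments are i.i.d. Laplace with scale 1/λ_n, and compares the two scalings of λ_n. It is a toy reconstruction of the effect under those assumptions, not code or data from the talk: with constant λ_n the prior realizations blow up as n grows, with λ_n ∝ (n + 1) they flatten out towards a constant.

```python
import numpy as np

def sample_tv_prior_1d(n, lam, rng):
    """One realization on n+1 grid points from the discretized 1D TV prior
    exp(-lam * sum_i |u_{i+1} - u_i|) with u_0 = 0 (increments are i.i.d. Laplace)."""
    increments = rng.laplace(loc=0.0, scale=1.0 / lam, size=n)
    return np.concatenate([[0.0], np.cumsum(increments)])

rng = np.random.default_rng(4)
for n in [64, 256, 1024, 4096]:
    u_const = sample_tv_prior_1d(n, lam=40.0, rng=rng)               # lambda_n = const: diverges
    u_scaled = sample_tv_prior_1d(n, lam=40.0 * (n + 1), rng=rng)    # lambda_n ~ n+1: flattens out
    print(f"n = {n:5d}:  std (lam const) = {u_const.std():7.3f},"
          f"  std (lam ~ n+1) = {u_scaled.std():.5f}")
```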

Need for New Theoretical Predictions

For image dimensions > 1: no theory yet... but we can compute it.

Test scenario: CT using only 45 projection angles and 500 measurement pixels.

[Figure: real solution, data f, colormap.]

[Figure: MAP vs. CM estimates for n = 64², 128², and 256², each with λ = 500; cf. Louchet, 2008, and Louchet & Moisan, 2013, for the denoising case A = I.]

Examination of Alternative Priors by MCMC: TV-p

$p_{post}(u \mid f) \propto \exp\!\left(-\tfrac{1}{2}\,\|f - Au\|^2_{\Sigma_\varepsilon^{-1}} - \lambda\,\|D^T u\|_p^p\right)$

[Figure: (c) CM (Gibbs-MCMC) and (d) MAP (simulated annealing) estimates for p = 1.4, 1.2, 1.0, 0.8.]

Examination of Besov Space Priors by MCMC

An l_1-type, wavelet-based prior: $p_{prior}(u) \propto \exp\!\left(-\lambda\,\|W V^T u\|_1\right)$

motivated by:
- M. Lassas, E. Saksman, S. Siltanen, 2009. Discretization invariant Bayesian inversion and Besov space priors. Inverse Problems and Imaging, 3(1).
- V. Kolehmainen, M. Lassas, K. Niinimäki, S. Siltanen, 2012. Sparsity-promoting Bayesian inversion. Inverse Problems, 28(2).
- K. Hämäläinen, A. Kallonen, V. Kolehmainen, M. Lassas, K. Niinimäki, S. Siltanen, 2013. Sparse Tomography. SIAM J Sci Comput, 35(3).

Walnut-CT with TV Prior: Full Angle

[Figure panels: (a) MAP; (b) MAP, special color scale; (c) CStd; (d) CM; (e) CM, special color scale; (f) CM of u².]