Bayesian Paradigm. Maximum A Posteriori Estimation

Size: px

Start display at page:

Download "Bayesian Paradigm. Maximum A Posteriori Estimation"

Aldous Smith
6 years ago
Views:

1 Bayesian Paradigm Maximum A Posteriori Estimation

2 Simple acquisition model noise + degradation

3 Constraint minimization or

4 Equivalent formulation Constraint minimization Lagrangian (unconstraint minimization) Maximum Aposteriori (MAP)

5 Discrete representation Image is a random field Each pixel value is a realization of some random variable

6 Random variables Discrete: takes a countable number of distinct values e.g. num. of children in a family Continuous: x probability density function ( distribution ): Moments: expectation ( average ), variance

7 Random variables Expectation Variance

8 Image as a random field

9 Normal distribution

10 Multivariate Normal Distribution

11 Multivariate Normal Distribution

12 Gamma distribution

13 Bayesian Paradigm a posteriori distribution unknown likelihood given by our problem a priori distribution our prior knowledge Maximum a posteriori (MAP): max p(u z) Maximum likelihood (MLE): max p(z u)

14 Likelihood

Image Prior 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0-0.

15 Image Prior Intensity histogram Gradient histogram

16 Image Prior Intensities Gradients (log)

17 Image Prior Gradient histogram

18 Image Prior Tikhonov regularization TV regularization

19 Image Prior Non-convex regularization

20 MAP

21 Kullback-Leibler divergence Measures similarity between two distributions

22 Convex function f(x) Jensen s inequality:

23 Kullback-Leibler divergence is strictly convex

24 Variational Bayes Approximate posterior with a simpler form Minimize with respect to E-L:

25 Variational Bayes

26 Variational Bayes Minimize with respect to all other factors

27 Variational Bayes Minimize with respect to

28 Factorized approximation

29 GM approximated by a single Gaussian depends on initialization

30 Denoising with unknown noise level Noise model: Bayes: and is unknown

31 MAP denoising Minimize the -log posterior with respect to : with respect to :

32 A Posteriori surface

33 MAP denoising

34 Marginalized denoising Expectation of marginalized posterior

35 Marginalized denoising

36 VB denoising Find the factorized approximation of the posterior: image:

37 VB denoising noise precision:

38 VB denoising

39 Comparission

40 Sparse & Redundant Representation Iterated Shrinkage Algorithms Wavelet thresholding

41 We will Talk About The minimization of the functional Substitution: and assume D is invertible : This is our original denoising functional, if Moore-Penrose pseudoinverse

42 A Closer Look at the Functional D rectangular, overcomplete dictionary -> underdetermined system a sparse representation ρ(a) measure of sparsity

43 Measure of Sparsity norms ( ) norm, counts nonzero elements many other sparsity measures smooth l 1

44 l 2 unit ball

45 l 1 unit ball

46 l 0.9 unit ball

47 l 0.5 unit ball

48 l 0.3 unit ball

49 Equivalent formulation Method of Lagrange multipliers

50 l 2 -norm Solution via pseudoinverse is simple but not sparse:

51 l 1 -norm

52 Why Seeking Sparse Representation? Assume x is 1-sparse signal in the dictionary is the perfect solution Overcomplete dictionaries => sparsity

53 Classical Solvers Steepest Descent: Hessian: Convergence depends on the condition number κ of the Hessian: Hessian is ill-conditioned => poor convergence

54 The Unitary Case (DD T = I) Use Define We have m independent 1D optimization problems l 2 is unitarily invariant

55 The 1D Task We need to solve the following 1D problem: LUT a opt b Such a Look-Up-Table (LUT) can be built for ANY sparsity measure function ρ(a), including non-convex ones and non-smooth ones (e.g., l 0 norm).

56 Shrinkage for ρ(a)= a p

57 Soft & Hard Thresholding Soft thresholding Hard thresholding

58 The Unitary Case: Summary Minimizing is done by: x Multiply by D H b LUT S, b a Multiply by D xˆ DONE! The obtained solution is the GLOBAL minimizer of f(a), even if f(a) is non-convex.

59 The minimization of Leads to two very Contradicting Observations: 1. The problem is quite hard classic optimization find it hard. 2. The problem is trivial for the case of unitary D. How Can We Enjoy This Simplicity in the General Case?

60 Iterated-Shrinkage Algorithms Common to all is a set of operations in every iteration that includes: (i) Multiplication by D (ii) Multiplication by D T, and (iii) A Scalar shrinkage on the solution S ρ,λ (a).

61 The Proximal-Point Method Aim: minimize f(a) Suppose it is found to be too hard. Define a surrogate-function g(a,a 0 )=f(a)+dist(a-a 0 ). Then, the following algorithm necessarily converges to a local minima of f(a) [Rokafellar, `76]: a 0 a 1 Minimize Minimize a 2 a k a k+1 Minimize g(a,a 0 ) g(a,a 1 ) g(a,a k ) Comments: (i) The minimization of g(a,a 0 ) should be easier. (ii) Looks like it will slow-down convergence. Really?

62 The Proposed Surrogate-Function Original function: Distance: The beauty is that vanishes! Not a function of a. Minimization of g(a,a 0 ) is done in a closed form by shrinkage, done on the vector b k, and this generates the solution a k+1 of the next iteration. We have m independent 1D optimization problems

63 Iterated-Shrinkage Algorithm While the Unitary case solution is given by x Multiply by D T b LUT S, b aˆ Multiply by D xˆ the general case, by SSF requires: x Multiply by D T /c + b k S, c b a LUT k 1 Multiply by D xˆ

64 Linear Programming Basis Pursuit: LP: Interior-point Algorithms for LP problems

65 Compressed Sensing In compressed sensing we compress the signal x of size m by exploiting its origin. This is done by p<<m random projections. The core idea: (P size: p m) holds all the information about the original signal x, even though p<<m. Reconstruction? Done by MAP and solving y M Da x Px Noise? xˆ

Sparse & Redundant Representations by Iterated-Shrinkage Algorithms

Sparse & Redundant Representations by Michael Elad * The Computer Science Department The Technion Israel Institute of technology Haifa 3000, Israel 6-30 August 007 San Diego Convention Center San Diego,