Hierarchical prior models for scalable Bayesian Tomography
Slide 1: Hierarchical prior models for scalable Bayesian Tomography
Ali Mohammad-Djafari
Laboratoire des Signaux et Systèmes (L2S), UMR 8506 CNRS-CentraleSupélec-Université Paris-Sud, Gif-sur-Yvette, France
Invited talk at WSTL, January 25-27, 2016, Szeged, Hungary
Slide 2: Contents
1. Computed Tomography, or how to see inside a body
2. Algebraic and regularization methods
3. Basic Bayesian approach
4. Two main steps: choosing an appropriate prior model, and doing the computations efficiently
5. Unsupervised Bayesian methods: JMAP, marginalization, VBA
6. Hierarchical prior modelling: sparsity enforcing through Student-t and IGSM; Gauss-Markov-Potts models
7. Computational tools: JMAP, Gibbs sampling (MCMC), VBA
8. Case study: image reconstruction with only two projections
9. Implementation issues: main GPU implementation steps (forward and back projections); multi-resolution implementation
10. Conclusions
Slide 3: Computed Tomography, or how to see inside a body
f(x, y) is a section of a real 3D body f(x, y, z); g_φ(r) is a line of the observed radiography g_φ(r, z).
Forward model: line integrals, i.e. the Radon transform:
  g_φ(r) = ∫_{L_{r,φ}} f(x, y) dl + ε_φ(r)
         = ∬ f(x, y) δ(r − x cos φ − y sin φ) dx dy + ε_φ(r)
Inverse problem (image reconstruction): given the forward model H (Radon transform) and a set of data g_{φ_i}(r), i = 1, ..., M, find f(x, y).
Slide 4: 2D and 3D Computed Tomography
3D: g_φ(r₁, r₂) = ∫_{L_{r₁,r₂,φ}} f(x, y, z) dl
2D: g_φ(r) = ∫_{L_{r,φ}} f(x, y) dl
Forward problem: f(x, y) or f(x, y, z) → g_φ(r) or g_φ(r₁, r₂)
Inverse problem: g_φ(r) or g_φ(r₁, r₂) → f(x, y) or f(x, y, z)
Slide 5: Algebraic methods: discretization
Decompose the image on a pixel basis:
  f(x, y) = Σ_j f_j b_j(x, y),  with b_j(x, y) = 1 if (x, y) ∈ pixel j and 0 otherwise.
Each line integral g(r, φ) = ∫_L f(x, y) dl then becomes a finite sum:
  g_i = Σ_{j=1}^N H_ij f_j + ε_i
Stacking the projections at angles φ_k:
  g_k = H_k f + ε_k,  k = 1, ..., K,  and globally  g = Hf + ε
where g_k is the projection at angle φ_k and g collects all the projections.
Slide 6: Algebraic methods
g = [g_1; ...; g_K],  H = [H_1; ...; H_K],  g_k = H_k f + ε_k  →  g = Hf + ε
- H has huge dimensions, in 2D and even more so in 3D.
- Hf corresponds to forward projection; H^t g corresponds to back projection (BP).
- H may not be invertible, and in general is not even square.
- H is, in general, ill-conditioned.
Generalized inverse, minimum-norm solution:
  f̂ = H^t (H H^t)^{-1} g = Σ_k H_k^t (H_k H_k^t)^{-1} g_k
which can be interpreted as the filtered back projection solution.
Due to the ill-posedness of the inverse problem, one also considers generalized inversion and least squares (LS) methods:
  f̂ = arg min_f J(f),  J(f) = ‖g − Hf‖²  →  f̂ = (H^t H)^{-1} H^t g
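The minimum-norm and LS formulas above are easy to sketch numerically. The following toy example (the sizes and the random H are illustrative assumptions, not code from the talk) checks that the minimum-norm solution H^t (H H^t)^{-1} g coincides with the pseudo-inverse solution and reproduces the data exactly, while differing from the true image.

```python
import numpy as np

# Toy sketch (assumed setup): minimum-norm solution of an under-determined
# system g = H f, with a small random full-row-rank H.
rng = np.random.default_rng(0)
M, N = 4, 8                      # fewer measurements than unknowns
H = rng.standard_normal((M, N))
f_true = rng.standard_normal(N)
g = H @ f_true                   # noiseless data for illustration

# Minimum-norm solution: f = H^t (H H^t)^{-1} g
f_mn = H.T @ np.linalg.solve(H @ H.T, g)

# The same solution via the Moore-Penrose pseudo-inverse
f_pinv = np.linalg.pinv(H) @ g

# Both reproduce the data exactly, yet differ from f_true (ill-posedness)
assert np.allclose(f_mn, f_pinv)
assert np.allclose(H @ f_mn, g)
```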
Slide 7: Inversion, deterministic methods: data matching
Observation model: g = Hf + ε, i.e. g_i = [Hf]_i + ε_i = h_i(f), i = 1, ..., M.
Find f minimizing a mismatch Δ(g, H(f)) between the data and the output of the model:
  f̂ = arg min_f Δ(g, H(f))
Examples:
  Δ_LS(g, H(f)) = ‖g − H(f)‖² = Σ_i |g_i − h_i(f)|²
  Δ_Lp(g, H(f)) = ‖g − H(f)‖^p = Σ_i |g_i − h_i(f)|^p,  1 < p < 2
  Δ_KL(g, H(f)) = Σ_i g_i ln (g_i / h_i(f))
In general, data matching alone does not give satisfactory results for inverse problems.
Slide 8: Regularization theory
Inverse problems are ill-posed problems: prior information is needed.
Functional space (Tikhonov): g = H(f) + ε,  J(f) = ‖g − H(f)‖² + λ‖Df‖²
Finite dimensional space (Phillips & Twomey): g = Hf + ε,  J(f) = ‖g − Hf‖² + λ‖f‖²
  Minimum norm LS (MNLS): J(f) = ‖g − H(f)‖² + λ‖f‖²
  Classical regularization: J(f) = ‖g − H(f)‖² + λ‖Df‖²
More general regularization:
  J(f) = Q(g − H(f)) + λ Ω(Df)  or  J(f) = Δ₁(g, H(f)) + λ Δ₂(Df, f₀)
Limitations:
- the errors are implicitly assumed white and Gaussian;
- only limited prior information on the solution can be expressed;
- lack of tools for the determination of the hyperparameters.
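A minimal sketch of the classical regularization criterion above (the sizes, noise level, λ, and the first-difference choice of D are assumptions for illustration): minimizing J(f) = ‖g − Hf‖² + λ‖Df‖² amounts to solving the normal equations (H^t H + λ D^t D) f = H^t g, and the regularized solution is never rougher (in the ‖Df‖ sense) than the plain LS solution.

```python
import numpy as np

# Toy Tikhonov/classical regularization sketch (assumed setup).
rng = np.random.default_rng(1)
M, N = 20, 20
H = rng.standard_normal((M, N)) / np.sqrt(N)
f_true = np.linspace(0.0, 1.0, N)
g = H @ f_true + 0.01 * rng.standard_normal(M)

# First-order finite-difference operator D, of size (N-1) x N
D = np.diff(np.eye(N), axis=0)

lam = 0.1
# Normal equations of J(f) = ||g - H f||^2 + lam ||D f||^2
f_reg = np.linalg.solve(H.T @ H + lam * D.T @ D, H.T @ g)

# Compare roughness ||D f||^2 with the unregularized LS solution
f_ls = np.linalg.lstsq(H, g, rcond=None)[0]
rough = lambda f: np.sum((D @ f) ** 2)
assert rough(f_reg) <= rough(f_ls) + 1e-9
```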
Slide 9: Inversion, probabilistic methods
Taking account of errors and uncertainties → probability theory:
- Maximum Likelihood (ML)
- Minimum Inaccuracy (MI)
- Probability Distribution Matching (PDM)
- Maximum Entropy (ME) and Information Theory (IT)
- Bayesian inference
Advantages:
- explicit account of the errors and noise;
- a large class of priors via explicit or implicit modelling;
- a coherent approach to combine the information content of the data and the priors.
Limitations: practical implementation and cost of calculation.
Slide 10: Bayesian estimation approach
M: g = Hf + ε
The observation model M plus a hypothesis on the noise ε gives p(g|f; M) = p_ε(g − Hf).
With the a priori information p(f|M), Bayes' rule gives
  p(f|g; M) = p(g|f; M) p(f|M) / p(g|M)
Link with regularization, Maximum A Posteriori (MAP):
  f̂ = arg max_f p(f|g) = arg max_f p(g|f) p(f)
    = arg min_f J(f),  J(f) = −ln p(g|f) − ln p(f)
Regularization: f̂ = arg min_f J(f), J(f) = Q(g, Hf) + λΩ(f),
with Q(g, Hf) = −ln p(g|f) and λΩ(f) = −ln p(f).
Slide 11: Case of linear models and Gaussian priors
g = Hf + ε
Prior knowledge on the noise: ε ~ N(0, v_ε I), so p(g|f) ∝ exp[−(1/(2 v_ε)) ‖g − Hf‖²]
Prior knowledge on f: f ~ N(0, v_f (D^t D)^{-1}), so p(f) ∝ exp[−(1/(2 v_f)) ‖Df‖²]
A posteriori:
  p(f|g) ∝ exp[−(1/(2 v_ε)) ‖g − Hf‖² − (1/(2 v_f)) ‖Df‖²]
MAP: f̂ = arg max_f p(f|g) = arg min_f J(f), with
  J(f) = ‖g − Hf‖² + λ‖Df‖²,  λ = v_ε / v_f
Advantage: full characterization of the solution:
  p(f|g) = N(f̂, Σ̂),  f̂ = (H^t H + λ D^t D)^{-1} H^t g,  Σ̂ = v_ε (H^t H + λ D^t D)^{-1}
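The closed-form Gaussian posterior above can be sketched in a few lines (the sizes, variances, and the choice D = I are illustrative assumptions): compute the posterior mean and covariance and check that Σ̂ is a valid covariance matrix.

```python
import numpy as np

# Toy sketch (assumed setup): Gaussian posterior mean and covariance
#   f_hat = (H^t H + lam D^t D)^{-1} H^t g,  Sigma = v_eps (H^t H + lam D^t D)^{-1}
rng = np.random.default_rng(2)
M, N = 30, 10
H = rng.standard_normal((M, N))
v_eps, v_f = 0.5, 2.0            # noise and prior variances (assumed)
lam = v_eps / v_f
D = np.eye(N)                    # white prior; use a difference matrix for smoothness

f_true = rng.standard_normal(N)
g = H @ f_true + np.sqrt(v_eps) * rng.standard_normal(M)

A = H.T @ H + lam * D.T @ D
f_hat = np.linalg.solve(A, H.T @ g)
Sigma = v_eps * np.linalg.inv(A)

# Sigma is a valid covariance: symmetric and positive definite
assert np.allclose(Sigma, Sigma.T)
assert np.all(np.linalg.eigvalsh(Sigma) > 0)
```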
Slide 12: MAP estimation with other priors
f̂ = arg min_f J(f),  J(f) = (1/v_ε) ‖g − Hf‖² + Ω(f)
Separable priors:
- Gaussian: p(f_j) ∝ exp[−α|f_j|²]  →  Ω(f) = α Σ_j |f_j|² = α‖f‖²₂
- Generalized Gaussian: p(f_j) ∝ exp[−α|f_j|^p], 1 < p < 2  →  Ω(f) = α Σ_j |f_j|^p = α‖f‖^p_p
- Gamma: p(f_j) ∝ f_j^α exp[−β f_j]  →  Ω(f) = Σ_j (−α ln f_j + β f_j)
- Beta: p(f_j) ∝ f_j^α (1 − f_j)^β  →  Ω(f) = Σ_j (−α ln f_j − β ln(1 − f_j))
Markovian models:
  p(f_j|f) ∝ exp[−α Σ_{i∈N_j} φ(f_j, f_i)]  →  Ω(f) = α Σ_j Σ_{i∈N_j} φ(f_j, f_i)
Slide 13: Main advantages of the Bayesian approach
- MAP = regularization; but also posterior mean and marginal MAP
- More information in the posterior law than only its mode or its mean
- Tools for estimating the hyperparameters
- Tools for model selection
- More specific and specialized priors, in particular through hidden variables and hierarchical models
- More computational tools:
  - Expectation-Maximization for computing the maximum likelihood parameters
  - MCMC for posterior exploration
  - Variational Bayes for analytical computation of the posterior marginals
Slide 14: Two main steps in the Bayesian approach
1. Prior modelling
   - Separable: Gaussian, Gamma; sparsity enforcing: generalized Gaussian, mixture of Gaussians, mixture of Gammas, ...
   - Markovian: Gauss-Markov, GGM, ...
   - Markovian with hidden variables (contours, region labels)
2. Choice of the estimator and computational aspects
   - MAP, posterior mean, marginal MAP
   - MAP needs optimization algorithms; the posterior mean needs integration methods; marginal MAP and hyperparameter estimation need both integration and optimization
   - Approximations: Gaussian approximation (Laplace), numerical exploration (MCMC), Variational Bayes (separable approximation)
Slide 15: Which images am I looking for? (example images shown on the slide)
Slide 16: Which image am I looking for?
Example images shown on the slide: Gauss-Markov, generalized Gauss-Markov, piecewise Gaussian, mixture of Gauss-Markov.
Slide 17: Different prior models for signals and images
Simple separable models (Gaussian, Gamma, generalized Gaussian):
  p(f) ∝ exp[−α Σ_j φ(f_j)]
Simple Markovian models (Gauss-Markov, generalized Gauss-Markov):
  p(f) ∝ exp[−α Σ_j Σ_{i∈N(j)} φ(f_j − f_i)]
Hierarchical models with hidden variables (Bernoulli-Gaussian, Gaussian-Gamma):
  p(f|z) ∝ exp[−Σ_j φ(f_j, z_j)]  and  p(z) ∝ exp[−Σ_j φ(z_j)]
with different choices for φ(f_j, z_j) and φ(z_j).
Slide 18: Bayesian inference as hierarchical models
Simple case: g = Hf + ε, with hyperparameters θ = (θ₁, θ₂):
  p(f|g, θ) ∝ p(g|f, θ₁) p(f|θ₂)
Objective: infer f.
- MAP: f̂ = arg max_f p(f|g, θ)
- Posterior mean (PM): f̂ = ∫ f p(f|g, θ) df
Example, Gaussian case:
  p(g|f, θ₁) = N(g|Hf, v_ε I),  p(f|θ₂) = N(f|0, v_f I)  →  p(f|g, θ) = N(f|f̂, Σ̂)
MAP: f̂ = arg min_f J(f), with J(f) = (1/v_ε)‖g − Hf‖² + (1/v_f)‖f‖²
Posterior mean (PM) = MAP:
  f̂ = (H^t H + λI)^{-1} H^t g,  Σ̂ = v_ε (H^t H + λI)^{-1},  λ = v_ε / v_f
Slide 19: Gaussian model, simple separable and Markovian
g = Hf + ε,  p(g|f, θ₁) = N(g|Hf, v_ε I)
Separable Gaussian case: p(f|v_f) = N(f|0, v_f I)
  MAP: f̂ = arg min_f J(f),  J(f) = (1/v_ε)‖g − Hf‖² + (1/v_f)‖f‖²
  Posterior (PM = MAP): p(f|g, θ) = N(f|f̂, Σ̂), with
    f̂ = (H^t H + λI)^{-1} H^t g,  Σ̂ = v_ε (H^t H + λI)^{-1},  λ = v_ε / v_f
Gauss-Markov case: p(f|v_f, D) = N(f|0, v_f (D^t D)^{-1})
  MAP: J(f) = (1/v_ε)‖g − Hf‖² + (1/v_f)‖Df‖²
  Posterior (PM = MAP):
    f̂ = (H^t H + λ D^t D)^{-1} H^t g,  Σ̂ = v_ε (H^t H + λ D^t D)^{-1},  λ = v_ε / v_f
Slide 20: Bayesian inference, unsupervised case
Unsupervised case: hyperparameter estimation
  p(f, θ|g) ∝ p(g|f, θ₁) p(f|θ₂) p(θ)
Objective: infer (f, θ).
- JMAP: (f̂, θ̂) = arg max_{(f,θ)} p(f, θ|g)
- Marginalization 1: p(f|g) = ∫ p(f, θ|g) dθ
- Marginalization 2: p(θ|g) = ∫ p(f, θ|g) df, followed by
    θ̂ = arg max_θ p(θ|g),  f̂ = arg max_f p(f|g, θ̂)
- MCMC, Gibbs sampling: iterate f ~ p(f|θ, g) and θ ~ p(θ|f, g) until convergence, then use the generated samples to compute means and variances.
- VBA: approximate p(f, θ|g) by q₁(f) q₂(θ); use q₁(f) to infer f and q₂(θ) to infer θ.
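As a concrete illustration of the Gibbs-sampling option, here is a minimal sampler for a toy version of the model (all sizes, hyperparameter values, and the choice of a known v_f are assumptions, not the talk's code): it alternates the Gaussian conditional of f and the inverse-gamma conditional of v_ε.

```python
import numpy as np

# Toy Gibbs sampler (assumed model): g = H f + eps, f ~ N(0, v_f I) with
# v_f known, and v_eps ~ IG(a0, b0). Alternate f | v_eps, g and v_eps | f, g.
rng = np.random.default_rng(3)
M, N = 40, 5
H = rng.standard_normal((M, N))
v_f, v_eps_true = 1.0, 0.3
f_true = np.sqrt(v_f) * rng.standard_normal(N)
g = H @ f_true + np.sqrt(v_eps_true) * rng.standard_normal(M)

a0, b0 = 2.0, 1.0                # assumed IG hyperparameters
v_eps = 1.0                      # initial value
f_samples, v_samples = [], []
for it in range(500):
    # f | v_eps, g is Gaussian
    Sigma = np.linalg.inv(H.T @ H / v_eps + np.eye(N) / v_f)
    mu = Sigma @ (H.T @ g) / v_eps
    f = rng.multivariate_normal(mu, Sigma)
    # v_eps | f, g is Inverse-Gamma(a0 + M/2, b0 + ||g - H f||^2 / 2),
    # sampled as the reciprocal of a Gamma variate
    a = a0 + M / 2
    b = b0 + 0.5 * np.sum((g - H @ f) ** 2)
    v_eps = 1.0 / rng.gamma(shape=a, scale=1.0 / b)
    if it >= 100:                # discard burn-in
        f_samples.append(f)
        v_samples.append(v_eps)

f_pm = np.mean(f_samples, axis=0)     # posterior-mean estimate of f
v_pm = float(np.mean(v_samples))
assert 0.0 < v_pm < 5.0 * v_eps_true  # loose sanity check on the noise variance
```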
Slide 21: Variational Bayesian Approximation
Approximate p(f, θ|g) by q(f, θ|g) = q₁(f|g) q₂(θ|g) and then continue the computations.
Criterion: KL(q(f, θ|g) : p(f, θ|g)), with
  KL(q : p) = ∫∫ q ln(q/p) = ∫∫ q₁ q₂ ln(q₁ q₂ / p)
            = −H(q₁) − H(q₂) − <ln p>_q
Alternate optimization algorithm q₁ → q₂ → q₁ → q₂ → ...:
  q₁(f) ∝ exp[ <ln p(g, f, θ; M)>_{q₂(θ)} ]
  q₂(θ) ∝ exp[ <ln p(g, f, θ; M)>_{q₁(f)} ]
Slide 22: JMAP, marginalization, VBA
- JMAP: optimize the joint posterior p(f, θ|g) over (f, θ) → (f̂, θ̂).
- Marginalization: p(f, θ|g) → marginalize over f to get p(θ|g) → θ̂ → p(f|θ̂, g) → f̂.
- VBA: approximate p(f, θ|g) by q₁(f) q₂(θ); use q₁(f) → f̂ and q₂(θ) → θ̂.
Slide 23: Variational Bayesian Approximation (continued)
Approximate p(f, θ|g) by q(f, θ) = q₁(f) q₂(θ) and then use them for any inference on f and θ respectively.
Criterion: KL(q(f, θ|g) : p(f, θ|g)), with
  KL(q : p) = ∫∫ q ln(q/p) = ∫∫ q₁ q₂ ln(q₁ q₂ / p)
Iterative algorithm q₁ → q₂ → q₁ → q₂ → ...:
  q₁(f) ∝ exp[ <ln p(g, f, θ; M)>_{q₂(θ)} ]
  q₂(θ) ∝ exp[ <ln p(g, f, θ; M)>_{q₁(f)} ]
Slide 24: VBA, choice of the families of laws q₁ and q₂
Case 1, Joint MAP:
  q₁(f|f̂) = δ(f − f̂),  q₂(θ|θ̂) = δ(θ − θ̂)
  →  f̂ = arg max_f p(f, θ̂|g; M),  θ̂ = arg max_θ p(f̂, θ|g; M)
Case 2, EM:
  q₁(f) ∝ p(f|θ, g),  q₂(θ|θ̂) = δ(θ − θ̂)
  →  Q(θ, θ̂) = <ln p(f, θ|g; M)>_{q₁(f|θ̂)},  θ̂ = arg max_θ Q(θ, θ̂)
Appropriate choice for inverse problems, with exponential families and conjugate priors:
  q₁(f) ∝ p(f|θ̂, g; M),  q₂(θ) ∝ p(θ|f̂, g; M)
This accounts for the uncertainties of θ̂ for f and vice versa.
Slide 25: JMAP, EM and VBA
JMAP, alternate optimization: starting from θ̂(0), iterate
  f̂ = arg max_f p(f, θ̂|g),  θ̂ = arg max_θ p(f̂, θ|g)
EM: starting from θ̂(0), iterate
  q₁(f) = p(f|θ̂, g),  Q(θ, θ̂) = <ln p(f, θ|g)>_{q₁(f)},  θ̂ = arg max_θ Q(θ, θ̂)
VBA: starting from θ̂(0), iterate
  q₁(f) ∝ exp[<ln p(f, θ|g)>_{q₂(θ)}],  q₂(θ) ∝ exp[<ln p(f, θ|g)>_{q₁(f)}]
Slide 26: Unsupervised Gaussian model
g = Hf + ε, ε ~ N(0, v_ε I):
  p(g|f, v_ε) = N(g|Hf, v_ε I),  p(f|v_f) = N(f|0, v_f I)
  p(v_ε) = IG(v_ε|α_ε0, β_ε0),  p(v_f) = IG(v_f|α_f0, β_f0)
  p(f, v_ε, v_f|g) ∝ p(g|f, v_ε) p(f|v_f) p(v_ε) p(v_f)
JMAP: (f̂, v̂_ε, v̂_f) = arg max_{(f, v_ε, v_f)} p(f, v_ε, v_f|g), by alternate optimization:
  f̂ = (H^t H + λI)^{-1} H^t g,  λ = v̂_ε / v̂_f
  v̂_ε = β̂_ε / α̂_ε,  β̂_ε = β_ε0 + ‖g − Hf̂‖²/2,  α̂_ε = α_ε0 + M/2
  v̂_f = β̂_f / α̂_f,  β̂_f = β_f0 + ‖f̂‖²/2,  α̂_f = α_f0 + N/2
VBA: approximate p(f, v_ε, v_f|g) by q₁(f) q₂(v_ε) q₃(v_f), by alternate optimization:
  q₁(f) = N(f|μ̂, Σ̂),  q₂(v_ε) = IG(v_ε|α̂_ε, β̂_ε),  q₃(v_f) = IG(v_f|α̂_f, β̂_f)
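The JMAP alternate optimization above can be sketched directly (the problem sizes and hyper-hyperparameter values are assumptions for illustration): iterate the closed-form update of f̂ and the inverse-gamma point estimates of v_ε and v_f.

```python
import numpy as np

# Toy JMAP alternate optimization for the unsupervised Gaussian model
# (assumed setup, not code from the talk).
rng = np.random.default_rng(4)
M, N = 50, 8
H = rng.standard_normal((M, N))
v_eps_true, v_f_true = 0.2, 1.0
f_true = np.sqrt(v_f_true) * rng.standard_normal(N)
g = H @ f_true + np.sqrt(v_eps_true) * rng.standard_normal(M)

a_e0, b_e0 = 1.0, 1e-3          # assumed hyper-hyperparameters
a_f0, b_f0 = 1.0, 1e-3
v_eps, v_f = 1.0, 1.0           # initial values
for _ in range(50):
    lam = v_eps / v_f
    f_hat = np.linalg.solve(H.T @ H + lam * np.eye(N), H.T @ g)
    v_eps = (b_e0 + 0.5 * np.sum((g - H @ f_hat) ** 2)) / (a_e0 + M / 2)
    v_f = (b_f0 + 0.5 * np.sum(f_hat ** 2)) / (a_f0 + N / 2)

# Sanity checks: the variances stay positive and the fit improved over f = 0
assert v_eps > 0 and v_f > 0
assert np.linalg.norm(g - H @ f_hat) < np.linalg.norm(g)
```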
Slide 27: Unsupervised Gaussian model (continued)
Same model and JMAP expressions as on the previous slide. VBA: approximate p(f, v_ε, v_f|g) by q₁(f) q₂(v_ε) q₃(v_f), with
  q₁(f) = N(f|μ̂, Σ̂):  μ̂ = (H^t H + λ̂I)^{-1} H^t g,  Σ̂ = v̂_ε (H^t H + λ̂I)^{-1},  λ̂ = v̂_ε / v̂_f
  q₂(v_ε) = IG(v_ε|α̂_ε, β̂_ε):  v̂_ε = β̂_ε/α̂_ε,  β̂_ε = β_ε0 + <‖g − Hf‖²>/2,  α̂_ε = α_ε0 + M/2
  q₃(v_f) = IG(v_f|α̂_f, β̂_f):  v̂_f = β̂_f/α̂_f,  β̂_f = β_f0 + <‖f‖²>/2,  α̂_f = α_f0 + N/2
where the needed expectations are
  <‖g − Hf‖²> = ‖g − Hμ̂‖² + tr(H Σ̂ H^t),  <‖f‖²> = ‖μ̂‖² + tr(Σ̂)
Slide 28: Sparsity enforcing models
Student-t model:
  St(f|ν) ∝ exp[−((ν + 1)/2) ln(1 + f²/ν)]
Infinite Gaussian Scale Mixture (IGSM) equivalence:
  St(f|ν) = ∫₀^∞ N(f|0, 1/z) G(z|α, β) dz,  with α = β = ν/2
Hence, with hidden scales z = (z_j):
  p(f|z) = Π_j N(f_j|0, 1/z_j) ∝ exp[−(1/2) Σ_j z_j f_j²]
  p(z|α, β) = Π_j G(z_j|α, β) ∝ exp[Σ_j ((α − 1) ln z_j − β z_j)]
  p(f, z|α, β) ∝ exp[−(1/2) Σ_j z_j f_j² + Σ_j ((α − 1) ln z_j − β z_j)]
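The IGSM equivalence can be checked numerically (a quick Monte Carlo sanity check, not from the talk): drawing z ~ Gamma(ν/2, rate ν/2) and then f | z ~ N(0, 1/z) should give f distributed as a standard Student-t with ν degrees of freedom, whose variance is ν/(ν − 2).

```python
import numpy as np

# Monte Carlo check of the Gaussian scale-mixture representation of Student-t.
rng = np.random.default_rng(5)
nu = 5.0
n = 200_000
# Gamma with shape nu/2 and rate nu/2, i.e. scale 2/nu in numpy's convention
z = rng.gamma(shape=nu / 2, scale=2.0 / nu, size=n)
f = rng.standard_normal(n) / np.sqrt(z)   # f | z ~ N(0, 1/z)

# Student-t(nu) has mean 0 and variance nu / (nu - 2) for nu > 2
assert abs(np.mean(f)) < 0.05
assert abs(np.var(f) - nu / (nu - 2)) < 0.1
```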
Slide 29: Non-stationary noise and Infinite Gaussian Scale Mixture (IGSM) model
Non-stationary noise: g = Hf + ε, with ε_i ~ N(ε_i|0, v_εi), so ε ~ N(ε|0, V_ε), V_ε = diag[v_ε1, ..., v_εM].
Student-t prior on f via its IGSM equivalent:
  f_j|v_fj ~ N(f_j|0, v_fj) and v_fj ~ IG(v_fj|α_f0, β_f0)  →  f_j ~ St(f_j|α_f0, β_f0)
Model:
  p(g|f, v_ε) = N(g|Hf, V_ε),  V_ε = diag[v_ε]
  p(f|v_f) = N(f|0, V_f),  V_f = diag[v_f]
  p(v_ε) = Π_i IG(v_εi|α_ε0, β_ε0),  p(v_f) = Π_j IG(v_fj|α_f0, β_f0)
  p(f, v_ε, v_f|g) ∝ p(g|f, v_ε) p(f|v_f) p(v_ε) p(v_f)
JMAP: (f̂, v̂_ε, v̂_f) = arg max_{(f, v_ε, v_f)} p(f, v_ε, v_f|g)
VBA: approximate p(f, v_ε, v_f|g) by q₁(f) q₂(v_ε) q₃(v_f)
Slide 30: Sparse model in a transform domain (1)
g = Hf + ε, f = Dz, z sparse:
  p(g|z, v_ε) = N(g|HDz, v_ε I)
  p(z|v_z) = N(z|0, V_z),  V_z = diag[v_z]
  p(v_ε) = IG(v_ε|α_ε0, β_ε0),  p(v_z) = Π_j IG(v_zj|α_z0, β_z0)
  p(z, v_ε, v_z|g) ∝ p(g|z, v_ε) p(z|v_z) p(v_ε) p(v_z)
JMAP: (ẑ, v̂_ε, v̂_z) = arg max_{(z, v_ε, v_z)} p(z, v_ε, v_z|g), by alternate optimization:
  ẑ = arg min_z J(z),  J(z) = (1/(2 v̂_ε)) ‖g − HDz‖² + (1/2) ‖V̂_z^{-1/2} z‖²
  v̂_zj = (β_z0 + ẑ_j²/2) / (α_z0 + 1/2)
  v̂_ε = (β_ε0 + ‖g − HDẑ‖²/2) / (α_ε0 + M/2)
VBA: approximate p(z, v_ε, v_z|g) by q₁(z) q₂(v_ε) q₃(v_z), by alternate optimization.
Slide 31: Sparse model in a transform domain (2)
g = Hf + ε, f = Dz + ξ, z sparse:
  p(g|f, v_ε) = N(g|Hf, v_ε I)
  p(f|z, v_ξ) = N(f|Dz, v_ξ I)
  p(z|v_z) = N(z|0, V_z),  V_z = diag[v_z]
  p(v_ε) = IG(v_ε|α_ε0, β_ε0),  p(v_z) = Π_j IG(v_zj|α_z0, β_z0),  p(v_ξ) = IG(v_ξ|α_ξ0, β_ξ0)
  p(f, z, v_ε, v_z, v_ξ|g) ∝ p(g|f, v_ε) p(f|z, v_ξ) p(z|v_z) p(v_ε) p(v_z) p(v_ξ)
JMAP: (f̂, ẑ, v̂_ε, v̂_z, v̂_ξ) = arg max p(f, z, v_ε, v_z, v_ξ|g), by alternate optimization.
VBA: approximate p(f, z, v_ε, v_z, v_ξ|g) by q₁(f) q₂(z) q₃(v_ε) q₄(v_z) q₅(v_ξ), by alternate optimization.
Slide 32: Gauss-Markov-Potts prior models for images
To each pixel r associate the image value f(r) and a hidden classification label z(r); the contours are c(r) = 1 − δ(z(r) − z(r')).
  p(f(r)|z(r) = k, m_k, v_k) = N(f(r)|m_k, v_k)
  p(f(r)) = Σ_k P(z(r) = k) N(f(r)|m_k, v_k)  (mixture of Gaussians)
Hidden variables z:
- separable iid: p(z) = Π_r p(z(r))
- Markovian (Potts-Markov model):
    p(z(r)|z(r'), r' ∈ V(r)) ∝ exp[γ Σ_{r'∈V(r)} δ(z(r) − z(r'))]
    p(z) ∝ exp[γ Σ_{r∈R} Σ_{r'∈V(r)} δ(z(r) − z(r'))]
where V(r) is the neighbourhood of pixel r.
Slide 33: Four different cases
To each pixel of the image are associated two variables, f(r) and z(r):
- f|z Gaussian iid, z iid: mixture of Gaussians
- f|z Gauss-Markov, z iid: mixture of Gauss-Markov
- f|z Gaussian iid, z Potts-Markov: mixture of independent Gaussians (MIG with hidden Potts)
- f|z Markov, z Potts-Markov: mixture of Gauss-Markov (MGM with hidden Potts)
Slide 34: Gauss-Markov-Potts prior models for images (full hierarchy)
Image f(r), labels z(r), contours c(r) = 1 − δ(z(r) − z(r')); g = Hf + ε:
  p(g|f, v_ε) = N(g|Hf, v_ε I),  p(v_ε) = IG(v_ε|α_ε0, β_ε0)
  p(f(r)|z(r) = k, m_k, v_k) = N(f(r)|m_k, v_k)
  p(f|z, θ) = Π_k Π_{r∈R_k} a_k N(f(r)|m_k, v_k),  θ = {(a_k, m_k, v_k), k = 1, ..., K}
  p(θ) = D(a|a₀) Π_k N(m_k|m₀, v₀) IG(v_k|α₀, β₀)
  p(z|γ) ∝ exp[γ Σ_r Σ_{r'∈N(r)} δ(z(r) − z(r'))]  (Potts MRF)
Joint posterior: p(f, z, θ|g) ∝ p(g|f, v_ε) p(f|z, θ) p(z|γ) p(θ)
Computation: MCMC (Gibbs sampling) or VBA (alternate optimization).
Slide 35: Bayesian computation and algorithms
Joint posterior probability law of all the unknowns f, z, θ:
  p(f, z, θ|g) ∝ p(g|f, θ₁) p(f|z, θ₂) p(z|θ₃) p(θ)
Often the expression of p(f, z, θ|g) is complex: its optimization (for joint MAP), or its marginalization or integration (for marginal MAP or PM), is not easy.
Two main techniques:
- MCMC: needs the expressions of the conditionals p(f|z, θ, g), p(z|f, θ, g) and p(θ|f, z, g).
- VBA: approximate p(f, z, θ|g) by a separable law q(f, z, θ|g) = q₁(f) q₂(z) q₃(θ) and do all computations with these separable ones.
Slide 36: MCMC based algorithm
p(f, z, θ|g) ∝ p(g|f, z, θ₁) p(f|z, θ₂) p(z|θ₃) p(θ)
General Gibbs sampling scheme: iterate
  f̂ ~ p(f|ẑ, θ̂, g),  ẑ ~ p(z|f̂, θ̂, g),  θ̂ ~ p(θ|f̂, ẑ, g)
- Generate samples f using p(f|ẑ, θ̂, g) ∝ p(g|f, θ̂) p(f|ẑ, θ̂): when Gaussian, this can be done via optimization of a quadratic criterion.
- Generate samples z using p(z|f̂, θ̂, g) ∝ p(g|f̂, ẑ, θ̂) p(z): often needs sampling of a hidden discrete variable.
- Generate samples θ using p(θ|f̂, ẑ, g) ∝ p(g|f̂, σ_ε² I) p(f̂|ẑ, (m_k, v_k)) p(θ): the use of conjugate priors gives analytical expressions.
After convergence, use the samples to compute means and variances.
Slide 37: Computed Tomography with only two projections
Discretized model as before:
  g(r, φ) = ∫_L f(x, y) dl,  f(x, y) = Σ_j f_j b_j(x, y),  g_i = Σ_{j=1}^N H_ij f_j + ε_i  →  g = Hf + ε
Case study, reconstruction from 2 projections:
  g₁(x) = ∫ f(x, y) dy,  g₂(y) = ∫ f(x, y) dx
Very ill-posed inverse problem: all solutions are of the form
  f(x, y) = g₁(x) g₂(y) Ω(x, y)
where Ω(x, y) is a copula: ∫ Ω(x, y) dx = 1 and ∫ Ω(x, y) dy = 1.
Slide 38: Simple example
A 2×2 image f = (f₁, ..., f₄) with its four ray sums g = (g₁, ..., g₄), or a 3×3 image f = (f₁, ..., f₉) with six ray sums g = (g₁, ..., g₆): f̂ = H^{-1} g if H is invertible, but here H is rank-deficient, so the problem has an infinite number of solutions.
How to find all those solutions? Which one is the good one? To find a unique solution, one needs either more data or prior information.
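The non-uniqueness can be checked in a few lines (an assumed toy example, not from the talk): two different 2×2 binary images share exactly the same row and column sums, i.e. the same two projections.

```python
import numpy as np

# Two different binary images ...
f1 = np.array([[1, 0],
               [0, 1]], dtype=float)
f2 = np.array([[0, 1],
               [1, 0]], dtype=float)

# ... with identical two projections (column sums and row sums of f1)
g1 = f1.sum(axis=0)   # g1(x) = sum over y of f(x, y)
g2 = f1.sum(axis=1)   # g2(y) = sum over x of f(x, y)

assert np.allclose(f2.sum(axis=0), g1)
assert np.allclose(f2.sum(axis=1), g2)
# hence reconstruction from 2 projections is very ill-posed
```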
Slide 39: Application in CT: reconstruction from 2 projections
Model:
  g = Hf + ε,  g|f ~ N(Hf, σ_ε² I)  (Gaussian)
  f|z: iid Gaussian or Gauss-Markov;  z: iid or Potts
  binary contour variable c(r) = 1 − δ(z(r) − z(r')) ∈ {0, 1}
  p(f, z, θ|g) ∝ p(g|f, θ₁) p(f|z, θ₂) p(z|θ₃) p(θ)
Slide 40: Proposed algorithms
p(f, z, θ|g) ∝ p(g|f, θ₁) p(f|z, θ₂) p(z|θ₃) p(θ)
MCMC based general scheme: iterate f̂ ~ p(f|ẑ, θ̂, g), ẑ ~ p(z|f̂, θ̂, g), θ̂ ~ p(θ|f̂, ẑ, g).
Iterative algorithm:
- Estimate f using p(f|ẑ, θ̂, g) ∝ p(g|f, θ̂) p(f|ẑ, θ̂): needs optimization of a quadratic criterion.
- Estimate z using p(z|f̂, θ̂, g) ∝ p(g|f̂, ẑ, θ̂) p(z): needs sampling of a Potts Markov field.
- Estimate θ using p(θ|f̂, ẑ, g) ∝ p(g|f̂, σ_ε² I) p(f̂|ẑ, (m_k, v_k)) p(θ): conjugate priors give analytical expressions.
Variational Bayesian Approximation: approximate p(f, z, θ|g) by q₁(f) q₂(z) q₃(θ).
Slide 41: Results
Reconstructions shown on the slide: original; backprojection; filtered BP; LS; Gauss-Markov + positivity; GM + line process; GM + label process; together with the estimated labels ẑ and contours ĉ.
Slide 42: Implementation issues
In almost all the algorithms, the computation of f̂ needs an optimization algorithm. The criterion to optimize is often of the form
  J(f) = ‖g − Hf‖² + λ‖Df‖²
Very often, gradient based algorithms are used, which need to compute
  ∇J(f) = −2 H^t (g − Hf) + 2λ D^t D f
So, in the simplest case, each step reads
  f̂(k+1) = f̂(k) + α(k) [H^t (g − H f̂(k)) − λ D^t D f̂(k)]
Slide 43: Gradient based algorithms
f̂(k+1) = f̂(k) + α [H^t (g − H f̂(k)) − λ D^t D f̂(k)]
1. Compute ĝ = H f̂(k) (forward projection)
2. Compute δg = g − ĝ (error, or residual)
3. Compute δf₁ = H^t δg (backprojection of the error)
4. Compute δf₂ = −λ D^t D f̂(k) (correction due to regularization)
5. Update f̂(k+1) = f̂(k) + α [δf₁ + δf₂]
Loop structure: start from an initial guess f̂(0); forward project the current estimate to get ĝ = H f̂(k); compare with the measured projections g to get the correction term δg = g − ĝ in projection space; backproject it to get the correction term in image space; update.
Steps 1 and 3 have the greatest computational cost and have been implemented on GPU.
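The five steps above can be sketched directly (the step size, problem sizes, and the first-difference D are assumptions for illustration; a real implementation would replace the matrix products by GPU forward/back projectors):

```python
import numpy as np

# Toy gradient iteration for J(f) = ||g - H f||^2 + lam ||D f||^2
# (assumed setup, not the talk's GPU code).
rng = np.random.default_rng(6)
M, N = 60, 20
H = rng.standard_normal((M, N)) / np.sqrt(M)
D = np.diff(np.eye(N), axis=0)   # first-difference regularization operator
f_true = np.linspace(-1.0, 1.0, N)
g = H @ f_true + 0.01 * rng.standard_normal(M)

lam, alpha = 0.1, 0.2            # assumed regularization weight and step size
f = np.zeros(N)                  # initial guess f^(0)
J = lambda f: np.sum((g - H @ f) ** 2) + lam * np.sum((D @ f) ** 2)
J0 = J(f)
for _ in range(200):
    g_hat = H @ f                # 1. forward projection
    dg = g - g_hat               # 2. residual in projection space
    df1 = H.T @ dg               # 3. backprojection of the error
    df2 = -lam * (D.T @ (D @ f)) # 4. correction due to regularization
    f = f + alpha * (df1 + df2)  # 5. update

assert J(f) < J0                 # the criterion decreased
```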
Slide 44: Multi-resolution implementation
Scale 1 (black): g(1) = H(1) f(1), image of size N × N
Scale 2 (green): g(2) = H(2) f(2), image of size N/2 × N/2
Scale 3 (red): g(3) = H(3) f(3), image of size N/4 × N/4
Slide 45: Conclusions
- Computed Tomography is an inverse problem.
- Analytical methods have many limitations; algebraic methods push these limitations further; deterministic regularization methods push the limitations of ill-conditioning still further.
- The probabilistic, and in particular the Bayesian, approach has great potential.
- Hierarchical prior models with hidden variables are very powerful tools for the Bayesian approach to inverse problems.
- Gauss-Markov-Potts models for images incorporate hidden regions and contours.
- Main Bayesian computation tools: JMAP, MCMC and VBA.
- Applications in different imaging systems: X-ray CT, microwaves, PET, ultrasound, Optical Diffusion Tomography (ODT), acoustic source localization, ...
- Current projects: efficient implementation in 2D and 3D cases.
More informationLikelihood NIPS July 30, Gaussian Process Regression with Student-t. Likelihood. Jarno Vanhatalo, Pasi Jylanki and Aki Vehtari NIPS-2009
with with July 30, 2010 with 1 2 3 Representation Representation for Distribution Inference for the Augmented Model 4 Approximate Laplacian Approximation Introduction to Laplacian Approximation Laplacian
More informationIntroduction to Bayesian methods in inverse problems
Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationBayesian inference J. Daunizeau
Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty
More informationMCMC Sampling for Bayesian Inference using L1-type Priors
MÜNSTER MCMC Sampling for Bayesian Inference using L1-type Priors (what I do whenever the ill-posedness of EEG/MEG is just not frustrating enough!) AG Imaging Seminar Felix Lucka 26.06.2012 , MÜNSTER Sampling
More informationVariational Methods in Bayesian Deconvolution
PHYSTAT, SLAC, Stanford, California, September 8-, Variational Methods in Bayesian Deconvolution K. Zarb Adami Cavendish Laboratory, University of Cambridge, UK This paper gives an introduction to the
More informationBayesian inference with hierarchical prior models for inverse problems in imaging systems
A Mohammad-Djafari, Tutorial talk: Bayesian inference for inverse problems, WOSSPA, May 11-15, 2013, Mazafran, Algiers, 1/96 Bayesian inference with hierarchical prior models for inverse problems in imaging
More informationIntroduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak
Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak 1 Introduction. Random variables During the course we are interested in reasoning about considered phenomenon. In other words,
More informationLecture 1b: Linear Models for Regression
Lecture 1b: Linear Models for Regression Cédric Archambeau Centre for Computational Statistics and Machine Learning Department of Computer Science University College London c.archambeau@cs.ucl.ac.uk Advanced
More informationSTA414/2104 Statistical Methods for Machine Learning II
STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements
More informationSUPREME: a SUPer REsolution Mapmaker for Extended emission
SUPREME: a SUPer REsolution Mapmaker for Extended emission Hacheme Ayasso 1,2 Alain Abergel 2 Thomas Rodet 3 François Orieux 2,3,4 Jean-François Giovannelli 5 Karin Dassas 2 Boualam Hasnoun 1 1 GIPSA-LAB
More informationBayesian Regression Linear and Logistic Regression
When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we
More informationAlgorithmisches Lernen/Machine Learning
Algorithmisches Lernen/Machine Learning Part 1: Stefan Wermter Introduction Connectionist Learning (e.g. Neural Networks) Decision-Trees, Genetic Algorithms Part 2: Norman Hendrich Support-Vector Machines
More informationNon-Parametric Bayes
Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationCovariance Matrix Simplification For Efficient Uncertainty Management
PASEO MaxEnt 2007 Covariance Matrix Simplification For Efficient Uncertainty Management André Jalobeanu, Jorge A. Gutiérrez PASEO Research Group LSIIT (CNRS/ Univ. Strasbourg) - Illkirch, France *part
More informationBayesian Learning in Undirected Graphical Models
Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ and Center for Automated Learning and
More informationNaïve Bayes classification
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationChapter 20. Deep Generative Models
Peng et al.: Deep Learning and Practice 1 Chapter 20 Deep Generative Models Peng et al.: Deep Learning and Practice 2 Generative Models Models that are able to Provide an estimate of the probability distribution
More informationBayesian Machine Learning
Bayesian Machine Learning Andrew Gordon Wilson ORIE 6741 Lecture 2: Bayesian Basics https://people.orie.cornell.edu/andrew/orie6741 Cornell University August 25, 2016 1 / 17 Canonical Machine Learning
More informationInverse problems in imaging science: From classical regularization methods To state of the art Bayesian methods
A. Mohammad-Djafari, Inverse problems in imaging science:..., Tutorial presentation, IPAS 2014: Tunisia, Nov. 5-7, 2014, 1/76. Inverse problems in imaging science: From classical regularization methods
More informationVariable selection for model-based clustering
Variable selection for model-based clustering Matthieu Marbac (Ensai - Crest) Joint works with: M. Sedki (Univ. Paris-sud) and V. Vandewalle (Univ. Lille 2) The problem Objective: Estimation of a partition
More informationVariational Principal Components
Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings
More informationProbabilistic & Unsupervised Learning
Probabilistic & Unsupervised Learning Week 2: Latent Variable Models Maneesh Sahani maneesh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit, and MSc ML/CSML, Dept Computer Science University College
More informationSTA414/2104. Lecture 11: Gaussian Processes. Department of Statistics
STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations
More informationSparse Bayesian Logistic Regression with Hierarchical Prior and Variational Inference
Sparse Bayesian Logistic Regression with Hierarchical Prior and Variational Inference Shunsuke Horii Waseda University s.horii@aoni.waseda.jp Abstract In this paper, we present a hierarchical model which
More informationComputer Vision Group Prof. Daniel Cremers. 4. Gaussian Processes - Regression
Group Prof. Daniel Cremers 4. Gaussian Processes - Regression Definition (Rep.) Definition: A Gaussian process is a collection of random variables, any finite number of which have a joint Gaussian distribution.
More informationOutline Lecture 2 2(32)
Outline Lecture (3), Lecture Linear Regression and Classification it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic
More informationOutline lecture 2 2(30)
Outline lecture 2 2(3), Lecture 2 Linear Regression it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic Control
More informationCSC321 Lecture 18: Learning Probabilistic Models
CSC321 Lecture 18: Learning Probabilistic Models Roger Grosse Roger Grosse CSC321 Lecture 18: Learning Probabilistic Models 1 / 25 Overview So far in this course: mainly supervised learning Language modeling
More informationClustering bi-partite networks using collapsed latent block models
Clustering bi-partite networks using collapsed latent block models Jason Wyse, Nial Friel & Pierre Latouche Insight at UCD Laboratoire SAMM, Université Paris 1 Mail: jason.wyse@ucd.ie Insight Latent Space
More informationAn introduction to Bayesian inference and model comparison J. Daunizeau
An introduction to Bayesian inference and model comparison J. Daunizeau ICM, Paris, France TNU, Zurich, Switzerland Overview of the talk An introduction to probabilistic modelling Bayesian model comparison
More informationHierarchical Sparse Bayesian Learning. Pierre Garrigues UC Berkeley
Hierarchical Sparse Bayesian Learning Pierre Garrigues UC Berkeley Outline Motivation Description of the model Inference Learning Results Motivation Assumption: sensory systems are adapted to the statistical
More informationComputer Vision Group Prof. Daniel Cremers. 9. Gaussian Processes - Regression
Group Prof. Daniel Cremers 9. Gaussian Processes - Regression Repetition: Regularized Regression Before, we solved for w using the pseudoinverse. But: we can kernelize this problem as well! First step:
More informationParametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory
Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007
More information13: Variational inference II
10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational
More informationGeneral Bayesian Inference I
General Bayesian Inference I Outline: Basic concepts, One-parameter models, Noninformative priors. Reading: Chapters 10 and 11 in Kay-I. (Occasional) Simplified Notation. When there is no potential for
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationBayesian inference J. Daunizeau
Bayesian inference J. Daunizeau Brain and Spine Institute, Paris, France Wellcome Trust Centre for Neuroimaging, London, UK Overview of the talk 1 Probabilistic modelling and representation of uncertainty
More informationBayesian Learning (II)
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen Bayesian Learning (II) Niels Landwehr Overview Probabilities, expected values, variance Basic concepts of Bayesian learning MAP
More informationCOS513 LECTURE 8 STATISTICAL CONCEPTS
COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions
More informationStatistical learning. Chapter 20, Sections 1 4 1
Statistical learning Chapter 20, Sections 1 4 Chapter 20, Sections 1 4 1 Outline Bayesian learning Maximum a posteriori and maximum likelihood learning Bayes net learning ML parameter learning with complete
More informationResults: MCMC Dancers, q=10, n=500
Motivation Sampling Methods for Bayesian Inference How to track many INTERACTING targets? A Tutorial Frank Dellaert Results: MCMC Dancers, q=10, n=500 1 Probabilistic Topological Maps Results Real-Time
More informationan introduction to bayesian inference
with an application to network analysis http://jakehofman.com january 13, 2010 motivation would like models that: provide predictive and explanatory power are complex enough to describe observed phenomena
More informationSTA 4273H: Sta-s-cal Machine Learning
STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our
More informationA graph contains a set of nodes (vertices) connected by links (edges or arcs)
BOLTZMANN MACHINES Generative Models Graphical Models A graph contains a set of nodes (vertices) connected by links (edges or arcs) In a probabilistic graphical model, each node represents a random variable,
More informationRecent Advances in Bayesian Inference Techniques
Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian
More informationIntroduction: MLE, MAP, Bayesian reasoning (28/8/13)
STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationGAUSSIAN PROCESS REGRESSION
GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The
More informationChris Bishop s PRML Ch. 8: Graphical Models
Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationIEOR E4570: Machine Learning for OR&FE Spring 2015 c 2015 by Martin Haugh. The EM Algorithm
IEOR E4570: Machine Learning for OR&FE Spring 205 c 205 by Martin Haugh The EM Algorithm The EM algorithm is used for obtaining maximum likelihood estimates of parameters when some of the data is missing.
More informationLecture 9: PGM Learning
13 Oct 2014 Intro. to Stats. Machine Learning COMP SCI 4401/7401 Table of Contents I Learning parameters in MRFs 1 Learning parameters in MRFs Inference and Learning Given parameters (of potentials) and
More informationProbabilistic Reasoning in Deep Learning
Probabilistic Reasoning in Deep Learning Dr Konstantina Palla, PhD palla@stats.ox.ac.uk September 2017 Deep Learning Indaba, Johannesburgh Konstantina Palla 1 / 39 OVERVIEW OF THE TALK Basics of Bayesian
More informationThe Expectation Maximization or EM algorithm
The Expectation Maximization or EM algorithm Carl Edward Rasmussen November 15th, 2017 Carl Edward Rasmussen The EM algorithm November 15th, 2017 1 / 11 Contents notation, objective the lower bound functional,
More informationLinear Dynamical Systems
Linear Dynamical Systems Sargur N. srihari@cedar.buffalo.edu Machine Learning Course: http://www.cedar.buffalo.edu/~srihari/cse574/index.html Two Models Described by Same Graph Latent variables Observations
More informationPattern Recognition and Machine Learning. Bishop Chapter 9: Mixture Models and EM
Pattern Recognition and Machine Learning Chapter 9: Mixture Models and EM Thomas Mensink Jakob Verbeek October 11, 27 Le Menu 9.1 K-means clustering Getting the idea with a simple example 9.2 Mixtures
More informationBayesian Knowledge Fusion in Prognostics and Health Management A Case Study
Bayesian Knowledge Fusion in Prognostics and Health Management A Case Study Masoud Rabiei Mohammad Modarres Center for Risk and Reliability University of Maryland-College Park Ali Moahmamd-Djafari Laboratoire
More informationMarkov Chain Monte Carlo methods
Markov Chain Monte Carlo methods Tomas McKelvey and Lennart Svensson Signal Processing Group Department of Signals and Systems Chalmers University of Technology, Sweden November 26, 2012 Today s learning
More informationCSE 473: Artificial Intelligence Autumn Topics
CSE 473: Artificial Intelligence Autumn 2014 Bayesian Networks Learning II Dan Weld Slides adapted from Jack Breese, Dan Klein, Daphne Koller, Stuart Russell, Andrew Moore & Luke Zettlemoyer 1 473 Topics
More informationCSci 8980: Advanced Topics in Graphical Models Gaussian Processes
CSci 8980: Advanced Topics in Graphical Models Gaussian Processes Instructor: Arindam Banerjee November 15, 2007 Gaussian Processes Outline Gaussian Processes Outline Parametric Bayesian Regression Gaussian
More informationBayesian Paradigm. Maximum A Posteriori Estimation
Bayesian Paradigm Maximum A Posteriori Estimation Simple acquisition model noise + degradation Constraint minimization or Equivalent formulation Constraint minimization Lagrangian (unconstraint minimization)
More informationPattern Recognition and Machine Learning. Bishop Chapter 6: Kernel Methods
Pattern Recognition and Machine Learning Chapter 6: Kernel Methods Vasil Khalidov Alex Kläser December 13, 2007 Training Data: Keep or Discard? Parametric methods (linear/nonlinear) so far: learn parameter
More informationPart 1: Expectation Propagation
Chalmers Machine Learning Summer School Approximate message passing and biomedicine Part 1: Expectation Propagation Tom Heskes Machine Learning Group, Institute for Computing and Information Sciences Radboud
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 20: Expectation Maximization Algorithm EM for Mixture Models Many figures courtesy Kevin Murphy s
More informationThe Particle Filter. PD Dr. Rudolph Triebel Computer Vision Group. Machine Learning for Computer Vision
The Particle Filter Non-parametric implementation of Bayes filter Represents the belief (posterior) random state samples. by a set of This representation is approximate. Can represent distributions that
More informationThe Expectation-Maximization Algorithm
1/29 EM & Latent Variable Models Gaussian Mixture Models EM Theory The Expectation-Maximization Algorithm Mihaela van der Schaar Department of Engineering Science University of Oxford MLE for Latent Variable
More informationIntroduction to Machine Learning. Maximum Likelihood and Bayesian Inference. Lecturers: Eran Halperin, Lior Wolf
1 Introduction to Machine Learning Maximum Likelihood and Bayesian Inference Lecturers: Eran Halperin, Lior Wolf 2014-15 We know that X ~ B(n,p), but we do not know p. We get a random sample from X, a
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2016
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2016 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationBetter restore the recto side of a document with an estimation of the verso side: Markov model and inference with graph cuts
June 23 rd 2008 Better restore the recto side of a document with an estimation of the verso side: Markov model and inference with graph cuts Christian Wolf Laboratoire d InfoRmatique en Image et Systèmes
More informationBAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA
BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci
More information