Bayesian inference methods for source separation


Slide 1/12: Bayesian inference methods for source separation

Ali Mohammad-Djafari
Laboratoire des Signaux et Systèmes, UMR 8506 CNRS-SUPELEC-Univ Paris-Sud 11
SUPELEC, Gif-sur-Yvette, France

(Presented at BeBeC 2012, February 22-23, 2012, Berlin, Germany.)

Slide 2/12: General source separation problem

$$g(t) = A f(t) + \epsilon(t), \quad t \in [1, \ldots, T]$$
$$g(r) = A f(r) + \epsilon(r), \quad r = (x, y) \in R^2$$

- $f$: unknown sources
- $A$: mixing matrix, with steering vectors $a_j$
- $g$: observed signals
- $\epsilon$: represents the errors of modeling and measurement

$$g = A f \quad\Longleftrightarrow\quad g_i = \sum_j a_{ij} f_j \quad\Longleftrightarrow\quad g = \sum_j a_j f_j$$

For the $2 \times 2$ case:

$$\begin{bmatrix} g_1 \\ g_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = \begin{bmatrix} f_1 & 0 & f_2 & 0 \\ 0 & f_1 & 0 & f_2 \end{bmatrix} \begin{bmatrix} a_{11} \\ a_{21} \\ a_{12} \\ a_{22} \end{bmatrix}$$

so that $g = A f = F a$ with $F = f^\top \otimes I$ and $a = \mathrm{vec}(A)$.

Three problems:
- $A$ known, estimation of $f$: $g = A f + \epsilon$
- $f$ known, estimation of $A$: $g = F a + \epsilon$
- Joint estimation of $f$ and $A$: $g = A f + \epsilon = F a + \epsilon$
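As a concrete illustration of the forward model and of the equivalence $g = Af = Fa$, here is a minimal numerical sketch; all sizes, the toy sources and the matrix $A$ are made up for the example and are not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not taken from the slides
T, n_sensors, n_sources = 200, 2, 2
t = np.arange(T)

# Two toy sources f(t) and a 2x2 mixing matrix A
f = np.vstack([np.sin(0.05 * t), np.sign(np.sin(0.021 * t))])  # shape (n_sources, T)
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
v_eps = 0.01  # noise variance

# Observations g(t) = A f(t) + eps(t), with eps ~ N(0, v_eps I)
g = A @ f + np.sqrt(v_eps) * rng.standard_normal((n_sensors, T))

# The same data in the vectorized form g = F a, F = f^T (kron) I, a = vec(A)
F0 = np.kron(f[:, 0][None, :], np.eye(n_sensors))  # F for the first time sample
a = A.flatten(order="F")                           # a = vec(A), column-major
assert np.allclose(F0 @ a, A @ f[:, 0])
```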

Slide 3/12: General Bayesian source separation problem

$$p(f, A \mid g, \theta_1, \theta_2, \theta_3) = \frac{p(g \mid f, A, \theta_1)\, p(f \mid \theta_2)\, p(A \mid \theta_3)}{p(g \mid \theta_1, \theta_2, \theta_3)}$$

- $p(g \mid f, A, \theta_1)$: likelihood
- $p(f \mid \theta_2)$ and $p(A \mid \theta_3)$: priors
- $p(f, A \mid g, \theta_1, \theta_2, \theta_3)$: joint posterior
- $\theta = (\theta_1, \theta_2, \theta_3)$: hyperparameters

Two approaches:
- Estimate $A$ first and then use it to estimate $f$
- Joint estimation

In real applications, we also have to estimate $\theta$:

$$p(f, A, \theta \mid g) = \frac{p(g \mid f, A, \theta_1)\, p(f \mid \theta_2)\, p(A \mid \theta_3)\, p(\theta)}{p(g)}$$

Slide 4/12: Bayesian inference for the sources f when A is known

$$g = A f + \epsilon$$

Prior knowledge on $\epsilon$: $\epsilon \sim \mathcal{N}(\epsilon \mid 0, v_\epsilon I)$, so

$$p(g \mid f, A) = \mathcal{N}(g \mid A f, v_\epsilon I) \propto \exp\left\{-\frac{1}{2 v_\epsilon}\|g - A f\|^2\right\}$$

Simple prior models for $f$: $p(f \mid \alpha) \propto \exp\{-\alpha\, \Omega(f)\}$

Expression of the posterior law:

$$p(f \mid g, A) \propto p(g \mid f, A)\, p(f) \propto \exp\{-J(f)\}, \quad \text{with } J(f) = \frac{1}{2 v_\epsilon}\|g - A f\|^2 + \alpha\, \Omega(f)$$

Link between MAP estimation and regularization: maximizing $p(f \mid \theta, g)$ is equivalent to minimizing $J(f)$ over $f$.
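To make the MAP/regularization link concrete, here is a minimal sketch that minimizes $J(f)$ with a generic optimizer; it assumes the smooth Student-t penalty of the next slide, and all data and hyperparameter values are placeholders:

```python
import numpy as np
from scipy.optimize import minimize

def map_estimate(g, A, v_eps, alpha, nu=3.0):
    """MAP estimate of f: minimize J(f) = ||g - A f||^2 / (2 v_eps) + alpha * Omega(f).

    Omega is taken here as the (smooth) Student-t penalty of the next slide,
    Omega(f) = (nu+1)/2 * sum_j log(1 + f_j^2/nu); any other Omega would do.
    """
    def J(f):
        r = g - A @ f
        return r @ r / (2 * v_eps) + alpha * 0.5 * (nu + 1) * np.sum(np.log1p(f**2 / nu))

    f0 = np.linalg.lstsq(A, g, rcond=None)[0]   # least-squares initialization
    return minimize(J, f0, method="BFGS").x

# Toy usage with made-up numbers
rng = np.random.default_rng(1)
A = rng.standard_normal((8, 4))
f_true = np.array([0.0, 2.0, 0.0, -1.0])
g = A @ f_true + 0.1 * rng.standard_normal(8)
print(map_estimate(g, A, v_eps=0.01, alpha=1.0))
```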

Slide 5/12: MAP and the link with regularization

- Gaussian: $\Omega(f) = \|f\|^2 = \sum_j |f_j|^2$, so
  $$J(f) = \frac{1}{2 v_\epsilon}\|g - A f\|^2 + \alpha \|f\|^2 \quad\Longrightarrow\quad \widehat{f} = [A^\top A + \lambda I]^{-1} A^\top g$$
- Generalized Gaussian: $\Omega(f) = \gamma \sum_j |f_j|^\beta$
- Student-t model: $\Omega(f) = \frac{\nu+1}{2} \sum_j \log\left(1 + f_j^2/\nu\right)$
- Elastic Net model: $\Omega(f) = \sum_j \left[\gamma_1 |f_j| + \gamma_2 f_j^2\right]$
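For the Gaussian prior the minimizer is available in closed form; a small sketch (setting $\nabla J = 0$ for the $J(f)$ above gives $\lambda = 2\alpha v_\epsilon$):

```python
import numpy as np

def ridge_map(g, A, lam):
    """Gaussian-prior MAP: f_hat = (A^T A + lam I)^{-1} A^T g.

    For J(f) = ||g - A f||^2 / (2 v_eps) + alpha ||f||^2, setting the
    gradient to zero gives lam = 2 * alpha * v_eps.
    """
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ g)
```

The same regularized inverse reappears on slides 9-11 with $\lambda = v_\epsilon / v_f$.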

Slide 6/12: Full Bayesian and Variational Bayesian Approximation

Full Bayesian: $p(f, \theta \mid g) \propto p(g \mid f, \theta_1)\, p(f \mid \theta_2)\, p(\theta)$

Approximate $p(f, \theta \mid g)$ by a separable $q(f, \theta) = q_1(f)\, q_2(\theta)$ and then continue the computations with $q$.

Criterion: $\mathrm{KL}(q(f, \theta) : p(f, \theta \mid g))$, with

$$\mathrm{KL}(q : p) = \iint q \ln \frac{q}{p} = \iint q_1 q_2 \ln \frac{q_1 q_2}{p}$$

Iterative algorithm $q_1 \to q_2 \to q_1 \to q_2 \to \cdots$:

$$q_1(f) \propto \exp\left\{\big\langle \ln p(g, f, \theta; \mathcal{M}) \big\rangle_{q_2(\theta)}\right\}$$
$$q_2(\theta) \propto \exp\left\{\big\langle \ln p(g, f, \theta; \mathcal{M}) \big\rangle_{q_1(f)}\right\}$$

(Diagram: $p(f, \theta \mid g)$ → Variational Bayesian Approximation → $q_1(f) \to \widehat{f}$ and $q_2(\theta) \to \widehat{\theta}$.)
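The slide states the $q_1/q_2$ alternation abstractly. As an illustration only, here is the classic toy instance (Gaussian data with unknown mean and precision, see e.g. Bishop, PRML §10.1.3); it is not the separation model of these slides, but it runs exactly this alternation:

```python
import numpy as np

def vba_gaussian(x, mu0=0.0, lam0=1.0, a0=1e-3, b0=1e-3, n_iter=20):
    """Toy VBA: data x_i ~ N(mu, 1/tau), factorized q(mu, tau) = q1(mu) q2(tau).

    Each factor update takes the expectation of ln p(x, mu, tau) under the
    other factor, which is the q1/q2 alternation stated on the slide.
    """
    x = np.asarray(x, dtype=float)
    N, xbar = x.size, x.mean()
    E_tau = 1.0                      # initial guess for <tau>
    a_N = a0 + 0.5 * (N + 1)         # fixed over the iterations
    for _ in range(n_iter):
        # q1(mu) = N(mu_N, 1/lam_N), computed under q2(tau)
        mu_N = (lam0 * mu0 + N * xbar) / (lam0 + N)
        lam_N = (lam0 + N) * E_tau
        # q2(tau) = Gamma(a_N, b_N), computed under q1(mu)
        b_N = b0 + 0.5 * (np.sum((x - mu_N) ** 2) + N / lam_N
                          + lam0 * ((mu_N - mu0) ** 2 + 1.0 / lam_N))
        E_tau = a_N / b_N
    return mu_N, lam_N, a_N, b_N

rng = np.random.default_rng(2)
print(vba_gaussian(rng.normal(1.5, 0.5, size=100)))  # approximate posterior of (mu, tau)
```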

Slide 7/12: Estimation of A when the sources f are known

Source separation is a bilinear model:

$$g = A f = F a: \quad \begin{bmatrix} g_1 \\ g_2 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}\begin{bmatrix} f_1 \\ f_2 \end{bmatrix} = \begin{bmatrix} f_1 & 0 & f_2 & 0 \\ 0 & f_1 & 0 & f_2 \end{bmatrix}\begin{bmatrix} a_{11} \\ a_{21} \\ a_{12} \\ a_{22} \end{bmatrix}, \quad F = f^\top \otimes I,\; a = \mathrm{vec}(A)$$

This problem is more ill-posed. We absolutely need to impose constraints on the elements or the structure of $A$, for example:
- Positivity of the elements
- Toeplitz or TBBT structure
- Symmetry: $p(A) \propto \exp\left\{-\alpha \|I - A^\top A\|^2\right\}$
- Sparsity: $p(A) \propto \exp\left\{-\alpha \sum_{i,j} |A_{ij}|\right\}$

The same Bayesian approach can then be applied (see the sketch below).
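A minimal sketch of the regularized estimate of $A$ from known sources. The positivity constraint is imposed here by a crude projection, which stands in for (but is not identical to) a proper positivity prior:

```python
import numpy as np

def estimate_A(G, F, lam_a, positive=False):
    """Regularized least-squares estimate of the mixing matrix given known sources.

    G: (n_sensors, T) observations, F: (n_sources, T) sources.
    A_hat = (sum_t g(t) f(t)^T) (sum_t f(t) f(t)^T + lam_a I)^{-1};
    positive=True crudely enforces positivity by projection.
    """
    n = F.shape[0]
    A_hat = (G @ F.T) @ np.linalg.inv(F @ F.T + lam_a * np.eye(n))
    if positive:
        A_hat = np.clip(A_hat, 0.0, None)   # projection, not an exact constrained MAP
    return A_hat
```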

Slide 8/12: General case: joint estimation of A and f

(Graphical model: $v_0 \to f(t)$, $(A_0, V_0) \to A$, $v_\epsilon \to \epsilon(t)$; then $f(t)$, $A$ and $\epsilon(t)$ combine to give $g(t)$.)

Priors:

$$p(f_j(t) \mid v_{0_j}) = \mathcal{N}(0, v_{0_j}) \;\Longrightarrow\; p(f(t) \mid v_0) \propto \exp\left\{-\frac{1}{2} \sum_j f_j^2(t)/v_{0_j}\right\}$$
$$p(A_{ij} \mid A_{0_{ij}}, V_{0_{ij}}) = \mathcal{N}(A_{0_{ij}}, V_{0_{ij}}) \;\Longrightarrow\; p(A \mid A_0, V_0) = \mathcal{N}(A_0, V_0)$$

Likelihood:

$$p(g(t) \mid A, f(t), v_\epsilon) = \mathcal{N}(A f(t), v_\epsilon I)$$

Joint posterior:

$$p(f_{1..T}, A \mid g_{1..T}) \propto p(g_{1..T} \mid A, f_{1..T}, v_\epsilon)\, p(f_{1..T})\, p(A \mid A_0, V_0) \propto \prod_t p(g(t) \mid A, f(t), v_\epsilon)\, p(f(t) \mid v_0)\; p(A \mid A_0, V_0)$$

Both conditional posteriors are Gaussian:

$$p(f(t) \mid g_{1..T}, A, v_\epsilon, v_0) = \mathcal{N}(\widehat{f}(t), \widehat{\Sigma}), \qquad p(A \mid g_{1..T}, f_{1..T}, v_\epsilon, A_0, V_0) = \mathcal{N}(\widehat{A}, \widehat{V})$$

Slide 9/12: Joint estimation of A and f (continued)

Simplifying assumptions:
- $v_0 = [v_f, \ldots, v_f]$: all sources a priori have the same variance $v_f$
- $v_\epsilon = [v_\epsilon, \ldots, v_\epsilon]$: all noise terms a priori have the same variance $v_\epsilon$
- $A_0 = 0$, $V_0 = v_a I$

$$p(f(t) \mid g(t), A, v_\epsilon, v_0) = \mathcal{N}(\widehat{f}(t), \widehat{\Sigma}): \quad \widehat{\Sigma} = (A^\top A + \lambda_f I)^{-1}, \quad \widehat{f}(t) = (A^\top A + \lambda_f I)^{-1} A^\top g(t), \quad \lambda_f = v_\epsilon / v_f$$

$$p(A \mid g_{1..T}, f_{1..T}, v_\epsilon, A_0, V_0) = \mathcal{N}(\widehat{A}, \widehat{V}): \quad \widehat{V} = (F^\top F + \lambda_a I)^{-1}, \quad \widehat{A} = \sum_t g(t) f^\top(t) \left(\sum_t f(t) f^\top(t) + \lambda_a I\right)^{-1}, \quad \lambda_a = v_\epsilon / v_a$$

Slide 10/12: Joint estimation of A and f (continued)

$$p(f_{1..T}, A \mid g_{1..T}) \propto \prod_t p(g(t) \mid A, f(t), v_\epsilon)\, p(f(t) \mid v_0)\; p(A \mid A_0, V_0)$$

Joint MAP by alternate optimization:

$$\widehat{f}(t) = (\widehat{A}^\top \widehat{A} + \lambda_f I)^{-1} \widehat{A}^\top g(t), \quad \lambda_f = v_\epsilon / v_f$$
$$\widehat{A} = \sum_t g(t) \widehat{f}^\top(t) \left(\sum_t \widehat{f}(t) \widehat{f}^\top(t) + \lambda_a I\right)^{-1}, \quad \lambda_a = v_\epsilon / v_a$$

Alternate optimization algorithm: start from an initial $A^{(0)}$ and iterate the two updates — $\widehat{f}(t)$ from the current $\widehat{A}$, then $\widehat{A}$ from the current $\widehat{f}(t)$ — until convergence.
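A minimal sketch of this alternate-optimization (JMAP) loop, directly implementing the two updates above. As with any alternating scheme for this bilinear problem, it only reaches a local optimum and leaves the usual scale/permutation ambiguity of source separation:

```python
import numpy as np

def jmap_separate(G, n_sources, v_eps, v_f, v_a, n_iter=50, seed=0):
    """Alternate-optimization JMAP of slide 10 (a minimal sketch).

    G: (n_sensors, T) observations. Returns (A_hat, F_hat).
    """
    m, T = G.shape
    lam_f, lam_a = v_eps / v_f, v_eps / v_a
    A = np.random.default_rng(seed).standard_normal((m, n_sources))  # A^(0)
    for _ in range(n_iter):
        # f_hat(t) = (A^T A + lam_f I)^{-1} A^T g(t), all t at once
        F = np.linalg.solve(A.T @ A + lam_f * np.eye(n_sources), A.T @ G)
        # A_hat = (sum_t g f^T)(sum_t f f^T + lam_a I)^{-1}
        A = (G @ F.T) @ np.linalg.inv(F @ F.T + lam_a * np.eye(n_sources))
    return A, F
```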

Slide 11/12: Joint estimation of A and f with a Gaussian prior model: VBA

VBA: approximate $p(f_{1..T}, A \mid g_{1..T})$ by $q_1(f_{1..T})\, q_2(A)$:

$$q_1(f(t)) = \mathcal{N}(\widehat{f}(t), \widehat{\Sigma}): \quad \widehat{\Sigma} = (\widehat{A}^\top \widehat{A} + \lambda_f \widehat{V})^{-1}, \quad \widehat{f}(t) = (\widehat{A}^\top \widehat{A} + \lambda_f \widehat{V})^{-1} \widehat{A}^\top g(t), \quad \lambda_f = v_\epsilon / v_f$$

$$q_2(A) = \mathcal{N}(\widehat{A}, \widehat{V}): \quad \widehat{V} = (\widehat{F}^\top \widehat{F} + \lambda_a \widehat{\Sigma})^{-1}, \quad \widehat{A} = \sum_t g(t) \widehat{f}^\top(t) \left(\sum_t \widehat{f}(t) \widehat{f}^\top(t) + \lambda_a \widehat{\Sigma}\right)^{-1}, \quad \lambda_a = v_\epsilon / v_a$$

Algorithm: start from $A^{(0)}, V^{(0)}$ and iterate: update $\widehat{f}(t), \widehat{\Sigma}$ from the current $(\widehat{A}, \widehat{V})$, then update $\widehat{A}, \widehat{V}$ from the current $(\widehat{f}(t), \widehat{\Sigma})$. Unlike the JMAP iteration of slide 10, each update carries the posterior covariance of the other variable.
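A minimal sketch of such a VBA loop. To keep it runnable, $q_2(A)$ is factorized over the rows of $A$, so that $\widehat{V}$ is a small matrix shared by all rows; the update constants then follow from the standard derivation for this Gaussian model and differ slightly from the compressed notation of the slide:

```python
import numpy as np

def vba_separate(G, n_sources, v_eps, v_f, v_a, n_iter=50, seed=0):
    """VBA for the Gaussian model (minimal sketch, q2(A) factorized over rows of A).

    Unlike JMAP, each update uses the other factor's posterior covariance:
    Sigma (sources) enters the update of (A_hat, V) and vice versa.
    """
    m, T = G.shape
    n = n_sources
    lam_f, lam_a = v_eps / v_f, v_eps / v_a
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))   # A^(0)
    V = v_a * np.eye(n)               # V^(0): common row covariance of A
    for _ in range(n_iter):
        # q1(f(t)) = N(f_hat(t), Sigma), using E[A^T A] = A_hat^T A_hat + m V
        Sigma = v_eps * np.linalg.inv(A.T @ A + m * V + lam_f * np.eye(n))
        F = (Sigma / v_eps) @ (A.T @ G)
        # q2(rows of A) = N(., V), using E[sum_t f f^T] = F F^T + T Sigma
        S = F @ F.T + T * Sigma + lam_a * np.eye(n)
        V = v_eps * np.linalg.inv(S)
        A = (G @ F.T) @ (V / v_eps)   # A_hat = G F^T S^{-1}
    return A, F, Sigma, V
```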

Slide 12/12: Conclusions

- General source separation problem:
  - Estimation of f when A is known
  - Estimation of A when the sources f are known
  - Joint estimation of the sources f and the mixing matrix A
- General Bayesian inference for source separation:
  - Full Bayesian with hyperparameter estimation
  - Priors which enforce sparsity: Generalized Gaussian, Student-t, mixture of Gaussians or Gammas, Bernoulli-Gaussian
- Computational tools: Laplace approximation, MCMC and Variational Bayesian Approximation
- Advanced Bayesian methods: non-Gaussian, dependent and nonstationary signals and images
- Some domains of application: source localization, spectrometry, CMB, satellite image separation, hyperspectral image processing
