Blind Source Separation (BSS) and Independent Componen Analysis (ICA) Massoud BABAIE-ZADEH Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.1/39
Outline Part I Part II Introduction Separability Part III problem Some famous approaches for solving BSS Part IV Part V Extensions to ICA Applications, my works and perspectives Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.2/39
Part I Introduction to BSS and ICA Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.3/39
Part I Blind Source Separation (BSS) s 1 a 11 x 1 b 11 y 1 a 21 b 21 a 12 b 12 s 2 a 22 x 2 b 22 y 2 Mixing System Separating System s i : Original source (assumed to be independent). x i : Received (mixed) signals. y i : Estimated sources. Goal: y i = s i Is it possible? Isn t it ill-posed? Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.4/39
Part I Some historical notes: Herault, Jutten and Ans work Observation in 1982: The angular position (p(t)) and the angular velocity (v(t) = dp(t)/dt) of a joint is represented by two nervous signals f 1 (t) and f 2 (t), each one is a linear combination of position and velocity: f 1 (t) = a 11 p(t) + a 12 v(t) f 2 (t) = a 21 p(t) + a 22 v(t) At each instant the nervous system knows p(t) and v(t) p(t) and v(t) must be recoverable only from f 1 (t) and f 2 (t) Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.5/39
Part I Herault and Jutten (HJ) Algorithm x 1 x 2 m 12 m 21 y 1 y 2 Presented in GRETSI 85, COGNITAVA 85 and Snowbird 86. Choosing m 12 and m 21 correctly results in separation: { y 1 = x 1 m 12 y 2 y = x My y = (I + M) 1 x y 2 = x 2 m 21 y 1 Main Idea: E {f(y 1 )g(y 2 )} = 0 Independence (ICA) The algorithm: m 12 m 12 µf(y 1 )g(y 2 ) m 21 m 21 µf(y 2 )g(y 1 ) Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.6/39
Part I Growing up interest mid 80 s mid 90 s: Slow development: Neural networks scientists were challenging some other topics! Mainly French scientists. 1988: J.-L Lacoume work based on cumulant. Starting 1989: P. Comon and J.-F. Cardoso s papers. 1994: Bell & Sejnowsky work. From mid 90 s: Exploring interest. 1999: First international conference, ICA 99 (France), collecting more than 100 researchers. ICA2000 (Finland), ICA2001 (USA), ICA2003 (Japan) and upcoming ICA2004 (Spain). Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.7/39
Part II Separability Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.8/39
Part II Linear (instantaneous) mixtures s x y A B s = (s 1,..., s N ) T : source vector. x = (x 1,..., x N ) T : observation vector. y = (y 1,..., y N ) T : output vector. x = As: unknown mixing system. y = Bx: separating system. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.9/39
Part II Linear (instantaneous) mixtures s x y A B Main assumption: The sources (s i s) are statistically independent. Separability: Does the independence of the outputs (ICA) imply the separation of the sources (BSS)? Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.10/39
Part II Counter-example s C y s 1 and s 2 independent N(0, 1). C an orthonormal (rotation) matrix. R y = E { yy T } = CR s C T = CC T = I y 1 and y 2 are independent. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.11/39
Part II Darmois Theorem (1947) y 1 = a 1 s 1 + a 2 s 2 + + a N s N y 2 = b 1 s 1 + b 2 s 2 + + b N s N s i s are independent y 1 and y 2 are independent If for an i we have a i b i 0 s i is Gaussian. If y 1 and y 2 are independent, a Non-Gaussian source cannot be present in both of them. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.12/39
Part II Separability of linear mixtures s x y A B Linear mixtures are separable, provided that there is no more than 1 Gaussian source ([Comon 91 & 94] inspired from [Darmois 1947]): Independence of the outputs Separation of the sources Indeterminacies are trivial: Permutation and Scale. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.13/39
Part II Geometric PSfrag replacements Interpretation [Puntonet et. al. GRETSI 95] s 1 s 2 rag replacements s2 1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 s 1 1 0.5 0 0.5 1 1.5 1 0.5 1 a x 2 0 w 2 0.5 1 1.5 1.5 1 0.5 0 0.5 1 1.5 w 1 x 1 b p s1 s 2 (s 1, s 2 ) = p s1 (s 1 )p s2 (s 2 ) p s2 s 1 (s 2 s 1 ) = p s2 (s 2 ) Bounded sources: x = As, A = The distribution of (s 1, s 2 ) forms a rectangular region [ 1 a b 1 ] Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.14/39
Part II Non-Gaussianity Non-Gassianity, isn t it too restrictive? Many practical signals (speech, PSK, bounded signals,... ) are not Gaussian. Cramer s Theorem: X = X 1 + X 2 + + X N. X i s independent. X is Gaussian All X i s must be Gaussian. Gaussian sources can be separated if there is some time dependence (non-iid) or non-stationarity. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.15/39
Part II Independence versus Decorrelation Question: Can we use decorrelation (2nd order independence) for source separation? NO! Example: x i s are decorrelated (R x = I). B any orthogonal (rotation) matrix. y = Bx R y = BR x B T = I y i s decorrelated. Decorrelation property remains unchanged under any orthogonal mixing matrix. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.16/39
Part II PCA Principle Component Analysis (PCA) = Karhunen-Loève Transform = Hotelling transform = Whitening R x = E { xx } T : Covariance matrix of x. E and Λ: Eigenvector and eigenvalue matrices of R x. W E T : The whitening matrix. y = Wx R y = Λ (diagonal) y i s are decorrelated. PCA is NOT sufficient for ICA (it leaves an unknown rotation). Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.17/39
Part II Another interpretation y = Bx B = [ 1 a b 1 ] 2 unkowns (a and b) must be determined. Decorrelation property (E {y 1 y 2 } = 0) gives only 1 equation Not sufficient. Decorrelation (2nd order independence) is not sufficient Higher Order Statistics (HOS). Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.18/39
Part II Summary of Part II s x y A B Linear mixtures can be separated (At most 1 Gaussian source). Remaining indeterminacies: Scale, Permutation. Output independence is sufficient for source separation. Independence cannot be reduced to decorrelation. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.19/39
Part III Some famous approaches for solving BSS problem Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.20/39
Part III Using Higher Order Statistics (HOS) techniques 4th order independence is sufficient. Higher order characteristics of a random variable is usually described using its cumulants. cumulants: Coefficients of Taylor series of the second characteristic function Ψ x (s) = ln Φ x (s) = ln E {e sx }. Cross-cumulants: Coefficients of Taylor series of Ψ x1 x 2 (s 1, s 2 ) = ln E {e s 1x 1 +s 2 x 2 }. 4th order independence Cancelling Cum 13 (y 1, y 2 ), Cum 22 (y 1, y 2 ) and Cum 31 (y 1, y 2 ). Requires non-zero 4th order statistics, Only for linear mixtures. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.21/39
Part III Mutual Information, an independence criterion Independence of x = (x 1,..., x N ) T p x (x) = N i=1 p x i (x i ) ( ) N I(x) = KL p x (x) p xi (x i ) i=1 = i H(x i) H(x) = x p x (x) ln p x (x) i p x i (x i ) dx H Shannon s entropy H(x) = E {p x (x)} Main property: I(x) 0. I(x) = 0 iff x 1,..., x N are independent. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.22/39
Part III Minimizing output mutual information I(y) = E { ψ B y (y)x } T B T. ψ y (y) (ψ y1 (y 1 ),..., ψ yn (y N )) T. ψ yi (y i ) d dy i ln p yi (y i ). Steepest descent: B B µ B I(y). Equivarient algorithm [Cardoso&Laheld 96]: B B µ B I(y)B B I(y) = B I(y)BT = E { ψ y (y)y T } I. Not applicable for more complicated mixtures (I(y + ) I(y) =?). Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.23/39
Part III Some definitions Score function of a random variable x: ψ x (x) d dx ln p x(x) For a random vector x = (x 1,..., x N ) T : Marginal Score Function (MSF): ψ x (x) (ψ x1 (x 1 ),..., ψ N (x N )) T, ψ i (x i ) d ln p xi (x i ) dx i Joint Score Function (JSF): ϕ x (x) (ϕ 1 (x 1 ),..., ϕ N (x N )) T, ϕ i (x) ln p x (x) x i Score Function Difference (SFD): β x (x) ψ x (x) ϕ x (x) Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.24/39
Part III Differential of mutual information I(x + ) I(x) = E { T β x (x) } + o( ) For a differentiable multi-variate function: f(x + ) f(x) = T ( f(x)) + o( ) SFD can be called the stochastic gradient of the mutual information. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.25/39
Part III Some other ideas for source separation Maximizing Non-Gaussianity of the outputs. x 1 = a 11 s 1 + a 12 s 2 + + a 1N s N : each x i is more Gaussian than all sources. y 1 = b 11 x 1 + b 12 x 2 + + b 1N x N : Determine b 1i s to produce as non-gaussian as possible y 1 Separation. Measure of non-gaussianity: Neg-entropy. Example: FastICA algorithm [Hyvärinen 99]. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.26/39
Part III Some other ideas for source separation (continued.) Second order approaches (applicable for Gaussian sources, too): Exploiting time correlation E {y 1 (n)y 2 (n)} = 0 and E {y 1 (n)y 2 (n 1)} = 0. Requires time correlation (non-applicable for iid sources). Exploiting non-stationarity: Joint diagonalization of Covariance matrix [Pham 2001]. Requires non-stationarity. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.27/39
Part III Non-separable source Non-separable if ALL these three properties: Gaussian. iid. stationary. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.28/39
Part III Summary of Part III Algorithms based on output independence: Cancelling 4th order cross-cumulants. Minimizing mutual information. Algorithms based on non-gaussianity. Second order algorithms: Algorithms based on time correlation. Algorithms based on non-stationarity. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.29/39
Part IV Extensions to ICA Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.30/39
Part IV Extensions to linear instantaneous mixtures Complex signals. Noisy ICA: x = As + n Different number of sources and sensors: Overdetermined mixtures. Estimating number of sources? Underdetermined mixtures. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.31/39
Part IV Convolutive Mixtures s 1 a 11 x 1 b 11 y 1 a 21 b 21 a 12 b 12 s 2 a 22 x 2 b 22 y 2 Mixing System Separating System Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.32/39
Part IV Convolutive Mixtures s 1 A 11 (z) A 21 (z) x 1 B 11 (z) B 21 (z) y 1 A 12 (z) B 12 (z) s 2 A 22 (z) x 2 B 22 (z) y 2 Mixing System Separating System Separation system: y(n) = B 0 x(n) + B 1 x(n 1) + + B M x(n M) Extension to the Widrow s noise canceller system. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.32/39
Part IV Convolutive Mixtures s 1 A 11 (z) A 21 (z) x 1 B 11 (z) B 21 (z) y 1 A 12 (z) B 12 (z) s 2 A 22 (z) x 2 B 22 (z) y 2 Mixing System Separating System Separation system: y(n) = B 0 x(n) + B 1 x(n 1) + + B M x(n M) Extension to the Widrow s noise canceller system. Convolutive mixtures are separable, too [Yellin, Weinstein, 95]: Output independence Separation. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.32/39
Part IV Non-linear Mixtures s x y A B Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.33/39
Part IV Non-linear Mixtures s x y F G Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.33/39
Part IV Non-linear Mixtures s x y F G In general, non-linear mixtures are not separable: Output Independence source separation Independence is not strong enough for source separation. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.33/39
Part IV Non-linear Mixtures s x y F G How to overcome this problem? Regularization techniques (smoothness)? Structural constraints Others (temporal correlation? non-stationarity?) Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.33/39
Part IV PNL (Post Non-Linear) mixtures s 1. s N A w 1 w N f 1. f N e 1 e N g 1 g N x 1 x N B Mixing System Separating System. y 1. y N Separability theorem [Taleb & Jutten, IEEE trans. SP, 99]: The outputs are independent iff: g i = f 1 i BA = PD Provided that: The sources are really mixed (at least 2 non-zero entries in each row of A). Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.34/39
Part IV CPNL (Convolutive PNL) mixtures s 1. s N A(z) w 1 w N f 1. f N e 1 e N g 1. g N x 1 x N B(z) Mixing System Separating System Separability: s =. s T (n 1) s T (n) s T (n + 1).. e T (n 1), e = e T (n) e T (n + 1)., Ā = A(z) = n A nz n, e = f ( Ās ) y 1. y N A n+1 A n A n 1 A n+2 A n+1 A n Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.35/39
Part V Applications, my works and perspectives Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.36/39
Part V Applications Feature Extraction. Image denoising (using noisy ICA methods). Medical engineering applications (ECG, EEG, MEG, Artifact separation). Telecommunications (Blind Channel Equalization, CDMA). Financial applications. Audio separation. Seismic applications. Astronomy. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.37/39
Part V My works, mainly at my PhD thesis CPNL mixtures. Gradient of mutual information (SFD): General approach for any (separable) parametric model. Gradient approach. Minimization-Projection approach. Special cases: Linear, convolutive and PNL. Proof of separability of PNL mixtures. A geometric method for separating PNL mixtures (compensating sensors nonlinearities before separation). Post Convolutive mixtures and their properties. Even smooth non-linear systems may preserve the independence. (Not at my PhD thesis) Blind estimation of a Wiener telecommunication channel (linear channel + nonlinear receiver). Manuscript downloadable from: http://www.lis.inpg.fr/theses Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.38/39
Part V Perspectives Continuation of my previous works Writing 2 papers. Adaptive algorithms. Underdetermined mixtures: it seems that the minimization-projection approach can be used for identifying (but not separating) such systems. Working on PNL-L mixtures:. B fn A f 1.. PNL mixtures: Compensating sensor non-linearities before separation. Further work on developed algorithms (improvements, convergence analysis,... ) Searching for nding better (maybe optimal) SFD estimators. A few other small ideas. Working on audio source separation. Blind Source Separation (BSS) and Independent Componen Analysis (ICA) p.39/39