Independent Subspace Analysis
Barnabás Póczos
Supervisor: Dr. András Lőrincz
Eötvös Loránd University, Neural Information Processing Group, Budapest, Hungary
MPI, Tübingen, 24 July 2007
Independent Component Analysis
Sources $s(t)$; mixing $x(t) = A s(t)$ (observation); estimation $y(t) = W x(t)$.
Some ICA Applications
- Blind source separation
- Image denoising
- Medical signal processing: fMRI, ECG, EEG
- Modeling of the hippocampus
- Modeling of the visual cortex
- Feature extraction
- Face recognition
- Time series analysis
- Financial applications
Independent Subspace Analysis
Sources $s^1, \dots, s^m \in \mathbb{R}^d$; observation $x^1, \dots, x^m \in \mathbb{R}^d$; estimation $y^1, \dots, y^m \in \mathbb{R}^d$.
$A \in \mathbb{R}^{md \times md}$, $W \in \mathbb{R}^{md \times md}$,
$s = (s^{1T}, \dots, s^{mT})^T \in \mathbb{R}^{dm}$, $x = (x^{1T}, \dots, x^{mT})^T \in \mathbb{R}^{dm}$, $y = (y^{1T}, \dots, y^{mT})^T \in \mathbb{R}^{dm}$,
$x = As$, $y = Wx$.
Independent Subspace Analysis: The Efforts
- Cardoso, 1998
- Akaho et al., 1999: kernel methods for mutual information estimation
- Theis, 2003: processing of EEG-fMRI data; 2-dimensional Edgeworth expansion
- Bach & Jordan, 2003: conjecture that ICA preprocessing followed by permutation solves ISA
- Ambiguity issues, uniqueness of ISA
- Hyvärinen & Köster, 2006: FastISA, a fast fixed-point algorithm for independent subspace analysis
Independent Subspace Analysis: The Ambiguity
Ambiguity of ICA: sources can be recovered only up to
- arbitrary permutation
- arbitrary scaling factors (sign)
Ambiguity of ISA: sources can be recovered only up to
- arbitrary permutation of the subspaces
- arbitrary invertible transformation within each subspace
Independent Subspace Analysis: Pairwise vs. Joint Independence
In the ICA case, pairwise independence of the sources = joint independence of the sources (Comon, 1994).
In the ISA case, pairwise independence of the subspaces is NOT the same as joint independence of the subspaces.
Proof: let $\{s_1, s_2, s_3\}, \{s_4, s_5, s_6\}, \{s_7, s_8, s_9\}$ be 3 pieces of 3-dimensional independent sources, where the elements of each subspace are pairwise independent (but jointly dependent). Then the regrouped subspaces $\{s_1, s_4, s_7\}, \{s_2, s_5, s_8\}, \{s_3, s_6, s_9\}$ are pairwise independent, yet they form a wrong ISA solution.
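A quick numerical illustration (a minimal sketch, not from the original slides): with symmetric ±1 variables u and v and the product w = uv, every pair is independent while the triple is jointly dependent, which is exactly the gap the counterexample above exploits.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u = rng.choice([-1.0, 1.0], size=n)
v = rng.choice([-1.0, 1.0], size=n)
w = u * v                        # independent of u alone and of v alone

# Every pair looks independent: the correlation matrix is ~ identity ...
print(np.corrcoef([u, v, w]))
# ... yet the triple is jointly dependent: u*v*w is identically 1,
# while joint independence would push this mean toward 0.
print(np.mean(u * v * w))
```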
The ISA Cost Functions
Mutual information: $I(y^1, \dots, y^m) = \int p(y) \log \frac{p(y)}{p(y^1) \cdots p(y^m)} \, dy$
Shannon entropy: $H(y) = -\int p(y) \log p(y) \, dy$
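To see why the entropy-estimation methods of the next section matter, the ISA cost can be reduced to a sum of marginal entropies (a standard derivation, assuming the observations have been whitened so that $W$ may be restricted to orthogonal matrices):

$I(y^1, \dots, y^m) = \sum_{i=1}^{m} H(y^i) - H(y) = \sum_{i=1}^{m} H(y^i) - H(x) - \log |\det W|.$

After whitening, $H(x)$ and $\log |\det W|$ are constants, so minimizing the mutual information of the estimated subspaces amounts to $\min_{W \text{ orthogonal}} \sum_{i=1}^{m} H(y^i)$ with $y = Wx$.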
Multidimensional Entropy Estimation
Multi-dimensional Entropy Estimation: the Method of Kozachenko and Leonenko
Given a sample $\{z_1, \dots, z_n\}$, $z_j \in \mathbb{R}^d$, let $N_{1,j}$ denote the nearest neighbour of $z_j$. The nearest-neighbour entropy estimate is
$\hat{H}_{1NN}(z) = \frac{d}{n} \sum_{j=1}^{n} \log \|z_j - N_{1,j}\| + \log(n-1) + \log V_d + C_E,$
where $V_d$ is the volume of the $d$-dimensional unit ball and $C_E = -\int_0^{\infty} e^{-t} \log t \, dt$ is the Euler constant.
This estimate is mean-square consistent, but not robust. Let us try to use more neighbours!
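A minimal sketch of this estimator in its $k$-nearest-neighbour (Kraskov-style digamma) form, which reduces to the formula above for $k = 1$; the function name and variable names are illustrative choices, not from the slides:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(z, k=1):
    """Kozachenko-Leonenko kNN estimate of the Shannon entropy (in nats).

    z : (n, d) array of samples; k : number of neighbours used.
    """
    n, d = z.shape
    dists, _ = cKDTree(z).query(z, k=k + 1)   # column 0 is the point itself
    eps = dists[:, k]                         # distance to the k-th neighbour
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)  # log volume of unit d-ball
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(eps))
```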
Multi-dimensional Entropy Estimation via Rényi Entropy
Let us apply the Rényi entropy, $H_\alpha(f) = \frac{1}{1-\alpha} \log \int f^\alpha(z) \, dz$, for estimating the Shannon entropy: $H_\alpha \to H$ as $\alpha \to 1$.
To estimate the multi-dimensional Rényi entropy, let us use
- k-nearest neighbours
- geodesic spanning trees
Beardwood-Halton-Hammersley Theorem
Given $\{z_1, \dots, z_n\}$, $z_j \in \mathbb{R}^d$, let $N_{k,j}$ denote the $k$ nearest neighbours of $z_j$, and let $\gamma = d(1-\alpha)$. Then
$\hat{H}_\alpha(z) = \frac{1}{1-\alpha} \log \left( \frac{1}{c\, n^\alpha} \sum_{j=1}^{n} \sum_{v \in N_{k,j}} \|v - z_j\|^\gamma \right) \to H_\alpha(z) \quad (n \to \infty),$
where the constant $c$ does not depend on the density of the sample.
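A sketch of this estimator, assuming the formula above; the additive term involving $c$ is dropped, since it does not depend on the data and cancels when ISA costs are compared across candidate demixing matrices:

```python
import numpy as np
from scipy.spatial import cKDTree

def renyi_entropy_knn(z, alpha=0.99, k=5):
    """kNN-graph Renyi entropy estimate (nats), up to the additive
    constant log(c)/(1 - alpha) from the BHH theorem."""
    n, d = z.shape
    gamma_ = d * (1.0 - alpha)                 # edge-weight exponent
    dists, _ = cKDTree(z).query(z, k=k + 1)    # column 0 is the point itself
    L = np.sum(dists[:, 1:] ** gamma_)         # total gamma-weighted kNN edge length
    return np.log(L / n**alpha) / (1.0 - alpha)
```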
Multi-dimensional Entropy Estimation Using Geodesic Spanning Forests
First build a Euclidean neighbourhood graph: for each node $z_p$, use the edges to its $k$ nearest nodes. Then find geodesic spanning forests on this graph (minimal spanning forests of the Euclidean neighbourhood graph).
Euclidean Graphs
Given $\{z_1, \dots, z_n\}$, $z_j \in \mathbb{R}^d$, the Euclidean neighbourhood graph is $E = \{ e(p,q) : z_p, z_q \in \mathbb{R}^d, \; z_q \in N_{k,p} \}$.
Weight of the minimal $\gamma$-weighted Euclidean spanning forest:
$L_\gamma(z) = \min_{T \in \mathcal{T}} \sum_{e \in T} \|e\|^\gamma,$
where $\mathcal{T}$ is the set of all spanning forests of the Euclidean neighbourhood graph. Then
$\frac{1}{1-\alpha} \log \frac{L_\gamma(z)}{c\, n^\alpha} \to H_\alpha(z) \quad (n \to \infty).$
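A corresponding sketch built on a minimal spanning forest of the kNN graph (again up to the additive constant; scipy's minimum_spanning_tree returns a spanning forest when the graph is disconnected, matching the construction above):

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial import cKDTree

def renyi_entropy_gsf(z, alpha=0.99, k=5):
    """Renyi entropy estimate from the minimal gamma-weighted spanning
    forest of the Euclidean kNN graph, up to the BHH additive constant."""
    n, d = z.shape
    gamma_ = d * (1.0 - alpha)
    dists, idx = cKDTree(z).query(z, k=k + 1)
    rows = np.repeat(np.arange(n), k)
    weights = dists[:, 1:].ravel() ** gamma_   # gamma-weighted edge lengths
    graph = coo_matrix((weights, (rows, idx[:, 1:].ravel())), shape=(n, n))
    L = minimum_spanning_tree(graph).sum()     # total weight of the forest
    return np.log(L / n**alpha) / (1.0 - alpha)
```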
Estimation of the Shannon Entropy
The Shannon entropy is then recovered from these Rényi entropy estimates in the limit $\alpha \to 1$.
Mutual Information Estimation
Kernel Covariance (KC)
(A. Gretton, R. Herbrich, A. Smola, F. Bach, M. Jordan)
The kernel covariance measures dependence as a supremum of covariances, $\sup_{f,g} \mathrm{cov}(f(x), g(y))$, over given function sets. The calculation of this supremum is extremely difficult in general; we can ease it by using Reproducing Kernel Hilbert Spaces.
RKHS Construction for the Random Variables x, y
Kernel Covariance (KC)
What is more, after some calculation the supremum can be computed from the kernel Gram matrices of the samples.
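As an illustration of how such RKHS dependence measures reduce to Gram-matrix computations, here is a sketch of HSIC, a closely related measure from the same line of work (note: this is HSIC, not the KC statistic itself; the kernel width and names are illustrative):

```python
import numpy as np

def rbf_gram(z, sigma=1.0):
    """Gaussian RBF Gram matrix K[i, j] = exp(-||z_i - z_j||^2 / (2 sigma^2))."""
    sq = np.sum(z**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * z @ z.T
    return np.exp(-d2 / (2.0 * sigma**2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC for samples x, y of shape (n, d_x), (n, d_y);
    near zero when x and y are independent (characteristic kernels)."""
    n = x.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    K, L = rbf_gram(x, sigma), rbf_gram(y, sigma)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```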
The ISA Separation Theorem
ISA Separation Theorem
Under sufficient conditions on the sources, the ISA task separates: it can be solved by ICA preprocessing followed by grouping (permuting) the estimated coordinates into subspaces, turning the Bach & Jordan conjecture into a theorem (Szabó, Póczos & Lőrincz, technical report).
Numerical Simulations
Numerical Simulations: 2D Letters (i.i.d.)
[Figures: sources, observation, estimated sources, performance matrix]
Numerical Simulations: 3D Curves (i.i.d.)
[Figures: sources, observation, estimated sources, performance matrix]
Numerical Simulations: Facial Images (i.i.d.)
[Figures: sources, observation, estimated sources, performance matrix]
Working on Innovations
These entropy-estimation methods need i.i.d. processes. What can we do with $\tau$-order AR sources,
$s^i(t) = \sum_{p=1}^{\tau} F^i_p s^i(t-p) + \nu^i(t)$?
The innovations are i.i.d. processes:
$\hat{s}^i(t) = s^i(t) - E[s^i(t) \mid s^i(t-1), s^i(t-2), \dots] = \nu^i(t).$
The mixing matrix is the same for the innovations,
$\hat{x}(t) = x(t) - E[x(t) \mid x(t-1), x(t-2), \dots] = A \hat{s}(t),$
so we can use ISA on the innovations.
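A minimal sketch of the innovation computation via least-squares AR fitting (the stacked-regression approach and the names are illustrative choices, not the exact algorithm of the slides):

```python
import numpy as np

def innovations(x, tau=1):
    """Estimate the innovation process of x (shape: dim x T) with an AR(tau)
    least-squares fit; the residual x(t) - E[x(t) | past] equals A*nu(t)
    under the model, so ISA can be run on it directly."""
    D, T = x.shape
    past = np.vstack([x[:, tau - p - 1 : T - p - 1] for p in range(tau)])
    target = x[:, tau:]                     # x(t) for t = tau, ..., T-1
    F = target @ np.linalg.pinv(past)       # stacked AR coefficient estimate
    return target - F @ past                # innovation estimate (i.i.d. if model holds)
```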
Results Using Innovations
[Figures: original AR sources, mixed sources, estimated sources by plain ISA and its performance, estimated sources using ISA on innovations and its performance]
Undercomplete Blind Subspace Deconvolution (BSSD)
The multi-dimensional generalization of undercomplete Blind Source Deconvolution (BSD).
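For concreteness, the convolutive model can be written as follows (a sketch of the standard undercomplete setup; the filter notation $H_l$ and order $L$ are our symbols, not the slides'):

$x(t) = \sum_{l=0}^{L} H_l \, s(t-l), \qquad s(t) = (s^1(t)^T, \dots, s^m(t)^T)^T, \quad s^i(t) \in \mathbb{R}^d,$

with the observation dimension strictly larger than the source dimension (the undercomplete case).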
BSSD Reduction to ISA
By temporally concatenating the observations, the undercomplete convolutive task can be rewritten as a higher-dimensional ISA task!
BSSD Results
[Figures: database, convolved mixture, estimation, performance]
Post Nonlinear ISA
Does it make any sense?
Post Nonlinear ISA: Separability Theorem
In the post nonlinear model the observation is $x = f(As)$ with a componentwise, invertible nonlinearity $f$; under suitable conditions this task is separable, i.e., the nonlinearities and the linear ISA part can be identified.
Post Nonlinear ISA: Results
[Figures: original sources, observed signals, nonlinear functions and their estimates, Hinton diagram]
ISA for Facial Components
Our database contained 800 different front-view faces with the 6 basic facial expressions, i.e., 4,800 images in total. All images were resized to 40 × 40 pixels. A large 4800 × 1600 data matrix was compiled; its rows were the 1600-dimensional vectors formed by the pixel values of the individual images, and its columns were considered as mixed signals. The observed 4800-dimensional signals were compressed by PCA to 60 dimensions, and we searched for 4 ISA subspaces.
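A sketch of this preprocessing pipeline (the face data is replaced by random stand-in values, and run_isa is a hypothetical placeholder for any ISA routine, e.g. ICA followed by a permutation search as in the separation theorem):

```python
import numpy as np

X = np.random.rand(4800, 1600)            # stand-in for the 4800 x 1600 image matrix
X = X - X.mean(axis=1, keepdims=True)     # center each coordinate over the samples

# PCA: project the 4800-dimensional observations onto the top 60 components
U, S, Vt = np.linalg.svd(X, full_matrices=False)
Z = U[:, :60].T @ X                       # 60 x 1600 compressed mixed signals

# Y = run_isa(Z, n_subspaces=4)           # hypothetical ISA call: 4 subspaces
```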
ISA for Facial Components
Estimated subspaces: eyes, mouth, eyebrows, facial profiles.
Ongoing Projects & Future Plans
- Multilinear (tensorial) ISA
- BSSD in the Fourier domain
- EEG, fMRI data processing
- Low-dimensional embedding with ICA / ISA constraints
- Low-dimensional embedding of time series
- Variational Bayesian Hidden Markov Factor Analyzer
Thanks for your attention!
References
- Z. Szabó, B. Póczos, A. Lőrincz: Independent Process Analysis without Combinatorial Efforts. ICA 2007 (accepted).
- Z. Szabó, B. Póczos, G. Szirtes, A. Lőrincz: Post Nonlinear Independent Subspace Analysis. ICANN 2007 (accepted).
- Z. Szabó, B. Póczos, A. Lőrincz: Undercomplete BSSD via Linear Prediction. ECML 2007 (accepted).
- Z. Szabó, B. Póczos, A. Lőrincz: Undercomplete Blind Subspace Deconvolution. Journal of Machine Learning Research, vol. 8, pp. 1063-1095, 2007.
- B. Póczos, A. Lőrincz: Independent Subspace Analysis Using Geodesic Spanning Trees. Proc. of ICML 2005, Bonn, pp. 673-680.
- Z. Szabó, B. Póczos, A. Lőrincz: Cross-Entropy Optimization for Independent Process Analysis. Proc. of ICA 2006, Charleston, SC; LNCS 3889, pp. 909-916, Springer-Verlag.
- B. Póczos, A. Lőrincz: Noncombinatorial Estimation of Independent Autoregressive Sources. Neurocomputing, vol. 69, pp. 2416-2419, 2005.
- B. Póczos, B. Takács, A. Lőrincz: Independent Subspace Analysis on Innovations. Proc. of ECML 2005, Porto; LNAI 3720, pp. 698-706, Springer-Verlag.
- B. Póczos, A. Lőrincz: Independent Subspace Analysis Using k-Nearest Neighborhood Distances. Proc. of ICANN 2005, Warsaw; LNCS 3697, pp. 163-168, Springer-Verlag.
- Z. Szabó, B. Póczos, A. Lőrincz: Separation Theorem for Independent Subspace Analysis with Sufficient Conditions. Technical report, Eötvös Loránd University, Budapest.
- A. Lőrincz, B. Póczos: Cost Component Analysis. International Journal of Neural Systems, vol. 13, pp. 183-192, 2003.