Independent Subspace Analysis

Independent Subspace Analysis
Barnabás Póczos
Supervisor: Dr. András Lőrincz
Eötvös Loránd University, Neural Information Processing Group, Budapest, Hungary
MPI, Tübingen, 24 July 2007

Independent Component Analysis
Sources $s(t)$ are mixed by an unknown matrix $A$, giving the observation $x(t) = A s(t)$; the estimation is $y(t) = W x(t)$.

Some ICA Applications
- Blind source separation
- Image denoising
- Medical signal processing: fMRI, ECG, EEG
- Modeling of the hippocampus
- Modeling of the visual cortex
- Feature extraction
- Face recognition
- Time series analysis
- Financial applications

Independent Subspace Analysis
Sources $s_i \in \mathbb{R}^d$, observations $x_i \in \mathbb{R}^d$, estimations $y_i \in \mathbb{R}^d$ for $i = 1, \dots, m$; mixing matrix $A \in \mathbb{R}^{md \times md}$, demixing matrix $W \in \mathbb{R}^{md \times md}$:
$s = (s_1^T, \dots, s_m^T)^T \in \mathbb{R}^{dm}$,
$x = (x_1^T, \dots, x_m^T)^T \in \mathbb{R}^{dm}$, $x = As$,
$y = (y_1^T, \dots, y_m^T)^T \in \mathbb{R}^{dm}$, $y = Wx$.
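A minimal sketch of this generative model in Python. The sphere-valued sources are an illustrative assumption; any sources that are dependent within a subspace but independent across subspaces would do:

```python
import numpy as np

rng = np.random.default_rng(0)
m, d, n = 3, 2, 5000                     # m subspaces, each d-dimensional

def sphere_source(n, d, rng):
    # Coordinates of a point drawn uniformly from the unit sphere in R^d:
    # dependent within the subspace, while distinct subspaces stay independent.
    g = rng.standard_normal((n, d))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

s = np.hstack([sphere_source(n, d, rng) for _ in range(m)])   # n x (m*d)
A = rng.standard_normal((m * d, m * d))   # random mixing matrix (a.s. invertible)
x = s @ A.T                               # observations: x = A s
```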

Independent Subspace Analysis: The Efforts
- Cardoso, 1998
- Akaho et al., 1999: kernel methods for mutual information estimation
- Theis, 2003: processing of EEG-fMRI data, 2-dimensional Edgeworth expansion
- Bach & Jordan, 2003: conjecture, ICA preprocessing followed by permutation
- Ambiguity issues, uniqueness of ISA
- Hyvärinen & Köster, 2006: FastISA, a fast fixed-point algorithm for independent subspace analysis

Independent Subspace Analysis: The Ambiguity
Ambiguity of ICA: sources can be recovered only up to arbitrary permutation and arbitrary scaling factors (sign).
Ambiguity of ISA: sources can be recovered only up to arbitrary permutation and arbitrary invertible transformation within the subspaces.

Independent Subspace Analysis: pairwise independence vs. joint independence
In the ICA case, pairwise independence of the sources = joint independence of the sources (Comon, 1994).
In the ISA case, pairwise independence of the subspaces ≠ joint independence of the subspaces.
Proof: let $\{s_1, s_2, s_3\}, \{s_4, s_5, s_6\}, \{s_7, s_8, s_9\}$ be 3 pieces of 3-dimensional independent sources, where the elements of each subspace are pairwise independent. Then $\{s_1, s_4, s_7\}, \{s_2, s_5, s_8\}, \{s_3, s_6, s_9\}$ is a wrong ISA solution.

The ISA Cost Functions
Mutual information: $I(y_1, \dots, y_m) = \int p(y) \log \frac{p(y)}{p(y_1) \cdots p(y_m)} \, dy$
Shannon entropy: $H(y) = -\int p(y) \log p(y) \, dy$
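One step the slides leave implicit: after whitening, $W$ may be restricted to be orthogonal, so the joint entropy term is a constant of the optimization and only the subspace entropies have to be estimated. A short derivation of this standard reduction:

```latex
I(y_1,\dots,y_m) \;=\; \sum_{i=1}^{m} H(y_i) \;-\; H(y),
\qquad
H(y) \;=\; H(Wx) \;=\; H(x) + \log\lvert\det W\rvert .
% For orthogonal W we have log|det W| = 0, hence
\operatorname*{arg\,min}_{W:\,WW^{T}=I} I(y_1,\dots,y_m)
  \;=\; \operatorname*{arg\,min}_{W:\,WW^{T}=I} \sum_{i=1}^{m} H(y_i).
```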

The ISA Cost Functions

Multidimensional Entropy Estimation

Multi-dimensional Entropy Estimation: the method of Kozachenko and Leonenko
Let $\{z_1, \dots, z_n\}$ be an i.i.d. sample, $z_j \in \mathbb{R}^d$, and let $N_{1,j}$ denote the nearest neighbour of $z_j$. Then the nearest-neighbour entropy estimate is
$\hat H_{NN}(z) = \frac{d}{n} \sum_{j=1}^{n} \log \| z_j - N_{1,j} \| + \log(n-1) + \log V_d + C_E$,
where $V_d$ is the volume of the $d$-dimensional unit ball and $C_E = -\int_0^\infty e^{-t} \log t \, dt$ is the Euler constant.
This estimate is mean-square consistent, but not robust. Let us try to use more neighbours!
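A sketch of this estimator, using SciPy's k-d tree for the nearest-neighbour search; this is an illustration, not the talk's original implementation:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import gammaln, digamma

def kl_entropy(z):
    """Kozachenko-Leonenko nearest-neighbour Shannon entropy estimate (nats).

    H ≈ (d/n) * sum_j log r_j + log(n-1) + log V_d + C_E, where r_j is the
    distance from z_j to its nearest neighbour, V_d is the volume of the
    d-dimensional unit ball and C_E is the Euler constant.
    """
    n, d = z.shape
    r = cKDTree(z).query(z, k=2)[0][:, 1]   # k=2: the closest point is z_j itself
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)
    return d * np.mean(np.log(r)) + np.log(n - 1) + log_vd - digamma(1)
```

Note that `-digamma(1)` equals the Euler constant $C_E \approx 0.5772$.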

Multi-dimensional Entropy Estimation via Rényi's entropy
Let us apply Rényi's entropy, $H_\alpha(f) = \frac{1}{1-\alpha} \log \int f^\alpha(z) \, dz$, for estimating the Shannon entropy: $H_\alpha(f) \to H(f) = -\int f(z) \log f(z) \, dz$ as $\alpha \to 1$.
Let us use k-nearest neighbors and geodesic spanning trees for estimating the multi-dimensional Rényi entropy.

Beardwood–Halton–Hammersley Theorem
Let $\{z_1, \dots, z_n\}$ be an i.i.d. sample, $z_j \in \mathbb{R}^d$, and let $N_{k,j}$ denote the $k$ nearest neighbours of $z_j$. Then, with $\gamma = d(1-\alpha)$,
$\hat H_{kNN}(z) = \frac{1}{1-\alpha} \log \frac{1}{c \, n^{\alpha}} \sum_{j=1}^{n} \sum_{v \in N_{k,j}} \| v - z_j \|^{\gamma} \longrightarrow H_\alpha(z)$ as $n \to \infty$.
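An illustrative sketch of this k-NN estimator. Since the constant $c$ depends only on $(k, d, \gamma)$, it can be dropped when only entropy differences matter, as in the ISA cost:

```python
import numpy as np
from scipy.spatial import cKDTree

def renyi_entropy_knn(z, k=5, alpha=0.95):
    """k-NN Renyi entropy estimate, up to an additive constant.

    L = sum_j sum_{v in N_{k,j}} ||v - z_j||^gamma with gamma = d(1-alpha);
    H_alpha ≈ (log L - alpha*log n) / (1-alpha), dropping log(c)/(1-alpha),
    which depends only on (k, d, gamma) and cancels in entropy differences.
    """
    n, d = z.shape
    gam = d * (1.0 - alpha)
    dist = cKDTree(z).query(z, k=k + 1)[0][:, 1:]   # exclude the point itself
    L = np.sum(dist ** gam)
    return (np.log(L) - alpha * np.log(n)) / (1.0 - alpha)
```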

Multi-dimensional Entropy Estimation Using Geodesic Spanning Forests
First build a Euclidean neighbourhood graph: use the edges to the $k$ nearest nodes of each node $z_p$. Then find geodesic spanning forests on this graph (minimal spanning forests of the Euclidean neighbourhood graph).

Euclidean Graphs
Let $\{z_1, \dots, z_n\}$, $z_j \in \mathbb{R}^d$. The Euclidean neighbourhood graph is $E = \{ e = (p, q) : z_p, z_q \in \mathbb{R}^d, \; z_q \in N_{k,p} \}$. The weight of the minimal γ-weighted Euclidean spanning forest is
$L_\gamma(z) = \min_{T \in \mathcal{T}} \sum_{e \in T} \| e \|^{\gamma}$,
where $\mathcal{T}$ is the set of all γ-weighted Euclidean spanning forests. Then, with $\alpha = (d - \gamma)/d$,
$\frac{1}{1-\alpha} \log \frac{L_\gamma(z)}{c \, n^{\alpha}} \longrightarrow H_\alpha(z)$ as $n \to \infty$.
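An illustrative sketch of the spanning-forest variant with SciPy's minimum-spanning-tree routine on the k-NN graph; the additive constant involving $c$ is dropped as before. Raising edge weights to the power $\gamma > 0$ is monotone, so weighting before or after the forest computation gives the same forest:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

def renyi_entropy_msf(z, k=10, alpha=0.95):
    """Spanning-forest Renyi entropy estimate, up to an additive constant.

    Builds the k-NN Euclidean neighbourhood graph, takes its minimum
    spanning forest, and plugs the gamma-weighted edge length L_gamma into
    H_alpha ≈ (log L_gamma - alpha*log n) / (1-alpha).
    """
    n, d = z.shape
    gam = d * (1.0 - alpha)
    dist, idx = cKDTree(z).query(z, k=k + 1)
    rows = np.repeat(np.arange(n), k)
    graph = csr_matrix((dist[:, 1:].ravel() ** gam, (rows, idx[:, 1:].ravel())),
                       shape=(n, n))
    L = minimum_spanning_tree(graph).sum()   # a forest if the graph is disconnected
    return (np.log(L) - alpha * np.log(n)) / (1.0 - alpha)
```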

Estimation of the Shannon entropy

Mutual Information Estimation

Kernel Covariance (KC): A. Gretton, R. Herbrich, A. Smola, F. Bach, M. Jordan
The calculation of the supremum over function sets is extremely difficult. We can ease it by using Reproducing Kernel Hilbert Spaces.

RKHS construction for the stochastic variables x, y.

Kernel Covariance (KC)
And what is more, after some calculation we get that the statistic can be computed in closed form from the kernel Gram matrices of the sample.
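For a concrete RKHS dependence statistic, here is a sketch of HSIC (Gretton et al.), a closely related kernel measure with a simple closed form; the Gaussian kernel and its bandwidth are assumptions, and this is not necessarily the exact KC statistic of the slide:

```python
import numpy as np

def hsic(x, y, sigma=1.0):
    """Biased HSIC estimate for samples x, y of shape (n, d_x), (n, d_y).

    HSIC = trace(K H L H) / n^2, with Gaussian Gram matrices K, L and the
    centering matrix H; at the population level it vanishes iff x and y are
    independent (for characteristic kernels), which is what an ISA cost
    needs to detect.
    """
    n = x.shape[0]
    def gram(a):
        sq = np.sum(a ** 2, axis=1)
        return np.exp(-(sq[:, None] + sq[None, :] - 2.0 * a @ a.T)
                      / (2.0 * sigma ** 2))
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(gram(x) @ H @ gram(y) @ H) / n ** 2
```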

The ISA Separation Theorem

ISA Separation Theorem: under suitable conditions, the ISA task can be solved by ICA preprocessing followed by a permutation (grouping) of the estimated ICA coordinates into subspaces (cf. the conjecture on the earlier slide).

The ISA Separation Theorem

Numerical Simulations

Numerical Simulations: 2D Letters (i.i.d.). Shown: sources, observation, estimated sources, performance matrix.

Numerical Simulations: 3D Curves (i.i.d.). Shown: sources, observation, estimated sources, performance matrix.

Numerical Simulations: Facial images (i.i.d.). Shown: sources, observation, estimated sources, performance matrix.

Numerical Simulations

Working on Innovations
These methods for entropy estimation need i.i.d. processes. What can we do with τ-order AR sources,
$s^i(t) = F_1^i s^i(t-1) + \dots + F_\tau^i s^i(t-\tau) + \eta^i(t)$?
The innovations are i.i.d. processes:
$\tilde s^i(t) = s^i(t) - E[\, s^i(t) \mid s^i(t-1), s^i(t-2), \dots \,] = \eta^i(t)$,
and the mixing matrix is the same for the innovations,
$\tilde x(t) = x(t) - E[\, x(t) \mid x(t-1), x(t-2), \dots \,] = A \tilde s(t)$,
so we can use ISA on the innovations.
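A sketch of this preprocessing step with a least-squares AR fit; the function name and interface are illustrative:

```python
import numpy as np

def innovations(x, tau=1):
    """Return the innovation of an AR(tau) process x of shape (n, dim).

    Fits the predictor E[x(t) | x(t-1), ..., x(t-tau)] by least squares and
    returns the residuals, which are approximately i.i.d. for AR sources;
    ISA is then run on these residuals instead of on x itself.
    """
    n = x.shape[0]
    # past values stacked as regressors: [x(t-1), ..., x(t-tau)]
    past = np.hstack([x[tau - j - 1 : n - j - 1] for j in range(tau)])
    coef, *_ = np.linalg.lstsq(past, x[tau:], rcond=None)
    return x[tau:] - past @ coef
```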

Results Using Innovations. Shown: original AR sources, mixed sources, estimated sources by plain ISA, performance of plain ISA, estimated sources using ISA on innovations, performance using innovations.

Undercomplete Blind Subspace Deconvolution: a multi-dimensional generalization of the undercomplete Blind Source Deconvolution (BSD) problem.

BSSD reduction to ISA

BSSD reduction to ISA: an ISA task!

BSSD Results. Shown: database, convolved mixture, estimation, performance.

Post Nonlinear ISA: does it make any sense?

Post Nonlinear ISA: Separability Theorem

Post Nonlinear ISA, results. Shown: original sources, observed signals, nonlinear functions, estimated functions, Hinton diagram.

ISA for Facial Components
Our database contained 800 different front-view faces with the 6 basic facial expressions, i.e., 4,800 images in total. All images were sized to 40×40 pixels. A large 4800×1600 data matrix was compiled; its rows were the 1600-dimensional vectors formed by the pixel values of the individual images, and its columns were considered as mixed signals. The observed 4800-dimensional signals were compressed by PCA to 60 dimensions, and we searched for 4 ISA subspaces.
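A hypothetical sketch of this pipeline with scikit-learn; the file name, the use of FastICA, and the omitted grouping step are assumptions, with the separation theorem justifying ICA followed by grouping:

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

# The 4800 x 1600 image matrix; its 1600 columns are the mixed signals.
X = np.load("faces_4800x1600.npy")                 # hypothetical file name

signals = X.T                                      # 1600 samples, 4800 dims each
compressed = PCA(n_components=60, whiten=True).fit_transform(signals)

# Separation-theorem route: ICA first, then group the 60 estimated
# coordinates into 4 subspaces (the grouping step, e.g. by pairwise
# dependence of the coordinates, is omitted here).
y = FastICA(n_components=60, max_iter=1000).fit_transform(compressed)
```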

ISA for Facial Components. Estimated subspaces: eyes, mouth, eyebrows, facial profiles.

Ongoing Projects & Future Plans
- Multilinear (tensorial) ISA
- BSSD in the Fourier domain
- EEG, fMRI data processing
- Low-dimensional embedding with ICA / ISA constraints
- Low-dimensional embedding of time series
- Variational Bayesian Hidden Markov Factor Analyzer

Thanks for your attention!

References
- Independent Process Analysis without Combinatorial Efforts. Z. Szabó, B. Póczos and A. Lőrincz. ICA 2007, accepted.
- Post Nonlinear Independent Subspace Analysis. Z. Szabó, B. Póczos, G. Szirtes and A. Lőrincz. ICANN 2007, accepted.
- Undercomplete BSSD via Linear Prediction. Z. Szabó, B. Póczos and A. Lőrincz. ECML 2007, accepted.
- Undercomplete Blind Subspace Deconvolution. Z. Szabó, B. Póczos and A. Lőrincz. Journal of Machine Learning Research, vol. 8, pp. 1063-1095, 2007.
- Independent Subspace Analysis Using Geodesic Spanning Trees. B. Póczos and A. Lőrincz. Proc. of ICML 2005, Bonn, pp. 673-680.
- Cross-Entropy Optimization for Independent Process Analysis. Z. Szabó, B. Póczos and A. Lőrincz. Proc. of ICA 2006, Charleston, SC; LNCS 3889, pp. 909-916, Springer-Verlag.

References (continued)
- Noncombinatorial Estimation of Independent Autoregressive Sources. B. Póczos and A. Lőrincz. Neurocomputing, vol. 69, pp. 2416-2419, 2005.
- Independent Subspace Analysis on Innovations. B. Póczos, B. Takács and A. Lőrincz. Proc. of ECML 2005, Porto; LNAI 3720, pp. 698-706, Springer-Verlag.
- Independent Subspace Analysis Using k-Nearest Neighborhood Distances. B. Póczos and A. Lőrincz. Proc. of ICANN 2005, Warsaw; LNCS 3697, pp. 163-168, Springer-Verlag.
- Separation Theorem for Independent Subspace Analysis with Sufficient Conditions. Z. Szabó, B. Póczos and A. Lőrincz. Technical report, Eötvös Loránd University, Budapest.
- Cost Component Analysis. A. Lőrincz and B. Póczos. International Journal of Neural Systems, vol. 13, pp. 183-192, 2003.
