Independent Component Analysis

Independent Component Analysis. Philippe B. Laval, KSU, Fall 2017.

Introduction. Independent Component Analysis (ICA) falls under the broader topic of Blind Source Separation (BSS). BSS is the separation of a set of source signals (the signals we are looking for) from a set of mixed signals (the signals we can measure), where very little is known about the source signals and the mixing process. With ICA, we assume that the mixing is linear, that is, the measured signals can be expressed as linear combinations of the source signals. We also assume that the source signals are statistically independent.

Introduction. A classical example of ICA is the cocktail party problem. Imagine a room in which p people are speaking and we want to extract each conversation. For this, we use p microphones spread throughout the room to record the various speakers, and we must recover each conversation from the mixed signals captured by the microphones.

Introduction. Example: Consider two signals $s_1(t) = \sin(2t) + 2\cos(3t)$ and $s_2(t) = \sin t \cos t$, shown in the next two slides. Consider the linear mixings of these signals, $x_1(t) = 2s_1(t) + 3s_2(t)$ and $x_2(t) = 1.5\,s_1(t) - 2.37\,s_2(t)$, shown in the following two slides. $s_1(t)$ and $s_2(t)$ are the source signals, while $x_1(t)$ and $x_2(t)$ are the measured signals; they are linear combinations of $s_1(t)$ and $s_2(t)$. Imagine that $x_1(t)$ and $x_2(t)$ are known and we have to recover $s_1(t)$ and $s_2(t)$ without actually knowing the linear combination, which was given here. This seems to be an impossible problem, as there are too many unknowns. However, we will see that with some additional assumptions, we can come extremely close to recovering $s_1(t)$ and $s_2(t)$.

Introduction. [Plot of $s_1(t) = \sin(2t) + 2\cos(3t)$ on $0 \le t \le 10$]

Introduction. [Plot of $s_2(t) = \sin t \cos t$ on $0 \le t \le 10$]

Introduction. [Plot of $x_1(t) = 2s_1(t) + 3s_2(t)$ on $0 \le t \le 10$]

Introduction. [Plot of $x_2(t) = 1.5\,s_1(t) - 2.37\,s_2(t)$ on $0 \le t \le 10$]
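The signals shown above are easy to reproduce. Below is a minimal MATLAB sketch, assuming a uniform time grid on [0, 10] and the mixing coefficients from the example; the variable names (t, s1, s2, x1, x2, Bs, Bx) are ours, and Bs, Bx anticipate the observation matrices used in the next slides.

    % Sketch: build the example's source signals, mix them, and form the observation matrices.
    N  = 1000;                      % number of time samples (our choice)
    t  = linspace(0, 10, N);        % time grid matching the plots above
    s1 = sin(2*t) + 2*cos(3*t);     % first source signal
    s2 = sin(t) .* cos(t);          % second source signal
    x1 = 2*s1 + 3*s2;               % first measured (mixed) signal
    x2 = 1.5*s1 - 2.37*s2;          % second measured (mixed) signal
    Bs = [s1; s2];                  % 2-by-N observation matrix of the sources
    Bx = [x1; x2];                  % 2-by-N observation matrix of the measurements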

Setup of the Problem. Let $s = (s_1, s_2, \dots, s_p)^T$ represent the source signals and $x = (x_1, x_2, \dots, x_p)^T$ represent the signals measured by the microphones. For the ICA model, we assume that each $x_k$ and $s_k$ is a random variable instead of a variable depending on time. The observed values $x_k(t)$ are just a sample of the random variable $x_k$, $k = 1, 2, \dots, p$. We call $B_x$ the matrix of observations for the $x$ variables and use a similar notation for the other observation matrices. Since both $x$ and $s$ are $p \times 1$, if we discretize the time interval in $N$ points (that is, if each variable has $N$ measurements), then $B_x$ and $B_s$ will be $p \times N$. Without loss of generality, we may assume that both the $x_i$'s and the $s_i$'s have zero mean. If this is not the case, we first subtract the sample mean from the $x_i$'s.

Setup of the Problem. Since the $x_i$'s are linear combinations of the $s_i$'s, we can write $$x_j = a_{j1}s_1 + a_{j2}s_2 + \cdots + a_{jp}s_p = \sum_{i=1}^{p} a_{ji}s_i.$$ If we let $A = (a_{ij})$ be the $p \times p$ matrix containing the coefficients of the above linear combination, then we can write $x = As$.
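In the two-signal example from the Introduction, $x = As$ specializes to the $2 \times 2$ mixing below (using the coefficients given there):

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 1.5 & -2.37 \end{pmatrix} \begin{pmatrix} s_1 \\ s_2 \end{pmatrix}$$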

Setup of the Problem. The goal of ICA is to find the matrix $A$ (or its inverse) so we can recover the signal $s$ from $x$; in other words, $s = A^{-1}x$. This seems to be an impossible problem because we are trying to find the $p^2$ entries of $A$ plus the $p$ entries of $s$ from the $p$ entries of $x$. The approach is to approximate $A^{-1}$ by a matrix $W$ such that if $\hat{s} = Wx$ then $\hat{s} \approx s$. We will outline some of the steps which make solving this problem possible, but will not go through all the steps in detail.

Strategy for Solving ICA. We find the matrix $A$ in several steps. Knowing that $A$ has an SVD of the form $A = U\Sigma V^T$, instead of finding $A$ we find $U$, $\Sigma$, and $V$. $W$ (our approximation of $A^{-1}$) will then be $W = V\Sigma^{-1}U^T$. It is also important to remember that both $U$ and $V$ are orthogonal matrices, hence their inverses are the same as their transposes. We will proceed in two stages: (1) use the covariance of the data $x$ to find $\Sigma$ and $U$; (2) use the assumption of independence of $s$ to find $V$. We will describe stage 1 in detail but not stage 2.
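The formula for $W$ follows directly from the SVD and the orthogonality of $U$ and $V$ ($U^{-1} = U^T$, $V^{-1} = V^T$):

$$W \approx A^{-1} = (U\Sigma V^T)^{-1} = (V^T)^{-1}\,\Sigma^{-1}\,U^{-1} = V\Sigma^{-1}U^T.$$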

Strategy for Solving ICA. To recover $\Sigma$ and $U$, we make one additional assumption; we will discuss its meaning below. We assume that the covariance of the source data satisfies $B_sB_s^T = I$, where $I$ is the identity matrix. Recall from the homework in the section on PCA that if $x = As$, then $B_x = AB_s$. Now, we compute the covariance of the measured data: $$B_xB_x^T = AB_s(AB_s)^T = U\Sigma^2U^T.$$ Note that this equation shows that, under our assumptions, the covariance of the measured data depends only on $\Sigma$ and $U$, not on $V$ and $s$.
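Filling in the intermediate steps, using $B_sB_s^T = I$, $A = U\Sigma V^T$, $V^TV = I$, and the fact that $\Sigma$ is diagonal:

$$B_xB_x^T = AB_s(AB_s)^T = AB_sB_s^TA^T = AA^T = U\Sigma V^TV\Sigma U^T = U\Sigma^2U^T.$$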

Strategy for Solving ICA. Recall that $B_xB_x^T$ is symmetric, hence diagonalizable. This means that we can write $$B_xB_x^T = PDP^T$$ where $D$ is the diagonal matrix containing the eigenvalues of $B_xB_x^T$ and $P$ is the matrix containing the corresponding eigenvectors, written as column vectors. This tells us that $U = P$ and $\Sigma^2 = D$ will work. Therefore, we have identified $U$ and $\Sigma$: $\Sigma$ is a diagonal matrix containing the square roots of the eigenvalues of $B_xB_x^T$, and $U$ is the matrix containing the corresponding eigenvectors, written as column vectors. Therefore, $W = VD^{-\frac{1}{2}}P^T$, and $V$ is the only orthogonal matrix left to find.
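A minimal MATLAB sketch of this stage, assuming Bx is the 2-by-N observation matrix built in the Introduction sketch (the variable names are ours):

    % Sketch: recover U and Sigma from the covariance of the measured data.
    Bx = Bx - mean(Bx, 2);   % center each row (subtract the sample mean)
    C  = Bx * Bx';           % covariance-style matrix B_x * B_x^T
    [P, D] = eig(C);         % C = P*D*P' since C is symmetric; P is orthogonal
    U     = P;               % eigenvectors of the covariance give U
    Sigma = sqrt(D);         % D is diagonal and nonnegative, so sqrt(D) = D^(1/2) = Sigma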

Whitening of the Data. Before we say a few words about $V$, let us discuss our assumption $B_sB_s^T = I$, also known as whitening of the data. First, we note that $(P^TB_x)(P^TB_x)^T = D$. Next, we define $x_w = D^{-\frac{1}{2}}P^Tx$. This operation is called whitening of the data; note that $B_{x_w}B_{x_w}^T = I$. Recall that our goal was to find $W$ such that $\hat{s} = Wx$, which amounts to solving $\hat{s} = Vx_w$. Hence it amounts to finding $V$ from the whitened data $x_w$. Recall that $V$ is an orthogonal (rotation) matrix. You will also note that this implies our assumption $B_{\hat{s}}B_{\hat{s}}^T = I$.
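Continuing the sketch above, whitening the observations and checking that their covariance is (numerically) the identity:

    % Sketch: whiten the measured data, x_w = D^(-1/2) * P' * x.
    d   = diag(D);                        % eigenvalues of Bx*Bx'
    Bxw = diag(1 ./ sqrt(d)) * P' * Bx;   % whitened observation matrix
    disp(Bxw * Bxw');                     % should be close to the identity matrix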

Finding V. Solving ICA amounts to finding the rotation matrix $V$ so that the components of $\hat{s}$ are statistically independent. As mentioned, we will not discuss finding $V$ here; it is a much more challenging and advanced problem. It involves information theory and a quantity called the entropy of a distribution. Algorithms exist to perform this step; we will use them.

ICA and MATLAB. ICA is not included in the version of MATLAB one buys from MathWorks. However, implementations can be downloaded from the internet for MATLAB and other platforms. Please visit the FastICA webpage.
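As an illustration only, here is a typical call to the FastICA package once it has been downloaded and added to the MATLAB path; shat, Aest, and West are our names for the outputs, and the exact interface may differ between versions, so check the package's documentation.

    % Sketch (assumes the FastICA toolbox is installed and on the MATLAB path).
    [shat, Aest, West] = fastica(Bx);  % estimated sources, mixing matrix, separating matrix
    plot(t, shat');                    % recovered sources, up to scaling and row ordering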

Exercises. See the problems at the end of the notes on Independent Component Analysis.