Independent Component Analysis. Slides from Paris Smaragdis UIUC

1 Independent Component Analysis (slides from Paris Smaragdis, UIUC)

2 A simple audio problem

3 Formalizing the problem. Each mic will receive a mix of both sounds; sound waves superimpose linearly, and we'll ignore propagation delays for now. The simplified mixing model is x(t) = A·s(t), i.e. x1(t) = a11·s1(t) + a21·s2(t) and x2(t) = a12·s1(t) + a22·s2(t). We know x(t), but nothing else. How do we solve this system and find s(t)?
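As a concrete illustration of the mixing model, here is a minimal numpy sketch that mixes two synthetic sources with an assumed 2×2 matrix A (both the sources and A are made up for the example, not taken from the slides):

```python
import numpy as np

# Two synthetic "sources": a sine wave and a sawtooth-like signal (stand-ins for the two sounds)
t = np.arange(0, 1, 1 / 8000.0)                 # 1 second at 8 kHz
s = np.vstack([np.sin(2 * np.pi * 440 * t),     # s1(t)
               2 * (t * 7 % 1) - 1])            # s2(t)

# An arbitrary (invertible) mixing matrix -- in reality this is unknown
A = np.array([[0.8, 0.3],
              [0.4, 0.9]])

# The mics observe x(t) = A . s(t); each row of X is one microphone signal
X = A @ s
print(X.shape)   # (2, 8000)
```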

4 When can we solve this? The mixing equation is x(t) = A·s(t), and our estimate of s(t) will be ŝ(t) = A⁻¹·x(t). To recover s(t), A must be invertible: we need as many mics as sources, the mics/sources must not coincide, and all sources must be audible. Otherwise this is a different story.

5 A simple example. [Figure: the sources s(t), the mixing matrix A, and the mixtures x(t)] A simple invertible problem: s(t) contains two structured waveforms, A is invertible (but we don't know it), and x(t) looks messy and doesn't reveal s(t) clearly. How do we solve this? Any ideas?

6 What to look for. In x(t) = A·s(t) we can only use x(t). Is there a property we can take advantage of? Yes! We know that different sounds are statistically unrelated. The plan: find a solution that enforces this unrelatedness.

7 A first try. Find s(t) by minimizing cross-correlation. Our estimate of s(t) is computed by ŝ(t) = W·x(t); if W ≈ A⁻¹ then we have a good solution. The goal is that the outputs become uncorrelated: ⟨ŝi(t), ŝj(t)⟩ = 0 for i ≠ j (we assume here that our signals are zero mean). So the overall problem to solve is argmin_W Σ_{i≠j} ⟨Σ_k wik·xk(t), Σ_k wjk·xk(t)⟩.

8 How to solve for uncorrelatedness. Let's use matrices instead of time series: stack the samples so that X = [x1(1) … x1(N); x2(1) … x2(N)], etc. Then x(t) = A·s(t) and ŝ(t) = W·x(t) become X = A·S and Ŝ = W·X, and the uncorrelatedness condition says that (1/N)·Ŝ·Ŝᵀ must be diagonal. We therefore need to diagonalize (1/N)·Ŝ·Ŝᵀ = (1/N)·(W·X)(W·X)ᵀ.

9 How to solve for uncorrelatedness. This is actually a well-known problem in linear algebra; one solution is eigenanalysis: Cov(Ŝ) = (1/N)·Ŝ·Ŝᵀ = (1/N)·(W·X)(W·X)ᵀ = W·((1/N)·X·Xᵀ)·Wᵀ = W·Cov(X)·Wᵀ, and Cov(X) is a symmetric matrix.

10 How to solve for uncorrelatedness. For any symmetric matrix like Cov(X) we know that Cov(X) = U·D·Uᵀ, where the columns ui of U are the eigenvectors and the diagonal entries di of D are the eigenvalues of Cov(X). In our case, if [U, D] = eig(Cov(X)), then Cov(Ŝ) = W·Cov(X)·Wᵀ = W·U·D·Uᵀ·Wᵀ; if we replace W with Uᵀ this becomes Uᵀ·U·D·Uᵀ·U = D (note UᵀU = I, since U is orthonormal), which is diagonal. This is actually Principal Component Analysis.
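A minimal numpy sketch of this step, assuming the mixtures X from the earlier example (channels × samples, zero mean): take W = Uᵀ from the eigendecomposition of Cov(X) and the output covariance becomes diagonal.

```python
import numpy as np

def decorrelate_pca(X):
    """Decorrelate zero-mean data X (channels x samples) using eigenanalysis."""
    X = X - X.mean(axis=1, keepdims=True)      # enforce zero mean
    cov = (X @ X.T) / X.shape[1]               # Cov(X) = (1/N) X X^T
    d, U = np.linalg.eigh(cov)                 # symmetric matrix -> eigh
    W = U.T                                    # rotate onto the eigenvectors
    S_hat = W @ X                              # Cov(S_hat) is now diag(d)
    return S_hat, W
```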

11 Another solution. We can also solve for a matrix inverse square root: requiring Cov(Ŝ) = W·(X·Xᵀ)·Wᵀ = I is satisfied by W = (X·Xᵀ)^(-1/2), which can be computed as (X·Xᵀ)^(-1/2) = U·D^(-1/2)·Uᵀ, where [U, D] = eig(X·Xᵀ) and D^(-1/2) = diag(d1^(-1/2), …, dN^(-1/2)).
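The same idea as a numpy sketch: build Cov(X)^(-1/2) from the eigendecomposition so that the output covariance becomes the identity (a sketch; the small epsilon guarding against tiny eigenvalues is my addition).

```python
import numpy as np

def whiten(X, eps=1e-12):
    """Whiten zero-mean data X (channels x samples): output covariance ~= I."""
    X = X - X.mean(axis=1, keepdims=True)
    cov = (X @ X.T) / X.shape[1]
    d, U = np.linalg.eigh(cov)
    W = U @ np.diag(1.0 / np.sqrt(d + eps)) @ U.T   # Cov(X)^(-1/2)
    Z = W @ X                                       # Cov(Z) ~= I
    return Z, W
```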

12 Another approach. What if we want to do this in real time? We can also estimate W in an online manner: ΔW = (I − W·x(t)·x(t)ᵀ·Wᵀ)·W. Every time we see a new observation x(t) we update W ← W + ΔW. Using this adaptive approach, Cov(W·x(t)) → I as ΔW → 0. We'll skip the derivation details for now.
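A sketch of the online update; the small learning rate mu and the number of passes are assumptions (the slide's update omits the step size):

```python
import numpy as np

def online_decorrelate(X, mu=0.01, epochs=5):
    """Adapt W with dW = (I - W x x^T W^T) W, one sample at a time."""
    n = X.shape[0]
    W = np.eye(n)
    I = np.eye(n)
    for _ in range(epochs):
        for x in X.T:                          # each observation x(t)
            y = W @ x                          # current output W x(t)
            W = W + mu * (I - np.outer(y, y)) @ W
    return W
```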

13 Summary so far. For a mixture X = A·S we can algebraically recover an uncorrelated output using Ŝ = W·X, where W is the eigenvector matrix of Cov(X), or W = Cov(X)^(-1/2). Or we can use the online estimator ΔW = (I − W·x(t)·x(t)ᵀ·Wᵀ)·W.

14 So how well does this work? [Figure: mixing system x(t) = A·s(t) and unmixing system ŝ(t) = W·x(t)] Well, that was a waste of time…

15 What went wrong? What does decorrelation mean? That the two things compared are not related on average. Consider a mixture of two Gaussians: s(t) = [N(0,2); N(0,1)] and x(t) = A·s(t).

16 Decorrelation. Now let us do what we derived so far on this signal: a) find the eigenvectors, b) rotate and scale so that the covariance is I. After we are done, the two Gaussians are statistically independent, i.e., P(s1, s2) = P(s1)·P(s2). We have in effect separated the original signals, save for a scaling ambiguity.

17 Now let's try this on the original data. [Figure: x(t) = A·s(t)] Stating the obvious: these are not very Gaussian signals!!

18 Doing PCA doesn't give us the right solution. Even after a) finding the eigenvectors and b) rotating and scaling so that the covariance is I, the result is not what we want: we are off by a rotation. This idea doesn't seem to work for non-Gaussian signals.

19 So what's wrong? For Gaussian data, decorrelation means independence: Gaussians have statistics only up to second order (1st is the mean, 2nd is the variance), so by minimizing the 2nd-order cross-statistics we achieve independence. These statistics are expressed by the 2nd-order cross-cumulants, cum(xi, xj) = ⟨xi·xj⟩, which are just the entries of the covariance matrix. But real-world data are seldom Gaussian. Non-Gaussian data have higher-order statistics which are not taken care of by PCA; we can measure their dependence using higher-order cross-cumulants, e.g. 3rd order: cum(xi, xj, xk) = ⟨xi·xj·xk⟩, and 4th order: cum(xi, xj, xk, xl) = ⟨xi·xj·xk·xl⟩ − ⟨xi·xj⟩⟨xk·xl⟩ − ⟨xi·xk⟩⟨xj·xl⟩ − ⟨xi·xl⟩⟨xj·xk⟩.

20 Cumulants for the Gaussian vs the non-Gaussian case. Gaussian case: all cross-cumulants tend to zero, e.g. cum(xi, xj) ≈ 1e-13, and the 3rd- and 4th-order cross-cumulants are similarly negligible. Non-Gaussian case: only the 2nd-order cross-cumulants tend to zero, e.g. cum(xi, xj) ≈ −4e-14, while cum(xi, xj, xk) ≈ −0.08, −0.1 and cum(xi, xj, xk, xl) ≈ 0.42, −0.3, …
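A quick way to see this numerically: estimate a few cross-cumulants of decorrelated Gaussian vs. non-Gaussian mixtures, using the cumulant formulas from the previous slide and the whiten() helper from the sketch after slide 11. The mixing matrix, source distributions, and the specific index combinations are illustrative assumptions, so the sample values will differ from those on the slide.

```python
import numpy as np

def cross_cumulants(z):
    """2nd/3rd/4th-order cross-cumulants of a zero-mean, 2-channel signal z."""
    z1, z2 = z[0], z[1]
    c2 = np.mean(z1 * z2)                               # cum(z1, z2)
    c3 = np.mean(z1 * z1 * z2)                          # cum(z1, z1, z2)
    c4 = (np.mean(z1 * z1 * z2 * z2)
          - np.mean(z1 * z1) * np.mean(z2 * z2)
          - 2 * np.mean(z1 * z2) ** 2)                  # cum(z1, z1, z2, z2)
    return c2, c3, c4

rng = np.random.default_rng(0)
A = np.array([[0.8, 0.3], [0.4, 0.9]])

gauss = A @ rng.normal(size=(2, 100000))                # Gaussian sources
laplace = A @ rng.laplace(size=(2, 100000))             # non-Gaussian (super-Gaussian) sources

for name, X in [("Gaussian", gauss), ("non-Gaussian", laplace)]:
    Z, _ = whiten(X)                                    # decorrelate first
    print(name, cross_cumulants(Z))                     # higher orders vanish only for Gaussians
```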

21 The real problem to solve. For statistical independence we need to minimize all cross-cumulants; in practice up to 4th order is enough. For 2nd order we minimized the off-diagonal elements of the covariance matrix [cum(x1,x1) cum(x1,x2); cum(x2,x1) cum(x2,x2)]. For 4th order we will do the same for a tensor Q with entries Q_{i,j,k,l} = cum(xi, xj, xk, xl). The process is similar to PCA, but in more dimensions: we now find eigenmatrices instead of eigenvectors. Algorithms like JADE and FOBI solve this problem. Can you see a potential problem though?
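FOBI is one of the simplest algebraic ways to use 4th-order statistics: after whitening, the remaining rotation is given by the eigenvectors of the norm-weighted covariance E[‖z‖²·z·zᵀ]. A minimal sketch, reusing the whiten() helper and the mixtures X from the earlier examples; it assumes the sources have distinct kurtoses, and is not the slides' exact algorithm.

```python
import numpy as np

def fobi(X):
    """FOBI: whiten, then eigen-decompose the norm-weighted covariance."""
    Z, W_white = whiten(X)                         # from the earlier sketch
    weights = np.sum(Z ** 2, axis=0)               # ||z(t)||^2 for each sample
    C = (Z * weights) @ Z.T / Z.shape[1]           # E[ ||z||^2 z z^T ]
    _, V = np.linalg.eigh(C)                       # eigenvectors give the rotation
    W = V.T @ W_white                              # total unmixing matrix
    return W @ X, W

S_hat, W = fobi(X)   # recovers the sources up to scale/sign and permutation
```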

22 An alternative approach. Tensorial methods can be very computationally intensive; how about an online method instead? Independence can also be cast as non-linear decorrelation: x and y are independent if and only if ⟨f(x)·g(y)⟩ = ⟨f(x)⟩·⟨g(y)⟩ for all continuous functions f and g. This is a non-linear extension of 2nd-order decorrelation, where f(x) = g(x) = x. We can try solving for that then.

23 Online ICA. Conceptually this is very similar to online decorrelation. For decorrelation: ΔW = (I − W·x(t)·x(t)ᵀ·Wᵀ)·W. For non-linear decorrelation: ΔW = (I − f(W·x(t))·g(W·x(t))ᵀ)·W. This adaptation method is known as the Cichocki-Unbehauen update, but it can be obtained in many different ways. How do we pick the non-linearities? It depends on the prior we have on the sources: f(xi) uses a tanh(xi)-based nonlinearity, with one form for super-Gaussian sources and another for sub-Gaussian sources.
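A sketch of the non-linear version of the online rule. The choice f(y) = tanh(y), g(y) = y is one common pick for super-Gaussian sources such as speech; that choice, the learning rate, and the number of passes are assumptions, not the slide's exact settings.

```python
import numpy as np

def online_ica(X, mu=0.001, epochs=10):
    """Cichocki-Unbehauen-style update: dW = (I - f(y) g(y)^T) W."""
    n = X.shape[0]
    W = np.eye(n)
    I = np.eye(n)
    f = np.tanh                    # assumed non-linearity for super-Gaussian sources
    g = lambda y: y
    for _ in range(epochs):
        for x in X.T:
            y = W @ x
            W = W + mu * (I - np.outer(f(y), g(y))) @ W
    return W

W = online_ica(X)                  # X: mixtures (channels x samples), ideally whitened first
S_hat = W @ X
```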

24 Other popular approaches.
- Minimum Mutual Information: minimize the mutual information of the outputs, creating maximally statistically independent outputs.
- Infomax: maximize the entropy of the output, or the mutual information between input and output.
- Non-Gaussianity: adding signals tends towards Gaussianity (Central Limit Theorem), so finding the maximally non-Gaussian outputs undoes the mixing.
- Maximum Likelihood: less straightforward at first, but elegant nevertheless.
- Geometric methods: trying to eyeball the proper way to rotate.

25 Trying this on our dataset. [Figure: the sources s(t), the mixing matrix A, and the mixtures x(t)]

26 Trying this on our dataset. [Figure: x(t) = A·s(t) and ŝ(t) = W·x(t)] We actually separated the mixture!

27 Trying this on our dataset. [Figure: the recovered sources ŝ(t) = W·x(t)]

28 But something is amiss… There are some things that ICA will not resolve. Scale: statistical independence is invariant to scale (and sign). Order of outputs: the ordering is irrelevant when talking about independence. ICA will actually recover ŝ(t) = D·P·s(t), where D is a diagonal matrix and P is a permutation matrix.

29 This works really well for audio mixtures! [Audio demo: inputs, mix, outputs]

30 Problems with instantaneous mixing. Sounds don't really mix instantaneously; there are multiple effects: room reflections, sensor response, propagation delays, and propagation/reflection filtering. Most can be seen as filters, so we need a convolutive mixing model. [Figure: estimated sources using the instantaneous model on a convolutive mix]

31 Convolutive mixing. Instead of instantaneous mixing, xi(t) = Σj aij·sj(t), we now have convolutive mixing: xi(t) = Σj Σk aij(k)·sj(t − k). The mixing filters aij(k) encapsulate all the mixing effects in this model. But how do we do ICA now? This is an ugly equation!

32 FIR matrix algebra. Matrices with FIR filters as elements: A = [a11 a12; a21 a22], where each element aij = [aij(0) … aij(k−1)] is a filter. FIR matrix multiplication performs convolution and accumulation: A·b = [a11 a12; a21 a22]·[b1; b2] = [a11*b1 + a12*b2; a21*b1 + a22*b2], where * denotes convolution.
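A small numpy sketch of FIR matrix multiplication: each "element" is a filter, and the product convolves and accumulates. The example filters are arbitrary placeholders.

```python
import numpy as np

def fir_matmul(A, b):
    """Multiply a FIR matrix A (list of lists of filters) by a FIR vector b.

    A[i][j] and b[j] are 1-D arrays of filter taps; result element i is
    sum_j A[i][j] * b[j], where * denotes convolution.
    """
    rows = []
    for i in range(len(A)):
        terms = [np.convolve(A[i][j], b[j]) for j in range(len(b))]
        n = max(len(t) for t in terms)                      # pad to a common length
        rows.append(sum(np.pad(t, (0, n - len(t))) for t in terms))
    return rows

# Example: 2x2 FIR matrix times a 2x1 FIR vector
A = [[np.array([1.0, 0.5]), np.array([0.2])],
     [np.array([0.0, 0.3]), np.array([1.0, -0.4])]]
b = [np.array([1.0, 1.0]), np.array([0.5, 0.0, 0.25])]
y = fir_matmul(A, b)       # y[0] = a11*b1 + a12*b2, y[1] = a21*b1 + a22*b2
```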

33 Back to convolutive mixing. Now we can rewrite convolutive mixing, xi(t) = Σj Σk aij(k)·sj(t − k), as x(t) = A·s(t), i.e. [x1(t); x2(t)] = [a11*s1(t) + a12*s2(t); a21*s1(t) + a22*s2(t)], where A is a FIR matrix. A tidier formulation! We can use the FIR matrix abstraction to solve this problem now.

34 An easy way to solve convolutive mixing. Straightforward translation of the instantaneous learning rules using FIR matrices: ΔW = (I − f(W·x)·g(W·x)ᵀ)·W, where W is now a FIR matrix. Not so easy with algebraic approaches! Multiple other (and more rigorous/better-behaved) approaches have been developed.

35 Complications with this approach. The required convolutions are expensive: real-room filters are long, their FIR inverses are very long, and FIR products can become very time consuming. Convergence is hard to achieve: a huge parameter space with tightly interwoven parameter relationships. A slow optimization nightmare!

36 FIR matrix algebra, part II. FIR matrices have frequency-domain counterparts, and their products are simpler: convolution in time becomes element-wise multiplication at each frequency.

37 Yet another convolutive mixing formulation. We can now model the process in the frequency domain: X̂ = Â·Ŝ. For every frequency we have Xf(t) = Af·Sf(t), for all f and t. Hey, that's instantaneous mixing! We can solve that!

38 Overall flowgraph. Convolved mixtures x1 … xN → frequency transform (M-point STFT) → mixed frequency bins → instantaneous ICA unmixers W1 … WM, one per bin → unmixed frequency bins → time transform (M-point ISTFT) → recovered sources S1 … SN.
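A sketch of that flowgraph in Python, using scipy's STFT/ISTFT and an independent unmixing matrix per frequency bin. The per-bin rule below (a natural-gradient-style update with the score function y/|y|) is one common choice for complex-valued speech data and is my assumption, not the slides' exact algorithm; the permutation and scaling issues of the next slides are ignored here.

```python
import numpy as np
from scipy.signal import stft, istft

def ica_bin(Xf, mu=0.1, iters=200):
    """Crude complex-valued ICA for one frequency bin (channels x frames)."""
    n = Xf.shape[0]
    W = np.eye(n, dtype=complex)
    for _ in range(iters):
        Y = W @ Xf
        phi = Y / (np.abs(Y) + 1e-9)                 # assumed score function y/|y|
        W = W + mu * (np.eye(n) - (phi @ Y.conj().T) / Xf.shape[1]) @ W
    return W

def separate_convolutive(x, fs, nperseg=1024):
    """x: (n_mics, n_samples) convolutive mixture -> per-bin ICA -> time signals."""
    _, _, X = stft(x, fs=fs, nperseg=nperseg)        # (n_mics, n_freqs, n_frames)
    S = np.zeros_like(X)
    for f in range(X.shape[1]):
        Wf = ica_bin(X[:, f, :])                     # instantaneous ICA in this bin
        S[:, f, :] = Wf @ X[:, f, :]                 # permutation/scaling unresolved here
    _, s_hat = istft(S, fs=fs, nperseg=nperseg)      # back to waveforms
    return s_hat
```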

39 Some complications. Permutation issues: we don't know which source will end up in each narrowband output, so the result can contain separated narrowband elements from both sounds! [Figure: extracted source with permutations] Scaling issues: narrowband outputs can be scaled arbitrarily, which results in spectrally colored outputs. [Figure: original source vs. colored source]

40 Scaling issue. One simple fix is to normalize the separating matrices, Wf_norm = Wf_orig / ‖Wf_orig‖, which results in more reasonable scaling. More sophisticated approaches exist, but this is not a major problem; some spectral coloration is however unavoidable. [Figure: original, colored, and corrected source]

41 Some solutions for permutation problems. Continuity of unmixing matrices: adjacent unmixing matrices tend to be a little similar, so we can permute/bias them accordingly; doesn't work that great. Smoothness of spectral output: narrowband components from each source tend to modulate the same way, so permute the unmixing matrices to ensure adjacent narrowband outputs are similarly modulated; works fine. Both can fail miserably for more than two sources: combinatorial explosion!
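A sketch of the "smoothness of spectral output" idea for two sources: march across frequency bins, compare the amplitude envelopes of the two outputs with those of the previous (already aligned) bin, and swap the outputs in a bin if the swapped assignment correlates better. The function name and the simple adjacent-bin comparison are illustrative assumptions.

```python
import numpy as np

def align_permutations(S):
    """S: (2, n_freqs, n_frames) bin-wise ICA outputs. Permutes each bin so that
    adjacent narrowband outputs are similarly modulated."""
    env = np.abs(S)                                      # amplitude envelopes per bin
    for f in range(1, S.shape[1]):
        prev, cur = env[:, f - 1, :], env[:, f, :]
        keep = (np.corrcoef(prev[0], cur[0])[0, 1] +
                np.corrcoef(prev[1], cur[1])[0, 1])
        swap = (np.corrcoef(prev[0], cur[1])[0, 1] +
                np.corrcoef(prev[1], cur[0])[0, 1])
        if swap > keep:
            S[:, f, :] = S[[1, 0], f, :]                 # swap the two outputs in this bin
            env[:, f, :] = env[[1, 0], f, :]
    return S
```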

42 Beamforming and ICA. If we know the placement of the sensors we can obtain the spatial response of the ICA solution: ICA places nulls to cancel out interfering sources, just as in the instantaneous case. We can now visualize the permutation problem: out-of-place bands are the bands with permutation problems.

43 Using beamforming to resolve permutations. Spatial information can be used to resolve permutations: find permutations that preserve the nulls or smooth out the responses. Works fine, although it can be flaky if the array response is not that clean.

44 The N-input N-output problem. ICA, in either formulation, inverts a square matrix (whether scalar or FIR), which implies that we have the same number of inputs as outputs: e.g. on a street with 30 noise sources we need at least 30 mics! Solutions exist for M inputs and N outputs where M > N; if N > M we can only beamform, and in some cases extra sources can be treated as noise. This can be restrictive in some situations.

45 Separation recap. Orthogonality is not independence!! Not all signals are Gaussian, which is a usual assumption. We can model instantaneous mixtures with ICA and get good results; ICA algorithms can optimize a variety of objectives, but ultimately produce statistically independent outputs. The same model is useful for all sorts of mixing situations. Convolutive mixtures are more challenging but solvable; there's more ambiguity, and a closer link to signal-processing approaches.

46 ICA for data exploration. ICA is also great for data exploration (if PCA is, then ICA should be, right?). With data of large dimensionality we want to find structure: PCA can reduce the dimensionality and clean up the data structure a bit, but ICA can find much more intuitive projections.

47 Example cases of PCA vs ICA. Motivation for using ICA vs PCA: PCA will indicate orthogonal directions of maximal variance, which is great for Gaussian data (and if we are into least-squares models), but real-world data is not Gaussian; ICA finds directions that are more revealing. [Figure: PCA vs ICA directions on non-Gaussian data]

48 Finding useful transforms with ICA. Audio preprocessing example: take a lot of audio snippets, concatenate them into a big matrix, and do component analysis. PCA results in the DCT bases (do you see why?), while ICA returns time/frequency-localized sinusoids, which are a better way to analyze sounds. Ditto for images: ICA returns localized edge filters.

49 Enhancing PCA with ICA. ICA cannot perform dimensionality reduction: the goal is to find independent components, hence there is no sense of order. PCA does a great job at reducing dimensionality, keeping the elements that carry most of the input's energy. It turns out that PCA is a great preprocessor for ICA; there is no guarantee that the PCA subspace will be appropriate for the independent components, but for most practical purposes this doesn't make a big difference. Pipeline: M×N input → M×K PCA → K×K ICA → K×N ICs.
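One way to realize the M → K pipeline in practice, sketched with scikit-learn's PCA and FastICA. The random placeholder data, the value of K, and the choice of FastICA (just one of the ICA algorithms mentioned earlier) are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

# Placeholder data: one observation per row (n_samples x M features),
# standing in for e.g. image patches or audio frames
rng = np.random.default_rng(0)
X = rng.laplace(size=(5000, 64))

K = 10
X_reduced = PCA(n_components=K, whiten=True).fit_transform(X)   # M -> K dimensions
ica = FastICA(n_components=K, whiten=False)                     # rotate within the PCA subspace
components = ica.fit_transform(X_reduced)                       # K independent components
```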

50 Example case: ICA-faces vs. Eigenfaces. [Figure: ICA-faces (left) vs. Eigenfaces (right)]

51 A video example. The movie is a series of frames, and each frame is a data point (frames of 126,… pixels), so the data X will be pixels × frames. Using PCA/ICA, X = W·H: W will contain visual components and H will contain their time weights.

52 PCA results. Nothing special about the visual components: they are orthogonal pictures, but does this mean anything? (not really…). Some segmentation of constant vs. moving parts, and some highlighting of the action in the weights.

53 ICA results. Much more interesting visual components: they are independent, so unrelated elements (left/right hands, background) are now highlighted. We get some decomposition by parts, and the component weights now describe the scene.

54 A video example. The movie is a series of frames, and each frame is a data point (frames of 315,… pixels), so the data X will be pixels × frames. Using PCA/ICA, X = W·H: W will contain visual components and H their time weights. [Figure: input movie and independent neighborhoods]

55 What about the soundtrack? We can also analyze audio in a similar way: we do a frequency transform and get an audio spectrogram X, which is frequencies × time. Distinct audio elements can be seen in X. Unlike before, we have only one input this time.

56 PCA on audio. Umm… it sucks! Orthogonality doesn't mean much for audio components; the results are mathematically optimal, perceptually useless.

57 ICA on audio. A definite improvement: independence helps pick up somewhat more meaningful sound objects. Not too clean results, but the intentions are clear; it misses some details.

58 Audio-visual components? We can even take in both audio and video data and try to find structure. Sometimes there is a very strong correlation between auditory and visual elements; we should be able to discover that automatically. [Figure: input video and input audio]

59 Audio/visual components. [Figure: extracted audio/visual components]

60 Which allows us to play with the output. And of course, once we have such a nice description we can resynthesize at will. [Demo: resynthesis]

61 Recap.
- A motivating example
- The theory: decorrelation, independence vs. decorrelation, independent component analysis
- Separating sounds: solving instantaneous mixtures, solving convolutive mixtures
- Data exploration and independence: extracting audio features, extracting multimodal features
