Machine Learning for Signal Processing: Independent Component Analysis. Class, 8 Nov 2012. Instructor: Bhiksha Raj. 11-755/18-797

1 Machine Learning for Signal Processing: Independent Component Analysis. Class, 8 Nov 2012. Instructor: Bhiksha Raj. 11-755/18-797

2 A brief review of basic probability. Uncorrelated: two random variables X and Y are uncorrelated iff the average value of their product equals the product of their individual averages: E[XY] = E[X]E[Y]. Setup: each draw produces one instance of X and one instance of Y, i.e. one instance of (X,Y).

3 Uncorrelatedness. Which of the above represent uncorrelated RVs?

4 A brief review of basic probability. Independence: two random variables X and Y are independent iff their joint probability equals the product of their individual probabilities: P(X,Y) = P(X)P(Y). Also, the average value of X is the same regardless of the value of Y: E[X|Y] = E[X].

5 A brief review of basic probability. Independence: two random variables X and Y are independent iff the average value of any function of X is the same regardless of the value of Y; more generally, E[f(X)g(Y)] = E[f(X)]E[g(Y)] for all f(), g().

6 Independence. Which of the above represent independent RVs? Which represent uncorrelated RVs?

7 A brief review of basic probability. The expected value of an odd function of an RV is 0 if the RV is zero mean and the PDF of the RV is symmetric around 0: E[f(X)] = 0 if f(x) is odd symmetric.

8 A brief review of basic info. theory. Example symbols: X ∈ {T(all), M(ed), S(hort)}, Y ∈ {M, F}. Entropy: H(X) = Σ_X P(X)[-log P(X)], the minimum average number of bits to transmit to convey a symbol X. Joint entropy: H(X,Y) = Σ_{X,Y} P(X,Y)[-log P(X,Y)], the minimum average number of bits to convey sets (pairs, here) of symbols.

9 A brief review of basic info. theory. Conditional entropy: H(X|Y) = Σ_Y P(Y) Σ_X P(X|Y)[-log P(X|Y)] = Σ_{X,Y} P(X,Y)[-log P(X|Y)], the minimum average number of bits to transmit to convey a symbol X after symbol Y has already been conveyed, averaged over all values of X and Y.

10 A brief review of basic info. theory. The conditional entropy of X equals H(X) if X is independent of Y: H(X|Y) = Σ_{X,Y} P(X,Y)[-log P(X|Y)] = Σ_{X,Y} P(X,Y)[-log P(X)] = Σ_X P(X)[-log P(X)] = H(X). The joint entropy of X and Y is the sum of the entropies of X and Y if they are independent: H(X,Y) = Σ_{X,Y} P(X,Y)[-log P(X,Y)] = Σ_{X,Y} P(X,Y)[-log P(X)P(Y)] = Σ_{X,Y} P(X,Y)[-log P(X)] + Σ_{X,Y} P(X,Y)[-log P(Y)] = H(X) + H(Y).
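
A quick numerical illustration of these definitions in numpy; the marginal PMFs below are made-up values, not from the lecture:

```python
import numpy as np

# Hypothetical marginals for X in {T, M, S} and Y in {M, F}; the joint is their
# product, so X and Y are independent by construction.
P_X = np.array([0.3, 0.4, 0.3])
P_Y = np.array([0.6, 0.4])
P_XY = np.outer(P_X, P_Y)

H = lambda p: -np.sum(p * np.log2(p))    # entropy in bits
H_XY = H(P_XY)                           # joint entropy
H_X_given_Y = H_XY - H(P_Y)              # chain rule: H(X|Y) = H(X,Y) - H(Y)

print(np.isclose(H_XY, H(P_X) + H(P_Y))) # True: joint entropy adds under independence
print(np.isclose(H_X_given_Y, H(P_X)))   # True: conditioning on Y does not help
```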

11 Onward..

12 Projection: multiple notes. M = [the spectrogram]; W = [the note spectra]. P = W (W^T W)^{-1} W^T. Projected spectrogram = P M.
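
A minimal numpy sketch of this projection; the spectrogram M and note dictionary W below are random stand-ins with made-up shapes:

```python
import numpy as np

# Hypothetical data: M is a (frequencies x frames) spectrogram, W holds one note
# spectrum per column; the sizes here are only for illustration.
rng = np.random.default_rng(0)
M = rng.random((1024, 200))
W = rng.random((1024, 4))

P = W @ np.linalg.inv(W.T @ W) @ W.T   # projection matrix onto the column space of W
M_proj = P @ M                         # projected spectrogram
```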

13 We're actually computing a score. M ≈ WH, where W holds the notes and H is the score (transcription): H = pinv(W) M.

14 How about the other way? M ≈ WH: given the transcription H, the notes are W = M pinv(H), and U = WH is the resulting approximation.

15 So what are we doing here? M ≈ WH is an approximation. Given W, estimate H to minimize the error: Ĥ = arg min_H ||M − WH||_F^2 = arg min_H Σ_i Σ_j (M_ij − (WH)_ij)^2. Must ideally find the transcription of the given notes.

16 Going the other way.. M ≈ WH is an approximation. Given H, estimate W to minimize the error: Ŵ = arg min_W ||M − WH||_F^2 = arg min_W Σ_i Σ_j (M_ij − (WH)_ij)^2. Must ideally find the notes corresponding to the transcription.
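
Both one-sided estimates reduce to a pseudoinverse. A hedged numpy sketch, again with random stand-ins:

```python
import numpy as np

# Least-squares estimates for the two directions above; W (notes), H (transcription)
# and M (spectrogram) are random stand-ins with illustrative shapes.
rng = np.random.default_rng(0)
M = rng.random((1024, 200))
W = rng.random((1024, 4))
H = rng.random((4, 200))

H_hat = np.linalg.pinv(W) @ M   # given the notes W, estimate the transcription H
W_hat = M @ np.linalg.pinv(H)   # given the transcription H, estimate the notes W
```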

17 When both parameters are unknown: approx(M) = WH with W = ? and H = ?. Must estimate both H and W to best approximate M. Ideally, must learn both the notes and their transcription!

18 A least squares solution. Unconstrained: (Ŵ, Ĥ) = arg min_{W,H} ||M − WH||_F^2. For any (W, H) that minimizes the error, W' = WA and H' = A^{-1}H also minimize the error, for any invertible A. For our problem, let's consider the truth: when one note occurs, the other does not, so h_i h_j^T = 0 for all i ≠ j. The rows of H are uncorrelated.

19 A least squares solution. Assume HH^T = I (normalize all rows of H to length 1), so pinv(H) = H^T. Projecting M onto H: W = M pinv(H) = M H^T, so WH = M H^T H. Then (Ŵ, Ĥ) = arg min_{W,H} ||M − WH||_F^2 becomes Ĥ = arg min_H ||M − M H^T H||_F^2. Constraint: Rank(H) = 4.

20 Finding the notes. Ĥ = arg min_H ||M − M H^T H||_F^2. Note H^T H ≠ I; only H H^T = I. Could also be rewritten as arg min_H trace( M (I − H^T H) M^T ) = arg min_H trace( M^T M (I − H^T H) ) = arg min_H trace( Correlation(M^T) (I − H^T H) ) = arg max_H trace( H Correlation(M^T) H^T ).

21 Finding the notes. Constraint: every row of H has length 1, i.e. H H^T = I. Ĥ = arg max_H trace( H Correlation(M^T) H^T ). Differentiating and equating to 0: Correlation(M^T) H^T = H^T Λ. Simply requiring the rows of H to be orthonormal gives us that H is the set of eigenvectors of the data in M^T.
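
A hedged numpy sketch of this eigenvector solution under the orthonormal-rows constraint; the data and the number of notes K are illustrative:

```python
import numpy as np

# M stands in for a (frequencies x frames) spectrogram; K is the number of notes
# we try to recover. Both are illustrative choices, not values from the lecture.
rng = np.random.default_rng(0)
M = rng.random((1024, 200))
K = 4

R = M.T @ M                            # correlation of the data in M^T (frames x frames)
eigvals, eigvecs = np.linalg.eigh(R)   # eigenvalues in ascending order
H = eigvecs[:, ::-1][:, :K].T          # top-K eigenvectors become the rows of H
W = M @ H.T                            # since H H^T = I, pinv(H) = H^T
```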

22 Equivalences. Ĥ = arg max_H trace( H Correlation(M^T) H^T ) with H H^T = I is identical to (Ŵ, Ĥ) = arg min_{W,H} ||M − WH||_F^2 with ||h_i|| = 1 and h_i h_j^T = 0 for i ≠ j: minimize the least squares error with the constraint that the rows of H are length 1 and orthogonal to one another.

23 So how does that work? There are 12 notes in the segment, hence we try to estimate 12 notes..

24 So how does that work? The first three notes and their contributions. The spectrograms of the notes are statistically uncorrelated.

25 Finding the notes. Can find W instead of H: Ŵ = arg min_W ||M − W W^T M||_F^2 = arg max_W trace( W^T Correlation(M) W ). Solving the above, with the constraint that the columns of W are orthonormal, gives you the eigenvectors of the data in M: Correlation(M) W = W Λ.

26 So how does that work? There are 12 notes in the segment, hence we try to estimate 12 notes..

27 Our notes are not orthogonal. Overlapping frequencies; notes occur concurrently; the harmonica continues to resonate to the previous note. More generally, simple orthogonality will not give us the desired solution.

28 What else can we look for? Assume: the transcription of one note does not depend on what else is playing. Or, in a multi-instrument piece, the instruments are playing independently of one another. Not strictly true, but still..

29 Formulating it with Independence. (Ŵ, Ĥ) = arg min_{W,H} ||M − WH||_F^2 such that the rows of H are independent. Impose statistical independence constraints on the decomposition.

30 Changing problems for a bit. m_1(t) = w_11 h_1(t) + w_12 h_2(t), m_2(t) = w_21 h_1(t) + w_22 h_2(t). Two people speak simultaneously, recorded by two microphones. Each recorded signal is a mixture of both signals.

31 Imposing Statistical Constraints. M = WH with W = [w_11 w_12; w_21 w_22]: M is the mixed signal, W the notes, H the transcription. The rows of M are the signals at mic 1 and mic 2; the rows of H are the signals from speaker 1 and speaker 2. Given only M, estimate H. Ensure that the components of the vectors in the estimated H are statistically independent. Multiple approaches..

32 Imposing Statistical Constraints. M = WH with W = [w_11 w_12; w_21 w_22]. Given only M, estimate H = W^{-1} M = AM: estimate A such that the components of AM are statistically independent. A is the unmixing matrix. Multiple approaches..

33 Statistical Independence. M = WH, H = AM. Emulating independence: compute W (or A) and H such that H has statistical characteristics that are observed in statistically independent variables. Enforcing independence: compute W and H such that the components of H = AM are independent.

34 Emulating Independence. The rows of H are uncorrelated: E[h_i h_j] = E[h_i]E[h_j], where h_i and h_j are the i-th and j-th components of any vector in H. The fourth order moments are independent: E[h_i h_j h_k h_l] = E[h_i]E[h_j]E[h_k]E[h_l], E[h_i^2 h_j h_k] = E[h_i^2]E[h_j]E[h_k], E[h_i^2 h_j^2] = E[h_i^2]E[h_j^2], etc.

35 Zero Mean. Usual to assume zero mean processes; otherwise some of the math doesn't work well. M = WH, H = AM. If mean(M) = 0, then mean(H) = 0: E[H] = A E[M] = A 0 = 0. First step of ICA: set the mean of M to 0. μ = (1/cols(M)) Σ_i m_i, where the m_i are the columns of M; then m_i ← m_i − μ.
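
The centering step is one line of numpy; a trivial sketch with a made-up data shape:

```python
import numpy as np

# Each column m_i of M is one observation vector; remove the mean column.
rng = np.random.default_rng(0)
M = rng.random((2, 10000))

mu = M.mean(axis=1, keepdims=True)   # (1 / cols(M)) * sum_i m_i
M = M - mu                           # now mean(M) = 0 along each row
```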

36 Emulating Independence. Independence implies uncorrelatedness, so first estimate a C such that X = CM is uncorrelated. H = AM with A = BC, so H = BCM = BX. E[x_i x_j] = E[x_i]E[x_j] = 0 for i ≠ j [since M is now centered]: XX^T = I. In reality we only want this to be a diagonal matrix, but we'll make it identity.

37 Decorrelating. H = AM, A = BC, H = BCM, X = CM, XX^T = I. Eigen decomposition: MM^T = USU^T. Let C = S^{-1/2} U^T; then CMM^T C^T = S^{-1/2} U^T U S U^T U S^{-1/2} = I.

38 Decorrelating. H = AM, A = BC, H = BCM, X = CM, XX^T = I. Eigen decomposition: MM^T = ESE^T. Let C = S^{-1/2} E^T; then CMM^T C^T = S^{-1/2} E^T E S E^T E S^{-1/2} = I. X is called the whitened version of M; the process of decorrelating M is called whitening; C is the whitening matrix.
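
A hedged numpy sketch of whitening, assuming M has already been centered (the data here is a random stand-in):

```python
import numpy as np

# M stands in for a centered (channels x samples) mixed signal.
rng = np.random.default_rng(0)
M = rng.standard_normal((2, 10000))

S, E = np.linalg.eigh(M @ M.T)           # M M^T = E S E^T
C = np.diag(S ** -0.5) @ E.T             # whitening matrix C = S^(-1/2) E^T
X = C @ M                                # whitened data
print(np.allclose(X @ X.T, np.eye(2)))   # X X^T = I
```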

39 Uncorrelated ≠ Independent. Whitening merely ensures that the resulting signals are uncorrelated, i.e. E[x_i x_j] = 0 if i ≠ j. This does not ensure that higher order moments are also decoupled, e.g. it does not ensure that E[x_i^2 x_j^2] = E[x_i^2]E[x_j^2], which is one of the signatures of independent RVs. Let's explicitly decouple the fourth order moments.

40 Decorrelating. H = AM, A = BC, H = BX, X = CM, XX^T = I. Will multiplying X by B re-correlate the components? Not if B is unitary: BB^T = B^T B = I. So we want to find a unitary matrix B, since the rows of H are uncorrelated (because they are independent).

41 ICA: Freeing Fourth Moments. We have E[x_i x_j] = 0 if i ≠ j (already decorrelated). A = BC, H = BCM, X = CM, H = BX. The fourth moments of H have the form E[h_i h_j h_k h_l]. If the rows of H were independent, E[h_i h_j h_k h_l] = E[h_i]E[h_j]E[h_k]E[h_l]. Solution: compute B such that the fourth moments of H = BX are decoupled, while ensuring that B is unitary.

42 ICA: Freeing Fourth Moments. Create a matrix of fourth moment terms that would be diagonal were the rows of H independent, and diagonalize it. A good candidate (good because it incorporates the energy in all rows of H): D = [d_11 d_12 d_13 ..; d_21 d_22 d_23 ..; ..], where d_ij = E[Σ_k h_k^2 h_i h_j], i.e. D = E[h^T h · h h^T], where the h are the columns of H. (Assuming h is real; else replace transposition with the Hermitian.)

43 ICA: The D matrix. d_ij = E[Σ_k h_k^2 h_i h_j] = (1/cols(H)) Σ_m (Σ_k h_km^2) h_im h_jm, where m runs over the columns of H: for each column, take the sum of squares of all its components times the product of its i-th and j-th components, and average this term across all columns of H.

44 ICA: The D matrix. d_ij = E[Σ_k h_k^2 h_i h_j] = (1/cols(H)) Σ_m (Σ_k h_km^2) h_im h_jm. If the h_i terms were independent, then for i ≠ j: E[Σ_k h_k^2 h_i h_j] = Σ_{k ≠ i, k ≠ j} E[h_k^2]E[h_i]E[h_j] + E[h_i^3]E[h_j] + E[h_i]E[h_j^3], and since H is centered, E[h_i] = E[h_j] = 0, so E[Σ_k h_k^2 h_i h_j] = 0 for i ≠ j. Thus, if the h_i were independent, d_ij = 0 for i ≠ j, i.e. D would be a diagonal matrix. Let us diagonalize D.

45 Diagonalizing D. Compose a fourth order moment matrix from X. Recall: X = CM, H = BX = BCM; B is what we're trying to learn in order to make H independent. Compose D = E[x^T x · x x^T]. Diagonalize D via eigen decomposition: D = UΛU^T. Then B = U^T. That's it!!!!
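
A hedged numpy sketch of this rotation step; X below is a random stand-in for the whitened data:

```python
import numpy as np

# X stands in for whitened data; the columns of X are the vectors x.
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 10000))

sq_norms = np.sum(X ** 2, axis=0)         # x^T x for every column
D = (X * sq_norms) @ X.T / X.shape[1]     # D = E[x^T x * x x^T], averaged over columns
lam, U = np.linalg.eigh(D)                # D = U Lambda U^T
B = U.T                                   # the unitary rotation
H = B @ X                                 # fourth-moment matrix of H is (empirically) diagonal
```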

46 B frees the fourth moment. D = UΛU^T; B = U^T. U is a unitary matrix, i.e. U^T U = UU^T = I (identity). H = BX = U^T X, so h = U^T x. The fourth moment matrix of H is E[h^T h · h h^T] = E[x^T U U^T x · U^T x x^T U] = E[x^T x · U^T x x^T U] = U^T E[x^T x · x x^T] U = U^T D U = U^T U Λ U^T U = Λ. The fourth moment matrix of H = U^T X is diagonal!!

47 Overall Solution. H = AM = BCM = BX. A = BC = U^T C.

48 Independent Component Analysis. Goal: derive a matrix A such that the rows of AM are independent. Procedure: 1. Center M. 2. Compute the autocorrelation matrix R_MM of M. 3. Compute the whitening matrix C via eigen decomposition: R_MM = ESE^T, C = S^{-1/2} E^T. 4. Compute X = CM. 5. Compute the fourth moment matrix D = E[x^T x · x x^T]. 6. Diagonalize D via eigen decomposition: D = UΛU^T. 7. Compute A = U^T C. The fourth moment matrix of H = AM is diagonal. Note that the autocorrelation matrix of H will also be diagonal.
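
The whole recipe collected into one hedged numpy sketch; the function name, the toy sources and the mixing matrix are my own illustrative choices, not values from the lecture:

```python
import numpy as np

def ica_by_fourth_moments(M, eps=1e-12):
    """Sketch of the procedure above: center, whiten, then rotate so that the
    chosen fourth-moment matrix is diagonal. Returns (A, H) with H = A @ M."""
    M = M - M.mean(axis=1, keepdims=True)                # 1. center M
    R = M @ M.T / M.shape[1]                             # 2. autocorrelation of M
    S, E = np.linalg.eigh(R)                             # 3. R = E S E^T
    C = np.diag(1.0 / np.sqrt(S + eps)) @ E.T            #    C = S^(-1/2) E^T
    X = C @ M                                            # 4. whiten
    D = (X * np.sum(X ** 2, axis=0)) @ X.T / X.shape[1]  # 5. D = E[x^T x * x x^T]
    _, U = np.linalg.eigh(D)                             # 6. D = U Lambda U^T
    A = U.T @ C                                          # 7. unmixing matrix A = U^T C
    return A, A @ M

# Toy usage: a mixture of a uniform and a Laplacian source (distinct fourth moments,
# so this single-matrix diagonalization has a chance of resolving the rotation).
rng = np.random.default_rng(0)
sources = np.vstack([rng.uniform(-1, 1, 20000), rng.laplace(0, 1, 20000)])
mixed = np.array([[1.0, 0.6], [0.4, 1.0]]) @ sources
A, H = ica_by_fourth_moments(mixed)   # rows of H ~ sources, up to order, sign and scale
```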

49 ICA by diagonalizing moment matrices. The procedure just outlined, while fully functional, has shortcomings: only a subset of fourth order moments are considered, and there are many other ways of constructing fourth order moment matrices that would ideally be diagonal; diagonalizing the particular fourth order moment matrix we have chosen is not guaranteed to diagonalize every other fourth order moment matrix. JADE (Joint Approximate Diagonalization of Eigenmatrices, J.-F. Cardoso) jointly diagonalizes several fourth order moment matrices. It is more effective than the procedure shown, but more computationally expensive.

50 Enforcing Independence. Specifically ensure that the components of H = AM are independent. Contrast function: a non-linear function that has a minimum value when the output components are independent. Define and minimize a contrast function F(AM). Contrast functions are often only approximations too..

51 A note on pre-whitening. The mixed signal is usually pre-whitened: normalize the variance along all directions and eliminate second order dependence. X = CM, so that E[x_i x_j] = E[x_i]E[x_j] = 0 for i ≠ j for centered signals. Eigen decomposition MM^T = ESE^T, C = S^{-1/2} E^T. Can use only the first K columns of E if only K independent sources are expected, e.g. in a microphone array setup with only K < M sources.
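
A sketch of dimensionality-reducing pre-whitening that keeps only the K strongest eigenvectors; the channel count and K below are made up:

```python
import numpy as np

# Hypothetical setup: 8 microphones but only 3 expected sources.
rng = np.random.default_rng(0)
M = rng.standard_normal((8, 20000))
K = 3

S, E = np.linalg.eigh(M @ M.T)        # eigenvalues in ascending order
S_k, E_k = S[-K:], E[:, -K:]          # keep the K strongest eigenpairs
C = np.diag(S_k ** -0.5) @ E_k.T      # K x 8 whitening matrix
X = C @ M                             # K-dimensional whitened signal
```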

52 The contrast function. Contrast function: a non-linear function that has a minimum value when the output components are independent. An explicit contrast function: I(h) = Σ_i H(h_i) − H(h) (the mutual information between the components), with the constraint H = BX, where X is the whitened M.

53 Linear Functions. h = Bx, where h and x are individual columns of the H and X matrices: x is the mixed signal and B is the unmixing matrix. P_h(h) = P_x(B^{-1} h) |B^{-1}|. H(x) = ∫ P(x)[−log P(x)] dx, and H(h) = H(x) + log |B|.

54 The contrast function. I(h) = Σ_i H(h_i) − H(h) = Σ_i H(h_i) − H(x) − log |B|. Ignoring H(x) (a constant): J(h) = Σ_i H(h_i) − log |B|. Minimize the above to obtain B.

55 An alternate approach. Recall PCA: M = WH with constraints on the columns of W leads to min_W ||M − W W^T M||^2, an error minimization framework to estimate W. Can we arrive at an error minimization framework for ICA? Define an error objective that represents independence.

56 An alternate approach. Definition of independence: if x and y are independent, then E[f(x)g(y)] = E[f(x)]E[g(y)]. This must hold for every f() and g()!!

57 An alternate approach. Define g(H) = g(BX) and f(H) = f(BX), applied component-wise: g(H) = [g(h_11) g(h_12) ..; g(h_21) g(h_22) ..; ..] and f(H) = [f(h_11) f(h_12) ..; f(h_21) f(h_22) ..; ..].

58 An alternate approach. P = g(H) f(H)^T = g(BX) f(BX)^T: a square matrix with P_ij = Σ_k g(h_ik) f(h_jk). This must ideally equal a matrix Q with decoupled off-diagonal terms: Error = ||P − Q||_F^2.

59 An alternate approach. Ideal value for Q: Q_ij = Σ_k g(h_ik) Σ_l f(h_jl) for i ≠ j, and Q_ii = Σ_k g(h_ik) f(h_ik). If g() and f() are odd symmetric functions, then Σ_j g(h_ij) = 0 for all i, since Σ_j h_ij = 0 (H is centered). So Q is a diagonal matrix!!!

60 An alternate approach. Minimize Error = ||P − Q||_F^2 with P = g(BX) f(BX)^T and Q diagonal. This leads to a trivial Widrow-Hopf type iterative rule: E = Diag( g(BX) f(BX)^T ) − g(BX) f(BX)^T, and B ← B + η E B.
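
A hedged sketch of this iterative rule; the nonlinearities g and f, the step size and the iteration count are my own choices, and the rule itself is as reconstructed above:

```python
import numpy as np

def nonlinear_decorrelation_ica(X, g=np.tanh, f=lambda h: h, eta=1e-3, iters=2000):
    """Drive the off-diagonal part of P = g(BX) f(BX)^T toward zero.
    X is assumed to be centered and whitened; all constants are illustrative."""
    n, T = X.shape
    B = np.eye(n)
    for _ in range(iters):
        H = B @ X
        P = g(H) @ f(H).T / T              # P = g(BX) f(BX)^T (averaged over columns)
        E = np.diag(np.diag(P)) - P        # E = Diag(P) - P
        B = B + eta * E @ B                # B <- B + eta * E * B
    return B
```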

61 Update Rules. Multiple solutions exist under different assumptions for g() and f(). H = BX, B ← B + ΔB. Jutten-Hérault (an online update, which actually assumed a recursive neural network): ΔB_ij = f(h_i) g(h_j). Bell-Sejnowski: ΔB = η([B^T]^{-1} − g(H) X^T).

62 Update Rules. Multiple solutions exist under different assumptions for g() and f(). H = BX, B ← B + ΔB. Natural gradient (f() = identity function): ΔB = η(I − g(H) H^T) B. Cichocki-Unbehauen: ΔB = η(I − g(H) f(H)^T) B.

63 What are g() and f()? They must be odd symmetric functions, and multiple choices have been proposed: g(x) = x + tanh(x) if x is super-Gaussian (audio signals in general), g(x) = x − tanh(x) if x is sub-Gaussian. This gives ΔB = η(I − H H^T − K tanh(H) H^T) B, or simply ΔB = η(I − K tanh(H) H^T) B, with K = +1 for super-Gaussian and K = −1 for sub-Gaussian sources.
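
A hedged numpy sketch of the natural-gradient style update with the tanh nonlinearity; the batch formulation, step size and iteration count are my own choices:

```python
import numpy as np

def natural_gradient_ica(X, eta=1e-3, iters=5000, super_gaussian=True):
    """Batch natural-gradient style ICA update on whitened data X.
    K = +1 for super-Gaussian sources (e.g. speech), K = -1 for sub-Gaussian ones."""
    K = 1.0 if super_gaussian else -1.0
    n, T = X.shape
    B = np.eye(n)
    for _ in range(iters):
        H = B @ X
        delta = np.eye(n) - (H @ H.T) / T - K * (np.tanh(H) @ H.T) / T
        B = B + eta * delta @ B
    return B

# Usage sketch: B_hat = natural_gradient_ica(X); H = B_hat @ X
```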

64 So how does it work? Example with an instantaneous mixture of two speakers, separated with the natural gradient update. Works very well!

65 Another example! [Figure: input signals, their mixtures, and the separated outputs.]

66 Another Example: three instruments..

67 The Notes: three instruments..

68 ICA for data exploration. The bases in PCA represent the building blocks (ideally notes), and have been used very successfully. So can ICA be used to do the same?

69 ICA vs PCA bases. Motivation for using ICA vs PCA: PCA will indicate orthogonal directions of maximal variance, which may not align with the data, e.g. for non-Gaussian data. ICA finds directions that are independent, and is more likely to align with the data.

70 Finding useful transforms with ICA. Audio preprocessing example: take a lot of audio snippets, concatenate them into a big matrix, and do component analysis. PCA results in the DCT bases; ICA returns time/frequency-localized sinusoids, which is a better way to analyze sounds. Ditto for images: ICA returns localized edge filters.

71 Example case: ICA-faces vs. Eigenfaces. [Figure: ICA-face and Eigenface basis images.]

72 ICA for Signal Enhancement. Very commonly used to enhance EEG signals: EEG signals are frequently corrupted by heartbeats and biorhythm signals, and ICA can be used to separate them out.

73 So how does that work? There are 12 notes in the segment, hence we try to estimate 12 notes..

74 PCA solution. There are 12 notes in the segment, hence we try to estimate 12 notes..

75 So how does this work: ICA solution. Better.. but not by much. What are the issues here?

76 ICA Issues. No sense of order (unlike PCA): we get K independent directions, but there is no notion of a best direction, so the sources can come in any order (permutation invariance). No sense of scaling: scaling a signal does not affect its independence. So, in the best case, the outputs are scaled versions of the desired signals in permuted order; in the worst case, the outputs are not the desired signals at all..

77 What else went wrong? We assume the distribution of the signals is symmetric around the mean. Note energy here is not symmetric: negative values never happen. Still, this didn't affect the three-instrument case.. Also, the notes are not independent: only one note plays at a time, so if one note plays, the other notes are not playing.

78 Continue in next class.. NMF, factor analysis..
