# COMP6237 Data Mining Covariance, EVD, PCA & SVD. Jonathon Hare

## Transcription

1 COMP6237 Data Mining Covariance, EVD, PCA & SVD Jonathon Hare

2 Variance and Covariance

3 Random Variables and Expected Values Mathematicians talk about variance (and covariance) in terms of random variables and expected values. A random variable is a variable that takes on different values due to chance. The set of sample values from a single dimension of a feature space can be considered to be a random variable. The expected value (denoted E[X]) is the probability-weighted average of the values a random variable can take. If we assume that the values an element of a feature can take are all equally likely, then the expected value is just the mean value.

4 Variance Variance ($\sigma^2$) is the mean squared difference from the mean ($\mu$). It's a measure of how spread-out the data is. Technically it's $E[(X - E[X])^2]$. For a set of $n$ data points, $x = [x_1, x_2, \dots, x_n]$:

$$\sigma^2(x) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2$$

5 Covariance Covariance ($\sigma(x,y)$) measures how two variables change together. Technically it's $E[(x - E[x])(y - E[y])]$:

$$\sigma(x,y) = \frac{1}{n}\sum_{i=1}^{n}(x_i - \mu_x)(y_i - \mu_y)$$

6 Covariance The variance is the covariance when the two variables are the same ($\sigma(x,x) = \sigma^2(x)$).

7 Covariance A covariance of 0 means the variables are uncorrelated. (Covariance is related to correlation; see notes.)
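The variance and covariance definitions above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the two arrays are made-up sample values and the helper name `covariance` is hypothetical, not something from the lecture.

```python
import numpy as np

# Two made-up feature dimensions measured on the same five items.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

def covariance(a, b):
    """Mean product of the deviations from the means (divides by n)."""
    return np.mean((a - a.mean()) * (b - b.mean()))

var_x = covariance(x, x)   # the variance is the covariance of x with itself
cov_xy = covariance(x, y)  # positive here: x and y tend to increase together

print("variance of x:", var_x)
print("covariance of x and y:", cov_xy)
```

Note that this divides by n, matching the formulas above; some libraries divide by n - 1 by default.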

8 Covariance Matrix A covariance matrix, $\Sigma$, encodes how all possible pairs of dimensions in an n-dimensional dataset vary together:

$$\Sigma = \begin{bmatrix} \sigma(x_1,x_1) & \sigma(x_1,x_2) & \cdots & \sigma(x_1,x_n) \\ \sigma(x_2,x_1) & \sigma(x_2,x_2) & \cdots & \sigma(x_2,x_n) \\ \vdots & \vdots & \ddots & \vdots \\ \sigma(x_n,x_1) & \sigma(x_n,x_2) & \cdots & \sigma(x_n,x_n) \end{bmatrix}$$

where $x_i$ refers to the i-th element of all the vectors in the feature space.

9 Covariance Matrix The covariance matrix is a square symmetric matrix.

10 Demo: 2D covariance

11 Mean Centring Mean Centring is the process of computing the mean (across each dimension independently) of a set of vectors, and then subtracting the mean vector from every vector in the set. All the vectors will be translated so their average position is the origin

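12 Covariance matrix again Each row of $Z$ is a mean-centred feature vector:

$$Z = \begin{bmatrix} v_{11} & v_{12} & v_{13} \\ v_{21} & v_{22} & v_{23} \\ v_{31} & v_{32} & v_{33} \\ v_{41} & v_{42} & v_{43} \end{bmatrix}$$

Then $\Sigma \propto Z^T Z$.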
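Mean centring and the relation between the covariance matrix and Z can be sketched as follows. This assumes NumPy, made-up data values, and a 1/n scaling factor (the slide only states proportionality); `np.cov` with `bias=True` is used purely as a cross-check.

```python
import numpy as np

# Made-up data: 4 feature vectors (rows) in a 3-dimensional space.
X = np.array([[2.0, 0.0, 1.0],
              [4.0, 1.0, 3.0],
              [6.0, 1.0, 2.0],
              [8.0, 2.0, 6.0]])

# Mean centring: subtract the per-dimension mean vector from every row.
Z = X - X.mean(axis=0)

# Z^T Z scaled by 1/n (an assumed choice of constant) gives the covariance matrix.
cov = Z.T @ Z / Z.shape[0]

print(cov)
print("symmetric:", np.allclose(cov, cov.T))
print("matches np.cov:", np.allclose(cov, np.cov(X, rowvar=False, bias=True)))
```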

13 Principal axes of variation

14 Basis A basis is a set of n linearly independent vectors in an n-dimensional space. In an orthogonal basis, such as the ones considered here, the vectors are mutually perpendicular. They form a coordinate system. There are an infinite number of possible bases.

15 The first principal axis For a given set of n-dimensional data, the first principal axis (or just principal axis) is the vector that describes the direction of greatest variance.

16 The second principal axis The second principal axis is a vector in the direction of greatest variance orthogonal (perpendicular) to the first principal axis.

17 The third principal axis In a space with 3 or more dimensions, the third principal axis is the direction of greatest variance orthogonal to both the first and second principal axes. The fourth and so on follow the same pattern. The set of n principal axes of an n-dimensional space is a basis.

18 Eigenvectors, Eigenvalues and Eigendecomposition

19 Eigenvectors and Eigenvalues Very important equation! Av=λv

20 Eigenvectors and Eigenvalues In Av=λv: A is an n×n square matrix; λ is a scalar value, known as an eigenvalue; v is an n-dimensional vector, known as an eigenvector.

21 Eigenvectors and Eigenvalues Av=λv There are at most n eigenvector-eigenvalue pairs. If A is symmetric, then the set of eigenvectors is orthogonal. Can you see where this is going?

22 Eigenvectors and Eigenvalues Av=λv If A is a covariance matrix, then the eigenvectors are the principal axes. The eigenvalues are proportional to the variance of the data along each eigenvector. The eigenvector corresponding to the largest eigenvalue is the first P.C.
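As an illustration of the eigenvector equation applied to a covariance-like matrix, here is a small sketch assuming NumPy. The 2×2 matrix holds made-up values, and `np.linalg.eigh` is NumPy's solver for symmetric matrices (it returns eigenvalues in ascending order).

```python
import numpy as np

# A small symmetric matrix standing in for a 2D covariance matrix (made-up values).
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# eigh is intended for symmetric matrices; the eigenvectors are the columns.
eigenvalues, eigenvectors = np.linalg.eigh(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Each pair satisfies Av = λv up to floating-point error.
    assert np.allclose(A @ v, lam * v)

# The eigenvector with the largest eigenvalue is the first principal axis.
first_axis = eigenvectors[:, np.argmax(eigenvalues)]

print("eigenvalues:", eigenvalues)
print("first principal axis:", first_axis)
```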

23 Finding the EVecs and EVals For small matrices (n ≤ 4) there are algebraic solutions to finding all the eigenvector-eigenvalue pairs. For larger matrices, numerical solutions to the Eigendecomposition must be sought.

24 Eigendecomposition (aka EVD) $A = Q \Lambda Q^{-1}$, where the columns of $Q$ are the eigenvectors and $\Lambda$ is the diagonal eigenvalue matrix ($\Lambda_{ii} = \lambda_i$).

25 Eigendecomposition $A = Q \Lambda Q^{-1}$. If A is real symmetric (e.g. a covariance matrix), then $Q^{-1} = Q^T$ (i.e. the eigenvectors are orthogonal), so: $A = Q \Lambda Q^T$.

26 Eigendecomposition In summary, the Eigendecomposition of a covariance matrix A, $A = Q \Lambda Q^T$, gives you the principal axes and their relative magnitudes.

27 Ordering Many Eigendecomposition implementations order the eigenvectors (columns of Q) such that the eigenvalues (in the diagonal of Λ) are sorted in order of decreasing value, but the convention varies between libraries, so check before relying on it. Some solvers are optimised to only find the top k eigenvalues and corresponding eigenvectors, rather than all of them.
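A sketch of the decomposition and the ordering step, assuming NumPy and a made-up symmetric matrix. `np.linalg.eigh` returns eigenvalues in ascending order, so this example reorders them explicitly to get the decreasing convention used in these slides.

```python
import numpy as np

# Made-up real symmetric matrix.
A = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigenvalues, Q = np.linalg.eigh(A)        # ascending eigenvalues, eigenvectors in columns

# Reorder so that the eigenvalues decrease, keeping columns of Q aligned with them.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, Q = eigenvalues[order], Q[:, order]

Lambda = np.diag(eigenvalues)             # diagonal eigenvalue matrix
reconstructed = Q @ Lambda @ Q.T          # Q^-1 = Q^T because A is real symmetric

print("eigenvalues (decreasing):", eigenvalues)
print("A recovered:", np.allclose(reconstructed, A))
```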

28 Demo: Covariance, Eigendecomposition and principal axes

29 Principal Component Analysis

30 Linear Transform A linear transform W projects data from one space into another: $T = ZW$. The original data is stored in the rows of Z. T can have fewer dimensions than Z.

31 Linear Transforms The effects of a linear transform can be reversed if W is invertible: $Z = TW^{-1}$. This is a lossy process if the dimensionality of the spaces is different.
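The difference between a reversible and a lossy transform can be seen in a small sketch, assuming NumPy. The data values and the choice of a 2D rotation for W are illustrative assumptions; the back-projection with the transpose is valid here only because the kept column of W has unit length.

```python
import numpy as np

# Made-up data: three 2D vectors stored in the rows of Z.
Z = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# An invertible (here orthogonal) transform: rotation by 30 degrees.
theta = np.pi / 6
W = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

T = Z @ W                          # forward transform
Z_back = T @ np.linalg.inv(W)      # exact reversal, since W is invertible

# Keeping only the first column of W maps the data to 1 dimension.
W_reduced = W[:, :1]
T_reduced = Z @ W_reduced          # shape (3, 1)
Z_approx = T_reduced @ W_reduced.T # best effort back-projection; information is lost

print("full transform reversed exactly:", np.allclose(Z_back, Z))
print("reduced transform reversed exactly:", np.allclose(Z_approx, Z))
```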

32 PCA PCA is an Orthogonal Linear Transform that maps data from its original space to a space defined by the principal axes of the data. The transform matrix W is just the eigenvector matrix Q from the Eigendecomposition of the covariance matrix of the data. Dimensionality reduction can be achieved by removing the eigenvectors with low eigenvalues from Q (i.e. keeping the first L columns of Q assuming the eigenvectors are sorted by decreasing eigenvalue).

33 PCA Algorithm

1. Mean-centre the data vectors.
2. Form the vectors into a matrix Z, such that each row corresponds to a vector.
3. Perform the Eigendecomposition of the matrix $Z^T Z$, to recover the eigenvector matrix Q and diagonal eigenvalue matrix Λ: $Z^T Z = Q \Lambda Q^T$.
4. Sort the columns of Q and corresponding diagonal values of Λ so that the eigenvalues are decreasing.
5. Select the L eigenvectors of Q with the largest eigenvalues (the first L columns) to create the transform matrix $Q_L$.
6. Project the original vectors into a lower dimensional space, $T_L$: $T_L = Z Q_L$.
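The six steps above can be sketched in a few lines, assuming NumPy. The function name `pca_evd` and the synthetic data are hypothetical choices for illustration, not part of the lecture.

```python
import numpy as np

def pca_evd(X, L):
    """PCA via eigendecomposition of Z^T Z; rows of X are the data vectors."""
    Z = X - X.mean(axis=0)                       # steps 1-2: mean-centre, one vector per row
    eigenvalues, Q = np.linalg.eigh(Z.T @ Z)     # step 3: EVD of the symmetric matrix Z^T Z
    order = np.argsort(eigenvalues)[::-1]        # step 4: sort by decreasing eigenvalue
    eigenvalues, Q = eigenvalues[order], Q[:, order]
    Q_L = Q[:, :L]                               # step 5: keep the first L eigenvectors
    T_L = Z @ Q_L                                # step 6: project to L dimensions
    return T_L, Q_L, eigenvalues

# Synthetic data: 200 points in 3D with very different spread per dimension.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])

T_L, Q_L, eigenvalues = pca_evd(X, L=2)
print("projected shape:", T_L.shape)
print("eigenvalues:", eigenvalues)
```

The sum of squares of each projected column equals the corresponding eigenvalue, which is one way to see that the eigenvalues are proportional to the variance along each axis.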

34 Demo: PCA

35 Singular Value Decomposition

36 Singular Value Decomposition For complex-valued M: $M = U \Sigma V^*$, where $V^*$ is the conjugate transpose of V. For real-valued M: $M = U \Sigma V^T$. Note that this is a different Σ to the one used earlier.

37 (Real) Singular Value Decomposition $M = U \Sigma V^T$, where U is an orthogonal matrix ($U^T U = U U^T = I$), V is an orthogonal matrix ($V^T V = V V^T = I$), and Σ is a real diagonal matrix.

38 Singular Value Decomposition $M = U \Sigma V^T$, with M of size m×n, U of size m×ρ, Σ of size ρ×ρ and $V^T$ of size ρ×n, where ρ is the rank of M (i.e. the number of linearly independent rows or columns).

39 Singular Value Decomposition The diagonal of Σ holds the singular values (usually organised/computed so they monotonically decrease).

40 Singular Value Decomposition Columns of U are called the left singular vectors. The left singular vectors are the eigenvectors of $M M^T$.

41 Singular Value Decomposition Rows of $V^T$ (columns of V) are called the right singular vectors. The right singular vectors are the eigenvectors of $M^T M$.

42 Singular Value Decomposition The singular values are the square roots of the eigenvalues of the corresponding eigenvectors of $M^T M$ and $M M^T$.
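The relationships between the SVD of M and the eigendecompositions of $M^T M$ and $M M^T$ can be verified numerically. This is a sketch assuming NumPy and a random full-rank matrix; `full_matrices=False` requests the reduced form of the factorisation.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(6, 4))                       # made-up real matrix

# Reduced SVD: U is 6x4, s holds the 4 singular values (decreasing), Vt is 4x4.
U, s, Vt = np.linalg.svd(M, full_matrices=False)
V = Vt.T

# Eigenvalues of M^T M, flipped into decreasing order to line up with s.
evals = np.linalg.eigvalsh(M.T @ M)[::-1]

print("s^2 equals eigenvalues of M^T M:", np.allclose(s**2, evals))
# Columns of V are eigenvectors of M^T M; columns of U are eigenvectors of M M^T.
print("right singular vectors check:", np.allclose(M.T @ M @ V, V * s**2))
print("left singular vectors check:", np.allclose(M @ M.T @ U, U * s**2))
```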

43 Singular Value Decomposition If Z is the matrix of mean-centred feature vectors that we introduced earlier, then the columns of V are the eigenvectors of $Z^T Z$; that is, they are the principal components of Z!

44 SVD-based PCA Algorithm

1. Mean-centre the data vectors.
2. Form the vectors into a matrix Z, such that each row corresponds to a vector.
3. Perform the SVD of the matrix Z, to recover the right singular vector matrix V and diagonal singular value matrix Σ: $Z = U \Sigma V^T$.
4. Sort the columns of V and corresponding diagonal values of Σ so that the singular values are decreasing.
5. Select the first L right singular vectors from V (the first L columns) to create the transform matrix $Q_L$.
6. Project the original vectors into a lower dimensional space, $T_L$: $T_L = Z Q_L$.
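The SVD-based steps above might look like this in practice. The sketch assumes NumPy, whose `np.linalg.svd` already returns singular values in decreasing order, so the sorting step needs no extra code here; the function name `pca_svd` and the synthetic data are hypothetical.

```python
import numpy as np

def pca_svd(X, L):
    """PCA via SVD of the mean-centred data matrix; rows of X are the data vectors."""
    Z = X - X.mean(axis=0)                             # steps 1-2
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)   # steps 3-4: s is already decreasing
    Q_L = Vt[:L].T                                     # step 5: first L right singular vectors
    T_L = Z @ Q_L                                      # step 6: project to L dimensions
    return T_L, Q_L, s

# Synthetic data: 200 points in 3D with very different spread per dimension.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])

T_L, Q_L, s = pca_svd(X, L=2)
print("projected shape:", T_L.shape)
print("singular values:", s)
```

The covariance matrix is never formed explicitly, which is the point of the numerical stability argument for this variant.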

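45 Why SVD-based PCA Numerical stability: computing the covariance matrix might be disastrous, e.g. consider the Läuchli matrix:

$$X = \begin{bmatrix} 1 & 1 & 1 \\ \epsilon & 0 & 0 \\ 0 & \epsilon & 0 \\ 0 & 0 & \epsilon \end{bmatrix}$$

When $\epsilon^2$ falls below machine precision, the computed $X^T X$ is singular even though X has full rank. Performance: SVD can be faster than computing the covariance followed by EVD.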
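The stability problem can be reproduced in double precision. This sketch assumes NumPy, the standard 4×3 form of the Läuchli matrix, and an illustrative value of ε whose square is smaller than the floating-point resolution near 1.

```python
import numpy as np

eps = 1e-9   # eps**2 = 1e-18 cannot be represented next to 1.0 in double precision

# Standard form of the Läuchli matrix (assumed layout: a row of ones above eps * I).
X = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

# Forming X^T X loses eps entirely: every entry rounds to exactly 1.0.
gram = X.T @ X
rank_gram = np.linalg.matrix_rank(gram)

# Working on X directly keeps the small singular values.
singular_values = np.linalg.svd(X, compute_uv=False)

print("rank of X:", np.linalg.matrix_rank(X))
print("rank of computed X^T X:", rank_gram)
print("singular values of X:", singular_values)
```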

46 Truncated Singular Value Decomposition $M \approx U_r \Sigma_r V_r^T$, with $U_r$ of size m×r, $\Sigma_r$ of size r×r and $V_r^T$ of size r×n. Truncated SVD considers only the largest r singular values (and corresponding left & right vectors).

47 Low Rank Approximation It can be shown that $\tilde{M} = U_r \Sigma_r V_r^T$ is the best possible rank-r approximation to M in the sense that it minimises the Frobenius norm $\|\tilde{M} - M\|_F$. (The Frobenius norm of A is just the square root of the sum of the squares of all the elements of A.)
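A short sketch of the rank-r approximation, assuming NumPy and a random matrix. It also checks a standard consequence of this result: the Frobenius error equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(8, 5))                        # made-up matrix

U, s, Vt = np.linalg.svd(M, full_matrices=False)

r = 2
# Keep only the r largest singular values and the matching singular vectors.
M_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]

error = np.linalg.norm(M - M_r, ord="fro")         # Frobenius norm of the residual
expected = np.sqrt(np.sum(s[r:] ** 2))             # from the discarded singular values

print("rank of approximation:", np.linalg.matrix_rank(M_r))
print("Frobenius error:", error)
print("matches discarded singular values:", np.isclose(error, expected))
```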

48 What else is SVD good for? Pseudoinverse: $A^+ = V \Sigma^{-1} U^T$; use it to solve $Ax = b$ for x where $\|Ax - b\|_2$ is minimised: $x = A^+ b$ (i.e. solve least squares regression!). Solving homogeneous linear equations $Ax = 0$. Data mining, e.g. model-based CF recommender systems and latent factors. We'll see lots more in coming lectures.
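The pseudoinverse route to least squares can be sketched as follows, assuming NumPy. The straight-line fitting problem and its data points are made up for illustration, and the formula is applied as written only because this A has full column rank (no zero singular values).

```python
import numpy as np

# Made-up over-determined system: fit b ≈ c0 + c1 * t to four noisy points.
t = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.1, 2.9, 5.2, 6.8])
A = np.column_stack([np.ones_like(t), t])          # 4 equations, 2 unknowns

U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_pinv = Vt.T @ np.diag(1.0 / s) @ U.T             # A+ = V Σ^-1 U^T

x = A_pinv @ b                                     # minimises ||Ax - b||_2

print("intercept and slope:", x)
print("matches np.linalg.pinv:", np.allclose(A_pinv, np.linalg.pinv(A)))
```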

49 Computation of EVD and SVD

50 Classical EVD/SVD algorithms I All general E.V. algorithms are iterative. Power Iteration: an iterative process to find the biggest E.V.:

$$b_{k+1} = \frac{A b_k}{\|A b_k\|}$$

Repeated for each E.V. by rotating and truncating the input to remove the effect of the previous E.V. Arnoldi Iteration and the Lanczos Algorithm: more efficient variants of Power Iteration that use the Gram-Schmidt process to find the orthonormal basis of the top-r E.V.s.

51 Classical EVD/SVD algorithms II Those algorithms are good for finding the largest r E.V.s or S.V.s. Lots of standard efficient implementations exist in LAPACK/ARPACK, including for relatively large, sparse matrices, but not really massive matrices. Variations exist for finding the smallest non-zero E.V.s.

52 Summary Covariance measures the shape of data by measuring how different dimensions change together. The principal axes are a basis, aligned such that they describe the directions of greatest variance. The Eigendecomposition of the covariance matrix produces pairs of eigenvectors (corresponding to the principal axes) and eigenvalues (proportional to the variance along the respective axis). PCA aligns data with its principal axes, and allows dimensionality reduction by discounting axes with low variance. SVD is a general matrix factorisation tool with lots of uses.
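Power iteration for the largest eigenvalue is short enough to sketch directly. This assumes NumPy, a made-up symmetric matrix, a fixed iteration count rather than a convergence test, and a starting vector that is not orthogonal to the dominant eigenvector; the function name `power_iteration` is hypothetical.

```python
import numpy as np

def power_iteration(A, iterations=100):
    """Estimate the dominant eigenvalue and eigenvector of A by repeated multiplication."""
    b = np.ones(A.shape[0])              # arbitrary starting vector
    for _ in range(iterations):
        Ab = A @ b
        b = Ab / np.linalg.norm(Ab)      # b_{k+1} = A b_k / ||A b_k||
    eigenvalue = b @ A @ b               # Rayleigh quotient; b has unit length
    return eigenvalue, b

# Made-up symmetric matrix.
A = np.array([[4.0, 2.0],
              [2.0, 3.0]])

lam, v = power_iteration(A)
print("estimated largest eigenvalue:", lam)
print("estimated eigenvector:", v)
```

Convergence speed depends on the ratio between the two largest eigenvalue magnitudes, so a fixed iteration count is only reasonable for small, well-separated examples like this one.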


Section 3.2 Theorem 3.6. Let A be an m n matrix of rank r. Then r m, r n, and, by means of a finite number of elementary row and column operations, A can be transformed into the matrix ( ) Ir O D = 1 O

### 18.06SC Final Exam Solutions

18.06SC Final Exam Solutions 1 (4+7=11 pts.) Suppose A is 3 by 4, and Ax = 0 has exactly 2 special solutions: 1 2 x 1 = 1 and x 2 = 1 1 0 0 1 (a) Remembering that A is 3 by 4, find its row reduced echelon

### Maths for Signals and Systems Linear Algebra for Engineering Applications

Maths for Signals and Systems Linear Algebra for Engineering Applications Lectures 1-2, Tuesday 11 th October 2016 DR TANIA STATHAKI READER (ASSOCIATE PROFFESOR) IN SIGNAL PROCESSING IMPERIAL COLLEGE LONDON

### The QR Decomposition

The QR Decomposition We have seen one major decomposition of a matrix which is A = LU (and its variants) or more generally PA = LU for a permutation matrix P. This was valid for a square matrix and aided

### MATH 304 Linear Algebra Lecture 34: Review for Test 2.

MATH 304 Linear Algebra Lecture 34: Review for Test 2. Topics for Test 2 Linear transformations (Leon 4.1 4.3) Matrix transformations Matrix of a linear mapping Similar matrices Orthogonality (Leon 5.1

### PRINCIPAL COMPONENT ANALYSIS

PRINCIPAL COMPONENT ANALYSIS 1 INTRODUCTION One of the main problems inherent in statistics with more than two variables is the issue of visualising or interpreting data. Fortunately, quite often the problem

### BASIC ALGORITHMS IN LINEAR ALGEBRA. Matrices and Applications of Gaussian Elimination. A 2 x. A T m x. A 1 x A T 1. A m x

BASIC ALGORITHMS IN LINEAR ALGEBRA STEVEN DALE CUTKOSKY Matrices and Applications of Gaussian Elimination Systems of Equations Suppose that A is an n n matrix with coefficents in a field F, and x = (x,,

### MA 575 Linear Models: Cedric E. Ginestet, Boston University Regularization: Ridge Regression and Lasso Week 14, Lecture 2

MA 575 Linear Models: Cedric E. Ginestet, Boston University Regularization: Ridge Regression and Lasso Week 14, Lecture 2 1 Ridge Regression Ridge regression and the Lasso are two forms of regularized

### Linear algebra II Homework #1 solutions A = This means that every eigenvector with eigenvalue λ = 1 must have the form

Linear algebra II Homework # solutions. Find the eigenvalues and the eigenvectors of the matrix 4 6 A =. 5 Since tra = 9 and deta = = 8, the characteristic polynomial is f(λ) = λ (tra)λ+deta = λ 9λ+8 =

### Math Linear Algebra Final Exam Review Sheet

Math 15-1 Linear Algebra Final Exam Review Sheet Vector Operations Vector addition is a component-wise operation. Two vectors v and w may be added together as long as they contain the same number n of

### Solutions for Econometrics I Homework No.1

Solutions for Econometrics I Homework No.1 due 2006-02-20 Feldkircher, Forstner, Ghoddusi, Grafenhofer, Pichler, Reiss, Yan, Zeugner Exercise 1.1 Structural form of the problem: 1. q d t = α 0 + α 1 p

### Linear Algebra Final Exam Study Guide Solutions Fall 2012

. Let A = Given that v = 7 7 67 5 75 78 Linear Algebra Final Exam Study Guide Solutions Fall 5 explain why it is not possible to diagonalize A. is an eigenvector for A and λ = is an eigenvalue for A diagonalize

### Orthogonality. Orthonormal Bases, Orthogonal Matrices. Orthogonality

Orthonormal Bases, Orthogonal Matrices The Major Ideas from Last Lecture Vector Span Subspace Basis Vectors Coordinates in different bases Matrix Factorization (Basics) The Major Ideas from Last Lecture

### Numerical Methods I Solving Square Linear Systems: GEM and LU factorization

Numerical Methods I Solving Square Linear Systems: GEM and LU factorization Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 September 18th,

### Elementary Linear Algebra Review for Exam 2 Exam is Monday, November 16th.

Elementary Linear Algebra Review for Exam Exam is Monday, November 6th. The exam will cover sections:.4,..4, 5. 5., 7., the class notes on Markov Models. You must be able to do each of the following. Section.4