Principal Component Analysis

Yuanzhen Shao, MA

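Data as points in $\mathbb{R}^n$

Assume that we have a collection of data in $\mathbb{R}^n$:

$$S = \left\{ X_1 = \begin{pmatrix} x_{11} \\ x_{12} \\ \vdots \\ x_{1n} \end{pmatrix},\; X_2 = \begin{pmatrix} x_{21} \\ x_{22} \\ \vdots \\ x_{2n} \end{pmatrix},\; \dots,\; X_m = \begin{pmatrix} x_{m1} \\ x_{m2} \\ \vdots \\ x_{mn} \end{pmatrix} \right\}$$

Figure: A collection of data in $\mathbb{R}^n$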

Data as points in $\mathbb{R}^n$

Without loss of generality, we may assume that the mean of this collection of data is zero:

$$\bar{X} = \frac{1}{m} \sum_{i=1}^{m} X_i = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.$$

Otherwise, we just translate the data to have zero mean by looking at the collection

$$\{ X_1 - \bar{X},\; X_2 - \bar{X},\; \dots,\; X_m - \bar{X} \}.$$
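The translation to zero mean can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the original slides; the helper name `center_data` and the toy data are assumptions made for the example. Each data point $X_i$ is stored as a row of the array.

```python
import numpy as np

def center_data(A):
    """Translate the rows of A (one data point X_i per row) to have zero mean."""
    A = np.asarray(A, dtype=float)
    mean = A.mean(axis=0)      # the mean vector, one entry per coordinate
    return A - mean, mean      # broadcasting subtracts the mean from every row

# Hypothetical toy data: m = 4 points in R^2
A = np.array([[2.0, 1.0], [4.0, 3.0], [6.0, 5.0], [8.0, 7.0]])
A_centered, mean = center_data(A)
```

After this step every column of `A_centered` sums to zero, which is the assumption used on the following slides.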

Variance of Data

Question 1: How can we determine a subspace that $S$ is close to? Some examples: (figures not reproduced)

Question 2: How can we determine the direction representing the largest variance of $S$?

Recall that if $v \in \mathbb{R}^n$ is a unit vector, then the orthogonal projection of $X_i$ onto the direction given by $v$, i.e. onto $V = \mathrm{span}\{v\}$, is

$$\mathrm{Proj}_V X_i = (X_i, v)\, v = c_i v,$$

that is, $c_i = (X_i, v)$ is the coordinate of $X_i$ in the direction of $v$.

Figure: Orthogonal projection
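As a small numerical illustration of the projection formula, the sketch below computes the coordinate $c_i$ and the projected vector for one point. The function name `project_onto_direction` and the sample vectors are hypothetical; the function normalizes `v` first so that it also accepts non-unit input.

```python
import numpy as np

def project_onto_direction(X, v):
    """Return the coordinate c = (X, v) and the projection c * v onto span{v}."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)      # make sure v is a unit vector
    c = float(np.dot(X, v))        # coordinate of X in the direction of v
    return c, c * v

X = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])
c, proj = project_onto_direction(X, v)
residual = X - proj                # the part of X orthogonal to v
```

The residual `X - proj` is orthogonal to `v`, which is what makes the projection orthogonal.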

Variance of Data

Answer: since the data have zero mean, the direction representing the largest variance of $S$ is given by the unit vector $v$ that maximizes

$$\frac{1}{m} \sum_{i=1}^{m} c_i^2 = \frac{1}{m} \sum_{i=1}^{m} (X_i, v)^2$$

among all unit vectors $v$, or, equivalently, that maximizes

$$\sum_{i=1}^{m} c_i^2 = \sum_{i=1}^{m} (X_i, v)^2$$

among all unit vectors $v$.
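To make the optimization concrete, here is a brute-force sketch in $\mathbb{R}^2$ that scans unit vectors $v = (\cos t, \sin t)$ and keeps the one with the largest sum of squared coordinates. The data set and the helper `sum_sq_coords` are assumptions chosen for illustration; a grid search like this is only practical in very low dimensions.

```python
import numpy as np

# Hypothetical zero-mean data in R^2, spread mostly along the line y = x
A = np.array([[2.0, 2.0], [-2.0, -2.0], [1.0, -1.0], [-1.0, 1.0]])

def sum_sq_coords(A, v):
    """Sum of c_i^2, where c_i = (X_i, v) and the X_i are the rows of A."""
    c = A @ v
    return float(np.sum(c ** 2))

# Scan directions in 1-degree steps; v and -v give the same value,
# so angles in [0, pi] cover every direction.
angles = np.linspace(0.0, np.pi, 181)
values = [sum_sq_coords(A, np.array([np.cos(t), np.sin(t)])) for t in angles]

best_angle = angles[int(np.argmax(values))]
best_v = np.array([np.cos(best_angle), np.sin(best_angle)])
best_value = max(values)
```

For this data the maximum is attained at 45 degrees, i.e. along the line $y = x$, as one would expect from how the points are spread.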

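Optimization Problem

Question 3: How do we find the $v$ that solves this optimization problem? Let

$$A_{m \times n} = \begin{pmatrix} X_1^T \\ X_2^T \\ \vdots \\ X_m^T \end{pmatrix} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}.$$

Recall that $(X_i, v) = X_i^T v$. So

$$Av = \begin{pmatrix} X_1^T v \\ X_2^T v \\ \vdots \\ X_m^T v \end{pmatrix} = \begin{pmatrix} (X_1, v) \\ (X_2, v) \\ \vdots \\ (X_m, v) \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{pmatrix}.$$

Therefore,

$$\sum_{i=1}^{m} c_i^2 = (Av, Av) = v^T A^T A v = v^T C v, \qquad \text{where } C = A^T A.$$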
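The identity $\sum_i c_i^2 = v^T C v$ is easy to check numerically. The following sketch uses randomly generated data (an assumption made purely for illustration) and compares both sides for one random unit vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: m = 50 points in R^3, one point per row, translated to zero mean
A = rng.normal(size=(50, 3))
A = A - A.mean(axis=0)

# A random unit vector v
v = rng.normal(size=3)
v = v / np.linalg.norm(v)

c = A @ v                    # coordinates c_i = (X_i, v)
C = A.T @ A                  # the n x n matrix C = A^T A

lhs = float(np.sum(c ** 2))  # sum of c_i^2
rhs = float(v @ C @ v)       # v^T C v
```

The two values agree up to floating-point rounding, and `C` equals its own transpose.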

Optimization Problem

$C$ is an $n \times n$ symmetric matrix (consider why), and thus is diagonalizable.

Assume that $C$ has eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. Moreover, $C$ has an orthonormal basis of eigenvectors

$$v_1, v_2, \dots, v_n \quad \text{such that} \quad C v_i = \lambda_i v_i.$$

In particular,

$$v_i^T C v_i = \lambda_i v_i^T v_i = \lambda_i \|v_i\|^2 = \lambda_i.$$

Claim: $v_1$ represents the direction of the largest variance of $S$.
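In practice the eigenvalues and an orthonormal basis of eigenvectors of a symmetric matrix can be computed with `numpy.linalg.eigh`, which returns the eigenvalues in ascending order. The sketch below reorders them so that $\lambda_1$ is the largest, matching the convention above; the wrapper name `sorted_eigh` and the sample matrix are assumptions for the example.

```python
import numpy as np

def sorted_eigh(C):
    """Eigen-decomposition of a symmetric matrix with eigenvalues in descending order."""
    eigvals, eigvecs = np.linalg.eigh(C)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # indices from largest to smallest
    return eigvals[order], eigvecs[:, order]

# Hypothetical symmetric matrix playing the role of C = A^T A
C = np.array([[10.0, 6.0], [6.0, 10.0]])
lams, V = sorted_eigh(C)

v1 = V[:, 0]                               # eigenvector for the largest eigenvalue
quad_form = float(v1 @ C @ v1)             # v_1^T C v_1, which should equal lambda_1
```

The columns of `V` are orthonormal, so `V.T @ V` is the identity matrix. The sign of each eigenvector is arbitrary: `v1` and `-v1` describe the same direction.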

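Optimization Problem

Proof. If $u = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$ is a unit vector in $\mathbb{R}^n$, i.e.

$$1 = \|u\|^2 = a_1^2 + a_2^2 + \cdots + a_n^2 \quad \text{(consider why?)},$$

then

$$u^T C u = \underbrace{(a_1 v_1 + a_2 v_2 + \cdots + a_n v_n)^T}_{u^T}\; \underbrace{(\lambda_1 a_1 v_1 + \lambda_2 a_2 v_2 + \cdots + \lambda_n a_n v_n)}_{Cu}$$

$$= \lambda_1 a_1^2 + \lambda_2 a_2^2 + \cdots + \lambda_n a_n^2 \;\leq\; \lambda_1 a_1^2 + \lambda_1 a_2^2 + \cdots + \lambda_1 a_n^2 = \lambda_1 \underbrace{(a_1^2 + a_2^2 + \cdots + a_n^2)}_{=1} = \lambda_1.$$

Since $v_1^T C v_1 = \lambda_1$, the bound is attained at $u = v_1$.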
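The bound $u^T C u \leq \lambda_1$ can also be checked empirically. The sketch below draws many random unit vectors and compares their quadratic forms with the largest eigenvalue; the data and sample sizes are assumptions for illustration, and a numerical check of this kind is no substitute for the proof.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical zero-mean data: m = 40 points in R^5
A = rng.normal(size=(40, 5))
A = A - A.mean(axis=0)
C = A.T @ A

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
lam_max = np.linalg.eigvalsh(C)[-1]

# 1000 random unit vectors u, one per row
U = rng.normal(size=(1000, 5))
U = U / np.linalg.norm(U, axis=1, keepdims=True)

# quad[i] = u_i^T C u_i for each row u_i of U
quad = np.einsum("ij,jk,ik->i", U, C, U)
```

Every entry of `quad` stays at or below `lam_max`, up to rounding.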

Optimization Problem

Thus, $v_1$ = the direction of the largest variance of $S$.

Similarly, we can show that

$v_2$ = the direction of the largest variance of $S$ in $\mathrm{span}\{v_1\}^{\perp}$,

$v_3$ = the direction of the largest variance of $S$ in $\mathrm{span}\{v_1, v_2\}^{\perp}$,

etc.

In the end, we can just drop the directions corresponding to very small eigenvalues of $C$.
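Dropping the directions with small eigenvalues amounts to keeping the first $k$ eigenvectors and projecting the data onto them. The following is a minimal sketch under stated assumptions: the function name `pca_reduce`, the choice $k = 2$, and the synthetic data (points near a 2-dimensional subspace of $\mathbb{R}^5$) are all made up for the example.

```python
import numpy as np

def pca_reduce(A, k):
    """Keep the k directions with the largest eigenvalues of C = A^T A.

    A holds zero-mean data points as rows. Returns the coordinates in the
    kept directions, the kept directions (as columns), and the reconstruction.
    """
    C = A.T @ A
    eigvals, eigvecs = np.linalg.eigh(C)      # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]     # indices of the k largest
    V_k = eigvecs[:, order]                   # n x k, orthonormal columns
    coords = A @ V_k                          # m x k coordinates
    A_hat = coords @ V_k.T                    # projection back in R^n
    return coords, V_k, A_hat

rng = np.random.default_rng(2)

# Hypothetical data: points near a 2-dimensional subspace of R^5, plus small noise
basis = np.linalg.qr(rng.normal(size=(5, 2)))[0]     # orthonormal 5 x 2 basis
A = rng.normal(size=(100, 2)) @ basis.T + 1e-3 * rng.normal(size=(100, 5))
A = A - A.mean(axis=0)

coords, V_k, A_hat = pca_reduce(A, k=2)
rel_err = np.linalg.norm(A - A_hat) / np.linalg.norm(A)
```

Because the data lie close to a 2-dimensional subspace, the relative reconstruction error `rel_err` is small even though three of the five directions were dropped.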

PCA in digital images: an example by Václav Hlaváč

Let us consider a $321 \times 261$ image. Such an image can be considered as a vector in $\mathbb{R}^n$ with $n = 321 \times 261 = 83781$.

What if we have 32 instances of images?

PCA in digital images: an example by Václav Hlaváč

Using the PCA method, we can determine a four-dimensional subspace $W$ in $\mathbb{R}^n$ such that all 32 images are close to $W$.

We can find four basis vectors for $W$, which can be displayed as images: (figure not reproduced)

We can reconstruct all 32 images by using linear combinations of these four basis images with coefficients $q_1, q_2, q_3, q_4$, e.g. (figure not reproduced) where $q_1 = 0.078$, $q_2 = 0.062$, $q_3 = 0.182$; the value of $q_4$ is missing here.
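A reconstruction of this kind can be sketched on synthetic data. Two points below are assumptions, not taken from the slides. First, the "images" are random vectors built from four hidden patterns rather than real pictures, and $n$ is kept small so the example runs quickly. Second, the basis is computed with a thin singular value decomposition of the data matrix instead of an eigendecomposition of $C = A^T A$. The right singular vectors of $A$ are eigenvectors of $A^T A$, so both routes give the same directions, but the SVD avoids forming an $n \times n$ matrix when $n$ is much larger than the number of images.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k = 32, 2000, 4    # 32 "images"; n is far smaller than a real pixel count

# Synthetic stand-in for image data: each row mixes 4 hidden patterns plus noise
patterns = rng.normal(size=(k, n))
A = rng.normal(size=(m, k)) @ patterns + 0.01 * rng.normal(size=(m, n))

# Translate to zero mean, as assumed throughout
mean_image = A.mean(axis=0)
A0 = A - mean_image

# Thin SVD: the rows of Vt are orthonormal eigenvectors of A0^T A0,
# ordered by decreasing singular value
U, s, Vt = np.linalg.svd(A0, full_matrices=False)

W = Vt[:k]               # k basis "images", each a vector of length n
q = A0 @ W.T             # coefficients q_1, ..., q_k for every image (m x k)
A_rec = mean_image + q @ W

rel_err = np.linalg.norm(A - A_rec) / np.linalg.norm(A0)
```

Each row of `q` plays the role of the coefficients $q_1, \dots, q_4$ for one image, and `rel_err` measures how close the 32 synthetic images are to the four-dimensional subspace.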

Reconstruction fidelity, 4 components

References

Václav Hlaváč, Principal Component Analysis: Application to images.

geometric-interpretation-covariance-matrix/
