Principal Components Analysis

Slide 1: Principal Components Analysis
Nathaniel E. Helwig
Assistant Professor of Psychology and Statistics
University of Minnesota (Twin Cities)
Updated 16-Mar-2017

Slide 2: Copyright
Copyright (c) 2017 by Nathaniel E. Helwig

Slide 3: Outline of Notes
1) Background: Overview; Data, Cov, Cor; Orthogonal Rotation
2) Population PCs: Definition; Calculation; Properties
3) Sample PCs: Definition; Calculation; Properties
4) PCA in Practice: Covariance or Correlation?; Decathlon Example; Number of Components

Slide 4: Background

Slide 5: Definition and Purposes of PCA
Principal Components Analysis (PCA) finds linear combinations of variables that best explain the covariation structure of the variables. There are two typical purposes of PCA:
1. Data reduction: explain covariation between p variables using r < p linear combinations
2. Data interpretation: find features (i.e., components) that are important for explaining covariation

Slide 6: Data Matrix
The data matrix refers to the array of numbers
$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ x_{31} & x_{32} & \cdots & x_{3p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{np} \end{bmatrix}$$
where $x_{ij}$ is the j-th variable collected from the i-th item (e.g., subject).
- items/subjects are rows
- variables are columns
X is a data matrix of order n x p (# items by # variables).

Slide 7: Population Covariance Matrix
The population covariance matrix refers to the symmetric array
$$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \sigma_{13} & \cdots & \sigma_{1p} \\ \sigma_{21} & \sigma_{22} & \sigma_{23} & \cdots & \sigma_{2p} \\ \sigma_{31} & \sigma_{32} & \sigma_{33} & \cdots & \sigma_{3p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{p1} & \sigma_{p2} & \sigma_{p3} & \cdots & \sigma_{pp} \end{bmatrix}$$
where
- $\sigma_{jj} = E([X_j - \mu_j]^2)$ is the population variance of the j-th variable
- $\sigma_{jk} = E([X_j - \mu_j][X_k - \mu_k])$ is the population covariance between the j-th and k-th variables
- $\mu_j = E(X_j)$ is the population mean of the j-th variable

Slide 8: Sample Covariance Matrix
The sample covariance matrix refers to the symmetric array
$$S = \begin{bmatrix} s_1^2 & s_{12} & s_{13} & \cdots & s_{1p} \\ s_{21} & s_2^2 & s_{23} & \cdots & s_{2p} \\ s_{31} & s_{32} & s_3^2 & \cdots & s_{3p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ s_{p1} & s_{p2} & s_{p3} & \cdots & s_p^2 \end{bmatrix}$$
where
- $s_j^2 = \frac{1}{n-1}\sum_{i=1}^n (x_{ij} - \bar{x}_j)^2$ is the sample variance of the j-th variable
- $s_{jk} = \frac{1}{n-1}\sum_{i=1}^n (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)$ is the sample covariance between the j-th and k-th variables
- $\bar{x}_j = (1/n)\sum_{i=1}^n x_{ij}$ is the sample mean of the j-th variable

Slide 9: Covariance Matrix from Data Matrix
We can calculate the (sample) covariance matrix as
$$S = \tfrac{1}{n-1} X_c' X_c$$
where $X_c = X - 1_n \bar{x}' = CX$ with
- $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_p)'$ denoting the vector of variable means
- $C = I_n - n^{-1} 1_n 1_n'$ denoting a centering matrix
Note that the centered matrix $X_c$ has the form
$$X_c = \begin{bmatrix} x_{11}-\bar{x}_1 & x_{12}-\bar{x}_2 & \cdots & x_{1p}-\bar{x}_p \\ x_{21}-\bar{x}_1 & x_{22}-\bar{x}_2 & \cdots & x_{2p}-\bar{x}_p \\ x_{31}-\bar{x}_1 & x_{32}-\bar{x}_2 & \cdots & x_{3p}-\bar{x}_p \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1}-\bar{x}_1 & x_{n2}-\bar{x}_2 & \cdots & x_{np}-\bar{x}_p \end{bmatrix}$$
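
As a quick numerical check (not part of the original slides), the centering-matrix formula can be verified in R; the data below are simulated purely for illustration.

# Sketch: verify S = (1/(n-1)) Xc' Xc using a centering matrix (simulated data)
set.seed(1)
n <- 10; p <- 3
X <- matrix(rnorm(n * p), n, p)                  # arbitrary example data
C <- diag(n) - matrix(1, n, n) / n               # centering matrix C = I_n - (1/n) 1 1'
Xc <- C %*% X                                    # column-centered data matrix
S <- t(Xc) %*% Xc / (n - 1)                      # covariance via centered cross-products
all.equal(S, cov(X), check.attributes = FALSE)   # should be TRUE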

Slide 10: Population Correlation Matrix
The population correlation matrix refers to the symmetric array
$$P = \begin{bmatrix} 1 & \rho_{12} & \rho_{13} & \cdots & \rho_{1p} \\ \rho_{21} & 1 & \rho_{23} & \cdots & \rho_{2p} \\ \rho_{31} & \rho_{32} & 1 & \cdots & \rho_{3p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho_{p1} & \rho_{p2} & \rho_{p3} & \cdots & 1 \end{bmatrix}$$
where $\rho_{jk} = \frac{\sigma_{jk}}{\sqrt{\sigma_{jj}}\sqrt{\sigma_{kk}}}$ is the Pearson correlation coefficient between variables $X_j$ and $X_k$.

Slide 11: Sample Correlation Matrix
The sample correlation matrix refers to the symmetric array
$$R = \begin{bmatrix} 1 & r_{12} & r_{13} & \cdots & r_{1p} \\ r_{21} & 1 & r_{23} & \cdots & r_{2p} \\ r_{31} & r_{32} & 1 & \cdots & r_{3p} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ r_{p1} & r_{p2} & r_{p3} & \cdots & 1 \end{bmatrix}$$
where
$$r_{jk} = \frac{s_{jk}}{s_j s_k} = \frac{\sum_{i=1}^n (x_{ij}-\bar{x}_j)(x_{ik}-\bar{x}_k)}{\sqrt{\sum_{i=1}^n (x_{ij}-\bar{x}_j)^2}\sqrt{\sum_{i=1}^n (x_{ik}-\bar{x}_k)^2}}$$
is the Pearson correlation coefficient between variables $x_j$ and $x_k$.

Slide 12: Correlation Matrix from Data Matrix
We can calculate the (sample) correlation matrix as
$$R = \tfrac{1}{n-1} X_s' X_s$$
where $X_s = CXD^{-1}$ with
- $C = I_n - n^{-1} 1_n 1_n'$ denoting a centering matrix
- $D = \mathrm{diag}(s_1, \ldots, s_p)$ denoting a diagonal scaling matrix
Note that the standardized matrix $X_s$ has the form
$$X_s = \begin{bmatrix} (x_{11}-\bar{x}_1)/s_1 & (x_{12}-\bar{x}_2)/s_2 & \cdots & (x_{1p}-\bar{x}_p)/s_p \\ (x_{21}-\bar{x}_1)/s_1 & (x_{22}-\bar{x}_2)/s_2 & \cdots & (x_{2p}-\bar{x}_p)/s_p \\ (x_{31}-\bar{x}_1)/s_1 & (x_{32}-\bar{x}_2)/s_2 & \cdots & (x_{3p}-\bar{x}_p)/s_p \\ \vdots & \vdots & \ddots & \vdots \\ (x_{n1}-\bar{x}_1)/s_1 & (x_{n2}-\bar{x}_2)/s_2 & \cdots & (x_{np}-\bar{x}_p)/s_p \end{bmatrix}$$
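
The analogous check for the correlation matrix (again an illustrative sketch with simulated data, not from the original slides):

# Sketch: verify R = (1/(n-1)) Xs' Xs using centering plus scaling (simulated data)
set.seed(1)
n <- 10; p <- 3
X <- matrix(rnorm(n * p), n, p)
C <- diag(n) - matrix(1, n, n) / n               # centering matrix
D <- diag(apply(X, 2, sd))                       # diagonal matrix of sample standard deviations
Xs <- C %*% X %*% solve(D)                       # centered and scaled data matrix
R <- t(Xs) %*% Xs / (n - 1)                      # correlation via standardized cross-products
all.equal(R, cor(X), check.attributes = FALSE)   # should be TRUE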

Slide 13: Rotating Points in Two Dimensions
Suppose we have $z = (x, y) \in \mathbb{R}^2$, i.e., points in 2D Euclidean space. A $2 \times 2$ orthogonal rotation of $(x, y)$ of the form
$$\begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix} = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
rotates $(x, y)$ counter-clockwise around the origin by an angle of $\theta$, and
$$\begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix} = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
rotates $(x, y)$ clockwise around the origin by an angle of $\theta$.

Slide 14: Visualization of 2D Clockwise Rotation
[Figure: the example points labeled a-k plotted with no rotation and after clockwise rotations of 30, 45, 60, 90, and 180 degrees]

Slide 15: Visualization of 2D Clockwise Rotation (R Code)

rotmat2d <- function(theta){
  matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
}
x <- seq(-2, 2, length=11)
y <- 4*exp(-x^2) - 2
xy <- cbind(x, y)
rang <- c(30, 45, 60, 90, 180)
dev.new(width=12, height=8, noRStudioGD=TRUE)
par(mfrow=c(2,3))
plot(x, y, xlim=c(-3,3), ylim=c(-3,3), main="No Rotation")
text(x, y, labels=letters[1:11], cex=1.5)
for(j in 1:5){
  rmat <- rotmat2d(rang[j]*2*pi/360)
  xyrot <- xy %*% rmat
  plot(xyrot, xlim=c(-3,3), ylim=c(-3,3))
  text(xyrot, labels=letters[1:11], cex=1.5)
  title(paste(rang[j], " degrees"))
}

Slide 16: Orthogonal Rotation in Two Dimensions
Note that the $2 \times 2$ rotation matrix
$$R = \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix}$$
is an orthogonal matrix for all $\theta$:
$$R'R = \begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix} \begin{pmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{pmatrix} = \begin{pmatrix} \cos^2(\theta)+\sin^2(\theta) & \cos(\theta)\sin(\theta)-\cos(\theta)\sin(\theta) \\ \cos(\theta)\sin(\theta)-\cos(\theta)\sin(\theta) & \cos^2(\theta)+\sin^2(\theta) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Slide 17: Orthogonal Rotation in Higher Dimensions
Suppose we have a data matrix X with p columns.
- Rows of X are coordinates of points in p-dimensional space
- Note: when p = 2 we have the situation on the previous slides
A p x p orthogonal rotation is an orthogonal linear transformation.
- $R'R = RR' = I_p$ where $I_p$ is the p x p identity matrix
- If $\tilde{X} = XR$ is the rotated data matrix, then $\tilde{X}\tilde{X}' = XX'$
- Orthogonal rotation preserves relationships between points
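
As an illustration (not from the original slides), the structure-preserving property of an orthogonal rotation is easy to check in R with simulated 2D data:

# Sketch: an orthogonal rotation leaves the inner-product and distance structure unchanged
set.seed(2)
X <- matrix(rnorm(20), 10, 2)                              # illustrative 10 x 2 data matrix
theta <- pi/6
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)  # 2 x 2 rotation matrix
Xrot <- X %*% R                                            # rotated coordinates
all.equal(X %*% t(X), Xrot %*% t(Xrot))                    # TRUE: X X' is preserved
all.equal(dist(X), dist(Xrot))                             # TRUE: pairwise distances preserved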

Slide 18: Population Principal Components

Slide 19: Linear Combinations of Random Variables
$X = (X_1, \ldots, X_p)'$ is a random vector with covariance matrix $\Sigma$, where $\lambda_1 \geq \cdots \geq \lambda_p \geq 0$ are the eigenvalues of $\Sigma$.
Consider forming new variables $Y_1, \ldots, Y_p$ by taking p different linear combinations of the $X_j$ variables:
$$\begin{aligned} Y_1 &= b_1'X = b_{11}X_1 + b_{21}X_2 + \cdots + b_{p1}X_p \\ Y_2 &= b_2'X = b_{12}X_1 + b_{22}X_2 + \cdots + b_{p2}X_p \\ &\;\;\vdots \\ Y_p &= b_p'X = b_{1p}X_1 + b_{2p}X_2 + \cdots + b_{pp}X_p \end{aligned}$$
where $b_k = (b_{1k}, \ldots, b_{pk})'$ is the k-th linear combination vector.
- $b_k$ are called the loadings for the k-th principal component

Slide 20: Defining Principal Components in the Population
Note that the random variable $Y_k = b_k'X$ has the properties:
- $\mathrm{Var}(Y_k) = b_k'\Sigma b_k$
- $\mathrm{Cov}(Y_k, Y_l) = b_k'\Sigma b_l$
The principal components are the uncorrelated linear combinations $Y_1, \ldots, Y_p$ whose variances are as large as possible:
$$\begin{aligned} b_1 &= \operatorname*{argmax}_{\|b_1\|=1}\{\mathrm{Var}(b_1'X)\} \\ b_2 &= \operatorname*{argmax}_{\|b_2\|=1}\{\mathrm{Var}(b_2'X)\} \text{ subject to } \mathrm{Cov}(b_1'X, b_2'X) = 0 \\ &\;\;\vdots \\ b_l &= \operatorname*{argmax}_{\|b_l\|=1}\{\mathrm{Var}(b_l'X)\} \text{ subject to } \mathrm{Cov}(b_k'X, b_l'X) = 0 \;\; \forall\, k < l \end{aligned}$$

Slide 21: Visualizing Principal Components for Bivariate Normal
[Figure: Figure 8.1 from Applied Multivariate Statistical Analysis, 6th Ed (Johnson & Wichern). Note that $e_1$ and $e_2$ denote the eigenvectors of $\Sigma$.]

Slide 22: PCA Solution via the Eigenvalue Decomposition
We can write the population covariance matrix $\Sigma$ as
$$\Sigma = V\Lambda V' = \sum_{k=1}^p \lambda_k v_k v_k'$$
where
- $V = [v_1, \ldots, v_p]$ contains the eigenvectors ($V'V = VV' = I_p$)
- $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$ contains the (non-negative) eigenvalues
The PCA solution is obtained by setting $b_k = v_k$ for $k = 1, \ldots, p$:
- $\mathrm{Var}(Y_k) = \mathrm{Var}(v_k'X) = v_k'\Sigma v_k = v_k'V\Lambda V'v_k = \lambda_k$
- $\mathrm{Cov}(Y_k, Y_l) = \mathrm{Cov}(v_k'X, v_l'X) = v_k'\Sigma v_l = v_k'V\Lambda V'v_l = 0$ if $k \neq l$
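
A small numerical sketch of the eigendecomposition solution (not from the original slides), using an arbitrary 2 x 2 covariance matrix:

# Sketch: PCA loadings and variances from the eigendecomposition of Sigma
Sigma <- matrix(c(4, 2, 2, 3), 2, 2)        # illustrative covariance matrix
ed <- eigen(Sigma)
V <- ed$vectors                             # columns are eigenvectors v_k (PC loadings)
Lambda <- diag(ed$values)                   # diagonal matrix of eigenvalues lambda_k
all.equal(Sigma, V %*% Lambda %*% t(V))     # TRUE: Sigma = V Lambda V'
round(t(V) %*% Sigma %*% V, 10)             # diagonal lambda_k = Var(Y_k); off-diagonals 0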

Slide 23: Variance Explained by Principal Components
$Y_1, \ldots, Y_p$ have the same total variance as $X_1, \ldots, X_p$:
$$\sum_{j=1}^p \mathrm{Var}(X_j) = \mathrm{tr}(\Sigma) = \mathrm{tr}(V\Lambda V') = \mathrm{tr}(\Lambda) = \sum_{j=1}^p \mathrm{Var}(Y_j)$$
The proportion of the total variance accounted for by the k-th PC is
$$R_k^2 = \frac{\lambda_k}{\sum_{l=1}^p \lambda_l}$$
If $\sum_{k=1}^r R_k^2 \approx 1$ for some $r < p$, we do not lose much transforming the original variables into fewer new (principal component) variables.
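
Continuing the illustrative covariance matrix above (again not from the original slides), the variance-explained proportions are simple to compute:

# Sketch: proportion of total variance explained by each PC
Sigma <- matrix(c(4, 2, 2, 3), 2, 2)        # same illustrative covariance matrix
lambda <- eigen(Sigma)$values
R2 <- lambda / sum(lambda)                  # R2_k = lambda_k / sum_l lambda_l
cbind(PC = seq_along(lambda), R2 = round(R2, 3), cumulative = round(cumsum(R2), 3))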

Slide 24: Covariance of Variables and Principal Components
The covariance between $X_j$ and $Y_k$ has the form
$$\mathrm{Cov}(X_j, Y_k) = \mathrm{Cov}(e_j'X, v_k'X) = e_j'\Sigma v_k = e_j'(V\Lambda V')v_k = e_j'v_k\lambda_k = v_{jk}\lambda_k$$
where
- $e_j$ is a vector of zeros with a one in the j-th position
- $v_k = (v_{1k}, \ldots, v_{pk})'$ is the k-th eigenvector of $\Sigma$
This implies that the correlation between $X_j$ and $Y_k$ has the form
$$\mathrm{Cor}(X_j, Y_k) = \frac{\mathrm{Cov}(X_j, Y_k)}{\sqrt{\mathrm{Var}(X_j)}\sqrt{\mathrm{Var}(Y_k)}} = \frac{v_{jk}\lambda_k}{\sqrt{\sigma_{jj}}\sqrt{\lambda_k}} = \frac{v_{jk}\sqrt{\lambda_k}}{\sqrt{\sigma_{jj}}}$$

Slide 25: Sample Principal Components

Slide 26: Linear Combinations of Observed Random Variables
$x_i = (x_{i1}, \ldots, x_{ip})'$ is an observed random vector with $x_i \overset{iid}{\sim} (\mu, \Sigma)$.
Consider forming new variables $y_{i1}, \ldots, y_{ip}$ by taking p different linear combinations of the $x_{ij}$ variables:
$$\begin{aligned} y_{i1} &= b_1'x_i = b_{11}x_{i1} + b_{21}x_{i2} + \cdots + b_{p1}x_{ip} \\ y_{i2} &= b_2'x_i = b_{12}x_{i1} + b_{22}x_{i2} + \cdots + b_{p2}x_{ip} \\ &\;\;\vdots \\ y_{ip} &= b_p'x_i = b_{1p}x_{i1} + b_{2p}x_{i2} + \cdots + b_{pp}x_{ip} \end{aligned}$$
where $b_k = (b_{1k}, \ldots, b_{pk})'$ is the k-th linear combination vector.

Slide 27: Sample Properties of Linear Combinations
Note that the sample mean and variance of the $y_{ik}$ variables are:
$$\bar{y}_k = \frac{1}{n}\sum_{i=1}^n y_{ik} = b_k'\bar{x}$$
$$s_{y_k}^2 = \frac{1}{n-1}\sum_{i=1}^n (y_{ik} - \bar{y}_k)^2 = \frac{1}{n-1}\sum_{i=1}^n (b_k'x_i - b_k'\bar{x})^2 = b_k'\left[\frac{1}{n-1}\sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})'\right]b_k = b_k'Sb_k$$
and the sample covariance between $y_{ik}$ and $y_{il}$ is given by
$$s_{y_k y_l} = \frac{1}{n-1}\sum_{i=1}^n (y_{ik} - \bar{y}_k)(y_{il} - \bar{y}_l) = b_k'Sb_l$$

Slide 28: Defining Principal Components in the Sample
The principal components are the uncorrelated linear combinations $y_{i1}, \ldots, y_{ip}$ whose sample variances are as large as possible:
$$\begin{aligned} b_1 &= \operatorname*{argmax}_{\|b_1\|=1}\{b_1'Sb_1\} \\ b_2 &= \operatorname*{argmax}_{\|b_2\|=1}\{b_2'Sb_2\} \text{ subject to } b_1'Sb_2 = 0 \\ &\;\;\vdots \\ b_l &= \operatorname*{argmax}_{\|b_l\|=1}\{b_l'Sb_l\} \text{ subject to } b_k'Sb_l = 0 \;\; \forall\, k < l \end{aligned}$$

Slide 29: PCA Solution via the Eigenvalue Decomposition
We can write the sample covariance matrix S as
$$S = \hat{V}\hat{\Lambda}\hat{V}' = \sum_{k=1}^p \hat{\lambda}_k \hat{v}_k \hat{v}_k'$$
where
- $\hat{V} = [\hat{v}_1, \ldots, \hat{v}_p]$ contains the eigenvectors ($\hat{V}'\hat{V} = \hat{V}\hat{V}' = I_p$)
- $\hat{\Lambda} = \mathrm{diag}(\hat{\lambda}_1, \ldots, \hat{\lambda}_p)$ contains the (non-negative) eigenvalues
The PCA solution is obtained by setting $b_k = \hat{v}_k$ for $k = 1, \ldots, p$:
- $s_{y_k}^2 = \hat{v}_k'S\hat{v}_k = \hat{v}_k'\hat{V}\hat{\Lambda}\hat{V}'\hat{v}_k = \hat{\lambda}_k$
- $s_{y_k y_l} = \hat{v}_k'S\hat{v}_l = \hat{v}_k'\hat{V}\hat{\Lambda}\hat{V}'\hat{v}_l = 0$ if $k \neq l$
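
As a sanity check (not from the original slides), the eigendecomposition of the sample covariance matrix can be compared against R's built-in prcomp() on simulated data:

# Sketch: sample PCA via eigen(cov(X)), checked against prcomp()
set.seed(3)
X <- matrix(rnorm(100 * 4), 100, 4)         # illustrative data (n = 100, p = 4)
ed <- eigen(cov(X))
Vhat <- ed$vectors                          # sample eigenvectors (PC loadings)
lamhat <- ed$values                         # sample eigenvalues = PC variances
pca <- prcomp(X)                            # prcomp uses the SVD of the centered data
all.equal(lamhat, pca$sdev^2)               # TRUE: eigenvalues equal the PC variances
max(abs(abs(Vhat) - abs(pca$rotation)))     # ~0: loadings agree up to column signs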

Slide 30: Variance Explained by Principal Components
$\{y_{i1}, \ldots, y_{ip}\}_{i=1}^n$ has the same total variance as $\{x_{i1}, \ldots, x_{ip}\}_{i=1}^n$:
$$\sum_{k=1}^p s_k^2 = \mathrm{tr}(S) = \mathrm{tr}(\hat{V}\hat{\Lambda}\hat{V}') = \mathrm{tr}(\hat{\Lambda}) = \sum_{k=1}^p s_{y_k}^2$$
The proportion of the total variance accounted for by the k-th PC is
$$\hat{R}_k^2 = \frac{\hat{\lambda}_k}{\sum_{l=1}^p \hat{\lambda}_l}$$
If $\sum_{k=1}^r \hat{R}_k^2 \approx 1$ for some $r < p$, we do not lose much transforming the original variables into fewer new (principal component) variables.

Slide 31: Covariance of Variables and Principal Components
The sample covariance between $x_{ij}$ and $y_{ik}$ has the form
$$\widehat{\mathrm{Cov}}(x_{ij}, y_{ik}) = \frac{1}{n-1}\sum_{i=1}^n (x_{ij} - \bar{x}_j)(y_{ik} - \bar{y}_k) = \frac{1}{n-1}\sum_{i=1}^n e_j'(x_i - \bar{x})(x_i - \bar{x})'\hat{v}_k = e_j'S\hat{v}_k = e_j'\hat{v}_k\hat{\lambda}_k = \hat{v}_{jk}\hat{\lambda}_k$$
where
- $e_j$ is a vector of zeros with a one in the j-th position
- $\hat{v}_k = (\hat{v}_{1k}, \ldots, \hat{v}_{pk})'$ is the k-th eigenvector of S
This implies that the (sample) correlation between $x_{ij}$ and $y_{ik}$ is
$$\widehat{\mathrm{Cor}}(x_{ij}, y_{ik}) = \frac{\widehat{\mathrm{Cov}}(x_{ij}, y_{ik})}{\sqrt{\widehat{\mathrm{Var}}(x_{ij})}\sqrt{\widehat{\mathrm{Var}}(y_{ik})}} = \frac{\hat{v}_{jk}\hat{\lambda}_k}{s_j\hat{\lambda}_k^{1/2}} = \frac{\hat{v}_{jk}\hat{\lambda}_k^{1/2}}{s_j}$$
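
The variable-PC correlation formula can be verified numerically (a sketch with simulated data, not part of the original slides):

# Sketch: Cor(x_j, y_k) equals vhat_jk * sqrt(lambdahat_k) / s_j
set.seed(4)
X <- matrix(rnorm(100 * 3), 100, 3)                           # illustrative data
ed <- eigen(cov(X))
Y <- scale(X, center = TRUE, scale = FALSE) %*% ed$vectors    # sample PC scores
j <- 2; k <- 1
cor(X[, j], Y[, k])                                           # empirical correlation
ed$vectors[j, k] * sqrt(ed$values[k]) / sd(X[, j])            # formula-based value (matches)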

Slide 32: Large Sample Properties
Assume that $x_i \overset{iid}{\sim} N(\mu, \Sigma)$ and that the eigenvalues of $\Sigma$ are strictly positive and unique: $\lambda_1 > \cdots > \lambda_p > 0$.
As $n \to \infty$, we have that
$$\sqrt{n}(\hat{\lambda} - \lambda) \to N(0, 2\Lambda^2)$$
$$\sqrt{n}(\hat{v}_k - v_k) \to N(0, V_k) \quad \text{where} \quad V_k = \lambda_k \sum_{l \neq k} \frac{\lambda_l}{(\lambda_l - \lambda_k)^2} v_l v_l'$$
Furthermore, as $n \to \infty$, we have that $\hat{\lambda}_k$ and $\hat{v}_k$ are independent.
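
One way this result can be used (a rough sketch, not from the original slides) is a plug-in Wald-type interval for an eigenvalue, since the asymptotic variance of $\hat{\lambda}_k$ is approximately $2\lambda_k^2/n$; the data below are simulated for illustration.

# Sketch: approximate large-sample 95% interval for lambda_k
set.seed(5)
n <- 200
X <- matrix(rnorm(n * 3), n, 3)             # illustrative (normal) data
lamhat <- eigen(cov(X))$values
k <- 1
se <- lamhat[k] * sqrt(2 / n)               # plug-in standard error from Var ~ 2*lambda_k^2/n
lamhat[k] + c(-1, 1) * qnorm(0.975) * se    # approximate Wald interval for lambda_k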

Slide 33: Principal Components Analysis in Practice

Slide 34: PCA Solution is Related to SVD (Covariance Matrix)
Let $\hat{U}\hat{D}\hat{V}'$ denote the SVD of $X_c = CX$. The PCA (covariance) solution is directly related to the SVD of $X_c$:
$$Y = \hat{U}\hat{D} \qquad \text{and} \qquad B = \hat{V}$$
Note that the columns of $\hat{V}$ are the...
- Right singular vectors of the mean-centered data matrix $X_c$
- Eigenvectors of the covariance matrix $S = \frac{1}{n-1}X_c'X_c = \hat{V}\hat{\Lambda}\hat{V}'$
where $\hat{\Lambda} = \mathrm{diag}(\hat{\lambda}_1, \ldots, \hat{\lambda}_p)$ with $\hat{\lambda}_k = \frac{\hat{d}_{kk}^2}{n-1}$ and $\hat{D} = \mathrm{diag}(\hat{d}_{11}, \ldots, \hat{d}_{pp})$.
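
A brief sketch of the SVD route to the covariance PCA (not from the original slides), using simulated data:

# Sketch: covariance PCA from the SVD of the centered data matrix
set.seed(6)
X <- matrix(rnorm(50 * 4), 50, 4)           # illustrative data
Xc <- scale(X, center = TRUE, scale = FALSE)
sv <- svd(Xc)
Y <- sv$u %*% diag(sv$d)                    # PC scores Y = U D
B <- sv$v                                   # PC loadings B = V (right singular vectors)
lamhat <- sv$d^2 / (nrow(X) - 1)            # eigenvalues of S from the singular values
all.equal(lamhat, eigen(cov(X))$values)     # TRUE
max(abs(abs(Y) - abs(prcomp(X)$x)))         # ~0: scores agree with prcomp up to sign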

Slide 35: PCA Solution is Related to SVD (Correlation Matrix)
Let $\tilde{U}\tilde{D}\tilde{V}'$ denote the SVD of $X_s = CXD^{-1}$ with $D = \mathrm{diag}(s_1, \ldots, s_p)$. The PCA (correlation) solution is directly related to the SVD of $X_s$:
$$Y = \tilde{U}\tilde{D} \qquad \text{and} \qquad B = \tilde{V}$$
Note that the columns of $\tilde{V}$ are the...
- Right singular vectors of the centered and scaled data matrix $X_s$
- Eigenvectors of the correlation matrix $R = \frac{1}{n-1}X_s'X_s = \tilde{V}\tilde{\Lambda}\tilde{V}'$
where $\tilde{\Lambda} = \mathrm{diag}(\tilde{\lambda}_1, \ldots, \tilde{\lambda}_p)$ with $\tilde{\lambda}_k = \frac{\tilde{d}_{kk}^2}{n-1}$ and $\tilde{D} = \mathrm{diag}(\tilde{d}_{11}, \ldots, \tilde{d}_{pp})$.

Slide 36: Covariance versus Correlation Considerations
Problem: there is no simple relationship between the SVDs of $X_c$ and $X_s$.
- No simple relationship between PCs obtained from S and R
- Rescaling variables can fundamentally change our results
Note that PCA is trying to explain the variation in S or R:
- If the units of the p variables are comparable, covariance PCA may be more informative (because units of measurement are retained)
- If the units of the p variables are incomparable, correlation PCA may be more appropriate (because units of measurement are removed)
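
The sensitivity to scaling is easy to demonstrate (a sketch with simulated data, not from the original slides): expressing one variable in different units changes the covariance PCA but leaves the correlation PCA untouched.

# Sketch: rescaling a variable changes covariance PCA but not correlation PCA
set.seed(7)
X <- matrix(rnorm(50 * 3), 50, 3)
X2 <- X
X2[, 1] <- 100 * X[, 1]                     # variable 1 expressed in different units
round(eigen(cov(X))$vectors[, 1], 3)        # leading covariance-PCA loadings (original units)
round(eigen(cov(X2))$vectors[, 1], 3)       # changes substantially after rescaling
all.equal(abs(eigen(cor(X))$vectors), abs(eigen(cor(X2))$vectors))  # TRUE: correlation PCA unchanged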

Slide 37: Men's Olympic Decathlon Data from 1988
Data from the men's 1988 Olympic decathlon:
- Total of n = 34 athletes
- Have p = 10 variables giving the score for each decathlon event
- Have the overall decathlon score also (score)

> decathlon[1:9,]
[output omitted: first 9 rows (Schenk, Voss, Steen, Thompson, Blondel, Plaziat, Bright, De.Wit, Johnson) of the variables run100, long.jump, shot, high.jump, run400, hurdle, discus, pole.vault, javelin, run1500, score]

Slide 38: Resigning Running Events
For the running events (run100, run400, run1500, and hurdle), lower scores correspond to better performance, whereas higher scores represent better performance for the other events. To make interpretation simpler, we will resign (change the sign of) the running events:

> decathlon[,c(1,5,6,10)] <- (-1)*decathlon[,c(1,5,6,10)]
> decathlon[1:9,]
[output omitted: the same 9 rows with the running-event scores now negated]

Slide 39: PCA on Covariance Matrix

# PCA on covariance matrix (default)
> pcacov <- princomp(x=decathlon[,1:10])
> names(pcacov)
[1] "sdev"     "loadings" "center"   "scale"    "n.obs"
[6] "scores"   "call"

# resign PCA solution
> pcsign <- sign(colSums(pcacov$loadings^3))
> pcacov$loadings <- pcacov$loadings %*% diag(pcsign)
> pcacov$scores <- pcacov$scores %*% diag(pcsign)

# Note: R uses the MLE of the covariance matrix
> n <- nrow(decathlon)
> sum((pcacov$sdev - sqrt(eigen((n-1)/n*cov(decathlon[,1:10]))$values))^2)
[1] e-28

Slide 40: PCA on Correlation Matrix

# PCA on correlation matrix
> pcacor <- princomp(x=decathlon[,1:10], cor=TRUE)
> names(pcacor)
[1] "sdev"     "loadings" "center"   "scale"    "n.obs"
[6] "scores"   "call"

# resign PCA solution
> pcsign <- sign(colSums(pcacor$loadings^3))
> pcacor$loadings <- pcacor$loadings %*% diag(pcsign)
> pcacor$scores <- pcacor$scores %*% diag(pcsign)

# Note: PC standard deviations are square roots of the correlation matrix eigenvalues
> sum((pcacor$sdev - sqrt(eigen(cor(decathlon[,1:10]))$values))^2)
[1] e-30

Slide 41: Plot Covariance and Correlation PCA Results
[Figure: side-by-side plots of the first two PC loadings, "PCA of Covariance Matrix" (left) and "PCA of Correlation Matrix" (right), with the decathlon events labeled]

> dev.new(width=10, height=5, noRStudioGD=TRUE)
> par(mfrow=c(1,2))
> plot(pcacov$loadings[,1:2], xlab="PC1 Loadings", ylab="PC2 Loadings",
+      type="n", main="PCA of Covariance Matrix", xlim=c(-0.15, 1.15), ylim=c(-0.15, 1.15))
> text(pcacov$loadings[,1:2], labels=colnames(decathlon))
> plot(pcacor$loadings[,1:2], xlab="PC1 Loadings", ylab="PC2 Loadings",
+      type="n", main="PCA of Correlation Matrix", xlim=c(0.05, 0.45), ylim=c(-0.5, 0.6))
> text(pcacor$loadings[,1:2], labels=colnames(decathlon))

Slide 42: Correlation of PC Scores and Overall Decathlon Score
[Figure: decathlon score plotted against the PC1 and PC2 scores for the covariance PCA (left) and the correlation PCA (right)]

> round(cor(decathlon$score, pcacov$scores), 3)
[output omitted: correlations of the overall score with the 10 covariance PCs]
> round(cor(decathlon$score, pcacor$scores), 3)
[output omitted: correlations of the overall score with the 10 correlation PCs]

Slide 43: Choosing the Number of Components
In practice, the optimal number of components is often unknown.
- In some cases, possible/feasible values may be known a priori from theory and/or past research.
- In other cases, we need to use some data-driven approach to select a reasonable number of components.

Slide 44: Scree Plots
A scree plot displays the variance explained by each component.
[Figure: example scree plot of variance accounted for (vaf) versus number of components (q)]
- We look for the "elbow" of the plot, i.e., the point where the line bends.
- Could do a formal test on the derivative of the scree line, but the common sense approach often works fine.

Slide 45: Scree Plots for Decathlon PCA
[Figure: scree plots of PC variance versus number of PCs for the covariance PCA (left) and the correlation PCA (right)]

> dev.new(width=10, height=5, noRStudioGD=TRUE)
> par(mfrow=c(1,2))
> plot(1:10, pcacov$sdev^2, type="b", xlab="# PCs", ylab="Variance of PC",
+      main="PCA of Covariance Matrix")
> plot(1:10, pcacor$sdev^2, type="b", xlab="# PCs", ylab="Variance of PC",
+      main="PCA of Correlation Matrix")
