Standardization and Singular Value Decomposition in Canonical Correlation Analysis


Standardization and Singular Value Decomposition in Canonical Correlation Analysis

Melinda Borello

Johanna Hardin, Advisor
David Bachman, Reader

Submitted to Pitzer College in Partial Fulfillment of the Degree of Bachelor of Arts
April 24, 2013
Department of Mathematics


Abstract

Canonical correlation analysis (CCA) is a type of multivariate analysis based on the correlation between linear combinations of variables in two data sets. Biological applications of CCA include large-scale genomic studies with multiple phenotypic or genotypic data sets. These analyses usually involve an enormous number of variables, with the number of genes often exceeding tens of thousands, and because CCA considers all variables, its results can lack interpretability. Sparse canonical correlation analysis (SCCA) addresses this problem by providing sparse solutions. In this paper, I examine the relationship between running CCA and SCCA on raw, unstandardized data and running both methods on data that have been standardized to have mean zero and standard deviation one. I also show how both CCA and SCCA relate to the singular value decomposition (SVD) by examining an algorithm for SVD in the context of CCA and SCCA.


Contents

Abstract
Acknowledgments
1  Canonical Correlation Analysis
2  Sparse Canonical Correlation Analysis
3  Singular Value Decomposition Algorithm
Bibliography


Acknowledgments

I would like to thank Johanna Hardin for offering to be my thesis adviser at the last minute, when I thought it was too late for me to write a thesis. Her guidance throughout the research and writing process has been invaluable. I would also like to thank Associate Professor of Mathematics Stephen Garcia for his help with SVD and other linear algebra questions that we came upon throughout the semester. Without some of his explanations, Chapter 3 wouldn't exist. Of course, I would like to thank my mother, for all of my accomplishments are really hers. Thanks, Mom, for everything.


Chapter 1
Canonical Correlation Analysis

Canonical correlation analysis (CCA) measures the relationship between two sets of variables. CCA accomplishes this by focusing on the correlation between a linear combination of the variables in one set and a linear combination of the variables in the other set. Canonical correlation can also be thought of as an extension of bivariate correlation, allowing multiple continuous variables in each set. CCA seeks to answer how the best linear combination of one set of variables relates to the best linear combination of the other set. The following presentation of canonical correlation analysis follows that of Johnson and Wichern (1992).

Consider a group of $p$ variables represented by the $(p \times 1)$ random vector $X$, and a second group of $q$ variables represented by the $(q \times 1)$ random vector $Y$. Assume that $p \le q$. The random vectors have $\operatorname{Cov}(X) = \Sigma_{11}$, $\operatorname{Cov}(Y) = \Sigma_{22}$, and $\operatorname{Cov}(X, Y) = \Sigma_{12}$, with $E(X) = \mu_x$ and $E(Y) = \mu_y$.

For coefficient vectors $a$ and $b$ we can form the linear combinations $G = a'X$ and $H = b'Y$. Then $\max_{a,b} \operatorname{Corr}(G, H)$ is attained by the linear combinations (the first canonical variate pair) $G_1 = a_1'X$ and $H_1 = b_1'Y$. The $k$th pair of canonical variates, $k = 2, 3, \ldots, p$,
$$G_k = a_k'X \quad \text{and} \quad H_k = b_k'Y,$$
where $a_k = \Sigma_{11}^{-1/2} u_k$ and $b_k = \Sigma_{22}^{-1/2} v_k$ for $k = 1, \ldots, p$, maximizes $\operatorname{Corr}(G, H)$ among those linear combinations uncorrelated with the preceding $1, 2, \ldots, k-1$ canonical variables. That is, the second canonical variate pair consists of the linear combinations $G_2$ and $H_2$ that maximize the correlation among all linear combinations uncorrelated with the first canonical variate pair. Then we have
$$\max_{a,b} \operatorname{Corr}(a'X, b'Y) = \operatorname{Corr}(a_k'X, b_k'Y) = \operatorname{Corr}(G_k, H_k) = \rho_k.$$
Here the $(p \times 1)$ vectors $u_1, u_2, \ldots, u_p$ and the $(q \times 1)$ vectors $v_1, v_2, \ldots, v_p$ are the left and right singular vectors, respectively, of the matrix
$$K = \Sigma_{11}^{-1/2} \Sigma_{12} \Sigma_{22}^{-1/2}.$$

The singular values $\rho_1 \ge \rho_2 \ge \cdots \ge \rho_p$ of the matrix $K$ are the canonical correlations.

In some instances one may want to work with standardized variables, which allows the variables to be compared with one another more easily. There may be other motivations for standardization as well; for instance, standardization can simplify computations. Suppose we standardize the original variables as follows:
$$Z_X = V_{11}^{-1/2}(X - \mu_x) \quad \text{and} \quad Z_Y = V_{22}^{-1/2}(Y - \mu_y),$$
where $V_{11}^{-1/2}$ is the $(p \times p)$ diagonal matrix with the reciprocals of the standard deviations on its diagonal, i.e.,
$$V_{11}^{-1/2} = \begin{pmatrix} 1/\sqrt{\sigma_{x_{11}}} & & \\ & \ddots & \\ & & 1/\sqrt{\sigma_{x_{pp}}} \end{pmatrix}.$$
The matrix $V_{22}^{-1/2}$ is similarly defined as a $(q \times q)$ diagonal matrix with the reciprocals of the standard deviations of $Y$, $1/\sqrt{\sigma_{y_{ii}}}$, on its diagonal. Here $\rho_{11} = \operatorname{Cov}(Z_X)$, $\rho_{22} = \operatorname{Cov}(Z_Y)$, and $\rho_{12} = \operatorname{Cov}(Z_X, Z_Y)$, with $E(Z_X) = E(Z_Y) = 0$. Let the $(p \times 1)$ vectors $e_1, e_2, \ldots, e_p$ and the $(q \times 1)$ vectors $f_1, f_2, \ldots, f_p$ be the left and right singular vectors of $L$, respectively, where
$$L = \rho_{11}^{-1/2}\, \rho_{12}\, \rho_{22}^{-1/2}.$$
Coefficient vectors $\alpha_k$ and $\beta_k$ form the $k$th pair of canonical variates
$$M_k = \alpha_k' Z_X, \qquad N_k = \beta_k' Z_Y.$$

Then we have
$$\max_{\alpha,\beta} \operatorname{Corr}(\alpha' Z_X, \beta' Z_Y) = \operatorname{Corr}(\alpha_k' Z_X, \beta_k' Z_Y) = \operatorname{Corr}(M_k, N_k) = \rho_k.$$
Here $\rho_1 \ge \rho_2 \ge \cdots \ge \rho_p$ are the singular values of the matrix $L$, and $\alpha_k' = e_k'\rho_{11}^{-1/2}$ and $\beta_k' = f_k'\rho_{22}^{-1/2}$.

Theorem 1.1. Given coefficient vectors $\alpha = \rho_{11}^{-1/2} e$ and $\beta = \rho_{22}^{-1/2} f$ from scaled data $Z_X$ and $Z_Y$, and coefficient vectors $a = \Sigma_{11}^{-1/2} u$ and $b = \Sigma_{22}^{-1/2} v$ from raw data $X$ and $Y$, if $V_{11}^{1/2}$ and $V_{22}^{1/2}$ are diagonal matrices with $i$th diagonal elements $\sqrt{\sigma_{x_{ii}}}$ and $\sqrt{\sigma_{y_{ii}}}$, respectively, then $\alpha_k = V_{11}^{1/2} a_k$ and $\beta_k = V_{22}^{1/2} b_k$; i.e., the coefficients from scaled data are equal to the scaled coefficients from raw data in classical canonical correlation analysis.

Proof: We will show that $e_k'\rho_{11}^{-1/2} = u_k'\Sigma_{11}^{-1/2} V_{11}^{1/2}$, or equivalently, that $\Sigma_{11}^{-1/2} V_{11}^{1/2} = \rho_{11}^{-1/2}$ and that $e_k = u_k$. By showing these two equalities hold, we will prove the theorem since, without loss of generality,
$$\alpha_k' = e_k'\rho_{11}^{-1/2} = u_k'\Sigma_{11}^{-1/2} V_{11}^{1/2} \;\Longrightarrow\; \alpha_k' = a_k' V_{11}^{1/2} = \left(V_{11}^{1/2} a_k\right)' \;\Longrightarrow\; \alpha_k = V_{11}^{1/2} a_k.$$

Consider $\rho_{11}$:
$$\begin{aligned}
\rho_{11} = \operatorname{Cov}(Z_X) &= \operatorname{Cov}\!\left(V_{11}^{-1/2}(X - \mu_x)\right) \\
&= V_{11}^{-1/2}\, \operatorname{Cov}(X - \mu_x)\, \left(V_{11}^{-1/2}\right)' && (1.1)\\
&= V_{11}^{-1/2}\, \operatorname{Cov}(X)\, V_{11}^{-1/2} && (1.2)\\
&= V_{11}^{-1/2}\, \Sigma_{11}\, V_{11}^{-1/2} \\
&= V_{11}^{-1/2}\, \Sigma_{11}^{1/2} \Sigma_{11}^{1/2}\, V_{11}^{-1/2}.
\end{aligned}$$
Since $\rho_{11}$ is positive semi-definite, taking the square root of $\rho_{11}$ results in
$$\rho_{11}^{1/2} = V_{11}^{-1/2} \Sigma_{11}^{1/2} \quad\text{and}\quad \rho_{11}^{-1/2} = \Sigma_{11}^{-1/2} V_{11}^{1/2}.$$
Line (1.2) follows from (1.1) because $V_{11}^{-1/2}$ is diagonal and thus symmetric, so it is equal to its transpose (and $\operatorname{Cov}(X - \mu_x) = \operatorname{Cov}(X)$, since shifting by a constant vector does not change the covariance). A similar argument for $\rho_{22}$ also shows that $\rho_{22}^{-1/2} = \Sigma_{22}^{-1/2} V_{22}^{1/2}$.

Next, does $u_k = e_k$? In other words, does the left singular vector of $K$ equal the left singular vector of $L$? We just proved that $\rho_{11}^{-1/2} = \Sigma_{11}^{-1/2} V_{11}^{1/2}$ and $\rho_{22}^{-1/2} = \Sigma_{22}^{-1/2} V_{22}^{1/2}$. Recall that $\rho_{12} = \operatorname{Cov}(Z_X, Z_Y)$.

Thus we can write $L$ as:
$$\begin{aligned}
L &= \rho_{11}^{-1/2}\, \rho_{12}\, \rho_{22}^{-1/2} \\
&= \Sigma_{11}^{-1/2} V_{11}^{1/2}\, \operatorname{Cov}(Z_X, Z_Y)\, \Sigma_{22}^{-1/2} V_{22}^{1/2} \\
&= \Sigma_{11}^{-1/2} V_{11}^{1/2}\, \operatorname{Cov}\!\left(V_{11}^{-1/2}(X - \mu_x),\; V_{22}^{-1/2}(Y - \mu_y)\right) \Sigma_{22}^{-1/2} V_{22}^{1/2} \\
&= \Sigma_{11}^{-1/2} V_{11}^{1/2} V_{11}^{-1/2}\, \operatorname{Cov}(X - \mu_x, Y - \mu_y)\, V_{22}^{-1/2}\, \Sigma_{22}^{-1/2} V_{22}^{1/2} && (1.3)\\
&= \Sigma_{11}^{-1/2}\, \operatorname{Cov}(X, Y)\, V_{22}^{-1/2}\, \left(\Sigma_{22}^{-1/2} V_{22}^{1/2}\right)' && (1.4)\\
&= \Sigma_{11}^{-1/2}\, \operatorname{Cov}(X, Y)\, V_{22}^{-1/2}\, \left(V_{22}^{1/2}\right)' \left(\Sigma_{22}^{-1/2}\right)' && (1.5)\\
&= \Sigma_{11}^{-1/2}\, \operatorname{Cov}(X, Y)\, V_{22}^{-1/2}\, V_{22}^{1/2}\, \Sigma_{22}^{-1/2} \\
&= \Sigma_{11}^{-1/2}\, \operatorname{Cov}(X, Y)\, \Sigma_{22}^{-1/2} \\
&= \Sigma_{11}^{-1/2}\, \Sigma_{12}\, \Sigma_{22}^{-1/2} \\
&= K.
\end{aligned}$$
Moving from line (1.3) to (1.4), note that, as proven above, $\Sigma_{22}^{-1/2} V_{22}^{1/2} = \rho_{22}^{-1/2}$, which is positive definite and symmetric. Line (1.5) follows from (1.4) since $\Sigma_{22}^{-1/2}$ and $V_{22}^{1/2}$ are both symmetric. Since $L = K$, it is clear that $L$ and $K$ have the same singular vectors, since they have the same singular value decomposition. Therefore $V_{11}^{1/2} a_k$ is in fact the coefficient vector $\alpha_k$ for the $k$th canonical variate $M_k$ constructed from the standardized variable $Z_X$, and $V_{22}^{1/2} b_k$ is the coefficient vector $\beta_k$ for the $k$th canonical variate $N_k$ constructed from the standardized variable $Z_Y$.

Claim 1.2. If (i) $a_k$ and $b_k$ maximize $\operatorname{Corr}(a'X, b'Y)$ over all $a$ and $b$ subject to $a$ and $b$ being uncorrelated with $a_i$ and $b_i$, $i = 1, 2, \ldots, k-1$, and (ii) $\operatorname{Corr}(G_k, H_k) = \rho_k$, then (i) $\max_{\alpha,\beta} \operatorname{Corr}(\alpha' Z_X, \beta' Z_Y) = \operatorname{Corr}(\alpha_k' Z_X, \beta_k' Z_Y)$ subject to $\alpha$ and $\beta$ being uncorrelated with $\alpha_i$ and $\beta_i$, $i = 1, 2, \ldots, k-1$, and (ii) $\operatorname{Corr}(M_k, N_k) = \rho_k$ as well; i.e., the canonical correlations are unchanged by the standardization.

We will now show that the raw and standardized canonical variates produce the same maximum correlation. From the raw data, $G_k$ and $H_k$ are the $k$th pair of canonical variates that maximize $\operatorname{Corr}(G, H)$. If this correlation is equal to $\rho_k$, then
$$\begin{aligned}
\rho_k = \operatorname{Corr}(G_k, H_k) &= \operatorname{Corr}(a_k'X,\, b_k'Y) \\
&= \operatorname{Corr}\!\left(a_k'(X - \mu_x),\, b_k'(Y - \mu_y)\right) \\
&= \operatorname{Corr}\!\left(a_k' V_{11}^{1/2} V_{11}^{-1/2}(X - \mu_x),\, b_k' V_{22}^{1/2} V_{22}^{-1/2}(Y - \mu_y)\right) \\
&= \operatorname{Corr}\!\left(a_k' V_{11}^{1/2} Z_X,\, b_k' V_{22}^{1/2} Z_Y\right) \\
&= \operatorname{Corr}(\alpha_k' Z_X,\, \beta_k' Z_Y) \\
&= \operatorname{Corr}(M_k, N_k).
\end{aligned}$$
Thus the canonical correlations are unchanged by the standardization. We know that $G_k$ and $H_k$ maximize the correlation for the raw data, but do $M_k$ and $N_k$ maximize the correlation for the standardized data? Suppose not. Suppose there were a larger correlation, say $\rho_k^*$, between the two standardized canonical variates $M_k$ and $N_k$. Then $\rho_k^*$ must also be the maximum correlation for the raw canonical variates $G_k$ and $H_k$, since, as shown above, the correlation for $M_k$ and $N_k$ is equal to the correlation for $G_k$ and $H_k$.

Consider the matrices $K$ and $L$. As shown above, $K = L$, and thus they have the same singular vectors. It follows that they also have the same singular values, $\rho_k$. Furthermore, these singular values are the correlations for the $k$th canonical variate pairs (detailed in Chapter 3). Hence the pairs $G_k$ and $H_k$, and $M_k$ and $N_k$, have the same (maximum) correlation value. Ultimately, CCA provides the same results whether one uses scaled data or uses raw data and scales the coefficients by $V_{11}^{1/2}$ and $V_{22}^{1/2}$ (respectively) at the end. Both the raw and the standardized variables maximize the correlation of linear combinations of the variables from the two data sets.
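The following is a small numerical sketch, not taken from the thesis, of classical CCA computed through the SVD of $K$. It uses NumPy with simulated data; the sample size, the covariance structure, and all variable names are illustrative assumptions. Because it uses symmetric inverse square roots throughout, it checks the two consequences emphasized above: the canonical correlations computed from $K$ and from $L$ agree, and the coefficients satisfy $\alpha_k = V_{11}^{1/2} a_k$ up to sign.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: n observations of p X-variables and q Y-variables
# (sizes and covariance structure are illustrative assumptions).
n, p, q = 500, 3, 4
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
Y = rng.normal(size=(n, q)) @ rng.normal(size=(q, q))
Y[:, 0] += X[:, 0]                       # induce some cross-correlation

def inv_sqrt(S):
    """Symmetric inverse square root of a positive definite matrix."""
    vals, vecs = np.linalg.eigh(S)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

# Sample covariance blocks Sigma_11, Sigma_22, Sigma_12
S = np.cov(np.hstack([X, Y]), rowvar=False)
S11, S22, S12 = S[:p, :p], S[p:, p:], S[:p, p:]

# K = Sigma_11^{-1/2} Sigma_12 Sigma_22^{-1/2}; its singular values are the
# canonical correlations, with a_k = Sigma_11^{-1/2} u_k, b_k = Sigma_22^{-1/2} v_k.
K = inv_sqrt(S11) @ S12 @ inv_sqrt(S22)
U, d, Vt = np.linalg.svd(K)
a = inv_sqrt(S11) @ U                    # raw-data coefficient vectors (columns)
b = inv_sqrt(S22) @ Vt.T

# Standardized data Z = V^{-1/2}(X - mu) and the analogous matrix L
sd_x, sd_y = np.sqrt(np.diag(S11)), np.sqrt(np.diag(S22))
ZX, ZY = (X - X.mean(0)) / sd_x, (Y - Y.mean(0)) / sd_y
R = np.cov(np.hstack([ZX, ZY]), rowvar=False)
R11, R22, R12 = R[:p, :p], R[p:, p:], R[:p, p:]
L = inv_sqrt(R11) @ R12 @ inv_sqrt(R22)
E, dL, Ft = np.linalg.svd(L)
alpha = inv_sqrt(R11) @ E                # standardized-data coefficient vectors

print(np.allclose(d, dL))                                     # same canonical correlations
print(np.allclose(np.abs(alpha), np.abs(np.diag(sd_x) @ a)))  # alpha_k = V_11^{1/2} a_k (up to sign)
```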

Chapter 2
Sparse Canonical Correlation Analysis

As presented above, canonical correlation analysis uses all variables from both sets, $X$ and $Y$, to create the canonical vectors. Yet when applied to real data, CCA often fails to produce interpretable results. Data used in microarray analysis and genome-wide linkage analysis usually involve an enormous number of variables, with the number of genes exceeding tens of thousands, so results from CCA lack biological interpretability. One way to combat this issue is to use sparse canonical correlation analysis (SCCA). SCCA helps solve the problem of interpretability by providing sparse sets of associated variables; i.e., the canonical vectors contain sparse loadings. Following a method for selecting appropriate sparseness parameters, the sparse solution contains the variables that are deemed more important than the others.

Thus the solution reduces the dimensionality, which improves interpretability.

The iterative algorithm below is the one presented by Parkhomenko et al. (2009). This algorithm uses soft-thresholding as its penalty function; note that there are a number of other penalty functions one can use for SCCA, which have been outlined and compared by Chalise and Fridley (2012). Following classical CCA, consider two sets of variables $X$ and $Y$, with $p$ variables in $X$ and $q$ variables in $Y$. As before, let $K = \Sigma_{11}^{-1/2} \Sigma_{12} \Sigma_{22}^{-1/2}$, where $\operatorname{Cov}(X) = \Sigma_{11}$, $\operatorname{Cov}(Y) = \Sigma_{22}$, and $\operatorname{Cov}(X, Y) = \Sigma_{12}$. The first sparse canonical vectors are identified using the following algorithm:

1. Select sparseness parameters $\lambda_u$ and $\lambda_v$.
2. Select initial values $u^0$ and $v^0$ and set $i = 0$.
3. Update $u$:
   (a) $u^{i+1} \leftarrow K v^i$
   (b) Normalize: $u^{i+1} \leftarrow u^{i+1} / \|u^{i+1}\|$
   (c) Apply soft-thresholding to obtain a sparse solution: $u^{i+1}_j \leftarrow \left(|u^{i+1}_j| - \tfrac{1}{2}\lambda_u\right)_+ \operatorname{sign}(u^{i+1}_j)$ for $j = 1, \ldots, p$
   (d) Normalize: $u^{i+1} \leftarrow u^{i+1} / \|u^{i+1}\|$
4. Update $v$:
   (a) $v^{i+1} \leftarrow K' u^{i+1}$
   (b) Normalize: $v^{i+1} \leftarrow v^{i+1} / \|v^{i+1}\|$

   (c) Apply soft-thresholding to obtain a sparse solution: $v^{i+1}_j \leftarrow \left(|v^{i+1}_j| - \tfrac{1}{2}\lambda_v\right)_+ \operatorname{sign}(v^{i+1}_j)$ for $j = 1, \ldots, q$
   (d) Normalize: $v^{i+1} \leftarrow v^{i+1} / \|v^{i+1}\|$
5. $i \leftarrow i + 1$
6. Repeat steps 3-5 until convergence,

where $(x)_+$ equals $x$ if $x \ge 0$ and $0$ if $x < 0$, and
$$\operatorname{sign}(x) = \begin{cases} -1 & \text{if } x < 0 \\ \phantom{-}1 & \text{if } x > 0 \\ \phantom{-}0 & \text{if } x = 0. \end{cases}$$

Parkhomenko et al. (2009) replace $\Sigma_{11}$ and $\Sigma_{22}$ by $\operatorname{diag}(\Sigma_{11})$ and $\operatorname{diag}(\Sigma_{22})$; thus $K$ becomes an approximation of the sample correlation matrix of $X$ and $Y$, without any of the information about how the variables within $X$ are correlated or how the variables within $Y$ are correlated. Using $\operatorname{diag}(\Sigma_{11})$ also avoids computational problems: the computation of $K$ otherwise requires $(X'X)^{-1}$ and $(Y'Y)^{-1}$, which may not exist when the number of variables is greater than the number of observations (when $p$ or $q$ is larger than $n$). This is common in the biological applications discussed above, where one can have tens of thousands of genes but only a few hundred observations. The authors also select initial values so that $u^0$ is the vector of row means of $K$ and $v^0$ is the vector of column means of $K$, both standardized to have unit length.
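Below is a minimal Python sketch of the iteration above; it is not Parkhomenko et al.'s own software, and the function name, convergence tolerance, and iteration cap are assumptions. Following the authors, $\Sigma_{11}$ and $\Sigma_{22}$ are replaced by their diagonals, so $K$ is built from the sample correlations between the $X$ and $Y$ variables.

```python
import numpy as np

def soft_threshold(w, lam):
    """Componentwise (|w_j| - lam/2)_+ * sign(w_j), as in steps 3(c) and 4(c)."""
    return np.sign(w) * np.maximum(np.abs(w) - lam / 2.0, 0.0)

def scca_first_pair(X, Y, lam_u, lam_v, max_iter=500, tol=1e-8):
    """First sparse canonical vector pair (u, v) via the iteration above.

    Sigma_11 and Sigma_22 are replaced by their diagonals, so K approximates the
    X-Y sample correlation matrix. lam_u and lam_v are assumed small enough that
    soft-thresholding does not zero out every loading.
    """
    n = X.shape[0]
    Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)
    d1 = 1.0 / np.sqrt(np.var(X, axis=0, ddof=1))    # diag(Sigma_11)^{-1/2}
    d2 = 1.0 / np.sqrt(np.var(Y, axis=0, ddof=1))    # diag(Sigma_22)^{-1/2}
    K = (d1[:, None] * (Xc.T @ Yc / (n - 1))) * d2[None, :]

    # Initial values: row and column means of K, scaled to unit length.
    u = K.mean(axis=1); u /= np.linalg.norm(u)
    v = K.mean(axis=0); v /= np.linalg.norm(v)

    for _ in range(max_iter):
        u_old, v_old = u, v
        u = K @ v                                    # step 3(a)
        u /= np.linalg.norm(u)                       # step 3(b)
        u = soft_threshold(u, lam_u)                 # step 3(c)
        u /= np.linalg.norm(u)                       # step 3(d)
        v = K.T @ u                                  # step 4(a)
        v /= np.linalg.norm(v)                       # step 4(b)
        v = soft_threshold(v, lam_v)                 # step 4(c)
        v /= np.linalg.norm(v)                       # step 4(d)
        if np.linalg.norm(u - u_old) < tol and np.linalg.norm(v - v_old) < tol:
            break                                    # step 6: stop at convergence
    return u, v
```

For example, `u, v = scca_first_pair(X, Y, lam_u=0.1, lam_v=0.1)` returns unit-length sparse loading vectors whose nonzero positions indicate the selected variables.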

Although the authors assume that $X$ and $Y$ have been standardized to $Z_X$ and $Z_Y$, SCCA applied to raw data produces the same results as SCCA applied to standardized data. As in classical CCA, using scaled and centered data or using raw data does not affect $K$, nor does it affect the left and right singular vectors of $K$. Since the initial values $u^0$ and $v^0$ come from $K$, and recalling that $K = L$, there is no difference in the algorithm between raw variables and standardized variables. In this context, we are interested in the effect of scaling and centering the variables on the soft-thresholding step.

The sparseness parameters $\lambda_u$ and $\lambda_v$ are chosen using $k$-fold cross-validation, as outlined below:

1. Choose $\lambda_u$ and $\lambda_v$.
2. Remove $\frac{1}{k}$ of the data (the testing sample).
3. Find the canonical coefficients from the remaining $\frac{k-1}{k}$ of the data (the training sample).
4. Find the canonical correlation using the testing sample and the coefficients from the training sample.
5. Repeat steps 2-4 $k$ times and average across the $k$ correlations, keeping this value.
6. Repeat steps 1-5 for new $\lambda_u$ and $\lambda_v$.

From this process we find the optimal combination of $\lambda_u$ and $\lambda_v$: out of all pairs of sparseness parameters considered, we keep the pair corresponding to the highest average test-sample correlation. This process is not affected by scaling the variables, as it looks at the correlation between canonical vectors in the testing and training samples, and we have already proven that this correlation is the same for raw and standardized canonical vectors. Thus the choices of sparseness parameters are not affected by the scaling of the variables. (For more on the selection of sparseness parameters, consult Parkhomenko et al. (2009).)
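A possible sketch of this grid search is given below. It reuses the hypothetical `scca_first_pair` function from the previous sketch, and the candidate grid, the number of folds, and the use of the Pearson correlation of the projected test data are assumptions rather than details taken from the thesis.

```python
import itertools
import numpy as np

def cv_select_lambdas(X, Y, grid, k=5, seed=0):
    """k-fold cross-validated grid search for (lam_u, lam_v).

    Relies on the scca_first_pair sketch above; `grid` is an iterable of
    candidate penalty values.
    """
    n = X.shape[0]
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    best_pair, best_cor = None, -np.inf
    for lam_u, lam_v in itertools.product(grid, repeat=2):         # steps 1 and 6
        cors = []
        for test_idx in folds:                                      # steps 2-5
            train_idx = np.setdiff1d(np.arange(n), test_idx)
            u, v = scca_first_pair(X[train_idx], Y[train_idx], lam_u, lam_v)
            g, h = X[test_idx] @ u, Y[test_idx] @ v                 # test-sample variates
            cors.append(np.corrcoef(g, h)[0, 1])
        if np.mean(cors) > best_cor:
            best_pair, best_cor = (lam_u, lam_v), np.mean(cors)
    return best_pair, best_cor
```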

The relationship between the coefficients from raw data and the coefficients from standardized data is the same in the sparse setting as it was in CCA. The difference is that $u$ and $v$ now have sparse loadings, so our coefficient vectors will also be sparse. As before, if $\alpha_k$ and $\beta_k$ are the coefficients for the standardized data $Z_X$ and $Z_Y$, and $a_k$ and $b_k$ are the coefficients for the raw data $X$ and $Y$, then $\alpha_k = V_{11}^{1/2} a_k$ and $\beta_k = V_{22}^{1/2} b_k$, where $\operatorname{Corr}(\alpha_k' Z_X, \beta_k' Z_Y) = \operatorname{Corr}(a_k' X, b_k' Y)$. That is, Theorem 1.1 holds for SCCA.


Chapter 3
Singular Value Decomposition Algorithm

One may notice that the sparse algorithm is nothing more than the standard iterative algorithm for the singular value decomposition (SVD) with an added soft-thresholding step to obtain a sparse vector. The standard algorithm is:

1. Select initial values $u^0$ and $v^0$ and set $i = 0$.
2. Update $u$:
   (a) $u^{i+1} \leftarrow K v^i$
   (b) Normalize: $u^{i+1} \leftarrow u^{i+1} / \|u^{i+1}\|$
3. Update $v$:
   (a) $v^{i+1} \leftarrow K' u^{i+1}$

   (b) Normalize: $v^{i+1} \leftarrow v^{i+1} / \|v^{i+1}\|$
4. $i \leftarrow i + 1$
5. Repeat steps 2-4 until convergence.

Why does this provide the singular vectors of $K$? Recall that the SVD of $K$ is $K = UDV'$, where we choose $U$ and $V$ to be orthogonal matrices, i.e., $U' = U^{-1}$ and $V' = V^{-1}$. Thus the column vectors of $U$ and $V$, $u_i$ and $v_i$, are unit vectors and are the left and right singular vectors of $K$. The matrix $D$ is a diagonal matrix with the singular values of $K$ on its diagonal, i.e., the square roots of the eigenvalues of $K'K$ or $KK'$. Alternately multiplying $v$ by $K$ and $u$ by $K'$ produces the singular vectors of $K$, as can be seen by noting that
$$K v_i = U D V' v_i = d_i u_i.$$
Observe that $V' v_i$ is a column vector of zeros except for a $1$ in the $i$th position, since this is equivalent to taking the inner product of $v_i$ with each of the columns of $V$. This then gives $D V' v_i$, a column vector of zeros except for $d_i$ in the $i$th position. Thus $U D V' v_i$ is the $i$th column of $U$ multiplied by the singular value $d_i$. Next we normalize to get a unit vector, giving us
$$\frac{d_i u_i}{\|d_i u_i\|} = \frac{d_i u_i}{d_i \|u_i\|} = \frac{u_i}{\|u_i\|} = u_i.$$
Similarly, $K' u_i = V D U' u_i = d_i v_i$, which we normalize to get $v_i$.
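A compact Python sketch of this alternating iteration for the leading singular pair follows; the random initialization, tolerance, and iteration cap are assumptions, and a library SVD is used only to check the result.

```python
import numpy as np

def leading_singular_pair(K, max_iter=1000, tol=1e-12, seed=0):
    """Alternating iteration u <- K v, v <- K' u, each step normalized."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=K.shape[1]); v /= np.linalg.norm(v)
    for _ in range(max_iter):
        v_old = v
        u = K @ v;   u /= np.linalg.norm(u)       # step 2
        v = K.T @ u; v /= np.linalg.norm(v)       # step 3
        if np.linalg.norm(v - v_old) < tol:       # step 5: repeat until convergence
            break
    return u @ K @ v, u, v                        # d_1 (= rho_1 in CCA), u_1, v_1

# Check against a library SVD on a small random matrix.
K = np.random.default_rng(1).normal(size=(3, 5))
d1, u1, v1 = leading_singular_pair(K)
U, s, Vt = np.linalg.svd(K)
print(np.isclose(d1, s[0]),
      np.allclose(np.abs(u1), np.abs(U[:, 0])),
      np.allclose(np.abs(v1), np.abs(Vt[0])))
```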

This process produces the largest diagonal entry of $D$, which is the largest singular value of $K$; in the context of CCA and SCCA, this is the largest correlation, $\rho_1$, between the canonical variates. By alternately multiplying $v$ and $u$ by $K$ and $K'$, respectively, each starting vector $u^0$ and $v^0$ is scaled toward the direction associated with the largest singular value of $K$, until $u$ and $v$ become the singular vectors of $K$.

Notice that if we construct a symmetric matrix $A$ such that
$$A = \begin{pmatrix} 0 & K' \\ K & 0 \end{pmatrix},$$
the problem is reduced to an eigenvalue problem. We wish to solve $Ax = dx$, where $d$ is the dominant singular value of $K$ (and the dominant eigenvalue of $A$) and $x$ is the vector $\begin{pmatrix} v \\ u \end{pmatrix}$.

Note that applying $A$ to $x$ leads to
$$A \begin{pmatrix} v \\ u \end{pmatrix} = \begin{pmatrix} K'u \\ Kv \end{pmatrix} = \begin{pmatrix} dv \\ du \end{pmatrix}$$
if $u$ and $v$ are the singular vectors of $K$. Thus putting $K$ and $K'$ into a block matrix results in a symmetric matrix $A$ with dominant eigenvalue $d$ and eigenvector $\begin{pmatrix} v \\ u \end{pmatrix}$. Now, instead of finding singular vectors and singular values, we just need to find the eigenvectors and eigenvalues of $A$.
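A brief numerical sketch (illustrative only, on a random matrix) of this block-matrix construction: the positive eigenvalues of $A$ are the singular values of $K$, and the dominant eigenvector stacks $v_1$ on top of $u_1$, up to sign and a factor of $1/\sqrt{2}$ from normalization.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 3, 4
K = rng.normal(size=(p, q))

# Symmetric block matrix A = [[0, K'], [K, 0]] acting on x = (v; u)
A = np.block([[np.zeros((q, q)), K.T],
              [K, np.zeros((p, p))]])

evals, evecs = np.linalg.eigh(A)
d = np.linalg.svd(K, compute_uv=False)

# The positive eigenvalues of A are the singular values of K.
print(np.allclose(np.sort(evals[evals > 1e-10]), np.sort(d)))

# The dominant eigenvector of A is (v_1; u_1) / sqrt(2), up to sign.
x = evecs[:, np.argmax(evals)]
v1, u1 = x[:q], x[q:]
U, s, Vt = np.linalg.svd(K)
print(np.allclose(np.abs(v1) * np.sqrt(2), np.abs(Vt[0])),
      np.allclose(np.abs(u1) * np.sqrt(2), np.abs(U[:, 0])))
```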

The above algorithm for SVD can be used for CCA and SCCA: it produces the components needed for the canonical variates (the singular vectors), and it produces the canonical correlations (the singular values). How does SVD produce the canonical correlations? Recall that in CCA we seek coefficient vectors $a$ and $b$ such that
$$\operatorname{Corr}(G, H) = \frac{a'\Sigma_{12}b}{\sqrt{a'\Sigma_{11}a}\,\sqrt{b'\Sigma_{22}b}}$$
is maximized. Notice that we can reduce this expression as follows:
$$\begin{aligned}
\max_{a,b} \operatorname{Corr}(G, H) &= \max_{a,b} \frac{a'\Sigma_{12}b}{\sqrt{a'\Sigma_{11}a}\,\sqrt{b'\Sigma_{22}b}} && (3.1)\\
&= \max_{u,v} \frac{u'\Sigma_{11}^{-1/2}\Sigma_{12}\Sigma_{22}^{-1/2}v}{\sqrt{u'\Sigma_{11}^{-1/2}\Sigma_{11}\Sigma_{11}^{-1/2}u}\,\sqrt{v'\Sigma_{22}^{-1/2}\Sigma_{22}\Sigma_{22}^{-1/2}v}} && (3.2)\\
&= \max_{u,v} \frac{u'Kv}{\sqrt{u'I_{p\times p}\,u}\,\sqrt{v'I_{q\times q}\,v}} \\
&= \max_{u,v} \frac{u'Kv}{\|u\|\,\|v\|}. && (3.3)
\end{aligned}$$
Line (3.2) follows from (3.1) by the change of variables $a = \Sigma_{11}^{-1/2}u$ and $b = \Sigma_{22}^{-1/2}v$. Line (3.3) is equivalent to finding $u$ and $v$ that are unit vectors and maximizing over those vectors. Under this condition, (3.3) boils down to $\max_{u,v} u'Kv$. Substituting the singular value decomposition of $K$ gives
$$\max_{u,v} u'UDV'v. \qquad (3.4)$$

When $u$ and $v$ are the first singular vectors of $K$, from the above calculations, (3.4) becomes
$$\max_{u,v} u'UDV'v = u_1'UDV'v_1 = u_1'(d_1 u_1) = d_1\,u_1'u_1 = d_1.$$
The first singular vectors of $K$ produce the largest singular value, as we saw in the above algorithm: the iterative process for finding $u_1$ and $v_1$ is always pulled toward the direction associated with the largest singular value of $K$. Thus SVD produces the canonical correlations.

To summarize, SVD is a helpful tool that can be used to carry out CCA and SCCA. SVD provides an algorithm that helps find the linear combinations of the variables in each data set that produce the maximum correlation. In both SCCA and CCA, standardizing the variables to have mean zero and standard deviation one does not change the canonical correlations, and the coefficients from the standardized data are equal to the coefficients from the raw data scaled by their standard deviations. Therefore it is at the discretion of the researcher whether to use raw data or standardized data in CCA or SCCA.

Bibliography

Chalise, P. and Fridley, B. L. (2012). Comparison of penalty functions for sparse canonical correlation analysis. Comput. Statist. Data Anal., 56(2).

Johnson, R. A. and Wichern, D. W. (1992). Applied Multivariate Statistical Analysis. Prentice Hall, Englewood Cliffs, NJ, third edition.

Parkhomenko, E., Tritchler, D., and Beyene, J. (2009). Sparse canonical correlation analysis with application to genomic data integration. Stat. Appl. Genet. Mol. Biol., 8, Article 1.
