EDAMI DIMENSION REDUCTION BY PRINCIPAL COMPONENT ANALYSIS


Mario Romanazzi, October 29

1 Introduction

An important task in multidimensional data analysis is reduction in complexity. Recalling that data are usually characterized as a set of n objects whose relevant features are described by a set of p variables, complexity reduction can be achieved either by reduction of variables or by reduction of objects. Here we consider the first problem, assuming the variables to be numerical in nature. A first, and obvious, method for simplification of the features is to select a subset of q < p features able to retain the desired information. A second, more general, method is to look for q < p transformations of the observed features able to retain the desired information. Principal component analysis (PCA) belongs to this second family of methods and is a typical step when trying to understand the structure of multidimensional numerical data.

2 Preliminaries: redundancy, rank and linear transformations

If we want to preserve the main information given by the observed features, it is clear that complexity reduction is only possible when there is some redundancy in the data. Mathematics offers a useful first notion of redundancy, which is rank. Since p is usually much lower than n, the rank of a data frame can be taken to be the number of linearly independent variables (columns), meaning that, if some variables are linear combinations of the others, then they do not offer new information and can be dropped without losing anything. The rank can be computed as the number of strictly positive singular values of the data frame. Below we examine the rank of the Swiss notes data set.

> bn <- read.table(file=
+                  header=TRUE)
> str(bn)
'data.frame': 200 obs. of 7 variables:
 $ Id      : Factor w/ 200 levels "BN1","BN10","BN",..:
 $ Length  : num
 $ Left    : num
 $ Right   : num
 $ Bottom  : num
 $ Top     : num
 $ Diagonal: num
> SVD <- svd(bn[, -1])
> str(SVD)

List of 3
 $ d: num [1:6]
 $ u: num [1:200, 1:6]
 $ v: num [1:6, 1:6]
> # Singular values
> SVD$d
[1]

Here the rank is 6 and equals the number of variables. Let us consider two more variables, namely the perimeter and the area of each note.

> peri <- 2*bn$Length + bn$Left + bn$Right
> area <- bn$Length * (bn$Left + bn$Right)/2
> bn1 <- data.frame(bn[, -1], peri, area)
> names(bn1) <- c(names(bn[, -1]), "Perimeter", "Area")
> SVD1 <- svd(bn1)
> # Singular values
> SVD1$d
[1] e e e e e+00
[6] e e e-13

Now the last singular value is zero, up to precision tolerance, reflecting the fact that the perimeter is a linear combination of the side lengths of the notes. Hence the rank is 7 = p - 1. Note that the addition of the area, being a non-linear transformation of the side lengths, does not contribute to redundancy.

Another useful notion of redundancy is offered by the correlation matrix R ≡ R(X) of a data frame X. An observed p x p correlation matrix R(X) varies between two extreme correlation matrices: the identity matrix I_p = diag(1, ..., 1) and the all-ones matrix J_p = 1_p 1_p^T. The identity matrix corresponds to the situation where the variables are linearly independent, hence there is no redundancy and it is not possible to reduce complexity. The J-matrix corresponds to the seemingly opposite situation where just one variable carries real information, all the others being perfect linear transformations of it. The ranks of I_p and J_p are p and 1, respectively.

Underlying the previous notions of redundancy and rank there is a particular class of data transformations, called linear combinations. We introduce a useful notation to represent a general linear combination and recall some of its properties. Let X be the n x p numerical matrix of the data and let a = (a_1, ..., a_p)^T be a general p-vector. The linear combination z = (z_1, z_2, ..., z_n)^T associated with the vector a is the transformation

z ≡ z(a) = Xa.    (1)

As z_i = a^T x_i = \sum_{j=1}^{p} a_j x_{ij}, the values of the linear combination can be interpreted as generalized means, the generalization being that the a_j are arbitrary real numbers, whereas for means they must be non-negative numbers summing to one. As an example, the perimeter of the Swiss notes is the linear combination associated with the vector a = (2, 1, 1, 0, 0, 0)^T, and the perimeter of the generic note is z_i = 2x_{i1} + x_{i2} + x_{i3}, i = 1, ..., n. The properties of a linear combination depend on the transformation vector a and on the reference data frame. In particular, the mean and the variance are

\bar{z} = a^T \bar{x} = \sum_{j=1}^{p} a_j \bar{x}_j,    (2)

s_Z^2 = a^T S a = \sum_{i,j=1}^{p} s_{ij} a_i a_j.    (3)

The expression of the variance is particularly important. We regard the variance as the information content (proportional to the squared L2 norm of the errors about the mean) of the corresponding variable. From (3), the variance of a linear combination is a quadratic form depending on the underlying data through the variances and pairwise covariances of the observed variables. Hence, the variance of a linear combination incorporates the information about spread as well as linear interdependence, filtered by the coefficients a_j. To avoid variance explosion, normalized linear combinations are often considered, where the coefficients a_j satisfy the constraint \sum_{j=1}^{p} a_j^2 = 1. In this case, the range of variation of a is the boundary of the unit-radius hypersphere centered at the origin, instead of the entire Euclidean space. Another property of linear combinations is that they preserve normality when the data are normally distributed.

3 Principal components

In the previous examples, the vector a of the linear combination was given. But in data analysis the vector a is often determined so as to achieve specific goals; in these situations it is typically a function of the observed data. This is exactly the case for principal components. The principal components of a numerical data frame X are p uncorrelated normalized linear combinations Z_1, Z_2, ..., Z_p, ordered according to an information criterion. The first principal component is the normalized linear combination with maximum variance; the second principal component is the normalized linear combination with maximum variance subject to the constraint of being uncorrelated with the first one; and so on. The last principal component can also be characterized as the normalized linear combination with minimum variance. The computation of principal components is simple because (a) the vectors a_1, a_2, ..., a_p of the optimal linear combinations are known to be the orthonormal eigenvectors of the covariance matrix of X, and (b) their variances are the corresponding eigenvalues l_1 ≥ l_2 ≥ ... ≥ l_p ≥ 0. A geometrical interpretation is also available. The principal component transformation is a rotation of p-dimensional space to new axes, called principal axes, corresponding to the directions of maximum variation of the data. For normal data, the principal axes are the axes of the ellipsoids of concentration, that is, of the contours of the normal density function.

In typical applications, PCA includes three steps:

1. preliminary data transformations, always including column centering of the data frame and sometimes column standardization to unit variance,

2. rotation to the principal axes and computation of the principal component scores,

3. selection of an optimal subset of q < p principal components.

The optimality criterion used in the last step is an information criterion. Recalling that the principal components are ordered according to decreasing variance, we have to retain enough pc's so that their cumulative variance approximates the total variance of the original variables. In practice, we consider the information ratio

R^2(q) = \frac{\sum_{j=1}^{q} l_j}{\sum_{j=1}^{p} l_j},    (4)

and choose q so that R^2(q) is sufficiently high. Values in the range 70-80% are considered satisfactory.
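The identities (2)-(3) and the ratio (4) are easy to check numerically. The sketch below uses synthetic data and an illustrative helper info.ratio(); neither appears in the original notes, it is only meant as a quick verification.

# Minimal sketch (synthetic data): check identities (2)-(3) and compute R^2(q).
set.seed(1)
X <- matrix(rnorm(200 * 3), ncol = 3)                 # synthetic n x p data matrix
a <- c(2, 1, 1)                                       # perimeter-like coefficients
z <- drop(X %*% a)                                    # linear combination z = Xa
all.equal(mean(z), sum(a * colMeans(X)))              # (2): mean of z = a' x-bar
all.equal(var(z), drop(t(a) %*% cov(X) %*% a))        # (3): variance of z = a' S a
info.ratio <- function(S, q) {                        # R^2(q) from the eigenvalues of S
  l <- eigen(S, symmetric = TRUE)$values              # l_1 >= l_2 >= ... >= l_p
  sum(l[1:q]) / sum(l)
}
info.ratio(cov(X), 2)                                 # variance retained by the first 2 pc's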

Figure 1: Swiss banknotes. Ordered variances of the principal components.

3.1 Worked example: PCA of centered Swiss notes data

Below, PCA of the Swiss notes data is performed. We first consider the analysis of the column-centered data.

> class <- rep(c(0, 1), c(100, 100))
> col <- rep(c("black", "red"), c(100, 100))
> pc <- princomp(bn[, -1], cor=FALSE)
> str(pc)
List of 7
 $ sdev    : Named num [1:6]
  ..- attr(*, "names")= chr [1:6] "Comp.1" "Comp.2" "Comp.3" "Comp.4" ...
 $ loadings: loadings [1:6, 1:6]
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:6] "Length" "Left" "Right" "Bottom" ...
  .. ..$ : chr [1:6] "Comp.1" "Comp.2" "Comp.3" "Comp.4" ...
 $ center  : Named num [1:6]
  ..- attr(*, "names")= chr [1:6] "Length" "Left" "Right" "Bottom" ...
 $ scale   : Named num [1:6]
  ..- attr(*, "names")= chr [1:6] "Length" "Left" "Right" "Bottom" ...
 $ n.obs   : int 200
 $ scores  : num [1:200, 1:6]
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : chr [1:6] "Comp.1" "Comp.2" "Comp.3" "Comp.4" ...
 $ call    : language princomp(x = bn[, -1], cor = FALSE)
 - attr(*, "class")= chr "princomp"
> # pc$loadings: p x p orthogonal matrix of ordered eigenvectors,
> # whose columns are the optimal normalized linear combinations
> # pc$sdev: standard deviations of the pc scores (coincident with the square roots

> # of the eigenvalues of the covariance matrix)
> # pc$scores: n x p matrix of pc scores, or coordinates (the linear combination values)
> summary(pc)
Importance of components:
                       Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
Standard deviation
Proportion of Variance
Cumulative Proportion
                       Comp.6
Standard deviation
Proportion of Variance
Cumulative Proportion
> plot(pc)
> # Statistical summaries of the pc scores
> round(colMeans(pc$scores), 2)
> round(cov(pc$scores), 2)
Comp.1
Comp.2
Comp.3
Comp.4
Comp.5
Comp.6
> round(cor(pc$scores), 2)
Comp.1
Comp.2
Comp.3
Comp.4
Comp.5
Comp.6
> round(cor(bn[,-1], pc$scores[, 1:6]), 2)
Length
Left
Right
Bottom
Top
Diagonal
> plot(pc$scores[,1:2], pch=20, col=col,
+      xlab="PC1 (66.8%)", ylab="PC2 (20.8%)", main="PCA of Swiss Banknotes Data")
> abline(h=0, v=0, lty="dotted", col="grey")
> text(pc$scores[,1:2], labels=c(1:100, 1:100), cex=0.6, pos=3)
> pairs(pc$scores, pch=20, col=col)
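As a cross-check of the comments on pc$loadings and pc$sdev above, the loadings returned by princomp() are the eigenvectors of the covariance matrix (up to the sign of each column), while the component variances are the eigenvalues scaled by (n - 1)/n, because princomp() uses divisor n. A minimal sketch on synthetic data (not the banknotes):

# Sketch: princomp() vs. an explicit eigendecomposition of the covariance matrix.
set.seed(2)
X <- matrix(rnorm(50 * 4), ncol = 4)             # synthetic data, 50 x 4
pc.chk <- princomp(X, cor = FALSE)
ev <- eigen(cov(X), symmetric = TRUE)
all.equal(abs(unclass(pc.chk$loadings)), abs(ev$vectors),
          check.attributes = FALSE)              # same eigenvectors, up to sign
all.equal(pc.chk$sdev^2, ev$values * (nrow(X) - 1) / nrow(X),
          check.attributes = FALSE)              # eigenvalues scaled by (n - 1)/n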

Figure 2: Swiss banknotes. Scatter plot of the first two principal components of centered data (black/red: genuine/forged bills).

A discussion of the results is given below.

1. The first two pc's provide a very good approximation of the 6-dimensional data: 66.8% of the total variance is absorbed by the first pc and an additional 20.8% by the second one, corresponding to a cumulative value of 87.6%. Therefore the visualization of the sample on the cartesian plane of the first two pc's is a reliable picture of the original 6-dimensional configuration.

2. The scatter plot of the first two pc's shows some remarkable features. The two classes appear as separate swarms of points, which is important because the information about class composition was NOT explicitly included in the principal component transformation. Moreover, the elongated shape of both clusters (more accentuated for forged bills) suggests within-class negative correlation of the pc scores (recall that, by definition, the pc scores are uncorrelated over the whole sample). Finally, outliers are clearly displayed (e.g., observations no. 5 and 70 from the class of genuine bills).

3. The correlation matrix of the principal components with the observed variables is the main tool for interpreting the transformation. In the present case the first pc is correlated mainly with Bottom and Diagonal, whereas the second pc is mainly correlated with Top. While this gives a clue for the interpretation of the principal axes, it also suggests that the estimation of the principal axes may be distorted by unbalanced variances of the original variables. This can be the case here because these three variables have the highest variances. This is why it is generally recommended to apply PCA after data standardization.

3.2 Worked example: PCA of standardized Swiss notes data

For the sake of completeness, we also study the pc transformation of the standardized Swiss notes data. Note that in this case we look for the stationary points of the function b^T R b = \sum_{i,j=1}^{p} r_{ij} b_i b_j, where R = (r_{ij}) is the correlation matrix of the data, subject to the constraint b^T b = 1. The solution is given by the orthonormal eigenvectors of R and the corresponding eigenvalues.
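The stationary-point claim can be seen numerically: the first eigenvector of R attains the largest eigenvalue as the value of the quadratic form b^T R b, and no other unit vector does better. A minimal sketch on synthetic data (object names are illustrative):

# Sketch: the first eigenvector of R maximizes b' R b over unit vectors b.
set.seed(3)
X <- matrix(rnorm(80 * 5), ncol = 5)             # synthetic data, 80 x 5
R <- cor(X)                                      # correlation matrix
ev <- eigen(R, symmetric = TRUE)
b1 <- ev$vectors[, 1]                            # first (unit-norm) eigenvector
drop(t(b1) %*% R %*% b1)                         # equals the largest eigenvalue
ev$values[1]
B <- matrix(rnorm(5 * 1000), nrow = 5)           # 1000 random directions
B <- sweep(B, 2, sqrt(colSums(B^2)), "/")        # normalized to unit length
max(colSums((R %*% B) * B))                      # never exceeds ev$values[1]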

Figure 3: Swiss banknotes. Scatter plot of the first two principal components of standardized data (black/red: genuine/forged bills).

> pc1 <- princomp(bn[, -1], cor=TRUE)
> summary(pc1)
Importance of components:
                       Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
Standard deviation
Proportion of Variance
Cumulative Proportion
                       Comp.6
Standard deviation
Proportion of Variance
Cumulative Proportion
> plot(pc1)
> # Statistical summaries of the pc scores
> round(colMeans(pc1$scores), 2)
> round(cov(pc1$scores), 2)
Comp.1
Comp.2
Comp.3
Comp.4
Comp.5
Comp.6

> round(cor(pc1$scores), 2)
Comp.1
Comp.2
Comp.3
Comp.4
Comp.5
Comp.6
> round(cor(bn[,-1], pc1$scores[, 1:6]), 2)
Length
Left
Right
Bottom
Top
Diagonal
> plot(pc1$scores[,1:2], pch=20, col=col,
+      xlab="PC1 (49.1%)", ylab="PC2 (21.3%)", main="PCA of Swiss Banknotes Data",
+      sub="Standardized Data")
> abline(h=0, v=0, lty="dotted", col="grey")
> text(pc1$scores[,1:2], labels=c(1:100, 1:100), cex=0.6, pos=3)
> pairs(pc1$scores, pch=20, col=col)

For the standardized data, we observe a drop in the variance explained by the first two pc's: 70.4% against the 87.6% obtained on the centered data. The interpretation is also different. The first pc depends heavily on all observed variables except Length; the correlations are positive except for Diagonal. The second pc depends almost only on Length. Again, class discrimination is good and there is evidence of positive within-class correlation of the pc scores.

3.3 Worked example: PCA of Swiss notes data augmented with perimeter and area of bills

As a final application, we study the effect on the pc transformation of including linear and non-linear transformations of the variables. Here we consider the addition of the perimeter and the area of the notes.

> apply(bn1, 2, sd)
  Length   Left   Right   Bottom   Top   Diagonal   Perimeter   Area
> pc2 <- princomp(bn1, cor=TRUE)
> summary(pc2)
Importance of components:
                       Comp.1 Comp.2 Comp.3 Comp.4 Comp.5
Standard deviation
Proportion of Variance
Cumulative Proportion

                       Comp.6 Comp.7 Comp.8
Standard deviation             e-03   0
Proportion of Variance         e-07   0
Cumulative Proportion          e+00   1
> round(cor(bn1, pc2$scores[, 1:8]), 2)
          Comp.7 Comp.8
Length
Left
Right
Bottom
Top
Diagonal
Perimeter
Area
> plot(pc2$scores[,1:2], pch=20, col=col,
+      xlab="PC1 (52.8%)", ylab="PC2 (24.9%)", main="PCA of Swiss Banknotes Data",
+      sub="Data set augmented with Perimeter and Area")
> abline(h=0, v=0, lty="dotted", col="grey")
> text(pc2$scores[,1:2], labels=c(1:100, 1:100), cex=0.6, pos=3)
> pairs(pc2$scores, pch=20, col=col)

The results show remarkable differences with respect to the previous versions.

1. Here it is necessary to apply the pc transformation to standardized data because the variance of the Area variable is clearly dominant.

2. The last eigenvalue is zero because the rank of the augmented data matrix is 7, not 8, as Perimeter is a linear transformation of a subset of the observed variables.

3. The cumulative variance explained by the first two pc's is 77.7%, an intermediate value between the previous results, and the cumulative variance explained by the first three pc's is 88.6%, a very good value.

4. Let us try to interpret the first three pc's, using the correlations with the observed variables. The first, and most important, pc mainly depends on Left, Right, Perimeter and Area (absolute correlations all higher than 0.8, the highest corresponding to Area), the correlations with the remaining variables being non-negligible but clearly of minor importance. The second pc mainly depends on Length (correlation equal to 0.81), Diagonal (correlation equal to 0.68), Bottom, Top and Perimeter. The third pc can be interpreted as a contrast between Bottom (correlation equal to 0.55) and Top (correlation equal to 0.73).

5. Class separation remains good. Observe that the genuine bills are generally above the line PC2 = PC1, that is, the bisector of the third and fourth quadrants.

4 Beyond principal components

Taking linear combinations of the observed variables is a powerful method to explore the multidimensional space. Principal components are characterized by the maximum variance property, but different solutions can be obtained by changing the function to be optimized. Two other classical examples arise in multiple linear regression and discriminant analysis.
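Both examples are described in detail below. As a preliminary numerical illustration of the regression case, the following sketch (synthetic data, illustrative names, not part of the original notes) checks that the least-squares fitted values have the largest squared correlation with the response among linear combinations of the explanatory variables:

# Sketch: the least-squares fit maximizes the squared correlation with y.
set.seed(5)
X <- matrix(rnorm(100 * 3), ncol = 3)            # synthetic explanatory variables
y <- drop(X %*% c(1, -2, 0.5)) + rnorm(100)      # synthetic response
fit <- lm(y ~ X)
r2.ls <- cor(y, fitted(fit))^2                   # equals summary(fit)$r.squared
B <- matrix(rnorm(3 * 1000), nrow = 3)           # 1000 random coefficient vectors
r2.rand <- apply(X %*% B, 2, function(z) cor(y, z)^2)
c(r2.ls, max(r2.rand))                           # r2.ls is the larger of the two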

Figure 4: Swiss banknotes. Scatter plot of the first two principal components from data augmented with the area and perimeter of the bills (black/red: genuine/forged bills).

In multiple linear regression we are given p explanatory variables X_1, ..., X_p and a dependent variable Y, and we look for the optimal linear predictor of Y based on X_1, ..., X_p. It turns out that the well-known least squares solution is the linear combination of the (centered) X_1, ..., X_p with maximum squared correlation with the (centered) Y. In discriminant analysis we are given a partition of the n objects into G classes, and we look for the linear combination of the observed features X_1, ..., X_p producing the best separation of the classes. A criterion suggested in 1936 by R. A. Fisher is to maximize the ratio of the between-group variance to the within-group variance. Recall that in the scalar case the within-group variance s_W^2 is the weighted mean of the class variances, and the between-group variance s_B^2 is the variance of the class means about the overall mean, weighted by the class sizes. An important result is that the overall variance is identically equal to the sum of the between-group and the within-group components; a numerical check of this decomposition is given after the R session below. The resulting optimally separating linear combinations, called canonical variates, are related to linear discriminant analysis. We illustrate the canonical variate method using the Swiss banknotes data.

> library(MASS)
> ld <- lda(scale(bn[, -1]), grouping=class)
> str(ld)
List of 8
 $ prior  : Named num [1:2]
  ..- attr(*, "names")= chr [1:2] "0" "1"
 $ counts : Named int [1:2]
  ..- attr(*, "names")= chr [1:2] "0" "1"
 $ means  : num [1:2, 1:6]
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:2] "0" "1"
  .. ..$ : chr [1:6] "Length" "Left" "Right" "Bottom" ...
 $ scaling: num [1:6, 1]
  ..- attr(*, "dimnames")=List of 2

  .. ..$ : chr [1:6] "Length" "Left" "Right" "Bottom" ...
  .. ..$ : chr "LD1"
 $ lev    : chr [1:2] "0" "1"
 $ svd    : num 49.1
 $ N      : int 200
 $ call   : language lda(x = scale(bn[, -1]), grouping = class)
 - attr(*, "class")= chr "lda"
> # $scaling: the optimal linear combination(s)
> cv <- scale(bn[, -1]) %*% ld$scaling
> plot(pc1$scores[,1], cv, pch=20, col=col,
+      xlab="PC1", ylab="CV1", main="PCA and CVA of Swiss Banknotes Data",
+      sub="Standardized Data")
> abline(h=0, v=0, lty="dotted", col="grey")
> text(pc1$scores[,1], cv, labels=c(1:100, 1:100), cex=0.6, pos=3)
> round(cor(bn[, -1], cv), 2)
           LD1
Length
Left      0.52
Right     0.61
Bottom    0.80
Top       0.63
Diagonal
> cor(pc1$scores[,1], cv)
     LD1
[1,]

Figure 5: Swiss banknotes. Scatter plot of the first principal component (horizontal axis) and the first canonical variate (vertical axis) obtained from standardized data (black/red: genuine/forged bills).
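Before commenting on the results, the scalar decomposition overall variance = within-group + between-group component, which underlies Fisher's criterion, can be checked with a short sketch; note that the identity holds when all variances are computed with divisor n (synthetic data, illustrative names, not part of the original notes):

# Sketch: overall variance (divisor n) = within-group + between-group component.
set.seed(4)
x <- c(rnorm(60, mean = 0), rnorm(40, mean = 2))     # two synthetic classes
g <- rep(c("A", "B"), c(60, 40))                     # class labels
varn <- function(v) mean((v - mean(v))^2)            # variance with divisor n
w <- tapply(x, g, length) / length(x)                # class relative frequencies
s2w <- sum(w * tapply(x, g, varn))                   # within-group variance s2_W
s2b <- sum(w * (tapply(x, g, mean) - mean(x))^2)     # between-group variance s2_B
all.equal(varn(x), s2w + s2b)                        # TRUE: s2 = s2_W + s2_B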

Some remarks are given below.

1. It is clear that the canonical variate achieves optimal separation, with genuine bills assuming negative scores and forged bills assuming positive scores.

2. The interpretation is again obtained from the correlations with the observed variables. The maximum absolute correlations of the canonical variate are with Diagonal (0.94) and Bottom (0.80).

3. In this case PCA and CVA produce similar results, as shown by the strong linear relation between the first principal component and the canonical variate. But in general it need not be so.
