MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A. 2017-2018


1 MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A. 2017-2018 Pietro Guccione, PhD DEI - DIPARTIMENTO DI INGEGNERIA ELETTRICA E DELL'INFORMAZIONE POLITECNICO DI BARI Pietro Guccione, Assistant Professor in Signal Processing (pietro.guccione@poliba.it)

2 Lecture 10 - Summary Further focus on Dimensionality Reduction: Non-Negative Matrix Factorization; Optimum Constrained Component Rotation; applications; summary.

3 From PCA to NNMF Non-Negative Matrix Factorization is a way to decompose a matrix A into a product of two non-negative matrices, W and H. NNMF overcomes some limitations of PCA: input/output data are all positive; it does not need normalization. Main drawback: the solution is not unique. The inner dimension k can be chosen based on problem considerations: A ≈ W H, with A of size m×n, W of size m×k and H of size k×n. The problem is solved by a least-squares solution with the positivity constraint applied. Two methods: Alternating Least Squares; Multiplicative Update Algorithm.

4 NNMF Algorithms Multiplicative Update Algorithm:
    W = rand(m, k);  H = rand(k, n)
    for i = 1:maxiter
        H = H .* (W'*A) ./ (W'*W*H)
        W = W .* (A*H') ./ (W*H*H')
    endfor
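
A minimal MATLAB/Octave sketch of these update rules (not the lecture's original code; the small constant eps0 added to the denominators is an assumption introduced here to avoid division by zero):

    % Sketch of NNMF by multiplicative updates (Lee-Seung rules); A is m-by-n, non-negative.
    W = rand(m, k);  H = rand(k, n);          % random non-negative initialization
    eps0 = 1e-9;                              % assumption: guard against zero denominators
    for i = 1:maxiter
        H = H .* (W'*A) ./ (W'*W*H + eps0);   % element-wise update keeps H >= 0
        W = W .* (A*H') ./ (W*H*H' + eps0);   % element-wise update keeps W >= 0
    end

The element-wise products and divisions preserve non-negativity at every iteration, which is why no explicit projection step is needed.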

5 NNMF Algorithms Alternating Least Squares Algorithm:
    W = rand(m, k)
    for i = 1:maxiter
        H = (W'*W) \ (W'*A);     H(H<0) = 0
        W = ((H*H') \ (H*A'))';  W(W<0) = 0
    endfor
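
A corresponding MATLAB/Octave sketch of the ALS loop, assuming A, k and maxiter are given (illustrative, not the lecture's original code):

    % Sketch of NNMF by alternating least squares with projection onto the non-negative orthant.
    W = rand(m, k);
    for i = 1:maxiter
        H = (W'*W) \ (W'*A);      % unconstrained LS for H
        H(H < 0) = 0;             % enforce non-negativity
        W = ((H*H') \ (H*A'))';   % unconstrained LS for W
        W(W < 0) = 0;             % enforce non-negativity
    end

Unlike the multiplicative rules, each least-squares step can produce negative entries, so the projections H(H<0)=0 and W(W<0)=0 are applied explicitly.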

6 NNMF Initialization! Critical problems: the choice of the initial matrices (the final solution is sensitive to it); the choice of the number of iterations (on which the precision of the solution depends). The problem can be circumvented by computing a first (possibly negative) solution. Some examples are: - Fill H and/or W with random values with a fixed generator seed (to get a reproducible solution). - Use some decomposition method (PCA or ICA) to get an initial solution (it is not constrained to be positive): A ≈ S L', W = S(:,1:k), H = L(:,1:k)' (use the PCA and retain the first k components).
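
As a hedged illustration of the PCA/SVD-based initialization (the final clipping step is an extra assumption added here, since the slide notes that the PCA solution is not constrained to be positive):

    % PCA/SVD-based initialization sketch for NNMF; A and k assumed given.
    [U, S, V] = svd(A, 'econ');
    W0 = U(:, 1:k) * S(1:k, 1:k);        % "scores" of the first k components
    H0 = V(:, 1:k)';                     % corresponding "loadings"
    W0 = max(W0, 0);  H0 = max(H0, 0);   % optional projection onto non-negative values (assumption)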

7 NNMF Initialization! Critical problems: the choice of the initial matrices (the final solution is sensitive to it); the choice of the number of iterations (on which the precision of the solution depends). Alternative solutions can be found, according to the specific problem constraints:
    A ≈ S L',  W = S(:,1:k)             (first use the PCA)
    H = (W'*W) \ (W'*A),  H(H<0) = 0    (solve for H as in ALS)
    W = ((H*H') \ (H*A'))',  W = f(W)   (solve the LS problem for W, then apply any other constraint, if needed)

8 Example: again XPD /1 The set of X-ray Powder Diffraction patterns (an example already presented in previous lectures), decomposed by using NNMF. NNMF has been modified according to some constraints: the spectra are all positive (but not the time profiles), and one of the components is the square of another. When applied to the set of XPD patterns, PCA can be interpreted as follows: a powder diffraction profile (sample) can be seen as a data point of an N-dimensional space, where N is the number of θ (diffraction angle) values (the variables), while the coordinates of the point in a reference system of this space are the intensity values associated with each θ value.

9 Example: again XPD /2 The set of X-ray Powder Diffraction patterns (an example already presented in previous lectures), decomposed by using NNMF. NNMF has been modified according to some constraints: the spectra are all positive (but not the time profiles), and one of the components is the square of another. [Figure, simulated case: component spectra #1 and #2 (intensity vs. diffraction angle [deg]) and the corresponding component time profiles vs. tests.]

10 Example: again XPD /3 Comparison of the first component vs. the second component, to verify the square relation. [Figure: 2nd component vs. 1st component scores, and the component spectra (intensity vs. theta [deg]), simulated case.] Comparison of the second component with a reference spectrum (pure Cu): ρ = (value reported in the figure).

11 Example: again XPD /4 The set of X-ray Powder Diffraction patterns (an example already presented in previous lectures), decomposed by using NNMF. NNMF has been modified according to some constraints: the spectra are all positive (but not the time profiles), and one of the components is the square of another. [Figure, real case: component spectra #1 and #2 (intensity vs. diffraction angle [deg]) and the corresponding component time profiles vs. tests.]

12 Example: again XPD /5 The square relation between the 1st and 2nd components still holds, even for the real data. [Figure: 2nd component vs. 1st component scores, and the component spectrum (intensity vs. theta [deg]), real case.] Comparison of the second component with a reference spectrum (pure Cu): ρ = (value reported in the figure; the real data have some negative components, in this case).

13 Example: again XPD /6 Different kinds of normalization have been applied to the data using PCA. The table on the slide compares, for a sinusoidal stimulus shape, the methods PCA (no score), PCA (Z-score spectra), PCA (Z-score stimuli) and Modified NNMF in terms of ρ and residual, for C4 (simulated data) and C (real data); the numerical values are reported in the original table. Modified NNMF (a sketch follows this list):
1. Perform Z-score on the data along the stimuli.
2. Perform PCA and save the first components (usually 3 components correspond to 99% of the retained variance).
3. Perform NNMF, initializing W with the components of the previous PCA step.
4. Compute the polynomial fit of such components [coeff = polyfit(W(:,1), W(:,2), 2)].
5. Perform the LS for H [H = pinv(W'*W, tol)*W'*A].
6. Put H(H<0) = 0.
7. Perform the LS for W [W = (pinv(H*H', tol)*H*A')'].
8. Impose the quadratic condition W(:,2) = coeff(1)*W(:,1).^2.
9. Compute the cost function [0.5*norm(A-W*H, 'fro') / norm(A, 'fro')].
10. Repeat steps 5-9 until the minimum of the cost function is reached or a maximum number of iterations is exceeded.
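
The loop above can be sketched in MATLAB as follows; the variable names, the tolerance tol and the stopping test are illustrative assumptions, not the lecture's original code (W0 denotes the PCA-based initialization of step 3):

    % Modified NNMF with a quadratic constraint linking the two scores (sketch).
    % A: data matrix (tests x angles), W0: PCA-based initialization, tol/maxiter: assumptions.
    W = W0;  tol = 1e-10;  best = inf;
    for it = 1:maxiter
        coeff = polyfit(W(:,1), W(:,2), 2);      % quadratic fit of the 2nd score vs. the 1st (step 4)
        H = pinv(W'*W, tol) * (W'*A);            % LS solution for H (step 5)
        H(H < 0) = 0;                            % non-negativity of the spectra (step 6)
        W = (pinv(H*H', tol) * (H*A'))';         % LS solution for W (step 7)
        W(:,2) = coeff(1) * W(:,1).^2;           % impose the quadratic condition (step 8)
        cost = 0.5*norm(A - W*H, 'fro') / norm(A, 'fro');   % cost function (step 9)
        if cost >= best, break;  end             % stop at the minimum of the cost (step 10)
        best = cost;
    end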

14 NNMF/constrained PCA or MCR? Multivariate Curve Resolution (MCR) is a multivariate data-driven analysis algorithm first proposed in the chemometrics research field. MCR decomposes the dataset to recover the pure response profiles (spectra, compound pH profiles, time profiles) of the chemical constituents or species of an unresolved mixture obtained in chemical processes (Lawton & Sylvestre, 1971; Sylvestre et al., 1974), starting from PCA. MCR tries to refine the solution by determining a decomposition into two matrices (the concentration profiles C and the spectra profiles S of the individual components, corresponding to the scores and loadings in PCA, respectively), that are both non-negative. The two matrices are found by solving a constrained minimum mean square problem, starting from the reduced data matrix obtained after applying an initial PCA to the original data:
    X = C S' + E
    {Ĉ, Ŝ}_opt = argmin_{C,S} || X_PCA − Ĉ Ŝ' ||²   subject to Ĉ ≥ 0, Ŝ ≥ 0
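
For illustration, the constrained problem above is commonly solved with an alternating least-squares loop (MCR-ALS); the sketch below is a generic version under the stated non-negativity constraints, not the lecture's own implementation (Xpca is the PCA-reconstructed data and S0 an initial estimate of the spectra):

    % MCR-ALS sketch: Xpca ~ C*S' with C >= 0 and S >= 0 (Xpca and S0 assumed given).
    S = S0;
    for it = 1:maxiter
        C = Xpca * S * pinv(S'*S);     % least squares for the concentration profiles
        C(C < 0) = 0;                  % non-negativity of C
        S = Xpca' * C * pinv(C'*C);    % least squares for the spectra profiles
        S(S < 0) = 0;                  % non-negativity of S
    end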

15 From PCA to a constrained PCA In PCA the data matrix is decomposed into a number of principal components (PCs) that maximize the explained variance in the data on each successive component, under the constraint of being orthogonal to the previous PCs: X = U W', i.e. X(n,m) = Σ_{l=1..N} U(n,l) W(m,l) ≈ Σ_{l=1..k} U(n,l) W(m,l), where the transformation is defined by a set of N-dimensional vectors of loadings W(:,l) (this notation denotes the l-th column vector of W) that map each row vector of X to a new vector of principal components (or scores) U(:,l) (U has the same size as the matrix X, M×N). The loadings are calculated as the eigenvectors of the covariance matrix of the data, X'X; the magnitude of the corresponding eigenvalues represents the variance of the data along the eigenvector directions.
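
As a minimal sketch of this decomposition (assuming a data matrix X with rows as observations and a chosen number of components k), scores and loadings can be computed from the eigendecomposition of the covariance matrix:

    % PCA sketch: X is M-by-N (rows = observations); k components retained.
    Xc = X - mean(X, 1);                    % center each variable (column)
    [V, D] = eig(Xc' * Xc);                 % loadings = eigenvectors of the (scaled) covariance
    [~, order] = sort(diag(D), 'descend');  % sort by explained variance
    W = V(:, order);
    U = Xc * W;                             % scores
    Uk = U(:, 1:k);  Wk = W(:, 1:k);        % truncated decomposition: Xc ~ Uk*Wk'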

16 Optimum Constrained PCA In some problems it may be necessary to impose external constraints on the components, since the «sources» from which the data derive are supposed to be related to each other through a set of equations (the constraints): X = U W', f_1(U, W) = 0, ..., f_K(U, W) = 0. Here the f_i are a set of equations that impose the constraints on the loadings, on the scores, or on both at the same time. Such constraints turn a component-extraction problem (which is linear) into a constrained, nonlinear problem that can be computationally demanding and may be solved, when possible, only by optimization methods.

17 Optimum Constrained PCA The principal component decomposition, which is a linear problem of computational complexity O(M·N² + N³) (a Singular Value Decomposition is needed to decompose the sample covariance matrix of X), becomes a nonlinear problem, according to the general formulation: {U, W} = argmin_{U,W} || X − U W' ||², subject to f(U, W) = 0, that is, a nonlinear constrained optimization problem. Depending on how difficult it is to expand and account for the function(s) f, the problem can be solved using optimization methods such as trust-region-reflective, active-set or interior-point.
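
A hedged sketch of how such a problem could be set up with a general-purpose constrained solver; fmincon (MATLAB Optimization Toolbox) offers the interior-point, active-set and trust-region-reflective algorithms mentioned above, while myConstraint is a hypothetical user function returning the values of f(U, W):

    % Sketch: constrained component extraction via a general nonlinear solver.
    % X: M-by-N data, k: number of components, U0/W0: PCA solution used as initial guess.
    packUW  = @(U, W) [U(:); W(:)];
    unpackU = @(z) reshape(z(1:M*k), M, k);
    unpackW = @(z) reshape(z(M*k+1:end), N, k);
    obj     = @(z) norm(X - unpackU(z)*unpackW(z)', 'fro')^2;        % || X - U*W' ||^2
    nonlcon = @(z) deal([], myConstraint(unpackU(z), unpackW(z)));   % equality constraints f(U,W) = 0 (hypothetical)
    opts    = optimoptions('fmincon', 'Algorithm', 'interior-point');
    z       = fmincon(obj, packUW(U0, W0), [], [], [], [], [], [], nonlcon, opts);
    U = unpackU(z);  W = unpackW(z);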

18 Optimum Constrained Component Rotation Let us apply the previous formulation to the specific problem of X-ray Powder Diffraction spectra. The XPD spectrum can be properly modelled as follows: A(ϑ,t) − b(ϑ,t) = R(ϑ)·g(t) + S(ϑ)·g²(t) + ε(ϑ), where A(ϑ,t) are the data, b(ϑ,t) a possible bias, R(ϑ) represents the diffraction profile as determined by the averaged crystallographic parameters of the active atoms, S(ϑ) the diffraction profile as determined by the interaction between the active and spectator (or silent) sub-lattices; the third term ε(ϑ) has contributions from the part of the structure factors which does not vary with time. The quantity A(ϑ,t) − b(ϑ,t) can be arranged as a matrix X(m,n) of size M×N, where the columns are the variables (the diffraction angles ϑ) and the rows are the diffraction profiles taken at different times.

19 Optimum Constrained Component Rotation The matrix can be seen as a set of data points in an N-dimensional space, where N is the number of ϑ values (variables), while the coordinates of each point in a reference system of this space are the intensity values associated with each ϑ value. PCA can reduce the dimensionality of this representation by using a reference system with only k orthogonal axes that represent the directions of maximum variability of the data. The coordinates of the data points in this new reference system are the scores, while the loadings are the coefficients which define the new directions with respect to the original reference system.

20 Optimum Constrained Component Rotation The decomposition of A(ϑ,t) − b(ϑ,t) is: A(ϑ,t) − b(ϑ,t) = R(ϑ)·g₁(t) + S(ϑ)·g₂(t) + ε(ϑ), with the constraints: R(ϑ), S(ϑ), ε(ϑ) ≥ 0 [spectra (components) all positive] and g₂(t) = g₁(t)² [the second score (dependence on time) is expected to be the square of the first one].

21 Optimum Constrained Component Rotation Main rationale: the scores are no longer constrained to be orthogonal to each other (they may be partially correlated), so as to allow the constraints to be applied. Since orthogonality is not required by the previous problem, we allow the score axes to change their direction by exploring the k-dimensional space (already reduced to the principal components), driven by a properly defined cost function. The idea is that we are able to find the optimal rotated axes of a low-dimensional space (where the data still have a meaningful representation) by minimizing an objective function, provided that the conditions (after X = U W'): U(:,2) = γ·U(:,1).² and W(:,2) ≥ 0 are satisfied. The axes are no longer orthogonal.

22 Optimum Constrained Component Rotation A hypothetical powder diffraction profile (P) constituted by N = 3 intensity values (I1, I2, I3) for the respective θ values (θ1, θ2, θ3). When projected in the space of the principal component directions PC1 and PC2, it can be described by only two values: Score 1 and Score 2.

23 Optimum Constrained Component Rotation Problem formalization for the case k=2: X = U W' ≈ X(k) = U(k) W(k)' = U(k) Φ Φ⁻¹ W(k)' (X is M×N). New scores: Û(k) = U(k) Φ, with
    Φ = [ cos φ   sin ψ
          sin φ   cos ψ ]
where φ and ψ are two independent parameters defining the change in direction of the axes:
    û(m,1) = u(m,1) cos φ + u(m,2) sin φ
    û(m,2) = u(m,1) sin ψ + u(m,2) cos ψ,   m = 1, ..., M.

24 Optimum Constrained Component Rotation New components: Ŵ(k)' = Φ⁻¹ W(k)', which for k = 2 reads
    ŵ(n,1) = [ w(n,1) cos ψ − w(n,2) sin ψ ] / cos(φ+ψ)
    ŵ(n,2) = [ −w(n,1) sin φ + w(n,2) cos φ ] / cos(φ+ψ),   n = 1, ..., N.
Matrix Φ: the energies associated with the first and second scores (i.e. the variance of the data explained by them) do not change under this transformation (the columns of Φ have norm 1); the changes in the direction of the two scores are independent (φ ≠ ψ in general).

25 Optimum Constrained Component Rotation The figures of merit are the transformation of the constraints into equations. The objective is to identify the scores that give the maximum values of the FOM. 1. Pearson correlation coefficient between the second (rotated) score and the square of the first (rotated) score:
    FOM_scores = | Σ_{m=1..M} (û(m,1)² − mean(û(:,1)²)) · (û(m,2) − mean(û(:,2))) | / sqrt( Σ_{m=1..M} (û(m,1)² − mean(û(:,1)²))² · Σ_{m=1..M} (û(m,2) − mean(û(:,2)))² )
This figure of merit requires that the mean square of the residual ε in U(:,2) = γ·U(:,1).² + ε be minimum, regardless of the proportionality term γ. The absolute value at the numerator accounts for the sign ambiguity of the PCA scores.

26 Optimum Constrained Component Rotation The figures of merit are the transformation of the constraints into equations. The objective is to identify the loadings that give the maximum values of the FOM. 2. The normalized difference between the positive and the negative part of the area underlying the second loading:
    FOM_loadings = ( Σ_{n=1..N} ŵ₊(n,2)/σ_w − Σ_{n=1..N} |ŵ₋(n,2)|/σ_w ) / ( Σ_{n=1..N} ŵ₊(n,2)/σ_w + Σ_{n=1..N} |ŵ₋(n,2)|/σ_w )
where ŵ₊(n,2) and ŵ₋(n,2) are the positive and negative parts of the intensity of the rotated second loading at the angle ϑ_n, and σ_w is the standard deviation of ŵ(:,2). This cost function measures the positive-negative asymmetry of the second (rotated) loading; its definition is dictated by the fact that the overall sign of the PCA loadings is arbitrary.

27 Optimum Constrained Component Rotation Both figures of merit have 1 as the highest and best value. The idea is to find the optimal combination of (φ, ψ) that maximizes FOM_scores subject to the constraint on FOM_loadings (both expressions as defined on the previous slides). [Figure: possible optimization search path of the algorithm in the (φ, ψ) plane.]
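
A hedged sketch of the search over (φ, ψ): a plain grid search that evaluates both figures of merit for each candidate rotation, assuming the first two PCA scores U and loadings W are available; the grid spacing and the threshold on FOM_loadings are illustrative assumptions (σ_w cancels in the ratio and is omitted):

    % Grid search over the rotation angles (phi, psi); U, W hold the first two PCA scores/loadings.
    angles = -pi/4 : 0.01 : pi/4;                             % assumption: search range and step
    best = -inf;  phi_opt = 0;  psi_opt = 0;
    for phi = angles
        for psi = angles
            Uh = [U(:,1)*cos(phi) + U(:,2)*sin(phi), ...
                  U(:,1)*sin(psi) + U(:,2)*cos(psi)];         % rotated scores
            Phi = [cos(phi) sin(psi); sin(phi) cos(psi)];
            Wh  = W(:,1:2) / Phi';                            % rotated loadings: Wh' = inv(Phi)*W'
            c   = corrcoef(Uh(:,1).^2, Uh(:,2));
            fom_scores = abs(c(1,2));                         % correlation of 2nd score with 1st score squared
            pos = sum(Wh(Wh(:,2) > 0, 2));
            neg = sum(abs(Wh(Wh(:,2) < 0, 2)));
            fom_loadings = (pos - neg) / (pos + neg);         % positive-negative asymmetry of 2nd loading
            if fom_loadings > 0.99 && fom_scores > best       % 0.99: illustrative threshold
                best = fom_scores;  phi_opt = phi;  psi_opt = psi;
            end
        end
    end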

28 Optimum Constrained Component Rotation Problem formalization for the generic case k>2: X = U W' ≈ X(k) = U(k) W(k)', and the rotated decomposition is X ≈ (U(k) Φ)(Φ⁺ W(k)'), where Φ⁺ is the Moore-Penrose generalized inverse matrix and Φ now has size [k×2]. The number of degrees of freedom in Φ is 2(k−1), since, to preserve the energy of the scores, each of its columns must have unit norm.

29 Optimum Constrained Component Rotation {φ_i, ψ_i}_opt = argmax_{φ_i, ψ_i} FOM(U, W, φ_i, ψ_i). New scores: Û(k) = U(k) Φ, i.e.
    û(m,1) = Σ_{i=1..k} u(m,i)·φ_i
    û(m,2) = Σ_{i=1..k} u(m,i)·ψ_i,   m = 1, ..., M.
New components computed according to the relation: Ŵ(k)' = Φ⁺ W(k)', subject to Σ_{i=1..k} φ_i² = 1 and Σ_{i=1..k} ψ_i² = 1.
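
For illustration only (not the lecture's code), once a candidate pair of unit-norm coefficient vectors phi and psi has been chosen (e.g. by a generic optimizer maximizing the figures of merit), the generic-k rotation can be applied as follows:

    % Generic-k rotation of the first k PCA scores/loadings (sketch).
    % Uk: M-by-k scores, Wk: N-by-k loadings, phi/psi: k-by-1 coefficient vectors.
    phi = phi / norm(phi);  psi = psi / norm(psi);   % enforce the unit-norm constraints
    Phi = [phi, psi];                                % k-by-2 mixing matrix
    Uh  = Uk * Phi;                                  % new scores (M-by-2)
    Wh  = (pinv(Phi) * Wk')';                        % new loadings: Wh' = pinv(Phi) * Wk'
    Xk  = Uh * Wh';                                  % rank-2 reconstruction of the reduced data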

30 Summary NNMF is a data-driven matrix decomposition that applies non-negativity as a constraint. It is useful in problems where negative solutions are meaningless. Multivariate Curve Resolution is a decomposition method similar to NNMF, but it starts from a PCA as the initial solution. OCCR can be seen as a possible generalization of PCA to problems where the sources (loadings) or the components (scores) are subject to some conditions; the problem may not always be solvable. OCCR differs from MCR since: in OCCR the solution is obtained without solving a least-squares problem (a general optimization method is used instead); OCCR does not impose the constraint that both matrices are positive, as MCR does. NNMF (and, to a limited extent, also MCR, which is basically a modified version of NNMF) has some limitations: an NNMF solution is very sensitive to the initial conditions; in some cases, we do not need both matrices of the decomposition to be positive.
