MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A. 2017-2018
1 MultiDimensional Signal Processing Master Degree in Ingegneria delle Telecomunicazioni A.A. 2017-2018 Pietro Guccione, PhD DEI - DIPARTIMENTO DI INGEGNERIA ELETTRICA E DELL'INFORMAZIONE POLITECNICO DI BARI Pietro Guccione, Assistant Professor in Signal Processing (pietro.guccione@poliba.it)
2 Lecture 10 - Summary Further focus on Dimensionality Reduction: Non-Negative Matrix Factorization; Optimum Constrained Component Rotation; Applications; Summary
3 From PCA to NNMF Non-Negative Matrix Factorization is a way to decompose a matrix A into a product of two non-negative matrices, W and H: A = W H. NNMF overcomes some limitations of PCA: input/output data are all positive; it does not need normalization. Main drawback: non-unique solution. Inner dimension k: can be chosen based on problem considerations. The problem is solved by using a least squares solution with the positivity constraint applied. Two methods: Alternating Least Squares; Multiplicative Update Algorithm.
4 NNMF Algorithms Multiplicative Update Algorithm:
W = rand(m, k);  H = rand(k, n)
for i = 1:maxiter
    H <- H .* (W'A) ./ (W'W H)
    W <- W .* (A H') ./ (W H H')
endfor
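As an illustration, a minimal MATLAB sketch of the multiplicative update rules above (the function name, the small constant in the denominators and the fixed iteration count are assumptions, not part of the lecture):

    function [W, H] = nnmf_mu(A, k, maxiter)
        % Multiplicative update NNMF: A (m x n) ~ W (m x k) * H (k x n), all >= 0
        [m, n] = size(A);
        W = rand(m, k);                              % random non-negative start
        H = rand(k, n);
        d = 1e-9;                                    % guards against division by zero
        for i = 1:maxiter
            H = H .* (W' * A) ./ (W' * W * H + d);   % update H with W fixed
            W = W .* (A * H') ./ (W * H * H' + d);   % update W with H fixed
        end
    end

Because the updates are multiplicative, W and H stay non-negative as long as they are initialized with non-negative values.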
5 NNMF Algorithms Alternating Least Squares Algorithm:
W = rand(m, k)
for i = 1:maxiter
    H = (W'W)^(-1) W'A;   H(H < 0) = 0
    W = A H' (H H')^(-1);   W(W < 0) = 0
endfor
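A corresponding MATLAB sketch of the alternating least squares variant; the projection to zero after each unconstrained solve follows the slide, while the function name and loop control are assumptions:

    function [W, H] = nnmf_als(A, k, maxiter)
        % Alternating least squares NNMF with clipping to non-negative values
        m = size(A, 1);
        W = rand(m, k);                  % only W needs to be initialized
        for i = 1:maxiter
            H = (W' * W) \ (W' * A);     % unconstrained LS solution for H
            H(H < 0) = 0;                % enforce H >= 0
            W = (A * H') / (H * H');     % unconstrained LS solution for W
            W(W < 0) = 0;                % enforce W >= 0
        end
    end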
6 NNMF Initialization! Critical problems: the choice of the initial matrices (the final solution is sensitive to it); the choice of the number of iterations (on which the precision of the solution depends). The problem can be circumvented by computing a first (possibly negative) solution. Some examples are: fill H and/or W with random values with a fixed generator seed (to get a stable solution); use some decomposition method (PCA or ICA) to get an initial solution (it is not constrained to be positive): A ≈ S L', W = S(:,1:k), H = L(:,1:k)'. Use the PCA and retain the first k components.
7 NNMF Initialization! Critical problems: the choice of the initial matrices (the final solution is sensitive to it); the choice of the number of iterations (on which the precision of the solution depends). Alternative solutions can be found, according to detailed problem constraints, as sketched below: first use the PCA, A ≈ S L', W = S(:,1:k); solve for H as in ALS, H = (W'W)^(-1) W'A, H(H < 0) = 0; solve the LS problem for W, W = A H' (H H')^(-1), then apply any other constraint, if needed: W = f(W).
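A short MATLAB sketch of this PCA-based initialization followed by one constrained LS refinement; the use of the economy-size SVD for the PCA step and the names A and k (data matrix and number of retained components) are assumptions:

    [Us, Sv, Vs] = svd(A, 'econ');    % PCA via SVD: A = (Us*Sv)*Vs', i.e. A = S*L'
    W = Us(:, 1:k) * Sv(1:k, 1:k);    % first k (possibly negative) components
    H = pinv(W' * W) * (W' * A);      % solve for H as in ALS
    H(H < 0) = 0;                     % positivity constraint on H
    W = A * H' * pinv(H * H');        % LS solution for W
    % any further constraint W = f(W) can be applied here before iterating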
8 Example: again XPD /1 The set of X-ray Powder Diffraction patterns (an example already presented in previous lectures), decomposed by using NNMF. NNMF has been modified according to some constraints: the spectra are all positive (but not the time profiles), one of the components is the square of another. When applied to the set of XPD patterns, PCA can be interpreted as follows: a powder diffraction profile (sample) can be seen as a data point of an N-dimensional space, where N is the number of θ (diffraction angle) values (the variables), while the coordinates of the point in a reference system of this space are the values of intensity associated with each θ value.
9 Example: again XPD /2 The set of X-ray Powder Diffraction patterns (an example already presented in previous lectures), decomposed by using NNMF. NNMF has been modified according to some constraints: the spectra are all positive (but not the time profiles), one of the components is the square of another. [Figure, simulated case (SIM CASE): component spectra #1 and #2 vs. diffraction angle [deg], and component time profiles vs. test number]
10 Example: again XPD /3 Comparison of the first component vs. the second component, to verify the square relation. [Figure: 1st vs. 2nd components (scores), and second-component intensity vs. theta [deg]] Comparison of the second component with a reference spectrum (pure Cu): ρ=
11 Example: again XPD /4 The set of X-ray Powder Diffraction patterns (an example already presented in previous lectures), decomposed by using NNMF. NNMF has been modified according to some constraints: the spectra are all positive (but not the time profiles), one of the components is the square of another. [Figure, real case (REAL CASE): component spectra #1 and #2 vs. diffraction angle [deg], and component time profiles vs. test number]
12 Example: again XPD /5 The square relation between the 1st and 2nd components still holds, even for the real data. [Figure: 1st vs. 2nd components (scores), and second-component intensity vs. theta [deg]] Comparison of the second component with a reference spectrum (pure Cu): ρ= (real data has some negative components, in this case)
13 Example: again XPD /6 Different kinds of normalization have been applied to the data using PCA. [Table: ρ and residual values for C4 (simulated data) and C2 (real data), sinusoidal stimulus, comparing PCA with no score normalization, PCA with Z-score on spectra, PCA with Z-score on stimuli, and the modified NNMF] Modified NNMF: 1. Perform Z-score on data along stimuli. 2. Make PCA and save the first components (usually 3 components correspond to 99% of the saved variance). 3. Make NNMF; initialize W with the components of the previous PCA step. 4. Compute the polynomial fit of such components [coeff = polyfit(W(:,1), W(:,2), 2)]. 5. Perform the LS for H [H = pinv(W'*W, tol)*W'*A]. 6. Put H(H<0) = 0. 7. Perform the LS for W [W = (pinv(H*H', tol)*H*A')']. 8. Impose the quadratic condition W(:,2) = coeff(1)*W(:,1).^2. 9. Compute the cost function [0.5*norm(A-W*H, 'fro') / norm(A, 'fro')]. 10. Repeat the loop over steps 5-9 until the minimum of the cost function is reached or a maximum number of iterations is exceeded.
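The ten steps above can be assembled into a single MATLAB loop; the following sketch assumes the data matrix A, k = 3 retained components, a column-wise z-score, and an illustrative convergence threshold:

    Az = (A - mean(A)) ./ std(A);              % 1. z-score (direction along stimuli assumed column-wise)
    [U, Sv, V] = svd(Az, 'econ');  k = 3;      % 2. PCA, retain the first components
    W = U(:, 1:k) * Sv(1:k, 1:k);              % 3. initialize W with the PCA components
    coeff = polyfit(W(:,1), W(:,2), 2);        % 4. quadratic fit between components
    tol = 1e-10;  costOld = inf;
    for it = 1:200                             % 10. loop over steps 5-9
        H = pinv(W' * W, tol) * (W' * Az);     % 5. LS for H
        H(H < 0) = 0;                          % 6. positivity on H
        W = (pinv(H * H', tol) * (H * Az'))';  % 7. LS for W
        W(:,2) = coeff(1) * W(:,1).^2;         % 8. impose the quadratic condition
        cost = 0.5 * norm(Az - W*H, 'fro') / norm(Az, 'fro');   % 9. cost function
        if abs(costOld - cost) < 1e-8, break, end
        costOld = cost;
    end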
14 NNMF/constrained PCA or MCR? Multivariate Curve Resolution (MCR) is a multivariate data-driven analysis algorithm first proposed in the chemometrics research field. MCR decomposes the dataset to recover the pure response profiles (spectra, compound pH profiles, time profiles) of the chemical constituents or species of an unresolved mixture obtained in chemical processes (Lawton & Sylvestre, 1971; Sylvestre et al., 1974), starting from PCA. MCR tries to refine the solution by determining a decomposition into two matrices (the concentration profiles C and the spectra profiles S of the individual components, corresponding to the scores and loadings in PCA, respectively), which are both non-negative. The two matrices are found by solving a constrained minimum mean square problem, starting from the reduced data matrix obtained after applying an initial PCA to the original data:
X = C S' + E
{C_opt, S_opt} = argmin over (C^, S^) of || X_PCA − C^ S^' ||^2,  s.t. C ≥ 0, S ≥ 0
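A compact MATLAB sketch of an MCR-ALS style refinement on the PCA-reduced data; U, W and k are assumed to hold the PCA scores, loadings and number of retained components, and the clipping-based non-negativity and fixed iteration count are assumptions:

    Xpca = U(:,1:k) * W(:,1:k)';          % data reproduced by the first k PCs
    C = max(U(:,1:k), 0);                 % initial concentration profiles
    for it = 1:100
        S = max((C \ Xpca)', 0);          % LS solve for the spectra, then clip to >= 0
        C = max(Xpca / S', 0);            % LS solve for the concentrations, clip to >= 0
    end
    E = Xpca - C * S';                    % residual term of X = C*S' + E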
15 From PCA to a constrained PCA In PCA the data matrix is decomposed into a number of principal components (PCs) that maximize the explained variance in the data on each successive component, under the constraint of being orthogonal to the previous PCs:
X = U W',   X(n,m) = Σ_{l=1..N} U(n,l) W(m,l) ≈ Σ_{l=1..k} U(n,l) W(m,l)
where the transformation is defined by a set of N-dimensional vectors of loadings W(:,n) (this notation addresses the n-th column vector of W) that map each row vector of X to a new vector of principal components (or scores) U(:,n) (U has size MxN, as the matrix X). The loadings are calculated as the eigenvectors of the covariance matrix of the data, X'X; the magnitude of the corresponding eigenvalues represents the variance of the data along the eigenvector directions.
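For reference, a minimal MATLAB sketch of the computation described above; the centering step and the name k for the number of retained components are assumptions added for the illustration:

    Xc = X - mean(X);                          % center the data (rows = observations)
    C  = (Xc' * Xc) / (size(X,1) - 1);         % sample covariance matrix
    [W, D] = eig(C);                           % loadings = eigenvectors of the covariance
    [lambda, idx] = sort(diag(D), 'descend');  % eigenvalues = variance along each direction
    W = W(:, idx);                             % order loadings by explained variance
    U = Xc * W;                                % scores (principal components)
    Xk = U(:,1:k) * W(:,1:k)';                 % rank-k approximation of the centered data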
16 Optimum Constrained PCA In some problems, the need may occur to impose some external constraints on the components, since it is supposed that the «sources» from which the data derive may be related to each other through a set of equations (the constraints):
X = U W',   with   f_1(U, W) = 0, …, f_K(U, W) = 0
Here the f are a set of equations that impose the constraints on the loadings, on the scores, or on both of them at the same time. Such constraints transform a problem of component extraction (which is a linear problem) into a constrained problem, which can be computationally intensive, since it is nonlinear, and may be solved, when possible, only by using optimization methods.
17 Optimum Constrained PCA The problem of principal component decomposition, which is a linear problem of computational complexity O(NM^2 + N^3) (the Singular Value Decomposition is needed to decompose the sample covariance matrix of X), becomes a nonlinear problem, according to the general formulation:
{U, W} = argmin_{U,W} || X − U W' ||^2,   s.t. f(U, W) = 0
that is, a nonlinear constrained optimization problem. According to the difficulty of expanding and accounting for the function(s) f, the problem can be solved using optimization methods such as trust-region-reflective, active-set or interior-point.
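In MATLAB the interior-point, active-set and trust-region-reflective algorithms mentioned above are available through fmincon (Optimization Toolbox). The following hedged sketch shows one possible way to pose the problem; the packing of U and W into a single parameter vector and the constraint function myConstraints are hypothetical, not part of the lecture:

    [M, N] = size(X);  k = 2;
    unpack = @(p) deal(reshape(p(1:M*k), M, k), reshape(p(M*k+1:end), N, k));
    obj = @(p) norm(X - reshape(p(1:M*k), M, k) * reshape(p(M*k+1:end), N, k)', 'fro')^2;
    nonlcon = @(p) deal([], myConstraints(p));     % ceq = f(U,W) = 0 (hypothetical helper)
    p0 = randn((M + N) * k, 1);                    % or a PCA-based starting point
    opts = optimoptions('fmincon', 'Algorithm', 'interior-point');
    p = fmincon(obj, p0, [], [], [], [], [], [], nonlcon, opts);
    [U, W] = unpack(p);                            % constrained scores and loadings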
18 Optimum Constrained Component Rotation Let us apply the previous formulation to the specific problem of X-ray Powder Diffraction spectra. The XPD spectrum can be properly modelled as follows:
A(θ, t) = b(θ, t) + R(θ) g(t) + S(θ) g^2(t) + ζ(θ)
where A(θ,t) are the data, b(θ,t) a possible bias, R(θ) represents the diffraction profile as determined by the averaged crystallographic parameters of the active atoms; S(θ) the diffraction profile as determined by the interaction between the active and spectator (or silent) sub-lattices; the third term, ζ(θ), has a contribution from the part of the structure factors which does not vary with time. The quantity A(θ, t) − b(θ, t) can be arranged as a matrix X(m,n) of size MxN, where the columns are the variables (the diffraction angles θ) and the rows are the diffraction profiles taken at different times.
19 Optimum Constrained Component Rotation The matrix can be seen as a data point set of an N-dimensional space, where N is the number of θ values (variables), while the coordinates of the points in a reference system of this space are the values of intensity associated with each θ value. PCA can reduce the dimensionality of this representation, by using a reference system with only k orthogonal axes that represent the directions of maximum variability of the data. The coordinates of the data point in this new reference system are the scores, while the loadings are the coefficients which define the new directions with respect to the original reference system.
20 Optimum Constrained Component Rotation The decomposition of A(θ,t) − b(θ,t) is:
A(θ, t) − b(θ, t) = R(θ) g_1(t) + S(θ) g_2(t) + ζ(θ)
With constraints: R(θ), S(θ), ζ(θ) ≥ 0 [spectra (components) all positive] and g_2(t) = g_1^2(t) [second score (dependence on time) is expected to be the square of the first one].
21 Optimum Constrained Component Rotation Main rationale: the scores are no longer constrained to be orthogonal to each other (they may be partially correlated), so as to allow the constraints to be applied. Since orthogonality is not required by the previous problem, we allow the score axes to change their direction, by exploring the k-dimensional space (already reduced to the principal components) driven by a properly defined cost function. The idea is that we are able to detect the optimal rotated axes of a low-dimensional space (where the data still have a meaningful representation) by minimizing an objective function, provided that the conditions (after X = U W'):
U(:,2) ≈ γ U(:,1).^2   and   W(:,2) ≥ 0
are satisfied. The axes are no longer orthogonal.
22 Optimum Constrained Component Rotation A hypothetical powder diffraction profile (P) constituted by N = 3 intensity values (I_1, I_2, I_3) for respective θ values (θ_1, θ_2, θ_3). When projected onto the space of the principal component directions PC1 and PC2, it can be described by only two values: Score 1 and Score 2.
23 Optimum Constrained Component Rotation Problem formalization for the case k = 2 (X has size MxN):
X = U W' ≈ X^(2) = U^(2) W^(2)' = U^(2) Θ Θ^(-1) W^(2)'
New scores: Û^(2) = U^(2) Θ, with
Θ = [ cos φ   −sin ψ
      sin φ    cos ψ ]
where φ and ψ are two independent parameters defining the change in direction of the axes:
û(m,1) = u(m,1) cos φ + u(m,2) sin φ
û(m,2) = −u(m,1) sin ψ + u(m,2) cos ψ,   m = 1, …, M
24 Optimum Constrained Component Rotation New components: Ŵ^(2)' = Θ^(-1) W^(2)', that is
ŵ(n,1) = [ w(n,1) cos ψ + w(n,2) sin ψ ] / cos(φ − ψ)
ŵ(n,2) = [ −w(n,1) sin φ + w(n,2) cos φ ] / cos(φ − ψ),   n = 1, …, N
Matrix Θ: the energies associated with the first and second scores (i.e. the variance of the data explained by them) do not change in such a transformation (the columns of Θ have norm 1); the changes in the direction of the two scores are independent (φ ≠ ψ).
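A small MATLAB sketch of the k = 2 rotation for a given pair of angles (phi, psi), following the relations above; U and W are assumed to hold the first two PCA scores and loadings:

    Theta = [cos(phi), -sin(psi);
             sin(phi),  cos(psi)];       % both columns have unit norm
    Uhat  = U(:,1:2) * Theta;            % rotated (no longer orthogonal) scores
    What  = (Theta \ W(:,1:2)')';        % rotated loadings: inv(Theta) * W'
    Xk    = Uhat * What';                % the rank-2 reconstruction is unchanged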
25 Optimum Constrained Component Rotation The figures of merit are the transformation of the constraints into equations. The objective is to identify the scores that give the maximum value of the FOM. 1. Pearson correlation coefficient between the second (rotated) score and the square of the first (rotated) score:
FOM_scores = | Σ_{m=1..M} (û(m,1)^2 − m_1)(û(m,2) − m_2) | / sqrt( Σ_{m=1..M} (û(m,1)^2 − m_1)^2 · Σ_{m=1..M} (û(m,2) − m_2)^2 )
where m_1 and m_2 are the sample means of û(:,1).^2 and û(:,2). This figure of merit requires that the mean square of the residual ε in U(:,2) = γ U(:,1).^2 + ε is minimum, regardless of the proportionality term γ. The absolute value at the numerator accounts for the sign ambiguity of the PCA scores.
26 Optimum Constrained Component Rotation The figures of merit are the transformation of the constraints into equations. The objective is to identify the loadings that give the maximum value of the FOM. 2. The normalized difference between the positive and the negative part of the area underlying the second loading:
FOM_loadings = [ Σ_{n: ŵ(n,2) > σ_w} ŵ(n,2) + Σ_{n: ŵ(n,2) < −σ_w} ŵ(n,2) ] / [ Σ_{n: ŵ(n,2) > σ_w} ŵ(n,2) − Σ_{n: ŵ(n,2) < −σ_w} ŵ(n,2) ]
where ŵ(n,2) is the intensity of the rotated second loading at the angle θ_n, and σ_w is the standard deviation of ŵ(:,2). This cost function measures the positive-negative asymmetry of the second (rotated) loading; its definition is dictated by the fact that the overall sign of the PCA loadings is arbitrary.
27 Optimum Constrained Component Rotation Both figures of merit have 1 as the highest and best value. The idea is to find the optimal combination of (φ, ψ) that maximizes FOM_scores (as defined above) subject to FOM_loadings (as defined above) also being maximized. [Figure: possible optimization search path of the algorithm]
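One simple way to search for the optimal pair (phi, psi) is a grid evaluation of both figures of merit. In this MATLAB sketch the grid resolution and the product used to combine the two FOMs are assumptions, and U and W again hold the first two PCA scores and loadings:

    best = -inf;
    for phi = linspace(-pi/2, pi/2, 181)
      for psi = linspace(-pi/2, pi/2, 181)
        Theta = [cos(phi), -sin(psi); sin(phi), cos(psi)];
        Uhat = U(:,1:2) * Theta;  What = (Theta \ W(:,1:2)')';
        r = corrcoef(Uhat(:,1).^2, Uhat(:,2));           % FOM on the scores
        fomS = abs(r(1,2));
        w2 = What(:,2);  sw = std(w2);                   % FOM on the loadings
        pos = sum(w2(w2 > sw));  neg = sum(w2(w2 < -sw));
        fomL = (pos + neg) / (pos - neg + eps);
        if fomS * fomL > best                            % assumed combination of the FOMs
            best = fomS * fomL;  phiOpt = phi;  psiOpt = psi;
        end
      end
    end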
28 Optimum Constrained Component Rotation Problem formalization for the generic case k > 2:
X = U W' ≈ X^(k) = U^(k) W^(k)' = U^(k) Θ Θ^+ W^(k)'
where Θ^+ is the Moore-Penrose generalized inverse of Θ and X has size MxN. Θ now has size [k x 2], with columns α = (α_1, …, α_k)' and β = (β_1, …, β_k)'. The degrees of freedom in Θ are now 2(k−1), since, to preserve the energy of the scores, we impose the unit-norm conditions given in the next slide.
29 Optimum Constrained Component Rotation
{α̂, β̂} = argmax_{α, β} FOM(U, W, α, β)
New scores (Û^(k) = U^(k) Θ):
û(m,1) = Σ_{i=1..k} α_i u(m,i)
û(m,2) = Σ_{i=1..k} β_i u(m,i),   m = 1, …, M
New components computed according to the relation: Ŵ^(k)' = Θ^+ W^(k)', subject to Σ_{i=1..k} α_i^2 = 1 and Σ_{i=1..k} β_i^2 = 1.
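For k > 2 the search runs over the two unit-norm combination vectors α and β. A derivative-free MATLAB sketch follows, where the helper fomBoth (evaluating the combined figures of merit) and the normalization-based parameterization are hypothetical, and U (M x k scores) and W (N x k loadings) are assumed given:

    k = size(U, 2);
    obj = @(p) -fomBoth(U, W, p(1:k)/norm(p(1:k)), p(k+1:end)/norm(p(k+1:end)));
    p0  = [1; zeros(k-1,1); 0; 1; zeros(k-2,1)];   % start from the first two PCA axes
    p   = fminsearch(obj, p0);                     % maximize the combined FOM
    alpha = p(1:k)/norm(p(1:k));  beta = p(k+1:end)/norm(p(k+1:end));
    Uhat = [U*alpha, U*beta];                      % the two new scores
    What = (pinv([alpha, beta]) * W')';            % new loadings via the pseudo-inverse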
30 Summary NNMF is a data-driven matrix decomposition that applies non-negativity as a constraint. It is useful in problems where negative solutions are meaningless. Multivariate Curve Resolution is a decomposition method similar to NNMF, but it starts from a PCA as the initial solution. OCCR can be seen as a possible generalization of PCA to problems where the sources (loadings) or the components (scores) are subjected to some conditions. The problem may not always be solvable. OCCR differs from MCR since: in OCCR the solution is obtained without solving a least squares problem (a general optimization method is used instead); OCCR does not impose the constraint that both matrices are positive, as MCR does. NNMF (and, to a limited extent, also MCR, which basically is a modified version of NNMF) has some limitations: an NNMF solution is very sensitive to initial conditions; in some cases, we do not need both matrices of the decomposition to be positive.