矩阵论 (Matrix Theory), 马锦华 (Ma Jinhua), School of Data and Computer Science (数据科学与计算机学院), Sun Yat-sen University (中山大学)


1 矩阵论 (Matrix Theory), 马锦华 (Ma Jinhua), School of Data and Computer Science, Sun Yat-sen University

2 About The Instructor
Education
- Bachelor's and master's degrees in mathematics, SYSU
- PhD in computer science, HKBU
Research experience
- Postdoc at Rutgers
- Postdoc at Johns Hopkins
- Postdoc at HKBU
Research interests
- Machine learning: feature fusion, transfer learning, etc.
- Computer vision: intelligent video surveillance, etc.
- Medical data analysis: diagnosis and prediction models, etc.

3 About This Course
Instructor's contact
- Office: Room 529C, 5th floor, Supercomputing Center (超算中心)
- Email: majh8@mail.sysu.edu.cn
Teaching assistant
- Sun Shuting (孙淑婷), sunsht3@mail2.sysu.edu.cn
Lecture hours and venue
- Monday, periods 3-4, Room C104
- Thursday, periods 5-6, Room C304
Prerequisites
- Linear Algebra, Advanced Mathematics

4 About This Course
Textbook
- 矩阵论简明教程 (A Concise Course in Matrix Theory), Xu Zhong, Zhang Kaiyuan, et al., 3rd edition, Science Press (科学出版社)
Recommended readings
- Roger A. Horn and Charles R. Johnson, Matrix Analysis (Second Edition), Cambridge University Press, 2013
- Gene H. Golub and Charles F. Van Loan, Matrix Computations (Fourth Edition), Johns Hopkins University Press, 2013
Assessment
- 6 assignments (60%), final exam (40%), attendance

5 About This Course
Course contents
- Similarity transformations of matrices (矩阵的相似变换)
- Norm theory (范数理论)
- Matrix analysis (矩阵分析)
- Matrix decompositions (矩阵分解)
- Estimation and representation of eigenvalues (特征值的估计与表示)
- Generalized inverse matrices (广义逆矩阵)
- Linear spaces and linear transformations (线性空间与线性变换)
Applications
- Machine learning, computer vision, systems and control, ...
Slides
- Available on the course website

6 A Glimpse of Topics
Source of the overview slides below: W.-K. Ma, ENGG5781 Matrix Analysis and Computations, CUHK.

7 Eigenvalue Problem
Problem: given $A \in \mathbb{R}^{n \times n}$, find a nonzero $v \in \mathbb{R}^n$ such that $Av = \lambda v$ for some scalar $\lambda$.
Eigendecomposition: let $A \in \mathbb{R}^{n \times n}$ be symmetric, i.e., $a_{ij} = a_{ji}$ for all $i, j$. Then $A$ admits a decomposition $A = Q \Lambda Q^T$, where $Q \in \mathbb{R}^{n \times n}$ is orthogonal, i.e., $Q^T Q = I$, and $\Lambda = \mathrm{Diag}(\lambda_1, \ldots, \lambda_n)$.
- widely used, both as an analysis tool and as a computational tool
- no closed form in general; can be numerically computed
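A minimal numerical sketch of the eigendecomposition above (added here for illustration; assumes NumPy is available, and the test matrix is random):

    import numpy as np

    # Build a small random symmetric matrix A = (B + B^T) / 2.
    rng = np.random.default_rng(0)
    B = rng.standard_normal((5, 5))
    A = (B + B.T) / 2

    # eigh is specialized for symmetric matrices: it returns eigenvalues in
    # ascending order and orthonormal eigenvectors as the columns of Q.
    lam, Q = np.linalg.eigh(A)

    # Verify A = Q Diag(lam) Q^T and Q^T Q = I numerically.
    print(np.allclose(A, Q @ np.diag(lam) @ Q.T))   # True
    print(np.allclose(Q.T @ Q, np.eye(5)))          # True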

8 Application Example: PageRank
PageRank is an algorithm used by Google to rank the pages of a search result.
- the idea is to use counts of links between pages to determine page importance.
Source: Wikipedia.

9 One-Page Explanation of How PageRank Works
Model: $v_i = \sum_{j \in L_i} \frac{v_j}{c_j}$, for $i = 1, \ldots, n$,
where $c_j$ is the number of outgoing links from page $j$; $L_i$ is the set of pages with a link to page $i$; $v_i$ is the importance score of page $i$.
- as a toy example with four pages, the model can be written compactly as $v = Av$, where the $(i, j)$ entry of $A$ equals $1/c_j$ if page $j$ links to page $i$ and $0$ otherwise.
- finding $v$ is an eigenvalue problem, with $n$ of the order of millions!
- further reading: [Bryan-Tanya2006]
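A small sketch of how the score vector can be computed in practice (my own illustration, not part of the slides; the 4-page link matrix below is made up): for very large n one typically uses power iteration, repeatedly applying A, rather than a full eigendecomposition.

    import numpy as np

    # Column-stochastic link matrix of a toy 4-page web: column j spreads
    # page j's score equally over the pages it links to (entries 1/c_j).
    A = np.array([[0,   0,   1/2, 0  ],
                  [1/3, 0,   0,   1/2],
                  [1/3, 1/2, 0,   1/2],
                  [1/3, 1/2, 1/2, 0  ]])

    v = np.full(4, 1/4)        # start from uniform scores
    for _ in range(100):       # power iteration: v <- A v, renormalized
        v = A @ v
        v /= v.sum()

    print(v)                   # approximate importance scores (sum to 1)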

10 Least Squares (LS)
Problem: given $A \in \mathbb{R}^{m \times n}$ and $y \in \mathbb{R}^m$, solve $\min_{x \in \mathbb{R}^n} \|y - Ax\|_2^2$, where $\|\cdot\|_2$ is the Euclidean norm, i.e., $\|x\|_2 = \sqrt{\sum_{i=1}^n x_i^2}$.
- widely used in science, engineering, and mathematics
- assuming a tall and full-rank $A$, the LS solution is uniquely given by $x_{LS} = (A^T A)^{-1} A^T y$.
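A quick numerical check of the closed-form solution (illustrative only; random data): the normal-equations formula agrees with a library least-squares solver when A is tall and full rank.

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((50, 3))   # tall matrix, full rank with probability 1
    y = rng.standard_normal(50)

    # Closed-form LS solution via the normal equations (A^T A) x = A^T y.
    x_normal = np.linalg.solve(A.T @ A, A.T @ y)

    # Library LS solver (numerically preferable in practice).
    x_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

    print(np.allclose(x_normal, x_lstsq))   # True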

11 Application Example: Linear Prediction (LP)
Let $y[0], y[1], \ldots$ be a time series.
Model (autoregressive (AR) model): $y[n] = a_1 y[n-1] + a_2 y[n-2] + \cdots + a_L y[n-L] + w[n]$, $n = 0, 1, 2, \ldots$, for some coefficients $\{a_i\}_{i=1}^L$, where $w[n]$ is noise or modeling error.
Problem: estimate $\{a_i\}_{i=1}^L$ from $\{y[n]\}$; can be formulated as LS (see the sketch below).
Applications: time-series prediction, speech analysis and coding, spectral estimation, ...
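One way the LS formulation can be set up (a sketch under my own choice of indexing; the AR(2) test signal is synthetic): stack the model equations for n = L, ..., N-1 into a linear system and solve for the coefficient vector by least squares.

    import numpy as np

    def ar_coefficients(y, L):
        """Estimate AR(L) coefficients from a 1-D series y by least squares."""
        N = len(y)
        # Row for time n holds [y[n-1], ..., y[n-L]]; the target is y[n].
        H = np.array([[y[n - l] for l in range(1, L + 1)] for n in range(L, N)])
        b = y[L:N]
        a, *_ = np.linalg.lstsq(H, b, rcond=None)
        return a

    # Toy check: data generated from a known AR(2) model plus small noise.
    rng = np.random.default_rng(2)
    y = np.zeros(500)
    y[:2] = rng.standard_normal(2)
    for n in range(2, 500):
        y[n] = 0.6 * y[n - 1] - 0.3 * y[n - 2] + 0.05 * rng.standard_normal()

    print(ar_coefficients(y, L=2))   # close to [0.6, -0.3]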

12 A Toy Demo: Predicting the Hang Seng Index
[Figure: Hang Seng Index versus day over a certain time period.]
- blue: the Hang Seng Index during that period.
- red: training phase; the line is $\sum_{l=1}^L a_l y[n-l]$; $a$ is obtained by LS; $L = 10$.
- green: prediction phase; the line is $\hat{y}[n] = \sum_{l=1}^L a_l \hat{y}[n-l]$, using the same $a$ as in the training phase.

13 A Real Example: Real-Time Prediction of Flu Activity
Tracking influenza outbreaks with ARGO, a model combining the AR model and Google search data.
Source: [Yang-Santillana-Kou2015] (PNAS).

14 Low-Rank Matrix Approximation
Problem: given $Y \in \mathbb{R}^{m \times n}$ and an integer $r < \min\{m, n\}$, find a pair $(A, B) \in \mathbb{R}^{m \times r} \times \mathbb{R}^{r \times n}$ such that either $Y = AB$ or $Y \approx AB$.
Formulation: $\min_{A \in \mathbb{R}^{m \times r},\, B \in \mathbb{R}^{r \times n}} \|Y - AB\|_F^2$, where $\|\cdot\|_F$ is the Frobenius, or matrix Euclidean, norm.
Applications: dimensionality reduction, extracting meaningful features from data, low-rank modeling, ...

15 Application Example: Image Compression
Let $Y \in \mathbb{R}^{m \times n}$ be an image; store the low-rank factor pair $(A, B)$ instead of $Y$.
[Figure: (a) original image; (b) truncated SVD, k = 5; (c) truncated SVD, k = 10; (d) truncated SVD, k = 20.]
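A sketch of the storage saving (illustrative; the image is replaced by a random matrix of a made-up size): keeping a rank-k factor pair (A, B) costs k(m + n) numbers instead of mn.

    import numpy as np

    def truncated_svd_factors(Y, k):
        """Return (A, B) with shapes (m, k) and (k, n) such that Y is approx. A @ B."""
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        A = U[:, :k] * s[:k]      # absorb the singular values into the left factor
        B = Vt[:k, :]
        return A, B

    Y = np.random.default_rng(3).standard_normal((512, 768))  # stand-in for an image
    for k in (5, 10, 20):
        A, B = truncated_svd_factors(Y, k)
        print(k, round((A.size + B.size) / Y.size, 3))  # fraction of original storage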

16 Application Example: Principal Component Analysis (PCA)
Aim: given a set of data points $\{y_1, y_2, \ldots, y_n\} \subset \mathbb{R}^m$ and an integer $k < \min\{m, n\}$, find a low-dimensional representation $y_i = Q c_i + \mu + e_i$, $i = 1, \ldots, n$, where $Q \in \mathbb{R}^{m \times k}$ is a basis, the $c_i$'s are coefficients, $\mu$ is a base (mean) vector, and the $e_i$'s are errors.
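A compact sketch of how such a representation is commonly computed (my own illustration, on random data): center the data, take the top-k left singular vectors as Q, and project to obtain the coefficients.

    import numpy as np

    def pca(Y, k):
        """Y has shape (m, n), one data point per column; returns (Q, C, mu)."""
        mu = Y.mean(axis=1, keepdims=True)     # base (mean) vector
        U, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)
        Q = U[:, :k]                           # top-k principal directions
        C = Q.T @ (Y - mu)                     # coefficients c_i, one per column
        return Q, C, mu

    Y = np.random.default_rng(4).standard_normal((100, 400))
    Q, C, mu = pca(Y, k=10)
    Y_hat = Q @ C + mu                         # rank-k reconstruction of the data
    print(np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y))   # relative error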

17 Toy Demo: Dimensionality Reduction of a Face Image Dataset
A face image dataset: image size 112 x 92, number of face images = 400. Each $x_i$ is the vectorization of one face image, leading to $m = 112 \times 92 = 10304$, $n = 400$.

18 Toy Demo: Dimensionality Reduction of a Face Image Dataset
[Figure: the mean face; the 1st, 2nd, and 3rd principal left singular vectors; the 400th left singular vector; and an energy-versus-rank plot showing the energy concentration in the leading components.]

19 Singular Value Decomposition (SVD)
SVD: any $Y \in \mathbb{R}^{m \times n}$ can be decomposed as $Y = U \Sigma V^T$, where $U \in \mathbb{R}^{m \times m}$ and $V \in \mathbb{R}^{n \times n}$ are orthogonal, and $\Sigma \in \mathbb{R}^{m \times n}$ takes a diagonal form.
- also a widely used analysis and computational tool; can be numerically computed
- the SVD can be used to solve the low-rank matrix approximation problem $\min_{A \in \mathbb{R}^{m \times r},\, B \in \mathbb{R}^{r \times n}} \|Y - AB\|_F^2$ (see the sketch below).
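A brief numerical check (not part of the slides) that the truncated SVD solves the Frobenius-norm problem above: the error of the best rank-r approximation equals the root-sum-of-squares of the discarded singular values (the Eckart-Young theorem).

    import numpy as np

    rng = np.random.default_rng(5)
    Y = rng.standard_normal((30, 20))
    r = 5

    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    Y_r = U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]        # best rank-r approximation

    err = np.linalg.norm(Y - Y_r, 'fro')
    print(np.isclose(err, np.sqrt(np.sum(s[r:] ** 2))))   # True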

20 Why Is Matrix Analysis and Computations Important?
- as noted, areas such as signal processing, image processing, machine learning, optimization, computer vision, control, communications, ..., use matrix operations extensively
- it helps you build the foundations for understanding hot topics such as sparse recovery, structured low-rank matrix approximation, and matrix completion.

21 The Sparse Recovery Problem
Problem: given measurements $y \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m \times n}$ with $m < n$, find a sparsest $x \in \mathbb{R}^n$ (a vector with few nonzero entries) such that $y = Ax$.
- by sparsest, we mean that $x$ should have as many zero elements as possible.
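One standard way to attack this (a sketch; the slides do not fix an algorithm here): relax the problem to minimizing the l1 norm subject to y = Ax (basis pursuit), which becomes a linear program after splitting x into nonnegative parts. The problem sizes below are toy values, and SciPy is assumed to be available.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(6)
    m, n, k = 20, 50, 3
    A = rng.standard_normal((m, n))
    x_true = np.zeros(n)
    x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    y = A @ x_true

    # Basis pursuit: min ||x||_1 s.t. Ax = y, as an LP in z = [x+; x-] >= 0,
    # with x = x+ - x- and objective sum(x+) + sum(x-).
    c = np.ones(2 * n)
    res = linprog(c, A_eq=np.hstack([A, -A]), b_eq=y, bounds=[(0, None)] * (2 * n))
    x_hat = res.x[:n] - res.x[n:]

    print(np.allclose(x_hat, x_true, atol=1e-4))   # typically True in this regime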

22 Application: Magnetic Resonance Imaging (MRI)
Problem: MRI image reconstruction.
[Figure: (a) the original test image; (b) the sampling region in the frequency domain, where Fourier coefficients are sampled along 22 approximately radial lines.]
Source: [Candès-Romberg-Tao2006]

23 Application: Magnetic Resonance Imaging (MRI)
Problem: MRI image reconstruction.
[Figure: (c) the reconstruction obtained by setting the unobserved Fourier coefficients to zero; (d) the reconstruction obtained by a sparse recovery solution.]
Source: [Candès-Romberg-Tao2006]

24 Application: Low-Rank Matrix Completion
Recommendation systems: in 2009, Netflix awarded $1 million to a team that performed best in recommending new movies to users based on their previous preferences.
Let $Z$ be a preference matrix, where $z_{ij}$ records how user $i$ likes movie $j$; for example (rows are users, columns are movies, ? marks a missing entry):
Z =
  2 3 1 ? ? 5
  5 1 ? 4 2 ?
  ? ? ? 3 1 ?
  ? ? 3 ? 1 5
- some entries $z_{ij}$ are missing, since no one watches all movies.
- $Z$ is assumed to be of low rank; research shows that only a few factors affect users' preferences.
Aim: guess the unknown $z_{ij}$'s from the known ones.

25 Low-Rank Matrix Completion
The 2009 Netflix Grand Prize winners used low-rank matrix approximations [Koren-Bell-Volinsky2009].
Formulation (oversimplified): $\min_{A \in \mathbb{R}^{m \times r},\, B \in \mathbb{R}^{r \times n}} \sum_{(i,j) \in \Omega} (z_{ij} - [AB]_{ij})^2$, where $\Omega$ is an index set that indicates the known entries of $Z$.
- cannot be solved by the SVD
- in the recommendation-system application, it is a large-scale problem
- alternating LS may be used (see the sketch below)
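A bare-bones alternating-LS sketch (my own illustration of the idea, not the prize winners' algorithm; no regularization, synthetic rank-2 data): fix B and solve a small LS problem for each row of A over that row's observed entries, then swap roles.

    import numpy as np

    def als_complete(Z, mask, r, n_iters=50):
        """Complete Z (m x n), observed where mask is True, by rank-r alternating LS."""
        m, n = Z.shape
        rng = np.random.default_rng(7)
        A = rng.standard_normal((m, r))
        B = rng.standard_normal((r, n))
        for _ in range(n_iters):
            for i in range(m):                 # update row i of A
                obs = mask[i]
                A[i] = np.linalg.lstsq(B[:, obs].T, Z[i, obs], rcond=None)[0]
            for j in range(n):                 # update column j of B
                obs = mask[:, j]
                B[:, j] = np.linalg.lstsq(A[obs], Z[obs, j], rcond=None)[0]
        return A @ B

    # Toy test: a rank-2 matrix with roughly half of its entries observed.
    rng = np.random.default_rng(8)
    Z_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 20))
    mask = rng.random((30, 20)) < 0.5
    Z_hat = als_complete(Z_true, mask, r=2)
    print(np.linalg.norm((Z_hat - Z_true)[~mask]) / np.linalg.norm(Z_true[~mask]))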

26 Toy Demonstration of Low-Rank Matrix Completion
[Figure. Left: an incomplete image with 40% of its pixels missing. Right: the low-rank matrix completion result, with r = 120.]

27 Nonnegative Matrix Factorization (NMF)
Aim: we want the factors to be nonnegative.
Formulation: $\min_{A \in \mathbb{R}^{m \times r},\, B \in \mathbb{R}^{r \times n}} \|Y - AB\|_F^2$ subject to $A \geq 0$, $B \geq 0$, where $X \geq 0$ means that $x_{ij} \geq 0$ for all $i, j$.
- arguably a topic in optimization, but worth noticing
- found (in empirical studies) to be able to extract meaningful features
- numerous applications, e.g., in machine learning, signal processing, and remote sensing
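A compact sketch of the Lee-Seung multiplicative updates used in the toy demonstration later in these slides (illustrative parameter choices; a small epsilon guards against division by zero):

    import numpy as np

    def nmf(Y, r, n_iters=500, eps=1e-9):
        """Lee-Seung multiplicative updates for min ||Y - A B||_F^2 with A, B >= 0."""
        m, n = Y.shape
        rng = np.random.default_rng(9)
        A = rng.random((m, r))
        B = rng.random((r, n))
        for _ in range(n_iters):
            B *= (A.T @ Y) / (A.T @ A @ B + eps)   # update B; stays nonnegative
            A *= (Y @ B.T) / (A @ B @ B.T + eps)   # update A; stays nonnegative
        return A, B

    Y = np.random.default_rng(10).random((40, 60))   # nonnegative toy data
    A, B = nmf(Y, r=5)
    print(np.linalg.norm(Y - A @ B) / np.linalg.norm(Y))   # relative fit error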

28 NMF Examples
Image processing: NMF applied to a database of 2,429 face images with r = 49 basis images learns a parts-based representation. The basis elements are non-global and contain several versions of mouths, noses, and other facial parts, in different locations or forms, and the variability of a whole face is generated by combining these different parts; the basis elements extract facial features such as eyes, noses, and lips.
Source: [Lee-Seung1999] (Nature).

29 Text Mining
- basis elements allow us to recover different topics
- weights allow us to assign each text to its corresponding topics

30 Toy Demonstration of NMF
A face image dataset of n face images, each of size 101 x 101; each $x_i$ is the vectorization of one face image, leading to $m = 101 \times 101 = 10201$.

31 Toy Demonstration of NMF: NMF-Extracted Features
NMF settings: r = 49, Lee-Seung multiplicative update with 5000 iterations.

32 Toy Demonstration of NMF: Comparison with PCA
[Figure: the mean face; the 1st, 2nd, and 3rd principal left singular vectors; the last principal left singular vector; and an energy-versus-rank plot showing the energy concentration.]

33 References
[Yang-Santillana-Kou2015] S. Yang, M. Santillana, and S. C. Kou, "Accurate estimation of influenza epidemics using Google search data via ARGO," Proceedings of the National Academy of Sciences, vol. 112, no. 47, pp. 14473-14478, 2015.
[Bryan-Tanya2006] K. Bryan and T. Leise, "The $25,000,000,000 eigenvector: The linear algebra behind Google," SIAM Review, vol. 48, no. 3, pp. 569-581, 2006.
[Candès-Romberg-Tao2006] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Information Theory, vol. 52, no. 2, pp. 489-509, 2006.
[Koren-Bell-Volinsky2009] Y. Koren, R. Bell, and C. Volinsky, "Matrix factorization techniques for recommender systems," IEEE Computer, vol. 42, no. 8, pp. 30-37, 2009.
[Lee-Seung1999] D. D. Lee and H. S. Seung, "Learning the parts of objects by non-negative matrix factorization," Nature, vol. 401, no. 6755, pp. 788-791, 1999.
