Linear Methods in Data Mining


3 Why Linear Methods? Linear methods are well understood, simple, and elegant; algorithms based on linear methods are widespread in data mining, computer vision, graphics, and pattern recognition; excellent general software is available for experimental work.

4 Software Available: the JAMA Java package, available free on the Internet; MATLAB, excellent but expensive; SCILAB (free); OCTAVE (free).

5 Least Squares: the Main Problem. Given observations of p variables in n experiments, a_1, ..., a_p ∈ R^n, and the vector of the outcomes of the experiments b ∈ R^n, determine α_0, α_1, ..., α_p to express the outcome of the experiments b as

b = α_0 + Σ_{i=1}^{p} a_i α_i.

6 Data Sets as Tables. Series of n experiments measuring p variables and an outcome:

      a_1   ...  a_p   | b
E_1   a_11  ...  a_1p  | b_1
...
E_n   a_n1  ...  a_np  | b_n

7 Least Squares in Matrix Form: determine α_0, α_1, ..., α_p such that

(1  a_1  ...  a_p) (α_0, α_1, ..., α_p)^T = b.

This system consists of n equations in the p + 1 unknowns α_0, ..., α_p and is overdetermined.

8 Example. Data on the relationship between weight (in lbs) and blood triglycerides (tgl), given as a table of (weight, tgl) pairs for seven subjects.

9 Example (cont'd). Scatter plot: triglycerides (TGL) vs. weight.

10 Quest. Find α_0 and α_1 such that tgl = α_0 + α_1 · weight. Two unknowns and seven equations!

α_0 + w_1 α_1 = 120
α_0 + w_2 α_1 = 144
α_0 + w_3 α_1 = 142
α_0 + w_4 α_1 = 167
α_0 + w_5 α_1 = 180
α_0 + w_6 α_1 = 190
α_0 + w_7 α_1 = 197

(where w_1, ..., w_7 are the seven recorded weights).

11 In matrix form:

A α = b,

where α = (α_0, α_1)^T, the rows of A are (1, w_i), and b = (120, 144, 142, 167, 180, 190, 197)^T. Finding the parameters allows building a model and making predictions of tgl values based on weight.

12 The Least Squares Method (LSM). LSM is used in data mining as a method of estimating the parameters of a model. The estimation process is known as regression. Several types of regression exist, depending on the nature of the assumed model of dependency.

13 Making the Best of Overdetermined Systems. If Ax = b has no solution, the next best thing is to find c ∈ R^n such that ‖Ac − b‖_2 ≤ ‖Ax − b‖_2 for every x ∈ R^n. Since Ax ∈ range(A) for any x ∈ R^n, this amounts to finding the vector u ∈ range(A) that is as close to b as possible.

14 Let A ∈ R^{m×n} be a full-rank matrix (rank(A) = n) such that m > n. The symmetric square matrix A^T A ∈ R^{n×n} has the same rank n as the matrix A, so (A^T A)x = A^T b has a unique solution. A^T A is positive definite because x^T A^T A x = (Ax)^T (Ax) = ‖Ax‖_2^2 > 0 if x ≠ 0.

15 Theorem. Let A ∈ R^{m×n} be a full-rank matrix such that m > n and let b ∈ R^m. The system (A^T A)x = A^T b has a unique solution x, and Ax equals the projection of the vector b on the subspace range(A). The system (A^T A)x = A^T b is known as the system of normal equations of A and b.

16 Example in MATLAB:

C = A' * A
x = C \ (A' * b)

The computed x gives the fitted regression line tgl = α_0 + α_1 · weight.

17 Danger: Choosing the Simplest Way! The condition number of A^T A is the square of the condition number of A, so forming A^T A can lead to a loss of information.
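A minimal Octave/MATLAB sketch of this warning, using a made-up, nearly collinear design matrix (the entries are assumptions, not data from the slides): the condition number of A'A is roughly the square of that of A.

A = [1 1; 1 1.0001; 1 1.0002];   % hypothetical, nearly collinear 3 x 2 design matrix
cond(A)                          % condition number of A
cond(A' * A)                     % approximately cond(A)^2, i.e. much larger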

18 The Thin QR Decomposition. Let A ∈ R^{m×n} be a matrix such that m ≥ n and rank(A) = n (a full-rank matrix). A can be factored as

A = Q [R; O_{m−n,n}],

where Q ∈ R^{m×m} and R ∈ R^{n×n} are such that
(i) Q is an orthonormal matrix, and
(ii) R = (r_ij) is an upper triangular matrix.
(Here [R; O_{m−n,n}] denotes the m × n matrix obtained by stacking R on top of the zero block O_{m−n,n}.)

19 LSQ Approximation and QR Decomposition.

Au − b = Q [R; O_{m−n,n}] u − b
       = Q [R; O_{m−n,n}] u − Q Q^T b   (because Q is orthonormal and therefore Q Q^T = I_m)
       = Q ( [R; O_{m−n,n}] u − Q^T b ).

20 LSQ Approximation and QR Decomposition (cont'd).

‖Au − b‖_2^2 = ‖ [R; O_{m−n,n}] u − Q^T b ‖_2^2

(multiplication by the orthonormal matrix Q preserves the 2-norm). If we write Q = (L_1 L_2), where L_1 ∈ R^{m×n} and L_2 ∈ R^{m×(m−n)}, then

‖Au − b‖_2^2 = ‖ [R; O_{m−n,n}] u − [L_1^T b; L_2^T b] ‖_2^2
             = ‖ [Ru − L_1^T b; −L_2^T b] ‖_2^2
             = ‖Ru − L_1^T b‖_2^2 + ‖L_2^T b‖_2^2.

21 LSQ Approximation and QR Decomposition (cont'd). The system Ru = L_1^T b can be solved, and its solution minimizes ‖Au − b‖_2.

22 Example. For a given matrix A and right-hand side b, the thin QR decomposition is computed in MATLAB as

[Q, R] = qr(A, 0)

23 The factors Q and R are displayed, and the least squares solution is obtained as

x = R \ (Q' * b)

with cond(A) reporting the condition number of A.
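A self-contained Octave/MATLAB sketch of the QR-based least squares solve described above; the 7 x 2 matrix and right-hand side are random placeholders, not the triglyceride data from the slides.

A = randn(7, 2);                 % hypothetical overdetermined system, m = 7 > n = 2
b = randn(7, 1);
[Q, R] = qr(A, 0);               % thin QR: Q is 7 x 2 with orthonormal columns, R is 2 x 2
x_qr = R \ (Q' * b);             % solve R u = L1' b
x_ne = (A' * A) \ (A' * b);      % normal equations, for comparison
norm(x_qr - x_ne)                % the two solutions agree up to rounding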

24 Sample Data Sets.

      V_1   ...  V_j   ...  V_p
E_1   x_11  ...  x_1j  ...  x_1p   (row vector x_1)
...
E_i   x_i1  ...  x_ij  ...  x_ip   (row vector x_i)
...
E_n   x_n1  ...  x_nj  ...  x_np   (row vector x_n)

25 Sample Matrix.

X = [x_1; ...; x_n] ∈ R^{n×p}  (the rows of X are the sample vectors x_1, ..., x_n).

Sample mean: x̄ = (1/n)(x_1 + ... + x_n) = (1/n) 1^T X.

26 Centered Sample Matrix.

X̂ = [x_1 − x̄; ...; x_n − x̄].

Covariance matrix: cov(X) = (1/(n−1)) X̂^T X̂ ∈ R^{p×p}, with

cov(X)_ii = (1/(n−1)) Σ_{k=1}^n ((v_i)_k)^2 = (1/(n−1)) Σ_{k=1}^n (x_ki − x̄_i)^2,

where v_i denotes the i-th column of X̂. Here c_ii is the i-th variance and c_ij is the (i, j)-covariance.

27 Centered Sample Matrix (cont'd). For i ≠ j,

cov(X)_ij = (1/(n−1)) ( Σ_{k=1}^n x_ki x_kj − n x̄_i x̄_j ).
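A short Octave/MATLAB sketch of these formulas on a hypothetical sample matrix: center the data, form cov(X) = X̂'X̂/(n−1), and check the result against the built-in cov.

X = randn(10, 3);                 % hypothetical sample matrix: n = 10 experiments, p = 3 variables
n = size(X, 1);
xbar = mean(X);                   % sample mean (a row vector)
Xc = X - ones(n, 1) * xbar;       % centered sample matrix
C = (Xc' * Xc) / (n - 1);         % covariance matrix, p x p
norm(C - cov(X))                  % agrees with the built-in cov up to rounding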

28 Total Variance. The total variance tvar(X) of X is trace(cov(X)). Let X = [x_1; ...; x_n] ∈ R^{n×p} be a centered sample matrix and let R ∈ R^{p×p} be an orthonormal matrix. If Z ∈ R^{n×p} is the matrix Z = XR, then Z is centered, cov(Z) = R^T cov(X) R, and tvar(Z) = tvar(X).

29 Properties of the Covariance Matrix. cov(X) = (1/(n−1)) X^T X (for centered X) is symmetric, so it is orthonormally diagonalizable: there exists an orthonormal matrix R such that R^T cov(X) R = D, which corresponds to a sample matrix Z = XR. Since

Z = [z_1; ...; z_n] = [x_1; ...; x_n] R,

the data in the new form are z_i = x_i R.

30 Properties of the Diagonalized Covariance Matrix. If cov(Z) = D = diag(d_1, ..., d_p), then d_i is the variance of the i-th component, and the covariances of the form cov(i, j) with i ≠ j are 0. From a statistical point of view, this means that the components i and j are uncorrelated. The principal components of the sample matrix X are the eigenvectors of the matrix cov(X): the columns of the matrix R that diagonalizes cov(X).

31 Variance Explanation. The sum of the elements on the main diagonal of D equals the total variance tvar(X). The principal components explain the sources of the total variance: sample vectors grouped around the first principal component p_1 explain the largest portion of the variance; sample vectors grouped around p_2 explain the second largest portion of the variance, etc.
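A compact Octave/MATLAB sketch (hypothetical correlated data, not the repair data of the next slides) of the construction above: diagonalize cov(X) with an orthonormal R, form Z = XR, and read off the fraction of the total variance explained by each principal component.

X = randn(50, 2) * [2 1; 0 0.5];        % made-up correlated sample, n = 50, p = 2
Xc = X - ones(50, 1) * mean(X);         % center the data
C = cov(Xc);                            % covariance matrix
[R, D] = eig(C);                        % columns of R: candidate principal components
[d, idx] = sort(diag(D), 'descend');    % order by decreasing variance
R = R(:, idx);
Z = Xc * R;                             % new, uncorrelated variables z_i = x_i R
d / sum(d)                              % fraction of total variance explained by each component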

32 Example. Ten experiments E_1, ..., E_10, each recording a repair time in minutes and a cost, giving sample vectors x_1, ..., x_10 (the table lists the pairs (minutes, cost)).

33 Price vs. Repair Time. Scatter plot: cost vs. time (minutes).

34 The principal components of X: r_1 = (0.37, −0.93)^T and r_2 = (0.93, 0.37)^T, corresponding to the eigenvalues λ_1 and λ_2, respectively.

35 Centered Data and Principal Components. Scatter plot: cost (centered) vs. time in minutes (centered).

36 The New Variables.

z_1 = 0.37 (minutes − 37.30) − 0.93 (price − 28.50),
z_2 = 0.93 (minutes − 37.30) + 0.37 (price − 28.50).

37 The New Variables. Scatter plot of the data in the new coordinates (z_1, z_2).

38 The Optimality Theorem of PCA. Let X ∈ R^{n×p} be a centered sample matrix and let R ∈ R^{p×p} be the orthonormal matrix such that R^T cov(X) R is a diagonal matrix D ∈ R^{p×p} with d_11 ≥ ... ≥ d_pp. Let Q ∈ R^{p×l} be a matrix whose set of columns is orthonormal, where 1 ≤ l ≤ p, and let W = XQ ∈ R^{n×l}. Then trace(cov(W)) is maximized when Q consists of the first l columns of R and is minimized when Q consists of the last l columns of R.

39 Singular Values and Vectors. Let A ∈ C^{m×n} be a matrix. A number σ ∈ R_{>0} is a singular value of A if there exists a pair of vectors (u, v) ∈ C^m × C^n such that Av = σu and A^H u = σv. The vector u is the left singular vector and v is the right singular vector associated with the singular value σ.

40 SVD Facts. The matrices A^H A and A A^H have the same non-zero eigenvalues. If σ is a singular value of A, then σ^2 is an eigenvalue of both A^H A and A A^H. Any right singular vector v is an eigenvector of A^H A, and any left singular vector u is an eigenvector of A A^H.

41 SVD Facts (cont'd). If λ is an eigenvalue of A^H A (equivalently, of A A^H), then λ is a real non-negative number. There is v ∈ C^n such that A^H A v = λv. Let u ∈ C^m be the vector defined by Av = √λ u. Then A^H u = √λ v, so √λ is a singular value of A and (u, v) is a pair of singular vectors associated with the singular value √λ. If A ∈ C^{n×n} is invertible and σ is a singular value of A, then 1/σ is a singular value of the matrix A^{−1}.
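These facts are easy to check numerically; here is a small Octave/MATLAB sketch with a random matrix (an assumption, not an example from the slides):

A = randn(5, 3);                        % hypothetical real matrix
s = svd(A);                             % singular values of A, in decreasing order
ev = sort(eig(A' * A), 'descend');      % eigenvalues of A'A (here A' plays the role of A^H)
[s.^2, ev]                              % the two columns agree up to rounding: sigma^2 = lambda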

42 Example: SVD of a Vector. Let a = (a_1, ..., a_n)^T be a non-zero vector in C^n, which can also be regarded as a matrix A ∈ C^{n×1}. The square of a singular value of A is an eigenvalue of the n × n matrix A A^H, whose (i, j) entry is a_i ā_j. The unique non-zero eigenvalue of this matrix is ‖a‖_2^2, so the unique singular value of a is ‖a‖_2.

43 A Decomposition of Square Matrices. Let A ∈ C^{n×n} be a unitarily diagonalizable matrix. There exist a diagonal matrix D ∈ C^{n×n} and a unitary matrix U ∈ C^{n×n} such that A = U^H D U; equivalently, A = V D V^H, where V = U^H. If V = (v_1 ... v_n), then A v_i = d_i v_i, where D = diag(d_1, ..., d_n), so v_i is a unit eigenvector of A that corresponds to the eigenvalue d_i. The decomposition

A = (v_1 ... v_n) diag(d_1, ..., d_n) (v_1 ... v_n)^H

implies

A = d_1 v_1 v_1^H + ... + d_n v_n v_n^H.

44 Decomposition of Rectangular Matrices. Theorem (SVD Decomposition Theorem). Let A ∈ C^{m×n} be a matrix with singular values σ_1, σ_2, ..., σ_p, where σ_1 ≥ σ_2 ≥ ... ≥ σ_p > 0 and p ≤ min{m, n}. There exist two unitary matrices U ∈ C^{m×m} and V ∈ C^{n×n} such that

A = U diag(σ_1, ..., σ_p) V^H,

where diag(σ_1, ..., σ_p) ∈ R^{m×n} (the singular values on the main diagonal, zeros elsewhere).

45 SVD Properties. Let A = U diag(σ_1, ..., σ_p) V^H be the SVD of the matrix A, where σ_1, ..., σ_p are the singular values of A. The rank of A equals p, the number of non-zero elements located on the diagonal.

46 The Thin SVD Decomposition. Let A ∈ C^{m×n} be a matrix with singular values σ_1, σ_2, ..., σ_p, where σ_1 ≥ σ_2 ≥ ... ≥ σ_p > 0 and p ≤ min{m, n}. Then A can be factored as A = U D V^H, where U ∈ C^{m×p} and V ∈ C^{n×p} are matrices having orthonormal columns and D = diag(σ_1, σ_2, ..., σ_p) is the p × p diagonal matrix of singular values.
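A minimal Octave/MATLAB sketch of the thin SVD, using the built-in svd with the economy-size option on an assumed random matrix:

A = randn(6, 3);                        % hypothetical 6 x 3 matrix
[U, D, V] = svd(A, 'econ');             % thin SVD: U is 6 x 3, D is 3 x 3, V is 3 x 3
norm(A - U * D * V')                    % reconstruction error, essentially zero
diag(D)'                                % the singular values sigma_1 >= ... >= sigma_p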

47 SVD Facts. The rank-1 matrices of the form u_i v_i^H, where 1 ≤ i ≤ p, are pairwise orthogonal, and ‖u_i v_i^H‖_F = 1 for 1 ≤ i ≤ p.

48 Eckart-Young Theorem. Let A ∈ C^{m×n} be a matrix whose sequence of non-zero singular values is σ_1 ≥ ... ≥ σ_p > 0, so that A can be written as

A = σ_1 u_1 v_1^H + ... + σ_p u_p v_p^H.

Let B(k) ∈ C^{m×n} be

B(k) = Σ_{i=1}^{k} σ_i u_i v_i^H.

If r_k = inf{ ‖A − X‖_2 : X ∈ C^{m×n} and rank(X) ≤ k }, then

‖A − B(k)‖_2 = r_k = σ_{k+1}

for 1 ≤ k ≤ p, where σ_{p+1} = 0.
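A sketch (assumed random matrix) of the theorem in Octave/MATLAB: B(k) is obtained by truncating the SVD, and the 2-norm error ‖A − B(k)‖_2 equals σ_{k+1}.

A = randn(8, 5);                             % hypothetical matrix
[U, D, V] = svd(A);
k = 2;
Bk = U(:, 1:k) * D(1:k, 1:k) * V(:, 1:k)';   % B(k): sum of the k leading rank-1 terms
s = diag(D);
[norm(A - Bk, 2), s(k + 1)]                  % the two numbers coincide, as the theorem states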

49 Central Issue in IR: the computation of sets of documents that contain terms specified by queries submitted by users. Challenges: a concept can be expressed by many equivalent words (synonymy), and the same word may mean different things in various contexts (polysemy); this can lead the retrieval technique to return documents that are irrelevant to the query (false positives) or to omit documents that may be relevant (false negatives).

50 Documents, Corpora, Terms. A corpus is a pair K = (T, D), where T = {t_1, ..., t_m} is a finite set whose elements are referred to as terms, and D = (D_1, ..., D_n) is a sequence of documents. Each document D_j is a finite sequence of terms, D_j = (t_{j1}, ..., t_{jk}, ..., t_{j l_j}).

51 Documents, Corpora, Terms (cont'd). Each term t_i generates a row vector (a_i1, a_i2, ..., a_in) referred to as a term vector, and each document D_j generates a column vector d_j = (a_1j, ..., a_mj)^T. A query is a sequence of terms q ∈ Seq(T), and it is also represented as a vector q = (q_1, ..., q_m)^T, where q_i = 1 if the term t_i occurs in q, and 0 otherwise.

52 Retrieval. When a query q is applied to a corpus K, an IR system based on the vector model computes the similarity between the query and the documents of the corpus by evaluating the cosine of the angle between the query vector q and the document vectors. For the angle α_j between q and d_j we have

cos α_j = (q, d_j) / (‖q‖_2 ‖d_j‖_2).

The IR system returns those documents D_j for which this angle is small, that is, cos α_j ≥ t, where t is a parameter provided by the user.
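A small Octave/MATLAB sketch of this retrieval step on a made-up 5-term, 3-document corpus; the matrix, query, and threshold are assumptions used only for illustration.

A = [1 0 2; 0 1 0; 1 1 0; 2 1 0; 0 0 3];    % hypothetical term-document matrix (5 terms x 3 documents)
q = [1; 0; 1; 0; 0];                        % query containing the terms t1 and t3
t = 0.5;                                    % user-provided threshold
for j = 1:size(A, 2)
  cosines(j) = (q' * A(:, j)) / (norm(q) * norm(A(:, j)));
end
find(cosines >= t)                          % indices of the documents returned for q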

53 LSI. The LSI (latent semantic indexing) method aims to capture relationships between documents that are motivated by the underlying structure of the documents. This structure is obscured by synonymy, polysemy, the use of insignificant syntactic-sugar words, and plain noise caused by misspelled words or counting errors.

54 Assumptions: (i) A ∈ R^{m×n} is the matrix of a corpus K; (ii) A = U D V^H is an SVD of A; (iii) rank(A) = p. Then: the first p columns of U form an orthonormal basis for range(A), the subspace generated by the document vectors of K; the last n − p columns of V constitute an orthonormal basis for nullsp(A); the transposes of the first p columns of V form an orthonormal basis for the subspace of R^n generated by the term vectors of K.

55 Example. K = (T, D), where T = {t_1, ..., t_5} and D = {D_1, D_2, D_3}, with term-document matrix A ∈ R^{5×3}. D_1 and D_2 are fairly similar (they contain two common terms, t_3 and t_4).

56 Example (cont'd). Suppose that t_1 and t_2 are synonyms. The query q = (1, 0, 0, 0, 0)^T returns D_1. The matrix representation cannot directly account for the equivalence of the terms t_1 and t_2. However, this is an acceptable assumption because these terms appear in the common context {t_3, t_4} in D_1 and D_2.

57 An SVD of A: the numerical factorization A = U D V^H of the example matrix.

58 Successive Approximations of A:

B(1) = σ_1 u_1 v_1^H,
B(2) = σ_1 u_1 v_1^H + σ_2 u_2 v_2^H.

59 A and B(1): the matrix A side by side with its rank-1 approximation B(1).

60 A and B(2): the matrix A side by side with its rank-2 approximation B(2).

61 Basic Properties of the SVD Approximation: (i) the rank-1 matrices u_i v_i^H are pairwise orthogonal; (ii) their Frobenius norms are all equal to 1; (iii) noise is distributed with relative uniformity with respect to the p orthonormal components of the SVD. By omitting components that correspond to relatively small singular values, we eliminate a substantial part of the noise and obtain a matrix that better reflects the underlying hidden structure of the corpus.
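The following Octave/MATLAB sketch mirrors the synonym example of the surrounding slides on a hypothetical term-document matrix; the actual matrix of the example is not reproduced here, so the entries below are assumptions with the same structure (t1 and t2 are "synonyms" sharing the context {t3, t4}).

A = [1 0 0; 0 1 0; 1 1 0; 1 1 0; 0 0 2];     % hypothetical corpus: D1 uses t1, D2 uses t2, both use t3 and t4
q = [1; 0; 0; 0; 0];                         % query on the term t1 only
[U, D, V] = svd(A);
B2 = U(:, 1:2) * D(1:2, 1:2) * V(:, 1:2)';   % B(2), the rank-2 approximation of A
for j = 1:3
  cosA(j) = (q' * A(:, j))  / (norm(q) * norm(A(:, j)));
  cosB(j) = (q' * B2(:, j)) / (norm(q) * norm(B2(:, j)));
end
cosA                                         % only D1 matches the query in the original matrix
cosB                                         % in B(2), D2 also matches: the hidden similarity is uncovered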

62 Example. Let q be a query vector whose first component is 1. The similarity between q and the document vectors d_i, 1 ≤ i ≤ 3, that constitute the columns of A is given by cos(q, d_1) (the largest value), cos(q, d_2) = 0.482, and cos(q, d_3) = 0, suggesting that d_1 is by far the most relevant document for q.

63 Example (cont'd). If we compute the same cosine for q and the columns b_1, b_2, and b_3 of the matrix B(2), we have cos(q, b_1) = cos(q, b_2) and cos(q, b_3) = 0. This approximation of A uncovers the hidden similarity of d_1 and d_2, a fact that is quite apparent from the structure of the matrix B(2).

64 Recognizing Chinese Characters - N. Wang

65 Characteristics of Chinese Characters: square in shape; more than 5000 characters; the set of characters is indexed by the number of strokes, which varies from 1 to 32.

66 Difficulties with Stroke Counts: the length of a stroke is variable; the number of strokes does not capture the topology of a character; strokes are not line segments but rather calligraphic units.

67 Two Characters with 9 Strokes Each

68 Characters are digitized (black and white): each character is represented by a matrix of pixels.

69 Successive Approximation of Images: the SVD of a digitized Chinese character and the best rank-k approximation to A.

70 Scree Diagram: variation of the singular values, plotted in decreasing order.
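A sketch of the image-approximation idea and the scree diagram in Octave/MATLAB, using a synthetic black-and-white "character" (a random binary matrix standing in for a digitized character; with a real bitmap the singular values typically decay much faster).

A = double(rand(64) > 0.5);                  % hypothetical 64 x 64 black-and-white bitmap
[U, D, V] = svd(A);
k = 10;
Ak = U(:, 1:k) * D(1:k, 1:k) * V(:, 1:k)';   % best rank-k approximation of the image
norm(A - Ak, 2)                              % equals the (k+1)-st singular value
semilogy(diag(D), 'o')                       % scree diagram: singular values in decreasing order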

71 Several Important Sources for These Methods:
G. H. Golub and C. F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore, 1989.
I. T. Jolliffe, Principal Component Analysis, Springer, New York, 2002.
R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, UK, 1996.
L. Elden, Matrix Methods in Data Mining and Pattern Recognition, SIAM, Philadelphia, 2007.
