FSAN/ELEG815: Statistical Learning
1: Statistical Learning. Gonzalo R. Arce, Department of Electrical and Computer Engineering, University of Delaware. 3. Eigen Analysis, SVD and PCA
2 Outline of the Course
1. Review of Probability
2. Stationary processes
3. Eigen Analysis, Singular Value Decomposition (SVD) and Principal Component Analysis (PCA)
4. The Learning Problem
5. Training vs Testing
6. Estimation theory: Maximum likelihood and Bayes estimation
7. The Wiener Filter
8. Adaptive Optimization: Steepest descent and the LMS algorithm
9. Least Squares (LS) and Recursive Least Squares (RLS) algorithms
10. Overfitting and Regularization
11. Logistic, Ridge and Lasso regression
12. Neural Networks
13. Matrix Completion
3 Eigen Analysis
Objective: Utilize tools from linear algebra to characterize and analyze matrices, especially the correlation matrix. The correlation matrix plays a large role in statistical characterization and processing. Previous result: R is Hermitian. Further insight into the correlation matrix is achieved through eigen analysis: eigenvalues and eigenvectors, and matrix diagonalization. Application: optimum filtering problems.
4 Eigen Analysis
Objective: For a Hermitian matrix R, find a vector q satisfying Rq = λq.
Interpretation: Linear transformation by R changes the scale, but not the direction, of q.
Fact: An M × M matrix R has M eigenvectors and eigenvalues. To see this, note
Rq_i = λ_i q_i, i = 1, 2, ..., M  ⟹  (R − λI)q = 0
For this to hold with q ≠ 0, the rows/columns of (R − λI) must be linearly dependent, i.e.,
det(R − λI) = 0
5 Eigen Analysis
Note: det(R − λI) is an Mth-order polynomial in λ. The roots of the polynomial are the eigenvalues λ_1, λ_2, ..., λ_M, and each eigenvector q_i is associated with one eigenvalue λ_i through Rq_i = λ_i q_i.
The eigenvectors are not unique: Rq_i = λ_i q_i ⟹ R(aq_i) = λ_i (aq_i).
Consequence: eigenvectors are generally normalized, e.g., ||q_i|| = 1 for i = 1, 2, ..., M.
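As a quick numerical sketch of the definitions above (assuming NumPy; the 2×2 symmetric matrix here is an arbitrary illustration, not one from the slides), `numpy.linalg.eigh` returns the eigenvalues and unit-norm eigenvectors of a Hermitian matrix:

```python
import numpy as np

# Arbitrary symmetric (real Hermitian) example matrix.
R = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is specialized for Hermitian matrices: it returns real eigenvalues
# in ascending order and orthonormal eigenvectors as the columns of Q.
lam, Q = np.linalg.eigh(R)

# Each column satisfies R q_i = lambda_i q_i with ||q_i|| = 1.
for i in range(2):
    assert np.allclose(R @ Q[:, i], lam[i] * Q[:, i])
    assert np.isclose(np.linalg.norm(Q[:, i]), 1.0)
```

Note that `eigh` already returns normalized eigenvectors, matching the convention ||q_i|| = 1 on the slide.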
6 Eigen Analysis
Example (General two-dimensional case): Let M = 2 and
R = [ R_1,1  R_1,2 ; R_2,1  R_2,2 ]
Determine the eigenvalues and eigenvectors. Setting det(R − λI) = 0,
| R_1,1 − λ   R_1,2 ; R_2,1   R_2,2 − λ | = 0
λ² − λ(R_1,1 + R_2,2) + (R_1,1 R_2,2 − R_1,2 R_2,1) = 0
λ_1,2 = (1/2) [ (R_1,1 + R_2,2) ± sqrt( 4 R_1,2 R_2,1 + (R_1,1 − R_2,2)² ) ]
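The 2×2 closed form can be checked numerically (a sketch assuming NumPy; the matrix `R` is an arbitrary example, chosen so the discriminant is positive):

```python
import numpy as np

R = np.array([[3.0, 1.0],
              [2.0, 2.0]])   # arbitrary 2x2 example; the formula applies to any 2x2 matrix

tr = R[0, 0] + R[1, 1]
disc = 4.0 * R[0, 1] * R[1, 0] + (R[0, 0] - R[1, 1]) ** 2

# lambda_{1,2} = (1/2) [ (R11 + R22) +/- sqrt(4 R12 R21 + (R11 - R22)^2) ]
lam1 = 0.5 * (tr + np.sqrt(disc))
lam2 = 0.5 * (tr - np.sqrt(disc))

# Compare against NumPy's general eigenvalue routine.
assert np.allclose(sorted([lam1, lam2]), sorted(np.linalg.eigvals(R).real))
```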
7 Eigen Analysis
Back substitution yields the eigenvectors:
[ R_1,1 − λ   R_1,2 ; R_2,1   R_2,2 − λ ] [ q_1 ; q_2 ] = [ 0 ; 0 ]
In general, this yields a set of linear equations. In the M = 2 case:
(R_1,1 − λ) q_1 + R_1,2 q_2 = 0
R_2,1 q_1 + (R_2,2 − λ) q_2 = 0
Solving the set of linear equations for a specific eigenvalue λ_i yields the corresponding eigenvector q_i.
8 Eigen Analysis
Example (Two-dimensional white noise): Let R be the correlation matrix of a two-sample vector of zero-mean white noise,
R = [ σ²  0 ; 0  σ² ]
Determine the eigenvalues and eigenvectors. Carrying out the analysis yields eigenvalues
λ_1,2 = (1/2) [ (σ² + σ²) ± sqrt( 0 + (σ² − σ²)² ) ] = σ²
and eigenvectors
q_1 = [ 1 ; 0 ] and q_2 = [ 0 ; 1 ]
Note: The eigenvectors are unit length (and orthogonal).
9 Eigen Properties
Property (eigenvalues of R^k): If λ_1, λ_2, ..., λ_M are the eigenvalues of R, then λ_1^k, λ_2^k, ..., λ_M^k are the eigenvalues of R^k.
Proof: Note Rq_i = λ_i q_i. Multiplying both sides by R, k − 1 times,
R^k q_i = λ_i R^(k−1) q_i = λ_i^k q_i
Property (linear independence of eigenvectors): The eigenvectors q_1, q_2, ..., q_M of R are linearly independent, i.e.,
Σ_{i=1}^{M} a_i q_i ≠ 0
whenever the scalars a_1, a_2, ..., a_M are not all zero.
10 Eigen Properties
Property (Correlation matrix eigenvalues are real & nonnegative): The eigenvalues of R are real and nonnegative.
Proof: Rq_i = λ_i q_i ⟹ q_i^H R q_i = λ_i q_i^H q_i [pre-multiply by q_i^H], so
λ_i = (q_i^H R q_i) / (q_i^H q_i) ≥ 0
This follows from the facts that R is positive semi-definite and q_i^H q_i = ||q_i||² > 0.
Note: In most cases, R is positive definite and λ_i > 0, i = 1, 2, ..., M.
11 Eigen Properties
Property (Distinct eigenvalues ⟹ orthogonal eigenvectors): If λ_1, λ_2, ..., λ_M are distinct eigenvalues of R, then the corresponding eigenvectors q_1, q_2, ..., q_M are orthogonal.
Proof: Rq_i = λ_i q_i ⟹ q_j^H R q_i = λ_i q_j^H q_i  (*)
Also, since λ_j is real and R is Hermitian,
Rq_j = λ_j q_j ⟹ q_j^H R = λ_j q_j^H ⟹ q_j^H R q_i = λ_j q_j^H q_i
Substituting the LHS from (*),
λ_i q_j^H q_i = λ_j q_j^H q_i
12 Eigen Properties
Thus
λ_i q_j^H q_i = λ_j q_j^H q_i  ⟹  (λ_i − λ_j) q_j^H q_i = 0
Since λ_1, λ_2, ..., λ_M are distinct, λ_i − λ_j ≠ 0 for i ≠ j, so
q_j^H q_i = 0, i ≠ j
i.e., q_1, q_2, ..., q_M are orthogonal. QED
13 Eigen Properties
Diagonalization of R. Objective: Find a transformation that transforms the correlation matrix into a diagonal matrix.
Let λ_1, λ_2, ..., λ_M be distinct eigenvalues of R and take q_1, q_2, ..., q_M to be the M orthonormal eigenvectors,
q_i^H q_j = 1 if i = j, 0 if i ≠ j
Define Q = [q_1, q_2, ..., q_M] and Ω = diag(λ_1, λ_2, ..., λ_M). Then consider
Q^H R Q = [q_1, q_2, ..., q_M]^H R [q_1, q_2, ..., q_M]
14 Eigen Properties
Q^H R Q = [q_1, ..., q_M]^H R [q_1, ..., q_M] = [q_1, ..., q_M]^H [λ_1 q_1, λ_2 q_2, ..., λ_M q_M] = diag(λ_1, λ_2, ..., λ_M)
so
Q^H R Q = Ω  (eigenvector diagonalization of R)
15 Eigen Properties
Property (Q is unitary): Q is unitary, i.e., Q^(−1) = Q^H.
Proof: Since the eigenvectors q_i are orthonormal,
Q^H Q = [q_1, q_2, ..., q_M]^H [q_1, q_2, ..., q_M] = I  ⟹  Q^(−1) = Q^H
Property (Eigen decomposition of R): The correlation matrix can be expressed as
R = Σ_{i=1}^{M} λ_i q_i q_i^H
16 Eigen Properties
Proof: The correlation diagonalization result states Q^H R Q = Ω. Isolating R and expanding,
R = Q Ω Q^H = [q_1, q_2, ..., q_M] [λ_1 q_1^H ; λ_2 q_2^H ; ... ; λ_M q_M^H] = Σ_{i=1}^{M} λ_i q_i q_i^H
Note: This also gives
R^(−1) = (Q^H)^(−1) Ω^(−1) Q^(−1) = Q Ω^(−1) Q^H
where Ω^(−1) = diag(1/λ_1, 1/λ_2, ..., 1/λ_M).
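Both identities can be verified numerically (a sketch assuming NumPy; `R` is an arbitrary symmetric positive definite matrix built for the test, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
R = B @ B.T + 4 * np.eye(4)        # arbitrary symmetric positive definite example

lam, Q = np.linalg.eigh(R)         # R = Q diag(lam) Q^T

# R = sum_i lambda_i q_i q_i^H
R_rebuilt = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(4))
assert np.allclose(R, R_rebuilt)

# R^{-1} = Q diag(1/lambda_i) Q^H
R_inv = Q @ np.diag(1.0 / lam) @ Q.T
assert np.allclose(R_inv, np.linalg.inv(R))
```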
17 Eigen Properties
Aside (trace & determinant for matrix products): trace(A) = Σ_i A_i,i, and trace(AB) = trace(BA); similarly, det(AB) = det(A) det(B).
Property (Determinant-Eigenvalue Relation): The determinant of the correlation matrix is related to the eigenvalues as follows:
det(R) = Π_{i=1}^{M} λ_i
Proof: Using R = Q Ω Q^H and the above,
det(R) = det(Q Ω Q^H) = det(Q) det(Q^H) det(Ω) = det(Ω) = Π_{i=1}^{M} λ_i
18 Eigen Properties
Property (Trace-Eigenvalue Relation): The trace of the correlation matrix is related to the eigenvalues as follows:
trace(R) = Σ_{i=1}^{M} λ_i
Proof:
trace(R) = trace(Q Ω Q^H) = trace(Q^H Q Ω) = trace(Ω) = Σ_{i=1}^{M} λ_i. QED
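A short numerical check of the determinant and trace relations (assuming NumPy; `R` is an arbitrary symmetric positive definite example):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
R = B @ B.T + np.eye(5)            # arbitrary symmetric positive definite example

lam = np.linalg.eigvalsh(R)        # eigenvalues of a Hermitian matrix

assert np.isclose(np.linalg.det(R), np.prod(lam))   # det(R) = product of eigenvalues
assert np.isclose(np.trace(R), np.sum(lam))         # trace(R) = sum of eigenvalues
```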
19 Eigen Properties
Definition (Normal Matrix): A complex square matrix A is a normal matrix if A^H A = A A^H. That is, a matrix is normal if it commutes with its conjugate transpose.
Note: All Hermitian matrices are normal, and every matrix that can be diagonalized by a unitary transform is normal.
Definition (Condition Number): The condition number reflects how numerically well conditioned a problem is, i.e., a low condition number ⟹ well conditioned; a high condition number ⟹ ill conditioned.
20 Eigen Properties
Definition (Condition Number for Linear Systems): For a linear system Ax = b defined by a normal matrix A, the condition number is
χ(A) = |λ_max| / |λ_min|
where λ_max and λ_min are the maximum- and minimum-magnitude eigenvalues of A.
Observations: Large eigenvalue spread ⟹ ill conditioned. Small eigenvalue spread ⟹ well conditioned.
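For a symmetric positive definite matrix the eigenvalue ratio coincides with the 2-norm condition number computed by NumPy, since in that case the singular values equal the eigenvalues (a sketch assuming NumPy; `R` is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
R = B @ B.T + np.eye(4)            # arbitrary symmetric positive definite example

lam = np.linalg.eigvalsh(R)        # all positive here
chi = lam.max() / lam.min()        # chi(R) = lambda_max / lambda_min

# For SPD matrices, NumPy's 2-norm condition number (sigma_max / sigma_min)
# equals the eigenvalue ratio.
assert np.isclose(chi, np.linalg.cond(R, 2))
```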
21 SVD
Matrix-Vector Multiplication. Example in 2D: for a 2 × 2 matrix A and a vector x, consider y = Ax. What is the geometrical meaning of the matrix-vector multiplication?
22 SVD
Matrix-Vector Multiplication. The product y = Ax rotates the vector (by some angle θ) and stretches it.
23 SVD
Matrix-Vector Multiplication. To rotate x by an angle θ, we pre-multiply by
A = [ cos θ  −sin θ ; sin θ  cos θ ]
To stretch x by a factor α, we pre-multiply by
A = [ α  0 ; 0  α ]
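A minimal sketch of the two operations (assuming NumPy; the angle and stretch factor are arbitrary examples):

```python
import numpy as np

theta = np.pi / 4                   # example rotation angle (45 degrees)
A_rot = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

alpha = 2.0                         # example stretch factor
A_str = alpha * np.eye(2)

x = np.array([1.0, 0.0])
y = A_rot @ x                       # rotated: length preserved, direction changed
z = A_str @ x                       # stretched: direction preserved, length scaled

assert np.isclose(np.linalg.norm(y), np.linalg.norm(x))
assert np.isclose(np.linalg.norm(z), alpha * np.linalg.norm(x))
```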
24 SVD
Matrix-Vector Multiplication. Consider the vectors v_1 and v_2 depicting a circle. What happens to the circle under matrix multiplication?
25 SVD
Matrix-Vector Multiplication. What happens to the 2D circle under matrix multiplication? Note: Orthogonality holds since they are all rotated by the same angle.
26 SVD Matrix-Vector Multiplication What happens to the n-d hyper-sphere under matrix multiplication? 25/79
27 SVD
n-dim Hyper-Sphere Mapping to n-dim Hyper-Ellipsoid. The mapping can be written as
A v_1 = σ_1 û_1, ..., A v_n = σ_n û_n
Expressed in matrix form as
A [v_1 v_2 ... v_n] = [û_1 û_2 ... û_n] diag(σ_1, ..., σ_n)
that is, AV = Û Σ̂, with A ∈ C^(m×n), V ∈ C^(n×n), Û ∈ C^(m×n), Σ̂ ∈ C^(n×n).
28 SVD
n-dim Hyper-Sphere Mapping to n-dim Hyper-Ellipsoid. Let v_1, ..., v_n be orthonormal vectors; then V = [v_1 v_2 ... v_n] is a unitary transformation matrix, that is, V^(−1) = V^H. Let û_1, ..., û_n be orthonormal vectors; then Û = [û_1 û_2 ... û_n] has orthonormal columns, that is, Û^H Û = I.
29 SVD
Reduced Singular Value Decomposition. The mapping is thus given by AV = Û Σ̂. Multiplying both sides on the right by V^(−1) = V^H, we obtain:
A V V^H = Û Σ̂ V^H  ⟹  A = Û Σ̂ V^H
where Σ̂ = diag(σ_1, σ_2, ..., σ_n), with σ_1 ≥ σ_2 ≥ ... ≥ σ_p ≥ 0 and p = min{m, n}.
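The reduced SVD above maps directly to NumPy's `svd` with `full_matrices=False` (a sketch; `A` is an arbitrary tall example matrix):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 3))    # arbitrary m x n example with m > n

# full_matrices=False gives the reduced SVD: U_hat (m x n), s (n,), Vh (n x n).
U_hat, s, Vh = np.linalg.svd(A, full_matrices=False)

assert U_hat.shape == (6, 3) and Vh.shape == (3, 3)
assert np.all(s[:-1] >= s[1:]) and np.all(s >= 0)   # sigma_1 >= ... >= sigma_n >= 0
assert np.allclose(A, U_hat @ np.diag(s) @ Vh)      # A = U_hat Sigma_hat V^H
```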
30 SVD
Singular Value Decomposition. (Figure: reduced SVD vs. full SVD.)
31 SVD
Theorem 1: Every matrix A ∈ C^(m×n) has a singular value decomposition (SVD). The singular values σ_j are uniquely determined. If A is square and the σ_j are distinct, the singular vectors u_j and v_j are unique up to a complex sign (i.e., unique if the complex sign is ignored).
32 SVD
SVD calculation. Start with A^H A:
A^H A = (U Σ V^H)^H (U Σ V^H) = V Σ U^H U Σ V^H = V Σ² V^H
A^H A V = V Σ²
This reduces to an eigenvalue decomposition problem of the form B V = V Λ, with B = A^H A and Λ = Σ², where Λ is a diagonal matrix with the eigenvalues of B and V corresponds to the eigenvectors of B.
33 SVD
SVD calculation. How do we calculate U? Consider A A^H:
A A^H = (U Σ V^H)(U Σ V^H)^H = U Σ V^H V Σ U^H = U Σ² U^H
A A^H U = U Σ²
Again an eigenvalue problem, B U = U Λ, with B = A A^H and Λ = Σ², where Λ is a diagonal matrix with the eigenvalues of B and U corresponds to the eigenvectors of B.
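The connection between the SVD and the eigendecompositions of A^H A and A A^H can be checked numerically (a sketch assuming NumPy; `A` is an arbitrary example; singular vectors carry a sign ambiguity, so only the singular values are compared):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))    # arbitrary m x n example

# Eigenvalues of A^H A are the squared singular values of A.
lam = np.linalg.eigvalsh(A.T @ A)  # ascending order
s_from_eig = np.sqrt(lam[::-1])    # descending, like svd's output

s = np.linalg.svd(A, compute_uv=False)
assert np.allclose(s_from_eig, s)

# A A^H has the same nonzero eigenvalues, padded with (numerically tiny) zeros.
lam2 = np.linalg.eigvalsh(A @ A.T)
assert np.allclose(np.sqrt(np.maximum(lam2[::-1][:3], 0.0)), s)
```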
34 SVD Movie Rating - A Solution Describe a movie as an array of factors, e.g. comedy, action... Describe each viewer using same factors, e.g. likes comedy, likes action, etc Rating based on match/mismatch More factors better prediction 33/79
35 SVD
Singular Value Decomposition Solution. Viewers rated movies on a scale from 1 to 5, with 0 for movies that were not rated by the user. Each column j is a different movie; each row i is a different viewer; each element a_i,j represents the rating of movie j by viewer i:
A = [ a_1,1 ... a_1,n ; ... ; a_m,1 ... a_m,n ]
Goal: Use the SVD to predict unobserved data, e.g., the rating of a movie that hasn't come out yet.
36 SVD
Singular Value Decomposition Solution. We want to classify Movies and Viewers into categories (Category 1, Category 2, Category 3, ...). Intuitively, if Movie_1 ≈ Movie_2, these movies are similar (same category). The categories are determined by the matrix A and the SVD algorithm.
37 SVD
Singular Value Decomposition Solution. Now consider that each movie belongs to more than one category, e.g., half comedy and half action. This can be written as:
Movie_j = v_1 Cat_1 + v_2 Cat_2 + ... + v_n Cat_n, s.t. ||v||_2 = 1
where the set of categories {Cat_i} forms an orthonormal basis.
38 SVD
Singular Value Decomposition Solution. For Viewers, we use the same Movie categories. E.g., a viewer that loves comedy is represented with the same unit vector as the comedy-category movies. Each Viewer is represented as:
Viewer_i = u_1 Cat_1 + u_2 Cat_2 + ... + u_n Cat_n, s.t. ||u||_2 = 1
39 SVD
Singular Value Decomposition Solution. From Theorem 1, there exists a unique decomposition into categories: every matrix A ∈ C^(m×n) can be factorized as A = Û Σ V^H.
40 SVD
Singular Value Decomposition Solution. Each row vector u_i in Û represents the taste of Viewer_i on the corresponding categories:
Û = [ u_1,1 ... u_1,n ; ... ; u_m,1 ... u_m,n ] = [ u_1 ; u_2 ; ... ; u_m ]
41 SVD
Singular Value Decomposition Solution. Each column v_j in V^H represents the content of Movie_j on the corresponding categories:
V^H = [ v_1,1 ... v_1,n ; ... ; v_n,1 ... v_n,n ] = [ v_1 v_2 ... v_n ]
42 SVD
Singular Value Decomposition Solution. Each singular value σ_i,i in Σ computes how a viewer of category i rates a movie of the same category i:
Σ = diag(σ_1,1, σ_2,2, ..., σ_n,n)
43 SVD Singular Value Decomposition Solution We have more viewers than movies: New categories are created. The new vectors are still unit vectors orthonormal to all the basis vectors but the ratings of these useless categories are zero. 42/79
44 SVD
Singular Value Decomposition Solution. The representation of each movie and each viewer can be obtained by
Movie_j = v_1,j Cat_1 + v_2,j Cat_2 + ... + v_n,j Cat_n, s.t. ||v_j||_2 = 1; scaled by the singular values, √Σ v_j has components v_i,j √σ_i,i.
Viewer_i = u_i,1 Cat_1 + u_i,2 Cat_2 + ... + u_i,m Cat_m, s.t. ||u_i||_2 = 1; similarly, u_i √Σ has components u_i,k √σ_k,k.
45 SVD
Singular Value Decomposition Solution. Given the decomposition of a movie and a viewer, the rating is estimated by:
Viewer_i · Movie_j = u_i,1 v_1,j σ_1,1 + u_i,2 v_2,j σ_2,2 + ... + u_i,n v_n,j σ_n,n = (u_i √Σ)(√Σ v_j) = u_i Σ v_j
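The rating formula is just the (i, j) entry of U Σ V^H written out term by term, which can be sketched numerically (assuming NumPy; the toy ratings matrix below is an arbitrary illustration, not the dataset from the slides):

```python
import numpy as np

# Toy ratings matrix (viewers x movies); values are illustrative only.
A = np.array([[5.0, 4.0, 1.0],
              [4.0, 5.0, 2.0],
              [1.0, 1.0, 5.0],
              [2.0, 1.0, 4.0]])

U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Rating of movie j by viewer i: sum_k u_{i,k} sigma_k v_{k,j}
i, j = 1, 2
rating = sum(U[i, k] * s[k] * Vh[k, j] for k in range(len(s)))

# With all singular values kept, this reproduces the observed entry exactly.
assert np.isclose(rating, A[i, j])
```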
46 SVD
Singular Value Decomposition - Example. Considering the ratings from 60 viewers of 16 movies of 4 different genres (action, romance, sci-fi, comedy), we generate A ∈ R^(60×16). Viewers rated movies on a scale from 1 to 5, with 0 for movies that were not rated by the user. Observe the same 4 categories of viewers.
47-49 SVD
Singular Value Decomposition - Example. (Figure slides.)
50 SVD
Singular Value Decomposition - Example. To estimate unrated movies (zero entries in A), we use additional information: A is known to be low-rank or approximately low-rank. Thus, we use the rank-k approximation of the matrix A, that is:
Â = Û Σ_k V^H
where Σ_k has all but the first k singular values σ_i,i set to zero. The ratings different from zero in A are reset to their original values.
Note: The ratings matrix A is expected to be low-rank since user preferences can be described by a few categories (k), such as the movie genres.
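The truncation step can be sketched as follows (assuming NumPy; `A` is an arbitrary example matrix, and the check uses the Eckart-Young fact that the Frobenius error of the best rank-k approximation is the root-sum-square of the discarded singular values):

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((8, 5))    # arbitrary example matrix

U, s, Vh = np.linalg.svd(A, full_matrices=False)

k = 2
s_k = np.concatenate([s[:k], np.zeros(len(s) - k)])  # keep only first k singular values
A_k = U @ np.diag(s_k) @ Vh                          # rank-k approximation A_hat

assert np.linalg.matrix_rank(A_k) == k
# Eckart-Young: ||A - A_k||_F = sqrt(sum of squared discarded singular values).
err = np.linalg.norm(A - A_k, 'fro')
assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))
```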
51-52 SVD
Singular Value Decomposition - Example. (Figure slides.)
53 PCA
Principal Component Analysis (PCA): a simple method for extracting relevant information from confusing data sets. How do we reduce a complex data set to a lower dimension? Consider a mass attached to a spring which oscillates as shown below. What if we did not know that F = ma?
54 PCA
PCA - Motivation: Toy example. Since we live in a 3D world, use three cameras to capture data from the system. With no information about the real x, y, and z axes, the camera positions are chosen arbitrarily. How do we get from this data set to a simple equation of z?
55 PCA PCA - Motivation: Toy example Three cameras give redundant information. Only one camera at a specific angle necessary to describe the system behavior. PCA is used to avoid redundancy. 54/79
56 PCA
Framework: Change of Basis. The goal of PCA is to identify the most meaningful basis to re-express a dataset. Let the basis of representation for our samples be the naive choice, the identity matrix:
B = [ b_1 ; b_2 ; ... ; b_m ] = I    (1)
Note all the recorded data can be trivially expressed as a linear combination of the b_i.
57 PCA
Change of Basis. PCA: Is there another basis, a linear combination of the original basis, that best represents the data set? Let X be the original data set, where each column is a single measurement set. Let Y be a linear transformation of X by P, i.e., Y = PX.
Implications: Geometrically, P is a rotation and a stretch which transforms X into Y. The rows of P, {p_1, ..., p_m}, are a set of new basis vectors for expressing the columns of X.
What is the best way to re-express X? What is a good choice for P?
58 PCA
Noise. Signal and noise variances are denoted σ²_signal and σ²_noise. The largest direction of variance is not along the natural basis but along the best-fit line. The directions with largest variance contain the dynamics of interest. Intuition: find the direction indicated by σ_signal.
59 PCA
Redundancy. Figures depict possible plots between two arbitrary measurement types r_1 and r_2. Low redundancy ⟹ uncorrelated recordings. High redundancy ⟹ correlated recordings, e.g., the sensors are too close or the measured variables are equivalent. If recordings are highly correlated, it is not necessary to measure both of them.
60 PCA
PCA - Basic concepts. Let a = [a_1, a_2, ..., a_n] and b = [b_1, b_2, ..., b_n] be two sets of measurements. Are they related? If the means of a and b are zero, then:
Variance (how large the change is in each vector):
σ²_a = (1/n) a aᵀ = (1/n) Σ_i a_i²,  σ²_b = (1/n) b bᵀ = (1/n) Σ_i b_i²
Covariance (statistical relationship between the data in a and b):
σ²_ab = (1/n) a bᵀ = (1/n) Σ_i a_i b_i
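These definitions translate directly to code (a sketch assuming NumPy; the measurements are random stand-ins, centered to match the zero-mean assumption on the slide):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000
a = rng.standard_normal(n)
b = rng.standard_normal(n)
a -= a.mean()                      # enforce the zero-mean assumption
b -= b.mean()

var_a  = (a @ a) / n               # sigma_a^2  = (1/n) a a^T
var_b  = (b @ b) / n               # sigma_b^2  = (1/n) b b^T
cov_ab = (a @ b) / n               # sigma_ab^2 = (1/n) a b^T

# For zero-mean data these agree with the usual moment estimates.
assert np.isclose(var_a, np.mean(a ** 2))
assert np.isclose(cov_ab, np.mean(a * b))
```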
61 PCA
Variance and Covariance. Let X be defined as X = [x_1ᵀ ... x_mᵀ]ᵀ, where x_i corresponds to all measurements of a particular type. Then the covariance matrix is defined as:
C_X = (1/n) X X^H
The covariance values reflect the noise and redundancy in the measurements.
62 PCA
Variance and Covariance. Recall C_X is the covariance matrix of X, defined as C_X = (1/n) X X^H. In the spring example, C_X ∈ R^(6×6), with entries σ²_ab for a, b ∈ {y_1, z_1, y_2, z_2, y_3, z_3} (the two coordinates recorded by each of the three cameras):
C_X = [ σ²_(y1,y1) σ²_(y1,z1) ... σ²_(y1,z3) ; ... ; σ²_(z3,y1) σ²_(z3,z1) ... σ²_(z3,z3) ]
Diagonal: variance measures. Off-diagonal: covariance between all pairs. C_X is Hermitian symmetric, i.e., C_X = C_X^H.
63 PCA
Covariance Matrix Interpretation.
Off-diagonal terms: if the covariance is large, the components are statistically dependent; if the covariance is small, the components are statistically independent.
Diagonal terms: if the variance is large, the component contains a lot of information about the system; if the variance is small, it does not provide significant information about the system.
64 PCA
PCA Goal: Change basis such that the covariance matrix of the data is diagonal. If the off-diagonal terms are ≈ 0, the redundancies are eliminated. The diagonal terms represent the variance of each component; components with large variance are the most representative.
65 PCA
PCA and Eigenvalue Decomposition. How do we solve the problem? Data set: X ∈ R^(m×n), where m is the number of measurement types and n is the number of samples.
PCA: Find an orthonormal matrix P in Y = PX such that C_Y = (1/n) Y Yᵀ is a diagonal matrix. The rows of P are the principal components of X.
66 PCA
PCA and Eigenvalue Decomposition. We begin by rewriting C_Y in terms of the unknown variable:
C_Y = (1/n) Y Yᵀ = (1/n) (PX)(PX)ᵀ = (1/n) P X Xᵀ Pᵀ = P ((1/n) X Xᵀ) Pᵀ = P C_X Pᵀ
67 PCA
PCA and Eigenvalue Decomposition. C_X can be diagonalized by an orthogonal matrix of its eigenvectors since it is a symmetric matrix. Let P = Qᵀ, where Q is the matrix of eigenvectors of (1/n) X Xᵀ (so C_X = Q Ω Qᵀ). Then:
C_Y = P C_X Pᵀ = P (Q Ω Qᵀ) Pᵀ = P (Pᵀ Ω P) Pᵀ = (P P^(−1)) Ω (P P^(−1)) = Ω
The transformation Y = PX diagonalizes the system. The covariance of Y is a diagonal matrix with the eigenvalues of (1/n) X Xᵀ.
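The eigendecomposition route can be sketched as follows (assuming NumPy; random zero-mean data stands in for the measurements, and `P = Q.T` as on the slide):

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 3, 500
X = rng.standard_normal((m, n))
X = X - X.mean(axis=1, keepdims=True)   # zero-mean each measurement type (row)

C_X = X @ X.T / n                       # covariance matrix (m x m)
lam, Q = np.linalg.eigh(C_X)            # C_X = Q Omega Q^T

P = Q.T                                 # rows of P are the principal components
Y = P @ X
C_Y = Y @ Y.T / n

# The transformed covariance is diagonal, with the eigenvalues of C_X.
assert np.allclose(C_Y, np.diag(lam), atol=1e-10)
```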
68 PCA
PCA and SVD. The SVD of X is given by X = U Σ V^H. Let P = U^H; then Y = U^H X, and the covariance matrix of Y is given by:
C_Y = (1/n) Y Y^H = (1/n) U^H X X^H U = (1/n) U^H U Σ V^H V Σ U^H U = (1/n) Σ²
69 PCA
PCA. The transformation Y = U^H X diagonalizes the system. The covariance of Y is a diagonal matrix with the squared singular values of X multiplied by a factor of 1/n. It can be concluded that (1/n) Σ² = Ω, i.e., σ_i²/n = λ_i. The principal components of the data matrix are given by U^H.
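The equivalence of the two routes, λ_i = σ_i²/n, can be checked directly (a sketch assuming NumPy; random zero-mean data again stands in for the measurements):

```python
import numpy as np

rng = np.random.default_rng(8)
m, n = 3, 400
X = rng.standard_normal((m, n))
X = X - X.mean(axis=1, keepdims=True)        # zero-mean rows

# Eigen route: eigenvalues of C_X = (1/n) X X^T, sorted descending.
lam = np.linalg.eigvalsh(X @ X.T / n)[::-1]

# SVD route: singular values of X itself.
s = np.linalg.svd(X, compute_uv=False)

# lambda_i = sigma_i^2 / n
assert np.allclose(lam, s ** 2 / n)
```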
70 PCA
Application: Face Recognition. PCA in face recognition ⟹ Eigenfaces. Intuition: figure out the correlation between the rows/columns of A from the SVD,
A = U Σ V^H    (2)
How important each direction is: Σ. Principal directions: U. How each individual component (row/column) projects onto the principal components: V.
71 PCA
Data in Face Recognition. The data matrix for the face recognition problem is constructed by vectorizing the face images as shown below, i.e., A = [A_1ᵀ A_2ᵀ ... A_Nᵀ]ᵀ. The matrix will be N × M, where N is the number of images in the data base and M is the number of pixels of each image.
72 PCA
Example - Celebrity Images. For example, take 5 images of each celebrity: George Clooney, Bruce Willis, Margaret Thatcher and Matt Damon. In the example, N = 20 (5 images of each of the 4 celebrities) and M is the number of pixels per image.
73 PCA
Average Faces. What does the average of the faces of these celebrities look like? For each celebrity,
ā_i = (1/5) Σ_(j=1)^(5) A_j    (3)
averaging over that celebrity's 5 images.
74 PCA
Average Faces. What defines George Clooney's face? Take the data matrix A ∈ R^(N×M) with the images of the example and compute the correlation matrix of the features of the dataset, i.e., the pixels: C = Aᵀ A ∈ R^(M×M). High correlation values: everybody has eyes, a nose and a mouth. Correlations between images of the same person will be higher.
75 PCA
Eigendecomposition. Obtain the eigenvalue decomposition of C = Aᵀ A, that is, C = Q Ω Q^(−1). The first eigenvectors q_i ∈ R^(M×1) are called the principal components (eigenfaces). One can reconstruct each face as a weighted sum of the eigenvectors.
76 PCA
Representing Faces onto the Basis. Each face A_i ∈ R^(1×M) in the data set A = [A_1ᵀ A_2ᵀ ... A_Nᵀ]ᵀ can be represented as a linear combination of the best K eigenvectors:
A_iᵀ = Σ_(j=1)^(K) w_j q_j, where w_j = q_jᵀ A_iᵀ    (4)
77 PCA
Projection of the Average Faces onto the K = 20 Largest Eigenvectors. Q is M × M; here let Q be the matrix formed by the first 20 eigenvectors, i.e., Q ∈ R^(M×20). Project the average faces ā_i onto the reduced eigenvector space, i.e., projection = ā_i Q. The projections for each face are characteristic of each average face and could be used for classification purposes.
78 PCA
Projection of New Images. Test set: a new image of Margaret Thatcher, Meryl Streep as Margaret Thatcher in The Iron Lady, and Betty White. Project the test images onto the eigenvector space, p = xQ, where x ∈ R^(1×M) is the new vectorized image and Q is the matrix with the first 20 eigenvectors of the database. Reconstruct images as x̂ = Q pᵀ. The error is defined as the difference between the projection of the new image and the projection of the original Margaret Thatcher images A_j Q, j = 1, ..., 5, that is
E_j = ||A_j Q − xQ|| / ||A_j Q||
where A_j are the original images of the database, in this case the 5 images of Margaret Thatcher.
79 PCA
Projection of New Images. The image depicts, from left to right: the test images; the projection of the test images onto the eigenvector space, p = xQ; the reconstructed images using the first 20 eigenvectors of the database, x̂ = Q pᵀ; and the error of the projection with respect to each original Margaret Thatcher image.
80 PCA
Projection of New Images. (Figure slide.)
More informationMATH36001 Generalized Inverses and the SVD 2015
MATH36001 Generalized Inverses and the SVD 201 1 Generalized Inverses of Matrices A matrix has an inverse only if it is square and nonsingular. However there are theoretical and practical applications
More informationPCA, Kernel PCA, ICA
PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per
More informationLinear Algebra Review. Fei-Fei Li
Linear Algebra Review Fei-Fei Li 1 / 51 Vectors Vectors and matrices are just collections of ordered numbers that represent something: movements in space, scaling factors, pixel brightnesses, etc. A vector
More informationLinear Algebra Review. Fei-Fei Li
Linear Algebra Review Fei-Fei Li 1 / 37 Vectors Vectors and matrices are just collections of ordered numbers that represent something: movements in space, scaling factors, pixel brightnesses, etc. A vector
More informationMath Bootcamp An p-dimensional vector is p numbers put together. Written as. x 1 x =. x p
Math Bootcamp 2012 1 Review of matrix algebra 1.1 Vectors and rules of operations An p-dimensional vector is p numbers put together. Written as x 1 x =. x p. When p = 1, this represents a point in the
More informationLinear Algebra Review. Vectors
Linear Algebra Review 9/4/7 Linear Algebra Review By Tim K. Marks UCSD Borrows heavily from: Jana Kosecka http://cs.gmu.edu/~kosecka/cs682.html Virginia de Sa (UCSD) Cogsci 8F Linear Algebra review Vectors
More informationMIT Final Exam Solutions, Spring 2017
MIT 8.6 Final Exam Solutions, Spring 7 Problem : For some real matrix A, the following vectors form a basis for its column space and null space: C(A) = span,, N(A) = span,,. (a) What is the size m n of
More informationFoundations of Matrix Analysis
1 Foundations of Matrix Analysis In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text For most of the proofs as well as for the details, the
More informationLecture 5 Singular value decomposition
Lecture 5 Singular value decomposition Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn
More informationLecture Notes 1: Vector spaces
Optimization-based data analysis Fall 2017 Lecture Notes 1: Vector spaces In this chapter we review certain basic concepts of linear algebra, highlighting their application to signal processing. 1 Vector
More informationLinear Algebra Fundamentals
Linear Algebra Fundamentals It can be argued that all of linear algebra can be understood using the four fundamental subspaces associated with a matrix. Because they form the foundation on which we later
More informationEE731 Lecture Notes: Matrix Computations for Signal Processing
EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University October 17, 005 Lecture 3 3 he Singular Value Decomposition
More informationStat 159/259: Linear Algebra Notes
Stat 159/259: Linear Algebra Notes Jarrod Millman November 16, 2015 Abstract These notes assume you ve taken a semester of undergraduate linear algebra. In particular, I assume you are familiar with the
More informationSingular Value Decomposition
Chapter 5 Singular Value Decomposition We now reach an important Chapter in this course concerned with the Singular Value Decomposition of a matrix A. SVD, as it is commonly referred to, is one of the
More informationLinear Algebra Primer
Linear Algebra Primer David Doria daviddoria@gmail.com Wednesday 3 rd December, 2008 Contents Why is it called Linear Algebra? 4 2 What is a Matrix? 4 2. Input and Output.....................................
More informationBackground Mathematics (2/2) 1. David Barber
Background Mathematics (2/2) 1 David Barber University College London Modified by Samson Cheung (sccheung@ieee.org) 1 These slides accompany the book Bayesian Reasoning and Machine Learning. The book and
More informationhttps://goo.gl/kfxweg KYOTO UNIVERSITY Statistical Machine Learning Theory Sparsity Hisashi Kashima kashima@i.kyoto-u.ac.jp DEPARTMENT OF INTELLIGENCE SCIENCE AND TECHNOLOGY 1 KYOTO UNIVERSITY Topics:
More informationNotes on singular value decomposition for Math 54. Recall that if A is a symmetric n n matrix, then A has real eigenvalues A = P DP 1 A = P DP T.
Notes on singular value decomposition for Math 54 Recall that if A is a symmetric n n matrix, then A has real eigenvalues λ 1,, λ n (possibly repeated), and R n has an orthonormal basis v 1,, v n, where
More informationSTA141C: Big Data & High Performance Statistical Computing
STA141C: Big Data & High Performance Statistical Computing Numerical Linear Algebra Background Cho-Jui Hsieh UC Davis May 15, 2018 Linear Algebra Background Vectors A vector has a direction and a magnitude
More informationCOMP 558 lecture 18 Nov. 15, 2010
Least squares We have seen several least squares problems thus far, and we will see more in the upcoming lectures. For this reason it is good to have a more general picture of these problems and how to
More informationEigenvalues and diagonalization
Eigenvalues and diagonalization Patrick Breheny November 15 Patrick Breheny BST 764: Applied Statistical Modeling 1/20 Introduction The next topic in our course, principal components analysis, revolves
More informationMATH 423 Linear Algebra II Lecture 33: Diagonalization of normal operators.
MATH 423 Linear Algebra II Lecture 33: Diagonalization of normal operators. Adjoint operator and adjoint matrix Given a linear operator L on an inner product space V, the adjoint of L is a transformation
More informationSingular Value Decomposition and Principal Component Analysis (PCA) I
Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression
More informationSymmetric and anti symmetric matrices
Symmetric and anti symmetric matrices In linear algebra, a symmetric matrix is a square matrix that is equal to its transpose. Formally, matrix A is symmetric if. A = A Because equal matrices have equal
More informationLinear Algebra in Actuarial Science: Slides to the lecture
Linear Algebra in Actuarial Science: Slides to the lecture Fall Semester 2010/2011 Linear Algebra is a Tool-Box Linear Equation Systems Discretization of differential equations: solving linear equations
More informationa 11 a 12 a 11 a 12 a 13 a 21 a 22 a 23 . a 31 a 32 a 33 a 12 a 21 a 23 a 31 a = = = = 12
24 8 Matrices Determinant of 2 2 matrix Given a 2 2 matrix [ ] a a A = 2 a 2 a 22 the real number a a 22 a 2 a 2 is determinant and denoted by det(a) = a a 2 a 2 a 22 Example 8 Find determinant of 2 2
More informationMachine Learning - MT & 14. PCA and MDS
Machine Learning - MT 2016 13 & 14. PCA and MDS Varun Kanade University of Oxford November 21 & 23, 2016 Announcements Sheet 4 due this Friday by noon Practical 3 this week (continue next week if necessary)
More informationON THE SINGULAR DECOMPOSITION OF MATRICES
An. Şt. Univ. Ovidius Constanţa Vol. 8, 00, 55 6 ON THE SINGULAR DECOMPOSITION OF MATRICES Alina PETRESCU-NIŢǍ Abstract This paper is an original presentation of the algorithm of the singular decomposition
More informationSingular Value Decomposition
Singular Value Decomposition CS 205A: Mathematical Methods for Robotics, Vision, and Graphics Doug James (and Justin Solomon) CS 205A: Mathematical Methods Singular Value Decomposition 1 / 35 Understanding
More informationQuantum Computing Lecture 2. Review of Linear Algebra
Quantum Computing Lecture 2 Review of Linear Algebra Maris Ozols Linear algebra States of a quantum system form a vector space and their transformations are described by linear operators Vector spaces
More informationTBP MATH33A Review Sheet. November 24, 2018
TBP MATH33A Review Sheet November 24, 2018 General Transformation Matrices: Function Scaling by k Orthogonal projection onto line L Implementation If we want to scale I 2 by k, we use the following: [
More informationPrincipal Component Analysis -- PCA (also called Karhunen-Loeve transformation)
Principal Component Analysis -- PCA (also called Karhunen-Loeve transformation) PCA transforms the original input space into a lower dimensional space, by constructing dimensions that are linear combinations
More informationLinear Subspace Models
Linear Subspace Models Goal: Explore linear models of a data set. Motivation: A central question in vision concerns how we represent a collection of data vectors. The data vectors may be rasterized images,
More informationReview of some mathematical tools
MATHEMATICAL FOUNDATIONS OF SIGNAL PROCESSING Fall 2016 Benjamín Béjar Haro, Mihailo Kolundžija, Reza Parhizkar, Adam Scholefield Teaching assistants: Golnoosh Elhami, Hanjie Pan Review of some mathematical
More informationPrincipal Component Analysis
Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used
More information. = V c = V [x]v (5.1) c 1. c k
Chapter 5 Linear Algebra It can be argued that all of linear algebra can be understood using the four fundamental subspaces associated with a matrix Because they form the foundation on which we later work,
More informationUnsupervised Machine Learning and Data Mining. DS 5230 / DS Fall Lecture 7. Jan-Willem van de Meent
Unsupervised Machine Learning and Data Mining DS 5230 / DS 4420 - Fall 2018 Lecture 7 Jan-Willem van de Meent DIMENSIONALITY REDUCTION Borrowing from: Percy Liang (Stanford) Dimensionality Reduction Goal:
More informationWhat is Principal Component Analysis?
What is Principal Component Analysis? Principal component analysis (PCA) Reduce the dimensionality of a data set by finding a new set of variables, smaller than the original set of variables Retains most
More informationLecture Notes 2: Matrices
Optimization-based data analysis Fall 2017 Lecture Notes 2: Matrices Matrices are rectangular arrays of numbers, which are extremely useful for data analysis. They can be interpreted as vectors in a vector
More informationDeep Learning Basics Lecture 7: Factor Analysis. Princeton University COS 495 Instructor: Yingyu Liang
Deep Learning Basics Lecture 7: Factor Analysis Princeton University COS 495 Instructor: Yingyu Liang Supervised v.s. Unsupervised Math formulation for supervised learning Given training data x i, y i
More informationSingular Value Decomposition and its. SVD and its Applications in Computer Vision
Singular Value Decomposition and its Applications in Computer Vision Subhashis Banerjee Department of Computer Science and Engineering IIT Delhi October 24, 2013 Overview Linear algebra basics Singular
More informationIntroduction to Machine Learning
10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what
More informationAnnouncements (repeat) Principal Components Analysis
4/7/7 Announcements repeat Principal Components Analysis CS 5 Lecture #9 April 4 th, 7 PA4 is due Monday, April 7 th Test # will be Wednesday, April 9 th Test #3 is Monday, May 8 th at 8AM Just hour long
More informationMatrices A brief introduction
Matrices A brief introduction Basilio Bona DAUIN Politecnico di Torino Semester 1, 2014-15 B. Bona (DAUIN) Matrices Semester 1, 2014-15 1 / 41 Definitions Definition A matrix is a set of N real or complex
More informationReview problems for MA 54, Fall 2004.
Review problems for MA 54, Fall 2004. Below are the review problems for the final. They are mostly homework problems, or very similar. If you are comfortable doing these problems, you should be fine on
More informationLinear Algebra for Machine Learning. Sargur N. Srihari
Linear Algebra for Machine Learning Sargur N. srihari@cedar.buffalo.edu 1 Overview Linear Algebra is based on continuous math rather than discrete math Computer scientists have little experience with it
More information1 Singular Value Decomposition and Principal Component
Singular Value Decomposition and Principal Component Analysis In these lectures we discuss the SVD and the PCA, two of the most widely used tools in machine learning. Principal Component Analysis (PCA)
More informationLinear Algebra. Session 12
Linear Algebra. Session 12 Dr. Marco A Roque Sol 08/01/2017 Example 12.1 Find the constant function that is the least squares fit to the following data x 0 1 2 3 f(x) 1 0 1 2 Solution c = 1 c = 0 f (x)
More informationforms Christopher Engström November 14, 2014 MAA704: Matrix factorization and canonical forms Matrix properties Matrix factorization Canonical forms
Christopher Engström November 14, 2014 Hermitian LU QR echelon Contents of todays lecture Some interesting / useful / important of matrices Hermitian LU QR echelon Rewriting a as a product of several matrices.
More informationMatrix Vector Products
We covered these notes in the tutorial sessions I strongly recommend that you further read the presented materials in classical books on linear algebra Please make sure that you understand the proofs and
More informationComputational Methods. Eigenvalues and Singular Values
Computational Methods Eigenvalues and Singular Values Manfred Huber 2010 1 Eigenvalues and Singular Values Eigenvalues and singular values describe important aspects of transformations and of data relations
More informationCheng Soon Ong & Christian Walder. Canberra February June 2017
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2017 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 141 Part III
More informationBasic Concepts in Matrix Algebra
Basic Concepts in Matrix Algebra An column array of p elements is called a vector of dimension p and is written as x p 1 = x 1 x 2. x p. The transpose of the column vector x p 1 is row vector x = [x 1
More informationLecture 2: Linear Algebra Review
EE 227A: Convex Optimization and Applications January 19 Lecture 2: Linear Algebra Review Lecturer: Mert Pilanci Reading assignment: Appendix C of BV. Sections 2-6 of the web textbook 1 2.1 Vectors 2.1.1
More informationAPPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.
APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product
More informationData Mining Lecture 4: Covariance, EVD, PCA & SVD
Data Mining Lecture 4: Covariance, EVD, PCA & SVD Jo Houghton ECS Southampton February 25, 2019 1 / 28 Variance and Covariance - Expectation A random variable takes on different values due to chance The
More informationPrincipal Component Analysis and Linear Discriminant Analysis
Principal Component Analysis and Linear Discriminant Analysis Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1/29
More informationThroughout these notes we assume V, W are finite dimensional inner product spaces over C.
Math 342 - Linear Algebra II Notes Throughout these notes we assume V, W are finite dimensional inner product spaces over C 1 Upper Triangular Representation Proposition: Let T L(V ) There exists an orthonormal
More informationMath 102, Winter Final Exam Review. Chapter 1. Matrices and Gaussian Elimination
Math 0, Winter 07 Final Exam Review Chapter. Matrices and Gaussian Elimination { x + x =,. Different forms of a system of linear equations. Example: The x + 4x = 4. [ ] [ ] [ ] vector form (or the column
More informationIntroduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin
1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)
More information