Linear Algebra Methods for Data Mining

Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi

Spring 2007

2. Basic Linear Algebra, continued

University of Helsinki

Covered so far:

- matrix-vector multiplication, matrix-matrix multiplication
- vector norms, matrix norms
- distances between vectors
- eigenvalues, eigenvectors
- linear independence
- basis
- orthogonality

Eigenvalues and eigenvectors

Let A be an n × n matrix. A vector v ≠ 0 that satisfies

Av = λv

for some scalar λ is called an eigenvector of A, and λ is the eigenvalue corresponding to the eigenvector v.

Example

Let A = [2 1; 1 3] and seek Av = λv. Then

det(A − λI) = (2 − λ)(3 − λ) − 1 = 0   ⟹   λ1 = 3.62,  λ2 = 1.38,

with eigenvectors v1 = [0.52; 0.85] and v2 = [0.85; −0.52].
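As a quick check (not on the original slides), MATLAB's built-in eig reproduces these values; note that for a symmetric input eig typically returns the eigenvalues in ascending order, and the eigenvector signs are arbitrary:

    A = [2 1; 1 3];
    [V, D] = eig(A)   % columns of V: eigenvectors; diag(D): eigenvalues
    % diag(D) is approximately [1.38, 3.62]; the columns of V match
    % v2 and v1 above, up to sign.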

[Figure: the eigenvectors v1 and v2 plotted in the plane.]

Orthogonality

Two vectors x and y are orthogonal if x^T y = 0.

Let q_j, j = 1, ..., n, be orthogonal, i.e. q_i^T q_j = 0 for i ≠ j. Then they are linearly independent.

Let the set of orthogonal vectors q_j, j = 1, ..., m, in R^m be normalized, ||q_j|| = 1. Then they are orthonormal and constitute an orthonormal basis of R^m.

A matrix Q = [q_1 q_2 ... q_m] ∈ R^{m×m} with orthonormal columns is called an orthogonal matrix.
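A minimal numerical sketch (ours, not the slides'): the eigenvector matrix from the example has approximately orthonormal columns, so Q^T Q should be close to the identity:

    Q = [0.52 0.85; 0.85 -0.52];
    Q' * Q    % approximately eye(2); exact only up to the rounded entries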

Example

In the previous example we determined the eigenvectors of the matrix A = [2 1; 1 3] to be v1 = [0.52; 0.85] and v2 = [0.85; −0.52]. The vectors v1 and v2 are orthogonal:

v1^T v2 = 0.52 · 0.85 + 0.85 · (−0.52) = 0.

This is no coincidence!

Eigenvalues and eigenvectors of a symmetric matrix

The eigenvectors of a symmetric matrix are mutually orthogonal, and its eigenvalues are real. (A ∈ R^{n×n} is a symmetric matrix if A = A^T.)

Eigendecomposition

A symmetric matrix A ∈ R^{n×n} can be written in the form

A = UΛU^T,

where the columns of U are the eigenvectors of A and Λ is a diagonal matrix whose diagonal elements are the corresponding eigenvalues of A. Note that U is orthogonal. This is called the eigendecomposition or symmetric Schur decomposition of A.

Check:

    U = [0.52 0.85; 0.85 -0.52]

    U =
        0.5200    0.8500
        0.8500   -0.5200

    Lambda = [3.62 0; 0 1.38]

    Lambda =
        3.6200         0
             0    1.3800

    U * Lambda * U'

    ans =
        1.9759    0.9901
        0.9901    2.9886

Up to the rounding of the entries of U and Lambda, this reproduces the original matrix A = [2 1; 1 3].

Example of symmetric matrices: graphs

The adjacency matrix of an undirected graph is a symmetric matrix. For the graph on nodes 1, 2, 3, 4 with edges {1,2}, {1,3}, {1,4}, {2,4}, {3,4}:

A = [ 0 1 1 1
      1 0 0 1
      1 0 0 1
      1 1 1 0 ]
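As a small illustration (our code), one can enter this adjacency matrix in MATLAB and confirm the properties promised above for symmetric matrices:

    A = [0 1 1 1; 1 0 0 1; 1 0 0 1; 1 1 1 0];
    isequal(A, A')    % 1: A is symmetric
    [U, L] = eig(A);
    diag(L)'          % the eigenvalues are real
    U' * U            % approximately eye(4): orthonormal eigenvectors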

Example: virus propagation

Question 1: How does a virus spread across an arbitrary network?
Question 2: Will it create an epidemic?
Question 3: What can we do about it?

Model: the Susceptible-Infected-Susceptible (SIS) model

- cured nodes immediately become susceptible again
- virus birth rate β: probability that an infected neighbor attacks
- virus death rate δ: probability that an infected node heals
- virus strength: s = β/δ
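A minimal simulation sketch of this model on the example graph (our code; the parameter values, the 50-step horizon, and the exact update order are our own choices, not from the slides):

    % SIS simulation: each step, infected neighbors attack with prob. beta,
    % infected nodes heal with prob. delta (and become susceptible again).
    A = [0 1 1 1; 1 0 0 1; 1 0 0 1; 1 1 1 0];
    n = size(A, 1);
    beta = 0.2; delta = 0.4;               % virus strength s = beta/delta = 0.5
    infected = [true false false false];   % start with node 1 infected
    for t = 1:50
        pressure = double(infected) * A;     % # infected neighbors per node
        attack = 1 - (1 - beta).^pressure;   % prob. of at least one attack
        newly  = rand(1, n) < attack;        % fresh infections
        healed = rand(1, n) < delta;         % healings
        infected = (infected & ~healed) | newly;
    end
    nnz(infected)    % number of infected nodes after 50 steps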

[Figure: the SIS state diagram, Healthy ⇄ Infected.]

Epidemic threshold τ

The epidemic threshold of a graph is the largest value τ such that if the virus strength

s = β/δ < τ,

an epidemic cannot happen.

Problem: given a graph G, compute the epidemic threshold τ.

The epidemic threshold depends only on the graph! But on which properties of the graph?


Answer (to question 2)

Theorem (Wang et al., 2003). Consider a graph with adjacency matrix A. We have no epidemic if

β/δ < τ = 1/λ_A,

where λ_A is the largest eigenvalue of A, β is the probability that an infected neighbor attacks, and δ is the probability that an infected node heals.
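In MATLAB this is only a few lines (a sketch, using the adjacency matrix of the example graph above):

    A = [0 1 1 1; 1 0 0 1; 1 0 0 1; 1 1 1 0];
    lambda_A = max(eig(A));   % largest eigenvalue of A
    tau = 1 / lambda_A        % no epidemic if beta/delta < tau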

What can we do with this information?

Q: Who is the best person/computer to immunize against the virus?

A: The one whose removal makes the largest difference in λ_A.

Note: eigenvalues are strongly related to graph topology!

Other questions that can be answered in a similar way:

- who is the best customer to advertise to?
- who originated a raging rumor?
- in general: how important is a node in a network?

What if the matrix is not symmetric?

If A is symmetric, we can write A = UΛU^T, where U and Λ contain the eigenvectors and corresponding eigenvalues of A.

What if A ∈ R^{m×n}? Definitely not symmetric! Then we can use the singular value decomposition (SVD):

A = UΣV^T,

where U and V are orthogonal, and Σ is diagonal.

Example:

customer \ day    Wed   Thu   Fri   Sat   Sun
ABC Inc.            1     1     1     0     0
CDE Co.             2     2     2     0     0
FGH Ltd.            1     1     1     0     0
NOP Inc.            5     5     5     0     0
Smith               0     0     0     2     2
Brown               0     0     0     3     3
Johnson             0     0     0     1     1

The SVD of this data matrix:

A = [ 1 1 1 0 0
      2 2 2 0 0
      1 1 1 0 0
      5 5 5 0 0
      0 0 0 2 2
      0 0 0 3 3
      0 0 0 1 1 ]

  = U Σ V^T,   where

U = [ 0.18  0
      0.36  0
      0.18  0
      0.90  0
      0     0.53
      0     0.80
      0     0.27 ],

Σ = [ 9.64  0
      0     5.29 ],

V^T = [ 0.58  0.58  0.58  0     0
        0     0     0     0.71  0.71 ]
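The decomposition above can be reproduced with MATLAB's svd (a sketch; the 'econ' option trims U and Σ to the part shown on the slide, and the signs of the singular vectors may differ):

    A = [1 1 1 0 0; 2 2 2 0 0; 1 1 1 0 0; 5 5 5 0 0;
         0 0 0 2 2; 0 0 0 3 3; 0 0 0 1 1];
    [U, S, V] = svd(A, 'econ');
    diag(S)'   % singular values: 9.64, 5.29, 0, 0, 0
               % two nonzero values = two customer/day patterns in the data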

How to compute matrix decompositions?

We have had a brief glimpse at the eigenvalue decomposition and the singular value decomposition. Before taking a closer look: how can we compute these?

Answer 1: use LAPACK, Matlab, Mathematica, ... they are implemented everywhere!

Answer 2: we need more linear algebra tools! Back to basics...

Givens (plane) rotations

G = [ c  s
     -s  c ],   c² + s² = 1.

    theta = pi/8; c = cos(theta); s = sin(theta);
    G = [c s; -s c]
    G' * G

    G =
        0.9239    0.3827
       -0.3827    0.9239

    ans =
        1.0000    0.0000
        0.0000    1.0000

Givens (plane) rotations

G = [ c  s
     -s  c ],   c² + s² = 1.

We can choose c and s so that c = cos(θ), s = sin(θ) for some θ. Then multiplication of a vector x by G means that we rotate the vector in R² by the angle θ:

[Figure: the vector x and the rotated vector Gx, separated by the angle θ.]

Embed a 2-D rotation in a larger unit matrix:

G = [ 1  0  0  0
      0  c  0  s
      0  0  1  0
      0 -s  0  c ]

    G =
        1.0000         0         0         0
             0    0.9239         0    0.3827
             0         0    1.0000         0
             0   -0.3827         0    0.9239

    G' * G

    ans =
        1.0000         0         0         0
             0    1.0000         0    0.0000
             0         0    1.0000         0
             0    0.0000         0    1.0000

Givens rotations can be used to zero elements in vectors and matrices.

Rotation matrix G = [ c  s
                     -s  c ],   c² + s² = 1.

Choose c (and s) so that Gv = [α; 0]:

Choose c (and s) so that Gv = [α; 0]:

c = v1 / √(v1² + v2²),   s = v2 / √(v1² + v2²).

(Check that this does what it is supposed to do!)
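A quick numerical check (ours) that this choice indeed zeroes the second component:

    v = [3; 4];
    sq = sqrt(v(1)^2 + v(2)^2);
    c = v(1)/sq; s = v(2)/sq;
    G = [c s; -s c];
    G * v    % returns [5; 0]: alpha = ||v||, second component zeroed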

We can choose c and s in

G = [ 1  0  0  0
      0  c  0  s
      0  0  1  0
      0 -s  0  c ]

so that we can zero element 4 in a vector by a rotation in the (2,4) plane.

    x = [1; 2; 3; 4];
    sq = sqrt(x(2)^2 + x(4)^2);
    c = x(2)/sq; s = x(4)/sq;
    G = [1 0 0 0; 0 c 0 s; 0 0 1 0; 0 -s 0 c];
    y = G*x

    y =
        1.0000
        4.4721
        3.0000
             0

Reminder

The inverse of an orthogonal matrix Q is Q⁻¹ = Q^T.

The Euclidean length of a vector is invariant under an orthogonal transformation Q: ||Qx||² = (Qx)^T(Qx) = x^T x = ||x||².

The product of two orthogonal matrices P and Q is orthogonal: (PQ)^T(PQ) = Q^T P^T P Q = Q^T Q = I.

Transforming a vector v to αe1

Summarizing:

αe1 = G3(G2(G1 v)) = (G3 G2 G1)v.

Denote P = G3 G2 G1. P is orthogonal, and Pv = αe1. Since P is orthogonal, Euclidean length is preserved, and

α = ||v|| = √(v1² + ... + vn²).
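A sketch of this procedure in MATLAB (our code): it zeroes elements 2, ..., n of v one at a time by rotations in the planes (1,2), (1,3), ..., (1,n) and accumulates the orthogonal factor P:

    v = [1; 2; 3; 4];
    n = length(v);
    P = eye(n);
    for k = 2:n
        sq = sqrt(v(1)^2 + v(k)^2);
        if sq > 0
            c = v(1)/sq; s = v(k)/sq;
            G = eye(n);
            G(1,1) = c;  G(1,k) = s;
            G(k,1) = -s; G(k,k) = c;
            v = G * v;   % v(k) -> 0, v(1) -> running norm
            P = G * P;   % accumulate P = G_n ... G_2
        end
    end
    v    % alpha * e_1, with alpha = norm of the original v (here sqrt(30))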

Number of floating point operations

What is the number of flops needed to transform the first column of an m × n matrix to αe1, a multiple of the first unit vector?

The computation of

[ c  s ] [ x ]   [  cx + sy ]
[-s  c ] [ y ] = [ -sx + cy ]

requires 4 multiplications and 2 additions, i.e. 6 flops. Applying such a transformation to an m × n matrix requires 6n flops. In order to zero all elements but one in the first column of the matrix we must apply (m − 1) rotations.

Overall flop count: 6(m − 1)n ≈ 6mn.

The so-called Householder transformation, with which one can do the same thing, is slightly cheaper: 4mn flops.

What was that all about?

We can use very simple orthogonal transformations (e.g. Givens rotations) to zero elements in a matrix. In principle, this is what is done when matrix decompositions are calculated.

Matrix decompositions

We wish to decompose the matrix A by writing it as a product of two or more matrices:

A_{m×n} = B_{m×k} C_{k×n},   or   A_{m×n} = B_{m×k} C_{k×r} D_{r×n}.

This is done in such a way that the right-hand side of the equation yields some useful information or insight into the nature of the data matrix A, or is in other ways useful for solving the problem at hand.

Matrix decompositions

There are numerous examples of useful matrix decompositions:

- Eigendecomposition: A = UΛU^T, A symmetric, U and Λ containing the eigenvectors and eigenvalues.
- Singular value decomposition: A = UΣV^T, U and V orthogonal, Σ diagonal.

Matrix factorization is the same thing as matrix decomposition (e.g. NMF = nonnegative matrix factorization, V = WH, all elements nonnegative).

How to compute these?

Roughly (attn: in reality there is more to it than this!):

(1) Manipulate A by multiplying it by intelligently chosen, fairly simple (orthogonal) matrices from both sides,

V_k ... V_2 V_1 A W_1 ... W_s = B,

until B is nice.

(2) Denote V = V_k ... V_2 V_1 and W = W_1 ... W_s. Since V and W are orthogonal, A = V^T B W^T.

But how to choose V_1, ..., V_k and W_1, ..., W_s? Needed: linear algebra tools for transforming matrices in an orderly fashion: Givens rotations!

By applying several Givens rotations in succession, we can transform a vector x to αe1:

x → [ α
      0
      ⋮
      0 ]

QR transformation

We transform A → Q^T A = R, where Q is orthogonal and R is upper triangular.

Example:
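A small illustration using MATLAB's built-in qr (our example, on a random matrix, not the slide's original one):

    A = randn(5, 3);
    [Q, R] = qr(A);          % Q: 5x5 orthogonal, R: 5x3 upper triangular
    norm(Q' * Q - eye(5))    % ~0: Q is orthogonal
    norm(Q * R - A)          % ~0: the factorization reproduces A
    Q' * A                   % equals R up to rounding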

Note!

Any matrix A ∈ R^{m×n}, m ≥ n, can be transformed to upper triangular form by an orthogonal matrix. If the columns of A are linearly independent, then R is nonsingular.

QR decomposition

A = Q [ R
        0 ]

References

[1] L. Eldén. Matrix Methods in Data Mining and Pattern Recognition. SIAM, 2007.

[2] G. H. Golub and C. F. Van Loan. Matrix Computations. 3rd ed. Johns Hopkins University Press, Baltimore, MD, 1996.

[3] Y. Wang, D. Chakrabarti, C. Wang and C. Faloutsos. Epidemic Spreading in Real Networks: An Eigenvalue Viewpoint. SRDS 2003, Florence, Italy.