Dimensionality Reduction


Dimensionality Reduction
- Given N vectors in n dims, find the k most important axes to project them
- k is user-defined (k < n)
- Applications: information retrieval & indexing
  - identify the k most important features, or
  - reduce indexing dimensions for faster retrieval (low-dim indices are faster)

Techniques
- Eigenvalue analysis techniques [NR 9]
  - Karhunen-Loeve (K-L) transform
  - Singular Value Decomposition (SVD)
  - both need O(N^2) time
- FastMap [Faloutsos & Lin 95]
  - dimensionality reduction and mapping of objects to vectors
  - O(N) time

Mathematical Preliminaries
- For an n x n square matrix S, a unit vector x and a scalar value λ with S x = λ x:
  - x: eigenvector of S
  - λ: eigenvalue of S
- The eigenvectors of a symmetric matrix (S = S^T) are mutually orthogonal and its eigenvalues are real
- Rank r of a matrix: maximum number of independent columns or rows
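A quick numerical check of these definitions, as a sketch in Python/numpy (the matrix values are illustrative; the same matrix reappears in Example 1 below):

import numpy as np

# Any real symmetric matrix will do for this check.
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# eigh is meant for symmetric matrices: it returns real eigenvalues
# (in ascending order) and orthonormal eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(S)

# Verify S x = lambda x for every eigenpair.
for lam, x in zip(eigvals, eigvecs.T):
    assert np.allclose(S @ x, lam * x)

# Eigenvectors of a symmetric matrix are mutually orthogonal.
assert np.allclose(eigvecs.T @ eigvecs, np.eye(2))
print(np.round(eigvals, 2))   # [1.38 3.62]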

Example 1
- Intuition: S defines an affine transform y = S x that involves scaling and rotation
  - eigenvectors: unit vectors along the new directions
  - eigenvalues denote the scaling along each direction
- S = [2 1; 1 3]
  - λ1 ≈ 3.62 with eigenvector u1 ≈ (0.53, 0.85), the eigenvector of the major axis
  - λ2 ≈ 1.38 with eigenvector u2 ≈ (0.85, -0.53)

Example 2
- If S is real and symmetric (S = S^T), then it can be written as S = U Λ U^T
  - the columns of U are the eigenvectors of S
  - U is column-orthonormal (U U^T = I)
  - Λ is diagonal with the eigenvalues of S
- For the previous example:
  S = [2 1; 1 3] = [0.53 0.85; 0.85 -0.53] diag(3.62, 1.38) [0.53 0.85; 0.85 -0.53]^T
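The factorization can be checked directly; a minimal numpy sketch, using the same example matrix:

import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
eigvals, U = np.linalg.eigh(S)    # columns of U are eigenvectors of S
Lam = np.diag(eigvals)

assert np.allclose(U @ U.T, np.eye(2))   # U is column-orthonormal
assert np.allclose(U @ Lam @ U.T, S)     # S = U Lam U^T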

Karhunen-Loeve (K-L)
- Project onto a k-dimensional space (k < n) minimizing the error of the projections (sum of squared differences)
- K-L gives a linear combination of axes sorted by importance
  - keep the first k dims
- Example: 2-dim points and their K-L directions; for k = 1, keep only the first direction x'

Computation of K-L
- Put the N vectors as rows in a matrix A = [a_ij]
- Compute B = [a_ij - a_p], where a_p = (1/N) Σ_{i=1..N} a_ip is the average of column p
- Covariance matrix: C = B^T B
- Compute the eigenvectors of C
- Sort them in decreasing eigenvalue order
- Approximate each object by its projections on the directions of the first k eigenvectors
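A compact sketch of these steps in Python/numpy (function and variable names are illustrative, not from the original slides):

import numpy as np

def kl_transform(A, k):
    """Karhunen-Loeve: project the rows of A onto the first k
    eigenvectors of the covariance matrix C = B^T B."""
    B = A - A.mean(axis=0)               # subtract the column averages a_p
    C = B.T @ B                           # attribute-to-attribute covariance
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1]     # decreasing eigenvalue order
    W = eigvecs[:, order[:k]]             # first k eigenvectors as columns
    return B @ W                          # projections of each object

# N = 3 vectors in n = 2 dims, reduced to k = 1 (the worked example below).
A = np.array([[1.0, 2.0],
              [1.0, 1.0],
              [0.0, 0.0]])
print(kl_transform(A, k=1))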

Intuition
- B shifts the origin to the center of gravity of the vectors (subtracting a_p) and has zero column mean
- C represents attribute-to-attribute similarity
- C is square, real, and symmetric
- Eigenvectors and eigenvalues are computed on C, not on A
- C denotes the affine transform that minimizes the error
- Approximate each vector with its projections along the first k eigenvectors

Example
- Input vectors: [1 2], [1 1], [0 0]; the column averages are 2/3 and 1
- A = [1 2; 1 1; 0 0],  B = [1/3 1; 1/3 0; -2/3 -1]
- C = B^T B = [2/3 1; 1 2]
- Eigenvalues: λ1 ≈ 2.53, λ2 ≈ 0.13
- Eigenvectors: u1 ≈ (0.47, 0.88), u2 ≈ (-0.88, 0.47)

SVD
- Works for general rectangular matrices
  - N x n matrix (N vectors, n dimensions)
- Groups similar entities (documents) together
- Groups similar terms together; each group of terms corresponds to a concept
- Given an N x n matrix A, write it as A = U Λ V^T
  - U: N x r, column-orthonormal (r: rank of A)
  - Λ: r x r diagonal matrix (non-negative entries, in descending order)
  - V: n x r, column-orthonormal matrix
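A minimal numpy sketch of the decomposition and of keeping only the r non-zero singular values (the matrix values are illustrative):

import numpy as np

# A: N x n matrix (rows = documents, columns = terms); illustrative values.
A = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [2.0, 2.0, 2.0, 0.0, 0.0],
              [5.0, 5.0, 5.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 2.0, 2.0],
              [0.0, 0.0, 0.0, 3.0, 3.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)   # s in descending order
r = np.linalg.matrix_rank(A)                        # here r = 2

# Keeping the r non-zero singular values reproduces A exactly.
U_r, s_r, Vt_r = U[:, :r], s[:r], Vt[:r, :]
assert np.allclose(U_r @ np.diag(s_r) @ Vt_r, A)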

SVD (cont'd)
- A = λ1 u1 v1^T + λ2 u2 v2^T + ... + λr ur vr^T
  - u_i, v_i are the column vectors of U and V, and λ_i the diagonal entries of Λ
- SVD identifies rectangular "blobs" of related values in A
- The rank r of A is the number of such blobs
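The same idea as a short sketch, rewriting a matrix as the sum of its rank-1 terms:

import numpy as np

A = np.array([[1.0, 1.0, 0.0],
              [2.0, 2.0, 0.0],
              [0.0, 0.0, 3.0]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = np.linalg.matrix_rank(A)

# A = sum_i  lambda_i * u_i * v_i^T  over the r non-zero singular values.
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
assert np.allclose(A_sum, A)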

Example
Term/Document   data  information  retrieval  brain  lung
CS-TR1            1        1           1        0      0
CS-TR2            2        2           2        0      0
CS-TR3            1        1           1        0      0
CS-TR4            5        5           5        0      0
MED-TR1           0        0           0        2      2
MED-TR2           0        0           0        3      3
MED-TR3           0        0           0        1      1

- Two types of documents: CS and Medical
- Two concepts (groups of terms):
  - CS: data, information, retrieval
  - Medical: brain, lung

Example (cont'd)
- A = U Λ V^T with
  U   = [0.18 0; 0.36 0; 0.18 0; 0.90 0; 0 0.53; 0 0.80; 0 0.27]   (7 x 2)
  Λ   = diag(9.64, 5.29)                                            (2 x 2)
  V^T = [0.58 0.58 0.58 0 0; 0 0 0 0.71 0.71]                       (2 x 5)
- U: document-to-concept similarity matrix
- V: term-to-concept similarity matrix
  - e.g., V[1,2] = 0: the term "data" has zero similarity with the 2nd (medical) concept
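The decomposition above can be reproduced with numpy on the reconstructed example matrix (a sketch; numpy may flip the sign of some columns):

import numpy as np

# rows = CS-TR1..4, MED-TR1..3; cols = data, information, retrieval, brain, lung
A = np.array([[1, 1, 1, 0, 0],
              [2, 2, 2, 0, 0],
              [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0],
              [0, 0, 0, 2, 2],
              [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = 2                                   # rank of A: two concepts
print(np.round(s[:r], 2))               # [9.64 5.29]
print(np.round(U[:, :r], 2))            # document-to-concept similarities
print(np.round(Vt[:r, :].T, 2))         # term-to-concept similarities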

SVD and LSI
- SVD leads to Latent Semantic Indexing (http://lsi.research.telcordia.com/lsi/lsipapers.html)
- Terms that occur together are grouped into concepts
- When a user searches for a term, the system determines the relevant concepts to search
- LSI maps concepts to vectors in the concept space instead of the n-dim document space
- The concept space has lower dimensionality

Examples of Queries
- Find documents containing the term "data"
- Translate the query vector q to concept space: q_c = V^T q
  - q = [1 0 0 0 0]  =>  q_c = V^T q = [0.58 0]
- The query is related to the CS concept and unrelated to the medical concept
- LSI returns docs that also contain the terms "retrieval" and "information", which are not specified by the query
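A sketch of folding the query into concept space, reusing the example matrix from above (names are illustrative):

import numpy as np

A = np.array([[1, 1, 1, 0, 0],
              [2, 2, 2, 0, 0],
              [1, 1, 1, 0, 0],
              [5, 5, 5, 0, 0],
              [0, 0, 0, 2, 2],
              [0, 0, 0, 3, 3],
              [0, 0, 0, 1, 1]], dtype=float)
_, _, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt[:2, :].T                               # term-to-concept matrix (5 x 2)

q = np.array([1, 0, 0, 0, 0], dtype=float)   # query with the single term "data"
q_c = q @ V                                   # query in concept space
print(np.round(q_c, 2))                       # [0.58 0.] (up to sign)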

FastMap
- Works with distances and has two roles:
  1. Maps objects to vectors so that their distances are preserved (then apply SAMs for indexing)
  2. Dimensionality reduction: given N vectors with n attributes each, find N vectors with k attributes such that distances are preserved as much as possible

Main idea
- Pretend that the objects are points in some unknown n-dimensional space
  - project these points on k mutually orthogonal axes
  - compute the projections using distances only
- The heart of FastMap is the method that projects two objects on a line
  - take two objects that are far apart (the pivots)
  - project all objects on the line that connects the pivots

Project Objects on a Line
- Apply the cosine law:  d_bi^2 = d_ai^2 + d_ab^2 - 2 x_i d_ab
  =>  x_i = (d_ai^2 + d_ab^2 - d_bi^2) / (2 d_ab)
- O_a, O_b: pivots; O_i: any object
- d_ij: shorthand for D(O_i, O_j)
- x_i: first coordinate of O_i in the k-dimensional space
- If O_i is close to O_a, x_i is small
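The projection formula as a small function (a sketch; the function name is illustrative):

def project_on_pivot_line(d_ai, d_bi, d_ab):
    """First coordinate of object O_i, given its distances to the
    pivots O_a, O_b and the distance d_ab between the pivots."""
    return (d_ai**2 + d_ab**2 - d_bi**2) / (2.0 * d_ab)

print(project_on_pivot_line(0.0, 5.0, 5.0))   # 0.0: O_i coincides with O_a
print(project_on_pivot_line(5.0, 0.0, 5.0))   # 5.0: O_i coincides with O_b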

Choose Pivots
- Heuristic: start from an arbitrary object, find the object farthest from it, then the object farthest from that one; use these two as pivots
- Complexity: O(N); the optimal (exhaustive) choice would require O(N^2) time
- The two "farthest object" steps can be repeated 4-5 times to improve the accuracy of the selection

Extension for Many Dimensions
- Consider the (n-1)-dimensional hyperplane H that is perpendicular to the line O_a O_b
- Project the objects on H and apply the previous step
  - choose two new pivots
  - the new x_i is the next coordinate of the object
  - repeat until k-dim vectors are obtained
- The distance on H is not D but D': the distance between the projected objects

Distance on the Hyper-Plane H
- Pythagorean theorem:  D'(O_i, O_j)^2 = D(O_i, O_j)^2 - (x_i - x_j)^2
- D' on H can be computed from the Pythagorean theorem
- The ability to compute D' allows a second projection line to be found on H, and so on

Algorithm
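Putting the pieces together, a minimal Python sketch of the recursion described above, assuming a symmetric distance function dist(i, j) over object indices; the pivot heuristic and all names are illustrative, not the original code:

import numpy as np

def fastmap(n_objects, dist, k):
    """Map n_objects to k-dim vectors so that dist(i, j) is
    approximately preserved (FastMap sketch)."""
    X = np.zeros((n_objects, k))          # output coordinates
    pivots = []                           # pivot pair per dimension, kept for queries

    def d2(i, j, col):
        # Squared distance in the hyperplane obtained after projecting out
        # the first `col` coordinates (Pythagorean theorem); negative values
        # (non-Euclidean input) are clamped to 0, as suggested below.
        return max(0.0, dist(i, j) ** 2 - np.sum((X[i, :col] - X[j, :col]) ** 2))

    def choose_pivots(col):
        # Heuristic: start anywhere, repeatedly jump to the farthest object.
        a = b = 0
        for _ in range(5):
            b = max(range(n_objects), key=lambda j: d2(a, j, col))
            a = max(range(n_objects), key=lambda j: d2(b, j, col))
        return a, b

    for col in range(k):
        a, b = choose_pivots(col)
        dab2 = d2(a, b, col)
        if dab2 == 0.0:                   # all remaining distances are zero
            break
        pivots.append((a, b))
        for i in range(n_objects):        # cosine-law projection on the pivot line
            X[i, col] = (d2(a, i, col) + dab2 - d2(b, i, col)) / (2.0 * np.sqrt(dab2))
    return X, pivots

# Usage: recover 2-D coordinates of four points from their distances alone.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
dist = lambda i, j: float(np.linalg.norm(pts[i] - pts[j]))
X, pivots = fastmap(len(pts), dist, k=2)
print(np.round(X, 2))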

Observations
- Complexity: O(kN) distance calculations
  - k: desired dimensionality
  - k recursive calls, each taking O(N)
- The algorithm records the pivots of each call (dimension) to facilitate queries
  - a query is mapped to a k-dimensional vector by projecting it on the pivot lines, one per dimension
  - O(1) computation per step: no need to recompute the pivots

Observations (cont'd)
- The projected vectors can be indexed
  - mapping to 2-3 dimensions allows visualization of the data space
- Assumes a Euclidean space (triangle inequality), which is not always true (at least after the second step)
  - the pivots are then only approximate
  - some computed squared distances may become negative; turn negative distances into 0

Application: Document Vectors
- distance(d1, d2) = sqrt(2 (1 - cos(θ))) = 2 sin(θ/2) = sqrt(2 (1 - similarity(d1, d2)))
- θ is the angle between the two document vectors; similarity is their cosine similarity
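A sketch of this distance for two term-weight vectors (values are illustrative):

import numpy as np

def doc_distance(d1, d2):
    """Distance derived from cosine similarity: sqrt(2 * (1 - cos(theta)))."""
    cos = np.dot(d1, d2) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return float(np.sqrt(2.0 * (1.0 - cos)))

d1 = np.array([1.0, 1.0, 0.0])
d2 = np.array([1.0, 0.0, 0.0])
print(doc_distance(d1, d2))   # equals 2*sin(theta/2) for the angle between them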

Figure: FastMap applied to a collection of documents, shown for (a) k = 2 and (b) k = 3 dimensions.

References
- C. Faloutsos, Searching Multimedia Databases by Content, Kluwer, 1996
- W. Press et al., Numerical Recipes in C, Cambridge Univ. Press, 1988
- LSI website: http://lsi.research.telcordia.com/lsi/lsipapers.html
- C. Faloutsos, K.-I. Lin, "FastMap: A Fast Algorithm for Indexing, Data Mining and Visualization of Traditional and Multimedia Datasets", Proc. of ACM SIGMOD, 1995