
Latent Semantic Analysis Hongning Wang CS@UVa

VS model in practice
- Documents and queries are represented by term vectors
- Terms are not necessarily orthogonal to each other
  - Synonymy: car vs. automobile
  - Polysemy: fly (action vs. insect)

Choosing basis for VS model
- A concept space is preferred
  - The semantic gap will be bridged
[Figure: documents D1-D5 and a query plotted in a concept space whose axes are Finance, Sports, and Education]

How to build such a space
- Automatic term expansion
  - Construction of a thesaurus: WordNet
  - Clustering of words
- Word sense disambiguation
  - Dictionary-based: the relation between a pair of words should be similar in the text and in the dictionary's description
  - Explore word usage context

How to build such a space
- Latent Semantic Analysis
  - Assumption: there is some underlying latent semantic structure in the data that is partially obscured by the randomness of word choice with respect to retrieval
  - It means: the observed term-document association data is contaminated by random noise

How to build such a space
- Solution: low-rank matrix approximation
[Figure: the observed term-document matrix is viewed as the *true* concept-document matrix plus random noise over the word selection in each document]

Latent Semantic Analysis (LSA)
- Low-rank approximation of the term-document matrix $C_{M\times N}$
- Goal: remove noise in the observed term-document association data
- Solution: find a matrix with rank $k$ which is closest to the original matrix in terms of the Frobenius norm
$$Z^* = \underset{Z:\,\mathrm{rank}(Z)=k}{\arg\min} \|C - Z\|_F = \underset{Z:\,\mathrm{rank}(Z)=k}{\arg\min} \sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N} (C_{ij} - Z_{ij})^2}$$
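
Below is a minimal numpy sketch (not from the slides; the matrix and sizes are made up) of this objective: by the Eckart-Young theorem, the rank-k truncated SVD minimizes the Frobenius error, so it never does worse than an arbitrary rank-k matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
C = rng.random((8, 6))   # stand-in for an observed term-document matrix
k = 2

# Best rank-k approximation via truncated SVD (Eckart-Young)
U, s, Vt = np.linalg.svd(C, full_matrices=False)
Z_star = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Any other rank-k matrix has at least as large a Frobenius error
Z_other = rng.random((8, k)) @ rng.random((k, 6))
print(np.linalg.norm(C - Z_star, 'fro'), "<=", np.linalg.norm(C - Z_other, 'fro'))
```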

Basic concepts in linear algebra
- Symmetric matrix: $C = C^T$
- Rank of a matrix: the number of linearly independent rows (columns) in a matrix $C_{M\times N}$
$$\mathrm{rank}(C_{M\times N}) \le \min(M, N)$$

Basic concepts in linear algebra
- Eigen system
  - For a square matrix $C_{M\times M}$: if $Cx = \lambda x$, then $x$ is called a right eigenvector of $C$ and $\lambda$ is the corresponding eigenvalue
  - For a symmetric full-rank matrix $C_{M\times M}$, we have the eigen-decomposition $C = Q\Lambda Q^T$, where the columns of $Q$ are the orthonormal eigenvectors of $C$ and $\Lambda$ is a diagonal matrix whose entries are the eigenvalues of $C$
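
As a small illustration (not part of the slides; the matrix is random), numpy's `eigh` computes exactly this decomposition for a symmetric matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((4, 4))
C = A + A.T                        # symmetric matrix, so C = Q Lambda Q^T applies

eigvals, Q = np.linalg.eigh(C)     # eigh is specialized for symmetric matrices
Lambda = np.diag(eigvals)

assert np.allclose(Q @ Lambda @ Q.T, C)   # reconstruction
assert np.allclose(Q.T @ Q, np.eye(4))    # columns of Q are orthonormal
```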

Basic concepts in linear algebra
- Singular value decomposition (SVD)
  - For a matrix $C_{M\times N}$ with rank $r$, we have $C = U\Sigma V^T$, where $U_{M\times r}$ and $V_{N\times r}$ are orthogonal matrices, and $\Sigma$ is an $r\times r$ diagonal matrix with $\Sigma_{ii} = \sqrt{\lambda_i}$, where $\lambda_1 \ge \dots \ge \lambda_r$ are the eigenvalues of $CC^T$
  - We define $C^k_{M\times N} = U_{M\times k}\Sigma_{k\times k}V^T_{N\times k}$, where we place the $\Sigma_{ii}$ in descending order and set $\Sigma_{ii} = \sqrt{\lambda_i}$ for $i \le k$ and $\Sigma_{ii} = 0$ for $i > k$
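
A quick numpy check of these relations (my own sketch, random matrix and sizes): the singular values equal the square roots of the eigenvalues of $CC^T$, and keeping only the $k$ largest singular values yields a rank-$k$ matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
C = rng.random((5, 7))

U, s, Vt = np.linalg.svd(C, full_matrices=False)

# Singular values = square roots of the eigenvalues of C C^T (descending order)
eigvals = np.linalg.eigvalsh(C @ C.T)[::-1]
assert np.allclose(s, np.sqrt(np.clip(eigvals, 0.0, None)))

# Rank-k truncation: keep only the k largest singular values
k = 2
C_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.linalg.matrix_rank(C_k))   # prints 2
```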

Latent Semantic Analysis (LSA)
- Solve LSA by SVD
$$Z^* = \underset{Z:\,\mathrm{rank}(Z)=k}{\arg\min} \|C - Z\|_F = \underset{Z:\,\mathrm{rank}(Z)=k}{\arg\min} \sqrt{\sum_{i=1}^{M}\sum_{j=1}^{N} (C_{ij} - Z_{ij})^2} = C^k_{M\times N}$$
- Procedure of LSA (map to a lower-dimensional space):
  1. Perform SVD on the term-document adjacency matrix
  2. Construct $C^k_{M\times N}$ by keeping only the largest $k$ singular values in $\Sigma$ non-zero
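
The sketch below walks through this procedure end-to-end on a tiny made-up corpus (the counts, vocabulary, and the choice k=2 are hypothetical, chosen only to keep the example self-contained).

```python
import numpy as np

# Hypothetical toy counts: rows = documents, columns = terms
terms = ["car", "automobile", "fly", "insect", "wing"]
C = np.array([[2., 1., 0., 0., 0.],   # d1: about cars
              [1., 2., 0., 0., 0.],   # d2: about cars
              [0., 0., 2., 1., 1.],   # d3: about insects
              [0., 0., 1., 2., 1.]])  # d4: about insects

# 1. Perform SVD on the observed association matrix
U, s, Vt = np.linalg.svd(C, full_matrices=False)

# 2. Keep only the k largest singular values
k = 2
C_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.round(C_k, 2))   # denoised rank-2 term-document associations
```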

Latent Semantic Analysis (LSA)
- Another interpretation: $C_{M\times N}$ is the term-document adjacency matrix
  - $D_{M\times M} = C_{M\times N}C^T_{M\times N}$, where $D_{ij}$ is the document-document similarity obtained by counting how many terms co-occur in $d_i$ and $d_j$
$$D = (U\Sigma V^T)(U\Sigma V^T)^T = U\Sigma^2 U^T$$
  - This is the eigen-decomposition of the document-document similarity matrix
  - $d_i$'s new representation in this system (space) is then $\left(U\Sigma^{\frac{1}{2}}\right)_i$
  - In the lower-dimensional space, we only use the first $k$ elements of $\left(U\Sigma^{\frac{1}{2}}\right)_i$ to represent $d_i$
  - The same analysis applies to $T_{N\times N} = C^T_{M\times N}C_{M\times N}$
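
A quick numerical check of this interpretation (my own sketch with a random matrix): $CC^T = U\Sigma^2U^T$ and $C^TC = V\Sigma^2V^T$.

```python
import numpy as np

rng = np.random.default_rng(3)
C = rng.random((4, 6))        # documents x terms, following the slides' C_{MxN}

U, s, Vt = np.linalg.svd(C, full_matrices=False)
V = Vt.T

D = C @ C.T                   # document-document similarity
T = C.T @ C                   # term-term similarity
assert np.allclose(D, U @ np.diag(s**2) @ U.T)
assert np.allclose(T, V @ np.diag(s**2) @ V.T)
```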

Geometric interpretation of LSA
- $C^k_{M\times N}(i, j)$ measures the relatedness between $d_i$ and $w_j$ in the $k$-dimensional space
- Therefore, as $C^k_{M\times N} = U_{M\times k}\Sigma_{k\times k}V^T_{N\times k}$:
  - $d_i$ is represented as $\left(U_{M\times k}\Sigma^{\frac{1}{2}}_{k\times k}\right)_i$
  - $w_j$ is represented as $\left(V_{N\times k}\Sigma^{\frac{1}{2}}_{k\times k}\right)_j$
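
As a sketch (random data, shapes chosen arbitrarily), these coordinates can be computed directly from the truncated SVD factors:

```python
import numpy as np

rng = np.random.default_rng(4)
C = rng.random((4, 6))        # documents x terms
k = 2

U, s, Vt = np.linalg.svd(C, full_matrices=False)
S_sqrt = np.diag(np.sqrt(s[:k]))

doc_coords  = U[:, :k] @ S_sqrt     # row i: d_i in the k-dimensional concept space
term_coords = Vt[:k, :].T @ S_sqrt  # row j: w_j in the same space
print(doc_coords.shape, term_coords.shape)   # (4, 2) (6, 2)
```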

Latent Semantic Analysis (LSA)
- Visualization
[Figure: terms and documents from two topics, HCI and Graph theory, visualized in the LSA space]

What are those dimensions in LSA?
- Principal component analysis

Latent Semantic Analysis (LSA)
- What we have achieved via LSA
  - Terms/documents that are closely associated are placed near one another in this new space
  - Terms that do not occur in a document may still be close to it, if that is consistent with the major patterns of association in the data
  - A good choice of concept space for the VS model!

LSA for retrieval
- Project queries into the new document space
$$\hat{q} = qV_{N\times k}\Sigma^{-1}_{k\times k}$$
  - Treat the query as a pseudo-document represented by its term vector
- Rank by cosine similarity between the query and documents in this lower-dimensional space
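
A minimal sketch of this fold-in step (random data; the document coordinates follow the $U_k\Sigma_k^{1/2}$ convention from the earlier slide, and note that the exact scaling choice varies across LSA write-ups):

```python
import numpy as np

rng = np.random.default_rng(5)
C = rng.random((4, 6))             # documents x terms
k = 2

U, s, Vt = np.linalg.svd(C, full_matrices=False)
U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T

doc_coords = U_k @ np.diag(np.sqrt(s_k))     # document representations

q = np.array([1., 0., 1., 0., 0., 1.])       # pseudo-document term vector
q_hat = q @ V_k @ np.diag(1.0 / s_k)         # fold the query into the k-dim space

# Cosine similarity between the folded-in query and each document
sims = doc_coords @ q_hat
sims /= np.linalg.norm(doc_coords, axis=1) * np.linalg.norm(q_hat)
print(np.argsort(-sims))                     # documents ranked by similarity
```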

LSA for retrieval
[Figure: the query q = "human computer interaction" projected into the LSA space together with documents from the HCI and Graph theory clusters]

Discussions
- Computationally expensive
  - Time complexity $O(MN^2)$
- Empirically helpful for recall but not for precision
  - Recall increases as $k$ decreases
- Optimal choice of $k$?
- Difficult to handle a dynamic corpus
- Difficult to interpret the decomposition results
  - We will come back to this later!

LSA beyond text
- Collaborative filtering

LSA beyond text
- Eigenfaces

LSA beyond text
- Cat detector from a deep neural network
  - One of the neurons in the artificial neural network, trained on still frames from unlabeled YouTube videos, learned to detect cats.

What you should know
- Assumption in LSA
- Interpretation of LSA
  - Low-rank matrix approximation
  - Eigen-decomposition of the co-occurrence matrix for documents and terms
- LSA for IR