Data Mining and Matrices


1 Data Mining and Matrices 06 Non-Negative Matrix Factorization Rainer Gemulla, Pauli Miettinen May 23, 2013

2 Non-Negative Datasets
Some datasets are intrinsically non-negative:
- Counters (e.g., number of occurrences of each word in a text document)
- Quantities (e.g., amount of each ingredient in a chemical experiment)
- Intensities (e.g., intensity of each color in an image)
The corresponding data matrix D has only non-negative values. Decompositions such as SVD and SDD may involve negative values in factors and components:
- Negative values describe the absence of something
- Often no natural interpretation
Can we find a decomposition that is more natural for non-negative data?

3 Example (SVD)
Consider the following bridge matrix and its truncated SVD: D ≈ UΣV^T. Here are the corresponding components:
D ≈ U_1 Σ_11 V_1^T + U_2 Σ_22 V_2^T
(The matrix entries shown on the slide did not survive this transcription.) Negative values make the interpretation unnatural or difficult.

4 Outline
1 Non-Negative Matrix Factorization
2 Algorithms
3 Probabilistic Latent Semantic Analysis
4 Summary

5 Non-Negative Matrix Factorization (NMF)
Definition (Non-negative matrix factorization, basic form). Given a non-negative matrix D ∈ R^{m×n}_+, a non-negative matrix factorization of rank r is D ≈ LR, where L ∈ R^{m×r}_+ and R ∈ R^{r×n}_+ are both non-negative.
- Additive decomposition: factors and components are non-negative, so there are no cancellation effects
- Rows of R can be thought of as parts; a row of D is obtained by mixing (or "assembling") parts according to the weights in the corresponding row of L
- The smallest r such that D = LR exists is called the non-negative rank of D: rank(D) ≤ rank+(D) ≤ min{m, n}
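To make the definition concrete, here is a minimal sketch (not part of the original slides) that computes a rank-2 NMF with scikit-learn's NMF class; the toy matrix is made up for illustration.

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy non-negative matrix (made up for illustration)
D = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)

model = NMF(n_components=2, init='random', random_state=0, max_iter=500)
L = model.fit_transform(D)   # m x r, non-negative
R = model.components_        # r x n, non-negative

print(np.round(L @ R, 2))                # reconstruction D ~ LR
print(np.linalg.norm(D - L @ R, 'fro'))  # Frobenius error
```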

6 Example (NMF)
Consider the following bridge matrix and its rank-2 NMF: D ≈ LR. Here are the corresponding components:
D ≈ L_1 R_1 + L_2 R_2
(Matrix entries omitted in this transcription.) Non-negative matrix decomposition encourages a more natural, parts-based representation and (sometimes) sparsity.

7 Decomposing faces (PCA)
Figure: D_i (original), [LR]_i = L_i R, and [UΣV^T]_i = U_i Σ V^T.
PCA factors are hard to interpret. Lee and Seung, 1999.

8 Decomposing faces (NMF)
Figure: D_i (original) vs. [LR]_i = L_i R.
NMF factors correspond to parts of faces. Lee and Seung, 1999.

9 Decomposing digits (NMF)
Figure: D ≈ LR on handwritten digits (images of D and the factor R).
NMF factors correspond to parts of digits and background. Cichocki et al., 2009.

10 Some applications (Cichocki et al., 2009)
Text mining (more later), bioinformatics, microarray analysis, mineral exploration, neuroscience, image understanding, air pollution research, chemometrics, spectral data analysis, linear sparse coding, image classification, clustering, neural learning processes, sound recognition, remote sensing, object characterization, ...

11 Gaussian NMF
Gaussian NMF is the most basic form of non-negative factorization:
minimize ‖D − LR‖²_F subject to L ∈ R^{m×r}_+, R ∈ R^{r×n}_+
- Truncated SVD minimizes the same objective (but without the non-negativity constraints)
- Many other variants exist:
  - Different objective functions (e.g., KL divergence)
  - Additional regularizations (e.g., L1 regularization)
  - Different constraints (e.g., orthogonality of R)
  - Different compositions (e.g., 3 matrices)
- Multi-layer NMF, semi-NMF, sparse NMF, tri-NMF, symmetric NMF, orthogonal NMF, non-smooth NMF (nsNMF), overlapping NMF, convolutive NMF (CNMF), k-means, ...

12 k-means can be seen as a variant of NMF
Additional constraint: L contains exactly one 1 in each row, rest 0 (rows of L are cluster-membership indicators); see the sketch below.
Figure: D_i (original) vs. [LR]_i = L_i R.
k-means factors correspond to prototypical faces. Lee and Seung, 1999.
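A minimal sketch of this view (an illustration, not from the slides): build the indicator matrix L from k-means cluster labels and the prototype matrix R from the centroids.

```python
import numpy as np
from sklearn.cluster import KMeans

D = np.abs(np.random.default_rng(0).normal(size=(6, 3)))  # toy data
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(D)

L = np.eye(2)[km.labels_]    # exactly one 1 per row, rest 0
R = km.cluster_centers_      # rows of R are the cluster prototypes
# [LR]_i is the centroid of the cluster that row i belongs to:
print(np.allclose(L @ R, R[km.labels_]))  # True
```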

13 NMF is not unique
- Factors are not ordered. One way of ordering: by decreasing Frobenius norm of the components (i.e., order by ‖L_k R_k‖_F)
- Factors/components are not unique (the slide illustrates this with different exact factorizations of the same matrix; the entries did not survive this transcription)
- Additional constraints or regularization can encourage uniqueness

14 NMF is not hierarchical
(The slide contrasts a rank-1 NMF with a rank-2 NMF of the same matrix; the entries did not survive this transcription.)
- The best rank-k approximation may differ significantly from the best rank-(k−1) approximation
- Rank influences sparsity, interpretability, and statistical fidelity
- The optimal choice of rank is not well-studied (often requires experimentation)

15 Outline
1 Non-Negative Matrix Factorization
2 Algorithms
3 Probabilistic Latent Semantic Analysis
4 Summary

16 NMF is difficult
We focus on minimizing L(L, R) = ‖D − LR‖²_F.
- For varying m, n, and r, the problem is NP-hard
- When rank(D) = 1 (or r = 1), it can be solved in polynomial time:
  1 Take the first non-zero column of D as L (m × 1)
  2 Determine R (1 × n) entry by entry (using the fact that D_j = L R_j)
- The problem is not convex: a local optimum may not correspond to a global optimum, and there is generally little hope of finding the global optimum
- But: the problem is biconvex
  - For fixed R, f(L) = ‖D − LR‖²_F is convex:
    f(L) = Σ_i ‖D_i − L_i R‖²_F
    ∇_{L_ik} f(L) = −2 (D_i − L_i R) R_k^T   (chain rule, product rule)
    ∇²_{L_ik} f(L) = 2 R_k R_k^T ≥ 0   (does not depend on L)
  - Likewise, for fixed L, f(R) = ‖D − LR‖²_F is convex
- Biconvexity allows for efficient algorithms (see the gradient check below)
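As a quick sanity check (not from the slides), the gradient formula above can be verified numerically with central finite differences:

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.random((5, 4)); L = rng.random((5, 2)); R = rng.random((2, 4))

def f(L):
    return np.linalg.norm(D - L @ R, 'fro') ** 2

grad = -2 * (D - L @ R) @ R.T        # analytic gradient w.r.t. L

eps, fd = 1e-6, np.zeros_like(L)     # central finite differences
for i in range(L.shape[0]):
    for k in range(L.shape[1]):
        E = np.zeros_like(L); E[i, k] = eps
        fd[i, k] = (f(L + E) - f(L - E)) / (2 * eps)

print(np.max(np.abs(grad - fd)))     # ~1e-8: the derivation checks out
```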

17 General framework
- Gradient descent is generally slow; stochastic gradient descent is inappropriate
- Key approach: alternating minimization
  1: Pick starting points L and R
  2: while not converged do
  3:   Keep R fixed, optimize L
  4:   Keep L fixed, optimize R
  5: end while
- Update steps 3 and 4 are easier than the full problem
- Also called alternating projections or (block) coordinate descent
- Starting point: random; multi-start initialization (try multiple random starting points, run a few epochs, continue with the best); based on SVD; ...
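A minimal sketch of this framework in Python (an illustration; `update_L` and `update_R` are hypothetical placeholders for the concrete rules discussed on the following slides):

```python
import numpy as np

def alternating_minimization(D, r, update_L, update_R, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    m, n = D.shape
    L = rng.random((m, r))      # random non-negative starting point
    R = rng.random((r, n))
    for _ in range(iters):
        L = update_L(D, L, R)   # step 3: keep R fixed, optimize L
        R = update_R(D, L, R)   # step 4: keep L fixed, optimize R
    return L, R
```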

18 Example
Ignore non-negativity for now. Consider the regularized least-squares error:
L(L, R) = ‖D − LR‖²_F + λ(‖L‖²_F + ‖R‖²_F)
By setting m = n = r = 1, D = (1), and λ = 0.05, we obtain
L(l, r) = (1 − lr)² + 0.05(l² + r²)
∂L/∂l = −2r(1 − lr) + 0.1 l
∂L/∂r = −2l(1 − lr) + 0.1 r
Local optima: (√(19/20), √(19/20)) and (−√(19/20), −√(19/20)); stationary point: (0, 0).
(The slide also shows a surface plot of L(l, r).)

19 Example (ALS)
f(l, r) = (1 − lr)² + 0.05(l² + r²)
l ← argmin_l f(l, r) = 2r / (2r² + 0.1)
r ← argmin_r f(l, r) = 2l / (2l² + 0.1)
(The slide tabulates the iterates (step, l, r); the values did not survive this transcription, but see the sketch below.)
Converges to a local minimum.
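A sketch reproducing this 1-D ALS iteration (the starting point is an assumption; the slide's actual iterate values are not preserved):

```python
l = r = 0.1                          # hypothetical starting point
for step in range(1, 11):
    l = 2 * r / (2 * r**2 + 0.1)     # argmin over l for fixed r
    r = 2 * l / (2 * l**2 + 0.1)     # argmin over r for fixed l
    print(step, round(l, 4), round(r, 4))
# the iterates approach l = r = sqrt(19/20) ~ 0.9747
```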

20 Example (ALS)
Same objective and update rules as before, but a different starting point:
f(l, r) = (1 − lr)² + 0.05(l² + r²), l ← 2r/(2r² + 0.1), r ← 2l/(2l² + 0.1)
(Iterate values not preserved in this transcription.)
Converges to the stationary point (0, 0).

21 Alternating non-negative least squares (ANLS)
Uses non-negative least-squares approximations of L and R:
argmin_{L ∈ R^{m×r}_+} ‖D − LR‖²_F and argmin_{R ∈ R^{r×n}_+} ‖D − LR‖²_F
Equivalently: find a non-negative least-squares solution to LR = D.
Common approach: solve the unconstrained least-squares problems and remove negative values. E.g., when the columns (rows) of L (R) are linearly independent, set
L = [D R⁻]_ε and R = [L⁻ D]_ε, where
- R⁻ = R^T (R R^T)⁻¹ is the right pseudo-inverse of R
- L⁻ = (L^T L)⁻¹ L^T is the left pseudo-inverse of L
- [a]_ε = max{ε, a}, element-wise, for ε = 0 or some small constant (e.g., ε = 10⁻⁹)
Difficult to analyze due to the non-linear update steps. Often slow convergence to a bad local minimum (better when regularized).
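A minimal ANLS sketch along these lines (illustrative, not the lecture's reference implementation), using unconstrained least-squares solves followed by truncation at ε:

```python
import numpy as np

def anls(D, r, iters=200, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    m, n = D.shape
    L = rng.random((m, r)); R = rng.random((r, n))
    for _ in range(iters):
        # L = [D R^-]_eps via an unconstrained least-squares solve
        L = np.maximum(np.linalg.lstsq(R.T, D.T, rcond=None)[0].T, eps)
        # R = [L^- D]_eps
        R = np.maximum(np.linalg.lstsq(L, D, rcond=None)[0], eps)
    return L, R
```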

22 Example (ANLS)
f(l, r) = (1 − lr)² + 0.05(l² + r²), with ε = 10⁻⁹:
l ← [2r / (2r² + 0.1)]_ε
r ← [2l / (2l² + 0.1)]_ε
(Iterate values not preserved in this transcription.) Converges to a local minimum.

23 Hierarchical alternating least squares (HALS)
- Work locally on a single factor, then proceed to the next factor, and so on
- Let D^(k) be the residual matrix (error) when the k-th factor is removed:
  D^(k) = D − LR + L_k R_k = D − Σ_{k′≠k} L_{k′} R_{k′}
- HALS minimizes ‖D^(k) − L_k R_k‖²_F for k = 1, 2, ..., r, 1, ... (equivalently: finds the best solution for the k-th factor, fixing the rest)
- In each iteration, set (once or multiple times):
  L_k = [ D^(k) R_k^T / ‖R_k‖²_F ]_ε and R_k^T = [ (D^(k))^T L_k / ‖L_k‖²_F ]_ε
- D^(k) can be maintained incrementally, giving a fast implementation:
  D^(k+1) = D^(k) − L_k R_k + L_{k+1} R_{k+1}
- Often better performance in practice than ANLS
- Converges to a stationary point when initialized with a positive matrix and sufficiently small ε > 0
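A compact HALS sketch following these update rules (for clarity the residual D^(k) is recomputed rather than maintained incrementally, so this is not the fast variant):

```python
import numpy as np

def hals(D, r, sweeps=200, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    m, n = D.shape
    L = rng.random((m, r)) + eps
    R = rng.random((r, n)) + eps
    for _ in range(sweeps):
        for k in range(r):
            # residual with the k-th factor removed
            Dk = D - L @ R + np.outer(L[:, k], R[k, :])
            L[:, k] = np.maximum(Dk @ R[k, :] / (R[k, :] @ R[k, :]), eps)
            R[k, :] = np.maximum(Dk.T @ L[:, k] / (L[:, k] @ L[:, k]), eps)
    return L, R
```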

24 Multiplicative updates
Gradient-descent step with step size η_kj:
R_kj ← R_kj + η_kj ([L^T D]_kj − [L^T L R]_kj)
Setting η_kj = R_kj / [L^T L R]_kj, we obtain the multiplicative update rules
L ← L ∘ (D R^T) / (L R R^T) and R ← R ∘ (L^T D) / (L^T L R),
where multiplication (∘) and division are element-wise.
- Does not necessarily find the optimal L (or R), but can be shown to never increase the loss
- Faster than ANLS (no computation of pseudo-inverses), easy to implement and parallelize
- Zeros in the factors are problematic (divisions become undefined); a stabilized variant:
  L ← L ∘ [D R^T]_ε / (L R R^T + ε) and R ← R ∘ [L^T D]_ε / (L^T L R + ε)
Lee and Seung, 2001. Cichocki et al., 2009.
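The stabilized multiplicative updates in a few lines of numpy (a sketch, not tuned for performance):

```python
import numpy as np

def nmf_mu(D, r, iters=500, eps=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    m, n = D.shape
    L = rng.random((m, r)); R = rng.random((r, n))
    for _ in range(iters):
        # element-wise multiply by the eps-stabilized update factors
        R *= np.maximum(L.T @ D, eps) / (L.T @ L @ R + eps)
        L *= np.maximum(D @ R.T, eps) / (L @ R @ R.T + eps)
    return L, R
```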

25 Example (multiplicative updates)
f(l, r) = (1 − lr)² + 0.05(l² + r²)
l ← l · r / (0.05 l + l r²)
r ← r · l / (0.05 r + l² r)
(Iterate values not preserved in this transcription.) Converges to a local minimum.

26 Outline
1 Non-Negative Matrix Factorization
2 Algorithms
3 Probabilistic Latent Semantic Analysis
4 Summary

27 Topic modeling
Consider a document-word matrix D constructed from some corpus, with columns air, water, pollution, democrat, republican and rows doc 1, ..., doc 5 (the count values did not survive this transcription).
The documents seem to talk about two topics:
1 Environment (with words air, water, and pollution)
2 Congress (with words democrat and republican)
Can we automatically detect topics in documents?

28 A probabilistic viewpoint
Let's normalize D such that its entries sum to unity (entry values not preserved in this transcription).
- Put all words in an urn and draw. The probability of drawing word w from document d is given by P(d, w) = D_dw
- Matrix D can represent any probability distribution over (document, word) pairs
- plsa tries to find a distribution that is close to D but exposes information about topics

29 Probabilistic latent semantic analysis (plsa)
Definition (plsa, NMF formulation). Given a rank r, find matrices L, Σ, and R such that D ≈ LΣR, where
- L (m × r) is a non-negative, column-stochastic matrix (columns sum to unity),
- Σ (r × r) is a non-negative, diagonal matrix that sums to unity, and
- R (r × n) is a non-negative, row-stochastic matrix (rows sum to unity).
The divergence behind "≈" is usually taken to be the (generalized) KL divergence. Additional regularization or tempering is necessary to avoid overfitting.
Hofmann, 1999.

30 Example
plsa factorization D ≈ LΣR of the example matrix, with word columns air, water, pollution, democrat, republican (the factor entries, e.g. values such as 0.9 and 0.6, did not fully survive this transcription).
- The rank r corresponds to the number of topics
- Σ_kk corresponds to the overall frequency of topic k
- L_dk corresponds to the contribution of document d to topic k
- R_kw corresponds to the frequency of word w in topic k
The plsa constraints allow for a probabilistic interpretation:
P(d, w) ≈ [LΣR]_dw = Σ_k Σ_kk L_dk R_kw = Σ_k P(k) P(d|k) P(w|k)
The plsa model imposes conditional independence constraints, i.e., a restricted space of distributions.

31 Another example
Concepts (… of 128) extracted from Science Magazine articles (12K). Hofmann, 1999.

32 plsa geometry
Rewrite the probabilistic formulation:
P(d, w) = Σ_k P(k) P(d|k) P(w|k)
P(w|d) = Σ_z P(w|z) P(z|d)
Generative process for creating a word:
1 Pick a document according to P(d)
2 Select a topic according to P(z|d)
3 Select a word according to P(w|z)
(The accompanying figure did not survive this transcription.) Hofmann, 1999.

33 Kullback-Leibler divergence (1)
Let D̃ be the unnormalized word-count data and denote by N the total number of words. The likelihood of seeing D̃ when drawing N words with replacement is proportional to
Π_{d=1}^{m} Π_{w=1}^{n} P(d, w)^{D̃_dw}
plsa maximizes the log-likelihood of seeing the data given the model:
log P(D̃ | L, Σ, R) ∝ Σ_{d=1}^{m} Σ_{w=1}^{n} D̃_dw log P(d, w | L, Σ, R)
= Σ_{d=1}^{m} Σ_{w=1}^{n} D̃_dw log [LΣR]_dw
= −Σ_{d=1}^{m} Σ_{w=1}^{n} D̃_dw log (D_dw / [LΣR]_dw) + c,
where the last sum is (N times) the Kullback-Leibler divergence between D and LΣR; maximizing the likelihood thus minimizes the KL divergence.
Gaussier and Goutte, 2005.
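A small numerical check (illustrative, not from the slides) that the log-likelihood and the KL divergence differ only by a constant in the model Q, so maximizing one minimizes the other:

```python
import numpy as np

rng = np.random.default_rng(0)
Dt = rng.integers(1, 5, size=(4, 3)).astype(float)  # word counts (all > 0)
N = Dt.sum()
D = Dt / N                                          # normalized data

for _ in range(3):                                  # a few random models Q
    Q = rng.random((4, 3)); Q /= Q.sum()
    loglik = np.sum(Dt * np.log(Q))                 # sum_dw D~_dw log Q_dw
    kl = np.sum(D * np.log(D / Q))
    print(loglik / N + kl)                          # constant: sum_dw D log D
```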

34 Kullback-Leibler divergence (2)
KL divergence:
D_KL(P ‖ Q) = Σ_{d=1}^{m} Σ_{w=1}^{n} P_dw log (P_dw / Q_dw)
- Interpretation: expected number of extra bits needed for encoding a value drawn from P using an optimal code for distribution Q
- D_KL(P ‖ Q) ≥ D_KL(P ‖ P) = 0
- D_KL(P ‖ Q) ≠ D_KL(Q ‖ P) in general
NMF-based plsa algorithms minimize the generalized KL divergence
D_GKL(P̃ ‖ Q̃) = Σ_{d=1}^{m} Σ_{w=1}^{n} ( P̃_dw log (P̃_dw / Q̃_dw) − P̃_dw + Q̃_dw ),
where P̃ = D̃ and Q̃ = L̃ΣR̃.
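The generalized KL divergence transcribes directly into code (a sketch; the convention 0·log(0/q) = 0 is handled explicitly):

```python
import numpy as np

def gkl(P, Q, tiny=1e-12):
    """Generalized KL divergence between non-negative arrays P and Q."""
    P = np.asarray(P, float); Q = np.asarray(Q, float)
    ratio = np.where(P > tiny, P, 1.0) / np.maximum(Q, tiny)
    logterm = np.where(P > tiny, P * np.log(ratio), 0.0)  # 0*log(0/q) := 0
    return float(np.sum(logterm - P + Q))
```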

35 Multiplicative updates for GKL (w/o tempering)
We first find a decomposition D̃ ≈ L̃R̃, where L̃ and R̃ are non-negative matrices.
Update rules:
L̃ ← L̃ ∘ [ (D̃ / (L̃R̃)) R̃^T ] diag(1 / rowsums(R̃))
R̃ ← R̃ ∘ diag(1 / colsums(L̃)) [ L̃^T (D̃ / (L̃R̃)) ]
(multiplication ∘ and division are element-wise). GKL is non-increasing under these update rules.
Normalize by rescaling the columns of L̃ and the rows of R̃ to obtain
L = L̃ diag(1 / colsums(L̃))
R = diag(1 / rowsums(R̃)) R̃
Σ̃ = diag(colsums(L̃) ∘ rowsums(R̃))
Σ = Σ̃ / Σ_k Σ̃_kk
Lee and Seung, 2001.
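Putting the update rules and the normalization together, a plsa sketch under these formulas (illustrative; `D` is an unnormalized document-word count matrix):

```python
import numpy as np

def plsa(D, r, iters=500, eps=1e-12, seed=0):
    rng = np.random.default_rng(seed)
    m, n = D.shape
    Lt = rng.random((m, r)); Rt = rng.random((r, n))   # L~ and R~
    for _ in range(iters):
        ratio = D / np.maximum(Lt @ Rt, eps)
        Lt *= (ratio @ Rt.T) / np.maximum(Rt.sum(axis=1), eps)
        ratio = D / np.maximum(Lt @ Rt, eps)
        Rt *= (Lt.T @ ratio) / np.maximum(Lt.sum(axis=0), eps)[:, None]
    L = Lt / Lt.sum(axis=0)                  # column-stochastic
    R = Rt / Rt.sum(axis=1)[:, None]         # row-stochastic
    sigma = Lt.sum(axis=0) * Rt.sum(axis=1)  # topic frequencies
    return L, np.diag(sigma / sigma.sum()), R
```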

36 Applications of plsa
- Topic modeling
- Clustering documents
- Clustering terms
- Information retrieval:
  - Treat the query q as a new document (new row in D and L)
  - Determine P(k|q) by keeping Σ and R fixed ("fold in" the query)
  - Retrieve documents with a similar topic mixture as the query
  - Can deal with synonymy and polysemy
- Better generalization performance than LSA (= SVD), especially with tempering
- In practice, outperformed by Latent Dirichlet Allocation (LDA)
Hofmann, 1999.

37 Outline
1 Non-Negative Matrix Factorization
2 Algorithms
3 Probabilistic Latent Semantic Analysis
4 Summary

38 Lessons learned
- Non-negative matrix factorization (NMF) appears natural for non-negative data
- NMF encourages a parts-based decomposition, interpretability, and (sometimes) sparseness
- Many variants, many applications
- Usually solved via alternating minimization algorithms:
  - Alternating non-negative least squares (ANLS)
  - Projected gradient
  - (Local) hierarchical ALS (HALS)
  - Multiplicative updates
- plsa is an approach to topic modeling that can be seen as an NMF

39 Literature
- David Skillicorn: Understanding Complex Datasets: Data Mining with Matrix Decompositions (Chapter 8). Chapman and Hall, 2007.
- Andrzej Cichocki, Rafal Zdunek, Anh Huy Phan, and Shun-ichi Amari: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis and Blind Source Separation. Wiley, 2009.
- Yifeng Li and Alioune Ngom: The NMF MATLAB Toolbox.
- Renaud Gaujoux and Cathal Seoighe: NMF R package.
Further references are given at the bottom of the slides.
