Non-Negative Matrix Factorization

Size: px
Start display at page:

Download "Non-Negative Matrix Factorization"

Transcription

1 Chapter 3 Non-Negative Matrix Factorization Part 2: Variations & applications

2 Geometry of NMF 2

3 Geometry of NMF 1 NMF factors.9.8 Data points.7.6 Convex cone.5.4 Projections

4 Sparsity in NMF Skillicorn chapter 8; Berry et al. (27) 4

5 Sparsity: desiderata Sparse factor matrices are often preferred Simpler to interpret (zeroes can be ignored) Agrees with our intuition on parts of whole Faster computations, less space NMF is sometimes claimed to automatically yield sparse factors In practice, this is often not the case 5

6 Enforcing sparsity A common solution: change the target function to minimize Parameters Regularizers 1 2 ka WHk2 F + density(w)+ density(h) How to define sparsity? Naïve: density(w) = 1 nnz(w)/size(w) Non-convex, non-nice to optimize directly 6

7 Frobenius regularizer A.k.a. Tikhonov regularizer Minimize 1 2 Ä ka WHk 2 F + kwk2 F + khk2 F Doesn t help much with sparsity Used to impose smoothness 7

8 ALS-NMF with Frobenius regularizers The update rules for ALS with Frobenius regularizer are W H (AH T )(HH T + ) 1 + (W T W + ) 1 (W T A) + Uses the fact that X + = (X T X) 1 X T if X has full column rank (homework) 8

9 L 1 (Lasso) regularizer Using L 1 based instead of L 2 based regularizer helps obtaining sparse solutions 1 2 ka WHk2 F + P,j W j + P,j H j Larger values of α and β yield sparse solutions (e.g. α, β [.1,.5]) Still no guarantees on sparsity 9

10 ALS-NMF with Lasso The update rules are W (AH T 1 n k )(HH T ) 1 + H (W T W) 1 (W T A 1 k m ) + 1 n k is n-by-k matrix of all 1s Requires columns of W to be normalized to unit L 1 after each update W j W j / P W j 1

11 Hoyer s sparse NMF Hoyer (24) considers the following sparsity function for n-dimensional vector x sp rsity( )= p n k k1 / k k 2 p n 1 sparsity(x) = 1 iff nnz(x) = 1 sparsity(x) = iff x i = x j for all i, j Hoyer 24 11

12 Hoyer s sparse NMF Hoyer s algorithm obtains an NMF using factor matrices with user-defined level of sparsity After every update, the columns of W and rows of H are updated s.t. their L 2 is constant and L 1 is set to desired level of sparsity 12

13 Getting the desired level of sparsity Set s = x + (L 1 i x i )/dim(x) and Z = {} Fix L 1 repeat Set m i = L 1 /(dim(x) size(z)) for i Z and m i = o/w Set s = m + α(s m) where α is s.t. s 2 = L 2 Fix L 2 If s, return s Are we done? Set Z = Z {i : s i < } and s i = for all i Z Set c = ( i s i L 1 )/(dim(x) size(z)) Set s i = s i c for all i Z Truncate negative values and fix L 1 again 13

14 Other forms of NMF 14

15 Normalized NMF Columns of W (and/or rows of H) should be normalized to sum to unity Stability of the solution and interpretability If only W (or H) is normalized, the weights can be pushed to the other matrix To normalize both, use k-by-k diagonal Σ s.t. σ ii = W i 1 (H T ) i 1 Normalized NMF: WΣH 15

16 Semi-orthogonal NMF In semi-orthogonal NMF we restrict H to roworthogonal: minimize A WH F s.t. HH T = I and W and H are nonnegative Solutions are unique (up to permutations) The problem is equivalent to k-means In the sense that the optimal solutions have the same value Ding et al

17 Geometry of semiorthogonal NMF The orthogonal factors span a cone 17

18 NMF and clustering In k-means, we minimize P k j=1 P 2C j j 2 2 = P k j=1 P n =1 G j j 2 2 μ j is the centroid of the jth cluster C j G is n-by-k cluster assignment matrix G ij = 1 if i C j and otherwise Equivalently: ka GMk 2 F Type of NMF if A is nonnegative! M is k-by-m containing the centroids as its rows 18

19 Orthogonal tri-factor NMF We can find NMF where both W and H are (column/row) orthogonal Often too restrictive; cannot handle different scales In orthogonal nonnegative tri-factorization we add third non-negative matrix S: minimize A WSH F s.t. W T W = I, HH T = I, and all matrices are non-negative 19

20 More on tri-factorization S does not have to be square W is n-by-k, S is k-by-l, H is l-by-m Different number of row and column factors If orthogonal NMF clusters columns of A, this bi-clusters rows and columns simultaneously 2

21 Computing semiorthogonal NMF If H has to be orthogonal, either update as usual and set after every iteration H [HH T ] 1/2 H ; or update H j H j v t (W T A) j (W T AH T H) j W is updated as usual (w/o constraints) If W needs to be orthogonal, the update rules are changed accordingly 21

22 Computing the orthogonal tri-factorization The update rules for orthogonal trifactorization are H j H j v ut (WS) T A j (WS) T AH T H j W j W j v t (A(SH) T ) j (WW T A(SH) T ) j S j S j v t (W T AH T ) j (W T WSHH T ) j 22

23 Other optimization functions 23

24 Kullback Leibler divergence The Kullback Leibler divergence of Q from P, D KL (P Q), measures the expected number of extra bits required to code samples from P when using a code optimized for Q D KL (PkQ) = P P( ) ln P( ) Q( ) P and Q are probability distributions Non-negative and non-symmetric 24

25 Generalized KL-divergence and matrix factorizations The standard KL-divergence requires P and Q be probability distributions (e.g. i P(i) = 1) The generalized KL-divergence (or I-divergence) removes this requirement: P Ä D GKL (PkQ) = P( ) ln P( ) ä P( )+Q( ) Q( ) In NMF, P = A and P Q = WH : D GKL (AkWH)=,j A j ln A j (WH) j A j +(WH) j 25

26 KL v.s. GKL in NMF KL requires A to be considered as a probability distribution i,j A i,j = 1 (or row/column normalization) WH should be normalized the same way GKL only requires non-negativity But inherently assumes integer data Looses a bit of the probability interpretation 26

27 NMF for GKL The update rules for multiplicative GKL NMF algorithm are H kj H kj P n =1 W k(a j /(WH) j ) P n =1 W k W k W k P m j=1 (A j/(wh) j )H kj P m j=1 H kj The columns of W are normalized to sum to unity after every iteration 27

28 Applications of NMF 28

29 Component interpretation NMF s main sales argument is the component interpretation A w 1 h 1 + w 2 h w k h k Each rank-1 component has parts-ofwhole interpretation Nothing is ever removed 29

30 Hand-written digits Cichocki et al. 29 3

31 Factor interpretation NMF can be seen as a nonnegative mixture of nonnegative factors The factors can capture underlying nonnegative phenomena The nonnegative coefficients potentially help with the interpretation 31

32 Geometric interpretation NMF factors are not (generally) orthogonal They do not create a coordinate system Span a convex cone Projection to the space spanned by the factors can yield odd results Points that are far away in the original space get close in the cone and vice versa 32

33 Separation of various spectra In spectroscopy imaging we often have multiple observations of signals over some spectrum Observations-by-spectrum non-negative matrix The signals constitute an additive mixture of pure signals 33

34 Raman spectroscopy Ground truth Observation Estimated Estimated Cichocki et al

35 Text mining and plsa Consider a document term matrix A a ij is the number of times term j appears in document i Can we find these topics automatically? A = Environmet air water pollution democrat republican doc doc doc 3 B 1 11 doc 4@ 8 5 doc Politics 1 C A 35

36 The idea Normalized A that sums to 1 can be considered as a probability distribution P(d, w) = A d,w P(d, )= P k P(k)P(d k)p( k) Model with topics: P( d) = P z P( z)p(z d) 36

37 Generative process Pick a document according to P(d) P [ w d ] P [ z d ] P [ w z ] z economic Select a topic according to P(z d) imports Select a word according to TRADE embargo P(w z) documents d latent concepts z (aspects) terms w (words) 37

38 plsa as NMF In the NMF version of the probabilistic latent semantic analysis (plsa) we are given documents-by-terms matrix A and rank r We have to find n-by-r non-negative W (columns sum to unity) r-by-r diagonal non-negative Σ r-by-m non-negative H (rows sum to unity) Minimizing D GKL (A WΣH) 38

39 Geometry of plsa T. Hofmann Unsupervised learning by probabilistic latent semantic analysis

40 plsa example air wat pol dem rep air wat pol dem rep D L R A W Σ H Here, A is normalized How strong the topic is Overall frequency How strong the word is in the in the document? topic? 4

41 NMF algorithm for plsa Compute W and H using GKL NMF algorithm Normalize columns (rows) of W (H) and put the multipliers to Σ Normalize Σ to sum to unity Real implementations would require tempering to avoid over-fitting 41

42 plsa example Source: Thomas Hofmann, Tutorial at ADFOCS 24 42

43 plsa applications Topic modeling Clustering documents and terms Information retrieval Similar to LSA/LSI Generalizes better than LSA But outperformed by Latent Dirichlet Allocation (LDA) 43

44 NMF summary Parts-of-whole interpretation Often easier/more appropriate than SVD Hard to compute and non-unique Local updates (multiplicative, gradient, ALS) Many applications and specific variations Still under very active research 44

45 Literature Berry, Browne, Langville, Pauca & Plemmons (27): Algorithms and applications for approximate nonnegative matrix factorization. Comput. Stat. Data Anal. 52, pp Hoyer (24): Non-negative matrix factorizations with sparseness constraints. J. Mach. Learn. Res. 5, pp Ding, Li, Peng & Park (26): Orthogonal nonnegative matrix tri-factorizations for clustering. In KDD 6, pp Cichocki, Zdunek, Phan & Amari (29): Nonnegative matrix and tensor factorizations. John Wiley & Sons Chichester, UK 45

Data Mining and Matrices

Data Mining and Matrices Data Mining and Matrices 6 Non-Negative Matrix Factorization Rainer Gemulla, Pauli Miettinen May 23, 23 Non-Negative Datasets Some datasets are intrinsically non-negative: Counters (e.g., no. occurrences

More information

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang Matrix Factorization & Latent Semantic Analysis Review Yize Li, Lanbo Zhang Overview SVD in Latent Semantic Indexing Non-negative Matrix Factorization Probabilistic Latent Semantic Indexing Vector Space

More information

Non-negative Matrix Factorization: Algorithms, Extensions and Applications

Non-negative Matrix Factorization: Algorithms, Extensions and Applications Non-negative Matrix Factorization: Algorithms, Extensions and Applications Emmanouil Benetos www.soi.city.ac.uk/ sbbj660/ March 2013 Emmanouil Benetos Non-negative Matrix Factorization March 2013 1 / 25

More information

Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs

Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs Rank Selection in Low-rank Matrix Approximations: A Study of Cross-Validation for NMFs Bhargav Kanagal Department of Computer Science University of Maryland College Park, MD 277 bhargav@cs.umd.edu Vikas

More information

PROBABILISTIC LATENT SEMANTIC ANALYSIS

PROBABILISTIC LATENT SEMANTIC ANALYSIS PROBABILISTIC LATENT SEMANTIC ANALYSIS Lingjia Deng Revised from slides of Shuguang Wang Outline Review of previous notes PCA/SVD HITS Latent Semantic Analysis Probabilistic Latent Semantic Analysis Applications

More information

Nonnegative Matrix Factorization. Data Mining

Nonnegative Matrix Factorization. Data Mining The Nonnegative Matrix Factorization in Data Mining Amy Langville langvillea@cofc.edu Mathematics Department College of Charleston Charleston, SC Yahoo! Research 10/18/2005 Outline Part 1: Historical Developments

More information

Non-Negative Matrix Factorization

Non-Negative Matrix Factorization Chapter 3 Non-Negative Matrix Factorization Part : Introduction & computation Motivating NMF Skillicorn chapter 8; Berry et al. (27) DMM, summer 27 2 Reminder A T U Σ V T T Σ, V U 2 Σ 2,2 V 2.8.6.6.3.6.5.3.6.3.6.4.3.6.4.3.3.4.5.3.5.8.3.8.3.3.5

More information

Document and Topic Models: plsa and LDA

Document and Topic Models: plsa and LDA Document and Topic Models: plsa and LDA Andrew Levandoski and Jonathan Lobo CS 3750 Advanced Topics in Machine Learning 2 October 2018 Outline Topic Models plsa LSA Model Fitting via EM phits: link analysis

More information

Data Mining and Matrices

Data Mining and Matrices Data Mining and Matrices 05 Semi-Discrete Decomposition Rainer Gemulla, Pauli Miettinen May 16, 2013 Outline 1 Hunting the Bump 2 Semi-Discrete Decomposition 3 The Algorithm 4 Applications SDD alone SVD

More information

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing

Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing Note on Algorithm Differences Between Nonnegative Matrix Factorization And Probabilistic Latent Semantic Indexing 1 Zhong-Yuan Zhang, 2 Chris Ding, 3 Jie Tang *1, Corresponding Author School of Statistics,

More information

Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds

Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds Orthogonal Nonnegative Matrix Factorization: Multiplicative Updates on Stiefel Manifolds Jiho Yoo and Seungjin Choi Department of Computer Science Pohang University of Science and Technology San 31 Hyoja-dong,

More information

COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017

COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017 COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University TOPIC MODELING MODELS FOR TEXT DATA

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr

More information

MULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION

MULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION MULTIPLICATIVE ALGORITHM FOR CORRENTROPY-BASED NONNEGATIVE MATRIX FACTORIZATION Ehsan Hosseini Asl 1, Jacek M. Zurada 1,2 1 Department of Electrical and Computer Engineering University of Louisville, Louisville,

More information

Non-negative matrix factorization with fixed row and column sums

Non-negative matrix factorization with fixed row and column sums Available online at www.sciencedirect.com Linear Algebra and its Applications 9 (8) 5 www.elsevier.com/locate/laa Non-negative matrix factorization with fixed row and column sums Ngoc-Diep Ho, Paul Van

More information

CPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2018

CPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2018 CPSC 340: Machine Learning and Data Mining Sparse Matrix Factorization Fall 2018 Last Time: PCA with Orthogonal/Sequential Basis When k = 1, PCA has a scaling problem. When k > 1, have scaling, rotation,

More information

Probabilistic Latent Variable Models as Non-Negative Factorizations

Probabilistic Latent Variable Models as Non-Negative Factorizations 1 Probabilistic Latent Variable Models as Non-Negative Factorizations Madhusudana Shashanka, Bhiksha Raj, Paris Smaragdis Abstract In this paper, we present a family of probabilistic latent variable models

More information

Clustering. SVD and NMF

Clustering. SVD and NMF Clustering with the SVD and NMF Amy Langville Mathematics Department College of Charleston Dagstuhl 2/14/2007 Outline Fielder Method Extended Fielder Method and SVD Clustering with SVD vs. NMF Demos with

More information

CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization

CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization Tim Roughgarden February 28, 2017 1 Preamble This lecture fulfills a promise made back in Lecture #1,

More information

Non-negative matrix factorization and term structure of interest rates

Non-negative matrix factorization and term structure of interest rates Non-negative matrix factorization and term structure of interest rates Hellinton H. Takada and Julio M. Stern Citation: AIP Conference Proceedings 1641, 369 (2015); doi: 10.1063/1.4906000 View online:

More information

Notes on Latent Semantic Analysis

Notes on Latent Semantic Analysis Notes on Latent Semantic Analysis Costas Boulis 1 Introduction One of the most fundamental problems of information retrieval (IR) is to find all documents (and nothing but those) that are semantically

More information

On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing

On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing Computational Statistics and Data Analysis 52 (2008) 3913 3927 www.elsevier.com/locate/csda On the equivalence between Non-negative Matrix Factorization and Probabilistic Latent Semantic Indexing Chris

More information

Quick Introduction to Nonnegative Matrix Factorization

Quick Introduction to Nonnegative Matrix Factorization Quick Introduction to Nonnegative Matrix Factorization Norm Matloff University of California at Davis 1 The Goal Given an u v matrix A with nonnegative elements, we wish to find nonnegative, rank-k matrices

More information

SVD, PCA & Preprocessing

SVD, PCA & Preprocessing Chapter 1 SVD, PCA & Preprocessing Part 2: Pre-processing and selecting the rank Pre-processing Skillicorn chapter 3.1 2 Why pre-process? Consider matrix of weather data Monthly temperatures in degrees

More information

CS598 Machine Learning in Computational Biology (Lecture 5: Matrix - part 2) Professor Jian Peng Teaching Assistant: Rongda Zhu

CS598 Machine Learning in Computational Biology (Lecture 5: Matrix - part 2) Professor Jian Peng Teaching Assistant: Rongda Zhu CS598 Machine Learning in Computational Biology (Lecture 5: Matrix - part 2) Professor Jian Peng Teaching Assistant: Rongda Zhu Feature engineering is hard 1. Extract informative features from domain knowledge

More information

CPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2017

CPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2017 CPSC 340: Machine Learning and Data Mining Sparse Matrix Factorization Fall 2017 Admin Assignment 4: Due Friday. Assignment 5: Posted, due Monday of last week of classes Last Time: PCA with Orthogonal/Sequential

More information

Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering

Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering Regularized NNLS Algorithms for Nonnegative Matrix Factorization with Application to Text Document Clustering Rafal Zdunek Abstract. Nonnegative Matrix Factorization (NMF) has recently received much attention

More information

Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-square Statistic, and a Hybrid Method

Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-square Statistic, and a Hybrid Method Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, hi-square Statistic, and a Hybrid Method hris Ding a, ao Li b and Wei Peng b a Lawrence Berkeley National Laboratory,

More information

Automatic Rank Determination in Projective Nonnegative Matrix Factorization

Automatic Rank Determination in Projective Nonnegative Matrix Factorization Automatic Rank Determination in Projective Nonnegative Matrix Factorization Zhirong Yang, Zhanxing Zhu, and Erkki Oja Department of Information and Computer Science Aalto University School of Science and

More information

Information retrieval LSI, plsi and LDA. Jian-Yun Nie

Information retrieval LSI, plsi and LDA. Jian-Yun Nie Information retrieval LSI, plsi and LDA Jian-Yun Nie Basics: Eigenvector, Eigenvalue Ref: http://en.wikipedia.org/wiki/eigenvector For a square matrix A: Ax = λx where x is a vector (eigenvector), and

More information

Simplicial Nonnegative Matrix Tri-Factorization: Fast Guaranteed Parallel Algorithm

Simplicial Nonnegative Matrix Tri-Factorization: Fast Guaranteed Parallel Algorithm Simplicial Nonnegative Matrix Tri-Factorization: Fast Guaranteed Parallel Algorithm Duy-Khuong Nguyen 13, Quoc Tran-Dinh 2, and Tu-Bao Ho 14 1 Japan Advanced Institute of Science and Technology, Japan

More information

Single-channel source separation using non-negative matrix factorization

Single-channel source separation using non-negative matrix factorization Single-channel source separation using non-negative matrix factorization Mikkel N. Schmidt Technical University of Denmark mns@imm.dtu.dk www.mikkelschmidt.dk DTU Informatics Department of Informatics

More information

Information Retrieval

Information Retrieval Introduction to Information CS276: Information and Web Search Christopher Manning and Pandu Nayak Lecture 13: Latent Semantic Indexing Ch. 18 Today s topic Latent Semantic Indexing Term-document matrices

More information

c Springer, Reprinted with permission.

c Springer, Reprinted with permission. Zhijian Yuan and Erkki Oja. A FastICA Algorithm for Non-negative Independent Component Analysis. In Puntonet, Carlos G.; Prieto, Alberto (Eds.), Proceedings of the Fifth International Symposium on Independent

More information

Non-Negative Matrix Factorization with Quasi-Newton Optimization

Non-Negative Matrix Factorization with Quasi-Newton Optimization Non-Negative Matrix Factorization with Quasi-Newton Optimization Rafal ZDUNEK, Andrzej CICHOCKI Laboratory for Advanced Brain Signal Processing BSI, RIKEN, Wako-shi, JAPAN Abstract. Non-negative matrix

More information

Modeling Environment

Modeling Environment Topic Model Modeling Environment What does it mean to understand/ your environment? Ability to predict Two approaches to ing environment of words and text Latent Semantic Analysis (LSA) Topic Model LSA

More information

Nonnegative Matrix Factorization

Nonnegative Matrix Factorization Nonnegative Matrix Factorization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Yuriy Sverchkov Intelligent Systems Program University of Pittsburgh October 6, 2011 Outline Latent Semantic Analysis (LSA) A quick review Probabilistic LSA (plsa)

More information

Matrix factorization models for patterns beyond blocks. Pauli Miettinen 18 February 2016

Matrix factorization models for patterns beyond blocks. Pauli Miettinen 18 February 2016 Matrix factorization models for patterns beyond blocks 18 February 2016 What does a matrix factorization do?? A = U V T 2 For SVD that s easy! 3 Inner-product interpretation Element (AB) ij is the inner

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING Text Data: Topic Model Instructor: Yizhou Sun yzsun@cs.ucla.edu December 4, 2017 Methods to be Learnt Vector Data Set Data Sequence Data Text Data Classification Clustering

More information

Information Retrieval and Topic Models. Mausam (Based on slides of W. Arms, Dan Jurafsky, Thomas Hofmann, Ata Kaban, Chris Manning, Melanie Martin)

Information Retrieval and Topic Models. Mausam (Based on slides of W. Arms, Dan Jurafsky, Thomas Hofmann, Ata Kaban, Chris Manning, Melanie Martin) Information Retrieval and Topic Models Mausam (Based on slides of W. Arms, Dan Jurafsky, Thomas Hofmann, Ata Kaban, Chris Manning, Melanie Martin) Sec. 1.1 Unstructured data in 1620 Which plays of Shakespeare

More information

Machine Learning. Principal Components Analysis. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012

Machine Learning. Principal Components Analysis. Le Song. CSE6740/CS7641/ISYE6740, Fall 2012 Machine Learning CSE6740/CS7641/ISYE6740, Fall 2012 Principal Components Analysis Le Song Lecture 22, Nov 13, 2012 Based on slides from Eric Xing, CMU Reading: Chap 12.1, CB book 1 2 Factor or Component

More information

Gaussian Graphical Models and Graphical Lasso

Gaussian Graphical Models and Graphical Lasso ELE 538B: Sparsity, Structure and Inference Gaussian Graphical Models and Graphical Lasso Yuxin Chen Princeton University, Spring 2017 Multivariate Gaussians Consider a random vector x N (0, Σ) with pdf

More information

CS 572: Information Retrieval

CS 572: Information Retrieval CS 572: Information Retrieval Lecture 11: Topic Models Acknowledgments: Some slides were adapted from Chris Manning, and from Thomas Hoffman 1 Plan for next few weeks Project 1: done (submit by Friday).

More information

Matrix decompositions and latent semantic indexing

Matrix decompositions and latent semantic indexing 18 Matrix decompositions and latent semantic indexing On page 113, we introduced the notion of a term-document matrix: an M N matrix C, each of whose rows represents a term and each of whose columns represents

More information

Dimension Reduction Using Nonnegative Matrix Tri-Factorization in Multi-label Classification

Dimension Reduction Using Nonnegative Matrix Tri-Factorization in Multi-label Classification 250 Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'15 Dimension Reduction Using Nonnegative Matrix Tri-Factorization in Multi-label Classification Keigo Kimura, Mineichi Kudo and Lu Sun Graduate

More information

Preserving Privacy in Data Mining using Data Distortion Approach

Preserving Privacy in Data Mining using Data Distortion Approach Preserving Privacy in Data Mining using Data Distortion Approach Mrs. Prachi Karandikar #, Prof. Sachin Deshpande * # M.E. Comp,VIT, Wadala, University of Mumbai * VIT Wadala,University of Mumbai 1. prachiv21@yahoo.co.in

More information

Machine Learning (BSMC-GA 4439) Wenke Liu

Machine Learning (BSMC-GA 4439) Wenke Liu Machine Learning (BSMC-GA 4439) Wenke Liu 02-01-2018 Biomedical data are usually high-dimensional Number of samples (n) is relatively small whereas number of features (p) can be large Sometimes p>>n Problems

More information

Nonnegative Matrix Factorization with Orthogonality Constraints

Nonnegative Matrix Factorization with Orthogonality Constraints Nonnegative Matrix Factorization with Orthogonality Constraints Jiho Yoo and Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology (POSTECH) Pohang, Republic

More information

Research Article Comparative Features Extraction Techniques for Electrocardiogram Images Regression

Research Article Comparative Features Extraction Techniques for Electrocardiogram Images Regression Research Journal of Applied Sciences, Engineering and Technology (): 6, 7 DOI:.96/rjaset..6 ISSN: -79; e-issn: -767 7 Maxwell Scientific Organization Corp. Submitted: September 8, 6 Accepted: November,

More information

cross-language retrieval (by concatenate features of different language in X and find co-shared U). TOEFL/GRE synonym in the same way.

cross-language retrieval (by concatenate features of different language in X and find co-shared U). TOEFL/GRE synonym in the same way. 10-708: Probabilistic Graphical Models, Spring 2015 22 : Optimization and GMs (aka. LDA, Sparse Coding, Matrix Factorization, and All That ) Lecturer: Yaoliang Yu Scribes: Yu-Xiang Wang, Su Zhou 1 Introduction

More information

CPSC 340: Machine Learning and Data Mining. More PCA Fall 2017

CPSC 340: Machine Learning and Data Mining. More PCA Fall 2017 CPSC 340: Machine Learning and Data Mining More PCA Fall 2017 Admin Assignment 4: Due Friday of next week. No class Monday due to holiday. There will be tutorials next week on MAP/PCA (except Monday).

More information

NMF WITH SPECTRAL AND TEMPORAL CONTINUITY CRITERIA FOR MONAURAL SOUND SOURCE SEPARATION. Julian M. Becker, Christian Sohn and Christian Rohlfing

NMF WITH SPECTRAL AND TEMPORAL CONTINUITY CRITERIA FOR MONAURAL SOUND SOURCE SEPARATION. Julian M. Becker, Christian Sohn and Christian Rohlfing NMF WITH SPECTRAL AND TEMPORAL CONTINUITY CRITERIA FOR MONAURAL SOUND SOURCE SEPARATION Julian M. ecker, Christian Sohn Christian Rohlfing Institut für Nachrichtentechnik RWTH Aachen University D-52056

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Probabilistic Latent Semantic Analysis Denis Helic KTI, TU Graz Jan 16, 2014 Denis Helic (KTI, TU Graz) KDDM1 Jan 16, 2014 1 / 47 Big picture: KDDM

More information

Dimension Reduction and Iterative Consensus Clustering

Dimension Reduction and Iterative Consensus Clustering Dimension Reduction and Iterative Consensus Clustering Southeastern Clustering and Ranking Workshop August 24, 2009 Dimension Reduction and Iterative 1 Document Clustering Geometry of the SVD Centered

More information

Nonnegative matrix factorization for interactive topic modeling and document clustering

Nonnegative matrix factorization for interactive topic modeling and document clustering Nonnegative matrix factorization for interactive topic modeling and document clustering Da Kuang and Jaegul Choo and Haesun Park Abstract Nonnegative matrix factorization (NMF) approximates a nonnegative

More information

Interaction on spectral data with: Kira Abercromby (NASA-Houston) Related Papers at:

Interaction on spectral data with: Kira Abercromby (NASA-Houston) Related Papers at: Low-Rank Nonnegative Factorizations for Spectral Imaging Applications Bob Plemmons Wake Forest University Collaborators: Christos Boutsidis (U. Patras), Misha Kilmer (Tufts), Peter Zhang, Paul Pauca, (WFU)

More information

Unsupervised Learning: Linear Dimension Reduction

Unsupervised Learning: Linear Dimension Reduction Unsupervised Learning: Linear Dimension Reduction Unsupervised Learning Clustering & Dimension Reduction ( 化繁為簡 ) Generation ( 無中生有 ) only having function input function only having function output function

More information

Detect and Track Latent Factors with Online Nonnegative Matrix Factorization

Detect and Track Latent Factors with Online Nonnegative Matrix Factorization Detect and Track Latent Factors with Online Nonnegative Matrix Factorization Bin Cao 1, Dou Shen 2, Jian-Tao Sun 3, Xuanhui Wang 4, Qiang Yang 2 and Zheng Chen 3 1 Peking university, Beijing, China 2 Hong

More information

Chapter 7 Network Flow Problems, I

Chapter 7 Network Flow Problems, I Chapter 7 Network Flow Problems, I Network flow problems are the most frequently solved linear programming problems. They include as special cases, the assignment, transportation, maximum flow, and shortest

More information

Lecture 5: Web Searching using the SVD

Lecture 5: Web Searching using the SVD Lecture 5: Web Searching using the SVD Information Retrieval Over the last 2 years the number of internet users has grown exponentially with time; see Figure. Trying to extract information from this exponentially

More information

arxiv: v2 [math.oc] 6 Oct 2011

arxiv: v2 [math.oc] 6 Oct 2011 Accelerated Multiplicative Updates and Hierarchical ALS Algorithms for Nonnegative Matrix Factorization Nicolas Gillis 1 and François Glineur 2 arxiv:1107.5194v2 [math.oc] 6 Oct 2011 Abstract Nonnegative

More information

Note for plsa and LDA-Version 1.1

Note for plsa and LDA-Version 1.1 Note for plsa and LDA-Version 1.1 Wayne Xin Zhao March 2, 2011 1 Disclaimer In this part of PLSA, I refer to [4, 5, 1]. In LDA part, I refer to [3, 2]. Due to the limit of my English ability, in some place,

More information

RETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

RETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS RETRIEVAL MODELS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Boolean model Vector space model Probabilistic

More information

Introduction to Information Retrieval

Introduction to Information Retrieval Introduction to Information Retrieval http://informationretrieval.org IIR 18: Latent Semantic Indexing Hinrich Schütze Center for Information and Language Processing, University of Munich 2013-07-10 1/43

More information

Regularized Alternating Least Squares Algorithms for Non-negative Matrix/Tensor Factorization

Regularized Alternating Least Squares Algorithms for Non-negative Matrix/Tensor Factorization Regularized Alternating Least Squares Algorithms for Non-negative Matrix/Tensor Factorization Andrzej CICHOCKI and Rafal ZDUNEK Laboratory for Advanced Brain Signal Processing, RIKEN BSI, Wako-shi, Saitama

More information

13 Searching the Web with the SVD

13 Searching the Web with the SVD 13 Searching the Web with the SVD 13.1 Information retrieval Over the last 20 years the number of internet users has grown exponentially with time; see Figure 1. Trying to extract information from this

More information

9 Searching the Internet with the SVD

9 Searching the Internet with the SVD 9 Searching the Internet with the SVD 9.1 Information retrieval Over the last 20 years the number of internet users has grown exponentially with time; see Figure 1. Trying to extract information from this

More information

Linear Algebra Background

Linear Algebra Background CS76A Text Retrieval and Mining Lecture 5 Recap: Clustering Hierarchical clustering Agglomerative clustering techniques Evaluation Term vs. document space clustering Multi-lingual docs Feature selection

More information

Nonnegative Matrix Factorization with Toeplitz Penalty

Nonnegative Matrix Factorization with Toeplitz Penalty Journal of Informatics and Mathematical Sciences Vol. 10, Nos. 1 & 2, pp. 201 215, 2018 ISSN 0975-5748 (online); 0974-875X (print) Published by RGN Publications http://dx.doi.org/10.26713/jims.v10i1-2.851

More information

Language Information Processing, Advanced. Topic Models

Language Information Processing, Advanced. Topic Models Language Information Processing, Advanced Topic Models mcuturi@i.kyoto-u.ac.jp Kyoto University - LIP, Adv. - 2011 1 Today s talk Continue exploring the representation of text as histogram of words. Objective:

More information

PV211: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv211

PV211: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv211 PV211: Introduction to Information Retrieval https://www.fi.muni.cz/~sojka/pv211 IIR 18: Latent Semantic Indexing Handout version Petr Sojka, Hinrich Schütze et al. Faculty of Informatics, Masaryk University,

More information

Recent Advances in Bayesian Inference Techniques

Recent Advances in Bayesian Inference Techniques Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian

More information

Using Matrix Decompositions in Formal Concept Analysis

Using Matrix Decompositions in Formal Concept Analysis Using Matrix Decompositions in Formal Concept Analysis Vaclav Snasel 1, Petr Gajdos 1, Hussam M. Dahwa Abdulla 1, Martin Polovincak 1 1 Dept. of Computer Science, Faculty of Electrical Engineering and

More information

Fast Bregman Divergence NMF using Taylor Expansion and Coordinate Descent

Fast Bregman Divergence NMF using Taylor Expansion and Coordinate Descent Fast Bregman Divergence NMF using Taylor Expansion and Coordinate Descent Liangda Li, Guy Lebanon, Haesun Park School of Computational Science and Engineering Georgia Institute of Technology Atlanta, GA

More information

Linear Algebra Methods for Data Mining

Linear Algebra Methods for Data Mining Linear Algebra Methods for Data Mining Saara Hyvönen, Saara.Hyvonen@cs.helsinki.fi Spring 2007 PCA, NMF Linear Algebra Methods for Data Mining, Spring 2007, University of Helsinki Summary: PCA PCA is SVD

More information

CONVOLUTIVE NON-NEGATIVE MATRIX FACTORISATION WITH SPARSENESS CONSTRAINT

CONVOLUTIVE NON-NEGATIVE MATRIX FACTORISATION WITH SPARSENESS CONSTRAINT CONOLUTIE NON-NEGATIE MATRIX FACTORISATION WITH SPARSENESS CONSTRAINT Paul D. O Grady Barak A. Pearlmutter Hamilton Institute National University of Ireland, Maynooth Co. Kildare, Ireland. ABSTRACT Discovering

More information

Lecture 13 : Variational Inference: Mean Field Approximation

Lecture 13 : Variational Inference: Mean Field Approximation 10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 13 : Variational Inference: Mean Field Approximation Lecturer: Willie Neiswanger Scribes: Xupeng Tong, Minxing Liu 1 Problem Setup 1.1

More information

Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology

Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2016 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276,

More information

Data Mining and Matrices

Data Mining and Matrices Data Mining and Matrices 08 Boolean Matrix Factorization Rainer Gemulla, Pauli Miettinen June 13, 2013 Outline 1 Warm-Up 2 What is BMF 3 BMF vs. other three-letter abbreviations 4 Binary matrices, tiles,

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Dan Oneaţă 1 Introduction Probabilistic Latent Semantic Analysis (plsa) is a technique from the category of topic models. Its main goal is to model cooccurrence information

More information

Lecture 8: Clustering & Mixture Models

Lecture 8: Clustering & Mixture Models Lecture 8: Clustering & Mixture Models C4B Machine Learning Hilary 2011 A. Zisserman K-means algorithm GMM and the EM algorithm plsa clustering K-means algorithm K-means algorithm Partition data into K

More information

Problems. Looks for literal term matches. Problems:

Problems. Looks for literal term matches. Problems: Problems Looks for literal term matches erms in queries (esp short ones) don t always capture user s information need well Problems: Synonymy: other words with the same meaning Car and automobile 电脑 vs.

More information

Fast Coordinate Descent methods for Non-Negative Matrix Factorization

Fast Coordinate Descent methods for Non-Negative Matrix Factorization Fast Coordinate Descent methods for Non-Negative Matrix Factorization Inderjit S. Dhillon University of Texas at Austin SIAM Conference on Applied Linear Algebra Valencia, Spain June 19, 2012 Joint work

More information

Matrix Decomposition in Privacy-Preserving Data Mining JUN ZHANG DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF KENTUCKY

Matrix Decomposition in Privacy-Preserving Data Mining JUN ZHANG DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF KENTUCKY Matrix Decomposition in Privacy-Preserving Data Mining JUN ZHANG DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF KENTUCKY OUTLINE Why We Need Matrix Decomposition SVD (Singular Value Decomposition) NMF (Nonnegative

More information

EECS 275 Matrix Computation

EECS 275 Matrix Computation EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 22 1 / 21 Overview

More information

Chapter 6 Randomization Algorithm Theory WS 2012/13 Fabian Kuhn

Chapter 6 Randomization Algorithm Theory WS 2012/13 Fabian Kuhn Chapter 6 Randomization Algorithm Theory WS 2012/13 Fabian Kuhn Randomization Randomized Algorithm: An algorithm that uses (or can use) random coin flips in order to make decisions We will see: randomization

More information

A TWO-LAYER NON-NEGATIVE MATRIX FACTORIZATION MODEL FOR VOCABULARY DISCOVERY. MengSun,HugoVanhamme

A TWO-LAYER NON-NEGATIVE MATRIX FACTORIZATION MODEL FOR VOCABULARY DISCOVERY. MengSun,HugoVanhamme A TWO-LAYER NON-NEGATIVE MATRIX FACTORIZATION MODEL FOR VOCABULARY DISCOVERY MengSun,HugoVanhamme Department of Electrical Engineering-ESAT, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, Bus

More information

Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology

Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology M. Soleymani Fall 2014 Most slides have been adapted from: Profs. Manning, Nayak & Raghavan (CS-276,

More information

CPSC 340: Machine Learning and Data Mining. More PCA Fall 2016

CPSC 340: Machine Learning and Data Mining. More PCA Fall 2016 CPSC 340: Machine Learning and Data Mining More PCA Fall 2016 A2/Midterm: Admin Grades/solutions posted. Midterms can be viewed during office hours. Assignment 4: Due Monday. Extra office hours: Thursdays

More information

Lecture 5 : Projections

Lecture 5 : Projections Lecture 5 : Projections EE227C. Lecturer: Professor Martin Wainwright. Scribe: Alvin Wan Up until now, we have seen convergence rates of unconstrained gradient descent. Now, we consider a constrained minimization

More information

Data Mining Techniques

Data Mining Techniques Data Mining Techniques CS 622 - Section 2 - Spring 27 Pre-final Review Jan-Willem van de Meent Feedback Feedback https://goo.gl/er7eo8 (also posted on Piazza) Also, please fill out your TRACE evaluations!

More information

Latent Semantic Analysis. Hongning Wang

Latent Semantic Analysis. Hongning Wang Latent Semantic Analysis Hongning Wang CS@UVa VS model in practice Document and query are represented by term vectors Terms are not necessarily orthogonal to each other Synonymy: car v.s. automobile Polysemy:

More information

Multi-Task Co-clustering via Nonnegative Matrix Factorization

Multi-Task Co-clustering via Nonnegative Matrix Factorization Multi-Task Co-clustering via Nonnegative Matrix Factorization Saining Xie, Hongtao Lu and Yangcheng He Shanghai Jiao Tong University xiesaining@gmail.com, lu-ht@cs.sjtu.edu.cn, yche.sjtu@gmail.com Abstract

More information

A Purely Geometric Approach to Non-Negative Matrix Factorization

A Purely Geometric Approach to Non-Negative Matrix Factorization A Purely Geometric Approach to Non-Negative Matrix Factorization Christian Bauckhage B-IT, University of Bonn, Bonn, Germany Fraunhofer IAIS, Sankt Augustin, Germany http://mmprec.iais.fraunhofer.de/bauckhage.html

More information

arxiv: v1 [stat.ml] 23 Dec 2015

arxiv: v1 [stat.ml] 23 Dec 2015 k-means Clustering Is Matrix Factorization Christian Bauckhage arxiv:151.07548v1 [stat.ml] 3 Dec 015 B-IT, University of Bonn, Bonn, Germany Fraunhofer IAIS, Sankt Augustin, Germany http://mmprec.iais.fraunhofer.de/bauckhage.html

More information

An Empirical Study on Dimensionality Optimization in Text Mining for Linguistic Knowledge Acquisition

An Empirical Study on Dimensionality Optimization in Text Mining for Linguistic Knowledge Acquisition An Empirical Study on Dimensionality Optimization in Text Mining for Linguistic Knowledge Acquisition Yu-Seop Kim 1, Jeong-Ho Chang 2, and Byoung-Tak Zhang 2 1 Division of Information and Telecommunication

More information

Lecture Notes 10: Matrix Factorization

Lecture Notes 10: Matrix Factorization Optimization-based data analysis Fall 207 Lecture Notes 0: Matrix Factorization Low-rank models. Rank- model Consider the problem of modeling a quantity y[i, j] that depends on two indices i and j. To

More information

Understanding Comments Submitted to FCC on Net Neutrality. Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014

Understanding Comments Submitted to FCC on Net Neutrality. Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014 Understanding Comments Submitted to FCC on Net Neutrality Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014 Abstract We aim to understand and summarize themes in the 1.65 million

More information

Insights from Viewing Ranked Retrieval as Rank Aggregation

Insights from Viewing Ranked Retrieval as Rank Aggregation Insights from Viewing Ranked Retrieval as Rank Aggregation Holger Bast Ingmar Weber Max-Planck-Institut für Informatik Stuhlsatzenhausweg 85 66123 Saarbrücken, Germany {bast, iweber}@mpi-sb.mpg.de Abstract

More information