* Matrix Factorization and Recommendation Systems

Similar documents
Andriy Mnih and Ruslan Salakhutdinov

Collaborative Filtering. Radek Pelánek

Matrix Factorization Techniques for Recommender Systems

Restricted Boltzmann Machines for Collaborative Filtering

Matrix Factorization Techniques For Recommender Systems. Collaborative Filtering

Scaling Neighbourhood Methods

Collaborative Filtering on Ordinal User Feedback

Review: Probabilistic Matrix Factorization. Probabilistic Matrix Factorization (PMF)

Large-scale Ordinal Collaborative Filtering

A Modified PMF Model Incorporating Implicit Item Associations

Matrix Factorization In Recommender Systems. Yong Zheng, PhDc Center for Web Intelligence, DePaul University, USA March 4, 2015

Collaborative Filtering Applied to Educational Data Mining

arxiv: v2 [cs.ir] 14 May 2018

SCMF: Sparse Covariance Matrix Factorization for Collaborative Filtering

Content-based Recommendation

Matrix Factorization with Content Relationships for Media Personalization

Recommendation Systems

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang

Scalable Bayesian Matrix Factorization

Recommender System for Yelp Dataset CS6220 Data Mining Northeastern University

Generative Models for Discrete Data

Matrix Factorization Techniques for Recommender Systems

Mixture-Rank Matrix Approximation for Collaborative Filtering

Recommender Systems. Dipanjan Das Language Technologies Institute Carnegie Mellon University. 20 November, 2007

Preliminaries. Data Mining. The art of extracting knowledge from large bodies of structured data. Let s put it to use!

Decoupled Collaborative Ranking

CS 175: Project in Artificial Intelligence. Slides 4: Collaborative Filtering

Mixed Membership Matrix Factorization

Recommender Systems EE448, Big Data Mining, Lecture 10. Weinan Zhang Shanghai Jiao Tong University

TopicMF: Simultaneously Exploiting Ratings and Reviews for Recommendation

Recommendation Systems

Matrix Factorization and Collaborative Filtering

Binary Principal Component Analysis in the Netflix Collaborative Filtering Task

Matrix and Tensor Factorization from a Machine Learning Perspective

Algorithms for Collaborative Filtering

Mixed Membership Matrix Factorization

Using SVD to Recommend Movies

Collaborative Filtering Matrix Completion Alternating Least Squares

Introduction to Logistic Regression

A Latent-Feature Plackett-Luce Model for Dyad Ranking Completion

STA 4273H: Statistical Machine Learning

CS249: ADVANCED DATA MINING

Collaborative Filtering

A Bayesian Treatment of Social Links in Recommender Systems ; CU-CS

Missing Data Prediction in Multi-source Time Series with Sensor Network Regularization

Dynamic Poisson Factorization

LLORMA: Local Low-Rank Matrix Approximation

SQL-Rank: A Listwise Approach to Collaborative Ranking

Collaborative Filtering

Factorization models for context-aware recommendations

Graphical Models for Collaborative Filtering

Collaborative Filtering: A Machine Learning Perspective

a Short Introduction

Learning in Probabilistic Graphs exploiting Language-Constrained Patterns

Recurrent Latent Variable Networks for Session-Based Recommendation

Recommendation Systems

Recommender systems, matrix factorization, variable selection and social graph data

Context-aware Ensemble of Multifaceted Factorization Models for Recommendation Prediction in Social Networks

Data Mining Techniques

Data Science Mastery Program

Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation

Deep learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville. - Cambridge, MA ; London, Spis treści

Joint user knowledge and matrix factorization for recommender systems

Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen. Recommendation. Tobias Scheffer

arxiv: v1 [cs.si] 18 Oct 2015

Transfer Learning for Collective Link Prediction in Multiple Heterogenous Domains

Lecture Notes 10: Matrix Factorization

Rating Prediction with Topic Gradient Descent Method for Matrix Factorization in Recommendation

Mixed Membership Matrix Factorization

Matrix Factorization and Factorization Machines for Recommender Systems

Department of Computer Science, Guiyang University, Guiyang , GuiZhou, China

Bounded Matrix Low Rank Approximation

STA 4273H: Statistical Machine Learning

STA 4273H: Sta-s-cal Machine Learning

Impact of Data Characteristics on Recommender Systems Performance

arxiv: v2 [cs.ir] 4 Jun 2018

The Pragmatic Theory solution to the Netflix Grand Prize

A graph contains a set of nodes (vertices) connected by links (edges or arcs)

TFMAP: Optimizing MAP for Top-N Context-aware Recommendation

Condensed Table of Contents for Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control by J. C.

CS425: Algorithms for Web Scale Data

Divide and Transfer: Understanding Latent Factors for Recommendation Tasks

Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization

A Graphical Model Formulation of Collaborative Filtering Neighbourhood Methods with Fast Maximum Entropy Training

STA 4273H: Statistical Machine Learning

Relational Stacked Denoising Autoencoder for Tag Recommendation. Hao Wang

A Comparative Study of Matrix Factorization and Random Walk with Restart in Recommender Systems

A Variance Modeling Framework Based on Variational Autoencoders for Speech Enhancement

A Learning-rate Schedule for Stochastic Gradient Methods to Matrix Factorization

Probabilistic Graphical Models

Downloaded 09/30/17 to Redistribution subject to SIAM license or copyright; see

STA414/2104 Statistical Methods for Machine Learning II

Service Recommendation for Mashup Composition with Implicit Correlation Regularization

Learning to Recommend Point-of-Interest with the Weighted Bayesian Personalized Ranking Method in LBSNs

Generalized Linear Models in Collaborative Filtering

DATA MINING LECTURE 8. Dimensionality Reduction PCA -- SVD

Factorization Machines with libfm

Probabilistic Low-Rank Matrix Completion with Adaptive Spectral Regularization Algorithms

Collaborative Filtering via Ensembles of Matrix Factorizations

Learning From Data Lecture 15 Reflecting on Our Path - Epilogue to Part I

Transcription:

Matrix Factorization and Recommendation Systems Originally presented at HLF Workshop on Matrix Factorization with Loren Anderson (University of Minnesota Twin Cities) on 25 th September, 2017 15 th March, 2018 Datalys

Recommendation is must have in e-commerce 2 Movies, products, etc. Preventing information overload Explicit (rating) vs. implicit (click) feedback

Recommendation = Prediction of rating 3 Item 1 Item 2 Item 3 Item 4 Item 5 Bob 5 3 4 5? User 1 3 1 2 3 3 User 2 4 3 4 3 5 User 3 3 3 1 5 4 User 4 1 5 5 2 1

Recommendation = Prediction of rating 4 Item 1 Item 2 Item 3 Item 4 Item 5 Bob 3 4? User 1 3 3 User 2 3 3 User 3 3 4 4 User 4 1 2 In real systems, matrix is really big and sparse Dataset from Netflix Price: 500,000 users 17,000 movies 100 million (1%) ratings are known

Recommender systems - Classification 5 Recommender Systems Content-based Collaborative filtering Hybrid Neighborhoodbased Model-based State of the art Item-based User-based Rule-based Latent factors Matrix Factorization Neural networks

Matrix factorization Formalization 6 Goal is to map users and items into latent factor f-dimensional space m n ratings matrix R is approximately factorized into an m f matrix P and an n f matrix Q R PQ T Users Items r 11 r 1n r m1 r mn Users Factors p 11 p 1f p m1 p mf Factors Items q 11 q 1n q f1 q fn

Matrix factorization Formalization 7 Goal is to map users and items into latent factor f-dimensional space m n ratings matrix R is approximately factorized into an m f matrix P and an n f matrix Q R PQ T Users Items r 11 r 1n r m1 r mn Users Factors p 11 p 1f p m1 p mf Factors Items q 11 q 1n q f1 q fn Each item i is associated with a vector q i (item factor) q i measures affinity of the item i towards those f factors Each user u is associated with a vector p u (user factor) p u measures interest of the user in these f factors

Matrix factorization Formalization 8 Goal is to map users and items into latent factor f-dimensional space m n ratings matrix R is approximately factorized into an m f matrix P and an n f matrix Q R PQ T Users Items r 11 r 1n r m1 r mn Users Factors p 11 p 1f p m1 p mf Factors Items q 11 q 1n q f1 q fn Each item i is associated with a vector q i (item factor) q i measures affinity of the item i towards those f factors Each user u is associated with a vector p u (user factor) p u measures interest of the user in these f factors Rating r ui of user u on item i can be approximated as a dot product r ui = q i T p u

Matrix factorization Basic model 9 The main challenge is computing of user and item factors Incomplete and sparse rating matrix R (conventional SVD is not applicable) Overfitting when only few ratings are available

Matrix factorization Basic model 10 The main challenge is computing of user and item factors Incomplete and sparse rating matrix R (conventional SVD is not applicable) Overfitting when only few ratings are available Solution 1: amputation of missing values Memory inefficient Introduce bias

Matrix factorization Basic model 11 The main challenge is computing of user and item factors Incomplete and sparse rating matrix R (conventional SVD is not applicable) Overfitting when only few ratings are available Solution 2: modeling only known ratings + minimize error e ui = (r ui r ui ) min (r ui q T i p u ) 2 +λ( q 2 i + p 2 u ) q,p (u,i) R Parameter λ controls the extent of regularization Optimization problem -> learning algorithms Stochastic gradient descent Alternating least squares

Matrix factorization Differences in models 12 Constraints imposed on P and Q Unconstrained MF Orthogonality of the latent vectors (SVD) Non-negativity of the latent vectors (NMF)

Matrix factorization Differences in models 13 Constraints imposed on P and Q Unconstrained MF Orthogonality of the latent vectors (SVD) Non-negativity of the latent vectors (NMF) Objective function Minimize squared (Frobenius) norm Maximizing the likelihood estimation

Matrix factorization Differences in models 14 Constraints imposed on P and Q Unconstrained MF Orthogonality of the latent vectors (SVD) Non-negativity of the latent vectors (NMF) Objective function Minimize squared (Frobenius) norm Maximizing the likelihood estimation Additional information used in modeling of ratings User and item bias (SVD) Implicit feedback (SVD++) Temporal dynamics (TimeSVD++)

Improvement of basic model for real systems

Incorporating user and item bias 16 Real systems usually exhibit some systematical tendencies Some users tend to give higher ratings Some items are widely rated better r ui = μ + b i + b u + q i T p u μ denotes overall average rating b i represents item bias b u represents user bias

Incorporating implicit feedback 17 Real systems must tackle with cold-start problem Users usually provide very few explicit ratings Implicit feedback can provide some insights into user preferences r ui = μ + b i + b u + q i T (p u + R(u) 1/2 σ jεr(u) y j ) R(u) 1/2 stabilizes variance across the range of observed R(u) y j additional item factor reflecting implicit feedback

Incorporating temporal dynamics 18 Real systems must tackle with temporal shifts Item popularity changes over time User baseline rating changes over time User preferences change over time r ui = μ + b i (t ui ) + b u (t ui ) + q T i (p u (t ui ) + R(u) 1/2 jεr(u) y j t ui denotes the time of rating r ui

Performance comparison on Netflix Price dataset 19 Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30 37.

* Advanced models

Advanced models 21 Probabilistic matrix factorization (PMF) Constrained PMF Mnih, Andriy, and Ruslan R. Salakhutdinov. "Probabilistic matrix factorization." Advances in neural information processing systems. 2008. Bayesian PMF using Markov Chain Monte Carlo (MCMC) Salakhutdinov, Ruslan, and Andriy Mnih. "Bayesian probabilistic matrix factorization using Markov chain Monte Carlo." Proceedings of the 25th international conference on Machine learning. ACM, 2008. Local Low-Rank Matrix Approximation (LLORMA) Lee, J., Kim, S., Lebanon, G., & Singer, Y. (2013). Local Low-Rank Matrix Approximation, 28.

Conclusion 22 Matrix factorization = state of the art within collaborative filtering recommenders High scalability Can be applied also at large datasets Possibility to easily integrate various additional aspects of data User and item biases Implicit feedback Temporal dynamics

Literature and sources 23 Aggarwal, Charu C. 2016. Recommender Systems - The Textbook. Communications of the ACM. Vol. 40. Cham: Springer International Publishing. Ricci, Francesco, Lior Rokach, Bracha Shapira, and Paul B. Kantor, eds. 2015. Recommender Systems Handbook. 2nd Edition. Boston, MA: Springer US. Koren, Y., Bell, R., & Volinsky, C. (2009). Matrix factorization techniques for recommender systems. Computer, 42(8), 30 37. Lee, J., Kim, S., Lebanon, G., & Singer, Y. (2013). Local Low-Rank Matrix Approximation, 28. Retrieved from http://theanalysisofdata.com/gl/lee_icml_2013.pdf Lei Guo. 2012. Matrix Factorization Techniques For Recommender Systems, Retrieved from https://www.slideshare.net/studentalei/matrix-factorization-techniques-for-recommender-systems Mnih, Andriy, and Ruslan R. Salakhutdinov. "Probabilistic matrix factorization." Advances in neural information processing systems. 2008. Salakhutdinov, Ruslan, and Andriy Mnih. "Bayesian probabilistic matrix factorization using Markov chain Monte Carlo." Proceedings of the 25th international conference on Machine learning. ACM, 2008.