Scaling Neighbourhood Methods
|
|
- Pamela Horn
- 6 years ago
- Views:
Transcription
1 Quick Recap
2 Scaling Neighbourhood Methods
3 Collaborative Filtering m = #items n = #users Complexity : m * m * n
4 Comparative Scale of Signals ~50 M users ~25 M items Explicit Ratings ~ O(1M) (1 per billion) Purchase ~ O(100M) (100 per billion) Browse ~ O(10B) (10000 per billion)
5 Implicit Signals Used Bought History Browse History Compare History
6 Category-partitioned v/s Category independent
7
8
9 Similarity Metric for boolean matrix Cosine Similarity A. Pair Count (p) - P1 and P2 B. Individual Count (n_i) - P1, P2 individually
10 Employing Map Reduce Calculate 'p' by forming pairs and counting Calculate 'n1' by making P1 as the key Calculate 'n2' by making P2 as the key Took 2 hours on 5 years of data Scales Horizontally
11 Map Reduce 1
12 Map Reduce 2
13 Map Reduce 3
14 Latent Variable Models and Factorization Models Akash Khandelwal, Avijit Saha, Mohit Kumar, Vivek Mehta 1 / 34
15 Outline Introduction Recommender Systems Recap Latent Variable Models Factorization Models Matrix Factorization Singular Value Decomposition BPMF Factorization Machine Conclusion 2 / 34
16 Recommender Systems (RSs) Collaborative Filtering (CF) Neighborhood Based - KNN Model Based - Cluster-based CF and Bayesian classifiers. - Latent variable models such as, LDA, plsa, and matrix factorization (MF). Content Based Knowledge Based Hybrid 3 / 34
17 Latent Variable Models Supplementing a set of observed variables with additional latent, or hidden, variables. Latent variable models are widely used in several domains such as machine learning, statistics, data mining. Reveals hidden structure which explains the data. Latent variable models consider a joint distribution over the hidden and observed variables. Hidden structure is found by calculating the posterior. LDA is an example of latent variable models. 4 / 34
18 Factorization Models One of the widely used latent variable models in the RSs community. preferences of a user are determined by a small number of unobserved latent factors. Matrix Factorization : Each user and item are mapped to a latent factor vector: u i R K v j R K Tensor Factorization : Mapping of each variable of each category type to a K dimensional latent factor vector. Many problem specific factorization models. 5 / 34
19 Focus Factorization Models Matrix Factorization (MF) Factorization Machine (FM) 6 / 34
20 Singular Value Decomposition Singular value decomposition (SVD) is a factorization of a matrix. Formally, the SVD of an R R I J is: R = UΣV, (1) where, U = I I unitary matrix Σ =I J rectangular diagonal matrix V = J J unitary matrix σ i,i of Σ are known as the singular values of R I columns of U and the J columns of V are called the left-singular vectors and right-singular vectors of R, respectively. 7 / 34
21 Dimensionality Reduction Using SVD Let, R R I J Estimate : Apply SVD: R = UΣV, (2) σ 1,1 σ 2,2... ˆR = U V. σk,k 0... (3) 8 / 34
22 Example of SVD (a) (b) (c) (d) 9 / 34
23 MF for RSs Figure 1: Matrix Factorization 10 / 34
24 MF for RSs Consider a user-movie matrix R R I J where the r ij cell represents the rating provided to the j th movie by the i th user. MF decomposes the matrix R into two low-rank matrices U = [u 1, u 2,..., u I ] T R I K and V = [v 1, v 2,..., v J ] T R J K : R UV T. (4) ( ) 2 r ij u T i v j (5) (i,j) Ω 11 / 34
25 MF for RSs SVD with K singular value would been the solution if R is fully observed. However, R is partially observed. Solution: Stochastic gradient descent to rank-1 update: Then iterate this for each rank K. e ij = r ij u T i v j u ik = u ik + ν(e ij v jk λu ik ) (6) v jk = v jk + ν(e ij u ik λv jk ) (7) 12 / 34
26 Problem with SVD approach Learning rate and regularization parameters needs to be tuned manually. Overfitting. Solution: Bayesian Probabilistic Matrix Factorization (BPMF) [1]. 13 / 34
27 BPMF Model The likelihood term of BPMF is as follows: p(r Θ) = N (r ij u T i v j, τ 1 ), (8) (i,j) Ω where u i is the latent factor vector for the i th user, v j is the latent factor vector for the j th item, τ is the model precision, Ω is the set of all observations, and Θ is the set of all the model parameters. 14 / 34
28 Priors Independent priors are placed on all the model parameters in Θ as: I p(u) = N (u i µ u, Λ 1 u ), (9) p(v ) = i=1 J N (v j µ v, Λ 1 v ). (10) j=1 15 / 34
29 Hyperpriors Further place Normal-Wishart priors are placed on all the hyperparameters Θ H = {{µ u, Λ u }, {µ v, Λ v }} as: where p(µ u, Λ u ) = N W (µ u, Λ u µ 0, β 0, W 0, ν 0 ), (11) = N (µ u µ 0, (β 0 Λ u ) 1 )W(Λ u W 0, ν 0 ) p(µ v, Λ v ) = N W (µ v, Λ v µ 0, β 0, W 0, ν 0 ). (12) W(Λ W 0, ν 0 ) = 1 C Λ ν 0 D 1 exp( 1 1 Tr(W 0 2 Λ)) 16 / 34
30 Joint Distributions The joint distribution of the observations and the hidden variables can be written as: p(r, Θ, Θ H Θ 0 ) = p(r Θ)p(U)p(V )p(µ u, Λ u Θ 0 )p(µ v, Λ v Θ 0 ), (13) where Θ 0 = {µ 0, β 0, W 0, ν 0 } 17 / 34
31 Inference Evaluation of the joint distribution in Eq. (13) is intractable. However, all the model parameters are conditionally conjugate. So we Gibbs sampler has closed form updates. Replacing Eq. (8)-(12) in Eq. (13), the sampling distribution of u i can be written as follows: p(u i ) N ( u i µ, (Λ ) 1), (14) Λ = Λ u + τ v j v T j (15) j Ωi µ = (Λ ) 1 Λ u µ u + τ v j r ij (16) j Ωi 18 / 34
32 Results Figure 2: RMSE vs Iterations 19 / 34
33 Factorization Models Matrix factorization [2]. Tensor factorization [3]. Specific models like SVD++ [4], TimeSVD++ [5], FPMC [6], and BPTF [3], etc. have been developed. Several Learning technique like SGD, ALS, variational Bayes, MCMC Gibbs sampling have been developed. 20 / 34
34 Factorization Models Vs Feature Based Techniques Advantages of Factorization Model: Problem: Scalability and Performance. deriving inference techniques for each individual model is a time consuming task and requires considerable expertise. Advantages of Feature Based Techniques: Problem: Generic approach. Can be solved using standard tools like LIBSVM or SVMLight. Can not handle very sparse data. 21 / 34
35 Factorization Machine A Generic framework [7] proposed by Stephen Rendle. Combines advantages of both factorization models and feature based model. Can subsume many state of the art factorization model like SVD++, TimeSVD++, FPMC, PITF, etc. Performs well for sparse data where SVMs fails. 22 / 34
36 Feature Representation of Factorization Machine Example: (U1,M1,G1,2),(U1,M3,G2,5),(U2,M2,G3,4) and (U3,M1,G1,5) Feature x Target y User Movie Genre x y1 x y2 x y3 x y4 U1 U2 U3. M1 M2 M3. G1 G2 G3. 23 / 34
37 Factorization Machine Following is the equation for FM. D D y n = w 0 + w i x ni + D x ni x nj K i=1 i=1 j=i+1 k=1 Assumptions of FM are following: y n x n, θ N (ŷ(x n, θ), α 1 ) y n x n, θ Bernoulli(b(ŷ(x n, θ)) And L2 regularization on θ v ik v jk (17) 24 / 34
38 Mimic MF ŷ = w 0 + w u + w i + v T u v i (18) Feature representation of FM: D = U I x j = δ(j = u j = i) 25 / 34
39 Mimic SVD++ ŷ = w 0 + w u + w i + v T u v i + 1 N Feature representation of FM: D = U I L x j = 1 if j = u j = i = 1 N if j N u = 0 else l N u v T i v l (19) 26 / 34
40 Algorithms of Factorization Machine Stochastic gradient descent is the simplest algorithm to solve FM (SGD-FM). MCMC based Bayesian Factorization Machine gives state-of-the-art performance (MCMC-FM). Alternating least square and adaptive stochastic gradient descent. 27 / 34
41 Comparison of each Leaning Techniques SGD-FM Pros: SGD online algorithm and more scalable. Cons: Costly cross validation of parameters. MCMC-FM Pros: Performance. No cross validation required. 28 / 34
42 SGD Learning Following is the equation for FM. ŷ n = w 0 + Cost function is: D D D w i x ni + x ni x nj i=1 K i=1 j=i+1 k=1 v ik v jk (20) N (y n ŷ n ) 2 + L2 regularization (21) n=1 SGD update equations: v ik = v ik + ν 2 (y n ŷ n )x ni N l=1&l i x nl v nl 2λv ik (22) 29 / 34
43 MCMC-FM Likelihood: Prior: Hyperprior: y n N (y n ŷ n, τ) p(w 0 ) N (w 0 µ 0, σ 0 ) p(w i ) N (w i µ w, σ w ) p(v ik ) N (v ik µ k, σ k ) p(µ w ) N (µ w µ, σ w ν 0 ) p(σ w ) G(σ w α 0, β 0 ) p(µ k ) N (µ k µ, σ k ν 0 ) p(σ k ) G(σ k α 0, β 0 ) 30 / 34
44 Results 31 / 34
45 Datasets Yelp challenge Yelp datasets Users information Social information Location Time Ratings Reviews 32 / 34
46 References R. Salakhutdinov and A. Mnih, Bayesian probabilistic matrix factorization using markov chain monte carlo, in Proc. of ICML, pp , R. Salakhutdinov and A. Mnih, Probabilistic matrix factorization, in Proc. of NIPS, L. Xiong, X. Chen, T. K. Huang, J. Schneider, and J. G. Carbonell, Temporal collaborative filtering with bayesian probabilistic tensor factorization, in Proc. of SDM, Y. Koren, Factorization meets the neighborhood: A multifaceted collaborative filtering model, in Proc. of KDD, pp , Y. Koren, Collaborative filtering with temporal dynamics, in Proc. of KDD, pp , S. Rendle, C. Freudenthaler, and L. Schmidt-Thieme, Factorizing personalized markov chains for next-basket recommendation, in Proc. of WWW, pp , S. Rendle, Factorization machines, in Proc. of ICDM, / 34
47 THANK YOU 34 / 34
Scalable Bayesian Matrix Factorization
Scalable Bayesian Matrix Factorization Avijit Saha,1, Rishabh Misra,2, and Balaraman Ravindran 1 1 Department of CSE, Indian Institute of Technology Madras, India {avijit, ravi}@cse.iitm.ac.in 2 Department
More informationLarge-scale Ordinal Collaborative Filtering
Large-scale Ordinal Collaborative Filtering Ulrich Paquet, Blaise Thomson, and Ole Winther Microsoft Research Cambridge, University of Cambridge, Technical University of Denmark ulripa@microsoft.com,brmt2@cam.ac.uk,owi@imm.dtu.dk
More informationMatrix and Tensor Factorization from a Machine Learning Perspective
Matrix and Tensor Factorization from a Machine Learning Perspective Christoph Freudenthaler Information Systems and Machine Learning Lab, University of Hildesheim Research Seminar, Vienna University of
More informationReview: Probabilistic Matrix Factorization. Probabilistic Matrix Factorization (PMF)
Case Study 4: Collaborative Filtering Review: Probabilistic Matrix Factorization Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox February 2 th, 214 Emily Fox 214 1 Probabilistic
More informationAndriy Mnih and Ruslan Salakhutdinov
MATRIX FACTORIZATION METHODS FOR COLLABORATIVE FILTERING Andriy Mnih and Ruslan Salakhutdinov University of Toronto, Machine Learning Group 1 What is collaborative filtering? The goal of collaborative
More information* Matrix Factorization and Recommendation Systems
Matrix Factorization and Recommendation Systems Originally presented at HLF Workshop on Matrix Factorization with Loren Anderson (University of Minnesota Twin Cities) on 25 th September, 2017 15 th March,
More informationDownloaded 09/30/17 to Redistribution subject to SIAM license or copyright; see
Latent Factor Transition for Dynamic Collaborative Filtering Downloaded 9/3/17 to 4.94.6.67. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php Chenyi Zhang
More informationFactorization Machines with libfm
Factorization Machines with libfm STEFFEN RENDLE, University of Konstanz Factorization approaches provide high accuracy in several important prediction problems, for example, recommender systems. However,
More informationTemporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization
Temporal Collaborative Filtering with Bayesian Probabilistic Tensor Factorization Liang Xiong Xi Chen Tzu-Kuo Huang Jeff Schneider Jaime G. Carbonell Abstract Real-world relational data are seldom stationary,
More informationSCMF: Sparse Covariance Matrix Factorization for Collaborative Filtering
SCMF: Sparse Covariance Matrix Factorization for Collaborative Filtering Jianping Shi Naiyan Wang Yang Xia Dit-Yan Yeung Irwin King Jiaya Jia Department of Computer Science and Engineering, The Chinese
More informationMixed Membership Matrix Factorization
Mixed Membership Matrix Factorization Lester Mackey 1 David Weiss 2 Michael I. Jordan 1 1 University of California, Berkeley 2 University of Pennsylvania International Conference on Machine Learning, 2010
More informationMixed Membership Matrix Factorization
Mixed Membership Matrix Factorization Lester Mackey University of California, Berkeley Collaborators: David Weiss, University of Pennsylvania Michael I. Jordan, University of California, Berkeley 2011
More informationGraphical Models for Collaborative Filtering
Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,
More informationRestricted Boltzmann Machines for Collaborative Filtering
Restricted Boltzmann Machines for Collaborative Filtering Authors: Ruslan Salakhutdinov Andriy Mnih Geoffrey Hinton Benjamin Schwehn Presentation by: Ioan Stanculescu 1 Overview The Netflix prize problem
More informationGenerative Models for Discrete Data
Generative Models for Discrete Data ddebarr@uw.edu 2016-04-21 Agenda Bayesian Concept Learning Beta-Binomial Model Dirichlet-Multinomial Model Naïve Bayes Classifiers Bayesian Concept Learning Numbers
More informationCollaborative Filtering. Radek Pelánek
Collaborative Filtering Radek Pelánek 2017 Notes on Lecture the most technical lecture of the course includes some scary looking math, but typically with intuitive interpretation use of standard machine
More informationBayesian Matrix Factorization with Side Information and Dirichlet Process Mixtures
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (AAAI-10) Bayesian Matrix Factorization with Side Information and Dirichlet Process Mixtures Ian Porteous and Arthur Asuncion
More informationRecommendation Systems
Recommendation Systems Popularity Recommendation Systems Predicting user responses to options Offering news articles based on users interests Offering suggestions on what the user might like to buy/consume
More informationA Modified PMF Model Incorporating Implicit Item Associations
A Modified PMF Model Incorporating Implicit Item Associations Qiang Liu Institute of Artificial Intelligence College of Computer Science Zhejiang University Hangzhou 31007, China Email: 01dtd@gmail.com
More informationAlgorithms for Collaborative Filtering
Algorithms for Collaborative Filtering or How to Get Half Way to Winning $1million from Netflix Todd Lipcon Advisor: Prof. Philip Klein The Real-World Problem E-commerce sites would like to make personalized
More informationCOT: Contextual Operating Tensor for Context-Aware Recommender Systems
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence COT: Contextual Operating Tensor for Context-Aware Recommender Systems Qiang Liu, Shu Wu, Liang Wang Center for Research on Intelligent
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate
More informationLarge-Scale Matrix Factorization with Distributed Stochastic Gradient Descent
Large-Scale Matrix Factorization with Distributed Stochastic Gradient Descent KDD 2011 Rainer Gemulla, Peter J. Haas, Erik Nijkamp and Yannis Sismanis Presenter: Jiawen Yao Dept. CSE, UT Arlington 1 1
More informationDecoupled Collaborative Ranking
Decoupled Collaborative Ranking Jun Hu, Ping Li April 24, 2017 Jun Hu, Ping Li WWW2017 April 24, 2017 1 / 36 Recommender Systems Recommendation system is an information filtering technique, which provides
More information26 : Spectral GMs. Lecturer: Eric P. Xing Scribes: Guillermo A Cidre, Abelino Jimenez G.
10-708: Probabilistic Graphical Models, Spring 2015 26 : Spectral GMs Lecturer: Eric P. Xing Scribes: Guillermo A Cidre, Abelino Jimenez G. 1 Introduction A common task in machine learning is to work with
More informationNetBox: A Probabilistic Method for Analyzing Market Basket Data
NetBox: A Probabilistic Method for Analyzing Market Basket Data José Miguel Hernández-Lobato joint work with Zoubin Gharhamani Department of Engineering, Cambridge University October 22, 2012 J. M. Hernández-Lobato
More informationRelational Stacked Denoising Autoencoder for Tag Recommendation. Hao Wang
Relational Stacked Denoising Autoencoder for Tag Recommendation Hao Wang Dept. of Computer Science and Engineering Hong Kong University of Science and Technology Joint work with Xingjian Shi and Dit-Yan
More informationRating Prediction with Topic Gradient Descent Method for Matrix Factorization in Recommendation
Rating Prediction with Topic Gradient Descent Method for Matrix Factorization in Recommendation Guan-Shen Fang, Sayaka Kamei, Satoshi Fujita Department of Information Engineering Hiroshima University Hiroshima,
More informationAfternoon Meeting on Bayesian Computation 2018 University of Reading
Gabriele Abbati 1, Alessra Tosi 2, Seth Flaxman 3, Michael A Osborne 1 1 University of Oxford, 2 Mind Foundry Ltd, 3 Imperial College London Afternoon Meeting on Bayesian Computation 2018 University of
More informationCollaborative Recommendation with Multiclass Preference Context
Collaborative Recommendation with Multiclass Preference Context Weike Pan and Zhong Ming {panweike,mingz}@szu.edu.cn College of Computer Science and Software Engineering Shenzhen University Pan and Ming
More informationLarge-scale Collaborative Ranking in Near-Linear Time
Large-scale Collaborative Ranking in Near-Linear Time Liwei Wu Depts of Statistics and Computer Science UC Davis KDD 17, Halifax, Canada August 13-17, 2017 Joint work with Cho-Jui Hsieh and James Sharpnack
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate
More informationECS289: Scalable Machine Learning
ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 11, 2016 Paper presentations and final project proposal Send me the names of your group member (2 or 3 students) before October 15 (this Friday)
More informationLarge-scale Collaborative Prediction Using a Nonparametric Random Effects Model
Large-scale Collaborative Prediction Using a Nonparametric Random Effects Model Kai Yu Joint work with John Lafferty and Shenghuo Zhu NEC Laboratories America, Carnegie Mellon University First Prev Page
More informationApproximate Inference using MCMC
Approximate Inference using MCMC 9.520 Class 22 Ruslan Salakhutdinov BCS and CSAIL, MIT 1 Plan 1. Introduction/Notation. 2. Examples of successful Bayesian models. 3. Basic Sampling Algorithms. 4. Markov
More informationSQL-Rank: A Listwise Approach to Collaborative Ranking
SQL-Rank: A Listwise Approach to Collaborative Ranking Liwei Wu Depts of Statistics and Computer Science UC Davis ICML 18, Stockholm, Sweden July 10-15, 2017 Joint work with Cho-Jui Hsieh and James Sharpnack
More informationTopic Modelling and Latent Dirichlet Allocation
Topic Modelling and Latent Dirichlet Allocation Stephen Clark (with thanks to Mark Gales for some of the slides) Lent 2013 Machine Learning for Language Processing: Lecture 7 MPhil in Advanced Computer
More informationLecture 13 : Variational Inference: Mean Field Approximation
10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 13 : Variational Inference: Mean Field Approximation Lecturer: Willie Neiswanger Scribes: Xupeng Tong, Minxing Liu 1 Problem Setup 1.1
More informationarxiv: v2 [cs.ir] 14 May 2018
A Probabilistic Model for the Cold-Start Problem in Rating Prediction using Click Data ThaiBinh Nguyen 1 and Atsuhiro Takasu 1, 1 Department of Informatics, SOKENDAI (The Graduate University for Advanced
More informationA graph contains a set of nodes (vertices) connected by links (edges or arcs)
BOLTZMANN MACHINES Generative Models Graphical Models A graph contains a set of nodes (vertices) connected by links (edges or arcs) In a probabilistic graphical model, each node represents a random variable,
More informationClick-Through Rate prediction: TOP-5 solution for the Avazu contest
Click-Through Rate prediction: TOP-5 solution for the Avazu contest Dmitry Efimov Petrovac, Montenegro June 04, 2015 Outline Provided data Likelihood features FTRL-Proximal Batch algorithm Factorization
More informationRecommender Systems EE448, Big Data Mining, Lecture 10. Weinan Zhang Shanghai Jiao Tong University
2018 EE448, Big Data Mining, Lecture 10 Recommender Systems Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html Content of This Course Overview of
More informationUsing SVD to Recommend Movies
Michael Percy University of California, Santa Cruz Last update: December 12, 2009 Last update: December 12, 2009 1 / Outline 1 Introduction 2 Singular Value Decomposition 3 Experiments 4 Conclusion Last
More informationReplicated Softmax: an Undirected Topic Model. Stephen Turner
Replicated Softmax: an Undirected Topic Model Stephen Turner 1. Introduction 2. Replicated Softmax: A Generative Model of Word Counts 3. Evaluating Replicated Softmax as a Generative Model 4. Experimental
More informationMatrix Factorization Techniques for Recommender Systems
Matrix Factorization Techniques for Recommender Systems By Yehuda Koren Robert Bell Chris Volinsky Presented by Peng Xu Supervised by Prof. Michel Desmarais 1 Contents 1. Introduction 4. A Basic Matrix
More informationThe Origin of Deep Learning. Lili Mou Jan, 2015
The Origin of Deep Learning Lili Mou Jan, 2015 Acknowledgment Most of the materials come from G. E. Hinton s online course. Outline Introduction Preliminary Boltzmann Machines and RBMs Deep Belief Nets
More informationBayesian Image Segmentation Using MRF s Combined with Hierarchical Prior Models
Bayesian Image Segmentation Using MRF s Combined with Hierarchical Prior Models Kohta Aoki 1 and Hiroshi Nagahashi 2 1 Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
More informationContent-based Recommendation
Content-based Recommendation Suthee Chaidaroon June 13, 2016 Contents 1 Introduction 1 1.1 Matrix Factorization......................... 2 2 slda 2 2.1 Model................................. 3 3 flda 3
More informationRecurrent Latent Variable Networks for Session-Based Recommendation
Recurrent Latent Variable Networks for Session-Based Recommendation Panayiotis Christodoulou Cyprus University of Technology paa.christodoulou@edu.cut.ac.cy 27/8/2017 Panayiotis Christodoulou (C.U.T.)
More informationMatrix Factorization Techniques For Recommender Systems. Collaborative Filtering
Matrix Factorization Techniques For Recommender Systems Collaborative Filtering Markus Freitag, Jan-Felix Schwarz 28 April 2011 Agenda 2 1. Paper Backgrounds 2. Latent Factor Models 3. Overfitting & Regularization
More informationMatrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang
Matrix Factorization & Latent Semantic Analysis Review Yize Li, Lanbo Zhang Overview SVD in Latent Semantic Indexing Non-negative Matrix Factorization Probabilistic Latent Semantic Indexing Vector Space
More informationVCMC: Variational Consensus Monte Carlo
VCMC: Variational Consensus Monte Carlo Maxim Rabinovich, Elaine Angelino, Michael I. Jordan Berkeley Vision and Learning Center September 22, 2015 probabilistic models! sky fog bridge water grass object
More informationBayesian Methods for Machine Learning
Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),
More informationPoint-of-Interest Recommendations: Learning Potential Check-ins from Friends
Point-of-Interest Recommendations: Learning Potential Check-ins from Friends Huayu Li, Yong Ge +, Richang Hong, Hengshu Zhu University of North Carolina at Charlotte + University of Arizona Hefei University
More informationLow Rank Matrix Completion Formulation and Algorithm
1 2 Low Rank Matrix Completion and Algorithm Jian Zhang Department of Computer Science, ETH Zurich zhangjianthu@gmail.com March 25, 2014 Movie Rating 1 2 Critic A 5 5 Critic B 6 5 Jian 9 8 Kind Guy B 9
More informationLatent Dirichlet Allocation
Outlines Advanced Artificial Intelligence October 1, 2009 Outlines Part I: Theoretical Background Part II: Application and Results 1 Motive Previous Research Exchangeability 2 Notation and Terminology
More informationCollaborative topic models: motivations cont
Collaborative topic models: motivations cont Two topics: machine learning social network analysis Two people: " boy Two articles: article A! girl article B Preferences: The boy likes A and B --- no problem.
More informationPattern Recognition and Machine Learning
Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability
More informationSum-Product Networks. STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017
Sum-Product Networks STAT946 Deep Learning Guest Lecture by Pascal Poupart University of Waterloo October 17, 2017 Introduction Outline What is a Sum-Product Network? Inference Applications In more depth
More informationarxiv: v2 [cs.lg] 5 May 2015
fastfm: A Library for Factorization Machines Immanuel Bayer University of Konstanz 78457 Konstanz, Germany immanuel.bayer@uni-konstanz.de arxiv:505.0064v [cs.lg] 5 May 05 Editor: Abstract Factorization
More informationProbabilistic Matrix Factorization
Probabilistic Matrix Factorization David M. Blei Columbia University November 25, 2015 1 Dyadic data One important type of modern data is dyadic data. Dyadic data are measurements on pairs. The idea is
More informationLECTURE 15 Markov chain Monte Carlo
LECTURE 15 Markov chain Monte Carlo There are many settings when posterior computation is a challenge in that one does not have a closed form expression for the posterior distribution. Markov chain Monte
More informationMatrix Factorization In Recommender Systems. Yong Zheng, PhDc Center for Web Intelligence, DePaul University, USA March 4, 2015
Matrix Factorization In Recommender Systems Yong Zheng, PhDc Center for Web Intelligence, DePaul University, USA March 4, 2015 Table of Contents Background: Recommender Systems (RS) Evolution of Matrix
More informationLanguage Information Processing, Advanced. Topic Models
Language Information Processing, Advanced Topic Models mcuturi@i.kyoto-u.ac.jp Kyoto University - LIP, Adv. - 2011 1 Today s talk Continue exploring the representation of text as histogram of words. Objective:
More informationRETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS
RETRIEVAL MODELS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Boolean model Vector space model Probabilistic
More informationIntroduction to Machine Learning
Introduction to Machine Learning Logistic Regression Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574
More informationData Science Mastery Program
Data Science Mastery Program Copyright Policy All content included on the Site or third-party platforms as part of the class, such as text, graphics, logos, button icons, images, audio clips, video clips,
More informationSocial Relations Model for Collaborative Filtering
Social Relations Model for Collaborative Filtering Wu-Jun Li and Dit-Yan Yeung Shanghai Key Laboratory of Scalable Computing and Systems Department of Computer Science and Engineering, Shanghai Jiao Tong
More informationLearning Sequence Motif Models Using Expectation Maximization (EM) and Gibbs Sampling
Learning Sequence Motif Models Using Expectation Maximization (EM) and Gibbs Sampling BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 009 Mark Craven craven@biostat.wisc.edu Sequence Motifs what is a sequence
More informationAppendix: Modeling Approach
AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,
More informationBayesian Learning in Undirected Graphical Models
Bayesian Learning in Undirected Graphical Models Zoubin Ghahramani Gatsby Computational Neuroscience Unit University College London, UK http://www.gatsby.ucl.ac.uk/ Work with: Iain Murray and Hyun-Chul
More informationRecommendation Systems
Recommendation Systems Pawan Goyal CSE, IITKGP October 21, 2014 Pawan Goyal (IIT Kharagpur) Recommendation Systems October 21, 2014 1 / 52 Recommendation System? Pawan Goyal (IIT Kharagpur) Recommendation
More informationRecommendation Systems
Recommendation Systems Pawan Goyal CSE, IITKGP October 29-30, 2015 Pawan Goyal (IIT Kharagpur) Recommendation Systems October 29-30, 2015 1 / 61 Recommendation System? Pawan Goyal (IIT Kharagpur) Recommendation
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear
More informationLarge-scale Collaborative Prediction Using a Nonparametric Random Effects Model
Large-scale Collaborative Prediction Using a Nonparametric Random Effects Model Kai Yu John Lafferty Shenghuo Zhu Yihong Gong NEC Laboratories America, Cupertino, CA 95014 USA Carnegie Mellon University,
More informationNCDREC: A Decomposability Inspired Framework for Top-N Recommendation
NCDREC: A Decomposability Inspired Framework for Top-N Recommendation Athanasios N. Nikolakopoulos,2 John D. Garofalakis,2 Computer Engineering and Informatics Department, University of Patras, Greece
More informationIntroduction to Restricted Boltzmann Machines
Introduction to Restricted Boltzmann Machines Ilija Bogunovic and Edo Collins EPFL {ilija.bogunovic,edo.collins}@epfl.ch October 13, 2014 Introduction Ingredients: 1. Probabilistic graphical models (undirected,
More informationDeep Learning Basics Lecture 7: Factor Analysis. Princeton University COS 495 Instructor: Yingyu Liang
Deep Learning Basics Lecture 7: Factor Analysis Princeton University COS 495 Instructor: Yingyu Liang Supervised v.s. Unsupervised Math formulation for supervised learning Given training data x i, y i
More informationMatrix Factorization and Factorization Machines for Recommender Systems
Talk at SDM workshop on Machine Learning Methods on Recommender Systems, May 2, 215 Chih-Jen Lin (National Taiwan Univ.) 1 / 54 Matrix Factorization and Factorization Machines for Recommender Systems Chih-Jen
More informationLearning to Disentangle Factors of Variation with Manifold Learning
Learning to Disentangle Factors of Variation with Manifold Learning Scott Reed Kihyuk Sohn Yuting Zhang Honglak Lee University of Michigan, Department of Electrical Engineering and Computer Science 08
More informationLecture 16 Deep Neural Generative Models
Lecture 16 Deep Neural Generative Models CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago May 22, 2017 Approach so far: We have considered simple models and then constructed
More informationNonnegative Matrix Factorization
Nonnegative Matrix Factorization Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationCollaborative Filtering Applied to Educational Data Mining
Journal of Machine Learning Research (200) Submitted ; Published Collaborative Filtering Applied to Educational Data Mining Andreas Töscher commendo research 8580 Köflach, Austria andreas.toescher@commendo.at
More informationMachine Learning. Probabilistic KNN.
Machine Learning. Mark Girolami girolami@dcs.gla.ac.uk Department of Computing Science University of Glasgow June 21, 2007 p. 1/3 KNN is a remarkably simple algorithm with proven error-rates June 21, 2007
More informationCross-domain recommendation without shared users or items by sharing latent vector distributions
Cross-domain recommendation without shared users or items by sharing latent vector distributions Tomoharu Iwata Koh Takeuchi NTT Communication Science Laboratories Abstract We propose a cross-domain recommendation
More informationA Bayesian Treatment of Social Links in Recommender Systems ; CU-CS
University of Colorado, Boulder CU Scholar Computer Science Technical Reports Computer Science Spring 5--0 A Bayesian Treatment of Social Links in Recommender Systems ; CU-CS-09- Mike Gartrell University
More informationStochastic Proximal Gradient Algorithm
Stochastic Institut Mines-Télécom / Telecom ParisTech / Laboratoire Traitement et Communication de l Information Joint work with: Y. Atchade, Ann Arbor, USA, G. Fort LTCI/Télécom Paristech and the kind
More informationPredicting the Next Location: A Recurrent Model with Spatial and Temporal Contexts
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence AAAI-16 Predicting the Next Location: A Recurrent Model with Spatial and Temporal Contexts Qiang Liu, Shu Wu, Liang Wang, Tieniu
More informationDeep Poisson Factorization Machines: a factor analysis model for mapping behaviors in journalist ecosystem
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
More informationINFINITE MIXTURES OF MULTIVARIATE GAUSSIAN PROCESSES
INFINITE MIXTURES OF MULTIVARIATE GAUSSIAN PROCESSES SHILIANG SUN Department of Computer Science and Technology, East China Normal University 500 Dongchuan Road, Shanghai 20024, China E-MAIL: slsun@cs.ecnu.edu.cn,
More informationVariational inference
Simon Leglaive Télécom ParisTech, CNRS LTCI, Université Paris Saclay November 18, 2016, Télécom ParisTech, Paris, France. Outline Introduction Probabilistic model Problem Log-likelihood decomposition EM
More informationA Matrix Factorization Technique with Trust Propagation for Recommendation in Social Networks
A Matrix Factorization Technique with Trust Propagation for Recommendation in Social Networks ABSTRACT Mohsen Jamali School of Computing Science Simon Fraser University Burnaby, BC, Canada mohsen_jamali@cs.sfu.ca
More informationClick Prediction and Preference Ranking of RSS Feeds
Click Prediction and Preference Ranking of RSS Feeds 1 Introduction December 11, 2009 Steven Wu RSS (Really Simple Syndication) is a family of data formats used to publish frequently updated works. RSS
More informationMixed Membership Matrix Factorization
7 Mixed Membership Matrix Factorization Lester Mackey Computer Science Division, University of California, Berkeley, CA 94720, USA David Weiss Computer and Information Science, University of Pennsylvania,
More informationLatent Dirichlet Allocation (LDA)
Latent Dirichlet Allocation (LDA) A review of topic modeling and customer interactions application 3/11/2015 1 Agenda Agenda Items 1 What is topic modeling? Intro Text Mining & Pre-Processing Natural Language
More informationCollaborative Filtering on Ordinal User Feedback
Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence Collaborative Filtering on Ordinal User Feedback Yehuda Koren Google yehudako@gmail.com Joseph Sill Analytics Consultant
More informationFactorization Models for Context-/Time-Aware Movie Recommendations
Factorization Models for Context-/Time-Aware Movie Recommendations Zeno Gantner Machine Learning Group University of Hildesheim Hildesheim, Germany gantner@ismll.de Steffen Rendle Machine Learning Group
More informationCircle-based Recommendation in Online Social Networks
Circle-based Recommendation in Online Social Networks Xiwang Yang, Harald Steck*, and Yong Liu Polytechnic Institute of NYU * Bell Labs/Netflix 1 Outline q Background & Motivation q Circle-based RS Trust
More informationTechniques for Dimensionality Reduction. PCA and Other Matrix Factorization Methods
Techniques for Dimensionality Reduction PCA and Other Matrix Factorization Methods Outline Principle Compoments Analysis (PCA) Example (Bishop, ch 12) PCA as a mixture model variant With a continuous latent
More informationDefault Priors and Effcient Posterior Computation in Bayesian
Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature
More information