Point-of-Interest Recommendations: Learning Potential Check-ins from Friends
  Huayu Li, Yong Ge+, Richang Hong, Hengshu Zhu
  University of North Carolina at Charlotte; +University of Arizona; Hefei University of Technology; Baidu Research-Big Data Lab
Outline
  Introduction
  Research Problem
  Research Challenges
  Related Work
  Methodologies
  Experiments
Introduction
  Users, Mobile Devices, Location-based Social Network (LBSN) Services
Introduction: Information Overload
  Foursquare: 65 million venues
  Facebook: 16 million local businesses
  Yelp: 2.1 million claimed businesses
  In a new region, which one should a user choose?
  A location recommender system is very important!
Research Problem
  Given a set of users and the locations they have visited before, the objective is to recommend to an individual user the locations that she might be interested in visiting.
  [Figure legend: visited, recommended]
Research Challenges
  Complex Decision Making Process
    Social Network Influence
    Geographical Influence
  Data Sparsity Issue
    Each user visits only a limited number of locations.
    For a new user or a new location, we do not have check-in information.
  Implicit Feedback Issue
    Only check-in frequency is available, without explicit ratings.
    We do not know a user's explicit preference for locations.
Related Work
  Modeling social network influence
    Social regularization constraint (WSDM '11)
    Social correlations (CIKM '12, IJCAI '13, ICDM '15)
    User-based collaborative filtering (SIGIR '11)
  Modeling geographical influence
    Incorporating geographical distance (KDD '11, SIGIR '11, AAAI '12, SIGSPATIAL '13, KDD '14, ICDM '15)
    Incorporating activity area (KDD '14)
    Incorporating nearest neighbors (CIKM '14)
Methods: Framework
  Learn potential locations from friends
  Learn user's preference for locations
Definition of Friends
  Social Friends $F_i^s$: the users who are socially connected with the target user i in LBSNs.
  Location Friends $F_i^l$: the users who check in at the same locations as the target user i.
  Neighboring Friends $F_i^n$: the users who live physically closest to the target user i.
  All friends: $F_i = F_i^s \cup F_i^l \cup F_i^n$
  [Figure: target user $u_i$, friends $f_1$ to $f_6$, and locations $l_1$ to $l_5$]
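The three friend definitions above can be sketched as a set union. This is a minimal illustration, not the authors' implementation: the dictionaries `social`, `checkins`, and `home`, the parameter `n_neighbors`, and the use of plain Euclidean distance between home coordinates are all assumptions made for the example.

```python
import math

def location_friends(target, checkins):
    """Users who share at least one checked-in location with the target."""
    mine = checkins[target]
    return {u for u, locs in checkins.items() if u != target and locs & mine}

def neighboring_friends(target, home, n_neighbors):
    """The n_neighbors users whose home is closest to the target's home."""
    others = sorted((u for u in home if u != target),
                    key=lambda u: math.dist(home[target], home[u]))
    return set(others[:n_neighbors])

def all_friends(target, social, checkins, home, n_neighbors=1):
    """Union of social, location, and neighboring friends."""
    return (set(social.get(target, ()))
            | location_friends(target, checkins)
            | neighboring_friends(target, home, n_neighbors))

# Toy data (hypothetical): f1 is a social friend, f2 shares location l2,
# f3 lives closest to the target, f4 is unrelated.
social = {"u": {"f1"}}
checkins = {"u": {"l1", "l2"}, "f1": {"l3"}, "f2": {"l2", "l4"},
            "f3": {"l5"}, "f4": {"l6"}}
home = {"u": (0.0, 0.0), "f1": (5.0, 5.0), "f2": (9.0, 9.0),
        "f3": (0.1, 0.0), "f4": (7.0, 1.0)}
friends = all_friends("u", social, checkins, home, n_neighbors=1)
```

With this toy data, each of the three definitions contributes exactly one friend and the unrelated user is left out.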
Methods: Learning Potential Locations
  Problem definition: for the target user i, given the set of locations that her friends have checked in at before but she has never visited, the problem is to find the top potential locations that she might be interested in.
Methods: Learning Potential Locations
  For user $u_i$ and a candidate location $l_j$, estimate the potential check-in probability $P_{ij}^{pot}$ with one of two methods:
    Linear Aggregation
    Random Walk
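Methods: Linear Aggregation
  Probability $P_{ij}^{pot}$ that user i visits a location j:
    $P_{ij}^{pot} \propto \max_{f \in F_i} \{\mathrm{sim}(i, f; j)\}$
    $\mathrm{sim}(i, f; j) = \zeta\,\mathrm{sim}(u_i, f) + (1 - \zeta)\,p_{ij}^{G}$
  $\mathrm{sim}(u_i, f)$: similarity of user interest
  $p_{ij}^{G}$: similarity of geo-location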
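A minimal sketch of the linear aggregation score, under stated assumptions: the similarity of user interest and the geo-location term are supplied as precomputed dictionaries (`interest_sim`, `geo_prob`, both hypothetical names), only friends who actually visited the candidate contribute to the maximum, and the result is an unnormalized score rather than a calibrated probability.

```python
def potential_score(user, loc, friends, visited, interest_sim, geo_prob, zeta=0.5):
    """Best friend-based score for a candidate location.

    Each friend who visited `loc` contributes
    zeta * interest similarity + (1 - zeta) * geo term; the maximum is kept.
    """
    scores = [
        zeta * interest_sim[(user, f)] + (1.0 - zeta) * geo_prob[(user, loc)]
        for f in friends
        if loc in visited[f]  # assumption: only friends who visited loc count
    ]
    return max(scores, default=0.0)

# Toy data (hypothetical values)
friends = ["f1", "f2"]
visited = {"f1": {"a", "b"}, "f2": {"b"}}
interest_sim = {("u", "f1"): 0.2, ("u", "f2"): 0.8}
geo_prob = {("u", "a"): 0.5, ("u", "b"): 0.1}

candidates = ["a", "b"]
scores = {loc: potential_score("u", loc, friends, visited, interest_sim, geo_prob)
          for loc in candidates}
ranked = sorted(candidates, key=scores.get, reverse=True)
```

Here location "b" ranks first because the highly similar friend f2 visited it, even though "a" is geographically more favorable.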
Methods: Random Walk
  Nodes: users and locations
  Links: user-user, user-location, location-location
  $y = (1 - \beta) A y + \frac{\beta}{|M_i^o \cup M_i^f| + |F_i| + 1}\, x$
  A: transition matrix; x: restart nodes
  $P_{ij}^{pot}$ is the steady-state probability corresponding to location j.
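The random walk with restart can be sketched with plain power iteration. This is an illustrative sketch only: the adjacency-list graph, the uniform transition probabilities, the uniform restart distribution over the restart nodes, and the values of `beta` and `iters` are assumptions, not details taken from the slides.

```python
def random_walk_with_restart(adj, restart, beta=0.15, iters=200):
    """Iterate y = (1 - beta) * A y + beta * x until (approximately) steady.

    adj maps each node to its list of neighbors (every node needs at least
    one); A spreads a node's mass uniformly over its neighbors.
    """
    nodes = list(adj)
    # Assumption: restart mass is spread uniformly over the restart nodes.
    x = {n: (1.0 / len(restart) if n in restart else 0.0) for n in nodes}
    y = dict(x)
    for _ in range(iters):
        nxt = {n: beta * x[n] for n in nodes}
        for src in nodes:
            share = (1.0 - beta) * y[src] / len(adj[src])
            for dst in adj[src]:
                nxt[dst] += share
        y = nxt
    return y

# Toy graph (hypothetical): user u, friends f1 and f2, locations l1..l3.
# l2 was visited by both friends, l3 only by f2.
adj = {
    "u": ["f1", "f2", "l1"],
    "f1": ["u", "l2"],
    "f2": ["u", "l2", "l3"],
    "l1": ["u"],
    "l2": ["f1", "f2"],
    "l3": ["f2"],
}
y = random_walk_with_restart(adj, restart={"u", "f1", "f2", "l1"})
```

In this toy graph the candidate visited by both friends (l2) ends up with a higher steady-state probability than the one visited by a single friend (l3).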
Methods: Learning Potential Locations
  Three types of locations for each user: Observed Locations, Potential Locations, Other Unobserved Locations
Methods: Framework
  Learn potential locations from friends
  Learn user's preference for locations
Recommendation Models
  The preference $\hat{p}_{ij}$ of user i for location j:
    $\hat{p}_{ij} = (q_{i c_j} + \varepsilon)\, u_i^{T} v_j$
  $q_{i c_j}$: user's preference for category
  $\varepsilon$: tuning parameter
  $u_i^{T} v_j$: user's typical preference for location
  Matrices: P (users' preference for locations), Q (category feature matrix, used as Q + $\varepsilon$), U (user latent matrix), V (location latent matrix)
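A small sketch of how the prediction combines the category preference with the latent factors. The container names (`q_i`, `category_of`, `V`) and the numeric values are hypothetical; only the formula itself comes from the slide.

```python
def predict_preference(q_i, category_of, u_i, v_j, loc, eps=0.1):
    """(q_{i,c_j} + eps) * u_i^T v_j for one user and one location."""
    latent = sum(a * b for a, b in zip(u_i, v_j))
    return (q_i[category_of[loc]] + eps) * latent

# Toy data (hypothetical): two locations with identical latent vectors
# but different categories.
q_i = {"food": 0.9, "museum": 0.0}   # user's preference per category
category_of = {"j1": "food", "j2": "museum"}
u_i = [1.0, 2.0]
V = {"j1": [0.5, 0.25], "j2": [0.5, 0.25]}

p_j1 = predict_preference(q_i, category_of, u_i, V["j1"], "j1")
p_j2 = predict_preference(q_i, category_of, u_i, V["j2"], "j2")
```

Because of the additive `eps`, a location in a category the user has never shown interest in still receives a nonzero score driven by the latent term.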
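Recommendation Models
  Loss function of general form:
    $\arg\min_{U,V,Q} \sum_i E_i(p_{ij}, p_{ik}, p_{ih}, \hat{p}_{ij}, \hat{p}_{ik}, \hat{p}_{ih}) + \Theta(U, V, Q)$
    with $j \in M_i^o$ (observed locations), $k \in M_i^p$ (potential locations), $h \in M_i^u$ (other unobserved locations)
  $\hat{p}_{ij}, \hat{p}_{ik}, \hat{p}_{ih}$: estimated values
  Regularization term: $\Theta(U, V, Q) = \frac{\lambda_u}{2}\|U\|_2^2 + \frac{\lambda_v}{2}\|V\|_2^2 + \frac{\lambda_q}{2}\|Q\|_2^2$
  Two instances: Square Error based Model, Ranking Error based Model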
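Square Error based Model
  The user's preference for a location is defined as:
    $p_{ij} = \begin{cases} 1 & \text{if } j \in M_i^o \text{ (observed locations)} \\ \alpha & \text{if } j \in M_i^p \text{ (potential locations)} \\ 0 & \text{otherwise (other unobserved locations)} \end{cases}$
  Squared error loss function:
    $E_i = \sum_{j=1}^{M} w_{ij}\,(p_{ij} - \hat{p}_{ij})^2$
  Weight matrix:
    $w_{ij} = \begin{cases} 1 + \gamma\, r_{ij} & \text{if } j \in M_i^o \\ 1 & \text{otherwise} \end{cases}$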
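The target values and weights above can be checked with a short sketch. It assumes that $r_{ij}$ is the check-in frequency of user i at location j (the slide does not define it), and the values of `alpha`, `gamma`, and the toy predictions are hypothetical.

```python
def target(loc, observed, potential, alpha):
    """p_ij: 1 for observed, alpha for potential, 0 for other locations."""
    if loc in observed:
        return 1.0
    if loc in potential:
        return alpha
    return 0.0

def weight(loc, observed, freq, gamma):
    """w_ij: 1 + gamma * r_ij for observed locations, 1 otherwise."""
    # Assumption: r_ij is the user's check-in frequency at loc.
    return 1.0 + gamma * freq[loc] if loc in observed else 1.0

def squared_error(locations, observed, potential, freq, pred, alpha, gamma):
    """E_i = sum_j w_ij * (p_ij - predicted p_ij)^2 for one user."""
    return sum(
        weight(loc, observed, freq, gamma)
        * (target(loc, observed, potential, alpha) - pred[loc]) ** 2
        for loc in locations
    )

# Toy data (hypothetical): one location of each type, all predicted at 0.5.
locations = ["a", "b", "c"]
observed, potential = {"a"}, {"b"}
freq = {"a": 3}
pred = {"a": 0.5, "b": 0.5, "c": 0.5}
loss = squared_error(locations, observed, potential, freq, pred,
                     alpha=0.5, gamma=2.0)
```

The observed location dominates the loss here: its weight is 1 + 2 * 3 = 7, so the same prediction error costs seven times more than on the unobserved location.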
Square Error based Model
  Squared error based objective function:
    $L = \min_{U,V,Q} \sum_{i=1}^{N} \sum_{j=1}^{M} w_{ij}\,(p_{ij} - \hat{p}_{ij})^2 + \Theta(U, V, Q)$
  Optimization: Alternating Least Squares (initialization, then alternating updates)
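As a worked step, one alternating update can be derived in closed form. The derivation below is a sketch under assumptions: V and Q are held fixed, the prediction is $\hat{p}_{ij} = (q_{i c_j} + \varepsilon)\,u_i^{T} v_j$, the regularizer on $u_i$ is $\frac{\lambda_u}{2}\|u_i\|_2^2$, and the shorthand $a_{ij}$ is introduced here only for readability. The slides do not show the authors' exact update rules.

```latex
% Assumption: a_{ij} = q_{i c_j} + \varepsilon, with V and Q held fixed.
\begin{aligned}
L(u_i) &= \sum_{j=1}^{M} w_{ij}\,\bigl(p_{ij} - a_{ij}\,u_i^{T} v_j\bigr)^2
          + \frac{\lambda_u}{2}\,\|u_i\|_2^2 \\
\frac{\partial L}{\partial u_i}
       &= -2 \sum_{j=1}^{M} w_{ij}\,a_{ij}\,\bigl(p_{ij} - a_{ij}\,u_i^{T} v_j\bigr)\,v_j
          + \lambda_u\,u_i = 0 \\
u_i    &= \Bigl(\sum_{j=1}^{M} w_{ij}\,a_{ij}^{2}\,v_j v_j^{T}
          + \frac{\lambda_u}{2}\,I\Bigr)^{-1}
          \sum_{j=1}^{M} w_{ij}\,a_{ij}\,p_{ij}\,v_j
\end{aligned}
```

The updates for $v_j$ and for the entries of Q follow the same pattern, each solving a regularized least-squares problem while the other two factors are fixed.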
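Ranking Error based Model
  Model the ranking order among a user's preferences for the three types of locations (observed above potential, potential above other unobserved):
    $\hat{p}_{ij} > \hat{p}_{ik}$ and $\hat{p}_{ik} > \hat{p}_{ih}$, for $j \in M_i^o$, $k \in M_i^p$, $h \in M_i^u$
  Ranking error loss function, using the logistic function $\sigma$ to model the ranking order:
    $E_i = -\sum_{j \in M_i^o} \sum_{k \in M_i^p} \ln \sigma(\hat{p}_{ij} - \hat{p}_{ik}) - \sum_{k \in M_i^p} \sum_{h \in M_i^u} \ln \sigma(\hat{p}_{ik} - \hat{p}_{ih})$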
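A minimal sketch of the ranking error for one user, written as a negative log-likelihood over the two kinds of pairs. The predicted scores are passed in as a plain dictionary, which is an assumption made for the example; the toy values are hypothetical.

```python
import math

def log_sigmoid(x):
    """Numerically stable ln(sigma(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def ranking_loss(pred, observed, potential, unobserved):
    """Penalize observed <= potential and potential <= unobserved pairs."""
    loss = 0.0
    for j in observed:
        for k in potential:
            loss -= log_sigmoid(pred[j] - pred[k])
    for k in potential:
        for h in unobserved:
            loss -= log_sigmoid(pred[k] - pred[h])
    return loss

# One location of each type, scored three different ways (hypothetical).
sets = (["j"], ["k"], ["h"])
good = ranking_loss({"j": 2.0, "k": 1.0, "h": 0.0}, *sets)  # desired order
tie = ranking_loss({"j": 0.0, "k": 0.0, "h": 0.0}, *sets)   # no preference
bad = ranking_loss({"j": 0.0, "k": 1.0, "h": 2.0}, *sets)   # reversed order
```

Scores that respect the order observed > potential > unobserved give the smallest loss, ties cost ln 2 per pair, and a reversed order is penalized most.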
Ranking Error based Model
  Ranking error based objective function
  Optimization: Stochastic Gradient Descent with Bootstrap Sampling (initialization, sampling, update)
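The sampling and update steps can be sketched for a single user as follows. This is a simplified illustration, not the authors' algorithm: the category factor $(q_{i c_j} + \varepsilon)$ is dropped so that the score is just $u^{T} v$, triples are drawn uniformly with replacement, and the learning rate, regularization weight, initial vectors, and step count are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def ranking_loss(u, V, observed, potential, unobserved):
    loss = 0.0
    for k in potential:
        for j in observed:
            loss -= math.log(sigmoid(dot(u, V[j]) - dot(u, V[k])))
        for h in unobserved:
            loss -= math.log(sigmoid(dot(u, V[k]) - dot(u, V[h])))
    return loss

def sgd_step(u, V, j, k, h, lr=0.05, reg=0.01):
    """One gradient step on -ln s(x_jk) - ln s(x_kh) plus L2 regularization."""
    g_jk = 1.0 - sigmoid(dot(u, V[j]) - dot(u, V[k]))
    g_kh = 1.0 - sigmoid(dot(u, V[k]) - dot(u, V[h]))
    # Copy current values so every update uses the pre-step parameters.
    u0, vj, vk, vh = u[:], V[j][:], V[k][:], V[h][:]
    for d in range(len(u)):
        u[d] += lr * (g_jk * (vj[d] - vk[d]) + g_kh * (vk[d] - vh[d]) - reg * u0[d])
        V[j][d] += lr * (g_jk * u0[d] - reg * vj[d])
        V[k][d] += lr * ((g_kh - g_jk) * u0[d] - reg * vk[d])
        V[h][d] += lr * (-g_kh * u0[d] - reg * vh[d])

# Toy setup (hypothetical): small fixed initial vectors for reproducibility.
observed, potential, unobserved = ["j1", "j2"], ["k1"], ["h1", "h2"]
u = [0.1, 0.1]
V = {"j1": [0.1, 0.0], "j2": [0.0, 0.1], "k1": [0.05, 0.05],
     "h1": [0.1, 0.1], "h2": [0.0, 0.0]}
rng = random.Random(0)

loss_before = ranking_loss(u, V, observed, potential, unobserved)
for _ in range(2000):
    # Bootstrap sampling: draw one triple uniformly, with replacement.
    j, k, h = rng.choice(observed), rng.choice(potential), rng.choice(unobserved)
    sgd_step(u, V, j, k, h)
loss_after = ranking_loss(u, V, observed, potential, unobserved)
```

Sampling triples instead of enumerating all observed, potential, and unobserved combinations keeps each update cheap, which is what makes this style of optimization practical when the unobserved set is large.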
Incorporating Geographical Influence
  Check-in probability is refined by a power-law function of the distance between the user's home position and a location:
    $\hat{p}_{ij} \rightarrow p_{ij}^{G} \propto \sigma(\hat{p}_{ij}) \cdot \mathrm{powerlaw}(d(i, j))$
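A minimal sketch of the geographical refinement. The slide does not give the functional form of the power law, so the shifted form `scale * (1 + d) ** (-exponent)` and its parameter values are assumptions chosen to avoid a singularity at zero distance; the result is an unnormalized score.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def power_law(distance_km, scale=1.0, exponent=1.0):
    """Hypothetical power-law decay; the +1 shift keeps d = 0 finite."""
    return scale * (1.0 + distance_km) ** (-exponent)

def geo_refined_score(pref, distance_km, scale=1.0, exponent=1.0):
    """sigma(predicted preference) * powerlaw(distance from home)."""
    return sigmoid(pref) * power_law(distance_km, scale, exponent)

at_home = geo_refined_score(0.0, 0.0)
near = geo_refined_score(1.0, 1.0)
far = geo_refined_score(1.0, 9.0)
```

Two locations with the same latent preference are separated purely by distance: the one 1 km from home scores five times higher than the one 9 km away under these toy parameters.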
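Recommendation Strategies
  Standard Recommendation
  New User Recommendation (target user i):
    $\hat{p}_{ij} = (q_{i c_j} + \varepsilon)\, u_i^{T} v_j$
  New Location Recommendation (new location j):
    $\hat{p}_{ij} \rightarrow p_{ij}^{G} \propto \sigma\!\left( \frac{\sum_{l \in \psi_j} \mathrm{Sim}^{G}(j, l)\, \hat{p}_{il}}{\sum_{l \in \psi_j} \mathrm{Sim}^{G}(j, l)} \right)$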
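The new-location strategy can be sketched as a similarity-weighted average of the user's predicted preferences for known locations, passed through the logistic function. It assumes that $\psi_j$ is a set of known locations near the new location j and that the geographical similarities are precomputed; both the reading of $\psi_j$ and the toy values are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def new_location_score(neighbors, geo_sim, pref):
    """sigma of the Sim^G-weighted mean of preferences for neighbor locations."""
    num = sum(geo_sim[l] * pref[l] for l in neighbors)
    den = sum(geo_sim[l] for l in neighbors)
    return sigmoid(num / den)

# Toy data (hypothetical): the new location has two known neighbors.
neighbors = ["a", "b"]
geo_sim = {"a": 3.0, "b": 1.0}   # similarity of each neighbor to the new location
pref = {"a": 2.0, "b": -2.0}     # user's predicted preference for each neighbor
score = new_location_score(neighbors, geo_sim, pref)
```

The closer neighbor "a" dominates the average, so the new location inherits a positive score even though the user dislikes the more distant neighbor "b".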
Experiments
  Dataset: Gowalla
  Test methodology: 80% of the data is selected for training and the remaining 20% for testing, according to timestamp.
  Evaluation metrics: top-K recommendation accuracy (Precision@K and Recall@K)
  Statistics of the data set:
    #User       #Location   #Check-in   Sparsity
    52,216      98,351      2,577,336   0.0399%
  New Location Rec: #New Location 78,881; #Test 568,937
  New User Rec: #New User 9,326; #Test 79,153
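The test methodology and metrics can be sketched as follows. The slide does not say whether the 80/20 split is taken globally or per user, so the global chronological split here is an assumption, as are the tuple layout `(user, location, timestamp)` and the toy data. The metric functions score a single user; averaging over users is left out.

```python
def split_by_time(checkins, train_fraction=0.8):
    """Earliest train_fraction of check-ins for training, the rest for testing."""
    ordered = sorted(checkins, key=lambda c: c[2])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

def precision_recall_at_k(ranked, relevant, k):
    """Precision@K and Recall@K for one user's ranked recommendation list."""
    hits = sum(1 for loc in ranked[:k] if loc in relevant)
    precision = hits / k
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Toy data (hypothetical): (user, location, timestamp)
checkins = [("u", "a", 5), ("u", "b", 1), ("v", "a", 3),
            ("v", "c", 2), ("u", "c", 4)]
train, test = split_by_time(checkins)

ranked = ["a", "b", "c", "d"]   # recommendations, best first
relevant = {"b", "d", "e"}      # locations visited in the test period
precision, recall = precision_recall_at_k(ranked, relevant, k=2)
```

With K = 2, one of the two recommended locations is relevant, giving a precision of 0.5, and one of the three relevant locations is recovered, giving a recall of one third.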
Exp.: Standard Recommendation
  [Figure: Precision@K and Recall@K]
  Modeling unobserved check-ins can improve recommendation accuracy!
Exp.: Standard Recommendation
  [Figure: Precision@K and Recall@K]
  Modeling potential check-ins can benefit recommendation!
Exp.: New User Recommendation
  [Figure: Precision@K and Recall@K]
  Modeling potential check-ins can solve the user cold-start issue!
Exp.: New Location Recommendation
  [Figure: performance comparison for new location recommendation in terms of Precision@K and Recall@K]
  Modeling potential check-ins can solve the location cold-start issue!
Conclusion
  Empirically analyze the correlations between users and their three types of friends using real-world data
  Learn, for each user, a set of locations that her friends have checked in at before and that she is most likely to be interested in
  Develop matrix factorization based models via different error loss functions with the learned potential check-ins, and propose two scalable optimization methods
  Design three different recommendation strategies
Thank You