Random Surfing on Multipartite Graphs
Athanasios N. Nikolakopoulos, Antonia Korba and John D. Garofalakis
Department of Computer Engineering and Informatics, University of Patras
IEEE International Conference on Big Data, IEEE BigData 2016, December 07, 2016
Outline
1. Introduction & Motivation
2. Block Teleportation Model: Definition, Algorithm and Theoretical Analysis
3. Experimental Evaluation
4. Conclusions & Future Work
Introduction & Motivation
Revisiting the Random Surfer Model I

PageRank Model

G = αH + (1 − α)E

A primitivity adjustment of the row-normalized adjacency matrix H:
- Damping factor α: has received much attention (generalized damping functions / functional rankings [1], multidamping [5], ...)
- Teleportation matrix E: little has been done towards its generalization [8].
Revisiting the Random Surfer Model II

Problems with Traditional Teleportation
- Treats all nodes in a leveling way
- Restrictive or even counter-intuitive (e.g., for structured graphs)
- Blind to the spectral characteristics of the underlying graph

Overview of Our Approach
- We focus on multipartite graphs
- We modify the traditional teleportation model
- Different teleportation behavior per partite set

[Figure: example tripartite graph with partite sets {u_1, u_2, ...}, {v_1, v_2, ...}, {w_1, w_2}.]
Block Teleportation Model: Definition, Algorithm and Theoretical Analysis
Model Definition

S = ηH + μM,    H = diag(A_G 1)^{-1} A_G

a combination of a sparse factor (H) and a low-rank factor (M). Each partite set is a teleportation block:

[M]_ij = { 1/|M_i|, if v_j ∈ M_i;  0, otherwise }

where M_i is the origin partite set of v_i.

Random Surfing Interpretation — the random surfer:
- with probability η follows the edges of the graph;
- with probability μ = 1 − η jumps to a node belonging to the same partite set he is currently in.
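The construction above can be sketched in a few lines of NumPy. This is an illustrative toy (not the authors' code): the graph, partite sets and parameter values are made up, and real instances would use sparse matrices.

```python
# Sketch: building S = eta*H + mu*M for a small multipartite graph.
import numpy as np

def row_normalize(A):
    """H = diag(A 1)^{-1} A, the row-normalized adjacency matrix."""
    return A / A.sum(axis=1, keepdims=True)

def block_teleportation(blocks, n):
    """[M]_ij = 1/|M_i| if v_j lies in the partite set of v_i, else 0."""
    M = np.zeros((n, n))
    for block in blocks:
        for i in block:
            for j in block:
                M[i, j] = 1.0 / len(block)
    return M

# Toy tripartite graph: nodes 0-1, 2-3 and 4 form the three partite sets
A = np.array([[0, 0, 1, 1, 0],
              [0, 0, 0, 1, 0],
              [1, 0, 0, 0, 1],
              [1, 1, 0, 0, 1],
              [0, 0, 1, 1, 0]], dtype=float)
blocks = [[0, 1], [2, 3], [4]]

eta, mu = 0.85, 0.15                  # eta + mu = 1
H = row_normalize(A)
M = block_teleportation(blocks, 5)
S = eta * H + mu * M
assert np.allclose(S.sum(axis=1), 1.0)  # S is row-stochastic
```

Since each row of H and each row of M sums to one, any convex combination ηH + μM is again stochastic, which is all the random-surfing interpretation needs.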
Decomposability and Time-Scale Dissociation

Theorem (Decomposability). When the value of the teleportation parameter is small enough, the Markov chain corresponding to matrix S is NCD with respect to the partition of the nodes of the initial graph into different connected components:

S = S* + εC,    S* = diag{S(G_1), ..., S(G_L)}

Time-scale dissociation of the dynamics:

S^t = Z_11 + Σ_{I=2}^{L} λ_{1I}^t Z_{1I} + Σ_{I=1}^{L} Σ_{m=2}^{n(I)} λ_{mI}^t Z_{mI}
       (Term A)      (Term B)                    (Term C)

(S*)^t = Σ_{I=1}^{L} Z̃_{1I} + Σ_{I=1}^{L} Σ_{m=2}^{n(I)} λ̃_{mI}^t Z̃_{mI}
           (Term Ã)                  (Term C̃)

Phases: short-term dynamics → short-term equilibrium → long-term dynamics → long-term equilibrium.

Computational implications, in brush strokes: study each connected component in isolation, and then combine the results.
BT-Rank Algorithm and Computational Analysis

Block-Teleportation Rank
Input: H, M ∈ R^{n×n}, ε, scalars η, μ > 0 such that η + μ = 1.
Output: π
1: Let π^(0) be the initial approximation. Set k = 0.
2: Compute π^(k+1) ← π^(k) H,  φ ← μ π^(k) M,  π^(k+1) ← η π^(k+1) + φ.
3: Normalize π^(k+1) and compute r = ||π^(k+1) − π^(k)||_1.
4: If r < ε, quit with π^(k+1); otherwise set k ← k + 1 and go to step 2.

General cost: about log ε / log λ_2(S) iterations, at Θ(nnz(H)) per iteration.

If χ(G) = 2 we can dig a little deeper!

Theorem (Eigenvalue Property). Assuming G is a connected graph for which χ(G) = 2 holds, the spectrum of the stochastic matrix S is such that μ − η ∈ λ(S).

Theorem (Lumpability). The BT-Rank Markov chain that corresponds to a 2-chromatic graph is lumpable w.r.t. A = {A_1, A_2}.

Theorem (Perron Vector). When the BT-Rank Markov chain is lumpable w.r.t. partition A, the left Perron eigenvector of matrix S satisfies π_1^T 1_{A_1} = π_2^T 1_{A_2}.
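The pseudocode above can be sketched as a short power iteration. This is a minimal NumPy version for clarity, not the authors' implementation: the paper's cost analysis assumes a sparse H and a low-rank M, which this dense toy ignores.

```python
# Dense power-iteration sketch of the BT-Rank pseudocode.
import numpy as np

def bt_rank(H, M, eta=0.85, mu=0.15, eps=1e-10, max_iter=1000):
    n = H.shape[0]
    pi = np.full(n, 1.0 / n)              # uniform initial approximation
    for _ in range(max_iter):
        phi = mu * (pi @ M)               # teleportation contribution
        nxt = eta * (pi @ H) + phi        # edge-following contribution
        nxt /= nxt.sum()                  # normalization step
        if np.abs(nxt - pi).sum() < eps:  # r = ||pi^(k+1) - pi^(k)||_1
            return nxt
        pi = nxt
    return pi

# Toy input: complete bipartite graph K_{2,2}, partite sets {0,1}, {2,3}
A = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 0],
              [1, 1, 0, 0]], dtype=float)
H = A / A.sum(axis=1, keepdims=True)
M = np.block([[np.full((2, 2), 0.5), np.zeros((2, 2))],
              [np.zeros((2, 2)), np.full((2, 2), 0.5)]])
pi = bt_rank(H, M)
```

On this fully symmetric toy the stationary vector is uniform; in general only the per-iteration products π H and π M change, which is what makes the Θ(nnz(H)) bound per iteration possible with sparse storage.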
Experimental Evaluation
Computational Experiments

[Figure: number of iterations and running time (sec) of BT-Rank, BT-Rank (NoLump) and PageRank on the Jester, Digg votes, Yahoo!Music, MovieLens20M, TREC and Wikipedia(en) datasets.]
Qualitative Experiments: Ranking-Based Recommendation I

Our Approach
- We model the recommender as a tripartite graph: Users – Movies – Genres
- Personalization through matrix M_i = diag(1 e_i^T, 1 ω_i^T, 1 ϖ_i^T)
  - ω_i: the normalized vector of user u_i's ratings over the set of movies
  - ϖ_i: the normalized vector of his mean ratings per genre
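The block-diagonal personalization matrix can be sketched as follows. Sizes, the active user and all rating data here are made up for illustration; the point is only the diag(1 e_i^T, 1 ω_i^T, 1 ϖ_i^T) structure, where every block is a rank-one stochastic matrix.

```python
# Illustrative construction of the personalized teleportation matrix
# M_i = diag(1 e_i^T, 1 w_i^T, 1 v_i^T) for a users-movies-genres graph.
import numpy as np

n_users, n_movies, n_genres = 3, 4, 2
i = 1                                         # the active user (made up)

e_i = np.zeros(n_users); e_i[i] = 1.0         # indicator of the active user
ratings_i = np.array([4.0, 0.0, 5.0, 3.0])    # user's ratings over movies
w_i = ratings_i / ratings_i.sum()             # normalized rating vector
mean_per_genre = np.array([4.5, 3.0])         # user's mean rating per genre
v_i = mean_per_genre / mean_per_genre.sum()

def rank_one_block(row, size):
    """1 * row^T: every node of the block teleports by the same distribution."""
    return np.ones((size, 1)) @ row[None, :]

n = n_users + n_movies + n_genres
M = np.zeros((n, n))
M[0:3, 0:3] = rank_one_block(e_i, n_users)
M[3:7, 3:7] = rank_one_block(w_i, n_movies)
M[7:9, 7:9] = rank_one_block(v_i, n_genres)
assert np.allclose(M.sum(axis=1), 1.0)        # each block is row-stochastic
```

Because each block is rank one, π M can be formed from three inner products instead of a full matrix-vector product.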
Qualitative Experiments: Ranking-Based Recommendation II

Methodology (MovieLens1M)
- Randomly sample 1.4% of the ratings → probe set P
- Use each item v_j rated with 5 stars by user u_i in P → test set T
- For each item in T, randomly select another 1000 unrated items of the same user
- Form ranked lists by ordering all the 1001 items

Metrics: Recall@N, Normalized Discounted Cumulative Gain (NDCG@N), Mean Reciprocal Rank (MRR)

[Figure: Recall@N and NDCG@N for N = 5 to 20, and MRR, for BT-Rank, Katz, FP, MFA, L and CT.]
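With one relevant item per 1001-item list, the metrics above reduce to simple functions of the hit's rank. A sketch under that protocol (the example rank values are hypothetical):

```python
# Metrics for the one-relevant-item-per-list ranking protocol.
import math

def recall_at_n(rank, n):
    """Single relevant item: hit if it lands in the top-N, else miss."""
    return 1.0 if rank <= n else 0.0

def ndcg_at_n(rank, n):
    """With one relevant item, the ideal DCG is 1, so NDCG is the DCG."""
    return 1.0 / math.log2(rank + 1) if rank <= n else 0.0

def reciprocal_rank(rank):
    return 1.0 / rank

ranks = [1, 3, 12, 700]          # hypothetical positions of the 5-star item
N = 10
recall = sum(recall_at_n(r, N) for r in ranks) / len(ranks)
ndcg = sum(ndcg_at_n(r, N) for r in ranks) / len(ranks)
mrr = sum(reciprocal_rank(r) for r in ranks) / len(ranks)
```

Averaging these per-test-case values over the test set T yields the curves reported in the figure.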
Conclusions & Future Work
Conclusion & Future Work

Synopsis
We proposed a simple alternative teleportation component for random surfing on multipartite graphs that:
- can be handled efficiently;
- entails nice theoretical properties;
- allows for richer modeling.

Future Directions
- Propose a systematic framework for the definition of teleportation models that better match the underlying graphs
- For the Web graph: NCDawareRank (WSDM '13) [8]
References
[1] R. Baeza-Yates, P. Boldi, and C. Castillo. Generic damping functions for propagating importance in link-based ranking. Internet Math., 3(4):445–478, 2006.
[2] P.-J. Courtois. Decomposability: Queueing and Computer System Applications. ACM Monograph Series. Academic Press, 1977.
[3] P. Cremonesi, Y. Koren, and R. Turrin. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the Fourth ACM Conference on Recommender Systems, RecSys '10, pages 39–46. ACM, 2010.
[4] F. Fouss, K. Francoisse, L. Yen, A. Pirotte, and M. Saerens. An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification. Neural Netw., 31:53–72, July 2012.
[5] G. Kollias, E. Gallopoulos, and A. Grama. Surfing the network for ranking by multidamping. IEEE Trans. Knowl. Data Eng., 26(9):2323–2336, 2014.
[6] B. Long, X. Wu, Z. M. Zhang, and P. S. Yu. Unsupervised learning on k-partite graphs. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '06, pages 317–326, New York, NY, USA, 2006. ACM.
[7] A. Nikolakopoulos and J. Garofalakis. NCDREC: A decomposability inspired framework for top-n recommendation. In 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), pages 183–190, Aug 2014.
[8] A. N. Nikolakopoulos and J. D. Garofalakis. NCDawareRank: A novel ranking method that exploits the decomposable structure of the web. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, WSDM '13, pages 143–152. ACM, 2013.
[9] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab, November 1999.
[10] H. A. Simon and A. Ando. Aggregation of variables in dynamic systems. Econometrica, 29(2):111–138, 1961.
Questions?
Appendix: Eigenvalue Theorem (Back)

Theorem (Eigenvalue Property). Assuming G is a connected graph for which χ(G) = 2 holds, the spectrum of the stochastic matrix S is such that μ − η ∈ λ(S).

Proof Sketch. We define a vector

v = [ 1 ... 1 | −1 ... −1 ]^T

with 1s over the nodes of the first partite set and −1s over the nodes of the second. We show that (−1, v) is an eigenpair of matrix H, and (1, v) is an eigenpair of matrix M. Then, using any nonsingular matrix Q = (1 v X), we can perform a similarity transformation

Q^{-1} S Q = Q^{-1} (ηH + μM) Q =
[ 1      0       η y_1^T H X + μ y_1^T M X ]
[ 0    μ − η     η y_2^T H X + μ y_2^T M X ]     (1)
[ 0      0       η Y^T H X + μ Y^T M X    ]

that reveals the desired eigenvalue.
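The eigenvalue property is easy to verify numerically on a toy instance. This check (my own, not from the slides) builds a small connected bipartite graph and confirms that μ − η appears in the spectrum of S:

```python
# Numerical check: for a connected 2-chromatic graph, mu - eta is in spec(S).
import numpy as np

# Connected bipartite toy graph with color classes {0,1} and {2,3}
A = np.array([[0, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [1, 1, 0, 0]], dtype=float)
H = A / A.sum(axis=1, keepdims=True)        # row-normalized adjacency matrix
M = np.block([[np.full((2, 2), 0.5), np.zeros((2, 2))],
              [np.zeros((2, 2)), np.full((2, 2), 0.5)]])

eta, mu = 0.85, 0.15
S = eta * H + mu * M
eigs = np.linalg.eigvals(S)
assert np.any(np.isclose(eigs, mu - eta))   # -0.7 is an eigenvalue of S
```

The vector v = [1, 1, −1, −1]^T from the proof sketch is exactly the eigenvector responsible: Hv = −v and Mv = v, so Sv = (μ − η)v.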
Appendix: Lumpability Theorem (Back)

Theorem (Lumpability of the BT-Rank Chain). The BT-Rank Markov chain that corresponds to a 2-chromatic graph is lumpable.

Proof Sketch. In a 2-chromatic graph every edge connects the two color classes A_1 and A_2, so following an edge always switches class, while teleportation always stays within class. Hence

Pr{i → A_2} = Σ_{j ∈ A_2} S_ij = η for all i ∈ A_1,
Pr{i → A_1} = Σ_{j ∈ A_1} S_ij = η for all i ∈ A_2.

[Example: an 8-node 2-chromatic graph with color classes {u_1, u_2, u_3, w_1, w_2} and {v_1, v_2, v_3}, and its matrix S, whose rows place total mass η on the opposite class and μ on the surfer's own class.]
Appendix: Perron Vector Theorem (Back)

Theorem (Eigenvector Property). When the BT-Rank Markov chain is lumpable with respect to partition A = {A_1, A_2}, the left Perron–Frobenius eigenvector of matrix S satisfies π_1^T 1_{A_1} = π_2^T 1_{A_2}.

Proof Sketch. Writing π^T = (π_1^T, π_2^T) and using π^T S = π^T together with the block structure of S,

(π_1^T, π_2^T) [ μM_11  ηH_12 ; ηH_21  μM_22 ] [ 1_{A_1}  0 ; 0  1_{A_2} ] = (π_1^T 1_{A_1}, π_2^T 1_{A_2})

and since every block row sums to one,

(π_1^T, π_2^T) [ μ1_{A_1}  η1_{A_1} ; η1_{A_2}  μ1_{A_2} ] = (π_1^T 1_{A_1}, π_2^T 1_{A_2})

and the result follows from the solution of the system.
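As with the eigenvalue property, this can be confirmed numerically; the check below (my own toy, not from the slides) computes the left Perron vector of S for a small connected bipartite graph and verifies that the two color classes carry equal total mass:

```python
# Numerical check: the two color classes receive equal total BT-Rank mass.
import numpy as np

A = np.array([[0, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [1, 1, 0, 0]], dtype=float)   # bipartite: {0,1} vs {2,3}
H = A / A.sum(axis=1, keepdims=True)
M = np.block([[np.full((2, 2), 0.5), np.zeros((2, 2))],
              [np.zeros((2, 2)), np.full((2, 2), 0.5)]])
S = 0.85 * H + 0.15 * M

# Left Perron vector: dominant eigenvector of S^T, normalized to sum to 1
w, V = np.linalg.eig(S.T)
pi = np.real(V[:, np.argmax(np.real(w))])
pi /= pi.sum()
assert np.isclose(pi[:2].sum(), pi[2:].sum())  # mass(A_1) == mass(A_2)
```

Equivalently, the lumped 2-state chain has transition matrix [[μ, η], [η, μ]], whose stationary distribution is (1/2, 1/2) regardless of η.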
Appendix: Near Complete Decomposability (Back)

Herbert A. Simon: sparsity, hierarchy, decomposability [10].

Nearly Completely Decomposable Systems
- Interactions: strong within blocks, weak between blocks
- Successful applications in diverse disciplines: economics, cognitive theory, management, biology, ecology, etc.
- Computer science and engineering:
  - Performance analysis (Courtois [2])
  - Web search (NCDawareRank [8])
  - Recommendation and data mining (NCDREC [7])

[Figure: block matrix with dense diagonal blocks A, B, C and ε-small off-diagonal entries.]