Random Surfing on Multipartite Graphs

Size: px
Start display at page:

Download "Random Surfing on Multipartite Graphs"

Transcription

1 Random Surfing on Multipartite Graphs Athanasios N. Nikolakopoulos, Antonia Korba and John D. Garofalakis Department of Computer Engineering and Informatics, University of Patras December 07, 2016 IEEE International Conference on Big Data, IEEE BigData 2016

2 Outline 1. Introduction & Motivation 2. Block Teleportation Model: Definition, Algorithm and Theoretical Analysis. Experimental Evaluation 4. Conclusions & Future Work 1

3 Introduction & Motivation

4 Revisiting the Random Surfer Model I PageRank Model G = αh + (1 α)e Primitivity Adjustment of the Row-Normalized Adjacency Matrix H: Damping Factor α Has received much attention (Generalized Damping Functions (Functional Rankings) [1], Multidamping [5],...) Teleportation matrix E Little have been done towards its generalization [8]. 2

5 Revisiting the Random Surfer Model II Problems With Traditional Teleportation Treats nodes in a leveling way Restrictive or even counter-intuitive (eg. structured graphs) Blind to the spectral characteristics of the underlying graph Overview of Our Approach We focus on Multipartite Graphs We modify the traditional teleportation model Different Teleportation behavior per partite set. u 1 u 2 u v 1 v 2 v w 1 w 2

6 Block Teleportation Model: Definition, Algorithm and Theoretical Analysis

7 Model Definition S = ηh + M H diag(a G 1) 1 A G, Each partite set is a Teleportation Block! [M] ij M = { 1 M i, if v j M i, 0, otherwise, where M i the origin partite set of v i. R R }{{} Sparse and Low-Rank Factors Random Surfing Interpretation The Random Surfer: With probability η follows the edges of the graph With probability 1 η he jumps to a node belonging to the same partite set he is currently in. 4

8 Decomposability and Time-Scale Dissociation Theorem (Decomposability) u1 u1 When the value of the teleportation parameter is small enough, the Markov chain corresponding to matrix S is NCD with respect to the partition of the nodes of the initial graph, into different connected components. u2 u u4 v1 v2 v u2 u u4 v1 v2 v S = S + εc, S diag{s(g 1 ),..., S(G L )} u5 u5 n(i) L L S t = Z 11 + λ t 1 Z 1I + λ t }{{} I m Z mi, I I=2 I=1 m=2 Term A }{{}}{{} Term B Term C S t = n(i) L L Z 1I + ( λ) t m Z mi. I I=1 I=1 m=2 }{{}}{{} Term à Term C Short-term Dynamics. Short-term Equilibrium. Long-term Dynamics. Long-term Equilibrium. Computational Implications... in brush strokes! Study each Connected Component in Isolation, and then combine the results. 5

9 BT-Rank Algorithm and Computational Analysis Block-Teleportation Rank Input: H, M R n n, ϵ, scalars η, > 0 such that η + = 1. Output: π 1: Let the initial approximation be π. Set k = 0. (0) 2: Compute π (k+1) π (k) H ϕ π (k) M π (k+1) ηπ (k+1) + ϕ : Normalize π and compute (k+1) r = π (k+1) π (k) 1. 4: If r < ϵ, quit with π (k+1), otherwise k k + 1 and go to step 2. General Cost: log ϵ Θ(nnz(H)) }{{} log λ 2(S) per iteration If χ(g) = 2 we can dig a little deeper! Theorem (Eigenvalue Property ) Assuming G is a connected graph for which χ(g) = 2 holds, the spectrum of the stochastic matrix S is such that η + λ(s). Theorem (Lumpability ) The BT-Rank Markov chain that corresponds to a 2-chromatic graph, is lumpable wrt A = {A 1, A 2}. Theorem (Perron Vector ) When the BT-Rank Markov chain is lumpable wrt to partition A, for the left Perron eigenvector of matrix S it holds π 1 1 A 1 = π 2 1 A 2 6

10 Experimental Evaluation

11 Computational Experiments # of iterations # of iterations # of iterations Jester Digg votes Yahoo!Music MovieLens20M TREC Wikipedia(en) Time(sec) Time(sec) Time(sec) Jester Digg votes Yahoo!Music MovieLens20M TREC Wikipedia(en) BT-Rank BT-Rank(NoLump) PageRank 7

12 Qualitative Experiments: Ranking-Based Recommendation I Our Approach We model the recommender as a tripartite graph Users Movies Genres ( ) Personalization through matrix M diag 1e i, 1ω i, 1ϖ i ω i : the normalized vector of the users ratings over the set of movies. ϖ i : the normalized vector of his mean ratings per genre. 8

13 Qualitative Experiments: Ranking-Based Recommendation II Methodology Randomly sample 1.4% of the ratings probe set P Use each item v j, rated with 5 stars by user u i in P test set T Randomly select another 1000 unrated items of the same user for each item in T Form ranked lists by ordering all the 1001 items MovieLens1M Metrics Recall@N Normalized Discounted Cumulative Gain (NDCG@N) Mean Reciprocal Rank Recall@N NDCG@N BT-Rank Katz FP MFA L CT MRR BT-Rank Katz FP MFA L CT 9

14 Conclusions & Future Work

15 Conclusion & Future Work Synopsis We proposed a simple alternative teleportation component for Random Surfing on Multipartite Graphs. Can be handled efficiently Entails nice theoretical properties Allows for richer modeling Future Directions Propose a Systematic Framework for the definition of teleportation models that match better the underlying graphs For the Web-Graph: NCDawareRank (WSDM 1) 10

16 References R. Baeza-Yates, P. Boldi, and C. Castillo. Generic damping functions for propagating importance in link-based ranking. Internet Math., (4): , P.-J. Courtois. Decomposability: Queueing and Computer System Applications. ACM monograph series. Academic Press, P. Cremonesi, Y. Koren, and R. Turrin. Performance of recommender algorithms on top-n recommendation tasks. In Proceedings of the fourth ACM conference on Recommender systems, RecSys 10, pages ACM, F. Fouss, K. Francoisse, L. Yen, A. Pirotte, and M. Saerens. An experimental investigation of kernels on graphs for collaborative recommendation and semisupervised classification. Neural Netw., 1:5 72, July G. Kollias, E. Gallopoulos, and A. Grama. Surfing the network for ranking by multidamping. IEEE Trans. Knowl. Data Eng., 26(9):22 26, B. Long, X. Wu, Z. M. Zhang, and P. S. Yu. Unsupervised learning on k-partite graphs. In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 06, pages 17 26, New York, NY, USA, ACM. A. Nikolakopoulos and J. Garofalakis. NCDREC: A decomposability inspired framework for top-n recommendation. In Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on, pages , Aug A. N. Nikolakopoulos and J. D. Garofalakis. NCDawareRank: a novel ranking method that exploits the decomposable structure of the web. In Proceedings of the sixth ACM international conference on Web search and data mining, WSDM 1, pages ACM, 201. L. Page, S. Brin, R. Motwani, and T. Winograd. The pagerank citation ranking: Bringing order to the web. Technical Report , Stanford InfoLab, November H. A. Simon and A. Ando. Aggregation of variables in dynamic systems. Econometrica, 29(2):pp , 1961.

17 Questions?

18 Appendix: Eigenvalue Theorem Back Theorem (Eigenvalue Property) Assuming G is a connected graph for which χ(g) = 2 holds, the spectrum of the stochastic matrix S is such that η + λ(s). Proof Sketch. #nodes of 1st partite set { }} { We define a vector v [ #nodes of 2nd partite set {}}{ ] We show that ( 1, v) is an eigenpair of matrix H, and (1, v) is an eigenpair of matrix M. Then, using any nonsingular matrix, Q ( 1 v X ), allows us to perform a similarity transformation Q 1 SQ = Q 1 (ηh + M) Q = = 1 0 ηy 1 HX + y 1 MX = 0 η + ηy 2 HX + y 2 MX (1) 0 0 ηy HX + Y MX that reveals the desired eigenvalue.

19 Appendix: Lumpability Theorem Back Theorem (Lumpability of the BT-Rank Chain) The BT-Rank Markov chain that corresponds to a 2-chromatic graph, is lumpable. Proof Sketch. u1 v1 u2 v2 u v w1 w2 u1 u2 u w1 w2 v1 v2 v S = η 2 η 2 η η η 2 2 η η η η η 2 η 2 η 2 η 2 We have Pr{i A 2} = j A 2 S ij = η for all i A 1. Pr{i A 1} = j A 1 S ij = η for all i A 2.

20 Appendix: Perron Vector Theorem Back Theorem (Eigenvector Property) When the BT-Rank Markov chain is lumpable with respect to partition A = {A 1, A 2}, for the left Perron-Frobenius eigenvector of matrix S it holds π 1 1 A 1 = π 2 1 A 2 Proof Sketch. ( ) π 1A1 0 S 0 1 A2 ( ) π M111 A1 ηh 121 A2 ηh 211 A1 M 221 A2 ( π 1 π ) ( ) 1 A1 η1 A1 2 η1 A2 1 A2 ( ) = π 1A A2 = ( π 1 1 A 1 π 2 1 A 2 ) = ( π 1 1 A 1 π 2 1 A 2 ) and the result follows from the solution of the system.

21 Appendix: Near Complete Decomposability Back Herbert A. Simon Sparsity Hierarchy Decomposability [10]. Nearly Completely Decomposable Systems Interactions: Strong Within Blocks - Weak Between Blocks Successful Applications in Diverse Disciplines Economics, Cognitive Theory, A Management, Biology, Ecology, etc. Computer Science and Engineering: Performance Analysis (Courtois [2]) Web Search (NG1 [8]) C Recommendation and Data Mining (NG14 [7]) B ε ε ε ε ε ε ε ε

Random Surfing on Multipartite Graphs

Random Surfing on Multipartite Graphs Random Surfing on Multipartite Graphs Athanasios N. Nikolakopoulos, Antonia Korba and John D. Garofalakis Department of Computer Engineering and Informatics, University of Patras {nikolako,korba,garofala}@ceid.upatras.gr

More information

NCDREC: A Decomposability Inspired Framework for Top-N Recommendation

NCDREC: A Decomposability Inspired Framework for Top-N Recommendation NCDREC: A Decomposability Inspired Framework for Top-N Recommendation Athanasios N. Nikolakopoulos,2 John D. Garofalakis,2 Computer Engineering and Informatics Department, University of Patras, Greece

More information

1998: enter Link Analysis

1998: enter Link Analysis 1998: enter Link Analysis uses hyperlink structure to focus the relevant set combine traditional IR score with popularity score Page and Brin 1998 Kleinberg Web Information Retrieval IR before the Web

More information

Link Analysis. Leonid E. Zhukov

Link Analysis. Leonid E. Zhukov Link Analysis Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Structural Analysis and Visualization

More information

Factored Proximity Models for Top-N Recommendations

Factored Proximity Models for Top-N Recommendations Factored Proximity Models for Top- Recommendations Athanasios. ikolakopoulos 1,3, Vassilis Kalantzis 2, Efstratios Gallopoulos 1 and John D. Garofalakis 1 Department of Computer Engineering and Informatics

More information

Uncertainty and Randomization

Uncertainty and Randomization Uncertainty and Randomization The PageRank Computation in Google Roberto Tempo IEIIT-CNR Politecnico di Torino tempo@polito.it 1993: Robustness of Linear Systems 1993: Robustness of Linear Systems 16 Years

More information

Calculating Web Page Authority Using the PageRank Algorithm

Calculating Web Page Authority Using the PageRank Algorithm Jacob Miles Prystowsky and Levi Gill Math 45, Fall 2005 1 Introduction 1.1 Abstract In this document, we examine how the Google Internet search engine uses the PageRank algorithm to assign quantitatively

More information

Introduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa

Introduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa Introduction to Search Engine Technology Introduction to Link Structure Analysis Ronny Lempel Yahoo Labs, Haifa Outline Anchor-text indexing Mathematical Background Motivation for link structure analysis

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu November 16, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision

More information

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018 Lab 8: Measuring Graph Centrality - PageRank Monday, November 5 CompSci 531, Fall 2018 Outline Measuring Graph Centrality: Motivation Random Walks, Markov Chains, and Stationarity Distributions Google

More information

Node Centrality and Ranking on Networks

Node Centrality and Ranking on Networks Node Centrality and Ranking on Networks Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Social

More information

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine CS 277: Data Mining Mining Web Link Structure Class Presentations In-class, Tuesday and Thursday next week 2-person teams: 6 minutes, up to 6 slides, 3 minutes/slides each person 1-person teams 4 minutes,

More information

Analysis of Google s PageRank

Analysis of Google s PageRank Analysis of Google s PageRank Ilse Ipsen North Carolina State University Joint work with Rebecca M. Wills AN05 p.1 PageRank An objective measure of the citation importance of a web page [Brin & Page 1998]

More information

Updating PageRank. Amy Langville Carl Meyer

Updating PageRank. Amy Langville Carl Meyer Updating PageRank Amy Langville Carl Meyer Department of Mathematics North Carolina State University Raleigh, NC SCCM 11/17/2003 Indexing Google Must index key terms on each page Robots crawl the web software

More information

Krylov Subspace Methods to Calculate PageRank

Krylov Subspace Methods to Calculate PageRank Krylov Subspace Methods to Calculate PageRank B. Vadala-Roth REU Final Presentation August 1st, 2013 How does Google Rank Web Pages? The Web The Graph (A) Ranks of Web pages v = v 1... Dominant Eigenvector

More information

Node and Link Analysis

Node and Link Analysis Node and Link Analysis Leonid E. Zhukov School of Applied Mathematics and Information Science National Research University Higher School of Economics 10.02.2014 Leonid E. Zhukov (HSE) Lecture 5 10.02.2014

More information

Data Mining Recitation Notes Week 3

Data Mining Recitation Notes Week 3 Data Mining Recitation Notes Week 3 Jack Rae January 28, 2013 1 Information Retrieval Given a set of documents, pull the (k) most similar document(s) to a given query. 1.1 Setup Say we have D documents

More information

Applications of The Perron-Frobenius Theorem

Applications of The Perron-Frobenius Theorem Applications of The Perron-Frobenius Theorem Nate Iverson The University of Toledo Toledo, Ohio Motivation In a finite discrete linear dynamical system x n+1 = Ax n What are sufficient conditions for x

More information

Mathematical Properties & Analysis of Google s PageRank

Mathematical Properties & Analysis of Google s PageRank Mathematical Properties & Analysis of Google s PageRank Ilse Ipsen North Carolina State University, USA Joint work with Rebecca M. Wills Cedya p.1 PageRank An objective measure of the citation importance

More information

Spectral Graph Theory and You: Matrix Tree Theorem and Centrality Metrics

Spectral Graph Theory and You: Matrix Tree Theorem and Centrality Metrics Spectral Graph Theory and You: and Centrality Metrics Jonathan Gootenberg March 11, 2013 1 / 19 Outline of Topics 1 Motivation Basics of Spectral Graph Theory Understanding the characteristic polynomial

More information

ECEN 689 Special Topics in Data Science for Communications Networks

ECEN 689 Special Topics in Data Science for Communications Networks ECEN 689 Special Topics in Data Science for Communications Networks Nick Duffield Department of Electrical & Computer Engineering Texas A&M University Lecture 8 Random Walks, Matrices and PageRank Graphs

More information

PageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10

PageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10 PageRank Ryan Tibshirani 36-462/36-662: Data Mining January 24 2012 Optional reading: ESL 14.10 1 Information retrieval with the web Last time we learned about information retrieval. We learned how to

More information

MultiRank and HAR for Ranking Multi-relational Data, Transition Probability Tensors, and Multi-Stochastic Tensors

MultiRank and HAR for Ranking Multi-relational Data, Transition Probability Tensors, and Multi-Stochastic Tensors MultiRank and HAR for Ranking Multi-relational Data, Transition Probability Tensors, and Multi-Stochastic Tensors Michael K. Ng Centre for Mathematical Imaging and Vision and Department of Mathematics

More information

Link Analysis Ranking

Link Analysis Ranking Link Analysis Ranking How do search engines decide how to rank your query results? Guess why Google ranks the query results the way it does How would you do it? Naïve ranking of query results Given query

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu March 16, 2016 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision

More information

Link Analysis. Stony Brook University CSE545, Fall 2016

Link Analysis. Stony Brook University CSE545, Fall 2016 Link Analysis Stony Brook University CSE545, Fall 2016 The Web, circa 1998 The Web, circa 1998 The Web, circa 1998 Match keywords, language (information retrieval) Explore directory The Web, circa 1998

More information

Markov Chains and Spectral Clustering

Markov Chains and Spectral Clustering Markov Chains and Spectral Clustering Ning Liu 1,2 and William J. Stewart 1,3 1 Department of Computer Science North Carolina State University, Raleigh, NC 27695-8206, USA. 2 nliu@ncsu.edu, 3 billy@ncsu.edu

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/7/2012 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 Web pages are not equally important www.joe-schmoe.com

More information

Top-N recommendations in the presence of sparsity: An NCD-based approach

Top-N recommendations in the presence of sparsity: An NCD-based approach Top-N recommendations in the presence of sparsity: An NCD-based approach arxiv:7.43v2 [cs.ir] 7 Jul 2 Athanasios N. Nikolakopoulos a,b, and John D. Garofalakis a,b, a Department of Computer Engineering

More information

Link Analysis Information Retrieval and Data Mining. Prof. Matteo Matteucci

Link Analysis Information Retrieval and Data Mining. Prof. Matteo Matteucci Link Analysis Information Retrieval and Data Mining Prof. Matteo Matteucci Hyperlinks for Indexing and Ranking 2 Page A Hyperlink Page B Intuitions The anchor text might describe the target page B Anchor

More information

A Note on Google s PageRank

A Note on Google s PageRank A Note on Google s PageRank According to Google, google-search on a given topic results in a listing of most relevant web pages related to the topic. Google ranks the importance of webpages according to

More information

UpdatingtheStationary VectorofaMarkovChain. Amy Langville Carl Meyer

UpdatingtheStationary VectorofaMarkovChain. Amy Langville Carl Meyer UpdatingtheStationary VectorofaMarkovChain Amy Langville Carl Meyer Department of Mathematics North Carolina State University Raleigh, NC NSMC 9/4/2003 Outline Updating and Pagerank Aggregation Partitioning

More information

Computing PageRank using Power Extrapolation

Computing PageRank using Power Extrapolation Computing PageRank using Power Extrapolation Taher Haveliwala, Sepandar Kamvar, Dan Klein, Chris Manning, and Gene Golub Stanford University Abstract. We present a novel technique for speeding up the computation

More information

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu How to organize/navigate it? First try: Human curated Web directories Yahoo, DMOZ, LookSmart

More information

Data and Algorithms of the Web

Data and Algorithms of the Web Data and Algorithms of the Web Link Analysis Algorithms Page Rank some slides from: Anand Rajaraman, Jeffrey D. Ullman InfoLab (Stanford University) Link Analysis Algorithms Page Rank Hubs and Authorities

More information

Wiki Definition. Reputation Systems I. Outline. Introduction to Reputations. Yury Lifshits. HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte

Wiki Definition. Reputation Systems I. Outline. Introduction to Reputations. Yury Lifshits. HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte Reputation Systems I HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte Yury Lifshits Wiki Definition Reputation is the opinion (more technically, a social evaluation) of the public toward a person, a

More information

Data Mining Techniques

Data Mining Techniques Data Mining Techniques CS 622 - Section 2 - Spring 27 Pre-final Review Jan-Willem van de Meent Feedback Feedback https://goo.gl/er7eo8 (also posted on Piazza) Also, please fill out your TRACE evaluations!

More information

EigenRec: Generalizing PureSVD for Effective and Efficient Top-N Recommendations

EigenRec: Generalizing PureSVD for Effective and Efficient Top-N Recommendations KAIS manuscript No. (will be inserted by the editor) EigenRec: Generalizing PureSVD for Effective and Efficient Top-N Recommendations Athanasios N. Nikolakopoulos Vassilis Kalantzis Efstratios Gallopoulos

More information

CS249: ADVANCED DATA MINING

CS249: ADVANCED DATA MINING CS249: ADVANCED DATA MINING Graph and Network Instructor: Yizhou Sun yzsun@cs.ucla.edu May 31, 2017 Methods Learnt Classification Clustering Vector Data Text Data Recommender System Decision Tree; Naïve

More information

Graph Models The PageRank Algorithm

Graph Models The PageRank Algorithm Graph Models The PageRank Algorithm Anna-Karin Tornberg Mathematical Models, Analysis and Simulation Fall semester, 2013 The PageRank Algorithm I Invented by Larry Page and Sergey Brin around 1998 and

More information

Personalized PageRank with node-dependent restart

Personalized PageRank with node-dependent restart Personalized PageRank with node-dependent restart Konstantin Avrachenkov Remco van der Hofstad Marina Sokol May 204 Abstract Personalized PageRank is an algorithm to classify the improtance of web pages

More information

SUPPLEMENTARY MATERIALS TO THE PAPER: ON THE LIMITING BEHAVIOR OF PARAMETER-DEPENDENT NETWORK CENTRALITY MEASURES

SUPPLEMENTARY MATERIALS TO THE PAPER: ON THE LIMITING BEHAVIOR OF PARAMETER-DEPENDENT NETWORK CENTRALITY MEASURES SUPPLEMENTARY MATERIALS TO THE PAPER: ON THE LIMITING BEHAVIOR OF PARAMETER-DEPENDENT NETWORK CENTRALITY MEASURES MICHELE BENZI AND CHRISTINE KLYMKO Abstract This document contains details of numerical

More information

eigenvalues, markov matrices, and the power method

eigenvalues, markov matrices, and the power method eigenvalues, markov matrices, and the power method Slides by Olson. Some taken loosely from Jeff Jauregui, Some from Semeraro L. Olson Department of Computer Science University of Illinois at Urbana-Champaign

More information

Data Mining and Matrices

Data Mining and Matrices Data Mining and Matrices 10 Graphs II Rainer Gemulla, Pauli Miettinen Jul 4, 2013 Link analysis The web as a directed graph Set of web pages with associated textual content Hyperlinks between webpages

More information

Lecture: Local Spectral Methods (1 of 4)

Lecture: Local Spectral Methods (1 of 4) Stat260/CS294: Spectral Graph Methods Lecture 18-03/31/2015 Lecture: Local Spectral Methods (1 of 4) Lecturer: Michael Mahoney Scribe: Michael Mahoney Warning: these notes are still very rough. They provide

More information

Applications. Nonnegative Matrices: Ranking

Applications. Nonnegative Matrices: Ranking Applications of Nonnegative Matrices: Ranking and Clustering Amy Langville Mathematics Department College of Charleston Hamilton Institute 8/7/2008 Collaborators Carl Meyer, N. C. State University David

More information

Introduction to Data Mining

Introduction to Data Mining Introduction to Data Mining Lecture #9: Link Analysis Seoul National University 1 In This Lecture Motivation for link analysis Pagerank: an important graph ranking algorithm Flow and random walk formulation

More information

A hybrid reordered Arnoldi method to accelerate PageRank computations

A hybrid reordered Arnoldi method to accelerate PageRank computations A hybrid reordered Arnoldi method to accelerate PageRank computations Danielle Parker Final Presentation Background Modeling the Web The Web The Graph (A) Ranks of Web pages v = v 1... Dominant Eigenvector

More information

Math 304 Handout: Linear algebra, graphs, and networks.

Math 304 Handout: Linear algebra, graphs, and networks. Math 30 Handout: Linear algebra, graphs, and networks. December, 006. GRAPHS AND ADJACENCY MATRICES. Definition. A graph is a collection of vertices connected by edges. A directed graph is a graph all

More information

THE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR

THE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR THE PERTURBATION BOUND FOR THE SPECTRAL RADIUS OF A NON-NEGATIVE TENSOR WEN LI AND MICHAEL K. NG Abstract. In this paper, we study the perturbation bound for the spectral radius of an m th - order n-dimensional

More information

Link Mining PageRank. From Stanford C246

Link Mining PageRank. From Stanford C246 Link Mining PageRank From Stanford C246 Broad Question: How to organize the Web? First try: Human curated Web dictionaries Yahoo, DMOZ LookSmart Second try: Web Search Information Retrieval investigates

More information

1 Searching the World Wide Web

1 Searching the World Wide Web Hubs and Authorities in a Hyperlinked Environment 1 Searching the World Wide Web Because diverse users each modify the link structure of the WWW within a relatively small scope by creating web-pages on

More information

Eigenvalue Problems Computation and Applications

Eigenvalue Problems Computation and Applications Eigenvalue ProblemsComputation and Applications p. 1/36 Eigenvalue Problems Computation and Applications Che-Rung Lee cherung@gmail.com National Tsing Hua University Eigenvalue ProblemsComputation and

More information

The Second Eigenvalue of the Google Matrix

The Second Eigenvalue of the Google Matrix The Second Eigenvalue of the Google Matrix Taher H. Haveliwala and Sepandar D. Kamvar Stanford University {taherh,sdkamvar}@cs.stanford.edu Abstract. We determine analytically the modulus of the second

More information

ON THE LIMITING PROBABILITY DISTRIBUTION OF A TRANSITION PROBABILITY TENSOR

ON THE LIMITING PROBABILITY DISTRIBUTION OF A TRANSITION PROBABILITY TENSOR ON THE LIMITING PROBABILITY DISTRIBUTION OF A TRANSITION PROBABILITY TENSOR WEN LI AND MICHAEL K. NG Abstract. In this paper we propose and develop an iterative method to calculate a limiting probability

More information

Francois Fouss, Alain Pirotte, Jean-Michel Renders & Marco Saerens. January 31, 2006

Francois Fouss, Alain Pirotte, Jean-Michel Renders & Marco Saerens. January 31, 2006 A novel way of computing dissimilarities between nodes of a graph, with application to collaborative filtering and subspace projection of the graph nodes Francois Fouss, Alain Pirotte, Jean-Michel Renders

More information

DATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS

DATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS DATA MINING LECTURE 3 Link Analysis Ranking PageRank -- Random walks HITS How to organize the web First try: Manually curated Web Directories How to organize the web Second try: Web Search Information

More information

MAE 298, Lecture 8 Feb 4, Web search and decentralized search on small-worlds

MAE 298, Lecture 8 Feb 4, Web search and decentralized search on small-worlds MAE 298, Lecture 8 Feb 4, 2008 Web search and decentralized search on small-worlds Search for information Assume some resource of interest is stored at the vertices of a network: Web pages Files in a file-sharing

More information

Complex Social System, Elections. Introduction to Network Analysis 1

Complex Social System, Elections. Introduction to Network Analysis 1 Complex Social System, Elections Introduction to Network Analysis 1 Complex Social System, Network I person A voted for B A is more central than B if more people voted for A In-degree centrality index

More information

Algebraic Representation of Networks

Algebraic Representation of Networks Algebraic Representation of Networks 0 1 2 1 1 0 0 1 2 0 0 1 1 1 1 1 Hiroki Sayama sayama@binghamton.edu Describing networks with matrices (1) Adjacency matrix A matrix with rows and columns labeled by

More information

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68 CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68 References 1 L. Freeman, Centrality in Social Networks: Conceptual Clarification, Social Networks, Vol. 1, 1978/1979, pp. 215 239. 2 S. Wasserman

More information

Pseudocode for calculating Eigenfactor TM Score and Article Influence TM Score using data from Thomson-Reuters Journal Citations Reports

Pseudocode for calculating Eigenfactor TM Score and Article Influence TM Score using data from Thomson-Reuters Journal Citations Reports Pseudocode for calculating Eigenfactor TM Score and Article Influence TM Score using data from Thomson-Reuters Journal Citations Reports Jevin West and Carl T. Bergstrom November 25, 2008 1 Overview There

More information

Computational Economics and Finance

Computational Economics and Finance Computational Economics and Finance Part II: Linear Equations Spring 2016 Outline Back Substitution, LU and other decomposi- Direct methods: tions Error analysis and condition numbers Iterative methods:

More information

Page rank computation HPC course project a.y

Page rank computation HPC course project a.y Page rank computation HPC course project a.y. 2015-16 Compute efficient and scalable Pagerank MPI, Multithreading, SSE 1 PageRank PageRank is a link analysis algorithm, named after Brin & Page [1], and

More information

Decoupled Collaborative Ranking

Decoupled Collaborative Ranking Decoupled Collaborative Ranking Jun Hu, Ping Li April 24, 2017 Jun Hu, Ping Li WWW2017 April 24, 2017 1 / 36 Recommender Systems Recommendation system is an information filtering technique, which provides

More information

The Push Algorithm for Spectral Ranking

The Push Algorithm for Spectral Ranking The Push Algorithm for Spectral Ranking Paolo Boldi Sebastiano Vigna March 8, 204 Abstract The push algorithm was proposed first by Jeh and Widom [6] in the context of personalized PageRank computations

More information

Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University.

Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University. Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University http://www.mmds.org #1: C4.5 Decision Tree - Classification (61 votes) #2: K-Means - Clustering

More information

MATH36001 Perron Frobenius Theory 2015

MATH36001 Perron Frobenius Theory 2015 MATH361 Perron Frobenius Theory 215 In addition to saying something useful, the Perron Frobenius theory is elegant. It is a testament to the fact that beautiful mathematics eventually tends to be useful,

More information

Locally-biased analytics

Locally-biased analytics Locally-biased analytics You have BIG data and want to analyze a small part of it: Solution 1: Cut out small part and use traditional methods Challenge: cutting out may be difficult a priori Solution 2:

More information

Distributed Randomized Algorithms for the PageRank Computation Hideaki Ishii, Member, IEEE, and Roberto Tempo, Fellow, IEEE

Distributed Randomized Algorithms for the PageRank Computation Hideaki Ishii, Member, IEEE, and Roberto Tempo, Fellow, IEEE IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 55, NO. 9, SEPTEMBER 2010 1987 Distributed Randomized Algorithms for the PageRank Computation Hideaki Ishii, Member, IEEE, and Roberto Tempo, Fellow, IEEE Abstract

More information

Multiple Relational Ranking in Tensor: Theory, Algorithms and Applications

Multiple Relational Ranking in Tensor: Theory, Algorithms and Applications Multiple Relational Ranking in Tensor: Theory, Algorithms and Applications Michael K. Ng Centre for Mathematical Imaging and Vision and Department of Mathematics Hong Kong Baptist University Email: mng@math.hkbu.edu.hk

More information

Personalized PageRank with Node-dependent Restart

Personalized PageRank with Node-dependent Restart Reprint from Moscow Journal of Combinatorics and Number Theory 204, vol. 4, iss. 4, pp. 5 20, [pp. 387 402] Scale =.7 PS:./fig-eps/Logo-MJCNT.eps Personalized PageRank with Node-dependent Restart Konstantin

More information

Finding central nodes in large networks

Finding central nodes in large networks Finding central nodes in large networks Nelly Litvak University of Twente Eindhoven University of Technology, The Netherlands Woudschoten Conference 2017 Complex networks Networks: Internet, WWW, social

More information

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank

More information

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank

More information

ECS289: Scalable Machine Learning

ECS289: Scalable Machine Learning ECS289: Scalable Machine Learning Cho-Jui Hsieh UC Davis Oct 11, 2016 Paper presentations and final project proposal Send me the names of your group member (2 or 3 students) before October 15 (this Friday)

More information

Application. Stochastic Matrices and PageRank

Application. Stochastic Matrices and PageRank Application Stochastic Matrices and PageRank Stochastic Matrices Definition A square matrix A is stochastic if all of its entries are nonnegative, and the sum of the entries of each column is. We say A

More information

Statistical Problem. . We may have an underlying evolving system. (new state) = f(old state, noise) Input data: series of observations X 1, X 2 X t

Statistical Problem. . We may have an underlying evolving system. (new state) = f(old state, noise) Input data: series of observations X 1, X 2 X t Markov Chains. Statistical Problem. We may have an underlying evolving system (new state) = f(old state, noise) Input data: series of observations X 1, X 2 X t Consecutive speech feature vectors are related

More information

IR: Information Retrieval

IR: Information Retrieval / 44 IR: Information Retrieval FIB, Master in Innovation and Research in Informatics Slides by Marta Arias, José Luis Balcázar, Ramon Ferrer-i-Cancho, Ricard Gavaldá Department of Computer Science, UPC

More information

Hub, Authority and Relevance Scores in Multi-Relational Data for Query Search

Hub, Authority and Relevance Scores in Multi-Relational Data for Query Search Hub, Authority and Relevance Scores in Multi-Relational Data for Query Search Xutao Li 1 Michael Ng 2 Yunming Ye 1 1 Department of Computer Science, Shenzhen Graduate School, Harbin Institute of Technology,

More information

Applications to network analysis: Eigenvector centrality indices Lecture notes

Applications to network analysis: Eigenvector centrality indices Lecture notes Applications to network analysis: Eigenvector centrality indices Lecture notes Dario Fasino, University of Udine (Italy) Lecture notes for the second part of the course Nonnegative and spectral matrix

More information

Slides based on those in:

Slides based on those in: Spyros Kontogiannis & Christos Zaroliagis Slides based on those in: http://www.mmds.org High dim. data Graph data Infinite data Machine learning Apps Locality sensitive hashing PageRank, SimRank Filtering

More information

Some relationships between Kleinberg s hubs and authorities, correspondence analysis, and the Salsa algorithm

Some relationships between Kleinberg s hubs and authorities, correspondence analysis, and the Salsa algorithm Some relationships between Kleinberg s hubs and authorities, correspondence analysis, and the Salsa algorithm François Fouss 1, Jean-Michel Renders 2 & Marco Saerens 1 {saerens,fouss}@isys.ucl.ac.be, jean-michel.renders@xrce.xerox.com

More information

Using SVD to Recommend Movies

Using SVD to Recommend Movies Michael Percy University of California, Santa Cruz Last update: December 12, 2009 Last update: December 12, 2009 1 / Outline 1 Introduction 2 Singular Value Decomposition 3 Experiments 4 Conclusion Last

More information

RaRE: Social Rank Regulated Large-scale Network Embedding

RaRE: Social Rank Regulated Large-scale Network Embedding RaRE: Social Rank Regulated Large-scale Network Embedding Authors: Yupeng Gu 1, Yizhou Sun 1, Yanen Li 2, Yang Yang 3 04/26/2018 The Web Conference, 2018 1 University of California, Los Angeles 2 Snapchat

More information

Online Social Networks and Media. Link Analysis and Web Search

Online Social Networks and Media. Link Analysis and Web Search Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information

More information

Some relationships between Kleinberg s hubs and authorities, correspondence analysis, and the Salsa algorithm

Some relationships between Kleinberg s hubs and authorities, correspondence analysis, and the Salsa algorithm Some relationships between Kleinberg s hubs and authorities, correspondence analysis, and the Salsa algorithm François Fouss 1, Jean-Michel Renders 2, Marco Saerens 1 1 ISYS Unit, IAG Université catholique

More information

Links between Kleinberg s hubs and authorities, correspondence analysis, and Markov chains

Links between Kleinberg s hubs and authorities, correspondence analysis, and Markov chains Links between Kleinberg s hubs and authorities, correspondence analysis, and Markov chains Francois Fouss, Jean-Michel Renders & Marco Saerens Université Catholique de Louvain and Xerox Research Center

More information

Jeffrey D. Ullman Stanford University

Jeffrey D. Ullman Stanford University Jeffrey D. Ullman Stanford University 2 Often, our data can be represented by an m-by-n matrix. And this matrix can be closely approximated by the product of two matrices that share a small common dimension

More information

Link Analysis. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze

Link Analysis. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze Link Analysis Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze 1 The Web as a Directed Graph Page A Anchor hyperlink Page B Assumption 1: A hyperlink between pages

More information

Ten good reasons to use the Eigenfactor TM metrics

Ten good reasons to use the Eigenfactor TM metrics Ten good reasons to use the Eigenfactor TM metrics Massimo Franceschet Department of Mathematics and Computer Science, University of Udine Via delle Scienze 206 33100 Udine, Italy massimo.franceschet@dimi.uniud.it

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University.

CS246: Mining Massive Datasets Jure Leskovec, Stanford University. CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu What is the structure of the Web? How is it organized? 2/7/2011 Jure Leskovec, Stanford C246: Mining Massive

More information

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS LINK ANALYSIS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Retrieval evaluation Link analysis Models

More information

The PageRank Computation in Google: Randomization and Ergodicity

The PageRank Computation in Google: Randomization and Ergodicity The PageRank Computation in Google: Randomization and Ergodicity Roberto Tempo CNR-IEIIT Consiglio Nazionale delle Ricerche Politecnico di Torino roberto.tempo@polito.it Randomization over Networks - 1

More information

Online Social Networks and Media. Link Analysis and Web Search

Online Social Networks and Media. Link Analysis and Web Search Online Social Networks and Media Link Analysis and Web Search How to Organize the Web First try: Human curated Web directories Yahoo, DMOZ, LookSmart How to organize the web Second try: Web Search Information

More information

0.1 Naive formulation of PageRank

0.1 Naive formulation of PageRank PageRank is a ranking system designed to find the best pages on the web. A webpage is considered good if it is endorsed (i.e. linked to) by other good webpages. The more webpages link to it, and the more

More information

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018

ELE 538B: Mathematics of High-Dimensional Data. Spectral methods. Yuxin Chen Princeton University, Fall 2018 ELE 538B: Mathematics of High-Dimensional Data Spectral methods Yuxin Chen Princeton University, Fall 2018 Outline A motivating application: graph clustering Distance and angles between two subspaces Eigen-space

More information

Inf 2B: Ranking Queries on the WWW

Inf 2B: Ranking Queries on the WWW Inf B: Ranking Queries on the WWW Kyriakos Kalorkoti School of Informatics University of Edinburgh Queries Suppose we have an Inverted Index for a set of webpages. Disclaimer Not really the scenario of

More information

Analysis and Computation of Google s PageRank

Analysis and Computation of Google s PageRank Analysis and Computation of Google s PageRank Ilse Ipsen North Carolina State University, USA Joint work with Rebecca S. Wills ANAW p.1 PageRank An objective measure of the citation importance of a web

More information

A Fast Two-Stage Algorithm for Computing PageRank

A Fast Two-Stage Algorithm for Computing PageRank A Fast Two-Stage Algorithm for Computing PageRank Chris Pan-Chi Lee Stanford University cpclee@stanford.edu Gene H. Golub Stanford University golub@stanford.edu Stefanos A. Zenios Stanford University stefzen@stanford.edu

More information