Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Similar documents
Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University.

Link Mining PageRank. From Stanford C246

Slides based on those in:

CS246: Mining Massive Datasets Jure Leskovec, Stanford University.

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Introduction to Data Mining

Data and Algorithms of the Web

Online Social Networks and Media. Link Analysis and Web Search

DATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS

Online Social Networks and Media. Link Analysis and Web Search

CS6220: DATA MINING TECHNIQUES

CS345a: Data Mining Jure Leskovec and Anand Rajaraman Stanford University

CS6220: DATA MINING TECHNIQUES

Link Analysis. Stony Brook University CSE545, Fall 2016

Link Analysis & Ranking CS 224W

0.1 Naive formulation of PageRank

CS249: ADVANCED DATA MINING

Google PageRank. Francesco Ricci Faculty of Computer Science Free University of Bozen-Bolzano

Jeffrey D. Ullman Stanford University

Jeffrey D. Ullman Stanford University

Link Analysis. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze

1998: enter Link Analysis

1 Searching the World Wide Web

Link Analysis Ranking

Web Ranking. Classification (manual, automatic) Link Analysis (today s lesson)

PageRank algorithm Hubs and Authorities. Data mining. Web Data Mining PageRank, Hubs and Authorities. University of Szeged.

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine

A Note on Google s PageRank

Introduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa

Data Mining Recitation Notes Week 3

Complex Social System, Elections. Introduction to Network Analysis 1

Faloutsos, Tong ICDE, 2009

Lecture 12: Link Analysis for Web Retrieval

Statistical Problem. . We may have an underlying evolving system. (new state) = f(old state, noise) Input data: series of observations X 1, X 2 X t

PageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10

IS4200/CS6200 Informa0on Retrieval. PageRank Con+nued. with slides from Hinrich Schütze and Chris6na Lioma

IR: Information Retrieval

Information Retrieval and Search. Web Linkage Mining. Miłosz Kadziński

Web Ranking. Classification (manual, automatic) Link Analysis (today s lesson)

Data Mining and Matrices

Google Page Rank Project Linear Algebra Summer 2012

Wiki Definition. Reputation Systems I. Outline. Introduction to Reputations. Yury Lifshits. HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte

Link Analysis. Leonid E. Zhukov

6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search

STA141C: Big Data & High Performance Statistical Computing

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018

ECEN 689 Special Topics in Data Science for Communications Networks

Large Graph Mining: Power Tools and a Practitioner s guide

Talk 2: Graph Mining Tools - SVD, ranking, proximity. Christos Faloutsos CMU

Hyperlinked-Induced Topic Search (HITS) identifies. authorities as good content sources (~high indegree) HITS [Kleinberg 99] considers a web page

As it is not necessarily possible to satisfy this equation, we just ask for a solution to the more general equation

MultiRank and HAR for Ranking Multi-relational Data, Transition Probability Tensors, and Multi-Stochastic Tensors

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

Graph Models The PageRank Algorithm

Node Centrality and Ranking on Networks

MAE 298, Lecture 8 Feb 4, Web search and decentralized search on small-worlds

Degree Distribution: The case of Citation Networks

How does Google rank webpages?

Link Analysis Information Retrieval and Data Mining. Prof. Matteo Matteucci

CS54701 Information Retrieval. Link Analysis. Luo Si. Department of Computer Science Purdue University. Borrowed Slides from Prof.

eigenvalues, markov matrices, and the power method

Data Mining Techniques

Node and Link Analysis

Lesson Plan. AM 121: Introduction to Optimization Models and Methods. Lecture 17: Markov Chains. Yiling Chen SEAS. Stochastic process Markov Chains

Machine Learning CPSC 340. Tutorial 12

CS47300: Web Information Search and Management

Inf 2B: Ranking Queries on the WWW

How works. or How linear algebra powers the search engine. M. Ram Murty, FRSC Queen s Research Chair Queen s University

Uncertainty and Randomization

Web Structure Mining Nodes, Links and Influence

Lecture 7 Mathematics behind Internet Search

Computing PageRank using Power Extrapolation

googling it: how google ranks search results Courtney R. Gibbons October 17, 2017

CS 3750 Advanced Machine Learning. Applications of SVD and PCA (LSA and Link analysis) Cem Akkaya

Data science with multilayer networks: Mathematical foundations and applications

Applications. Nonnegative Matrices: Ranking

Lecture: Local Spectral Methods (1 of 4)

Analysis and Computation of Google s PageRank

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68

Page rank computation HPC course project a.y

Math 304 Handout: Linear algebra, graphs, and networks.

Updating PageRank. Amy Langville Carl Meyer

Models and Algorithms for Complex Networks. Link Analysis Ranking

INTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505

Finding central nodes in large networks

Lecture: Local Spectral Methods (2 of 4) 19 Computing spectral ranking with the push procedure

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CMU SCS. Large Graph Mining. Christos Faloutsos CMU

Hub, Authority and Relevance Scores in Multi-Relational Data for Query Search

Combating Web Spam with TrustRank

arxiv:cond-mat/ v1 3 Sep 2004

Node similarity and classification

Outline for today. Information Retrieval. Cosine similarity between query and document. tf-idf weighting

PageRank: The Math-y Version (Or, What To Do When You Can t Tear Up Little Pieces of Paper)

INTRODUCTION TO MCMC AND PAGERANK. Eric Vigoda Georgia Tech. Lecture for CS 6505

Multiple Relational Ranking in Tensor: Theory, Algorithms and Applications

Ranking on Large-Scale Graphs with Rich Metadata

Transcription:

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank Personalized PageRank

How to organize the Web? First try: Human curated Web directories Yahoo, DMOZ, LookSmart Second try: Web Search Information Retrieval attempts to find relevant docs in a small and trusted set Newspaper articles, Patents, etc. But: Web is huge, full of untrusted documents, random things, web spam, etc. So we need a good way to rank webpages! 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 3

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 4 2 challenges of web search: (1) Web contains many sources of information Who to trust? Insight: Trustworthy pages may point to each other! (2) What is the best answer to query newspaper? No single right answer Insight: Pages that actually know about newspapers might all be pointing to many newspapers

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 5 All web pages are not equally important www.joe-schmoe.com vs. www.stanford.edu We already know: There is large diversity in the web-graph node connectivity. So, let s rank the pages using the web graph link structure! vs.

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 6 We will cover the following Link Analysis approaches to computing importance of nodes in a graph: Hubs and Authorities (HITS) Page Rank Topic-Specific (Personalized) Page Rank Sidenote: Various notions of node centrality: Node u Degree centrality = degree of u Betweenness centrality = #shortest paths passing through u Closeness centrality = avg. length of shortest paths from u to all other nodes of the network Eigenvector centrality = like PageRank

Goal (back to the newspaper example): Don t just find newspapers. Find experts pages that link in a coordinated way to good newspapers Idea: Links as votes Page is more important if it has more links In-coming links? Out-going links? Hubs and Authorities Each page has 2 scores: Quality as an expert (hub): Total sum of votes of pages pointed to Quality as a content (authority): Total sum of votes of experts Principle of repeated improvement NYT: 10 Ebay: 3 Yahoo: 3 CNN: 8 WSJ: 9 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 8

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 9 Interesting pages fall into two classes: 1. Authorities are pages containing useful information Newspaper home pages Course home pages Home pages of auto manufacturers 2. Hubs are pages that link to authorities List of newspapers Course bulletin List of U.S. auto manufacturers NYT: 10 Ebay: 3 Yahoo: 3 CNN: 8 WSJ: 9

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 10 Each page starts with hub score 1 Authorities collect their votes (Note this is idealized example. In reality graph is not bipartite and each page has both the hub and authority score)

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 11 Hubs collect authority scores (Note this is idealized example. In reality graph is not bipartite and each page has both the hub and authority score)

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 12 Authorities collect hub scores (Note this is idealized example. In reality graph is not bipartite and each page has both the hub and authority score)

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 13 A good hub links to many good authorities A good authority is linked from many good hubs Note a self-reinforcing recursive definition Model using two scores for each node: Hub score and Authority score Represented as vectors h and a, where the i-th element is the hub/authority score of the i-th node

[Kleinberg 98] Each page i has 2 scores: i Authority score: a i Hub score: h i i HITS algorithm: (0) (0) Initialize: a j = 1/ n, hj = 1/ n Then keep iterating until convergence: i: Authority: a i (t+1) = j i h j (t) i: Hub: h i (t+1) = i j a j (t) i: Normalize: i a i t+1 2 = 1, j h j t+1 2 = 1 Convergence criteria: h i t h i t+1 2 < ε a i t a i t+1 2 < ε 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 14

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 17 Definition: Eigenvectors & Eigenvalues Let R x = λ x for some scalar λ, vector x, matrix R Then x is an eigenvector, and λ is its eigenvalue The steady state (HITS has converged) is: A T A a = c a A A T h = c h So, authority a is eigenvector of A T A (associated with the largest eigenvalue) Similarly: hub h is eigenvector of AA T Note constants c,c don t matter as we normalize them out every step of HITS

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 19 Still the same idea: Links as votes Page is more important if it has more links In-coming links? Out-going links? Think of in-links as votes: www.stanford.edu has 23,400 in-links www.joe-schmoe.com has 1 in-link Are all in-links equal? Links from important pages count more Recursive question!

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 20 A vote from an important page is worth more: Each link s vote is proportional to the importance of its source page If page i with importance r i has d i out-links, each link gets r i / d i votes Page j s own importance r j is the sum of the votes on its inlinks i j k r i /3 rk /4 r j /3 r j /3 r j /3 r j = r i /3 + r k /4

A page is important if it is pointed to by other important pages Define a rank r j for node j r j i j r i d i d i out-degree of node i a/2 a The web in 1839 a/2 y/2 y y/2 m Flow equations: r y = r y /2 + r a /2 m You might wonder: Let s just use Gaussian elimination to solve this system of linear equations. Bad idea! r a = r y /2 + r m r m = r a /2 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 21

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 22 Stochastic adjacency matrix M Let page j have d j out-links i j If j i, then M ij = 1 d j M is a column stochastic matrix Columns sum to 1 1/3 Rank vector r: An entry per page r i is the importance score of page i i r i = 1 The flow equations can be written r = M r r j M i j r i d i

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 23 Imagine a random web surfer: At any time t, surfer is on some page i At time t + 1, the surfer follows an out-link from i uniformly at random Ends up on some page j linked from i Process repeats indefinitely Let: p(t) vector whose i th coordinate is the prob. that the surfer is at page i at time t So, p(t) is a probability distribution over pages r j i 1 i 2 i 3 j i j d out ri (i)

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 24 Where is the surfer at time t+1? Follows a link uniformly at random p t + 1 = M p(t) Suppose the random walk reaches a state p t + 1 = M p(t) = p(t) then p(t) is stationary distribution of a random walk Our original rank vector r satisfies r = M r So, r is a stationary distribution for the random walk i 1 i 2 i 3 j p( t 1) M p( t)

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 26 Given a web graph with n nodes, where the nodes are pages and edges are hyperlinks Assign each node an initial page rank Repeat until convergence ( i r i (t+1) r i (t) < ) Calculate the page rank of each node r ( t 1) j i j r ( t) i d i d i. out-degree of node i

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 27 Power Iteration: Set r j 1/N 1: r j i j r i d i 2: r r If r r > ε: goto 1 Example: r y 1/3 1/3 5/12 9/24 6/15 r a = 1/3 3/6 1/3 11/24 6/15 r m 1/3 1/6 3/12 1/6 3/15 Iteration 0, 1, 2, a y m y a m y ½ ½ 0 a ½ 0 1 m 0 ½ 0 r y = r y /2 + r a /2 r a = r y /2 + r m r m = r a /2

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 28 Power Iteration: Set r j 1/N 1: r j i j r i d i 2: r r If r r > ε: goto 1 Example: r y 1/3 1/3 5/12 9/24 6/15 r a = 1/3 3/6 1/3 11/24 6/15 r m 1/3 1/6 3/12 1/6 3/15 Iteration 0, 1, 2, a y m y a m y ½ ½ 0 a ½ 0 1 m 0 ½ 0 r y = r y /2 + r a /2 r a = r y /2 + r m r m = r a /2

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 29 r ( t 1) j i j r ( t) i d i or equivalently r Mr Does this converge? Does it converge to what we want? Are results reasonable?

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 30 The Spider trap problem: Example: a b Iteration: 0, 1, 2, 3 r ( t 1) j i j r ( t) i d i r a 1 0 1 0 = r b 0 1 0 1

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 31 The Dead end problem: Example: a b Iteration: 0, 1, 2, 3 r ( t 1) j i j r ( t) i d i r a 1 0 0 0 r = b 0 1 0 0

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 32 2 problems: (1) Some pages are dead ends (have no out-links) Such pages cause importance to leak out (2) Spider traps (all out-links are within the group) Eventually spider traps absorb all importance

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 33 Power Iteration: y y a m y ½ ½ 0 Set r j = 1 N a m a ½ 0 0 m 0 ½ 1 r j = i j r i d i And iterate r y = r y /2 + r a /2 r a = r y /2 Example: r m = r a /2 + r m r y 1/3 2/6 3/12 5/24 0 r a = 1/3 1/6 2/12 3/24 0 r m 1/3 3/6 7/12 16/24 1 Iteration 0, 1, 2,

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 34 The Google solution for spider traps: At each time step, the random surfer has two options With prob., follow a link at random With prob. 1-, jump to a random page Common values for are in the range 0.8 to 0.9 Surfer will teleport out of spider trap within a few time steps y y a m a m

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 35 Power Iteration: y y a m y ½ ½ 0 Set r j = 1 N a m a ½ 0 0 m 0 ½ 0 r j = i j r i d i And iterate r y = r y /2 + r a /2 r a = r y /2 Example: r m = r a /2 r y 1/3 2/6 3/12 5/24 0 r a = 1/3 1/6 2/12 3/24 0 r m 1/3 1/6 1/12 2/24 0 Iteration 0, 1, 2,

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 36 Teleports: Follow random teleport links with probability 1.0 from dead-ends Adjust matrix accordingly y y a m a m y a m y ½ ½ 0 a ½ 0 0 m 0 ½ 0 y a m y ½ ½ ⅓ a ½ 0 ⅓ m 0 ½ ⅓

Google s solution: At each step, random surfer has two options: With probability, follow a link at random With probability 1-, jump to some random page PageRank equation [Brin-Page, 98] r j = i j β r i d i + (1 β) 1 n di out-degree of node i The above formulation assumes that M has no dead ends. We can either preprocess matrix M (bad!) or explicitly follow random teleport links with probability 1.0 from dead-ends. See P. Berkhin, A Survey on PageRank Computing, Internet Mathematics, 2005. 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 37

Details! 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 38 PageRank as a principal eigenvector r r = M r or equivalently r j = i i d i But we really want (**): r r j = β i i + 1 β 1 d i n Let s define: M ij = β M ij + (1 β) 1 n Now we get what we want: r = M r What is 1? In practice 0.15 (Jump approx. every 5-6 links) d i out-degree of node i Note: M is a sparse matrix but M is dense (all entries 0). In practice we never materialize M but rather we use the sum formulation (**)

Input: Graph G and parameter β Directed graph G with spider traps and dead ends Parameter β Output: PageRank vector r Set: r 0 j = 1, t = 1 N do: (t) j: r j = i j β r (t 1) i d i (t) r j = 0 if in-deg. of j is 0 Now re-insert the leaked PageRank: j: r j t t = t + 1 = r j t + 1 S N while j r j (t) rj (t 1) where: S = j r j (t) > ε 2/10/2017 39

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 40

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 41 PageRank and HITS are two solutions to the same problem: What is the value of an in-link from u to v? In the PageRank model, the value of the link depends on the links into u In the HITS model, it depends on the value of the other links out of u The destinies of PageRank and HITS post-1998 were very different

[Tong-Faloutsos, 06] 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 43 I 1 J 1 1 A 1 H 1 B 1 1 D E 1 1 1 F G a.k.a.: Relevance, Closeness, Similarity

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 44 Given: Conferences-to-authors graph Goal: Proximity on graphs Q: What is most related conference to ICDM? IJCAI KDD ICDM SDM AAAI Philip S. Yu Ning Zhong R. Ramakrishnan M. Jordan NIPS Conference Author

[Tong et al. 08] 45 { Sea Sun Sky Wave} { Cat Forest Grass Tiger}? {?,?,?,}

[Tong et al. 08] 46 Region Image Test Image Sea Sun Sky Wave Cat Forest Tiger Grass Keyword

[Tong et al. 08] 47 Region Image Test Image {Grass, Forest, Cat, Tiger} Sea Sun Sky Wave Cat Forest Tiger Grass Keyword

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 48 Shortest path is not good: No influence for degree-1 nodes (E, F, G)! Multi-faceted relationships

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 49 Network Flow is not good: Does not punish long paths

[Tong-Faloutsos, 06] I 1 J 1 1 A 1 H 1 B 1 1 D E 1 1 1 F G Multiple Connections Quality of connection Direct & In-direct connections Length, Degree, Weight 2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 50

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 51 9 10 2 12 1 3 8 11 4 5 6 7

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 52 Goal: Evaluate pages not just by popularity but by how close they are to the topic Teleporting can go to: Any page with equal probability (we used this so far) A topic-specific set of relevant pages Topic-specific (personalized) PageRank M ij = β M ij + (1 β)/ S if i S = β M ij otherwise (S...teleport set) Random Walk with Restart: S is a single element

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 53 Graphs and web search: Ranks nodes by importance Personalized PageRank: Ranks proximity of nodes to the teleport nodes S Proximity on graphs: Q: What is most related conference to ICDM? Random Walks with Restarts Teleport back to the starting node: S = { single node } IJCAI KDD ICDM SDM AAAI NIPS Conference Philip S. Yu Ning Zhong R. Ramakrishnan M. Jordan Author

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 54 Node 4 0.13 1 0.10 2 3 4 0.13 0.13 5 7 0.04 9 0.08 8 6 0.05 0.05 0.03 10 11 0.04 12 0.02 Node 1 Node 2 Node 3 Node 4 Node 5 Node 6 Node 7 Node 8 Node 9 Node 10 Node 11 Node 12 0.13 0.10 0.13 0.22 0.13 0.05 0.05 0.08 0.04 0.03 0.04 0.02 Nearby nodes, higher scores More red, more relevant Ranking vector r 4

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 55 KDD CIKM PKDD SDM PAKDD 0.009 0.011 0.004 0.004 0.008 ICDM 0.007 0.005 0.004 0.005 0.005 ICML ICDE ECML SIGMOD DMKD

2/10/2017 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 56 I K Graph of CS conferences Q: Which conferences are closest to KDD & ICDM? A: Personalized PageRank with teleport set S={KDD, ICDM}