Node Centrality and Ranking on Networks

Similar documents
Node and Link Analysis

Link Analysis. Leonid E. Zhukov

Centrality Measures. Leonid E. Zhukov

Centrality Measures. Leonid E. Zhukov

Web Structure Mining Nodes, Links and Influence

Applications to network analysis: Eigenvector centrality indices Lecture notes

Data Mining and Matrices

DATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS

Link Analysis Ranking

Inf 2B: Ranking Queries on the WWW

Online Social Networks and Media. Link Analysis and Web Search

Graph Models The PageRank Algorithm

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68

Complex Social System, Elections. Introduction to Network Analysis 1

1 Searching the World Wide Web

6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search

Web Ranking. Classification (manual, automatic) Link Analysis (today s lesson)

Lecture 7 Mathematics behind Internet Search

Online Social Networks and Media. Link Analysis and Web Search

eigenvalues, markov matrices, and the power method

Degree Distribution: The case of Citation Networks

Uncertainty and Randomization

1998: enter Link Analysis

Link Analysis Information Retrieval and Data Mining. Prof. Matteo Matteucci

ECEN 689 Special Topics in Data Science for Communications Networks

Data Mining Recitation Notes Week 3

Hub, Authority and Relevance Scores in Multi-Relational Data for Query Search

MultiRank and HAR for Ranking Multi-relational Data, Transition Probability Tensors, and Multi-Stochastic Tensors

Lab 8: Measuring Graph Centrality - PageRank. Monday, November 5 CompSci 531, Fall 2018

How works. or How linear algebra powers the search engine. M. Ram Murty, FRSC Queen s Research Chair Queen s University

CS54701 Information Retrieval. Link Analysis. Luo Si. Department of Computer Science Purdue University. Borrowed Slides from Prof.

6.207/14.15: Networks Lectures 4, 5 & 6: Linear Dynamics, Markov Chains, Centralities

Google PageRank. Francesco Ricci Faculty of Computer Science Free University of Bozen-Bolzano

PageRank. Ryan Tibshirani /36-662: Data Mining. January Optional reading: ESL 14.10

Link Mining PageRank. From Stanford C246

CS47300: Web Information Search and Management

Introduction to Search Engine Technology Introduction to Link Structure Analysis. Ronny Lempel Yahoo Labs, Haifa

Calculating Web Page Authority Using the PageRank Algorithm

Finding central nodes in large networks

Wiki Definition. Reputation Systems I. Outline. Introduction to Reputations. Yury Lifshits. HITS, PageRank, SALSA, ebay, EigenTrust, VKontakte

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Complex Networks CSYS/MATH 303, Spring, Prof. Peter Dodds

Web Ranking. Classification (manual, automatic) Link Analysis (today s lesson)

arxiv:cond-mat/ v1 3 Sep 2004

Information Retrieval and Search. Web Linkage Mining. Miłosz Kadziński

Ten good reasons to use the Eigenfactor TM metrics

Link Analysis. Stony Brook University CSE545, Fall 2016

A Note on Google s PageRank

MAE 298, Lecture 8 Feb 4, Web search and decentralized search on small-worlds

0.1 Naive formulation of PageRank

Applications. Nonnegative Matrices: Ranking

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University.

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Slides based on those in:

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine

Diffusion and random walks on graphs

The Second Eigenvalue of the Google Matrix

Bonacich Measures as Equilibria in Network Models

SUPPLEMENTARY MATERIALS TO THE PAPER: ON THE LIMITING BEHAVIOR OF PARAMETER-DEPENDENT NETWORK CENTRALITY MEASURES

Diffusion of Innovation

Models and Algorithms for Complex Networks. Link Analysis Ranking

Hyperlinked-Induced Topic Search (HITS) identifies. authorities as good content sources (~high indegree) HITS [Kleinberg 99] considers a web page

AXIOMS FOR CENTRALITY

How does Google rank webpages?

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

Unit 5: Centrality. ICPSR University of Michigan, Ann Arbor Summer 2015 Instructor: Ann McCranie

Intelligent Data Analysis. PageRank. School of Computer Science University of Birmingham

Epidemics and information spreading

Algebraic Representation of Networks

Lecture 12: Link Analysis for Web Retrieval

Axiomatization of the PageRank Centrality

Randomization and Gossiping in Techno-Social Networks

Link Analysis. Reference: Introduction to Information Retrieval by C. Manning, P. Raghavan, H. Schutze

Google Page Rank Project Linear Algebra Summer 2012

Random Surfing on Multipartite Graphs

Multiple Relational Ranking in Tensor: Theory, Algorithms and Applications

Markov Chains and Spectral Clustering

Spectral Graph Theory Tools. Analysis of Complex Networks

Computing PageRank using Power Extrapolation

STA141C: Big Data & High Performance Statistical Computing

Centrality Measures and Link Analysis

IR: Information Retrieval

Introduction to Data Mining

Applications of The Perron-Frobenius Theorem

arxiv: v1 [physics.soc-ph] 26 Apr 2017

How Does Google?! A journey into the wondrous mathematics behind your favorite websites. David F. Gleich! Computer Science! Purdue University!

Applications of Matrix Functions to Network Analysis and Quantum Chemistry Part I: Complex Networks

As it is not necessarily possible to satisfy this equation, we just ask for a solution to the more general equation

Some relationships between Kleinberg s hubs and authorities, correspondence analysis, and the Salsa algorithm

Faloutsos, Tong ICDE, 2009

Math 304 Handout: Linear algebra, graphs, and networks.

Kristina Lerman USC Information Sciences Institute

Jeffrey D. Ullman Stanford University

Analysis and Computation of Google s PageRank

Application. Stochastic Matrices and PageRank

DS504/CS586: Big Data Analytics Graph Mining II

PageRank algorithm Hubs and Authorities. Data mining. Web Data Mining PageRank, Hubs and Authorities. University of Szeged.

Analysis of Google s PageRank

Transcription:

Node Centrality and Ranking on Networks Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Social Network Analysis MAGoLEGO course Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 1 / 27

Lecture outline 1 Centrality and prestige 2 Centrality measures Degree centrality Closeness centrality Betweenness centrality Eigenvector centrality 3 Ranking on directed graphs Pagerank HITS Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 2 / 27

Centrality Measures Determine the most important or prominent actors in the network based on actor location. Marriage alliances among leading Florentine families 15th century. Padgett, 1993 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 3 / 27

Centrality measures Undirected graphs: - Actor centrality - involvement (connections) with other actors Directed graphs: - Actor centrality - source of the ties (outgoing edges) - Actor prestige - recipient of many ties (incoming edges) Linton Freeman, 1979 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 4 / 27

Three graphs Star graph Circle graph Line Graph Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 5 / 27

Degree centrality Degree centrality: number of nearest neighbors C D (i) = k(i) = j A ij = j A ji Normalized degree centrality C D (i) = 1 n 1 C D(i) = k(i) n 1 High centrality degree -direct contact with many other actors Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 6 / 27

Closeness centrality Closeness centrality: how close an actor to all the other actors in network Normalized closeness centrality C C (i) = j 1 d(i, j) CC (i) = (n 1)C C (i) = n 1 d(i, j) High closeness centrality - short communication path to others, minimal number of steps to reach others j Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 7 / 27

Betweenness centrality Betweenness centrality: number of shortest paths going through the actor σ st (i) C B (i) = σ st (i) σ st Normalized betweenness centrality s t i C B (i) = 2 (n 1)(n 2) C B(i) = 2 (n 1)(n 2) s t i Hight betweenness centrality - vertex lies on many shortest paths Probability that a communication from s to t will go through i σ st (i) σ st Linton Freeman, 1977 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 8 / 27

Eigenvector centrality Importance of a node depends on the importance of its neighbors (recursive definition) v i j A ij v j v i = 1 A ij v j λ j Av = λv Select an eigenvector associated with largest eigenvalue λ = λ 1, v = v 1 Phillip Bonacich, 1972. Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 9 / 27

Centrality examples Closeness centrality igraph:closeness() from www.activenetworks.net Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 10 / 27

Centrality examples Betweenness centrality igraph:betweenness() from www.activenetworks.net Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 11 / 27

Centrality examples Eigenvector centrality igraph:evcent() from www.activenetworks.net Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 12 / 27

Centrality examples A) Degree centrality B) Closeness centrality C) Betweenness centrality D) Eigenvector centrality from Claudio Rocchini Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 13 / 27

Centralization Centralization (network measure) - how central the most central node in the network in relation to all other nodes. C x = N i [C x (p ) C x (p i )] max N i [C x (p ) C x (p i )] C x - one of the centrality measures p - node with the largest centrality value max - is taken over all graphs with the same number of nodes (for degree, closeness and betweenness the most centralized structure is the star graph) igraph: centralization.degree(), centralization.closeness(), centralization.betweenness(), centralization.evcent() Linton Freeman, 1979 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 14 / 27

Directional relations Directed graph: distinguish between choices made (outgoing edges) and choices received (incoming edges) sending - receiving export - import cite papers - being cited Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 15 / 27

Centrality measures All based on outgoing edges Degree centrality (normalized): Closeness centrality (normalized): C D (i) = kout (i) n 1 CC (i) = n 1 d(i, j) **Betweenness centrality (normalized): C B (i) = 1 (n 1)(n 2) j s t i σ st (i) σ st Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 16 / 27

Status or Rank Prestige Leo Katz, 1953. Node prestige depends on prestige of directly connected actors iterate p i j N(i) p j = j A ji p j p t+1 = A T p t, p t=0 = p 0 Difficulties: - Absorbing nodes - Source nodes - Cycles ** Solution to p = A T p might not exist. Nontrivial solution only if det(i A T ) = 0. Need to constraint matrix Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 17 / 27

Web as a graph Hyperlinks - implicit endorsements Web graph - graph of endorsements (sometimes reciprocal) Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 18 / 27

PageRank PageRank can be thought of as a model of user behavior. We assume there is a random surfer who is given a web page at random and keeps clicking on links, never hitting back but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank. Sergey Brin and Larry Page, 1998 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 19 / 27

Random walk Random walk on graph p t+1 i = j N(i) p t j d out j = j A ji dj out p j P = D 1 A, D ii = diag{d out i } p t+1 = P T p t with teleportation p t+1 = αp T p t + (1 α) e n Perron-Frobenius Theorem guarantees existence and uniqueness of the solution to p = αp T p + (1 α) e n Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 20 / 27

PageRank igraph: page.rank() Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 21 / 27

PageRank beyond the Web Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 22 / 27

Hubs and Authorities (HITS) Citation networks. Reviews vs original research (authoritative) papers authorities, contain useful information, a i hubs, contains links to authorities, h i Mutual recursion Good authorities referred by good hubs a i j A ji h j Good hubs point to good authorities h i j A ij a j Jon Kleinberg, 1999 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 23 / 27

HITS System of linear equations a = αa T h h = βaa Symmetric eigenvalue problem (A T A)a = λa (AA T )h = λh where eigenvalue λ = (αβ) 1 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 24 / 27

Hubs and Authorities Hubs Authorities igraph: hub.score(), authority.score() Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 25 / 27

Florentines families Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 26 / 27

References Centrality in Social Networks. Conceptual Clarification, Linton C. Freeman, Social Networks, 1, 215-239, 1979 Power and Centrality: A Family of Measures, Phillip Bonacich, The American Journal of Sociology, Vol. 92, No. 5, 1170-1182, 1987 A new status index derived from sociometric analysis, L. Katz, Psychometrika, 19, 39-43, 1953. Eigenvector-like measures of centrality for asymmetric relations, Phillip Bonacich, Paulette Lloyd, Social Networks 23, 191?201, 2001 The PageRank Citation Ranknig: Bringing Order to the Web. S. Brin, L. Page, R. Motwany, T. Winograd, Stanford Digital Library Technologies Project, 1998 Authoritative Sources in a Hyperlinked Environment. Jon M. Kleinberg, Proc. 9th ACM-SIAM Symposium on Discrete Algorithms A Survey of Eigenvector Methods of Web Information Retrieval. Amy N. Langville and Carl D. Meyer, 2004 PageRank beyond the Web. David F. Gleich, arxiv:1407.5107, 2014 Leonid E. Zhukov (HSE) Lecture 4 24.04.2015 27 / 27