AXIOMS FOR CENTRALITY

AXIOMS FOR CENTRALITY
From a paper by Paolo Boldi and Sebastiano Vigna, Università degli Studi di Milano.
Internet Math., 10(3-4):222-262, 2014. Proc. 13th ICDMW. IEEE, 2013. Proc. 4th WebScience. ACM, 2012. Proc. 20th WWW. ACM, 2011. More to come.

1. (Collecting) Consider the most important centrality indices proposed in the scientific literature(s).
2. (Assessing) Test them against axioms that all centrality indices should satisfy.

ARS TechnoMedia / PRIN
What do these people have in common?
Ron Jeremy, Adolf Hitler, Lloyd Kaufman, George W. Bush, Ronald Reagan, Bill Clinton, Martin Sheen, Debbie Rochon

PageRank (believe it or not)
These are the top-8 actors of the Hollywood graph according to PageRank. Who is going to tell that is better?

And these?
William Shatner, Bess Flowers, Martin Sheen, Ronald Reagan, George Clooney, Samuel Jackson, Robin Williams, Tom Hanks

These were computed by using Harmonic Centrality instead.

Degree: definitely better, but who's Bess Flowers?

Centrality in social sciences
First works by Bavelas at MIT (1946).
This sparked countless works (Bavelas 1951; Katz 1953; Shaw 1954; Beauchamp 1965; Mackenzie 1966; Burgess 1969; Anthonisse 1971; Czapiel 1974...) that Freeman (1979) tried to summarize, concluding that "several measures are often only vaguely related to the intuitive ideas they purport to index, and many are so complex that it is difficult or impossible to discover what, if anything, they are measuring".

Only a few surveys exist. Noteworthy (in the IR context): Craswell, Upstill, Hawking (ADCS 2003); Najork, Zaragoza, Taylor (SIGIR 2007); Najork, Gollapudi, Panigrahy (WSDM 2009).

Collecting: a brief survey of centrality measures

A tale of three tribes
- Spectral indices, based on some linear-algebra construction
- Path-based indices, based on the number of paths or shortest paths (geodesics) passing through a vertex
- Geometric indices, based on distances from a vertex to other vertices

Degree
(In-)Degree centrality: the number of incoming links, $c_{\mathrm{deg}}(x) = d^-(x)$. Equivalently, the number of nodes at distance one.
Careful: when dealing with directed networks, some indices come in two variants (e.g., in-degree vs. out-degree); the ones based on incoming paths are usually more interesting.
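The two variants can be sketched as follows. This is a minimal illustration, assuming the graph is stored as a Python dict that maps every node to the list of its successors; that representation and the name `degrees` are assumptions of this sketch, not something the slides specify.

```python
def degrees(succ):
    """Return (in-degree, out-degree) dicts for a directed graph.

    succ maps every node to the list of its successors (arcs x -> y).
    Every node must appear as a key, even if it has no outgoing arcs.
    """
    indeg = {x: 0 for x in succ}
    outdeg = {x: len(targets) for x, targets in succ.items()}
    for targets in succ.values():
        for y in targets:
            indeg[y] += 1  # one more incoming arc for y
    return indeg, outdeg


# Toy graph (hypothetical): a -> c, b -> c, c -> a
graph = {"a": ["c"], "b": ["c"], "c": ["a"]}
indeg, outdeg = degrees(graph)
```

Here `indeg` plays the role of $c_{\mathrm{deg}}$; `outdeg` is the other variant mentioned above.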

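The path tribe
Betweenness centrality (Anthonisse 1971):
$$c_{\mathrm{betw}}(x) = \sum_{y,z \neq x} \frac{\sigma_{yz}(x)}{\sigma_{yz}}$$
where $\sigma_{yz}(x)/\sigma_{yz}$ is the fraction of shortest paths from $y$ to $z$ passing through $x$.
Katz centrality (Katz 1953):
$$c_{\mathrm{Katz}}(x) = \sum_{t=0}^{\infty} \beta^t \, \Pi_x(t), \qquad c_{\mathrm{Katz}} = \mathbf{1} \sum_{t=0}^{\infty} \beta^t G^t$$
where $\Pi_x(t)$ is the number of paths of length $t$ ending in $x$.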
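Both path-based indices can be sketched in a few lines. This is a small illustration and not an efficient implementation: it assumes a simple directed graph stored as a dict from each node to the list of its successors, it truncates the Katz series after a fixed number of terms, and it writes the attenuation factor as `beta`. The function names and the toy graph are hypothetical.

```python
from collections import deque


def bfs_counts(succ, source):
    """Distances from source and number of shortest paths to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in succ[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]  # extend every shortest path reaching u
    return dist, sigma


def betweenness(succ):
    """Sum over pairs y, z != x of the fraction of shortest y-z paths through x."""
    info = {s: bfs_counts(succ, s) for s in succ}
    score = {x: 0.0 for x in succ}
    for y in succ:
        dist_y, sigma_y = info[y]
        for z in dist_y:
            if z == y:
                continue
            for x in succ:
                if x == y or x == z or x not in dist_y:
                    continue
                dist_x, sigma_x = info[x]
                # x lies on a shortest y-z path iff the distances add up.
                if z in dist_x and dist_y[x] + dist_x[z] == dist_y[z]:
                    score[x] += sigma_y[x] * sigma_x[z] / sigma_y[z]
    return score


def katz(succ, beta, terms=100):
    """Truncated Katz series: sum over t of beta^t * (# paths of length t ending in x)."""
    walk = {x: 1.0 for x in succ}  # t = 0: one empty path ends in each node
    score = dict(walk)
    for _ in range(terms):
        nxt = {x: 0.0 for x in succ}
        for x, targets in succ.items():
            for y in targets:
                nxt[y] += beta * walk[x]
        walk = nxt
        for x in succ:
            score[x] += walk[x]
    return score


# Toy graph (hypothetical): the directed path a -> b -> c
graph = {"a": ["b"], "b": ["c"], "c": []}
betw = betweenness(graph)
katz_scores = katz(graph, beta=0.5)
```

On this path, only `b` lies strictly between two other nodes, and the Katz score grows along the path because each node collects the attenuated paths ending in it. The series only converges when `beta` is small enough (below the reciprocal of the dominant eigenvalue of $G$).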

The distance tribe
Closeness centrality (Bavelas 1946):
$$c_{\mathrm{clos}}(x) = \frac{1}{\sum_{y} d(y, x)}$$
where $d(y, x)$ is the distance from $y$ to $x$, and the summation is over all $y$ such that $d(y,x) < \infty$.
Harmonic centrality:
$$c_{\mathrm{harm}}(x) = \sum_{y \neq x} \frac{1}{d(y, x)}$$
The denormalized reciprocal of the harmonic mean of all distances (even $\infty$), inspired by (Marchiori, Latora 2000).
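A minimal sketch of the two geometric indices, assuming the graph is a dict from each node to the list of its successors (an assumption of this sketch). Distances towards $x$ are computed with a breadth-first search on the reversed arcs. Returning 0 for closeness when no other node reaches $x$ is a convention adopted here to avoid dividing by zero. Unreachable nodes contribute $1/\infty = 0$ to the harmonic sum, so they are simply skipped.

```python
from collections import deque


def distances_to(succ, x):
    """d(y, x) for every node y that can reach x (BFS on reversed arcs)."""
    pred = {v: [] for v in succ}
    for u, targets in succ.items():
        for v in targets:
            pred[v].append(u)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        for v in pred[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness(succ, x):
    """Reciprocal of the sum of finite distances towards x (0 if nothing reaches x)."""
    total = sum(distances_to(succ, x).values())
    return 1.0 / total if total else 0.0


def harmonic(succ, x):
    """Sum of reciprocal distances towards x; unreachable nodes contribute 0."""
    return sum(1.0 / d for y, d in distances_to(succ, x).items() if y != x)


# Toy graph (hypothetical): a -> b -> c, plus an isolated node d
graph = {"a": ["b"], "b": ["c"], "c": [], "d": []}
clos_c = closeness(graph, "c")   # 1 / (1 + 2)
harm_c = harmonic(graph, "c")    # 1/1 + 1/2
harm_d = harmonic(graph, "d")    # nothing reaches d
```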

The spectral tribe
All based on the eigenstructure of some graph-related matrix.
Most obvious: the left or right dominant eigenvector of some matrix derived from the graph adjacency matrix $G$. Most common: $G_r$, in which rows are normalized to row sum one.
All share the same issues of uniqueness and computability, mainly solved using Perron-Frobenius theory and the power method or more sophisticated approaches.

Seeley index (Seeley 1949)
Basic idea: in a group of children, a child is as popular as the sum of the popularities of the children who like him, but popularities are divided evenly among friends:
$$c_{\mathrm{Seeley}}(x) = \sum_{y \to x} \frac{c_{\mathrm{Seeley}}(y)}{d^+(y)}$$
In general it is a left dominant eigenvector of $G_r$.
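Seeley's equation can be iterated directly as a power method. The sketch below assumes a dict from each node to the list of its successors, and it assumes the graph is strongly connected and aperiodic with no node of out-degree zero, so that the iteration converges to a unique solution. These assumptions and the fixed iteration count are choices of this sketch.

```python
def seeley(succ, iterations=200):
    """Power iteration for the left dominant eigenvector of the row-normalized matrix.

    Assumes every node has at least one successor and that the graph is
    strongly connected and aperiodic (otherwise the iteration may not converge).
    """
    n = len(succ)
    score = {x: 1.0 / n for x in succ}
    for _ in range(iterations):
        nxt = {x: 0.0 for x in succ}
        for y, targets in succ.items():
            share = score[y] / len(targets)  # y splits its popularity evenly
            for x in targets:
                nxt[x] += share
        score = nxt
    return score


# Toy graph (hypothetical): a -> b, a -> c, b -> c, c -> a
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores = seeley(graph)
```

On this toy graph the equation can be solved by hand ($c(a) = c(c)$, $c(b) = c(a)/2$), giving 0.4, 0.2 and 0.4 once the scores are normalized to sum to one, and the iteration approaches those values.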

PageRank (Brin, Page, Motwani, Winograd 1999)
The idea is to start from Seeley's equation and add an adjustment to make it have a unique solution (and more):
$$c_{\mathrm{pr}}(x) = \alpha \sum_{y \to x} \frac{c_{\mathrm{pr}}(y)}{d^+(y)} + \frac{1 - \alpha}{n}$$
It is the dominant eigenvector of $\alpha G_r + (1 - \alpha)\mathbf{1}^T\mathbf{1}/n$.
Recall: Katz (1953) is the dominant eigenvector of $\beta G + (1 - \beta)e^T\mathbf{1}/n$.
Recall: Seeley (1949) is the dominant eigenvector of $G_r$.
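The PageRank equation can be iterated as written. This is a minimal sketch, assuming a dict from each node to the list of its successors and assuming no node has out-degree zero (dangling nodes need extra handling that is not covered here). The damping factor is written `alpha`; its default of 0.85 and the iteration count are illustrative choices.

```python
def pagerank(succ, alpha=0.85, iterations=100):
    """Iterate c(x) = alpha * sum_{y -> x} c(y) / d+(y) + (1 - alpha) / n.

    Assumes every node has at least one successor.
    """
    n = len(succ)
    rank = {x: 1.0 / n for x in succ}
    for _ in range(iterations):
        nxt = {x: (1.0 - alpha) / n for x in succ}  # uniform adjustment term
        for y, targets in succ.items():
            share = alpha * rank[y] / len(targets)
            for x in targets:
                nxt[x] += share
        rank = nxt
    return rank


# Toy graph (hypothetical): a -> b, a -> c, b -> c, c -> a
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
rank = pagerank(graph, alpha=0.85, iterations=200)
```

Unlike the plain Seeley iteration, this one converges for any `alpha` strictly below one regardless of connectivity, which is the "unique solution" adjustment mentioned above.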

HITS (Kleinberg 1997)
The idea is to start from the system:
$$c_{\mathrm{Hauth}}(x) = \sum_{y \to x} c_{\mathrm{Hhub}}(y), \qquad c_{\mathrm{Hhub}}(x) = \sum_{x \to y} c_{\mathrm{Hauth}}(y)$$
HITS centrality is defined to be the authoritativeness score.
It is a dominant eigenvector of $G^T G$, so it coincides with the dominant eigenvector on symmetric graphs.
Actually defined by Bonacich in 1991.
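The mutual-reinforcement system can be sketched as an alternating iteration with normalization. The graph is assumed to be a dict from each node to the list of its successors. The Euclidean normalization and the fixed iteration count are choices of this sketch, since the system as stated only determines scores up to scaling.

```python
import math


def hits(succ, iterations=100):
    """Alternate authority and hub updates, rescaling to unit Euclidean norm."""
    auth = {x: 1.0 for x in succ}
    hub = {x: 1.0 for x in succ}
    for _ in range(iterations):
        # Authority of x: sum of hub scores of the nodes pointing to x.
        new_auth = {x: 0.0 for x in succ}
        for y, targets in succ.items():
            for x in targets:
                new_auth[x] += hub[y]
        # Hub score of x: sum of authority scores of the nodes x points to.
        new_hub = {x: sum(new_auth[y] for y in succ[x]) for x in succ}
        a_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        auth = {x: v / a_norm for x, v in new_auth.items()}
        hub = {x: v / h_norm for x, v in new_hub.items()}
    return auth, hub


# Toy graph (hypothetical): a -> c, b -> c
graph = {"a": ["c"], "b": ["c"], "c": []}
auth, hub = hits(graph)
```

In this toy graph `c` is the only authority, while `a` and `b` share the hub score equally.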

Assessing: axioms for centrality

Axioms for Centrality
Let's try to isolate interesting properties.
Various precedents: Sabidussi [1966], Nieminen [1973], Chien et al. for PageRank [2004], Brandes et al. [2012].
One of the axioms is indeed common (score monotonicity).

Axioms: sensitivity to density
The blue and the red node have the same importance (the two rings have the same size!).

Densifying the left-hand side, we expect the red node to become more important than the blue node.

Axioms: sensitivity to size
Two disjoint (i.e., "very far") components of a single network, of size $k$ and $p$ respectively. When $k$ or $p$ goes to $\infty$, the nodes of the corresponding subnetwork must become more important.

Axioms: score monotonicity
If I add an arc towards $y$, the score of $y$ must (strictly) increase (same as Sabidussi).
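A single example cannot prove that an index satisfies the axiom, but it can show a violation. The sketch below adds one arc towards a node and compares its harmonic and closeness scores before and after. The graph (a dict from each node to the list of its successors), the helper names, and the convention that closeness is 0 when nothing reaches the node are all assumptions of this sketch.

```python
from collections import deque


def distances_to(succ, x):
    """d(y, x) for every node y that can reach x (BFS on reversed arcs)."""
    pred = {v: [] for v in succ}
    for u, targets in succ.items():
        for v in targets:
            pred[v].append(u)
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        for v in pred[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def harmonic(succ, x):
    return sum(1.0 / d for y, d in distances_to(succ, x).items() if y != x)


def closeness(succ, x):
    total = sum(distances_to(succ, x).values())
    return 1.0 / total if total else 0.0


def with_arc(succ, u, v):
    """Copy of the graph with the extra arc u -> v."""
    new = {x: list(targets) for x, targets in succ.items()}
    new[u].append(v)
    return new


# Toy graph (hypothetical): a -> b, and a separate chain p -> q -> c
before = {"a": ["b"], "b": [], "p": ["q"], "q": ["c"], "c": []}
after = with_arc(before, "c", "b")  # new arc towards b

harm_before, harm_after = harmonic(before, "b"), harmonic(after, "b")
clos_before, clos_after = closeness(before, "b"), closeness(after, "b")
```

Here the new arc makes three more nodes reach `b`, which raises its harmonic score but lowers its closeness, because the newly reachable nodes are far away and inflate the sum of distances.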

An axiomatic slaughter

              Density   Size     Monotone
Degree        yes       only k   yes
Betweenness   no (!)    only p   no
Dominant      yes       only k   no
Seeley        yes       no       no
Katz          yes       only k   yes
PageRank      yes       no       yes
HITS          yes       only k   no
Closeness     no (!)    no       no
Harmonic      yes       yes      yes

Conclusions
Next time you need a centrality index... try harmonic! (You can compute it quickly using HyperBall.)

Thanks!