Facebook Friends! and Matrix Functions


Facebook Friends and Matrix Functions. Kyle Kloster, Purdue University. Graduate Research Day. Joint work with David F. Gleich (Purdue), supported by NSF CAREER 1149756-CCF.

Network Analysis: using linear algebra to study graphs. A graph G consists of V, the vertices (nodes), and E, the edges (links). The degree of a node is the number of edges incident to it; nodes sharing an edge are neighbors.

Applications: Erdős numbers, Facebook friends, Twitter followers, search engines, Amazon/Netflix recommendations, protein interactions, power grids, Google Maps, air traffic control, sports rankings, cell tower placement, scheduling, parallel programming. Everything (even Kevin Bacon).

Graph properties: Diameter (is everything just a few hops away from everything else?), Clustering (are there tightly-knit groups of nodes?), Connectivity (how well can each node reach every other node?). Linear algebra: eigenvalues and matrix functions shed light on all these questions. These tools require a matrix related to the graph.

Graph matrices. Adjacency matrix A: A_ij = 1 if nodes i and j share an edge (are adjacent), and 0 otherwise. Random-walk transition matrix P: P_ij = A_ij / d_j, where d_j is the degree of node j. P is stochastic, i.e. its column sums equal 1.
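As a sketch of these definitions, the adjacency and transition matrices can be built directly for a small toy graph (the graph below is made-up example data, not one from the talk):

```python
import numpy as np

# Toy undirected graph: a path 0-1-2 plus the edge 1-3 (hypothetical example).
edges = [(0, 1), (1, 2), (1, 3)]
n = 4

# Adjacency matrix A: A[i, j] = 1 if nodes i and j share an edge.
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Degrees d_j, then the random-walk matrix P with P[i, j] = A[i, j] / d_j.
d = A.sum(axis=0)
P = A / d  # broadcasting divides column j by d_j

# P is column-stochastic: every column sums to 1.
assert np.allclose(P.sum(axis=0), 1.0)
```

The column-by-column normalization is what makes each column of P a probability distribution over a random walker's next step.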

Network analysis via the heat kernel. Uses include local clustering, link prediction, and node centrality. The heat kernel is a graph diffusion and a function of a matrix: exp(G) = sum_{k=0}^inf (1/k!) G^k, where G is a matrix associated with the network: the random-walk matrix P, the adjacency matrix A, or the Laplacian L.

The heat kernel describes node connectivity: (A^k)_ij = the number of walks of length k from node i to node j, so exp(A)_ij = sum_{k=0}^inf (1/k!) (A^k)_ij sums up the walks between i and j. For a small set of seed nodes s, the column exp(A) s describes the nodes most relevant to s.

Diffusion scores. The diffusion scores of a graph are a weighted sum of probability vectors, c_0 p_0 + c_1 p_1 + c_2 p_2 + c_3 p_3 + ..., giving the diffusion score vector f = sum_{k=0}^inf c_k P^k s, where P is the random-walk transition matrix, s is the normalized seed vector, and c_k is the weight on stage k.

Heat Kernel vs. PageRank diffusions. The heat kernel uses weight t^k/k! at stage k: (t^0/0!) p_0 + (t^1/1!) p_1 + (t^2/2!) p_2 + (t^3/3!) p_3 + ... Our work gives new analysis and algorithms for this diffusion. PageRank uses weight α^k at stage k: α^0 p_0 + α^1 p_1 + α^2 p_2 + α^3 p_3 + ... It is the standard, widely-used diffusion we use for comparison, and the linchpin of Google's original success.
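The two weightings above can be sketched with a truncated sum (the 3-cycle graph and the parameter values are assumed example data):

```python
import numpy as np
from math import factorial

def diffusion(P, s, weights):
    """Truncated diffusion f = sum_k c_k P^k s."""
    f = np.zeros_like(s)
    walk = s.copy()          # holds P^k s
    for c in weights:
        f += c * walk
        walk = P @ walk      # advance to the next stage of the walk
    return f

# Toy column-stochastic P on a directed 3-cycle; seed s on node 0.
P = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], float)
s = np.array([1.0, 0.0, 0.0])

t, alpha, N = 1.0, 0.85, 20
hk_weights = [t**k / factorial(k) for k in range(N + 1)]     # heat kernel
pr_weights = [(1 - alpha) * alpha**k for k in range(N + 1)]  # PageRank

f_hk = diffusion(P, s, hk_weights)
f_pr = diffusion(P, s, pr_weights)

# Each P^k s sums to 1, so f_hk sums to sum_k t^k/k!, which approaches e^t.
assert abs(f_hk.sum() - np.exp(t)) < 1e-10
```

Note how quickly the heat kernel weights t^k/k! decay compared to the geometric PageRank weights α^k; this decay is what the later truncation arguments exploit.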

Heat Kernel vs. PageRank theory. PageRank: good clusters, via a local Cheeger inequality showing PR finds near-optimal clusters, and a fast, existing constant-time algorithm [Andersen Chung Lang 06]. Heat kernel: a local Cheeger inequality [Chung 07]; a fast algorithm is our work.

Algorithm outline for x̂ ≈ exp(P) s: (1) approximate with a polynomial, (2) convert to a linear system, (3) solve with a sparse linear solver. (Details in paper.)

Sparse solver (Gauss-Southwell): to solve Ax ≈ b, compute the residual r^(k) := b − A x^(k), then relax the largest entry of r^(k), updating x^(k+1) := x^(k) + r_j^(k) e_j where j indexes that largest entry.

Key: we avoid doing full matrix-vector products in exp(P) s ≈ sum_{k=0}^N (1/k!) P^k s. (All my work was showing this actually can be done with bounded error.)
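A minimal dense sketch of the Gauss-Southwell relaxation step described above, applied to an illustrative PageRank-style system I − αP rather than the exact linear system from the paper (the matrix, α, and seed below are assumptions for demonstration):

```python
import numpy as np

def gauss_southwell(A, b, tol=1e-8, max_steps=100000):
    """Coordinate relaxation: repeatedly relax the largest residual entry.
    Each step touches only one column of A, which is what keeps the
    work sparse on a real graph matrix."""
    n = len(b)
    x = np.zeros(n)
    r = b.copy()                   # residual r = b - A x
    for _ in range(max_steps):
        j = int(np.argmax(np.abs(r)))
        if abs(r[j]) < tol:
            break
        delta = r[j] / A[j, j]     # relax coordinate j
        x[j] += delta
        r -= delta * A[:, j]       # rank-one update keeps r = b - A x
    return x

# Illustrative system: (I - alpha * P) x = b on a directed 3-cycle.
alpha = 0.85
P = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]], float)
b = np.array([1.0, 0.0, 0.0])
M = np.eye(3) - alpha * P
x = gauss_southwell(M, b)
assert np.allclose(M @ x, b, atol=1e-6)
```

On a sparse graph matrix the residual update would touch only the neighbors of node j, which is why the method can run without any full matrix-vector product.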

Algorithms & theory for x̂ ≈ exp(P) s. Algorithm 1, weak convergence: constant time on any graph, Õ(e^1/ε); outperforms PageRank in clustering; accuracy ||D^{-1}x − D^{-1}x̂||_∞ < ε.

Conceptually: the diffusion vector quantifies a node's connection to each other node. Divide each node's score by its degree and delete the nodes with score < ε; only a constant number of nodes remain in G. In other words, users spend reciprocated time with only O(1) others.

Algorithms & theory for x̂ ≈ exp(P) s. Algorithm 2, global convergence (conditional).

Power-law degrees. Real-world graphs have power-law degree distributions, and this causes diffusions to be localized. [Figure: degrees of the nodes in ljournal-2008, plotted as rank vs. in-degree on a log-log scale; Boldi et al., Laboratory for Web Algorithmics, 2008.]

Local solutions. The solution vector sum_{k=0}^inf (1/k!) A^k s has ~5 million nonzeros (nnz = 4,815,948), yet only ~3,000 entries are needed for 10^-4 accuracy. [Figures: magnitude of the entries in the solution vector, and 1-norm error of the approximation as the number of largest nonzeros retained grows.]
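The locality experiment behind those plots can be mimicked on synthetic data: keep only the k largest-magnitude entries of a decaying vector and measure the 1-norm error. The vector below is a made-up stand-in for the exp(A) s column from the slide, not the real data:

```python
import numpy as np

# Synthetic "diffusion-like" vector: rapidly decaying entries in random
# positions, normalized to sum to 1 (assumed stand-in for exp(A) s).
rng = np.random.default_rng(0)
x = rng.permutation(np.exp(-0.01 * np.arange(50000)))
x /= x.sum()

def topk_error(x, k):
    """1-norm error after keeping only the k largest-magnitude entries."""
    keep = np.argsort(-np.abs(x))[:k]
    xhat = np.zeros_like(x)
    xhat[keep] = x[keep]
    return np.abs(x - xhat).sum()

# A small fraction of the entries captures almost all of the mass.
assert topk_error(x, 5000) < topk_error(x, 500) < topk_error(x, 50)
```

The faster the entries decay, the fewer retained nonzeros are needed for a given 1-norm accuracy; that is exactly the localization the power-law condition buys.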

Algorithms & theory for x̂ ≈ exp(P) s. Algorithm 2, global convergence (conditional): sublinear on power-law graphs, Õ(d log d (1/ε)^C); accuracy ||x − x̂||_1 < ε.

Conceptually: a node's diffusion vector can be approximated with total error < ε using only O(d log d) entries. In real-world networks (i.e. with degrees following a power law), no node has a nontrivial connection with more than O(d log d) other nodes.

Experiments

Runtime on the web graph. A particularly sparse graph benefits us best: |V| = O(10^8), |E| = O(10^9). GSQ, GS: our methods; EXPMV: MATLAB. [Figure: time (sec) per trial for EXPMV, GSQ, and GS.]

Thank you! Local clustering via heat kernel: code available at http://www.cs.purdue.edu/homes/dgleich/codes/hkgrow. Global heat kernel: code available at http://www.cs.purdue.edu/homes/dgleich/codes/nexpokit/. Questions or suggestions? Email Kyle Kloster at kkloste-at-purdue-dot-edu.