Introduction to Link Prediction

Machine Learning and Modelling for Social Networks
Lloyd Sanders, Olivia Woolley, Iza Moize, Nino Antulov-Fantulin
D-GESS: Computational Social Science (COSS)

Overview
- What is link prediction?
- What are some examples of link prediction?
- A How To:
  - The setup
  - Local methods
  - Global methods
  - Other methods
- Challenges of link prediction
- References

Statement of the Problem
Abstractly: given a snapshot of a network, can one predict the next most likely links to form in the network?
[Figure: Zachary Karate Club network. Source: http://ifisc.uib-csic.es/~jramasco/complexnets.html]

Link Prediction in Industry (Recommender Systems)
- Proposing friendships
- Proposing matches
- Proposing items to purchase

Link Prediction in Science
[Figure: artist's rendition of the human metabolic network. Source: Wikipedia]

Link Prediction in Science
Investigating link connections in biological networks is costly and time-consuming.
[Figure source: Wikipedia]

Link Prediction
Research article: "Link Prediction in Criminal Networks: A Tool for Criminal Intelligence Analysis"
Giulia Berlusconi (1), Francesco Calderoni (1, *), Nicola Parolini (2), Marco Verani (2), Carlo Piccardi (3, *)
(1) Università Cattolica del Sacro Cuore and Transcrime, Milano, Italy
(2) MOX, Department of Mathematics, Politecnico di Milano, Milano, Italy
(3) Department of Electronics, Information and Bioengineering, Politecnico di Milano, Milano, Italy
(*) francesco.calderoni@unicatt.it (FC); carlo.piccardi@polimi.it (CP)
Possible missing links predicted between suspects in organized crime.

Link Prediction
Link prediction is used to predict possible future links in a network (e.g., Facebook). It can also be used to predict links that are missing because the data are incomplete (e.g., food webs; this is related to the sampling that Olivia spoke of earlier).

Statement of the Problem
Abstractly: given a snapshot of a network, can one predict the next most likely links to form in the network?
- Graph: $G = (V, E)$
- Edge: $e = (x, y)$
- Similarity score: $s_{xy}$

How To: The Setup
For a given graph:
1. Split the data into a training set and a test set.
2. Choose a link prediction algorithm.
3. Run the algorithm on the training set, and test it on the test set.
4. Check the accuracy.
5. Compare with other link prediction algorithms.

How To: The Setup
Split the edges into a training set and a test (probe) set.
The set $U$ of all possible edges on the graph has size $|U| = \frac{|V| (|V| - 1)}{2}$.
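The split described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the lecture: the function names, the 20% probe fraction, the fixed seed, and the toy edge list are all hypothetical.

```python
import random
from itertools import combinations


def split_edges(edges, probe_fraction=0.1, seed=0):
    """Randomly split an undirected edge list into (training, probe) sets."""
    # Store each edge as a sorted tuple so (x, y) and (y, x) count as one edge.
    unique = sorted({tuple(sorted(e)) for e in edges})
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_probe = max(1, int(round(probe_fraction * len(unique))))
    probe = set(unique[:n_probe])
    training = set(unique[n_probe:])
    return training, probe


def all_possible_edges(nodes):
    """The universal set U: every unordered pair of distinct nodes."""
    return {tuple(sorted(pair)) for pair in combinations(nodes, 2)}


# Toy graph (hypothetical data, not from the slides).
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]

training, probe = split_edges(edges, probe_fraction=0.2)
universe = all_possible_edges(nodes)
print(len(universe))  # |V| (|V| - 1) / 2 pairs
```

The prediction algorithm only ever sees `training`; the pairs in `universe` that are not in `training` are the candidates it has to rank.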

How To: Output of the Algorithm
The link prediction algorithm outputs a list of candidate edges ranked in descending order of score, so the edges most likely to appear are at the top.
Taking the first n links from the list and intersecting them with the probe set (of size n) gives a simple measure of accuracy.
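As a rough sketch of this accuracy measure, the snippet below takes the top n entries of a ranked list and counts how many are in the probe set. The function name `precision_at_n`, the default of n equal to the probe-set size, and the scores are assumptions made for illustration.

```python
def precision_at_n(ranked_edges, probe, n=None):
    """Fraction of the top-n predicted edges that appear in the probe set."""
    if n is None:
        n = len(probe)  # assumption: n defaults to the size of the probe set
    if n == 0:
        return 0.0
    top = ranked_edges[:n]
    hits = sum(1 for e in top if tuple(sorted(e)) in probe)
    return hits / n


# Hypothetical similarity scores for four candidate (non-training) edges.
scores = {(1, 4): 0.9, (2, 4): 0.7, (1, 5): 0.2, (3, 5): 0.6}
ranked = sorted(scores, key=scores.get, reverse=True)

probe = {(1, 4), (3, 5)}
accuracy = precision_at_n(ranked, probe)
print(accuracy)
```

Here only one of the two top-ranked edges is in the probe set, so the measure is one half.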

Similarity measures: Local
Local methods predict links based solely on a node's local contact structure. The main idea is that of triangle closing.
[Figure: next most likely link?]

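How to: Similarity Measures
Common neighbours: $s^{CN}_{xy} = |\Gamma(x) \cap \Gamma(y)|$
Here $\Gamma(x)$ is the set of nearest neighbours of $x$; its cardinality is the degree of the node.
This amounts to counting the number of paths of a certain length (here, length 2) between $x$ and $y$.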
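A minimal sketch of the common-neighbours score, assuming the graph is given as an undirected edge list. The helper names and the toy graph are hypothetical; the neighbour sets are stored in a dict called `gamma`.

```python
from itertools import combinations


def neighbours(edges):
    """Map each node to the set of its nearest neighbours."""
    gamma = {}
    for x, y in edges:
        gamma.setdefault(x, set()).add(y)
        gamma.setdefault(y, set()).add(x)
    return gamma


def common_neighbours(gamma, x, y):
    """Size of the intersection of the two neighbour sets."""
    return len(gamma[x] & gamma[y])


# Toy graph (hypothetical data).
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (2, 4)]
gamma = neighbours(edges)

# Score every pair that is not yet connected and rank by descending score.
candidates = [(x, y) for x, y in combinations(sorted(gamma), 2) if y not in gamma[x]]
ranked = sorted(candidates, key=lambda e: common_neighbours(gamma, *e), reverse=True)
print(ranked)
```

Nodes 1 and 4 share two neighbours (2 and 3), so closing that pair is the top prediction; nodes 1 and 5 share none and come last.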

How to: Similarity Measures
Jaccard index: $s^{Jaccard}_{xy} = \frac{|\Gamma(x) \cap \Gamma(y)|}{|\Gamma(x) \cup \Gamma(y)|}$
Resource allocation: $s^{RA}_{xy} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{k_z}$
Here $\Gamma(x)$ is the neighbour set of $x$ and $k_z$ is the degree of node $z$.
Intuition: if we share neighbours who have a low degree, we are likely to be connected. Being connected through a hub is meaningless.
Many more similarity measures are available (e.g., the Sørensen index).
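The intuition about hubs can be checked on a small example. The sketch below assumes neighbour sets are given directly as a dict; the node names and the graph are hypothetical. Both pairs have exactly one common neighbour and the same Jaccard score, but resource allocation separates them.

```python
def jaccard(gamma, x, y):
    """Shared neighbours divided by the size of the union of neighbour sets."""
    union = gamma[x] | gamma[y]
    return len(gamma[x] & gamma[y]) / len(union) if union else 0.0


def resource_allocation(gamma, x, y):
    """Sum of 1 / degree over the common neighbours of x and y."""
    return sum(1.0 / len(gamma[z]) for z in gamma[x] & gamma[y])


# Hypothetical toy graph: x and y share the low-degree node "a",
# while u and v share a hub with six neighbours.
gamma = {
    "x": {"a"}, "y": {"a"}, "a": {"x", "y"},
    "u": {"hub"}, "v": {"hub"},
    "hub": {"u", "v", "p", "q", "r", "s"},
    "p": {"hub"}, "q": {"hub"}, "r": {"hub"}, "s": {"hub"},
}

print(jaccard(gamma, "x", "y"), jaccard(gamma, "u", "v"))
print(resource_allocation(gamma, "x", "y"), resource_allocation(gamma, "u", "v"))
```

The pair connected through the low-degree node gets the higher resource-allocation score, which is the behaviour the intuition asks for.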

Similarity measures: Global
Given the adjacency matrix, one can take the global structure of the graph into account when making predictions. These measures are often based on path counts, such as the number of shortest paths.
Common neighbours is simply the adjacency matrix squared: $s^{CN}_{xy} = (A^2)_{xy}$
Try it for yourself to verify it!
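One way to try it, using plain Python lists so that no external library is needed. The 4-node adjacency matrix is a hypothetical example, and `matmul` is a small helper written for this sketch.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def common_neighbours(A, x, y):
    """Count nodes z adjacent to both x and y."""
    return sum(1 for z in range(len(A)) if A[x][z] and A[z][y])


# Hypothetical undirected graph with edges 0-1, 0-2, 1-2, 2-3.
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
A2 = matmul(A, A)

# Every entry of A squared equals the common-neighbour count of that pair.
matches = all(A2[x][y] == common_neighbours(A, x, y)
              for x in range(len(A)) for y in range(len(A)))
print(matches)
```

The diagonal of the squared matrix holds the node degrees, since a node shares all of its neighbours with itself.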

Global Methods
What about paths of greater length? How can we include those? We can weight longer paths less:
$\exp(\alpha A) = I + \alpha A + \frac{\alpha^2 A^2}{2!} + \frac{\alpha^3 A^3}{3!} + \cdots$
$s_{xy} = \exp(\alpha A)_{xy} = \sum_{i=0}^{\infty} \frac{\alpha^i}{i!} (A^i)_{xy}$
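The exponential kernel can be approximated by truncating the series. The sketch below is an illustration only: the truncation at 20 terms, the value of alpha, and the path graph are assumptions, and a real implementation would more likely use a linear algebra library.

```python
from math import factorial


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def exp_kernel(A, alpha, terms=20):
    """Truncated series: sum over i < terms of alpha^i / i! * A^i."""
    n = len(A)
    power = [[1 if r == c else 0 for c in range(n)] for r in range(n)]  # A^0 = I
    total = [[0.0] * n for _ in range(n)]
    for i in range(terms):
        coeff = alpha ** i / factorial(i)
        for r in range(n):
            for c in range(n):
                total[r][c] += coeff * power[r][c]
        power = matmul(power, A)  # advance to the next power of A
    return total


# Hypothetical path graph 0 - 1 - 2 - 3.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
S = exp_kernel(A, alpha=0.5)
print(S[0][2], S[0][3])
```

Node 0 is two steps from node 2 and three steps from node 3, so the pair (0, 2) receives the larger score: longer paths still contribute, but with a smaller weight.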

Link Prediction on Bipartite Graphs
[Figure: a bipartite graph. Source: Wikipedia]

Link Prediction on Bipartite Graphs
Consider an odd function for the graph kernel, such as $A^3$, or something more sophisticated:
$\sinh(\alpha A) = \alpha A + \frac{(\alpha A)^3}{3!} + \frac{(\alpha A)^5}{5!} + \cdots = \sum_{i=0}^{\infty} \frac{\alpha^{1+2i}}{(1+2i)!} A^{1+2i}$
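Why an odd function suits bipartite graphs can be seen numerically: odd powers of a bipartite adjacency matrix only connect nodes on opposite sides. The sketch below truncates the sinh series; the user-item graph, alpha, and the number of terms are hypothetical choices.

```python
from math import factorial


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def sinh_kernel(A, alpha, terms=10):
    """Truncated series: sum over i < terms of alpha^(1+2i) / (1+2i)! * A^(1+2i)."""
    n = len(A)
    A2 = matmul(A, A)
    power = [row[:] for row in A]  # start at A^1
    total = [[0.0] * n for _ in range(n)]
    for i in range(terms):
        k = 1 + 2 * i
        coeff = alpha ** k / factorial(k)
        for r in range(n):
            for c in range(n):
                total[r][c] += coeff * power[r][c]
        power = matmul(power, A2)  # jump from A^k to A^(k+2)
    return total


# Hypothetical bipartite graph: users 0 and 1, items 2 and 3.
# Existing edges: 0-2, 0-3, 1-2. The open user-item pair is 1-3.
A = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
S = sinh_kernel(A, alpha=0.5)
print(S[1][3], S[0][1], S[2][3])
```

The user-item pair (1, 3) gets a positive score, while the user-user pair (0, 1) and the item-item pair (2, 3) score exactly zero, so the kernel never proposes links that a bipartite graph cannot contain.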

Key Tip
Your choice of link prediction algorithm makes an implicit assumption about the graph kernel, that is, the mechanism by which the graph grows.
Different graph types (social, academic, user-item) grow under different mechanisms. Therefore, different link prediction algorithms will work better on different graphs.

Objective Function
Taking the Frobenius norm

Matrix Decomposition (Spectral)
Matrix decomposition / eigendecomposition: $A = U \Lambda U^T$
Adjacency matrices of undirected graphs are real, square, and symmetric, so we can break them down into eigenvectors and eigenvalues. The set of eigenvalues is the spectrum of the graph, hence the name.
$F(A) = U F(\Lambda) U^T$
This allows easy computation of functions of the adjacency matrix.
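The identity $F(A) = U F(\Lambda) U^T$ follows in one step when $F$ is a power series. The short derivation below is a sketch under that assumption; the coefficients $c_k$ are notation introduced here, not taken from the slides.

```latex
% Assumption: F(x) = \sum_k c_k x^k is a power series, U is orthogonal (U^T U = I).
\begin{align*}
A^{k} &= (U \Lambda U^{T})^{k} = U \Lambda^{k} U^{T}
  && \text{since } U^{T} U = I \\
F(A) &= \sum_{k} c_k A^{k}
      = U \Big( \sum_{k} c_k \Lambda^{k} \Big) U^{T}
      = U F(\Lambda) U^{T} \\
F(\Lambda) &= \operatorname{diag}\big( F(\lambda_1), \dots, F(\lambda_n) \big) \\
\exp(\alpha A) &= U \operatorname{diag}\big( e^{\alpha \lambda_1}, \dots, e^{\alpha \lambda_n} \big) U^{T}
\end{align*}
```

In other words, once the eigendecomposition is known, applying a kernel such as the exponential only requires applying a scalar function to each eigenvalue.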

Decomposition

Link Prediction to Regression
Taking the Frobenius norm is analogous to computing the RMSE between the two matrices.
The function F acting on Λ only affects the diagonal elements. Therefore, when minimizing the Frobenius norm, one can only adjust the diagonal of the source matrix to be equal (or close) to the right-hand term.
Hence the problem is reduced to a curve-fitting problem.
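The reduction to curve fitting can be written out as follows. This is a reconstruction under assumptions, since the objective on the original slide is not in the transcript: $A$ is taken to be the training (source) adjacency matrix with $A = U \Lambda U^T$, $B$ the target matrix (for example, the test adjacency matrix), $u_i$ the $i$-th column of $U$, and $f$ the scalar function that $F$ applies to each eigenvalue.

```latex
% Assumed notation: A = U \Lambda U^T (source), B (target), u_i = i-th eigenvector of A.
\begin{align*}
\min_{F} \; \lVert F(A) - B \rVert_F
  &= \min_{F} \; \lVert U F(\Lambda) U^{T} - B \rVert_F \\
  &= \min_{F} \; \lVert F(\Lambda) - U^{T} B U \rVert_F
  && \text{(the Frobenius norm is invariant under orthogonal } U \text{)}
\end{align*}
% F(\Lambda) is diagonal, so the off-diagonal entries of U^T B U do not depend on F.
% Minimizing the norm is therefore equivalent to the one-dimensional least-squares fit
\begin{equation*}
\min_{f} \; \sum_{i} \big( f(\lambda_i) - u_i^{T} B \, u_i \big)^{2}
\end{equation*}
```

Each eigenvalue $\lambda_i$ of the source matrix is paired with the value $u_i^T B u_i$, and choosing a kernel amounts to fitting a curve $f$ through these points.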

Choosing your Graph Kernel
[Figure from Kunegis et al., 2013]

Predictions

Mean Average Precision
For the various kernels, it is important to check their generalized accuracy, to see which one is an appropriate descriptor of the link generation process.
To measure accuracy, we use a measure designed for ordered lists: Mean Average Precision (MAP).
This should be your go-to when assessing the accuracy of ranked lists.

Average Precision
Average Precision (AP) is a method to evaluate the precision and ranking of a predicted list of retrieved objects.

Average Precision Example
Ranked list 1 (2 relevant links, found at ranks 1 and 3)
Precision at ranks 1 to 5: 1.0, 0.5, 0.67, 0.5, 0.4
AP = ½ (1.0 + 0.67) ≈ 0.83
Ranked list 2 (3 relevant links, two of them found at ranks 3 and 4)
Precision at ranks 1 to 5: 0, 0, 0.33, 0.5, 0.4
AP = 1/3 (0.33 + 0.5) ≈ 0.28
MAP = (0.28 + 0.83)/2 ≈ 0.56
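The calculation can be reproduced with a short function. This is a sketch, assuming each ranked list is given as a sequence of hit/miss flags together with the total number of relevant links; the function names are hypothetical. Setting AP to zero when there are no relevant links is a convention, applied here as a guard.

```python
def average_precision(hits, n_relevant):
    """Mean of the precisions at the ranks where a relevant link appears.

    hits: booleans in rank order (True = the link at that rank is relevant).
    n_relevant: total number of relevant links, retrieved or not.
    """
    if n_relevant == 0:
        return 0.0  # convention: no relevant links means AP = 0
    found = 0
    total = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            found += 1
            total += found / rank  # precision at this rank
    return total / n_relevant


def mean_average_precision(runs):
    """Average the AP over several (hits, n_relevant) ranked lists."""
    return sum(average_precision(h, n) for h, n in runs) / len(runs)


# Two ranked lists of five predictions each.
list_1 = ([True, False, True, False, False], 2)   # hits at ranks 1 and 3
list_2 = ([False, False, True, True, False], 3)   # hits at ranks 3 and 4

ap_1 = average_precision(*list_1)
ap_2 = average_precision(*list_2)
map_score = mean_average_precision([list_1, list_2])
print(round(ap_1, 2), round(ap_2, 2), round(map_score, 2))
```

With unrounded precisions the two AP values are 5/6 and 5/18, so the MAP is 5/9, about 0.56.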

Comparison of Algorithms

Other Methods for Link Prediction
- Random walks on graphs: average commute time
- Maximum likelihood methods: constructing graphs
- Spectral extrapolation methods: continuing eigenvalue trends
- And more [Lü & Zhou, 2011]

Challenges for Link Prediction
- Cold start problem
- Temporal networks
  - Links can be created and destroyed over time
  - New nodes come in and out of the system
- Link prediction with links of a different nature, e.g., negative links

References
- Lü and Zhou; Link prediction in complex networks: A survey; Physica A; 2011
- Liben-Nowell & Kleinberg; The Link Prediction Problem for Social Networks
- Kunegis et al.; Spectral evolution in dynamic networks; Knowl. Inf. Syst.; 2013
- Kunegis et al.; Online Dating Recommender Systems: The split-complex number approach; ACM; 2012
- Mean Average Precision:
  - https://en.wikipedia.org/wiki/precision_and_recall
  - https://en.wikipedia.org/wiki/information_retrieval
  - Victor Lavrenko: youtube.com/watch?v=pm6dj0zzee0
  - https://web-beta.archive.org/web/20110504130953/http://sas.uwaterloo.ca/stats_navigation/techreports/04workingpapers/2004-09.pdf

How To: The Setup (quick note)
- Our algorithms consider only the edges that connect the same set of nodes in the training and test sets, usually taken as the nodes of the giant component of the graph.
- This assumes the graph is static: no new nodes enter the system.
- By definition, the algorithm will not predict edges for nodes outside the giant component.
- More about this later.
[Figure source: Networkx.github.io]
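Restricting the data to the giant component can be sketched with a breadth-first search. The example below uses only the Python standard library rather than NetworkX; the helper names and the edge list are hypothetical.

```python
from collections import deque


def giant_component(edges):
    """Return the node set of the largest connected component."""
    gamma = {}
    for x, y in edges:
        gamma.setdefault(x, set()).add(y)
        gamma.setdefault(y, set()).add(x)

    seen, best = set(), set()
    for start in gamma:
        if start in seen:
            continue
        # Breadth-first search from an unvisited node collects one component.
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in gamma[node]:
                if nb not in component:
                    component.add(nb)
                    queue.append(nb)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def restrict(edges, nodes):
    """Keep only the edges whose endpoints both lie in the given node set."""
    return [(x, y) for x, y in edges if x in nodes and y in nodes]


# Hypothetical graph: one component of four nodes and one isolated pair.
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (10, 11)]
core = giant_component(edges)
kept = restrict(edges, core)
print(sorted(core), kept)
```

Applying `restrict` to both the training and the probe edges keeps the two sets on the same nodes, which is the static-graph assumption stated above.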

MAP/AP FAQs
Why do you take the average of the precisions at the relevant documents, and not at all positions?
My intuition is about the ranking: relevant docs/links should come first. If you take the first example and sum over all docs/links retrieved, you get a lower score. Now suppose you have only 2 relevant docs in a list of five, and you rank them first: your AP is 1, which is what you would want. If you divide by 5 instead of 2, you get (1 + 1)/5 = 0.4, which is not in line with what we want from an accuracy metric for a ranked list.
Taking the precision at a given rank also has the effect of weighting the order of the correct guesses. Think: 1 relevant doc at position 1 gives AP = 1; at position 3, AP = 0.33.

MAP/AP FAQs
What if the denominator is zero (e.g., no relevant docs/links)?
One simply sets the AP to zero. One could think of this as penalizing the algorithm for retrieving/predicting docs/links when none should be returned.

The Cold Start Problem
New nodes?

Algebraic Methods: How To
Split the data appropriately, adding a further split, to allow certain parameters of the graph kernels to be learned.
[Figure from Kunegis et al., 2013]
