CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Similar documents
Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices

Markov Chains and Spectral Clustering

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Spectral Clustering. Guokun Lai 2016/10

Lecture 13: Spectral Graph Theory

Spectral Clustering. Spectral Clustering? Two Moons Data. Spectral Clustering Algorithm: Bipartioning. Spectral methods

Spectral Clustering. Zitao Liu

Graph Partitioning Algorithms and Laplacian Eigenvalues

1 Matrix notation and preliminaries from spectral graph theory

Networks and Their Spectra

CS168: The Modern Algorithmic Toolbox Lectures #11 and #12: Spectral Graph Theory

Data Analysis and Manifold Learning Lecture 7: Spectral Clustering

Jure Leskovec Joint work with Jaewon Yang, Julian McAuley

1 Matrix notation and preliminaries from spectral graph theory

Lecture 12 : Graph Laplacians and Cheeger s Inequality

Data Mining Techniques

CS 664 Segmentation (2) Daniel Huttenlocher

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Data Mining and Matrices

Introduction to Spectral Graph Theory and Graph Clustering

Spectra of Adjacency and Laplacian Matrices

ON THE QUALITY OF SPECTRAL SEPARATORS

1 T 1 = where 1 is the all-ones vector. For the upper bound, let v 1 be the eigenvector corresponding. u:(u,v) E v 1(u)

Graph Partitioning Using Random Walks

Machine Learning for Data Science (CS4786) Lecture 11

ECS231: Spectral Partitioning. Based on Berkeley s CS267 lecture on graph partition

Chapter 4. Signed Graphs. Intuitively, in a weighted graph, an edge with a positive weight denotes similarity or proximity of its endpoints.

Graph Clustering Algorithms

Machine Learning - MT Clustering

8.1 Concentration inequality for Gaussian random matrix (cont d)

Certifying the Global Optimality of Graph Cuts via Semidefinite Programming: A Theoretic Guarantee for Spectral Clustering

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

STA141C: Big Data & High Performance Statistical Computing

CS6220: DATA MINING TECHNIQUES

Data Analysis and Manifold Learning Lecture 3: Graphs, Graph Matrices, and Graph Embeddings

MLCC Clustering. Lorenzo Rosasco UNIGE-MIT-IIT

CS 556: Computer Vision. Lecture 13

Summary: A Random Walks View of Spectral Segmentation, by Marina Meila (University of Washington) and Jianbo Shi (Carnegie Mellon University)

Laplacian Eigenmaps for Dimensionality Reduction and Data Representation

MATH 567: Mathematical Techniques in Data Science Clustering II

P P P NP-Hard: L is NP-hard if for all L NP, L L. Thus, if we could solve L in polynomial. Cook's Theorem and Reductions

Data Mining and Analysis: Fundamental Concepts and Algorithms

Spectral Graph Theory Lecture 2. The Laplacian. Daniel A. Spielman September 4, x T M x. ψ i = arg min

ORIE 4741: Learning with Big Messy Data. Spectral Graph Theory

Statistical Machine Learning

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Laplacian Integral Graphs with Maximum Degree 3

Spectral Clustering on Handwritten Digits Database

Spectral Graph Theory and its Applications. Daniel A. Spielman Dept. of Computer Science Program in Applied Mathematics Yale Unviersity

Admin NP-COMPLETE PROBLEMS. Run-time analysis. Tractable vs. intractable problems 5/2/13. What is a tractable problem?

An Algorithmist s Toolkit September 10, Lecture 1

Spectral Analysis of k-balanced Signed Graphs

CS6220: DATA MINING TECHNIQUES

A Statistical Look at Spectral Graph Analysis. Deep Mukhopadhyay

Slide source: Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University.

Communities, Spectral Clustering, and Random Walks

Learning from Sensor Data: Set II. Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University

Network Embedding as Matrix Factorization: Unifying DeepWalk, LINE, PTE, and node2vec

Aditya Bhaskara CS 5968/6968, Lecture 1: Introduction and Review 12 January 2016

Specral Graph Theory and its Applications September 7, Lecture 2 L G1,2 = 1 1.

A Dimensionality Reduction Framework for Detection of Multiscale Structure in Heterogeneous Networks

COMPSCI 514: Algorithms for Data Science

Lecture: Local Spectral Methods (1 of 4)

Clustering using Mixture Models

CPSC 540: Machine Learning

Exercises NP-completeness

A lower bound for the Laplacian eigenvalues of a graph proof of a conjecture by Guo

Mining Newsgroups Using Networks Arising From Social Behavior by Rakesh Agrawal et al. Presented by Will Lee

Overlapping Community Detection at Scale: A Nonnegative Matrix Factorization Approach

Graphs, Vectors, and Matrices Daniel A. Spielman Yale University. AMS Josiah Willard Gibbs Lecture January 6, 2016

CS249: ADVANCED DATA MINING

A sequence of triangle-free pseudorandom graphs

Spectral Theory of Unsigned and Signed Graphs Applications to Graph Clustering. Some Slides

Predicting Graph Labels using Perceptron. Shuang Song

Latent Semantic Indexing (LSI) CE-324: Modern Information Retrieval Sharif University of Technology

Data Analysis and Manifold Learning Lecture 9: Diffusion on Manifolds and on Graphs

CS341 info session is on Thu 3/1 5pm in Gates415. CS246: Mining Massive Datasets Jure Leskovec, Stanford University

Overlapping Communities

THE HIDDEN CONVEXITY OF SPECTRAL CLUSTERING

CS264: Beyond Worst-Case Analysis Lecture #15: Topic Modeling and Nonnegative Matrix Factorization

Spectral Generative Models for Graphs

CSCE 750 Final Exam Answer Key Wednesday December 7, 2005

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

17.1 Directed Graphs, Undirected Graphs, Incidence Matrices, Adjacency Matrices, Weighted Graphs

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

Testing Cluster Structure of Graphs. Artur Czumaj

Preliminaries and Complexity Theory

Lecture 13 Spectral Graph Algorithms

Eight theorems in extremal spectral graph theory

Locally-biased analytics

Lecture 14: Random Walks, Local Graph Clustering, Linear Programming

Lecture Introduction. 2 Brief Recap of Lecture 10. CS-621 Theory Gems October 24, 2012

High Dimensional Search Min- Hashing Locality Sensi6ve Hashing

Groups of vertices and Core-periphery structure. By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA

Reading group: proof of the PCP theorem

Lecture 1. 1 if (i, j) E a i,j = 0 otherwise, l i,j = d i if i = j, and

Transcription:

CS224W: Social and Information Network Analysis Jure Leskovec Stanford University Jure Leskovec, Stanford University http://cs224w.stanford.edu

Task: Find coalitions in signed networks Incentives: European chocolates! Fame Up to 10% extra credit Due: Friday midnight No late days! 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 2

Today: 3 methods (3) Trawling: Community signatures that can be efficiently extracted (4) Spectral graph partitioning: i Laplacian matrix, ti 2nd eigenvector (5) Overlapping communities: Clique percolation method 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 3

[Kumar et al. 99] Searching for small communities in Web graph (1) What is the signature of a community/discussion in a Web graph Use this to define topics: What the same people on the left talk about on the right A dense 2 layer graph Intuition: many people all talking about the same things 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 4

[Kumar et al. 99] (2) A more well defined problem: Enumerate complete bipartite subgraphs K s,t Where K s,t = s nodes where each links to the same t other nodes 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 5

[Kumar et al. 99] Two points: (1) The signature of a community/discussion (2) Complete bipartite subgraph K s,t K s,t = graph on s nodes, each links to the same t other nodes Plan: (A) From (2) get back to (1): Via: Any dense enough graph contains a smaller K s,t as a subgraph (B) How do we solve (2) in a giant graph? Whatsimilar problems have been solvedonon big non graph data? (3) Frequent itemset enumeration [Agrawal Srikant 99] 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 6

[Agrawa Srikant 99] Marketbasket analysis: What items are bought together in a store? Setting: Universe U of n items m subsets of U: S 1, S 2,, S m U (S i is a set of items one person bought) Frequency threshold f Goal: Find all subsets T s.t. T S i of f sets S i (items in T were bought together f times) 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 7

Example: Universe of items: U={1,2,3,4,5} {,,,,} Itemsets: S 1 ={1,3,5}, S 2 ={2,3,4}, S 3 ={2,4,5}, S 4 ={3,4,5}, S 5 ={1,3,4,5}, S 6 ={2,3,4,5} Minimum support: f=3 Algorithm: Build up the lists Insight: for a frequent set of size k, all its subsets are also frequent 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 8

[Agrawa Srikant 99] U={1,2,3,4,5}, U={12345} f=3 S 1 ={1,3,5}, S 2 ={2,3,4}, S 3 ={2,4,5}, S 4 ={345} ={3,4,5}, S={1345} 5 ={1,3,4,5}, S={2345} 6 ={2,3,4,5} 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 9

[Agrawa Srikant 99] For i =1 1,,k Find all frequent sets of size i by composing sets of size i-1 that differ in 1 element Open question: Efficiently find only maximal frequent sets 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 10

[Kumar et al. 99] Claim: (3) (itemsets) solves (2) (bipartite graphs) How? View each node i as a set S i of nodes i points to K s,t = a set y of size t that occurs in s sets S i Looking for K s,t set of frequency threshold to s and look at layer t all frequent sets of size t 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 11

[Kumar et al. 99] (2) (1): Informally, every dense enough graph G contains a bipartite K s,t subgraph where s and t depend on size (# of nodes) and density (avg. degree) of G [Kovan Sos Turan 53] Theorem: Let G=(X,Y,E), X = Y = n with avg. degree: 1/ t 1 1/ t d s n then G contains K st s,t as a subgraph [Will not prove it here. See online slides] t 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 12

For the proof we will need the following fact Recall: a b a( a 1)...( a b 1) b! Let f(x) = x(x-1)(x-2) (x-k) Once x k, f(x) curves upward (convex) Suppose a setting: g(y) is convex Want to minimize g(x i ) where x i =x To minimize i i g(x i ) make each x i = x/n 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 13

Node i, degree d i and neighbor set S i Put node i in buckets for all size t subsets of its neighbors Potential right hand sides of K s,t (i.e., all size t subsets of S i ) 11/10/2009 Jure Leskovec, Stanford CS322: Network Analysis 14

Note: As soon as s nodes appear in a bucket we have a K s,t How many buckets node i contributes? d i degree of node i i What is the total size of all buckets? 11/10/2009 Jure Leskovec, Stanford CS322: Network Analysis 15

a a( a 1)...( a b 1) So, the total height of b b b!! all buckets is Plug in: d s 1/ t 1 1/ /tt n t 11/10/2009 Jure Leskovec, Stanford CS322: Network Analysis 16

We have: Total height of all buckets How many buckets are there? n t What is the average height of buckets? n t s t! t n t! s t s t! n t height s n t! So, avg. bucket So by pigeonhole principle, there must be at least one bucket with more than s nodes in it. 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 17

[Kumar et al. 99] Theoretical result: Complete bipartite subgraphs K s,t are embedded in larger dense enough graphs (i.e., the communities) i.e., biparite subgraphs as signatures of communities Algorithmic result: Frequent itemset extraction and dynamic programming g SCALABLE!!! 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 18

1 Undirected graph G(V,E): G(VE): Bi partitioning task: Divide vertices into two disjoint groups (A,B) A Questions: 1 5 2 6 3 4 How can we define a good partition of G? How can we efficiently identify such a partition? 2 3 B 5 4 6 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 19

What makes a good partition? Maximize the number of within group connections Minimize i i the number of bt between group connections 1 5 2 3 4 6 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 20

Express partitioning objectives as a function of the edge cut of the partition Cut: Set of edges with ihonly one vertex in a group: A 2 1 3 4 5 6 B cut(a,b) = 2 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 21

Criterion: Minimum cut Minimise weight of connections between groups min AB A,B cut(a,b) Degenerate case: Optimal cut Minimum cut Problem: Only considers external cluster connections Does not consider internal cluster connectivity 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 22

[Shi Malik] Criterion: Normalized cut [Shi Malik, 97] Connectivity between groups relative to the density of each group Vol(A): The total weight of the edges originating from groupa A. Why use this criterion? Produces more balanced partitions How do we efficiently find a good partition? Problem: Computing optimal cut is NP hard 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 23

A: adjacency matrix of undirected G A ij =1 if (i,j) is an edge, else 0 x is a vector in n with components (x 1,, x n ) just a label/value of each node of G What is the meaning of A x? Entry y j is a sum of labels x i of neighbors of j 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 24

j th coordinate of Ax: Sum of the x values of neighbors of j Make this a new value at node j Spectral Graph Theory: Analyze the spectrum of matrix representing G Spectrum: Eigenvectors of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues: 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 25

Suppose all nodes in G have degree d and G is connected What are some eigenvalues/vectors of G? Ax = x What is? What x? 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 26

What if G is not connected? Say G has 2 components, each d regular What are some eigenvectors? x= Put all 1s on A and 0s on B or vice versa 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 27

Adjacency matrix (A): n n matrix A=[a ij ], a ij =1 if edge between node i and j 2 1 3 Important properties: 4 5 Symmetric matrix Eigenvectors are real and orthogonal 6 1 2 3 4 5 6 1 0 1 1 0 1 0 2 1 0 1 0 0 0 3 1 1 0 1 0 0 4 0 0 1 0 1 1 5 1 0 0 1 0 1 6 0 0 0 1 1 0 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 28

Degree matrix (D): n n diagonal matrix D=[d ii ], d ii = degree of node i 1 2 3 4 5 6 2 1 3 4 5 6 1 3 0 0 0 0 0 2 0 2 0 0 0 0 3 0 0 3 0 0 0 4 0 0 0 3 0 0 5 0 0 0 0 3 0 6 0 0 0 0 0 2 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 29

Laplacian matrix () (L): n n symmetric matrix 1 2 3 What is trivial eigenvector/ eigenvalue? 5 6 4 Important properties: 1 2 3 4 5 6 1 3 1 1 0 1 0 2 1 2 1 0 0 0 3 1 1 3 1 0 0 4 0 0 1 3 1 1 5 1 0 0 1 3 1 6 0 0 0 1 1 2 L = D - A Eigenvalues are non negative negative real numbers Eigenvectors are real and orthogonal 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 30

For symmetric matrix M: 2 min x T x T Mx x T Mx What is the meaning of min x T Lx on G? x 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 31

What else do we know about x? x is unit vector x is orthogonal to 1 st eigenvector (1,,1) 1) thus: Then: min ( x x i j 2 All lbli f x 2 i All labelings of nodes so that sum(x i )=0 ) 2 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 32

Express partition (A,B) as a vector We can minimize the cut of the partition by finding a non trivial vector x that minimizes: 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 33

The minimum value is given by the 2 nd smallest eigenvalue λ 2 of the Laplacian matrix L The optimal solution for x is given by the corresponding eigenvector λ 2, referred as the Fiedler vector 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 34

How to define a good partition of a graph? Minimise a given graph cut criterion How to efficiently identify such a partition? Approximate using information provided by the eigenvalues and eigenvectors of a graph Spectral Clustering 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 35

Threebasic stages: 1. Pre processing Construct a matrixrepresentationrepresentation of the graph 2. Decomposition Compute eigenvalues and eigenvectors of the matrix Map each point to a lower dimensional representation based on one or more eigenvectors 3. Grouping Assign points to two or more clusters, based on the new representation 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 36

Pre processing: Build Laplacian matrix L of the graph 1 2 3 4 5 6 1 3 1 1 0 1 0 2 1 2 1 0 0 0 3 1 1 3 1 0 0 4 0 0 1 3 1 1 5 1 0 0 1 3 1 6 0 0 0 1 1 2 1.0 0.4 0.6 0.4 0.4 0.4 0.0 Decomposition: 3.0 0.4 0.3 0.1 0.6 0.4 0.5 Find eigenvalues and eigenvectors x of the matrix L 0.0 = X = 3.0 4.0 5.0 0.4 0.4 0.4 0.4 0.3 0.3 0.3 0.6 0.5 0.1 0.5 0.4 0.2 0.6 0.2 0.4 0.4 0.4 0.4 0.4 0.5 0.5 0.5 0.0 Map vertices to corresponding components of 2 1 2 3 4 5 0.3 0.6 0.3 0.3 0.3 How do we now find the clusters? 6 0.6 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 37

Grouping: Sort components of reduced 1 dimensional vector Identify clusters by splitting the sorted vector in two How to choose a splitting point? Naïve approaches: Split at 0, mean or median value More expensive approaches: Attempt to minimise normalized cut criterion in 1 dimension 1 03 0.3 Split at 0: 2 3 4 5 6 0.6 0.3 0.33 0.3 0.6 Cluster A: Positive points Cluster B: Negative points 1 03 0.3 4 0.3 03 2 3 0.6 0.3 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 38 5 6 0.3 0.6 A B

11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 39

How do we partition a graph into k clusters? Two basic approaches: Recursive bi partitioning [Hagen et al., 92] Recursively apply bi partitioning algorithm in a hierarchical divisive manner Disadvantages: Inefficient, unstable Cluster multiple eigenvectors [Shi Malik, 00] Build a reduced space from multiple eigenvectors Commonly used in recent papers A preferable approach 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 40

k eigenvector Algorithm [Ng et al., 01] Pre processing: Construct the scaled adjacency matrixa': A 1/ 2 1/ 2 A' Decomposition: D AD Find the eigenvalues and eigenvectors of A' Buildembedded space from the eigenvectors corresponding to the k largest eigenvalues Grouping: Apply k means to reduced n k space to get k clusters 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 41

Approximates the optimal cut [Shi Malik, 00] Can be used to approximate the optimal k way normalized cut Emphasizes cohesive clusters Increases the unevenness in the distribution ib i of the data Associations between similar points are amplified, associations between dissimilar points are attenuated The data dt begins to approximate a clustering Well separated space Transforms data to a new embedded space, consisting of k orthogonal basis vectors NB: Multiple eigenvectors prevent instability due to information loss 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 42

Eigengap: The difference between two consecutive eigenvalues Most stable clustering is generally given by the value k that maximises eigengap: k k Example: k 1 Eigenvalue 50 45 40 35 30 25 20 15 10 5 0 λ 1 λ 2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 k max k Choose k=2 2 1 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 43

METIS: Heuristic but works really well in practice http://glaros.dtc.umn.edu/gkhome/views/metis Graclus: Based on kernel k means http://www.cs.utexas.edu/users/dml/software/graclus.html Cluto: http://glaros.dtc.umn.edu/gkhome/views/cluto/ /g / / / Clique percorlation method: For finding overlapping clusters http://angel.elte.hu/cfinder/ 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 44

Non overlapping overlapping vs. overlapping communities 11/8/2010 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 45