Overlapping Communities

1 Overlapping Communities. Davide Mottin, Hasso Plattner Institute. Graph Mining course, Winter Semester 2017

2 Acknowledgements Most of this lecture is taken from: GRAPH MINING WS

3 Lecture road. Introduction to graph clustering; hierarchical approaches; spectral clustering.

4 Identifying Communities. Can we identify node groups (communities, modules, clusters)? Nodes: football teams. Edges: games played.

5 College Football Network. Atlantic Football Cups / conferences. Nodes: football teams. Edges: games played.

6 Protein-Protein Interactions. Can we identify functional modules? Nodes: proteins. Edges: physical interactions.

7 Protein-Protein Interactions. Functional modules. Nodes: proteins. Edges: physical interactions.

8 Facebook Network. Can we identify social communities? Nodes: Facebook users. Edges: friendships.

9 Facebook Network. Social communities: high school, summer internship, Stanford (squash), Stanford (basketball). Nodes: Facebook users. Edges: friendships.

10 Overlapping Communities. Non-overlapping vs. overlapping communities.

11 Non-overlapping Communities. [Figure: a network and its adjacency matrix, with nodes on both axes]

12 Communities as Tiles! What is the structure of community overlaps? Edge density in the overlaps is higher! Think of communities as tiles.

13 Recap so far. Communities in a network: this is what we want!

14 Plan of attack. 1) Given a model, we generate the network. 2) Given a network, find the best model. [Figure: generative model for networks, with an example graph on nodes A to H]

15 Model of networks. Goal: define a model that can generate networks. The model will have a set of parameters that we will later want to estimate (and thereby detect communities). Q: Given a set of nodes, how do communities generate the edges of the network? [Figure: generative model for networks, with an example graph on nodes A to H]

16 Community-Affiliation Graph Model (AGM). Generative model $B(V, C, M, \{p_c\})$ for graphs: nodes $V$, communities $C$, memberships $M$. Each community $c$ has a single probability $p_c$. Later we fit the model to networks to detect communities. [Figure: the model, a bipartite affiliation graph linking communities (with probabilities $p_A$, $p_B$) to nodes, and the network it generates]

17 AGM: Generative Process. AGM generates the links: for each pair of nodes in community $A$, we connect them with probability $p_A$. The overall edge probability is $P(u, v) = 1 - \prod_{c \in M_u \cap M_v} (1 - p_c)$, where $M_u$ is the set of communities node $u$ belongs to. If $u, v$ share no communities: $P(u, v) = \varepsilon$. Think of this as an OR function: if at least one community says YES, we create an edge. [Figure: community affiliations (model) and the resulting network]
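
The generative process above can be sketched in a few lines of Python. This is a minimal sketch, assuming memberships are stored as a dict from node to a set of community labels; the function names, the example memberships, the probabilities $p_c$, and the value of $\varepsilon$ are illustrative choices, not taken from the lecture.

```python
import itertools
import random


def agm_edge_prob(memberships, p, u, v, eps=1e-4):
    """P(u, v) = 1 - prod over shared communities c of (1 - p_c); eps if none shared."""
    shared = memberships[u] & memberships[v]
    if not shared:
        return eps  # assumed small background probability
    no_edge = 1.0
    for c in shared:
        no_edge *= 1.0 - p[c]  # probability that community c does NOT link u and v
    return 1.0 - no_edge  # OR over communities: at least one says YES


def generate_agm(memberships, p, eps=1e-4, seed=0):
    """Flip one biased coin per unordered node pair and return the sampled edge set."""
    rng = random.Random(seed)
    edges = set()
    for u, v in itertools.combinations(sorted(memberships), 2):
        if rng.random() < agm_edge_prob(memberships, p, u, v, eps):
            edges.add((u, v))
    return edges


# Hypothetical toy model: two overlapping communities A and B.
memberships = {"a": {"A"}, "b": {"A", "B"}, "c": {"A", "B"}, "d": {"B"}}
p = {"A": 0.6, "B": 0.5}

p_bc = agm_edge_prob(memberships, p, "b", "c")  # b and c share both A and B
p_ad = agm_edge_prob(memberships, p, "a", "d")  # a and d share no community
edges = generate_agm(memberships, p)
```

Nodes in the overlap (here `b` and `c`) get a higher edge probability than nodes sharing a single community, which matches the "edge density in the overlaps is higher" observation.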

18 Recap: AGM networks. [Figure: the model and the network it generates]

19 AGM: Flexibility. AGM can express a variety of community structures: non-overlapping, overlapping, nested.

20 Detecting Communities. Detecting communities with AGM: given a graph $G(V, E)$, find the model: 1) the affiliation graph $M$, 2) the number of communities $C$, 3) the parameters $p_c$. [Figure: the model generates the network (nodes A to H); we infer the model from the network]

21 Maximum Likelihood Estimation. Maximum likelihood principle (MLE). Given: data $X$. Assumption: the data is generated by some model $f(\Theta)$, where $f$ is the model and $\Theta$ are the model parameters. We want to estimate $P_f(X \mid \Theta)$: the probability that our model $f$ (with parameters $\Theta$) generated the data. Now let's find the most likely model that could have generated the data: $\arg\max_{\Theta} P_f(X \mid \Theta)$.

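22 Example: MLE. Imagine we are given a set of coin flips. Task: figure out the bias of the coin! Data: a sequence of coin flips $X = [1, 0, 0, 0, 1, 0, 0, 1]$. Model: $f(\Theta)$ returns 1 with probability $\Theta$, else returns 0. What is $P_f(X \mid \Theta)$? Assuming the coin flips are independent, $P_f(X \mid \Theta) = P_f(1 \mid \Theta) \cdot P_f(0 \mid \Theta) \cdot P_f(0 \mid \Theta) \cdots P_f(1 \mid \Theta)$. What is $P_f(1 \mid \Theta)$? Simple: $P_f(1 \mid \Theta) = \Theta$. Then $P_f(X \mid \Theta) = \Theta^3 (1 - \Theta)^5$. For example: $P_f(X \mid \Theta = 0.5) = 0.5^8 \approx 0.0039$ and $P_f(X \mid \Theta = 3/8) \approx 0.0050$. What did we learn? Our data was most likely generated by a coin with bias $\Theta = 3/8$. [Figure: $P_f(X \mid \Theta)$ plotted against $\Theta$, with its maximum at $\Theta = 3/8$]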
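
The coin example can be checked numerically. The sketch below is illustrative only: the function name `coin_likelihood` and the grid search over $\Theta$ are assumptions made for this example, not part of the lecture.

```python
def coin_likelihood(flips, theta):
    """Likelihood of independent flips: theta^(#ones) * (1 - theta)^(#zeros)."""
    ones = sum(flips)
    zeros = len(flips) - ones
    return theta ** ones * (1.0 - theta) ** zeros


flips = [1, 0, 0, 0, 1, 0, 0, 1]  # three ones, five zeros

# Evaluate the likelihood on a grid of candidate biases and keep the best one.
grid = [i / 1000 for i in range(1001)]
best_theta = max(grid, key=lambda t: coin_likelihood(flips, t))

lik_half = coin_likelihood(flips, 0.5)    # likelihood under a fair coin
lik_mle = coin_likelihood(flips, 3 / 8)   # likelihood at the fraction of ones
```

The grid maximum lands on the fraction of ones in the data, which is the closed-form maximum likelihood estimate for a biased coin.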

23 MLE for Graphs. How do we do MLE for graphs? The model generates a probabilistic adjacency matrix. We then flip a biased coin for every entry of the probabilistic matrix to obtain the binary adjacency matrix $A$. For every pair of nodes $u, v$, AGM gives the probability $P(u, v)$ of them being linked. The likelihood of AGM generating graph $G$: $P(G \mid \Theta) = \prod_{(u,v) \in E} P(u, v) \prod_{(u,v) \notin E} (1 - P(u, v))$.

24 Graphs: Likelihood $P(G \mid \Theta)$. Given a graph $G(V, E)$ and $\Theta = B(V, C, M, \{p_c\})$, we calculate the likelihood that $\Theta$ generated $G$: $P(G \mid \Theta) = \prod_{(u,v) \in E} P(u, v) \prod_{(u,v) \notin E} (1 - P(u, v))$.
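
As a rough illustration of this likelihood, the sketch below sums log-probabilities over all unordered node pairs (working in log space avoids underflow for larger graphs). The function name, the callable `edge_prob`, and the toy graph are assumptions for this example; any model that returns a probability strictly between 0 and 1 for each pair could be plugged in.

```python
import itertools
import math


def graph_log_likelihood(nodes, edges, edge_prob):
    """log P(G | Theta) = sum_{(u,v) in E} log P(u,v) + sum_{(u,v) not in E} log(1 - P(u,v))."""
    edge_set = {frozenset(e) for e in edges}  # undirected edges
    total = 0.0
    for u, v in itertools.combinations(nodes, 2):
        p = edge_prob(u, v)  # assumed to lie strictly between 0 and 1
        if frozenset((u, v)) in edge_set:
            total += math.log(p)
        else:
            total += math.log(1.0 - p)
    return total


# Hypothetical toy graph: three nodes, one edge, every pair linked with probability 0.5.
nodes = [0, 1, 2]
edges = [(0, 1)]
log_lik = graph_log_likelihood(nodes, edges, lambda u, v: 0.5)
```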

25 MLE for Graphs. Our goal: find $\Theta = B(V, C, M, \{p_c\})$ that maximizes the likelihood: $\arg\max_{\Theta} P(G \mid \Theta)$. How do we find the $B(V, C, M, \{p_c\})$ that maximizes the likelihood?

26 MLE for AGM. Our goal is to find $B(V, C, M, \{p_c\})$ such that: $\arg\max_{B(V, C, M, \{p_c\})} \prod_{(u,v) \in E} P(u, v) \prod_{(u,v) \notin E} (1 - P(u, v))$. Problem: finding $B$ means finding the bipartite affiliation network, and there is no nice way to do this. Fitting $B(V, C, M, \{p_c\})$ is too hard, so let's change the model (so that it is easier to fit)!

27 From AGM to BigCLAM. Relaxation: memberships have strengths. $F_{uA}$: the membership strength of node $u$ to community $A$ ($F_{uA} = 0$: no membership). Each community $A$ links nodes independently: $P_A(u, v) = 1 - \exp(-F_{uA} \cdot F_{vA})$.

28 Factor Matrix $F$. Community membership strength matrix $F$: rows are nodes, columns are communities. $F_{uA}$ is the strength of $u$'s membership to $A$, and $F_u$ is the vector of community membership strengths of $u$. $P_A(u, v) = 1 - \exp(-F_{uA} \cdot F_{vA})$: the probability of connection increases with the product of strengths. Notice: if one node doesn't belong to community $C$ ($F_{uC} = 0$), then $P_C(u, v) = 0$. Probability that at least one common community $C$ links the nodes: $P(u, v) = 1 - \prod_C (1 - P_C(u, v))$.

29 From AGM to BigCLAM. Community $A$ links nodes $u, v$ independently: $P_A(u, v) = 1 - \exp(-F_{uA} \cdot F_{vA})$. Then the probability that at least one common community $C$ links them: $P(u, v) = 1 - \prod_C (1 - P_C(u, v)) = 1 - \exp(-\sum_C F_{uC} F_{vC}) = 1 - \exp(-F_u F_v^T)$. [Example: an $F$ matrix with rows $F_u$, $F_v$, $F_w$ of node community membership strengths; the numeric values are missing. The example evaluates $F_u F_v^T$, $P(u, v) = 1 - \exp(-F_u F_v^T)$ and $P(u, w)$, and finds $P(v, w) = 0$.]
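
Since the numbers of the slide example are not available here, the following sketch uses made-up membership strengths to show how $P(u, v)$ is computed and that the product form and the dot-product form agree. The vectors and function names are hypothetical.

```python
import math


def bigclam_edge_prob(f_u, f_v):
    """P(u, v) = 1 - exp(-F_u . F_v^T), using the dot product of membership strengths."""
    dot = sum(a * b for a, b in zip(f_u, f_v))
    return 1.0 - math.exp(-dot)


def edge_prob_via_product(f_u, f_v):
    """Same probability written as 1 - prod_C (1 - P_C(u, v))."""
    no_edge = 1.0
    for a, b in zip(f_u, f_v):
        p_c = 1.0 - math.exp(-a * b)  # community C links u and v independently
        no_edge *= 1.0 - p_c
    return 1.0 - no_edge


# Hypothetical strengths for three nodes and four communities (not the slide's values).
F = {
    "u": [0.0, 1.2, 0.0, 0.2],
    "v": [0.5, 0.0, 0.0, 0.8],
    "w": [0.0, 1.8, 1.0, 0.0],
}

p_uv = bigclam_edge_prob(F["u"], F["v"])  # share only the last community
p_uw = bigclam_edge_prob(F["u"], F["w"])  # share the second community strongly
p_vw = bigclam_edge_prob(F["v"], F["w"])  # share no community
```

With these assumed values, `v` and `w` have no community with nonzero strength in common, so their dot product is zero and the edge probability vanishes.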

30 BigCLAM: How to find $F$. Task: given a network $G(V, E)$, estimate $F$. Find the $F$ that maximizes the likelihood: $\arg\max_F \prod_{(u,v) \in E} P(u, v) \prod_{(u,v) \notin E} (1 - P(u, v))$, where $P(u, v) = 1 - \exp(-F_u F_v^T)$. Usually we take the logarithm of the likelihood and call it the log-likelihood: $l(F) = \log P(G \mid F)$. Goal: find the $F$ that maximizes $l(F)$: $l(F) = \sum_{(u,v) \in E} \log(1 - \exp(-F_u F_v^T)) - \sum_{(u,v) \notin E} F_u F_v^T$.

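31 BigCLAM: V1.0. Compute the gradient of a single row $F_u$ of $F$: $\nabla l(F_u) = \sum_{v \in N(u)} F_v \frac{\exp(-F_u F_v^T)}{1 - \exp(-F_u F_v^T)} - \sum_{v \notin N(u)} F_v$, where $N(u)$ is the set of outgoing neighbors of $u$. Coordinate gradient ascent: iterate over the rows of $F$; compute the gradient $\nabla l(F_u)$ of row $u$ (while keeping the others fixed); update the row: $F_u \leftarrow F_u + \eta \nabla l(F_u)$; project $F_u$ back to a non-negative vector: if $F_{uC} < 0$, set $F_{uC} = 0$. This is slow! Computing $\nabla l(F_u)$ takes time linear in the number of nodes.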
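
One step of the row update can be sketched as follows. This is a naive version under stated assumptions: $F$ is a list of rows, `neighbors` maps each node index to a set of neighbor indices, and the gradient is the standard one obtained by differentiating the BigCLAM log-likelihood with respect to $F_u$. The small clamp on the denominator is an added numerical safeguard, and all names and values are illustrative.

```python
import math


def row_gradient(F, neighbors, u):
    """Naive gradient of l(F) w.r.t. row F_u; loops over ALL nodes, so O(N) per row."""
    grad = [0.0] * len(F[u])
    for v in range(len(F)):
        if v == u:
            continue
        if v in neighbors[u]:
            dot = sum(a * b for a, b in zip(F[u], F[v]))
            e = math.exp(-dot)
            weight = e / max(1.0 - e, 1e-10)  # clamp avoids division by zero (assumption)
            for c, f in enumerate(F[v]):
                grad[c] += f * weight
        else:
            for c, f in enumerate(F[v]):
                grad[c] -= f  # non-neighbors push the shared strengths down
    return grad


def update_row(F, neighbors, u, eta):
    """Gradient ascent step on row u, then projection onto non-negative values."""
    grad = row_gradient(F, neighbors, u)
    F[u] = [max(0.0, f + eta * g) for f, g in zip(F[u], grad)]


# Hypothetical toy input: 3 nodes, 2 communities, a single edge between nodes 0 and 1.
F = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
neighbors = {0: {1}, 1: {0}, 2: set()}

grad_before = row_gradient(F, neighbors, 0)
update_row(F, neighbors, 0, eta=0.1)
```

In this toy case the neighbor pulls the first coordinate of row 0 up, while the non-neighbor pushes the second coordinate below zero, where the projection clips it back to 0.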

32 BigCLAM: V2.0. However, we notice: $\sum_{v \notin N(u)} F_v = \sum_v F_v - F_u - \sum_{v \in N(u)} F_v$. We cache $\sum_v F_v$. So computing $\sum_{v \notin N(u)} F_v$ now takes time linear in the degree $|N(u)|$ of $u$. In real networks the degree of a node is much smaller than the total number of nodes in the network, so this is a significant speedup!
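
The caching trick can be illustrated with a small sketch. It assumes the same data layout as a list of rows plus a neighbor dict; the helper names and the toy numbers are hypothetical. The cached column sums must be refreshed whenever a row of $F$ changes, which costs only $O(K)$ for $K$ communities.

```python
def column_sums(F):
    """Cache sum_v F_v once: one pass over all rows."""
    return [sum(col) for col in zip(*F)]


def non_neighbor_sum_cached(F, neighbors, u, total):
    """sum_{v not in N(u)} F_v = sum_v F_v - F_u - sum_{v in N(u)} F_v, in O(deg(u)) rows."""
    result = list(total)
    for v in [u, *neighbors[u]]:
        for c, f in enumerate(F[v]):
            result[c] -= f
    return result


def non_neighbor_sum_naive(F, neighbors, u):
    """Direct sum over all non-neighbors (excluding u itself), O(N) rows."""
    result = [0.0] * len(F[u])
    for v in range(len(F)):
        if v != u and v not in neighbors[u]:
            for c, f in enumerate(F[v]):
                result[c] += f
    return result


def refresh_cache(total, old_row, new_row):
    """Keep the cached sum consistent after row u changes from old_row to new_row."""
    return [t - o + n for t, o, n in zip(total, old_row, new_row)]


# Hypothetical toy input: 4 nodes, 2 communities.
F = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0], [0.25, 0.0]]
neighbors = {0: {1}, 1: {0, 2}, 2: {1}, 3: set()}

total = column_sums(F)
cached = non_neighbor_sum_cached(F, neighbors, 0, total)
naive = non_neighbor_sum_naive(F, neighbors, 0)
```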

33 BigCLAM: Scalability. BigCLAM takes 5 minutes for networks with 300k nodes; other methods take 10 days. It can process networks with 100M edges!

34 Extension: Beyond Clusters

35 Extension: Directed AGM. Make community membership edges directed! Outgoing membership: the node sends edges. Incoming membership: the node receives edges.

36 Example: Model and Network

37 Directed AGM. Everything is almost the same, except that now we have 2 matrices, $F$ and $H$: $F$ holds the outgoing community memberships and $H$ the incoming community memberships. Edge probability: $P(u, v) = 1 - \exp(-F_u H_v^T)$. Network log-likelihood: $l(F, H) = \sum_{(u,v) \in E} \log(1 - \exp(-F_u H_v^T)) - \sum_{(u,v) \notin E} F_u H_v^T$, which we optimize the same way as before.
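
A small sketch of the directed variant, assuming $F$ and $H$ are dicts from node to a list of strengths. The names, the toy values, and the clamp inside the logarithm are illustrative assumptions. It shows that the edge probability is no longer symmetric and evaluates the log-likelihood over ordered node pairs.

```python
import itertools
import math


def directed_edge_prob(f_u, h_v):
    """P(u -> v) = 1 - exp(-F_u . H_v^T): outgoing strengths of u, incoming strengths of v."""
    dot = sum(a * b for a, b in zip(f_u, h_v))
    return 1.0 - math.exp(-dot)


def directed_log_likelihood(nodes, edges, F, H):
    """Sum log(1 - exp(-F_u H_v^T)) over edges and -F_u H_v^T over ordered non-edges."""
    edge_set = set(edges)
    total = 0.0
    for u, v in itertools.permutations(nodes, 2):
        dot = sum(a * b for a, b in zip(F[u], H[v]))
        if (u, v) in edge_set:
            total += math.log(max(1.0 - math.exp(-dot), 1e-10))  # clamp is an assumption
        else:
            total -= dot  # log(1 - P(u, v)) simplifies to -F_u H_v^T
    return total


# Hypothetical toy model: u sends edges in community 0, v receives edges in community 0.
F = {"u": [1.0, 0.0], "v": [0.0, 0.0]}  # outgoing memberships
H = {"u": [0.0, 0.0], "v": [2.0, 0.0]}  # incoming memberships

p_uv = directed_edge_prob(F["u"], H["v"])  # u -> v
p_vu = directed_edge_prob(F["v"], H["u"])  # v -> u
log_lik = directed_log_likelihood(["u", "v"], [("u", "v")], F, H)
```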

38 Predator-prey Communities

39 Questions?

40 References. 1) Yang, J. and Leskovec, J. Community-affiliation graph model for overlapping network community detection. ICDE. 2) Yang, J. and Leskovec, J. Overlapping community detection at scale: a nonnegative matrix factorization approach. ACM International Conference on Web Search and Data Mining (WSDM). 3) Yang, J., McAuley, J. and Leskovec, J. Detecting cohesive and 2-mode communities in directed and undirected networks. ACM International Conference on Web Search and Data Mining (WSDM). 4) Yang, J., McAuley, J. and Leskovec, J. Community detection in networks with node attributes. IEEE International Conference on Data Mining (ICDM).

Overlapping Community Detection at Scale: A Nonnegative Matrix Factorization Approach

Authors: Jaewon Yang, Jure Leskovec (Stanford University). Venue: WSDM 2013. Presenter: Yupeng Gu. Background Community