Groups of vertices and Core-periphery structure. By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA

Similar documents
Jure Leskovec Joint work with Jaewon Yang, Julian McAuley

Peripheries of Cohesive Subsets

Networks as vectors of their motif frequencies and 2-norm distance as a measure of similarity

Complex Networks CSYS/MATH 303, Spring, Prof. Peter Dodds

Efficient Network Structures with Separable Heterogeneous Connection Costs

Peripheries of cohesive subsets

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

1 Mechanistic and generative models of network structure

Mini course on Complex Networks

Efficient Network Structures with Separable Heterogeneous Connection Costs

Applications of Eigenvalues in Extremal Graph Theory

The Critical Point of k-clique Percolation in the Erd ós Rényi Graph

arxiv: v1 [math.st] 1 Nov 2017

CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University

Hyperbolic metric spaces and their applications to large complex real world networks II

arxiv: v2 [cond-mat.dis-nn] 16 Feb 2017

Cohesive Subgroups. October 4, Based on slides by Steve Borgatti. Modifications (many sensible) by Rich DeJordy

Adventures in random graphs: Models, structures and algorithms

Subgraph Detection Using Eigenvector L1 Norms

Nonparametric Bayesian Matrix Factorization for Assortative Networks

On Dominator Colorings in Graphs

Betweenness centrality of fractal and nonfractal scale-free model networks and tests on real networks

CS224W: Analysis of Networks Jure Leskovec, Stanford University

Modularity and Graph Algorithms

arxiv:cond-mat/ v1 [cond-mat.dis-nn] 4 May 2000

Network models: random graphs

Lecture 13: Spectral Graph Theory

Deterministic Decentralized Search in Random Graphs

Design and characterization of chemical space networks

Extension of Strongly Regular Graphs

Packing triangles in regular tournaments

Link Prediction. Eman Badr Mohammed Saquib Akmal Khan

Networks: Lectures 9 & 10 Random graphs

6.207/14.15: Networks Lecture 12: Generalized Random Graphs

Modularity in several random graph models

The network growth model that considered community information

Spectral densest subgraph and independence number of a graph 1

Spectral Methods for Subgraph Detection

Project in Computational Game Theory: Communities in Social Networks

Lecture 5: January 30

Complex networks: an introduction

ECS 253 / MAE 253 April 26, Intro to Biological Networks, Motifs, and Model selection/validation

6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search

Dominator Colorings and Safe Clique Partitions

Data Mining and Analysis: Fundamental Concepts and Algorithms

The Trouble with Community Detection

Stability and topology of scale-free networks under attack and defense strategies

Consistency Under Sampling of Exponential Random Graph Models

Graph Detection and Estimation Theory

Applications of the Lopsided Lovász Local Lemma Regarding Hypergraphs

The weighted random graph model

arxiv:physics/ v1 9 Jun 2006

Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University

Supporting Statistical Hypothesis Testing Over Graphs

A LINE GRAPH as a model of a social network

CS224W: Social and Information Network Analysis

Kristina Lerman USC Information Sciences Institute

Detection of core-periphery structure in networks by 3-tuple motifs

arxiv: v1 [physics.soc-ph] 6 Nov 2013

Causality and communities in neural networks

Random Lifts of Graphs

Undirected Graphical Models

Source Locating of Spreading Dynamics in Temporal Networks

Networks as a tool for Complex systems

1 Matrix notation and preliminaries from spectral graph theory

The number of edge colorings with no monochromatic cliques

Data science with multilayer networks: Mathematical foundations and applications

Machine Learning and Modeling for Social Networks

Social Networks- Stanley Milgram (1967)

Extremal Graphs Having No Stable Cutsets

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68

Modeling Dynamic Evolution of Online Friendship Network

On the Complexity of the Minimum Independent Set Partition Problem

On Node-differentially Private Algorithms for Graph Statistics

Degree Distribution: The case of Citation Networks

Overlapping Communities

Chapter 9: Relations Relations

Spectral Analysis of k-balanced Signed Graphs

This section is an introduction to the basic themes of the course.

Peter J. Dukes. 22 August, 2012

Co-k-plex vertex partitions

Testing Equality in Communication Graphs

Shlomo Havlin } Anomalous Transport in Scale-free Networks, López, et al,prl (2005) Bar-Ilan University. Reuven Cohen Tomer Kalisky Shay Carmi

1 Random graph models

Multislice community detection

CS281A/Stat241A Lecture 19

Network Science (overview, part 1)

Chromatic number, clique subdivisions, and the conjectures of

Near-domination in graphs

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices

THE NUMBER OF INDEPENDENT DOMINATING SETS OF LABELED TREES. Changwoo Lee. 1. Introduction

I N S T I T U T F Ü R I N F O R M A T I K. Local Density. Sven Kosub T E C H N I S C H E U N I V E R S I T Ä T M Ü N C H E N

ECS 253 / MAE 253, Lecture 15 May 17, I. Probability generating function recap

Learning latent structure in complex networks

Chaos, Complexity, and Inference (36-462)

Understanding spatial connectivity of individuals with non-uniform population density

The super line graph L 2

RaRE: Social Rank Regulated Large-scale Network Embedding

More on NP and Reductions

Móstoles, Spain. Keywords: complex networks, dual graph, line graph, line digraph.

Transcription:

Groups of vertices and Core-periphery structure By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA

Mostly observed real networks have: Why? Heavy tail (powerlaw most probably, exponential) High clustering (high number of triangles especially in social networks, lower count otherwise) Small average path (usually small diameter) Communities/periphery/hierarchy Homophily and assortative mixing (similar nodes tend to be adjacent) Where does the structure come from? How do we model it? 2

Macro and Meso Scale properties Macro Scale properties (using all the interactions): Small world (small average path, high clustering) Powerlaw degree distr. (generally pref. attachment) Meso Scale properties applying to groups (using k- clique, k-core, k-dense): Community structure Core-periphery structure Micro Scale properties applying to small units: Edge properties (such as who it connects, being a bridge) Node properties (such as degree, cut-vertex) 3

Some local and global metrics pertaining to structure of networks Structure they capture Local Statistics Global statistics Direct influence General feel for the distribution of the edges Vertex degree, in and out degree Degree distribution Closeness, distance between nodes Geodesic (path) Distance (numerical value) Diameter, radius, average path length Connectedness of the network How critical are vertices to the connectedness of the graph? How much damage can a network take before disconnecting? Existence of a bridge Existence of a cut vertex Cut sets Degree distribution Tight node/edge neighborhoods, important nodes as a group Clique, plex, core, community, k-dense (for edges) Community detection 4

Clusters and matrices 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 1 0 0 0 0 0 In a very clustered graph, the adjacency matrix can be put in a block form (identifying communities). 0 0 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 Source: Guido Caldarelli, Communities and Clustering in Some social Networks, NetSci 2007 New York, May 20 th 2007

Definitions Newman s book uses k-component, k-cliques, k-plexes and k-cores to refer to a set of vertices with some properties. In graph theory (and research papers) we use a clique to be the set of vertices and edges, so a clique is actually a graph (or subgraph) Either way it works, the graph captures more information, and I will refer them as graphs (induced by the nodes in the sets).

Groups and subgroups identifications Some common approaches to subgroup identification and analysis: K-cliques K-cores (k-shell) K-denseness Components (and k-components) Community detection They are used to explore how large networks can be built up out of small and tight groups.

Section 7.8.2 COMPONENTS AND K-COMPONENTS 8

Components Recall that a graph is k-connected/k-component if it can be disconnected by removal of k vertices, and no k-1 vertices can disconnect it. Component is a maximal size connected subgraph A k-component (k-connected component) is a connected maximal subgraph that can be disconnected (or we re left with a ) by removal of k vertices, and no k-1 vertices can disconnect it. Alternatively: A k-component is a connected maximal subgraph such that there are k-vertexindependent paths between any two vertices 9

In class exercise The k-component tells how robust a graph or subgraph is. Identify a subgraph that is either a: 1-connected 2-connected 3-connected 4-connected 10

Section 7.8.1 K-CLIQUE AND K-CORE 11

k-cliques A clique of size : a subset of nodes, with every node adjacent to every other member of the subset (all one of them) We usually search for the maximum clique Hard to find (decision problem for the clique number is NP-Complete) Why is it hard to use this concept on real networks? Because one might not infer/know all the edges of the true network, so clique may exist but it may not be captured in the data to be analyzed A relaxed version of a clique might be just as useful in large networks. 12

In class exercise A clique of size : a subset of nodes, with every node connected to every other member of the subset. Identify a: 1-clique 2-clique 3-clique 4-clique Relaxed versions of a k-clique are k-dense and 13 k-core

-core: maximal subset of nodes,, with, where is the subgraph induced by S. Idea: enough edges are present to make a group strong. References: k-core [1] Bollobas, B. Graph Theory and Combinatorics: Proceedings of the Cambridge Combinatorial Conference in honor of P. Erdos, 35 (Academic, New York, 1984). [2] Seidman, S. B. Network structure and minimum degree. Social Networks 5, 269 287 (1983). [3] Carmi, S., Havlin, S, Kirkpatrick, S., Shavitt, Y. & Shir, E. A model of Internet topology using k-shell decomposition. Proc. Natl. Acad. Sci. USA 104, 11150-11154 (2007). [4] Angeles-Serrano, M. & Bogu n a, M. Clustering in complex networks. II. Percolation properties. Phys. Rev. E 74, 056116 (2006). 14

k-core A -core of size n: maximal subset of nodes with, where is the subgraph induced by. Finding the core: eliminate lower-order k-cores until to find The highest k before the network vanishes. http://iopscience.iop.org/article/10.1088/1367-2630/14/8/083030 15

In class exercise The -core of size n: maximal subset of nodes with, where is the subgraph induced by. Identify the: 1-core 2-core 3-core 4-core 16

k-dense A k- dense sub-graph is a group of vertices, in which each pair of vertices {i, j} has at least -2 common neighbors. Idea: pairwise friends ( dense looks at edges rather than vertices in making them part of the group) 17

k-dense A k- dense sub-graph is a group of vertices, in which each pair of vertices has at least -2 common neighbors. k - clique k - dense k core 18

In class exercise A k- dense sub-graph is a group of vertices, in which each pair of vertices {i, j} has at least k-2 common neighbors. Identify a: 1-dense 2-dense 3-dense 4-dense 19

Other extensions https://academic.oup.com/comnet/article/doi/10.1093/comnet/cnt016/2392115/structure-and-dynamics-of-core-periphery-networks 20

Using them globally

k-cliques, k-cores and k-dense clique of size : maximal subset of nodes, with every node adjacent to every other member of the subset -core: maximal subset of nodes, with every node adjacent to at least others in the subset A -dense sub-graph is a group of vertices, in which each pair of vertices {i, j} has at least common neighbors.

Communities vs. core/dense/clique K-core/dense/clique: look inside the group of nodes Communities look both at internal and external ties (high internal and low external ties) Core-periphery decomposition also looking at internal and ext. to the core (doesn t have to be a clique) 23

K-core (k-shell) decomposition The decomposition identifies the shells for different k-values. Generally (but not well defined): the core of the network (the -core for the largest ) and the outer periphery (last layer: 1-core taking away the 2-core). There are modifications where several top values of make the core. http://3.bp.blogspot.com/-tijz3nstwd0/togwugiveji/aaaaaaaasww/etkwklnpnw4/s1600/k-cores.png

K-core and degree (1) Reference: Alvarez-Hamelin, J. I., Dall asta, L., Barrat, A. & Vespignani, A. Large scale networks fingerprinting and visualization using the k-core decomposition. Advances in Neural Information Processing Systems 18, 41 51 (2006) http://papers.nips.cc/paper/2789-large-scale-networks-fingerprinting-and-visualization-using-the-k-core-decomposition.pdf 25

k-core and degree (2) The degree is highly (and nonlinearly) correlated with the position of the node in the k-shell http://iopscience.iop.org/article/10.1088/1367-2630/14/8/083030 26

Core-periphery structure 27

Core-periphery decomposition The core-periphery decomposition captures the notion that many networks decompose into: a densely connected core, and a sparsely connected periphery [6], [12]. The core-periphery structure is a pervasive and crucial characteristic of large networks [13], [14], [15]. If overlapping communities are considered:» the network core forms as a result of many overlapping communities http://ilpubs.stanford.edu:8090/1103/2/paper-ieee-full.pdf 28

Core-periphery adjacency matrix dark blue = 1 (adjacent) white = 0 (nonadjacent) 29

Deciding on core-periphery How to decide if a network has coreperiphery structure? Not well defined either, but generally the density of the -core must be high: Checked by the high correlation,, where is the (i,j) adjacency matrix entry, and http://www.sciencedirect.com/science/article/pii/s0378873399000192 30

Extensions of core-periphery?! Limitation: There are just two classes of nodes: core and periphery. Is a three-class partition consisting of core, semiperiphery, and periphery more realistic? Or even partitioning with more classes? The problem becomes more difficult as the number of classes is increased, and good justification is needed. http://www.sciencedirect.com/science/article/pii/s0378873399000192 31

Possible structures dark shade = 0 (nonadjacent) light shade = 1 (adjacent) 32 From Aaron Clauset and Mason Porter

Core and communities The network core was traditionally viewed as a single giant community (lacking internal communities references [7], [8], [9], [10]). Yang and Leskovec (2014, reference [11]) showed that dense cores form as a result of many overlapping communities. Moreover, foodweb, social, and web networks exhibit a single dominant core, while protein-protein interaction and product copurchasing networks contain many local cores formed around the central core 33 http://ilpubs.stanford.edu:8090/1103/2/paper-ieee-full.pdf

References 1. M. E. Newman, Analysis of weighted networks Physical Review E, vol. 70, no. 5, 2004. 2. Borgatti, Stephen P., and Martin G. Everett. "Models of core/periphery structures Social networks 21.4 (2000): 375-395. 3. Csermely, Peter, et al. "Structure and dynamics of core/periphery networks. Journal of Complex Networks 1.2 (2013): 93-123. 4. Kitsak, Maksim, et al. "Identification of influential spreaders in complex networks." Nature Physics 6.11 (2010): 888-893 5. S. B. Seidman, Network structure and minimum degree, Social networks, vol. 5, no. 3, pp. 269287, 1983 6. Borgatti, Stephen P., and Martin G. Everett. "Models of core/periphery structures." Social networks 21.4 (2000): 375-395. 7. J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney, Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters, Internet Mathematics, vol. 6, no. 1, pp. 29 123, 2009. 8. A. Clauset, M. Newman, and C. Moore, Finding community structure in very large networks, Physical Review E, vol. 70, p. 066111, 2004. 9. M. Coscia, G. Rossetti, F. Giannotti, and D. Pedreschi, Demon: a local-first discovery method for overlapping communities, in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2012, pp. 615 623. 10. J. Leskovec, K. Lang, and M. Mahoney, Empirical comparison of algorithms for network community detection, in Proceedings of the International Conference on World Wide Web (WWW), 2010 11. Jaewon Yang and Jure Leskovec. Overlapping Communities Explain Core-Periphery Organization of Networks Proceedings of the IEEE (2014) 12. P. Holme, Core-periphery organization of complex networks, Physical Review E, vol. 72, p. 046111, 2005. 13. F. D. Rossa, F. Dercole, and C. Piccardi, Profiling coreperiphery network structure by random walkers, Scientific Reports, vol. 3, 2013. 14. J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney, Community structure in large networks: Natural cluster sizes and the absence of large well-defined clusters, Internet Mathematics, vol. 6, no. 1, pp. 29 123, 2009. 15. M. P. Rombach, M. A. Porter, J. H. Fowler, and P. J. Mucha, Core-periphery structure in networks, SIAM Journal of Applied Mathematics, vol. 74, no. 1, pp. 167 190, 2014 34