arxiv: v1 [stat.me] 1 Aug 2016

Size: px

Start display at page:

Download "arxiv: v1 [stat.me] 1 Aug 2016"

Bryce Lawrence
5 years ago
Views:

1 Null Models and Modularity Based Counity Detection in Multi-Layer Networks Subhadeep Paul and Yuguo Chen University of Illinois at Urbana-Chapaign Abstract arxiv: v1 [stat.me] 1 Aug 16 Multi-layer networks are networks on a set of entities (nodes) with ultiple types of relations (edges) aong the where each type of relation/interaction is represented as a network layer. As with single layer networks, counity detection is an iportant task in ulti-layer networks. A large group of popular counity detection ethods in networks are based on optiizing a quality function known as the odularity score, which is a easure of presence of odules or counities in networks. Hence a first step in counity detection is defining a suitable odularity score that is appropriate for the network in question. Here we introduce several ulti-layer network odularity easures under different null odels of the network, otivated by epirical observations in networks fro a diverse field of applications. In particular we define the ulti-layer configuration odel, the ulti-layer expected degree odel and their various odifications as null odels for ulti-layer networks to derive different odularities. The proposed odularities are grouped into two categories. The first category, which is based on degree corrected ulti-layer stochastic block odel, has the ulti-layer expected degree odel as their null odel. The second category, which is based on ulti-layer extensions of Newan-Girvan odularity, has the ulti-layer configuration odel as their null odel. These easures are then optiized to detect the optial counity assignent of nodes. We copare the effectiveness of the easures in counity detection in siulated networks and then apply the to four real networks. Key words and phrases: Configuration odel, degree corrected ulti-layer stochastic block odel, expected degree odel, ulti-layer odularity, ulti-layer null odels. 1 Introduction The coplex networks encountered in biology, social sciences, econoics and achine learning are often ulti-layered in the sense that they consist of ultiple types of edges/relations aong a group of entities. Each of those different types of relation can be viewed as creating its own network, called a layer of the ulti-layer network. Multi-layer networks are 1

2 a ore accurate representation of any coplex systes since any entities in those systes are involved siultaneously in ultiple networks. This eans each of the individual networks carries only partial inforation about the interactions aong the entities, and full inforation is available only through the ulti-layer networked syste (Rocklin and Pinar, 11; Paul and Chen, 1). We will consider a nuber of such inherently ulti-layer networks as real world exaples in this paper. In the social networking website Twitter, users can engage in various types of interactions with other users, e.g., ention, follow, retweet, etc. (Greene and Cunningha, 13; Paul and Chen, 1). Although the individual network layers created by these relationships are structurally highly related, they represent different facets of user behavior and ignoring the differences ight lead to spurious conclusions. In another exaple fro biology, the neural network of a sall organis, Caenorhabditis elegans, consists of neurons which are connected to each other by two types of connections, a synaptic (cheical) link and an ionic (electrical) link. The two types of links have arkedly different dynaics and hence should be treated as two separate layers of a network instead of fusing together into a single network (Boccaletti et al., 14). See Kivelä et al. (14), Boccaletti et al. (14), and Nicosia and Latora (1) for ore exaples and discussions of ulti-layer networks. Efficient detection of counity structure is an iportant learning goal in odern networks. Counities or odules in a network are defined as subsets of nodes which are ore siilar to each other than the rest of the network. This definition is a bit abiguous in the sense that a group of node ight be siilar in any different ways. A general acceptable definition is the so called structural counities, where nodes within the counity are densely connected aong theselves as copared to the rest of the network (also known as assortative ixing or hoophily (Newan, 3)). Nevertheless, the proble of detecting odules or counities is relatively well studied for single layer networks (Fortunato, ). Counity detection by optiizing a quality function known as odularity (Newan and Girvan, 4) is popular in various application areas and its theoretical properties have been studied extensively as well (Bickel and Chen, 9; Zhao et al., 12). However the counity detection proble for ulti-layer networks has not been studied well in the literature until recently. Our ultiate objective in this paper is to develop ethods for counity detection in ulti-layer networks that can take into account the inforation present in all the network layers siultaneously. Hence developing odularity easures that cobine inforation fro different layers is the need of the hour. Recently there has been a surge in analysis of networks with ultiple layers (De Doenico et al., 13; Bazzi et al., 16; Boccaletti 2

3 et al., 14; Kivelä et al., 14; Han et al., 1; Paul and Chen, 1; Peixoto, 1; Stanley et al., 1; Taylor et al., 16). A few odularity easures have also been proposed in the literature. Mucha et al. () used a null odel forulated in ters of stability of counities under Laplacian dynaics in networks to derive a odularity easure for ulti-layer networks with inter-layer node coupling. Another extension of the Newan-Girvan odularity (Newan and Girvan, 4) to ulti-layer settings as a su of layer-wise odularities was entioned by Liu et al. (14) and Sarzynska et al. (16). Aggregation of adjacency atrix as a way of cobining inforation fro ultiple graphs has also been explored in the literature. However, recent inforation theoretical results show that counity detection on such aggregated graph does not always perfor quite well as the different types of layers ight have quite different properties in ters of density and signal quality which will get lost due to aggregation (Paul and Chen, 1). Here we take two principled null odel based approaches to derive odularity easures. The rest of paper is organized as follows. Section 2 deals with degree distributions in ultilayer networks and defines two failies of ulti-layer null odels. Section 3 extends the Newan-Girvan odularity to ulti-layer settings with various ulti-layer configuration odels as null odels. Section 4 first defines four ulti-layer block odels and then derives odularity easures based on the. Section deals with coputational issues, while Section 6 presents a study of the proposed odularities to assess their effectiveness on siulated networks. Section 7 applies the ethods for counity detection in a nuber of real world networks and Section 8 gives concluding rearks. 2 Multi-layer networks We represent an undirected ulti-layer graph as G = {V, E}, where the vertex set V consists of N vertices representing the entities and the edge set E consists of edges of M different types representing the different relations. Hence the ulti-layer network has M network layers on the set V. We define the adjacency atrix A () corresponding to the th network layer as follows: A () 1, if there is an th type of edge between i and j, ij =, otherwise. Hence a ulti-layer network can be viewed as a graph with vector valued edge inforation with the edge-vector between two nodes i and j being A ij = {A (1) ij, A(2) ij,..., A(M) ij }. In this connection we also define the ulti-degree of node i as k i = {k () i ; = 1,..., M} where k () i = j A() ij is its degree of th type. For a ulti-layer network with K counities, we further denote the counity assignent vector of the nodes as z = {z i ; i = 3

4 e () q 1,..., N} with z i taking values in the set {1,..., K}. We will use the notations e () and to denote the total nuber of edges of type between counities q and l and the total degree of type of all nodes in counity q, i.e., e () e () q = i,j A() ij I(z i = q) = i k() i = i,j A() ij I(z i = q, z j = l) and I(z i = q), where I( ) is the indicator function which is 1 if the condition inside is satisfied and otherwise. Note that e () = e () lq, e () q = l e(), is twice the nuber of edges within the counity q, for all in an undirected and e () qq ulti-layer graph. The total nuber of edges in layer and in the entire network are denoted as L () and L respectively. 2.1 Null odels for counity detection Modularity can be viewed as a score that coputes the difference between the observed structure of the network fro an expected structure under a rando null network. The null network can be generated by a rando network null odel which creates connections between nodes at rando without any special structure of interest (Sarzynska et al., 16; Bazzi et al., 16). In particular, for counity detection the odularity score coputes the difference between the observed nuber of edges and that expected in a null network within a group of nodes arked as a counity. The counity structure is strong if this score, with a proper noralization, is high. As an exaple, the celebrated Newan- Girvan odularity (Newan and Girvan, 4) for single layer networks has the following expression: Q NG = 1 2L { i,j A ij k ik j 2L } δ(z i, z j ) = 1 2L { q e qq e2 q 2L }, (2.1) where k i and e q are the degree of node i and the total degree of counity q respectively, and e qq and L represent twice the total nuber of edges within counity q and the total nuber of edges in the entire network respectively. This odularity uses the configuration odel (Bollobás, 198) as the null odel to generate the null network. In the configuration odel G(N, k) for a single layer network with nuber of nodes N and degree sequence k = {k i }, given the degree sequence of the nodes, the null network is sapled fro a population of networks having the sae degree sequence through rando atching of nodes. For soe counity assignent of the nodes of the network, the ethod then coputes the expected nuber of edges according to this null network within each counity. The odularity score is then the difference between the observed nuber of edges and the expected nuber of edges obtained in the previous step. Optiizing this odularity score across all possible counity assignents will then lead to a detection strategy for network counities. 4

5 Another related odel is the expected degree odel by Chung and Lu (2), which can be thought of as a null odel for a likelihood odularity based on the degree corrected block odel (Zhao et al., 12). In this odel each vertex v i is associated with a paraeter w i which represents its expected degree. The probability P ij of an edge between nodes i and j is proportional to the product of the expected degrees w i and w j : P ij = w iw j k w, ax wi 2 k i k w k. (2.2) The null network is then fored by adding edges A ij between nodes i and j with probability P ij, i.e., A ij Bernoulli(P ij ). For the subsequent discussions we use the reparaeterization due to Perry and Wolfe (12), which the authors call log-linear odel M log, log(p ij ) = α i + α j. (2.3) Let θ i = exp(α i ). Then we have P ij = θ i θ j. We further approxiate the Bernoulli distribution with Poisson distribution for ease of coputation. Such an approxiation is valid since we expect P ij to be sall in odern large scale networks which are quite sparse, and in such cases both distributions would lead to siilar results (Zhao et al., 12; Perry and Wolfe, 12; Yan et al., 14). An approxiate axiu likelihood estiate (MLE) solution to the Poisson likelihood of A ij leads to an estiate of ˆθ i = k i 2L, (2.4) and consequently ˆP ij = k ik j which is the sae as the one obtained fro the configuration 2L odel (Perry and Wolfe, 12). See Perry and Wolfe (12) for ore details and a proof of the validity of the approxiation of MLE. While the observed nuber of edges aong the nodes within a counity is unique, the expectation of the sae can vary depending on which network null odel is chosen. Hence there can be considerable variation in the counities detected using a odularity score. The null odel should be wisely chosen with the ai to capture all sources of systeatic variation in the network except the counity structure. So given that the observed network is a realization of the null odel, the additional edges observed within the counities beyond what is expected fro a purely rando phenoenon should be attributed to the counity structure. In the rest of this section, we extend both null odels entioned above to ulti-layer settings and define the ulti-layer configuration odel (MLCM) and the ulti-layer expected degree (MLED) odel respectively.

6 2.2 Degrees in ulti-layer network and null odels Fro the preceding discussions it is clear that the degree sequence, observed or expected, plays a ajor role in both null odels. In a ulti-layer network, every node is associated with a ulti-degree vector instead of a single degree. Hence in ulti-layer networks degree heterogeneity ight be present in two aspects: across the nodes in a layer and across layers in a node. Soe layers are inherently sparse and soe are dense. To illustrate this, consider the British Mebers of Parliaents (MPs) in twitter dataset. While there is clear degree heterogeneity across the different MPs in a network layer depending on their political influence and significance, there ight also be degree heterogeneity across layers for the sae MP depending upon personal preferences. The layer follow is soewhat denser copared to the layers ention and retweet, possibly because the forer requires one tie attention and the later, continued. Hence a nuber of null odels are possible depending upon how one odels the degree sequence. We can broadly classify our null odels in each of the two categories (MLCM and MLED) into independent degree (ID) odels and shared degree (SD) odels. The independent degree odels assue degrees in each layer are independent of the degrees in other layers, and assign a separate degree paraeter for each layer to each node. In contrast, the shared degree odels assign only one degree paraeter to each node, and layer-wise variations in degree are captured by a single layer-specific global paraeter. Hence we can write the independent degree ulti-layer expected degree odel as P () ij and the shared degree ulti-layer expected degree odel as = θ () i θ () j, (2.) P () ij = θ i θ j β. (2.6) The shared degree odel further requires an identifiability constraint β = 1. Note that the ID odel requires MN paraeters while the SD odel requires only N + M 1 paraeters. Since we have M network layers, we effectively have O(MN 2 ) data points. Hence in the context of sparse individual layers, the SD odel will lead to less variance at the expense of bias. It has been epirically observed that the layers of a ulti-layer network have any structural siilarity (Kivelä et al., 14). Aong others, it has been shown that the degrees are often highly correlated (Nicosia and Latora, 1). Since we expect the individual layers to be anifestation of an underlying coon structure, the degrees of a node in different layers are also expected to be highly inter-related. While there are instances where the degrees can be negatively correlated, a large nuber of cases have positively 6

7 correlated degrees. Figures 1(a1), (b1) and (c1) show the degree distribution of three layers (ention, follow, retweet) in Twitter network of British MPs, three layers (get on with, best friends, work with) in a friendship network of 7th grade school students, and two layers (synaptic, ionic) of the neuronal network of C-elegans respectively. Three real-world ultilayer networks fro diverse fields, ranging fro social networks, friendship networks to neurological networks, exhibit a large positive correlation in the degree distribution aong their layers. In such cases a relatively parsionious shared degree null odel described in (2.6) ight be appropriate. Figures 1(a2), (b2), and (c2) show that the shared degree null odel fits well to the degree distribution of the layers in these networks respectively. 4 follow 1 best_friends 1 2 ionic retweet ention work_with get_on_with 1 1 synap follow ention 1 retweet 1 best_friends get_on_with work_with 1 ionic synap (a1) (b1) (c1) 3 2 SDention SDfollow SDretweet ention follow retweet SDgw SDbf SDww get_on_with best_friends 1 1 work_with SDsyn SDion synap ionic (a2) (b2) (c2) Figure 1: Observed degree distributions across layers in (a1) Twitter network of British MPs, (b1) 7th grade school students and (c1) C-elegans neuronal network; (a2, b2, c2) degree distribution fitted with a shared degree odel plotted as scatter plots with the observed degrees in each layer. 7

8 2.3 Null odel selection By dissociating the degree based null odel fro the counity structure coponent of a odularity easure, we ake it easier to first choose the appropriate null odel based on observed degree pattern, and then choose an appropriate odularity easure based on the null odel. In this context a question that naturally arises is, given a ulti-layer network how would one choose between an independent degree and a shared degree null odel? A hypothesis testing based fraework for odel selection was developed in Yan et al. (14) for selecting between the ordinary stochastic blockodel (SBM) and the degree corrected block odel (DCBM) in single layer networks. In our case, however, the question is not between choosing degree correction or not, rather between what kind of degree correction is required; an independent degree odel or a shared degree odel. Here we provide a preliinary guidance through a siple approxiate odel selection procedure based on likelihood ratio calculations. The null hypothesis is that the SD odel is the true data generating odel whereas the alternative hypothesis is that it is the ID odel that generates the data. The log-likelihood fro the ID odel (2.) can be written as l(a; θ) = {A () ij (log(θ () i ) + log(θ () j )) θ () i θ () j } log(a () ij!). i<j The axiized log-likelihood with the approxiate MLE solution as in (2.4) is i<j Λ 1 = i k () k () i i log L + c 1, (2.7) 2L () where c 1 = are ˆθ i = k() i 2L i<j log(a() ij!). For the SD odel (2.6) the approxiate MLE solutions and ˆβ = L(), and hence the axiized log-likelihood is L Λ 2 = i k () i log k() i + 2L ( ) L L () () log L + c 1. (2.8) L Note that c 1 and the L ters cancel when we subtract (2.8) fro (2.7) to copute the logarith of likelihood ratio. The standard theory on likelihood ratio tests would suggest that 2(Λ 1 Λ 2 ) is distributed as a χ 2 distribution with degrees of freedo MN (N +M 1). However there is soe concern about the validity of the assuptions under which the asyptotic distribution of the test statistic is usually derived in the present case. particular the effective saple size in sparse ulti-layer graphs (average degree per layer is O(1)) is O(MN) and the ID odel contains MN paraeters, leading to the failure of standard asyptotics (Yan et al., 14). Hence we use paraetric bootstrap to copute the epirical distribution of the likelihood ratio test statistic under the null odel. In 8 In

9 particular we fit the SD odel to the data and estiate the paraeters ˆθ i, ˆβ. We then generate a large nuber of networks (we used in data analysis) fro the SD odel with the estiated paraeters and copute our test statistic on each of those networks. The values of the statistic for an epirical distribution which is subsequently used to calculate p-value for the test. Once a null odel selection is perfored, the user can choose an appropriate odularity easure aong the ones we define in upcoing sections. 3 Multi-layer configuration odels and odularities In light of the discussion of the previous section, conditioned upon the observed ultidegree sequence k = {k 1,..., k N }, we define several extensions of the configuration odel for ulti-layer networks. In the first odel we assue the degree sequence in one layer is independent of the degree sequence in other layers. Consequently the layers in the null networks, conditioned on the degree sequence, are independent of each other. The probability of a connection of type between nodes i and j is given by k() i k () () j i k () j k. 2L () 1 2L () Here is the justification. The nuber of stubs or half-edges (one end of an edge) coing out of node i is k () i. For every stub, there are 2L () 1 stubs or half-edges available to connect to. Out of these half-edges, the nuber of half-edges that will lead to an edge of type between nodes i and j are k () j. It is a general convention in configuration odel to write 2L () in the denoinator instead of 2L () 1 for siplification as L () is generally quite large. We call this odel the independent degree ulti-layer configuration odel (IDMLCM). Using this odel as a null odel, we then define our first extension of Newan-Girvan (NG) odularity, which we call the ulti-noralized average (MNavrg) since the expression is effectively an average of the layer-wise noralized NG odularities: { } Q MNavrg = 1 1 A () M 2L () ij k() i k () j δ(z 2L () i, z j ) i,j { } = 1 1 e () M 2L () qq (e() q ) 2. (3.1) 2L () q There is a siilar version of this odularity that has appeared in the literature before for counity detection in ulti-layer networks (the intra-layer part of the ulti-slice odularity in Bassett et al. (13) and Sarzynska et al. (16), coposite odularity in Liu et al. (14)), where the null odel used is identical to IDMLCM, however the layer-wise odularity scores are not noralized before adding together. To iniize the ipact of varying sparsity across layers, it is prudent to noralize the easures by the density of layers before aggregating (Paul and Chen, 1). 9

10 In the second odel, we bring a regularization effect into the null odel by sharing degree across layers for networks. This will be particularly useful in the case when all the network layers are extreely sparse. If we do not distinguish the stubs in ters of type, then according to the siple configuration odel, the probability of an edge (of any type) between nodes i and j will be given by ( k() i )( k() j ). Now, given that these two nodes i 2L 1 and j are the endpoints of a randoly chosen edge, we look into the probability that an edge of type is fored between the nodes. This probability can be odeled in three different ways, leading to three different shared degree ulti-layer configuration odels (SDMLCMs) and consequently three different odularities. To illustrate this, we abuse the notation a little bit and write the probability of an th type of connection between nodes i and j as P (i, j, ) = P (i, j) P ( i, j), where P (i, j) denotes the unconditional probability of an edge between nodes i and j, and P ( i, j) denotes the conditional probability of an th type of edge between i and j given there is an edge between the. We can use the global frequency of the occurrence of the th type of edge aong the ulti-layer network as an estiate of the probability P ( i, j). We call the resulting odularity score shared degree average (SDavrg) since we are using a global estiate for each node. The odularity can be written as { Q SDavrg = 1 1 A () M 2L () ij L() ( k() i )( } k() j ) δ(z 2L 2 i, z j ) i,j { = 1 1 e () M 2L () qq L() ( } e() q ) 2. (3.2) 2L 2 q We can also use local estiates of this probability of the th type of connection that is specific to a node or a group of nodes. One such easure would be the ratio of the th type of stubs to all stubs in the counities to which nodes i and j belong. Instead of looking into the entire network, this ratio easures the relative frequency of the occurrence of the th type of edge involving stubs eanating fro the groups of either i or j. Hence this is a ore local easure of the relative density of the th type of edges. The total nuber of edges (of any type) that have an end in the group to which i and j belong are (e() z i + e () z j ). Out of these only (e () z i + e () z j ) are of type. Hence according to this estiate, the probability of an th type of stub eanating out of i or j is e () z i +e z () j (e() z i +e () The corresponding odularity, which we call shared degree local (SDlocal) to highlight the fact that it uses a ore local estiate of the expected nuber of edges, is given by { Q SDlocal = 1 1 A () M 2L () ij (e() z i + e () z j )( k() i )( } k() j ) δ(z + e () i, z j ) z j )2L i,j (e() z i z j ).

11 = 1 M q { 1 e () 2L () qq e() q ( } e() q ). (3.3) 2L The last one is also a local easure of P ( i, j), but a ore direct easure. We take the ratio of the expected nuber of edges of type to the total nuber of expected edges between the groups to which i and j belong according to the configuration odel. Clearly, as per the single layer configuration odel, the expected nuber of edges of type between groups z i and z j is e () z i e () z j /2L. Hence in the ulti-layer context, given that there is an edge between the groups z i and z j, the probability that the edge would be of type is given by e () z i e () z j (e() z i e () z j ). We call this odularity the shared degree ratio (SDratio) to highlight the fact that it takes the ratio of the expected nuber of edges of type to the total expected nuber of edges. The expression is as follows, { Q SDratio = 1 1 A () M 2L () ij (e() z i e () z j )( k() i )( } k() j ) δ(z i,j (e() z i e () i, z j ) z j )2L { = 1 1 e () M 2L () qq (e() q ) 2 ( } e() q ) 2 2L. (3.4) (e() q ) 2 q 4 Degree corrected ulti-layer stochastic blockodel Our next set of odularities are based on a statistical odel of rando ulti-layer networks which we call the degree corrected ulti-layer stochastic blockodel (DCMLSBM). This odel can be thought of as a odel with counity structure that is built upon the ultilayer expected degree (MLED) odel. Hence the odularities outlined in this section have the MLED odel as its null odel. Both the ulti-layer stochastic blockodel (MLSBM) and the DCMLSBM have been previously used in the literature as a statistical odel for ulti-layer networks with block structures (Valles-Catala et al., 16; Han et al., 1; Peixoto, 1; Paul and Chen, 1; Taylor et al., 16; Stanley et al., 1). In what follows we first define various degree corrected extensions of the two odels analyzed in Paul and Chen (1), the MLSBM and the restricted MLSBM (RMLSBM) based on the ultilayer expected degree null odels defined earlier, and then derive likelihood odularities (Bickel and Chen, 9; Karrer and Newan, 11) fro the. Several generative odels based on ulti-layer extensions of SBM were developed in Peixoto (1) with priors on the paraeters and a Bayesian odel selection procedure was developed. In this paper we restrict ourselves to variations only in ters of degree through the MLED null odel and variation in ters of block paraeters through a restriction siilar to Paul and Chen (1). 11

12 It has been argued previously in the literature that the odularities based on the single layer SBM and DCBM are ore widely applicable than ad hoc fors of quality functions and often reedy soe of the deficiencies of the later (Ball et al., 11; Bickel and Chen, 9). Since the Newan-Girvan odularities consider only the intra-counity edges and do not take into account the inter-counity edge structure directly (although they are used indirectly to copute the expected intra-counity edges), they iss soe of the inforation taken into account by the likelihood odularities which consider both intra and inter counity edges. As an exaple, the NG odularities fail to detect dissortative ixing/heterophilic counities and perfor poorly if the counity sizes are unbalanced, while likelihood odularities are robust to such cases. Siilar to the single layer stochastic blockodel, the ulti-layer stochastic blockodel also assues stochastic equivalence of nodes for a given type of edge within each counity and hence fails to odel real life ulti-layer networks with degree heterogeneity. To reedy the situation for single layer graphs, degree corrected blockodel (DCBM) was proposed by Karrer and Newan (11). They also characterized the odularity based on this odel as a Kullback-Leibler divergence between this odel and a null odel without the counity structure. Such a null odel would be equivalent to the Chung-Lu expected degree rando graph odel. Hence both the MLSBM and RMLSBM should be corrected for degree variation using ulti-layer extensions of the expected degree null odels described earlier in (2.) and (2.6). Throughout the section we assue that the edges A () ij between two nodes i and j are fored independently following Poisson distribution, given the counity assignents of the nodes z i and z j and the degree vectors of the two nodes k i and k j : A () ij (z i = q, z j = l) P oisson(p () ij ). We odel the Poisson ean paraeter for the ulti-layer stochastic block odel in four different ways with varying nuber of paraeters. The first two of the odels have the independent degree expected degree odel as their null odel. We write the first odel as P () ij = θ () i θ () j π (), i, j {1,..., N}, {1,..., M}, q, l {1,..., K}, (4.1) with the identifiability constraints = 1, {1,..., M}, q {1,..., K}. i:z i =q θ () i We call this odel the degree corrected ulti-layer stochastic blockodel (DCMLSBM). The next odel is the degree corrected version of the RMLSBM, which we call the DCRMLSBM, P () ij = θ () i θ () j π, i, j {1,..., N}, {1,..., M}, q, l {1,..., K}, (4.2) 12

13 with identifiability constraints i:z i =q θ () i = 1, {1,..., M}, q {1,..., K}. The above two odels have M(N K) + MK(K + 1)/2 and M(N K) + K(K + 1)/2 paraeters respectively. In the next two odels the underlying null odel is the shared degree expected degree odel, and hence the node specific degree paraeter is coon across the layers. We call the odels the shared degree ulti-layer stochastic blockodel (SDMLSBM) and the shared degree restricted ulti-layer stochastic blockodel (SDRMLSBM) respectively. The odels can be written as P () ij = θ i θ j π (), i, j {1,..., N}, {1,..., M}, q, l {1,..., K}, (4.3) with identifiability constraints i:z i =q θ i = 1, q {1,..., K}, and P () ij = θ i θ j β π, i, j {1,..., N}, {1,..., M}, q, l {1,..., K}, (4.4) with identifiability constraints β = 1, i:z i =q θ i = 1, q {1,..., K}. These two odels have N K + MK(K + 1)/2 and N K + M 1 + K(K + 1)/2 paraeters respectively. Clearly the four odels are nested odels with different nuber of paraeters. We consider an asyptotic scenario here to estiate the nuber of paraeters each of the odels will have asyptotically. If we assue growing nuber of counities as K = O(N 1/2 ), then the first three odels have O(MN) paraeters whereas the last odel has O(ax(N, M)) paraeters. On the other hand, if we let K reain constant, then the first two odels have O(MN) paraeters while the last two odels have O(ax(N, M)) paraeters. For the odels that have considerably fewer nuber of paraeters, we expect the axiu likelihood estiates to have less variance at the expense of soe bias. Such gain in ters of low variance at the expense of bias would be advantageous in situations where the network layers are sparse (Paul and Chen, 1). 13

14 4.1 Modularities based on Multi-layer stochastic blockodels To derive odularities based on the odels defined earlier, we take the profile likelihood approach siilar to Bickel and Chen (9) and Karrer and Newan (11), where we axiize the conditional log-likelihood l(a z; P ) of the adjacency atrix given the group assignents. This is done by plugging in the MLE of the paraeter set P conditional on z. The conditional log-likelihood for DCMLSBM can be written as (dropping the ters that do not depend on the class assignent) l(a; z, π, θ) = M =1 i<j = i {A () ij k () i {log(π z () i z j ) + log(θ () ) + log(θ () log(θ () i ) + q l i {e () j )} θ () i θ () j π z () i z j } log(π () ) π () }. (4.) The MLE of π can be obtained by a straightforward differentiation of the log-likelihood function. However to find the MLE of θ under the identifiability constraints we need to use the Lagrange ultipliers as follows. The objective function to be optiized is Λ(θ, λ) = i k () i log(θ () i ) + λ q ( q i:z i =q Solving for θ and λ we obtain the following solutions for the MLE: ˆθ () i = ˆπ () = k () i i:z i =q k() i i,j: z i =q,z j =l = k() i A () ij e () q, = e (). θ () i 1). Plugging in these estiates into the log-likelihood function gives the axiized log-likelihood function as l(a; z) = i = q l k () i log e () ( k () i e () q ) log(e () ) + q l q l e () {e () + i log(e () ) e () } k () i log(k () i ) q e () q log(e () q ). (4.6) Now ignoring the ters that do not depend on the class assignent (the 2nd and 3rd ters), we get l(a; z) = q l e () log(e () ) q 14 e () q log(e () q ). (4.7)

15 It is easy to see that this axiized likelihood function can be written as Q = { ( ) } () e e () log, e () q e () l which we call the un-noralized odularity function. q l However, the odularity function at this for take ore contribution fro denser layers as copared to the sparser ones and are not appropriate for counity detection in ulti-layer networks. Since we are interested in inference about the underlying counity structure across the layers, we want to capture the signals available fro each layer irrespective of its density and cobine the together. Hence we need to noralize this odularity function layer-wise. The role of noralization is especially iportant in situations where the layers of a network represent quite different relationships. In those situations it ay happen that a dense network is uninforative and a sparse one is quite inforative. The deficiencies in un-noralized odularity are conceptually the sae as those for which an aggregate of adjacency atrices across layers fails to detect the counity signal. Apart fro reducing the undue influence of highly dense layers on the objective function, noralization helps to retain the Kullback-Liebler (KL) divergence based probabilistic interpretation of likelihood odularity (Karrer and Newan, 11). Since by assuption, given the label assignents of the nodes, the network layers are fored independently each with a Poisson distribution, the KL divergence of the odel with block structure fro a null odel without a block structure is an indicator of the goodness of fit for that odel in each of the network layers. Hence uch like the configuration odel case, the ulti-layer odularity in the stochastic blockodel case can also be viewed as an aggregation of divergences in the coponent networks. For this purpose we noralize A () ij by twice the total nuber of edges in the th layer, 2L (). Consequently quantities that are derived fro A also gets noralized accordingly. The odularity after proper noralization can be written as Q DCMLSBM = { ( () e log 2L () q l e () /2L () (e () q /2L () )(e () l /2L () ) ) }. (4.8) Siilarly for DCRMLSBM, the conditional likelihood along with the constraints can be siplified as (dropping the ters not dependent on the paraeters) l(a; z, π, θ) = M =1 i<j = i {A () ij k () i {log(π zi z j ) + log(θ () i log(θ () i ) + 1 q l {e () ) + log(θ () j )} θ () i θ () j π zi z j } log(π ) π }. (4.9)

16 The MLEs of θ and π under the constraints are once again obtained by the ethod of Lagrange ultipliers as explained before: ˆθ () i = ˆπ = k () i i:z i =q k() i i,j: z i =q,z j =l = k() i e () q A () ij, = e (). The profile likelihood odularity function can be obtained by plugging in the MLEs into the log-likelihood equation and then dropping the ters that do not depend on the class assignent: l(a; z) = q l e () log( and with proper noralization the odularity is Q DCRMLSBM = { () e log 2L () q l e () ) q ( (e() /2L () ) (e () q /2L () )(e () l /2L () ) k q () log(e () q ). (4.) ) }. (4.11) The odularities for the two shared degree odels can also be derived in a siilar fashion. For SDMLSBM, the conditional log-likelihood without the ters independent of the paraeters is l(a; z, π, θ) = = i i<j {A () ij {log(π z () i z j ) + log(θ i ) + log(θ j )} θ i θ j π z () i z j } (4.12) k () i log(θ i ) + q l {e () The axiu likelihood estiates of the paraeters are ˆθ i = ˆπ () = k() i i:z i =q k() i i,j: z i =q,z j =l A () ij = log(π () ) π () }. k() i e() q = e (), and hence the profile likelihood odularity function with noralization is Q SDMLSBM = { ( ) } () e (e () /2L () ) log 2L () q l (e() q /2L () ). (4.13) (e() l /2L () ) For SDRMLSBM, the conditional log-likelihood without the ters independent of the paraeters is l(a; z, π, θ) = i<j {A () ij {log(π zi z j ) + log(β ) + log(θ i ) + log(θ j )} θ i θ j β π zi z j } 16, (4.14)

17 = i k () i log(θ i ) + q l e () {log(β ) + log(π )} π. q l The axiu likelihood estiates of the paraeters are ˆθ i = k() i i:z i =q k() i = k() i e() q, ˆβ = ˆπ = q l e() q l e() i,j: z i =q,z j =l = L() L, A () ij = e (). Ignoring the ters not dependent on the label assignents and after noralization, the likelihood odularity function is Q SDRMLSBM = { ( () ) } e log (e() /2L () ) 2L () (e() q /2L () ). (4.1) (e() l /2L () ) q l Coputation We adapt the Louvain algorith (Blondel et al., 8) to ulti-layer network settings for coputing both the nuber of counities and the optial partitions using the ulti-layer odularities discussed in this paper. Siilar to the original Louvain algorith, the odified algorith is also a two-phased fast greedy optiization ethod for counity detection. However when the nuber of counities is known, we eploy a ulti-layer version of the algorith used by Karrer and Newan (11). The algorith is a Kerninghan-Lin type graph partitioning algorith and is a non-greedy approach which leads to ore accurate results for a known K than the Louvain approach. It however requires a starting partition and the final solution depends on the quality of the initial value. This algorith often gets stuck in a local iniu and hence we use ultiple starting points to iprove the quality of partitions. While we skip the details of the algoriths here, we note that for both algoriths the execution speed heavily depends on the ability to quickly copute the increase in odularity score for a one step change without having to recopute the odularity value for the entire network. For the Louvain algorith, this one step change is the increase in odularity for reoving a node i fro its own counity (i.e., the counity which only contains i) and oving it to the counity of one of its neighbors j, say counity q. For Kerninghan-Lin algorith, it is the increase in odularity for oving a node i fro its counity r to another neighboring counity s. For each one of the odularities, 17

18 we have derived this one step change. Here we only give exaple forulas for three MLCM based odularities to copute the one step change in Louvain algorith. We define an additional notation. Let e () i,q denote the nuber of type edges fro node i to a neighboring counity q. Then Q MNavrg = { } 1 e () 2L () i,q e() q k () i, L () Q SDavrg = { 1 e () 2L () i,q L() ( e() q )( } k() i ), L L Q SDlocal = { 1 e () 2L () i,q k() i ( e() q ) + ( } k() i )e () q. 2L 6 Siulation results In this section we nuerically copare the perforance of the odularity scores for counity detection through a siulation study. Since the true class labels of the nodes are known in siulated data, we copare the class assignents fro different ethods with the true labels. This coparison involves two stages. Since the Louvain algorith applied to the odularities can identify the nuber of clusters autoatically, an effective counity detection in situations where the nuber of counities is unknown ust first identify the nuber of counities correctly. Hence the first step of coparison is in ters of the nuber of counities detected. The etric used for this purpose is the ean square error of the nuber of classes recovered across the repetitions. The second step would be to copare the goodness of the class assignents. As a etric, we use the noralized utual inforation (NMI) which is an inforation theory based siilarity easure between two cluster assignents. This etric takes values between and 1, with indicating that the class assignent is rando with respect to the true class labels and 1 indicating a perfect atch with the true class labels. Since the easure is noralized it can be used to copare clustering solutions with different nuber of clusters as well. Finally assuing that the nuber of clusters is known in advance, we copare the clustering accuracy of the odularity scores in ters of NMI. All the results reported throughout the section are the average of the etric across 4 repetitions of the siulations. We then copare the relative perforance of the ulti-layer odularities along with a baseline ethod for coparison, the NG odularity on the aggregated adjacency atrix. The coparison is perfored under various settings on the nuber of nodes N, the nuber of counities K and the average degree per layer. 18

19 6.1 Data generation We generate data fro both the ulti-layer stochastic blockodel and its degree corrected version. For this purpose, we first generate N node labels independently fro a K class ultinoial distribution. The network counity sizes are varied by varying the paraeters of the ultinoial distribution with equal paraeters leading to balanced counities. We next generate the M layers using the stochastic blockodel each with a different connectivity atrix. In our stochastic blockodel, the connectivity atrices give larger probabilities for intra-block edges in coparison to the inter-block edges. The general structure of the connectivity atrix has (ρλ 1,..., ρλ k ) in the diagonal and the sae eleent ρɛ in the off diagonal. We control the signal strength in different layers by varying the ratio of λ s with ɛ fro layer to layer while we control the average degree per layer by varying the paraeter ρ. Throughout the section strong signal eans that each of λ i is roughly 3-4 ties greater than ɛ and weak signal eans each of λ i is only arginally greater than ɛ. If the degree corrected stochastic blockodel is used for data generation, then the degree paraeter is generated using a power law distribution, one paraeter for each node in the shared degree odel and one paraeter for each node in each layer in the independent degree odel. 6.2 Nuber of counities unknown In our first siulation we assue the nuber of counities is unknown and use the Louvain algorith to autoatically deterine the nuber of counities along with the partition. We consider two scenarios in ters of the coposition of the coponent layers: the first one having sparse strong signal in all layers and the second one having ixed sparsity and signal quality in its layers where strong and ixed are as described in the previous paragraph sparse strong signal With all coponent layers being sparse and strong in signal quality, we design two siulation scenarios. First we fix the nuber of counities K at 3 and the nuber of nodes N at 8 while we vary the average density of the ulti-layer graph. Figure 2(a) shows the results of this siulation. The top figure is a coparison in ters of NMI of the counity assignent and the botto figure is a box plot of the nuber of counities detected. While there is not uch difference aong the odularities copared in ters of the NMI, the box plots for the nuber of classes detected show that the shared degree ethods SDavrg and SDlocal are closer to the correct nuber of counities (which in this case 19

20 Noralized utual inforation MNavrg SDavrg SDratio SDlocal agg. graph Average degree of all layers cobined Noralized utual inforation MNavrg SDavrg SDratio SDlocal aggregate graph Nuber of nodes N 16 average degree cobined : agg MNavrg SDavrg SDratio SDlocal 8 average degree cobined : agg MNavrg SDavrg SDratio SDlocal 14 average degree cobined : agg MNavrg SDavrg SDratio SDlocal. average degree cobined : agg MNavrg SDavrg SDratio SDlocal agg MNavrg SDavrg SDratio SDlocal agg MNavrg SDavrg SDratio SDlocal agg MNavrg SDavrg SDratio SDlocal agg MNavrg SDavrg SDratio SDlocal (a) (b) Figure 2: Coparison of the perforance of various ulti-layer odularities for data generated fro the DCBM with independent degrees. The layers are sparse and the signal is strong across all layers. (a) The nuber of nodes and nuber of counities are fixed at 8 and 3 while the average degree of the nodes across all layers cobined is increased. (b) The nuber of nodes is increased fro 3 to 6 while the nuber of counities is fixed at 3. In both cases, the top figure is the coparison in ters of NMI of the counity assignent and the botto figure is the box plot of the nuber of counities detected. is 3) as copared to the MNavrg and the aggregate graph in sparser networks. SDavrg and SDlocal also converge to the correct nuber of counities faster than MNavrg and aggregate graph as the network becoes denser. In the second siulation scenario, we fix the nuber of counities at 3 and vary the nuber of nodes fro 3 to 6. Figure 2(b) shows the results of this siulation where as before the top figure is the coparison in ters of NMI of the counity assignent and the botto figure is the box plot of the nuber of counities detected. Siilar to the previous case, we observe that the nuber of counities detected by SDavrg and

21 SDlocal converges to the true nuber of counities faster than MNavrg and aggregate graph as the nuber of nodes increases ixed signal layers In this siulation, the coponent layers of the ulti-layer graph vary in ters of both sparsity and signal strength in the following way: two layers are sparse and have strong signal, two layers are dense and have weak signal, while one layer is dense and have strong signal. This scenario is extreely useful to test the ethods in a very general situation where not all layers are inforative and represent interactions that vary in scales. In real applications, we can not expect all layers of a ulti-layer network to contain high quality signals about the counity structure. Soe of the layers will have a high aount of noise interfering with the genuine signal fro other layers. Likewise the layers ight represent very different relations and hence vary widely in density. We design three siulation setups: for the first one we vary the average density per layer while fixing N and K at 8 and 3 respectively, for the second case we fix K at 3 and let N grow fro 3 to 6 in steps of, and for the third one we fix N at 8 and let the nuber of counities grow fro 3 to 9. As with the previous case we report both the coparison in ters of NMI and box plots of the nuber of clusters detected. For the first case, the results presented in Figure 3(a) show that MNavrg along with the shared degree ethods outperfor the aggregate graph consistently in ters of both NMI and the accuracy of the nuber of counities detected as the layers becoe denser. We observe a slight under-perforance of MNarvg copared to SDlocal, SDratio and SDavrg in ters of the accuracy of the nuber of counities detection when the average density of layers is lower, but eventually their perforance is coparable. Figure 3(b) shows very siilar observation for the second case where the nuber of nodes is increasing. Finally Figure 3(c) shows that with increasing nuber of counities perforance deteriorates in all the odularities, however the drop in perforance is faster for aggregate graph and SDavrg copared to the others. Fro this siulation we see that although the aggregate graph fails to provide good perforance, the shared degree ethods, in spite of cobining inforation fro all layers in their null odel, perfors at par with the MNavrg. Hence, this shows that the shared degree ethods not only perfor better in sparse networks, but are also robust against the presence of high degree of noise. 21

22 Noralized utual inforation MNavrg SDavrg SDratio SDlocal agg. graph Average degree per layer 12 average degree per layer : agg MNavrg SDavrg SDratio SDlocal 8 average degree per layer : agg MNavrg SDavrg SDratio SDlocal average degree per layer : agg MNavrg SDavrg SDratio SDlocal 9 average degree per layer : agg MNavrg SDavrg SDratio SDlocal (a) Noralized utual inforation MNavrg SDavrg SDratio SDlocal aggregate graph Nuber of nodes N agg MNavrgSDavrgSDratioSDlocal agg MNavrgSDavrgSDratioSDlocal (b) agg MNavrgSDavrgSDratioSDlocal agg MNavrgSDavrgSDratioSDlocal 1. Increasing Counities, dense ixed signal Correct Clustering Rate MNavrg SDavrg SDratio SDlocal agg. graph Nuber of Counities K (c) 4.2 True nuber of counities : agg MNavrg SDavrg SDratio SDlocal 11 True nuber of counities : agg MNavrg SDavrg SDratio SDlocal 9. True nuber of counities : agg MNavrg SDavrg SDratio SDlocal 12 True nuber of counities : agg MNavrg SDavrg SDratio SDlocal Figure 3: Coparison of the perforance of various ulti-layer odularities for data generated fro the independent degree DCBM. Both sparsity and signal quality are ixed across different layers. (a) N and K are fixed at 8 and 3 while the average degree of the nodes across all layers cobined is increased. (b) Increasing nodes with fixed K = 3. (c) Increasing nuber of counities with fixed N = 8. In all cases, the left side figure is the coparison in ters of NMI and the right side figure is the box plot of the nuber of 22 counities detected.

23 6.3 Nuber of counities known In this section we assue the nuber of counities K to be known in advance and assess the effectiveness of the MLED based ulti-layer odularities along with MNarvg. For this siulation (Figure 4(a) and (b)) we fix N, K and M at, 2 and 4 respectively while we let the average degree density of all layers together to grow. In the first case (Figure 4(a)) the counity sizes are balanced with roughly half of the nodes belonging to either cluster, and in the second case (Figure 4(b)) the counity sizes are unbalanced with 3% of the nodes belonging to one cluster and the reaining 7% belonging to the other. The layers are ixed in ters of density and signal quality. As entioned in Section, the Kerninghan-Lin type algorith used when the nuber of counities is known requires an initial labeling. For this purpose we randoly perute the labels of % of the nodes, keeping the correct labels for the rest of the nodes, siilar to Bickel and Chen (9). MNavrg MNavrg of all layers together of all layers together (a) (b) Figure 4: Coparison of perforance with known nuber of counities of various MLED odularities along with MNavrg for data generated fro the stochastic blockodel. The layers are ixed in sparsity and signal quality. The average degree on nodes across all layers are increasing while N, K and M are fixed at, 2 and 4 respectively. We consider two cases: (a) balanced counity sizes (roughly half of the nodes belonging to either cluster), and (b) unbalanced counity sizes (3% of the nodes belonging to one cluster and 7% belonging to the other). We notice that the DCRMLSBM and SDRMLSBM along with the aggregate graph ethod perfor worse than the DCMLSBM and SDMLSBM as expected by the fact that the signal content varies across layers. Moreover, as expected fro our discussion in odel selection for MLED based odularities, we observe that the shared degree ethods perfor 23

24 considerably better than the corresponding independent degree ethods in this sparse regie, especially when the average density per layer is low. As an exaple, we note that the shared degree ethod SDMLSBM outperfors the corresponding independent degree ethod DCMLSBM throughout the range of the siulation. Lastly, the ethod MNavrg based on the ulti-layer configuration odel perfors quite worse copared to the MLED odularities. This observation is consistent with the siilar observations in case of single networks by Bickel and Chen (9) and Zhao et al. (12). 7 Real data analysis In this section we analyze a variety of ulti-layer network datasets fro different fields including social networks (three Twitter networks), friendship networks (Vickers-Chan s grade 7 peer network) and biological networks (Neuronal network of C-elegans). We deonstrate the effectiveness of the ulti-layer ethods discussed in this paper in detecting eaningful clusters in the networks. 7.1 Twitter datasets We consider three real world ulti-layer network datasets fro the social network Twitter corresponding to interactions aong (a) British Mebers of Parliaents (MPs), (b) Irish politicians and (c) Football players fro the English preier league clubs. All the datasets were curated by Greene and Cunningha (13). For each of the networks we consider three network layers corresponding to the twitter relations entions, follows and retweets aong a set of nodes. We apply the ulti-layer counity detection ethods discussed in this paper to cluster the nodes. The ground truth counity labels are also provided by Greene and Cunningha (13) and correspond to different underlying aspects of the nodes. For exaple, in the political networks (UK MPs and the Irish politicians) the ground truth corresponds to the political affiliation of the individuals, whereas in the network of preier league football players the ground truth corresponds to the teas (English preier league clubs) the players belong to. UK MPs The first dataset consists of twitter interactions between 419 British MPs. We consider only those nodes which are connected by at least one connection in all the three layers. This reduces the nuber of nodes to 381. However, there are seven MPs in the tried network who do not belong to any ajor political party (naed other in the ground 24

25 truth). Hence we reove those nodes and analyze the network with the reaining 374 nodes. The ground truth counity assignent contains 12 Conservative, 178 Labour, 39 Liberal deocrats and SNP. The highly correlated layer-wise degree distribution of this network is presented in Figure 1. The nuber of counities detected and the NMI of the clustering result with ground truth for different counity detection algoriths are listed in Table 1. Clearly the ulti-layer ethods perfor better than the single layer ethods with several of the ethods (MNavrg, SDlocal, NG odularity on aggregate graph) obtaining perfect clustering solution. Table 1: The nuber of counities detected and the NMI of clustering for different counity detection ethods for Twitter UK politics data. The counity naes are identified by optial assignent. Method Conservative Labour Lib. De SNP no. co. NMI Ground truth Single (ention) Single (follow) Single (retweet) MNavrg SDarvg SDlocal SDratio Aggregate Irish politicians The dataset on Irish politicians consists of twitter interactions aong 37 Irish politicians. The ground truth consists of party affiliations of the into Republic of Ireland s six ajor political parties. However 23 of the are independents and do not belong to any parties. We analyze the network both with and without these independents. The corresponding results are reported in Table 2(b) and 2(a) respectively. As expected the NMI with the ground truth is better when the network is analyzed without independents. In both cases, the ulti-layer ethods generally outperfor the single layer ethods. The highest NMI in both cases are obtained by the ulti-layer ethod SDlocal. For the network without the independents, three ulti-layer ethods, MNavrg, SDlocal and SDratio, ake only one incorrect assignent. When the nuber of counities is assued to be known as 6, all ulti-layer odularities perfor very well and iscluster only one node (Table 2(c)). 2

26 Table 2: The NMI of clustering fro different counity detection ethods for Twitter Irish politics data. The counity naes are identified by optial assignent, no. co. stands for nuber of counities detected. Method fg Labour ff sf ula green no. co. NMI Ground truth Single (ention) Single (follow) Single (retweet) MNavrg SDarvg SDlocal SDratio Aggregate (a) Without Independents Method fg Labour ff sf ula green ind no. co. NMI Ground truth Single (ention) Single (follow) single (retweet) MNavrg SDarvg SDlocal SDratio aggregate (b) With Independents NG NG NG DCBM DCBM DCBM aggregate (ention) (follow) (retweet) (ention) (follow) (retweet) DCMLSBM DCRMLSBM SDMLSBM SDRMLSBM MNavrg SDavrg SDlocal SDratio (c) With known nuber of counities 26

27 Our results on both political networks show how ulti-layer ethods can correctly identify eaningful counity structure in networks. The near-optial clustering for soe of the ulti-layer ethods is quite surprising and quite rare in network counity detection. This is perhaps an indication of how politicians heavily counicate with people within their political ideologies and seldo counicate with people of different ideologies. Hence the social interaction patterns of politicians easily reveal their political affiliations. English preier league football players The last twitter dataset we analyze consists of interaction aong sports personalities; the football players in the English Preier League. As before, we keep only those nodes who are connected to at least one other node in each of the network layers. The ground truth for this dataset consist of footballers assigned to the football clubs that they play for. The nuber of clusters detected along with the NMI of the solution with the ground truth are given in the Table 3(a). We see that alost all ethods, single layer and ultilayer, underestiate the nuber of clusters. We copare the perforance of these ethods assuing that the nuber of clusters is known () in Table 3(b). We note that the MLCM based ulti-layer ethods clearly outperfor not only the single layer odularities and the baseline aggregate ethod, but also the ulti-layer block odel odularities. The single layer NG odularities also outperfor single layer DCBM odularities. In both groups of odularities, ulti-layer odularities perfor better than their single layer counterparts. Moreover one of the MLCM based shared degree ethod, the SDavrg, perfors the best aong all the ethods. This is expected because when the nuber of counities K is large, the nuber of paraeters to be estiated in block odel becoes large, resulting in poor estiation. Hence, the NG odularities outperfor the block odel ones, while the MLCM odularities outperfor the ulti-layer block odel ones. For the sae reason, aong the ulti-layer block odel odularities, the restricted ethods (RMLSBM and SDRMLSBM) with considerably less block odel paraeters to estiate perfor better than the unrestricted ones. 7.2 C-elegans Next we analyze a dataset fro biology: the neuronal network connectoe of a neatode Caenorhabditis elegans. It is the only organis whose wiring diagra or connectoe of the entire nervous syste is known and apped (Chen et al., 6; White et al., 1986). For this dataset and the next one (grade 7 students) we use the versions of the dataset shared by De Doenico et al. (1). The present network consists of 279 neurons connected by two types of connections, a cheical link or synapse and an ionic channel (Nicosia and 27

28 Table 3: Perforance of different counity detection ethods in ters of (a) nuber of clusters detected and NMI of clustering, and (b) NMI of clustering with known nuber of counities for Twitter English Preier League dataset Method no. co. NMI Ground truth Single (ention) Single (follow) Single (retweet) 14 7 MNavrg SDarvg 12.8 SDlocal SDratio Aggregate (a) Method NMI NG (ention).8848 NG (follow).922 NG (retweet).79 DCBM (ention).7243 DCBM (follow).72 DCBM (retweet).676 DCMLSBM.7898 DCRMLSBM.882 SDMLSBM.7476 SDRMLSBM.812 MNavrg.9176 SDarvg.9613 SDlocal.9129 SDratio.947 Aggregate.8896 (b) Latora, 1), and is a weighted network. This network was previously analyzed both as a single layer network with the two layers collapsed together (Sohn et al., 11; Varshney et al., 11; Fortunato and Barthéley, 7) and as a ulti-layer network (Nicosia and Latora, 1). We convert both layers into undirected network but keep the edge weights. Note that all our odularity easures can naturally handle positive edge weights with the weighted adjacency atrix replacing the binary adjacency atrix in all the calculations. Further, we consider only the nodes which are connected with at least one connection in both layers. The resulting network layers have 23 nodes and 169 and 17 edges in the synapse and ion layers respectively. We apply the hypothesis testing procedure developed in Section 2.3 to test between SD and ID null odels to this data. The LRT statistic value is The paraetric bootstrap distribution is shown in Figure (a). With an epirical p-value less than.1, we reject the null hypothesis of SD odel and conclude that the ID odel should be 28

29 used as null odel for counity detection. Note that using the chi-squared distribution assuption (with degrees of freedo 22) for LRT statistic would also reach the sae conclusion. The two adjacency atrices plotted with class assignents fro the ultilayer ethods SDlocal and MNavrg are presented in Figures 6 and 7 respectively. The block structure confirs well separated structural counities labda= labda= (a) (b) Figure : Paraetric bootstrap distribution of the likelihood ratio test statistic for (a) C- elegans network and (b) Grade 7 students network. The observed value of the test statistic is indicated with an arrow. 7.3 Grade 7 students network This dataset, obtained by Vickers and Chan (1981), is a ulti-layer network on 29 students of grade 7. The students were asked to noinate a peer as an answer to one of the following three question: (a) Who do you get on with in the class? (b) Who are your best friends in the class? (c) Who would you prefer to work with? The answers to these three questions create three layers of relations aong the students. Although the network edges are directed, we consider the network as a 3-layer undirected network and apply our counity detection ethods on it. The log-likelihood ratio test developed in Section 2.3 fails to reject the null hypothesis of no difference between the shared degree and independent degree null odels. The paraetric bootstrap distribution is shown in Figure (b). The value of the LRT statistic is 31.86, which corresponds to a bootstrap p-value of.993. Moreover, the p-value obtained with a chi-squared approxiation (with degrees of freedo 6) is.996, which is very close to the one obtained through bootstrap. Hence for parsiony 29

(a) (b) Figure 6: Adjacency atrices of the 2 layers in C-elegans connectoe, (a) ionic channel and (b) cheical synapse, sorted and arked according to the clustering results obtained fro SDlocal.

we will prefer the shared degree null odel. Single layer Newan-Grivan odularity gives 3, 4 and 3 clusters for get-on-with (gw), best friends (bf) and prefer to work with (ww) respectively.

30 (a) (b) Figure 6: Adjacency atrices of the 2 layers in C-elegans connectoe, (a) ionic channel and (b) cheical synapse, sorted and arked according to the clustering results obtained fro SDlocal. (a) (b) Figure 7: Adjacency atrices of the 2 layers in C-elegans connectoe, (a) ionic channel and (b) cheical synapse, sorted and arked according to the clustering results obtained fro MNavrg. we will prefer the shared degree null odel. Single layer Newan-Grivan odularity gives 3, 4 and 3 clusters for get-on-with (gw), best friends (bf) and prefer to work with (ww) respectively. However using the entire ultilayer network, three of the four MLCM based ethods along with aggregate detected 3 clusters of size 12, 1 and 2. Figure 8 depicts the three adjacency atrices sorted and arked into diagonal blocks according to this clustering solution. The density of intra- 3

counity edges are clearly higher than the inter-counity edges across all three layers. Hence, the counities appear to be well separated in all three layers.

obtained fro SDlocal, SDratio and MNavrg.

Surprisingly, we see quite high NMI for the clustering solution entioned above (Table 4(a)). In fact, the nodes in the cluster of size 12 are all boys.

31 counity edges are clearly higher than the inter-counity edges across all three layers. Hence, the counities appear to be well separated in all three layers (a) (b) (c) Figure 8: Adjacency atrices of the three layers, (a) get on with, (b) best friends and (c) work with, sorted and arked according to the (sae) clustering result obtained fro SDlocal, SDratio and MNavrg. Since the only external inforation known to us about these students is the gender inforation, we investigate how well the different clustering solutions align with counities based on genders. Surprisingly, we see quite high NMI for the clustering solution entioned above (Table 4(a)). In fact, the nodes in the cluster of size 12 are all boys. The girls, however, got divided in to two classes, one of size 1 and another tiny cluster of size 2. In contrast, the three clustering solutions fro the single layers yielded poor NMI with the gender-wise ground truth. Fro this we can conclude that fusing several layers of network inforation together, it is possible to learn eaningful inforation about the properties of the nodes, which would not have been possible with single layers. To further test our hypothesis of two gender-wise clusters in this ulti-layer network, we eployed the stochastic block odel based odularities in conjunction with MLCM based odularities with known nuber of counities 2. The results (Table 4(b)) show all of the stochastic block odel based odularities along with a nuber of MLCM based odularities perfectly agree with gender based ground truth. 31

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,