Edge Modification Techniques for Enhancing Network Communicability and Robustness, Part I

Edge Modification Techniques for Enhancing Network Communicability and Robustness, Part I Department of Mathematics and Computer Science Emory University Atlanta, GA 30322, USA International Summer School on Complex Networks Bertinoro, Italy 11-15 July, 2016

Acknowledgements Thanks to the organizers Francesca Arrigo (Strathclyde) Christine Klymko (Lawrence Livermore National Lab) National Science Foundations (grants DMS-1115692, DMS-1418889) Note: papers downloadable from my web page.

Outline Networks: introduction and motivation Mathematical background Walk-based centrality and communicability measures Diffusion on networks

Networks: motivation and background Networks provide models for a wide variety of physical, biological, engineered or social systems. For example: molecular structure, gene and protein interaction, anatomical and metabolic networks, food webs, transportation networks, power grids, financial and trade networks, social networks, the Internet, the WWW, Facebook, Twitter,... Network science is the study of networks, both as mathematical structures and as concrete, real world objects. It is a growing multidisciplinary field, with important contributions from physicists, mathematicians, computer scientists, biologists, social scientists, public health researchers and even from scholars in the humanities.

Complex networks: motivation and background (cont.) The field has its origins in the work of psychologists, sociologists, economists, anthropologists and statisticians dating back to the late 1940s and early 1950s. Since the late 1990s many physicists, computer scientists and mathematicians have entered the scene, making the field more mathematically sophisticated. Basic tools for the analysis of networks include graph theory, linear algebra, probability, numerical analysis, and of course algorithms and data structures from discrete mathematics. Ideas and techniques from statistical mechanics, complex systems and multilinear algebra also play an important role. The study of dynamical processes often leads to differential equations posed on networks.

Some classic references Some classic early references: J. R. Seeley, The net of reciprocal influence: A problem in treating sociometric data, Canadian J. Psychology, 3 (1949), pp. 234 240. L. Katz, A new status index derived from sociometric analysis, Psychometrika, 18 (1953), pp. 39 43. A. Rapoport, Mathematical models of social interaction, in Handbook of Mathematical Psychology, vol. 2, pp. 493 579. Wiley, New York, 1963. D. J. de Solla Price, Networks of scientific papers, Science, 149 (1965), pp. 510 515. S. Milgram, The small world problem, Psychology Today, 2 (1967), pp. 60 67. J. Travers and S. Milgram, An experimental study of the small world problem, Sociometry, 32 (1969), pp. 425 443.

Some classic references (cont.) The field has developed very rapidly since the late 1990s thanks to the work of many physicists, applied mathematicians and computer scientists. Landmark papers include D. J. Watts and S. H. Strogatz, Collective dynamics of small-world networks, Nature, 393 (1998), pp. 440 442. L.-A. Barabási and R. Albert, Emergence of scaling in random networks, Science, 386 (1999), pp. 509 512. M. E. J. Newman, Models of the small world, J. Stat. Phys., 101 (2000), pp. 819 841. J. Kleinberg, Navigation in a small world, Nature, 406 (2000), p. 845. R. Albert and L.-A. Barabási, Statistical mechanics of complex networks, Rev. Modern Phys., 74 (2002), pp. 47 97.

Some features of complex networks Typical attributes of many real-world networks include: Scale-free : the degree distribution follows a power law Small-world : small graph diameter, short average distance between nodes High clustering coefficient: many triangles, hubs,... Hierarchical structure Rich in motifs Self-similar Briefly stated: real world networks exhibit a non-trivial topology Caveat: there are important examples of real-world complex networks lacking one or more of these attributes.

What a small world...! Duncan Watts and Steven Strogatz

The Watts Strogatz random rewire model

The rich always get richer! Albert-László Barabási and Réka Albert

A synthetic example: The Barabási Albert model The Barabási Albert model is based on the notion of preferential attachment: starting from a given sparse graph, new nodes are added (one at a time) to the network, joining them to existing nodes with a probability proportional to the degree (the number of adjacent nodes) of such nodes. Hence, nodes that are rich of connections to other nodes have a high probability of attracting new neighbors, while poor nodes tend to remain such. One can show that the Barabási Albert model leads to a highly skewed degree distribution given by a power law of the form p(k) k γ, where p(k) denotes the fraction of nodes of degree k and γ is a constant, typically with 2 γ 3.

Power-law degree distribution Degree distribution according to a power law (Figure taken from L.-A. Barabási, Linked: The New Science of Networks, Perseus, NY, 2002.)

The Barabási Albert model (cont.)

The Barabási Albert model (cont.) Furthermore, the Barabási Albert model produces small world networks, in which the graph diameter (= maximum distance between any two nodes) grows slowly with the number of nodes; the growth is generally logarithmic or even log-logarithmic as N (in practice, the diameter remains nearly constant). Another feature of small world network is that they have a high clustering coefficient: that is, they contain a much larger number of triangles than a completely random graph of the same size. Social networks exhibit a high degree of clustering. Examples: Facebook; MathSciNet collaboration graph.

Example of real world network: Erdös collaboration graph

Example of real world network: PPI network of Saccharomyces cerevisiae (beer yeast)

Example of real world network: social network of injecting drug users in Colorado Springs, CO

Example of a real world network: the world airline routemap

Example of (directed) real world network: a food web

Network analysis Some of the most basic questions about network structure include Centrality: How to identify the most important nodes, or edges, in a network? How to rank the nodes/edges in order of importance? Communicability: How to measure how well a given pair of nodes can exchange information? How to identify bottlenecks in networks? Connectivity: How well connected are the nodes of a (sparse) network? Can we sparsify a network while keeping it as well-connected as possible? Robustness: How resilient is a network under the removal of edges (or nodes), either random or targeted? Note that all these questions are closely related.

Formal definitions A graph G = (V, E) consists of a (finite) set V = {v 1, v 2,..., v N } of nodes (or vertices) and a set E of edges (or links), which are pairs {v i, v j } with v i, v j V. The graph G is directed if the edges {v i, v j } E are ordered pairs = (v i, v j ) V V, otherwise G is undirected. A directed graph is often referred to as a digraph. A graph G is weighted if numerical values are associated with its edges. If all the edges are given the same value 1, we say that the graph is unweighted. A simple graph is an unweighted graph without multiple edges. A graph is sparse if it has M = O(N) edges.

Formal definitions (cont.) A loop in G is an edge from a node to itself. Loops are usually ignored or excluded. A walk of length k in G is a set of nodes v i1, v i2,... v ik, v ik+1 such that for all 1 j k, there is an edge between v ij and v ij+1. A closed walk is a walk where v i1 = v ik+1. A path is a walk with no repeated nodes. A cycle is a path with an edge between the first and last node. In other words, a cycle is a closed path. A triangle in G is a cycle of length 3.

Formal definitions (cont.) The geodesic distance d(v i, v j ) between two nodes is the length of the shortest path connecting v i and v j. We let d(v i, v j ) = if no such path exists. The diameter of a graph G = (V, E) is defined as diam(g) := max v i,v j V d(v i, v j ). A graph G is connected if for every pair of nodes v i and v j there is a path in G that starts at v i and ends at v j ; i.e., diam(g) <. These definitions apply to both undirected and directed graphs, though in the latter case the orientation of the edges must be taken into account.

Formal definitions (cont.) To every unweighted graph G = (V, E) we associate its adjacency matrix A = [a ij ] R N N, with { 1, if (vi, v a ij = j ) E, 0, else. Any renumbering of the graph nodes results in a symmetric permutation A PAP T of the adjacency matrix of the graph. If G is an undirected graph, A is symmetric with zeros along the main diagonal. In this case, the eigenvalues of A are all real. We label the eigenvalues of A in non-increasing order: λ 1 λ 2 λ N. If G is connected and acyclic, then λ 1 is simple and satisfies λ 1 > λ i for 2 i N. Also, the associated eigenvector can be chosen to be strictly positive (Perron Frobenius Theorem).

Formal definitions (cont.) If G is undirected, the degree d i of node v i is the number of edges incident to v i in G. That is, d i is the number of immediate neighbors of v i in G. Note that in terms of the adjacency matrix, d i = N j=1 a ij. For a directed graph, we define the in-degree of node v i as the number di in of edges ending in v i, and the out-degree of v i as as the number di out of edges originating at v i. In terms of the (nonsymmetric) adjacency matrix, d in i = N a ij, i=1 d out i = N a ij. j=1 Hence, the column sums of A give the in-degrees and the row sums give the out-degrees.

Formal definitions (cont.) An undirected graph G = (V, E) is bipartite if there are V 1, V 2 V, with V 1, V 2, V = V 1 V 2, V 1 V 2 = such that edges can exist only between nodes belonging to different subsets V 1, V 2. In other terms, a graph is bipartite if it does not contain any odd-length cycles.

Formal definitions (cont.) Let G = (V, E) be bipartite with V 1 = m, V 2 = n, m + n = N. Then there exists a numbering of the nodes of G such that the adjacency matrix of G is of the form [ ] 0 B A = B T 0 with B R m n. Note: The nonzero eigenvalues of A are of the form ±σ i, where σ i denote the singular values of B. More Bipartite graphs can also be used to give an alternative representation of directed graphs.

Formal definitions (cont.) Indeed, given a digraph G = (V, E) with N nodes we can make a copy V = {v 1,..., v N } of V = {v 1,..., v N } and define a new, undirected graph G = (V, E) wth 2N nodes, where V := V V and E = {(v i, v j ) (v i, v j ) E}. If A is the adjacency matrix of the original digraph G, the adjacency matrix of the corresponding bipartite graph G is given by [ ] 0 A A = A T R 2N 2N. 0 As before, the nonzero eigenvalues of A come in opposite pairs, ±σ i (A). We will make use of this bipartite representation of digraphs in Part II.

Graph spectra The adjacency matrix A of an undirected network is always symmetric, hence its eigenvalues are all real. The eigenvalue distribution reflects global properties of G; as we shall see, the eigenvectors also carry important information about the network structure. The spectrum of a network is the spectrum of the corresponding adjacency matrix A. (It should not be confused with the graph Laplacian spectrum, see below.) Much work has been done in characterizing the spectra (eigenvalue distributions) of random graphs and of certain classes of complex graphs (e.g., scale-free graphs). The adjacency matrix of a directed network, on the other hand, is typically nonsymmetric and will have complex (non-real) eigenvalues in general. In this case, the singular values of A are often useful.

Example: PPI network of Saccharomyces cerevisiae 0 500 1000 1500 2000 0 500 1000 1500 2000 nz = 13218 Adjacency matrix, N = 2224, m = 6609.

Example: PPI network of Saccharomyces cerevisiae 700 600 500 400 300 200 100 0 0 10 20 30 40 50 60 70 Degree distribution (d min = 1, d max = 64)

Example: PPI network of Saccharomyces cerevisiae 20 15 10 5 0 5 10 15 0 500 1000 1500 2000 2500 Eigenvalues

Example: Intravenous drug users network 0 100 200 300 400 500 600 0 100 200 300 400 500 600 nz = 4024 Adjacency matrix, N = 616, m = 2012.

Example: Intravenous drug users network 250 200 150 100 50 0 0 10 20 30 40 50 60 Degree distribution (d min = 1, d max = 58)

Example: Intravenous drug users network 20 15 10 5 0 5 10 0 100 200 300 400 500 600 700 Eigenvalues

Graph Laplacian spectra For an undirected network, the spectrum of the graph Laplacian L = D A, where D is the diagonal matrix of node degrees, also plays an important role. For instance, the connectivity properties of G can be characterized in terms of spectral properties of L; moreover, L is important in the study of diffusion processes on G. The eigenvalues of L are all real and nonnegative, and since L 1 = 0, where 1 denotes the vector of all ones, L is singular. Hence, 0 σ(l). Theorem. The multiplicity of 0 as an eigenvalue of L coincides with the number of connected components of the network. Corollary. For a connected network, the null space of the graph Laplacian is 1-dimensional and is spanned by 1. Thus, rank(l) = N 1.

Graph Laplacian spectra (cont.) Let G be a simple, connected graph. The eigenvector of L associated with the smallest nonzero eigenvalue is called the Fiedler vector of the network. Since this eigenvector must be orthogonal to the vector of all ones, it must contain both positive and negative entries. There exist elegant graph partitioning algorithms that assign nodes to different subgraphs based on the sign of the entries of the Fiedler vector.

Graph Laplacian spectra (cont.) If the graph G is regular of degree d, then L = di N A and the eigenvalues of the Laplacian are just λ i (L) = d λ i (A), 1 i N. If G is not a regular graph, there is no simple relationship between the eigenvalues of L and those of A. Also useful is the notion of normalized Laplacian: L := IN D 1/2 AD 1/2. In the case of directed graphs, there are several distinct notions of graph Laplacian in the literature.

Example: PPI network of Saccharomyces cerevisiae Laplacian eigenvalues (λ 2 (L) = 0.0600, λ N (L) = 65.6077)

Example: Intravenous drug users network Laplacian eigenvalues (λ 2 (L) = 0.0111, λ N (L) = 59.1729)

Network centrality A fundamental question is network analysis is the identification of the most important nodes (or edges). The first node centrality indices (or measures) were introduced by social scientists trying to identify the most influential individuals in social networks (think of marketing or public health interventions). In this context, a node (or edge) is important if it plays a central role in the spreading of information across the network. The number of direct connections (the degree) of a node is often less important. Centrality is crucial, e.g., for Network connectivity and robustness/vulnerability Essential proteins in PPI networks (lethality) Identification of keystone species in ecosystems Author centrality in collaboration networks Ranking of documents/web pages on a given topic

Centrality measures There are dozens of different notions of node centrality in the literature: Degree centrality Closeness centrality Betweenness centrality Eigenvector centrality Katz centrality Subgraph centrality Total communicability Newman s information centrality Kleinberg s HITS Google s PageRank (with its many variants) etc.

Centrality measures (cont.) Formally, a node centrality measure is a function C : V [0, ) that is invariant under graph automorphisms: C(i) = C(φ(i)) i = 1 : N, φ Aut(G). In other words, if c is the vector of node centralities of a graph G with adjacency matrix A and G is a graph with adjacency matrix A = PAP T with P a permutation matrix, then the vector of node centralities for G is given by c = Pc.

Centrality measures (cont.) The simplest is degree centrality, which is just the degree d i of node i. This does not take into account the importance of the nodes a given nodes is connected to only their number. A popular notion of centrality is betweenness centrality (Freeman 1979), defined for any node i V as C B (i) := j i δ jk (i), where δ jk (i) is the fraction of all shortest paths in the graph between nodes j and k which contain node i: k i δ jk (i) := # of shortest paths between j, k containing i # of shortest paths between j, k.

Centrality measures (cont.) This is a simple example where betweenness centrality is clearly more relevant than degree centrality.

Centrality measures (cont.) Another centrality measure popular in social network analysis is closeness centrality (Freeman, 1979), defined as C C (i) = j V 1. d(i, j) Betweenness and closeness centrality assume that communication in the network only takes place via shortest paths, but this is often not the case. This observation has motivated a number of alternative definitions of centrality, which aim at taking into account the global structure of the network and the fact that all walks between pairs of nodes should be considered, not just shortest paths.

Centrality measures (cont.) We mention here that for directed graphs it is often necessary to distinguish between hubs and authorities. Indeed, in a directed graph a node plays two roles: broadcaster and receiver of information. Crude broadcast and receive centrality measures are provided by the out-degree di out and by the in-degree di in of the node, respectively. Other, more refined receive and broadcast centrality measures will be introduced in Part II.

Walk-based centrality measures Subgraph centrality (Estrada and Rodríguez-Velásquez 2005) measures the centrality of a node by taking into account the number of subgraphs the node participates in. This is done by counting, for all k = 1, 2,..., the number of closed walks in G starting and ending at node i, with longer walks being penalized (given a smaller weight). It is sometimes useful to introduce a tuning parameter β > 0 ( inverse temperature ) to simulate external influences on the network, for example, social tensions. Subgraph centrality has been used successfully in various settings, including proteomics and neuroscience.

Walk-based centrality measures (cont.) Recall that (A k ) ii = # of closed walks of length k based at node i, (A k ) ij = # of walks of length k that connect nodes i and j. Adding up all closed walks through node i and using β k /k! as weights to penalize longer walks leads to the notion of subgraph centrality: SC(i) = ] [I + βa + β2 2! A2 + β3 3! A3 + ii = [e βa ] ii Note that if we truncate the expansion at 2nd order we recover degree centrality.

The Estrada index of a network The sum of all subgraph centralities: N EE(G, β) := Tr(e βa ) = [e βa ] ii = i=1 is known as the Estrada index of the network. N i=1 e βλ i This index, which is analogous to the partition function Z(β) = Tr(e βh ) (where H is the Hamiltonian, or energy operator; here H = A) plays an important role in the statistical mechanics approach to networks. The Helmholtz free energy of the network can then be expressed as F (G, β) = β 1 ln(ee(g, β)).

Katz centrality Of course different weights can be used, leading to different matrix functions, such as the resolvent (Katz 1953): (I αa) 1 = I + αa + α 2 A 2 +, 0 < α < 1/λ 1. In the case of a directed network one can use the solution vectors of the linear systems (I αa)x = 1 and (I αa T )y = 1, where 1 is the all-one vector, to rank hubs and authorities. These are the row and column sums of the matrix resolvent (I αa) 1. See also M. Benzi, E. Estrada and C. Klymko, Ranking hubs and authorities using matrix functions, Linear Algebra Appl., 438 (2013), 2447 2474.

Eigenvector centrality Eigenvector centrality (Bonacich 1987) uses the entries of the dominant eigenvector x to rank the nodes in the network in order of importance: the larger x(i) is, the more important node i is considered to be. By the Perron Frobenius Theorem, the vector x is positive and, up to a constant, unique provided the network is connected. The underlying idea is that a node is important if it is linked to many important nodes. This circular definition corresponds to the fixed-point iteration x (k+1) = Ax (k), k = 0, 1,... which converges, upon normalization, to the dominant eigenvector of A as k. The rate of convergence is governed by the spectral gap γ = λ 1 λ 2. The larger γ, the faster the convergence.

Eigenvector centrality (cont.) Eigenvector centrality has the following interpretation in terms of walks on G: The eigenvector centrality x(i) of node i is the limit as k of the percentage of walks of length k which visit node i among all walks of length k on G. See for example D. Cvetković, P. Rowlinson, and S. Simić, Eigenspaces of Graphs, Cambridge University Press, 1997. Thus, the eigenvector centrality measures the global influence of node i. Remark: Both Google s PageRank algorithm (Brin and Page 1998) and Kleinberg s HITS (Kleinberg 1998) can be seen as variants of eigenvector centrality for digraphs.

Comparing centrality measures Suppose A is the adjacency matrix of an undirected, connected network. Let λ 1 > λ 2 λ 3 λ N be the eigenvalues of A, and let x k be a normalized eigenvector corresponding to λ k. Then and therefore e βa = SC(i) = N e βλ k x k x T k k=1 N e βλ k x k (i) 2, k=1 where x k (i) denotes the ith entry of x k. Similar spectral formulas exist for other centrality measures based on different matrix functions (e.g., Katz).

Comparing centrality measures (cont.) When the spectral gap is sufficiently large (λ 1 λ 2 ), the centrality ranking is essentially determined by the dominant eigenvector x 1, since the contributions to the centrality scores from the remaining eigenvalues/vectors becomes negligible, in relative terms. Hence, in this case all these walk-based centrality measures tend to agree, especially on the top-ranked nodes, and they all reduce to eigenvector centrality. In contrast, the various measures can give significantly different results when the spectral gap is small. In this case, going beyond the dominant eigenvector can result in more refined rankings.

Comparing centrality measures (cont.) Theorem (B. and Klymko, 2015): Let A be the adjacency matrix for a simple, connected, undirected graph G. Then For α 0+, Katz centrality reduces to degree centrality; For α 1 λ 1, Katz centrality reduces to eigenvector centrality; For β 0+, subgraph centrality reduces to degree centrality; For β, subgraph centrality reduces to eigenvector centrality. Note: Analogous results hold for more general matrix functions and also for directed networks, in which case we need to distinguish between in-degree, out-degree, left/right dominant eigenvectors, and row/column sums of (I αa) 1 and e βa. M. Benzi and C. F. Klymko, On the limiting behavior of network centrality measures, SIAM J. Matrix Anal. Appl., 36 (2015), 686 706.

Communicability Network communicability (Estrada and Hatano 2008) is a measure of how well any two nodes i and j can exchange information: C(i, j) = [e βa ] ij ] = [I + βa + β2 2! A2 + β3 3! A3 + ij weighted sum of walks joining nodes i and j. Note that as the temperature increases, the communicability between nodes decreases (e βa I for β 0). Communicability has been successfully used to identify bottlenecks in networks, to locate poorly communicating regions of the brain in victims of stroke, and for community detection. E. Estrada & N. Hatano, Phys. Rev. E, 77 (2008), 036111; E. Estrada, N. Hatano, & M. Benzi, Phys. Rep., 514 (2012), 89 119.

Communicability (cont.) The total communicability of a node is defined as TC(i) := N C(i, j) = j=1 N [e βa ] ij. This is another walk-based centrality measure. Its main advantage is that it can be computed very efficiently, even for large graphs, since it does not require computing individual entries of e βa, but only e βa 1. This calculation can be performed very efficiently using Krylov subspace methods, which involve matrix-vector multiplications with A (see Part II). j=1

Communicability (cont.) The normalized total communicability of G: TC(G) := 1 N N N C(i, j) = 1 N 1T e βa 1 i=1 j=1 provides a global measure of how well-connected a network is, and can be used to compare different network designs. Thus, highly connected networks, such as small-world networks without bottlenecks, can be expected to have a high total communicability. Conversely, large-diameter networks with a high degree of locality (such as regular grids, road networks, etc.) or networks containing bottlenecks are likely to display a low value of TC(G). M. Benzi and C. F. Klymko, Total communicability as a centrality measure, J. Complex Networks, 1 (2013), 124 149.

Global communicability measures (cont.) Note that TC(G) can be easily bounded from below and from above: 1 N EE(G) TC(G) eβλ 1 where EE(G) = Tr(e βa ) is the Estrada index of the graph. These bounds are sharp, as can be seen from trivial examples. In the next Table we present the results of some calculations (for β = 1) of the normalized Estrada index, global communicability and e λ 1 for various real-world networks.

Communicability (cont.) Table: Comparison of the normalized Estrada index EE n (G) = Tr(e A )/N, the normalized total network communicability TC(G), and e λ1 for various real-world networks. Network EE n (G) TC(G) e λ1 Zachary Karate Club 30.62 608.79 833.81 Drug User 1.12e05 1.15e07 6.63e07 Yeast PPI 1.37e05 3.97e07 2.90e08 Pajek/Erdos971 3.84e04 4.20e06 1.81e07 Pajek/Erdos972 408.23 1.53e05 1.88e06 Pajek/Erdos982 538.58 2.07e05 2.73e06 Pajek/Erdos992 678.87 2.50e05 3.73e06 SNAP/ca-GrQc 1.24e16 8.80e17 6.47e19 SNAP/ca-HepTh 3.05e09 1.06e11 3.01e13 SNAP/as-735 3.00e16 3.64e19 2.32e20 Gleich/Minnesota 2.86 14.13 35.34

Communicability (cont.) Communicability and centrality measures can also be defined using the negative graph Laplacian L = D + A instead of the adjacency matrix A. When the graph is regular (the nodes have all the same degree d, hence D = di ) nothing is gained by using L instead of A, owing to the obvious identity e L = e di +A = e d e A. Most networks, however, have highly skewed degree distributions, and using e L instead of e A can lead to substantial differences. Note that the interpretation in terms of weighted walks is no longer valid in this case. Laplacian-based centrality and communicability measures are especially useful in the study of dynamic processes on graphs.

Dynamics on networks: diffusion The diffusion of contagious diseases or computer viruses, the adoption of new products, the spread of rumors, beliefs, fads, consensus phenomena etc. can be modeled in terms of diffusion processes on graphs. There exist a number of models to describe diffusion phenomena on networks, both deterministic and stochastic in nature. Moreover, it is possible to investigate a number of important structural properties of networks by studying the behavior of solutions of evolution problems, such as diffusion or wave propagation, taking place on the network. An increasingly important model in this direction is given by quantum graphs (not discussed here).

Dynamics on networks: diffusion (cont.) The simplest (semi-deterministic) model of diffusion on a graph is probably the initial value problem for the heat equation on G: dx dt = L x, t > 0, x(0) = x 0. Here L = D A denotes the graph Laplacian, x = x(t) is the temperature (or concentration ) at time t, and x 0 the initial distribution; e.g., if x 0 = e i, the substance that is diffusing on G is initially all concentrated on node v i. As a rule, x 0 0. The solution of the initial value problem is then given by x(t) = e tl x 0, t > 0. Here we assume that G is connected.

Dynamics on networks: diffusion (cont.) We note that the heat kernel {e tl } t 0 forms a family (semigroup) of doubly stochastic matrices. This means that each matrix e tl is nonnegative and has all its row and column sums equal to 1. The diffusion process on G is therefore a continuous time Markov process, which converges for t to the uniform distribution, regardless of the (nonzero) initial state: lim x(t) = c1, c > 0, t where c is the average of the initial vector x 0.

Dynamics on networks: diffusion (cont.) The rate at which the the steady-state (equilibrium) is reached is governed by the smallest non-zero eigenvalue of L, which in turn is determined by the topology of the graph. In general, the small-world property leads to extremely fast diffusion ( mixing ) in complex networks. This can help explain the very rapid global spread of certain epidemics (like the flu) but alsoof fads, rumors, financial crises, etc. Note that this rapid convergence to equilibrium is in contrast with the situation for diffusion on a regular grid obtained from the discretization of a (bounded) domain Ω R d (d = 1, 2, 3).

Dynamics on networks: diffusion (cont.) In more detail, let L = QΛQ T = N λ k q k q T k k=1 be the eigendecomposition of the Laplacian of the (connected) graph G, with L q k = λ k q k. Recalling that the eigenvalues of L satisfy 0 = λ 1 < λ 2 λ N and that the (normalized) eigenvector associated with the simple zero eigenvalue is q 1 = 1 N 1, we obtain e tl = 1 N 11T + N e tλ k q k q T k. k=2

Dynamics on networks: diffusion (cont.) Therefore, we have e tl x 0 = 1 N (1T x 0 )1 + N e tλ k (q T k x 0)q k c1 for t, k=2 where c = 1 N (1T x 0 ) is positive if the initial (nonzero) vector x 0 is nonnegative. It is clear that the rate of convergence to equilibrium ( mixing rate ) will be fast if, and only if, the smallest nonzero eigenvalue λ 2 of L is sufficiently large.

Dynamics on networks: diffusion (cont.) As an extreme example, consider the complete graph on N nodes, in which every node is connected to every other node. The Laplacian of this graph has only two distinct eigenvalues, 0 and N, and equilibrium is reached extremely fast. However, there also exist very sparse graphs that exhibit a sizable gap and therefore fast mixing rates. An important example is given by expander graphs. These graphs are highly sparse, yet they are highly connected and have numerous applications in the design of communication and infrastructure networks. Another important property of such graphs is that they are very robust, meaning that many edges have to be deleted to break the network into two (large) disconnected subnetworks.

Dynamics on networks: the Schrödinger equation Also of interest in Network Science is the Schrödinger equation on a graph: i dψ dt = Lψ, t R, ψ(0) = ψ 0, where ψ 0 C N is a prescribed initial state with ψ 0 2 = 1. The solution is given esplicitly by ψ(t) = e itl ψ 0, for all t R; note that since itl is skew-hermitian, the propagator U(t) = e itl is a unitary matrix, which guarantees that the solution has unit norm for all t: ψ(t) 2 = U(t)ψ 0 2 = ψ 0 2 = 1, t R. The amplitudes [U(t)] ij 2 express the probability to find the system in state e i at time t, given that the initial state was e j. The Schrödinger equation has been proposed as a tool to analyze properties of complex graphs in P. Suau, E. R. Hancock, & F. Escolano, Graph characteristics from the Schrödinger operator, LNCS 7877 (2013), 172 181.

Singular Value Decomposition (SVD) For any matrix A R m n there exist orthogonal matrices U R m m and V R n n and a diagonal matrix Σ R m n such that U T AV = Σ = diag (σ 1,..., σ p ) where p = min{m, n}. The σ i are the singular values of A and satisfy σ 1 σ 2 σ r > σ r+1 = = σ p = 0, where r = rank(a). The matrix Σ is uniquely determined by A, but U and V are not. The columns u i and v i of U and V are left and right singular vectors of A corresponding to the singular value σ i.

Singular Value Decomposition (cont.) Note that Av i = σ i u i and A T u i = σ i v i, 1 i p. From AA T = UΣΣ T U T and A T A = V Σ T ΣV T we deduce that the singular values of A are the (positive) square roots of the SPD matrices AA T and A T A; the left singular vectors of A are thus eigenvectors of AA T, and the right ones are eigenvectors of A T A. Moreover, we have the compact SVD: A = r i=1 σ i u i v T i = U r Σ r V T r, showing that any matrix A of rank r is the sum of exactly r rank-one matrices.

Singular Value Decomposition (cont.) It is also important to recall that if we form the symmetric matrix [ ] 0 A A = A T R (n+m) (n+m), 0 then the eigenvalues of A are of the form λ i = ±σ i, and the corresponding eigenvectors are given by [ ] ui x i =. ±v i The matrix A plays an important role in computations and in the analysis of bipartite and directed networks. Back