RTG: A Recursive Realistic Graph Generator Using Random Typing

Size: px
Start display at page:

Download "RTG: A Recursive Realistic Graph Generator Using Random Typing"

Transcription

1 RTG: A Recursive Realistic Graph Generator Using Random Typing Leman Akoglu and Christos Faloutsos Carnegie Mellon University School of Computer Science {lakoglu,christos}@cs.cmu.edu Abstract. We propose a new, recursive model to generate realistic graphs, evolving over time. Our model has the following properties: it is (a) flexible, capable of generating the cross product of weighted/ unweighted, directed/undirected, uni/bipartite graphs; (b) realistic, giving graphs that obey eleven static and dynamic laws that real graphs follow (we formally prove that for several of the (power) laws and we estimate their exponents as a function of the model parameters); (c) parsimonious, requiring only four parameters. (d) fast, being linear on the number of edges; (e) simple, intuitively leading to the generation of macroscopic patterns. We empirically show that our model mimics two real-world graphs very well: Blognet (unipartite, undirected, unweighted) with 27K nodes and 125K edges; and Committee-to-Candidate campaign donations (bipartite, directed, weighted) with 23K nodes and 880K edges. We also show how to handle time so that edge/weight additions are bursty and self-similar. 1 Introduction Study of complex graphs such as computer and biological networks, the link structure of the WWW, the topology of the Internet, and recently with the widespread use of the Internet, large social networks, has been a vital research area. Many fascinating properties have been discovered, such as small and shrinking diameter [2,20], power-laws [5,11,16,24,22,28,29,20], and community structures [12,13,27]. As a result of such interesting patterns being discovered, and for many other reasons which we will discuss next, how to find a model that would produce synthetic but realistic graphs is a natural question to ask. There are several applications and advantages of modeling real-world graphs: Simulation studies: if we want to run tests for, say a spam detection algorithm, and want to observe how the algorithm behaves on graphs with different sizes and structural properties, we can use graph generators to produce such graphs by changing the parameters. This is also true when it is difficult to collect any kind of real data. Sampling/Extrapolation: we can generate a smaller graph for example for visualization purposes or in case the original graph is too big to run tests W. Buntine et al. (Eds.): ECML PKDD 2009, Part I, LNAI 5781, pp , c Springer-Verlag Berlin Heidelberg 2009

2 14 L. Akoglu and C. Faloutsos on it; or conversely to generate a larger graph for instance to make future prediction and answer what-if questions. Summarization/Compression: model parameters can be used to summarize and compress a given graph as well as to measure similarity to other graphs. Motivation to understand pattern generating processes: graph generators give intuition and shed light upon what kind of processes can (or cannot) yield the emergence of certain patterns. Moreover, modeling addresses the question of what patterns real networks exhibit that needs to be matched and provides motivation to figure out such properties. Graph generator models are surveyed in [4]. Ideally, we would like a graph generator that is: 1. simple: it would be easy to understand and it would intuitively lead to the emergence of macroscopic patterns. 2. realistic: it would produce graphs that obey all the discovered laws of real-world graphs with appropriate values. 3. parsimonious: it would require only a few number of parameters. 4. flexible: it would be able to generate the cross product of weighted/unweighted, directed/undirected and unipartite/bipartite graphs. 5. fast: the generation process would ideally take linear time with respect to the number of edges in the output graph. In this paper we propose RTG, for Random Typing Generator. Our model uses a process of random typing, to generate source and destination node identifiers, and it meets all the above requirements. In fact, we show that it can generate graphs that obey all eleven patterns that real graphs typically exhibit. Next, we provide a survey on related work. Section 3 describes our RTG generator in detail. Section 4 provides experimental results and discussion. We conclude in Section 5. Appendix gives proofs showing some of the power-laws that the model generates. 2 Related Work Graph patterns: Many interesting patterns that real graphs obey have been found, which we give a detailed list of in the next section. Ideally, a generator should be able to produce all of such properties. Graph generators: The vast majority of earlier graph generators have focused on modeling a small number of common properties, but fail to mimic others. Such models include the Erdos & Renyi model[8], the preferential attachment model [3] and numerous more, like the small-world, winners don t take all, forest fire and butterfly models [31,26,20,22]. See [4] for a recent survey and discussion. In general, these methods are limited in trying to model some static graph property while neglecting others as well as dynamic properties or cannot be generalized to produce weighted graphs.

3 RTG: A Recursive Realistic Graph Generator Using Random Typing 15 Random dot product graphs [17,32] assign each vertex a random vector in some d-dimensional space and an edge is put between two vertexes with probability equal to the dot product of the endpoints. This model does not generate weighted graphs and by definition only produces undirected graphs. It also seems to require the computation of the dot product for each pair of nodes which takes quadratic time. A different family of models is utility-based, where agents try to optimize a predefined utility function and the network structure takes shape from their collective strategic behavior [10,9,18]. This class of models, however, is usually hard to analyze. Kronecker graph generators [19] and their tensor followups [1] are successful in the sense that they match several of the properties of real graphs and they have proved useful for generating self-similar properties of graphs. However, they have two disadvantages: The first is that they generate multinomial/lognormal distributions for their degree and eigenvalue distribution, instead of a power-law one. The second disadvantage is that it is not easy to grow the graph incrementally: They have a fixed, predetermined number of nodes (say, N k,wheren is the number of nodes of the generator graph, and k is the number of iterations); where adding more edges than expected does not create additional nodes. In contrast, in our model, nodes emerge naturally. 3 Proposed Model We first give a concise list of the static and dynamic laws that real graphs obey, which a graph generator should be able to match. L01. Power-law degree distribution: the degree distribution should follow a power-law in the form of f(d) d γ, with the exponent γ<0 [5,11,16,24] L02. Densification Power Law (DPL): the number of nodes N and the number of edges E should follow a power-law in the form of E(t) N(t) α,with α>1, over time [20]. L03. Weight Power Law (WPL): the total weight of the edges W and the number of edges E should follow a power-law in the form of W (t) E(t) β,with β>1, over time [22]. L04. Snapshot Power Law (SPL): the total weight of the edges W n attached to each node and the number of such edges, that is, the degree d n should follow a power-law in the form of W n d θ n,withθ>1[22]. L05. Triangle Power Law (TPL): the number of triangles Δ and the number of nodes that participate in Δ number of triangles should follow a power-law in the form of f(δ) Δ σ,withσ<0[29]. L06. Eigenvalue Power Law (EPL): the eigenvalues of the adjacency matrix of the graph should be power-law distributed [28]. L07. Principal Eigenvalue Power Law (λ 1 PL): the largest eigenvalue λ 1 of the adjacency matrix of the graph and the number of edges E should follow a power-law in the form of λ 1 (t) E(t) δ,withδ<0.5, over time [1]. L08. small and shrinking diameter: the (effective) diameter of the graph should be small [2] with a possible spike at the gelling point [22]. It should also shrink over time [20].

4 16 L. Akoglu and C. Faloutsos L09. constant size secondary and tertiary connected components: while the giant connected component keeps growing, the secondary and tertiary connected components tend to remain constant in size with small oscillations [22]. L10. community structure: the graph should exhibit a modular structure, with nodes forming groups, and possibly groups within groups [12,13,27]. L11. bursty/self-similar edge/weight additions: Edge (weight) additions to the graph over time should be self-similar and bursty rather than uniform with possible spikes [7,14,15,22]. Zipf introduced probably the earliest power law [33], stating that, in many natural languages, the rank r and the frequency f r of vocabulary words follow a power-law f r 1/r. Mandelbrot [21] argued that Zipf s law is the result of optimizing the average amount of information per unit transmission cost. Miller [23] showed that a random process also leads to Zipf-like power laws. He suggested the following experiment: A monkey types randomly on a keyboard with k characters and a space bar. A space is hit with probability q; all other characters are hit with equal probability, (1 q) k. A space is used to separate words. The distribution of the resulting words of this random typing process follow a power-law. Conrad and Mitzenmacher [6] showed that this relation still holds when the keys are hit with unequal probability. Our model generalizes the above model of natural human behavior, using random typing. We build our model RTG (Random Typing Generator) inthree steps, incrementally. In the next two steps, we introduce the base version of the proposed model to give an insight. However, as will become clear, it has two shortcomings. In particular, the base model does not capture (1) homophily, the tendency to associate and bond with similar others- people tend to be acquainted with others similar in age, class, geographical area, etc. and (2) community structure, the existence of groups of nodes that are more densely connected internally than with the rest of the graph. 3.1 RTG-IE: RTG with Independent Equiprobable Keys As in Miller s experimental setting, we propose each unique word typed by the monkey to represent a node in the output graph (one can think of each unique word as the label of the corresponding node). To form links between nodes, we mark the sequence of words as source and destination, alternatingly. That is, we divide the sequence of words into groups of two and link the first node to the second node in each pair. If two nodes are already linked, the weight of the edge is simply increased by 1. Therefore, if W words are typed, the total weight of the output graph is W/2. See Figure 1 for an example illustration. Intuitively, random typing introduces new nodes to the graph as more words are typed, because the possibility of generating longer words increases with increasing number of words typed. Due to its simple structure, this model is very easy to implement and is indeed mathematically tractable. If W words are typed on a keyboard with k keys and

5 RTG: A Recursive Realistic Graph Generator Using Random Typing 17 a p q p a b S b S a b S p p q p q p p p q T1 ab a 1 T2 bba ab 1 T3 b ab 1 T4 a b 1 T5 ab a 1 ab a bba ab b ab a b ab a Fig. 1. Illustration of the RTG-IE. Upper left: how words are (recursively) generated on a keyboard with two equiprobable keys, a and b, and a space bar; lower left: a keyboard is used to randomly type words, separated by the space character; upper right: how words are organized in pairs to create source and destination nodes in the graph over time; lower right: the output graph; each node label corresponds to a unique word, while labels on edges denote weights a space bar, the probability p of hitting a key being the same for all keys and the probability of hitting the space bar being denoted as q=(1 kp): Lemma 1. The expected number of nodes N in the output graph G of the RTG- IE model is N W logpk. Proof: In the Appendix. Lemma 2. The expected number of edges E in the output graph G of the RTG- IE model is E W logpk (1 + c logw), for c = q logpk logp > 0. Proof: In the Appendix. Lemma 3. The in(out)-degree d n of a node in the output graph G of the RTG- IE model is power law related to its total in(out)-weight W n, that is, W n d log kp n with expected exponent log k p>1. Proof: In the Appendix. Even though most of the properties listed at the beginning of this section are matched, there are two problems with this model: (1) the degree distribution follows a power-law only for small degrees and then shows multinomial characteristics (See Figure 2), and (2) it does not generate homophily and community structure, because it is possible for every node to get connected to every other node, rather than to similar nodes in the graph.

6 18 L. Akoglu and C. Faloutsos Fig. 2. Top row: Results of RTG-IE (k =5,p =0.16, W =1M). The problem with this model is that in(out)-degrees form multinomial clusters (left). This is because nodes with labels of the same length are expected to have the same degree. This can be observed on the rank-frequency plot (right) where we see many words with the same frequency. Notice the staircase effect. Bottom row: Results of RTG-IU (k =5, p =[0.03, 0.05, 0.1, 0.22, 0.30], W = 1M). Unequal probabilities introduce smoothing on the frequency of words that are of the same length (right). As a result, the degree distribution follows a power-law with expected heavy tails (left). 3.2 RTG-IU: RTG with Independent Un-Equiprobable Keys We can spread the degrees so that nodes with the same-length but otherwise distinct labels would have different degrees by making keys have unequal probabilities. This procedure introduces smoothing in the distribution of degrees, which remedies the first problem introduced by the RTG-IE model. In addition, thanks to [6], we are still guaranteed to obtain the desired power-law characteristics as before. See Figure RTG: Random Typing Graphs What the previous model fails to capture is the homophily and community structure. In a real network, we would expect nodes to get connected to similar nodes (homophily), and form groups and possibly groups within groups (modular structure). In our model, for example on a keyboard with two keys a and b, we would like nodes with many a s in their labels to be connected to similar nodes, as opposed to nodes labeled with many b s. However, in both RTG-IE and

7 RTG: A Recursive Realistic Graph Generator Using Random Typing 19 a b S a b S a b S a a*- a* a*- b* a*- S a aa*- ab* ab*-aa* ab*- ab* aa*-as a*- b* ab*-as a*- S a prob(a*,a*) = p a prob(a*,b*) prob(a*,s) p a p b β p a qβ p a as-aa* as-ab* as-as b b*- a* b*- b* b*- S b b*- a* b*- b* b*- S b p b p a β prob(b*,b*) = p b prob(b*,a*) prob(b*,s) p b qβ p b S S -a* S -b* S -S S S -a* S -b* S -S S qp a β qp b β prob(s,s) = q prob(s,a*) prob(s,b*) p a p b q (a) first level (b) recursion (c) communities q Fig. 3. The RTG model: random typing on a 2-d keyboard, generating edges (sourcedestination pairs). See Algorithm 1. (a) an example 2-d keyboard (nine keys), hitting a key generates the row(column) character for source(destination), shaded keys terminate source and/or destination words. (b) illustrates recursive nature. (c) the imbalance factor β favors diagonal keys and leads to homophily. RTG-IU it is possible for every node to connect to every other node. In fact, this yields a tightly connected core of nodes with rather short labels. Our proposal to fix this is to envision a two-dimensional keyboard that generates source and destination labels in one shot, as shown in Figure 3. The previous model generates a word for source, and, completely independently, another word for destination. In the example with two keys, we can envision this process as picking one of the nine keys in Figure 3(a), using the independence assumption: the probability for each key is the product of the probability of the corresponding row times the probability of the corresponding column: p l for letter l, and q for space ( S ). After a key is selected, its row character is appended to the source label, and the column character to the destination label. This process repeats recursively as in Figure 3(b), until the space character is hit on the first dimension in which case the source label is terminated and also on the second dimension in which case the destination label is terminated. In order to model homophily and communities, rather than assigning crossproduct probabilities to keys on the 2-d keyboard, we introduce an imbalance factor β, which will decrease the chance of a-to-b edges, and increase the chance for a-to-a and b-to-b edges, as shown in Figure 3(c). Thus, for the example that we have, the formulas for the probabilities of the nine keys become: prob(a, b) =prob(b, a) =p ap b β, prob(a, a) =p a (prob(a, b)+prob(a, S)), prob(s, a) =prob(a, S) =qp aβ, prob(b, b) =p b (prob(b, a)+prob(b, S)), prob(s, b) =prob(b, S) =qp b β, prob(s, S) =q (prob(s, a)+prob(s, b)). By boosting the probabilities of the diagonal keys and down-rating the probabilities of the off-diagonal keys, we are guaranteed that nodes with similar labels will have higher chance to get connected. The pseudo-code of generating edges as described above is shown in Algorithm 1. Next, before showing the experimental results of RTG, we take a detour to describe how we handle time so that edge/weight additions are bursty and

8 20 L. Akoglu and C. Faloutsos self-similar. We also discuss the generalizations of the model in order to produce all types of uni/bipartite, (un)weighted, and (un)directed graphs. Algorithm 1. RTG Input: k, q, W, β Output: edge-list L for output graph G 1: Initialize (k + 1)-by-(k + 1)matrix P with cross-product probabilities 2: // in order to ensure homophily and community structure 3: Multiply off-diagonal probabilities by β, 0 <β<1 4: Boost diagonal probabilities s.t. sum of row(column) probabilities remain the same. 5: Initialize edge list L 6: for 1toW do 7: L1, L2 SelectNodeLabels(P ) 8: AppendL1,L2toL 9: end for 10: 11: function SelectNodeLabels (P ):L1,L2 12: Initialize L1 and L2 to empty string 13: while not terminated L1 and not terminated L2 do 14: Draw i,j with probability P (i, j) 15: if i k, j k then 16: Append character i to L1 and j to L2 if not terminated 17: else if i k, j=k +1 then 18: Append character i to L1 if not terminated 19: Terminate L2 20: else if i=k +1, j k then 21: Append character j to L2 if not terminated 22: Terminate L1 23: else 24: Terminate L1 and L2 25: end if 26: end while 27: Return L1 and L2 28: end function 3.4 Burstiness and Self-Similarity Most real-world traffic as well as edge/weight additions to real-world graphs have been found to be self-similar and bursty [7,14,15,22]. Therefore, in this section we give a brief overview of how to aggregate time so that edge and weight additions, that is ΔE and ΔW, are bursty and self-similar. Notice that when we link two nodes at each step, we add 1 to the total weight W. So, if every step is represented as a single time-tick, the weight additions are uniform. However, to generate bursty traffic, we need to have a bias factor b> 0.5, such that b-fraction of the additions happen in one half and the remaining in the other half. We will use the b-model [30], which generates such self-similar and bursty traffic. Specifically, starting with a uniform interval, we will recursively

9 RTG: A Recursive Realistic Graph Generator Using Random Typing 21 subdivide weight additions to each half, quarter, and so on, according to the bias b. To create randomness, at each step we will randomly swap the order of fractions b and (1 b). Among many methods that measure self-similarity we use the entropy plot [30], which plots the entropy H(r) versus the resolution r. The resolution is the scale, that is, at resolution r, we divide our time interval into 2 r equal sub-intervals, compute ΔE in each sub-interval k(k =1...2 r ), normalize into fractions p k = ΔE E, and compute the Shannon entropy H(r) of the sequence p k. If the plot H(r) is linear,the correspondingtime sequence is said to be self-similar, and the slope of the plot is defined as the fractal dimension f d of the time sequence. Notice that a uniform Δ distribution yields f d =1; a lower value of f d corresponds to a more bursty time sequence, with a single burst having the lowest f d =0:thefractaldimension of a point. 3.5 Generalizations We can easily generalize RTG to model all type of graphs. To generate undirected graphs, we can simply assume edges from source to destination to be undirected as the formation of source and destination labels is the same and symmetric. For unweighted graphs, we can simply ignore duplicate edges, that is, edges that connect already linked nodes. Finally, for bipartite graphs, we can use two different sets of keys such that on the 2-d keyboard, source dimension contains keys from the first set, and the destination dimension from the other set. This assures source and destination labels to be completely different, as desired. 4 Experimental Results ThequestionwewishtoanswerhereishowRTGisabletomodelreal-world graphs. The datasets we used are: Blognet: a social network of blogs based on citations (undirected, unipartite and unweighted with N=27, 726; E=126, 227; over 80 time ticks). Com2Cand: the U.S. electoral campaign donations network from organizations to candidates (directed, bipartite and weighted with N=23, 191; E=877, 721; and W =4, 383, 105, 580 over 29 time ticks). Weights on edges indicate donated dollar amounts. In Figures 4 and 5, we show the related patterns for Blognet and Com2Cand as well as synthetic results, respectively. In order to model these networks, we ran experiments for different parameter values k, q, W,andβ. Here, we show the closestresults that RTG generated, though fitting the parameters is a challenging future direction. We observe that RTG is able to match the long wish-list of static and dynamic properties we presented earlier for the two real graphs. In order to evaluate community structure, we use the modularity measure in [25]. Figure 6(left) shows that modularity increases with smaller imbalance factor β. Without any imbalance, β=1,modularity is as low as 0.35, which indicates that no significant modularity exists. In Figure 6(right), we also show

10 22 L. Akoglu and C. Faloutsos (a) diameter (b) components (c) degrees (d) TPL (e)dpl (f)entropyδe (g) λ 1PL (h) EPL (a) L08 diameter (b) L09 components (c) L01 degree distr. (d) L05 TPL (e) L02 DPL (f) L11 entropy ΔE (g) L07 λ 1PL (h) L06 EPL Fig. 4. Top two rows: properties of Blognet: (a) small and shrinking diameter; (b) largest 3 connected components; (c) degree distribution; (d) triangles Δ vs number of nodes with Δ triangles; (e) densification; (f) bursty edge additions; (g) largest 3 eigenvalues wrt E; (h) rank spectrum of the adjacency matrix. Bottom two rows: results of RTG. Notice the similar qualitative behavior for all eight laws. the running time of RTG wrt the number of duplicate edges (that is, number of iterations W ). Notice the linear growth with increasing W. 5 Conclusion We have designed a generator that meets all the five desirable properties in the introduction. Particularly, our model is

11 RTG: A Recursive Realistic Graph Generator Using Random Typing 23 (a) diameter (b) components (c) degree distr. (d) SPL (e) D(W)PL (f) entropy ΔW(E) (g) λ 1PL (h) EPL (a) L08 diameter (b) L09 components (c) L01 degree distr. (d) L04 SPL (e) L02, L03 D(W)PL (f) L11 entropy ΔW(E) (g) L07 λ 1PL (h) L06 EPL Fig. 5. Top two rows: properties of Com2Cand; as opposed to Blognet, Com2Cand is weighted. So, different from above we show: (d) node weight vs in(inset: out)degree; (e) total weight vs number of edges(inset); (f) bursty weight additions(inset); Bottom two rows: results of RTG. Notice the similar qualitative behavior for all nine laws modularity time (s) β x 10 6 Fig. 6. Left: modularity score vs. imbalance factor β, modularity increases with decreasing β. Forβ=1, the score is very low indicating no significant modularity. Right: computation time vs. W, time grows linearly with increasing number of iterations W. W

12 24 L. Akoglu and C. Faloutsos 1. simple and intuitive, yet it generates the emergent, macroscopic patterns that we see in real graphs. 2. realistic, generating graphs that obey all eleven properties that real graphs obey - no other generator has been shown to achieve that. 3. parsimonious, requiring only a handful of parameters. 4. flexible, capable of generating weighted/unweighted, directed/undirected, and unipartite/bipartite graphs, and any combination of the above. 5. fast, being linear on the number of iterations (on a par with the number of duplicate edges in the output graph). Moreover, we showed how well RTG can mimic some large, real graphs. We have also proven that an early version of RTG generates several of the desired (power) laws, formulated in terms of model parameters. Acknowledgments. This material is based upon work supported by the National Science Foundation under Grants No. IIS , IIS and IIS , also by the icast project sponsored by the National Science Council, Taiwan, under Grants No. NSC P and by the Ministry of Economic Affairs, Taiwan, under Grants No. 97-EC-17-A-02-R and under the auspices of the U.S. Department of Energy by University of California Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (LLNL-CONF ), subcontracts B579447, B Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of any of the funding parties. References 1. Akoglu, L., McGlohon, M., Faloutsos, C.: Rtm: Laws and a recursive generator for weighted time-evolving graphs. In: ICDM (2008) 2. Albert, R., Jeong, H., Barabasi, A.-L.: Diameter of the World Wide Web. Nature 401, (1999) 3. Barabasi, A.L., Albert, R.: Emergence of scaling in random networks. Science 286(5439), (1999) 4. Chakrabarti, D., Faloutsos, C.: Graph mining: Laws, generators, and algorithms. ACM Comput. Surv. 38(1) (2006) 5. Chakrabarti, D., Zhan, Y., Faloutsos, C.: R-MAT: A recursive model for graph mining. In: SIAM Int. Conf. on Data Mining (April 2004) 6. Conrad, B., Mitzenmacher, M.: Power laws for monkeys typing randomly: the case of unequal probabilities. IEEE Transactions on Information Theory 50(7), (2004) 7. Crovella, M., Bestavros, A.: Self-similarity in world wide web traffic, evidence and possible causes. Sigmetrics, (1996) 8. Erdos, P., Renyi, A.: On the evolution of random graphs. Publ. Math. Inst. Hungary. Acad. Sci. 5, (1960) 9. Even-Bar, E., Kearns, M., Suri, S.: A network formation game for bipartite exchange economies. In: SODA (2007) 10. Fabrikant, A., Luthra, A., Maneva, E.N., Papadimitriou, C.H., Shenker, S.: On a network creation game. In: PODC (2003)

13 RTG: A Recursive Realistic Graph Generator Using Random Typing Faloutsos, M., Faloutsos, P., Faloutsos, C.: On power-law relationships of the internet topology. In: SIGCOMM, August-September 1999, pp (1999) 12. Flake, G.W., Lawrence, S., Giles, C.L., Coetzee, F.M.: Self-organization and identification of web communities. IEEE Computer 35, (2002) 13. Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. PNAS 99, 7821 (2002) 14. Gomez, M.E., Santonja, V.: Self-similarity in i/o workload: Analysis and modeling. In: WWC (1998) 15. Gribble, S.D., Manku, G.S., Roselli, D., Brewer, E.A., Gibson, T.J., Miller, E.L.: Self-similarity in file systems. In: SIGMETRICS 1998 (1998) 16. Kleinberg, J.M., Kumar, R., Raghavan, P., Rajagopalan, S., Tomkins, A.S.: The Web as a graph: Measurements, models and methods. In: Asano, T., Imai, H., Lee, D.T., Nakano, S.-i., Tokuyama, T. (eds.) COCOON LNCS, vol. 1627, pp Springer, Heidelberg (1999) 17. Scheinerman, E., Kraetzl, M., Nickel, C.: Random dot product graphs: a model for social networks (Preliminary Manuscript) (2005) 18. Laoutaris, N., Poplawski, L.J., Rajaraman, R., Sundaram, R., Teng, S.-H.: Bounded budget connection (bbc) games or how to make friends and influence people, on a budget. In: PODC (2008) 19. Leskovec, J., Chakrabarti, D., Kleinberg, J.M., Faloutsos, C.: Realistic, mathematically tractable graph generation and evolution, using kronecker multiplication. In: Jorge, A.M., Torgo, L., Brazdil, P.B., Camacho, R., Gama, J. (eds.) PKDD LNCS (LNAI), vol. 3721, pp Springer, Heidelberg (2005) 20. Leskovec, J., Kleinberg, J., Faloutsos, C.: Graphs over time: densification laws, shrinking diameters and possible explanations. In: ACM SIGKDD (2005) 21. Mandelbrot, B.: An informational theory of the statistical structure of language. Communication Theory (1953) 22. McGlohon, M., Akoglu, L., Faloutsos, C.: Weighted graphs and disconnected components: Patterns and a generator. In: ACM SIGKDD, Las Vegas (August 2008) 23. Miller, G.A.: Some effects of intermittent silence. American Journal of Psychology 70, (1957) 24. Newman, M.E.J.: Power laws, Pareto distributions and Zipf s law (December 2004) 25. Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Physical Review E 69, (2004) 26. Pennock, D.M., Flake, G.W., Lawrence, S., Glover, E.J., Giles, C.L.: Winners don t take all: Characterizing the competition for links on the web. Proceedings of the National Academy of Sciences, (2002) 27. Schwartz, M.F., Wood, D.C.M.: Discovering shared interests among people using graph analysis of global electronic mail traffic. Communications of the ACM 36, (1992) 28. Siganos, G., Faloutsos, M., Faloutsos, P., Faloutsos, C.: Power laws and the ASlevel internet topology (2003) 29. Tsourakakis, C.E.: Fast counting of triangles in large real networks without counting: Algorithms and laws. In: ICDM (2008) 30. Wang, M., Madhyastha, T., Chan, N.H., Papadimitriou, S., Faloutsos, C.: Data mining meets performance evaluation: Fast algorithms for modeling bursty traffic. In: ICDE, pp (2002) 31. Watts, D.J., Strogatz, S.H.: Collective dynamics of small-world networks. Nature 393(6684), (1998)

14 26 L. Akoglu and C. Faloutsos 32. Young, S.J., Scheinerman, E.R.: Random dot product graph models for social networks. In: Bonato, A., Chung, F.R.K. (eds.) WAW LNCS, vol. 4863, pp Springer, Heidelberg (2007) 33. Zipf, G.K.: Selective Studies and the Principle of Relative Frequency in Language. Harvard University Press (1932) Appendix Consider the following setting: W words are typed on a keyboard with k keys and a space bar, the probability of hitting a key p being the same for all keys and probability of hitting the space bar being denoted as q=(1 kp), in the output graph G of the RTG-IE model: Lemma 1. The expected number of nodes N is N W logpk. Proof. Given the number of words W, we want to find the expected number of nodes N that the RTG-IE graph consists of. This question can be reformulated as follows: Given W words typed by a monkey on a keyboard with k keys and a space bar, what is the size of the vocabulary V? The number of unique words V is basically equal to the number of nodes N in the output graph. Let w denote a single word generated by the defined random process. Then, w can recursively be written as follows: w : c i w S, where c i is the character that corresponds to key i, 1 i k, ands is the space character. So, V as a function of model parameters can be formulated as: V (W )=V(c 1,Wp)+V(c 2,Wp)+...+ V (c k,wp)+v(s) { 1, 1 (1 q) W = k V (Wp)+V(S) = k V (Wp)+ 0, (1 q) W where q denotes the probability of hitting the space bar, i.e. q =1 kp. Given the fact that W is often large, and (1 q) < 1, it is almost always the case that w=s is generated; but since this adds only a constant factor, we can ignore it in the rest of the computation. That is, V (W ) k V (Wp)=k (k V (Wp 2 )) = k n V (1) where n = log p (1/W )= log p W. By definition, when W=1, that is, in case only one word is generated, the vocabulary size is 1, i.e. V(1)=1. Therefore, V (W )=N k n = k logpw = W logpk. The above proof shown using recursion is in agreement with the early result of Miller [23], who showed that in the monkey-typing experiment with k equiprobable keys (with probability p) and a space bar (with probability q), the rank-frequency distribution of words follow a power law. In particular, f(r) r 1+log k(1 q) 1 = r log kp.

15 RTG: A Recursive Realistic Graph Generator Using Random Typing x + ( ) = y 10 4 W [1:1M] E = W log p k *(1+c logw) Fig. 7. (a) Rank vs count of vocabulary words typed randomly on a keyboard with k equiprobable keys (with probability p) and a space bar (with probability q), follow a power law with exponent α = log k p. Approximately, the area under the curve gives the total number of words typed. (b) The relationship between number of edges E and total weight W behaves like a power-law (k=2, p=0.4). In this case, the number of ranks corresponds to the number of unique words, that is, the vocabulary size V. And, the sum of the counts of occurrences of all words in the vocabulary should give W, the number of words typed. The total count can be approximated by the area under the curve on the rank-count plot. See Figure 7(a). Next, we give a second proof of Lemma 1 using Miller s result. Proof. Let α = log k p and C(r) denote the number of times that the word with rank r is typed. Then, C(r) =cr α,wherec(r) min = C(V )=cv α and the constant c = C(V )V α.thenwecanwritew as ( V ) ( ) V ( r W = C(V )V α r α C(V )V α r α dr = C(V )V α α+1 ) V r=1 r=1 α +1 r=1 ( ) 1 = C(V )V α α 1 1 ( α 1)V α 1 c V α. where c = C(V ) α 1,whereα< 1 andc(v ) is very small (usually 1). Therefore, V = N W 1 α = W log pk. Lemma 2. The expected number of edges E is E W logpk (1 + c logw), for c = q logpk logp > 0. Proof. Given the number of words W, we want to find the expected number of edges E that the RTG-IE graph consists of. The number of edges E is the same as the unique number of pairs of words. We can think of a pair of words as a single word e, the generation of which is stopped after the second hit to the space

16 28 L. Akoglu and C. Faloutsos bar. So, e always contains a single space character. Recursively, e : c i e Sw, where w : c i w S. So, E canbeformulatedas: E(W )=k E(Wp)+V(Wq) (1) { 1, 1 (1 q) Wq V (Wq)= k V (Wqp)+ 0, (1 q) Wq (2) From Lemma 1, Equ.(2) can be approximately written as V (Wq)=(Wq) logpk. Then, Equ.(1) becomes E(W )=k E(Wp)+cW α,wherec = q logpk and α = log p k. Given that E(W=1)=1, we can solve the recursion as follows: E(W ) k (k E(Wp 2 )+c(wp) α )+cw α = k (k (k V (Wp 3 )+c(wp 2 ) α )+c(wp) α )+cw α = k n V (1) + k n 1 c(wp n 1 )+k n 2 c(wp n 2 ) α cw α = k n V (1) + cw α ((kp α ) n 1 +(kp α ) n ) where n = log p (1/W )= log p W.Sincekp α = kp logpk =1, E(W ) k n V (1)+n cw α =k logpw +c log 1 W logp W logpk =W logpk (1+c logw) where c = c logp = q logp k logp > 0. The above function of E in terms of W and other model parameters looks like a power-law for a wide range of W. See Figure 7(b). Lemma 3. The in/out-degree d n of a node is power law related to its total in/out-weight W n, that is, with expected exponent log k p>1. W n d log kp n Proof. We will show that W n d log kp n for out-edges, and a similar argument holds for in-edges. Given that the experiment is repeated W times, let W n denote the number of times a unique word is typed as a source. Each such unique word corresponds to a node in the final graph and W n is basically its out-weight, since the node appears as a source node. Then, the out-degree d n of a node is simply the number of unique words typed as a destination. From Lemma 1, W n d log kp n, for log k p>1.

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu Mary McGlohon Christos Faloutsos Carnegie Mellon University, School of Computer Science {lakoglu, mmcgloho, christos}@cs.cmu.edu

More information

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs

RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Submitted for Blind Review Abstract How do real, weighted graphs change over time? What patterns, if any, do they obey? Earlier studies

More information

6.207/14.15: Networks Lecture 12: Generalized Random Graphs

6.207/14.15: Networks Lecture 12: Generalized Random Graphs 6.207/14.15: Networks Lecture 12: Generalized Random Graphs 1 Outline Small-world model Growing random networks Power-law degree distributions: Rich-Get-Richer effects Models: Uniform attachment model

More information

Networks as vectors of their motif frequencies and 2-norm distance as a measure of similarity

Networks as vectors of their motif frequencies and 2-norm distance as a measure of similarity Networks as vectors of their motif frequencies and 2-norm distance as a measure of similarity CS322 Project Writeup Semih Salihoglu Stanford University 353 Serra Street Stanford, CA semih@stanford.edu

More information

Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication

Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication Realistic, Mathematically Tractable Graph Generation and Evolution, Using Kronecker Multiplication Jurij Leskovec 1, Deepayan Chakrabarti 1, Jon Kleinberg 2, and Christos Faloutsos 1 1 School of Computer

More information

Modeling of Growing Networks with Directional Attachment and Communities

Modeling of Growing Networks with Directional Attachment and Communities Modeling of Growing Networks with Directional Attachment and Communities Masahiro KIMURA, Kazumi SAITO, Naonori UEDA NTT Communication Science Laboratories 2-4 Hikaridai, Seika-cho, Kyoto 619-0237, Japan

More information

Weighted graphs and disconnected components

Weighted graphs and disconnected components Weighted graphs and disconnected components Patterns and a Generator Mary McGlohon Carnegie Mellon University School of Computer Science Forbes Ave. Pittsburgh, Penn. USA mmcgloho@cs.cmu.edu Leman Akoglu

More information

Weighted Graphs and Disconnected Components

Weighted Graphs and Disconnected Components Weighted Graphs and Disconnected Components Patterns and a Generator Mary McGlohon Carnegie Mellon University School of Computer Science Forbes Ave. Pittsburgh, Penn. USA mmcgloho@cs.cmu.edu Leman Akoglu

More information

arxiv:cond-mat/ v1 [cond-mat.dis-nn] 4 May 2000

arxiv:cond-mat/ v1 [cond-mat.dis-nn] 4 May 2000 Topology of evolving networks: local events and universality arxiv:cond-mat/0005085v1 [cond-mat.dis-nn] 4 May 2000 Réka Albert and Albert-László Barabási Department of Physics, University of Notre-Dame,

More information

Chapter 2 STATISTICAL PROPERTIES OF SOCIAL NETWORKS. Mary McGlohon. Leman Akoglu. Christos Faloutsos

Chapter 2 STATISTICAL PROPERTIES OF SOCIAL NETWORKS. Mary McGlohon. Leman Akoglu. Christos Faloutsos Chapter 2 STATISTICAL PROPERTIES OF SOCIAL NETWORKS Mary McGlohon School of Computer Science Carnegie Mellon University mmcgloho@cs.cmu.edu Leman Akoglu School of Computer Science Carnegie Mellon University

More information

Analysis & Generative Model for Trust Networks

Analysis & Generative Model for Trust Networks Analysis & Generative Model for Trust Networks Pranav Dandekar Management Science & Engineering Stanford University Stanford, CA 94305 ppd@stanford.edu ABSTRACT Trust Networks are a specific kind of social

More information

Analysis of Large Multi-modal Social Networks: Patterns and a Generator

Analysis of Large Multi-modal Social Networks: Patterns and a Generator Analysis of Large Multi-modal Social Networks: Patterns and a Generator Nan Du 1,HaoWang 1, and Christos Faloutsos 2 1 Nokia Research Center, Beijing {daniel.du,hao.ui.wang}@nokia.com 2 Carnegie Mellon

More information

CS224W: Social and Information Network Analysis

CS224W: Social and Information Network Analysis CS224W: Social and Information Network Analysis Reaction Paper Adithya Rao, Gautam Kumar Parai, Sandeep Sripada Keywords: Self-similar networks, fractality, scale invariance, modularity, Kronecker graphs.

More information

Data mining in large graphs

Data mining in large graphs Data mining in large graphs Christos Faloutsos University www.cs.cmu.edu/~christos ALLADIN 2003 C. Faloutsos 1 Outline Introduction - motivation Patterns & Power laws Scalability & Fast algorithms Fractals,

More information

Data Mining and Analysis: Fundamental Concepts and Algorithms

Data Mining and Analysis: Fundamental Concepts and Algorithms Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA

More information

CS224W: Analysis of Networks Jure Leskovec, Stanford University

CS224W: Analysis of Networks Jure Leskovec, Stanford University CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/30/17 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 2

More information

Deterministic scale-free networks

Deterministic scale-free networks Physica A 299 (2001) 559 564 www.elsevier.com/locate/physa Deterministic scale-free networks Albert-Laszlo Barabasi a;, Erzsebet Ravasz a, Tamas Vicsek b a Department of Physics, College of Science, University

More information

T , Lecture 6 Properties and stochastic models of real-world networks

T , Lecture 6 Properties and stochastic models of real-world networks T-79.7003, Lecture 6 Properties and stochastic models of real-world networks Charalampos E. Tsourakakis 1 1 Aalto University November 1st, 2013 Properties of real-world networks Properties of real-world

More information

arxiv: v2 [stat.ml] 21 Aug 2009

arxiv: v2 [stat.ml] 21 Aug 2009 KRONECKER GRAPHS: AN APPROACH TO MODELING NETWORKS Kronecker graphs: An Approach to Modeling Networks arxiv:0812.4905v2 [stat.ml] 21 Aug 2009 Jure Leskovec Computer Science Department, Stanford University

More information

arxiv: v1 [physics.soc-ph] 15 Dec 2009

arxiv: v1 [physics.soc-ph] 15 Dec 2009 Power laws of the in-degree and out-degree distributions of complex networks arxiv:0912.2793v1 [physics.soc-ph] 15 Dec 2009 Shinji Tanimoto Department of Mathematics, Kochi Joshi University, Kochi 780-8515,

More information

CMU SCS : Multimedia Databases and Data Mining CMU SCS Must-read Material CMU SCS Must-read Material (cont d)

CMU SCS : Multimedia Databases and Data Mining CMU SCS Must-read Material CMU SCS Must-read Material (cont d) 15-826: Multimedia Databases and Data Mining Lecture #26: Graph mining - patterns Christos Faloutsos Must-read Material Michalis Faloutsos, Petros Faloutsos and Christos Faloutsos, On Power-Law Relationships

More information

Must-read material (1 of 2) : Multimedia Databases and Data Mining. Must-read material (2 of 2) Main outline

Must-read material (1 of 2) : Multimedia Databases and Data Mining. Must-read material (2 of 2) Main outline Must-read material (1 of 2) 15-826: Multimedia Databases and Data Mining Lecture #29: Graph mining - Generators & tools Fully Automatic Cross-Associations, by D. Chakrabarti, S. Papadimitriou, D. Modha

More information

A Modified Method Using the Bethe Hessian Matrix to Estimate the Number of Communities

A Modified Method Using the Bethe Hessian Matrix to Estimate the Number of Communities Journal of Advanced Statistics, Vol. 3, No. 2, June 2018 https://dx.doi.org/10.22606/jas.2018.32001 15 A Modified Method Using the Bethe Hessian Matrix to Estimate the Number of Communities Laala Zeyneb

More information

A Modified Earthquake Model Based on Generalized Barabási Albert Scale-Free

A Modified Earthquake Model Based on Generalized Barabási Albert Scale-Free Commun. Theor. Phys. (Beijing, China) 46 (2006) pp. 1011 1016 c International Academic Publishers Vol. 46, No. 6, December 15, 2006 A Modified Earthquake Model Based on Generalized Barabási Albert Scale-Free

More information

1 Complex Networks - A Brief Overview

1 Complex Networks - A Brief Overview Power-law Degree Distributions 1 Complex Networks - A Brief Overview Complex networks occur in many social, technological and scientific settings. Examples of complex networks include World Wide Web, Internet,

More information

A stochastic model for the link analysis of the Web

A stochastic model for the link analysis of the Web A stochastic model for the link analysis of the Web Paola Favati, IIT-CNR, Pisa. Grazia Lotti, University of Parma. Ornella Menchi and Francesco Romani, University of Pisa Three examples of link distribution

More information

Models of Communication Dynamics for Simulation of Information Diffusion

Models of Communication Dynamics for Simulation of Information Diffusion Models of Communication Dynamics for Simulation of Information Diffusion Konstantin Mertsalov, Malik Magdon-Ismail, Mark Goldberg Rensselaer Polytechnic Institute Department of Computer Science 11 8th

More information

Link Analysis Ranking

Link Analysis Ranking Link Analysis Ranking How do search engines decide how to rank your query results? Guess why Google ranks the query results the way it does How would you do it? Naïve ranking of query results Given query

More information

Evolving network with different edges

Evolving network with different edges Evolving network with different edges Jie Sun, 1,2 Yizhi Ge, 1,3 and Sheng Li 1, * 1 Department of Physics, Shanghai Jiao Tong University, Shanghai, China 2 Department of Mathematics and Computer Science,

More information

Spectral Methods for Subgraph Detection

Spectral Methods for Subgraph Detection Spectral Methods for Subgraph Detection Nadya T. Bliss & Benjamin A. Miller Embedded and High Performance Computing Patrick J. Wolfe Statistics and Information Laboratory Harvard University 12 July 2010

More information

Mini course on Complex Networks

Mini course on Complex Networks Mini course on Complex Networks Massimo Ostilli 1 1 UFSC, Florianopolis, Brazil September 2017 Dep. de Fisica Organization of The Mini Course Day 1: Basic Topology of Equilibrium Networks Day 2: Percolation

More information

The Beginning of Graph Theory. Theory and Applications of Complex Networks. Eulerian paths. Graph Theory. Class Three. College of the Atlantic

The Beginning of Graph Theory. Theory and Applications of Complex Networks. Eulerian paths. Graph Theory. Class Three. College of the Atlantic Theory and Applications of Complex Networs 1 Theory and Applications of Complex Networs 2 Theory and Applications of Complex Networs Class Three The Beginning of Graph Theory Leonhard Euler wonders, can

More information

Deterministic Decentralized Search in Random Graphs

Deterministic Decentralized Search in Random Graphs Deterministic Decentralized Search in Random Graphs Esteban Arcaute 1,, Ning Chen 2,, Ravi Kumar 3, David Liben-Nowell 4,, Mohammad Mahdian 3, Hamid Nazerzadeh 1,, and Ying Xu 1, 1 Stanford University.

More information

Long Term Evolution of Networks CS224W Final Report

Long Term Evolution of Networks CS224W Final Report Long Term Evolution of Networks CS224W Final Report Jave Kane (SUNET ID javekane), Conrad Roche (SUNET ID conradr) Nov. 13, 2014 Introduction and Motivation Many studies in social networking and computer

More information

Graph Detection and Estimation Theory

Graph Detection and Estimation Theory Introduction Detection Estimation Graph Detection and Estimation Theory (and algorithms, and applications) Patrick J. Wolfe Statistics and Information Sciences Laboratory (SISL) School of Engineering and

More information

Dynamics of Real-world Networks

Dynamics of Real-world Networks Dynamics of Real-world Networks Thesis proposal Jurij Leskovec Machine Learning Department Carnegie Mellon University May 2, 2007 Thesis committee: Christos Faloutsos, CMU Avrim Blum, CMU John Lafferty,

More information

6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search

6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search 6.207/14.15: Networks Lecture 7: Search on Networks: Navigation and Web Search Daron Acemoglu and Asu Ozdaglar MIT September 30, 2009 1 Networks: Lecture 7 Outline Navigation (or decentralized search)

More information

Network models: dynamical growth and small world

Network models: dynamical growth and small world Network models: dynamical growth and small world Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics

More information

Measuring the shape of degree distributions

Measuring the shape of degree distributions Measuring the shape of degree distributions Dr Jennifer Badham Visiting Fellow SEIT, UNSW Canberra research@criticalconnections.com.au Overview Context What does shape mean for degree distribution Why

More information

Network models: random graphs

Network models: random graphs Network models: random graphs Leonid E. Zhukov School of Data Analysis and Artificial Intelligence Department of Computer Science National Research University Higher School of Economics Structural Analysis

More information

Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University

Mining of Massive Datasets Jure Leskovec, AnandRajaraman, Jeff Ullman Stanford University Note to other teachers and users of these slides: We would be delighted if you found this our material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit

More information

Algebraic Representation of Networks

Algebraic Representation of Networks Algebraic Representation of Networks 0 1 2 1 1 0 0 1 2 0 0 1 1 1 1 1 Hiroki Sayama sayama@binghamton.edu Describing networks with matrices (1) Adjacency matrix A matrix with rows and columns labeled by

More information

Eigenvalues of Random Power Law Graphs (DRAFT)

Eigenvalues of Random Power Law Graphs (DRAFT) Eigenvalues of Random Power Law Graphs (DRAFT) Fan Chung Linyuan Lu Van Vu January 3, 2003 Abstract Many graphs arising in various information networks exhibit the power law behavior the number of vertices

More information

Modeling Dynamic Evolution of Online Friendship Network

Modeling Dynamic Evolution of Online Friendship Network Commun. Theor. Phys. 58 (2012) 599 603 Vol. 58, No. 4, October 15, 2012 Modeling Dynamic Evolution of Online Friendship Network WU Lian-Ren ( ) 1,2, and YAN Qiang ( Ö) 1 1 School of Economics and Management,

More information

Degree Distribution: The case of Citation Networks

Degree Distribution: The case of Citation Networks Network Analysis Degree Distribution: The case of Citation Networks Papers (in almost all fields) refer to works done earlier on same/related topics Citations A network can be defined as Each node is

More information

Class President: A Network Approach to Popularity. Due July 18, 2014

Class President: A Network Approach to Popularity. Due July 18, 2014 Class President: A Network Approach to Popularity Due July 8, 24 Instructions. Due Fri, July 8 at :59 PM 2. Work in groups of up to 3 3. Type up the report, and submit as a pdf on D2L 4. Attach the code

More information

arxiv:physics/ v3 [physics.soc-ph] 28 Jan 2007

arxiv:physics/ v3 [physics.soc-ph] 28 Jan 2007 arxiv:physics/0603229v3 [physics.soc-ph] 28 Jan 2007 Graph Evolution: Densification and Shrinking Diameters Jure Leskovec School of Computer Science, Carnegie Mellon University, Pittsburgh, PA Jon Kleinberg

More information

Combining Geographic and Network Analysis: The GoMore Rideshare Network. Kate Lyndegaard

Combining Geographic and Network Analysis: The GoMore Rideshare Network. Kate Lyndegaard Combining Geographic and Network Analysis: The GoMore Rideshare Network Kate Lyndegaard 10.15.2014 Outline 1. Motivation 2. What is network analysis? 3. The project objective 4. The GoMore network 5. The

More information

Opinion Dynamics on Triad Scale Free Network

Opinion Dynamics on Triad Scale Free Network Opinion Dynamics on Triad Scale Free Network Li Qianqian 1 Liu Yijun 1,* Tian Ruya 1,2 Ma Ning 1,2 1 Institute of Policy and Management, Chinese Academy of Sciences, Beijing 100190, China lqqcindy@gmail.com,

More information

El Botellón: Modeling the Movement of Crowds in a City

El Botellón: Modeling the Movement of Crowds in a City El Botellón: Modeling the Movement of Crowds in a City Jonathan E. Rowe Rocio Gomez School of Computer Science, University of Birmingham, Birmingham B15 2TT, United Kingdom A simulation of crowd movement

More information

Modeling, Analysis and Validation of Evolving Networks with Hybrid Interactions

Modeling, Analysis and Validation of Evolving Networks with Hybrid Interactions 1 Modeling, Analysis and Validation of Evolving Networks with Hybrid Interactions Jiaqi Liu, Luoyi Fu, Yuhang Yao, Xinzhe Fu, Xinbing Wang and Guihai Chen Shanghai Jiao Tong University {13-liujiaqi, yiluofu,

More information

Spectral Analysis of Directed Complex Networks. Tetsuro Murai

Spectral Analysis of Directed Complex Networks. Tetsuro Murai MASTER THESIS Spectral Analysis of Directed Complex Networks Tetsuro Murai Department of Physics, Graduate School of Science and Engineering, Aoyama Gakuin University Supervisors: Naomichi Hatano and Kenn

More information

Shlomo Havlin } Anomalous Transport in Scale-free Networks, López, et al,prl (2005) Bar-Ilan University. Reuven Cohen Tomer Kalisky Shay Carmi

Shlomo Havlin } Anomalous Transport in Scale-free Networks, López, et al,prl (2005) Bar-Ilan University. Reuven Cohen Tomer Kalisky Shay Carmi Anomalous Transport in Complex Networs Reuven Cohen Tomer Kalisy Shay Carmi Edoardo Lopez Gene Stanley Shlomo Havlin } } Bar-Ilan University Boston University Anomalous Transport in Scale-free Networs,

More information

Recognizing single-peaked preferences on aggregated choice data

Recognizing single-peaked preferences on aggregated choice data Recognizing single-peaked preferences on aggregated choice data Smeulders B. KBI_1427 Recognizing Single-Peaked Preferences on Aggregated Choice Data Smeulders, B. Abstract Single-Peaked preferences play

More information

ECS 289 F / MAE 298, Lecture 15 May 20, Diffusion, Cascades and Influence

ECS 289 F / MAE 298, Lecture 15 May 20, Diffusion, Cascades and Influence ECS 289 F / MAE 298, Lecture 15 May 20, 2014 Diffusion, Cascades and Influence Diffusion and cascades in networks (Nodes in one of two states) Viruses (human and computer) contact processes epidemic thresholds

More information

arxiv: v1 [cs.si] 15 Dec 2011

arxiv: v1 [cs.si] 15 Dec 2011 Community structure and scale-free collections of Erdős Rényi graphs C. Seshadhri, Tamara G. Kolda, Ali Pinar Sandia National Laboratories, Livermore, CA, USA (Dated: December 16, 2011) Community structure

More information

Similarity Measures for Link Prediction Using Power Law Degree Distribution

Similarity Measures for Link Prediction Using Power Law Degree Distribution Similarity Measures for Link Prediction Using Power Law Degree Distribution Srinivas Virinchi and Pabitra Mitra Dept of Computer Science and Engineering, Indian Institute of Technology Kharagpur-72302,

More information

Self Similar (Scale Free, Power Law) Networks (I)

Self Similar (Scale Free, Power Law) Networks (I) Self Similar (Scale Free, Power Law) Networks (I) E6083: lecture 4 Prof. Predrag R. Jelenković Dept. of Electrical Engineering Columbia University, NY 10027, USA {predrag}@ee.columbia.edu February 7, 2007

More information

Overview of Network Theory

Overview of Network Theory Overview of Network Theory MAE 298, Spring 2009, Lecture 1 Prof. Raissa D Souza University of California, Davis Example social networks (Immunology; viral marketing; aliances/policy) M. E. J. Newman The

More information

Subgraph Detection Using Eigenvector L1 Norms

Subgraph Detection Using Eigenvector L1 Norms Subgraph Detection Using Eigenvector L1 Norms Benjamin A. Miller Lincoln Laboratory Massachusetts Institute of Technology Lexington, MA 02420 bamiller@ll.mit.edu Nadya T. Bliss Lincoln Laboratory Massachusetts

More information

Spatial and Temporal Behaviors in a Modified Evolution Model Based on Small World Network

Spatial and Temporal Behaviors in a Modified Evolution Model Based on Small World Network Commun. Theor. Phys. (Beijing, China) 42 (2004) pp. 242 246 c International Academic Publishers Vol. 42, No. 2, August 15, 2004 Spatial and Temporal Behaviors in a Modified Evolution Model Based on Small

More information

Groups of vertices and Core-periphery structure. By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA

Groups of vertices and Core-periphery structure. By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA Groups of vertices and Core-periphery structure By: Ralucca Gera, Applied math department, Naval Postgraduate School Monterey, CA, USA Mostly observed real networks have: Why? Heavy tail (powerlaw most

More information

Problem Set 4. General Instructions

Problem Set 4. General Instructions CS224W: Analysis of Networks Fall 2017 Problem Set 4 General Instructions Due 11:59pm PDT November 30, 2017 These questions require thought, but do not require long answers. Please be as concise as possible.

More information

MAE 298, Lecture 8 Feb 4, Web search and decentralized search on small-worlds

MAE 298, Lecture 8 Feb 4, Web search and decentralized search on small-worlds MAE 298, Lecture 8 Feb 4, 2008 Web search and decentralized search on small-worlds Search for information Assume some resource of interest is stored at the vertices of a network: Web pages Files in a file-sharing

More information

Finding patterns in large, real networks

Finding patterns in large, real networks Thanks to Finding patterns in large, real networks Christos Faloutsos CMU www.cs.cmu.edu/~christos/talks/ucla-05 Deepayan Chakrabarti (CMU) Michalis Faloutsos (UCR) George Siganos (UCR) UCLA-IPAM 05 (c)

More information

Complex-Network Modelling and Inference

Complex-Network Modelling and Inference Complex-Network Modelling and Inference Lecture 12: Random Graphs: preferential-attachment models Matthew Roughan http://www.maths.adelaide.edu.au/matthew.roughan/notes/

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 21 Cosma Shalizi 3 April 2008 Models of Networks, with Origin Myths Erdős-Rényi Encore Erdős-Rényi with Node Types Watts-Strogatz Small World Graphs Exponential-Family

More information

Faloutsos, Tong ICDE, 2009

Faloutsos, Tong ICDE, 2009 Large Graph Mining: Patterns, Tools and Case Studies Christos Faloutsos Hanghang Tong CMU Copyright: Faloutsos, Tong (29) 2-1 Outline Part 1: Patterns Part 2: Matrix and Tensor Tools Part 3: Proximity

More information

Móstoles, Spain. Keywords: complex networks, dual graph, line graph, line digraph.

Móstoles, Spain. Keywords: complex networks, dual graph, line graph, line digraph. Int. J. Complex Systems in Science vol. 1(2) (2011), pp. 100 106 Line graphs for directed and undirected networks: An structural and analytical comparison Regino Criado 1, Julio Flores 1, Alejandro García

More information

SMALL-WORLD NAVIGABILITY. Alexandru Seminar in Distributed Computing

SMALL-WORLD NAVIGABILITY. Alexandru Seminar in Distributed Computing SMALL-WORLD NAVIGABILITY Talk about a small world 2 Zurich, CH Hunedoara, RO From cliché to social networks 3 Milgram s Experiment and The Small World Hypothesis Omaha, NE Boston, MA Wichita, KS Human

More information

DATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS

DATA MINING LECTURE 13. Link Analysis Ranking PageRank -- Random walks HITS DATA MINING LECTURE 3 Link Analysis Ranking PageRank -- Random walks HITS How to organize the web First try: Manually curated Web Directories How to organize the web Second try: Web Search Information

More information

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices

Communities Via Laplacian Matrices. Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices Communities Via Laplacian Matrices Degree, Adjacency, and Laplacian Matrices Eigenvectors of Laplacian Matrices The Laplacian Approach As with betweenness approach, we want to divide a social graph into

More information

A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal

A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal Nurcan Durak, Tamara G. Kolda, Ali Pinar, and C. Seshadhri Sandia National Laboratories Livermore, CA

More information

A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal

A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal A Scalable Null Model for Directed Graphs Matching All Degree Distributions: In, Out, and Reciprocal Nurcan Durak, Tamara G. Kolda, Ali Pinar, and C. Seshadhri Sandia National Laboratories Livermore, CA

More information

Chaos, Complexity, and Inference (36-462)

Chaos, Complexity, and Inference (36-462) Chaos, Complexity, and Inference (36-462) Lecture 21: More Networks: Models and Origin Myths Cosma Shalizi 31 March 2009 New Assignment: Implement Butterfly Mode in R Real Agenda: Models of Networks, with

More information

CMU SCS. Large Graph Mining. Christos Faloutsos CMU

CMU SCS. Large Graph Mining. Christos Faloutsos CMU Large Graph Mining Christos Faloutsos CMU Thank you! Hillol Kargupta NGDM 2007 C. Faloutsos 2 Outline Problem definition / Motivation Static & dynamic laws; generators Tools: CenterPiece graphs; fraud

More information

Finding Hidden Group Structure in a Stream of Communications

Finding Hidden Group Structure in a Stream of Communications Finding Hidden Group Structure in a Stream of Communications J. Baumes 1, M. Goldberg 1, M. Hayvanovych 1, M. Magdon-Ismail 1, W. Wallace 2, and M. Zaki 1 1 CS Department, RPI, Rm 207 Lally, 110 8th Street,

More information

ECS 253 / MAE 253, Lecture 15 May 17, I. Probability generating function recap

ECS 253 / MAE 253, Lecture 15 May 17, I. Probability generating function recap ECS 253 / MAE 253, Lecture 15 May 17, 2016 I. Probability generating function recap Part I. Ensemble approaches A. Master equations (Random graph evolution, cluster aggregation) B. Network configuration

More information

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68

CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68 CSI 445/660 Part 6 (Centrality Measures for Networks) 6 1 / 68 References 1 L. Freeman, Centrality in Social Networks: Conceptual Clarification, Social Networks, Vol. 1, 1978/1979, pp. 215 239. 2 S. Wasserman

More information

An evolving network model with community structure

An evolving network model with community structure INSTITUTE OF PHYSICS PUBLISHING JOURNAL OF PHYSICS A: MATHEMATICAL AND GENERAL J. Phys. A: Math. Gen. 38 (2005) 9741 9749 doi:10.1088/0305-4470/38/45/002 An evolving network model with community structure

More information

Decision Making and Social Networks

Decision Making and Social Networks Decision Making and Social Networks Lecture 4: Models of Network Growth Umberto Grandi Summer 2013 Overview In the previous lecture: We got acquainted with graphs and networks We saw lots of definitions:

More information

Adventures in random graphs: Models, structures and algorithms

Adventures in random graphs: Models, structures and algorithms BCAM January 2011 1 Adventures in random graphs: Models, structures and algorithms Armand M. Makowski ECE & ISR/HyNet University of Maryland at College Park armand@isr.umd.edu BCAM January 2011 2 Complex

More information

CS224W: Methods of Parallelized Kronecker Graph Generation

CS224W: Methods of Parallelized Kronecker Graph Generation CS224W: Methods of Parallelized Kronecker Graph Generation Sean Choi, Group 35 December 10th, 2012 1 Introduction The question of generating realistic graphs has always been a topic of huge interests.

More information

Directed Scale-Free Graphs

Directed Scale-Free Graphs Directed Scale-Free Graphs Béla Bollobás Christian Borgs Jennifer Chayes Oliver Riordan Abstract We introduce a model for directed scale-free graphs that grow with preferential attachment depending in

More information

Why does attention to web articles fall with time?

Why does attention to web articles fall with time? Why does attention to web articles fall with time? M.V. Simkin and V.P. Roychowdhury Department of Electrical Engineering, University of California, Los Angeles, CA 90095-594 We analyze access statistics

More information

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

LINK ANALYSIS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS LINK ANALYSIS Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Retrieval models Retrieval evaluation Link analysis Models

More information

Network Science (overview, part 1)

Network Science (overview, part 1) Network Science (overview, part 1) Ralucca Gera, Applied Mathematics Dept. Naval Postgraduate School Monterey, California rgera@nps.edu Excellence Through Knowledge Overview Current research Section 1:

More information

Power Laws & Rich Get Richer

Power Laws & Rich Get Richer Power Laws & Rich Get Richer CMSC 498J: Social Media Computing Department of Computer Science University of Maryland Spring 2015 Hadi Amiri hadi@umd.edu Lecture Topics Popularity as a Network Phenomenon

More information

Data Mining Recitation Notes Week 3

Data Mining Recitation Notes Week 3 Data Mining Recitation Notes Week 3 Jack Rae January 28, 2013 1 Information Retrieval Given a set of documents, pull the (k) most similar document(s) to a given query. 1.1 Setup Say we have D documents

More information

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides

Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Thanks to Jure Leskovec, Stanford and Panayiotis Tsaparas, Univ. of Ioannina for slides Web Search: How to Organize the Web? Ranking Nodes on Graphs Hubs and Authorities PageRank How to Solve PageRank

More information

Social-Affiliation Networks: Patterns and the SOAR Model

Social-Affiliation Networks: Patterns and the SOAR Model Social-Affiliation Networks: Patterns and the SOAR Model Dhivya Eswaran 1, Reihaneh Rabbany 2, Artur W. Dubrawski 1, and Christos Faloutsos 1 1 School of Computer Science, Carnegie Mellon University {deswaran,

More information

Competition and multiscaling in evolving networks

Competition and multiscaling in evolving networks EUROPHYSICS LETTERS 5 May 2 Europhys. Lett., 54 (4), pp. 436 442 (2) Competition and multiscaling in evolving networs G. Bianconi and A.-L. Barabási,2 Department of Physics, University of Notre Dame -

More information

NOTES ON COOPERATIVE GAME THEORY AND THE CORE. 1. Introduction

NOTES ON COOPERATIVE GAME THEORY AND THE CORE. 1. Introduction NOTES ON COOPERATIVE GAME THEORY AND THE CORE SARA FROEHLICH 1. Introduction Cooperative game theory is fundamentally different from the types of games we have studied so far, which we will now refer to

More information

1 Mechanistic and generative models of network structure

1 Mechanistic and generative models of network structure 1 Mechanistic and generative models of network structure There are many models of network structure, and these largely can be divided into two classes: mechanistic models and generative or probabilistic

More information

Scale-free network of earthquakes

Scale-free network of earthquakes Scale-free network of earthquakes Sumiyoshi Abe 1 and Norikazu Suzuki 2 1 Institute of Physics, University of Tsukuba, Ibaraki 305-8571, Japan 2 College of Science and Technology, Nihon University, Chiba

More information

Lecture 9: Laplacian Eigenmaps

Lecture 9: Laplacian Eigenmaps Lecture 9: Radu Balan Department of Mathematics, AMSC, CSCAMM and NWC University of Maryland, College Park, MD April 18, 2017 Optimization Criteria Assume G = (V, W ) is a undirected weighted graph with

More information

Lecture 1: Graphs, Adjacency Matrices, Graph Laplacian

Lecture 1: Graphs, Adjacency Matrices, Graph Laplacian Lecture 1: Graphs, Adjacency Matrices, Graph Laplacian Radu Balan January 31, 2017 G = (V, E) An undirected graph G is given by two pieces of information: a set of vertices V and a set of edges E, G =

More information

The Hypercube Graph and the Inhibitory Hypercube Network

The Hypercube Graph and the Inhibitory Hypercube Network The Hypercube Graph and the Inhibitory Hypercube Network Michael Cook mcook@forstmannleff.com William J. Wolfe Professor of Computer Science California State University Channel Islands william.wolfe@csuci.edu

More information

Modularity in several random graph models

Modularity in several random graph models Modularity in several random graph models Liudmila Ostroumova Prokhorenkova 1,3 Advanced Combinatorics and Network Applications Lab Moscow Institute of Physics and Technology Moscow, Russia Pawe l Pra

More information

Social and economic network formation: a dynamic model

Social and economic network formation: a dynamic model Social and economic network formation: a dynamic model Omid Atabati 1 and Babak Farzad 2 1 Department of Economics, University of Calgary, Calgary, AB T2N 1N4, Canada oatabati@ucalgary.ca 2 Department

More information