Random Networks. Complex Networks CSYS/MATH 303, Spring 2011, Prof. Peter Dodds

Random Networks
Complex Networks, CSYS/MATH 303, Spring 2011
Prof. Peter Dodds
Department of Mathematics & Statistics, Center for Complex Systems, Vermont Advanced Computing Center, University of Vermont
Licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.

Outline

Random networks
Pure, abstract random networks: consider the set of all networks with N labelled nodes and m edges. A standard random network is one network chosen uniformly at random from this set. To be clear: each network is equally probable. Sometimes equiprobability is a good assumption, but it is always an assumption. These are known as Erdős-Rényi random networks or ER graphs.

Random network generator for N = 3: a polyhedral die with one face for each of the 2^3 = 8 possible networks. (Get your own exciting generator here.) As N → ∞, our polyhedral die rapidly becomes a ball...

Random networks: basic features
Number of possible edges: 0 ≤ m ≤ \binom{N}{2} = \frac{N(N-1)}{2}.
Limit of m = 0: empty graph.
Limit of m = \binom{N}{2}: complete or fully connected graph.
Number of possible networks with N labelled nodes: 2^{\binom{N}{2}} \sim e^{\frac{\ln 2}{2} N^2}.
Given m edges, there are \binom{\binom{N}{2}}{m} different possible networks.
Crazy factorial explosion for 1 ≪ m ≪ \binom{N}{2}.
Real world: links are usually costly, so real networks are almost always sparse.

Random networks: standard random networks
Given N and m, there are two probabilistic methods (we'll see a third later on):
1. Connect each of the \binom{N}{2} pairs with appropriate probability p. Useful for theoretical work.
2. Take N nodes and add exactly m links by selecting edges without replacement. Algorithm: randomly choose a pair of nodes i and j, i ≠ j, and connect them if unconnected; repeat until all m edges are allocated. Best for adding relatively small numbers of links (most cases).
Methods 1 and 2 are effectively equivalent for large N.
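A minimal sketch of both constructions in plain Python (our own illustration, not code from the lecture; function names are ours):

import itertools
import random

def er_method_1(N, p):
    """G(N, p): connect each of the C(N, 2) pairs independently with probability p."""
    return {(i, j) for i, j in itertools.combinations(range(N), 2)
            if random.random() < p}

def er_method_2(N, m):
    """G(N, m): add exactly m distinct edges, selected without replacement.
    Assumes m <= N (N - 1) / 2."""
    edges = set()
    while len(edges) < m:
        i, j = sorted(random.sample(range(N), 2))  # i != j guaranteed
        edges.add((i, j))                          # re-draw if already present
    return edges

# The two ensembles agree for large N when m = p N (N - 1) / 2.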

Random networks: a few more things
For method 1, the number of links is probabilistic:
⟨m⟩ = p \binom{N}{2} = p \frac{1}{2} N(N-1).
So the expected or average degree is
⟨k⟩ = \frac{2⟨m⟩}{N} = \frac{2}{N} p \frac{1}{2} N(N-1) = p(N-1),
which is what it should be: each node can connect to at most N-1 others, each with probability p. If we keep ⟨k⟩ constant, then p ~ 1/N → 0 as N → ∞.

Random networks: examples
Next slides: example realizations of random networks with N = 500. Vary m, the number of edges, from 100 to 1000; the average degree ⟨k⟩ then runs from 0.4 to 4. Look at the full network plus the largest component.

[Figures: example realizations for N = 500, showing each entire network and its largest component, for m = 100, 200, 230, 240, 250, 260, 280, 300, 500, 1000 (⟨k⟩ = 0.4, 0.8, 0.92, 0.96, 1, 1.04, 1.12, 1.2, 2, 4).]

[Figures: ten independent realizations with N = 500, m = 250, ⟨k⟩ = 1, again showing each entire network and its largest component.]

Clustering in random networks
For method 1, what is the clustering coefficient for a finite network? Consider the triangle/triple clustering coefficient: [1]
C_2 = \frac{3 \times \#\text{triangles}}{\#\text{triples}}.
Recall: C_2 = probability that two friends of a node are also friends. Or: C_2 = probability that a triple is part of a triangle. For standard random networks, we have simply that C_2 = p.

Other ways to compute clustering
Expected number of triples in the entire network: \frac{1}{2} N(N-1)(N-2) p^2 (double counting dealt with by the \frac{1}{2}).
Expected number of triangles in the entire network: \frac{1}{6} N(N-1)(N-2) p^3 (over-counting dealt with by the \frac{1}{6}).
C_2 = \frac{3 \times \#\text{triangles}}{\#\text{triples}} = \frac{3 \cdot \frac{1}{6} N(N-1)(N-2) p^3}{\frac{1}{2} N(N-1)(N-2) p^2} = p.

Other ways to compute clustering
Or: take any three nodes, call them a, b, and c. A triple a-b-c centered at b occurs with probability p^2 (1-p) + p^2 \cdot p = p^2. A triangle occurs with probability p^3. Therefore,
C_2 = \frac{p^3}{p^2} = p.
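A quick numerical check of C_2 = p, counting triangles and triples in one sampled G(N, p) (our own sketch; the values N = 200, p = 0.05 are illustrative):

import itertools
import random

def clustering_c2(N=200, p=0.05, seed=1):
    rng = random.Random(seed)
    adj = [set() for _ in range(N)]
    for i, j in itertools.combinations(range(N), 2):
        if rng.random() < p:
            adj[i].add(j)
            adj[j].add(i)
    # A triple is a path of length 2, counted once via its center node.
    triples = sum(len(a) * (len(a) - 1) // 2 for a in adj)
    triangles = sum(1 for i, j, k in itertools.combinations(range(N), 3)
                    if j in adj[i] and k in adj[i] and k in adj[j])
    return 3 * triangles / triples

print(clustering_c2())  # close to p = 0.05 for a single sample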

Clustering in random networks
So for large, sparse random networks (N → ∞ with ⟨k⟩ fixed, so p → 0), clustering drops to zero. A key structural feature of random networks is that they locally look like pure branching networks: no small loops.

Random networks: degree distribution
Recall P_k = probability that a randomly selected node has degree k. Consider method 1 for constructing random networks: each possible link is realized with probability p. Now consider one node: there are \binom{N-1}{k} ways the node can be connected to k of the other N-1 nodes. Each connection occurs with probability p, each non-connection with probability 1-p. Therefore we have a binomial distribution:
P(k; p, N) = \binom{N-1}{k} p^k (1-p)^{N-1-k}.

Random networks: limiting form of P(k; p, N)
Our degree distribution: P(k; p, N) = \binom{N-1}{k} p^k (1-p)^{N-1-k}. What happens as N → ∞? We must end up with the normal distribution, right? If p is fixed, then we would indeed end up with a Gaussian with average degree ⟨k⟩ ≈ pN. But we want to keep ⟨k⟩ fixed... So examine the limit of P(k; p, N) when p → 0 and N → ∞ with ⟨k⟩ = p(N-1) = constant.

Limiting form of P(k; p, N):
Substitute p = \frac{⟨k⟩}{N-1} into P(k; p, N) and hold k fixed:
P(k; p, N) = \binom{N-1}{k} \left( \frac{⟨k⟩}{N-1} \right)^k \left( 1 - \frac{⟨k⟩}{N-1} \right)^{N-1-k}
= \frac{(N-1)!}{k!(N-1-k)!} \frac{⟨k⟩^k}{(N-1)^k} \left( 1 - \frac{⟨k⟩}{N-1} \right)^{N-1-k}
= \frac{(N-1)(N-2) \cdots (N-k)}{k!} \frac{⟨k⟩^k}{(N-1)^k} \left( 1 - \frac{⟨k⟩}{N-1} \right)^{N-1-k}
\simeq \frac{⟨k⟩^k}{k!} \left( 1 - \frac{⟨k⟩}{N-1} \right)^{N-1-k},
since (N-1)(N-2) \cdots (N-k)/(N-1)^k → 1 as N → ∞ with k fixed.

Limiting form of P(k; p, N):
We are now here:
P(k; p, N) \simeq \frac{⟨k⟩^k}{k!} \left( 1 - \frac{⟨k⟩}{N-1} \right)^{N-1-k}.
Now use the excellent result
\lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n = e^x
(use l'Hôpital's rule to prove it). Identifying n = N-1 and x = -⟨k⟩:
P(k; ⟨k⟩) \simeq \frac{⟨k⟩^k}{k!} e^{-⟨k⟩} \left( 1 - \frac{⟨k⟩}{N-1} \right)^{-k} \to \frac{⟨k⟩^k}{k!} e^{-⟨k⟩}.
This is a Poisson distribution with mean ⟨k⟩.
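A numerical sanity check of this limit (our own sketch, not lecture code): hold ⟨k⟩ = p(N-1) fixed and watch the binomial pmf converge to the Poisson pmf as N grows.

from math import comb, exp, factorial

def binom_pmf(k, N, p):
    return comb(N - 1, k) * p**k * (1 - p)**(N - 1 - k)

def poisson_pmf(k, mean):
    return mean**k / factorial(k) * exp(-mean)

kbar = 4.0
for N in (10, 100, 10_000):
    p = kbar / (N - 1)
    print(N, binom_pmf(4, N, p), poisson_pmf(4, kbar))
# The binomial values approach the Poisson value 0.19536... as N grows.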

Poisson basics:
P(k; λ) = \frac{λ^k}{k!} e^{-λ}, with λ > 0 and k = 0, 1, 2, 3, ...
Classic use: probability that an event occurs k times in a given time period, given an average rate of occurrence (e.g., phone calls per minute, horse-kick deaths). Known as the law of small numbers.

Poisson basics: normalization
We must have \sum_{k=0}^{∞} P(k; ⟨k⟩) = 1. Checking:
\sum_{k=0}^{∞} P(k; ⟨k⟩) = \sum_{k=0}^{∞} \frac{⟨k⟩^k}{k!} e^{-⟨k⟩} = e^{-⟨k⟩} \sum_{k=0}^{∞} \frac{⟨k⟩^k}{k!} = e^{-⟨k⟩} e^{⟨k⟩} = 1.

Poisson basics: mean degree
We must have ⟨k⟩ = \sum_{k=0}^{∞} k P(k; ⟨k⟩). Checking:
\sum_{k=0}^{∞} k P(k; ⟨k⟩) = \sum_{k=1}^{∞} k \frac{⟨k⟩^k}{k!} e^{-⟨k⟩} = ⟨k⟩ e^{-⟨k⟩} \sum_{k=1}^{∞} \frac{⟨k⟩^{k-1}}{(k-1)!} = ⟨k⟩ e^{-⟨k⟩} \sum_{i=0}^{∞} \frac{⟨k⟩^i}{i!} = ⟨k⟩ e^{-⟨k⟩} e^{⟨k⟩} = ⟨k⟩.
Note: we'll get to a better and crazier way of doing this...

Poisson basics:
The variance of degree distributions for random networks turns out to be very important. Using a calculation similar to the one for ⟨k⟩, the second moment is ⟨k²⟩ = ⟨k⟩² + ⟨k⟩. The variance is then
σ² = ⟨k²⟩ - ⟨k⟩² = ⟨k⟩² + ⟨k⟩ - ⟨k⟩² = ⟨k⟩.
So the standard deviation σ is equal to ⟨k⟩^{1/2}. Note: this is a special property of the Poisson distribution and can trip us up...
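A quick numerical check of these Poisson facts (our own sketch; the infinite sums are truncated at K = 100):

from math import exp, factorial

kbar, K = 3.0, 100
P = [kbar**k / factorial(k) * exp(-kbar) for k in range(K)]

total = sum(P)
mean = sum(k * p for k, p in enumerate(P))
second = sum(k * k * p for k, p in enumerate(P))
print(total, mean, second - mean**2)  # ~1.0, ~3.0 (= <k>), ~3.0 (variance = <k>)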

General random networks
So... standard random networks have a Poisson degree distribution. Generalize to an arbitrary degree distribution P_k: this is also known as the configuration model. [1] We can also generalize the construction method for ER random networks: assign each node a weight w from some distribution P_w and form links with probability
P(link between i and j) ∝ w_i w_j.
But we'll be more interested in:
1. Randomly wiring up (and rewiring) already existing nodes with fixed degrees.
2. Examining mechanisms that lead to networks with certain degree distributions.
A stub-matching sketch of item 1 follows.
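A minimal configuration-model sketch (ours, not the lecture's): give every node a number of edge "stubs" equal to its degree, then pair the stubs uniformly at random.

import random

def configuration_model(degrees, rng=random):
    """Pair up edge stubs uniformly at random; sum(degrees) must be even.
    Self-loops and multi-edges can occur but are rare for large sparse networks."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

print(configuration_model([3, 2, 2, 2, 1]))  # degree sequence sums to 10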

Random networks: examples
Coming up: example realizations of random networks with power-law degree distributions: N = 1000, P_k ∝ k^{-γ} for k ≥ 1, with P_0 = 0 (no isolated nodes). Vary the exponent γ between 2.10 and 2.91. Again, look at the full network plus the largest component. Apart from the degree distribution, the wiring is random.

[Figures: example realizations for N = 1000 with power-law degree distributions, showing each entire network and its largest component, for γ = 2.1, 2.19, 2.28, 2.37, 2.46, 2.55, 2.64, 2.73, 2.82, 2.91 (measured ⟨k⟩ = 3.448, 2.986, 2.306, 2.504, 1.856, 1.712, 1.6, 1.862, 1.386, 1.49).]

The edge-degree distribution
The degree distribution P_k is fundamental to our description of many complex networks. Again: P_k is the probability that a randomly chosen node has degree k. A second very important distribution arises from choosing randomly over edges rather than over nodes. Define Q_k to be the probability that the node at a random end of a randomly chosen edge has degree k. We are now choosing nodes in proportion to their degree (i.e., size): Q_k ∝ k P_k. The normalized form is
Q_k = \frac{k P_k}{\sum_{k'=0}^{∞} k' P_{k'}} = \frac{k P_k}{⟨k⟩}.

The edge-degree distribution
For random networks, Q_k is also the probability that a friend (neighbor) of a random node has k friends. A useful variant on Q_k: define R_k = probability that a friend of a random node has k other friends:
R_k = \frac{(k+1) P_{k+1}}{\sum_{k'=0}^{∞} (k'+1) P_{k'+1}} = \frac{(k+1) P_{k+1}}{⟨k⟩}.
This is equivalent to the friend having degree k+1. Natural question: what's the expected number of other friends that one friend has?

The edge-degree distribution
Given that R_k is the probability that a friend has k other friends, the average number of a friend's other friends is
⟨k⟩_R = \sum_{k=0}^{∞} k R_k = \frac{1}{⟨k⟩} \sum_{k=0}^{∞} k (k+1) P_{k+1}
= \frac{1}{⟨k⟩} \sum_{k=1}^{∞} \left( (k+1)^2 - (k+1) \right) P_{k+1}   (where we have sneakily matched up indices)
= \frac{1}{⟨k⟩} \sum_{j=0}^{∞} (j^2 - j) P_j   (using j = k+1)
= \frac{1}{⟨k⟩} \left( ⟨k^2⟩ - ⟨k⟩ \right).

The edge-degree distribution
Note: our result
⟨k⟩_R = \frac{1}{⟨k⟩} \left( ⟨k^2⟩ - ⟨k⟩ \right)
is true for all random networks, independent of the degree distribution. For standard random networks, recall ⟨k²⟩ = ⟨k⟩² + ⟨k⟩. Therefore:
⟨k⟩_R = \frac{⟨k⟩^2 + ⟨k⟩ - ⟨k⟩}{⟨k⟩} = ⟨k⟩.
Again, the neatness of this result is a special property of the Poisson distribution. So friends on average have ⟨k⟩ other friends, and ⟨k⟩ + 1 total friends...
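A small sketch (ours) that builds Q_k and R_k from a degree distribution and confirms ⟨k⟩_R = (⟨k²⟩ - ⟨k⟩)/⟨k⟩; the example distribution is ours too:

def edge_degree_stats(P):
    """P: dict mapping degree k to probability P_k. Returns (Q, R, <k>_R)."""
    kbar = sum(k * p for k, p in P.items())
    k2 = sum(k * k * p for k, p in P.items())
    Q = {k: k * p / kbar for k, p in P.items()}     # Q_k = k P_k / <k>
    R = {k - 1: q for k, q in Q.items() if k >= 1}  # friend's *other* friends
    return Q, R, (k2 - kbar) / kbar

# Example: P_1 = P_3 = 1/2 gives <k> = 2 and <k^2> = 5, so <k>_R = 1.5.
Q, R, kR = edge_degree_stats({1: 0.5, 3: 0.5})
print(kR, sum(k * r for k, r in R.items()))  # both print 1.5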

Two reasons why this matters
Reason #1: the average number of friends of friends per node is
⟨k_2⟩ = ⟨k⟩ × ⟨k⟩_R = ⟨k⟩ \cdot \frac{1}{⟨k⟩} \left( ⟨k^2⟩ - ⟨k⟩ \right) = ⟨k^2⟩ - ⟨k⟩.
Key: the average depends on the 1st and 2nd moments of P_k, not just the 1st moment. Three peculiarities:
1. We might guess ⟨k_2⟩ = ⟨k⟩(⟨k⟩ - 1), but it's actually ⟨k(k-1)⟩.
2. If P_k has a large second moment, then ⟨k_2⟩ will be big (e.g., in the case of a power-law distribution).
3. Your friends really are different from you...

Two reasons why this matters
More on peculiarity #3: a node's average number of friends is ⟨k⟩; a friend's average number of friends is ⟨k²⟩/⟨k⟩. Comparison:
\frac{⟨k^2⟩}{⟨k⟩} = ⟨k⟩ \frac{⟨k^2⟩}{⟨k⟩^2} = ⟨k⟩ \frac{σ^2 + ⟨k⟩^2}{⟨k⟩^2} = ⟨k⟩ \left( 1 + \frac{σ^2}{⟨k⟩^2} \right) ≥ ⟨k⟩.
So only if everyone has the same degree (variance σ² = 0) can a node be the same as its friends. Intuition: for random networks, the more connected a node, the more likely it is to be chosen as a friend.

Two reasons why this matters
(Big) Reason #2: ⟨k⟩_R is the key to understanding how well random networks are connected together. For example, we'd like to know the size of the largest component within a network. As N → ∞, does our network have a giant component?
Defn: Component = connected subnetwork of nodes such that there is a path between each pair of nodes in the subnetwork, and no node outside the subnetwork is connected to it.
Defn: Giant component = component that comprises a non-zero fraction of the network as N → ∞.
Note: components are also called clusters.

[Figure: giant component size S_1 versus average degree ⟨k⟩, with S_1 on 0 to 1 and ⟨k⟩ on 0 to 4.]

Structure of random networks: giant component
A giant component exists if, when we follow a random edge, we are likely to hit a node with at least one other outgoing edge. Equivalently, we expect exponential growth in node number as we move out from a random node. All of this is the same as requiring ⟨k⟩_R > 1. Giant component condition (or percolation condition):
⟨k⟩_R = \frac{⟨k^2⟩ - ⟨k⟩}{⟨k⟩} > 1.
Again, we see that the second moment is an essential part of the story. Equivalent statement: ⟨k²⟩ > 2⟨k⟩.
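This condition (often called the Molloy-Reed criterion) is easy to test numerically; a sketch of ours, with illustrative distributions:

def has_giant_component(P):
    """Giant-component condition <k^2> > 2<k> for a degree distribution P_k."""
    kbar = sum(k * p for k, p in P.items())
    k2 = sum(k * k * p for k, p in P.items())
    return k2 > 2 * kbar

print(has_giant_component({1: 0.5, 3: 0.5}))  # True: <k^2> = 5 > 2<k> = 4
print(has_giant_component({0: 0.5, 2: 0.5}))  # False: <k^2> = 2 = 2<k>, critical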

Giant component: standard random networks
Recall ⟨k²⟩ = ⟨k⟩² + ⟨k⟩. Condition for a giant component:
⟨k⟩_R = \frac{⟨k^2⟩ - ⟨k⟩}{⟨k⟩} = \frac{⟨k⟩^2 + ⟨k⟩ - ⟨k⟩}{⟨k⟩} = ⟨k⟩ > 1.
Therefore, when ⟨k⟩ > 1, standard random networks have a giant component; when ⟨k⟩ < 1, all components are finite. This is a fine example of a continuous phase transition. We say ⟨k⟩ = 1 marks the critical point of the system.

Giant component: random networks with skewed P_k
For example, if P_k = c k^{-γ} with 2 < γ < 3 for k ≥ 1, then
⟨k^2⟩ = c \sum_{k=1}^{∞} k^2 k^{-γ} \sim \int_{x=1}^{∞} x^{2-γ} dx = \frac{x^{3-γ}}{3-γ} \Big|_{x=1}^{∞} = ∞ \quad (≫ ⟨k⟩).
So a giant component always exists for these kinds of networks. The boundary case is P_k ~ k^{-3}: if γ > 3 then ⟨k²⟩ is finite and we have to look harder at ⟨k⟩_R. How about P_k = δ_{k,k_0}?
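A numeric illustration (ours) of why ⟨k²⟩ diverges for 2 < γ < 3: the truncated second moment keeps growing as the degree cutoff k_max increases, roughly like k_max^{3-γ}.

def second_moment(gamma, kmax):
    """<k^2> for P_k proportional to k^(-gamma), truncated at k_max."""
    Z = sum(k**-gamma for k in range(1, kmax + 1))
    return sum(k**(2.0 - gamma) for k in range(1, kmax + 1)) / Z

for kmax in (10**2, 10**4, 10**6):
    print(kmax, second_moment(2.5, kmax))  # grows ~ kmax**0.5 for gamma = 2.5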

Giant component: and how big is the largest component?
Define S_1 as the size of the largest component (as a fraction of the network). Consider an infinite ER random network with average degree ⟨k⟩. Let's find S_1 with a back-of-the-envelope argument. Define δ as the probability that a randomly chosen node does not belong to the largest component; the simple connection is δ = 1 - S_1. Dirty trick: if a randomly chosen node is not part of the largest component, then none of its neighbors are. So
δ = \sum_{k=0}^{∞} P_k δ^k.
Now substitute in the Poisson distribution...

Giant component
Carrying on:
δ = \sum_{k=0}^{∞} P_k δ^k = \sum_{k=0}^{∞} \frac{⟨k⟩^k}{k!} e^{-⟨k⟩} δ^k = e^{-⟨k⟩} \sum_{k=0}^{∞} \frac{(⟨k⟩ δ)^k}{k!} = e^{-⟨k⟩} e^{⟨k⟩ δ} = e^{-⟨k⟩(1-δ)}.
Now substitute in δ = 1 - S_1 and rearrange to obtain:
S_1 = 1 - e^{-⟨k⟩ S_1}.
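This self-consistency equation has no closed form for S_1, but fixed-point iteration converges to the stable solution; a sketch of ours:

from math import exp

def giant_component_size(kbar, iters=10_000):
    """Iterate S -> 1 - exp(-kbar * S) from S = 1; converges to the stable root."""
    S = 1.0
    for _ in range(iters):
        S = 1.0 - exp(-kbar * S)
    return S

for kbar in (0.5, 1.0, 1.5, 2.0, 4.0):
    print(kbar, round(giant_component_size(kbar), 4))
# S_1 ~ 0 for <k> <= 1 (convergence is slow right at the critical point),
# then grows continuously: ~0.583 at <k> = 1.5, ~0.797 at 2, ~0.980 at 4.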

Giant component
We can figure out some limits and details for S_1 = 1 - e^{-⟨k⟩ S_1}. First, we can write ⟨k⟩ in terms of S_1:
⟨k⟩ = \frac{1}{S_1} \ln \frac{1}{1 - S_1}.
As ⟨k⟩ → 0, S_1 → 0; as ⟨k⟩ → ∞, S_1 → 1. Notice that at ⟨k⟩ = 1, the critical point, S_1 = 0: the equation is only solvable for S_1 > 0 when ⟨k⟩ > 1. This is really a transcritical bifurcation. [2]

[Figure: the solution of S_1 = 1 - e^{-⟨k⟩ S_1} plotted against ⟨k⟩; S_1 = 0 up to the critical point ⟨k⟩ = 1, then rises continuously toward 1.]

Giant component
Turns out we were lucky... Our dirty trick only works for ER random networks. The problem: we assumed that neighbors have the same probability δ of belonging to the largest component. But we know our friends are different from us... The trick works for ER random networks because ⟨k⟩ = ⟨k⟩_R. In general, we need a separate probability for the chance that an edge leads to the giant (infinite) component. We can sort many things out with sensible probabilistic arguments... More detailed investigations will profit from a spot of Generatingfunctionology. [3]


References
[1] M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45(2):167-256, 2003.
[2] S. H. Strogatz. Nonlinear Dynamics and Chaos. Addison-Wesley, Reading, Massachusetts, 1994.
[3] H. S. Wilf. Generatingfunctionology. A K Peters, Natick, MA, 3rd edition, 2006.