Shortest Paths & Link Weight Structure in Networks

Shortest Paths & Link Weight Structure in etworks Piet Van Mieghem CAIDA WIT (May 2006) P. Van Mieghem 1

Outline Introduction The Art of Modeling Conclusions P. Van Mieghem 2

Telecommunication: e2e A ETWORK G(,L) B Main purpose: Transfer information from A B early always: Transport of packets along shortest (or optimal) paths Optimality criterion: in terms of Quality of Service (QoS) parameters (delay, loss, jitter, distance, monetary cost,...) Broad focus: What is the role of the graph/network on e2e QoS? What is the collective behavior of flows, the network dynamics? P. Van Mieghem 3

Fractal ature of the Internet P. Van Mieghem 4

Large Graphs protein P. Van Mieghem 5

Internet observed by RIPE boxes

Ad-hoc network Despite node movements, at any instant in time the network can be considered as a graph with a certain topology Physics of the evolution of Ad-hoc graphs P. Van Mieghem 7

Outline Introduction The Art of Modeling: The Basic Model Conclusions P. Van Mieghem 8

etwork Modeling Any network can be representated as a graph G with nodes and L links: adjaceny matrix, degree (=number of direct neighbors), connectivity, etc... link weight structure: importance of a link Link weights: also known as quality of service (QoS) parameters delay, available capacity, loss, monetary cost, physical distance, etc... Assumption that transport follows shortest paths correct in more than 70% cases in the Internet P. Van Mieghem 9

etwork Modeling Confinement to: Hop Count (or number of hops) of the shortest path: Apart from end systems, QoS degradation occurs in routers (= node). QoS measures (packet delay, jitter, packet loss) depend on the number of traversed routers and not on the length of shortest path. Relatively easy to measure (trace-route utility) and to compute (initial assumption) Weight (End-to-end delay) of the shortest path: perhaps the most interesting QoS measure for applications difficult to measure accurately capacity of a link: measurement is related to a delay measurement (dispersion of two IP packets back-to-back) P. Van Mieghem 10

Simple Topology Model Link weights w(i j )>0: 2 unknown, but very likely not constant, w(i j ) 1 5/2 1 2 assumption: i.i.d. uniformly/exponentially distributed 5 3 3 bi-directional links: w(i j )= w(j i ) 1 4 1 Complete graph K One level of complexity higher: ER Random graphs of class G p () : number of nodes p: link density or probability of being edge i j equals p only connected graphs: p > p c ~ log/ 2/3 Is this the right structure? Are exponential weights reasonable? ot a good model for the Internet graph, but reasonably good for Peer to Peer networks (Gnutella, KaZaa) and Ad Hoc networks P. Van Mieghem 11

Markov Discovery Process in the complete graph K node A tt 0 1) property of i.i.d. exponential r.v. s min 1 j n n = ( Exp( a ) j ) Exp a j j= 1 2) memoryless property of exponential distribution 3) transition rates for K : λ = n( ) n, n+ 1 n t t 1 B t 2 t 3 time T P. Van Mieghem 12

Markov Discovery Process and Uniform Recursive Tree v 8 v 7 v 6 v 5 v 4 v 3 v 2 v 1 τ 6 0 1 3 2 7 4 5 Markov Discovery Process 6 8 time Discovery times: v k = τ j j= 1 where τ j is exponentially distributed 1 with mean τ = [ ] E j j( k j) 0 2 6 4 corresponding URT 3 7 8 1 5 H = 0 H = 1 H = 2 H = 3 Growth rule URT: Every not yet discovered node has equal probability to be attached to any node of the URT. Hence, the number of URTs with nodes equals (-1)! P. Van Mieghem 13

Uniform Recursive Tree 1 Root (0) X = 1 6 2 3 5 26 12 18 4 9 7 8 10 11 15 22 14 21 13 16 20 17 (1) X (2) X (3) X = 5 = 9 = 7 Pr X ( k ) E [ ] [ X ] y = k = ( k ) = 0 if k 24 19 23 25 Recursion from Growth rule URT: P. Van Mieghem 14 E (4) X = 4 [ ] [ ] 1 ( k 1) ( k ) E X m X = Recursion for generating function: ( 1) ϕ 1( z) = ( + z) ϕ ( z) y Solution: ( z) = E[ z ] ϕ Γ( + z) = Γ( + 1) Γ( z + 1) + + m= k m

Hopcount of the shortest path in K with exp. link weights Shortest path (SP) described via Markov discovery process. Shortest path tree (SPT) = Uniform recursive tree (URT) The generating function of the hopcount H of shortest path between different nodes is from which (via Taylor expansion around z = 0 and Stirling s approximation) follows that and E [ z ] H = ( z) 1 with 1 ϕ Γ( + z) ϕ ( z) = Γ( + 1) Γ( z + 1) P [ H = k] E var k m= k m log cm 0 ( k m)! [ H ] log + γ 1 [ ] H log + γ 2 π 6 close to Poisson P. Van Mieghem 15

The Weight of SP v 8 v 7 v 6 v 5 v 4 v 3 v 2 v 1 0 1 3 2 4 5 Markov Discovery Process 6 time The k-th discovered node is attached at time v k = τ 1 +τ 2 +...+τ k where τ n is exponentially distributed with rate n(-n) and the τ s are independent (Markov property): E k zvk [ e ] = n( n) n= 1 z + n( n) The weight W of the SP in the complete graph with exponential link weights is zw zvk [ ] = E[ e ] Pr[ endnode is k - th attached node in URT] E e or ϕ W k = 1 ( z) = E e k zw 1 [ ] = 1 n( n) k = 1 n= 1 z + n( n) τ 6 7 8 From this pgf, the mean weight (length) is derived as E 1 ' 1 1 [ ] W = ϕ W (0) = 1 n= 1 n P. Van Mieghem 16

Additional Results The asymptotic distribution of the weight of the SP is x e lim Pr[ W log x] = e The flooding time (= minimum time to inform all nodes in network) is = 1 v 1 τ j= 1 j The weight W U of the shortest path tree can be computed explicitly. Other instances of trees (see Multicast). The degree distribution (= number of direct neighbors) in the URT can be computed explicitly, Pr log k 1 k [ D = k] = 2 + O 2 P. Van Mieghem 17

Outline Introduction The Art of Modeling: Extension & Measurements Conclusions P. Van Mieghem 18

Extending the Basic Model From the complete graph K to the Erdös-Rényi random graph G p (): from p = 1 to p < 1. Result (which can be proved rigorously): for large and fixed p > p c ~ log/ holds that the SPT in the class of G p () with exponential link weights is a URT. Intuitively: G p () with p > p c is sufficiently dense (there are enough links) and the link weights can be arbitrarily small such that the thinning effect of the link weights is precisely the same in all connected random graphs. Implications: basic model (SPT = URT) seems more widely applicable! influence of the link weight structure is important P. Van Mieghem 19

RIPE measurement configuration The traceroute data provided by RIPE CC (the etwork Coordination Centre of the Réseaux IP Européen) in the period 1998-2001. Central Point MySQL Database Test Box A ISP A The results ISP C Test Box C Internal Internal networks networks Border Router Border Router Internal Internal networks networks Internal Internal networks networks Border Router Border Router Internal Internal networks networks Test Box B ISP B ISP D Test Box D Probe-packets P. Van Mieghem 20

Model and Internet measurements 0.10 E[h]=13 Var[h]=21.8 α=e[h]/var[h]=0.6 Pr[H = k] 0.08 0.06 0.04 Measurement from RIPE fit(log())=13.1 P [ H = k] k m= k m log cm 0 ( k m)! 0.02 0.00 0 5 10 hop k 15 20 25 2000 P. Van Mieghem 21

Model and Internet measurements 0.10 0.08 Asia Europe USA fit with log( Asia ) = 13.5 fit with log( Europe ) = 12.6 fit with log( USA ) = 12.9 Pr[H = k] Pr[H = k] 0.06 0.04 0.02 0.00 0 5 10 15 hop k Hop k 20 25 30 october 2004 P. Van Mieghem 22

But Power law graph τ = 2.4 G 0.013 (300) Internet: power law graph for AS? o consistent model for hopcount, although seemingly success of rg with exp. or unif. link weight structure P. Van Mieghem 23

Degree graph Scaling law for hopcount in Degree Graph defined by Pr[D > x] = c x 1-τ with mean µ = E[D]. For τ > 3, α > 0, and a k = [Log ν k]-log ν k Pr[H [Log ν ] = k] = Pr[R a = k] + O(log) α where the random variable R a Pr[R a > k] = E[ exp{- κ ν a+k W1W2} W1W2 > 0] with and ν = E[D(D-1)]/E[D] and κ = E[D]/(ν 1) and where W1 and W2 are independent normalized copies of a branching process. [Work in collaboration with R. van der Hofstad and G. Hooghiemstra] Importance: (a) for each and M where a = a M (i.e. M=/ν k ) Pr[H >k] follows by a shift over k hops from Pr[H M >k]. Hence, the hopcount for arbitrary large degree graphs (e.g. Internet?) can be simulated. (b) Currently most accurate hopcount formula E [ H ] ( ν 1) E[ logw W 0] log 1 γ + log µ log > + 2 logν 2 logν logν P. Van Mieghem 24

Outline Introduction The Art of Modeling: The Link Weight Structure Conclusions P. Van Mieghem 25

Extreme Value Index SP is mainly determined by smallest link weights in the distribution F w (x) = Pr[w < x] Taylor: F w (x) = F w (0) +F w (0).x+O(x 2 ) = f w (0).x+O(x 2 ) SP is not changed by scaling of link weights, hence, f w (0), can be considered as scaling factor: 1. f w (0) > 0 is finite: regular distribution 2. f w (0) = 0: link weights cannot be arbitrarily small; more influence of topology 3. f w (0) : increasing prob. mass at x = 0 Polynomial distribution: F w (x) = x α F w (x) x in [0,1] 1 = 1 x > 1 α < 1 where α is the extreme value index α = 1 1. α = 1: regular distribution α > 1 x 2. α > 1: decreasing influence w 0 1 3. α < 1: increasing influence w P. Van Mieghem 26

Link Weight Distribution 1 F w (x) α < 1 For the shortest path, only the small region around zero matters! α = 1 α > 1 0 ε larger scale 1 x P. Van Mieghem 27

α : all link weights are equal to w = 1 Three regimes no influence of link weights; only the topology matters α 1 : link weights are regular (e.g. uniform, exponential) SPT = URT if underlying graph is connected α 0 : strong disorder: heavily fluctuating link weights in region close to x = 0 union of all SPT is the minumum spanning tree (MST) P. Van Mieghem 28

Average Hopcount Average umber of Hops 30 25 20 15 10 Limit: α = 0 α = 0.05 α = 0.1 α = 0.2 α = 0.4 α = 0.6 α = 0.8 p = 2 ln / 5 3 4 5 6 7 8 9 100 2 3 4 5 6 7 8 9 1000 umber of odes [ H ( α )] ln α 1/3 [ H ( α )] = O( ) for α 0 E E for α around1 P. Van Mieghem 29

MST and URT 7 7 89 93 78 91 16 54 64 27 98 34 4 79 19 85 95 45 39 99 97 90 57 47 42 8 9 31 48 57 70 56 100 62 46 40 80 33 77 76 81 43 86 12 37 13 35 36 67 84 11 41 26 82 45 3 29 28 49 50 38 60 44 55 22 58 83 63 72 71 15 36 91 42 33 51 32 53 89 25 69 14 82 66 2 98 18 76 24 68 96 78 38 88 3 52 56 81 73 61 52 67 53 84 30 20 90 77 51 1 15 28 65 22 100 10 9 85 17 80 54 63 40 92 26 98 74 6 23 37 18 12 19 75 72 6 10 65 17 75 21 24 47 13 46 8 34 2 97 73 59 94 5 44 23 95 69 66 27 92 71 11 68 58 29 86 55 41 79 59 96 31 5 60 49 99 1 20 83 38 = 100 nodes 25 16 21 94 39 88 43 64 70 48 14 50 35 87 4 30 74 62 MST (α (a) 0 limit) URT (α(b) 1 limit) P. Van Mieghem 30

Phase Transition 1.0 0.8 MST-like = 25 = 50 = 100 = 200 Pr[G Uspt(α) = MST] 0.6 0.4 α c = O( -β ) α = O( -β ) β around 2/3 0.2 0.0 α URT-like 0.01 0.1 1 10 Van Faculty Mieghem, of Electrical P. Engineering, and S. M. Mathematics Magdalena, and Computer "A Phase ScienceTransition in the Link Weight Structure P. Van of Mieghem etworks", 31 Physical Review E, Vol. 72, ovember, p. 056138, (2005). α/α c

Phase Transition: implications ature: superconductivity T < T c : macroscopic quantum effect (R = 0) T > T c : normal conductivity (R > 0) etworks: Artifically created phase transition if link weight structure can be changed independently of topology, control of transport: α < α c : almost all over critical backbone (MST) α > α c : spread over more paths; load balanced critical backbone has minimum possible number of links 1: only these links need to be secured P. Van Mieghem 32

G Uspt : Union of SPTs Observable Part of a etwork Minimum spanning tree belongs to G Uspt Degree distribution in the overlay G Uspt on the complete graph with exp. link weights is exactly known. Conjecture: For large, the overlay G Uspt on the ER random graph G p () with i.i.d. regular weights and any p > p c is a connected ER graph G p c (). we have good arguments, not a rigorous proof important for Peer-to-peer networks! P. Van Mieghem 33

Outline Introduction The Art of Modeling: Conclusions P. Van Mieghem 34

Basic SPT model: URT Summarizing simple, analytic computations possible reasonable first order model to approximate ad-hoc networks, less accurate for Internet (degree!) Link weight structure is important: much room for research: what is link weight structure of a real network? what are relevant weights (delay, loss, distance, etc...?) how to update link weights? if link weights can be chosen independent of topology, a phase transition exists: steering of traffic in two modes P. Van Mieghem 35

Summary Articles:http://www.nas.e wi.tudelft.nl/people/piet Book: Performance Analysis of Computer Systems and etworks, Cambridge University Press (2006) P. Van Mieghem 36