Random matrices, phase transitions & queuing theory Raj Rao Nadakuditi Dept. of Electrical Engg. & Computer Science http://www.eecs.umich.edu/~rajnrao Joint work with Jinho Baik, Igor Markov and Mingyan Liu 1
Queuing theory 2
A fundamental problem in queuing theory
[Figure: a tandem of m FIFO queues; per-stage timelines show Jobs 1..n passing through Stages 1..m, with idle periods and the end-to-end latency marked]
m = # servers, n = # customers (or jobs)
Objective: Characterize L(m, n) = exit time for the n-th customer from the m-th queue
Model for production systems, multi-hop networks, pipelined computation 3
Why is characterizing latency important?
Many existing applications are delay-sensitive
- Streaming audio and video, particularly audio
- Production systems: optimal scheduling/provisioning, delay-throughput tradeoff
Emerging applications envision control and inference over large networks
- Telemedicine, sensor networks and distributed computation
- Quality of Service (QoS) guarantees important
Network topology design
- Ad-hoc, multi-hop networks prevalent (e.g. delivering internet to rural areas)
- Optimal placement of hops? Remote diagnosis of service bottlenecks?
Statistical characterization of delay important 4
A basic model
[Figure: a tandem of m FIFO queues]
Notation: S_i = Server i ∈ {1, ..., m}; C_j = Customer j ∈ {1, ..., n}; w(i, j) = service time for C_j at S_i
Assumptions: infinitely long buffers; arrival process is Poissonian with rate α; w(i, j) ind. ~ exp(1/µ_i) ⇒ M/M/m queue
Question: Average delay? 5
Little's Law and average delay
Informally: Avg. Time in System = Avg. # of Cust. / Eff. Arrival Rate
By Burke's Theorem: P(# Cust. in Queue i = k) = (1 − α/µ_i)(α/µ_i)^k for k = 0, 1, ...
Consequently: Avg. # Cust. in System = Σ_{i=1}^m α/(µ_i − α) 6
What Little's law says and does not say
Avg. Time in System = (1/α) Σ_{i=1}^m α/(µ_i − α) = Σ_{i=1}^m 1/(µ_i − α)
Mathematically: Avg. Time in System = lim_{t→∞} Σ_{i=0}^{α(t)} (Time spent by Customer i) / α(t), where α(t) = # customers who arrived in the interval [0, t]
No insights on: variance, pdf, bottleneck behavior, etc.
Contrast with L(m, n) = exit time for Customer n from Server m: a transient-like statistic! Computable? Yes, using random matrix theory! 8
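The average-delay formula above can be sanity-checked by simulation. A minimal Monte Carlo sketch with illustrative parameters (α = 0.5 and three servers with rates 1, 1.5, 2; these values are ours, not from the talk): run a long tandem-queue simulation via the departure-time recursion and compare the empirical mean sojourn time against Σ_i 1/(µ_i − α).

```python
import numpy as np

# Illustrative rates (not from the talk): Poisson arrivals at rate alpha,
# exponential service at rates mu_i, FIFO tandem.
rng = np.random.default_rng(0)
alpha, mu = 0.5, np.array([1.0, 1.5, 2.0])   # requires alpha < min(mu)
m, n = len(mu), 100_000

A = np.cumsum(rng.exponential(1.0 / alpha, n))     # Poisson arrival times
w = rng.exponential(1.0 / mu[:, None], (m, n))     # w(i,j) ~ exp(1/mu_i)

D = A.copy()                                       # departures from "stage 0" = arrivals
for i in range(m):                                 # D_i(j) = max(D_{i-1}(j), D_i(j-1)) + w(i,j)
    for j in range(n):
        if j and D[j - 1] > D[j]:                  # wait behind the previous customer
            D[j] = D[j - 1]
        D[j] += w[i, j]

sojourn = (D - A)[n // 10:]                        # drop a warm-up transient
predicted = np.sum(1.0 / (mu - alpha))             # Little + Burke: sum_i 1/(mu_i - alpha)
```

With these rates the prediction is 1/0.5 + 1/1.0 + 1/1.5 ≈ 3.667, and the empirical mean should land close to it.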
What makes the problem difficult?
[Figure: a tandem of m FIFO queues with per-stage job timelines]
m = # servers, n = # customers (or jobs)
Objective: Characterize L(m, n) = exit time for the n-th customer from the m-th queue
Strong interaction between arrival and departure processes ⇒ no independence 9
New insights beyond Little's Law: Main message
- Latency mean and variance can be explicitly computed!
- Analysis reveals emergence of phase transitions: rigorous basis for statistical anomaly testing
- Can show that O(n^{1/3}) jobs have statistically independent latencies
- Extends easily to quasi-reversible networks (thanks Demos!)
- Analysis of queue-state-dependent servicing (inspired by backpressure algorithms)
- Results appear to hold even for non-exponential service times: universality conjecture!
All made possible due to the connection with random matrix theory! 10
Phase transitions 11
A numerical example
G = Gaussian random matrix: G = randn(n,n) or G = sign(randn(n,n))
X_n = (G + G')/√(2n), so the entries of G/√(2n) are i.i.d. with zero mean and variance 1/(2n)
X̃_n = X_n + P_n, where P_n = θ u u' and u is a fixed, non-random unit-norm vector
Question: Largest eigenvalue? Variation with θ? 12
One experimental realization
[Histogram of the eigenvalues of X̃_n for θ = 4, n = 500]
Bulk obeys the semicircle law on [−2, 2]; largest eig. ≈ 4.2 13
An eigenvalue phase transition
[Plot of λ₁(X_n + P_n) versus θ: theory curve against experiments at n = 50 and n = 500]
Clear phase transition @ θ = 1 with increasing n 14
Phase transition prediction
Theorem: Consider X̃_n = X_n + θ u u'. As n → ∞,
λ₁ →a.s. θ + 1/θ if θ > 1, and 2 otherwise
|⟨ũ₁, u⟩|² →a.s. 1 − 1/θ² if θ > 1, and 0 otherwise
Eigenvalue result first due to Péché (2006), Péché-Féral (2007)
Eigenvector result new (and derived by us)
Eigenvalues and eigenvectors are biased 15
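The prediction is easy to check numerically. A minimal sketch of the experiment from the slides above (real symmetric case, with n = 500 and θ = 4 as in the earlier realization; the spike direction u = e₁ is an arbitrary choice of fixed unit vector):

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta = 500, 4.0                      # sizes from the slides; seed is ours

G = rng.standard_normal((n, n))
X = (G + G.T) / np.sqrt(2 * n)           # Wigner matrix, semicircle bulk on [-2, 2]
u = np.zeros(n); u[0] = 1.0              # fixed, non-random unit-norm vector (e_1)
Xt = X + theta * np.outer(u, u)          # rank-one perturbation P_n = theta * u u'

vals, vecs = np.linalg.eigh(Xt)          # eigendecomposition, ascending order
lam1 = vals[-1]                          # prediction: theta + 1/theta = 4.25
overlap = (vecs[:, -1] @ u) ** 2         # prediction: 1 - 1/theta^2 = 0.9375
```

A single draw at n = 500 already lands within a few percent of both limits; averaging over draws tightens the agreement.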
Phase transitions & Random matrix theory or What theory predicts the phase transition? 16
Definitions and assumptions
Spectral measure: if the eigenvalues of X_n are λ₁, ..., λ_n, then µ_{X_n} = (1/n) Σ_{i=1}^n δ_{λ_i}
Assumptions: 1. µ_{X_n} →a.s. µ_X  2. µ_X compactly supported on [a, b]  3. max(eig) →a.s. b 17
A basic signal-plus-noise model
X̃_n = θ u uᴴ + X_n
Assumptions: X_n is symmetric with n real eigenvalues; spike strengths θ₁ > ... > θ_k > 0; X_n = Q Λ Q* where Q is a Haar-distributed unitary (or orthogonal) matrix; u is a unit-norm vector
X_n = GG* will satisfy these conditions 18
Phase transition in the eigenvalues
Theorem [Benaych-Georges and N.]: As n → ∞,
λ₁(X̃_n) →a.s. G_µ^{−1}(1/θ) if θ > θ_c := 1/G_µ(b⁺), and b otherwise
Critical threshold depends explicitly on the spectral measure of the noise
Cauchy transform of µ: G_µ(z) = ∫ 1/(z − y) dµ(y) for z ∉ supp µ_X 19
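A sketch of how the theorem is used in practice, taking the semicircle law as a worked example so that G_µ has the closed form G_µ(z) = (z − √(z² − 4))/2 with edge b = 2 (for a general µ the transform would be computed from the spectral measure instead; the helper names are ours): compute θ_c from G_µ(b⁺), then invert G_µ by bisection to locate the outlier.

```python
import numpy as np

def G_semicircle(z):
    """Cauchy transform of the semicircle law on [-2, 2], valid for z > 2."""
    return (z - np.sqrt(z * z - 4.0)) / 2.0

def lam1_limit(theta, b=2.0):
    """a.s. limit of lambda_1: G^{-1}(1/theta) above threshold, edge b below it."""
    theta_c = 1.0 / G_semicircle(b + 1e-9)     # G_mu(b+) = 1, so theta_c = 1 here
    if theta <= theta_c:
        return b
    # G_mu is decreasing on (b, inf): bisect for the rho with G_mu(rho) = 1/theta
    lo, hi = b + 1e-9, b + 10.0 * theta
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if G_semicircle(mid) > 1.0 / theta:    # still left of the root
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For the semicircle this reproduces the closed form θ + 1/θ of the earlier slide whenever θ > 1, and returns the bulk edge 2 otherwise.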
Phase transition of eigenvectors
Theorem [Benaych-Georges and N.]: As n → ∞, for θ > θ_c:
|⟨ũ₁, u⟩|² →a.s. −1/(θ² G′_µ(ρ)), where ρ = G_µ^{−1}(1/θ) is the corresponding eigenvalue limit
Theorem: As n → ∞, for θ ≤ θ_c: ⟨ũ₁, u⟩ →a.s. 0
Needs eigenvalue density at the edge of the form (b − x)^α with α ∈ (0, 1] 20
Above phase transition
[(a) Eigenvalue, θ > θ_c: an outlier at ρ separates from the bulk edge b. (b) Eigenvector, θ > θ_c: ũ partially aligns with u] 21
Below phase transition
[(c) Eigenvalue, θ ≤ θ_c: no outlier beyond the bulk edge b. (d) Eigenvector, θ ≤ θ_c: ũ carries no information about u] 22
The queuing theory connection 23
Problem setup
[Figure: a tandem of m FIFO queues]
Assumptions: infinitely long buffers; arrival process is Poissonian with rate α; w(i, j) ind. ~ exp(1/µ_i) ⇒ M/M/m queue
Objective: Compute L(m, n) = exit time for a batch of n customers when the queues are in equilibrium before the batch of n customers arrives 24
The random matrix connection
Theorem [Baik & N., 2012]: L(m, n) =D λ₁(W), where W = Γ^{1/2} g g* Γ^{1/2} + Σ^{1/2} G G* Σ^{1/2}
G is an m × (n−1) matrix of i.i.d. CN(0, 1) entries; g is an m × 1 vector of i.i.d. CN(0, 1) entries
Σ = diag(1/µ₁, ..., 1/µ_m), Γ = diag(1/(µ₁ − α), ..., 1/(µ_m − α))
Sanity check: α = 0, n = 1 gives L(m, n) = Σ_i (1/µ_i) |g_i|²; each |g_i|² is chi-squared with 2 d.o.f., i.e. exponential! 25
The random matrix connection
Theorem [Baik & N., 2012]: L(m, n) =D λ₁(W), where W = Γ^{1/2} g g* Γ^{1/2} + Σ^{1/2} G G* Σ^{1/2}
Rank-one signal plus noise ⇒ expect a phase transition!
G is an m × (n−1) matrix of i.i.d. CN(0, 1) entries; g is an m × 1 vector of i.i.d. CN(0, 1) entries
Σ = diag(1/µ₁, ..., 1/µ_m), Γ = diag(1/(µ₁ − α), ..., 1/(µ_m − α)) 26
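A Monte Carlo sanity check of the distributional identity, with illustrative sizes and rates (m = n = 5, all µ_i = 1, α = 0.5; these are our choices, not values from the talk): sample λ₁(W) as defined above, and independently simulate L(m, n) through the exit-time recursion with the first customer served at the residual rate µ_i − α, which is the equilibrium mapping described later in the deck. Equality in distribution implies the two empirical means should agree.

```python
import numpy as np

rng = np.random.default_rng(0)
m = n = 5                                  # illustrative sizes
mu = np.ones(m)                            # service rates mu_i = 1
alpha = 0.5                                # arrival rate, alpha < min(mu)
trials = 1000

def cn(shape):
    """CN(0,1): real and imaginary parts each N(0, 1/2)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

sg = np.diag(np.sqrt(1.0 / mu))            # Sigma^{1/2}
gm = np.diag(np.sqrt(1.0 / (mu - alpha)))  # Gamma^{1/2}

def lam1_W():
    g, G = cn(m), cn((m, n - 1))
    W = gm @ np.outer(g, g.conj()) @ gm + sg @ (G @ G.conj().T) @ sg
    return np.linalg.eigvalsh(W)[-1]       # largest eigenvalue (Hermitian W)

def latency():
    w = rng.exponential(1.0 / mu[:, None], (m, n))
    w[:, 0] = rng.exponential(1.0 / (mu - alpha))   # first customer: rate mu_i - alpha
    L = np.zeros((m, n))
    for i in range(m):                     # L(i,j) = max(L(i-1,j), L(i,j-1)) + w(i,j)
        for j in range(n):
            L[i, j] = max(L[i - 1, j] if i else 0.0,
                          L[i, j - 1] if j else 0.0) + w[i, j]
    return L[-1, -1]

mean_rmt = np.mean([lam1_W() for _ in range(trials)])
mean_que = np.mean([latency() for _ in range(trials)])
```

With 1000 trials the two means typically agree to well under one unit, against a mean of order 15 to 20 at these sizes.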
New insight: phase transitions in queuing behavior
Recall: arrival process is Poissonian with rate α < µ_i; w(i, j) ind. ~ exp(1/µ_i) ⇒ M/M/m queue
A critical rate: l_crit = z such that Σ_i 1/(µ_i − z)² − n/z² = 0, z = l_crit ∈ (0, µ_min)
Theorem [Baik & N., 2012]:
Case 1: 0 < l_crit < α (arrival rate faster than critical rate): L(m, n) is normally distributed, mean O(n), variance O(m)
Case 2: l_crit > µ_min (slowest server slower than critical rate): L(m, n) is normally distributed, mean O(n), variance O(m)
Case 3: α < l_crit < µ_min (slowest server fast enough, arrival rate slow enough): L(m, n) is Tracy-Widom distributed, mean O(n), variance O(m^{2/3}) 27
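The critical rate is defined implicitly, but it is easy to compute: Σ_i 1/(µ_i − z)² − n/z² is strictly increasing on (0, µ_min), running from −∞ at 0⁺ to +∞ at µ_min⁻, so the root exists, is unique, and bisection finds it. A minimal sketch (the helper name critical_rate is ours):

```python
import numpy as np

def critical_rate(mu, n, iters=200):
    """Root of sum_i 1/(mu_i - z)^2 - n/z^2 = 0 on (0, min(mu)), by bisection.

    The left side is strictly increasing in z on this interval, from -inf
    at z -> 0+ to +inf at z -> min(mu)-, so the root is unique.
    """
    mu = np.asarray(mu, dtype=float)
    f = lambda z: np.sum(1.0 / (mu - z) ** 2) - n / z ** 2
    lo, hi = 1e-9 * mu.min(), (1.0 - 1e-9) * mu.min()
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:                   # still left of the root
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, m servers with all µ_i = 1 and n = m give l_crit = 1/2 exactly, which is consistent with the stochastic phase transition at λ = 2 (i.e. µ_min = 1/2) noted on a later slide.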
Tracy-Widom versus Normal
[Overlaid PDFs of the TW₂ distribution and the normal distribution] 28
New insights: phase transitions and more
A critical rate: l_crit = z such that Σ_i 1/(µ_i − z)² − n/z² = 0, z = l_crit ∈ (0, µ_min)
Theorem [Baik & N., 2012]:
Case 1: 0 < l_crit < α (arrival rate faster than critical rate): L(m, n) is normally distributed, mean O(n), variance O(m)
Case 2: l_crit > µ_min (slowest server slower than critical rate): L(m, n) is normally distributed, mean O(n), variance O(m)
Case 3: α < l_crit < µ_min (slowest server fast enough, arrival rate slow enough): L(m, n) is Tracy-Widom distributed, mean O(n), variance O(m^{2/3}) 29
The importance of the variance scaling result
An elementary bound: var(max_i X_i) ≤ Σ_i var(X_i)
Upper-bounding latency: var L(m, n) ≤ O(n)
Insight 1: the upper bound is matched only when there is a bottleneck!
Insight 2: the realized variance is much less than the upper bound!
Service provisioning based on the upper bound is very conservative: an opportunity for performance gains, or to relax system specs while still meeting existing QoS requirements!
Work with Mingyan Liu on optimal file-splitting in multi-route, multi-hop networks 30
Numerical results

m     n     Mean (Expt)   Mean (Theory)   Var (Expt)   Var (Theory)
5     5     13.1024       12.3685         9.4351       15.0981
10    10    30.9954       30.3849         18.6033      23.9668
20    20    68.3172       67.8858         33.0268      38.0449
40    40    145.0274      144.7371        55.1251      60.3926
80    80    300.9902      300.7699        90.0644      95.8673
160   160   615.9515      615.7717        148.8302     152.1799
320   320   1249.4124     1249.4742       236.0294     241.5705
480   480   1885.7545     1885.0567      311.7331     316.5469
640   640   2521.6221     2521.5399       374.6064     383.4693
1000  1000  3955.4348     3955.3710       506.5496     516.3498

Empirical mean and variance of L(m, n) compared to theoretical predictions. Here µ₁ = ... = µ_m = 1 31
Numerical results
[Plot: mean end-to-end latency per stage versus λ; experiments at n = m = 80 and n = m = 1000 against theory]
Here n = m, µ₁ = ... = µ_{m−1} = 1, µ_m = 1/λ; exponential service times
Regime where the bottleneck does not affect the distribution! 32
Numerical results
[Plot: mean end-to-end latency per stage versus λ; lognormal experiments at m = 80 and m = 320 against the exponential theory; predicted stochastic phase transition @ λ = 2]
Here n = m, µ₁ = ... = µ_{m−1} = 1, µ_m = 1/λ; lognormal service times
Conjecture: distribution-independent limiting distribution 33
A fundamental recursion
Notation: S_i = Server i ∈ {1, ..., m}; C_j = Customer j ∈ {1, ..., n}; w(i, j) = service time for C_j at S_i; L(i, j) = exit time for C_j from S_i
Fact (Glynn & Whitt, Tembe & Wolff):
L(i, j) = w(i, j) + L(i−1, j) when L(i, j−1) < L(i−1, j), and w(i, j) + L(i, j−1) when L(i, j−1) > L(i−1, j)
Equivalently, L(i, j) = max{L(i−1, j), L(i, j−1)} + w(i, j) 34
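The recursion gives an exact O(mn) dynamic program for L(m, n), for any array of service times, exponential or otherwise. A minimal sketch (the function name exit_times is ours):

```python
import numpy as np

def exit_times(w):
    """Exit times via L(i,j) = max(L(i-1,j), L(i,j-1)) + w(i,j).

    w: (m, n) array of service times w(i, j); returns the full table L,
    so L[-1, -1] is the end-to-end latency L(m, n) of the last customer.
    """
    m, n = w.shape
    L = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            L[i, j] = max(L[i - 1, j] if i else 0.0,
                          L[i, j - 1] if j else 0.0) + w[i, j]
    return L

rng = np.random.default_rng(0)
w = rng.exponential(1.0, (80, 80))   # exp(1) service times, all mu_i = 1
Lmn = exit_times(w)[-1, -1]
```

A quick deterministic check: with all w(i, j) = 1 on a 3 × 3 grid, every up/right path from (1,1) to (3,3) visits 5 sites, so L(3, 3) = 5.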
The directed last-passage percolation problem
[Grid of sites from (1,1) to (m,n) with directed up/right paths]
L(m, n) = max{L(m−1, n), L(m, n−1)} + w(m, n) 35
The directed last-passage percolation problem
[Grid of sites from (1,1) to (m,n) with directed up/right paths]
L(m, n) = max_{π ∈ P(m,n)} Σ_{(k,l) ∈ π} w(k, l)
P(m, n) is the set of up/right paths ending at (m, n) 36
The random matrix connection
L(m, n) = max_{π ∈ P(m,n)} Σ_{(k,l) ∈ π} w(k, l)
Theorem [Borodin & Péché]: Assume w(i, j) ~ exp(1/(a_i + b_j)) and X_ij ~ CN(0, 1/(a_i + b_j)). Then L(m, n) =D λ₁(XX*)
Related work by Johansson (2000)
Result easily extended to Poissonian (discrete) random variables 37
The percolation mapping for our problem
[Grid: customers C₁, ..., C₆ (rows) versus servers S₁, ..., S₇ (columns); the first customer's row carries rates µ₁ − α, ..., µ₇ − α, all other rows carry rates µ₁, ..., µ₇]
Note that the queues are in equilibrium before the first customer enters
Queue lengths are random and have a (shifted) geometric distribution
First customer served at S_i with rate µ_i − α, the rest with rate µ_i
PASTA property: Poisson Arrivals See Time Averages 38
Ergo the random matrix connection
Theorem [Baik & N., 2012]: L(m, n) =D λ₁(W), where W = Γ^{1/2} g g* Γ^{1/2} + Σ^{1/2} G G* Σ^{1/2}
Rank-one signal plus noise ⇒ expect a phase transition!
G is an m × (n−1) matrix of i.i.d. CN(0, 1) entries; g is an m × 1 vector of i.i.d. CN(0, 1) entries
Σ = diag(1/µ₁, ..., 1/µ_m), Γ = diag(1/(µ₁ − α), ..., 1/(µ_m − α)) 39
Why the random matrix connection? 40
Where are the non-intersecting random walks?
[Figure: a tandem of m FIFO queues with per-stage job timelines]
FIFO protocol means exit-time trajectories do not intersect
Mathematics of random walks ⇒ classical probability theory
Mathematics of random walks conditioned not to intersect ⇒ random matrix theory 41
Bijection with TASEP & corner growth model http://www-wt.iam.uni-bonn.de/~ferrari/animations/continuoustasep.html 42
Non-intersecting random walks are everywhere!
[Image] Taken from Andrei Okounkov's 2006 Fields Medal citation 43
The traveling salesman problem
[Grid of sites from (1,1) to (m,n) with directed up/right paths]
L(m, n) = max_{π ∈ P(m,n)} Σ_{(k,l) ∈ π} w(k, l)
Fix w(k, l): what order of processing minimizes delay? Limits of scheduling?
Application-motivated extensions of RMT! 44
New insights beyond Little's Law: Main message
- Latency mean and variance can be explicitly computed!
- Analysis reveals emergence of phase transitions: rigorous basis for statistical anomaly testing
- Can show that O(n^{1/3}) jobs have statistically independent latencies
- Extends easily to quasi-reversible networks (thanks Demos!)
- Analysis of queue-state-dependent servicing (inspired by backpressure algorithms)
- Results appear to hold even for non-exponential service times: universality conjecture!
All made possible due to the connection with random matrix theory! 45