Modeling Service with Time-Varying Demand - Performance Approximations and Staffing Controls (with Beixiang He, Liam Huang, Korhan Aras, Ward Whitt) Department of Industrial and Systems Engineering NC State University A survey on recent results of time-varying queueing models
Education B.E. in Electrical Engineering, Tsinghua Univ., Beijing, 2002 M.S. & Ph.D. in Oper. Res., Columbia Univ., NY, 2007, 2011 Research Interests Method: stochastic modeling, applied prob., queueing theory Application: call centers, health care, manufacturing systems When I'm not teaching or writing papers...
Current Ph.D. students at NCSU Beixiang He (Graduating in August 2014) Korhan Aras (Graduating in Spring 2015) Liam Huang, Yao Yu (Ongoing) Former Ph.D. advisor Prof. Ward Whitt (Columbia University)
Nonstationary Systems: Time-Varying Arrivals. Figure panels: service call center (Green et al. 2007); emergency room (Yom-Tov and Mandelbaum 2011)
Realistic Model Features: General Distributions. Non-exponential service and abandonment. Figure panels: service; service and abandonment; abandonment (Brown et al. 2005)
The Base Queueing Model: $G_t/GI/s_t+GI$
Time-varying arrival rate $\lambda(t)$ (the $G_t$), e.g., nonhomogeneous Poisson ($M_t$)
I.I.D. service times with cdf $G(x) \equiv P(S \le x)$ (the first $GI$)
Time-varying staffing level $s(t)$ (the $s_t$)
I.I.D. abandonment times with cdf $F(x) \equiv P(A \le x)$ (the $+GI$)
First-Come First-Served (FCFS)
Unlimited waiting capacity
Textbook models (e.g., M/M/1): no longer useful!
The Base Queueing Model: Performance Measures
$Q(t)$ and $B(t)$: number waiting in queue and number in service at time $t$
$X(t) \equiv Q(t) + B(t)$: total number in system
$W(t)$ and $V(t)$: head-of-line and potential waiting times at time $t$
Exact analysis is HARD because of:
Time-varying arrival rate and staffing function
Customer abandonment
Large scale (large $\lambda(t)$ and $s(t)$)
Non-Markovian probability structure: non-Poisson, non-exponential
Complicated network structure
Many-Server Heavy-Traffic Approximations
A sequence of queues indexed by $n$
Arrival rate: $\lambda_n(t) = n\lambda(t)$
Number of servers: $s_n(t) = n s(t)$
System size grows with $n$
Individual customer behavior is unscaled: service cdf $G$ and patience cdf $F$ are held fixed, independent of $n$
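Many-Server Heavy-Traffic Limits
Fluid Limit
LLN scaling: $\bar Q_n(t) \equiv Q_n(t)/n$, $\bar B_n(t) \equiv B_n(t)/n$, $\bar X_n(t) \equiv X_n(t)/n$
FSLLN: $(\bar Q_n, \bar B_n, \bar X_n, W_n) \Rightarrow (Q, B, X, W)$ in $D^4$, as $n \to \infty$
Diffusion Limit
CLT scaling: $\hat Q_n(t) \equiv \sqrt{n}\,(\bar Q_n(t) - Q(t)) = (Q_n(t) - nQ(t))/\sqrt{n}$, $\hat W_n(t) \equiv \sqrt{n}\,(W_n(t) - W(t))$
FCLT: $(\hat Q_n, \hat B_n, \hat X_n, \hat W_n) \Rightarrow (\hat Q, \hat B, \hat X, \hat W)$ in $D^4$, as $n \to \infty$
Approximations
$Q_n(t) = nQ(t) + \sqrt{n}\,\hat Q(t) + o(\sqrt{n}) \stackrel{d}{\approx} N\big(nQ(t),\ n\sigma^2_{\hat Q}(t)\big)$
$W_n(t) = W(t) + \hat W(t)/\sqrt{n} + o(1/\sqrt{n}) \stackrel{d}{\approx} N\big(W(t),\ \tfrac{1}{n}\sigma^2_{\hat W}(t)\big)$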
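As a rough illustration of how the Gaussian approximation for $Q_n(t)$ might be used in practice, the sketch below turns a fluid value $Q(t)$ and a diffusion variance $\sigma^2_{\hat Q}(t)$ at one fixed time into a mean, a standard deviation and a two-sided band. The function name and the numerical inputs are hypothetical placeholders, not values from the talk.

```python
import math

def gaussian_queue_approx(n, q_fluid, var_diffusion, z=1.96):
    """Approximate Q_n(t) by N(n*Q(t), n*sigma^2(t)) at one fixed time t.

    q_fluid and var_diffusion are assumed to be supplied by a fluid and
    diffusion computation; here they are just numbers.
    """
    mean = n * q_fluid
    std = math.sqrt(n * var_diffusion)
    # Queue lengths are nonnegative, so the lower band is truncated at zero.
    return mean, std, max(0.0, mean - z * std), mean + z * std

# Hypothetical inputs: Q(t) = 0.25 and sigma^2(t) = 0.36 with n = 100 servers.
mean, std, lower, upper = gaussian_queue_approx(100, 0.25, 0.36)
print(mean, std, lower, upper)
```

The band is only as good as the limit theorems allow, so it should be read as a large-$n$ guide rather than an exact interval.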
MSHT Fluid Limits
What Are s?
MSHT Fluid Limit (FWLLN): random → deterministic; discrete → continuous; finite → infinitely divisible.
Realistic Model Features: Network Structure
Markovian routing: upon finishing at one station, flip a coin to decide the next station
Pros: captures randomness in routing; tractable due to the Markov property
Cons: completely ignores service history; too crude to approximate real systems
Prescribed paths: each customer follows a prescribed itinerary (path); customers with the same itinerary belong to the same class
Pros: characterizes different classes; captures service history
Cons: difficult to analyze (no Markov property)
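Fluid Model: A Dynamical System
Key functions: fluid densities
$Q(t,y)$: quantity of fluid in queue at time $t$ that has been there for up to $y$; $Q(t,y) \equiv \int_0^y q(t,x)\,dx$
$B(t,y)$: quantity of fluid in service at time $t$ that has been there for up to $y$; $B(t,y) \equiv \int_0^y b(t,x)\,dx$
Fluid densities: transport PDE
$\frac{\partial q(t,x)}{\partial t} + \frac{\partial q(t,x)}{\partial x} = -h_F(x)\,q(t,x)$, $0 \le x \le w(t)$, $t \ge 0$
$\frac{\partial b(t,x)}{\partial t} + \frac{\partial b(t,x)}{\partial x} = -h_G(x)\,b(t,x)$, $x \ge 0$, $t \ge 0$
Fluid waiting time: ODE for the head-of-line (HOL) and potential waiting times at $t$
$w'(t) = 1 - \frac{b(t,0)}{q(t,w(t))}$ and $v'(t) = \frac{q(t+v(t),\,v(t))}{b(t+v(t),\,0)} - 1$
Rate into service: fixed-point equation (FPE)
$b(t,0) = S'(t) + \int_0^t b(t-x,0)\,g(x)\,dx$, $t \ge 0$, where $S(t)$ is the fluid staffing function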
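A minimal numerical sketch of the service-content part of the model in an underloaded interval, assuming the system starts empty and stays underloaded. Under that assumption all arriving fluid enters service immediately, so $b(t,0) = \lambda(t)$, the transport PDE gives $b(t,x) = \lambda(t-x)\bar G(x)$, and $B(t) = \int_0^t \lambda(t-x)\bar G(x)\,dx$. The function name is hypothetical, and exponential service is swapped in for simplicity.

```python
import numpy as np

def underloaded_fluid(lam, G_bar, t_grid):
    """B(t) = int_0^t lam(t - x) * G_bar(x) dx via the trapezoid rule.

    lam and G_bar must accept NumPy arrays; t_grid starts at 0 and increases.
    """
    B = np.zeros(len(t_grid))
    for i, t in enumerate(t_grid):
        x = t_grid[: i + 1]                  # elapsed service times in [0, t]
        y = lam(t - x) * G_bar(x)            # fluid density b(t, x)
        B[i] = np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0
    return B

t_grid = np.linspace(0.0, 5.0, 501)
lam = lambda u: 1.0 + 0.6 * np.sin(u)        # arrival rate used in the talk's example
G_bar = lambda x: np.exp(-x)                 # assumption: exponential service, mean 1
B = underloaded_fluid(lam, G_bar, t_grid)

# The underloaded interval ends when B(t) first reaches the staffing level (here 1).
above = np.nonzero(B >= 1.0)[0]
print("first grid time with B(t) >= 1:", t_grid[above[0]] if len(above) else None)
```

Past that crossing time the underloaded formula no longer applies, and the queue density and waiting-time ODE take over.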
Algorithm for the $(G_t/GI/s_t+GI)^m/M_t$ Fluid Network
A Non-Markovian Example
$M_t/H_2/s_t+E_2$ fluid model
$\lambda(t) = 1 + 0.6\sin(t)$
$S = 1$ (note: not a single-server queue)
$H_2$ service: $p = 0.11$, $\mu_1 = 0.23$, $\mu_2 = 1.77$ ($c_s^2 = 4$)
$E_2$ abandonment: $A = X_1 + X_2$, where the $X_i$ are i.i.d. $\exp(1)$
System initially empty
$\lambda(t)$ and $S$ will be scaled by $n$!
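The hyperexponential parameters above can be sanity-checked with the standard moment formulas for a two-phase mixture of exponentials. The helper name below is hypothetical; the inputs are the rounded values from the slide.

```python
def h2_moments(p, mu1, mu2):
    """Mean and squared coefficient of variation of an H2 distribution.

    With probability p the time is exp(mu1), otherwise exp(mu2).
    """
    mean = p / mu1 + (1.0 - p) / mu2
    second_moment = 2.0 * p / mu1**2 + 2.0 * (1.0 - p) / mu2**2
    scv = (second_moment - mean**2) / mean**2
    return mean, scv

mean, scv = h2_moments(0.11, 0.23, 1.77)
print(mean, scv)
```

With the rounded parameters the mean comes out close to 1 and the squared coefficient of variation close to 4, which matches the stated $c_s^2 = 4$ up to rounding.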
Fluid Algorithm: Alternating between Overloaded (OL) and Underloaded (UL) Intervals
Simulation Comparisons
$M_t/H_2/s_t+E_2$ queueing model with $n = 20, 100, 2000$
$\lambda_n(t) = n\lambda(t) = n + 0.6\,n\sin(t)$
$S_n(t) = nS(t) = n$
Want to see:
When $n$ is large: $\big(Q_n(t)/n,\ B_n(t)/n,\ X_n(t)/n,\ W_n(t)\big) \approx (Q(t), B(t), X(t), w(t))$
When $n$ is small: $\big(E[Q_n(t)]/n,\ E[B_n(t)]/n,\ E[X_n(t)]/n,\ E[W_n(t)]\big) \approx (Q(t), B(t), X(t), w(t))$
Simulation Comparisons: $M_t/H_2/s_t+E_2$, $n = 100$ and 3 sample paths
Simulation Comparisons: $M_t/H_2/s_t+E_2$, $n = 2000$ and a single sample path
Simulation Comparisons: $M_t/H_2/s_t+E_2$, $n = 100$ and an average of 100 sample paths
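To give a feel for how such sample paths can be generated, here is a minimal sketch that simulates the total number in system for a simplified Markovian stand-in, $M_t/M/s_t+M$, rather than the $H_2$ service and $E_2$ abandonment used in the talk. It uses thinning: candidate events occur at a rate that bounds the true rate, and arrival candidates are accepted with probability $\lambda_n(t)/\lambda_{\max}$. The function name and the parameter choices $\mu = 1$, $\theta = 0.5$ are assumptions for illustration.

```python
import math
import random

def simulate_markovian_queue(n, T, mu=1.0, theta=0.5, seed=0):
    """Simulate X_n(t) for lambda_n(t) = n(1 + 0.6 sin t) and s_n(t) = n."""
    rng = random.Random(seed)
    lam = lambda t: n * (1.0 + 0.6 * math.sin(t))
    lam_max = 1.6 * n                      # upper bound on the arrival rate
    servers = n
    t, x = 0.0, 0                          # system initially empty
    times, states = [0.0], [0]
    while True:
        # Departure rate: service completions plus abandonments from queue.
        dep = mu * min(x, servers) + theta * max(x - servers, 0)
        total = lam_max + dep
        t += rng.expovariate(total)
        if t > T:
            break
        u = rng.random() * total
        if u < lam_max:
            if u < lam(t):                 # accept the candidate arrival
                x += 1
        else:
            x -= 1                         # service completion or abandonment
        times.append(t)
        states.append(x)
    return times, states

times, states = simulate_markovian_queue(n=100, T=10.0)
print(len(times), states[-1] / 100)
```

Dividing the recorded states by $n$ gives a scaled path that can be plotted against a fluid function, and averaging many seeds gives the small-$n$ comparison of means.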
Example: A Network of Two Queues
$(M_t/LN/s_t+E_2)^2/M_t$ fluid network
Sinusoidal arrivals: $\lambda_1^{(0)}(t) = n(0.5 + 0.25\sin(t))$, $\lambda_2^{(0)}(t) = n(0.5 + 0.35\sin(t-3))$
Constant staffing: $s_1(t) = n$, $s_2(t) = 2n$
Lognormal service: $\bar G_1 \sim LN(1, 2)$, $\bar G_2 \sim LN(2, 8)$
Erlang abandonment: $\bar F_1 \sim E_2(0.5)$, $\bar F_2 \sim E_2(0.3)$
Routing probability: $P(t) \equiv \begin{bmatrix} 0.3 & 0.2 \\ 0.2 & 0.3 \end{bmatrix}$
System initially empty
Example: $(M_t/LN/s_t+E_2)^2/M_t$. Simulation comparison: $n = 2000$, one path
Example: $(M_t/LN/s_t+E_2)^2/M_t$. Simulation comparison: $n = 50$, 2000 paths
Example: $(M_t/LN/s_t+E_2)^2/M_t$. Algorithm convergence. $\epsilon$: error tolerance ($10^{-9}$ to 0.1); $I(\epsilon)$: number of iterations; $T(\epsilon)$: computation time (seconds) in MATLAB
Example: An $(M_t/M/s_t+M)^{10}/M_t$ Network
For $1 \le i, j \le 10$:
Sinusoidal arrival: $\lambda_i^{(0)}(t) = a_i + b_i\sin(c_i t + \phi_i)$, $a_i = 0.5$, $b_i = \frac{i}{10}a_i$, $\phi_i = \pi\left(1.5 - \frac{i}{10}\right)$
Constant staffing: $s_i(t) = 1$
Exponential service: $\bar G_i(x) = e^{-\mu_i x}$, $\mu_i = 1$
Exponential abandonment: $\bar F_i(x) = e^{-\theta_i x}$, $\theta_i = 0.5$
Routing probability: $P_{i,j}(t) \equiv \frac{1}{20}$
System initially empty
Example: Fluid Paths of $(M_t/M/s_t+M)^{10}/M_t$
Large Scale Network: Running Time Complexity
MSHT Diffusion Limit (FCLT)
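Characterizing the Diffusion Process
Partition $\{1,\dots,m\} = O \cup U$, with $O \equiv \{1, 2,\dots,m'\}$ and $U \equiv \{m'+1, m'+2,\dots,m\}$.
For OL queues $1,\dots,m'$: a stochastic differential equation (SDE)
$d\hat W(t) = H(t)\hat W(t)\,dt + J_s(t)\,dB_s(t) + J_a(t)\,dB_a(t) + J_\lambda(t)\,dB_\lambda(t)$
or, combining the noise terms, $d\hat W(t) = H(t)\hat W(t)\,dt + J(t)\,dB(t)$
$B_\lambda$: $m$-dim BM (arrival process)
$B_a$: $m'$-dim BM (abandonment times)
$B_s$: $(m+1)^2$-dim BM (service times)
$H$, $J_s$, $J_a$, $J_\lambda$ and $J$: analytic matrix functions of $\lambda$, $s$, $F$, $\mu$, $c_\lambda^2$, $P$ and the fluid functions
$\mathrm{Var}(\hat W(t)) = \int_0^t \left[\hat J_s(u)\hat J_s^T(u) + \hat J_a(u)\hat J_a^T(u) + \hat J_\lambda(u)\hat J_\lambda^T(u)\right] du$
$\hat Q(t)$: Brownian integrals w.r.t. $B_\lambda$, $B_a$ and $B_s$
For UL queues $m'+1,\dots,m$ (with the corresponding UL coefficient matrices):
$d\hat B(t) = H(t)\hat B(t)\,dt + J_s(t)\,dB_s(t) + J_\lambda(t)\,dB_\lambda(t)$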
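Special Case: One-Queue Model
$d\hat W(t) = H(t)\hat W(t)\,dt + J_s(t)\,dB_s(t) + J_a(t)\,dB_a(t) + J_\lambda(t)\,dB_\lambda(t)$
or, combining the noise terms, $d\hat W(t) = H(t)\hat W(t)\,dt + J(t)\,dB(t)$
$\sigma^2_{\hat W}(t) \equiv \mathrm{Var}(\hat W(t)) = \int_0^t \left(\hat J_s^2(t,u) + \hat J_a^2(t,u) + \hat J_\lambda^2(t,u)\right) du$
$H(t) = -(1 - w'(t))\left(\frac{\lambda'(t-w(t))}{\lambda(t-w(t))} + h_F(w(t))\right)$
$J_s(t) = \frac{\sqrt{b(t,0) - s'(t)}}{\lambda(t-w(t))\,\bar F(w(t))}$
$J_a(t) = \frac{\sqrt{F(w(t))\,b(t,0)}}{\lambda(t-w(t))\,\bar F(w(t))}$
$J_\lambda(t) = c_\lambda\,\frac{\sqrt{\bar F(w(t))\,b(t,0)}}{\lambda(t-w(t))\,\bar F(w(t))}$
$J(t) = \frac{\sqrt{b(t,0) - s'(t) + \left(F(w(t)) + c_\lambda^2\,\bar F(w(t))\right) b(t,0)}}{\lambda(t-w(t))\,\bar F(w(t))}$
Want to see: $\mathrm{Var}(\hat W_n(t)) \approx \sigma^2_{\hat W}(t)$, $\mathrm{Var}(\hat V_n(t)) \approx \sigma^2_{\hat V}(t)$, $\mathrm{Var}(\hat Q_n(t)) \approx \sigma^2_{\hat Q}(t)$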
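For a scalar linear SDE of the form $d\hat W(t) = H(t)\hat W(t)\,dt + J(t)\,dB(t)$ with $\hat W(0) = 0$, the variance $v(t) = \mathrm{Var}(\hat W(t))$ satisfies the ODE $v'(t) = 2H(t)v(t) + J(t)^2$, which is a standard fact for linear SDEs. The sketch below integrates that ODE with a plain Euler scheme. The coefficient functions passed in are hypothetical constants, not the $H$ and $J$ of the model, which would have to come from the fluid functions.

```python
def sde_variance(H, J, T, steps=10000):
    """Euler integration of v'(t) = 2 H(t) v(t) + J(t)^2 with v(0) = 0."""
    dt = T / steps
    v = 0.0
    for k in range(steps):
        t = k * dt
        v += (2.0 * H(t) * v + J(t) ** 2) * dt
    return v

# Hypothetical constant coefficients: mean-reverting drift and unit noise.
v_end = sde_variance(lambda t: -1.0, lambda t: 1.0, T=10.0)
print(v_end)
```

With constant $H < 0$ the variance settles near $J^2/(2|H|)$, which gives a quick check on any implementation before plugging in time-varying coefficients.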
Example: $M_t/M/s_t+H_2$ in Both UL and OL Intervals. $\lambda(t) = n(1 + 0.6\sin(t))$, $s(t) = n$, $\mu = 1$, $\theta = 0.5$; $n = 2000$ and 500 sample paths
Engineering Refinement for Smaller $n$. $\lambda(t) = n(1 + 0.6\sin(t))$, $s(t) = n$, $\mu = 1$, $\theta = 0.5$; $n = 100$ and 2000 sample paths
Engineering Refinement for Smaller $n$. $\lambda(t) = n(1 + 0.6\sin(t))$, $s(t) = n$, $\mu = 1$, $\theta = 0.5$; $n = 20$ and 5000 sample paths
Prescribed Routing Paths
Multiclass Queueing Network: $(G_t/GI+GI)^n/s_t^m/PRP$
$n$ customer classes and $m$ service pools
Class $i$ ($1 \le i \le n$) is characterized by:
a deterministic path $\pi_i$, where $\pi_i(k)$ denotes the destination at step $k$ (the $PRP$)
an external arrival rate $\lambda_i$ (the $G_t$)
non-exponential service distributions $G_{i,k}$ (the first $GI$)
non-exponential abandonment distributions $F_{i,k}$ (the $+GI$)
Time-varying staffing level $s_j(t)$ at queue $j$, $1 \le j \le m$ (the $s_t$)
First-Come First-Served (FCFS) at each queue $j$
Unlimited waiting capacity
Multiclass Queueing Network: a two-class, two-queue example. Class 1: $\pi_1 = [1, 2, 1]$. Class 2: $\pi_2 = [2]$.
MSHT Fluid Limits
Algorithm for the $(G_t/GI+GI)^n/s_t^m/PRP$ Fluid Network
Example: An $(M_t/LN+E_2)^2/s_t^2/PRP$ Network. Simulation comparison
Achieving Time-Stable Performance Shi et al. (2014) Avoid large delay spikes Smooth performance level across time System becomes stationary
Staffing to Achieve Time-Stable Performance
Design staffing and shifts
Meet service level agreements:
$P(\text{waiting} < 2 \text{ min}) > 0.8$
$E[\text{wait}] \le 3 \text{ min} = 0.05 \text{ hr}$
$P(\text{abandonment}) < 0.02$
Mean Delay
At each station $j$, design the staffing $s_j(t)$ to stabilize the delay at $w_j$
Approximate other performance functions (e.g., queue length)
Treat both high QoS (small $w_j$) and low QoS (big $w_j$)
Class-dependent service levels:
calls: reply within minutes (small $w_j$)
emails: reply within hours (big $w_j$)
Delayed Infinite-Server (DIS) Approx. for $M_t/GI/s_t+GI$
DIS staffing: mean number of busy servers
$s(t) \equiv E[B(t)] = \bar F(w)\int_0^{(t-w)^+} \lambda(t-w-x)\,\bar G(x)\,dx$
Approximations for other performance measures:
$E[Q(t)] = \int_0^{t \wedge w} \lambda(t-x)\,\bar F(x)\,dx$
$P(A < W(t)) = \alpha \equiv F(w)$
Asymptotic stability (as the scale becomes large):
$\sup_{0<t\le T} \left|E[W(t)] - w\right| \to 0$, $\sup_{0<t\le T} \left|P(A < W(t)) - \alpha\right| \to 0$
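The DIS staffing formula is a one-dimensional integral, so it can be evaluated directly by numerical quadrature. The sketch below does this with the trapezoid rule. The function name is hypothetical, and the example assumes exponential service ($\mu = 1$) and exponential abandonment ($\theta = 0.5$) with $\lambda(t) = 100(1 + 0.6\sin t)$ and target delay $w = 0.1$, chosen only for illustration.

```python
import numpy as np

def dis_staffing(lam, G_bar, F_bar, w, t, dx=1e-3):
    """s(t) = F_bar(w) * int_0^{(t-w)^+} lam(t - w - x) * G_bar(x) dx."""
    upper = max(t - w, 0.0)
    if upper == 0.0:
        return 0.0                           # nobody has waited w yet
    m = max(int(round(upper / dx)), 1)
    x = np.linspace(0.0, upper, m + 1)       # elapsed service times
    y = lam(t - w - x) * G_bar(x)
    integral = np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0
    return float(F_bar(w) * integral)

lam = lambda u: 100.0 * (1.0 + 0.6 * np.sin(u))
G_bar = lambda x: np.exp(-1.0 * x)           # assumption: exponential service
F_bar = lambda x: np.exp(-0.5 * x)           # assumption: exponential patience
for t in (1.0, 3.0, 5.0):
    print(t, dis_staffing(lam, G_bar, F_bar, w=0.1, t=t))
```

In practice the real-valued $s(t)$ would still need rounding to an integer number of servers, which this sketch does not address.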
DIS Approx. for $(G_t/GI+GI)^n/s_t^m/PRP$
DIS staffing formulas
Construct a DIS approximation for each class $i$ and step $k$, with $j = \pi_i(k)$:
$m_{i,k}(t) \equiv \bar F_{i,k}(w_j)\int_0^{(t-w_j)^+} \lambda_{i,k}(t-w_j-x)\,\bar G_{i,k}(x)\,dx$
Staffing at queue $j$: the sum of the required staffing over all customer types that visit queue $j$ ($N$ classes, $m_i$ steps for class $i$):
$s_j(t) = \sum_{i=1}^{N}\sum_{k=1}^{m_i} m_{i,k}(t)\,1_{\{\pi_i(k)=j\}}$
Approximations for other performance measures:
$E[Q_{i,k}(t)] \approx \int_0^{t \wedge w_j} \lambda_{i,k}(t-x)\,\bar F_{i,k}(x)\,dx$
$E[Q_j(t)] \approx \sum_{i=1}^{N}\sum_{k=1}^{m_i} E[Q_{i,k}(t)]\,1_{\{\pi_i(k)=j\}}$
$\alpha_{i,k}(t) \approx \sum_{j=1}^{n} F_{i,k}(w_j)\,1_{\{\pi_i(k)=j\}}$
DIS Approx. for $(G_t/GI+GI)^n/s_t^m/PRP$
Asymptotic stability (as the scale becomes large):
$\sup_{0<t\le T} \left|E[W_j(t)] - w_j\right| \to 0$, $\sup_{0<t\le T} \left|P^{ab}_{i,k}(t) - \alpha_j\right| \to 0$
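The per-pool staffing is just a sum of per-(class, step) requirements over the steps that visit that pool. The sketch below shows that aggregation at one fixed time, using the two-class paths from the talk's example. The function name and the numerical requirements are hypothetical placeholders; computing the actual per-step requirements would need the per-step arrival rates, which are not derived here.

```python
def pool_staffing(paths, requirement):
    """Sum per-(class, step) requirements over the steps visiting each pool.

    paths: {class i: [pool visited at step 1, pool visited at step 2, ...]}
    requirement: {(class i, step k): required staffing at one fixed time t}
    """
    staffing = {}
    for i, path in paths.items():
        for k, j in enumerate(path, start=1):      # step k of class i visits pool j
            staffing[j] = staffing.get(j, 0.0) + requirement[(i, k)]
    return staffing

paths = {1: [1, 2, 1], 2: [2]}                      # class 1: 1 -> 2 -> 1, class 2: 2
requirement = {(1, 1): 40.0, (1, 2): 90.0, (1, 3): 55.0, (2, 1): 210.0}  # hypothetical
print(pool_staffing(paths, requirement))
```

Pool 1 collects steps 1 and 3 of class 1, and pool 2 collects step 2 of class 1 together with the single step of class 2.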
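A Test Example: $(M_t/GI+GI)^2/s_t^2/PRP$
Class 1: arrival rate $\lambda_1^{(0)}(t) = 100 + 40\sin t$; path $\pi_1 = [1, 2, 1]$, $N_1 = 3$
  Stage 1: service $LN(0.5, 0.75)$, $c^s_{1,1} = 1.5$; abandonment $H_2(0.15, 0.3)$, $c^a_{1,1} = 2$
  Stage 2: service $LN(1.5, 3)$, $c^s_{1,2} = 2$; abandonment $H_2(1.15, 2.5)$, $c^a_{1,2} = 2.17$
  Stage 3: service $LN(1, 3)$, $c^s_{1,3} = 3$; abandonment $H_2(1, 3)$, $c^a_{1,3} = 3$
Class 2: arrival rate $\lambda_2^{(0)}(t) = 150 + 70\cos(t + 0.5)$; path $\pi_2 = [2]$, $N_2 = 1$
  Stage 1: service $LN(2, 5)$, $c^s_{2,1} = 2.5$; abandonment $H_2(0.25, 0.6)$, $c^a_{2,1} = 2.4$
$LN(\mu, \sigma^2)$: $g(x) = \frac{1}{x\sqrt{2\pi}\,\sigma}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$
$H_2(\mu, \sigma^2)$: $f(x) = p\lambda_1 e^{-\lambda_1 x} + (1-p)\lambda_2 e^{-\lambda_2 x}$, where $p = \frac{\mu^2 + \sigma^2 + \sqrt{\sigma^4 - \mu^4}}{2(\sigma^2 + \mu^2)}$, $\lambda_1 = 2p/\mu$, $\lambda_2 = 2(1-p)/\mu$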
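The $H_2(\mu, \sigma^2)$ parameterization above (balanced means) can be checked numerically: plugging $p$, $\lambda_1$, $\lambda_2$ back into the mixture moments should return the mean $\mu$ and the variance $\sigma^2$. The helper names below are hypothetical, and the check assumes $\sigma^2 \ge \mu^2$, which is needed for the square root to be real.

```python
import math

def h2_balanced(mu, sigma2):
    """Return (p, lam1, lam2) for an H2 with mean mu and variance sigma2."""
    if sigma2 < mu**2:
        raise ValueError("H2 requires variance >= mean^2")
    p = (mu**2 + sigma2 + math.sqrt(sigma2**2 - mu**4)) / (2.0 * (sigma2 + mu**2))
    return p, 2.0 * p / mu, 2.0 * (1.0 - p) / mu

def h2_mean_var(p, lam1, lam2):
    """Mean and variance of the mixture p*exp(lam1) + (1-p)*exp(lam2)."""
    mean = p / lam1 + (1.0 - p) / lam2
    second = 2.0 * p / lam1**2 + 2.0 * (1.0 - p) / lam2**2
    return mean, second - mean**2

# Abandonment distributions from the test example.
for mu, sigma2 in [(0.15, 0.3), (1.15, 2.5), (1.0, 3.0), (0.25, 0.6)]:
    print((mu, sigma2), h2_mean_var(*h2_balanced(mu, sigma2)))
```

Each printed pair should reproduce its input up to floating-point error, since each phase contributes exactly half of the mean under this parameterization.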
A Test Example: $(M_t/GI+GI)^2/s_t^2/PRP$. Arrival rates
A Test Example: $(M_t/GI+GI)^2/s_t^2/PRP$. Low QoS: $w_1 = 0.1$, $w_2 = 0.3$
A Test Example: $(M_t/GI+GI)^2/s_t^2/PRP$. High QoS: $w_1 = 0.005$, $w_2 = 0.02$
A Test Example: $(M_t/GI+GI)^2/s_t^2/PRP$. Mixed: $w_1 = 0.005$ (high QoS), $w_2 = 0.1$ (low QoS)
Summary
Markovian routing: $(G_t/GI/s_t+GI)^m/M_t$ network
Deterministic paths: $(G_t/GI+GI)^n/s_t^m/PRP$ network
Fluid (FWLLN) and diffusion (FCLT) approximations
Staffing to achieve time-stable performance
Simulation comparisons to verify effectiveness
Future work
Fit into realistic applications (e.g., health care)
Incorporate both uncertainty and history
References
[1] He & Liu, the Tail Probability of Delay in Service Systems with Time-Varying Demand. Submitted to Operations Research (2014)
[2] Aras, Liu & Whitt, Heavy-Traffic Limit for the Initial Content Process. Submitted to Mathematics of Operations Research (2014)
[3] Liu & Whitt, Stabilizing Performance in Many-Server Queues with Time-Varying Arrivals and Customer Feedback. Submitted to Operations Research (2014)
[4] Liu & Whitt, Many-Server Heavy-Traffic Limits for Queues with Time-Varying Parameters. Annals of Applied Probability 24(1), 378-421 (2014)
[5] Liu & Whitt, Algorithms for Time-Varying Networks of Many-Server Fluid Queues. INFORMS Journal on Computing 26(1), 59-73 (2014)
[6] Liu & Whitt, The $G_t/GI/s_t+GI$ Many-Server Fluid Queue. Queueing Systems 71(4), 405-444 (2012)
[7] Liu & Whitt, Stabilizing Customer Abandonment in Many-Server Queues with Time-Varying Arrivals. Operations Research 60(6), 1551-1564 (2012)
[8] Liu & Whitt, A Network of Time-Varying Many-Server Fluid Queues with Customer Abandonment. Operations Research 59(4), 835-846 (2011)
Available at http://yunanliu.wordpress.ncsu.edu
THANK YOU!
The $G_t/GI/s_t+GI$ Queueing Model: Constructing the $G_t$ Arrivals
Deterministic variability: time-varying $\lambda(t)$ (average behavior)
Stochastic variability: constant $c_\lambda^2$ (magnitude of variance)
Composition method
Step 1: generate a rate-1 (equilibrium) renewal process $N_0(t)$ whose interrenewal times $X$ have variance $\mathrm{Var}(X) = c_\lambda^2$
Step 2: set $N(t) \equiv N_0(\Lambda(t))$, with $\Lambda(t) \equiv \int_0^t \lambda(u)\,du$
Properties:
Mean value: $E[N(t)] = \Lambda(t)$
Variance-to-mean ratio: $\mathrm{Var}(N(t))/E[N(t)] = \mathrm{Var}(N(t))/\Lambda(t) \approx c_\lambda^2$ (Liu et al. 2014)
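The two composition steps can be sketched in a few lines. The version below is a simplification under stated assumptions: it uses gamma interrenewal times with mean 1 and variance $c_\lambda^2$, it starts an ordinary renewal process at time 0 instead of an equilibrium one, and it inverts $\Lambda$ by bisection. The function name and the example $\Lambda(t) = 100\,(t + 0.6(1 - \cos t))$, which corresponds to $\lambda(t) = 100(1 + 0.6\sin t)$, are illustrative choices.

```python
import math
import numpy as np

def composition_arrivals(cum_rate, T, scv, rng):
    """Arrival times of N(t) = N_0(Lambda(t)) on [0, T].

    cum_rate: Lambda, nondecreasing with Lambda(0) = 0.
    scv: variance of the unit-mean interrenewal times (c_lambda^2).
    """
    total = cum_rate(T)
    shape, scale = 1.0 / scv, scv            # gamma with mean 1, variance scv
    # Step 1: renewal epochs of the rate-1 process on [0, Lambda(T)].
    epochs = []
    s = rng.gamma(shape, scale)
    while s <= total:
        epochs.append(s)
        s += rng.gamma(shape, scale)
    # Step 2: map each epoch e back to the time t with Lambda(t) = e.
    arrivals = []
    for e in epochs:
        lo, hi = 0.0, T
        for _ in range(60):                   # bisection on the monotone Lambda
            mid = 0.5 * (lo + hi)
            if cum_rate(mid) < e:
                lo = mid
            else:
                hi = mid
        arrivals.append(hi)
    return np.array(arrivals)

Lambda = lambda t: 100.0 * (t + 0.6 * (1.0 - math.cos(t)))
arrivals = composition_arrivals(Lambda, T=20.0, scv=2.0, rng=np.random.default_rng(0))
print(len(arrivals), Lambda(20.0))
```

The arrival count should be close to $\Lambda(T)$ on average, with fluctuations that grow with the chosen variance parameter.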