Lecture 9: Deterministic Fluid Models and Many-Server Heavy-Traffic Limits. IEOR 4615: Service Engineering Professor Whitt February 19, 2015

Similar documents
Dynamic Control of Parallel-Server Systems

The Performance Impact of Delay Announcements

Erlang-C = M/M/N. agents. queue ACD. arrivals. busy ACD. queue. abandonment BACK FRONT. lost calls. arrivals. lost calls

What You Should Know About Queueing Models To Set Staffing Requirements in Service Systems

Making Delay Announcements

Designing a Telephone Call Center with Impatient Customers

BRAVO for QED Queues

ec1 e-companion to Liu and Whitt: Stabilizing Performance

Stabilizing Customer Abandonment in Many-Server Queues with Time-Varying Arrivals

Stabilizing Customer Abandonment in Many-Server Queues with Time-Varying Arrivals

Economy of Scale in Multiserver Service Systems: A Retrospective. Ward Whitt. IEOR Department. Columbia University

Delay Announcements. Predicting Queueing Delays for Delay Announcements. IEOR 4615, Service Engineering, Professor Whitt. Lecture 21, April 21, 2015

Online Supplement to Delay-Based Service Differentiation with Many Servers and Time-Varying Arrival Rates

A Diffusion Approximation for Stationary Distribution of Many-Server Queueing System In Halfin-Whitt Regime

A Fluid Approximation for Service Systems Responding to Unexpected Overloads

SENSITIVITY OF PERFORMANCE IN THE ERLANG-A QUEUEING MODEL TO CHANGES IN THE MODEL PARAMETERS

MODELING WEBCHAT SERVICE CENTER WITH MANY LPS SERVERS

The G t /GI/s t + GI Many-Server Fluid Queue: Longer Online Version with Appendix

REAL-TIME DELAY ESTIMATION BASED ON DELAY HISTORY IN MANY-SERVER SERVICE SYSTEMS WITH TIME-VARYING ARRIVALS

The Offered-Load Process: Modeling, Inference and Applications. Research Thesis

Many-Server Loss Models with Non-Poisson Time-Varying Arrivals

Critically Loaded Time-Varying Multi-Server Queues: Computational Challenges and Approximations

Class 11 Non-Parametric Models of a Service System; GI/GI/1, GI/GI/n: Exact & Approximate Analysis.

A DIFFUSION APPROXIMATION FOR THE G/GI/n/m QUEUE

DELAY PREDICTORS FOR CUSTOMER SERVICE SYSTEMS WITH TIME-VARYING PARAMETERS. Rouba Ibrahim Ward Whitt

REAL-TIME DELAY ESTIMATION BASED ON DELAY HISTORY IN MANY-SERVER SERVICE SYSTEMS WITH TIME-VARYING ARRIVALS

Q = (c) Assuming that Ricoh has been working continuously for 7 days, what is the probability that it will remain working at least 8 more days?

Asymptotic Coupling of an SPDE, with Applications to Many-Server Queues

Since D has an exponential distribution, E[D] = 0.09 years. Since {A(t) : t 0} is a Poisson process with rate λ = 10, 000, A(0.

USING DIFFERENT RESPONSE-TIME REQUIREMENTS TO SMOOTH TIME-VARYING DEMAND FOR SERVICE

A Fluid Approximation for Service Systems Responding to Unexpected Overloads

A Diffusion Approximation for the G/GI/n/m Queue

Advanced Computer Networks Lecture 3. Models of Queuing

The Offered-Load Process: Modeling, Inference and Applications. Research Thesis

ESTIMATING WAITING TIMES WITH THE TIME-VARYING LITTLE S LAW

The G t /GI/s t +GI many-server fluid queue. Yunan Liu & Ward Whitt. Queueing Systems Theory and Applications. ISSN Volume 71 Number 4

STAFFING A CALL CENTER WITH UNCERTAIN ARRIVAL RATE AND ABSENTEEISM

The M/M/n+G Queue: Summary of Performance Measures

Performance Evaluation of Queuing Systems

SERVER STAFFING TO MEET TIME-VARYING DEMAND

Stabilizing Performance in a Service System with Time-Varying Arrivals and Customer Feedback

Routing and Staffing in Large-Scale Service Systems: The Case of Homogeneous Impatient Customers and Heterogeneous Servers

Fluid Limit of A Many-Server Queueing Network with Abandonment

A How Hard are Steady-State Queueing Simulations?

Modeling Service Networks with Time-Varying Demand

Electronic Companion Fluid Models for Overloaded Multi-Class Many-Server Queueing Systems with FCFS Routing

Motivated by models of tenant assignment in public housing, we study approximating deterministic fluid

Design and evaluation of overloaded service systems with skill based routing, under FCFS policies

ENGINEERING SOLUTION OF A BASIC CALL-CENTER MODEL

THE HEAVY-TRAFFIC BOTTLENECK PHENOMENON IN OPEN QUEUEING NETWORKS. S. Suresh and W. Whitt AT&T Bell Laboratories Murray Hill, New Jersey 07974

MANY-SERVER HEAVY-TRAFFIC LIMIT FOR QUEUES WITH TIME-VARYING PARAMETERS 1

Markovian N-Server Queues (Birth & Death Models)

Introduction to queuing theory

Classical Queueing Models.

On the Resource/Performance Tradeoff in Large Scale Queueing Systems

New Perspectives on the Erlang-A Queue

IEOR 6711, HMWK 5, Professor Sigman

A multi-class fluid model for a contact center with skill-based routing

Overflow Networks: Approximations and Implications to Call-Center Outsourcing

Continuous Time Processes

Service Engineering January Laws of Congestion

Wait-Time Predictors for Customer Service Systems with Time-Varying Demand and Capacity

Queueing Theory I Summary! Little s Law! Queueing System Notation! Stationary Analysis of Elementary Queueing Systems " M/M/1 " M/M/m " M/M/1/K "

The Fluid View, or Flow Models

Routing and Staffing in Large-Scale Service Systems: The Case of Homogeneous Impatient Customers and Heterogeneous Servers 1

STABILIZING PERFORMANCE IN NETWORKS OF QUEUES WITH TIME-VARYING ARRIVAL RATES

Introduction to Markov Chains, Queuing Theory, and Network Performance

arxiv: v1 [math.pr] 5 Aug 2013

Author's personal copy

Tales of Time Scales. Ward Whitt AT&T Labs Research Florham Park, NJ

Stein s Method for Steady-State Approximations: Error Bounds and Engineering Solutions

Load Balancing in Distributed Service System: A Survey

Managing Call Centers with Many Strategic Agents

On Priority Queues with Impatient Customers: Stationary and Time-Varying Analysis

(b) What is the variance of the time until the second customer arrives, starting empty, assuming that we measure time in minutes?

Time-Varying Queues. Ward Whitt. Industrial Engineering and Operations Research Columbia University, New York, NY, 10027;

Stochastic Networks and Parameter Uncertainty

Author's personal copy

On Priority Queues with Impatient Customers: Stationary and Time-Varying Analysis

Staffing many-server queues with impatient customers: constraint satisfaction in call centers

Wait-Time Predictors for Customer Service Systems With Time-Varying Demand and Capacity

A Note on the Event Horizon for a Processor Sharing Queue

Using a Birth-and-Death Process to Estimate the Steady-State Distribution of a Periodic Queue

We consider how two networked large-scale service systems that normally operate separately, such as call

Abandonment and Customers Patience in Tele-Queues. The Palm/Erlang-A Model.

Stochastic grey-box modeling of queueing systems: fitting birth-and-death processes to data

HEAVY-TRAFFIC LIMITS VIA AN AVERAGING PRINCIPLE FOR SERVICE SYSTEMS RESPONDING TO UNEXPECTED OVERLOADS OHAD PERRY

Queues with Many Servers and Impatient Customers

Queuing Networks: Burke s Theorem, Kleinrock s Approximation, and Jackson s Theorem. Wade Trappe

PBW 654 Applied Statistics - I Urban Operations Research

An Analysis of the Preemptive Repeat Queueing Discipline

Networking = Plumbing. Queueing Analysis: I. Last Lecture. Lecture Outline. Jeremiah Deng. 29 July 2013

Control of Many-Server Queueing Systems in Heavy Traffic. Gennady Shaikhet

IEOR 8100: Topics in OR: Asymptotic Methods in Queueing Theory. Fall 2009, Professor Whitt. Class Lecture Notes: Wednesday, September 9.

State Space Collapse in Many-Server Diffusion Limits of Parallel Server Systems. J. G. Dai. Tolga Tezcan

Stationary remaining service time conditional on queue length

1 Delayed Renewal Processes: Exploiting Laplace Transforms

1.225 Transportation Flow Systems Quiz (December 17, 2001; Duration: 3 hours)

Design heuristic for parallel many server systems under FCFS-ALIS Adan, I.J.B.F.; Boon, M.A.A.; Weiss, G.

Introduction to Queueing Theory with Applications to Air Transportation Systems

Transcription:

Lecture 9: Deterministic Fluid Models and Many-Server Heavy-Traffic Limits IEOR 4615: Service Engineering Professor Whitt February 19, 2015

Outline Deterministic Fluid Models Directly From Data: Cumulative Arrivals and Departures Directly from M t /M/s t +M BD Model (deterministic view) Many-Server Heavy-Traffic Limits for Queueing Models Fluid Model Obtained in the Limit as Scale Increases Ultimately, limits explained by LLN and CLT

From Data to Fluid Models To analyze data, we plot cumulative arrival and departure functions: For large systems (bird s eye view), the functions look smoother.

From Data to Fluid Models Directly from event-based (call-by-call) measurements. For example, an isolated service-station: A(t) = cumulative # arrivals from time 0 to time t; D(t) = cumulative # departures from system during [0, t]; L(t) = A(T) D(t) = # customers in system at t. Arrivals and Departures from a Bank Branch Face-to-Face Service

Deterministic Fluid Model Describe impact of Predictable Variability Time-Varying Arrival Rate Ignoring Stochastic Variability Idealistic Smooth Model Ordinary Differential Equation (ODE)

Hall, textbook: Phases of Congestion

Four points of view Cumulative Arrivals and Departures Rates ( Peak Load) Queues ( Congestion) Outflows ( end of rush-hour)

Phases of Congestion via Rates arrival rate α(t) (maximum possible service rate) Time lag in congestion after peak arrival rate Changing Departure Rate

Mathematical Fluid Models: General Setup Queueing System as a Tub (Hall, p.188) A(t) cumulative arrivals function. D(t) cumulative departures function. ( t) A ( t) arrival rate (dot = derivative d/dt) ( t) D ( t) processing (departure) rate. c(t) maximal potential processing rate. q(t) total amount in the system at time t.

Mathematical Fluid Model Differential equation: λ(t) arrival rate at time t [0,T]. c(t) maximal potential processing rate. δ(t) effective processing (departure) rate. q(t) total amount in the system at time t. Then q(t) is a solution of q ( t) ( t) ( t); q(0) q, t [0, T]. 0

Mathematical Fluid Model: Multi-Server Queue s(t) statistically-identical servers, each with service rate μ. c(t) = μs(t): maximal potential processing rate. ( t) min( s( t), q( t)) : processing rate. q ( t) ( t) min( s( t), q( t)); q(0) q0, t [0, T]. i.e., q( t) q(0) ( u) du min( s( u), q( u)) du. 0 0 How to actually solve? Discrete-time approximation: Start with t, q( t ). Then, for t t : 0 0 0 q0 t n 1 t q( tn) q( tn 1) ( tn 1) t min( s( tn 1), q( tn 1)) t. t n

Mathematical Fluid Model: Multi-Server Queue with Abandonment θ Abandonment rate of customers in queue Processing rate: ( t) min( s( t), q( t)) [ q( t) s( t)] The fluid model: q ( t) ( t) min( s( t), q( t)) [ q( t) s( t)] q(0) q0, t [0, T]. Deterministic View of M t /M/s t +M BD Model Parameters: λ(t), μ, s(t),θ ;

Many-Server Heavy-Traffic Limit Sequence of M t /M/s t +M Models Indexed by n Let n. Parameters: λ n (t) = nλ(t) arrival rate at time t [heavy-traffic] s n (t) = ns(t) number of servers at time t [large scale] μ n (t) = μ individual service rate (constant) θ n (t) = θ individual abandonment rate (constant) Stochastic Processes: A n (t) number of arrivals in system n in [0,t] D n (t) number of departures in system n in [0,t] Q n (t) number of customers in system n at time t W n (t) potential waiting for arrival at time t (with infinite patience)

Fluid Approximation from Many-Server Heavy-Traffic Limit

Fluid Approximation from Many-Server Heavy-Traffic Limit

Fluid Approximation from Many-Server Heavy-Traffic Limit

Many-Server Heavy-Traffic Limits Sequence of M t /M/s t +M Models Indexed by n Parameters: λ n (t) = nλ(t) arrival rate at time t [heavy-traffic] s n (t) = ns(t) number of servers at time t [large scale] μ n (t) = μ individual service rate (constant) θ n (t) = θ individual abandonment rate (constant) Stochastic Processes: A n (t) number of arrivals in system n in [0,t] D n (t) number of departures in system n in [0,t] Q n (t) number of customers in system n at time t W n (t) potential waiting for arrival (with infinite patience) Limits (Fluid Limit = Law of Large Numbers): As n, A n (t)/n Λ(t) = t λ(s) ds (integral of fluid arrival rate) D n (t)/n D(t) = t δ(s) ds (integral of fluid departure rate) Q n (t)/n q(t) fluid content at time t W n (t) w(t) potential waiting time for atom of fluid

Refined Many-Server Heavy-Traffic Limit Sequence of Queueing Models Indexed by n M t /M/s t +M Fluid Limits (Fluid Limit = Law of Large Numbers): As n, A n (t)/n Λ(t) = t λ(s) ds (integral of fluid arrival rate) D n (t)/n D(t) = t δ(s) ds (integral of fluid departure rate) Q n (t)/n q(t) fluid content at time t W n (t) w(t) waiting time Stochastic Limits (Gaussian Limit = Central limit Theorem): As n, n[(a n (t)/n) - Λ(t)] X A (t) n[(d n (t)/n) - D(t)] X D (t) Gaussian limit processes n[(q n (t)/n) - q(t)] X Q (t) n[w n (t)- w(t)] X W (t)

Three Many-Server Heavy-Traffic Limiting Regimes Sequence of Stationary M/M/s+M Models Parameters: λ n = nλ - c n arrival rate at time t [No time-varying parameters] s n = ns number of servers at time t [large scale] μ n = μ individual service rate (constant) θ n = θ individual abandonment rate (constant) Limiting Regimes λ>s μ overloaded or Efficiency-Driven (ED) λ<s μ underloaded or Quality-Driven (QD) λ=s μ critically loaded need to look more closely Expanding the Critically Loaded Regime: λ=sμ More general arrival rate scaling: Quality-and-Efficiency-Driven (QED) regime = Halfin-Whitt (1981) regime λ n = ns μ[1 (β/ n)] (λ = s μ and c = -s μ β) (1-ρ n ) n =β where ρ n = λ n /s n μ = λ n /nsμ

Delay Probability Approximation in the M/M/s/ Model in the QED Regime Quality-and-Efficiency-Driven (QED) regime = Halfin-Whitt (1981) regime λ n = ns μ[1 (β/ n)] (1-ρ n ) n =β where ρ n = λ n /s n μ = λ n /nsμ P(W n > 0) α with 0 < α < 1. (W n steady state wait before starting service in model n) P(W n > 0) α(β) = HW(β) = 1/[βΦ(β)/φ(β)] Where Φ(x) = P(N(0,1) < x) standard normal cdf and φ(β) is the associated density function

Implications for Staffing in the M/M/s/ Model P(W n > 0) Target = α = α (β) = HW(β) = 1/[βΦ(β)/φ(β)] Where Φ(x) = P(N(0,1) < x) standard normal cdf and φ(β) is the associated density function Use Square-Root Staffing Formula: Set S = S(λ/μ) = (λ/μ) + β (λ/μ) 1/2 For β = HW -1 (α) (inverse function) λ/μ = offered load (infinite-server model)

Implications for Staffing in the M/M/s+M Model P(W n > 0) Target = α = α(β, γ) = G(β, γ) = 1/[1 + γh(β/γ)h(-β)] Garnett function from Garnett et al. (2002) Where γ = (θ/μ) 1/2, h(x) = φ(x)/[1- Φ(x)], Φ(x) = P(N(0,1) < x) standard normal cdf and φ(x) is the associated density function. Use Square-Root Staffing Formula: Set For β = G -1 (Target, γ) S = S(λ/μ) = (λ/μ) + β (λ/μ) 1/2

References Fluid Models 1. Chapter 6, especially Section 6.4, of Hall (1991) Queueing Methods for Services and Manufacturing, Prentice Hall. 2. Chapters 1 and 2 of Newell (1982) Applications of Queueing Theory, second edition, Chapman and Hall. Many-Server heavy-traffic Limits 3. Halfin, S., W. Whitt. (1981) Heavy-traffic limits for queues with many exponential servers. Operations Research, 29, 567-588. 4. Whitt, W. (1992) Understanding the Efficiency of Multi-Server Service Systems. Management Science, 38, 708-723.

More Advanced References Fluid Models 1. Hampshire, R. C., W. A. Massey. (2010). A tutorial on dynamic optimization with application to dynamic rate queues. TutORials in Operations Research, presented at the INFORMS National Meeting in Austin Texas. (www.princeton.edu~wmassey) Many-Server Heavy-Traffic Limits 2. Garnett, O., A. Mandelbaum, M. I. Reiman. (2002) Designing a call center with impatient customers. Manufacturing Service Oper. Management 4, 208 227. (http://iew3.technion.ac.il/serveng) 3. Pang, G., R. Talreja, W. Whitt. (2007) Martingale proofs of manyserver heavy-traffic limits for Markovian queues. Probability Surveys, 4, 193-267.