Erlang-C = M/M/N: agents, arrivals, ACD queue; busy signals, abandonment, lost calls


Erlang-C = M/M/N
[Diagram: arrivals, ACD queue, agents]

Erlang-A <4CallCenters>
[Diagram: arrivals, ACD queue, agents (FRONT / BACK); lost calls due to busy signals and abandonment]



Rough Performance Analysis
Peak 10:00-10:30 a.m., with 100 agents:
  400 calls
  3:45 minutes average service time
  2 seconds ASA (Average Speed of Answer)
Offered load: R = λ × E(S) = 400 × 3:45 = 1500 min. / 30 min. = 50 Erlangs
Occupancy: ρ = R/N = 50/100 = 50%

Quality-Driven Operation (Light Traffic)
Classical Queueing Theory (M/G/N approximations)
Above: R = 50, N = R + 50, all served immediately.
Rule of Thumb: N = R + δ R, δ > 0 service-grade.


Quality-driven: 100 agents, 50% utilization
Can increase offered load - by how much?

Erlang-C, N = 100, E(S) = 3:45 min.

λ/hr    ρ       E(W_q) = ASA    % Wait = 0
800     50%     0               100%
1400    87.5%   0:02 min.       88%
1550    96.9%   0:48 min.       35%
1580    98.8%   2:34 min.       15%
1585    99.1%   3:34 min.       12%

Efficiency-driven Operation (Heavy Traffic)
$E[W_q \mid W_q > 0] = \frac{1}{N} \cdot \frac{E(S)}{1 - \rho_N} = E(S) = 3{:}45$, when $N (1 - \rho_N) = 1$, $\rho_N \to 1$.
Above: R = 99, N = R + 1, all delayed.
Rule of Thumb: N = R + γ, γ > 0 service grade.
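The Erlang-C rows above can be recomputed with a few lines of code. The following is a minimal sketch in Python, assuming the standard Erlang-B recursion as the route to P{Wait > 0}; the function names `erlang_c` and `asa_seconds` are illustrative and are not taken from the slides or from 4CallCenters.

```python
def erlang_c(n, r):
    """P{Wait > 0} in M/M/n with offered load r (Erlangs), assuming r < n."""
    b = 1.0  # Erlang-B blocking probability with 0 servers
    for k in range(1, n + 1):
        b = r * b / (k + r * b)  # Erlang-B recursion in the number of servers
    rho = r / n
    return b / (1.0 - rho * (1.0 - b))


def asa_seconds(n, calls_per_hour, service_seconds):
    """Mean wait over all calls: P{Wait > 0} * E(S) / (N - R)."""
    r = calls_per_hour * service_seconds / 3600.0
    return erlang_c(n, r) * service_seconds / (n - r)


# E(S) = 3:45 = 225 seconds, N = 100, as in the table above.
for lam in (800, 1400, 1550, 1580, 1585):
    r = lam * 225 / 3600.0
    print(lam, f"rho={r / 100:.1%}",
          f"ASA={asa_seconds(100, lam, 225):.0f}s",
          f"no-wait={1 - erlang_c(100, r):.0%}")
```

The recursion avoids the large factorials of the closed-form Erlang-C expression, which matters once N reaches the hundreds.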


Changing N (Staffing) in Erlang-C
E(S) = 3:45

λ/hr    N      OCC     ASA      % Wait = 0
1585    100    99.1%   3:34     12%
1599    100    99.9%   59:33    0%
1599    101    98.9%   3:06     13%
1599    102    98.0%   1:24     24%
1599    105    95.2%   0:23     50%

New Rationalized Operation
Efficiency-driven, in the sense that OCC > 95%;
Quality-driven, in the sense that 50% are answered immediately.
QED Regime = Quality- and Efficiency-Driven Regime
Economies of Scale in a Frictionless Environment
Above: R = 100, N = R + 5, 50% delayed.
Safety-Staffing: N = R + β√R, β > 0.

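QED Theorem (Halfin-Whitt, 1981)
Consider a sequence of M/M/N models, N = 1, 2, 3, ...
Then the following 3 points of view are equivalent:
Customer: $\lim_{N \to \infty} P_N\{\text{Wait} > 0\} = \alpha$, $0 < \alpha < 1$;
Server: $\lim_{N \to \infty} \sqrt{N}\,(1 - \rho_N) = \beta$, $0 < \beta < \infty$;
Manager: $N \approx R + \beta \sqrt{R}$, $R = \lambda \times E(S)$ large.
Here $\alpha = \left[ 1 + \beta\,\Phi(\beta) / \varphi(\beta) \right]^{-1}$, where $\varphi(\cdot)$ / $\Phi(\cdot)$ is the standard normal density/distribution.
Extremes:
Everyone waits: $\alpha = 1 \Leftrightarrow \beta = 0$ (Efficiency-driven)
No one waits: $\alpha = 0 \Leftrightarrow \beta = \infty$ (Quality-driven)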

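Safety-Staffing: Performance
R = λ × E(S): offered load (Erlangs)
N = R + β√R, β = service-grade > 0
  = R + Δ, Δ = β√R: safety-staffing
Expected Performance:
% Delayed: $P(\beta) = \left[ 1 + \beta\,\Phi(\beta) / \varphi(\beta) \right]^{-1}$, $\beta > 0$ (Erlang-C)
Congestion index: $E\left[ \frac{\text{Wait}}{E(S)} \,\middle|\, \text{Wait} > 0 \right] = \frac{1}{\Delta}$ (ASA)
$\%\left\{ \frac{\text{Wait}}{E(S)} > T \,\middle|\, \text{Wait} > 0 \right\} = e^{-T\Delta}$ (TSF)
Servers' Utilization: $\frac{R}{N} \approx 1 - \frac{\beta}{\sqrt{N}}$ (Occupancy)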
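The performance formulas on this slide are simple enough to evaluate directly. Below is a minimal sketch in Python, assuming the Halfin-Whitt delay function P(β) = [1 + βΦ(β)/φ(β)]^(-1) and the exponential conditional wait of M/M/N; the names `delay_probability` and `safety_staffing_performance` are hypothetical helpers, not part of the original material.

```python
import math


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def delay_probability(beta):
    """Halfin-Whitt delay function P(beta), for beta > 0."""
    return 1.0 / (1.0 + beta * normal_cdf(beta) / normal_pdf(beta))


def safety_staffing_performance(r, beta, service_seconds):
    """Approximate Erlang-C performance under N = R + beta * sqrt(R)."""
    delta = beta * math.sqrt(r)           # safety staffing
    n = r + delta
    p_delay = delay_probability(beta)
    return {
        "N": n,
        "p_delay": p_delay,
        "congestion_index": 1.0 / delta,  # E[Wait/E(S) | Wait > 0]
        "asa_seconds": p_delay * service_seconds / delta,
        "occupancy": r / n,
    }


# R = 100 Erlangs, beta = 0.5, E(S) = 3:45 = 225 seconds.
print(safety_staffing_performance(100.0, 0.5, 225.0))
```

With R = 100 and β = 0.5 this gives N = 105, which is the staffing level of the last row of the "Changing N" table, so the two can be compared.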

The Halfin-Whitt Delay Function P(β)
[Figure: Alpha (delay probability) plotted against Beta]

Rules-of-Thumb in an "Erlang-C World"
R = Offered Load (not small)

Efficiency-Driven: N = R + 2 (or 3, or ...). Expect that essentially all customers are delayed in queue, that average delay is about 1/2 (or 1/3, or ...) of the average service time, and that agents' utilization is extremely high (close to 100%).

Quality-Driven: N = R + (10%-20%) R. Expect essentially no delays of customers.

QED: N = R + 0.5√R. Expect that about half of the customers are not delayed in queue, that average delay is about one order of magnitude less than the average service time (seconds vs. minutes), and that agents' utilization is high (90-95%).

Can determine regime scientifically:
  Strategy: Retain performance levels under Pooling (4CC demo)
  Economics: Minimize agent salaries + congestion cost, or
  Satisfization: Least Number of Agents s.t. Constraints

Strategy: Sustain Regime under Pooling
Base: λ = 300/hr, AHT = 5 min, N = 30 agents
  OCC = 83.3%, ASA = 15 sec, %{Wait>0} = 25%
4 CC: λ = 1200, AHT = 5 min, N = ?
  Quality-Driven: maintain OCC at 83.3%. N = 120, ASA = .5 sec
  Efficiency-Driven: maintain ASA at 15 sec. N = 107, OCC = 93.5%
  QED: maintain %{Wait>0} at 25%. N = 100 + 1 × √100 = 110, OCC = 91%, ASA = 7 sec
9 CC: λ = 2700, AHT = 5, R = 225
  Q: N = 270
  E: N = 233
  QED: N = 225 + 1 × √225 = 240, OCC = 94%, ASA = 4.7 sec
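The Quality-Driven and QED staffing levels on this slide follow from plain arithmetic, sketched below in Python. The sketch assumes β = 1 (read off the base case, where 30 = 25 + 1 × √25) and uses hypothetical helper names; the Efficiency-Driven levels (107, 233) need a full Erlang-C search and are not reproduced here.

```python
import math


def offered_load(calls_per_hour, aht_minutes):
    """R = lambda * E(S), in Erlangs."""
    return calls_per_hour * aht_minutes / 60.0


def qed_staffing(r, beta):
    """Square-root safety staffing N = R + beta * sqrt(R), rounded up."""
    return math.ceil(r + beta * math.sqrt(r) - 1e-9)


def quality_staffing(r, target_occupancy):
    """Keep occupancy fixed: N = R / OCC, rounded up."""
    return math.ceil(r / target_occupancy - 1e-9)


base_occ = offered_load(300, 5) / 30  # 25 / 30 = 83.3%
for label, lam in (("Base", 300), ("4 CC", 1200), ("9 CC", 2700)):
    r = offered_load(lam, 5)
    n_q = quality_staffing(r, base_occ)
    n_qed = qed_staffing(r, 1.0)
    print(f"{label}: R={r:.0f}  Q: N={n_q}  QED: N={n_qed} (OCC={r / n_qed:.1%})")
```

The small `1e-9` guard only protects the rounding against floating-point noise when the exact answer is an integer.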

Economics: Quality vs. Efficiency
(Dimensioning: with S. Borst and M. Reiman)
Quality: D(t) delay cost (t = delay time)
Efficiency: C(N) staffing cost (N = # agents)
Optimization: N* minimizes Total Costs
  C >> D: Efficiency-driven
  C << D: Quality-driven
  C ≈ D: Rationalized - QED
Satisfization: N* minimal s.t. Service Constraint, e.g. %Delayed < α
  α ≈ 1: Efficiency-driven
  α ≈ 0: Quality-driven
  0 < α < 1: Rationalized - QED
Framework: Asymptotic theory of M/M/N, N → ∞
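The satisfization rule (least N subject to %Delayed < α) can be illustrated with a direct search. The following Python sketch assumes an exact Erlang-C delay probability as the service constraint; `erlang_c` and `min_agents` are illustrative names, and the linear search is chosen for clarity rather than speed.

```python
import math


def erlang_c(n, r):
    """P{Wait > 0} in M/M/n with offered load r < n (Erlang-B recursion)."""
    b = 1.0
    for k in range(1, n + 1):
        b = r * b / (k + r * b)
    return b / (1.0 - (r / n) * (1.0 - b))


def min_agents(r, alpha):
    """Smallest stable N with P{Wait > 0} <= alpha."""
    n = math.floor(r) + 1  # the queue is stable only for N > R
    while erlang_c(n, r) > alpha:
        n += 1
    return n


# R = 100 Erlangs under three targets for the fraction delayed.
for alpha in (0.9, 0.5, 0.05):
    n = min_agents(100.0, alpha)
    print(f"alpha={alpha}: N={n}, safety staffing={n - 100}")
```

Because the delay probability decreases in N for a fixed load, the first N that meets the target is the minimal one.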

Figure 12: Mean Service Time (Regular) vs. Time-of-day (95% CI) (n = 42613)
[Figure: Mean Service Time against Time of Day]

Predictable Variability: Arrivals, Queues, Waiting

Service Time: Survival curve, by Types
Means (in seconds): NW (New) = 111, PS (Regular) = 181, NE (Stocks) = 269, IN (Internet) = 381
[Figure: Survival against Time]


Operational Aspects of Impatience
Recall earlier Q, E and QED Scenarios (E(S) = 3:45):

λ/hr    N      OCC     ASA        % Wait = 0
1599    100    99.9%   59:33      1%
1599    105    95.2%   0:23       51%
1600    100    100%    infinity   0%

BUT with Patience = E(S):
1600    100    96%     0:09       50%

AND could have %Abandon:
λ/hr    N      OCC     ASA        % Abandon
1600    100    97.3%   0:23       2.7%
1600    95     98.4%   0:23       6.5%
1800    105    97.7%   0:23       3.4%

QED with (Im)patient Customers: The "fittest" survive and wait less, much less!
Erlang-A: Erlang-C with Exponential Patience / Abandonment
Downloadable implementation: 4CallCenters(.com)
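The "Patience = E(S)" row can be checked numerically by treating Erlang-A as a birth-death chain. The Python sketch below assumes exponential patience with rate θ, truncates the queue at a large length, and reports the mean wait over all arrivals (abandoning callers included), which may differ slightly from the ASA definition used by 4CallCenters; `erlang_a` and its `tail` parameter are hypothetical.

```python
import math


def erlang_a(lam, mu, theta, n, tail=2000):
    """Steady state of M/M/n+M: returns (P{Wait>0}, P{Abandon}, mean wait, OCC)."""
    log_w = [0.0]  # log of unnormalised stationary weights, state 0 first
    for k in range(1, n + tail + 1):
        death = min(k, n) * mu + max(k - n, 0) * theta  # service + abandonment
        log_w.append(log_w[-1] + math.log(lam / death))
    peak = max(log_w)
    w = [math.exp(x - peak) for x in log_w]  # rescale to avoid overflow
    total = sum(w)
    p = [x / total for x in w]

    p_wait = sum(p[n:])  # PASTA: an arrival waits if all n agents are busy
    queue_len = sum((k - n) * p[k] for k in range(n + 1, len(p)))
    p_abandon = theta * queue_len / lam  # abandonment rate / arrival rate
    mean_wait = queue_len / lam          # Little's law, over all arrivals
    occupancy = lam * (1.0 - p_abandon) / (n * mu)
    return p_wait, p_abandon, mean_wait, occupancy


# 1600 calls/hr, E(S) = 3.75 min, mean patience = E(S), N = 100 (rates per minute).
p_wait, p_ab, wait_min, occ = erlang_a(1600 / 60, 1 / 3.75, 1 / 3.75, 100)
print(f"wait>0: {p_wait:.1%}  abandon: {p_ab:.1%}  "
      f"mean wait: {wait_min * 60:.0f}s  OCC: {occ:.1%}")
```

When θ = µ the chain has the same rates as an infinite-server queue, so the number in system is Poisson with mean R, which makes this row a convenient sanity check.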


Erlang-A (with G-Patience): M/M/N+G
[Diagram: arrivals, ACD queue, agents (FRONT); lost calls due to busy signals and abandonment]

QED Theorem (Garnett, M. and Reiman '02; Zeltyn '03)
Consider a sequence of M/M/N+G models, N = 1, 2, 3, ...
Then the following points of view are equivalent:
QED: $\%\{\text{Wait} > 0\} \approx \alpha$, $0 < \alpha < 1$;
Customers: $\%\{\text{Abandon}\} \approx \gamma / \sqrt{N}$, $0 < \gamma$;
Agents: $\text{OCC} \approx 1 - (\beta + \gamma) / \sqrt{N}$, $-\infty < \beta < \infty$;
Managers: $N \approx R + \beta \sqrt{R}$, $R = \lambda \times E(S)$ not small.
QED performance (ASA, ...) is easily computable, all in terms of β (the square-root safety staffing level); see later.
Covers also the Extremes:
α = 1: N = R - γR (Efficiency-driven)
α = 0: N = R + γR (Quality-driven)

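QED Approximations (Zeltyn)
λ arrival rate, µ service rate, N number of servers, G patience distribution, g_0 patience density at origin (g_0 = θ, if exp(θ)).

$N = \frac{\lambda}{\mu} + \beta \sqrt{\frac{\lambda}{\mu}} + o(\sqrt{\lambda})$, $-\infty < \beta < \infty$.

$P\{\text{Ab}\} \approx \frac{1}{\sqrt{N}} \left[ h(\hat\beta) - \hat\beta \right] \left[ \sqrt{\frac{\mu}{g_0}} + \frac{h(\hat\beta)}{h(-\beta)} \right]^{-1}$,

$P\left\{ W > \frac{T}{\sqrt{N}} \right\} \approx \left[ 1 + \sqrt{\frac{g_0}{\mu}} \, \frac{h(\hat\beta)}{h(-\beta)} \right]^{-1} \frac{\bar\Phi\left( \hat\beta + \sqrt{g_0 \mu}\, T \right)}{\bar\Phi(\hat\beta)}$,

$P\left\{ \text{Ab} \,\middle|\, W > \frac{T}{\sqrt{N}} \right\} \approx \frac{1}{\sqrt{N}} \sqrt{\frac{g_0}{\mu}} \left[ h\left( \hat\beta + \sqrt{g_0 \mu}\, T \right) - \hat\beta \right]$.

Here $\hat\beta = \beta \sqrt{\mu / g_0}$, $\bar\Phi(x) = 1 - \Phi(x)$, and $h(x) = \varphi(x) / \bar\Phi(x)$ is the hazard rate of N(0, 1).

Generalizing Garnett, M., Reiman (2002) (Palm 1943-53)
No Process Limits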
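To see the approximations at work, they can be coded directly from the formulas on this slide. The Python sketch below assumes exponential patience (so g_0 = θ) and moderate values of β; the function names are illustrative, and no claim is made that this mirrors any published implementation.

```python
import math


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_sf(x):
    """Survival function 1 - Phi(x) of N(0, 1)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def hazard(x):
    """Hazard rate h(x) = phi(x) / (1 - Phi(x)) of N(0, 1)."""
    return normal_pdf(x) / normal_sf(x)


def p_wait_exceeds(beta, mu, g0, t=0.0):
    """Approximate P{W > t / sqrt(N)}; t = 0 gives the delay probability."""
    beta_hat = beta * math.sqrt(mu / g0)
    p_delay = 1.0 / (1.0 + math.sqrt(g0 / mu) * hazard(beta_hat) / hazard(-beta))
    return p_delay * normal_sf(beta_hat + math.sqrt(g0 * mu) * t) / normal_sf(beta_hat)


def p_abandon(beta, mu, g0, n):
    """Approximate P{Ab} for N = R + beta * sqrt(R)."""
    beta_hat = beta * math.sqrt(mu / g0)
    bracket = math.sqrt(mu / g0) + hazard(beta_hat) / hazard(-beta)
    return (hazard(beta_hat) - beta_hat) / (math.sqrt(n) * bracket)


# N = R = 100 (beta = 0), patience rate equal to service rate (per minute).
mu = g0 = 1 / 3.75
print(f"P(W > 0) ~ {p_wait_exceeds(0.0, mu, g0):.3f}")
print(f"P(Ab)    ~ {p_abandon(0.0, mu, g0, 100):.4f}")
```

For β = 0 and g_0 = µ the approximation gives a delay probability of one half and an abandonment fraction of about 4%, in line with the "Patience = E(S)" row shown earlier (96% occupancy, 50% immediate answer).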

Erlang-A: P{Wait>0} = α vs. β (N = R + β√R)
[Figure: P{Wait>0} against β, for β from -3 to 3. Curves: Halfin-Whitt, GMR(0.1), GMR(0.5), GMR(1), GMR(2), GMR(5), GMR(10), GMR(20), GMR(50), GMR(100)]
GMR(x) describes the asymptotic probability of delay as a function of β when θ/µ = x. Here, θ and µ are the abandonment and service rate, respectively.

Erlang-A: P{Abandon} × √N vs. β
[Figure: P{Abandon} × √N against β, for β from -3 to 3. Curves: GMR(0.1), GMR(0.5), GMR(1), GMR(2), GMR(5), GMR(10), GMR(20), GMR(50), GMR(100)]

Forecasting the Arrival Function
Theoretical & Empirical Prob. of Delay vs. β
[Figure: probability of delay against β, for β from -7 to 7. Curves: Garnett, Empirical]