Blind Fair Routing in Large-Scale Service Systems with Heterogeneous Customers and Servers

Mor Armony 1   Amy R. Ward 2

October 6, 2

Abstract

In a call center, arriving customers must be routed to available servers, and servers that have just become available must be scheduled to help waiting customers. These dynamic routing and scheduling decisions are very difficult, because customers have different needs and servers have different skill levels. A further complication is that it is preferable that these decisions are made blindly; that is, they depend only on the system state and not on system parameter information such as call arrival rates and service speeds. This is because this information is generally not known with certainty. Ideally, a dynamic control policy for making routing and scheduling decisions balances customer and server needs, by keeping customer delays low, but still fairly dividing the workload amongst the various servers. In this paper, we propose two blind dynamic control policies for parallel server systems with multiple customer classes and server pools: one that is based on the number of customers waiting and the number of agents idling, and one that is based on customer delay times and server idling times. We show that, in the Halfin-Whitt many-server heavy traffic limiting regime, our proposed blind policies perform extremely well when the objective is to minimize customer holding or delay costs subject to server fairness, as defined by how the system idleness is divided among servers. To do this, we formulate an approximating diffusion control problem (DCP), and compare the performance of the non-blind DCP solution to a feasible policy for the DCP that is blind. We establish that the increase in the DCP objective function value is small over a wide range of parameter values. We then use simulation to validate that a small increase in the DCP objective function value is indicative of our proposed blind policies performing very well.

Acknowledgement: We thank Itay Gurvich and Avi Mandelbaum for many valuable discussions.

1 Stern School of Business, New York University, marmony@stern.nyu.edu.
2 Marshall School of Business, University of Southern California, amy.ward@marshall.usc.edu

1 Introduction

Large scale call centers and other parallel server service systems with heterogeneous customer and server populations have brought up the need for skill-based control policies that dynamically match waiting customers and available servers. These dynamic control policies have two components: a routing component that specifies what to do when a customer arrives to the system, and a scheduling component that specifies what to do when a server completes service and becomes available. Many such dynamic control policies have been developed in industry and academia over the last two decades. Traditionally, these policies have been customer-centric in the sense that they have focused on customer-related goals, such as assigning customers to servers whose expertise best matches the needs of these customers and minimizing customer wait time. More recently, the recognition that employee satisfaction is also important for business success has led to policies that also consider server-centric goals, such as considering server preference when assigning customers to servers and making sure all servers have some idle time; see, for example, Section 5 in Aksin et al. [] and Whitt [34].

This leads to the desire for dynamic control policies that balance customer and server goals. For example, the control policies in Armony [2], Gurvich and Whitt [7], and Dai and Tezcan [4] asymptotically minimize customer delay cost, but this is at the expense of the faster servers having a heavier workload, and the slower servers experiencing almost all of the idle time. This is unfair. At the same time, fairness has been shown by psychologists to be a key component in employee satisfaction [2, 3]. This motivates defining a fair policy and having a revised goal of minimizing customer delay cost within the smaller class of fair policies, as observed in Armony and Ward [3].

It is also the case that model parameters may not be known. In particular, parameters such as arrival rates and mean service times can have large forecast errors associated with them. However, many of the dynamic control policies for parallel server systems that have been proposed rely on the assumption that these parameters are known. It is well-recognized that this issue calls for policies that either estimate these parameters in real time, or do not use parameter information at all, which is true of the QIR policies in Gurvich and Whitt [6]. As in Atar et al. [8], we refer to the latter as blind policies.

We are interested in finding a blind dynamic control policy that has small customer holding or delay cost and also satisfies fairness constraints with respect to how the system idleness is divided among servers. We would also like the policy to be simple and easily implementable.

The first issue is that finding a dynamic control policy that minimizes customer holding or delay cost subject to fairness constraints is very difficult, even when there is no requirement that the policy is blind. This is because finding dynamic control policies for parallel server systems that solve some defined optimization problem is a notoriously difficult problem, due to the curse of dimensionality: many call centers have hundreds to thousands of agents, and the time required to solve this problem numerically becomes prohibitively long as the number of agents becomes large. Furthermore, it is also true that, in general, optimal policies that are simple and easily implementable can only be found for relatively simple models. For larger and more complex models, there is a need for simple and easily implementable policies that are approximately optimal. Therefore, we will follow the approach of looking for control policies, and evaluating their performance, in the Halfin-Whitt many-server heavy-traffic limit regime.

In relation to our first paper on the topic of server fairness [3], which found an asymptotically optimal fair policy in the Halfin-Whitt limit regime, our main focus in this paper is to evaluate the performance of fair policies that are blind. We consider the diffusion control problem (DCP) that approximates the optimization problem whose objective is to minimize customer holding cost subject to a fairness constraint that is based on the number of idle servers. This DCP also approximates the optimization problem whose objective is to minimize customer delay cost subject to a fairness constraint that is based on agent idle time. We solve the DCP by proving that it is separable; that is, its solution can be found by solving two simpler DCPs, one that involves only routing and one that involves only scheduling, both of whose solutions can be found in the literature.

However, the solution to the DCP is not blind. There is a very simple, intuitive blind control policy for the DCP that keeps the diffusion drift fixed, as opposed to the DCP solution, under which the diffusion drift depends on the diffusion's state. We show that the percentage increase in the DCP objective function value over its minimum when the blind control is used can be calculated by considering a two-pool inverted-V model; that is, a system with one customer class and servers that can serve at one of two speeds. This provides a closed-form analytic expression for the percentage increase in the DCP objective function value when the blind control is used instead of the optimal control. We show that this percentage increase is small in a large part of the parameter space.

The fact that the impact of using a blind control for the DCP is small suggests that the impact of using a blind control policy in the parallel server system model is also small. We test the accuracy of this statement using simulation, and find supporting evidence for it. We further show in our simulation results that the established separability of the DCP solution is true for the parallel server system. In particular, the routing decisions

determine how to minimize the total number of customers waiting in queue, and ensure that fairness amongst servers is maintained. The scheduling decisions determine the proportion of customers waiting in queue that are from each class.

We complete this introduction with a brief review of the most relevant literature. Then, in Section 2, we present our model and problem formulation. We solve the relevant DCP in Section 3. We then propose a blind control policy that is feasible for the DCP, and evaluate the percentage increase in the DCP objective function value over its minimum when that blind control is used, in Section 4. In Section 5, we propose a blind control policy for our model that is based on the number of customers waiting and the number of servers idling, and establish via simulation that a small increase in the DCP objective function value is indicative of our proposed blind policy performing very well. Next, in Section 6, we propose another blind control policy, based on the waiting times of customers and the idling times of servers, and show that it also performs very well. Finally, we make concluding remarks in Section 7.

Literature Review

The literature on skill-based routing is extensive. A recent survey of this literature is included in Aksin, Armony, and Mehrotra []. Here we survey only the most closely related stream of literature.

We begin with the paper by Gurvich and Whitt [7]. That paper shows that a generalized cµ (Gcµ) scheduling rule combined with a fastest-server-first routing rule asymptotically stochastically minimizes customer delay costs over a finite horizon in a parallel server system with multiple customer classes and multiple server pools when the delay cost functions are strictly convex, in the Halfin-Whitt many-server limiting regime.³ The Gcµ scheduling rule follows Van Mieghem [27] and Mandelbaum and Stolyar [26], who proved asymptotic optimality of this scheduling rule in the conventional heavy traffic limit. The Gcµ scheduling rule is interesting in large part because it is blind when the service rate depends on the server pool only, and not also on the customer class (as is the case in [7]). However, under fastest-server-first routing, nearly all of the system idle time is experienced by the servers from the pool with the slowest service speed. From the server perspective, this is unfair.

³ Actually, the proposed scheduling rule in [7] is more general than generalized cµ, and their optimality results require the less restrictive condition that the delay cost functions are convex. However, it is only in the case of strictly convex delay cost functions that the proposed scheduling rule is asymptotically equivalent to a generalized cµ rule.

It is also true in the closely related work by Dai and Tezcan [4] that nearly all the system idle time is experienced by the servers from the pool with the slowest service speed. Their model is a parallel server system model that also allows for customer reneging, and they show an asymptotic optimality result that is similar to [7], but uses different proof techniques. The holding and reneging costs are restricted to be linear, and the reneging rates are assumed to be ordered so that classes that are less expensive to hold also have higher reneging rates.

There are several recent papers that address the server fairness issue, all in the context of the inverted-V model. Armony and Ward [3] propose a threshold routing rule, where customers are assigned to server pools based on a set of thresholds on the total number of customers in the system. Atar [6] proposes a blind policy that routes newly arrived customers to the server that has been idle the longest. Atar, Shaki, and Shwartz [8] analyze the blind policy that routes newly arrived customers to the pool that has the longest weighted cumulative idleness process, where the cumulative idleness process for a particular pool is defined by summing over the idle times each server in that pool has experienced since time 0. The policy proposed in Mandelbaum, Momcilovic, and Tseytlin [25] balances the number of idle servers from each pool by routing newly arrived customers randomly, with the probability that a given pool is chosen being proportional to the number of idle servers in that pool. To the best of our knowledge, this is the first paper that considers the server fairness issue in a more general parallel server system that has both multiple server pools and multiple customer classes.

2 Model and Problem Formulation

Consider a Parallel-Server System (PSS), as shown in Figure 1, with fixed sets I = {1, ..., I} that represent customer classes and J = {1, ..., J} that represent server types (each type in its own server pool). We let A_i, i ∈ I, be independent, unit-rate Poisson processes, and let A_i(λ_i t) represent the cumulative number of class i arrivals up to time t > 0, so that λ_i > 0 is the class i arrival rate. The total arrival rate to the system is λ := Σ_{i=1}^{I} λ_i. Service times are independent and exponential, and the average service time of a customer served by a server from pool j ∈ J is 1/µ_j. There are N_j servers in pool j. We let N = Σ_{j∈J} N_j be the total number of servers and N = (N_1, ..., N_J) denote the staffing vector. Here and elsewhere, x is used to denote a vector whose elements are x_1, x_2, .... Note that we follow the notation in [6]. The set of possible assignments of customers to servers in this system can be represented as the set E of edges in a bipartite graph formed from the node set I ∪ J, whose possible edges are {(i, j) : i ∈ I, j ∈ J}.

Figure 1: An Example Parallel Server System.

An edge (i, j) ∈ E corresponds to pool j being able to serve class i customers. Given the assignment graph G := (I ∪ J, E), we let

I(j) := {i ∈ I : (i, j) ∈ E} and J(i) := {j ∈ J : (i, j) ∈ E}.

In words, I(j) is the set of classes that a pool j server can serve, and J(i) is the set of all server pools that can serve class i.

A control policy has two components: a routing component that specifies what to do when a customer arrives to the system, and a scheduling component that specifies what to do when a server completes service and becomes available. Denote by π := π(λ, N) a policy that operates in the system with arrival rate λ and staffing vector N; in general, we omit the arguments λ and N when it is clear from the context which arguments should be used.

Let t be an arbitrary time point. We denote the queue length of class i customers by Q_i(t; π) and the number of idle servers in pool j by I_j(t; π). Then, the total number of customers queueing is Q_Σ(t; π) := Σ_{i∈I} Q_i(t; π), and the total number of idle servers is I_Σ(t; π) := Σ_{j∈J} I_j(t; π). Let Z_j(t; π) be the number of busy servers in pool j. Note that we do not need to track which servers are serving which customer classes, because we have assumed the rate at which a server serves a customer depends only on his pool, and not on the customer's class. It follows that I_j(t; π) = N_j − Z_j(t; π) and I_Σ(t; π) = N − Σ_{j∈J} Z_j(t; π).
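To make the assignment-graph notation concrete, the short sketch below (an illustration added here, not taken from the paper) stores an edge set E as (class, pool) pairs and derives the serving sets I(j) and J(i); the example edge set assumes the usual N-model, in which pool 1 serves only class 1 and pool 2 serves both classes.

```python
# Illustrative sketch (not from the paper): representing the assignment graph
# G = (I ∪ J, E) and deriving I(j) and J(i). Edges are (class i, pool j) pairs.
from collections import defaultdict

def serving_sets(edges):
    """Return I(j) = classes pool j can serve and J(i) = pools that can serve class i."""
    I_of_j = defaultdict(set)   # I(j)
    J_of_i = defaultdict(set)   # J(i)
    for i, j in edges:
        I_of_j[j].add(i)
        J_of_i[i].add(j)
    return dict(I_of_j), dict(J_of_i)

if __name__ == "__main__":
    E_N_model = {(1, 1), (1, 2), (2, 2)}   # hypothetical N-model edge set
    I_of_j, J_of_i = serving_sets(E_N_model)
    print("I(j):", I_of_j)   # {1: {1}, 2: {1, 2}}
    print("J(i):", J_of_i)   # {1: {1, 2}, 2: {2}}
```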

We let S_j, j ∈ J, be independent, unit-rate Poisson processes, also independent of A_i, i ∈ I, so that

D_j(t; π) := S_j( µ_j ∫_0^t Z_j(s; π) ds ) = S_j( µ_j ∫_0^t (N_j − I_j(s; π)) ds )

represents the cumulative number of service completions by pool j servers. The overall number of customers in the system is

X(t; π) := Σ_{i∈I} Q_i(t; π) + Σ_{j∈J} Z_j(t; π),

which has the equivalent representation

X(t; π) = X(0) + Σ_{i∈I} A_i(λ_i t) − Σ_{j∈J} D_j(t; π),

assuming that X(0) = Σ_{i∈I} Q_i(0) + Σ_{j∈J} Z_j(0). We omit the time argument when we refer to an entire process. Also, we omit π from the notation unless it is necessary to avoid confusion between different routing policies.

Let Π be the set of non-anticipating, non-preemptive policies under which a steady state exists for X, Q_i, i ∈ I, and Z_j, j ∈ J. Non-anticipating, roughly speaking, means that the policy cannot require knowledge of the future. By non-preemptive, we mean that once a service has begun by a particular server, it cannot be reassigned to another server of a different pool, nor can it be preempted by another customer. We also assume that any policy π ∈ Π serves customers first-come first-served within each customer class and, similarly, that servers become busy in accordance with a first-idle first-busy policy within each server pool. Additionally, we consider policies that obtain a unique steady-state distribution with a finite mean for all the above relevant processes. Accordingly, we use ∞ as the time argument for the various processes to denote the process in steady state. We call a policy π ∈ Π an admissible policy, and restrict our attention to such policies.

Our stated objective in the Introduction is to find a blind policy for scheduling and routing that minimizes customer holding costs subject to the requirement that in the long run each server pool experiences a proportionate share of the server idleness. In order to do this, it is helpful to first formulate the relevant stochastic minimization problem when we ignore the desire to have a blind policy. Let C_i : R_+ → R_+ for i ∈ I be functions that represent the class i holding cost. Let f_j ≥ 0 be such that Σ_{j=1}^{J} f_j = 1. Then,

the relevant stochastic minimization problem is

minimize_{π∈Π}  Σ_{i=1}^{I} E[ C_i( Q_i(∞; π) ) ]
subject to:  E[ I_j(∞; π) ] = f_j E[ I_Σ(∞; π) ],  j ∈ J.    (1)

We call the constraints in (1) the fairness constraints, and the parameters f_j, j ∈ J, the fairness parameters. The ideal is to first solve (1), and to then search for a blind policy having average customer holding cost that is not too much larger than the solution to (1). However, it does not appear possible to solve (1) exactly. Instead, we formulate and solve the diffusion control problem (DCP) associated with (1) that arises in the Halfin-Whitt many-server heavy traffic limiting regime. We then propose a blind policy that is feasible for the DCP, and develop bounds on the performance of that blind policy relative to the DCP solution.

2.1 The Many-Server Heavy Traffic Limiting Regime

Our approach in solving (1) is to solve it asymptotically in the Halfin-Whitt many-server heavy-traffic limit regime [9]; i.e., for large systems (systems with many servers and large demand) that are heavily loaded. Specifically, we consider a family of systems indexed by the aggregate arrival rate λ, and let λ → ∞. The service rates µ_j, j ∈ J, the routing graph G, and the ratios a_i = λ_i/λ are all held fixed. The associated family of staffing vectors is N^λ := (N_1^λ, ..., N_J^λ). Our convention is to superscript all processes and quantities associated with the system having arrival rate λ by λ. We also define the scaled processes

Q̂_i^λ(t) := Q_i^λ(t)/√λ, i ∈ I,  Î_j^λ(t) := I_j^λ(t)/√λ, j ∈ J,

and Q̂_Σ^λ(t) := Q_Σ^λ(t)/√λ, Î_Σ^λ(t) := I_Σ^λ(t)/√λ, and X̂^λ(t) := (X^λ(t) − N^λ)/√λ.

We assume that the number of servers in each pool is of the same order as the arrival rate, and that the system is heavily loaded. This is analogous to Assumption 2.3 in [7].

Assumption 2.1. (Square-root safety capacity) There is a strictly positive vector ν that satisfies Σ_{j=1}^{J} µ_j ν_j = 1 such that

lim_{λ→∞} N_j^λ/λ = ν_j,  j ∈ J.

That is, µ_j ν_j is the limiting proportion of pool j service capacity out of the total system capacity. Furthermore,

lim_{λ→∞} √λ ( N_j^λ/λ − ν_j ) = θ_j,  j ∈ J,

for θ ∈ R^J such that β := Σ_{j∈J} µ_j θ_j > 0.

Assumption 2.1 implies that Σ_{j=1}^{J} µ_j N_j^λ = λ + β√λ + o(√λ) as λ → ∞. When J = 1, this is the Halfin-Whitt many-server heavy-traffic condition that appears in (2.2) in [9].

We also require a resource pooling condition. This is analogous to Assumption 2.4 in [7].

Assumption 2.2. (Resource pooling) There exists a vector x ∈ R_+^{I×J} that satisfies

Σ_{j∈J(i)} µ_j x_{ij} ν_j = a_i for all i ∈ I and Σ_{i∈I(j)} x_{ij} = 1 for all j ∈ J,

and such that the graph G having allowable edges E := {(i, j) ∈ I × J : x_{ij} > 0} is a connected graph.

The variables x_{ij} may be interpreted as the limiting proportion of server-pool j's capacity that is devoted to customers of class i. Assumption 2.2 guarantees that with multiple customer classes each class has access to more than the minimal capacity that it requires, that is, that Σ_{j∈J(i)} µ_j ν_j > a_i with strict inequality. This local excess capacity condition guarantees that if all the capacity in the set of pools J(i) is directed to serve the class i queue, the queue can be drained extremely fast, and practically instantaneously as the system size grows. For the remainder of this paper, we will assume that a vector x ∈ R_+^{I×J} that satisfies Assumption 2.2 has been fixed.

We also let the class i customer holding cost depend on the system arrival rate, and make the following assumption on the functions C_i^λ, i ∈ I. This is analogous to Assumption 2.5 in [7], but more restrictive. In particular, we assume the cost functions are strictly convex instead of non-strictly convex, because that is when the Gcµ scheduling rule is relevant.

Assumption 2.3. There exist continuously differentiable, strictly increasing, strictly convex functions C_i, i ∈ I, having C_i(0) = C_i'(0) = 0, such that C_i^λ(x) := C_i(x/√λ) for each i ∈ I.

The cost functions are scaled because the queue lengths are of order √λ when the arrival rate is λ. Assumptions 2.1-2.3 are what is required in [7] in order for the generalized cµ scheduling combined with the Fastest-Server-First (FSF) routing rule to be asymptotically optimal with respect to a finite time horizon version of (1) that omits the fairness constraint.
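As a numerical illustration of Assumption 2.1 (with parameter values that are assumptions made only for this sketch), square-root-safety staffing levels N_j^λ ≈ λν_j + √λ θ_j can be generated for increasing λ and checked against the implied first-order relation Σ_j µ_j N_j^λ ≈ λ + β√λ.

```python
# Sketch: square-root safety staffing consistent with Assumption 2.1.
# The parameter values (mu, nu, theta) are assumptions for illustration only;
# nu is chosen so that sum_j mu_j * nu_j = 1.
import math

mu    = [1.0, 2.0]        # service rates: pool 1 slower than pool 2 (assumed)
nu    = [0.2, 0.4]        # capacity fractions: 1.0*0.2 + 2.0*0.4 = 1
theta = [0.3, 0.1]        # square-root safety terms (assumed)
beta  = sum(m * t for m, t in zip(mu, theta))   # beta = sum_j mu_j * theta_j > 0

for lam in (100, 400, 1600, 6400):
    N = [round(lam * nu_j + math.sqrt(lam) * th_j) for nu_j, th_j in zip(nu, theta)]
    capacity = sum(m * n for m, n in zip(mu, N))
    # Assumption 2.1 implies sum_j mu_j N_j = lam + beta*sqrt(lam) + o(sqrt(lam))
    print(lam, N, round(capacity, 1), round(lam + beta * math.sqrt(lam), 1))
```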

Given this setup, the problem (1), for every fixed value of λ, is equivalently stated as

minimize_{π∈Π}  Σ_{i=1}^{I} E[ C_i( Q̂_i^λ(∞; π) ) ]
subject to:  E[ Î_j^λ(∞; π) ] = f_j E[ Î_Σ^λ(∞; π) ],  j ∈ J.    (2)

3 The Approximating Diffusion Control Problem (DCP)

We begin in Section 3.1 by informally deriving the diffusion control problem (DCP) that arises as an approximation to (2) as λ becomes large. We then, in Section 3.2, establish that the DCP separates into two simpler problems: one that involves only scheduling and one that involves only routing. The implication is that the routing control policy determines the total number of customers waiting in queue, and the scheduling control policy determines how to split them among the various classes. The section culminates in Section 3.3 with the DCP solution.

3.1 Diffusion control problem formulation

Consider a sequence of admissible control policies {π^λ : π^λ ∈ Π for every value of λ} that is asymptotically non-idling in the sense that for any T > 0

sup_{t≤T} Q̂_Σ^λ(t) ∧ Î_Σ^λ(t) ⇒ 0 as λ → ∞,

where the notation ∧ means minimum and the notation ⇒ means weak convergence. Then, letting [x]^+ := max(x, 0) and [x]^− := max(−x, 0), it follows that

Q̂_Σ^λ(·; π^λ) − [X̂^λ(·; π^λ)]^+ ⇒ 0 and Î_Σ^λ(·; π^λ) − [X̂^λ(·; π^λ)]^− ⇒ 0 in D, as λ → ∞,

where D is the set of all functions ω : R_+ → R^m (for some appropriate integer m) that are right continuous and have left limits, and is endowed with the Skorokhod J_1-topology.

Next assume that each admissible policy in the sequence is Markovian, and that there is the following state-space collapse in the limit (which is possible due to Assumption 2.2): at any moment in time, the queue length for each class can be represented as a function of the total queue length, and the number of idle servers in each pool can be represented as another function of the total number of idle servers. In other words, there are functions

p : R_+ → S_Q := { (p_1, ..., p_I) : p_i ≥ 0 for all i ∈ I and Σ_{i=1}^{I} p_i = 1 }

and v : R + S I := v,..., v J : v j for all j J and J v j = that specify the division of waiting customers into classes and idle servers into pools such that j= ˆQ i, π p i ˆX, π + ˆX, π + in, as, i I, and Î j, π v j ˆX, π ˆX, π in, as, j J. Then, rather than solving 2, it would be approximately equivalent to solve the following problem, based on the one-dimensional stochastic process ˆX minimize p,v S I i= E [ C i p i ˆX ; π + ˆX ; π +] subject to: [ E v j ˆX ; π ˆX ; π ] [ = f j E ˆX ; π ], j J, 3 where S is the set of all control functions p, v having px S Q and vx S I for all x. The final step in formulating the CP that approximates 2 as becomes large is to observe that under any admissible policy π, ˆX has the martingale decomposition given in 28 in Gurvich and Whitt [6] In 4, ˆM is a martingale, and ˆX t = ˆX + ˆM t + J j= µ jn j t + ˆM 2Bt in as, J t µ j Îj sds. 4 j= by Lemma 7. in [6], for {Bt : t } a standard Brownian motion. Then, equation 4, Assumption 2. square root safety capacity, and the assumed state-space collapse suggest that ˆX ˆX in as for ˆX that solves the stochastic integral equation ˆX t = ˆX + t m ˆX s, v ds + 2Bt, 5 where J mx, v := β + µ j v j x x. 6 j=

The relevant CP follows by replacing ˆX in 3 by ˆX and is minimize p,v S I i= E [ C i p i ˆX + ˆX + ] subject to: ] E [v j ˆX ˆX = f j E ˆX, for all j J. 7 Recall from our terminology for the pre-limit system that we call the constraint in 7 the fairness constraint, and that the fairness parameters f j, j J, dictate the proportion of idleness experienced by pool j servers. In relation to the model parameters a i, i I, and ν j, j J, that respectively specify the proportion of the total arrival rate that is class i and the proportion of the total number of servers that are in pool j, the a i s disappear in the limiting approximation ˆX for ˆX, but the ν j s do not disappear, because β depends on the ν j s although there are many values of the ν j s that result in the same β. 3.2 Separability of the CP The key to solving the CP 7 is to observe that the routing and the scheduling components of the control policy 7 may be found separately. To see this, first observe that 7 is alternatively expressed as [ I ] minimize p,v S E i= C i p i ˆX + ˆX + ˆX > P ˆX > ] subject to: E [v j ˆX ˆX = f j E ˆX, for all j J. [ I ] The term E i= C i p i ˆX + ˆX + ˆX > in the above is not a function of the routing component v. In fact, ˆX t + ˆX t > is a reflected Brownian motion RBM with drift β and infinitesimal variance 2, so that ˆX + ˆX > expβ, where denotes is distributed as. Next, since the drift term 6 for ˆX that solves 5 does not depend on p, P ˆX > does not depend on p, and also the constraints in 7 do not depend on p. Therefore, it follows that 7 is equivalently written as minimize p SQ subject to: E I i= C i p i xx βe dx βx minimize v SI P ˆX > ] [v j ˆX ˆX = f j E ˆX, for all j J. 8 We can separate 8 into two problems. In particular, for any fixed routing component v, the scheduling component p that solves the problem minimize p SQ I i= C i p i xx βe dx βx, 9 2

is also an optimal scheduling component for 7. Furthermore, for any fixed scheduling component p, the routing component v that solves the problem minimize v SI P ˆX > ] subject to: E [v j ˆX ˆX = f j E ˆX, for all j J is also an optimal routing component for 7. In particular, p that solves 9 is not a function of the particular choice of v, and v that solves is not a function of the particular choice of p. Hence we have established the following result. Proposition 3.. Separability The optimal solution p, v to the problem 7 has p that solves 9 and v that solves. The implication of Proposition 3. is that the solution of the CP in 7 can be found by solving one problem that involves only scheduling, ignoring the fairness constraints, and one that involves only routing, ignoring the cost functions C i, i I. The routing problem determines E[ ˆX + ] = P ˆX > /β, which approximates the total number of customers waiting in queue, and the scheduling problem 9 finds the division of waiting customers into classes that minimizes delay costs as a function of the total number of customers waiting. Moreover, p is the same for all underlying PSS network structures having the same number of customer classes I and the cost functions C i. Similarly, v is the same for all underlying PSS network structures having the same number of server pools and desired fairness fractions f j. Hence we can think of the solution p, v as combining the scheduling control policy p that solves the approximating CP for the V model with multiple customer classes and a single server pool in which the fairness constraint disappears, and the approximating CP for the inverted-v model with a single customer class and multiple server pools in which the objective function reduces to minimizing the wait time. 3.3 The iffusion Control Problem Solution We use Proposition 3. to solve the CP 7. To do this, we first specify the scheduling and routing components, p and v, separately. These are given in [7] and [3]. For completeness, we also write the solutions here. 3

3.3. The CP solution: Scheduling Consider the problem minimize subject to: I i= C i p i xx i I p ix = p i x for all i I. This nonlinear program appears as in [7]. It is a separable, continuous, nonlinear resource allocation problem, as in [2] [29] [36]. It follows from Lemma 3. in [7] that we can choose the function p such that it is locally Hölder continuous on the open interval,, and we make this assumption. It follows from the associate KKT conditions that the solution p x := p x,..., p I x, x > to is the unique solution to the parameterized set of equations C i p ixx = C k p kxx, i, k I i I p ix = p i x for all i I. Since, p x solves for any value of x >, we conclude that p solves 9. 3.3.2 The CP solution: Routing The solution to is a threshold control policy, as defined below. efinition 3.. A threshold control policy is defined by a J dimensional vector L := L,..., L J having L := < L < L 2 < < L J < L J :=, and has v j x = {L j x < L j }, for all x [, and j J. It is also useful to note that the infinitesimal drift associated with threshold control policy at levels L is β x m L x = β x J j= µ j{l j x < L j } x <. 2 Let v be the threshold control policy, and L = L β, µ,..., µ J, f,..., f J,..., L J = L J β, µ,..., µ J, f,..., f be the associated threshold levels, under which ˆX satisfies 5 and [ { E ˆX L j ˆX }] [ < L j = f j E ˆX ] for all j J. 3 4

Such a control policy exists by Lemma 5 in [3], and the threshold levels can be found according to the search procedure described in that paper, that is a sequence of one-dimensional searches. Theorem in [3] establishes that the proposed threshold control policy v with threshold levels L,..., L J is an optimal solution to the problem. 3.3.3 The CP solution The following Theorem is immediate from Proposition 3., the solution to 9 provided in Section 3.3., and the solution to provided in Section 3.3.2. Theorem 3.. Let ˆX satisfy 5 under the routing control v S I. Then, for any p, v S, for ˆX that satisfies 5 under control v S I, and has [ E v j ˆX ˆX ] [ = f j E ˆX ] for all j J, it is also true that i= E [C i p i ˆX + ˆX +] i= E [C i p i ˆX + ˆX +] In summary, we have solved the CP 7 that approximates our original stochastic minimization problem 2 in the Halfin-Whitt many-server heavy traffic regime. However, the solution p, v requires knowledge of the parameters β, µ,..., µ J, because the threshold levels L,..., L J depend on those parameters. In particular, this policy is not blind. This is an issue because our stated objective is to propose a blind policy that has an associated cost that is not too much larger than the minimum achievable cost. Therefore, in the next Section, we consider a blind policy that is feasible for the CP in 7, and calculate the percentage cost increase in the CP objective function 7 under that blind policy relative to its value under p, v. 4 A Blind Policy that is Feasible for the CP In this section, we introduce our proposed blind policy, and evaluate the percentage increase in the CP objective function value when that blind policy is used instead of the optimal policy p, v. Our goal is to show that this percentage increase is small. In Section 4., we write the expression for the percentage cost increase, and observe that it does not depend on the scheduling component p. Next, in Section 4.2, we 5

establish that it is sufficient to study a two pool model with a single customer class J = 2 and I =, an inverted-v model in order to approximate the aforementioned percentage increase in any PSS model, with any number of server pools and customer classes. In particular, to calculate this percentage increase in any PSS model, it is required only to know the service rate of the servers in the slowest and fastest pools; it is not necessary to know the cost functions, the number of customer classes, or the number of server pools. Finally, we end in Section 4.3 by using the convenient analytic expressions associated with the two pool inverted V model to show that the percentage increase in the objective function value is under % in a large portion of the parameter space. 4. The Proposed Blind Policy One very simple routing policy that satisfies the constraints of the CP in 7 is the fixed drift policy v j x = f j for x and j J. 4 The routing control policy v F defined by 4 has the very attractive feature that it does not require knowledge of the parameters β, µ,..., µ J, and so is blind. However, v F is more restrictive than the optimal routing policy v, because it requires that the fairness constraint in 7 is satisfied at all times. In other words, each pool always has the desired proportion of idle servers, instead of that desired proportion being achieved in the long run, but not necessarily at any single point in time. The optimal scheduling control policy p is blind, and so our proposed scheduling policy remains the same as in Section 3. The percentage increase in the objective function value in 7 when the blind control policy p, v F is used instead of the optimal control policy p, v is ] I i= [C E i p i ˆXF + ˆXF + [ ] I i= E C i p i ˆX + ˆX + ], I i= [C E i p i ˆX + ˆX + where ˆX F satisfies 5 under the routing control policy in 4, and ˆX satisfies 5 under the routing control policy v. Since any ˆX that satisfies 5 has ˆX + ˆX > expβ, i= E [C i p i ˆX + ˆX +] = P ˆX > C i p i xx βe βx dx. i= 6

Therefore, ] I i= [C E i p i ˆXF + ˆXF + [ ] I i= E C i p i ˆX + ˆX + ] 5 I i= [C E i p i ˆX + ˆX + P ˆXF > P ˆX > =, P ˆX > so that the increase in the objective function value is independent of the scheduling control policy. Moreover, the increase in the objective function value does not depend on either the cost functions C i, i I, or the number of customer classes I. Furthermore, the percentage increase in the objective function value in the CP that considers only routing is exactly the percentage increase in the objective function value in the CP 7 that includes both routing and scheduling. The reason is the separability shown in Section 3.2. We would like to use the expression in 5 to understand when using v F instead of v results in a small percentage increase in the objective function value. This appears possible because the expression in 5 can be written in terms of the standard normal pdf φ and cdf Φ using the results in Section 8.4 Browne and Whitt []. In the case that J = 2, for a given f, and f 2 = f, P ˆXF > = P F f and P ˆX > = P Lf, where P F f = β µ f + µ 2 fφ Φ β µ f+µ 2 f β µ f+µ 2 f + β µ f + µ 2 fφ Φ β µ f+µ 2 f β µ f+µ 2 f, and P L = + µ2 φ µ 2 L+ β µ2 φ β β Φ µ µ 2 L+ β µ2 φ µ L+ β µ µ2 φ µ 2 L+ β µ2 µ Φ Φ β µ Φ µ L+ β µ µ2 φ µ 2 L+ β µ 2 L+ β µ2 φ + µ2 µ L+ β β µ Φ µ 2 L+ β µ2 φ φ β µ µ L+ β µ for L := Lf chosen such that the constraint in 7 is satisfied. However, the expression in 5 becomes algebraically complicated for J larger than 2. Fortunately, for the purposes of estimating P ˆXF > P ˆX > max, f,...,f J f j [,], J P ˆX j= f j= > 7

the maximum percentage cost increase over all fairness constraints for a given set of parameters β, J, µ,..., µ J, it is enough to find max f that is, it is sufficient to study the two pool system. P F f P Lf P ; Lf 4.2 Reduction to the Two Pool System Recalling that µ = min{µ, µ 2,..., µ J } and µ J = max{µ, µ 2,..., µ J }, let q, q 2,..., q J [, ], with q = and q J =, satisfy µ j = q j µ + q j µ J, j =,..., J. It follows that for any routing policy v for the original system, there exists a routing policy for the two-pool system with rates µ and µ J such that the underlying diffusion process is identical in both systems, and the slow servers idleness proportion in the two pool system, f 2P, is a convex combination of the parameters q, q 2,..., q J weighted by the idleness proportions associated with the policy v. Proposition 4.. Consider a system with J 3 pools, fixed parameters β, µ, µ 2,..., µ J and idleness proportions f, f 2,..., f J. Consider a routing policy v for this system. efine the routing policy v 2P v : R + {v 2P, v2 2P : vj 2P for the two-pool system with rates µ and µ J as follows Then, for m defined as in 6, v 2P v := J j= for j {, 2} and v 2P + v 2P 2 = } q j v j, and v 2P J v := J q j v j. 6 j= mx, v = mx, v 2P for all x R. In particular, if ˆX satisfies 5 under the routing control policy v and control policy v 2P, P ˆX > = P ˆX 2P ˆX2P >. satisfies 5 under the routing Moreover, let f, f 2,..., f J be the idleness proportions associated with pools,2,...,j in the original system, then the slow server idleness proportion in the two-pool system satisfies f 2P = J q j f j. 7 j= 8

It follows from Proposition 4. that the the percentage increase in cost in 5 between the optimal policy and the blind policy v F in the J-pool system is bounded above by the percentage increase in the same, with respect to the two-pool system with rate µ and µ J, and with idleness proportions defined as in 7. Proposition 4.2. Consider a system with J 3 pools, fixed parameters β, µ, µ 2,..., µ J and idleness proportions f, f 2,..., f J. Then, P ˆXF > P ˆX > P F f 2P P Lf 2P P ˆX > P Lf 2P, 8 where f 2P is as defined in 7. As a direct consequence of Proposition 4.2 one has that the maximum possible cost increase caused by using the blind policy v F, in place of the optimal threshold policy for the J -pool system v, is bounded above by the analogous cost increase in the two-pool system. Corollary 4.. Consider a system with J 3 pools and with fixed parameters β, µ, µ 2,..., µ J. Then, P ˆXF > P ˆX > P max max f,...,f J f j [,], f J P ˆX j= f j= > F f P Lf P. Lf 4.3 Numerical Evaluation We desire to understand the consequences of using the blind routing control policy v F instead of the optimal routing control policy v. More specifically, we would like to compute the percentage increase in the objective function value in order to understand the region of the parameter space in which the percentage increase is small. Proposition 4.2 establishes that to do this it is enough to study the fraction P F f P Lf /P Lf for f [, ]. For a given f, Lf does not have a convenient closed form expression. However, for a given L, [ E ˆX ˆX ] < L P ˆX < L f L = [ ] E ˆX = + β Φ β µ 3/2 µ Φ µ L+ β µ φ + φ β µ φ µ L+ β µ µ L+ β µ µ φ µ L+ β µ β Φ µ 2 L+ β µ2 µ 3/2 2 φ + µ 2 L+ β µ 2 µ2 9

is a convenient closed-form expression.

Figure 2: The percentage increase in the objective function value under the blind control policy v^F instead of (a) the optimal control policy v* and (b) the optimal control policy when the fairness constraint is ignored, v^FSF. (Panel (a) plots Δ(L(f)) and panel (b) plots Δ^FSF(f), against the slow-server idleness fraction f, for µ_1 = 1, β = 1, and µ_2 = 2, 3, 4, 5.)

Since, for a given L > 0, the drift of the fixed-drift diffusion that satisfies the identical fairness constraint is easily expressed in terms of f(L) (the drift is −β + (µ_1 f(L) + µ_2 (1 − f(L))) [x]^−), we have a closed-form expression for the percentage increase in the objective function value under the control policy v^F instead of v*,

Δ(L) := ( P^F(f(L)) − P^L ) / P^L,

in terms of φ, Φ, and h(x) = Φ(x)/φ(x), the hazard rate function for the standard normal distribution. Figure 2(a) shows the behavior of Δ when µ_1 = 1, β = 1, and for various values of µ_2. The maximum percentage increase in the objective function value never exceeds %, and is below % for all values of µ_2 = 2 and µ_2 = 3.

For comparison purposes, it is useful to also calculate the percentage increase in the objective function value under v^F as compared to the optimal control policy when the fairness constraint is ignored,

v_j^FSF(x) = 1 if j = 1, and 0 otherwise, for all x.    (19)

In particular, v^FSF solves

minimize_{v∈S_I} P( X̂(∞) > 0 ),

and so, by separability, is the routing control policy that minimizes the objective function in (7), ignoring the constraint. Then, letting X̂_FSF satisfy (5) under the routing control policy v^FSF, the function

Δ^FSF(f) := ( P( X̂_F(∞) > 0 ) − P( X̂_FSF(∞) > 0 ) ) / P( X̂_FSF(∞) > 0 ),

graphed in Figure 2(b) for various values of µ_2, provides information on the impact of adding the fairness constraint with respect to the increase in the objective function value. We conclude from Figure 2(b) that whenever f is not too close to one, meaning the faster servers are experiencing some non-trivial proportion of the idle time, the percentage increase in the objective function value when using v^F instead of v^FSF is much larger than that when using v^F instead of v*. This suggests that the impact of requiring a blind policy is tiny compared to the impact of requiring a fair policy. That insight is also evidenced by using [] to calculate Δ^FSF(f) in closed form, in terms of β, µ_1, µ_2, f, and h(x) = Φ(x)/φ(x), and noting that Δ^FSF(f) increases with µ_2 for any value of f < 1, which can be seen by straightforward calculus.

Figure 3: The percentage increase in the objective function value, Δ(L(f)), under the blind control policy v^F instead of the optimal control policy v*. (Panel (a): µ_1 = 1, µ_2 = 2; panel (b): µ_1 = 1, µ_2 = 5; plotted against the slow-server idleness fraction f for β = 0.5, 1.0, 1.5, 2.0.)

Figure 3 graphs Δ when µ_1 = 1 for various values of β, and for µ_2 = 2 (Figure 3(a)) and µ_2 = 5 (Figure 3(b)). Figure 3 shows that the maximum percentage increase in the objective function increases as

β increases, and that the impact of µ_2 is small compared to the impact of β. This suggests that caution is warranted when deciding to use the blind control policy v^F if there is a sense that the value of β may be large (exceed 1). In other words, even though the control policy v^F itself is blind, it is necessary to have some partial knowledge of the parameters in order to evaluate its performance.

5 The Proposed Blind and Fair Policy

There is a natural translation from the blind policy v^F that is feasible for the DCP (7) to a blind policy that is feasible for the original stochastic minimization problem (2). In particular, the separability of the DCP shown in Proposition 3.1 implies that the scheduling and routing can be decided on in isolation. Scheduling should be in accordance with the Gcµ scheduling rule that was shown to be asymptotically optimal in Theorem 3.2 of [7] for a finite time horizon version of (2) in the absence of fairness constraints. Routing should be in accordance with the QIR policy in [6] that makes decisions based on always maintaining the proportion of idle servers in each pool j to be f_j, so that, in accordance with v^F, the fairness constraint is satisfied at all times in the Halfin-Whitt limit regime. We call our proposed blind policy the fair fixed idleness ratio and generalized cµ scheduling (FFIR-Gcµ) policy. In this section, we motivate why and when we expect the FFIR-Gcµ policy to perform well for (2) in Section 5.1, and then, in Sections 5.2 and 5.3, perform simulation studies that verify that intuition.

5.1 The Fair Fixed Idleness Ratio and Generalized cµ Scheduling (FFIR-Gcµ) Policy

We begin with the precise definition of our proposed blind policy.

Definition 5.1. (The Fair Fixed Idleness Ratio and Generalized cµ Scheduling (FFIR-Gcµ) Policy) Upon the arrival of a class i customer at time t, the customer will be routed to an available server in pool j*, where

j* := j*(t) ∈ argmax_{j ∈ J(i), I_j^λ(t−) > 0} { Î_j^λ(t−) − f_j [X̂^λ(t−)]^− };

i.e., the customer will be routed to the server pool that has the greatest difference between the number of servers in that pool that are currently idling and the number of idle servers in that pool that maintains the fairness parameter f_j. In particular, a larger value of f_j implies a lower priority for the corresponding server pool. If there are no servers available, the customer waits in queue i, to be served in the order of arrival (see the sketch of this routing rule below).
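The following sketch (an illustration, not code from the paper) implements the routing rule just stated, using unscaled quantities: maximizing I_j(t) − f_j max(N − X(t), 0) is equivalent to maximizing Î_j^λ(t−) − f_j [X̂^λ(t−)]^−, since dividing by √λ does not change the argmax. The scheduling component, given in the remainder of Definition 5.1, can be implemented analogously.

```python
# Sketch (illustration only) of the routing component of the FFIR-Gcmu policy in
# Definition 5.1: route an arriving class-i customer to the pool, among those in
# J(i) with an idle server, whose idleness exceeds its fair share by the most.

def ffir_route(i, idle, f, J_of_i, N_total, X_total):
    """Return the pool to which an arriving class-i customer is routed, or None to queue.

    idle[j]   -- current number of idle servers in pool j
    f[j]      -- fairness parameter of pool j (sums to 1 over pools)
    J_of_i[i] -- pools that can serve class i
    N_total   -- total number of servers; X_total -- total customers in system
    """
    total_idleness = max(N_total - X_total, 0)
    candidates = [j for j in J_of_i[i] if idle[j] > 0]
    if not candidates:
        return None                      # no compatible idle server: customer waits in queue i
    return max(candidates, key=lambda j: idle[j] - f[j] * total_idleness)

if __name__ == "__main__":
    # Hypothetical N-model snapshot: pool 1 serves class 1 only, pool 2 serves both.
    J_of_i = {1: [1, 2], 2: [2]}
    idle = {1: 3, 2: 5}
    f = {1: 0.5, 2: 0.5}
    print(ffir_route(1, idle, f, J_of_i, N_total=99, X_total=91))  # pool with largest surplus
```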

Upon service completion by a type j server at time t, the server will admit to service the customer from the head of queue i := i t argmax C i ˆQ i t ; i Ij,Q i t> i.e., the server prioritizes classes using a generalized cµ rule. If there are no customers waiting, the server will idle. The FFIR-Gcµ policy is blind because it only requires knowledge of the fairness parameters f j, j J. Note that the scheduling component is identical to that of 7 in Gurvich and Whitt [7], with the exception that [7] has a µ j multiplying the expression inside the argmax. As [7] points out, the term µ j is redundant due to the assumption that the service rates are pool dependent. However, their formulation underlines the connection to the name Gcµ. We chose to omit it, so as to emphasize the fact that the policy is blind. The FFIR-Gcµ policy acts in a symmetric manner to control the number of customers in each class waiting in queue, and the number of idle servers in each pool, so that for any t >, ˆQ i FFIR-Gcµ [ ] + [ ] and p i ˆXF t ˆXF +, [ ] t and also Îj t; FFIR-Gcµ and f j ˆXF, t are close. This follows from the fact that the FFIR-Gcµ policy is a QIR policy as defined in efinition 2.3 in [6], because the Gcµ scheduling is equivalent to letting { [ + [ ] } + i t argmax ˆQ i t p i ˆX t] ˆX t, i Ij,Q i t> as observed in [7]. We expect the FFIR-Gcµ policy to have a small percentage increase in the steady-state expected customer holding cost the objective function in 2 relative to the policy that is optimal for the stochastic minimization problem 2, in the Halfin-Whitt limit regime, and when the system capacity J j= µ jn j is close enough to the arrival rate so that This is because < J µ j Nj < +. j= the scheduling is motivated by the optimal scheduling control for the CP in 7 p ; the routing is motivated by the blind routing control for the CP v F ; 23

Section 4 shows that in a large portion of the parameter space whenever β < and for any values of J and µ,..., µ J use of the control p p, v F instead of the control p, v that is optimal for the CP results in a small percentage increase in the CP objective function value; the following Theorem, which is an immediate consequence of Theorem 4. in [6], suggests that in the Halfin-Whitt limit regime the FFIR-Gcµ policy has similar performance to the blind control for the CP v F. Theorem 5.. Let ˆXF satisfy 5 under the routing control v F, and let p be the scheduling control policy that solves 9 and is the optimal scheduling control policy for the CP in 7. Assume ˆX ˆX F as. Then, under the FFIR-Gcµ policy, ˆX t, ˆQ t,..., ˆQ I t, Î t,..., Î J t ˆX t, p [ ˆXF in, as. t ] + [ ˆXF t ] +,... p I [ ˆXF t ] + [ ˆXF ] + [ ] [ ] t, f ˆXF t,..., fj ˆXF t Theorem 5. suggests that ] [ E [Î j f j E ˆXF ] for all j J, as, so that the FFIR-Gcµ policy satisfies the constraints of the CP as becomes large. The convergence in the previous display is true when both the limit interchange required to show the weak convergence of the steady-state random variables ˆX, ˆQ,..., ˆQ I, Î,..., Î J ˆX, p [ ˆXF f [ ˆXF ] + [ ˆXF ] [ ],..., fj ˆXF ] + [ ] + [ ] +,... p I ˆXF ˆXF,, as, is valid and the sequence of steady-state random variables is uniformly integrable. Note that for a general PSS model, it is non-trivial to establish the aforementioned limit interchange and uniform integrability. One step in this direction is the work of Stoylar and Yudovina [32]. We next perform a simulation study to investigate the performance of the FFIR-Gcµ policy. The purpose of our simulation study is to show that the results in Section 4, that show that the percentage increase in the objective function value in the CP 7 when the blind control p, v F is used instead of the optimal control p, v, translate to the pre-limit. That is, the objective function value in 2 when the FFIR-Gcµ 24

policy is used should be close to its minimum whenever the objective function value in (7) when (p*, v^F) is used is close to its minimum.

Figure 4: The PSS models used in our simulation study: (a) the N-model; (b) the inverted-V model.

5.2 The Setup for our Simulation Study

We study the performance of the FFIR-Gcµ policy in the N-model and the two-pool inverted-V model shown in Figure 4. We use the control policy that is optimal for the DCP (7), (p*, v*), to motivate a comparison policy to FFIR-Gcµ. The comparison policy schedules in accordance with the Gcµ rule, and the routing translates the thresholds L_1, ..., L_{J−1} that define v* to a threshold routing policy exactly as in Section 5 in [3]. Hence we term the comparison control the TR-Gcµ control. Note that the parameters in Figure 4 are consistent with λ = 2 for the N-model, ν_1 = /5, ν_2 = 4/, θ_1 = /2, and θ_2 = /4, so that β = µ_1 θ_1 + µ_2 θ_2 = and N_1^λ = λ ν_1 + √λ θ_1 = 34, N_2^λ = λ ν_2 + √λ θ_2 = 65. Then, for any given value of f_1 ∈ [0, 1], the threshold L_1 = L_1(β, µ_1, µ_2, f_1, f_2) is well-defined.

The TR-Gcµ is a natural comparison policy to FFIR-Gcµ because its continuous modification has been shown to be asymptotically optimal for the inverted-V model in [3]. At the same time, TR-Gcµ is not blind, so one can consider this comparison as a "price of blindness" indicator. Showing that the TR-Gcµ policy is asymptotically optimal for a general PSS model is technically challenging, and beyond the scope of this paper, because it requires knowing the limiting behavior of the sequence of steady-state random variables

associated with the system processes, and the uniform integrability of that sequence, under any admissible policy which, as mentioned after Theorem 5., is difficult. The routing of an incoming class customer at time t > in the N-model and an incoming customer at time t > in the inverted-v model is the same. The TR-Gcµ policy routes such a customer to pool if X t L and I t > 2 if X jt = t > L and I2 t >, 2 if X t L and I t = if X t > L and I2 t = where L = N L = 99 L 6. Otherwise, if I t = I 2 t =, the customer waits in queue. Scheduling is only relevant for the N-model, and is the same under both the FFIR-Gcµ and TR-Gcµ policies, but depends on the cost function. For the cost functions C x = 2. x. and C 2 x = x. for x >, 2 the scheduling has a pool 2 server that becomes available at time t >, when both Q t > and Q 2 t >, serve the class customer that has been waiting the longest if C ˆQ t > C 2 Q 2t, which is equivalent to 2Q t > Q 2t, and otherwise serve the class 2 customer that has been waiting the longest. If Q t > but Q 2 t = Q 2 t > but Q t =, the server serves the class class 2 customer that has been waiting the longest. The server idles only if Q t = Q 2 t =. We choose the cost function for the inverted-v model so that the predicted mean cost is exactly the same as for the N-model under both the FFIR-Gcµ and TR-Gcµ policies, for easy comparison. To do this, first note that in the N-model, the condition C px = C 2 px implies p x = /3 and p 2 x = 2/3 for 26