
Queueing Systems with Synergistic Servers

Sigrún Andradóttir and Hayriye Ayhan
H. Milton Stewart School of Industrial and Systems Engineering
Georgia Institute of Technology
Atlanta, GA 30332-0205, U.S.A.

Douglas G. Down
Department of Computing and Software
McMaster University
Hamilton, Ontario L8S 4L7, Canada

May 27, 2009

Abstract

We consider tandem lines with finite buffers and flexible, heterogeneous servers who are synergistic in that they work more effectively in teams than on their own. Our objective is to determine how the servers should be assigned dynamically to tasks in order to maximize the long-run average throughput. In particular, we investigate when it is better to take advantage of synergy among servers, rather than exploiting the servers' special skills, to achieve the best possible system throughput. For Markovian systems with two stations and two servers, we provide a complete characterization of the optimal policy and show that, depending on how well the servers work together, the optimal policy either takes full advantage of the servers' special skills or full advantage of server synergy (and hence there is no middle ground in this case). Moreover, for a class of Markovian systems, we provide sufficient conditions that guarantee that the optimal policy should take full advantage of server synergy at all times. Finally, we show that when there is no tradeoff between server synergy and the servers' special skills (because the servers are generalists who are equally skilled at all tasks), the optimal policy has servers working in teams of two or more at all times.

1 Introduction

In the last decade, there has been a growing interest in queues with flexible servers. However, past research in this area has assumed that when servers work together, their service rates are additive. This assumption is obviously restrictive because servers may gain or lose efficiency when they collaborate. In this paper, we focus on the case where the servers become more effective when they collaborate (i.e., the servers are synergistic), which could for example be due to increased motivation among the servers when they work in a team or due to their having complementary skills. Our objective is to understand when it is important to take advantage of synergy among servers, as opposed to exploiting their special skills by assigning them to tasks that they are good at, to improve the system throughput.

More specifically, we focus on a queueing network with $N \geq 1$ stations and $M \geq 1$ flexible servers. There is an infinite amount of raw material in front of station 1, infinite room for departing customers after station $N$, and a finite buffer between stations $j$ and $j+1$, for $j \in \{1, \ldots, N-1\}$, whose size is denoted by $B_j$. The system operates under manufacturing blocking. At any given time, there can be at most one customer at each station and each server can work on at most one customer. We assume that server $i \in \{1, 2, \ldots, M\}$ works at a deterministic rate $\mu_{ij} \in [0, \infty)$ at station $j \in \{1, 2, \ldots, N\}$. Thus, server $i$ is trained to work at station $j$ if $\mu_{ij} > 0$, and the server's skill at station $j$ is measured by the magnitude of $\mu_{ij}$. Several servers can work together on a single customer, in which case the combined rate of a server team is proportional to the sum of the rates of the individual servers. Thus, if servers $i_1, \ldots, i_k \in \{1, \ldots, M\}$ are simultaneously assigned to station $j \in \{1, \ldots, N\}$, the service rate is equal to $\alpha \sum_{r=1}^{k} \mu_{i_r j}$. We assume that $1 \leq \alpha < \infty$, which implies that the servers are synergistic, with $\alpha$ being a measure of the magnitude of the servers' synergy. The service requirements of different customers at each station $j \in \{1, \ldots, N\}$ are independent and identically distributed (i.i.d.) random variables whose rate we take to be equal to 1 without loss of generality, and the service requirements at different stations are independent of each other. We assume that travel and set-up times are negligible. Under these assumptions, our objective is to determine the dynamic server assignment policy that maximizes the long-run average throughput.

Our main contribution is the complete characterization of the optimal server assignment policy for systems with exponential service requirements, $N = 2$ stations, and $M = 2$ servers. However, some of our results are for more general systems.

There is a significant amount of literature on queues with flexible servers. In the interest of space, we do not provide a complete literature review here, but refer the interested reader to Hopp and Van Oyen [9] for a comprehensive review of the literature in this area, and to Akşin, Armony, and Mehrotra [3], Akşin, Karaesmen, and Örmeci [4], and Gans, Koole, and Mandelbaum [8] for thorough reviews of the literature on flexible servers in call centers. This paper is most closely related to other works that employ Markov decision process techniques and sample path analysis in determining effective server allocation schemes; see for example Ahn, Duenyas, and Zhang [1], Ahn and Righter [2], Andradóttir and Ayhan [5], Andradóttir, Ayhan, and Down [6, 7], Kaufman, Ahn, and Lewis [10], Örmeci [12], Sennott, Van Oyen, and Iravani [14], Van Oyen, Gel, and Hopp [16], and Wu, Lewis, and Veatch [17]. However, as mentioned above, there is no previous work in the literature on queues with flexible, synergistic servers (to the best of our knowledge). All the existing papers on queues with flexible servers assume that when servers collaborate on a single customer, the service rates are additive (i.e., $\alpha = 1$).

servers (i.e., µ ij = µ i γ j for all i {1,..., M} and j {1,..., N}) and general service reuirement distributions. Section 4 is devoted to the proof of the main result in Section 2. Finally, Section 5 summarizes our findings and provides some directions for future research. 2 Two Station Markovian Systems with Two Servers In this section, we characterize the optimal policy for a Markovian tandem line with N = 2 stations and M = 2 synergistic servers. We use Π to denote the set of all server assignment policies under consideration (which will be defined later) and D π (t) to denote the number of departures under policy π by time t 0. Define T π = lim sup t IE[D π (t)] t as the long-run average throughput corresponding to the server assignment policy π Π. Our objective is to solve the following optimization problem max T π. (1) π Π We start with the description of a continuous time Markov chain model for systems with N stations, M servers, and exponentially distributed service reuirements. For all π Π and t 0, let X π (t) = {X π,1 (t),..., X π,n 1 (t)}, where X π,j (t) {0,..., B j +2} denotes the number of customers that have been processed at station j at time t but are either waiting to be processed by station j + 1 or in process at station j + 1 at time t. Let S denote the state space of {X π (t)}. We assume that the class Π of server assignment policies under consideration consists of Markovian stationary deterministic policies corresponding to the state space S, and that the set of possible actions is given by A = {a σ1...σ M : σ i {0, 1,..., N}, i = 1,..., M}, where for all i {1,..., M}, σ i = 0 when server i is idle and σ i = j {1,..., N} when server i is assigned to station j. Finally, we use Σ j to denote M i=1 µ ij, for all j = 1,..., N, so that ασ j is the combined service rate of servers 1,..., M when they collaborate at station j. For the remainder of this section, we assume N = 2 and we set B 1 = B for notational convenience. Then it is clear that for π Π, {X π (t)} is a birth-death process with state space S = {0,..., B + 2} and that there exists a scalar π α 2 i=1 max 1 j 2 µ ij < such that the transition rates { π (s, s )} of {X π (t)} satisfy s S,s s π(s, s ) π for all s S. Hence, {X π (t)} is uniformizable. Let {Y π (k)} be the corresponding discrete time Markov chain, so that {Y π (k)} has state space S and transition probabilities p π (s, s ) = π (s, s )/ π if s s and p π (s, s) = 1 s S,s s π(s, s )/ π for all s S. Using the analysis in Section 3 of Andradóttir, Ayhan, and Down [6], one can show that the original optimization problem in (1) can be translated into an euivalent (discrete time) Markov decision problem. More specifically, for all i S, let { π (i, i 1) if i {1,..., B + 2}, R π (i) = 0 if i = 0, 3

Then the optimization problem (1) has the same solution as the Markov decision problem

$$\max_{\pi \in \Pi} \lim_{K \to \infty} \mathbb{E}\left[\frac{1}{K} \sum_{k=1}^{K} R^{\pi}\left(Y^{\pi}(k-1)\right)\right].$$

In other words, maximizing the steady-state throughput of the original queueing system is equivalent to maximizing the steady-state departure rate for the associated embedded (discrete time) Markov chain.

The next theorem provides a complete characterization of the optimal server assignment policy for Markovian systems with $N = 2$ stations and $M = 2$ synergistic servers. For notational convenience, we set

$$C_{22} = \max\left\{\frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2},\ \frac{\mu_{21}}{\Sigma_1} + \frac{\mu_{12}}{\Sigma_2}\right\}.$$

Note that $C_{22} \geq 1$ since $\frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2} + \frac{\mu_{21}}{\Sigma_1} + \frac{\mu_{12}}{\Sigma_2} = 2$. The proof of Theorem 2.1 appears in Section 4.

Theorem 2.1 For a Markovian system of two stations and two servers, we have

(i) If $1 \leq \alpha \leq C_{22}$ and $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$ ($\mu_{21}\mu_{12} \geq \mu_{11}\mu_{22}$), then the policy that assigns server 1 (2) to station 1 and server 2 (1) to station 2 unless station 1 is blocked or station 2 is starved, and assigns both servers to station 1 when station 2 is starved and to station 2 when station 1 is blocked, is optimal. Moreover, this is the unique optimal policy in the class of Markovian stationary deterministic policies if $\mu_{11}\mu_{22} > \mu_{21}\mu_{12}$ ($\mu_{21}\mu_{12} > \mu_{11}\mu_{22}$) and $\alpha < C_{22}$.

(ii) If $\alpha > C_{22}$, then the expedite policy (which assigns all available servers to a single team that will follow each customer from the first to the last station and only starts work on a new customer once all work on the previous customer has been completed) is the unique optimal policy in the class of Markovian stationary deterministic policies.

Note that the uniqueness of the optimal policy in Theorem 2.1 is subject to the interpretation that assigning a server to a station where there is no work is equivalent to idling him, and that the expedite policy can be executed in various ways as long as the two servers work together as a team at all times. Also, note that $C_{22} = \frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2}$ when $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$, and $C_{22} = \frac{\mu_{21}}{\Sigma_1} + \frac{\mu_{12}}{\Sigma_2}$ when $\mu_{21}\mu_{12} \geq \mu_{11}\mu_{22}$.

One can observe that if $\alpha \leq C_{22}$, the optimal policy stated in Theorem 2.1 assigns servers to their primary stations (which are determined by the sign of $\mu_{11}\mu_{22} - \mu_{21}\mu_{12}$) unless they have no work at these stations. Thus, the servers are allowed to collaborate only if their primary stations are blocked or starved. On the other hand, if $\alpha > C_{22}$, the optimal policy has the servers work as a team at all times. Thus, the optimal policy switches from one that takes full advantage of the servers' skills, in that the servers avoid collaboration unless they have no work at their primary assignments, to one that takes full advantage of server synergy (i.e., there is no middle ground). It is also important to point out that the expedite policy minimizes the WIP (Work In Process), since the number of customers in the system is always one. Thus, in a Markovian system with two stations and two servers, one can simultaneously maximize the throughput and minimize the WIP if the synergy among the servers is sufficiently large.
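As a numerical companion to Theorem 2.1 (our sketch, with hypothetical names), the threshold $C_{22}$ and the throughputs of the two candidate policies can be computed directly; the closed form for $g_0$ is equation (12) of Section 4 and assumes $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$:

```python
def theorem21_policy(mu11, mu12, mu21, mu22, alpha, B):
    """Decide which policy Theorem 2.1 prescribes and report both throughputs.

    Assumes mu11*mu22 >= mu21*mu12 (otherwise relabel the servers); g0 is the
    closed-form throughput (12) of the part (i) policy from Section 4.
    """
    S1, S2 = mu11 + mu21, mu12 + mu22
    C22 = max(mu11 / S1 + mu22 / S2, mu21 / S1 + mu12 / S2)
    g0 = (alpha * S1 * S2 * sum(mu11**j * mu22**(B + 1 - j) for j in range(B + 2))
          / (alpha * S1 * S2 * sum(mu11**j * mu22**(B - j) for j in range(B + 1))
             + mu22**(B + 1) * S2 + mu11**(B + 1) * S1))
    T_expedite = alpha * S1 * S2 / (S1 + S2)
    best = "expedite" if alpha > C22 else "part (i)"
    return C22, g0, T_expedite, best

# C22 = 1.55 here, so alpha = 1.2 favors part (i), alpha = 1.7 the expedite policy.
print(theorem21_policy(4.0, 1.0, 1.0, 3.0, alpha=1.2, B=2))
print(theorem21_policy(4.0, 1.0, 1.0, 3.0, alpha=1.7, B=2))
```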

Theorem 2.1 states that for Markovian systems with two stations and two synergistic servers, it suffices to consider two policies. The next proposition quantifies the loss in throughput performance if one employs the wrong policy in Theorem 2.1, implying that it is important to use the correct policy for any given value of $\alpha$.

Proposition 2.1 Consider a Markovian tandem queue with two stations and two servers. Assume that $\Sigma_1 > 0$ and $\Sigma_2 > 0$. If $\alpha \leq C_{22}$, using the expedite policy instead of the optimal policy described in part (i) of Theorem 2.1 can reduce the throughput performance by at most a factor of 2 (and this bound is tight). On the other hand, if $\alpha > C_{22}$, using the policy described in part (i) of Theorem 2.1 rather than the expedite policy can yield arbitrarily worse throughput performance.

Proof: We first obtain an expression for the ratio of the throughputs of the two policies described in parts (i) and (ii) of Theorem 2.1. As mentioned in the proof of Theorem 2.1 in Section 4, the quantity $g_0$ specified in equation (12) is the long-run average throughput of the optimal policy when $\alpha \leq C_{22}$ and $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$. Similarly, one can compute the throughput of the expedite policy (which is optimal when $\alpha > C_{22}$) as $T_{\text{expedite}} = \frac{\alpha \Sigma_1 \Sigma_2}{\Sigma_1 + \Sigma_2}$. Then

$$\frac{g_0}{T_{\text{expedite}}} = \frac{(\Sigma_1 + \Sigma_2) \sum_{j=0}^{B+1} \mu_{11}^j \mu_{22}^{B+1-j}}{\alpha \Sigma_1 \Sigma_2 \sum_{j=0}^{B} \mu_{11}^j \mu_{22}^{B-j} + \mu_{22}^{B+1} \Sigma_2 + \mu_{11}^{B+1} \Sigma_1}. \quad (2)$$

Using the argument in Remark 4.1 of Andradóttir and Ayhan [5], one can verify that $g_0$ is nondecreasing in the buffer size $B$. Thus, $g_0/T_{\text{expedite}}$ is nondecreasing in $B$ and nonincreasing in $\alpha$. This implies that

$$\lim_{B \to \infty} \frac{g_0}{T_{\text{expedite}}} = \frac{\max\{\mu_{11}, \mu_{22}\}(\Sigma_1 + \Sigma_2)}{\alpha \Sigma_1 \Sigma_2 + \left(\max\{\mu_{11}, \mu_{22}\} - \min\{\mu_{11}, \mu_{22}\}\right)\left(\mathbb{1}(\mu_{11} > \mu_{22})\Sigma_1 + \mathbb{1}(\mu_{22} > \mu_{11})\Sigma_2\right)}$$
$$\leq \frac{\max\{\mu_{11}, \mu_{22}\}(\Sigma_1 + \Sigma_2)}{\Sigma_1 \Sigma_2 + \left(\max\{\mu_{11}, \mu_{22}\} - \min\{\mu_{11}, \mu_{22}\}\right)\left(\mathbb{1}(\mu_{11} > \mu_{22})\Sigma_1 + \mathbb{1}(\mu_{22} > \mu_{11})\Sigma_2\right)}, \quad (3)$$

where $\mathbb{1}(E) = 1$ if $E$ occurs and $\mathbb{1}(E) = 0$ otherwise. The bound in (3) is obtained by replacing $\alpha$ with 1. First assume that $\mu_{11} \geq \mu_{22}$. Then

$$\frac{g_0}{T_{\text{expedite}}} \leq \frac{\mu_{11}(\Sigma_1 + \Sigma_2)}{\Sigma_1 \Sigma_2 + (\mu_{11} - \mu_{22})\Sigma_1} \leq 2. \quad (4)$$

The second inequality in (4) follows immediately from the following contradiction argument.

If

$$\frac{\mu_{11}(\Sigma_1 + \Sigma_2)}{\Sigma_1 \Sigma_2 + (\mu_{11} - \mu_{22})\Sigma_1} > 2,$$

then we need to have $\mu_{11}^2 + \mu_{11}\mu_{21} + \mu_{11}\mu_{12} < \mu_{11}\mu_{22} - 2\mu_{21}\mu_{12}$, which is not possible because $\mu_{11}^2 + \mu_{11}\mu_{21} + \mu_{11}\mu_{12} \geq \mu_{11}\mu_{22}$ and $\mu_{21}\mu_{12} \geq 0$. Note that since $\mu_{11} \geq \mu_{22}$,

$$\frac{\mu_{11}(\Sigma_1 + \Sigma_2)}{\Sigma_1 \Sigma_2 + (\mu_{11} - \mu_{22})\Sigma_1} \leq \frac{2\mu_{11}^2 + \mu_{11}\mu_{21} + \mu_{11}\mu_{12}}{\mu_{11}^2 + \mu_{11}\mu_{21} + \mu_{11}\mu_{12} + \mu_{21}\mu_{12}},$$

with equality when $\mu_{11} = \mu_{22}$. But

$$\lim_{\mu_{11} \to \infty} \frac{2\mu_{11}^2 + \mu_{11}\mu_{21} + \mu_{11}\mu_{12}}{\mu_{11}^2 + \mu_{11}\mu_{21} + \mu_{11}\mu_{12} + \mu_{21}\mu_{12}} = 2,$$

which yields the desired result (i.e., the bound 2 is tight for $\alpha = 1$ as $B \to \infty$ and $\mu_{11} = \mu_{22} \to \infty$). The same arguments can be used to obtain the tight upper bound of 2 when $\mu_{11} \leq \mu_{22}$. On the other hand, it immediately follows from the expression in (2) that employing the policy described in part (i) of Theorem 2.1 when $\alpha > C_{22}$ yields arbitrarily worse throughput performance as $\alpha$ gets larger.

Replacing $\mu_{11}$ ($\mu_{21}$) with $\mu_{21}$ ($\mu_{11}$) and $\mu_{22}$ ($\mu_{12}$) with $\mu_{12}$ ($\mu_{22}$) in the above expressions provides the ratio of $g_0$ to $T_{\text{expedite}}$ when $\mu_{21}\mu_{12} \geq \mu_{11}\mu_{22}$, and using the arguments above, one can obtain the same results in this case.

3 Results for More General Systems

In this section, we first focus on more general Markovian tandem queues with synergistic servers. After providing various results for these systems, we characterize the class of optimal server assignment policies for tandem lines with general service requirement distributions and generalist servers. Our results for these two models are reported in Sections 3.1 and 3.2, respectively.

3.1 Markovian Systems with Synergistic Servers

To analyze more general Markovian systems, we will need the following notation. Denote the number of servers at each station $j \in \{1, \ldots, N\}$ under action $a_{\sigma_1 \ldots \sigma_M}$ as $n_j(a_{\sigma_1 \ldots \sigma_M}) = \sum_{i=1}^{M} \mathbb{1}(\sigma_i = j)$. Similarly, for all $j \in \{1, \ldots, N\}$, $r_j(a_{\sigma_1 \ldots \sigma_M}) = \sum_{i=1}^{M} \mathbb{1}(\sigma_i = j)\mu_{ij}$ is the cumulative service rate at station $j$ under action $a_{\sigma_1 \ldots \sigma_M}$ (not taking into account the synergy among servers). For all $a \in A$, define $N_1(a) = \{j : n_j(a) = 1\}$, $N_{1+}(a) = \{j : n_j(a) \geq 1\}$, and $N_{2+}(a) = \{j : n_j(a) \geq 2\}$. Thus, $N_1(a)$, $N_{1+}(a)$, and $N_{2+}(a)$ denote the sets of stations with exactly one server, at least one server, and at least two servers under action $a$, respectively, and $N_1(a) \cup N_{2+}(a) = N_{1+}(a)$. Finally, for notational convenience, we set $R(a) = \sum_{l \in N_1(a)} r_l(a) + \alpha \sum_{l \in N_{2+}(a)} r_l(a)$ for all $a \in A$.
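The following small sketch (ours; the encoding is illustrative) mirrors this notation, representing an action as the tuple $(\sigma_1, \ldots, \sigma_M)$ and computing $N_1(a)$, $N_{2+}(a)$, and $R(a)$:

```python
def station_sets(a, N):
    """N_1(a) and N_{2+}(a) for an action a = (sigma_1, ..., sigma_M)."""
    n = {j: sum(1 for sigma in a if sigma == j) for j in range(1, N + 1)}
    return {j for j in n if n[j] == 1}, {j for j in n if n[j] >= 2}

def R(a, mu, alpha, N):
    """R(a): single-server stations at plain rate, team stations scaled by alpha."""
    r = {j: sum(mu[i][j - 1] for i, sigma in enumerate(a) if sigma == j)
         for j in range(1, N + 1)}
    N1, N2p = station_sets(a, N)
    return sum(r[j] for j in N1) + alpha * sum(r[j] for j in N2p)

mu = [[4.0, 1.0], [1.0, 3.0], [2.0, 2.0]]   # mu[i][j-1] = mu_{ij}, M = 3, N = 2
print(R((1, 2, 2), mu, alpha=1.5, N=2))     # r_1 = 4 alone, team rate 1.5 * 5 at 2
```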

The next theorem states that a fully cross-trained server should never idle. Note that this generalizes Proposition 2.1 of Andradóttir and Ayhan [5], which implies the optimality of nonidling policies for Markovian systems with two stations in tandem, servers capable of working at both stations, and $\alpha = 1$.

Theorem 3.1 Consider a Markovian tandem line with $N \geq 1$ stations and $M \geq 1$ servers. Suppose server $i$ is such that $\mu_{ij} > 0$ for all $j \in \{1, \ldots, N\}$. If

$$\alpha > \frac{\mu_{kj}}{\mu_{ij} + \mu_{kj}} \quad \text{for all } k \in \{1, \ldots, M\},\ k \neq i, \text{ and } j \in \{1, \ldots, N\}, \quad (5)$$

then the optimal policy should not allow server $i$ to idle.

Note that the condition in equation (5) guarantees that when server $i$ works with any server $k$ at any station $j$, their rate as a team is faster than the individual service rate of server $k$ at that station. Moreover, the lower bound on $\alpha$ in equation (5) is strictly less than 1, implying that the servers need not be synergistic for the result of Theorem 3.1 to hold (clearly, the condition is satisfied if the servers are synergistic).

Proof: Let $d^{\infty}$ denote a Markovian stationary deterministic policy that uses the decision rule $d$ at each decision epoch, with $d(s) \in A$ for all $s \in S$. Moreover, assume that under the decision rule $d$, there exists $s_0 \in S$ such that $d(s_0) = a$ and $\sigma_i = 0$ under $a$. Without loss of generality, we assume that $N_{1+}(a) \neq \emptyset$ and $R(a) \neq 0$ (because otherwise $s_0$ is an absorbing state and the long-run average throughput equals zero). For all $j = 1, \ldots, N$, let the action $a^j$ be the same as $a$ except that $\sigma_i = j$ under $a^j$. Note that $\alpha r_l(a^l) - r_l(a) > 0$ for all $l \in N_1(a)$ and $r_l(a^l) - r_l(a) > 0$ for all $l \in N_{2+}(a)$ under our assumptions (see (5)). Now consider the Markovian stationary randomized policy $(d')^{\infty}$ such that $d'(s) = d(s)$ for all $s \in S \setminus \{s_0\}$ and

$$d'(s_0) = a^j \text{ with probability } p_j \text{ for all } j \in N_{1+}(a),$$

where

$$p_j = \frac{R(a^j)\, r_j(a) \prod_{l \in N_1(a),\, l \neq j} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a),\, l \neq j} \left(r_l(a^l) - r_l(a)\right)}{K(a)} \geq 0$$

and

$$K(a) = \sum_{m \in N_{1+}(a)} R(a^m)\, r_m(a) \prod_{l \in N_1(a),\, l \neq m} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a),\, l \neq m} \left(r_l(a^l) - r_l(a)\right) > 0.$$

Thus, $d'(s_0)$ is a randomized action that does not allow server $i$ to idle in state $s_0$. Moreover,

$$R(a^j) = \begin{cases} R(a) + \alpha r_j(a^j) - r_j(a) & \text{if } j \in N_1(a), \\ R(a) + \alpha\left(r_j(a^j) - r_j(a)\right) & \text{if } j \in N_{2+}(a), \end{cases}$$

and hence

$$K(a) = R(a)\left[\sum_{m \in N_{1+}(a)} r_m(a) \prod_{l \in N_1(a),\, l \neq m} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a),\, l \neq m} \left(r_l(a^l) - r_l(a)\right) + \prod_{l \in N_1(a)} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a)} \left(r_l(a^l) - r_l(a)\right)\right]. \quad (6)$$

The transition probabilities out of state $s_0$ for the underlying Markov chain are the same under decision rules $d$ and $d'$. In order to see this, consider the transition probability out of state $s_0$ corresponding to a service completion at station $j \in N_1(a)$. Clearly, this probability is equal to $r_j(a)/R(a)$ under decision rule $d$. Under decision rule $d'$, this transition probability equals

$$p_j \frac{\alpha r_j(a^j)}{R(a^j)} + \sum_{m \in N_{1+}(a),\, m \neq j} p_m \frac{r_j(a)}{R(a^m)} = p_j \frac{\alpha r_j(a^j)}{R(a^j)} - p_j \frac{r_j(a)}{R(a^j)} + \sum_{m \in N_{1+}(a)} p_m \frac{r_j(a)}{R(a^m)}$$
$$= \frac{r_j(a)\left[\prod_{l \in N_1(a)} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a)} \left(r_l(a^l) - r_l(a)\right) + \sum_{m \in N_{1+}(a)} r_m(a) \prod_{l \in N_1(a),\, l \neq m} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a),\, l \neq m} \left(r_l(a^l) - r_l(a)\right)\right]}{K(a)} = \frac{r_j(a)}{R(a)},$$

where the last equality follows from (6). Similarly, one can show that the transition probabilities are the same when $j \in N_{2+}(a)$ (both are equal to $\alpha r_j(a)/R(a)$). However, the total rate out of state $s_0$ under decision rule $d$ is $R(a)$, and the total rate out of state $s_0$ under decision rule $d'$ is equal to

$$\left(\sum_{j \in N_{1+}(a)} \frac{p_j}{R(a^j)}\right)^{-1} = \frac{K(a)}{\sum_{j \in N_{1+}(a)} r_j(a) \prod_{l \in N_1(a),\, l \neq j} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a),\, l \neq j} \left(r_l(a^l) - r_l(a)\right)}$$
$$= R(a) + \frac{R(a) \prod_{l \in N_1(a)} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a)} \left(r_l(a^l) - r_l(a)\right)}{\sum_{j \in N_{1+}(a)} r_j(a) \prod_{l \in N_1(a),\, l \neq j} \left(\alpha r_l(a^l) - r_l(a)\right) \prod_{l \in N_{2+}(a),\, l \neq j} \left(r_l(a^l) - r_l(a)\right)} > R(a),$$

where the last equality again follows from (6). The derivations above and a simple sample path argument show that $(d')^{\infty}$ has a larger long-run average throughput than $d^{\infty}$. But Lemma 4.3.1 of Puterman [13] implies that there exists a policy in $\Pi$ whose throughput performance is at least as good as that of $(d')^{\infty}$. Repeating the same argument for all states where server $i$ idles under the decision rule $d$, it is clear that there is a Markovian stationary deterministic policy that does not allow server $i$ to idle with long-run average throughput strictly larger than the throughput of $d^{\infty}$, and the proof is complete.

Note that the result in Theorem 3.1 is not necessarily correct if $\mu_{ij} = 0$ for some $j \in \{1, \ldots, N\}$. In order to see this, consider a Markovian tandem line with $N = 2$ stations, $M = 3$ servers, and $B = 1$, so that $S = \{0, 1, 2, 3\}$. Suppose that $\alpha = 1$, $\mu_{11} = 3$, $\mu_{12} = \mu_{21} = \mu_{32} = 1$, $\mu_{22} = 8$, and $\mu_{31} = 0$. Consider the stationary policy $\delta^{\infty}$ such that

$$\delta(s) = \begin{cases} \text{all servers work at station 1} & \text{if } s = 0, \\ \text{server 2 works at station 1, servers 1 and 3 work at station 2} & \text{if } s = 1, \\ \text{server 1 works at station 1, server 2 works at station 2, and server 3 idles} & \text{if } s = 2, \\ \text{all servers work at station 2} & \text{if } s = 3. \end{cases}$$

Let $(\delta')^{\infty}$ be the same as $\delta^{\infty}$ in all states except $s = 2$, where $\delta'(2) =$ server 1 works at station 1, servers 2 and 3 work at station 2. Thus, $(\delta')^{\infty}$ does not allow server 3 to idle. But then we have $T^{(\delta')^{\infty}} \approx 2.027 < 2.030 \approx T^{\delta^{\infty}}$.
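The throughput comparison above is easy to reproduce (our sketch): under either policy, $\{X(t)\}$ is a birth-death chain on $\{0, 1, 2, 3\}$, so detailed balance gives the stationary distribution and hence the throughput.

```python
import numpy as np

def bd_throughput(birth, death):
    """Throughput of an irreducible birth-death chain via detailed balance."""
    pi = np.ones(len(birth))
    for s in range(1, len(birth)):
        pi[s] = pi[s - 1] * birth[s - 1] / death[s]
    pi /= pi.sum()
    return float(pi @ np.array(death))

# Rates implied by delta and delta' above (alpha = 1): only state 2 differs,
# where delta' adds server 3 (rate mu32 = 1) to station 2.
T_delta = bd_throughput([4.0, 1.0, 3.0, 0.0], [0.0, 2.0, 8.0, 10.0])
T_delta_prime = bd_throughput([4.0, 1.0, 3.0, 0.0], [0.0, 2.0, 9.0, 10.0])
print(round(T_delta, 3), round(T_delta_prime, 3))   # 2.030 > 2.027
```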

Next we study fully collaborative policies. We call a policy fully collaborative if all servers work in teams of two or more at all times under this policy. Note that the team formations need not be static. The next theorem shows that if $\alpha$ is large enough, the optimal policy is fully collaborative for a certain class of Markovian systems.

Theorem 3.2 Consider a Markovian tandem line with $N = 2$ stations and $M \geq 2$ servers, or $N > 2$ stations and $2 \leq M \leq 4$ servers. Suppose $\mu_{ij} > 0$ for all $i \in \{1, \ldots, M\}$ and all $j \in \{1, \ldots, N\}$. Then if

$$\alpha > \max_{a \in A} \frac{\sum_{j \in N_1(a)} \frac{r_j(a)}{\Sigma_j}}{1 - \sum_{l \in N_{2+}(a)} \frac{r_l(a)}{\Sigma_l}}$$

(with the convention that summation over an empty set is equal to 0), then the optimal policy is fully collaborative.

Proof: Let $d^{\infty}$ denote a Markovian stationary deterministic policy that uses the decision rule $d$ at each decision epoch, with $d(s) \in A$ for all $s \in S$. Assume that under the decision rule $d$, there exists $s_0 \in S$ such that $d(s_0) = a$ with $N_1(a) \neq \emptyset$. Without loss of generality, we assume that $N_{1+}(a) \neq \emptyset$ and $R(a) \neq 0$ (because otherwise $s_0$ is absorbing and the throughput is zero). Now consider the Markovian stationary randomized policy $(d')^{\infty}$ such that $d'(s) = d(s)$ for all $s \in S \setminus \{s_0\}$ and

$$d'(s_0) = a^{j,j,\ldots,j} \text{ with probability } p_j \text{ for all } j \in N_{1+}(a),$$

where

$$p_j = \begin{cases} \dfrac{r_j(a)}{R(a)} & \text{if } j \in N_1(a), \\[2mm] \dfrac{\alpha r_j(a)}{R(a)} & \text{if } j \in N_{2+}(a). \end{cases}$$

Thus, $d'(s_0)$ is a randomized, fully collaborative action.

We now show that the transition probabilities out of state $s_0$ for the underlying Markov chain are the same under decision rules $d$ and $d'$. Consider the transition probability out of state $s_0$ corresponding to a service completion at station $j \in N_1(a)$. Clearly, this probability is equal to $r_j(a)/R(a)$ under decision rule $d$, and it equals $p_j = r_j(a)/R(a)$ under decision rule $d'$. Similarly, the transition probability out of state $s_0$ corresponding to a service completion at station $j \in N_{2+}(a)$ is equal to $\alpha r_j(a)/R(a)$ under both decision rules. However, the total rate out of state $s_0$ under decision rule $d$ is $R(a)$, and the total rate out of state $s_0$ under decision rule $d'$ is equal to

$$\left(\sum_{j \in N_{1+}(a)} \frac{p_j}{\alpha \Sigma_j}\right)^{-1} = \frac{\alpha R(a)}{\sum_{j \in N_1(a)} \frac{r_j(a)}{\Sigma_j} + \alpha \sum_{j \in N_{2+}(a)} \frac{r_j(a)}{\Sigma_j}}.$$

Thus, we can conclude that if

$$\alpha > \frac{\sum_{j \in N_1(a)} \frac{r_j(a)}{\Sigma_j}}{1 - \sum_{j \in N_{2+}(a)} \frac{r_j(a)}{\Sigma_j}}, \quad (7)$$

then the rate out of state $s_0$ under $(d')^{\infty}$ is larger than the rate under $d^{\infty}$. Note that the denominator of the expression in (7) is always positive because for all $a \in A$, $N_{2+}(a)$ can have at most one element when $N_1(a) \neq \emptyset$ and either $N = 2$ or $M \leq 4$. Thus, if $\alpha$ is greater than the lower bound specified in (7), a simple sample path argument shows that $(d')^{\infty}$ has a larger long-run average throughput than $d^{\infty}$. But Lemma 4.3.1 of Puterman [13] implies that there exists a policy in $\Pi$ whose throughput performance is at least as good as that of $(d')^{\infty}$. Thus, there is a fully collaborative Markovian stationary deterministic policy whose long-run average throughput is strictly larger than the throughput of $d^{\infty}$ when $\alpha$ is greater than the lower bound specified in (7). Taking the maximum over all possible server allocations in $A$ with $N_1(a) \neq \emptyset$ yields the desired result (note that the numerator of the right-hand side in (7) equals zero when $N_1(a) = \emptyset$).

Note that for systems with $N > 2$ stations and $M > 4$ servers, the randomized fully collaborative decision rule defined in the proof of Theorem 3.2 may not yield a better throughput performance than a non-fully collaborative decision rule. Moreover, we would like to point out that if $\mu_{ij} = 0$ for some $i \in \{1, \ldots, M\}$ and $j \in \{1, \ldots, N\}$, the optimal policy may not be fully collaborative even when $\alpha$ is large. For example, consider a Markovian tandem line with $N = 2$ stations, $M = 3$ servers, and $B = 0$, so that $S = \{0, 1, 2\}$. Suppose that $\mu_{32} = 0$ and all the other service rates are positive. Consider the stationary policy $\delta^{\infty}$ such that

$$\delta(s) = \begin{cases} \text{all servers work at station 1} & \text{if } s = 0, \\ \text{servers 1 and 2 work at station 2, server 3 works at station 1} & \text{if } s = 1, \\ \text{servers 1 and 2 work at station 2, server 3 is idle} & \text{if } s = 2. \end{cases}$$

In this case, the only fully collaborative policy is the expedite policy. But

$$T^{\delta^{\infty}} - T_{\text{expedite}} = \frac{\alpha \Sigma_1 \Sigma_2^2 \mu_{31}}{\left(\alpha \Sigma_2^2 + \alpha \Sigma_1 \Sigma_2 + \mu_{31} \Sigma_1\right)\left(\Sigma_1 + \Sigma_2\right)},$$

which is strictly bigger than 0 for all $\alpha > 0$.
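Since $A$ is finite, the lower bound in Theorem 3.2 can be evaluated by enumeration. The sketch below (ours; names are illustrative) does this for $N = 2$, using the rates of the four-server example discussed next, for which the bound evaluates to 5:

```python
from itertools import product

def theorem32_bound(mu, N):
    """Maximize (7) over all actions with N_1(a) nonempty (sigma_i = 0 idles).

    mu[i][j-1] = mu_{ij}; all rates are assumed positive, and N = 2 or M <= 4,
    so the denominator of (7) stays positive, as noted in the proof.
    """
    M = len(mu)
    Sigma = [sum(mu[i][j] for i in range(M)) for j in range(N)]
    best = 0.0
    for a in product(range(N + 1), repeat=M):
        count = [sum(1 for sigma in a if sigma == j + 1) for j in range(N)]
        if 1 not in count:
            continue                      # need N_1(a) nonempty
        r = [sum(mu[i][j] for i, sigma in enumerate(a) if sigma == j + 1)
             for j in range(N)]
        num = sum(r[j] / Sigma[j] for j in range(N) if count[j] == 1)
        den = 1.0 - sum(r[j] / Sigma[j] for j in range(N) if count[j] >= 2)
        best = max(best, num / den)
    return best

mu = [[1.0, 4.0], [1.0, 1.0], [4.0, 2.0], [4.0, 1.0]]   # the example below
print(theorem32_bound(mu, N=2))                          # 5.0 (up to rounding)
```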

Note that the lower bound on $\alpha$ in Theorem 3.2 is always greater than or equal to 1. In order to see this, first consider systems with $M = 2$ servers and consider the allocations $a_{12}$ and $a_{21}$. Taking the maximum over these two allocations would yield $C_{22} \geq 1$ (see the statement right before Theorem 2.1). For systems with $M > 2$, first consider the allocation where $\sigma_i = 1$ and $\sigma_k = 2$ for all $k \neq i$, and then consider the allocation $\sigma_i = 2$ and $\sigma_k = 1$ for all $k \neq i$. The first allocation yields the bound $(\mu_{i1}\Sigma_2)/(\mu_{i2}\Sigma_1)$ and the second yields $(\mu_{i2}\Sigma_1)/(\mu_{i1}\Sigma_2)$, which again implies that $\alpha \geq 1$. Moreover, the lower bound on $\alpha$ in Theorem 3.2 is in general strictly bigger than one (for example, when $N = 2$ and $M \geq 2$, it is clear from the above that one needs to have $\frac{\mu_{11}}{\mu_{12}} = \frac{\mu_{21}}{\mu_{22}} = \cdots = \frac{\mu_{M1}}{\mu_{M2}}$ for this bound to be equal to one). Thus, even when the servers are fully cross-trained and synergistic, the optimal policy is not fully collaborative for all $\alpha > 1$ when the service rates are arbitrary (i.e., it takes advantage of the servers' special skills for some $\alpha > 1$). Hence, adding collaboration to a policy when $\alpha > 1$ is not necessarily desirable (see also Theorem 2.1). However, if the servers are generalists, the lower bound is equal to one, as shown in Theorem 3.3 below.

We would also like to point out that in general the bound in Theorem 3.2 is not tight. That is, one can easily construct systems where a fully collaborative policy outperforms all non-fully collaborative policies for values of $\alpha$ smaller than the bound specified in Theorem 3.2 (which one can also intuitively deduce from the proof of the theorem). For example, consider a system with $N = 2$ stations, $M = 4$ servers, and $B = 0$. Suppose that $\mu_{11} = \mu_{21} = \mu_{22} = \mu_{42} = 1$, $\mu_{32} = 2$, and $\mu_{12} = \mu_{31} = \mu_{41} = 4$. In this case, the lower bound in Theorem 3.2 is equal to 5, but the policy that assigns all servers to station 1 when station 2 is starved, all servers to station 2 when station 1 is blocked, and assigns servers 3 and 4 to station 1 and servers 1 and 2 to station 2 otherwise, has a better throughput than all non-idling, non-fully collaborative policies for all $\alpha \geq 1$. However, it follows from Theorem 2.1 in Section 2 that the bound is tight for systems with two stations and two servers.

Note that it is not necessarily the case that increasing $\alpha$ will increase the long-run average throughput of a particular server allocation policy. For example, consider a Markovian tandem line with $N = 2$ stations, $M = 3$ servers, and $B = 1$. Suppose that $\mu_{11} = 100$, $\mu_{12} = 1$, $\mu_{21} = \mu_{22} = \mu_{31} = 0.1$, and $\mu_{32} = 200$. Note that in this case $S = \{0, 1, 2, 3\}$, and consider the stationary policy $\delta^{\infty}$ such that

$$\delta(s) = \begin{cases} \text{all servers work at station 1} & \text{if } s = 0, \\ \text{servers 1 and 2 work at station 1, server 3 works at station 2} & \text{if } s = 1, \\ \text{servers 2 and 3 work at station 1, server 1 works at station 2} & \text{if } s = 2, \\ \text{all servers work at station 2} & \text{if } s = 3. \end{cases}$$

It is easy to compute the long-run average throughput of $\delta^{\infty}$ and to verify that it is decreasing in $\alpha$ when $\alpha \in [0.16, 3.11]$.
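The non-monotonicity claimed above is easy to verify numerically (our sketch; the policy $\delta$ induces a birth-death chain whose team rates scale with $\alpha$ while solo rates do not):

```python
import numpy as np

def throughput_delta(alpha):
    """Throughput of delta above: mu11 = 100, mu12 = 1, mu21 = mu22 = mu31 = 0.1,
    mu32 = 200, B = 1. Team rates carry the factor alpha, solo rates do not."""
    birth = [alpha * 100.2, alpha * 100.1, alpha * 0.2, 0.0]   # states 0..3
    death = [0.0, 200.0, 1.0, alpha * 201.1]
    pi = np.ones(4)
    for s in range(1, 4):
        pi[s] = pi[s - 1] * birth[s - 1] / death[s]
    pi /= pi.sum()
    return float(pi @ np.array(death))

for a in (0.16, 1.0, 2.0, 3.11, 4.0):
    print(a, round(throughput_delta(a), 3))   # dips on [0.16, 3.11], then rises
```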

On the other hand, it is easy to verify that for a Markovian tandem line with $N = 2$ stations, $M \geq 2$ servers, and $B = 0$, the long-run average throughput of any non-idling policy is increasing in $\alpha$. As is mentioned in Section 2, in this case $\{X^{\pi}(t)\}$ is a birth-death process with state space $S = \{0, 1, 2\}$. For a non-idling policy $\pi$ and for all $s \in S$, let $\lambda^{\pi}(s)$ be the birth rate in state $s$ and $\nu^{\pi}(s)$ be the death rate in state $s$. Note that $\lambda^{\pi}(s)$ is non-increasing in $s$ and $\nu^{\pi}(s)$ is non-decreasing in $s$. It follows from Shanthikumar and Yao [15] (see page 437 of [15]) that this network can be equivalently viewed as a two-node cyclic network with 1 customer circulating among the nodes, where node 1 models the death process and node 2 models the birth process. The result then follows from Lemma 14.B.10 of Shanthikumar and Yao [15].

Similarly, consider a Markovian system with $N \geq 1$ stations and $M \geq 1$ servers. Then the long-run average throughput $T^{\pi}$ of a fully collaborative policy $\pi$ is a strictly increasing function of $\alpha$. This result follows immediately from the fact that since all the transition rates of the corresponding continuous time Markov chain under a fully collaborative policy $\pi$ are multiplied by $\alpha$, increasing $\alpha$ does not affect the stationary distribution of the continuous time Markov chain but increases the departure rate. Note that this result also holds for general Markovian queueing networks with infinite buffers as long as the system is stable.

3.2 Systems with Generalist Servers

So far, our focus has been on Markovian systems. We now consider systems with generalist servers where the service requirements have arbitrary distributions. In systems with generalist servers, the service rate of each server at each station can be expressed as the product of two constants, one representing the server's speed at every task and the other representing the intrinsic difficulty of the task at the station. Thus, $\mu_{ij} = \mu_i \gamma_j$ for all $i \in \{1, \ldots, M\}$ and $j \in \{1, \ldots, N\}$. Our objective is to generalize the result of Theorem 3.2 to systems with arbitrary service requirement distributions and generalist servers. Moreover, $\Pi$ may now include all possible server assignment policies.

The next result states that when servers are generalists, the lower bound on $\alpha$ that guarantees that a fully collaborative policy is optimal is 1. This is not surprising because when the servers are generalists, there is no tradeoff between the servers' special skills and server synergy. On the other hand, when $\alpha = 1$ (i.e., the service rates are additive), it has been shown in Andradóttir, Ayhan, and Down [7] that any non-idling policy is optimal. Thus, when the servers are synergistic and generalists, the optimal policy takes full advantage of this synergy, and a non-idling policy is no longer sufficient to achieve optimal throughput.

Theorem 3.3 Assume that for each $j = 1, \ldots, N$, the service requirements $S_{k,j}$ of customer $k \geq 1$ at station $j$ are i.i.d. with mean 1. Moreover, assume that for all $t \geq 0$, if there is a customer in service at station $j$ at time $t$, then the expected remaining service requirement at station $j$ of that customer is bounded above by a scalar $1 \leq \bar{S} < \infty$. Finally, assume that service is either non-preemptive or preemptive-resume. If $\mu_{ij} = \mu_i \gamma_j$ for all $i = 1, \ldots, M$ and $j = 1, \ldots, N$ and $\alpha > 1$, then for all $0 \leq B_1, B_2, \ldots, B_{N-1} < \infty$, any fully collaborative policy $\pi$ is optimal, with long-run average throughput

$$T^{\pi} = \frac{\alpha \sum_{i=1}^{M} \mu_i}{\sum_{j=1}^{N} \frac{1}{\gamma_j}}.$$
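For $N = 2$, this formula can be cross-checked against the expedite-policy throughput from Section 2 (our sketch, with illustrative parameters): with $\mu_{ij} = \mu_i \gamma_j$, the expression $\alpha \Sigma_1 \Sigma_2 / (\Sigma_1 + \Sigma_2)$ collapses to the formula in Theorem 3.3.

```python
# Illustrative parameters (ours): three generalist servers, two stations.
alpha, mus, gammas = 1.7, [1.0, 2.5, 0.5], [2.0, 3.0]
S1 = sum(m * gammas[0] for m in mus)          # Sigma_1 = gamma_1 * sum(mu_i)
S2 = sum(m * gammas[1] for m in mus)          # Sigma_2 = gamma_2 * sum(mu_i)
expedite = alpha * S1 * S2 / (S1 + S2)        # two-station expedite throughput
theorem33 = alpha * sum(mus) / sum(1.0 / g for g in gammas)
print(expedite, theorem33)                    # both 8.16
```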

Proof: Define $A^{\pi}(t)$ as the number of customers that have entered the system by time $t$ under policy $\pi \in \Pi$. Then $A^{\pi}(t) = Q^{\pi}(t) + D^{\pi}(t)$, where $Q^{\pi}(t)$ denotes the number of customers in the system at time $t$ under policy $\pi \in \Pi$. Since $Q^{\pi}(t) \leq \sum_{j=1}^{N-1} B_j + N$ for all $t \geq 0$ and for all $\pi \in \Pi$, we have

$$T^{\pi} = \limsup_{t \to \infty} \frac{\mathbb{E}[D^{\pi}(t)]}{t} = \limsup_{t \to \infty} \frac{\mathbb{E}[A^{\pi}(t)]}{t}. \quad (8)$$

Our model is equivalent to one where the service requirements of successive customers at station $j \in \{1, \ldots, N\}$ are i.i.d. with mean $1/\gamma_j$ and the service rates depend only on the server (i.e., $\mu_{ij} = \mu_i$ for all $i \in \{1, \ldots, M\}$). Let $\pi$ be a fully collaborative policy and define $W^{\pi,p}(t)$ as the total work performed by time $t$ by all servers under the policy $\pi$. Then $W^{\pi,p}(t) = \alpha \sum_{i=1}^{M} \mu_i t$. Let $S_k = \sum_{j=1}^{N} S_{k,j}/\gamma_j$ be the total service requirement (in the system) of customer $k$ for all $k \geq 1$. Let $W^{\pi}(t) = \sum_{k=1}^{A^{\pi}(t)} S_k$ and let $W^{\pi,r}(t) = W^{\pi}(t) - W^{\pi,p}(t)$ be the total remaining service requirement (work) at time $t$ for the customers that entered the system by time $t$. We have

$$\mathbb{E}[W^{\pi,r}(t)] \leq \left(N + \sum_{j=1}^{N-1} B_j\right) \bar{S} \sum_{j=1}^{N} \frac{1}{\gamma_j},$$

which implies that $\lim_{t \to \infty} \mathbb{E}[W^{\pi,r}(t)]/t = 0$ and

$$\lim_{t \to \infty} \frac{\mathbb{E}[W^{\pi}(t)]}{t} = \lim_{t \to \infty} \frac{\mathbb{E}[W^{\pi,p}(t)]}{t} = \alpha \sum_{i=1}^{M} \mu_i. \quad (9)$$

For all $n \geq 0$, let $Z_n = (S_{n,1}, \ldots, S_{n,N})$. Since the event $\{A^{\pi}(t) = n\}$ is completely determined by the random vectors $Z_1, Z_2, \ldots, Z_{n-1}$ (and independent of $Z_n, Z_{n+1}, \ldots$), $A^{\pi}(t)$ is a stopping time for the sequence of random vectors $\{Z_n\}$. Moreover, for all $t \geq 0$, $A^{\pi}(t) \leq K(t) + 1$, where $K(t)$ is the number of customers departing station 1 by time $t$ if all servers work at station 1 at all times and there is unlimited room for completed customers after station 1. Since $\{K(t)\}$ is a nondecreasing process with $\lim_{t \to \infty} \mathbb{E}[K(t)]/t = \alpha \sum_{i=1}^{M} \mu_i \gamma_1 < \infty$ (which follows from the elementary renewal theorem), we have $\mathbb{E}[A^{\pi}(t)] < \infty$ for all $t \geq 0$. Then from Wald's lemma, we have

$$\mathbb{E}[W^{\pi}(t)] = \mathbb{E}\left[\sum_{k=1}^{A^{\pi}(t)} S_k\right] = \mathbb{E}[A^{\pi}(t)] \sum_{j=1}^{N} \frac{1}{\gamma_j}. \quad (10)$$

From (8), (9), and (10), we now have

$$\alpha \sum_{i=1}^{M} \mu_i = \lim_{t \to \infty} \frac{\mathbb{E}[W^{\pi}(t)]}{t} = \lim_{t \to \infty} \frac{\mathbb{E}[A^{\pi}(t)]}{t} \sum_{j=1}^{N} \frac{1}{\gamma_j} = T^{\pi} \sum_{j=1}^{N} \frac{1}{\gamma_j},$$

which yields the desired throughput. The optimality of this throughput follows from equations (8), (9), and (10) and the fact that $W^{\pi,p}(t) \leq \alpha \sum_{i=1}^{M} \mu_i t$ for all $t \geq 0$ and for all server assignment policies $\pi \in \Pi$.

4 Proof of the Main Result

In this section, we prove the main result of the paper, given in Theorem 2.1. We start with the proof of part (i). We only provide the proof for the case $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$ (so that $C_{22} = \frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2}$) because the proof for the case $\mu_{21}\mu_{12} \geq \mu_{11}\mu_{22}$ simply follows by relabeling the servers.

First suppose that $\mu_{1j} = \mu_{2j} = 0$ for some $j \in \{1, 2\}$ (i.e., there is at least one station at which no server is capable of working). Then the long-run average throughput is zero under any policy and the policy described in part (i) of Theorem 2.1 is optimal. Thus, we can assume, without loss of generality, that there exist $i_1, i_2 \in \{1, 2\}$ such that $\mu_{i_1 1} > 0$ and $\mu_{i_2 2} > 0$, and hence $\Sigma_1 > 0$ and $\Sigma_2 > 0$. Similarly, if $\mu_{11}\mu_{22} = \mu_{12}\mu_{21} = 0$, $\Sigma_1 > 0$, and $\Sigma_2 > 0$, then we know that $C_{22} = 1$ and that there exists $i_1 \in \{1, 2\}$ such that $\mu_{i_1 1} = \mu_{i_1 2} = 0$. But Theorem 2.1 of Andradóttir, Ayhan, and Down [7] implies that any policy, including the one stated in part (i) of Theorem 2.1, that does not allow server $i_2 \neq i_1$ to idle is optimal. Since $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$, we have shown that we can assume, without loss of generality, that $\mu_{11} > 0$ and $\mu_{22} > 0$.

The set $A_s$ of allowable actions in state $s$ is given as

$$A_s = \begin{cases} \{a_{11}\} & \text{for } s = 0, \\ \{a_{11}, a_{12}, a_{21}, a_{22}, a_{00}, a_{01}, a_{02}, a_{10}, a_{20}\} & \text{for } s \in \{1, \ldots, B+1\}, \\ \{a_{22}\} & \text{for } s = B+2, \end{cases}$$

where we use a sample path argument to eliminate actions that allow servers to idle in states $s = 0$ and $s = B+2$ (see Lemma 2.1 of Kırkızlar, Andradóttir, and Ayhan [11]). Since the number of possible states and actions are both finite, the existence of an optimal Markovian stationary deterministic policy follows from Theorem 9.1.8 of Puterman [13]. Under our assumptions on the service rates, neither $\mu_{11}$ nor $\mu_{22}$ can be equal to zero. Consequently, the policy described in part (i) of Theorem 2.1 corresponds to an irreducible Markov chain, and, hence, we have a communicating Markov decision process. Therefore, we use the policy iteration algorithm for communicating models (see pages 479 and 480 of Puterman [13]) to prove the optimality of the policy described in Theorem 2.1. Let $P_d$ denote the $(B+3) \times (B+3)$-dimensional probability transition matrix corresponding to the policy described by decision rule $d$. Similarly, $r_d$ is the $(B+3)$-dimensional reward vector corresponding to $d$, with $r_d(s)$ denoting the reward earned in state $s$ under the policy corresponding to $d$, for all $s \in S$.
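Before carrying out the policy iteration argument, the claim of Theorem 2.1 can be verified computationally. Rather than implementing policy iteration for communicating models, the sketch below (ours, and deliberately simplified) brute-forces all decision rules that choose among the non-idling actions $a_{11}, a_{12}, a_{21}, a_{22}$ in the interior states (idling actions are eliminated separately in the proof), assuming all four rates are positive; the winner switches from $a_{12}$ everywhere to teamwork exactly as $\alpha$ crosses $C_{22}$.

```python
import numpy as np
from itertools import product

def bd_throughput_from0(birth, death):
    """Throughput starting from state 0 of a (possibly reducible) BD chain."""
    hi = next(s for s in range(len(birth)) if birth[s] == 0.0)
    lo = max(s for s in range(hi + 1) if death[s] == 0.0)
    pi = {lo: 1.0}
    for s in range(lo + 1, hi + 1):        # recurrent class is [lo, hi]
        pi[s] = pi[s - 1] * birth[s - 1] / death[s]
    Z = sum(pi.values())
    return sum(p * death[s] for s, p in pi.items()) / Z

def best_policy(mu11, mu12, mu21, mu22, alpha, B):
    """Enumerate decision rules over interior states 1..B+1 (a11 is forced in
    state 0 and a22 in state B+2 by the action sets A_s)."""
    S1, S2 = mu11 + mu21, mu12 + mu22
    act = {"a11": (alpha * S1, 0.0), "a12": (mu11, mu22),
           "a21": (mu21, mu12), "a22": (0.0, alpha * S2)}
    best = (-1.0, None)
    for choice in product(sorted(act), repeat=B + 1):
        birth = [alpha * S1] + [act[c][0] for c in choice] + [0.0]
        death = [0.0] + [act[c][1] for c in choice] + [alpha * S2]
        best = max(best, (bd_throughput_from0(birth, death), choice))
    return best

# C22 = 1.55 for these rates: part (i) wins below it, the expedite policy above.
print(best_policy(4.0, 1.0, 1.0, 3.0, alpha=1.4, B=2))  # a12 in every interior state
print(best_policy(4.0, 1.0, 1.0, 3.0, alpha=1.7, B=2))  # a22: servers always team up
```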

As the initial policy of the policy iteration algorithm, we choose

$$d_0(s) = \begin{cases} a_{11} & \text{for } s = 0, \\ a_{12} & \text{for } 1 \leq s \leq B+1, \\ a_{22} & \text{for } s = B+2, \end{cases}$$

corresponding to the policy described in part (i) of Theorem 2.1. Then

$$r_{d_0}(s) = \begin{cases} 0 & \text{for } s = 0, \\ \mu_{22} & \text{for } 1 \leq s \leq B+1, \\ \alpha \Sigma_2 & \text{for } s = B+2, \end{cases}$$

and

$$P_{d_0}(s, s') = \begin{cases} \alpha \Sigma_1/q & \text{for } s = 0,\ s' = 1, \\ 1 - \alpha \Sigma_1/q & \text{for } s = s' = 0, \\ \mu_{22}/q & \text{for } 1 \leq s \leq B+1,\ s' = s-1, \\ 1 - (\mu_{11} + \mu_{22})/q & \text{for } 1 \leq s \leq B+1,\ s' = s, \\ \mu_{11}/q & \text{for } 1 \leq s \leq B+1,\ s' = s+1, \\ \alpha \Sigma_2/q & \text{for } s = B+2,\ s' = B+1, \\ 1 - \alpha \Sigma_2/q & \text{for } s = s' = B+2, \end{cases}$$

where $q$ is the uniformization constant. Since the policy $(d_0)^{\infty}$ (corresponding to the decision rule $d_0$) is irreducible, we find a scalar $g_0$ and a vector $h_0$ solving

$$r_{d_0} - g_0 e + (P_{d_0} - I)h_0 = 0, \quad (11)$$

subject to $h_0(0) = 0$, where $e$ is a column vector of ones and $I$ is the identity matrix. Then

$$g_0 = \frac{\alpha \Sigma_1 \Sigma_2 \sum_{j=0}^{B+1} \mu_{11}^j \mu_{22}^{B+1-j}}{\alpha \Sigma_1 \Sigma_2 \sum_{j=0}^{B} \mu_{11}^j \mu_{22}^{B-j} + \mu_{22}^{B+1} \Sigma_2 + \mu_{11}^{B+1} \Sigma_1}, \quad (12)$$

$h_0(0) = 0$, and

$$h_0(s) = \frac{g_0 q}{\alpha \Sigma_1}\left(\sum_{j=0}^{s-2} (j+1)\, \mu_{22}^{s-j-2}\, \mu_{11}^{j-s+1} \left(\alpha \Sigma_1 - \mu_{11} + \mu_{22}\right) + s\right) - q\, \mu_{22} \sum_{j=0}^{s-2} (j+1)\, \mu_{22}^{s-j-2}\, \mu_{11}^{j-s+1}$$

for $1 \leq s \leq B+2$ constitute a solution to equation (11). Note that $g_0$ is the long-run average throughput of the policy $(d_0)^{\infty}$ and $h_0$ denotes the bias vector under the policy $(d_0)^{\infty}$ (see pages 338 and 339 of Puterman [13] for the interpretation of the bias vector).
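The evaluation step can be checked numerically (our sketch, with illustrative rates): build $P_{d_0}$ and $r_{d_0}$, solve (11) with $h_0(0) = 0$ as a linear system in $(g_0, h_0(1), \ldots, h_0(B+2))$, and compare the gain with the closed form (12).

```python
import numpy as np

mu11, mu12, mu21, mu22, alpha, B = 4.0, 1.0, 1.0, 3.0, 1.4, 2
S1, S2, n = mu11 + mu21, mu12 + mu22, B + 3
q = alpha * (S1 + S2)                    # any valid uniformization constant
P, r = np.zeros((n, n)), np.zeros(n)
P[0, 1] = alpha * S1 / q; P[0, 0] = 1 - P[0, 1]
for s in range(1, B + 2):
    P[s, s - 1], P[s, s + 1] = mu22 / q, mu11 / q
    P[s, s] = 1 - (mu11 + mu22) / q
    r[s] = mu22
P[n - 1, n - 2] = alpha * S2 / q; P[n - 1, n - 1] = 1 - P[n - 1, n - 2]
r[n - 1] = alpha * S2
# Equation (11) with h0(0) = 0: unknowns x = (g0, h0(1), ..., h0(B+2)).
A = np.zeros((n, n))
A[:, 0] = 1.0                            # coefficient of g0 in each row
A[:, 1:] = -(P - np.eye(n))[:, 1:]       # -(P - I)h0, column for h0(0) dropped
g0_numeric = np.linalg.solve(A, r)[0]
g0_closed = (alpha * S1 * S2 * sum(mu11**j * mu22**(B + 1 - j) for j in range(B + 2))
             / (alpha * S1 * S2 * sum(mu11**j * mu22**(B - j) for j in range(B + 1))
                + mu22**(B + 1) * S2 + mu11**(B + 1) * S1))
print(g0_numeric, g0_closed)             # both about 3.3470 for these rates
```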

For all $s \in S$ and $a \in A_s$, let $r(s, a)$ be the immediate reward obtained when action $a$ is chosen in state $s$, and let $p(j \mid s, a)$ be the probability of going to state $j$ in one step when action $a$ is chosen in state $s$. For the next step of the policy iteration algorithm, we choose

$$d_1(s) \in \arg\max_{a \in A_s} \left\{ r(s, a) + \sum_{j \in S} p(j \mid s, a)\, h_0(j) \right\}, \quad s \in S,$$

setting $d_1(s) = d_0(s)$ if possible. We now show that $d_1(s) = d_0(s)$ for all $s \in S$. In particular, for all $s \in S \setminus \{0, B+2\}$ and $a \in A_s \setminus \{d_0(s)\}$, we will compute the differences

$$\Delta(s, a) = r(s, d_0(s)) + \sum_{j \in S} p(j \mid s, d_0(s))\, h_0(j) - \left( r(s, a) + \sum_{j \in S} p(j \mid s, a)\, h_0(j) \right)$$

and show that the differences are non-negative. Note that for $s = 0$ and $s = B+2$, there is nothing to prove because there is only one possible action in these states, namely $d_0(0) = a_{11}$ and $d_0(B+2) = a_{22}$. For $1 \leq s \leq B+1$, we have that $d_0(s) = a_{12}$. Since the set $A_s$ of all possible actions is large, in the interest of space, we will specify $\Delta(s, a)$ only for actions $a_{11}$, $a_{21}$, and $a_{22}$. With some algebra we obtain

$$\Delta(s, a_{11}) = \frac{\alpha \Sigma_1 \mu_{11}^{B+1-s} \sum_{j=0}^{s-1} \mu_{11}^j \mu_{22}^{s-1-j} \left(2\mu_{11}\mu_{22} - \alpha \Sigma_1 \Sigma_2 + \mu_{21}\mu_{22} + \mu_{11}\mu_{12}\right)}{\alpha \Sigma_1 \Sigma_2 \sum_{j=0}^{B} \mu_{11}^j \mu_{22}^{B-j} + \mu_{22}^{B+1} \Sigma_2 + \mu_{11}^{B+1} \Sigma_1} \geq 0$$

since $\alpha \leq \frac{2\mu_{11}\mu_{22} + \mu_{21}\mu_{22} + \mu_{11}\mu_{12}}{\Sigma_1 \Sigma_2} = \frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2}$, and the expression is equal to zero only when $\alpha$ attains the upper bound. Similarly,

$$\Delta(s, a_{21}) = \frac{\Gamma(s, \alpha)}{\alpha \Sigma_1 \Sigma_2 \sum_{j=0}^{B} \mu_{11}^j \mu_{22}^{B-j} + \mu_{22}^{B+1} \Sigma_2 + \mu_{11}^{B+1} \Sigma_1},$$

where $\Gamma(s, \alpha) = \Gamma_1(s, \alpha) + \Gamma_2(s, \alpha)$ with

$$\Gamma_1(s, \alpha) = \sum_{j=B-s+1}^{B} \mu_{11}^j \mu_{22}^{B-j} \Sigma_1 \left(\alpha(\mu_{11} - \mu_{21})\Sigma_2 + \mu_{21}\mu_{22} - \mu_{11}\mu_{12}\right)$$

and

$$\Gamma_2(s, \alpha) = \sum_{j=-1}^{B-s} \mu_{11}^{j+1} \mu_{22}^{B-j-1} \Sigma_2 \left(\alpha(\mu_{22} - \mu_{12})\Sigma_1 + \mu_{11}\mu_{12} - \mu_{21}\mu_{22}\right).$$

We next prove that $\Gamma(s, \alpha) \geq 0$ by showing that $\Gamma_1(s, \alpha) \geq 0$ and $\Gamma_2(s, \alpha) \geq 0$ when $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$ and $1 \leq \alpha \leq C_{22}$. Note that if $\mu_{11} \geq \mu_{21}$, then $\Gamma_1(s, \alpha)$ is a non-decreasing function of $\alpha$ and

$$\Gamma_1(s, 1) = \sum_{j=B-s+1}^{B} \mu_{11}^j \mu_{22}^{B-j} \Sigma_1 \left(\mu_{11}\mu_{22} - \mu_{12}\mu_{21}\right) \geq 0$$

(where the equality is attained if and only if $\mu_{11}\mu_{22} = \mu_{12}\mu_{21}$), implying that $\Gamma_1(s, \alpha) \geq 0$ for all $\alpha \geq 1$. If $\mu_{11} < \mu_{21}$, then $\Gamma_1(s, \alpha)$ is a decreasing function of $\alpha$. But

$$\Gamma_1\left(s, \frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2}\right) = 2 \sum_{j=B-s+1}^{B} \mu_{11}^{j+1} \mu_{22}^{B-j} \left(\mu_{11}\mu_{22} - \mu_{12}\mu_{21}\right) \geq 0$$

(where the equality is attained if and only if $\mu_{11}\mu_{22} = \mu_{12}\mu_{21}$), implying that $\Gamma_1(s, \alpha) \geq 0$ for all $\alpha \leq C_{22}$. Similarly, if $\mu_{22} \geq \mu_{12}$, $\Gamma_2(s, \alpha)$ is a non-decreasing function of $\alpha$ and

$$\Gamma_2(s, 1) = \sum_{j=-1}^{B-s} \mu_{11}^{j+1} \mu_{22}^{B-j-1} \Sigma_2 \left(\mu_{11}\mu_{22} - \mu_{12}\mu_{21}\right) \geq 0$$

(where the equality is attained if and only if $\mu_{11}\mu_{22} = \mu_{12}\mu_{21}$), implying that $\Gamma_2(s, \alpha) \geq 0$ for all $\alpha \geq 1$.

On the other hand, if $\mu_{22} < \mu_{12}$, then $\Gamma_2(s, \alpha)$ is a decreasing function of $\alpha$. However,

$$\Gamma_2\left(s, \frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2}\right) = 2 \sum_{j=-1}^{B-s} \mu_{11}^{j+1} \mu_{22}^{B-j} \left(\mu_{11}\mu_{22} - \mu_{12}\mu_{21}\right) \geq 0$$

(where the equality is attained if and only if $\mu_{11}\mu_{22} = \mu_{12}\mu_{21}$), implying that $\Gamma_2(s, \alpha) \geq 0$ for all $\alpha \leq C_{22}$. Finally,

$$\Delta(s, a_{22}) = \frac{\alpha \Sigma_2 \sum_{j=0}^{B+1-s} \mu_{11}^j \mu_{22}^{B-j} \left(2\mu_{11}\mu_{22} - \alpha \Sigma_1 \Sigma_2 + \mu_{21}\mu_{22} + \mu_{11}\mu_{12}\right)}{\alpha \Sigma_1 \Sigma_2 \sum_{j=0}^{B} \mu_{11}^j \mu_{22}^{B-j} + \mu_{22}^{B+1} \Sigma_2 + \mu_{11}^{B+1} \Sigma_1} \geq 0$$

because $\alpha \leq \frac{2\mu_{11}\mu_{22} + \mu_{21}\mu_{22} + \mu_{11}\mu_{12}}{\Sigma_1 \Sigma_2} = C_{22}$, and the expression is equal to zero only when $\alpha$ attains the upper bound.

Proceeding as above, we have also shown that $\Delta(s, a) \geq 0$ for $a \in \{a_{00}, a_{01}, a_{02}, a_{10}, a_{20}\}$, with equality occurring only when $\mu_{11}\mu_{22} = \mu_{12}\mu_{21}$ or $\alpha = C_{22}$. This proves that $d_1(s) = d_0(s)$ for all $s \in S$. By Theorem 9.5.1 of Puterman [13], this proves that the policy described in part (i) of Theorem 2.1 is optimal. The proof of the uniqueness of the optimal policy is similar to the uniqueness proof in Theorem 3.1 of Andradóttir and Ayhan [5].

We now consider the proof of part (ii) of Theorem 2.1. Note that if $\mu_{ij} > 0$ for all $i \in \{1, 2\}$ and $j \in \{1, 2\}$, then part (ii) of Theorem 2.1 follows directly from Theorem 3.2, but the proof below is valid even when $\mu_{ij} = 0$ for some $i \in \{1, 2\}$ and $j \in \{1, 2\}$. Without loss of generality, we again only provide the proof for the case with $\mu_{11}\mu_{22} \geq \mu_{21}\mu_{12}$, $\Sigma_1 > 0$, and $\Sigma_2 > 0$ (so that $C_{22} = \frac{\mu_{11}}{\Sigma_1} + \frac{\mu_{22}}{\Sigma_2}$). Note that we again have a communicating Markov decision process (e.g., to go from state $s_1$ to some state $s_2$, assign both servers to station 1 for all $s < s_2$, both servers to station 2 for all $s > s_2$, and both servers to station 1 (2) if $s_2 = 0$ ($s_2 > 0$)). Thus, as in the proof of part (i), we use the policy iteration algorithm for communicating Markov decision processes. This time, as the initial policy of the policy iteration algorithm, we choose

$$d_0'(s) = \begin{cases} a_{11} & \text{for } s = 0, \\ a_{22} & \text{for } 1 \leq s \leq B+2, \end{cases}$$

corresponding to the policy described in part (ii) of Theorem 2.1. Then

$$r_{d_0'}(s) = \begin{cases} 0 & \text{for } s = 0, \\ \alpha \Sigma_2 & \text{for } 1 \leq s \leq B+2, \end{cases}$$

and

$$P_{d_0'}(s, s') = \begin{cases} \alpha \Sigma_1/q & \text{for } s = 0,\ s' = 1, \\ 1 - \alpha \Sigma_1/q & \text{for } s = s' = 0, \\ \alpha \Sigma_2/q & \text{for } 1 \leq s \leq B+2,\ s' = s-1, \\ 1 - \alpha \Sigma_2/q & \text{for } 1 \leq s \leq B+2,\ s' = s, \end{cases}$$

where $q$ is the uniformization constant.

Since the policy $(d_0')^{\infty}$ (corresponding to the decision rule $d_0'$) has a unichain structure, we again find a scalar $g_0'$ and a vector $h_0'$ solving (11) subject to $h_0'(0) = 0$. Then

$$g_0' = T_{\text{expedite}} = \frac{\alpha \Sigma_1 \Sigma_2}{\Sigma_1 + \Sigma_2},$$

$h_0'(0) = 0$, and

$$h_0'(s) = \frac{s\, q\, \Sigma_2}{\Sigma_1 + \Sigma_2} \quad \text{for } 1 \leq s \leq B+2$$

constitute a solution to equation (11). For the next step of the policy iteration algorithm, we again choose

$$d_1'(s) \in \arg\max_{a \in A_s} \left\{ r(s, a) + \sum_{j \in S} p(j \mid s, a)\, h_0'(j) \right\}, \quad s \in S,$$

setting $d_1'(s) = d_0'(s)$ if possible, and show that $d_1'(s) = d_0'(s)$ for all $s \in S$. In particular, for all $s \in S \setminus \{0, B+2\}$ and $a \in A_s \setminus \{d_0'(s)\}$, we compute the differences

$$\Delta'(s, a) = r(s, d_0'(s)) + \sum_{j \in S} p(j \mid s, d_0'(s))\, h_0'(j) - \left( r(s, a) + \sum_{j \in S} p(j \mid s, a)\, h_0'(j) \right)$$

and show that the differences are non-negative. For $1 \leq s \leq B+1$, we have that $d_0'(s) = a_{22}$. In the interest of space, we will again specify $\Delta'(s, a)$ only for actions $a_{11}$, $a_{12}$, and $a_{21}$. With some algebra, we obtain $\Delta'(s, a_{11}) = 0$, which implies that both actions $a_{11}$ and $a_{22}$ can be used in states $1 \leq s \leq B+1$ (see the comment on the execution of the expedite policy after the statement of Theorem 2.1). Similarly,

$$\Delta'(s, a_{12}) = \frac{\alpha \Sigma_1 \Sigma_2 - 2\mu_{11}\mu_{22} - \mu_{11}\mu_{12} - \mu_{21}\mu_{22}}{\Sigma_1 + \Sigma_2} > 0$$

because $\alpha > \frac{2\mu_{11}\mu_{22} + \mu_{11}\mu_{12} + \mu_{21}\mu_{22}}{\Sigma_1 \Sigma_2} = C_{22}$. Finally,

$$\Delta'(s, a_{21}) = \frac{\alpha \Sigma_1 \Sigma_2 - 2\mu_{12}\mu_{21} - \mu_{11}\mu_{12} - \mu_{21}\mu_{22}}{\Sigma_1 + \Sigma_2} > 0$$

because $\alpha > C_{22} \geq \frac{2\mu_{12}\mu_{21} + \mu_{11}\mu_{12} + \mu_{21}\mu_{22}}{\Sigma_1 \Sigma_2} = \frac{\mu_{21}}{\Sigma_1} + \frac{\mu_{12}}{\Sigma_2}$. We have also shown that $\Delta'(s, a) > 0$ for $a \in \{a_{00}, a_{01}, a_{02}, a_{10}, a_{20}\}$. This proves that $d_1'(s) = d_0'(s)$ for all $s \in S$ and, hence, the expedite policy is optimal. The proof of the uniqueness of the optimal policy (subject to the interpretation that the expedite policy can be executed in various ways as long as the two servers work in a team at all times) is again similar to the uniqueness proof in Theorem 3.1 of Andradóttir and Ayhan [5].

5 Concluding Remarks

We considered tandem lines with finite buffers and flexible, synergistic servers where the combined rate of a server team is proportional to the sum of the rates of the individual servers. For Markovian systems with two stations in tandem and two servers, we have shown that the policy that is optimal for systems with additive service rates remains optimal if the synergy among servers is small. However, as the servers become more synergistic (so that collaboration gets more effective), the expedite policy becomes optimal. Thus, for Markovian systems with two stations and two servers, the optimal policy abruptly switches from one where the servers work together only if they have to (i.e., they have no work at their primary assignments) to one where they work together at all times. Moreover, employing the wrong one among these two policies can result in significant losses in throughput performance. For systems with generalist servers, an arbitrary number of stations and servers, and general service requirement distributions, we proved that any fully collaborative policy is optimal. Hence, when the servers have no special skills, the optimal policy always takes full advantage of the synergy among the servers. Finally, we were able to identify how much synergy is sufficient to guarantee that the optimal policy is non-idling or fully collaborative for certain classes of Markovian systems.

To the best of our knowledge, this paper is the first to analyze systems with servers who work more effectively in teams than on their own. Our results suggest that when the combined rate of a team of synergistic servers is proportional to the sum of the rates of the individual servers, there is a tradeoff between exploiting synergies arising from the differences between workers and synergies arising from their collaboration, with collaboration becoming more desirable as it becomes more effective (not surprisingly). In the future, we will investigate whether such structural results also hold for systems with more general models for the team service rates. In particular, we will allow the proportionality constant $\alpha$ to depend on both the team size and the station that the team is working at. Allowing $\alpha$ to depend on the team size is important because it takes into account the fact that there may be an optimal team size (e.g., it may be beneficial to assign two or three workers to a team, but assigning more workers than that may yield diminishing returns), and allowing $\alpha$ to depend on the station captures the fact that collaboration may not be equally beneficial for all tasks. We are also interested in studying systems with different classes of servers, with the synergy between servers in a team depending on the classes the servers belong to. Finally, we are currently studying systems where the servers are not synergistic (i.e., $\alpha < 1$). This case seems more complex than the synergistic case, but our observations so far indicate that for systems with two stations in tandem and two servers, the optimal policy depends on whether one server is better than the other one at both stations or not, and may involve some server idling when $\alpha$ is small.

References

[1] Ahn, H.-S., I. Duenyas, and R. Zhang. (1999) Optimal Scheduling of a 2-stage Tandem Queue with Parallel Servers, Advances in Applied Probability, 31, 1095–1117.

[2] Ahn, H.-S., and R. Righter. (2006) Dynamic Load Balancing with Flexible Workers, Advances in Applied Probability, 38, 621–642.

[3] Akşin, O. Z., M. Armony, and V. Mehrotra. (2007) The Modern Call-Center: A Multi-Disciplinary Perspective on Operations Management Research, Production and Operations Management, 16, 665–688.

[4] Akşin, O. Z., F. Karaesmen, and E. L. Örmeci. (2007) A Review of Workforce Cross-Training in Call Centers from an Operations Management Perspective, Chapter 8 in Workforce Cross Training Handbook, ed. D. Nembhard, CRC Press, Boca Raton, FL.

[5] Andradóttir, S., and H. Ayhan. (2005) Throughput Maximization for Tandem Lines with Two Stations and Flexible Servers, Operations Research, 53, 516–531.

[6] Andradóttir, S., H. Ayhan, and D. G. Down. (2001) Server Assignment Policies for Maximizing the Steady-State Throughput of Finite Queueing Systems, Management Science, 47, 1421–1439.

[7] Andradóttir, S., H. Ayhan, and D. G. Down. (2007) Dynamic Assignment of Dedicated and Flexible Servers in Tandem Lines, Probability in the Engineering and Informational Sciences, 21, 497–538.

[8] Gans, N., G. Koole, and A. Mandelbaum. (2003) Telephone Call Centers: Tutorial, Review, and Research Prospects, Manufacturing and Service Operations Management, 5, 79–141.

[9] Hopp, W. J., and M. P. Van Oyen. (2004) Agile Workforce Evaluation: A Framework for Cross-training and Coordination, IIE Transactions, 36, 919–940.

[10] Kaufman, D. L., H.-S. Ahn, and M. E. Lewis. (2005) On the Introduction of an Agile, Temporary Workforce into a Tandem Queueing System, Queueing Systems: Theory and Applications, 51, 135–171.

[11] Kırkızlar, H. E., S. Andradóttir, and H. Ayhan. (2009) Robustness of Effective Server Assignment Policies to Service Time Distributions, Naval Research Logistics, under revision.

[12] Örmeci, E. L. (2004) Dynamic Admission Control in a Call Center with One Shared and Two Dedicated Service Facilities, IEEE Transactions on Automatic Control, 49, 1157–1161.

[13] Puterman, M. L. (1994) Markov Decision Processes. John Wiley & Sons, New York, NY.