How Hard are Steady-State Queueing Simulations?

ERIC CAO NI and SHANE G. HENDERSON, Cornell University

Some queueing systems require tremendously long simulation runlengths to obtain accurate estimators of certain steady-state performance measures when the servers are heavily utilized. However, this is not uniformly the case. We analyze a number of single-station Markovian queueing models, demonstrating that several steady-state performance measures can be accurately estimated with modest runlengths. Our analysis reinforces the meta result that if the queue is well dimensioned, then simulation runlengths will be modest. Queueing systems can be well dimensioned because customers abandon if they are forced to wait in line too long, or because the queue is operated in the quality and efficiency driven regime where servers are heavily utilized but wait times are short. The results are based on computing or bounding the asymptotic variance and bias for several standard single-station queueing models and performance measures.

Categories and Subject Descriptors: G.3 [Probability and Statistics]: Markov Processes, Queueing Theory; I.6.6 [Simulation and Modeling]: Output Analysis

General Terms: Design, Performance, Theory

Additional Key Words and Phrases: Diffusion approximations, Markovian queues, asymptotic variance

ACM Reference Format: Eric C. Ni and Shane G. Henderson. How hard are steady-state queueing simulations? ACM Trans. Model. Comput. Simul. V, N, Article A (January YYYY).

1. INTRODUCTION

There is a widely held perception that using simulation to estimate steady-state performance measures for queueing systems with heavily utilized servers is hard. By heavily utilized servers we mean that the fraction of time that the servers are busy is close to 1. By hard we mean that the runlengths needed to obtain narrow confidence intervals with the desired coverage level are very large.
On the contrary, we will argue that for well-dimensioned single-station queueing systems, the simulation runlengths needed to obtain accurate estimates are often modest. Queueing systems can be well dimensioned because customers abandon if they are forced to wait in line too long, or because the queue is operated in the quality and efficiency driven regime where servers are heavily utilized but wait times are short. Our argument is based on extending existing results [Whitt 2006] that support this view to additional single-station queueing models with infinite waiting room and first-come-first-served service discipline. See Srikant and Whitt [1996] for closely related results for loss systems, which we do not explore.

This work is partially supported by the National Science Foundation, under grant CMMI-1200315. Authors' address: School of Operations Research and Information Engineering, Cornell University.

To make this discussion more precise, let $X = (X(t) : t \ge 0)$ be a stochastic process representing the number of customers or jobs in a queueing system as a function of time, and suppose that $X$ possesses a steady state, i.e., there exists a random variable $X(\infty)$, say, for which $X(t) \Rightarrow X(\infty)$ as $t \to \infty$, where $\Rightarrow$ denotes convergence in distribution. Furthermore, let $f : \{0, 1, 2, \ldots\} \to \mathbb{R}$ be a real-valued cost function and suppose we wish to estimate the steady-state performance measure $\alpha = \mathrm{E} f(X(\infty))$. For example, if $f(x) = x$, then our goal is to estimate the mean steady-state number of customers in the system. A natural estimator of $\alpha$ is
\[ \alpha(t) = \frac{1}{t} \int_0^t f(X(s))\, ds. \]
For a wide class of queueing systems and cost functions, it is known that as $t \to \infty$,
\[ \sqrt{t}\,(\alpha(t) - \alpha) \Rightarrow \sigma N(0, 1), \]
where $N(0, 1)$ denotes a (standard) normal random variable with mean 0 and variance 1, and $\sigma^2$ is the asymptotic variance, which is also called the time-average variance. Accordingly, an asymptotic $100(1 - \kappa)\%$ confidence interval for $\alpha$ is $\alpha(t) \pm z_{\kappa/2}\, \sigma t^{-1/2}$, where $z_{\kappa/2}$ is the $1 - \kappa/2$ quantile of a standard normal random variable. The confidence interval halfwidth is $z_{\kappa/2}\, \sigma t^{-1/2}$, which is proportional to $\sigma$. Accordingly, the asymptotic variance $\sigma^2$, or the standard deviation $\sigma$, is an indicator of the absolute accuracy of the estimator $\alpha(t)$. Similarly, relative error, which is perhaps preferable to absolute error, is indicated through the ratio $\sigma^2/\alpha^2$ or instead $\sigma/\alpha$.

Whitt [1989] and Asmussen [1992] explore the magnitude of $\sigma$ for a range of queueing systems and performance measures. The most important of their results shows that for certain queueing systems with a fixed number of servers in which the servers are utilized for a large fraction $\rho < 1$ of the time, when estimating the mean steady-state number of customers in the system, $\sigma$ is typically of order $(1 - \rho)^{-2}$ while $\alpha$ is of order $(1 - \rho)^{-1}$. Accordingly, when $\rho$ is close to 1, $\sigma/\alpha$ is of order $(1 - \rho)^{-1}$, and hence the runlengths required to obtain estimators of $\alpha$ with small relative error are very large. This observation has been exploited within the simulation community in stress testing of output-analysis algorithms. Indeed, the heavily loaded M/M/1 queue is a standard test problem for batching algorithms; see, e.g., Steiger et al. [2005].
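As an illustration of the time-average estimator $\alpha(t)$ and the halfwidth $z\sigma t^{-1/2}$ described above, the following is a minimal Python sketch for an M/M/1 queue started empty with $f(x) = x$. The function names, the seed handling, and the parameter values are illustrative assumptions and do not come from the paper; in practice $\sigma$ would have to be estimated, whereas here it is simply passed in.

```python
import math
import random


def simulate_mm1_time_average(lam, mu, t_end, seed=0):
    """Return alpha(t_end) = (1/t_end) * integral of X(s) ds over [0, t_end]
    for an M/M/1 queue started empty, with cost f(x) = x."""
    rng = random.Random(seed)
    t, x, area = 0.0, 0, 0.0
    while True:
        # Total event rate: arrivals always, departures only if a job is present.
        rate = lam + (mu if x > 0 else 0.0)
        holding = rng.expovariate(rate)
        if t + holding >= t_end:
            area += x * (t_end - t)  # truncate the final sojourn at t_end
            break
        area += x * holding
        t += holding
        # The next event is an arrival with probability lam / rate.
        if rng.random() < lam / rate:
            x += 1
        else:
            x -= 1
    return area / t_end


def ci_half_width(z, sigma, t):
    """Asymptotic confidence interval halfwidth z * sigma / sqrt(t)."""
    return z * sigma / math.sqrt(t)


# Hypothetical example: utilization 0.5, for which the steady-state mean is 1.
estimate = simulate_mm1_time_average(lam=0.5, mu=1.0, t_end=20000.0, seed=1)
print(estimate)
```

Rerunning the sketch with utilization closer to 1 and the same `t_end` gives a rough feel for how much more slowly the estimate settles under heavy load.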
It is now well understood that heavily loaded queueing systems as described above require large simulation runlengths to obtain accurate estimators of steady-state performance measures, at least for steady-state moments of queue size and waiting time. But what of other performance measures, such as the steady-state probability of delay, i.e., that a customer will have to wait for service? Perhaps more importantly, such heavily loaded queues do not necessarily reflect real queueing systems. In reality, customers will not queue forever; a common feature in queueing systems is customer abandonment, where customers leave without receiving service if they have to wait too long. Furthermore, it is usually the case that the number of servers in a queueing system is chosen to ensure good customer service in the sense of short waiting times. We call queueing systems in which customers may abandon, and/or where the number of servers is chosen to deliver short waiting times, well-dimensioned queueing systems. (The notion of dimensioning queueing systems is not ours, although our use of the term well dimensioned is specific to this paper; see Borst et al. 2004.)

The key question considered herein is how hard it is to accurately estimate various steady-state performance measures associated with well-dimensioned queueing systems. To answer this question we compute the asymptotic variance in a range of Markovian queueing models, including the M/M/$\infty$, M/M/c, and M/M/c + M models, and also several diffusion models. We confine our attention to these tractable models, even though performance measures can be computed directly so that simulation is unnecessary, specifically because they are tractable. This allows us to perform the needed calculations. We believe that similar conclusions will hold for many less-tractable queueing systems, partly because many of our results are obtained for diffusion models that are known to accurately approximate more general queues.

Assuming a Poisson arrival process is often appropriate, as justified by the Palm-Khintchine theorem; see Karlin and Taylor [1975, p. 22], Cinlar [1972], and Nelson [2013, p. 7]. Exponential service times are sometimes reasonable, but usually some other distribution is more appropriate. Finally, assuming exponential customer patience times (the +M in the M/M/c + M queue) is not ideal, although the results of Zeltyn and Mandelbaum [2005] suggest that in many queueing systems the value of the density of the patience time distribution at 0 is the key quantity, in which case the full distribution is unimportant and assuming an exponential distribution with rate equal to this density value is an accurate approximation. (See the excellent surveys Dai and He [2011; 2012] for much more on queueing systems with abandonment.) In any case, our goal is to obtain the right order of magnitude of the asymptotic variance, so as long as our results are interpreted as applying to queueing systems that are robust in this sense, we believe that confining attention to Markovian systems is reasonable. So, for example, one should not attempt to extend our conclusions to queueing systems with heavy-tailed interarrival and/or service time distributions, nor to systems in which the sequences of these quantities exhibit long-range dependence. (See Whitt 2002 for much more on such queues.)

In addition to considering the asymptotic variance of estimators, we also consider their asymptotic bias. It turns out that either variance or bias can be more problematic in terms of delivering narrow confidence intervals that have the desired coverage, depending on the performance measure and queueing system.
In fact, bias is often the more important property, at least in certain asymptotic regimes, as previously noted in Srikant and Whitt [1996] for a variety of estimators of loss probabilities in loss models.

Our overall approach and philosophy is mostly adopted from Whitt [2006], who analyzed Markovian single-server queues and infinite-server queues in some detail, along with some results for multiserver queues. Indeed, on p. 4, Whitt stated that "Other important classes of stochastic models should be analyzed in the same way." We actually work with the same stochastic processes that Whitt did, except that we emphasize the phenomenon of customer abandonment, we work with a greater variety of performance measures, we consider what happens in queues when the number of servers is chosen so as to ensure that large backups do not arise, and we use slightly different technical tools, especially for diffusion models. The paper Srikant and Whitt [1996] mentioned above is also relevant. In that paper, asymptotic approximations are derived for the asymptotic variance and bias for four loss-probability estimators in loss systems. Similar calculations to those we employ for diffusion models are used in Wang and Glynn [2014], where the properties of a certain bias reduction scheme are studied.

Our primary contribution is to reinforce the meta result that for well-dimensioned queueing systems, estimating steady-state performance measures using simulation is not hard. This meta result, supported by analysis in Srikant and Whitt [1996] and Whitt [2006] and reinforced here, shows that not only does abandonment or appropriate sizing of server pools relieve congestion (as is well understood), but the benefits extend to simulation models in the sense that the runlengths required to obtain high-quality confidence intervals for a number of steady-state performance measures are modest.

The remainder of this paper is structured as follows.
In Section 2 we explain the mathematical tools used to obtain the asymptotic variance and bias, and the interpretation of those quantities in choosing runlengths that deliver high-quality confidence intervals. Then, in Section 3 we review the so-called efficiency-driven regime, which is the source of the common view that simulating heavily loaded queues is hard.

We then present some results in Section 4 that show that the presence of customer abandonment changes the situation dramatically. In Section 5 we turn to the so-called quality and efficiency driven regime associated with queueing systems with many heavily loaded servers, but where customer wait times are also modest. Finally, in Section 6 we discuss and compare our results.

2. PRELIMINARIES

The primary queueing models we consider in this paper are the M/M/c model with arrival rate $\lambda$, service rate $\mu$ and $c$ servers with $\lambda < c\mu$, and the M/M/c + M model where a patience time is associated with each customer, and each customer is willing to wait in queue only up to its patience time, at which point it abandons, i.e., leaves without receiving service. In these systems, the sequences of customer interarrival times, service times and patience times are mutually independent i.i.d. sequences of exponential random variables.

If $X(t)$ gives the number of customers in the system (for either queueing model) at time $t$, then $X := \{X(t) : t \ge 0\}$ is an irreducible, positive-recurrent continuous-time Markov chain on the state space $S = \{0, 1, 2, \ldots\}$. Let $\pi$ be the unique stationary distribution, and with an abuse of notation, let $\pi(k) = \pi_k = \pi(\{k\})$, $k \ge 0$. Let $f : S \to \mathbb{R}_+$ be a cost function and let $\alpha := \sum_{k=0}^{\infty} f(k)\pi(k)$ be the desired performance measure, namely the expected steady-state cost. We approximate $\alpha$ by
\[ \alpha(t) := \frac{1}{t} \int_0^t f(X(s))\, ds, \tag{1} \]
the average cost over $[0, t]$. The regenerative strong law of large numbers ensures that $\alpha(t) \to \alpha$ as $t \to \infty$ almost surely. See, e.g., Resnick [1992, p. 23, p. 396], and Crane and Iglehart [1974a; 1974b; 1975] for an introduction to the regenerative method for steady-state simulation output analysis.

Let $A$ be the rate matrix for $X$, and define the function $V : S \to [1, \infty)$ by $V(k) = a e^{bk}$ for $k = 0, 1, 2, \ldots$, where we leave $a, b > 0$ unspecified.
It is straightforward to show, for each of our queueing models, that there exist strictly positive constants $a, b, \beta, \delta$ such that
\[ AV(k) \le -\beta V(k) + \delta, \tag{2} \]
for all $k \in S$, which is known as a Lyapunov drift criterion. (In this expression we take $V$ to be a column vector with $k$th component $V(k)$, $k \ge 0$, so that $AV$ is a matrix-vector product.) This condition implies that the chain $X$ is $V$-uniformly ergodic, which allows us to make a number of conclusions below; see Meyn and Tweedie [1993b, Theorem 7.] and also Down et al. [1995]. It turns out that one can also apply this same theory to the diffusion models we consider in this paper to ensure that the same results apply to those models, with only modest modifications, e.g., the rate matrix $A$ in (2) is replaced by the so-called generator of the diffusion process.

The Lyapunov drift criterion (2) implies [Glynn and Meyn 1996, Theorem 4.3] the central limit theorem (CLT)
\[ \sqrt{t}\,(\alpha(t) - \alpha) \Rightarrow N(0, \sigma^2), \tag{3} \]
as $t \to \infty$, where $\Rightarrow$ denotes convergence in distribution, provided that for some $\gamma > 0$, $f^2(k) \le \gamma V(k)$ for all $k$. (An expression for the asymptotic variance constant $\sigma^2$ is given below.) The functions $f$ we consider grow at most linearly, so this condition is assured and the CLT indeed holds.

The CLT establishes that an asymptotic confidence interval for $\alpha$ is given by
\[ \alpha(t) \pm \frac{z\sigma}{\sqrt{t}}, \tag{4} \]
where $z$ is an appropriate quantile of the standard normal distribution. (In practice we must replace $\sigma$ with an estimator thereof, but that is not important for our present purpose.) If we want the halfwidth of this confidence interval to be smaller than some absolute error tolerance $\epsilon > 0$, then we require that the simulation runlength $t \ge z^2\sigma^2/\epsilon^2$. Hence, the (asymptotic) variance constant $\sigma^2$ provides information on the accuracy of the estimator $\alpha(t)$ in terms of the amount of simulated time that is required to obtain a narrow confidence interval. This remains true if we want to assure that the halfwidth of the confidence interval, relative to the true performance measure, is smaller than some relative error tolerance $\epsilon > 0$. In that case we require
\[ t \ge \frac{z^2\sigma^2}{\epsilon^2\alpha^2}. \tag{5} \]
(While relative error is typically the more relevant quantity, there is no additional work to obtaining absolute error as well, so we discuss both measures.)

The estimator $\alpha(t)$ is almost always biased, owing to the fact that $X(0)$ cannot usually be generated from the stationary distribution $\pi$. This bias can deteriorate the coverage probability of the confidence interval (4). Suppose that we initiate the chain $X$ in some fixed state $x$, and let $\mathrm{E}_x$ and $\mathrm{P}_x$ denote the corresponding expectation and probability. As in, e.g., Proposition 2. of Awad and Glynn [2007], the bias is then
\[
\begin{aligned}
\mathrm{E}_x \alpha(t) - \alpha &= \frac{1}{t}\, \mathrm{E}_x \int_0^t [f(X(s)) - \alpha]\, ds \\
&= \frac{1}{t} \int_0^t [\mathrm{E}_x f(X(s)) - \alpha]\, ds \qquad (6) \\
&= \frac{1}{t} \int_0^\infty [\mathrm{E}_x f(X(s)) - \alpha]\, ds - \frac{1}{t} \int_t^\infty [\mathrm{E}_x f(X(s)) - \alpha]\, ds \\
&= \frac{g(x)}{t} + o(t^{-1}), \qquad (7)
\end{aligned}
\]
where the bias constant
\[ g(x) = \int_0^\infty [\mathrm{E}_x f(X(s)) - \alpha]\, ds, \]
provided that the interchange (6) is valid, and that $\int_0^\infty |\mathrm{E}_x f(X(s)) - \alpha|\, ds < \infty$. These conditions are satisfied for our examples as assured by (2); see Down et al. [1995].

If we consider the (standard) regime where the desired confidence interval halfwidth $\epsilon \to 0$, then the required runlength according to the relative error criterion (5) is of the order $\epsilon^{-2}$.
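The runlength requirements implied by the confidence interval criteria above, and by asking that the leading bias term $g(x)/t$ be at most $\epsilon$ in magnitude, can be sketched as small helper functions. This is only a sketch: the function names and the default $z = 1.96$ are assumptions made for illustration, and the bias rule simply rearranges $|g(x)|/t \le \epsilon$.

```python
import math


def runlength_absolute(sigma2, eps, z=1.96):
    """Smallest t with z * sigma / sqrt(t) <= eps, i.e. t >= z^2 sigma^2 / eps^2."""
    return z ** 2 * sigma2 / eps ** 2


def runlength_relative(sigma2, alpha, eps, z=1.96):
    """Smallest t with (z * sigma / sqrt(t)) / alpha <= eps, as in criterion (5)."""
    return z ** 2 * sigma2 / (eps ** 2 * alpha ** 2)


def runlength_for_bias(g_x, eps):
    """Smallest t with |g(x)| / t <= eps, using only the leading bias term."""
    return abs(g_x) / eps


# Hypothetical inputs: sigma^2 = 4, alpha = 2, g(x) = -3, tolerance 0.1, z = 2.
print(runlength_absolute(4.0, 0.1, z=2.0))
print(runlength_relative(4.0, 2.0, 0.1, z=2.0))
print(runlength_for_bias(-3.0, 0.1))
```

Comparing the first two outputs with the third for a given model indicates whether the variance or the bias is the binding constraint on the runlength.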
For such a runlength, the asymptotic bias is, according to (7), of the order $\epsilon^2$, which is asymptotically negligible compared to the confidence interval halfwidth. We instead consider a different asymptotic regime, where $\epsilon$ is held fixed and the limiting behavior of $\sigma$ and $g(x)$ are considered as a function of some other quantity, such as the arrival rate of customers and/or the number of servers, in order to understand how desired runlengths scale with these quantities. We adopt the philosophy that we want to ensure that confidence intervals are of a desired width and the coverage of the confidence interval is not unduly affected by bias. From this perspective, it is important that bias is small relative to the confidence interval width. The confidence interval width is of the order $\sigma t^{-1/2}$ while the bias is of the order $g(x)/t$. In order to achieve a narrow confidence interval, we must choose $t$ so that $t^{1/2}$ is large relative to $\sigma$, i.e., $t$ is large relative to $\sigma^2$. Likewise, to ensure that the bias $g(x)/t$ is small, we must take $t$ large relative to $g(x)$. Relative to the simulation runlength $t$ then, the appropriate comparison is between variance $\sigma^2$ and bias $g(x)$. This may seem strange if one is used to measuring the quality of an estimator

through its mean-squared error, where variance and squared bias are often balanced. The difference arises from our goal of having the bias be negligible relative to the confidence interval width. If we instead consider relative error, then the relative confidence interval width (relative to the performance measure $\alpha$) is proportional to $(\sigma/\alpha)t^{-1/2}$ and the relative bias is $(g(x)/\alpha)/t$, so in terms of desired runlengths we then compare $\sigma^2/\alpha^2$ with $g(x)/\alpha$. In the limiting regimes we consider, the confidence interval width criterion can dominate the bias criterion or vice versa, and there are also situations where neither criterion dominates. When the bias dominates the variance, or is of the same order as the variance, then confidence interval coverage will be affected, and one might turn to bias mitigation schemes such as initial transient deletion or careful choice of the initial conditions.

But how can we compute the variance $\sigma^2$ and bias constant $g(x)$ for a particular model and choice of parameters? It is known (see Meyn and Tweedie 1993a, Section 7.4 for the result for discrete-time chains, and Steckley and Henderson 2006, Section 6 for a direct proof for the continuous-time chains corresponding to our queueing models) that
\[ Ag = -\bar{f} := -(f - \alpha), \tag{8} \]
where $A$ is the rate matrix of the chain and $\bar{f}$ is the centered cost function in the sense that $\pi^\top \bar{f} = \pi^\top f - \alpha = 0$. (Here $^\top$ denotes the usual matrix transpose.) In fact, $g$ is the $\pi$-integrable solution of these equations that satisfies $\pi^\top g = 0$. We can therefore compute $g$, and hence the bias constant $g(x)$ for any initial condition $X(0) = x$, by identifying the $\pi$-integrable solution to Poisson's equation (8) that satisfies $\pi^\top g = 0$. It turns out that this also allows us to compute $\sigma$, because [Glynn and Meyn 1996, Theorem 4.3]
\[ \sigma^2 = 2 \sum_{k=0}^{\infty} \bar{f}(k)\, g(k)\, \pi(k). \tag{9} \]

Thus, in the sections to come, we will compute the stationary distribution $\pi$ of the appropriate Markov process, use this to compute $\alpha = \pi^\top f$ and hence $\bar{f} = f - \alpha$ where $f$ represents the performance measure in question, solve (8) for the $\pi$-integrable solution $g$ satisfying $\pi^\top g = 0$, and hence obtain the bias constant $g(x)$ for any fixed initial condition ($X(0) = x$), and compute the variance constant using (9). The magnitude of these quantities then tells us how hard it is to estimate certain steady-state performance measures of Markovian queues using simulation.

Whitt [2006] uses a very similar approach for continuous-time Markov chains, with the key differences being that we emphasize the phenomenon of customer abandonment, we work with a greater variety of performance measures, we consider what happens in queues when the number of servers is chosen so as to ensure that large backups do not arise, and we use a slightly different version of Poisson's equation. For birth-death processes the methodology above is the same as that of Whitt [1992], except that we use what Whitt calls the alternate form of Poisson's equation. For diffusions we work with the infinitesimal generator of the process, as employed in Glynn and Meyn [1996]. A similar agenda could be followed to analyze estimators other than those considered here, provided that they can be represented as a time average for a suitably defined cost function $f(\cdot)$ as in (1).

3. THE EFFICIENCY-DRIVEN REGIME

Consider as performance measure the steady-state expected number of customers in system (the expected occupancy), so we take $f(k) = k$. In this section we analyze this

performance measure within what is known as the efficiency-driven regime, first looking at the M/M/c special case and then at general GI/GI/c queues. The results in this section are known, but our derivations are included in an appendix because the method of derivation is instructive of our general approach and is new in some cases that we clarify there.

One might be tempted to apply a similar analysis to the steady-state delay probability, i.e., the steady-state probability that a customer will have to wait. In doing so, one might exploit the Poisson arrivals see time averages property, e.g., Wolff [1989, Section 5.6], taking $f(k) = I(k \ge c)$, i.e., $f(k)$ equals 1 if $k \ge c$ and 0 otherwise. Indeed, we were so tempted, but as pointed out by a referee, in the efficiency-driven regime, this delay probability converges to 1, so there is (asymptotically) no value in using simulation if the error precision $\epsilon$ remains fixed. Moreover, the neglected term in the bias approximation (7) can in fact be non-negligible in the regime we consider, so we do not attempt to analyze this performance measure in this section. More refined tools are needed.

3.1. The M/M/c Queue

Suppose we initiate a simulation of the M/M/c queue with $X(0) = 0$, and consider the efficiency-driven regime where we keep $c$ and $\mu$ fixed while $\lambda \to c\mu$ from below, i.e., $\rho \to 1$ from below. From (7) the bias in the estimator $t^{-1} \int_0^t X(s)\, ds$ is asymptotically $g(0)/t$, which calculations in the appendix show is asymptotically $-c^{-1}(1 - \rho)^{-3} t^{-1}$ (taking $\mu = 1$). The asymptotic variance is of the order $4c^{-1}(1 - \rho)^{-4}/t$ as $\rho \to 1$. (These values agree with the M/M/1 special case in Whitt [2006].)

Recall from the discussion in Section 2 that to obtain a desired absolute error (confidence interval halfwidth) of $\pm\epsilon$, the required simulation runlength $t$ is $z^2\sigma^2/\epsilon^2$. For a 95% confidence interval, $z \approx 2$, so if $\mu = 1$, then the desired runlength is $4\sigma^2\epsilon^{-2} \sim 16c^{-1}(1 - \rho)^{-4}\epsilon^{-2}$ as $\rho \to 1$.
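To give a feel for the scale of the variance-driven runlength $16c^{-1}(1 - \rho)^{-4}\epsilon^{-2}$, here is a worked instance. The parameter values $c = 1$, $\mu = 1$, $\epsilon = 0.1$ are chosen purely for illustration, and the asymptotic expression is treated as if it were exact, which is only an approximation away from the limit $\rho \to 1$.

```latex
\begin{align*}
\rho = 0.9:&\quad t \approx 16 \cdot (0.1)^{-4} \cdot (0.1)^{-2}
            = 16 \times 10^{4} \times 10^{2} = 1.6 \times 10^{7}, \\
\rho = 0.99:&\quad t \approx 16 \cdot (0.01)^{-4} \cdot (0.1)^{-2}
            = 16 \times 10^{8} \times 10^{2} = 1.6 \times 10^{11}.
\end{align*}
```

Under these assumptions, moving the utilization from 0.9 to 0.99 multiplies the required simulated time by a factor of $10^4$, in units of the mean service time.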
To ensure the magnitude of the asymptotic bias, $|g(0)|/t$, is smaller than $\epsilon$, we require a runlength that is of the order $c^{-1}(1 - \rho)^{-3}\epsilon^{-1}$. Consequently, as $\rho \to 1$, the variance is the dominant criterion.

Considering relative error rather than absolute error, the simulation runlength needed is asymptotically $t = z^2\sigma^2/(\epsilon^2\alpha^2)$, which is of the order $16c^{-1}(1 - \rho)^{-2}\epsilon^{-2}$. Also, to ensure that the magnitude of the bias relative to $\alpha$, $|g(0)|/(t\alpha)$, is smaller than $\epsilon$ requires a runlength of order $c^{-1}(1 - \rho)^{-2}\epsilon^{-1}$, which is of the same order (in terms of $\rho$) as that required from the perspective of the confidence interval width. Nevertheless the constant multipliers ensure that variance is the primary driver of runlengths. These conclusions reinforce similar conclusions given in Whitt [2006] for the M/M/1 queue.

One way to potentially reduce bias is to choose the initial state to be representative of steady-state conditions, which one might interpret as meaning taking $X(0) = (1 - \rho)^{-1}$, the approximate steady-state mean. In the appendix we compute the exact solution to Poisson's equation and then obtain its order as $\rho \to 1$. This enables us to conclude that, when estimating the mean occupancy, the bias constant is $g((1 - \rho)^{-1}) \sim -(2c\mu)^{-1}(1 - \rho)^{-3}$, which is of the same order as $g(0)$ so, at least in order, bias is not reduced.

3.2. The GI/GI/c Queue

The results above shed light on what happens in heavily loaded Markovian queues. The assumption that the arrival process is Poisson is often easily justified, owing to the Palm-Khintchine theorem; see, e.g., Karlin and Taylor [1975, p. 22], Cinlar [1972], and Nelson [2013, p. 7]. However, service times are often not well modeled as exponential random variables, with, e.g., the lognormal distribution often fitting empirical data. We now review the GI/GI/c queue where the sequences of interarrival and service times are independent and each consists of i.i.d. random variables. Such queues defy exact analysis in general. We rely on a reflected Brownian motion approximation for the queue-size process due to Iglehart and Whitt [1970a; 1970b]. See Whitt [2002, Theorems 10.2.1 and 10.2.3] for a recent review. We develop similar results to those of Whitt [1989] and Whitt [2006] using the tools sketched in Section 2. The derivations are given in the appendix.

Let $X_\rho = (X_\rho(s) : s \ge 0)$ be the stochastic process giving the number of customers in the system over time as a function of $\rho$, the utilization of the servers. Iglehart and Whitt [1970a; 1970b] established that $X_\rho$ can be approximated by a reflected Brownian motion (RBM) on $[0, \infty)$ with drift $-\eta$ and variance $\delta^2$, where $\eta = c\mu(1 - \rho)$ and $\delta^2 = c\mu((c\mu\sigma_U)^2 + (\mu\sigma_V)^2)$. (The exact sense in which this approximation is appropriate is described in the appendix.) We take this approximation as exact in the sense that we compute results (bias and variance constants) for the approximating RBM rather than the original intractable queueing model, and use those to develop our conclusions.

Consider the steady-state mean occupancy. The bias constant when the simulation is initiated at 0 is of the order $(1 - \rho)^{-3}$ as seen in our M/M/c results. The variance $\sigma^2$ is of order $(1 - \rho)^{-4}$. Thus, exactly as with the M/M/c results, from the perspective of absolute error the variance dominates, while from the perspective of relative error, both variance and bias are of the same order, so that bias mitigation schemes should be considered. Accordingly, we come to the same conclusions for general GI/GI/c queues that we did for the M/M/c queue in that the bias becomes important to consider as $\rho \to 1$.

We might try to mitigate bias by initializing the simulation in the (deterministic) state corresponding to the steady-state mean of the approximating RBM. In doing so, the initial bias when estimating the mean occupancy remains of order $(1 - \rho)^{-3}$.
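The orders quoted above for the approximating RBM can be recovered by a short calculation. The sketch below is not the derivation from the paper's appendix; it assumes an RBM with drift $-\eta$, variance $\delta^2$, reflection at 0, generator $\tfrac{\delta^2}{2} g'' - \eta g'$ with boundary condition $g'(0) = 0$, and the standard exponential stationary distribution with mean $m = \delta^2/(2\eta)$, applied to $f(x) = x$.

```latex
\begin{align*}
&\text{Poisson's equation: } \tfrac{\delta^2}{2}\, g''(x) - \eta\, g'(x) = -(x - m),
  \qquad g'(0) = 0, \qquad m = \frac{\delta^2}{2\eta}. \\
&\text{Trying } g'(x) = x/\eta:\quad
  \tfrac{\delta^2}{2\eta} - x = m - x, \text{ so } g(x) = \frac{x^2}{2\eta} + C. \\
&\text{Centering (} \mathrm{E}\, g(X(\infty)) = 0,\ \mathrm{E} X(\infty)^2 = 2m^2 \text{):}\quad
  C = -\frac{m^2}{\eta}. \\
&g(0) = -\frac{m^2}{\eta}, \qquad g(m) = -\frac{m^2}{2\eta}, \\
&\sigma^2 = 2\, \mathrm{E}\big[(X(\infty) - m)\, g(X(\infty))\big]
  = \frac{\mathrm{E} X(\infty)^3 - m\, \mathrm{E} X(\infty)^2}{\eta}
  = \frac{6m^3 - 2m^3}{\eta} = \frac{4m^3}{\eta}.
\end{align*}
```

As a consistency check under these assumptions, taking exponential interarrival and service times gives $\delta^2 \approx 2c\mu$ and $m \approx (1 - \rho)^{-1}$, so that $g(0) \approx -(c\mu)^{-1}(1 - \rho)^{-3}$, $g(m) \approx -(2c\mu)^{-1}(1 - \rho)^{-3}$ and $\sigma^2 \approx 4(c\mu)^{-1}(1 - \rho)^{-4}$, in line with the M/M/c orders of Section 3.1.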
Unfortunately, our tools are too crude to quantify the benefits from initiating a simulation of the queue from the steady-state distribution of the diffusion (or an analog thereof in the original queueing model), since we are confining our analysis to diffusion models and for the diffusion the initial bias is then exactly 0.

3.3. The M/M/c + M Queue

Zeltyn and Mandelbaum [2005] defined an ED regime for queues with abandonment in an asymptotic setting where the number of servers and the arrival rate both increase, while the patience time and service time parameters remain constant. They assumed that $c = c(\lambda) = (1 - \gamma)\lambda/\mu$ where $\gamma \in (0, 1)$ is fixed. Thus the queue has insufficient servers to meet demand. As a result, some fraction of customers must abandon to ensure stability, and this fraction approaches $\gamma$ as $\lambda \to \infty$. We do not analyze this queueing system in this paper, because we believe that the quality and efficiency driven regime that we analyze later is almost always more relevant in practice; see Dai and He [2011; 2012] for more discussion about this regime.

4. THE IMPACT OF ABANDONMENT

In the models we considered in the previous section, customers are willing to wait indefinitely, and this leads to very large queue sizes and persistent periods of congestion with the associated very large asymptotic variance constants. However, in almost all true queueing systems, customers will not wait indefinitely, and this can lead to dramatic differences in performance. Consider the M/M/c + M (or Erlang-A) queue in which customers are only willing to wait for an exponentially distributed patience time with mean $\theta^{-1} \in (0, \infty)$. Patience times of successive customers are i.i.d. and independent of the sequences of interarrival and service times.

[Figure 1: three surface plots, with panels $\rho = 0.95$, $\rho = 1$ and $\rho = 1.2$, showing $\log_{10}(\sigma^2)$ against $c$ and $\log_{10}(\theta)$.]

Fig. 1. Asymptotic variance $\sigma^2$ for the average number of jobs in the system under $\mu = 1$.

4.1. The M/M/$\infty$ Queue

Suppose that $\theta = \mu$ so that the mean patience time and mean service times are the same. In this case, the queue-size stochastic process $X = (X(t) : t \ge 0)$ coincides with that of the M/M/$\infty$ queue. Even if $\theta \ne \mu$, the stochastic process $X$ is stochastically dominated by the queue size in an infinite-server queue with service rate $\min\{\mu, \theta\}$. Therefore, the M/M/$\infty$ queue is an interesting first model to consider.

Let $\rho = \lambda/\mu$. (We use this notation even though $\rho$ no longer represents the server utilization, which is 0.) Whitt [2006] showed that when estimating the mean steady-state number of customers in the system, the bias constant is $-\rho/\mu$ and the asymptotic variance constant is $2\rho/\mu$. We conclude that in terms of absolute error, the asymptotic variance and bias are both of the same order in the regime where $\lambda \to \infty$ with $\mu$ held constant. Consequently, to ensure satisfactory confidence interval coverage, bias reduction must be explicitly considered. Interestingly, Whitt [2006] shows that when one considers relative error in this same regime, then the bias becomes the dominant criterion. This happens because the runlength required to achieve a given confidence interval width relative to the true performance measure $\rho$ is proportional to $1/\rho$, while the bias relative to $\rho$ remains constant.

4.2. The M/M/c + M Queue

In general, when $0 < \theta \ne \mu$, the solution to Poisson's equation can be computed but is complicated, and we turn to numerical experimentation to illustrate the effect of abandonment. We report computational results for the asymptotic variance $\sigma^2$ and asymptotic bias under different levels of $\lambda$, $c$ and $\theta$, with $\mu = 1$ held fixed, for the expected steady-state number of customers in the system.
Additional numerical results for the performance measures steady-state probability of delay and steady-state probability of abandonment are reported in Section 5.

Figure 1 shows that $\sigma^2$ decreases significantly in the presence of abandonment relative to the no-abandonment, ED-regime case. Inspecting the plots, we see that for $\theta < \mu$, we have, approximately, that $\sigma^2 \propto \theta^{-2}$, which is similar to the M/M/$\infty$ case where $\sigma^2 \propto \mu^{-2}$, except that the abandonment rate $\theta$ replaces the service rate $\mu$. Recalling that $\sigma^2 \propto (1 - \rho)^{-4}$ in the ED regime for the M/M/c queue, this result suggests that the reduction in asymptotic variance is of order $\theta^2 (1 - \rho)^{-4}$, even when $\theta \ll \mu$, i.e., the abandonment rate is a small fraction of the service rate. Furthermore, the plateau we see in the plot of variance when $\rho = 0.95$ suggests that when $\rho < 1$, the variance constant $\sigma^2$ is upper bounded by the M/M/c variance as $\theta \to 0$. Also, when $\rho \ge 1$, the queue without abandonment would be overloaded, but with abandonment the results are very much like those for an M/M/$\infty$ queue with service rate $\theta$.

Recall that in the M/M/$\infty$ queue, the bias constant differs from $\sigma^2$ by a multiplicative constant $-2$. We observe a similar scaling relationship in the M/M/c + M case in the plots of Figure 2, which are approximately proportional to the plots in Figure 1.
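A numerical experiment of this kind can be sketched as follows. This is not the authors' code: it assumes the state space is truncated at a hypothetical level `n_max`, and it exploits the birth-death structure (detailed balance) to solve Poisson's equation through the differences $g(k+1) - g(k) = \sum_{j > k} \bar{f}(j)\pi(j) / (\lambda \pi(k))$ rather than through a general linear solve. The function name and arguments are illustrative.

```python
import numpy as np


def erlang_a_constants(lam, mu, theta, c, n_max):
    """Return (alpha, g, sigma2) for f(k) = k in an M/M/c+M queue whose state
    space is truncated at n_max. Setting theta = 0 gives an M/M/c queue."""
    k = np.arange(n_max + 1)
    # Death rate in state k: busy servers complete service, waiting jobs abandon.
    death = mu * np.minimum(k, c) + theta * np.maximum(k - c, 0)
    # Stationary distribution from the birth-death product formula (log scale).
    log_w = np.concatenate(([0.0], np.cumsum(np.log(lam / death[1:]))))
    w = np.exp(log_w - log_w.max())
    pi = w / w.sum()
    f = k.astype(float)
    alpha = pi @ f
    fbar = f - alpha  # centered cost function
    weighted = fbar * pi
    # tail[k] = sum over j > k of fbar(j) * pi(j)
    tail = np.concatenate((np.cumsum(weighted[::-1])[::-1][1:], [0.0]))
    # Differences d[k] = g(k+1) - g(k) implied by Poisson's equation.
    d = tail[:-1] / (lam * pi[:-1])
    g = np.concatenate(([0.0], np.cumsum(d)))
    g -= pi @ g  # normalize so that pi' g = 0
    sigma2 = 2.0 * np.sum(fbar * g * pi)
    return alpha, g, sigma2


# Hypothetical parameters: theta = mu, so the chain behaves like M/M/infinity.
alpha, g, sigma2 = erlang_a_constants(lam=5.0, mu=1.0, theta=1.0, c=3, n_max=80)
print(alpha, g[0], sigma2)
```

With $\theta = \mu$ the output can be compared against the M/M/$\infty$ values quoted in Section 4.1, namely $\sigma^2 = 2\rho/\mu$ and a bias constant of magnitude $\rho/\mu$ (negative for an empty start). The truncation level needs to be large enough that the neglected tail probability is negligible.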

[Figure 2: three panels, for $\rho = 0.95$, $\rho = 1$ and $\rho = 1.2$, each plotting $\log(|\text{bias}|)$ against $c$ and $\log(\theta)$.]
Fig. 2. Absolute asymptotic bias for the average number of jobs in the system under $\mu = 1$.

5. THE QUALITY AND EFFICIENCY DRIVEN REGIME

In this section, we consider Markovian queues operating in the Halfin-Whitt regime, named in honor of Halfin and Whitt [1981]. It is now also known as the quality and efficiency driven (QED) regime, a name coined by Avi Mandelbaum, because not only are customers served quickly, but the servers are also heavily utilized. This regime is most relevant for systems with moderate to large numbers of servers, so we will be interested in asymptotics where both the arrival rate $\lambda$ and the number of servers $c$ increase, with the service rate $\mu$ held fixed. More precisely, we require that for some finite constant $\beta$, $(1-\rho)\sqrt{c} \to \beta$ as $c \to \infty$, where $\rho = \lambda/(c\mu)$. Hence, for a given value of $c$, the arrival rate is $\lambda = c\mu - \beta\mu\sqrt{c}$. When there is no abandonment ($\theta = 0$) we must have $\beta > 0$ so that the system is stable, but this restriction is not necessary when the abandonment rate $\theta > 0$, since abandonment stabilizes the system.

We continue to think of hardness in terms of the simulation runlength $t$ needed to obtain high-quality confidence intervals, although this is imperfect in the following sense. The computational effort required to simulate to simulated time $t$ is proportional to the number of random variates generated over the interval $[0, t]$, which is proportional to $\lambda t$. In the asymptotic regime considered here both $c$ and $\lambda$ increase without bound, so the computational effort required to simulate to time $t$ is better represented by $\lambda t$ than by $t$ alone. In previous sections, where $\lambda$ was bounded, these quantities were equivalent in order, but now that $\lambda \to \infty$, they are not.
Nevertheless, we continue to estimate and report the asymptotic variance and bias constants, which imply a desirable $t$, and which can in turn be scaled by $\lambda$ (or $c\mu$, since $c\mu$ and $\lambda$ are of the same order in the asymptotic regime we consider) to estimate the computational effort required.

Exact calculations for the continuous-time Markov chain models can be performed, but it appears to be difficult to extract insight from the results. Accordingly, we employ a combination of analytical results from diffusion models and numerical results for continuous-time Markov chain models.

5.1. The M/M/c Queue

Consider a sequence of M/M/c queueing systems, indexed by $c = 1, 2, \ldots$. All systems have a fixed service rate $\mu$ and are assumed to start out empty. The arrival rate in the $c$th system is chosen to ensure that $\sqrt{c}(1-\rho)$ is constant and equal to $\beta > 0$, where $\rho = \lambda/(c\mu)$. Let $X_c = (X_c(t) : t \ge 0)$ be the stochastic process giving the number of customers in the system over time in the $c$th system. Halfin and Whitt [1981] proved that
$$\frac{X_c(\cdot) - c}{\sqrt{c}} \Rightarrow Y(\cdot) \tag{10}$$

as $c \to \infty$, where $Y$ is a diffusion on $(-\infty, \infty)$ with drift function
$$\mu(y) = \begin{cases} -\beta\mu, & y > 0, \\ -\mu(\beta + y), & y \le 0, \end{cases}$$
and infinitesimal variance $2\mu$, with $Y(0) = 0$. (In fact, Halfin and Whitt [1981] proved a version of this result for a sequence of GI/M/c queues, but we restrict attention to a Poisson arrival process.) The convergence result (10) suggests the process approximation
$$X_c(\cdot) \approx c + \sqrt{c}\, Y(\cdot). \tag{11}$$
We take this approximation as an equality, which then allows us to obtain a number of insights that agree with our numerical results from exact calculations for M/M/c models. In other words, we redefine $X_c$ to be the right-hand side of (11), which is a diffusion, and compute the order of magnitude of the variance and bias for our performance measures for these diffusions that are indexed by $c$. The scaling relationship makes this calculation quite tractable, but it is certainly not trivial, because the asymptotic bias depends on growth rates in the solution to Poisson's equation for the process $Y(\cdot)$, which we therefore need to obtain.

To begin, consider the cost function $f(x) = x$, so that we wish to estimate the expected steady-state number of customers in the system, with estimator $t^{-1}\int_0^t X_c(s)\,ds$. We can compute the asymptotic variance and bias of this estimator as in Section 3.2, but to emphasize the role of the scaling we relate these quantities to similar ones associated with the process $Y$. Let $g_c$ be the desired solution to Poisson's equation for $X_c$ and $f(x) = x$, and let $g_Y$ be the solution to Poisson's equation for $Y$ and $f(y) = y$. Let $\alpha_c = Ef(X_c(\infty))$ be the expected steady-state cost for $X_c$, and define $\alpha_Y$ similarly, so that $\alpha_c = c + \sqrt{c}\,\alpha_Y$. The functions $g_c$ and $g_Y$ are related, since
$$\begin{aligned}
g_c(x) &= \int_0^\infty E[X_c(t) - \alpha_c \mid X_c(0) = x]\,dt \\
&= \int_0^\infty E\big[c + \sqrt{c}\,Y(t) - (c + \sqrt{c}\,\alpha_Y) \mid (X_c(0) - c)/\sqrt{c} = (x - c)/\sqrt{c}\big]\,dt \\
&= \sqrt{c} \int_0^\infty E[Y(t) - \alpha_Y \mid Y(0) = (x - c)/\sqrt{c}]\,dt \\
&= \sqrt{c}\, g_Y((x - c)/\sqrt{c}).
\end{aligned}$$
Thus, the asymptotic bias constant is $g_c(0) = \sqrt{c}\, g_Y(-\sqrt{c})$, and so we will need to compute $g_Y$.
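Before doing so, consider the calculation of the asymptotic variance. Let $\sigma_c^2$ be the asymptotic variance of $X_c$ and $\sigma_Y^2$ be the asymptotic variance of $Y$ (for the cost function $f(x) = x$). Let $\pi_c$ and $\pi_Y$ be the stationary densities of $X_c$ and $Y$ respectively, and note that $\pi_c(x) = c^{-1/2}\,\pi_Y(c^{-1/2}(x - c))$. Hence, substituting $y = c^{-1/2}(x - c)$,
$$\begin{aligned}
\sigma_c^2 &= 2\int_{-\infty}^{\infty} (x - \alpha_c)\, g_c(x)\, \pi_c(x)\,dx \\
&= 2\int_{-\infty}^{\infty} \sqrt{c}\left(\frac{x - c}{\sqrt{c}} - \alpha_Y\right) \sqrt{c}\, g_Y(c^{-1/2}(x - c))\, \frac{1}{\sqrt{c}}\, \pi_Y(c^{-1/2}(x - c))\,dx \\
&= 2c\int_{-\infty}^{\infty} (y - \alpha_Y)\, g_Y(y)\, \pi_Y(y)\,dy \\
&= c\,\sigma_Y^2,
\end{aligned}$$
so that $\sigma_c^2$ grows linearly in $c$, with multiplicative constant $\sigma_Y^2$.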
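To see what the linear growth $\sigma_c^2 = c\,\sigma_Y^2$ implies for runlengths, the following back-of-envelope sketch combines it with the QED arrival rate $\lambda = c\mu - \beta\mu\sqrt{c}$ and the usual normal-approximation half-width $z\sqrt{\sigma^2/t}$. The function names, the default $z = 1.96$, and the numerical value used for $\sigma_Y^2$ are illustrative assumptions, not quantities taken from the paper.

```python
import math

def qed_arrival_rate(c, mu, beta):
    """Arrival rate lambda = c*mu - beta*mu*sqrt(c) of the c-th QED system."""
    return c * mu - beta * mu * math.sqrt(c)

def runlength_for_halfwidth(sigma2, eps, z=1.96):
    """Runlength t at which the half-width z*sqrt(sigma2/t) equals eps."""
    return z * z * sigma2 / (eps * eps)

def qed_mean_queue_plan(c, mu, beta, sigma2_y, eps):
    """Runlength and effort for estimating the mean number in system.

    sigma2_y is a placeholder for the diffusion variance constant, which
    would have to be computed or estimated separately.
    """
    lam = qed_arrival_rate(c, mu, beta)
    sigma2_c = c * sigma2_y                 # scaling relation derived above
    t = runlength_for_halfwidth(sigma2_c, eps)
    return {"lambda": lam, "sigma2_c": sigma2_c,
            "runlength": t, "effort": lam * t}

# Hypothetical inputs: sigma2_y = 2.0 and an absolute half-width of 0.5.
small = qed_mean_queue_plan(100, 1.0, 1.0, sigma2_y=2.0, eps=0.5)
large = qed_mean_queue_plan(400, 1.0, 1.0, sigma2_y=2.0, eps=0.5)
```

For a fixed absolute half-width the runlength grows in proportion to $c$, and the effort $\lambda t$ grows faster still because $\lambda$ itself is of order $c$.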

So now we return to obtaining the asymptotic bias $\sqrt{c}\, g_Y(-\sqrt{c})$, for which we need to compute $g_Y$, the $\pi_Y$-integrable solution to Poisson's equation, with $\pi_Y$ integral 0, that satisfies the differential equation
$$\mu\, g_Y''(y) + \mu(y)\, g_Y'(y) = -y + \alpha_Y.$$
The solution for $y \le 0$ is
$$g_Y(y) = A + \frac{y}{\mu} - \frac{\beta + \alpha_Y}{\mu} \int_y^0 \frac{\Phi(z + \beta)}{\phi(z + \beta)}\,dz, \tag{12}$$
where the constant $A$ does not depend on $c$ and is not important for our purposes. Now we use the fact that
$$\lim_{y \to -\infty} \frac{y\,\Phi(y + \beta)}{\phi(y + \beta)} = -1,$$
so that
$$g_Y(-\sqrt{c}) \approx A - \frac{\sqrt{c}}{\mu} - \frac{\beta + \alpha_Y}{2\mu} \ln c \approx -\frac{\sqrt{c}}{\mu}$$
for large $c$. We conclude that the asymptotic bias is of order $c/\mu$ as $c \to \infty$.

A cautionary note is necessary at this point. The diffusion approximation (11) is most appropriate for measuring fluctuations in the process of order $\sqrt{c}$ around the central value $c$. In considering the bias starting from initial state 0, we are considering a larger fluctuation that is of order $c \gg \sqrt{c}$, so we are extrapolating past the usual range over which we can expect the diffusion approximation to accurately match the dynamics of the continuous-time Markov chain it approximates. If we instead take as initial state $c - a\sqrt{c}$ for some fixed $a$, then the diffusion approximation gives the asymptotic bias constant as $\sqrt{c}\, g_Y(-a)$, which is of order $\sqrt{c}$. So we might expect that the bias starting from initial state 0 is at least of order $\sqrt{c}$, and furthermore that the bias is reduced to order $\sqrt{c}$ by choosing the initial state as $c$ rather than 0. Our numerical experiments below support the view that the bias starting from 0 is of order $c$. Furthermore, as pointed out by a referee, the fluid model also suggests that the asymptotic bias starting from that state is of order $c/\mu$; see Section 6. Hence, the bias and variance are both of the same order, being asymptotically linear in $c$, and the bias can be reduced to order $\sqrt{c}$ by starting from initial state $c$.

Next, consider the cost function $f(x) = I(x \ge c)$, so that we wish to estimate the steady-state probability that an arriving customer must wait, with estimator $t^{-1}\int_0^t I(X_c(s) \ge c)\,ds$.
Let us redefine $g_c$ to be the desired solution to Poisson's equation for $X_c$ and $f(x) = I(x \ge c)$, and let $g_Y$ be the solution to Poisson's equation for $Y$ and $f(y) = I(y \ge 0)$. Redefine $\alpha_c = P(X_c(\infty) \ge c)$ and $\alpha_Y = P(Y(\infty) \ge 0)$ similarly, so that $\alpha_c = \alpha_Y$. Using the same arguments used for the cost function $f(x) = x$, we find that $g_c(x) = g_Y(c^{-1/2}(x - c))$, that $\sigma_c^2 = \sigma_Y^2$ is constant, and that
$$g_Y(y) = A_2 - \frac{\alpha_Y}{\mu} \int_y^0 \frac{\Phi(z + \beta)}{\phi(z + \beta)}\,dz \tag{13}$$
for $y < 0$. Hence the asymptotic bias constant when initiating in State 0 is
$$g_c(0) \approx -\frac{\alpha_Y}{2\mu} \ln c,$$
while the bias is reduced to constant order if we initiate in State $c$.
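The two analytic facts driving these bias orders, the limit $y\,\Phi(y+\beta)/\phi(y+\beta) \to -1$ and the resulting logarithmic growth of $\int_y^0 \Phi(z+\beta)/\phi(z+\beta)\,dz$, are easy to probe numerically. The sketch below uses only the Python standard library, taking $\Phi$ and $\phi$ to be the standard normal distribution function and density. The helper names are hypothetical, $\beta = 1$ is an arbitrary illustrative choice, and the arguments are kept above roughly $-35$ because the tail probabilities underflow in double precision beyond that.

```python
import math

def mills_lower(x):
    """Phi(x) / phi(x) for the standard normal distribution."""
    cdf = 0.5 * math.erfc(-x / math.sqrt(2.0))
    pdf = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    return cdf / pdf

def scaled_ratio(y, beta):
    """The quantity y * Phi(y + beta) / phi(y + beta), whose limit is -1."""
    return y * mills_lower(y + beta)

def ratio_integral(lower, upper, beta, steps=2000):
    """Composite Simpson approximation (steps must be even) of the
    integral of Phi(z + beta) / phi(z + beta) over [lower, upper]."""
    h = (upper - lower) / steps
    total = mills_lower(lower + beta) + mills_lower(upper + beta)
    for k in range(1, steps):
        weight = 4.0 if k % 2 else 2.0
        total += weight * mills_lower(lower + k * h + beta)
    return total * h / 3.0

beta = 1.0  # illustrative value only
near = scaled_ratio(-10.0, beta)
far = scaled_ratio(-30.0, beta)
# Growth of the integral between y = -20 and y = -10, to be compared with
# ln(19/9), the growth of ln|y + beta| over the same range.
increment = ratio_integral(-20.0, -10.0, beta)
```

The slow approach of the ratio to $-1$ is a reminder that the $\ln c$ terms in the bias emerge only for quite large $c$.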

(The same cautionary note above about the range of applicability of the diffusion approximation also applies here.) We conclude that when estimating the steady-state probability of delay, the bias is of order $\ln c$, while the variance is constant in $c$, suggesting that at least for large $c$, the bias is the dominant criterion. However, given that $\ln c$ grows extremely slowly, it is likely that both quantities are important to consider, and this remains true even if we reduce bias by initiating in State $c$.

5.2. The M/M/c + M Queue

Now consider the case where $\theta > 0$, so that customers abandon if their waiting times are too long. Again consider a sequence of M/M/c + M queueing systems, indexed by $c = 1, 2, \ldots$. All systems have a fixed service rate $\mu$ and are assumed to start out empty. The arrival rate in the $c$th system is chosen to ensure that $\sqrt{c}(1-\rho)$ is constant and equal to $\beta$, where $\rho = \lambda/(c\mu)$. Hence, we use exactly the same asymptotic regime as in the previous section where customers did not abandon, except that we explicitly allow $\beta \le 0$, since abandonment ensures that the systems are stable. Let $X_c = (X_c(t) : t \ge 0)$ be the stochastic process giving the number of customers in the system over time in the $c$th system. Garnett et al. [2002] proved that
$$\frac{X_c(\cdot) - c}{\sqrt{c}} \Rightarrow Y(\cdot) \tag{14}$$
as $c \to \infty$, where $Y$ is a diffusion on $(-\infty, \infty)$ with drift function
$$\mu(y) = \begin{cases} -(\beta\mu + \theta y), & y > 0, \\ -\mu(\beta + y), & y \le 0, \end{cases}$$
and infinitesimal variance $2\mu$, with $Y(0) = 0$. We see that abandonment modifies the drift function for $y > 0$, but otherwise the diffusion is unchanged. We again take the process approximation implied by (14) as exact, so that we redefine
$$X_c(\cdot) = c + \sqrt{c}\, Y(\cdot). \tag{15}$$
Consider the cost function $f(x) = x$, so that we wish to estimate the expected steady-state number of customers in the system, with estimator $t^{-1}\int_0^t X_c(s)\,ds$. We can compute the asymptotic variance and bias of this estimator exactly as in the M/M/c case above.
Redefining all the quantities of interest for the case in point, we find that $\alpha_c = c + \sqrt{c}\,\alpha_Y$, $g_c(x) = \sqrt{c}\, g_Y((x - c)/\sqrt{c})$, $\sigma_c^2 = c\,\sigma_Y^2$, $g_Y(y)$ is given for $y \le 0$ by (12), although with a different additive constant, and $g_c(0) \approx -c/\mu$ as $c \to \infty$. Hence, our conclusions for the M/M/c queue continue to hold in the case of abandonment, although with a different variance constant $\sigma_Y^2$. This is perhaps to be expected, since the Halfin-Whitt regime corresponds to a situation where a nontrivial fraction of customers have to wait, but they have to wait for a vanishingly small amount of time as $c$ increases, and so abandonment has very little effect asymptotically.

The analysis for the cost function $f(x) = I(x \ge c)$ follows similar, albeit nontrivial, lines, and we omit the details. The asymptotic bias is of order $\ln c$ while the asymptotic variance does not depend on $c$.

There is an additional cost function we should consider for this model. Some managers use the steady-state probability of abandonment as a performance measure for design, so it is worth understanding how this measure might be estimated, along with the asymptotic bias and variance of the estimator. The discrete-time process consisting of the indicators of whether successive customers abandon or not is not very tractable. Fortunately, there is an alternative based on the system-size process [Garnett et al. 2002]. When there are $x$ customers in the system, the abandonment rate is $[x - c]^+ \theta$. On the other hand, the long-run abandonment rate is $\lambda \alpha_X$, where $\alpha_X$ is the steady-state probability that an arriving customer will abandon. Thus
$$\alpha_X = \frac{\theta}{\lambda}\, E[X_c(\infty) - c]^+,$$
which can be estimated via
$$\frac{\theta}{\lambda}\, \frac{1}{t} \int_0^t [X_c(s) - c]^+\,ds. \tag{16}$$
First consider the cost function $f(x) = [x - c]^+$. Following our now familiar argument, we redefine $\alpha_X = E[X_c(\infty) - c]^+ = \sqrt{c}\, E[Y(\infty)]^+ = \sqrt{c}\,\alpha_Y$. Again, $g_c(x) = \sqrt{c}\, g_Y(c^{-1/2}(x - c))$, and $g_Y$ for this model is of the same form as (13) with different constants, so the asymptotic bias is of order $c^{1/2} \ln c$ and the asymptotic variance is $c\,\sigma_Y^2$. The bias can be reduced to order $c^{1/2}$ by initiating the simulation in State $x = c$. Of course, we are more interested in the cost function $\frac{\theta}{\lambda}[x - c]^+$, and since $\lambda \sim c\mu$ as $c \to \infty$, the asymptotic bias is of order $c^{-1/2} \ln c$ and the asymptotic variance is $\theta^2 \mu^{-2} \sigma_Y^2 / c$. Hence, when estimating the probability of abandonment using (16), both the bias and the variance decay as $c$ grows, with the bias being asymptotically of larger order.

[Figure 3: three panels, titled "Number of jobs in the system" (vertical axis $\log(\sigma^2)$), "Probability of having to wait" (vertical axis $\sigma^2$) and "Probability of abandonment" (vertical axis $\log(\sigma^2)$), each plotted against $c$ and $\beta$.]
Fig. 3. Asymptotic variance $\sigma^2$ for various performance measures for the M/M/c + M queue under the QED regime with $\mu = \theta = 1$.

5.3. Numerical Examples

We derived the results above assuming that the diffusion approximation was exact. We now confirm the predictions of that approximation by numerically computing the asymptotic variance $\sigma^2$ and bias for the M/M/c + M queue under the QED regime. We present the results in Figures 3 and 4. In these plots, we fix $\mu$ and $\theta$, and for each value of $\beta$ and $c$ considered we choose $\lambda$ so that $(1-\rho)\sqrt{c} = \beta$.
We then choose the scaling of the $c$ axis and vertical axis according to the predictions made by the diffusion models, and find that both (scaled) $\sigma^2$ and bias on the vertical axis appear to be linear with respect to (scaled) $c$. This suggests that the diffusion model estimates the true orders of the variance and bias accurately.

5.4. The M/M/c + GI Queue

We conclude this section with a brief comment about M/M/c + GI queues. In these queues, the patience times are still iid, but may not have an exponential distribution. Zeltyn and Mandelbaum [2005] proved that (14) still holds for such queues, with the proviso that the term $\theta$ in the drift function of the limiting diffusion $Y$ is redefined to be the value of the density of the patience time distribution at zero. To understand why, note that in the QED regime, customer wait times become very small, being of order $c^{-1/2}$ as $c \to \infty$. Hence, while a nontrivial fraction of customers have to wait, their waiting times are almost all very small. Consequently, customers have very little time to abandon, and the patience time distribution is relevant only in terms of its behavior near 0. Assuming the patience distribution has a positive continuous density at 0, our conclusions about the order of the variance and bias for the performance measures we analyzed for M/M/c + M queues remain valid for M/M/c + GI queues, assuming that our approximation (15) does not introduce significant error.

[Figure 4: three panels, titled "Number of jobs in the system" (vertical axis $\log(|\text{bias}|)$, plotted against $c$ and $\beta$), "Probability of having to wait" (vertical axis $|\text{bias}|$, plotted against $c$ and $\beta$) and "Probability of abandonment" (vertical axis $|\text{bias}|$, plotted against $c^{-1/2}\ln(c)$ and $\beta$).]
Fig. 4. Absolute asymptotic bias for various performance measures for the M/M/c + M queue under the QED regime with $\mu = \theta = 1$.

6. DISCUSSION AND COMPARISONS

Table I summarizes our results. The values given represent the highest-order term in the property (variance or bias accordingly) and do not include any multiplicative constants. For example, when estimating the steady-state probability of delay in the M/M/c queue operated in the QED regime, one can expect the asymptotic bias when starting the simulation in State 0 to be $O(\ln c)$, while the asymptotic bias is $O(1)$ when starting the simulation in State $c$, which is more representative of steady-state conditions. These values are also proportional to the order of magnitude of the simulation runlength $t$ required to give a confidence interval of a fixed width in the case of variance, or to obtain a fixed bias, respectively. In interpreting these results, recall that in the QED regime, the arrival rate is of the same order as $c$, so that the computational effort needed is of the order $ct$.

Table I. A summary of our results. Values represent the order of magnitude of the variance or bias, ignoring multiplicative constants, for the stated steady-state performance measure and model in the stated regime.
The three performance measures are the mean number of customers in the system, the probability of delay and the probability of abandonment. The columns labelled Bias and Bias$_\alpha$ respectively give the order of the bias constant when initiating simulations with an empty system or when initiating from an approximation to the steady-state mean $\alpha$ obtained from the diffusion approximation.

Performance measure   Regime   Model        Variance          Bias                Bias$_\alpha$
$EX$                  ED       M/M/c        $(1-\rho)^{-4}$   $(1-\rho)^{-3}$     $(1-\rho)^{-3}$
$EX$                  QED      M/M/c        $c$               $c$                 $c^{1/2}$
$EX$                  QED      M/M/c + M    $c$               $c$                 $c^{1/2}$
$P(X \ge c)$          QED      M/M/c        $1$               $\ln c$             $1$
$P(X \ge c)$          QED      M/M/c + M    $1$               $\ln c$             $1$
$P(\mathrm{Ab})$      QED      M/M/c + M    $c^{-1}$          $c^{-1/2} \ln c$    $c^{-1/2}$

The values in Table I are appropriate when errors are measured in absolute terms. If we instead measure errors relative to the true values of the performance measure, then as discussed in Section 2 we must divide the variance by the square of the performance measure, and the bias by the performance measure. Doing so yields Table II below.

Table II. Values are interpreted as in Table I above, except that variance is relative to the square of the performance measure, while bias is relative to the performance measure.

Performance measure   Regime   Model        Variance          Bias                Bias$_\alpha$
$EX$                  ED       M/M/c        $(1-\rho)^{-2}$   $(1-\rho)^{-2}$     $(1-\rho)^{-2}$
$EX$                  QED      M/M/c        $c^{-1}$          $1$                 $c^{-1/2}$
$EX$                  QED      M/M/c + M    $c^{-1}$          $1$                 $c^{-1/2}$
$P(X \ge c)$          QED      M/M/c        $1$               $\ln c$             $1$
$P(X \ge c)$          QED      M/M/c + M    $1$               $\ln c$             $1$
$P(\mathrm{Ab})$      QED      M/M/c + M    $1$               $\ln c$             $1$

The values in Table II are striking in the sense that the bias when initiating with an empty system is of larger order than the variance in all cases, except for the M/M/c queue in the ED regime, where the two properties are equal in magnitude. This suggests that bias should receive careful consideration in simulations of heavily loaded queues, in agreement with results for loss models in Srikant and Whitt [1996], and results for the M/M/∞ queue in Whitt [2006].

To mitigate this bias, Whitt [2006] suggested starting the simulation from an initial state where all servers are busy, with residual service times sampled from the equilibrium residual-life distribution, instead of starting with an empty system. Our results suggest that this would substantially reduce bias in the QED regime, as seen in the final columns of the tables above. For example, in estimating the expected steady-state number of customers in the M/M/c queue in the QED regime, the absolute bias would then be of order $\sqrt{c}$ rather than $c$, and in estimating the probability of delay the bias would be of order 1 rather than $\ln c$. However, as seen in the ED results for $EX$, the order of the bias reduction may depend on the performance measure; an order of magnitude in bias reduction is not guaranteed.

Even more substantial bias reduction might result if the initial state of the simulation is randomly chosen with a distribution that is related to the stationary distribution of the heavy-traffic approximation. While we expect bias to be reduced, our methods cannot shed light on the effect, because we analyze the bias reduction from the perspective of the heavy-traffic approximation itself. Thus our prediction of the resulting bias would be 0, and a deeper analysis is needed.
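The passage from Table I to Table II is a mechanical division by the order of the performance measure, which the following sketch reproduces for the QED rows. It assumes the orders stated in Section 5: the mean number in system is of order $c$, the probability of delay is of constant order, and the probability of abandonment is of order $c^{-1/2}$. An order $c^p (\ln c)^q$ is stored as the pair $(p, q)$; the dictionary layout and function names are illustrative choices, and the ED row is omitted because it is expressed in powers of $(1-\rho)$ rather than $c$.

```python
from fractions import Fraction as F

def scale(order, measure, power):
    """Divide the order c^p (ln c)^q by measure**power; orders are (p, q)."""
    return (order[0] - power * measure[0], order[1] - power * measure[1])

# Order of each performance measure itself in the QED regime.
MEASURE = {
    "EX": (F(1), 0),
    "P(X>=c)": (F(0), 0),
    "P(Ab)": (F(-1, 2), 0),
}

# QED rows of Table I (absolute errors).
TABLE_ONE = {
    "EX": {"variance": (F(1), 0), "bias": (F(1), 0),
           "bias_alpha": (F(1, 2), 0)},
    "P(X>=c)": {"variance": (F(0), 0), "bias": (F(0), 1),
                "bias_alpha": (F(0), 0)},
    "P(Ab)": {"variance": (F(-1), 0), "bias": (F(-1, 2), 1),
              "bias_alpha": (F(-1, 2), 0)},
}

def to_relative(table, measure):
    """Variance is divided by the measure squared, bias by the measure."""
    relative = {}
    for name, row in table.items():
        m = measure[name]
        relative[name] = {
            "variance": scale(row["variance"], m, 2),
            "bias": scale(row["bias"], m, 1),
            "bias_alpha": scale(row["bias_alpha"], m, 1),
        }
    return relative

TABLE_TWO = to_relative(TABLE_ONE, MEASURE)
```

Reading off `TABLE_TWO["EX"]`, for instance, gives a relative variance of order $c^{-1}$ and a relative bias of constant order when starting empty, matching the QED rows of Table II.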
It is interesting that in Table II the asymptotic variances relative to the square of the mean in estimating $EX$ in the QED regime are of order $c^{-1}$, showing that the simulation runlength needed to obtain confidence interval widths with given relative error shrinks as $c \to \infty$. It is worth keeping in mind that in the QED regime the arrival rate is approximately proportional to $c$, so that the total number of customer arrivals simulated is constant. This phenomenon was noted in Srikant and Whitt [1996] and in Whitt [2006] for related performance measures and systems. This is a striking observation, especially when one compares it with the situation in the ED regime in the absence of abandonment, where the number of customer arrivals that need to be simulated is of the order $(1-\rho)^{-2}$, which grows extremely rapidly as $\rho \to 1$.

Although the relative bias is of equal or larger order than the relative variance in all cases, it is important to keep in mind the discussion from Section 2 that in the usual asymptotic setting where the desired confidence interval width $\epsilon \to 0$, the confidence interval width will eventually dominate the bias. The comments above apply to the setting where $\epsilon$ is fixed and $\rho \to 1$ (in the case of ED) or $c \to \infty$ (in the case of QED).

A referee suggested that fluid models underlie and explain the large difference in bias results for the multi-server ($c$ remaining bounded) and many-server ($c \to \infty$) regimes that we obtained through tractable diffusion models. This suggests that our results, and others, might instead be obtained by studying the even more tractable fluid models associated with these processes. For example, for the M/M/c queue in the QED regime of Section 5, the fluid model initiated in State 0 is $x'(t) = \lambda - \mu \min(c, x)$