arxiv: v2 [math.pr] 9 Dec 2016

Similar documents
The Hanging Chain. John McCuan. January 19, 2006

A Queueing Model for Call Blending in Call Centers

Omnithermal perfect simulation for multi-server queues

Millennium Relativity Acceleration Composition. The Relativistic Relationship between Acceleration and Uniform Motion

A Characterization of Wavelet Convergence in Sobolev Spaces

Complexity of Regularization RBF Networks

The Effectiveness of the Linear Hull Effect

Maximum Entropy and Exponential Families

Remark 4.1 Unlike Lyapunov theorems, LaSalle s theorem does not require the function V ( x ) to be positive definite.

Computer Science 786S - Statistical Methods in Natural Language Processing and Data Analysis Page 1

Methods of evaluating tests

Optimization of Statistical Decisions for Age Replacement Problems via a New Pivotal Quantity Averaging Approach

DIGITAL DISTANCE RELAYING SCHEME FOR PARALLEL TRANSMISSION LINES DURING INTER-CIRCUIT FAULTS

Nonreversibility of Multiple Unicast Networks

Taste for variety and optimum product diversity in an open economy

Journal of Inequalities in Pure and Applied Mathematics

Estimating the probability law of the codelength as a function of the approximation error in image compression

Discrete Bessel functions and partial difference equations

On Component Order Edge Reliability and the Existence of Uniformly Most Reliable Unicycles

Where as discussed previously we interpret solutions to this partial differential equation in the weak sense: b

Singular Event Detection

Hankel Optimal Model Order Reduction 1

max min z i i=1 x j k s.t. j=1 x j j:i T j

Control Theory association of mathematics and engineering

Likelihood-confidence intervals for quantiles in Extreme Value Distributions

Product Policy in Markets with Word-of-Mouth Communication. Technical Appendix

Average Rate Speed Scaling

Chapter 8 Hypothesis Testing

Lecture 7: Sampling/Projections for Least-squares Approximation, Cont. 7 Sampling/Projections for Least-squares Approximation, Cont.

Robust Recovery of Signals From a Structured Union of Subspaces

On the Licensing of Innovations under Strategic Delegation

Indian Institute of Technology Bombay. Department of Electrical Engineering. EE 325 Probability and Random Processes Lecture Notes 3 July 28, 2014

Math 151 Introduction to Eigenvectors

A model for measurement of the states in a coupled-dot qubit

Ordered fields and the ultrafilter theorem

Model-based mixture discriminant analysis an experimental study

The shape of a hanging chain. a project in the calculus of variations

A RUIN MODEL WITH DEPENDENCE BETWEEN CLAIM SIZES AND CLAIM INTERVALS

On the Quantum Theory of Radiation.

On the density of languages representing finite set partitions

Aharonov-Bohm effect. Dan Solomon.

23.1 Tuning controllers, in the large view Quoting from Section 16.7:

Simplified Buckling Analysis of Skeletal Structures

3 Tidal systems modelling: ASMITA model

Capacity Pooling and Cost Sharing among Independent Firms in the Presence of Congestion

Simplification of Network Dynamics in Large Systems

22.54 Neutron Interactions and Applications (Spring 2004) Chapter 6 (2/24/04) Energy Transfer Kernel F(E E')

Bäcklund Transformations: Some Old and New Perspectives

Lecture 3 - Lorentz Transformations

Physical Laws, Absolutes, Relative Absolutes and Relativistic Time Phenomena

Measuring & Inducing Neural Activity Using Extracellular Fields I: Inverse systems approach

Wavetech, LLC. Ultrafast Pulses and GVD. John O Hara Created: Dec. 6, 2013

IMPEDANCE EFFECTS OF LEFT TURNERS FROM THE MAJOR STREET AT A TWSC INTERSECTION

Fair Integrated Scheduling of Soft Real-time Tardiness Classes on Multiprocessors

MOLECULAR ORBITAL THEORY- PART I

arxiv:gr-qc/ v2 6 Feb 2004

12.1 Events at the same proper distance from some event

Sensor management for PRF selection in the track-before-detect context

COMBINED PROBE FOR MACH NUMBER, TEMPERATURE AND INCIDENCE INDICATION

Tight bounds for selfish and greedy load balancing

Metric of Universe The Causes of Red Shift.

Generalized Dimensional Analysis

Evaluation of effect of blade internal modes on sensitivity of Advanced LIGO

The Impact of Information on the Performance of an M/M/1 Queueing System

Relativity in Classical Physics

LECTURE NOTES FOR , FALL 2004

A NETWORK SIMPLEX ALGORITHM FOR THE MINIMUM COST-BENEFIT NETWORK FLOW PROBLEM

V. Interacting Particles

Modal Horn Logics Have Interpolation

EXACT TRAVELLING WAVE SOLUTIONS FOR THE GENERALIZED KURAMOTO-SIVASHINSKY EQUATION

Chapter 2 Linear Elastic Fracture Mechanics

REFINED UPPER BOUNDS FOR THE LINEAR DIOPHANTINE PROBLEM OF FROBENIUS. 1. Introduction

MultiPhysics Analysis of Trapped Field in Multi-Layer YBCO Plates

Can Learning Cause Shorter Delays in Reaching Agreements?

Most results in this section are stated without proof.

Lightpath routing for maximum reliability in optical mesh networks

arxiv:physics/ v4 [physics.gen-ph] 9 Oct 2006

ON A PROCESS DERIVED FROM A FILTERED POISSON PROCESS

Parallel disrete-event simulation is an attempt to speed-up the simulation proess through the use of multiple proessors. In some sense parallel disret

Boundary value problems for the one-dimensional Willmore equation

THE METHOD OF SECTIONING WITH APPLICATION TO SIMULATION, by Danie 1 Brent ~~uffman'i

ON THE MOVING BOUNDARY HITTING PROBABILITY FOR THE BROWNIAN MOTION. Dobromir P. Kralchev

Case I: 2 users In case of 2 users, the probability of error for user 1 was earlier derived to be 2 A1

Perturbation Analyses for the Cholesky Factorization with Backward Rounding Errors

Simple FIR Digital Filters. Simple FIR Digital Filters. Simple Digital Filters. Simple FIR Digital Filters. Simple FIR Digital Filters

A simple expression for radial distribution functions of pure fluids and mixtures

Coding for Random Projections and Approximate Near Neighbor Search

Collinear Equilibrium Points in the Relativistic R3BP when the Bigger Primary is a Triaxial Rigid Body Nakone Bello 1,a and Aminu Abubakar Hussain 2,b

A Functional Representation of Fuzzy Preferences

SURFACE WAVES OF NON-RAYLEIGH TYPE

Sensitivity Analysis in Markov Networks

Modeling of Threading Dislocation Density Reduction in Heteroepitaxial Layers

Tail-robust Scheduling via Limited Processor Sharing

Reliability Guaranteed Energy-Aware Frame-Based Task Set Execution Strategy for Hard Real-Time Systems

RESEARCH ON RANDOM FOURIER WAVE-NUMBER SPECTRUM OF FLUCTUATING WIND SPEED

NEW MEANS OF CYBERNETICS, INFORMATICS, COMPUTER ENGINEERING, AND SYSTEMS ANALYSIS

Relativistic Dynamics

arxiv:gr-qc/ v7 14 Dec 2003

FINITE WORD LENGTH EFFECTS IN DSP

Extending LMR for anisotropic unconventional reservoirs

Transcription:

Omnithermal Perfet Simulation for Multi-server Queues Stephen B. Connor 3th Deember 206 arxiv:60.0602v2 [math.pr] 9 De 206 Abstrat A number of perfet simulation algorithms for multi-server First Come First Served queues have reently been developed. hose of Connor and Kendall (205) and Blanhet, Pei, and Sigman (205) use dominated Coupling from the Past (domcfp) to sample from the equilibrium distribution of the Kiefer-Wolfowitz workload vetor for stable M/G/ and GI/GI/ queues respetively, using random assignment queues as dominating proesses. In this note we answer a question posed by Connor and Kendall (205), by demonstrating how these algorithms may be modified in order to arry out domcfp simultaneously for a range of values of (the number of servers), at minimal extra ost. Keywords and phrases: Dominated Coupling from the Past; First Come First Served disipline [FCFS]; Kiefer-Wolfowitz workload vetor; perfet simulation; M/G/ queue; Random Assignment disipline [RA]; sandwihing; stohasti ordering. 2000 Mathematis Subjet Classifiation: Primary 65C05; Seondary 60K25; 60J05; 68U20 Introdution During the past five years there have been a number of signifiant advanes in perfet simulation methods for multi-server queues. Sigman (20) pioneered the use of dominated Coupling from the Past (domcfp) for super-stable M/G/ queues with First Come First Served (FCFS) disipline. ( Super-stable means that the queue would remain stable even if all but one of the servers were removed.) he limitation to super-stable queues is neessitated by Sigman s use of a stable M/G/ queue as dominating proess in the domcfp algorithm. Connor and Kendall (205) subsequently showed how to generalise this idea to work for stable M/G/ queues, by using as dominating proess an M/G/ queue with random assignment (RA) disipline (under whih the servers are independent). hey desribe two algorithms (outlined in Setion 2 below) and ompare their effiieny; they show that their Algorithm, whih requires waiting for the dominating proess to empty, is signifiantly less effiient than Algorithm 2, whih makes use of sandwihing proesses (in ommon with many other domcfp algorithms). Blanhet, Dong, and Pei (205) were the first authors to show how to perform perfet simulation for multi-server queues with general inter-arrival time and servie time distributions (i.e. relaxing the assumption of Exponential inter-arrival times). Rather than use a random assignment queue as dominating proess, they make use of a so-alled vaation system.

(his idea is also employed by Blanhet and Chen (206) to sample from the equilibrium of a generalized Jakson Network of single-server queues.) However, Blanhet et al. (205) have sine demonstrated how to make the random assignment dominating proess work in this setting. he hard part here is working out how to simulate the dominating proess in reverse-time; with renewal, as opposed to Poisson arrivals, the servers in the RA model are no longer independent. hese piees of work all serve to demonstrate that perfet simulation is a pratial and effiient method for simulating from a wide lass of multi-server queueing systems. Connor and Kendall (205) ask a very natural question: is it possible to arry out dominated CFP simultaneously for M/G/ queues with a range of, the number of servers? he authors refer to this as omnithermal dominated CFP, borrowing a term used to desribe Grimmett (995) s oupling of random-luster proesses for all values of a speifi parameter, and applied to CFP in Propp and Wilson (996). he potentially diffiult issue in the queueing ontext is that of deteting a time at whih we an be sure that the appropriate sandwihing proesses will oalese for all in the range being onsidered. In this paper we show how suh oalesene may be deteted with the aid of a simple riterion that uses information about the sandwihing proesses only for the queue with the least number of servers. he outline of the paper is as follows. In Setion 2 we reall the definition of the Kiefer- Wolfowitz workload proess assoiated to a multi-server FCFS queue, and then sketh the two perfet simulation algorithms of Connor and Kendall (205). In Setion 3 we present a natural partial order between Kiefer-Wolfowitz vetors of potentially different lengths; we subsequently use this to determine a ondition whih ensures that the termination time of Connor and Kendall (205) s Algorithm 2 is monotoni in the number of servers. In Setion 4 we use this ondition to produe an Omnithermal Algorithm, and briefly report on the results of applying this to some M/M/ queues. Finally, in Setion 5, we indiate how our results may be used to perform omnithermal perfet simulation for queues with general renewal input, or in the situation where we are interested in saling the distribution of servie durations, rather than hanging the number of servers. 2 Dominated CFP for M/G/ queues Consider a general / / FCFS queue. We denote the Kiefer-Wolfowitz workload vetor (Kiefer and Wolfowitz, 955) at time t 0 by V(t) = (V (, t), V (2, t),..., V (, t)), where V (, t) V (2, t)... represent the ordered amounts of residual work in the system for the servers at time t. Customer n arrives at time t n (for 0 t t 2...), with inter-arrival times denoted by n = t n+ t n (where we set t 0 = 0). Customer n brings with it a servie duration S n. Observing V just before arrival of the n th ustomer (but definitely after the arrival of the (n ) st ustomer) generates the proess W n : in the ase t n < t n we have W n = V(t n ). his satisfies the well-known reursion W n+ = R(W n + S n e n f) +, for n 0, () where e = (, 0, 0,..., 0), f = (,,..., ), R plaes the oordinates of a vetor in inreasing order, and + replaes negative oordinates of a vetor by zeros. For simpliity of exposition we shall primarily disuss M/G/ queues in what follows (i.e. inter-arrival times are Exponential). However, our method for performing omnithermal 2

perfet simulation for these queues applies equally well to GI/GI/ queues using an algorithm of Blanhet et al. (205), as will be observed in Setion 5. Let the arrival rate be λ > 0, and let servie durations S n be i.i.d. with mean /µ and E [ S 2] <. (As explained in Connor and Kendall (205), this seond moment ondition is required in order to guarantee finite mean run-time of their perfet simulation algorithms.) he queue is stable if and only if λ/(µ) <, and so we restrit attention to this senario. Connor and Kendall (205) propose two domcfp algorithms for sampling from the equilibrium distribution of the Kiefer-Wolfowitz workload vetor for a stable M/G/ queue X. Both of these algorithms use as dominating proess an M/G/ queue Y with Random Assignment (RA) servie disipline. hat is, ustomers in Y are alloated upon arrival to a uniformly hosen server; this renders the servers independent, whih allows us to easily simulate a stationary version of the dominating proess in reverse-time, as required by domcfp. It is possible to arrange for X to be path-wise dominated by Y as long as the two queues are oupled by assigning servie durations in order of initiation of servie. (Under FCFS ustomers initiate servie in the same order in whih they arrive, but this is typially not the ase for other servie disiplines.) he preise statement of this domination an be found in Connor and Kendall (205), an abridged version of whih is reprodued here for onveniene. heorem (heorem 3.3 of Connor and Kendall (205)). Consider a -server queueing system viewed as a funtion of (a) the sequene of arrival times 0 t t 2 t 3... and (b) the sequene of servie durations S, S 2, S 3,... assigned in order of initiation of servie. Consider the following different alloation rules, in some ases varying over time:. / / [RA]; 2. / / [RA] until a speified non-random time, then swithing to / / [F CF S]; 3. / / [RA] until a speified non-random time, 0, then swithing to / / [F CF S]; 4. / / [F CF S]; hen ase k dominates ase k +, in the sense that the m th initiation of servie in ase k + ours no later than the m th initiation of servie in ase k, and the m th departure in ase k + ours no later than the m th departure in ase k. Moreover, for all times t the Kiefer-Wolfowitz workload vetor for ase 3 dominates (oordinate-by-oordinate) that of ase 4, with similar domination holding for ases 2 and 3 for all t. We an now summarise the two domcfp algorithms of Connor and Kendall (205). Algorithm. Construt a stationary M/G/ [RA] proess bakwards in time until it empties at some time < 0; 2. Use this to reate a forwards in time trajetory of an M/G/ [RA] queue Y started from empty at time ; 3. Use the sequenes of arrival times and servie durations in Y to onstrut an M/G/ [F CF S] queue X that is dominated by Y over [, 0]; 4. Return X 0. 3

Algorithm 2. Fix a bakoff (or inspetion) time < 0, and onstrut a path of the stationary M/G/ [RA] queue Y over the time period [, 0]; 2. Construt sandwihing proesses L and U over [, 0] as follows: (a) L and U both evolve as the Kiefer-Wolfowitz vetors of M/G/ [F CF S] queues, using the same sequenes of arrival times and servie durations as Y over (, 0]; (b) L is empty, while U is instantiated using the same residual workloads present in Y ; 3. Chek for oalesene: if L 0 = U 0. return this value; else set 2 and return to Step Connor and Kendall (205) provide more details for eah of the steps outlined above, and demonstrate that Algorithm 2, although more ompliated to desribe, is in general signifiantly faster than Algorithm. 3 Omnithermal perfet simulation In this setion we onsider the following question: is it possible to adapt the domcfp algorithms outlined in Setion 2 in order to simultaneously sample from the equilibrium of M/G/( + m) queues, where m ranges over some subset of N? As pointed out by Connor and Kendall (205), it is straightforward to aomplish this in a relatively effiient manner using Algorithm : one an emptying time has been established for the M/G/ queue, then all M/G/( + m) queues of interest an be started from empty at this time and run over [, 0] using the same arrival times and servie durations; a simple workload domination argument shows that their values at time 0 will form a single perfet sample from the required set of equilibrium distributions. However, given the signifiantly faster run-time of Algorithm 2, a far more interesting question is whether or not one an produe a omparably effiient omnithermal domcfp algorithm using sandwihing proesses. Suppose that we have implemented Algorithm 2, and have obtained one equilibrium sample for the M/G/ queue. hat is, we have established some bakoff time < 0, along with sequenes of arrival times and servie durations, suh that L 0 = U 0. Our first observation is the following: suppose that we use these sequenes to produe new FCFS proesses L +m and U +m over [, 0] in the manner desribed in step 2 of Algorithm 2. (hat is, L +m is empty, and U +m ontains the residual workloads present in Y, the M/G/ [RA] proess.) hen the workload vetor for Ut +m dominates (oordinate-by-oordinate) that of L +m t for all t [, 0], and if L +m 0 = U0 +m then this value will be a perfet draw from the equilibrium of the M/G/( + m) queue, as required. his follows from heorem : due to the way in whih it is instantiated, U +m is a queueing system that hanges from M/G/ [RA] to M/G/(+m) [F CF S] at time. But the former of these an be thought of as an M/G/(+m) system with a random alloation rule that uniformly distributes jobs amongst only of the ( + m) servers; sine this is less effiient than the FCFS disipline, the proof of heorem 3.3 in Connor and Kendall (205) holds with this slightly modified setup. his observation implies that, given the arrival times and servie durations used in Algorithm 2 with servers, we ould just onstrut sandwihing proesses L +m and U +m over [, 0] and see whether they oalese. If they do, then we have obtained a sample from the 4

required distribution; if not, then we need to extend the dominating proess Y for this sample further into the past (setting 2 ), and then hek again for oalesene. But this method is not as lean as we would like: in partiular, it is likely that the extent to whih any single sample path of Y needs to be extended will vary with the value of m. (As will be shown in the next setion, oalesene of L +m and U +m over [, 0] does not imply oalesene of L +n and U +n over the same interval for all n > m.) Assuming that we want to obtain samples for a range of values of m N, this method is therefore rather ineffiient. Ideally we would like to use Algorithm 2 to produe a sample for the M/G/ queue, and then re-use the path of Y for this sample in order to draw from the equilibrium of M/G/( + m) for any desired m N. 3. Comparing queues with different numbers of servers Suppose that we have two FCFS queues, eah seeing the same set of arrival times and assoiated servie durations. We first of all need to show that the workload vetor with fewer servers dominates that of the other, with respet to a ertain natural partial order. Definition 2. For V R and V +m R +m, we write V +m V if and only if V +m (k + m) V (k), k =,...,. hus if V and V +m are workload vetors, V +m V if and only if eah of the busiest servers in V +m has no more work remaining than the orresponding server in V. Proposition 3. Let V and V +m be Kiefer-Wolfowitz workload vetors for an M/G/ and an M/G/( + m) FCFS queue respetively. Suppose that V0 +m V0 and that eah queue sees the same set of arrival times and assoiated servie durations. hen Vt +m Vt for all t 0. Proof. It is lear that the ordering between V and V +m will hold until the first arrival time, t. Furthermore, one we show that Vt +m Vt the result will follow simply by indution. Let S denote the servie duration attahed to the arrival at time t. Reall that the effet of this arrival is that S is added to any outstanding work at the first (least busy) oordinate in V and V +m, and then the resulting vetors are eah reordered in inreasing order. Suppose that after this reordering has taken plae, the oordinate with value S + V t () (the amount of work now at the server to whih the arrival at t was alloated) is loated in position i of Vt, et. If i +m m then the result is trivial (sine the last oordinates of V +m are unhanged by the arrival at time t ). So suppose that i +m > m. hen for k < min{i, i +m m} we have V +m t (k + m) = V +m (k + m + ) Vt (k + ) = Vt (k). t Similarly, for k > max{i, i +m m} we have V +m t (k + m) = Vt + (k + m) Vt (k) = Vt (k). here are now two ases to onsider, depending on whih of i and i +m m is larger. Case : i i +m m. hen for k = i,..., i +m m: V +m t (k + m) V +m (i +m ) = V +m () + S V t () + S = Vt (i ) Vt (k). t t Here the first and last inequalities hold sine the oordinates of the workload vetors at time t are arranged in inreasing order; the middle inequality follows from Vt +m V t. 5

Case 2: i +m m < i. Note that the following relations hold between V t and V t : { Vt Vt (k) = (k + ) k = i +m m,..., i and V +m t (k + m) = For k = i +m m,..., i it follows that V +m t Vt () + S { k = i Vt +m () + S k = i+m m Vt +m (k + m) k = i+m m +,..., i. (k + m) V +m (k + m + ) = V +m (k + m + ) Vt (k + ) = Vt (k). t t he proof is ompleted by observing that when k = i, V +m t 3.2 Coalesene (k + m) = V +m (i + m) Vt (i ) = Vt (i ) Vt (i ). t Suppose one again that we have used Algorithm 2 to obtain a single perfet sample from the M/G/ queue: this yields a bakoff time < 0 and a sequene of arrival times and assoiated servie durations suh that the sandwihing proesses U and L oalese over [, 0]. Define D to be the non-negative vetor-valued proess given by the oordinate-wise differene between U and L : D t = U t L t, t 0. Let be the oalesene time for this realisation: = inf{t > : D t = 0} < 0. We shall write L t for the number of ustomers in L t, and A t for the set of oordinates where U t and L t agree: A t = {k : D t (k) = 0, k }. We are interested in the question of whether oalesene of U and L implies oalesene of U +m and L +m (instantiated at time as desribed in Setion 2) over the same period. he following example shows that this is not guaranteed. Example 4. Consider sandwihing proesses for two and three server systems (i.e. = 2 and m = ), as desribed above. Suppose that U 2 and U 3 are both initiated at time 0 with a single servie duration of length, and that these queues proeed to see arrival times/servies (t, S) as follows: (0.,.2), (0.3,.8), (0.8, 5). he evolution of these proesses viewed at arrival times is as follows: t 0 = 0 t = 0. t 2 = 0.3 t 3 = 0.8 U 2 (0.0,.0) (0.9,.2) (.0, 2.5) (2.0, 5.5) L 2 (0.0, 0.0) (0.0,.2) (.0,.8) (.3, 5.5) If there are no further arrivals within the next two units of time, we see that U 2 and L 2 will oalese at time 2 = 2.8 (sine it will take two more units of time for their first oordinates to agree, and their seond oordinates are already mathed). However, feeding the same sequene of arrival times/servies to U 3 and L 3, we see that they will not oalese before time 2 : 6

t 0 = 0 t = 0. t 2 = 0.3 t 3 = 0.8 U 3 (0.0, 0.0,.0) (0.0, 0.9,.2) (0.7,.0,.8) (0.5,.3, 5.2) L 3 (0.0, 0.0, 0.0) (0.0, 0.0,.2) (0.0,.0,.8) (0.5,.3, 5.0) Furthermore, if we were to onsider sandwihing proesses for a four-server system, these would oalese by time 2 using the above sequene of arrivals: t 0 = 0 t = 0. t 2 = 0.3 t 3 = 0.8 U 4 (0.0, 0.0, 0.0,.0) (0.0, 0.0, 0.9,.2) (0.0, 0.7,.0,.8) (0.2, 0.5,.3, 5.0) L 4 (0.0, 0.0, 0.0, 0.0) (0.0, 0.0, 0.0,.2) (0.0, 0.0,.0,.8) (0.0, 0.5,.3, 5.0) A simple, and intuitively obvious, ondition whih guarantees that the sandwihing proesses L +m and U +m will oalese by time is that no ustomer arriving at the lower proess L during the period [, ] has to wait to ommene servie: Proposition 5. If L t for all t [, ] then D +m m N. = 0 (and so +m ) for any Proof. Sine L +m is started from empty, the ondition implies that its first m oordinates are identially zero over [, ]. Moreover, oalesene of U and L implies that there must exist an empty server in both of these proesses at time (see Connor and Kendall (205)); i.e. U () = L +m () = 0. Sine U U +m, Proposition 3 ensures that U U, and so the first m oordinates of U +m (and of L +m ) must also be equal to zero. It remains to show that the last oordinates of U +m and L +m agree. Sine no server in L ever has more than one ustomer to deal with at any moment, the same is true for L +m, and so L +m t (k + m) = L t(k) for all k and for all t [, ]. hen by the domination established in Proposition 3, and the fat that U = L, for all k, as required. U +m (k + m) U (k) = L (k) = L+m (k + m), he ondition of Proposition 5 is rather strong, and an in fat be weakened, as we now show. heorem 6. Suppose that, for all arrival times τ [, ], if D τ () = 0 (equivalently, A τ ) then U τ () = 0. hen +m for any m N. Remark 7. he ondition of Proposition 5 is stronger than that of heorem 6. o see this, suppose that L t for all t [, ]. If at some arrival time τ [, ] we have D τ () = 0 but U τ () > 0, then there must be at least ustomers in L τ (sine L τ () = U τ () > 0). But then the ustomer arriving at time τ would fore L τ = +, whih would break our initial assumption. herefore if D τ () = 0 it must be the ase that U τ () = 0. Note that in Example 4 the ondition of Proposition 5 learly fails, and so therefore does the ondition of heorem 6. (In fat U 2 t 3 () = L 2 t 3 () > 0.) he key to proving heorem 6 is to onsider the time until oalesene of the sandwihing proesses U and L when viewed at time t, under the assumption of there being no further 7

arrivals in (t, ). Note that this is just the time taken for U to lear all work in oordinates whih disagree with those in L at time t. Let us write C t for this quantity: Ct = max U k / A t (k) = Ut (n t), (2) t where we define n t = max{ k : k / A t}. It is lear that C t dereases deterministially at unit rate until it either hits zero (at whih point L and U oalese) or a new ustomer arrives. Consider then what happens to C if there is an arrival at time τ with assoiated servie duration S. Let k U and k L be the oordinates satisfying U τ (k U ) = U τ () + S and L τ (k L ) = L τ () + S. (hat is, the arriving job gets alloated to the server with the least work in eah of U τ and L τ, and then when the workload vetors are reordered this job finds itself in position k U in U τ and k L in L τ.) o be expliit k U = min{k : U τ () + S U τ (k + ), k < }, with k U = if the minimum above is taken over the empty set. Note that this onvention that the new job is plaed at the lowest oordinate possible, after reordering, in U τ allows us to deal with the possibility that U τ () + S = U τ (k + ) for some k <, i.e. of U τ ontaining two equal but non-zero oordinates; when arrivals are Poisson this possibility ours with probability zero, of ourse. In partiular, this implies that U τ (k U ) > U τ (k U ). (3) here are two ases to onsider when assessing the impat of an arrival on C, depending on whether or not the servers with least workload in U τ and L τ are in agreement. Case : A τ (i) Suppose first that k U n τ. Sine U τ (k) = L τ (k) for all k > n τ, it must be the ase that k L = k U A τ. So n τ = n τ and C τ = U τ (n τ ) = U τ (n τ ) = C τ. (ii) Alternatively, if k U < n τ then n τ = n τ, and so C τ = C τ one again. hus there is no hange to C if the arriving ustomer finds U τ () = L τ (). Case 2: / A τ (i) Suppose that k U n τ. Sine Uτ (k) = L τ (k) for all k > n τ, it must be the ase that k L k U. We laim that k U / A τ, and so n τ = k U ; it then follows that Cτ = Uτ (n τ ) = Uτ (k U ) = Uτ () + S > Cτ. o see that n τ = k U, we need to show that L τ (k U ) < Uτ (k U ). Notie that L τ (k U ) = max{l τ (k U ), L τ () + S} and U τ (k U ) = U τ () + S. Clearly L τ () + S < Uτ () + S (sine / A τ ). Uτ (k U ) < Uτ (k U ) thanks to (3). Furthermore, L τ (k U ) (ii) Alternatively, if k U < n τ then k L n τ also, and so n τ = n τ. hus C τ = C τ. hus when / A τ, C τ = max{u τ (n τ ), U τ () + S}. 8

In summary, we see that the time to oalesene of L and U inreases exatly when there is an arrival at time τ, with / A τ and k U n τ. hat is, { Cτ Cτ if A τ = max{cτ, Uτ () + S} if / A (4) τ. he next result is key to the proof of heorem 6: it shows that, under the same assumption as the theorem, the time to oalesene with + m servers is dominated by the time to oalesene with servers. Lemma 8. Suppose that at arrival time τ, if A τ then Uτ () = 0. Suppose also that C τ for some m N. hen Cτ +m Cτ. C +m τ Proof. We onsider the two possible senarios seen by the ustomer arriving at time τ.. A +m τ. 2. / A +m τ and / A τ. (Note that the third possibility, that / A +m τ and A τ, is exluded by our assumption. Indeed, if A τ then our assumption would fore L τ () = Uτ () = 0. So the arrival at time τ would find a server empty in U, whih would mean that it must also find a server empty in U +m. But that would imply that Aτ +m.) Let s treat these two possibilities in order.. Sine A +m t, we know from (4) that the oalesene time for L+m and U +m is unhanged by the new arrival. In addition, the oalesene time for L and U annot derease due to this arrival. So C +m τ = C +m τ C τ C τ. 2. Here the arrival potentially affets the time-to-oalesene of both pairs of sandwihing proesses. However, C +m τ = max{cτ +m, U τ +m () + S} max{c τ, Uτ () + S} = Cτ, where the inequality follows from the seond assumption of the Lemma, and the previously established fat that Uτ +m U τ. We an now omplete the proof of heorem 6. Reall that the sandwihing proesses L +m and U +m are started at time < 0 with L +m empty and U +m instantiated using the same set of residual workloads that are present in U. Now, it is lear that departures in U +m our no later than in U, and sine C is simply the time taken for all ustomers present in U to depart it follows that C +m C. Given that the assumption of heorem 6 holds, Lemma 8 tells us that this ordering is preserved for all t [, ]: Ct +m Ct, t [, ]. But sine L and U oalese at time < 0, we see that C +m = C = 0, and so +m, as laimed. 9

4 Simulations he result of heorem 6 provides us with a reipe for performing omnithermal perfet simulation for M/G/( + m) queues, for m = 0,, 2,.... Omnithermal Algorithm. Use Algorithm 2 of Connor and Kendall (205) to establish a bakoff time < 0 and upper and lower sandwihing proesses U and L over [, 0] suh that U 0 = L 0 : (i) Calulate the oalesene time [, 0] of U and L, and hek to see whether the ondition of heorem 6 is satisfied for all arrival times in [, ]; (ii) While the ondition is not satisfied, set 2, and use Algorithm 2 of Connor and Kendall (205) to extend the simulation of the sandwihing proesses over the new window [, 0]. 2. For eah required m N, onstrut L +m over [, 0], using the same sequene of arrival times and servies as in the onstrution of L. Return L0 +m as a perfet equilibrium draw of the Kiefer-Wolfowitz vetor for the M/G/( + m) queue of interest. (Note that there is no need to also onstrut U +m : heorem 6 guarantees that U0 +m = L +m 0.) Remark 9. If we are alled upon to use part (ii) of the algorithm and extend the simulations of U and L further into the past, we are guaranteed that these new sandwihing proesses (Ũ and L, say) will still oalese by time : this follows from heorem 5. of Connor and Kendall (205), whih implies that L t L t Ũ t U t, t [, 0]. We applied the Omnithermal Algorithm to the three sets of simulations presented in Figure 4 of Connor and Kendall (205); these all onern M/M/ queues with λ = and µ = 2. 5,000 runs were performed for eah set of parameters (λ = 0, 30, 50): for eah run we reorded the original bakoff time (required for oalesene of U and L ), and the number of times that this was doubled in part (ii) in order to satisfy the ondition of heorem 6. When λ = 30 (respetively 50) we found that only 4 (respetively 0) of the original bakoff times needed to be extended; when λ = 0 this number rose to 08 (2.% of the sample). Furthermore, those runs whih needed extending did not require to be doubled a signifiant number of times: no run required us to double more than three times, with most of the runs requiring only a single additional bakoff in order to satisfy the ondition of heorem 6 see Figure. In addition, we used the Omnithermal Algorithm to investigate the effet of hanging server number for an M/M/ queue with arrival rate λ =.2, servie rate µ = and number of servers {2, 3, 4}. We ran the algorithm 5,000 times, for whih 94 runs failed the ondition of heorem 6 and needed extending further into the past. Figure 2 shows the mean value of eah oordinate of the Kiefer-Wolfowitz workload vetor from this sample. he effet on the value of the first oordinate (whih represents the mean waiting time of a ustomer arriving in equilibrium) of inreasing the number of servers from two to three an be seen to be substantial: the mean value drops from 0.57 to just 0.08. Further detail is provided in Figure 3, where we show the effet on the distribution funtion of the remaining workload at the first two oordinates of the workload vetors for the same set of simulations. 0

70 Frequeny 60 50 40 30 Original value of 2 4 8 20 0 0 2 3 Number of additional bakoffs required Figure : Simulation results for 5,000 runs of the Omnithermal Algorithm for an M/M/ queue with λ = = 0 and µ = 2. 08 runs failed the ondition of heorem 6 and needed extending further into the past; the figure shows the number of times eah of these 08 runs needed to be extended (using the usual binary bakoff sheme). No run needed to be extended more than three times. 5 Conluding omments We have shown how the effiient Algorithm 2 of Connor and Kendall (205) for M/G/ queues may be modified to allow for omnithermal perfet simulation; our new algorithm uses a simple test to determine whether or not the dominating proess used for the -server algorithm needs to be extended further into the past in order to allow for simultaneous sampling from M/G/( + m) queues for any desired m N. Furthermore, we have provided numerial evidene whih suggests that, at least in the ase of moderate traffi intensity, the Omnithermal Algorithm requires minimal additional omputational expense. We onlude by noting two extensions of our algorithm. Our first observation is that the oalesene arguments underpinning Setion 3 do not rely in any way on the distribution of inter-arrival times. As noted in the introdution, Blanhet et al. (205) have reently shown how to implement domcfp for GI/GI/ queues using a random assignment dominating.4.2 Number of servers 2 3 4 Mean workload.0 0.8 0.6 0.4 0.2 0.0 2 3 4 Workload vetor oordinate Figure 2: Mean of eah oordinate of the workload vetor for an M/M/ queue with λ =.2, µ = and {2, 3, 4}. (Results from 5,000 runs of the Omnithermal Algorithm.)

.0.0 Distribution funtion 0.8 0.6 0.4 0.2 Number of servers 2 3 4 Distribution funtion 0.8 0.6 0.4 0.2 Number of servers 2 3 4 0.0 0 2 3 4 5 Workload (first oordinate) 0.0 0 2 3 4 5 Workload (seond oordinate) Figure 3: Distribution funtions for workload at first (left) and seond (right) oordinates of the workload vetor, for the set of simulations presented in Figure 2. proess with upper and lower sandwihing proesses in the style of Algorithm 2 above. It is therefore possible to perform omnithermal simulation for these queues, by using their algorithm in plae of Algorithm 2 in step of the Omnithermal Algorithm. he seond natural extension is to omnithermal simulation of queues whih are more stable than the M/G/ on aount of having shorter servie durations, but the same number of servers. (We ould equivalently onsider queues with longer inter-arrival times of ourse; however, for the domination arguments of Setion 3 to hold it is essential that the two systems being ompared have the same set of arrival times. It is therefore more onvenient to adjust the servie durations instead.) Suppose that servie times in the more stable system are distributed as βs for some β (0, ]: this is equivalent to the servie times being distributed as S, but with eah server now ompleting work at rate β. So we an ompare the two systems as in Setion 3, feeding both the same sets of arrival times and servie durations, but with the time-to-oalesene in (2) replaed by C β t = βu β t (nβ t ) (where we have one again used the supersript to indiate the parameter that varies between the queues under onsideration). In a similar manner to Example 4, it is easy to onjure up a set of arrival times and servie durations suh that the system ompleting work at rate β has C β t > Ct for some values of t. However, we note that in this new setting equation (4) beomes { Cτ β C β τ if A β τ = max{c β τ, β(u β τ () + S)} if / Aβ τ. Using this, it is a simple exerise to hek that if the ondition of heorem 6 is satisfied, then the sandwihing proesses for the faster-working system will oalese no later than do U and L for the original M/G/ queue. In other words, we an perform omnithermal simulation in this setting by simply replaing step 2 of the Omnithermal Algorithm with the following variant: 2. For eah required β (0, ), onstrut L β over [, 0], using the same set of arrival times and servies as in the onstrution of L. Return L β 0 as a perfet equilibrium draw of the Kiefer-Wolfowitz vetor for the M/G/ queue, in whih work is ompleted at rate β. 2

(Step in whih we possibly extend some simulations further into the past does not hange at all.) In onlusion, omnithermal perfet simulation for multi-server queues under FCFS alloation may be performed in a pratial and effiient manner; moreover, one we have done the work of extending any runs of the dominating proess further into the past, the resulting simulations may be used to produe perfet samples when either the number of servers or the rate at whih work is performed is inreased. Referenes Blanhet, J. and X. Chen (206). arxiv 60.05499. Perfet Sampling of Generalized Jakson Network. Blanhet, J., J. Dong, and Y. Pei (205). Perfet Sampling of GI/GI/ Queues. arxiv 508.02262. Blanhet, J., Y. Pei, and K. Sigman (205). Exat sampling for some multi-dimensional queueing models with renewal input. arxiv 52.07284. Connor, S. B. and W. S. Kendall (205). Perfet simulation of M/G/ queues. Advanes in Applied Probability 47 (4), 039 063. Grimmett, G. (995). he stohasti random-luster proess and the uniqueness of randomluster measures. he Annals of Probability 23 (4), 46 50. Kiefer, J. and J. Wolfowitz (955). On the theory of queues with many servers. ransations of the Amerian Mathematial Soiety 8 (), 8. Propp, J. G. and D. B. Wilson (996). Exat sampling with oupled Markov hains and appliations to statistial mehanis. Random Strutures and Algorithms 9, 223 252. Sigman, K. (20). Exat Simulation of the Stationary Distribution of the FIFO M/G/ Queue. Journal of Applied Probability 48A, 209 23. 3