
B. Maddah
INDE 504 Discrete-Event Simulation
Output Analysis (2)

Statistical Analysis for Steady-State Parameters

In a nonterminating simulation, the interest is in estimating long-run steady-state measures of performance. However, initial conditions can bias the estimate. The most common technique for avoiding this bias is to warm up the model, also called initial-data deletion: identify an index $l$ (for discrete-time processes) or a time $t_0$ (for continuous-time processes) beyond which the output no longer appears to drift statistically, then use only the data beyond this cut-off point to estimate the desired measure of performance.

For example, suppose the objective is to estimate the steady-state mean of a process based on observed values $Y_1, Y_2, \ldots, Y_m$ in a single replication. After removing the $l$ initial transient observations, an estimator for the mean is

$$\bar{Y}(m, l) = \frac{\sum_{i=l+1}^{m} Y_i}{m - l}.$$

There is no straightforward approach for choosing the warm-up period $l$.
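The truncated estimator above can be sketched in a few lines of Python. The function name `truncated_mean` and the sample series are illustrative assumptions, not part of the notes.

```python
def truncated_mean(y, l):
    """Mean of Y_{l+1}, ..., Y_m, i.e. of y[l:], after deleting l warm-up values."""
    m = len(y)
    if not 0 <= l < m:
        raise ValueError("need 0 <= l < m")
    return sum(y[l:]) / (m - l)


# Made-up output series whose first three values are still in the transient.
y = [0.0, 1.0, 2.0, 4.0, 5.0, 6.0]
print(truncated_mean(y, 0))  # no deletion: 3.0
print(truncated_mean(y, 3))  # first three observations deleted: 5.0
```

With `l = 0` the estimator reduces to the ordinary sample mean, which is pulled toward the initial condition when a transient is present.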

The simplest method for finding $l$ is a graphical procedure based on multiple replications. (Many replications are used because it is difficult to estimate $l$ from just one replication.) The idea is to plot an estimate of $E[Y_i]$ versus $i$ and choose $l$ by eyeballing the point where the curve flattens out. To estimate $E[Y_i]$, first average across replications to obtain $\bar{Y}_i$, then apply a moving average with window $w$ for further smoothing, which gives $\bar{Y}_i(w)$. A plot of $\bar{Y}_i(w)$ versus $i$ is then used to estimate $l$.
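The two smoothing steps can be sketched as follows. This is a minimal illustration, not the notes' own code: the function names are hypothetical, and the centered window that shrinks near the start of the series follows Welch's procedure as it is commonly described, which the notes do not spell out, so treat that detail as an assumption.

```python
def average_across_replications(reps):
    """reps[j][i] is observation i of replication j; returns the per-index means."""
    n = len(reps)
    m = len(reps[0])
    return [sum(rep[i] for rep in reps) / n for i in range(m)]


def moving_average(ybar, w):
    """Centered moving average with half-width w.

    Near the start the half-width shrinks to the index itself so the window
    stays symmetric. Values are returned for 0-based indices 0 .. m-w-1.
    """
    m = len(ybar)
    smoothed = []
    for i in range(m - w):
        h = min(i, w)  # half-width actually available at index i
        window = ybar[i - h : i + h + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Two made-up replications of five observations each.
reps = [[1, 2, 3, 4, 5], [3, 4, 5, 6, 7]]
ybar = average_across_replications(reps)
print(ybar)
print(moving_average(ybar, 1))
```

Plotting the smoothed series against the index and picking the point where it levels off gives the candidate warm-up period.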

Note finally that there is no easy way to specify the run length in a nonterminating simulation [1]. You can (i) rely on a graphical procedure to judge whether steady state has been reached, or (ii) keep increasing the length of the simulation until the variability in the output measure is reasonably small.

[1] Whitt (1998) [Management Science 35: 1341-1366] offers simple formulas for estimating the run length of queueing simulations. See also Maddah et al. (2017) [European Journal of Operational Research, 262: 602-619] for extensive testing and validation of Whitt's formulas.

Replication/Deletion Approach for Estimating Means in a Nonterminating Simulation

Assume that an appropriate warm-up period $l$ has been identified. The replication/deletion technique starts with $n$ replications, each having $m$ observations, and deletes the first $l$ observations of each replication. It then proceeds just as in the terminating case: the $n$ truncated replication means are treated as iid observations, and a CI for the mean is built using the normal approximation and the t-distribution.

Other Approaches for Estimating Means in a Nonterminating Simulation

Other approaches for estimating means in a nonterminating simulation are based on having one long run (versus many short replications in replication/deletion).
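For reference, the replication/deletion interval that these single-run methods are contrasted with can be sketched as below. The function name is hypothetical, and the critical value `t_crit` is assumed to be read from a t-table (upper $\alpha/2$ point with $n-1$ degrees of freedom) because the Python standard library has no t quantile function.

```python
import math
import statistics


def replication_deletion_ci(reps, l, t_crit):
    """CI for the steady-state mean from n replications, dropping l warm-up values each.

    reps[j] is the output sequence of replication j.
    t_crit is the t-table critical value with n - 1 degrees of freedom.
    """
    # One truncated mean per replication; these are treated as iid observations.
    means = [sum(rep[l:]) / (len(rep) - l) for rep in reps]
    n = len(means)
    center = statistics.mean(means)
    half_width = t_crit * math.sqrt(statistics.variance(means) / n)
    return center - half_width, center + half_width


# Three made-up replications, warm-up of two observations, 90% interval
# (t-table value for 2 degrees of freedom and upper tail 0.05 is 2.920).
reps = [[0, 0, 4, 6], [0, 0, 5, 7], [0, 0, 6, 8]]
print(replication_deletion_ci(reps, l=2, t_crit=2.920))
```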

Batch Means

The batch means approach considers a single run with $m$ observations and divides the observations into $n$ batches of length $k$ each ($m = nk$). (Warm-up period data may be eliminated first.) Let $\bar{Y}_j(k)$ be the mean of the observations in batch $j$, $j = 1, \ldots, n$.

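The batch means approach is based on the approximate assumption that the $\bar{Y}_j(k)$ are iid normal and that they provide an unbiased estimator for the mean. Then, a confidence interval for the mean is found as

$$\bar{Y}(m) \pm t_{n-1,\,\alpha/2} \sqrt{\frac{S^2(n)}{n}},$$

where

$$\bar{Y}(m) = \frac{\sum_{j=1}^{n} \bar{Y}_j(k)}{n}, \qquad S^2(n) = \frac{\sum_{j=1}^{n} \left(\bar{Y}_j(k) - \bar{Y}(m)\right)^2}{n - 1},$$

and $t_{n-1,\,\alpha/2}$ is the upper $\alpha/2$ critical point of the t-distribution with $n - 1$ degrees of freedom.

The main advantage of the batch means approach is its simplicity; the major drawback is the difficulty of choosing an appropriate batch size that reduces the correlation between batch means. See the BCNN book (pp. 447-450) for correlation tests, an example, and further details.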
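A minimal sketch of the batch means interval, under the same iid-normal assumption on the batch means. The function name is hypothetical, `t_crit` is assumed to come from a t-table, and any observations left over when $m$ is not a multiple of the number of batches are simply dropped here, which is a simplification.

```python
import math


def batch_means_ci(y, n_batches, t_crit):
    """Return (grand mean, half-width) of the batch means confidence interval.

    y is the output of one long run (warm-up already removed if desired).
    t_crit is the t-table critical value with n_batches - 1 degrees of freedom.
    """
    k = len(y) // n_batches  # batch length
    if k == 0 or n_batches < 2:
        raise ValueError("need at least 2 batches and at least 1 observation per batch")
    batch_means = [sum(y[j * k : (j + 1) * k]) / k for j in range(n_batches)]
    grand_mean = sum(batch_means) / n_batches
    s2 = sum((b - grand_mean) ** 2 for b in batch_means) / (n_batches - 1)
    return grand_mean, t_crit * math.sqrt(s2 / n_batches)


# Made-up run of six observations split into three batches of two.
center, half_width = batch_means_ci([1, 3, 2, 4, 3, 5], n_batches=3, t_crit=2.920)
print(center - half_width, center + half_width)
```

The code does not check whether the batch means are actually uncorrelated; that check is the hard part the text refers to.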

Comparing Alternative System Configurations

Since the output of a simulation is random, comparing different systems via simulation should be done on the basis of a statistical analysis. For example, when comparing two systems, one system can be better on some replications and worse on others. Telling which system is better can only be done approximately, based on statistical analysis.

Example 1

Suppose a bank is considering two possible ATM systems.
- Zippy: M/M/1 queue with arrival rate 1 and one fast server with mean service time 0.9 minutes.
- Klunky: M/M/2 queue with arrival rate 1 and two slow servers with mean service time 1.8 minutes each.

The performance measure of interest is the mean delay of the first 100 customers, $d(100)$ (the system starts empty).

The true measures (evaluated by exact analysis) are $d_Z(100) = 4.13$ and $d_K(100) = 3.70$, so Klunky is better. Simulating these two systems and comparing the output of one run at a time, over 100 replications, gives the following: in 53 out of 100 replications, the estimate of $d_Z(100)$ was smaller than the estimate of $d_K(100)$. That is, the estimated P{wrong answer} = 0.53. This means that an analysis based on a single run is likely to lead to wrong answers.

To improve the comparison, one could simulate each system for $n$ replications and then base the decision on comparing the averages of the $n$ replications. Doing 100 comparisons this way (with a total of $100n$ replications) gave the following results for different values of $n$.
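An experiment of this kind can be sketched as follows. This is an illustrative reimplementation, not the code or the results behind the notes: the function names, the random seed, and the use of independent random numbers for the two systems are all assumptions, and the fractions it prints will differ from the figures quoted in the text.

```python
import random


def mean_delay(rng, n_servers, mean_service, n_customers=100):
    """Average delay in queue of the first n_customers (FIFO, system starts empty)."""
    free_at = [0.0] * n_servers  # time at which each server next becomes idle
    clock = 0.0
    total_delay = 0.0
    for _ in range(n_customers):
        clock += rng.expovariate(1.0)  # exponential interarrival times, rate 1
        s = min(range(n_servers), key=free_at.__getitem__)  # earliest-free server
        start = max(clock, free_at[s])
        total_delay += start - clock
        free_at[s] = start + rng.expovariate(1.0 / mean_service)
    return total_delay / n_customers


def fraction_zippy_looks_better(n_reps, n_experiments, seed=0):
    """Fraction of experiments in which Zippy's average delay over n_reps
    replications is below Klunky's, i.e. the data point to the wrong system."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_experiments):
        zippy = sum(mean_delay(rng, 1, 0.9) for _ in range(n_reps)) / n_reps
        klunky = sum(mean_delay(rng, 2, 1.8) for _ in range(n_reps)) / n_reps
        wrong += zippy < klunky
    return wrong / n_experiments


for n_reps in (1, 5, 10):
    print(n_reps, fraction_zippy_looks_better(n_reps, n_experiments=100))
```

Each experiment averages `n_reps` independent replications per system before comparing, which is the decision rule described in the paragraph above.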

A dot plot illustrates what is happening. This example highlights the need for methods to assess the uncertainty and give statistical bounds or guarantees for conclusions and decisions.

Confidence Intervals for the Difference of 2 Systems

Consider two alternative systems with performance measures $\mu_i$, $i = 1, 2$ (the mean of something). Suppose that $n_i$ replications are made for system $i$.

Let $X_{ij}$ be the observation from system $i$ on replication $j$. The idea is to use the $X_{ij}$'s to build a confidence interval for $\mu_1 - \mu_2$. If the interval misses 0, we conclude that there is a statistical difference between the two systems. The interval also gives an idea of how significant the difference between the two systems is.

There are two approaches for building confidence intervals for $\mu_1 - \mu_2$: paired-t and two-sample-t.

Paired-t Confidence Interval

Assume $n$ replications of each system are performed ($n_1 = n_2 = n$). Let $Z_j = X_{1j} - X_{2j}$, $j = 1, \ldots, n$. Then the $Z_j$'s are IID. The paired-t approach works by building a confidence interval based on the sample mean and variance of the $Z_j$'s,

$$\bar{Z}(n) = \frac{\sum_{j=1}^{n} Z_j}{n}, \qquad S_Z^2(n) = \frac{\sum_{j=1}^{n} \left(Z_j - \bar{Z}(n)\right)^2}{n - 1}.$$

Assume that the $Z_j$'s are normally distributed (which may be justified by the central limit theorem). Then the following is an approximate $100(1 - \alpha)$ percent confidence interval for $\mu_1 - \mu_2$:

$$\bar{Z}(n) \pm t_{n-1,\,\alpha/2} \sqrt{\frac{S_Z^2(n)}{n}}.$$

One important fact about the paired-t confidence interval is that $X_{1j}$ and $X_{2j}$ need not be independent. This can be useful when utilizing a variance reduction technique.

Example 2

Consider comparing two alternative ordering policies in an (s, S) inventory system, (20, 40) and (20, 80). The measure of performance of interest is the average monthly cost over a 120-month horizon. The simulation results of five replications of the two systems are as follows. Then $\bar{Z}(5) = 4.98$ and $S_Z^2(5) = 12.19$, and using a 10% significance level, $t_{4,\,0.05} = 2.13$, an approximate 90% paired-t confidence interval is

$$4.98 \pm 2.13 \sqrt{\frac{12.19}{5}} = 4.98 \pm 3.33 = (1.65,\ 8.31).$$
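The arithmetic of this interval can be checked with a short sketch. The function name is hypothetical; the inputs are the summary statistics of the example, taking the sample variance of the differences as 12.19 (the value that reproduces the stated half-width of 3.33) and the t-table value 2.132 for 4 degrees of freedom, which the text rounds to 2.13.

```python
import math


def paired_t_ci(z_bar, s2_z, n, t_crit):
    """Paired-t interval from the sample mean and sample variance of the differences."""
    half_width = t_crit * math.sqrt(s2_z / n)
    return z_bar - half_width, z_bar + half_width


lo, hi = paired_t_ci(z_bar=4.98, s2_z=12.19, n=5, t_crit=2.132)
print(round(lo, 2), round(hi, 2))  # 1.65 8.31
```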

Since the interval does not contain 0, the two (s, S) policies are significantly different. In addition, the second policy seems to be better (as it has the lower cost).

A Two-Sample Confidence Interval

This method allows samples of unequal sizes (i.e., $n_1 \ne n_2$) from the two systems. However, it requires that $X_{1j}$ and $X_{2j}$ be independent. Independence can be achieved by using different random number streams for obtaining $X_{1j}$ and $X_{2j}$. The method works by estimating the mean and variance for each sample separately. That is, one first estimates

$$\bar{X}_i(n_i) = \frac{\sum_{j=1}^{n_i} X_{ij}}{n_i}, \qquad S_i^2(n_i) = \frac{\sum_{j=1}^{n_i} \left(X_{ij} - \bar{X}_i(n_i)\right)^2}{n_i - 1}, \qquad i = 1, 2.$$

An approximate $100(1 - \alpha)\%$ confidence interval for $\mu_1 - \mu_2$ is

$$\bar{X}_1(n_1) - \bar{X}_2(n_2) \pm t_{\hat{f},\,\alpha/2} \sqrt{\frac{S_1^2(n_1)}{n_1} + \frac{S_2^2(n_2)}{n_2}},$$

where

$$\hat{f} = \frac{\left[S_1^2(n_1)/n_1 + S_2^2(n_2)/n_2\right]^2}{\left[S_1^2(n_1)/n_1\right]^2/(n_1 - 1) + \left[S_2^2(n_2)/n_2\right]^2/(n_2 - 1)}.$$
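A minimal sketch of the two-sample interval and of the estimated degrees of freedom $\hat{f}$. The function names are hypothetical and the critical value is assumed to be read from a t-table. The inputs are the summary statistics quoted in Example 3: a difference of sample means of 4.98, sample variances 4 and 3.76, five replications per system, and a critical value of 1.86.

```python
import math


def welch_degrees_of_freedom(s2_1, n1, s2_2, n2):
    """Estimated degrees of freedom f-hat for the two-sample interval."""
    v1, v2 = s2_1 / n1, s2_2 / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def two_sample_ci(mean_diff, s2_1, n1, s2_2, n2, t_crit):
    """Interval for mu_1 - mu_2 given the difference of the two sample means."""
    half_width = t_crit * math.sqrt(s2_1 / n1 + s2_2 / n2)
    return mean_diff - half_width, mean_diff + half_width


f_hat = welch_degrees_of_freedom(4.0, 5, 3.76, 5)
lo, hi = two_sample_ci(4.98, 4.0, 5, 3.76, 5, t_crit=1.86)
print(round(f_hat, 2))             # 7.99
print(round(lo, 2), round(hi, 2))  # 2.66 7.3
```

Because $\hat{f}$ is generally not an integer, it is rounded (here to 8) before looking up the critical value.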

Example 3

Applying the two-sample confidence interval approach to the inventory system in Example 2 gives $\bar{X}_1(5) = 125.57$, $\bar{X}_2(5) = 120.59$, $S_1^2(5) = 4$, $S_2^2(5) = 3.76$. Then $\hat{f} = 7.99 \approx 8$ and $t_{8,\,0.05} = 1.86$ (the 0.95 quantile of the t-distribution with 8 degrees of freedom), and an approximate 90% confidence interval is

$$4.98 \pm 1.86\,(1.246) = 4.98 \pm 2.32 = (2.66,\ 7.30).$$

This also indicates that the two policies are significantly different, with the second policy being better.

Paired-t versus Two-Sample Confidence Interval

Comparing Steady-State Measures of Two Systems

The approaches for building confidence intervals require iid observations $X_{11}, X_{12}, \ldots, X_{1n_1}$ and $X_{21}, X_{22}, \ldots, X_{2n_2}$ from each system. These can be obtained easily for a terminating simulation.

For nonterminating steady-state simulations, such observations can be obtained by the methods discussed before, such as replication/deletion and batch means.

Confidence Intervals for the Difference of > 2 Systems

If more than two systems are to be compared, then confidence intervals for the difference in the performance measure between different pairs of systems are required. A key issue here is the validity of the individual confidence intervals under multiple comparisons. Specifically, if $c$ confidence intervals are to be developed with an overall confidence level of $1 - \alpha$, then each interval must have a confidence level of $1 - \alpha/c$. (This follows from the Bonferroni inequality.) This implies that making multiple comparisons requires a large number of replications to achieve the high confidence level $1 - \alpha/c$.

Comparing with a Standard

Here, one of the systems (say system 1) is the standard (e.g., the existing configuration). We need to compare each of the other $k - 1$ systems with the standard and obtain confidence intervals on $\mu_i - \mu_1$, $i = 2, \ldots, k$. Then, to obtain an overall confidence of $1 - \alpha$, each interval should have a confidence level of $1 - \alpha/(k - 1)$.

Example 4

The simulation of five different (s, S) policies in an inventory system gave the following results in 5 replications. The performance measure of interest is the average monthly cost over a horizon of 120 months. The first policy is considered the standard system. The desired overall confidence level is 90%. So, intervals for $\mu_i - \mu_1$, $i = 2, 3, 4, 5$, are developed with a confidence level of $100(1 - 0.1/4)\% = 97.5\%$. The confidence intervals are as follows.

Here we see that systems 4 and 5 are clearly worse than the standard, while systems 2 and 3 are not necessarily better.

All Pairwise Comparisons

Here all systems are compared pairwise. This would be done if all alternatives receive equal consideration. Confidence intervals are built for $\mu_{i_1} - \mu_{i_2}$, $i_1 = 1, \ldots, k$, $i_2 = 1, \ldots, k$, $i_1 < i_2$. The total number of comparisons to be made is $k(k - 1)/2$. So, to obtain an overall confidence of $1 - \alpha$, the confidence level for every interval should be $1 - \alpha/[k(k - 1)/2]$. See the BCNN book (pp. 475-476) for an example of all pairwise comparisons.
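The per-interval confidence levels for both designs can be computed with a small sketch. The function names are hypothetical; the numbers use $k = 5$ systems and an overall level of 90%, as in Example 4.

```python
def per_interval_confidence(overall_alpha, n_intervals):
    """Level each interval needs so that, by the Bonferroni inequality, all
    n_intervals hold jointly with probability at least 1 - overall_alpha."""
    return 1.0 - overall_alpha / n_intervals


def comparisons_with_standard(k):
    """Number of intervals when k - 1 systems are compared with one standard."""
    return k - 1


def all_pairwise_comparisons(k):
    """Number of intervals when every pair of the k systems is compared."""
    return k * (k - 1) // 2


k, alpha = 5, 0.10
for label, c in [("with a standard", comparisons_with_standard(k)),
                 ("all pairwise", all_pairwise_comparisons(k))]:
    print(label, c, round(per_interval_confidence(alpha, c), 4))
```

For five systems, comparing with a standard needs 4 intervals at 97.5% each, while all pairwise comparisons need 10 intervals at 99% each, which is why the pairwise design demands more replications.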