Network Simulation Chapter 6: Output Data Analysis

Prof. Dr. Jürgen Jasperneite

Contents
- Introduction
- Types of simulation output
- Transient detection
- When to terminate a simulation

Goal of this chapter

This chapter looks at two main problems:
- When is it OK to use measurements made in a simulation program?
- How can we detect that enough measurements have been collected to satisfy the accuracy requirements?

For both problems, we will:
- look at the reasons why they arise at all
- investigate some automatic procedures to solve them

Introduction

In many simulation studies a great deal of time and money is spent on model development and programming. But a simulation study is not just an exercise in computer programming!

A very common mode of operation is to make a single run of somewhat arbitrary length and then to treat the results as the true model characteristics.

Introduction

The basic and most serious disadvantage of simulation: with a stochastic simulation, we do not get exact answers from a run. Two different runs of the same model give different numerical results.

- Random input: random numbers, random variates
- Simulation model/code
- Random output: performance measures

Interpreting simulation output therefore requires statistical analysis of the output data.
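
The point that two runs of the same model give different numbers can be tried out with a small sketch. The model here is an assumption chosen for illustration and does not come from the slides: a single-server queue with exponential interarrival and service times, simulated with the Lindley recursion. The function name `mean_waiting_time`, the rates, and the seeds are likewise hypothetical.

```python
import random


def mean_waiting_time(n_jobs, seed, arrival_rate=0.8, service_rate=1.0):
    """Average waiting time of the first n_jobs jobs in a single-server queue.

    Assumed model (not from the slides): exponential interarrival and
    service times, first job arrives at an empty system.
    """
    rng = random.Random(seed)  # the seed fully determines the random input
    wait = 0.0                 # waiting time of the current job
    total = 0.0
    for _ in range(n_jobs):
        total += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: waiting time of the next job
        wait = max(0.0, wait + service - interarrival)
    return total / n_jobs


# Same model, same run length, different random input
run_a = mean_waiting_time(5000, seed=1)
run_b = mean_waiting_time(5000, seed=2)
print("run A:", run_a)
print("run B:", run_b)
```

Each run yields only one realization of the random output, so neither number can be taken as the true model characteristic on its own.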

Classification of simulation studies with respect to output analysis

System simulated:
- Terminating: transient-state analysis
- Non-terminating: steady-state analysis

Types of simulations with regard to output analysis

Terminating:
- Parameters to be estimated are defined relative to specific initial and stopping conditions that are part of the model.
- There is a natural and realistic way to model both the initial and stopping conditions.
- Output performance measures generally depend on both the initial and stopping conditions.

Example: a bank opens each day at 09:00 and closes at 15:00.
- Initial condition: no customers
- Terminating event: departure of the last customer

Non-terminating:
- There is no natural and realistic event that terminates the model.
- We are interested in the long-run behaviour characteristic of normal operation, i.e. the steady-state behaviour.
- Theoretically, the steady-state behaviour does not depend on the initial conditions.
- Practically, we must ensure that the run is long enough for the initial-condition effects to have dissipated.

Example: any continuously running system, e.g. a computer centre or a traffic system.

Difficulties with non-terminating simulations:
- Initial bias
- How long to run the simulation

Transient removal

In most simulations, only the steady-state performance is of interest. Steady state means that the system has reached a stable state (equilibrium). Results from the initial part of the run, the transient state, should not be included.

Problem: transient removal

How do we find the initial part? There is no exact definition, so heuristics are used:
- Proper initialization
- Initial data deletion
- Moving average of independent replications
- Batch means
- Long runs
- Truncation

Long runs
- Assumption: the runs are long enough that the initial conditions do not affect the result.
- This wastes resources.
- It is difficult to determine what is long enough.
- Do not use this method in practice.

Truncation
- Assumption: the variability during steady state is less than that during the transient state.
- Given a sample of $n$ observations $\{x_1, x_2, \dots, x_n\}$, ignore the first $l$ observations and then calculate the minimum and maximum of the remaining $n - l$ observations.
- Repeat this for $l = 1, 2, \dots, n - 1$ until the $(l+1)$-th observation is neither the minimum nor the maximum of the remaining observations.
- The value of $l$ at this point gives the length of the transient state.

Example: consider the observations 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 10, 9, 10, 11, 10, 9, 10, 11, 10, 9, ...
- Ignoring the first observation ($l = 1$), the range of the remaining values is (2, 11). Since the 2nd value is equal to the minimum, the transient phase is longer than 1.
- Ignoring the first two observations ($l = 2$), the range is (3, 11). Again, the next (3rd) value is equal to the minimum.
- Finally, at $l = 9$ the range is (9, 11) and the $(l+1)$-th (10th) value is neither the minimum nor the maximum.
- The transient length is therefore 9, and the first nine values are discarded.

[Figure: value plotted against observation number i, with the transient interval marked]
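
The truncation heuristic can be sketched in a few lines of Python. This is a minimal illustration of the rule as stated above, not a reference implementation; the function name `transient_length` is hypothetical, and the data are the observations from the example.

```python
def transient_length(observations):
    """Truncation heuristic: smallest l (starting at 1) such that the
    (l+1)-th observation is neither the minimum nor the maximum of the
    observations remaining after the first l are ignored.

    Returns None if no such l exists.
    """
    n = len(observations)
    for l in range(1, n):
        remaining = observations[l:]   # x_{l+1}, ..., x_n
        next_value = remaining[0]      # the (l+1)-th observation
        if min(remaining) < next_value < max(remaining):
            return l
    return None


# Observations from the example
obs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
       10, 9, 10, 11, 10, 9, 10, 11, 10, 9]

l = transient_length(obs)
print("transient length:", l)        # 9, as in the example
print("steady-state part:", obs[l:])
```

The sketch recomputes the minimum and maximum for every candidate $l$, which is quadratic in the number of observations. That is acceptable for an illustration; for long runs the suffix minima and maxima could be precomputed in one backward pass.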

Terminating simulations

Again: in the majority of simulations, the steady-state performance is of interest. There are, however, systems that never reach a steady state. These systems always operate under transient conditions.

Example: bursty network traffic

[Figure: trace for the bursty network traffic example, 4000 observations]

Stopping criteria: variance estimation

It is important that the length of the simulation be properly chosen:
- If the run is too short, the results may be highly variable.
- If the run is too long, resources may be wasted.

The simulation should be run until the confidence interval (CI) for the mean narrows to a desired width. (Do you remember?) This is only valid if the observations are independent. Unfortunately, the outputs of most simulations are not independent.

Example: in a queueing system, if the waiting time of the i-th job is large, the waiting time of the (i+1)-th job is likely to be large too, and vice versa.

To compute the variance of the mean of correlated observations, a number of methods exist:
- Independent replications
- Method of regeneration
- Batch means
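
The dependence between successive waiting times can be checked numerically. The sketch below is an illustration under assumptions that are not part of the slides: a single-server queue with exponential interarrival and service times, and the hypothetical helper names `queue_waiting_times` and `lag1_autocorrelation`. It estimates the lag-1 autocorrelation of the waiting times and compares it with that of independent samples.

```python
import random


def queue_waiting_times(n_jobs, arrival_rate=0.8, service_rate=1.0, seed=1):
    """Waiting times of successive jobs in a single-server queue
    (assumed model: exponential interarrival and service times)."""
    rng = random.Random(seed)
    waits = [0.0]  # the first job finds an empty system
    for _ in range(n_jobs - 1):
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: the next waiting time depends on the previous one
        waits.append(max(0.0, waits[-1] + service - interarrival))
    return waits


def lag1_autocorrelation(xs):
    """Sample correlation between x_i and x_{i+1}."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1)) / n
    return cov / var


waits = queue_waiting_times(20000)
rng = random.Random(2)
independent = [rng.random() for _ in range(20000)]

print("queue waiting times:", lag1_autocorrelation(waits))
print("independent samples:", lag1_autocorrelation(independent))
```

A value close to 1 for the waiting times, against a value close to 0 for the independent samples, is what makes the plain CI formula for independent observations too optimistic for simulation output.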

Method of regeneration

[Figure: queue length over time, with the regeneration points marked]

Proposed by Crane and Iglehart (1974) and Fishman (1973).

Batch means
- Requires running one very long simulation, discarding the initial transient interval, and then dividing the remaining run into several parts of equal duration.
- Each part is called a batch or subsample.
- The mean of the observations in each batch is called the batch mean.
- The method requires studying the variance of these batch means as a function of the batch size.

A long run of $M$ observations can be divided into $n$ batches of size $k$ each, with $n = \lfloor M/k \rfloor$.

1. For each batch $i = 1, \dots, n$, compute the batch mean:

   $$\bar{x}_i = \frac{1}{k} \sum_{j=1}^{k} x_{ij}$$

2. Compute the overall mean:

   $$\bar{\bar{x}} = \frac{1}{n} \sum_{i=1}^{n} \bar{x}_i$$

3. Compute the variance of the batch means:

   $$\mathrm{Var}(\bar{x}) = \frac{1}{n-1} \sum_{i=1}^{n} (\bar{x}_i - \bar{\bar{x}})^2$$

The confidence interval for the mean is

$$\left[ \bar{\bar{x}} \mp z_{1-\alpha/2} \sqrt{\mathrm{Var}(\bar{x}) / n} \right]$$

(If $n < 30$, use the Student's t distribution instead of $z$.)

The computation is essentially the same as in the method of independent replications.

The batch size $k$ must be large, so that the batch means have little correlation. One way to find $k$ is to compute the autocovariance of successive batch means:

$$\mathrm{Cov}(\bar{x}_i, \bar{x}_{i+1}) = \frac{1}{n-2} \sum_{i=1}^{n-1} (\bar{x}_i - \bar{\bar{x}})(\bar{x}_{i+1} - \bar{\bar{x}})$$

Hence: increase (e.g. double) the batch size until the autocovariance of the batch means is small compared to their variance (less than 1%).
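
The batch means procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: all function names are hypothetical, the 1% rule is applied to the absolute value of the autocovariance, the confidence interval always uses the normal quantile (the Python standard library has no Student's t quantile, so the case n < 30 is not covered), and the demo data are an artificially correlated sequence rather than real simulation output.

```python
import math
import random
from statistics import NormalDist


def batch_means(xs, k):
    """Split xs into n = floor(M / k) batches of size k; return the batch means."""
    n = len(xs) // k
    return [sum(xs[i * k:(i + 1) * k]) / k for i in range(n)]


def batch_statistics(xs, k):
    """Return (n, overall mean, variance and autocovariance of the batch means).

    Needs at least 3 batches because the autocovariance divides by n - 2.
    """
    means = batch_means(xs, k)
    n = len(means)
    overall = sum(means) / n
    var = sum((m - overall) ** 2 for m in means) / (n - 1)
    cov = sum((means[i] - overall) * (means[i + 1] - overall)
              for i in range(n - 1)) / (n - 2)
    return n, overall, var, cov


def confidence_interval(overall, var, n, alpha=0.05):
    """CI for the mean based on the normal quantile (assumes n >= 30)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(var / n)
    return overall - half_width, overall + half_width


def choose_batch_size(xs, threshold=0.01, min_batches=3):
    """Double k until |autocovariance| < threshold * variance; None if never."""
    k = 1
    while len(xs) // k >= min_batches:
        _, _, var, cov = batch_statistics(xs, k)
        if abs(cov) < threshold * var:
            return k
        k *= 2
    return None


# Demo on an artificially correlated sequence (assumed data, not from the slides)
rng = random.Random(42)
value, data = 0.0, []
for _ in range(4000):
    value = 0.9 * value + rng.gauss(0.0, 1.0)
    data.append(value)

k = choose_batch_size(data)
if k is None:
    print("no batch size met the 1% criterion")
else:
    n, overall, var, cov = batch_statistics(data, k)
    print("batch size:", k, "batches:", n)
    print("CI for the mean:", confidence_interval(overall, var, n))
```

Doubling the batch size keeps the number of candidate sizes small (about log2 of the run length), at the cost of possibly overshooting the smallest adequate batch size by up to a factor of two.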

Application of this method

[Figure: queue size over the run; number of samples: 4000]

Batch size   Auto-COV   Variance   Batch mean   Number of batches
         1    3.992      4.28       1.85         4000
         2    3.478      4.038      1.85         2000
         4    2.7        3.69       1.854        1000
         8    1.754      3.12       1.856         500
        16    0.9259     2.317      1.854         250
        32    0.323      1.539      1.857         125
        64   -0.0026     0.8492     1.867          62
       128   -0.1284     0.4313     1.8592         31
       256   -0.0122     0.045      1.887          15
       512    0.00419    0.0172     1.89            7

The autocovariance becomes less than 1% of the sample variance at a batch size of 64.

End of this lecture

After finishing the lab you should...
- know about simulation principles (in particular, the so-called discrete event simulation)
- be able to build models for such systems
- be able to identify suitable performance metrics
- be familiar with basic statistical methods
- have some experience with a modern simulation tool

End of this lecture

1. Basic Simulation Modeling
   - Systems, models (continuous/discrete)
   - Discrete event simulation: time advance mechanism, principle, architecture
2. OPNET IT Guru - A Tool for Discrete Event Simulation
3. Review of Basic Probabilities and Statistics
   - Terminology (pmf, distribution function, pdf, cdf)
   - Mean, variance, expected value
   - Confidence interval, quantile
   - Determining the sample size
4. Building Valid, Credible Simulation Models
   - Difference between validation and verification
   - Best-practice techniques for computer programs
   - Techniques for increasing model validity and credibility
5. Traffic Modeling
   - Trace-driven, empirical distributions
   - Goodness-of-fit tests
   - Histograms, Q-Q plots
6. Output Data Analysis
   - Terminating, non-terminating, steady-state, transient removal
   - Stopping criteria: variance estimation

Any questions?