An Architecture for a WWW Workload Generator

Paul Barford and Mark Crovella
Computer Science Department
Boston University

September 18, 1997

1 Overview

SURGE (Scalable URL Reference Generator) is a WWW workload generator which is based on analytical models of WWW use. It relies on the fact that a great deal of prior work has been done in the analysis of WWW transactions, and that models for many of the important characteristics of WWW use have been developed. The goal of SURGE is to provide a scalable framework which, from the server's perspective, makes document requests which mimic a set of real users. SURGE is intended to be used for benchmarking, network traffic generation and simulation. This paper contains descriptions of the details of the SURGE framework, the additional models which we developed, and the statistical methods we used to parameterize the models.

2 The SURGE Framework

2.1 Structure of SURGE

SURGE is a scalable software framework within which a collection of models for the various components of WWW use are combined. From the top down, SURGE software resides on a set of clients which are connected to a WWW server as illustrated in Figure 1. Each SURGE client executes a set of threads, each of which is an ON/OFF source [15, 9]. Each thread requests a document set which is then transferred by the server (ON time). After receiving the document set, a thread sleeps for some amount of time (OFF time). This ON/OFF characteristic is an important difference between SURGE and other benchmarks such as [1, 2, 3, 4, 5] ([14] also includes OFF times).

[Figure 1: SURGE Architecture. Several SURGE client systems, each running multiple ON/OFF threads, connected over a LAN to a Web server system.]

When SURGE is started on a client system, it begins by populating a number of arrays with data generated by the analytic models of WWW client use. It then spawns a user-designated number of threads which execute the loop shown in pseudo code in Figure 2. URL_List, Inactive_OFF, Active_OFF and ON_Count each represent an array of data. Each array is generated by the collection of models within the SURGE framework which characterize different aspects of WWW use. The arrays are generated in the sequence shown in Figure 3.

2.2 Models Used in SURGE

We began the development of the SURGE framework by looking at the work that had been done in other studies. In particular, models for the following aspects of WWW use had already been proposed:

- A model for the distribution of unique sizes of files requested from servers is suggested in [9].
- A model for the distribution of sizes of all files transferred from servers is suggested in [9] (used in box 3 in Figure 3).
- A model for the popularity of all files requested is suggested in [10] (used in box 1 in Figure 3).
- A model for the temporal locality of files requested is suggested in [6] (used in box 4 in Figure 3).
- A model for the Inactive OFF times is suggested in [9, 12] (used in box 7 in Figure 3).

(1) SLEEP(NEXT Inactive_OFF item)
(2) WHILE (URLs remain in URL_List) {
(3)     I = NEXT ON_Count item
(4)     WHILE (I > 0) {
(5)         D = NEXT URL_List item
(6)         TRANSFER D
(7)         IF (I > 1) SLEEP(NEXT Active_OFF item)
(8)         I = I - 1
        }
(9)     SLEEP(NEXT Inactive_OFF item)
    }

Figure 2: Pseudo code executed by each SURGE thread

[Figure 3: Generation of SURGE Data Arrays. Boxes: (1) Total Requests/File (Popularity), (2) Unique File Sizes, (3) Matching, (4) Sequence Generator producing URL_List, (5) ON Count Generator producing ON_Count, (6) Active OFF Generator producing Active_OFF, (7) Inactive OFF Generator producing Inactive_OFF.]
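The loop in Figure 2 maps directly onto a thread body. The following is a minimal Python sketch of one ON/OFF thread, an illustration under our own assumptions rather than actual SURGE source code: the request is reduced to a plain HTTP GET, and names such as url_list and on_count simply mirror the data arrays described above.

import threading
import time
import urllib.request


def surge_thread(url_list, on_count, active_off, inactive_off):
    """One ON/OFF source: fetch a burst of documents (ON time), then sleep (OFF time)."""
    urls, counts = iter(url_list), iter(on_count)
    act, inact = iter(active_off), iter(inactive_off)

    time.sleep(next(inact))                       # (1) initial Inactive OFF period
    for i in counts:                              # (3) number of documents in this ON period
        for j in range(i):                        # (4)
            try:
                d = next(urls)                    # (5) next URL from URL_List
            except StopIteration:                 # (2) no URLs remain
                return
            urllib.request.urlopen(d).read()      # (6) transfer the document
            if j < i - 1:
                time.sleep(next(act))             # (7) Active OFF between documents
        time.sleep(next(inact))                   # (9) Inactive OFF: user "think time"


def run_client(n_threads, url_list, on_count, active_off, inactive_off):
    """Spawn a user-designated number of ON/OFF threads, as a SURGE client would.

    For simplicity each thread here walks its own iterators over the same
    arrays; a real generator would partition or coordinate this work."""
    threads = [threading.Thread(target=surge_thread,
                                args=(url_list, on_count, active_off, inactive_off))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()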

We began by incorporating these models into SURGE; however, they do not represent the complete set of information required to generate ON/OFF entities that follow the algorithm in Figure 2. We had to develop the following analytic models in order to complete the SURGE framework:

- A model for the unique sizes of files transferred which was accurate over the entire distribution (used in box 2 in Figure 3).
- A model for Active OFF periods (used in box 6 in Figure 3).
- A model for the number of documents transferred during ON periods (used in box 5 in Figure 3).

We believe that a good model for a representative WWW workload generator should include the aforementioned models of WWW use. However, there may be additional characteristics that could be added to this model. For example, [6] also describes spatial locality characteristics for WWW requests. Spatial locality has not been included in SURGE at the present time.

3 Statistical Overview

We used the BU client trace data sets discussed in [10] to develop the three models required to complete SURGE. To develop these models, we used the standard statistical methods described in [16]. These are the following:

1. Use empirical data to determine distributional models (using Q-Q or CDF plots) for each data set using the half sample method [11]. Use logarithmic transformations where necessary to distinguish important characteristics.

2. Estimate parameters for analytic models using maximum likelihood estimators and then test for the accuracy of the model using goodness-of-fit tests. We used the Anderson-Darling (A²) test. This empirical distribution function (EDF) test is a powerful means for analyzing the entire distribution and is suggested as the recommended EDF test for models with unknown parameters [11, 16].

3. If goodness-of-fit tests fail, then use goodness-of-fit metrics such as the λ² test suggested in [16] and described in [17]. This discrepancy measure is used to compare how well analytic models describe an empirical data set. It is a technique that relies on partitioning a data set into bins. We used the method suggested in [16] to select bin size (a sketch of this kind of binned comparison follows this list).

4. Data sets can often contain outliers which do not seem to be part of the distribution and can skew goodness-of-fit analysis. These must be investigated and explained before they are excluded from any analysis.

5. Validate the model using the second half of the sample data, or against other data sets if they are available.
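The binned comparison in step 3 can be illustrated as follows. The exact λ² statistic is defined in [16, 17]; the sketch below substitutes a plain chi-square-style discrepancy over quantile bins, which is only a stand-in for λ² but shows the shape of the procedure (fit each candidate by MLE, bin, compare, rank). The input file name is hypothetical.

import numpy as np
from scipy import stats


def binned_discrepancy(data, dist, params, n_bins=30):
    """Chi-square-style binned discrepancy between data and a fitted model.

    A simplified stand-in for the lambda^2 measure of [16, 17]: bin the data,
    compare observed bin counts with the counts the model predicts, and
    normalize so that different candidate models can be compared."""
    edges = np.unique(np.quantile(data, np.linspace(0.0, 1.0, n_bins + 1)))
    observed, _ = np.histogram(data, bins=edges)
    expected = len(data) * np.diff(dist.cdf(edges, *params))
    expected = np.clip(expected, 1e-9, None)      # guard against empty model bins
    return float(np.sum((observed - expected) ** 2 / expected) / len(data))


def rank_models(data, candidates):
    """Fit each candidate by MLE and return (name, params, discrepancy), best first."""
    results = []
    for name, dist in candidates.items():
        params = dist.fit(data)                   # maximum likelihood estimates
        results.append((name, params, binned_discrepancy(data, dist, params)))
    return sorted(results, key=lambda r: r[-1])


CANDIDATES = {"lognormal": stats.lognorm, "weibull": stats.weibull_min,
              "pareto": stats.pareto, "exponential": stats.expon}

# sizes = np.loadtxt("unique_file_sizes.txt")     # hypothetical trace extract
# print(rank_models(sizes, CANDIDATES))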

We believe that each of the analytic models within SURGE is required to generate a representative workload. However, it is not our intention to argue for invariant properties of any of these models, including those which we developed in this study. To that end, SURGE has been developed as a fully parameterizable tool. Our aim was to complete the models required and to present them as a reasonable approximation for each characteristic. The constant evolution of the WWW means that it is important to understand how each component of SURGE affects the workload generated at the server. The following sections describe the results of the analysis completed for each of the three models.

4 File Size Model

World Wide Web file sizes have been analyzed in a number of studies including [7, 9]. Crovella and Bestavros showed in [9] that the distribution of the set of unique file sizes transferred to users exhibits a "heavy tailed" characteristic. However, examination of the models proposed in this work shows that "heavy tailed" distributions do not accurately describe the entire distribution of the file sizes. In particular, the data shows that the distribution for file sizes less than approximately 100KB deviates from an ideal Pareto distribution. For SURGE, we had to develop a model which more accurately characterized the entire range of unique file sizes.

We began our unique file size model assuming that the heavy tailed characterization of the distribution was accurate. The model was then developed as a hybrid consisting of a new distribution for the body up to a threshold, followed by a heavy tailed model for the upper tail. Our task was to find the correct distribution for the body, its parameters, and the appropriate threshold between the body and upper tail.

4.1 Modeling Results

We began by selecting all of the unique file sizes from the BU client traces. We used half of this sample to develop our model, which gave us a data set with 11,188 points. Our assumption was that the underlying distribution had a specific characteristic which is "contaminated" by the long upper tail.

We generated cumulative distribution function (CDF) plots to determine the most appropriate model fit for the data. We found the best visual fit for this data to be the lognormal distribution. Figure 4 shows the histogram for the log_e transform of the data.

[Figure 4: BU Unique File Size Data (histogram of log(unique file sizes)).]

In addition, the λ² test on the data versus the distributional models (lognormal, Weibull, Pareto, exponential, log-extreme), whose parameters were derived using maximum likelihood estimators (MLE), showed that the best (smallest) λ² value was from the lognormal distribution. Our null hypothesis for the A² goodness-of-fit test for unique file sizes was:

H0: The log transformed unique file size sample comes from the normal distribution N(μ, σ²) with both μ and σ² derived from the sample using MLE.

The result of the A² test run on the test data set showed that we must reject this null hypothesis at any level of significance. The failure of the A² test was due primarily to the fact that a fairly large data set was used for the test. When large data sets are used in EDF tests to measure goodness-of-fit, the null hypothesis is often rejected, because small deviations from the ideal function are exaggerated when sample sizes are large [11, 16]. One way of dealing with this problem is to take random sub-samples of the data and test goodness-of-fit on these samples [16, 8]. On sub-samples of size 100 from our test data set, the A² test returned a goodness-of-fit statistic at the 10% significance level (meaning that with probability 10% the test will erroneously declare the hypothesis as false) for all of the sub-samples. Thus, it appears that the lognormal distribution is a reasonable model for the body of the unique file size distribution.
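That sub-sampling procedure is easy to reproduce. The sketch below (our own rough illustration, not the authors' code) log-transforms the sizes, draws random sub-samples of 100 points, and applies scipy's Anderson-Darling test for normality, whose tabulated critical values account for μ and σ being estimated from the sample; the input file name is hypothetical.

import numpy as np
from scipy import stats


def subsample_anderson(sizes, n_sub=100, n_trials=50, level=10.0, seed=0):
    """Fraction of random sub-samples that pass the A^2 test for log-normality
    at the given significance level (in percent)."""
    rng = np.random.default_rng(seed)
    log_sizes = np.log(np.asarray(sizes, dtype=float))
    passed = 0
    for _ in range(n_trials):
        sub = rng.choice(log_sizes, size=n_sub, replace=False)
        result = stats.anderson(sub, dist='norm')           # A^2 with estimated mu, sigma
        idx = list(result.significance_level).index(level)  # pick the 10% column
        if result.statistic < result.critical_values[idx]:
            passed += 1
    return passed / n_trials


# sizes = np.loadtxt("unique_file_sizes.txt")               # hypothetical trace extract
# print(subsample_anderson(sizes))                          # 1.0 means all sub-samples pass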

Censoring techniques were employed to determine where to split between the lognormal distribution for the body and the heavy tailed distribution in the tail. A sample is said to be right censored if all observations greater than some value are missing. Our sample is right censored since we assumed that it was contaminated with a heavy tail. We used the A² statistic to determine the cutoff point. Figure 5 shows the A² statistic versus the percent of sample data.

[Figure 5: Cutoff Threshold Analysis (A² statistic versus percent of sample used).]

The figure shows that the A² statistic decreases in value as the percent of the sample used decreases from 100% to 93%. It then increased in value until the percent of the sample used decreased to 82%. We believe that the decrease in the A² statistic until the 93% level was due to the effective elimination of the contaminating heavy tail, and that the statistic rose between 93% and 82% because we were eliminating data from the true distribution. Eliminating data below the 82% level begins to make the A² statistic look better simply due to reduced effective sample size. The 93% level for the data gave us a cutoff at approximately 133KB.

Finally, we tested this model against the second half of the sample data and found the λ² value to be very close to the value for the first half of the sample. This means that the proposed model is a reasonably good predictor for the data. Using the 133KB cutoff, we found that 93% of the total number of unique file sizes lie below the 133KB cutoff. This figure, along with the hybrid distributional model, allows us to generate the appropriate number of files for tests in SURGE. The summary for the hybrid, unique file size model is in Table 1. The fit of both the body and the tail of our unique file size model can be seen in the CDF plot for the body and the Log-Log Complementary Distribution (LLCD) plot for the tail in Figures 6 and 7. Both plots are necessary since the CDF tends to obscure discrepancies in the tail and the LLCD tends to do the same for discrepancies in the body.
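The cutoff search itself can be approximated by repeatedly censoring the sample and re-running the A² test on the retained body. The sketch below is only a rough re-creation of that analysis: scipy's anderson re-estimates the parameters from each truncated body and treats it as a complete sample, which is cruder than a proper censored-data EDF test, and the input file name is again hypothetical.

import numpy as np
from scipy import stats


def cutoff_scan(sizes, percents=range(100, 79, -1)):
    """A^2 statistic for the lower p% of the log-size sample, for each p.

    Drop the largest (100 - p)% of observations and test the remainder for
    log-normality; returns (percent, cutoff in bytes, A^2) tuples."""
    log_sizes = np.sort(np.log(np.asarray(sizes, dtype=float)))
    rows = []
    for p in percents:
        body = log_sizes[: int(len(log_sizes) * p / 100)]
        a2 = stats.anderson(body, dist='norm').statistic
        rows.append((p, float(np.exp(body[-1])), a2))
    return rows


# sizes = np.loadtxt("unique_file_sizes.txt")
# for p, cutoff, a2 in cutoff_scan(sizes):
#     print(f"{p:3d}%  cutoff ~ {cutoff / 1024:8.1f} KB   A^2 = {a2:.3f}")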

Component                 Model       Probability Density Function                   Parameters
Body                      Lognormal   p(x) = 1/(xσ√(2π)) · e^(-(ln x - μ)²/(2σ²))    μ = 9.357, σ = 1.318
Tail                      Pareto      p(x) = αk^α x^(-(α+1))                         k = 133K, α = 1.1
Cutoff                    133,225 bytes
Percent of files in body  93%
Percent of files in tail  7%

Table 1: Summary Statistics for SURGE File Size Model

[Figure 6: CDF: Unique File Size Data vs. Model, plotted against log(sizes).]

[Figure 7: LLCD: Unique File Size Data (+) vs. Model (*), log(P[X > x]) versus log(sizes).]
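Drawing synthetic file sizes from this hybrid model amounts to choosing the body with probability 0.93 and the tail otherwise. The sketch below hard-codes the Table 1 parameters and truncates the lognormal body at the cutoff by rejection so the two pieces do not overlap; it is an illustration of the model, not SURGE's generator.

import numpy as np

RNG = np.random.default_rng()


def hybrid_file_sizes(n, p_body=0.93, mu=9.357, sigma=1.318, k=133_225, alpha=1.1):
    """Sample n file sizes: lognormal body below the cutoff k, Pareto tail above it."""
    sizes = np.empty(n)
    in_body = RNG.random(n) < p_body

    # Body: lognormal draws, redrawn until they fall below the cutoff (rejection).
    draws = RNG.lognormal(mu, sigma, in_body.sum())
    while (too_big := draws >= k).any():
        draws[too_big] = RNG.lognormal(mu, sigma, too_big.sum())
    sizes[in_body] = draws

    # Tail: Pareto with shape alpha, scaled so its minimum is the cutoff k.
    sizes[~in_body] = k * (1.0 + RNG.pareto(alpha, n - in_body.sum()))
    return sizes


# Example: a million synthetic unique file sizes and some upper percentiles.
# print(np.percentile(hybrid_file_sizes(1_000_000), [50, 90, 99, 99.9]))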

5 OFF Time Model

A number of models for OFF times have been presented in [7, 9, 12]. The model presented in [9] proposes a structure which consists of two kinds of OFF times. The first, referred to as Active OFF time, is the time needed by the client machine to process transmitted files (interpret, format, and display). The second, referred to as Inactive OFF time, is the time that users take to examine the data that has been transmitted to their browser. We incorporate both of these types of OFF time into SURGE.

5.1 Modeling Results

We use the characterizations given in [9, 12] to define the OFF times for SURGE. The model presented in [9] gives a parameterized model for Inactive OFF time (which we use in SURGE) but not a model for the Active OFF time. Thus, we derived our model for Active OFF times from the BU client trace files. We consider an OFF time to be "Active" if it is less than a threshold time. We considered three different threshold values in our analysis: 1, 5 and 10 seconds. As can be seen in Figure 8, there is a strong clustering of data in the one second threshold histogram which does not continue past approximately one second, as can be seen in the five and ten second threshold histograms.

[Figure 8: Histograms of Active OFF times for 10, 5 and 1 second thresholds.]

From this, we conclude that the principal Active OFF period is less than one second, and we continued our analysis focusing on this data set. The trace data for the one second threshold gave us a half sample of 40,037 elements. We do not consider values from our model for Active OFF times extending beyond the one second threshold since we were not able to distinguish between machine generated and human generated OFF times in our data.
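Extracting the two kinds of OFF time from a trace reduces to thresholding the gaps between consecutive requests. The sketch below assumes a simplified per-user trace of request start times and transfer durations (the actual BU log layout differs); gaps below the threshold are treated as Active OFF and the rest as Inactive OFF.

import numpy as np


def split_off_times(start_times, durations, threshold=1.0):
    """Split one user's inter-request gaps into Active and Inactive OFF times.

    start_times and durations are in seconds and in request order; this is an
    assumed, simplified trace layout.  A gap below threshold is counted as
    Active OFF (browser processing), anything longer as Inactive OFF (think time)."""
    starts = np.asarray(start_times, dtype=float)
    ends = starts + np.asarray(durations, dtype=float)
    gaps = starts[1:] - ends[:-1]             # OFF time between consecutive requests
    gaps = gaps[gaps >= 0]                    # discard overlapping (parallel) requests
    return gaps[gaps < threshold], gaps[gaps >= threshold]


# active, inactive = split_off_times(starts, durs, threshold=1.0)
# print(len(active), "Active OFF samples;", len(inactive), "Inactive OFF samples")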

[Figure 9: CDF: Active OFF Time Model for BU Data versus the Weibull Distribution (the solid line is the empirical distribution function).]

Visual analysis via CDF plots led us to consider both Weibull and Beta distributions as possible models for the Active OFF times. The λ² value for the Beta distribution was 0.85, while the value for the Weibull distribution was smaller; thus, Weibull was selected as the distributional model for the Active OFF time data. The λ² test on the second half of the OFF time data yielded a value of 0.50, which shows that the Weibull model is a good predictor of Active OFF time values. The CDF plot of the data versus the Weibull distribution is shown in Figure 9. The A² test for the Active OFF time data versus the Weibull distribution was run using the following null hypothesis:

H0: Our Active OFF time sample comes from the Weibull distribution W(shape, scale) with both shape and scale derived from the sample using MLE.

We failed to find significance in our test at any level, which we attribute to the relatively large sample size. For random sub-samples of size 100, the A² test returned a goodness-of-fit statistic at the 1% significance level for approximately 50% of the sub-samples. This is additional evidence that the Active OFF time distribution is reasonably modeled by Weibull.

We used the model given in [9] for the Inactive OFF periods. This is a Pareto distribution with α = 1.5 and k = 1. The summary of the complete OFF time model used in SURGE is in Table 2. We placed an upper limit of 30 minutes on the Inactive OFF times generated by our model, because the random generation of values from the Pareto distribution will produce some very large values. We selected this as the limit since less than 0.05% of the measured OFF times from the traces were greater than 30 minutes.
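Generating OFF times from these fitted distributions is straightforward; the sketch below draws Active OFF times from the Weibull model above and Inactive OFF times from a Pareto with k = 1 and α = 1.5 capped at 30 minutes, matching the summary in Table 2 that follows. It is a plain numpy illustration rather than SURGE's own generator.

import numpy as np

RNG = np.random.default_rng()


def active_off_times(n, a=1.46, b=0.382):
    """Active OFF times in seconds: Weibull with scale a and shape b."""
    return a * RNG.weibull(b, n)


def inactive_off_times(n, k=1.0, alpha=1.5, cap=1800.0):
    """Inactive OFF times in seconds: Pareto(k, alpha) capped at 30 minutes."""
    samples = k * (1.0 + RNG.pareto(alpha, n))
    return np.minimum(samples, cap)


# Example: pre-populate the Active_OFF and Inactive_OFF arrays for one client.
# act, inact = active_off_times(100_000), inactive_off_times(100_000)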

Component      Model     Probability Density Function                 Parameters
Active OFF     Weibull   p(x) = (b x^(b-1) / a^b) · e^(-(x/a)^b)      a = 1.46, b = 0.382
Inactive OFF   Pareto    p(x) = αk^α x^(-(α+1))                       k = 1, α = 1.5
Threshold      1 second
Upper limit    1800 seconds

Table 2: Summary Statistics for SURGE OFF Time Model

6 ON Time Model

Based on the model presented in [9], multiple documents can be transferred during any ON period. In order to complete the SURGE framework, a model for the number of documents fetched during any ON period (ON counts) was necessary.

6.1 Modeling Results

Our data for ON counts was extracted from the BU client traces by counting the number of documents fetched by a given user for which the OFF time between documents was less than the one second threshold. This resulted in a half sample data set with 26,142 elements. Initial inspection of this data set showed that its distribution had a long right tail. This led us to test it against standard distributions which also have this characteristic (Lognormal, Log-extreme, Weibull, and Pareto). We also included the Zipf-Estroup discrete distribution, which is the discrete form of the Pareto distribution [13]. Our set of ON counts had a mean value of 2.71 and thus may not have been well approximated by a continuous function, so we also considered discrete distributions as candidates.

Generation of CDF and LLCD plots led us to consider the Pareto distribution the best visual fit for the data. We found that the MLE value for the Pareto parameter gave a model whose tail was not a good visual fit for the ON count data. A least squares estimate of the tail slope in the LLCD plot resulted in an estimate of α which gave a better visual fit for both the body, which can be seen in the CDF plot in Figure 10, and the tail, which can be seen in the LLCD plot in Figure 11.

[Figure 10: CDF: ON Count Model vs. Pareto Distribution, plotted against log(ON counts).]

[Figure 11: LLCD: ON Count Data (+) vs. Pareto Model (*), log(P[X > x]) versus log(data).]

Note that the least squares method gave a k value for the Pareto distribution of 1.5. We know that minimum ON counts are in fact 1.0 (that is, a single document), which is the value used in SURGE. The effect of this value for k is that SURGE will generate ON counts which are smaller on average than the empirical distribution.
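The least-squares tail estimate described above can be reproduced approximately by fitting a line to the empirical LLCD plot and taking the negative of its slope as α. The sketch below is a generic version of that idea (it fits over the whole range rather than a hand-selected tail region), not the authors' exact procedure; the input file name is hypothetical.

import numpy as np


def llcd_alpha(data):
    """Estimate the Pareto tail index alpha from the empirical LLCD plot.

    Sort the data, form the empirical complementary CDF P[X > x], and fit a
    straight line to (log10 x, log10 P[X > x]); -slope estimates alpha."""
    x = np.sort(np.asarray(data, dtype=float))
    ccdf = 1.0 - np.arange(1, len(x) + 1) / len(x)
    keep = ccdf > 0                                   # drop the last point (log of zero)
    slope, _intercept = np.polyfit(np.log10(x[keep]), np.log10(ccdf[keep]), 1)
    return -slope


# on_counts = np.loadtxt("on_counts.txt")
# print("alpha estimate:", llcd_alpha(on_counts))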

The Pareto distribution yielded the best λ² value of the candidate distributions (1.12). The predictive value of the model was tested via the λ² test on the second half of the ON count data. The resulting value of 1.96 indicated that the model was reasonable. The summary of the ON Count model used in SURGE is in Table 3. We added an upper limit parameter to this model since the random generation of values from the Pareto distribution will produce some very large values. The largest document count in our data was 138, and so this is the upper limit on ON counts generated by SURGE.
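ON counts can then be generated by drawing from a Pareto with k = 1 and the fitted α, rounding down to an integer, and capping at 138. The minimal sketch below uses the Table 3 parameters and is, again, only an illustration.

import numpy as np

RNG = np.random.default_rng()


def on_counts(n, k=1.0, alpha=2.354, cap=138):
    """Documents per ON period: Pareto(k, alpha) rounded down, capped at 138."""
    raw = k * (1.0 + RNG.pareto(alpha, n))
    counts = np.floor(raw).astype(int)        # discrete counts with a minimum of 1
    return np.minimum(counts, cap)


# Example: the empirical mean reported for ON counts is 2.71; the generated mean
# will be somewhat lower, as noted above.
# print(on_counts(1_000_000).mean())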

Component     Model    Probability Density Function      Parameters
ON Count      Pareto   p(x) = αk^α x^(-(α+1))            k = 1, α = 2.354
Upper limit   138

Table 3: Summary Statistics for SURGE ON Count Model

Our null hypothesis for the A² test for the ON Count data versus the Pareto distribution was:

H0: Our ON Count sample comes from the Pareto distribution P(k, α) with both k and α derived from the sample using LLCD plot estimation.

We failed to find any significance in our test, and again we attribute this to the relatively large size of the sample. We employed the random sub-sample method to see how well our data fit the Pareto model. For sub-samples of size 100, the A² test returned a goodness-of-fit statistic at the 25% level for a few of the samples. We attribute this to the fact that there were only a few values in the tail of our empirical data and the 100 item sub-samples rarely had values in their tails.

7 SURGE Model Summary

The model distributions and parameters used in SURGE are given in Table 4.

Component            Model       Probability Density Function                   Parameters
File Size - Body     Lognormal   p(x) = 1/(xσ√(2π)) · e^(-(ln x - μ)²/(2σ²))    μ = 9.357, σ = 1.318
File Size - Tail     Pareto      p(x) = αk^α x^(-(α+1))                         k = 133K, α = 1.1
Document Popularity  Zipf
Temporal Locality    Lognormal   p(x) = 1/(xσ√(2π)) · e^(-(ln x - μ)²/(2σ²))    μ = 1.5, σ = 0.80
Active OFF           Weibull     p(x) = (b x^(b-1) / a^b) · e^(-(x/a)^b)        a = 1.46, b = 0.382
Inactive OFF         Pareto      p(x) = αk^α x^(-(α+1))                         k = 1, α = 1.5
ON Count             Pareto      p(x) = αk^α x^(-(α+1))                         k = 1, α = 2.43

Table 4: Summary Statistics for Models used in SURGE

We do not argue that any of these characteristics of WWW use are invariant. We do argue that consideration of each of these characteristics is important in WWW server workload generation. SURGE has been designed to offer the distributions and parameters listed above as a baseline; however, the user can change parameter values as well as distributions for any characteristic.
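Because SURGE is fully parameterizable, the Table 4 entries are best thought of as a baseline configuration that a user can override. The dictionary below is purely illustrative (the names and structure are ours, not SURGE's) and simply records that baseline in a form that downstream generator code could consume.

# Hypothetical baseline configuration mirroring Table 4; every distribution and
# parameter is intended to be overridable by the user.
SURGE_BASELINE = {
    "file_size_body":    {"dist": "lognormal", "mu": 9.357, "sigma": 1.318},
    "file_size_tail":    {"dist": "pareto",    "k": 133_000, "alpha": 1.1},
    "popularity":        {"dist": "zipf"},
    "temporal_locality": {"dist": "lognormal", "mu": 1.5, "sigma": 0.80},
    "active_off":        {"dist": "weibull",   "a": 1.46, "b": 0.382},
    "inactive_off":      {"dist": "pareto",    "k": 1.0, "alpha": 1.5, "cap_seconds": 1800},
    "on_count":          {"dist": "pareto",    "k": 1.0, "alpha": 2.43, "cap": 138},
}

# A user modelling a server with a heavier file-size tail might override one entry:
# config = {**SURGE_BASELINE,
#           "file_size_tail": {"dist": "pareto", "k": 133_000, "alpha": 0.9}}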

In the expanded version of the paper describing SURGE, we will present the importance of each characteristic in detail. This will allow users to scale the various parameters of SURGE to more accurately model the expected workload on their server.

References

[1] Http client.
[2] SPECweb96.
[3] Web.
[4] Webcompare.
[5] WebStone. webstone.html.
[6] V. Almeida, A. Bestavros, M. Crovella, and A. Oliveira. Characterizing reference locality in the WWW. Technical Report TR-96-11, Boston University Department of Computer Science, Boston, MA 02215.
[7] M. F. Arlitt and C. L. Williamson. Web server workload characterization: The search for invariants. In Proceedings of the ACM SIGMETRICS '96 Conference, Philadelphia, PA, April 1996.
[8] Henry Braun. A simple method for testing goodness of fit in the presence of nuisance parameters. Journal of the Royal Statistical Society.
[9] M. E. Crovella and A. Bestavros. Self-similarity in World Wide Web traffic: Evidence and possible causes. In Proceedings of the 1996 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, May 1996.
[10] C. A. Cunha, A. Bestavros, and M. E. Crovella. Characteristics of WWW client-based traces. Technical report, Boston University Department of Computer Science.
[11] R. B. D'Agostino and M. A. Stephens, editors. Goodness-of-Fit Techniques. Marcel Dekker, Inc.

[12] S. Deng. Empirical model of WWW document arrivals at access link. In Proceedings of the 1996 IEEE International Conference on Communications, June 1996.
[13] N. Johnson, S. Kotz, and N. Balakrishnan. Discrete Univariate Distributions. John Wiley and Sons, Inc.
[14] Sunil U. Khaunte and John O. Limb. Statistical characterization of a World Wide Web browsing session. Technical report, College of Computing, Georgia Institute of Technology.
[15] W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson. On the self-similar nature of Ethernet traffic (extended version). IEEE/ACM Transactions on Networking, 2:1-15.
[16] Vern Paxson. Empirically-derived analytic models of wide-area TCP connections. IEEE/ACM Transactions on Networking.
[17] S. Pederson and M. Johnson. Estimating model discrepancy. Technometrics.
