The Planning and Analysis of Industrial Selection and Screening Experiments
Guohua Pan, Department of Mathematics and Statistics, Oakland University, Rochester, MI
Thomas J. Santner, Department of Statistics, Ohio State University, Columbus, OH
David M. Goldsman, School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA

1 Introduction

The purpose of this article is to explain methodology for designing and analyzing industrial experiments when selection and screening, rather than hypothesis testing, is the goal. In rough terms, we say the goal is one of selection and/or screening when the scientific objective is to find the best treatment. Empirical investigation, in the form of physical experiments, has been a key tool in the development of many advances in industrial product design, as well as online and offline improvements in manufacturing, during the last 75 years. More recently, physical experiments have also been used to design products that
are robust to the environmental conditions in which they are used, or robust with respect to the manufacturing process itself. Broadly speaking, at least three types of experiments have evolved. Historically, physical experiments were the earliest to be conducted; the first principles of the design and analysis of such experiments were developed in response to agricultural improvements, then later to meet industrial and medical needs. Simulation experiments are an attractive alternative to physical experiments when the experimenter has a complex physical system whose parts interact in a known manner but whose ensemble is not understood analytically. Such complex interacting structural components are typically combined with specified noise distributions to produce random output. Banks (1998) gives an overview of the field, and Goldsman and Nelson (1998) provide a survey of methods useful for designing simulation experiments to identify best treatments. In the last ten to fifteen years, a third form of experiment, commonly called a computer experiment, has become popular. In a computer experiment, a deterministic output is calculated for each set of input variables. Many phenomena that could previously be studied only using physical experiments can now be studied by these computer experiments. Computer experiments are possible when the mathematical model of the physical process of interest is known and an algorithm has been developed for solving the resulting mathematical system in a reasonable time frame on (fast) computing equipment. In engineering, dynamical models of physical systems implemented using finite-element methods are often the basis for computer code used in computer experiments. Because the output is deterministic, issues such as replication, randomization, and other fundamental tools for designing physical experiments are no longer appropriate.
The special techniques used to design and analyze computer experiments are discussed in detail in the survey articles by Koehler and Owen (1996) and Sacks, Welch, Mitchell and Wynn (1989). This article will discuss the design and analysis of physical and simulation experiments for frequently occurring industrial problems involving the identification of best or near-best treatments. Another very useful approach to the problem of identifying best treatments is that of forming simultaneous confidence intervals for important parameters related to this problem. Length considerations require this review paper to focus attention on selection and screening approaches, with one exception in Section 7, but we refer the reader to the seminal book-length treatment of simultaneous confidence intervals in Hsu (1996). Sections 2 and 3 discuss the familiar one-way layout, illustrating statistical methods both for the problem of designing experiments to select best treatments
and for analysis procedures to screen a given set of data to extract a (small) set of treatments containing best treatments. Section 4 reviews such problems for the case of completely randomized full-factorial experiments. Section 5 considers screening for fractional-factorial experiments for the specific goal of finding most robust engineering designs for products or manufacturing processes. Section 6 considers the important case of designing and analyzing randomization-restricted experiments; particular attention is given to identifying treatment combinations with largest means in split-plot experiments and to selection of robust product designs for the same setup. Finally, Section 7 reviews statistical procedures for finding treatments associated with smallest variances and gives a set of simultaneous confidence intervals for comparing treatment variances.

Throughout we use the standard notation in which a bar over a quantity and a dot replacing one or more of its subscripts mean that an average has been computed with respect to that (those) subscript(s); for example, $\bar{Y}_{i\cdot} = \sum_j Y_{ij}/n_i$. Also we let $\mathbf{1}_t$ denote the $t \times 1$ (column) vector with common element unity and $\mathbf{0}_t$ denote the $t \times 1$ zero vector. Further, for $x \in \mathrm{I\!R}$, let $\lceil x \rceil$ be the smallest integer greater than or equal to $x$. Finally, the notations $\Phi(\cdot)$ and $\phi(\cdot)$ denote the cumulative distribution function and density function, respectively, of the standard normal distribution.

We conclude this section by noting several other recent summary articles concerning selection and screening procedures. These include van der Laan and Verdooren (1989) and Gupta and Panchapakesan (1996) for general overviews, and Driessen (1992) and Dourleijn (1993) for results concerning subset selection in connected experiments.

2 Designing Single-Factor Experiments to Select Treatments Based on Means

2.1 Introduction

We begin with the problem of designing an experiment to select the best treatment in the simplest possible setting, that of a single-factor experiment.
Subsequent sections will both refine the model by allowing multiple factors and weaken the stochastic (normality) assumptions of the one-way layout. The underlying question of interest is: Which of $t$ competing normal distributions (or treatments, or systems, or populations) has the largest mean? (Without loss of generality, we could also be interested in finding the normal treatment having the smallest mean.) This broad question has a tremendous number of potential applications, for example: Which of $t$ fertilizers produces the highest crop yield? Which of $t$ groups scores highest on a certain standardized test? Which of $t$ factory layouts maximizes production flow?

Throughout this section, we adopt the following assumptions on the data that we will use in our experimentation.

Statistical Assumptions. Independent random samples of normal observations are to be taken that satisfy
$$Y_{ij} = \mu_i + \epsilon_{ij}, \quad 1 \le i \le t, \; j = 1, 2, \ldots, \qquad (2.1)$$
where the index $i$ represents the treatments, the index $j$ represents the observations taken within a treatment, and the $\epsilon_{ij}$ are mutually independent and, within each treatment, identically distributed (i.i.d.) $N(0, \sigma_i^2)$ measurement errors. The variances of the measurement errors, the $\sigma_i^2$'s, may or may not be known, but the treatment means, the $\mu_i$'s, are unknown. The number of observations to be taken from each treatment depends on the goal of the experiment, the design requirement, and the particular procedure employed; accordingly, the sample size may be fixed or random.

The goal and design requirement we use are those of Bechhofer (1954) as refined by Fabian (1962). To describe the goal, let the vector of unknown treatment means be given by $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_t)$. Further, the ordered but unknown $\mu$-values are denoted by $\mu_{[1]} \le \cdots \le \mu_{[t]}$. Roughly, our goal is to identify the treatment associated with the largest mean $\mu_{[t]}$, the best treatment. However, from a practical viewpoint, any treatment with a mean that is sufficiently close to $\mu_{[t]}$ will be essentially equivalent to the best. We formalize this idea by defining a treatment $i$ to be $\delta^*$-near-best provided that $\mu_i \ge \mu_{[t]} - \delta^*$. Of course, if the best and second-best treatments have means that are more than $\delta^*$ apart, i.e., $\mu_{[t]} - \mu_{[t-1]} > \delta^*$, then only the best treatment is $\delta^*$-near-best. On the other extreme, if all the means are within $\delta^*$ of the best treatment, i.e., $\mu_{[t]} - \mu_{[1]} \le \delta^*$, then all the treatments are $\delta^*$-near-best. Motivated by this consideration, we formulate our experimental goal.
Goal 2.1. To select a treatment that is $\delta^*$-near-best.

Here, the choice of $\delta^*$ should be made on engineering or scientific considerations to quantify practical equivalence. The difficulty of achieving Goal 2.1 depends on the spread in $\boldsymbol{\mu}$, whether the variances are known, and whether the $\sigma_i^2$'s are (or can be assumed to be) equal. All of the subsequent procedures we provide to select the best treatment will satisfy the following design (aka probability) requirement.

Design Requirement. Given a desired level of correct selection (CS) probability $P^*$ with $1/t < P^* < 1$ and a desired level of equivalence defined by $\delta^* > 0$, we require
$$P(\mathrm{CS} \mid \boldsymbol{\mu}, \boldsymbol{\sigma}^2) \ge P^* \quad \text{for all } \boldsymbol{\mu} \text{ and all unknown } \boldsymbol{\sigma}^2, \qquad (2.2)$$
where the event CS means that Goal 2.1 has been achieved. To illustrate the caveat about the $\boldsymbol{\sigma}^2$ in the Design Requirement: if the variances are known, then (2.2) need only hold for all $\boldsymbol{\mu}$, whereas if the variances are all unknown and not necessarily homogeneous, then (2.2) must also hold for all $\sigma_1^2, \ldots, \sigma_t^2 > 0$. As the notation in (2.2) suggests, the left-hand-side probability depends on the differences $\mu_{[t]} - \mu_{[i]}$, the sizes of the samples taken from each treatment, and the variances. Finally, we note that we restrict $P^* > 1/t$ because $P(\mathrm{CS}) = 1/t$ can be achieved without taking any observations by rolling a fair $t$-sided die and selecting the treatment so identified as the best one.

The remainder of this section describes statistical selection procedures appropriate in two variance settings. The single-stage procedure of Bechhofer (1954) is for the case in which the variances are known and equal, i.e., $\sigma_1^2 = \cdots = \sigma_t^2 = \sigma^2$. A two-stage procedure of Rinott (1978) is for the situation where the $\sigma_i^2$ are unknown and not necessarily equal. An efficient sequential procedure of Kim and Nelson (2000) is for the same assumptions as Rinott's procedure.
Lastly, we note that any of the procedures described in this section can be used to find the treatment with the smallest mean, either by an obvious modification or, formally, by direct application of the procedure for finding the treatment with the largest mean to the negatives of the responses.
2.2 Selection of $\delta^*$-Near-Best Treatments

2.2.1 Common Known-Variance Case

Bechhofer (1954) devised a statistical procedure that can be used to select a $\delta^*$-near-best treatment when the variances of the measurement errors of the $t$ treatments are homogeneous, i.e., $\sigma_1^2 = \cdots = \sigma_t^2 = \sigma^2$, and $\sigma^2$ is known. Bechhofer's procedure is intuitive: we select the treatment corresponding to the largest of the $t$ sample means. To describe his procedure we require the one-sided upper-$P^*$ equicoordinate point $z \equiv z_{t-1,1/2}^{(P^*)}$ of the equicorrelated multivariate normal distribution; this value is defined in brief by (2.4) and in the context of other relevant critical points in Section 8.

Selection Procedure $N_B$. For the given $t$ and $\sigma^2$, and desired $(\delta^*, P^*)$, compute
$$n = \left\lceil 2 \left( \frac{z \sigma}{\delta^*} \right)^2 \right\rceil. \qquad (2.3)$$
Sampling rule: Take a random sample of $n$ observations $Y_{i1}, \ldots, Y_{in}$ from each treatment $i$ $(1 \le i \le t)$.
Terminal decision rule: Calculate the sample means $\bar{Y}_i = \sum_{j=1}^{n} Y_{ij}/n$ $(1 \le i \le t)$. Select the treatment that yielded the largest sample mean, $\max_i \bar{Y}_i$, as having a $\delta^*$-near-best mean.

Fabian (1962) showed that Procedure $N_B$ achieves Goal 2.1. The sample size $n$ can be directly computed using the FORTRAN program usenb that is described in Bechhofer, Santner and Goldsman (1995) and is available as part of the package of FORTRAN programs associated with their book at the book's web site. Alternatively, $n$ can be determined from (2.3) because the critical point can be calculated via a zero-finding routine in conjunction with the very useful public-domain FORTRAN program mvnprd from Dunnett (1989). We illustrate both methods below.

Example 2.1. Suppose that five shop layouts are to be compared with respect to their mean output. The measured output from the $i$th layout is assumed to be
$N(\mu_i, \sigma^2)$ with $\sigma^2 = 1$. Define layout $i$ to be $\delta^*$-near-best provided its mean is within 0.1 of the largest mean, i.e., $\mu_i \ge \mu_{[5]} - 0.1$. In addition, the experimenter wishes to identify such a layout with probability at least 80%. Thus $t = 5$, $\delta^* = 0.1$, $P^* = 0.80$, and $\sigma = 1$. Invoking the program usenb produces the following dialogue.

> usenb
ENTER T AND SIGMA
5 1.0
CHOICE  GIVEN                 FIND
  1     N, P-STAR             DELTA-STAR
  2     DELTA-STAR, P-STAR    N
  3     N, DELTA-STAR         P-STAR
ENTER CHOICE (1, 2, OR 3)
2
ENTER DELTA-STAR AND P-STAR
.1 .80
FOR (T, DELTA*, P*, SIGMA) = (5, 0.10, 0.800, 1.00)
THE REQUIRED SAMPLE SIZE FOR PROCEDURE N_B IS N = 422
DO YOU WANT TO CONTINUE? PLEASE ANSWER Y/N
N

One can also determine $n$ using (2.3). The FORTRAN program find-zt implements a bisection algorithm to solve
$$P\left( Z_j \le z, \; 1 \le j \le t-1 \right) = P^* \qquad (2.4)$$
for $z$, where $(Z_1, \ldots, Z_{t-1})$ has the multivariate normal distribution with mean vector zero, unit variances, and common correlation 1/2. (The program find-zt is also included in the set of programs available at the book's web site.) For example, the dialogue below computes the critical point $z = z_{4,1/2}^{(0.80)}$, which equals 1.45 to two decimal places.

> find-zt
Enter P = No. Dimensions, and D.o.F. (0 for MVN) (or 0 0 to terminate)
4 0
Enter 1 for equal-rho case, or 2 to compute rho_ij from sample sizes, or 3 if rho_ij = b_i*b_j
1
Enter common rho
.5
Enter Quantile Probability
.80
Enter 1 to repeat with new coverage or 0 for new problem
0
Enter P = No. Dimensions, and D.o.F. (0 for MVN) (or 0 0 to terminate)
0 0
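The equicoordinate point in (2.4) can also be cross-checked with a short computation: for common correlation 1/2 the coordinates can be written as $Z_j = (X_j + X_0)/\sqrt{2}$ for i.i.d. standard normals $X_0, X_1, \ldots$, so the probability in (2.4) reduces to a one-dimensional integral that simple quadrature handles well. The Python sketch below (our own illustration, not the authors' FORTRAN code; it assumes the sample-size formula in (2.3)) reproduces the setting of Example 2.1.

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def equicoordinate_prob(z, dim):
    # P(Z_1 <= z, ..., Z_dim <= z) for equicorrelated (rho = 1/2) standard
    # normals, using Z_j = (X_j + X_0)/sqrt(2):
    #   P = E[ Phi(sqrt(2) z - X_0)^dim ],
    # evaluated by trapezoidal quadrature over X_0 in [-8, 8].
    lo, hi, steps = -8.0, 8.0, 4000
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        w = 0.5 if k in (0, steps) else 1.0
        phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += w * norm_cdf(math.sqrt(2.0) * z - x) ** dim * phi
    return total * h

def critical_point(p_star, dim, tol=1e-8):
    # Bisection for the upper-P* equicoordinate point of (2.4).
    lo, hi = 0.0, 10.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if equicoordinate_prob(mid, dim) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bechhofer_n(t, sigma, delta_star, p_star):
    # Sample size per treatment for Procedure N_B: n = ceil(2 (z sigma/delta*)^2).
    z = critical_point(p_star, t - 1)
    return z, math.ceil(2.0 * (z * sigma / delta_star) ** 2)

# Example 2.1: t = 5, sigma = 1, delta* = 0.1, P* = 0.80.
z, n = bechhofer_n(5, 1.0, 0.1, 0.80)
```

With these inputs the computed $z$ should round to 1.45 and $n$ should agree with the $n = 422$ reported by usenb, up to quadrature and roundoff error.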
Thus, from (2.3), $n = \lceil 2(z\sigma/\delta^*)^2 \rceil = 422$, as produced by usenb, to roundoff error.

2.2.2 Arbitrary Unknown-Variance Case: A Two-Stage Procedure

We now consider the more-realistic heteroscedastic problem in which the variances $\sigma_1^2, \ldots, \sigma_t^2$ are assumed to be completely arbitrary, i.e., unknown and not necessarily equal. This is clearly the situation that the experimenter would most likely encounter in practice. For example, this case covers a variety of problems that arise in comparing competing simulated systems. The most widely used procedures are those based on the work of Dudewicz and Dalal (1975) and Rinott (1978). These procedures are carried out in two stages: the first stage is meant to provide variance estimates; these estimates are then used to determine the number of observations that will be needed in the second stage to achieve the probability requirement. Rinott's procedure, presented next, is a bit easier to apply than that of Dudewicz and Dalal.
Selection Procedure $P_R$. Fix a common number of observations $n_0 \ge 2$ to be taken in Stage 1 from each treatment. For the given $t$ and desired $(\delta^*, P^*)$, find the constant $h = h(t, P^*, n_0)$ that solves (2.5).

Stage 1: Take a random sample of $n_0$ observations $Y_{i1}, \ldots, Y_{in_0}$ from each treatment $i$ $(1 \le i \le t)$.

Stage 2: Calculate the sample means and variances based on the initial $n_0$ observations,
$$\bar{Y}_i^{(1)} = \sum_{j=1}^{n_0} Y_{ij}/n_0 \quad \text{and} \quad S_i^2 = \sum_{j=1}^{n_0} \left( Y_{ij} - \bar{Y}_i^{(1)} \right)^2 / (n_0 - 1),$$
the latter an unbiased estimate of $\sigma_i^2$ based on $n_0 - 1$ degrees of freedom (d.f.). Take $N_i - n_0$ additional observations from treatment $i$, where
$$N_i = \max \left\{ n_0, \; \left\lceil \left( \frac{h S_i}{\delta^*} \right)^2 \right\rceil \right\}.$$
Calculate the sample means $\bar{Y}_i$ based on the combined results of the Stage 1 and Stage 2 samples. Select the treatment associated with the largest sample mean over both stages, $\max_i \bar{Y}_i$, as having a $\delta^*$-near-best mean.

The constant $h$ is the solution to
$$\int_0^\infty \left[ \int_0^\infty \Phi\!\left( \frac{h}{\sqrt{(n_0 - 1)\left( 1/x + 1/y \right)}} \right) f(x)\, dx \right]^{t-1} f(y)\, dy = P^*, \qquad (2.5)$$
where $\Phi$ is the standard normal c.d.f. and $f$ is the p.d.f. of the $\chi^2$-distribution with $n_0 - 1$ d.f. The FORTRAN program rinott in Bechhofer et al. (1995) calculates values of $h$ for arbitrary $(t, P^*, n_0)$, or one can use the table in Wilcox (1984). Note that Procedure $P_R$ does not eliminate treatments at the end of Stage 1, but instead uses the Stage 1 data from each treatment to estimate that treatment's variance and then uses the combined sample data to make a final decision. A second characteristic of $P_R$ is that one cannot bound a priori the total number of observations that it will require; intuitively this is the result of having no initial information about the treatment variances. Rather, the number of Stage 2 (and total) observations that $P_R$ takes from each treatment is a random variable, possibly different for each treatment.
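The Stage 2 sampling rule of $P_R$ is simple to implement once $h$ is in hand. The sketch below takes $h$ as given (in practice it comes from solving (2.5), e.g., via the rinott program or Wilcox's tables; the numerical value used here is purely illustrative) and computes the total sample sizes $N_i$ from the first-stage data.

```python
import math

def rinott_stage2_sizes(stage1_samples, h, delta_star):
    """Given Stage 1 data (one list of observations per treatment), Rinott's
    constant h, and the indifference parameter delta*, return for each
    treatment the total sample size N_i = max{n0, ceil((h * S_i / delta*)^2)}."""
    sizes = []
    for y in stage1_samples:
        n0 = len(y)
        ybar = sum(y) / n0
        s2 = sum((v - ybar) ** 2 for v in y) / (n0 - 1)  # unbiased variance
        n_i = max(n0, math.ceil((h * math.sqrt(s2) / delta_star) ** 2))
        sizes.append(n_i)
    return sizes

# Illustration with made-up first-stage data and a hypothetical h.
data = [[10.1, 9.8, 10.3, 10.0], [12.0, 14.0, 13.0, 11.0]]
sizes = rinott_stage2_sizes(data, h=2.5, delta_star=1.0)  # -> [4, 11]
```

Note how the second (more variable) treatment is assigned a larger total sample size, reflecting the variance-adaptive character of the procedure.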
The form of $N_i$ shows that (stochastically) the larger the variance of a particular treatment, the greater the number of Stage 2 observations one must take, as is to be expected intuitively.

Example 2.2. The Rinott procedure is particularly useful for analyzing the output from simulation experiments. Here each treatment corresponds to a system that is being studied in the simulation experiment. The following case study was first described in Goldsman, Nelson and Schmeiser (1991). The goal is to evaluate $t = 4$ different airline-reservation systems. The measure of performance is the expected time to failure, E[TTF], with larger values of this quantity being better. The system works if either of two computers works. Computer failures are rare, repair times are fast, and the resulting E[TTF] is large. The four systems arise from variations in parameters affecting the time-to-failure and time-to-repair distributions. From experience it is known that the E[TTF]'s are roughly equal for all four systems. Suppose that we are indifferent to expected differences of less than $\delta^*$ minutes (about two days). The large E[TTF]'s, the highly variable nature of rare failures, the similarity of the systems, and (as it turns out) the relatively small indifference zone yield a problem with reasonably large computational costs. We denote the E[TTF] arising from system $i$ by $\theta_i$ $(1 \le i \le 4)$, and the associated ordered $\theta$'s by $\theta_{[1]} \le \cdots \le \theta_{[4]}$. The $\theta_i$'s, $\theta_{[i]}$'s, and their pairings are completely unknown. Since larger E[TTF] is preferred, the mean difference between the two best systems in this example is $\theta_{[4]} - \theta_{[3]}$. For this example, the smallest difference worth detecting is $\delta^*$ minutes; we shall also use a specified desired probability of correct selection $P^*$ in this study. Let $X_{ij}$ $(j = 1, 2, \ldots)$ denote the observed time to failure from the $j$th independent simulation replication of system $i$. Application of the Rinott Procedure $P_R$ requires i.i.d. normal observations from each system.
If each simulation replication is initialized from a particular system under the same operating conditions, but with independent random number seeds, the resulting $X_{ij}$ will be i.i.d. for each system. However, the $X_{ij}$ cannot be justified as being normally distributed and, in fact, are somewhat skewed to the right. Thus, instead of using the raw $X_{ij}$ in Procedure $P_R$, Goldsman et al. (1991) applied the procedure to so-called macroreplication estimators of the $\theta_i$. These estimators group the $X_{ij}$ into disjoint batches and use the batch averages as the data to which Procedure $P_R$ is applied. In particular, they fix an integer number $m$ of simulation replications that comprise each macroreplication (that is,
$m$ is the batch size) and let
$$\bar{X}_{ij} = \frac{1}{m} \sum_{k=(j-1)m+1}^{jm} X_{ik}, \quad j = 1, \ldots, r_i,$$
where $r_i$ is the number of macroreplications to be taken from system $i$. The macroreplication estimators from the $i$th system, $\bar{X}_{i1}, \ldots, \bar{X}_{ir_i}$, are i.i.d. with expectation $\theta_i$. If $m$ is sufficiently large, then the Central Limit Theorem yields approximate normality for each $\bar{X}_{ij}$. No assumptions are made concerning the variances of the macroreplications.

Table 1. Summary of First- and Second-Stage Results for the Airline Reservation Experiment Based on an Initial Common Sample of Size $n_0$, where SE is the Estimated Standard Error of the Sample Mean.

To apply Procedure $P_R$, the authors conducted a pilot study to serve as the first stage of the procedure; each system was run for $n_0$ macroreplications, with each macroreplication consisting of the average of $m$ simulations of the system. The results are summarized in the first two rows of Table 1. (Shapiro-Wilk tests on the macroreplications from each system showed no evidence of non-normality.) From the appropriate table (or from program rinott), the critical constant $h$ was found for $t = 4$ and the desired $P^*$. The total sample sizes $N_i$ were computed for each system and are displayed in row 3 of Table 1; for example, System 2 required $N_2 - n_0$ additional macroreplications in the second stage (each macroreplication again being the average of $m$ system simulations). In all, a total of about 40,000 simulations of the four systems were required to implement Procedure $P_R$. The combined sample means for each system are listed in row 4 of Table 1, and the standard errors of each mean are listed in row 5.
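The macroreplication (batch means) construction just described is easy to state in code. The following sketch is our own illustration, not the original study's implementation: it groups raw replication outputs into disjoint batches of size $m$ and returns the batch averages to which Procedure $P_R$ would then be applied.

```python
def macroreplications(raw, m):
    """Group raw replication outputs into disjoint batches of size m and
    return the batch averages; any leftover observations are discarded."""
    r = len(raw) // m  # number of complete macroreplications
    return [sum(raw[j * m:(j + 1) * m]) / m for j in range(r)]

# Example: 12 raw outputs, batch size 4 -> 3 macroreplications.
batches = macroreplications(list(range(12)), 4)  # -> [1.5, 5.5, 9.5]
```

Averaging within batches both tames the right-skew of the raw times to failure (via the Central Limit Theorem) and preserves independence across batches, which is what $P_R$ requires.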
They clearly establish System 1 as having the largest E[TTF].

2.2.3 Arbitrary Unknown-Variance Case: A Sequential Procedure

For the same case as in Subsection 2.2.2, we present a multi-stage procedure due to Kim and Nelson (2000). After an initial stage of sampling similar to that of Rinott (1978), the Kim and Nelson procedure samples observations one-at-a-time from each contending treatment until a certain stopping criterion is met. The procedure has the advantage that it can eliminate treatments that appear to be noncompetitive with the best ones. Hence, if the experimenter is willing to use a sequential procedure, the Kim and Nelson procedure is often more efficient than Rinott's. So why would one ever want to use the Rinott procedure instead of Kim and Nelson's? If each stage of sampling requires a unit of time, then logistic constraints may prevent the experimenter from sampling as the Kim and Nelson procedure requires. This is more often the case when conducting physical experiments than computer experiments.
Selection Procedure $P_{KN}$.

Initialization: Fix a number of observations $n_0 \ge 2$ to be taken in the first stage. For the given $t$ and specified $(\delta^*, P^*)$, calculate the constant
$$\eta = \frac{1}{2} \left[ \left( \frac{2(1 - P^*)}{t - 1} \right)^{-2/(n_0 - 1)} - 1 \right]$$
and set $h^2 = 2\eta(n_0 - 1)$. Further, let $I = \{1, \ldots, t\}$ denote the set of treatments still in contention.

Stage 1: Take a random sample of $n_0$ observations $Y_{i1}, \ldots, Y_{in_0}$ from each treatment $i$. For each treatment $i$, compute the sample mean $\bar{Y}_i(n_0)$ based on the $n_0$ observations. For all $i \ne j$, compute the sample variance of the difference between treatments $i$ and $j$,
$$S_{ij}^2 = \frac{1}{n_0 - 1} \sum_{k=1}^{n_0} \left( Y_{ik} - Y_{jk} - \left[ \bar{Y}_i(n_0) - \bar{Y}_j(n_0) \right] \right)^2,$$
and set
$$N_{ij} = \left\lfloor \frac{h^2 S_{ij}^2}{(\delta^*)^2} \right\rfloor,$$
where $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$. Finally, for all $i$ set $N_i = \max_{j \ne i} N_{ij}$. If $n_0 > \max_i N_i$, then stop and select the treatment with the largest sample mean $\bar{Y}_i(n_0)$ as one having a $\delta^*$-near-best mean. Otherwise, set the sequential counter $r = n_0$ and go to the Screening phase of the procedure.

Screening: Set
$$I = \left\{ i \in I : \bar{Y}_i(r) \ge \bar{Y}_j(r) - W_{ij}(r) \text{ for all } j \in I, \; j \ne i \right\},$$
where
$$W_{ij}(r) = \max \left\{ 0, \; \frac{\delta^*}{2r} \left( \frac{h^2 S_{ij}^2}{(\delta^*)^2} - r \right) \right\}.$$

Stopping Rule: If $|I| = 1$, then stop and select the treatment with index in $I$ as having a $\delta^*$-near-best mean. If $|I| > 1$, take one additional observation $Y_{i,r+1}$ from each treatment $i \in I$ and increment $r$ by 1. If $r = \max_i N_i + 1$, then stop and select the treatment in $I$ associated with the largest sample mean $\bar{Y}_i(r)$; otherwise go to the Screening phase.
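To make the flow of $P_{KN}$ concrete, here is a compact simulation sketch following the initialization/screening/stopping structure above, with the constants $\eta$ and $h^2$ as stated. It is a simplified illustration, not production code; the sampler, true means, and parameter values below are hypothetical.

```python
import math
import random

def kim_nelson(sample, t, n0, delta, p_star):
    """Simplified sketch of Procedure P_KN.  `sample(i)` draws one observation
    from treatment i.  Returns the index of the selected treatment."""
    eta = 0.5 * ((2.0 * (1.0 - p_star) / (t - 1)) ** (-2.0 / (n0 - 1)) - 1.0)
    h2 = 2.0 * eta * (n0 - 1)

    data = [[sample(i) for _ in range(n0)] for i in range(t)]
    mean = lambda y: sum(y) / len(y)

    # Sample variances of pairwise differences, from Stage 1 data only.
    s2 = {}
    for i in range(t):
        for j in range(i + 1, t):
            d = [data[i][k] - data[j][k] for k in range(n0)]
            dbar = mean(d)
            s2[i, j] = s2[j, i] = sum((v - dbar) ** 2 for v in d) / (n0 - 1)

    n_max = max(math.floor(h2 * s2[i, j] / delta ** 2)
                for i in range(t) for j in range(t) if i != j)
    contenders = set(range(t))
    r = n0
    while len(contenders) > 1 and r <= n_max:
        means = {i: mean(data[i]) for i in contenders}
        keep = set()
        for i in contenders:
            # Keep i unless it falls too far below some other contender.
            w = lambda j: max(0.0, (delta / (2 * r)) * (h2 * s2[i, j] / delta ** 2 - r))
            if all(means[i] >= means[j] - w(j) for j in contenders if j != i):
                keep.add(i)
        contenders = keep
        if len(contenders) > 1:
            for i in contenders:       # one more observation per survivor
                data[i].append(sample(i))
            r += 1
    return max(contenders, key=lambda i: mean(data[i]))

# Hypothetical illustration: treatment 3 has the largest true mean.
random.seed(1)
true_means = [0.0, 0.0, 0.0, 3.0]
best = kim_nelson(lambda i: random.gauss(true_means[i], 0.5),
                  t=4, n0=10, delta=1.0, p_star=0.95)
```

Because the treatment with the current largest sample mean always survives the screening step, the contender set can never become empty, and the loop terminates once a single survivor remains or the observation budget $\max_i N_i$ is exhausted.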
The Kim and Nelson Procedure $P_{KN}$ is somewhat more complicated to implement than the Rinott Procedure $P_R$, but it has several distinct advantages. First, once the initial set of $n_0$ observations is collected from each treatment, $P_{KN}$ is parsimonious in taking additional observations in that they are added one-at-a-time and the data are examined to determine whether sufficient information has been collected to stop. In contrast, $P_R$ takes (potentially large) groups of observations. Second, $P_{KN}$ allows treatments to be discarded before the final decision; those treatments that appear inferior can legitimately be dropped from further consideration.

3 Single-Factor Experiments to Screen Treatments Based on Means

3.1 Introduction

Screening is concerned with the problem of identifying a subset of candidate treatments containing the best treatment from a set of potential candidates. As motivation, consider the data given in Table 2 on the mean amounts of fat absorbed by batches of doughnuts during cooking using one of four brands of shortening (A, B, C, D). These data are from a completely randomized experiment conducted by Lowe in 1935 at Iowa State University and analyzed in Snedecor and Cochran (1967). For each of the four shortenings considered, six batches of doughnuts were prepared.

Table 2. Grams of Fat Absorbed per Batch for Four Brands of Shortening (A, B, C, D), with Column Means.
Traditional hypothesis testing decides whether the mean amounts of fat absorbed by the four brands are equal. Selection procedures seek to identify the brand of shortening for which the mean amount of fat absorbed is minimal (or, in other problems, maximal, depending on the research goal). In this case, the data were not collected using an experimental design that had this objective in mind. This section introduces statistical screening procedures that allow the experimenter to identify a subset of the brands that contains the best (or a practically equivalent best) brand with at least a given confidence level.

For simplicity, we initially explain the philosophy of screening in the context of the same normal-theory single-factor experimental setup discussed in Section 2. For this model, we momentarily take the best treatment to be the treatment with the largest mean, $\mu_{[t]}$. Later, we relax this definition and consider identifying any treatment whose mean is equivalent to $\mu_{[t]}$ from a practical viewpoint, i.e., any treatment associated with a $\delta^*$-near-best mean. We also consider more complicated models where screening is natural to use.

In the one-way layout, screening procedures are concerned with identifying a subset of the test treatments that is guaranteed to contain the best treatment with high probability; these procedures are usually called subset selection procedures in the statistical literature. More accurately, the procedures discussed in this section eliminate ("screen out") inferior treatments (those that exhibit strong evidence of having small $\mu$-values) and retain all other treatments. To make clear the contrast between screening and the Bechhofer-Fabian selection formulation of Section 2, the latter is ordinarily employed when designing experiments; the choice of the sample size $n$ is central to guaranteeing the probability requirement. Screening methods can be employed when analyzing results from an experiment with arbitrary sample sizes.
However, there is no free lunch here, and the consequence of using too small an $n$ may be that a large subset is selected. Thus we describe two methods of choosing $n$ when the choice of sample size is under the control of the experimenter. Lastly, we note that unlike the Bechhofer-Fabian probability requirement (2.2), it will be seen that the subset selection probability requirement (3.1) can be guaranteed using a single-stage procedure even when the common variance is unknown.

In a more comprehensive approach, after the inferior treatments have been eliminated, the selected treatments can be subjected to further studies. For example, it might be desired to identify a $\delta^*$-near-best treatment using two stages of sampling: screening after the first stage to eliminate obviously inferior treatments, followed by additional sampling of the remaining treatments to select a single best treatment so that the overall design requirement of Bechhofer-Fabian
is satisfied. For common known measurement-error variance this can be accomplished by using, for example, the two-stage procedures of Tamhane and Bechhofer (1977) and Tamhane and Bechhofer (1979) (which use the Gupta procedure for screening), Santner and Behaxeteguy (1992) (which uses the Gupta Procedure $G$, below, for screening), or the general procedure of Nelson, Swann, Goldsman and Song (2000). For the moment, we wish to attain the following goal.

Goal 3.1. Select a (random-size) subset that contains the treatment associated with $\mu_{[t]}$.

Confidence Requirement. Given $P^*$ with $1/t < P^* < 1$, we require that
$$P(\mathrm{CS} \mid \boldsymbol{\mu}, \sigma^2) \ge P^* \qquad (3.1)$$
for all $\boldsymbol{\mu}$ and any unknown $\sigma^2$, where CS denotes the event that the subset of treatments selected includes the treatment associated with $\mu_{[t]}$.

In Subsection 3.2 we illustrate the philosophy of screening procedures to attain Goal 3.1 for the simplest case: that of the one-way layout with common known or unknown variance and equal numbers of observations from all treatments. As usual, screening for the treatment having the smallest mean can be accomplished by applying the procedures given in this section to the negatives of the original data. In Subsection 3.3, extensions of the basic method will be given that allow additional flexibility in the specification of the best treatment. Subsection 3.4 discusses screening procedures that control the number of treatments selected in the subset.

3.2 Screening for the Best Treatment

3.2.1 A Screening Procedure for the Balanced One-Way Layout

Gupta (1956) and Gupta (1965) proposed screening procedures for the balanced one-way layout in (2.1) when the variances are homogeneous and the common value $\sigma^2$ is either known or unknown. We emphasize that statistical screening procedures can be applied to data obtained either from designed experiments or from observational studies.
Screening Procedure $G$. Calculate the sample means $\bar{Y}_1, \ldots, \bar{Y}_t$ (each based on $n$ observations) from the $t$ treatments. Let $\bar{Y}_{[1]} \le \cdots \le \bar{Y}_{[t]}$ denote their ordered values. If $\sigma^2$ is unknown, also calculate $S^2$, the unbiased pooled estimate of $\sigma^2$ based on $\nu = t(n-1)$ d.f.

Case 1 ($\sigma^2$ Known): Include treatment $i$ in the selected subset if
$$\bar{Y}_i \ge \bar{Y}_{[t]} - z \, \sigma \sqrt{2/n}, \qquad (3.2)$$
where $z \equiv z_{t-1,1/2}^{(P^*)}$ is the one-sided upper-$P^*$ equicoordinate point of the equicorrelated multivariate normal distribution defined by (2.4) or (8.1).

Case 2 ($\sigma^2$ Unknown): Include treatment $i$ in the selected subset if
$$\bar{Y}_i \ge \bar{Y}_{[t]} - T \, S \sqrt{2/n}, \qquad (3.3)$$
where $T \equiv T_{t-1,1/2,\nu}^{(P^*)}$ is the one-sided upper-$P^*$ equicoordinate point of the equicorrelated multivariate central $t$-distribution defined by (8.2).

The constant required by Procedure $G$ for the unknown-$\sigma^2$ case can be computed using the program find-zt, as illustrated in Example 2.1.

Example 3.1. For the fat absorption data in Table 2, the sample variance, pooled over the four brands, is based on $t(n-1) = 4 \times 5 = 20$ d.f. Since $\sigma^2$ is unknown, the constant $T$ should be used in Procedure $G$. Notice that, for this example, the best brand is the one having the lowest mean amount of fat absorption. Therefore, the analogue of Procedure $G$ for selecting the treatment associated with the smallest mean selects brand $i$ if
$$\bar{Y}_i \le \bar{Y}_{[1]} + T \, S \sqrt{2/n}.$$
Brands A and D satisfy this criterion and are selected. From another viewpoint, it can be concluded, with confidence $P^*$, that brands B and C are inferior.
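Procedure $G$'s subset rule is a one-liner once the yardstick is known. The sketch below implements rules (3.2)/(3.3), plus the smallest-mean analogue used in Example 3.1, given a critical constant $c$ (equal to $z$ or $T$, obtained from find-zt or tables); the numerical inputs shown are hypothetical, not the doughnut data.

```python
import math

def gupta_subset(means, n, scale, c, smallest=False):
    """Subset-selection rule of Procedure G.  `means` are the t sample means
    (each based on n observations), `scale` is sigma (known-variance case) or
    the pooled S (unknown-variance case), and c is the corresponding critical
    point (z or T).  Returns the indices of the selected treatments."""
    yardstick = c * scale * math.sqrt(2.0 / n)
    if smallest:  # analogue for selecting the smallest mean (Example 3.1)
        lo = min(means)
        return [i for i, y in enumerate(means) if y <= lo + yardstick]
    hi = max(means)
    return [i for i, y in enumerate(means) if y >= hi - yardstick]

# Hypothetical illustration: three treatments, known sigma = 1, n = 5, c = 2.
subset = gupta_subset([10.0, 9.8, 8.0], n=5, scale=1.0, c=2.0)  # -> [0, 1]
```

Here the yardstick is $2 \cdot 1 \cdot \sqrt{2/5} \approx 1.26$, so treatments 0 and 1 survive while treatment 2 is screened out.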
Intuitively, Procedure $G$ works by including in the subset the treatment corresponding to the largest sample mean, plus possibly other treatments having sample means sufficiently close to the largest sample mean. The total number of additional treatments selected is not specified ahead of time but is a random variable that depends on the common sample size, the variability of the individual responses, the total number of treatments, and the distribution of the population means relative to one another. Physically, these characteristics are embodied in the length of the yardstick, $z\sigma\sqrt{2/n}$ (or $TS\sqrt{2/n}$), for the appropriate critical point. The yardstick is simply a constant times the true standard deviation of the difference between two sample means (or an estimate of this quantity). The yardstick has the appealing feature that its length increases with the noise in the data, with the desired confidence guarantee $P^*$ (through $z$ or $T$), and with the number of treatments (again, through $z$ or $T$), while its length decreases as the sample size increases. The number of additional treatments selected also depends on the relative spacing among the true population means; if the true means are close together, more treatments will tend to be selected than if the treatments are well separated.

The fact that screening procedures select a random number of treatments suggests that these procedures have some features in common with confidence intervals. In favor of this analogy, consider the confidence interval for the mean of a normal distribution with known variance, i.e., $\bar{Y} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$, where $\bar{Y}$ is the sample mean of the data and $z_{1-\alpha/2}$ is the upper $100(1-\alpha/2)$ percentile point of the standard normal distribution. This interval adjusts its center (and, in the analogous case when $\sigma^2$ is unknown, its width) to exactly achieve the nominal coverage for all $\mu$. The width of this interval depends on the noise $\sigma$ (or, when $\sigma$ is unknown, on the sample standard deviation) and on the sample size $n$.
However, the analogy with confidence intervals is not perfect, because screening procedures do not typically achieve their nominal confidence level exactly for all parameter configurations. It is true that the infimum over $\boldsymbol{\mu}$ of the probability of correct selection equals $P^*$ in both the known- and unknown-$\sigma^2$ cases, but in addition $P(\mathrm{CS} \mid \boldsymbol{\mu}) \to 1$ as $n \to \infty$ whenever $\mu_{[t]} > \mu_{[t-1]}$. This fact shows that $G$ is conservative: its achieved level is at least equal to its nominal level but can be arbitrarily close to unity. The same is true of all screening procedures.
3.2.2 Sample Size Determination for Screening

It is natural to consider criteria for choosing the sample size when an experimenter has (some) control over the sample size. We mention two criteria for choosing $n$ based on the number of treatments, $S$, selected by the screening procedure. Of course $S$ is random with $1 \le S \le t$. The most informative possible outcome is $S = 1$ and the least informative is $S = t$. For any fixed $\boldsymbol{\mu}$ with $\mu_{[t]} > \mu_{[t-1]}$, it can be shown that $E(S) \to 1$ as $n \to \infty$. This fact suggests that the experimenter could choose $n$ to control the expected number of treatments in the selected subset, i.e., given $\boldsymbol{\mu}$ (or a plausible configuration for it) and a bound $\gamma$, choose $n$ so that $E(S \mid \boldsymbol{\mu}) \le \gamma$.

Example 3.2. To illustrate this first strategy, consider using Procedure $G$ with a specified $P^*$ when $\sigma^2$ is known. Table 3 lists the expected number and proportion of treatments selected under the slippage mean configuration, $\boldsymbol{\mu} = (0, \ldots, 0, \delta)$, and the equi-spaced mean configuration, $\boldsymbol{\mu} = (0, \delta, 2\delta, \ldots, (t-1)\delta)$.

Table 3. Expected Number and Proportion of Treatments Selected Under the Slippage and Equi-spaced Configurations.

Notice the extent to which the operating characteristics of $G$ depend on the true underlying mean vector. The screening Procedure $G$ requires over 100 observations per treatment to reduce the expected number of treatments selected to under 2/3 of the original number if the means are as tightly clustered as in the slippage configuration. On the other hand, far fewer observations per treatment suffice to screen out roughly 90%
20 of the treatments when the components of are as spread out as the equi-spaced configuration. These two extremes give the investigator guidance as to the choice of sample size. All computations in Example 3.2 were performed using the FORTRAN program eval-ng from Bechhofer et al. 1995) that uses simulation to estimate these expected values. The following dialogue illustrates the use of this program * for $. >eval-ng ENTER T AND SIGMA ENTER NUMBER OF SIMULATION REPS AND INTEGER SEED ENTER N, DELTA AND P-STAR K P{S.GEQ.K SC} P{S.GEQ.K ES} E{S SC} = E{S ES} = As can be seen in Example 3.2, the program eval-ng also provides a second criterion for choosing $. We choose $ to force the probability that the procedure 20
21 $ $ selects more than a specified number of treatments to be small, i.e., give a true configuration small. and some, choose $ to make Example 3.2 continued) Table 4 lists the tail probabilities for Procedure when and the means are either in the slippage or equi-spaced configurations defined above for several runs of eval-ng. Notice that under the very spread.* Table 4 Tail Probabilities for when $ and ** Based on when Under the Slippage and Equi-spaced Configurations.* ** out) equi-spaced configuration, the chance that the subset contains 4 or fewer of the treatments is over 97%. In contrast, under the very pessimistic) slippage configuration, there still is about a 10% chance of selecting all treatments, a worst-case scenario. The experimenter can identify some intermediate configuration that is more realistic and require that be sufficiently small for some choice of. 21
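Both criteria can be approximated by direct Monte Carlo in the spirit of eval-ng (this sketch does not reproduce that program; the function name and the two mean configurations below are illustrative):

```python
import math
import random

def simulate_subset_size(means, sigma, n, h, reps=2000, seed=1):
    # Monte Carlo estimates of E{S} and the tail probabilities P{S >= k}
    # for the subset-selection rule with yardstick h * sigma * sqrt(2/n).
    rng = random.Random(seed)
    t = len(means)
    yard = h * sigma * math.sqrt(2.0 / n)
    sizes = []
    for _ in range(reps):
        # simulate the t sample means for one experiment
        ybars = [mu + rng.gauss(0.0, sigma / math.sqrt(n)) for mu in means]
        cutoff = max(ybars) - yard
        sizes.append(sum(1 for y in ybars if y >= cutoff))
    e_s = sum(sizes) / reps
    tail = {k: sum(1 for s in sizes if s >= k) / reps for k in range(1, t + 1)}
    return e_s, tail

# hypothetical t = 5 configurations with delta = 1:
slippage = [0.0, 0.0, 0.0, 0.0, 1.0]        # t-1 means tied, delta below best
equispaced = [-4.0, -3.0, -2.0, -1.0, 0.0]  # successive means delta apart
```

Running both configurations over a grid of $n$ values gives rough analogues of Tables 3 and 4 and shows how strongly $E\{S\}$ depends on the spacing of the means.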
3.2.3 Other One-Factor Experiments

The operation described above is common to all screening procedures. Every screening procedure estimates the characterizing quantity of interest for all treatments and bases its decisions on this estimate; treatments whose estimated characteristics are sufficiently distant from the best estimated level of the quantity are eliminated. These general features are illustrated, for example, in the screening procedures that we reference for two frequently occurring cases related to the balanced one-way layout.

First, a number of screening procedures have been proposed for the unbalanced one-way layout, a situation often arising in observational studies. (See Bechhofer et al. (1995) for a recommendation of a particular screening procedure.) A second important practical situation where screening procedures are well documented is when the experiment consists of a randomized complete block design, so that the observations have the structure

$Y_{ij} = \tau_i + \beta_j + \epsilon_{ij},$

where the measurement errors $\epsilon_{ij}$ are i.i.d., the $\tau_i$'s are treatment effects, and the $\beta_j$'s are additive block effects. This model, with batches viewed as block effects, more aptly describes the motivating doughnut example than does the one-way layout. See Bechhofer et al. (1995) or Section 3.4 of Gupta and Panchapakesan (1996), who present screening procedures for this situation. Also see Driessen (1992) and Dourleijn (1993) for screening procedures involving general connected designs.

3.3 Screening for $\delta^*$-Near-Best Treatments

This section introduces additional flexibility in screening by considering the alternative definition of best treatment that was introduced in Section 2, namely, identification of a subset containing at least one near-best treatment. Recall that the idea of this formulation is that the experimenter is willing to consider a treatment as equivalent to the best treatment if its mean is sufficiently close to the mean of the best treatment.
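Under the block model just described, the additive block effects cancel in differences of treatment means, so a yardstick rule of the same form applies with the number of blocks playing the role of the common sample size. A minimal sketch (the function name is ours; the constant $h$ would come from the appropriate table or program, not from this code):

```python
import math

def rcb_screen(y, h, sigma):
    # y[i][j]: response for treatment i in block j of a randomized
    # complete block design.  Block effects cancel in differences of
    # treatment means, so the one-way yardstick h*sigma*sqrt(2/b)
    # applies with b = number of blocks.
    b = len(y[0])
    means = [sum(row) / b for row in y]
    cutoff = max(means) - h * sigma * math.sqrt(2.0 / b)
    return [i for i, m in enumerate(means) if m >= cutoff]
```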
Recall that treatment $i$ is $\delta^*$-near-best if its mean $\mu_i$ is within a specified amount $\delta^* > 0$ of the largest treatment mean, i.e., $\mu_i > \mu_{[t]} - \delta^*$. The experimental goal and the associated probability (design) requirement considered in this section are stated in Goal 3.2 and Equation (3.4), respectively.

Goal 3.2 To select a (random-size) subset that contains at least one treatment whose mean satisfies $\mu_i > \mu_{[t]} - \delta^*$.
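For a known mean vector, identifying which treatments qualify as $\delta^*$-near-best is a one-liner (hypothetical helper; strict inequality as in the definition above):

```python
def near_best(means, delta_star):
    # Treatment i is delta*-near-best iff mu_i > max(mu) - delta*.
    cutoff = max(means) - delta_star
    return [i for i, m in enumerate(means) if m > cutoff]
```

Note that when the $t - 1$ smaller means all sit exactly $\delta^*$ below the largest, only the best treatment is near-best.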
Confidence Requirement Given $\delta^* > 0$ and $P^*$ with $1/t < P^* < 1$, and with $\mu$ unknown, we require

$P\{CS\} \ge P^*$ for all $\mu$, (3.4)

where CS denotes the event of correctly achieving Goal 3.2.

With the modified constant defined in (3.5) below, the basic procedure of Section 3.2 will guarantee probability requirement (3.4) (see Panchapakesan and Santner (1977), van der Laan (1992), and Roth (1978)). The procedure presented next is appropriate for the case where there is a common known measurement error variance $\sigma^2$; in this case the probability in (3.4) depends on $\mu$ only through the differences $\mu_{[t]} - \mu_i$.

Screening Procedure ($\delta^*$). Calculate the sample means $\bar{Y}_1, \ldots, \bar{Y}_t$ and let $\bar{Y}_{[1]} \le \cdots \le \bar{Y}_{[t]}$ denote the ordered sample means. Include treatment $i$ in the selected subset if

$\bar{Y}_i \ge \bar{Y}_{[t]} - \max\{0,\ h\sigma\sqrt{2/n} - \delta^*\},$ (3.5)

where $h$ is the one-sided upper-$P^*$ equicoordinate point of the equicorrelated multivariate normal distribution defined by (2.4) or (8.1). The minimum value of $P\{CS\}$ over all mean vectors $\mu$ occurs in the slippage configuration

$\mu_{[1]} = \cdots = \mu_{[t-1]} = \mu_{[t]} - \delta^*.$ (3.6)

When (3.6) holds, there is only one $\delta^*$-near-best treatment, namely, the treatment associated with mean $\mu_{[t]}$; hence the procedure is designed in such a way that this treatment is selected with probability at least $P^*$. Intuitively, it is easier to select a $\delta^*$-near-best treatment than to select the best treatment (the treatment associated with $\mu_{[t]}$). Thus it should be no surprise that this procedure uses a shorter yardstick to achieve Goal 3.2 than the yardstick $h\sigma\sqrt{2/n}$ used to select the best treatment.

As an illustration, Table 5 (from van der Laan (1992)) lists the probability that the procedure selects the treatment with mean $\mu_{[t]}$ when the means are in the slippage configuration (3.6) and the constant is chosen via (3.5) to guarantee $P\{CS\} \ge 0.90$.

Table 5: Probability that the Procedure Chooses the Best Treatment when the Constant is Chosen to Guarantee $P\{CS\} \ge 0.90$ Under the Slippage Configuration

By relaxing the requirement of selecting the best treatment to that of selecting a $\delta^*$-near-best treatment, the procedure increases the probability of correct selection from 0.67 (for finding the treatment with mean $\mu_{[t]}$) to 0.90 (for finding a treatment with mean greater than $\mu_{[t]} - \delta^*$). When the measurement error variance is unknown, Santner (1976) and Panchapakesan and Santner (1977) presented two-stage screening procedures for the goals of selecting a subset of treatments containing all $\delta^*$-near-best treatments and at least one $\delta^*$-near-best treatment, respectively.

3.4 Bounding the Number of Treatments Selected

This section discusses screening procedures for finding $\delta^*$-near-best treatments that bound the maximum number of treatments that the screening procedures select. Recall that the procedure of Section 3.2 can select all $t$ treatments; it is this feature that we modify in the procedure below. For the one-way layout with known measurement error variance, Gupta and Santner (1973) introduced a procedure that selects a random number of treatments subject to an a priori specified upper bound, say $Q$, on the number of treatments selected. Initially, they stressed a formulation with a goal and confidence requirement that selects a random-size subset, of size $Q$ or less, that contains the best treatment with a probability at least $P^*$ whenever $\mu$ satisfies the condition $\mu_{[t]} - \mu_{[t-1]} \ge \delta^*$. However, their procedure achieves the stronger guarantee $P\{CS\} \ge P^*$ for all $\mu$, where CS denotes the event of correctly achieving Goal 3.2 (Hooper and Santner (1979)). To illustrate the philosophy of this approach, we present their screening procedure.

Screening Procedure ($Q$). Calculate the sample means $\bar{Y}_1, \ldots, \bar{Y}_t$ and let $\bar{Y}_{[1]} \le \cdots \le \bar{Y}_{[t]}$ denote the ordered sample means. Include treatment $i$ in the selected subset if

$\bar{Y}_i \ge \max\{\bar{Y}_{[t]} - h\sigma\sqrt{2/n},\ \bar{Y}_{[t-Q+1]}\},$

where $h$ solves Equation (3.7) below. As usual, the width of the yardstick defining the procedure is a multiple of the standard error of the difference between pairs of sample means, namely $\sigma\sqrt{2/n}$. The constant $h$ is the solution of an integral equation, (3.7), whose integrand involves the incomplete beta function and the gamma function; see Gupta and Santner (1973) for its exact form.

The procedure cannot be implemented for all $t$, $Q$, $n$, $\delta^*$, and $P^*$. Intuitively, this is explained by the fact that, for the given $\delta^*$, the probability $P^*$ must be attainable for the fixed-size-subset procedure that chooses the $Q$ largest sample means. If it is, then one can choose a finite $h$ to permit a (possibly smaller, random-size) subset of size at most $Q$ to attain that $P^*$. In practice, for given $t$, $Q$, measurement error standard deviation $\sigma$, and confidence level $P^*$, one might be interested in finding

- the sample size $n$ required by a given rule (defined by $h$) to achieve a given level of practical equivalence (defined by $\delta^*$);
- the yardstick component $h$ (if it exists) required to achieve a given level of practical equivalence (defined by $\delta^*$) for a given sample size $n$;
- the level of practical equivalence $\delta^*$ (if it exists) corresponding to a given sample size $n$ and yardstick component $h$.

The FORTRAN program use-gs, available as part of the package of programs with Bechhofer et al. (1995) at tjs, can solve all three of these problems. We illustrate this program with an artificial example.

Example 3.3 In a study of fertilizing methods, an experimenter measures the yields on a set of 10 test plots planted with each of $t = 5$ fertilizers. Let $Y_{ij}$ denote the yield on the $j$th test plot that employs the $i$th fertilizing method, and let $\bar{Y}_i$ denote the mean of $Y_{i1}, \ldots, Y_{i,10}$ ($1 \le i \le 5$). We assume that the yields are independent and normally distributed with a common known standard deviation of $\sigma = 3$ bushels per test plot. Suppose that the experimenter wishes to select a subset containing any fertilizing method whose mean yield is within 2 bushels of the fertilizing method having the largest mean yield, that is, associated with any treatment having $\mu_i > \mu_{[5]} - 2$. Further assume that the experimenter would like the selected subset to contain no more than $Q = 3$ treatments.
If the experimenter wishes to guarantee this goal with probability at least 0.90 using the bounded screening procedure, then program use-gs gives the required constant $h$, as the following dialogue shows.
> use-gs
ENTER T,Q,SIGMA AND P-STAR
CHOICE   GIVEN            FIND
  1      N, H             DELTA-STAR
  2      DELTA-STAR, H    N
  3      DELTA-STAR, N    H
ENTER CHOICE (1, 2, OR 3)
3
ENTER DELTA-STAR AND N
FOR (T, Q, SIGMA, N, DELTA*) = (5, 3, 3.000, 10, 2.000)
THE CRITICAL VALUE H IS
DO YOU WANT TO CONTINUE? PLEASE ANSWER Y/N
n

The rule selects all those treatments $i$ satisfying $\bar{Y}_i \ge \max\{\bar{Y}_{[5]} - 3h\sqrt{2/10},\ \bar{Y}_{[3]}\}$.

This statistical screening procedure requires that the measurement error variance be known. Sullivan and Wilson (1984) and Sullivan and Wilson (1989) devised a bounded subset selection procedure for selecting a $\delta^*$-near-best normal treatment when the treatments have unknown and not necessarily equal variances. They also proposed a procedure to allow selection of a $\delta^*$-near-best normal treatment when the data from the treatments come from stationary normal processes with unknown means and unknown covariance structures, based on correlated sampling within each process. Santner (1975) presents bounded screening procedures for a number of other parametric families.
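Once $h$ is in hand, the bounded rule is easy to apply. A sketch with illustrative sample means and a hypothetical value of $h$ (the value printed by use-gs is not reproduced above):

```python
import math

def bounded_screen(ybars, h, sigma, n, q):
    # Keep treatment i iff it is within the yardstick h*sigma*sqrt(2/n)
    # of the largest sample mean AND among the q largest sample means,
    # so at most q treatments are ever selected.
    cutoff = max(ybars) - h * sigma * math.sqrt(2.0 / n)
    order = sorted(range(len(ybars)), key=lambda i: ybars[i], reverse=True)
    top_q = set(order[:q])
    return sorted(i for i in top_q if ybars[i] >= cutoff)
```

The second condition is what caps the subset size; without it the rule reduces to the unrestricted procedure of Section 3.2.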
4 Selection and Screening for Completely Randomized Experiments

Factorial experiments are a mainstay of industrial statistics. Many of the selection and screening goals for multi-factor experiments are direct generalizations of those for single-factor experiments. The structure of factorial experiments also yields several useful new goals, such as screening for the most robust product design or manufacturing process design. The randomization scheme involved in a factorial experiment plays an important role in formulating the statistical models. This section briefly mentions some basic issues for completely randomized factorial experiments. These issues are also relevant to randomization-restricted factorial experiments such as the split-plot experiments discussed in Section 6.

The specific goals and statistical procedures for multi-factor experiments depend on the structure of the underlying mean model. For specificity, consider a two-factor experiment where the (row) factor R has $r$ levels and the (column) factor C has $c$ levels. Let $Y_{ijk}$ be the observations from treatment combination $(i, j)$ with an unknown mean $\mu_{ij}$ ($1 \le i \le r$, $1 \le j \le c$). The treatment means can always be decomposed as

$\mu_{ij} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij},$

subject to the identifiability conditions $\sum_i \alpha_i = 0$, $\sum_j \beta_j = 0$, and $\sum_i (\alpha\beta)_{ij} = 0 = \sum_j (\alpha\beta)_{ij}$. The model is additive if $(\alpha\beta)_{ij} = 0$ for all $i$ and $j$, i.e.,

$\mu_{ij} = \mu + \alpha_i + \beta_j,$ (4.1)

and is non-additive otherwise. In practice, knowledge that the means are additive could be based on previous experience or knowledge of the factors being studied. When preliminary data are available, hypothesis tests for interactions can be conducted to facilitate the choice of a model. Strictly speaking, exactly additive models are rare in practice; however, non-additive models with small or negligible interactions are not uncommon. In these situations, additive models can provide good approximations to the true models and can yield increased efficiency while maintaining the validity of the analyses. Note that the null hypothesis of no interactions can be rejected based
on large samples even if the interaction effects are not large enough to invalidate conclusions based on the additivity assumption.

Suppose additivity holds. Denote the ordered values of the unknown $\alpha_i$'s and $\beta_j$'s by $\alpha_{[1]} \le \cdots \le \alpha_{[r]}$ and $\beta_{[1]} \le \cdots \le \beta_{[c]}$, respectively. The levels of factors R and C corresponding to the treatment combination having the largest mean are those associated with $\alpha_{[r]}$ and $\beta_{[c]}$, respectively. Therefore, under an additive model, selecting the best treatment combination is equivalent to simultaneously selecting the best levels of factors R and C. Chapter 6 of Bechhofer et al. (1995) provides details for several selection and screening procedures under additivity when completely randomized experiments can be run. The discussions there cover procedures that use single- and multi-stage sampling, as well as procedures having other features.

Suppose instead that the means are non-additive. Many of the procedures for single-factor experiments can be extended naturally to handle such situations. For example, to select or screen for the treatment combination associated with the largest mean, one can treat the two-factor experiment as a giant single-factor experiment with $rc$ treatments and apply the procedures discussed in the previous sections. Again, see Chapter 6 of Bechhofer et al. (1995) for details and references. The models for completely randomized factorial experiments can be viewed as special cases of those for factorial experiments with randomization restrictions. The procedures for split-plot designs presented in Section 6 can be applied to completely randomized experiments by setting the variances of the randomization-restriction errors to zero.

5 Screening to Identify Robust Products Based on Fractional-Factorial Experiments

This section considers screening procedures for the important industrial applications of determining robust product designs and robust manufacturing processes. In many quality improvement experiments, two types of factors are investigated.
The first are control factors that can be manipulated by the experimenter to form various product (or process) designs; control factors are also called design, engineering, or manufacturing factors. The second are noise factors that represent uncontrollable field or factory conditions affecting the performance or the manufacture of a product; noise factors are also called environmental factors. Because of the similarity of process improvement and product improvement, we
will use the language of product improvement for simplicity, although we keep in mind that these methods apply to both settings.

In quantifying robustness, we consider applications where it is appropriate to measure the performance or quality of a product by its worst performance under the different environments. This criterion is natural in situations where a low response at any combination of the levels of the noise factors can have potentially serious consequences. For example, engineering designs for seat belts or heart valves that fail catastrophically under rare, though non-negligible, sets of operating conditions must be eliminated early in the product design cycle. One useful goal in such applications is to identify product designs whose worst-case performances are among the best. (In a sense, this is similar to the larger-the-better criterion in robust design.) This section introduces a screening procedure to facilitate such identification. The procedure determines a random-size subset of the product designs so as to contain, with a prespecified confidence level, the product designs having the best worst-case performances.

In more detail, suppose that the mean of a response that characterizes the output quality depends on control factors and noise factors. In our notation, we distinguish the control and noise factors that interact with factors of the other type from the control factors having no interactions with noise factors and the noise factors having no interactions with control factors. Of special practical importance is the frequently occurring case when all factors are at two levels; however, nothing in the development below requires this assumption. We introduce the following notation to distinguish these two types of control and noise factors.
Notation       Interpretation
C, ..., C      Control factors that interact with noise factors
C, ..., C      Control factors that do not interact with noise factors
N, ..., N      Noise factors that interact with control factors
N, ..., N      Noise factors that do not interact with control factors

The superscripts attached to these symbols are mnemonic: the superscript on the first type of control factor C indicates that it interacts with one or more noise factors, while the superscript on the second type indicates that it interacts with no noise factors. A parallel
T E C H N I C A L B R I E F TOTAL JITTER MEASUREMENT THROUGH THE EXTRAPOLATION OF JITTER HISTOGRAMS Dr. Martin Miller, Author Chief Scientist, LeCroy Corporation January 27, 2005 The determination of total
More informationAdvanced Statistical Methods. Lecture 6
Advanced Statistical Methods Lecture 6 Convergence distribution of M.-H. MCMC We denote the PDF estimated by the MCMC as. It has the property Convergence distribution After some time, the distribution
More informationStochastic Histories. Chapter Introduction
Chapter 8 Stochastic Histories 8.1 Introduction Despite the fact that classical mechanics employs deterministic dynamical laws, random dynamical processes often arise in classical physics, as well as in
More informationSimulation. Where real stuff starts
1 Simulation Where real stuff starts ToC 1. What is a simulation? 2. Accuracy of output 3. Random Number Generators 4. How to sample 5. Monte Carlo 6. Bootstrap 2 1. What is a simulation? 3 What is a simulation?
More informationVarieties of Count Data
CHAPTER 1 Varieties of Count Data SOME POINTS OF DISCUSSION What are counts? What are count data? What is a linear statistical model? What is the relationship between a probability distribution function
More informationMore on Input Distributions
More on Input Distributions Importance of Using the Correct Distribution Replacing a distribution with its mean Arrivals Waiting line Processing order System Service mean interarrival time = 1 minute mean
More informationSTAT T&E COE-Report Reliability Test Planning for Mean Time Between Failures. Best Practice. Authored by: Jennifer Kensler, PhD STAT T&E COE
Reliability est Planning for Mean ime Between Failures Best Practice Authored by: Jennifer Kensler, PhD SA &E COE March 21, 2014 he goal of the SA &E COE is to assist in developing rigorous, defensible
More informationEstimation and sample size calculations for correlated binary error rates of biometric identification devices
Estimation and sample size calculations for correlated binary error rates of biometric identification devices Michael E. Schuckers,11 Valentine Hall, Department of Mathematics Saint Lawrence University,
More informationCOMPARING SYSTEMS. Dave Goldsman. November 15, School of ISyE Georgia Institute of Technology Atlanta, GA, USA
1 / 103 COMPARING SYSTEMS Dave Goldsman School of ISyE Georgia Institute of Technology Atlanta, GA, USA sman@gatech.edu November 15, 2017 Outline 1 Introduction and Review of Classical Confidence Intervals
More informationStatistical inference
Statistical inference Contents 1. Main definitions 2. Estimation 3. Testing L. Trapani MSc Induction - Statistical inference 1 1 Introduction: definition and preliminary theory In this chapter, we shall
More informationMethod A3 Interval Test Method
Method A3 Interval Test Method Description of the Methodology 003-004, Integrated Sciences Group, All Rights Reserved. Not for Resale ABSTRACT A methodology is described for testing whether a specific
More informationSIMULATION SEMINAR SERIES INPUT PROBABILITY DISTRIBUTIONS
SIMULATION SEMINAR SERIES INPUT PROBABILITY DISTRIBUTIONS Zeynep F. EREN DOGU PURPOSE & OVERVIEW Stochastic simulations involve random inputs, so produce random outputs too. The quality of the output is
More informationChange-point models and performance measures for sequential change detection
Change-point models and performance measures for sequential change detection Department of Electrical and Computer Engineering, University of Patras, 26500 Rion, Greece moustaki@upatras.gr George V. Moustakides
More informationThe t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies
The t-test: So Far: Sampling distribution benefit is that even if the original population is not normal, a sampling distribution based on this population will be normal (for sample size > 30). Benefit
More informationChapter 11. Output Analysis for a Single Model Prof. Dr. Mesut Güneş Ch. 11 Output Analysis for a Single Model
Chapter Output Analysis for a Single Model. Contents Types of Simulation Stochastic Nature of Output Data Measures of Performance Output Analysis for Terminating Simulations Output Analysis for Steady-state
More informationWooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics
Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).
More informationConfidence Intervals. Confidence interval for sample mean. Confidence interval for sample mean. Confidence interval for sample mean
Confidence Intervals Confidence interval for sample mean The CLT tells us: as the sample size n increases, the sample mean is approximately Normal with mean and standard deviation Thus, we have a standard
More informationEstimation of Quantiles
9 Estimation of Quantiles The notion of quantiles was introduced in Section 3.2: recall that a quantile x α for an r.v. X is a constant such that P(X x α )=1 α. (9.1) In this chapter we examine quantiles
More informationCHAPTER 4 THE COMMON FACTOR MODEL IN THE SAMPLE. From Exploratory Factor Analysis Ledyard R Tucker and Robert C. MacCallum
CHAPTER 4 THE COMMON FACTOR MODEL IN THE SAMPLE From Exploratory Factor Analysis Ledyard R Tucker and Robert C. MacCallum 1997 65 CHAPTER 4 THE COMMON FACTOR MODEL IN THE SAMPLE 4.0. Introduction In Chapter
More informationCONTENTS OF DAY 2. II. Why Random Sampling is Important 10 A myth, an urban legend, and the real reason NOTES FOR SUMMER STATISTICS INSTITUTE COURSE
1 2 CONTENTS OF DAY 2 I. More Precise Definition of Simple Random Sample 3 Connection with independent random variables 4 Problems with small populations 9 II. Why Random Sampling is Important 10 A myth,
More informationA Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints
Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note
More informationOptimization of Muffler and Silencer
Chapter 5 Optimization of Muffler and Silencer In the earlier chapter though various numerical methods are presented, they are not meant to optimize the performance of muffler/silencer for space constraint
More informationUncertainty due to Finite Resolution Measurements
Uncertainty due to Finite Resolution Measurements S.D. Phillips, B. Tolman, T.W. Estler National Institute of Standards and Technology Gaithersburg, MD 899 Steven.Phillips@NIST.gov Abstract We investigate
More informationFeature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size
Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Berkman Sahiner, a) Heang-Ping Chan, Nicholas Petrick, Robert F. Wagner, b) and Lubomir Hadjiiski
More informationChap The McGraw-Hill Companies, Inc. All rights reserved.
11 pter11 Chap Analysis of Variance Overview of ANOVA Multiple Comparisons Tests for Homogeneity of Variances Two-Factor ANOVA Without Replication General Linear Model Experimental Design: An Overview
More informationarxiv: v1 [stat.me] 14 Jan 2019
arxiv:1901.04443v1 [stat.me] 14 Jan 2019 An Approach to Statistical Process Control that is New, Nonparametric, Simple, and Powerful W.J. Conover, Texas Tech University, Lubbock, Texas V. G. Tercero-Gómez,Tecnológico
More informationMultivariate Distributions
IEOR E4602: Quantitative Risk Management Spring 2016 c 2016 by Martin Haugh Multivariate Distributions We will study multivariate distributions in these notes, focusing 1 in particular on multivariate
More informationOpen book and notes. 120 minutes. Covers Chapters 8 through 14 of Montgomery and Runger (fourth edition).
IE 330 Seat # Open book and notes 10 minutes Covers Chapters 8 through 14 of Montgomery and Runger (fourth edition) Cover page and eight pages of exam No calculator ( points) I have, or will, complete
More information4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49
4 HYPOTHESIS TESTING 49 4 Hypothesis testing In sections 2 and 3 we considered the problem of estimating a single parameter of interest, θ. In this section we consider the related problem of testing whether
More informationComputer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.
Simulation Discrete-Event System Simulation Chapter 0 Output Analysis for a Single Model Purpose Objective: Estimate system performance via simulation If θ is the system performance, the precision of the
More information18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages
Name No calculators. 18.05 Final Exam Number of problems 16 concept questions, 16 problems, 21 pages Extra paper If you need more space we will provide some blank paper. Indicate clearly that your solution
More informationProtean Instrument Dutchtown Road, Knoxville, TN TEL/FAX:
Application Note AN-0210-1 Tracking Instrument Behavior A frequently asked question is How can I be sure that my instrument is performing normally? Before we can answer this question, we must define what
More informationHypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3
Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest
More informationWorst-case design of structures using stopping rules in k-adaptive random sampling approach
10 th World Congress on Structural and Multidisciplinary Optimization May 19-4, 013, Orlando, Florida, USA Worst-case design of structures using stopping rules in k-adaptive random sampling approach Makoto
More informationHardy s Paradox. Chapter Introduction
Chapter 25 Hardy s Paradox 25.1 Introduction Hardy s paradox resembles the Bohm version of the Einstein-Podolsky-Rosen paradox, discussed in Chs. 23 and 24, in that it involves two correlated particles,
More informationRigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis
Rigorous Science - Based on a probability value? The linkage between Popperian science and statistical analysis The Philosophy of science: the scientific Method - from a Popperian perspective Philosophy
More informationTest Strategies for Experiments with a Binary Response and Single Stress Factor Best Practice
Test Strategies for Experiments with a Binary Response and Single Stress Factor Best Practice Authored by: Sarah Burke, PhD Lenny Truett, PhD 15 June 2017 The goal of the STAT COE is to assist in developing
More informationForecast comparison of principal component regression and principal covariate regression
Forecast comparison of principal component regression and principal covariate regression Christiaan Heij, Patrick J.F. Groenen, Dick J. van Dijk Econometric Institute, Erasmus University Rotterdam Econometric
More informationHYPOTHESIS TESTING. Hypothesis Testing
MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.
More informationBattery Life. Factory
Statistics 354 (Fall 2018) Analysis of Variance: Comparing Several Means Remark. These notes are from an elementary statistics class and introduce the Analysis of Variance technique for comparing several
More informationA lower bound for scheduling of unit jobs with immediate decision on parallel machines
A lower bound for scheduling of unit jobs with immediate decision on parallel machines Tomáš Ebenlendr Jiří Sgall Abstract Consider scheduling of unit jobs with release times and deadlines on m identical
More informationIntroduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs
Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs The Analysis of Variance (ANOVA) The analysis of variance (ANOVA) is a statistical technique
More informationFall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.
1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n
More informationOne-Way Repeated Measures Contrasts
Chapter 44 One-Way Repeated easures Contrasts Introduction This module calculates the power of a test of a contrast among the means in a one-way repeated measures design using either the multivariate test
More information