Research Article
(wileyonlinelibrary.com) DOI: 10.1002/qre.1495
Published online in Wiley Online Library

Exponential CUSUM Charts with Estimated Control Limits

Min Zhang,^a Fadel M. Megahed^b* and William H. Woodall^c

Exponential CUSUM charts are used in monitoring the occurrence rate of rare events because the interarrival times of events for homogeneous Poisson processes are independent and identically distributed exponential random variables. In these applications, it is assumed that the exponential parameter, i.e. the mean, is known or has been accurately estimated. However, in practice, the in-control mean is typically unknown and must be estimated to construct the limits for the exponential CUSUM chart. In this article, we investigate the effect of parameter estimation on the run length properties of one-sided lower exponential CUSUM charts. In addition, analyzing conditional performance measures shows that the effect of estimation error can be significant, affecting both the in-control average run length and the quick detection of process deterioration. We also provide recommendations regarding phase I sample sizes. This sample size must be quite large for the in-control chart performance to be close to that for the known parameter case. Finally, we provide an industrial example to highlight the practical implications of estimation error, and to offer advice to practitioners when constructing/analyzing a phase I sample. Copyright © 2012 John Wiley & Sons, Ltd.

Keywords: high-yield processes; phase I control charting; safety monitoring; statistical process control (SPC); time between events (TBE)

1. Introduction

The occurrence of rare events in manufacturing processes, e.g. nonconforming items or machine failures, is frequently modeled using a homogeneous Poisson process. Zhang et al.
[1] highlighted that the Poisson rate can be monitored using two different methods: control charts for counts [2,3] and control charts based on interarrival times, which are independent and identically distributed exponential random variables when the process is stable. Control charts based on interarrival times are more efficient [2]; they include the exponential Shewhart [4,5], exponentially weighted moving average (EWMA) [6], and exponential cumulative sum (CUSUM) control charts [3,7–10]. Among these charts, the exponential CUSUM chart is popular because it can be optimized for the quick detection of a given shift [9].

The available literature on the exponential CUSUM chart has been based on the assumption that the in-control mean for the interarrival times is known or has been accurately estimated. However, in practice, the performance of the exponential CUSUM chart depends on control limits that are computed from an estimate of the in-control parameter of the exponential distribution. Therefore, it is important to provide practitioners with guidelines so that the effect of estimation error on the exponential CUSUM chart can be better understood. It should be noted that such guidelines have been provided for the exponential Shewhart chart [11], the exponential EWMA chart [11], the related risk-adjusted Bernoulli CUSUM chart [12], and the non-risk-adjusted Bernoulli CUSUM chart [13]. The Bernoulli-based methods apply when one can count the number of opportunities (such as the items produced) between events rather than measuring the time between events (TBE). In such cases, the count of the number of opportunities between events is geometrically distributed. Therefore, the exponential CUSUM chart can be seen as a continuous-time version of the geometric CUSUM chart. The reader is referred to Szarka and Woodall [14] for a detailed discussion of the relationship between the Bernoulli CUSUM and the geometric CUSUM charts.
In this article, we study the effect of estimation error on the performance of the one-sided lower exponential CUSUM chart by using the expected value of the average run length (ARL_avg), the standard deviation of the average run length (SDARL), the standard deviation of the run length (SDRL), and percentiles of the run length distribution. We evaluate these measures under two situations: when the relationship of the estimated mean to the true in-control mean is known and when it is unknown. These two analyses are referred to as the conditional and marginal performance of the run length metrics, respectively. The conditional analysis allows us to understand the effect of overestimating or underestimating the exponential parameter on the run length performance. On the other hand, the marginal performance is useful in providing sample size recommendations because it considers the distribution of the estimated parameter and thus accounts for the variability introduced through parameter estimation. It should be noted that we only consider the effect of estimation error on the lower exponential CUSUM chart because the most common situation in practice is to use the TBE (time between events) CUSUM to detect ... a decrease in the time between these events (Montgomery [2], p. 428). Our work, however, could be extended to the two-sided case.

In Section 2, we briefly review the properties of the one-sided lower exponential CUSUM chart. In Section 3, we present the methods used in studying the effect of estimation error on the exponential CUSUM chart. In Section 4, we provide our results and a discussion of recommended sample sizes for the exponential EWMA and other control charts that can be used for this type of data. Section 5 offers an industrial example from occupational health and safety to illustrate the practical significance of estimation error on the CUSUM chart, while prompting some advice to practitioners. Finally, our conclusions are given in Section 6.

^a Department of Industrial Engineering, Tianjin University, Tianjin, 300072, China
^b Department of Industrial and Systems Engineering, Auburn University, Auburn, AL 36849, USA
^c Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA
*Correspondence to: Fadel M. Megahed, Department of Industrial and Systems Engineering, Auburn University, Auburn, AL 36849, USA. E-mail: fmegahed@auburn.edu

2. The one-sided lower exponential CUSUM chart

2.1. The one-sided lower exponential CUSUM chart with a known exponential parameter

Suppose that the recording of rare events starts at time 0, and that they occur at times $T_1, T_2, T_3$, etc. Provided the process is stable, one can define a sequence of continuous independent and identically distributed random variables $X_1 = T_1$, $X_2 = T_2 - T_1$, $X_3 = T_3 - T_2, \ldots$ to represent the interarrival times (TBE). Let the sequence $\{X_n, n \ge 1\}$ follow the exponential distribution, defined by the following pdf:

\[
f(x) =
\begin{cases}
\dfrac{1}{\beta}\, e^{-x/\beta} & \text{if } x \ge 0, \\[4pt]
0 & \text{otherwise,}
\end{cases}
\tag{1}
\]

where $\beta$ is the TBE mean. In addition, we first assume that the exponential parameter, $\beta$, is known and is equal to $\beta_0$ when the process is in-control.
Then, a one-sided lower exponential CUSUM can be used to detect decreases in the mean interarrival time. The statistics, $C_t$, for this lower exponential CUSUM are defined by

\[
C_t = \min\{0,\; C_{t-1} + X_t - K\}, \qquad t = 1, 2, \ldots,
\tag{2}
\]

where $X_t$ is the time that has elapsed between the $(t-1)$st and $t$th events, and $K$ is defined as

\[
K = \frac{\beta_0 \beta_1}{\beta_0 - \beta_1}\, \ln\!\left(\frac{\beta_0}{\beta_1}\right),
\tag{3}
\]

where $\beta_1 < \beta_0$ is the shifted mean (the smallest shift) to be detected quickly. In Equation (2), the starting value for the CUSUM chart, $C_0$, is typically chosen to be equal to zero. The control chart signals the first time that $C_t \le -h$, where $-h$ is the lower control limit. The reader should note that this one-sided lower CUSUM can be equivalently represented by the nonnegative statistics

\[
C_t^{+} = \max\{0,\; C_{t-1}^{+} - X_t + K\}, \qquad t = 1, 2, \ldots, \quad \text{where } C_0^{+} = 0.
\tag{4}
\]

Lucas [3] and Gan [8] provide details regarding the choice of $K$ and $h$ such that a pre-specified in-control ARL can be obtained for the one-sided lower exponential CUSUM chart. Furthermore, Borror et al. [10] examined the robustness of this CUSUM chart and showed that moderate departures from the exponential distribution do not significantly affect its performance. Hereafter, we only refer to the non-positive statistics, $C_t$, as the one-sided lower exponential CUSUM statistics.

2.2. The one-sided lower exponential CUSUM chart when $\beta_0$ is unknown

The control limit, $h$, for the one-sided lower CUSUM statistic can be calculated based on the recursive (Vardeman and Ray [7]), the integral equation (Gan [8]), or the Markov chain (Brook and Evans [15]) approaches if $\beta_0$ is known. However, when $\beta_0$ is unknown, its value must be estimated prior to any calculations. The maximum likelihood estimator is typically used to estimate $\beta_0$ from a historical phase I sample of $n$ waiting times. We have the maximum likelihood estimator

\[
\hat{\beta}_0 = \frac{1}{n} \sum_{i=1}^{n} X_i .
\tag{5}
\]

Accordingly, $\beta_0$ is replaced by $\hat{\beta}_0$ in any of the run length calculations for the lower exponential CUSUM chart.
The objective of our article is to determine the effect of the size of the phase I sample on the exponential CUSUM chart's performance. In Section 2.3, we scale the exponential model to facilitate the CUSUM chart's design, without loss of generality for our results.
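The quantities introduced in Sections 2.1 and 2.2 can be sketched in a few lines of Python. The block below is only an illustration under stated assumptions: the function names, the phase I sample, the monitored sequence, and the value of h are all hypothetical and are not taken from the article. It computes the reference value K of Equation (3), the maximum likelihood estimate of Equation (5), and the recursion of Equation (2) with a signal when the statistic reaches -h.

```python
import math


def reference_value(beta0, beta1):
    """Reference value K of Equation (3) for in-control mean beta0 and shifted mean beta1."""
    if not 0.0 < beta1 < beta0:
        raise ValueError("need 0 < beta1 < beta0 for a lower (decrease-detecting) chart")
    return beta0 * beta1 / (beta0 - beta1) * math.log(beta0 / beta1)


def mle_mean(phase1_tbe):
    """Maximum likelihood estimate of the exponential mean, Equation (5)."""
    return sum(phase1_tbe) / len(phase1_tbe)


def lower_cusum(tbe, k, h, c0=0.0):
    """Run C_t = min(0, C_{t-1} + X_t - K), Equation (2).

    Returns the path of C_t and the 1-based index of the first t with
    C_t <= -h (None if the chart never signals on this sequence).
    """
    c = c0
    path = []
    signal = None
    for t, x in enumerate(tbe, start=1):
        c = min(0.0, c + x - k)
        path.append(c)
        if signal is None and c <= -h:
            signal = t
    return path, signal


# Hypothetical phase I sample of waiting times (illustrative values only).
phase1 = [12.0, 30.5, 7.2, 45.1, 18.9, 26.3]
beta0_hat = mle_mean(phase1)

# Design the chart for a 20% decrease in the mean, using the estimate in place of beta0.
k_hat = reference_value(beta0_hat, 0.8 * beta0_hat)

# Hypothetical monitored sequence and control limit (h is not calibrated here).
path, signal = lower_cusum([1.0, 0.1, 0.1, 0.1], k=0.5, h=1.0)
```

In practice h would be calibrated to a target in-control ARL (for example by the Markov chain method used later in the article); the value above is chosen only so that the toy sequence produces a signal.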

2.3. Scaling of the exponential model and applying the CUSUM chart to the scaled model

We let $\delta = \beta/\beta_0$, where $\delta$ is the out-of-control shift magnitude and $\beta$ is the true process mean. The case $\delta = 1$ represents an in-control process, whereas $\delta < 1$ denotes a deteriorated Poisson process based on a decreased average TBE. To simplify the design of the exponential CUSUM chart, consider the following scaling of the exponential TBE observations:

\[
Y_t = \frac{X_t}{\hat{\beta}_0} = \frac{\beta_0}{\hat{\beta}_0} \cdot \frac{\beta}{\beta_0} \cdot \frac{X_t}{\beta} .
\tag{6}
\]

We let $W = \beta_0/\hat{\beta}_0$ and $Q_t = X_t/\beta$. Therefore, Equation (6) can be rewritten as

\[
Y_t = W\, \delta\, Q_t .
\tag{7}
\]

The random variable $W$ represents the error in estimating the in-control exponential mean, and is distributed according to the inverse gamma distribution, IGamma($n$, $n$), in the in-control state (Ozsan et al. [11]). Conditioning on $\hat{\beta}_0$, $Y_t$ is exponentially distributed and represents the exponential TBE observations scaled by $\hat{\beta}_0$. Hereafter, we only consider $Y_t$ (rather than $X_t$) because it is easier to design the CUSUM chart based on $Y_t$. The CUSUM statistic for $Y_t$ can be obtained by replacing $X_t$ by $Y_t$ in Equation (2). This yields

\[
S_t = \min\{0,\; S_{t-1} + Y_t - K\}, \qquad t = 1, 2, \ldots .
\tag{8}
\]

In this article, $S_0 = 0$ and $K$ is defined as

\[
K = \frac{\beta_{1,y}\, \hat{\beta}_{0,y}}{\hat{\beta}_{0,y} - \beta_{1,y}}\, \ln\!\left(\frac{\hat{\beta}_{0,y}}{\beta_{1,y}}\right) .
\tag{9}
\]

This formulation for $K$ has been shown to be optimal in detecting the pre-specified shift of interest, $\beta_{1,y}$. The choice of $\beta_{1,y}$ is application-specific. In our article, $\beta_{1,y}$ is defined as

\[
\beta_{1,y} = (1 - \delta^{*})\, \hat{\beta}_{0,y},
\tag{10}
\]

where $\delta^{*}$ = 0.2, 0.5, or 0.8. Because the observations are scaled by $\hat{\beta}_0$, the scaled in-control estimate is $\hat{\beta}_{0,y} = 1$, and thus we have $K = (1 - 1/\delta^{*}) \ln(1 - \delta^{*})$. The lower control limit of the lower exponential CUSUM chart used for monitoring $Y_t$ is defined as

\[
\mathrm{LCL}_y = -h_y,
\tag{11}
\]

where $h_y$ is a positive constant chosen to obtain a desired in-control ARL, ARL_0. In the following section, we provide some information about our calculations.

3. Methods used for calculating the marginal and conditional run length properties

As mentioned in Section 2.2, the run length performance of the lower exponential CUSUM chart can be calculated in several ways. In this article, we use the Markov chain method, proposed by Brook and Evans [15], to calculate the chart's performance when the exponential parameter is estimated. Similar to Ozsan et al. [11], we consider both conditional and marginal run length properties [16,17]. For the reader's convenience, we provide the Markov chain method and the equations used to obtain the run length properties in Appendices A and B, respectively.

To obtain the run length properties, one needs to define both the number of states in the Markov chain and $h_y$. Here, we set the number of states, $m + 1$, in the Markov chain to be equal to 41. The selection of the number of states was based on comparing the run length performance with 11, 21, 41, 51, 11, and 21 states. Our calculations converged when $m + 1$ = 41; these results can be obtained from the corresponding author upon request. Additionally, we set the in-control ARL to be approximately equal to 5 for all the exponential CUSUMs investigated in this work, i.e. for the charts designed for $\delta^{*}$ = 0.2, 0.5, and 0.8. We used a binary search approach to determine $h_y$. It should be noted that we provide the design parameters ($h_y$ and $K$) for each of the exponential CUSUM charts in the results section.

4. Results and discussion

In this section, we present the conditional and marginal run length properties of the exponential CUSUM chart when the control limits are estimated. The conditional performance is summarized in Section 4.1, in which we consider the three hypothetical scenarios of
parameter estimation presented in Ozsan et al. [11]. These scenarios correspond to the first, second, and third quartiles of $W$, i.e. to overestimating ($W < 1$), more accurately estimating, and underestimating ($W > 1$) the exponential parameter, respectively. Note that the values of $\hat{\beta}_{0,y}$ that correspond to these quartiles can be easily obtained because the random variable $W$ is IGamma($n$, $n$) distributed (see Ozsan et al. [11] for a presentation of these values for different sample sizes). On the other hand, we present the marginal performance summaries in Section 4.2. These summaries are obtained by averaging the performance measures over all possible values of $W$, as given in Equations (B7) to (B10).

4.1. Conditional performance of the one-sided exponential CUSUM chart

Tables I–III provide the run length performance summaries for CUSUM charts designed for $\delta^{*}$ = 0.2, 0.5, and 0.8, respectively. These summaries are provided under the aforementioned hypothetical scenarios of parameter estimation as well as the known parameter case ($W = 1$). Additionally, the gray rows correspond to the in-control case ($\delta = 1$). Several interesting observations can be made based on the run length performance summaries. We organize our discussion according to the value of $\delta$.

When $\delta = 1$, overestimation results in an increase in the number of false alarms and a decrease in the variability of the run length when compared with the median case. In contrast, underestimation leads to an increase in the variability of the run length and a higher ARL than the designed value of 5. These results are consistent for the CUSUM charts in Tables I–III; however, the magnitude of the effect decreases as $\delta^{*}$ and/or $n$ increase. We believe that the justification of these observations is threefold. First, overestimation can lead to a perceived decrease in the TBE mean, which would be analogous to a process deterioration and, hence, to an increase in the rate of false alarms.
Because we are only interested in detecting process deterioration, the perceived improvement that results from underestimating the exponential parameter will lead to less frequent false alarms when compared with the known parameter case because the CUSUM chart is one-sided. Second, as $n$ increases, we converge toward the known parameter case, and the difference between the performance measures for each of the three hypothetical situations and the no-estimation-error case decreases. Finally, increases in $\delta^{*}$ make the CUSUM chart less sensitive to smaller shifts. Accordingly, increases in $\delta^{*}$ result in a smaller effect of estimation error on the performance of the CUSUM chart.

For the out-of-control case ($\delta < 1$), our results show that the effect of estimation error on the in-control performance carries over to the out-of-control scenario. More specifically, an increase in the false alarm rate is accompanied by a more rapid detection of out-of-control conditions, whereas an increase in the in-control ARL (due to estimation error) results in higher values of the ARL in the out-of-control case. The results also indicate that the effect of estimation error on out-of-control performance decreases with smaller $\delta$ and larger $n$ and $\delta^{*}$ values. The major issue, however, is the difficulty in controlling the in-control ARL.

4.2. Marginal performance of the one-sided exponential CUSUM chart

We provide the marginal performance summaries for the three exponential CUSUM charts under various sample sizes and shift magnitudes in Table IV. It is important to note that the marginal summaries provided are in fact the expected values of these random variables, e.g. in Table IV, we use ARL_avg to denote the expected value of the ARL. As with the conditional performance summaries, we discuss the in-control and out-of-control cases separately.
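The role of W in both analyses can be made concrete with a small simulation. The sketch below is an illustration only, not the authors' calculation: it draws W = β0/β̂0 by simulating the phase I mean of n unit-mean exponential waiting times (which follows a gamma distribution with shape n and scale 1/n), and reports the simulated quartiles and standard deviation of W. The function names, the number of draws, and the seed are assumptions made for this example.

```python
import random
import statistics


def simulate_w(n, draws=20000, seed=1):
    """Simulate W = beta0 / beta0_hat for a phase I sample of size n.

    beta0_hat / beta0 is the mean of n unit-mean exponentials, i.e.
    Gamma(shape=n, scale=1/n), so W is inverse gamma IGamma(n, n).
    """
    rng = random.Random(seed)
    return [1.0 / rng.gammavariate(n, 1.0 / n) for _ in range(draws)]


def w_summary(n, draws=20000, seed=1):
    """Return simulated (first quartile, median, third quartile, std. dev.) of W."""
    w = simulate_w(n, draws, seed)
    q1, med, q3 = statistics.quantiles(w, n=4)
    return q1, med, q3, statistics.stdev(w)


# Two hypothetical phase I sample sizes, chosen only for illustration.
summary_50 = w_summary(50)
summary_500 = w_summary(500)
```

The first and third quartiles fall below and above 1, matching the overestimation and underestimation scenarios of the conditional analysis, and the spread of W shrinks as n grows, which is the mechanism behind the sample size recommendations of the marginal analysis.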
For the in-control case, the ARL values are much larger than the designed 5 value when $n$ is small, an effect which is magnified for smaller values of $\delta^{*}$. As $n$ increases, the ARL values start to converge to 5. With n 1, the ARL values become approximately within 1% of the targeted ARL value. Similar observations can be made for the percentiles and the standard deviation of the run length distribution. Therefore, in Table IV, we only show the ARL_avg and SDRL_avg values.

Conclusions based on only the expected values of the ARL, the standard deviation of the run length, and the percentiles of the run length distribution can be misleading, as explained in Lee et al. [13] and Zhang et al. [19], because they do not account for the practitioner-to-practitioner variation in the ARL on the basis of their respective phase I samples. This practitioner-to-practitioner variation can only be accounted for using the SDARL (or the percentiles of the ARL). The SDARL value is zero when the exponential parameter is known. Therefore, when n = 1, the SDARL values of 229.8, 13.5, and 7.6 for $\delta^{*}$ = 0.2, 0.5, and 0.8, respectively, represent significant variation in the ARL values. Accordingly, a much larger phase I sample is needed, especially as $\delta^{*}$ decreases. We recommend that the SDARL be small (maybe at 5–10% of the ARL value) such that practitioners can have confidence that the phase I performance is predictable and close to the in-control parameter known case. This is very difficult to achieve, however.

For the out-of-control cases, comparing the marginal performance summaries with the known parameter results readily shows a degradation in the performance of the exponential CUSUM chart. The average and standard deviation of the run length distributions are much larger than the results for the known parameter case, especially for smaller $n$ and larger $\delta$ (i.e. smaller shift) values.
The out-of-control scenarios do provide further information regarding the sample size selection, but the key issue remains controlling the in-control ARL.

On the basis of Tables I–IV, two general conclusions can be made. First, as practitioners become more interested in detecting smaller shift sizes (as indicated by $\delta^{*}$), the size of the phase I sample needed increases dramatically. Similar results were obtained by Lee et al. [13] for the Bernoulli CUSUM chart. The second conclusion is the importance of using the SDARL to evaluate the effect of estimation error. Our results show that using only the expected value of the ARL and SDRL (as well as the percentiles of the run length) can be misleading. Therefore, much larger sample sizes are needed than those recommended in previous studies of the effect of estimation error on control charts with exponential data. For example, Ozsan et al. [11] recommended sample sizes greater than 2 for the exponential EWMA chart. Their choice was based on their observation that increasing the sample size $n$ up to 2 improves the marginal out-of-control performance significantly. Beyond 2, the improvement is steady but less. We could have reached a nearly identical conclusion by ignoring the SDARL results. However, we do not recommend sample sizes this low because the effect of estimation error on both the in-control and out-of-control performance is quite large, especially for $\delta^{*}$ = 0.2 and 0.5. The reader should note that the conclusions of Ozsan et al. [11] were not consistent with their calculations. Specifically, they showed that the in-control ARL will be more than 2% from the desired value 5% of the time, even when n = 5, which contradicts their earlier conclusions. The use of the SDARL metric makes it easier to see when the effect of estimation error can be neglected because it accounts for the practitioner-to-practitioner variation in the ARL. Therefore, it is easier for us to identify the need for larger sample sizes. In most applications, it will not be practical to have phase I sample sizes large enough to neglect the effect of estimation error.

Table I. Conditional performance summary for $\delta^{*}$ = 0.2 when $\hat{\beta}_0$ corresponds to the quartiles of $W$ and when $W = 1$, with $K$ = 0.8926, $h_y$ = 1.21594, and ARL_0 ≈ 5

Table II. Conditional performance summary for $\delta^{*}$ = 0.5 when $\hat{\beta}_0$ corresponds to the quartiles of $W$ and when $W = 1$, with $K$ = 0.6931, $h_y$ = 4.156, and ARL_0 ≈ 5

Table III. Conditional performance summary for $\delta^{*}$ = 0.8 when $\hat{\beta}_0$ corresponds to the quartiles of $W$ and when $W = 1$, with $K$ = 0.4024, $h_y$ = 1.244, and ARL_0 ≈ 5

Table IV. Marginal performance summaries for various sample sizes, shift magnitudes, and $\delta^{*}$ values

5. An industrial example and advice to practitioners

Table V presents an industrial data set of occupational accidents that occurred in a manufacturing plant over a 5-year period. In this case, a practitioner needs to first determine whether the data seem to indicate that the process is stable, and then decide whether the available data are sufficient for estimating the in-control parameter of interest. In this example, we only considered the last 5 years in the data set provided in Lucas [3] because it was shown that the accident rate decreased by approximately 5% starting from 1975. Therefore, the question is now whether the remaining 5-year data are sufficient for estimating the mean TBE such that the effect of estimation error on the exponential CUSUM chart can be neglected. We explore this question below. The reader is referred to Montgomery [2] and Lucas [3] for a detailed discussion of how process stability can be evaluated for the phase I sample.

Table V. Accident data from a manufacturing plant for the years 1975 to 1979

Sequence   Day   TBE     Sequence   Day   TBE     Sequence   Day   TBE
    1       23              21      805    39        41     1362    26
    2       44    21        22      805     0        42     1434    72
    3       63    19        23      806     1        43     1477    43
    4       99    36        24      817    11        44     1512    35
    5      113    14        25      846    29        45     1548    36
    6      121     8        26      857    11        46     1550     2
    7      122     1        27      860     3        47     1552     2
    8      220    98        28      882    22        48     1620    68
    9      240    20        29      889     7        49     1629     9
   10      413   173        30      889     0        50     1636     7
   11      462    49        31      903    14        51     1659    23
   12      477    15        32      970    67        52     1668     9
   13      517    40        33     1028    58        53     1688    20
   14      577    60        34     1032     4        54     1702    14
   15      612    35        35     1060    28        55     1762    60
   16      646    34        36     1082    22        56     1783    21
   17      712    66        37     1154    72        57     1794    11
   18      756    44        38     1207    53        58     1819    25
   19      759     3        39     1250    43
   20      766     7        40     1336    86

On the basis of the results provided in Section 4, we can conclude that these 58 occurrences are insufficient for providing a robust estimate of the mean TBE. In fact, the sample size needed is much larger because we have established that a sample size of several thousand observations is needed so that the effect of estimation error can be neglected. This is not only impractical in this situation (hundreds of years of data), but it would also be difficult to ensure that the underlying assumptions behind the safety performance were still valid.

To alleviate the aforementioned problem, it might be best to instead monitor one or more underlying process characteristics that occur more frequently.
In the case of our example, practitioners might consider monitoring near-misses and incidents that have resulted in minor and/or major injuries because they provide a good indication of the safety performance. This recommendation is similar to the approach of Steiner and MacKay [20], who suggested modeling the in-control performance of high-yield manufacturing processes using a logistic regression model, which can be used to establish the relationship between the event of interest and some underlying continuous process/product characteristics that can be monitored more effectively.

6. Conclusions

We have studied both the conditional and marginal performance summaries to investigate the effects of estimation error on the performance of the one-sided lower exponential CUSUM chart. Our results show that the effect of estimation error gets larger when the CUSUM is designed to detect smaller shifts in the mean. In general, the effect of estimation error is significant for all three $\delta^{*}$ values considered, especially when n < 1,. The conditional performance measures indicate that overestimation of the mean TBE results in a significantly lower value for the in-control ARL when compared with the targeted value. This was accompanied by a decrease in variability. In contrast, underestimation led to an increase in the variability of the run length and a higher ARL than the desired value. The marginal performance summaries show that a sample of 1, or higher may be needed such that the phase I performance is predictable and close to the in-control parameter known case. More importantly, we noted that the sample sizes needed may not be practical when the in-control exponential mean is large, as shown in our industrial example. In that example, 58 occupational accidents occurred in 5 years, which makes it impractical to choose a significantly larger sample size.

Acknowledgement

The first author was partially supported by the NSFC (grant nos. 78243 and 712256).

References

1. Zhang CW, Xie M, Goh TN. A control chart for the gamma distribution as a model of time between events. International Journal of Production Research 2007; 45(23):5649–5666.
2. Montgomery DC. Introduction to statistical quality control (7th edn). Hoboken, NJ: John Wiley, 2012.
3. Lucas JM. Counted data CUSUMs. Technometrics 1985; 27:129–144.
4. Jones LA, Champ CW. Phase I control charts for time between events. Quality and Reliability Engineering International 2002; 18:479–488.
5. Xie M, Goh TN, Ranjan P. Some effective control chart procedures for reliability monitoring. Reliability Engineering and System Safety 2002; 77:143–150.
6. Gan FF. Designs of one and two-sided exponential EWMA control charts. Journal of Quality Technology 1998; 30:55–69.
7. Vardeman S, Ray D. Average run lengths for CUSUM schemes when observations are exponentially distributed. Technometrics 1985; 27:145–150.
8. Gan FF. Exact run length distributions for one-sided exponential CUSUM schemes. Statistica Sinica 1992; 2:297–312.
9. Gan FF. Design of optimal exponential CUSUM control charts. Journal of Quality Technology 1994; 26:109–124.
10. Borror CM, Keats JB, Montgomery DC. Robustness of the time between events CUSUM. International Journal of Production Research 2003; 41:3435–3444.
11. Ozsan G, Testik MC, Weiß CH. Properties of the exponential EWMA chart with parameter estimation. Quality and Reliability Engineering International 2010; 26:555–569.
12. Jones MA, Steiner SH. Assessing the effect of estimation error on risk-adjusted CUSUM chart performance. International Journal for Quality in Health Care 2012; 24:176–181.
13. Lee J, Wang N, Xu L, Schuh A, Woodall WH. The effects of parameter estimation on upper-sided Bernoulli cumulative sum charts. Quality and Reliability Engineering International 2012; DOI: 10.1002/qre.1413 (available online).
14. Szarka JL III, Woodall WH. On the equivalence of the Bernoulli and geometric CUSUM charts.
Journal of Quality Technology 2012; 44(1):54–62.
15. Brook D, Evans DA. An approach to the probability distribution of CUSUM run length. Biometrika 1972; 59:539–549.
16. Jensen WA, Jones-Farmer LA, Champ CW, Woodall WH. Effects of parameter estimation on control chart properties: A literature review. Journal of Quality Technology 2006; 38:349–364.
17. Testik MC. Conditional and marginal performance of the Poisson CUSUM control chart with parameter estimation. International Journal of Production Research 2007; 45:5621–5638.
18. Fu JC, Spiring FA, Xie H. On the average run lengths of quality control schemes using a Markov chain approach. Statistics and Probability Letters 2002; 56:369–380.
19. Zhang M, Peng Y, Schuh A, Megahed FM, Woodall WH. Geometric charts with estimated control limits. Quality and Reliability Engineering International 2012; DOI: 10.1002/qre.1304 (available online).
20. Steiner SH, MacKay RJ. Effective monitoring of processes with parts per million defective – A hard problem! In Frontiers in statistical quality control 7, Lenz HJ, Wilrich PTh (Eds.). Heidelberg, Germany: Springer-Verlag, 2004.

Appendix A. The Markov chain approach

Consider the CUSUM statistic defined in Equation (8), where we accumulate the deviations of the random variables $Y_t$ from the reference value $K$. This process is continued while the CUSUM is negative, until the statistic falls below $-h_y$ or reverts back to its initial value of zero. The operation of such a scheme can be represented by a continuous state space Markov process. Here, we represent this continuous scheme by a Markov chain having $m + 1$ states $E_0, E_1, \ldots, E_m$. The state $E_0$ denotes the initial value of the CUSUM, i.e. $S_0 = 0$, and $E_m$ is the absorbing state, entered when the control chart signals. We define the width of state $i$ ($0 < i < m$) to be equal to $d$, where $d = h_y/m$. We denote by $a_i$ the rescaled value of the CUSUM statistic in state $i$, and we take that to be the middle value between the lower and upper bounds of the grouping interval, denoted by $L_i$ and $L_{i+1}$, respectively. Figure 1 provides a summary of the discretization of the state space.

Figure 1. An overview of the state space for the Markov chain

In this article, $p_{i,j}$ denotes the probability of transition from state $i$ to state $j$ in one step. Then, the transition probability matrix, $\mathbf{P}$, is defined as

\[
\mathbf{P} =
\begin{pmatrix}
p_{0,0} & p_{0,1} & \cdots & p_{0,j} & \cdots & p_{0,m-1} & p_{0,m} \\
p_{1,0} & p_{1,1} & \cdots & p_{1,j} & \cdots & p_{1,m-1} & p_{1,m} \\
\vdots & \vdots & & \vdots & & \vdots & \vdots \\
p_{i,0} & p_{i,1} & \cdots & p_{i,j} & \cdots & p_{i,m-1} & p_{i,m} \\
\vdots & \vdots & & \vdots & & \vdots & \vdots \\
p_{m-1,0} & p_{m-1,1} & \cdots & p_{m-1,j} & \cdots & p_{m-1,m-1} & p_{m-1,m} \\
0 & 0 & \cdots & 0 & \cdots & 0 & 1
\end{pmatrix}
=
\begin{pmatrix}
\mathbf{A} & \mathbf{r} \\
\mathbf{0}^{\top} & 1
\end{pmatrix} .
\tag{A1}
\]

Note that all rows sum to unity, and that the last row consists of zeros, except for the last element, which is equal to 1 because $E_m$ is an absorbing state. In addition, the matrix $\mathbf{A}$ is obtained by deleting the last row and column from the matrix $\mathbf{P}$, i.e. it represents the transition probabilities between the different transient states. For $i = 0, 1, \ldots, m-1$ and $j = 1, 2, \ldots, m-1$, the transition probabilities for the Markov chain are determined as follows:

\[
\begin{aligned}
p_{i,0} &= \Pr(E_i \to E_0) = \Pr(S_t \ge L_1 \mid S_{t-1} = a_i) = \Pr(a_i + Y_t - K \ge L_1) \\
&= \Pr(Y_t \ge a_0 - d + K - a_i) = \int_{a_0 - d + K - a_i}^{+\infty} f(y)\, dy = 1 - F(a_0 - d + K - a_i),
\end{aligned}
\tag{A2}
\]

\[
\begin{aligned}
p_{i,j} &= \Pr(E_i \to E_j) = \Pr(L_{j+1} \le S_t < L_j \mid S_{t-1} = a_i) \\
&= \Pr(a_j - 0.5d \le a_i + Y_t - K < a_j + 0.5d) \\
&= \Pr(a_j - 0.5d + K - a_i \le Y_t < a_j + 0.5d + K - a_i) \\
&= \int_{a_j - 0.5d + K - a_i}^{a_j + 0.5d + K - a_i} f(y)\, dy = F(a_j + 0.5d + K - a_i) - F(a_j - 0.5d + K - a_i),
\end{aligned}
\tag{A3}
\]

\[
\begin{aligned}
p_{i,m} &= \Pr(E_i \to E_m) = \Pr(S_t \le L_m \mid S_{t-1} = a_i) = \Pr(a_i + Y_t - K \le L_m) \\
&= \Pr(Y_t \le -h_y + K - a_i) = \int_{-\infty}^{-h_y + K - a_i} f(y)\, dy = F(-h_y + K - a_i).
\end{aligned}
\tag{A4}
\]

Appendix B. Conditional and marginal performance of the run length distribution

We can define the run length of the CUSUM chart to be the number of steps taken, starting from the initial state $E_0$, to reach the absorbing state $E_m$. Using this observation and the properties of the CUSUM chart [18], the conditional ARL is calculated as follows:

\[
\mathbf{ARL}(\delta \mid w) = (\mathbf{I} - \mathbf{A})^{-1} \mathbf{1},
\tag{B1}
\]

where $\mathbf{I}$ is the $m \times m$ identity matrix and $\mathbf{1}$ is an $m \times 1$ vector of ones. Accordingly, $\mathbf{ARL}(\delta \mid w)$ is also an $m \times 1$ vector whose $i$th entry is the ARL when the process starts in state $i$. The $r$th order factorial moments can be obtained through the following recursive formula:

\[
\mathbf{M}^{(r)} = r \left[ (\mathbf{I} - \mathbf{A})^{-1} - \mathbf{I} \right] \mathbf{M}^{(r-1)} .
\tag{B2}
\]

These factorial moments can be converted into power moments, as shown in the next equation:

\[
\mathbf{M}_r = \sum_{k=1}^{r} S(r, k)\, \mathbf{M}^{(k)},
\qquad \text{where } S(r, k) = \frac{1}{k!} \sum_{l=0}^{k} (-1)^{l} \binom{k}{l} (k - l)^{r}, \quad k = 0, 1, \ldots, r.
\tag{B3}
\]

The reader should note that the vectors in Equations (B2) and (B3) are all $m \times 1$. Hereafter, any vector is $m \times 1$, unless noted otherwise. The second-order moment about the mean can be calculated as

\[
\boldsymbol{\mu}_2 = \mathbf{M}_2 - \mathbf{M}_1^{2},
\tag{B4}
\]

where $\mathbf{M}_1^{2}$ is a vector whose $i$th entry is the square of the $i$th entry of $\mathbf{M}_1$. Therefore, the SDRL is defined as

\[
\mathrm{SDRL}(\delta \mid w) = \sqrt{\mu_{2(i)}} .
\tag{B5}
\]

Here, $\mu_{2(i)}$ is the $i$th entry of $\boldsymbol{\mu}_2$. We let $\mathbf{F}_v$ be a vector whose $i$th entry is the cumulative probability function of the RL distribution, corresponding to an initial state. Accordingly, $\mathbf{F}_v$ can be defined as

\[
\mathbf{F}_v = (\mathbf{I} - \mathbf{A}^{v})\, \mathbf{1}, \qquad \text{where } v = 0, 1, \ldots
\tag{B6}
\]

The elements of the cumulative probability vector, $\mathbf{F}_v$, multiplied by 100 can be used to determine the percentiles of the RL distribution, in which each element corresponds to a different initial state. The $b$th percentile of the RL distribution is calculated by finding the smallest number of steps ($v$) for which the first element of $\mathbf{F}_v$ is greater than or equal to $b/100$.

The marginal performance can be obtained by integrating the conditional performance measures with respect to the density of $W$, as shown in the following equations:

\[
\mathrm{ARL}_{\mathrm{marginal}}(\delta) = \int_{0}^{+\infty} \mathrm{ARL}(\delta \mid w)\, f(w)\, dw,
\tag{B7}
\]

\[
\mathrm{SDRL}_{\mathrm{marginal}}(\delta) = \int_{0}^{+\infty} \mathrm{SDRL}(\delta \mid w)\, f(w)\, dw,
\tag{B8}
\]

\[
\mathrm{Percentile}_{\mathrm{marginal}}(\delta) = \int_{0}^{+\infty} \mathrm{percentile}(\delta \mid w)\, f(w)\, dw, \quad \text{and}
\tag{B9}
\]

\[
\mathrm{SDARL}(\delta) = \sqrt{E\!\left[\mathrm{ARL}^{2}(w)\right] - \left\{E\!\left[\mathrm{ARL}(w)\right]\right\}^{2}}
= \sqrt{\int_{0}^{+\infty} \mathrm{ARL}(\delta \mid w)^{2}\, f(w)\, dw - \left[\int_{0}^{+\infty} \mathrm{ARL}(\delta \mid w)\, f(w)\, dw\right]^{2}} .
\tag{B10}
\]

Note that the random variable $W$ is distributed IGamma($n$, $n$). In addition, the operation in these equations corresponds to averaging the performance measures over all values of $W$, which is useful in determining the size of the phase I sample needed because knowledge of the observed estimates is not required.
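The calculations of Appendices A and B can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: it assumes NumPy is available, uses the discretization with representative values a_0 = 0 and interval midpoints below -d (so that the intervals end exactly at -h), integrates over W with a trapezoidal rule on a truncated grid (the article reports Simpson's rule), and uses a hypothetical control limit h = 2 that is not calibrated to any target ARL. All function names and grid settings are hypothetical.

```python
import math
import numpy as np


def reference_value(delta_star):
    """K = (1 - 1/delta*) ln(1 - delta*) for the scaled chart."""
    return (1.0 - 1.0 / delta_star) * math.log(1.0 - delta_star)


def conditional_arl(h, k, mean_y, m=100):
    """Zero-start ARL of the lower CUSUM when Y_t is exponential with
    mean mean_y (= w * delta), using m transient states of width d = h / m."""
    d = h / m
    # Representative values: a_0 = 0, then midpoints of the intervals below -d.
    a = np.concatenate(([0.0], -(np.arange(1, m) + 0.5) * d))

    def cdf(y):
        # Exponential cdf; equals 0 for y <= 0.
        return 1.0 - np.exp(-np.maximum(y, 0.0) / mean_y)

    A = np.zeros((m, m))
    for i in range(m):
        A[i, 0] = 1.0 - cdf(-d + k - a[i])                    # Equation (A2)
        A[i, 1:] = (cdf(a[1:] + 0.5 * d + k - a[i])
                    - cdf(a[1:] - 0.5 * d + k - a[i]))        # Equation (A3)
    arl = np.linalg.solve(np.eye(m) - A, np.ones(m))          # Equation (B1)
    return float(arl[0])


def marginal_arl(h, k, delta, n, m=100, grid=201):
    """Approximate ARL_avg (B7) and SDARL (B10) for W ~ IGamma(n, n)."""
    sd = 1.0 / math.sqrt(n)
    # Truncated integration range around W = 1 (an assumption of this sketch).
    w = np.linspace(max(0.05, 1.0 - 6.0 * sd), 1.0 + 8.0 * sd, grid)
    log_f = (n * math.log(n) - math.lgamma(n)
             - (n + 1.0) * np.log(w) - n / w)                 # IGamma(n, n) log-density
    f = np.exp(log_f)
    arl = np.array([conditional_arl(h, k, wi * delta, m) for wi in w])

    def trapezoid(y):
        return float(np.sum((y[1:] + y[:-1]) * np.diff(w)) / 2.0)

    mass = trapezoid(f)                    # renormalize for the truncation
    mean = trapezoid(arl * f) / mass
    second = trapezoid(arl ** 2 * f) / mass
    return mean, math.sqrt(max(second - mean ** 2, 0.0))


k = reference_value(0.5)
arl_known = conditional_arl(h=2.0, k=k, mean_y=1.0)          # W = 1, delta = 1
arl_avg, sdarl = marginal_arl(h=2.0, k=k, delta=1.0, n=100)
```

Calling `conditional_arl` with `mean_y` above or below 1 mimics underestimation or overestimation of the in-control mean, while `marginal_arl` averages over the whole distribution of W; a nonzero SDARL reflects the practitioner-to-practitioner variation discussed in Section 4.2.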
In contrast, Equations (B1), (B5), and (B6) require an understanding of the relationship of the observed estimate to the true in-control exponential mean, i.e. the value of $w$. Finally, it should be noted that we followed the approach of Ozsan et al. [11], who used Simpson's quadrature method in Matlab® and a step size of 0.1 in evaluating the integrals in Equations (B7) to (B10).

Authors' biographies

Min Zhang is an associate professor at the Department of Industrial Engineering, Tianjin University. She received her PhD in management science and engineering from Tianjin University in 2006 and her MS and BS in Material Engineering from Shandong University in Jinan in 2003 and 2000. Her research interests focus on statistical quality control, six sigma management, and process monitoring applications.

Fadel Megahed is an assistant professor at the Department of Industrial and Systems Engineering, Auburn University. He received his PhD and MS in Industrial and Systems Engineering from Virginia Tech and his BS in Mechanical Engineering from the American University in Cairo. He is a recipient of the Mary G. and Joseph Natrella Scholarship (2012) from the American Statistical Association. His research interests are in the areas of statistical quality control and reliability.

William Woodall is a professor of statistics at Virginia Tech. He is a former editor of the Journal of Quality Technology (2001–2003) and associate editor of Technometrics (1987–1995). He is the recipient of the Box Medal (2012), Shewhart Medal (2002), Jack Youden Prize (1995, 2003), Brumbaugh Award (2000, 2006), Ellis Ott Foundation Award (1987), and best paper award for IIE Transactions on Quality and Reliability Engineering (1997). He is a Fellow of the American Statistical Association, a Fellow of the American Society for Quality, and an elected member of the International Statistical Institute.