Nonparametric Monitoring of Multiple Count Data

Size: px
Start display at page:

Download "Nonparametric Monitoring of Multiple Count Data"

Transcription

1 Nonparametric Monitoring of Multiple Count Data Peihua Qiu 1, Zhen He 2 and Zhiqiong Wang 3 1 Department of Biostatistics University of Florida, Gainesville, United States 2 College of Management and Economics Tianjin University, Tianjin, China 3 School of Management Tianjin University of Technology, Tianjin, China Abstract Process monitoring of multiple count data has received considerable attention recently in the statistical process control literature. Most existing methods on this topic are based on parametric modeling of the observed process data. However, the assumed parametric models are often invalid in practice, leading to unreliable performance of the related control charts. In this paper, we first show the consequence of using a parametric control chart in cases when the underlying parametric distribution is invalid. Then, we thoroughly investigate the performance of some parametric and nonparametric control charts in monitoring multiple count data. Our numerical results show that nonparametric methods can provide a more reliable and effective process monitoring in such cases. A real-data example about the crime log of the University of Florida Police Department is used for illustrating the implementation of the related control charts. Keywords: Distribution-free; Log-linear modeling; Multiple count data; Nonparametric procedures; Poisson distribution; Statistical process control. 1

2 1 Introduction Due to the spreading usage of sensors and the rapid development in information storage technology, it is common to collect and analyze several correlated quality characteristics of a process simultaneously. Online monitoring of these quality characteristics separately may not be effective in detecting process changes. Therefore, many multivariate control charts have been proposed in the literature to solve this problem more effectively (cf. Qiu, 2014, Chapters 7 and 9). Many of these charts are designed for monitoring multiple continuous quality characteristics. However, multiple count data are also common in practice. Examples include the number of delaminations at different positions of a printed circuit board in manufacturing industry (Wang et al., 2017), incidences of different types of diseases in epidemiology (Chen et al., 2015), and numbers of purchases of different products in marketing (Brijs et al., 2004). Therefore, monitoring of multiple count data is important. But, existing research on this research problem is limited. This paper aims to fill the gap by discussing different strategies to solve the problem. Traditionally, multiple count data could be modeled by either multivariate binomial/multinomial distributions or multivariate Poisson distributions (Johnson et al., 1997). Patel (1973) first proposed a T 2 -type chart for monitoring both multiple binomial and multiple Poisson data by approximating the related multivariate binomial or Poisson distribution by a multivariate normal distribution. Some subsequent research in this direction includes Lu et al. (1998), Chiu and Kuo (2010) and Li et al. (2014). Some other existing research handles multiple Poisson count data based on multivariate Poisson models (Holgate, 1964; Karlis, 2003; Karlis and Meligkotsidou, 2005). Parametric control charts proposed in this direction include Chiu and Kuo (2008), Lee Ho and Branco Costa (2009), Laungrungrong et al. (2011) and He et al. (2014). Jones et al. (1999) and Skinner et al. (2003) proposed two control charts in special cases when different components of the multiple Poisson data are assumed independent. According to Schmidt and Rodriguez (2011), the control charts based on the multivariate Poisson models have two major limitations. One is that they do not allow negative correlation between any pair of two variables, and thus lack generality for solv- 2

3 ing real-world problems. To address this issue, Chen et al. (2015) developed a multivariate exponentially weighted moving average (MEWMA) control chart to monitor multiple count data by using the Poisson log-normal distribution. The second limitation is that the multivariate Poisson models cannot describe overdispersed count data well. To overcome this limitation, Saghir and Lin (2014) proposed a Shewhart-type multivariate control chart based on the Conway-Maxwell-Poisson (COM-Poisson) distribution, which can accommodate over-dispersion in the observed data. Based on the same model as that in Chen et al. (2015), Das et al. (2016) proposed a control chart under the framework of the likelihood ratio test for monitoring over-dispersed multiple count data. Several overviews on monitoring multiple count data can be found in papers such as Topalidou and Psarakis (2009), Saghir and Lin (2015), and Cozzucoli and Marozzi (2017). All existing methods mentioned above on monitoring multiple count data are parametric in the sense that they rely on a specific parametric distribution. In the literature on monitoring multiple continuous data, it has been well demonstrated that such parametric control charts are not reliable to use in cases when the assumed parametric distributions are invalid (cf. Qiu, 2014, Chapters 8 and 9). In practice, the assumed parametric distributions are rarely valid, especially in multivariate cases, because there are so many different factors that may affect the quality variables and the relationship among them, and such factors are usually not included in the observed data. To address this important issue, there are several multivariate nonparametric control charts developed in the literature, including those discussed in Qiu and Hawkins (2001), Qiu and Hawkins (2003), Qiu (2008), Zou and Tsung (2011), Boone and Chakraborti (2012), Zou et al. (2012), Chen et al. (2016) and Qiu (2018). But, all these methods are suggested for monitoring multiple continuous data, and no nonparametric methods can be found for monitoring multiple count data. In this paper, we first introduce several representative parametric control charts for monitoring multiple count data in Section 2. Then, we propose a nonparametric control chart for the same purpose in Section 3. This chart is constructed under the framework of data categorization and log-linear modeling that was first discussed in Qiu (2008). Then, the related control charts are compared in Section 4 by a large simulation study regarding their in-control (IC) and out-of-control (OC) performance. A real-data example is used for 3

4 illustrating the implementation of the related methods in Section 5. Finally, several remarks conclude the article in Section 6. 2 Parametric Methods for Monitoring Multiple Count Data In the literature, there have been some methods proposed for monitoring multiple count data, as discussed in Section 1. In this section, we briefly introduce two recent ones that are considered flexible and versatile among the existing methods in this area. Let X j, for j = 1,2,..., p, be the number of defects or nonconformities of the quality characteristic j, and X = (X 1,X 2,...,X p ). The first recent method is the MEWMA control chart proposed by Chen et al. (2015). They used the multivariate Poisson log-normal distribution to model the multiple count data. More specifically, they assumed that X j followed a univariate Poisson distribution with rate parameter θ j, for j = 1,2,..., p. To allow correlation among different components of X, they further assumed that θ = (θ 1,θ 2,...,θ p ) was a random vector and followed a multivariate log-normal distribution with mean vector µ and covariance matrix Σ. In that framework, the IC value of µ, denoted as µ 0, and the IC value of Σ, denoted as Σ 0, are assumed known or they can be estimated accurately from an IC dataset. Then, when the related process is IC, the observed data X(n), for n 1, would have mean vector m 0 and covariance matrix V 0, both of which depend on µ 0 and Σ 0 implicitly. Then, the MEWMA charting statistic is defined as E n = R[X(n) m 0 ]+(I R)E n 1, for n 1, where E 0 = 0, I is the identity matrix, and R is a smoothing matrix with equal diagonal elements and equal off-diagonal elements as discussed in Hawkins et al. (2007). The chart gives a signal of process distributional shift when T 2 n = E nw 1 n E n > h T, (1) 4

5 where W n is the covariance matrix of E n, and h T > 0 is a control limit. Once the smoothing matrix R is chosen properly, h T is chosen to achieve a given value of the IC average run length, denoted as ARL 0. The second parametric chart to introduce is the one based on the COM-Poisson distributional assumption that was discussed in Saghir and Lin (2014). The COM-Poisson distribution generalizes the regular Poisson distribution by introducing a location parameter and a dispersion parameter to allow underdispersion or over-dispersion in the observed data. In that method, it is assumed that each component of X(n) has a COM-Poisson distribution and D(n) denotes the summation of all components at the time point n. Namely, D(n)= p X j (n). j=1 The mean and variance of D(n) can be calculated based on the COM-Poisson distributional assumption. Then, Saghir and Lin (2014) suggested a Shewhart-type control chart with the control limit determined by the mean and standard deviation of D(n). As well demonstrated in the literature, CUSUM charts are usually more effective in detecting persistent shifts than Shewhart charts (cf. Qiu, 2014, Chapter 4). For that reason and for a fair comparison among different charts in Sections 4 and 5 below, we also define a CUSUM chart based on D(n) here. Let u + 0,D = u 0,D = 0, and u + n,d = max(0,u+ n 1,D +(D(n) D 0) k D ), u n,d = min(0,u n 1,D +(D(n) D 0)+k D ), for n 1, where k D > 0 is an allowance constant, and D 0 is the IC mean of D(n) that can be estimated from an IC data. This CUSUM chart, denoted as DCUSUM hereafter, signals a shift in X(n) when u + n,d > h D or u n,d < h D, (2) where h D > 0 is a control limit chosen to achieve a given ARL 0 value. 5

6 3 Nonparametric Monitoring of Multiple Count Data The performance of the parametric methods described in the previous section depends heavily on the validity of the assumed parametric distributions. In the next section, we will show that their results are unreliable in cases when the assumed distributions are invalid. In practice, parametric distributions are often invalid to describe the observed quality characteristics of a specific process because the quality characteristics are often affected by many different factors and the mechanism of this impact is usually too complicated to describe by a parametric model. Therefore, development of nonparametric control charts is important for monitoring multiple count data. In this section, we propose a nonparametric control chart based on log-linear modeling. As pointed out by Qiu (2008), the major difficulty in describing the distribution of multiple quality characteristics is due to the complicated relationship among them. If the related variables are categorical, then the log-linear modeling would be a powerful tool for describing such relationship. Therefore, it is natural to categorize the original quality characteristics X(n) = (X 1 (n),x 2 (n),...,x p (n)), and then apply the log-linear modeling approach to the categorized data. To this end, let m j be the IC median of X j (n), Y(n)=(Y 1 (n),y 2 (n),...,y p (n)), and Y j (n)=i(x j (n)>m j ) for j=1,2,..., p, (3) where I(a) is an indicator function that equals 1 if a is true and 0 otherwise. Then, Y(n) is a binary version of X(n), and the IC distribution of each of its components should be Bernoulli(0.5). For count data, however, it is often difficult to find the exact median value m j due to the discreteness of the original variable X j (n). To address this issue, there are several possible solutions. One is to add a small random number to X j (n) before categorization, to make the distribution of X j (n) less discrete, as discussed in Qiu and Li (2011). Alternatively, in Equation (3), the IC median m j can be replaced by the more general IC rth quantile of X j (n), with r being a real number in (0,1) that is achievable and as close to 0.5 as possible. We consider using the median in Equation (3) because the resulting joint distribution of Y(n) could be more efficiently 6

7 estimated by the log-linear modeling approach (Agresti, 2013). Theoretically, it can be checked that as long as the IC distribution F(x) of X(n) has a positive probability mass in any neighborhood of the IC median vector (m 1,m 2,...,m p ), the distribution of Y(n) would be changed by any mean shift in the original data X(n). The reason is that X(n) has a mean shift if and only if it has a median shift, and a median shift in X(n) is equivalent to a distributional shift of Y(n), according to Equation (3). Thus, if we are concerned about mean shifts in X(n), we can just monitor the categorized data Y(n). Next, we briefly describe the log-linear modeling of the categorized data Y(n). For simplicity, the dimension is fixed at p=3, and the modeling can be discussed similarly for cases with p>3. Let O j1 j 2 j 3 be the observed cell count of the ( j 1, j 2, j 3 )th cell of the three-way contingency table of an IC data, with the three binary variables Y 1 (n), Y 2 (n) and Y 3 (n) defined in Equation (3) as classifiers, for j 1, j 2, j 3 = 0,1. Then, a saturated log-linear model is defined as log(o j1 j 2 j 3 )=λ + λ Y 1 j 1 + λ Y 2 j 2 + λ Y 3 j 3 + λ Y 1Y 2 j 1 j 2 + λ Y 1Y 3 j 1 j 3 + λ Y 2Y 3 j 2 j 3 + λ Y 1Y 2 Y 3 j 1 j 2 j 3, for j 1, j 2, j 3 = 0,1, (4) where λ is a constant term, λ Y 1 j 1, λ Y 2 j 2 and λ Y 3 j 3 are the main effects of Y 1, Y 2 and Y 3, respectively, λ Y 1Y 2 j 1 j 2, λ Y 1Y 3 j 1 j 3 and λ Y 2Y 3 j 2 j 3 are the two-way interaction terms, and λ Y 1Y 2 Y 3 j 1 j 2 j 3 is the three-way interaction term (cf., Agresti, 2013). By removing certain terms from the model (4), the resulting models can describe all kinds of possible association among Y 1, Y 2 and Y 3. For convenience, let us use a notation that lists the highest-order term(s) of each variable to denote a specific log-linear model. Then, the saturated log-linear model (4) can be denoted as (Y 1 Y 2 Y 3 ). If the three-way interaction term is excluded from the saturated model, then the conditional association between any two of Y 1, Y 2 and Y 3 are identical at the two levels of the remaining variable. That is, each pair of variables has homogeneous association. This model is denoted as (Y 1 Y 2,Y 1 Y 3,Y 2 Y 3 ). Similarly, the model (Y 1 Y 2,Y 1 Y 3 ) denotes the one with the two-way interaction term λ Y 2Y 3 j 2 j 3 and the threeway interaction term λ Y 1Y 2 Y 3 j 1 j 2 j 3 excluded from the saturated model. In that model, Y 2 and Y 3 are assumed conditionally independent given Y 1. These notations of log-linear models have been used in Agresti (2013). 7

8 For a specific categorical dataset, a proper log-linear model should be selected and estimated. To this end, the likelihood ratio test statistic G 2 and the hierarchy principle are routinely used for model selection. The hierarchy principle requires that all lower-order terms should be included in a model if a higher-order interaction term is in the model. In all simulation studies in this paper, we use the backward elimination procedure for model selection. Namely, we start from the saturated model. Then, in each step of model selection, only one term is considered to be deleted by performing a likelihood ratio test with the significance level of For instance, when comparing the saturated model(y 1 Y 2 Y 3 ) (denoted as M 1 ) with the reduced model(y 1 Y 2,Y 1 Y 3,Y 2 Y 3 ) (denoted as M 0 ), the likelihood ratio test statistic is G 2 (M 0 M 1 )= 2 log(l M0 /l M1 ), where l M0 and l M1 denote the likelihood functions of the reduced model M 0 and the saturated model M 1, respectively. The null distribution of G 2 (M 0 M 1 ) is the χ 2 (1) distribution, based on which the p-value of the test can be calculated for making decisions. This model selection process continues until no terms need to be deleted. Once the final model is determined, its estimation can be achieved by using the iterative weighted least square procedure. It should be pointed out that model selection and estimation in log-linear modeling are easy to implement, because almost all existing statistical software packages have specific functions for that purpose. Based on the estimated log-linear model described above, the IC joint distribution of Y(n), denoted as { f (0) j 1,..., j p = P(Y 1 = j 1,...,Y p = j p ), j 1,..., j p = 0,1}, can be estimated accordingly. For example, when p=3, f (0) j 1, j 2, j 3 can be estimated by E j1 j 2 j 3 /n 0, where E j1 j 2 j 3 is the estimated count of the ( j 1, j 2, j 3 )th cell that can be obtained from the estimated log-linear model and n 0 is the sample size of the IC dataset. Then, we can construct a Phase II control chart as follows. Let g j1,..., j p (n)=i(y 1 (n)= j 1,...,Y p (n)= j p ), for j 1,..., j p = 0,1, g(n) be a vector of all g j1,..., j p (n) values for j 1,..., j p = 0,1, and f (0) be a vector of all f (0) j 1,..., j p values in the corresponding order. By combining the Pearson s χ 2 test statistic with the CUSUM online monitoring 8

9 scheme, we can define a CUSUM control chart as follows. First, define S obs n = 0, if C n k P, S exp n = 0, if C n k P, S obs n =(S obs n 1 + g(n))(c n k P )/C n, if C n > k P, (5) S exp n =(S exp n 1 + f(0) )(C n k P )/C n, if C n > k P, where k P > 0 is an allowance constant, S obs 0 = S exp 0 = 0, (S obs C n =( n 1 S exp ) ( n 1 + g(n) f (0) )) (diag ( 1( (S S exp n 1 + f(0))) obs n 1 S exp ( n 1) + g(n) f (0) )), diag(a) denotes a diagonal matrix with its diagonal elements equal to the corresponding ones of the vector a, and the superscripts obs and exp denote the observed and expected counts, respectively. In Expression (5), C n is a quantity that measures the overall difference, up to the current time point n, between the observed counts and expected counts of the related p-way contingency table. When C n k P (i.e., the data do not show any significant evidence of a distributional shift in Y(n)), we reset S obs n is used to record the cumulative observed counts, and S exp n and S exp n in the scale of (C n k P )/C n. Then, the CUSUM charting statistic is defined by to be zero. Otherwise, S obs n is used to record the cumulative expected counts, u n,p = ( S obs n S exp n ) ( diag(s exp n ) ) 1( S obs n S exp ) n, (6) and the chart signals a shift when u n,p > h P, (7) where h P > 0 is a control limit chosen to achieve a given ARL 0 level. The chart (7) is denoted as chart hereafter, since it is derived from the Pearson s χ 2 test. We can check that u n,p in (6) is the conventional Pearson s χ 2 test statistic that measures the difference between the cumulative observed and expected counts 9

10 as of the time point n, when k P = 0. Also, it can be checked that u n,p = max(0,c n k P ) when k P = 0. Therefore, u n,p is defined in the way that the CUSUM scheme can repeatedly restart when there is little evidence of shifts. By using this CUSUM scheme and the log-linear modeling, we can detect any mean shift in the multivariate count data, without assuming the IC process distribution to follow a parametric form. 4 Numerical Performance Assessment We evaluate the numerical performance of the related methods that were discussed in the previous sections here. In different numerical examples, we assume that p = 3, and the true IC distribution of (X 1,X 2,X 3 ) belongs to one of the following four cases: Case I X 1 Bin(10,0.25), X 2 Bin(10,0.50), X 3 Bin(10,0.75), and X 1, X 2 and X 3 are independent; Case II X 1 Poisson(3), X 2 Poisson(5), X 3 Poisson(8), and X 1, X 2 and X 3 are independent; Case III X 1 ZIP(4,0.20), X 2 ZIP(6,0.20), X 3 ZIP(10,0.20), and X 1, X 2 and X 3 are independent; Case IV X 1 ZIP(4,0.20), X 2 ZIP(6,0.20), X 3 = X 1 + δ, δ ZIP(6,0.20), and X 1, X 2 and δ are independent. In Cases III and IV, ZIP(η, π) denotes the zero-inflated Poisson distribution with parameters η and π. More specifically, the probability mass function of X ZIP(η,π) is defined as P(X = 0)=π+(1 π)exp( η), P(X = x)=(1 π)η x exp( η)/x!, for x>0. In the above four cases, Case II denotes the case when the conventional Poisson distribution with independent components assumption is valid, Cases I and III represent two cases when the Poisson distribution assumption is violated, and Case IV represents a case when some components are correlated. Also, the 10

11 binomial, Poisson and zero-inflated Poisson distributions are three representative discrete distributions with under-dispersion, equal-dispersion and over-dispersion, respectively. We first compare the IC performance of the two parametric charts MEWMA and DCUSUM (cf., Equations (1) and (2)) that were described in Section 2 with some nonparametric charts. Besides the chart discussed in Section 3, we also consider another two representative nonparametric control charts. The first one is the CUSUM version of the multivariate sign chart proposed by Boone and Chakraborti (2012), which is denoted as SNCUSUM. Note that Boone and Chakraborti (2012) originally suggested a Shewharttype control chart based on the sign statistic and the control limit of the chart was determined based on the asymptotic distribution of the charting statistic. The robustness of the IC performance of this Shewhart chart cannot be guaranteed, particularly in cases with small subgroup sizes. To address this issue and make a fair comparison with the chart, we change the Shewhart chart into a CUSUM chart and use the bootstrap method (Chatterjee and Qiu, 2009) to find its control limit. The second alternative nonparametric CUSUM chart is the one based on the antiranks (Qiu and Hawkins, 2003), denoted as ARCUSUM. Since we usually do not know the shift direction in practice, we choose to use the first and last antiranks in this chart for fair comparisons. The smoothing matrix used in the MEWMA chart is chosen to be the same as that in Chen et al. (2015), the allowance constants in the four CUSUM charts are chosen to be 0.5, the subgroup size of the SNCUSUM chart is fixed at 10, and certain IC parameters used in all five charts are estimated from an IC dataset of size 500. The actual ARL 0 values of the charts are calculated from 10,000 replicated simulations and presented in Table 1. In the table, we would like to point out that the MEWMA chart cannot be used in Case I, because the chart is based on the multivariate Poisson log-normal distributional assumption, which implies that the variance of any component of X(n) cannot be smaller than its mean (Aitchison and Ho, 1989). But, Case I denotes a case of under-dispersion, as mentioned above. So, that multivariate Poisson log-normal distributional assumption cannot be valid in that case. From Table 1, we can see that the actual ARL 0 values of the nonparametric charts, AR- CUSUM and SNCUSUM are all very close to the nominal ARL 0 values in all cases considered. When 11

12 Table 1. The actual ARL 0 values and their standard errors (in parentheses) of the five control charts when the nominal ARL 0 values are fixed at 200 or 500. Nominal ARL 0 Chart Case I Case II Case III Case IV MEWMA (2.05) (1.83) (2.01) DCUSUM (2.87) (2.54) (2.91) (3.23) SNCUSUM (2.28) (2.25) (2.32) (2.34) ARCUSUM (3.25) (3.31) (2.93) (2.99) (2.87) (2.93) (2.81) (2.72) MEWMA (3.67) (3.25) (3.31) DCUSUM (7.30) 514 (5.60) (6.53) (7.73) SNCUSUM (5.05) (5.51) (5.11) (5.06) ARCUSUM (6.43) (6.50) (5.92) (6.03) (5.80) (5.69) (5.75) (5.65) further comparing the three nonparametric control charts in terms of the standard error, the SNCUSUM chart has the best performance and the chart performs better than the ARCUSUM chart. As a comparison, the actual ARL 0 values of the parametric charts MEWMA and DCUSUM deviate significantly from the nominal ARL 0 values in most cases considered. If the actual ARL 0 value of a chart is substantially larger than the nominal value, then the chart would be too conservative in detecting process distributional shifts in the sense that a real shift would not be detected as quickly as expected. Thus, we could have the situation when the process produces many defective products without notice. In the case when the actual ARL 0 value of a chart is substantially smaller than the nominal value, then the chart would give many false signals. Thus, the production process would be unnecessarily stopped too often and too soon. Much human resource and production efficiency would be wasted as a consequence. Therefore, the parametric charts should be used with care in practice, and their distributional assumptions should be adequately checked in advance. Next, we compare the OC performance of the related control charts. From Table 1, we can see that the MEWMA and DCUSUM charts have unreliable IC performance in various cases considered. In such cases, their shift detection power might be irrelevant because a good shift detection power could be due to an overly small actual ARL 0 value. To make a relatively fair comparison about their OC performance, 12

13 we use the bootstrap procedure to compute the control limits of the MEWMA and DCUSUM charts so that their actual ARL 0 values reach the nominal ARL 0 value, and the related control charts are denoted as MEWMA(b) and DCUSUM(b), respectively. Thus, the MEWMA(b) and DCUSUM(b) charts can also be regarded as nonparametric control charts because computation of their control limits does not depend on the assumed parametric distributions. The bootstrap sample size used in these two charts is chosen to be 500 that is the same as that of the IC dataset. For illustration purposes, the original versions of the MEWMA and DCUSUM charts are also considered here. In this example, the true process distribution is assumed to be the standardized version with mean 0 and standard deviation 1 of one of the Cases I-IV. In each case, two shift scenarios are considered: one is that a shift occurs only in the first component with the shift size changing from -1.0 to 1.0, and the other is that a shift occurs in all three components with the same size changing from -0.4 to 0.4. is used to denote the magnitude of the considered shift in the following figures. Because performance of different CUSUM charts with a same allowance constant may not be comparable (cf., Qiu, 2008), we choose to compare their optimal performance when detecting a specific shift. Namely, for detecting a given shift, we search the allowance constant of a CUSUM chart such that the value reaches the minimum when its ARL 0 value is fixed at the nominal level. For the MEWMA chart, we still use the same smoothing parameters as those in Chen et al. (2015), because it was shown in that paper that these smoothing parameters were robust and could give optimal or close to optimal performance across different settings. Also, that chart is not considered in Case I for the reason given above about Table 1. Based on 10,000 replications, the calculated values of the, DCUSUM, MEWMA, DCUSUM(b) and MEWMA(b) charts are shown in Figures 1-2, respectively, in the two shift scenarios. These two figures are used for demonstrating the unreliability of the parametric charts DCUSUM and MEWMA and for comparing the chart with the modified parametric charts DCUSUM(b) and MEWMA(b). As shown in Figures 1-2, when the true process distribution is Poisson (i.e., plot (b) in both Figures), DCUSUM and MEWMA perform similarly to DCUSUM(b) and MEWMA(b), respectively. This is expected because the Poisson distribution assumption is valid in such cases. In the other cases when 13

14 the true process distribution is binomial or zero-inflated Poisson, DCUSUM and MEWMA are unreliable. Regarding DCUSUM(b) and MEWMA(b), they have a similar performance to. The chart seems to be more sensitive to small shifts while the DCUSUM(b) and MEWMA(b) charts perform better for detecting larger shifts DCUSUM DCUSUM(b) MEWMA(b) DCUSUM MEWMA DCUSUM(b) MEWMA(b) (a) (b) DCUSUM MEWMA DCUSUM(b) MEWMA(b) DCUSUM MEWMA DCUSUM(b) MEWMA(b) (c) (d) Figure 1. Calculated values of the control charts when the nominal ARL 0 is fixed at 200, and the actual IC process distribution is the standardized version of the one in Case I (plot (a)), Case II (plot (b)), Case III (plot (c)), and Case IV (plot (d)). Shift of size occurs only in the first component of X(n). Figures 3-4 show the calculated values of the three nonparametric control charts, 14

15 DCUSUM DCUSUM(b) MEWMA(b) DCUSUM MEWMA DCUSUM(b) MEWMA(b) (a) (b) DCUSUM MEWMA DCUSUM(b) MEWMA(b) DCUSUM MEWMA DCUSUM(b) MEWMA(b) (c) (d) Figure 2. Calculated values of the control charts when the nominal ARL 0 is fixed at 200, and the actual IC process distribution is the standardized version of the one in Case I (plot (a)), Case II (plot (b)), Case III (plot (c)), and Case IV (plot (d)). Shift of size occurs in all components of X(n). ARCUSUM and SNCUSUM in the two shift scenarios, respectively, based on 10,000 replicated simulations. From the figures, we can see that PCUCUM outperforms ARCUSUM and SNCUSUM in most cases, and ARCUSUM outperforms SNCUSUM in most cases as well. Next, we study the standard deviation of the OC run length, denoted as SDRL 1, of various control charts. In cases when the shift occurs in the first component of X(n) (i.e., the first shift scenario) with the 15

16 ARCUSUM SNCUSUM ARCUSUM SNCUSUM (a) (b) ARCUSUM SNCUSUM ARCUSUM SNCUSUM (c) (d) Figure 3. Calculated values of the control charts when the nominal ARL 0 is fixed at 200, and the actual IC process distribution is the standardized version of the one in Case I (plot (a)), Case II (plot (b)), Case III (plot (c)), and Case IV (plot (d)). Shift of size occurs only in the first component of X(n). size changes from 0.2 to 1.0 and other setupa are the same as those in Figures 1-2, the calculated SDRL 1 values of the charts, ARCUSUM, SNCUSUM, DCUSUM(b) and MEWMA(b) are shown in Figure 5. From the plots in the figure, it can be seen that has relatively small SDRL 1 values, compared to the other four charts, in most cases considered. For downward shifts in the first shift scenario and for shifts in the second shift scenario, results are similar and thus omitted here. 16

17 ARCUSUM SNCUSUM ARCUSUM SNCUSUM (a) (b) ARCUSUM SNCUSUM ARCUSUM SNCUSUM (c) (d) Figure 4. Calculated values of the control charts when the nominal ARL 0 is fixed at 200, and the actual IC process distribution is the standardized version of the one in Case I (plot (a)), Case II (plot (b)), Case III (plot (c)), and Case IV (plot (d)). Shift of size occurs in all components of X(n). As a summary of the above simulation results, we can have the following conclusions. (i) The parametric charts DCUSUM and MEWMA are unreliable to use in cases when their distribution assumptions are violated. (ii) Among the three nonparametric charts ARCUSUM, SNCUSUM and, in terms of, outperforms SNCUSUM in all cases considered, and it outperforms ARCUSUM when detecting most upward shifts in all four cases and when detecting downward shifts in Cases I and II. In terms 17

18 SDRL ARCUSUM SNCUSUM DCUSUM(b) MEWMA(b) SDRL ARCUSUM SNCUSUM DCUSUM(b) MEWMA(b) (a) (b) SDRL ARCUSUM SNCUSUM DCUSUM(b) MEWMA(b) SDRL ARCUSUM SNCUSUM DCUSUM(b) MEWMA(b) (c) (d) Figure 5. The SDRL 1 values of the five nonparametric control charts when the nominal ARL 0 is fixed at 200, and the actual IC process distribution is the standardized version of the one in Case I (plot (a)), Case II (plot (b)), Case III (plot (c)), and Case IV (plot (d)). Shift of size occurs only in the first component of X(n). of SDRL 1, similar conclusions can be made. Therefore, the chart is preferred in general among the three nonparametric control charts considered here for monitoring multivariate count data. As mentioned earlier, certain IC parameters (e.g., the medians m j ) of the chart need to be estimated from an IC dataset. Thus, its performance depends on the size n 0 of the IC dataset. To study the 18

19 impact of n 0 on the performance of the chart, let us consider Case I and compute the optimal values of the chart when n 0 changes from 100, 500, 1000 to 5000 and all other parameters are chosen to be the same as those in the example of Figure 1. The results are presented in Figure 6, where the y-axis is in natural logarithm scale to better distinguish the difference. From the plots, we can see that the results are better when n 0 is chosen larger, as expected, and the results do not change much when n n_0=100 n_0=500 n_0=1000 n_0=5000 Self starting n_0=100 n_0=500 n_0=1000 n_0=5000 Self starting (a) (b) Figure 6. Calculated optimal values of the chart when n 0 = 100,500,1000 and 5000, and of the self-start version of the chart in Case I. Shift of size occurs only in the first component of X(n) in plot (a), and occurs in all components of X(n) in plot (b). In certain applications, it might be difficult to have 1000 or even 500 IC observations before online monitoring. In such cases, a natural approach to overcome that difficulty is to use a self-starting control chart (Hawkins, 1987; Capizzi and Masarotto, 2010). The basic idea behind that chart is that the new observation collected at the current time point can be combined with the IC dataset once we confirm that the related process is IC at the current time point. So, the size of the IC dataset can potentially increase during online process monitoring and the IC parameters can potentially be estimated more and more accurately. To use a self-starting chart, a small number of IC observations is still needed in advance, to 19

20 calculate initial estimates of the IC parameters. Several existing studies, including those in Hawkins (1987), Hawkins and Maboudou-Tchao (2007) and Sullivan and Jones (2002), showed that the results would be quite stable if we could collect IC observations in advance, depending on the dimensionality of the process observations. It was shown that a dozen initial IC observations were usually good enough in univariate cases and we needed initial IC observations when the dimensionality of the process observations was up to 15. We constructed the self-starting version of the chart, and performed a big simulation study about its numerical performance. Our results are overall consistent with those in the existing studies mentioned above. In the setup of the previous example, its optimal values are shown in Figure 6 by the solid lines, in cases when 30 initial IC observations are used. We can see that its performance is about the same as that of the original chart with 5,000 IC observations in the scenario of plot (a), and even much better than the latter in the scenario of plot (b). 5 A Real-Data Application In this section, we apply the related control charts to a real-data example about the crime logs in 2016 at the University of Florida. Three types of common crimes are considered in this example, including Driving Under the Influence (X 1 ), Narcotics Violation (X 2 ) and Larceny/Theft (X 3 ). Daily counts of these crimes can be found at the website of the University of Florida Police Department ( We then use the first half of the data (more specifically, the first 185 observations) as the IC data, and the remaining as the test data for online monitoring. Both the IC data and the test data are shown in the 3-D plot in Figure 7. We can see that the test data have a larger mean than the IC data. The IC data are also shown in the first row of Figure 8. The corresponding histograms of the three variables are shown in the second row, along with their density curves (solid lines) and the density curves of the Poisson distributions with the same means (dashed lines). From Figure 8(d)-(f), we can see that the marginal distributions of X 1, X 2 and X 3 are quite different from the corresponding Poisson distributions for 20

21 the IC data. We then use the Pearson s Chi-square goodness-of-fit test and the Fisher s index of dispersion test (Fisher et al., 1922) to formally test whether each variable in the IC data follows a Poisson distribution. Both tests conclude that X 2 and X 3 are significantly different from Poisson (Pearson s test: p-value=0.000 for X 2 and p-value=0.001 for X 3 ; Fisher s test: p-value=0.000 for X 2 and p-value=0.002 for X 3 ). As for X 1, the Pearson s test gives the p-value of 0.077, while the Fisher s test gives the p-value of Therefore, we can conclude that the joint distribution of X(n)=(X 1 (n),x 2 (n),x 3 (n)) cannot be multivariate Poisson, because a joint Poisson distribution implies that all marginal distributions are Poisson. X X 2 X 1 IC Observations Testing Observations Figure 7. A 3-D scatter plot of the daily counts of the three types of crimes in year 2016 at the University of Florida. Because it is confirmed above that the IC data do not follow a multivariate Poisson distribution, only the five nonparametric charts, ARCUSUM, SNCUSUM, DCUSUM(b) and MEWMA(b) are considered in this example. When implementing the related control charts, the nominal ARL 0 is fixed at 200 for all five charts, the allowance constant is chosen to be 0.1 for the, ARCUSUM, SNCUSUM and DCUSUM(b) charts, the subgroup size in the SNCUSUM chart is fixed at 10, and the smoothing parameters 21

22 X X X Day (a) Day (b) Day (c) Density Density Density (d) Figure 8. Daily counts ((a)-(c)) and histograms ((d)-(f)) of the three types of crimes in the first 185 days in year 2016 at the University of Florida. The dashed lines in plots (d)-(f) denote the density curves of the data, and the solid lines denote the density curves of the Poisson distributions with the same means. (e) (f) of the MEWMA(b) chart are chosen to be the same as those used in Chen et al. (2015). The five control charts are presented in Figure 9, where the dashed horizontal lines denote their control limits. From the plots, we can see that the, ARCUSUM, SNCUSUM, DCUSUM(b) and MEWMA(b) charts give signals at the 2nd, 2nd, 180th, 82nd and 70th testing observations, respectively. Therefore, the and ARCUSUM charts detect the distributional shifts earlier than the remaining 3 charts in this example. 22

23 ARCUSUM Observation (a) Observation (b) SNCUSUM u_n+ u_n DCUSUM(b) u_n+ u_n Subgroup (c) Observation (d) MEWMA(b) Observation (e) Figure 9. The, ARCUSUM, SNCUSUM, DCUSUM(b) and MEWMA(b) charts for monitoring the daily counts of the three types of crimes at the University of Florida during the second half of the year in The dashed horizontal lines are the control limits of the related control charts. 23

24 6 Concluding Remarks SPC for multiple count data remains a challenging problem. Most existing methods for monitoring multiple count data are based on certain parametric models, among which the most commonly used one is the multivariate Poisson distribution. In practice, however, the assumed parametric models are rarely valid, due to the complicated impact of various factors (e.g., the environment, weather, and so forth) on the count data. We have shown in the paper that the parametric control charts are unreliable to use in such cases because their actual ARL 0 values could be substantially different from a nominal level. In this paper, we carefully compare the performance of some representative parametric and nonparametric control charts in monitoring multiple count data. We suggest that the related parametric model should be checked carefully before a parametric chart is used. In cases when we are unsure whether the assumed parametric model is valid, or when we do not know which parametric chart is appropriate, we suggest using a nonparametric chart instead. To this end, the chart and its self-starting version can give reasonably good results in general, based on our intensive numerical study in various different cases. There are a number of issues that have not been discussed thoroughly in the current paper. For instance, we focus on mean shifts only in this paper. Although we believe that the chart can also detect shifts in the scale parameters, we do not know how effective it is for that purpose. Also, the allowance constant k P needs to be determined in advance and it might be selected adaptively, as discussed in Section 4.5 of Qiu (2014). These issues and some others will be addressed in our future research. Acknowledgments The authors thank the editors and referees for many helpful comments and suggestions, which improved the quality of the paper greatly. This research is supported in part by the National Science Foundation grant DMS in US and by the Natural Sciences Foundation of China grants and

25 Biographical Sketches Peihua Qiu received his PhD in statistics from the Department of Statistics at the University of Wisconsin at Madison in He worked as a senior research consulting statistician of the Biostatistics Center at the Ohio State University during Then, he worked as an assistant professor ( ), an associate professor ( ), and a full professor ( ) of the School of Statistics at the University of Minnesota. He is an elected fellow of the American Statistical Association, an elected fellow of the Institute of Mathematical Statistics, an elected member of the International Statistical Institute, a senior member of the American Society for Quality, and a lifetime member of the International Chinese Statistical Association. He served as associate editor for a number of top journals in statistics, including Journal of the American Statistical Association, Biometrics, and Technometrics. He was the editor of Technometrics during and has been the founding chair of the Department of Biostatistics at the University of Florida since July 1, Peihua Qiu has made substantial contributions in the areas of jump regression analysis, image processing, statistical process control, survival analysis, and reliability. So far, he has published over 100 research papers in top refereed journals, including Technometrics, Journal of the American Statistical Association, Annals of Statistics, Annals of Applied Statistics, Journal of the Royal Statistical Society (Series B), Biometrika, Biometrics, IEEE Transactions on Pattern Analysis and Machine Intelligence, and IIE Transactions. His research monograph titled Image Processing and Jump Regression Analysis (2005, Wiley) won the inaugural Ziegel prize in 2007, for its contribution in bridging the gap between jump regression analysis in statistics and image processing in computer science. His second book titled Introduction to Statistical Process Control was published in 2014 by Chapman & Hall/CRC. Zhen He is a professor in the College of Management and Economics, Tianjin University. He received his PhD in Management Science and Engineering from Tianjin University, China, in He is the recipient of the Outstanding Research Young Scholar Award of the National Natural Science Foundation of China. His research interests include statistical quality control, design of experiment, and Six Sigma management. 25

26 Zhiqiong Wang received his PhD degree in 2018 from the College of Management and Economics of the Tianjin University in China. He is currently an assistant professor of the School of Management at the Tianjin University of Technology. His major research interests include quality control and management, change-point detection, and various quality-related applications. This paper was written during his one-year visit at the Department of Biostatistics of the University of Florida. References Agresti, A. (2013). Categorical data analysis. John Wiley & Sons,. Aitchison, J. and Ho, C. (1989). The multivariate Poisson-log normal distribution. Biometrika, 76(4): Boone, J. and Chakraborti, S. (2012). Two simple Shewhart-type multivariate nonparametric control charts. Applied Stochastic Models in Business and Industry, 28(2): Brijs, T., Karlis, D., Swinnen, G., Vanhoof, K., Wets, G., and Manchanda, P. (2004). A multivariate Poisson mixture model for marketing applications. Statistica Neerlandica, 58(3): Capizzi, G. and Masarotto, G. (2010). Self-starting CUSCORE control charts for individual multivariate observations. Journal of Quality Technology, 42(2): Chatterjee, S. and Qiu, P. (2009). Distribution-free cumulative sum control charts using bootstrap-based control limits. The Annals of Applied Statistics, 3(1): Chen, N., Li, Z., and Ou, Y. (2015). Multivariate exponentially weighted moving-average chart for monitoring Poisson observations. Journal of Quality Technology, 47(3): Chen, N., Zi, X., and Zou, C. (2016). A distribution-free multivariate control chart. Technometrics, 58(4):

27 Chiu, J. E. and Kuo, T. I. (2008). Attribute control chart for multivariate Poisson distribution. Communications in Statistics-Theory and Methods, 37(1): Chiu, J. E. and Kuo, T. I. (2010). Control charts for fraction nonconforming in a bivariate binomial process. Journal of Applied Statistics, 37(10): Cozzucoli, P. C. and Marozzi, M. (2017). Monitoring multivariate Poisson processes: a review and some new results. Quality Technology & Quantitative Management. Das, D., Zhou, S., Chen, Y., and Horst, J. (2016). Statistical monitoring of over-dispersed multivariate count data using approximate likelihood ratio tests. International Journal of Production Research, 54(21): Fisher, R., Thornton, H., and Mackenzie, W. (1922). The accuracy of the plating method of estimating the density of bacterial populations. Annals of Applied Biology, 9(3-4): Hawkins, D. M. (1987). Self-starting cusum charts for location and scale. The Statistician, pages Hawkins, D. M., Choi, S., and Lee, S. (2007). A general multivariate exponentially weighted movingaverage control chart. Journal of Quality Technology, 39(2): Hawkins, D. M. and Maboudou-Tchao, E. M. (2007). Self-starting multivariate exponentially weighted moving average control charting. Technometrics, 49(2): He, S., He, Z., and Wang, G. A. (2014). CUSUM control charts for multivariate Poisson distribution. Communications in Statistics-Theory and Methods, 43(6): Holgate, P. (1964). Estimation for the bivariate Poisson distribution. Biometrika, 51(1-2): Johnson, N. L., Kotz, S., and Balakrishnan, N. (1997). Discrete multivariate distributions. Wiley New York. Jones, L. A., Woodall, W. H., and Conerly, M. D. (1999). Exact properties of demerit control charts. Journal of Quality Technology, 31(2):

28 Karlis, D. (2003). An EM algorithm for multivariate Poisson distribution and related models. Journal of Applied Statistics, 30(1): Karlis, D. and Meligkotsidou, L. (2005). Multivariate Poisson regression with covariance structure. Statistics and Computing, 15(4): Laungrungrong, B., Borror, C. M., and Montgomery, D. C. (2011). EWMA control charts for multivariate Poisson-distributed data. International Journal of Quality Engineering and Technology, 2(3): Lee Ho, L. and Branco Costa, A. F. (2009). Control charts for individual observations of a bivariate Poisson process. The International Journal of Advanced Manufacturing Technology, 43(7): Li, J., Tsung, F., and Zou, C. (2014). Multivariate binomial/multinomial control chart. IIE Transactions, 46(5): Lu, X. S., Xie, M., Goh, T. N., and Lai, C. D. (1998). Control chart for multivariate attribute processes. International Journal of Production Research, 36(12): Patel, H. (1973). Quality control methods for multivariate binomial and Poisson distributions. Technometrics, 15(1): Qiu, P. (2008). Distribution-free multivariate process control based on log-linear modeling. IIE Transactions, 40(7): Qiu, P. (2014). Introduction to statistical process control. Chapman & Hall/CRC: Boca Raton, FL. Qiu, P. (2018). Some perspectives on nonparametric statistical process control. Journal of Quality Technology, 50(1): Qiu, P. and Hawkins, D. (2001). A rank-based multivariate CUSUM procedure. Technometrics, 43(2):

29 Qiu, P. and Hawkins, D. (2003). A nonparametric multivariate cumulative sum procedure for detecting shifts in all directions. Journal of the Royal Statistical Society: Series D (The Statistician), 52(2): Qiu, P. and Li, Z. (2011). On nonparametric statistical process control of univariate processes. Technometrics, 53(4): Saghir, A. and Lin, Z. (2014). Control chart for monitoring multivariate COM-Poisson attributes. Journal of Applied Statistics, 41(1): Saghir, A. and Lin, Z. (2015). Control charts for dispersed count data: an overview. Quality and Reliability Engineering International, 31(5): Schmidt, A. M. and Rodriguez, M. A. (2011). Modelling multivariate counts varying continuously in space. Bayesian Statistics, 9: Skinner, K. R., Montgomery, D. C., and Runger, G. C. (2003). Process monitoring for multiple count data using generalized linear model-based control charts. International Journal of Production Research, 41(6): Sullivan, J. H. and Jones, L. A. (2002). A self-starting control chart for multivariate individual observations. Technometrics, 44(1): Topalidou, E. and Psarakis, S. (2009). Review of multinomial and multiattribute quality control charts. Quality and Reliability Engineering International, 25(7): Wang, Z., Li, Y., and Zhou, X. (2017). A statistical control chart for monitoring high-dimensional Poisson data streams. Quality and Reliability Engineering International, 33: Zou, C. and Tsung, F. (2011). A multivariate sign EWMA control chart. Technometrics, 53(1): Zou, C., Wang, Z., and Tsung, F. (2012). A spatial rank-based multivariate EWMA control chart. Naval Research Logistics, 59(2):

Distribution-Free Monitoring of Univariate Processes. Peihua Qiu 1 and Zhonghua Li 1,2. Abstract

Distribution-Free Monitoring of Univariate Processes. Peihua Qiu 1 and Zhonghua Li 1,2. Abstract Distribution-Free Monitoring of Univariate Processes Peihua Qiu 1 and Zhonghua Li 1,2 1 School of Statistics, University of Minnesota, USA 2 LPMC and Department of Statistics, Nankai University, China

More information

Statistical Process Control for Multivariate Categorical Processes

Statistical Process Control for Multivariate Categorical Processes Statistical Process Control for Multivariate Categorical Processes Fugee Tsung The Hong Kong University of Science and Technology Fugee Tsung 1/27 Introduction Typical Control Charts Univariate continuous

More information

A New Demerit Control Chart for Monitoring the Quality of Multivariate Poisson Processes. By Jeh-Nan Pan Chung-I Li Min-Hung Huang

A New Demerit Control Chart for Monitoring the Quality of Multivariate Poisson Processes. By Jeh-Nan Pan Chung-I Li Min-Hung Huang Athens Journal of Technology and Engineering X Y A New Demerit Control Chart for Monitoring the Quality of Multivariate Poisson Processes By Jeh-Nan Pan Chung-I Li Min-Hung Huang This study aims to develop

More information

Rejoinder. Peihua Qiu Department of Biostatistics, University of Florida 2004 Mowry Road, Gainesville, FL 32610

Rejoinder. Peihua Qiu Department of Biostatistics, University of Florida 2004 Mowry Road, Gainesville, FL 32610 Rejoinder Peihua Qiu Department of Biostatistics, University of Florida 2004 Mowry Road, Gainesville, FL 32610 I was invited to give a plenary speech at the 2017 Stu Hunter Research Conference in March

More information

The Robustness of the Multivariate EWMA Control Chart

The Robustness of the Multivariate EWMA Control Chart The Robustness of the Multivariate EWMA Control Chart Zachary G. Stoumbos, Rutgers University, and Joe H. Sullivan, Mississippi State University Joe H. Sullivan, MSU, MS 39762 Key Words: Elliptically symmetric,

More information

Construction of An Efficient Multivariate Dynamic Screening System. Jun Li a and Peihua Qiu b. Abstract

Construction of An Efficient Multivariate Dynamic Screening System. Jun Li a and Peihua Qiu b. Abstract Construction of An Efficient Multivariate Dynamic Screening System Jun Li a and Peihua Qiu b a Department of Statistics, University of California at Riverside b Department of Biostatistics, University

More information

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 Rejoinder Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 1 School of Statistics, University of Minnesota 2 LPMC and Department of Statistics, Nankai University, China We thank the editor Professor David

More information

Multivariate Charts for Multivariate. Poisson-Distributed Data. Busaba Laungrungrong

Multivariate Charts for Multivariate. Poisson-Distributed Data. Busaba Laungrungrong Multivariate Charts for Multivariate Poisson-Distributed Data by Busaba Laungrungrong A Dissertation Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy Approved November

More information

Weighted Likelihood Ratio Chart for Statistical Monitoring of Queueing Systems

Weighted Likelihood Ratio Chart for Statistical Monitoring of Queueing Systems Weighted Likelihood Ratio Chart for Statistical Monitoring of Queueing Systems Dequan Qi 1, Zhonghua Li 2, Xuemin Zi 3, Zhaojun Wang 2 1 LPMC and School of Mathematical Sciences, Nankai University, Tianjin

More information

arxiv: v1 [stat.me] 14 Jan 2019

arxiv: v1 [stat.me] 14 Jan 2019 arxiv:1901.04443v1 [stat.me] 14 Jan 2019 An Approach to Statistical Process Control that is New, Nonparametric, Simple, and Powerful W.J. Conover, Texas Tech University, Lubbock, Texas V. G. Tercero-Gómez,Tecnológico

More information

Multivariate Binomial/Multinomial Control Chart

Multivariate Binomial/Multinomial Control Chart Multivariate Binomial/Multinomial Control Chart Jian Li 1, Fugee Tsung 1, and Changliang Zou 2 1 Department of Industrial Engineering and Logistics Management, Hong Kong University of Science and Technology,

More information

THE CUSUM MEDIAN CHART FOR KNOWN AND ESTIMATED PARAMETERS

THE CUSUM MEDIAN CHART FOR KNOWN AND ESTIMATED PARAMETERS THE CUSUM MEDIAN CHART FOR KNOWN AND ESTIMATED PARAMETERS Authors: Philippe Castagliola Université de Nantes & LS2N UMR CNRS 6004, Nantes, France (philippe.castagliola@univ-nantes.fr) Fernanda Otilia Figueiredo

More information

An Adaptive Exponentially Weighted Moving Average Control Chart for Monitoring Process Variances

An Adaptive Exponentially Weighted Moving Average Control Chart for Monitoring Process Variances An Adaptive Exponentially Weighted Moving Average Control Chart for Monitoring Process Variances Lianjie Shu Faculty of Business Administration University of Macau Taipa, Macau (ljshu@umac.mo) Abstract

More information

Nonparametric Multivariate Control Charts Based on. A Linkage Ranking Algorithm

Nonparametric Multivariate Control Charts Based on. A Linkage Ranking Algorithm Nonparametric Multivariate Control Charts Based on A Linkage Ranking Algorithm Helen Meyers Bush Data Mining & Advanced Analytics, UPS 55 Glenlake Parkway, NE Atlanta, GA 30328, USA Panitarn Chongfuangprinya

More information

A Theoretically Appropriate Poisson Process Monitor

A Theoretically Appropriate Poisson Process Monitor International Journal of Performability Engineering, Vol. 8, No. 4, July, 2012, pp. 457-461. RAMS Consultants Printed in India A Theoretically Appropriate Poisson Process Monitor RYAN BLACK and JUSTIN

More information

Module B1: Multivariate Process Control

Module B1: Multivariate Process Control Module B1: Multivariate Process Control Prof. Fugee Tsung Hong Kong University of Science and Technology Quality Lab: http://qlab.ielm.ust.hk I. Multivariate Shewhart chart WHY MULTIVARIATE PROCESS CONTROL

More information

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1

Contents. Preface to Second Edition Preface to First Edition Abbreviations PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 Contents Preface to Second Edition Preface to First Edition Abbreviations xv xvii xix PART I PRINCIPLES OF STATISTICAL THINKING AND ANALYSIS 1 1 The Role of Statistical Methods in Modern Industry and Services

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Self-Starting Control Chart for Simultaneously Monitoring Process Mean and Variance

Self-Starting Control Chart for Simultaneously Monitoring Process Mean and Variance International Journal of Production Research Vol. 00, No. 00, 15 March 2008, 1 14 Self-Starting Control Chart for Simultaneously Monitoring Process Mean and Variance Zhonghua Li a, Jiujun Zhang a,b and

More information

A New Bootstrap Based Algorithm for Hotelling s T2 Multivariate Control Chart

A New Bootstrap Based Algorithm for Hotelling s T2 Multivariate Control Chart Journal of Sciences, Islamic Republic of Iran 7(3): 69-78 (16) University of Tehran, ISSN 16-14 http://jsciences.ut.ac.ir A New Bootstrap Based Algorithm for Hotelling s T Multivariate Control Chart A.

More information

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee Songklanakarin Journal of Science and Technology SJST-0-0.R Sukparungsee Bivariate copulas on the exponentially weighted moving average control chart Journal: Songklanakarin Journal of Science and Technology

More information

An exponentially weighted moving average scheme with variable sampling intervals for monitoring linear profiles

An exponentially weighted moving average scheme with variable sampling intervals for monitoring linear profiles An exponentially weighted moving average scheme with variable sampling intervals for monitoring linear profiles Zhonghua Li, Zhaojun Wang LPMC and Department of Statistics, School of Mathematical Sciences,

More information

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location MARIEN A. GRAHAM Department of Statistics University of Pretoria South Africa marien.graham@up.ac.za S. CHAKRABORTI Department

More information

Monitoring Multivariate Data via KNN Learning

Monitoring Multivariate Data via KNN Learning Monitoring Multivariate Data via KNN Learning Chi Zhang 1, Yajun Mei 2, Fugee Tsung 1 1 Department of Industrial Engineering and Logistics Management, Hong Kong University of Science and Technology, Clear

More information

A Multivariate EWMA Control Chart for Skewed Populations using Weighted Variance Method

A Multivariate EWMA Control Chart for Skewed Populations using Weighted Variance Method OPEN ACCESS Int. Res. J. of Science & Engineering, 04; Vol. (6): 9-0 ISSN: 3-005 RESEARCH ARTICLE A Multivariate EMA Control Chart for Skewed Populations using eighted Variance Method Atta AMA *, Shoraim

More information

Introduction to Statistical Analysis

Introduction to Statistical Analysis Introduction to Statistical Analysis Changyu Shen Richard A. and Susan F. Smith Center for Outcomes Research in Cardiology Beth Israel Deaconess Medical Center Harvard Medical School Objectives Descriptive

More information

Zero-Inflated Models in Statistical Process Control

Zero-Inflated Models in Statistical Process Control Chapter 6 Zero-Inflated Models in Statistical Process Control 6.0 Introduction In statistical process control Poisson distribution and binomial distribution play important role. There are situations wherein

More information

Likelihood-Based EWMA Charts for Monitoring Poisson Count Data with Time-Varying Sample Sizes

Likelihood-Based EWMA Charts for Monitoring Poisson Count Data with Time-Varying Sample Sizes Likelihood-Based EWMA Charts for Monitoring Poisson Count Data with Time-Varying Sample Sizes Qin Zhou 1,3, Changliang Zou 1, Zhaojun Wang 1 and Wei Jiang 2 1 LPMC and Department of Statistics, School

More information

A Modified Poisson Exponentially Weighted Moving Average Chart Based on Improved Square Root Transformation

A Modified Poisson Exponentially Weighted Moving Average Chart Based on Improved Square Root Transformation Thailand Statistician July 216; 14(2): 197-22 http://statassoc.or.th Contributed paper A Modified Poisson Exponentially Weighted Moving Average Chart Based on Improved Square Root Transformation Saowanit

More information

A Power Analysis of Variable Deletion Within the MEWMA Control Chart Statistic

A Power Analysis of Variable Deletion Within the MEWMA Control Chart Statistic A Power Analysis of Variable Deletion Within the MEWMA Control Chart Statistic Jay R. Schaffer & Shawn VandenHul University of Northern Colorado McKee Greeley, CO 869 jay.schaffer@unco.edu gathen9@hotmail.com

More information

Directionally Sensitive Multivariate Statistical Process Control Methods

Directionally Sensitive Multivariate Statistical Process Control Methods Directionally Sensitive Multivariate Statistical Process Control Methods Ronald D. Fricker, Jr. Naval Postgraduate School October 5, 2005 Abstract In this paper we develop two directionally sensitive statistical

More information

A Nonparametric Control Chart Based On The Mann Whitney

A Nonparametric Control Chart Based On The Mann Whitney A Nonparametric Control Chart Based On The Mann Whitney We have made it easy for you to find a PDF Ebooks without any digging. And by having access to our ebooks online or by storing it on your computer,

More information

Detecting Assignable Signals via Decomposition of MEWMA Statistic

Detecting Assignable Signals via Decomposition of MEWMA Statistic International Journal of Mathematics and Statistics Invention (IJMSI) E-ISSN: 2321 4767 P-ISSN: 2321-4759 Volume 4 Issue 1 January. 2016 PP-25-29 Detecting Assignable Signals via Decomposition of MEWMA

More information

Monitoring and diagnosing a two-stage production process with attribute characteristics

Monitoring and diagnosing a two-stage production process with attribute characteristics Iranian Journal of Operations Research Vol., No.,, pp. -6 Monitoring and diagnosing a two-stage production process with attribute characteristics Downloaded from iors.ir at :6 +33 on Wednesday October

More information

A comparison study of the nonparametric tests based on the empirical distributions

A comparison study of the nonparametric tests based on the empirical distributions 통계연구 (2015), 제 20 권제 3 호, 1-12 A comparison study of the nonparametric tests based on the empirical distributions Hyo-Il Park 1) Abstract In this study, we propose a nonparametric test based on the empirical

More information

Variations in a manufacturing process can be categorized into common cause and special cause variations. In the presence of

Variations in a manufacturing process can be categorized into common cause and special cause variations. In the presence of Research Article (wileyonlinelibrary.com) DOI: 10.1002/qre.1514 Published online in Wiley Online Library Memory-Type Control Charts for Monitoring the Process Dispersion Nasir Abbas, a * Muhammad Riaz

More information

ON CONSTRUCTING T CONTROL CHARTS FOR RETROSPECTIVE EXAMINATION. Gunabushanam Nedumaran Oracle Corporation 1133 Esters Road #602 Irving, TX 75061

ON CONSTRUCTING T CONTROL CHARTS FOR RETROSPECTIVE EXAMINATION. Gunabushanam Nedumaran Oracle Corporation 1133 Esters Road #602 Irving, TX 75061 ON CONSTRUCTING T CONTROL CHARTS FOR RETROSPECTIVE EXAMINATION Gunabushanam Nedumaran Oracle Corporation 33 Esters Road #60 Irving, TX 7506 Joseph J. Pignatiello, Jr. FAMU-FSU College of Engineering Florida

More information

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links

Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Communications of the Korean Statistical Society 2009, Vol 16, No 4, 697 705 Goodness-of-Fit Tests for the Ordinal Response Models with Misspecified Links Kwang Mo Jeong a, Hyun Yung Lee 1, a a Department

More information

Control charts continue to play a transformative role in all walks of life in the 21st century. The mean and the variance of a

Control charts continue to play a transformative role in all walks of life in the 21st century. The mean and the variance of a Research Article (wileyonlinelibrary.com) DOI: 10.1002/qre.1249 Published online in Wiley Online Library A Distribution-free Control Chart for the Joint Monitoring of Location and Scale A. Mukherjee, a

More information

THE DETECTION OF SHIFTS IN AUTOCORRELATED PROCESSES WITH MR AND EWMA CHARTS

THE DETECTION OF SHIFTS IN AUTOCORRELATED PROCESSES WITH MR AND EWMA CHARTS THE DETECTION OF SHIFTS IN AUTOCORRELATED PROCESSES WITH MR AND EWMA CHARTS Karin Kandananond, kandananond@hotmail.com Faculty of Industrial Technology, Rajabhat University Valaya-Alongkorn, Prathumthani,

More information

Quality Control & Statistical Process Control (SPC)

Quality Control & Statistical Process Control (SPC) Quality Control & Statistical Process Control (SPC) DR. RON FRICKER PROFESSOR & HEAD, DEPARTMENT OF STATISTICS DATAWORKS CONFERENCE, MARCH 22, 2018 Agenda Some Terminology & Background SPC Methods & Philosophy

More information

Monitoring aggregated Poisson data with probability control limits

Monitoring aggregated Poisson data with probability control limits XXV Simposio Internacional de Estadística 2015 Armenia, Colombia, 5, 6, 7 y 8 de Agosto de 2015 Monitoring aggregated Poisson data with probability control limits Victor Hugo Morales Ospina 1,a, José Alberto

More information

Multivariate Change Point Control Chart Based on Data Depth for Phase I Analysis

Multivariate Change Point Control Chart Based on Data Depth for Phase I Analysis Multivariate Change Point Control Chart Based on Data Depth for Phase I Analysis Zhonghua Li a, Yi Dai a, Zhaojun Wang a, a LPMC and School of Mathematical Sciences, Nankai University, Tianjin 300071,

More information

CONTROL CHARTS FOR THE GENERALIZED POISSON PROCESS WITH UNDER-DISPERSION

CONTROL CHARTS FOR THE GENERALIZED POISSON PROCESS WITH UNDER-DISPERSION International J. of Math. Sci. & Engg. Appls. (IJMSEA) ISSN 0973-9424, Vol. 10 No. III (December, 2016), pp. 173-181 CONTROL CHARTS FOR THE GENERALIZED POISSON PROCESS WITH UNDER-DISPERSION NARUNCHARA

More information

On the Distribution of Hotelling s T 2 Statistic Based on the Successive Differences Covariance Matrix Estimator

On the Distribution of Hotelling s T 2 Statistic Based on the Successive Differences Covariance Matrix Estimator On the Distribution of Hotelling s T 2 Statistic Based on the Successive Differences Covariance Matrix Estimator JAMES D. WILLIAMS GE Global Research, Niskayuna, NY 12309 WILLIAM H. WOODALL and JEFFREY

More information

Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior. Abstract

Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior. Abstract Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior Peihua Qiu 1 and Dongdong Xiang 2 1 School of Statistics, University of Minnesota 2 School

More information

Bootstrap-Based T 2 Multivariate Control Charts

Bootstrap-Based T 2 Multivariate Control Charts Bootstrap-Based T 2 Multivariate Control Charts Poovich Phaladiganon Department of Industrial and Manufacturing Systems Engineering University of Texas at Arlington Arlington, Texas, USA Seoung Bum Kim

More information

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California

Ronald Christensen. University of New Mexico. Albuquerque, New Mexico. Wesley Johnson. University of California, Irvine. Irvine, California Texts in Statistical Science Bayesian Ideas and Data Analysis An Introduction for Scientists and Statisticians Ronald Christensen University of New Mexico Albuquerque, New Mexico Wesley Johnson University

More information

Change Point Estimation of the Process Fraction Non-conforming with a Linear Trend in Statistical Process Control

Change Point Estimation of the Process Fraction Non-conforming with a Linear Trend in Statistical Process Control Change Point Estimation of the Process Fraction Non-conforming with a Linear Trend in Statistical Process Control F. Zandi a,*, M. A. Nayeri b, S. T. A. Niaki c, & M. Fathi d a Department of Industrial

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

DIAGNOSIS OF BIVARIATE PROCESS VARIATION USING AN INTEGRATED MSPC-ANN SCHEME

DIAGNOSIS OF BIVARIATE PROCESS VARIATION USING AN INTEGRATED MSPC-ANN SCHEME DIAGNOSIS OF BIVARIATE PROCESS VARIATION USING AN INTEGRATED MSPC-ANN SCHEME Ibrahim Masood, Rasheed Majeed Ali, Nurul Adlihisam Mohd Solihin and Adel Muhsin Elewe Faculty of Mechanical and Manufacturing

More information

Optimum Test Plan for 3-Step, Step-Stress Accelerated Life Tests

Optimum Test Plan for 3-Step, Step-Stress Accelerated Life Tests International Journal of Performability Engineering, Vol., No., January 24, pp.3-4. RAMS Consultants Printed in India Optimum Test Plan for 3-Step, Step-Stress Accelerated Life Tests N. CHANDRA *, MASHROOR

More information

Multivariate Process Control Chart for Controlling the False Discovery Rate

Multivariate Process Control Chart for Controlling the False Discovery Rate Industrial Engineering & Management Systems Vol, No 4, December 0, pp.385-389 ISSN 598-748 EISSN 34-6473 http://dx.doi.org/0.73/iems.0..4.385 0 KIIE Multivariate Process Control Chart for Controlling e

More information

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data

Approximation of Survival Function by Taylor Series for General Partly Interval Censored Data Malaysian Journal of Mathematical Sciences 11(3): 33 315 (217) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal homepage: http://einspem.upm.edu.my/journal Approximation of Survival Function by Taylor

More information

Monitoring Censored Lifetime Data with a Weighted-Likelihood Scheme

Monitoring Censored Lifetime Data with a Weighted-Likelihood Scheme Monitoring Censored Lifetime Data with a Weighted-Likelihood Scheme Chi Zhang, 1,2 Fugee Tsung, 2 Dongdong Xiang 1 1 School of Statistics, East China Normal University, Shanghai, China 2 Department of

More information

The occurrence of rare events in manufacturing processes, e.g. nonconforming items or machine failures, is frequently modeled

The occurrence of rare events in manufacturing processes, e.g. nonconforming items or machine failures, is frequently modeled Research Article (wileyonlinelibrary.com) DOI: 1.12/qre.1495 Published online in Wiley Online Library Exponential CUSUM Charts with Estimated Control Limits Min Zhang, a Fadel M. Megahed b * and William

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

PACKAGE LMest FOR LATENT MARKOV ANALYSIS

PACKAGE LMest FOR LATENT MARKOV ANALYSIS PACKAGE LMest FOR LATENT MARKOV ANALYSIS OF LONGITUDINAL CATEGORICAL DATA Francesco Bartolucci 1, Silvia Pandofi 1, and Fulvia Pennoni 2 1 Department of Economics, University of Perugia (e-mail: francesco.bartolucci@unipg.it,

More information

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1

Parametric Modelling of Over-dispersed Count Data. Part III / MMath (Applied Statistics) 1 Parametric Modelling of Over-dispersed Count Data Part III / MMath (Applied Statistics) 1 Introduction Poisson regression is the de facto approach for handling count data What happens then when Poisson

More information

MONITORING BIVARIATE PROCESSES WITH A SYNTHETIC CONTROL CHART BASED ON SAMPLE RANGES

MONITORING BIVARIATE PROCESSES WITH A SYNTHETIC CONTROL CHART BASED ON SAMPLE RANGES Blumenau-SC, 27 a 3 de Agosto de 217. MONITORING BIVARIATE PROCESSES WITH A SYNTHETIC CONTROL CHART BASED ON SAMPLE RANGES Marcela A. G. Machado São Paulo State University (UNESP) Departamento de Produção,

More information

3 Joint Distributions 71

3 Joint Distributions 71 2.2.3 The Normal Distribution 54 2.2.4 The Beta Density 58 2.3 Functions of a Random Variable 58 2.4 Concluding Remarks 64 2.5 Problems 64 3 Joint Distributions 71 3.1 Introduction 71 3.2 Discrete Random

More information

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract

WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION. Abstract Journal of Data Science,17(1). P. 145-160,2019 DOI:10.6339/JDS.201901_17(1).0007 WEIGHTED QUANTILE REGRESSION THEORY AND ITS APPLICATION Wei Xiong *, Maozai Tian 2 1 School of Statistics, University of

More information

Prediction of Bike Rental using Model Reuse Strategy

Prediction of Bike Rental using Model Reuse Strategy Prediction of Bike Rental using Model Reuse Strategy Arun Bala Subramaniyan and Rong Pan School of Computing, Informatics, Decision Systems Engineering, Arizona State University, Tempe, USA. {bsarun, rong.pan}@asu.edu

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Statistics Toolbox 6. Apply statistical algorithms and probability models

Statistics Toolbox 6. Apply statistical algorithms and probability models Statistics Toolbox 6 Apply statistical algorithms and probability models Statistics Toolbox provides engineers, scientists, researchers, financial analysts, and statisticians with a comprehensive set of

More information

Exam details. Final Review Session. Things to Review

Exam details. Final Review Session. Things to Review Exam details Final Review Session Short answer, similar to book problems Formulae and tables will be given You CAN use a calculator Date and Time: Dec. 7, 006, 1-1:30 pm Location: Osborne Centre, Unit

More information

Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of

Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of Preface Introduction to Statistics and Data Analysis Overview: Statistical Inference, Samples, Populations, and Experimental Design The Role of Probability Sampling Procedures Collection of Data Measures

More information

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti

Good Confidence Intervals for Categorical Data Analyses. Alan Agresti Good Confidence Intervals for Categorical Data Analyses Alan Agresti Department of Statistics, University of Florida visiting Statistics Department, Harvard University LSHTM, July 22, 2011 p. 1/36 Outline

More information

Control charts are used for monitoring the performance of a quality characteristic. They assist process

Control charts are used for monitoring the performance of a quality characteristic. They assist process QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL Qual. Reliab. Engng. Int. 2009; 25:875 883 Published online 3 March 2009 in Wiley InterScience (www.interscience.wiley.com)..1007 Research Identifying

More information

Hidden Markov models for time series of counts with excess zeros

Hidden Markov models for time series of counts with excess zeros Hidden Markov models for time series of counts with excess zeros Madalina Olteanu and James Ridgway University Paris 1 Pantheon-Sorbonne - SAMM, EA4543 90 Rue de Tolbiac, 75013 Paris - France Abstract.

More information

YIJUN ZUO. Education. PhD, Statistics, 05/98, University of Texas at Dallas, (GPA 4.0/4.0)

YIJUN ZUO. Education. PhD, Statistics, 05/98, University of Texas at Dallas, (GPA 4.0/4.0) YIJUN ZUO Department of Statistics and Probability Michigan State University East Lansing, MI 48824 Tel: (517) 432-5413 Fax: (517) 432-5413 Email: zuo@msu.edu URL: www.stt.msu.edu/users/zuo Education PhD,

More information

Optimal Cusum Control Chart for Censored Reliability Data with Log-logistic Distribution

Optimal Cusum Control Chart for Censored Reliability Data with Log-logistic Distribution CMST 21(4) 221-227 (2015) DOI:10.12921/cmst.2015.21.04.006 Optimal Cusum Control Chart for Censored Reliability Data with Log-logistic Distribution B. Sadeghpour Gildeh, M. Taghizadeh Ashkavaey Department

More information

An Investigation of Combinations of Multivariate Shewhart and MEWMA Control Charts for Monitoring the Mean Vector and Covariance Matrix

An Investigation of Combinations of Multivariate Shewhart and MEWMA Control Charts for Monitoring the Mean Vector and Covariance Matrix Technical Report Number 08-1 Department of Statistics Virginia Polytechnic Institute and State University, Blacksburg, Virginia January, 008 An Investigation of Combinations of Multivariate Shewhart and

More information

System Monitoring with Real-Time Contrasts

System Monitoring with Real-Time Contrasts System Monitoring with Real- Contrasts HOUTAO DENG Intuit, Mountain View, CA 94043, USA GEORGE RUNGER Arizona State University, Tempe, AZ 85287, USA EUGENE TUV Intel Corporation, Chandler, AZ 85226, USA

More information

Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data

Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data Journal of Modern Applied Statistical Methods Volume 4 Issue Article 8 --5 Testing Goodness Of Fit Of The Geometric Distribution: An Application To Human Fecundability Data Sudhir R. Paul University of

More information

Statistical Data Analysis Stat 3: p-values, parameter estimation

Statistical Data Analysis Stat 3: p-values, parameter estimation Statistical Data Analysis Stat 3: p-values, parameter estimation London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515 Glen Cowan Physics Department Royal Holloway,

More information

Generalized Linear Models

Generalized Linear Models York SPIDA John Fox Notes Generalized Linear Models Copyright 2010 by John Fox Generalized Linear Models 1 1. Topics I The structure of generalized linear models I Poisson and other generalized linear

More information

Sample size determination for logistic regression: A simulation study

Sample size determination for logistic regression: A simulation study Sample size determination for logistic regression: A simulation study Stephen Bush School of Mathematical Sciences, University of Technology Sydney, PO Box 123 Broadway NSW 2007, Australia Abstract This

More information

STATISTICS ( CODE NO. 08 ) PAPER I PART - I

STATISTICS ( CODE NO. 08 ) PAPER I PART - I STATISTICS ( CODE NO. 08 ) PAPER I PART - I 1. Descriptive Statistics Types of data - Concepts of a Statistical population and sample from a population ; qualitative and quantitative data ; nominal and

More information

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units Sahar Z Zangeneh Robert W. Keener Roderick J.A. Little Abstract In Probability proportional

More information

Gaussian Process Regression Model in Spatial Logistic Regression

Gaussian Process Regression Model in Spatial Logistic Regression Journal of Physics: Conference Series PAPER OPEN ACCESS Gaussian Process Regression Model in Spatial Logistic Regression To cite this article: A Sofro and A Oktaviarina 018 J. Phys.: Conf. Ser. 947 01005

More information

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Fundamentals to Biostatistics Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Statistics collection, analysis, interpretation of data development of new

More information

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3

Prerequisite: STATS 7 or STATS 8 or AP90 or (STATS 120A and STATS 120B and STATS 120C). AP90 with a minimum score of 3 University of California, Irvine 2017-2018 1 Statistics (STATS) Courses STATS 5. Seminar in Data Science. 1 Unit. An introduction to the field of Data Science; intended for entering freshman and transfers.

More information

MODELING COUNT DATA Joseph M. Hilbe

MODELING COUNT DATA Joseph M. Hilbe MODELING COUNT DATA Joseph M. Hilbe Arizona State University Count models are a subset of discrete response regression models. Count data are distributed as non-negative integers, are intrinsically heteroskedastic,

More information

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects School of Industrial and Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive NW, Atlanta, GA 30332-0205,

More information

Negative Multinomial Model and Cancer. Incidence

Negative Multinomial Model and Cancer. Incidence Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence S. Lahiri & Sunil K. Dhar Department of Mathematical Sciences, CAMS New Jersey Institute of Technology, Newar,

More information

A new multivariate CUSUM chart using principal components with a revision of Crosier's chart

A new multivariate CUSUM chart using principal components with a revision of Crosier's chart Title A new multivariate CUSUM chart using principal components with a revision of Crosier's chart Author(s) Chen, J; YANG, H; Yao, JJ Citation Communications in Statistics: Simulation and Computation,

More information

Unsupervised Learning with Permuted Data

Unsupervised Learning with Permuted Data Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University

More information

MYT decomposition and its invariant attribute

MYT decomposition and its invariant attribute Abstract Research Journal of Mathematical and Statistical Sciences ISSN 30-6047 Vol. 5(), 14-, February (017) MYT decomposition and its invariant attribute Adepou Aibola Akeem* and Ishaq Olawoyin Olatuni

More information

Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence

Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence Generalized Linear Model under the Extended Negative Multinomial Model and Cancer Incidence Sunil Kumar Dhar Center for Applied Mathematics and Statistics, Department of Mathematical Sciences, New Jersey

More information

A process capability index for discrete processes

A process capability index for discrete processes Journal of Statistical Computation and Simulation Vol. 75, No. 3, March 2005, 175 187 A process capability index for discrete processes MICHAEL PERAKIS and EVDOKIA XEKALAKI* Department of Statistics, Athens

More information

A Bayesian Mixture Model with Application to Typhoon Rainfall Predictions in Taipei, Taiwan 1

A Bayesian Mixture Model with Application to Typhoon Rainfall Predictions in Taipei, Taiwan 1 Int. J. Contemp. Math. Sci., Vol. 2, 2007, no. 13, 639-648 A Bayesian Mixture Model with Application to Typhoon Rainfall Predictions in Taipei, Taiwan 1 Tsai-Hung Fan Graduate Institute of Statistics National

More information

Directional Control Schemes for Multivariate Categorical Processes

Directional Control Schemes for Multivariate Categorical Processes Directional Control Schemes for Multivariate Categorical Processes Nankai University Email: chlzou@yahoo.com.cn Homepage: math.nankai.edu.cn/ chlzou (Joint work with Mr. Jian Li and Prof. Fugee Tsung)

More information

Efficient Control Chart Calibration by Simulated Stochastic Approximation

Efficient Control Chart Calibration by Simulated Stochastic Approximation Efficient Control Chart Calibration by Simulated Stochastic Approximation 1/23 Efficient Control Chart Calibration by Simulated Stochastic Approximation GIOVANNA CAPIZZI and GUIDO MASAROTTO Department

More information

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling

Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Bayesian SAE using Complex Survey Data Lecture 4A: Hierarchical Spatial Bayes Modeling Jon Wakefield Departments of Statistics and Biostatistics University of Washington 1 / 37 Lecture Content Motivation

More information

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES

MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES REVSTAT Statistical Journal Volume 13, Number 3, November 2015, 233 243 MARGINAL HOMOGENEITY MODEL FOR ORDERED CATEGORIES WITH OPEN ENDS IN SQUARE CONTINGENCY TABLES Authors: Serpil Aktas Department of

More information

CONTROL charts are widely used in production processes

CONTROL charts are widely used in production processes 214 IEEE TRANSACTIONS ON SEMICONDUCTOR MANUFACTURING, VOL. 12, NO. 2, MAY 1999 Control Charts for Random and Fixed Components of Variation in the Case of Fixed Wafer Locations and Measurement Positions

More information

A Note on Bayesian Inference After Multiple Imputation

A Note on Bayesian Inference After Multiple Imputation A Note on Bayesian Inference After Multiple Imputation Xiang Zhou and Jerome P. Reiter Abstract This article is aimed at practitioners who plan to use Bayesian inference on multiplyimputed datasets in

More information

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis

Lecture 3. G. Cowan. Lecture 3 page 1. Lectures on Statistical Data Analysis Lecture 3 1 Probability (90 min.) Definition, Bayes theorem, probability densities and their properties, catalogue of pdfs, Monte Carlo 2 Statistical tests (90 min.) general concepts, test statistics,

More information