Monitoring Multivariate Data via KNN Learning

Size: px
Start display at page:

Download "Monitoring Multivariate Data via KNN Learning"

Transcription

1 Monitoring Multivariate Data via KNN Learning Chi Zhang 1, Yajun Mei 2, Fugee Tsung 1 1 Department of Industrial Engineering and Logistics Management, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong 2 H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology Abstract Process monitoring of multivariate quality attributes is important in many industrial applications, in which rich historical data are often available thanks to modern sensing technologies. While multivariate statistical process control (SPC) has been receiving increasing attention, existing methods are often inadequate as they either cannot deliver satisfactory detection performance or cannot cope with massive amounts of complex data. In this paper, we propose a novel k-nearest neighbors empirical cumulative sum (KNN-ECUSUM) control chart for monitoring multivariate data by utilizing historical data under in-control and out-of-control scenarios. Our proposed method utilizes the k-nearest neighbors (KNN) algorithm for dimension reduction to transform multi-attribute data into univariate data, and then applies the CUSUM procedure to monitor the change in the empirical distributions of the transformed univariate data. Simulation studies and a real industrial example based on a disk monitoring system demonstrates the effectiveness of our proposed method. Keywords: Category Variable, CUSUM, Empirical Probability Mass Function, KNN, Statistical Process Control 1

2 1 Introduction Due to the rapid development of sensing technologies, modern industrial processes are generating large amounts of real-time measurements taken by many different kinds of sensors to reflect system status. For instance, in chemical factories, the boiler temperature, air pressure and chemical action time are all recorded in real time during the boiler heating process. For the purpose of quality control, it is important to take advantage of these real-time measurements and develop efficient statistical methods that can detect undesired events as quickly as possible before the catastrophic failure of the system. The early detection of undesired events has been investigated in the subject of the subfield of statistical process control (SPC), and the corresponding statistical methods are often referred as control charts. In general, a control chart computes a real-valued monitoring statistic at each time step, and raises an alarm whenever this monitoring statistic exceeds a pre-specified control limit. In the SPC framework, the performance of a control chart is often measured by two kinds of average run lengthes (ARLs): one is in-control (IC) ARL, and the other is out-of-control (OC) ARL. Here the IC and OC ARLs are defined as the average amount of time/number of observations from the start of monitoring to the first alarm, respectively, when the system is IC or OC. For a given IC ARL, a control chart with a smaller OC ARL will be able to monitor processes more efficiently. In the context of monitoring multivariate data, many control charts have been proposed in the literature, including the multivariate cumulative sum (MCUSUM, Woodall & Ncube, 1985; Crosier, 1988), the multivariate exponentially weighted moving average (MEWMA, Lowry et al. 1992), the regression-adjusted control chart (Hawkins, 1991) and charts based on variable selection (Zou & Qiu, 2009; Wang & Jiang, 2009). We recommend Lowry and Montgomery (1995), Lu et al. (1998) and Woodall and Montgomery (2014) for excellent literature reviews. Most of the aforementioned multivariate control charts make parametric assumptions, e.g., the data are multivariate normally distributed. However, for the observed multivariate data in modern processes, the multivariate normal assumption is often violated, and it is nontrivial, if not impossible, to model their true parametric distributions or to infer the model parameters. Recently, some nonparametric or model-free control charts have been proposed in the literature, see Sun & Tsung (2003); Hwang et al. (2007); Camci et al. (2008); Qiu (2008); Sukchotrat et al. (2010); Zou and Tsung (2011); 2

3 Ning & Tsung (2012); Zou et al. (2012); and the references therein. Also see Chakraborti et al. (2001) for more reviews. However, these existing nonparametric or model-free charts often require that the data distribution is unimodal, and thus they might be inadequate for monitoring modern sensing systems when the unimodal or other assumptions are violated. Thus, more comprehensive studies on nonparametric or model-free charts are called for to cope with modern processes. A new feature of modern processes is that rich historical data are often available thanks to modern sensing systems. The historical data often include both in-control (IC) and outof-control (OC) data. A concrete motivating example is the application of the hard disk drive monitoring system (HDDMS). The HDDMS is a computerized system that records various attributes related to disk failure and provides early warnings of disk degradation. The HDDMS collects data on 23,010 IC hard disks and 270 OC hard disks. For each disk, 14 attributes are recorded once every hour for one week. In this way, large amount of historical IC and OC data are provided, and the challenge is how to make use of these rich historical IC and OC data. While the historical IC data have been extensively used by the traditional SPC methods, unfortunately the OC data are often overlooked, although they could give additional insights to develop efficient control charts in some scenarios (Zhang et al. 2015). In this paper, we develop a nonparametric or model-free control chart for monitoring multivariate data with the focus of efficiently detecting OC patterns that are similar to those in the historical OC dataset. Our method applies the k-nearest neighbors (KNN) algorithm to convert the multivariate data into a one-dimensional category variable, and a Phase II CUSUM control chart is constructed based on the estimated empirical probability mass function (p.m.f) of the one-dimensional category variable under both IC and OC states. Our proposed monitoring scheme has the following advantages: (i) it is data-driven and can be easily adapted to any multivariate distribution; (ii) it makes full use of historical IC and OC data and holds desirable performance; (iii) it is statistically appealing since it is based on the empirical p.m.f of the derived one-dimensional category variable; (iv) it is easy to implement and compute. The remainder of this paper is organized as follows. The problem of monitoring multivariate data is stated in Section 2, and our proposed KNN-based control chart is presented 3

4 in Section 3. Extensive simulation studies are reported in Section 4, and the case study involving the HDDMS data is presented in Section 5. Several concluding remarks are made in Section 6, and some of the technical or mathematical details are provided in the appendix. 2 Problem Description and Literature Review Following the standard SPC literature, we monitor a sequence of independent p-dimensional random vectors {x 1, x 2,..., x i,...} over time i = 1, 2,, from some process. Initially the process is in the IC state, and the cumulative distribution function (c.d.f.) of the x i is F (x; θ 0 ) for some IC parameter θ 0. At some unknown time point τ, the process is OC in the sense that the x i s have another c.d.f. F (x; θ 1 ) for some OC parameter θ 1. In other words, the multivariate data x i can be described by the following change-point model: x i { F (x; θ0 ) for i = 1,..., τ F (x; θ 1 ) for i = τ + 1,..., (1) where the change time τ is an unknown integer-valued parameter. When monitoring multivariate data process x i s, our goal is to detect the change time τ as soon as possible after the change occurs. This can be formulated as testing the simple null hypothesis H 0 : τ = (i.e., no change) against the composite alternative hypothesis H 1 : τ = 1, 2, (i.e., a change occurs at some finite time), but with the twist that we must test these hypotheses at each and every time step n until we feel we have enough evidence to reject the null hypothesis and declare that a change has occurred. A statistical procedure is defined as a control chart that raises an alarm at time N based on the first N observations, and the expectation, E(T ) is often referred as the IC or OC average run length (ARL), depending on whether the data is in the IC or OC state. One would like to find a control chart that minimizes the OC ARL, E oc (T ), subject to the following false alarm constraint under the IC state: E ic (T ) γ. (2) where γ > 0 is a pre-specified constant. In other words, the control chart is required to process at least γ observations on average before raising a false alarm when all observations are IC. 4

5 This problem is well studied under the parametric model when the distributions of x i under the IC and OC states are fully specified, and one classical method is the CUSUM procedure developed by Page (1954). To define the CUSUM procedure, denote by f ic ( ) and f oc ( ) the probability density functions (p.d.f.s) or p.m.f.s of the x i under IC and OC scenarios, respectively. Then the CUSUM statistic at time t can be defined recursively as for t = 1, 2, and W 0 W t = max{w t 1 + log f oc(x t ) f ic (x t ), 0} = 0. The CUSUM procedure will raise an alarm at the first time t whenever the CUSUM statistic W t L, where the pre-specified threshold L > 0 is chosen to satisfy the IC ARL constraint in (??). It is well known that the CUSUM statistic W t is actually the generalized likelihood ratio statistic of all observations up to time t, {x 1,, x t }, and the CUSUM procedure enjoys certain exactly optimal properties (Moustakides, 1986). Unfortunately the CUSUM procedure requires complete information of the IC and OC p.d.f.s, f ic ( ) and f oc ( ), and thus it may not be feasible in practice when it is non-trivial to model the distributions of multivariate data. In this article, we do not make any parametric assumptions on the c.d.f.s or p.d.f.s of the multivariate data x i under the IC or OC states. Instead we assume that two large historical datasets are available: one is an IC dataset, denoted by X IC = {x ic1, x ic2,...}, and the other is an OC dataset denoted by X OC = {x oc1, x oc2,...}. When monitoring the new observations x i, we are interested in detecting those OC patterns that are similar to the historical OC patterns. Our task is to utilize these historical IC and OC datasets to build an efficient control chart that can effectively monitor the possible changes in the new data x i, which can be regarded as the test data. As mentioned in the introduction, nonparametric control charts have been developed in the SPC literature (see Qiu, 2008; Zou et al., 2012), though they often only utilize the training IC dataset, and ignore the historical OC data. For the purpose of illustration and comparison, below we present the multivariate sign exponentially weighted moving average (MSEWMA) control chart proposed by Zou and Tsung (2011), which will be compared with our proposed method in our simulation studies in later sections. The MSEWMA control chart of Zou and Tsung (2011) is based on the sign test. The observed p-dimensional 5

6 multivariate vector x i is first unitized by the transformation and then the monitoring statistics are defined by v i = A 0(x i θ 0 ) A 0 (x i θ 0 ), (3) ω i = (1 λ)ω i 1 + λv i and Q i = 2 λ λ pω iω i. (4) In the MSEWMA control chart of Zou and Tsung (2011), an alarm is raised whenever Q i exceeds some prespecified threshold. The two parameters {θ 0, A 0 } in v i are estimated from the IC samples, and denotes the Euclidean norm. In general, the transformed unit vector v i in the MSEWMA control chart is uniformly distributed on the unit p-dimensional sphere when the observations are IC, and thus the statistic Q i is just the T 2 statistic of the Hotelling multivariate control chart applied to the p-dimensional univariate vectors v i. Clearly, the MSEWMA control chart of Zou and Tsung (2011), or in fact many other nonparametric methods, only utilize the training dataset from the IC scenario, and raise an alarm even if the observed data do not follow the historical IC patterns. By doing so, the benefit is to be able to detect any general OC patterns, but the disadvantage is to lose the statistical efficiency when one is interested in detecting specifical OC patterns that are similar to the historical ones. Our proposed nonparametric monitoring method utilizes both IC and OC data from the training datasets, and focuses on efficiently detecting OC patterns that are similar to those in the historical OC dataset. 3 The Proposed KNN-ECUSUM Control Chart Let us first provide a high-level description of our proposed k-nearest neighbors empirical cumulative sum (KNN-ECUSUM) control chart which is designed for monitoring p- dimensional observations x i based on the historical/training IC and OC datasets X IC and X OC. Our proposed KNN-ECUSUM control chart consists of the following three steps: Step 1: Use the KNN method to transform p-dimensional data x to one-dimensional category data z(x) = Z(x; X knn IC, X knn OC ), which is the proportion of the k nearest neighbors of x that are IC. Here X knn IC KNN classifier. and X knn OC are the subsets of training data for constructing the 6

7 Step 2: Estimate the empirical IC and OC probability mass functions (p.m.f.) of the one-dimensional category data z(x) by using the subsets of training IC and OC data X IC, X OC, which should be disjoint with the subsets X knn IC and X knn OC in Step 1. Step 3: Apply the classical CUSUM procedure to monitor the one-dimensional data z(x i ), where z(x i ) is computed for each new pieces of online data x i as in step 1, and the IC and OC p.m.f. of z(x i ) are estimated in Step 2. Below we will describe each component of our proposed KNN-ECUSUM control chart in detail and also provide some practical guidelines. The remainder of this section is organized as follows. Section 3.1 presents the construction of the transformation z(x) through the KNN method, and Section 3.2 discusses the estimation of the empirical p.m.f. of the transformed variable. In Section 3.3, the procedure for building the CUSUM chart for the one-dimensional data z(x i ), is elaborated. Section 3.4 gives a summary and practical guidance for our proposed control chart. 3.1 Construction of one-dimensional data z(x) In this subsection, we discuss the transformation of p-dimensional data x to one-dimensional data z(x) by applying the KNN algorithm to the IC and OC training data. For this purpose, we randomly select some training data, X knn IC and X knn OC, from the provided IC and OC training sets {X IC, X OC }. As a simple classification method, the KNN method is then applied to the datasets X knn IC and X knn OC to predict the label of any p-dimensional data x according to its nearest k neighbors. Here we extend the binary classification output of the KNN method to the k-category probability output. as To be more specific, the distance between two p-dimensional vectors {x i, x j } is defined dis(x i, x j ) = x i x j 2 where 2 denotes the Mahalanobis distance. Next, for any given p-dimensional vector x, let V (x, k) denote the k nearest points of x in the training datasets X knn IC and X knn OC under the above mentioned distance criterion. Note that the k points in the set V (x, k) may contain both IC and OC training samples in general. Then the transformed variable 7

8 z(x) is defined as the probability output of the KNN algorithm: z(x) = # of points in the intersection V (x, k) Xknn IC, (5) k i.e., z(x) is the proportion of IC samples in the k neighbors of x. Since the value of z(x) can only be i for an integer i = 0, 1, 2,, k, the transformed variable z(x) is a onedimensional category variable with k + 1 possible values. Hence the problem of k monitoring multivariate observations x i becomes a problem of monitoring one-dimensional category variables z(x i ) instead. Traditional multivariate control charts also convert multi-dimensional information into one-dimensional statistics that are often based on the probability distributions or the likelihood ratio test statistics of the multivariate data. However, from the statistical viewpoint, while the distribution of the aw p-dimensional observation x can also be estimated directly, e.g., by empirical likelihood (Owen 2001) and kernel density estimation (Terrell & Scott, 1992), such estimation generally becomes inefficient when the dimension p is medium or high. In our proposed method, we essentially use the KNN algorithm to conduct dimension reduction, so that the problem of monitoring the p-dimensional variables x i s is reduced to a problem of monitoring the one-dimensional variables z(x i ) s. 3.2 Estimation of the empirical probability mass function After using the KNN algorithm to convert the p-dimensional variable x to a one-dimensional category variable z(x), we face the problem of monitoring one-dimensional category data. Existing methods for monitoring category data are available from Marcucci (1985), Woodall (1997) and Montgomery (2009). However, these methods require counting the number of data points in each category, and are not suitable for our context where the category variable z(x) is calculated from a single observation x. Here our proposed method applies the classical CUSUM procedure to the one-dimensional category variable z(x), and for that purpose, we need to estimate the p.m.f. of z(x) under the IC and OC scenarios. Let us first review how the empirical p.m.f. of a category variable Y is estimated. Assume the category variable Y can take k + 1 possible values, say, t 0 t 1... t k, and assume that we observe G i.i.d. random samples y 1, y 2,..., y G of this variable. The 8

9 empirical p.m.f. of Y is ˆf(t j ) = 1 G n i=1 I(y i = t j ) = # of y i in j-th category ; j = 0, 1,..., k (6) G where I( ) denotes the indicator function. Next, let us illustrate how this empirical p.m.f. can be adapted to our context by applying the boostrapping or random sampling method to the IC and OC data. Taking the IC data as an example. Each time a subset of m IC observations are randomly (and possibly repeatedly) sampled from X IC (we exclude the observations in X knn IC that are used in step 1 of constructing the KNN classifier). After this subset of m pieces of IC data are input into the KNN classifier, we collect m transformed outputs z(x)s. This resampling and transformation procedure is repeated B times, and thus we will obtain a total of Bm category observations z(x)s. As in equation (??), the empirical p.m.f. of z(x) under the IC scenario is simply the proportion of Bm observations that fall under each category, and can be calculated as ˆf ic (z) = { # of z(x) s =z Bm 0 otherwise if z = j, j = 0, 1,..., k k (7) since z(x) equals only j, j = 0, 1,..., k. Also note that the empirical IC p.m.f. satisfies k k ˆf i=0 ic ( i ) = 1, which is the fundamental requirement for a mass function. Similarly, we k can derive the empirical OC p.m.f., ˆfoc (z), of the category variable z(x) under the OC scenario by applying the above procedure to the OC dataset X OC. Some issues on IC or OC samples for calculating the empirical densities will be discussed in more detail later in Subsection Monitoring the transformed variables z(x i ) After data transformation and p.m.f. estimation, our proposed control chart is to apply the classical CUSUM procedure to the transformed variable with the estimated IC and OC p.m.f.s. Specifically, when we monitor p-dimensional data x i in real time, we first use the KNN method in Step 1 to obtain the transformed variable z(x i ). In Step 2, when x i from either the IC or OC data distribution become available, the corresponding p.m.f. of the transformed variable z(x i ) can be estimated as ˆf ic (z) or ˆf oc (z). Thus the 9

10 problem of monitoring p-dimensional data x i can be simplified into one of monitoring the one-dimensional variables z(x i ) with a possible p.m.f. change from ˆf ic (z) to ˆf oc (z). Our proposed KNN-ECUSUM control chart simply involves applying the classical CUSUM procedure to the problem of monitoring the one-dimensional variable z(x i ) with the estimated IC and OC p.m.f.s, ˆfic (z) and ˆf oc (z). statistic as Our method defines the CUSUM W n = max(w n 1 + ˆf oc (z(x i )), 0) (8) ˆf ic (z(x i )) for n 1 with the initial value W 0 = 0. Our proposed KNN-ECUSUM control chart raises an alarm whenever W n exceeds a prespecified control limit L > 0. Recall that the CUSUM procedure is exactly optimal when the p.d.f.s/p.m.f.s under the IC and OC scenarios are completely specified. In our context, we utilize the historical IC and OC datasets to estimate the p.m.f. of the transformed variable. When the OC patterns are similar to those in the historical data, the empirical p.m.f. will be similar to the true p.m.f., and thus our proposed control chart would perform similarly to the optimal CUSUM procedure with the true density functions. 3.4 Summary and design issues of our KNN-ECUSUM method Our proposed KNN-ECUSUM control chart is summarized in Table 1. The novelty of our proposed control chart lies in combining four well established methods together: the KNN algorithm, the empirical probability mass function, bootstrapping, and CUSUM. The basic idea is to use the KNN algorithm as a dimension reduction tool. Rather than directly estimating the distributions of the raw p-dimensional variables, we monitor the transformed one-dimensional category variables, whose distribution can be easily estimated from their empirical densities. Our proposed KNN-ECUSUM control chart involves several tuning parameters, which must be chosen carefully when applying our control chart in practice. Below we will discuss how to choose these tuning parameters appropriately. 10

11 Table 1: Procedures for implementing our proposed KNN-ECUSUM control chart Algorithm 1: Procedure for Implementing the KNN-ECUSUM Chart Step 1 Step 2 Goal: To Build the KNN model for data transformation. Input: Training Samples X knn IC a given testing sample x. and X knn OC (sampled from X IC and X OC ); Output: The category variable z(x) = Z(x; X knn IC, X knn OC ). Goal: To calculate the empirical density of the transformed variable. Input: Samples X IC and X OC. Model: The KNN model z(x) built in the last step. Iterate: For the IC scenario, t = 1, 2,... B times. 1. Randomly select m IC samples from X IC where X knn IC are excluded. 2. The selected IC samples serve as input to z(x i ); collect the model outputs and determine the size of each category. Output: For the IC scenario, summarize the empirical densities after boostrapping; the densities are calculated from ˆf ic ( i k ) = #(testing result = i k ) Bm. Repeat: Using the same model, repeat the same iterative procedure for the OC scenario and estimate ˆf oc ( i k ). Step 3 Goal: To monitor online observations. Input: Online samples {x 1, x 2,..., }; control limit L. Charting Statistic: W n = max(w n 1 + ˆf oc (z(x i )) ˆf ic (z(x i )), 0), W 0 = 0. Output: If W n > L, the control chart raises an alarm; otherwise, the system is considered to be operating normally. 11

12 3.4.1 Selecting k in the KNN method In building the KNN model, the number k of neighbors is a very important parameter to set. Generally speaking, a small k may overfit the KNN training dataset and decrease the prediction accuracy. For the category variable z(x), a small k will also lead to imprecise output. On the other hand, a large k increases the computational load and may not necessarily bring a significant increase in the classification rate for prediction. Often people use cross-validation to achieve an appropriate k. Following the empirical rule of thumb, k is often chosen to be of order n knn, where n knn is the number of observations in the training IC and OC datasets reserved for the KNN method. This is used as the starting point in some KNN packages. Here we suggest choosing k = α n, where the value of α depends on practical conditions such as the differences between the IC and OC samples. Our experience from real-data studies suggests that α [0.5, 3] is reasonable Selecting the control limit When constructing the control chart, it is crucial to find an accurate control limit L for the monitoring statistics to satisfy the IC ARL constraint γ in (??). In our proposed KNN- ECUSUM control chart, the control limit L for the CUSUM statistics W n in (??) is related to both the IC and OC training datasets, and we propose a resampling method to find its value. For a given control limit L, we randomly select an observation from the IC training dataset X IC, and the CUSUM statistic W 1 is calculated as in (??). If the CUSUM statistic W 1 < L, another observation will be randomly selected from the IC training dataset X IC and the CUSUM statistic W 2 will be calculated. This procedure is repeated until the CUSUM statistic W T L after taking T observations from the IC training dataset. This completes one Monte Carlo simulation, and the number T of IC observations is recorded and is often called one realization of the run length. Then, we repeat the above steps n 0 times and obtain n 0 realizations of the run length, denoted by T 1,, T n0. The ARL of our proposed KNN-ECUSUM control chart with the control limit L is then estimated as ARL(L) ˆ = (T T n0 )/n 0. The control limit L is then adjusted through bisection search so that the obtained ARL(L) ˆ is close to the preset ARL constraint γ. In general, finding the control limit L is straightforward conceptually, 12

13 but unfortunately it can be very computationally demanding Sample size considerations The sample size is another factor that can affect the performance of our proposed KNN- CUSUM control chart. In the following, guidelines on choosing an appropriate sample size are provided. In Step 1 of our proposed control chart, it is crucial to have enough training samples for the KNN algorithm, since the quality of the transformed category variable z(x) can affect how our proposed method performs. However, limited by the computational capacity and the need to reserve sufficient training data for calculating empirical densities, it is also inappropriate for the KNN method to use all training data. In our studies below, when the size of the training dataset is N, the amount of training data for the KNN method in Step 1 is chosen as n knn O( N) for a large N or n knn [0.25, 0.75]N for a small N. In particular, for the real-data example in Section 5, we choose n knn 0.4N oc. In Step 2 of our proposed control chart, more samples will lead to more precise estimates of the p.m.f. of the transformed category variable. We suggest making use of much, or possibly all, of the training data especially when the training IC or OC dataset is not large enough. For a large training dataset, containing millions of observations, it might be acceptable to use a subset to estimate the p.m.f. It is important to note that the training data for building the KNN classifier in Step 1 should not overlap with those for estimating the empirical p.m.f. In the case of a small training dataset, repeated use of the training data might be acceptable, although it is generally not recommended. To better understand the effects of sample size, a simple simulation study is performed in later sections to see how the estimated densities approximate the true ones When multiple OC patterns exist So far we have only considered the binary classification when constructing the KNN classifier in Step 1, as we have assumed that all OC data exhibit a single OC pattern. However, sometimes multiple OC patterns may occur, and the historical OC data may contain more than one OC cluster. In this case, it might be inappropriate to keep the OC data in one 13

14 cluster, and more efforts are needed to investigate different OC clusters. It turns out that our proposed KNN-ECUSUM control chart can be easily extended to handle the case of multiple OC clusters. For that purpose, we need to divide the historical or training OC data into different clusters. This clustering procedure might be available during the Phase I analysis of data, or can be done by standard unsupervised learning methods such as the K-Means method, principal component analysis (PCA) and so on. From on our experience, while our proposed method can be extended to any number C of OC clusters, a smaller number of clusters will generally lead to better performance in monitoring changes, and thus a small number C 4 of OC clusters is suggested. Now we describe the extension of our proposed control chart. Denote the IC data and the C OC clusters from training data by X ic, X 1 oc..., X C oc. As in Step 1 in Table 1, we also build a KNN classifier but now it is a multi-class KNN classifier based on the subsets of these C + 1 training datasets. Next, we calculate the empirical p.m.f.s, ˆf ic ( ), ˆf oc( ), 1..., ˆf oc( ), C using boostrapping in Step 2. Notice that there are C +1 probability densities to be calculated, rather than two densities in the case of a single OC cluster. In Step 3, we construct the CUSUM statistic for each OC cluster and obtain C CUSUM statistics, where W 1,n = max(w 1,n 1 + ˆf 1 oc (d(x i)) ˆf ic (d(x i )), 0)),..., W C,n = max(w C,n 1 + ˆf C oc(d(x i )) ˆf ic (d(x i )), 0)). Then we define the monitoring statistics based on the maximum of all the C CUSUM statistics, W n = max(w 1,n, W 2,n,..., W C,n ), and raise an alarm when the maximum W n exceeds the control limit L. This control chart is intuitive as it is equivalent to raising an alarm if at least one CUSUM-type statistic W j,n exceeds the control limit L. 4 Simulation Studies In this section, we conduct extensive numerical simulation studies to demonstrate the usefulness of our proposed KNN-ECUSUM control chart. For the purpose of comparison, we also report the results of three alternative control charts. The two main competitors are the nonparametric MSEWMA chart in (??) proposed by Zou & Tsung (2011), and the traditional parametric MEWMA method (T 2 type). In addition, the performance of the classical parametric CUSUM procedure is also reported, since it is exactly optimal and provides the best possible bound when the distribution is known and correctly specified. 14

15 In Section 4.1, our KNN-ECUSUM method is compared with competitors for the case when there is only one cluster in the historical OC data. It is extended to the case of multiple clusters in Section 4.2. Section 4.3 contains sample size analysis, and we investigate how much training data are needed to estimate the empirical densities. Throughout the simulation, the IC ARL is fixed at 600. The MATLAB code for implementing the proposed method is available from the authors upon request. 4.1 When only one cluster exists in historical OC data In this subsection we will investigate the case when only one cluster exists in the historical OC data. Here we focus on detecting a change in the mean, and consider three different generative models of the p-dimensional multivariate data x i in order to evaluate the robust properties of control charts. The three different generative models are: the multivariate normal distribution N(0 p 1, Σ p p ), where p is the dimension; the multivariate t distribution t ζ (0 p 1, Σ p p ) with ζ degrees of freedom; the multivariate mixed distribution, rn(µ 1, Σ p p )+(1 r)n(µ 2, Σ p p ), i.e., a mixture of two multivariate normal distributions. In our simulation, we set r = 0.5, and the p 1 mean vector µ 1 (µ 2 ) is 3( 3) in the first component but is 0 for all other components. In our simulation, for all three generative models, the covariance matrices Σ p p of the p-dimensional variable x i are same: the diagonal elements of Σ p p are equal to 1, and the off-diagonal elements are set to 0.5. For each IC generative model, we consider the corresponding OC generative model that has a mean shift δ in the first component of the observations unless stated otherwise. We will consider different magnitudes of mean shift δ ranging from 0.2 to 2 with the step size 0.2. In addition, we will consider two choices of the dimension p: p = 6 and p = 20. The tuning parameter λ in the MEWMA method (T 2 type) and the MSEWMA chart of Zou & Tsung (2011) is set λ = 0.2 for simplicity. For the classic CUSUM procedure, the OC shift magnitude is assumed known beforehand in our simulation. Thus, the classic CUSUM method delivers the lowest bound in OC detection. 15

16 We emphasize that our proposed control chart does not use any information pertaining to the generative IC or OC model, and will only use the data generated from the IC or OC model: the KNN algorithm in Step 1 is based on 1000 pieces of IC and OC training data with the number of nearest neighbors k = 30, and the p.m.f. estimation in Step 2 is based on 100, 000 pieces of IC and OC training data with B = 1000 loops for bootstrapping. To illustrate the rationale of our proposed control chart, Figures 1-3 plot the estimated p.m.f. of the transformed one-dimensional category data z(x i ) under the three generative models. In Figure 1, two discrete densities of the KNN classifier s output are shown. From the generative model of the multivariate normal distribution, the estimated density of IC samples ( ˆf ic ) increases as the KNN output increases (except in some extreme situations), whereas the density of the OC cluster ( ˆf oc ) gradually decreases. This is consistent with our intuition that the neighboring samples of an IC sample are likely from the IC distribution, and the close neighbours of a OC sample have high OC probability densities. In Figures 2 and 3, a similar conclusion can be drawn from the multivariate t or multivariate mixed data. The significant differences in the estimated empirical IC and OC densities in Figures 1-3 demonstrate that the transformed variable z(x i ) can be used to detect changes in the original multivariate data x i, and thus our proposed KNN-ECUSUM control chart is reasonable. Tables 2-4 summarize the simulated ARL performance of different control charts under three different generative models. Table 2 presents the case of detecting mean shifts for the multivariate normal distribution, and it is clear that our proposed KNN-ECUSUM chart performs slightly worse than the optimal CUSUM chart, yet it has a much smaller OC ARL as compared to the MEWMA or MSEWMA control charts for all different kind of mean shift magnitudes. From Table 2, the performance of the MSEWMA control chart is similar to that of the MEWMA chart, but our proposed method has a much better performance in terms of a smaller OC ARL, especially when the mean shift magnitude is small. Our method is effective because it uses the OC information, which is missed in MEWMA or MSEWMA control chart. Thus, it is not surprising that our method can detect true changes more quickly than the alternatives. Table 3 reports the simulation results for multivariate t distributed data. The superiority of our method is maintained, e.g., our proposed KNN-ECUSUM chart still works 16

17 Table 2: (OC) ARL performance comparison for multivariate normal data Dimension Shift Size KNN-ECUSUM MEWMA MSEWMA CUSUM p δ λ = 0.2 λ = (595) 600(601) 602(594) 599(597) (126) 318(314) 322(323) 122(120) (34.3) 101(95.3) 104(97.3) 35.5(31.7) (13.5) 36.7(30.8) 39.5(33.0) 15.1(11.6) (6.47) 18.0(12.5) 20.2(13.4) 8.67(5.72) (3.76) 11.0(6.48) 13.3(7.24) 5.87(3.30) (2.44) 7.84(3.83) 9.80(4.36) 4.43(2.15) (1.74) 6.08(2.63) 7.97(2.87) 3.58(1.60) (1.32) 4.97(2.00) 6.93(2.18) 3.00(1.21) (1.09) 4.25(1.56) 6.24(1.78) 2.62(0.98) (0.89) 3.69(2.66) 5.76(1.45) 2.31(0.80) 0 600(586) 599(606) 599(589) 601(589) (127) 410(408) 396(400) 118(114) (36.0) 171(167) 163(158) 33.1(29.6) (13.6) 62.9(57.5) 62.1(54.1) 14.2(11.0) (6.37) 28.6(22.2) 29.1(21,7) 7.91(5.18) (3.91) 15.9(10.3) 16.8(10.2) 5.41(2.98) (2.50) 10.6(5.58) 11.7(5.71) 4.10(1.96) (1.86) 7.93(3.54) 8.94(3.68) 3.31(1.44) (1.74) 6.26(2.47) 7.41(2.59) 2.77(1.08) (1.16) 5.25(1.94) 6.40(1.97) 2.41(0.88) (0.93) 4.57(1.57) 5.71(1.58) 2.17(0.76) NOTE: Standard deviations are in parentheses 17

18 Table 3: OC performance comparison for multivariate t data Dimension Shift Size KNN-ECUSUM MEWMA MSEWMA CUSUM p δ λ = 0.2 λ = (603) 600(606) 600(587) 600(603) (146) 578(568) 342(339) 135(132) (41.4) 479(480) 115(108) 41.3(36.7) (14.7) 348(350) 44.7(37.4) 17.7(13.7) (7.33) 228(228) 23.1(16.6) 10.0(6.63) (4.10) 133(128) 15.1(8.74) 6.89(3.75) (2.86) 73.3(65.4) 11.1(5.32) 5.31(2.54) (2.04) 40.0(32.3) 9.05(3.86) 4.43(1.90) (1.64) 23.7(16.0) 7.77(2.88) 3.85(1.48) (1.27) 15.5(8.80) 6.94(2.36) 3.42(1.22) (1.10) 11.6(5.55) 6.38(1.95) 3.19(1.04) 0 601(589) 599(603) 600(589) 599(597) (135) 595(602) 420(414) 129(126) (39.6) 564(578) 180(171) 35.5(32.7) (14.4) 492(479) 33.1(25.8) 8.48(5.60) (4.04) 430(348) 19.1(12.5) 5.91(3.29) (2.67) 346(348) 13.1(7.19) 4.54(2.25) (1.94) 282(281) 10.0(4.63) 3.70(1.69) (1.49) 211(205) 8.25(3.35) 3.21(1.38) (1.25) 149(144) 7.10(2.59) 2.89(1.17) (1.10) 99.9(91.0) 6.34(2.09) 2.63(1.01) NOTE: Standard deviations are in parentheses 18

19 Table 4: OC performance comparison for multivariate mixed data Dimension Shift Size KNN-ECUSUM MEWMA MSEWMA CUSUM p δ λ = 0.2 λ = (600) 600(597) 601(583) 600(594) (128) 598(598) 584(583) 124(120) (36.3) 609(615) 565(562) 35.3(32.2) (13.5) 591(573) 528(530) 15.0(11.6) (6.39) 600(593) 516(514) 8.52(5.50) (3.76) 597(605) 471(467) 5.85(3.28) (2.44) 595(591) 455(446) 4.42(2.14) (1.73) 596(604) 427(423) 3.58(1.57) (1.38) 599(591) 414(407) 3.04(1.20) (1.11) 603(593) 382(381) 2.65(0.95) (0.92) 605(605) 377(373) 2.37(0.76) 0 602(594) 599(602) 600(581) 600(590) (128) 609(605) 604(615) 122(122) (37.1) 591(596) 595(589) 33.5(29.5) (13.3) 612(617) 585(580) 14.0(11.2) (6.68) 588(582) 584(583) 7.97(5.09) (3.88) 593(586) 568(564) 5.41(3.00) (2.57) 608(607) 556(546) 4.06(1.92) (1.84) 593(596) 549(559) 3.32(1.41) (1.41) 608(600) 539(546) 2.80(1.10) (1.13) 594(601) 528(529) 2.46(0.86) (0.96) 598(605) 524(525) 2.23(0.70) NOTE: Standard deviations are in parentheses 19

20 very effectively. It is also interesting to compare the results in Tables 2 and 3: when the true distribution of the data changes from multivariate normal to multivariate t, the performance of the MEWMA method deteriorates significantly. This is not surprising since the MEWMA method relies heavily on the assumption of the multivariate normal distribution. Meanwhile, the performances of MSEWMA and our proposed KNN-ECUSUM method are very stable and are not sensitive to the data distribution. From Table 3, our KNN-ECUSUM method has a smaller detection delay than the nonparametric MSEWMA chart in (??) proposed by Zou & Tsung (2011). Another issue that should be pointed out is the effect of data dimension. As the data dimension increases, both the MEWMA and MSEWMA charts require a longer ARL to detect small shifts, while our KNN-ECUSUM method achieves a robust performance. Table 4 reports the simulation results for mixed data. It can be seen that both the MEWMA and MSEWMA charts fail to detect the shift under the mixed distribution. One reason for their ineffectiveness is that the mixed data are not elliptically distributed, which is a fundamental requirement of both MEWMA and MSEWMA methods. Meanwhile, our proposed KNN-ECUSUM method still performs stably, and it is able to give satisfactory results under the mixed data scenario. To summarize, when only one cluster exists in historical OC data, our KNN-ECUSUM chart presents a better detection capability than its two competitors. The chart performs robustly and is adaptive to various distributions. The simulation studies reveal a similar detection efficiency between our method and the optimal CUSUM chart. Therefore, our KNN-ECUSUM method could be considered appealing. 4.2 When multiple clusters exist in historical OC data In this subsection, we conduct additional simulation studies to illustrate the usefulness of our proposed method in the case of multiple OC clusters in the OC training data. In the simulation, we only include the multivariate normal data and t data for simplicity, and the corresponding dimension ranges from p = 6 to p = 40. The MEWMA and MSEWMA schemes are also used for comparison. The simulation settings under multiple OC clusters are similar to the ones under the one OC cluster. The mean and covariance matrix of the IC data are still 0 p 1 and Σ p p ), and for each OC cluster, the shift magnitude is 1 but the 20

21 Table 5: OC performance comparison for the case of multiple clusters Dimenson # of Clusters Historical OC Mean Multivariate Normal Multivariate t p C KNN-ECUSUM MEWMA MSEWMA KNN-ECUSUM MEWMA MSEWMA [1,0,0,0,0,0] 12.7(7.68) 11.9(6.94) 13.2(7.09) 13.6(7.93) 136(130) 15.0(8.90) 6 2 [0,1,0,0,0,0] 9.89(5.57) 12.0(6.97) 13.4(7.19) 12.2(7.25) 132(124) 15.3(9.10) [1,0,0,0,0,0] 12.6(7.31) 11.9(6.94) 13.2(7.09) 13.7(7.65) 136(130) 15.0(8.90) 6 2 [0,1.5,0,0,0,0] 4.43(2.23) 5.74(2.37) 7.47(2.55) 5.61(2.93) 30.5(22.8) 8.45(3.38) [1,0,0,0,0,0] 19.9(13.4) 11.9(6.94) 13.2(7.09) 17.6(10.9) 136(130) 15.0(8.90) 6 3 [0,1,0,0,0,0] 14.5(8.76) 12.0(6.97) 13.4(7.19) 17.7(10.6) 132(124) 15.3(9.10) [0,0,1,0,0,0] 16.4(10.5) 11.9(6.99) 13.5(7.32) 17.9(10.7) 133(127) 15.4(9.08) 1 in the 1st com 12.8(7.84) 15.9(10.3) 16.8(10.2) 10.7(6.30) 420(424) 19.1(12.3) in the 2nd com 11.6(6.83) 16.0(10.3) 17.4(10.7) 12.7(7.70) 419(413) 19.7(13.1) 1 in the 1st com 11.3(6.66) 15.9(10.3) 16.8(10.2) 11.3(6.67) 420(424) 19.1(12.3) in the 2nd com 4.88(2.66) 6.96(3.00) 8.17(3.11) 4.88(2.66) 248(243) 9.20(4.03) 1.5 in the 1st com 10.1(6.11) 8.79(4.07) 9.82(4.18) 10.1(6.11) 405(408) 11.1(5.42) in the 2nd com 8.57(4.84) 8.73(4.00) 10.2(4.42) 8.57(4.84) 408(409) 11.1(5.47) 1.5 in the 3rd com 7.68(4.19) 8.77(4.02) 9.97(4.26) 7.68(4.19) 407(411) 11.2(5.44) NOTE: Standard deviations are in parentheses shift occurs in various dimensions in different clusters. For example, when the number of OC clusters C = 2 and data dimension p = 2, the mean of two OC clusters are [1, 0] and [0, 1]. The estimated densities of the KNN classifier s output are similar to those in Figures 1-3, and thus are omitted here for simplicity. The simulation results are shown in Table 5. As suggested in Section 3.4, the number of clusters is chosen to be small, and we consider C = 2, 3 here. The results in Table 5 demonstrate the detection competence of our method under the multi-cluster scenario. Although degrading a little as C increases, the performance of our chart is still desirable. All charts deliver similar performance levels under the normal distribution. The advantages of our chart become significant under the t data scenario. Also, the performances of both the MEWMA and MSEWMA charts are still influenced by data dimension, while our method appear to be more robust. Combining the results in this and the last subsection, we conclude that our method is quite a powerful tool for monitoring multivariate data, and it is more suitable for practice use than the alternative methods. 21

22 4.3 Sample size analysis When implementing the control chart, a practical issue is the sample size. As mentioned before, the sample size can influence the choice of the number k of neighbors, and the accuracy of the estimated density. Although the number of required samples depends on multiple factors such as data dimension and the true data distribution, it is still necessary to investigate how large a sample size is generally appropriate and how the estimated densities changes as the sample size increases. In this section, simulation studies are performed to demonstrate the effect of the sample size. As the KNN classifier delivers a probability estimator that the test sample is of a particular status, let us first consider how to achieve the theoretical true probablity. Denoting the true density functions of the IC and OC samples by f ic and f oc, according to the conditional probability formulation, the true probability that sample x is of a particular status can be written as P(x) = P(x IC x IC or OC) = Then the density of the above probablity f( ) can be derived as f ic (x) f ic (x) + f oc (x). (9) f(t) = and we call f( ) the true probability density. dp(p(x) t) dt (10) In the simulation, both the IC and OC samples are assumed to be normally distributed (p = 6) with mean µ 0 = 0, µ 1 = [1, 1, 1, 0, 0, 0], and the covariance matrix equals the identity matrix. Following equations?? and??, the true density is obtained after simple calculations (provided in the appendix) as f(t) = 1 π exp( 1 6 (3 2 log t 1 1 t )2 )( 1 t + 1 ). (11) t As we want to see how the estimated densities approximate the true densities, various sample sizes are tested and the results are shown in Figures 4 and 5. In the figures, numtrain and num-test denote the sample size for training the KNN model and the number of samples for calculating the densities, respectively. We find that the estimated density 22

23 approximates the true one better with more training samples. Comparing the three plots in Figure 4, it is evident that fewer test samples may lead to an unstable density, and the density curve becomes smoother as the number of test samples increases. Combining the results from all the plots, we see that at least 1000 training samples and test data points are required when the distribution is normal. Another interesting finding from Figures 4 and 5 is the choice of k. Recall that we suggested k = α n before. This general suggestion is validated by the results in these two figures. We can see that when the training sample size is equal to 1000, the density curve with k = 50 is the closest one to the true curve. Also, as the sample size increases to 4000, the performance levels of the k = 50 and k = 100 curves are quite similar. Futhermore, when the training sample size is not large enough, it is inappropriate to choose a large k, such as k = 500. Thus, the selection of k relies heavily on the training sample size. In summary, we can see that the sample size plays an important role in reaching an accurate empirical density. Since our method focuses on monitoring the one-dimensional output, the accuracy of its empirical density is essential to the performance of our control chart. Therefore, the accuracy of the estimated density must be ensured when implementing our method. 5 A Real Data Application In this section, we will use a real data example to demonstrate the usefulness of our proposed method by comparing it with other existing methods. Here we use the traditional MEWMA method and nonparametric MSEWMA chart for comparisons. The ARL values are obtained from 5,000 replicated simulations. Let us revisit the application of HDDMS. The HDDMS system records various attributes and provides early warnings of disks failure. The provided datasets (the IC and OC datasets) consist of the 14 attributes observed from IC disks and 270 OC disks that were continuously monitored once every hour for one week. Since the values of some attributes are constant or change very little in some disks, we select their average values to denote the working status (we also tried the full dataset in our experiment, and the results are similar). After the necessary data preprocessing, we delete four useless attributes. 23

24 This leads to a final dataset with p = 10 attributes. To reveal the nature of unknown distribution of the collected data, we pick three attributes (the 2nd, 3rd and 12th) and 200 random samples from the IC dataset for demonstration. The 200 samples are plotted in Figures 6 (a)-(c), while their normal Q-Q plots are shown in Figures 6 (d)-(f). It can be seen that the attributes do not follow the normal distribution, or any other commonly used distributions. Thus, our method is suitable and can be used in this scenario. In the Phase I procedure, although the IC and OC datasets are provided separately, we still need to cluster the OC dataset to find out possible shifts. In this study, the traditional K-Means clustering algorithm is applied. We find two main shifts in the OC dataset. The 270 disks are divided into two groups, containing 185 and 85 disks. Since more than one OC pattern exists in the historical data, we apply the multi-cluster strategy and build the KNN-ECUSUM chart in Phase II monitoring. After clustering the historical OC data, the KNN classifier can be constructed easily following Step 1 in Table 1. The number of training samples in each cluster is selected to be 50, and the number of nearest neighbors k equals 15. In Step 2, the empirical densities can be computed quickly using the provided OC dataset when the KNN classifier is provided. The result of the estimated category variable density is plotted in Figure 7. The conclusion that the empirical density of the IC cluster gradually increases while that of the OC cluster decreases still holds. Particularly, for the second OC cluster, its empirical density values are all 0 when the KNN output is larger than 2, which shows that the shift size of this 15 cluster is quite large compared to that of the first OC cluster. In Step 3, the KNN-ECUSUM chart can be implemented for online monitoring. We fix the IC ARL to 1000, and the control limit of our chart is found to be We compare our chart with two alternatives through a steady-state simulation, where each time 30 IC samples are selected before the change point. The results of this performance comparison are shown in Table 6. outperforms the competitors significantly. We can see that in detecting the first OC cluster, our method For the second OC cluster, the performance levels of our method and the MEWMA chart are close, and they both work slightly better than the MSEWMA chart. Thus, the superiority of the proposed method is proved. 24

A new multivariate CUSUM chart using principal components with a revision of Crosier's chart

A new multivariate CUSUM chart using principal components with a revision of Crosier's chart Title A new multivariate CUSUM chart using principal components with a revision of Crosier's chart Author(s) Chen, J; YANG, H; Yao, JJ Citation Communications in Statistics: Simulation and Computation,

More information

Module B1: Multivariate Process Control

Module B1: Multivariate Process Control Module B1: Multivariate Process Control Prof. Fugee Tsung Hong Kong University of Science and Technology Quality Lab: http://qlab.ielm.ust.hk I. Multivariate Shewhart chart WHY MULTIVARIATE PROCESS CONTROL

More information

Monitoring Censored Lifetime Data with a Weighted-Likelihood Scheme

Monitoring Censored Lifetime Data with a Weighted-Likelihood Scheme Monitoring Censored Lifetime Data with a Weighted-Likelihood Scheme Chi Zhang, 1,2 Fugee Tsung, 2 Dongdong Xiang 1 1 School of Statistics, East China Normal University, Shanghai, China 2 Department of

More information

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location MARIEN A. GRAHAM Department of Statistics University of Pretoria South Africa marien.graham@up.ac.za S. CHAKRABORTI Department

More information

System Monitoring with Real-Time Contrasts

System Monitoring with Real-Time Contrasts System Monitoring with Real- Contrasts HOUTAO DENG Intuit, Mountain View, CA 94043, USA GEORGE RUNGER Arizona State University, Tempe, AZ 85287, USA EUGENE TUV Intel Corporation, Chandler, AZ 85226, USA

More information

Optimal Cusum Control Chart for Censored Reliability Data with Log-logistic Distribution

Optimal Cusum Control Chart for Censored Reliability Data with Log-logistic Distribution CMST 21(4) 221-227 (2015) DOI:10.12921/cmst.2015.21.04.006 Optimal Cusum Control Chart for Censored Reliability Data with Log-logistic Distribution B. Sadeghpour Gildeh, M. Taghizadeh Ashkavaey Department

More information

A New Bootstrap Based Algorithm for Hotelling s T2 Multivariate Control Chart

A New Bootstrap Based Algorithm for Hotelling s T2 Multivariate Control Chart Journal of Sciences, Islamic Republic of Iran 7(3): 69-78 (16) University of Tehran, ISSN 16-14 http://jsciences.ut.ac.ir A New Bootstrap Based Algorithm for Hotelling s T Multivariate Control Chart A.

More information

Thresholded Multivariate Principal Component Analysis for Multi-channel Profile Monitoring

Thresholded Multivariate Principal Component Analysis for Multi-channel Profile Monitoring Thresholded Multivariate Principal Component Analysis for Multi-channel Profile Monitoring arxiv:163.5265v1 [stat.ap] 16 Mar 216 Yuan Wang, Kamran Paynabar, Yajun Mei H. Milton Stewart School of Industrial

More information

L11: Pattern recognition principles

L11: Pattern recognition principles L11: Pattern recognition principles Bayesian decision theory Statistical classifiers Dimensionality reduction Clustering This lecture is partly based on [Huang, Acero and Hon, 2001, ch. 4] Introduction

More information

Statistics Toolbox 6. Apply statistical algorithms and probability models

Statistics Toolbox 6. Apply statistical algorithms and probability models Statistics Toolbox 6 Apply statistical algorithms and probability models Statistics Toolbox provides engineers, scientists, researchers, financial analysts, and statisticians with a comprehensive set of

More information

Algorithm-Independent Learning Issues

Algorithm-Independent Learning Issues Algorithm-Independent Learning Issues Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2007 c 2007, Selim Aksoy Introduction We have seen many learning

More information

Unsupervised Learning Methods

Unsupervised Learning Methods Structural Health Monitoring Using Statistical Pattern Recognition Unsupervised Learning Methods Keith Worden and Graeme Manson Presented by Keith Worden The Structural Health Monitoring Process 1. Operational

More information

BAYESIAN DECISION THEORY

BAYESIAN DECISION THEORY Last updated: September 17, 2012 BAYESIAN DECISION THEORY Problems 2 The following problems from the textbook are relevant: 2.1 2.9, 2.11, 2.17 For this week, please at least solve Problem 2.3. We will

More information

Nonparametric Multivariate Control Charts Based on. A Linkage Ranking Algorithm

Nonparametric Multivariate Control Charts Based on. A Linkage Ranking Algorithm Nonparametric Multivariate Control Charts Based on A Linkage Ranking Algorithm Helen Meyers Bush Data Mining & Advanced Analytics, UPS 55 Glenlake Parkway, NE Atlanta, GA 30328, USA Panitarn Chongfuangprinya

More information

Statistical Process Control for Multivariate Categorical Processes

Statistical Process Control for Multivariate Categorical Processes Statistical Process Control for Multivariate Categorical Processes Fugee Tsung The Hong Kong University of Science and Technology Fugee Tsung 1/27 Introduction Typical Control Charts Univariate continuous

More information

Directionally Sensitive Multivariate Statistical Process Control Methods

Directionally Sensitive Multivariate Statistical Process Control Methods Directionally Sensitive Multivariate Statistical Process Control Methods Ronald D. Fricker, Jr. Naval Postgraduate School October 5, 2005 Abstract In this paper we develop two directionally sensitive statistical

More information

The Robustness of the Multivariate EWMA Control Chart

The Robustness of the Multivariate EWMA Control Chart The Robustness of the Multivariate EWMA Control Chart Zachary G. Stoumbos, Rutgers University, and Joe H. Sullivan, Mississippi State University Joe H. Sullivan, MSU, MS 39762 Key Words: Elliptically symmetric,

More information

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2

Rejoinder. 1 Phase I and Phase II Profile Monitoring. Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 Rejoinder Peihua Qiu 1, Changliang Zou 2 and Zhaojun Wang 2 1 School of Statistics, University of Minnesota 2 LPMC and Department of Statistics, Nankai University, China We thank the editor Professor David

More information

Large-Scale Multi-Stream Quickest Change Detection via Shrinkage Post-Change Estimation

Large-Scale Multi-Stream Quickest Change Detection via Shrinkage Post-Change Estimation Large-Scale Multi-Stream Quickest Change Detection via Shrinkage Post-Change Estimation Yuan Wang and Yajun Mei arxiv:1308.5738v3 [math.st] 16 Mar 2016 July 10, 2015 Abstract The quickest change detection

More information

Quality Control & Statistical Process Control (SPC)

Quality Control & Statistical Process Control (SPC) Quality Control & Statistical Process Control (SPC) DR. RON FRICKER PROFESSOR & HEAD, DEPARTMENT OF STATISTICS DATAWORKS CONFERENCE, MARCH 22, 2018 Agenda Some Terminology & Background SPC Methods & Philosophy

More information

Chart for Monitoring Univariate Autocorrelated Processes

Chart for Monitoring Univariate Autocorrelated Processes The Autoregressive T 2 Chart for Monitoring Univariate Autocorrelated Processes DANIEL W APLEY Texas A&M University, College Station, TX 77843-33 FUGEE TSUNG Hong Kong University of Science and Technology,

More information

Distribution-Free Monitoring of Univariate Processes. Peihua Qiu 1 and Zhonghua Li 1,2. Abstract

Distribution-Free Monitoring of Univariate Processes. Peihua Qiu 1 and Zhonghua Li 1,2. Abstract Distribution-Free Monitoring of Univariate Processes Peihua Qiu 1 and Zhonghua Li 1,2 1 School of Statistics, University of Minnesota, USA 2 LPMC and Department of Statistics, Nankai University, China

More information

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project

Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Performance Comparison of K-Means and Expectation Maximization with Gaussian Mixture Models for Clustering EE6540 Final Project Devin Cornell & Sushruth Sastry May 2015 1 Abstract In this article, we explore

More information

Unsupervised Learning with Permuted Data

Unsupervised Learning with Permuted Data Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University

More information

Weighted Likelihood Ratio Chart for Statistical Monitoring of Queueing Systems

Weighted Likelihood Ratio Chart for Statistical Monitoring of Queueing Systems Weighted Likelihood Ratio Chart for Statistical Monitoring of Queueing Systems Dequan Qi 1, Zhonghua Li 2, Xuemin Zi 3, Zhaojun Wang 2 1 LPMC and School of Mathematical Sciences, Nankai University, Tianjin

More information

Machine Learning. Lecture 9: Learning Theory. Feng Li.

Machine Learning. Lecture 9: Learning Theory. Feng Li. Machine Learning Lecture 9: Learning Theory Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2018 Why Learning Theory How can we tell

More information

CS229 Final Project. Wentao Zhang Shaochuan Xu

CS229 Final Project. Wentao Zhang Shaochuan Xu CS229 Final Project Shale Gas Production Decline Prediction Using Machine Learning Algorithms Wentao Zhang wentaoz@stanford.edu Shaochuan Xu scxu@stanford.edu In petroleum industry, oil companies sometimes

More information

Machine Learning. Nonparametric Methods. Space of ML Problems. Todo. Histograms. Instance-Based Learning (aka non-parametric methods)

Machine Learning. Nonparametric Methods. Space of ML Problems. Todo. Histograms. Instance-Based Learning (aka non-parametric methods) Machine Learning InstanceBased Learning (aka nonparametric methods) Supervised Learning Unsupervised Learning Reinforcement Learning Parametric Non parametric CSE 446 Machine Learning Daniel Weld March

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model

Minimum Hellinger Distance Estimation in a. Semiparametric Mixture Model Minimum Hellinger Distance Estimation in a Semiparametric Mixture Model Sijia Xiang 1, Weixin Yao 1, and Jingjing Wu 2 1 Department of Statistics, Kansas State University, Manhattan, Kansas, USA 66506-0802.

More information

Performance Evaluation and Comparison

Performance Evaluation and Comparison Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Cross Validation and Resampling 3 Interval Estimation

More information

Likelihood-Based EWMA Charts for Monitoring Poisson Count Data with Time-Varying Sample Sizes

Likelihood-Based EWMA Charts for Monitoring Poisson Count Data with Time-Varying Sample Sizes Likelihood-Based EWMA Charts for Monitoring Poisson Count Data with Time-Varying Sample Sizes Qin Zhou 1,3, Changliang Zou 1, Zhaojun Wang 1 and Wei Jiang 2 1 LPMC and Department of Statistics, School

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Machine Learning. Theory of Classification and Nonparametric Classifier. Lecture 2, January 16, What is theoretically the best classifier

Machine Learning. Theory of Classification and Nonparametric Classifier. Lecture 2, January 16, What is theoretically the best classifier Machine Learning 10-701/15 701/15-781, 781, Spring 2008 Theory of Classification and Nonparametric Classifier Eric Xing Lecture 2, January 16, 2006 Reading: Chap. 2,5 CB and handouts Outline What is theoretically

More information

The Perceptron algorithm

The Perceptron algorithm The Perceptron algorithm Tirgul 3 November 2016 Agnostic PAC Learnability A hypothesis class H is agnostic PAC learnable if there exists a function m H : 0,1 2 N and a learning algorithm with the following

More information

Regression Clustering

Regression Clustering Regression Clustering In regression clustering, we assume a model of the form y = f g (x, θ g ) + ɛ g for observations y and x in the g th group. Usually, of course, we assume linear models of the form

More information

Unsupervised machine learning

Unsupervised machine learning Chapter 9 Unsupervised machine learning Unsupervised machine learning (a.k.a. cluster analysis) is a set of methods to assign objects into clusters under a predefined distance measure when class labels

More information

Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees

Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees Rafdord M. Neal and Jianguo Zhang Presented by Jiwen Li Feb 2, 2006 Outline Bayesian view of feature

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Machine Learning Practice Page 2 of 2 10/28/13

Machine Learning Practice Page 2 of 2 10/28/13 Machine Learning 10-701 Practice Page 2 of 2 10/28/13 1. True or False Please give an explanation for your answer, this is worth 1 pt/question. (a) (2 points) No classifier can do better than a naive Bayes

More information

Likelihood Ratio-Based Distribution-Free EWMA Control Charts

Likelihood Ratio-Based Distribution-Free EWMA Control Charts Likelihood Ratio-Based Distribution-Free EWMA Control Charts CHANGLIANG ZOU Nankai University, Tianjin, China FUGEE TSUNG Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong

More information

Pattern Recognition. Parameter Estimation of Probability Density Functions

Pattern Recognition. Parameter Estimation of Probability Density Functions Pattern Recognition Parameter Estimation of Probability Density Functions Classification Problem (Review) The classification problem is to assign an arbitrary feature vector x F to one of c classes. The

More information

JINHO KIM ALL RIGHTS RESERVED

JINHO KIM ALL RIGHTS RESERVED 014 JINHO KIM ALL RIGHTS RESERVED CHANGE POINT DETECTION IN UNIVARIATE AND MULTIVARIATE PROCESSES by JINHO KIM A dissertation submitted to the Graduate School-New Brunswick Rutgers, The State University

More information

Multivariate Binomial/Multinomial Control Chart

Multivariate Binomial/Multinomial Control Chart Multivariate Binomial/Multinomial Control Chart Jian Li 1, Fugee Tsung 1, and Changliang Zou 2 1 Department of Industrial Engineering and Logistics Management, Hong Kong University of Science and Technology,

More information

Lecture 7 Introduction to Statistical Decision Theory

Lecture 7 Introduction to Statistical Decision Theory Lecture 7 Introduction to Statistical Decision Theory I-Hsiang Wang Department of Electrical Engineering National Taiwan University ihwang@ntu.edu.tw December 20, 2016 1 / 55 I-Hsiang Wang IT Lecture 7

More information

An Investigation of Combinations of Multivariate Shewhart and MEWMA Control Charts for Monitoring the Mean Vector and Covariance Matrix

An Investigation of Combinations of Multivariate Shewhart and MEWMA Control Charts for Monitoring the Mean Vector and Covariance Matrix Technical Report Number 08-1 Department of Statistics Virginia Polytechnic Institute and State University, Blacksburg, Virginia January, 008 An Investigation of Combinations of Multivariate Shewhart and

More information

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Yi Zhang Machine Learning Department Carnegie Mellon University yizhang1@cs.cmu.edu Jeff Schneider The Robotics Institute

More information

A Power Analysis of Variable Deletion Within the MEWMA Control Chart Statistic

A Power Analysis of Variable Deletion Within the MEWMA Control Chart Statistic A Power Analysis of Variable Deletion Within the MEWMA Control Chart Statistic Jay R. Schaffer & Shawn VandenHul University of Northern Colorado McKee Greeley, CO 869 jay.schaffer@unco.edu gathen9@hotmail.com

More information

An exponentially weighted moving average scheme with variable sampling intervals for monitoring linear profiles

An exponentially weighted moving average scheme with variable sampling intervals for monitoring linear profiles An exponentially weighted moving average scheme with variable sampling intervals for monitoring linear profiles Zhonghua Li, Zhaojun Wang LPMC and Department of Statistics, School of Mathematical Sciences,

More information

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1 EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle

More information

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee Songklanakarin Journal of Science and Technology SJST-0-0.R Sukparungsee Bivariate copulas on the exponentially weighted moving average control chart Journal: Songklanakarin Journal of Science and Technology

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

A New Demerit Control Chart for Monitoring the Quality of Multivariate Poisson Processes. By Jeh-Nan Pan Chung-I Li Min-Hung Huang

A New Demerit Control Chart for Monitoring the Quality of Multivariate Poisson Processes. By Jeh-Nan Pan Chung-I Li Min-Hung Huang Athens Journal of Technology and Engineering X Y A New Demerit Control Chart for Monitoring the Quality of Multivariate Poisson Processes By Jeh-Nan Pan Chung-I Li Min-Hung Huang This study aims to develop

More information

Soft Sensor Modelling based on Just-in-Time Learning and Bagging-PLS for Fermentation Processes

Soft Sensor Modelling based on Just-in-Time Learning and Bagging-PLS for Fermentation Processes 1435 A publication of CHEMICAL ENGINEERING TRANSACTIONS VOL. 70, 2018 Guest Editors: Timothy G. Walmsley, Petar S. Varbanov, Rongin Su, Jiří J. Klemeš Copyright 2018, AIDIC Servizi S.r.l. ISBN 978-88-95608-67-9;

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

18.6 Regression and Classification with Linear Models

18.6 Regression and Classification with Linear Models 18.6 Regression and Classification with Linear Models 352 The hypothesis space of linear functions of continuous-valued inputs has been used for hundreds of years A univariate linear function (a straight

More information

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION 1 Outline Basic terminology Features Training and validation Model selection Error and loss measures Statistical comparison Evaluation measures 2 Terminology

More information

A spatial variable selection method for monitoring product surface

A spatial variable selection method for monitoring product surface International Journal of Production Research ISSN: 0020-7543 (Print) 1366-588X (Online) Journal homepage: http://www.tandfonline.com/loi/tprs20 A spatial variable selection method for monitoring product

More information

Multivariate Process Control Chart for Controlling the False Discovery Rate

Multivariate Process Control Chart for Controlling the False Discovery Rate Industrial Engineering & Management Systems Vol, No 4, December 0, pp.385-389 ISSN 598-748 EISSN 34-6473 http://dx.doi.org/0.73/iems.0..4.385 0 KIIE Multivariate Process Control Chart for Controlling e

More information

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts

Data Mining: Concepts and Techniques. (3 rd ed.) Chapter 8. Chapter 8. Classification: Basic Concepts Data Mining: Concepts and Techniques (3 rd ed.) Chapter 8 1 Chapter 8. Classification: Basic Concepts Classification: Basic Concepts Decision Tree Induction Bayes Classification Methods Rule-Based Classification

More information

More on Unsupervised Learning

More on Unsupervised Learning More on Unsupervised Learning Two types of problems are to find association rules for occurrences in common in observations (market basket analysis), and finding the groups of values of observational data

More information

High-Dimensional Process Monitoring and Fault Isolation via Variable Selection

High-Dimensional Process Monitoring and Fault Isolation via Variable Selection High-Dimensional Process Monitoring and Fault Isolation via Variable Selection KAIBO WANG Tsinghua University, Beijing 100084, P. R. China WEI JIANG The Hong Kong University of Science & Technology, Kowloon,

More information

1 Handling of Continuous Attributes in C4.5. Algorithm

1 Handling of Continuous Attributes in C4.5. Algorithm .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Potpourri Contents 1. C4.5. and continuous attributes: incorporating continuous

More information

Robotics 2 Data Association. Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard

Robotics 2 Data Association. Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard Robotics 2 Data Association Giorgio Grisetti, Cyrill Stachniss, Kai Arras, Wolfram Burgard Data Association Data association is the process of associating uncertain measurements to known tracks. Problem

More information

M.S. Project Report. Efficient Failure Rate Prediction for SRAM Cells via Gibbs Sampling. Yamei Feng 12/15/2011

M.S. Project Report. Efficient Failure Rate Prediction for SRAM Cells via Gibbs Sampling. Yamei Feng 12/15/2011 .S. Project Report Efficient Failure Rate Prediction for SRA Cells via Gibbs Sampling Yamei Feng /5/ Committee embers: Prof. Xin Li Prof. Ken ai Table of Contents CHAPTER INTRODUCTION...3 CHAPTER BACKGROUND...5

More information

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016

Instance-based Learning CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016 Instance-based Learning CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Outline Non-parametric approach Unsupervised: Non-parametric density estimation Parzen Windows Kn-Nearest

More information

Zero-Inflated Models in Statistical Process Control

Zero-Inflated Models in Statistical Process Control Chapter 6 Zero-Inflated Models in Statistical Process Control 6.0 Introduction In statistical process control Poisson distribution and binomial distribution play important role. There are situations wherein

More information

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.

Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted

More information

Metric-based classifiers. Nuno Vasconcelos UCSD

Metric-based classifiers. Nuno Vasconcelos UCSD Metric-based classifiers Nuno Vasconcelos UCSD Statistical learning goal: given a function f. y f and a collection of eample data-points, learn what the function f. is. this is called training. two major

More information

Notation. Pattern Recognition II. Michal Haindl. Outline - PR Basic Concepts. Pattern Recognition Notions

Notation. Pattern Recognition II. Michal Haindl. Outline - PR Basic Concepts. Pattern Recognition Notions Notation S pattern space X feature vector X = [x 1,...,x l ] l = dim{x} number of features X feature space K number of classes ω i class indicator Ω = {ω 1,...,ω K } g(x) discriminant function H decision

More information

Classification and Pattern Recognition

Classification and Pattern Recognition Classification and Pattern Recognition Léon Bottou NEC Labs America COS 424 2/23/2010 The machine learning mix and match Goals Representation Capacity Control Operational Considerations Computational Considerations

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2013 Exam policy: This exam allows two one-page, two-sided cheat sheets; No other materials. Time: 2 hours. Be sure to write your name and

More information

Chemometrics. 1. Find an important subset of the original variables.

Chemometrics. 1. Find an important subset of the original variables. Chemistry 311 2003-01-13 1 Chemometrics Chemometrics: Mathematical, statistical, graphical or symbolic methods to improve the understanding of chemical information. or The science of relating measurements

More information

Construction of An Efficient Multivariate Dynamic Screening System. Jun Li a and Peihua Qiu b. Abstract

Construction of An Efficient Multivariate Dynamic Screening System. Jun Li a and Peihua Qiu b. Abstract Construction of An Efficient Multivariate Dynamic Screening System Jun Li a and Peihua Qiu b a Department of Statistics, University of California at Riverside b Department of Biostatistics, University

More information

Statistical Surface Monitoring by Spatial-Structure Modeling

Statistical Surface Monitoring by Spatial-Structure Modeling Statistical Surface Monitoring by Spatial-Structure Modeling ANDI WANG Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong KAIBO WANG Tsinghua University, Beijing 100084,

More information

Stephen Scott.

Stephen Scott. 1 / 35 (Adapted from Ethem Alpaydin and Tom Mitchell) sscott@cse.unl.edu In Homework 1, you are (supposedly) 1 Choosing a data set 2 Extracting a test set of size > 30 3 Building a tree on the training

More information

Supplementary Material for Wang and Serfling paper

Supplementary Material for Wang and Serfling paper Supplementary Material for Wang and Serfling paper March 6, 2017 1 Simulation study Here we provide a simulation study to compare empirically the masking and swamping robustness of our selected outlyingness

More information

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects

Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects Early Detection of a Change in Poisson Rate After Accounting For Population Size Effects School of Industrial and Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive NW, Atlanta, GA 30332-0205,

More information

TDT4173 Machine Learning

TDT4173 Machine Learning TDT4173 Machine Learning Lecture 3 Bagging & Boosting + SVMs Norwegian University of Science and Technology Helge Langseth IT-VEST 310 helgel@idi.ntnu.no 1 TDT4173 Machine Learning Outline 1 Ensemble-methods

More information

Efficient Control Chart Calibration by Simulated Stochastic Approximation

Efficient Control Chart Calibration by Simulated Stochastic Approximation Efficient Control Chart Calibration by Simulated Stochastic Approximation 1/23 Efficient Control Chart Calibration by Simulated Stochastic Approximation GIOVANNA CAPIZZI and GUIDO MASAROTTO Department

More information

An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance

An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance Dhaka Univ. J. Sci. 61(1): 81-85, 2013 (January) An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance A. H. Sajib, A. Z. M. Shafiullah 1 and A. H. Sumon Department of Statistics,

More information

Decentralized Clustering based on Robust Estimation and Hypothesis Testing

Decentralized Clustering based on Robust Estimation and Hypothesis Testing 1 Decentralized Clustering based on Robust Estimation and Hypothesis Testing Dominique Pastor, Elsa Dupraz, François-Xavier Socheleau IMT Atlantique, Lab-STICC, Univ. Bretagne Loire, Brest, France arxiv:1706.03537v1

More information

1 Handling of Continuous Attributes in C4.5. Algorithm

1 Handling of Continuous Attributes in C4.5. Algorithm .. Spring 2009 CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Potpourri Contents 1. C4.5. and continuous attributes: incorporating continuous

More information

Algorithm Independent Topics Lecture 6

Algorithm Independent Topics Lecture 6 Algorithm Independent Topics Lecture 6 Jason Corso SUNY at Buffalo Feb. 23 2009 J. Corso (SUNY at Buffalo) Algorithm Independent Topics Lecture 6 Feb. 23 2009 1 / 45 Introduction Now that we ve built an

More information

Computer simulation on homogeneity testing for weighted data sets used in HEP

Computer simulation on homogeneity testing for weighted data sets used in HEP Computer simulation on homogeneity testing for weighted data sets used in HEP Petr Bouř and Václav Kůs Department of Mathematics, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University

More information

Recall the Basics of Hypothesis Testing

Recall the Basics of Hypothesis Testing Recall the Basics of Hypothesis Testing The level of significance α, (size of test) is defined as the probability of X falling in w (rejecting H 0 ) when H 0 is true: P(X w H 0 ) = α. H 0 TRUE H 1 TRUE

More information

An Adaptive Exponentially Weighted Moving Average Control Chart for Monitoring Process Variances

An Adaptive Exponentially Weighted Moving Average Control Chart for Monitoring Process Variances An Adaptive Exponentially Weighted Moving Average Control Chart for Monitoring Process Variances Lianjie Shu Faculty of Business Administration University of Macau Taipa, Macau (ljshu@umac.mo) Abstract

More information

Monte Carlo Studies. The response in a Monte Carlo study is a random variable.

Monte Carlo Studies. The response in a Monte Carlo study is a random variable. Monte Carlo Studies The response in a Monte Carlo study is a random variable. The response in a Monte Carlo study has a variance that comes from the variance of the stochastic elements in the data-generating

More information

Logistic Regression Analysis

Logistic Regression Analysis Logistic Regression Analysis Predicting whether an event will or will not occur, as well as identifying the variables useful in making the prediction, is important in most academic disciplines as well

More information

Decision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014

Decision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014 Decision Trees Machine Learning CSEP546 Carlos Guestrin University of Washington February 3, 2014 17 Linear separability n A dataset is linearly separable iff there exists a separating hyperplane: Exists

More information

1 Machine Learning Concepts (16 points)

1 Machine Learning Concepts (16 points) CSCI 567 Fall 2018 Midterm Exam DO NOT OPEN EXAM UNTIL INSTRUCTED TO DO SO PLEASE TURN OFF ALL CELL PHONES Problem 1 2 3 4 5 6 Total Max 16 10 16 42 24 12 120 Points Please read the following instructions

More information

Introduction. Chapter 1

Introduction. Chapter 1 Chapter 1 Introduction In this book we will be concerned with supervised learning, which is the problem of learning input-output mappings from empirical data (the training dataset). Depending on the characteristics

More information

Learning Multiple Tasks with a Sparse Matrix-Normal Penalty

Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Learning Multiple Tasks with a Sparse Matrix-Normal Penalty Yi Zhang and Jeff Schneider NIPS 2010 Presented by Esther Salazar Duke University March 25, 2011 E. Salazar (Reading group) March 25, 2011 1

More information

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size

Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Feature selection and classifier performance in computer-aided diagnosis: The effect of finite sample size Berkman Sahiner, a) Heang-Ping Chan, Nicholas Petrick, Robert F. Wagner, b) and Lubomir Hadjiiski

More information

DIAGNOSIS OF BIVARIATE PROCESS VARIATION USING AN INTEGRATED MSPC-ANN SCHEME

DIAGNOSIS OF BIVARIATE PROCESS VARIATION USING AN INTEGRATED MSPC-ANN SCHEME DIAGNOSIS OF BIVARIATE PROCESS VARIATION USING AN INTEGRATED MSPC-ANN SCHEME Ibrahim Masood, Rasheed Majeed Ali, Nurul Adlihisam Mohd Solihin and Adel Muhsin Elewe Faculty of Mechanical and Manufacturing

More information

ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering

ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering Lecturer: Nikolay Atanasov: natanasov@ucsd.edu Teaching Assistants: Siwei Guo: s9guo@eng.ucsd.edu Anwesan Pal:

More information

Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior. Abstract

Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior. Abstract Univariate Dynamic Screening System: An Approach For Identifying Individuals With Irregular Longitudinal Behavior Peihua Qiu 1 and Dongdong Xiang 2 1 School of Statistics, University of Minnesota 2 School

More information

CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary

CPSC 531: Random Numbers. Jonathan Hudson Department of Computer Science University of Calgary CPSC 531: Random Numbers Jonathan Hudson Department of Computer Science University of Calgary http://www.ucalgary.ca/~hudsonj/531f17 Introduction In simulations, we generate random values for variables

More information

Forecasting Wind Ramps

Forecasting Wind Ramps Forecasting Wind Ramps Erin Summers and Anand Subramanian Jan 5, 20 Introduction The recent increase in the number of wind power producers has necessitated changes in the methods power system operators

More information