A Multi-Step, Cluster-based Multivariate Chart for Retrospective Monitoring of Individuals


A Multi-Step, Cluster-based Multivariate Chart for Retrospective Monitoring of Individuals

J. Marcus Jobe and Michael Pokojovy

April 29, 2009

J. Marcus Jobe is Professor, Decision Sciences and Management Information Systems Department, Miami University, Oxford, Ohio 45056. Michael Pokojovy is a Mathematics Research Fellow, Fachbereich Mathematik und Statistik, Universität Konstanz, Konstanz, Germany (michael.pokojovy@uni-konstanz.de).

Abstract

The presence of several outliers in an individuals retrospective multivariate control chart distorts both the sample mean vector and covariance matrix, making the classical Hotelling's $T^2$ approach unreliable for outlier detection. To overcome the distortion, or masking, we propose a computer-intensive multi-step cluster-based method. Compared to classical and robust estimation procedures, simulation studies show that our method is usually better, and sometimes much better, at detecting randomly occurring outliers as well as outliers arising from shifts in the process location. Additional comparisons based on real data are given.

Key Words: Breakdown point; Cluster analysis; Kernel estimation; Mahalanobis distance; Moving average and medoid; Outlier.

Introduction

Individuals retrospective multivariate control charts are constructed to determine whether, in a multivariate sense, the already obtained, sequentially ordered data points $X = \{x_i\}_{i=1,\dots,n} \subset \mathbb{R}^d$ are stable, i.e., free of outliers, upsets or shifts. Some refer to this kind of analysis as a retrospective Phase I analysis of a historical data set (HDS). The sample mean vector and covariance matrix are estimated from $X$. Based on these estimates, Hotelling's $T^2$ chart is constructed and used to flag outliers, upsets or shifts in the process (Mason and Young 2002; Fuchs and Kenett 1998). For $X$, Hotelling's $T^2$ statistic at time $i$ is

$$T_i^2 = (x_i - \bar{x})' S^{-1} (x_i - \bar{x}), \qquad (1)$$

where $\bar{x}$ and $S$ are the usual sample mean vector and covariance matrix based on $X$. If $T_i^2$ exceeds the upper control limit (UCL) given by

$$\mathrm{UCL} = \frac{(n-1)^2}{n}\, B\!\left(\alpha;\, \frac{d}{2},\, \frac{n-d-1}{2}\right), \qquad (2)$$

then $x_i$ is determined to be an outlier and a special cause is assumed to have occurred at time $i$ or before. Equation (2) assumes the $x_i$'s are independent and come from a multivariate normal distribution. However, the $T_i^2$'s are correlated (Mason and Young 2002) because they each depend on the same $\bar{x}$ and $S$. Mahmoud and Woodall (2004) and Williams et al. (2006) stated that, because of the correlation, the approximate overall probability of a false alarm that performs well for a control limit like that given in equation (2) is $\alpha_{\text{overall}} = 1 - (1-\alpha)^n$. Hence, the $\alpha$ in equation (2) becomes $\alpha = 1 - (1 - \alpha_{\text{overall}})^{1/n}$, and $B(\alpha; d/2, (n-d-1)/2)$ is the $(1-\alpha)$ quantile of the beta distribution with parameters $d/2$ and $(n-d-1)/2$ (Tracy, Young and Mason 1992; Wierda 1994). $T_i^2$ is the squared Mahalanobis distance from the $i$-th data point $x_i$ to the center of $X$ described by the sample mean. However, if multiple individuals or clusters of data points are separated from a main group, the sample mean vector $\bar{x}$, thought to represent the data center, will likely be pulled away from the middle of the larger group of points. Likewise, the sample covariance matrix will be distorted, $T_i^2$ will misrepresent the squared Mahalanobis

distance from the center of the group to a data point in question, and the UCL given in (2) will not be effective at outlier detection. These effects of outliers or groups of outliers on the sample mean and covariance matrix are typically referred to as masking effects. One natural approach to overcome these effects is to substitute into equation (1) estimators of the mean vector and covariance matrix which are not affected by outliers or groups of outliers (robust estimators). The resulting $T_i^2$, together with a UCL determined via simulation, could be used to effectively identify outliers or sustained shifts in the mean vector. Vargas (2003) and Jensen et al. (2007) evaluated the performance of several different retrospective multivariate control charts constructed in such a fashion. Each of the charts was based on a $T^2$ statistic calculated using a selected combination of robust mean vector and covariance matrix estimators. Most of the robust estimators Vargas (2003) and Jensen et al. (2007) considered do not take time into account, which is a shortcoming when it comes to detecting certain outlier configurations such as sustained shifts in the mean vector. The robust methods proposed by Rousseeuw (1984), Rousseeuw and Leroy (1987), Rousseeuw and van Zomeren (1990) and Rousseeuw and van Driessen (1999) were considered by Vargas (2003) and Jensen et al. (2007). In particular, Vargas (2003) selected robust estimators of the mean and covariance matrix referred to as the minimum volume ellipsoid estimators (MVE), the minimum covariance determinant estimators (MCD), estimators generated using a trimming approach and two alternatives developed by Sullivan and Woodall (1996). Jensen et al. (2007) focused on the MVE and MCD methodologies. A thorough discussion of the five methodologies selected by Vargas (2003) is given in the next section. Important limitations are noted when each is used for 1) detection of an important shift in the mean vector and 2) outlier detection in the context of individuals retrospective multivariate process data. The notion of a breakdown point is outlined, and an overview of our proposed approach for detecting an important shift in the mean vector and outliers from individuals multivariate data occurring over time is given. Next, three methodologies needed for the construction of our proposed cluster-based individuals retrospective multivariate control chart are presented. A synthesis of the three methodologies is then

set forth. Using simulation, we compare the performance of our proposed method to the simulation results determined by Jensen et al. (2007) for the MVE and MCD and by Vargas (2003) for the $T^2$ approach based on MVE, MCD, Hotelling's $T^2$ and the two Sullivan and Woodall robust estimators. An analysis and interpretation of these comparisons are stated as well. Further, comparisons of our method to the aforementioned methods are made based on applications to the same data set analyzed by Vargas (2003).

Limitations of Robust Estimators

Robust estimators derived from the MVE and MCD methodologies are the more prominent candidates for reducing the distortion or masking that occurs in individuals retrospective multivariate control chart applications. For a set $X \subset \mathbb{R}^d$ of $n$ data points, the MVE method attempts to find the subset of $g$ points such that the minimum volume ellipsoid containing those $g$ points has the smallest volume among all ellipsoids containing any other subset of $g$ points from $X$. For the set $X$, the MCD method attempts to find the subset of $g$ points such that the determinant of the estimated covariance matrix derived from that subset of $g$ points is the minimum among all determinants of the covariance matrix derived from any other subset of $g$ points contained in $X$. The MVE and MCD robust estimators then become the corresponding mean and appropriately scaled covariance matrix calculated from the identified subset. Davies (1987), Rousseeuw and Leroy (1987) and Lopuhaä and Rousseeuw (1991) showed that $g = g^* = [(n + d + 1)/2]$ should be used to obtain the largest breakdown point for both the MVE and MCD estimators. Note the Gauss bracket $[\,\cdot\,]$ applied to $x \in \mathbb{R}$ denotes the greatest integer not exceeding $x$. Simply put, a breakdown point is the proportion $(n - g)/n$ of points such that if the number of outliers in the sample exceeds $n - g$, the estimators can be severely distorted. For any sample of interest, there is a subtle presupposition associated with the MVE and MCD estimators. That presupposition is that there exists a baseline group of good points which exceeds 50% of the sample and one or several groups of bad points which together are less than 50% of the sample. This assumption is not necessarily true for multivariate processes occurring in

time. In fact, there may occur some baseline high density, consistent group of points $B$ which is not necessarily 50% or more of the sample and other, potentially multiple, groups of points which are shifted away from $B$, each with membership percentage and density less than that of $B$. These separate groups may occur in close proximity to $B$ both in time and position, only in time, only in position scattered across the set of $n$ time periods, perhaps in some sort of trend across time, or far away scattered across time. The goal of the retrospective individuals multivariate control chart is to detect whether some type of shift, upset or change has occurred away from the baseline set $B$ of points. Since the MVE and MCD estimators determine the size $g$ of set $B$ without knowledge of the sequentially occurring multivariate data points $X$, the resulting MVE and MCD estimators can be very poor. Additionally, no provision is made for time with the MVE and MCD estimators. Robust estimators derived from a trimming algorithm have been proposed by Rousseeuw and Leroy (1987). These estimators have a breakdown point of $\gamma = 1/(d+1)$. When considered for use in a retrospective individuals multivariate control chart application, the downsides of the robust estimators provided by trimming are the same as for the MVE and MCD estimators, along with a rather small breakdown point $\gamma$. Sullivan and Woodall (1996) proposed looking at the differences between successive data points. Based on the successive differences, an estimator of the covariance matrix, together with the usual sample average based on $X$, was suggested. Using the sample average from all data points has an obvious downside. Another downside has to do with the use of the $n-1$ successive differences from the sequence $X$: the estimated covariance matrix is adversely affected by randomly occurring multiple outliers, which likely produce many large successive differences. Sullivan and Woodall (1996) proposed another method for obtaining robust estimators of the mean and covariance matrix. This method is iterative and has the same downsides as the trimming method of Rousseeuw and Leroy (1987). In summary, the suggested robust estimators for the mean vector and covariance matrix based on sample sizes $g < n$ are hampered by how the value of $g$ is determined and by the lack of attention given to the time ordering of the data. Rousseeuw and Leroy (1987, p. 263)

stated: "Both the MVE and MCD are very drastic, because they are intended to safeguard against up to 50% of outliers. If one is certain that the fraction of outliers is at most $\gamma$ ($0 < \gamma \le 0.5$), then one can work with estimators MVE($\gamma$) and MCD($\gamma$) obtained by replacing $g$ by $k(\gamma) = [n(1-\gamma)]$." The question then becomes: how do we decide $\gamma$ and simultaneously take time into account? The answer to this question is at the heart of our proposed individuals retrospective multivariate control chart scheme for detecting sustained shifts in the mean and outliers in the presence of masking. We suggest letting the data tell us the answer by using a combination of carefully constructed moving averages, nonparametric kernel estimates of multivariate densities, a density-based clustering method and a signal calculation based on a quadratic form determined from an identified bulk $B$. We combine these four tools into a two-step approach. An overview of our proposed method follows, beginning with Step 1. The Step 1 and Step 2 sections which appear later detail both steps. The reader should not confuse the two-step characterization of our method with the usual two-stage vocabulary within a Phase I analysis defined by Alt (1985). Our Steps 1 and 2 are the two segments of our proposed algorithm. For our method, we transform each original data vector $x_{\text{orig}} \in X_{\text{orig}}$ using the usual Mahalanobis standardization. Throughout the rest of this paper, $x \in X$ becomes $x = S_{\text{orig}}^{-1/2}(x_{\text{orig}} - \bar{x}_{\text{orig}})$, where $\bar{x}_{\text{orig}}$ and $S_{\text{orig}}$ come from the original untransformed data $X_{\text{orig}}$ (a small code sketch of this standardization appears at the end of this overview). In Step 1 of our outlier detection algorithm, we look for the presence of a sustained shift in the underlying data. Sullivan (2002) labeled as a change point the point where a sustained shift in the mean occurs. Initially, we repeatedly transform the data points using a weighted moving average procedure. The repeated application of moving averages smoothes the scattered data but preserves discontinuities in the time trend. Distances between adjacent observations in the transformed space, scaled by a certain normalization factor, are considered. If the distance between two successive points exceeds a certain threshold, a sustained shift in the trend is assumed to have happened. Moving averages lying between each two sequential jumps are then assigned to the same group. By averaging the data lying in each group, $l$ group centers $c_1, \dots, c_l$ are obtained. Each original data point $x_i$ is assigned to the center $c_j$ that

has the minimal Euclidean distance to $x_i$ among all centers. The group of points associated with the center $c_j$ is denoted by $X_j$. This procedure helps to detect whether a sustained shift in the process location has occurred based on the underlying data. Further, this methodology identifies potential clusters that would not ordinarily be recognized with the typical cluster analysis that ignores the time dimension of occurrence. For example, perhaps three clusters of data exist, the first cluster occurring within the first $n_1$ time periods, the second cluster within the second $n_2$ time periods and the third cluster within the third $n_3$ time periods. If the $n_1$, $n_2$ and $n_3$ time periods are combined, the three clusters together may not be recognizable, but separated according to time, the three clusters are more readily identified. In Step 2, we consider the largest group of points $X_{j^*}$ among all groups obtained in Step 1. A sequence of substeps applied to $X_{j^*}$ produces a subset of points upon which a preliminary sample covariance matrix is calculated. Using this matrix, a nonparametric kernel estimate of the multivariate density function for all points $x_i$ in $X_{j^*}$ is determined. The data contained in $X_{j^*}$ are then clustered using the nonparametric clustering by mode identification introduced by Li, Ray and Lindsay (2007). Their clustering requires a nonparametric estimate of the multivariate density, a modal expectation maximization algorithm and a mode association methodology. The biggest cluster $C$ of points is identified. Depending on the dimension and size of $C$, a set of points having large values of a density symmetry measure we propose is peeled. The remaining points in $C$ become the bulk $B$. The usual mean and covariance matrix are computed from $B$. A quadratic form based on these two estimators produces the outlier detection signal. We note that Rocke and Woodruff (2001) proposed the use of cluster analysis for detecting certain outlier configurations which have been shown to be problematic for the MVE and MCD estimators. Harnish et al. (2009) developed a time-based clustering method. Others such as Coleman and Woodruff (2000) and Coleman et al. (1999) recommended a combined approach to outlier detection that includes clustering. The fundamental components of our computer-intensive, two-step, cluster-based control chart are presented in the following three sections.
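Before detailing the components, it may help to fix the baseline computationally. The following is a minimal Python sketch of the classical chart of equations (1)-(2) and of the Mahalanobis standardization just described. The naming is ours (the authors' implementation was in Matlab), and only numpy and scipy are assumed:

```python
import numpy as np
from scipy.stats import beta
from scipy.linalg import sqrtm

def hotelling_t2(X):
    """Equation (1): T2_i = (x_i - xbar)' S^{-1} (x_i - xbar) for each row of X (n x d)."""
    xbar = X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    D = X - xbar
    return np.einsum('ij,jk,ik->i', D, S_inv, D)

def t2_ucl(n, d, alpha_overall=0.05):
    """Equation (2) with the per-observation alpha = 1 - (1 - alpha_overall)^(1/n)."""
    alpha = 1.0 - (1.0 - alpha_overall) ** (1.0 / n)
    return (n - 1) ** 2 / n * beta.ppf(1.0 - alpha, d / 2.0, (n - d - 1) / 2.0)

def mahalanobis_standardize(X_orig):
    """x = S_orig^{-1/2} (x_orig - xbar_orig), applied row-wise."""
    root = np.real(sqrtm(np.cov(X_orig, rowvar=False)))   # symmetric square root of S_orig
    return (X_orig - X_orig.mean(axis=0)) @ np.linalg.inv(root)
```

Flagging `np.where(hotelling_t2(X) > t2_ucl(*X.shape))[0]` then reproduces what we later call the usual method.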

Modified Moving Averages

Consider again the transformed data set $X$. If the sample contains a certain trend (in particular, a discontinuous one), our aim is to reveal it by transforming this set into a new space of the same dimension $d$. A naïve approach would be to take a point $x_i$ and to average it with its neighbors in time. This approach would produce false information if the data tend to be inconsistent, i.e., when neighboring points come from two or more different distributions with substantially different means. Even robust smoothing techniques such as Locally Weighted Scatterplot Smoothing (LOWESS), introduced by Cleveland (1979), often produce unsatisfactory results when applied to data with a discontinuous trend. For this reason, we introduce the notion of a modified moving average that combines the advantages of moving averages and medoids. Let $s \ge 1$ be an arbitrary integer denoting the spread of a neighborhood over a certain point. We define a cyclical numeration $\varphi_n : \mathbb{Z} \to \{1,\dots,n\}$ mapping the integers onto $\{1,\dots,n\}$ by means of

$$\varphi_n(i) = 1 + ((i-1) \bmod n). \qquad (3)$$

$\varphi_n$ is thus a periodic function with period $n$ since $\varphi_n(i + kn) = \varphi_n(i)$ for $i, k \in \mathbb{Z}$. For each point $x_i$ we take its $s$ neighbors to the left and to the right with respect to the cyclical indexing $\varphi_n$. The $s$-neighborhood $N_i$ of $x_i$ is thus given by $N_i = \{x_{\varphi_n(i-s)}, \dots, x_{\varphi_n(i+s)}\}$. Then, we find a medoid of $N_i$, i.e., a point $m_i \in N_i$ that minimizes the function $\delta_i(x) = \sum_{j=-s}^{s} d(x, x_{\varphi_n(i+j)})$ over $N_i$. Note that $d(\cdot,\cdot)$ denotes here the standard Euclidean distance in $\mathbb{R}^d$ and that, for negative arguments, $\varphi_n$ wraps around cyclically; for example, with $n = 30$, $\varphi_{30}(-3)$ becomes 27. The point $m_i$ is on average the closest to all other points in $N_i$. We take the $s+1$ points $x_{j_{i,1}}, \dots, x_{j_{i,s+1}}$ from $N_i$ that lie closest to $m_i$. Statistically speaking, these points are likely to come from the same population. The moving average $\tilde{x}_i$ is now given by $\tilde{x}_i = \frac{1}{s+1}(x_{j_{i,1}} + \dots + x_{j_{i,s+1}})$. Altogether, we have defined a mapping $T : \mathbb{R}^{d \times n} \to \mathbb{R}^{d \times n}$, $\{x_i\}_{i=1,\dots,n} \mapsto \{\tilde{x}_i\}_{i=1,\dots,n}$. Figures 1 and 2 give a comparison of our modified moving average algorithm and LOWESS; a code sketch of one pass of $T$ follows.
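A sketch of one pass of the map $T$ in Python makes the construction concrete (names ours; only numpy is assumed):

```python
import numpy as np

def modified_moving_average(X, s):
    """One application of T: for each time index i, find the medoid of the
    cyclical s-neighborhood N_i and average the s+1 points of N_i closest to it."""
    n = X.shape[0]
    out = np.empty_like(X, dtype=float)
    for i in range(n):
        idx = np.arange(i - s, i + s + 1) % n            # cyclical indexing (eq. (3))
        N = X[idx]
        D = np.linalg.norm(N[:, None, :] - N[None, :, :], axis=2)
        medoid = np.argmin(D.sum(axis=1))                # minimizes delta_i over N_i
        closest = np.argsort(D[medoid])[: s + 1]         # includes the medoid itself
        out[i] = N[closest].mean(axis=0)
    return out
```

Iterating this map $m$ times gives the repeated smoothing used in Step 1 below.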

[Figure 1: Smoothing applied to $\{x_i\}_{i=1,\dots,30}$; panel (a) shows the original sample with the modified moving averages, panel (b) the original sample with LOWESS, each plotted against the time period $i$.]

Two univariate ordered samples $\{x_i\}_{i=1,\dots,30}$ and $\{w_i\}_{i=1,\dots,30}$ were generated, each of size $n = 30$. The $x_i$'s were taken from the normal distribution $N(0, 1)$ for $i = 1,\dots,30$ (see Figure 1) and the $w_i$'s from the distribution $N(\mu_i, 1)$ with $\mu_i = 0$ for $i = 1,\dots,15$ and $\mu_i = 1$ for $i = 16,\dots,30$ (see Figure 2). The modified moving average algorithm with spread $s = 9$, as well as LOWESS with the span specified as 33.33% of the data, were then applied $m = 5$ times to the data (see the section on Choice of Factors and Thresholds later in this paper regarding the selection of $s$ and $m$).

[Figure 2: Smoothing applied to $\{w_i\}_{i=1,\dots,30}$; panel (a) shows the original sample with the modified moving averages, panel (b) the original sample with LOWESS.]

In Figure 1, we see our moving averages are not as sensitive as LOWESS to data volatility. As can be seen in Figure 2, the moving averages defined above effectively distinguish between points from different distributions, whereas the LOWESS estimates do not readily respond to a sustained shift in the data.

Multivariate Density Estimation and Bandwidth Selection

Given a sample $X$ selected independently from some underlying general population with an unknown density function $f(x)$ in $d$-dimensional Euclidean space, the problem is to nonparametrically estimate $f(x)$. We have chosen to develop our multivariate density estimation based on the well-known method of Parzen (1962). Throughout the following density estimation discussion, $h$ will denote the bandwidth or window size. Let the kernel $K : \mathbb{R}^d \to \mathbb{R}$ be a function satisfying certain regularity and moment properties. The estimator of $f$ at $x$ is then given by

$$\hat{f}_h(x) = \frac{1}{nh^d} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right). \qquad (4)$$

The smoothing factor $h > 0$ is typically referred to as the bandwidth or window size. If $h$ depends only on the space dimension $d$ and the sample size $n$, i.e., $h = h(d, n)$, it is called a global bandwidth. If it depends on $x$, $x_i$ and $\{x_j\}_{j=1,\dots,n}$, i.e., $h = h(x, x_i, \{x_j\}_{j=1,\dots,n})$, it is referred to as a variable bandwidth. In the most general case, the bandwidth (usually written as $H$) can be a nonsingular $d \times d$ matrix. The estimator is then defined as

$$\hat{f}_H(x) = \frac{1}{n \det H} \sum_{i=1}^{n} K\!\left(H^{-1}(x - x_i)\right). \qquad (5)$$

The quality of the estimator thus depends on the choice of the kernel $K$ and the bandwidth $h$. Scott (1992) stated that the quality of a density estimate is widely recognized to be determined primarily by the choice of a smoothing parameter, and only in a minor way by the choice of a kernel. In the present paper, the normal probability density function $\varphi(x) = (2\pi)^{-d/2} \exp(-\|x\|^2/2)$ is chosen as the kernel since $\varphi$ has nice regularity properties and produces smooth estimators $\hat{f}_h$. Here $\|x\| = \sqrt{x'x}$ is the Euclidean norm on $\mathbb{R}^d$.
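A direct sketch of the estimator (5) with the normal kernel $\varphi$, assuming numpy (names ours):

```python
import numpy as np

def kernel_density(x, X, H):
    """Equation (5): f_H(x) = (n det H)^{-1} * sum_i phi(H^{-1}(x - x_i))
    with the standard normal kernel phi on R^d; x is a single point, X is n x d."""
    n, d = X.shape
    U = (x - X) @ np.linalg.inv(H).T                     # rows are H^{-1}(x - x_i)
    phi = (2.0 * np.pi) ** (-d / 2.0) * np.exp(-0.5 * np.einsum('ij,ij->i', U, U))
    return phi.sum() / (n * np.linalg.det(H))
```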

For a given $K$, one way to measure the quality of estimation at a point $x$ is to use the mean square error $\mathrm{MSE}(x) = E\{\hat{f}_h(x) - f(x)\}^2$. The overall error is described by the mean integrated square error $\mathrm{MISE} = \int_{\mathbb{R}^d} \mathrm{MSE}(x)\, dx$. Härdle and Müller (1997) expressed the asymptotic mean integrated square error (AMISE) as an estimate of the MISE for the limit case $h \to 0$ for sufficiently smooth $f$ and $H = hH_0$, where $H_0$ is a fixed matrix with $\det H_0 = 1$. Over the past few decades, the problem of optimal bandwidth selection has been extensively studied in the statistical literature. Many automatic, data-driven bandwidth selection methods (Härdle and Müller 1997) have been proposed. The most popular are the plug-in methods (especially the normal reference rule-of-thumb and Scott's rule), cross-validation techniques as well as adaptive methods such as the k-nearest-neighbors bandwidth. Whereas general plug-in methods are not widely used in multidimensional settings (Section 1.7 in Li and Racine (2007)), the normal reference rule-of-thumb suggesting

$$H = \left(\frac{4}{d+2}\right)^{\frac{1}{d+4}} n^{-\frac{1}{d+4}}\, \Sigma^{1/2} \qquad (6)$$

is often preferred by practitioners. $\Sigma$ is the covariance matrix of the underlying general population. The bandwidth determined by equation (6) is appealing for our purposes because it minimizes the AMISE when the underlying distribution is normal. In the present work, a simple robust bandwidth determined by a modified normal reference rule-of-thumb is proposed. Our bandwidth estimator is obtained by plugging a preliminary robust covariance estimator $S$ into equation (6).

Modal Expectation Maximization and Mode Association Clustering

The Modal EM (MEM) algorithm solves a local maximization problem for a mixture density by ascending iterations starting from any initial point. This procedure was introduced by Li, Ray and Lindsay (2007) as a modification of the well-known EM algorithm of Dempster, Laird and Rubin (1977). Though the MEM algorithm is based on expectation and maximization steps similar to the EM, the aim of MEM is to find local maxima, i.e., modes, of a

given density function. For each point $x \in X$, MEM determines an ascending path to a local maximum. All points in $X$ whose paths end at the same maximum are assigned to the same cluster. This assignment is referred to by Li, Ray and Lindsay (2007) as Modal Association Clustering (MAC). See the appendix for details of the MEM/MAC iterative algorithm. We use MAC and MEM in Step 2 of our outlier detection scheme to identify the largest cluster $C \subseteq X_{j^*}$ from Step 1. A peeling algorithm (see the Step 2 section below) is then applied to $C$, which helps to ensure that the bulk $B$ has a symmetric probability distribution function. In the following two sections, our new individuals retrospective multivariate control chart methodology is presented. We continue to assume the sample $X = \{x_i\}_{i=1,\dots,n} \subset \mathbb{R}^d$ is standardized according to Mahalanobis as described earlier. Standardizing the input data ensures that the detection signals are invariant under any nonsingular affine linear transformation.

Step 1

Let $s = [\min(\sqrt{dn}, n/2)]$ and $m = [\sqrt{d}\, \log_2 n]$. For details about the selection of $s$ and $m$, see the Choice of Factors and Thresholds section.

(a) Identify the $n$ neighboring groups $N_i$, each containing $2s + 1$ points. Find the medoid in each of the $n$ groups. See the Modified Moving Averages section for details.

(b) Find the $s + 1$ $x_i$'s in each group which are closest to the medoid of that group.

(c) Denote by $y_i$ the average of each such set of $s + 1$ $x_i$'s. Each of these is a suitably weighted moving average of the neighborhood of points $N_i$, where the weights are either $0$ or $1/(s+1)$.

(d) Iteratively repeat steps (a)-(c) $m - 1$ additional times.

(e) Find the Euclidean distances $\mu_i = d(y_{\varphi_n(i)}, y_{\varphi_n(i+1)})$ between the resulting adjacent moving averages. Find the Euclidean distances $\nu_i = d(x_{\varphi_n(i)}, x_{\varphi_n(i+1)})$ between the ordered $x_i \in X$. The jump at position $i$ is then given by $\tau_i = \mu_i / \bar{\nu}$, where $\bar{\nu}$ is the median of the $\nu_i$'s. If $\tau_i$ exceeds the threshold $\tau_\theta$ given in Table 2, a jump is assumed to have happened at position $i$.

(f) Let $I = \{i_1, \dots, i_l\}$ contain all jump positions. If $I$ has at least two positions, the sequential unshifted points can be represented as the collection of sets $Y_1, \dots, Y_l$ given by $Y_1 = \{y_{\varphi_n(i_1+1)}, \dots, y_{\varphi_n(i_2)}\}$, $Y_2 = \{y_{\varphi_n(i_2+1)}, \dots, y_{\varphi_n(i_3)}\}$, ..., $Y_{l-1} = \{y_{\varphi_n(i_{l-1}+1)}, \dots, y_{\varphi_n(i_l)}\}$, $Y_l = \{y_{\varphi_n(i_l+1)}, \dots, y_n, y_1, \dots, y_{\varphi_n(i_1)}\}$. Let $c_i$ denote the average over the set $Y_i$. If $I$ has fewer than two positions, then define $c_1 = \frac{1}{n}\sum_{i=1}^{n} y_i$.

(g) Each data vector $x_i$ is assigned to the center $c_j$ closest to it in Euclidean distance. All points allocated to the same center $c_j$ are combined into a group referred to as $X_j$.

(h) The largest group amongst all the $X_j$'s is identified as $X_{j^*}$. If there are several such groups, we choose $X_{j^*}$ to be the group with the smallest $j$.

(i) The group of data points $X_{j^*}$ becomes the input to Step 2. If $X_{j^*}$ contains fewer than $d + 1$ points, we set $X_{j^*} = X$ and $c_{j^*} = \bar{x} = 0$. We note that $X_{j^*}$ contains $n^*$ points and has center $c_{j^*}$.

Step 2

(a) Denote $n^* = n_{j^*}$ and $c^* = c_{j^*}$. The center point $c^*$ is a type of robust mean vector estimate. We trim points around $c^*$. Unlike peeling data values around a sample mean, trimming points having extreme values of an appropriately defined distance from $c^*$ eliminates outliers without exhausting good points. Let $d^2(x) = (x - c^*)' S^{-1} (x - c^*)$ for every $x \in X_{j^*}$, where $S = \frac{1}{n^* - 1} \sum_{i=1}^{n^*} (x_i - c^*)(x_i - c^*)'$. A point $x \in X_{j^*}$ having the maximum value of $d^2(x)$ is identified and the trimmed sample $X_{j^*} \setminus \{x\}$ is considered. This is repeated until $n^* - q$ points from $X_{j^*}$ are peeled, where $q = [(n^* + d + 1)/2]$. Let the remaining set of $q$ points be denoted by $X_1$.

(b) Calculate the usual $\bar{x}_1$ and $S_1$ from $X_1$.

(c) Calculate $S_2 = s_{1,\theta}(d, n)\, (c_{d,.05})^{-1}\, S_1$, where $c_{d,\alpha} = \frac{1}{d} \int_{\{x \in \mathbb{R}^d :\; \|x\|^2 \le \chi^2_{d,\alpha}\}} \|x\|^2\, \varphi(x)\, dx$ (see Table 1) and $s_{1,\theta}(d, n)$ is a small sample correction factor (see the Choice of Factors and Thresholds section for details and Table 3). Note that $\chi^2_{d,\alpha}$ is the $(1-\alpha)$-th quantile of the $\chi^2_d$-distribution and $\varphi$ is the probability density function of $N(0, I_d)$. The correction factor $c_{d,\alpha}$ is used to maintain consistency in the multivariate normal context.

[Table 1: Values of $c_{d,\alpha}$ for $d = 2, 3, 5, 10$ and two values of $\alpha$.]

(d) Since only $q \approx n^*/2$ points are in $X_1$, to increase the number of points upon which to compute a preliminary estimate of $\Sigma$ to be used in equation (6), we find the set $X'$ of all $x \in X_{j^*}$ such that $(x - \bar{x}_1)' S_2^{-1} (x - \bar{x}_1) \le \chi^2_{d,\alpha}$. (To be consistent with our overall error probability of .05, we let $\alpha = .05$.)

(e) Let $\bar{x}$ be determined from $X' \cup X_1$ and $\alpha$ as in (d). Thus,

$$S = s_{2,\theta}(d, n)\, (c_{d,\alpha})^{-1} \frac{1}{|X' \cup X_1| - 1} \sum_{x \in X' \cup X_1} (x - \bar{x})(x - \bar{x})', \qquad (7)$$

where $s_{2,\theta}$ is a small sample correction factor listed in Table 3.

(f) Plug $S$ from (e) into equation (6), giving

$$H = \left(\frac{4}{d+2}\right)^{\frac{1}{d+4}} (n^*)^{-\frac{1}{d+4}}\, S^{1/2}. \qquad (8)$$

Since $S$ is a preliminary robust estimator of $\Sigma$, our chosen bandwidth is more resistant to outliers and performs well for contaminated samples.

(g) Estimate the multivariate probability density via $\hat{f}_H$ given by equation (5), using $H$ from equation (8). Recall that $n^*$ is the number of data points in $X_{j^*}$.

(h) Apply the Mode Association Clustering (MAC) to $\hat{f}_H$. Among the clusters $C_1, \dots, C_r$ determined in $X_{j^*}$, the biggest cluster $C$ and the corresponding mode $u$ are selected. In case of ambiguity, we pick the cluster $C_i$ with the smallest index $i$.

(i) Let the set $C'$ contain the top 25% of points $x \in C$ having the largest $(x - c^*)' S^{-1} (x - c^*)$, where $c^*$ is defined in Step 2(a) and $S$ is defined by equation (7). The 25% value was subjectively chosen.

(j) For every $x \in C'$, determine a mirror point with respect to $u$ by $x' = 2u - x$.

(k) Compute the density symmetry measure $\lambda(x) = \frac{\max\{\hat{f}_H(x),\, \hat{f}_H(x')\}}{\min\{\hat{f}_H(x),\, \hat{f}_H(x')\}}$. This helps to filter out skewness which may dilute the effectiveness of our method.

(l) All $x \in C'$ with $\lambda(x) > s_{3,\theta}$ (see Table 3 for $s_{3,\theta}$) are assigned to $C''$.

(m) Define

$$\bar{x}_{\text{robust}} = \frac{1}{|C \setminus C''|} \sum_{x \in C \setminus C''} x, \qquad S_{\text{robust}} = \frac{1}{|C \setminus C''| - 1} \sum_{x \in C \setminus C''} (x - \bar{x}_{\text{robust}})(x - \bar{x}_{\text{robust}})'. \qquad (9)$$

(n) The detection signal becomes

$$T_i^2 = (x_i - \bar{x}_{\text{robust}})' S_{\text{robust}}^{-1} (x_i - \bar{x}_{\text{robust}}), \qquad (10)$$

where $C \setminus C''$ is the bulk $B$, and $\bar{x}_{\text{robust}}$ and $S_{\text{robust}}$ are from equation (9).

(o) Control limits for selected $n$, $d$ and $\alpha_{\text{overall}} = .05$ are given in Table 4.

Choice of Factors and Thresholds

In this section, we discuss the choice of $\tau_\theta$, $s_{1,\theta}$, $s_{2,\theta}$ and $s_{3,\theta}$ necessary for our two-step approach. The jump threshold $\tau_\theta$ referred to in the Step 1 section is determined as a certain percentile of the distribution of the maximal jump $\tau_{\max} = \max_{i=1,\dots,n} \tau_i$ from a randomly selected sample of size $n$ from $N(0, I_d)$. The generated sample data are transformed with the Mahalanobis standardization procedure because the sample mean and covariance in general differ from the assumed parameters. Maintaining a 5% overall false detection probability, we selected the 98-th percentile of the maximal jump $\tau_{\max}$ obtained from a simulation of 2000 samples, each of size $n$. Table 2 lists the 98-th percentile for some $d$ and $n$.

[Table 2: 98-th percentile of the maximum jump $\tau_{\max} = \max_{i=1,\dots,n} \tau_i$ for selected $d$ and $n$.]
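The threshold simulation just described can be sketched as follows, reusing `mahalanobis_standardize` and `modified_moving_average` from the earlier sketches; the formulas for $s$ and $m$ are those of the Step 1 section, the number of replications follows the text, and the remaining names are ours:

```python
import numpy as np

def simulate_tau_threshold(n, d, reps=2000, q=0.98, seed=None):
    """98th percentile of tau_max over standardized in-control N(0, I_d) samples."""
    rng = np.random.default_rng(seed)
    s = int(min(np.sqrt(d * n), n / 2))                  # s = [min(sqrt(dn), n/2)]
    m = int(np.sqrt(d) * np.log2(n))                     # m = [sqrt(d) log2 n]
    tau_max = np.empty(reps)
    for r in range(reps):
        X = mahalanobis_standardize(rng.standard_normal((n, d)))
        Y = X
        for _ in range(m):                               # Step 1 (a)-(d): m passes of T
            Y = modified_moving_average(Y, s)
        mu = np.linalg.norm(np.roll(Y, -1, axis=0) - Y, axis=1)   # adjacent smoothed gaps
        nu = np.linalg.norm(np.roll(X, -1, axis=0) - X, axis=1)   # adjacent raw gaps
        tau_max[r] = (mu / np.median(nu)).max()                   # Step 1 (e)
    return np.quantile(tau_max, q)
```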

In the spirit of Rousseeuw and van Zomeren (1990), we assume $s_{1,\theta}$, $s_{2,\theta}$ and $s_{3,\theta}$ to depend on $n/d$, which describes the spatial sparsity of a sample. Based on knowledge of the asymptotic behavior $\lim_{n/d \to \infty} s_{i,\theta} = 1$, along with numerical simulations, we determined a general expression for $s_{1,\theta}$, $s_{2,\theta}$ and $s_{3,\theta}$ (see Table 3). The functions $s_{1,\theta}$ and $s_{2,\theta}$ are the small sample correction factors for covariance matrix estimation. For $n/d \le 5$, our preliminary simulations showed that larger correction factors $s_{1,\theta}$ and $s_{2,\theta}$ were necessary to decrease the volatility of the preliminary covariance matrix estimator. This corresponds to the empirical observations of Rousseeuw and van Zomeren (1990), who stated that the MVE becomes unreliable for $n/d \le 5$. Hence, we constructed $s_{1,\theta}$ and $s_{2,\theta}$ to perform well for most $n/d$ considered. The factor $s_{3,\theta}$ is the symmetry threshold for the estimated empirical density function in Step 2(l).

[Table 3: Small sample correction factors and density symmetry threshold; as printed, $s_{1,\theta}(d, n) = \exp(\dots\, n/d)$, $s_{2,\theta}(d, n) = \exp(\dots\, n/d)(n/d - 1)^{-1}$ and $s_{3,\theta}(d, n) = \exp(.6767\, n/d)(n/d - 1)$.]

Recall that $s$ and $m$ are determined according to $s = [\min(\sqrt{dn}, n/2)]$ and $m = [\sqrt{d}\, \log_2 n]$. We selected $s$ and $m$ to comply with the following properties: $sm \to \infty$ and $sm/n \to 0$ as $n \to \infty$, as required for the asymptotic consistency of the moving average estimator.

Simulation, Analysis and Conclusions

A simulation was initially conducted to determine appropriate control limits for the detection signal $T_i^2$ in equation (10), where the estimated mean vector and covariance matrix were produced by our two-step approach outlined in the Step 1 and Step 2 sections. For selected combinations of $n$ and $d$, 5000 sets of $n$ data values, assumed to have come from an in-control process, were generated by simulation. We simulated an in-control process by generating data from a multivariate normal distribution with zero mean vector and identity covariance matrix. For each of the $j = 1, \dots, 5000$ sets of $n$ data values, $T_j^2 = \max_{i=1,\dots,n} T_{ij}^2$ was recorded, and the 4750-th ranked $T_j^2$ was identified as the control limit with an overall $\alpha = 5\%$ false alarm rate for each $n$ and $d$ combination. Mason and Young (2002) noted this approach was necessary for determining the control limits because of the dependence among the $T_{ij}^2$'s within the

$j$-th set of $n$ data points.

[Table 4: Control limits for our two-step approach with $\alpha_{\text{overall}} = .05$, for selected $d$ and $n$.]

The control limits for our two-step method are displayed in Table 4 as a function of $n$ and $d$, where the overall false alarm rate for each $n$, $d$ combination is $\alpha = 5\%$. We simulated a variety of out-of-control or shifted situations and calculated the detection probability of our combined outlier detection scheme. Jensen et al. (2007) and Vargas (2003) noted that for affine linear equivariant signal computation procedures (see Rousseeuw and van Zomeren (1990)), any out-of-control setting corresponding to a designated shift in the mean with the same covariance structure depends only on the non-centrality parameter $\mathrm{ncp} = (\mu_1 - \mu_0)' \Sigma^{-1} (\mu_1 - \mu_0)$, where $\mu_1$ is the mean vector of the shifted scenario and $\mu_0$ is the in-control mean vector. Hence, shifted or out-of-control events that can distort the usual mean and covariance estimators were simulated by generating data from a multivariate normal with mean $\mu_1$ and identity covariance matrix that would produce a selected ncp of interest. For comparison purposes, we selected the same $n$, $d$, ncp and simulation sizes as Vargas (2003). Additional $n$ and $d$ values for $\mathrm{ncp} = 5, 15$ and $25$ were selected for comparison to the probabilities found by Jensen et al. (2007), whose simulation sizes were much larger. Letting $k$ equal the number of bad points generated from an out-of-control scenario indexed by a selected ncp value, we arranged in random order the $k$ bad points with the $n - k$ good points coming from an in-control process. For example, suppose $n = 30$, $d = 2$, $\mathrm{ncp} = 4$ and $k = 1$. We generated $r = 1500$ sets of $n = 30$ data points. Each of the $r$ sets had 29 bivariate in-control data points and $k = 1$ bivariate data point generated from a bivariate normal distribution with $\mu_1 = (\sqrt{2}, \sqrt{2})'$ and identity covariance matrix. For each of the $j = 1, \dots, r$ sets of $n = 30$ points, the $k = 1$ bad point and the $n - k$ good points were randomly arranged as to order, and 30 $T_{ij}^2$'s were calculated, each based on estimators

determined by our proposed combination approach. If, for some $j$, one or more of the $T_{ij}^2$'s exceeded the corresponding control limit, an out-of-control signal was assigned to that set of $n = 30$ data points. The detection probability for an out-of-control masking scenario indexed by $\mathrm{ncp} = 4$, $k = 1$, $n = 30$ and $d = 2$ was estimated with the proportion of out-of-control signals from the $r$ sets. Jensen et al. (2007) and Vargas (2003) performed simulations based on the usual Hotelling's $T^2$, a $T_i^2$ statistic calculated with MVE estimators and a $T_i^2$ statistic calculated with MCD estimators. In Figure 3(a), outlier detection probabilities are plotted versus ncp for each of the four methods, where $n = 30$, $d = 2$ and $k = 1$. We see that our method is as good as or better than the MVE and MCD methods and often much better for detecting arbitrarily occurring outliers. Further, for $k = 1$, our method is competitive with, but not necessarily better than, the approach based on $T_i^2$ in (1) and the UCL in (2), where all data points are considered. We call this the usual method. Consider now Figures 3(b), 3(c) and 3(d). For $k > 1$, it appears our method of detecting outliers is usually better than each of the three other methods and often much better. An expanded set of outlier detection probabilities for our method and the MVE method (determined by Vargas (2003)) is given in Table 5. Since, for the selected $n$, $d$, $k$ and ncp values, Vargas (2003) recommended the MVE method over the usual, the MCD and the two methods of Sullivan and Woodall for detecting randomly occurring outliers, we only repeat the MVE outlier detection probabilities for comparison purposes. Examination of Figure 3, along with Table 5, suggests our method is almost always better than MVE and often much better for detecting randomly occurring outliers. Hence, it is reasonable to conclude that our method for detecting multiple outliers also surpasses the effectiveness of the other four control chart methods, at least for the $n$, $d$, $k$ and ncp values considered. One exception: for $k = 1$, the usual method does seem to be somewhat better than our proposed method. In addition to the arbitrary occurrence(s) of an upset in a process, we considered the scenario where the individuals multivariate process was consistent at some unknown mean vector $\mu_0$ and then shifted to a different unknown mean vector $\mu_1$ after $\Delta = n/2$ time periods.
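Before turning to that scenario, here is a sketch of how one contaminated sample in the random-outlier setting above can be generated (our reading of the scheme; the equal-coordinate choice of $\mu_1$ is just one convenient mean vector attaining a given ncp under the identity covariance):

```python
import numpy as np

def random_outlier_sample(n, d, k, ncp, seed=None):
    """n - k in-control N(0, I_d) points plus k points shifted to mu_1 with
    mu_1' mu_1 = ncp, randomly arranged in time."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    X[:k] += np.sqrt(ncp / d)    # e.g. d = 2, ncp = 4 gives mu_1 = (sqrt(2), sqrt(2))'
    rng.shuffle(X)               # random time order of good and bad points
    return X
```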

[Figure 3: Estimated outlier detection probabilities versus the non-centrality parameter for $k$ outliers in $d = 2$ dimensions; panels (a) $k = 1$, (b) $k = 3$, (c) $k = 5$ and (d) $k = 7$ compare Our method, MVE, MCD and the usual method.]

The outlier detection method by Sullivan and Woodall based on a moving range estimator of the covariance matrix was identified by Vargas (2003) to be the best at detecting a shift of this type. We did a simulation under these same conditions and applied our two-step outlier detection method. The estimated detection probabilities of our method are given in Table 6 along with the performance of Sullivan and Woodall's approach (SW) and MVE as reported by Vargas (2003). For the ncps and shift considered in Table 6, the simulated detection probabilities of our method exceed those based on the MVE and SW for $d = 2$ and $n = 30$.
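The sustained-shift scenario of Table 6 differs from the random-outlier setting only in where the bad points are placed: the shift affects every observation after the change point $\Delta$. A sketch under the same conventions:

```python
import numpy as np

def sustained_shift_sample(n, d, ncp, delta, seed=None):
    """In control for periods 1,...,delta; mean shifted by mu_1 (mu_1' mu_1 = ncp)
    for periods delta+1,...,n (Table 6 uses n = 30, d = 2, delta = 15)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    X[delta:] += np.sqrt(ncp / d)    # sustained shift after the change point
    return X
```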

Next, a formal comparison of the simulated probabilities in Figure 3, Tables 5 and 6 is presented.

[Table 5: Estimated outlier detection probabilities for Our method and MVE as reported by Vargas (2003), indexed by ncp and $k$, for $d = 3, 5, 10$ and $n = 30, 50, 100$; non-shaded regions correspond to Our method being superior.]

Let $\hat{p}_{\text{our}}$, $\hat{p}_{\text{MVE}}$ and $\hat{p}_{\text{SW}}$ represent the simulated outlier detection probabilities determined for our method, MVE and Sullivan and Woodall.

[Table 6: Estimated outlier detection probabilities for a sustained shift in the mean after $\Delta = 15$ time periods for $d = 2$ and $n = 30$ (MVE and SW as reported by Vargas (2003)); non-shaded regions correspond to Our method being superior.]

Assuming negligible variation in the simulated control limits, the approximate standard deviation of the difference $\hat{p}_{\text{our}} - \hat{p}_{\text{MVE}}$ can be represented as $s = \sqrt{\frac{1}{1500}\left(\hat{p}_{\text{our}}(1 - \hat{p}_{\text{our}}) + \hat{p}_{\text{MVE}}(1 - \hat{p}_{\text{MVE}})\right)}$. For selected $n$ and $d$, 90% lower one-sided Bonferroni family-wise confidence intervals were computed for the corresponding differences Our$-$MVE or Our$-$SW in Figure 3, Tables 5 and 6. Those intervals with a negative lower bound are shaded in the tables and indicate that the particular difference is not statistically significant. The differences not shaded are statistically significant, indicating that our approach is superior to the compared approach. Inspection of the shading indicates that our method is similar to the others for smaller values of ncp and superior to MVE or SW for most other ncp values when $k > 1$. Independent of the $n$, $d$ combination, the improved power of our method over the others to detect outliers in the presence of extreme masking ($\mathrm{ncp} > 5$) is apparent. Further, for $d = 2$ and $n = 30$, for all levels of masking considered ($\mathrm{ncp} \le 40$), our method is just as good as, and usually better than, the others (except when the usual method is applied and $k = 1$). Vargas (2003) recommended the simultaneous application of two outlier detection approaches (MVE and SW). Thus, to maintain an overall false detection probability of $\alpha = .05$, the actual outlier detection probabilities become less than the individual detection probabilities reported by Vargas (2003) due to the necessary increase of the corresponding control limits. Hence, the performance of our method compared to the combined MVE and SW methods is even better than suggested by the previous assessment. Jensen et al. (2007) performed extensive simulations for $n$, $d$ and $k$ combinations beyond

that of Vargas (2003).

[Figure 4: Probability of a signal for Our, MVE, MCD and the usual estimators, where $n = 50$, $d = 3$ and $k$ equals the number of outliers; panels (a) $k = 4$, (b) $k = 8$, (c) $k = 12$ and (d) $k = 16$.]

The more recent work of Jensen et al. (2007) noted that for $n \ge 50$, MCD tended to be a better method than MVE at detecting multiple outliers for $\mathrm{ncp} = 5, 10, 15, 20, 25$. Further, for $n < 50$, their work determined MVE to be the preferred method for detecting multiple outliers. We have done extensive additional simulations using our method for $\mathrm{ncp} = 5, 15$ and $25$, $d = 2, 3, 5$ and $10$ and $n = 30, 50, 75, 100$ and $125$. For $d > 2$ and all $n$ and ncp, our method proved equal to or better than MCD. For $d = 2$ and all $n$ and ncp, our method was equal to or better than MCD about 66% of the time. See Figure 4 for the relative performance of our method compared to MVE and MCD for $n = 50$, $d = 3$

and $k = 4, 8, 12$ and $16$. Our method outperforms both MVE and MCD in this example. Further, MCD performs better than MVE for larger $k$ and worse than MVE for smaller $k$. Additionally, our method showed an ability, and sometimes a strong ability, to detect shifts of size $\mathrm{ncp} \ge 15$ when the number of outliers $k$ exceeds the breakdown point of MCD and MVE. See Table 7 for our detection probabilities when $k$ exceeds the breakdown point. The control limits we determined for the new simulation showed some slight variation from those listed in Table 4 for $n = 30, 50$ and $100$.

[Table 7: Detection probabilities for Our method for $k$ larger than the breakdown point of MVE and MCD, for $d = 2, 3, 5, 10$ and selected $n$ and ncp; in particular, $k$ equals 50% of the sample size $n$.]

Our simulations were carried out using Matlab 7 (R2006b). Pseudo-random numbers were produced using the ziggurat algorithm implemented in the function mvnrnd of Matlab on a Pentium IV PC (2.6 GHz, 1 Gb RAM). For $d = 2$ and $n = 30$, the maximum average runtime over all ncp- and $k$-scenarios was approximately 0.31 sec. For $d = 10$ and $n = 125$, this number was 8.56 sec. The runtime grows superlinearly in $n$ and linearly in $d$.

Example

For two different data sets previously analyzed by Vargas (2003), we compared the outlier detection effectiveness of our two-step method to the MVE, the usual, the MCD and the two methods of Sullivan and Woodall. The data can be found in Table 8. Originally, the complete data set presented by Quesenberry (2001) had 11 variables, but for illustration purposes Vargas (2003) considered only two of the variables. Our method and all but one of the methods applied by

Vargas (2003) detected the same outlier for the data in Table 8. The MCD method failed to detect any outlier. Figure 5(a) displays the bivariate data and the corresponding outlier detection ellipsoids determined by MVE and by our two-step approach. For comparison purposes, observations 16 and 24 were modified by Vargas (2003) to (.469, 56.23) and (.496, 56.8). Only the MVE method detected them as outliers (the usual, the MCD and both Sullivan and Woodall methods failed to detect the 2 new outliers). Our two-step approach applied to the same altered data correctly detected the 2 new outliers.

[Figure 5: Outlier detection ellipsoids for the two data sets, MVE (dashed line) and our two-step method (solid line); panel (a) original sample, panel (b) altered sample, with the outlying observations annotated.]

[Figure 6: Estimated density for the two data sets; panel (a) original sample, panel (b) altered sample.]

Figure 5(b) gives the outlier detection ellipsoids determined by MVE and by our two-step approach for the altered data. For both data sets, our ellipsoid has a smaller volume (1.985) than that of MVE (2.491), detects the same outliers as MVE and has an inclination that better matches the data. This is consistent with the fact that the MVE does not take time ordering into account and is restricted to a bulk size which does not use information from the data. Figures 6(a) and 6(b) are the estimated bivariate density functions determined by our method for the unaltered and altered data. The slight skewness seen in Figure 6(a) corresponds to the presence of a single outlier in the unaltered data. The secondary mound seen in Figure 6(b) reflects the presence of the 3 identified outliers in the altered data.

[Table 8: Bivariate data set.]

The $\bar{x}_{\text{robust}}$ and $S_{\text{robust}}$ determined by equation (9) were computed for both the original and the altered data. Table 9 lists the detection signals determined by equation (10) for the original and altered samples, together with the corresponding control limit from Table 4. No jumps were found for either data set. Shaded regions in Table 9 correspond to signals (and observation indices) that exceed the control limit.
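Once Steps 1 and 2 have produced the bulk $B$, the signals of Table 9 follow from equations (9) and (10); a sketch, with the bulk supplied as row indices (names ours):

```python
import numpy as np

def detection_signals(X, bulk_idx):
    """Equations (9)-(10): xbar_robust and S_robust from the bulk B = C \\ C'',
    then T2_i for every point of the standardized sample X."""
    B = X[bulk_idx]
    xbar = B.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(B, rowvar=False))
    D = X - xbar
    return np.einsum('ij,jk,ik->i', D, S_inv, D)

# signals = detection_signals(X, bulk_idx)
# outliers = np.where(signals > ucl)[0]   # ucl: simulated control limit (Table 4)
```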

[Table 9: Detection signals for the original sample and the altered sample.]

Summary

We have developed a two-step method of identifying the largest bulk of similar multivariate data from a time-ordered sequence of individual multivariate responses. Using the mean vector and covariance matrix from the data in this bulk, control limits have been developed for selected $n$ and $d$ to determine whether any of the individual multivariate data points in the selected time-ordered sequence of size $n$ suggest the observed multivariate process of interest has shifted from a standard represented by the identified bulk of data. Extensive simulations assuming a multivariate normal distribution have shown that our method is as good as, and usually better than, the MVE and MCD at detecting multiple outliers for $d = 2, 3, 5$ and $10$ and $n = 30, 50, 75, 100$ and $125$. Further, our method shows strong ability at outlier detection even when the number of outliers exceeds the breakdown point for MCD and MVE. If outliers occur systematically, our method performs even better than if the outliers occur sporadically. Finally, the focus of the comparative aspect of our paper has been the detectability of up to 50% sample contamination. Actually, the flexibility of our method permits the detection of multiple shifts of different magnitudes in different directions where the total contamination could exceed 50%.

Appendix

A brief outline of the MEM algorithm according to Li, Ray and Lindsay (2007) is given. Let a mixture density be defined as $f(x) = \sum_{i=1}^{n} \pi_i f_i(x)$ at every point $x \in \mathbb{R}^d$, where $f_i$ is the unimodal density of mixture component $i$ and $\pi_i$ is its a priori probability. Given any initial value $x^{(0)} \in \mathbb{R}^d$, MEM finds a local maximum of the mixture density by alternating the following two steps, thus producing a sequence $\{x^{(r)}\}_{r \ge 0}$:

1. Let $p_i = \dfrac{\pi_i f_i(x^{(r)})}{f(x^{(r)})}$, $i = 1, \dots, n$.

2. Update $x^{(r+1)} = \arg\max_{x \in \mathbb{R}^d} \sum_{i=1}^{n} p_i \log f_i(x)$.

The first step is the expectation step, where the a posteriori probability of each mixture component $i$ at the current point $x^{(r)}$ is computed. The second step is the maximization step. The function $\sum_{i=1}^{n} p_i \log f_i(x)$ has a unique maximum due to the unimodality of the $f_i$. According to Wu (1983), if the $f_i$ are normal densities, all the limit points of $\{x^{(r)}\}_r$ are stationary points of $f$, i.e., $\operatorname{grad} f(x) = 0$ if $x = \lim_{r \to \infty} x^{(r)}$ and $f$ is smooth. It is possible that $\{x^{(r)}\}_r$ converges to a stationary, but not locally maximal, point. A detailed treatment of the convergence of EM-style algorithms can be found in Wu (1983). For the practical use of MEM, it is sufficient to define a termination rule, e.g., stop if $\frac{\|x^{(r+1)} - x^{(r)}\|}{\max\{\|x^{(r+1)}\|,\, 1\}} < \varepsilon$ for a small $\varepsilon > 0$.

We present here a simplification of the nonparametric clustering algorithm of Li, Ray and Lindsay (2007). Due to the selection of $H$ in equation (8), the construction of a hierarchy of clusters by gradually increasing the bandwidth of Gaussian kernels can be omitted. Let $X$ be the set of data to be clustered. A nonparametric density estimator is formed for a nonsingular $H$ according to (5):

$$\hat{f}_H(x) = \frac{1}{n \det H} \sum_{i=1}^{n} \varphi(x \mid x_i, H), \qquad (11)$$

where $\varphi(\cdot \mid \mu, \Sigma)$ is the probability density function of a normal random variable with mean $\mu$ and covariance matrix $\Sigma$, i.e., $\varphi(x \mid x_i, H) = \varphi(H^{-1}(x - x_i))$ for $\varphi(x) = (2\pi)^{-d/2} \exp(-\|x\|^2/2)$. The clustering algorithm reads as follows:

1. Form a kernel density $\hat{f}_H(x)$ as in (11).

2. Use $\hat{f}_H(x)$ as the density function. Use each $x_i$, $i = 1, \dots, n$, as the initial value in the MEM algorithm described earlier in the appendix. Let the mode identified by starting from $x_i$ be $m_H(x_i)$.

3. Extract the distinctive values from the set $\{m_H(x_i) \mid i = 1, \dots, n\}$ to form a set $M$. Label the elements in $M$ from 1 to $|M|$. In practice, due to finite precision, two modes are regarded as identical if they agree up to a small numerical tolerance.
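For the Gaussian kernel mixture (11), the maximization step has a closed form: maximizing $\sum_i p_i \log \varphi(H^{-1}(x - x_i))$ over $x$ gives the weighted mean $\sum_i p_i x_i$, so MEM reduces to a fixed-point iteration of mean-shift type. A sketch (names ours):

```python
import numpy as np

def mem_mode(x0, X, H, eps=1e-6, max_iter=1000):
    """MEM ascent to a mode of the kernel mixture (11) with the Gaussian kernel."""
    H_inv = np.linalg.inv(H)
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        U = (x - X) @ H_inv.T
        p = np.exp(-0.5 * np.einsum('ij,ij->i', U, U))   # E-step (common factors cancel)
        p /= p.sum()
        x_new = p @ X                                    # M-step: weighted mean of the data
        if np.linalg.norm(x_new - x) / max(np.linalg.norm(x_new), 1.0) < eps:
            return x_new                                 # termination rule from the text
        x = x_new
    return x

# MAC: run mem_mode from every x_i and assign points whose modes coincide
# (up to a small tolerance) to the same cluster.
```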


Robust estimation of scale and covariance with P n and its application to precision matrix estimation Robust estimation of scale and covariance with P n and its application to precision matrix estimation Garth Tarr, Samuel Müller and Neville Weber USYD 2013 School of Mathematics and Statistics THE UNIVERSITY

More information

Smooth simultaneous confidence bands for cumulative distribution functions

Smooth simultaneous confidence bands for cumulative distribution functions Journal of Nonparametric Statistics, 2013 Vol. 25, No. 2, 395 407, http://dx.doi.org/10.1080/10485252.2012.759219 Smooth simultaneous confidence bands for cumulative distribution functions Jiangyan Wang

More information

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles

Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Generalized Multivariate Rank Type Test Statistics via Spatial U-Quantiles Weihua Zhou 1 University of North Carolina at Charlotte and Robert Serfling 2 University of Texas at Dallas Final revision for

More information

CHAPTER 5. Outlier Detection in Multivariate Data

CHAPTER 5. Outlier Detection in Multivariate Data CHAPTER 5 Outlier Detection in Multivariate Data 5.1 Introduction Multivariate outlier detection is the important task of statistical analysis of multivariate data. Many methods have been proposed for

More information

Estimating Gaussian Mixture Densities with EM A Tutorial

Estimating Gaussian Mixture Densities with EM A Tutorial Estimating Gaussian Mixture Densities with EM A Tutorial Carlo Tomasi Due University Expectation Maximization (EM) [4, 3, 6] is a numerical algorithm for the maximization of functions of several variables

More information

Asymptotic Relative Efficiency in Estimation

Asymptotic Relative Efficiency in Estimation Asymptotic Relative Efficiency in Estimation Robert Serfling University of Texas at Dallas October 2009 Prepared for forthcoming INTERNATIONAL ENCYCLOPEDIA OF STATISTICAL SCIENCES, to be published by Springer

More information

ON THE CALCULATION OF A ROBUST S-ESTIMATOR OF A COVARIANCE MATRIX

ON THE CALCULATION OF A ROBUST S-ESTIMATOR OF A COVARIANCE MATRIX STATISTICS IN MEDICINE Statist. Med. 17, 2685 2695 (1998) ON THE CALCULATION OF A ROBUST S-ESTIMATOR OF A COVARIANCE MATRIX N. A. CAMPBELL *, H. P. LOPUHAA AND P. J. ROUSSEEUW CSIRO Mathematical and Information

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Introduction to Robust Statistics. Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy

Introduction to Robust Statistics. Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy Introduction to Robust Statistics Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy Multivariate analysis Multivariate location and scatter Data where the observations

More information

Mustafa H. Tongarlak Bruce E. Ankenman Barry L. Nelson

Mustafa H. Tongarlak Bruce E. Ankenman Barry L. Nelson Proceedings of the 0 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelspach, K. P. White, and M. Fu, eds. RELATIVE ERROR STOCHASTIC KRIGING Mustafa H. Tongarlak Bruce E. Ankenman Barry L.

More information

Identification of Multivariate Outliers: A Performance Study

Identification of Multivariate Outliers: A Performance Study AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 127 138 Identification of Multivariate Outliers: A Performance Study Peter Filzmoser Vienna University of Technology, Austria Abstract: Three

More information

Research Article Robust Control Charts for Monitoring Process Mean of Phase-I Multivariate Individual Observations

Research Article Robust Control Charts for Monitoring Process Mean of Phase-I Multivariate Individual Observations Journal of Quality and Reliability Engineering Volume 3, Article ID 4, 4 pages http://dx.doi.org/./3/4 Research Article Robust Control Charts for Monitoring Process Mean of Phase-I Multivariate Individual

More information

Clustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation

Clustering by Mixture Models. General background on clustering Example method: k-means Mixture model based clustering Model estimation Clustering by Mixture Models General bacground on clustering Example method: -means Mixture model based clustering Model estimation 1 Clustering A basic tool in data mining/pattern recognition: Divide

More information

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan Clustering CSL465/603 - Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Supervised vs Unsupervised Learning Supervised learning Given x ", y " "%& ', learn a function f: X Y Categorical output classification

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 004 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER II STATISTICAL METHODS The Society provides these solutions to assist candidates preparing for the examinations in future

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

Bayesian Decision Theory

Bayesian Decision Theory Bayesian Decision Theory Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent University) 1 / 46 Bayesian

More information

Modelling Non-linear and Non-stationary Time Series

Modelling Non-linear and Non-stationary Time Series Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series Analysis September 206 Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September

More information

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means

More information

Statistical Data Analysis

Statistical Data Analysis DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the

More information

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate

Mixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means

More information

The S-estimator of multivariate location and scatter in Stata

The S-estimator of multivariate location and scatter in Stata The Stata Journal (yyyy) vv, Number ii, pp. 1 9 The S-estimator of multivariate location and scatter in Stata Vincenzo Verardi University of Namur (FUNDP) Center for Research in the Economics of Development

More information

Parametric Empirical Bayes Methods for Microarrays

Parametric Empirical Bayes Methods for Microarrays Parametric Empirical Bayes Methods for Microarrays Ming Yuan, Deepayan Sarkar, Michael Newton and Christina Kendziorski April 30, 2018 Contents 1 Introduction 1 2 General Model Structure: Two Conditions

More information

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite

More information

Time Series and Forecasting Lecture 4 NonLinear Time Series

Time Series and Forecasting Lecture 4 NonLinear Time Series Time Series and Forecasting Lecture 4 NonLinear Time Series Bruce E. Hansen Summer School in Economics and Econometrics University of Crete July 23-27, 2012 Bruce Hansen (University of Wisconsin) Foundations

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

The Performance of Mutual Information for Mixture of Bivariate Normal Distributions Based on Robust Kernel Estimation

The Performance of Mutual Information for Mixture of Bivariate Normal Distributions Based on Robust Kernel Estimation Applied Mathematical Sciences, Vol. 4, 2010, no. 29, 1417-1436 The Performance of Mutual Information for Mixture of Bivariate Normal Distributions Based on Robust Kernel Estimation Kourosh Dadkhah 1 and

More information

Fast and robust bootstrap for LTS

Fast and robust bootstrap for LTS Fast and robust bootstrap for LTS Gert Willems a,, Stefan Van Aelst b a Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, B-2020 Antwerp, Belgium b Department of

More information

Joint Estimation of Risk Preferences and Technology: Further Discussion

Joint Estimation of Risk Preferences and Technology: Further Discussion Joint Estimation of Risk Preferences and Technology: Further Discussion Feng Wu Research Associate Gulf Coast Research and Education Center University of Florida Zhengfei Guan Assistant Professor Gulf

More information

Unsupervised machine learning

Unsupervised machine learning Chapter 9 Unsupervised machine learning Unsupervised machine learning (a.k.a. cluster analysis) is a set of methods to assign objects into clusters under a predefined distance measure when class labels

More information

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location

Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location Design and Implementation of CUSUM Exceedance Control Charts for Unknown Location MARIEN A. GRAHAM Department of Statistics University of Pretoria South Africa marien.graham@up.ac.za S. CHAKRABORTI Department

More information

PARSIMONIOUS MULTIVARIATE COPULA MODEL FOR DENSITY ESTIMATION. Alireza Bayestehtashk and Izhak Shafran

PARSIMONIOUS MULTIVARIATE COPULA MODEL FOR DENSITY ESTIMATION. Alireza Bayestehtashk and Izhak Shafran PARSIMONIOUS MULTIVARIATE COPULA MODEL FOR DENSITY ESTIMATION Alireza Bayestehtashk and Izhak Shafran Center for Spoken Language Understanding, Oregon Health & Science University, Portland, Oregon, USA

More information

Composite Hypotheses and Generalized Likelihood Ratio Tests

Composite Hypotheses and Generalized Likelihood Ratio Tests Composite Hypotheses and Generalized Likelihood Ratio Tests Rebecca Willett, 06 In many real world problems, it is difficult to precisely specify probability distributions. Our models for data may involve

More information

BAYESIAN DECISION THEORY

BAYESIAN DECISION THEORY Last updated: September 17, 2012 BAYESIAN DECISION THEORY Problems 2 The following problems from the textbook are relevant: 2.1 2.9, 2.11, 2.17 For this week, please at least solve Problem 2.3. We will

More information

Regression Analysis for Data Containing Outliers and High Leverage Points

Regression Analysis for Data Containing Outliers and High Leverage Points Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain

More information

Robust scale estimation with extensions

Robust scale estimation with extensions Robust scale estimation with extensions Garth Tarr, Samuel Müller and Neville Weber School of Mathematics and Statistics THE UNIVERSITY OF SYDNEY Outline The robust scale estimator P n Robust covariance

More information

Solving Corrupted Quadratic Equations, Provably

Solving Corrupted Quadratic Equations, Provably Solving Corrupted Quadratic Equations, Provably Yuejie Chi London Workshop on Sparse Signal Processing September 206 Acknowledgement Joint work with Yuanxin Li (OSU), Huishuai Zhuang (Syracuse) and Yingbin

More information

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 4: Measures of Robustness, Robust Principal Component Analysis

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 4: Measures of Robustness, Robust Principal Component Analysis MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 4:, Robust Principal Component Analysis Contents Empirical Robust Statistical Methods In statistics, robust methods are methods that perform well

More information

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline.

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline. MFM Practitioner Module: Risk & Asset Allocation September 11, 2013 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y

More information

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu

Dimension Reduction Techniques. Presented by Jie (Jerry) Yu Dimension Reduction Techniques Presented by Jie (Jerry) Yu Outline Problem Modeling Review of PCA and MDS Isomap Local Linear Embedding (LLE) Charting Background Advances in data collection and storage

More information

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages:

Glossary. The ISI glossary of statistical terms provides definitions in a number of different languages: Glossary The ISI glossary of statistical terms provides definitions in a number of different languages: http://isi.cbs.nl/glossary/index.htm Adjusted r 2 Adjusted R squared measures the proportion of the

More information

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood

Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Spatially Smoothed Kernel Density Estimation via Generalized Empirical Likelihood Kuangyu Wen & Ximing Wu Texas A&M University Info-Metrics Institute Conference: Recent Innovations in Info-Metrics October

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

A Bayesian Criterion for Clustering Stability

A Bayesian Criterion for Clustering Stability A Bayesian Criterion for Clustering Stability B. Clarke 1 1 Dept of Medicine, CCS, DEPH University of Miami Joint with H. Koepke, Stat. Dept., U Washington 26 June 2012 ISBA Kyoto Outline 1 Assessing Stability

More information

368 XUMING HE AND GANG WANG of convergence for the MVE estimator is n ;1=3. We establish strong consistency and functional continuity of the MVE estim

368 XUMING HE AND GANG WANG of convergence for the MVE estimator is n ;1=3. We establish strong consistency and functional continuity of the MVE estim Statistica Sinica 6(1996), 367-374 CROSS-CHECKING USING THE MINIMUM VOLUME ELLIPSOID ESTIMATOR Xuming He and Gang Wang University of Illinois and Depaul University Abstract: We show that for a wide class

More information

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures

Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures Maximum Smoothed Likelihood for Multivariate Nonparametric Mixtures David Hunter Pennsylvania State University, USA Joint work with: Tom Hettmansperger, Hoben Thomas, Didier Chauveau, Pierre Vandekerkhove,

More information

A Multivariate Process Variability Monitoring Based on Individual Observations

A Multivariate Process Variability Monitoring Based on Individual Observations www.ccsenet.org/mas Modern Applied Science Vol. 4, No. 10; October 010 A Multivariate Process Variability Monitoring Based on Individual Observations Maman A. Djauhari (Corresponding author) Department

More information

5. Discriminant analysis

5. Discriminant analysis 5. Discriminant analysis We continue from Bayes s rule presented in Section 3 on p. 85 (5.1) where c i is a class, x isap-dimensional vector (data case) and we use class conditional probability (density

More information

Nonparametric Modal Regression

Nonparametric Modal Regression Nonparametric Modal Regression Summary In this article, we propose a new nonparametric modal regression model, which aims to estimate the mode of the conditional density of Y given predictors X. The nonparametric

More information

ECE 275B Homework #2 Due Thursday 2/12/2015. MIDTERM is Scheduled for Thursday, February 19, 2015

ECE 275B Homework #2 Due Thursday 2/12/2015. MIDTERM is Scheduled for Thursday, February 19, 2015 Reading ECE 275B Homework #2 Due Thursday 2/12/2015 MIDTERM is Scheduled for Thursday, February 19, 2015 Read and understand the Newton-Raphson and Method of Scores MLE procedures given in Kay, Example

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning Multivariate Gaussians Mark Schmidt University of British Columbia Winter 2019 Last Time: Multivariate Gaussian http://personal.kenyon.edu/hartlaub/mellonproject/bivariate2.html

More information

Accurate and Powerful Multivariate Outlier Detection

Accurate and Powerful Multivariate Outlier Detection Int. Statistical Inst.: Proc. 58th World Statistical Congress, 11, Dublin (Session CPS66) p.568 Accurate and Powerful Multivariate Outlier Detection Cerioli, Andrea Università di Parma, Dipartimento di

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

Nonrobust and Robust Objective Functions

Nonrobust and Robust Objective Functions Nonrobust and Robust Objective Functions The objective function of the estimators in the input space is built from the sum of squared Mahalanobis distances (residuals) d 2 i = 1 σ 2(y i y io ) C + y i

More information

Improved Feasible Solution Algorithms for. High Breakdown Estimation. Douglas M. Hawkins. David J. Olive. Department of Applied Statistics

Improved Feasible Solution Algorithms for. High Breakdown Estimation. Douglas M. Hawkins. David J. Olive. Department of Applied Statistics Improved Feasible Solution Algorithms for High Breakdown Estimation Douglas M. Hawkins David J. Olive Department of Applied Statistics University of Minnesota St Paul, MN 55108 Abstract High breakdown

More information

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1

EEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1 EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle

More information

Detecting outliers in weighted univariate survey data

Detecting outliers in weighted univariate survey data Detecting outliers in weighted univariate survey data Anna Pauliina Sandqvist October 27, 21 Preliminary Version Abstract Outliers and influential observations are a frequent concern in all kind of statistics,

More information

Nonparametric Methods

Nonparametric Methods Nonparametric Methods Michael R. Roberts Department of Finance The Wharton School University of Pennsylvania July 28, 2009 Michael R. Roberts Nonparametric Methods 1/42 Overview Great for data analysis

More information

INVARIANT COORDINATE SELECTION

INVARIANT COORDINATE SELECTION INVARIANT COORDINATE SELECTION By David E. Tyler 1, Frank Critchley, Lutz Dümbgen 2, and Hannu Oja Rutgers University, Open University, University of Berne and University of Tampere SUMMARY A general method

More information

Supervised Learning: Non-parametric Estimation

Supervised Learning: Non-parametric Estimation Supervised Learning: Non-parametric Estimation Edmondo Trentin March 18, 2018 Non-parametric Estimates No assumptions are made on the form of the pdfs 1. There are 3 major instances of non-parametric estimates:

More information

Statistical inference on Lévy processes

Statistical inference on Lévy processes Alberto Coca Cabrero University of Cambridge - CCA Supervisors: Dr. Richard Nickl and Professor L.C.G.Rogers Funded by Fundación Mutua Madrileña and EPSRC MASDOC/CCA student workshop 2013 26th March Outline

More information

Gradient Descent. Sargur Srihari

Gradient Descent. Sargur Srihari Gradient Descent Sargur srihari@cedar.buffalo.edu 1 Topics Simple Gradient Descent/Ascent Difficulties with Simple Gradient Descent Line Search Brent s Method Conjugate Gradient Descent Weight vectors

More information

Data Exploration and Unsupervised Learning with Clustering

Data Exploration and Unsupervised Learning with Clustering Data Exploration and Unsupervised Learning with Clustering Paul F Rodriguez,PhD San Diego Supercomputer Center Predictive Analytic Center of Excellence Clustering Idea Given a set of data can we find a

More information

Pattern Recognition Approaches to Solving Combinatorial Problems in Free Groups

Pattern Recognition Approaches to Solving Combinatorial Problems in Free Groups Contemporary Mathematics Pattern Recognition Approaches to Solving Combinatorial Problems in Free Groups Robert M. Haralick, Alex D. Miasnikov, and Alexei G. Myasnikov Abstract. We review some basic methodologies

More information

Detecting outliers and/or leverage points: a robust two-stage procedure with bootstrap cut-off points

Detecting outliers and/or leverage points: a robust two-stage procedure with bootstrap cut-off points Detecting outliers and/or leverage points: a robust two-stage procedure with bootstrap cut-off points Ettore Marubini (1), Annalisa Orenti (1) Background: Identification and assessment of outliers, have

More information