# Supplementary Material for Wang and Serfling paper

## Transcription

March 6

### 1 Simulation study

Here we provide a simulation study to compare empirically the masking and swamping robustness of our selected outlyingness functions. Two parametric parent distributions are considered: the bivariate standard normal, which is symmetric, and the bivariate exponential with mean (0.5, 0.5), which is right-skewed.

#### 1.1 Simulation plan

We generate samples separately from the above two distributions with three different sample sizes, $n = 50$, $500$, and $2000$. We consider contamination models with true positive rate $\varepsilon$, for $\varepsilon = 0.02$, $0.1$, $0.25$, and $0.4$. For each model, the number of replacements in the sample is given by $m = n\varepsilon$. The numbers of replications are chosen to be 5000 for $n = 50$, 2000 for $n = 500$, and 500 for $n = 2000$.

To choose the sample outlyingness thresholds $\lambda$, we assume false positive rates under no contamination to be given by $\alpha = 0.02$ for $n = 50$, and $\alpha = 0.01$ for $n = 500$ and $n = 2000$. That is, under no contamination, we expect about 1 observation, 5 observations, and 20 observations with outlyingness values $> \lambda$ for $n = 50$, $500$, and $2000$, respectively. On this basis, we take as sample outlyingness threshold $\lambda$ the 2nd largest, the 6th largest, and the 21st largest value of the observed sample outlyingness function, for $n = 50$, $500$, and $2000$, respectively.

For convenience, we index $X_i$, $i = 1, 2, \ldots, n$, in order of their Euclidean norms, from smallest to largest.

We explore masking and swamping robustness of the following four affine invariant sample outlier identifiers treated in Section 4 of the paper:

- Classical Mahalanobis distance outlyingness (CMD), i.e., $O_{MD}(x, \mathbb{X}_n)$ with $(\hat{\mu}, \hat{\Sigma}) = (\bar{X}, S)$.
- Robust Mahalanobis distance outlyingness (RMD), i.e., $O_{MD}(x, \mathbb{X}_n)$ with $(\hat{\mu}, \hat{\Sigma})$ given by the MCD estimators. Here these are defined taking subsample size $h = \lfloor (n+d+1)/2 \rfloor$, which for $d = 2$ yields 26 for $n = 50$, 251 for $n = 500$, and 1001 for $n = 2000$. The `robustbase` package in R is used to obtain approximate MCD estimators via the Fast-MCD algorithm of Rousseeuw and Van Driessen (1999).
- Robust Mahalanobis spatial outlyingness (RMS), i.e., $O_{MS}(x, \mathbb{X}_n)$ with the MCD covariance estimator and the subsample size $h$ chosen as for RMD.
- Projection outlyingness (P), i.e., $O_P(x, \mathbb{X}_n)$ with univariate location and scale given by $(\mathrm{Med}, \mathrm{MAD}_{d-1})$. For computational convenience, only a finite number (1000) of randomly chosen projection directions are used in our simulation.

We measure and compare the masking and swamping robustness of the above outlyingness functions using the following two performance indices:

- Percent of data points masked among the $m$ outliers. If all $m$ outliers are masked, this indicates masking breakdown.
- Percent of data points swamped among the nonoutliers. Here the number of nonoutliers is at most $n - m$. If all $n - m$ nonoutliers are swamped, this indicates swamping breakdown.

Two scenarios in relation to outliers (similar to Dang and Serfling, 2010) are considered:

- A: Replace $X_{n-m+1}, \ldots, X_{n-1}, X_n$ by $KX_{n-m+1}, \ldots, KX_{n-1}, KX_n$ for an inflation factor $K = 5$. Denote the modified data set by $\mathbb{X}^{(A)}_{n,m}$.
- B: Replace $X_{n-m+1}, \ldots, X_{n-1}, X_n$ by $KX_n, \ldots, KX_n, KX_n$, again with inflation factor $K = 5$. Denote the modified data set by $\mathbb{X}^{(B)}_{n,m}$.

These are illustrated in Figure 1 for the two parent distributions, with sample size $n = 50$ and $m = 5$ replacements.
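The contamination scenarios and two of the four identifiers can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the authors' code (the source reports using R): all function names are hypothetical, the raw Mahalanobis distance stands in for the CMD outlyingness, the ordinary MAD is used as the projection scale in place of whatever variant the paper specifies, and random directions are drawn uniformly on the unit sphere. RMD and RMS are omitted because they need an MCD fit.

```python
import numpy as np


def mcd_subsample_size(n, d):
    """Subsample size h = floor((n + d + 1) / 2) quoted for the MCD estimators."""
    return (n + d + 1) // 2


def sort_by_norm(X):
    """Index the points by Euclidean norm, smallest to largest."""
    return X[np.argsort(np.linalg.norm(X, axis=1))]


def contaminate(X_sorted, m, K=5.0, scenario="A"):
    """Replace the last m rows of a norm-sorted sample (the m largest norms)."""
    Y = np.array(X_sorted, dtype=float)  # copy, so the input is left untouched
    if m == 0:
        return Y
    if scenario == "A":
        Y[-m:] = K * X_sorted[-m:]   # each point inflated individually
    elif scenario == "B":
        Y[-m:] = K * X_sorted[-1]    # all m points moved to K * X_n (a cluster)
    else:
        raise ValueError("scenario must be 'A' or 'B'")
    return Y


def classical_md(x, data):
    """Mahalanobis distance of each row of x from (sample mean, sample covariance)."""
    mu = data.mean(axis=0)
    S = np.cov(data, rowvar=False)
    diff = np.atleast_2d(x) - mu
    solved = np.linalg.solve(S, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


def projection_outlyingness(x, data, n_dirs=1000, rng=None):
    """Max over random unit directions u of |u'x - Med(u'X)| / MAD(u'X)."""
    rng = np.random.default_rng(rng)
    U = rng.standard_normal((n_dirs, data.shape[1]))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    proj = data @ U.T                                  # shape (n, n_dirs)
    med = np.median(proj, axis=0)
    mad = np.median(np.abs(proj - med), axis=0)        # plain MAD (assumption)
    return np.max(np.abs(np.atleast_2d(x) @ U.T - med) / mad, axis=1)


# Example: n = 50 bivariate standard normal points, m = 5 replacements.
rng = np.random.default_rng(0)
X = sort_by_norm(rng.standard_normal((50, 2)))
XA = contaminate(X, m=5, scenario="A")
XB = contaminate(X, m=5, scenario="B")
cmd_A = classical_md(XA, XA)
proj_A = projection_outlyingness(XA, XA, rng=1)
```

Keeping the sample sorted by norm makes both scenarios a single slice assignment, and it makes the overlap between the Scenario B cluster and the last Scenario A replacement explicit, since both equal `K * X[-1]`.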

Figure 1. Scenarios A and B with sample size 50 and 5 replacements, for the bivariate standard normal distribution (left) and the bivariate exponential(0.5, 0.5) distribution (right). The 5 observations in the original sample with largest Euclidean norms are marked +. Their replacements in Scenario A and in Scenario B are each marked with a separate symbol. Note that the 5 replacements in Scenario B overlap each other and also one of the Scenario A replacements.

#### 1.2 Simulation results

The following sections present simulation results on masking and swamping for the two distributions, respectively.

##### 1.2.1 Results for contaminated bivariate standard normal

Table 1 shows for each procedure the average sample outlyingness thresholds $\lambda$ for the replicated samples of sizes 50, 500, and 2000 from the bivariate standard normal. As described earlier, the threshold is chosen according to the false positive rate $\alpha$ under no contamination, which is 0.02 for $n = 50$ and 0.01 for $n = 500$ and $2000$.
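The threshold rule recalled here can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code; the names `threshold_rank` and `sample_threshold` are hypothetical, and rounding n·α to the nearest integer is an assumption that is exact for the three (n, α) pairs used in the study.

```python
import numpy as np


def threshold_rank(n, alpha):
    """Rank from the top of the value used as threshold: (expected flags) + 1."""
    expected_flagged = int(round(n * alpha))  # about 1, 5, 20 in the study
    return expected_flagged + 1


def sample_threshold(outlyingness, alpha):
    """Return the threshold: the (n*alpha + 1)-th largest outlyingness value."""
    values = np.sort(np.asarray(outlyingness, dtype=float))[::-1]  # descending
    return values[threshold_rank(len(values), alpha) - 1]


# The three settings of the simulation plan.
ranks = {n: threshold_rank(n, a) for n, a in [(50, 0.02), (500, 0.01), (2000, 0.01)]}
```

With a strict comparison `outlyingness > threshold` and no ties, exactly n·α observations are flagged in an uncontaminated sample, which matches the "2nd, 6th, and 21st largest" choice.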

[Table 1 layout: rows CMD, RMD, RMS, P; columns $n = 50$ (5000 trials), $n = 500$ (2000 trials), $n = 2000$ (500 trials). Numeric entries not preserved.]

Table 1. Average sample outlyingness thresholds $\lambda$ for CMD, RMD, RMS and P, with $n = 50$, $500$ and $2000$ under no contamination, based on the bivariate normal model.

###### Masking performance

Table 2 displays the masking robustness of CMD, RMD, RMS and P under scenarios A and B with the four different contamination levels, based on the standard normal model. Here masking breakdown occurs in a given sample when all $m$ outliers (replaced data points) are masked.

[Table 2 layout: average percent (%) masked among $m = n\varepsilon$ replacement outliers. Columns: scenarios A and B for each of $\varepsilon = 0.02$, $0.10$, $0.25$, $0.40$, followed by MBP. Rows: CMD, RMD, RMS, P within each of $n = 50$ (5000 trials), $n = 500$ (2000 trials), $n = 2000$ (500 trials). Numeric entries not preserved.]

Table 2. Masking performance of CMD, RMD, RMS and P for bivariate standard normal samples with $n = 50$, $500$ and $2000$. As a benchmark, the MBP column gives the theoretical MBP values. As $\varepsilon$ increases, masking occurs more frequently, especially when $\varepsilon >$ MBP. Masking breakdown in a sample occurs when the percent of outliers masked is 100.

[…] on the standard normal model. Here swamping breakdown occurs in a given sample when all $n - m$ nonoutliers (nonreplaced data points) are swamped.

[Table 3 layout: average percent (%) swamped among $n - m = n(1 - \varepsilon)$ nonoutliers. Columns: scenarios A and B for each of $\varepsilon = 0.02$, $0.10$, $0.25$, $0.40$, followed by SBP. Rows: CMD, RMD, RMS, P within each of $n = 50$ (5000 trials), $n = 500$ (2000 trials), $n = 2000$ (500 trials). Numeric entries not preserved.]

Table 3. Swamping performance of CMD, RMD, RMS and P for bivariate standard normal samples with $n = 50$, $500$ and $2000$. As a benchmark, the SBP column gives the theoretical SBP values. Swamping breakdown in a sample occurs when the percent of nonoutliers swamped is 100.

Comments based on Table 3. Except for Scenario B at contamination level $\varepsilon = 0.40$, all four procedures exhibit very low swamping, with CMD also performing very well and RMS moderately well even for this extreme scenario. Thus, for the standard normal contamination model, CMD and RMS perform best with respect to swamping robustness, although for masking robustness they are outperformed by RMD and P. These findings corroborate the high SBP values of the four procedures (especially high for CMD).

##### 1.2.2 Results for contaminated bivariate exponential

Table 4 shows for each of the four procedures the average sample outlyingness thresholds $\lambda$ for the replicated samples of sizes 50, 500, and 2000 from the bivariate exponential(0.5, 0.5). Again, the threshold is chosen according to the false positive rate $\alpha$ under no contamination, which is 0.02 for $n = 50$ and 0.01 for $n = 500$ and $2000$.

[Table 4 layout: rows CMD, RMD, RMS, P; columns $n = 50$ (5000 trials), $n = 500$ (2000 trials), $n = 2000$ (500 trials). Numeric entries not preserved.]

Table 4. Average sample outlyingness thresholds $\lambda$ for CMD, RMD, RMS and P, with $n = 50$, $500$ and $2000$ under no contamination, based on the bivariate exponential(0.5, 0.5) model.

###### Masking performance

Table 5 displays the masking robustness of CMD, RMD, RMS and P under scenarios A and B with the four different contamination levels, based on the bivariate exponential(0.5, 0.5) model. Here masking breakdown occurs in a given sample when all $m$ outliers (replaced data points) are masked.

[Table 5 layout: average percent (%) masked among $m = n\varepsilon$ replacement outliers. Columns: scenarios A and B for each of $\varepsilon = 0.02$, $0.10$, $0.25$, $0.40$, followed by MBP. Rows: CMD, RMD, RMS, P within each of $n = 50$ (5000 trials), $n = 500$ (2000 trials), $n = 2000$ (500 trials). Numeric entries not preserved.]

Table 5. Masking performance of CMD, RMD, RMS and P for bivariate exponential(0.5, 0.5) samples with $n = 50$, $500$ and $2000$. The MBP column gives the theoretical MBP values. Masking breakdown in a sample occurs when the percent of outliers masked is 100.

Comments based on Table 5. Both CMD and RMS lack masking robustness, while RMD and P maintain high masking robustness at all contamination levels below the theoretical MBPs. As added perspective, we note that RMD is slightly better than P for Scenario A, while the reverse holds for Scenario B. Of course, when imposing elliptical contours is not appropriate, P is the more suitable choice.

###### Swamping performance

Table 6 displays the swamping robustness of CMD, RMD, RMS and P under scenarios A and B with the four different contamination levels, based on the bivariate exponential(0.5, 0.5) model. Swamping breakdown occurs in a given sample when all $n - m$ nonoutliers (nonreplaced data points) are swamped.

[Table 6 layout: average percent (%) swamped among $n - m = n(1 - \varepsilon)$ nonoutliers. Columns: scenarios A and B for each of $\varepsilon = 0.02$, $0.1$, $0.25$, $0.4$, followed by SBP. Rows: CMD, RMD, RMS, P within each of $n = 50$ (5000 trials), $n = 500$ (2000 trials), $n = 2000$ (500 trials). Numeric entries not preserved.]

Table 6. Swamping performance of CMD, RMD, RMS and P for bivariate exponential(0.5, 0.5) samples with $n = 50$, $500$ and $2000$. The SBP column gives the theoretical SBP values. Swamping breakdown in a sample occurs when the percent of nonoutliers swamped is 100.

Comments based on Table 6. CMD and RMS exhibit excellent swamping robustness. RMD follows closely, with weakness only in the case of a large cluster of outliers, and P follows RMD competitively.
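The two performance indices reported in the masking and swamping tables can be computed for a single sample roughly as follows. This is a hedged sketch, not the authors' implementation: the function name `masking_swamping_percent` is hypothetical, the threshold is taken as a given input, and a point counts as flagged when its outlyingness strictly exceeds the threshold.

```python
import numpy as np


def masking_swamping_percent(outlyingness, threshold, is_outlier):
    """Percent of true outliers not flagged, and percent of nonoutliers flagged."""
    o = np.asarray(outlyingness, dtype=float)
    is_outlier = np.asarray(is_outlier, dtype=bool)
    flagged = o > threshold

    n_out = int(is_outlier.sum())        # m replaced points
    n_non = int((~is_outlier).sum())     # n - m nonreplaced points

    masked = int((is_outlier & ~flagged).sum())   # outliers that escape detection
    swamped = int((~is_outlier & flagged).sum())  # nonoutliers wrongly flagged

    pct_masked = 100.0 * masked / n_out if n_out else 0.0
    pct_swamped = 100.0 * swamped / n_non if n_non else 0.0
    return pct_masked, pct_swamped


# Toy sample: three nonoutliers followed by three replaced points.
scores = [0.1, 0.2, 0.3, 5.0, 0.4, 6.0]
truth = [False, False, False, True, True, True]
pct_masked, pct_swamped = masking_swamping_percent(scores, 1.0, truth)

# Breakdown in this sample corresponds to a value of 100.
masking_breakdown = pct_masked == 100.0
swamping_breakdown = pct_swamped == 100.0
```

Averaging the two percentages over the replicated samples gives quantities of the kind tabulated above; in the toy sample one of the three replaced points falls below the threshold and no nonoutlier exceeds it.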

##### 1.2.3 Practical recommendations based on the simulation results

Although its swamping performance is optimal, CMD has unacceptably low masking performance and hence is not recommended for use in practice. The masking robustness of RMS is weak at high sample thresholds $\lambda$, yet it has good swamping performance and its computational burden is relatively light. Both RMD and P are robust with respect to both masking and swamping, with RMD having less computational complexity. It is worth noting, however, that in the contaminated bivariate standard normal model, both the masking performance of RMD and the swamping performance of RMD and P degrade in the presence of a large cluster of outliers.

One may select the appropriate outlier identifier according to the preferred balance of masking robustness versus swamping robustness, in conjunction with consideration of the appropriateness or not of elliptical contours, the computational burden, and whether or not protection against a large cluster is desired.

These recommendations, based on results in two specific parametric models for the data, are consistent with the results based on the MBPs and SBPs, which, of course, are nonparametric and have wide general application across diverse data settings.

### References

[1] Dang, X. and Serfling, R. (2010). Nonparametric depth-based multivariate outlier identifiers, and masking robustness properties. Journal of Statistical Planning and Inference.

[2] Rousseeuw, P. and Van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics.


Institute for Empirical Research in Economics University of Zurich Working Paper Series ISSN 1424-0459 Working Paper No. 516 Robust Performance Hypothesis Testing with the Variance Olivier Ledoit and Michael

### Bounding the number of affine roots

with applications in reliable and secure communication Inaugural Lecture, Aalborg University, August 11110, 11111100000 with applications in reliable and secure communication Polynomials: F (X ) = 2X 2

### A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness

A Bayesian Nonparametric Approach to Monotone Missing Data in Longitudinal Studies with Informative Missingness A. Linero and M. Daniels UF, UT-Austin SRC 2014, Galveston, TX 1 Background 2 Working model

### A nonparametric two-sample wald test of equality of variances

University of Wollongong Research Online Faculty of Informatics - Papers (Archive) Faculty of Engineering and Information Sciences 211 A nonparametric two-sample wald test of equality of variances David

### An Empirical Study of Probability Elicitation under Noisy-OR Assumption

An Empirical Study of Probability Elicitation under Noisy-OR Assumption Adam Zagorecki and Marek Druzdzel Decision Systems Laboratory School of Information Science University of Pittsburgh, Pittsburgh,

### A NONPARAMETRIC TEST FOR HOMOGENEITY: APPLICATIONS TO PARAMETER ESTIMATION

Change-point Problems IMS Lecture Notes - Monograph Series (Volume 23, 1994) A NONPARAMETRIC TEST FOR HOMOGENEITY: APPLICATIONS TO PARAMETER ESTIMATION BY K. GHOUDI AND D. MCDONALD Universite' Lava1 and

### Applications of Information Geometry to Hypothesis Testing and Signal Detection

CMCAA 2016 Applications of Information Geometry to Hypothesis Testing and Signal Detection Yongqiang Cheng National University of Defense Technology July 2016 Outline 1. Principles of Information Geometry

### Robust and Efficient Fitting of the Generalized Pareto Distribution with Actuarial Applications in View

Robust and Efficient Fitting of the Generalized Pareto Distribution with Actuarial Applications in View Vytaras Brazauskas 1 University of Wisconsin-Milwaukee Andreas Kleefeld 2 University of Wisconsin-Milwaukee

### Robust Outcome Analysis for Observational Studies Designed Using Propensity Score Matching

The work of Kosten and McKean was partially supported by NIAAA Grant 1R21AA017906-01A1 Robust Outcome Analysis for Observational Studies Designed Using Propensity Score Matching Bradley E. Huitema Western

### Analysis of Cross-Sectional Data

Analysis of Cross-Sectional Data Kevin Sheppard http://www.kevinsheppard.com Oxford MFE This version: November 8, 2017 November 13 14, 2017 Outline Econometric models Specification that can be analyzed

### Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite

### PRINCIPLES OF STATISTICAL INFERENCE

Advanced Series on Statistical Science & Applied Probability PRINCIPLES OF STATISTICAL INFERENCE from a Neo-Fisherian Perspective Luigi Pace Department of Statistics University ofudine, Italy Alessandra

### CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

FROM: PAGANO, R. R. (007) I. INTRODUCTION: DISTINCTION BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS Statistical inference tests are often classified as to whether they are parametric or nonparametric Parameter

### A Statistical Perspective on Algorithmic Leveraging

Ping Ma PINGMA@UGA.EDU Department of Statistics, University of Georgia, Athens, GA 30602 Michael W. Mahoney MMAHONEY@ICSI.BERKELEY.EDU International Computer Science Institute and Dept. of Statistics,

### Generalized Impulse Response Analysis: General or Extreme?

Auburn University Department of Economics Working Paper Series Generalized Impulse Response Analysis: General or Extreme? Hyeongwoo Kim Auburn University AUWP 2012-04 This paper can be downloaded without