A Convex Hull Peeling Depth Approach to Nonparametric Massive Multivariate Data Analysis with Applications

Similar documents
Statistical Depth Function

CONTRIBUTIONS TO THE THEORY AND APPLICATIONS OF STATISTICAL DEPTH FUNCTIONS

Robust estimation of principal components from depth-based multivariate rank covariance matrix

Supplementary Material for Wang and Serfling paper

Data Release 5. Sky coverage of imaging data in the DR5

Surprise Detection in Multivariate Astronomical Data Kirk Borne George Mason University

ON THE CALCULATION OF A ROBUST S-ESTIMATOR OF A COVARIANCE MATRIX

Jensen s inequality for multivariate medians

Influence Functions for a General Class of Depth-Based Generalized Quantile Functions

Introduction to the Sloan Survey

Descriptive Univariate Statistics and Bivariate Correlation

Introduction to Robust Statistics. Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy

Distributed Statistical Estimation and Rates of Convergence in Normal Approximation

Methods of Nonparametric Multivariate Ranking and Selection

Computationally Easy Outlier Detection via Projection Pursuit with Finitely Many Directions

Accurate and Powerful Multivariate Outlier Detection

Multivariate quantiles and conditional depth

Empirical likelihood-based methods for the difference of two trimmed means

Supervised Classification for Functional Data Using False Discovery Rate and Multivariate Functional Depth

Extreme geometric quantiles

Detecting outliers in weighted univariate survey data

BNG 495 Capstone Design. Descriptive Statistics

Inequalities Relating Addition and Replacement Type Finite Sample Breakdown Points

Monitoring Random Start Forward Searches for Multivariate Data

Small Sample Corrections for LTS and MCD

Detection of outliers in multivariate data:

Statistical Data Analysis

Robust estimators based on generalization of trimmed mean

On the limiting distributions of multivariate depth-based rank sum. statistics and related tests. By Yijun Zuo 2 and Xuming He 3

Letter-value plots: Boxplots for large data

ON MULTIVARIATE MONOTONIC MEASURES OF LOCATION WITH HIGH BREAKDOWN POINT

On Invariant Within Equivalence Coordinate System (IWECS) Transformations

An Overview of Multiple Outliers in Multidimensional Data

General Foundations for Studying Masking and Swamping Robustness of Outlier Identifiers

YIJUN ZUO. Education. PhD, Statistics, 05/98, University of Texas at Dallas, (GPA 4.0/4.0)

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 1: Introduction, Multivariate Location and Scatter

Fast Hierarchical Clustering from the Baire Distance

Quantile Scale Curves

Robust estimation of scale and covariance with P n and its application to precision matrix estimation

A Multi-Step, Cluster-based Multivariate Chart for. Retrospective Monitoring of Individuals

CHAPTER 5. Outlier Detection in Multivariate Data

Fast and Robust Classifiers Adjusted for Skewness

Geovisualization. Luc Anselin. Copyright 2016 by Luc Anselin, All Rights Reserved

THE BREAKDOWN POINT EXAMPLES AND COUNTEREXAMPLES

DD-Classifier: Nonparametric Classification Procedure Based on DD-plot 1. Abstract

Exploratory data analysis: numerical summaries

Supplement: Efficient Global Point Cloud Alignment using Bayesian Nonparametric Mixtures

Statistical Tools and Techniques for Solar Astronomers

INVARIANT COORDINATE SELECTION

arxiv: v1 [stat.me] 14 Jan 2019

D4.2. First release of on-line science-oriented tutorials

Chapter 2 Depth Statistics

Multiple Variable Analysis

Modulation of symmetric densities

PROJECTION-BASED DEPTH FUNCTIONS AND ASSOCIATED MEDIANS 1. BY YIJUN ZUO Arizona State University and Michigan State University

ESTIMATING THE WINTERING LOCATION OF THE WOOD THRUSH HANNA JANKOWSKI, VERONICA SABELNYKOVA, JOHN SHERIFF

Chapter 1. Preliminaries. The purpose of this chapter is to provide some basic background information. Linear Space. Hilbert Space.

Astr 323: Extragalactic Astronomy and Cosmology. Spring Quarter 2014, University of Washington, Željko Ivezić. Lecture 1:

Statistics 612: L p spaces, metrics on spaces of probabilites, and connections to estimation

MULTIVARIATE TECHNIQUES, ROBUSTNESS

A Comparison of Robust Estimators Based on Two Types of Trimming

Median Cross-Validation

Identification of Multivariate Outliers: A Performance Study

Rare Event Discovery And Event Change Point In Biological Data Stream

Statistics: Learning models from data

Unit 2. Describing Data: Numerical

Jyh-Jen Horng Shiau 1 and Lin-An Chen 1

Modern Image Processing Techniques in Astronomical Sky Surveys

Descriptive Data Summarization

Outlier detection for skewed data

Regression Clustering

Introduction to robust statistics*

Lecture Outlines. Chapter 25. Astronomy Today 7th Edition Chaisson/McMillan Pearson Education, Inc.

DESCRIPTIVE STATISTICS FOR NONPARAMETRIC MODELS I. INTRODUCTION

Correlation Lengths of Red and Blue Galaxies: A New Cosmic Ruler

Estimation of Parameters

arxiv:astro-ph/ v1 16 Apr 2004

Depth Measures for Multivariate Functional Data

Definition 5.1: Rousseeuw and Van Driessen (1999). The DD plot is a plot of the classical Mahalanobis distances MD i versus robust Mahalanobis

APPROXIMATING CONTINUOUS FUNCTIONS: WEIERSTRASS, BERNSTEIN, AND RUNGE

ABSTRACT. Title: The Accuracy of the Photometric Redshift of Galaxy Clusters

Financial Econometrics and Volatility Models Copulas

WEIGHTED HALFSPACE DEPTH

Multivariate Statistical Analysis

Segmentation of Subspace Arrangements III Robust GPCA

Math 172 HW 1 Solutions

1. Density and properties Brief outline 2. Sampling from multivariate normal and MLE 3. Sampling distribution and large sample behavior of X and S 4.

Week 9 The Central Limit Theorem and Estimation Concepts

ROBUST ESTIMATION OF A CORRELATION COEFFICIENT: AN ATTEMPT OF SURVEY

MATH4427 Notebook 4 Fall Semester 2017/2018

The effect of large scale environment on galaxy properties

Classier Selection using the Predicate Depth

Stahel-Donoho Estimation for High-Dimensional Data

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

A SHORT COURSE ON ROBUST STATISTICS. David E. Tyler Rutgers The State University of New Jersey. Web-Site dtyler/shortcourse.

Chapter 3. Data Description

Age of the Universe Lab Session - Example report

ICSA Applied Statistics Symposium 1. Balanced adjusted empirical likelihood

Rudin Real and Complex Analysis - Harmonic Functions

Issues on quantile autoregression

Transcription:

A Convex Hull Peeling Depth Approach to Nonparametric Massive Multivariate Data Analysis with Applications Hyunsook Lee. hlee@stat.psu.edu Department of Statistics The Pennsylvania State University Hyunsook Lee, Department of Statistics, Penn State Univ p. 1

Outlines Convex Hull Peeling (CHP) and Multivariate Data Analysis Definitions on CHP Data Depth (Ordering Multivariate Data) Quantiles and Density Estimation Color Magnitude (CM) Diagram and Sloan Digital Sky Survey Nonparametric Descriptive Statistics with CHP Multivariate Median Skewness and Kurtosis of a Multivariate Distribution Outlier Detection with CHP Level α ; Shape Distortion; Balloon Plot Concluding Remarks Hyunsook Lee, Department of Statistics, Penn State Univ p. 2

Definitions Convex Set A set C R d is convex if for every two points x, y C the whole segment xy is also contained in C. Convex Hull The convex hull of a set of points X in R d is denoted by CH(X), is the intersection of all convex sets in R d containing X. In algorithms, a convex hull indicates points of a shape invariant minimal subset of CH(X) (vertices, extreme points), connecting these points produces a wrap of CH(X). y 2 1 0 1 2 y 2 1 0 1 2 2 1 0 1 2 x 2 1 0 1 2 Hyunsook Lee, Department of Statistics, Penn State Univ p. 3 x

Convex Hull Peeling Before After y 2 1 0 1 2 y 2 1 0 1 2 2 1 0 1 2 x 2 1 0 1 2 x Hyunsook Lee, Department of Statistics, Penn State Univ p. 4

Convex Hull Peeling Depth (CHPD) [CHPD:] For a point x R d and the data set X = {X 1,..., X N 1 }, let C 1 = CH{x, X} and denote a set of its vertices V 1. We can get C j = C j 1 \V j 1 through CHP until x V j (j = 2,...). Then, CHPD(x) = ( k i=1 V i) N for k s.t. k = min j {j : x V j } ; otherwise CHPD is 0. Tukey (1974): Locating data center (median) by the Convex Hull Peeling Process. Barnett (1976): Ordering based on Depth ˆp th quantiles are 1 ˆp th CHPDs. Hyper-polygons of 1 ˆp th depth obtainable from any dimensional data. QHULL(Barber et. al., 1996) works for general dimensions (http://qhull.org). Why CHPD... Hyunsook Lee, Department of Statistics, Penn State Univ p. 5

Challenges in Nonparametric Multivariate Analysis How to Order Multivariate Data? Hyunsook Lee, Department of Statistics, Penn State Univ p. 6

Challenges in Nonparametric Multivariate Analysis How to Order Multivariate Data? Ordering Multivariate Data Data Depth Mahalanobis Depth : Mahalanobis (1936) Convex Hull Peeling Depth: Barnett (1976) Half Space Depth: Tukey (1975) Simplical Depth : Liu (1990) Oja Depth : Oja (1983) Majority Depth : Singh (1991) Ordering is not uniformly defined Hyunsook Lee, Department of Statistics, Penn State Univ p. 6

Statistical Data Depth (Zuo and Serfling, 2000) (P1) (Affine invariance) D(Ax + b; F AX+b ) = D(x; F X ) for all X (A nonsingular matrix) holds for any random vector X in R d, any d d nonsingular matrix A, and any d-vector b; (P2) (Maximality at center) D(θ; F) = sup x R d D(x; F) holds for any F F having center θ; (P3) (Monotonicity) for any F F having deepest point θ, D(x; F) D(θ + α(x θ); F) holds for α [0, 1]; and (P4) D(x; F) 0 as x, for each F F. Hyunsook Lee, Department of Statistics, Penn State Univ p. 7

Convex Hull Peeling Depth affine invariance maximality at center monotonicity relative to deepest point vanishing at infinity CHPD has these properties and points of smallest depth are possible outliers Hyunsook Lee, Department of Statistics, Penn State Univ p. 8

Quantile Estimation Median: A point(s) left after peeling (will show robustness of this estimator later) p th Quantile: Level set whose central region contains 100p% data (will define the level set and the central region later) No Closed Form; Empirical Process Hyunsook Lee, Department of Statistics, Penn State Univ p. 9

Empirical Density Estimation Density Estimation with CHPD on Bivariate Normal Data (McDermott, 2003) 100000 Bivariate Normal Sample Quantiles={0.99,0.95,0.90,0.80,...0.20,0.10,0.05,0.01} y 4 2 0 2 4 density 0.00 0.05 0.10 0.15 0.20 4 3 2 1 0 1 2 3 4 3 2 10123 y 4 2 0 2 4 x x Hyunsook Lee, Department of Statistics, Penn State Univ p. 10

Lessons and Further Studies Sample from a convex distribution (no doughnut shape) Works on Massive data Sequential Method Without previous knowledge, no model or prior is known to start an analysis. Exploratory data analysis for a large database Nonparametric and non-distance based approach Where CHP can be applied and how? Multi-color diagram from astronomy, where a plethora of free data archives is available. Hyunsook Lee, Department of Statistics, Penn State Univ p. 11

Color Magnitude diagram Two dimensional Color-Color diagram or Celebrated Hertzsprung-Russell diagram (switch) Hyunsook Lee, Department of Statistics, Penn State Univ p. 12

Color Magnitude diagram Two dimensional Color-Color diagram or Celebrated Hertzsprung-Russell diagram (switch) What if we can see beyond 2 dimensions without bias (projection) Then, 3 or higher dimensional color diagrams might have popularity. Hyunsook Lee, Department of Statistics, Penn State Univ p. 12

Color Magnitude diagram Two dimensional Color-Color diagram or Celebrated Hertzsprung-Russell diagram (switch) What if we can see beyond 2 dimensions without bias (projection) Then, 3 or higher dimensional color diagrams might have popularity. CHP may assist analyzing multi-color diagrams. Need a suitable data set with colors. Hyunsook Lee, Department of Statistics, Penn State Univ p. 12

Sloan Digital Sky Survey: SDSS Commissioned 2000, now Data Release 5 is available. 5 bands; 4 variables (u-g, g-r, r-i, i-z) Studies on analyzing astronomical massive data received spotlights recently. http://www.sdss.org July, 2005: Data Release Four 6670 square degrees, 180 million objects Available from http://www.sdss.org/dr4 From SpecPhotoAll with SQL: Attributes of photometric data are color indices, u,b,g,i,z along with coordinates. Hyunsook Lee, Department of Statistics, Penn State Univ p. 13

SQL for SDSS select ra, dec, z, psfmag_u, psfmag_g, psfmag_r, psfmag_i, psfmag_z from SpecPhotoAll where specclass= 2 Note 2: galaxies, 3: QSO, 4: HighZ QSO Galaxies: 499043 Quasars: 70204 Hyunsook Lee, Department of Statistics, Penn State Univ p. 14

Multivariate Descriptive Statistics CHP Median CHP Skewness CHP Kurtosis with bivariate simulated data and SDSS DR4 Hyunsook Lee, Department of Statistics, Penn State Univ p. 15

Convex Hull Peeling Median (CHPM) Multivariate Median: the inner most point among data Survey of Multivariate Median (Small, 1990) CHPM: recursive peeling leads to the inner most point(s). The average of these largest depth points is the median of a data set. Hyunsook Lee, Department of Statistics, Penn State Univ p. 16

Convex Hull Peeling Median (CHPM) Multivariate Median: the inner most point among data Survey of Multivariate Median (Small, 1990) CHPM: recursive peeling leads to the inner most point(s). The average of these largest depth points is the median of a data set. Simulations: Sample from standard bivariate normal distribution n mean median CHPM 10 4 (0.001338, -0.02232) (-0.005305, -0.01643) (0.000918, -0.010589) 10 6 (0.000072, 0.000114) (0.001185, -0.000717) (0.002455, -0.000456) Sequential CHPM (0.004741, -0.004111) Setting for the sequential method: m=10000 and d=0.05 Hyunsook Lee, Department of Statistics, Penn State Univ p. 16

Application: Median Quasars u-g g-r r-i i-z Mean 0.4619 0.2484 0.1649 0.1008 Median 0.2520 0.1750 0.1520 0.0770 CHPM 0.2530 0.1640 0.1913 0.0700 Galaxies u-g g-r r-i i-z Mean 1.622 0.9211 0.4226 0.3439 Median 1.680 0.8930 0.4200 0.3540 CHPM 1.790 0.957 0.424 0.367 Seq. CHPM 1.772 0.950 0.4228 0.363 Hyunsook Lee, Department of Statistics, Penn State Univ p. 17

Robustness of Convex Hull Peeling Median Breakdown point of a convex hull peeling median goes to zero as n (Donoho, 1982). Outliers are necessarily located at infinity. Hyunsook Lee, Department of Statistics, Penn State Univ p. 18

Robustness of Convex Hull Peeling Median Breakdown point of a convex hull peeling median goes to zero as n (Donoho, 1982). Outliers are necessarily located at infinity. Empirical mean square error (EMSE) and Relative Efficiency (RE): Model:(1 ɛ)n((0, 0),I) + ɛn(, 4I) n = 5000, m = 500, T j =(CHPM, Mean) EMSE = 1 m m i=1 T j µ 2 N((5,5) t,4i) N((10,10) t,4i) ɛ CHPM Mean RE CHPM Mean RE 0 0.002178 0.000417 0.191689 0.002178 0.000417 0.191689 0.005 0.0028521 0.001682 0.589961 0.002891 0.005444 1.88291 0.05 0.016842 0.125522 7.45262 0.017824 0.500610 28.08597 0.2 0.139215 2.00109 14.37612 0.1435910 8.0017 55.7264 Hyunsook Lee, Department of Statistics, Penn State Univ p. 18

Generalized Quantile Process EinMahl and Mason (1992) U n (t) = inf{λ(a) : P n (A) t, A A}, 0 < t < 1. Central Region: R CH (t) = {x R d : CHPD(x) t} Level Set: B CH (t) = R CH (t) = {x R d : CHPD(x) = t} Volume Functional: V CH (t) = V olume(r CH (t)) One dimensional mapping. Hyunsook Lee, Department of Statistics, Penn State Univ p. 19

y 4 2 0 2 4 y 0 2 4 6 8 4 2 0 2 4 4 2 0 2 4 x x y 3 2 1 0 1 2 3 y 200 0 100 300 3 2 1 0 1 2 3 50 0 50 100 x x not equi-probability contours, assume smooth convex distributions Hyunsook Lee, Department of Statistics, Penn State Univ p. 20

Skewness Measure Let x j,i be the i th vertex in a level set B CH,j comprised by the j th peeling process. A measure of skewness: R j = max i x j,i CHPM min i x j,i CHPM min i x j,i CHPM Not only a sequence of R j visualizes but also quantizes the skewness along depths. Denominator for the regularization affine invariant R j symmetric: flat R j along convex hull peels skewed: fluctuating R j Hyunsook Lee, Department of Statistics, Penn State Univ p. 21

Simulation: Skewness Measure N(0, I) N(0, Σ) non normal 2 0 2 2 0 2 N(0, 1) 3 2 1 0 1 2 3 3 2 1 0 1 2 3 4 2 0 2 5 10 15 20 25 30 2 χ 10 R j 0.0 0.5 1.0 1.5 2.0 2.5 R j 1.0 1.5 2.0 2.5 R j 2 3 4 5 6 0 20 40 60 80 j th convex hull level set 0 20 40 60 80 j th convex hull level set 0 20 40 60 80 j th convex hull level set 2000 Hyunsook Lee, Department of Statistics, Penn State Univ p. 22

Application:Skewness Measure (Quasars) R j 2 4 6 8 10 12 14 0 20 40 60 80 j th convex hull level set Hyunsook Lee, Department of Statistics, Penn State Univ p. 23

Application:Skewness Measure (Galaxies) R j 2 4 6 8 10 12 0 50 100 150 200 j th convex hull level set Hyunsook Lee, Department of Statistics, Penn State Univ p. 24

Kurtosis Measure Quantile (Depth) based Kurtosis: Tailweight: K CH (r) = V CH( 1 2 r 2 ) + V CH( 1 2 + r 2 ) 2V CH( 1 2 ) V CH ( 1 2 r 2 ) V CH( 1 2 + r 2 ) t(r, s) = V CH(r) V CH (s) for 0 < s < r 1. Here, V CH (r) indicates the volume functional at depth r. Hyunsook Lee, Department of Statistics, Penn State Univ p. 25

Simulation: Kurtosis Measure (Tailweight) t(r, r min ) 0.0 0.2 0.4 0.6 0.8 1.0 uniform normal t 10 cauchy 1.0 0.8 0.6 0.4 0.2 0.0 r : depth Hyunsook Lee, Department of Statistics, Penn State Univ p. 26

Application: Kurtosis Measure (Quasars) v(r, r min ) 0.0 0.2 0.4 0.6 0.8 1.0 1.0 0.8 0.6 0.4 0.2 0.0 depth Hyunsook Lee, Department of Statistics, Penn State Univ p. 27

Multivariate Outlier Detection What are Outliers? Detecting Algorithms Level α Shape Distortion Balloon Plot Hyunsook Lee, Department of Statistics, Penn State Univ p. 28

What are Outliers? Outliers are... Cumbersome Observations Lead to New Scientific Discoveries Improve Models (Robust Statistics)... No Clear Objectives but Come Along Often CHP: Experience and relative Robustness support the Idea of Outlier Detection. We need a clear definition on outliers; especially, outliers of the 21st century. And outlier detecting methods. Hyunsook Lee, Department of Statistics, Penn State Univ p. 29

Outliers are observations... Huber (1972): unlikely to belong to the main population. Barnett and Leroy (1994): appear inconsistent with the remainder. Hawkins (1980): deviated so much to arouse suspicion. Beckman and Cook (1983): surprising and discrepant to the investigator. Discordant Observations or Contaminants Rohlf (1975): somewhat isolated from the main cloud of points. Yet, somewhat VAGUE! Hyunsook Lee, Department of Statistics, Penn State Univ p. 30

Some Outlier Detection Methods Univariate: Box-and-Whisker plot, Order statistics,... Hyunsook Lee, Department of Statistics, Penn State Univ p. 31

Some Outlier Detection Methods Univariate: Box-and-Whisker plot, Order statistics,... Multivariate: Mostly bivariate applications Generalized Gap Test (Rolhf, 1975) Bivariate Box Plot (Zani et. al, 1999) Sunburst Plot (Liu et. al., 1999) Bag plot (Miller et. al., 2003) and Mahalanobis distance D(x) = (x ˆµ)ˆΣ 1 (x ˆµ). Hyunsook Lee, Department of Statistics, Penn State Univ p. 31

Some Outlier Detection Methods Univariate: Box-and-Whisker plot, Order statistics,... Multivariate: Mostly bivariate applications Generalized Gap Test (Rolhf, 1975) Bivariate Box Plot (Zani et. al, 1999) Sunburst Plot (Liu et. al., 1999) Bag plot (Miller et. al., 2003) and Mahalanobis distance D(x) = (x ˆµ)ˆΣ 1 (x ˆµ). Difficulties of multivariate analysis arise from the complexity of ordering multivariate data. Hyunsook Lee, Department of Statistics, Penn State Univ p. 31

Quantile Based Outlier Detection bivariate standard normal bivariate t 5 with ρ= 0.5 y 4 2 0 2 4 y 20 10 0 10 20 4 2 0 2 4 20 10 0 10 20 x x density 0.00 0.05 0.10 0.15 0.20 4 3 2 1 0 1 2 3 4 3 2 10123 y density 0.00 0.05 0.10 0.15 0.20 6 4 2 0 2 4 6 6 4 20 2 4 6 y x x Hyunsook Lee, Department of Statistics, Penn State Univ p. 32

Contour Shape Changes Bivariate Normal Sample Outliers Added y 2 0 2 y 2 0 2 4 3 2 1 0 1 2 3 x 2 0 2 4 6 x y 2 0 2 2 0 2 4 y 3 2 1 0 1 2 3 x 2 0 2 4 6 x Vol j 0 10 20 30 40 50 Vol j 0 10 20 30 40 50 GAP 0 20 40 60 80 j th convex hull level set 0 20 40 60 80 j th convex hull level set Hyunsook Lee, Department of Statistics, Penn State Univ p. 33

Balloon Plot Bivariate Normal Sample Outliers Added y 2 0 2 y 2 0 2 4 3 2 1 0 1 2 3 2 0 2 4 6 x x A Balloon Plot is obtained by blowing.5 th CHPD polyhedron by 1.5 times (lengthwise). Let V.5 be a set of vertices of.5 th CHPD hull. The balloon for outlier detection is B 1.5 = {y i : y i = x i + 1.5(x i CHPM), x i V.5 }. In other words, blow the balloon of IQR 1.5 times larger. Hyunsook Lee, Department of Statistics, Penn State Univ p. 34

Outliers in Quasar Population Vol j 0 100 200 300 400 Vol j 0.25 0 1 2 3 4 0 20 40 60 80 j th convex hull level set 0 20 40 60 80 j th convex hull level set Volumes of 1st CH,.01 Depth CH,.05 Depth CH: (474.134, 14.442, 4.353) Hyunsook Lee, Department of Statistics, Penn State Univ p. 35

Outliers in Galaxy Population Vol j 0 1000 2000 3000 4000 5000 Vol j 0.25 0 2 4 6 8 0 50 100 150 200 j (peel) 0 50 100 150 200 j(peel) Volumes of 1st CH,.01 Depth CH,.05 Depth CH: (4919.492, 4.310, 1.075) Hyunsook Lee, Department of Statistics, Penn State Univ p. 36

Discussion on CHP Convex Hull Peeling is.. a robust location estimator. a tool for descriptive statistics. Skewness and Kurtosis measure. a reasonable approach for detecting multivariate outliers. a starter for clustering. Our methods help to characterize multivariate distributions and identify outlier candidates from multivariate massive data; therefore, the results initiate scientists to study further with less bias. CHP as Exploratory Data Analysis and Data Mining Tools. Hyunsook Lee, Department of Statistics, Penn State Univ p. 37

Concluding Remarks Drawbacks of CHPD Limited to moderate dimension data. CHPD estimates depths inward not outward. Convexity of a data set. No population/theoretical counterpart. Hyunsook Lee, Department of Statistics, Penn State Univ p. 38

Concluding Remarks Drawbacks of CHPD Limited to moderate dimension data. CHPD estimates depths inward not outward. Convexity of a data set. No population/theoretical counterpart. No assumption on data distribution, Non-distance based, Affine invariant, Applicable to streaming data, Detecting Outliers, Providing Multivariate Descriptive Statistics, Exploratory data analysis Hyunsook Lee, Department of Statistics, Penn State Univ p. 38