
1 A SHORT COURSE ON ROBUST STATISTICS David E. Tyler Rutgers The State University of New Jersey Web-Site dtyler/shortcourse.pdf

2 References
Huber, P.J. (1981). Robust Statistics. Wiley, New York.
Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J. and Stahel, W.A. (1986). Robust Statistics: The Approach Based on Influence Functions. Wiley, New York.
Maronna, R.A., Martin, R.D. and Yohai, V.J. (2006). Robust Statistics: Theory and Methods. Wiley, New York.

3 PART 1 CONCEPTS AND BASIC METHODS

4 MOTIVATION
Data set: X_1, X_2, ..., X_n. Parametric model: F(x_1, ..., x_n; θ), where θ is an unknown parameter and F is a known function, e.g. X_1, X_2, ..., X_n i.i.d. Normal(µ, σ²).
Q: Is it realistic to believe we don't know (µ, σ²), but we know e.g. the shape of the tails of the distribution?
A: The model is assumed to be approximately true, e.g. symmetric and unimodal (past experience).
Q: Are statistical methods which are good under the model reasonably good if the model is only approximately true?
ROBUST STATISTICS formally addresses this issue.

5 CLASSIC EXAMPLE: MEAN vs. MEDIAN
Symmetric distributions: population mean = population median = µ.
Sample mean: X̄ ≈ Normal(µ, σ²/n).
Sample median: Median ≈ Normal(µ, 1/(4 f(µ)² n)), where f is the population density.
At the normal: Median ≈ Normal(µ, (π/2) σ²/n).
Asymptotic relative efficiency of the median to the mean:
ARE(Median, X̄) = avar(X̄)/avar(Median) = 2/π ≈ 0.637.
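A quick Monte Carlo check of the 2/π figure above (an illustrative sketch, not from the slides; sample size, replication count, and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 101, 20000

# Sampling distributions of the mean and the median at Normal(0, 1).
means = np.empty(reps)
medians = np.empty(reps)
for r in range(reps):
    x = rng.standard_normal(n)
    means[r] = x.mean()
    medians[r] = np.median(x)

are = means.var() / medians.var()   # estimates ARE(Median, X-bar) = 2/pi
print(round(are, 3))
```

The ratio of simulated variances should land close to 2/π ≈ 0.637 for moderate n.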

6 CAUCHY DISTRIBUTION
X ~ Cauchy(µ, σ): f(x; µ, σ) = (1/(πσ)) [1 + ((x − µ)/σ)²]⁻¹.
Mean: X̄ ~ Cauchy(µ, σ), no better than a single observation.
Median: ≈ Normal(µ, π²σ²/(4n)).
ARE(Median, X̄) = ∞, or equivalently ARE(X̄, Median) = 0.
For the t distribution on ν degrees of freedom:
ARE(Median, X̄) = 4 Γ((ν+1)/2)² / ((ν − 2) π Γ(ν/2)²).
(The slide tabulates ARE(Median, X̄) and ARE(X̄, Median) for various ν.)

7 MIXTURE OF NORMALS
Theory of errors: the Central Limit Theorem gives plausibility to normality.
X ~ Normal(µ, σ²) with probability 1 − ɛ, and Normal(µ, (3σ)²) with probability ɛ; i.e. not all measurements are equally precise. Equivalently,
X ~ (1 − ɛ) Normal(µ, σ²) + ɛ Normal(µ, (3σ)²), e.g. ɛ = 0.10.
Classic paper: Tukey (1960), A survey of sampling from contaminated distributions.
For ɛ > 0.10, ARE(Median, X̄) > 1.
The mean absolute deviation is more efficient than the sample standard deviation for ɛ > 0.01.
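The crossover point can be computed directly from the earlier formulas: under the mixture, avar(X̄) uses the mixture variance (1 − ɛ)σ² + ɛ(3σ)², and avar(Median) uses the mixture density at the center. A small sketch (the function name is mine):

```python
from math import pi, sqrt

def are_median_mean(eps, k=3.0):
    """ARE(Median, Mean) under (1 - eps) N(0, 1) + eps N(0, k^2)."""
    avar_mean = (1 - eps) + eps * k**2        # variance of one observation
    phi0 = 1.0 / sqrt(2 * pi)                 # standard normal density at 0
    f0 = (1 - eps) * phi0 + eps * phi0 / k    # mixture density at the center
    avar_median = 1.0 / (4 * f0**2)
    return avar_mean / avar_median

for eps in (0.0, 0.05, 0.10, 0.15):
    print(eps, round(are_median_mean(eps), 3))
```

At ɛ = 0 this reproduces 2/π; the ratio passes through 1 just above ɛ = 0.10, matching the slide's claim.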

8 PRINCETON ROBUSTNESS STUDIES
Andrews et al. (1972). Other estimates of location:
α-trimmed mean: trim a proportion α from both ends of the data set and then take the mean. (Throwing away data?)
α-Winsorized mean: replace a proportion α from both ends of the data set by the next closest observation and then take the mean.
Example: 2, 4, 5, 10, 200.
Mean = 44.2. Median = 5.
20% trimmed mean = (4 + 5 + 10)/3 = 6.33.
20% Winsorized mean = (4 + 4 + 5 + 10 + 10)/5 = 6.6.
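The worked example above can be reproduced directly (an illustrative sketch; NumPy is used only for sorting and averaging):

```python
import numpy as np

x = np.array([2.0, 4.0, 5.0, 10.0, 200.0])
xs = np.sort(x)
n = len(xs)
g = int(0.2 * n)              # number trimmed/replaced at each end: here 1

trimmed = xs[g:n - g].mean()  # drop the g smallest and g largest: (4+5+10)/3
xw = xs.copy()
xw[:g] = xs[g]                # replace the low end by the next closest value
xw[n - g:] = xs[n - g - 1]    # replace the high end likewise
winsorized = xw.mean()        # (4+4+5+10+10)/5

print(x.mean(), np.median(x), trimmed, winsorized)
```

The single outlier 200 drags the mean to 44.2, while the trimmed and Winsorized means stay near the bulk of the data.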

9 Measuring the robustness of a statistic
Relative efficiency over a range of distributional models. There exist estimates of location which are asymptotically most efficient for the center of any symmetric distribution (adaptive estimation, semi-parametrics). Robust?
Influence function over a range of distributional models.
Maximum bias function and the breakdown point.

10 Measuring the effect of an outlier (not modeled)
Good data set: x_1, ..., x_{n−1}. Statistic: T_{n−1} = T(x_1, ..., x_{n−1}).
Contaminated data set: x_1, ..., x_{n−1}, x. Contaminated value: T_n = T(x_1, ..., x_{n−1}, x).
THE SENSITIVITY CURVE (Tukey, 1970)
SC_n(x) = n (T_n − T_{n−1}), i.e. T_n = T_{n−1} + SC_n(x)/n.
THE INFLUENCE FUNCTION (Hampel, 1969, 1974)
Population version of the sensitivity curve.
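A minimal sketch of the sensitivity curve for the mean and the median (the good data set and evaluation points are arbitrary choices). For the mean, SC_n(x) works out to exactly x minus the mean of the good data, so it is unbounded; for the median it is bounded and flat beyond the extremes of the good data:

```python
import numpy as np

rng = np.random.default_rng(1)
good = rng.standard_normal(19)   # the good data x_1, ..., x_{n-1}
n = len(good) + 1

def sensitivity_curve(stat, x):
    """SC_n(x) = n * (T(x_1, ..., x_{n-1}, x) - T(x_1, ..., x_{n-1}))."""
    return n * (stat(np.append(good, x)) - stat(good))

for x in (-5.0, 0.0, 5.0, 50.0):
    # mean: grows linearly in x; median: bounded in x
    print(x, sensitivity_curve(np.mean, x), sensitivity_curve(np.median, x))
```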

11 THE INFLUENCE FUNCTION
The statistic T_n = T(F_n) estimates T(F). Consider F and the ɛ-contaminated distribution
F_ɛ = (1 − ɛ) F + ɛ δ_x, where δ_x is the point-mass distribution at x.
Compare functional values: T(F) vs. T(F_ɛ).
Qualitative robustness (continuity): T(F_ɛ) → T(F) as ɛ → 0. (E.g. the mode is not qualitatively robust.)
Influence function (infinitesimal perturbation; Gâteaux derivative):
IF(x; T, F) = lim_{ɛ→0} [T(F_ɛ) − T(F)]/ɛ = ∂T(F_ɛ)/∂ɛ evaluated at ɛ = 0.

12 EXAMPLES
Mean: T(F) = E_F[X].
T(F_ɛ) = E_{F_ɛ}[X] = (1 − ɛ) E_F[X] + ɛ E[δ_x] = (1 − ɛ) T(F) + ɛ x.
IF(x; T, F) = lim_{ɛ→0} [(1 − ɛ) T(F) + ɛ x − T(F)]/ɛ = x − T(F).
Median: T(F) = F⁻¹(1/2).
IF(x; T, F) = {2 f(T(F))}⁻¹ sign(x − T(F)).

13 Plots of influence functions
Gives insight into the behavior of a statistic: T_n ≈ T(F) + (1/n) Σ_i IF(x_i; T, F).
Plots shown for the mean, the median, the α-trimmed mean, and the α-Winsorized mean (somewhat unexpected?).

14 Desirable robustness properties for the influence function
SMALL:
Gross error sensitivity: GES(T; F) = sup_x |IF(x; T, F)|. GES < ∞ means B-robust (bias-robust).
Note: E_F[IF(X; T, F)] = 0.
Asymptotic variance: AV(T; F) = E_F[IF(X; T, F)²].
Under general conditions, e.g. Fréchet differentiability,
√n (T(X_1, ..., X_n) − T(F)) → Normal(0, AV(T; F)).
Trade-off at the normal model: smaller AV ⇔ larger GES.
SMOOTH (local shift sensitivity): protects e.g. against rounding error.
REDESCENDING to 0.

15 REDESCENDING INFLUENCE FUNCTION Example: Data Set of Male Heights in cm 180, 175, 192,..., 185, 2020, 190,... Redescender = Automatic Outlier Detector

16 CLASSES OF ESTIMATES
L-statistics: linear combinations of order statistics. Let X_(1) ≤ ... ≤ X_(n) denote the order statistics:
T(X_1, ..., X_n) = Σ_i a_{i,n} X_(i), where the a_{i,n} are constants.
Examples:
Mean: a_{i,n} = 1/n.
Median: for n odd, a_{i,n} = 1 for i = (n+1)/2 and 0 otherwise; for n even, a_{i,n} = 1/2 for i = n/2, n/2 + 1 and 0 otherwise.
α-trimmed mean and α-Winsorized mean: suitable choices of a_{i,n}.
A general form for the influence function exists.
Can obtain any desirable monotonic shape, but not redescending.
Do not readily generalize to other settings.

17 M-ESTIMATES Huber (1964, 1967)
Maximum-likelihood-type estimates under non-standard conditions.
One-parameter case: X_1, ..., X_n i.i.d. f(x; θ), θ ∈ Θ.
Maximum likelihood estimates:
Likelihood function: L(θ | x_1, ..., x_n) = Π_i f(x_i; θ).
Minimize the negative log-likelihood: min_θ Σ_i ρ(x_i; θ), where ρ(x; θ) = −log f(x; θ).
Or solve the likelihood equations: Σ_i ψ(x_i; θ) = 0, where ψ(x; θ) = ∂ρ(x; θ)/∂θ.

18 DEFINITIONS OF M-ESTIMATES
Objective function approach: θ̂ = arg min_θ Σ_i ρ(x_i; θ).
M-estimating equation approach: Σ_i ψ(x_i; θ) = 0. Note: the solution is unique when ψ(x; θ) is strictly monotone in θ.
Basic examples:
Mean: MLE for the Normal, f(x) = (2π)^(−1/2) e^(−(x−θ)²/2) for x ∈ R; ρ(x; θ) = (x − θ)², ψ(x; θ) = x − θ.
Median: MLE for the double exponential, f(x) = (1/2) e^(−|x−θ|) for x ∈ R; ρ(x; θ) = |x − θ|, ψ(x; θ) = sign(x − θ).
ρ and ψ need not be related to any density, or to each other. Estimates can be evaluated under various distributions.

19 M-ESTIMATES OF LOCATION
A symmetric and translation-equivariant M-estimate.
Translation equivariance (X_i → X_i + a ⇒ T_n → T_n + a) gives ρ(x; t) = ρ(x − t) and ψ(x; t) = ψ(x − t).
Symmetry (X_i → −X_i ⇒ T_n → −T_n) gives ρ(−r) = ρ(r), i.e. ψ(−r) = −ψ(r).
Alternative derivation: a generalization of the MLE for the center of symmetry of a given family of symmetric distributions, f(x; θ) = g(|x − θ|).

20 INFLUENCE FUNCTION OF M-ESTIMATES
M-functional: T(F) is the solution to E_F[ψ(X; T(F))] = 0.
IF(x; T, F) = c(T, F) ψ(x; T(F)), where c(T, F) = −1/E_F[∂ψ(X; θ)/∂θ] evaluated at θ = T(F).
Note: E_F[IF(X; T, F)] = 0.
One can decide what shape is desired for the influence function and then construct an appropriate M-estimate (e.g. mean, median).

21 EXAMPLE: Huber's M-estimate
Choose ψ(r) = −c for r ≤ −c, ψ(r) = r for |r| < c, and ψ(r) = c for r ≥ c, where c is a tuning constant.
An adaptively trimmed mean, i.e. the proportion trimmed depends upon the data.
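The piecewise ψ above is just a clipping operation. A minimal sketch (c = 1.345 is a common tuning choice, often quoted as giving roughly 95% efficiency at the normal; the default here is mine, not from the slides):

```python
import numpy as np

def psi_huber(r, c=1.345):
    """Huber's psi: the identity in the middle, clipped at -c and +c."""
    return np.clip(r, -c, c)

print(psi_huber(np.array([-10.0, -1.0, 0.0, 0.5, 3.0])))
```

Values inside (−c, c) pass through unchanged; anything beyond is held at ±c, which is what bounds the influence function.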

22 DERIVATION OF THE INFLUENCE FUNCTION FOR M-ESTIMATES (Sketch)
Let T_ɛ = T(F_ɛ), so 0 = E_{F_ɛ}[ψ(X; T_ɛ)] = (1 − ɛ) E_F[ψ(X; T_ɛ)] + ɛ ψ(x; T_ɛ).
Taking the derivative with respect to ɛ:
0 = −E_F[ψ(X; T_ɛ)] + (1 − ɛ) ∂/∂ɛ E_F[ψ(X; T_ɛ)] + ψ(x; T_ɛ) + ɛ ∂/∂ɛ ψ(x; T_ɛ).
Let ψ′(x; θ) = ∂ψ(x; θ)/∂θ. Using the chain rule,
0 = −E_F[ψ(X; T_ɛ)] + (1 − ɛ) E_F[ψ′(X; T_ɛ)] ∂T_ɛ/∂ɛ + ψ(x; T_ɛ) + ɛ ψ′(x; T_ɛ) ∂T_ɛ/∂ɛ.
Letting ɛ → 0 and using qualitative robustness, i.e. T(F_ɛ) → T(F), gives
0 = E_F[ψ′(X; T(F))] IF(x; T, F) + ψ(x; T(F)),
and hence IF(x; T, F) = −ψ(x; T(F)) / E_F[ψ′(X; T(F))].

23 ASYMPTOTIC NORMALITY OF M-ESTIMATES (Sketch)
Let θ = T(F). A Taylor expansion of ψ(x; θ̂) about θ gives
0 = Σ_i ψ(x_i; θ̂) = Σ_i ψ(x_i; θ) + (θ̂ − θ) Σ_i ψ′(x_i; θ) + ...,
or 0 = √n {(1/n) Σ_i ψ(x_i; θ)} + √n (θ̂ − θ) {(1/n) Σ_i ψ′(x_i; θ)} + O_p(1/√n).
By the CLT, √n {(1/n) Σ_i ψ(x_i; θ)} →_d Z ~ Normal(0, E_F[ψ(X; θ)²]).
By the WLLN, (1/n) Σ_i ψ′(x_i; θ) →_p E_F[ψ′(X; θ)].
Thus, by Slutsky's theorem, √n (θ̂ − θ) →_d −Z/E_F[ψ′(X; θ)] ~ Normal(0, σ²), where
σ² = E_F[ψ(X; θ)²] / E_F[ψ′(X; θ)]² = E_F[IF(X; T, F)²].
NOTE: Proving Fréchet differentiability is not necessary.

24 M-estimates of location: adaptively weighted means
Recall translation equivariance and symmetry imply ψ(x; t) = ψ(x − t) with ψ odd.
Express ψ(r) = r u(r) and let w_i = u(x_i − θ̂). Then
0 = Σ_i ψ(x_i − θ̂) = Σ_i (x_i − θ̂) u(x_i − θ̂) = Σ_i w_i (x_i − θ̂),
so θ̂ = Σ_i w_i x_i / Σ_i w_i.
The weights are determined by the data cloud: observations far from θ̂ are heavily downweighted.

25 SOME COMMON M-ESTIMATES OF LOCATION
Huber's M-estimate: ψ(r) = −c for r ≤ −c, r for |r| < c, c for r ≥ c.
Given a bound on the GES, it has maximum efficiency at the normal model.
It is the MLE for the least favorable distribution, i.e. the symmetric unimodal model with smallest Fisher information within a neighborhood of the normal.
LFD = normal in the middle, double exponential in the tails.

26 Tukey's biweight M-estimate (or bisquare)
u(r) = {(1 − r²/c²)₊}², where a₊ = max{0, a}.
Linear near zero. Smooth (continuous second derivatives). Strongly redescending to 0.
NOT AN MLE.

27 CAUCHY MLE
ψ(r) = (r/c) / (1 + r²/c²).
NOT A STRONG REDESCENDER.

28 COMPUTATIONS
IRLS: iteratively reweighted least squares algorithm.
θ̂_{k+1} = Σ_i w_{i,k} x_i / Σ_i w_{i,k}, where w_{i,k} = u(x_i − θ̂_k) and θ̂_0 is any initial value.
Reweighted mean = one-step M-estimate:
θ̂_1 = Σ_i w_{i,0} x_i / Σ_i w_{i,0}, where θ̂_0 is a preliminary robust estimate of location, such as the median.
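A minimal sketch of this IRLS iteration for a Huber-type location M-estimate, started from the median (no auxiliary scale is used here, so the tuning constant acts on the raw residuals; function names and defaults are mine):

```python
import numpy as np

def u_huber(r, c=1.345):
    """Weight function u(r) = psi(r)/r for Huber's psi (u(0) = 1)."""
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def m_location(x, u=u_huber, tol=1e-10, max_iter=200):
    theta = np.median(x)                 # robust starting value
    for _ in range(max_iter):
        w = u(x - theta)
        new = np.sum(w * x) / np.sum(w)  # reweighted mean
        done = abs(new - theta) < tol
        theta = new
        if done:
            break
    return theta

x = np.array([2.0, 4.0, 5.0, 10.0, 12.0, 14.0, 200.0])
print(m_location(x), x.mean())
```

On this contaminated sample the iteration settles near the bulk of the data, far below the sample mean of about 35.3.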

29 PROOF OF CONVERGENCE (Sketch)
θ̂_{k+1} − θ̂_k = Σ_i w_{i,k} (x_i − θ̂_k) / Σ_i w_{i,k} = Σ_i ψ(x_i − θ̂_k) / Σ_i u(x_i − θ̂_k).
Note: if θ̂_k > θ̂ then θ̂_k > θ̂_{k+1}, and if θ̂_k < θ̂ then θ̂_k < θ̂_{k+1}.
Decreasing objective function: let ρ(r) = ρ₀(r²) and suppose ρ₀′(s) ≥ 0 and ρ₀″(s) ≤ 0. Then
Σ_i ρ(x_i − θ̂_{k+1}) < Σ_i ρ(x_i − θ̂_k).
Examples: Mean, ρ₀(s) = s. Median, ρ₀(s) = √s. Cauchy MLE, ρ₀(s) = log(1 + s).
Includes redescending M-estimates of location.
A generalization of the EM algorithm for a mixture of normals.

30 PROOF OF MONOTONE CONVERGENCE (Sketch)
Let r_{i,k} = x_i − θ̂_k. By Taylor series with remainder,
ρ₀(r²_{i,k+1}) = ρ₀(r²_{i,k}) + (r²_{i,k+1} − r²_{i,k}) ρ₀′(r²_{i,k}) + (1/2)(r²_{i,k+1} − r²_{i,k})² ρ₀″(r²_{i,*}).
Since ρ₀″ ≤ 0,
Σ_i ρ₀(r²_{i,k+1}) ≤ Σ_i ρ₀(r²_{i,k}) + Σ_i (r²_{i,k+1} − r²_{i,k}) ρ₀′(r²_{i,k}).
Now, r u(r) = ψ(r) = ρ′(r) = 2r ρ₀′(r²), so ρ₀′(r²) = u(r)/2. Also,
r²_{i,k+1} − r²_{i,k} = (θ̂_{k+1} − θ̂_k)² − 2(θ̂_{k+1} − θ̂_k)(x_i − θ̂_k).
Thus
Σ_i ρ₀(r²_{i,k+1}) ≤ Σ_i ρ₀(r²_{i,k}) − (1/2)(θ̂_{k+1} − θ̂_k)² Σ_i u(r_{i,k}) < Σ_i ρ₀(r²_{i,k}).
Slow but sure: switch to Newton–Raphson after a few iterations.

31 PART 2 MORE ADVANCED CONCEPTS AND METHODS

32 SCALE EQUIVARIANCE
M-estimates of location alone are not scale equivariant in general:
X_i → X_i + a ⇒ θ̂ → θ̂ + a (location equivariant), but
X_i → b X_i does not give θ̂ → b θ̂ (not scale equivariant).
(Exceptions: the mean and the median.)
Thus the adaptive weights do not depend on the spread of the data: points at a fixed distance from the center are equally downweighted regardless of scale.

33 SCALE STATISTICS: s_n
Equivariance: X_i → b X_i + a ⇒ s_n → |b| s_n.
Sample standard deviation: s_n = √[ (1/(n−1)) Σ_i (x_i − x̄)² ].
MAD (or more appropriately MADAM): the Median Absolute Deviation About the Median,
s_n = Median |x_i − Median(x)|.
Rescaled version: s_n / 0.6745 ≈ 1.483 s_n →_p σ at Normal(µ, σ²).

34 Example: 2, 4, 5, 10, 12, 14, 200
Median = 10.
Absolute deviations about the median: 8, 6, 5, 0, 2, 4, 190.
MADAM = 5, so the rescaled value is 1.483 × 5 ≈ 7.4.
(Standard deviation ≈ 72.8.)
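The example can be checked in a few lines (an illustrative sketch; 1.4826 ≈ 1/0.6745 is the usual normal-consistency factor):

```python
import numpy as np

x = np.array([2.0, 4.0, 5.0, 10.0, 12.0, 14.0, 200.0])

med = np.median(x)                   # 10.0
madam = np.median(np.abs(x - med))   # 5.0
s_rescaled = 1.4826 * madam          # roughly sigma at the normal model
print(med, madam, s_rescaled, x.std(ddof=1))
```

The single outlier barely moves the MADAM but inflates the standard deviation by an order of magnitude.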

35 M-ESTIMATES OF LOCATION WITH AUXILIARY SCALE
θ̂ = arg min_θ Σ_i ρ((x_i − θ)/(c s_n)), or θ̂ solves Σ_i ψ((x_i − θ)/(c s_n)) = 0.
c: tuning constant. s_n: consistent for σ at the normal.

36 Tuning an M-estimate
Given a ψ-function, define a class of M-estimates via ψ_c(r) = ψ(r/c).
Tukey's biweight: ψ(r) = r {(1 − r²)₊}², so ψ_c(r) = (r/c) {(1 − r²/c²)₊}².
As c → ∞: the mean. As c → 0: locally very unstable.

37 Normal distribution. Auxiliary scale.

38 MORE THAN ONE OUTLIER
GES: a measure of local robustness. Measures of global robustness?
X_n = {X_1, ..., X_n}: n good data points. Y_m = {Y_1, ..., Y_m}: m bad data points.
Z_{n+m} = X_n ∪ Y_m: an ɛ_m-contaminated sample, with ɛ_m = m/(n+m).
Bias of a statistic: |T(X_n ∪ Y_m) − T(X_n)|.

39 MAX-BIAS AND THE BREAKDOWN POINT
Max-bias under ɛ_m-contamination: B(ɛ_m; T, X_n) = sup_{Y_m} |T(X_n ∪ Y_m) − T(X_n)|.
Finite-sample contamination breakdown point (Donoho and Huber, 1983):
ɛ*_c(T; X_n) = inf{ɛ_m : B(ɛ_m; T, X_n) = ∞}.
Other concepts of breakdown exist (e.g. replacement breakdown).
The definition of BIAS can be modified so that BIAS → ∞ as T(X_n ∪ Y_m) goes to the boundary of the parameter space. Example: if T represents scale, choose BIAS = |log T(X_n ∪ Y_m) − log T(X_n)|.

40 EXAMPLES
Mean: ɛ*_c = 1/(n+1).
Median: ɛ*_c = 1/2.
α-trimmed mean: ɛ*_c = α.
α-Winsorized mean: ɛ*_c = α.
M-estimate of location with monotone and bounded ψ-function: ɛ*_c = 1/2.

41 PROOF (sketch of lower bound)
Let K = sup_r |ψ(r)|. Then
0 = Σ_{i=1}^{n+m} ψ(z_i − T_{n+m}) = Σ_{i=1}^{n} ψ(x_i − T_{n+m}) + Σ_{i=1}^{m} ψ(y_i − T_{n+m}),
so |Σ_{i=1}^{n} ψ(x_i − T_{n+m})| = |Σ_{i=1}^{m} ψ(y_i − T_{n+m})| ≤ mK.
If breakdown occurs, T_{n+m} → ∞ say, then Σ_{i=1}^{n} ψ(x_i − T_{n+m}) → n ψ(−∞) = −nK.
Therefore m ≥ n, and so ɛ*_c ≥ 1/2. []

42 Population version of the breakdown point, under contamination neighborhoods
Model distribution: F. Contaminating distribution: H. ɛ-contaminated distribution: F_ɛ = (1 − ɛ) F + ɛ H.
Max-bias under ɛ-contamination: B(ɛ; T, F) = sup_H |T(F_ɛ) − T(F)|.
Breakdown point (Hampel, 1968): ɛ*(T; F) = inf{ɛ : B(ɛ; T, F) = ∞}.
Examples:
Mean: T(F) = E_F(X), ɛ*(T; F) = 0.
Median: T(F) = F⁻¹(1/2), ɛ*(T; F) = 1/2.

43 ILLUSTRATION
GES = ∂B(ɛ; T, F)/∂ɛ at ɛ = 0; ɛ*(T; F) is the location of the vertical asymptote of the max-bias curve.

44 Heuristic interpretation of the breakdown point (subject to debate)
The proportion of bad data a statistic can tolerate before becoming arbitrary or meaningless.
If half the data is bad, one cannot distinguish between the good data and the bad data, so ɛ* ≤ 1/2?
Discussion: Davies and Gather (2005), Annals of Statistics.

45 Example: redescending M-estimates of location with fixed scale
T(x_1, ..., x_n) = arg min_t Σ_i ρ((x_i − t)/c).
Breakdown point (Huber, 1984): for bounded increasing ρ with sup_r ρ(r) = 1,
ɛ*_c(T; X_n) = [1 − A(X_n; c)/n] / [2 − A(X_n; c)/n], where A(X_n; c) = min_t Σ_i ρ((x_i − t)/c).
The breakdown point depends on X_n and c: ɛ* goes from 0 to 1/2 as c goes from 0 to ∞.
But for large c, T(x_1, ..., x_n) ≈ Mean!!

46 Explanation: relationship between redescending M-estimates of location and kernel density estimates
Chu, Glad, Godtliebsen and Marron (1998).
Objective function: Σ_i ρ((x_i − t)/c). Kernel density estimate: f̂(t) ∝ Σ_i κ((x_i − t)/h).
Relationship: κ ∝ 1 − ρ, and c = window width h.
Example: κ(r) = (1/√(2π)) exp(−r²/2), the normal kernel ⇔ ρ(r) = 1 − exp(−r²/2), Welsch's M-estimate.
Example: Epanechnikov kernel ⇔ skipped mean.

47 If the outliers are less compact than the good data, breakdown will not occur. This is not true for monotonic M-estimates.

48 PART 3 ROBUST REGRESSION AND MULTIVARIATE STATISTICS

49 REGRESSION SETTING
Data: (Y_i, X_i), i = 1, ..., n, with Y_i ∈ R the response and X_i ∈ R^p the predictors.
Predict Y by X′β. Residual for a given β: r_i(β) = y_i − x_i′β.
M-estimates of regression: a generalization of the MLE for a symmetric error term,
Y_i = X_i′β + ɛ_i, where the ɛ_i are i.i.d. symmetric.

50 M-ESTIMATES OF REGRESSION
Objective function approach: min_β Σ_i ρ(r_i(β)), where ρ(r) = ρ(−r) and ρ is nondecreasing for r ≥ 0.
M-estimating equation approach (e.g. ψ = ρ′): Σ_i ψ(r_i(β)) x_i = 0.
Interpretation: adaptively weighted least squares. Express ψ(r) = r u(r) and w_i = u(r_i(β̂)); then
β̂ = [Σ_i w_i x_i x_i′]⁻¹ [Σ_i w_i y_i x_i].
Computations via the IRLS algorithm.
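A minimal IRLS sketch for a Huber-type regression M-estimate (function names, the least-squares start, the MADAM standardization of residuals, and the simulated data are my illustrative choices, not from the slides):

```python
import numpy as np

def u_huber(r, c=1.345):
    a = np.abs(r)
    return np.where(a <= c, 1.0, c / np.maximum(a, 1e-12))

def m_regression(X, y, n_iter=50):
    """IRLS for a regression M-estimate with Huber weights; residuals are
    standardized by the MADAM of the residuals (auxiliary scale)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]     # least-squares start
    for _ in range(n_iter):
        r = y - X @ beta
        s = 1.4826 * np.median(np.abs(r - np.median(r)))
        w = u_huber(r / max(s, 1e-12))
        XW = X * w[:, None]
        beta = np.linalg.solve(XW.T @ X, XW.T @ y)  # weighted LS step
    return beta

# Line y = 1 + 2x, with 10% gross outliers in the response.
rng = np.random.default_rng(2)
xv = rng.uniform(0, 10, 50)
y = 1 + 2 * xv + 0.1 * rng.standard_normal(50)
y[:5] += 100.0
X = np.column_stack([np.ones_like(xv), xv])
print(m_regression(X, y))
```

With 10% of the responses shifted by 100, the weighted fit stays near the true line while ordinary least squares does not.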

51 INFLUENCE FUNCTIONS FOR M-ESTIMATES OF REGRESSION
IF(y, x; T, F) ∝ ψ(r) x, where r = y − x′T(F).
Due to the residual: ψ(r). Due to the design: x. GES = ∞, i.e. unbounded influence.
Outlier 1: highly influential for any ψ-function.
Outlier 2: highly influential for a monotonic ψ, but not for a redescending ψ.

52 BOUNDED INFLUENCE REGRESSION
GM-estimates (generalized M-estimates).
Mallows type (Mallows, 1975): Σ_i w(x_i) ψ(r_i(β)) x_i = 0. Downweights outlying design points (leverage points), even if they are good leverage points.
General form (e.g. Maronna and Yohai, 1981): Σ_i ψ(x_i, r_i(β)) x_i = 0.
Breakdown points: M-estimates, ɛ* = 0; GM-estimates, ɛ* ≤ 1/(p+1).

53 LATE 70s – EARLY 80s
Open problem: is high-breakdown-point regression possible?
Yes: the repeated median, Siegel (1982). But it is not regression equivariant.
Regression equivariance: for a ∈ R and A nonsingular,
(Y_i, X_i) → (a Y_i, A′X_i) ⇒ β̂ → a A⁻¹ β̂, i.e. Ŷ_i → a Ŷ_i.
Open problem: is high-breakdown-point equivariant regression possible?
Yes: least median of squares. Hampel (1984), Rousseeuw (1984).

54 Least Median of Squares (LMS) Hampel (1984), Rousseeuw (1984)
min_β Median{r_i(β)² : i = 1, ..., n}. Alternatively: min_β MAD{|r_i(β)|}.
Breakdown point: ɛ* = 1/2.
Location version: the SHORTH (Princeton robustness study), the midpoint of the SHORTest Half.

55 LMS: the mid-line of the shortest strip containing half of the data.
Problem: not √n-consistent, but only cube-root-n consistent: n^(1/3) (β̂ − β) = O_p(1).
Not locally stable, e.g. even when the data are pure noise.

56 S-ESTIMATES OF REGRESSION Rousseeuw and Yohai (1984)
For S(·) an estimate of scale (about zero): min_β S(r_1(β), ..., r_n(β)).
S = MAD ⇒ LMS. S = sample standard deviation about 0 ⇒ least squares.
Bounded monotone M-estimates of scale (about zero): Σ_i χ(|r_i|/s) = 0, with χ nondecreasing and bounded above and below. Alternatively, for ρ nondecreasing with 0 ≤ ρ ≤ 1:
(1/n) Σ_i ρ(|r_i|/s) = ɛ.
For LMS: ρ is the 0–1 jump function and ɛ = 1/2.
Breakdown point: ɛ* = min(ɛ, 1 − ɛ).

57 S-ESTIMATES OF REGRESSION
√n-consistent and asymptotically normal.
Trade-off: a higher breakdown point means lower efficiency and higher residual gross error sensitivity at normal errors.
One resolution: a one-step M-estimate via IRLS. (Implemented in R, S-PLUS, Matlab.)

58 M-ESTIMATES OF REGRESSION WITH GENERAL SCALE Martin, Yohai and Zamar (1989)
min_β Σ_i ρ((y_i − x_i′β)/(c s_n)).
Parameter of interest: β. Scale statistic: s_n (consistent for σ at normal errors). Tuning constant: c.
Bounded monotone ρ ⇒ redescending ψ, i.e. a redescending M-estimate.
High-breakdown-point examples: LMS and S-estimates; MM-estimates, Yohai (1987); CM-estimates, Mendes and Tyler (1994).

59 [Plot: s_n vs. Σ_i ρ((y_i − x_i′β)/(c s_n)); each β gives one curve. S-estimates minimize horizontally; M-estimates minimize vertically.]

60 MM-ESTIMATES OF REGRESSION
The default robust regression estimate in S-PLUS (also SAS?).
Given a preliminary estimate of regression with breakdown point 1/2, usually an S-estimate of regression.
Compute a monotone M-estimate of scale about zero for the residuals; s_n usually comes from the S-estimate.
Compute an M-estimate of regression using the scale statistic s_n and any desired tuning constant c. Tune to obtain reasonable ARE and residual GES.
Breakdown point: ɛ* = 1/2.

61 TUNING
High-breakdown-point S-estimates are badly tuned M-estimates.
Tuning an S-estimate affects its breakdown point; MM-estimates can be tuned without affecting the breakdown point.

62 COMPUTATIONAL ISSUES
All known high-breakdown-point regression estimates are computationally intensive: the optimization problem is non-convex, so approximate or stochastic algorithms are used.
Random elemental subset selection, Rousseeuw (1984): consider the exact fits of lines to p of the data points and optimize the criterion over such lines. There are (n choose p) such lines. For large n and p, randomly sample from them so that there is a good chance, say 95%, that at least one elemental subset contains only good data even if half the data is contaminated.
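The required number of random elemental subsets follows from a simple probability calculation: each subset of size p is outlier-free with probability (1 − ɛ)^p, so N draws miss every clean subset with probability (1 − (1 − ɛ)^p)^N. A sketch (the function name is mine; this is the standard large-n approximation that ignores sampling without replacement):

```python
from math import ceil, log

def n_subsets(p, eps=0.5, prob=0.95):
    """Number of random elemental subsets of size p needed so that, with
    probability >= prob, at least one subset is outlier-free when a
    fraction eps of the data is contaminated."""
    good = (1 - eps) ** p                       # P(one subset is all good)
    return ceil(log(1 - prob) / log(1 - good))

for p in (2, 5, 10):
    print(p, n_subsets(p))
```

The count grows rapidly with p: with half the data contaminated, 11 subsets suffice for a line (p = 2) but hundreds are needed by p = 7 or 8.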

63 MULTIVARIATE DATA

64 Robust Multivariate Location and Covariance Estimates: Parallel Developments
Multivariate M-estimates: Maronna (1976), Huber (1977). Adaptively weighted means and covariances; IRLS algorithms. Breakdown point: ɛ* ≤ 1/(d+1), where d is the dimension of the data.
Minimum volume ellipsoid (MVE) estimates: Rousseeuw (1985). The MVE is a multivariate version of LMS.
Multivariate S-estimates: Davies (1987), Lopuhaä (1989).
Multivariate CM-estimates: Kent and Tyler (1996).
Multivariate MM-estimates: Tatsuoka and Tyler (2000), Tyler (2002).

65 MULTIVARIATE DATA
d-dimensional data set: X_1, ..., X_n.
Classical summary statistics:
Sample mean vector: X̄ = (1/n) Σ_i X_i.
Sample variance-covariance matrix: S_n = (1/n) Σ_i (X_i − X̄)(X_i − X̄)′ = {s_ij}, where s_ii = sample variance and s_ij = sample covariance.
These are the maximum likelihood estimates under normality.

66 Sample mean and covariance: Hertzsprung–Russell galaxy data (Rousseeuw).
Visualization: the ellipse (X − X̄)′ S_n⁻¹ (X − X̄) ≤ c containing half the data.

67 Mahalanobis distances: d_i² = (X_i − X̄)′ S_n⁻¹ (X_i − X̄). [Plot: Mahalanobis distance vs. index.]
Mahalanobis angles: θ_{i,j} = cos⁻¹{ (X_i − X̄)′ S_n⁻¹ (X_j − X̄) / (d_i d_j) }.

68 [Plot: PC2 vs. PC1.] Principal components do not always reveal outliers.

69 ENGINEERING APPLICATIONS
Noisy signal arrays, radar clutter. Tests for a signal (detectors) based on the Mahalanobis angle between the signal and the observed data.
Hyperspectral data: the adaptive cosine estimator = Mahalanobis angle.
BIG DATA: cannot clean the data observation by observation; automatic methods are needed.

70 ROBUST MULTIVARIATE ESTIMATES: Location and Scatter (Pseudo-Covariance)
Trimmed versions: complicated.
Weighted mean and covariance matrices:
µ̂ = Σ_i w_i X_i / Σ_i w_i, Σ̂ = (1/n) Σ_i w_i (X_i − µ̂)(X_i − µ̂)′,
where w_i = u(d_{i,0}), d²_{i,0} = (X_i − µ̂_0)′ Σ̂_0⁻¹ (X_i − µ̂_0), and µ̂_0, Σ̂_0 are initial estimates, e.g. the sample mean vector and covariance matrix, or high-breakdown-point estimates.

71 MULTIVARIATE M-ESTIMATES
MLE-type estimates for elliptically symmetric distributions; adaptively weighted means and covariances:
µ̂ = Σ_i w_i X_i / Σ_i w_i, Σ̂ = (1/n) Σ_i w_i (X_i − µ̂)(X_i − µ̂)′,
with w_i = u(d_i) and d_i² = (X_i − µ̂)′ Σ̂⁻¹ (X_i − µ̂).
These are implicit equations.
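Because the weights depend on the estimates and vice versa, the equations are solved by a fixed-point (IRLS) iteration. A sketch with a hard-rejection weight u(d) = 1 if d ≤ c and 0 otherwise (my illustrative choice, not a weight from the slides; the consistency factor for the scatter matrix is omitted):

```python
import numpy as np

def mv_m_estimate(X, c=3.0, n_iter=100):
    """Fixed-point (IRLS) iteration for multivariate location/scatter with
    a hard-rejection weight: observations with Mahalanobis distance > c
    get weight 0, the rest weight 1."""
    n, p = X.shape
    mu = np.median(X, axis=0)            # crude robust start
    sigma = np.cov(X, rowvar=False)
    for _ in range(n_iter):
        diff = X - mu
        d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(sigma), diff)
        w = (d2 <= c**2).astype(float)
        mu = (w[:, None] * X).sum(axis=0) / w.sum()
        diff = X - mu
        sigma = (w[:, None] * diff).T @ diff / n
    return mu, sigma

rng = np.random.default_rng(3)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=200)
X[:10] += 20.0                           # 5% gross outliers
mu, sigma = mv_m_estimate(X)
print(np.round(mu, 2))
```

The outlying cluster is rejected after the first iteration, so the location estimate stays near the origin and the scatter estimate near the clean covariance.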

72 PROPERTIES OF M-ESTIMATES
Root-n consistent and asymptotically normal.
Can be tuned to have desirable properties: high efficiency over a broad class of distributions; a smooth and bounded influence function.
Computationally simple (IRLS).
Breakdown point: ɛ* ≤ 1/(d+1).

73 HIGH BREAKDOWN POINT ESTIMATES
MVE (minimum volume ellipsoid), ɛ* = 0.5: the center and scatter matrix of the smallest ellipsoid covering at least half of the data.
MCD, S-estimates, MM-estimates, CM-estimates: ɛ* = 0.5.
Reweighted versions based on a high-breakdown start.
Computationally intensive: approximate/probabilistic algorithms; the elemental subset approach.

74 Sample Mean and Covariance Hertzsprung Russell Galaxy Data (Rousseeuw)

75 Add MVE

76 Add Cauchy MLE

77 ELLIPTICAL DISTRIBUTIONS and AFFINE EQUIVARIANCE

78 SPHERICALLY SYMMETRIC DISTRIBUTIONS
Z = R U, where R is independent of U, R > 0, and U is uniform on the unit sphere.
Density: f(z) = g(z′z).

79 ELLIPTICALLY SYMMETRIC DISTRIBUTIONS
X = AZ + µ ~ E(µ, Γ; g), where Γ = AA′.
f(x) = |Γ|^(−1/2) g((x − µ)′ Γ⁻¹ (x − µ)).

80 AFFINE EQUIVARIANCE
Parameters of elliptical distributions: X ~ E(µ, Γ; g) ⇒ X* = BX + b ~ E(µ*, Γ*; g), where µ* = Bµ + b and Γ* = BΓB′.
Sample version: X_i → X_i* = BX_i + b, i = 1, ..., n ⇒ µ̂ → µ̂* = Bµ̂ + b and V̂ → V̂* = BV̂B′.
Examples: the mean vector and covariance matrix; M-estimates; MVE.

81 ELLIPTICALLY SYMMETRIC DISTRIBUTIONS
f(x; µ, Γ) = |Γ|^(−1/2) g((x − µ)′ Γ⁻¹ (x − µ)).
If second moments exist, the shape matrix Γ ∝ the covariance matrix.
For any affine equivariant scatter functional, Σ(F) ∝ Γ. That is, any affine equivariant scatter statistic estimates a matrix proportional to the shape matrix Γ.
NON-ELLIPTICAL DISTRIBUTIONS: different scatter matrices estimate different population quantities.
Analogy: for non-symmetric distributions, population mean ≠ population median.

82 OTHER TOPICS: Projection-based multivariate approaches
Tukey's data depth (or half-space depth): Tukey (1974).
Stahel–Donoho estimate: Stahel (1981), Donoho (1982).
Projection estimates: Maronna, Stahel and Yohai (1994); Tyler (1996).
Computationally intensive.

83 TUKEY S DEPTH

84 PART 5 ROBUSTNESS AND COMPUTER VISION Independent development of robust statistics within the computer vision/image understanding community.

85

86

87

88 Hough transform: ρ = x cos(θ) + y sin(θ), i.e. y = a + bx where a = ρ/sin(θ) and b = −cos(θ)/sin(θ).
That is, intersection refers to the line connecting two points.
Hough: estimation in data space becomes clustering in feature space. Find the centers of the clusters.
Terminology: feature space = parameter space; accumulator = elemental fit.
Computation: RANSAC (random sample consensus). Randomly choose a subset from the accumulator (random elemental fits) and check how many data points are within a fixed neighborhood of the fit.

89 Alternative formulation of the Hough transform/RANSAC
Σ_i ρ(r_i) should be small, where ρ(r) = 1 for |r| > R and 0 for |r| ≤ R.
That is, a redescending M-estimate of regression with known scale. Note the relationship to LMS.
Vision: need e.g. a 90% breakdown point, i.e. tolerate 90% bad data.

90 Definition of residuals?
The Hough transform approach does not distinguish between:
the regression line for regressing Y on X;
the regression line for regressing X on Y;
the orthogonal regression line.
(Note: with small stochastic errors there is little difference.)

91 GENERAL PARADIGM
Line in 2-D: exact fit to all pairs. Quadratic in 2-D: exact fit to all triples.
Conic sections, ellipse fitting: (x − µ)′A(x − µ) = 1.
Linearizing: let x* = (x_1, x_2, x_1², x_2², x_1 x_2, 1)′; the ellipse becomes a′x* = 0. Exact fit to all subsets of size 5.

92 Hyperboloid fitting in 4-D
Epipolar geometry: exact fit to 5 pairs for calibrated cameras (location, rotation, scale); 8 pairs for uncalibrated cameras (focal point, image plane). A hyperboloid surface in 4-D.
Applications: 2D → 3D, motion. OUTLIERS = MISMATCHES.


More information

MULTIVARIATE TECHNIQUES, ROBUSTNESS

MULTIVARIATE TECHNIQUES, ROBUSTNESS MULTIVARIATE TECHNIQUES, ROBUSTNESS Mia Hubert Associate Professor, Department of Mathematics and L-STAT Katholieke Universiteit Leuven, Belgium mia.hubert@wis.kuleuven.be Peter J. Rousseeuw 1 Senior Researcher,

More information

INVARIANT COORDINATE SELECTION

INVARIANT COORDINATE SELECTION INVARIANT COORDINATE SELECTION By David E. Tyler 1, Frank Critchley, Lutz Dümbgen 2, and Hannu Oja Rutgers University, Open University, University of Berne and University of Tampere SUMMARY A general method

More information

9. Robust regression

9. Robust regression 9. Robust regression Least squares regression........................................................ 2 Problems with LS regression..................................................... 3 Robust regression............................................................

More information

A Robust Strategy for Joint Data Reconciliation and Parameter Estimation

A Robust Strategy for Joint Data Reconciliation and Parameter Estimation A Robust Strategy for Joint Data Reconciliation and Parameter Estimation Yen Yen Joe 1) 3), David Wang ), Chi Bun Ching 3), Arthur Tay 1), Weng Khuen Ho 1) and Jose Romagnoli ) * 1) Dept. of Electrical

More information

Small Sample Corrections for LTS and MCD

Small Sample Corrections for LTS and MCD myjournal manuscript No. (will be inserted by the editor) Small Sample Corrections for LTS and MCD G. Pison, S. Van Aelst, and G. Willems Department of Mathematics and Computer Science, Universitaire Instelling

More information

COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION

COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION (REFEREED RESEARCH) COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION Hakan S. Sazak 1, *, Hülya Yılmaz 2 1 Ege University, Department

More information

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL

A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Discussiones Mathematicae Probability and Statistics 36 206 43 5 doi:0.75/dmps.80 A NOTE ON ROBUST ESTIMATION IN LOGISTIC REGRESSION MODEL Tadeusz Bednarski Wroclaw University e-mail: t.bednarski@prawo.uni.wroc.pl

More information

Minimum distance estimation for the logistic regression model

Minimum distance estimation for the logistic regression model Biometrika (2005), 92, 3, pp. 724 731 2005 Biometrika Trust Printed in Great Britain Minimum distance estimation for the logistic regression model BY HOWARD D. BONDELL Department of Statistics, Rutgers

More information

Two Simple Resistant Regression Estimators

Two Simple Resistant Regression Estimators Two Simple Resistant Regression Estimators David J. Olive Southern Illinois University January 13, 2005 Abstract Two simple resistant regression estimators with O P (n 1/2 ) convergence rate are presented.

More information

arxiv: v1 [math.st] 2 Aug 2007

arxiv: v1 [math.st] 2 Aug 2007 The Annals of Statistics 2007, Vol. 35, No. 1, 13 40 DOI: 10.1214/009053606000000975 c Institute of Mathematical Statistics, 2007 arxiv:0708.0390v1 [math.st] 2 Aug 2007 ON THE MAXIMUM BIAS FUNCTIONS OF

More information

Globally Robust Inference

Globally Robust Inference Globally Robust Inference Jorge Adrover Univ. Nacional de Cordoba and CIEM, Argentina Jose Ramon Berrendero Universidad Autonoma de Madrid, Spain Matias Salibian-Barrera Carleton University, Canada Ruben

More information

robustness, efficiency, breakdown point, outliers, rank-based procedures, least absolute regression

robustness, efficiency, breakdown point, outliers, rank-based procedures, least absolute regression Robust Statistics robustness, efficiency, breakdown point, outliers, rank-based procedures, least absolute regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html

More information

Robust Statistics for Signal Processing

Robust Statistics for Signal Processing Robust Statistics for Signal Processing Abdelhak M Zoubir Signal Processing Group Signal Processing Group Technische Universität Darmstadt email: zoubir@spg.tu-darmstadt.de URL: www.spg.tu-darmstadt.de

More information

Nonrobust and Robust Objective Functions

Nonrobust and Robust Objective Functions Nonrobust and Robust Objective Functions The objective function of the estimators in the input space is built from the sum of squared Mahalanobis distances (residuals) d 2 i = 1 σ 2(y i y io ) C + y i

More information

Identification of Multivariate Outliers: A Performance Study

Identification of Multivariate Outliers: A Performance Study AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 127 138 Identification of Multivariate Outliers: A Performance Study Peter Filzmoser Vienna University of Technology, Austria Abstract: Three

More information

Robust estimation for linear regression with asymmetric errors with applications to log-gamma regression

Robust estimation for linear regression with asymmetric errors with applications to log-gamma regression Robust estimation for linear regression with asymmetric errors with applications to log-gamma regression Ana M. Bianco, Marta García Ben and Víctor J. Yohai Universidad de Buenps Aires October 24, 2003

More information

1 Robust statistics; Examples and Introduction

1 Robust statistics; Examples and Introduction Robust Statistics Laurie Davies 1 and Ursula Gather 2 1 Department of Mathematics, University of Essen, 45117 Essen, Germany, laurie.davies@uni-essen.de 2 Department of Statistics, University of Dortmund,

More information

Introduction to Robust Statistics. Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy

Introduction to Robust Statistics. Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy Introduction to Robust Statistics Anthony Atkinson, London School of Economics, UK Marco Riani, Univ. of Parma, Italy Multivariate analysis Multivariate location and scatter Data where the observations

More information

Robust model selection criteria for robust S and LT S estimators

Robust model selection criteria for robust S and LT S estimators Hacettepe Journal of Mathematics and Statistics Volume 45 (1) (2016), 153 164 Robust model selection criteria for robust S and LT S estimators Meral Çetin Abstract Outliers and multi-collinearity often

More information

Robust Variable Selection Through MAVE

Robust Variable Selection Through MAVE Robust Variable Selection Through MAVE Weixin Yao and Qin Wang Abstract Dimension reduction and variable selection play important roles in high dimensional data analysis. Wang and Yin (2008) proposed sparse

More information

Inference For High Dimensional M-estimates: Fixed Design Results

Inference For High Dimensional M-estimates: Fixed Design Results Inference For High Dimensional M-estimates: Fixed Design Results Lihua Lei, Peter Bickel and Noureddine El Karoui Department of Statistics, UC Berkeley Berkeley-Stanford Econometrics Jamboree, 2017 1/49

More information

Joint work with Nottingham colleagues Simon Preston and Michail Tsagris.

Joint work with Nottingham colleagues Simon Preston and Michail Tsagris. /pgf/stepx/.initial=1cm, /pgf/stepy/.initial=1cm, /pgf/step/.code=1/pgf/stepx/.expanded=- 10.95415pt,/pgf/stepy/.expanded=- 10.95415pt, /pgf/step/.value required /pgf/images/width/.estore in= /pgf/images/height/.estore

More information

A ROBUST METHOD OF ESTIMATING COVARIANCE MATRIX IN MULTIVARIATE DATA ANALYSIS G.M. OYEYEMI *, R.A. IPINYOMI **

A ROBUST METHOD OF ESTIMATING COVARIANCE MATRIX IN MULTIVARIATE DATA ANALYSIS G.M. OYEYEMI *, R.A. IPINYOMI ** ANALELE ŞTIINłIFICE ALE UNIVERSITĂłII ALEXANDRU IOAN CUZA DIN IAŞI Tomul LVI ŞtiinŃe Economice 9 A ROBUST METHOD OF ESTIMATING COVARIANCE MATRIX IN MULTIVARIATE DATA ANALYSIS G.M. OYEYEMI, R.A. IPINYOMI

More information

J. W. LEE (Kumoh Institute of Technology, Kumi, South Korea) V. I. SHIN (Gwangju Institute of Science and Technology, Gwangju, South Korea)

J. W. LEE (Kumoh Institute of Technology, Kumi, South Korea) V. I. SHIN (Gwangju Institute of Science and Technology, Gwangju, South Korea) J. W. LEE (Kumoh Institute of Technology, Kumi, South Korea) V. I. SHIN (Gwangju Institute of Science and Technology, Gwangju, South Korea) G. L. SHEVLYAKOV (Gwangju Institute of Science and Technology,

More information

High Breakdown Point Estimation in Regression

High Breakdown Point Estimation in Regression WDS'08 Proceedings of Contributed Papers, Part I, 94 99, 2008. ISBN 978-80-7378-065-4 MATFYZPRESS High Breakdown Point Estimation in Regression T. Jurczyk Charles University, Faculty of Mathematics and

More information

Accurate and Powerful Multivariate Outlier Detection

Accurate and Powerful Multivariate Outlier Detection Int. Statistical Inst.: Proc. 58th World Statistical Congress, 11, Dublin (Session CPS66) p.568 Accurate and Powerful Multivariate Outlier Detection Cerioli, Andrea Università di Parma, Dipartimento di

More information

Segmentation of Subspace Arrangements III Robust GPCA

Segmentation of Subspace Arrangements III Robust GPCA Segmentation of Subspace Arrangements III Robust GPCA Berkeley CS 294-6, Lecture 25 Dec. 3, 2006 Generalized Principal Component Analysis (GPCA): (an overview) x V 1 V 2 (x 3 = 0)or(x 1 = x 2 = 0) {x 1x

More information

Robust GM Wiener Filter in the Complex Domain

Robust GM Wiener Filter in the Complex Domain Robust GM Wiener Filter in the Complex Domain Matthew Greco Kayrish Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements

More information

Inference based on robust estimators Part 2

Inference based on robust estimators Part 2 Inference based on robust estimators Part 2 Matias Salibian-Barrera 1 Department of Statistics University of British Columbia ECARES - Dec 2007 Matias Salibian-Barrera (UBC) Robust inference (2) ECARES

More information

REGRESSION ANALYSIS AND ANALYSIS OF VARIANCE

REGRESSION ANALYSIS AND ANALYSIS OF VARIANCE REGRESSION ANALYSIS AND ANALYSIS OF VARIANCE P. L. Davies Eindhoven, February 2007 Reading List Daniel, C. (1976) Applications of Statistics to Industrial Experimentation, Wiley. Tukey, J. W. (1977) Exploratory

More information

The influence function of the Stahel-Donoho estimator of multivariate location and scatter

The influence function of the Stahel-Donoho estimator of multivariate location and scatter The influence function of the Stahel-Donoho estimator of multivariate location and scatter Daniel Gervini University of Zurich Abstract This article derives the influence function of the Stahel-Donoho

More information

Nonlinear Signal Processing ELEG 833

Nonlinear Signal Processing ELEG 833 Nonlinear Signal Processing ELEG 833 Gonzalo R. Arce Department of Electrical and Computer Engineering University of Delaware arce@ee.udel.edu May 5, 2005 8 MYRIAD SMOOTHERS 8 Myriad Smoothers 8.1 FLOM

More information

TITLE : Robust Control Charts for Monitoring Process Mean of. Phase-I Multivariate Individual Observations AUTHORS : Asokan Mulayath Variyath.

TITLE : Robust Control Charts for Monitoring Process Mean of. Phase-I Multivariate Individual Observations AUTHORS : Asokan Mulayath Variyath. TITLE : Robust Control Charts for Monitoring Process Mean of Phase-I Multivariate Individual Observations AUTHORS : Asokan Mulayath Variyath Department of Mathematics and Statistics, Memorial University

More information

DavidAllanBurt. Dissertation submitted to the Faculty of the. Virginia Polytechnic Institute and State University

DavidAllanBurt. Dissertation submitted to the Faculty of the. Virginia Polytechnic Institute and State University Bandwidth Selection Concerns for Jump Point Discontinuity Preservation in the Regression Setting Using M-smoothers and the Extension to Hypothesis Testing DavidAllanBurt Dissertation submitted to the Faculty

More information

Robust Linear Discriminant Analysis and the Projection Pursuit Approach

Robust Linear Discriminant Analysis and the Projection Pursuit Approach Robust Linear Discriminant Analysis and the Projection Pursuit Approach Practical aspects A. M. Pires 1 Department of Mathematics and Applied Mathematics Centre, Technical University of Lisbon (IST), Lisboa,

More information

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 4: Measures of Robustness, Robust Principal Component Analysis

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 4: Measures of Robustness, Robust Principal Component Analysis MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 4:, Robust Principal Component Analysis Contents Empirical Robust Statistical Methods In statistics, robust methods are methods that perform well

More information

KANSAS STATE UNIVERSITY Manhattan, Kansas

KANSAS STATE UNIVERSITY Manhattan, Kansas ROBUST MIXTURE MODELING by CHUN YU M.S., Kansas State University, 2008 AN ABSTRACT OF A DISSERTATION submitted in partial fulfillment of the requirements for the degree Doctor of Philosophy Department

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

Stahel-Donoho Estimation for High-Dimensional Data

Stahel-Donoho Estimation for High-Dimensional Data Stahel-Donoho Estimation for High-Dimensional Data Stefan Van Aelst KULeuven, Department of Mathematics, Section of Statistics Celestijnenlaan 200B, B-3001 Leuven, Belgium Email: Stefan.VanAelst@wis.kuleuven.be

More information

A Comparison of Robust Estimators Based on Two Types of Trimming

A Comparison of Robust Estimators Based on Two Types of Trimming Submitted to the Bernoulli A Comparison of Robust Estimators Based on Two Types of Trimming SUBHRA SANKAR DHAR 1, and PROBAL CHAUDHURI 1, 1 Theoretical Statistics and Mathematics Unit, Indian Statistical

More information

Robust high-dimensional linear regression: A statistical perspective

Robust high-dimensional linear regression: A statistical perspective Robust high-dimensional linear regression: A statistical perspective Po-Ling Loh University of Wisconsin - Madison Departments of ECE & Statistics STOC workshop on robustness and nonconvexity Montreal,

More information

Statistics 240 Lecture Notes

Statistics 240 Lecture Notes Statistics 240 Lecture Notes P.B. Stark www.stat.berkeley.edu/ stark/index.html November 23, 2010 1 Part 9: Robustness and related topics. ROUGH DRAFT References: Hampel, F.R., Rousseeuw, P.J., Ronchetti,

More information

Independent Component (IC) Models: New Extensions of the Multinormal Model

Independent Component (IC) Models: New Extensions of the Multinormal Model Independent Component (IC) Models: New Extensions of the Multinormal Model Davy Paindaveine (joint with Klaus Nordhausen, Hannu Oja, and Sara Taskinen) School of Public Health, ULB, April 2008 My research

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13)

Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13) Stat 5100 Handout #26: Variations on OLS Linear Regression (Ch. 11, 13) 1. Weighted Least Squares (textbook 11.1) Recall regression model Y = β 0 + β 1 X 1 +... + β p 1 X p 1 + ε in matrix form: (Ch. 5,

More information

Statistical Data Analysis

Statistical Data Analysis DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the

More information

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak.

P n. This is called the law of large numbers but it comes in two forms: Strong and Weak. Large Sample Theory Large Sample Theory is a name given to the search for approximations to the behaviour of statistical procedures which are derived by computing limits as the sample size, n, tends to

More information

Regression Analysis for Data Containing Outliers and High Leverage Points

Regression Analysis for Data Containing Outliers and High Leverage Points Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain

More information

Robust regression in R. Eva Cantoni

Robust regression in R. Eva Cantoni Robust regression in R Eva Cantoni Research Center for Statistics and Geneva School of Economics and Management, University of Geneva, Switzerland April 4th, 2017 1 Robust statistics philosopy 2 Robust

More information

ROBUST MINIMUM DISTANCE NEYMAN-PEARSON DETECTION OF A WEAK SIGNAL IN NON-GAUSSIAN NOISE

ROBUST MINIMUM DISTANCE NEYMAN-PEARSON DETECTION OF A WEAK SIGNAL IN NON-GAUSSIAN NOISE 17th European Signal Processing Conference EUSIPCO 2009) Glasgow, Scotland, August 24-28, 2009 ROBUST MIIMUM DISTACE EYMA-PEARSO DETECTIO OF A WEAK SIGAL I O-GAUSSIA OISE Georgy Shevlyakov, Kyungmin Lee,

More information

Geodesic Convexity and Regularized Scatter Estimation

Geodesic Convexity and Regularized Scatter Estimation Geodesic Convexity and Regularized Scatter Estimation Lutz Duembgen (Bern) David Tyler (Rutgers) Klaus Nordhausen (Turku/Vienna), Heike Schuhmacher (Bern) Markus Pauly (Ulm), Thomas Schweizer (Bern) Düsseldorf,

More information

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite

More information

Augmented Reality VU numerical optimization algorithms. Prof. Vincent Lepetit

Augmented Reality VU numerical optimization algorithms. Prof. Vincent Lepetit Augmented Reality VU numerical optimization algorithms Prof. Vincent Lepetit P3P: can exploit only 3 correspondences; DLT: difficult to exploit the knowledge of the internal parameters correctly, and the

More information

Computational Connections Between Robust Multivariate Analysis and Clustering

Computational Connections Between Robust Multivariate Analysis and Clustering 1 Computational Connections Between Robust Multivariate Analysis and Clustering David M. Rocke 1 and David L. Woodruff 2 1 Department of Applied Science, University of California at Davis, Davis, CA 95616,

More information

Robust multivariate methods in Chemometrics

Robust multivariate methods in Chemometrics Robust multivariate methods in Chemometrics Peter Filzmoser 1 Sven Serneels 2 Ricardo Maronna 3 Pierre J. Van Espen 4 1 Institut für Statistik und Wahrscheinlichkeitstheorie, Technical University of Vienna,

More information

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X. Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may

More information

Outliers Robustness in Multivariate Orthogonal Regression

Outliers Robustness in Multivariate Orthogonal Regression 674 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS PART A: SYSTEMS AND HUMANS, VOL. 30, NO. 6, NOVEMBER 2000 Outliers Robustness in Multivariate Orthogonal Regression Giuseppe Carlo Calafiore Abstract

More information

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R.

Wiley. Methods and Applications of Linear Models. Regression and the Analysis. of Variance. Third Edition. Ishpeming, Michigan RONALD R. Methods and Applications of Linear Models Regression and the Analysis of Variance Third Edition RONALD R. HOCKING PenHock Statistical Consultants Ishpeming, Michigan Wiley Contents Preface to the Third

More information

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1

MA 575 Linear Models: Cedric E. Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 MA 575 Linear Models: Cedric E Ginestet, Boston University Mixed Effects Estimation, Residuals Diagnostics Week 11, Lecture 1 1 Within-group Correlation Let us recall the simple two-level hierarchical

More information

Indian Statistical Institute

Indian Statistical Institute Indian Statistical Institute Introductory Computer programming Robust Regression methods with high breakdown point Author: Roll No: MD1701 February 24, 2018 Contents 1 Introduction 2 2 Criteria for evaluating

More information

Ch 4. Linear Models for Classification

Ch 4. Linear Models for Classification Ch 4. Linear Models for Classification Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Department of Computer Science and Engineering Pohang University of Science and echnology 77 Cheongam-ro,

More information

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality

An Empirical Characteristic Function Approach to Selecting a Transformation to Normality Communications for Statistical Applications and Methods 014, Vol. 1, No. 3, 13 4 DOI: http://dx.doi.org/10.5351/csam.014.1.3.13 ISSN 87-7843 An Empirical Characteristic Function Approach to Selecting a

More information

Inference For High Dimensional M-estimates. Fixed Design Results

Inference For High Dimensional M-estimates. Fixed Design Results : Fixed Design Results Lihua Lei Advisors: Peter J. Bickel, Michael I. Jordan joint work with Peter J. Bickel and Noureddine El Karoui Dec. 8, 2016 1/57 Table of Contents 1 Background 2 Main Results and

More information

Robust Statistics. Frank Klawonn

Robust Statistics. Frank Klawonn Robust Statistics Frank Klawonn f.klawonn@fh-wolfenbuettel.de, frank.klawonn@helmholtz-hzi.de Data Analysis and Pattern Recognition Lab Department of Computer Science University of Applied Sciences Braunschweig/Wolfenbüttel,

More information

Outlier Robust Nonlinear Mixed Model Estimation

Outlier Robust Nonlinear Mixed Model Estimation Outlier Robust Nonlinear Mixed Model Estimation 1 James D. Williams, 2 Jeffrey B. Birch and 3 Abdel-Salam G. Abdel-Salam 1 Business Analytics, Dow AgroSciences, 9330 Zionsville Rd. Indianapolis, IN 46268,

More information

Maximum Bias and Related Measures

Maximum Bias and Related Measures Maximum Bias and Related Measures We have seen before that the influence function and the gross-error-sensitivity can be used to quantify the robustness properties of a given estimate. However, these robustness

More information

ROBUST ESTIMATOR FOR MULTIPLE INLIER STRUCTURES

ROBUST ESTIMATOR FOR MULTIPLE INLIER STRUCTURES ROBUST ESTIMATOR FOR MULTIPLE INLIER STRUCTURES Xiang Yang (1) and Peter Meer (2) (1) Dept. of Mechanical and Aerospace Engineering (2) Dept. of Electrical and Computer Engineering Rutgers University,

More information

Robust scale estimation with extensions

Robust scale estimation with extensions Robust scale estimation with extensions Garth Tarr, Samuel Müller and Neville Weber School of Mathematics and Statistics THE UNIVERSITY OF SYDNEY Outline The robust scale estimator P n Robust covariance

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

Review and continuation from last week Properties of MLEs

Review and continuation from last week Properties of MLEs Review and continuation from last week Properties of MLEs As we have mentioned, MLEs have a nice intuitive property, and as we have seen, they have a certain equivariance property. We will see later that

More information

THE BREAKDOWN POINT EXAMPLES AND COUNTEREXAMPLES

THE BREAKDOWN POINT EXAMPLES AND COUNTEREXAMPLES REVSTAT Statistical Journal Volume 5, Number 1, March 2007, 1 17 THE BREAKDOWN POINT EXAMPLES AND COUNTEREXAMPLES Authors: P.L. Davies University of Duisburg-Essen, Germany, and Technical University Eindhoven,

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Inference for High Dimensional Robust Regression

Inference for High Dimensional Robust Regression Department of Statistics UC Berkeley Stanford-Berkeley Joint Colloquium, 2015 Table of Contents 1 Background 2 Main Results 3 OLS: A Motivating Example Table of Contents 1 Background 2 Main Results 3 OLS:

More information