EXTREME VALUE DISTRIBUTIONS AND ROBUST ESTIMATION

Similar documents
Overview of Extreme Value Theory. Dr. Sawsan Hilal space

Financial Econometrics and Volatility Models Extreme Value Theory

Extreme Value Theory and Applications

Introduction to Algorithmic Trading Strategies Lecture 10

Modelação de valores extremos e sua importância na

Extreme Value Analysis and Spatial Extremes

Shape of the return probability density function and extreme value statistics

OPTIMAL B-ROBUST ESTIMATORS FOR THE PARAMETERS OF THE GENERALIZED HALF-NORMAL DISTRIBUTION

PROPERTIES OF SELECTED SIGNIFICANCE TESTS FOR EXTREME VALUE INDEX

MFM Practitioner Module: Quantitiative Risk Management. John Dodson. October 14, 2015

Overview of Extreme Value Analysis (EVA)

Analysis methods of heavy-tailed data

Extreme Value Theory as a Theoretical Background for Power Law Behavior

A Note on Tail Behaviour of Distributions. the max domain of attraction of the Frechét / Weibull law under power normalization

Bayesian Point Process Modeling for Extreme Value Analysis, with an Application to Systemic Risk Assessment in Correlated Financial Markets

A NOTE ON SECOND ORDER CONDITIONS IN EXTREME VALUE THEORY: LINKING GENERAL AND HEAVY TAIL CONDITIONS

Math 576: Quantitative Risk Management

HIERARCHICAL MODELS IN EXTREME VALUE THEORY

Efficient Estimation of Distributional Tail Shape and the Extremal Index with Applications to Risk Management

EXTREMAL MODELS AND ENVIRONMENTAL APPLICATIONS. Rick Katz

Variable inspection plans for continuous populations with unknown short tail distributions

Wei-han Liu Department of Banking and Finance Tamkang University. R/Finance 2009 Conference 1

FORECAST VERIFICATION OF EXTREMES: USE OF EXTREME VALUE THEORY

On the Estimation and Application of Max-Stable Processes

Chapter 2 Asymptotics

Spatial and temporal extremes of wildfire sizes in Portugal ( )

ROBUST TESTS BASED ON MINIMUM DENSITY POWER DIVERGENCE ESTIMATORS AND SADDLEPOINT APPROXIMATIONS

Applying the proportional hazard premium calculation principle

A Closer Look at the Hill Estimator: Edgeworth Expansions and Confidence Intervals

VaR vs. Expected Shortfall

ASTIN Colloquium 1-4 October 2012, Mexico City

A simple graphical method to explore tail-dependence in stock-return pairs

Modified Kolmogorov-Smirnov Test of Goodness of Fit. Catalonia-BarcelonaTECH, Spain

Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, NC

COMPARISON OF THE ESTIMATORS OF THE LOCATION AND SCALE PARAMETERS UNDER THE MIXTURE AND OUTLIER MODELS VIA SIMULATION

Extreme Value Theory.

On the Application of the Generalized Pareto Distribution for Statistical Extrapolation in the Assessment of Dynamic Stability in Irregular Waves

Zwiers FW and Kharin VV Changes in the extremes of the climate simulated by CCC GCM2 under CO 2 doubling. J. Climate 11:

EXTREMAL QUANTILES OF MAXIMUMS FOR STATIONARY SEQUENCES WITH PSEUDO-STATIONARY TREND WITH APPLICATIONS IN ELECTRICITY CONSUMPTION ALEXANDR V.

PENULTIMATE APPROXIMATIONS FOR WEATHER AND CLIMATE EXTREMES. Rick Katz

Estimation de mesures de risques à partir des L p -quantiles

Modelling Operational Risk Using Bayesian Inference

Correlation: Copulas and Conditioning

A Class of Weighted Weibull Distributions and Its Properties. Mervat Mahdy Ramadan [a],*

Tail Dependence of Multivariate Pareto Distributions

Information Matrix for Pareto(IV), Burr, and Related Distributions

Modelling Multivariate Peaks-over-Thresholds using Generalized Pareto Distributions

Risk Aggregation and Model Uncertainty

Breakdown points of Cauchy regression-scale estimators

Introduction to Robust Statistics. Elvezio Ronchetti. Department of Econometrics University of Geneva Switzerland.

Estimation of Operational Risk Capital Charge under Parameter Uncertainty

Introduction to robust statistics*

Risk Aggregation. Paul Embrechts. Department of Mathematics, ETH Zurich Senior SFI Professor.

Research Article Strong Convergence Bound of the Pareto Index Estimator under Right Censoring

Change Point Analysis of Extreme Values

RISK AND EXTREMES: ASSESSING THE PROBABILITIES OF VERY RARE EVENTS

Multivariate Non-Normally Distributed Random Variables

The Fundamentals of Heavy Tails Properties, Emergence, & Identification. Jayakrishnan Nair, Adam Wierman, Bert Zwart

Bayesian Modelling of Extreme Rainfall Data

On the Haezendonck-Goovaerts Risk Measure for Extreme Risks

ROBUST AND EFFICIENT JOINT DATA RECONCILIATION PARAMETER ESTIMATION USING A GENERALIZED OBJECTIVE FUNCTION

Journal of Biostatistics and Epidemiology

Lifetime Dependence Modelling using a Generalized Multivariate Pareto Distribution

Abstract: In this short note, I comment on the research of Pisarenko et al. (2014) regarding the

Some conditional extremes of a Markov chain

Introduction Robust regression Examples Conclusion. Robust regression. Jiří Franc

In Chapter 2, some concepts from the robustness literature were introduced. An important concept was the inuence function. In the present chapter, the

Reducing Model Risk With Goodness-of-fit Victory Idowu London School of Economics

Generalized additive modelling of hydrological sample extremes

Lecture 8: Information Theory and Statistics

EVANESCE Implementation in S-PLUS FinMetrics Module. July 2, Insightful Corp

arxiv: v1 [math.st] 4 Aug 2017

Department of Econometrics and Business Statistics

High-frequency data modelling using Hawkes processes

Robust Statistics vs. MLE for OpRisk Severity Distribution Parameter Estimation

Probability and Estimation. Alan Moses

Lecture 14 October 13

A Course on Advanced Econometrics

STATISTICAL INFERENCE IN ACCELERATED LIFE TESTING WITH GEOMETRIC PROCESS MODEL. A Thesis. Presented to the. Faculty of. San Diego State University

Storm Open Library 3.0

Does k-th Moment Exist?

Volatility. Gerald P. Dwyer. February Clemson University

Lecture 12 Robust Estimation

Independent Component (IC) Models: New Extensions of the Multinormal Model

High-frequency data modelling using Hawkes processes

Generalized quantiles as risk measures

The Goodness-of-fit Test for Gumbel Distribution: A Comparative Study

Contributions to extreme-value analysis

On Backtesting Risk Measurement Models

Operational risk modeled analytically II: the consequences of classification invariance. Vivien BRUNEL

A MODIFICATION OF HILL S TAIL INDEX ESTIMATOR

Stat 5101 Lecture Notes

Efficient rare-event simulation for sums of dependent random varia

Extreme value theory and high quantile convergence

APPROXIMATING THE GENERALIZED BURR-GAMMA WITH A GENERALIZED PARETO-TYPE OF DISTRIBUTION A. VERSTER AND D.J. DE WAAL ABSTRACT

Bivariate generalized Pareto distribution

Explicit Bounds for the Distribution Function of the Sum of Dependent Normally Distributed Random Variables

An algorithm for robust fitting of autoregressive models Dimitris N. Politis

Notes on Mathematical Expectations and Classes of Distributions Introduction to Econometric Theory Econ. 770

R&D Research Project: Scaling analysis of hydrometeorological time series data

Transcription:

A C T A U N I V E R S I T A T I S L O D Z I E N S I S FOLIA OECONOMICA 228,2009 Grażyna Trzpiot EXTREME VALUE DISTRIBUTIONS AND ROBUST ESTIMATION Abstract. In parametric statistics estimators such as maximum likelihood or OLS typically estimate stochastic models, which play an important role in finance and insurance. These methods are generally optimal for an assumed reference model. Slight deviations from the assumed model may easy destroy the good statistical properties of the estimator. We present some aspects related to robust estimation in the context of extreme value theory (ETV). We discuss some methodological aspects how robust methods can improve the quality of extreme value theory data analysis by providing information on influential observations. Key words: Extreme value theory, Extreme value distributions, Robust estimation, ^/-estimator. I. INTRODUCTION Robust statistics achieves this by a set of different statistical frameworks that generalize classical statistical procedures such as maximum likelihood or OLS. Seminal contributions are Huber (1981) and Hampel et al. (1986). Since then many different and related approaches have emerged. Dell Aquila and Ronchetti (2006) give a comprehensive introduction to the principles of robust statistics estimation, testing and model selection and apply and extend the theory to different models used in risk management, asset allocation and insurance. We discuss how robust methods can improve the EVT data analysis by providing information on influential observations, deviating substructures and possible wrong specification of a model. Basically EVT is using in risk management, see Embrechts et al. (1997) and McNeil et al. (2005). We find that robust statistical methods can improve the data analysis process of the skilled analyst and provide him with useful additional information. We will shortly review some key concepts from EVT and robust statistics and next we will consider how the whole data analysis process can be improved by additionally using robust statistical procedures. ' Professor, Department of Statistics, University of Economics in Katowice.

II. EXTREME VALUE THEORY Extreme value theory is more and more used in recent years to model extremes o f financial and economic data or natural phenomena. The EVT framework provides on the one hand asymptotic distributions for the description of (normalized) maxima or minima and on the other hand the asymptotic distribution of extremes over a high threshold. Basic references with a focus on finance and insurance are Embrechts et al. (1997) and Malevergne and Somette (2006). 2.1. Generalized extreme value distributions The EVT analyses the asymptotic distribution of (normalized) maxima or minima of i.i.d. samples, i.e. Mn = max(xi,..., Xn). It turns out that under weak conditions, the normalized maximum of n i.i.d. random variables is distributed as Gumbel, Weibull or Fréchet, depending on the data generation process. The generalized extreme value (GEV) distributions can be combined into a single form (1) The parameters /и and ß are the location and scale and, is the shape parameter of limiting distribution. Its sign determines the three possible limiting forms of the GEV of distribution of maxima: - If = 0 the limit distribution is the Gumbel distribution (double- exponential), - If > 0 the limit distribution is the (shifted) Fréchet power-like distribution, - If < 0 the limit distribution is the Weibull distribution. A special case is the Gumbel distribution (taking the limit -» 0) FeGum (X) = (X) = expf - expf (2) V v И J J This distribution is widely used as it is the appropriate limit of maxima from many common distributions, e.g. normal, lognormal, Weibull and gamma.

2.2. Generalized Pareto distribution Another important result in EVT is related to the distribution function for exceedances over a given threshold. It turns out that excesses over a high threshold и have a generalized Pareto distribution (GPD) with distribution function: FeGPD(x) = F % D(x) = 1-0 + - dl a * 0 ß 1-е х р (-* /Д ), dla = 0 (3) ß> 0 and x ^ O when, ž 0, O í x á - when č, < 0, for = 0 the limiting distribution is exponential. The case > 0 corresponds to heavy-tailed distributions whose tails decay like power functions, such as the Pareto, Student t, Cauchy, Burr, log-gamma and Fréchet distributions. The case = 0 corresponds to distributions like the normal, exponential, gamma and lognonnal, with tails essentially decaying exponentially. The final group of distributions (, < 0) are short-tailed distributions with a finite right endpoint, such as the uniform and beta distributions. Ш. ESTIMATION METHODS Consider a parametric model given by a distribution Fe with density fe. In classical statistics, one often chooses an estimation framework that is optimal at the assumed model distribution (e.g. the maximum likelihood framework delivers the asymptotically most efficient estimator at the model distribution). However, as soon as the real underlying model deviates from the assumed one, the estimator may lose its good statistical properties and many alternative estimators may perform better. In robust statistics we want to construct estimators and tests that have good statistical properties (high efficiency, low bias) for a whole neighbourhood o f the assumed model distribution Fe. Such a neighborhood can, for example, be formalized by: Ar(F fi) = {Ge\Ge = (1 - e)fe + sg, G arbitrary} and 0 < e < 1, thought of as a measure for contamination.

The aim o f robust statistics is to provide statistical procedures: - which are still reliable and reasonably efficient under small deviations from the assumed parametric model and to quantify the maximal bias on the statistical quantity o f interest when the underlying distribution lies in a neighborhood of the reference m odel1. - these procedures should highlight which observations (e.g. outliers) or deviating substructures have most influence on the statistical quantity under observation. 3.1. Maximum likelihood estimators The parameter <9for distributions such as GEV (в = [/а Д ]r) or GDP ( 0 - [ß, ŠÝ) are typically estimated by maximum likelihood, i.e.: or finding the zeros o f the estimating equations П <9 = a r g m in ^ lo g /0(x,) (4) / - 1 2 > ( * 0 ) = O (5) /-i is the score function. J( r ) = i o g A W (6) 1 дв 3.2. Af-estimators We will consider only the most general robust framework, the M-estimation framework. M-estimators can be seen as a generalization of the maximum likelihood approach and allow to analyses the robustness properties of estimators and tests. An M-estimator is defined as the solution to the minimization problem it = arg m in Y /?(*,;#) (7) 6e@ * /=1 1In this sense it is a generalization of the classical statistical procedures.

for some objective function p. If p has a derivative (8) then the Af-estimator satisfies the first order conditions П YJV'(xi,0) = O. (9) In general we restrict to estimators, which are Fisher consistent, we require a yy such that Under weak conditions on ц/ it can be shown that the resulting estimator is normally distributed with variance-covariance matrix given by ЕРвМх;в)] = 0. (10) V = M~X E[i//(X;0)t//(X;e)r]-M-r ( 1 1) ( 12) The classical maximum likelihood estimator corresponds to or p(x-e) = -\ogfe(x) (13) (14) s(x-,0) is the score function. 3.3. Robust estimators - properties General results in robust statistics imply that an estimator with - a bounded asymptotic bias in a neighbourhood of the reference model can be constructed by choosing a bounded function i//(in x); - a high asymptotic efficiency can be achieved by choosing a \\i function which is similar to the score function s(x; 9) in the range most of the observations lie.

There is an trade off between maximal asymptotic bias in a neighbourhood of the model distribution Fe and the asymptotic efficiency of the estimator at the reference model. Because ц/ enters in the linear approximation of the asymptotic bias as well as in the asymptotic variance of the estimator (11), it is possible to solve a general optimality problem to find the estimator that is the most efficient given a bound on the maximal bias of the estimator in a neighbourhood of the model. The solution to this problem is the A/-estimator defined by v t'a{x\6) = he(a m s(x,0) - a(0)) (15) С K(r) = r m in ( l, - ) is a multivariate version of the Huber function seen above and the matrix Л and the vector a are determined by solving: EFe[he( A m X - a m ] = 0 and Е,. [ ^ а( Х ; в ) ^ а(х;в)]г = / (16) which ensure that the estimator is consistent and the asymptotic bias remains below the chosen bound. In the one-dimensional location case presented above, the optimal solution reduces to using the \\ic function, in particular in this symmetric case a = 0. For asymmetric reference models, а(в) must be typically found numerically to ensure consistency of the estimator. The computation of the estimator can typically be performed by a slightly adapted Newton-Raphson type procedure. IV. APPLYING ^/-ESTIM ATORS TO EVT We consider the estimation o f a Gumbel model, one o f the GEV distributions. It can be verified that the score function is unbounded in a*, ln this example (Dell Aquila, Ronchetti, 2006) we would like to highlight that the robust (in this case the optimal robust) estimator is able to detect observations that do not conform to the bulk of the data, even in the case of a very asymmetric model. These observations must not necessarily be far away. To illustrate this robustness issue, we generate 300 observations from a contaminated model given by 95% from a Gumbel with ju= 4 and ß= 2 and 5% from a Gumbel with ju = - 0.5 and /? = 0.2; notice that the contaminated model puts more mass on the left of the true Gumbel distribution as can be seen in Figure 1, which plots the two densities.

Figure 2 shows that the classical estimator for {ß, ß) is clearly attracted by the contaminating structure and fails to model part of the majority of the data. The robust estimator (tuned to have approximately 90% efficiency at the model) fits the distribution much better most data are located. Pdf Gumbel, ^=-0.5. i=0.2 1.5 1 0.5 0 5 10 15 20 Source: Dell Aquila, Ronchetti (2006). Fig. 1. Gumbel model - the GEV distributions Source: Dell Aquila, Ronchetti (2006). Fig. 2. Contaminated model Similar robustness issues apply for other EVT distributions such as the Weibull and GPD distribution and for other distributions such as gamma, beta, etc. We can observe that robust methods do not down weight extreme observations if they conform to the majority of the data. Additionally robust methods can guarantee a stable efficiency, MSE and a bounded bias over a whole neighborhood o f the assumed distribution.

V. CONCLUSIONS In this paper we have discussed how robust statistics may improve the data analysis process in the specific case of EVT. We have seen that robust methods can help to identify deviating structure, influential observations and guarantee good statistical properties over a whole set of underlying distributions, therefore considerably enhancing the data analysis. In this sense there is no obvious contradiction between robustness and EVT. Overall we find that robust statistical methods can improve the data analysis process of the skilled analyst and provide useful additional information. Similar robustness issues arise for many other models such as linear regression, generalized linear models, multivariate models and virtually all time series models. REFERENCES Dell Aquila R., Ronchetti E. (2006), Robust statistics and econometrics with economic and financial applications, New York, Wiley. Embrechts P., Kllippelberg C., Mikosch T. (1997), Modeling extremal events fo r insurance and finance, Berlin Heidelberg New York, Springer. Hampel F., Ronchetti E., Rousseeuw P., Stahel W. (1986), Robust statistics: The approach based on influence functions. New York, Wiley. Huber P. (1981), Robust statistics, New York, Wiley. Malevergne Y., Somette D. (2006), Extreme Financial Risk, Berlin Heidelberg New York, Springer. McNeil A., Frey R., Embrechts P. (2005), Quantitative risk management: concepts, techniques and tools, Princeton, Princeton University Press. Reiss R.-D., Thomas M. (2001), Statistical analysis o f extreme values, Basel, Birkhäuser. Grażyna Trzpiot ROZKŁADY WARTOŚCI EKSTREMALNYCH A ESTYMACJA ODPORNA Modele stochastyczne są istotne dla zastosowań w finansach czy w ubezpieczeniach. Statystyczne metody estymacji parametrycznej wykorzystywane najczęściej do wyznaczania parametrów modeli to metoda największej wiarygodności lub MNK. Metody te dają optymalne oszacowania modeli, jednakże odchylenia obserwowanych wartości w kalibrowanym modelu mogą zachwiać dobre własności estymatorów. Przedstawimy pewne aspekty estymacji odpornej w kontekście rozkładów wartości ekstremalnych. Podejmiemy dyskusję metodologicznych aspektów zagadnienia pokazując, jak estymatory odporne wpływają na jakość analiz z wykorzystaniem rozkładów wartości ekstremalnych poprzez informacje o obserwacjach wpływowych.