Some Thoughts on the Importance of Weighing the Tails

SEMINAR UIMA, Probability and Statistics Group, Department of Mathematics, University of Aveiro, 25 June 2008
Some Thoughts on the Importance of Weighing the Tails
Isabel Fraga Alves, CEAUL & DEIO, University of Lisbon
Cláudia Neves, UIMA & DM, University of Aveiro
Inês Farias, Instituto de Investigação das Pescas e do Mar, Recursos Marinhos e Sustentabilidade

Contents
Introduction
Preliminaries and notation
Testing extremes
  Parametric Approaches: Annual Maxima (AM), Peaks Over Threshold (POT), Largest Observations (LO)
  Semi-Parametric Approaches: Testing EV Conditions, PORT approach (Three Tests)
Case studies: financial / environmental / risk in health sciences
AVEIRO, June 25, 2008

Introduction
In the analysis of extreme large (or small) values, the model assumptions on the right (or left) tail of the distribution function (d.f.) F underlying the sample data are of great importance. We focus on the problem of extreme large values; by an obvious transformation, the problem of extreme small values is analogous.
Statistical inference about rare events can clearly be deduced only from those observations which are extreme in some sense:
- the classical Gumbel method of blocks of annual maxima (AM);
- peaks-over-threshold (POT) methods;
- peaks-over-random-threshold (PORT) methods.
Statistical inference is clearly improved if one makes an a priori statistical choice about the most appropriate tail decay for the underlying d.f.:
- light tails with finite right endpoint;
- exponential;
- polynomial.
This is supported by Extreme Value Theory (EVT).

Theory and Extreme Values Analysis
Extreme Values Analysis: models for extreme values, not central values; modelling the tail of the underlying distribution.
Problem: how to make inference beyond the sample data?
One answer: use techniques based on EVT in such a way that it is possible to make statistical inference about rare events using only a limited amount of data!
Notation:
Sample: $(X_1, X_2, \ldots, X_n)$ i.i.d. r.v.'s with d.f. $F(x)$.
Tail of $F$: $\bar{F}(x) = P(X > x) = 1 - F(x)$.
Order statistics: $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n} =: M_n$.

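The Basic Theory: distribution of the Maximum
$P[M_n \le x] = P[X_1 \le x, \ldots, X_n \le x] = P[X_1 \le x] \cdots P[X_n \le x] = F^n(x)$.
Consequently, $M_n \to x^F$ a.s. as $n \to \infty$, with $x^F = \sup\{x : F(x) < 1\}$.
Gnedenko (1943): suppose there exist $a_n > 0$ and $b_n \in \mathbb{R}$ such that $P[M_n \le a_n x + b_n] \to G(x)$ as $n \to \infty$, for every $x \in \mathbb{R}$. Then
$G(z) \equiv G_\gamma(z) = \exp(-(1 + \gamma z)^{-1/\gamma})$ for $1 + \gamma z > 0$, if $\gamma \ne 0$;
$G_\gamma(z) = \exp(-\exp(-z))$ for $z \in \mathbb{R}$, if $\gamma = 0$.
$F \in D(G_\gamma)$, $\gamma \in \mathbb{R}$ [GEV: Generalized Extreme Value d.f., von Mises-Jenkinson representation].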
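
The GEV d.f. in the von Mises-Jenkinson form can be evaluated directly. The following is a minimal sketch, not part of the original slides; the function name `gev_cdf` and the convention of returning 0 or 1 outside the support are assumptions made for illustration.

```python
import math


def gev_cdf(z: float, gamma: float) -> float:
    """Standard GEV d.f. G_gamma(z) in the von Mises-Jenkinson form."""
    if gamma == 0.0:
        # Gumbel case: exp(-exp(-z)) for every real z.
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0.0:
        # Outside the support: below the left endpoint when gamma > 0,
        # above the finite right endpoint when gamma < 0.
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


# The three regimes evaluated at the same point.
for g in (-0.5, 0.0, 0.5):
    print(f"gamma = {g:+.1f}: G(1.0) = {gev_cdf(1.0, g):.4f}")
```

As gamma tends to 0, the first branch is recovered as the limit of the second, which is why the two cases are usually written as a single family.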

Extreme Value Distributions (max-stable)
The GEV(γ) incorporates the 3 types [Fisher-Tippett]:
Fréchet: $\Phi_\alpha(z) = \exp(-z^{-\alpha})$, $z > 0$, $\alpha > 0$; $\gamma = 1/\alpha > 0$; limit for heavy tailed distributions.
Weibull: $\Psi_\alpha(z) = \exp(-(-z)^{\alpha})$, $z < 0$, $\alpha > 0$; $\gamma = -1/\alpha < 0$; limit for short tailed distributions with $x^F < \infty$.
Gumbel: $\Lambda(z) = \exp(-\exp(-z))$, $z \in \mathbb{R}$; $\gamma = 0$; limit for exponential tailed distributions.

Theory of Regular Variation & Extreme Value Theory

[Figure: GEV(γ) curves for GEV(0.5) = Fréchet, GEV(-0.5) = Weibull and GEV(0) = Gumbel]

[Figure: Normal N(µ, σ) p.d.f. φ(x): 68.27% of the probability lies between µ - σ and µ + σ]

[Figure: Normal N(µ, σ) p.d.f. φ(x): 95.45% of the probability lies between µ - 2σ and µ + 2σ]

[Figure: Gumbel p.d.f. f(x): 13.22% of the probability lies below µ - σ, 72.37% between µ - σ and µ + σ, and 14.41% above µ + σ]

[Figure: Gumbel p.d.f. f(x): 0.07% of the probability lies below µ - 2σ, 95.71% between µ - 2σ and µ + 2σ, and 4.22% above µ + 2σ]
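
The percentages quoted on the Normal and Gumbel slides can be checked numerically. The sketch below assumes that µ and σ denote the mean and the standard deviation of each distribution (for the standard Gumbel these are Euler's constant and π/√6); the helper names are illustrative only. Under that assumption the computed values agree with the slide figures to within about 0.01 percentage points.

```python
import math

EULER_GAMMA = 0.5772156649015329      # mean of the standard Gumbel d.f.
GUMBEL_SD = math.pi / math.sqrt(6.0)  # standard deviation of the standard Gumbel d.f.


def gumbel_cdf(x: float) -> float:
    """Standard Gumbel d.f. Lambda(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def normal_cdf(x: float) -> float:
    """Standard Normal d.f. Phi(x), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def band_probabilities(cdf, mean: float, sd: float, c: float):
    """Return (P[X < mean - c*sd], P[inside the band], P[X > mean + c*sd])."""
    below = cdf(mean - c * sd)
    above = 1.0 - cdf(mean + c * sd)
    return below, 1.0 - below - above, above


for c in (1, 2):
    g = band_probabilities(gumbel_cdf, EULER_GAMMA, GUMBEL_SD, c)
    n = band_probabilities(normal_cdf, 0.0, 1.0, c)
    print(f"c = {c}: Gumbel below/inside/above = "
          f"{g[0]:.4f}/{g[1]:.4f}/{g[2]:.4f}, Normal inside = {n[1]:.4f}")
```

Because both families are location-scale families, working with the standard versions loses no generality. The asymmetry of the Gumbel tails is the point: the right tail beyond µ + 2σ carries far more mass than the left tail below µ - 2σ.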

[Figure: Normal & Gumbel p.d.f.'s, φ(x) and f(x), plotted against x]

[Figure: Normal & Gumbel d.f.'s, Φ(x) and Λ(x), plotted against x]

[Figure: Normal & Gumbel, two panels: Normal & Gumbel with the same mean values and variances; Normal(0,1) & standard Gumbel]

Extremal Quantiles: Normal or Gumbel? d.f.'s & e.d.f.
Sample: $(x_{1:10}, x_{2:10}, \ldots, x_{10:10})$ = (-1.5, -0.5, -0.2, 0.1, 0.2, 0.5, 0.8, 0.9, 1.3, 2.1). Model?
[Figure: e.d.f. of the sample, with the level 0.95 and the quantile Q0.95 marked]

Extremal Quantiles: Normal or Gumbel? d.f.'s & e.d.f.
Sample: $(x_{1:10}, x_{2:10}, \ldots, x_{10:10})$ = (-1.5, -0.5, -0.2, 0.1, 0.2, 0.5, 0.8, 0.9, 1.3, 2.1). Model?
[Figure: e.d.f. of the sample together with Λ(x) and Φ(x), with the level 0.95 and the quantile Q0.95 marked]

Extremal Quantiles: Normal or Gumbel? d.f.'s & e.d.f.
Sample: $(x_{1:10}, x_{2:10}, \ldots, x_{10:10})$ = (-1.5, -0.5, -0.2, 0.1, 0.2, 0.5, 0.8, 0.9, 1.3, 2.1). Model?
Random sample from Gumbel!
[Figure: e.d.f. of the sample together with Φ(x) and Λ(x), with the levels 0.05 and 0.95 and the quantiles $\Phi^{-1}(0.95)$, Q0.95 and $\Lambda^{-1}(0.95)$ marked]
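
The effect of the model choice on an extremal quantile can be illustrated with the ten observations from the slide. The slides do not say how the Normal and Gumbel curves were fitted, so the sketch below assumes a simple method-of-moments fit (both models matched to the sample mean and standard deviation); that fitting choice and the function names are assumptions made for illustration only.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # mean of the standard Gumbel d.f.

sample = [-1.5, -0.5, -0.2, 0.1, 0.2, 0.5, 0.8, 0.9, 1.3, 2.1]


def normal_quantile_mom(data, p: float) -> float:
    """p-quantile of a Normal d.f. matched to the sample mean and st. dev."""
    m, s = statistics.mean(data), statistics.stdev(data)
    return statistics.NormalDist(m, s).inv_cdf(p)


def gumbel_quantile_mom(data, p: float) -> float:
    """p-quantile of a Gumbel d.f. matched to the sample mean and st. dev."""
    m, s = statistics.mean(data), statistics.stdev(data)
    scale = s * math.sqrt(6.0) / math.pi   # Gumbel sd = scale * pi / sqrt(6)
    loc = m - EULER_GAMMA * scale          # Gumbel mean = loc + Euler gamma * scale
    return loc + scale * (-math.log(-math.log(p)))


q_normal = normal_quantile_mom(sample, 0.95)
q_gumbel = gumbel_quantile_mom(sample, 0.95)
print(f"0.95 quantile: Normal fit = {q_normal:.3f}, Gumbel fit = {q_gumbel:.3f}")
```

With equal mean and variance, the Gumbel 0.95 quantile always exceeds the Normal one, so assuming normality for Gumbel data underestimates high quantiles.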

Tails

Heavy Tails, Tail index & Moments

Heavy Tails, Tail index & Moments
$F \in D(G_\gamma)$, $\gamma = 1/\alpha > 0$: heavy tails.
$\gamma < 1/2$: finite variance.
$1/2 < \gamma < 1$: infinite variance, finite mean value.
$\gamma > 1$: infinite mean value.

Super-Heavy tails (no finite Moments)

Light, Heavy & Super-Heavy tails (no finite Moments)

Parametric approaches
Fitting GEV(γ) to Annual Maxima (AM): GUMBEL METHOD
Inclusion of location λ and scale δ parameters in the GEV(γ) d.f.; γ is the tail index (shape):
$G_\gamma(x; \lambda, \delta) = G_\gamma((x - \lambda)/\delta)$, $\lambda \in \mathbb{R}$, $\delta > 0$, $\gamma \in \mathbb{R}$.
[Figure: data series split into Block 1, Block 2, Block 3, Block 4, Block 5]

Testing problem in GEV(γ)
The shape parameter γ determines the weight of the tail.
Choice between Gumbel, Weibull or Fréchet:
$\{G_\gamma : \gamma = 0\}$ vs. $\{G_\gamma : \gamma \ne 0\}$, or vs. $\{G_\gamma : \gamma < 0\}$, or vs. $\{G_\gamma : \gamma > 0\}$.
Van Montfort (1970); Bardsley (1977); Otten and Van Montfort (1978); Tiago de Oliveira (1981); Gomes (1982); Tiago de Oliveira (1984); Tiago de Oliveira and Gomes (1984); Hosking (1984); Marohn (1994); Wang, Cooke, and Li (1996); Marohn (2000).

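Generalized Pareto distribution GP(γ)
$H_\gamma(x; \lambda, \delta) = 1 - (1 + \gamma (x - \lambda)/\delta)^{-1/\gamma}$ if $\gamma \ne 0$;
$H_\gamma(x; \lambda, \delta) = 1 - \exp[-(x - \lambda)/\delta]$ if $\gamma = 0$;
$\lambda \in \mathbb{R}$, $\delta > 0$, for $x \ge \lambda$ and $1 + \gamma (x - \lambda)/\delta \ge 0$.
$H_\gamma(x; \lambda, \delta) = 1 + \log G_\gamma(x; \lambda, \delta)$.
The GP(γ) d.f. includes the models:
Pareto: $W_{1,\alpha}(x) = 1 - x^{-\alpha}$, $\alpha > 0$, $x \ge 1$ ($\gamma > 0$, heavy tail).
Beta: $W_{2,\alpha}(x) = 1 - (-x)^{-\alpha}$, $\alpha < 0$, $-1 \le x \le 0$ ($\gamma < 0$, bounded support).
Exponential: $W_0(x) = 1 - \exp(-x)$, $x \ge 0$ ($\gamma = 0$, exponential tail).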
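
The relation $H_\gamma = 1 + \log G_\gamma$ between the GP and GEV d.f.'s, and the way the Pareto model sits inside GP(γ), can be verified numerically. This is a minimal sketch with illustrative function names (`gev_cdf`, `gp_cdf`) and with the d.f.'s set to 0 or 1 outside their supports; none of these names come from the slides.

```python
import math


def gev_cdf(x: float, gamma: float, loc: float = 0.0, scale: float = 1.0) -> float:
    """GEV d.f. G_gamma(x; loc, scale) = G_gamma((x - loc) / scale)."""
    z = (x - loc) / scale
    if gamma == 0.0:
        return math.exp(-math.exp(-z))
    t = 1.0 + gamma * z
    if t <= 0.0:
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))


def gp_cdf(x: float, gamma: float, loc: float = 0.0, scale: float = 1.0) -> float:
    """Generalized Pareto d.f. H_gamma(x; loc, scale), supported on x >= loc."""
    y = (x - loc) / scale
    if y <= 0.0:
        return 0.0
    if gamma == 0.0:
        return 1.0 - math.exp(-y)
    t = 1.0 + gamma * y
    if t <= 0.0:
        return 1.0  # beyond the finite right endpoint (only when gamma < 0)
    return 1.0 - t ** (-1.0 / gamma)


# H_gamma = 1 + log G_gamma at a point inside the GP support.
for g in (-0.5, 0.0, 0.5):
    lhs = gp_cdf(1.2, g)
    rhs = 1.0 + math.log(gev_cdf(1.2, g))
    print(f"gamma = {g:+.1f}: H = {lhs:.6f}, 1 + log G = {rhs:.6f}")

# Pareto with alpha = 2 is GP(gamma = 1/alpha) with loc = 1 and scale = 1/alpha.
alpha = 2.0
print(gp_cdf(3.0, 1.0 / alpha, loc=1.0, scale=1.0 / alpha), 1.0 - 3.0 ** (-alpha))
```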

Excesses over high thresholds: POT (Peaks Over Thresholds)
Balkema-de Haan (1974) + Pickands (1975):
$F \in D(G_\gamma) \iff \lim_{u \to x^F} \sup_{0 < x < x^F - u} |P[X - u \le x \mid X > u] - H_\gamma(x; \delta(u))| = 0$.
$\bar{F}(u + y) = \bar{F}(u)\, P[X - u > y \mid X > u]$, $y \ge 0$, with $P[X - u > y \mid X > u] \approx 1 - H_\gamma(y; \delta(u))$.
Excesses over $u$: $X_i - u \mid X_i > u$.

Testing problem in GP(γ)
The shape parameter γ determines the weight of the tail.
Choice between Exponential, Beta or Pareto:
$\{H_\gamma : \gamma = 0\}$ vs. $\{H_\gamma : \gamma \ne 0\}$, or vs. $\{H_\gamma : \gamma < 0\}$, or vs. $\{H_\gamma : \gamma > 0\}$.
Van Montfort and Witter (1985); Gomes and Van Montfort (1986); Brilhante (2004); Marohn (2000), AM & POT.
Fitting the GP d.f. to data: Castillo and Hadi (1997).
Goodness-of-fit tests for the GP d.f. model: Choulakian and Stephens (2001).
Goodness-of-fit problem for heavy tailed Pareto-type d.f.'s: Beirlant, de Wet and Goegebeur (2006).

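LO (Larger Observations)
The k largest observations of the sample, $X_{(1)} \ge X_{(2)} \ge \cdots \ge X_{(k)}$, standardized as
$Z_{(i)} := (X_{(i)} - \lambda)/\delta$, $i = 1, \ldots, k$,
are modeled by the joint p.d.f. of the GEV(γ) extremal process:
$f(z_1, \ldots, z_k) = g_\gamma(z_k) \prod_{i=1}^{k-1} \frac{g_\gamma(z_i)}{G_\gamma(z_i)}$, $z_1 > \cdots > z_k$, with $g_\gamma(z) := \partial G_\gamma(z)/\partial z$.
[Figure: sample path with the largest observations $X_{(1)}, X_{(2)}, X_{(3)}, X_{(4)}, \ldots, X_{(k)}$ marked]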

Testing problem in the GEV(γ) extremal process
The shape parameter γ determines the weight of the tail.
Choice between Gumbel, Weibull or Fréchet:
$\{G_\gamma : \gamma = 0\}$ vs. $\{G_\gamma : \gamma \ne 0\}$, or vs. $\{G_\gamma : \gamma < 0\}$, or vs. $\{G_\gamma : \gamma > 0\}$.
Gomes and Alpuim (1986); Gomes (1989), LO & AM.
Goodness-of-fit tests: Gomes (1987).

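Semi-Parametric Approach: Upper Order Statistics
$F \in D(G_\gamma)$
$X_{n-k,n}$ upper intermediate o.s.: $X_{n,n} \ge X_{n-1,n} \ge \cdots \ge X_{n-k,n}$, $k = k(n) \to \infty$, $k/n \to 0$ as $n \to \infty$.
[Figure: upper order statistics $X_{n,n}, X_{n-1,n}, X_{n-2,n}, X_{n-3,n}, \ldots, X_{n-k,n}$]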

Peaks Over Random Threshold (PORT)
Excesses over the random threshold $X_{n-k:n}$:
$Z_i := X_{n-i+1:n} - X_{n-k:n}$, $i = 1, \ldots, k$.

Testing Problem: Max-Domains of Attraction
The shape parameter γ determines the weight of the tail.
Choice between domains of attraction:
$F \in D(G_0)$ vs. $F \in D(G_\gamma)_{\gamma \ne 0}$, or vs. $F \in D(G_\gamma)_{\gamma < 0}$, or vs. $F \in D(G_\gamma)_{\gamma > 0}$.
Galambos (1982); Castillo, Galambos and Sarabia (1989); Hasofer and Wang (1992); Falk (1995); Fraga Alves and Gomes (1996); Fraga Alves (1999); Marohn (1998a,b); Segers and Teugels (2000).
PORT approach: Neves, Picek and Fraga Alves (2006); Neves and Fraga Alves (2006).

Testing EV conditions
$X_{n,n} \ge X_{n-1,n} \ge \cdots \ge X_{n-k,n}$, with $X_{n-k,n}$ an upper intermediate o.s.
$F \in D(G_\gamma)$, for any real γ.
Adapted goodness-of-fit tests (Kolmogorov-Smirnov & Cramér-von Mises type): Dietrich, de Haan and Hüsler (2002); Drees, de Haan and Li (2006).

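PORT approach: Three Tests for $F \in D(G_0)$ vs. $F \in D(G_\gamma)_{\gamma \ne 0}$
Largest observations: $X_{n,n} \ge X_{n-1,n} \ge \cdots \ge X_{n-k,n}$, $k = k(n) \to \infty$, $k/n \to 0$ as $n \to \infty$.
Excesses over the random threshold $X_{n-k,n}$: $\{Z_i := X_{n-i+1,n} - X_{n-k,n},\ i = 1, \ldots, k\}$.
Define the r-th moment of the excesses:
$M_n^{(r)} \equiv M_n^{(r)}(k) := \frac{1}{k} \sum_{i=1}^{k} (X_{n-i+1,n} - X_{n-k,n})^r = \frac{1}{k} \sum_{i=1}^{k} Z_i^r$, $r = 1, 2$.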
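
The excesses over the random threshold and their empirical moments can be computed in a few lines. The following sketch uses hypothetical helper names (`port_excesses`, `excess_moment`) that do not appear in the slides, and a toy data set chosen only to make the arithmetic easy to follow.

```python
def port_excesses(data, k: int):
    """Excesses Z_i = X_{n-i+1,n} - X_{n-k,n}, i = 1..k, over the random threshold."""
    n = len(data)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    ordered = sorted(data)            # ordered[j - 1] is X_{j,n}
    threshold = ordered[n - k - 1]    # X_{n-k,n}
    return [ordered[n - i] - threshold for i in range(1, k + 1)]


def excess_moment(z, r: int) -> float:
    """Empirical r-th moment of the excesses, M_n^{(r)} = (1/k) * sum of Z_i^r."""
    return sum(v ** r for v in z) / len(z)


data = [5, 1, 9, 3, 7, 2, 10, 4, 8, 6]
z = port_excesses(data, k=3)          # threshold is 7, excesses are 3, 2, 1
print(z, excess_moment(z, 1), excess_moment(z, 2))
```

Shifting the data leaves the excesses unchanged and rescaling multiplies them by the same constant, so any ratio such as $M_n^{(2)} / (M_n^{(1)})^2$ is free of location and scale.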

NPFA test statistic: Ratio between the Maximum and the Mean of Excesses
$T_n(k) := \frac{Z_1}{M_n^{(1)}} = \frac{X_{n:n} - X_{n-k:n}}{M_n^{(1)}}$ (Neves, Picek & Fraga Alves 06)
The distribution does NOT depend on the location and scale.
Motivation: different behaviour of the ratio between the maximum and the mean for light and heavy tails.

Gt test statistic: Greenwood-type Statistic
$R_n(k) := \frac{M_n^{(2)}}{(M_n^{(1)})^2} = \frac{\frac{1}{k} \sum_{i=1}^{k} Z_i^2}{\left(\frac{1}{k} \sum_{i=1}^{k} Z_i\right)^2}$
Motivation: based on the statistic of Greenwood 46 (Neves & Fraga Alves 06).
The distribution does NOT depend on the location and scale.

HW test statistic: Hasofer and Wang Statistic
$W_n(k) := \frac{1}{k} \frac{(M_n^{(1)})^2}{M_n^{(2)} - (M_n^{(1)})^2} = \frac{1}{k (R_n(k) - 1)} = \frac{1}{k} \frac{\left(\frac{1}{k} \sum_{i=1}^{k} Z_i\right)^2}{\frac{1}{k} \sum_{i=1}^{k} Z_i^2 - \left(\frac{1}{k} \sum_{i=1}^{k} Z_i\right)^2}$
(Hasofer & Wang 92; Neves & Fraga Alves 06)
The distribution does NOT depend on the location and scale.
Motivation: based on the goodness-of-fit statistic of Shapiro-Wilk 65.

NPFA Test at asymptotic level α
Under $H_0$ + extra second order conditions on the upper tail of $F$ + extra conditions on the convergence rate of $k$ to infinity:
$T^*_{k,n} := T_n(k) - \log k \to_d \Lambda = G_0$ as $n \to \infty$.
Gumbel quantile: $g_\varepsilon := -\ln(-\ln \varepsilon)$.
$H_0: F \in D(G_0)$ vs. $H_1: F \in D(G_\gamma)_{\gamma \ne 0}$. Reject $H_0$ (light tails) in favour of $H_1$ (bilateral) if $T^*_{k,n} < g_{\alpha/2}$ or $T^*_{k,n} > g_{1-\alpha/2}$.
$H_0: F \in D(G_0)$ vs. $H_1: F \in D(G_\gamma)_{\gamma > 0}$. Reject $H_0$ (light tails) in favour of $H_1$ (heavy tails) if $T^*_{k,n} > g_{1-\alpha}$.
$H_0: F \in D(G_0)$ vs. $H_1: F \in D(G_\gamma)_{\gamma < 0}$. Reject $H_0$ (light tails) in favour of $H_1$ (short tails) if $T^*_{k,n} < g_\alpha$.
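
The NPFA decision rule can be sketched as follows. The function names, the string labels for the alternatives and the toy data are assumptions made for illustration; the asymptotic critical values are only meaningful for large n with an intermediate k, so the tiny example merely shows the arithmetic.

```python
import math


def port_excesses(data, k: int):
    """Excesses Z_i = X_{n-i+1,n} - X_{n-k,n}, i = 1..k, over the random threshold."""
    n = len(data)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    ordered = sorted(data)
    threshold = ordered[n - k - 1]
    return [ordered[n - i] - threshold for i in range(1, k + 1)]


def npfa_statistic(data, k: int) -> float:
    """Normalized statistic T* = T_n(k) - log k, with T_n(k) = Z_1 / M_n^{(1)}."""
    z = port_excesses(data, k)
    mean_excess = sum(z) / k
    if mean_excess == 0.0:
        raise ValueError("all excesses are zero (ties at the threshold)")
    return z[0] / mean_excess - math.log(k)


def gumbel_quantile(eps: float) -> float:
    """Gumbel quantile g_eps = -ln(-ln eps)."""
    return -math.log(-math.log(eps))


def npfa_reject(t_star: float, alpha: float = 0.05, alternative: str = "two-sided") -> bool:
    """True if H0: F in D(G_0) is rejected at asymptotic level alpha."""
    if alternative == "two-sided":
        return (t_star < gumbel_quantile(alpha / 2)
                or t_star > gumbel_quantile(1 - alpha / 2))
    if alternative == "heavy":   # H1: gamma > 0
        return t_star > gumbel_quantile(1 - alpha)
    if alternative == "short":   # H1: gamma < 0
        return t_star < gumbel_quantile(alpha)
    raise ValueError("alternative must be 'two-sided', 'heavy' or 'short'")


t_star = npfa_statistic(list(range(1, 11)), k=3)   # excesses 3, 2, 1
print(t_star, npfa_reject(t_star))
```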

Exact Properties of the NPFA, Gt & HW Tests
An extensive simulation study of the proposed procedures allows us to conclude that:
- The Gt-test performs well when the aim is to detect heavy-tailed distributions.
- While the Gt-test barely detects small negative values of γ, the HW-test is the most powerful test under study for alternatives in the Weibull domain of attraction.
- Since the NPFA-test, based on the very simple T-statistic, tends to be conservative and yet retains reasonable power, it proves to be a valuable complement to the remaining procedures.

Gt & HW Tests at asymptotic level α
Under $H_0$ + extra second order conditions on the upper tail of $F$ + extra conditions on the convergence rate of $k$ to infinity:
Gt-test: $R^*_n(k) := \sqrt{k/4}\,(R_n(k) - 2) \to_d N(0,1)$; HW-test: $W^*_n(k) := \sqrt{k/4}\,(k W_n(k) - 1) \to_d N(0,1)$, as $n \to \infty$.
Normal quantile: $z_\varepsilon := \Phi^{-1}(\varepsilon)$.
$H_0: F \in D(G_0)$ vs. $H_1: F \in D(G_\gamma)_{\gamma \ne 0}$. Reject $H_0$ (light tails) in favour of $H_1$ (bilateral) if: Gt-test, $|R^*_n(k)| > z_{1-\alpha/2}$; HW-test, $|W^*_n(k)| > z_{1-\alpha/2}$.
$H_0: F \in D(G_0)$ vs. $H_1: F \in D(G_\gamma)_{\gamma > 0}$. Reject $H_0$ (light tails) in favour of $H_1$ (heavy tails) if: Gt-test, $R^*_n(k) > z_{1-\alpha}$; HW-test, $W^*_n(k) < -z_{1-\alpha}$.
$H_0: F \in D(G_0)$ vs. $H_1: F \in D(G_\gamma)_{\gamma < 0}$. Reject $H_0$ (light tails) in favour of $H_1$ (short tails) if: Gt-test, $R^*_n(k) < -z_{1-\alpha}$; HW-test, $W^*_n(k) > z_{1-\alpha}$.
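
The Greenwood-type and Hasofer-Wang statistics and their rejection rules can be sketched in the same way. Function names and alternative labels are illustrative assumptions, and the sign conventions follow the reading that heavy tails inflate $R_n(k)$ and deflate $W_n(k)$; the toy data only demonstrate the arithmetic, since the Normal approximation requires large k.

```python
import math
from statistics import NormalDist


def port_excesses(data, k: int):
    """Excesses Z_i = X_{n-i+1,n} - X_{n-k,n}, i = 1..k, over the random threshold."""
    n = len(data)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    ordered = sorted(data)
    threshold = ordered[n - k - 1]
    return [ordered[n - i] - threshold for i in range(1, k + 1)]


def gt_hw_statistics(data, k: int):
    """Return (R*, W*): normalized Greenwood-type and Hasofer-Wang statistics."""
    z = port_excesses(data, k)
    m1 = sum(z) / k
    m2 = sum(v * v for v in z) / k
    spread = m2 - m1 * m1
    if spread <= 0.0:
        raise ValueError("excesses have no spread; statistics are undefined")
    r = m2 / (m1 * m1)              # R_n(k)
    w = (m1 * m1) / (k * spread)    # W_n(k) = 1 / (k * (R_n(k) - 1))
    scale = math.sqrt(k / 4.0)
    return scale * (r - 2.0), scale * (k * w - 1.0)


def gt_hw_reject(r_star: float, w_star: float, alpha: float = 0.05,
                 alternative: str = "two-sided"):
    """Return (reject by Gt-test, reject by HW-test) at asymptotic level alpha."""
    if alternative == "two-sided":
        z = NormalDist().inv_cdf(1 - alpha / 2)
        return abs(r_star) > z, abs(w_star) > z
    z = NormalDist().inv_cdf(1 - alpha)
    if alternative == "heavy":   # H1: gamma > 0
        return r_star > z, w_star < -z
    if alternative == "short":   # H1: gamma < 0
        return r_star < -z, w_star > z
    raise ValueError("alternative must be 'two-sided', 'heavy' or 'short'")


r_star, w_star = gt_hw_statistics(list(range(1, 11)), k=3)
print(r_star, w_star, gt_hw_reject(r_star, w_star, alternative="short"))
```

For exponential excesses the ratio of the second moment to the squared mean is 2, which is why $R_n(k)$ is centred at 2 and $k W_n(k)$ at 1 under the null hypothesis.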

Data 1: Financial data, stock index log-returns
EVT offers a powerful framework to characterize financial market crashes and booms. The exact distribution of financial returns remains an open question. Heavy tails are consistent with a variety of financial theories.
In financial studies, the following question is relevant: are return distributions symmetric in the tails? Differences in the behaviour of extreme positive and negative tail movements within the same market constitute a point of investigation.
The aforementioned tests can be seen as a first test for symmetry between the positive and negative tails of the log-returns of a stock index.

Data 1: S&P500, left and right tails of stock index log-returns
S&P500 data: n = 6985 observations, a series of closing prices $\{S_i, i = 1, \ldots, n\}$ of the S&P500 stock index, taken from 4 January 1960 up to Friday, 16 October 1987 (the last trading day before the crash of Black Monday, October 19, 1987), from which we use the daily log-returns (assumed to be stationary and weakly dependent).
Left tail of the distribution of the returns: negative log-returns, i.e., $L_i := -\log(S_{i+1}/S_i)$, $i = 1, \ldots, n-1$.
Right tail of the distribution of the returns: positive log-returns, defined as $X_i := \log(S_{i+1}/S_i) = -L_i$, $i = 1, \ldots, n-1$.
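
The two series used for the left and right tails are obtained from the closing prices as follows. This is a small sketch with made-up prices; the function names are illustrative and the S&P500 data themselves are not reproduced here.

```python
import math


def log_returns(prices):
    """Positive log-returns X_i = log(S_{i+1} / S_i), i = 1..n-1 (right tail)."""
    return [math.log(nxt / cur) for cur, nxt in zip(prices, prices[1:])]


def negative_log_returns(prices):
    """Negative log-returns L_i = -log(S_{i+1} / S_i) = -X_i (left tail)."""
    return [-x for x in log_returns(prices)]


prices = [100.0, 110.0, 99.0]   # hypothetical closing prices
print(log_returns(prices))
print(negative_log_returns(prices))
```

Working with $L_i$ turns large losses into large positive values, so the same upper-tail tests can be applied to both series.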

[Figure: S&P500 percentage log-returns $X_i := \log(S_{i+1}/S_i)$, 5 Jan 60 to 16 Oct 87, plotted against date]

Sample paths of the statistics T*, R* and W*, plotted against k = 5, ..., 1200, applied to the S&P500 negative log-returns $L_i := -\log(S_{i+1}/S_i)$.
[Figure: S&P500 (left tail): NPFA-test (T*), Gt-test (R*) and HW-test (W*) against k, with critical values $g_{0.95}$, $z_{0.95}$ and $z_{0.05}$]
$F_L \in D(G_\gamma)$, $\gamma > 0$: Fréchet domain, heavy tail!

Sample paths of the statistics T*, R* and W*, plotted against k = 5, ..., 1200, applied to the S&P500 positive log-returns $X_i := \log(S_{i+1}/S_i)$.
[Figure: S&P500 (right tail): NPFA-test (T*), Gt-test (R*) and HW-test (W*) against k, with critical values $g_{0.975}$, $z_{0.975}$, $g_{0.025}$ and $z_{0.025}$]
$F_X \in D(G_0)$: Gumbel domain, light/exponential tail!

S&P500: left and right tails of stock index log-returns
The NPFA, HW and Gt testing procedures under the PORT approach yielded the sample path plots presented. This analysis suggests the Fréchet and Gumbel domains of attraction for the left and right tails of the returns distribution, respectively.
A possible interpretation: in this stock index, crashes are much more likely than large gains.

Data 2 & Data 3: Environmental data
Bilbao: 179 observations of zero-crossing hourly mean periods (in seconds) of the sea waves, measured at a Bilbao buoy in January 1997 (Maritime Climate Program of CEDEX, Spain). The periods influence beach morphodynamics and other problems related to the right tail of the underlying distribution. In de Zea Bermudez & Amaral-Turkman (2003), an estimation procedure for the parameter of the GPD is built under a Bayesian perspective; therein the authors point out the significant advantage that might derive from discriminating the proper domain of attraction.
Ozone: 731 observations of ambient ozone levels (in parts per billion), recorded hourly by between 9 and 12 stations in Harris County from 1980 to 1993.

Bilbao wave data: $F \in D(G_\gamma)$, $\gamma < 0$!
[Figure: sample (n = 179), with the Gumbel quantile $g_{0.05}$ marked]

Ozone data: $F \in D(G_\gamma)$, $\gamma = 0$!
[Figure: sample (n = 731), with the quantiles $g_{0.025}$ and $-z_{0.975}$ marked]

Data 4: risk in health sciences data
Contaminant metals (cadmium (Cd) & lead (Pb)) in black fish-sword or black scabbard fish (Aphanopus carbo).
PROBLEM: risk to the population exposed to high levels of contaminant metals (Cd & Pb) in black scabbard fish.
Cadmium and lead concentrations were measured (mg/kg wet weight) in Aphanopus carbo caught off Sesimbra (n = 130) and the Madeira (n = 24) and Azores (n = 26) archipelagos.
European limits for cadmium (Cd = 0.05 mg/kg) and lead (Pb = 0.3 mg/kg) are defined by European Community Regulation 78/2005 (EU, 2005).
According to some studies, and to the permissible WHO and FAO levels, this species does not represent a risk for human consumption if the liver is excluded and the edible part is consumed in moderation.
QUESTION: What is the probability of exceeding the EU limits in the 3 regions?
FRAMEWORK: Statistical inference (tail probability) for extreme values.

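Tail Probability Estimation
Given a large value $x$, estimation of $p = 1 - F(x)$: $p = p_n$, $x = x_n$, $p_n = 1 - F(x_n) \to 0$ as $n \to \infty$.
$X_{n-k,n}$ upper intermediate o.s., $k = k_n \to \infty$, $k/n \to 0$.
$1 - F(x_n) \approx (1 - F(X_{n-k,n})) \left\{1 - H_\gamma\left(\frac{x_n - X_{n-k,n}}{a(n/k)}\right)\right\} \approx$
  $\frac{k}{n} \left(1 + \gamma \frac{x_n - X_{n-k,n}}{a(n/k)}\right)^{-1/\gamma}$, if $\gamma \ne 0$;
  $\frac{k}{n} \exp\left(-\frac{x_n - X_{n-k,n}}{a(n/k)}\right)$, if $\gamma = 0$.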

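Tail Probability Estimation
Replacing $\gamma$ by $\hat\gamma$ and $a(n/k)$ by $\hat a(n/k)$:
$\hat p_n := \frac{k}{n} \left\{\max\left(0, 1 + \hat\gamma \frac{x_n - X_{n-k,n}}{\hat a(n/k)}\right)\right\}^{-1/\hat\gamma}$, if $\gamma \ne 0$;
$\hat p_n := \frac{k}{n} \exp\left(-\frac{x_n - X_{n-k,n}}{\hat a(n/k)}\right)$, if $\gamma = 0$.
For heavy tails the expression specializes to $\hat p_n := \frac{k}{n} \left(\frac{x_n}{X_{n-k,n}}\right)^{-1/\hat\gamma}$.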
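
The tail probability estimator can be written as a small function once estimates of γ and a(n/k) are available. The sketch below takes those estimates as inputs and does not implement any particular estimator; the function and argument names are assumptions, and the numbers in the example are arbitrary.

```python
import math


def tail_probability(x: float, threshold: float, k: int, n: int,
                     gamma_hat: float, a_hat: float) -> float:
    """Estimate p = 1 - F(x) for x at or above the random threshold X_{n-k,n}."""
    if x < threshold:
        raise ValueError("x must not be below the threshold X_{n-k,n}")
    y = (x - threshold) / a_hat
    if gamma_hat == 0.0:
        return (k / n) * math.exp(-y)
    base = max(0.0, 1.0 + gamma_hat * y)
    if base == 0.0:
        # x lies beyond the estimated right endpoint (only when gamma_hat < 0).
        return 0.0
    return (k / n) * base ** (-1.0 / gamma_hat)


def tail_probability_heavy(x: float, threshold: float, k: int, n: int,
                           gamma_hat: float) -> float:
    """Heavy-tail specialization (gamma_hat > 0, threshold > 0)."""
    return (k / n) * (x / threshold) ** (-1.0 / gamma_hat)


print(tail_probability(3.0, 1.0, k=10, n=100, gamma_hat=0.5, a_hat=1.0))
print(tail_probability(3.0, 1.0, k=10, n=100, gamma_hat=0.0, a_hat=1.0))
print(tail_probability_heavy(2.0, 1.0, k=10, n=100, gamma_hat=0.5))
```

The heavy-tail form coincides with the general one when the scale estimate is taken as $\hat\gamma X_{n-k,n}$, since then $1 + \hat\gamma (x_n - X_{n-k,n})/\hat a(n/k) = x_n / X_{n-k,n}$.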

Data 4: risk in health sciences data, Cadmium, Madeira (n = 24)
[Figure: estimation of γ (Moments) plotted against k = 5, ..., 25]

Data 4: risk in health sciences data, Cadmium, Madeira (n = 24)
Statistical Choice (EVT TRILEMMA)
[Figure: Ratio, Greenwood and Hasofer-Wang statistics plotted against k = 5, ..., 25, with the Gumbel quantiles $g_{0.025}$, $g_{0.975}$ and the Normal quantiles $z_{0.025}$, $z_{0.975}$]

Data 4: risk in health sciences data, Cadmium, Madeira (n = 24)
Tail Probability (EVT TRILEMMA: "a priori" fixed sign, γ < 0 or γ = 0 or γ > 0)
[Figure: tail probability estimates pmneg, pm0 and pmpos plotted against k = 5, ..., 25]

Data 4: risk in health sciences data, Cadmium, Madeira (n = 24)
Tail Probability (sequential procedure according to the EVT TRILEMMA, conditional on k)
[Figure: tail probability estimates pgr, pratio and phw plotted against k = 5, ..., 25]

Main References
EU (2005). Regulation (EC) No. 78/2005. JO L16, 19.01.05 (pp. 43-45).
Fraga Alves, M.I., de Haan, L. and Neves, C. (2008). A test procedure for detecting super heavy tails. Journal of Statistical Planning and Inference. DOI: 10.1016/j.jspi.2008.04.026. In press.
de Haan, L. and Ferreira, A. (2006). Extreme Value Theory: An Introduction. Springer Series in Operations Research and Financial Engineering.
Neves, C. and Fraga Alves, M.I. (2008). Testing extreme value conditions - an overview and recent approaches. REVSTAT - Statistical Journal, Volume 6, Number 1, 83-100. Special issue on "Statistics of Extremes and Related Fields", edited by Jan Beirlant, Isabel Fraga Alves, Ross Leadbetter.
Neves, C. and Fraga Alves, M.I. (2007). The Ratio of Maximum to the Sum for Testing Super-Heavy Tails. In Advances in Mathematical and Statistical Modeling, edited by Barry Arnold, Narayanaswamy Balakrishnan, José M. Sarabia and Roberto Minguez. Series Statistics for Industry and Technology. Birkhäuser Boston. In press.
Neves, C., Picek, J. and Fraga Alves, M.I. (2006). Contribution of the maximum to the sum of excesses for testing max-domains of attraction. JSPI, 136, 4, 1281-1301.
Neves, C. and Fraga Alves, M.I. (2006). Semi-parametric Approach to Hasofer-Wang and Greenwood Statistics in Extremes. TEST, 16, 297-313.