
Change Point Analysis of Extreme Values
TIES 2008

Goedele Dierckx
Economische Hogeschool Sint Aloysius, Brussels, Belgium
e-mail: goedele.dierckx@hubrussel.be

Jef L. Teugels
Katholieke Universiteit Leuven, Belgium & EURANDOM, Technical University Eindhoven, the Netherlands
e-mail: jef.teugels@wis.kuleuven.be

Overview

1. Introduction
2. Test statistic
   (a) Construction
   (b) Extreme value situation
   (c) Asymptotics
   (d) Practical procedure
3. Examples
   (a) Simulation
   (b) Malaysian Stock Index: classical approach, improved approach
   (c) Nile data
   (d) Swiss-Re catastrophic data
4. Conclusions
5. References

1. INTRODUCTION

We start with an example where a change point has occurred: 987 measurements of the daily stock market returns of the Malaysian Stock Index, from January 1995 to December 1998, covering the Asian financial crisis of July 1997.

[Figure: daily returns (vertical axis from -0.1 to 0.2) against observation number 1 to 987, i.e. 1/1/95 to 31/12/98.]

Changes
- in distribution?
- in parameters of a distribution?
- in central behavior?
- in tail behavior?

2. TEST STATISTIC

2.a. Construction of Test Statistic

Start with a sample $X_1, \ldots, X_m, X_{m+1}, \ldots, X_n$, where $X_i$ has density function $f(x; \theta_i, \eta)$. Csörgő and Horváth (1997) test whether $\theta_i$ changes at some point $m$:

$$H_0: \theta_1 = \theta_2 = \cdots = \theta_n \quad \text{versus} \quad H_1: \theta_1 = \cdots = \theta_m \neq \theta_{m+1} = \cdots = \theta_n \ \text{for some } m,$$

using the test statistic

$$Z_n = \max_{1 \le m < n} \left( -2 \log \Lambda_m \right),$$

where

$$\Lambda_m = \frac{\sup_{\theta,\eta} \prod_{i=1}^{n} f(X_i; \theta, \eta)}{\sup_{\theta,\tau,\eta} \prod_{i=1}^{m} f(X_i; \theta, \eta) \prod_{i=m+1}^{n} f(X_i; \tau, \eta)}.$$

Example

For the exponential distribution, where $X_i$ has mean $\theta_i$,

$$-2 \log \Lambda_m = 2 \left[ -m \log\left( \frac{1}{m} \sum_{i=1}^{m} X_i \right) - (n-m) \log\left( \frac{1}{n-m} \sum_{i=m+1}^{n} X_i \right) + n \log\left( \frac{1}{n} \sum_{i=1}^{n} X_i \right) \right].$$

For large $n$, $m$ and $n-m$ one can expect normal behaviour, expressed in terms of Brownian motions.
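As an illustration of the exponential example, the following minimal sketch evaluates $-2\log\Lambda_m$ for every split point and returns the maximum $Z_n$ together with its location. The function names `neg2_log_lambda_exp` and `z_statistic_exp` are hypothetical and not taken from the slides; the data are assumed to be strictly positive.

```python
import math


def neg2_log_lambda_exp(x, m):
    """-2 log Lambda_m for exponential data split after the first m points."""
    n = len(x)
    if not 1 <= m < n:
        raise ValueError("m must satisfy 1 <= m < n")
    mean_all = sum(x) / n
    mean_left = sum(x[:m]) / m
    mean_right = sum(x[m:]) / (n - m)
    # Same three terms as in the slide's formula, with the sign convention
    # that makes the statistic non-negative.
    return 2.0 * (
        n * math.log(mean_all)
        - m * math.log(mean_left)
        - (n - m) * math.log(mean_right)
    )


def z_statistic_exp(x):
    """Return (Z_n, m_hat): the maximum of -2 log Lambda_m and where it occurs."""
    stats = {m: neg2_log_lambda_exp(x, m) for m in range(1, len(x))}
    m_hat = max(stats, key=stats.get)
    return stats[m_hat], m_hat
```

For a sample whose mean jumps after the fifth observation, such as `[1.0] * 5 + [10.0] * 5`, the maximum is attained at `m_hat = 5`; for a constant sample the statistic is zero up to rounding.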

2.b. Extreme Value Situation

Assume that $X_{n,n}$ is the maximum in a sample of independent random variables with a common distribution.

Maximum domain of attraction condition:

$$\lim_{n \to \infty} P\left( \frac{X_{n,n} - b_n}{a_n} \le x \right) = G_\gamma(x).$$

Under very weak conditions we get the approximation

$$P(X_{n,n} \le y) \approx G_\gamma\left( \frac{y - b_n}{a_n} \right), \quad \text{i.e. with } y = b_n + a_n x,$$

where $\gamma$ is a real-valued extreme value index and

$$G_\gamma(x) = \exp\left\{ -(1 + \gamma x)_+^{-1/\gamma} \right\}$$

is an extremal law. When $\gamma > 0$ we end up with heavy right-tailed distributions, the Pareto-Fréchet case.
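The extremal law can be evaluated directly. The sketch below is only an illustration: the name `gev_cdf` is hypothetical, and the case $\gamma = 0$ is handled through the standard Gumbel limit $\exp(-e^{-x})$, which the slide does not write out.

```python
import math


def gev_cdf(x, gamma):
    """Extreme value distribution function G_gamma(x)."""
    if gamma == 0.0:
        # Gumbel limit of (1 + gamma * x) ** (-1 / gamma) as gamma -> 0.
        return math.exp(-math.exp(-x))
    t = 1.0 + gamma * x
    if t <= 0.0:
        # Outside the support: below it when gamma > 0, above it when gamma < 0.
        return 0.0 if gamma > 0.0 else 1.0
    return math.exp(-(t ** (-1.0 / gamma)))
```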

We concentrate on changes of parameters that describe the tail of distributions appearing in extreme value analysis.

$X$ has a Pareto-type distribution with parameter $\theta = \gamma$ when the relative excesses of $X$ over a high threshold $u$, given that $X$ exceeds $u$, satisfy the condition

$$P\left( \frac{X}{u} > x \,\Big|\, X > u \right) \to x^{-1/\gamma}, \quad u \to \infty.$$

More generally, $X$ follows a Generalized Pareto distribution (GPD) with parameter $\theta = (\gamma, \sigma)$ if the behavior of the absolute excesses over a high threshold $u$ satisfies the condition

$$P(X - u > x \mid X > u) \approx \left( 1 + \frac{\gamma x}{\sigma} \right)^{-1/\gamma}, \quad u \to \infty.$$

For large values, the logarithm of the relative excess of a Pareto-type variable with extreme value index $\gamma_i$ is close to exponential with mean $\gamma_i$.

The most classical approach for the estimation of the extreme value index $\gamma > 0$ is to use the Hill estimator:

$$H_{k,n} = \frac{1}{k} \sum_{i=1}^{k} \log X_{n-i+1,n} - \log X_{n-k,n}.$$

Hence, only a segment of the available data is used, and the choice of $k$ is important. Alternatively, we look at extremes above a threshold $u = X_{n-k,n}$.

The Hill estimator has
- small bias but large variance for small $k$,
- large bias but small variance for large $k$.

As a compromise we select $k$ such that the empirical mean squared error is minimal.
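The Hill estimator is short enough to write out. This is a minimal sketch with a hypothetical function name `hill_estimator`; it assumes positive data and does not attempt the mean-squared-error selection of $k$ mentioned above.

```python
import math


def hill_estimator(x, k):
    """Hill estimator H_{k,n} based on the k largest observations."""
    n = len(x)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    xs = sorted(x)                # xs[j - 1] is the order statistic X_{j,n}
    threshold = xs[n - k - 1]     # X_{n-k,n}
    if threshold <= 0:
        raise ValueError("the threshold X_{n-k,n} must be positive")
    top = xs[n - k:]              # X_{n-k+1,n}, ..., X_{n,n}
    return sum(math.log(v) for v in top) / k - math.log(threshold)
```

Evaluating the function over a range of `k` values gives the usual Hill plot, from which the bias and variance trade-off described above can be inspected.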

1. Pareto-type density

Suppose $X_1, \ldots, X_m, X_{m+1}, \ldots, X_n$ are independent and Pareto-type distributed. We denote the extreme value index for $X_i$ by $\gamma_i$. In order to determine whether the index $\gamma$ changes or not, we perform the following test:

$$H_0: \gamma_1 = \gamma_2 = \cdots = \gamma_n = \gamma \quad \text{versus} \quad H_1: \gamma_1 = \cdots = \gamma_m \neq \gamma_{m+1} = \cdots = \gamma_n \ \text{for some } m.$$

Hence

$$Z_n = \max_{1 \le m < n} \left( -2 \log \Lambda_m \right),$$

where in turn

$$\log \Lambda_m = \left[ k_1 \log H_{k_1,m} + (k - k_1) \log H_{k-k_1,n-m} - k \log H_{k,n} \right] + \left[ \frac{1}{H_{k,n}} \left( k_1 H_{k_1,m} + (k - k_1) H_{k-k_1,n-m} - k H_{k,n} \right) \right].$$

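2. GPD

Suppose now that $X_i$ is GPD with parameters $\theta_i = (\gamma_i, \sigma_i)$. To perform the test

$$H_0: \theta_1 = \theta_2 = \cdots = \theta_n \quad \text{versus} \quad H_1: \theta_1 = \cdots = \theta_m \neq \theta_{m+1} = \cdots = \theta_n \ \text{for some } m,$$

we use as test statistic

$$Z_n = \max_{1 \le m < n} \left( -2 \log \Lambda_m \right),$$

where

$$-2 \log \Lambda_m = 2 \left[ L_{k_1}(\hat\theta_{k_1}) + L^+_{k_1}(\hat\theta^+_{k_1}) - L_k(\hat\theta_k) \right],$$

$$L_m(\hat\theta_m) = -m \log \hat\sigma_m - \left( \frac{1}{\hat\gamma_m} + 1 \right) \sum_{i=1}^{m} \log\left( 1 + \hat\gamma_m \frac{x_i}{\hat\sigma_m} \right),$$

$$L^+_m(\hat\theta^+_m) = -(n-m) \log \hat\sigma^+_m - \left( \frac{1}{\hat\gamma^+_m} + 1 \right) \sum_{i=m+1}^{n} \log\left( 1 + \hat\gamma^+_m \frac{x_i}{\hat\sigma^+_m} \right),$$

and the likelihood estimators $(\hat\gamma_m, \hat\sigma_m)$ resp. $(\hat\gamma^+_m, \hat\sigma^+_m)$, based on $X_1, X_2, \ldots, X_m$ and $X_{m+1}, \ldots, X_n$, are obtained by numerical procedures.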
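To make the GPD likelihood terms concrete, here is a rough sketch under stated assumptions. `gpd_loglik` evaluates the log-likelihood for $\gamma > 0$ on non-negative excesses; `gpd_fit_grid` is a deliberately crude grid search that stands in for the unspecified "numerical procedures"; `neg2_log_lambda_gpd` combines the two fits. All three names, and the grid itself, are hypothetical and not taken from the slides.

```python
import math


def gpd_loglik(excesses, gamma, sigma):
    """GPD log-likelihood for gamma > 0, sigma > 0 and non-negative excesses."""
    if gamma <= 0.0 or sigma <= 0.0:
        raise ValueError("this sketch requires gamma > 0 and sigma > 0")
    total = sum(math.log(1.0 + gamma * x / sigma) for x in excesses)
    return -len(excesses) * math.log(sigma) - (1.0 / gamma + 1.0) * total


def gpd_fit_grid(excesses, gammas, sigmas):
    """Return (loglik, gamma, sigma) maximising the likelihood over a grid."""
    best = None
    for g in gammas:
        for s in sigmas:
            ll = gpd_loglik(excesses, g, s)
            if best is None or ll > best[0]:
                best = (ll, g, s)
    return best


def neg2_log_lambda_gpd(left, right, gammas, sigmas):
    """2 * [L(left fit) + L(right fit) - L(pooled fit)] on the same grid."""
    l_left = gpd_fit_grid(left, gammas, sigmas)[0]
    l_right = gpd_fit_grid(right, gammas, sigmas)[0]
    l_all = gpd_fit_grid(left + right, gammas, sigmas)[0]
    return 2.0 * (l_left + l_right - l_all)
```

Because the two groups are each fitted over the same grid as the pooled sample, the resulting statistic is non-negative up to rounding. A proper optimiser would replace the grid in practice.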

2.c. Asymptotics

Using the procedure suggested by Csörgő and Horváth we have the following.

Theorem. Suppose $X_1, \ldots, X_m, X_{m+1}, \ldots, X_n$ are independent and identically distributed. We set the threshold at $u = X_{n-k,n}$. Define

$$Z_n = \max_{c_n \le m < n - d_n} \left( -2 \log \Lambda_m \right),$$

with $-2 \log \Lambda_m$ as before. Let $n, k \to \infty$ such that $k/n \to 0$. Let further $c_n$ and $d_n$ be intermediate sequences for which $c_n/n \to 0$ and $d_n/n \to 0$. Then, under $H_0$ of our test,

$$Z_n \xrightarrow{d} \begin{cases} \sup_{0 \le t < 1} \dfrac{B^2(t)}{t(1-t)} & \text{if Pareto-type}, \\[2ex] \sup_{0 \le t < 1} \dfrac{B_2^2(t)}{t(1-t)} & \text{if GPD}. \end{cases}$$

$B(t)$ is a Brownian bridge, $B_2(t)$ is a sum of two independent Brownian bridges.

2.d. Practical Procedure

Consecutive steps:

1. Check on Pareto-type behavior of the data by Q-Q plots.
2. Select a threshold $u$ or the value of $k = k_{opt,n}$ that minimizes the asymptotic mean squared error of the Hill estimator. We choose the optimal threshold $u = X_{n-k_{opt},n}$.
3. (a) Define $c_n$ as the smallest number such that at least $k_{min} = (\log k_{opt,n})^{3/2}$ of the data points $X_1, \ldots, X_{c_n}$ are larger than $u$.
   (b) Define $d_n$ as the smallest number such that at least $k_{min}$ of the data points $X_{n-d_n+1}, \ldots, X_n$ are larger than $u$.
4. Repeat the next step for all $m$ from $c_n$ up to $n - d_n$.
   (a) Split the data up in two groups $X_1, X_2, \ldots, X_m$ and $X_{m+1}, \ldots, X_n$.
   (b) Calculate $-2 \log \Lambda_m$.
5. Calculate $Z_n = \max_{c_n \le m < n - d_n} \left( -2 \log \Lambda_m \right)$ and compare $Z_n$ with the critical values for sample size $k$.
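Steps 2 to 5 can be sketched as a single scan. The code below is a simplified illustration, not the authors' implementation: it assumes positive data without ties, takes `k` as given instead of selecting it by mean squared error, and treats the log relative excesses $\log(X_i/u)$ over the common threshold $u$ as exponential, so that the exponential likelihood ratio is applied to the exceedances in each group. The name `change_point_scan` is hypothetical, and critical values are not computed.

```python
import math
import random


def change_point_scan(x, k):
    """Return (Z_n, m_hat, scan), where scan maps m to -2 log Lambda_m."""
    n = len(x)
    if not 2 <= k < n:
        raise ValueError("need 2 <= k < n")
    u = sorted(x)[n - k - 1]                     # threshold u = X_{n-k,n}
    if u <= 0:
        raise ValueError("the threshold must be positive")
    # Log relative excess for exceedances, None for points not above u.
    e = [math.log(v / u) if v > u else None for v in x]
    k_min = math.log(k) ** 1.5

    def smallest_prefix(values):
        """Smallest prefix length containing at least k_min exceedances."""
        count = 0
        for i, v in enumerate(values, start=1):
            if v is not None:
                count += 1
            if count >= k_min:
                return i
        return len(values)

    c_n = smallest_prefix(e)
    d_n = smallest_prefix(e[::-1])               # same rule, counted from the end

    k_all = sum(v is not None for v in e)
    s_all = sum(v for v in e if v is not None)
    scan, k1, s1 = {}, 0, 0.0
    for m in range(1, n):
        if e[m - 1] is not None:
            k1 += 1
            s1 += e[m - 1]
        if c_n <= m < n - d_n:
            k2, s2 = k_all - k1, s_all - s1
            scan[m] = 2.0 * (
                k_all * math.log(s_all / k_all)
                - k1 * math.log(s1 / k1)
                - k2 * math.log(s2 / k2)
            )
    if not scan:
        raise ValueError("no admissible split point between c_n and n - d_n")
    m_hat = max(scan, key=scan.get)
    return scan[m_hat], m_hat, scan
```

A quick check on synthetic data could use `random.Random(0).paretovariate(alpha)` to build a sample whose tail index changes halfway, then inspect the returned `m_hat`. The returned `Z_n` still has to be compared with a critical value obtained elsewhere.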

3.a. Simulation

We simulate 1000 data sets of size $n$ (with $n = 100$, $n = 500$) from the Burr distribution Burr($\beta, \tau, \lambda$) with parameters as given by

$$P(X > x) = \left( \frac{\beta}{\beta + x^\tau} \right)^{\lambda},$$

an example of a GPD with $\gamma = (\lambda\tau)^{-1}$. The rejection probabilities are given below.

                H0 true    H0 false
  n     m       γ = 1      γ1 = 1    γ1 = 2    γ1 = 1    γ1 = .5
                           γ2 = 2    γ2 = 1    γ2 = .5   γ2 = 2
  100   20      .096       .191      .460      .486      .182
  100   50      .075       .517      .512      .519      .559
  500   50      .029       .181      .782      .799      .144
  500   100     .044       .378      .955      .951      .645
  500   250     .019       .894      .951      .966      .909
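Burr samples with a change in $\gamma$ can be drawn by inverting the survival function above. The sketch below is an illustration only: the function names are hypothetical, and the defaults $\beta = 1$ and $\tau = 1$ are assumptions, since the slides do not state which parameter values were used. Given $\gamma$ and $\tau$, $\lambda$ is set to $1/(\gamma\tau)$.

```python
import random


def burr_survival(x, beta, tau, lam):
    """P(X > x) for the Burr(beta, tau, lambda) distribution."""
    return (beta / (beta + x ** tau)) ** lam


def burr_quantile(p, beta, tau, lam):
    """Value x such that P(X > x) = p, for 0 < p <= 1."""
    return (beta * (p ** (-1.0 / lam) - 1.0)) ** (1.0 / tau)


def burr_change_sample(n, m, gamma1, gamma2, rng, beta=1.0, tau=1.0):
    """First m points with index gamma1, remaining n - m points with gamma2."""
    sample = []
    for i in range(n):
        gamma = gamma1 if i < m else gamma2
        lam = 1.0 / (gamma * tau)     # so that gamma = 1 / (lambda * tau)
        p = 1.0 - rng.random()        # uniform on (0, 1], used as survival probability
        sample.append(burr_quantile(p, beta, tau, lam))
    return sample
```

For example, `burr_change_sample(100, 20, 1.0, 2.0, random.Random(1))` produces one data set of the kind summarised in the row $n = 100$, $m = 20$ with $\gamma_1 = 1$, $\gamma_2 = 2$.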

The corresponding median of $\hat m$ is given in the table below.

                H0 false
  n     m       γ1 = 1    γ1 = 2    γ1 = 1    γ1 = .5
                γ2 = 2    γ2 = 1    γ2 = .5   γ2 = 2
  100   20      48        21        45        21
  100   50      55        44        56        45
  500   50      175       47        92        48
  500   100     139       97        107       97
  500   250     252       247       252       248

[Figure: boxplots of $\hat m$ for the Burr cases with $n = 500$ and $m = 100$.]

3.b. Malaysian Stock Index: Classical approach

The figure below indicates that the data are Pareto-type distributed. If we accept that July 1997 was a change point, then the data before that date give an extreme value index $\gamma_1$ between 0.1 and 0.2, while those after that date give $\gamma_2$ around 0.5. The mean squared error of the Hill estimator based on the whole data set attains a local minimum for the threshold $u$ given by $X_{987-224,987} = 0.0099$, so that $k = k_{opt} = 224$.

[Figure: diagnostic plot for Pareto-type behavior of the returns.]

1. Pareto-type distribution

First, $-2 \log \Lambda_m$, $1 \le m \le n-1$, is plotted below.

[Figure: graph of $(m, -2 \log \Lambda_m)$ for observations 1 to 987 (1/1/95 to 31/12/98), with the critical value indicated by a horizontal line.]

We see that $Z_n = \max(-2 \log \Lambda_m) = 5.8$ falls above the critical value 3.14 and we reject $H_0$. The maximum is attained at $m = 635$, which corresponds to 1/08/1997, shortly after the beginning of the Asian crisis.

2. GPD

Now $-2 \log \Lambda_m$, $1 \le m \le n-1$, is plotted below.

[Figure: graph of $(m, -2 \log \Lambda_m)$ for observations 1 to 987 (1/1/95 to 31/12/98).]

Since $Z_n = \max(-2 \log \Lambda_m) = 5.93$ is above the critical value 3.18, we again reject $H_0$. The estimated instant of change, $\hat m = 636$, is again very close to the previous value.

3.b. Malaysian Stock Index: Improved approach

In the above analysis, we assumed that the data were independent. But market data are hardly ever independent. However, it is known that the Hill estimator withstands many forms of dependence. Alternatively, one can proceed as follows. The time series and an estimate of the extremal index are given below.

[Figure: time series of the daily returns (1/1/95 to 31/12/98) and an estimate of the extremal index.]

A declustering scheme cuts the data into clusters that can safely be taken as independent. Apply the previous procedure to the 76 cluster maxima.
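The slides do not say which declustering scheme was used. Purely as an illustration, the sketch below implements runs declustering, one common choice: a cluster of exceedances of `u` is closed once `r` consecutive observations fall at or below `u`, and each cluster contributes its maximum. The function name `runs_decluster` and the parameters `u` and `r` are assumptions, not taken from the source.

```python
def runs_decluster(x, u, r):
    """Return a list of (index, value) cluster maxima under runs declustering."""
    maxima = []
    best = None      # (index, value) of the running maximum of the open cluster
    gap = 0          # consecutive non-exceedances since the last exceedance
    for i, v in enumerate(x):
        if v > u:
            if best is None or v > best[1]:
                best = (i, v)
            gap = 0
        elif best is not None:
            gap += 1
            if gap >= r:
                maxima.append(best)   # close the current cluster
                best, gap = None, 0
    if best is not None:
        maxima.append(best)           # cluster still open at the end of the series
    return maxima
```

The indices returned alongside the maxima make it possible to map a change point found among the cluster maxima back to a position in the original series.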

1. Pareto-type distribution

[Figure: graph of $-2 \log \Lambda_m$ for the cluster maxima, 1/1/95 to 31/12/98.]

There is a local maximum at cluster maximum 48, which corresponds to $m = 631$ (28/07/1997). However, this local maximum is not larger than the critical value. The actual maximum $Z_n$ is attained at cluster maximum 66, which corresponds to $m = 854$ (22/6/98). We cannot reject the hypothesis.

2. GPD

[Figure: graph of $-2 \log \Lambda_m$ for the cluster maxima, 1/1/95 to 31/12/98.]

Now $-2 \log \Lambda_m$, $1 \le m \le n-1$, is plotted in the figure. The maximum $Z_n$ is attained at cluster maximum 48, which corresponds to $m = 631$ (28/07/1997). The critical value 2.95 for the test is indicated with a horizontal line. On the basis of this test, we reject the hypothesis of no change.

3.c. Nile Data

Annual flow volume of the Nile River at Aswan from 1871 to 1970.

1120 1160  963 1210 1160 1160  813 1230 1370 1140
 995  935 1110  994 1020  960 1180  799  958 1140
1100 1210 1150 1250 1260 1220 1030 1100  774  840
 874  694  940  833  701  916  692 1020 1050  969
 831  726  456  824  702 1120 1100  832  764  821
 768  845  864  862  698  845  744  796 1040  759
 781  865  845  944  984  897  822 1010  771  676
 649  846  812  742  801 1040  860  874  848  890
 744  749  838 1050  918  986  797  923  975  815
1020  906  901 1170  912  746  919  718  714  740

Prior studies indicate:
- 1877 (measurement 813): candidate for additive outlier,
- 1913 (measurement 456): candidate for additive outlier,
- 1899 (measurement 774): start of construction of the Aswan dam.

[Figure: time series plot of the Nile data (vertical axis: nile; horizontal axis: Time).]

Group 1: first 28 points.
Group 2: remaining 71 points.

Pareto QQ plots are shown for both groups. Optimal values $k = 17$, resp. $k = 13$, lead to the estimators 0.07 and 0.13.

[Figure: two panels, each titled "Pareto quantile plot" (vertical axis: log(X); horizontal axis: Quantiles of Standard Exponential).]

The change point detection results based on the Pareto and the GPD models are given in the figure. Both lead to a significant change point at $\hat m = 28$, at the beginning of construction of the Aswan dam.

[Figure: two panels (vertical axis: sqrt(hh$ll); horizontal axis: Time).]

3.d. Swiss-Re Catastrophic Data

PLACE              DATE     VICTIMS
Bangladesh         0,87     300000
Indonesia          34,98    280000
China              6,57     255000
Bangladesh         21,33    138000
Peru               0,51     66000
Gilan (Iran)       20,47    50000
Bam (Iran)         33,98    26271
Tabas (Iran)       8,71     25000
Armenia            18,93    25000
Colombia           15,87    23000
Guatemala          6,10     22084
Izmit (Turkey)     29,63    19118
Gujarat (India)    31,07    15000
India              8,67     15000
India              29,83    15000
India              9,61     15000
India              1,83     10800
Venezuela          29,95    10000
Bangladesh         7,89     10000
Mexico             15,72    9500
India              23,75    9500
Honduras           28,81    9000
Kobe (Japan)       25,05    6425
Philippines        21,85    6304
Pakistan           4,99     5300
Brazil             31,87    5112

The Pareto QQ plots with the corresponding Hill estimators. The mean squared error is minimal at $k = 39$ for Pareto and $k = 22$ for GPD, both leading to $\hat\gamma = 1.3$.

[Figure: left panel "Pareto quantile plot" (vertical axis: log(X); horizontal axis: Quantiles of Standard Exponential); right panel "Estimates of extreme value index" (vertical axis: gamma; horizontal axis: k).]

The likelihood expression $-2 \log \Lambda_m$ based on the Pareto model and the GPD model is shown as a function of $m$, where $m$ indicates where the data are split into two groups.

[Figure: graphs for the Pareto model with critical value 1.6 and for the GPD model with critical value 1.4.]

4. CONCLUSIONS

- What has been shown are just first attempts
- Assumption on positive $\gamma$
- Rounded figures make accurate conclusions harder
- There is a need for sufficiently large data sets
- Need for studies under specific dependence structures
- Multivariate extensions should be possible

5. REFERENCES

Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J.L. (2004). Statistics of Extremes: Theory and Applications. Wiley, Chichester.

Csörgő, M. and Horváth, L. (1997). Limit Theorems in Change-Point Analysis. Wiley, Chichester.