Quantile-quantile plots and the method of peaks-over-threshold


Problems in SF2980 2009-11-09

[Figure 2: QQ-plot of log-returns (x-axis) against quantiles of a standard t-distribution with 4 degrees of freedom (y-axis). Note that in this plot the empirical quantiles are on the x-axis and the reference (\(t_4\)) quantiles are on the y-axis.]

Quantile-quantile plots and the method of peaks-over-threshold

Problem 1
You have bought 10 shares of a stock with share price $1 today and with historical daily log-returns shown in the QQ-plot in Figure 2. For a standard t-distributed random variable \(Z\) with 4 degrees of freedom it holds that \(\mathrm{ES}_{0.01}(Z) = 5.22\). Estimate \(\mathrm{ES}_{0.01}\) for your linearized portfolio net worth from today until tomorrow.

Problem 2
Below you find the 24 insurance claims that exceed 10 million kronor out of in total 1949 claims (the total sample size is hence \(1949/24 \approx 81\) times larger). The claims are assumed to be observations of independent and identically distributed random variables \(X_1, \dots, X_{1949}\).

 [1] 152.413209  95.168375  41.213000  34.605146  23.191095  18.301611
 [7]  17.569546  16.934801  16.753927  15.527950  15.213358  14.577883
[13]  14.023591  12.701101  12.152200  12.150434  12.059369  11.310395
[19]  11.089682  11.081675  10.692103  10.471204  10.248902  10.125362

(a) Compute the empirical estimate of \(F_X^{-1}(0.99)\) based on all 1949 claims.

(b) Recall the POT approximation
\[ P(X > u + x) \approx \frac{N_u}{n}\Bigl(1 + \hat\gamma\,\frac{x}{\hat\beta}\Bigr)^{-1/\hat\gamma}, \quad x > 0, \]
where \(n\) is the total sample size and \(N_u\) the number of observations exceeding the threshold \(u\). Use the parameter estimates \((\hat\gamma, \hat\beta) = (0.5, 4.5)\) and the POT approximation above to estimate \(F_X^{-1}(0.99)\).

[Figure 3: QQ-plots comparing the samples X, Y, Z and W; the panel axes are labelled X, Z, W and Y.]

Problem 3
Consider four iid samples of log-returns, each of size 500, for four different stocks. The samples \(X = \{x_1, \dots, x_{500}\}\), \(Y = \{y_1, \dots, y_{500}\}\), \(Z = \{z_1, \dots, z_{500}\}\) and \(W = \{w_1, \dots, w_{500}\}\) are compared in the QQ-plots in Figure 3. The log-return distributions are \(N(0, 1)\), \(N(0, \sigma^2)\) with \(\sigma > 1\), \(t(3)\) and \(t(10)\), where \(t(\nu)\) denotes a standard t-distribution with \(\nu\) degrees of freedom. Determine the correct match between the four samples and the four distributions.

Problem 4
Consider a portfolio loss \(L\) with distribution function \(F\). It is assumed that \(\bar F = 1 - F \in RV_{-2}\), i.e. \(\bar F\) is regularly varying at infinity with index \(-2\). To account for stochastic volatility you introduce the random variable \(\sigma\) with distribution \(P(\sigma = 1/2) = 3/4\), \(P(\sigma = 2) = 1/4\), which represents two possible volatility regimes. It is assumed that \(\sigma\) and \(L\) are independent. Which of the models \(L\) and \(\sigma L\) gives asymptotically the biggest probability for large losses? That is, compute the limit
\[ \lim_{z \to \infty} \frac{P(\sigma L > z)}{P(L > z)}. \]

Problem 5
Let \(X\) be a financial loss with distribution function \(F\) satisfying \(\bar F = 1 - F \in RV_{-\alpha}\) (\(\alpha > 0\)), i.e. \(\bar F\) is a regularly varying function.

(a) Compute \(\lim_{u \to \infty} P(X - u > ux/\alpha \mid X > u)\) for \(x > 0\).

(b) Find a sequence \((a_n)\) such that \(\lim_{n \to \infty} n \bar F(a_n x) = x^{-\alpha}\) for \(x > 0\). You may assume that \(F\) is continuous and strictly increasing.

(c) Let \((X_k)\) be a sequence of iid random variables satisfying \(X_k \overset{d}{=} X\) (equal in distribution) for each \(k\). Show that \(\lim_{n \to \infty} n \bar F(a_n x) = x^{-\alpha}\) for \(x > 0\) implies
\[ \lim_{n \to \infty} P\bigl(a_n^{-1} \max\{X_1, \dots, X_n\} \le x\bigr) = H(x), \quad x > 0, \]
and determine \(H(x)\).

(d) Hill estimation on an iid sample of size \(n\) from the distribution of \(X\) gives you the Hill estimate
\[ \Bigl(\frac{1}{k} \sum_{j=1}^{k} (\ln x_{j,n} - \ln x_{k,n})\Bigr)^{-1} \approx 2, \]
where \(x_{1,n} > \dots > x_{n,n}\) and \(k\) is assumed to be optimally chosen. Use this information to approximate \(P(X > 2 \cdot 10^6 \mid X > 10^6)\).

Problem 6
Let \(X_1, \dots, X_n\) be observations of iid random variables. The quantile-quantile plot of the sample with respect to a reference distribution \(F\) consists of the points
\[ \Bigl\{ \Bigl( F^{\leftarrow}\Bigl(\frac{n - k + 1}{n + 1}\Bigr),\; X_{k,n} \Bigr) : k = 1, \dots, n \Bigr\}. \]

(a) Explain how the quantile-quantile plot can be used as a graphical tool to check if the \(X_i\) are approximately distributed according to \(F\).

(b) Let \(G(x) = F((x - \mu)/\sigma)\) for \(\mu \in \mathbb{R}\), \(\sigma > 0\), and suppose that the quantile-quantile plot with respect to \(F\) is approximately linear. Explain why the quantile-quantile plot with respect to \(G\) is also approximately linear. The answer must be properly motivated.

(c) In Figure 4, a quantile-quantile plot is constructed from a sample of 10000 data points, \(X_1, \dots, X_{10000}\), with respect to a reference distribution \(F\). Does the reference distribution have a too light (right) tail or a too heavy (right) tail? The answer must be properly motivated.

Problem 7
Let \(X, X_1, X_2, \dots, X_n\) be iid random variables and suppose \(x_1, x_2, \dots, x_n\) are observations of \(X_1, X_2, \dots, X_n\). In the peaks-over-threshold method (POT) the tail of the distribution \(F\) of \(X\) is approximated by
\[ \bar F(u_0 + y) \approx \bar F(u_0)\, \bar G_{\gamma;\beta}(y), \quad y \ge 0, \qquad (1) \]
for some (large) \(u_0\), where \(G_{\gamma;\beta}\) is the generalised Pareto distribution and \(\bar G_{\gamma;\beta} = 1 - G_{\gamma;\beta}\).

[Figure 4: Quantile-quantile plot in Problem 6 (c).]

(a) Assume that \(0 < \gamma < 1\). Use the approximation (1) to show that an approximation of the mean excess function \(e(u)\) of \(X\) for \(u \ge u_0\) is given by
\[ e(u) = \frac{\beta + \gamma (u - u_0)}{1 - \gamma}, \quad u \ge u_0. \]

(b) In Figure 5, the empirical (or sample) mean excess function \(e_n(u)\) for the observations \(x_1, x_2, \dots, x_n\) is plotted. Determine from this figure an estimate of \(u_0\).

(c) Use Figure 5 to find (graphically) estimates for \(\gamma\) and \(\beta\).

Hint for (a): The generalised Pareto distribution is given by
\[ G_{\gamma;\beta}(x) = \begin{cases} 1 - \bigl(1 + \frac{\gamma}{\beta} x\bigr)^{-1/\gamma}, & \gamma \ne 0, \\ 1 - e^{-x/\beta}, & \gamma = 0, \end{cases} \]
where \(x \ge 0\) if \(\gamma \ge 0\) and \(0 \le x \le -\beta/\gamma\) if \(\gamma < 0\). You may also use that for \(a > 1\), \(b > 0\),
\[ \int_z^{\infty} (x - z)\, a b\, (1 + b x)^{-a-1}\, dx = \frac{1}{b(a - 1)} (1 + b z)^{1 - a}. \]

Problem 8
Let \(X\) be a positive random variable with distribution function \(F\) satisfying \(\bar F \in RV_{-\alpha}\), \(\alpha > 0\) (i.e. \(\bar F = 1 - F\) is regularly varying with index \(-\alpha\)).

(a) It holds that
\[ \lim_{u \to \infty} \sup_{x \in (0, \infty)} \bigl| F_u(x) - G_{\gamma,\beta(u)}(x) \bigr| = 0, \]
where \(F_u\) is the excess distribution function \(F_u(x) = P(X - u \le x \mid X > u)\) of \(X\) over the threshold \(u\) and \(\bar G_{\gamma,\beta(u)}(x) = 1 - G_{\gamma,\beta(u)}(x) = (1 + \gamma x/\beta(u))^{-1/\gamma}\), \(x \ge 0\). Derive the POT approximation of the tail probability \(P(X > u + x)\) expressed in terms of \(x\), \(u\), \(\gamma\) and \(\beta(u)\).

Suppose that \(F(x) = 1 - C x^{-\alpha}\) for \(x \ge C^{1/\alpha}\), where \(C > 0\) and \(\alpha > 0\).

[Figure 5: Empirical (sample) mean excess function for the sample \(x_1, \dots, x_n\).]

(b) For some function \(\beta : (0, \infty) \to (0, \infty)\) and \(\gamma > 0\) we have
\[ \lim_{u \to \infty} P\bigl((X - u)/\beta(u) > x \mid X > u\bigr) = \bar G_{\gamma,1}(x), \quad x \ge 0, \qquad (2) \]
where \(\bar G_{\gamma,1}(x) = (1 + \gamma x)^{-1/\gamma}\). Find \(\gamma\) and \(\beta(u)\) so that (2) holds.

(c) The POT method provides the VaR estimator
\[ \hat F^{-1}_{X,\mathrm{POT}}(p) = u + \frac{\hat\beta(u)}{\hat\gamma} \Bigl( \Bigl( \frac{n}{N_u} (1 - p) \Bigr)^{-\hat\gamma} - 1 \Bigr), \]
where \(n\) is the sample size and \(N_u\) the number of observations larger than \(u\). Choose the threshold \(u = 10\) and \(p = 0.995\). Use the information in Figures 6 and 7 and parts (a) and (b) to obtain the estimates \(\hat\gamma\) and \(\hat\beta(u)\). Compute the empirical estimate \(\hat F_n^{-1}(p)\) and the POT estimate \(\hat F^{-1}_{X,\mathrm{POT}}(p)\) of \(F_X^{-1}(p)\) (give numeric results as well).

Problem 9
An insurance company defaults if its yearly loss is greater than \(l\). The yearly loss is given by \(L = X + Y\) and represents the sum of losses \(X \ge 0\) and \(Y \ge 0\) in two lines of business. Suppose that \(P(X \le x) = P(Y \le x) = F(x)\) and that \(P(X \le x, Y \le y) = \min\{F(x), F(y)\}\), where \(F\) is continuous and satisfies \(\bar F \in RV_{-\alpha}\), \(\alpha > 0\). The insurance company finds its overall position risky and wants to investigate how much diversification can reduce its default probability. Therefore, it considers the loss \(\tilde L = X + \tilde Y\), where \(\tilde Y \overset{d}{=} Y\) (i.e. equally distributed) but \(X\) and \(\tilde Y\) are independent. You are asked to compute \(P(\tilde L > l)/P(L > l)\), and since \(l\) is large you may approximate this ratio of default probabilities by its limit as \(l \to \infty\).

[Figure 6: 500 simulations from the distribution of X.]

[Figure 7: Hill plot of the sample, from the distribution of X, shown above.]

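(a) There exists a function \(f\) such that
\[ f(\alpha) = \lim_{l \to \infty} \frac{P(\tilde L > l)}{P(L > l)}. \]
Compute \(f(\alpha)\).

(b) Interpret the result in (a): for which regular variation indices can the default probability be reduced, and by how much, by having independent lines of business?

Hint: To get an upper bound for \(f(\alpha)\) you may use that for every \(\varepsilon \in (0, 1/2)\),
\[ P(\tilde L > l) = P(\tilde L > l, X \le \varepsilon l) + P(\tilde L > l, \tilde Y \le \varepsilon l) + P(\tilde L > l, X > \varepsilon l, \tilde Y > \varepsilon l). \]
Then show that this implies that
\[ P(\tilde L > l) \le P(\tilde Y > (1 - \varepsilon) l) + P(X > (1 - \varepsilon) l) + P(X > \varepsilon l, \tilde Y > \varepsilon l). \]
To get a lower bound you may use that
\[ P(\tilde L > l) \ge P(X > l) + P(\tilde Y > l) - P(X > l, \tilde Y > l). \]
Then show that for some \(g\) and \(h\), \(g(\alpha) \le f(\alpha) \le h(\alpha, \varepsilon)\) for every \(\varepsilon \in (0, 1/2)\) and that \(\lim_{\varepsilon \to 0} h(\alpha, \varepsilon) = g(\alpha)\).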

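Solutions

Problem 1
Let \(Z\) be a standard t-distributed random variable with four degrees of freedom, with distribution function \(F\). Let \(Y\) be the log-return from today until tomorrow, with distribution function \(G\). From Figure 2 we see that \(F^{-1}(q) = a + b\, G^{-1}(q)\) for \(q \in (0, 1)\), with \(a \approx -1\) and \(b \approx 50\). Hence, approximately, \(G^{-1}(q) = F^{-1}(q)/50 + 1/50\), and we may assume that \(Y \overset{d}{=} Z/50 + 1/50\). This means that the linearized net worth is \(X = 10 Y \overset{d}{=} Z/5 + 1/5\), and the linearized loss is \(L = -X \overset{d}{=} -Z/5 - 1/5 \overset{d}{=} Z/5 - 1/5\), because \(Z\) is symmetric. Then
\[ \mathrm{ES}_{0.01}(X) = \frac{1}{0.01} \int_{0.99}^{1} \bigl( F^{-1}(p)/5 - 1/5 \bigr)\, dp = \mathrm{ES}_{0.01}(Z)/5 - 1/5 \approx 0.844 \text{ (\$)}. \]

Problem 2
(a) \([1949 (1 - 0.99)] + 1 = 19 + 1 = 20\). Hence, \(\hat F_n^{-1}(0.99) = X_{20,1949} \approx 11.1\).

(b) We get the quantile \(F_X^{-1}(p)\) by solving \(F_X(F_X^{-1}(p)) = p\) and using the approximation
\[ 1 - F_X(x) \approx \frac{N_u}{n} \Bigl( 1 + \frac{\hat\gamma}{\hat\beta} (x - u) \Bigr)^{-1/\hat\gamma} \]
for \(x > u\). This gives
\[ F_X^{-1}(p) \approx u + \frac{\hat\beta}{\hat\gamma} \Bigl[ \Bigl( \frac{n}{N_u} (1 - p) \Bigr)^{-\hat\gamma} - 1 \Bigr]. \]
With the given values this yields
\[ F_X^{-1}(0.99) = 10 + 9 \bigl[ (0.81)^{-1/2} - 1 \bigr] = 10 + 9 \bigl[ (0.9)^{-1} - 1 \bigr] = 11. \]

Problem 3
\(X \sim t(3)\), \(Y \sim N(0, 1)\), \(Z \sim t(10)\), \(W \sim N(0, \sigma^2)\) with \(\sigma > 1\).

Problem 4
\[ \lim_{z \to \infty} \frac{P(\sigma L > z)}{P(L > z)} = \lim_{z \to \infty} \frac{3}{4} \frac{P(L > 2z)}{P(L > z)} + \lim_{z \to \infty} \frac{1}{4} \frac{P(L > z/2)}{P(L > z)} = \frac{3}{4}\, 2^{-2} + \frac{1}{4}\, 2^{2} = \frac{3}{16} + 1 > 1. \]
Hence, the model \(\sigma L\) gives loss probabilities that are bigger than those for the model \(L\) (asymptotically).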
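The two estimates in Problem 2 can be checked numerically. The following is a minimal Python sketch, not part of the original solutions; the function names `empirical_quantile` and `pot_quantile` are hypothetical helpers introduced here for illustration. It uses the 24 listed claims, the total sample size 1949 and the given estimates \((\hat\gamma, \hat\beta) = (0.5, 4.5)\).

```python
import math

# The 24 claims (million kronor) above the threshold u = 10, in decreasing
# order, copied from the statement of Problem 2.
claims = [152.413209, 95.168375, 41.213000, 34.605146, 23.191095, 18.301611,
          17.569546, 16.934801, 16.753927, 15.527950, 15.213358, 14.577883,
          14.023591, 12.701101, 12.152200, 12.150434, 12.059369, 11.310395,
          11.089682, 11.081675, 10.692103, 10.471204, 10.248902, 10.125362]


def empirical_quantile(largest_desc, n, p):
    """Empirical p-quantile as the k-th largest observation, k = [n(1-p)] + 1.

    `largest_desc` holds the largest observations in decreasing order; it must
    contain at least k values (hypothetical helper, not from the source).
    """
    k = math.floor(n * (1 - p)) + 1
    return largest_desc[k - 1]


def pot_quantile(u, beta, gamma, n, n_u, p):
    """POT quantile estimate u + (beta/gamma) * (((n/n_u)(1-p))**(-gamma) - 1)."""
    return u + beta / gamma * ((n / n_u * (1 - p)) ** (-gamma) - 1)


q_emp = empirical_quantile(claims, n=1949, p=0.99)
q_pot = pot_quantile(u=10.0, beta=4.5, gamma=0.5, n=1949, n_u=len(claims), p=0.99)
print(q_emp, q_pot)
```

The solution rounds \((n/N_u)(1 - p)\) to 0.81 before taking the power, so the unrounded POT value printed here differs slightly from 11.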

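Problem 5
(a) We have
\[ P(X - u > ux/\alpha \mid X > u) = \frac{P(X > u(1 + x/\alpha))}{P(X > u)} = \frac{\bar F(u(1 + x/\alpha))}{\bar F(u)} \to (1 + x/\alpha)^{-\alpha} \quad \text{as } u \to \infty, \]
using that \(\bar F\) is regularly varying with index \(-\alpha\).

(b) Taking \(x = 1\) shows that we need \(\lim_{n \to \infty} n \bar F(a_n) = \lim_{n \to \infty} n (1 - F(a_n)) = 1\). Hence one choice is \(a_n\) such that \(F(a_n) = 1 - 1/n\), i.e. we can choose \(a_n = F^{-1}(1 - 1/n)\). Now,
\[ n \bar F(a_n x) = \frac{\bar F(a_n x)}{1/n} = \frac{\bar F(a_n x)}{1 - F(F^{-1}(1 - 1/n))} = \frac{\bar F(a_n x)}{\bar F(a_n)} \to x^{-\alpha}. \]

(c)
\[ P(\max\{X_1, \dots, X_n\} \le a_n x) = F^n(a_n x) = (1 - \bar F(a_n x))^n = \Bigl( 1 - \frac{n \bar F(a_n x)}{n} \Bigr)^n \to \exp\{-x^{-\alpha}\} = H(x). \]

(d) From the Hill estimate we find that \(\alpha \approx 2\). Hence, \(P(X > 2 \cdot 10^6 \mid X > 10^6) \approx 2^{-\alpha} = 1/4\).

Problem 6
(a) If the iid random variables \(X_1, \dots, X_n\) are distributed according to \(F\), then the quotient between the differences of successive quantiles of the reference distribution and the differences of successive empirical quantiles is approximately constant, i.e.
\[ \frac{F^{\leftarrow}\bigl(\frac{n - k + 1}{n + 1}\bigr) - F^{\leftarrow}\bigl(\frac{n - k}{n + 1}\bigr)}{x_{k,n} - x_{k+1,n}} \approx \text{const}. \]
This results in the QQ-plot being (approximately) a straight line.

(b) Note that \(G^{\leftarrow}(\alpha) = \sigma F^{\leftarrow}(\alpha) + \mu\). Hence, if \(F^{\leftarrow}((n - k + 1)/(n + 1)) \approx c\, x_{k,n} + m\), then
\[ G^{\leftarrow}\Bigl( \frac{n - k + 1}{n + 1} \Bigr) \approx c \sigma\, x_{k,n} + \sigma m + \mu, \]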

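which is also a straight line.

(c) The QQ-plot does not look like a straight line; it curves down in the right tail. This means that the distance \(F^{\leftarrow}\bigl(\frac{n - k + 1}{n + 1}\bigr) - F^{\leftarrow}\bigl(\frac{n - k}{n + 1}\bigr)\) is too large compared to \(X_{k,n} - X_{k+1,n}\) when \(k\) is small. That is, the distance between the high quantiles in the reference distribution is too large, which implies that the reference distribution has a too heavy right tail.

Problem 7
(a) By (1) we have \(F(x) \approx 1 - \bar F(u_0)\, \bar G_{\gamma;\beta}(x - u_0)\) for \(x > u_0\). Hence the density of \(F\) can be approximated by
\[ f(x) \approx \frac{d}{dx} \bigl( 1 - \bar F(u_0)\, \bar G_{\gamma;\beta}(x - u_0) \bigr) = \bar F(u_0)\, g_{\gamma;\beta}(x - u_0), \quad x > u_0, \]
with \(g_{\gamma;\beta}\) the density of \(G_{\gamma;\beta}\). The mean excess function is then
\[ e(u) = E(X - u \mid X > u) = \frac{1}{\bar F(u)} \int_u^{\infty} (x - u) f_X(x)\, dx \]
\[ \approx \frac{1}{\bar F(u_0)\, \bar G_{\gamma;\beta}(u - u_0)} \int_u^{\infty} (x - u)\, \bar F(u_0)\, g_{\gamma;\beta}(x - u_0)\, dx \]
\[ = \frac{1}{\bar G_{\gamma;\beta}(u - u_0)} \int_{u - u_0}^{\infty} (y - (u - u_0))\, g_{\gamma;\beta}(y)\, dy \]
\[ = \{\text{hint with } a = 1/\gamma,\ b = \gamma/\beta\} = \Bigl( 1 + \frac{\gamma}{\beta} (u - u_0) \Bigr)^{1/\gamma} \frac{\beta \bigl( 1 + \frac{\gamma}{\beta} (u - u_0) \bigr)^{1 - 1/\gamma}}{1 - \gamma} = \frac{\beta + \gamma (u - u_0)}{1 - \gamma}. \]

(b) Since the mean excess function is linear for \(u > u_0\), a reasonable estimate of \(u_0\) is the point beyond which the empirical mean excess function seems linear. Any \(u_0 \in (2.5, 3.5)\) seems to be a good estimate.

(c) The mean excess function is given by
\[ \frac{\beta + \gamma (u - u_0)}{1 - \gamma} \]
for \(u > u_0\), which is a straight line. We take \(u_0 = 3\). At \(u = u_0 = 3\) we have \(e_n(3) \approx 1\) and at \(u = 5\) we have \(e_n(5) \approx 4\). We have to solve
\[ \frac{\beta}{1 - \gamma} = 1, \qquad \frac{\beta}{1 - \gamma} + \frac{\gamma}{1 - \gamma} (5 - 3) = 4. \]
This system of equations has the solution \(\gamma = 3/5\), \(\beta = 2/5\). (Anything reasonably close to these values will give full points.)

Problem 8
(a) See lecture notes.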
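The solution to Problem 8 (a) defers to the lecture notes, which are not reproduced here. As a hedged sketch only, the standard argument can be written out as follows; it assumes that the unknown tail probability \(\bar F(u)\) is replaced by the empirical fraction \(N_u/n\), which is what the POT approximation quoted in Problem 2 (b) suggests, and it may differ in detail from the derivation in the lecture notes.

```latex
\begin{align*}
P(X > u + x)
  &= P(X > u)\, P(X - u > x \mid X > u)
   = \bar F(u)\, \bar F_u(x) \\
  &\approx \bar F(u)\, \bar G_{\gamma,\beta(u)}(x)
   = \bar F(u) \Bigl( 1 + \gamma \frac{x}{\beta(u)} \Bigr)^{-1/\gamma}
   && \text{(uniform convergence of } F_u \text{ to } G_{\gamma,\beta(u)}) \\
  &\approx \frac{N_u}{n} \Bigl( 1 + \gamma \frac{x}{\beta(u)} \Bigr)^{-1/\gamma},
   \quad x \ge 0
   && \text{(assumption: } \bar F(u) \text{ estimated by } N_u/n).
\end{align*}
```

Setting the last expression equal to \(1 - p\) and solving for \(u + x\) gives the VaR estimator stated in Problem 8 (c).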

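(b) For \(u\) sufficiently large,
\[ P((X - u)/\beta(u) > x \mid X > u) = P(X > u + x \beta(u) \mid X > u) = \frac{P(X > u + x \beta(u))}{P(X > u)} = \frac{(u + x \beta(u))^{-\alpha}}{u^{-\alpha}} = (1 + x \beta(u)/u)^{-\alpha}. \]
By choosing \(\beta(u) = u/\alpha\) we see that
\[ \lim_{u \to \infty} P((X - u)/\beta(u) > x \mid X > u) = (1 + x/\alpha)^{-\alpha} = \bar G_{1/\alpha,1}(x), \]
so (2) holds with \(\gamma = 1/\alpha\).

(c) From Figure 7 (the Hill plot) we choose our Hill estimate of \(\alpha\) as \(\hat\alpha = 2\). Since we have chosen the threshold \(u = 10\) we have \(\hat\gamma = 1/\hat\alpha = 1/2\) and \(\hat\beta(u) = u/\hat\alpha = 5\). Hence, with \(p = 0.995\), \(n = 500\) and \(N_u = 10\),
\[ \hat F^{-1}_{X,\mathrm{POT}}(p) = u + \frac{\hat\beta(u)}{\hat\gamma} \Bigl( \Bigl( \frac{n}{N_u} (1 - p) \Bigr)^{-\hat\gamma} - 1 \Bigr) = 10 + 10 \Bigl( \Bigl( \frac{500}{10} (1 - 0.995) \Bigr)^{-1/2} - 1 \Bigr) = 10 + 10 \bigl( (0.25)^{-1/2} - 1 \bigr) = 20. \]
For the empirical estimate, we have \([500 (1 - 0.995)] + 1 = 2 + 1 = 3\). Hence, the empirical quantile estimate is the third largest observation: \(\hat F_n^{-1}(0.995) \approx 19\).

Problem 9
(a) Note that \(X\) and \(Y\) are comonotonic, so that \((X, Y) \overset{d}{=} (F^{\leftarrow}(U), F^{\leftarrow}(U))\) where \(U \sim U(0, 1)\). Hence, \((X, Y) \overset{d}{=} (X, X)\), and therefore
\[ P(L > l) = P(X + X > l) = P(X > l/2). \]
For the upper bound we have
\[ P(\tilde L > l) = P(\tilde L > l, X \le \varepsilon l) + P(\tilde L > l, \tilde Y \le \varepsilon l) + P(\tilde L > l, X > \varepsilon l, \tilde Y > \varepsilon l). \]
If \(X + \tilde Y > l\) and \(X \le \varepsilon l\), then we must have \(\tilde Y > (1 - \varepsilon) l\). Similarly, if \(X + \tilde Y > l\) and \(\tilde Y \le \varepsilon l\), then we must have \(X > (1 - \varepsilon) l\). Hence, using the independence of \(X\) and \(\tilde Y\),
\[ P(\tilde L > l) \le P(\tilde Y > (1 - \varepsilon) l) + P(X > (1 - \varepsilon) l) + P(X > \varepsilon l)\, P(\tilde Y > \varepsilon l) = 2 P(X > (1 - \varepsilon) l) + P(X > \varepsilon l)^2. \]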

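Therefore, for any \(\varepsilon \in (0, 1/2)\),
\[ \frac{P(\tilde L > l)}{P(L > l)} \le 2\, \frac{P(X > (1 - \varepsilon) l)}{P(X > l/2)} + \frac{P(X > \varepsilon l)}{P(X > l/2)}\, P(X > \varepsilon l) \to 2 \cdot 2^{-\alpha} (1 - \varepsilon)^{-\alpha} + 0, \quad \text{as } l \to \infty. \]
Since \(\varepsilon \in (0, 1/2)\) can be chosen arbitrarily small,
\[ \limsup_{l \to \infty} \frac{P(\tilde L > l)}{P(L > l)} \le 2^{1 - \alpha}. \]
For the lower bound we have
\[ \frac{P(\tilde L > l)}{P(L > l)} \ge \frac{P(X > l) + P(\tilde Y > l) - P(X > l, \tilde Y > l)}{P(X > l/2)} = 2\, \frac{P(X > l)}{P(X > l/2)} - \frac{P(X > l)}{P(X > l/2)}\, P(X > l) \to 2 \cdot 2^{-\alpha} - 0, \quad \text{as } l \to \infty. \]
Hence,
\[ \liminf_{l \to \infty} \frac{P(\tilde L > l)}{P(L > l)} \ge 2^{1 - \alpha}. \]
We conclude that
\[ f(\alpha) = \lim_{l \to \infty} \frac{P(\tilde L > l)}{P(L > l)} = 2^{1 - \alpha}. \]

(b) For \(\alpha < 1\) (very heavy tails) the default probability is actually higher for two independent lines of business, since \(2^{1 - \alpha} > 1\). For \(\alpha > 1\) the default probability is lower for two independent lines of business, asymptotically by the factor \(2^{1 - \alpha}\). For large values of \(\alpha\) (lighter tails) the default probability is much lower for independent lines of business.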
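The limit \(f(\alpha) = 2^{1 - \alpha}\) in Problem 9 can be illustrated numerically. The sketch below is not part of the original solutions. It assumes a specific loss distribution, a Pareto tail \(P(X > x) = x^{-\alpha}\) for \(x \ge 1\) (the case \(C = 1\) of the distribution in Problem 8), and the function names `pareto_sf`, `independent_sum_tail` and `default_ratio` are hypothetical. The independent case uses \(P(X + \tilde Y > l) = 2 P(X > l/2, X + \tilde Y > l) - P(X > l/2)^2\) with a midpoint rule for the remaining integral.

```python
def pareto_sf(x, alpha):
    """Survival function P(X > x) of the assumed Pareto law on [1, inf)."""
    return x ** (-alpha) if x >= 1.0 else 1.0


def independent_sum_tail(l, alpha, steps=200_000):
    """P(X + Y > l) for independent X, Y with the assumed Pareto law, l > 2.

    P(X > l/2, X + Y > l) = P(X > l - 1) + int_{l/2}^{l-1} f(x) P(Y > l - x) dx,
    where f(x) = alpha * x**(-alpha - 1) is the Pareto density.
    """
    if l <= 2.0:
        raise ValueError("l must exceed 2 for this decomposition")
    a, b = l / 2.0, l - 1.0
    h = (b - a) / steps
    integral = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h  # midpoint of the i-th subinterval
        integral += alpha * x ** (-alpha - 1.0) * pareto_sf(l - x, alpha)
    integral *= h
    one_large = pareto_sf(l - 1.0, alpha) + integral
    # Inclusion-exclusion over the events {X > l/2} and {Y > l/2}.
    return 2.0 * one_large - pareto_sf(l / 2.0, alpha) ** 2


def default_ratio(l, alpha, steps=200_000):
    """P(X + Y~ > l) / P(L > l), with P(L > l) = P(X > l/2) when comonotonic."""
    return independent_sum_tail(l, alpha, steps) / pareto_sf(l / 2.0, alpha)


for alpha in (0.5, 2.0):
    print(alpha, default_ratio(1.0e4, alpha), 2.0 ** (1.0 - alpha))
```

For large \(l\) the printed ratio should be close to \(2^{1 - \alpha}\): above 1 for \(\alpha = 0.5\) and near 0.5 for \(\alpha = 2\), matching the interpretation in part (b).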