Reliable Inference in Conditions of Extreme Events. Adriana Cornea


Reliable Inference in Conditions of Extreme Events
Adriana Cornea
University of Exeter Business School, Department of Economics
ExISta Early Career Event, October 17, 2012

Outline of the talk
- Extreme events & fat tails
- Bootstrap
- Two papers joint with K. M. Abadir (Imperial College):
  - Approximating moments by nonlinear transformations, with an application to bootstrapping for fat-tails
  - Bootstrapping with fat-tailed asymmetry

Black Monday: an extreme event
- Black Monday, 19 Oct. 1987: a fall of more than 20%
- Average daily % change, Jan. 1 to Oct. 19, 1987: µ = 0.055%
- Standard deviation, Jan. 1 to Oct. 19, 1987: σ = 0.95%

Assume a normal distribution N(µ, σ²).

[Figure: normal densities f(x) for (µ = 0, σ² = 0.2), (µ = 0, σ² = 1.0), (µ = 0, σ² = 5.0), and (µ = 2, σ² = 0.5).]

µ = 0.055%, σ = 0.95% ⇒ Pr(x < −22%) ≈ 2 × 10^(−126)

How small is 2 × 10^(−126)? (Stock and Watson (2007))
- The world population is about 6 billion. The probability of winning a lottery among all living people is 1/(6 × 10^9) ≈ 2 × 10^(−10).
- The universe has existed for 15 billion years, or about 5 × 10^17 seconds. The probability of choosing a particular second at random from all seconds since the beginning of time is 2 × 10^(−18).
- There are approximately 10^43 molecules of gas in the first kilometer above the earth's surface. The probability of choosing a particular one is 10^(−43).

It is extremely unlikely that the return distribution is normal. The tails of the return distribution are much fatter (heavier) than those of the normal distribution.
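As a sanity check on magnitudes like the one above, the normal tail probability can be computed with Python's standard library alone. This is a sketch using the slide's µ and σ; the exact figure quoted on the slide depends on the precise Black Monday return used, so no specific value is asserted here.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X < x) for X ~ N(mu, sigma^2), via the complementary error function."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Slide inputs: mu = 0.055% and sigma = 0.95% (daily returns, in percent).
# Probability under normality of a one-day fall beyond -22%:
p = normal_cdf(-22.0, mu=0.055, sigma=0.95)
print(p)  # astronomically small: such a fall is "impossible" under normality
```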

The normal distribution N(0, 1) is extremely light-tailed:

Pr(|X| > x) ∼ (2 / (√(2π) x)) e^(−x²/2)    (1)

Many economic and financial variables are fat-tailed and follow power laws:

Pr(|X| > x) ≈ C x^(−α)    (2)

α is the tail index (a measure of fat-tailedness): E|X|^p exists for p < α and does not exist for p ≥ α.

Empirics: economic losses from earthquakes (α ∈ (0.6, 1.5)), income (α ∈ (1.5, 3)), wealth (α ≈ 1.5), returns on many stocks (α ∈ (2, 4), i.e., infinite fourth moment).
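The moment condition E|X|^p < ∞ iff p < α can be illustrated by simulation. A minimal sketch with a Pareto(α) sample drawn by inverse-CDF; the function name and sample size are illustrative, not from the talk:

```python
import random

random.seed(0)

def pareto_sample(alpha, n):
    # Inverse-CDF draw from Pr(X > x) = x^(-alpha), x >= 1
    return [random.random() ** (-1.0 / alpha) for _ in range(n)]

alpha = 1.5
x = pareto_sample(alpha, 100_000)
m1 = sum(x) / len(x)                 # p = 1 < alpha: E(X) = alpha/(alpha-1) = 3 exists
m2 = sum(v * v for v in x) / len(x)  # p = 2 > alpha: E(X^2) is infinite; this never settles down
print(m1, m2)
```

Re-running with larger n, the first moment estimate stabilizes while the second keeps jumping, driven by the largest observation.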

Light vs. fat tails

[Figure: densities of the normal N(25, 1), an asymmetric α-stable distribution with α = 1.2, and the Lévy distribution (α = 0.5).]

The tails of asymmetric α-stable distributions with α = 1.2 < 2 are fatter than those of the normal distribution (for which α = 2). The tails of the Lévy distribution are fatter than those of the asymmetric α-stable distribution with α = 1.2.

Bootstrap: "pulling yourself up by your bootstraps". Illustration in the Dec. 2004 issue of Significance.

A more serious bootstrap
- Suppose we have an i.i.d. sample x = (x_1, ..., x_n) from an unknown distribution F(θ).
- Compute θ̂ and a statistic of interest t(n, θ̂) for testing H_0: θ = θ_0 vs. H_1: θ ≠ θ_0.
- A simple bootstrap draws randomly with replacement from the empirical distribution function of x, say B times.
- Bootstrap sample x*_j = (x_{n−1}, x_1, x_5, x_{10}, ..., x_5), j = 1, ..., B.
- For each x*_j compute θ̂*_j and t*_j(n, θ̂*_j).
- Reject H_0 if (1/B) Σ_{j=1}^B I(t*_j(n, θ̂*_j) < t(n, θ̂)) is smaller than 0.05 or 0.10.
- Or build a 90% confidence interval for θ with limits θ̂ − (θ̂*_{0.95} − θ̂) and θ̂ − (θ̂*_{0.05} − θ̂).
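The steps above can be sketched in a few lines of Python. This is a generic implementation for the mean, not the author's code; the interval limits follow the slide's θ̂ − (θ̂*_q − θ̂) form, and the function name and parameter defaults are illustrative:

```python
import random

random.seed(1)

def basic_bootstrap_ci(x, level=0.90, B=399):
    """'Basic' bootstrap CI for the mean: limits theta_hat - (theta*_{1-a} - theta_hat)
    and theta_hat - (theta*_a - theta_hat), with theta*_q the bootstrap quantiles."""
    n = len(x)
    theta_hat = sum(x) / n
    # B resamples with replacement from the empirical distribution of x
    boot = sorted(sum(random.choices(x, k=n)) / n for _ in range(B))
    a = (1.0 - level) / 2.0
    q_lo = boot[int(a * (B + 1)) - 1]
    q_hi = boot[int((1.0 - a) * (B + 1)) - 1]
    return theta_hat - (q_hi - theta_hat), theta_hat - (q_lo - theta_hat)

# Finite-variance example, where the naive bootstrap behaves well:
data = [random.gauss(0.0, 1.0) for _ in range(200)]
lo, hi = basic_bootstrap_ci(data)
print(lo, hi)
```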

Different bootstraps
- Naive or nonparametric bootstrap: i.i.d. data
- Wild bootstrap: heteroskedasticity
- Block bootstrap, subsampling: autocorrelation
- m out of n bootstrap, subsampling: non-smooth statistics (e.g., max(x)), or x with an infinite-variance distribution

Rich literature: Efron (1979), Godfrey (2009), Shao & Tu (1995), Politis, Romano & Wolf (1999), Good (1994), Efron & Tibshirani (1993), and many more.

Bootstrap and fat tails
Let us take θ = E(x) and assume x ~ F unknown, with var(x) = ∞. We want to build a 90% CI for E(x).
The naive bootstrap is not valid when var(x) = ∞: Athreya (1987), Knight (1989).
Previous work:
- m out of n bootstrap, subsampling: Politis, Romano & Wolf (1999)
- parametric bootstrap: Cornea & Davidson (2011)
Nothing works when tails are fat and asymmetric.
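The invalidity shows up clearly in a small Monte Carlo. A sketch (all parameter choices arbitrary) estimating the coverage of the naive 90% interval for E(x) when x is Pareto with α = 1.2, so that var(x) = ∞:

```python
import random

random.seed(4)

def naive_coverage(alpha=1.2, n=100, B=199, level=0.90, reps=200):
    """Fraction of Monte Carlo replications in which the naive bootstrap CI
    for the mean covers the true Pareto(alpha) mean, alpha/(alpha-1)."""
    true_mean = alpha / (alpha - 1.0)
    a = (1.0 - level) / 2.0
    hits = 0
    for _ in range(reps):
        x = [random.random() ** (-1.0 / alpha) for _ in range(n)]  # Pareto draws
        xbar = sum(x) / n
        boot = sorted(sum(random.choices(x, k=n)) / n for _ in range(B))
        lo = xbar - (boot[int((1.0 - a) * (B + 1)) - 1] - xbar)
        hi = xbar - (boot[int(a * (B + 1)) - 1] - xbar)
        hits += lo <= true_mean <= hi
    return hits / reps

coverage = naive_coverage()
print(coverage)  # far below the nominal 0.90, in line with Table 1
```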

An idea
Take a transformation of x, x = g(y), such that var(y) < ∞. For simplicity we take x = exp(y).
Suppose x has a Pareto distribution, F_x(u) = 1 − u^(−α), u ≥ 1, with 1 < α < 2. Then F_y(w) = 1 − e^(−αw), w ≥ 0, so E(x) = α/(α − 1) and E(y) = 1/α; hence E(x) = 1/(1 − E(y)).
The upper/lower limits of the 90% bootstrap CI for E(x) are

x̄_n − (x̄*_{0.95} − x̄_n)  and  x̄_n − (x̄*_{0.05} − x̄_n),

where x̄*_{0.95} := 1/(1 − ȳ*_{(B+1)(0.95)}) and x̄*_{0.05} := 1/(1 − ȳ*_{(B+1)(0.05)}).
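Under the Pareto assumption the recipe above is fully computable. A sketch (parameter values hypothetical) that bootstraps the mean of y = log(x) and maps the quantiles back through E(x) = 1/(1 − E(y)):

```python
import math
import random

random.seed(2)

alpha, n, B = 1.3, 1000, 399
x = [random.random() ** (-1.0 / alpha) for _ in range(n)]  # Pareto: F_x(u) = 1 - u^(-alpha)
y = [math.log(v) for v in x]                               # Exponential(alpha): finite variance

# Bootstrap the mean of y, then transform each quantile back to the x scale.
ybar_star = sorted(sum(random.choices(y, k=n)) / n for _ in range(B))
xbar_05 = 1.0 / (1.0 - ybar_star[int(0.05 * (B + 1)) - 1])
xbar_95 = 1.0 / (1.0 - ybar_star[int(0.95 * (B + 1)) - 1])

xbar = sum(x) / n
ci = (xbar - (xbar_95 - xbar), xbar - (xbar_05 - xbar))  # 90% CI for E(x) = alpha/(alpha-1)
print(ci)
```

The key point is that the resampling happens entirely on the finite-variance y scale, where the bootstrap is valid.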

Some simulations to illustrate

              Naive bootstrap        m out of n bootstrap
           0.90  0.95  0.99         0.90  0.95  0.99
α = 1.1    0.20  0.21  0.23         0.27  0.28  0.28
α = 1.3    0.48  0.52  0.56         0.73  0.76  0.79
α = 1.5    0.63  0.66  0.63         0.89  0.91  0.95

Table 1: Bootstrap coverage probabilities for E(x) without the transformation; B = 399; 10,000 replications.

           0.90  0.95  0.99
α = 1.1    0.89  0.93  0.98
α = 1.3    0.89  0.94  0.98
α = 1.5    0.88  0.93  0.98

Table 2: Naive bootstrap coverage probabilities for E(x) using the transformation; B = 399; 10,000 replications.

Relaxing the assumptions
In reality we do not know F_x and F_y, nor the link between E(x) and E(y). We can use power-series expansions of exp(y).

Raw expansion:

x = Σ_{j=0}^{k} y^j / j! + R_k    (3)

Centered expansion:

x = e^(E y) e^(y − E y) = e^(E y) Σ_{j=0}^{k} (y − E y)^j / j! + R^c_k    (4)

The higher-order terms create the problems when x has a fat-tailed distribution. Bounding R_k by a low-power term will (hopefully) solve the problem.
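A quick numeric check of the two truncations (the values of y, E y and k are chosen arbitrarily for illustration) shows why centering helps when y stays near E y:

```python
import math

def raw_trunc(y, k):
    # Equation (3) without the remainder: sum_{j=0}^k y^j / j!
    return sum(y ** j / math.factorial(j) for j in range(k + 1))

def centered_trunc(y, ey, k):
    # Equation (4) without the remainder: e^{E y} * sum_{j=0}^k (y - E y)^j / j!
    return math.exp(ey) * sum((y - ey) ** j / math.factorial(j) for j in range(k + 1))

y, ey, k = 1.2, 1.0, 2
exact = math.exp(y)
r_raw = exact - raw_trunc(y, k)                 # remainder R_k of the raw expansion
r_centered = exact - centered_trunc(y, ey, k)   # remainder R^c_k of the centered expansion
print(r_raw, r_centered)
```

The centered remainder is driven by powers of (y − E y), which are small near the mean, rather than powers of y itself.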

A crazy expansion
Let i := (−1)^(1/2) and write (y − E y)/m ≡ ζ + 2π i_y, where i_y ∈ Z, m ∈ N, and ζ ∈ (−π, π]. Then we have the expansion

x = e^(E y) e^(2πm i_y) (exp(ζ/i))^(im) = e^(E y) e^(2πm i_y) ξ_k^(im) + ϱ_{x,k},  where ξ_k := Σ_{j=0}^{k} ζ^j / (i^j j!).

i_y is random, but m is deterministic and to be chosen later. A binomial expansion gives

x = e^(E y) e^(2πm i_y) Re(ξ_k^(im)) + R^c_{x,k}.

This expansion allows us to conclude that R^c_{x,k} = O_p(|ζ|^(k+1)), and to find an accurate bound for R^c_{θ,k}, θ = E(x):

|R^c_{θ,k}| ≤ e^(E y) E[ e^(2πm i_y) |ξ_k^(im)| H(|ψ|) ],  where ψ := ζ^(k+1) / ((k + 1)! ξ_k)

and

H(|ψ|) := [1 − 2 e^(−m sin⁻¹|ψ|) cos(m log(1 − |ψ|)) + e^(−2m sin⁻¹|ψ|)]^(1/2)   for |ψ| ∈ [0, 1 − e^(−π/m)),
H(|ψ|) := 1 + e^(−m sin⁻¹|ψ|)                                                     for |ψ| ∈ [1 − e^(−π/m), 1],
H(|ψ|) := 1 + e^(mπ)                                                              for |ψ| ∈ (1, ∞).

Application: transformation-based bootstrap
Letting y := log(x) and z := e^(E y) e^(2πm i_y) Re(ξ_k^(im)), we have x = z + R^c_{x,k}, and the remainder has the bound B^c_{x,k}. By applying the triangle inequality twice,

z − B^c_{x,k} ≤ x ≤ z + B^c_{x,k},

which can be used to build conservative CIs for E(x). To do this, consider an i.i.d. sample x_1, ..., x_n and compute

x̄_n := (1/n) Σ_{j=1}^{n} x_j,  z̄_n := (1/n) Σ_{j=1}^{n} z_j,  B̄^c_{x_n,k} := (1/n) Σ_{j=1}^{n} B^c_{x_j,k}.

By the triangle inequality, t_{1,n} ≤ x̄_n ≤ t_{2,n}, where t_{1,n} := z̄_n − B̄^c_{x_n,k} and t_{2,n} := z̄_n + B̄^c_{x_n,k}.

We can bootstrap t_{1,n} and t_{2,n} instead of x̄_n, for an appropriate choice of k and m. If B^c_{x,k} is too large, then the CI is too conservative (we do not want that). If k → ∞ or m → ∞, then B^c_{x,k} vanishes, z coincides with x, and we are back to the original invalid bootstrap. Thus k and m have to be finite, with values chosen depending on α, the thickness of the tail of x.

In practice, first estimate α (using, for instance, Hill's (1975) method). Then, for an extreme quantile (99%), take k = 1; an estimate of m is given by the integer part of

exp(1.44 − 37.90 n^(−1/2) + 14.41 α^(−1) − 15.42 α^(−2)).
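The first step, estimating α, can be done with the Hill (1975) estimator. A minimal sketch; the choice of the number of tail order statistics (`k_tail`) is a hypothetical tuning parameter, not a recommendation from the talk:

```python
import math
import random

random.seed(3)

def hill_estimator(x, k_tail):
    """Hill (1975) estimate of the tail index from the k_tail largest observations:
    alpha_hat = 1 / mean of log(X_(i) / X_(k_tail+1)) over the top k_tail order stats."""
    xs = sorted(x, reverse=True)
    log_ratios = [math.log(xs[i] / xs[k_tail]) for i in range(k_tail)]
    return k_tail / sum(log_ratios)

alpha = 1.5
x = [random.random() ** (-1.0 / alpha) for _ in range(2000)]  # Pareto(alpha) sample
alpha_hat = hill_estimator(x, k_tail=200)
print(alpha_hat)  # should be close to the true alpha = 1.5
```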

                n = 100                       n = 1000
          m    0.90  0.95  0.99         m    0.90  0.95  0.99
α = 1.1   1    0.98  0.98  0.99         1    1     1     1
          2    0.58  0.58  0.60         2    0.99  0.99  0.99
          3    0.48  0.49  0.54         3    0.67  0.71  0.75
          4    0.51  0.53  0.57         4    0.71  0.74  0.79
α = 1.3   1    0.99  0.99  0.99         9    0.99  0.99  0.99
          2    0.88  0.90  0.93        12    0.97  0.98  0.99
          3    0.88  0.90  0.93        16    0.95  0.96  0.98
          4    0.87  0.89  0.92        20    0.92  0.93  0.96
α = 1.5   1    0.99  0.99  0.99        16    0.98  0.99  0.99
          2    0.96  0.97  0.98        20    0.97  0.98  0.99
          3    0.93  0.95  0.97        25    0.95  0.96  0.98
          4    0.86  0.88  0.91        30    0.93  0.94  0.97

Table 3: Transformation-based bootstrap coverage probabilities for E(x); k = 1, B = 399, 10,000 replications; data from Pareto(α).

For lower quantiles, take k = 2; an estimate of m is given by the integer part of

exp(3.22 − 66.36 n^(−1/2) − 1.16 α^(−2) + 102.58 n^(−1) α^(−2)) · exp(40.36 n^(−1/2) (sig. level)^(−1)).

                n = 100                       n = 1000
          m    0.90  0.95  0.99         m    0.90  0.95  0.99
α = 1.1   1    0.90  0.90  0.92         3    0.99  0.99  0.99
          2    0.82  0.84  0.86         4    0.98  0.98  0.99
          3    0.77  0.79  0.81         5    0.93  0.94  0.95
          4    0.66  0.67  0.69         6    0.86  0.87  0.89
α = 1.3   1    0.98  0.99  0.99         4    0.99  0.99  0.99
          2    0.96  0.97  0.98         5    0.99  0.99  0.99
          3    0.90  0.91  0.93         6    0.96  0.96  0.98
          4    0.82  0.83  0.86         7    0.92  0.94  0.95
α = 1.5   1    0.99  0.99  0.99         6    0.97  0.97  0.98
          2    0.97  0.98  0.99         7    0.94  0.95  0.97
          3    0.92  0.93  0.95         8    0.91  0.93  0.96
          4    0.86  0.88  0.91         9    0.89  0.91  0.94

Table 4: Transformation-based bootstrap coverage probabilities for E(x); k = 2, B = 399, 10,000 replications; data from Pareto(α).