Estimation of the functional Weibull-tail coefficient

Similar documents
Nonparametric estimation of tail risk measures from heavy-tailed distributions

Nonparametric estimation of extreme risks from heavy-tailed distributions

NONPARAMETRIC ESTIMATION OF THE CONDITIONAL TAIL INDEX

Estimation of the second order parameter for heavy-tailed distributions

Estimation of risk measures for extreme pluviometrical measurements

Frontier estimation based on extreme risk measures

Spatial analysis of extreme rainfalls In the Cévennes-Vivarais region

Extreme L p quantiles as risk measures

Contributions to extreme-value analysis

On the estimation of the second order parameter in extreme-value theory

Estimation de mesures de risques à partir des L p -quantiles

The high order moments method in endpoint estimation: an overview

Estimating a frontier function using a high-order moments method

Notes, March 4, 2013, R. Dudley Maximum likelihood estimation: actual or supposed

The largest eigenvalues of the sample covariance matrix. in the heavy-tail case

Functional kernel estimators of large conditional quantiles

Lecture 1: August 28

Kernel regression with Weibull-type tails

QED. Queen s Economics Department Working Paper No. 1244

Markov Chain Monte Carlo (MCMC)

Some functional (Hölderian) limit theorems and their applications (II)

A Resampling Method on Pivotal Estimating Functions

Asymptotic normality of conditional distribution estimation in the single index model

Practical conditions on Markov chains for weak convergence of tail empirical processes

ELEMENTS OF PROBABILITY THEORY

Estimation of the Bivariate and Marginal Distributions with Censored Data

Information geometry for bivariate distribution control

Can we do statistical inference in a non-asymptotic way? 1

Minimax Rate of Convergence for an Estimator of the Functional Component in a Semiparametric Multivariate Partially Linear Model.

The Convergence Rate for the Normal Approximation of Extreme Sums

Qualifying Exam in Probability and Statistics.

Quantile Regression for Extraordinarily Large Data

Asymptotic Statistics-VI. Changliang Zou

AN INTEGRATED FUNCTIONAL WEISSMAN ES- TIMATOR FOR CONDITIONAL EXTREME QUAN- TILES

CREATES Research Paper A necessary moment condition for the fractional functional central limit theorem

5 Introduction to the Theory of Order Statistics and Rank Statistics

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines

Additive functionals of infinite-variance moving averages. Wei Biao Wu The University of Chicago TECHNICAL REPORT NO. 535

1 Appendix A: Matrix Algebra

Gaussian Random Field: simulation and quantification of the error

Statistical inference on Lévy processes

Quantile Processes for Semi and Nonparametric Regression

Extreme geometric quantiles

STAT Chapter 5 Continuous Distributions

STA 2201/442 Assignment 2

Nonparametric regression with martingale increment errors

Quantile prediction of a random eld extending the gaussian setting

MAT 771 FUNCTIONAL ANALYSIS HOMEWORK 3. (1) Let V be the vector space of all bounded or unbounded sequences of complex numbers.

Multivariate Analysis and Likelihood Inference

Bickel Rosenblatt test

Asymptotics for posterior hazards

Weak quenched limiting distributions of a one-dimensional random walk in a random environment

Pitfalls in Using Weibull Tailed Distributions

STA205 Probability: Week 8 R. Wolpert

Introducing the Normal Distribution

Extreme Value Analysis and Spatial Extremes

Qualifying Exam CS 661: System Simulation Summer 2013 Prof. Marvin K. Nakayama

A proposal of a bivariate Conditional Tail Expectation

Nonparametric Identification and Estimation of a Transformation Model

Asymptotic Statistics-III. Changliang Zou

Gaussian Processes. 1. Basic Notions

Probability and Measure

Statistics: Learning models from data

Web Appendix for Hierarchical Adaptive Regression Kernels for Regression with Functional Predictors by D. B. Woodard, C. Crainiceanu, and D.

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

ON THE ESTIMATION OF EXTREME TAIL PROBABILITIES. By Peter Hall and Ishay Weissman Australian National University and Technion

Improving linear quantile regression for

Qualifying Exam in Probability and Statistics.

Stable Process. 2. Multivariate Stable Distributions. July, 2006

Phenomena in high dimensions in geometric analysis, random matrices, and computational geometry Roscoff, France, June 25-29, 2012

Introduction and Preliminaries

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018

LARGE DEVIATIONS OF TYPICAL LINEAR FUNCTIONALS ON A CONVEX BODY WITH UNCONDITIONAL BASIS. S. G. Bobkov and F. L. Nazarov. September 25, 2011

Homework 1 Due: Wednesday, September 28, 2016

12 - Nonparametric Density Estimation

Nonparametric estimation using wavelet methods. Dominique Picard. Laboratoire Probabilités et Modèles Aléatoires Université Paris VII

STAT 461/561- Assignments, Year 2015

LAN property for sde s with additive fractional noise and continuous time observation

Nonparametric Function Estimation with Infinite-Order Kernels

Branching Processes II: Convergence of critical branching to Feller s CSB

Multi-operator scaling random fields

Regularization in Cox Frailty Models

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Weighted Exponential Distribution and Process

Introducing the Normal Distribution

1 Complete Statistics

Asymptotic statistics using the Functional Delta Method

Efficient rare-event simulation for sums of dependent random varia

Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands

Goodness-of-fit tests for the cure rate in a mixture cure model

Global Maxwellians over All Space and Their Relation to Conserved Quantites of Classical Kinetic Equations

Stable Limit Laws for Marginal Probabilities from MCMC Streams: Acceleration of Convergence

Economics 241B Review of Limit Theorems for Sequences of Random Variables

UNIVERSITÄT POTSDAM Institut für Mathematik

Integrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University

Part IA Probability. Definitions. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Chapter 6. Order Statistics and Quantiles. 6.1 Extreme Order Statistics

ESTIMATING BIVARIATE TAIL

Heavy Tailed Time Series with Extremal Independence

Lifetime Dependence Modelling using a Generalized Multivariate Pareto Distribution

Transcription:

1/ 29 Estimation of the functional Weibull-tail coefficient Stéphane Girard Inria Grenoble Rhône-Alpes & LJK, France http://mistis.inrialpes.fr/people/girard/ June 2016 joint work with Laurent Gardes, Université de Strasbourg.

2/ 29 1 Weibull-tail distributions 2 Estimation of extreme conditional quantiles 3 Estimation of the functional Weibull-tail coefficient 4 Illustration on simulations

Notations 3/ 29 Let (X i, Y i ), i = 1,..., n, be iid copies of a random pair (X, Y ) E R where E is an arbitrary space associated with a semi-metric (or pseudometric) d, see [3], Definition 3.2. The conditional survival function of Y given X = x E is denoted by F (y x) := P(Y > y X = x) and is supposed to be continuous and strictly decreasing with respect to y. The associated conditional cumulative hazard function is defined by H(y x) := log F (y x). The conditional quantile is given by q(α x) := F 1 (α x) = H 1 (log(1/α) x), for all α (0, 1).

Conditional Weibull-tail distributions 4/ 29 (A.1) H(. x) is supposed to be regularly varying with index 1/θ(x), i.e. H(ty x) lim y H(y x) = t1/θ(x), for all t > 0. In this situation, θ(.) is a positive function of the covariate x E referred to as the functional Weibull tail-coefficient. From [1], H 1 (. x) is regularly varying with index θ(x). Thus, there exists a slowly-varying function l(. x) such that q(e y x) = H 1 (y x) = y θ(x) l(y x). Recall that the slowly-varying function l(. x) is such that for all t > 0. l(ty x) lim = 1, (1) y l(y x)

Additional assumptions 5/ 29 (A.2) l(. x) is a normalised slowly-varying function. In such a case, the Karamata representation (see [1]) of the slowly-varying function can be written as { y } ε(u x) l(y x) = c(x) exp du, u where c(x) > 0 and ε(u x) 0 as u. The function ε(. x) plays an important role in extreme-value theory since it drives the speed of convergence in (1) and more generally the bias of extreme-value estimators. Therefore, it may be of interest to specify how it converges to 0: (A.3) ε(. x) is regularly varying with index ρ(x) 0. 1

Examples of (unconditional) Weibull-tail distributions 6/ 29 Distribution θ l(y) ε(y) ρ Gaussian 1/2 N (µ, σ 2 ) σ 2σ 2 2 log y y + O(1/y) 1 log y 4 y 1 Gamma 1 Γ(α 1, λ) 1 β + α 1 β log y y + O(1/y) (1 α) log y y 1 Weibull 1/α λ 0 W(α, λ)

Two goals 7/ 29 Starting from an iid sample (X i, Y i ), i = 1,..., n, Estimate the extreme conditional quantiles defined as P(Y > q(α n, x) X = x) = α n, when α n 0 as n. Estimate the functional Weibull-tail coefficient θ(x).

8/ 29 1 Weibull-tail distributions 2 Estimation of extreme conditional quantiles 3 Estimation of the functional Weibull-tail coefficient 4 Illustration on simulations

Principle 9/ 29 First, F (y x) is estimated by a kernel method. For all (x, y) E R, let ˆ F n (y x) = / n n K(d(x, X i )/h n )I{Y i > y} K(d(x, X i )/h n ), i=1 i=1 where h n is a nonrandom sequence such that h n 0 as n, K is assumed to have a support included in [0, 1] such that C 1 K(t) C 2 for all t [0, 1] and 0 < C 1 < C 2 <. It is assumed without loss of generality that K integrates to one. K is called a type I kernel, see [3], Definition 4.1. Second, q(α x) is estimated via the generalized inverse of ˆ Fn (. x): for all α (0, 1). ˆq n (α x) = ˆ F n (α x) = inf{y, ˆ F n (y x) α},

10/ 29 Notations: B(x, h n ) the ball of center x and radius h n, ϕ x (h n ) := P(X B(x, h n )) the small ball probability of X, µ (τ) x (h n ) := E{K τ (d(x, X )/h n )} the τ-th moment, Λ n (x) = (nα n (µ (1) x (h n )) 2 /µ (2) x (h n )) 1/2. It is easily shown that for all τ > 0 0 < C τ 1 ϕ x (h n ) µ (τ) x (h n ) C τ 2 ϕ x (h n ), and thus Λ n (x) is of order (nα n ϕ x (h n )) 1/2.

Regularity assumption 11/ 29 Since the considered estimator involves a smoothing in the x direction, it is necessary to assess the regularity of the conditional survival function with respect to x. To this end, the oscillations are controlled by F (x, α, ζ, h) := sup F (q(β x) x ) (x,β) B(x,h) [α,ζ] F (q(β x) x) 1 = sup F (q(β x) x ) 1 β, where (α, ζ) (0, 1) 2. (x,β) B(x,h) [α,ζ]

Asymptotic normality 12/ 29 Theorem 1 Suppose (A.1), (A.2) hold. Let 0 < τ J < < τ 1 1 where J is a positive integer. x E such that ϕ x (h n ) > 0 where h n 0 as n. If α n 0 and there exists η > 0 such that nϕ x (h n )α n, nϕ x (h n )α n ( F ) 2 (x, (1 η)τ J α n, (1 + η)α n, h n ) 0, then, the random vector { log(1/α n )Λ 1 n (x) ( )} ˆqn (τ j α n x) q(τ j α n x) 1 j=1,...,j is asymptotically Gaussian, centered, with covariance matrix θ 2 (x)σ where Σ j,j = τ 1 j j for (j, j ) {1,..., J} 2.

Conditions on the sequences 13/ 29 nϕ x (h n )α n : Necessary and sufficient condition for the almost sure presence of at least one point in the region B(x, h n ) [q(α n x), + ) of E R. nϕ x (h n )α n ( F ) 2 (x, (1 η)τ J α n, (1 + η)α n, h n ) 0: The biais induced by the smoothing is negligible compared to the standard-deviation.

14/ 29 1 Weibull-tail distributions 2 Estimation of extreme conditional quantiles 3 Estimation of the functional Weibull-tail coefficient 4 Illustration on simulations

Principle 15/ 29 We propose a family of estimators of θ(x) based on some properties of the log-spacings of the conditional quantiles. Recall that q(e y x) = H 1 (y x) = y θ(x) l(y x). Let α (0, 1) small enough and τ (0, 1), log q(τα x) log q(α x) = log H 1 ( log(τα) x) log H 1 ( log(α) x) ( ) l( log(τα) x) = θ(x)(log 2 (τα) log 2 (α)) + log l( log(α) x) θ(x)(log 2 (τα) log 2 (α)) θ(x) log(1/τ) log(1/α), where log 2 (.) := log log(1/.),

16/ 29 Hence, for a decreasing sequence 0 < τ J < < τ 1 1, where J is a positive integer, and for all (twice differentiable) function φ : R J R satisfying the shift and location invariance condition φ(ηz) = ηφ(z), φ(ηu + z) = φ(z), for all η R \ {0}, z R J and where u = (1,..., 1) t R J, one has: θ(x) log(1/α) φ(log q(τ 1α x),..., log q(τ J α x)). φ(log(1/τ 1 ),..., log(1/τ J )) Thus, the estimation of θ(x) relies on the estimation of conditional quantiles q(. x): ˆθ n (x) = log(1/α n ) φ(log ˆq n(τ 1 α n x),..., log ˆq n (τ J α n x)), φ(log(1/τ 1 ),..., log(1/τ J )) with α n 0 as n.

Asymptotic normality 17/ 29 Theorem 2 Ssuppose (A.1) (A.3) hold. Let x E such that ϕ x (h n ) > 0 where h n 0 as n. If α n 0, nϕx (h n )α n ε(log(1/α n ) x) λ R and there exists η > 0 such that nϕ x (h n )α n and nϕx (h n )α n { F (x, (1 η)τ J α n, (1 + η)α n, h n ) 1/log(1/α n )} 0, then, Λ 1 d n (x)(ˆθ n (x) θ(x)) N (µ φ, θ 2 (x)v φ ) where µ φ = λv t log φ(v), V φ = ( log φ(v)) t Σ ( log φ(v)) and v = (log(1/τ 1 ),..., log(1/τ J )) t do not depend on (X, Y ) distribution.

Corollary Suppose (A.1) (A.3) hold. Let x E such that ϕ x (h n ) > 0 and yε(y x) as y. Assume there exist L θ, L c et L ε such that 1 θ(x) 1 θ(x ) L θ d(x, x ), log c(x) log c(x ) L c d(x, x ), sup ε(u x) ε(u x ) L ε d(x, x ), u [1,ȳ n(x)] where ȳ n (x) := sup{h(q(α n x) x ), x B(x, h n )}. Suppose ϕ 1 x (1/y)(log y) 1+ξ ρ(x) 0 (2) for some ξ > 0 as y. Then, letting λ > 0, α n = n 1+ξ and h n = ϕ 1 x ( λ(1 ξ) 2ρ(x) n ξ (ε(log n x)) 2), Theorem 2 yields Λ 1 d n (x)(ˆθ n (x) θ(x)) N (µ φ, θ 2 (x)v φ ). The key assumption (2) holds in the finite dimensional setting or for fractal-type and some exponential-type processes, see [3], Chapter 13. 18/ 29

Example 19/ 29 ( J ) 1/p Let us focus on the functions φ (p) (z) = j=2 β j(z j z 1 ) p, where z = (z 1,..., z J ) t R J, p N \ {0} and for all j {2,..., J}, β j R. The corresponding estimator of θ writes: ( J ˆθ n (p) j=2 (x) = log(1/α n ) β j [log ˆq n (τ j α n x) log ˆq n (τ 1 α n x)] p ) 1/p J j=2 β. j[log(τ 1 /τ j )] p As a consequence of Theorem 2, the associated asymptotic mean and variance of (x) are given for an arbitrary vector β by µ = λ and ˆθ (p) n V (p) = (η(p) ) t AΣA t η (p) (η (p) ) t Avv t A t η (p), where A is a given matrix and η (p) = (β j (v j v 1 ), j = 2,..., J) t. The asymptotic bias µ does not depend neither on p and nor on the weights {β j, j = 2,..., J}. It is possible to minimize V (p) with respect to η (p).

20/ 29 Proposition (p) The asymptotic variance of ˆθ n (x) is minimal for η (p) proportional to η opt = (AΣA t ) 1 Av and is given by and is independent of p. V opt = 1 (Av) t (AΣA t ) 1 Av, Moreover, for a fixed value of J, it is possible to minimize numerically the optimal variance V opt with respect to parameters 0 < τ J < < τ 1 1. The resulting values of V opt are displayed in the table below: J V opt τ 1 τ 2 τ 3 τ 4 τ 5 2 1.5441 1.0000 0.2032 3 1.2191 1.0000 0.3615 0.0735 4 1.1223 1.0000 0.4703 0.1702 0.0346 5 1.0789 1.0000 0.5486 0.2585 0.0936 0.0190

21/ 29 1 Weibull-tail distributions 2 Estimation of extreme conditional quantiles 3 Estimation of the functional Weibull-tail coefficient 4 Illustration on simulations

Framework 22/ 29 E is a subset of L 2 ([0, 1]) made of trigonometric functions ψ z : [0, 1] [0, 1], ψ z (t) = cos(2πzt) with different periods indexed by z [1/10, 1/2]. Two semi-metrics are considered: d 1 (ψ z, ψ z ) = ψz 2 2 ψ z 2 2, d 2 (ψ z, ψ z ) = ψ z ψ z 2, for all (z, z ) [1/10, 1/2] 2, where ψ z 2 2 = 1 0 ψ 2 z (t)dt = 1 2 ( 1 + sin(4πz) ). 4πz The semi-metric d 2 is built on the classical L 2 norm while d 1 measures some spacing between the periods of the trigonometric functions.

Simulated data 23/ 29 N = 100 copies of a n = 1000 samples from a random pair (X, Y ) defined as follows: The covariate X is chosen randomly on E by considering X = ψ Z where Z is a uniform random variable on [1/10, 1/2]. For a given function x E, the generalized inverse of the conditional hazard function H(. x) is given for y 0 by with H (y x) = y θ(x) ( 1 γy ρ(x)), θ(x) = (18/5 x 2 2 + 9/50) 1 5/18, ρ(x) = 50/(60 x 2 2 + 3) 5/2, γ = 1/10.

Four simulated random functions X (.) 24/ 29

Estimators 25/ 29 The previous estimator with optimal weights is used. Here, we limit ourselves to J = 5 and p {1, 3}. A modified bi-quadratic kernel is adopted (type I kernel): K(u) = 10 ( 3 ( 1 u 2 ) ) 2 1 + I{ u 1}. 9 2 10 h n and α n are selected simultaneously thanks to a data-driven procedure. For a fixed x, let {Z 1 (x, h n ),..., Z mn (x, h n )} be the m n random values Y i for which X i B(x, h n ). The idea is to select the sequences h n and α n such that the rescaled log-spacings i log(m n /i)(log Z mn i+1,m n (x, h n ) log Z mn i,m n (x, h n )), i = 1,..., m n α n, are approximately Exp(θ(x)) distributed. The optimal sequences are obtained by minimizing a Kolmogorov-Smirnov distance. Comparison with the non-conditional estimator proposed in [2]: ˆθ NCE n = kn i=1 (log Y n i+1,n log Y n kn+1,n) ( log 2 (n/i) log 2 (n/k n ) ). kn i=1

Influence of the exponent p 26/ 29 Let e i,l,p be the relative error obtained on the ith replication using the semi-metric d l and the estimator ˆθ (p). Left: histogram of e,1,3 e,1,1 (semi-metric d 1 ), right: histogram of e,2,3 e,2,1 (semi-metric d 2 ). Both histograms are nearly centered, small influence of p.

Influence of the semi-metric 27/ 29 Recall that e i,l,p is the relative error obtained on the ith replication using the semi-metric d l and the estimator ˆθ (p). Left: histogram of e,2,1 e,1,1 (p = 1), right: histogram of e,2,3 e,1,3 (p = 3). Both histograms are skewed to the right, the semi-metric d 1 yields better result than d 2.

Comparison with the non-conditional estimator 28/ 29 Recall that e i,l,p is the relative error obtained on the ith replication using the semi-metric d l and the estimator ˆθ (p). We moreover denote by e i the relative error obtained on the ith replication using the non-conditional NCE estimator ˆθ n. Left: histogram of e e,1,1 (p = 1), right: histogram of e e,1,3 (p = 3). Both histograms are skewed to the right, the conditional estimator yields better results than the unconditional one.

References 29/ 29 1 Bingham, N.H., Goldie, C.M., Teugels, J.L. (1987) Regular Variation, Cambridge University Press. 2 Daouia, A., Gardes, L., Girard, S. (2013). On kernel smoothing for extremal quantile regression. Bernoulli, 19, 2557 2589. 3 Ferraty, F., Vieu, P. (2006). Nonparametric functional data analysis, Springer. 4 Gardes, L., Girard, S. (2006). Comparison of Weibull tail-coefficient estimators. REVSTAT - Statistical Journal, 4(2), 163 188. 5 Gardes, L., Girard, S. (2012). Functional kernel estimators of large conditional quantiles. Electronic Journal of Statistics, 6, 1715 1744. 6 Gardes, L., Girard, S. (2016). On the estimation of the functional Weibull tail-coefficient. Journal of Multivariate Analysis, 146, 24 45. 7 Gardes, L., Girard, S., Lekina, A. (2010). Functional nonparametric estimation of conditional extreme quantiles. Journal of Multivariate Analysis, 101, 419 433.