
Kernel density estimation for heavy-tailed distributions using the Champernowne transformation
Buch-Larsen, Nielsen, Guillen, Bolance, "Kernel density estimation for heavy-tailed distributions using the Champernowne transformation", Statistics, Vol. 39, No. 6, December 2005, 503-518.
Tine Buch-Kromann

Limitations of the kernel density estimator
[Figure: simulated lognormal data set; the true (lognormal) density vs. the classical kernel density estimate, shown both at full scale and zoomed in on the tail.]

Motivation
Combine the advantages of nonparametric and parametric statistics.
Nonparametric statistics: + no assumptions about the shape of the distribution; − the estimated distribution is uncertain when data are few.
Parametric statistics: + the estimated distribution converges faster to the true distribution than a nonparametric estimate; − the distributional assumption might be wrong.

Characteristics
[Diagram: the method sits between a nonparametric model (a lot of data) and a parametric model (few data).]
When data are few, the method should be close to a parametric model; when the amount of data increases, the method should become more nonparametric.

The Champernowne distribution
The Champernowne cdf is defined for x ≥ 0 and has the form
T_{α,M}(x) = x^α / (x^α + M^α), x ∈ R_+,
with parameters α > 0 and M > 0, and density
t_{α,M}(x) = α M^α x^{α−1} / (x^α + M^α)², x ∈ R_+.
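As a quick numerical sketch (not part of the original slides; the function names are my own), the cdf and density can be coded directly, and the median property T_{α,M}(M) = 0.5 checked:

```python
def champernowne_cdf(x, alpha, M):
    # T_{alpha,M}(x) = x^alpha / (x^alpha + M^alpha), for x >= 0
    return x**alpha / (x**alpha + M**alpha)

def champernowne_pdf(x, alpha, M):
    # t_{alpha,M}(x) = alpha * M^alpha * x^(alpha-1) / (x^alpha + M^alpha)^2
    return alpha * M**alpha * x**(alpha - 1) / (x**alpha + M**alpha)**2

# M is the median for every alpha: T_{alpha,M}(M) = 0.5
print(champernowne_cdf(3.0, 1.5, 3.0))  # 0.5
```

The median property is what the slides later exploit to estimate M.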

The Champernowne distribution
The Champernowne distribution converges to a Pareto distribution in the tail:
t_{α,M}(x) ≈ α M^α / x^{α+1} as x → ∞.
Notice that the Champernowne distribution is defined on [0, ∞), in contrast to the Pareto distribution
G(x) = 1 − (M/x)^α, with density g(x) = α M^α / x^{α+1}, α > 0, M > 0,
which is only defined for x ∈ [M, ∞). This makes the Pareto distribution inappropriate as an underlying parametric distribution.

The Champernowne distribution
The tail of the Champernowne distribution is advantageous because it is heavy. However, the shape near 0 is unfortunately quite inflexible:
t_{α,M}(0) = ∞ for α < 1, t_{α,M}(0) = 1/M for α = 1, t_{α,M}(0) = 0 for α > 1.

The Champernowne distribution
The effect of the parameter α. For α < α′:
T_{α,M}(x) > T_{α′,M}(x) if 0 ≤ x < M,
T_{α,M}(x) = T_{α′,M}(x) if x = M,
T_{α,M}(x) < T_{α′,M}(x) if M < x < ∞.
α is not a scale parameter, but it has some properties similar to a scale parameter s:
α > 1: increasing α gives a steeper slope of the cdf at the value M. (Scale-parameter effect: the density narrows; moreover, the mode moves to the right and the tail becomes lighter.)
α < 1: increasing α gives a less steep shape of the density near 0.

The Champernowne distribution
The effect of the parameter M. For M < M′:
T_{α,M}(x) > T_{α,M′}(x) for all x ∈ R_+.
Increasing M decreases the cdf. For α > 1, the mode of the density moves to the right and becomes lower.

The Modified Champernowne distribution
The Champernowne distribution is heavy tailed, but the shape near 0 is inflexible and depends on α, which also determines the tail. The modified Champernowne cdf is defined for x ≥ 0 and has the form
T_{α,M,c}(x) = ((x + c)^α − c^α) / ((x + c)^α + (M + c)^α − 2c^α), x ∈ R_+,
with parameters α > 0, M > 0 and c ≥ 0, and density
t_{α,M,c}(x) = α (x + c)^{α−1} ((M + c)^α − c^α) / ((x + c)^α + (M + c)^α − 2c^α)², x ∈ R_+.
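A sketch of the modified cdf (again an illustration of my own, with a hypothetical function name). For c = 0 it reduces to the standard Champernowne cdf, and M remains the median for any c, since the denominator at x = M equals 2((M + c)^α − c^α):

```python
def mod_champernowne_cdf(x, alpha, M, c):
    # T_{alpha,M,c}(x) = ((x+c)^a - c^a) / ((x+c)^a + (M+c)^a - 2 c^a)
    a = alpha
    return ((x + c)**a - c**a) / ((x + c)**a + (M + c)**a - 2 * c**a)

print(mod_champernowne_cdf(3.0, 1.5, 3.0, 2.0))  # 0.5 (median preserved for any c)
print(mod_champernowne_cdf(4.0, 1.5, 3.0, 0.0))  # equals the standard Champernowne cdf
```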

The Modified Champernowne distribution
The modified Champernowne distribution converges to a Pareto distribution:
t_{α,M,c}(x) ≈ α (((M + c)^α − c^α)^{1/α})^α / x^{α+1} as x → ∞.
Note that the modified Champernowne distribution is defined on [0, ∞), in contrast to the Pareto distribution
G(x) = 1 − (((M + c)^α − c^α)^{1/α} / x)^α, with density g(x) = α ((M + c)^α − c^α) / x^{α+1},
which is only defined for x ∈ [((M + c)^α − c^α)^{1/α}, ∞).

The Modified Champernowne distribution
The effect of the parameter c. For c < c′ and α > 1:
T_{α,M,c}(x) < T_{α,M,c′}(x) if 0 ≤ x < M,
T_{α,M,c}(x) = T_{α,M,c′}(x) if x = M,
T_{α,M,c}(x) > T_{α,M,c′}(x) if M < x < ∞.
For α < 1:
T_{α,M,c}(x) > T_{α,M,c′}(x) if 0 ≤ x < M,
T_{α,M,c}(x) = T_{α,M,c′}(x) if x = M,
T_{α,M,c}(x) < T_{α,M,c′}(x) if M < x < ∞.

The Modified Champernowne distribution
When α ≠ 1, c has some scale-parameter properties:
c changes the density in the tail: for α < 1, increasing c gives lighter tails; the opposite holds for α > 1.
c changes the density at 0: positive c gives a positive, finite density at 0.
c moves the mode: for α > 1, increasing c shifts the mode to the left.
When α = 1, c has no effect.

Parameter estimation: almost maximum-likelihood parameters
Notice that T_{α,M,c}(M) = 0.5; therefore estimate M as the empirical median. This gives sub-optimal parameters (but close to the optimal ones), simplifies the computations, and yields a robust estimator, especially for heavy-tailed distributions.
Estimate (α, c) by maximizing the log-likelihood function:
l = N log α + N log((M + c)^α − c^α) + (α − 1) Σ_{i=1}^N log(X_i + c) − 2 Σ_{i=1}^N log((X_i + c)^α + (M + c)^α − 2c^α).
For fixed M the likelihood function is concave and has a maximum.
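A minimal sketch of this estimation scheme (my own illustration: a crude grid search over (α, c) rather than a proper optimizer; `fit_mod_champernowne` and the grids are assumptions, not from the slides). M is set to the empirical median, then the log-likelihood above is maximized:

```python
import math
import statistics

def champ_log_lik(data, alpha, M, c):
    # l = N log(a) + N log((M+c)^a - c^a)
    #     + (a-1) sum_i log(x_i+c) - 2 sum_i log((x_i+c)^a + (M+c)^a - 2c^a)
    a, n = alpha, len(data)
    s = n * math.log(a) + n * math.log((M + c)**a - c**a)
    for x in data:
        s += (a - 1) * math.log(x + c)
        s -= 2 * math.log((x + c)**a + (M + c)**a - 2 * c**a)
    return s

def fit_mod_champernowne(data):
    M = statistics.median(data)          # T_{alpha,M,c}(M) = 0.5, so M = empirical median
    grid_a = [0.5 + 0.1 * i for i in range(30)]
    grid_c = [0.0, 0.5, 1.0, 2.0, 5.0]
    best = max(((champ_log_lik(data, a, M, c), a, c)
                for a in grid_a for c in grid_c))
    return best[1], M, best[2]
```

With data simulated from a Champernowne(α = 1.5, M = 3) distribution via the inverse cdf x = M (u/(1−u))^{1/α}, the fitted α should land near 1.5 up to the grid resolution.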

The semiparametric transformation kernel density estimator
Step 1: Original data. Data set (X_1, ..., X_n) with unknown cdf F(x) and density f(x). Estimate the parameters (α, M, c) of the modified Champernowne distribution, giving the transformation T(x). [Figure: original data with the true density and the fitted modified Champernowne density.]

The semiparametric transformation kernel density estimator
Step 2: Transformed data. Transform (X_1, ..., X_n) into (Z_1, ..., Z_n) using Z_i = T(X_i). [Figure: histogram of the transformed data on [0, 1].]

The semiparametric transformation kernel density estimator
Step 3: Correction. Compute a correction estimator by means of a kernel density estimator on the transformed data:
f̂_t(z) = (1/n) Σ_{i=1}^n K_b(z − Z_i),
where K_b is the Epanechnikov kernel function and b = 0.2 is the bandwidth. [Figure: kernel density estimate on the transformed axis, without boundary correction.]

The semiparametric transformation kernel density estimator
Step 3 (continued): Boundary correction. Compute the correction estimator with boundary correction:
f̂_t(z) = (1/(n k(z))) Σ_{i=1}^n K_b(z − Z_i),
where k(z) is the boundary correction. ĝ(z) is the final estimator on the transformed axis. [Figure: kernel density estimates with and without boundary correction.]

The semiparametric transformation kernel density estimator
Step 4: Inverse transformation. The final estimator of the density of (X_1, ..., X_n) on the original axis is obtained by the inverse transformation, such that
f̂(x) = f̂_t(T(x)) / (T^{−1})′(T(x)).
Summarized formula:
f̂(x) = (1/(n k(T(x)))) Σ_{i=1}^n K_b(T(x) − T(X_i)) · T′(x).
[Figure: the final estimator (KMCE) on the original axis, with the true density and the modified Champernowne fit.]
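The four steps can be sketched end to end (my own illustration, assuming the Epanechnikov kernel and transformed data on [0, 1]; `kmce_density` and the helpers are hypothetical names). The boundary correction k(z) is the kernel mass that actually falls inside [0, 1]:

```python
def epanechnikov(u):
    # K(u) = 0.75 (1 - u^2) on [-1, 1]
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0

def epanechnikov_mass(lo, hi):
    # integral of K over [lo, hi], clipped to the kernel support [-1, 1]
    F = lambda u: 0.75 * (u - u**3 / 3.0) + 0.5   # antiderivative with F(-1) = 0
    lo, hi = max(lo, -1.0), min(hi, 1.0)
    return F(hi) - F(lo) if hi > lo else 0.0

def kmce_density(x, data, T, T_prime, b):
    # Steps 2-4: transform, boundary-corrected kernel estimate, back-transform
    z = T(x)
    k = epanechnikov_mass(-z / b, (1.0 - z) / b)  # boundary correction k(z)
    s = sum(epanechnikov((z - T(xi)) / b) for xi in data) / b
    return s / (len(data) * k) * T_prime(x)
```

With T(x) = x/(1+x) (the Champernowne cdf with α = 1, M = 1) and T′(x) = 1/(1+x)², the result is a density estimate on [0, ∞) that integrates to approximately one.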

Asymptotic theory
Let X_1, ..., X_n be iid variables with density f, and let f̂(x) be the transformation kernel density estimator of f,
f̂(x) = (1/n) Σ_{i=1}^n K_b(T(x) − T(X_i)) T′(x),
where T(·) is the transformation function. Then the bias and variance of f̂(x) are given by
E[f̂(x)] − f(x) = (1/2) μ₂(K) b² ((f(x)/T′(x))′ · (1/T′(x)))′ + o(b²),
V[f̂(x)] = (1/(nb)) R(K) T′(x) f(x) + o(1/(nb)),
as n → ∞, where μ₂(K) = ∫ u² K(u) du and R(K) = ∫ K²(u) du.

Simulation study
Simulation study setup: simulated from four distributions: lognormal; mixtures of lognormal and Pareto; Weibull; truncated logistic. Number of observations: n ∈ {50, 100, 500, 1000}; 2000 repetitions; Epanechnikov kernel function; bandwidth selection: Silverman's rule of thumb.

Simulation study
Distributions: [Figure: densities of the simulated distributions — Lognormal; mixture of Lognormal (0.7) and Pareto (0.3); mixture of Lognormal (0.3) and Pareto (0.7); Weibull; Normal; Truncated logistic.]

Simulation study
Error measures: L₁ norm and L₂ norm,
L₁ = ∫₀^∞ |f̂(x) − f(x)| dx,
L₂ = (∫₀^∞ (f̂(x) − f(x))² dx)^{1/2}.
L₁ and L₂ measure the errors near 0 and in the tail equally.

Simulation study
WISE and E (mean excess functions),
WISE = (∫₀^∞ (f̂(x) − f(x))² x² dx)^{1/2},
E = ∫₀^∞ (ê(x) − e(x))² f(x) dx = ∫₀^∞ (∫_x^∞ u (f(u) − f̂(u)) du)² f(x) dx.
WISE and E are error measures that emphasize the tail of the distribution.
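The integral error measures above are easy to approximate numerically; here is a sketch of my own (midpoint rule on a finite grid, truncating the upper integration limit, which is harmless for densities with an integrable tail on the chosen range):

```python
import math

def error_norms(f_hat, f, upper, n=4000):
    # midpoint-rule approximations of L1, L2 and WISE on [0, upper]
    w = upper / n
    l1 = l2 = wise = 0.0
    for i in range(n):
        x = (i + 0.5) * w
        d = f_hat(x) - f(x)
        l1 += abs(d) * w           # L1 = integral |f_hat - f|
        l2 += d * d * w            # L2 = sqrt(integral (f_hat - f)^2)
        wise += d * d * x * x * w  # WISE weights squared error by x^2
    return l1, math.sqrt(l2), math.sqrt(wise)
```

For example, comparing an Exp(1) density against an Exp(1/2) density gives L₁ = 2(e^{−x*/2} − e^{−x*}) = 0.5 at the crossing point x* = 2 ln 2, while identical densities give (0, 0, 0).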

Simulation study
Benchmark estimators:
BGN: transformation kernel density estimator with the shifted power transformation,
y = (x + λ₁)^{λ₂} if λ₂ ≠ 0, y = ln(x + λ₁) if λ₂ = 0.
Aim: make the transformed data symmetric.
CHL: transformation kernel density estimator with the Möbius-like transformation,
y = (x^α − R^α) / (x^α + R^α),
i.e. a Champernowne-type transformation with another parameter estimation method.

Simulation study Results:

Application: Automobile claims
Spanish automobile accidents: bodily injury claims from 1997; data divided into two age groups: young drivers (less than 30 years) and older drivers (above 30 years).
Young: 1061 observations in the interval [1; 126000] with mean value 402.7.
Old: 4061 observations in the interval [1; 17000] with mean value 243.1.

Application: Automobile claims
Young: estimated modified Champernowne parameters: α̂₁ = 1.116, M̂₁ = 66, ĉ₁ = 0.000.
Old: estimated modified Champernowne parameters: α̂₂ = 1.145, M̂₂ = 68, ĉ₂ = 0.000.
Bandwidths: b₁ = 0.172 and b₂ = 0.134.
Notice that α̂₁ < α̂₂, i.e. young drivers have a heavier tail.

Application: Automobile claims
Spanish automobile claims on the transformed axis. [Figure: kernel estimates of the transformed claims for drivers <30 years old and >30 years old.]

Application: Automobile claims
Spanish automobile claims and the resulting KSCE estimate. [Figure: estimates for small claims, moderately sized claims, and extreme claims, each shown for drivers <30 years old and >30 years old.]

Application: Automobile claims
[Figure: quotient between the KSCE estimates of <30 years old and >30 years old, for small, moderately sized, and extreme claims.]
Conclusion: young drivers have a heavier tail than older drivers.

Application: Employer's liability
Employer's liability claims: 2522 claims (Irish insurance company).
Estimated modified Champernowne parameters: α̂ = 1.955, M̂ = 32379, ĉ = 64759.
Estimated Champernowne parameters (c = 0): α̂ = 0.954, M̂ = 32379.
What is the effect of not including c?

Application: Employer's liability
[Figure: estimates of the claim distribution with and without the parameter c.]
Conclusion: the estimates are nearly identical for small and moderate claims, whereas the KCE (c = 0) overestimates the tail. This shows the importance of the modified Champernowne distribution.

Conclusion
Estimating loss distributions. Introduced the semiparametric transformation kernel density estimator, based on a parametric estimator that is subsequently corrected by a nonparametric estimator: with a lot of information it is close to a nonparametric estimator; with little information it is close to a parametric estimator. Introduced the Champernowne distribution (heavy tailed) and generalized it to the modified Champernowne distribution (flexible and heavy tailed).