Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business

Similar documents
arxiv: v1 [stat.ap] 19 Jun 2013

Bootstrapping the triangles

Forecasting with the age-period-cohort model and the extended chain-ladder model

Claims Reserving under Solvency II

ONE-YEAR AND TOTAL RUN-OFF RESERVE RISK ESTIMATORS BASED ON HISTORICAL ULTIMATE ESTIMATES

GARCH Models Estimation and Inference

Various Extensions Based on Munich Chain Ladder Method

Convolution Based Unit Root Processes: a Simulation Approach

July 31, 2009 / Ben Kedem Symposium

Efficient estimation of a semiparametric dynamic copula model

GARY G. VENTER 1. CHAIN LADDER VARIANCE FORMULA

Financial Econometrics and Volatility Models Copulas

Generalized Autoregressive Score Models

13. Estimation and Extensions in the ARCH model. MA6622, Ernesto Mordecki, CityU, HK, References for this Lecture:

Multivariate Non-Normally Distributed Random Variables

On the Systemic Nature of Weather Risk

Lecture Quantitative Finance Spring Term 2015

GARCH Models Estimation and Inference. Eduardo Rossi University of Pavia

Switching Regime Estimation

Identification of the age-period-cohort model and the extended chain ladder model

Time Varying Hierarchical Archimedean Copulae (HALOC)

Appendix A: The time series behavior of employment growth

Nonlinear Parameter Estimation for State-Space ARCH Models with Missing Observations

Extreme Value Analysis and Spatial Extremes

DELTA METHOD and RESERVING

A copula goodness-of-t approach. conditional probability integral transform. Daniel Berg 1 Henrik Bakken 2

On the Importance of Dispersion Modeling for Claims Reserving: Application of the Double GLM Theory

Correlation: Copulas and Conditioning

GARCH Models. Eduardo Rossi University of Pavia. December Rossi GARCH Financial Econometrics / 50

Explicit Bounds for the Distribution Function of the Sum of Dependent Normally Distributed Random Variables

Regression Models - Introduction

The Instability of Correlations: Measurement and the Implications for Market Risk

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Modelling and Estimation of Stochastic Dependence

Volatility. Gerald P. Dwyer. February Clemson University

The Use of Copulas to Model Conditional Expectation for Multivariate Data

Songklanakarin Journal of Science and Technology SJST R1 Sukparungsee

Modelling Dropouts by Conditional Distribution, a Copula-Based Approach

ECONOMICS 7200 MODERN TIME SERIES ANALYSIS Econometric Theory and Applications

Semi-parametric estimation of non-stationary Pickands functions

Discussion of Bootstrap prediction intervals for linear, nonlinear, and nonparametric autoregressions, by Li Pan and Dimitris Politis

Semi-parametric predictive inference for bivariate data using copulas

Statistical Inference and Methods

Bayesian Semiparametric GARCH Models

Computer Intensive Methods in Mathematical Statistics

Bayesian Semiparametric GARCH Models

Marginal Specifications and a Gaussian Copula Estimation

Calendar Year Dependence Modeling in Run-Off Triangles

Model Specification Testing in Nonparametric and Semiparametric Time Series Econometrics. Jiti Gao

Symmetric btw positive & negative prior returns. where c is referred to as risk premium, which is expected to be positive.

GARCH Models Estimation and Inference

Autoregressive and Moving-Average Models

Web-based Supplementary Material for A Two-Part Joint. Model for the Analysis of Survival and Longitudinal Binary. Data with excess Zeros

Gaussian kernel GARCH models

Reserving for multiple excess layers

1. The OLS Estimator. 1.1 Population model and notation

GENERALIZED LINEAR MODELING APPROACH TO STOCHASTIC WEATHER GENERATORS

Markov Switching Regular Vine Copulas

Calibration Estimation of Semiparametric Copula Models with Data Missing at Random

Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories

1 Class Organization. 2 Introduction

GARCH processes probabilistic properties (Part 1)

Analytics Software. Beyond deterministic chain ladder reserves. Neil Covington Director of Solutions Management GI

arxiv: v1 [stat.me] 9 Feb 2012

Vine copula specifications for stationary multivariate Markov chains

Robustness of a semiparametric estimator of a copula

Chain ladder with random effects

A measure of radial asymmetry for bivariate copulas based on Sobolev norm

Forecasting time series with multivariate copulas

ECON 3150/4150, Spring term Lecture 6

Consistency of the maximum likelihood estimator for general hidden Markov models

Simulating Exchangeable Multivariate Archimedean Copulas and its Applications. Authors: Florence Wu Emiliano A. Valdez Michael Sherris

Using copulas to model time dependence in stochastic frontier models

Multivariate Random Variable

Systemic Weather Risk and Crop Insurance: The Case of China

2.5 Forecasting and Impulse Response Functions

STAT 100C: Linear models

Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline.

Reducing Model Risk With Goodness-of-fit Victory Idowu London School of Economics

Generated Covariates in Nonparametric Estimation: A Short Review.

Multivariate GARCH models.

Synchronous bootstrapping of loss reserves

FORECASTING IN FINANCIAL MARKET USING MARKOV R MARKOV REGIME SWITCHING AND PRINCIPAL COMPONENT ANALYSIS.

Imputation Algorithm Using Copulas

Define y t+h t as the forecast of y t+h based on I t known parameters. The forecast error is. Forecasting

Analysing geoadditive regression data: a mixed model approach

The Realized Hierarchical Archimedean Copula in Risk Modelling

TIME SERIES ANALYSIS. Forecasting and Control. Wiley. Fifth Edition GWILYM M. JENKINS GEORGE E. P. BOX GREGORY C. REINSEL GRETA M.

Goodness-of-Fit Tests for Time Series Models: A Score-Marked Empirical Process Approach

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

Statistics 910, #15 1. Kalman Filter

Applied Econometrics (QEM)

Copulas, a novel approach to model spatial and spatio-temporal dependence

Using all observations when forecasting under structural breaks

How to select a good vine

Class 1: Stationary Time Series Analysis

Time Series Models for Measuring Market Risk

Generalized Linear Models. Kurt Hornik

Math 180B Problem Set 3

Transcription:

Conditional Least Squares and Copulae in Claims Reserving for a Single Line of Business Michal Pešta Charles University in Prague Faculty of Mathematics and Physics Ostap Okhrin Dresden University of Technology Institute of Economics and Transport

Introduction 2 30 Overview Motivated by claims reserving in non-life insurance Joint work with Ostap Okhrin (TU Dresden) Triangular data models Published in Insurance: Mathematics and Economics, 56 (May 2014): 28 37.

Data structure 3 30 Triangular data i\j 1 2 n 1 n 1 Y 1,1 Y 1,2 Y 1,n 1 Y 1,n 2 Y 2,1 Y 2,2 Y 2,n 1...... Y i,n+1 i n 1 Y n 1,1 Y n 1,2 n Y n,1 n copies of stochastic process The first realization consists of n observations The last one has only one observation

Main aims 4 30 Terminology and goals Y i,j... cumulative payments in origin year i after j development periods (accounting year i + j) n... current year corresponds to the most recent accident year and development period Our data history consists of right-angled isosceles triangles Y i,j, where i + j n + 1 Predict Y i,n and R i = Y i,n Y i,n+1 i (claims reserve) Estimate distribution of the reserves

CMV model 5 30 Conditional mean and variance (CMV) model CMV model Y i,j = µ(y i,j 1, α, j) + σ(y i,j 1, β, j)ε i,j (α, β) α and β are unknown parameters, which dimensions do not depend on n µ is a continuous function in α σ is a positive and continuous function in β Errors ε i,j (α, β)

CMV model 6 30 CMV model s errors Disturbances {ε i,j (α, β)} n+1 i j=1 are independent sample copies of a stationary first-order Markov process for all i All ε i,j (α, β) have the common true invariant distribution G α,β which is absolutely continuous with respect to Lebesgue measure on the real line Filtration F i,j = σ(y k,l : l j, k i + 1 j) denotes the information set generated by that trapezoid E[ε i,j (α, β) F i,j 1 ] = 0 var[ε i,j (α, β) F i,j 1 ] = s(α, β)

CMV model 7 30 Properties of the CMV model Unknown true values [α, β ] of parameters [α, β ] set (due to identifiability purposes): s(α, β ) = 1 Model s name come from the fact that E[Y i,j F i,j 1 ] = µ(y i,j 1, α, j) var[y i,j F i,j 1 ] = σ 2 (Y i,j 1, β, j)s(α, β) Conditional mean models: types of ARMA models, vector autoregressions, linear and nonlinear regressions,... Conditional variance models: ARCH and any of its numerous parametric extensions (GARCH, EGARCH, GJR-GARCH, etc.), stochastic volatility models,...

CMV model 8 30 Candidates for the mean and variance function From the nature of data: Y i,j C i R + almost surely as j, i (stabilizing property) One may propose, e.g., µ(y i,j 1, α, j) = η(α, j)y i,j 1 σ(y i,j 1, β, j) = ν(β, j) Y i,j 1 η(α, j) should be decreasing in j with limit 1 as j ν(β, j) should be decreasing in j with limit 0 as j

CMV model 9 30 Dependence modeling Since the mean and variance trends are removed by the CMV model, the rest of the relationship among claim amounts Y i,j can be additionally captured by modeling dependent errors {ε i,j (α, β)} n+1 i j=1 are independent sample copies of a stationary first-order Markov process for all i generated from (G α,β ( ), C(, ; γ)) C(, ; γ) is the true parametric copula for [ε i,j 1 (α, β), ε i,j (α, β)], which is given and fixed up to unknown parameter γ

CMV model 10 30 Copula-based model It is believed that there exist a kind of information overlap between the claims from consecutive development periods Joint bivariate distribution of [ε i,j 1 (α, β), ε i,j (α, β)] has distribution function H(e 1, e 2 ) = C(G α,β (e 1 ), G α,β (e 2 ); γ) Conditional copula density can be derived as h(e 2 e 1 ) = g α,β (e 2 )c(g α,β (e 1 ), G α,β (e 2 ); γ) where c is the copula density and g α,β is the marginal density corresponding to G α,β Play an important role in making" the dependent errors conditionally independent

Nonlinear generalized semiparametric regression 11 30 Parameter estimation CMV model with copula assume three vector parameters to be estimated Estimation process consists of two stages In the first one, mean and variance parameters α and β are estimated in a distribution-free fashion, since no specific distributional assumptions are proposed nor required for the claims The second stage concerns estimation of the dependence structure, mainly the copula parameter γ, in a likelihood based way

Nonlinear generalized semiparametric regression 12 30 Conditional least squares (CLS) Denote M n (α, β) = 1 n n+1 j 1 n 1 n + 1 j j=2 i=1 [ Yi,j µ(y i,j 1, α, j) ] 2 σ 2 (Y i,j 1, β, j) n n+1 j i=1 V n (α, β) = 1 1 n 1 n + 1 j j=2 { [Yi,j µ(y i,j 1, α, j) ]2 2 σ 2 (Y i,j 1, β, j)}

Nonlinear generalized semiparametric regression 13 30 CLS estimates CLS estimate of the mean parameter α for a fixed value of parameter β Θ 2 is defined as α(β) = arg min α Θ 1 M n (α, β) and CLS estimate of the variance parameter β for a fixed value of parameter α Θ 1 is defined as β(α) = arg min β Θ 2 V n (α, β) Computationally not feasible to find the global minimum of M n and V n with respect to [α, β ] simultaneously

Nonlinear generalized semiparametric regression 14 30 Consistency Under regularity conditions α(β) P n α P (β), β; β(α) n β (α), α Mixingales are to mixing processes as martingale differences are to independent processes

Nonlinear generalized semiparametric regression 15 30 Iterative CLS What is the connection between the true unknown parameter values α and β of the CMV model and true unknown parameter values α (β) and β (α)? [ α(β ) β(α ) ] [ P α n β Iteratively estimate α given the fixed value of β and, consequently, estimate β given the fixed value of α (obtained from previous step) Repeat in turns until almost no change in consecutive estimates of [α, β ] ]

Nonlinear generalized semiparametric regression 16 30 Estimation of dependence structure Estimate the unknown marginal distribution function G α,β of CMV model errors ε i,j (α, β) non-parametrically by the empirical distribution function Ĝ n (e) = 1 n(n 1)/2 + 1 of the fitted residuals n 1 i=1 ε i,j ( α, β) = Y i,j µ(y i,j 1, α, j) σ(y i,j 1, β, j) n+1 i I{ ε i,j ( α, β) e} j=2

Nonlinear generalized semiparametric regression 17 30 Likelihood for copula Full log-likelihood for copula parameter γ L(γ) = + n 2 i=1 n 2 i=1 n+1 i log g α,β (ε i,j (α, β)) j=2 n+1 i log c(g α,β (ε i,j 1 (α, β)), G α,β (ε i,j (α, β)); γ) j=3

Nonlinear generalized semiparametric regression 18 30 Psuedo (quasi) likelihood Ignoring the first term in L(γ) and replacing ε s and G α,β by their estimated counterparts ε s and Ĝ n, parameter γ can be estimated by the so-called canonical maximum likelihood, i.e., maximizing the partial (pseudo) log-likelihood γ = arg max γ L(γ) = n 2 i=1 L(γ) n+1 i log c(ĝ n ( ε i,j 1 ( α, β)), Ĝ n ( ε i,j ( α, β)); γ) j=3

Nonlinear generalized semiparametric regression 19 30 Prediction Predictor for reserve R (n) i R (n) i = Ŷ i,n Y i,n+1 i can be defined as Prediction of unobserved claims may be done in a telescopic way based on the CMV model formulation: start with the diagonal element Y i,n+1 i and predict Y i,j, j > n + 1 i stepwise in each row Ŷ i,j = Y i,j, i + j n + 1 Ŷ i,j = µ(ŷ i,j 1, α, j) + σ(ŷ i,j 1, β, j) ε j, i + j > n + 1

Nonlinear generalized semiparametric regression 20 30 Semiparametric bootstrap Errors ε j are simulated from the fitted residuals Takes advantage of the fact that ε i,j (α, β) = G 1 α,β (X j) for all i (due to the independent rows), where {X j } n j=2 is a stationary first-order Markov process with the copula C(x 1, x 2 ; γ) being the joint distribution of [X j 1, X j ]

Nonlinear generalized semiparametric regression 21 30 Resampling algorithm Generate n 1 independent Un(0, 1) rvs {X j } n j=2 Repeat b = 1,..., B (b) U 2 X 2 (b) ε 2 Ĝn ( (b) U 2 ) (b) U j C 1 2 1 (X j (b) U j 1 ; γ), j = 3,..., n (b) ε j Ĝn ( (b) U j ), j = 3,..., n Center bootstrap residuals (b) ε j (b) ε j 1 n 1 n l=2 (b) ε l (b) Ŷ i,n+1 j Y i,n+1 i (b) Ŷ i,j µ( (b) Ŷ i,j 1, α, j) + σ( (b) Ŷ i,j 1, β, j) (b) ε j, j = n + 2 i,..., n (b) R (n) i (b) Ŷ i,n Y i,n+1 i

Real data example 22 30 1 1 1 1 1 1 1 1 1 1 1 2 4 6 8 10 200000 400000 600000 800000 1000000 Development period j Y ij 2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 8 8 8 8 9 9 9 0 0 a

Real data example 23 30 Real data Data set from Zehnwirth and Barnett (2000) ( µ(y i,j 1, α, j) = 1 + α 1 α 2 j 1 α 2 exp { α 1 j α }) 2 Y i,j 1 σ(y i,j 1, β, j) = β 1 exp{ β 2 j} Y i,j 1 CLS estimates: α 1 = 2.033, α 2 = 1.106, β 1 = 109.8, β 2 = 0.4053

Real data example 24 30 6 Residuals 2 1 0 1 2 9 0 8 ε^i,j 2 7 13 45 6 9 8 7 8 7 6 5 7 46 1 1 1 1 1 23 23 23 5 45 4 45 6 5 4 23 2 3 4 3 3 2 2 12 1 1 1 2 4 6 8 10 Development period j Still some slight pattern (trend) not captured by mean and variance parametric part

Real data example 25 30 Copula godness-of-fit Kendall τ for the pairs of consecutive residuals {[ ε i,j 1 ( α, β), n 2,n+1 i ε i,j ( α, β)]} i=1,j=3 equals 0.43, which indicates at least mild dependence Three Archimedean copulae (Clayton, Frank, and Gumbel) together with Gaussian and Student t 5 -copula considered S (C) n goodness-of-fit test proposed by Genest et al. (2009) Gumbel copula ( γ = 1.776) was chosen Exhibits strong right tail dependence and relatively weak left tail dependence Transformed residuals (by the residuals marginal edf Ĝ n ; having uniform margins) seem to be strongly correlated at high values but less correlated at low values

Real data example 26 30 1.5 0.5 0.0 0.5 1.0 1.5 1.0 0.0 0.5 1.0 1.5 2.0 2.5 Residuals ε^i,j 1 ε^i,j 1 0 1 2 1 0 1 2 Simulated Residuals ε^j 1 ε^j 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0 Transformed Residuals G^ n(ε^i,j 1) G^ n(ε^i,j) Gumbel(γ^) Copula Density Un[0,1] Un[0,1] 0.25 0.25 0.5 0.5 0.75 0.75 1 1 1.25 1.25 1.5 1.5 1.75 1.75 2 2 2.25 2.5 2.5 3 4 0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

Real data example 27 30 Results Benchmark: traditional bootstrapped chain ladder (BCL) Disadvantages, which can be overcome by our approach: Number of parameters depending on the sample size Some parameters estimated by just ratio of two numbers (yielding zero sample variance) Questionable consistency of the estimates Non-realistic assumption of independence of the residuals Our approach: Slightly smaller predictions of reserves But even more important is that the estimates of the reserves distribution are less volatile (interesting finding in terms of CoV: GEE AR(1) 1.95; CLSC 1.99)

Reserves 28 30 BCL+CLSC i=2 BCL+CLSC BCL+CLSC i=3 i=4 0k 25k Reserves 25k 50k 75k Reserves 50k 75k 100k BCL+CLSC i=5 BCL+CLSC i=6 75k 100k 125k 150k Reserves 100k 150k 200k Reserves 150k 200k 250k BCL+CLSC i=8 BCL+CLSC i=7 BCL+CLSC i=9 300k 350k 400k 450k 500k Reserves 600k 675k 750k 825k 900k Reserves 1100k 1250k 1400k 1550k BCL+CLSC i=11 BCL+CLSC i=10 4750k 5000k 5250k 5500k 5750k 6000k Total Reserves 1775k 2000k 2225k 2450k 2675k Reserves Reserves BCL+CLSC Reserves Reserves

Conclusions 29 30 Summary Conditional mean and variance (CMV) time series model for triangular data with innovations being a stationary first-order Markov process Framework is demonstrated to be suitable for stochastic claims reserving in general insurance Very flexible modeling approach, relatively smaller number of model parameters not depending on the number of development periods, and time series innovations not considered as independent Increase in precision of the claims reserves prediction Theoretical justification of the proposed approached shown

Questions? 30 30 Thank you! Michal.Pesta@mff.cuni.cz