On the Importance of Dispersion Modeling for Claims Reserving: Application of the Double GLM Theory

On the Importance of Dispersion Modeling for Claims Reserving: Application of the Double GLM Theory Danaïl Davidov under the supervision of Jean-Philippe Boucher Département de mathématiques Université du Québec à Montréal August 2009 D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 1 / 37

Table of Contents 1 Introduction 2 Tweedie GLM Model Definition Log-likelihood Overdispersion 3 Variance Modeling Variance in regressions Dispersion Models Double GLMs 4 Swiss Motor Industry Example Analysis of results 5 Conclusion D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 2 / 37

Introduction Definition A loss reserve is a provision for an insurer s liability for claims Notes D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 3 / 37

Introduction Definition Notes A loss reserve is a provision for an insurer s liability for claims A stochastic model uses random variables in a regression framework D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 3 / 37

Introduction Definition Notes A loss reserve is a provision for an insurer s liability for claims A stochastic model uses random variables in a regression framework Claims reserves models presented here use GLM theory as introduced in England, Verrall 2002 D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 3 / 37

Model Definition Notations C i,j Incremental payments D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 4 / 37

Model Definition Notations C i,j Incremental payments w i,j Exposure D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 4 / 37

Model Definition Notations C i,j Incremental payments w i,j Exposure r i,j Incremental number of payments D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 4 / 37

Model Definition Notations C i,j Incremental payments w i,j Exposure r i,j Incremental number of payments Y i,j Normalized incremental payments D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 4 / 37

Model Definition Notations C i,j Incremental payments w i,j Exposure r i,j Incremental number of payments Y i,j Normalized incremental payments Y i,j = C i,j w i,j D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 4 / 37

Model Definition Hypotheses C i,j is a compound Poisson-Gamma distribution D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 5 / 37

Model Definition Hypotheses C i,j is a compound Poisson-Gamma distribution Frequency Poisson with mean ϑ i,j w i,j and variance ϑ i,j w i,j D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 5 / 37

Model Definition Hypotheses C i,j is a compound Poisson-Gamma distribution Frequency Poisson with mean ϑ i,j w i,j and variance ϑ i,j w i,j Severity Gamma with mean τ i,j and variance ντ 2 i,j D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 5 / 37

Model Definition Hypotheses C i,j is a compound Poisson-Gamma distribution Frequency Poisson with mean ϑ i,j w i,j and variance ϑ i,j w i,j Severity Gamma with mean τ i,j and variance ντ 2 i,j Using the following parametrisation p = ν + 2 ν + 1 µ = ϑτ φ = ϑ1 p τ 2 p (2 p), p (1, 2) D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 5 / 37

Model Definition Tweedie Model Y i,j Tweedie(µ i,j, p, φ, w i,j ) D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 6 / 37

Model Definition Tweedie Model Y i,j Tweedie(µ i,j, p, φ, w i,j ) µ i,j = e Xβ D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 6 / 37

Model Definition Tweedie Model Y i,j Tweedie(µ i,j, p, φ, w i,j ) µ i,j = e Xβ E[Y i,j ] = µ i,j, Var[Y i,j ] = φ w i,j µ p i,j D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 6 / 37

Log-likelihood function Tweedie Model l = i,j r i,j log ( ) (wi,j /φ) ν+1 yi,j ν (p 1) ν (2 p) log ( r i,j!γ ( r i,j ν ) y i,j ) + w i,j φ µ (y 1 p i,j i,j 1 p µ 2 p ) i,j 2 p D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 7 / 37

Parameters FIGURE: Parameter main influence D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 8 / 37

Parameter p Tweedie Model Can be estimated only when the number of payments is known D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 9 / 37

Parameter p Tweedie Model Can be estimated only when the number of payments is known Otherwise, it s supposed fixed and known D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 9 / 37

Parameter p Tweedie Model Can be estimated only when the number of payments is known Otherwise, it s supposed fixed and known p and φ need both to be estimated at the same time D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 9 / 37

Parameter φ Optimizing φ using the likelihood principle φ p = i,j w i,j (y i,j µ 1 p i,j 1 p µ2 p i,j 2 p (1 + ν) i,j r i,j ) D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 10 / 37

Parameter φ FIGURE: Optimizing p using the likelihood principle for φ D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 11 / 37

Parameter φ Optimizing φ using the deviance principle φ p = i,j 2 y (y 1 p i,j µ 1 p i,j i,j N Q 1 p y 2 p i,j µ 2 p i,j 2 p ) D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 12 / 37

Parameter φ FIGURE: Optimizing p using the deviance principle for φ D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 13 / 37

Parameter w i,j Note The exposure has been incorporated in the begining, within the initial hypothesis D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 14 / 37

Parameter w i,j Note The exposure has been incorporated in the begining, within the initial hypothesis Different ways of incorporating the exposure within the initial hypothesis would lead to different models D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 14 / 37

Frequency vs Severity Compound Poisson Model Y = N k=1 X i Case 1 Case 2 Case 3 E[N] 10 20 10 Var[N] 10 20 10 E[X] 10 10 20 Var[X] 100 100 400 E[Y] 100 200 200 Var[Y] 2000 4000 8000 TABLE: Mean and variance of total costs for various situations D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 15 / 37

Frequency vs Severity Typical situation in a long-tail business Decreasing average frequency D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 16 / 37

Frequency vs Severity Typical situation in a long-tail business Decreasing average frequency Increasing average severity D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 16 / 37

Frequency vs Severity Typical situation in a long-tail business Decreasing average frequency Increasing average severity Increasing variance in the severity D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 16 / 37

Impact of the Distribution FIGURE: Fitted curve for Normal, Poisson and Gamma models D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 17 / 37

Dispersion Models Model Definition Y i,j Tweedie(µ i,j, p, φ i,j, w i,j ) D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 18 / 37

Dispersion Models Model Definition Y i,j Tweedie(µ i,j, p, φ i,j, w i,j ) µ i,j = e Xβ D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 18 / 37

Dispersion Models Model Definition Y i,j Tweedie(µ i,j, p, φ i,j, w i,j ) µ i,j = e Xβ E[Y i,j ] = µ i,j, Var[Y i,j ] = φ i,j µ p i,j D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 18 / 37

Dispersion Models Model Definition Y i,j Tweedie(µ i,j, p, φ i,j, w i,j ) µ i,j = e Xβ E[Y i,j ] = µ i,j, Var[Y i,j ] = φ i,j µ p i,j φ i,j = e V γ D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 18 / 37

Dispersion Models Log-likelihood l = ( ) (wi,j /φ i,j ) ν+1 yi,j ν r i,j log (p 1) ν log ( r i,j!γ ( r i,j ν ) ) y i,j (2 p) i,j + w i,j µ (y 1 p i,j i,j φ i,j 1 p µ 2 p ) i,j 2 p D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 19 / 37

Dispersion Models Notes The p parameter is optimized at the same time as all other parameters using implicitly the likelihood principle D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 20 / 37

Dispersion Models Notes The p parameter is optimized at the same time as all other parameters using implicitly the likelihood principle Accident years do not have a significant impact on the dispersion parameter Due to lack of data in the last column, the model was build so that the last two columns have the same dispersion parameter D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 20 / 37

Dispersion Models Notes The p parameter is optimized at the same time as all other parameters using implicitly the likelihood principle Accident years do not have a significant impact on the dispersion parameter Due to lack of data in the last column, the model was build so that the last two columns have the same dispersion parameter Possibility to incorporate trends in the dispersion parameter by using the Hoerl s curve parametrisation D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 20 / 37

Double GLMs Algorithm 1 Start with the initial exposure and find the normalized incremental payments D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 21 / 37

Double GLMs Algorithm 1 Start with the initial exposure and find the normalized incremental payments 2 Find the deviance between fitted and observed values D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 21 / 37

Double GLMs Algorithm 1 Start with the initial exposure and find the normalized incremental payments 2 Find the deviance between fitted and observed values 3 Model the deviance D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 21 / 37

Double GLMs Algorithm 1 Start with the initial exposure and find the normalized incremental payments 2 Find the deviance between fitted and observed values 3 Model the deviance 4 Establish the new exposure and start over D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 21 / 37

Double GLMs FIGURE: Inter-relationship between the two sub-models D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 22 / 37

Iterative Weighted Least Squares Algorithm additional specifications Analogous to Fisher s weighted scoring method for optimization D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 23 / 37

Iterative Weighted Least Squares Algorithm additional specifications Analogous to Fisher s weighted scoring method for optimization IWLS implicitly uses the deviance principle for estimating φ D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 23 / 37

Restricted Maximum Likelihood Notes Maximum likelihood estimators are biased downwards when the number of estimators is large compared to the number of observations D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 24 / 37

Restricted Maximum Likelihood Notes Maximum likelihood estimators are biased downwards when the number of estimators is large compared to the number of observations REML produces estimators which are approximately and sometimes exactly unbiased D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 24 / 37

Restricted Maximum Likelihood Notes Maximum likelihood estimators are biased downwards when the number of estimators is large compared to the number of observations REML produces estimators which are approximately and sometimes exactly unbiased Approximately maximizes the penalized log-likelihood l p (y; γ; p) = l(y; β γ ; γ; p) + 1 X 2 log T WX D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 24 / 37

Optimizing p Algorithm 1 Suppose p fixed and known D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 25 / 37

Optimizing p Algorithm 1 Suppose p fixed and known 2 Evaluate all the other parameters using the DGLM D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 25 / 37

Optimizing p Algorithm 1 Suppose p fixed and known 2 Evaluate all the other parameters using the DGLM 3 Evaluate the penalized log-likelihood D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 25 / 37

Optimizing p Algorithm 1 Suppose p fixed and known 2 Evaluate all the other parameters using the DGLM 3 Evaluate the penalized log-likelihood 4 Start over with different values for p and compare the penalized log-likelihood D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 25 / 37

Reserve Variability Mean Square Error of Prediction Dispersion Models : overdispersion included implicitly in the parameter covariance matrix D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 26 / 37

Reserve Variability Mean Square Error of Prediction Dispersion Models : overdispersion included implicitly in the parameter covariance matrix GLMs : overdispersion is included manually in the parameter covariance matrix D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 26 / 37

Data Analyzed D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 27 / 37

Data Analyzed D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 28 / 37

Data Analyzed D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 29 / 37

Optimizing p FIGURE: Restricted log-likelihood for various p in a DGLM D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 30 / 37

Analysis of Results D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 31 / 37

Analysis of Results D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 32 / 37

Analysis of Results FIGURE: Adjusting the most deviant observations has a bigger influence on the deviance principle D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 33 / 37

Analysis of Results FIGURE: Contribution of each cell to the dispersion. p = 1.1741, w i,j 1 D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 34 / 37

Conclusion Further Discussion Lack of observations and abundance of parameters is a hostile environment for DGLMs D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 35 / 37

Conclusion Further Discussion Lack of observations and abundance of parameters is a hostile environment for DGLMs The deviance cannot be estimated in the last column for a DGLM and is hence ignored when estimating φ The algorithm does not converge for p fixed when there are more than 7 parameters D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 35 / 37

Conclusion Further Discussion Lack of observations and abundance of parameters is a hostile environment for DGLMs The deviance cannot be estimated in the last column for a DGLM and is hence ignored when estimating φ The algorithm does not converge for p fixed when there are more than 7 parameters The algorithm does converge for p fixed for one parameter, but p cannot be optimized D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 35 / 37

Conclusion Further Discussion Lack of observations and abundance of parameters is a hostile environment for DGLMs The deviance cannot be estimated in the last column for a DGLM and is hence ignored when estimating φ The algorithm does not converge for p fixed when there are more than 7 parameters The algorithm does converge for p fixed for one parameter, but p cannot be optimized The "bounds" of convergence need to be explored further D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 35 / 37

References England, PD and Verrall, RJ (2002). Stochastic Claims Reserving in General Insurance. Institute of Actuaries and Faculty of Actuaries Smyth, G. and Jorgensen, B. (2002). Fitting Tweedie s Compound Poisson Model to Insurance Claims Data : Dispersion Modelling. ASTIN Bulletin, 1, 143 157 Wüthrich, MV (2003). Claims Reserving Using Tweedie s Compound Poisson Model. ASTIN Bulletin, 331 346 D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 36 / 37

The end D. Davidov (UQAM) Loss Reserves Variance Modeling August 2009 37 / 37