Analytics Software. Beyond deterministic chain ladder reserves. Neil Covington Director of Solutions Management GI

Size: px

Start display at page:

Download "Analytics Software. Beyond deterministic chain ladder reserves. Neil Covington Director of Solutions Management GI"

Abel Greene
6 years ago
Views:

1 Analytics Software Beyond deterministic chain ladder reserves Neil Covington Director of Solutions Management GI

2 Objectives 2

3 Contents 01 Background 02 Formulaic Stochastic Reserving Methods 03 Bootstrapping The Theory 04 Bootstrapping The Practice 05 Generalised Linear Models GLM 3

4 01 Background

5 Questions Do you currently calculate non-deterministic chain ladder style reserves e.g. stochastic, GLM etc.? Do you think you will calculate non-deterministic chain ladder reserves in the next 1 to 2 years? 5

6 Definition Deterministic A deterministic system is a system in which no randomness is involved in the development of future states of the system A deterministic model will thus always produce the same output from a given starting condition or initial state 6

7 Example Deterministic Basic Chain Ladder Same answer every time Run 1 Run 2 Run 3 Run 4 Run 5.. Development pattern is fixed Fixed proportional development Fixed timing Fixed tail development 7

determined Having a random probability distribution or pattern that

8 Definition Stochastic Events or systems that are unpredictable due to the influence of a random variable Pertaining to chance Randomly determined Having a random probability distribution or pattern that may be analysed statistically but may not be predicted precisely 8

ladder assumes averages User often provides expert judgement in the process Sometimes curve

9 Example Stochastic Basic Chain Ladder Development pattern is is variable fixed Variable Fixed Variable Fixed tail proportional tail Variable Fixed timing development development Basic chain ladder assumes averages User often provides expert judgement in the process Sometimes curve fitting But then still just one view of the development pattern The only guarantee is that this will be wrong! 9

10 Stochastic Measure of variability Standard deviation Mean squared error Full distribution 10

11 Reserve Models Distributional Free Models Consider incremental claim models CC ii,jj = αα ii. ββ jj No explicit distribution specified Estimators use method of moments Effectively pure deterministic methods E.g. typical chain ladder method

12 Reserve Models Distributional Models Consider incremental claim models CC ii,jj = αα ii. ββ jj Distributional Models Overdispersed Poisson Negative-binomial Log-normal Gamma Tweedie MLE s typically used to estimate parameters Can then associate stochastic variance measures, usually variance and/or mean squared error

13 Reserve Models Linear Models General linear models Not generalised linear models Special case of the generalised linear model with identity link and responses normally distributed Generalised linear models Flexible generalisation of ordinary linear regression Response variables can have error distribution models other than a normal distribution Linear model can be related to the response variable via a link function Magnitude of the variance of each measurement can be a function of its predicted value Linear regression techniques typically used to estimate

14 02 Formulaic Stochastic Reserving Methods

15 Non-Bootstrap Methods Mack Thomas Mack et al Single measure of ultimate variability, no distribution Formulaic, no simulation needed Builds on the standard ODP basic chain ladder model Well documented method 15

16 Non-Bootstrap Methods Mack Mean squared error of reserve R by origin period i: mmmmmm( 2 RR ii ) = CC ii,ii II 1 kk=ii+1 ii I represents the maximum origin, and development, period σσ kk 2 dd kk 2 1 CC ii,kk + 1 II kk jj=1 CC jj,kk Measures the average of the squares of the errors, the difference between the estimator and what is estimated, actual versus expected Similar formula for mean squared error of total reserves by origin period including a covariance adjustment Extensions, e.g. Mack with Tail 16

single point value, no distribution or other values No

17 Non-Bootstrap Methods Mack Simple formulaic calculation, no need for simulations Useful non-complicating measure Only a single point value, no distribution or other values No specific defined percentile measure for the mean squared error 17

18 Non-Bootstrap Methods Merz and Wuthrich Michael Merz and Mario V. Wuthrich Builds on the work of Mack et al Mean squared error of reserve over: One development period, e.g. a year Ultimate Measure of variability, no distribution Formulaic, no simulation needed By origin period and in total Process variance and estimation variance by origin period Aggregate with covariance adjustment to get total One year measure used for Solvency II reserve risk calibration and is the prescribed USP calculation 18

19 Non-Bootstrap Methods Merz and Wuthrich Simple formulaic calculation, no need for simulations Useful non-complicating measure Single development period and ultimate measures Only a point value, no distribution No specific defined percentile measure for the mean squared error 19

20 03 Bootstrapping The Theory

Bootstrapping Bootstrapping usually refers to a self-starting process that is supposed to proceed without external input In computers, shortened to booting, refers to the

21 Bootstrapping Bootstrapping usually refers to a self-starting process that is supposed to proceed without external input In computers, shortened to booting, refers to the process of loading the basic software which will then take care of loading other software as needed. Originally from putting on a boot with the help of the boot itself A bootstrap 21

22 The Theory Bootstrapping can refer to any test or metric that relies on random sampling, with replacement. Bootstrapping allows assigning measures of accuracy (defined in terms of bias, variance, confidence intervals, prediction error or some other such measure) to sample estimates. This technique allows estimation of the sampling distribution of almost any statistic using random sampling methods. Generally, it falls in the broader class of resampling methods. 22

23 The Theory Bootstrapping is the practice of estimating properties of an estimator (such as its variance) by measuring those properties when sampling from an approximating distribution. One standard choice for an approximating distribution is the empirical distribution function of the observed data. In the case where a set of observations can be assumed to be from an independent and identically distributed population, this can be implemented by constructing a number of resamples with replacement, of the observed dataset 23

24 The Challenge 24

25 Common Bootstrap ODP Model An incremental loss type model Incremental losses CC ii,jj by origin period i and development period j are independent Incremental losses CC ii,jj are related: CC ii,jj = αα ii. ββ jj There is a constant variance to mean ratio, σσ 2 Note: Poisson distribution mean = variance Overdispersed Poisson distribution mean variance with the ratio being the dispersion parameter 25

26 Common Bootstrap ODP Model For the basic chain ladder, cumulative claims DD ii,jj : DD ii,jj = DD ii,kkii. dd jj To date amount times cumulative development factor dd jj = jj aaaa mm mm=kk ii +1 Cumulative development factor is the product of age to age, link ratio, factors aaaa mm Hence incremental claims CC ii,jj = DD ii,jj DD ii,jj 1 = DD ii,kkii. dd jj dd jj 1 = αα ii. ββ jj αα ii = DD ii,kkii ββ jj = dd jj dd jj 1 26

27 Common Bootstrap ODP Model So CC ii,jj = αα ii. ββ jj as required EE CC ii,jj = DD ii,kkii. dd jj dd jj 1 VVVVVV CC ii,jj = σσ 2. EE CC ii,jj αα ii = DD ii,kkii and ββ jj = dd jj dd jj 1 are estimated straight from the basic chain ladder calculations Variance proportional to the mean defines an ODP distribution, σσ 2 being the dispersion parameter 27

larger samples Parametric Theoretical residuals distribution e.g., ODP or Normal

28 Types of Bootstrap Non-Parametric Empirical residuals distribution Does not require a distributional assumption Just uses the data itself Works best for larger samples Parametric Theoretical residuals distribution e.g., ODP or Normal distributions are common Mean and variance parameters selected based on the original data 28

29 Residuals Typical unscaled residual is based on Pearson s Residual εε ii,jj = CC ii,jj CC ii,jj CC ii,jj Actual less expected as a proportion of the square root of expected The set of residuals εε ii,jj then forms the sample distribution: Used directly as the residuals distribution, an empirical distribution Fitted to an appropriate standard distribution, e.g. ODP or normal, using the sample mean and variance 29

30 Residuals Bias Correction Residuals can further be scaled to correct for bias εε aaaaaa ii,jj = εε ii,jj. nn nn pp Where: n is the number of observations, the size of the residuals triangle p the number of parameters estimated, the number of origin and development periods 30

31 Residuals Process Variance Process variance can be estimated as: Where: n is the number of observations p the number of parameters estimated σσ 2 = 1 nn pp. ii,jj 2 εε ii,jj 31

Residuals Selections Residuals can be selected, in turn adjusting the residual distribution Individual residuals can be ignored, for example,

32 Residuals Selections Residuals can be selected, in turn adjusting the residual distribution Individual residuals can be ignored, for example, the maximum and minimum Residuals can be grouped for sampling, for example, use the tail group residuals for the tail only Group 1 Group 2 32

33 04 Bootstrapping The Practice

34 Bootstrapping Process Cumulative Claims Start with the cumulative claims triangle Cumulative Claims

35 Bootstrapping Process Development Pattern Calculate the usual development pattern Include any adjustments as usual Example here is a simple average of all values Development Factors Average

36 Bootstrapping Process Developed Claims Calculate developed claims in the usual way and hence the usual BCL reserves Developed Claims Actual Reserve

37 Bootstrapping Process Expected Cumulative Claims Calculate expected past cumulative claims using the development pattern Start with the latest, to date, value and apply the development pattern in reverse What previous claim would have given rise to this one For example 130 = 132 / Expected Cumulative Claims Development Factors Average

38 Bootstrapping Process Incremental Claims Convert cumulative actual and expected claims to incremental claims Actual Incremental Claims Expected Incremental Claims

39 Bootstrapping Process Unscaled Pearson Residual Calculate unscaled Pearson residual εε ii,jj = CC ii,jj CC ii,jj CC ii,jj Actual less expected divided by the square root of expected Pearsons Unscaled Residual

40 Bootstrapping Process Empirical Residual Distribution These 10 residuals then form our empirical distribution Note you would usually look to have a fuller sample, e.g. 100 or more Empirical Residual Distribution

41 Bootstrapping Process Simulated Residual Triangle Through sampling from the residual distribution, with replacement, randomly populate a new simulated residual triangle Empirical Residual Distribution Simulated Residuals

Bootstrapping Process Simulated Incremental Claims Using the same expected claims, reverse the residual calculation to generate simulated actual claims CC ii,jj = CC ii,jj + εε ii,jj.

42 Bootstrapping Process Simulated Incremental Claims Using the same expected claims, reverse the residual calculation to generate simulated actual claims CC ii,jj = CC ii,jj + εε ii,jj. CC ii,jj Simulated actual claims equals expected claims plus the residual times the square root of the expected claims Simulated Incremental Claims

43 Bootstrapping Process Simulated Cumulative Claims Calculate cumulative simulated claims from the incremental claims Simulated Cumulative Claims

44 Bootstrapping Process Simulated Development Pattern Calculate a usual development pattern Adjustments not typically included further Example is again a simple average Development Factors

45 Bootstrapping Process Developed Simulated Claims Calculate developed simulated claims in the usual way and hence the usual BCL reserves Developed Claims Simulated Reserve

7 Simulated Reserve 0.0 2.0 11.4 22.2 47.7 83.

46 Bootstrapping Process We now have two results for claim reserves One from actual claims One from simulated claims Actual Reserve Simulated Reserve Now repeat from the simulating residual triangle step to produce another simulated claim reserve, and another, and another. 46

47 Bootstrapping Process So what? 47

48 Bootstrapping Process Using Results Examples Mean 94.6 Coefficient of variance 8.7% 75 th percentile th percentile th to 75 th percentile confidence interval 88.8 to or -6.1% to +5.9% of the mean 48

49 Bootstrapping Process So what? 49

50 Bootstrapping Process Using Results Examples The Science Mean 94.6 Coefficient of variance 8.7%, approx. 83 rd percentile The English Best estimate, expected value 75 th percentile 100.1, +5.9% of mean Prudent estimate 1 in 4 year level Average difference to mean measure How stable is the estimate? 99.5 th percentile 116.5, +23.2% mean Extreme, 1 in 200 year, estimate How bad could it get? 25 th to 75 th percentile confidence interval 88.8 to 100.1, -6.1% to +5.9% of the mean A likely range 50% chance it will be in this range Use them as a tool to inform and assist with your work 50

51 Bootstrapping Process Common Misuses There are no guarantees, these are simulations only Don t be fooled into a false sense of security Use them as an informative tool only, an extra piece of information to help you The statistics, e.g. coefficient of variation, only apply to this data set A common mis-use is to run a simulation on one dataset and then use the statistics, e.g. coefficient of variation, on another data set 51

52 05 Generalised Linear Models GLM

53 GLM in Reserving Not a new concept Papers going back to early 2000 s Not a new concept Different ways to use it.

54 What is a GLM? Standard definition The GLM consists of three elements: A probability distribution from the exponential family, error distribution A linear predictor η = Xβ, dependent variable A link function g such that E(Y) = μ = g 1(η) 54

55 GLM Link Function A standard linear model has an additive form: response = constant + f factor 1 + f factor f factor n (f factor i is a function whose value depends on the level of factor i, e.g. you might have f sex = 0.00 if sex=male and if sex=female.) Generalised linear models allow you to apply a 'link function' to change the structure of the model you are analysing. Typical link functions include: log function inverse function square root function logistic function Taking the simple example of the log link function, if you have a response which you want to model according to a multiplicative model, i.e. : response = constant f factor 1 f factor 2... f factor n By taking logs, you end up with: log(response) = log(constant f factor 1 f factor 2... f factor n ) = log(constant) + log(f factor 1 ) + log(f factor 2 ) log(f factor n ) Hence a linear model again 55

56 GLM Error Distribution Going hand-in-hand with the choice of link function is the choice of error distribution. Common link functions and error distribution combinations include: Error Distribution Link Function Common Usage Normal Identity Standard additive linear model, the general linear model Poisson Log Model the number of times something happens using a multiplicative structure Gamma Log Model amounts (>0) using a multiplicative structure Binomial Logistic Model a probability (constrained to be between 0 and 1) 56

57 GLM Dependent Variable The dependent variable, or response, is the variable you are trying to model, i.e. the variable which depends on the values of the analysis factors, e.g. a reserve. This factor is usually one of the following: a 0/1 flag or a probability, in which case use the binomial-logistic model structure a number of events, in which case use the poisson-log model structure a (positive) amount, in which case use the gamma-log model structure 57

58 GLM Just another distribution? Basically yes Linear model compared to a statistical distribution But reduces the number of parameters, effectively the dimensions of the random vector Typical multiplicative model CC ii,jj = αα ii. ββ jj (I+1).(J+1) unknown parameters C i,j GLM additive form model I+J+1 unknown parameters α i and β j [(I+1)+(J+1) with typical normalisation constraint α 0 =1 hence log(α 0 )=0] MLE s for GLM model can be shown to be the same as usual distributional models This time solve the system of equations for α i and β j using GLM software and techniques 58

59 GLM Extension Operational Time Remember the usual incremental claim assumptions Independent origin period and development period effects What about operational time? Development Origin

60 Operational Times Solutions Inflation adjustments Berquist Sherman adjustment rate Cape Cod decay factor Chain ladder Adjusted for operational time has form CC ii,jj = αα ii. ββ jj CC ii,jj = αα ii. ββ jj. γγ ii+jj 60

61 Operational Times Solutions GLM model can easily incorporate and solve for operational time factor CC ii,jj = αα ii. ββ jj. γγ ii+jj Other factors. Integrate other risk factors too? 61

62 06 Summary

63 Use reserve models to help supplement your experience and knowledge You should always know your business better than any model can

64 64

Claims Reserving under Solvency II

Claims Reserving under Solvency II Mario V. Wüthrich RiskLab, ETH Zurich Swiss Finance Institute Professor joint work with Michael Merz (University of Hamburg) April 21, 2016 European Congress of Actuaries,