Non-Life Insurance: Mathematics and Statistics


ETH Zürich, D-MATH, HS 2017
Prof. Dr. Mario V. Wüthrich, Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 1

Solution 1.1 Discrete Distribution

a) Note that N only takes values in ℕ \ {0} and that p ∈ (0, 1). Hence we calculate

P[N ∈ ℝ] = Σ_{k=1}^∞ P[N = k] = Σ_{k=1}^∞ (1-p)^{k-1} p = p Σ_{k=0}^∞ (1-p)^k = p · 1/(1-(1-p)) = p/p = 1,

from which we can conclude that the geometric distribution indeed defines a probability distribution on ℝ.

b) For n ∈ ℕ \ {0}, we get

P[N ≥ n] = Σ_{k=n}^∞ P[N = k] = Σ_{k=n}^∞ (1-p)^{k-1} p = (1-p)^{n-1} p Σ_{k=0}^∞ (1-p)^k = (1-p)^{n-1},

where we used that Σ_{k=0}^∞ (1-p)^k = 1/p, as was shown in a).

c) The expectation of a discrete random variable that takes values in ℕ \ {0} can be calculated as E[N] = Σ_{k=1}^∞ k P[N = k]. Thus we get

E[N] = Σ_{k=1}^∞ k (1-p)^{k-1} p = Σ_{k=0}^∞ (k+1)(1-p)^k p = Σ_{k=0}^∞ k (1-p)^k p + Σ_{k=0}^∞ (1-p)^k p = (1-p) E[N] + 1,

where we used that Σ_{k=0}^∞ (1-p)^k p = 1, as was shown in a). We conclude that E[N] = 1/p.

d) Let r ∈ ℝ. Then we calculate

E[exp{rN}] = Σ_{k=1}^∞ exp{rk} P[N = k] = Σ_{k=1}^∞ exp{rk} (1-p)^{k-1} p = p/(1-p) · Σ_{k=1}^∞ [(1-p) exp{r}]^k.

Since (1-p) exp{r} is strictly positive, the sum on the right-hand side is convergent if and only if (1-p) exp{r} < 1, which is equivalent to r < -log(1-p). Hence E[exp{rN}] exists if and only if r < -log(1-p), and in this case we have

M_N(r) = E[exp{rN}] = p/(1-p) · (1-p) exp{r} / (1 - (1-p) exp{r}) = p exp{r} / (1 - (1-p) exp{r}).

Updated: September 9, 2017

e) For r < -log(1-p), we have

d/dr M_N(r) = [p exp{r} (1 - (1-p) exp{r}) + p exp{r} (1-p) exp{r}] / (1 - (1-p) exp{r})² = p exp{r} / (1 - (1-p) exp{r})².

Hence we get

d/dr M_N(r) |_{r=0} = p exp{0} / (1 - (1-p) exp{0})² = p / (1-(1-p))² = p/p² = 1/p.

We observe that d/dr M_N(r) |_{r=0} = E[N], which holds in general for all random variables if the moment generating function exists in an interval around 0.

Solution 1.2 Absolutely Continuous Distribution

a) We calculate

P[Y ∈ ℝ] = ∫_{-∞}^∞ f_Y(x) dx = ∫_0^∞ λ exp{-λx} dx = [-exp{-λx}]_0^∞ = 0 - (-1) = 1,

from which we can conclude that the exponential distribution indeed defines a probability distribution on ℝ.

b) For 0 < y₁ < y₂, we calculate

P[y₁ ≤ Y ≤ y₂] = ∫_{y₁}^{y₂} f_Y(x) dx = ∫_{y₁}^{y₂} λ exp{-λx} dx = [-exp{-λx}]_{y₁}^{y₂} = exp{-λy₁} - exp{-λy₂}.

c) The expectation and the second moment of an absolutely continuous random variable can be calculated as E[Y] = ∫ x f_Y(x) dx and E[Y²] = ∫ x² f_Y(x) dx. Thus, using partial integration, we get

E[Y] = ∫_0^∞ x λ exp{-λx} dx = [-x exp{-λx}]_0^∞ + ∫_0^∞ exp{-λx} dx = 0 + [-(1/λ) exp{-λx}]_0^∞ = 1/λ.

The variance Var(Y) can be calculated as Var(Y) = E[Y²] - E[Y]² = E[Y²] - 1/λ². For the second moment E[Y²] we get, again using partial integration,

E[Y²] = ∫_0^∞ x² λ exp{-λx} dx = [-x² exp{-λx}]_0^∞ + ∫_0^∞ 2x exp{-λx} dx = (2/λ) E[Y] = 2/λ²,

from which we can conclude that Var(Y) = 2/λ² - 1/λ² = 1/λ². Note that for the exponential distribution both the expectation and the variance exist. The reason is that exp{-λx} goes much faster to 0 than x or x² go to infinity, for all λ > 0.

d) Let r ∈ ℝ. Then we calculate

E[exp{rY}] = ∫_0^∞ exp{rx} λ exp{-λx} dx = ∫_0^∞ λ exp{(r-λ)x} dx.

The integral on the right-hand side, and therefore also E[exp{rY}], exists if and only if r < λ. In this case we have

M_Y(r) = E[exp{rY}] = [λ/(r-λ) · exp{(r-λ)x}]_0^∞ = λ (0 - 1/(r-λ)) = λ/(λ-r),

and therefore log M_Y(r) = log(λ/(λ-r)).

e) For r < λ, we have

d²/dr² log M_Y(r) = d²/dr² log(λ/(λ-r)) = d²/dr² [log(λ) - log(λ-r)] = d/dr [1/(λ-r)] = 1/(λ-r)².

Hence we get

d²/dr² log M_Y(r) |_{r=0} = 1/(λ-0)² = 1/λ².

We observe that d²/dr² log M_Y(r) |_{r=0} = Var(Y), which holds in general for all random variables if the moment generating function exists in an interval around 0.

Solution 1.3 Conditional Distribution

a) For y > θ > 0, we get

P[Y > y] = P[Y > y, I = 0] + P[Y > y, I = 1] = P[Y > y | I = 0] P[I = 0] + P[Y > y | I = 1] P[I = 1] = 0 · (1-p) + P[Y > y | I = 1] · p = p P[Y > y | I = 1],

since Y | I = 0 is equal to 0 almost surely and thus P[Y > y | I = 0] = 0. Since Y | I = 1 ~ Pareto(θ, α), we can calculate

P[Y > y | I = 1] = ∫_y^∞ f_{Y|I=1}(x) dx = ∫_y^∞ (α/θ)(x/θ)^{-(α+1)} dx = [-(x/θ)^{-α}]_y^∞ = (y/θ)^{-α}.

We conclude that P[Y > y] = p (y/θ)^{-α}.

4 HS 207 Solution sheet since Y I = 0 is equal to 0 almost surely and thus PY y I = 0] = 0. Since Y I = Pareto, α), we can calculate α x ) α+) x ) ] α y ) α PY y I = ] = f Y I= x) dx = dx = =. y y y We conclude that y ) α PY y] = p. b) Using that Y I = 0 is equal to 0 almost surely and thus EY I = 0] = 0, we get EY ] = EY {I=0} ]+EY {I=} ] = EY I = 0]PI = 0]+EY I = ]PI = ] = p EY I = ]. Since Y I = Pareto, α), we can calculate EY I = ] = xf Y I= x) dx = x α x ) α+) dx = α α x α dx We see that the integral on the right hand side and therefore also EY ] exist if and only if α >. In this case we get EY I = ] = α α ] α x α ) = α α α α α ) = α. We conclude that, if α >, we get If 0 < α, EY ] does not exist. α EY ] = p α. Updated: September 9, / 4

ETH Zürich, D-MATH, HS 2017
Prof. Dr. Mario V. Wüthrich, Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 2

Solution 2.1 Gaussian Distribution

a) The moment generating function of a + bX can be calculated as

M_{a+bX}(r) = E[exp{r(a + bX)}] = exp{ra} E[exp{rbX}] = exp{ra} M_X(rb),

for all r ∈ ℝ. Using the formula for the moment generating function of X given on the exercise sheet, we get

M_{a+bX}(r) = exp{ra} exp{rbµ + (rb)²σ²/2} = exp{r(a + bµ) + r²b²σ²/2},

which is equal to the moment generating function of a Gaussian random variable with expectation a + bµ and variance b²σ². Since the moment generating function uniquely determines the distribution, we conclude that a + bX ~ N(a + bµ, b²σ²).

b) Using the independence of X₁, ..., X_n, the moment generating function of Y = Σ_{i=1}^n X_i can be calculated as

M_Y(r) = E[exp{rY}] = E[exp{r Σ_{i=1}^n X_i}] = Π_{i=1}^n E[exp{rX_i}] = Π_{i=1}^n M_{X_i}(r),

for all r ∈ ℝ. Using the formula for the moment generating function of a Gaussian random variable given on the exercise sheet, we get

M_Y(r) = Π_{i=1}^n exp{rµ_i + r²σ_i²/2} = exp{r Σ_{i=1}^n µ_i + (r²/2) Σ_{i=1}^n σ_i²},

which is equal to the moment generating function of a Gaussian random variable with expectation Σ_{i=1}^n µ_i and variance Σ_{i=1}^n σ_i². Since the moment generating function uniquely determines the distribution, we conclude that

Σ_{i=1}^n X_i ~ N(Σ_{i=1}^n µ_i, Σ_{i=1}^n σ_i²).

Solution 2.2 Maximum Likelihood and Hypothesis Test

a) Since log Y₁, ..., log Y₈ are independent random variables, the joint density f_{µ,σ²}(x₁, ..., x₈) of log Y₁, ..., log Y₈ is given by the product of the marginal densities of log Y₁, ..., log Y₈. We have

f_{µ,σ²}(x₁, ..., x₈) = Π_{i=1}^8 (1/√(2πσ²)) exp{-(x_i - µ)²/(2σ²)},

since log Y₁, ..., log Y₈ are Gaussian random variables with mean µ and variance σ².

Updated: September 23, 2017

b) By taking the logarithm, we get

log f_{µ,σ²}(x₁, ..., x₈) = Σ_{i=1}^8 [-(1/2) log(2π) - log(σ) - (x_i - µ)²/(2σ²)] = -4 log(2π) - 8 log(σ) - (1/(2σ²)) Σ_{i=1}^8 (x_i - µ)².

c) We have log f_{µ,σ²}(x₁, ..., x₈) < C - 8 log(σ) for a constant C and all µ ∈ ℝ. Hence, independently of µ, log f_{µ,σ²}(x₁, ..., x₈) → -∞ if σ² → ∞. Moreover, since for example x₁ ≠ x₂, there exists a c > 0 with Σ_{i=1}^8 (x_i - µ)² > c for all µ ∈ ℝ, and thus log f_{µ,σ²}(x₁, ..., x₈) < C - 8 log(σ) - c/(2σ²) for all µ ∈ ℝ. Since c/(2σ²) goes much faster to ∞ than -8 log(σ) if σ² → 0, we have log f_{µ,σ²}(x₁, ..., x₈) → -∞ if σ² → 0, independently of µ. Finally, if σ² ∈ [c₁, c₂] for some 0 < c₁ < c₂, we have log f_{µ,σ²}(x₁, ..., x₈) < C' - (1/(2c₂)) Σ_{i=1}^8 (x_i - µ)² for a constant C'. Hence, independently of the value of σ² in the interval [c₁, c₂], log f_{µ,σ²}(x₁, ..., x₈) → -∞ if |µ| → ∞. Since log f_{µ,σ²}(x₁, ..., x₈) is continuous in µ and σ², we can conclude that it attains its global maximum somewhere in ℝ × ℝ_{>0}. Thus µ̂ and σ̂² as defined on the exercise sheet have to satisfy the first-order conditions

∂/∂µ log f_{µ,σ²}(x₁, ..., x₈) |_{(µ,σ²)=(µ̂,σ̂²)} = 0 and ∂/∂(σ²) log f_{µ,σ²}(x₁, ..., x₈) |_{(µ,σ²)=(µ̂,σ̂²)} = 0.

We calculate

∂/∂µ log f_{µ,σ²}(x₁, ..., x₈) = (1/σ²) Σ_{i=1}^8 (x_i - µ),

which is equal to 0 if and only if µ = (1/8) Σ_{i=1}^8 x_i. Moreover, we have

∂/∂(σ²) log f_{µ,σ²}(x₁, ..., x₈) = -8/(2σ²) + (1/(2σ⁴)) Σ_{i=1}^8 (x_i - µ)² = (1/(2σ²)) [-8 + (1/σ²) Σ_{i=1}^8 (x_i - µ)²],

which is equal to 0 if and only if σ² = (1/8) Σ_{i=1}^8 (x_i - µ)². Since there is only one tuple in ℝ × ℝ_{>0} that satisfies the first-order conditions, we conclude that

µ̂ = (1/8) Σ_{i=1}^8 x_i = 7 and σ̂² = (1/8) Σ_{i=1}^8 (x_i - µ̂)² = (1/8) Σ_{i=1}^8 (x_i - 7)² = 7.

Note that the MLE σ̂² is not unbiased. Indeed, if we replace x₁, ..., x₈ by independent Gaussian random variables X₁, ..., X₈ with expectation µ ∈ ℝ and variance σ² > 0 and write µ̂ for (1/8) Σ_{i=1}^8 X_i, we can calculate

E[σ̂²] = E[σ̂²(X₁, ..., X₈)] = E[(1/8) Σ_{i=1}^8 (X_i - µ̂)²] = (1/8) E[Σ_{i=1}^8 (X_i² - 2X_i µ̂ + µ̂²)].

By noting that Σ_{i=1}^8 X_i = 8µ̂ and that E[X₁²] = ... = E[X₈²], we get

E[σ̂²] = (1/8) E[Σ_{i=1}^8 X_i² - 2·8µ̂² + 8µ̂²] = E[X₁²] - E[µ̂²] = σ² + E[X₁]² - Var(µ̂) - E[µ̂]².

By inserting

Var(µ̂) = Var((1/8) Σ_{i=1}^8 X_i) = (1/8)² Σ_{i=1}^8 Var(X_i) = (1/64) · 8σ² = σ²/8 and E[µ̂]² = ((1/8) Σ_{i=1}^8 E[X_i])² = E[X₁]²,

we can conclude that

E[σ̂²] = σ² + E[X₁]² - σ²/8 - E[X₁]² = (7/8) σ² ≠ σ²,

and hence σ̂² is not unbiased.

d) Since our data is assumed to follow a Gaussian distribution and the variance is unknown, we perform a t-test. The test statistic is given by

T = T(log Y₁, ..., log Y₈) = ((1/8) Σ_{i=1}^8 log Y_i - µ) / √(S²/8), where S² = (1/7) Σ_{i=1}^8 (log Y_i - (1/8) Σ_{j=1}^8 log Y_j)².

Under H₀, T follows a Student-t distribution with 7 degrees of freedom. With the data given on the exercise sheet, the random variable S² attains the value

(1/7) Σ_{i=1}^8 (x_i - 7)² = 8,

and thus for T we get the observation

((1/8) Σ_{i=1}^8 x_i - µ) / √(S²/8) = (7 - 6)/√(8/8) = 1,

where we use that µ = 6 under H₀. Now the probability under H₀ to observe a T that is at least as extreme as the observation we got above is

P[|T| ≥ 1] = P[T ≥ 1] + P[T ≤ -1] = 1 - P[T < 1] + P[T < -1] = 2 - 2P[T < 1],

where we used the symmetry of the Student-t distribution around 0. The probability P[T < 1] is approximately 0.83, thus the p-value is given by P[|T| ≥ 1] = 2 - 2P[T < 1] ≈ 0.34. This p-value is fairly high, hence we conclude that we cannot reject the null hypothesis, for example, at a significance level of 5% or 1%.

Solution 2.3 Variance Decomposition

By definition of the random variable X, the second moments exist. Hence, we have

E[Var(X | G)] = E[E[X² | G] - (E[X | G])²] = E[X²] - E[(E[X | G])²]

and

Var(E[X | G]) = E[(E[X | G])²] - E[E[X | G]]² = E[(E[X | G])²] - E[X]².

Combining these two results, we get

E[Var(X | G)] + Var(E[X | G]) = E[X²] - E[X]² = Var(X).
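The t-test arithmetic of Solution 2.2 d) can be retraced on a hypothetical sample that reproduces the summary statistics of the solution (sample mean 7 and sum of squared deviations 56; the actual data is on the exercise sheet):

```python
import math

# hypothetical sample with mean 7 and sum of squared deviations 56
x = [7 + math.sqrt(7)] * 4 + [7 - math.sqrt(7)] * 4
n = len(x)

mu_hat = sum(x) / n                                    # MLE of mu
sigma2_hat = sum((xi - mu_hat) ** 2 for xi in x) / n   # biased MLE of sigma^2
s2 = sum((xi - mu_hat) ** 2 for xi in x) / (n - 1)     # unbiased sample variance S^2

mu0 = 6                                                # H0: mu = 6
t_stat = (mu_hat - mu0) / math.sqrt(s2 / n)            # t-statistic, 7 degrees of freedom
```

With these statistics one obtains σ̂² = 7, S² = 8 and T = 1, matching the solution.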

ETH Zürich, D-MATH, HS 2017
Prof. Dr. Mario V. Wüthrich, Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 3

Solution 3.1 No-Claims Bonus

a) We define the following events:

A = {no claims in the last six years},
B = {no claims in the last three years but at least one claim in the last six years},
C = {at least one claim in the last three years}.

Note that since the events A, B and C are disjoint and cover all possible outcomes, we have P[A] + P[B] + P[C] = 1, i.e. it is sufficient to calculate two out of the three probabilities. Since the calculation of P[B] is slightly more involved, we will look at P[A] and P[C]. Let N₁, ..., N₆ be the number of claims of the last six years of our considered car driver, where N₆ corresponds to the most recent year. By assumption, N₁, ..., N₆ are independent Poisson random variables with frequency parameter λ = 0.2. Therefore, we can calculate

P[A] = P[N₁ = 0, ..., N₆ = 0] = Π_{i=1}^6 P[N_i = 0] = Π_{i=1}^6 exp{-λ} = exp{-6λ} = exp{-1.2}

and, similarly,

P[C] = 1 - P[C^c] = 1 - P[N₄ = 0, N₅ = 0, N₆ = 0] = 1 - exp{-3λ} = 1 - exp{-0.6}.

For the event B we get

P[B] = 1 - P[A] - P[C] = 1 - exp{-1.2} - (1 - exp{-0.6}) = exp{-0.6} - exp{-1.2}.

Thus the expected proportion q of the base premium that is still paid after the grant of the no-claims bonus is given by

q = 0.8 P[A] + 0.9 P[B] + P[C] = 0.8 exp{-1.2} + 0.9 (exp{-0.6} - exp{-1.2}) + 1 - exp{-0.6} ≈ 0.915.

If s denotes the surcharge on the base premium, then it has to satisfy the equation q (1 + s) · base premium = base premium, which leads to s = 1/q - 1. We conclude that the surcharge on the base premium is given by approximately 9.3%.

Updated: October 4, 2017
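The probabilities and the surcharge in a) can be computed directly; a minimal sketch:

```python
import math

lam = 0.2  # expected number of claims per year

p_a = math.exp(-6 * lam)      # P[A]: no claims in six years
p_c = 1 - math.exp(-3 * lam)  # P[C]: at least one claim in three years
p_b = 1 - p_a - p_c           # P[B] = exp(-0.6) - exp(-1.2)

# expected proportion of the base premium still paid after the bonus
q = 0.8 * p_a + 0.9 * p_b + p_c
s = 1 / q - 1                 # surcharge on the base premium
```

This reproduces q ≈ 0.915 and a surcharge s ≈ 9.3%.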

b) We use the same notation as in a). Since this time the calculation of P[B] is considerably more involved, we again look at P[A] and P[C]. By assumption, conditionally given Θ, N₁, ..., N₆ are independent Poisson random variables with frequency parameter Θλ, where λ = 0.2. Therefore, we can calculate

P[A] = P[N₁ = 0, ..., N₆ = 0] = E[P[N₁ = 0, ..., N₆ = 0 | Θ]] = E[Π_{i=1}^6 P[N_i = 0 | Θ]] = E[Π_{i=1}^6 exp{-Θλ}] = E[exp{-6Θλ}] = M_Θ(-6λ),

where M_Θ denotes the moment generating function of Θ. Since Θ ~ Γ(1, 1), M_Θ is given by

M_Θ(r) = 1/(1 - r),

for all r < 1, which leads to

P[A] = 1/(1 + 6λ) = 1/2.2 ≈ 0.4545.

Similarly, we get

P[C] = 1 - P[C^c] = 1 - P[N₄ = 0, N₅ = 0, N₆ = 0] = 1 - 1/(1 + 3λ) = 1 - 1/1.6 = 0.375.

For the event B we get

P[B] = 1 - P[A] - P[C] = 1/1.6 - 1/2.2 ≈ 0.1705.

Thus the expected proportion q of the base premium that is still paid after the grant of the no-claims bonus is given by

q = 0.8 P[A] + 0.9 P[B] + P[C] ≈ 0.8 · 0.4545 + 0.9 · 0.1705 + 0.375 ≈ 0.892.

We conclude that the surcharge s on the base premium is given by

s = 1/q - 1 ≈ 12.1%,

which is considerably bigger than in a).

Solution 3.2 Central Limit Theorem

Let σ² be the variance of the claim sizes and x > 0. Then we have

P[|(1/n) Σ_{i=1}^n Y_i - µ| < x/√n] = P[-x/√n < (1/n) Σ_{i=1}^n Y_i - µ < x/√n]
= P[-x/σ < (1/√n) Σ_{i=1}^n (Y_i - µ)/σ < x/σ]
= P[Z_n < x/σ] - P[Z_n ≤ -x/σ],

where

Z_n = (1/√n) Σ_{i=1}^n (Y_i - µ)/σ.

According to the Central Limit Theorem, Z_n converges in distribution to a standard Gaussian random variable. Hence, if we write Φ for the distribution function of a standard Gaussian random variable, we have the approximation

P[|(1/n) Σ_{i=1}^n Y_i - µ| < x/√n] ≈ Φ(x/σ) - Φ(-x/σ).

On the one hand, as we are interested in a probability of at least 95%, we have to choose x > 0 such that

Φ(x/σ) - Φ(-x/σ) = 0.95,

which implies x/σ = 1.96. It follows that

x = 1.96 σ = 1.96 Vco(Y) µ = 1.96 · 4µ.   (1)

On the other hand, as we want the deviation of the empirical mean from µ to be less than 1%, we set

x/√n = 0.01 µ,   (2)

which implies √n = x/(0.01 µ). Combining (1) and (2), we conclude

n = (1.96 · 4µ / (0.01 µ))² = 784² = 614'656.

Solution 3.3 Compound Binomial Distribution

For S̃ ~ CompBinom(ṽ, p̃, G̃) with the random variable Ỹ having distribution function G̃ and moment generating function M_Ỹ, the moment generating function M_S̃ of S̃ is given by

M_S̃(r) = (p̃ M_Ỹ(r) + 1 - p̃)^ṽ,

for all r ∈ ℝ for which M_Ỹ is defined. We calculate the moment generating function of S_lc and show that it is exactly of the form given above. Since S_lc ≥ 0 almost surely, its moment generating function is defined at least for all r < 0. Thus, for r < 0, we have

M_{S_lc}(r) = E[exp{r Σ_{i=1}^N Y_i 1_{Y_i > M}}] = E[Π_{i=1}^N exp{r Y_i 1_{Y_i > M}}] = E[E[Π_{i=1}^N exp{r Y_i 1_{Y_i > M}} | N]] = E[E[exp{r Y_1 1_{Y_1 > M}}]^N],

where in the third equality we used the tower property of conditional expectation and in the fourth equality the independence between N and the Y_i. For the inner expectation we get

E[exp{r Y_1 1_{Y_1 > M}}] = E[exp{r Y_1} 1_{Y_1 > M} + 1_{Y_1 ≤ M}]
= E[exp{r Y_1} | Y_1 > M] P[Y_1 > M] + P[Y_1 ≤ M]
= E[exp{r Y_1} | Y_1 > M] (1 - G(M)) + G(M).

First note that the distribution function of the random variable Y_1 | Y_1 > M is G_lc. Moreover, since Y_1 | Y_1 > M is greater than 0 almost surely, its moment generating function M_{Y|Y>M} is defined for all r < 0, and thus we can write

E[exp{r Y_1 1_{Y_1 > M}}] = M_{Y|Y>M}(r) (1 - G(M)) + G(M).

Hence we get

M_{S_lc}(r) = E[(M_{Y|Y>M}(r) (1 - G(M)) + G(M))^N] = E[exp{N log(M_{Y|Y>M}(r) (1 - G(M)) + G(M))}] = M_N(ρ),

where M_N is the moment generating function of N and ρ = log(M_{Y|Y>M}(r) (1 - G(M)) + G(M)). Since we have N ~ Binom(v, p), M_N(r) is given by

M_N(r) = (p exp{r} + 1 - p)^v.

Therefore, we get

M_{S_lc}(r) = (p (M_{Y|Y>M}(r) (1 - G(M)) + G(M)) + 1 - p)^v = (p (1 - G(M)) M_{Y|Y>M}(r) + 1 - p (1 - G(M)))^v.

Applying Lemma 1.3 of the lecture notes, we conclude that S_lc ~ CompBinom(ṽ, p̃, G̃) with ṽ = v, p̃ = p (1 - G(M)) and G̃ = G_lc.
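The thinning result of Solution 3.3 can be checked by simulation. The sketch below uses a hypothetical discrete claim distribution and hypothetical parameters (none of them from the exercise sheet) and compares the Monte Carlo mean of S_lc with the compound-binomial mean v · p̃ · E[Y | Y > M].

```python
import random

random.seed(1)

# hypothetical claim distribution and parameters, only for illustration
values, probs = [1.0, 2.0, 5.0], [0.5, 0.3, 0.2]
v, p, M = 10, 0.4, 2.0

# theoretical compound-binomial parameters of the large-claims layer
g_M = sum(pr for val, pr in zip(values, probs) if val <= M)   # G(M)
p_tilde = p * (1 - g_M)                                       # thinned claim probability
mean_lc = sum(val * pr for val, pr in zip(values, probs) if val > M) / (1 - g_M)
expected = v * p_tilde * mean_lc                              # E[S_lc]

sims = 100_000
total = 0.0
for _ in range(sims):
    n = sum(random.random() < p for _ in range(v))            # N ~ Binom(v, p)
    for _ in range(n):
        y = random.choices(values, probs)[0]
        if y > M:                                             # keep only large claims
            total += y
mc_mean = total / sims
```

Here E[S_lc] = 10 · 0.08 · 5 = 4, and the simulated mean should agree within Monte Carlo error.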

ETH Zürich, D-MATH, HS 2017
Prof. Dr. Mario V. Wüthrich, Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 4

Solution 4.1 Poisson Model and Negative-Binomial Model

a) Let v = v₁ = ... = v₁₀ = 10'000. In the Poisson model we assume that N₁, ..., N₁₀ are independent with N_t ~ Poi(λ v_t) for all t ∈ {1, ..., 10}. We use Estimator 2.32 of the lecture notes to estimate the claims frequency parameter λ by

λ̂₁₀^MLE = Σ_{t=1}^{10} N_t / Σ_{t=1}^{10} v_t = (1/(10v)) Σ_{t=1}^{10} N_t = 10'224/100'000 ≈ 10.22%.

Note that a random variable N ~ Poi(λv) can be understood as

N =(d) Σ_{i=1}^v N_i,   (1)

where N₁, ..., N_v are independent random variables that all follow a Poi(λ)-distribution. If we define λ̂ = N/v, then we have

E[λ̂] = E[N/v] = E[N]/v = λv/v = λ,

hence λ̂ can be seen as an estimator for λ. Moreover, we have

Var(λ̂) = Var(N/v) = Var(N)/v² = λv/v² = λ/v

and, because of (1), we can use the Central Limit Theorem to get

(N/v - E[N/v]) / √Var(N/v) = (λ̂ - λ)/√(λ/v) → Z, as v → ∞,

where Z is a random variable following a standard normal distribution. Hence, we have the approximation

P[λ̂ - √(λ/v) ≤ λ ≤ λ̂ + √(λ/v)] = P[-1 ≤ (λ̂ - λ)/√(λ/v) ≤ 1] ≈ P[-1 ≤ Z ≤ 1] ≈ 0.7,

i.e. with a probability of roughly 70%, λ lies in the interval [λ̂ - √(λ/v), λ̂ + √(λ/v)]. Since a confidence interval for λ is not allowed to depend on λ itself, we also replace it by the estimator λ̂ to get an approximate, roughly 70%-confidence interval [λ̂ - √(λ̂/v), λ̂ + √(λ̂/v)] for λ. If we look at the estimator λ̂₁₀^MLE as the random variable Σ_{t=1}^{10} N_t/(10v), we see that

E[λ̂₁₀^MLE] = Σ_{t=1}^{10} E[N_t]/(10v) = Σ_{t=1}^{10} λ v_t/(10v) = λ = E[λ̂]

and

Var(λ̂₁₀^MLE) = Σ_{t=1}^{10} Var(N_t)/(10v)² = Σ_{t=1}^{10} λ v_t/(10v)² = λ/(10v) < λ/v = Var(λ̂).

Updated: October 10, 2017

Because of the smaller variance it makes sense to replace λ̂ by λ̂₁₀^MLE, to get the approximate, roughly 70%-confidence interval

[λ̂₁₀^MLE - √(λ̂₁₀^MLE/v), λ̂₁₀^MLE + √(λ̂₁₀^MLE/v)] ≈ [9.90%, 10.54%]

for λ. If we define λ_t = N_t/v_t for all t ∈ {1, ..., 10}, we have the following observations λ₁, ..., λ₁₀ of the frequency parameter λ:

t              1       2       3       4       5       6       7       8       9       10
λ_t = N_t/v_t  10%     9.97%   9.85%   9.89%   10.56%  10.70%  9.94%   9.86%   10.93%  10.54%

Table 1: Observed claims frequencies λ_t = N_t/v_t.

We observe that instead of the expected, roughly seven observations, only four observations lie in the estimated confidence interval. We conclude that the assumption of having Poisson distributions might not be reasonable.

b) By equation (2.8) of the lecture notes, the test statistic χ̂² is given by

χ̂² = Σ_{t=1}^{10} v_t (N_t/v_t - λ̂₁₀^MLE)² / λ̂₁₀^MLE

and is approximately χ²-distributed with 10 - 1 = 9 degrees of freedom. By inserting the numbers of Table 1 and λ̂₁₀^MLE calculated in a), we get χ̂² ≈ 14.84. The probability that a random variable with a χ²-distribution with 9 degrees of freedom is greater than 14.84 is approximately equal to 9.55%. Hence we can reject the null hypothesis of having Poisson distributions only at significance levels that are higher than 9.55%. In particular, we cannot reject the null hypothesis at the significance level of 5%.

c) As in part a), let v = v₁ = ... = v₁₀ = 10'000. In the negative-binomial model we assume that N₁, ..., N₁₀ are independent with N_t ~ Poi(Θ_t λ v_t) for all t ∈ {1, ..., 10}, where Θ₁, ..., Θ₁₀ are i.i.d. ~ Γ(γ, γ) for some γ > 0. We use Estimator 2.28 of the lecture notes to estimate the claims frequency parameter λ by

λ̂₁₀^NB = Σ_{t=1}^{10} N_t / Σ_{t=1}^{10} v_t = (1/(10v)) Σ_{t=1}^{10} N_t ≈ 10.22%.

As in equation (2.7) of the lecture notes, we define

V̂₁₀² = (1/9) Σ_{t=1}^{10} v_t (N_t/v_t - λ̂₁₀^NB)² ≈ 16.9%.

Now we can use Estimator 2.30 of the lecture notes to estimate the dispersion parameter γ by

γ̂₁₀^NB = (λ̂₁₀^NB)²/(V̂₁₀² - λ̂₁₀^NB) · (1/9)(Σ_{t=1}^{10} v_t - Σ_{t=1}^{10} v_t² / Σ_{t=1}^{10} v_t) = (λ̂₁₀^NB)²/(V̂₁₀² - λ̂₁₀^NB) · (1/9)(10v - 10v²/(10v)) = (λ̂₁₀^NB)² v/(V̂₁₀² - λ̂₁₀^NB) ≈ 1'576.
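The estimates of parts a) and b) can be recomputed from the frequencies in Table 1 with v = 10'000:

```python
import math

# observed claims frequencies N_t / v_t from Table 1
freqs = [0.1000, 0.0997, 0.0985, 0.0989, 0.1056,
         0.1070, 0.0994, 0.0986, 0.1093, 0.1054]
v = 10_000
claims = [round(f * v) for f in freqs]                 # N_1, ..., N_10

lam_mle = sum(claims) / (10 * v)                       # MLE of lambda
half_width = math.sqrt(lam_mle / v)                    # sqrt(lambda_hat / v)
lo, hi = lam_mle - half_width, lam_mle + half_width    # rough 70% interval
inside = sum(lo <= f <= hi for f in freqs)             # observations in the interval

# chi-square test statistic of part b)
chi2_stat = sum(v * (f - lam_mle) ** 2 / lam_mle for f in freqs)
```

This yields λ̂ = 10.224%, the interval [9.90%, 10.54%] containing four observations, and χ̂² ≈ 14.84.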

For a random variable N that, conditionally given Θ, is Poi(Θλv)-distributed, we have

E[N/v] = E[N]/v = E[E[N | Θ]]/v = E[Θλv]/v = λ,

since E[Θ] = 1, and

Var(N/v) = (E[Var(N | Θ)] + Var(E[N | Θ]))/v² = (E[Θλv] + Var(Θλv))/v² = (λv + λ²v²/γ)/v² = λ/v + λ²/γ,

since Var(Θ) = 1/γ. Similarly as in the Poisson case in part a), we get the approximate, roughly 70%-confidence interval

[λ̂₁₀^NB - √(λ̂₁₀^NB/v + (λ̂₁₀^NB)²/γ̂₁₀^NB), λ̂₁₀^NB + √(λ̂₁₀^NB/v + (λ̂₁₀^NB)²/γ̂₁₀^NB)] ≈ [9.81%, 10.63%]

for λ. Looking at the observations λ₁, ..., λ₁₀ given in Table 1 above, we see that eight of them lie in the estimated confidence interval, which is clearly better than in the Poisson case in part a). In conclusion, the negative-binomial model seems more reasonable than the Poisson model.

Solution 4.2 Compound Poisson Distribution

a) Since S ~ CompPoi(λv, G), we can write S as S = Σ_{i=1}^N Y_i, where N ~ Poi(λv), Y₁, Y₂, ... are i.i.d. with distribution function G, and N and Y₁, Y₂, ... are independent. Writing M for the large-claims threshold given on the exercise sheet, we can define S_sc, S_mc and S_lc as

S_sc = Σ_{i=1}^N Y_i 1_{Y_i ≤ 1'000}, S_mc = Σ_{i=1}^N Y_i 1_{1'000 < Y_i ≤ M} and S_lc = Σ_{i=1}^N Y_i 1_{Y_i > M}.

b) Note that according to Table 2 given on the exercise sheet, we have

P[Y ≤ 1'000] = P[Y = 100] + P[Y = 300] + P[Y = 500] = 3/20 + 4/20 + 3/20 = 1/2,
P[1'000 < Y ≤ M] = 2/15 + 2/15 + 1/15 = 1/3 (the three middle claim sizes of Table 2, among them 6'000), and
P[Y > M] = 1 - 1/2 - 1/3 = 1/6.

Thus, using Theorem 2.14 of the lecture notes (disjoint decomposition of compound Poisson distributions), we get

S_sc ~ CompPoi(λv/2, G_sc), S_mc ~ CompPoi(λv/3, G_mc) and S_lc ~ CompPoi(λv/6, G_lc),

where

G_sc(y) = P[Y ≤ y | Y ≤ 1'000], G_mc(y) = P[Y ≤ y | 1'000 < Y ≤ M] and G_lc(y) = P[Y ≤ y | Y > M]

for all y ∈ ℝ. In particular, for a random variable Y_sc having distribution function G_sc, we have

P[Y_sc = 100] = P[Y = 100]/P[Y ≤ 1'000] = (3/20)/(1/2) = 3/10,
P[Y_sc = 300] = P[Y = 300]/P[Y ≤ 1'000] = (4/20)/(1/2) = 4/10 and
P[Y_sc = 500] = P[Y = 500]/P[Y ≤ 1'000] = (3/20)/(1/2) = 3/10.

Analogously, for random variables Y_mc and Y_lc having distribution functions G_mc and G_lc, respectively, we get P[Y_mc = 6'000] = (2/15)/(1/3) = 2/5, the other two middle claim sizes of Table 2 receiving conditional probabilities 2/5 and 1/5, while the three large claim sizes of Table 2 receive conditional probabilities 1/2, 1/4 and 1/4.

c) According to Theorem 2.14 of the lecture notes, S_sc, S_mc and S_lc are independent.

d) In order to find E[S_sc], we need E[Y_sc], which can be calculated as

E[Y_sc] = 100 · P[Y_sc = 100] + 300 · P[Y_sc = 300] + 500 · P[Y_sc = 500] = 30 + 120 + 150 = 300.

Now we can apply Proposition 2.11 of the lecture notes to get

E[S_sc] = (λv/2) E[Y_sc] = 0.3 · 300 = 90,

using λv = 0.6. Similarly, computing E[Y_mc] and E[Y_lc] from the middle and large claim sizes of Table 2, we find

E[S_mc] = (λv/3) E[Y_mc] and E[S_lc] = (λv/6) E[Y_lc].

Since S = S_sc + S_mc + S_lc, we get E[S] = E[S_sc] + E[S_mc] + E[S_lc].

In order to find Var(S_sc), we need E[Y_sc²], which can be calculated as

E[Y_sc²] = 100² · P[Y_sc = 100] + 300² · P[Y_sc = 300] + 500² · P[Y_sc = 500] = 3'000 + 36'000 + 75'000 = 114'000.

Now we can apply Proposition 2.11 of the lecture notes to get

Var(S_sc) = (λv/2) E[Y_sc²] = 0.3 · 114'000 = 34'200.

Similarly, computing E[Y_mc²] and E[Y_lc²] from Table 2, we find

Var(S_mc) = (λv/3) E[Y_mc²] and Var(S_lc) = (λv/6) E[Y_lc²].

Since S = S_sc + S_mc + S_lc and S_sc, S_mc and S_lc are independent, we get

Var(S) = Var(S_sc) + Var(S_mc) + Var(S_lc).

e) First, we define the random variable N_lc ~ Poi(λv/6) = Poi(0.1). The probability that the total claim in the large claims layer exceeds 5 millions can be calculated by looking at the complement, i.e. at the probability that the total claim in the large claims layer does not exceed 5 millions. Since with three claims in the large claims layer we already exceed 5 millions, it is enough to consider only up to two claims. Writing y_min for the smallest claim size in the large claims layer (two such claims stay below 5 millions, whereas any other combination of two large claims exceeds it), we get

P[S_lc ≤ 5'000'000] = P[N_lc = 0] + P[N_lc = 1] P[Y_lc ≤ 5'000'000] + P[N_lc = 2] P[Y_lc = y_min]²
= exp{-λv/6} (1 + (λv/6) · 3/4 + ((λv/6)²/2) · (1/2)²)
= exp{-0.1} (1 + 0.075 + 0.00125) ≈ 97.4%.

Hence we can conclude

P[S_lc > 5'000'000] = 1 - P[S_lc ≤ 5'000'000] ≈ 2.6%.

Solution 4.3 Method of Moments

If Y ~ Γ(γ, c), then we have E[Y] = γ/c and Var(Y) = γ/c². We define the sample mean µ̂₈ and the sample variance σ̂₈² of the eight observations given on the exercise sheet as

µ̂₈ = (1/8) Σ_{i=1}^8 x_i = 8 and σ̂₈² = (1/8) Σ_{i=1}^8 (x_i - µ̂₈)² = 4.

The method of moments estimates (γ̂, ĉ) of (γ, c) are defined to be those values that solve the equations

µ̂₈ = γ̂/ĉ and σ̂₈² = γ̂/ĉ².

We see that γ̂ = µ̂₈ ĉ and thus

σ̂₈² = µ̂₈ ĉ/ĉ² = µ̂₈/ĉ,

which is equivalent to

ĉ = µ̂₈/σ̂₈² = 8/4 = 2.

Moreover, we get

γ̂ = µ̂₈²/σ̂₈² = 64/4 = 16.

Thus we conclude that the method of moments estimates are given by (γ̂, ĉ) = (16, 2).
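The moment-matching in Solution 4.3 can be retraced on a hypothetical sample that reproduces the summary statistics of the solution (sample mean 8 and sample variance 4; the actual data is on the exercise sheet):

```python
# hypothetical observations with mean 8 and (biased) sample variance 4
x = [6.0, 6.0, 10.0, 10.0, 6.0, 6.0, 10.0, 10.0]
n = len(x)

mu8 = sum(x) / n                                     # sample mean
sigma2_8 = sum((xi - mu8) ** 2 for xi in x) / n      # sample variance

# solve mu8 = gamma/c and sigma2_8 = gamma/c^2 for (gamma, c)
c_hat = mu8 / sigma2_8
gamma_hat = mu8 ** 2 / sigma2_8
```

This reproduces (γ̂, ĉ) = (16, 2).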

ETH Zürich, D-MATH, HS 2017
Prof. Dr. Mario V. Wüthrich, Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 5

Solution 5.1 Kolmogorov-Smirnov Test

The distribution function G₀ of a Weibull distribution with shape parameter τ = 1/2 and scale parameter c = 1 is given by

G₀(y) = 1 - exp{-y^{1/2}}

for all y ≥ 0. Note that since G₀ is continuous, we are allowed to apply a Kolmogorov-Smirnov test. If x = (log u)² for some u ∈ (0, 1), we have

G₀(x) = 1 - exp{-[(log u)²]^{1/2}} = 1 - exp{log u} = 1 - u.

Hence, if we apply G₀ to x₁, ..., x₅, we get

G₀(x₁) = 2/40, G₀(x₂) = 3/40, G₀(x₃) = 5/40, G₀(x₄) = 6/40, G₀(x₅) = 30/40.

Moreover, the empirical distribution function Ĝ₅ of the sample x₁, ..., x₅ is given by

Ĝ₅(y) = 0 if y < x₁, 1/5 if x₁ ≤ y < x₂, 2/5 if x₂ ≤ y < x₃, 3/5 if x₃ ≤ y < x₄, 4/5 if x₄ ≤ y < x₅, 1 if y ≥ x₅.

Now the Kolmogorov-Smirnov test statistic D₅ is defined as

D₅ = sup_{y ∈ ℝ} |Ĝ₅(y) - G₀(y)|.

Since G₀ is continuous and strictly monotonically increasing and Ĝ₅ is piecewise constant and attains both the values 0 and 1, it is sufficient to consider the discontinuities of Ĝ₅ to find D₅. We define f(s-) = lim_{r↗s} f(r) for all s ∈ ℝ, where the function f stands for G₀ and Ĝ₅. Since G₀ is continuous, we have G₀(s-) = G₀(s) for all s ∈ ℝ. The values of G₀ and Ĝ₅ and their differences can be summarized in the following table (all values in units of 1/40):

              x₁-   x₁    x₂-   x₂    x₃-   x₃    x₄-   x₄    x₅-   x₅
Ĝ₅            0     8     8     16    16    24    24    32    32    40
G₀            2     2     3     3     5     5     6     6     30    30
|Ĝ₅ - G₀|     2     6     5     13    11    19    18    26    2     10

From this table we see that D₅ = 26/40 = 0.65. Let q = 5%. By writing K(1-q) for the (1-q)-quantile of the Kolmogorov distribution, we have K(1-q) = 1.36, and the corresponding critical value for D₅ is K(1-q)/√5 ≈ 0.61. Since K(1-q)/√5 ≈ 0.61 < 0.65 = D₅, we can reject the null hypothesis of having a Weibull distribution with shape parameter τ = 1/2 and scale parameter c = 1 as claim size distribution.

Updated: October 8, 2017
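The table above can be generated programmatically: for each ordered observation one compares G₀ with the empirical distribution function just before and at the jump.

```python
# G0 evaluated at the five ordered observations, as computed in the solution
g0 = [2 / 40, 3 / 40, 5 / 40, 6 / 40, 30 / 40]
n = len(g0)

# Kolmogorov-Smirnov statistic: the empirical distribution function equals
# i/n just before the i-th jump and (i+1)/n at the jump
d_n = max(max(abs(i / n - g), abs((i + 1) / n - g)) for i, g in enumerate(g0))

critical = 1.36 / n ** 0.5   # approximate 5% critical value for D_n
reject = d_n > critical
```

This reproduces D₅ = 0.65 > 0.61, so the null hypothesis is rejected.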

Solution 5.2 Large Claims

a) The density of a Pareto distribution with threshold θ = 50 and tail index α > 0 is given by

f(x) = f_α(x) = (α/θ)(x/θ)^{-(α+1)}

for all x ≥ θ. Using the independence of Y₁, ..., Y_n, the joint likelihood function L_Y(α) for the observation Y = (Y₁, ..., Y_n) can be written as

L_Y(α) = Π_{i=1}^n f_α(Y_i) = Π_{i=1}^n (α/θ)(Y_i/θ)^{-(α+1)} = α^n θ^{nα} Π_{i=1}^n Y_i^{-(α+1)},

whereas the joint log-likelihood function is given by

l_Y(α) = log L_Y(α) = n log α + nα log θ - (α+1) Σ_{i=1}^n log Y_i.

Now the MLE α̂_n^MLE is defined as

α̂_n^MLE = argmax_{α>0} L_Y(α) = argmax_{α>0} l_Y(α).

Calculating the first and the second derivative of l_Y(α) with respect to α, we get

∂/∂α l_Y(α) = n/α + n log θ - Σ_{i=1}^n log Y_i and ∂²/∂α² l_Y(α) = -n/α² < 0,

for all α > 0, from which we can conclude that l_Y(α) is strictly concave in α. Thus α̂_n^MLE can be found by setting the first derivative of l_Y(α) equal to 0. We get

n/α̂_n^MLE + n log θ - Σ_{i=1}^n log Y_i = 0 ⟺ α̂_n^MLE = ((1/n) Σ_{i=1}^n (log Y_i - log θ))^{-1}.

b) Let α̂ denote the unbiased version of the MLE for the storm and flood data given on the exercise sheet. Since we observed 15 storm and flood events, we have n = 15. Thus α̂ can be calculated as

α̂ = ((n-1)/n) α̂_n^MLE = (n-1) (Σ_{i=1}^n (log Y_i - log 50))^{-1} ≈ 0.98,

where for Y₁, ..., Y₁₅ we plugged in the observed claim sizes given on the exercise sheet. Note that with α̂ = 0.98 < 1, the expectation of the claim sizes does not exist.

c) We define N₁, ..., N₂₀ to be the number of yearly storm and flood events during the twenty observation years given on the exercise sheet. By assumption, we have N₁, ..., N₂₀ i.i.d. ~ Poi(λ). Using Estimator 2.32 of the lecture notes with v₁ = ... = v₂₀ = 1, the MLE λ̂ of λ is given by

λ̂ = (1/20) Σ_{i=1}^{20} N_i.

Since we observed 15 storm and flood events in total, we get λ̂ = 15/20 = 0.75.

d) Using Proposition 2.11 of the lecture notes, the expected yearly claim amount E[S] of storm and flood events is given by E[S] = λ E[min{Y, M}]. The expected value of min{Y, M} can be calculated as

E[min{Y, M}] = E[min{Y, M} 1_{Y ≤ M}] + E[min{Y, M} 1_{Y > M}] = E[Y 1_{Y ≤ M}] + M P[Y > M],

where for E[Y 1_{Y ≤ M}] and M P[Y > M] we have

E[Y 1_{Y ≤ M}] = ∫_θ^M x (α/θ)(x/θ)^{-(α+1)} dx = α θ^α [x^{1-α}/(1-α)]_θ^M = (α/(α-1)) θ (1 - (M/θ)^{1-α})

and

M P[Y > M] = M (M/θ)^{-α} = θ (M/θ)^{1-α}.

Hence we get

E[min{Y, M}] = (α/(α-1)) θ (1 - (M/θ)^{1-α}) + θ (M/θ)^{1-α} = θ (α - (M/θ)^{1-α}) / (α-1).

Replacing the unknown parameters by their estimates, and measuring the threshold θ = 50 and the maximal claim M = 2'000 in millions CHF, we get for the estimated expected total yearly claim amount Ê[S]:

Ê[S] = λ̂ θ (α̂ - (M/θ)^{1-α̂}) / (α̂-1) = 0.75 · 50 · (0.98 - 40^{0.02}) / (0.98 - 1) ≈ 181,

i.e. roughly 181 millions CHF.

e) Since S ~ CompPoi(λ, G), we can write S as S = Σ_{i=1}^N Y_i, where N ~ Poi(λ), Y₁, Y₂, ... are i.i.d. with distribution function G, and N and Y₁, Y₂, ... are independent. Since we are only interested in events that exceed the level of M = 2 billions CHF, we define S_M as

S_M = Σ_{i=1}^N Y_i 1_{Y_i > M}.

Due to Theorem 2.14 of the lecture notes, we have S_M ~ CompPoi(λ_M, G_M) for some distribution function G_M and

λ_M = λ P[Y > M] = λ (1 - P[Y ≤ M]) = λ (M/θ)^{-α}.

Defining a random variable N_M ~ Poi(λ_M), the probability that we observe at least one such storm and flood event in a particular year is given by

P[N_M ≥ 1] = 1 - P[N_M = 0] = 1 - exp{-λ_M} = 1 - exp{-λ (M/θ)^{-α}}.

If we replace the unknown parameters by their estimates, this probability can be estimated by

P̂[N_M ≥ 1] = 1 - exp{-λ̂ (M/θ)^{-α̂}} = 1 - exp{-0.75 · 40^{-0.98}} ≈ 2%.

Note that in particular such a storm and flood event that exceeds the level of 2 billions CHF is expected roughly every 1/0.02 = 50 years.

Solution 5.3 Pareto Distribution

The density g and the distribution function G of Y are given by

g(x) = (α/θ)(x/θ)^{-(α+1)} and G(x) = 1 - (x/θ)^{-α}

for all x ≥ θ.

a) The survival function Ḡ = 1 - G of Y is

Ḡ(x) = 1 - G(x) = (x/θ)^{-α}

for all x ≥ θ. Hence, for all t > 0 we have

lim_{x→∞} Ḡ(xt)/Ḡ(x) = lim_{x→∞} (xt/θ)^{-α} / (x/θ)^{-α} = t^{-α}.

Thus, by definition, the survival function of Y is regularly varying at infinity with tail index α.

b) Let θ ≤ u₁ < u₂ and α > 0. Then the expected value of Y within the layer (u₁, u₂] can be calculated as

E[Y 1_{u₁ < Y ≤ u₂}] = ∫ x 1_{u₁ < x ≤ u₂} g(x) dx = ∫_{u₁}^{u₂} x (α/θ)(x/θ)^{-(α+1)} dx = α θ^α ∫_{u₁}^{u₂} x^{-α} dx.

In the case α ≠ 1, we get

E[Y 1_{u₁ < Y ≤ u₂}] = α θ^α [x^{1-α}/(1-α)]_{u₁}^{u₂} = (α/(α-1)) θ [(u₁/θ)^{-α+1} - (u₂/θ)^{-α+1}],

and if α = 1, we get

E[Y 1_{u₁ < Y ≤ u₂}] = θ ∫_{u₁}^{u₂} x^{-1} dx = θ log(u₂/u₁).

c) Let α > 1 and y > θ. Then the expected value µ_Y of Y is given by

µ_Y = α θ/(α-1)

and, similarly as in part b), we get

E[Y 1_{Y ≤ y}] = E[Y 1_{θ < Y ≤ y}] = (α/(α-1)) θ [1 - (y/θ)^{-α+1}] = µ_Y [1 - (y/θ)^{-α+1}].

Hence, for the loss size index function for level y > θ we have

I[G](y) = E[Y 1_{Y ≤ y}]/µ_Y = 1 - (y/θ)^{-α+1} ∈ (0, 1].

d) Let α > 1 and u > θ. The mean excess function of Y above u can be calculated as

e(u) = E[Y - u | Y > u] = E[Y | Y > u] - u = E[Y 1_{Y > u}]/P[Y > u] - u = E[Y 1_{Y > u}]/Ḡ(u) - u,

where for E[Y 1_{Y > u}] we have, similarly as in part b),

E[Y 1_{Y > u}] = (α/(α-1)) θ (u/θ)^{-α+1} = (α/(α-1)) u (u/θ)^{-α} = (α/(α-1)) u Ḡ(u).

Thus we get

e(u) = (α/(α-1)) u - u = u/(α-1).

Note that the mean excess function u ↦ e(u) has slope 1/(α-1) > 0.
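The Pareto mean excess formula e(u) = u/(α-1) can be verified numerically, using the identity E[(Y-u)₊] = ∫_u^∞ Ḡ(x) dx. The sketch below uses hypothetical parameters θ = 1 and α = 3 (not from the exercise sheet).

```python
theta, alpha = 1.0, 3.0   # hypothetical Pareto parameters with alpha > 1

def survival(x):
    # Pareto survival function for x >= theta
    return (x / theta) ** (-alpha)

def mean_excess(u, n=400_000, upper=2_000.0):
    # e(u) = E[Y - u | Y > u] = integral_u^upper survival(x) dx / survival(u),
    # the neglected tail beyond `upper` is negligible for alpha = 3
    h = (upper - u) / n
    integral = sum(survival(u + (i + 0.5) * h) for i in range(n)) * h
    return integral / survival(u)

u = 5.0
e_numeric = mean_excess(u)
e_closed = u / (alpha - 1)   # linear in u with slope 1/(alpha-1)
```

For u = 5 and α = 3 both give e(u) = 2.5.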

ETH Zürich, D-MATH, HS 2017
Prof. Dr. Mario V. Wüthrich, Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 6

Solution 6.1 Goodness-of-Fit Test

Let Y be a random variable following a Pareto distribution with threshold θ = 200 and tail index α = 1.25. Then the distribution function G of Y is given by

G(x) = 1 - (x/θ)^{-α} = 1 - (x/200)^{-1.25}

for all x ≥ 200. For example for the interval I₂ we then have

P[Y ∈ I₂] = P[239 ≤ Y < 301] = G(301) - G(239) ≈ 0.2.

By analogous calculations for the other four intervals, we get

P[Y ∈ I₁] ≈ 0.2, P[Y ∈ I₂] ≈ 0.2, P[Y ∈ I₃] ≈ 0.2, P[Y ∈ I₄] ≈ 0.2, P[Y ∈ I₅] ≈ 0.2.

Let E_i and O_i denote respectively the expected number of observations in I_i and the observed number of observations in I_i, for all i ∈ {1, ..., 5}. As we have 20 observations in our data, we can calculate for example E₂ as E₂ = 20 P[Y ∈ I₂] ≈ 4; the observed counts O_i are given on the exercise sheet. Now the test statistic of the χ²-goodness-of-fit test using 5 intervals and 20 observations is given by

X²_{20,5} = Σ_{i=1}^5 (O_i - E_i)²/E_i = 10.

Let α = 5%. Then the (1-α)-quantile of the χ²-distribution with 5 - 1 = 4 degrees of freedom is given by approximately 9.49. Since this is smaller than X²_{20,5}, we can reject the null hypothesis of having a Pareto distribution with threshold θ = 200 and tail index α = 1.25 as claim size distribution at the significance level of 5%.

Solution 6.2 Log-Normal Distribution and Deductible

a) Let X ~ N(µ, σ²). Then the moment generating function M_X of X is given by

M_X(r) = E[exp{rX}] = exp{rµ + r²σ²/2}

Updated: October 24, 2017

for all r ∈ ℝ. Since Y has a log-normal distribution with mean parameter µ and variance parameter σ², we have Y =(d) exp{X}. Hence, the expectation, the variance and the coefficient of variation of Y can be calculated as

E[Y] = E[exp{X}] = E[exp{1·X}] = M_X(1) = exp{µ + σ²/2},

Var(Y) = E[Y²] - E[Y]² = E[exp{2X}] - M_X(1)² = M_X(2) - M_X(1)² = exp{2µ + 2σ²} - exp{2µ + σ²} = exp{2µ + σ²}(exp{σ²} - 1)

and

Vco(Y) = √Var(Y)/E[Y] = exp{µ + σ²/2}√(exp{σ²} - 1)/exp{µ + σ²/2} = √(exp{σ²} - 1).

b) From part a), we know that

σ = √(log(Vco(Y)² + 1)) and µ = log E[Y] - σ²/2.

Since E[Y] = 3'000 and Vco(Y) = 4, we get

σ = √(log(4² + 1)) ≈ 1.68 and µ ≈ log(3'000) - 1.68²/2 ≈ 6.59.

i) The claims frequency λ is given by λ = E[N]/v. With the introduction of the deductible d = 500, the number of claims changes to N_new = Σ_{i=1}^N 1_{Y_i > d}. Using the independence of N and Y₁, Y₂, ..., we get

E[N_new] = E[Σ_{i=1}^N 1_{Y_i > d}] = E[N] E[1_{Y > d}] = E[N] P[Y > d].

Let Φ denote the distribution function of a standard Gaussian distribution. Since log Y has a Gaussian distribution with mean µ and variance σ², we have

P[Y > d] = P[(log Y - µ)/σ > (log d - µ)/σ] = 1 - Φ((log d - µ)/σ).

Hence, the new claims frequency λ_new is given by

λ_new = E[N_new]/v = E[N] P[Y > d]/v = λ P[Y > d] = λ [1 - Φ((log d - µ)/σ)].

Inserting the values of d, µ and σ, we get

λ_new ≈ λ [1 - Φ((log 500 - 6.59)/1.68)] ≈ 0.59 λ.

Note that the introduction of this deductible reduces the administrative burden a lot, because 41% of (small) claims disappear.

ii) With the introduction of the deductible d = 500, the claim sizes change to Y_new,i = Y_i − d | Y_i > d. Thus, the new expected claim size is given by

E[Y_new] = E[Y − d | Y > d] = e(d),

where e(d) is the mean excess function of Y above d. According to the lecture notes, e(d) is given by

e(d) = E[Y] · [1 − Φ((log d − µ − σ²)/σ)] / [1 − Φ((log d − µ)/σ)] − d.

Inserting the values of d, µ, σ and E[Y], we get

E[Y_new] ≈ 3000 · [1 − Φ((log 500 − 6.59 − 1.68²)/1.68)] / [1 − Φ((log 500 − 6.59)/1.68)] − 500 ≈ 4450.

iii) According to Proposition 2.2 of the lecture notes, the expected total claim amount E[S] is given by E[S] = E[N] E[Y]. With the introduction of the deductible d = 500, the total claim amount S changes to S_new, which can be written as

S_new = Σ_{i=1}^{N_new} Y_new,i.

Hence, the expected total claim amount changes to

E[S_new] = E[N_new] E[Y_new] = E[N] P[Y > d] e(d)
= λv [1 − Φ((log d − µ)/σ)] · (E[Y] · [1 − Φ((log d − µ − σ²)/σ)] / [1 − Φ((log d − µ)/σ)] − d).

Inserting the values of d, µ, σ and E[Y], we get

E[S_new] ≈ 0.59 · (4450/3000) · E[S] ≈ 0.88 E[S].

In particular, the insurance company can grant a discount of roughly 12% on the pure risk premium. Note that also the administrative expenses on claims handling will reduce substantially because we only have 59% of the original claims, see the result in i).
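The mean excess computation in ii) and the premium reduction in iii) can be checked with a few lines of code. This is a sketch using the rounded parameter values µ ≈ 6.59 and σ ≈ 1.68 from above.

```python
import math

def Phi(x):
    # standard normal distribution function via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

mu, sigma, d, mean_Y = 6.59, 1.68, 500.0, 3000.0

# survival probability P[Y > d] and log-normal mean excess function above d
p_exceed = 1.0 - Phi((math.log(d) - mu) / sigma)
e_d = mean_Y * (1.0 - Phi((math.log(d) - mu - sigma ** 2) / sigma)) / p_exceed - d

# relative pure risk premium after the deductible: E[S_new] / E[S]
ratio = p_exceed * e_d / mean_Y
```

The ratio comes out at roughly 0.87, matching the discount of roughly 12% on the pure risk premium.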

Solution 6.3 Inflation and Deductible

Let Y be a random variable following a Pareto distribution with threshold θ > 0 and tail index α > 1. Then the expectation E[Y] of Y and the mean excess function e_Y(u) of Y above u > θ are given by

E[Y] = θ α/(α − 1) and e_Y(u) = u/(α − 1).

Since the insurance company only has to pay the part that exceeds the threshold θ, this year's average claim payment z is

z = E[Y] − θ = θ α/(α − 1) − θ = θ/(α − 1).

For the total claim size Ỹ of a claim next year we have

Ỹ =(d) (1 + r) Y ~ Pareto((1 + r) θ, α).

Let ρ > 0 denote the relative increase of the threshold that is needed such that the average claim payment remains unchanged, i.e. the new deductible is (1 + ρ) θ. Then next year's average claim payment is given by

z̃ = E[(Ỹ − (1 + ρ) θ)_+].

Let's first assume that we can choose a ρ < r such that z̃ = z. In this case we get Ỹ ≥ (1 + r) θ > (1 + ρ) θ a.s. and thus

z̃ = E[Ỹ − (1 + ρ) θ] = E[Ỹ] − (1 + ρ) θ = (1 + r) θ α/(α − 1) − (1 + ρ) θ.

Now we have z̃ = z if and only if

θ/(α − 1) = (1 + r) θ α/(α − 1) − (1 + ρ) θ
⟺ (1 + ρ)(α − 1) = (1 + r) α − 1
⟺ ρ = α r/(α − 1) > r,

which is a contradiction to the assumption ρ < r. Hence, we conclude that ρ ≥ r, i.e. the percentage increase in the deductible has to be bigger than the inflation.

Assuming ρ ≥ r, we can calculate

z̃ = E[(Ỹ − (1 + ρ) θ) 1{Ỹ > (1 + ρ) θ}] = E[Ỹ − (1 + ρ) θ | Ỹ > (1 + ρ) θ] P[Ỹ > (1 + ρ) θ]
= e_Ỹ((1 + ρ) θ) P[Ỹ > (1 + ρ) θ] = [(1 + ρ) θ/(α − 1)] · ((1 + r)/(1 + ρ))^α = z (1 + r)^α (1 + ρ)^{1−α}.

Now we have z̃ = z if and only if

(1 + r)^α (1 + ρ)^{1−α} = 1 ⟺ ρ = (1 + r)^{α/(α−1)} − 1.

We conclude that if we want the average claim payment to remain unchanged, we have to increase the deductible by the amount

θ [(1 + r)^{α/(α−1)} − 1].
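The closed-form ρ above can be verified numerically: with the claimed increase of the deductible, next year's average payment equals this year's. The threshold, tail index and inflation rate below are illustrative values, not taken from the exercise sheet.

```python
def avg_payment(theta, alpha, r, rho):
    """E[(Y_tilde - theta*(1+rho))_+] for Y_tilde ~ Pareto(theta*(1+r), alpha), rho >= r."""
    survival = ((1.0 + r) / (1.0 + rho)) ** alpha      # P[Y_tilde > theta*(1+rho)]
    mean_excess = theta * (1.0 + rho) / (alpha - 1.0)  # e(u) = u / (alpha - 1)
    return survival * mean_excess

theta, alpha, r = 200.0, 1.25, 0.02                    # hypothetical inputs
rho = (1.0 + r) ** (alpha / (alpha - 1.0)) - 1.0       # claimed solution
z_now = theta / (alpha - 1.0)                          # this year's average payment
z_next = avg_payment(theta, alpha, r, rho)
```

With α = 1.25 and 2% inflation, the deductible has to grow by ρ = 1.02^5 − 1 ≈ 10.4%, considerably more than the inflation itself.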

ETH Zürich, D-MATH
HS 2017
Prof. Dr. Mario V. Wüthrich
Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 7

Solution 7.1 Hill Estimator

An example of a possible R-code is given below.

### Generate 300 independent observations coming from a
### Pareto distribution with threshold theta = 10 millions
### and tail index alpha = 2.
### We use that if Z ~ Gamma(1, alpha),
### then theta * exp{Z} ~ Pareto(theta, alpha).
### Note that for the Gamma distribution we have:
### scale parameter in R = 1 / (scale parameter in lecture notes)
n <- 300
theta <- 10  ### in millions
alpha <- 2
set.seed(100)  ### for reproducibility
data.1 <- rgamma(n, shape = 1, scale = 1 / alpha)
data <- theta * exp(data.1)

### Order the data
data.ordered <- data[order(data, decreasing = FALSE)]

### Take the logarithm
log.data.ordered <- log(data.ordered)

### Number of observations
n.obs <- n:1

### Hill estimator
hill.estimator <- ((sum(log.data.ordered) - cumsum(log.data.ordered)
                    + log.data.ordered) / n.obs - log.data.ordered)^(-1)

### Confidence bounds (see Lemma 3.7 of the lecture notes)
upper.bound <- hill.estimator + sqrt(n.obs^2 / ((n.obs - 1)^2
                 * (n.obs - 2)) * hill.estimator^2)
lower.bound <- hill.estimator - sqrt(n.obs^2 / ((n.obs - 1)^2
                 * (n.obs - 2)) * hill.estimator^2)

### Hill plot and log-log plot next to each other
par(mfrow = c(1, 2))

### Hill plot
plot(hill.estimator, ylim = c(alpha - 1, alpha + 2), xaxt = "n",
     xlab = "number of observations",
     ylab = "Pareto tail index parameter", cex = 0.5)

Updated: November 2017

title(main = "Hill plot for alpha")
axis(1, at = c(1, seq(from = n / 10, to = n, by = n / 10)),
     c(seq(from = n, to = n / 10, by = -n / 10), 1))
lines(upper.bound)
lines(lower.bound)
abline(h = alpha, col = "blue")
legend("topleft", col = c("blue", "black"), lty = c(1, NA),
       pch = c(NA, 1),
       legend = c("Pareto distribution", "observations"))

### True survival function (= 1 - true distribution function)
true.sf <- (data.ordered / theta)^(-alpha)

### Empirical survival function (= 1 - empirical distribution function)
empirical.sf <- 1 - (1:n) / n

### Log-log plot
plot(log.data.ordered, log(true.sf), xlab = "log(claim size)",
     ylab = "log(1 - distribution function)",
     cex = 0.5, col = "blue")
title(main = "log-log plot")
lines(log.data.ordered, log(true.sf), col = "blue")
points(log.data.ordered, log(empirical.sf), col = "black",
       cex = 0.5)
legend("bottomleft", col = c("blue", "black"), lty = c(1, NA),
       pch = c(NA, 1), legend = c("Pareto distribution", "observations"))

The Hill plot (on the left) and the log-log plot (on the right) look as follows:

Note that even though we sampled from a Pareto distribution with tail index α = 2, it is not at all clear to see that the data comes from a Pareto distribution. In the Hill plot we see that, at first, the estimates of α seem more or less correct, but starting from the 80 largest observations, the plot suggests a higher α or even another distribution. In the log-log plot we see that for small-sized and medium-sized claims the fit seems to be fine. But looking at the largest claims, we would conclude that our data is not as heavy-tailed as a true Pareto distribution with threshold θ = 10 millions and tail index α = 2 would suggest. We are confronted with these problems even though we sampled directly from a Pareto distribution. This might indicate the difficulties one faces when trying to fit such a distribution to a real data set, which, to make matters even worse, often contains far less than 300 observations as in this example; moreover, the observations may be contaminated by other distributions.

Solution 7.2 Approximations

Note that if Y ~ Γ(γ = 100, c = 1/10), then

E[Y] = γ/c = 100/(1/10) = 1000,
E[Y²] = γ(γ + 1)/c² = 100 · 101/(1/100) = 1 010 000 and
E[Y³] = γ(γ + 1)(γ + 2)/c³ = 100 · 101 · 102/(1/1000) = 1 030 200 000.

Let M_Y denote the moment generating function of Y. According to formula (1.3) of the lecture notes, we have

M_Y'''(0) = d³/dr³ M_Y(r)|_{r=0} = E[Y³].

For the total claim amount S, we can use Proposition 2.1 of the lecture notes to get

E[S] = λv E[Y] = 1000 · 1000 = 1 000 000,
Var(S) = λv E[Y²] = 1000 · 1 010 000 = 1.01 · 10^9 and
M_S(r) = exp{λv [M_Y(r) − 1]}.

In order to get the skewness ς_S of S, which we will need for the translated gamma and the log-normal approximations, we can use the third equation given in the formulas (1.5) of the lecture notes:

ς_S Var(S)^{3/2} = d³/dr³ log M_S(r)|_{r=0} = λv d³/dr³ M_Y(r)|_{r=0} = λv M_Y'''(0) = λv E[Y³],

from which we can conclude that

ς_S = λv E[Y³]/(λv E[Y²])^{3/2} = E[Y³]/(sqrt(λv) E[Y²]^{3/2}) ≈ 0.0321.

Let F_S denote the distribution function of S. Then, since F_S is continuous and strictly increasing, the quantiles q_0.95 and q_0.99 can be calculated as q_0.95 = F_S^{−1}(0.95) and q_0.99 = F_S^{−1}(0.99).

a) According to Section 4.1.1 of the lecture notes, the normal approximation is given by

F_S(x) ≈ Φ((x − λv E[Y])/sqrt(λv E[Y²])),

for all x ∈ R, where Φ is the standard Gaussian distribution function. For all α ∈ (0, 1), we have

F_S^{−1}(α) ≈ λv E[Y] + sqrt(λv E[Y²]) Φ^{−1}(α).

In particular, we get

q_0.95 ≈ 10^6 + sqrt(1.01 · 10^9) Φ^{−1}(0.95) ≈ 1 052 000 and
q_0.99 ≈ 10^6 + sqrt(1.01 · 10^9) Φ^{−1}(0.99) ≈ 1 074 000.

Note that the normal approximation also allows for negative claims S, which under our model assumption is excluded. The probability for negative claims S in the normal approximation can be calculated as

F_S(0) ≈ Φ((0 − λv E[Y])/sqrt(λv E[Y²])) ≈ Φ(−31.5),

which of course is positive, but very close to 0.

b) According to Section 4.1.2 of the lecture notes, in the translated gamma approximation we model S by the random variable X = k + Z, where k ∈ R and Z ~ Γ(γ, c). The three parameters k, γ and c can be determined by solving the equations

E[X] = E[S], Var(X) = Var(S) and ς_X = ς_S, (1)

where ς_X is the skewness parameter of X. Since Z ~ Γ(γ, c), we can use the results given in Section 3.2.1 of the lecture notes to calculate

E[X] = E[k + Z] = k + E[Z] = k + γ/c,
Var(X) = Var(k + Z) = Var(Z) = γ/c² and
ς_X = E[(X − E[X])³]/Var(X)^{3/2} = E[(k + Z − E[k + Z])³]/Var(k + Z)^{3/2} = E[(Z − E[Z])³]/Var(Z)^{3/2} = ς_Z = 2/sqrt(γ).

Using equations (1), we get

2/sqrt(γ) = ς_S ⟹ γ = 4/ς_S² ≈ 3 883,
γ/c² = Var(S) ⟹ c = sqrt(γ/Var(S)) ≈ 0.002 and
k + γ/c = E[S] ⟹ k = E[S] − γ/c = E[S] − sqrt(γ Var(S)) ≈ −980 000.

If we write F_Z for the distribution function of Z ~ Γ(γ ≈ 3 883, c ≈ 0.002), using the translated gamma approximation, we get

F_S(x) = P[S ≤ x] ≈ P[X ≤ x] = P[k + Z ≤ x] = P[Z ≤ x − k] = F_Z(x − k),

for all x ∈ R. Now, for all α ∈ (0, 1), we have

F_S^{−1}(α) ≈ k + F_Z^{−1}(α).

In particular, we get

q_0.95 ≈ k + F_Z^{−1}(0.95) ≈ 1 053 000 and
q_0.99 ≈ k + F_Z^{−1}(0.99) ≈ 1 075 000.

Note that since k < 0, the translated gamma approximation in this example also allows for negative claims S, which under our model assumption is excluded. The probability for negative claims S can be calculated as

F_S(0) ≈ F_Z(0 − k) = F_Z(−k),

which is basically 0.

c) According to Section 4.1.2 of the lecture notes, in the translated log-normal approximation we model S by the random variable X = k + Z, where k ∈ R and Z ~ LN(µ, σ²). Similarly as in part b), the three parameters k, µ and σ² can be determined by solving the equations

E[X] = E[S], Var(X) = Var(S) and ς_X = ς_S. (2)

Since Z ~ LN(µ, σ²), we can use the results given in the lecture notes to calculate

E[X] = E[k + Z] = k + E[Z] = k + exp{µ + σ²/2},
Var(X) = Var(k + Z) = Var(Z) = exp{2µ + σ²}(exp{σ²} − 1) and
ς_X = ς_Z = (exp{σ²} + 2)(exp{σ²} − 1)^{1/2}.

Using the third equation in (2), we get

(exp{σ²} + 2)(exp{σ²} − 1)^{1/2} = ς_S ⟹ σ ≈ 0.011,

which was found using a computer software. Using the second equation in (2), we get

exp{2µ + σ²}(exp{σ²} − 1) = Var(S) ⟹ µ = (1/2)(log[Var(S)/(exp{σ²} − 1)] − σ²),

which implies µ ≈ 14.875. Finally, using the first equation in (2), we get

k + exp{µ + σ²/2} = E[S] ⟹ k = E[S] − exp{µ + σ²/2} ≈ −1 885 000.

If we write F_W for the distribution function of W = log Z ~ N(µ ≈ 14.875, σ² ≈ 0.011²), using the translated log-normal approximation, we get

F_S(x) = P[S ≤ x] ≈ P[X ≤ x] = P[k + Z ≤ x] = P[log Z ≤ log(x − k)] = F_W(log[x − k]),

for all x > k. Now, for all α ∈ (0, 1), we have

F_S^{−1}(α) ≈ k + exp{F_W^{−1}(α)}.

In particular, we get

q_0.95 ≈ k + exp{F_W^{−1}(0.95)} ≈ 1 053 000 and
q_0.99 ≈ k + exp{F_W^{−1}(0.99)} ≈ 1 075 000.

Note that since k < 0, the translated log-normal approximation in this example also allows for negative claims S, which under our model assumption is excluded. The probability for negative claims S can be calculated as

F_S(0) ≈ P[k + Z ≤ 0] = F_W(log(−k)),

which is basically 0.

d) We observe that with all the three approximations applied in parts a) - c) we get almost the same results. In particular, the normal approximation does not provide estimates that deviate significantly from the ones we get using the translated gamma and the translated log-normal approximations. This is due to the fact that λv = 1000 is large enough and the gamma distribution assumed for the claim sizes is not a heavy-tailed distribution. Moreover, the skewness ς_S ≈ 0.0321 of S is rather small, hence the normal approximation is a valid model in this example. Note that in all the three approximations we allow for negative claims S, which actually should not be possible under our model assumption. However, the probability of observing a negative claim S is vanishingly small.

Solution 7.3 Akaike Information Criterion and Bayesian Information Criterion

a) By definition, the MLEs (γ̂^MLE, ĉ^MLE) maximize the log-likelihood function l_Y. In particular, we have

l_Y(γ̂^MLE, ĉ^MLE) ≥ l_Y(γ, c),

for all (γ, c) ∈ R_+ × R_+. If we write d^(MLE) and d^(MM) for the number of estimated parameters in the MLE model and in the method of moments model, respectively, we have d^(MLE) = d^(MM) = 2. The AIC values of the two models are then given by

AIC^(MLE) = −2 l_Y(γ̂^MLE, ĉ^MLE) + 2 d^(MLE) and AIC^(MM) = −2 l_Y(γ̂^MM, ĉ^MM) + 2 d^(MM).

According to the AIC, the model with the smallest AIC value should be preferred. Since both models use the same number of parameters and the MLE maximizes the log-likelihood, we have AIC^(MLE) ≤ AIC^(MM), so we choose the MLE fit.
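The AIC/BIC bookkeeping used in these comparisons can be sketched as follows. The log-likelihood values below are hypothetical and only illustrate the penalization mechanics: a two-parameter model with a slightly better in-sample fit can still lose to a one-parameter model.

```python
import math

def aic(loglik, n_params):
    # Akaike information criterion: -2 log-likelihood + 2 (number of parameters)
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    # Bayesian information criterion: -2 log-likelihood + (number of parameters) log(n)
    return -2.0 * loglik + n_params * math.log(n_obs)

# hypothetical fitted log-likelihoods for n = 1000 observations
ll_two_param, ll_one_param = -5102.0, -5102.5   # richer model fits slightly better

prefer_aic = "two-parameter" if aic(ll_two_param, 2) < aic(ll_one_param, 1) else "one-parameter"
prefer_bic = "two-parameter" if bic(ll_two_param, 2, 1000) < bic(ll_one_param, 1, 1000) else "one-parameter"
```

With these numbers, both criteria pick the one-parameter model, because the in-sample gain of 0.5 log-likelihood units does not pay for the extra parameter.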
b) If we write d^(gam) and d^(exp) for the number of estimated parameters in the gamma model and in the exponential model, respectively, we have d^(gam) = 2 and d^(exp) = 1. The AIC values of the two models are then given by

AIC^(gam) = −2 l_Y^(gam)(γ̂^MLE, ĉ^MLE) + 2 d^(gam) and AIC^(exp) = −2 l_Y^(exp)(ĉ^MLE) + 2 d^(exp).

Since AIC^(gam) > AIC^(exp), we choose the exponential model. The BIC values of the two models are given by

BIC^(gam) = −2 l_Y^(gam)(γ̂^MLE, ĉ^MLE) + d^(gam) log 1000 and
BIC^(exp) = −2 l_Y^(exp)(ĉ^MLE) + d^(exp) log 1000.

According to the BIC, the model with the smallest BIC value should be preferred. Since BIC^(gam) > BIC^(exp), we choose the exponential model.

Note that the gamma model gives a better in-sample fit than the exponential model. But if we adjust this in-sample fit by the number of parameters used, we conclude that the exponential model probably has the better out-of-sample performance (better predictive power).
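The three approximations of Solution 7.2 can be compared side by side in code. This is a sketch, not the lecture-notes implementation: the gamma distribution function is evaluated by the lower incomplete gamma series, and all quantiles are found by plain bisection.

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def inv(cdf, p, lo, hi):
    # quantile of a continuous, increasing cdf by bisection
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gamma_cdf(x, shape, rate):
    # regularized lower incomplete gamma function via its power series
    t, term, s = rate * x, 1.0 / shape, 1.0 / shape
    for n in range(1, 2000):
        term *= t / (shape + n)
        s += term
    return s * math.exp(shape * math.log(t) - t - math.lgamma(shape))

# moments of S for lam * v = 1000 and Y ~ Gamma(100, 1/10)
lam_v, gam, c = 1000.0, 100.0, 0.1
ES = lam_v * gam / c
VarS = lam_v * gam * (gam + 1) / c ** 2
skewS = lam_v * gam * (gam + 1) * (gam + 2) / c ** 3 / VarS ** 1.5

# (a) normal approximation
q_norm = ES + math.sqrt(VarS) * inv(Phi, 0.95, -10.0, 10.0)

# (b) translated gamma: match mean, variance and skewness
gam2 = 4.0 / skewS ** 2
c2 = math.sqrt(gam2 / VarS)
k2 = ES - gam2 / c2
hi2 = (gam2 + 20.0 * math.sqrt(gam2)) / c2    # mean of Z plus 20 standard deviations
q_tg = k2 + inv(lambda x: gamma_cdf(x, gam2, c2), 0.95, 1.0, hi2)

# (c) translated log-normal: match the same three moments
s2 = inv(lambda v: (math.exp(v) + 2.0) * math.sqrt(math.exp(v) - 1.0), skewS, 1e-12, 4.0)
mu = 0.5 * (math.log(VarS / (math.exp(s2) - 1.0)) - s2)
k3 = ES - math.exp(mu + s2 / 2.0)
q_tln = k3 + math.exp(mu + math.sqrt(s2) * inv(Phi, 0.95, -10.0, 10.0))
```

All three 95% quantiles land within a fraction of a percent of each other, which is the point of part d): for λv = 1000 and light-tailed claims the normal approximation is already adequate.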

ETH Zürich, D-MATH
HS 2017
Prof. Dr. Mario V. Wüthrich
Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 8

Solution 8.1 Panjer Algorithm

For the expected yearly claim amount π_0 we have

π_0 = E[S] = E[N] E[Y] = E[k + Z] = k + exp{µ + σ²/2}.

Let Y_i^+ denote the discretized claim sizes using a span of s = 10, where we put all the probability mass to the upper end of the intervals. If we write g_m = P[Y^+ = sm] for m ∈ N, then we have

g_1 = g_2 = ... = g_20 = 0,

since P[Y^+ ≤ k] = P[Z ≤ 0] = 0 and k = 20s. For all l ≥ 21, we get

g_l = P[Y^+ = sl] = P[Y^+ = k + s(l − 20)]
= P[k + s(l − 21) < Y ≤ k + s(l − 20)]
= P[Y ≤ k + s(l − 20)] − P[Y ≤ k + s(l − 21)]
= P[Z ≤ s(l − 20)] − P[Z ≤ s(l − 21)]
= P[log Z ≤ log(s[l − 20])] − P[log Z ≤ log(s[l − 21])]
= Φ((log(s[l − 20]) − µ)/σ) − Φ((log(s[l − 21]) − µ)/σ),

where Φ is the distribution function of the standard Gaussian distribution and where we define log 0 = −∞. From now on we will replace the claim sizes Y_i with the discretized claim sizes Y_i^+. In particular, we will still write S for the yearly claim amount, which changed to

S = Σ_{i=1}^{N} Y_i^+.

Note that N ~ Poi(λ) has a Panjer distribution with parameters a = 0 and b = λ, see the proof of Lemma 4.7 of the lecture notes. Applying the Panjer algorithm given in Theorem 4.9 of the lecture notes, we have for r ∈ N_0

f_r := P[S = sr] = P[N = 0] for r = 0, and f_r = Σ_{l=1}^{r} (λ l/r) g_l f_{r−l} for r > 0.

Since the yearly amount that the client has to pay by himself is given by

S_ins = min{S, d} + min{α (S − d)_+, M} = min{S, d} + α min{(S − d)_+, M/α},

M/α = 7000 and the maximal possible franchise is 2 500, we have to apply the Panjer algorithm until we reach P[S = 9 500] = f_950. Here we limit ourselves to determine the values of f_0, ..., f_22 to illustrate how the algorithm works. In particular, we have

f_0 = P[N = 0] = e^{−1} ≈ 0.368,

Updated: November 8, 2017

and f_1 = f_2 = ... = f_20 = 0, since g_1 = g_2 = ... = g_20 = 0. For r = 21 and r = 22, we get

f_21 = Σ_{l=1}^{21} (l/21) g_l f_{21−l} = g_21 f_0 = [Φ((log s − µ)/σ) − Φ((log 0 − µ)/σ)] e^{−1} = Φ((log 10 − µ)/σ) e^{−1}

and

f_22 = Σ_{l=1}^{22} (l/22) g_l f_{22−l} = g_22 f_0 = [Φ((log 2s − µ)/σ) − Φ((log s − µ)/σ)] e^{−1}.

Using the discretized claim sizes, the yearly expected amount π_ins paid by the client is given by

π_ins = E[S_ins] = E[min{S, d}] + α E[min{(S − d)_+, M/α}],

where we have

E[min{S, d}] = Σ_{r=0}^{d/s} f_r sr + d (1 − Σ_{r=0}^{d/s} f_r) = d + Σ_{r=0}^{d/s} f_r (sr − d)

and

E[min{(S − d)_+, M/α}] = Σ_{r=d/s+1}^{d/s+M/(sα)} f_r (sr − d) + (M/α)(1 − Σ_{r=0}^{d/s+M/(sα)} f_r)
= M/α + Σ_{r=d/s+1}^{d/s+M/(sα)} f_r (sr − d − M/α) − (M/α) Σ_{r=0}^{d/s} f_r.

Therefore, we get

π_ins = d + Σ_{r=0}^{d/s} f_r (sr − d) + α [M/α + Σ_{r=d/s+1}^{d/s+M/(sα)} f_r (sr − d − M/α) − (M/α) Σ_{r=0}^{d/s} f_r]
= d + M + Σ_{r=0}^{d/s} f_r (sr − d − M) + Σ_{r=d/s+1}^{d/s+M/(sα)} α f_r (sr − d − M/α).

Finally, if the client has chosen franchise d, then the monthly pure risk premium π is given by

π = (π_0 − π_ins)/12
= [k + exp{µ + σ²/2} − d − M − Σ_{r=0}^{d/s} f_r (sr − d − M) − Σ_{r=d/s+1}^{d/s+M/(sα)} α f_r (sr − d − M/α)] / 12.

In the end, we get the monthly pure risk premiums for the six franchises d ∈ {300, 500, 1000, 1500, 2000, 2500}. More generally, the monthly pure risk premium as a function of the franchise, which is allowed to vary between 300 CHF and 2 500 CHF, looks as follows:

Note that the above values only represent the pure risk premiums. In order to get the premiums that the customer has to pay in the end, we would need to add an appropriate risk-loading, which may vary between different health insurance companies. The above plot can be created by the R-code given below, where we calculated the premiums using two different discretizations of the claim sizes: in one we put the probability mass to the upper end of the intervals and in the other to the lower end of the intervals. However, the resulting premiums for these two versions are basically the same.

### Define the function KK_premium with the variables:
### lambda = mean number of claims
### mu = mean parameter of log-normal distribution
### sigma2 = variance parameter of log-normal distribution
### span = span size used in the Panjer algorithm
### shift = shift of the translated log-normal distribution
KK_premium <- function(lambda, mu, sigma2, span, shift){
### we will calculate the distribution of S until M
M <-

### number of steps
m <- M / span

### we won't have any mass until we reach shift, which happens at the k0-th step
k0 <- shift / span

### initialize array where mass is put to the lower end of the interval
g_min <- array(0, dim = c(m + 1, 1))

### initialize array where mass is put to the upper end of the interval
g_max <- array(0, dim = c(m + 1, 1))

### discretize the log-normal distribution putting the mass to the lower end of the interval
for (k in (k0 + 1):(m + 1)){
  g_min[k, 1] <- pnorm(log((k - k0) * span), mean = mu, sd = sqrt(sigma2)) -
                 pnorm(log((k - k0 - 1) * span), mean = mu, sd = sqrt(sigma2))
}

### discretize the log-normal distribution putting the mass to the upper end of the interval
g_max[2:(m + 1), 1] <- g_min[1:m, 1]

### initialize matrix, where we will store the probability distribution of S
f <- matrix(0, nrow = m + 1, ncol = 3)

### store the probability of getting zero claims (in both lower bound and upper bound)
f[1, 1] <- exp(-lambda * (1 - g_min[1, 1]))
f[1, 2] <- exp(-lambda * (1 - g_max[1, 1]))

### calculate the values "l * g_l" of the discretized claim sizes (lower bound and
### upper bound), we need these values in the Panjer algorithm
h <- matrix(0, nrow = m, ncol = 3)
for (i in 1:m){
  h[i, 1] <- g_min[i + 1, 1] * i
  h[i, 2] <- g_max[i + 1, 1] * i
}

### Panjer algorithm (note that in the Poisson case we have a = 0 and
### b = lambda * v, which is just lambda here)
for (r in 1:m){
  f[r + 1, 1] <- lambda / r * t(f[1:r, 1]) %*% h[r:1, 1]
  f[r + 1, 2] <- lambda / r * t(f[1:r, 2]) %*% h[r:1, 2]
  f[r + 1, 3] <- r * span
}

### maximal and minimal franchise
m1 <- 2500
m0 <- 300

### number of iterations needed to get to m1 and m0
i1 <- m1 / span + 1
i0 <- m0 / span + 1

### calculate the part that the insured pays by himself
franchise <- array(NA, c(i1, 3))
for (i in i0:i1){
  franchise[i, 1] <- f[i, 3]  ### this represents the franchise
  franchise[i, 2] <- sum(f[1:i, 1] * f[1:i, 3]) + f[i, 3] * (1 - sum(f[1:i, 1]))
  franchise[i, 2] <- franchise[i, 2] +
    sum(f[(i + 1):(i + 7000 / span), 1] * f[2:(7000 / span + 1), 3]) * 0.1 +
    700 * (1 - sum(f[1:(i + 7000 / span), 1]))
  franchise[i, 3] <- sum(f[1:i, 2] * f[1:i, 3]) + f[i, 3] * (1 - sum(f[1:i, 2]))
  franchise[i, 3] <- franchise[i, 3] +
    sum(f[(i + 1):(i + 7000 / span), 2] * f[2:(7000 / span + 1), 3]) * 0.1 +
    700 * (1 - sum(f[1:(i + 7000 / span), 2]))
}

### calculate the price of the monthly premium
price <- array(NA, c(i1, 3))
price[, 1] <- franchise[, 1]  ### this represents the franchise
price[, 2:3] <- (lambda * (exp(mu + sigma2 / 2) + shift) - franchise[, 2:3]) / 12
price
}

### Load the add-on packages stats and MASS
require(stats)
require(MASS)

### Determine values for the input parameters of the function KK_premium
lambda <- 1
mu <-
sigma2 <-
span <- 10
shift <- 200

### The coefficient of variation of the translated log-normal distribution is given by
exp(mu + sigma2 / 2) * sqrt(exp(sigma2) - 1) / (shift + exp(mu + sigma2 / 2))

### Run the function KK_premium
price <- KK_premium(lambda, mu, sigma2, span, shift)

### Plot the monthly pure risk premium as a function of the franchise
plot(x = price[, 1], y = price[, 2], lwd = 2, col = "blue", type = 'l',
     ylab = "pure risk premium", xlab = "franchise",
     main = "pure risk premium (monthly)")
lines(x = price[, 1], y = price[, 2], lwd = 1, col = "blue")
points(x = c(300, 500, 1000, 1500, 2000, 2500),
       y = price[c(300, 500, 1000, 1500, 2000, 2500) / span + 1, 3],
       pch = 19, col = "orange")
abline(v = c(300, 500, 1000, 1500, 2000, 2500), col = "darkgray", lty = 3)

### Give the monthly pure risk premiums for the six franchises listed on the exercise sheet
round(price[c(300, 500, 1000, 1500, 2000, 2500) / span + 1, 2])
round(price[c(300, 500, 1000, 1500, 2000, 2500) / span + 1, 3])
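The core of the algorithm above, the Panjer recursion for a compound Poisson distribution, can be isolated in a few lines and checked against a case with a known answer: for unit claim sizes (g_1 = 1) the total claim S equals N, so the recursion must reproduce the Poisson probability weights. This sketch is independent of the insurance-specific bookkeeping.

```python
import math

def panjer_poisson(lam, g, rmax):
    """Compound Poisson Panjer recursion (a = 0, b = lam):
    f_0 = exp(-lam * (1 - g[0])), f_r = (lam / r) * sum_{l>=1} l * g[l] * f[r - l]."""
    f = [math.exp(-lam * (1.0 - g[0]))]
    for r in range(1, rmax + 1):
        s = sum(l * g[l] * f[r - l] for l in range(1, min(r, len(g) - 1) + 1))
        f.append(lam / r * s)
    return f

# check against a unit claim size (g_1 = 1): then S = N ~ Poi(lam)
lam = 1.0
f = panjer_poisson(lam, [0.0, 1.0], 10)
poisson_pmf = [math.exp(-lam) * lam ** r / math.factorial(r) for r in range(11)]
```

The recursion reproduces the Poisson probability mass function to machine precision, which is a cheap way to validate any implementation before feeding it discretized claim sizes.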

Solution 8.2 Variance Loading Principle

a) Let S_1, S_2, S_3 be the total claim amounts of the passenger cars, delivery vans and trucks, respectively. Then, according to Proposition 2.1 of the lecture notes, for the expected total claim amounts we have

E[S_i] = λ_i v_i E[Y^(i)], for all i ∈ {1, 2, 3},

which can be evaluated using the data given in the table on the exercise sheet. If we write S for the total claim amount of the car fleet, we can conclude that

E[S] = E[S_1 + S_2 + S_3] = E[S_1] + E[S_2] + E[S_3].

b) Again using Proposition 2.1 of the lecture notes, we get

Var(S_i) = λ_i v_i E[(Y^(i))²] = λ_i v_i (Var(Y^(i)) + E[Y^(i)]²) = λ_i v_i E[Y^(i)]² (Vco(Y^(i))² + 1),

for all i ∈ {1, 2, 3}, which again can be evaluated with the data given in the table on the exercise sheet. Since S_1, S_2 and S_3 are independent by assumption, we get for the variance of the total claim amount S of the car fleet

Var(S) = Var(S_1) + Var(S_2) + Var(S_3).

Using the variance loading principle with α = 3 · 10^{−6}, we get for the premium π of the car fleet

π = E[S] + α Var(S).

Note that we have

(π − E[S])/E[S] = α Var(S)/E[S] ≈ 5.3%.

Thus, the loading π − E[S] is given by 5.3% of the pure risk premium.

Solution 8.3 Panjer Distribution

If we write p_k = P[N = k] for all k ∈ N_0, then, by definition of the Panjer distribution, we have

p_k = p_{k−1} (a + b/k),

for all k ≥ 1 in the range of N. We can use this recursion to calculate E[N] and Var(N). Note that the range of N is N_0 if a ≥ 0 and it is {0, 1, ..., n} for some n ∈ N if a < 0.

First, we consider the case where a < 0, i.e. where the range of N is {0, 1, ..., n}. According to the proof of Lemma 4.7 of the lecture notes, we have

n = −(a + b)/a. (1)

For the expectation of N, we get

E[N] = Σ_{k=0}^{n} k p_k = Σ_{k=1}^{n} k p_k
= Σ_{k=1}^{n} k p_{k−1} (a + b/k)
= a Σ_{k=1}^{n} k p_{k−1} + b Σ_{k=1}^{n} p_{k−1}
= a Σ_{k=0}^{n−1} (k + 1) p_k + b Σ_{k=0}^{n−1} p_k
= a Σ_{k=0}^{n−1} k p_k + (a + b) Σ_{k=0}^{n−1} p_k
= a (E[N] − n p_n) + (a + b)(1 − p_n)
= a E[N] + a + b + p_n (−an − a − b).

Using (1), we get

−an − a − b = −a (−(a + b)/a) − a − b = a + b − a − b = 0. (2)

Hence, the above expression for E[N] simplifies to

E[N] = a E[N] + a + b,

from which we can conclude that

E[N] = (a + b)/(1 − a).

In order to get the variance of N, we first calculate the second moment of N:

E[N²] = Σ_{k=0}^{n} k² p_k = Σ_{k=1}^{n} k² p_k
= Σ_{k=1}^{n} k² p_{k−1} (a + b/k)
= a Σ_{k=1}^{n} k² p_{k−1} + b Σ_{k=1}^{n} k p_{k−1}
= a Σ_{k=0}^{n−1} (k + 1)² p_k + b Σ_{k=0}^{n−1} (k + 1) p_k
= a Σ_{k=0}^{n−1} k² p_k + (2a + b) Σ_{k=0}^{n−1} k p_k + (a + b) Σ_{k=0}^{n−1} p_k
= a (E[N²] − n² p_n) + (2a + b)(E[N] − n p_n) + (a + b)(1 − p_n)
= a E[N²] + (2a + b) E[N] + a + b − p_n [a n² + (2a + b) n + a + b].

Using (1), we get

a n² + (2a + b) n + a + b = (a + b)²/a − (2a + b)(a + b)/a + a (a + b)/a
= (a + b)[(a + b) − (2a + b) + a]/a = 0. (3)

Hence, the above expression for E[N²] simplifies to

E[N²] = a E[N²] + (2a + b) E[N] + a + b,

from which we get

E[N²] = [(2a + b) E[N] + a + b]/(1 − a) = [(2a + b)(a + b)/(1 − a) + a + b]/(1 − a)
= (2a² + 3ab + b² + a + b − a² − ab)/(1 − a)² = [(a + b)² + a + b]/(1 − a)².

Finally, the variance of N then is

Var(N) = E[N²] − E[N]² = [(a + b)² + a + b]/(1 − a)² − (a + b)²/(1 − a)² = (a + b)/(1 − a)².

In the case where a ≥ 0, i.e. where the range of N is N_0, we can perform analogous calculations with the only difference that the index of summation in all the sums involved goes up to ∞ instead of stopping at n. As a consequence, the calculations in (2) and in (3) aren't necessary anymore. The formulas for E[N] and Var(N), however, remain the same.

The ratio of Var(N) to E[N] is given by

Var(N)/E[N] = [(a + b)/(1 − a)²] · [(1 − a)/(a + b)] = 1/(1 − a).

Note that if a < 0, i.e. if N has a binomial distribution, we have Var(N) < E[N]. If a = 0, i.e. if N has a Poisson distribution, we have Var(N) = E[N]. Finally, in the case of a > 0, i.e. for a negative-binomial distribution, we have Var(N) > E[N].
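The closed-form moments just derived can be verified by running the Panjer recursion itself. As a sketch, take a binomial(n = 10, q = 0.3) count, which is the Panjer case a < 0 with a = −q/(1 − q) and b = (n + 1)q/(1 − q), and compare the empirical mean and variance of the recursion output with (a + b)/(1 − a) and (a + b)/(1 − a)².

```python
def panjer_pmf(a, b, nmax):
    """Panjer recursion p_k = p_{k-1} * (a + b / k), normalised over 0..nmax."""
    p = [1.0]
    for k in range(1, nmax + 1):
        factor = a + b / k
        p.append(p[-1] * factor if factor > 0 else 0.0)
    total = sum(p)
    return [x / total for x in p]

n, q = 10, 0.3
a, b = -q / (1 - q), (n + 1) * q / (1 - q)
p = panjer_pmf(a, b, n)
mean = sum(k * pk for k, pk in enumerate(p))
var = sum(k * k * pk for k, pk in enumerate(p)) - mean ** 2
```

The recursion gives mean 3 = nq and variance 2.1 = nq(1 − q), exactly matching the general formulas, and Var(N)/E[N] = 1/(1 − a) = 0.7 < 1 as expected for the binomial case.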

ETH Zürich, D-MATH
HS 2017
Prof. Dr. Mario V. Wüthrich
Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 9

Solution 9.1 Utility Indifference Price

a) Suppose that there exist two utility indifference prices π_1 = π_1(u, S, c_0) and π_2 = π_2(u, S, c_0) with π_1 ≠ π_2. By definition of a utility indifference price, we have

E[u(c_0 + π_1 − S)] = u(c_0) = E[u(c_0 + π_2 − S)]. (1)

Without loss of generality, we assume that π_1 < π_2. Then we have

c_0 + π_1 − S < c_0 + π_2 − S a.s.,

which implies

u(c_0 + π_1 − S) < u(c_0 + π_2 − S) a.s.,

since u is a utility function and, thus, strictly increasing by definition. Finally, by taking the expectation, we get

E[u(c_0 + π_1 − S)] < E[u(c_0 + π_2 − S)],

which is a contradiction to (1). We conclude that if the utility indifference price π exists, then it is unique.

Moreover, being a utility function, u is strictly concave by definition. Hence, we can apply Jensen's inequality to get

u(c_0) = E[u(c_0 + π − S)] < u(E[c_0 + π − S]) = u(c_0 + π − E[S]).

Note that we used that S is non-deterministic and, thus, Jensen's inequality is strict. Since u is strictly increasing, this implies π − E[S] > 0, i.e. π > E[S].

b) Note that

E[Y^(1)] = γ/c = 20/0.01 = 2000 and E[Y^(2)] = 1/0.005 = 200.

Since S_1 and S_2 both have a compound Poisson distribution, Proposition 2.1 of the lecture notes gives

E[S_1] = λ_1 v_1 E[Y^(1)] = 1000 · 2000 = 2 000 000 and
E[S_2] = λ_2 v_2 E[Y^(2)] = 1000 · 200 = 200 000.

We conclude that

E[S] = E[S_1 + S_2] = E[S_1] + E[S_2] = 2 200 000.

c) The utility indifference price π = π(u, S, c_0) is defined through the equation

u(c_0) = E[u(c_0 + π − S)].

Updated: November 20, 2017

Using that the utility function u is given by

u(x) = −(1/α) exp{−αx},

for all x ∈ R, with α = 1.5 · 10^{−6}, we get

u(c_0) = E[u(c_0 + π − S)]
⟺ −(1/α) exp{−αc_0} = E[−(1/α) exp{−α(c_0 + π − S)}]
⟺ exp{−αc_0} = E[exp{−α(c_0 + π − S)}]
⟺ exp{απ} = E[exp{αS}]
⟺ π = (1/α) log E[exp{αS}].

Note that we can write S = S_1 + S_2 and use the independence of S_1 and S_2 to get

π = (1/α) log E[exp{α(S_1 + S_2)}] = (1/α) log (E[exp{αS_1}] E[exp{αS_2}])
= (1/α) (log E[exp{αS_1}] + log E[exp{αS_2}]) = (1/α) [log M_{S_1}(α) + log M_{S_2}(α)],

where M_{S_1} and M_{S_2} denote the moment generating functions of S_1 and S_2, respectively. Moreover, since S_1 and S_2 both have a compound Poisson distribution, Proposition 2.1 of the lecture notes gives

π = (1/α) (λ_1 v_1 [M_{Y^(1)}(α) − 1] + λ_2 v_2 [M_{Y^(2)}(α) − 1]),

where M_{Y^(1)} and M_{Y^(2)} denote the moment generating functions of Y^(1) and Y^(2), respectively. Using that Y^(1) ~ Γ(γ = 20, c = 0.01) and that Y^(2) ~ expo(0.005), we get

M_{Y^(1)}(α) = (c/(c − α))^γ = (0.01/(0.01 − α))^20 and M_{Y^(2)}(α) = 0.005/(0.005 − α).

In particular, since α < c and α < 0.005, both M_{Y^(1)}(α) and M_{Y^(2)}(α) and thus also M_{S_1}(α) and M_{S_2}(α) exist. Inserting all the numerical values, we find the utility indifference price

π ≈ 2 203 000.

Note that we have

(π − E[S])/E[S] ≈ 0.146%.

Thus, the loading π − E[S] is given by approximately 0.146% of the pure risk premium.

d) The moment generating function M_X of X ~ N(µ, σ²) for some µ ∈ R and σ² > 0 is given by

M_X(r) = exp{rµ + r²σ²/2},

for all r ∈ R. Hence, if we assume Gaussian distributions for S_1 and S_2, then we get

π = (1/α) [log M_{S_1}(α) + log M_{S_2}(α)]
= (1/α) (αE[S_1] + (α²/2) Var(S_1) + αE[S_2] + (α²/2) Var(S_2))
= E[S_1] + E[S_2] + (α/2) [Var(S_1) + Var(S_2)]
= E[S] + (α/2) Var(S),

where in the last equation we used that S_1 and S_2 are independent. We see that in this case the utility indifference price is given according to a variance loading principle. Since here we assume Gaussian distributions for S_1 and S_2 with the same corresponding first two moments as in the compound Poisson case in part c), in order to calculate Var(S_1) and Var(S_2), we again assume that S_1 and S_2 have compound Poisson distributions. Note that

E[(Y^(1))²] = γ(γ + 1)/c² = 20 · 21/0.01² = 4 200 000 and E[(Y^(2))²] = 2/0.005² = 80 000.

Then Proposition 2.1 of the lecture notes gives

Var(S_1) = λ_1 v_1 E[(Y^(1))²] = 1000 · 4 200 000 = 4.2 · 10^9 and
Var(S_2) = λ_2 v_2 E[(Y^(2))²] = 1000 · 80 000 = 8 · 10^7,

which leads to

Var(S) = Var(S_1 + S_2) = Var(S_1) + Var(S_2) = 4.28 · 10^9.

We conclude that the utility indifference price is given by

π = E[S] + (α/2) Var(S) = 2 200 000 + (1.5 · 10^{−6}/2) · 4.28 · 10^9 ≈ 2 203 000.

Note that we have

(π − E[S])/E[S] ≈ 0.146%.

Thus, as in part c), the loading π − E[S] is given by approximately 0.146% of the pure risk premium. The reason why we get the same results in c) and d) is the Central Limit Theorem. In particular, neither the gamma distribution nor the exponential distribution are heavy-tailed distributions and thus λ_1 v_1 = λ_2 v_2 = 1000 are large enough for the normal approximations to be valid approximations for the compound Poisson distributions.
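The agreement between the exact exponential-utility price of part c) and the variance-loading price of part d) can be reproduced directly. This sketch uses the stated parameters (α = 1.5 · 10^{−6}, λ_i v_i = 1000, gamma and exponential claim sizes).

```python
alpha = 1.5e-6
lam_v1 = lam_v2 = 1000.0
gamma_, c = 20.0, 0.01      # Y1 ~ Gamma(20, 0.01)
rate2 = 0.005               # Y2 ~ expo(0.005)

M1 = (c / (c - alpha)) ** gamma_     # gamma moment generating function at alpha
M2 = rate2 / (rate2 - alpha)         # exponential moment generating function at alpha

# exponential-utility indifference price of S = S1 + S2 (compound Poisson)
pi_exact = (lam_v1 * (M1 - 1.0) + lam_v2 * (M2 - 1.0)) / alpha

# variance-loading form obtained from the normal approximation in part d)
ES = lam_v1 * gamma_ / c + lam_v2 / rate2
VarS = lam_v1 * gamma_ * (gamma_ + 1) / c ** 2 + lam_v2 * 2.0 / rate2 ** 2
pi_normal = ES + alpha / 2.0 * VarS
```

Both prices come out at roughly 2 203 000, a loading of about 0.15% over the pure risk premium of 2 200 000.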

Solution 9.2 Value-at-Risk and Expected Shortfall

a) Since S ~ LN(µ, σ²) with µ = 20 and σ² = 0.05, we have

E[S] = exp{µ + σ²/2} ≈ 4.97 · 10^8.

Let z denote the VaR of S − E[S] at security level q = 99.5%. Then, since the distribution function of a log-normal distribution is continuous and strictly increasing, z is defined via the equation

P[S − E[S] ≤ z] = q.

By writing Φ for the distribution function of a standard Gaussian distribution, we can calculate z as follows:

P[S − E[S] ≤ z] = q ⟺ P[S ≤ z + E[S]] = q
⟺ P[(log S − µ)/σ ≤ (log(z + E[S]) − µ)/σ] = q
⟺ Φ((log(z + E[S]) − µ)/σ) = q
⟺ log(z + E[S]) = µ + σ Φ^{−1}(q).

For q = 99.5%, we have Φ^{−1}(q) ≈ 2.576. Thus, we get

z = exp{µ + σ Φ^{−1}(q)} − E[S] = exp{µ}(exp{σ Φ^{−1}(q)} − exp{σ²/2}) ≈ 3.66 · 10^8.

In particular, π_CoC is then given by

π_CoC = E[S] + r_CoC · z.

Note that we have

(π_CoC − E[S])/E[S] = r_CoC · z/E[S] ≈ 2.64%.

Thus, the loading π_CoC − E[S] is given by approximately 2.64% of the pure risk premium.

b) For all u ∈ (0, 1), let VaR_u and ES_u denote the VaR risk measure and the expected shortfall risk measure, respectively, at security level u. Note that actually in part a) we found that

VaR_u(S − E[S]) = exp{µ + σ Φ^{−1}(u)} − E[S]

and that by a similar computation we get

VaR_u(S) = exp{µ + σ Φ^{−1}(u)},

for all u ∈ (0, 1). In particular, we have

VaR_u(S − E[S]) + E[S] = VaR_u(S)

for all u ∈ (0, 1). Since the distribution function of S is continuous and strictly increasing, according to Example 6.26 of the lecture notes we have

ES_q(S − E[S]) = E[S − E[S] | S − E[S] ≥ VaR_q(S − E[S])] = E[S − E[S] | S ≥ VaR_q(S)]
= E[S | S ≥ VaR_q(S)] − E[S] = ES_q(S) − E[S].

By definition of the mean excess function e_S(·) of S we have

ES_q(S) = E[S − VaR_q(S) | S ≥ VaR_q(S)] + VaR_q(S) = e_S[VaR_q(S)] + VaR_q(S).

Moreover, according to the formula given in Chapter 1 of the lecture notes, the mean excess function e_S[VaR_q(S)] above level VaR_q(S) is given by

e_S[VaR_q(S)] = E[S] · [1 − Φ((log VaR_q(S) − µ − σ²)/σ)] / [1 − Φ((log VaR_q(S) − µ)/σ)] − VaR_q(S).

Using the formula calculated above for VaR_u(S) with u = q, we get

ES_q(S) = E[S] · [1 − Φ((log VaR_q(S) − µ − σ²)/σ)] / [1 − Φ((log VaR_q(S) − µ)/σ)]
= E[S] · [1 − Φ((µ + σ Φ^{−1}(q) − µ − σ²)/σ)] / [1 − Φ((µ + σ Φ^{−1}(q) − µ)/σ)]
= E[S] · [1 − Φ(Φ^{−1}(q) − σ)] / (1 − q).

In particular, we have found

ES_q(S − E[S]) = E[S] (1 − Φ[Φ^{−1}(q) − σ])/(1 − q) − E[S]
= exp{µ + σ²/2} ((1 − Φ[Φ^{−1}(q) − σ])/(1 − q) − 1).

For q = 99%, we get

ES_q(S − E[S]) ≈ 3.85 · 10^8.

Finally, π_CoC is then given by

π_CoC = E[S] + r_CoC · ES_99%(S − E[S]).

Note that we have

(π_CoC − E[S])/E[S] = r_CoC · ES_99%(S − E[S])/E[S] ≈ 2.8%.

Thus, the loading π_CoC − E[S] is given by approximately 2.8% of the pure risk premium. In particular, the cost-of-capital price in this example is higher using the expected shortfall risk measure at security level 99% than using the VaR risk measure at security level 99.5%.
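The two centred risk measures of parts a) and b) can be evaluated numerically for the log-normal S with µ = 20 and σ² = 0.05. This is a sketch; the standard normal quantile is obtained by bisection rather than by a library routine.

```python
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p):
    lo, hi = -10.0, 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu, sigma2 = 20.0, 0.05
sigma = math.sqrt(sigma2)
ES_mean = math.exp(mu + sigma2 / 2.0)            # E[S]

def var_q(q):   # VaR_q(S) = exp(mu + sigma * Phi^{-1}(q))
    return math.exp(mu + sigma * Phi_inv(q))

def es_q(q):    # ES_q(S) = E[S] / (1 - q) * (1 - Phi(Phi^{-1}(q) - sigma))
    return ES_mean / (1.0 - q) * (1.0 - Phi(Phi_inv(q) - sigma))

v995 = var_q(0.995) - ES_mean    # VaR of the centred loss at 99.5%
e99 = es_q(0.99) - ES_mean       # expected shortfall of the centred loss at 99%

# security level u at which VaR matches ES_99% (part c)
u = Phi((math.log(es_q(0.99)) - mu) / sigma)
```

The run confirms the ordering ES_99%(S − E[S]) > VaR_99.5%(S − E[S]) and a matching VaR security level of roughly 99.6%.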

c) In parts a) and b) we found that VaR_{99.5%}(S − E[S]) < ES_{99%}(S − E[S]). Let q = 99%. Now the goal is to find u ∈ (0, 1) such that

VaR_u(S − E[S]) = ES_q(S − E[S]) ⟺ VaR_u(S) = ES_q(S).

Note that from part b) we know

VaR_u(S) = exp{µ + σ Φ^{-1}(u)} for all u ∈ (0, 1), and ES_q(S) = (1/(1 − q)) E[S] (1 − Φ[Φ^{-1}(q) − σ]).

Hence, we can solve for u to get

u = Φ[(log((1/(1 − q)) E[S] (1 − Φ[Φ^{-1}(q) − σ])) − µ)/σ] ≈ 99.62%.

We conclude that in this example the cost-of-capital price using the VaR risk measure at security level 99.62% is approximately equal to the cost-of-capital price using the expected shortfall risk measure at security level 99%.

d) Since S ∼ LN(µ, σ^2) with µ = 20 and σ^2 = 0.05, and U and V are assumed to be independent, we have U ∼ N(µ, σ^2), V ∼ N(µ, σ^2) and U + V ∼ N(2µ, 2σ^2). Let X ∼ N(µ_X, σ_X^2) for some µ_X ∈ R and σ_X^2 > 0. Then VaR_q(X) can be calculated as

P[X ≤ VaR_q(X)] = q ⟺ P[(X − µ_X)/σ_X ≤ (VaR_q(X) − µ_X)/σ_X] = q ⟺ Φ[(VaR_q(X) − µ_X)/σ_X] = q.

This implies that VaR_q(X) = µ_X + σ_X Φ^{-1}(q). Hence,

VaR_q(U) + VaR_q(V) = µ + σ Φ^{-1}(q) + µ + σ Φ^{-1}(q) = 2µ + 2σ Φ^{-1}(q)

and

VaR_q(U + V) = 2µ + √2 σ Φ^{-1}(q).

Since Φ^{-1}(0.45) ≈ −0.126 and Φ^{-1}(0.55) ≈ 0.126, we get

VaR_{0.45}(U + V) > VaR_{0.45}(U) + VaR_{0.45}(V) and VaR_{0.55}(U + V) < VaR_{0.55}(U) + VaR_{0.55}(V).

Note that since

VaR_q(U + V) > VaR_q(U) + VaR_q(V) ⟺ √2 Φ^{-1}(q) > 2 Φ^{-1}(q) ⟺ Φ^{-1}(q) < 0,

one can see that in this example VaR_q(U + V) > VaR_q(U) + VaR_q(V) for all q ∈ (0, 1/2), and VaR_q(U + V) < VaR_q(U) + VaR_q(V) for all q ∈ (1/2, 1).
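The sign argument of part d) is easy to confirm numerically; the only thing that matters is the sign of Φ^{-1}(q). The sketch below uses µ = 0 and σ^2 = 1 purely for illustration (the exercise's values work equally well):

```python
from statistics import NormalDist

PHI_INV = NormalDist().inv_cdf

def var_gauss(mean, variance, q):
    # VaR_q(X) = mean + sqrt(variance) * Phi^{-1}(q) for X ~ N(mean, variance)
    return mean + variance ** 0.5 * PHI_INV(q)

mu, sigma2 = 0.0, 1.0  # illustrative values; U, V iid N(mu, sigma2)

def var_sum(q):
    # U + V ~ N(2 mu, 2 sigma2)
    return var_gauss(2 * mu, 2 * sigma2, q)

def sum_of_vars(q):
    return 2 * var_gauss(mu, sigma2, q)
```

For q = 0.45 (where Φ^{-1}(q) ≈ −0.126 < 0) the VaR of the sum exceeds the sum of the individual VaRs, so VaR fails to be subadditive there, while for q = 0.55 the inequality flips; at q = 1/2 both sides coincide.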

Solution 9.3 Esscher Premium

a) Let α ∈ (0, r_0), and let M_S' and M_S'' denote the first and second derivative of M_S, respectively. According to the proof of Corollary 6.6 of the lecture notes, the Esscher premium π_α can be written as

π_α = M_S'(α) / M_S(α).

Hence, the derivative of π_α can be calculated as

(d/dα) π_α = M_S''(α)/M_S(α) − [M_S'(α)/M_S(α)]^2
= E[S^2 exp{αS}]/M_S(α) − (E[S exp{αS}]/M_S(α))^2
= ∫ x^2 exp{αx} dF(x) / M_S(α) − [∫ x exp{αx} dF(x) / M_S(α)]^2
= ∫ x^2 dF_α(x) − [∫ x dF_α(x)]^2,

where we define the distribution function F_α by

F_α(s) = (1/M_S(α)) ∫_{(−∞, s]} exp{αx} dF(x)

for all s ∈ R. Let X be a random variable with distribution function F_α. Then we get

(d/dα) π_α = ∫ x^2 dF_α(x) − [∫ x dF_α(x)]^2 = E[X^2] − E[X]^2 = Var(X) ≥ 0.

Hence, the Esscher premium π_α is always non-decreasing in α. Moreover, if S is non-deterministic, then X is non-deterministic as well. Thus, in this case we get

(d/dα) π_α = Var(X) > 0.

In particular, the Esscher premium π_α then is strictly increasing in α.

b) Let α ∈ (0, r_0). According to Corollary 6.6 of the lecture notes, the Esscher premium π_α is given by

π_α = (d/dr) log M_S(r) |_{r=α}.

For small values of α, we can use a first-order Taylor approximation around 0 to get

π_α ≈ (d/dr) log M_S(r) |_{r=0} + α (d^2/dr^2) log M_S(r) |_{r=0}
= M_S'(0)/M_S(0) + α [M_S''(0)/M_S(0) − (M_S'(0)/M_S(0))^2]
= E[S] + α (E[S^2] − E[S]^2)
= E[S] + α Var(S).

We conclude that for small values of α, the Esscher premium π_α of S is approximately equal to a premium resulting from a variance loading principle.

c) Since S ∼ CompPoi(λv, G), we can use Proposition 2.1 of the lecture notes to get

log M_S(r) = λv [M_G(r) − 1],

where M_G denotes the moment generating function of a random variable with distribution function G. Since G is the distribution function of a gamma distribution with shape parameter γ > 0 and scale parameter c > 0, we have

M_G(r) = (c/(c − r))^γ

for all r < c. In particular, M_S(r) is also defined for all r < c, which implies that the Esscher premium π_α exists for all α ∈ (0, c). Now let α ∈ (0, c). Then the Esscher premium π_α can be calculated as

π_α = (d/dr) log M_S(r) |_{r=α}
= (d/dr) λv [(c/(c − r))^γ − 1] |_{r=α}
= λv γ (c/(c − r))^{γ−1} · c/(c − r)^2 |_{r=α}
= λv (γ/c) (c/(c − α))^{γ+1}.

Note that since c > c − α and γ > 0, we have

(c/(c − α))^{γ+1} > 1,

and, thus,

π_α = λv (γ/c) (c/(c − α))^{γ+1} > λv γ/c = E[S].
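Both the closed form of c) and the variance-loading approximation of b) can be verified numerically: the closed form should match a finite-difference derivative of log M_S(r), exceed the pure risk premium E[S] = λvγ/c, and, for small α, be close to E[S] + α Var(S) with Var(S) = λv γ(γ + 1)/c^2. The parameter values below (λv = 10, γ = 2, c = 1.5) are hypothetical and serve only as an illustration:

```python
import math

lam_v, gamma_, c = 10.0, 2.0, 1.5  # hypothetical lambda*v, gamma, c

def log_ms(r):
    # log M_S(r) = lambda v (M_G(r) - 1) with M_G(r) = (c / (c - r))^gamma, r < c
    return lam_v * ((c / (c - r)) ** gamma_ - 1.0)

def esscher_premium(alpha):
    # pi_alpha = lambda v (gamma / c) (c / (c - alpha))^(gamma + 1), 0 < alpha < c
    return lam_v * (gamma_ / c) * (c / (c - alpha)) ** (gamma_ + 1)

mean_s = lam_v * gamma_ / c                     # E[S] = lambda v gamma / c
var_s = lam_v * gamma_ * (gamma_ + 1) / c ** 2  # Var(S) = lambda v E[Y^2]

alpha, h = 0.4, 1e-6
# central difference for d/dr log M_S(r) at r = alpha
numeric = (log_ms(alpha + h) - log_ms(alpha - h)) / (2 * h)
```

The finite difference reproduces the closed form to high accuracy, the premium carries a strictly positive loading, and for α = 10^{-3} the variance-loading approximation is already very tight.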

ETH Zürich, D-MATH
HS 2017
Prof. Dr. Mario V. Wüthrich
Coordinator A. Gabrielli

Non-Life Insurance: Mathematics and Statistics
Solution sheet 10
Updated: November 30, 2017

Solution 10.1 Tariffication Methods

In this exercise we work with K = 2 tariff criteria. The first criterion (vehicle type) has I = 3 risk characteristics: χ_{1,1} (passenger car), χ_{1,2} (delivery van) and χ_{1,3} (truck). The second criterion (driver age) has J = 4 risk characteristics: χ_{2,1} (21-30 years), χ_{2,2} (31-40 years), χ_{2,3} (41-50 years) and χ_{2,4} (51-60 years). The claim amounts S_{i,j} for the risk classes (i, j), 1 ≤ i ≤ 3, 1 ≤ j ≤ 4, are given on the exercise sheet. We work with a multiplicative tariff structure. In particular, we use the model

E[S_{i,j}] = v_{i,j} µ χ_{1,i} χ_{2,j}, for all 1 ≤ i ≤ 3, 1 ≤ j ≤ 4,

where we set the number of policies v_{i,j} = 1. Moreover, in order to get a unique solution, we set µ = 1 and χ_{1,1} = 1. Therefore, it remains to find the risk characteristics χ_{1,2}, χ_{1,3}, χ_{2,1}, χ_{2,2}, χ_{2,3}, χ_{2,4}.

a) In the method of Bailey & Simon, these risk characteristics are found by minimizing

X^2 = Σ_{i=1}^{I} Σ_{j=1}^{J} (S_{i,j} − v_{i,j} µ χ_{1,i} χ_{2,j})^2 / (v_{i,j} µ χ_{1,i} χ_{2,j})
    = Σ_{i=1}^{3} Σ_{j=1}^{4} (S_{i,j} − χ_{1,i} χ_{2,j})^2 / (χ_{1,i} χ_{2,j}).

Let i ∈ {2, 3}. Then χ̂_{1,i} is found as the solution of

0 = ∂X^2/∂χ_{1,i} = ∂/∂χ_{1,i} Σ_{j=1}^{4} (S_{i,j} − χ_{1,i} χ_{2,j})^2 / (χ_{1,i} χ_{2,j}).

For each j, the quotient rule gives

∂/∂χ_{1,i} (S_{i,j} − χ_{1,i} χ_{2,j})^2 / (χ_{1,i} χ_{2,j})
= [−2 (S_{i,j} − χ_{1,i} χ_{2,j}) χ_{2,j} · χ_{1,i} χ_{2,j} − (S_{i,j} − χ_{1,i} χ_{2,j})^2 χ_{2,j}] / (χ_{1,i} χ_{2,j})^2
= −χ_{2,j} (S_{i,j} − χ_{1,i} χ_{2,j}) (S_{i,j} + χ_{1,i} χ_{2,j}) / (χ_{1,i} χ_{2,j})^2
= χ_{2,j} − S_{i,j}^2 / (χ_{1,i}^2 χ_{2,j}).

Hence, 0 = Σ_{j=1}^{4} [χ_{2,j} − S_{i,j}^2 / (χ_{1,i}^2 χ_{2,j})], i.e. χ_{1,i}^2 Σ_{j=1}^{4} χ_{2,j} = Σ_{j=1}^{4} S_{i,j}^2 / χ_{2,j}. Thus, we get

χ̂_{1,i} = ( Σ_{j=1}^{4} S_{i,j}^2 / χ̂_{2,j} / Σ_{j=1}^{4} χ̂_{2,j} )^{1/2}.

By an analogous calculation, one finds

χ̂_{2,j} = ( Σ_{i=1}^{3} S_{i,j}^2 / χ̂_{1,i} / Σ_{i=1}^{3} χ̂_{1,i} )^{1/2}, for j ∈ {1, 2, 3, 4}.

For solving these equations, one has to apply a root-finding algorithm like, for example, the Newton-Raphson method. We get the following multiplicative tariff structure:

(Table: fitted risk characteristics χ̂_{1,i} for passenger car, delivery van and truck, and χ̂_{2,j} for the age classes 21-30y, 31-40y, 41-50y and 51-60y.)

We see that the risk characteristics for the classes passenger car and delivery van are close to each other, whereas for trucks we have a higher tariff. Moreover, an insured with age in the class 21-30 years gets a considerably higher tariff than an insured with age in the class 31-40 years. The smallest tariff is assigned to insureds with age in the classes 41-50 years and 51-60 years. Note that we have

Σ_{i=1}^{3} Σ_{j=1}^{4} χ̂_{1,i} χ̂_{2,j} > Σ_{i=1}^{3} Σ_{j=1}^{4} S_{i,j},

which confirms the (systematic) positive bias of the method of Bailey & Simon shown in Lemma 7.2 of the lecture notes.

b) In the method of Bailey & Jung, which is also called the method of marginal totals, the risk characteristics χ_{1,2}, χ_{1,3}, χ_{2,1}, χ_{2,2}, χ_{2,3}, χ_{2,4} are found by solving the equations

Σ_{j=1}^{J} v_{i,j} µ χ_{1,i} χ_{2,j} = Σ_{j=1}^{J} S_{i,j}, for all i,
Σ_{i=1}^{I} v_{i,j} µ χ_{1,i} χ_{2,j} = Σ_{i=1}^{I} S_{i,j}, for all j.

Since I = 3, J = 4 and we work with v_{i,j} = 1 and set µ = 1, we get the equations

Σ_{j=1}^{4} χ_{1,i} χ_{2,j} = Σ_{j=1}^{4} S_{i,j} and Σ_{i=1}^{3} χ_{1,i} χ_{2,j} = Σ_{i=1}^{3} S_{i,j}.

Thus, for i ∈ {2, 3} and j ∈ {1, 2, 3, 4}, we get

χ̂_{1,i} = Σ_{j=1}^{4} S_{i,j} / Σ_{j=1}^{4} χ̂_{2,j},
χ̂_{2,j} = Σ_{i=1}^{3} S_{i,j} / Σ_{i=1}^{3} χ̂_{1,i}.

Analogously to the method of Bailey & Simon, one has to solve this system of equations using a root-finding algorithm. We get the following multiplicative tariff structure:

(Table: fitted risk characteristics χ̂_{1,i} for passenger car, delivery van and truck, and χ̂_{2,j} for the age classes 21-30y, 31-40y, 41-50y and 51-60y.)

We see that the results are very close to those in part a), where we applied the method of Bailey & Simon. However, now we have

Σ_{i=1}^{3} Σ_{j=1}^{4} χ̂_{1,i} χ̂_{2,j} = Σ_{i=1}^{3} Σ_{j=1}^{4} S_{i,j},

which comes as no surprise, as we fitted the risk characteristics such that the above equality holds true.

c) In the log-linear regression model we work with the stochastic model

X_{i,j} := log(S_{i,j}/v_{i,j}) = log S_{i,j} ∼ N(β_0 + β_{1,i} + β_{2,j}, σ^2),

where β_0, β_{1,i}, β_{2,j} ∈ R and σ^2 > 0, for all risk classes (i, j), 1 ≤ i ≤ 3, 1 ≤ j ≤ 4. The risk characteristics of the two tariff criteria vehicle type and driver age are now given by

β_{1,1} (passenger car), β_{1,2} (delivery van) and β_{1,3} (truck),

and

β_{2,1} (21-30 years), β_{2,2} (31-40 years), β_{2,3} (41-50 years) and β_{2,4} (51-60 years).

In order to get a unique solution, we set β_{1,1} = β_{2,1} = 0. Because this will simplify notation considerably, we write X = (X_1, ..., X_M) with M = 12 and

X_1 = X_{1,1}, X_2 = X_{1,2}, X_3 = X_{1,3}, X_4 = X_{1,4}, X_5 = X_{2,1}, X_6 = X_{2,2}, X_7 = X_{2,3}, X_8 = X_{2,4}, X_9 = X_{3,1}, X_10 = X_{3,2}, X_11 = X_{3,3}, X_12 = X_{3,4}.

Moreover, we define β = (β_0, β_{1,2}, β_{1,3}, β_{2,2}, β_{2,3}, β_{2,4}) ∈ R^{r+1}, where r = 5. Then, we assume that X has a multivariate Gaussian distribution

X ∼ N(Zβ, σ^2 I),

where I ∈ R^{M×M} denotes the identity matrix and Z ∈ R^{M×(r+1)} is the so-called design matrix that satisfies E[X] = Zβ. For example, for m = 1 we have

E[X_m] = E[X_1] = E[X_{1,1}] = β_0 + β_{1,1} + β_{2,1} = β_0 = (1, 0, 0, 0, 0, 0) β,

and for m = 8

E[X_m] = E[X_8] = E[X_{2,4}] = β_0 + β_{1,2} + β_{2,4} = (1, 1, 0, 0, 0, 1) β.

Doing this for all m ∈ {1, ..., 12}, we find the design matrix Z:

intercept (β_0)  van (β_{1,2})  truck (β_{1,3})  31-40y (β_{2,2})  41-50y (β_{2,3})  51-60y (β_{2,4})
1 0 0 0 0 0
1 0 0 1 0 0
1 0 0 0 1 0
1 0 0 0 0 1
1 1 0 0 0 0
1 1 0 1 0 0
1 1 0 0 1 0
1 1 0 0 0 1
1 0 1 0 0 0
1 0 1 1 0 0
1 0 1 0 1 0
1 0 1 0 0 1

Here we would like to point out that we can also use R to find the design matrix, see the R-Code of Exercise 10.2. According to formula (7.9) of the lecture notes, the MLE β̂^MLE of the parameter vector β is given by

β̂^MLE = [Z^T (σ^2 I)^{-1} Z]^{-1} Z^T (σ^2 I)^{-1} X = (Z^T Z)^{-1} Z^T X.

Note that β̂^MLE does not depend on σ^2. Moreover, the design matrix Z has full column rank and, thus, Z^T Z is indeed invertible. See the R-Code given at the end of the solution to this exercise for the calculation of β̂^MLE. We get the following tariff structure:

(Table: fitted values β̂_0, β̂_{1,i} for passenger car, delivery van and truck, and β̂_{2,j} for the age classes 31-40y, 41-50y and 51-60y.)

We see that the results are very close to those in parts a) and b), where we applied the method of Bailey & Simon and the method of Bailey & Jung. However, since we are now working in a stochastic framework, we also get standard errors and we can make statements about the statistical significance of the parameters. According to the R-output, we get the following p-values for the individual parameters:

(Table: p-values for β̂_0, β̂_{1,2}, β̂_{1,3}, β̂_{2,2}, β̂_{2,3}, β̂_{2,4}.)

R gets these p-values by applying a t-test individually to each parameter, testing whether it is equal to zero. While the p-values for β̂_0, β̂_{1,3}, β̂_{2,2}, β̂_{2,3}, β̂_{2,4} are smaller than 0.05 and, thus, these parameters are significantly different from zero, the p-value of β̂_{1,2} (delivery van) is fairly high. Hence, we might question whether we really need the class delivery van. In order to check whether there is statistical evidence that the classification into different types of vehicles could be omitted, we define the null hypothesis of the reduced model:

H_0: β_{1,2} = β_{1,3} = 0,

i.e. we set p = 2 parameters equal to 0. Then we can perform the same analysis as above to get the MLE β̂^MLE_{H_0}. In particular, let Z_{H_0} be the design matrix Z without the second column (van, β_{1,2}) and the third column (truck, β_{1,3}).
Then β̂^MLE_{H_0} is given by

β̂^MLE_{H_0} = (Z_{H_0}^T Z_{H_0})^{-1} Z_{H_0}^T X.
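Both least-squares estimators can be reproduced outside of R with nothing more than the normal equations. The Python sketch below is illustrative only: the helper names (solve_linear, lstsq) are made up for this example, and the claim amounts are hypothetical stand-ins for the data on the exercise sheet. It also evaluates the residual sums of squares that enter the F-test of the solution:

```python
import math

def solve_linear(A, b):
    # Gaussian elimination with partial pivoting for A x = b
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def lstsq(Z, y):
    # beta = (Z'Z)^{-1} Z'y and the fitted values Z beta
    n, p = len(Z), len(Z[0])
    ZtZ = [[sum(Z[m][a] * Z[m][b] for m in range(n)) for b in range(p)] for a in range(p)]
    Zty = [sum(Z[m][a] * y[m] for m in range(n)) for a in range(p)]
    beta = solve_linear(ZtZ, Zty)
    return beta, [sum(Z[m][a] * beta[a] for a in range(p)) for m in range(n)]

# hypothetical claim amounts S[i][j] (rows: car, van, truck; columns: four age classes)
S = [[2000.0, 1800.0, 1500.0, 1600.0],
     [2200.0, 1600.0, 1400.0, 1400.0],
     [2500.0, 2000.0, 1700.0, 1600.0]]
y = [math.log(s) for row in S for s in row]          # X_m = log S_{i,j}
Z = [[1.0, float(i == 1), float(i == 2), float(j == 1), float(j == 2), float(j == 3)]
     for i in range(3) for j in range(4)]            # full design matrix, M = 12, r + 1 = 6
Z_h0 = [[row[0]] + row[3:] for row in Z]             # drop the van and truck columns

beta_full, fit_full = lstsq(Z, y)
beta_h0, fit_h0 = lstsq(Z_h0, y)
ss_full = sum((yi - fi) ** 2 for yi, fi in zip(y, fit_full))
ss_h0 = sum((yi - fi) ** 2 for yi, fi in zip(y, fit_h0))
T = ((ss_h0 - ss_full) / 2) / (ss_full / 6)          # F(2, 6)-distributed under H_0
```

The residuals of the full fit are orthogonal to every column of Z, and the nested model can never achieve a smaller residual sum of squares, which is exactly what makes T non-negative.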

See the R-Code given below for the calculation of β̂^MLE_{H_0}. Now, for all m ∈ {1, ..., 12}, we define the fitted value X̂_m^{full} of the full model and the fitted value X̂_m^{H_0} of the reduced model. In particular, we have

X̂_m^{full} = [Z β̂^MLE]_m and X̂_m^{H_0} = [Z_{H_0} β̂^MLE_{H_0}]_m,

where [·]_m denotes the m-th element of the corresponding vector, for all m ∈ {1, ..., 12}. Moreover, we define

SS_err^{full} = Σ_{m=1}^{M} (X_m − X̂_m^{full})^2 and SS_err^{H_0} = Σ_{m=1}^{M} (X_m − X̂_m^{H_0})^2.

According to formula (7.5) of the lecture notes, the test statistic

T = [(SS_err^{H_0} − SS_err^{full}) / p] / [SS_err^{full} / (M − (r + 1))] = 3 (SS_err^{H_0} − SS_err^{full}) / SS_err^{full}

has an F-distribution with degrees of freedom given by df_1 = p = 2 and df_2 = M − (r + 1) = 6. See the R-Code below for the calculation of T. We get T ≈ 8.336, which corresponds to a p-value of approximately 1.85%. Thus, we can reject H_0 at a significance level of 5%, i.e. there is no statistical evidence that the classification into different types of vehicles could be omitted.

d) As we already mentioned above, the method of Bailey & Simon, the method of Bailey & Jung and the MLE method in the log-linear regression model all lead to approximately the same results. The only differences are that with the method of Bailey & Jung we get coinciding marginal totals, and that with the log-linear regression model we are in a stochastic framework which allows for calculating parameter uncertainties and for hypothesis testing.

### c)

### We apply the log-linear regression method to the observed claim amounts given on the exercise sheet

### Load the observed claim amounts into a matrix
S <- matrix(c(2000, 2200, 2500, 1800, 1600, 2000, 1500, 1400, 1700, 1600, 1400, 1600), nrow = 3)

### Define the design matrix Z
Z <- matrix(c(rep(1, 12),
              rep(0, 4), rep(1, 4), rep(0, 4),
              rep(0, 8), rep(1, 4),
              rep(c(0, 1, 0, 0), 3),
              rep(c(0, 0, 1, 0), 3),
              rep(c(0, 0, 0, 1), 3)), nrow = 12)

### Store the design matrix Z (without the intercept term) and the dependent variable log(S_{i,j}) in one dataset

data <- cbind(Z[, -1], matrix(log(t(S)), nrow = 12))
data <- as.data.frame(data)
colnames(data) <- c("van", "truck", "X31_40y", "X41_50y", "X51_60y", "observation")

### Apply the regression model
linear.model <- lm(formula = observation ~ van + truck + X31_40y + X41_50y + X51_60y, data = data)

### Print the output of the regression model
summary(linear.model)

### Fitted values
fitted(linear.model)

### We can also get the parameters by applying formula (7.9) of the lecture notes
solve(t(Z) %*% Z) %*% t(Z) %*% matrix(log(t(S)), nrow = 12)

### Apply the regression model under H_0
linear.model2 <- lm(formula = observation ~ X31_40y + X41_50y + X51_60y, data = data)

### Calculation of the test statistic T, which has an F-distribution
T <- 3 * (sum((fitted(linear.model2) - data[, 6])^2) - sum((fitted(linear.model) - data[, 6])^2)) / sum((fitted(linear.model) - data[, 6])^2)

### Calculation of the corresponding p-value
pf(T, 2, 6, lower.tail = FALSE)

Note that we could also define the covariates to be of factor type in R, which automatically implies that these covariates are of categorical type, and R then chooses the design matrix Z accordingly; see the R-Code for the solution of Exercise 10.2 given below.

Solution 10.2 Tariffication Methods

a) In this exercise we work with K = 3 tariff criteria. The first criterion (vehicle class) has 2 risk characteristics: β_{1,1} (weight over 60 kg and more than two gears) and β_{1,2} (other). The second criterion (vehicle age) also has 2 risk characteristics: β_{2,1} (at most one year) and β_{2,2} (more than one year). The third criterion (geographic zone) has 3 risk characteristics: β_{3,1} (large cities), β_{3,2} (middle-sized towns) and β_{3,3} (smaller towns and countryside).

The observed numbers of claims N_{l_1,l_2,l_3}, the observed volumes v_{l_1,l_2,l_3} and the observed claim frequencies λ_{l_1,l_2,l_3} = N_{l_1,l_2,l_3} / v_{l_1,l_2,l_3} for the risk classes (l_1, l_2, l_3), 1 ≤ l_1 ≤ 2, 1 ≤ l_2 ≤ 2, 1 ≤ l_3 ≤ 3, are given on the exercise sheet. Now, for modelling purposes, we assume that all N_{l_1,l_2,l_3} are independent with

N_{l_1,l_2,l_3} ∼ Poi(λ_{l_1,l_2,l_3} v_{l_1,l_2,l_3})

and define

X_{l_1,l_2,l_3} = N_{l_1,l_2,l_3} / v_{l_1,l_2,l_3}.

Then we use the model Ansatz

g(λ_{l_1,l_2,l_3}) = g(E[N_{l_1,l_2,l_3} / v_{l_1,l_2,l_3}]) = g(E[X_{l_1,l_2,l_3}]) = β_0 + β_{1,l_1} + β_{2,l_2} + β_{3,l_3},

where β_0 ∈ R and where we use the log-link function, i.e. g(·) = log(·). In order to get a unique solution, we set β_{1,1} = β_{2,1} = β_{3,1} = 0. Moreover, we define β = (β_0, β_{1,2}, β_{2,2}, β_{3,2}, β_{3,3}) ∈ R^{r+1}, where r = 4. Similarly as in Exercise 10.1 c), we relabel the risk classes with the index m ∈ {1, ..., M}, where M = 2 · 2 · 3 = 12, define X = (X_1, ..., X_M) and the design matrix Z ∈ R^{M×(r+1)} that satisfies

log E[X] = Zβ,

where the logarithm is applied componentwise to E[X]. Let m ∈ {1, ..., 12}. According to Example 7.10 of the lecture notes, X_m = N_m / v_m belongs to the exponential dispersion family with cumulant function b(·) = exp{·}, θ_m = log λ_m, w_m = v_m and dispersion parameter φ = 1, i.e. we have

[Zβ]_m = log E[X_m] = log E[N_m / v_m] = log λ_m = θ_m,

where [Zβ]_m denotes, as above, the m-th element of the vector Zβ. Thus, we assume that X_1, ..., X_M are independent with

X_m ∼ EDF(θ_m = [Zβ]_m, φ = 1, w_m = v_m, b(·) = exp{·}),

for all m ∈ {1, ..., M}. According to Proposition 7.1 of the lecture notes, the MLE β̂^MLE of β is the solution of

Z^T V exp{Zβ} = Z^T V X,

where the weight matrix V is given by V = diag(v_1, ..., v_M). This equation has to be solved numerically. See the R-Code at the end of the solution to this exercise for the calculation of β̂^MLE. We get the following estimates:

(Table: fitted values β̂_0, β̂_{1,2}, β̂_{2,2}, β̂_{3,2}, β̂_{3,3}.)

We observe that insureds with a vehicle with weight over 60 kg and more than two gears tend to cause more claims than insureds with other vehicles.
Analogously, if the vehicle is at most one year old, we expect more claims than if it were older. Regarding the geographic zone, we see that driving in middle-sized towns leads to fewer claims than driving in large cities. Moreover, driving in smaller towns and in the countryside leads to even fewer claims than driving in middle-sized towns, where this difference is greater than the difference between large cities and middle-sized towns.
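The MLE equation Z^T V exp{Zβ} = Z^T V X is straightforward to solve with Newton's method. The toy Python sketch below does this for a single binary covariate; all data values are hypothetical and chosen only to illustrate the mechanics of solving the score equation:

```python
import math

# hypothetical data: volumes v, binary covariate z, observed frequencies x = N / v
v = [100.0, 150.0, 120.0, 80.0]
z = [0.0, 0.0, 1.0, 1.0]
x = [0.05, 0.07, 0.11, 0.09]

def score(b0, b1):
    # components of Z' V (exp{Z beta} - X), which must vanish at the MLE
    s0 = sum(vm * (math.exp(b0 + b1 * zm) - xm) for vm, zm, xm in zip(v, z, x))
    s1 = sum(vm * zm * (math.exp(b0 + b1 * zm) - xm) for vm, zm, xm in zip(v, z, x))
    return s0, s1

# Newton iterations with Hessian Z' diag(v exp{Z beta}) Z
b0 = math.log(sum(vm * xm for vm, xm in zip(v, x)) / sum(v))  # start at overall log-frequency
b1 = 0.0
for _ in range(50):
    s0, s1 = score(b0, b1)
    w = [vm * math.exp(b0 + b1 * zm) for vm, zm in zip(v, z)]
    h00 = sum(w)
    h01 = sum(wi * zi for wi, zi in zip(w, z))
    h11 = sum(wi * zi * zi for wi, zi in zip(w, z))
    det = h00 * h11 - h01 * h01
    b0 -= (h11 * s0 - h01 * s1) / det
    b1 -= (h00 * s1 - h01 * s0) / det
```

Because the covariate is binary, the fitted frequencies are simply the volume-weighted average frequencies of the two groups, which gives a closed-form check of the Newton solution.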

b) The observed and the fitted claim frequencies against the vehicle class, the vehicle age and the geographic zone look as follows:

(Figure: observed and fitted claim frequencies by vehicle class, vehicle age and geographic zone.)


More information

Actuarial Science Exam 1/P

Actuarial Science Exam 1/P Actuarial Science Exam /P Ville A. Satopää December 5, 2009 Contents Review of Algebra and Calculus 2 2 Basic Probability Concepts 3 3 Conditional Probability and Independence 4 4 Combinatorial Principles,

More information

Optimal Reinsurance Strategy with Bivariate Pareto Risks

Optimal Reinsurance Strategy with Bivariate Pareto Risks University of Wisconsin Milwaukee UWM Digital Commons Theses and Dissertations May 2014 Optimal Reinsurance Strategy with Bivariate Pareto Risks Evelyn Susanne Gaus University of Wisconsin-Milwaukee Follow

More information

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 1: Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

More information

Practice Examination # 3

Practice Examination # 3 Practice Examination # 3 Sta 23: Probability December 13, 212 This is a closed-book exam so do not refer to your notes, the text, or any other books (please put them on the floor). You may use a single

More information

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr.

Topic 2: Probability & Distributions. Road Map Probability & Distributions. ECO220Y5Y: Quantitative Methods in Economics. Dr. Topic 2: Probability & Distributions ECO220Y5Y: Quantitative Methods in Economics Dr. Nick Zammit University of Toronto Department of Economics Room KN3272 n.zammit utoronto.ca November 21, 2017 Dr. Nick

More information

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix)

EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) 1 EC212: Introduction to Econometrics Review Materials (Wooldridge, Appendix) Taisuke Otsu London School of Economics Summer 2018 A.1. Summation operator (Wooldridge, App. A.1) 2 3 Summation operator For

More information

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis.

401 Review. 6. Power analysis for one/two-sample hypothesis tests and for correlation analysis. 401 Review Major topics of the course 1. Univariate analysis 2. Bivariate analysis 3. Simple linear regression 4. Linear algebra 5. Multiple regression analysis Major analysis methods 1. Graphical analysis

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

STA2603/205/1/2014 /2014. ry II. Tutorial letter 205/1/

STA2603/205/1/2014 /2014. ry II. Tutorial letter 205/1/ STA263/25//24 Tutorial letter 25// /24 Distribution Theor ry II STA263 Semester Department of Statistics CONTENTS: Examination preparation tutorial letterr Solutions to Assignment 6 2 Dear Student, This

More information

Probability and Distributions

Probability and Distributions Probability and Distributions What is a statistical model? A statistical model is a set of assumptions by which the hypothetical population distribution of data is inferred. It is typically postulated

More information

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2

n! (k 1)!(n k)! = F (X) U(0, 1). (x, y) = n(n 1) ( F (y) F (x) ) n 2 Order statistics Ex. 4.1 (*. Let independent variables X 1,..., X n have U(0, 1 distribution. Show that for every x (0, 1, we have P ( X (1 < x 1 and P ( X (n > x 1 as n. Ex. 4.2 (**. By using induction

More information

Modelling the risk process

Modelling the risk process Modelling the risk process Krzysztof Burnecki Hugo Steinhaus Center Wroc law University of Technology www.im.pwr.wroc.pl/ hugo Modelling the risk process 1 Risk process If (Ω, F, P) is a probability space

More information

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf

Qualifying Exam in Probability and Statistics. https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part : Sample Problems for the Elementary Section of Qualifying Exam in Probability and Statistics https://www.soa.org/files/edu/edu-exam-p-sample-quest.pdf Part 2: Sample Problems for the Advanced Section

More information

Brief Review of Probability

Brief Review of Probability Maura Department of Economics and Finance Università Tor Vergata Outline 1 Distribution Functions Quantiles and Modes of a Distribution 2 Example 3 Example 4 Distributions Outline Distribution Functions

More information

Answer Key for STAT 200B HW No. 7

Answer Key for STAT 200B HW No. 7 Answer Key for STAT 200B HW No. 7 May 5, 2007 Problem 2.2 p. 649 Assuming binomial 2-sample model ˆπ =.75, ˆπ 2 =.6. a ˆτ = ˆπ 2 ˆπ =.5. From Ex. 2.5a on page 644: ˆπ ˆπ + ˆπ 2 ˆπ 2.75.25.6.4 = + =.087;

More information

6.1 Moment Generating and Characteristic Functions

6.1 Moment Generating and Characteristic Functions Chapter 6 Limit Theorems The power statistics can mostly be seen when there is a large collection of data points and we are interested in understanding the macro state of the system, e.g., the average,

More information

RMSC 2001 Introduction to Risk Management

RMSC 2001 Introduction to Risk Management RMSC 2001 Introduction to Risk Management Tutorial 4 (2011/12) 1 February 20, 2012 Outline: 1. Failure Time 2. Loss Frequency 3. Loss Severity 4. Aggregate Claim ====================================================

More information

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing

Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing Quantitative Introduction ro Risk and Uncertainty in Business Module 5: Hypothesis Testing M. Vidyasagar Cecil & Ida Green Chair The University of Texas at Dallas Email: M.Vidyasagar@utdallas.edu October

More information

Stat 704 Data Analysis I Probability Review

Stat 704 Data Analysis I Probability Review 1 / 39 Stat 704 Data Analysis I Probability Review Dr. Yen-Yi Ho Department of Statistics, University of South Carolina A.3 Random Variables 2 / 39 def n: A random variable is defined as a function that

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Three hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER.

Three hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER. Three hours To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER EXTREME VALUES AND FINANCIAL RISK Examiner: Answer QUESTION 1, QUESTION

More information

Course 1 Solutions November 2001 Exams

Course 1 Solutions November 2001 Exams Course Solutions November Exams . A For i =,, let R = event that a red ball is drawn form urn i i B = event that a blue ball is drawn from urn i. i Then if x is the number of blue balls in urn, ( R R)

More information

Institute of Actuaries of India

Institute of Actuaries of India Institute of Actuaries of India Subject CT3 Probability & Mathematical Statistics May 2011 Examinations INDICATIVE SOLUTION Introduction The indicative solution has been written by the Examiners with the

More information

RMSC 2001 Introduction to Risk Management

RMSC 2001 Introduction to Risk Management RMSC 2001 Introduction to Risk Management Tutorial 4 (2011/12) 1 February 20, 2012 Outline: 1. Failure Time 2. Loss Frequency 3. Loss Severity 4. Aggregate Claim ====================================================

More information

1 Review of Probability and Distributions

1 Review of Probability and Distributions Random variables. A numerically valued function X of an outcome ω from a sample space Ω X : Ω R : ω X(ω) is called a random variable (r.v.), and usually determined by an experiment. We conventionally denote

More information

Statistical Inference

Statistical Inference Statistical Inference Classical and Bayesian Methods Revision Class for Midterm Exam AMS-UCSC Th Feb 9, 2012 Winter 2012. Session 1 (Revision Class) AMS-132/206 Th Feb 9, 2012 1 / 23 Topics Topics We will

More information

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline.

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline. Practitioner Course: Portfolio Optimization September 10, 2008 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y ) (x,

More information

Stat 451 Lecture Notes Monte Carlo Integration

Stat 451 Lecture Notes Monte Carlo Integration Stat 451 Lecture Notes 06 12 Monte Carlo Integration Ryan Martin UIC www.math.uic.edu/~rgmartin 1 Based on Chapter 6 in Givens & Hoeting, Chapter 23 in Lange, and Chapters 3 4 in Robert & Casella 2 Updated:

More information

Correlation and Regression

Correlation and Regression Correlation and Regression October 25, 2017 STAT 151 Class 9 Slide 1 Outline of Topics 1 Associations 2 Scatter plot 3 Correlation 4 Regression 5 Testing and estimation 6 Goodness-of-fit STAT 151 Class

More information

MFM Practitioner Module: Quantitiative Risk Management. John Dodson. October 14, 2015

MFM Practitioner Module: Quantitiative Risk Management. John Dodson. October 14, 2015 MFM Practitioner Module: Quantitiative Risk Management October 14, 2015 The n-block maxima 1 is a random variable defined as M n max (X 1,..., X n ) for i.i.d. random variables X i with distribution function

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 016 Full points may be obtained for correct answers to eight questions. Each numbered question which may have several parts is worth

More information

Continuous random variables

Continuous random variables Continuous random variables Continuous r.v. s take an uncountably infinite number of possible values. Examples: Heights of people Weights of apples Diameters of bolts Life lengths of light-bulbs We cannot

More information

Limiting Distributions

Limiting Distributions We introduce the mode of convergence for a sequence of random variables, and discuss the convergence in probability and in distribution. The concept of convergence leads us to the two fundamental results

More information

Problems ( ) 1 exp. 2. n! e λ and

Problems ( ) 1 exp. 2. n! e λ and Problems The expressions for the probability mass function of the Poisson(λ) distribution, and the density function of the Normal distribution with mean µ and variance σ 2, may be useful: ( ) 1 exp. 2πσ

More information

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER.

Two hours. To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER. Two hours MATH38181 To be supplied by the Examinations Office: Mathematical Formula Tables and Statistical Tables THE UNIVERSITY OF MANCHESTER EXTREME VALUES AND FINANCIAL RISK Examiner: Answer any FOUR

More information

Chapter 4 Multiple Random Variables

Chapter 4 Multiple Random Variables Review for the previous lecture Theorems and Examples: How to obtain the pmf (pdf) of U = g ( X Y 1 ) and V = g ( X Y) Chapter 4 Multiple Random Variables Chapter 43 Bivariate Transformations Continuous

More information

Analysis methods of heavy-tailed data

Analysis methods of heavy-tailed data Institute of Control Sciences Russian Academy of Sciences, Moscow, Russia February, 13-18, 2006, Bamberg, Germany June, 19-23, 2006, Brest, France May, 14-19, 2007, Trondheim, Norway PhD course Chapter

More information

THE QUEEN S UNIVERSITY OF BELFAST

THE QUEEN S UNIVERSITY OF BELFAST THE QUEEN S UNIVERSITY OF BELFAST 0SOR20 Level 2 Examination Statistics and Operational Research 20 Probability and Distribution Theory Wednesday 4 August 2002 2.30 pm 5.30 pm Examiners { Professor R M

More information

CSE 312 Final Review: Section AA

CSE 312 Final Review: Section AA CSE 312 TAs December 8, 2011 General Information General Information Comprehensive Midterm General Information Comprehensive Midterm Heavily weighted toward material after the midterm Pre-Midterm Material

More information

Ching-Han Hsu, BMES, National Tsing Hua University c 2015 by Ching-Han Hsu, Ph.D., BMIR Lab. = a + b 2. b a. x a b a = 12

Ching-Han Hsu, BMES, National Tsing Hua University c 2015 by Ching-Han Hsu, Ph.D., BMIR Lab. = a + b 2. b a. x a b a = 12 Lecture 5 Continuous Random Variables BMIR Lecture Series in Probability and Statistics Ching-Han Hsu, BMES, National Tsing Hua University c 215 by Ching-Han Hsu, Ph.D., BMIR Lab 5.1 1 Uniform Distribution

More information

1 Degree distributions and data

1 Degree distributions and data 1 Degree distributions and data A great deal of effort is often spent trying to identify what functional form best describes the degree distribution of a network, particularly the upper tail of that distribution.

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Estimation of Quantiles

Estimation of Quantiles 9 Estimation of Quantiles The notion of quantiles was introduced in Section 3.2: recall that a quantile x α for an r.v. X is a constant such that P(X x α )=1 α. (9.1) In this chapter we examine quantiles

More information

Asymptotic Statistics-VI. Changliang Zou

Asymptotic Statistics-VI. Changliang Zou Asymptotic Statistics-VI Changliang Zou Kolmogorov-Smirnov distance Example (Kolmogorov-Smirnov confidence intervals) We know given α (0, 1), there is a well-defined d = d α,n such that, for any continuous

More information

Exam C Solutions Spring 2005

Exam C Solutions Spring 2005 Exam C Solutions Spring 005 Question # The CDF is F( x) = 4 ( + x) Observation (x) F(x) compare to: Maximum difference 0. 0.58 0, 0. 0.58 0.7 0.880 0., 0.4 0.680 0.9 0.93 0.4, 0.6 0.53. 0.949 0.6, 0.8

More information

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015

Part IA Probability. Theorems. Based on lectures by R. Weber Notes taken by Dexter Chua. Lent 2015 Part IA Probability Theorems Based on lectures by R. Weber Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after lectures.

More information

ESTIMATING BIVARIATE TAIL

ESTIMATING BIVARIATE TAIL Elena DI BERNARDINO b joint work with Clémentine PRIEUR a and Véronique MAUME-DESCHAMPS b a LJK, Université Joseph Fourier, Grenoble 1 b Laboratoire SAF, ISFA, Université Lyon 1 Framework Goal: estimating

More information

SGN Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection

SGN Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection SG 21006 Advanced Signal Processing: Lecture 8 Parameter estimation for AR and MA models. Model order selection Ioan Tabus Department of Signal Processing Tampere University of Technology Finland 1 / 28

More information

Extreme Value Theory and Applications

Extreme Value Theory and Applications Extreme Value Theory and Deauville - 04/10/2013 Extreme Value Theory and Introduction Asymptotic behavior of the Sum Extreme (from Latin exter, exterus, being on the outside) : Exceeding the ordinary,

More information

Generalized Linear Models (1/29/13)

Generalized Linear Models (1/29/13) STA613/CBB540: Statistical methods in computational biology Generalized Linear Models (1/29/13) Lecturer: Barbara Engelhardt Scribe: Yangxiaolu Cao When processing discrete data, two commonly used probability

More information