Applying the proportional hazard premium calculation principle Maria de Lourdes Centeno and João Andrade e Silva CEMAPRE, ISEG, Technical University of Lisbon, Rua do Quelhas, 2, 12 781 Lisbon, Portugal lcenteno@iseg.utl.pt April 2, 24 Abstract We discuss the application of the proportional hazard premium calculation principle in the parametric and non parametric framework. In the parametric approach, we propose a method to calculate the premium of a compound risk when the severity distribution is subexponential. In the non parametric approach, the use of the empirical distribution to calculate the premium using the proportional hazard principle leads to a systematic underestimation of the premium. After studying the bias of the premium calculated using this non-parametric approach we use the bootstrap technique with subsampling to reduce it. Keywords: Proportional hazard premium principle; subexponential distributions; bootstrap; subsampling; 1 Introduction As it is well known the proportional hazard premium principle (PH premium principle), introduced by Wang (1995), satisfies properties which make of it a very attractive premium from the theoretical point of view, see e.g. Wang (1996) and Andrade e Silva and Centeno (1998). However its use depends on the complete knowledge of the distribution of the aggregate claims amount. In practical terms this means that one must fit a distribution to the data set or use the empirical distribution function. When the aggregate claims amount follows a compound distribution the calculation of the PH premium based on the parametric approach can raise some problems. This will be discussed in the following section. The non-parametric approach is more appealing, from the practical point of view, but can lead to a significative underestimation of the premium. In subsection 3.1 we calculate the bias of the premium based on the empirical distribution for some 1
distributions, namely the exponential, Pareto and uniform. We also study the rate of convergency of the bias in the exponential case. In subsection 3.2 we show how to use the bootstrap technique with subsampling to reduce the bias of the estimator of the premium based on the empirical distribution. We also perform two simulation studies to give some insight to this technique. 2 Applying the PH transform to compound distributions in the parametric model Let Y, the aggregate claims amount, be a nonnegative random variable, with distribution function F (y). Let S(y) =1 F (y) be the survival function. The PH premium principle assigns to the distribution F (y) the premium π ρ = Z (S(y)) 1/ρ dy, (1) where ρ, with ρ 1, is the risk aversion index. When using the collective risk model, Y is equal to X 1 +... + X N, where N is thenumberofclaimsoccurredinagivenperiodand{x i } i=1,2,... are the individual claims amounts which are assumed to be nonnegative i.i.d. random variables and independent of N. InthiscaseY has a compound distribution and the company often estimates claim frequency and claim severity separately. Various numerical techniques are available to calculate the compound distribution, being Panjer s recursion formula the most well known. When using the recursion formula we will have to stop somewhere the calculations. When the individual claims amount has distribution G with unlimited support, i.e. G(x) < 1, forallx>, anditisveryskewed,thechoice of the value where to stop the calculations to obtain a reasonable estimate of the premium is a critical aspect of the parametric model. When X follows a heavy tail distribution R (S(y)) 1/ρ dy can be of a significative size even for large values of t. In t this case, when the probability generating function of N is analytic in the neighbourhood of 1 (which happens both for the Poisson and the negative binomial case) and G is subexponential then (see Embrechts et al (1997), pp. 45-46) 1 F (x) E[N](1 G(x)), x. (2) Hence for heavy tail severity distributions we will estimate the premium by π ρ = Z t Z (S (y)) 1/ρ dy +(E[N]) 1/ρ (1 G(y)) 1/ρ dy, (3) where S (y) is the approximation to S(y) obtained by Panjer s recursion formula, and t is a suitable high value. Example 1 Suppose that an actuary has arrived to the following estimates for a risk: the number of claims greater than a given observation point d =$1K is Poisson distributed with mean λ d =6and the size of each claim greater than d is Pareto with 2 t
parameter α =1.647. Let us consider an excess of loss reinsurance treaty. For a layer l xs m, withm d and l +, the aggregate claims amount are compound Poisson with expected number of claims λ =(d/m) α λ d and severity distribution G(x) = ½ 1 m m+x α if x<l 1 if x l. (4) Using 1/ρ =.925, Table 1 shows the PH transform premium, as percentage of the subject earned premium, SEP =$1, K, associate to the following layers: 1. $4K xs $1K 2. $5K xs $5K 3. $9K xs $1K 4. the excess over $1, K This example for the limited layers was considered by Wang (1998). For the calculation of the premium of the unlimited layer we have used (3) with t =$1 5 K, for which the difference between the aggregate survival function S (t) and λ(1 G(t)), is 2.3 1 7. The step used in the arithmetization of the severity distribution to perform the recursion was h =$1K. Table 1: PH- transform premiums Layer l xs m Pure Premium π ρ as percentage of SEP as percentage of SEP $4K xs $1K 6.% 6.384% $5K xs $5K 1.183% 1.48% $9K xs $1K 7.183% 7.742% xs $1, K 2.9% 3.388% The reason to chose such a high value for t in formula (3) is related to the very small value of α. If instead of $1 5 K we had used $1 4 K for t, the result of applying formula (3) with ρ =1would be 2.86% of SEP, compared to a theoretical value of 2.9%. Using t =$1 5 K the two terms in the right hand side of (3) are 1.9848% and.155% for ρ =1and 3.298% and.358% for 1/ρ =.925. Note that the relative weight of the second term increases with ρ and is far from negligible in spite ofthehighvaluesoft. 3
3 Applying the PH transform to the empirical distribution 3.1 The bias of the premium Let F n (y) be the empirical distribution function of Y based on a random sample of size n, (Y 1,...Y n ), let S n (y) be the corresponding empirical survival function and bπ ρ = Z (S n (y)) 1/ρ dy, (5) the premium estimator. Although S n (y) is an unbiased estimator of S(y), bπ ρ given by (5) is a biased estimator of π ρ, since Z h E[bπ ρ ]= E (S n (y)) 1/ρi Z dy [E (S n (y))] 1/ρ dy = π ρ by Jensen inequality, with the inequality being strict unless ρ =1. Let Y k:n be the k-th order statistic in a sample of size n. Then 1 y<y 1:n S n (y) = (n k)/n, Y k:n y<y k+1:n, k =1,..., n 1 y Y n:n, which implies that Xn 1 µ 1/ρ n k bπ ρ = (Y k+1:n Y k:n), (6) n k= where Y :n. Let F k:n (y) be the distribution function of the k-th order statistic, i.e. F k:n (y) = Pr{Y k:n y}. Given that then and integrating we obtain F k:n (y) = r=k 1 F k:n (y) =1 F k+1:n (y) E[Y k+1:n Y k:n ]= Then, using (6) and (7) we get ³ n r (F (y)) r (1 F (y)) n r, Z ³ n k (F (y)) k (1 F (y)) n k ³ n k (F (y)) k (1 F (y)) n k dy. (7) E[bπ ρ ]= Xn 1 k= = n k n 1/ρ n Z (F (y)) k (S(y)) n k dy k n k+1 1/ρ n Z n k 1 4 (F (y)) k 1 (S(y)) n k+1 dy (8)
which is, when F is absolutely continuous for x >, equivalent to E[bπ ρ ]= µ 1/ρ µ n k +1 n n k 1 Z 1 (1 x) k 1 x n k+1 1 dx, (9) f [S 1 (x)] where f(x) =F (x). Expressions (8) and (9) can be used to calculate the bias of bπ ρ B(bπ ρ )=E[bπ ρ ] π ρ. For some distributions (not in the compound case) the bias is easily calculated, as it is the case when Y is exponential, Pareto or uniform, as can be seen in the following examples. Example 2 Let Y be exponential distributed, i.e. S(y) = exp( θy), y> (θ >). In this case f [S 1 (x)] = θx, and using (9) we get, after some calculations, and E[bπ ρ ]=n 1/ρ θ 1 Ã B(bπ ρ )= n 1/ρ k 1/ρ 1! k 1/ρ 1 ρ θ 1. (1) ³ β Example 3 Let Y be Pareto distributed, i.e. S(y) = β+y α,y> (α, β > ). In this case f [S 1 (x)] = αβ 1 x 1+1/α, and using (9) we get, E[bπ ρ ]=βα 1 n 1/ρ n! k 1/ρ Γ(k 1/α) k!γ(n +1 1/α) and " n! B(bπ ρ )=β αn 1/ρ k 1/ρ # Γ(k 1/α) k!γ(n +1 1/α) ρ. (11) α ρ Example 4 When Y is uniformly distributed in (,1) we get E[bπ ρ ]= 1 (n +1)n 1/ρ k 1/ρ and B(bπ ρ )= 1 (n +1)n 1/ρ k 1/ρ ρ ρ +1. 5
For the exponential distribution it is possible to relate the order of the bias with thesamplesize,asstatedintheresultthatfollows. Wewerenotableofdeveloping similar results for other distributions, but we are strongly convinced that the order of the bias is smaller for the uniform and bigger for the Pareto. Result. When Y follows an exponential distribution B(bπ ρ ) converges to zero at the same rate as n 1/ρ. Proof. Noticing that and using (1) we have ρ = n 1/ρ B(bπ ρ ) = θ 1 n 1/ρ Z 1 Z 1 (k x) 1/ρ 1 dx, (k x) 1/ρ 1 k 1/ρ 1 dx. As (k x) 1/ρ 1 is an increasing function of x for k 1 and <x<1, we have that Z 1 (k x) 1/ρ 1 k 1/ρ 1 Z 1 dx (1 x) 1/ρ 1 1 dx (12) + (k 1) 1/ρ 1 k 1/ρ 1 = ρ n 1/ρ 1 ρ. k=2 On the other hand as R 1 (k x) 1/ρ 1 k 1/ρ 1 dx is strictly positive we can concludethatlefthandsideof(12)convergestoapositivevaluea as n goes to infinity. Consequently lim B(bπ ρ) = θ 1 A lim n n n 1/ρ. 3.2 Correcting the bias via bootstrapping As we have seen the distortion of the empirical distribution leads to an underestimation of the premiums. We use the bootstrap technique, see e.g. Efron and Tibshirani (1993), to correct, at least partially, the bias of that estimator. The bootstrap technique consists, as it is well known, on the resampling of the original data set. The bootstrap estimator of the bias of bπ ρ, is \B(bπ ρ )=bπ ρ bπ ρ with bπ ρ = 1 M MX b=1 bπ b ρ 6
Figure 1: B as function of n /n where M is the number of bootstrap samples and bπ b ρ is calculated using the b-th bootstrap sample. Hence the bootstrap estimator of the premium π ρ is bπ ρ = bπ ρ \ B(bπ ρ ). (13) Usually the resampling is made with replacement and using bootstrap samples of size n (i.e. the same size as the original sample). However in this case, as B(bπ ρ ) is always negative, we can improve, in principle, the results by using bootstrap samples of size n <n.figure 1 shows how the bias of the corrected estimator given by (13) varies with the sample size n (measured as a percentage of n). The figure is representative: it starts, when n = n, with a negative value for the mean of the observed bias of the bootstrap premiums, which we denote B n where B, than there is an optimal value of, is almost zero, and becomes positive when n is very small. The question is how to choose n, independently of the distribution. To give some insight to the problem we performed two simulation studies. Simulation 1 This case is based on two families of distributions, the Pareto and the Gamma. In the Pareto case we have considered that the parameter α takes the value 2, 3 or 4, originating what we call Pareto 2,Pareto 3 and Pareto 4 respectively. The other parameter was chosen such that the distribution has mean equal to 1, i.e. we have assumed that the survival function for the Pareto α is µ α α 1 S α (x) =, x >. x + α 1 7
For the gamma distribution we have kept the same mean and assumed that the variance was equal to 1 (exponential) or 2. We have considered that the original sample size is either 1, 5 or 1. Tables 2 and 3 give the bootstrap results for ρ =1.2 and ρ =1.15 respectively, with M = 2 and using 2 replicas. As we have mentioned, the bias of the premium based on the empirical distribution (column labelled B(bπ ρ ), and calculated using expressions (8), (1) and (11), for the gamma, exponential and Pareto distributions) can be of a considerable size, namely for heavy tail distributions when n is small and ρ is high. For instance for the Pareto 2, with n =1and ρ =1.2, the absolute value of the bias is 8.4% of the theoretical premium. The average bias of the premiums based on the 2 replicas (before the bootstrap is performed), labelled B(bπ ρ ), is in those cases still a bit far from B(bπ ρ ), but it was out of our computing facilities to consider more replicas in the simulation study. The following 1 columns show the average bias for the bootstrap premiums. The bootstrap with n = n, only corrects the bias partially, since after that correction we still observe a negative bias in all the situations of our example. As we can see from Tables 2 and 3, the pattern shown in Figure 1 is characteristic of the behaviour ofthebiasasfunctionofn /n. The optimal proportion n /n depends, of course, on the distribution, on n and on ρ. As a rule n /n should increase with n and decrease with ρ. For instance, for ρ =1.2, a bootstrap with a resampling of n /n =4% performs, in average, better than the bootstrap with full resampling (and certainly better than with no bootstrap), at least for the sample sizes considered in the study. The last column of the tables, provides the average of the premiums if we have used the maximum likelihood estimators of the parameters. For the Pareto 2,the figures are not presented because for a significative number of samples (2 for ρ =1.2 and 11 for ρ =1.15) we got a maximum likelihood estimate bα smaller than ρ. In some other situations the estimate was very close to ρ, what would imply an extremely high premium. For the Pareto 3 and Pareto 4 we didn t get any sample with a maximum likelihood estimate smaller than ρ, but we got values not very far from ρ, which explain the positive values of the bias on those cases. The maximum likelihood performs well (in association with the PH premium calculation principle) if the distribution does nothaveaveryheavytail,orifthesamplesizeisverybig. As a conclusion, and for values of ρ around 1.15-1.2 we can say that use of the bootstrap technique with a resampling proportion size of 4% of the original sample size provides good results in general, but when we know that we are dealing with heavy tail loss distributions, we could use a smaller resampling size. Simulation 2 The goal of the second simulation study is to analyse the behaviour of our procedures when the aggregate claims amount are generated according to a compound Poisson distribution. For the individual claims amount we have considered the three Pareto s and the exponential distributions of the former example. The expected number of claims was set at 1. As in the previous simulation study ρ =1.2 and ρ =1.15, M =2,weuse2 replicas and n is equal to 1, 5 or 1. Tables 4 and 5 are similar to tables 2 and 3, with the exceptions of B(bπ ρ ) and ML, which are not considered now. Column labelled π ρ was obtained using (3), with t equal to 1, 926 and 297 for α equal to 2, 3 and 4 respectively, values for which 8
the Pr{X >t} =1 8. For the exponential case we only considered the first term of (3), with t =9. As expected with λ =1the premiums associated to the compound distribution are greater than for the corresponding Pareto s. The results are very similar to the former case. We obtained the same behaviour for the bias as a function of n /n, and again we would recommend a resampling of n /n =4%, butifweknewthatweweredealingwithheavytaillossdistributions, we could use a smaller resampling size. Acknowledgments This research was supported by Fundação para a Ciência e a Tecnologia - FCT/POCTI. We are also grateful to our colleague Raul Brás for assisting us with the computation programs. References Andrade e Silva, J. and Centeno, M. L. (1998). Comparing risk adjusted premiums from the reinsurance point of view. ASTIN Bulletin 28, 221-239. Embrechts, P. Klüppelberg, Mikosch, T. (1997). Modelling extremal events. Springer- Verlag, Berlin. Efron, B. and Tibshirani, R. (1993). An Introduction to the Bootstrap. Chapman and Hall, London. Wang, S. (1995). Insurance pricing and increased limits ratemaking by proportional hazards transforms. Insurance: Mathematics and Economics 17, 43-54. Wang, S. (1996). Premium calculation by transforming the layer premium density, ASTIN Bulletin 26, 71-92. Wang, S. (1998). Implementation of PH-transforms in ratemaking, Proceedings of the Casualty Actuarial Society, LXXXV, 94-979. 9
Table 2: Bootstrap Results, ρ =1.2 n Distribution π ρ B(bπ ρ ) B(bπ ρ ) B for n /n equal to 1% 9% 8% 7% 6% 5% 4% 3% 2% 1% Pareto 2 1.5 -.1261 -.1388 -.1134 -.118 -.175 -.133 -.98 -.913 -.822 -.685 -.463.4 Pareto 3 1.3333 -.458 -.58 -.361 -.345 -.325 -.3 -.268 -.227 -.168 -.8.69.43.492 1 Pareto 4 1.2857 -.296 -.33 -.217 -.24 -.188 -.169 -.143 -.11 -.63.8.132.418.161 Gamma 1.2865 -.17 -.166 -.73 -.63 -.5 -.33 -.12.18.6.126.244.529 -.12 Exponential 1.2 -.98 -.112 -.56 -.49 -.41 -.31 -.18..25.66.139.324 -.13 Pareto 2 1.5 -.743 -.838 -.687 -.67 -.65 -.625 -.593 -.552 -.497 -.416 -.279.11.452 Pareto 3 1.3333 -.29 -.238 -.168 -.16 -.151 -.139 -.124 -.13 -.75 -.34.4.28.69 5 Pareto 4 1.2857 -.12 -.139 -.91 -.86 -.79 -.7 -.6 -.45 -.25.6.6.19.21 Gamma 1.2865 -.51 -.59 -.29 -.25 -.21 -.15 -.8.2.16.39.8.188 -.14 Exponencial 1.2 -.29 -.39 -.21 -.19 -.17 -.13 -.9 -.3.5.18.42.15 -.1 Pareto 2 1.5 -.59 -.62 -.482 -.468 -.452 -.432 -.46 -.372 -.327 -.259 -.15.89.224 Pareto 3 1.3333 -.149 -.148 -.99 -.93 -.86 -.77 -.66 -.51 -.31..53.177.41 1 Pareto 4 1.2857 -.81 -.81 -.49 -.45 -.41 -.35 -.28 -.18 -.4.18.55.146.15 Gamma 1.2865 -.3 -.39 -.21 -.19 -.16 -.13 -.9 -.3.6.2.46.111 -.15 Exponential 1.2 -.17 -.21 -.11 -.1 -.8 -.6 -.4..5.13.27.66 -.6 ML
Table 3: Bootstrap Results, ρ =1.15 n Distribution π ρ B(bπ ρ ) B(bπ ρ ) B for n /n equal to 1% 9% 8% 7% 6% 5% 4% 3% 2% 1% Pareto 2 1.3529 -.771 -.881 -.75 -.687 -.664 -.635 -.598 -.551 -.487 -.391 -.232.16 - Pareto 3 1.2432 -.293 -.337 -.235 -.223 -.29 -.192 -.169 -.14 -.99 -.36.71.314.337 1 Pareto 4 1.215 -.193 -.223 -.143 -.134 -.123 -.11 -.92 -.68 -.35.16.15.314.121 Gamma 1.2137 -.115 -.112 -.46 -.38 -.29 -.18 -.2.19.49.97.182.39 -.9 Exponential 1.15 -.67 -.8 -.39 -.35 -.29 -.22 -.12..19.48.12.238 -.12 Pareto 2 1.3529 -.429 -.57 -.48 -.398 -.384 -.368 -.347 -.319 -.283 -.229 -.137.62.32 Pareto 3 1.2432 -.127 -.152 -.15 -.1 -.94 -.86 -.76 -.62 -.43 -.15.35.151.5 5 Pareto 4 1.215 -.74 -.91 -.59 -.55 -.51 -.45 -.38 -.28 -.15.6.43.134.14 Gamma 1.2137 -.33 -.42 -.22 -.19 -.16 -.12 -.7 -.1.9.25.54.13 -.13 Exponential 1.15 -.19 -.28 -.16 -.15 -.13 -.11 -.8 -.4.2.11.27.72 -.1 Pareto 2 1.3529 -.333 -.342 -.266 -.257 -.247 -.234 -.217 -.195 -.166 -.122 -.5.11.155 Pareto 3 1.2432 -.88 -.88 -.56 -.53 -.48 -.43 -.35 -.26 -.12.9.44.127.31 1 Pareto 4 1.215 -.49 -.5 -.29 -.27 -.24 -.2 -.15 -.9.1.15.4.12.11 Gamma 1.2137 -.19 -.3 -.17 -.16 -.14 -.12 -.9 -.5.1.1.28.74 -.14 Exponential 1.15 -.11 -.15 -.9 -.8 -.7 -.6 -.4 -.1.2.7.17.44 -.6 ML
Table 4: Bootstrap Results for Compound Distributions, ρ =1.2 n Severity π ρ B(bπ ρ ) B for n /n equal to Distribution 1% 9% 8% 7% 6% 5% 4% 3% 2% 1% Pareto 2 1.5498 -.1493 -.1236 -.127 -.1172 -.1129 -.176 -.16 -.91 -.763 -.527 -.21 Pareto 3 1.3939 -.589 -.433 -.414 -.393 -.366 -.332 -.286 -.222 -.122.47.438 1 Pareto 4 1.3515 -.41 -.276 -.261 -.244 -.222 -.194 -.156 -.13 -.18.128.477 Exponential 1.2822 -.18 -.13 -.93 -.82 -.68 -.51 -.26.1.68.171.436 Pareto 2 1.5498 -.817 -.661 -.644 -.622 -.597 -.564 -.522 -.465 -.381 -.238.65 Pareto 3 1.3939 -.243 -.171 -.163 -.153 -.14 -.125 -.13 -.74 -.29.51.235 5 Pareto 4 1.3515 -.145 -.94 -.88 -.81 -.72 -.61 -.45 -.23.11.72.22 Exponencial 1.2822 -.49 -.25 -.22 -.19 -.14 -.9 -.1.11.29.64.153 Pareto 2 1.5498 -.643 -.517 -.53 -.486 -.466 -.438 -.45 -.356 -.287 -.174.7 Pareto 3 1.3939 -.18 -.128 -.122 -.115 -.16 -.94 -.79 -.57 -.25.31.161 1 Pareto 4 1.3515 -.16 -.71 -.67 -.62 -.56 -.48 -.37 -.22.1.41.14 Exponential 1.2822 -.34 -.19 -.18 -.16 -.13 -.1 -.5.2.12.33.86
Table 5: Bootstrap Results for Compound Distributions, ρ =1.15 n Severity π ρ B(bπ ρ ) B for n /n equal to Distribution 1% 9% 8% 7% 6% 5% 4% 3% 2% 1% Pareto 2 1.393 -.969 -.791 -.771 -.747 -.717 -.68 -.631 -.564 -.459 -.291.74 Pareto 3 1.2887 -.45 -.296 -.283 -.268 -.249 -.225 -.193 -.148 -.76.45.329 1 Pareto 4 1.2599 -.283 -.194 -.184 -.171 -.156 -.136 -.11 -.72 -.1.94.349 Exponential 1.2115 -.133 -.77 -.71 -.63 -.53 -.4 -.22.3.45.12.315 Pareto 2 1.393 -.491 -.389 -.378 -.364 -.347 -.326 -.298 -.26 -.23 -.17.11 Pareto 3 1.2887 -.156 -.18 -.13 -.97 -.88 -.78 -.63 -.43 -.13.42.17 5 Pareto 4 1.2599 -.97 -.62 -.59 -.54 -.48 -.4 -.29 -.14.9.52.154 Exponencial 1.2115 -.36 -.19 -.17 -.15 -.12 -.8 -.3.5.18.43.16 Pareto 2 1.393 -.379 -.298 -.29 -.279 -.266 -.248 -.226 -.195 -.149 -.75.89 Pareto 3 1.2887 -.116 -.82 -.78 -.74 -.67 -.6 -.5 -.36 -.14.23.112 1 Pareto 4 1.2599 -.71 -.48 -.46 -.42 -.38 -.33 -.26 -.16 -.1.27.94 Exponential 1.2115 -.25 -.15 -.14 -.12 -.1 -.8 -.5 -.1.7.21.58