IMPORTANCE SAMPLING FOR THE SIMULATION OF REINSURANCE LOSSES. Georg Hofmann

Similar documents
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

Department of Mathematics

On an Application of Bayesian Estimation

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Statistics 511 Additional Materials

Simulation. Two Rule For Inverting A Distribution Function

Topic 9: Sampling Distributions of Estimators

A statistical method to determine sample size to estimate characteristic value of soil parameters

Random Variables, Sampling and Estimation

Topic 9: Sampling Distributions of Estimators

Lecture 2: Monte Carlo Simulation

An Introduction to Randomized Algorithms

Chapter 8: Estimating with Confidence

Approximate Confidence Interval for the Reciprocal of a Normal Mean with a Known Coefficient of Variation

Topic 9: Sampling Distributions of Estimators

NOTES ON DISTRIBUTIONS

This is an introductory course in Analysis of Variance and Design of Experiments.

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

Robust Default Correlation for Cost Risk Analysis

1 Inferential Methods for Correlation and Regression Analysis

Confidence intervals summary Conservative and approximate confidence intervals for a binomial p Examples. MATH1005 Statistics. Lecture 24. M.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Provläsningsexemplar / Preview TECHNICAL REPORT INTERNATIONAL SPECIAL COMMITTEE ON RADIO INTERFERENCE

The standard deviation of the mean

7-1. Chapter 4. Part I. Sampling Distributions and Confidence Intervals

Expectation and Variance of a random variable

1 Review of Probability & Statistics

Output Analysis (2, Chapters 10 &11 Law)

Monte Carlo Integration

Statistical inference: example 1. Inferential Statistics

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 2 9/9/2013. Large Deviations for i.i.d. Random Variables

MATH/STAT 352: Lecture 15

Confidence Interval for Standard Deviation of Normal Distribution with Known Coefficients of Variation

Stochastic Simulation

Machine Learning Brett Bernstein

BIOSTATISTICS. Lecture 5 Interval Estimations for Mean and Proportion. dr. Petr Nazarov

Rademacher Complexity

Stat 421-SP2012 Interval Estimation Section

6.3 Testing Series With Positive Terms

Using the IML Procedure to Examine the Efficacy of a New Control Charting Technique

Lecture 19: Convergence

Confidence Intervals

ENGI 4421 Confidence Intervals (Two Samples) Page 12-01

Chapter 6. Sampling and Estimation

This exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.

Chapter 6 Sampling Distributions

Distribution of Random Samples & Limit theorems

WHAT IS THE PROBABILITY FUNCTION FOR LARGE TSUNAMI WAVES? ABSTRACT

1 Introduction to reducing variance in Monte Carlo simulations

4. Partial Sums and the Central Limit Theorem

Clases 7-8: Métodos de reducción de varianza en Monte Carlo *


MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

G. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution

THE SYSTEMATIC AND THE RANDOM. ERRORS - DUE TO ELEMENT TOLERANCES OF ELECTRICAL NETWORKS

Final Review for MATH 3510

The Poisson Distribution

There is no straightforward approach for choosing the warmup period l.

Lecture 1 Probability and Statistics

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

Estimation of a population proportion March 23,

Output Analysis and Run-Length Control

Properties and Hypothesis Testing

Discrete Mathematics for CS Spring 2007 Luca Trevisan Lecture 22

Computing Confidence Intervals for Sample Data

Analysis of Experimental Data

The target reliability and design working life

Lecture 1 Probability and Statistics

c. Explain the basic Newsvendor model. Why is it useful for SC models? e. What additional research do you believe will be helpful in this area?

Confidence Intervals for the Population Proportion p

Chapter 11 Output Analysis for a Single Model. Banks, Carson, Nelson & Nicol Discrete-Event System Simulation

Chapter 6 Infinite Series

Hypothesis Testing. Evaluation of Performance of Learned h. Issues. Trade-off Between Bias and Variance

GUIDELINES ON REPRESENTATIVE SAMPLING

Instructor: Judith Canner Spring 2010 CONFIDENCE INTERVALS How do we make inferences about the population parameters?

Basis for simulation techniques

Sequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Estimation of the Mean and the ACVF

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Chapter 9: Numerical Differentiation

Basics of Probability Theory (for Theory of Computation courses)

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

A LARGER SAMPLE SIZE IS NOT ALWAYS BETTER!!!

Econ 325/327 Notes on Sample Mean, Sample Proportion, Central Limit Theorem, Chi-square Distribution, Student s t distribution 1.

11 Correlation and Regression

Mathacle. PSet Stats, Concepts In Statistics Level Number Name: Date: Confidence Interval Guesswork with Confidence

Probability and statistics: basic terms

Chapter 6 Principles of Data Reduction

Discrete probability distributions

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

ECE 901 Lecture 12: Complexity Regularization and the Squared Loss

Probability, Expectation Value and Uncertainty

NCSS Statistical Software. Tolerance Intervals

Statisticians use the word population to refer the total number of (potential) observations under consideration

Lecture 6 Simple alternatives and the Neyman-Pearson lemma

Element sampling: Part 2

Transcription:

Proceedigs of the 213 Witer Simulatio Coferece R. Pasupathy, S.-H. Kim, A. Tolk, R. Hill, ad M. E. Kuhl, eds. IMPORTANCE SAMPLING FOR THE SIMULATION OF REINSURANCE LOSSES Georg Validus Research Ic. Suite 21, 187 Kig Street South Waterloo Otario, N2J 1R1, CANADA ABSTRACT Importace samplig is a well developed method i statistics. Give a radom variable X, the problem of estimatig its expected value µ is addressed. The stadard approach is to use the sample mea as a estimator x. I importace samplig, a suitable variable L is itroduced such that the radom variable X/L has a estimator with a smaller variace tha that of x. As a result, a smaller sample size ca lead to the same estimatio accuracy. I the simulatio of reisurace fiacial terms for catastrophe loss, choosig a geeral variable L is difficult: Eve before the applicatio of fiacial terms, the loss distributio is ofte ot modelled by a closed-form distributio. After that, a wide rage of fiacial terms ca be applied that makes the fial distributio upredictable. However, it is evidet that the heavy tail of the resultig et loss distributio makes the use of importace samplig desirable. We propose a importace samplig techique usig a power fuctio trasformatio o the cumulative distributio fuctio. The beefit of this techique is that o prior kowledge of the loss distributio is required. It is a ew techique that has ot bee documeted i the literature. The trasformatio depeds o the choice of the expoet k. For a specific example we ivestigate desirable values of k. 1 INTRODUCTION I may applicatios heavy-tailed distributios are sampled accordig to the Mote Carlo method i order to approximate the mea (or other metrics). The accuracy of such simulatios ca be improved by icreasig the sample size. However, i practice this is ofte expesive i terms of performace ad computer resources. This is due to the fact that the simulatio error decreases oly very slowly as the sample size icreases. Let X be a radom variable whose variace Var(X) exists. Its expectatio E(X) ca be estimated by takig the mea of a sample with sample size. The stadard error of this estimator X is give by Var(X) Var(X ) =. So to reduce the stadard error by a factor of m (ad thus icrease accuracy) the sample size has to be multiplied by m 2. Aother way to improve accuracy is to reduce the variace Var(X). The stadard approach for this i the literature is importace samplig. See for example Bucklew (24) for a recet textbook o this subject. For this approach a radom variable L is chose such that E(L) = 1. If P is the give probability measure, a ew oe ca be defied as P (L) = LP. With respect to the two probability measures we have the followig equality i expectatio: ( ) X E(X;P) = E L ;P(L). 978-1-4799-276-1/13/$31. 213 IEEE 125

If L is chose suitably, the we obtai a reductio i variace: ( ) X Var(X;P) > Var L ;P(L). I this article we propose a applicatio of importace samplig to the evaluatio of catastrophe loss risk for reisurace compaies. This evaluatio is usually accomplished i roughly two steps usig commercial catastrophe models: First the gross loss is determied for a stochastic esemble of evet occurreces. The this gross loss is subjected to reisurace fiacial terms of a reisurace cotract. This happes durig a aual simulatio that may have several evets occur i a simulated year. The fial result of this process are metrics describig the aual et loss distributio for the reisurace cotract. The challege here is that it is impossible to determie a sigle distributio for L that provides effective importace samplig for all kids of catastrophe risk ad reisurace cotracts. This is due to the fact that the distributio of the et loss is ot give i a closed form. It depeds heavily o the sciece used i the catastrophe model ad o the type of reisurace cotracts cosidered. Noetheless it is clear that the distributio has a sigificat tail that promises a sigificat gai from importace samplig. Our approach is to use a simple form of importace samplig o the loss severity resultig from the evet esemble. The ucertaity aroud this loss is geerally referred to i the isurace idustry as rate ucertaity or primary ucertaity (Lohma ad Yue 211). Goig forward, we will deote this loss severity by X. However the effectiveess of the importace samplig will be measured by the simulatio error associated with metrics of the aual loss. If F X is the cumulative distributio fuctio of X, the its quatile fuctio FX 1 : [,1] R is defied by (p) = if{x R : p F(x)}. FX 1 See Gilchrist (2) for a recet textbook that advocates the usage of quatile fuctios for statistical modellig. The expected value of X ca be calculated as follows: E(X) = FX 1 (p) d p. (1) For a give sample size let p 1, p 2,..., p be samples from the equidistat partitio. The we have the followig Riema sum approximatio: E(X) 1 FX 1 (p i). This approximatio may also be thought of as the Mote Carlo method if p 1, p 2,..., p costitute a radom uiform sample. We will achieve the reductio i variace discussed earlier by performig a substitutio i the variable p of Equatio (1). This article is iteded to be useful for professioals with a workig kowledge of radom simulatios. The ecessary formulae for our proposed samplig method ad its error estimatio ca be foud i Subsectio 2.2. The relevat equatios are (8) ad (9). I Sectio 3 we have icluded a case study to measure the effectiveess of the proposed importace samplig approach. The R code used to evaluate this case study ca be foud i the appedix. 2 IMPORTANCE SAMPLING USING A POWER FUNCTION SUBSTITUTION I this sectio we work out the the mathematical details ecessary to specify the proposed importace samplig approach. Subsectio 2.1 cotais the theoretical basis for our approach. Subsectio 2.2 provides the details about the applicatio of the approach usig Riema samplig. Subsectio 2.3 explais how to trasitio from Riema samplig to radom samplig ad explais error estimates. 126

2.1 A Substitutio I this subsectio, we itroduce a family of trasformatios that will lead to a variace reductio i some cases. It is therefore our goal to compare the origial variace Var(X) with the variace Var(Y ) i the trasformed case. Note that all radom variables are real-valued. Let X be a radom variable for which the expected value E(X) ad the variace Var(X) exist. We deote its cumulative distributio fuctio by F X. The its tail distributio is defied as F X (x) = 1 F X (x). We will work with F(x) rather tha F X for the followig umerical reaso: The accuracy of floatig poit umbers o computers is higher i the viciity of tha i the viciity of 1. Sice F X ecodes iformatio about the tail of the distributio i values aroud, it is preferable to F X. The complemetary quatile fuctio is the fuctio F X 1 : [,1] R defied by Now let F X 1 (p) = if{x R : p F(x)}. t : [,1] [,1] be a differetiable, strictly icreasig fuctio o the uit iterval that satisfies t() = ad t(1) = 1. Later we will specialize t to the fuctio defied by t(q) = q 2, so you may keep this example i mid. Now let Y be a radom variable with the complemetary quatile fuctio defied by F 1 Y (q) = F X 1 (t(q))t (q). The radom variable Y could be realized, for istace, by takig a radom variable U that is uiformly distributed o the uit iterval. The set Y = F Y 1 (U). By usig the substitutio p = t(q) we obtai E(X) = = F X 1 F X 1 (p) d p (2) (t(q))t (q) dq = E(Y ). (3) }{{} = F Y 1 (q) So, the two radom variables X ad Y have the same expected value. We will compute the differece i their variaces. To accomplish this we first compute E(X 2 ) = = E(Y 2 ) = = = ( F X 1 (p)) 2 d p ( F X 1 ( ) ) 2 t(q) t (q) dq. (4) ( F X 1 ( ) ) 2( t(q) t (q) ) 2 dq (5) (t ( t 1 (p) )) 2 ( F 1 X (p)) 2 t ( t 1 (p) ) ( F 1 X (p)) 2 t ( t 1 (p) ) d p. d p 127

Sice Var(x) = E(X 2 ) ( E(X)) 2 ad E(X) = E(Y ), we obtai Var(Y ) Var(X) = E(Y 2 ) E(X 2 ) ( ( = F X 1 (p)) 2 t ( t 1 (p) ) ) 1 d p. Our objective is to achieve a reductio i variace. This happes whe the above ( expressio is egative. To further uderstad whe this occurs, it is istructive to ivestigate the term t ( t 1 (p) ) ) 1 for its sig. Let k [1, ). Our primary example of a selectio of t is defied by We compute t(q) = q k. t (q) = kq k 1, t 1 (p) = p 1 k, t ( t 1 (p) ) = kp k 1 k. For the specific case k = 2, the last term is 2 p. I this case a reductio of variace will be achieved if the followig iequality of positive itegrals is satisfied: /4 ( F 1 X (p)) 2 (1 2 p) d p > 1/4 ( F 1 X (p)) 2 (2 p 1) d p. (6) This is the case, if ( F 1 X (p)) 2 is large eough o the iterval [,1/4]. That, i tur, happes if the distributio of X has sigificat eough a tail. 2.2 Riema Sums Let be the sample size ad (r i ) the equidistat midpoit sample poits give by r i = i 1 2 for i = 1,2,3,...,. The expected value E(X) ca be approximated by the middle Riema sum. This ca be doe usig either (2): or it ca be doe usig (3): E(X) 1 E(X) 1 F 1 X (r i), (7) F 1 X (t(r i))t (r i ). If we defie the sample values s i = F X 1 (t(r i)) ad the sample weights w i = t (r i ) the the above equatio takes the simple form E(X) 1 s i w i. (8) 128

I the same way, we ca compute middle Riema sums for the followig quatities. We use Equatios (4) ad (5) for this: We defie the sample improvemet by E(X 2 ) 1 E(Y 2 ) 1 s 2 i w i, (9) (s i w i ) 2. (1) I = Var(X) Var(Y ) = E(X 2 ) ( E(X) ) 2 E(Y 2 ) ( E(X) ) 2. 2.3 Mote Carlo Method The formulae (7) to (1) ca be used as a Mote Carlo method to approximate the expected value of X ad the variaces of X ad Y. I that case (r i ) is uiformly distributed betwee ad 1. Either methods (7) ad (8) ca be used to approximate E(X). We will refer to the two methods as the regular method ad the ehaced method. If the trasformatio t is chose appropriately for the distributio of X the the sample improvemet I is greater tha 1 ad the variace of Y is less tha the variace of X. The error estimates are give as follows: Var(X) Regular method: Var(Y ) Ehaced method: Oe way to iterpret the sample improvemet I is that it is the umber by which you would eed to multiply the sample size for the regular method to catch up to the ehaced method. 3 AN EXAMPLE OF CATASTROPHE REINSURANCE RISK I this sectio we use a model to quatify the distributio of gross property isurace losses resultig from catastrophes. The we apply a typical set of reisurace cotracts to arrive at a et loss distributio. This model is ot based o catastrophe sciece, so it ca ot be used to base ay busiess decisios o. However, the model is realistic eough to measure the effectiveess of the proposed importace samplig approach. From there the reader may draw coclusios about what to expect from this approach for more advaced models with similar distributios. The data preseted i Table 1 ad Figure 1 was obtaied usig the R code i the appedix. A trial umber of 1 millio was used for that, however the simulatio error was expressed as a error for a 1 millio trial simulatio. The simple model that we use here is based o the followig assumptios: Evet Occurrece Gross Loss Severity The occurrece of evets throughout a year is modelled accordig to a Poisso distributio with frequecy λ = 3. The gross loss of a evet is modelled accordig to a log-ormal distributio fitted to a mea of 1 ad a stadard deviatio of 3. 129

Table 1: Evaluatio of reisurace cotracts based o a 1 millio trial simulatio. Occurrece Terms Aual Aggregate Terms Simulatio Result Simulatio Error (Radius of 95% Cofidece Iterval) Att Lim Att Lim EL EL% k = 1 k = 1.5 k = 2 k = 3 Cotract 1 34 34 34 3.457 1.17%.54%.37%.42%.74% Cotract 2 95 95 95 1.879 1.98% 1.25%.65%.63%.92% Cotract 3 19 19 19.969.51% 2.44% 1.3%.88% 1.13% Cotract 4 9 9 9 9.865 9.61%.56%.42%.52% 1.3% Cotract 5 2 2 2 2.391 1.96% 1.26%.68%.7% 1.14% Cotract 6 34 34 34 34.168.49% 2.46% 1.3%.93% 1.37% It is importat to ote that for larger values of λ the importace samplig approach will be subject to what is kow as the Curse of Dimesioality (Begtsso, Bickel, ad Li 28). I that case the stability of the simulatio decreases as k icreases. Table 1 shows the details of 6 typical reisurace cotracts. Occurrece attachmets ad limits are applied to the gross loss after every evet occurrece. The resultig loss is accumulated withi every trial year ad the subjected to the aual aggregate attachmets ad limits. This results i the et loss. The expected loss (EL) of a cotract is ofte expressed as a percetage of the occurrece limit. We deote this percetage by EL%. Accordig to their defiitios the first three cotracts are first-evet covers ad the last three are secod-evet covers. I each category there are cotracts that have a approximate EL% of.5%, 2% ad 1%, which are typical umbers for the reisurace idustry. The more volatile cotracts are those with a lower EL%. As a cosequece, they also show the largest simulatio error. Figure 1 shows that the best sample improvemets are obtaied for values of k betwee 1.5 ad 2. But if the objective is to keep the simulatio error of ay cotract below a certai threshold, the k = 2 is a good choice, as it addresses the simulatio errors of the more volatile cotracts 3 ad 6 i the best possible way. This ca also be see i Table 1, where the choice of k = 2 reduces all simulatio errors below 1%. 4 CONCLUSION Importat metrics i risk maagemet are ofte approximated usig radom or Riema samplig. I this article we preset a importace samplig approach that allows a icrease i the accuracy of such approximatios. It ca be used to reduce the error margi of results or to reduce the ru time while maitaiig the same accuracy. The sample improvemet I ca be viewed as the factor by which the ru time ca be reduced. For a typical catastrophic risk ad some typical reisurace cotracts the sample improvemet is calculated. Based o this aalysis, the value k = 2 of the tuig parameter k is recommeded. However, the choice of k may vary with the distributio at had. A simple modificatio of the icluded code would allow a ivestigatio of other distributios for values of k with maximal gai i performace. 13

9 8 7 Sample Improvemet 6 5 4 3 2 Cotract 1 Cotract 2 Cotract 3 Cotract 4 Cotract 5 Cotract 6 1 1 1.5 2 2.5 3 3.5 4 Value of k Figure 1: Sample improvemet for 6 differet cotracts. 131

A R CODE The followig code rus with R-2.13.1. No additioal packages eed to be istalled. 132

REFERENCES Begtsso, T., P. Bickel, ad B. Li. 28. Curse-of-dimesioality revisited: Collapse of the particle filter i very large scale systems. IMS Collectios 28 2:316 334. Bucklew, J. 24. Itroductio to Rare Evet Simulatio. 1st ed. Spriger Series i Statistics. New York: Spriger. Gilchrist, W. 2. Statistical Modellig with Quatile Fuctios. 1st ed. Mathematics/Statistics. Boca Rato, Florida: Chapma & Hall/CRC. Lohma, D., ad F. Yue. 211. Correlatio, simulatio ad ucertaity i catastrophe modelig. I Proceedigs of the 211 Witer Simulatio Coferece, edited by S. Jai, R. Creasey, J. Himmelspach, K. White, ad M. Fu, 133 145. Piscataway, New Jersey: Istitute of Electrical ad Electroics Egieers, Ic. 133

AUTHOR BIOGRAPHY GEORG HOFMANN is a member of the research team at Validus Research Ic., Waterloo Otario, Caada. He is a adjuct professor at the Departmet of Mathematics ad Statistics of Dalhousie Uiversity, Nova Scotia, Caada ad a techical advisor of the Risk Aalytics Lab of the Computer Sciece Departmet. He has worked i the reisurace idustry for 5 years as a data scietist, gaiig experiece i the sciece ad implemetatio of catastrophe risk models. He received a Mathematics Diploma (Masters) with a Mior i Quatum Mechaics from the Techische Uiversität Darmstadt (Germay). At the same uiversity he received his Ph.D. i Mathematics. His email address is georg.w.hofma@gmail.com ad his web page is http://www.georg.ca. 134