A HOMOTOPY CLASS OF SEMI-RECURSIVE CHAIN LADDER MODELS


Greg Taylor
Taylor Fry Consulting Actuaries
Level 1, 55 Clarence Street
Sydney NSW 2000, Australia

Professorial Associate, Centre for Actuarial Studies
Faculty of Economics and Commerce, University of Melbourne
Parkville VIC 3052, Australia

Phone: 61 2 9249 2901
Fax: 61 2 9249 2999
greg.taylor@taylorfry.com.au

September 2011

Abstract

The chain ladder algorithm is known to produce maximum likelihood estimates of the parameters of certain recursive and non-recursive models. These types of models represent two extremes of dependency within rows of a data array. Whereas observations within a row of a non-recursive model are stochastically independent, each observation of a recursive model is, in expectation, directly proportional to the immediately preceding observation from the same row. The correlation structures of forecasts also differ as between recursive and non-recursive models.

The present paper constructs a family of models that forms a bridge between recursive and non-recursive models and so provides a continuum of intermediate cases in terms of dependency structure. The intermediate models are called semi-recursive. The statistical inference properties of semi-recursive models are investigated. It is found (Section 5.4) that the chain ladder algorithm is also maximum likelihood for semi-recursive models. Sufficient, and minimally sufficient, statistics are found for the semi-recursive model (Section 6). They are found to be the same as for non-recursive models. The minimally sufficient statistic is complete, leading to minimum variance unbiased estimation (Section 7).

Keywords: chain ladder, correlation, non-recursive model, recursive model, minimally sufficient statistic, minimum variance unbiased estimation, ODP cross-classified model, ODP Mack model, semi-recursive model, sufficient statistic.

1. Introduction

The actuarial literature identifies two families of chain ladder models, categorised by Verrall (2000) as recursive and non-recursive models respectively. Although the model formulations are fundamentally different, both are found to yield the same maximum likelihood estimators of age-to-age factors and the same forecasts of loss reserve. The properties of these models are studied by Taylor (2011a).
Whereas observations within a row of a non-recursive model are stochastically independent, each observation of a recursive model is, in expectation, directly proportional to the immediately preceding observation from the same row. It would be useful to define a family of models that forms a bridge between these two extreme cases of dependency, i.e. in which a relation between consecutive observations in a row exists but is less than linear (in expectation). Further, distinct forecasts within a row of a run-off array are known to be correlated differently under recursive and non-recursive models (Taylor, 2011b). It would be useful to define a family of models displaying intermediate correlations.

The purpose of the present paper is to define just such a family and explore its statistical inference properties.

2. Framework and notation

2.1 Claims data

Consider a K × J rectangle of claims observations Y_{kj} with:
- accident periods represented by rows and labelled k = 1, 2, ..., K;
- development periods represented by columns and labelled j = 1, 2, ..., J.

Within the rectangle, identify a development trapezoid of past observations

D = {Y_{kj} : 1 ≤ k ≤ K and 1 ≤ j ≤ min(J, K − k + 1)}

The complement of this subset, representing future observations, is

D^c = {Y_{kj} : 1 ≤ k ≤ K and min(J, K − k + 1) < j ≤ J}
    = {Y_{kj} : K − J + 1 < k ≤ K and K − k + 1 < j ≤ J}

Also let D̄ = D ∪ D^c. In general, the problem is to predict D^c on the basis of observed D. The usual case in the literature (though often not in practice) is that in which K = J, so that the trapezoid becomes a triangle. The more general trapezoid will be retained throughout the present paper.

Define the cumulative row sums

X_{kj} = Σ_{i=1}^{j} Y_{ki}    (2.1)

and the full row and column sums (or horizontal and vertical sums)

H_k = Σ_{j=1}^{min(J, K−k+1)} Y_{kj},    V_j = Σ_{k=1}^{K−j+1} Y_{kj}    (2.2)

Also define, for k = K − J + 2, ..., K,

R_k = Σ_{j=K−k+2}^{J} Y_{kj} = X_{kJ} − X_{k,K−k+1}    (2.3)
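The notation just introduced can be made concrete with a small numerical sketch. The triangle below is hypothetical (it is not from the paper); it simply illustrates the trapezoid D, the cumulative sums (2.1) and the marginal sums (2.2), with future cells of the K × J rectangle marked as NaN.

```python
import numpy as np

# Hypothetical 4 x 4 claims rectangle Y[k, j] (accident period k, development
# period j); NaN marks the future cells D^c.  With K = J the trapezoid D is a
# triangle.
K = J = 4
Y = np.array([
    [100.0,  60.0,  30.0,  10.0],
    [110.0,  65.0,  33.0, np.nan],
    [120.0,  70.0, np.nan, np.nan],
    [130.0, np.nan, np.nan, np.nan],
])

past = ~np.isnan(Y)  # membership of D: j <= min(J, K - k + 1), 1-based

# Cumulative row sums X_{kj} = sum_{i<=j} Y_{ki} -- equation (2.1)
X = np.nancumsum(Y, axis=1)
X[~past] = np.nan

# Row and column sums over D -- equation (2.2)
H = np.nansum(Y, axis=1)   # horizontal sums H_k
V = np.nansum(Y, axis=0)   # vertical sums V_j
```

Note that the grand totals of the H_k and of the V_j agree, since both sum the whole trapezoid; this identity reappears as equation (6.1) later in the paper.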

R = Σ_{k=K−J+2}^{K} R_k    (2.4)

Note that R is the sum of the (future) observations in D^c. It will be referred to as the total amount of outstanding losses. Likewise, R_k denotes the amount of outstanding losses in respect of accident period k. The objective stated earlier is to forecast the R_k and R.

Let Σ^{R(k)} denote summation over the entire row k of D, i.e. Σ_{j=1}^{min(J, K−k+1)} for fixed k. Similarly, let Σ^{C(j)} denote summation over the entire column j of D, i.e. Σ_{k=1}^{K−j+1} for fixed j. For example, (2.2) may be expressed as V_j = Σ^{C(j)} Y_{kj}. Finally, let Σ^T denote summation over the entire trapezoid of (k, j) cells, i.e.

Σ^T A_{kj} = Σ_{k=1}^{K} Σ^{R(k)} A_{kj} = Σ_{j=1}^{J} Σ^{C(j)} A_{kj}

for quantities A_{kj} with (k, j) ∈ D. For a random variable A_{kj}, (k, j) ∈ D, the symbol A will denote the entire array {A_{kj} : Y_{kj} ∈ D}. The first column of A will be denoted by A_{(1)}.

2.2 Families of distributions

2.2.1 Exponential dispersion family

The exponential dispersion family (EDF) (Nelder & Wedderburn, 1972) consists of those variables Y with log-likelihoods of the form

ℓ(y; θ, φ) = [yθ − b(θ)]/a(φ) + c(y, φ)    (2.5)

for parameters θ (canonical parameter) and φ (scale parameter) and suitable functions a, b and c, with a continuous, b differentiable and one-one, and c such as to produce a total probability mass of unity. For Y so distributed,

E[Y] = b′(θ)    (2.6)

Var[Y] = a(φ) b″(θ)    (2.7)

If μ denotes E[Y], then (2.6) establishes a relation between μ and θ, and so (2.7) may be expressed in the form

Var[Y] = a(φ) V(μ)    (2.8)

for some function V, referred to as the variance function. The notation Y ~ EDF(θ, φ; a, b, c) will be used to mean that a random variable Y is subject to the EDF likelihood (2.5).

2.2.2 Tweedie family

The Tweedie family (Tweedie, 1984) is the sub-family of the EDF for which

a(φ) = φ    (2.9)

V(μ) = μ^p,    p ≤ 0 or p ≥ 1    (2.10)

For this family,

b(θ) = (2 − p)^{−1} [(1 − p)θ]^{(2−p)/(1−p)}    (2.11)

μ = [(1 − p)θ]^{1/(1−p)}    (2.12)

ℓ(y; θ, φ) = φ^{−1} [y μ^{1−p}/(1 − p) − μ^{2−p}/(2 − p)] + c(y, φ, p)    (2.13)

θ = μ^{1−p}/(1 − p)    (2.14)

The notation Y ~ Tw(θ, φ, p) will be used to mean that a random variable Y is subject to the Tweedie likelihood with parameters θ, φ, p. The abbreviated form Y ~ Tw_p will mean that Y is a member of the sub-family with specific parameter p.

2.2.3 Over-dispersed Poisson family

The over-dispersed Poisson (ODP) family is the Tweedie sub-family with p = 1. The limit of (2.12) as p → 1 gives

E[Y] = μ = exp θ    (2.15)

By (2.8)-(2.10),

Var[Y] = φμ    (2.16)

By (2.14),

ℓ(y; μ, φ) = [y ln μ − μ]/φ + c(y, φ)    (2.17)

The notation Y ~ ODP(μ, φ) means Y ~ Tw(ln μ, φ, 1).

3. Chain ladder models

3.1 Heuristic chain ladder

The chain ladder was originally (pre-1975) devised as a heuristic algorithm for forecasting outstanding losses. It had no statistical foundation. The algorithm is as follows. Define the following factors:

f̂_j = Σ_{k=1}^{K−j} X_{k,j+1} / Σ_{k=1}^{K−j} X_{kj},    j = 1, 2, ..., J − 1    (3.1)

Note that f̂_j can be expressed in the form

f̂_j = Σ_{k=1}^{K−j} w_{kj} X_{k,j+1}/X_{kj}    (3.2)

with

w_{kj} = X_{kj} / Σ_{i=1}^{K−j} X_{ij}    (3.3)

i.e. as a weighted average of the factors X_{k,j+1}/X_{kj} for fixed j. Then define the following forecasts of Y_{kj} ∈ D^c:

Ŷ_{kj} = X_{k,K−k+1} f̂_{K−k+1} f̂_{K−k+2} ... f̂_{j−2} (f̂_{j−1} − 1)    (3.4)

Call these chain ladder forecasts. They yield the additional chain ladder forecasts:

X̂_{kj} = X_{k,K−k+1} f̂_{K−k+1} ... f̂_{j−1}    (3.5)

R̂_k = X̂_{kJ} − X_{k,K−k+1}    (3.6)

R̂ = Σ_{k=K−J+2}^{K} R̂_k    (3.7)

3.2 Recursive models

A recursive model takes the general form
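The algorithm (3.1)-(3.7) can be sketched directly. The triangle below is hypothetical, inserted only to exercise the formulas; the function follows (3.1) for the factors and (3.5)-(3.7) for the completed array and outstanding-loss forecasts.

```python
import numpy as np

def chain_ladder(Y):
    """Heuristic chain ladder of Section 3.1 on a K x J array whose future
    cells (the complement of the trapezoid D) are NaN.

    Returns the age-to-age factors f_hat (3.1), the completed cumulative
    array X_hat (3.5), the outstanding-loss forecasts R_k (3.6) and their
    total R (3.7)."""
    K, J = Y.shape
    X = np.nancumsum(Y, axis=1)      # cumulative row sums (2.1)
    X[np.isnan(Y)] = np.nan

    # f_hat_j = sum_k X_{k,j+1} / sum_k X_{kj}, over rows with both cells observed
    f_hat = np.empty(J - 1)
    for j in range(J - 1):
        obs = ~np.isnan(X[:, j + 1])
        f_hat[j] = X[obs, j + 1].sum() / X[obs, j].sum()

    # Complete the rectangle: X_hat_{k,j} = X_hat_{k,j-1} * f_hat_{j-1}  (3.5)
    X_hat = X.copy()
    for k in range(K):
        for j in range(1, J):
            if np.isnan(X_hat[k, j]):
                X_hat[k, j] = X_hat[k, j - 1] * f_hat[j - 1]

    last = (~np.isnan(X)).sum(axis=1) - 1        # right-most observed column per row
    R_k = np.array([X_hat[k, J - 1] - X[k, last[k]] for k in range(K)])
    return f_hat, X_hat, R_k, R_k.sum()

# Hypothetical triangle for illustration
Y = np.array([
    [100.0,  60.0,  30.0,  10.0],
    [110.0,  65.0,  33.0, np.nan],
    [120.0,  70.0, np.nan, np.nan],
    [130.0, np.nan, np.nan, np.nan],
])
f_hat, X_hat, R_k, R = chain_ladder(Y)
```

The incremental forecasts (3.4) are recovered as differences of consecutive entries of the completed array X_hat.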

E[X_{k,j+1} | X_{kj}] = function of D_{kj} and some parameters    (3.8)

where D_{kj} is the data sub-array of D obtained by deleting diagonals on the right side of D until X_{kj} is contained in its right-most diagonal.

3.2.1 Mack model

The Mack model (Mack, 1993) is defined by the following assumptions.

(M1) Accident periods are stochastically independent, i.e. Y_{k1,j1}, Y_{k2,j2} are stochastically independent if k1 ≠ k2.

(M2) For each k = 1, 2, ..., K, the X_{kj} (j varying) form a Markov chain.

(M3) For each k = 1, 2, ..., K and j = 1, 2, ..., J − 1,
(a) E[X_{k,j+1} | X_{kj}] = f_j X_{kj} for some parameters f_j > 0; and
(b) Var[X_{k,j+1} | X_{kj}] = σ_j² X_{kj} for some parameters σ_j² > 0.

3.2.2 ODP Mack model

Taylor (2011) defined the over-dispersed Poisson (ODP) Mack model as that satisfying assumptions (M1), (M2) and the following:

(ODPM3) For each k = 1, 2, ..., K and j = 1, 2, ..., J − 1, Y_{k,j+1} | X_{kj} ~ ODP((f_j − 1) X_{kj}, φ_{k,j+1}), where now f_j ≥ 1.

Assumption (ODPM3) implies (M3a). Moreover, in the special case φ_{k,j+1} = φ_{j+1}, independent of k, (ODPM3) also implies (M3b) with σ_j² = φ_{j+1}(f_j − 1). It is evident that, for this model to be valid, it is necessary that all Y_{kj} ≥ 0. Note also that, under (ODPM3), X_{kj} = 0 implies that X_{k,j+m} = 0 for all m ≥ 0. This means that, for each k, either Y_{k1} > 0 or Y_{kj} = 0 for all j. A summary of these requirements in terms of the data array D̄ is as follows.

(R1) Y_{kj} ≥ 0 for all Y_{kj} ∈ D̄.

(R2) For each k = 1, 2, ..., K, either:

(a) Y_{k1} > 0; or
(b) Y_{kj} = 0 for all j = 1, 2, ..., min(J, K − k + 1).

A data array satisfying these requirements will be called ODPM-regular.

Assumption (ODPM3) may be expressed in the following form, suitable for GLM implementation of the ODP Mack model:

Y_{k,j+1} | X_{kj} ~ ODP(exp[ln X_{kj} + ln(f_j − 1)], φ/w_{k,j+1})    (3.9)

where

w_{k,j+1} = φ/φ_{k,j+1}    (3.10)

In this form, the GLM of the Y_{k,j+1} has log link, offsets ln X_{kj}, parameters ln(f_j − 1), and weights w_{k,j+1}.

3.3 Non-recursive models

Taylor (2011) also defined the ODP cross-classified model as that satisfying the following assumptions:

(ODPCC1) The random variables Y_{kj} ∈ D̄ are stochastically independent.

(ODPCC2) For each k = 1, 2, ..., K and j = 1, 2, ..., J,
(a) Y_{kj} ~ ODP(α_k β_j, φ_{kj}) for some parameters α_k, β_j > 0; and
(b) Σ_{j=1}^{J} β_j = 1.

Assumption (ODPCC2a) may be expressed in the following form, suitable for GLM implementation of the ODP cross-classified model:

Y_{kj} ~ ODP(exp[ln α_k + ln β_j], φ/w_{kj})    (3.11)

with weights

w_{kj} = φ/φ_{kj}    (3.12)

In this form, the GLM of the Y_{kj} has log link and parameters ln α_k and ln β_j. Assumption (ODPCC2b) removes one degree of redundancy from the parameter set, and would be reflected by the aliasing of one parameter in the GLM.

3.4 Semi-recursive models

First, a definition of homotopy is given. Let A and B be topological spaces and let f: A → B and g: A → B be continuous. A homotopy is a continuous function H: A × [0, 1] → B such that, for a ∈ A, H(a, 0) = f(a) and H(a, 1) = g(a). The collection of functions {H(·, t) : t ∈ [0, 1]} will be referred to as the homotopy class associated with the homotopy just defined.

Consider a model that satisfies assumptions (M1), (M2) and the following:

(ODPSR3a) For each k = 1, 2, ..., K, and for some λ independent of k and j, subject to 0 ≤ λ ≤ 1,

Y_{k1} ~ ODP(α_k β_1, φ_{k1})

for some parameters α_k > 0, β_1 > 0.

(ODPSR3b) For each k = 1, 2, ..., K and j = 1, 2, ..., J − 1,

Y_{k,j+1} | X_{kj} ~ ODP(α_k^{1−λ} β_{j+1} X_{kj}^λ, φ_{k,j+1})

for the same λ as in (ODPSR3a), and where β_j, j = 2, 3, ..., J, are parameters subject to β_j ≥ 0. By convention, Y_{k,j+1} = 0 when X_{kj} = 0.

Such a model will be called ODP semi-recursive. It is valid only for non-negative data arrays D̄ and, in the case λ > 0, for ODPM-regular arrays. It will be assumed henceforth that all D̄ satisfy these requirements.

Assumptions (ODPSR3a-b) may be expressed in the following form, suitable for GLM implementation of the ODP semi-recursive model:

Y_{k1} ~ ODP(exp[ln α_k + ln β_1], φ/w_{k1})    (3.13)

Y_{k,j+1} | X_{kj} ~ ODP(exp[λ ln X_{kj} + (1 − λ) ln α_k + ln β_{j+1}], φ/w_{k,j+1})    (3.14)

with weights

w_{kj} = φ/φ_{kj}    (3.15)

In this formulation, the terms λ ln X_{kj}, known quantities, are offsets, and the ln α_k and ln β_j are unknown parameters requiring estimation.

ODP semi-recursive models are subject to the following representation lemma.

Lemma 3.1. The mean ODP parameter in (ODPSR3b) may be expressed in the form

α_k^{1−λ} β_{j+1} X_{kj}^λ = (f_j − 1)(α_k γ_j)^{1−λ} X_{kj}^λ    (3.16)

where, for j = 2, 3, ..., J, γ_{j−1} is the unique non-negative solution of

γ_{j−1} + β_j γ_{j−1}^λ = γ_j    (3.17)

γ_J is given by the constraint

γ_J = Σ_{i=1}^{J} β_i    (3.18)

and

f_j = γ_{j+1}/γ_j,    j = 1, 2, ..., J − 1    (3.19)

The γ_j are calculated recursively from (3.17) in the order j = J, J − 1, ..., 2. In the event that λ = 0 in (3.16),

γ_j = β_1 + β_2 + ... + β_j    (3.20)

Proof. The uniqueness of γ_1, γ_2, ..., γ_{J−1} is first proven for given β_2, β_3, ..., β_J. Consider relation (3.17) and note that

∂[γ_{j−1} + β_j γ_{j−1}^λ]/∂γ_{j−1} = 1 + λ β_j γ_{j−1}^{λ−1} > 0    (3.21)

in the case

β_j ≥ 0    (3.22)

Hence (3.17) has at most one solution in γ_{j−1} in the case that (3.22) holds. Note also that the left side of (3.17) varies from 0 to ∞ as γ_{j−1} varies from 0 to ∞. Hence (3.17) has a non-negative solution in γ_{j−1}, which is therefore unique.

Note also that the existence of a solution to (3.17) implies that

γ_{j−1} ≤ γ_j    (3.23)

Thus (3.22) implies (3.23), and so the required recursive calculation of the γ_j can proceed over j = J, J − 1, ..., 2.

Relation (3.23) holds for j = 2, i.e. γ_1 ≤ γ_2, so define

β_1 = γ_1    (3.24)

Then β_1 ≥ 0, as required by (ODPSR3a), and (3.18) is satisfied.

Substitution of (3.17) and (3.19) in the right side of (3.16) now yields

(f_j − 1)(α_k γ_j)^{1−λ} X_{kj}^λ = β_{j+1} γ_j^{λ−1} (α_k γ_j)^{1−λ} X_{kj}^λ = α_k^{1−λ} β_{j+1} X_{kj}^λ

which is equal to the left side of (3.16), and so (3.16) holds.

Note that the semi-recursive models form a homotopy class. Let A be the set each of whose members a consists of a data array D and a parameter set (α_1, ..., α_K, β_1, ..., β_J). Define H to be the mapping that sends (a, λ), 0 ≤ λ ≤ 1, to the distributions of Y_{k1} and Y_{k,j+1} | X_{kj} defined by (ODPSR3a-b). Convert A to a metric space by imposing the metric

d[a^{(1)}, a^{(2)}] = [Σ_{k=1}^{K} (α_k^{(1)} − α_k^{(2)})² + Σ_{j=1}^{J} (β_j^{(1)} − β_j^{(2)})²]^{1/2}

where the superscripts (1) and (2) on the right designate the parameters α_k, β_j associated with the respective members a^{(1)}, a^{(2)} of A. The mapping H is continuous in λ, as required for homotopy. Moreover, it is evident from Lemma 3.1 that H(a, 0) generates an ODP cross-classified model and H(a, 1) an ODP Mack model. The homotopy class includes all the intermediate models between these two.
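Lemma 3.1 is only partially legible in this copy, so the sketch below implements the recursion as reconstructed above (terminal value γ_J = Σ_i β_i from (3.18), downward solution of (3.17) by bisection, factors from (3.19)); it should be read as an illustration of that reconstruction, not as the paper's own algorithm.

```python
import numpy as np

def gammas_and_factors(beta, lam, tol=1e-13):
    """Reconstructed Lemma 3.1 machinery: given beta_1, ..., beta_J and the
    homotopy parameter lam in [0, 1], solve the downward recursion
        gamma_{j-1} + beta_j * gamma_{j-1}**lam = gamma_j
    (equation (3.17) as reconstructed), starting from gamma_J = sum(beta)
    (constraint (3.18) as reconstructed).  The left side is strictly
    increasing in gamma_{j-1}, so each step has a unique non-negative root,
    found here by bisection.  Returns (gamma, f) with f_j = gamma_{j+1}/gamma_j."""
    beta = np.asarray(beta, dtype=float)
    J = len(beta)
    gamma = np.empty(J)
    gamma[-1] = beta.sum()
    for j in range(J - 1, 0, -1):        # order j = J, J-1, ..., 2 of the text
        target, b = gamma[j], beta[j]
        lo, hi = 0.0, target             # root lies in [0, gamma_j] since b >= 0
        while hi - lo > tol * max(1.0, target):
            mid = 0.5 * (lo + hi)
            if mid + b * mid**lam < target:
                lo = mid
            else:
                hi = mid
        gamma[j - 1] = 0.5 * (lo + hi)
    f = gamma[1:] / gamma[:-1]
    return gamma, f
```

At λ = 0 the recursion collapses to partial sums of the β_j, as in (3.20); with β_j = 1 throughout one recovers γ = (1, 2, 3, 4) and the factors f = (2, 1.5, 1.33) quoted in the Section 4.3 example.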

4. Correlation between observations

4.1 Semi-recursive models

Consider the model defined in Section 3.4, and specifically the conditional covariance

Cov[X_{k1,m}, X_{k2,m+n} | X_{k1,1}, X_{k2,1}]

with m > 0, n > 0. The following lemma is immediate from assumption (M1).

Lemma 4.1. In the semi-recursive model defined in Section 3.4,

Cov[X_{k1,m}, X_{k2,m+n} | X_{k1,1}, X_{k2,1}] = 0 for k1 ≠ k2.

In view of this result, attention will be focused on the within-row covariances Cov[X_{k,m}, X_{k,m+n} | X_{k1}] and correlations Corr[X_{k,m}, X_{k,m+n} | X_{k1}]. Let the latter be denoted ρ_λ[k; m, m+n], with ρ_0[k; m, m+n] and ρ_1[k; m, m+n] representing the boundary cases of the ODP cross-classified and ODP Mack models respectively. These boundary correlations are evaluated by Taylor (2011b) with the following results:

ρ_ι[k; m, m+n] = {1 + B_ι[k; m, m+n]}^{−1/2} for ι = 0, 1    (4.1)

with

B_0[k; m, m+n] = Σ_{i=m+1}^{m+n} β_i / Σ_{i=2}^{m} β_i    (4.2)

B_1[k; m, m+n] = Σ_{i=m}^{m+n−1} (f_{i+1} ... f_{m+n−1})² (f_i − 1) f_1 ... f_{i−1} / [(f_m ... f_{m+n−1})² Σ_{i=1}^{m−1} (f_{i+1} ... f_{m−1})² (f_i − 1) f_1 ... f_{i−1}]    (4.3)

The properties of ρ_0[k; m, m+n] and ρ_1[k; m, m+n] and the relation between them are discussed by Taylor (2011b). It is established that, while there are distinct similarities between the two, there are also distinct differences. Certainly ρ_0[k; m, m+n] and ρ_1[k; m, m+n] are numerically different. One might therefore wish to formulate a semi-recursive model with correlation structure intermediate between these two cases. It is evident from (3.16) that the homotopy class of semi-recursive models provides a continuum of correlation values between ρ_0[k; m, m+n] and ρ_1[k; m, m+n].

4.2 Evaluation of semi-recursive correlation structures

However, care may be required in the selection of a semi-recursive model, as ρ_λ[k; m, m+n] does not appear to be related to ρ_0[k; m, m+n] and ρ_1[k; m, m+n] in any simple way. Consider the evaluation of ρ_λ[k; m, m+n]. Let c_λ[k; m, m+n] denote Cov[X_{k,m}, X_{k,m+n} | X_{k1}]. Then

c_λ[k; m, m+n] = E[X_{k,m} X_{k,m+n} | X_{k1}] − E[X_{k,m} | X_{k1}] E[X_{k,m+n} | X_{k1}]
= E[X_{k,m} E[X_{k,m+n} | X_{k,m}] | X_{k1}] − E[X_{k,m} | X_{k1}] E[X_{k,m+n} | X_{k1}]    (4.4)

Difficulty arises in the evaluation of the terms E[X_{k,m+n} | X_{k,m}]. In the case of the recursive ODP Mack model, these are evaluated recursively, thus:

E[X_{k,m+n} | X_{k,m}] = E[E[X_{k,m+n} | X_{k,m+n−1}] | X_{k,m}] = f_{m+n−1} E[X_{k,m+n−1} | X_{k,m}], etc.,

where (ODPM3) has been used. If, however, the same procedure is attempted for the semi-recursive model, then, by (3.16),

E[X_{k,m+n} | X_{k,m}] = E[E[X_{k,m+n} | X_{k,m+n−1}] | X_{k,m}]
= E[X_{k,m+n−1} | X_{k,m}] + (f_{m+n−1} − 1)(α_k γ_{m+n−1})^{1−λ} E[X_{k,m+n−1}^λ | X_{k,m}]

and difficulty arises in the evaluation of the last expectation.

4.3 Non-monotonicity of semi-recursive correlations

Care would also be necessary in the selection of semi-recursive correlation structures because, while it is known from Theorem 4.4 of Taylor (2011b) that ρ_0[k; m, m+n] ≤ ρ_1[k; m, m+n], it cannot be assumed that ρ_λ[k; m, m+n] changes monotonically between these extremes. Indeed, my colleague, Hugh Miller, provides the following counter-example.

Example. Consider a semi-recursive model in the representation of Lemma 3.1, with the following parameters: α_k = 100, β_1 = β_2 = β_3 = β_4 = 1, φ = 1 (Poisson case). By (3.19), f_1 = 2, f_2 = 1.5, f_3 = 1.33. Now consider the (unlikely) case of X_{k1} = 1000 (c.f. E[X_{k1}] = 100). Then simulation yields the values of ρ_λ[k; 2, 4] for various λ shown in Table 4.1.

Table 4.1
Values of ρ_λ[k; 2, 4] for varying λ; X_{k1} = 1000

λ      ρ_λ[k; 2, 4]
0      0.578
0.1    0.569
0.2    0.593
1      0.769

The values of ρ for λ = 0, 1 may be verified by the formulas (4.5) and (4.6) (λ = 0) and (4.6) and (4.7) (λ = 1) of Taylor (2011b), but note that ρ does not proceed monotonically between λ = 0 and λ = 0.2.

If a less eccentric value of X_{k1} is chosen, say X_{k1} = 150, the results are as in Table 4.2.

Table 4.2
Values of ρ_λ[k; 2, 4] for varying λ; X_{k1} = 150

λ      ρ_λ[k; 2, 4]
0      0.578
0.1    0.602
0.2    0.627
0.3    0.652
0.4    0.673
0.5    0.693
0.6    0.712
0.7    0.728
0.8    0.744
0.9    0.756
1      0.768

Evidence of slight sampling error is apparent from a comparison of Tables 4.1 and 4.2 at λ = 1. Nonetheless, monotonicity of ρ_λ[k; m, m+n] as a function of λ appears to have been achieved in this second example. Note also, by comparison of Tables 4.1 and 4.2 at λ = 0.1, 0.2, that ρ_λ[k; m, m+n] depends on the observed value of X_{k1} for 0 < λ < 1, whereas it is independent of this observation for λ = 0, 1.

More detail on the relation between λ and ρ_λ[k; m, m+n] might be a fruitful area for future research.
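An experiment of this kind can be imitated with a short Monte Carlo sketch. The parameterisation below follows the reconstruction used in this copy: the conditional mean is taken as (f_j − 1)(α_k γ_j)^{1−λ} X_{kj}^λ per (3.16), with the factors f = (2, 1.5, 4/3) and γ = (1, 2, 3, 4) of the example held fixed while λ varies. Treat it as an illustration of the mechanics rather than a reproduction of the paper's exact figures.

```python
import numpy as np

def simulate_rho(lam, x1, alpha=100.0, f=(2.0, 1.5, 4.0/3.0),
                 gamma=(1.0, 2.0, 3.0, 4.0), n_sim=200_000, seed=0):
    """Monte Carlo estimate of Corr[X_{k2}, X_{k4} | X_{k1} = x1] for the
    semi-recursive model with phi = 1 (Poisson case), using the conditional
    mean (f_j - 1) * (alpha * gamma_j)**(1 - lam) * X_{kj}**lam, i.e. the
    reconstructed representation (3.16)."""
    rng = np.random.default_rng(seed)
    X = np.full(n_sim, float(x1))
    cols = [X]                               # cols[j-1] holds X_{kj}
    for fj, gj in zip(f, gamma[:-1]):
        mean = (fj - 1.0) * (alpha * gj) ** (1.0 - lam) * X ** lam
        X = X + rng.poisson(mean)            # Poisson increment Y_{k,j+1}
        cols.append(X)
    return np.corrcoef(cols[1], cols[3])[0, 1]
```

At λ = 0 the increments are independent Poisson(100) variables, so the exact conditional correlation is (1/3)^{1/2} ≈ 0.577; at λ = 1 the ODP Mack formulas (4.1) and (4.3) give ≈ 0.768. Runs of the sketch land close to both, in line with the endpoints of Tables 4.1 and 4.2.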

5. Parameter estimation and forecasts

5.1 Recursive models

Consider MLE of parameters in the ODP Mack model defined in Section 3.2.2. The conditional log-likelihood of a single observation in D is (terms extraneous to MLE omitted)

ℓ(Y_{k,j+1} | X_{kj}) = {Y_{k,j+1} ln[(f_j − 1) X_{kj}] − (f_j − 1) X_{kj}}/φ_{k,j+1}    (5.1)

The conditional log-likelihood of the entire row k of D is

ℓ(Y_{k2}, Y_{k3}, ..., Y_{kJ} | Y_{k1}) = ℓ(Y_{k2} | Y_{k1}) + ℓ(Y_{k3} | Y_{k1}, Y_{k2}) + ... + ℓ(Y_{kJ} | Y_{k1}, ..., Y_{k,J−1})

by assumption (M2). By extension of this argument,

ℓ(Y_{k2}, ..., Y_{kJ} | Y_{k1}) = Σ_{j=1}^{J−1} ℓ(Y_{k,j+1} | X_{kj}) for k ≤ K − J + 1    (5.2)

The reasoning for k > K − J + 1 is similar but with the upper limit of summation replaced by K − k. Then, by assumption (M1),

ℓ(Y | Y_{(1)}) = Σ_{k=1}^{K} Σ_{j=1}^{min(J, K−k+1)−1} ℓ(Y_{k,j+1} | X_{kj})    (5.3)

Substitution of (5.1) into (5.3) and differentiation with respect to f_j for a particular value of j yields

∂ℓ/∂f_j = Σ_{k=1}^{K−j} [Y_{k,j+1}/(f_j − 1) − X_{kj}]/φ_{k,j+1}    (5.4)

Setting this to zero and rearranging gives the following MLE of f_j:

f̂_j = 1 + [Σ_{k=1}^{K−j} Y_{k,j+1}/φ_{k,j+1}] / [Σ_{k=1}^{K−j} X_{kj}/φ_{k,j+1}]    (5.5)

In the special case in which weights are column dependent only, i.e.

φ_{kj} = φ_j, independent of k    (5.6)

the estimator (5.5) reduces to the usual chain ladder estimator

f̂_j = Σ_{k=1}^{K−j} X_{k,j+1} / Σ_{k=1}^{K−j} X_{kj}    (5.7)

The forecast of a future (i.e. k + j > K + 1) value of Y_{kj} is

Ŷ^R_{kj} = X_{k,K−k+1} f̂_{K−k+1} f̂_{K−k+2} ... f̂_{j−2} (f̂_{j−1} − 1),    k = K − J + 2, ..., K    (5.8)

The estimation and forecast algorithm consisting of (5.7) and (5.8) constitutes the chain ladder algorithm described in Section 3.1.

5.2 Non-recursive models

Consider MLE of parameters in the ODP cross-classified model defined in Section 3.3. The log-likelihood of a single observation in D is

ℓ(Y_{kj}) = [Y_{kj} ln(α_k β_j) − α_k β_j]/φ_{kj}    (5.9)

The log-likelihood for the entire D is

ℓ(Y) = Σ_{k=1}^{K} Σ_{j=1}^{min(J, K−k+1)} ℓ(Y_{kj})    (5.10)

Substitution of (5.9) into (5.10) and differentiation with respect to α_k for a particular value of k yields

∂ℓ/∂α_k = Σ_{j=1}^{min(J, K−k+1)} [Y_{kj}/α_k − β_j]/φ_{kj}    (5.11)

Differentiation with respect to β_j yields

∂ℓ/∂β_j = Σ_{k=1}^{K−j+1} [Y_{kj}/β_j − α_k]/φ_{kj}    (5.12)

Setting (5.11) and (5.12) to zero gives the following MLEs of α_k, β_j:

α̂_k = [Σ_{j=1}^{min(J, K−k+1)} Y_{kj}/φ_{kj}] / [Σ_{j=1}^{min(J, K−k+1)} β̂_j/φ_{kj}]    (5.13)

β̂_j = [Σ_{k=1}^{K−j+1} Y_{kj}/φ_{kj}] / [Σ_{k=1}^{K−j+1} α̂_k/φ_{kj}]    (5.14)

In the special case in which weights are column dependent only, i.e. (5.6) holds, relations (5.13) and (5.14) reduce to the following:

α̂_k = [Σ_{j=1}^{min(J, K−k+1)} Y_{kj}/φ_j] / [Σ_{j=1}^{min(J, K−k+1)} β̂_j/φ_j]    (5.13a)

β̂_j = Σ_{k=1}^{K−j+1} Y_{kj} / Σ_{k=1}^{K−j+1} α̂_k    (5.14a)

In the alternative special case in which weights are row dependent only,

φ_{kj} = φ_k, independent of j    (5.15)

relations (5.13) and (5.14) reduce to the following:

α̂_k = Σ_{j=1}^{min(J, K−k+1)} Y_{kj} / Σ_{j=1}^{min(J, K−k+1)} β̂_j    (5.13b)

β̂_j = [Σ_{k=1}^{K−j+1} Y_{kj}/φ_k] / [Σ_{k=1}^{K−j+1} α̂_k/φ_k]    (5.14b)

In the even more specialised case in which weights are uniform across all cells, the relations simplify further, as follows:

α̂_k = Σ_{j=1}^{min(J, K−k+1)} Y_{kj} / Σ_{j=1}^{min(J, K−k+1)} β̂_j    (5.13c)

β̂_j = Σ_{k=1}^{K−j+1} Y_{kj} / Σ_{k=1}^{K−j+1} α̂_k    (5.14c)

The last case includes the case φ_{kj} = 1, i.e. the ODP distribution reduces to Poisson. This is a case where MLEs have been studied in detail by Hachemeister & Stanard (1975), Renshaw & Verrall (1998) and Taylor (2000), among others, where it is shown that (5.13c) and (5.14c) are equivalent to the chain ladder estimates (5.7) when the f̂_j and β̂_j are related by

f̂_j = Σ_{i=1}^{j+1} β̂_i / Σ_{i=1}^{j} β̂_i

It is shown by England & Verrall (2002) that this result continues to hold in the more general case of uniform dispersion parameters. The forecast of a future value of Y_{kj} is

Ŷ^{NR}_{kj} = α̂_k β̂_j,    k + j > K + 1    (5.16)

5.3 Relation between recursive and non-recursive cases

5.3.1 Poisson distribution

Taylor (2000, Chapter 2) studies the ODP cross-classified model subject to φ_{kj} = 1 (i.e. Poisson distribution in each cell of D̄), with MLEs given by (5.13c) and (5.14c). It is shown there (equation (2.47)) that

Σ_{k=1}^{K−j} X_{k,j+1} / Σ_{k=1}^{K−j} X_{kj} = Σ_{i=1}^{j+1} β̂_i / Σ_{i=1}^{j} β̂_i    (5.17)

Comparison of this result with (5.7) shows that

f̂_j = Σ_{i=1}^{j+1} β̂_i / Σ_{i=1}^{j} β̂_i    (5.18)

establishing the relation between the MLEs of the recursive and non-recursive models.
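The equivalence just cited can be checked numerically: solving the marginal-sum equations (5.13c)-(5.14c) by fixed-point iteration on a small hypothetical triangle reproduces the chain ladder forecasts built from (5.7)-(5.8) cell by cell.

```python
import numpy as np

def cross_classified_mle(Y, n_iter=500):
    """Fixed-point iteration on the uniform-dispersion MLE equations
    (5.13c)-(5.14c) for the ODP cross-classified model.  Y is K x J with
    NaN future cells.  Returns (alpha_hat, beta_hat), unique up to scale."""
    K, J = Y.shape
    past = ~np.isnan(Y)
    alpha = np.nansum(Y, axis=1)            # crude starting values: row sums
    beta = np.full(J, 1.0 / J)
    for _ in range(n_iter):
        beta = np.array([Y[past[:, j], j].sum() / alpha[past[:, j]].sum()
                         for j in range(J)])             # (5.14c)
        alpha = np.array([Y[k, past[k]].sum() / beta[past[k]].sum()
                          for k in range(K)])            # (5.13c)
    return alpha, beta

# Hypothetical 3 x 3 triangle
Y = np.array([
    [100.0,  50.0,  20.0],
    [110.0,  60.0, np.nan],
    [120.0, np.nan, np.nan],
])
alpha, beta = cross_classified_mle(Y)

# Chain ladder forecasts of the same future cells via (5.7)-(5.8)
f1 = (150.0 + 170.0) / (100.0 + 110.0)
f2 = 170.0 / 150.0
cl_23 = 170.0 * (f2 - 1.0)               # future cell (k = 2, j = 3)
cl_32 = 120.0 * (f1 - 1.0)               # future cell (k = 3, j = 2)
cl_33 = 120.0 * f1 * (f2 - 1.0)          # future cell (k = 3, j = 3)
```

At convergence the products α̂_k β̂_j over the future cells coincide with the chain ladder forecasts, which is exactly the content of (5.17)-(5.18) and of Lemma 5.1 below.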

Verrall (2000, p.93) shows that X_{k,K−k+1}/ẑ_{K−k+1} is the MLE of α_k, where ẑ_j is defined by

ẑ_j = Σ_{i=1}^{j} β̂_i    (5.19)

It follows that

α̂_k = X_{k,K−k+1} / Σ_{i=1}^{K−k+1} β̂_i    (5.20)

Substitution of (5.18) and (5.20) into (5.8) yields

Ŷ^R_{kj} = X_{k,K−k+1} β̂_j / Σ_{i=1}^{K−k+1} β̂_i = α̂_k β̂_j    (5.21)

in which case the forecast of Y_{kj} in the recursive model is Ŷ^R_{kj} = α̂_k β̂_j = Ŷ^{NR}_{kj}, by (5.16).

Thus, the recursive (Poisson Mack) and non-recursive (Poisson cross-classified) models produce the same forecasts when parameters are estimated by MLE in the case φ_{kj} = 1. It then follows from Section 5.1 that those forecasts are obtainable from the chain ladder algorithm.

5.3.2 ODP distribution

Now consider the more general case in which φ_{kj} = φ, independent of k and j but not necessarily equal to unity. In both ODP Mack and ODP cross-classified models, Y_{kj} ~ ODP(μ_{kj}, φ) for some mean μ_{kj}. The meaning of this is

Y_{kj}/φ ~ Poiss(μ_{kj}/φ)    (5.22)

Application of (5.22) to (ODPM3) yields

Y_{k,j+1}/φ | X_{kj} ~ Poiss[(f_j − 1) X_{kj}/φ]    (5.23)

For any fixed value of φ, this last relation indicates that the MLE of f_j is obtained by application of the Poisson theory of Section 5.3.1 (i.e. with φ = 1) but with each Y_{kj} replaced by Y_{kj}/φ. This leaves the estimator (5.7) unchanged. On the other hand, application of (5.22) to (ODPCC2) yields

Y_{kj}/φ ~ Poiss(α_k β_j/φ)    (5.24)

This last relation indicates that the MLEs of the α_k and β_j are again obtained by application of the Poisson theory of Section 5.3.1 but with Y_{kj} replaced by Y_{kj}/φ. Equations (5.13c) and (5.14c) are unchanged by these substitutions, indicating that the MLEs, and forecasts, of the ODP cross-classified model are the same as in the Poisson case. This fact was noted by England & Verrall (2002, p.449). This leads to the following result.

Lemma 5.1. For a given data array D, the ODP Mack and ODP cross-classified models with dispersion parameters uniform across D (φ_{kj} = φ) generate the same forecasts of D^c on the basis of ML. The ML parameter estimates for the two models are related through (5.18) and (5.20).

Proof. Section 5.3.1 gives the proof for the case φ = 1. The present sub-section shows that all of the forecasts and parameter estimates discussed in Section 5.3.1 are unaffected by a change in φ to a value not equal to unity.

5.4 Semi-recursive models

Consider MLE of parameters in the semi-recursive model defined in Section 3.4. The log-likelihood of the data array is

ℓ(Y) = Σ_{k=1}^{K} ℓ(Y_{k1}) + Σ_{k=1}^{K} Σ_{j=1}^{min(J, K−k+1)−1} ℓ(Y_{k,j+1} | X_{kj})    (5.25)

by the same argument as led to (5.3). The partial log-likelihood for Y_{k1} can be obtained from assumption (ODPSR3a) and that for Y_{k,j+1} | X_{kj} from Lemma 3.1. These give

ℓ(Y_{k1}) = ℓ^{NR}(Y_{k1})    (5.26)

ℓ(Y_{k,j+1} | X_{kj}) = λ ℓ^R(Y_{k,j+1} | X_{kj}) + (1 − λ) ℓ^{NR}(Y_{k,j+1})    (5.27)

where ℓ^R denotes a log-likelihood within the recursive model of Section 5.1 and ℓ^{NR} one within the non-recursive model of Section 5.2. Substitution of (5.26) and (5.27) into (5.25) gives

ℓ(Y) = λ ℓ^R(Y) + (1 − λ) ℓ^{NR}(Y)    (5.28)

by (5.3) and (5.10). It is shown in Section 5.1 that the chain ladder estimates (5.7) of the parameters f_j set the derivative of the log-likelihood component ℓ^R to zero in the case of column dependent dispersion parameters (5.6). Likewise, it is shown in Section 5.2 that the chain ladder estimates (5.7) of the parameters f_j set the derivative of the log-likelihood component ℓ^{NR} to zero in the case of uniform dispersion parameters when the f̂_j and β̂_j are related as in (5.18). It follows that the chain ladder estimates (5.7) set the derivative of the log-likelihood (5.28) to zero under the same conditions. These results may be summarised as follows.

Theorem 5.2. Suppose that the data array D is subject to a semi-recursive model as represented in Lemma 3.1 with φ_{kj} = φ, independent of k and j. Then:
(a) the MLEs of its parameters f_j are obtained by treating the data array D as if subject to the (recursive) ODP Mack model;
(b) the MLEs of parameters α_k, β_j are obtained by treating the data array as if subject to the (non-recursive) ODP cross-classified model;
(c) these parameter estimates are related by (5.18) and (5.20), and the ODP Mack, ODP cross-classified and semi-recursive forecasts of any particular future Y_{kj} are all identical. The forecasts are obtainable by application of the chain ladder algorithm.

The theorem shows that the chain ladder algorithm provides ML parameter estimates and forecasts for the entire homotopy class of semi-recursive models defined in Section 3.4.

6. Sufficient statistics

The following results are special cases of more general results appearing in Taylor (2011a).

Lemma 6.1.
(a) For an ODP Mack model, Σ^{C(j+1)} Y_{k,j+1} is a sufficient statistic for f_j.
(b) For an ODP cross-classified model, Σ^{R(k)} Y_{kj} is a sufficient statistic for α_k, and Σ^{C(j)} Y_{kj} is a sufficient statistic for β_j.
(c) In case (b), the sufficient statistic for the full parameter set {α_1, ..., α_K, β_1, ..., β_J} consists of the row sums and column sums. This sufficient statistic is not minimal.
A minimal sufficient statistic is obtained by deletion of an arbitrary single component.

Proof.
(a) See Theorem 5.1 of Taylor (2011a).
(b) See Theorem 5.2 of Taylor (2011a).
(c) See Theorem 5.3 of Taylor (2011a).

Remark. The minimal statistic defined in part (c) of the lemma is not unique. For full detail on the construction of alternative minimal sufficient statistics, see Theorem 5.3 of Taylor (2011a).

Theorem 6.2. For the semi-recursive model defined in Section 3.4:
(a) The vector s = (Σ^{R(1)} Y_{1j}, ..., Σ^{R(K)} Y_{Kj}, Σ^{C(1)} Y_{k1}, ..., Σ^{C(J)} Y_{kJ}) is a sufficient statistic for the parameter set {f_1, ..., f_{J−1}, α_1, ..., α_K, β_1, ..., β_J}.
(b) A minimal sufficient statistic can be obtained by the deletion of an arbitrary single component of s. This statistic is complete.

Proof.
(a) Recall the form (5.28) for the log-likelihood ℓ(Y). Theorem 5.1 of Taylor (2011a) shows that ℓ^R satisfies Fisher-Neyman factorisation with respect to the parameter set {f_1, ..., f_{J−1}} and the statistic s. Similarly, ℓ^{NR} satisfies the factorisation with respect to the set {α_1, ..., α_K, β_1, ..., β_J} and s. Thus ℓ(Y) satisfies Fisher-Neyman factorisation with respect to the entire parameter set {f_1, ..., f_{J−1}, α_1, ..., α_K, β_1, ..., β_J}, and it follows that s is a sufficient statistic for that parameter set.

(b) Let the components of s be denoted s^R_1, ..., s^R_K, s^C_1, ..., s^C_J. By the relations at the end of Section 2.1,

s^R_1 + ... + s^R_K = s^C_1 + ... + s^C_J    (6.1)

whence any component of s can be expressed in terms of the other components. This means that s_min, obtained from s by the deletion of an arbitrary component, contains the same information as s and is therefore sufficient for {f_1, ..., f_{J−1}, α_1, ..., α_K, β_1, ..., β_J}.

Now note that this last parameter set can be reduced in dimension. By (3.19), each f_j may be expressed in terms of γ_1, ..., γ_J, and so {f_1, ..., f_{J−1}} may be expressed in terms of {γ_1, ..., γ_J}. Further, by (3.17), this last set may be reduced to {β_1, ..., β_J}. Thus the parameter set for the semi-recursive model may be taken as {α_1, ..., α_K, β_1, ..., β_J}, of dimension K + J − 1 after allowance for the constraint (3.18).
Now s_min is of the same dimension and, for a regression model such as the semi-recursive model, with error terms distributed as a member of the EDF, this equality of dimensions is a necessary and sufficient condition for a sufficient statistic to be complete (Cox & Hinkley, 1974, p. 3).

Finally, since s_min is a complete sufficient statistic, it is immediately minimal sufficient (Lehmann & Casella, 1998).

7. Minimum variance estimation

The Mack model is known to generate unbiased estimates of loss reserve (Mack, 1993). The ODP Mack model with column-dependent dispersion parameters contains the same expectations and leads to the same MLEs (5.7) and (5.8). The same is not true, however, of the ODP cross-classified model. Since the semi-recursive model is a mixture of these two, one can expect that its MLEs will, in general, be biased. However, any bias can be corrected as follows.

Let Z : D^c → R be some predictand and let Ẑ : D → R be a predictor of Z. Define

Z̃ = Ẑ E[Z] / E[Ẑ]. (7.1)

Then

E[Z̃] = E[Z], (7.2)

and so Z̃ is a bias-corrected form of the predictor Ẑ.

Theorem 7.1. Let D be subject to the semi-recursive model of Section 3.4, with φ_kj = φ, constant. Then the bias-corrected chain ladder estimates X̃_kj, R̃_k and R̃ (derived from X̂_kj, R̂_k and R̂, defined by (3.5)-(3.7) respectively) are minimum variance unbiased estimators (MVUEs) of E[X_kj | D], E[R_k | D] and E[R | D].

Proof. Lemma 4.3 of Taylor (2011a) shows that the estimators X̂_kj, R̂_k and R̂ are MLEs for the ODP cross-classified model with φ_kj = φ. The result also appears in England & Verrall (2002).

These estimators are expressible in terms of the statistic s defined in Theorem 6.2. It is apparent from the proof of that theorem that s is expressible in terms of s_min which, by the same theorem, is a complete sufficient (in fact, minimal sufficient) statistic for the semi-recursive model parameter set {f_1, ..., f_{J-1}, α_1, ..., α_K, β_1, ..., β_J}. Thus X̃_kj, R̃_k and R̃ are unbiased estimators that are functions of a complete sufficient statistic. By the Lehmann-Scheffé theorem, they are MVUEs.
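For readers who want the chain ladder estimates in concrete form, here is a minimal sketch of the chain ladder algorithm on an invented 3×3 cumulative triangle (hypothetical numbers; X denotes cumulative claims, and the reserve estimate for each row is forecast ultimate minus the latest observed diagonal value):

```python
import numpy as np

# Invented 3x3 cumulative triangle X_kj; NaN marks future cells.
X = np.array([
    [100.0, 150.0, 165.0],
    [110.0, 165.0, np.nan],
    [120.0, np.nan, np.nan],
])
K, J = X.shape

# Age-to-age factors: ratios of column sums over the rows for which both
# development periods j and j+1 are observed (the chain ladder estimates).
f = []
for j in range(J - 1):
    rows = ~np.isnan(X[:, j + 1])
    f.append(X[rows, j + 1].sum() / X[rows, j].sum())

# Complete the triangle by recursive application of the factors.
X_hat = X.copy()
for j in range(J - 1):
    future = np.isnan(X_hat[:, j + 1])
    X_hat[future, j + 1] = f[j] * X_hat[future, j]

# Reserve estimates: forecast ultimate minus latest observed value per row.
latest = np.array([X[k][~np.isnan(X[k])][-1] for k in range(K)])
R_hat_k = X_hat[:, -1] - latest
R_hat = R_hat_k.sum()
print(f)      # [1.5, 1.1]
print(R_hat)  # 94.5 up to floating point
```

By Theorem 5.2, these quantities are simultaneously the ML forecasts of the ODP Mack, ODP cross-classified and semi-recursive models.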

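The multiplicative correction (7.1) can also be checked by simulation. The toy setting below is invented (a Poisson mean-squared predictor standing in for Ẑ, not the reserving model itself) and assumes the correction factor E[Z]/E[Ẑ] is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, reps = 4.0, 10, 100_000

# Target Z = lam^2 = 16; the predictor Zhat = (sample mean)^2 is biased
# upward, since E[Zhat] = lam^2 + lam/n = 16.4.
y_bar = rng.poisson(lam, size=(reps, n)).mean(axis=1)
Z_hat = y_bar ** 2

# Multiplicative bias correction as in (7.1): Ztilde = Zhat * E[Z]/E[Zhat].
c = lam**2 / (lam**2 + lam / n)
Z_tilde = c * Z_hat

print(Z_hat.mean())    # close to 16.4 (biased)
print(Z_tilde.mean())  # close to 16.0, illustrating (7.2)
```

The correction rescales the predictor but leaves it a function of the same statistic, which is why, in the semi-recursive setting, the corrected estimates inherit the MVUE property via Lehmann-Scheffé.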
The application of Theorem 7.1 is limited by the fact that the bias correction factors in X̃_kj, etc. would rarely be known in practice. On the other hand, the biases contained in chain ladder estimates are tolerated in practice and, in this context, the theorem shows that the chain ladder provides a minimum variance estimate of whatever it is estimating. When the chain ladder bias is small, it provides minimum variance, almost unbiased estimators.

References

Cox D R & Hinkley D V (1974). Theoretical Statistics. Chapman and Hall, London.

England P D & Verrall R J (2002). Stochastic claims reserving in general insurance. British Actuarial Journal, 8(iii), 443-518.

Hachemeister C A & Stanard J N (1975). IBNR claims count estimation with static lag functions. Spring meeting of the Casualty Actuarial Society.

Lehmann E L & Casella G (1998). Theory of point estimation (2nd edition). Springer.

Mack T (1993). Distribution-free calculation of the standard error of chain ladder reserve estimates. ASTIN Bulletin, 23(2), 213-225.

Renshaw A E & Verrall R J (1998). A stochastic model underlying the chain-ladder technique. British Actuarial Journal, 4(iv), 903-923.

Taylor G (2000). Loss reserving: an actuarial perspective. Kluwer Academic Publishers, Boston.

Taylor G (2011a). Maximum likelihood and estimation efficiency of the chain ladder. ASTIN Bulletin, 41(1), 131-155.

Taylor G (2011b). Chain ladder correlations. Research paper No. 220, available at http://www.economics.unimelb.edu.au/actwww/wps20.shtml.

Verrall R J (2000). An investigation into stochastic claims reserving models and the chain-ladder technique. Insurance: Mathematics and Economics, 26(1), 91-99.