Mathematical models in regression credibility theory


BULETINUL ACADEMIEI DE ŞTIINŢE A REPUBLICII MOLDOVA. MATEMATICA
Number 3(58), 2008, Pages 18-33, ISSN 1024-7696

Mathematical models in regression credibility theory

Virginia Atanasiu

Abstract. In this paper we give the matrix theory of some regression credibility models and demonstrate what kind of data is needed to apply linear algebra in these models. Just as in the classical credibility model, we obtain a credibility solution in the form of a linear combination of the individual estimate (based on the data of a particular state) and the collective estimate (based on aggregate USA data). To establish this solution we need the well-known representation formula of the inverse for a special class of matrices. To make the linear credibility results of this study usable in practice, we also provide estimators for the structure parameters, using matrix theory, the scalar product of two vectors, the norm and the concept of perpendicularity with respect to a positive definite matrix given in advance, an extension of Pythagoras' theorem, properties of the trace of a square matrix, and properties of conditional expectations and conditional covariances.

Mathematics subject classification: 15A03, 15A12, 15A48, 15A52, 15A60, 62P05, 62J12, 62J05.
Keywords and phrases: Linearized regression credibility premium, structural parameters, unbiased estimators.

Introduction

In this paper we give the matrix theory of some regression credibility models. The article contains a description of the Hachemeister regression model, which allows for effects like inflation. In Section 1 we present Hachemeister's original model, which involves only one isolated contract; we state its assumptions and derive the optimal linearized regression credibility premium.

Just as in the classical credibility model, we obtain a credibility solution in the form of a linear combination of the individual estimate (based on the data of a particular state) and the collective estimate (based on aggregate USA data). The derivation relies on the well-known representation formula of the inverse for a special class of matrices. It turns out that this procedure does not yet yield a statistic computable from the observations, since the result involves unknown parameters of the structure function. To obtain estimates for these structure parameters, we embed the contract of Hachemeister's classical model in a collective of contracts, all providing independent information on the structure distribution.

(c) Virginia Atanasiu, 2008

Section 2 describes the classical Hachemeister model, in which a portfolio of contracts is studied. Just as in Section 1, we derive the best linearized regression credibility premium for this model and provide useful estimators for the structure parameters, using a well-known representation theorem for a special class of matrices, properties of the trace of a square matrix, the scalar product of two vectors, the norm ||.||^2_P, the concept of perpendicularity and an extension of Pythagoras' theorem, where P is a positive definite matrix given in advance. Thus, to be able to use the result of Section 1, one still has to estimate the portfolio characteristics; some unbiased estimators are given in Section 2, and the practically attractive property of unbiasedness of these estimators is stated.

1 The original regression credibility model of Hachemeister

In the original regression credibility model of Hachemeister, we consider one contract with unknown and fixed risk parameter θ, during a period of t (>= 2) years. The yearly claim amounts are denoted by X_1, ..., X_t and are assumed to be random variables with finite variance. The contract is a random vector consisting of a random structure parameter θ and the observations X_1, ..., X_t; that is, the contract equals (θ, X'), where X = (X_1, ..., X_t)'. For this model we want to estimate the net premium µ_j(θ) = E(X_j | θ), j = 1, ..., t, for a contract with risk parameter θ.

Remark 1.1. In credibility models, the pure net risk premium of the contract with risk parameter θ is defined as:

µ_j(θ) = E(X_j | θ), j = 1, ..., t.
(1.1)

Instead of assuming time independence in the pure net risk premium (1.1), one could assume that the conditional expectation of the claims on a contract changes in time, as follows:

µ_j(θ) = E(X_j | θ) = Y_j' b(θ), j = 1, ..., t,  (1.2)

where the design vector Y_j is known (Y_j is a non-random column vector of length q) and where b(θ) is the unknown vector of regression constants (a column vector of length q).

Remark 1.2. Because of inflation we are not willing to assume that E(X_j | θ) is independent of j. Instead we make the regression assumption E(X_j | θ) = Y_j' b(θ). When estimating the vector β from the initial regression hypothesis E(X_j) = Y_j' β formulated by the actuary, Hachemeister found great differences between the states. He then assumed

that to each of the states there was related an unknown random risk parameter θ containing the risk characteristics of that state, and that the θ's of different states were independent and identically distributed. Again considering one particular state, we assume that E(X_j | θ) = Y_j' b(θ), with E[b(θ)] = β.

Consequence of the hypothesis (1.2):

µ(θ) = E(X | θ) = Y b(θ),  (1.3)

where Y is a (t x q) matrix given in advance, the so-called design matrix, of full rank q (q <= t), and where b(θ) is the unknown regression vector (a column vector of length q).

Observation. By a suitable choice of the Y_j (assumed to be known), time effects on the risk premium can be introduced.

Examples. 1) If the design matrix is chosen as

Y = ( 1  1  1^2
      1  2  2^2
      ...
      1  t  t^2 ),

we obtain a quadratic inflationary trend:

µ_j(θ) = b_1(θ) + j b_2(θ) + j^2 b_3(θ), j = 1, ..., t,

where b(θ) = (b_1(θ), b_2(θ), b_3(θ))'. Indeed, by standard computations we obtain

µ(θ) = Y b(θ) = (1 b_1(θ) + 1 b_2(θ) + 1^2 b_3(θ), 1 b_1(θ) + 2 b_2(θ) + 2^2 b_3(θ), ..., 1 b_1(θ) + t b_2(θ) + t^2 b_3(θ))',

and since µ(θ) = (µ_1(θ), µ_2(θ), ..., µ_t(θ))', our first assertion is established.

2) If the design matrix is chosen as

Y = ( 1  1
      1  2
      ...
      1  t )

(the last column of the matrix in Example 1 is omitted), a linear inflation results:

µ_j(θ) = b_1(θ) + j b_2(θ), j = 1, ..., t,

where b(θ) = (b_1(θ), b_2(θ))'. The proof is similar.

After these motivating introductory remarks, we state the model assumptions in more detail. Let X = (X_1, ..., X_t)' be an observed random (t x 1) vector and θ an unknown random risk parameter. We assume that

E(X | θ) = Y b(θ),  (H1)

and that the matrices

Λ = Cov[b(θ)]  (a (q x q) matrix)  (H2)

Φ = E[Cov(X | θ)]  (a (t x t) matrix)  (H3)

are positive definite. We finally introduce E[b(θ)] = β.

Let µ̂_j be the credibility estimator of µ_j(θ) based on X. For the development of an expression for µ̂_j, we shall need the following lemma.

Lemma 1.1 (Representation formula of the inverse for a special class of matrices). Let A be an (r x s) matrix and B an (s x r) matrix. Then

(I + AB)^{-1} = I - A(I + BA)^{-1}B,  (1.4)

if the displayed inverses exist.

Proof. We have

I = I + AB - AB = I + AB - A(I + BA)(I + BA)^{-1}B =
  = (I + AB) - (A + ABA)(I + BA)^{-1}B =
  = (I + AB) - (I + AB)A(I + BA)^{-1}B,

giving I = (I + AB)[I - A(I + BA)^{-1}B], and multiplying this equation from the left by (I + AB)^{-1} gives (1.4).

Observation. Here I denotes the identity matrix of the appropriate order ((r x r) on the left-hand side of (1.4), (s x s) inside the inverse on the right-hand side).

The optimal choice of µ̂_j is determined in the following theorem.

Theorem 1.1. For fixed j, the credibility estimator µ̂_j is given by:

µ̂_j = Y_j' [Z b̂ + (I - Z)β],  (1.5)

with

b̂ = (Y' Φ^{-1} Y)^{-1} Y' Φ^{-1} X,  (1.6)

Z = Λ Y' Φ^{-1} Y (I + Λ Y' Φ^{-1} Y)^{-1},  (1.7)

where I denotes the (q x q) identity matrix (b̂ is a (q x 1) vector and Z a (q x q) matrix).

Proof. The credibility estimator µ̂_j of µ_j(θ) based on X is a linear estimator of the form

µ̂_j = γ_0 + γ' X,  (1.8)

which satisfies the normal equations E(µ̂_j) = E[µ_j(θ)] and Cov(µ̂_j, X) = Cov[µ_j(θ), X], where γ_0 is a scalar constant and γ is a constant (t x 1) vector. The coefficients γ_0 and γ are chosen such that the normal equations are satisfied. We write the normal equations as

E(µ̂_j) = Y_j' β,  (1.9)

Cov(µ̂_j, X) = Cov[µ_j(θ), X].  (1.10)

After inserting (1.8) in (1.10), one obtains

γ' Cov(X) = Cov[µ_j(θ), X],  (1.11)

where

Cov(X) = E[Cov(X | θ)] + Cov[E(X | θ)] = Φ + Cov[Y b(θ)] = Φ + Y Cov[b(θ)] Y' = Φ + Y Λ Y'

and

Cov[µ_j(θ), X] = Cov[µ_j(θ), E(X | θ)] = Cov[Y_j' b(θ), (Y b(θ))'] = Y_j' Cov[b(θ)] Y' = Y_j' Λ Y',

and thus (1.11) becomes γ' (Φ + Y Λ Y') = Y_j' Λ Y', from which

γ' = Y_j' Λ Y' (Φ + Y Λ Y')^{-1} = Y_j' Λ Y' [(I + Y Λ Y' Φ^{-1}) Φ]^{-1} = Y_j' Λ Y' Φ^{-1} (I + Y Λ Y' Φ^{-1})^{-1}.

Lemma 1.1 now gives

γ' = Y_j' Λ Y' Φ^{-1} [I - Y (I + Λ Y' Φ^{-1} Y)^{-1} Λ Y' Φ^{-1}] =
   = Y_j' [Λ Y' Φ^{-1} - Λ Y' Φ^{-1} Y (I + Λ Y' Φ^{-1} Y)^{-1} Λ Y' Φ^{-1}] =
   = Y_j' [I - Λ Y' Φ^{-1} Y (I + Λ Y' Φ^{-1} Y)^{-1}] Λ Y' Φ^{-1} =
   = Y_j' (I + Λ Y' Φ^{-1} Y)^{-1} Λ Y' Φ^{-1},

and, once more using Lemma 1.1,

γ' X = Y_j' (I + Λ Y' Φ^{-1} Y)^{-1} Λ Y' Φ^{-1} X = Y_j' (I + Λ Y' Φ^{-1} Y)^{-1} Λ Y' Φ^{-1} Y b̂ = Y_j' [I - (I + Λ Y' Φ^{-1} Y)^{-1}] b̂,

with b̂ given by (1.6) (note that Y' Φ^{-1} X = (Y' Φ^{-1} Y) b̂). According to Lemma 1.1 we obtain

γ' X = Y_j' {I - [I - Λ Y' Φ^{-1} Y (I + Λ Y' Φ^{-1} Y)^{-1}]} b̂ = Y_j' Z b̂,

with Z given by (1.7). Insertion in (1.9) gives

γ_0 + Y_j' Z E(b̂) = Y_j' β,  (1.12)

where

E(b̂) = (Y' Φ^{-1} Y)^{-1} Y' Φ^{-1} E(X) = (Y' Φ^{-1} Y)^{-1} Y' Φ^{-1} E[E(X | θ)] = (Y' Φ^{-1} Y)^{-1} (Y' Φ^{-1} Y) E[b(θ)] = β,

and thus (1.12) becomes γ_0 + Y_j' Z β = Y_j' β, from which γ_0 = Y_j' (I - Z) β. This completes the proof of Theorem 1.1.

2 The classical credibility regression model of Hachemeister

In this section we introduce the classical regression credibility model of Hachemeister, which consists of a portfolio of k contracts satisfying the constraints of the original Hachemeister model. The contract indexed j is a random vector consisting of a random structure parameter θ_j and observations X_{j1}, ..., X_{jt}; that is, the contract indexed j equals (θ_j, X_j'), where X_j = (X_{j1}, ..., X_{jt})', j = 1, ..., k.

Just as in Section 1, we derive the best linearized regression credibility estimators for this model. Instead of assuming time independence in the net risk premium

µ_q(θ_j) = E(X_{jq} | θ_j), j = 1, ..., k, q = 1, ..., t,  (2.1)

one could assume that the conditional expectation of the claims on a contract changes in time, as follows:

µ_q(θ_j) = E(X_{jq} | θ_j) = y_q' β(θ_j), j = 1, ..., k, q = 1, ..., t,  (2.2)

with the vectors y_q assumed to be known and β(.) assumed to be unknown.
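Before turning to the portfolio model, the one-contract credibility premium of Theorem 1.1 can be sketched numerically. The following minimal illustration (NumPy) uses a linear-trend design matrix and illustrative choices of Φ, Λ, β and the data; it also checks Lemma 1.1 and the equivalent form Z = Λ(Λ + (Y'Φ^{-1}Y)^{-1})^{-1} numerically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Numerical check of Lemma 1.1: (I + AB)^{-1} = I - A (I + BA)^{-1} B
A = rng.normal(size=(3, 2))
B = rng.normal(size=(2, 3))
lhs = np.linalg.inv(np.eye(3) + A @ B)
rhs = np.eye(3) - A @ np.linalg.inv(np.eye(2) + B @ A) @ B
assert np.allclose(lhs, rhs)

# Theorem 1.1: credibility premium for one contract (illustrative inputs)
t, q = 5, 2
Y = np.column_stack([np.ones(t), np.arange(1, t + 1)])  # linear-trend design matrix
Phi = np.diag(rng.uniform(1.0, 2.0, size=t))            # Phi = E[Cov(X | theta)]
Lam = np.array([[0.6, 0.1], [0.1, 0.3]])                # Lambda = Cov[b(theta)]
beta = np.array([10.0, 1.0])                            # collective mean E[b(theta)]
X = Y @ beta + rng.normal(size=t)                       # simulated claim vector

G = Y.T @ np.linalg.inv(Phi) @ Y                             # Y' Phi^{-1} Y
b_hat = np.linalg.solve(G, Y.T @ np.linalg.inv(Phi) @ X)     # individual estimate (1.6)
Z = Lam @ G @ np.linalg.inv(np.eye(q) + Lam @ G)             # credibility factor (1.7)
mu_hat = Y @ (Z @ b_hat + (np.eye(q) - Z) @ beta)            # premium vector, cf. (1.5)
```

The final assertion-free lines deliver the whole premium vector (µ̂_1, ..., µ̂_t)'; taking the j-th component reproduces (1.5) for fixed j.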

Observations. By a suitable choice of the y_q, time effects on the risk premium can be introduced.

Examples. 1) If, for instance, the claim figures are subject to a known inflation i, (2.2) becomes:

µ_q(θ_j) = E(X_{jq} | θ_j) = (1 + i)^q β(θ_j), j = 1, ..., k, q = 1, ..., t.

2) If in addition the volume w_j changes from contract to contract, one could introduce the model:

µ_q(θ_j) = E(X_{jq} | θ_j) = w_j (1 + i)^q β(θ_j), j = 1, ..., k, q = 1, ..., t,

where w_j and i are given.

Consequence of the hypothesis (2.2):

µ(θ_j) = E(X_j | θ_j) = x β(θ_j), j = 1, ..., k,  (2.3)

where x is a (t x n) matrix given in advance, the so-called design matrix, and where the β(θ_j) are the unknown (n x 1) vectors of regression constants. One assumes that for each contract the regression vectors β(θ_j) are the same functions of the different realizations of the structure parameter.

Observations. By a suitable choice of x, time effects on the risk premium can be introduced.

Examples. 1) If the design matrix is chosen as

x = ( 1  1  1^2
      1  2  2^2
      ...
      1  t  t^2 ),

we obtain a quadratic inflationary trend:

µ_q(θ_j) = β_1(θ_j) + q β_2(θ_j) + q^2 β_3(θ_j), j = 1, ..., k, q = 1, ..., t,  (2.4)

where β(θ_j) = (β_1(θ_j), β_2(θ_j), β_3(θ_j))', with j = 1, ..., k.

2) If the design matrix is chosen as

x = ( 1  1
      1  2
      ...
      1  t )

(the last column of the matrix in Example 1 is omitted), a linear inflation results:

µ_q(θ_j) = β_1(θ_j) + q β_2(θ_j), j = 1, ..., k, q = 1, ..., t,  (2.5)

where β(θ_j) = (β_1(θ_j), β_2(θ_j))', with j = 1, ..., k.

For some fixed design matrix x of type (t x n) and full rank n (n < t), and fixed weight matrices v_j of type (t x t), the hypotheses of the Hachemeister model are:

(H1) The contracts (θ_j, X_j') are independent, and the variables θ_1, ..., θ_k are independent and identically distributed.

(H2) E(X_j | θ_j) = x β(θ_j), j = 1, ..., k, where β(θ_j) is an unknown regression vector; Cov(X_j | θ_j) = σ^2(θ_j) v_j, where σ^2(θ_j) = Var(X_{jr} | θ_j), r = 1, ..., t, and v_j is a known non-random (t x t) weight matrix with rg v_j = t, j = 1, ..., k.

We introduce the structural parameters, which are natural extensions of those of the Bühlmann-Straub model:

s^2 = E[σ^2(θ_j)],  (2.6)

a = Cov[β(θ_j)]  (an (n x n) matrix),  (2.7)

b = E[β(θ_j)]  (an (n x 1) vector),  (2.8)

where j = 1, ..., k. After the credibility result based on these structural parameters is obtained, one has to construct estimates for these parameters. Write:

c_j = Cov(X_j)  (a (t x t) matrix),
u_j = (x' v_j^{-1} x)^{-1}  (an (n x n) matrix),
z_j = a (a + s^2 u_j)^{-1}  (the resulting credibility factor for contract j), j = 1, ..., k.

Before proving the linearized regression credibility premium, we first give the classical result for the regression vector, namely the GLS estimator of β(θ_j).

Theorem 2.1 (Classical regression result). The vector B_j minimizing the weighted distance to the observations X_j,

d(B_j) = (X_j - x B_j)' v_j^{-1} (X_j - x B_j),

reads

B_j = (x' v_j^{-1} x)^{-1} x' v_j^{-1} X_j = u_j x' v_j^{-1} X_j,

or

B_j = (x' c_j^{-1} x)^{-1} x' c_j^{-1} X_j

in case c_j = s^2 v_j + x a x'.

Proof. The first equality results immediately from the minimization procedure for the quadratic form involved, the second one from Lemma 2.1.

Lemma 2.1 (Representation theorem for a special class of matrices). If C and V are (t x t) matrices, A an (n x n) matrix and x a (t x n) matrix, and C = s^2 V + x A x', then

(x' C^{-1} x)^{-1} = s^2 (x' V^{-1} x)^{-1} + A

and

(x' C^{-1} x)^{-1} x' C^{-1} = (x' V^{-1} x)^{-1} x' V^{-1}.
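Both identities of Lemma 2.1, and hence the two equivalent forms of the GLS estimator in Theorem 2.1, are easy to verify numerically. A minimal sketch (NumPy; the design matrix, weight matrix, A, s^2 and the observation vector are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
t, n = 6, 2
x = np.column_stack([np.ones(t), np.arange(1, t + 1)])   # (t x n) design matrix
v = np.diag(rng.uniform(0.5, 2.0, size=t))               # known weight matrix v_j
A = np.array([[0.8, 0.2], [0.2, 0.5]])                   # stands in for a = Cov[beta(theta_j)]
s2 = 1.3
c = s2 * v + x @ A @ x.T                                 # c_j = s^2 v_j + x a x'

X = rng.normal(size=t)                                   # a vector of observations
vinv, cinv = np.linalg.inv(v), np.linalg.inv(c)
B_v = np.linalg.solve(x.T @ vinv @ x, x.T @ vinv @ X)    # GLS with weight v_j
B_c = np.linalg.solve(x.T @ cinv @ x, x.T @ cinv @ X)    # GLS with weight c_j
```

The point of the second form is that weighting by the full covariance c_j of X_j leaves the GLS estimator unchanged, because the extra term x a x' lies in the column space of the design matrix.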

We can now derive the regression credibility results for the estimates of the parameters of the linear model. Multiplying this vector of estimates by the design matrix provides us with the credibility estimate of µ(θ_j); see (2.3).

Theorem 2.2 (Linearized regression credibility premium). The best linearized estimate of E[β(θ_j) | X_j] is given by:

M_j = z_j B_j + (I - z_j) b,  (2.9)

and the best linearized estimate of E[x β(θ_j) | X_j] is given by:

x M_j = x [z_j B_j + (I - z_j) b].  (2.10)

Proof. The best linearized estimate M_j of E[β(θ_j) | X_j] is determined by solving the following problem:

Min_ε d(ε),  (2.11)

with

d(ε) = ||β(θ_j) - (M_j + ε V)||^2_P = E[(β(θ_j) - M_j - ε V)' P (β(θ_j) - M_j - ε V)],  (2.12)

where V is an (n x 1) vector whose components are linear combinations of 1 and the components of X_j, P is a positive definite (n x n) matrix given in advance, and ||.||^2_P is the norm defined by ||X||^2_P = E(X' P X) for an arbitrary (n x 1) vector X. The theorem holds in case d'(0) = 0 for every V. Standard computations lead to

d(ε) = E[β(θ_j)' P β(θ_j)] - E[β(θ_j)' P M_j] - ε E[β(θ_j)' P V] - E[M_j' P β(θ_j)] + E[M_j' P M_j] + ε E[M_j' P V] - ε E[V' P β(θ_j)] + ε E[V' P M_j] + ε^2 E[V' P V].  (2.13)

The derivative d'(ε) is given by

d'(ε) = -2 E[V' P (β(θ_j) - M_j - ε V)].  (2.14)

Define reduced variables by

β^0(θ_j) = β(θ_j) - E[β(θ_j)] = β(θ_j) - b,  (2.15)

B^0_j = B_j - E(B_j) = B_j - b,  (2.16)

X^0_j = X_j - E(X_j) = X_j - x b.  (2.17)

Inserting M_j from (2.9) in (2.14) for ε = 0, we have to prove that

E[V' P (β(θ_j) - z_j B_j - b + z_j b)] = 0,  (2.18)

for every V. Using (2.15) and (2.16), relation (2.18) can be written as

E[V' P (β^0(θ_j) - z_j B^0_j)] = 0,  (2.19)

for every V. But since V is an arbitrary vector whose components are linear combinations of 1 and the components of X_j, it may be written as

V = α_0 + α_1 X^0_j,  (2.20)

with α_0 an (n x 1) vector and α_1 an (n x t) matrix. Therefore one has to prove that

E[(α_0' + X^0_j' α_1') P (β^0(θ_j) - z_j B^0_j)] = 0  (2.21)

for every α_0 and α_1. Standard computations lead to the following expression for the left-hand side:

α_0' P E[β^0(θ_j)] - α_0' P z_j E(B^0_j) + E[X^0_j' α_1' P β^0(θ_j)] - E[X^0_j' α_1' P z_j B^0_j] =
= E[X^0_j' α_1' P (β^0(θ_j) - z_j B^0_j)] = E{Tr[X^0_j' α_1' P (β^0(θ_j) - z_j B^0_j)]} =
= E{Tr[α_1' P (β^0(θ_j) - z_j B^0_j) X^0_j']} = Tr{α_1' P E[(β^0(θ_j) - z_j B^0_j) X^0_j']},  (2.22)

where we used the fact that E[β^0(θ_j)] = 0 and E(B^0_j) = 0, that a scalar random variable trivially equals its trace, and that Tr(AB) = Tr(BA). Expression (2.22) is equal to zero, as can be seen from

E[(β^0(θ_j) - z_j B^0_j) X^0_j'] = E[β^0(θ_j) X^0_j'] - z_j E(B^0_j X^0_j') =
= Cov[β(θ_j), X_j] - z_j Cov(B_j, X_j) =
= a x' - z_j (a + s^2 u_j) x' =
= a x' - a (a + s^2 u_j)^{-1} (a + s^2 u_j) x' = a x' - a x' = 0.  (2.23)

This proves (2.9); (2.10) follows by replacing P in (2.12) by x' P x and repeating the same reasoning.

Remark 2.1. Here and in the following we present the main results, leaving the detailed computations to the reader.

Remark 2.2. From (2.9) we see that the credibility estimates for the parameters of the linear model are given as the matrix version of a convex mixture of the classical regression result B_j and the collective result b.

Theorem 2.2 concerns one particular contract. By the assumptions, the structural parameters a, b and s^2 do not depend on j. So if there are more contracts, these parameters can be estimated.
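Formula (2.9) reads directly as code: the credibility estimate shrinks the individual GLS estimate B_j toward the collective mean b, with matrix weight z_j. A minimal sketch (NumPy; all numerical inputs below are illustrative assumptions, with a, s^2 and b treated as known):

```python
import numpy as np

n = 2
a = np.array([[0.7, 0.1], [0.1, 0.4]])      # a = Cov[beta(theta_j)], assumed known here
s2 = 0.9                                    # s^2 = E[sigma^2(theta_j)]
u_j = np.diag([0.3, 0.2])                   # u_j = (x' v_j^{-1} x)^{-1}
b = np.array([10.0, 1.0])                   # collective mean b = E[beta(theta_j)]
B_j = np.array([11.5, 0.8])                 # GLS estimate for contract j

z_j = a @ np.linalg.inv(a + s2 * u_j)       # credibility factor z_j
M_j = z_j @ B_j + (np.eye(n) - z_j) @ b     # (2.9): matrix convex mixture
```

As s^2 u_j shrinks (precise individual data), z_j tends to the identity and M_j to B_j; as a shrinks (homogeneous portfolio), M_j tends to the collective result b.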

Every vector B_j gives an unbiased estimator of b. Consequently, so does every linear combination of the type Σ_j α_j B_j, where the vector of matrices (α_j)_{j=1,...,k}, each α_j of type (n x n), is such that:

Σ_{j=1}^{k} α_j = I  (the (n x n) identity matrix).  (2.24)

The optimal choice of the α_j is determined in the following theorem.

Theorem 2.3 (Estimation of the parameter b in the regression credibility model). The optimal solution to the problem

Min_α d(α),  (2.25)

where

d(α) = ||b - Σ_j α_j B_j||^2_P = E[(b - Σ_j α_j B_j)' P (b - Σ_j α_j B_j)]

(the distance from Σ_j α_j B_j to the parameter b), P a given positive definite (n x n) matrix, and α = (α_j)_{j=1,...,k} a vector of matrices satisfying (2.24), is:

b̂ = Z^{-1} Σ_{j=1}^{k} z_j B_j,  (2.26)

where Z = Σ_{j=1}^{k} z_j and z_j is defined as z_j = a (a + s^2 u_j)^{-1}, j = 1, ..., k.

Proof. Using the norm ||X||^2_P = E(X' P X) and the perpendicularity concept for two (n x 1) vectors X and Y, defined by X ⊥ Y iff E(X' P Y) = 0, we see that it is sufficient to prove that for all feasible α

(Σ_j α_j B_j - b̂) ⊥ (b̂ - b),  (2.27)

since then, according to an extension of Pythagoras' theorem (||X + Y||^2_P = ||X||^2_P + ||Y||^2_P when X ⊥ Y), we have

||b - Σ_j α_j B_j||^2_P = ||b - b̂ + b̂ - Σ_j α_j B_j||^2_P = ||b - b̂||^2_P + ||b̂ - Σ_j α_j B_j||^2_P,  (2.28)

so for every choice of α one gets

||b - b̂||^2_P <= ||b - Σ_j α_j B_j||^2_P.  (2.29)

So let us show now that (2.27) holds. It is clear that

b̂ - Σ_j α_j B_j = Σ_j Z^{-1} z_j B_j - Σ_j α_j B_j = Σ_j (Z^{-1} z_j - α_j) B_j = Σ_j γ_j B_j,  (2.30)

with γ_j = Z^{-1} z_j - α_j, j = 1, ..., k, and

Σ_j γ_j = Σ_j (Z^{-1} z_j - α_j) = Z^{-1} Z - I = I - I = 0.  (2.31)

To prove (2.27), we have to show that

(Σ_j γ_j B_j) ⊥ (b̂ - b),  (2.32)

that is,

E[(Σ_j γ_j B_j)' P (b̂ - b)] = 0.  (2.33)

The left-hand side of (2.33) can successively be rewritten as follows:

Σ_j E[B_j' γ_j' P (b̂ - b)] = Σ_j E(B_j' γ_j' P b̂^0) =
= Σ_j [E(B_j' γ_j' P b̂^0) - b' γ_j' P E(b̂^0)] = Σ_j [E(B_j' γ_j' P b̂^0) - E(b' γ_j' P b̂^0)] =
= Σ_j E[(B_j - b)' γ_j' P b̂^0] = Σ_j E(B^0_j' γ_j' P b̂^0) =
= Σ_{j,i} E(B^0_j' γ_j' P Z^{-1} z_i B^0_i) = Σ_{j,i} E[Tr(B^0_j' γ_j' P Z^{-1} z_i B^0_i)] =
= Σ_{j,i} E[Tr(γ_j' P Z^{-1} z_i B^0_i B^0_j')] = Σ_{j,i} Tr[γ_j' P Z^{-1} z_i E(B^0_i B^0_j')] =
= Σ_{j,i} Tr[γ_j' P Z^{-1} z_i Cov(B_i, B_j)] = Σ_{j,i} Tr[γ_j' P Z^{-1} z_i δ_{ij} (a + s^2 u_j)] =

= Σ_j Tr[γ_j' P Z^{-1} z_j (a + s^2 u_j)] =
= Σ_j Tr[γ_j' P Z^{-1} a (a + s^2 u_j)^{-1} (a + s^2 u_j)] =  (2.34)
= Σ_j Tr(γ_j' P Z^{-1} a) = Tr[(Σ_j γ_j)' P Z^{-1} a] =
= Tr(0 . P Z^{-1} a) = Tr(0) = 0,

where b̂^0 = b̂ - E(b̂) = b̂ - b and B^0_j = B_j - E(B_j) = B_j - b are the reduced variables. In (2.34) we used the fact that E(b̂^0) = 0, that a scalar random variable trivially equals its trace, and that Tr(AB) = Tr(BA). The proof is complete.

Theorem 2.4 (Unbiased estimator of s^2 for each contract). In case the number of observations t in the j-th contract is larger than the number of regression constants n, the following is an unbiased estimator of s^2:

ŝ^2_j = (X_j - x B_j)' v_j^{-1} (X_j - x B_j) / (t - n).  (2.35)

Corollary (Unbiased estimator of s^2 in the regression model). Let K denote the number of contracts with t > n. Then E(ŝ^2) = s^2 if

ŝ^2 = (1/K) Σ_{j: t > n} ŝ^2_j.  (2.36)

For a, we give an unbiased pseudo-estimator, defined in terms of itself, so it can only be computed iteratively.

Theorem 2.5 (Pseudo-estimator of a). The following random variable has expected value a:

â = (1/(k - 1)) Σ_j z_j (B_j - b̂)(B_j - b̂)'.  (2.37)

Proof. By standard computations we obtain

E(â) = (1/(k - 1)) Σ_j z_j [E(B_j B_j') - E(B_j b̂') - E(b̂ B_j') + E(b̂ b̂')].  (2.38)

Since

Cov(B_j) = Cov[u_j x' v_j^{-1} X_j, (u_j x' v_j^{-1} X_j)'] = u_j x' v_j^{-1} Cov(X_j) v_j^{-1} x u_j = u_j x' v_j^{-1} (s^2 v_j + x a x') v_j^{-1} x u_j = a + s^2 u_j,

it results that

E(B_j B_j') = Cov(B_j) + E(B_j) E(B_j') = a + s^2 u_j + b b',  (2.39)

where E(B_j) = E[E(B_j | θ_j)] = E[β(θ_j)] = b. Since

Cov(B_j, b̂) = Cov(B_j, Z^{-1} Σ_i z_i B_i) = Σ_i Cov(B_j, B_i) z_i' (Z^{-1})' = Σ_i δ_{ij} (a + s^2 u_i) z_i' (Z^{-1})' = (a + s^2 u_j) z_j' (Z^{-1})',

it results that

E(B_j b̂') = Cov(B_j, b̂) + E(B_j) E(b̂') = (a + s^2 u_j) z_j' (Z^{-1})' + b b',  (2.40)

where E(b̂) = Z^{-1} Σ_j z_j E(B_j) = Z^{-1} Z b = b. Since

Cov(b̂, B_j) = [Cov(B_j, b̂)]' = Z^{-1} z_j (a + s^2 u_j) = Z^{-1} a (a + s^2 u_j)^{-1} (a + s^2 u_j) = Z^{-1} a,

it results that

E(b̂ B_j') = Cov(b̂, B_j) + E(b̂) E(B_j') = Z^{-1} a + b b'.  (2.41)

Since

Cov(b̂) = Cov(Z^{-1} Σ_i z_i B_i, Z^{-1} Σ_j z_j B_j) = Z^{-1} Σ_{i,j} z_i Cov(B_i, B_j) z_j' (Z^{-1})' = Z^{-1} Σ_i z_i (a + s^2 u_i) z_i' (Z^{-1})' =
= Z^{-1} Σ_i a (a + s^2 u_i)^{-1} (a + s^2 u_i) z_i' (Z^{-1})' = Z^{-1} a Σ_i z_i' (Z^{-1})' = Z^{-1} a Z' (Z^{-1})' = Z^{-1} a,

it results that

E(b̂ b̂') = Cov(b̂) + E(b̂) E(b̂') = Z^{-1} a + b b'.  (2.42)

Now the unbiasedness claimed in Theorem 2.5 follows from (2.38), (2.39), (2.40), (2.41) and (2.42).
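Because (2.37) defines â through b̂ and the z_j, which themselves depend on â, it can only be computed by fixed-point iteration. A minimal sketch of that iteration (NumPy), under illustrative assumptions: hypothetical values for s^2, the u_j and the per-contract GLS estimates B_j, a fixed iteration count, and no general convergence guarantee.

```python
import numpy as np

rng = np.random.default_rng(3)
k, n = 6, 2
s2 = 0.25                                        # assume s^2 already estimated via (2.36)
U = [np.eye(n) * 0.1 * (j + 1) for j in range(k)]          # hypothetical u_j matrices
B = rng.normal(loc=[10.0, 1.0], scale=1.0, size=(k, n))    # hypothetical GLS estimates B_j

a = np.eye(n)                                    # starting value for the iteration
for _ in range(100):
    z = [a @ np.linalg.inv(a + s2 * Uj) for Uj in U]       # credibility factors z_j
    Zsum = sum(z)                                          # Z = sum_j z_j
    b_hat = np.linalg.solve(Zsum, sum(zj @ Bj for zj, Bj in zip(z, B)))   # (2.26)
    a = sum(zj @ np.outer(Bj - b_hat, Bj - b_hat)
            for zj, Bj in zip(z, B)) / (k - 1)             # (2.37)
```

Each pass recomputes the z_j and b̂ from the current â and then updates â; in practice one stops when successive iterates agree to a chosen tolerance.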

Remark 2.3. Another unbiased estimator of a is the following:

â = [ (1/2) Σ_{i,j} w_i w_j (B_i - B_j)(B_i - B_j)' - ŝ^2 Σ_j w_j (w_. - w_j) u_j ] / (w_.^2 - Σ_j w_j^2),  (2.43)

where w_j is the volume of the risk for the j-th contract, j = 1, ..., k, and w_. = Σ_j w_j.

Proof. Tedious but straightforward computations lead to

(w_.^2 - Σ_j w_j^2) E(â) = (1/2) Σ_{i,j} w_i w_j E[(B_i - B_j)(B_i - B_j)'] - E(ŝ^2) Σ_j w_j (w_. - w_j) u_j =
= (1/2) Σ_{i,j} w_i w_j [E(B_i B_i') - E(B_i B_j') - E(B_j B_i') + E(B_j B_j')] - s^2 Σ_j w_j w_. u_j + s^2 Σ_j w_j^2 u_j =
= (1/2) Σ_{i,j} w_i w_j [2a + s^2 u_i + s^2 u_j - 2 δ_{ij} (a + s^2 u_j)] - s^2 Σ_j w_j w_. u_j + s^2 Σ_j w_j^2 u_j =
= w_.^2 a + s^2 w_. Σ_j w_j u_j - Σ_j w_j^2 a - s^2 Σ_j w_j^2 u_j - s^2 Σ_j w_j w_. u_j + s^2 Σ_j w_j^2 u_j =
= w_.^2 a - Σ_j w_j^2 a = (w_.^2 - Σ_j w_j^2) a.

Thus E(â) = a, which proves our assertion.

Observation. This estimator is a statistic, not a pseudo-estimator. Still, the reason to prefer (2.37) is that it can easily be generalized to multi-level hierarchical models. In any case, the unbiasedness of the credibility premium disappears even if one takes (2.43) to estimate a.

3 Conclusions

The article contains a credibility solution in the form of a linear combination of the individual estimate (based on the data of a particular state) and the collective estimate (based on aggregate USA data). This idea is worked out in regression credibility theory.

In case there is an increase (for instance by inflation) of the results on a portfolio, the risk premium could be considered to be a linear function of time of the type β_0(θ) + t β_1(θ). Then the two parameters β_0(θ) and β_1(θ) must be estimated from the observed variables. This kind of problem is called regression credibility. It arises in cases where the risk premium depends on time, e.g. through inflation. One could then assume a linear effect on the risk premium as an approximation to the real growth, as is also done in time series analysis. These models can be generalized to credibility models for general regression settings, where the risk is characterized by outcomes of other related variables.

This paper contains a description of the Hachemeister regression model allowing for effects like inflation. If there is an effect of inflation, it is contained in the claim figures, so one should use estimates based on these figures instead of external data. This can be done using Hachemeister's regression model. In this article the regression credibility result for the estimates of the parameters of the linear model is derived. After the credibility result based on the structural parameters is obtained, one has to construct estimates for these parameters. Matrix theory provided the means to calculate useful estimators for the structure parameters, and the unbiasedness of these estimators is very attractive from the practical point of view. The fact that the theory rests on somewhat involved linear algebra need not bother the user any more than it does when he applies statistical tools like discriminant analysis, scoring models, SAS and GLIM.

References

[1] Daykin C.D., Pentikäinen T., Pesonen M. Practical Risk Theory for Actuaries. Chapman & Hall, 1993.
[2] Goovaerts M.J., Kaas R., Van Heerwaarden A.E., Bauwelinckx T. Effective Actuarial Methods.
Elsevier Science Publishers B.V., 1990.
[3] Hachemeister C.A. Credibility for regression models with application to trend. In: Credibility, Theory and Applications, Proceedings of the Berkeley Actuarial Research Conference on Credibility, Academic Press, New York, 1975, 129-163.
[4] Sundt B. On choice of statistics in credibility estimation. Scandinavian Actuarial Journal, 1979, 115-123.
[5] Sundt B. An Introduction to Non-Life Insurance Mathematics. Mannheim Series, 1984, 22-54.
[6] Sundt B. Two credibility regression approaches for the classification of passenger cars in a multiplicative tariff. ASTIN Bulletin, 17, 1987, 41-69.

Academy of Economic Studies
Department of Mathematics
Calea Dorobanţilor nr. 15-17, Sector 1
Bucharest, 010552, Romania
E-mail: virginia atanasiu@yahoo.com

Received September 27, 2006