PL-2 The Matrix Inverted: A Primer in GLM Theory

Size: px

Start display at page:

Download "PL-2 The Matrix Inverted: A Primer in GLM Theory"

Alexia Boone
6 years ago
Views:

1 PL-2 The Matrix Inverted: A Primer in GLM Theory 2005 CAS Seminar on Ratemaking Claudine Modlin, FCAS Watson Wyatt Insurance & Financial Services, Inc W W W. W A T S O N W Y A T T. C O M / I N S U R A N C E

2 Generalized linear model benefits Consider all factors simultaneously Allow for nature of random process Provide diagnostics Robust and transparent

3 Example of GLM output (real UK data) % % 6% 7% Log of multiplier % -5% -19% -4% -16% -20% -15% -19% -17% Exposure (policy years) Factor Exposure Oneway relativities Approx 2 SE from estimate GLM estimate

4 Agenda Formularization of GLMs linear predictor, link function, offset error term, scale parameter, prior weights typical model forms Model testing use only variables which are predictive make sure model is reasonable Aliasing

5 Agenda Formularization of GLMs linear predictor, link function, offset error term, scale parameter, prior weights typical model forms Model testing use only variables which are predictive make sure model is reasonable Aliasing

6 Linear models Linear model Y i = i + error i based on linear combination of measured factors Which factors, and how they are best combined is to be derived i = +.age i +.age i 2 +.height i.age i i = +.age i +.(sex i =female) i = ( +.age i ) * exp(.height i.age i )

7 Linear models - formularization E[Y i ] = i = X ij j Var[Y i ] = 2 Y i ~ N( i, 2 )

8 What is X ij j? X defines the explanatory variables to be included in the model could be continuous variables - "variates" could be categorical variables - "factors" contains the parameter estimates which relate to the factors / variates defined by the structure of X "the answer"

9 What is X.? Write X ij j as X. Consider 3 rating factors - age of driver ("age") - sex of driver ("sex") - age of vehicle ("car") Represent by,,,,...

10 What is X.? Suppose we wanted a model of the form: = +.age +.age 2 +.car 27.age 52½ X. would need to be defined as: 1 age 1 age 1 2 car 1 27.age 1 52½ 1 age 2 age 2 2 car 2 27.age 2 52½ 1 age 3 age 3 2 car 3 27.age 3 52½ 1 age 4 age 4 2 car 4 27.age 4 52½ 1 age 5 age 2 5 car 27 5.age 52½

11 What is X.? Suppose we wanted a model of the form: = + if age < if age if age > if sex male 1 + if sex female 2

12 What is X.? Age Sex < >40 M F

13 What is X.? Suppose we wanted a model of the form: = + if age < if age if age > if sex male 1 + if sex female 2

14 What is X.? Suppose we wanted a model of the form: = + if age < if age "Base levels" + if age > if sex male 1 + if sex female 2

15 X. having adjusted for base levels Age Sex < >40 M F

16 Linear models - formularization E[Y i ] = i = X ij j Var[Y i ] = 2 Y i ~ N( i, 2 )

17 Generalized linear models i = f( +.age i +.age i 2 +.height i.age i ) i = f( +.age i +.(sex i =female) )

18 Generalized linear models i = g -1 ( +.age i +.age i 2 +.height i.age i ) i = g -1 ( +.age i +.(sex i =female) )

19 Generalized linear models Linear Models E[Y i ] = i = X ij j Var[Y i ] = 2 Y from Normal distribution Generalized Linear Models E[Y i ] = i = g -1 ( X ij j + i ) Var[Y i ] = V( i )/ i Y from a distribution from the exponential family

20 Generalized linear models Each observation i from distribution with mean i Women Men

21 Generalized linear models E[Y i ] = i = g -1 ( X ij j + i ) j Var[Y i ] =.V( i )/ i

22 Generalized linear models E[Y] = = g -1( X + ) Var[Y] = V( ) /

23 Generalized linear models E[Y] = = g -1( X ) Observed thing (data) Some function (user defined) Some matrix based on data (user defined) as per linear models Parameters to be estimated (the answer!)

24 What is g -1 (X. )? -1 Y = g (X. ) + error Assuming a model with three categorical factors, each observation can be expressed as: -1 Y = g ( ijk i j k ) + error = = = age is in group i sex is in group j car is in group k

25 What is g -1 (X. )? g(x) = x Y ijk = + i + j + k + error g(x) = ln(x) Y ijk = e ( + i + j + k ) + error = A.B i.c j.d k + error where B i = e i etc Multiplicative form common for frequency and amounts

26 Multiplicative model $ x Age Factor Group Factor Sex Factor Male 1.00 Female 1.25 Area Factor A 0.95 B 1.00 C 1.09 D 1.15 E 1.18 F 1.27 G 1.36 H 1.44 E(losses) = $ x 1.42 x 0.92 x 1.00 x 1.15 = $

27 Generalized linear models E[Y] = = g -1( X + ) "Offset" Eg Y = claim numbers Smith: Jones: Male, 30, Ford, 1 years, 2 claims Female, 40, VW, ½ year, 1 claim

28 What is? g(x) = ln(x) ijk = ln(exposure ijk ) E[Y ijk ] = e ( + i + j + k + ijk ) = A.B i.c j.d k. e (ln(exposure ijk )) = A.B i.c j.d k. exposure ijk

29 Restricted models E[Y] = = g -1( X + ) Offset Constrain model (eg territory, ABS discount) Other factors adjusted to compensate

30 Generalized linear models E[Y] = = g -1( X + ) Var[Y] = V( ) /

31 Generalized linear models Var[Y] = V( ) / Normal: = 2, V(x) = 1 Var[Y] = 2.1 Poisson: = 1, V(x) = x Var[Y] = Gamma: = k, V(x) = x 2 Var[Y] = k 2

32 The scale parameter Var[Y] = V( ) / Normal: = 2, V(x) = 1 Var[Y] = 2.1 Poisson: = 1, V(x) = x Var[Y] = Gamma: = k, V(x) = x 2 Var[Y] = k 2

33 Example of effect of changing assumed error Data Normal

34 Example of effect of changing assumed error Data Normal Poisson

35 Example of effect of changing assumed error Data Normal Poisson Gamma

36 Example of effect of changing assumed error - 2 Example portfolio with five rating factors, each with five levels A, B, C, D, E Typical correlations between those rating factors Assumed true effect of factors Claims randomly generated (with Gamma) Random experience analyzed by three models

37 Example of effect of changing assumed error Log of multiplier A B C D E -0.2 True effect

38 Example of effect of changing assumed error Log of multiplier A B C D E -0.2 True effect One way

39 Example of effect of changing assumed error Log of multiplier A B C D E -0.2 True effect One way GLM / Normal

40 Example of effect of changing assumed error Log of multiplier A B C D E -0.2 True effect One way GLM / Normal GLM / Gamma

41 Example of effect of changing assumed error Log of multiplier A B C D E -0.2 True effect GLM / Gamma + 2 SE - 2 SE

42 Prior weights Var[Y] = V( ) / Exposure Other credibility Eg Y = claim frequency Smith: Male, 30, Ford, 1 years, 2 claims, 100% Jones: Female, 40, VW, ½ year, 1 claim, 100%

43 Typical model forms Y Claim frequency Claim number Average claim amount Probability (eg lapses) g(x) ln(x) ln(x) ln(x) ln(x/(1-x)) Error Poisson Poisson Gamma Binomial V(x) 1 x 1 x estimate x 2 1 x(1-x) exposure 1 # claims 1 0 ln(exposure) 0 0

44 0.2 0 Tweedie distributions Incurred losses have a point mass at zero and then a continuous distribution Poisson and gamma not suited to this Tweedie distribution has point mass and parameters which can alter the shape to be like Poisson and gamma above zero f Y ( y;,, ) n 1 1 ( ) ( 1/ y) ( n ) n! y p( Y 0) exp ( 0) n.exp [ y ( )] 0 0 for y 0

45 Generalized linear models Var[Y] = V( ) / Normal: = 2, V(x) = 1 Var[Y] = 2.1 Poisson = 1, V(x) = x Var[Y] = Gamma = k, V(x) = x 2 Var[Y] = k 2 Tweedie = k, V(x) = x p Var[Y] = k p

46 Tweedie distributions Tweedie = k, V(x) = x p Var[Y] = k p Defines a valid distribution for p<0, 1<p<2, p>2 Can be considered as Poisson/gamma process for 1<p<2 Typical values of p for insurance incurred claims around, or just under, 1.5

47 Tweedie conclusions Helpful when important to fit to pure premium Often similar results to traditional approach but differences may occur if numbers and amounts models have effects which are both large and insignificant No information about whether frequencies or amounts are driving result

48 Link function g(x) Offset Error Structure V( ) Linear Predictor Form X. = i + j + k + l Data Y Scale Parameter Prior Weights Numerical MLE Parameter Estimates,,, Diagnostics

49 Maximum likelihood estimation Seek parameters which give highest likelihood function given data Likelihood Parameter 1 Parameter 2

50 Newton-Raphson In one dimension: x n+1 = x n f '(x n ) / f ''(x n ) In n dimensions: n+1 = n H -1.s where is the vector of the parameter estimates (with p elements), s is the vector of the first derivatives of the log-likelihood and H is the (p*p) matrix containing the second derivatives of the log-likelihood

51 Agenda Formularization of GLMs linear predictor, link function, offset error term, scale parameter, prior weights typical model forms Model testing use only variables which are predictive make sure model is reasonable Aliasing

52 Model testing Use only those variables which are predictive standard errors of parameter estimates F tests / 2 tests on deviances Make sure the model is reasonable histogram of deviance residuals residual vs fitted value Box Cox link function investigation

53 Standard errors Roughly speaking, for a parameter p: SE = -1 / ( 2 / p 2 Likelihood) Likelihood Parameter 1 Parameter 2

54 GLM output (significant factor) % 138% % Log of multiplier % 39% 45% 58% 84% 73% 72% 93% Exposure (years) % 5% Vehicle symbol Onew ay relativities Approx 95% confidence interval Parameter es timate P value = 0.0%

55 GLM output (insignificant factor) % Log of multiplier % 4% -1% 3% -5% 0% -5% -1% 3% -1% 4% -1% Exposure (years) Vehicle symbol Onew ay relativities Approx 95% confidence interval Par ameter estimate P value = 52.5%

56 Awkward cases Log of multiplier % -5% 16% -14% 11% -10% -22% 0% 42% 65% 92% 101% 123% 159% 200%

57 Deviances Single figure measure of goodness of fit Try model with & without a factor Statistical tests show the theoretical significance given the extra parameters

58 Deviances Age Sex Vehicle Zone Model A Fitted value Deviance = 9585 df = Mileage Claims? Age Sex Vehicle Model B Fitted value Deviance = 9604 df = Mileage Claims

59 Deviances If known, scaled deviance S output S = n 2 / Y u u (Y u - ) / V( ) d u=1 u S 1 - S 2 ~ 2 d 1 -d 2 If unknown, unscaled deviance D =.S output (D 1 - D 2 ) (d 1 - d 2 ) D 3 / d 3 ~ F d1 -d 2, d 3

60 Model testing Use only those variables which are predictive standard errors of parameter estimates F tests / 2 tests on deviances Make sure the model is reasonable histogram of deviance residuals residual vs fitted value Box Cox link function investigation

61 Residuals Residuals Fitted values Data

62 Residuals Several forms, eg standardized deviance sign (Y u - u ) / ( (1-h u ) ) ½ 2 u ( Y u - ) / V( d standardized Pearson Y - u u (.V( ).(1-h ) / ) u u u Standardized deviance - Normal (0,1) Numbers/frequency residuals problematical ½ Y u u

63 Residuals Histogram of Deviance Residuals Run 12 (Final models with analysis) Model 8 (AD amounts) Frequency Size of deviance residuals Pretium 08/01/ :24

64 Residuals Residual Pretium 04/05/ :03 Log of fitted value

65 Residuals

66 Gamma data, Gamma error Plot of deviance residual against fitted value Run 12 (All claim types, final models, N&A) Model 6 (Own damage, Amounts) 2 1 Deviance Residual Pretium 08/01/ :32 Fitted Value

67 Gamma data, Normal error Plot of deviance residual against fitted value Run 12 (All claim types, final models, N&A) Model 7 (Own damage, Amounts) Deviance Residual Pretium 08/01/ :32 Fitted Value

68 Agenda Formularization of GLMs linear predictor, link function, offset error term, scale parameter, prior weights typical model forms Model testing use only variables which are predictive make sure model is reasonable Aliasing

69 Aliasing and "near aliasing" Aliasing the removal of unwanted redundant parameters Intrinsic aliasing occurs by the design of the model Extrinsic aliasing occurs "accidentally" as a result of the data

70 Intrinsic aliasing X. = + if age if age if age "Base levels" + if sex male 1 + if sex female 2

71 Intrinsic aliasing Example job Run 16 Model 3 - Small interaction - Third party material damage, Numbers % 138% Log of multiplier % 28% 24% Exposure (years) % -19% -6% -11% Age of driver Onew ay relativities Approx 95% confidence interval Unsmoothed estimate Smoothed estimate

72 Extrinsic aliasing If a perfect correlation exists, one factor can alias levels of another Eg if doors declared first: Selected base Exposure: # Doors Unknown Color Selected base Red 13,234 12,343 13,432 13,432 0 Green 4,543 4,543 13,243 2,345 0 Blue 6,544 5,443 15,654 4,565 0 Black 4,643 1,235 14,565 4,545 0 Further aliasing Unknown ,242 This is the only reason the order of declaration can matter (fitted values are unaffected)

73 Extrinsic aliasing Example job Run 16 Model 3 - Small interaction - Third party material damage, Numbers 1 135% Log of multiplier % 39% 45% 57% 82% 72% 70% 91% 103% Exposure (years) 0 0% 0% Malus 0 No claim discount Onew ay relativities Approx 95% confidence interval Unsmoothed estimate Smoothed estimate

74 "Near aliasing" If two factors are almost perfectly, but not quite aliased, convergence problems can result as a result of low exposures (even though one-ways look fine), and/or results can become hard to interpret Selected base Exposure: # Doors Unknown Color Selected base Red 13,234 12,343 13,432 13,432 0 Green 4,543 4,543 13,243 2,345 0 Blue 6,544 5,443 15,654 4,565 0 Black 4,643 1,235 14,565 4,545 2 Unknown ,242 Eg if the 2 black, unknown doors policies had no claims, GLM would try to estimate a very large negative number for unknown doors, and a very large positive number for unknown color

75 "A Practitioner's Guide to Generalized Linear Models" CAS 2004 Discussion Paper Program Copies available at

76 The Matrix Inverted: A Primer in GLM Theory 2005 CAS Seminar on Ratemaking Claudine Modlin, FCAS Watson Wyatt Insurance & Financial Services, Inc W W W. W A T S O N W Y A T T. C O M / I N S U R A N C E

A Practitioner s Guide to Generalized Linear Models

A Practitioner s Guide to Generalized Linear Models A Practitioners Guide to Generalized Linear Models Background The classical linear models and most of the minimum bias procedures are special cases of generalized linear models (GLMs). GLMs are more technically