Introduction to Ratemaking: Multivariate Methods
March 15, 2010
Jonathan Fox, FCAS, GuideOne Mutual Insurance Group

Content Preview
1. Theoretical Issues
2. One-way Analysis Shortfalls
3. Multivariate Methods

There are several theoretical stumbling blocks to overcome to develop rating relativities:
- Separating the signal from the noise
- Not double counting correlated exposures
- Addressing variable interactions

1. Insurance is inherently a stochastic, or random, process

Any set of data you examine will contain:
- Systematic variation: the signal, the true relationships
- Random variation: the noise

Note: the presence of noise along with our signal is the basic reason credibility was conceived. Due to the presence of noise, we don't fully believe our point estimate. Good multivariate methods provide credibility statistics for their estimates.

Filtering the signal from the noise requires credibility estimates based on measuring process variance. For example:
- A coin is tossed 10 times and lands on heads only 4 times. Is the coin biased?
- Over the last three years, territory 101 has had a fire frequency of 0.005 while territory 102 has had a fire frequency of 0.007. Can we say that territory 102 is truly more risky than 101?

To answer these questions we need a measure of the variability in each process.

2. The signal, once detected, is usually made up of inter-related effects: Correlations

Often, distributional biases in exposures exist in the data and cause results to be linked.

[Diagram: linked rating variables such as Credit, Sex, Marital, Driver Age, Vehicle, Cal Yr, and Territory, surrounded by random variation]

Possible example: if all Youths live in the City and all Adults live in Rural areas, then the Age and Territory effects will be correlated.
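The coin-toss question above comes down to a binomial probability and a process-variance calculation. A minimal sketch in Python (the framing as "probability of 4 or fewer heads" is my own way of quantifying the question):

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n tosses of a coin with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of seeing 4 or fewer heads in 10 tosses of a FAIR coin
p_4_or_fewer = sum(binom_pmf(k, 10, 0.5) for k in range(5))
print(round(p_4_or_fewer, 3))  # ≈ 0.377: quite likely, so 4 heads is no evidence of bias

# Process variance of the observed frequency k/n: Var = p(1-p)/n
var_freq = 0.5 * 0.5 / 10
print(var_freq)  # 0.025
```

An observed frequency of 40% sits well within the natural variability of a fair coin over 10 tosses; the same logic, with a Poisson variance in place of the binomial one, applies to the territory frequencies.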

3. The signal is usually made up of inter-related effects: Interactions

An interaction occurs when two variables' indicated factors are correlated; the outcome of one depends on the level of the other.

[Chart: accident frequency (0-35%) against sedan horsepower (100-350) for Adult, Elderly, Youth, and Total drivers]

Possible example: youngsters with muscle cars are more likely to drive them often and fast, while senior citizens with muscle cars are more likely to leave them in the garage and polish them: Horsepower and Driver Age interact.

Summary on theoretical issues
- Process variability measures allow us to gauge the strength of the signal, i.e. the indicated factor estimates.
- Correlations between two variables' exposure distributions cause the indications to be linked. This is NOT an interaction; it is an important effect, and multivariate techniques can resolve this problem.
- Interactions are correlations between two variables' indicated factors: the indicated factors behave differently across levels of a secondary variable.
- It is perfectly possible for two variables to be correlated but have no interaction, or for two variables to have an interaction but not be correlated.

One-way Analysis Techniques and their Shortfalls

Pure Premium example:

              Youthful    Mature      Total
Losses        $266,667    $133,333    $400,000
Exposures     2,000       2,000       4,000
Pure Premium  $133        $67         $100
Relativity    2.00        1.00
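The pure premium example above can be reproduced in a few lines; a minimal sketch using the figures from the slide:

```python
# Losses and exposures by age class, from the pure premium example
losses = {"Youthful": 266_667, "Mature": 133_333}
exposures = {"Youthful": 2_000, "Mature": 2_000}

# Pure premium = losses / exposures, per class and overall
pp = {k: losses[k] / exposures[k] for k in losses}
pp["Total"] = sum(losses.values()) / sum(exposures.values())

# One-way relativities are taken against the base (Mature) class
rel = {k: pp[k] / pp["Mature"] for k in ("Youthful", "Mature")}

print({k: round(v) for k, v in pp.items()})      # Youthful 133, Mature 67, Total 100
print({k: round(v, 2) for k, v in rel.items()})  # Youthful 2.0, Mature 1.0
```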

Problem: one-way pure premium analysis is blind to the rest of the class plan

              No Points   Drivers with Points   Total
Losses        $150,000    $250,000              $400,000
Exposures     2,000       2,000                 4,000
Pure Premium  $75         $125                  $100
Relativity    1.00        1.67

Should a young driver with points be charged 3.33 times the rate of a clean adult (2.00 x 1.67)?

- Yes, if pointed drivers are evenly spread through Youthfuls and Matures (an equal distribution): then no correlation exists between Pointed and Youthful exposures.
- No, if young drivers are more (or less) likely to have points (a biased distribution): then a correlation will exist between the exposures, and the one-way analysis will be distorted by the bias.

[Charts: exposure distribution of points across Youthful and Mature drivers under an equal distribution and under a biased distribution]

Points Example Detail (biased distribution)

Exposures     Youth     Mature    Total
No Points     400       1,600     2,000
Points        1,600     400       2,000
Total         2,000     2,000     4,000

Exposure %    Youth     Mature
No Points     20%       80%
Points        80%       20%
Total         100%      100%

Losses        Youth     Mature    Total
No Points     66,667    83,333    150,000
Points        200,000   50,000    250,000
Total         266,667   133,333   400,000

Losses %      Youth     Mature
No Points     25%       62%
Points        75%       38%
Total         100%      100%

Pure Premium  Youth     Mature    Total
No Points     167       52        75
Points        125       125       125
Total         133       67        100

Pure Premium (relative to each column's total)
              Youth     Mature
No Points     1.25      0.78
Points        0.94      1.88
Total         1.00      1.00

One-Way Relativity (Age relativity x Points relativity)
              Youth     Mature    Points One-way
No Points     2.00      1.00      1.00
Points        3.33      1.67      1.67
Age One-way   2.00      1.00

Two-Way Relativity (actual pure premiums, relative to Mature with No Points)
              Youth     Mature
No Points     3.20      1.00
Points        2.40      2.40

There are several problems with one-way analysis:
- Usually does not provide a measure of significance
- Can overlook exposure correlations
- Not sensitive to factor interactions

Multivariate techniques can overcome these shortfalls:
- Multi-way Pure Premium
- Loss Ratio
- Minimum Bias
- Multiple Linear Regression
- Generalized Linear Regression
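The distortion in the points example can be computed directly from the cells; a minimal sketch using the exposure and loss figures from the example detail:

```python
# Cells keyed (points level, age level), from the Points Example Detail
exposure = {("Points", "Youth"): 1600, ("Points", "Mature"): 400,
            ("NoPts",  "Youth"): 400,  ("NoPts",  "Mature"): 1600}
losses   = {("Points", "Youth"): 200_000, ("Points", "Mature"): 50_000,
            ("NoPts",  "Youth"): 66_667,  ("NoPts",  "Mature"): 83_333}

# Per-cell pure premiums
pp = {cell: losses[cell] / exposure[cell] for cell in exposure}

def margin(label, axis):
    """One-way pure premium for one level of one rating axis (0=points, 1=age)."""
    cells = [c for c in exposure if c[axis] == label]
    return sum(losses[c] for c in cells) / sum(exposure[c] for c in cells)

# One-way relativities (base: Mature and No Points)
age_rel = margin("Youth", 1) / margin("Mature", 1)   # 2.00
pts_rel = margin("Points", 0) / margin("NoPts", 0)   # 1.67

# One-way combined rate vs. the true two-way relativity for Youth with Points
base = pp[("NoPts", "Mature")]
print(round(age_rel * pts_rel, 2))               # 3.33: what one-way would charge
print(round(pp[("Points", "Youth")] / base, 2))  # 2.40: what the data actually show
```

Because points are concentrated among youths, multiplying the two one-way relativities double counts the effect and overcharges the Youth-with-Points cell (3.33 vs. 2.40).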

One-way Loss Ratios are inherently multivariate: the premium takes into account the rest of the class plan

Youth LR = 73% = Loss / Premium = Loss / [(Base Rate) x Rel_Age x Rel_Points x Rel_Terr x ...]

For example, if you look at the relative loss ratios between Youthful and Adult drivers, the premium within that loss ratio will reflect the current factors for Points. Because Youthfuls have a higher percentage of Points, their average premium will be higher due to the higher Points factors. This will lower the loss ratio. In this way we don't double count the effect of Points and Age. Side note: what if Points didn't exist? The appropriate Age factors would change.

One-way loss ratio analysis has a significant shortcoming: it assumes the rating plan's other factors are accurate.

LR = Loss / [(Base Rate) x Rel_Age x Rel_Points x Rel_Terr x ...]

This assumption is often not appropriate, as is the case when there are multiple changes which need to be made. E.g. suppose you want to examine the adequacy of both your Age and Points curves. When you look at loss ratios by Age, you are assuming your current Points factors are good, and vice versa when you look at loss ratios by Points.

Minimum Bias techniques overcome several common shortcomings
- Minimum bias is an iterative approach for reducing the error between observed and indicated relativities
- Optimizes the relativities for multiple changes
- Can use either pure premiums or loss ratios
- Calculates relativities which minimize the error with observed relativities, based on a selected error (bias) function
- The bias function determines how the error which can't be removed is then distributed
- An iterative technique: when the equations converge, you have the optimal solution

Formula representation of the indicated pure premium:

              Youth            Adult            Total (formula equivalent)
Points        100 x A1 x P1    100 x A2 x P1    600 = (100 x A1 x P1) + (100 x A2 x P1)
No Points     100 x A1 x P2    100 x A2 x P2    400 = (100 x A1 x P2) + (100 x A2 x P2)

Example 1: Use the minimum bias procedure to find the optimal relativities for Age and Points

Observed Pure Premiums   Youth   Adult   Total   Rel
Points                   333     167     500     1.67
No Points                200     100     300     1.00
Total                    533     267
Rel                      2.00    1.00

Indicated relativity (Age rel x Points rel):
Points        2.00 x 1.67 = 3.33    1.00 x 1.67 = 1.67
No Points     2.00 x 1.00 = 2.00    1.00 x 1.00 = 1.00

Indicated pure premium (100 x relativity):
Points        333    167
No Points     200    100

Error (indicated vs. actual):
Points        0      0
No Points     0      0

Perfect match: no need to use the minimum bias approach.

Example 2: Use the minimum bias procedure to find the optimal relativities for Age and Points

Observed Pure Premiums   Youth (A1)   Adult (A2)   Total   Rel
Points (P1)              400          200          600     1.50
No Points (P2)           300          100          400     1.00
Total                    700          300
Rel                      2.33         1.00

Indicated relativity (Age rel x Points rel):
Points        2.33 x 1.50 = 3.50    1.00 x 1.50 = 1.50
No Points     2.33 x 1.00 = 2.33    1.00 x 1.00 = 1.00

Indicated pure premium (100 x relativity):
Points        350    150
No Points     233    100

Error (indicated vs. actual):
Points        -50    -50
No Points     -67    0

Imperfect match: use the minimum bias approach.

Which bias function to choose?
- Balance Principle: ensures that the total error in each class is zero
- Least Squares: minimizes the total error, using the sum of squared errors (nominal values)
- Chi-Squared: minimizes the total error, using the sum of squared errors (as a percent of expected)

Example 2: Balance Principle bias function (1st iteration, solving for Points)

The Balance Principle ensures the total error in each class dimension is zero.

Observed Pure Premiums   Youth (A1)   Adult (A2)   Total
Points (P1)              400          200          600
No Points (P2)           300          100          400
Total                    700          300

Assumed initial relativities for Age: Youthful A1 = 2.333, Adult A2 = 1.000.

Set each Points row's total indicated pure premium equal to its observed total:
600 = (100 x A1 x P1) + (100 x A2 x P1) = (100 x 2.333 x P1) + (100 x 1.000 x P1)
400 = (100 x A1 x P2) + (100 x A2 x P2) = (100 x 2.333 x P2) + (100 x 1.000 x P2)

1st iteration solution for Points: P1 = 1.8000 (rebased 1.5000), P2 = 1.2000 (rebased 1.0000).

Next we substitute in the solved values for Points to get new values for Age.

Example 2: Balance Principle bias function (1st iteration, solving for Age)

Using the Points relativities from the prior step (P1 = 1.500, P2 = 1.000), set each Age column's total equal to its observed total:
700 = (100 x A1 x P1) + (100 x A1 x P2) = (100 x A1 x 1.50) + (100 x A1 x 1.00)
300 = (100 x A2 x P1) + (100 x A2 x P2) = (100 x A2 x 1.50) + (100 x A2 x 1.00)

1st iteration solution for Age: A1 = 2.8000 (rebased 2.3333), A2 = 1.2000 (rebased 1.0000).

Convergence:
               1st Iter.   2nd Iter.
P1 (Points)    1.5000      1.5000
P2 (No Pts)    1.0000      1.0000
A1 (Youthful)  2.3333      2.3333
A2 (Adult)     1.0000      1.0000

Example 2 notes:
- Immediate convergence is not always the case
- The method can be extended to many dimensions
- It is possible to code the calculation directly into a spreadsheet (shortcut formulas exist)
- Can be used with multiplicative or additive pricing models
- Could use Least Squares, Chi-Squared, or Maximum Likelihood approaches instead
- For more information see the CAS publication "The Minimum Bias Procedure: A Practitioner's Guide" by Feldblum and Brosius
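The two balance-principle steps above alternate until the relativities stop changing. A minimal sketch for the 2x2 Example 2 (exposure of 100 per cell, as in the slide's formulas; the fixed 10-iteration loop is an arbitrary cap, since this example converges immediately):

```python
# Observed cell pure premiums: rows = Points level (P1, P2), cols = Age level (A1, A2)
obs = [[400, 200],   # Points
       [300, 100]]   # No Points
w = 100              # exposure per cell
A = [700 / 300, 1.0] # starting Age relativities: the one-way 2.333 and 1.000
P = [1.0, 1.0]

for _ in range(10):
    # Balance each Points row: total indicated (w * A * P) = total observed
    P = [sum(obs[i]) / (w * sum(A)) for i in range(2)]
    # Balance each Age column using the newly solved Points relativities
    A = [sum(obs[i][j] for i in range(2)) / (w * sum(P)) for j in range(2)]

# Rebase so the last level of each variable is 1.00
P = [p / P[-1] for p in P]
A = [a / A[-1] for a in A]
print([round(p, 4) for p in P])  # [1.5, 1.0]
print([round(a, 4) for a in A])  # [2.3333, 1.0]
```

The printed values match the slide's converged relativities; with more levels the same row/column balancing extends directly.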

Minimum bias techniques still have limitations. These techniques only give point estimates, yet we know all data contains both signal and noise. Minimum bias techniques provide no method for quantifying the extent and impact of the noise.

Classical statistical techniques provide a method for quantifying noise vs. signal.

General form: Dependent Variable = Signal + Noise

Simple linear regression: y = (β1 x + β0) + ε
e.g. MPG = (0.15 Model Year + 25) + ε

Multiple linear regression: y = (β1 x1 + β2 x2 + ... + βn xn + β0) + ε
e.g. MPG = (-0.10 Weight + 0.20 Model Year + 40) + ε

With insurance applications:
- We use the rating factors as the dimensions in the regression
- Observed pure premiums or loss ratios are used to determine the parameter values, i.e. to fit the model
- Usually categorical variables are used instead of quantitative variables

Specifying a categorical model differs from specifying a quantitative model:
- Categorical: each level of a variable has its own parameter
- Quantitative: each variable has a single slope across all its levels
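The "signal + noise" decomposition can be illustrated by simulating data from the slide's simple MPG model and recovering its coefficients. The data below are synthetic, generated purely for illustration; only the model form (0.15 x Model Year + 25 plus noise) comes from the slide:

```python
import random

random.seed(0)
# Simulate y = (0.15 * model_year + 25) + epsilon, as in the slide's example
years = list(range(70, 90))
mpg = [0.15 * t + 25 + random.gauss(0, 0.5) for t in years]

# Ordinary least squares for one predictor: slope = cov(x, y) / var(x)
n = len(years)
mx, my = sum(years) / n, sum(mpg) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(years, mpg))
         / sum((x - mx) ** 2 for x in years))
intercept = my - slope * mx
print(round(slope, 2), round(intercept, 1))  # close to the true 0.15 and 25
```

The fitted coefficients land near the true signal even though every individual observation contains noise; the fit will never reproduce the true parameters exactly, which is why the next sections care about quantifying that uncertainty.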

We specify a categorical model so each cell is uniquely represented, using dummy variables

Loss          Younger   Older
Clean         1,500     5,000
Pointed       4,500     7,500

This two-dimensional example is formulated as

y = β_Youth x1 + β_Adult x2 + β_Clean x3 + ε

where the x's take on values of 0 or 1 and the default (base) is a Pointed driver. Substituting in the observed pure premiums:

4,500 = β1(1) + β2(0) + β3(0) + ε
1,500 = β1(1) + β2(0) + β3(1) + ε
7,500 = β1(0) + β2(1) + β3(0) + ε
5,000 = β1(0) + β2(1) + β3(1) + ε

Usually the system of equations will not have a fit that perfectly explains all variation. What fit will be best at minimizing the error? To find an answer, we need a criterion for what counts as the best answer. A typical approach is to minimize the sum of the squared errors (SSE):

1. SSE = ε1² + ε2² + ε3² + ε4², where ε = (observed - expected)
2. Minimize by taking the derivative with respect to each beta: ∂SSE/∂βi
3. Set each derivative equal to zero and solve for βi: ∂SSE/∂βi = 0

Solving for the parameters (β) by substituting in the observed pure premiums and minimizing the SSE:

SSE = Σ εi² = Σ (ŷi - yi)²
    = (4,500 - β1)² + (1,500 - β1 - β3)² + (7,500 - β2)² + (5,000 - β2 - β3)²

∂SSE/∂β1 = 2(4,500 - β1)(-1) + 2(1,500 - β1 - β3)(-1)              set = 0
∂SSE/∂β2 = 2(7,500 - β2)(-1) + 2(5,000 - β2 - β3)(-1)              set = 0
∂SSE/∂β3 = 2(1,500 - β1 - β3)(-1) + 2(5,000 - β2 - β3)(-1)         set = 0

Setting the derivatives equal to zero yields a system of linear equations:

2 β1 + β3 = 6,000
2 β2 + β3 = 12,500
β1 + β2 + 2 β3 = 6,500

with solution β1 = 4,375, β2 = 7,625, β3 = -2,750.

Finding the optimal answer for a multiple linear regression boils down to solving systems of equations. Expressing a system of equations is more conveniently done via matrix notation and linear algebra, especially as the number of variables and observations grows:

Y = X β + ε      (Y = signal + noise)

| 4500 |   | 1 0 0 |   | β_Youth |   | ε1 |
| 1500 | = | 1 0 1 | x | β_Adult | + | ε2 |
| 7500 |   | 0 1 0 |   | β_Clean |   | ε3 |
| 5000 |   | 0 1 1 |               | ε4 |

(4 observations; design-matrix columns for Youth, Adult, and Clean; solving for the β's)
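The normal-equations solution above can be checked numerically. A sketch using numpy's least-squares solver on the same 4x3 design matrix:

```python
import numpy as np

# Design matrix: columns are the Youth, Adult, and Clean dummies
X = np.array([[1, 0, 0],   # Younger, Pointed: 4,500
              [1, 0, 1],   # Younger, Clean:   1,500
              [0, 1, 0],   # Older,   Pointed: 7,500
              [0, 1, 1]])  # Older,   Clean:   5,000
y = np.array([4500, 1500, 7500, 5000])

# Least squares minimizes the same SSE as the hand derivation
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # matches the normal-equations solution: 4375, 7625, -2750
```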

We think of the linear model as having three parts: Y = X β + ε
- Random component: the observations being predicted
- Systematic component: the predictor variables, also notated as η ("eta")
- Link function: defines the relationship between the predictors and the observations

Linear modeling is subject to some assumptions that may not fit insurance applications well:
- Random component: observations are independent and come from a normal distribution with a common variance.
- Systematic component: the predictors are related as a linear combination: η = β1 x1 + β2 x2 + β3 x3
- Link function: the expected value of Y is equal to η: E[Y] = η

Problem with the normality assumption: for each variable in our model, there is an expected mean and randomness about that mean. The average loss for younger drivers may be $100, but why should the distribution of individual observations be Normal about this mean? In fact, normal distributions extend to negative numbers. What's a negative loss?

Problem with the common-variance assumption: why should the distribution of losses for 25K limits have the same variance as the distribution of losses for 100K limits? The 25K limits, with a lower mean, would likely have less variance than the 100K limits.

Problem with the linearity and identity-link assumptions: together they assume that Y is predicted by the additive combination of the X variables, E[Y] = β1 x1 + β2 x2 + β3 x3. However, most insurance effects tend to combine multiplicatively:

E[Y] = (β1 x1) x (β2 x2) x (β3 x3) x ...

Round-up of multivariate approaches and their limitations:
- One-way Pure Premium: no measure of significance (point estimate only); can overlook exposure correlations; not sensitive to factor interactions
- Loss Ratio: is sensitive to correlations, but assumes all other pricing factors are accurate (can't be used when changing multiple structures); provides point estimates only
- Minimum Bias: does accommodate multiple structure changes, but only provides point estimates
- Multiple Linear Regression (MLR): accommodates multiple structure changes and provides confidence ranges and significance tests, but requires assumptions that don't fit insurance data

Generalized Linear Models use more lenient assumptions: Y = g^-1(X β) + ε
- Random component: observations are independent, but now come from one of the family of exponential distributions (Normal, Poisson, Gamma, ...). The variance can change with the mean, and negative values can be prohibited.
- Systematic component: no change here: η = β1 x1 + β2 x2 + β3 x3
- Link function: the expected value of Y is equal to a transformation of η: g(E[Y]) = η, or equivalently E[Y] = g^-1(η)

A log link results in a multiplicative relationship between the X's:

log link: g(x) = ln(x)
g(E[Y]) = η, so ln(E[Y]) = η
E[Y] = e^η = e^(x1 β1 + x2 β2) = e^(x1 β1) x e^(x2 β2)

GLM assumptions are better suited for insurance applications. The first step is choosing a link function. You must also select a distributional family that mimics your data. Then you need to decide on your design matrix: which variables to include in your model and how to combine them. This process is best done through an evaluative, trial-and-error process that combines both statistics and judgment.

GLM output can show predictions vs. error, correlated effects, and interactions.

[Charts: frequency by driver points (0, 1, 2, 3+), comparing the multivariate prediction and one-way average against exposure, and showing separate Adult and Youth predictions with error bars]
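The multiplicative effect of the log link can be verified directly. A small sketch (the factor values 2.0 and 1.5 are arbitrary, chosen only for illustration):

```python
from math import exp, log, isclose

# Coefficients on the log scale for two dummy rating variables
b_youth, b_points = log(2.0), log(1.5)  # i.e. multiplicative factors 2.0 and 1.5
x_youth, x_points = 1, 1                # a youthful driver with points

eta = x_youth * b_youth + x_points * b_points
mean = exp(eta)                         # inverse log link: E[Y] = e^eta

# e^(x1*b1 + x2*b2) = e^(x1*b1) * e^(x2*b2): the rating factors multiply
assert isclose(mean, exp(x_youth * b_youth) * exp(x_points * b_points))
print(round(mean, 2))  # 3.0: the base level times 2.0 times 1.5
```

Additivity on the log scale is exactly a multiplicative rating plan on the original scale, which is why the log link is the standard choice for insurance relativities.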

In summary of GLMs:
- As a statistical model, GLMs allow us to have some measure of the noise as well as the signal.
- GLM assumptions are flexible enough to reasonably fit real-world insurance situations.
- It turns out that many minimum bias techniques, all one-way analyses, and all linear regression approaches are just special forms of GLMs.
- GLMs are multivariate and automatically solve the double-counting problem presented by correlated variables. They also allow for many model forms, including interactions.

GLMs are fairly standard in the industry, but there are other, non-linear multivariate techniques as well:
- Decision Trees (CART, C5, CHAID, etc.)
- Neural Networks
- Polynomial Networks
- Clustering
- Kernels
- Others