Parameter Estimation
Chapters 13-15, Stat 477 - Loss Models
Brian Hartman - BYU
Methods for parameter estimation

Methods for estimating parameters in a parametric model:
- method of moments
- matching of quantiles (or percentiles)
- maximum likelihood estimation
  - full/complete, individual data
  - complete, grouped data
  - truncated or censored data
- Bayes estimation
Method of moments

Assume we observe values of a random sample X_1, ..., X_n from a population with common distribution function F(x | θ), where θ = (θ_1, ..., θ_p) is a vector of p parameters. The observed sample values are denoted x_1, ..., x_n.

Denote the k-th raw moment by µ_k(θ) = E(X^k | θ), and the k-th empirical (sample) moment by

    µ̂_k = (1/n) Σ_{j=1}^n x_j^k.

The method of moments is fairly straightforward: equate the first p raw moments to the corresponding p empirical moments and solve the resulting system of simultaneous equations

    µ_k(θ̂) = (1/n) Σ_{j=1}^n x_j^k,   k = 1, ..., p.
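The recipe above can be sketched in code. The following Python (an illustration of mine, not from the slides; it anticipates the Gamma case of Example 2) equates the first two empirical raw moments of a Gamma(shape α, scale θ) sample to µ_1 = αθ and µ_2 = α(α + 1)θ² and solves for the two parameters; the simulated data and seed are arbitrary.

```python
import numpy as np

def gamma_mom(x):
    """Method-of-moments estimates for Gamma(shape=alpha, scale=theta).

    Matching mu_1 = alpha*theta and mu_2 = alpha*(alpha+1)*theta^2 gives
    theta = (mu_2 - mu_1^2)/mu_1 and alpha = mu_1/theta.
    """
    x = np.asarray(x, dtype=float)
    m1 = x.mean()          # first empirical raw moment
    m2 = np.mean(x**2)     # second empirical raw moment
    theta = (m2 - m1**2) / m1
    alpha = m1 / theta
    return alpha, theta

# Sanity check on simulated data with known parameters (seed is arbitrary)
rng = np.random.default_rng(477)
sample = rng.gamma(shape=3.0, scale=2.0, size=100_000)
alpha_hat, theta_hat = gamma_mom(sample)
```

With a sample this large the estimates should land close to the true values (3, 2); with small samples, moment estimates can be quite variable.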
Illustrations

Example 1: For a Normal distribution, derive expressions for the method-of-moments estimators of the parameters µ and σ².

Example 2: For a Gamma distribution with shape parameter α and scale parameter θ, derive expressions for their method-of-moments estimators.

Example 3: For a Pareto distribution, derive expressions for the method-of-moments estimators of the parameters α and θ.
SOA Example #6

For a sample of dental claims x_1, x_2, ..., x_10 you are given:
- Σ x_i = 3860 and Σ x_i² = 4,574,802
- Claims are assumed to follow a lognormal distribution with parameters µ and σ.
- µ and σ are estimated using the method of moments.

Calculate E[X ∧ 500] for the fitted distribution. [259]
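The bracketed answer can be checked numerically. This standard-library sketch (mine, not from the slides) matches the two lognormal raw moments E[X] = e^{µ+σ²/2} and E[X²] = e^{2µ+2σ²} to the sample moments, then evaluates the usual lognormal limited-expected-value formula at d = 500.

```python
from math import log, exp, sqrt, erf

def Phi(z):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, sum_x, sum_x2 = 10, 3860.0, 4_574_802.0
m1, m2 = sum_x / n, sum_x2 / n          # empirical raw moments

# Match exp(mu + sigma^2/2) = m1 and exp(2*mu + 2*sigma^2) = m2
sigma2 = log(m2) - 2.0 * log(m1)
mu = log(m1) - sigma2 / 2.0
sigma = sqrt(sigma2)

# Lognormal limited expected value at d:
#   E[X ^ d] = exp(mu + sigma^2/2) * Phi((ln d - mu - sigma^2)/sigma)
#              + d * (1 - Phi((ln d - mu)/sigma))
d = 500.0
lev = exp(mu + sigma2 / 2.0) * Phi((log(d) - mu - sigma2) / sigma) \
      + d * (1.0 - Phi((log(d) - mu) / sigma))
```

The computed value is approximately 259, matching the stated answer.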
Matching of quantiles

Denote the 100q-th percentile of the distribution by π_q(θ), which in the case of a continuous distribution is the solution to

    F(π_q(θ) | θ) = q.

Denote the smoothed empirical estimate by π̂_q. It is usually found by computing

    π̂_q = (1 − h) x_(j) + h x_(j+1),

where j = ⌊(n + 1)q⌋ (the integer part) and h = (n + 1)q − j. Here x_(1) ≤ ... ≤ x_(n) denote the order statistics of the sample.

The quantile matching estimate of θ is any solution to the p equations

    π_{q_k}(θ̂) = π̂_{q_k},   k = 1, 2, ..., p,

where the q_k are p arbitrarily chosen quantiles.
Example

Suppose the following observed sample comes from a Weibull distribution with

    F(x | λ, γ) = 1 − e^{−λ x^γ}:

    12, 15, 18, 21, 21, 23, 28, 32, 32, 32, 58

Calculate the quantile matching estimates of λ and γ by matching the median and the 75th percentile.
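This example can be worked end to end in code (a sketch of mine, not from the slides). With n = 11, the smoothed empirical median is x_(6) = 23 and the 75th percentile is x_(9) = 32; matching F at those two points gives two equations λπ^γ = −ln(1 − q) whose ratio isolates γ.

```python
from math import log, exp

data = sorted([12, 15, 18, 21, 21, 23, 28, 32, 32, 32, 58])
n = len(data)

def smoothed_quantile(x, q):
    """Smoothed empirical quantile: interpolate between order statistics."""
    pos = (n + 1) * q
    j, h = int(pos), (n + 1) * q - int(pos)
    return (1 - h) * x[j - 1] + h * x[j]   # x[j-1] is the j-th order statistic

p50 = smoothed_quantile(data, 0.50)   # (11+1)*0.50 = 6 -> x_(6) = 23
p75 = smoothed_quantile(data, 0.75)   # (11+1)*0.75 = 9 -> x_(9) = 32

# Matching F(pi_q) = q means lambda * pi_q^gamma = -ln(1 - q);
# dividing the q = 0.75 equation by the q = 0.50 equation eliminates lambda.
gamma = log(log(1 - 0.75) / log(1 - 0.50)) / log(p75 / p50)
lam = -log(1 - 0.50) / p50**gamma

def F(x):
    return 1 - exp(-lam * x**gamma)
```

By construction the fitted CDF passes exactly through (23, 0.50) and (32, 0.75); γ works out to about 2.10.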
SOA Example #155

You are given the following data:

    0.49  0.51  0.66  1.82  3.71  5.20  7.62  12.66  35.24

You use the method of percentile matching at the 40th and 80th percentiles to fit an inverse Weibull distribution to these data. Calculate the estimate of θ. [1.614]
The method of maximum likelihood

The maximum likelihood estimate of the parameter vector θ is obtained by maximizing the likelihood function. The likelihood contribution of an observation is the probability (or density) of observing that data point. In many cases it is more straightforward to maximize the logarithm of the likelihood function.

The form of the likelihood function depends on whether the data are completely observed or, if not, whether they are truncated and/or censored.
Complete, individual data

In the case where there is no truncation and no censoring and each observation X_j, for j = 1, ..., n, is recorded, the likelihood function is

    L(θ) = Π_{j=1}^n f(x_j | θ).

The corresponding log-likelihood function is

    l(θ) = log L(θ) = Σ_{j=1}^n log f(x_j | θ).
Illustrations

Assume we have n completely observed samples. Derive expressions for the maximum likelihood estimators for the following distributions:
- Exponential: f(x) = λe^{−λx}, for x > 0.
- Uniform: f(x) = 1/θ, for 0 < x < θ.
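The two closed-form answers these derivations lead to can be sketched directly (my illustration; the sample values are arbitrary): for the exponential, setting l′(λ) = n/λ − Σx_j = 0 gives λ̂ = n/Σx_j; for the uniform, L(θ) = θ^{−n} is decreasing in θ on θ ≥ max x_j, so θ̂ = max x_j.

```python
def mle_exponential(x):
    # l(lambda) = n*ln(lambda) - lambda*sum(x); l'(lambda) = 0 gives n/sum(x)
    return len(x) / sum(x)

def mle_uniform(x):
    # L(theta) = theta^(-n) for theta >= max(x), which decreases in theta,
    # so the maximum sits at the smallest admissible value: max(x)
    return max(x)

sample = [0.5, 1.2, 2.0, 0.8, 1.5]
lam_hat = mle_exponential(sample)   # 5 / 6.0
theta_hat = mle_uniform(sample)     # 2.0
```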
SOA Example #26

You are given:
- Low-hazard risks have an exponential claim size distribution with mean θ.
- Medium-hazard risks have an exponential claim size distribution with mean 2θ.
- High-hazard risks have an exponential claim size distribution with mean 3θ.
- No claims from low-hazard risks are observed.
- Three claims from medium-hazard risks are observed, of sizes 1, 2 and 3.
- One claim from a high-hazard risk is observed, of size 15.

Calculate the maximum likelihood estimate of θ. [2]
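The bracketed answer can be verified by maximizing the log-likelihood numerically (a grid search of mine, not the exam's intended algebra). Analytically, l(θ) = −4 ln θ − 8/θ + const, which peaks at θ = 2.

```python
from math import log

def loglik(theta):
    medium = [1.0, 2.0, 3.0]   # exponential with mean 2*theta
    high = [15.0]              # exponential with mean 3*theta
    ll = sum(-log(2 * theta) - x / (2 * theta) for x in medium)
    ll += sum(-log(3 * theta) - x / (3 * theta) for x in high)
    return ll

# Coarse grid search over theta in [0.5, 5.0]; the closed-form maximum is theta = 2
thetas = [k / 1000 for k in range(500, 5001)]
theta_hat = max(thetas, key=loglik)
```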
Complete, grouped data

Start with a set of boundary values c_0 < c_1 < ... < c_k such that the sample contains n_j observed values in the interval (c_{j−1}, c_j]. For such data, the likelihood function is

    L(θ) = Π_{j=1}^k [F(c_j | θ) − F(c_{j−1} | θ)]^{n_j}.
Example

Suppose you are given the following observations for a loss random variable X:

    Interval    Number of Observations
    (0, 2]       4
    (2, 4]       7
    (4, 6]      10
    (6, 8]       6
    (8, ∞)       3
    Total       30

Determine the log-likelihood function of the sample if X has a Pareto distribution with parameters α and θ. If it is possible to maximize this log-likelihood and solve explicitly, determine the MLEs of the parameters.
SOA Example #44

You are given:
- Losses follow an exponential distribution with mean θ.
- A random sample of 20 losses is distributed as follows:

    Loss Range      Frequency
    [0, 1000]        7
    (1000, 2000]     6
    (2000, ∞)        7

Calculate the maximum likelihood estimate of θ. [1996.90]
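The grouped-data likelihood for this exponential case can be checked in code (my sketch, not from the slides). Writing p = e^{−1000/θ}, the likelihood reduces to (1 − p)^13 p^20, maximized at p = 20/33, i.e. θ = 1000/ln(33/20) ≈ 1996.90; a grid search over the generic grouped log-likelihood should agree.

```python
from math import exp, log

bounds = [(0.0, 1000.0), (1000.0, 2000.0), (2000.0, float("inf"))]
counts = [7, 6, 7]

def loglik(theta):
    # log L = sum_j n_j * log(F(c_j) - F(c_{j-1})), with F(x) = 1 - exp(-x/theta)
    def F(x):
        return 1.0 if x == float("inf") else 1.0 - exp(-x / theta)
    return sum(n * log(F(b) - F(a)) for (a, b), n in zip(bounds, counts))

# Closed form: L ∝ (1-p)^13 * p^20 with p = exp(-1000/theta)
theta_exact = 1000.0 / log(33.0 / 20.0)

# Integer grid search should land on the nearest whole number, 1997
theta_hat = max(range(1000, 3001), key=loglik)
```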
Truncated and/or censored data

Consider the case of a left truncated and right censored observation. It is usually best to represent it as a triplet (t_i, x_i, δ_i), where:
- t_i is the left truncation point;
- x_i is the value that produced the data point; and
- δ_i is an indicator (1 or 0) of whether the observation is right censored.

When the observation is right censored (δ_i = 1), its contribution to the likelihood is

    [1 − F(x_i)] / [1 − F(t_i)].

When it is not right censored (δ_i = 0), its contribution is

    f(x_i) / [1 − F(t_i)].

Note that if the policy limit is reached, the limit equals x_i − t_i.
SOA Example #4

You are given:
- Losses follow a single-parameter Pareto distribution with density function

    f(x) = α / x^{α+1},   x > 1,  0 < α < ∞.

- A random sample of size five produced three losses with values 3, 6 and 14, and two losses exceeding 25.

Calculate the maximum likelihood estimate of α. [0.2507]
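The censored-data likelihood here has a closed-form maximizer, which this sketch (mine, not from the slides) computes. Exact losses contribute f(x_i) = αx_i^{−(α+1)} and the censored losses contribute S(25) = 25^{−α}, so l(α) = 3 ln α − (α + 1)Σ ln x_i − 2α ln 25 and α̂ = 3 / (Σ ln x_i + 2 ln 25).

```python
from math import log

uncensored = [3.0, 6.0, 14.0]   # exactly observed losses
censored_at = [25.0, 25.0]      # losses known only to exceed 25

# log L = sum ln f(x_i) + sum ln S(c_i)
#       = k*ln(alpha) - (alpha+1)*sum(ln x_i) - alpha*sum(ln c_i)
# Setting the derivative to zero: alpha = k / (sum(ln x_i) + sum(ln c_i))
k = len(uncensored)
alpha_hat = k / (sum(log(x) for x in uncensored) + sum(log(c) for c in censored_at))
```

The result is about 0.2507, matching the stated answer.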
Illustrative example

An insurance company records the claim amounts from a portfolio of policies with a current deductible of 100 and a policy limit of 1100. Losses below 100 are not reported to the company, and losses above the limit are recorded as 1000. The recorded claim amounts are 120, 180, 200, 270, 300, 1000, 1000. Assume ground-up losses follow a Pareto distribution with parameters α and θ = 400. Use the maximum likelihood estimate of α to estimate the Loss Elimination Ratio for a policy with twice the current deductible.
Definitions and notation

Assume that, conditionally on θ, the observable random variables X_1, ..., X_n are i.i.d. with conditional density f_{X|θ}(x | θ). Denote the random vector X = (X_1, ..., X_n).

Prior distribution: the probability distribution of θ, with density π(θ).

The joint distribution of (X, θ) is given by

    f_{X,θ}(x, θ) = f_{X|θ}(x | θ) π(θ) = f_{X|θ}(x_1 | θ) · · · f_{X|θ}(x_n | θ) π(θ).
Posterior distribution

The conditional probability distribution of θ, given the observed data X_1, ..., X_n, is called the posterior distribution. It is denoted π_{θ|X}(θ | x) and computed as

    π_{θ|X}(θ | x) = f_{X,θ}(x, θ) / ∫ f_{X,θ}(x, θ) dθ = f_{X|θ}(x | θ) π(θ) / ∫ f_{X|θ}(x | θ) π(θ) dθ.

We can conveniently write

    π_{θ|X}(θ | x) = c · f_{X|θ}(x_1, ..., x_n | θ) π(θ),

where c is the normalizing constant. More often, we simply write

    π_{θ|X}(θ | x) ∝ f_{X|θ}(x_1, ..., x_n | θ) π(θ).

The mean of the posterior, E(θ | X), is called the Bayes estimator of θ.
Poisson with a Gamma prior

Suppose that, conditionally on λ, the claim counts X_1, ..., X_n on an insurance policy are i.i.d. Poisson with mean λ. Let the prior distribution of λ be a Gamma with parameters α and θ (as in the textbook). Show that the posterior distribution is also Gamma distributed and find expressions for its parameters. Show that the resulting Bayes estimator can be expressed as a weighted combination of the sample mean and the prior mean.
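The conjugate update this exercise derives can be checked numerically (my sketch; the prior parameters and claim counts below are arbitrary illustrative values). With a Gamma(α, scale θ) prior and Poisson data, the posterior is Gamma(α + Σx_j, scale θ/(1 + nθ)), and its mean equals the credibility-weighted average Z·x̄ + (1 − Z)·αθ with Z = nθ/(1 + nθ).

```python
def poisson_gamma_posterior(alpha, theta, claims):
    """Gamma(alpha, scale=theta) prior on lambda with Poisson observations.

    Posterior: Gamma(alpha + sum(x), scale = theta / (1 + n*theta)).
    """
    n, s = len(claims), sum(claims)
    return alpha + s, theta / (1 + n * theta)

alpha, theta = 2.0, 0.5          # illustrative prior parameters
claims = [1, 0, 2, 1, 3]         # illustrative observed claim counts

a_post, t_post = poisson_gamma_posterior(alpha, theta, claims)
bayes_est = a_post * t_post      # posterior mean of lambda

# Credibility form: Z * xbar + (1 - Z) * prior mean, with Z = n*theta/(1 + n*theta)
n = len(claims)
Z = n * theta / (1 + n * theta)
credibility = Z * (sum(claims) / n) + (1 - Z) * alpha * theta
```

The two expressions for the Bayes estimator agree, which is exactly the weighted-combination identity the slide asks you to show.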
SOA Exam Question

You are given:
- Conditionally, given β, an individual loss X has an Exponential distribution with density

    f(x | β) = (1/β) e^{−x/β},   x > 0.

- The prior distribution of β is Inverse Gamma with density

    π(β) = (c²/β³) e^{−c/β},   β > 0.

Hint: ∫_0^∞ (1/y^n) e^{−a/y} dy = (n − 2)! / a^{n−1}, for n = 2, 3, ...

Given that the observed loss is x, calculate the mean of the posterior distribution of β.
SOA Example #5

You are given:
- The annual number of claims for a policyholder has a binomial distribution with probability function

    p(x | q) = C(2, x) q^x (1 − q)^{2−x},   x = 0, 1, 2.

- The prior distribution is π(q) = 4q³, 0 < q < 1.
- This policyholder had one claim in each of Years 1 and 2.

Calculate the Bayesian estimate of the expected number of claims in Year 3. [4/3]
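The bracketed answer can be verified by brute force (my sketch, not the exam's intended conjugate-prior shortcut): form the unnormalized posterior π(q) · p(1|q)², integrate numerically to get E[q | data], and multiply by 2 for the expected Year-3 claim count. Analytically, the posterior is ∝ q^5(1 − q)², a Beta(6, 3), with mean 2/3.

```python
from math import comb

def integrate(f, a, b, n=50_000):
    # Simple midpoint rule; ample accuracy for this smooth polynomial integrand
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def binom2(x, q):
    # Binomial(2, q) probability function
    return comb(2, x) * q**x * (1 - q)**(2 - x)

def post_unnorm(q):
    # prior 4*q^3 times the likelihood of one claim in each of two years
    return 4 * q**3 * binom2(1, q)**2

norm = integrate(post_unnorm, 0.0, 1.0)
q_mean = integrate(lambda q: q * post_unnorm(q), 0.0, 1.0) / norm
expected_claims = 2 * q_mean     # E[X_3 | data] = 2 * E[q | data]
```

The numerical result reproduces E[q | data] = 2/3 and hence the stated answer 4/3.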