Solutions to the Spring 2015 CAS Exam ST

(Updated to include the CAS Final Answer Key of July 15.)

There were 25 questions in total, of equal value, on this 2.5 hour exam. There was a 10 minute reading period in addition to the 2.5 hours.

The Exam ST is copyright 2015 by the Casualty Actuarial Society. The exam is available from the CAS. The solutions and comments are solely the responsibility of the author.

CAS Exam ST
prepared by Howard C. Mahler, FCAS
Copyright 2015 by Howard C. Mahler.

Study Aid 2015-ST-5A
Howard Mahler
hmahler@mac.com
www.howardmahler.com/teaching

HCMSA-2015-ST-5A Solutions Spring 2015 CAS Exam ST, 8/11/15, Page 1

1. You are given:
The number of messages sent follows a Poisson process with rate λ(t) = t/4, t ≥ 0.
Each message sent is received with probability 0.80.
Calculate the probability that at least 2 messages will be received by time t = 4.
A. Less than 0.35
B. At least 0.35, but less than 0.45
C. At least 0.45, but less than 0.55
D. At least 0.55, but less than 0.65
E. At least 0.65

1. C. The expected number of messages sent by time 4 is: ∫_0^4 λ(t) dt = ∫_0^4 t/4 dt = 4²/8 = 2.
Thinning, the number of messages received by time 4 is Poisson with mean: (0.8)(2) = 1.6.
Thus, the probability that at least 2 messages will be received by time 4 is:
1 - e^(-1.6) - 1.6 e^(-1.6) = 47.5%.
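As a numerical check on the thinning argument in Question 1, here is a minimal Python sketch. The function name and its arguments are illustrative and not part of the exam; the closed form t²/8 for the mean is specific to the rate λ(t) = t/4.

```python
import math

def prob_at_least_two_received(t_end: float, p_receive: float) -> float:
    """P[at least 2 messages received by t_end] when lambda(t) = t/4."""
    # Mean number sent: integral of t/4 from 0 to t_end equals t_end^2 / 8.
    mean_sent = t_end ** 2 / 8
    # Thinning: received messages are Poisson with the mean scaled by p_receive.
    mean_received = p_receive * mean_sent
    p0 = math.exp(-mean_received)   # probability of 0 received
    p1 = mean_received * p0         # probability of exactly 1 received
    return 1 - p0 - p1

answer_q1 = prob_at_least_two_received(4, 0.80)
print(f"{answer_q1:.4f}")
```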

2. You arrive at a bus station at exactly 8:30 am and you have the option of either taking Line 1 or Line 2 to bring you to your destination.
Buses along Line 1 and Line 2 arrive independently, both according to a Poisson process.
On the average, one Line 1 bus arrives every 15 minutes and one Line 2 bus arrives every 10 minutes.
If you board Line 1, it will take you 8 minutes to reach your destination.
If you board Line 2, it will take you 20 minutes to reach your destination.
You decide to take the first bus that arrives.
Calculate the expected length of time, to the nearest minute, that it will take you to reach your destination.
A. Less than 22 minutes
B. At least 22 minutes, but less than 24 minutes
C. At least 24 minutes, but less than 26 minutes
D. At least 26 minutes, but less than 28 minutes
E. At least 28 minutes

2. A. The rate at which buses show up is: 1/15 + 1/10 = 1/6.
Thus the average wait for a bus is 6 minutes. (You will take the first one that arrives.)
The probability the first bus to arrive is Line 1 is: (1/15) / (1/15 + 1/10) = 10 / (10 + 15) = 40%, while the probability the first bus to arrive is Line 2 is: 15 / (10 + 15) = 60%.
Thus your average ride is: (40%)(8) + (60%)(20) = 15.2 minutes.
Thus, the expected length of time that it will take you to reach your destination is: 6 + 15.2 = 21.2. To the nearest minute this is 21.
Comment: The last sentence in the exam question was missing the word "it". "To the nearest minute" is an unnecessary addition to the wording of an otherwise good question.
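The calculation in Question 2 combines the waiting time of the superposed Poisson process with the chance that each line arrives first. A short sketch follows; the function and parameter names are hypothetical.

```python
def expected_trip_minutes(mean_gap_1: float, ride_1: float,
                          mean_gap_2: float, ride_2: float) -> float:
    """Expected wait plus ride when boarding the first bus to arrive."""
    rate_1 = 1 / mean_gap_1           # Line 1 arrivals per minute
    rate_2 = 1 / mean_gap_2           # Line 2 arrivals per minute
    total_rate = rate_1 + rate_2      # superposition of independent Poisson processes
    expected_wait = 1 / total_rate
    p_line_1_first = rate_1 / total_rate
    expected_ride = p_line_1_first * ride_1 + (1 - p_line_1_first) * ride_2
    return expected_wait + expected_ride

answer_q2 = expected_trip_minutes(15, 8, 10, 20)
print(round(answer_q2, 1))
```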

3. You are given:
Claim frequency follows the Poisson process with a rate λ(t) = 3t, t > 0.
Frequency and severity of claims are independent.
Claim severity follows the distribution given in the following table:

Amount   Probability
5        0.6
10       0.3
75       0.1

The aggregate claim amount by t = 5 was 505.
Calculate the variance of the aggregate claim amount at t = 25.
A. Less than 500,000
B. At least 500,000, but less than 520,000
C. At least 520,000, but less than 540,000
D. At least 540,000, but less than 560,000
E. At least 560,000

3. D. Since the aggregate amount by time 5 is known, the variance of the aggregate amount by time 25 is equal to the variance of the aggregate amount between times 5 and 25.
The mean number of claims from time 5 to 25: ∫_5^25 λ(t) dt = ∫_5^25 3t dt = 1.5t² evaluated from t = 5 to t = 25 = 937.5 - 37.5 = 900.
The second moment of severity is: (0.6)(5²) + (0.3)(10²) + (0.1)(75²) = 607.5.
The variance of the aggregate claim amount at t = 25 is: (900)(607.5) = 546,750.
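For a compound Poisson sum, the variance is the Poisson mean times the second moment of severity. The following sketch reproduces the arithmetic of Question 3; the helper name is illustrative, and the closed form 1.5(t1² - t0²) applies only to the rate λ(t) = 3t.

```python
severity = {5: 0.6, 10: 0.3, 75: 0.1}   # amount -> probability

def mean_claim_count(t0: float, t1: float) -> float:
    """Integral of lambda(t) = 3t from t0 to t1."""
    return 1.5 * (t1 ** 2 - t0 ** 2)

# Second moment of severity: sum of probability times amount squared.
second_moment = sum(prob * amount ** 2 for amount, prob in severity.items())

# Only the claims between times 5 and 25 are still random.
aggregate_variance = mean_claim_count(5, 25) * second_moment
print(round(aggregate_variance))
```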

4. Let X_1, X_2, ..., X_n be a random sample from a Bernoulli distribution with success probability q and let X̄ = (1/n) Σ_{k=1}^{n} X_k denote the sample mean.
The sample mean is used as an estimator for q.
Determine the correct expression for the mean squared error of this estimator.
A. 0
B. q(1 - q)
C. nq(1 - q)
D. (1/n) q(1 - q)
E. (1/n²) q(1 - q)

4. D. The sample mean is an unbiased estimator of the mean q.
Therefore, the mean squared error is equal to the variance.
Var[X̄] = Var[X] / n = q(1 - q) / n.

5. Let X_1, X_2, ..., X_n be a random sample from a population X with probability mass function:
p(x) = θ(1 - θ)^x for x = 0, 1, 2, ...
θ is an unknown parameter between 0 and 1 and the expected value of X is: E[X] = (1 - θ)/θ.
Determine the Cramer-Rao lower bound for the variance of all unbiased estimators of θ.
A. (1/n) θ(1 - θ)
B. n θ(1 - θ)
C. (1/n²) θ(1 - θ)
D. (1/n) θ²(1 - θ)
E. (1/n²) θ²(1 - θ)

5. D. f(x) = θ(1 - θ)^x. ln[f(x)] = lnθ + x ln[1 - θ].
∂ln f(x)/∂θ = 1/θ - x/(1 - θ).
∂²ln f(x)/∂θ² = -1/θ² - x/(1 - θ)².
E[∂²ln f(x)/∂θ²] = -1/θ² - E[X]/(1 - θ)² = -1/θ² - {(1 - θ)/θ}/(1 - θ)² = -1/θ² - 1/{θ(1 - θ)} = -1/{θ²(1 - θ)}.
The Cramer-Rao lower bound is: -1 / {n E[∂²ln f(x)/∂θ²]} = (1/n) θ²(1 - θ).
Comment: The answer has to be inversely proportional to n, eliminating B, C, and E.
This is a Geometric Distribution parameterized somewhat differently, with β/(1+β) = 1 - θ, θ = 1/(1 + β), or β = 1/θ - 1 = (1 - θ)/θ.

6. For a general liability policy the paid claim amounts, X, follow the Weibull distribution which is shown below.
F(x) = 1 - exp[-(x/θ)²], x > 0.
You are given 5 paid claim amounts of 1, 5, 6, 8, and 10.
Also, it is known that the paid amounts for each of three additional claims exceed 1.
Calculate the maximum likelihood estimate of θ.
A. Less than 5
B. At least 5, but less than 6
C. At least 6, but less than 7
D. At least 7, but less than 8
E. At least 8

6. C. f(x) = exp[-(x/θ)²] 2x / θ². ln[f(x)] = -(x/θ)² + ln[2] + ln[x] - 2 lnθ.
S(x) = exp[-(x/θ)²]. ln[S(x)] = -(x/θ)².
Each uncensored value contributes ln[f(x)] to the loglikelihood.
Each value censored from above at 1 contributes ln[S(1)] = -1/θ² to the loglikelihood.
Thus the loglikelihood is:
-(1² + 5² + 6² + 8² + 10²)/θ² + 5 ln[2] + ln[(1)(5)(6)(8)(10)] - 10 lnθ - 3/θ².
Set the partial derivative with respect to theta of the loglikelihood equal to zero:
0 = 452/θ³ - 10/θ + 6/θ³. ⇒ θ² = 458/10 = 45.8. ⇒ θ̂ = 6.768.
Alternately, for a Weibull Distribution with τ fixed:
θ̂ = { Σ (Min[x_i, u_i]^τ - d_i^τ) / (number of uncensored values) }^(1/τ).
Here we have 5 uncensored values and when there is censoring from above, u_i = 1.
There is no truncation from below, no deductible, so d_i = 0.
θ̂ = {(1² + 5² + 6² + 8² + 10² + 1² + 1² + 1²) / 5}^(1/2) = 6.768.
Comment: It would be unusual in actuarial applications to have some of the data censored from above at 1, while there are other payments such as 10 that are not censored from above. For example, in this case, we could have had instead three values censored from above at 15.
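The closed-form Weibull estimate with τ fixed can be checked in a few lines. This is only a sketch for the case in Question 6 (right censoring, no deductible); the function and argument names are assumptions made for illustration.

```python
def weibull_theta_mle(uncensored, censoring_points, tau=2.0):
    """MLE of theta for a Weibull with fixed tau, right-censored data, no truncation."""
    # Every observation contributes min(x, u)^tau; censored ones contribute u^tau.
    total = sum(x ** tau for x in uncensored)
    total += sum(u ** tau for u in censoring_points)
    # Only the uncensored values are counted in the denominator.
    return (total / len(uncensored)) ** (1 / tau)

theta_hat_q6 = weibull_theta_mle([1, 5, 6, 8, 10], [1, 1, 1])
print(round(theta_hat_q6, 3))
```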

7. Let X_1, X_2, ..., X_5 be independent and identically distributed observations for a random variable from a population with probability density function:
f(x; θ) = x^((1 - θ)/θ) / θ, 0 < x < 1, 0 < θ.
Given the following observations:
0.250 0.200 0.800 0.750 0.050
Calculate the maximum likelihood estimate of θ.
A. Less than 0.5
B. At least 0.5, but less than 1.0
C. At least 1.0, but less than 1.5
D. At least 1.5, but less than 2.0
E. At least 2.0

7. C. ln[f(x)] = {(1 - θ)/θ} ln[x] - ln[θ] = ln[x]/θ - ln[x] - ln[θ].
Set the partial derivative with respect to theta of the loglikelihood equal to zero:
0 = -Σ ln[x_i] / θ² - n/θ.
θ̂ = -Σ ln[x_i] / n = -ln[(0.25)(0.2)(0.8)(0.75)(0.05)] / 5 = 1.300.
Alternately, this is a Beta Distribution with b = 1, and a - 1 = (1 - θ)/θ = 1/θ - 1. ⇒ θ = 1/a.
The maximum likelihood fit is: â = -n / Σ ln[x_i] = -5 / ln[(0.25)(0.2)(0.8)(0.75)(0.05)] = 0.769.
θ̂ = 1/â = 1/0.769 = 1.30.
Comment: A graph of the loglikelihood as a function of theta:
[Figure: loglikelihood versus theta, for theta from 1.0 to 2.0.]
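The estimate in Question 7 is minus the average of the logged observations. The sketch below computes it and also evaluates the loglikelihood near the estimate as a sanity check; the function name is illustrative only.

```python
import math

observations = [0.250, 0.200, 0.800, 0.750, 0.050]

def log_likelihood(theta: float, xs) -> float:
    """Loglikelihood for f(x; theta) = x^((1 - theta)/theta) / theta on 0 < x < 1."""
    return sum(((1 - theta) / theta) * math.log(x) - math.log(theta) for x in xs)

# Closed form from setting the derivative of the loglikelihood to zero.
theta_hat_q7 = -sum(math.log(x) for x in observations) / len(observations)
print(round(theta_hat_q7, 3))
print(round(log_likelihood(theta_hat_q7, observations), 3))
```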

8. You are given the probability distribution for observing a given Genotype:

Genotype   Probability
AA         θ²
Aa         2θ(1 - θ)
aa         (1 - θ)²

The results from a random sample of 122 people are:

Genotype   Number Observed
AA         15
Aa         31
aa         76

Calculate the maximum likelihood estimate of θ.
A. Less than 0.15
B. At least 0.15, but less than 0.21
C. At least 0.21, but less than 0.27
D. At least 0.27, but less than 0.33
E. At least 0.33

8. C. The loglikelihood is:
(15)(2 lnθ) + (31){ln2 + lnθ + ln(1 - θ)} + (76){2 ln(1 - θ)} = 61 lnθ + 183 ln(1 - θ) + 31 ln2.
Set the partial derivative with respect to θ equal to zero:
0 = 61/θ - 183/(1 - θ). ⇒ θ̂ = 61/244 = 0.25.
Alternately, the number of A per person is Binomial with m = 2 and q = θ.
The observed average number of A is: {(15)(2) + 31} / 122 = 0.5.
For the Binomial with m fixed, the method of moments is equal to maximum likelihood.
θ̂ = X̄/m = 0.5/2 = 0.25.

9. Determine the formula for the Fisher Information of n independent samples from a geometric distribution with mean β where each of the n samples gives you the number of trials required to obtain a successful outcome in that particular sample.
A. 1/β
B. n/β
C. n/β - n/(β+1)
D. nβ/(β+1)
E. β(β+1)/n

9. C. For the Geometric Distribution, the method of moments is equal to maximum likelihood. β̂ = X̄.
Var[β̂] = Var[X̄] = Var[X]/n = β(1+β)/n.
Fisher's Information = 1/Var[β̂] = n / {β(1+β)} = n/β - n/(β+1).
Alternately, f(x) = β^x / (1+β)^(x+1). ln[f(x)] = x lnβ - (x+1) ln[1+β].
∂ln[f(x)]/∂β = x/β - (x+1)/(1+β).
∂²ln[f(x)]/∂β² = -x/β² + (x+1)/(1+β)².
E[∂²ln[f(x)]/∂β²] = -E[X]/β² + (E[X]+1)/(1+β)² = -β/β² + (β+1)/(1+β)² = -1/β + 1/(1+β).
Fisher's Information = -n E[∂²ln[f(x)]/∂β²] = n/β - n/(β+1).

Comment: Should have read: "a geometric distribution with mean β for a sample of size n."
The Geometric Distribution in Appendix A attached to the exam is the number of failures prior to the first success. The number of trials required to obtain the first success is one more than the number of failures prior to the first success; the number of trials follows what is called a zero-truncated Geometric Distribution: f(x) = β^(x-1) / (1+β)^x, x = 1, 2, 3, ...
However, it turns out that Fisher's Information is the same.
The zero-truncated Geometric has a mean of 1+β, and a variance of β(1+β).
For the zero-truncated Geometric, the method of moments is equal to maximum likelihood: β̂ = X̄ - 1.
Var[β̂] = Var[X̄] = Var[X]/n = β(1+β)/n.
Fisher's Information = 1/Var[β̂] = n / {β(1+β)} = n/β - n/(β+1).
Alternately, f(x) = β^(x-1) / (1+β)^x. ln[f(x)] = (x-1) lnβ - x ln[1+β].
∂ln[f(x)]/∂β = (x-1)/β - x/(1+β).
∂²ln[f(x)]/∂β² = -(x-1)/β² + x/(1+β)².
E[∂²ln[f(x)]/∂β²] = -(E[X]-1)/β² + E[X]/(1+β)² = -β/β² + (β+1)/(1+β)² = -1/β + 1/(1+β).
Fisher's Information = n/β - n/(β+1).
However, the question says that the mean of the distribution is β, rather than 1+β as it would be for a zero-truncated Geometric, so the question is not consistent with the usual notation.
For the zero-truncated Geometric, Fisher's Information = n/(mean - 1) - n/(mean).
If, as per this question, one instead uses the letter beta for the mean of the zero-truncated Geometric Distribution, then none of the given letter choices is correct.

10. An insurance company is examining its offer of reduced rates for car insurance premiums to owners of small vehicles. Some analysts suggest that when small vehicles are involved in accidents, the chances of serious injury are higher than that for larger sized vehicles.
A random sample of 1000 accidents is classified according to severity of the injuries and the size of the car. The results are:

                 Size of Car
                 Small   Medium   Large   Total
Fatal/Critical   235     120      60      415
Non-critical     300     165      120     585
Total            535     285      180     1000

The following null and alternate hypotheses are created:
H0: There is no association between size of car and severity of injury.
HA: There is association between size of car and severity of injury.
Calculate the chi-squared test statistic to evaluate the null hypothesis.
A. Less than 6.6
B. At least 6.6, but less than 6.8
C. At least 6.8, but less than 7.0
D. At least 7.0, but less than 7.2
E. At least 7.2

10. A. Assuming the null hypothesis is true, then for example, the expected number of fatal/critical injuries for small cars is: (415)(535)/1000 = 222.025.
The contribution to the Chi-Square Statistic from that cell is: (235 - 222.025)²/222.025 = 0.758.

Observed Number
                 Small     Medium    Large     Total
Fatal/Critical   235       120       60        415
Non-critical     300       165       120       585
Total            535       285       180       1000

Expected Number
                 Small     Medium    Large     Sum
Fatal/Critical   222.025   118.275   74.700    415
Non-critical     312.975   166.725   105.300   585
Sum              535       285       180       1000

Contribution to Chi-Square
                 Small     Medium    Large     Sum
Fatal/Critical   0.758     0.025     2.893     3.676
Non-critical     0.538     0.018     2.052     2.608
Sum              1.296     0.043     4.945     6.284

The test statistic is: (235 - 222.025)²/222.025 + ... + (120 - 105.3)²/105.3 = 6.284.
Comment: We lose one degree of freedom in each dimension, since in each dimension the sum of the expected equals the sum of the observed.
Therefore, the degrees of freedom = (number of rows - 1)(number of columns - 1) = (1)(2) = 2.
Looking on the 2 degree of freedom row of the Chi-Square Table: 5.99 < 6.28 < 7.38.
Therefore, the probability value is between 5% and 2.5%.
Using a computer, the probability value is 4.3%.
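The expected counts and the test statistic for Question 10 can be reproduced with the sketch below. The function name is hypothetical. The p-value line uses the fact that a Chi-Square with exactly 2 degrees of freedom has survival function exp(-x/2), so it would not generalize to other table sizes.

```python
import math

observed = [
    [235, 120, 60],    # Fatal/Critical: small, medium, large
    [300, 165, 120],   # Non-critical:   small, medium, large
]

def chi_square_statistic(table) -> float:
    """Chi-square statistic for independence in a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    return statistic

chi_square_q10 = chi_square_statistic(observed)
p_value_q10 = math.exp(-chi_square_q10 / 2)   # valid only for 2 degrees of freedom
print(round(chi_square_q10, 3), round(p_value_q10, 3))
```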

11. You are given:
A random sample, X_1, X_2, ..., X_10 from the Bernoulli distribution with q = P(X = 1).
H0: q = 0.5 and Ha: q > 0.5.
The critical region, C = {Σ_{k=1}^{10} x_k > 6}.
Calculate the probability of a Type I error.
A. Less than 0.1
B. At least 0.1, but less than 0.2
C. At least 0.2, but less than 0.3
D. At least 0.3, but less than 0.4
E. At least 0.4

11. B. The probability of a Type I error is:
Prob[Reject H0 | H0] = Prob[Σ_{k=1}^{10} x_k > 6 | q = 0.5] = sum of Binomial densities from 7 to 10 = (120 + 45 + 10 + 1)(0.5^10) = 17.1875%.
Comment: The last sentence of the exam question was missing the word "a".
The binomial coefficients are: C(10, 7) = 120, C(10, 8) = 45, C(10, 9) = 10, C(10, 10) = 1.
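The Type I error in Question 11 is an upper tail of a Binomial distribution, which is easy to sum directly. A small sketch, with an illustrative function name:

```python
from math import comb

def binomial_upper_tail(n: int, q: float, threshold: int) -> float:
    """P[X > threshold] for X ~ Binomial(n, q)."""
    return sum(comb(n, k) * q ** k * (1 - q) ** (n - k)
               for k in range(threshold + 1, n + 1))

# Reject H0 when the number of successes exceeds 6; evaluate under q = 0.5.
type_one_error = binomial_upper_tail(10, 0.5, 6)
print(type_one_error)
```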

12. You are given two samples of paid claim amounts for two Hospitals, A and B, which, after the natural logarithm transformation, independently follow the normal distribution with means µ_A and µ_B.
The results for the paid claim amounts below are after the natural logarithm transformation has been applied.

Hospital A   6.21   7.34    5.67   7.88   3.89
Hospital B   7.89   10.12   9.71   5.55   4.33   12.48

The unbiased standard deviations of the above two samples are s_A = 1.56 and s_B = 3.04.
Assume that variances of paid claim amounts on the natural logarithm scale for these hospitals are equal.
Calculate the upper bound of the 95 percent symmetric confidence interval for the difference µ_B - µ_A.
A. Less than 4.9
B. At least 4.9, but less than 5.1
C. At least 5.1, but less than 5.3
D. At least 5.3, but less than 5.5
E. At least 5.5

12. E. The two sample means are: 6.1980 and 8.3467.
Thus the point estimate of µ_B - µ_A is: 8.3467 - 6.1980 = 2.1487.
The pooled sample variance is: {(5-1)(1.56²) + (6-1)(3.04²)} / {(5-1) + (6-1)} = 6.2158.
For (5-1) + (6-1) = 9 degrees of freedom, for the sum of 5% area in both tails, t = 2.262.
Thus the 95 percent symmetric confidence interval for the difference µ_B - µ_A is:
2.1487 ± 2.262 √{(6.2158)(1/5 + 1/6)} = (-1.266, 5.564).
Comment: The sample standard deviation is not an unbiased estimator of the standard deviation. "The unbiased standard deviations" should have said "The square roots of the unbiased estimates of the variances."
Prior to the natural logarithm transformation, the claim amounts follow a LogNormal.
They could have had you calculate the sample standard deviations yourself; to more decimal places they are: 1.56037 and 3.04143.
Since 0 is in the 95% confidence interval, at 5% we do not reject H0: µ_B = µ_A versus H1: µ_B ≠ µ_A.
For such a hypothesis test, t = 2.1487 / √{(6.2158)(1/5 + 1/6)} = 1.423, with p-value 18.8%.
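The pooled-variance interval of Question 12 can be rebuilt from the raw data as sketched below. The function name is an assumption, and the critical value 2.262 is taken from the t-table as in the solution rather than computed. Because the code uses unrounded sample variances, its bounds can differ from the hand calculation in the third decimal place.

```python
import math
import statistics

hospital_a = [6.21, 7.34, 5.67, 7.88, 3.89]
hospital_b = [7.89, 10.12, 9.71, 5.55, 4.33, 12.48]

def pooled_t_interval(sample_1, sample_2, t_critical):
    """Confidence interval for mean(sample_2) - mean(sample_1), equal variances assumed."""
    n1, n2 = len(sample_1), len(sample_2)
    difference = statistics.mean(sample_2) - statistics.mean(sample_1)
    # statistics.variance uses the n - 1 denominator (unbiased estimate).
    pooled_variance = ((n1 - 1) * statistics.variance(sample_1)
                       + (n2 - 1) * statistics.variance(sample_2)) / (n1 + n2 - 2)
    margin = t_critical * math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return difference - margin, difference + margin

# 2.262 is the two-sided 5% critical value for 9 degrees of freedom from the t-table.
lower_q12, upper_q12 = pooled_t_interval(hospital_a, hospital_b, 2.262)
print(round(lower_q12, 3), round(upper_q12, 3))
```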

13. For a general liability policy you are given:
The natural logarithm of paid claim amounts follows the normal distribution with mean µ and standard deviation 3.
The null hypothesis H0: µ = 8.
The alternate hypothesis HA: µ = 7.
The sample size is 160 and the probability of error of type I, α, is equal to 0.05.
Based on Neyman-Pearson Lemma, calculate the probability of error of type II, β.
A. Less than 0.0049
B. At least 0.0049, but less than 0.0051
C. At least 0.0051, but less than 0.0053
D. At least 0.0053, but less than 0.0055
E. At least 0.0055

13. B or C (see Comment). The Neyman-Pearson Lemma results in the usual Normal test.
For 5% significance, for a one-sided test reject when: (X̄ - 8) / (3/√160) < -1.645.
Reject when: X̄ < 7.609854.
Probability of a Type II error is the probability of failing to reject when µ = 7:
Prob[X̄ > 7.609854 | µ = 7] = 1 - Φ[(7.609854 - 7) / (3/√160)] = 1 - Φ[2.57137].
In the table, 1 - Φ[2.57] = 0.0051 and 1 - Φ[2.58] = 0.0049.
Linearly interpolating: 1 - Φ[2.57137] = 0.0051 - (0.0002)(0.137) = 0.00507, choice B.
Alternately, rounding 2.57137 to 2.57, 1 - Φ[2.57] = 0.0051, choice C.

Comment: The CAS allowed answers B or C.
The ranges are much too narrow in this exam question! There is no specified rule for using the Normal Table, as there is on some other exams. The Normal Table attached to this exam is not designed to make such fine distinctions in survival functions out in the extreme righthand tail of the Normal Distribution.
Apparently some people have a fundamental misunderstanding of the limitations of the Normal Table. A visitor to the Museum of Natural History asks a guard how old that dinosaur is. The guard answers "90 million and 11 years old", since he was told it was 90 million years old when he started working at the museum 11 years ago.
In the Normal table: Φ[2.57] = 0.9949. This means that 1 - Φ[2.57] = 0.0051, but only to four decimal places. Thus, 1 - Φ[2.57] could be as large as 0.005149.
Thus while it is likely that 1 - Φ[2.57137] is less than 0.0051, this does not follow from the table.
In the table, 1 - Φ[2.58] = 0.0049. Thus, 1 - Φ[2.58] could be as large as 0.004949.
Thus, linearly interpolating, 1 - Φ[2.57137] could be as large as: 0.005149 - (0.0002)(0.137) = 0.005122 > 0.0051.
Using a computer, 1 - Φ[2.57137] = 0.005065.
By the way, the 95th percentile of the Standard Normal is to more decimal places 1.64485.
Thus doing the whole problem on a computer without any intermediate rounding, the critical region is X̄ < 7.60989, and the probability of a Type II error is: 1 - Φ[2.57152] = 0.0050627.
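The "whole problem on a computer" calculation for Question 13 can be sketched with the standard library's NormalDist, which avoids the table rounding discussed above. The function and argument names are illustrative.

```python
from statistics import NormalDist

def type_two_error(mu_null: float, mu_alt: float, sigma: float,
                   n: int, alpha: float):
    """Critical value and Type II error for the one-sided test with mu_alt < mu_null."""
    standard_error = sigma / n ** 0.5
    z = NormalDist().inv_cdf(1 - alpha)       # 95th percentile when alpha = 0.05
    critical = mu_null - z * standard_error   # reject H0 when the sample mean is below this
    # Type II error: the sample mean lands above the critical value when mu = mu_alt.
    beta = 1 - NormalDist(mu_alt, standard_error).cdf(critical)
    return critical, beta

critical_q13, beta_q13 = type_two_error(8, 7, 3, 160, 0.05)
print(round(critical_q13, 5), round(beta_q13, 7))
```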

14. You are given:
Ten independent observations which follow the normal distribution with unknown mean µ and variance σ² = 20.
Based on your observations, the (1-α) confidence interval for µ is (-6.84, -1.30).
H0: µ = -5 and H1: µ ≠ -5.
Calculate the critical region at significance level α.
A. (-∞, -6.84) ∪ (6.84, ∞)
B. (-6.84, 6.84)
C. (-7.77, -2.23)
D. (-∞, -7.77) ∪ (-2.23, ∞)
E. (-∞, -10.54) ∪ (0.54, ∞)

14. D. (-6.84, -1.30) is -4.07 ± 2.77.
The critical region is when we are outside of: -5 ± 2.77 = (-7.77, -2.23).
Thus the critical region is: (-∞, -7.77) ∪ (-2.23, ∞).
Comment: We reject H0 when the sample mean is either very large or very small, eliminating choices B and C.
Since the given confidence interval is for (1-α) and we wish to test at the significance level α, we can directly use the ± 2.77 of the given confidence interval.
The average of the 10 observations is Normal with variance: 20/10 = 2.
Thus the given confidence interval is plus or minus 2.77/√2 = 1.96 standard deviations.
Thus this is a 95% confidence interval.

15. You are given:
The following pairs of observed values:

Observation   X       Y       Y-X
1             100.9   101.4   0.5
2             100.4   100.4   0
3             136.9   138.2   1.3
4             149.4   149.6   0.2
5             81.9    81.4    -0.5
6             140.7   141.9   1.2
7             104.4   106.2   1.8

H0: E[X] = E[Y]
H1: E[X] ≠ E[Y]
Calculate the p-value for the null hypothesis.
A. Less than 0.01
B. At least 0.01, but less than 0.02
C. At least 0.02, but less than 0.05
D. At least 0.05, but less than 0.10
E. At least 0.10

15. D or E. (See Comment)
Assume that X and Y are each Normal. (This is not stated in the question.)
Then W = Y - X is Normal.
W̄ = 4.5/7. Sample variance of W is 0.6695.
t = (4.5/7) / √(0.6695/7) = 2.079.
Perform a two-sided test with 6 degrees of freedom. 1.943 < 2.079 < 2.447.
Thus the p-value is at least 0.05, but less than 0.10.
Comment: The CAS allowed choices D or E.
The exam question should have said that X and Y are each Normal; without that there is no way to know what test to apply.
The null and alternative hypotheses do not correspond to either of the nonparametric tests: sign test or Wilcoxon Signed-Rank Test.
For either of these tests we would throw out the difference that is zero.
For the sign test, 5 out of 6 differences are positive.
Thus for a two-sided test, the p-value is: (1 + 6 + 6 + 1)/2^6 = 0.21875.
Sorting the differences by absolute value: 0.2, 0.5, -0.5, 1.2, 1.3, 1.8.
The sum of the positive ranks is: 1 + 2.5 + 4 + 5 + 6 = 18.5. The sum of the negative ranks is 2.5.
T = min[T+, T-] = 2.5.
Consulting the table, for a two-sided test, since 2.5 > 2, the p-value is greater than 10%.
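The paired t-statistic of Question 15 can be reproduced from the raw pairs as sketched below, under the same Normality assumption made in the solution. Variable names are illustrative; the p-value itself still has to be read from a t-table, since the Python standard library has no t-distribution.

```python
import math
import statistics

x = [100.9, 100.4, 136.9, 149.4, 81.9, 140.7, 104.4]
y = [101.4, 100.4, 138.2, 149.6, 81.4, 141.9, 106.2]

# Work with the paired differences W = Y - X.
differences = [yi - xi for xi, yi in zip(x, y)]
n = len(differences)
mean_difference = statistics.mean(differences)
variance_difference = statistics.variance(differences)   # n - 1 denominator

# t-statistic with n - 1 = 6 degrees of freedom.
t_statistic_q15 = mean_difference / math.sqrt(variance_difference / n)
print(round(variance_difference, 4), round(t_statistic_q15, 3))
```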

16. You are given:
X_1, X_2, and X_3 are independent observations from an exponential distribution.
E[X] = 5.
Y_(1), Y_(2), and Y_(3) are the order statistics of those observations.
The probability density function of the order statistic Y_(k) from a sample of size n is:
g_k(y_(k)) = n! / {(k-1)! (n-k)!} F(y_(k))^(k-1) [1 - F(y_(k))]^(n-k) f(y_(k)).
Calculate E(Y_(2)).
A. Less than 3.0
B. At least 3.0, but less than 4.0
C. At least 4.0, but less than 5.0
D. At least 5.0, but less than 6.0
E. At least 6.0

16. C. The density of Y_(2) is:
3! / {(2-1)! (3-2)!} F(y) S(y) f(y) = 6 (1 - e^(-y/5)) (e^(-y/5)) (e^(-y/5)/5) = 1.2 (e^(-2y/5) - e^(-3y/5)).
Since the mean of an Exponential Distribution is θ: ∫_0^∞ x e^(-x/θ)/θ dx = θ. ⇒ ∫_0^∞ x e^(-x/θ) dx = θ².
Thus E(Y_(2)) = (1.2){(5/2)² - (5/3)²} = 4.1667.
Alternately, for a sample of size n from an Exponential Distribution with mean θ:
E(Y_(k)) = θ {1/n + 1/(n-1) + ... + 1/(n-k+1)}.
Thus, E(Y_(2)) = (5)(1/3 + 1/2) = 25/6 = 4.1667.
Comment: E(Y_(1)) = (5)(1/3) = 5/3. E(Y_(3)) = (5)(1/3 + 1/2 + 1) = 55/6.
E(Y_(1)) + E(Y_(2)) + E(Y_(3)) = 5/3 + 25/6 + 55/6 = 15 = 3θ.
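The shortcut for Exponential order statistics in Question 16 sums k reciprocals, starting at 1/n and ending at 1/(n-k+1). A sketch using exact fractions follows; the function name is hypothetical.

```python
from fractions import Fraction

def expected_order_statistic(theta: int, n: int, k: int) -> Fraction:
    """E[Y_(k)] for a sample of size n from an Exponential with mean theta."""
    # Terms are 1/n, 1/(n-1), ..., 1/(n-k+1): exactly k of them.
    return theta * sum(Fraction(1, n - j) for j in range(k))

expected_values = [expected_order_statistic(5, 3, k) for k in (1, 2, 3)]
print(expected_values)        # the three expected order statistics
print(sum(expected_values))   # should equal n times theta
```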

17. You are given:
(X_1, Y_1), (X_2, Y_2), (X_3, Y_3), and (X_4, Y_4) is a random sample from a bivariate continuous distribution function F(x, y).
R(X_i) denotes the rank of X_i among {X_1, X_2, X_3, X_4}.
R(Y_i) denotes the rank of Y_i among {Y_1, Y_2, Y_3, Y_4}.
The ranks are:

Observation   R(X_i)   R(Y_i)
1             3        1
2             1        ?
3             2        ?
4             4        ?

There are no tied ranks.
X and Y are independent according to Spearman's rho.
Determine R(Y_2), R(Y_3), and R(Y_4).
A. 2, 4, 3
B. 3, 2, 4
C. 3, 4, 2
D. 4, 2, 3
E. 4, 3, 2

17. A. For a Spearman's rho of zero, we require that the correlation of the ranks be zero.
Try each of the five choices.
A. Corr[(3, 1, 2, 4), (1, 2, 4, 3)] = 0.
B. Corr[(3, 1, 2, 4), (1, 3, 2, 4)] = 1/5.
C. Corr[(3, 1, 2, 4), (1, 3, 4, 2)] = -3/5.
D. Corr[(3, 1, 2, 4), (1, 4, 2, 3)] = -2/5.
E. Corr[(3, 1, 2, 4), (1, 4, 3, 2)] = -4/5.
Alternately, in order to have the correlation of the ranks be zero we need the covariance of the ranks to be zero:
{(3)(1) + 1 R(Y_2) + 2 R(Y_3) + 4 R(Y_4)}/4 - (2.5)(2.5) = 0. ⇒ R(Y_2) + 2 R(Y_3) + 4 R(Y_4) = 22.
Trying the choices, the answer is A: 2 + (2)(4) + (4)(3) = 22.
Comment: A Spearman's rho of zero does not imply independence.
Independence implies a Pearson's correlation of zero.
One of the outputs of using a calculator to fit a least squares line with intercept is r, the sample Pearson's correlation coefficient between X and Y. One can apply that here to the ranks.
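Trying each choice in Question 17 is mechanical, so a short loop can do it. The sketch below uses the no-ties shortcut rho = 1 - 6 Σd² / {n(n² - 1)}, which gives the same value as the Pearson correlation of the ranks when there are no ties; the names are illustrative.

```python
from fractions import Fraction

def spearman_rho(ranks_x, ranks_y) -> Fraction:
    """Spearman's rho for rank vectors without ties."""
    n = len(ranks_x)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - Fraction(6 * sum_d_squared, n * (n * n - 1))

ranks_x = (3, 1, 2, 4)
choices = {
    "A": (2, 4, 3),
    "B": (3, 2, 4),
    "C": (3, 4, 2),
    "D": (4, 2, 3),
    "E": (4, 3, 2),
}
# R(Y_1) = 1 is given; each choice supplies R(Y_2), R(Y_3), R(Y_4).
rho_by_choice = {letter: spearman_rho(ranks_x, (1,) + tail)
                 for letter, tail in choices.items()}
for letter, rho in rho_by_choice.items():
    print(letter, rho)
```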

18. You are given:
The following paired samples:

Pair   X    Y
1      29   23
2      24   26
3      26   26
4      12   11
5      14   14
6      15   11
7      25   23

H0: Median of X = Median of Y
H1: Median of X > Median of Y
Calculate the exact p-value using the sign test.
A. Less than 10%
B. At least 10% but less than 15%
C. At least 15% but less than 20%
D. At least 20% but less than 25%
E. At least 25%

18. C. The differences X - Y are: 6, -2, 0, 1, 0, 4, 2.
Throw out the zeros. 4 out of 5 are positive.
For a one-sided test, the p-value is the sum of the Binomial densities for m = 5 and q = 0.5 at 4 and 5: (5 + 1)/2^5 = 18.75%.
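The exact one-sided sign test of Question 18 amounts to counting positive differences and summing a Binomial(m, 1/2) upper tail. A brief sketch, with illustrative variable names:

```python
from math import comb

x = [29, 24, 26, 12, 14, 15, 25]
y = [23, 26, 26, 11, 14, 11, 23]

# Discard the pairs with zero difference, as in the solution.
nonzero_differences = [xi - yi for xi, yi in zip(x, y) if xi != yi]
m = len(nonzero_differences)
positives = sum(1 for d in nonzero_differences if d > 0)

# One-sided p-value: P[at least `positives` successes] under Binomial(m, 1/2).
p_value_q18 = sum(comb(m, k) for k in range(positives, m + 1)) / 2 ** m
print(m, positives, p_value_q18)
```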

19. You are given:
A tutor claims that students will get more than 89.5 out of 100 questions right on a particular exam if the students use the tutor's service.
A random sample of 160 previous students was taken.
No student scored exactly 89.5.
H0: Students using the service have a median score of 89.5 on the exam.
H1: Students using the service have a median score greater than 89.5 on the exam.
H0 was rejected at the 0.05 significance level using the sign test.
The normal approximation with continuity adjustment is used.
Calculate the minimum number of students that scored more than 89.5 in the sample.
A. 89 or fewer
B. 90
C. 91
D. 92
E. 93 or more

19. C. Under H0, the number of students who score more than the median is Binomial with m = 160 and q = 1/2. It has mean 80 and variance 40.
Let x be the number of students that scored more than 89.5 in the sample.
Then for the one-sided test, we reject at 5% if: 1 - Φ[{(x - 1/2) - 80}/√40] < 0.05.
⇒ (x - 80.5)/√40 > 1.645. ⇒ x > 90.90. Minimum number is 91.
Comment: If we observe 91 students who scored more than 89.5 in the sample, then the p-value is:
1 - Φ[(90.5 - 80)/√40] = 1 - Φ[1.660] = 1 - 0.9515 = 4.85% < 5%.
If we instead observe 90 students who scored more than 89.5 in the sample, then the p-value is:
1 - Φ[(89.5 - 80)/√40] = 1 - Φ[1.502] = 6.65% > 5%.
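The threshold in Question 19 can also be found by searching for the smallest count whose continuity-corrected p-value falls below 5%. This sketch uses the exact Normal distribution function rather than the table, so the p-values may differ slightly from the rounded ones in the Comment; the function names are hypothetical.

```python
from statistics import NormalDist

def sign_test_p_value(count_above: int, n: int) -> float:
    """One-sided p-value, Normal approximation with continuity adjustment."""
    mean = n / 2
    standard_deviation = (n / 4) ** 0.5
    z = (count_above - 0.5 - mean) / standard_deviation
    return 1 - NormalDist().cdf(z)

def minimum_rejecting_count(n: int, alpha: float) -> int:
    """Smallest number above the hypothesized median that rejects H0."""
    for count in range(n + 1):
        if sign_test_p_value(count, n) < alpha:
            return count
    raise ValueError("no count rejects at this significance level")

minimum_q19 = minimum_rejecting_count(160, 0.05)
print(minimum_q19)
```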

20. You are given the following loss ratios for two lines of business, A and B, over a four-year period:

A   80   75   75   90
B   70   75   60   95

You are asked to set up a linear regression model for estimating loss ratios of line B using loss ratios of line A.
Based on the above sample, the estimated slope is 11/6, and the unbiased estimate of the error variance is 72.92.
Calculate the upper bound of the symmetric 95 percent confidence interval for the intercept of the regression line.
A. Less than -100
B. At least -100, but less than 0
C. At least 0, but less than 100
D. At least 100, but less than 200
E. At least 200

20. D. The fitted intercept α̂ is: B̄ - β̂ Ā = 75 - (11/6)(80) = -71.67.
Var[α̂] = s² ΣX_i² / {N Σ(X_i - X̄)²}
= (72.92)(80² + 75² + 75² + 90²) / [(4){(80-80)² + (75-80)² + (75-80)² + (90-80)²}]
= (72.92)(25,750) / {(4)(150)} = 3129.483.
For 5% area in both tails and 4 - 2 = 2 degrees of freedom, the critical value from the t-table is 4.303.
The symmetric 95% confidence interval for the intercept of the regression line is:
-71.67 ± 4.303 √3129.483 = (-312, 169).
Comment: Since the fitted intercept is -71.67, the upper bound of the confidence interval must be greater than -71.67, eliminating choice A.
Using the electronic calculator to fit the regression, one can get the fitted slope and intercept.
Var[β̂] = s² / Σ(X_i - X̄)² = 72.92 / {(80-80)² + (75-80)² + (75-80)² + (90-80)²} = 0.48613.
The symmetric 95% confidence interval for the slope of the regression line is:
11/6 ± 4.303 √0.48613 = (-1.17, 4.83).
The fitted model is: Y = -215/3 + (11/6)x.
Thus the fitted values are: 75, 65 5/6, 65 5/6, 93 1/3.
Thus, s² = {(75 - 70)² + (65 5/6 - 75)² + (65 5/6 - 60)² + (93 1/3 - 95)²} / (4 - 2) = 72.9167.
The output of a regression program:

     Estimate   Standard Error   t-statistic   p-value    95% Confidence Interval
1    -71.6667   55.9405          -1.28112      0.328627   (-312.359, 169.026)
X    1.83333    0.697217         2.6295        0.119295   (-1.16655, 4.83321)
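The regression quantities used in Question 20 can be recomputed from the four data points as sketched below. The function name and return layout are assumptions for illustration, and the critical value 4.303 is taken from the t-table as in the solution, so the interval bounds can differ from the regression program's output in the second decimal place.

```python
import math

line_a = [80, 75, 75, 90]   # explanatory variable X
line_b = [70, 75, 60, 95]   # response variable Y

def fit_simple_regression(x, y):
    """Least squares fit of y = alpha + beta * x, with standard errors."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s_squared = sum(r ** 2 for r in residuals) / (n - 2)   # unbiased error variance
    se_intercept = math.sqrt(s_squared * sum(xi ** 2 for xi in x) / (n * sxx))
    se_slope = math.sqrt(s_squared / sxx)
    return slope, intercept, s_squared, se_intercept, se_slope

slope, intercept, s_squared, se_intercept, se_slope = fit_simple_regression(line_a, line_b)
t_critical = 4.303   # two-sided 5% critical value, 2 degrees of freedom, from the t-table
upper_bound_q20 = intercept + t_critical * se_intercept
print(round(intercept, 4), round(se_intercept, 4), round(upper_bound_q20, 2))
```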

21. The following two linear regression models were fit to 20 observations:
Model 1: Y = β_0 + β_1 X_1 + β_2 X_2 + ε
Model 2: Y = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_3 + β_4 X_4 + ε
The results of the regression are as follows:

Model Number   Error Sum of Squares   Regression Sum of Squares
1              13.47                  22.75
2              10.53                  25.70

The null hypothesis is H0: β_3 = β_4 = 0 with the alternative hypothesis that the two betas are not equal to zero.
Calculate the statistic used to test H0.
A. Less than 1.70
B. At least 1.70, but less than 1.80
C. At least 1.80, but less than 1.90
D. At least 1.90, but less than 2.00
E. At least 2.00

21. E. F-Statistic = {(ESS_R - ESS_UR) / q} / {ESS_UR / (N - k)} = {(13.47 - 10.53) / 2} / {10.53 / (20 - 5)} = 2.094.
Comment: Note that for both models the sum of ESS and RSS is the total sum of squares: 13.47 + 22.75 = 36.22 = 10.53 + 25.70, subject to rounding.
The F-Statistic has 2 and 15 degrees of freedom.
Consulting the F-table: 1.795 < 2.094 < 2.695. Reject at 20%, but not at 10%.
Using a computer, the p-value is: 15.8%.

22. You are given the following linear regression model fitted to 12 observations:
Y = β_0 + β_1 X + ε
The results of the regression are as follows:

Parameter   Estimate   Standard Error
β_0         15.52      3.242
β_1         0.40       0.181

Determine the results of the hypothesis test H0: β_1 = 0 against the alternative H1: β_1 ≠ 0.
A. Reject at α = 0.01
B. Reject at α = 0.02, Do Not Reject at α = 0.01
C. Reject at α = 0.05, Do Not Reject at α = 0.02
D. Reject at α = 0.10, Do Not Reject at α = 0.05
E. Do Not Reject at α = 0.10

22. D. t = 0.40 / 0.181 = 2.210.
Do a two-sided test, with 12 - 2 = 10 degrees of freedom.
Consulting the t-table: 1.812 < 2.210 < 2.228.
Reject at α = 0.10, Do Not Reject at α = 0.05.
Comment: Using a computer, the p-value is 5.2%.

23. You are given:
The number of claims per year for a policyholder follows a Poisson distribution with mean λ.
The prior distribution for the mean, λ, follows a gamma distribution with mean 0.2 and variance 0.005.
Last year, 60 annual policies produced a total of 12 claims.
Calculate the variance of the posterior distribution of λ.
A. Less than 0.0015
B. At least 0.0015, but less than 0.0025
C. At least 0.0025, but less than 0.0035
D. At least 0.0035, but less than 0.0045
E. At least 0.0045

23. B. mean = αθ = 0.2. variance = αθ² = 0.005. ⇒ θ = 1/40. α = 8.
Posterior gamma has: α' = α + C = 8 + 12 = 20, 1/θ' = 1/θ + E = 40 + 60 = 100.
Variance of the posterior gamma is: α'θ'² = 20/100² = 0.002.
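The Gamma-Poisson update in Question 23 can be sketched as a small function that first backs out the prior parameters from the stated mean and variance. The function and parameter names are illustrative.

```python
def gamma_poisson_posterior(prior_mean: float, prior_variance: float,
                            claims: int, exposures: int):
    """Posterior mean and variance of lambda for a Gamma prior and Poisson claims."""
    theta = prior_variance / prior_mean   # mean = alpha*theta, variance = alpha*theta^2
    alpha = prior_mean / theta
    alpha_posterior = alpha + claims            # add the observed number of claims
    rate_posterior = 1 / theta + exposures      # add the number of exposures to 1/theta
    posterior_mean = alpha_posterior / rate_posterior
    posterior_variance = alpha_posterior / rate_posterior ** 2
    return posterior_mean, posterior_variance

posterior_mean_q23, posterior_variance_q23 = gamma_poisson_posterior(0.2, 0.005, 12, 60)
print(round(posterior_mean_q23, 4), round(posterior_variance_q23, 4))
```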

24. You are given:
The score in a computer-based test follows a binomial distribution with parameters m and q.
The prior distribution for q follows a beta distribution with parameters a = 5 and b = 5.
After 15 questions, the posterior mean of the test score is 0.64.
Calculate the number of questions answered correctly.
A. Fewer than 9
B. 9
C. 10
D. 11
E. More than 11

24. D. Let x be the number correct out of 15, and 15 - x be the number incorrect.
a' = a + number correct = 5 + x.
b' = b + number incorrect = 5 + 15 - x = 20 - x.
0.64 = posterior mean = a' / (a' + b') = (5 + x) / 25. ⇒ x = 11.
Comment: This exam question is worded improperly. It should have said: "the posterior mean of q is 0.64."
It would have been better to say, "For a computer-based test with m questions, the number correct for an individual follows a binomial distribution with parameters m and q."
It would have been better to specify that the beta distribution (as per Appendix A attached to the exam) has θ = 1.

25. You are given:
The size of a claim follows a normal distribution with mean µ and variance 169.
The prior distribution of µ follows a normal distribution with mean 50 and variance 25.
Two claims were observed: 35 and 45.
Calculate the lower bound of the symmetric 95% Bayesian confidence interval for µ.
A. Less than 42
B. At least 42, but less than 45
C. At least 45, but less than 48
D. At least 48, but less than 51
E. At least 51

25. A. Let L be the total losses, and C be the number of claims.
Where Greek letters refer to the prior Normal and s² is the variance of the size of a claim, the posterior distribution of the mean claim size is Normal with mean:
(L σ² + µ s²) / (C σ² + s²) = {(80)(25) + (50)(169)} / {(2)(25) + 169} = 47.72,
and variance:
σ² s² / (C σ² + s²) = (25)(169) / {(2)(25) + 169} = 19.29.
The symmetric 95% Bayesian confidence interval for the mean claim size is: 47.72 ± 1.960 √19.29 = (39.11, 56.33).
Comment: Alternately, using Buhlmann Credibility, K = 169/25, and Z = 2/(2 + K) = 22.8%.
Thus the posterior mean is: (40)(22.8%) + (50)(1 - 22.8%) = 47.72.
Thus the lower bound of the confidence interval cannot be D or E.

END OF EXAMINATION