Estimation of Truncated Data Samples in Operational Risk Modeling
Bakhodir Ergashev, FRB of Richmond
Prepared for the OpRisk North America Conference, New York, NY, March 20, 2013
Caveats
The presentation is based on the following working paper (so I am not the sole author): Ergashev, B., Pavlikov, K., Uryasev, S., Sekeris, E. (2012) Estimation of truncated data samples in operational risk modeling.
The views expressed in this presentation are those of the authors and do not necessarily reflect the position of the Federal Reserve Bank of Richmond or the Federal Reserve System.
Introduction: Basel II AMA Modeling
- Regulatory capital is the 99.9th percentile of the distribution of annual aggregate losses.
- A common approach to modeling operational risk exposure is the loss distribution approach (LDA): annual loss severity and frequency distributions are estimated, and the distribution of annual losses is found by convoluting the two distributions.
- Available samples of loss event data are usually truncated from below ($10K is a common data collection threshold).
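The LDA capital number described above can be sketched with a small Monte Carlo simulation. This is an illustrative sketch only, not the paper's code; the parameter values (λ = 25, μ = 9, σ = 2) and function names are chosen for the example.

```python
import math
import random

def simulate_annual_losses(lam, mu, sigma, n_years, seed=0):
    """Simulate n_years of aggregate annual losses under a
    Poisson(lam) frequency / lognormal(mu, sigma) severity LDA model."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_years):
        # Poisson draw by inversion (adequate for moderate lam)
        n, p, u = 0, math.exp(-lam), rng.random()
        cdf = p
        while u > cdf:
            n += 1
            p *= lam / n
            cdf += p
        # convolution step: sum n independent severity draws
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n)))
    return totals

def var_999(totals):
    """Empirical 99.9th percentile of annual losses (regulatory capital proxy)."""
    s = sorted(totals)
    return s[int(0.999 * len(s))]

losses = simulate_annual_losses(lam=25, mu=9.0, sigma=2.0, n_years=20000)
capital = var_999(losses)
```

In practice the convolution is done more efficiently (Panjer recursion, FFT, or the single loss approximation discussed later), but the simulation makes the two-distribution structure of LDA explicit.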
Introduction
How should truncated data samples be treated when modeling severity and frequency distributions? The literature offers three approaches to fitting truncated samples:
- The naïve approach: treat the data as if it were not truncated.
- The shifting approach: shift the data to the origin, fit it, and add the threshold value back to the fitted distribution.
- The truncation approach: fit the conditional distribution, explicitly acknowledging the existence of censored losses falling below the threshold.
In this research, we focus on the last two approaches.
Truncation vs. Shifting
Shifting approach:
- Numerical algorithms converge fast, and the results are stable.
- Convolution algorithms are more efficient.
- For heavy-tailed distributions the difference is often negligible.
- The right tail heavily influences the behavior of the left tail.
Truncation approach:
- The fitting is not always successful: frequently the likelihood surface is ascending, with no global maximum.
- Frequency estimates are sometimes too high, making the convolution computationally challenging.
- The right tail does not influence the behavior of the left tail.
Findings in Recent Literature
- Luo, Shevchenko, Donnelly (2007): the shifting approach overstates or understates VaR depending on the chosen combination of parameters and threshold.
- Shevchenko (2009): naïve and shifted models are easy to fit, but the induced bias can be very large.
- Opdyke (2011) emphasizes that truncation exacerbates the non-robustness of MLE and proposes robust alternatives to MLE. For the truncated lognormal, large contaminations lead to arbitrarily small mean and arbitrarily large standard deviation parameters.
- Cope (2011) proposes penalized likelihood estimators for estimating distributional parameters in the presence of data truncation. The cost of the reduction in variance is a small bias in the parameter estimates.
- Cavallo et al. (2012): overstatement or understatement depends on the characteristics of the data sample. They propose Vuong's test to choose the best-fitting model; it does not always work, though.
- Rozenfeld (2010) proposes calculating capital from truncated data using the shifting approach, utilizing the information on losses falling below the threshold. Caveat: some technical derivations of the paper are not clear to me.
Our Contributions
- For Poisson frequency and lognormal severity, we derive a necessary and sufficient condition for the existence of a global solution to the likelihood maximization problem under the truncation approach.
- If this regularity condition is violated, MLE suggests that the lognormal is not an appropriate severity distribution for the given data sample, and an alternative severity distribution should be used. Violations of the regularity condition are the main reason for unstable parameter estimates under the truncation approach.
- When the regularity condition is satisfied, we offer a simple procedure with explicit formulae for calculating the mean and standard deviation parameters of the truncated lognormal severity. The procedure can be implemented in Excel, so practitioners avoid the challenges associated with maximizing the likelihood function directly.
Our Findings on Capital Bias
- Using the SLA and the explicit formulae, we approximate capital bias under the truncation and shifting approaches. The approximations are not great, but they reveal tendencies supported by recent literature.
- To obtain more accurate estimates of capital bias, we conduct a simulation study. The results:
  - For small samples, both approaches induce significant bias.
  - As the sample size increases, the bias induced by the truncation approach becomes negligible, while the bias induced by the shifting approach does not.
- Caveat: we explored only the range 8 ≤ μ ≤ 15 and 0.25 ≤ σ ≤ 4.
- The main reason our findings differ from earlier studies is that we filter out random samples that violate the regularity condition.
The Regularity Condition: Notation
- Severity: X follows a lognormal(μ, σ) distribution truncated from below; frequency is Poisson(λ); L is the data collection threshold.
- Truncated sample: X_1, ..., X_n with X_i > L, and (Y_1, ..., Y_n) = (ln X_1, ..., ln X_n).
- Sample moments: M = (1/n) Σ_{i=1}^n Y_i and V = (1/n) Σ_{i=1}^n Y_i².
- Statistic A: A = (V − M²) / (M − ln L)².
- Function G: G(t) = 1 + (1 − F(t)(F(t) − t)) / (F(t) − t)², where F(t) = φ(t) / (1 − Φ(t)) and φ, Φ are the pdf and cdf of the standard normal distribution.
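The functions F and G are easy to evaluate with standard-library tools. The sketch below assumes the reconstruction G(t) = 1 + (1 − F(t)(F(t) − t))/(F(t) − t)², which is consistent with the Barrow–Cohen properties quoted on the derivation slides (G increases monotonically from 1 to 2); the function names are illustrative.

```python
import math

def phi(t):
    """Standard normal pdf."""
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def Phi(t):
    """Standard normal cdf via the error function."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def F(t):
    """Hazard rate of the standard normal: phi(t) / (1 - Phi(t))."""
    return phi(t) / (1 - Phi(t))

def G(t):
    """G(t) = 1 + (1 - F(t)(F(t) - t)) / (F(t) - t)**2; rises from 1 to 2."""
    d = F(t) - t  # always positive, since F(t) > t for all t
    return 1 + (1 - F(t) * d) / (d * d)
```

A quick numerical check of the Barrow–Cohen limits: G(−6) is close to 1, G(5) is close to 2, and G is increasing in between.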
The Regularity Condition
The Severity Estimation Procedure
Derivation of the Estimation Procedure
We use the method of moments (MM) estimation; in this setting, the MM estimation is equivalent to the ML estimation. Let Y be the logarithm of a lognormal r.v. truncated from below at ln L. There exists an explicit system of equations for E(Y) and E(Y²) expressed in terms of the cdf and pdf of the standard normal distribution.
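For reference, one standard form of these moment equations (for a normal Y = ln X left-truncated at ln L; these are the textbook truncated-normal moments, equivalent to the system used in the paper) is:

```latex
E(Y) = \mu + \sigma F(t), \qquad
\operatorname{var}(Y) = \sigma^2 \bigl(1 + t\,F(t) - F(t)^2\bigr), \qquad
E(Y^2) = \operatorname{var}(Y) + E(Y)^2,
```

where t = (ln L − μ)/σ and F(t) = φ(t)/(1 − Φ(t)).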
Derivation of the Estimation Procedure
We rewrite the system of equations for E(Y) and E(Y²) as:
σ = (E(Y) − ln L) / (F(t) − t)  and  G(t) − 1 = var(Y) / (E(Y) − ln L)²,
where t = (ln L − μ)/σ.
Theorem (Barrow and Cohen, 1954):
1. G(t) → 1 as t → −∞;
2. G(t) → 2 as t → +∞;
3. G(t) is monotonically increasing on (−∞, +∞).
Derivation of the Estimation Procedure The MM system: σ = E Y ln L F t t and G t 1 = var Y (E(Y) ln L) 2. Due to Barrow and Cohen theorem 0 < Statistic A is the sample equivalent of A = var Y (E(Y ln L)) 2 < 1. var Y (E(Y) ln L) 2 V M^2 (M ln L) 2 > 0. A unique solution to the MM system exits when A < 1 and there is no solution when A > 1. We also prove that there is no global maximum of the likelihood function when A >1. 14
Capital Bias
- Quantifying capital bias using the Single Loss Approximation (SLA): explicit formulae for the SLA approximation to capital bias under both approaches are presented in the paper. The accuracy of the approximations is not great, but they still reveal the tendencies supported by our simulation study.
- Quantifying capital bias by simulation (next few slides).
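For reference, the SLA of Böcker and Klüppelberg (2005) approximates the α-quantile of annual aggregate losses S, for a subexponential severity with cdf F and Poisson(λ) frequency, as:

```latex
\mathrm{VaR}_\alpha(S) \;\approx\; F^{-1}\!\Bigl(1 - \frac{1-\alpha}{\lambda}\Bigr),
\qquad \alpha \to 1.
```

For a lognormal(μ, σ) severity this gives VaR₀.₉₉₉ ≈ exp(μ + σ Φ⁻¹(1 − 0.001/λ)), which is why capital bias can be expressed through the bias in the fitted (μ, σ).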
[Figure: heat map of capital bias under the shifting approach, sample size = 500; μ from 8.0 to 15.0, color scale −100% to +100%]
[Figure: heat map of capital bias under the shifting approach, sample size = 2000; μ from 8.0 to 15.0, σ from 0.75 to 4.00, color scale −100% to +100%]
[Figure: heat map of capital bias under the truncation approach, sample size = 500; μ from 8.0 to 15.0, σ from 1.00 to 4.00, color scale −100% to +100%]
[Figure: heat map of capital bias under the truncation approach, sample size = 2000; μ from 8.0 to 15.0, σ from 1.00 to 4.00, color scale −100% to +100%]
References
- Barrow, D., Cohen, A. (1954) On some functions involving Mills' ratio, The Annals of Mathematical Statistics, 25(2), 405-408.
- Böcker, K., Klüppelberg, C. (2005) Operational VaR: a closed-form approximation, Risk, 90-93.
- Cavallo, A., Rosenthal, B., Wang, X., Yan, J. (2012) Treatment of the data collection threshold in operational risk: a case study with the lognormal distribution, Journal of Operational Risk, Vol. 7, 3-38.
- Cope, E.W. (2011) Penalized likelihood estimators for truncated data, Journal of Statistical Planning and Inference, Vol. 141, 345-358.
- Luo, X., Shevchenko, P., Donnelly, J. (2007) Addressing impact of truncation and parameter uncertainty on operational risk estimates, Journal of Operational Risk, Vol. 2, 3-26.
- Opdyke, J.D. (2011) Robust statistics vs. MLE for operational risk severity distribution parameter estimation, working paper.
- Rozenfeld, I. (2010) Using shifted distributions in computing operational risk capital, available at SSRN: http://ssrn.com/abstract=1596268 or http://dx.doi.org/10.2139/ssrn.1596268.
- Shevchenko, P. (2009) Implementing loss distribution approach for operational risk, Applied Stochastic Models in Business and Industry, Vol. 26, 277-307.