Robust Default Correlation for Cost Risk Analysis
Christian Smart, Ph.D., CCEA
Director, Cost Estimating and Analysis, Missile Defense Agency
Presented at the 2013 ICEAA Professional Development and Training Workshop, June 2013

Introduction
- Correlation is an important consideration in cost risk analysis
- When correlation is ignored, you are making the de facto assumption that all risks are independent
  - "Even when you choose not to decide, you still have made a choice" (Rush, Free Will)
- Assuming no correlation results in a vast understatement of risk
- In 1996, Don Mackenzie wrote that "One of the more difficult chores in cost risk analysis is establishing appropriate levels of correlation" (Mackenzie 1996)
- Seventeen years later, this is still true
- This presentation is an attempt at making forward progress on this issue

Definitions
- Consider two random variables, X and Y
- The mean of X, E(X), is denoted by $\mu_X$, and similarly, the mean of Y, E(Y), is denoted by $\mu_Y$
- The variance of X, Var(X), is denoted by $\sigma_X^2$, and similarly, the variance of Y, Var(Y), is denoted by $\sigma_Y^2$
- The variances of X and Y are equal to:
$$\mathrm{Var}(X) = \mathrm{Cov}(X,X) = E(X^2) - [E(X)]^2$$
$$\mathrm{Var}(Y) = \mathrm{Cov}(Y,Y) = E(Y^2) - [E(Y)]^2$$
- Correlation, denoted by the Greek letter $\rho$ ("rho"), is defined by
$$\rho_{XY} = \mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E(XY) - E(X)E(Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E(XY) - \mu_X\mu_Y}{\sigma_X\sigma_Y}$$
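
Total System Mean and Variance
- For n WBS elements $X_1, \ldots, X_n$, the mean and the variance of the total cost are defined by:
$$E\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} E(X_i) = \sum_{i=1}^{n} \mu_i$$
$$\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \sigma_i^2 + 2\sum_{j=2}^{n}\sum_{i=1}^{j-1} \rho_{ij}\,\sigma_i\sigma_j$$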
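
Total Variance with Level Correlation
- Suppose (for simplicity):
  - There are n WBS elements $C_1, C_2, \ldots, C_n$
  - Each $\mathrm{Var}(C_i) = \sigma^2$
  - Each $\mathrm{Corr}(C_i, C_j) = \rho < 1$ for $i \neq j$
  - Total cost $C = \sum_{k=1}^{n} C_k$
$$\mathrm{Var}(C) = \sum_{k=1}^{n}\mathrm{Var}(C_k) + 2\rho\sum_{j=2}^{n}\sum_{i=1}^{j-1}\sqrt{\mathrm{Var}(C_i)\,\mathrm{Var}(C_j)} = n\sigma^2 + n(n-1)\rho\sigma^2 = n\sigma^2\left[1 + (n-1)\rho\right]$$
- Correlation = 0: $\mathrm{Var}(C) = n\sigma^2$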
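
The closed form above can be checked against the general pairwise formula. The following is a minimal sketch, not part of the original slides; the function names and the sample values (n = 100, sigma = 3, rho = 0.2) are illustrative assumptions.

```python
import math

def total_variance_level(n, sigma, rho):
    """Closed form for level correlation: n * sigma^2 * (1 + (n - 1) * rho)."""
    return n * sigma**2 * (1 + (n - 1) * rho)

def total_variance_double_sum(n, sigma, rho):
    """Same quantity from the general formula: the sum of the n variances
    plus twice the sum of the covariances over all pairs i < j."""
    variances = n * sigma**2
    covariances = sum(rho * sigma * sigma for j in range(n) for i in range(j))
    return variances + 2 * covariances

if __name__ == "__main__":
    n, sigma, rho = 100, 3.0, 0.2  # illustrative values only
    print(total_variance_level(n, sigma, rho))
    print(total_variance_double_sum(n, sigma, rho))
    # With zero correlation the total standard deviation is sigma * sqrt(n)
    print(math.sqrt(total_variance_level(n, sigma, 0.0)))
```

The two functions should agree up to floating-point rounding, since every one of the n(n-1)/2 pairs contributes the same covariance when the correlation is level.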

Impact of Assuming Independence
- For a 100-element WBS, assuming independence among all WBS elements when the true underlying correlation is equal to 20% results in an underestimate of total system standard deviation equal to 80%!
[Figure: percent by which total standard deviation is underestimated (0 to 100) versus actual correlation (0 to 0.9), with curves for n = 10, 30, 100, and 1,000]
Source: Why Correlation Matters in Cost Estimating, Advanced Training Session, 32nd Annual DOD Cost Analysis Symposium, Williamsburg, VA, 1999.

Example of Impact
- As an example, consider a system with 10 subsystems, each with mean equal to $10 million and standard deviation equal to $3 million
- For 100 elements and 1,000 elements, assuming correlation is zero when it is actually 20% results in underestimating the 80th percentile by 8-10%, and if the correlation is 60%, the 80th percentile is underestimated by 15-17%

80% Confidence Level (TY$, Millions)

| Number of WBS Elements | Independence | 20% Correlation | 60% Correlation |
|---|---|---|---|
| 10 | $108 | $113 | $119 |
| 100 | $1,025 | $1,111 | $1,182 |
| 1,000 | $10,080 | $11,092 | $11,815 |
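
The table values can be approximated with a short calculation. This is a sketch under one loud assumption: the slides do not state the distribution of total cost, and a lognormal total (matched to the mean and standard deviation) is assumed here because it appears to be consistent with the tabulated figures. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def lognormal_percentile(mean, sd, p):
    """Percentile p of a lognormal distribution with the given arithmetic
    mean and standard deviation (ASSUMED distribution, not from the slides)."""
    s2 = math.log(1 + (sd / mean) ** 2)   # variance of the underlying normal
    mu = math.log(mean) - s2 / 2          # mean of the underlying normal
    z = NormalDist().inv_cdf(p)           # standard normal quantile
    return math.exp(mu + z * math.sqrt(s2))

def total_sd(n, sigma, rho):
    """Total standard deviation for n elements with level correlation rho."""
    return sigma * math.sqrt(n * (1 + (n - 1) * rho))

def table_row(n, mean_each=10.0, sd_each=3.0, p=0.80):
    """80th percentiles for independence, 20%, and 60% correlation."""
    return [lognormal_percentile(n * mean_each, total_sd(n, sd_each, rho), p)
            for rho in (0.0, 0.2, 0.6)]

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, [round(v) for v in table_row(n)])
```

Under a different distributional assumption (for example, a normal total) the percentiles shift slightly, so the sketch should be read as one plausible way to arrive at numbers of this size.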

Default Correlation
- Notice in the graph on the previous chart there is an apparent knee in the curve around 20%
- Above 20% correlation the consequence of assuming less correlation begins to dwindle
- This graph is the basis for assuming 20-30% for default correlation for elements between which there is no functional correlation
  - Book (Book 1999) recommends 20% as a default correlation value because of this
- However, the graph does not tell us how much the total standard deviation is underestimated when correlation is assumed to be 20% but is actually 60%, for example

Underestimating Correlation with the Default 20%
- For a 100-element WBS, if the correlation is assumed to be 20% but is actually 60%, the total standard deviation is underestimated by 40%
[Figure: percent by which total standard deviation is over- or underestimated (0% to 100%) versus actual correlation (0 to 0.9), with curves for n = 10, 30, 100, and 1,000 and regions labeled Overestimated and Underestimated]
Source: Why Correlation Matters in Cost Estimating, Advanced Training Session, 32nd Annual DOD Cost Analysis Symposium, Williamsburg, VA, 1999.
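
The two headline percentages (independence versus 20%, and 20% versus 60%) follow from the level-correlation variance formula. The sketch below assumes equal element variances, so the common sigma cancels; the function names are illustrative, not from the source.

```python
import math

def sd_ratio(n, rho_assumed, rho_actual):
    """Ratio of total standard deviation under the assumed correlation to
    that under the actual correlation, for n equal-variance elements."""
    return math.sqrt((1 + (n - 1) * rho_assumed) / (1 + (n - 1) * rho_actual))

def percent_underestimated(n, rho_assumed, rho_actual):
    """Positive values mean the total standard deviation is underestimated;
    negative values mean it is overestimated (percent of the actual value)."""
    return 100 * (1 - sd_ratio(n, rho_assumed, rho_actual))

if __name__ == "__main__":
    # Independence assumed, actual correlation 20%, 100 elements
    print(round(percent_underestimated(100, 0.0, 0.2), 1))
    # Default 20% assumed, actual correlation 60%, 100 elements
    print(round(percent_underestimated(100, 0.2, 0.6), 1))
```

Both results land close to the rounded figures quoted on the slides (about 80% and about 40%).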

Robust Approach
- A more robust approach to assigning correlations would be to use the value that results in the least amount of error in the variance
- It is robust in the sense that without solid evidence to assign a correlation value, it minimizes the amount by which the total standard deviation is misestimated due to the correlation assumption
- This robust default measure of correlation would be a value for correlation that would minimize the error when the assumed correlation differs from the actual underlying correlation
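
Absolute Error
- We are interested in the absolute value of the error, since if we consider negative and positive values, they may offset each other
- Let $\varepsilon$ denote the error; then we are interested in $|\varepsilon|$, where $|\varepsilon|$ is defined by
$$|\varepsilon| = \begin{cases} \varepsilon & \text{if } \varepsilon \ge 0 \\ -\varepsilon & \text{if } \varepsilon < 0 \end{cases}$$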

Expected Value of Absolute Error
- If we assume that the prior distribution of the actual correlation $\rho$ on the interval (0, 1) is uniform, then the expected value of the absolute error $|\varepsilon|$ of the variance, as a function of the assumed correlation $\rho_a$, is defined by
$$E|\varepsilon| = \int_0^1 |\varepsilon|\, f(\rho)\, d\rho = \int_0^1 |\varepsilon|\, d\rho$$
  since $f(\rho) = 1$
- Thus the approach is to find the value of $\rho_a$ that minimizes the expected (absolute) error
- This equation provides the expected error as a function of $\rho_a$, and then we minimize this function with respect to $\rho_a$ using techniques from elementary calculus
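
The same minimization can be sketched numerically instead of with calculus. The code below is an illustration only: it estimates the expected absolute error with a midpoint rule and searches a grid of assumed correlations. The function names, grid sizes, and the toy error metric (plain difference between assumed and actual correlation) are assumptions made for the example.

```python
def expected_abs_error(err, rho_assumed, lo=0.0, hi=1.0, steps=1000):
    """Midpoint-rule estimate of the mean of |err(rho_assumed, rho)| when the
    actual correlation rho is uniform on (lo, hi)."""
    h = (hi - lo) / steps
    total = sum(abs(err(rho_assumed, lo + (k + 0.5) * h)) for k in range(steps))
    return total * h / (hi - lo)

def robust_default(err, grid_points=201, **kwargs):
    """Assumed correlation on an evenly spaced grid over [0, 1] that gives the
    smallest expected absolute error."""
    grid = [k / (grid_points - 1) for k in range(grid_points)]
    return min(grid, key=lambda a: expected_abs_error(err, a, **kwargs))

if __name__ == "__main__":
    # Toy metric: error is simply assumed minus actual correlation
    toy_error = lambda rho_assumed, rho_actual: rho_assumed - rho_actual
    print(robust_default(toy_error))
```

Any error metric with the signature `err(rho_assumed, rho_actual)` can be passed in, which makes it easy to compare alternative definitions of the error.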

What is the Error?
- Now that we have established how to find the minimum error, we need to decide which error to minimize
- We present several different choices, calculate the results, and provide pros and cons for each
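
Case 1: Percentage Error (of Actual)
- Denote the assumed correlation by $\rho_a$ and the actual correlation by $\rho$
- In this first case, we consider the metric that Book (Book, 1999) looked at when measuring over- and under-estimation of correlation, which is the percentage error expressed as a percentage of the value that results from the actual correlation
- The error is evaluated on the total standard deviation, $\sigma_T(\rho) = \sigma\sqrt{n\left[1+(n-1)\rho\right]}$, the square root of the total variance; this is the form that reproduces the limits reported for each case
$$\varepsilon = \frac{\sigma_T(\rho_a) - \sigma_T(\rho)}{\sigma_T(\rho)} = \sqrt{\frac{1+(n-1)\rho_a}{1+(n-1)\rho}} - 1$$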
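
Case 1: Calculating Expected Absolute Error (1 of 2)
- With $\rho_a$ the assumed correlation, $\rho$ the actual correlation, and $\sigma_T(\rho) = \sigma\sqrt{n\left[1+(n-1)\rho\right]}$ the total standard deviation, the expected absolute error is calculated* as
$$E|\varepsilon| = \int_0^{\rho_a}\left[\frac{\sigma_T(\rho_a)}{\sigma_T(\rho)} - 1\right]d\rho + \int_{\rho_a}^{1}\left[1 - \frac{\sigma_T(\rho_a)}{\sigma_T(\rho)}\right]d\rho = 1 + 2\rho_a + \frac{4}{n-1} - \frac{2\left(1+\sqrt{n}\right)\sqrt{1+(n-1)\rho_a}}{n-1}$$
- This is a function of the number of WBS elements ($n$) and the assumed correlation ($\rho_a$)
- Minimizing this with respect to $\rho_a$ we find that
$$\rho_a = \frac{\left(1+\sqrt{n}\right)^2 - 4}{4(n-1)}$$
*See the paper for detailed calculations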

Case 1: Calculating Expected Absolute Error (2 of 2)
- The limit of this minimum as $n \to \infty$ is 25%
- This is close to the 20% default value advocated by Book (Book, 1999)
- However, the total error is minimized by this value because of the large penalty assigned when overestimating actual correlations near zero
  - For example, let n = 100 and assume the correlation is 40%. The absolute percentage error in total standard deviation when the actual correlation is equal to zero is 537%, while the absolute percentage error when the actual correlation is equal to 80% is only 29%
- The penalty should not differ greatly whether you are overestimating or underestimating
- An easy way to overcome this issue is to examine the percent error as a function of the assumed correlation, which is considered in Case 2
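
The Case 1 figures can be reproduced with a few lines. This is a hedged sketch: the closed form for the minimizing correlation is reconstructed from the error definition (error in total standard deviation as a fraction of the actual value) rather than copied from the paper, and the function names are hypothetical.

```python
import math

def case1_optimum(n):
    """Assumed correlation minimizing the expected absolute error when the
    error is a fraction of the actual total standard deviation
    (reconstructed closed form, see lead-in)."""
    return ((1 + math.sqrt(n)) ** 2 - 4) / (4 * (n - 1))

def case1_abs_pct_error(n, rho_assumed, rho_actual):
    """Absolute percentage error in total standard deviation, relative to
    the value under the actual correlation."""
    ratio = math.sqrt((1 + (n - 1) * rho_assumed) / (1 + (n - 1) * rho_actual))
    return abs(ratio - 1) * 100

if __name__ == "__main__":
    print(case1_optimum(100))        # optimum for a 100-element WBS
    print(case1_optimum(10**8))      # approaches 0.25 as n grows
    print(round(case1_abs_pct_error(100, 0.4, 0.0)))  # overestimating near zero
    print(round(case1_abs_pct_error(100, 0.4, 0.8)))  # underestimating at 80%
```

The asymmetry between the last two values is the penalty imbalance described above: overestimating a near-zero correlation is charged far more heavily than underestimating a high one.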
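
Case 2: Percentage Error (of Assumed) (1 of 2)
- This case is similar to Case 1, only the denominator is different
- In this case, with $\rho_a$ the assumed correlation, $\rho$ the actual correlation, and $\sigma_T(\rho) = \sigma\sqrt{n\left[1+(n-1)\rho\right]}$ the total standard deviation,
$$\varepsilon = \frac{\sigma_T(\rho_a) - \sigma_T(\rho)}{\sigma_T(\rho_a)} = 1 - \sqrt{\frac{1+(n-1)\rho}{1+(n-1)\rho_a}}$$
$$E|\varepsilon| = 2\rho_a - 1 + \frac{2}{3(n-1)}\left[\frac{n^{3/2}+1}{\sqrt{1+(n-1)\rho_a}} - 2\left(1+(n-1)\rho_a\right)\right]$$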
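
Case 2: Percentage Error (of Assumed) (2 of 2)
- The value of the assumed correlation $\rho_a$ that minimizes the expected (absolute) error is
$$\rho_a = \frac{\left(\dfrac{n^{3/2}+1}{2}\right)^{2/3} - 1}{n-1}$$
- The limit of $\rho_a$ as $n \to \infty$ is $2^{-2/3} \approx 63\%$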
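
A quick numerical check of the Case 2 result is sketched below. Both closed forms are reconstructions from the error definition (error in total standard deviation as a fraction of the assumed value), not formulas quoted from the paper, and the function names are illustrative assumptions.

```python
def case2_expected_error(n, rho_assumed):
    """Expected absolute error for Case 2 with the actual correlation
    uniform on (0, 1) (reconstructed closed form, see lead-in)."""
    m = n - 1
    a = 1 + m * rho_assumed
    return 2 * rho_assumed - 1 + (2 / (3 * m)) * ((n ** 1.5 + 1) / a ** 0.5 - 2 * a)

def case2_optimum(n):
    """Assumed correlation that minimizes case2_expected_error for a given n."""
    a = ((n ** 1.5 + 1) / 2) ** (2 / 3)
    return (a - 1) / (n - 1)

if __name__ == "__main__":
    best = case2_optimum(100)
    print(best)
    # The optimum should beat nearby assumed correlations
    print(case2_expected_error(100, best) < case2_expected_error(100, best - 0.01))
    print(case2_expected_error(100, best) < case2_expected_error(100, best + 0.01))
    print(case2_optimum(10**8), 2 ** (-2 / 3))  # large-n value versus the limit
```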

Impact of Case 2
- The single recommended value from this approach is 63%
- This is much larger than the 25% value using the other approach, or the 20% rule of thumb widely used in practice
- Increasing the default correlation from 20% to 63% will result in a significant increase in standard deviation

| n | % Increase in σ |
|---|---|
| 10 | 54.3% |
| 30 | 68.3% |
| 100 | 74.5% |
| 1,000 | 77.2% |
| 10,000 | 77.5% |
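
The percentage increases in the table follow directly from the level-correlation formula for total standard deviation. A minimal sketch, assuming equal element variances (so the common sigma cancels) and using an illustrative function name:

```python
import math

def pct_increase_in_sd(n, rho_old=0.20, rho_new=0.63):
    """Percent increase in total standard deviation when the default
    correlation moves from rho_old to rho_new for an n-element WBS."""
    ratio = math.sqrt((1 + (n - 1) * rho_new) / (1 + (n - 1) * rho_old))
    return 100 * (ratio - 1)

if __name__ == "__main__":
    for n in (10, 30, 100, 1000, 10000):
        print(n, round(pct_increase_in_sd(n), 1))
```

As n grows the increase levels off near sqrt(0.63 / 0.20) - 1, which is why the last rows of the table change so little.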
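
Case 3: Total Absolute Difference
- The absolute difference could also be considered as a metric, where $\rho_a$ is the assumed correlation, $\rho$ is the actual correlation, and $\sigma_T(\cdot)$ is the total standard deviation under the given correlation
$$|\varepsilon| = \left|\sigma_T(\rho_a) - \sigma_T(\rho)\right|$$
- In this case, the minimum of the expected absolute error occurs when $\rho_a$ = 50%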
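
A short derivation sketch shows why the answer is 50% regardless of n. It assumes only that the total standard deviation $\sigma_T$ is an increasing, differentiable function of correlation and that the actual correlation $\rho$ is uniform on (0, 1); the symbol $\rho_a$ for the assumed correlation is notation introduced for this sketch.

```latex
\begin{align*}
E|\varepsilon|(\rho_a)
  &= \int_0^{\rho_a}\bigl[\sigma_T(\rho_a)-\sigma_T(\rho)\bigr]\,d\rho
   + \int_{\rho_a}^{1}\bigl[\sigma_T(\rho)-\sigma_T(\rho_a)\bigr]\,d\rho \\
\frac{d}{d\rho_a}E|\varepsilon|(\rho_a)
  &= \sigma_T'(\rho_a)\,\rho_a - \sigma_T'(\rho_a)\,(1-\rho_a)
   = \sigma_T'(\rho_a)\,(2\rho_a - 1) \\
\sigma_T'(\rho_a) > 0
  &\;\Longrightarrow\; \text{the derivative vanishes only at } \rho_a = \tfrac{1}{2}
\end{align*}
```

The boundary terms from differentiating the integral limits cancel because the integrand is zero at $\rho = \rho_a$. The result is the median of the uniform prior, which is what one would expect when minimizing an expected absolute deviation.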

Case 4: Case 1 with Truncated Limits (1 of 2)
- If we consider the first case, much of the reason why the minimum is so low compared to the other cases is the error when the actual correlation is close to 0%
- We know that in most cases the correlation is not 0%, and we know that it is not 100%
- Absolute percentage error for variance as a percent of the actual correlation for 100 WBS elements:
[Figure: expected absolute percentage error (0% to 90%) versus assumed correlation (0% to 100%)]
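
Case 4: Case 1 with Truncated Limits (2 of 2)
- If we truncate the actual correlation to be uniform in the interval (0.1, 0.9), then the expected value of the absolute percent error is minimized when the assumed correlation $\rho_a$ is
$$\rho_a = \frac{\left(\sqrt{0.1+0.9n} + \sqrt{0.9+0.1n}\right)^2 - 4}{4(n-1)}$$
- The limit of this as $n \to \infty$ is 40%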
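
The truncated result can be sketched for any interval of actual correlations. The closed form below is a reconstruction from the Case 1 error definition (error in total standard deviation as a fraction of the actual value), with the interval endpoints exposed as hypothetical parameters `lo` and `hi`.

```python
import math

def truncated_optimum(n, lo=0.1, hi=0.9):
    """Assumed correlation minimizing the expected absolute percentage error
    (relative to actual) when the actual correlation is uniform on (lo, hi).
    Reconstructed closed form; lo=0 and hi=1 recover Case 1."""
    m = n - 1
    root = (math.sqrt(1 + m * lo) + math.sqrt(1 + m * hi)) / 2
    return (root ** 2 - 1) / m

if __name__ == "__main__":
    print(truncated_optimum(100))          # 100-element WBS, limits 10%-90%
    print(truncated_optimum(10**8))        # approaches 0.40 as n grows
    print(truncated_optimum(100, 0.0, 1.0))  # untruncated Case 1 value
```

Since 1 + 0.1(n - 1) = 0.9 + 0.1n and 1 + 0.9(n - 1) = 0.1 + 0.9n, the default arguments give the same expression as the formula on the slide.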

Summary of the Four Cases
- All four cases minimize the expected value of the absolute error in the total standard deviation, but use different metrics for measuring error
- Case 1: Error is measured as a percentage of the value that results from the actual correlation; result in the limit is 25%
- Case 2: Error is measured as a percentage of the value that results from the assumed correlation; result in the limit is 63%
- Case 3: Error is measured as the total difference between the assumed and actual values; result is 50%
- Case 4: Error is measured as a percentage of the value that results from the actual correlation, with the correlation range limited to 10-90%; result is 40%

Recommendation
- I recommend a percentage difference approach
  - Knowing that the difference between the estimated total standard deviation and the actual total standard deviation is $00 million doesn't tell you much, since it could be large if the standard deviation is $00 million, or relatively small if the total standard deviation is $ billion
- Calculating the error as a percentage of the value under the assumed correlation is logical
  - The issue with looking at the error relative to the actual correlation is that we don't know the actual correlation; we only know the assumed correlation
  - The same is true for CER residuals
  - The Minimum Unbiased Percent Error (MUPE) and the Zero bias Minimum Percent Error (ZMPE) CER methods look at the percentage error from the estimate, not from the actual
  - We should use the same metric in looking at correlation
- Bottom line: I recommend using a default value for correlation that is equal to 63%

Empirical Evidence for Correlation
- There is some limited empirical evidence on correlation for spacecraft
- This ranges from 6-40% at the subsystem level
  - Smart calculated an average correlation in the range 6-0% for NASA/Air Force Cost Model hardware subsystems (Smart, 2004)
  - Covert and Anderson calculated an average correlation equal to 6.8% for Unmanned Spacecraft Cost Model subsystems (Covert and Anderson, 2005)
  - Mackenzie and Addison reported correlations in the range 0-40% for average unit cost of subsystems in NRO data (Mackenzie and Addison, 2000)
- However, this evidence is only for one commodity

Summary (1 of 2)
- 20% is often the default value when there is no information to provide informed input
- This level is too low
- Using a more robust approach, we have shown that default values in the range 40-63% are more reasonable
- I recommend 63% as a default value
- The only downside is the potential for overestimation
  - However, as a profession we do not have a reputation for overestimation
  - Increasing the default correlation value may help counter the tendency to underestimate

Summary (2 of 2)
- Example of underestimation of risk
  - For a risk analysis conducted for the Tethered Satellite System, the actual cost was more than double the 95th percentile of the original cost risk analysis
[Figure: confidence level (0% to 100%) versus normalized cost, showing Initial Budget, 5/8 Analysis, 3/8 Analysis, and Actual Final Cost]

References
- Book, S.A., Why Correlation Matters in Cost Estimating, Advanced Training Session, 32nd Annual DOD Cost Analysis Symposium, Williamsburg, VA, 1999.
- Covert, R. and T. Anderson, Correlation Tutorial, Rev. H, presented at the Cost Drivers Learning Event, November 2005.
- Gupton, G.M., C.C. Finger, and M. Bhatia, CreditMetrics Technical Document, 1997, J.P. Morgan, New York, available at: http://www.defaultrisk.com/_pdf6j4/creditmetrics_techdoc.pdf
- Mackenzie, D., Cost Variance in Idealized Systems, presented at the International Society of Parametric Analysts Annual Conference, Cannes, France, June 1996.
- Mackenzie, D., and B. Addison, Space System Cost Variance and Estimating Uncertainty, presented at the Space Systems Cost Analysis Group meeting, Seattle, WA, October 2000.
- Smart, C., Average Correlation Values for NAFCOM 2004, unpublished white paper, SAIC, 2004.
- Smart, C., Risk Analysis in the NASA/Air Force Cost Model, presented at the Joint Annual ISPA/SCEA Conference, Denver, CO, June 2005.
- Smart, C., Mathematical Techniques for Joint Cost and Schedule Analysis, presented at the 2009 NASA Cost Symposium, Cocoa Beach, FL, May 2009.
- Smart, C., Covered With Oil: Incorporating Realism in Cost Risk Analysis, presented at the 2011 Joint Annual International Society of Parametric Analysts and Society of Cost Estimating and Analysis Conference, Albuquerque, NM, June 2011.