Robust Default Correlation for Cost Risk Analysis


1 Robust Default Correlation for Cost Risk Analysis
Christian Smart, Ph.D., CCEA
Director, Cost Estimating and Analysis, Missile Defense Agency
Presented at the 2013 ICEAA Professional Development and Training Workshop, June 2013

2 Introduction
Correlation is an important consideration in cost risk analysis.
When correlation is ignored, you are making the de facto assumption that all risks are independent.
"Even when you choose not to decide, you still have made a choice" (Rush, Free Will).
Assuming no correlation results in a vast understatement of risk.
In 1996, Don Mackenzie wrote that "One of the more difficult chores in cost risk analysis is establishing appropriate levels of correlation" (Mackenzie 1996).
Seventeen years later, this is still true.
This presentation is an attempt at making forward progress on this issue.

3 Definitions
Consider two random variables, X and Y.
The mean of X, E(X), is denoted by $\mu_X$, and similarly the mean of Y, E(Y), is denoted by $\mu_Y$.
The variance of X, Var(X), is denoted by $\sigma_X^2$, and similarly the variance of Y, Var(Y), is denoted by $\sigma_Y^2$.
The variances of X and Y are equal to:
\[ \mathrm{Var}(X) = \mathrm{Cov}(X, X) = E(X^2) - [E(X)]^2 \]
\[ \mathrm{Var}(Y) = \mathrm{Cov}(Y, Y) = E(Y^2) - [E(Y)]^2 \]
Correlation, denoted by the Greek letter $\rho$ (rho), is defined by
\[ \rho_{XY} = \mathrm{Corr}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E(XY) - E(X)E(Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{E(XY) - \mu_X \mu_Y}{\sigma_X \sigma_Y} \]

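4 Total System Mean and Variance
For n WBS elements, the mean and the variance of the total cost are defined by:
\[ E\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} E(X_i) = \sum_{i=1}^{n} \mu_i \]
\[ \mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \sigma_i^2 + 2 \sum_{j=2}^{n} \sum_{i=1}^{j-1} \rho_{ij}\, \sigma_i \sigma_j \]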
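The total-variance formula can be sketched in a few lines of Python. This is a minimal illustration only: the function name and the two-element inputs are made up for the example and do not come from the presentation.

```python
def total_mean_and_variance(means, sds, corr):
    """Mean and variance of a sum of correlated cost elements.

    means, sds: per-element means and standard deviations.
    corr[i][j]: correlation between elements i and j (hypothetical inputs).
    """
    n = len(means)
    total_mean = sum(means)
    # Start with the sum of the element variances ...
    total_var = sum(sd ** 2 for sd in sds)
    # ... then add twice each correlated cross term, counting each pair i < j once.
    for j in range(n):
        for i in range(j):
            total_var += 2.0 * corr[i][j] * sds[i] * sds[j]
    return total_mean, total_var


# Two illustrative elements with standard deviations 3 and 4, correlated at 0.5.
mean, var = total_mean_and_variance(
    [10.0, 20.0], [3.0, 4.0], [[1.0, 0.5], [0.5, 1.0]]
)
print(mean, var)
```

Setting every off-diagonal correlation to zero reduces the result to the plain sum of variances, which is the independence assumption the presentation argues against.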

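5 Total Variance with Level Correlation
Suppose (for simplicity):
There are n WBS elements $C_1, C_2, \ldots, C_n$.
Each $\mathrm{Var}(C_i) = \sigma^2$.
Each $\mathrm{Corr}(C_i, C_j) = \rho < 1$ for $i \ne j$.
Total cost $C = \sum_{k=1}^{n} C_k$.
Then
\[ \mathrm{Var}(C) = \sum_{i=1}^{n} \mathrm{Var}(C_i) + 2\rho \sum_{j=2}^{n} \sum_{i=1}^{j-1} \sqrt{\mathrm{Var}(C_i)\,\mathrm{Var}(C_j)} = n\sigma^2 + n(n-1)\rho\sigma^2 = n\sigma^2\,(1 + (n-1)\rho) \]
Correlation    Var(C)
0              $n\sigma^2$
$\rho$         $n\sigma^2\,(1 + (n-1)\rho)$
1              $n^2\sigma^2$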

6 Impact of Assuming Independence
For a 100-element WBS, assuming independence among all WBS elements when the true underlying correlation is equal to 20% results in an underestimate of the total system standard deviation of about 80%!
[Figure: percent by which the total standard deviation is underestimated, plotted against the actual correlation]
Source: Why Correlation Matters in Cost Estimating, Advanced Training Session, 32nd Annual DOD Cost Analysis Symposium, Williamsburg, VA, 1999.
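The size of this underestimate can be worked out directly under the simplifying assumptions of the level-correlation slide (n elements, equal variance $\sigma^2$, one common correlation $\rho$). Independence gives a total standard deviation of $\sigma\sqrt{n}$, while the correlated total is $\sigma\sqrt{n\,(1 + (n-1)\rho)}$, so the fraction by which independence understates the total standard deviation is

\[ 1 - \frac{\sigma\sqrt{n}}{\sigma\sqrt{n\,(1 + (n-1)\rho)}} = 1 - \frac{1}{\sqrt{1 + (n-1)\rho}} \]

For $n = 100$ and $\rho = 0.2$ this is $1 - 1/\sqrt{20.8} \approx 0.78$, which is consistent with the roughly 80% quoted on the slide. The equal-variance assumption is what makes $\sigma$ cancel; with unequal element variances the fraction depends on the individual values.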

7 Example of Impact
As an example, consider a system with 10 subsystems, each with mean equal to $10 million and standard deviation equal to $3 million.
For 100 elements and 1,000 elements, assuming correlation is zero when it is actually 20% results in underestimating the 80th percentile by 8-10%, and if the correlation is 60%, the 80th percentile is underestimated by 15-17%.

80% Confidence Level (TY$, Millions)
Number of WBS Elements    Independence    20% Correlation    60% Correlation
10                        $108            $113               $119
100                       $1,025          $1,111             $1,182
1,000                     $10,080         $11,092            $11,815
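The 80th-percentile figures in this example can be approximated with a short calculation. The sketch below assumes the total cost is lognormal with the stated mean and the level-correlation standard deviation; the slide does not name the distribution, so that choice is an assumption of this example, as are all function names. Under that assumption the results land within about $1 million of the tabulated values.

```python
import math
from statistics import NormalDist


def total_sd(n, element_sd, rho):
    """Total standard deviation for n elements with equal sd and common correlation rho."""
    return element_sd * math.sqrt(n * (1.0 + (n - 1) * rho))


def lognormal_percentile(mean, sd, p):
    """p-th percentile of a lognormal distribution matched to the given mean and sd."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)  # variance of the underlying normal
    mu = math.log(mean) - 0.5 * sigma2         # mean of the underlying normal
    z = NormalDist().inv_cdf(p)
    return math.exp(mu + z * math.sqrt(sigma2))


def p80(n, rho, element_mean=10.0, element_sd=3.0):
    """80th percentile of total cost ($ millions) under the lognormal assumption."""
    return lognormal_percentile(n * element_mean, total_sd(n, element_sd, rho), 0.80)


for n in (10, 100, 1000):
    row = [p80(n, rho) for rho in (0.0, 0.2, 0.6)]
    print(n, ["${:,.0f}".format(v) for v in row])
```

The quoted 8-10% and 15-17% gaps correspond to measuring the difference relative to the independent-case percentile, for example `p80(100, 0.2) / p80(100, 0.0) - 1`.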

8 Default Correlation
Notice in the graph on the previous chart there is an apparent knee in the curve around 20%.
Above 20% correlation the consequence of assuming less correlation begins to dwindle.
This graph is the basis for assuming 20-30% for default correlation for elements between which there is no functional correlation.
Book (Book 1999) recommends 20% as a default correlation value because of this.
However, the graph does not tell us how much the total standard deviation is underestimated when correlation is assumed to be 20% but is actually 60%, for example.

9 Underestimating Correlation with the Default 20%
For a 100-element WBS, if the correlation is assumed to be 20% but is actually 60%, the total standard deviation is underestimated by 40%.
[Figure: percent by which the total standard deviation is overestimated or underestimated, plotted against the actual correlation]
Source: Why Correlation Matters in Cost Estimating, Advanced Training Session, 32nd Annual DOD Cost Analysis Symposium, Williamsburg, VA, 1999.
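The over- and underestimation curve for a 20% default can be tabulated with a small sketch. It assumes the equal-variance, common-correlation model used earlier in the deck; the function name is illustrative and not taken from the source.

```python
import math


def sd_error_vs_actual(n, rho_assumed, rho_actual):
    """Signed fractional error of the total sd, relative to the sd under rho_actual.

    Positive values mean the assumed correlation overestimates the total sd,
    negative values mean it underestimates it.
    """
    assumed = math.sqrt(1.0 + (n - 1) * rho_assumed)
    actual = math.sqrt(1.0 + (n - 1) * rho_actual)
    return assumed / actual - 1.0


# 100-element WBS with a 20% assumed correlation, across a range of actual values.
for pct in range(0, 101, 20):
    err = sd_error_vs_actual(100, 0.20, pct / 100.0)
    label = "over" if err > 0 else "under"
    print("actual {:3d}%: {:6.1%} ({}estimated)".format(pct, abs(err), label))
```

With an actual correlation of 60% the result is an underestimate of roughly 41%, in line with the 40% stated on the slide.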

10 Robust Approach
A more robust approach to assigning correlations would be to use the value that results in the least amount of error in the variance.
It is robust in the sense that, without solid evidence to assign a correlation value, it minimizes the amount by which the total standard deviation is misestimated due to the correlation assumption.
This robust default measure of correlation would be a value for correlation that minimizes the error when the assumed correlation differs from the actual underlying correlation.

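11 Absolute Error
We are interested in the absolute value of the error, since if we consider negative and positive values, they may offset each other.
Let $\varepsilon$ denote the error. Then we are interested in $|\varepsilon|$, where $|\varepsilon|$ is defined by
\[ |\varepsilon| = \begin{cases} \varepsilon & \text{if } \varepsilon \ge 0 \\ -\varepsilon & \text{if } \varepsilon < 0 \end{cases} \]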

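12 Expected Value of Absolute Error
Write $\rho$ for the actual correlation and $\hat{\rho}$ for the assumed correlation.
If we assume that the prior distribution of the actual correlation on the interval (0, 1) is uniform, then the expected value of the absolute error $|\varepsilon|$ in the total standard deviation, as a function of the assumed correlation, is defined by
\[ E|\varepsilon| = \int_0^1 |\varepsilon|\, f(\rho)\, d\rho = \int_0^1 |\varepsilon|\, d\rho, \quad \text{since } f(\rho) = 1 \]
Thus the approach is to find the value of $\hat{\rho}$ that minimizes the expected (absolute) error.
This equation provides the expected error as a function of $\hat{\rho}$, and we then minimize this function with respect to $\hat{\rho}$ using techniques from elementary calculus.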
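The idea of averaging the absolute error over a uniform prior and then picking the best assumed correlation can be sketched numerically. The example below uses a midpoint rule in place of calculus and a deliberately simple placeholder error (assumed minus actual correlation) just to show the mechanics; both the helper names and that placeholder error are assumptions of this sketch, not one of the presentation's metrics.

```python
def expected_abs_error(error, rho_assumed, steps=10_000):
    """Midpoint-rule estimate of the integral of |error(rho_assumed, rho)| for rho in (0, 1)."""
    width = 1.0 / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * width  # midpoint of the k-th sub-interval
        total += abs(error(rho_assumed, rho))
    return total * width


def plain_difference(rho_assumed, rho_actual):
    """Placeholder error metric: the raw gap between assumed and actual correlation."""
    return rho_assumed - rho_actual


# Scan candidate assumed correlations and keep the one with the smallest expected error.
candidates = [k / 100.0 for k in range(101)]
best = min(candidates, key=lambda r: expected_abs_error(plain_difference, r, steps=2_000))
print(best, expected_abs_error(plain_difference, best))
```

Swapping `plain_difference` for a percentage-error function changes which assumed correlation wins, which is the point the following cases explore.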

13 What is the Error?
Now that we have determined how to find the minimum error, we need to decide which error to minimize.
We present several different choices, calculate the results, and provide pros and cons for each.

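14 Case 1: Percentage Error (of Actual)
Denote the assumed correlation by $\hat{\rho}$ and the actual correlation by $\rho$.
In this first case, we consider the metric that Book (Book, 1999) looked at when measuring over- and under-estimation of correlation: the error in the total standard deviation as a percentage of the value that results from the actual correlation.
\[ \varepsilon = \frac{\sigma\sqrt{n\,(1 + (n-1)\hat{\rho})} - \sigma\sqrt{n\,(1 + (n-1)\rho)}}{\sigma\sqrt{n\,(1 + (n-1)\rho)}} = \sqrt{\frac{1 + (n-1)\hat{\rho}}{1 + (n-1)\rho}} - 1 \]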

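15 Case 1: Calculating Expected Absolute Error (1 of 2)
The expected absolute error is calculated* as
\[ E|\varepsilon| = \int_0^{\hat{\rho}} \left( \sqrt{\frac{1 + (n-1)\hat{\rho}}{1 + (n-1)\rho}} - 1 \right) d\rho + \int_{\hat{\rho}}^{1} \left( 1 - \sqrt{\frac{1 + (n-1)\hat{\rho}}{1 + (n-1)\rho}} \right) d\rho \]
\[ E|\varepsilon| = 1 - 2\hat{\rho} + \frac{4\,(1 + (n-1)\hat{\rho}) - 2\,(1 + \sqrt{n})\sqrt{1 + (n-1)\hat{\rho}}}{n-1} \]
This is a function of the number of WBS elements ($n$) and the assumed correlation ($\hat{\rho}$).
Minimizing this with respect to $\hat{\rho}$, we find that
\[ \hat{\rho} = \frac{(1 + \sqrt{n})^2 - 4}{4\,(n-1)} = \frac{\sqrt{n} + 3}{4\,(\sqrt{n} + 1)} \]
*See the paper for detailed calculations.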
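As a cross-check on Case 1, the sketch below evaluates a closed-form expected error and compares its minimizer with a brute-force grid search. The closed forms were re-derived for this example under the deck's equal-variance, uniform-prior assumptions, with the error taken on the total standard deviation relative to the actual value; they are not copied from the paper, and the function names are illustrative.

```python
import math


def case1_expected_error(n, rho_assumed):
    """Closed-form E|eps| for the sd error measured relative to the actual sd."""
    a = n - 1
    s = math.sqrt(1.0 + a * rho_assumed)
    return 1.0 - 2.0 * rho_assumed + (2.0 * s / a) * (2.0 * s - 1.0 - math.sqrt(n))


def case1_minimizer(n):
    """Assumed correlation that minimizes case1_expected_error for a given n."""
    root_n = math.sqrt(n)
    return (root_n + 3.0) / (4.0 * (root_n + 1.0))


n = 100
grid = [k / 1000.0 for k in range(1001)]
grid_best = min(grid, key=lambda r: case1_expected_error(n, r))
print(round(case1_minimizer(n), 4), grid_best)

# The minimizer drifts toward 1/4 as the number of elements grows.
for size in (10, 100, 10_000, 1_000_000):
    print(size, round(case1_minimizer(size), 4))
```

The grid search is only a sanity check; it agrees with the closed form to within the grid spacing.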

16 Case 1: Calculating Expected Absolute Error (2 of 2)
The limit of this minimum as $n \to \infty$ is 25%.
This is close to the 20% default value advocated by Book (Book, 1999).
However, the total error is minimized by this value because of the large penalty assigned when overestimating actual correlations near zero.
For example, let $n = 100$ and assume the correlation is 40%. The absolute percentage error in the total standard deviation when the actual correlation is equal to zero is 537%, while the absolute percentage error when the actual correlation is equal to 80% is only 29%.
The penalty should not differ greatly whether you are overestimating or underestimating.
An easy way to overcome this issue is to examine the percent error as a function of the assumed correlation, which is considered in Case 2.
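The asymmetry in this example is easy to reproduce. The sketch below assumes the percentage error is taken on the total standard deviation relative to its actual value, under the equal-variance, common-correlation model; that reading is an assumption of this example, chosen because it reproduces the figures quoted on the slide.

```python
import math


def abs_pct_error_of_actual(n, rho_assumed, rho_actual):
    """Absolute error of the total sd, as a fraction of the sd under rho_actual."""
    ratio = math.sqrt(
        (1.0 + (n - 1) * rho_assumed) / (1.0 + (n - 1) * rho_actual)
    )
    return abs(ratio - 1.0)


# 100 elements, 40% assumed correlation, two very different actual correlations.
over = abs_pct_error_of_actual(100, 0.40, 0.0)   # actual correlation is zero
under = abs_pct_error_of_actual(100, 0.40, 0.80)  # actual correlation is 80%
print("{:.0%} overestimate, {:.0%} underestimate".format(over, under))
```

Because the denominator shrinks as the actual correlation approaches zero, overestimates are penalized far more heavily than underestimates of similar size, which pulls the Case 1 optimum downward.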

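17 Case 2: Percentage Error (of Assumed) (1 of 2)
This case is similar to Case 1; only the denominator is different.
In this case,
\[ \varepsilon = \frac{\sigma\sqrt{n\,(1 + (n-1)\hat{\rho})} - \sigma\sqrt{n\,(1 + (n-1)\rho)}}{\sigma\sqrt{n\,(1 + (n-1)\hat{\rho})}} = 1 - \sqrt{\frac{1 + (n-1)\rho}{1 + (n-1)\hat{\rho}}} \]
\[ E|\varepsilon| = \frac{2}{3}\hat{\rho} - 1 - \frac{4}{3\,(n-1)} + \frac{2\,(n^{3/2} + 1)}{3\,(n-1)\sqrt{1 + (n-1)\hat{\rho}}} \]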

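18 Case 2: Percentage Error (of Assumed) (2 of 2)
The value of $\hat{\rho}$ that minimizes the expected (absolute) error is
\[ \hat{\rho} = \frac{\left( \dfrac{n^{3/2} + 1}{2} \right)^{2/3} - 1}{n - 1} \]
The limit of $\hat{\rho}$ as $n \to \infty$ is
\[ 2^{-2/3} \approx 63\% \]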
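The Case 2 result can be checked numerically. The sketch below compares a closed-form minimizer, re-derived for this example with the error taken on the total standard deviation relative to the assumed value, against a midpoint-rule integration and grid search. The closed form, the function names, and the grid resolution are assumptions of this sketch rather than the paper's own calculation.

```python
import math


def case2_minimizer(n):
    """Closed-form assumed correlation minimizing E|eps| relative to the assumed sd."""
    s_cubed = (n ** 1.5 + 1.0) / 2.0
    return (s_cubed ** (2.0 / 3.0) - 1.0) / (n - 1)


def case2_expected_error(n, rho_assumed, steps=4000):
    """Midpoint-rule estimate of E|eps| with the actual correlation uniform on (0, 1)."""
    s_hat = math.sqrt(1.0 + (n - 1) * rho_assumed)
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) / steps
        s = math.sqrt(1.0 + (n - 1) * rho)
        total += abs(s_hat - s) / s_hat
    return total / steps


n = 100
grid = [k / 100.0 for k in range(101)]
grid_best = min(grid, key=lambda r: case2_expected_error(n, r))
limit = 2.0 ** (-2.0 / 3.0)
print(round(case2_minimizer(n), 4), grid_best, round(limit, 4))
```

For 100 elements the minimizer is already within about half a percentage point of the limiting value.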

19 Impact of Case 2
The single recommended value from this approach is 63%.
This is much larger than the 25% value from the other approach, or the 20% rule of thumb widely used in practice.
Increasing the default correlation from 20% to 63% will result in a significant increase in standard deviation.
[Table: % increase in σ by number of WBS elements]
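A rough sense of the size of that increase can be had from the equal-variance, common-correlation model used earlier in the deck. The sketch below is a computed illustration under that model; its outputs are not the values from the slide's table, and the function name is made up for the example.

```python
import math


def sd_increase(n, rho_old=0.20, rho_new=0.63):
    """Fractional increase in total sd when the common correlation rises from rho_old to rho_new."""
    return math.sqrt(
        (1.0 + (n - 1) * rho_new) / (1.0 + (n - 1) * rho_old)
    ) - 1.0


for n in (10, 100, 1_000, 10_000):
    print("{:>6,d} elements: {:.0%} increase in total sd".format(n, sd_increase(n)))
```

As the number of elements grows, the ratio approaches sqrt(0.63 / 0.20), so the increase levels off rather than growing without bound.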

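20 Case 3: Total Absolute Difference
The absolute difference could also be considered as a metric. Writing $\sigma_C(\rho)$ for the total standard deviation at correlation $\rho$, the error is
\[ |\varepsilon| = \left| \sigma_C(\hat{\rho}) - \sigma_C(\rho) \right| \]
In this case, the minimum expected absolute error occurs when $\hat{\rho}$ = 50%.
The result is the same whether the difference is taken in standard deviations or in variances: both increase steadily with correlation, so the expected absolute difference is smallest at the median of the uniform prior.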

21 Case 4: Case 1 with Truncated Limits (1 of 2)
If we consider the first case, much of the reason why the minimum is so low compared to the other cases is the error when the actual correlation is close to 0%.
We know that in most cases the correlation is not 0%, and we know that it is not 100%.
[Figure: expected absolute percentage error in the total standard deviation, as a percent of the value under the actual correlation, plotted against the assumed correlation for 100 WBS elements]

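22 Case 4: Case 1 with Truncated Limits (2 of 2)
If we truncate the actual correlation to be uniform in the interval (0.1, 0.9), then the expected value of the absolute percent error is minimized when
\[ \hat{\rho} = \frac{\left( \sqrt{1 + 0.1\,(n-1)} + \sqrt{1 + 0.9\,(n-1)} \right)^2 - 4}{4\,(n-1)} \]
The limit of this as $n \to \infty$ is 40%.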
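The effect of truncating the prior can be sketched with one small function. The closed form below was re-derived for this example by repeating the Case 1 argument on a uniform prior over (low, high), with the error taken on the total standard deviation; it is an assumption of this sketch rather than a formula quoted from the paper, and the function name is illustrative.

```python
import math


def case4_minimizer(n, low=0.1, high=0.9):
    """Assumed correlation minimizing the Case 1 metric with the actual correlation uniform on (low, high)."""
    a = n - 1
    # The optimal assumed total sd is the average of the sds at the two ends of the prior.
    s = 0.5 * (math.sqrt(1.0 + a * low) + math.sqrt(1.0 + a * high))
    return (s * s - 1.0) / a


for n in (10, 100, 10_000, 1_000_000):
    print(n, round(case4_minimizer(n), 4))

# Widening the prior back to (0, 1) recovers the untruncated Case 1 value.
print(round(case4_minimizer(100, 0.0, 1.0), 4))
```

In the limit the value is (sqrt(0.1) + sqrt(0.9))^2 / 4 = 0.40, so trimming the extreme correlations raises the optimum from 25% to 40%.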

23 Summary of the Four Cases
All four cases minimize the expected value of the absolute error in the total standard deviation, but use different metrics for measuring error.
Case 1: Error is measured as a percentage of the value that results from the actual correlation; the result in the limit is 25%.
Case 2: Error is measured as a percentage of the value that results from the assumed correlation; the result in the limit is 63%.
Case 3: Error is measured as the total difference rather than as a percentage; the result is 50%.
Case 4: Error is measured as in Case 1, as a percentage of the value that results from the actual correlation, with the correlation range limited to 10-90%; the result in the limit is 40%.

24 Recommendation
I recommend a percentage difference approach.
Knowing only the dollar difference between the estimated total standard deviation and the actual total standard deviation doesn't tell you much: the same difference could be large if the total standard deviation is in the hundreds of millions of dollars, or relatively small if the total standard deviation is in the billions.
Calculating the error as a percentage of the value based on the assumed correlation is logical.
The issue with looking at the error relative to the actual correlation is that we don't know the actual correlation; we only know the assumed correlation.
The same is true for CER residuals: the Minimum Unbiased Percent Error (MUPE) and the Zero bias Minimum Percent Error (ZMPE) CER methods look at the percentage error from the estimate, not from the actual.
We should use the same metric in looking at correlation.
Bottom line: I recommend using a default value for correlation that is equal to 63%.

25 Empirical Evidence for Correlation
There is some limited empirical evidence on correlation for spacecraft.
This ranges from 6-40% at the subsystem level.
Smart calculated an average correlation in the range 6-0% for NASA/Air Force Cost Model hardware subsystems (Smart, 2004).
Covert and Anderson calculated an average correlation equal to 6.8% for Unmanned Spacecraft Cost Model subsystems (Covert and Anderson, 2005).
Mackenzie and Addison reported correlations in the range 0-40% for the average unit cost of subsystems in NRO data (Mackenzie and Addison, 2000).
However, this evidence is only for one commodity.

26 Summary (1 of 2)
20% is often the default value when there is no information to provide informed input.
This level is too low.
Using a more robust approach, we have shown that default values in the range 40-63% are more reasonable.
I recommend 63% as a default value.
The only downside is the potential for overestimation.
However, as a profession we do not have a reputation for overestimation.
Increasing the default correlation value may help counter this.

27 Summary (2 of 2)
Example of underestimation of risk:
For a risk analysis conducted for the Tethered Satellite System, the actual cost was more than double the 95th percentile of the original cost risk analysis.
[Figure: confidence level versus normalized cost, showing the initial budget, two successive risk analyses, and the actual final cost]

28 References
Book, S.A., Why Correlation Matters in Cost Estimating, Advanced Training Session, 32nd Annual DOD Cost Analysis Symposium, Williamsburg, VA, 1999.
Covert, R. and T. Anderson, Correlation Tutorial, Rev. H, presented at the Cost Drivers Learning Event, November 2005.
Gupton, G.M., C.C. Finger, and M. Bhatia, CreditMetrics Technical Document, 1997, J.P. Morgan, New York.
Mackenzie, D., Cost Variance in Idealized Systems, presented at the International Society of Parametric Analysts Annual Conference, Cannes, France, June 1996.
Mackenzie, D., and B. Addison, Space System Cost Variance and Estimating Uncertainty, presented at the Space Systems Cost Analysis Group meeting, Seattle, WA, October 2000.
Smart, C., Average Correlation Values for NAFCOM 2004, unpublished white paper, SAIC, 2004.
Smart, C., Risk Analysis in the NASA/Air Force Cost Model, presented at the Joint Annual ISPA/SCEA Conference, Denver, CO, June 2005.
Smart, C., Mathematical Techniques for Joint Cost and Schedule Analysis, presented at the 2009 NASA Cost Symposium, Cocoa Beach, FL, May 2009.
Smart, C., Covered With Oil: Incorporating Realism in Cost Risk Analysis, presented at the 2011 Joint Annual International Society of Parametric Analysts and Society of Cost Estimating and Analysis Conference, Albuquerque, NM, June 2011.


More information

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2.

The variance of a sum of independent variables is the sum of their variances, since covariances are zero. Therefore. V (xi )= n n 2 σ2 = σ2. SAMPLE STATISTICS A radom sample x 1,x,,x from a distributio f(x) is a set of idepedetly ad idetically variables with x i f(x) for all i Their joit pdf is f(x 1,x,,x )=f(x 1 )f(x ) f(x )= f(x i ) The sample

More information
