Basic Probability/Statistical Theory I



Expectation
The expectation, or expected value, of a discrete random variable X is the arithmetic mean of the random variable's distribution.
$$E[X] = \sum_{\text{all } x} x \, p(X = x)$$
Expectation by conditioning:
$$E[X] = \sum_{\text{all } y} E[X \mid Y = y] \, p(Y = y)$$

Expectation of the Sum of Variables
Let S, X, and Y be random variables, and let a, b, and c be constants.
If:
$$S = c + aX + bY$$
Then:
$$E[S] = c + a\,E[X] + b\,E[Y]$$
Note: This theorem holds whether or not X and Y are statistically independent. It also applies to sums of three or more random variables.

Expectation of the Sum of Variables (continued)
Where X and Y are statistically independent, the following is also true:
If:
$$S = c \cdot aX \cdot bY$$
Then:
$$E[S] = c \cdot a\,E[X] \cdot b\,E[Y]$$

Variance
The variance of a random variable is a measure of dispersion around the arithmetic mean of the distribution.
$$\mathrm{Var}[X] = \sum_{\text{all } x} (x - E[X])^2 \, p(X = x)$$
Variance by conditioning:
$$\mathrm{Var}[X] = E[\mathrm{Var}[X \mid Y]] + \mathrm{Var}[E[X \mid Y]]$$

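Variance (continued)
Expressing the variance Var[X] by conditioning is the same as expressing the variance as the sum of the within-variance and the between-variance.
$$\mathrm{Var}[X] = E[\mathrm{Var}[X \mid Y]] + \mathrm{Var}[E[X \mid Y]]$$
$$\mathrm{Var}[X] = \underbrace{\sum_{\text{all } y} \mathrm{Var}[X \mid Y = y] \, p(Y = y)}_{\text{within-variance}} + \underbrace{\sum_{\text{all } y} (E[X \mid Y = y] - E[X])^2 \, p(Y = y)}_{\text{between-variance}}$$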
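The within/between decomposition can be checked numerically. The sketch below uses a small joint distribution that is purely hypothetical (it does not come from the slides) and confirms that the within-variance and between-variance add up to Var[X].

```python
# Hypothetical joint distribution p(X = x, Y = y), chosen only for illustration.
joint = {
    (1, "a"): 0.2, (3, "a"): 0.2,
    (2, "b"): 0.3, (6, "b"): 0.3,
}

def mean(dist):
    """Expectation of a {value: probability} distribution."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Variance of a {value: probability} distribution."""
    m = mean(dist)
    return sum((x - m) ** 2 * p for x, p in dist.items())

# Marginal distributions of X and Y.
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# Conditional distribution of X within each group Y = y.
cond = {}
for (x, y), p in joint.items():
    cond.setdefault(y, {})[x] = p / p_y[y]

total = variance(p_x)                                    # Var[X]
within = sum(variance(cond[y]) * p_y[y] for y in p_y)    # E[Var[X | Y]]
between = sum((mean(cond[y]) - mean(p_x)) ** 2 * p_y[y]  # Var[E[X | Y]]
              for y in p_y)

print(total, within, between)
```

For this distribution the within-variance is 2.8 and the between-variance is 0.96, which sum to the total variance of 3.76.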

Within- and Between-variance Components
Meanings of the within-variance and between-variance components:
The within-variance component represents the average variance of X within groupings or categories based on another variable Y.
The between-variance component reflects the differences between the average values of X in the groupings or categories based on Y.

Variance of the Sum of Variables
Let S, X, and Y be random variables, and let a, b, and c be constants.
If:
$$S = c + aX + bY$$
Then:
$$\mathrm{Var}[S] = a^2\,\mathrm{Var}[X] + b^2\,\mathrm{Var}[Y] + 2ab\,\mathrm{Cov}[X, Y]$$

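Example
If $S = c + aX + bY$ with a = 1 and b = -1, then $S = c + X - Y$ and the variance of S is:
$$\mathrm{Var}[S] = \mathrm{Var}[X] + \mathrm{Var}[Y] - 2\,\mathrm{Cov}[X, Y]$$
If X and Y are statistically independent, then Cov[X, Y] = 0 and
$$\mathrm{Var}[S] = \mathrm{Var}[X] + \mathrm{Var}[Y]$$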
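As a quick numerical check of the variance rule for a linear combination, the sketch below assumes the form S = c + aX + bY with a = 1 and b = -1 and uses made-up paired data. Population (divide-by-n) variance and covariance are used so that the identity holds exactly for the data.

```python
def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Population covariance of two equal-length lists; cov(u, u) is the variance."""
    mu_u, mu_v = mean(u), mean(v)
    return sum((p - mu_u) * (q - mu_v) for p, q in zip(u, v)) / len(u)

# Hypothetical paired observations of X and Y (illustration only).
xs = [1.0, 2.0, 4.0, 7.0]
ys = [2.0, 1.0, 5.0, 4.0]
a, b, c = 1.0, -1.0, 3.0

s = [c + a * x + b * y for x, y in zip(xs, ys)]

var_direct = cov(s, s)
var_formula = a**2 * cov(xs, xs) + b**2 * cov(ys, ys) + 2 * a * b * cov(xs, ys)

print(var_direct, var_formula)
```

The constant c shifts S but does not change its variance, which is why it does not appear in the formula.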

Variance of the Sum of Three or More Random Variables
This case is more complicated and involves the concept of the covariance matrix.
Let S be the sum of k random variables denoted by $X_1, X_2, \ldots, X_k$, and let $c_1, c_2, \ldots, c_k$ be constants.
If:
$$S = c_1 X_1 + c_2 X_2 + \cdots + c_k X_k$$
Then:
$$\mathrm{Var}[S] = \sum_{i=1}^{k} \sum_{j=1}^{k} \mathrm{Cov}[c_i X_i, c_j X_j]$$

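Variance of the Sum of Three or More Random Variables (continued)
Covariance matrix:
$$
\begin{array}{c|ccccc}
 & X_1 & X_2 & X_3 & \cdots & X_k \\
\hline
X_1 & \mathrm{Cov}[X_1, X_1] & \mathrm{Cov}[X_1, X_2] & \mathrm{Cov}[X_1, X_3] & \cdots & \mathrm{Cov}[X_1, X_k] \\
X_2 & \mathrm{Cov}[X_2, X_1] & \mathrm{Cov}[X_2, X_2] & \mathrm{Cov}[X_2, X_3] & \cdots & \mathrm{Cov}[X_2, X_k] \\
X_3 & \mathrm{Cov}[X_3, X_1] & \mathrm{Cov}[X_3, X_2] & \mathrm{Cov}[X_3, X_3] & \cdots & \mathrm{Cov}[X_3, X_k] \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
X_k & \mathrm{Cov}[X_k, X_1] & \mathrm{Cov}[X_k, X_2] & \mathrm{Cov}[X_k, X_3] & \cdots & \mathrm{Cov}[X_k, X_k]
\end{array}
$$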

Variance of the Sum of Three or More Random Variables (continued)
Rewrite Var[S]:
$$\mathrm{Var}[S] = \sum_{i=1}^{k} \sum_{j=1}^{k} \mathrm{Cov}[c_i X_i, c_j X_j] = \sum_{i=1}^{k} c_i^2\,\mathrm{Var}[X_i] + \sum_{i=1}^{k} \sum_{j \neq i} c_i c_j\,\mathrm{Cov}[X_i, X_j]$$
To calculate Var[S], we need to know the value of each variance term and the value of the covariance for each pair of random variables.

Variance of the Sum of Three or More Random Variables (continued)
When all the random variables are pairwise independent, Var[S] simplifies to:
$$\mathrm{Var}[S] = \sum_{i=1}^{k} \mathrm{Var}[c_i X_i]$$
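A minimal sketch of the double sum over the covariance matrix follows. The matrix entries and the constants are hypothetical values picked for illustration; `var_of_sum` and `var_if_independent` are names introduced here, not taken from the slides.

```python
# Hypothetical covariance matrix for X_1, X_2, X_3 (symmetric, variances on the diagonal).
cov_matrix = [
    [4.0, 1.0, 0.5],
    [1.0, 9.0, -2.0],
    [0.5, -2.0, 1.0],
]
c = [1.0, 2.0, -1.0]  # hypothetical constants c_1, c_2, c_3

def var_of_sum(c, cov):
    """Var[S] as the double sum of c_i * c_j * Cov[X_i, X_j] over all i, j."""
    k = len(c)
    return sum(c[i] * c[j] * cov[i][j] for i in range(k) for j in range(k))

def var_if_independent(c, cov):
    """Var[S] when all covariances are zero: only the diagonal terms remain."""
    return sum(c[i] ** 2 * cov[i][i] for i in range(len(c)))

print(var_of_sum(c, cov_matrix))          # 52.0
print(var_if_independent(c, cov_matrix))  # 41.0
```

The difference between the two results (11.0) is exactly the contribution of the off-diagonal covariance terms.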

Basic Probability/Statistical Theory II

Sample Mean Distribution (1)
Assume there is a large population of N elements, and that we draw a simple random sample of n elements from this population such that n is much smaller than N. Based on the n elements in the sample, we calculate a sample mean, $\bar{x}$:
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

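Sample Mean Distribution (2)
The sample mean ($\bar{x}$) is an estimate of the population mean E[X] (or $\mu$).
$$\hat{\mu} = \bar{x} = \frac{1}{n}(x_1 + \cdots + x_n) = \frac{1}{n} \sum_{i=1}^{n} x_i$$
Due to random selection, the values of $\bar{x}$ and $\hat{\mu}$ can vary from sample to sample.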

Sample Mean Distribution (3)
If there are M distinct samples of n elements that we can draw from the population, there are M possible sample means. Some of these sample means may have the same value, or all could be different.
This set of M sample means is termed the sampling distribution of the sample means, which is also called the sample mean distribution for short.

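Sample Mean Distribution (4)
Expectation of the sample mean distribution ($E[\hat{\mu}]$):
$$E[\hat{\mu}] = E\left[\frac{1}{n} \sum_{i=1}^{n} x_i\right] = \frac{1}{n} E[x_1 + \cdots + x_n] = \frac{1}{n}\left(E[x_1] + E[x_2] + \cdots + E[x_n]\right)$$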

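Sample Mean Distribution (5)
Under simple random sampling, the expected value for any element drawn into the sample is equal to the population mean, $\mu$.
$$E[\hat{\mu}] = \frac{1}{n} \sum_{i=1}^{n} E[x_i] = \frac{1}{n}\, n\mu = \mu$$
Since $E[\hat{\mu}] = \mu$, the estimator $\hat{\mu}$ is said to be an unbiased estimate of the population mean.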

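Sample Mean Distribution (6)
Variance of the sample mean distribution ($\mathrm{Var}[\hat{\mu}]$):
$$\mathrm{Var}[\hat{\mu}] = \mathrm{Var}\left[\frac{1}{n} \sum_{i=1}^{n} x_i\right] = \frac{1}{n^2}\,\mathrm{Var}[x_1 + \cdots + x_n] = \frac{1}{n^2}\left(\sum_{i=1}^{n} \mathrm{Var}[x_i] + \sum_{i=1}^{n} \sum_{j \neq i} \mathrm{Cov}[x_i, x_j]\right)$$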

Sample Mean Distribution (7)
Under simple random sampling, two important facts related to the variance are:
The variance associated with any element drawn into the sample is equal to the population variance, $\sigma^2$, and
The elements in the sample are independent, so all elements are pairwise independent such that $\mathrm{Cov}[x_i, x_j] = 0$ for all $i \neq j$.
Therefore,
$$\mathrm{Var}[\hat{\mu}] = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}[x_i] = \frac{1}{n^2} \sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}$$

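Sample Mean Distribution (8)
If n is much smaller than N, $\mathrm{Var}[\hat{\mu}]$ equals the population variance $\sigma^2$ divided by the sample size n.
$\mathrm{Var}[\hat{\mu}]$ is a measure of the precision of the estimate of $\mu$.
An unbiased estimate of $\mathrm{Var}[\hat{\mu}]$ is
$$S_{\bar{X}}^2 = \frac{S^2}{n}\left(1 - \frac{n}{N}\right)$$
where $S^2$ is the sample variance.
The term standard error is normally used to represent the standard deviation of the sample mean distribution:
$$S_{\bar{X}} = \sqrt{S_{\bar{X}}^2}$$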
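For a tiny population, the sample mean distribution can be enumerated exhaustively. The sketch below assumes a hypothetical population of N = 5 values and samples of n = 2 drawn without replacement, so the finite population correction (1 - n/N) matters. It lists all M possible samples and compares the mean and variance of the M sample means with the population mean and with (S²/n)(1 - n/N), where S² is computed with an N - 1 divisor.

```python
from itertools import combinations
from statistics import mean, pvariance, variance

# Hypothetical population (illustration only).
population = [2, 4, 6, 8, 10]
N = len(population)
n = 2

# All M distinct samples of size n, and the sample mean of each one.
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]
M = len(samples)

# Expectation and variance of the sample mean distribution.
expected_sample_mean = mean(sample_means)
var_of_sample_means = pvariance(sample_means)

# Formula with the finite population correction; variance() divides by N - 1.
formula = variance(population) / n * (1 - n / N)

print(M, expected_sample_mean, var_of_sample_means, formula)
```

Here M = 10, the average of all sample means equals the population mean of 6 (zero bias), and both variance computations give 3.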

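Bias, Precision and Accuracy (1)
In statistical sampling, the term accuracy can be thought of as combining the concepts of bias and precision. However, statisticians tend not to use the term accuracy and instead use the term mean square error (MSE), which is defined as:
$$\text{mean square error} = \text{variance} + \text{bias}^2$$
$$\mathrm{MSE}(\hat{\mu}) = \mathrm{Var}[\hat{\mu}] + \mathrm{Bias}(\hat{\mu})^2$$
where $\mathrm{Bias}(\hat{\mu}) = E[\hat{\mu} - \mu]$ and $\mathrm{Var}[\hat{\mu}]$ measures precision.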

Bias, Precision and Accuracy (2)
The estimator with the lowest MSE is considered the best or most accurate estimator.
It is possible that a random sampling scheme may involve a biased estimator such that $E[\hat{\mu}] \neq \mu$. However, if $\mathrm{Var}[\hat{\mu}]$ for the scheme is sufficiently low, the overall MSE may be lower than that of a different sampling scheme for which $E[\hat{\mu}] = \mu$.
It is important to understand that zero bias does not signify that every sample mean equals the true population mean. Rather, zero bias signifies that the average of all possible sample mean values equals the population mean.
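The trade-off between bias and variance can be illustrated with two made-up sets of estimates of a true value of 10. Scheme A is unbiased but imprecise; scheme B is biased but precise. The numbers and the helper name `mse_parts` are hypothetical and serve only to show that MSE equals variance plus squared bias.

```python
def mse_parts(estimates, true_value):
    """Return (bias, variance, mse) for a list of equally likely estimates."""
    m = sum(estimates) / len(estimates)
    bias = m - true_value
    var = sum((e - m) ** 2 for e in estimates) / len(estimates)
    mse = sum((e - true_value) ** 2 for e in estimates) / len(estimates)
    return bias, var, mse

true_value = 10.0
scheme_a = [6.0, 8.0, 12.0, 14.0]    # unbiased, high variance
scheme_b = [10.5, 11.0, 11.0, 11.5]  # biased by +1, low variance

bias_a, var_a, mse_a = mse_parts(scheme_a, true_value)
bias_b, var_b, mse_b = mse_parts(scheme_b, true_value)

print(bias_a, var_a, mse_a)  # 0.0 10.0 10.0
print(bias_b, var_b, mse_b)  # 1.0 0.125 1.125
```

Despite its bias, scheme B has the lower MSE and would be considered the more accurate of the two.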

Bias, Precision and Accuracy (3)
[Figure: sampling distributions S1 and S2 plotted against the true distribution.]
Which one of the S1 and S2 distributions has better precision and accuracy for estimating the true distribution?

Bias, Precision and Accuracy (4)
In industrial hygiene, when collecting and analyzing a sample, we usually experience an error in that our result doesn't equal the true environmental level that we monitored. There are two sources of error:
(1) Bias (= systematic error). It is the difference between the mean of our repeated measurements and the true value.
(2) Random error (= random variability). It represents variability in the repeated measurements of a constant environmental level. This variability may arise from fluctuations in the flow rate of the sampling pump, fluctuations in electrical current flow for the laboratory instruments, etc.

Bias, Precision and Accuracy (5)
In industrial hygiene, the random error in a sampling and analytical method is often expressed by the coefficient of variation (CV). If we denote the coefficient of variation in the sampling device by $CV_S$, and the coefficient of variation in the analytical procedure by $CV_A$, then the total coefficient of variation ($CV_T$) is computed as follows:
$$CV_T = \sqrt{CV_S^2 + CV_A^2}$$
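A short sketch of the total coefficient of variation follows, assuming the usual root-sum-of-squares combination (sampling and analytical errors treated as independent). The function name `total_cv` and the input values are hypothetical.

```python
from math import sqrt

def total_cv(cv_sampling, cv_analytical):
    """Combine two independent coefficients of variation by root-sum-of-squares."""
    return sqrt(cv_sampling ** 2 + cv_analytical ** 2)

# Hypothetical values: 3% sampling CV and 4% analytical CV.
cv_t = total_cv(0.03, 0.04)
print(round(cv_t, 4))  # 0.05
```

Note that the total is smaller than the simple sum of the two components (0.07), because independent errors partially offset each other.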

Bias, Precision and Accuracy (5)
In industrial hygiene, accuracy is a somewhat confusing statistic that incorporates:
measurement error due to both bias and random error; and
a confidence level.
We often say that we want our sampling and analytical method to have ±25% accuracy at a 95% confidence level for measuring an environmental concentration at the permissible exposure limit (PEL). This means that if we generate a constant test atmosphere at the PEL, at least 95% of our measurements must fall within the range (0.75 PEL) to (1.25 PEL).

Bias, Precision and Accuracy (6)
Suppose we are given the bias (as a proportion) and the coefficient of variation ($CV_m$) of the method, and the true test concentration ($\mu$) we are trying to measure, and we are asked to determine whether the method meets the required accuracy at a 95% confidence level.
The way to determine it is:
(1) Compute the mean ($\mu_m$) of the measurements: $\mu_m = \mu + \text{Bias} \times \mu$
(2) Compute the standard deviation ($\sigma_m$) of the measurements: $\sigma_m = CV_m \times \mu_m$

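Bias, Precision and Accuracy (7)
(3) Find the percentiles of the measurements corresponding to (0.75 PEL) and (1.25 PEL):
$$Z_{upper} = \frac{1.25\,\mathrm{PEL} - \mu_m}{\sigma_m} \quad \text{and use the Z table for the percentile } X_{upper}\%$$
$$Z_{lower} = \frac{0.75\,\mathrm{PEL} - \mu_m}{\sigma_m} \quad \text{and use the Z table for the percentile } X_{lower}\%$$
(4) If $X_{upper}\% - X_{lower}\% \geq 95\%$, the method meets the accuracy criterion.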

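Example
A method is required to have ±25% accuracy at a 95% confidence level for measuring an atmosphere at the PEL. Given a bias of 0.04, a CV of 0.11, and a PEL of 200 ppm for this method, check whether the method meets the stated criterion.
(1) $\mu_m = 200 + 0.04 \times 200 = 208$ ppm
(2) $\sigma_m = 0.11 \times 208 = 22.9$ ppm
(3) $Z_{upper} = \dfrac{1.25 \times 200 - 208}{22.9} = 1.83$, so $X_{upper} = 96.7\%$
$Z_{lower} = \dfrac{0.75 \times 200 - 208}{22.9} = -2.53$, so $X_{lower} = 0.6\%$
(4) 96.7% - 0.6% = 96.1% of the measurements fall within ±25% of the PEL.
Conclusion: The accuracy criterion was met.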
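The four-step accuracy check can be sketched in code. The function below is an illustration under the same assumptions as the worked example (normally distributed measurements, the test concentration equal to the PEL, a ±25% window, a 95% requirement); the names `normal_cdf` and `accuracy_check` are introduced here, and the Z table lookup is replaced by the standard normal CDF computed from `math.erf`.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function (replaces the Z table)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def accuracy_check(pel, bias, cv, tolerance=0.25, confidence=0.95):
    """Return (coverage, passed) for a method tested at the PEL."""
    mu_m = pel + bias * pel      # step 1: mean of the measurements
    sigma_m = cv * mu_m          # step 2: standard deviation of the measurements
    z_upper = ((1.0 + tolerance) * pel - mu_m) / sigma_m  # step 3
    z_lower = ((1.0 - tolerance) * pel - mu_m) / sigma_m
    coverage = normal_cdf(z_upper) - normal_cdf(z_lower)  # step 4
    return coverage, coverage >= confidence

coverage, passed = accuracy_check(pel=200.0, bias=0.04, cv=0.11)
print(f"{coverage:.1%}", passed)
```

With the example inputs the coverage comes out at about 96%, in line with the hand calculation, so the criterion is met. Small differences from the table-based figure arise because the code does not round Z to two decimals.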

Notes on Lognormal Distribution

Transformation of Parameters Between Normal and Lognormal Distributions (1)
Definitions:
$\mu$: arithmetic mean
$\sigma$: arithmetic standard deviation
$\mu_g$: geometric mean
$\sigma_g$: geometric standard deviation
$\mu_l$: mean of log-transformed values
$\sigma_l$: standard deviation of log-transformed values

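Transformation of Parameters Between Normal and Lognormal Distributions (2)
$$\mu_g = e^{\mu_l}$$
$$\sigma_g = e^{\sigma_l}$$
$$\mu = e^{\mu_l + 0.5\sigma_l^2} = \mu_g\, e^{0.5\sigma_l^2}$$
$$\sigma = \sqrt{e^{2\mu_l + \sigma_l^2}\left(e^{\sigma_l^2} - 1\right)} = \mu \sqrt{e^{\sigma_l^2} - 1}$$
$$\sigma_g = e^{\sqrt{\ln\left(1 + \sigma^2/\mu^2\right)}}$$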

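Transformation of Parameters Between Normal and Lognormal Distributions (3)
$$\mu_l = \ln(\mu_g) = \ln \mu - 0.5\sigma_l^2$$
$$\sigma_l = \ln(\sigma_g) = \sqrt{\ln\left[1 + \left(\frac{\sigma}{\mu}\right)^2\right]}$$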
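The parameter conversions can be sketched as a pair of helper functions. The code uses the standard lognormal relations (mean = exp(mu_l + 0.5 sigma_l²), sigma_l² = ln[1 + (sigma/mu)²]); the function names and the input values are hypothetical, and the round trip at the end is only a consistency check.

```python
from math import exp, log, sqrt

def arithmetic_to_log(mu, sigma):
    """Arithmetic mean and SD -> mean and SD of the log-transformed values."""
    sigma_l = sqrt(log(1.0 + (sigma / mu) ** 2))
    mu_l = log(mu) - 0.5 * sigma_l ** 2
    return mu_l, sigma_l

def log_to_arithmetic(mu_l, sigma_l):
    """Mean and SD of the log-transformed values -> arithmetic mean and SD."""
    mu = exp(mu_l + 0.5 * sigma_l ** 2)
    sigma = mu * sqrt(exp(sigma_l ** 2) - 1.0)
    return mu, sigma

def log_to_geometric(mu_l, sigma_l):
    """Mean and SD of the log-transformed values -> geometric mean and GSD."""
    return exp(mu_l), exp(sigma_l)

# Hypothetical arithmetic mean and SD (illustration only).
mu_l, sigma_l = arithmetic_to_log(10.0, 5.0)
gm, gsd = log_to_geometric(mu_l, sigma_l)
mu_back, sigma_back = log_to_arithmetic(mu_l, sigma_l)

print(round(gm, 3), round(gsd, 3), round(mu_back, 3), round(sigma_back, 3))
```

For a right-skewed lognormal distribution the geometric mean is always below the arithmetic mean, which the example reproduces.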

Theorems Regarding the Product of Lognormal Variables (1)
Let X and Y be lognormally distributed variables, and let c be a constant.
If:
$$P = c \times X \times Y$$
Then: P is a lognormally distributed variable.

Theorems Regarding the Product of Lognormal Variables (2)
Proof:
By log-transforming the expression for P we obtain:
$$\ln P = \ln c + \ln X + \ln Y$$
Because c is a constant, ln c is a constant.
Because X is lognormally distributed, ln X is normally distributed.
Because Y is lognormally distributed, ln Y is normally distributed.
Because ln P is the sum of two normally distributed variables and a constant, ln P is normally distributed.
Because ln P is normally distributed, P is lognormally distributed.

Theorems Regarding the Product of Lognormal Variables (3)
Let X and Y be lognormally distributed variables, and let c be a constant.
If:
$$P = c \times X \times Y$$
Then:
$$\mathrm{GM}[P] = c \times \mathrm{GM}[X] \times \mathrm{GM}[Y]$$

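Theorems Regarding the Product of Lognormal Variables (4)
Proof:
By log-transforming the expression for P we obtain:
$$\ln P = \ln c + \ln X + \ln Y$$
For the normally distributed variables ln X and ln Y, the expectation of the sum ln P is:
$$E[\ln P] = E[\ln c] + E[\ln X] + E[\ln Y]$$
By definition:
$$E[\ln P] = \ln \mathrm{GM}[P], \quad E[\ln c] = \ln c, \quad E[\ln X] = \ln \mathrm{GM}[X], \quad E[\ln Y] = \ln \mathrm{GM}[Y]$$
Recall:
$$\mathrm{GM}[P] = e^{E[\ln P]}$$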

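Theorems Regarding the Product of Lognormal Variables (5)
Proof (continued):
Rewrite the expression $E[\ln P] = E[\ln c] + E[\ln X] + E[\ln Y]$ as:
$$\ln \mathrm{GM}[P] = \ln c + \ln \mathrm{GM}[X] + \ln \mathrm{GM}[Y]$$
Exponentiate both sides of the equation:
$$e^{\ln \mathrm{GM}[P]} = e^{\ln c + \ln \mathrm{GM}[X] + \ln \mathrm{GM}[Y]} = e^{\ln c}\, e^{\ln \mathrm{GM}[X]}\, e^{\ln \mathrm{GM}[Y]}$$
Since $e^{\ln a} = a$, the above equation can be written as:
$$\mathrm{GM}[P] = c \times \mathrm{GM}[X] \times \mathrm{GM}[Y]$$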
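The geometric mean result can be checked on data. The sketch below uses made-up paired values of X and Y and a hypothetical constant c, computes the geometric mean of the products directly as exp(mean(ln P)), and compares it with c × GM[X] × GM[Y]. Because the logarithm turns the product into a sum, the identity holds exactly for any positive data, not only for lognormal samples.

```python
from math import exp, log

def geometric_mean(values):
    """Geometric mean as the exponential of the mean of the logs."""
    return exp(sum(log(v) for v in values) / len(values))

# Hypothetical paired observations and constant (illustration only).
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0, 9.0, 1.0, 3.0]
c = 5.0

products = [c * x * y for x, y in zip(xs, ys)]

gm_direct = geometric_mean(products)
gm_theorem = c * geometric_mean(xs) * geometric_mean(ys)

print(gm_direct, gm_theorem)
```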

Theorems Regarding the Product of Lognormal Variables (6)
Let X and Y be lognormally distributed variables, and let c be a constant.
If:
$$P = c \times X \times Y$$
Then:
$$\mathrm{GSD}[P] = e^{\sqrt{(\ln \mathrm{GSD}[X])^2 + (\ln \mathrm{GSD}[Y])^2 + 2r\,(\ln \mathrm{GSD}[X])(\ln \mathrm{GSD}[Y])}}$$
where r is the correlation coefficient for ln X and ln Y.

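Theorems Regarding the Product of Lognormal Variables (7)
Proof:
By log-transforming the expression for P we have:
$$\ln P = \ln c + \ln X + \ln Y$$
For the normally distributed variables ln X and ln Y, the variance of the sum ln P is:
$$\mathrm{Var}[\ln P] = \mathrm{Var}[\ln X] + \mathrm{Var}[\ln Y] + 2\,\mathrm{Cov}[\ln X, \ln Y]$$
The term Var[ln c] does not appear since ln c is a constant, and the variance of a constant is zero.
By definition:
$$\mathrm{Var}[\ln P] = (\ln \mathrm{GSD}[P])^2, \quad \mathrm{SD}[\ln P] = \ln \mathrm{GSD}[P]$$
$$\mathrm{Var}[\ln X] = (\ln \mathrm{GSD}[X])^2, \quad \mathrm{SD}[\ln X] = \ln \mathrm{GSD}[X]$$
$$\mathrm{Var}[\ln Y] = (\ln \mathrm{GSD}[Y])^2, \quad \mathrm{SD}[\ln Y] = \ln \mathrm{GSD}[Y]$$
Recall:
$$\mathrm{GSD}[P] = e^{\mathrm{SD}[\ln P]}$$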

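Theorems Regarding the Product of Lognormal Variables (8)
Proof (continued):
The correlation coefficient r for two variables X and Y is defined as:
$$r = \frac{\mathrm{Cov}(X, Y)}{\mathrm{SD}[X]\,\mathrm{SD}[Y]}$$
Therefore, for ln X and ln Y:
$$r = \frac{\mathrm{Cov}(\ln X, \ln Y)}{\mathrm{SD}[\ln X]\,\mathrm{SD}[\ln Y]}$$
By rearrangement and substitution:
$$\mathrm{Cov}[\ln X, \ln Y] = r\,\mathrm{SD}[\ln X]\,\mathrm{SD}[\ln Y] = r\,(\ln \mathrm{GSD}[X])(\ln \mathrm{GSD}[Y])$$
Rewrite Var[ln P] as:
$$(\ln \mathrm{GSD}[P])^2 = (\ln \mathrm{GSD}[X])^2 + (\ln \mathrm{GSD}[Y])^2 + 2r\,(\ln \mathrm{GSD}[X])(\ln \mathrm{GSD}[Y])$$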

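Theorems Regarding the Product of Lognormal Variables (9)
Proof (continued):
Take the square root of both sides of the equation:
$$\ln \mathrm{GSD}[P] = \sqrt{(\ln \mathrm{GSD}[X])^2 + (\ln \mathrm{GSD}[Y])^2 + 2r\,(\ln \mathrm{GSD}[X])(\ln \mathrm{GSD}[Y])}$$
By exponentiating both sides of the equation, we obtain:
$$\mathrm{GSD}[P] = e^{\sqrt{(\ln \mathrm{GSD}[X])^2 + (\ln \mathrm{GSD}[Y])^2 + 2r\,(\ln \mathrm{GSD}[X])(\ln \mathrm{GSD}[Y])}}$$
Note that if X and Y are independent, which means that ln X and ln Y are independent, r = 0, and the above expression for GSD[P] simplifies to:
$$\mathrm{GSD}[P] = e^{\sqrt{(\ln \mathrm{GSD}[X])^2 + (\ln \mathrm{GSD}[Y])^2}}$$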
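A minimal sketch of the GSD rule for a product follows, assuming the formula GSD[P] = exp(sqrt(a² + b² + 2rab)) with a = ln GSD[X] and b = ln GSD[Y]. The function name `gsd_of_product` and the input values are hypothetical.

```python
from math import exp, log, sqrt

def gsd_of_product(gsd_x, gsd_y, r=0.0):
    """GSD of P = c * X * Y, where r is the correlation between ln X and ln Y."""
    lx, ly = log(gsd_x), log(gsd_y)
    return exp(sqrt(lx ** 2 + ly ** 2 + 2.0 * r * lx * ly))

# Hypothetical geometric standard deviations (illustration only).
independent = gsd_of_product(2.0, 3.0)               # r = 0
fully_correlated = gsd_of_product(2.0, 3.0, r=1.0)   # ln GSDs add, so GSD = 2 * 3

print(round(independent, 3), round(fully_correlated, 3))
```

The constant c does not enter the calculation, and positive correlation between ln X and ln Y widens the spread of the product relative to the independent case.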