Expectation and Variance of a random variable

Chapter 11 Expectation and Variance of a random variable

R. A. Rigby and D. M. Stasiopoulos, September 2005

The aim of this lecture is to define and introduce the mathematical expectation and variance of a function of discrete and continuous random variables, and the distribution of the sample mean.

11. Introduction

Depending on the type of experiment, the outcomes may belong to a limited set of categories, for example Heads and Tails from tossing a coin, or 1, 2, 3, 4, 5, 6 from throwing a die. Such a random variable is known as a discrete random variable. On the other hand, a random variable which is neither categorical nor discrete is referred to as a continuous random variable. It is possible to study the probability distribution, or simply the distribution, of a random variable along with the associated probabilities of its outcomes.

Notation: A random variable is normally denoted by a capital letter, $X$, $Y$, etc., and particular values of a random variable by small letters, $x$, $y$, etc.

11.1 Expectation and Variance

Definition: The mathematical expectation of a random variable $X$, written $E(X)$, is the mean value of $X$ and is defined as

$$E(X) = \sum_x x\,p(x) \quad \text{when } X \text{ is discrete}$$

or

$$E(X) = \int x\,f(x)\,dx \quad \text{when } X \text{ is continuous.}$$

The population mean (or expected value) of a function $g(X)$ of a random variable $X$ is defined as

$$\mu_g = E[g(X)] = \sum_x g(x)\,p(x) \quad \text{when } X \text{ is discrete}$$

or

$$\mu_g = E[g(X)] = \int g(x)\,f(x)\,dx \quad \text{when } X \text{ is continuous.}$$

For example, $E(X^2) = \sum_x x^2\,p(x)$ or $E(X^2) = \int x^2 f(x)\,dx$, depending on the nature of $X$.

11.1.1 Some Properties of E()

1. $E(c) = c$, i.e. the expected value of a constant $c$ is $c$.
2. $E(cX) = c\,E(X)$, i.e. the expected value of $cX$ is $c$ times the expected value of $X$. This can be proved as follows (discrete case).
   Proof: $E(cX) = \sum_x c\,x\,p(x) = c \sum_x x\,p(x) = c\,E(X)$
3. $E(X + Y) = E(X) + E(Y)$
4. $E(aX + bY) = a\,E(X) + b\,E(Y)$
5. $E(XY) = E(X)\,E(Y)$ (provided the variables $X$ and $Y$ are independent of each other).

Definition: The variance of a random variable $X$, written $V(X)$, is defined as

$$V(X) = E[(X - \mu)^2] = E(X^2) - [E(X)]^2, \quad \text{where } \mu = E(X),$$

since $E[(X - \mu)^2] = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2$. Hence the variance of $X$ is the average squared distance of $X$ from its mean.

The population variance of a function $g(X)$ of a random variable $X$ is

$$\sigma_g^2 = V[g(X)] = E\{[g(X) - \mu_g]^2\} = E\{[g(X)]^2\} - \mu_g^2, \quad \text{where } \mu_g = E[g(X)].$$

Hence the variance of $g(X)$ is the average squared distance of $g(X)$ from its mean $\mu_g$.

11.1.2 Some Properties of V()

1. $V(c) = 0$, since $V(c) = E(c^2) - [E(c)]^2 = c^2 - c^2 = 0$.
2. $V(cX) = c^2 V(X)$, since $V(cX) = E(c^2 X^2) - [E(cX)]^2 = c^2 E(X^2) - c^2 [E(X)]^2 = c^2 V(X)$.
3. $V(X + Y) = V(X) + V(Y)$, provided $X$ and $Y$ are independent.
4. $V(aX + bY) = a^2 V(X) + b^2 V(Y)$, provided $X$ and $Y$ are independent.

Note:
(i) $V(-Y) = V(Y)$, since $V(-Y) = V(-1 \cdot Y) = (-1)^2 V(Y) = V(Y)$.
(ii) $V(X - Y) = V(X) + V(Y)$, provided $X$ and $Y$ are independent, since $V(X - Y) = V(X + (-Y)) = V(X) + V(-Y) = V(X) + V(Y)$.

Example 1: $X$ and $Y$ are two independent random variables with mean values of 5 and 7 and variances of 1.5 and 2 respectively. Find (i) $E(X + 2Y)$ (ii) $E(5 - 3X)$ (iii) $V(5X - 4Y)$.

Solution 1:
(i) $E(X + 2Y) = E(X) + 2E(Y) = 5 + 2 \times 7 = 19$
(ii) $E(5 - 3X) = 5 - 3E(X) = 5 - 3 \times 5 = -10$

(iii) $V(5X - 4Y) = V(5X) + V(4Y) = 25\,V(X) + 16\,V(Y) = 25 \times 1.5 + 16 \times 2 = 69.5$

11.2 The distribution of linear combinations of normal random variables

Theorem 1. Let $X \sim N(\mu_X, \sigma_X^2)$ and $Y \sim N(\mu_Y, \sigma_Y^2)$, where $X$ and $Y$ are independent. Let $T = aX + bY$ where $a$ and $b$ are constants. Then $T \sim N(\mu_T, \sigma_T^2)$ where

$$\mu_T = a\mu_X + b\mu_Y \quad \text{and} \quad \sigma_T^2 = a^2\sigma_X^2 + b^2\sigma_Y^2.$$

Proof:

$$\mu_T = E(T) = E(aX + bY) = E(aX) + E(bY) = a\,E(X) + b\,E(Y) = a\mu_X + b\mu_Y$$

$$\sigma_T^2 = V(T) = V(aX + bY) = V(aX) + V(bY) = a^2 V(X) + b^2 V(Y) = a^2\sigma_X^2 + b^2\sigma_Y^2$$

since $X$ and $Y$ are independent. The proof that $T$ is normally distributed is difficult and is not given here.

Example 2: The weight $X$ of an apple has a $N(80, 25)$ distribution and the weight $Y$ of an orange has a $N(150, 39)$ distribution; both weights are measured in grams.

(a) Suppose an apple and an orange are selected at random. What is the distribution of the total weight, $T$?

Solution (a): Let $T = X + Y$. Then
$E(T) = E(X + Y) = E(X) + E(Y) = 80 + 150 = 230$ gm and
$V(T) = V(X + Y) = V(X) + V(Y) = 25 + 39 = 64$ gm², since $X$ and $Y$ are independent.
Hence $T \sim N(230, 64)$.

(b) Suppose two apples are selected at random. What is the distribution of the total weight, $T$?

Solution (b): Let $X_1$ measure the weight of the first apple and $X_2$ the weight of the second apple, and let $T = X_1 + X_2$. Then
$E(T) = E(X_1 + X_2) = E(X_1) + E(X_2) = 80 + 80 = 160$ gm and
$V(T) = V(X_1 + X_2) = V(X_1) + V(X_2) = 25 + 25 = 50$ gm², since $X_1$ and $X_2$ are independent.
Hence $T \sim N(160, 50)$.

(c) Suppose an apple was selected at random, and then another apple was found with exactly the same weight. What is the distribution of the total weight, $T$?

Solution (c): Let $T = 2X$. Then
$E(T) = E(2X) = 2E(X) = 2 \times 80 = 160$ gm and
$V(T) = V(2X) = 4V(X) = 4 \times 25 = 100$ gm².
Hence $T \sim N(160, 100)$.

Example 3: Male heights $M \sim N(68, 5)$ and female heights $F \sim N(63, 4)$, where heights are measured in inches. If a male and a female are selected at random, what is the probability that the female is taller than the male?

Solution 3: We are given that $E(M) = 68$, $V(M) = 5$, $E(F) = 63$ and $V(F) = 4$. Hence

$$P(F > M) = P(M - F < 0) = P(D < 0)$$

where $D = M - F$ is the difference between male and female heights. Thus
$E(D) = E(M - F) = E(M) - E(F) = 68 - 63 = 5$ inches and
$V(D) = V(M - F) = V(M) + V(F) = 5 + 4 = 9$ inches²,
so $D \sim N(5, 9)$. Hence

$$P(F > M) = P(D < 0) = P\left(\frac{D - 5}{3} < \frac{0 - 5}{3}\right) = P\left(Z < -\frac{5}{3}\right) \approx P(Z < -1.67) = 0.0475$$

where $Z \sim N(0, 1)$.
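The results of Theorem 1 for Examples 2 and 3 can be checked numerically. The following is a minimal sketch using Python's standard `statistics.NormalDist`; the helper name `linear_combination` is hypothetical and is not part of the lecture. Note that `NormalDist` takes a standard deviation, not a variance, and that the code evaluates $P(Z < -5/3)$ without rounding $z$ to $-1.67$, so it gives roughly 0.0478, not the table value 0.0475.

```python
from statistics import NormalDist


def linear_combination(a, x, b, y):
    """Distribution of T = a*X + b*Y for independent normal X and Y (Theorem 1)."""
    mean = a * x.mean + b * y.mean
    variance = a ** 2 * x.variance + b ** 2 * y.variance
    # NormalDist is parameterised by the standard deviation, hence the square root.
    return NormalDist(mean, variance ** 0.5)


# Example 2(a): total weight of one apple and one orange.
apple = NormalDist(80, 25 ** 0.5)
orange = NormalDist(150, 39 ** 0.5)
total = linear_combination(1, apple, 1, orange)

# Example 3: D = M - F, and P(F > M) = P(D < 0).
male = NormalDist(68, 5 ** 0.5)
female = NormalDist(63, 4 ** 0.5)
diff = linear_combination(1, male, -1, female)
p_female_taller = diff.cdf(0)

print(f"T ~ N({total.mean:.0f}, {total.variance:.0f})")
print(f"D ~ N({diff.mean:.0f}, {diff.variance:.0f})")
print(f"P(F > M) = {p_female_taller:.4f}")
```

Part (c) of Example 2 corresponds to `linear_combination(2, apple, 0, apple)`, which gives a variance of 100, not the 50 obtained in part (b) for two independent apples.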

Conclusion: The probability that a female will be taller than a male (if both are selected at random) is 0.0475 (i.e. a 4.75% chance).

11.3 The distribution of $\bar{X}$, the sample mean

If you compute the mean of a sample of 10 numbers chosen at random from a population of size N (N being large), the value you obtain will not equal the population mean exactly; by chance it will be a little higher or a little lower. If you sampled sets of 10 numbers over and over again (computing the mean for each set), you would find that some sample means come much closer to the population mean than others. Some would be higher than the population mean and some would be lower. Imagine sampling 10 numbers and computing the mean over and over again, say about 1,000 times, and then constructing a relative frequency distribution of those 1,000 means. Interest would then be in how the means are distributed. This distribution of means is a very good approximation to the sampling distribution of the mean.

11.3.1 Sampling Distribution of the Mean

Definition: The sampling distribution of the mean is a theoretical distribution that is approached as the number of samples in the relative frequency distribution increases. A sampling distribution can also be defined as the relative frequency distribution that would be obtained if all possible samples of a particular sample size were taken.

With 1,000 samples, the relative frequency distribution is quite close; with 10,000 it is even closer. As the number of samples approaches infinity, the relative frequency distribution approaches the sampling distribution of the mean. The sampling distribution of the mean for a sample size of 10 was just an example; there is a different sampling distribution for other sample sizes. Also, keep in mind that the relative frequency distribution approaches a sampling distribution as the number of samples increases, not as the sample size increases, since there is a different sampling distribution for each sample size. The sampling distribution of the mean is a very important distribution.
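The thought experiment above (1,000 samples of 10 numbers each) can be sketched in a few lines of Python. The population used here, the integers 1 to 10,000, is purely an assumption for illustration, as are the function name `simulate_sample_means`, the seed and the bin width; none of them come from the lecture.

```python
import random
from collections import Counter
from statistics import mean


def simulate_sample_means(population, sample_size=10, n_samples=1000, seed=0):
    """Draw n_samples random samples of sample_size values and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [mean(rng.sample(population, sample_size)) for _ in range(n_samples)]


# Hypothetical large population: the integers 1..10000 (population mean 5000.5).
population = list(range(1, 10001))
means = simulate_sample_means(population)

# Relative frequency distribution of the sample means, in bins of width 500.
bin_width = 500
counts = Counter(int(m // bin_width) * bin_width for m in means)
for start in sorted(counts):
    print(f"{start:>5} to {start + bin_width:<5} {counts[start] / len(means):.3f}")

print("mean of the sample means:", round(mean(means), 1))
```

Raising `n_samples` makes the printed relative frequencies settle towards the sampling distribution for samples of size 10; changing `sample_size` gives a different sampling distribution.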
In later lectures you will see that the sampling distribution of the mean is used to construct confidence intervals for the mean and for significance testing. Given a population with a mean of $\mu$ and a standard deviation of $\sigma$, the sampling distribution of the mean has a mean of $\mu$ and a standard deviation of $\sigma/\sqrt{n}$, where $n$ is the sample size.

Definition: The standard deviation of the sampling distribution of the mean is called the standard error of the mean and is given by the formula

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.$$

Note that the spread of the sampling distribution of the mean decreases as the sample size increases, as shown in the following diagram.

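In general, the larger the sample size the smaller the standard error.

11.3.2 Definition of a random sample of size n

The random variables $X_1, X_2, \ldots, X_n$ are called a random sample if
(i) they are independent of each other and
(ii) they have the same distribution.

Example: Suppose we select $n$ apples at random from an infinite population and weigh them. Let $X_i$ be the random variable measuring the weight of the $i$th apple for $i = 1, 2, 3, \ldots, n$. Then $X_1, X_2, X_3, \ldots, X_n$ is a random sample of size $n$.

Definition: The random variable $\bar{X}$ is the sample mean of the random sample $X_1, X_2, X_3, \ldots, X_n$, defined by

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

11.3.3 The Expectation and Variance of $\bar{X}$

Theorem 2. Let $X_1, X_2, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$. Then

$$E(\bar{X}) = E(X) = \mu \quad \text{and} \quad V(\bar{X}) = \frac{V(X)}{n} = \frac{\sigma^2}{n}.$$

Proof:

$$E(\bar{X}) = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu$$

$$V(\bar{X}) = V\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\,V\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n}$$

where the third equality holds since the $X_i$'s are independent, i.e. $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.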
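Theorem 2 can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it draws from a normal population with $\mu = 80$ and $\sigma = 5$ (borrowed from the apple example), uses 2,000 repeated samples per sample size, and the function name `empirical_mean_and_se` and the seed are hypothetical choices.

```python
import random
from math import sqrt
from statistics import fmean, pstdev


def empirical_mean_and_se(n, mu=80.0, sigma=5.0, n_samples=2000, seed=1):
    """Simulate n_samples sample means of size n; return their mean and standard deviation."""
    rng = random.Random(seed)
    sample_means = [
        fmean([rng.gauss(mu, sigma) for _ in range(n)])
        for _ in range(n_samples)
    ]
    return fmean(sample_means), pstdev(sample_means)


results = {}
for n in (4, 25, 100):
    results[n] = empirical_mean_and_se(n)
    centre, se = results[n]
    # Theorem 2 predicts a mean of mu and a standard error of sigma / sqrt(n).
    print(f"n={n:>3}  mean of means={centre:.2f}  "
          f"simulated SE={se:.3f}  sigma/sqrt(n)={5.0 / sqrt(n):.3f}")
```

The simulated standard error should track $\sigma/\sqrt{n}$ closely and shrink as $n$ grows, while the mean of the sample means stays near $\mu$.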

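Summary of the results:

Variable                   Population Mean or Expected Value    Population Variance
$X$                        $\mu$                                $\sigma^2$
$\bar{X}$ (sample mean)    $\mu$                                $\sigma^2/n$

Theorem 3. The distribution of $\bar{X}$

(a) Let $X_1, X_2, X_3, \ldots, X_n$ be a random sample from a $N(\mu, \sigma^2)$ population. Then $\bar{X} \sim N(\mu, \sigma^2/n)$.
(b) Let $X_1, X_2, \ldots, X_n$ be a random sample from ANY population (with a finite mean $\mu$ and finite variance $\sigma^2$). Then, approximately, $\bar{X} \sim N(\mu, \sigma^2/n)$ when $n$ is large. (The Central Limit Theorem, see below.)

Example: Suppose the weight of an apple is $X \sim N(80, 25)$. Let $X_1, X_2, X_3, \ldots, X_{100}$ be the weights of a random sample of 100 apples. Find (i) $P(X > 85)$ and compare this with (ii) $P(\bar{X} > 85)$.

Solution:
(i) $P(X > 85)$ where $X \sim N(80, 25)$:

$$P(X > 85) = P\left(Z > \frac{85 - 80}{5}\right) = P(Z > 1) = 0.16 \quad \text{where } Z \sim N(0, 1),$$

i.e. a 16% chance.

(ii) From Theorem 3, we have $\bar{X} \sim N\left(80, \frac{25}{100}\right)$, so

$$P(\bar{X} > 85) = P\left(Z > \frac{85 - 80}{0.5}\right) = P(Z > 10) = 0.0000 \quad \text{where } Z \sim N(0, 1).$$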
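The two tail probabilities in this example can be reproduced with the standard library. This is a minimal sketch; the helper `upper_tail` is a hypothetical name. It uses the complementary error function `math.erfc` because computing `1 - cdf` would round the tiny probability in part (ii) to exactly zero in floating point.

```python
from math import erfc, sqrt


def upper_tail(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), computed as 0.5 * erfc(z / sqrt(2))."""
    z = (x - mu) / sigma
    return 0.5 * erfc(z / sqrt(2))


mu, variance, n = 80, 25, 100

# (i) a single apple: X ~ N(80, 25), standard deviation 5
p_single = upper_tail(85, mu, sqrt(variance))

# (ii) the mean of 100 apples: standard error sqrt(25 / 100) = 0.5
p_mean = upper_tail(85, mu, sqrt(variance / n))

print(f"P(X > 85)    = {p_single:.4f}")
print(f"P(Xbar > 85) = {p_mean:.3e}")
```

The second probability is not literally zero, but it is far smaller than anything a four-decimal table can show, which is why the solution reports 0.0000.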

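[Figure: density f(x) plotted against x, for x from 70 to 90]

11.3.4 Calculating intervals for $\bar{X}$

Theorem 4. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$. A $100(1 - \alpha)\%$ interval for $\bar{X}$ is

$$\mu \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}.$$

Proof 4:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \Rightarrow \quad Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$

Steps:
1. Find an interval for $Z$.
2. Substitute for $Z$.
3. Rearrange to give an interval for $\bar{X}$.

Step 1.

$$P\left(-z_{\alpha/2} < Z < z_{\alpha/2}\right) = 1 - \alpha$$

Step 2. Substitute for $Z$:

$$P\left(-z_{\alpha/2} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = 1 - \alpha$$

Step 3.

$$P\left(\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Hence a $100(1 - \alpha)\%$ interval for $\bar{X}$ is

$$\left(\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

where $z_{\alpha/2}$ is given by $P(Z > z_{\alpha/2}) = \alpha/2$.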
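The three steps of Theorem 4 reduce to a one-line formula, which can be sketched as a small function. The name `interval_for_sample_mean` is hypothetical; the quantile $z_{\alpha/2}$ is obtained from the standard library's `NormalDist().inv_cdf`, and the inputs used below are the apple population $N(80, 25)$ with $n = 100$ from the earlier example.

```python
from math import sqrt
from statistics import NormalDist


def interval_for_sample_mean(mu, sigma, n, alpha=0.05):
    """100(1 - alpha)% interval for the sample mean of n draws from N(mu, sigma^2)."""
    # z_{alpha/2} is the point with upper-tail probability alpha/2 under N(0, 1).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / sqrt(n)
    return mu - half_width, mu + half_width


lower, upper = interval_for_sample_mean(mu=80, sigma=5, n=100, alpha=0.05)
print(f"95% interval for the sample mean: ({lower:.2f}, {upper:.2f})")
```

Because the interval is built from the known $\mu$ and $\sigma$, it describes where $\bar{X}$ is likely to fall; it is not a confidence interval for an unknown $\mu$.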

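Example: A random sample of 100 apples was taken from an $X \sim N(80, 25)$ population distribution. Find a 95% interval for $\bar{X}$, the sample mean of 100 apples.

Solution: First find the distribution of $\bar{X}$:

$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) = N\left(80, \frac{25}{100}\right) = N(80, 0.25)$$

Then find the 95% interval for $\bar{X}$:

$$\mu \pm z_{0.025}\,\frac{\sigma}{\sqrt{n}} = 80 \pm z_{0.025} \times 0.5 = 80 \pm 1.96 \times 0.5 = 80 \pm 0.98 = (79.02, 80.98) \approx (79.0, 81.0)$$

Conclusion: There is a probability of 0.95 (i.e. a 95% chance) that $\bar{X}$ (the sample mean weight of 100 apples) lies between 79.0 and 81.0 gm.

[Figure: density of the sample mean, plotted for values from 78 to 82, with the 95% interval for $\bar{X}$ marked]

11.3.5 Central Limit Theorem

Let $X_1, X_2, \ldots, X_n$ be a random sample from ANY distribution with finite mean $\mu$ and finite variance $\sigma^2$. Then, when $n$ is large,

$$\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{and} \quad \sum_{i=1}^{n} X_i \approx N(n\mu, n\sigma^2)$$

where $\approx$ means "has the approximate distribution".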

Example: Let $X_1, X_2, X_3, \ldots, X_n$ be a random sample from an Exponential population where $E(X) = \mu$ and $V(X) = \mu^2$. By the Central Limit Theorem, when $n$ is large,

$$\bar{X} \approx N\left(\mu, \frac{\mu^2}{n}\right).$$

Consider $r$ samples of size $n$ from the above distribution:

Sample    1          2          3          ...    n          Sample mean
1         $X_{11}$   $X_{12}$   $X_{13}$   ...    $X_{1n}$   $\bar{X}_1$
2         $X_{21}$   $X_{22}$   $X_{23}$   ...    $X_{2n}$   $\bar{X}_2$
3         $X_{31}$   $X_{32}$   $X_{33}$   ...    $X_{3n}$   $\bar{X}_3$
...
r         $X_{r1}$   $X_{r2}$   $X_{r3}$   ...    $X_{rn}$   $\bar{X}_r$

[Figures: sample pdf of $X$, $\hat{f}(x)$, and sample pdf of $\bar{X}$, $\hat{f}(\bar{x})$]

[Figures: population pdf of $X$, $f(x)$, and population pdf of $\bar{X}$, $f(\bar{x})$]

Example: Let $X_1, X_2, \ldots, X_n$ be a random sample from a uniform $U(a, b)$ distribution, where

$$E(X) = \frac{a + b}{2} \quad \text{and} \quad V(X) = \frac{(b - a)^2}{12}.$$

For example, let $X_1, X_2, \ldots, X_n$ be a random sample from a uniform $U(0, 100)$ distribution. By the Central Limit Theorem, when $n$ is large,

$$\bar{X} \approx N\left(50, \frac{100^2}{12n}\right).$$
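The uniform example lends itself to a quick simulation check of the Central Limit Theorem. The sketch below is an illustration under assumptions not taken from the lecture: the sample size $n = 30$, the number of repeated samples $r = 5000$, the seed and the function name `clt_check` are all arbitrary choices.

```python
import random
from math import sqrt
from statistics import fmean, pvariance


def clt_check(n, r=5000, seed=2):
    """Simulate r sample means of size n from U(0, 100) and compare with the CLT."""
    rng = random.Random(seed)
    means = [fmean([rng.uniform(0, 100) for _ in range(n)]) for _ in range(r)]
    predicted_var = 100 ** 2 / (12 * n)  # sigma^2 / n with sigma^2 = 100^2 / 12
    se = sqrt(predicted_var)
    # Under the normal approximation about 95% of means lie within 1.96 SE of 50.
    inside = sum(abs(m - 50) <= 1.96 * se for m in means) / r
    return fmean(means), pvariance(means), predicted_var, inside


centre, simulated_var, predicted_var, inside = clt_check(n=30)
print(f"mean of sample means   = {centre:.2f} (CLT: 50)")
print(f"variance of the means  = {simulated_var:.2f} (CLT: {predicted_var:.2f})")
print(f"share within 1.96 SE   = {inside:.3f} (normal approximation: 0.95)")
```

Replacing `rng.uniform(0, 100)` with `rng.expovariate(1 / 50)` gives the corresponding check for an Exponential population with mean 50, for which the predicted variance of $\bar{X}$ becomes $50^2/n$.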

Practical 7

1. A fair die is rolled 40 times. Find the probability that the mean of the 40 scores is more than 4.

2. If 30 observations are taken from a population with distribution given by the pdf
   $$f(x) = \begin{cases} \dfrac{2x}{9}, & 0 \le x \le 3 \\ 0, & \text{otherwise} \end{cases}$$
   a) Find the mean and variance of this population.
   b) Find the probability that the mean of the 30 observations is more than 1.85.

3. a) A population is known to have a standard deviation of 2.34 but unknown mean. A random sample of size 81 provided a mean of 9.7. For the population mean construct:
      (i) a 90% confidence interval;
      (ii) a 98% confidence interval;
      (iii) comment on the precision of the results.
   b) A random sample of size 50 was drawn from a normal population of variance 0.0. If the sample produced a mean of 13.45, construct a 99% confidence interval for the mean of the population.

4. In a class of 5 pupils, if each pupil rolls a fair die 30 times and records the number of sixes they obtain, find the probability that the mean number of sixes recorded for the class is less than 4.3.

5. A sample of 100 tins filled by a machine has an average weight equal to 510 gm and a standard deviation of weight equal to 1 gm.
   a) Construct a 95% confidence interval for the mean.
   b) Calculate the probability that the mean is less than 512 gm.

6. On the basis of a large sample, a 95% confidence interval for the population mean is (55.8, 60.4). Using this information compute:
   a) estimates of $\mu$ and the standard error;
   b) a 90% confidence interval for $\mu$.

7. The heights of female employees of a certain company have a mean of 170 cm and a standard deviation of 2.5 cm, while the heights of the male employees have a mean of 178 cm with a standard deviation of 2.1 cm. If random samples of 35 female and 45 male employees are taken and their heights recorded, find the probability that the mean height of the male employees is greater than that of the female employees by more than 9 cm.