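The Hong Kong University of Science & Technology
IOM55 Introductory Statistics for Business
Assignment 3 Suggested Solution

Note: All values of statistics in Q2 and Q4 are obtained by Excel.

Q1a. Let $p$ be the probability that a consumer correctly identifies Pepsi. Since a randomly selected consumer is claimed to be equally likely to identify Pepsi or not, $p = 0.5$. Since $n = 200$ is large, by the Central Limit Theorem the sample proportion satisfies $\hat P \sim N\!\left(0.5,\ \frac{0.5 \times 0.5}{200}\right)$ approximately. Therefore,

$$P(0.45 < \hat P < 0.6) = P\!\left(\frac{0.45 - 0.5}{\sqrt{0.5 \times 0.5 / 200}} < Z < \frac{0.6 - 0.5}{\sqrt{0.5 \times 0.5 / 200}}\right) = P(-1.4142 < Z < 2.8284)$$

$$= P(-1.4142 < Z < 0) + P(0 < Z < 2.8284) = P(0 < Z < 1.4142) + P(0 < Z < 2.8284) = 0.4207 + 0.4976 = 0.9183.$$

Q1b. A statistical point estimate for the true proportion of consumers who prefer Pepsi, that is, the population proportion $p$, is the sample proportion $\hat P$. In this case, with sample size 200, the point estimate of $p$ is $\hat P = 0.6$. The point estimate is good because $\hat P$ is unbiased and has quite a small standard error, $se(\hat P) = \sqrt{\frac{0.6 \times 0.4}{200}} \approx 0.035$.

Q1c. When $\hat P = 0.6$, the maximum error of estimation is

$$Z_{\alpha/2}\sqrt{\frac{\hat P(1 - \hat P)}{n}} = Z_{0.05}\sqrt{\frac{0.6 \times 0.4}{200}} = 1.645\sqrt{\frac{0.6 \times 0.4}{200}} = 0.05698.$$

Q1d. Let $N$ be the new sample size required such that the new maximum error is less than or equal to 0.02 with 98% confidence. The maximum error in this case is $2.33\sqrt{\frac{0.6 \times 0.4}{N}}$, so

$$2.33\sqrt{\frac{0.6 \times 0.4}{N}} \le 0.02 \;\Rightarrow\; N \ge \left(\frac{2.33}{0.02}\right)^2 (0.6 \times 0.4) = 3257.34 \;\Rightarrow\; N = 3258.$$

That means we need 3258 - 200 = 3058 more consumers to be included in the sample.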
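The arithmetic in Q1 can be re-checked with a few lines of Python. This is only a sketch: the sample size n = 200 and the target error 0.02 are assumptions inferred from the z-values and the final answer, and the standard library's exact normal CDF is used in place of a printed z-table, so the Q1a probability comes out close to, but not identical to, the table-based 0.9183.

```python
from math import sqrt, ceil
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

# Q1a: P(0.45 < P_hat < 0.6) under p = 0.5 (n = 200 is an assumption)
p, n = 0.5, 200
se_null = sqrt(p * (1 - p) / n)
prob = Z.cdf((0.6 - p) / se_null) - Z.cdf((0.45 - p) / se_null)

# Q1b / Q1c: standard error and 90% maximum error at p_hat = 0.6
p_hat = 0.6
se_hat = sqrt(p_hat * (1 - p_hat) / n)
margin_90 = 1.645 * se_hat  # 1.645 is the table value of Z_0.05

# Q1d: smallest N whose 98% maximum error is at most 0.02
target = 0.02  # assumed target error
N = ceil((2.33 / target) ** 2 * p_hat * (1 - p_hat))  # 2.33 is the table value of Z_0.01

print(round(prob, 4), round(se_hat, 4), round(margin_90, 5), N, N - n)
```

Rounding the required sample size up with `ceil` matters here: rounding 3257.34 to the nearest integer would leave the maximum error slightly above the target.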
Q2a. Let $X$ be the random variable of age. Since $n$ is large, by the Central Limit Theorem we have $\bar X \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$. So a 95% confidence interval for $\mu$ is given by

$$\left[\bar x \pm 1.96\,\frac{s}{\sqrt{n}}\right] = \left[38.3946 \pm 1.96\,\frac{s}{\sqrt{n}}\right] = [36.00,\ 40.79].$$

Q2bi. Let $X$ be the starting salary of a male employee and $Y$ be the starting salary of a female employee. Since the sample sizes of male and female starting salaries are both large, i.e. 51 and 49, by the Central Limit Theorem we have

$$\bar X - \bar Y \sim N\!\left(\mu_X - \mu_Y,\ \frac{\sigma_X^2}{51} + \frac{\sigma_Y^2}{49}\right).$$

A 99% confidence interval for the difference between the mean male starting salary and the mean female starting salary is given by

$$\left[(\bar x - \bar y) \pm 2.575\sqrt{\frac{s_X^2}{51} + \frac{s_Y^2}{49}}\right] = [86.8,\ 5757.8].$$

The data do not support the claim that the average starting salary of males is the same as the average starting salary of females, because the 99% confidence interval does not enclose 0.

Q2bii. If there are only 30 observations in the sample, then the sample sizes of both male and female starting salaries will be less than 30. Since neither sample is large enough to apply the Central Limit Theorem, we cannot calculate the above confidence interval using the normal distribution.

Q2c. Let $X$ denote the current salary of the employees in the bank; then the mean current salary of the employees in the bank can be estimated by the sample mean current salary 4536.5 (Excel). Suppose $n$ is the required sample size. Since $\bar X \sim N\!\left(\mu, \frac{\sigma^2}{n}\right)$ and we estimate $\sigma$ by $s = 7393.45$, the width of a 99% confidence interval is proportional to $2.575\, s/\sqrt{n}$, and requiring it to stay within the stated bound gives a lower bound on $n$. Hence the current sample is not large enough and 454 - 100 = 354 more data should be collected.

Q2d. If we want to compare the true mean current salary (say $\mu_1$) and the true mean starting salary (say $\mu_2$) at 95% confidence, we may construct a 95% confidence interval to estimate the difference between the two true means. The standard 95% confidence interval of $\mu_1 - \mu_2$ is

$$\left[(\bar x_1 - \bar x_2) \pm 1.96\sqrt{\frac{s_1^2}{n} + \frac{s_2^2}{n}}\right] = [5687,\ 89].$$

However, the above standard C.I. is not appropriate because the two samples are dependent: both salaries are measured on the same employees. In this case, we do a paired difference test of $H_0: \mu_D = 0$ vs $H_1: \mu_D \ne 0$ with $\alpha = 0.05$. Since $n$ is large, by the Central Limit Theorem and assuming $H_0$ is true, we have $\bar D \sim N\!\left(0, \frac{\sigma_D^2}{n}\right)$. The test statistic is $Z = \frac{\bar D}{S_D/\sqrt{n}}$, and the value of the test statistic is

$$z = \frac{\bar d}{s_D/\sqrt{n}} = 6.5 > 1.96 = Z_{0.025} = Z_{\alpha/2}.$$

So we reject $H_0$ at the 5% level of significance and conclude that the true means of the two salaries are significantly different.

Q3a. Let $X_1$ be the salary of a male factory worker and $X_2$ be the salary of a female factory worker. Since it is assumed that the salary of male factory workers is larger than that of female factory workers, the test we want is $H_0: \mu_1 - \mu_2 = 1.5$ vs $H_1: \mu_1 - \mu_2 > 1.5$ (one-sided test). Since the two sample sizes $n_1 = 11$ and $n_2 = 12$ are small, we assume the samples have equal variance so that we can pool them into a larger sample and estimate a pooled sample variance

$$S_p^2 = \frac{(11 - 1)\,5.8^2 + (12 - 1)\,6.6^2}{11 + 12 - 2}$$

for the combined sample. Now $S_p\sqrt{\frac{1}{11} + \frac{1}{12}} = 2.6013$ and the critical value is $t_{0.05}^{(21)} = 1.721$. The test statistic has value

$$t = \frac{(41.65 - 35.25) - 1.5}{2.6013} = 1.8836 > 1.721 = t_{0.05}^{(21)},$$

so we reject $H_0$ at the 5% level of significance, and we have sufficient evidence to conclude that male factory workers earn over $1.5 more than female factory workers.

Q3b. Recall the test statistic formula and notice that $t_{0.01}^{(21)} = 2.518$. Let $D$ be the required difference in mean salaries for the two samples of factory workers. In order to reject $H_0: \mu_1 - \mu_2 = 1.5$, $D$ must satisfy

$$\frac{D - 1.5}{2.6013} > 2.518 \;\Rightarrow\; D > 8.05.$$

That means the difference would have to be more than $8.05 in order to reject the null at the 1% level of significance.
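The pooled-variance calculation in Q3a and Q3b can be sketched in Python as follows. The sample statistics (means 41.65 and 35.25, standard deviations 5.8 and 6.6, sizes 11 and 12) and the hypothesized difference 1.5 are assumptions reconstructed from the worked figures, and the helper name `pooled_t` is purely illustrative. The critical values are taken from a t-table for 21 degrees of freedom, since the Python standard library has no t-distribution.

```python
from math import sqrt

def pooled_t(mean1, s1, n1, mean2, s2, n2, d0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = d0 (hypothetical helper)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    # Standard error of the difference in sample means
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - d0) / se
    return t, se, df

# Assumed inputs, reconstructed from the solution's arithmetic
t_stat, se, df = pooled_t(41.65, 5.8, 11, 35.25, 6.6, 12, d0=1.5)

T_05, T_01 = 1.721, 2.518          # t-table values for 21 degrees of freedom
reject_5pct = t_stat > T_05        # Q3a decision (one-sided test)
min_diff_1pct = 1.5 + T_01 * se    # Q3b: smallest difference rejecting at 1%

print(round(t_stat, 4), round(se, 4), df, reject_5pct, round(min_diff_1pct, 2))
```

Returning the standard error alongside the statistic is a convenience: Q3b and the confidence interval in Q3c reuse the same quantity.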
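Q3c. Notice that the test statistic is

$$T = \frac{(\bar X_1 - \bar X_2) - D}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},$$

where $D$ denotes the difference $\mu_1 - \mu_2$, and it follows a Student-t distribution with $(11 + 12 - 2) = 21$ degrees of freedom. Also $t_{0.05}^{(21)} = 1.721$, so a 90% confidence interval of the true difference in male and female factory worker salaries is given by

$$\left[(\bar x_1 - \bar x_2) \pm t_{0.05}^{(21)}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}\right] = [6.40 \pm 1.721 \times 2.6013] = [1.923,\ 10.877].$$

Q3d. Let $X_1$ be the salary of male stock brokers and $X_2$ be the salary of female stock brokers. Then the required test is $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 \ne \mu_2$ (two-sided test). The test statistic given the null has value

$$z = \frac{\bar x_1 - \bar x_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = 2.388 > 2.33 = Z_{0.01}$$

(the two-sided critical value when $\alpha = 0.02$ under the standard normal distribution), so we reject $H_0$ at the 2% level of significance. We have sufficient evidence to say that male and female stock brokers have different average annual salaries. The p-value of the test is $P(|Z| > 2.388) = 2P(Z > 2.388) = 0.0168$.

Q3e. The probability of committing a Type I error is $P(\text{Reject } H_0 \mid H_0 \text{ is true}) = \alpha = 0.02$. If a Type I error is made, $H_0$ is actually true, that is, the true average annual salary of male stock brokers is equal to that of female stock brokers, even though the test rejected $H_0$.

Q3f. Generally, the best point estimator of a population mean $\mu$ is the sample mean $\bar X$. But in case $H_0$ is true, we may pool the two samples together and estimate the true average annual salary earned by the pooled sample mean annual salary. So our realized point estimate of the true average annual salary earned is

$$\bar x = \frac{n_1 \bar x_1 + n_2 \bar x_2}{n_1 + n_2},$$

or around $6556.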
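As a quick check of Q3c and Q3d, the sketch below recomputes the 90% interval and the two-sided p-value. The inputs (difference 6.40, standard error 2.6013, table value 1.721, and z = 2.388) are assumptions read off the worked solution. The p-value uses the exact normal CDF from the standard library, so it can differ in the last digit from a value read off a z-table.

```python
from statistics import NormalDist

# Q3c: 90% confidence interval for mu1 - mu2 (inputs assumed from the solution)
diff, se, t_crit = 6.40, 2.6013, 1.721  # t_crit: t-table value, 21 df
ci_low = diff - t_crit * se
ci_high = diff + t_crit * se

# Q3d: two-sided p-value for the observed z statistic
z = 2.388
p_value = 2 * (1 - NormalDist().cdf(z))

# Decision at the 2% level, which is equivalent to comparing z with Z_0.01
reject_2pct = p_value < 0.02

print(round(ci_low, 3), round(ci_high, 3), round(p_value, 4), reject_2pct)
```

Comparing the p-value with the significance level and comparing the statistic with the critical value are the same decision rule, which is why both lead to rejecting the null here.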
Q4a.

Weight of Section (first table)

  Statistic                          weight
  count                              127
  mean                               53.876
  sample variance                    106.67
  sample standard deviation          10.328
  minimum                            38
  maximum                            101
  range                              63
  population variance                105.83
  population standard deviation      10.287
  standard error of the mean         0.916
  skewness                           .7
  kurtosis                           4.399
  coefficient of variation (CV)      19.17%
  1st quartile                       46.9
  median                             5.
  3rd quartile                       58.
  interquartile range                .
  mode                               5.
  low extremes
  low outliers
  high outliers                      5
  high extremes

Weight of Section (second table)

  Statistic                          weight
  count                              76
  mean                               50.86
  sample variance                    72.407
  sample standard deviation          8.509
  minimum                            40
  maximum                            87
  range                              47
  population variance                71.454
  population standard deviation      8.453
  standard error of the mean         0.976
  skewness                           .539
  kurtosis                           3.55
  coefficient of variation (CV)      16.73%
  1st quartile                       45.0
  median                             5.
  3rd quartile                       54.5
  interquartile range                9.5
  mode                               5.
  low extremes
  low outliers
  high outliers
  high extremes
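The entries in a descriptive-statistics table like these are tied together by simple identities, which gives a way to sanity-check a summary. The sketch below does this for the second table, assuming count 76, sample variance 72.407 and mean 50.86; these inputs are reconstructions, not confirmed Excel output.

```python
from math import sqrt

# Assumed inputs for the second table (reconstructed values)
n, sample_var, mean = 76, 72.407, 50.86

# Population variance rescales the sample variance by (n - 1) / n
pop_var = sample_var * (n - 1) / n

sample_sd = sqrt(sample_var)
pop_sd = sqrt(pop_var)

# Standard error of the mean uses the sample standard deviation
se_mean = sample_sd / sqrt(n)

# Coefficient of variation: sample standard deviation relative to the mean
cv = sample_sd / mean

print(round(pop_var, 3), round(sample_sd, 3), round(pop_sd, 3),
      round(se_mean, 3), f"{cv:.2%}")
```

The same identities applied to the first table (sample variance 106.67 against population variance 105.83) are consistent with a count of 127.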
[Figure: Frequency Distribution of Weight of Section (first table). Histogram; vertical axis: Frequency; horizontal axis: weight bins starting at 38, ending with "More".]

[Figure: Frequency Distribution of Weight of Section (second table). Histogram; vertical axis: Frequency; horizontal axis: weight bins starting at 40, ending with "More".]
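Q4b. Let $X_1$ be the number of hours spent on using a computer per week by science students and $X_2$ be that of non-science students. Let us do a test of $H_0: \mu_1 = \mu_2$ vs $H_1: \mu_1 > \mu_2$ with $\alpha = 0.05$. Since under $H_0$, $\bar X_1 - \bar X_2 \sim N\!\left(0,\ \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$, the value of the test statistic is given by

$$z = \frac{\bar x_1 - \bar x_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = 0.6893 < 1.645 = Z_{0.05}.$$

So $H_0$ is not rejected at the 5% level of significance, and we can't conclude that science students spend more time on using computers than the other students.

Q4c. Let $\hat P_1$ be the proportion of non-art students familiar with mathematics and $\hat P_2$ be that of art students. Let us test $H_0: p_1 = p_2$ vs $H_1: p_1 > p_2$ at the $\alpha = 0.05$ level of significance, a one-sided test, since we assumed that non-art students have a higher proportion familiar with mathematics. Under $H_0$, the sampling distribution of $\hat P_1 - \hat P_2$ is given by

$$\hat P_1 - \hat P_2 \sim N\!\left(0,\ \hat P(1 - \hat P)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right), \quad \text{where } \hat P = \frac{n_1 \hat P_1 + n_2 \hat P_2}{n_1 + n_2} = \frac{95(0.874) + 115(0.77)}{95 + 115} = 0.817.$$

So the value of the test statistic is

$$z = \frac{\hat P_1 - \hat P_2}{\sqrt{\hat P(1 - \hat P)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.874 - 0.77}{\sqrt{0.817(1 - 0.817)\left(\frac{1}{95} + \frac{1}{115}\right)}} = 1.94 > 1.645 = Z_{0.05}.$$

So we reject $H_0$ at the 5% level of significance, and the observed data support what the analyst said. The probability of a Type I error in this case is $P(\text{Reject } H_0 \mid H_0) = \alpha = 0.05$.

Q4d. In general, we may regard

$P(\text{Reject } H_0 \mid H_0) = \alpha$ = probability of Type I error = probability of a false negative of $H_0$;
$P(\text{Do not reject } H_0 \mid H_1) = \beta$ = probability of Type II error = probability of a false positive of $H_0$;
the power of the test $= P(\text{Reject } H_0 \mid H_1) = 1 - \beta$ = probability of a true negative of $H_0$;
$P(\text{Do not reject } H_0 \mid H_0) = 1 - \alpha$ = probability of a true positive of $H_0$.

In real applications, we usually do the test for a pair of hypotheses by assuming $H_0$ and focusing on the sampling distribution under $H_0$. So we can easily control the probability of a false negative of $H_0$, $\alpha$, and the probability of a true positive of $H_0$, $1 - \alpha$. But then we would ignore the other two probabilities, in particular the probability of a false positive of $H_0$, $\beta$, which makes the result of the whole test less reliable. So in practice we also need to consider the Type II error and the power of the test to make the test result more reliable.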
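The pooled two-proportion test in Q4c can be sketched as below. The sample sizes 95 and 115 and the proportions 0.874 and 0.77 are assumptions reconstructed from the pooled estimate and the test statistic, and `two_prop_z` is a hypothetical helper name, not a library function.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z(p1, n1, p2, n2):
    """Pooled z statistic for H0: p1 = p2 (hypothetical helper)."""
    # Under H0 both samples share one proportion, so pool them
    pooled = (n1 * p1 + n2 * p2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se, pooled

# Assumed inputs: non-art students first, art students second
z, pooled = two_prop_z(0.874, 95, 0.77, 115)

alpha = 0.05
crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value, about 1.645
reject = z > crit

print(round(pooled, 3), round(z, 2), round(crit, 3), reject)
```

Here `alpha` is exactly the Type I error probability the test is designed to have; the Type II error probability cannot be read off this calculation, because it depends on the true difference between the two proportions.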