
Eco 371 Problem Set # Answer Sheet

3.2 In this question, you are asked to consider a Bernoulli random variable $Y$ with success probability $\Pr(Y = 1) = p$. You are told that you have $n$ draws from this distribution and that $\hat{p}$ is the fraction of successes (i.e., the fraction of 1's).

a. The first part of the question asks you to show that $\hat{p} = \bar{Y}$. From our definition of $\bar{Y}$, we know that:

$$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i = \frac{1}{n}\,[\text{number of successes}] = \frac{\text{number of successes}}{n} = \hat{p}$$

b. The second part asks you to show that $\hat{p}$ is an unbiased estimator of $p$. Since the $Y_i$ are Bernoulli random variables, we know that

$$E(Y_i) = 1 \cdot \Pr(Y_i = 1) + 0 \cdot \Pr(Y_i = 0) = \Pr(Y_i = 1) = p. \tag{1}$$

Using this information and our result from part a, we have that

$$E(\hat{p}) = E(\bar{Y}) = \frac{1}{n} E\left(\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \frac{1}{n}\sum_{i=1}^{n} p = \frac{1}{n}\, n p = p$$

c. Finally, you are asked to show that $\mathrm{Var}(\hat{p}) = p(1-p)/n$. We know from Chapter 2, equation 2.7, that $\sigma_Y^2 = \mathrm{Var}(Y_i) = p(1-p)$. Using this information and the fact that the $Y_i$'s are independent, we then know from equation 2.45 in the text that:

$$\mathrm{Var}(\hat{p}) = \mathrm{Var}(\bar{Y}) = \frac{\sigma_Y^2}{n} = \frac{p(1-p)}{n}$$

3.3 You are told that, in a survey of 400 likely voters, 215 responded that they would vote for the incumbent and 185 would likely vote for the challenger. Let $p$ denote the fraction of all likely voters who preferred the incumbent at the time of the survey and $\hat{p}$ denote the fraction of the survey respondents who preferred the incumbent.

a. The first part of the question asks you to estimate $p$. We know from question 3.2 that an unbiased estimator of $p$ is $\bar{Y}$, so we could use as our estimator:

$$\hat{p} = \bar{Y} = \frac{215}{400} = 0.5375 \tag{2}$$
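As a numerical check on the results of question 3.2, the following Python sketch computes the exact mean and variance of $\hat{p}$ by summing over the binomial distribution of the number of successes. The function name `phat_moments` and the values `n = 10`, `p = 0.3` are illustrative assumptions, not part of the problem set.

```python
from math import comb

def phat_moments(n, p):
    """Exact mean and variance of p_hat = (number of successes) / n
    for n independent Bernoulli(p) draws (hypothetical helper)."""
    # Probability of observing exactly k successes, for k = 0..n.
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * pr for k, pr in enumerate(probs))
    var = sum((k / n - mean) ** 2 * pr for k, pr in enumerate(probs))
    return mean, var

n, p = 10, 0.3  # assumed values, chosen only for illustration
mean, var = phat_moments(n, p)

print("E(p_hat)   =", mean, " vs p        =", p)
print("Var(p_hat) =", var, " vs p(1-p)/n =", p * (1 - p) / n)

# The point estimate from question 3.3, part a.
p_hat_survey = 215 / 400
print("p_hat from the survey =", p_hat_survey)
```

Up to floating-point error, the two comparisons should agree for any choice of `n` and `p`, which is what parts b and c assert.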

b. From part c of question 3.2, we know that:

$$\mathrm{Var}(\hat{p}) = \frac{p(1-p)}{n}.$$

This suggests the following estimator for the variance of $\hat{p}$:

$$\widehat{\mathrm{Var}}(\hat{p}) = \frac{\hat{p}(1-\hat{p})}{n} = \frac{0.5375\,(1 - 0.5375)}{400} = 0.00062148.$$

The corresponding standard error estimate would be

$$SE(\hat{p}) = \sqrt{\widehat{\mathrm{Var}}(\hat{p})} = 0.02493. \tag{3}$$

c. You are asked to provide the p-value for the test $H_0: p = 0.5$ versus $H_1: p \neq 0.5$. In this case, $\sigma_Y$ is unknown and we would use equation 3.10 in the text, so that:

$$p\text{-value} = 2\Phi\left(-\left|\frac{\bar{Y}^{act} - \mu_{Y,0}}{SE(\bar{Y})}\right|\right) = 2\Phi\left(-\left|\frac{0.5375 - 0.5}{0.02493}\right|\right) = 2\Phi(-1.504) = 2(0.066) = 0.132.$$

3.10 You are told that a new standardized test is given to 100 randomly selected third-grade students in New Jersey. The sample average score on the test is $\bar{Y} = 58$ and the sample standard deviation is $s_Y = 8$.

a. The first part of the question asks you to construct a 95% confidence interval for the mean score of all New Jersey third graders. Using the information above, we know that:

$$SE(\bar{Y}) = \frac{s_Y}{\sqrt{n}} = \frac{8}{\sqrt{100}} = 0.8 \tag{4}$$

The corresponding 95% confidence interval is then given by:

$$\{\bar{Y} \pm 1.96\,SE(\bar{Y})\} = \{58 \pm 1.96(0.8)\} = \{56.432,\ 59.568\} \tag{5}$$

b. The same test is given to 200 third graders in Iowa, with a sample average of 62 points and a standard deviation of 11 points. You are asked to construct a 90% confidence interval for the difference in the mean test scores between Iowa and New Jersey. Using equation 3.19 from the text, we have that:

$$SE(\bar{Y}_{Iowa} - \bar{Y}_{NJ}) = \sqrt{\frac{s_{Iowa}^2}{n_{Iowa}} + \frac{s_{NJ}^2}{n_{NJ}}} = \sqrt{\frac{11^2}{200} + \frac{8^2}{100}} = \sqrt{0.605 + 0.64} = 1.1158$$

The corresponding confidence interval is then given by:

$$\{\bar{Y}_{Iowa} - \bar{Y}_{NJ} \pm 1.64\,SE(\bar{Y}_{Iowa} - \bar{Y}_{NJ})\} = \{62 - 58 \pm 1.64(1.1158)\} = \{4 \pm 1.83\} = \{2.17,\ 5.83\}$$
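The arithmetic in 3.3 (parts b and c) and 3.10 (parts a and b) can be reproduced with a short Python sketch. The helper names `normal_cdf`, `two_sided_p`, and `conf_interval` are assumptions introduced here for illustration; the standard normal CDF is built from `math.erf`, and the 90% critical value is taken as 1.64 to match the answer sheet (1.645 is the more precise value).

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def two_sided_p(estimate, null_value, se):
    """Two-sided p-value based on the large-sample normal approximation."""
    t = (estimate - null_value) / se
    return 2.0 * normal_cdf(-abs(t))

def conf_interval(estimate, se, z):
    """Symmetric interval: estimate +/- z * se."""
    return estimate - z * se, estimate + z * se

# Question 3.3: standard error of p_hat and test of H0: p = 0.5.
p_hat = 215 / 400
se_p = sqrt(p_hat * (1 - p_hat) / 400)
p_value = two_sided_p(p_hat, 0.5, se_p)

# Question 3.10a: 95% interval for the New Jersey mean.
se_nj = 8 / sqrt(100)
ci_nj = conf_interval(58, se_nj, 1.96)

# Question 3.10b: 90% interval for the Iowa minus New Jersey difference.
se_diff = sqrt(11**2 / 200 + 8**2 / 100)
ci_diff = conf_interval(62 - 58, se_diff, 1.64)

print(se_p, p_value)
print(se_nj, ci_nj)
print(se_diff, ci_diff)
```

Because the code carries full precision through each step, its output can differ from the hand-rounded figures in the last reported digit.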

c. Finally, you are asked to assess the degree of confidence you have in the proposition that the population means for Iowa and New Jersey are different. More specifically, you are asked for the standard error of the difference in the two sample means (which we have already calculated as 1.1158) and the p-value of the test of no difference in means versus some difference. The latter is given by:

$$p\text{-value} = 2\Phi\left(-\left|\frac{\bar{Y}_{Iowa} - \bar{Y}_{NJ} - 0}{SE(\bar{Y}_{Iowa} - \bar{Y}_{NJ})}\right|\right) = 2\Phi\left(-\left|\frac{4}{1.1158}\right|\right) = 2\Phi(-3.585) = 2(0.00017) = 0.00034.$$

Clearly, we would reject the null hypothesis of no difference between the two groups of students.

3.12 In this question, you are asked to consider evidence on the issue of gender discrimination in a firm. You are given data on the salaries of 100 men and 64 women with similar job descriptions.

a. The first question asks what the data suggest in terms of the wage differences between men and women. The corresponding null and alternative hypotheses would be:

$$H_0: E(Y_{men}) - E(Y_{women}) = 0$$
$$H_1: E(Y_{men}) - E(Y_{women}) \neq 0$$

The data provided in the table indicate that:

$$\bar{Y}^{act}_{men} = 3100, \quad \bar{Y}^{act}_{women} = 2900, \quad s_{men} = 200, \quad s_{women} = 320$$

Using this information:

$$SE(\bar{Y}_{men} - \bar{Y}_{women}) = \sqrt{\frac{s_{men}^2}{n_{men}} + \frac{s_{women}^2}{n_{women}}} = \sqrt{\frac{200^2}{100} + \frac{320^2}{64}} = \sqrt{400 + 1600} = 44.72$$

This in turn can be used to calculate the appropriate t-statistic as:

$$t^{act} = \frac{\bar{Y}_{men} - \bar{Y}_{women} - 0}{SE(\bar{Y}_{men} - \bar{Y}_{women})} = \frac{200}{44.72} = 4.47$$

Finally, we can compute the corresponding p-value as:

$$p\text{-value} = 2\Phi(-|t^{act}|) = 2\Phi(-4.47) \approx 0.$$
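Both difference-in-means tests (3.10c and 3.12a) follow the same recipe, which the Python sketch below makes explicit. The function name `diff_in_means_test` is a hypothetical helper, and the p-value uses the large-sample normal approximation, as in the answers above.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def diff_in_means_test(mean1, s1, n1, mean2, s2, n2):
    """Test H0: E(Y1) - E(Y2) = 0 against a two-sided alternative.

    Returns the standard error of the difference, the t-statistic,
    and the two-sided p-value from the normal approximation.
    """
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    t = (mean1 - mean2) / se
    p_value = 2.0 * normal_cdf(-abs(t))
    return se, t, p_value

# Question 3.10c: Iowa versus New Jersey test scores.
se_scores, t_scores, p_scores = diff_in_means_test(62, 11, 200, 58, 8, 100)

# Question 3.12a: salaries of men versus women.
se_wages, t_wages, p_wages = diff_in_means_test(3100, 200, 100, 2900, 320, 64)

print(se_scores, t_scores, p_scores)
print(se_wages, t_wages, p_wages)
```

The p-value for the salary comparison is not literally zero; it is simply too small to show up at the precision of a standard normal table.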

b. In the second part of the question, you are asked to assess whether the data suggest that the firm is guilty of gender discrimination in its compensation policies. With the extremely small p-value, the null hypothesis can be rejected with a high degree of confidence. There is overwhelming statistical evidence that mean earnings for men are different from mean earnings for women. However, by itself, this does not imply gender discrimination by the firm. Gender discrimination means that two workers, identical in every way but gender, are paid different wages. The data description suggests that some care has been taken to make sure that workers with similar jobs are being compared. But it is also important to control for characteristics of the workers that may affect their productivity (education, years of experience, etc.). If these characteristics are systematically different between men and women, then they may be responsible for the difference in mean wages. If this is true, it raises an interesting and important question of why women tend to have less education or less experience than men, but that is a question about something other than gender discrimination by this firm. Since these characteristics are not controlled for in the statistical analysis, it is premature to reach a conclusion about gender discrimination.

3.15 This question focuses on the outcomes of two polls around the time of the 2004 elections. The first poll was in September of 2004, with 405 of 755 likely voters preferring Bush, whereas the second poll, in October of 2004, found 378 out of 756 likely voters favoring Bush.

a. The first question asks you to construct a 95% confidence interval for the fraction of likely voters favoring Bush in September 2004. As an aside, what you are doing here is actually constructing a confidence interval on the estimator of this fraction for a similar sample of 755 likely voters. Using the data, we know that:

$$\bar{Y}^{act}_{Sept} = \frac{405}{755} = 0.536 \tag{6}$$

An estimate of the variance of this estimator would be

$$\widehat{\mathrm{Var}}(\bar{Y}_{Sept}) = \frac{\bar{Y}^{act}_{Sept}\,(1 - \bar{Y}^{act}_{Sept})}{n_{Sept}} = \frac{0.536\,(1 - 0.536)}{755} = 0.00032941.$$

Using this,

$$SE(\bar{Y}_{Sept}) = \sqrt{\widehat{\mathrm{Var}}(\bar{Y}_{Sept})} = 0.01815. \tag{7}$$

The corresponding confidence interval would be given by:

$$\{\bar{Y}_{Sept} \pm 1.96\,SE(\bar{Y}_{Sept})\} = \{0.536 \pm 1.96(0.01815)\} = \{0.500,\ 0.572\} \tag{8}$$

b. The second question asks you to repeat these steps for the October 2004 poll. In this case

$$\bar{Y}^{act}_{Oct} = \frac{378}{756} = 0.5 \tag{9}$$

An estimate of the variance of this estimator would be

$$\widehat{\mathrm{Var}}(\bar{Y}_{Oct}) = \frac{\bar{Y}^{act}_{Oct}\,(1 - \bar{Y}^{act}_{Oct})}{n_{Oct}} = \frac{0.5\,(1 - 0.5)}{756} = 0.00033069.$$

Using this,

$$SE(\bar{Y}_{Oct}) = \sqrt{\widehat{\mathrm{Var}}(\bar{Y}_{Oct})} = 0.01818. \tag{10}$$

The corresponding confidence interval would be given by:

$$\{\bar{Y}_{Oct} \pm 1.96\,SE(\bar{Y}_{Oct})\} = \{0.5 \pm 1.96(0.01818)\} = \{0.464,\ 0.536\} \tag{11}$$
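The two poll intervals use the same three steps (sample fraction, standard error, interval), sketched below in Python. The function name `proportion_ci` is an illustrative assumption. The October sample size of 756 is the value implied by the reported variance estimate, since 0.25 / 0.00033069 is approximately 756.

```python
from math import sqrt

def proportion_ci(successes, n, z=1.96):
    """Sample fraction, its standard error, and the interval
    p_hat +/- z * SE(p_hat) for a Bernoulli sample of size n."""
    p_hat = successes / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se, (p_hat - z * se, p_hat + z * se)

# September 2004 poll: 405 of 755 likely voters.
p_sept, se_sept, ci_sept = proportion_ci(405, 755)

# October 2004 poll: 378 of 756 likely voters.
p_oct, se_oct, ci_oct = proportion_ci(378, 756)

print(p_sept, se_sept, ci_sept)
print(p_oct, se_oct, ci_oct)
```

The sketch does not round 405/755 to 0.536 before computing the interval, so the September endpoints can differ from the hand calculation in the third decimal place.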

c. The last question asks whether or not there was a significant change in voters' opinions across the two surveys. This is just a test of the difference in means between the two samples. That is, we have the null hypothesis

$$H_0: E(Y_{Sept}) - E(Y_{Oct}) = 0 \tag{12}$$

versus the alternative hypothesis:

$$H_1: E(Y_{Sept}) - E(Y_{Oct}) \neq 0 \tag{13}$$

To construct the corresponding t-statistic, we will need sample variances for both polls. That is, we need:

$$s^2_{Sept} = \bar{Y}_{Sept}\,(1 - \bar{Y}_{Sept}) = 0.536\,(1 - 0.536) = 0.2487 \tag{14}$$

and

$$s^2_{Oct} = \bar{Y}_{Oct}\,(1 - \bar{Y}_{Oct}) = 0.5\,(1 - 0.5) = 0.25 \tag{15}$$

This can then be used to form

$$SE(\bar{Y}_{Sept} - \bar{Y}_{Oct}) = \sqrt{\frac{s^2_{Sept}}{n_{Sept}} + \frac{s^2_{Oct}}{n_{Oct}}} = \sqrt{\frac{0.2487}{755} + \frac{0.25}{756}} = \sqrt{0.00032941 + 0.00033069} = 0.0257$$

This in turn can be used to calculate the appropriate t-statistic as:

$$t^{act} = \frac{\bar{Y}_{Sept} - \bar{Y}_{Oct} - 0}{SE(\bar{Y}_{Sept} - \bar{Y}_{Oct})} = \frac{0.036}{0.0257} = 1.40$$

Finally, we can compute the corresponding p-value as:

$$p\text{-value} = 2\Phi(-|t^{act}|) = 2\Phi(-1.40) = 2(0.0808) = 0.162.$$

With this high a p-value, we should be reluctant to reject the null hypothesis that there has been no change in voters' opinions.
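The test in part c can be re-run from the raw poll counts with the Python sketch below. The function name `diff_in_proportions_test` is a hypothetical helper, and the October sample size of 756 is the value implied by the variance estimate 0.00033069 reported for that poll. With unrounded inputs the standard error is about 0.0257 and the t-statistic is about 1.42, so some difference from hand-rounded figures is expected; the p-value stays well above conventional significance levels.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def diff_in_proportions_test(x1, n1, x2, n2):
    """Test H0: E(Y1) - E(Y2) = 0 for two independent Bernoulli samples.

    x1 and x2 are success counts; n1 and n2 are sample sizes.
    Returns the standard error, t-statistic, and two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    # For Bernoulli data the sample variance is estimated by p_hat * (1 - p_hat).
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    t = (p1 - p2) / se
    p_value = 2.0 * normal_cdf(-abs(t))
    return se, t, p_value

se_polls, t_polls, p_polls = diff_in_proportions_test(405, 755, 378, 756)
print(se_polls, t_polls, p_polls)

# Decision at the 5% level: reject only if the p-value is below 0.05.
reject_at_5pct = p_polls < 0.05
print("Reject H0 at the 5% level:", reject_at_5pct)
```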