PPOL 59-3 Problem Set Exercises in Simple Regression
Due in class /8/7
Problem Set # SOLUTIONS

In this problem set, you are asked to compute various statistics by hand to give you a better sense of the mechanics of the Pearson correlation coefficient, OLS/SLR (Ordinary Least Squares/Simple Linear Regression), and the associated output (R-square, standard error, t-statistics). I recommend that you use Excel or another spreadsheet program to calculate everything asked. There are relevant examples for the Pearson correlation coefficient and OLS/SLR on pages 5 and 5 of Course Notes #.

[The following data appear in Wooldridge Q.3.] The table below contains the ACT score and college GPA for eight college students.

    Student ID   College GPA   ACT Score
    1            2.8           21
    2            3.4           24
    3            3.0           26
    4            3.5           27
    5            3.6           29
    6            3.0           25
    7            2.7           25
    8            3.7           30
    Mean         3.2125        25.875

1. Plot the relationship between GPA and ACT scores. (Place GPA scores on the vertical axis and ACT scores on the horizontal axis.) How do the two measures appear to be related?

[Figure: scatterplot titled "College GPA and ACT Score", with College GPA on the vertical axis and ACT Score on the horizontal axis.]
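The points on this scatterplot suggest a positive relationship between college GPA and ACT score.

2. Compute the Pearson correlation coefficient and compute whether or not r is statistically significant. Are GPA and ACT correlated at all? If so, are they positively correlated or negatively correlated? I recommend you set up a table with the following columns and use the following formulas to compute r and test for significance. How many degrees of freedom will you use in this t-test? What are the relevant critical values for p < .05 and p < .10?

For this t-test, n = 8 and there are two variables, so n - 2 = 8 - 2 = 6 degrees of freedom. The relevant critical values for a 2-tailed t-test with 6 degrees of freedom are 1.943 (p < .10) and 2.447 (p < .05).

In the table below, X is the ACT score and Y is college GPA.

    Student   College   ACT                                                     (X - X̄)
    ID        GPA       Score   X - X̄     (X - X̄)²    Y - Ȳ      (Y - Ȳ)²    × (Y - Ȳ)
    1         2.8       21      -4.875    23.76563    -0.4125    0.170156     2.010938
    2         3.4       24      -1.875     3.515625    0.1875    0.035156    -0.35156
    3         3.0       26       0.125     0.015625   -0.2125    0.045156    -0.02656
    4         3.5       27       1.125     1.265625    0.2875    0.082656     0.323438
    5         3.6       29       3.125     9.765625    0.3875    0.150156     1.210938
    6         3.0       25      -0.875     0.765625   -0.2125    0.045156     0.185938
    7         2.7       25      -0.875     0.765625   -0.5125    0.262656     0.448438
    8         3.7       30       4.125    17.01563     0.4875    0.237656     2.010938
    Sum                          0        56.875       8.88E-16  1.02875      5.8125
    Mean      3.2125    25.875

(The 8.88E-16 in the Sum row is spreadsheet rounding error; the exact sum of the deviations is 0.)

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{5.8125}{\sqrt{56.875 \times 1.02875}} = \frac{5.8125}{7.649193} = 0.759884$$

$$t = r\sqrt{\frac{n - 2}{1 - r^2}} = 0.759884 \times \sqrt{\frac{6}{1 - 0.759884^2}} = 2.863324$$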
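The table-and-formula calculation above can be cross-checked with a short script. This is only a sketch, not part of the assignment: it uses plain Python in place of a spreadsheet, and the function name `pearson_r` is an illustrative choice. The data are the eight (ACT, GPA) pairs from the table, and the critical values are the ones quoted for 6 degrees of freedom.

```python
import math

# (ACT, GPA) pairs for the eight students in the data table.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]

def pearson_r(x, y):
    """Return (r, t), where t tests H0: rho = 0 with n - 2 degrees of freedom."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1 - r ** 2))
    return r, t

r, t = pearson_r(act, gpa)
print(f"r = {r:.6f}, t = {t:.4f}")

# Two-tailed critical values for 6 degrees of freedom, as quoted in the text.
print("significant at the 10 percent level:", abs(t) > 1.943)
print("significant at the 5 percent level: ", abs(t) > 2.447)
```

The intermediate sums (`sxy`, `sxx`, `syy`) correspond to the Sum row of the hand-built table, so each can be compared against the spreadsheet column by column.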
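We find that college GPA and ACT are, in fact, positively correlated. The Pearson correlation coefficient is 0.759884 with a t-statistic of 2.863324, which exceeds the 5 percent critical value of 2.447.

3. Use elements from the table constructed in part 2 to construct $\hat{\beta}_0$ and $\hat{\beta}_1$ for the following equation:

$$\widehat{GPA} = \hat{\beta}_0 + \hat{\beta}_1 \, ACT$$

Use the following formulas:

$$\hat{\beta}_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{5.8125}{56.875} = 0.102198$$

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} = 3.2125 - 0.102198 \times 25.875 = 0.568132$$

4. Now test whether or not $\hat{\beta}_1$ is statistically significantly different than 0. To do this, you will also need to construct predicted $\hat{Y}$ and $e$ (see example on page 9 of Course Notes #). You will also need the standard error of $\hat{\beta}_1$ and the estimated error variance, whose formulas are given after the table.

First, I construct all of the new elements I will need for this exercise. (The residual is computed here as $e = \hat{Y} - Y$; the sign does not affect $e^2$.)

    Student   College   ACT
    ID        GPA       Score   Ŷ         e = Ŷ - Y   e²
    1         2.8       21      2.7143    -0.0857     0.0073
    2         3.4       24      3.0209    -0.3791     0.1437
    3         3.0       26      3.2253     0.2253     0.0508
    4         3.5       27      3.3275    -0.1725     0.0298
    5         3.6       29      3.5319    -0.0681     0.0046
    6         3.0       25      3.1231     0.1231     0.0152
    7         2.7       25      3.1231     0.4231     0.1790
    8         3.7       30      3.6341    -0.0659     0.0043
    Sum                         25.7       ≈ 0        0.434725

We are testing whether $\hat{\beta}_1$ is statistically different than a true $\beta_1 = 0$. This means $H_0: \beta_1 = 0$, so we substitute 0 for $\beta_1$ in our computation of t.

$$t = \frac{\hat{\beta}_1 - \beta_1}{s_{\hat{\beta}_1}} = \frac{0.102198 - 0}{0.035692} = 2.863324$$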
The standard error in the denominator comes from:

$$s_{\hat{\beta}_1} = \sqrt{\frac{\hat{\sigma}^2}{\sum (X - \bar{X})^2}} = \sqrt{\frac{0.072454}{56.875}} = 0.035692$$

$$\hat{\sigma}^2 = \frac{1}{n - 2} \sum e^2 = \frac{1}{8 - 2} \times 0.434725 = 0.072454, \qquad e = \hat{Y} - Y$$

We reject the null hypothesis that $\beta_1$ equals zero: the t-statistic of 2.863324 exceeds the 2-tailed 5 percent critical value of 2.447 (6 degrees of freedom), so $\hat{\beta}_1$ is statistically significantly different from zero.

5. Interpret $\hat{\beta}_1$ in a sentence.

For each additional point a student scores on the ACT, their college GPA is predicted to increase by about one-tenth of a point (0.10).

6. If a student's ACT score increases by 5 points, how much higher or lower is GPA predicted to be?

If a student's ACT score increases by 5 points, their GPA is predicted to increase by 5 × 0.102198 = 0.510989 points.

7. What is the predicted GPA for a student with an ACT score of 20?

Predicted GPA for a student with an ACT score of 20 is:

$$\widehat{GPA} = \hat{\beta}_0 + \hat{\beta}_1 \, ACT = 0.568132 + 0.102198 \times 20 = 2.612088$$

8. Now compute the R-square. (Recall that in the case of bivariate regression/SLR, the R-square can be computed as the square of the Pearson correlation coefficient.) What does the R-square tell us?

$$R^2 = r \times r = 0.759884 \times 0.759884 = 0.577424$$

The R-square tells us the share of variation in the dependent variable that is explained by the independent variable. In this example, we find an R-square of 0.58, which means roughly 58 percent of the variation in college GPA is explained by ACT score.
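The slope, intercept, standard error, t-statistic, predictions, and R-square from parts 3 through 8 can be reproduced in a few lines. The following is a minimal sketch in plain Python rather than the spreadsheet the assignment recommends. The variable names and the `predict` helper are illustrative assumptions, and the ACT score of 20 passed to `predict` is the input used in the part 7 calculation.

```python
import math

# (ACT, GPA) pairs for the eight students in the data table.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]

n = len(act)
x_bar = sum(act) / n
y_bar = sum(gpa) / n
sxx = sum((x - x_bar) ** 2 for x in act)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(act, gpa))
syy = sum((y - y_bar) ** 2 for y in gpa)

# OLS slope and intercept (part 3).
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

def predict(act_score):
    """Predicted GPA for a given ACT score."""
    return b0 + b1 * act_score

# Residuals use the worksheet's sign (fitted minus observed);
# the sign has no effect on the squared residuals.
residuals = [predict(x) - y for x, y in zip(act, gpa)]
sse = sum(e ** 2 for e in residuals)

# Error variance, standard error of the slope, and t-statistic (part 4).
sigma2_hat = sse / (n - 2)
se_b1 = math.sqrt(sigma2_hat / sxx)
t_b1 = b1 / se_b1

# R-square as the explained share of total variation (part 8).
r_squared = 1 - sse / syy

print(f"b0 = {b0:.6f}, b1 = {b1:.6f}")
print(f"se(b1) = {se_b1:.6f}, t = {t_b1:.4f}")
print(f"effect of +5 ACT points = {5 * b1:.6f}")
print(f"predicted GPA at ACT 20 = {predict(20):.6f}")
print(f"R-square = {r_squared:.6f}")
```

Computing R-square as 1 - SSE/SST, rather than squaring r, gives an independent check on the shortcut used in part 8.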
9. Using SAS and the dataset actgpa.sas7bdat, compute the Pearson correlation coefficient and, using OLS, obtain the intercept and slope estimates you computed by hand in part 3. Do your estimates match what you computed by hand?

Yes, in both cases, the SAS estimates match what we computed by hand. (See below; the only differences are due to rounding.)

On the SAS output for the Pearson correlation coefficient:
a. Circle r and label it A
b. Circle the associated p-value and label it B

    The CORR Procedure

    2 Variables:   ACT   GPA

    Simple Statistics
    Variable   N   Mean       Std Dev    Sum         Minimum    Maximum
    ACT        8   25.87500   2.85044    207.00000   21.00000   30.00000
    GPA        8    3.21250   0.38336     25.70000    2.70000    3.70000

    Pearson Correlation Coefficients, N = 8
    Prob > |r| under H0: Rho=0
               ACT        GPA
    ACT        1.00000    0.75988
                          0.0287
    GPA        0.75988    1.00000
               0.0287

A: the Pearson correlation coefficient (0.75988)
B: the associated p-value (0.0287)
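For readers without SAS, the Simple Statistics block and the p-value labeled B can be approximated in plain Python. This is a hedged sketch, not the procedure SAS uses internally. The helper names are hypothetical, and `two_sided_p_df6` relies on the closed-form t distribution that holds only for exactly 6 degrees of freedom (the finite-series form for even degrees of freedom), so it is not a general-purpose p-value routine.

```python
import math

# (ACT, GPA) pairs for the eight students in the data table.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def two_sided_p_df6(t):
    """Two-sided p-value for a t statistic with exactly 6 degrees of freedom.

    Uses P(|T| < |t|) = sin(a) * (1 + cos(a)^2 / 2 + 3 * cos(a)^4 / 8),
    where a = atan(|t| / sqrt(6)).
    """
    cos2 = 1.0 / (1.0 + t * t / 6.0)
    sin_a = math.sqrt(1.0 - cos2)
    inside = sin_a * (1.0 + 0.5 * cos2 + 0.375 * cos2 * cos2)
    return 1.0 - inside

n = len(act)
x_bar = sum(act) / n
y_bar = sum(gpa) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(act, gpa))
sxx = sum((x - x_bar) ** 2 for x in act)
syy = sum((y - y_bar) ** 2 for y in gpa)

r = sxy / math.sqrt(sxx * syy)
t = r * math.sqrt((n - 2) / (1 - r ** 2))
p_value = two_sided_p_df6(t)

print(f"ACT: mean = {x_bar:.5f}, sd = {sample_sd(act):.5f}, sum = {sum(act)}")
print(f"GPA: mean = {y_bar:.5f}, sd = {sample_sd(gpa):.5f}, sum = {sum(gpa):.1f}")
print(f"r = {r:.5f}, p = {p_value:.4f}")
```

A sanity check on the closed form: at the 5 percent critical value of 2.447 quoted for 6 degrees of freedom, `two_sided_p_df6` returns approximately 0.05.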
On the SAS output for OLS/SLR:
a. Circle R-square and label it C
b. Circle $\hat{\beta}_0$ and label it D
c. Circle $\hat{\beta}_1$ and label it E
d. Circle the standard error associated with $\hat{\beta}_1$ and label it F
e. Circle the t-value associated with $\hat{\beta}_1$ and label it G

    The REG Procedure
    Model: MODEL1
    Dependent Variable: GPA

    Number of Observations Read   8
    Number of Observations Used   8

    Analysis of Variance
                               Sum of       Mean
    Source            DF       Squares      Square     F Value   Pr > F
    Model              1       0.59402      0.59402    8.20      0.0287
    Error              6       0.43473      0.07245
    Corrected Total    7       1.02875

    Root MSE          0.26917    R-Square   0.5774
    Dependent Mean    3.21250    Adj R-Sq   0.5070
    Coeff Var         8.37893

    Parameter Estimates
                       Parameter    Standard
    Variable    DF     Estimate     Error      t Value   Pr > |t|
    Intercept    1     0.56813      0.92842    0.61      0.5630
    ACT          1     0.10220      0.03569    2.86      0.0287

C: R-square (0.5774)
D: Beta-0 (Intercept estimate, 0.56813)
E: Beta-1 (ACT estimate, 0.10220)
F: standard error for Beta-1 (0.03569)
G: t-value for Beta-1 (2.86)
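The Analysis of Variance block and the summary lines of the REG output follow directly from the sums already built by hand. The sketch below recomputes them in plain Python as a cross-check. It is an illustration under the usual SLR formulas, not a reproduction of SAS internals, and all variable names are illustrative.

```python
import math

# (ACT, GPA) pairs for the eight students in the data table.
act = [21, 24, 26, 27, 29, 25, 25, 30]
gpa = [2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7]

n = len(act)
x_bar = sum(act) / n
y_bar = sum(gpa) / n
sxx = sum((x - x_bar) ** 2 for x in act)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(act, gpa))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Sums of squares: Corrected Total, Error, and Model rows.
sst = sum((y - y_bar) ** 2 for y in gpa)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(act, gpa))
ssr = sst - sse

# Mean squares and F (1 model degree of freedom, n - 2 error degrees of freedom).
mse = sse / (n - 2)
f_value = (ssr / 1) / mse
root_mse = math.sqrt(mse)

# Fit summaries.
r_squared = ssr / sst
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
coeff_var = 100 * root_mse / y_bar

# Standard error and t-value of the intercept.
se_b0 = math.sqrt(mse * (1 / n + x_bar ** 2 / sxx))
t_b0 = b0 / se_b0

print(f"Model SS = {ssr:.5f}, Error SS = {sse:.5f}, Total SS = {sst:.5f}")
print(f"F = {f_value:.2f}, Root MSE = {root_mse:.5f}")
print(f"R-Square = {r_squared:.4f}, Adj R-Sq = {adj_r_squared:.4f}")
print(f"Coeff Var = {coeff_var:.5f}")
print(f"se(b0) = {se_b0:.5f}, t(b0) = {t_b0:.2f}")
```

In simple regression the F value equals the square of the slope's t-value, which is why F (8.20) and the slope's t-value (2.86) share the same p-value in the output.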
10. Can you think of any additional factors which might impact a college student's GPA? By excluding them from this regression, which assumption of SLR do we violate? Explain why.

Yes, a student's ability might impact their GPA. We have excluded a student's ability, which means that ability is essentially included in the error term. Because ability plausibly affects ACT scores as well, the error term is correlated with ACT score. This violates the conditional mean assumption, or SLR3. (See page of Course Notes #.)

SAS code used in this assignment:

    /* Pearson correlation between ACT and GPA */
    proc corr data=quant.actgpa;
      var act gpa;
    run;

    /* Simple linear regression of GPA on ACT */
    proc reg data=quant.actgpa;
      model gpa=act;
    run;
    quit;