STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1

1.2 Let Y be the dollar cost per year and X the number of visits per year. Then the mathematical relation between X and Y is Y = 300 + 2X. This is a functional relationship, not a statistical one: if the number of yearly spa visits for a member is known, then the exact dollar amount of that member's annual dues can be calculated.

1.6 (a) See Fig. 1.6 in the text. Your picture should look similar. Plot E(Y) = 200 + 5.0X. Evaluate E(Y) at X = 10, 20 and 40, and plot these points.
    X = 10: E(Y) = 200 + 5.0(10) = 250
    X = 20: E(Y) = 200 + 5.0(20) = 300
    X = 40: E(Y) = 200 + 5.0(40) = 400
Sketch a normal curve around each of these mean values to represent the distribution of Y at each of the given X values. Note that the variance of each probability distribution is the same (σ² = 16).
(b) β0: the value of E(Y) at X = 0 is β0 = 200. β1: for each unit increase in X, E(Y) increases by β1 = 5 units.

1.7 (a) No, since the distribution of Y is unknown.
(b) Yes. Now Y has a normal distribution with E(Y) = 100 + 20(5) = 200 and Var(Y) = 25, i.e., Y ~ N(200, 25). Therefore
    P[195 ≤ Y ≤ 205] = P[(195 − 200)/5 ≤ Z ≤ (205 − 200)/5] = P[−1 ≤ Z ≤ 1] = 0.6826.

1.11 Over the specified range for X, from 40 to 100, there is an increase in production output after an employee takes the training program. This is because the y-intercept b0 is equal to 20. For example, if the production output was 40 before the training, it will be 58 after the training. Also, if the production output was 100 before the training, it will be 115 after. However, looking outside the range of X, if the production was 1000 before the training, it would only be 970 after the training. An increase can be claimed only within the observed range of X.

1.13 (a) The data are observational; there was no controlled experiment.
(b) The conclusion is not valid. One cannot make inferences about a causal relationship based on observational data. There could be confounding variables that are related to both the increased employee productivity and the increased class preparation time.
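
The arithmetic in Problems 1.7(b) and 1.11 can be checked numerically. The sketch below is only an illustration, not part of the original solution: the helper names `normal_cdf` and `predicted_output` are made up here, and the slope 0.95 for Problem 1.11 is an assumption inferred from the quoted values (40 to 58, 100 to 115, 1000 to 970) together with the stated intercept of 20.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Problem 1.7(b): Y ~ N(200, 25), so the standard deviation is 5.
mean_y = 100 + 20 * 5
sd_y = math.sqrt(25)
prob = normal_cdf((205 - mean_y) / sd_y) - normal_cdf((195 - mean_y) / sd_y)
print("P[195 <= Y <= 205] =", round(prob, 4))

# Problem 1.11: fitted line with intercept 20 and an ASSUMED slope of 0.95
# (inferred from the values quoted in the solution, not stated there).
def predicted_output(x):
    return 20 + 0.95 * x

for before in (40, 100, 1000):
    print(before, "->", predicted_output(before))
```

Direct evaluation gives a probability of about 0.6827; the 0.6826 quoted in the solution is what a four-decimal normal table yields, since 2(0.8413) − 1 = 0.6826.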

1.13 (c) Example 1: Talented employees don't need to spend much time on class preparation but still have higher productivity levels than others. Example 2: Reading technical papers or searching the web may decrease one's class preparation time. However, one may still have higher productivity levels than others.
(d) An experimenter might take a representative sample of good employees who don't spend much time preparing for class. The participants would be randomly assigned to one of two groups. Group 1 would be asked to spend several hours on preparation (say 4 hours per day) and Group 2 would be asked not to exceed 4 hours of preparation per day. The amount of time for preparation and the productivity level would be recorded for each individual.

1.21 (a) Ŷ = 10.2 + 4.0X
[Figure: fitted line plot for airfreight breakage, number of broken ampules (Y) against number of transfers (X); Y = 10.2 + 4X, R-Sq = 90.1%.]
A linear regression function appears to fit the data well.
(b) When X = 1, Ŷ = 10.2 + 4(1) = 14.2.
(c) The increase in the number of transfers (X) is 1. So the increase in the expected number of broken ampules, E(Y), is estimated by b1 = 4.
(d) Calculate X̄ = 1 and Ȳ = 14.2. As we have seen in part (b), Ŷ = 10.2 + 4(1) = 14.2 when X = 1, so the fitted regression line does pass through the point (X̄, Ȳ) = (1, 14.2).
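
The estimates in Problem 1.21 can be reproduced with the usual least squares formulas. This is a minimal sketch under one loud assumption: the ten (X, Y) pairs below do not appear in this solution and are taken to be the textbook's airfreight breakage data. They are consistent with the summaries quoted here (X̄ = 1, Ȳ = 14.2, b0 = 10.2, b1 = 4), but should be checked against the actual data file.

```python
# ASSUMED data (not listed in the solution): transfers X and broken ampules Y
# for the 10 shipments in the airfreight breakage problem.
transfers = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]
broken = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]

n = len(transfers)
x_bar = sum(transfers) / n
y_bar = sum(broken) / n

# Corrected sums of cross-products and squares.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(transfers, broken))
s_xx = sum((x - x_bar) ** 2 for x in transfers)

# Least squares slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

print("b0 =", round(b0, 4), " b1 =", round(b1, 4))
# Part (d): the fitted value at X-bar should equal Y-bar.
print("fit at X-bar =", round(b0 + b1 * x_bar, 4), " Y-bar =", round(y_bar, 4))
```

Because b0 is defined as Ȳ − b1·X̄, the fitted line passes through (X̄, Ȳ) for any data set, which is the point of part (d).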

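1.25 MINITAB regression for airfreight breakage data:

    The regression equation is
    y broken = 10.2 + 4.00 transfers

    Predictor      Coef    StDev       T      P
    Constant    10.2000   0.6633   15.38  0.000
    transf       4.0000   0.4690    8.53  0.000

    S = 1.483   R-Sq = 90.1%   R-Sq(adj) = 88.9%

    Analysis of Variance
    Source            DF       SS       MS      F      P
    Regression         1   160.00   160.00  72.73  0.000
    Residual Error     8    17.60     2.20
    Total              9   177.60

    Obs  transf  y broken     Fit  StDev Fit  Residual  St Resid
      1    1.00    16.000  14.200      0.469     1.800      1.28

(a) Ŷ1 = 10.2 + 4(1) = 14.2 and e1 = Y1 − Ŷ1 = 16 − 14.2 = 1.8. The residual e1 is an estimate of ε1, the vertical deviation of Y1 from the unknown true regression line.
(b) Σe_i² = SSE = 17.6 and MSE = 2.20. MSE estimates σ².

1.29 Assuming X = 0 is within the scope of the model, β0 = 0 implies that E(Y) = 0 when X = 0, so the regression function plots as a straight line through the origin.

1.38 a) For b0 = 9 and b1 = 3, the criterion is
\[ Q = \sum_{i=1}^{10} \bigl(y_i - (9 + 3x_i)\bigr)^2 = 76. \]
b) Similarly, for b0 = 11 and b1 = 5,
\[ Q = \sum_{i=1}^{10} \bigl(y_i - (11 + 5x_i)\bigr)^2 = 60. \]
Yes. The criterion Q for these estimates is, as expected, larger than for the least squares estimates, for which Q = SSE = 17.6.

1.41 a) The least squares estimator of β1 is obtained by minimizing the least squares criterion Q. Hence we need to minimize
\[ Q = \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2 . \]
To get the estimator, we take the derivative of Q with respect to β1 and equate it to zero:
\[ \frac{dQ}{d\beta_1} = \frac{d}{d\beta_1} \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2 = -2 \sum_{i=1}^{n} x_i (y_i - \beta_1 x_i) = 0 . \]
Solving this equation for β1, evaluated at b1, gives the least squares estimator
\[ b_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} . \]
b) First, the density of an observation Y_i for the normal error model, using the facts that E{Y_i} = β1 x_i and σ²{Y_i} = σ², is given by
\[ f_i = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2} \left( \frac{y_i - \beta_1 x_i}{\sigma} \right)^2 \right] . \]
The likelihood function for the n observations Y_1, Y_2, ..., Y_n is the product of the individual densities. Since the variance σ² of the error terms is known in this problem, the likelihood function is a function of β1 only. Hence
\[ L(\beta_1) = \prod_{i=1}^{n} \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left[ -\frac{1}{2\sigma^2} (y_i - \beta_1 x_i)^2 \right] = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2 \right] . \]
The maximum likelihood estimator (MLE) of β1 is obtained by maximizing this likelihood. Since the value of β1 that maximizes L(β1) also maximizes log L(β1), we find the MLE from
\[ \log L(\beta_1) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - \beta_1 x_i)^2 . \]
Next, taking the derivative with respect to β1 and equating it to zero gives the MLE. That is,
\[ \frac{d \log L(\beta_1)}{d\beta_1} = \frac{1}{\sigma^2} \sum_{i=1}^{n} x_i (y_i - \beta_1 x_i) = 0 . \]
The second derivative, \( -\sum x_i^2 / \sigma^2 \), is negative, so the solution is a maximum. Solving for β1, evaluated at b1, gives the MLE
\[ b_1 = \frac{\sum_{i=1}^{n} x_i y_i}{\sum_{i=1}^{n} x_i^2} , \]
which is the same as the least squares estimator.
c) An estimator b1 of β1 is unbiased if E{b1} = β1. From the regression equation y_i = β1 x_i + ε_i, we deduce E{y_i} = β1 x_i. Thus it follows that
\[ E\{b_1\} = \frac{\sum x_i E\{y_i\}}{\sum x_i^2} = \frac{\sum x_i (\beta_1 x_i)}{\sum x_i^2} = \beta_1 \frac{\sum x_i^2}{\sum x_i^2} = \beta_1 . \]
Therefore, the MLE b1 is an unbiased estimator of β1.

1.43 a) Let Y be the number of active physicians in a CDI, X1 the total population, X2 the number of hospital beds and X3 the total personal income. The estimated regression functions of the number of active physicians on each of the predictors are given by:
    Ŷ = −110.635 + 0.003X1
    Ŷ = −95.932 + 0.743X2
    Ŷ = −48.395 + 0.132X3
b) Plot of regression functions and data.
[Figure: scatter plots with fitted regression lines of Number of Active Physicians against Total Population and against Number of Hospital Beds.]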

[Figure: scatter plot with fitted regression line of Number of Active Physicians against Total Personal Income.]
Yes, the linear regression relation appears to provide a good fit for each of the three predictor variables. However, the two data points that lie outside the main scatter should be viewed with caution.
c) The mean square error for each predictor variable can be obtained from the analysis of variance table. Thus,

    Predictor    MSE
    X1           372204
    X2           310192
    X3           324539

So, based on the MSE, we can deduce that the regression model that contains X2 (the number of hospital beds) has the smallest variability around the fitted regression line.

1.44 Let Y stand for per capita income and let X represent the percentage of individuals having at least a bachelor's degree.
a) The estimated regression function for each region is given by:

    Region   Estimated Regression Function   MSE
    NE       Ŷ = 9223.82 + 522.16X           7,335,008
    NC       Ŷ = 13581.4 + 238.67X           4,411,341
    S        Ŷ = 10529.79 + 330.61X          7,474,349
    W        Ŷ = 8615.05 + 440.32X           8,214,318

b) The regression functions are similar in the direction of the relationship between per capita income and the percentage of individuals having at least a bachelor's degree: it is the same in all regions. A unit increase in the percentage with a bachelor's degree is associated with an increase in per capita income. The rate of increase, however, differs among the regions. For instance, the increase in per capita income per unit increase in the percentage of individuals with a bachelor's degree is highest in NE and smallest in NC.
c) The MSE for each region is shown in column 3 above. The variability around the fitted regression line differs among the regions: it is relatively higher in W and smaller in NC.
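
Returning to the airfreight breakage Problems 1.25 and 1.38, the residual, SSE, MSE and the values of the criterion Q can be recomputed directly. As a hedged sketch: the data values below are an assumption (they are not printed in this solution, only their summaries are), and `q_criterion` is a helper name introduced here for illustration.

```python
# ASSUMED data (not listed in the solution): transfers X and broken ampules Y.
transfers = [1, 0, 2, 0, 3, 1, 0, 1, 2, 0]
broken = [16, 9, 17, 12, 22, 13, 8, 15, 19, 11]

def q_criterion(b0, b1):
    """Sum of squared deviations of Y from the line b0 + b1 * X."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(transfers, broken))

# Problem 1.25: first residual, SSE and MSE at the least squares estimates.
e1 = broken[0] - (10.2 + 4.0 * transfers[0])
sse = q_criterion(10.2, 4.0)
mse = sse / (len(transfers) - 2)  # n - 2 degrees of freedom
print("e1 =", round(e1, 4), " SSE =", round(sse, 4), " MSE =", round(mse, 4))

# Problem 1.38: Q at the two alternative pairs of estimates.
print("Q(9, 3)  =", round(q_criterion(9, 3), 4))
print("Q(11, 5) =", round(q_criterion(11, 5), 4))
```

Any pair (b0, b1) other than the least squares estimates gives a Q larger than SSE, which is why both alternatives in Problem 1.38 exceed 17.6.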