STAT3503 Test 2

NOTE:
a. YOU MAY USE ONE 8.5 X 11 TWO-SIDED CHEAT SHEET AND YOUR TEXTBOOK (OR COPY THEREOF).
b. YOU MAY USE ANY ELECTRONIC CALCULATOR.
c. FOR FULL MARKS YOU MUST SHOW THE FORMULA YOU USE AND YOUR SUBSTITUTION INTO THE FORMULA. YOU MAY USE DIRECTLY ANY RESULT FROM THE SAS OUTPUT ATTACHED.
d. Place all of your answers in your test booklet.

A realtor studied the relationship between taxes on a home (in thousands of dollars) and the age (x1) and number of rooms (x2) of the home. Use the attached SAS output to answer the following questions:

1. [2 marks] Give the estimated multiple linear regression model based on these two predictors.

2. [2 marks] He also modelled taxes using the quadratic model
   $Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \beta_3 X_2 + \varepsilon_i$.
   Based on the SAS output, give the estimated multiple regression model of this form.

3. [5 marks] For the model in #2, test the importance of Rooms (i.e. $X_2$) as a predictor using $\alpha = 0.05$. State the null and alternative hypotheses, the test statistic, its calculated value, and the rule for rejection.

4. [6 marks] For the model in #2, test the joint importance of age² (i.e. $X_1^2$) and Rooms (i.e. $X_2$) as predictors, using $\alpha = 0.05$. State the null and alternative hypotheses, the test statistic, its calculated value, and the rule for rejection.

5. [3 marks] Give the order of importance of age (i.e. $X_1$), age² (i.e. $X_1^2$) and Rooms (i.e. $X_2$), with the strongest first.

6. Using the model containing only age and age²,
   $Y = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{age}^2 + \varepsilon$,
   a. which observation corresponds to
      i. [2 marks] the largest studentized deleted residual in absolute value, and what is that value?
      ii. [2 marks] the largest leverage, and what is that value?
      iii. [2 marks] the largest effect on its predicted value, and what is that value?
      iv. [2 marks] the largest effect on the estimate of $\beta_0$, and what is that value?
      v. [2 marks] the largest effect on the estimate of $\beta_1$, and what is that value?
      vi. [2 marks] the largest effect on the estimate of $\beta_2$, and what is that value?
      vii. [2 marks] the largest effect on all predicted values, and what is that value?
   b. [2 marks] what is the variance inflation factor for age and what is that value?

7. Using the model in #6, answer the following questions:
   a. [5 marks] Give the formula for a 95% C.I. for the average taxes for a 5 year old home with 8 rooms and show your substitution into that formula. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)
   b. [5 marks] Give the formula for a 95% P.I. for the taxes for an 8 year old home with 10 rooms and show your substitution into that formula. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)
   c. [10 marks] Using the Scheffé method, give the formulae for simultaneous 95% P.I. for the taxes for an 8 year old home with 10 rooms and a 10 year old home with 10 rooms and show your substitution into those formulae. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)
   d. [4 marks] Give the formula for the leverage of a 25 year old home with 9 rooms and show your substitution into that formula. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)

8. [2 marks] Of the 3 models run in SAS, which is the best model to use? State the criterion you use.

TOTAL 60 marks
SAS Code:

data taxation;
  input taxes age rooms;
  cards;
6.25 1 9
5.7 2 10
3.2 4 10
3.94 4 7
3.3 5 9
3.26 8 8
2.62 10 7
2.46 12 6
2.23 15 9
1.90 20 8
;

data full;
  set taxation;
  agesq=age**2;

proc print data=full;
  var taxes age agesq rooms;

proc reg data=full;
  model taxes=age rooms;
  model taxes=age agesq rooms / ss1 stb;
  model taxes=age agesq / influence vif xpx I r p;
  output out=result cookd=cookd;

proc print data=result;
  var cookd;

proc iml;
  use full;
  read all var ('taxes') into taxes;
  read all var ('age') into age;
  read all var ('agesq') into agesq;
  read all var ('rooms') into rooms;
  print taxes age agesq rooms;
  n=nrow(taxes);                      *size of sample;
  p=3;                                *number of regression coefficients;
  print n p;
  x=j(n,1,1)||age||agesq;             *create X matrix (column of ones, age, agesq);
  print x;
  xtx=t(x)*x;                         *create xtransposex;
  xty=t(x)*taxes;                     *create xtransposey;
  print taxes x xtx xty;
  xtxinv=inv(xtx);                    *create xtransposex inverse;
  print xtxinv;
  b=xtxinv*xty;                       *compute regression coefficients;
  print b;
  yhat=x*b;                           *compute predicted values;
  e=taxes-yhat;                       *compute residuals;
  print yhat e;
  yty=t(taxes)*taxes;                 *compute ytransposey;
  correct=t(taxes)*j(n,n,1)*taxes/n;  *compute the correction for the mean;
  sstot=yty-correct;                  *create SSTO;
  print yty correct sstot;
  ssreg=t(b)*xty-correct;             *obtain the SSR;
  sse=yty-t(b)*xty;                   *obtain SSE;
  mse=sse/(n-p);                      *obtain MSE;
  print ssreg sse mse;
  varb=mse*xtxinv;                    *obtain var(b);
  print varb;
  case1={1,5,25};                     *define the first case;
  meanyhatcase1=t(case1)*b;           *estimate of average Y at case 1;
  varmeanyhatcase1=t(case1)*varb*case1;  *variance of estimate of average Y at first case;
  print case1 meanyhatcase1 varmeanyhatcase1;
  case2={1,10,100};                   *define the second case;
  meanyhatcase2=t(case2)*b;           *estimate of average Y at second case;
  varmeanyhatcase2=t(case2)*varb*case2;  *variance of estimate of average Y at second case;
  print case2 meanyhatcase2 varmeanyhatcase2;
  casep1={1,8,64};                    *define the first case for prediction;
  yhatcasep1=t(casep1)*b;             *estimate of Y at first case for prediction;
  varyhatcasep1=mse*(1+t(casep1)*xtxinv*casep1);  *variance of estimate of Y at first case for prediction;
  print casep1 yhatcasep1 varyhatcasep1;
  casep2={1,10,100};                  *define the second case for prediction;
  yhatcasep2=t(casep2)*b;             *estimate of Y at second case for prediction;
  varyhatcasep2=mse*(1+t(casep2)*xtxinv*casep2);  *variance of estimate of Y at second case for prediction;
  print casep2 yhatcasep2 varyhatcasep2;
  caselev={1,25,625};                 *25 year old home;
  lev=t(caselev)*xtxinv*caselev;
  print caselev lev;
  t1=tinv(0.975,6);                   *97.5% percentile of t for 6 df;
  f1=finv(0.95,2,6);                  *95% percentile of F with 2 and 6 d.f.;
  t2=tinv(0.975,7);                   *97.5% percentile of t with 7 d.f.;
  f2=finv(0.95,2,7);                  *95% percentile of F with 2 and 7 d.f.;
  print t1 f1 t2 f2;
quit;

SAS Output:

Obs    taxes    age    agesq    rooms
  1     6.25      1        1        9
  2     5.70      2        4       10
  3     3.20      4       16       10
  4     3.94      4       16        7
  5     3.30      5       25        9
  6     3.26      8       64        8
  7     2.62     10      100        7
  8     2.46     12      144        6
  9     2.23     15      225        9
 10     1.90     20      400        8

The REG Procedure
Model: MODEL1
Dependent Variable: taxes

Number of Observations Read    10
Number of Observations Used    10

Analysis of Variance

                             Sum of        Mean
Source             DF       Squares      Square    F Value    Pr > F
Model               2      13.27828     6.63914       8.40    0.0138
Error               7       5.53236     0.79034
Corrected Total     9      18.81064

Root MSE          0.88901    R-Square    0.7059
Dependent Mean    3.48600    Adj R-Sq    0.6219
Coeff Var        25.50226

Parameter Estimates

                   Parameter    Standard
Variable     DF     Estimate       Error    t Value    Pr > |t|
Intercept     1      3.81217     2.22509       1.71      0.1304
age           1     -0.18383     0.05261      -3.49      0.0101
rooms         1      0.14011     0.24136       0.58      0.5798

The REG Procedure
Model: MODEL2
Dependent Variable: taxes

Number of Observations Read    10
Number of Observations Used    10

Analysis of Variance

                             Sum of        Mean
Source             DF       Squares      Square    F Value    Pr > F
Model               3      16.22485     5.40828      12.55    0.0054
Error               6       2.58579     0.43096
Corrected Total     9      18.81064

Root MSE          0.65648    R-Square    0.8625
Dependent Mean    3.48600    Adj R-Sq    0.7938
Coeff Var        18.83186

Parameter Estimates

                   Parameter    Standard                                          Standardized
Variable     DF     Estimate       Error    t Value    Pr > |t|     Type I SS         Estimate
Intercept     1      7.26684     2.10839       3.45      0.0137     121.52196                0
age           1     -0.58837     0.15951      -3.69      0.0102      13.01197         -2.49738
agesq         1      0.01904     0.00728       2.61      0.0399       3.08639          1.68635
rooms         1     -0.10954     0.20219      -0.54      0.6075       0.12650         -0.10134

The REG Procedure
Model: MODEL3

Model Crossproducts X'X X'Y Y'Y

Variable      Intercept        age       agesq       taxes
Intercept            10         81         995       34.86
age                  81        995       14877      215.96
agesq               995      14877      246611     2312.42
taxes             34.86     215.96     2312.42    140.3326

The REG Procedure
Model: MODEL3
Dependent Variable: taxes

Number of Observations Read    10
Number of Observations Used    10

X'X Inverse, Parameter Estimates, and SSE

Variable        Intercept             age           agesq           taxes
Intercept    0.6782673941    -0.145870368    0.0060631416     6.162766522
age          -0.145870368    0.0416242458    -0.001922473    -0.541432979
agesq        0.0060631416    -0.001922473    0.0000955667    0.0171742774
taxes         6.162766522    -0.541432979    0.0171742774    2.7122824838

Analysis of Variance

                             Sum of        Mean
Source             DF       Squares      Square    F Value    Pr > F
Model               2      16.09836     8.04918      20.77    0.0011
Error               7       2.71228     0.38747
Corrected Total     9      18.81064

Root MSE          0.62247    R-Square    0.8558
Dependent Mean    3.48600    Adj R-Sq    0.8146
Coeff Var        17.85628

Parameter Estimates

                   Parameter    Standard                           Variance
Variable     DF     Estimate       Error    t Value    Pr > |t|   Inflation
Intercept     1      6.16277     0.51265      12.02      <.0001           0
age           1     -0.54143     0.12700      -4.26      0.0037    14.10646
agesq         1      0.01717     0.00609       2.82      0.0257    14.10646

The REG Procedure
Model: MODEL3
Dependent Variable: taxes

Output Statistics

       Dependent    Predicted      Std Error                 Std Error     Student                    Cook's
Obs     Variable        Value   Mean Predict    Residual      Residual    Residual    -2-1 0 1 2          D
  1       6.2500       5.6385         0.4113      0.6115         0.467       1.309    **              0.442
  2       5.7000       5.1486         0.3297      0.5514         0.528       1.044    **              0.142
  3       3.2000       4.2718         0.2408     -1.0718         0.574      -1.867    ***             0.205
  4       3.9400       4.2718         0.2408     -0.3318         0.574      -0.578    *               0.020
  5       3.3000       3.8850         0.2349     -0.5850         0.576      -1.015    **              0.057
  6       3.2600       2.9305         0.2833      0.3295         0.554       0.595    *               0.031
  7       2.6200       2.4659         0.3091      0.1541         0.540       0.285                    0.009
  8       2.4600       2.1387         0.3144      0.3213         0.537       0.598    *               0.041
  9       2.2300       1.9055         0.3158      0.3245         0.536       0.605    *               0.042
 10       1.9000       2.2038         0.5822     -0.3038         0.220      -1.379    **              4.430

Output Statistics

                   Hat Diag       Cov                 -------------DFBETAS-------------
Obs    RStudent           H     Ratio      DFFITS     Intercept         age       agesq
  1      1.3941      0.4365    1.2145      1.2271        1.2143     -0.9665      0.8048
  2      1.0523      0.2806    1.3279      0.6572        0.6188     -0.4276      0.3300
  3     -2.4403      0.1497    0.2361     -1.0239       -0.6163      0.1314      0.0264
  4     -0.5485      0.1497    1.6126     -0.2301       -0.1385      0.0295      0.0059
  5     -1.0173      0.1424    1.1488     -0.4146       -0.1340     -0.0764      0.1304
  6      0.5649      0.2072    1.7146      0.2888       -0.0775      0.1993     -0.2077
  7      0.2657      0.2466    2.0352      0.1520       -0.0647      0.1172     -0.1129
  8      0.5685      0.2551    1.8213      0.3326       -0.1592      0.2479     -0.2186
  9      0.5753      0.2574    1.8202      0.3387       -0.1180      0.1503     -0.0868
 10     -1.4964      0.8748    4.8973     -3.9550       -0.9556      1.7073     -2.5263

Sum of Residuals                        0
Sum of Squared Residuals          2.71228
Predicted Residual SS (PRESS)    10.44957

TAXES    AGE    AGESQ    ROOMS
 6.25      1        1        9
  5.7      2        4       10
  3.2      4       16       10
 3.94      4       16        7
  3.3      5       25        9
 3.26      8       64        8
 2.62     10      100        7
 2.46     12      144        6
 2.23     15      225        9
  1.9     20      400        8

 N    P
10    3

X
1     1      1
1     2      4
1     4     16
1     4     16
1     5     25
1     8     64
1    10    100
1    12    144
1    15    225
1    20    400

TAXES    X                  XTX                         XTY
 6.25    1     1      1      10       81       995      34.86
  5.7    1     2      4      81      995     14877     215.96
  3.2    1     4     16     995    14877    246611    2312.42
 3.94    1     4     16
  3.3    1     5     25
 3.26    1     8     64
 2.62    1    10    100
 2.46    1    12    144
 2.23    1    15    225
  1.9    1    20    400

XTXINV
0.6782674     -0.14587    0.0060631
 -0.14587    0.0416242    -0.001922
0.0060631    -0.001922    0.0000956

B
6.1627665
-0.541433
0.0171743

YHAT                 E
5.6385078    0.6114922
5.1485977    0.5514023
 4.271823    -1.071823
 4.271823    -0.331823
3.8849586    -0.584959
2.9304564    0.3295436
2.4658645    0.1541355
2.1386667    0.3213333
1.9054843    0.3245157
2.2038179    -0.303818

YTY           CORRECT      SSTOT
140.3326    121.52196    18.81064

SSREG             SSE          MSE
16.098358   2.7122825    0.3874689

VARB
0.2628075     -0.05652    0.0023493
 -0.05652    0.0161281    -0.000745
0.0023493    -0.000745     0.000037

CASE1    MEANYHATCASE1    VARMEANYHATCASE1
    1        3.8849586           0.0551903
    5
   25

CASE2    MEANYHATCASE2    VARMEANYHATCASE2
    1        2.4658645           0.0955633
   10
  100

CASEP1    YHATCASEP1    VARYHATCASEP1
     1     2.9304564        0.4677543
     8
    64

CASEP2    YHATCASEP2    VARYHATCASEP2
     1     2.4658645        0.4830322
    10
   100

CASELEV          LEV
      1    4.2323024
     25
    625

T1                  F1           T2           F2
2.4469119    5.1432528    2.3646243    4.7374141
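
The matrix steps in the PROC IML program can be cross-checked outside SAS. The following is a minimal sketch in plain Python (standard library only), not part of the original SAS program. It rebuilds the age and age-squared design matrix, solves the normal equations, and recomputes SSE, MSE and the quantity the program calls `lev` for the 25 year old home. The helper functions `transpose`, `matmul` and `inverse` are illustrative names introduced here, and the Gauss-Jordan inversion is a stand-in for the IML `inv` function, whose internal method is not stated in the source.

```python
# Data from the exam: taxes (thousands of dollars) and age of each home.
taxes = [6.25, 5.70, 3.20, 3.94, 3.30, 3.26, 2.62, 2.46, 2.23, 1.90]
age = [1, 2, 4, 4, 5, 8, 10, 12, 15, 20]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = transpose(b)
    return [[sum(u * v for u, v in zip(row, col)) for col in cols] for row in a]


def inverse(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting.

    Illustrative helper (an assumption of this sketch), standing in for IML's inv().
    """
    n = len(m)
    # Augment the matrix with the identity: [m | I].
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for c in range(n):
        # Move the row with the largest entry in column c into the pivot position.
        piv = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        scale = aug[c][c]
        aug[c] = [v / scale for v in aug[c]]
        # Eliminate column c from every other row.
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


n, p = len(taxes), 3                      # sample size, number of coefficients
x = [[1.0, float(a), float(a * a)] for a in age]   # columns: 1, age, agesq
y = [[t] for t in taxes]

xt = transpose(x)
xtx = matmul(xt, x)                       # X'X
xty = matmul(xt, y)                       # X'y
xtxinv = inverse(xtx)                     # (X'X)^-1
b = [row[0] for row in matmul(xtxinv, xty)]        # regression coefficients

yhat = [sum(bj * xj for bj, xj in zip(b, row)) for row in x]
e = [t - f for t, f in zip(taxes, yhat)]  # residuals
sse = sum(r * r for r in e)
mse = sse / (n - p)

# Same quadratic form the IML program stores in `lev` for a 25 year old home.
caselev = [[1.0], [25.0], [625.0]]
lev = matmul(matmul(transpose(caselev), xtxinv), caselev)[0][0]

print("b   =", [round(v, 7) for v in b])
print("sse =", round(sse, 7), " mse =", round(mse, 7))
print("lev =", round(lev, 7))
```

The printed values can be compared by eye against the B, SSE, MSE and LEV entries in the listing above. SSE is computed here from the residuals rather than as y'y minus b'X'y, which is algebraically the same quantity.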