STAT3503 Test 2

NOTE:
a. YOU MAY USE ONE 8.5 X 11 TWO-SIDED CHEAT SHEET AND YOUR TEXTBOOK (OR COPY THEREOF).
b. YOU MAY USE ANY ELECTRONIC CALCULATOR.
c. FOR FULL MARKS YOU MUST SHOW THE FORMULA YOU USE AND YOUR SUBSTITUTION INTO THE FORMULA. YOU MAY USE DIRECTLY ANY RESULT FROM THE SAS OUTPUT ATTACHED.
d. Place all of your answers in your test booklet.

A realtor studied the relationship between taxes on a home (in thousands of dollars) and the age ($X_1$) and number of rooms ($X_2$) of the home. Use the attached SAS output to answer the following questions:

1. [2 marks] Give the estimated multiple linear regression model based on these two predictors.

2. [2 marks] He also modelled taxes using the quadratic model $Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_1^2 + \beta_3 X_2$. Based on the SAS output, give the estimated multiple regression model of this form.

3. [5 marks] For the model in #2, test the importance of Rooms (i.e. $X_2$) as a predictor using $\alpha = 0.05$. State the null and alternative hypotheses, the test statistic, its calculated value, and the rule for rejection.

4. [6 marks] For the model in #2, test the joint importance of age$^2$ (i.e. $X_1^2$) and Rooms (i.e. $X_2$) as predictors, using $\alpha = 0.05$. State the null and alternative hypotheses, the test statistic, its calculated value, and the rule for rejection.

5. [3 marks] Give the order of importance of age (i.e. $X_1$), age$^2$ (i.e. $X_1^2$) and Rooms (i.e. $X_2$), with the strongest first.

6. Using the model containing only age and age$^2$, $Y = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{age}^2$:
   a. Which observation corresponds to
      i. [2 marks] the largest studentized deleted residual in absolute value, and what is that value?
      ii. [2 marks] the largest leverage, and what is that value?
      iii. [2 marks] the largest effect on its predicted value, and what is that value?
      iv. [2 marks] the largest effect on $\beta_0$, and what is that value?
      v. [2 marks] the largest effect on $\beta_1$, and what is that value?
      vi. [2 marks] the largest effect on $\beta_2$, and what is that value?
      vii. [2 marks] the largest effect on all predicted values, and what is that value?
   b. [2 marks] What is the variance inflation factor for age, and what is that value?

7. Using the model in #6, answer the following questions:
   a. [5 marks] Give the formula for a 95% C.I. for the average taxes for a 5 year old home with 8 rooms and show your substitution into that formula. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)
   b. [5 marks] Give the formula for a 95% P.I. for the taxes for an 8 year old home with 10 rooms and show your substitution into that formula. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)
   c. [10 marks] Using the Scheffe method, give the formulae for simultaneous 95% P.I. for the taxes for an 8 year old home with 10 rooms and a 10 year old home with 10 rooms and show your substitution into those formulae. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)
   d. [4 marks] Give the formula for the leverage of a 25 year old home with 9 rooms and show your substitution into that formula. DO NOT CALCULATE! (HINT: YOU MAY USE THE SAS OUTPUT.)

8. [2 marks] Of the 3 models run in SAS, which is the best model to use? State the criterion you use.

TOTAL 60 marks

SAS Code:

    data taxation;
      input taxes age rooms;
      cards;
    6.25 1 9
    5.7 2 10
    3.2 4 10
    3.94 4 7
    3.3 5 9
    3.26 8 8
    2.62 10 7
    2.46 12 6
    2.23 15 9
    1.90 20 8
    ;
    data full;
      set taxation;
      agesq=age**2;
    proc print data=full;
      var taxes age agesq rooms;
    proc reg data=full;
      model taxes=age rooms;
      model taxes=age agesq rooms / ss1 stb;
      model taxes=age agesq / influence vif xpx i r p;
      output out=result cookd=cookd;
    proc print data=result;
      var cookd;
    proc iml;
      use full;
      read all var ('taxes') into taxes;
      read all var ('age') into age;
      read all var ('agesq') into agesq;
      read all var ('rooms') into rooms;
      print taxes age agesq rooms;
      n=nrow(taxes); *size of sample;
      p=3; *number of regression coefficients;
      print n p;
      x=j(n,1,1)||age||agesq; *create X matrix (column of ones, age, agesq);
      print x;
      xtx=t(x)*x; *create xtransposex;
      xty=t(x)*taxes; *create xtransposey;
      print taxes x xtx xty;
      xtxinv=inv(xtx); *create xtransposex inverse;
      print xtxinv;
      b=xtxinv*xty; *compute regression coefficients;
      print b;
      yhat=x*b; *compute predicted values;
      e=taxes-yhat; *compute residuals;
      print yhat e;
      yty=t(taxes)*taxes; *compute ytransposey;
      correct=t(taxes)*j(n,n,1)*taxes/n; *compute the correction for the mean;
      sstot=yty-correct; *create SSTO;
      print yty correct sstot;
      ssreg=t(b)*xty-correct; *obtain the SSR;
      sse=yty-t(b)*xty; *obtain SSE;
      mse=sse/(n-p); *obtain MSE;
      print ssreg sse mse;
      varb=mse*xtxinv; *obtain var(b);
      print varb;
      case1={1,5,25}; *define the first case;
      meanyhatcase1=t(case1)*b; *estimate of average Y at first case;
      varmeanyhatcase1=t(case1)*varb*case1; *variance of estimate of average Y at first case;
      print case1 meanyhatcase1 varmeanyhatcase1;
      case2={1,10,100}; *define the second case;
      meanyhatcase2=t(case2)*b; *estimate of average Y at second case;
      varmeanyhatcase2=t(case2)*varb*case2; *variance of estimate of average Y at second case;
      print case2 meanyhatcase2 varmeanyhatcase2;
      casep1={1,8,64}; *define the first case for prediction;
      yhatcasep1=t(casep1)*b; *estimate of Y at first case for prediction;
      varyhatcasep1=mse*(1+t(casep1)*xtxinv*casep1); *variance of estimate of Y at first case for prediction;
      print casep1 yhatcasep1 varyhatcasep1;
      casep2={1,10,100}; *define the second case for prediction;
      yhatcasep2=t(casep2)*b; *estimate of Y at second case for prediction;
      varyhatcasep2=mse*(1+t(casep2)*xtxinv*casep2); *variance of estimate of Y at second case for prediction;
      print casep2 yhatcasep2 varyhatcasep2;
      caselev={1,25,625}; *25 year old home;
      lev=t(caselev)*xtxinv*caselev;
      print caselev lev;
      t1=tinv(0.975,6); *97.5th percentile of t with 6 d.f.;
      f1=finv(0.95,2,6); *95th percentile of F with 2 and 6 d.f.;
      t2=tinv(0.975,7); *97.5th percentile of t with 7 d.f.;
      f2=finv(0.95,2,7); *95th percentile of F with 2 and 7 d.f.;
      print t1 f1 t2 f2;
    quit;

SAS Output:

    Obs    taxes    age    agesq    rooms
      1     6.25      1        1        9
      2     5.70      2        4       10
      3     3.20      4       16       10
      4     3.94      4       16        7
      5     3.30      5       25        9
      6     3.26      8       64        8
      7     2.62     10      100        7
      8     2.46     12      144        6
      9     2.23     15      225        9
     10     1.90     20      400        8

    The REG Procedure
    Model: MODEL1
    Dependent Variable: taxes

    Number of Observations Read    10
    Number of Observations Used    10

    Analysis of Variance
                                  Sum of        Mean
    Source             DF        Squares      Square    F Value    Pr > F
    Model               2       13.27828     6.63914       8.40    0.0138
    Error               7        5.53236     0.79034
    Corrected Total     9       18.81064

    Root MSE           0.88901    R-Square    0.7059
    Dependent Mean     3.48600    Adj R-Sq    0.6219
    Coeff Var         25.50226

    Parameter Estimates
                       Parameter    Standard
    Variable     DF     Estimate       Error    t Value    Pr > |t|
    Intercept     1      3.81217     2.22509       1.71      0.1304
    age           1     -0.18383     0.05261      -3.49      0.0101
    rooms         1      0.14011     0.24136       0.58      0.5798
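As a reading aid for the MODEL1 output above, the fitted equation and two of the printed summary statistics can be recovered directly from the table entries. This is only a sketch, not an official solution; $X_1$ (age) and $X_2$ (rooms) follow the labels used in the test questions.

```latex
\begin{align*}
\hat{Y} &= 3.81217 - 0.18383\,X_1 + 0.14011\,X_2 \\
F^* &= \frac{MSR}{MSE} = \frac{6.63914}{0.79034} \approx 8.40 \\
R^2 &= \frac{SSR}{SSTO} = \frac{13.27828}{18.81064} \approx 0.7059
\end{align*}
```

Both ratios agree with the F Value and R-Square entries that SAS prints for MODEL1.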

    The REG Procedure
    Model: MODEL2
    Dependent Variable: taxes

    Number of Observations Read    10
    Number of Observations Used    10

    Analysis of Variance
                                  Sum of        Mean
    Source             DF        Squares      Square    F Value    Pr > F
    Model               3       16.22485     5.40828      12.55    0.0054
    Error               6        2.58579     0.43096
    Corrected Total     9       18.81064

    Root MSE           0.65648    R-Square    0.8625
    Dependent Mean     3.48600    Adj R-Sq    0.7938
    Coeff Var         18.83186

    Parameter Estimates
                       Parameter    Standard                                        Standardized
    Variable     DF     Estimate       Error    t Value    Pr > |t|    Type I SS        Estimate
    Intercept     1      7.26684     2.10839       3.45      0.0137    121.52196               0
    age           1     -0.58837     0.15951      -3.69      0.0102     13.01197        -2.49738
    agesq         1      0.01904     0.00728       2.61      0.0399      3.08639         1.68635
    rooms         1     -0.10954     0.20219      -0.54      0.6075      0.12650        -0.10134

    The REG Procedure
    Model: MODEL3

    Model Crossproducts X'X X'Y Y'Y
    Variable     Intercept         age       agesq        taxes
    Intercept           10          81         995        34.86
    age                 81         995       14877       215.96
    agesq              995       14877      246611      2312.42
    taxes            34.86      215.96     2312.42     140.3326

    The REG Procedure
    Model: MODEL3
    Dependent Variable: taxes

    Number of Observations Read    10
    Number of Observations Used    10

    X'X Inverse, Parameter Estimates, and SSE
    Variable        Intercept             age           agesq           taxes
    Intercept    0.6782673941    -0.145870368    0.0060631416     6.162766522
    age          -0.145870368    0.0416242458    -0.001922473    -0.541432979
    agesq        0.0060631416    -0.001922473    0.0000955667    0.0171742774
    taxes         6.162766522    -0.541432979    0.0171742774    2.7122824838

    Analysis of Variance
                                  Sum of        Mean
    Source             DF        Squares      Square    F Value    Pr > F
    Model               2       16.09836     8.04918      20.77    0.0011
    Error               7        2.71228     0.38747
    Corrected Total     9       18.81064

    Root MSE           0.62247    R-Square    0.8558
    Dependent Mean     3.48600    Adj R-Sq    0.8146
    Coeff Var         17.85628

    Parameter Estimates
                       Parameter    Standard                            Variance
    Variable     DF     Estimate       Error    t Value    Pr > |t|    Inflation
    Intercept     1      6.16277     0.51265      12.02      <.0001            0
    age           1     -0.54143     0.12700      -4.26      0.0037     14.10646
    agesq         1      0.01717     0.00609       2.82      0.0257     14.10646
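The two tests asked for on the quadratic model with rooms (MODEL2) can be sketched from the printed values. This is one possible layout, not an official solution. It assumes the Type I (sequential) sums of squares were entered in the order age, agesq, rooms, as in the MODEL statement, and it takes the critical values $t(0.975; 6)$ and $F(0.95; 2, 6)$ from the T1 and F1 entries printed at the end of the PROC IML output. The symbols $b_3$ and $s\{b_3\}$ (estimate and standard error for rooms) are notation introduced here.

```latex
\begin{align*}
% t test for rooms
H_0 &: \beta_3 = 0 \quad \text{vs.} \quad H_a : \beta_3 \neq 0 \\
t^* &= \frac{b_3}{s\{b_3\}} = \frac{-0.10954}{0.20219} \approx -0.54,
  \qquad \text{reject } H_0 \text{ if } |t^*| > t(0.975;\,6) = 2.4469 \\[6pt]
% partial F test for agesq and rooms jointly
H_0 &: \beta_2 = \beta_3 = 0 \quad \text{vs.} \quad H_a : \text{not both zero} \\
F^* &= \frac{\left[\,SSR(X_1^2 \mid X_1) + SSR(X_2 \mid X_1, X_1^2)\,\right]/2}{MSE}
     = \frac{(3.08639 + 0.12650)/2}{0.43096} \approx 3.73 \\
    &\quad \text{reject } H_0 \text{ if } F^* > F(0.95;\,2,\,6) = 5.1433
\end{align*}
```

With these numbers neither statistic exceeds its critical value, so neither null hypothesis would be rejected at the 0.05 level.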

    The REG Procedure
    Model: MODEL3
    Dependent Variable: taxes

    Output Statistics
           Dependent    Predicted      Std Error                 Std Error     Student                    Cook's
    Obs     Variable        Value   Mean Predict    Residual      Residual    Residual    -2-1 0 1 2           D
      1       6.2500       5.6385         0.4113      0.6115         0.467       1.309    **               0.442
      2       5.7000       5.1486         0.3297      0.5514         0.528       1.044    **               0.142
      3       3.2000       4.2718         0.2408     -1.0718         0.574      -1.867    ***              0.205
      4       3.9400       4.2718         0.2408     -0.3318         0.574      -0.578    *                0.020
      5       3.3000       3.8850         0.2349     -0.5850         0.576      -1.015    **               0.057
      6       3.2600       2.9305         0.2833      0.3295         0.554       0.595    *                0.031
      7       2.6200       2.4659         0.3091      0.1541         0.540       0.285                     0.009
      8       2.4600       2.1387         0.3144      0.3213         0.537       0.598    *                0.041
      9       2.2300       1.9055         0.3158      0.3245         0.536       0.605    *                0.042
     10       1.9000       2.2038         0.5822     -0.3038         0.220      -1.379    **               4.430

    Output Statistics
                       Hat Diag        Cov                 -------------DFBETAS-------------
    Obs    RStudent           H      Ratio      DFFITS     Intercept         age       agesq
      1      1.3941      0.4365     1.2145      1.2271        1.2143     -0.9665      0.8048
      2      1.0523      0.2806     1.3279      0.6572        0.6188     -0.4276      0.3300
      3     -2.4403      0.1497     0.2361     -1.0239       -0.6163      0.1314      0.0264
      4     -0.5485      0.1497     1.6126     -0.2301       -0.1385      0.0295      0.0059
      5     -1.0173      0.1424     1.1488     -0.4146       -0.1340     -0.0764      0.1304
      6      0.5649      0.2072     1.7146      0.2888       -0.0775      0.1993     -0.2077
      7      0.2657      0.2466     2.0352      0.1520       -0.0647      0.1172     -0.1129
      8      0.5685      0.2551     1.8213      0.3326       -0.1592      0.2479     -0.2186
      9      0.5753      0.2574     1.8202      0.3387       -0.1180      0.1503     -0.0868
     10     -1.4964      0.8748     4.8973     -3.9550       -0.9556      1.7073     -2.5263

    Sum of Residuals                        0
    Sum of Squared Residuals          2.71228
    Predicted Residual SS (PRESS)    10.44957

    TAXES    AGE    AGESQ    ROOMS
     6.25      1        1        9
      5.7      2        4       10
      3.2      4       16       10
     3.94      4       16        7
      3.3      5       25        9
     3.26      8       64        8
     2.62     10      100        7
     2.46     12      144        6
     2.23     15      225        9
      1.9     20      400        8

     N    P
    10    3

    X
    1     1      1
    1     2      4
    1     4     16
    1     4     16
    1     5     25
    1     8     64
    1    10    100
    1    12    144
    1    15    225
    1    20    400
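The influence measures in the MODEL3 Output Statistics tables are linked by standard textbook identities, which gives a quick consistency check on how the columns were read. The sketch below does this for observation 10 only and uses the rounded values as printed, so the results match the table only approximately. The symbols $h_{ii}$ (Hat Diag H), $t_i$ (RStudent) and $r_i$ (Student Residual) are notation introduced here, and the $2p/n$ leverage cutoff is a common rule of thumb rather than something stated in the test.

```latex
\begin{align*}
\frac{h_{10,10}}{1 - h_{10,10}} &= \frac{0.8748}{0.1252} \approx 6.987 \\
DFFITS_{10} &= t_{10}\sqrt{\frac{h_{10,10}}{1 - h_{10,10}}}
  \approx -1.4964 \times \sqrt{6.987} \approx -3.955 \\
D_{10} &= \frac{r_{10}^{2}}{p} \cdot \frac{h_{10,10}}{1 - h_{10,10}}
  \approx \frac{(-1.379)^2}{3} \times 6.987 \approx 4.43 \\
\frac{2p}{n} &= \frac{2 \times 3}{10} = 0.6 < h_{10,10} = 0.8748
\end{align*}
```

These agree with the printed DFFITS (-3.9550) and Cook's D (4.430) for observation 10 up to rounding.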

    TAXES        X                          XTX                        XTY
     6.25    1     1      1         10       81       995          34.86
      5.7    1     2      4         81      995     14877         215.96
      3.2    1     4     16        995    14877    246611        2312.42
     3.94    1     4     16
      3.3    1     5     25
     3.26    1     8     64
     2.62    1    10    100
     2.46    1    12    144
     2.23    1    15    225
      1.9    1    20    400

    XTXINV
    0.6782674     -0.14587    0.0060631
     -0.14587    0.0416242    -0.001922
    0.0060631    -0.001922    0.0000956

    B
    6.1627665
    -0.541433
    0.0171743

         YHAT            E
    5.6385078    0.6114922
    5.1485977    0.5514023
     4.271823    -1.071823
     4.271823    -0.331823
    3.8849586    -0.584959
    2.9304564    0.3295436
    2.4658645    0.1541355
    2.1386667    0.3213333
    1.9054843    0.3245157
    2.2038179    -0.303818

         YTY      CORRECT       SSTOT
    140.3326    121.52196    18.81064

        SSREG          SSE          MSE
    16.098358    2.7122825    0.3874689

    VARB
    0.2628075     -0.05652    0.0023493
     -0.05652    0.0161281    -0.000745
    0.0023493    -0.000745     0.000037

    CASE1    MEANYHATCASE1    VARMEANYHATCASE1
        1        3.8849586           0.0551903
        5
       25

    CASE2    MEANYHATCASE2    VARMEANYHATCASE2
        1        2.4658645           0.0955633
       10
      100

    CASEP1    YHATCASEP1    VARYHATCASEP1
         1     2.9304564        0.4677543
         8
        64

    CASEP2    YHATCASEP2    VARYHATCASEP2
         1     2.4658645        0.4830322
        10
       100

    CASELEV          LEV
          1    4.2323024
         25
        625

           T1           F1           T2           F2
    2.4469119    5.1432528    2.3646243    4.7374141
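The PROC IML quantities above map onto the interval and leverage formulas requested for the age and age$^2$ model. The following is a sketch of the substitutions only (left uncalculated, as the test asks), not an official solution. It assumes the error degrees of freedom are $n - p = 10 - 3 = 7$, that rooms do not enter because the model contains only age and age$^2$, and that the Scheffe multiplier for $g = 2$ predictions takes the usual textbook form $S^2 = g\,F(1-\alpha;\, g,\, n-p)$. The symbols $\hat{Y}_h$, $s^2\{\cdot\}$, $S$ and $x_h$ are notation introduced here.

```latex
\begin{align*}
\text{95\% C.I. (age 5):}\quad
  & \hat{Y}_h \pm t(0.975;\,7)\,\sqrt{s^2\{\hat{Y}_h\}}
    = 3.8849586 \pm 2.3646243\,\sqrt{0.0551903} \\
\text{95\% P.I. (age 8):}\quad
  & \hat{Y}_h \pm t(0.975;\,7)\,\sqrt{s^2\{\text{pred}\}}
    = 2.9304564 \pm 2.3646243\,\sqrt{0.4677543} \\
\text{Scheffe P.I.s:}\quad
  & S = \sqrt{g\,F(0.95;\,g,\,n-p)} = \sqrt{2 \times 4.7374141} \\
  & \text{age 8: } 2.9304564 \pm S\,\sqrt{0.4677543}, \qquad
    \text{age 10: } 2.4658645 \pm S\,\sqrt{0.4830322} \\
\text{Leverage (age 25):}\quad
  & h = x_h^{\top}(X^{\top}X)^{-1}x_h, \quad x_h = (1,\ 25,\ 625)^{\top}, \quad h = 4.2323024
\end{align*}
```

The leverage for the 25 year old home far exceeds every Hat Diag H value in the sample, which is consistent with age 25 lying outside the observed age range of 1 to 20.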