STOR 664 Homework 2 Solution

Part A

Exercise (Faraway book) Ch2 Ex1

> data(teengamb)
> attach(teengamb)
> tgl <- lm(gamble ~ sex + status + income + verbal)
> summary(tgl)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  22.55565   17.19680   1.312   0.1968
sex         -22.11833    8.21111  -2.694   0.0101 *
status        0.05223    0.28111   0.186   0.8535
income        4.96198    1.02539   4.839 1.79e-05 ***
verbal       -2.95949    2.17215  -1.362   0.1803
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 22.69 on 42 degrees of freedom
Multiple R-squared: 0.5267, Adjusted R-squared: 0.4816
F-statistic: 11.69 on 4 and 42 DF, p-value: 1.815e-06

(a) The percentage of variation in the response explained by the predictors is given by the Multiple R-squared, i.e. 52.67%.

(b) The 24th case has the largest residual.
> which.max(tgl$residuals)

(c) The mean of the residuals is essentially 0 (about 8.6e-17, i.e. zero up to floating-point error) and the median is -1.451.
> mean(tgl$residuals)
> median(tgl$residuals)

(d) The correlation of the residuals with the fitted values is essentially 0 (about 2.586e-17).
> cor(tgl$residuals, tgl$fitted.values)

(e) The correlation of the residuals with income is essentially 0 (about 5.027e-17).
> cor(tgl$residuals, income)

(f) Based on the summary, the fitted model can be written explicitly as

gamble = 22.55565 - 22.11833 sex + 0.05223 status + 4.96198 income - 2.95949 verbal.

If all the predictors except sex are held constant, the difference in predicted expenditure on gambling between a male (sex = 0) and a female (sex = 1) equals the regression coefficient of sex, i.e. -22.11833. Therefore, whenever sex changes from male (sex = 0) to female (sex = 1), the predicted value of gamble decreases by 22.11833. In other words, according to the fitted model a female spends about $22.12 less on gambling than a comparable male (i.e. one with the same values of the other predictors).
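Part (a) equates the "percentage of variation explained" with the Multiple R-squared; the short R check below recomputes that quantity directly as 1 - RSS/TSS. This is a minimal sketch which assumes the faraway package (the source of the teengamb data used above) is installed.

library(faraway)                   # provides the teengamb data (assumed installed)
data(teengamb)
tgl <- lm(gamble ~ sex + status + income + verbal, data = teengamb)

rss <- sum(resid(tgl)^2)                                   # residual sum of squares
tss <- sum((teengamb$gamble - mean(teengamb$gamble))^2)    # total sum of squares
1 - rss / tss                    # proportion of variation explained, about 0.5267
summary(tgl)$r.squared           # the same value, extracted from the summary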

Part B

Ch3 Ex2

The model is $Y = X\beta + \epsilon$, where
$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad
X = \begin{pmatrix} x_{11} & x_{12} \\ x_{21} & x_{22} \\ \vdots & \vdots \\ x_{n1} & x_{n2} \end{pmatrix}, \qquad
\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \qquad
\epsilon = \begin{pmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{pmatrix}.$$
Direct calculation gives
$$(X'X)^{-1} = \frac{1}{\sum x_{i1}^2 \sum x_{i2}^2 - \left(\sum x_{i1}x_{i2}\right)^2}
\begin{pmatrix} \sum x_{i2}^2 & -\sum x_{i1}x_{i2} \\ -\sum x_{i1}x_{i2} & \sum x_{i1}^2 \end{pmatrix},
\qquad
X'Y = \begin{pmatrix} \sum x_{i1}y_i \\ \sum x_{i2}y_i \end{pmatrix}.$$
The least squares estimate of $\beta$ is given by the normal equations:
$$\hat\beta = (X'X)^{-1}X'Y
= \frac{1}{\sum x_{i1}^2 \sum x_{i2}^2 - \left(\sum x_{i1}x_{i2}\right)^2}
\begin{pmatrix} \sum x_{i2}^2 \sum x_{i1}y_i - \sum x_{i1}x_{i2} \sum x_{i2}y_i \\
-\sum x_{i1}x_{i2} \sum x_{i1}y_i + \sum x_{i1}^2 \sum x_{i2}y_i \end{pmatrix}.$$
Clearly the estimates are unbiased, and $\operatorname{Cov}(\hat\beta) = \sigma^2 (X'X)^{-1}$:
$$\operatorname{Var}(\hat\beta_1) = \sigma^2 \sum x_{i2}^2 / A, \qquad
\operatorname{Var}(\hat\beta_2) = \sigma^2 \sum x_{i1}^2 / A, \qquad
\operatorname{Cov}(\hat\beta_1, \hat\beta_2) = -\sigma^2 \sum x_{i1}x_{i2} / A,$$
where $A = \sum x_{i1}^2 \sum x_{i2}^2 - \left(\sum x_{i1}x_{i2}\right)^2$.
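The closed-form expressions above are easy to check against R's own least squares fit; the sketch below does so on simulated data (the variables x1, x2, y and the generating model are made up purely for this illustration).

## Numerical check of the two-predictor, no-intercept formulas in Ch3 Ex2
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1.5 * x1 - 0.5 * x2 + rnorm(n)
X  <- cbind(x1, x2)

A <- sum(x1^2) * sum(x2^2) - sum(x1 * x2)^2
beta.hat <- c( sum(x2^2) * sum(x1 * y) - sum(x1 * x2) * sum(x2 * y),
              -sum(x1 * x2) * sum(x1 * y) + sum(x1^2) * sum(x2 * y)) / A

beta.hat                          # explicit formula above
coef(lm(y ~ x1 + x2 - 1))         # same estimates from lm() without an intercept
solve(crossprod(X))               # matches the (X'X)^{-1} expression above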

Ch3 Ex4

(a) Write the model under consideration as $Y = X\beta + \epsilon$. The statement that $\hat\theta = c'\hat\beta$ is the BLUE of $\theta = c'\beta$ is equivalent to saying that for any linear unbiased estimator $\tilde\theta = b'Y$ we have $\operatorname{Var}(\hat\theta) \le \operatorname{Var}(\tilde\theta)$. Now, unbiasedness of $\tilde\theta$ gives $E\,b'Y = b'X\beta = c'\beta$ for all $\beta$, i.e. $X'b = c$, and
$$\operatorname{Var}(c'\hat\beta) = \sigma^2 c'(X'X)^{-1}c = \sigma^2 c'(X'X)^{-1}X'X(X'X)^{-1}c, \qquad \operatorname{Var}(b'Y) = \sigma^2 b'b.$$
Thus the statement is also equivalent to
$$c'(X'X)^{-1}X'X(X'X)^{-1}c \le b'b \quad \text{for all } b \text{ such that } X'b = c,$$
i.e.
$$X(X'X)^{-1}c = \operatorname*{arg\,min}_{b\,:\,X'b = c} \; b'b.$$
This proves the argument.

(b) Let $\mathcal P = \{X\beta \in \mathbb R^n : \beta \in \mathbb R^p\}$ be as in the book. Define two more sets, $\mathcal P_c = \{b \in \mathbb R^n : X'b = c\}$ and $\mathcal P_0 = \{b \in \mathbb R^n : X'b = 0\}$. First, note that for any $X\beta \in \mathcal P$ and any $b \in \mathcal P_0$ the inner product is 0 (i.e. $(X\beta)'b = \beta'X'b = 0$), so $\mathcal P_0$ is the orthogonal complement of $\mathcal P$. The problem in (a) is equivalent to finding a vector $a \in \mathcal P_c$ such that $a'a$ is minimized. With some geometric understanding, it is enough to find a vector $a \in \mathcal P_c$ that is orthogonal to $b - a$ for all $b \in \mathcal P_c$; since $b - a \in \mathcal P_0$ whenever $a, b \in \mathcal P_c$, such an $a$ must also lie in $\mathcal P$, i.e. $a \in \mathcal P \cap \mathcal P_c$. Then
$$a = X\beta \ \text{for some } \beta \in \mathbb R^p, \quad\text{and}\quad a \in \mathcal P_c \;\Rightarrow\; X'a = X'X\beta = c \;\Rightarrow\; \beta = (X'X)^{-1}c \;\Rightarrow\; a = X\beta = X(X'X)^{-1}c.$$
(Optional) Alternatively, this is easily proved using the fact that $I - H = I - X(X'X)^{-1}X'$ is symmetric and idempotent, hence nonnegative definite, so that $\operatorname{Var}(\tilde\theta) - \operatorname{Var}(\hat\theta) \ge 0$. Or, use $\operatorname{Cov}(\hat\theta, \tilde\theta - \hat\theta) = 0$, so that $\operatorname{Var}(\tilde\theta) \ge \operatorname{Var}(\hat\theta)$.

Ch3 Ex6

(a) Following scheme (ii), weigh each of the six pairs of the four items twice and set
$$X = \begin{pmatrix}
1&1&0&0\\ 1&1&0&0\\ 1&0&1&0\\ 1&0&1&0\\ 1&0&0&1\\ 1&0&0&1\\
0&1&1&0\\ 0&1&1&0\\ 0&1&0&1\\ 0&1&0&1\\ 0&0&1&1\\ 0&0&1&1
\end{pmatrix}, \qquad
\beta = \begin{pmatrix}\beta_1\\ \vdots\\ \beta_4\end{pmatrix}, \qquad
\epsilon = \begin{pmatrix}\epsilon_1\\ \vdots\\ \epsilon_{12}\end{pmatrix}, \qquad
Y = \begin{pmatrix}y_1\\ \vdots\\ y_{12}\end{pmatrix}.$$
Then $X'X = 4I_4 + 2J_4$ and
$$\hat\beta = (X'X)^{-1}X'Y = \frac{1}{24}\begin{pmatrix}
5&-1&-1&-1\\ -1&5&-1&-1\\ -1&-1&5&-1\\ -1&-1&-1&5
\end{pmatrix} X'Y.$$

(b) Write $z_i$ for the weighings of scheme (i), while $y_i$ is for scheme (ii). Suppose $\operatorname{sd}(z_i) = \sigma$; then $\operatorname{sd}(y_i) = 2\sigma$. Then
$$\operatorname{Var}(\hat\beta^{(i)}) = \tfrac{1}{3}\sigma^2
\qquad\text{and}\qquad
\operatorname{Var}(\hat\beta^{(ii)}) = \tfrac{5}{24}(2\sigma)^2 = \tfrac{5}{6}\sigma^2.$$
Scheme (i) is superior in this case.

(c) Now six items are weighed in 60 weighings.

scheme (i): weigh each item 10 times.
$$Y = (y_1, \dots, y_{60})', \qquad X = I_6 \otimes 1_{10}, \qquad \beta = (\beta_1, \dots, \beta_6)',$$
$$X'X = (I_6 \otimes 1_{10})'(I_6 \otimes 1_{10}) = I_6\,(1_{10}'1_{10}) = 10 I_6,$$
$$\hat\beta = (X'X)^{-1}X'Y = \tfrac{1}{10} X'Y = \tfrac{1}{10}\Big(\sum_{i=1}^{10} y_i,\; \dots,\; \sum_{i=51}^{60} y_i\Big)',
\qquad \operatorname{Var}(\hat\beta_1) = \tfrac{1}{10}\sigma^2.$$

scheme (ii): weigh each of the 15 pairs 4 times. Writing $1_4$ for a column of four 1s and stacking one block of four identical rows per pair,
$$X = \begin{pmatrix}
1_4 & 1_4 & 0 & 0 & 0 & 0\\
1_4 & 0 & 1_4 & 0 & 0 & 0\\
1_4 & 0 & 0 & 1_4 & 0 & 0\\
 & & & \vdots & & \\
0 & 0 & 0 & 0 & 1_4 & 1_4
\end{pmatrix}, \qquad
X'X = 16 I_6 + 4 J_6, \qquad
(X'X)^{-1} = \tfrac{1}{16} I_6 - \tfrac{4}{16(16 + 4\cdot 6)} J_6,$$
$$\hat\beta = (X'X)^{-1}X'Y, \qquad \operatorname{Var}(\hat\beta_1) = \tfrac{9}{160}\sigma^2.$$

scheme (iii): weigh each of the 20 triples 3 times. Similarly, with one block of three identical rows per triple,
$$X = \begin{pmatrix}
1_3 & 1_3 & 1_3 & 0 & 0 & 0\\
1_3 & 1_3 & 0 & 1_3 & 0 & 0\\
1_3 & 1_3 & 0 & 0 & 1_3 & 0\\
1_3 & 1_3 & 0 & 0 & 0 & 1_3\\
 & & & \vdots & & \\
0 & 0 & 0 & 1_3 & 1_3 & 1_3
\end{pmatrix}, \qquad
X'X = 18 I_6 + 12 J_6, \qquad
(X'X)^{-1} = \tfrac{1}{18} I_6 - \tfrac{12}{18(18 + 12\cdot 6)} J_6,$$
$$\hat\beta = (X'X)^{-1}X'Y, \qquad \operatorname{Var}(\hat\beta_1) = \tfrac{13}{270}\sigma^2.$$
Since $\operatorname{Var}^{(iii)} < \operatorname{Var}^{(ii)} < \operatorname{Var}^{(i)}$, scheme (iii) is the best.
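The variance comparison in (c) can be verified numerically. The R sketch below builds the three design matrices (taking each individual weighing to have variance 1) and reads off Var(beta_1-hat) as the (1,1) entry of (X'X)^{-1}; the objects Xi, Xii and Xiii are constructed just for this check.

## Numerical check of Var(beta_1-hat) for the three weighing schemes in Ch3 Ex6(c)
## (each single weighing is taken to have variance sigma^2 = 1)

## scheme (i): each of the 6 items weighed 10 times
Xi <- kronecker(diag(6), matrix(1, 10, 1))

## scheme (ii): each of the 15 pairs weighed 4 times
pairs <- combn(6, 2)
Xii <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(j) {
  r <- numeric(6)
  r[pairs[, j]] <- 1
  matrix(rep(r, 4), nrow = 4, byrow = TRUE)   # four identical rows for this pair
}))

## scheme (iii): each of the 20 triples weighed 3 times
triples <- combn(6, 3)
Xiii <- do.call(rbind, lapply(seq_len(ncol(triples)), function(j) {
  r <- numeric(6)
  r[triples[, j]] <- 1
  matrix(rep(r, 3), nrow = 3, byrow = TRUE)   # three identical rows for this triple
}))

## expect 1/10 = 0.1, 9/160 = 0.05625, 13/270 = 0.0481...
sapply(list(Xi, Xii, Xiii), function(X) solve(crossprod(X))[1, 1])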

Ch3 Ex18

In general, the question of what the best functional form for the relationship is should be answered carefully, and there is no single correct (or wrong) answer. In this problem, the strength of the yarn would be expected to vary with fiber length, tensile strength of the fibers, and fiber fineness in a multiplicative way, so take $X_1 = a X_2^b X_3^c X_4^d$. This can be fitted by the ordinary linear regression method after taking logarithms; the model we regress is
$$\log x_{i1} = \beta_1 + \beta_2 \log x_{i2} + \beta_3 \log x_{i3} + \beta_4 \log x_{i4} + \epsilon_i.$$
Since $\hat\beta_3$ is not significant (estimate 0.09126 with standard error 0.39551), after omitting $\log x_3$ the fitted result is:

lm(formula = lx1 ~ lx2 + lx4)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.4247     0.7779   3.117  0.00628 **
lx2           0.8355     0.1531   5.458 4.25e-05 ***
lx4          -0.4040     0.1098  -3.681  0.00185 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.06741 on 17 degrees of freedom
Multiple R-Squared: 0.7175, Adjusted R-squared: 0.6842
F-statistic: 21.58 on 2 and 17 DF, p-value: 2.159e-05

We have $s^2 = 0.004544$. For (a)-(c), with confidence level $\alpha = 0.05$, number of points $K = 4$, and $n - 3 = 17$ residual degrees of freedom, the confidence intervals are of the form $\hat y \pm t\,\operatorname{se}(\hat y)$, where $t = t_{1-\alpha/2,\,n-3}$ for (a), $t = t_{1-\alpha/(2K),\,n-3}$ for (b, Bonferroni), and $t = \sqrt{3 F_{1-\alpha,\,3,\,n-3}}$ for (c, Scheffé).

num      ŷ          CI for each               SCI (Bonferroni)          SCI (Scheffé)
1     89.46327   (86.00056, 93.06539)     (84.90900, 94.26181)     (84.42698, 94.79998)
2     94.41991   (90.61474, 98.38488)     (89.41654, 99.70325)     (88.88764, 100.29651)
3     97.08884   (93.73159, 100.56634)    (92.67026, 101.71811)    (92.20111, 102.23568)
4     83.25013   (78.22336, 88.59993)     (76.66263, 90.40369)     (75.97709, 91.21939)

If you want to fit the model without the logarithm transformation, the following would be the answer:

lm(formula = fiber$x1 ~ X2 + X4)

Residuals:
    Min      1Q  Median      3Q     Max
 -9.843  -3.690  -1.934   6.608  10.862

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  95.4000     1.4333  66.557  < 2e-16 ***
X2            1.0706     0.1839   5.823 2.04e-05 ***
X4           -1.0157     0.2692  -3.773  0.00152 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 6.41 on 17 degrees of freedom
Multiple R-Squared: 0.7453, Adjusted R-squared: 0.7154
F-statistic: 24.88 on 2 and 17 DF, p-value: 8.931e-06

and the confidence intervals are:

num      ŷ          CI for each               SCI (Bonferroni)          SCI (Scheffé)
1     89.39622   (85.62869, 93.16375)     (84.40953, 94.38291)     (83.86616, 94.92627)
2     94.74931   (90.85545, 98.64318)     (89.59541, 99.90322)     (89.03382, 100.46481)
3     97.79652   (94.52995, 101.06310)    (93.47290, 102.12014)    (93.00179, 102.59126)
4     83.76871   (78.13549, 89.40194)     (76.31260, 91.22482)     (75.50016, 92.03727)

(d) Each Bonferroni interval is narrower than the corresponding Scheffé interval, so the Bonferroni method is preferred.
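The conclusion in (d) comes down to the sizes of the three critical multipliers used above. A quick R check of those multipliers, at the values used in this exercise ($K = 4$ points, 17 residual degrees of freedom, $\alpha = 0.05$):

## Critical multipliers behind the Ex18 intervals (K = 4 points, 17 df, alpha = 0.05)
a  <- 0.05
K  <- 4
df <- 17
qt(1 - a/2, df)                 # pointwise t multiplier, about 2.11
qt(1 - a/(2 * K), df)           # Bonferroni multiplier, about 2.8
sqrt(3 * qf(1 - a, 3, df))      # Scheffe multiplier (p = 3 parameters), about 3.1
## Bonferroni < Scheffe here, which is why its intervals in the tables are narrower.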

Ch3 Ex19

(a)
Call:
lm(formula = protein ~ L1 + L2 + L3 + L4 + L5 + L6)

Residuals:
      Min        1Q    Median        3Q       Max
-0.397979 -0.126604   -0.0082  0.077451  0.387052

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.3074230  0.9899022   2.331  0.03232 *
L1           0.0281     0.082118    0.342  0.73618
L2           0.001667   0.087162    0.019  0.98497
L3           0.234909   0.077400    3.035  0.00748 **
L4          -0.240445   0.063218   -3.803  0.00142 **
L5           0.011839   0.006126    1.932  0.07014 .
L6          -0.035584   0.045530   -0.782  0.44522
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2203 on 17 degrees of freedom
Multiple R-Squared: 0.9821, Adjusted R-squared: 0.9758
F-statistic: 155.9 on 6 and 17 DF, p-value: 6.654e-14

The residual sum of squares is $17 s^2 = 0.8250455$.

(b) The regression coefficients of (L1, L2, L5, L6) are smaller in magnitude than twice their respective standard errors. Thus we keep only L3 and L4.

lm(formula = protein ~ L3 + L4)

Residuals:
     Min       1Q   Median       3Q      Max
-0.52215 -0.09417  0.02566  0.13763  0.41405

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  3.1174314  0.1308664   23.82  < 2e-16 ***
L3           0.240405   0.009640    24.94  < 2e-16 ***
L4          -0.217101   0.009568   -22.69 2.97e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2477 on 21 degrees of freedom
Multiple R-Squared: 0.9721, Adjusted R-squared: 0.9695
F-statistic: 366.2 on 2 and 21 DF, p-value: < 2.2e-16

The residual sum of squares is $21 s^2 = 1.288461$.

(c) Test $H_0$: model in (b) vs $H_1$: model in (a).

Model 1: protein ~ L3 + L4
Model 2: protein ~ L1 + L2 + L3 + L4 + L5 + L6
  Res.Df     RSS Df Sum of Sq      F Pr(>F)
1     21 1.28884
2     17 0.82534  4   0.46350 2.3868 0.0918

The p-value is 0.0918, not quite small enough to reject $H_0$; the model in (b) is adequate.
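The F statistic reported by anova() in (c) can be reproduced by hand from the two residual sums of squares above; the short check below does exactly that (the RSS values are copied from the output).

## Partial F test of H0 (model in (b)) vs H1 (model in (a)), computed by hand
rss.reduced <- 1.28884    # RSS for protein ~ L3 + L4, 21 residual df
rss.full    <- 0.82534    # RSS for protein ~ L1 + ... + L6, 17 residual df
F.stat <- ((rss.reduced - rss.full) / (21 - 17)) / (rss.full / 17)
F.stat                                  # about 2.387, matching the anova() output
pf(F.stat, 4, 17, lower.tail = FALSE)   # p-value, about 0.092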

(d) With confidence level $\alpha = 0.10$, number of points $K = 6$, $p = 3$, and $n = 24$, the prediction intervals are of the form $\hat y \pm t\,\operatorname{se}(\hat y)$, where $t = t_{1-\alpha/2,\,n-3}$ for (i), $t = t_{1-\alpha/(2K),\,n-3}$ for (ii, Bonferroni), and $t = \sqrt{6 F_{1-\alpha,\,6,\,n-3}}$ for (iii, Scheffé).

num       ŷ          PI for each                SPI (Bonferroni)          SPI (Scheffé)
1      9.801225   (9.365855, 10.236595)     (9.143050, 10.45940)     (8.807162, 10.795288)
2      8.358796   (7.909020, 8.808573)      (7.678842, 9.03875)      (7.331840, 9.385753)
3      9.444302   (9.007920, 9.880684)      (8.784598, 10.10401)     (8.447929, 10.440675)
4     11.831218   (11.341547, 12.320888)    (11.090953, 12.57148)    (10.713172, 12.949263)
5     10.025695   (9.589644, 10.461745)     (9.366491, 10.68490)     (9.030079, 11.021311)
6     10.986116   (10.520573, 11.451660)    (10.282327, 11.68991)    (9.923160, 12.049072)

The Bonferroni intervals are narrower than the Scheffé ones, so Bonferroni is preferable here.

Part C

Ch3 Ex7

(a) The level-$\alpha$ test in the ANOVA model rejects $H_0$ if $F \ge F_{I-1,\,n-I,\,1-\alpha}$. The power function is given by
$$\beta(\delta) = P_\delta(\text{test rejects } H_0) = P(F \ge F_{I-1,\,n-I,\,1-\alpha}), \qquad F \sim \text{noncentral } F_{I-1,\,n-I,\,\delta}.$$
To get $\delta$, use the substitution rule:
$$\delta^2\sigma^2 = \sum n_i(\bar y_i - \bar y)^2 \Big|_{\bar y_i = \theta_i,\ \bar y = \bar\theta} = \sum n_i(\theta_i - \bar\theta)^2.$$
Now, by the following claim, we conclude that the power is maximized when $\delta$ is as large as possible, i.e. when $\sum n_i(\theta_i - \bar\theta)^2$ subject to $\sum n_i = n$ is maximized.

Claim: Let $F_\delta \sim F_{m,n,\delta}$ for some $m, n$. Then $P(F_{\delta_1} \ge a) \ge P(F_{\delta_2} \ge a)$ whenever $\delta_1 \ge \delta_2 \ge 0$. (Heuristic argument: the mass of $F_{m,n,\delta}$ tends to move to the right as $\delta$ increases.)

Proof: Let $Z \sim N(0,1)$, $X \sim \chi^2_{m-1}$, $Y \sim \chi^2_n$ be independent of each other, and write their densities as $f_Z$, $f_X$ and $f_Y$ respectively. Then, provided $P((Z+\delta)^2 \ge a)$ decreases as $\delta$ decreases,
$$P(F_{\delta_1} \ge a) = P\!\left(\frac{\big((Z+\delta_1)^2 + X\big)/m}{Y/n} \ge a\right)
= P\big((Z+\delta_1)^2 + X \ge a^* Y\big)
= \int_0^\infty\!\!\int_0^\infty P\big((Z+\delta_1)^2 \ge a^* y - x\big) f_X(x) f_Y(y)\,dx\,dy$$
$$\ge \int_0^\infty\!\!\int_0^\infty P\big((Z+\delta_2)^2 \ge a^* y - x\big) f_X(x) f_Y(y)\,dx\,dy = P(F_{\delta_2} \ge a),$$
where $a^* = a\,m/n$. Thus it remains to prove that $P((Z+\delta)^2 \ge a)$ decreases as $\delta$ decreases. Equivalently, $P((Z+\delta)^2 \le a)$ increases as $\delta$ decreases, since the event $\{(Z+\delta)^2 \le a\} = \{-\sqrt a - \delta \le Z \le \sqrt a - \delta\}$ tends to cover more of a neighborhood of 0 (where most of the mass of $N(0,1)$ lies) as $\delta$ decreases. One can prove this argument more rigorously.

(b) When $I = 2$ and $n_i = n a_i$ with $a_1 + a_2 = 1$, write
$$f(a_1) = a_1\big(\theta_1 - (a_1\theta_1 + (1-a_1)\theta_2)\big)^2 + (1-a_1)\big(\theta_2 - (a_1\theta_1 + (1-a_1)\theta_2)\big)^2,$$
so that $\sum n_i(\theta_i - \bar\theta)^2 = n f(a_1)$; in fact $f(a_1) = a_1(1-a_1)(\theta_1 - \theta_2)^2$. The fact that $f(a_1)$ is concave with its peak at $a_1 = 1/2$ proves the argument.

(c) Write $v = \theta_3 - \theta_2 = \theta_2 - \theta_1$, and let $B_j = \sum a_i(\theta_i - \bar\theta)^2$, where $j$ indicates allocation scheme (i) or (ii). Since $\delta^2 = n B_j / \sigma^2$, the scheme with the larger $B_j$ gives the more powerful test. Calculation gives $B_{(i)} = \tfrac{2}{3}v^2$ and $B_{(ii)} = v^2$. Thus scheme (ii) gives more power, and if we had $\tfrac{3}{2}n$ instead of $n$ for the total number of samples, the power of scheme (i) would be approximately the same (note that the second, denominator degrees of freedom of the two F distributions would then differ because of the different sample sizes).
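Part (c) can also be illustrated numerically with the noncentral F distribution in R. The sketch below takes scheme (i) to be the equal allocation over the three groups and scheme (ii) to put half of the observations on each of the two extreme groups (the allocations that the values $B_{(i)} = 2v^2/3$ and $B_{(ii)} = v^2$ correspond to); the values of v, sigma, n and alpha are arbitrary illustrative choices.

## Power comparison of the two allocations in Ch3 Ex7(c), using the noncentral F
## Illustrative values only: v = spacing of the equally spaced means, sigma = error sd
v <- 0.5; sigma <- 1; n <- 30; alpha <- 0.05

B1 <- (2/3) * v^2   # scheme (i):  n/3 observations on each of the three groups
B2 <- v^2           # scheme (ii): n/2 observations on each of the two extreme groups
ncp1 <- n * B1 / sigma^2
ncp2 <- n * B2 / sigma^2

## Power = P(noncentral F exceeds the level-alpha critical value); the degrees of
## freedom are held at (2, n - 3) here so that only the noncentrality differs.
crit <- qf(1 - alpha, 2, n - 3)
pf(crit, 2, n - 3, ncp = ncp1, lower.tail = FALSE)   # power under scheme (i)
pf(crit, 2, n - 3, ncp = ncp2, lower.tail = FALSE)   # power under scheme (ii), larger

## With total sample size 3n/2, scheme (i) reaches the same noncentrality as scheme (ii):
(3/2) * n * B1 / sigma^2    # equals ncp2 = n * v^2 / sigma^2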

Appendix: Sample code for Ex18 and Ex19

####
## 3.18
fiber <- read.table("d:/2006-fall/stat664/hw2/fiber2.dat")
names(fiber) <- c("no", "x1", "x2", "x3", "x4")
lfiber <- log(fiber)
names(lfiber) <- c("no", "lx1", "lx2", "lx3", "lx4")

X2 <- fiber$x2 - mean(fiber$x2)
X3 <- fiber$x3 - mean(fiber$x3)
X4 <- fiber$x4 - mean(fiber$x4)

rm <- lm(fiber$x1 ~ X2 + X4)
logrm <- lm(lx1 ~ lx2 + lx4, data = lfiber)
summary(rm)
summary(logrm)

#### confidence intervals for the mean of X1
xc <- t(matrix(ncol = 4, nrow = 3,
               c(75 - mean(fiber$x2), 70 - mean(fiber$x3), 45 - mean(fiber$x4),
                 80 - mean(fiber$x2), 70 - mean(fiber$x3), 45 - mean(fiber$x4),
                 80 - mean(fiber$x2), 75 - mean(fiber$x3), 42 - mean(fiber$x4),
                 65 - mean(fiber$x2), 80 - mean(fiber$x3), 40 - mean(fiber$x4))))
xc <- data.frame(xc[, c(1, 3)])
names(xc) <- c("X2", "X4")

a <- 0.05    ## confidence level 1 - a
K <- 4       ## number of points
p <- predict(rm, xc, se.fit = TRUE)
PI_each <- cbind(p$fit - qt(1 - a/2, rm$df.residual) * p$se.fit,
                 p$fit + qt(1 - a/2, rm$df.residual) * p$se.fit)
PI_Bonf <- cbind(p$fit - qt(1 - a/(2*K), rm$df.residual) * p$se.fit,
                 p$fit + qt(1 - a/(2*K), rm$df.residual) * p$se.fit)
PI_Sche <- cbind(p$fit - sqrt(3 * qf(1 - 0.05, 3, rm$df.residual)) * p$se.fit,
                 p$fit + sqrt(3 * qf(1 - 0.05, 3, rm$df.residual)) * p$se.fit)

#### confidence intervals for the mean of log X1
xc <- t(matrix(ncol = 4, nrow = 3, c(75, 70, 45,
                                     80, 70, 45,
                                     80, 75, 42,
                                     65, 80, 40)))
logxc <- log(xc)
xc <- data.frame(logxc[, c(1, 3)])
names(xc) <- c("lx2", "lx4")

a <- 0.05    ## confidence level 1 - a
K <- 4       ## number of points
p <- predict(logrm, xc, se.fit = TRUE)
PI_each <- cbind(p$fit - qt(1 - a/2, logrm$df.residual) * p$se.fit,
                 p$fit + qt(1 - a/2, logrm$df.residual) * p$se.fit)
PI_Bonf <- cbind(p$fit - qt(1 - a/(2*K), logrm$df.residual) * p$se.fit,
                 p$fit + qt(1 - a/(2*K), logrm$df.residual) * p$se.fit)
PI_Sche <- cbind(p$fit - sqrt(3 * qf(1 - 0.05, 3, logrm$df.residual)) * p$se.fit,
                 p$fit + sqrt(3 * qf(1 - 0.05, 3, logrm$df.residual)) * p$se.fit)

####
## 3.19
gw <- read.table("d:/2007-fall/664 - solution/hw2/protein.dat")
names(gw) <- c("no", "protein", "L1", "L2", "L3", "L4", "L5", "L6")
attach(gw)
# par(mfrow = c(2, 3))

#### (a) Fit the linear model with all covariates
fm <- lm(protein ~ L1 + L2 + L3 + L4 + L5 + L6)
summary(fm)

#### (b)
## The (L1, L2, L5, L6) regression coefficients are smaller in magnitude
## than twice their standard errors, respectively
rm <- lm(protein ~ L3 + L4)
summary(rm)

#### (c)
anova(rm, fm)
## H0: rm
## H1: fm
## p-value is 0.0918; not quite small enough to reject H0 (rm)
## model in (b) (rm) is good

#### prediction intervals for rm in (b)
a <- 0.10    ## 1 - a prediction intervals
K <- 6       ## number of points
pt <- read.table("d:/2007-fall/664 - solution/hw2/reflect.dat")
pt_rm <- pt[, 4:5]
names(pt_rm) <- c("L3", "L4")

#### (d)-i
p <- predict(rm, data.frame(pt_rm), se.fit = TRUE)                     ## prediction
sep <- p$residual.scale * sqrt(1 + (p$se.fit / p$residual.scale)^2)    ## se for PI
p$fit
PI_each <- cbind(p$fit - qt(1 - a/2, rm$df.residual) * sep,
                 p$fit + qt(1 - a/2, rm$df.residual) * sep)
PI_Bonf <- cbind(p$fit - qt(1 - a/(2*K), rm$df.residual) * sep,
                 p$fit + qt(1 - a/(2*K), rm$df.residual) * sep)
PI_Sche <- cbind(p$fit - sqrt(6 * qf(1 - 0.05, 6, rm$df.residual)) * sep,
                 p$fit + sqrt(6 * qf(1 - 0.05, 6, rm$df.residual)) * sep)

#### #319 gw <- readtable("d:/2007-fall/664 - solution/hw2/proteindat") names(gw) <- c("no","protein","l1","l2","l3","l4","l5","l6") attach(gw) #par(mfrow=c(2,3)) #### (a) Fit linear model with all covariates fm <- lm(protein~l1+l2+l3+l4+l5+l6) summary(fm) ####(b) ## (L1, L2, L5, L6) regression coefficients of which are smaller in magnitude ## than twice their standard errors resp rm <- lm(protein~l3+l4) summary(rm) ####(c) anova(rm,fm) ## H0 : rm ## H1 : fm ## p-value is 00918; not quite small enough to reject H0(rm) ## model in (b) (rm) is good #### confidence interval for rm in (b) a <- 010 ## 1-a prediction intervals K <- 6 ## number of points pt <- readtable("d:/2007-fall/664 - solution/hw2/reflectdat") pt_rm<-pt[,4:5] names(pt_rm)<-c("l3","l4") ####(d)-i p <- predict(rm, dataframe(pt_rm),sefit=t) ## prediction sep <- p$residualscale * sqrt(1+(p$sefit/p$residualscale)^2) ##se for PI p$fit PI_each <- cbind(p$fit - qt(1-a/2,rm$dfresidual)*sep, p$fit + qt(1-a/2,rm$dfresidual)*sep) PI_Bonf <- cbind(p$fit - qt(1-a/(2*k),rm$dfresidual)*sep, p$fit + qt(1-a/(2*k),rm$dfresidual)*sep) PI_Sche <- cbind(p$fit - sqrt(6*qf(1-005,6,rm$dfresidual))*sep, p$fit + sqrt(6*qf(1-005,6,rm$dfresidual))*sep) 8