Chapter 2 Inferences in Regression and Correlation Analysis


Chapter 2: Inferences in Regression and Correlation Analysis
許湘伶 (hsuhl, NUK)
Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li)

Topics covered in this chapter:
- inferences concerning the regression parameters β0 and β1: interval estimation and tests
- interval estimation of E{Y}, the mean of the probability distribution of Y for given X
- prediction intervals for a new observation Y
- confidence bands for the regression line
- the analysis of variance approach to regression analysis
- the general linear test approach
- a descriptive measure of association: the correlation coefficient

Normal error regression model
Assume that the normal error regression model is applicable:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
- β0 and β1 are parameters
- the Xi are known constants
- the εi ~ N(0, σ²) are independent
- hence Yi and Yj, i ≠ j, are uncorrelated

Inferences Concerning β1
Inferences about the slope β1 of the regression line.
Illustration: a market research analyst studying the relation between sales (Y) and advertising expenditures (X) may wish to obtain an interval estimate of β1, since it provides information on how many additional sales dollars are generated, on average, by an additional dollar of advertising expenditure, and to test H0: β1 = 0 vs. Ha: β1 ≠ 0.

Inferences Concerning β1 (cont.)
Figure 1: Regression model when β1 = 0.
β1 = 0 means there is no linear association between Y and X:
- the regression line is horizontal
- the means of Y are E{Y} = β0
- the probability distributions of Y are identical at all levels of X

Sampling Distribution of b1
Point estimator:
$b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$
The sampling distribution of b1 refers to the different values of b1 that would be obtained with repeated sampling when the levels of the predictor variable X are held constant from sample to sample.
Simulation model: $Y_i = 5 + 2X_i + \varepsilon_i$, with εi i.i.d. N(0, 1) and Xi ∈ {−5, −4.9, ..., 4.9, 5}; see the sketch below.
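A minimal R sketch of this repeated-sampling idea (the object names are illustrative, not from the text): hold the X levels fixed, draw many samples from the stated model, and compare the empirical mean and variance of b1 with the theory that follows.

set.seed(1)                                # any seed; for reproducibility
X <- seq(-5, 5, by = 0.1)                  # fixed X levels from the slide
nsim <- 10000
b1.sim <- replicate(nsim, {
  Y <- 5 + 2 * X + rnorm(length(X))        # Y = 5 + 2X + N(0,1) errors
  sum((X - mean(X)) * (Y - mean(Y))) / sum((X - mean(X))^2)
})
mean(b1.sim)                               # close to beta1 = 2
c(var(b1.sim), 1 / sum((X - mean(X))^2))   # close to sigma^2 / Sxx with sigma^2 = 1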

Sampling Distribution of b1: point estimator b1 (cont.)
The sampling distribution of b1 (derived using the properties of the ki below). For the normal error regression model:
- Mean: E{b1} = β1
- Variance: $\sigma^2\{b_1\} = \frac{\sigma^2}{\sum (X_i - \bar{X})^2}$
b1 is a linear combination of the Yi:
$b_1 = \sum k_i Y_i, \qquad k_i = \frac{X_i - \bar{X}}{\sum (X_i - \bar{X})^2}$
where each ki is a function of the Xi.

Sampling Distribution of b1: point estimator b1 (cont.)
Properties of the ki (checked numerically below):
$\sum k_i = 0, \qquad \sum k_i X_i = 1, \qquad \sum k_i^2 = \frac{1}{\sum (X_i - \bar{X})^2}$
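These identities are easy to verify numerically; a short sketch (the X values are arbitrary illustrative numbers):

X <- c(20, 30, 50, 70, 90)                   # any set of X levels
k <- (X - mean(X)) / sum((X - mean(X))^2)
sum(k)                                       # = 0
sum(k * X)                                   # = 1
c(sum(k^2), 1 / sum((X - mean(X))^2))        # equal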

Sampling Distribution of b1: normality
The Yi are independently, normally distributed, and a linear combination of independent normal random variables is normally distributed; hence b1 is normally distributed.

Sampling Distribution of b1: normality (cont.)
Properties of the normal sampling distribution:
- Mean: E{b1} = β1
- Variance: $\sigma^2\{b_1\} = \sigma^2 \cdot \frac{1}{\sum (X_i - \bar{X})^2}$

Sampling Distribution of b1 (cont.)
The unbiased estimator of σ²:
$MSE = \frac{SSE}{n-2} = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-2}$
Estimated variance of b1:
$s^2\{b_1\} = \frac{MSE}{\sum (X_i - \bar{X})^2}$
The point estimator s²{b1} is an unbiased estimator of σ²{b1}; the point estimator of σ{b1} is s{b1}.

b1 is the minimum-variance unbiased linear estimator
Theorem 1. The estimator b1 has minimum variance among all unbiased linear estimators of the form
$\hat{\beta}_1 = \sum c_i Y_i$
where the ci are arbitrary constants. Unbiasedness, E{β̂1} = β1, requires Σci = 0 and Σci Xi = 1, and $\sigma^2\{\hat{\beta}_1\} = \sigma^2 \sum c_i^2$.
Write ci = ki + di, with ki = (Xi − X̄)/Σ(Xi − X̄)² and the di arbitrary constants. Then Σki di = 0, so
$\sigma^2\{\hat{\beta}_1\} = \sigma^2 \sum k_i^2 + \sigma^2 \sum d_i^2 = \sigma^2\{b_1\} + \sigma^2 \sum d_i^2,$
which is at a minimum when Σdi² = 0, i.e., all di = 0 and ci = ki.

Sampling distribution of (b1 − β1)/s{b1}
Theorem 2.
$\frac{b_1 - \beta_1}{s\{b_1\}} \sim t(n-2)$
Compare: if the observations Yi come from the same normal population, (Ȳ − µ)/s{Ȳ} ~ t(n−1). Here two parameters, β0 and β1, are estimated, so two degrees of freedom are lost.

Sampling distribution of (b1 − β1)/s{b1} (cont.)
The proof that (b1 − β1)/s{b1} ~ t(n−2) rests on:
Theorem 3. For regression model (2.1),
$SSE/\sigma^2 \sim \chi^2(n-2),$
and SSE is independent of b0 and b1.
Remark: this is more easily derived using the matrix form. Key: b = (X'X)⁻¹X'Y and the residuals e = (I − H)Y are built from orthogonal linear forms, since
$[(X'X)^{-1}X'] \, (\sigma^2 I) \, [(I - H)/\sigma^2] = 0.$

Sampling distribution of (b1 − β1)/s{b1} (cont.)
Theorem 4 (t distribution). Let Z and χ²(ν) be independent random variables (standard normal and chi-square). Then
$t(\nu) = \frac{Z}{\sqrt{\chi^2(\nu)/\nu}}$
is a t random variable with ν degrees of freedom.

Sampling distribution of (b1 − β1)/s{b1} (cont.)
1. (b1 − β1)/σ{b1} ~ Z, a standard normal variable.
2. $\frac{s^2\{b_1\}}{\sigma^2\{b_1\}} \sim \frac{\chi^2(n-2)}{n-2}$
3. $\frac{b_1 - \beta_1}{s\{b_1\}} = \frac{b_1 - \beta_1}{\sigma\{b_1\}} \div \frac{s\{b_1\}}{\sigma\{b_1\}}$
4. Z is a function of b1, and b1 is independent of SSE/σ² ~ χ²(n−2); hence Z and the χ² variable are independent, so
$\frac{b_1 - \beta_1}{s\{b_1\}} \sim \frac{Z}{\sqrt{\chi^2(n-2)/(n-2)}} \sim t(n-2)$

Confidence interval for β1
Since (b1 − β1)/s{b1} ~ t(n−2),
$P\{t(\alpha/2; n-2) \le (b_1 - \beta_1)/s\{b_1\} \le t(1-\alpha/2; n-2)\} = 1 - \alpha$
$P\{b_1 - t(1-\alpha/2; n-2)\, s\{b_1\} \le \beta_1 \le b_1 + t(1-\alpha/2; n-2)\, s\{b_1\}\} = 1 - \alpha$
where t(α/2; n−2) is the 100(α/2) percentile of the t distribution with n−2 d.f. By symmetry, t(α/2; n−2) = −t(1−α/2; n−2).

Confidence interval for β1 (cont.)
The 1 − α confidence limits for β1 are:
$b_1 \pm t(1-\alpha/2; n-2)\, s\{b_1\}$

Ex: Confidence interval for β1
The Toluca Company: an estimate of β1 with a 95 percent confidence coefficient.
#### Example p46
toluca<-read.table("toluca.txt",header=T)
attach(toluca)
## method 1
n<-length(Size)
alpha<-0.05
tdf<-qt(1-alpha/2,n-2)
Sxx<-sum((Size-mean(Size))^2)
b1<-sum((Size-mean(Size))*(Hrs-mean(Hrs)))/Sxx
b0<-mean(Hrs)-b1*mean(Size)
c(b0,b1)
SSE<-sum((Hrs-(b0+b1*Size))^2)
MSE<-SSE/(n-2)
sb1<-sqrt(MSE/Sxx)  #s{b1}
conf<-c(b1-tdf*sb1,b1+tdf*sb1)
conf
## method 2
fitreg<-lm(Hrs~Size)
summary(fitreg)
confint(fitreg)

Ex: Confidence interval for β1 (cont.)
$s^2\{b_1\} = \frac{MSE}{\sum (X_i - \bar{X})^2} = \frac{2{,}384}{19{,}800} = 0.1204, \qquad s\{b_1\} = 0.3470$
t(0.975; 23) = 2.069. The 95 percent confidence interval:
$2.85 \le \beta_1 \le 4.29$


Tests concerning β1
Since (b1 − β1)/s{b1} ~ t(n−2):
Two-Sided Test. H0: β1 = 0 vs. Ha: β1 ≠ 0.
Test statistic: $t^* = \frac{b_1}{s\{b_1\}}$
Decision rule (at level of significance α):
- If |t*| ≤ t(1−α/2; n−2), conclude H0.
- If |t*| > t(1−α/2; n−2), conclude Ha.

Tests concerning β1 (cont.)
α = 0.05; n = 25; t(0.975; 23) = 2.069; b1 = 3.5702; s{b1} = 0.3470.
$t^* = \frac{3.5702}{0.3470} = 10.29 > 2.069 \;\Rightarrow\; \text{conclude } H_a: \beta_1 \ne 0$
p-value: $2P\{t(23) > 10.29\} < 0.0001$


Tests concerning β1: One-Sided Test
H0: β1 ≤ 0 vs. Ha: β1 > 0.
Test statistic: $t^* = \frac{b_1}{s\{b_1\}}$
Decision rule (at level of significance α):
- If t* ≤ t(1−α; n−2), conclude H0.
- If t* > t(1−α; n−2), conclude Ha.

Tests concerning β1: Two-Sided Test of β1 = β10
H0: β1 = β10 vs. Ha: β1 ≠ β10.
Test statistic: $t^* = \frac{b_1 - \beta_{10}}{s\{b_1\}}$
Decision rule (at level of significance α):
- If |t*| ≤ t(1−α/2; n−2), conclude H0.
- If |t*| > t(1−α/2; n−2), conclude Ha.
A short R sketch of this general test follows.
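summary() and confint() in R cover only β10 = 0; for a general β10 the statistic can be formed directly. A sketch, reusing b1, sb1, and n from the Toluca code above (the value of β10 is illustrative):

b10 <- 3                                  # hypothesized slope (illustrative value)
tstar <- (b1 - b10)/sb1
crit <- qt(1 - 0.05/2, n - 2)
abs(tstar) > crit                         # TRUE -> conclude Ha
2 * pt(-abs(tstar), n - 2)                # two-sided p-value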

Inferences concerning β0: sampling distribution of b0
Relevant when the scope of the model includes X = 0. The point estimator:
$b_0 = \bar{Y} - b_1 \bar{X}$
Theorem 5. For regression model (2.1), the sampling distribution of b0 is normal, with
$E\{b_0\} = \beta_0, \qquad \sigma^2\{b_0\} = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2} \right]$
(using Cov(Ȳ, b1) = 0).

Sampling distribution of b0 (cont.)
An estimator of σ²{b0}:
$s^2\{b_0\} = MSE \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2} \right]$
and s{b0} estimates σ{b0}.
Theorem 6.
$\frac{b_0 - \beta_0}{s\{b_0\}} \sim t(n-2)$

Confidence interval for β0
The 1 − α confidence limits for β0 are:
$b_0 \pm t(1-\alpha/2; n-2)\, s\{b_0\}$
## method 1
sb0<-sqrt(MSE*(1/n+mean(Size)^2/sum((Size-mean(Size))^2)))  #s{b0}
b0<-mean(Hrs)-b1*mean(Size)
tdf0<-qt(1-0.1/2,n-2)
conf0<-c(b0-tdf0*sb0,b0+tdf0*sb0)
conf0
## method 2
fitreg<-lm(Hrs~Size)
confint(fitreg,level = 0.9)

Example: C.I. for β0
Toluca Company. Range of X: [20, 120]. t(0.95; 23) = 1.714.
$s^2\{b_0\} = MSE\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum(X_i-\bar{X})^2}\right] = 2{,}384\left[\frac{1}{25} + \frac{70^2}{19{,}800}\right] = 685.34, \qquad s\{b_0\} = 26.18$
The 90% C.I. for β0: $17.5 \le \beta_0 \le 107.2$
Remark: this C.I. does not necessarily provide meaningful information, since the scope of the model does not include X = 0.

Example: C.I. for β0 (cont.)
It does not necessarily provide information about the setup cost, since we are not certain whether a linear regression model is appropriate when the scope of the model is extended to X = 0.

Example: C.I. for β0 (cont.)
R codes:
> toluca<-read.table("toluca.txt",header=T)
> fitreg<-lm(Hrs~Size,data=toluca)
> summary(fitreg)

Call:
lm(formula = Hrs ~ Size, data = toluca)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  62.366     26.177    2.382   0.0259 *
Size          3.5702     0.3470  10.290 4.45e-10 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 48.82 on 23 degrees of freedom
Multiple R-squared: 0.8215, Adjusted R-squared: 0.8138
F-statistic: 105.9 on 1 and 23 DF, p-value: 4.449e-10

> confint(fitreg,level = 0.95)
                2.5 %   97.5 %
(Intercept)   8.21    116.52
Size          2.85      4.29

Example: C.I. for β0 (cont.)
R codes:
> ## Plot regression line
> plot(toluca$Size,toluca$Hrs,pch=16,xlab="Size",ylab="Hours")
> abline(fitreg,col="red")

Some considerations on making inferences concerning β0 and β1: departures from normality
If the probability distributions of Y are not exactly normal but do not depart seriously from normality, the sampling distributions of b0 and b1 will be approximately normal, and the use of the t distribution will provide approximately the specified confidence coefficient or level of significance.

Departures from normality (cont.)
Even if the distributions of Y are far from normal, the estimators b0 and b1 generally have the property of asymptotic normality: their distributions approach normality under very general conditions as the sample size increases.

Power of tests
H0: β1 = β10 vs. Ha: β1 ≠ β10, with test statistic t* = (b1 − β10)/s{b1}.
The power of this test at level α is the probability that the decision rule leads to conclusion Ha when Ha holds:
$\text{Power} = P\{|t^*| > t(1-\alpha/2; n-2) \mid \delta\}$
where δ is the noncentrality measure, i.e., a measure of how far the true value of β1 is from β10:
$\delta = \frac{|\beta_1 - \beta_{10}|}{\sigma\{b_1\}}$
(tabulated in Appendix Table B.5; an R sketch follows).
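In R the power can be computed from the noncentral t distribution rather than Table B.5; a sketch under the assumption that σ{b1} is known (here the Toluca s{b1} = 0.3470 stands in for it, and β10 = 3 is an illustrative value):

alpha <- 0.05; n <- 25
delta <- abs(3.5702 - 3) / 0.3470            # noncentrality measure delta
crit <- qt(1 - alpha/2, n - 2)
1 - pt(crit, n - 2, ncp = delta) + pt(-crit, n - 2, ncp = delta)   # power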

Common objective
To estimate the mean of one or more probability distributions of Y.
Illustration: a study of the relation between level of piecework pay (X) and worker productivity (Y). The mean productivity at high and medium levels of piecework pay may be of particular interest for analyzing the benefits obtained from an increase in the pay.

Notation
Xh: the level of X for which we wish to estimate the mean response; it may be a value that occurred in the sample, or another value of the predictor variable within the scope of the model.
E{Yh}: the mean response when X = Xh.
Ŷh: the point estimator of E{Yh}:
$\hat{Y}_h = b_0 + b_1 X_h$
What is the sampling distribution of Ŷh?

Sampling distribution of Ŷh
For normal error regression model (2.1), the sampling distribution of Ŷh is normal, with
$E\{\hat{Y}_h\} = E\{Y_h\}, \qquad \sigma^2\{\hat{Y}_h\} = \sigma^2 \left[ \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]$
Ŷh is a linear combination of the observations Yi, and is an unbiased estimator of E{Yh}.

Sampling distribution of Ŷh (cont.)
Figure 2: Effect on Ŷh of variation in b1 from sample to sample, in two samples with the same means Ȳ and X̄.

Properties of Ŷh
- When Xh = 0, Var(Ŷh) = Var(b0) and s²{Ŷh} = s²{b0}.
- b1 and Ȳ are uncorrelated: σ{Ȳ, b1} = 0.

Properties of Ŷh (cont.)
When MSE is substituted for σ², the estimated variance of Ŷh is
$s^2\{\hat{Y}_h\} = MSE \left[ \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2} \right]$
The estimated standard deviation of Ŷh is s{Ŷh}.

Interval Estimation of E{Yh}
Theorem 7.
$\frac{\hat{Y}_h - E\{Y_h\}}{s\{\hat{Y}_h\}} \sim t(n-2)$
The 1 − α confidence limits for E{Yh} are:
$\hat{Y}_h \pm t(1-\alpha/2; n-2)\, s\{\hat{Y}_h\}$

Example 1
Toluca Company example: find a 90 percent confidence interval for E{Yh} when the lot size is Xh = 65 units.
$\hat{Y}_h = 62.37 + 3.5702(65) = 294.4$
$s^2\{\hat{Y}_h\} = 2{,}384\left[\frac{1}{25} + \frac{(65-70)^2}{19{,}800}\right] = 98.37, \qquad s\{\hat{Y}_h\} = 9.918$
t(0.95; 23) = 1.714, so $277.4 \le E\{Y_h\} \le 311.4$
Conclusion: the mean number of work hours required when lots of 65 units are produced is somewhere between 277.4 and 311.4 hours; the estimate of the mean number of work hours is moderately precise.


Code for Example 1
#### Example p54
toluca<-read.table("toluca.txt",header=T)
attach(toluca)
## method 1
n<-length(Size)
alpha<-0.1
tdf90<-qt(1-alpha/2,n-2)
Sxx<-sum((Size-mean(Size))^2)
b1<-sum((Size-mean(Size))*(Hrs-mean(Hrs)))/Sxx
b0<-mean(Hrs)-b1*mean(Size)
c(b0,b1)
SSE<-sum((Hrs-(b0+b1*Size))^2)
MSE<-SSE/(n-2)
sb1<-sqrt(MSE/Sxx)  #s{b1}
Xh<-65
hYh<-b0+b1*Xh
sYh<-sqrt(MSE*(1/n+(Xh-mean(Size))^2/Sxx))

Code for Example 1 (cont.)
EYhCI<-c(hYh-tdf90*sYh,hYh+tdf90*sYh)
## method 2
fitreg<-lm(Hrs~Size,data=toluca)
summary(fitreg)
predXCI<-predict(fitreg,data.frame(Size = c(Xh)),
  interval = "confidence", se.fit = F, level = 0.9)
plot(Size, Hrs)
abline(fitreg,col="blue")
# now the confidence interval for X_h = specific level
points(Xh, predXCI[, "fit"],col="red",pch=15)
points(Xh, predXCI[, "lwr"], lty = "dotted",col="red")
points(Xh, predXCI[, "upr"], lty = "dotted",col="red")

Example 2
Toluca Company example: estimate E{Yh} for lots with Xh = 100 units with a 90 percent confidence interval.
$\hat{Y}_h = 62.37 + 3.5702(100) = 419.4$
$s^2\{\hat{Y}_h\} = 2{,}384\left[\frac{1}{25} + \frac{(100-70)^2}{19{,}800}\right] = 203.72, \qquad s\{\hat{Y}_h\} = 14.27$
t(0.95; 23) = 1.714, so $394.9 \le E\{Y_h\} \le 443.9$
Conclusion: this confidence interval is somewhat wider than that of Example 1, since the Xh level here is substantially farther from the mean X̄ = 70 than Xh = 65 was.


Remarks on the sampling distribution of Ŷh
- The variance of Ŷh is smallest when Xh = X̄. In an experiment to estimate the mean response at a particular level Xh of the predictor variable, the precision of the estimate will be greatest if the observations on X are spaced so that X̄ = Xh.
- The usual relationship between confidence intervals and tests applies in inferences concerning the mean response: the two-sided confidence limits can be utilized for two-sided tests concerning the mean response at Xh. Alternatively, a regular decision rule can be set up.
- The confidence limits for a mean response E{Yh} are not sensitive to moderate departures from the assumption that the error terms are normally distributed.

Prediction of New Observation
- Toluca Company: the next lot to be produced consists of 100 units; the company wishes to predict the number of work hours for this particular lot.
- A company has estimated the regression relation between company sales and number of persons 16 or more years old from data for the past 10 years; it wishes to predict next year's company sales.

Prediction of New Observation: college admissions example
The relevant parameters of the regression model are known:
β0 = 0.10, β1 = 0.95, so E{Y} = 0.10 + 0.95X, with σ = 0.12.
For an applicant whose high school GPA is Xh = 3.5:
$E\{Y_h\} = 0.10 + 0.95(3.5) = 3.425$
Prediction limits E{Yh} ± 3σ: 3.425 ± 3(0.12), i.e.
$3.065 \le Y_{h(new)} \le 3.785$

Prediction of New Observation (cont.)
The basic idea of a prediction interval is to choose a range in the distribution of Y wherein most of the observations will fall, and then to declare that the next observation will fall in this range.
When the regression parameters of normal error regression model (2.1) are known, the 1 − α prediction limits for Yh(new) are:
$E\{Y_h\} \pm z(1-\alpha/2)\, \sigma$
(a one-line R check appears below).
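Since the parameters in the GPA illustration are treated as known, the limits involve only normal quantiles; a one-line R check (3.425 and 0.12 come from the example above):

EYh <- 0.10 + 0.95 * 3.5                       # = 3.425
EYh + c(-1, 1) * 3 * 0.12                      # the slide's E{Yh} +/- 3*sigma limits
EYh + c(-1, 1) * qnorm(1 - 0.05/2) * 0.12      # z-based 1 - alpha limits, alpha = 0.05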

Prediction Interval for Yh(new) when Parameters Unknown
Figure 2.5: Prediction of Yh(new) when parameters unknown.

Prediction Interval for Yh(new) when Parameters Unknown (cont.)
Since we cannot be certain of the location of the distribution of Y, prediction limits for Yh(new) must take account of two elements (Figure 2.5):
- variation in the possible location of the distribution of Y
- variation within the probability distribution of Y
Prediction limits for a new observation Yh(new) at a given Xh are obtained from:
Theorem 8. For normal error regression model (2.1),
$\frac{Y_{h(new)} - \hat{Y}_h}{s\{pred\}} \sim t(n-2)$

Prediction Interval for Yh(new) when Parameters Unknown (cont.)
The 1 − α prediction limits for Yh(new):
$\hat{Y}_h \pm t(1-\alpha/2; n-2)\, s\{pred\}$
The difference Yh(new) − Ŷh may be viewed as the prediction error, with Ŷh serving as the best point estimate of the value of the new observation Yh(new). The variance of the prediction error is
$\sigma^2\{pred\} = \sigma^2\{Y_{h(new)} - \hat{Y}_h\} = \sigma^2 + \sigma^2\{\hat{Y}_h\}$

Prediction Interval for Yh(new) when Parameters Unknown (cont.)
σ²{pred} has two components:
- the variance of the distribution of Y at X = Xh: σ²
- the variance of the sampling distribution of Ŷh: σ²{Ŷh}
An unbiased estimator of σ²{pred}:
$s^2\{pred\} = MSE + s^2\{\hat{Y}_h\} = MSE\left[1 + \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2}\right]$

Example
Toluca Company: Xh = 100; 90 percent prediction interval. t(0.95; 23) = 1.714, Ŷh = 419.4, s²{Ŷh} = 203.72, MSE = 2,384.
$s^2\{pred\} = 2{,}384 + 203.72 = 2{,}587.72, \qquad s\{pred\} = 50.87$
The 90 percent prediction interval for Yh(new):
$332.2 \le Y_{h(new)} \le 506.6$

## Example (p59)
fitreg<-lm(Hrs~Size,data=toluca)
Xh<-100
fitnew<-predict.lm(fitreg,data.frame(Size = c(Xh)), se.fit = T, level = 0.9)
s2pred<-fitnew$se.fit^2+fitnew$residual.scale^2
c(s2pred,sqrt(s2pred))

Remarks
This prediction interval is rather wide and may not be useful for planning worker requirements for the next lot. The interval can still be useful for control purposes: if the actual work hours fall outside the prediction limits, this is an alert that a change in the production process may have occurred.

Comparing Yh(new) and E{Yh}
Toluca Company: the prediction interval for Yh(new) is wider than the C.I. for E{Yh}. In predicting the work hours required for a new lot, we encounter the variability in Ŷh from sample to sample as well as the lot-to-lot variation within the probability distribution of Y.
By (2.38a), the prediction interval is wider the further Xh is from X̄.

Prediction of Mean of m New Observations for Given Xh
To predict the mean of m new observations on Y for a given Xh:
Ȳh(new): the mean of the m new observations to be predicted; the new Y observations are assumed independent.
The appropriate 1 − α prediction limits are:
$\hat{Y}_h \pm t(1-\alpha/2; n-2)\, s\{predmean\}$
$s^2\{predmean\} = \frac{MSE}{m} + s^2\{\hat{Y}_h\} = MSE\left[\frac{1}{m} + \frac{1}{n} + \frac{(X_h - \bar{X})^2}{\sum (X_i - \bar{X})^2}\right]$

Prediction of Mean of m New Observations for Given Xh (cont.)
Two components of s²{predmean}:
- the variance of the mean of m observations from the probability distribution of Y at X = Xh
- the variance of the sampling distribution of Ŷh

Prediction of Mean of m New Observations for Given Xh (cont.)
Example. Toluca Company: Xh = 100; 90 percent prediction interval for the mean number of work hours Ȳh(new) in three new production runs.
Previous results: Ŷh = 419.4; s²{Ŷh} = 203.72; MSE = 2,384; t(0.95; 23) = 1.714.
$s^2\{predmean\} = \frac{2{,}384}{3} + 203.72 = 998.4, \qquad s\{predmean\} = 31.60$

Prediction of Mean of m New Observations for Given Xh (cont.)
The prediction interval for Ȳh(new):
$365.2 \le \bar{Y}_{h(new)} \le 473.6$
The total number of work hours for the three runs:
$1{,}095.6 \le \text{Total work hours} \le 1{,}420.8$
(A direct R computation is sketched below.)
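There is no built-in predict() option for the mean of m new observations, so the interval is assembled by hand; a sketch reusing fitnew from the predict.lm() call above:

m <- 3
s2predmean <- fitnew$residual.scale^2/m + fitnew$se.fit^2   # MSE/m + s^2{Yhat_h}
spredmean <- sqrt(s2predmean)                               # about 31.6
fitnew$fit + c(-1, 1) * qt(0.95, 23) * spredmean            # 90% limits for the mean of 3 runs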

Geometric interpretation
- Y is a vector in R^n; the columns of X define a vector space range(X).
- Xβ is an arbitrary vector in range(X); the fitted vector Xb is the orthogonal projection of Y onto range(X):
$X'(Y - Xb) = 0$
Figure 4: Geometric interpretation of OLS. A numeric check of this orthogonality condition is sketched below.
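A numeric check of the orthogonality condition for the Toluca fit (a sketch; fitreg is the lm() fit from earlier, and model.matrix() returns the X matrix):

Xmat <- model.matrix(fitreg)      # columns: intercept, Size
e <- residuals(fitreg)            # e = Y - Xb
t(Xmat) %*% e                     # X'(Y - Xb): numerically zero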


Analysis of Variance Approach to Regression Analysis: partition of total sum of squares
The analysis of variance approach is based on the partitioning of sums of squares and degrees of freedom associated with Y. The variation in the Yi is measured in terms of their deviations around their mean:
$Y_i - \bar{Y}$


Partition of Total Sum of Squares (cont.)
Total variation (SSTO, total sum of squares):
$SSTO = \sum (Y_i - \bar{Y})^2$
If all Yi are the same, SSTO = 0; the greater the variation among the Yi, the larger is SSTO.
Error sum of squares (SSE):
$SSE = \sum (Y_i - \hat{Y}_i)^2$
If all Yi fall on the fitted regression line, SSE = 0; the greater the variation of the Yi around the fitted regression line, the larger is SSE.

Partition of Total Sum of Squares (cont.)
Regression sum of squares (SSR):
$SSR = \sum (\hat{Y}_i - \bar{Y})^2$
If the regression line is horizontal, SSR = 0; otherwise SSR > 0. SSR is a measure associated with the regression line: the larger SSR is in relation to SSTO, the greater is the effect of the regression relation in accounting for the total variation in the Yi observations.

Formal Development of Partitioning
The total deviation decomposes into two components:
$Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$
i.e., total deviation = deviation of the fitted regression value around the mean + deviation around the fitted regression line:
- the deviation of the fitted value Ŷi around the mean Ȳ
- the deviation of the observation Yi around the fitted regression line

Formal Development of Partitioning (cont.)
$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$
i.e., SSTO = SSR + SSE, since the cross-product term $2\sum (\hat{Y}_i - \bar{Y})(Y_i - \hat{Y}_i) = 0$. Also
$SSR = b_1^2 \sum (X_i - \bar{X})^2$
Degrees of freedom (df):
SS   | df  | explanation
SSTO | n−1 | one constraint: Σ(Yi − Ȳ) = 0
SSR  | 1   |
SSE  | n−2 | β0, β1 estimated in Ŷi
(A numeric check of the partition is sketched below.)
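The partition can be confirmed numerically for the Toluca fit; a sketch reusing fitreg, b1, and the attached Size and Hrs from the earlier code:

Yhat <- fitted(fitreg)
SSTO <- sum((Hrs - mean(Hrs))^2)
SSR <- sum((Yhat - mean(Hrs))^2)
SSE <- sum((Hrs - Yhat)^2)
c(SSTO, SSR + SSE)                            # equal
c(SSR, b1^2 * sum((Size - mean(Size))^2))     # equal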

Mean Squares
A mean square (MS) is a sum of squares divided by its associated df:
SS   | df  | MS
SSR  | 1   | MSR = SSR/1 (regression mean square)
SSE  | n−2 | MSE = SSE/(n−2) (error mean square)
Mean squares are not additive:
$\frac{SSTO}{n-1} \ne MSR + MSE$

ANOVA table
Table 1: ANOVA table for simple linear regression.
Source of Variation | SS                  | df  | MS              | E{MS}
Regression          | SSR = Σ(Ŷi − Ȳ)²    | 1   | MSR = SSR/1     | σ² + β1² Σ(Xi − X̄)²
Error               | SSE = Σ(Yi − Ŷi)²   | n−2 | MSE = SSE/(n−2) | σ²
Total               | SSTO = Σ(Yi − Ȳ)²   | n−1 |                 |

ANOVA table (cont.)
Modified ANOVA table:
$SSTO = \sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - n\bar{Y}^2$
SSTOU, the total uncorrected sum of squares: $SSTOU = \sum Y_i^2$
Correction-for-the-mean sum of squares: $SS(\text{correction for mean}) = n\bar{Y}^2$

ANOVA table (cont.)
Table 2: Modified ANOVA table for simple linear regression.
Source of Variation | SS                                | df  | MS
Regression          | SSR = Σ(Ŷi − Ȳ)²                  | 1   | MSR = SSR/1
Error               | SSE = Σ(Yi − Ŷi)²                 | n−2 | MSE = SSE/(n−2)
Total               | SSTO = Σ(Yi − Ȳ)²                 | n−1 |
Correction for mean | SS(correction for mean) = nȲ²     | 1   |
Total, uncorrected  | SSTOU = ΣYi²                      | n   |

Expected Mean Squares
$E\{MSE\} = \sigma^2, \qquad E\{MSR\} = \sigma^2 + \beta_1^2 \sum (X_i - \bar{X})^2$
MSE is an unbiased estimator of σ². Implications:
- the mean of the sampling distribution of MSE is σ²
- the mean of the sampling distribution of MSR is also σ² when β1 = 0
- when β1 ≠ 0, E{MSR} > E{MSE} = σ², since β1² Σ(Xi − X̄)² > 0

F test of β1 = 0 vs. β1 ≠ 0
The analysis of variance approach provides a battery of highly useful tests for regression models. For the simple linear regression case, the ANOVA provides a test of
H0: β1 = 0 vs. Ha: β1 ≠ 0
Test statistic:
$F^* = \frac{MSR}{MSE}$
Large values of F* support Ha; values of F* near 1 support H0.

F test of β1 = 0 vs. β1 ≠ 0 (cont.)
Cochran's theorem. If all n observations Yi come from the same normal distribution with mean µ and variance σ², and SSTO is decomposed into k sums of squares SSr, each with degrees of freedom dfr, then the SSr/σ² are independent χ² variables with dfr degrees of freedom, provided
$\sum_{r=1}^{k} df_r = n - 1$

F test of β1 = 0 vs. β1 ≠ 0 (cont.)
Property. If β1 = 0, so that all Yi have the same mean µ = β0 and the same variance σ², then SSE/σ² and SSR/σ² are independent χ² variables. When H0 holds:
$F^* = \frac{SSR/\sigma^2}{1} \div \frac{SSE/\sigma^2}{n-2} = \frac{MSR}{MSE} \sim \frac{\chi^2(1)/1}{\chi^2(n-2)/(n-2)} \sim F(1, n-2)$
When Ha holds, F* follows the noncentral F distribution.

Construction of Decision Rule
Under H0, F* ~ F(1, n−2). The decision rule at Type I error rate α:
- If F* ≤ F(1−α; 1, n−2), conclude H0.
- If F* > F(1−α; 1, n−2), conclude Ha.
where F(1−α; 1, n−2) is the 100(1−α) percentile of the appropriate F distribution.

Example: Toluca Company
Earlier test on β1 (the t test, p. 46). Two-sided test of H0: β1 = β10:
- If |t*| = |b1 − β10|/s{b1} ≤ t(1−α/2; n−2), conclude H0.
- If |t*| > t(1−α/2; n−2), conclude Ha.

Example: using the F test
H0: β1 = β10 = 0 vs. Ha: β1 ≠ 0.
α = 0.05; n = 25; F(0.95; 1, 23) = 4.28. Decision rule: if F* ≤ 4.28, conclude H0.
$F^* = \frac{MSR}{MSE} = \frac{252{,}378}{2{,}384} = 105.9$
What is the conclusion? Since F* = 105.9 > 4.28, conclude Ha: β1 ≠ 0.

Equivalence of F test and t test
For a given α level, the F test of H0: β1 = 0 vs. Ha: β1 ≠ 0 is equivalent algebraically to the two-tailed t test:
$F^* = \frac{b_1^2}{s^2\{b_1\}} = \left(\frac{b_1}{s\{b_1\}}\right)^2 = (t^*)^2$
and $[t(1-\alpha/2; n-2)]^2 = F(1-\alpha; 1, n-2)$.
The t test is two-tailed; the F test is one-tailed. (This is verified numerically below.)
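The equivalence is easy to confirm from the R fit; a sketch (the row and column names follow the summary() and anova() output for the Toluca model):

tstar <- summary(fitreg)$coefficients["Size", "t value"]
Fstar <- anova(fitreg)["Size", "F value"]
c(tstar^2, Fstar)                 # identical, about 105.9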

General Linear Test Approach
Steps:
1. Fit the full model and obtain SSE(F).
2. Fit the reduced model under H0 and obtain SSE(R).
3. Use the test statistic and decision rule.

1. Full Model
The full or unrestricted model:
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$SSE(F) = \sum [Y_i - (b_0 + b_1 X_i)]^2 = \sum (Y_i - \hat{Y}_i)^2 = SSE$

2. Reduced Model
Hypotheses: H0: β1 = 0 vs. Ha: β1 ≠ 0.
The reduced or restricted model when H0 holds:
$Y_i = \beta_0 + \varepsilon_i$
$SSE(R) = \sum (Y_i - b_0)^2 = \sum (Y_i - \bar{Y})^2 = SSTO$
(here b0 = Ȳ, the least squares estimator under the reduced model).

3. Test Statistic
$SSE(F) \le SSE(R)$
The more parameters in the model, the better one can fit the data and the smaller are the deviations around the fitted regression function.

3. Test Statistic (cont.)
When SSE(F) is not much less than SSE(R), using the full model does not account for much more of the variability of the Yi than does the reduced model: the variation of the observations around the fitted regression function for the full model is almost as great as the variation around the fitted regression function for the reduced model. This suggests that the reduced model is adequate, i.e., that H0 holds.

3. Test Statistic (cont.)
A small difference SSE(R) − SSE(F) suggests that H0 holds; a large difference suggests that Ha holds. The test statistic is a function of SSE(R) − SSE(F):
$F^* = \frac{SSE(R) - SSE(F)}{df_R - df_F} \div \frac{SSE(F)}{df_F}$
which follows the F distribution when H0 holds. Decision rule:
- If F* ≤ F(1−α; dfR − dfF, dfF), conclude H0.
- If F* > F(1−α; dfR − dfF, dfF), conclude Ha.

3. Test Statistic (cont.)
For testing whether or not β1 = 0:
SSE(R) = SSTO, SSE(F) = SSE, dfR = n−1, dfF = n−2, so
$F^* = \frac{SSTO - SSE}{(n-1) - (n-2)} \div \frac{SSE}{n-2} = \frac{MSR}{MSE}$
(an R sketch with anova() follows).
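In R the general linear test is carried out by fitting both models and comparing them with anova(); a sketch for the β1 = 0 test on the Toluca data:

full <- lm(Hrs ~ Size, data = toluca)     # full model
reduced <- lm(Hrs ~ 1, data = toluca)     # reduced model under H0
anova(reduced, full)                      # reports F* = MSR/MSE and its p-value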

Descriptive Measures of Linear Association between X and Y: motivation
The usefulness of estimates or predictions depends upon the width of the interval and upon the user's needs for precision, which vary from one application to another. No single descriptive measure of the degree of linear association can capture the essential information as to whether a given regression relation is useful in any particular application.

Coefficient of Determination
SSTO measures the variation in the observations Yi, i.e., the uncertainty in predicting Y when X is not considered. SSE measures the variation in the Yi when a regression model utilizing the predictor variable X is employed. A natural measure of the effect of X in reducing the variation in Y, i.e., in reducing the uncertainty in predicting Y, is to express the reduction in variation (SSTO − SSE = SSR) as a proportion of the total variation:
$R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}$
Interpretation: R² is the proportionate reduction of total variation associated with the use of the predictor variable X.

Coefficient of Determination (cont.)
Montgomery (Introduction to Linear Regression Analysis): R² is called the proportion of variation explained by the regressor x.
Wikipedia: R² is a statistic that will give some information about the goodness of fit of a model; in regression, the R² coefficient of determination is a statistical measure of how well the regression line approximates the real data points.

Coefficient of Determination (cont.)
R² is called the coefficient of determination. Since 0 ≤ SSE ≤ SSTO,
$0 \le R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO} \le 1$
- SSE = 0 ⟹ R² = 1 (all Yi = Ŷi)
- SSR = 0 ⟹ R² = 0 (all Ŷi = Ȳ, i.e., b1 = 0: no linear relation between X and Y)

Coefficient of Determination (cont.)
R² = 0: there is no linear association between X and Y in the sample data, and the predictor variable X is of no help in reducing the variation in the Yi with linear regression.
R² close to 1: the linear relation between X and Y is strong; the closer R² is to 1, the greater is said to be the degree of linear association between X and Y.

Coefficient of Determination (cont.)
The Toluca Company:
fitreg<-lm(Hrs~Size,data=toluca)
anova(fitreg)
summary(fitreg)

Coefficient of Determination (cont.)
The variation in work hours is reduced by 82.2% when lot size is considered.


Coefficient of Determination (cont.)
R² is widely used for describing the usefulness of a regression model, but it invites serious misunderstandings:
- A high R² indicates that useful predictions can be made. (Not necessarily correct; e.g., the wide prediction interval at Xh = 100.)
- A high R² indicates that the estimated regression line is a good fit. (Not necessarily correct; the relation may be curvilinear.)
- An R² near 0 indicates that X and Y are not related. (Not necessarily correct; the relation may be curvilinear.)


Coefficient of Correlation
A measure of linear association between Y and X when both Y and X are random is the coefficient of correlation:
$r = \pm\sqrt{R^2}$
A plus or minus sign is attached to this measure according to whether the slope of the fitted regression line is positive or negative; −1 ≤ r ≤ 1.
The more widely the Xi are spaced, the higher R² will tend to be, since
$R^2 = \frac{SSR}{SSTO} = \frac{b_1^2 \sum (X_i - \bar{X})^2}{\sum (Y_i - \bar{Y})^2}$
(a numeric check follows).
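For the Toluca data the two measures can be checked against each other; a sketch reusing the attached data and fitreg:

r <- cor(Size, Hrs)                        # sign matches that of b1
c(r^2, summary(fitreg)$r.squared)          # equal, about 0.822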

Coefficient of Correlation (cont.)
SSR is the explained variation in Y and SSE is the unexplained variation, so R² is interpreted as the proportion of the total variation in Y which has been explained by X.
This is easily misunderstood: Y does not necessarily depend on X in a causal or explanatory sense.


More information

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf

More information

Remedial Measures, Brown-Forsythe test, F test

Remedial Measures, Brown-Forsythe test, F test Remedial Measures, Brown-Forsythe test, F test Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 7, Slide 1 Remedial Measures How do we know that the regression function

More information

STAT 215 Confidence and Prediction Intervals in Regression

STAT 215 Confidence and Prediction Intervals in Regression STAT 215 Confidence and Prediction Intervals in Regression Colin Reimer Dawson Oberlin College 24 October 2016 Outline Regression Slope Inference Partitioning Variability Prediction Intervals Reminder:

More information

CHAPTER EIGHT Linear Regression

CHAPTER EIGHT Linear Regression 7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following

More information

Chapter 4. Regression Models. Learning Objectives

Chapter 4. Regression Models. Learning Objectives Chapter 4 Regression Models To accompany Quantitative Analysis for Management, Eleventh Edition, by Render, Stair, and Hanna Power Point slides created by Brian Peterson Learning Objectives After completing

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

Simple Linear Regression Analysis

Simple Linear Regression Analysis LINEAR REGRESSION ANALYSIS MODULE II Lecture - 6 Simple Linear Regression Analysis Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Prediction of values of study

More information

ECO220Y Simple Regression: Testing the Slope

ECO220Y Simple Regression: Testing the Slope ECO220Y Simple Regression: Testing the Slope Readings: Chapter 18 (Sections 18.3-18.5) Winter 2012 Lecture 19 (Winter 2012) Simple Regression Lecture 19 1 / 32 Simple Regression Model y i = β 0 + β 1 x

More information

Simple Linear Regression

Simple Linear Regression 9-1 l Chapter 9 l Simple Linear Regression 9.1 Simple Linear Regression 9.2 Scatter Diagram 9.3 Graphical Method for Determining Regression 9.4 Least Square Method 9.5 Correlation Coefficient and Coefficient

More information

STA 4210 Practise set 2a

STA 4210 Practise set 2a STA 410 Practise set a For all significance tests, use = 0.05 significance level. S.1. A multiple linear regression model is fit, relating household weekly food expenditures (Y, in $100s) to weekly income

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

Chapter 6 Multiple Regression

Chapter 6 Multiple Regression STAT 525 FALL 2018 Chapter 6 Multiple Regression Professor Min Zhang The Data and Model Still have single response variable Y Now have multiple explanatory variables Examples: Blood Pressure vs Age, Weight,

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

Chapter 4 Experiments with Blocking Factors

Chapter 4 Experiments with Blocking Factors Chapter 4 Experiments with Blocking Factors 許湘伶 Design and Analysis of Experiments (Douglas C. Montgomery) hsuhl (NUK) DAE Chap. 4 1 / 54 The Randomized Complete Block Design (RCBD; 隨機化完全集區設計 ) 1 Variability

More information

Chapter 4: Regression Models

Chapter 4: Regression Models Sales volume of company 1 Textbook: pp. 129-164 Chapter 4: Regression Models Money spent on advertising 2 Learning Objectives After completing this chapter, students will be able to: Identify variables,

More information

Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1

Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1 Lecture Simple Linear Regression STAT 51 Spring 011 Background Reading KNNL: Chapter 1-1 Topic Overview This topic we will cover: Regression Terminology Simple Linear Regression with a single predictor

More information

Simple Linear Regression. (Chs 12.1, 12.2, 12.4, 12.5)

Simple Linear Regression. (Chs 12.1, 12.2, 12.4, 12.5) 10 Simple Linear Regression (Chs 12.1, 12.2, 12.4, 12.5) Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 2 Simple Linear Regression Rating 20 40 60 80 0 5 10 15 Sugar 3 Simple Linear Regression

More information

Homework 2: Simple Linear Regression

Homework 2: Simple Linear Regression STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA

More information

The simple linear regression model discussed in Chapter 13 was written as

The simple linear regression model discussed in Chapter 13 was written as 1519T_c14 03/27/2006 07:28 AM Page 614 Chapter Jose Luis Pelaez Inc/Blend Images/Getty Images, Inc./Getty Images, Inc. 14 Multiple Regression 14.1 Multiple Regression Analysis 14.2 Assumptions of the Multiple

More information

TMA4255 Applied Statistics V2016 (5)

TMA4255 Applied Statistics V2016 (5) TMA4255 Applied Statistics V2016 (5) Part 2: Regression Simple linear regression [11.1-11.4] Sum of squares [11.5] Anna Marie Holand To be lectured: January 26, 2016 wiki.math.ntnu.no/tma4255/2016v/start

More information

Chapter 16. Simple Linear Regression and Correlation

Chapter 16. Simple Linear Regression and Correlation Chapter 16 Simple Linear Regression and Correlation 16.1 Regression Analysis Our problem objective is to analyze the relationship between interval variables; regression analysis is the first tool we will

More information

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow) STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points

More information

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is. Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The

More information

Multivariate Regression (Chapter 10)

Multivariate Regression (Chapter 10) Multivariate Regression (Chapter 10) This week we ll cover multivariate regression and maybe a bit of canonical correlation. Today we ll mostly review univariate multivariate regression. With multivariate

More information

Simple linear regression

Simple linear regression Simple linear regression Biometry 755 Spring 2008 Simple linear regression p. 1/40 Overview of regression analysis Evaluate relationship between one or more independent variables (X 1,...,X k ) and a single

More information