Chapter 1. Linear Regression with One Predictor Variable


1.1 Statistical Relation Between Two Variables

To motivate statistical relationships, let us first consider a mathematical relation between two variables x and y. This may be represented by a functional relation

    y = f(x),    (1)

which says that, given a value of x, there is a unique value of y that can be exactly determined.

For example, the relation between the number of hours (x) a car is driven at a constant speed c and the distance (y) travelled may be given by y = cx. There are many such examples in the physical and other sciences; these are known as deterministic or exact relationships. To define a statistical relationship, we replace the mathematical variables by random variables X and Y and add a random error component ε representing the deviation from the true relation:

    y = f(x) + ε    (2)

Here (x, y) represents a typical value of the bivariate random variable (X, Y). Such a relation is also known as a stochastic relation; it models random phenomena in which (i) the Y values tend to vary around a smooth function and (ii) there is a random scatter of points around this systematic component. Figure 1.1 presents a plot of the heights and weights of 23 students enrolled in last year's STAT 360 class (the data are given in Table 1.1).

This graph shows the tendency of the data to vary around a straight line. This tendency of the variation in weights as a function of height is called a linear trend. Since the points do not fall exactly on a straight line, it may be suitable to use a statistical relationship, i.e.

    y = β_0 + β_1 x + ε

where β_0 and β_1 are unknown constants, x represents height, y represents weight, and ε represents a random error. The subject matter of this course is the study of such relationships.

Figure 1.1 Scatter Plot of the Height-Weight Data of the STAT 360 2001 Class

Table 1.1 Heights and Weights of 23 Students in the STAT 360 Class of 2001

Student ID   Height (cm)   Weight (kg)
4126548      183.00        77.09
4281675      177.80        90.70
4100212      172.72        81.63
4411919      167.64        49.88
5936748      162.56        45.35
5919460      162.56        54.42
5945267      172.72        72.56
4276051      177.80        74.83
4084489      172.72        54.42
4139615      185.42        92.97
5928281      180.34        81.63
5922763      172.72        80.72
3630137      180.34        70.29
4751612      158.00        55.00
4767098      163.00        50.00
4767209      158.00        42.00
4766733      182.00        72.00
4766164      166.00        60.00
4763661      168.00        62.00
4766970      163.00        55.00
4763734      170.00        65.00
3952312      172.72        95.23
5928389      162.56        72.56

1.2 Regression Models

Terminology: Regression
The conditional expectation m(x) = E(Y | X = x) in a bivariate setting is called the regression of Y on X. The term regression was used by Sir Francis Galton (1822-1911) in studying the heights of offspring as a function of the heights of their parents, in a paper entitled "Regression towards mediocrity in hereditary stature" (Nature, vol. 15, pp. 507-510).

In this paper Galton reported on his discovery that the offspring did not resemble their parents in size but tended "to be always more mediocre [i.e. more average] than they - to be smaller than the parents, if the parents were large; to be larger than the parents, if they were very small...". Thus the random variable Y may be assumed to vary around its mean m(x) as a function of X, and denoting the random deviation Y - m(x) by ε, we can write

    Y = m(x) + ε    (3)

Note that the probability distribution of ε is the conditional probability distribution of Y - m(x) given X = x, so this is essentially the same as Eq. (2). Hence statistical relationships such as these are known as regression models.

Dependent and Independent Variables
The relation y = f(x) implicitly requires studying the changes in y as a function of x, and is sometimes interpreted as a causal relation (i.e. x causes y). This understanding has resulted in calling x the independent variable and y the dependent variable.

Uses of the Regression Relation
The regression model is used for:

Description: Simply knowing the nature of the relationship, such as the one described by Galton.

Prediction: Predicting Y values (which are random) as a function of some related variable; this is an educated guess. For example, the increase in sales (Y) as a function of advertising expense (X) will be an important quantity for a company to predict. In this context, X is known as the predictor variable and Y as the predictand or response variable.

Control: Knowledge of the regression relation is used to control Y values. For example, in an industrial process, temperature (X) may be used to control the density (Y) of the finished product. Hence, to produce material of a given average density, the regression relation may be used to determine the proper temperature level.

1.3 Simple Linear Regression Model

Distribution of the Errors Unspecified
Let n observations obtained from the bivariate random variable (X, Y) be denoted by (X_i, Y_i), i = 1, 2, ..., n. Then the Simple Linear Regression (SLR) model can be stated as follows:

    Y_i = β_0 + β_1 X_i + ε_i    (4)

Y_i : value of the response (or dependent) variable in the ith trial
β_0 and β_1 : parameters, known as the regression parameters
X_i : a known constant, the value of the predictor variable in the ith trial

ε_i : random error term for the ith trial, such that E(ε_i) = 0 and Var(ε_i) = σ²{ε_i} = σ²; ε_i and ε_j for i ≠ j are uncorrelated, so that their covariance is zero, i.e. cov(ε_i, ε_j) = σ{ε_i, ε_j} = 0.

Normal Distribution of the Errors
For theoretical purposes, it is important to assume that the errors are normally distributed; we denote this by ε_i ~ i.i.d. N(0, σ²). Note that i.i.d. is short for independent and identically distributed, and that zero covariance between two normal random variables implies independence. The model with this extra assumption is known as the Normal Simple Linear Regression model.
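To make these assumptions concrete, the following short Python sketch generates data from a normal simple linear regression model. It is only an illustration and is not part of the original notes; the parameter values are hypothetical choices loosely inspired by the height-weight example.

import numpy as np

# Hypothetical parameter values (illustrative only)
beta0, beta1, sigma = -173.0, 1.4, 10.0

rng = np.random.default_rng(360)
X = rng.uniform(158, 186, size=23)           # predictor values, treated as fixed constants
eps = rng.normal(0.0, sigma, size=X.size)    # i.i.d. N(0, sigma^2) errors
Y = beta0 + beta1 * X + eps                  # Y_i = beta0 + beta1 * X_i + eps_i

print(Y[:5])

Each simulated Y_i then has mean beta0 + beta1 * X_i and constant variance sigma^2, exactly as described in the features listed next.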

Some Features of the SLR Model
In the expressions below, expectations are taken as if the X values are fixed; hence these are in fact conditional expectations. This should not create confusion if we keep in mind that the regression relation is used to study the variation in Y for fixed values of X.

(i) Y_i is the sum of a constant and a random variable; hence it is a random variable.
(ii) E(Y_i) = β_0 + β_1 X_i
(iii) Var(Y_i) = σ²{Y_i} = σ²{ε_i} = σ²

Hence this model assumes that the mean function is linear in X while the variance function is constant in X.

(iv) For i ≠ j, the observations Y_i and Y_j are uncorrelated.

The above observations follow from simple rules of expectation and variance.

Meaning of the Regression Parameters
Since E(Y) = β_0 + β_1 X, it is clear that

    β_0 = E(Y | X = 0) = intercept of the regression line = mean response when X = 0

and

    β_1 = slope of the regression line = change in the average response per unit change in X.

1.4 Estimation of the Regression Function

Method of Least Squares (LS)
When the distribution of the errors is not specified, we estimate β_0 and β_1 by making the observed errors Y_i - β_0 - β_1 X_i small. The least squares principle provides the best-fitting line to the data by minimizing

    Q(β_0, β_1) = Σ_{i=1}^n (Y_i - β_0 - β_1 X_i)²    (5)

Note: Other criteria may also be proposed, such as the least absolute deviation (LAD) criterion Σ_{i=1}^n |Y_i - β_0 - β_1 X_i|, but LS offers an enormous theoretical simplification and the resulting estimators have good properties.

Least Squares Estimators
The analytical solutions for β_0 and β_1, denoted by b_0 and b_1 respectively, are obtained by solving the following simultaneous linear equations (known as the normal equations):

    Σ Y_i = n b_0 + b_1 Σ X_i    (6)
    Σ X_i Y_i = b_0 Σ X_i + b_1 Σ X_i²    (7)

These can be solved explicitly to give

    b_1 = Σ (X_i - X̄)(Y_i - Ȳ) / Σ (X_i - X̄)²    (8)
    b_0 = Ȳ - b_1 X̄    (9)

Proof
The minimizing equations are

    ∂Q/∂β_0 = 0    (10)
    ∂Q/∂β_1 = 0    (11)

It is easy to obtain

    ∂Q/∂β_0 = -2 Σ (Y_i - β_0 - β_1 X_i)    (12)
    ∂Q/∂β_1 = -2 Σ X_i (Y_i - β_0 - β_1 X_i)    (13)

Equating these to zero and substituting b_0 and b_1 for β_0 and β_1 respectively, we get

    Σ (Y_i - b_0 - b_1 X_i) = 0    (14)
    Σ X_i (Y_i - b_0 - b_1 X_i) = 0    (15)

Expanding the summations over individual terms, we get

    Σ Y_i - n b_0 - b_1 Σ X_i = 0    (16)
    Σ X_i Y_i - b_0 Σ X_i - b_1 Σ X_i² = 0    (17)

Rearranging the terms gives the normal equations. From the first normal equation, we get

    (1/n) Σ Y_i = b_0 + b_1 (1/n) Σ X_i    (18)

or

    Ȳ = b_0 + b_1 X̄    (19)

Hence,

    b_0 = Ȳ - b_1 X̄

Substituting this in the second normal equation, we get

    Σ X_i Y_i = n X̄ (Ȳ - b_1 X̄) + b_1 Σ X_i² = n X̄ Ȳ + b_1 (Σ X_i² - n X̄²)

This gives

    b_1 = (Σ X_i Y_i - n X̄ Ȳ) / (Σ X_i² - n X̄²)

Using the facts that Σ X_i Y_i - n X̄ Ȳ = Σ (X_i - X̄)(Y_i - Ȳ) and Σ X_i² - n X̄² = Σ (X_i - X̄)², the above expression becomes

    b_1 = Σ (X_i - X̄)(Y_i - Ȳ) / Σ (X_i - X̄)²

Example
For the data in Table 1.1, the following computations are obtained:

    n = 23;  Σ X_i = 3931.6,  Σ Y_i = 1555.3,  Σ X_i Y_i = 267951,  Σ X_i² = 673552

Hence X̄ = 3931.6/23 = 170.94 and Ȳ = 1555.3/23 = 67.6231. For computing b_1, the numerator is computed as

    Σ X_i Y_i - n X̄ Ȳ = Σ X_i Y_i - (Σ X_i)(Σ Y_i)/n

and the denominator as

    Σ X_i² - n X̄² = Σ X_i² - (Σ X_i)²/n

Hence,

    b_1 = (267951 - 3931.6 × 1555.3/23) / (673552 - 3931.6²/23) = 1.40694
    b_0 = 67.6231 - 1.40694 × 170.94 = -172.8792
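The arithmetic can be checked numerically. The sketch below is not part of the original notes; it uses only the summary statistics quoted above, which are rounded, so the result matches the quoted b_1 and b_0 only approximately.

import numpy as np

# Summary statistics quoted in the example for the Table 1.1 data
n = 23
sum_x, sum_y = 3931.6, 1555.3
sum_xy, sum_x2 = 267951.0, 673552.0

xbar, ybar = sum_x / n, sum_y / n
Sxy = sum_xy - n * xbar * ybar      # numerator: sum (X_i - xbar)(Y_i - ybar)
Sxx = sum_x2 - n * xbar ** 2        # denominator: sum (X_i - xbar)^2

b1 = Sxy / Sxx
b0 = ybar - b1 * xbar
# about 1.40 and -172.4 from the rounded sums; the quoted 1.40694 and -172.88
# come from the unrounded data, so the match is only approximate
print(b1, b0)

# Equivalent route: solve the normal equations (6)-(7) as a 2x2 linear system
A = np.array([[n, sum_x], [sum_x, sum_x2]])
rhs = np.array([sum_y, sum_xy])
print(np.linalg.solve(A, rhs))      # [b0, b1], same values as above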

1.5 Point Estimation of the Mean Response

Let X_h be a typical value of the predictor variable at which the mean response E(Y) is to be estimated. Note that this is equivalent to estimating the regression function

    E(Y) = β_0 + β_1 X    (20)

at X = X_h. An individual value of Y is known as the response, and E(Y) is known as the mean response. The regression function is linear in the parameters β_0 and β_1; hence its estimate is easily obtained as

    Ŷ = β̂_0 + β̂_1 X = b_0 + b_1 X    (21)

For the cases in the study, we call

    Ŷ_i = b_0 + b_1 X_i,  i = 1, 2, ..., n    (22)

the fitted value for the ith case; it is viewed as the estimate of the mean response for X = X_i.

Example 1.2
For the data in Table 1.1, the estimates were obtained as b_0 = -172.88 and b_1 = 1.41. Hence the estimated regression function is given by Ŷ = -172.88 + 1.41 X. This estimated regression function is plotted in Figure 1.2. The fitted values are reported in the following table.

Table 1.2: Fitted Values and Residuals for the Height-Weight (2001) Data

Student#   Height(X)   Weight(Y)      Fits   Residuals
 1          183.00      77.0975    84.5908     -7.4933
 2          177.80      90.7029    77.2747     13.4282
 3          172.72      81.6327    70.1274     11.5052
 4          167.64      49.8866    62.9802    -13.0936
 5          162.56      45.3515    55.8329    -10.4815
 6          162.56      54.4218    55.8329     -1.4112
 7          172.72      72.5624    70.1274      2.4349
 8          177.80      74.8299    77.2747     -2.4448
 9          172.72      54.4218    70.1274    -15.7057
10          185.42      92.9705    87.9956      4.9749
11          180.34      81.6327    80.8483      0.7843
12          172.72      80.7256    70.1274     10.5982
13          180.34      70.2948    80.8483    -10.5535
14          158.00      55.0000    49.4173      5.5827
15          163.00      50.0000    56.4520     -6.4520
16          158.00      42.0000    49.4173     -7.4173
17          182.00      72.0000    83.1838    -11.1838
18          166.00      60.0000    60.6728     -0.6728
19          168.00      62.0000    63.4867     -1.4867
20          163.00      55.0000    56.4520     -1.4520
21          170.00      65.0000    66.3006     -1.3006
22          172.72      95.2381    70.1274     25.1107
23          162.56      72.5624    55.8329     16.7294

Figure 1.2 Scatter Plot and Fitted Line Plot of the Height-Weight Data, STAT 360 2001 Class. (Plot annotation: Y = -172.879 + 1.40694 X; R-Sq = 55.7%.)

The graph shows a good scatter around the fitted line. Suppose the mean weight of a person of typical height X = 171 cm is desired; the corresponding point estimate is Ŷ = -172.88 + 1.41(171) = 68.23 kg. Table 1.2 gives the fitted values for all the heights in the data, obtained simply by substituting X_i for X in the equation of the fitted line. The table also gives the values of the residuals, which are the differences between the observed and fitted values. In general, the ith residual is given by

    e_i = Y_i - Ŷ_i    (23)

For the SLR model it can be written as

    e_i = Y_i - b_0 - b_1 X_i = (Y_i - Ȳ) - b_1 (X_i - X̄)    (24)

The latter expression is useful for theoretical derivations in the course. The residuals are, in some sense, estimates of the errors ε_i. They are used to assess the validity of the model as well as to detect departures from it.
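As an illustration (not part of the original notes), the following sketch computes the fitted values, the residuals, and the estimated mean response at X = 171 cm directly from the Table 1.1 data, and numerically checks some of the properties derived in the next section.

import numpy as np

# Heights (cm) and weights (kg) from Table 1.1
X = np.array([183.00, 177.80, 172.72, 167.64, 162.56, 162.56, 172.72, 177.80,
              172.72, 185.42, 180.34, 172.72, 180.34, 158.00, 163.00, 158.00,
              182.00, 166.00, 168.00, 163.00, 170.00, 172.72, 162.56])
Y = np.array([77.09, 90.70, 81.63, 49.88, 45.35, 54.42, 72.56, 74.83, 54.42,
              92.97, 81.63, 80.72, 70.29, 55.00, 50.00, 42.00, 72.00, 60.00,
              62.00, 55.00, 65.00, 95.23, 72.56])

b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()

Y_hat = b0 + b1 * X                       # fitted values, Eq. (22)
e = Y - Y_hat                             # residuals, Eq. (23)

print(b0, b1)                             # close to -172.88 and 1.41
print(b0 + b1 * 171)                      # estimated mean weight at 171 cm, roughly 68 kg

print(np.isclose(e.sum(), 0.0))           # property (i):  sum of residuals is zero
print(np.isclose((X * e).sum(), 0.0))     # property (iv): X-weighted residuals sum to zero
print(np.isclose((Y_hat * e).sum(), 0.0)) # property (v):  fitted-value-weighted residuals sum to zero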

1.6 Properties of the Fitted Regression Line

(i) The sum of the residuals equals zero, i.e.

    Σ_{i=1}^n e_i = 0    (25)

Note that this implies that the sample mean of the residuals, ē = (1/n) Σ_{i=1}^n e_i, is zero. The sample mean being an estimator of the population mean, this is in line with the assumption that E(ε) = 0. To prove the property, use Eq. (24) and the fact that Σ_{i=1}^n (X_i - X̄) = Σ_{i=1}^n (Y_i - Ȳ) = 0.

(ii) The sum of the squared residuals, Σ e_i² = Σ (Y_i - Ŷ_i)², is a minimum over all possible lines. Note that this was precisely the requirement in the least squares estimation.

(iii) The sum of the observed values Y_i equals the sum of the fitted values Ŷ_i:

    Σ_{i=1}^n Y_i = Σ_{i=1}^n Ŷ_i    (26)

This follows from the first property, since Σ e_i = Σ (Y_i - Ŷ_i) = 0. It implies that the sample mean of the observed values, Ȳ, and the sample mean of the fitted values are the same.

(iv) The sum of the weighted residuals is zero when the residuals are weighted by the corresponding level of the predictor variable:

    Σ X_i e_i = 0    (27)

To prove this we see that

    Σ X_i e_i = Σ X_i {(Y_i - Ȳ) - b_1 (X_i - X̄)} = Σ X_i (Y_i - Ȳ) - b_1 Σ X_i (X_i - X̄)    (28)

Furthermore,

    S_xy = Σ (X_i - X̄)(Y_i - Ȳ) = Σ X_i (Y_i - Ȳ) - X̄ Σ (Y_i - Ȳ) = Σ X_i (Y_i - Ȳ),

since Σ (Y_i - Ȳ) = 0; similarly, S_xx = Σ (X_i - X̄)² = Σ X_i (X_i - X̄). Hence Eq. (28) becomes

    Σ X_i e_i = S_xy - b_1 S_xx

Using the formula b_1 = S_xy / S_xx, the above equation becomes

    Σ X_i e_i = S_xy - (S_xy / S_xx) S_xx = S_xy - S_xy = 0

(v) The sum of the weighted residuals is zero when the residuals are weighted by the corresponding fitted values:

    Σ Ŷ_i e_i = 0    (29)

This follows easily, since Σ Ŷ_i e_i = b_0 Σ e_i + b_1 Σ X_i e_i and, as proved earlier, Σ e_i = 0 and Σ X_i e_i = 0.

(vi) The fitted regression line always passes through the point (X̄, Ȳ). Substituting X = X̄, we find that

    Ŷ = Ȳ - b_1 X̄ + b_1 X̄ = Ȳ

which proves this property.

Notes:
(i) Property (i) follows directly from the first normal equation, since Σ e_i = Σ (Y_i - b_0 - b_1 X_i) = Σ Y_i - n b_0 - b_1 Σ X_i = 0.
(ii) Property (iv) follows directly from the second normal equation, since Σ X_i e_i = Σ X_i Y_i - b_0 Σ X_i - b_1 Σ X_i² = 0.

(iii) If the data are transformed as Y → y = Y - Ȳ and X → x = X - X̄, the fitted equation becomes

    ŷ = b_1 x    (30)

where ŷ = Ŷ - Ȳ. It is clear that this equation passes through the point (0, 0), which is a consequence of shifting the origin to the point (X̄, Ȳ).

1.7 Estimation of the Error Variance

In general, variation is estimated from squared deviations of the observations from the mean, or from an estimate of the mean. For example, for observations Y_1, Y_2, ..., Y_n from a normal population N(µ, σ²), the unbiased estimator of σ² is given by

    σ̂² = (1/n) Σ_{i=1}^n (Y_i - µ)²,  if µ is known;
    s² = (1/(n-1)) Σ_{i=1}^n (Y_i - Ȳ)²,  if µ is unknown.

In other words, the estimate of σ² is a sum of squared deviations divided by its degrees of freedom: n if µ is known and n - 1 if µ is estimated by Ȳ. In the regression model, the deviation of the observation Y_i from its mean m(X_i) = β_0 + β_1 X_i is approximated by the residual e_i = Y_i - Ŷ_i = Y_i - b_0 - b_1 X_i, and the corresponding sum of squares, denoted SSE for Sum of Squares due to Error, is given by

    SSE = Σ_{i=1}^n (Y_i - Ŷ_i)² = Σ_{i=1}^n e_i²    (31)

The corresponding degrees of freedom are n - 2 (two degrees of freedom are lost in estimating the two parameters β_0 and β_1). This gives rise to the following estimate of σ²:

    MSE = SSE / (n - 2) = Σ_{i=1}^n e_i² / (n - 2)    (32)

where MSE stands for Mean Square due to Error. It will be proved later that MSE is unbiased for σ².

Example
For the data of Table 1.1, the sum of squared errors (residuals) is SSE = 2329.5, based on n = 23 observations. Hence, MSE = 2329.5/21 = 110.93.
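A quick numerical check of this example (again a sketch, not part of the original notes) recomputes SSE and MSE from the Table 1.1 data; small discrepancies with the quoted values are due to rounding of the weights in the table.

import numpy as np

# Heights (cm) and weights (kg) from Table 1.1
X = np.array([183.00, 177.80, 172.72, 167.64, 162.56, 162.56, 172.72, 177.80,
              172.72, 185.42, 180.34, 172.72, 180.34, 158.00, 163.00, 158.00,
              182.00, 166.00, 168.00, 163.00, 170.00, 172.72, 162.56])
Y = np.array([77.09, 90.70, 81.63, 49.88, 45.35, 54.42, 72.56, 74.83, 54.42,
              92.97, 81.63, 80.72, 70.29, 55.00, 50.00, 42.00, 72.00, 60.00,
              62.00, 55.00, 65.00, 95.23, 72.56])

n = X.size
b1, b0 = np.polyfit(X, Y, 1)        # least squares slope and intercept
e = Y - (b0 + b1 * X)               # residuals

SSE = np.sum(e ** 2)                # Sum of Squares due to Error, Eq. (31)
MSE = SSE / (n - 2)                 # Mean Square due to Error, Eq. (32)
print(SSE, MSE)                     # about 2330 and 111, matching the example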

1.8 Normal Error Regression Models

The information about the parameters is contained in the distribution of the observations Y_1, ..., Y_n. For the normal error regression model, ε_1, ..., ε_n are independent and normally distributed with mean zero and variance σ². This implies that Y_1, ..., Y_n are also normal and independent, with Y_i ~ N(β_0 + β_1 X_i, σ²). The probability density function of Y_i is given by

    f_i(Y_i) = (1/(σ√(2π))) exp{-(1/(2σ²)) (Y_i - m(X_i))²}
             = (1/(σ√(2π))) exp{-(1/(2σ²)) (Y_i - β_0 - β_1 X_i)²}

Since Y_1, ..., Y_n are independent, their joint probability density function is given by

    f(Y_1, ..., Y_n) = f_1(Y_1) f_2(Y_2) ... f_n(Y_n)

and the likelihood function L(β_0, β_1, σ²) is given by

    L(β_0, β_1, σ²) = f(Y_1, ..., Y_n)
                    = Π_{i=1}^n (1/(σ√(2π))) exp{-(1/(2σ²)) (Y_i - β_0 - β_1 X_i)²}
                    = (1/(σ√(2π)))^n exp{-(1/(2σ²)) Σ_{i=1}^n (Y_i - β_0 - β_1 X_i)²}    (33)

To find the maximum likelihood estimators of β_0, β_1 and σ², the likelihood function has to be maximized. Equivalently, we maximize the log-likelihood function

    log L = -(n/2) log(2π) - (n/2) log σ² - (1/(2σ²)) Σ_{i=1}^n (Y_i - β_0 - β_1 X_i)²    (34)

Maximum Likelihood Estimators of the Parameters
The maximum likelihood estimators are obtained by solving the following three equations:

    ∂ log L / ∂β_0 = 0,    ∂ log L / ∂β_1 = 0,    ∂ log L / ∂σ² = 0

The partial derivatives are given by

    ∂ log L / ∂β_0 = (1/σ²) Σ_{i=1}^n (Y_i - β_0 - β_1 X_i)
    ∂ log L / ∂β_1 = (1/σ²) Σ_{i=1}^n X_i (Y_i - β_0 - β_1 X_i)
    ∂ log L / ∂σ² = -n/(2σ²) + (1/(2σ⁴)) Σ_{i=1}^n (Y_i - β_0 - β_1 X_i)²

Replacing β_0, β_1, σ² by β̂_0, β̂_1, σ̂² and simplifying, we obtain

    Σ_{i=1}^n (Y_i - β̂_0 - β̂_1 X_i) = 0,    (35)
    Σ_{i=1}^n X_i (Y_i - β̂_0 - β̂_1 X_i) = 0,    (36)
    σ̂² = (1/n) Σ_{i=1}^n (Y_i - β̂_0 - β̂_1 X_i)².    (37)

Note that equations (35) and (36) are the two normal equations obtained by the least squares method. Hence the maximum likelihood estimators of β_0 and β_1 are the same as b_0 and b_1, respectively, whereas the MLE of σ² is given by

    σ̂² = Σ_{i=1}^n (Y_i - b_0 - b_1 X_i)² / n    (38)
        = Σ_{i=1}^n e_i² / n    (39)

Note that the MLE of σ² is biased, since

    E(σ̂²) = E((n - 2)/n · MSE) = ((n - 2)/n) σ²    (40)
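Equation (40) can be illustrated by a small Monte Carlo experiment. The sketch below is not part of the original notes and uses hypothetical parameter values; it repeatedly simulates data from a normal SLR model and compares the average of the MLE SSE/n with the average of MSE = SSE/(n - 2).

import numpy as np

rng = np.random.default_rng(0)
beta0, beta1, sigma2, n = -173.0, 1.4, 64.0, 23     # hypothetical true values
X = rng.uniform(158, 186, size=n)                   # fixed design, reused across replications

mle_vals, mse_vals = [], []
for _ in range(20000):
    Y = beta0 + beta1 * X + rng.normal(0.0, np.sqrt(sigma2), size=n)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    sse = np.sum((Y - b0 - b1 * X) ** 2)
    mle_vals.append(sse / n)               # MLE of sigma^2, Eq. (38)
    mse_vals.append(sse / (n - 2))         # MSE, Eq. (32)

print(np.mean(mle_vals))   # close to (n-2)/n * sigma2 = (21/23) * 64, about 58.4
print(np.mean(mse_vals))   # close to sigma2 = 64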

The following output was obtained using MINITAB (available in the Math and Stat department PC lab) with the height-weight data for this class. (The data can be downloaded by following the links from http://alcor.concordia.ca/~chaubey, in Excel or text format, and can then be copied and pasted into a MINITAB worksheet.) Use the Stat-Regression-Regression menu to obtain the following output. MINITAB ignores missing data, denoted by *.

Regression Analysis: Weight versus Height

The regression equation is
Weight = - 161 + 1.33 Height

52 cases used 6 cases contain missing values

Predictor        Coef    SE Coef        T      P
Constant      -160.74      21.59    -7.45  0.000
Height         1.3259     0.1265    10.48  0.000

S = 7.325   R-Sq = 68.7%   R-Sq(adj) = 68.1%

Analysis of Variance

Source            DF      SS      MS       F      P
Regression         1  5898.4  5898.4  109.92  0.000
Residual Error    50  2683.0    53.7
Total             51  8581.4

Notes:
1. If the missing weights are replaced by their fitted values and the regression is run again, the same results are obtained. To store the fitted values and residuals, use the STORAGE option and check the appropriate boxes.
2. To obtain a fitted line plot, use the Stat-Regression-Fitted Line Plot menu in MINITAB.
3. Regression output may also be obtained from EXCEL using Tools-Data Analysis-Regression.
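For readers working without MINITAB, similar output can be produced in Python with statsmodels. This is only a sketch: the file name height_weight.csv and the column names Height and Weight are hypothetical placeholders for the course data set.

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("height_weight.csv")            # hypothetical data file
df = df.dropna(subset=["Height", "Weight"])      # drop cases with missing values, as MINITAB does

X = sm.add_constant(df["Height"])                # adds the intercept column
fit = sm.OLS(df["Weight"], X).fit()
print(fit.summary())                             # coefficients, t-tests, R-Sq, F statistic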