Regression Analysis
Chapter 2: Simple Linear Regression

Dr. Bisher Mamoun Iqelan (biqelan@iugaza.edu.ps)
Department of Mathematics, The Islamic University of Gaza
2010-2011, Semester 2
Simple linear regression model

Suppose that for each subject we observe two variables, X and Y, and we want to predict Y from X. Because of random effects we cannot predict Y exactly; instead, we can only predict its expected (mean) value.

Model equation: represents the response variable (the variable of primary interest) as the sum of a mean function and an error term,

    response = mean function + error.

Mean function: takes account of the information about the response variable that can be extracted from the given predictor variables.

Error term: represents any information not accounted for by the mean function.
Notation for the mean function

The mean function is also known as the regression function or regression line. We denote the expected value of Y by EY or E(Y|X); in this model, EY = µ(x). In summary, with β0 and β1 unknown, all of the following are the same thing:

    µ(x),  E(Y|X),  EY,  β0 + β1X.

- µ(x) is used when EY is studied as a function of the predictor.
- EY simply refers to the expected value of the dependent variable.
- E(Y|X) emphasises the dependence of EY on the predictor.
SLM: basic equation and terminology

The simple linear regression model (SLM) equation is

    Y = µ(x) + ε = β0 + β1X + ε,

where β0 + β1X is the mean function.

- Y is called the response or dependent variable.
- X is called the predictor, explanatory, or independent variable.
- ε is called the random error, with Eε = 0; it is not observable and not estimable.
- β0 and β1 are unknown and are called the regression coefficients. β0 is also called the intercept (the value of EY when X = 0); β1 is called the slope, indicating the average change in Y when X increases by one unit.
- The model is called linear because the mean function is linear with respect to the parameters.
SLM: basic equation and terminology (cont.)

Suppose we have n observations (X1, Y1), (X2, Y2), ..., (Xn, Yn). The linear regression model then means

    Y1 = β0 + β1X1 + ε1
    Y2 = β0 + β1X2 + ε2
    ...
    Yn = β0 + β1Xn + εn

In the model, Xi is known, observable, and non-random, while εi is an unobservable random error; thus Yi is random.
Assumptions and features of the model

- Xi is non-random but εi is random. Thus the first part of Yi, namely β0 + β1Xi, is due to the regression on X; the second part, εi, is due to the random effect.
- Mean of random errors: Eεi = 0, so

      EYi = E(β0 + β1Xi + εi) = β0 + β1Xi + Eεi = β0 + β1Xi.

- Homogeneity of variance: Var(εi) = σ².
- Independence: Cov(εi, εj) = 0 for any i ≠ j.
- Consequently (prove it from the previous two points), Var(Yi) = σ² and Cov(Yi, Yj) = 0 for any i ≠ j; thus Yi and Yj are uncorrelated.
- The parameters of the model, β0, β1, and σ², need to be estimated.
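A sketch of the "prove it" step above: since Xi is non-random, β0 + β1Xi is a constant, and shifting a random variable by a constant changes neither its variance nor its covariances. Hence

```latex
\begin{align*}
\operatorname{Var}(Y_i) &= \operatorname{Var}(\beta_0 + \beta_1 X_i + \varepsilon_i)
                         = \operatorname{Var}(\varepsilon_i) = \sigma^2, \\
\operatorname{Cov}(Y_i, Y_j) &= \operatorname{Cov}(\beta_0 + \beta_1 X_i + \varepsilon_i,\;
                                                   \beta_0 + \beta_1 X_j + \varepsilon_j)
                              = \operatorname{Cov}(\varepsilon_i, \varepsilon_j) = 0
                              \quad (i \neq j).
\end{align*}
```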
Parameter Estimation

We do not know the parameters of the model, but for any given b = (b0, b1)ᵀ we can compute the deviations of Yi from its expected value, defined by

    ei = Yi − (b0 + b1Xi).

It seems natural to choose as the estimator of θ = (β0, β1)ᵀ a value whose deviations are small rather than large. So finding good b0 and b1 means minimizing the sum of squares of the deviations,

    S(b) = Σ_{i=1}^n ei² = Σ_{i=1}^n (Yi − b0 − b1Xi)².

This method is called (ordinary) least squares estimation (l.s.e., or o.l.s.).
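As a rough sketch (the function and variable names are illustrative), S(b) can be evaluated in R for any candidate b; the least squares estimates should give a smaller value than any other choice:

```r
# Sum of squared deviations S(b) for a candidate b = (b0, b1)
S <- function(b, X, Y) {
  e <- Y - (b[1] + b[2] * X)  # deviations e_i = Y_i - (b0 + b1*X_i)
  sum(e^2)
}

# The data of Example 1 later in the chapter
X <- c(1.0, 1.5, 2.1, 2.9, 3.2, 3.9)
Y <- c(0.60, 2.00, 1.06, 3.44, 1.17, 3.54)

S(c(0, 1), X, Y)            # an arbitrary candidate
S(c(0.0949, 0.7699), X, Y)  # the least squares estimates give a smaller S
```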
Sum of squares of deviations

[Figure: an example of a linear regression model.]
Least squares estimator

Principle of least squares: take as an estimate of the parameters the vector that makes the sum of squares as small as possible.

Applied to the linear model, the principle of least squares leads to the following definition.

Definition: A vector θ̂ is a least squares estimator (l.s.e.) of θ if S(θ̂) ≤ S(b) for any b.
Derivation of the normal equations

Let S be the sum of squares,

    S(b0, b1) = Σ_{i=1}^n ei² = Σ_{i=1}^n (Yi − b0 − b1Xi)².

Its minimum may be found by solving the system

    ∂S/∂b_r = 0,  r = 0, 1.

Note that ∂ei/∂b0 = −1 and ∂ei/∂b1 = −Xi.
Derivation of the normal equations (cont.)

    ∂S/∂b0 = −2 Σ_{i=1}^n (Yi − b0 − b1Xi),
    ∂S/∂b1 = −2 Σ_{i=1}^n Xi(Yi − b0 − b1Xi).

Equating to 0, we get the system

    Σ_{i=1}^n (Yi − b0 − b1Xi) = 0,
    Σ_{i=1}^n Xi(Yi − b0 − b1Xi) = 0.
Derivation of the normal equations (cont.)

The values of b0 and b1 that minimize S(b0, b1) are given by

    β̂1 = Σ_{i=1}^n (Xi − X̄)(Yi − Ȳ) / Σ_{i=1}^n (Xi − X̄)²,   (1)

and

    β̂0 = (1/n){ Σ_{i=1}^n Yi − β̂1 Σ_{i=1}^n Xi } = Ȳ − β̂1 X̄.   (2)

Note that we give the formula for β̂1 before the formula for β̂0 because β̂0 uses β̂1. To ease computations, let xi = Xi − X̄ and yi = Yi − Ȳ; then Equation (1) can be written as

    β̂1 = Σ_{i=1}^n xi yi / Σ_{i=1}^n xi².   (3)
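Formulas (1)-(3) translate directly into R. A minimal sketch, using the data of Example 1 below:

```r
X <- c(1.0, 1.5, 2.1, 2.9, 3.2, 3.9)
Y <- c(0.60, 2.00, 1.06, 3.44, 1.17, 3.54)

x <- X - mean(X)  # centred predictor x_i = X_i - Xbar
y <- Y - mean(Y)  # centred response  y_i = Y_i - Ybar

beta1 <- sum(x * y) / sum(x^2)      # Equation (3)
beta0 <- mean(Y) - beta1 * mean(X)  # Equation (2)
c(beta0, beta1)
```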
Terminology for the estimation

The least squares regression line is given by

    Ŷ = β̂0 + β̂1X.   (4)

(We use Ŷ to denote the predicted/fitted value of Y for a given X.) For each observation in our data we can compute

    Ŷi = β̂0 + β̂1Xi.

These are called the fitted values. Thus the ith fitted value, Ŷi, is the point on the least squares regression line (4) corresponding to Xi. The vertical distance corresponding to the ith observation is

    ei = Yi − Ŷi,   i = 1, 2, ..., n.

These vertical distances are called the ordinary least squares residuals.
Alternative expression for β̂1

It is often useful to use an equivalent formula for estimating β̂1.

Problem: show that an alternative formula for β̂1 can be expressed as

    β̂1 = Cov(Y, X) / Var(X)   (5)
        = Corr(Y, X) · (S_Y / S_X).   (6)

Note from Equations (5) and (6) that β̂1, Cov(Y, X), and Corr(Y, X) have the same sign. This makes intuitive sense, because a positive (negative) slope means a positive (negative) correlation.
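The equivalence in (5) and (6) is easy to check numerically; R's cov, cor, var, and sd all use the same n − 1 denominator, so the ratios agree with the slope returned by lm (data from Example 1 below):

```r
X <- c(1.0, 1.5, 2.1, 2.9, 3.2, 3.9)
Y <- c(0.60, 2.00, 1.06, 3.44, 1.17, 3.54)

slope_cov  <- cov(Y, X) / var(X)          # Equation (5)
slope_corr <- cor(Y, X) * sd(Y) / sd(X)   # Equation (6)
slope_lm   <- unname(coef(lm(Y ~ X))[2])  # slope estimated by lm()

c(slope_cov, slope_corr, slope_lm)  # all three coincide
```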
Example 1: Data

Consider the following data:

    Obs.   X     Y
    1      1.0   0.60
    2      1.5   2.00
    3      2.1   1.06
    4      2.9   3.44
    5      3.2   1.17
    6      3.9   3.54

By simple calculation, X̄ = 2.4333, Ȳ = 1.9683, and

    Σ_{i=1}^6 (Xi − X̄)(Yi − Ȳ) = 4.6143,   Σ_{i=1}^6 (Xi − X̄)² = 5.9933.

Thus β̂1 = 0.7699 and β̂0 = 0.0949, and hence the estimated model is

    Ŷ = 0.0949 + 0.7699X.
Example 1 (cont.)

The fitted values, Ŷi, and the ordinary least squares residuals, ei, for the data:

    Obs.   X     Y      fitted Ŷ   residual
    1      1.0   0.60   0.8648     −0.2648
    2      1.5   2.00   1.2497      0.7503
    3      2.1   1.06   1.7117     −0.6517
    4      2.9   3.44   2.3276      1.1124
    5      3.2   1.17   2.5586     −1.3886
    6      3.9   3.54   3.0975      0.4425

So, with X = 3, our prediction of Y is Ŷ = 0.0949 + 0.7699 × 3 = 2.4046.
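The fitted values, residuals, and the prediction at X = 3 can be reproduced with lm's accessor functions (a sketch of the same computation):

```r
X <- c(1.0, 1.5, 2.1, 2.9, 3.2, 3.9)
Y <- c(0.60, 2.00, 1.06, 3.44, 1.17, 3.54)

fit <- lm(Y ~ X)  # least squares fit
fitted(fit)       # fitted values Yhat_i
resid(fit)        # residuals e_i = Y_i - Yhat_i

# Prediction at a new value X = 3
predict(fit, newdata = data.frame(X = 3))
```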
Graph of Example 1

[Figure: scatter plot of the data with the fitted regression line.]

The model indicates that Y increases with X: as X increases by one unit, Y increases by 0.7699 units on average.
Example 2

This table shows the observations (years 1971-1980) and the calculations to estimate the regression equation for the corn-fertilizer problem. From Equations (1) and (2) one can obtain the values of β̂1 and β̂0:

    β̂1 = 956/576 = 1.66,
    β̂0 = 57 − (1.66)(18) = 57 − 29.88 = 27.12.
Example: Using R

Example 3: Suppose we have 10 observations of (X, Y):

    (1.2, 1.91), (2.3, 4.50), (3.5, 2.13), (4.9, 5.77), (5.9, 7.40),
    (7.1, 6.56), (8.3, 8.79), (9.2, 6.56), (10.5, 11.14), (11.5, 9.88).

We hope to fit the linear regression model Yi = β0 + β1Xi + εi, i = 1, 2, ..., 10. Here is the R code (the words after # are comments only):

    > X = c(1.2, 2.3, 3.5, 4.9, 5.9, 7.1, 8.3, 9.2, 10.5, 11.5)
    > Y = c(1.91, 4.50, 2.13, 5.77, 7.40, 6.56, 8.79, 6.56, 11.14, 9.88)
    > plot(X, Y)          # plot the observations (data)
    > myreg = lm(Y ~ X)   # do the linear regression
    > summary(myreg)      # output the estimation
Example 3: R Output

Once we type the command

    > summary(myreg)

we get the following output:

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   1.3931     0.9726   1.432 0.189932
    X             0.7874     0.1343   5.862 0.000378 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 1.406 on 8 degrees of freedom
    Multiple R-squared: 0.8111, Adjusted R-squared: 0.7875
    F-statistic: 34.36 on 1 and 8 DF,  p-value: 0.0003778
Example 3: R Output (cont.)

From the previous R output, the fitted regression line/model is

    Ŷ = 1.3931 + 0.7874X,

and for any new subject/individual with predictor value X, the prediction of EY is Ŷ = 1.3931 + 0.7874X. So, for the above model:

- If X = −3, then we predict Ŷ = −0.9691.
- If X = 3.0, then we predict Ŷ = 3.7553.
- If X = 0.5, then we predict Ŷ = 1.7868.
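The same predictions can be obtained with predict(); this sketch refits myreg from the Example 3 data:

```r
X <- c(1.2, 2.3, 3.5, 4.9, 5.9, 7.1, 8.3, 9.2, 10.5, 11.5)
Y <- c(1.91, 4.50, 2.13, 5.77, 7.40, 6.56, 8.79, 6.56, 11.14, 9.88)
myreg <- lm(Y ~ X)

# Predict EY at the new values X = -3, 3.0, 0.5
predict(myreg, newdata = data.frame(X = c(-3, 3.0, 0.5)))
```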
Example 3: Creating R plots

The following code plots the points together with the fitted line:

    plot(X, Y, pch=15)                    # plot the observations (data)
    myreg = lm(Y ~ X)                     # do the linear regression
    lines(X, myreg$fitted, lwd=2, col=2)  # plot the fitted line
    title("Scatter of (X,Y) and fitted linear regression model")