EXST Regression Techniques Page 1 SIMPLE LINEAR REGRESSION WITH MATRIX ALGEBRA


MODEL: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$

MATRIX MODEL: $Y = XB + E$, or

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$

where
Y is the vector of dependent variables,
XB is the linear model, with data values and parameter estimates in separate matrices; the "1"s account for the intercept,
E is the vector of random deviations, or random errors.

THE VALUES NEEDED FOR THE MATRIX SOLUTION ARE:

$Y'Y$, which is equivalent to USSY and equal to $\sum Y_i^2$:

$$Y'Y = \begin{bmatrix} Y_1 & Y_2 & \cdots & Y_n \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \sum Y_i^2$$

$X'X$, which produces various intermediate sums:

$$X'X = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix} \begin{bmatrix} 1 & X_1 \\ 1 & X_2 \\ \vdots & \vdots \\ 1 & X_n \end{bmatrix} = \begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix}$$

$X'Y$, which produces the sum of Y and the cross products:

$$X'Y = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ X_1 & X_2 & \cdots & X_n \end{bmatrix} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum X_i Y_i \end{bmatrix}$$

Using these three intermediate matrices ($Y'Y$, $X'X$ and $X'Y$) we can proceed with the matrix solutions. The least squares solution for linear regression is based on the solutions of the normal equations. The normal equations for a simple linear regression are

$$n b_0 + \left(\sum X_i\right) b_1 = \sum Y_i$$
$$\left(\sum X_i\right) b_0 + \left(\sum X_i^2\right) b_1 = \sum X_i Y_i$$

which can be expressed as a matrix equation:

$$\begin{bmatrix} n & \sum X_i \\ \sum X_i & \sum X_i^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum X_i Y_i \end{bmatrix}$$

From our previous calculations we can recognize this matrix equation as $(X'X)B = X'Y$, where B is the vector of regression coefficients. In order to obtain the fitted model we must solve for B:

$$(X'X)^{-1}(X'X)B = (X'X)^{-1}(X'Y)$$
$$I \cdot B = (X'X)^{-1}(X'Y)$$
$$B = (X'X)^{-1}(X'Y)$$

where, for a $2 \times 2$ matrix $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$,

$$A^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix}$$

and then

$$(X'X)^{-1} = \frac{1}{n\sum X_i^2 - \left(\sum X_i\right)^2} \begin{bmatrix} \sum X_i^2 & -\sum X_i \\ -\sum X_i & n \end{bmatrix}$$

$$B = (X'X)^{-1}(X'Y) = \begin{bmatrix} \dfrac{\sum X_i^2 \sum Y_i - \sum X_i \sum X_i Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} \\[2ex] \dfrac{n\sum X_i Y_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2} \end{bmatrix} = \begin{bmatrix} b_0 \\ b_1 \end{bmatrix}$$
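The matrix solution above can be sketched numerically. The data here are hypothetical, chosen only to make the arithmetic visible; this is an illustrative NumPy sketch, not part of the original notes.

```python
import numpy as np

# Hypothetical data (n = 5), used only to illustrate the matrix solution.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

X = np.column_stack([np.ones_like(x), x])   # the column of 1s handles the intercept

XtX = X.T @ X          # [[n, sum X], [sum X, sum X^2]]
XtY = X.T @ Y          # [sum Y, sum XY]
YtY = Y @ Y            # uncorrected sum of squares of Y

B = np.linalg.inv(XtX) @ XtY   # B = (X'X)^-1 (X'Y)
print(B)               # [b0, b1]; here b0 = 0.05, b1 = 1.99
```

In practice `np.linalg.solve(XtX, XtY)` is preferred over forming the inverse explicitly, but the inverse is shown to mirror the derivation in the text.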

Once we have fitted the line using the matrix equation, we proceed to obtain the Analysis of Variance table and our tests of hypotheses.

ANOVA TABLE in matrix form

              uncorrected                      corrected
Source        d.f.   USS                d.f.   SS
Regression     2     $B'X'Y$             1     $B'X'Y - CF$
Error         n-2    $Y'Y - B'X'Y$      n-2    $Y'Y - B'X'Y$
Total          n     $Y'Y$              n-1    $Y'Y - CF$

The correction factor (CF) is the same as has been previously discussed. It can be calculated as either $\left(\sum_{i=1}^n Y_i\right)^2 / n$ or $n\bar{Y}^2$.

Additional calculations are

$$R^2 = \frac{B'X'Y - CF}{Y'Y - CF}$$

and the F test is given by

$$F = \frac{MSRegression}{MSError} = \frac{(B'X'Y - CF)/df_{Reg}}{(Y'Y - B'X'Y)/df_{Error}}$$

The variance-covariance matrix can be calculated as

$$MSE \cdot (X'X)^{-1} = MSE \begin{bmatrix} c_{00} & c_{01} \\ c_{10} & c_{11} \end{bmatrix} = \begin{bmatrix} MSE \cdot c_{00} & MSE \cdot c_{01} \\ MSE \cdot c_{10} & MSE \cdot c_{11} \end{bmatrix} = \begin{bmatrix} S_{b_0}^2 & S_{b_0 b_1} \\ S_{b_1 b_0} & S_{b_1}^2 \end{bmatrix}$$
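The ANOVA quantities come entirely from the three matrix products. A sketch with hypothetical data (the numbers are illustrative, not from the notes):

```python
import numpy as np

# Hypothetical SLR data; every ANOVA quantity below is built from
# the three matrix products Y'Y, X'X and X'Y.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = np.column_stack([np.ones_like(x), x])
n = len(Y)

B = np.linalg.solve(X.T @ X, X.T @ Y)
YtY = Y @ Y                  # uncorrected total SS
BtXtY = B @ (X.T @ Y)        # B'X'Y, the uncorrected regression SS
CF = Y.sum()**2 / n          # correction factor, (sum Y)^2 / n = n*Ybar^2

SSReg = BtXtY - CF
SSE = YtY - BtXtY
SSTotal = YtY - CF
R2 = SSReg / SSTotal
F = (SSReg / 1) / (SSE / (n - 2))   # df_Reg = 1, df_Error = n - 2
```

Note that `SSReg + SSE` reproduces `SSTotal`, mirroring the corrected column of the table.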

REVIEW OF SIMPLE LINEAR REGRESSION WITH MATRIX ALGEBRA

First obtain $Y'Y$, $X'X$, $X'Y$, and $(X'X)^{-1}$, then

$$B = (X'X)^{-1}(X'Y)$$
$$SSReg = B'X'Y - CF$$
$$SSTotal = Y'Y - CF$$
$$SSE = SSTotal - SSReg = Y'Y - B'X'Y$$

For a row vector $L = [1 \;\; X_0]$, the fitted value is $\hat{Y} = LB$, with

$$S_{\hat{Y}}^2 = L(X'X)^{-1}L'(MSE)$$

for the estimated mean, and

$$S_{\hat{Y}}^2 = [1 + L(X'X)^{-1}L'](MSE) = L(X'X)^{-1}L'(MSE) + MSE$$

for a predicted individual observation.

Variance-Covariance Matrix $= (X'X)^{-1}(MSE)$

Prediction of the true mean (population mean) and of new observations.

SLR

a) Predicting the true population mean at $x_0$

Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
The true population mean at $x_0$ is: $\beta_0 + \beta_1 x_0$
The predicted value ($\hat{Y}_0$) is: $b_0 + b_1 x_0 = \bar{Y} + b_1(x_0 - \bar{X})$
Variance (error) for $\hat{Y}_0$ is: $\sigma^2\left(\dfrac{1}{n} + \dfrac{(x_0 - \bar{X})^2}{S_{XX}}\right)$
Confidence Interval: $(b_0 + b_1 x_0) \pm t_{\frac{\alpha}{2},\, n-2} \sqrt{MSE\left(\dfrac{1}{n} + \dfrac{(x_0 - \bar{X})^2}{S_{XX}}\right)}$

b) Predicting a new observation at $x_0$

The true observation is: $Y_0 = \beta_0 + \beta_1 x_0 + \varepsilon_0$
The predicted value ($\hat{Y}_0$) is: $b_0 + b_1 x_0$
Variance (error) for $\hat{Y}_0$ is: $\sigma^2\left(1 + \dfrac{1}{n} + \dfrac{(x_0 - \bar{X})^2}{S_{XX}}\right)$
Confidence Interval: $(b_0 + b_1 x_0) \pm t_{\frac{\alpha}{2},\, n-2} \sqrt{MSE\left(1 + \dfrac{1}{n} + \dfrac{(x_0 - \bar{X})^2}{S_{XX}}\right)}$

MLR - matrix algebra generalization

a) Predicting the true population mean

Model: $Y_i = X_i\beta + \varepsilon_i$
The true population mean at $X_0$ is: $X_0\beta$
The predicted value is: $X_0 B$
Variance (error) is: $\sigma^2\, X_0 (X'X)^{-1} X_0'$
Confidence Interval: $X_0 b \pm t_{\frac{\alpha}{2},\, n-p} \sqrt{MSE \cdot X_0 (X'X)^{-1} X_0'}$

b) Predicting a new observation

The true observation at $X_0$ is: $Y_0 = X_0\beta + \varepsilon_0$
The predicted value is: $X_0 b$
Variance (error) is: $\sigma^2\left[1 + X_0 (X'X)^{-1} X_0'\right]$
Confidence Interval: $X_0 b \pm t_{\frac{\alpha}{2},\, n-p} \sqrt{MSE\left[1 + X_0 (X'X)^{-1} X_0'\right]}$
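The two SLR variance formulas differ only by one MSE, which is easy to verify numerically. The data are hypothetical, and the critical value t(0.025, df = 3) ≈ 3.182 is hard-coded from a t table rather than computed:

```python
import numpy as np

# Hypothetical SLR fit; compares the mean-response and new-observation
# variances at a chosen x0.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
X = np.column_stack([np.ones_like(x), x])
n = len(Y)

B = np.linalg.solve(X.T @ X, X.T @ Y)
resid = Y - X @ B
MSE = (resid @ resid) / (n - 2)
Sxx = ((x - x.mean())**2).sum()

x0 = 3.5
Yhat0 = B[0] + B[1] * x0
var_mean = MSE * (1/n + (x0 - x.mean())**2 / Sxx)       # true-mean variance
var_new  = MSE * (1 + 1/n + (x0 - x.mean())**2 / Sxx)   # new-observation variance

t = 3.182  # t(alpha/2 = 0.025, df = n-2 = 3), taken from a table (assumption)
ci_mean = (Yhat0 - t * np.sqrt(var_mean), Yhat0 + t * np.sqrt(var_mean))
pi_new  = (Yhat0 - t * np.sqrt(var_new),  Yhat0 + t * np.sqrt(var_new))
```

The prediction interval `pi_new` is always wider than `ci_mean`, since its variance carries the extra MSE term.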

More ANOVA table stuff

We have seen the partitioning of $\sum Y_i^2$ into components:

$$SSTotal_{(uncorrected)} = SS(CF) + SSRegression + SSError$$

ANOVA TABLE for SLR

SOURCE               d.f.   SS
Mean                  1     $n\bar{Y}^2$
Regression            1     $\sum(\hat{Y}_i - \bar{Y})^2$
Error                n-2    $\sum(Y_i - \hat{Y}_i)^2$
Total (uncorrected)   n     $\sum Y_i^2$

SS(Mean) = SS($b_0$) = $SS_R(b_0)$
SS(Regression) = SS($b_1 \mid b_0$) = $SS_R(b_1 \mid b_0)$
SS(Error) = SSE = SSResiduals
$SS_R(b_0, b_1) = B'X'Y$ = combined sum of squares for SSMean and SSReg

The usual ANOVA TABLE for SLR

SOURCE              d.f.   SS                              MS
$SS_R(b_1 \mid b_0)$  1    $\sum(\hat{Y}_i - \bar{Y})^2$   $MSReg = \sum(\hat{Y}_i - \bar{Y})^2 / 1$
Error               n-2    $\sum(Y_i - \hat{Y}_i)^2$       $MSError = \sum(Y_i - \hat{Y}_i)^2 / (n-2)$
Total (corrected)   n-1    $\sum(Y_i - \bar{Y})^2$         $S_Y^2 = \sum(Y_i - \bar{Y})^2 / (n-1)$

Multiple Regression

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i, \quad \text{where } E(\varepsilon_i) = 0$$

then $E(Y_i) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}$.

This model then predicts any point on a PLANE, where the axes of the plane are $X_1$ and $X_2$. The response variable, $Y_i$, gives the height above the plane.

If we choose any particular value of $X_1$ or $X_2$, then we essentially take a slice of the plane.

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 X_{2i}$$

Holding $X_2$ constant at a value $C_{X_2}$:

$$\hat{Y}_i = b_0 + b_1 X_{1i} + b_2 C_{X_2} = (b_0 + b_2 C_{X_2}) + b_1 X_{1i}$$

Let's suppose we take that slice at a point where $X_1 = 10$ for the following model. We are then holding $X_1$ constant at 10, and examining the line (a slice of the plane) at that point.

$$\hat{Y}_i = 5 + 2X_{1i} + 3X_{2i}$$
$$\hat{Y}_i = 5 + 2 \cdot 10 + 3X_{2i}$$
$$\hat{Y}_i = 25 + 3X_{2i}$$

This is a simple linear function. At every particular value of either $X_1$ or $X_2$, the function for the other X will be simple linear for this model.

NOTE that the interpretation of the regression coefficients is the same as before, except that now we have one regression coefficient per independent variable.

General Linear Regressions

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_{p-1} X_{p-1,i} + \varepsilon_i$$

This is no longer a simple plane; it describes a hyperplane. However, we could still hold all $X_k$ constant except one, and describe a simple linear function.
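The slice-of-the-plane idea can be checked directly, assuming the text's example model $\hat{Y} = 5 + 2X_1 + 3X_2$ (the $X_1$ coefficient is partly illegible in the source, so the 2 is an assumption):

```python
def yhat(x1, x2):
    # Example plane (coefficients assumed for illustration): 5 + 2*x1 + 3*x2
    return 5 + 2 * x1 + 3 * x2

# Holding X1 = 10 collapses the plane to a line in x2:
# (5 + 2*10) + 3*x2 = 25 + 3*x2
slice_at_10 = [yhat(10, x2) for x2 in range(4)]
print(slice_at_10)   # [25, 28, 31, 34] -> intercept 25, slope 3
```

The evenly spaced outputs confirm the slice is a simple linear function of $X_2$ with slope 3.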

Calculations for Multiple Regression

$$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + e_i$$

where there are k different independent variables.

1) As before, the equation can be solved for $e_i$, and partial derivatives taken with respect to each unknown ($b_0, b_1, b_2, \dots$).

2) The partial derivatives can be set equal to zero, and solved simultaneously to get k+1 equations with k+1 unknowns.

3) The normal equations derived are

$$n b_0 + \sum X_1 b_1 + \sum X_2 b_2 + \dots + \sum X_k b_k = \sum Y_i$$
$$\sum X_1 b_0 + \sum X_1^2 b_1 + \sum X_1 X_2 b_2 + \dots + \sum X_1 X_k b_k = \sum X_1 Y_i$$
$$\sum X_2 b_0 + \sum X_1 X_2 b_1 + \sum X_2^2 b_2 + \dots + \sum X_2 X_k b_k = \sum X_2 Y_i$$
$$\vdots$$
$$\sum X_k b_0 + \sum X_1 X_k b_1 + \sum X_2 X_k b_2 + \dots + \sum X_k^2 b_k = \sum X_k Y_i$$

which can be factored out to an equation of matrices:

$$\begin{bmatrix} n & \sum X_1 & \sum X_2 & \cdots & \sum X_k \\ \sum X_1 & \sum X_1^2 & \sum X_1 X_2 & \cdots & \sum X_1 X_k \\ \sum X_2 & \sum X_1 X_2 & \sum X_2^2 & \cdots & \sum X_2 X_k \\ \vdots & \vdots & \vdots & & \vdots \\ \sum X_k & \sum X_1 X_k & \sum X_2 X_k & \cdots & \sum X_k^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ \vdots \\ b_k \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum X_{1i} Y_i \\ \sum X_{2i} Y_i \\ \vdots \\ \sum X_{ki} Y_i \end{bmatrix}$$
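The matrix $X'X$ contains exactly the sums appearing in the normal equations, which a small numerical check makes concrete. The two-predictor data here are hypothetical:

```python
import numpy as np

# Hypothetical data with k = 2 predictors; X'X and X'Y hold precisely
# the sums that appear in the k+1 normal equations.
X1 = np.array([1.0, 2.0, 3.0, 4.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0])
Y  = np.array([5.0, 6.0, 9.0, 10.0])
X  = np.column_stack([np.ones(4), X1, X2])

XtX = X.T @ X    # row 0 is [n, sum X1, sum X2]; entry (1,2) is sum X1*X2, etc.
XtY = X.T @ Y    # [sum Y, sum X1*Y, sum X2*Y]

b = np.linalg.solve(XtX, XtY)   # solves the k+1 simultaneous equations
```

Solving `XtX @ b = XtY` is exactly the simultaneous solution of the normal equations described in steps 1-3.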

Matrix calculations for General Regression: Numerical Example - NWK7.0, mathematician salaries. $X_1$ = index of publication quality, $X_2$ = years of experience, $X_3$ = success in getting grant support.

X is the raw data matrix, with a leading column of 1s followed by the columns $X_1$, $X_2$, and $X_3$; Y is the vector of observed salaries.

Raw data matrices (X and Y) and the intermediate calculations ($X'X$, $X'Y$ and $Y'Y$), with all sums running over $i = 1, \dots, n$:

$$X'X = \begin{bmatrix} n & \sum X_1 & \sum X_2 & \sum X_3 \\ \sum X_1 & \sum X_1^2 & \sum X_1 X_2 & \sum X_1 X_3 \\ \sum X_2 & \sum X_1 X_2 & \sum X_2^2 & \sum X_2 X_3 \\ \sum X_3 & \sum X_1 X_3 & \sum X_2 X_3 & \sum X_3^2 \end{bmatrix}$$

$$X'Y = \begin{bmatrix} \sum Y_i \\ \sum X_1 Y_i \\ \sum X_2 Y_i \\ \sum X_3 Y_i \end{bmatrix}$$

$$Y'Y = \sum_{i=1}^{n} Y_i^2$$

The normal equations derived are

$$n b_0 + \sum X_1 b_1 + \sum X_2 b_2 + \sum X_3 b_3 = \sum Y_i$$
$$\sum X_1 b_0 + \sum X_1^2 b_1 + \sum X_1 X_2 b_2 + \sum X_1 X_3 b_3 = \sum X_1 Y_i$$
$$\sum X_2 b_0 + \sum X_1 X_2 b_1 + \sum X_2^2 b_2 + \sum X_2 X_3 b_3 = \sum X_2 Y_i$$
$$\sum X_3 b_0 + \sum X_1 X_3 b_1 + \sum X_2 X_3 b_2 + \sum X_3^2 b_3 = \sum X_3 Y_i$$

which can be factored out to an equation of matrices:

$$\begin{bmatrix} n & \sum X_1 & \sum X_2 & \sum X_3 \\ \sum X_1 & \sum X_1^2 & \sum X_1 X_2 & \sum X_1 X_3 \\ \sum X_2 & \sum X_1 X_2 & \sum X_2^2 & \sum X_2 X_3 \\ \sum X_3 & \sum X_1 X_3 & \sum X_2 X_3 & \sum X_3^2 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} \sum Y_i \\ \sum X_{1i} Y_i \\ \sum X_{2i} Y_i \\ \sum X_{3i} Y_i \end{bmatrix}$$

Analysis starts with the $X'X$ inverse

$$(X'X)^{-1} = \begin{bmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{bmatrix}$$

$$B = (X'X)^{-1}(X'Y) = \begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \end{bmatrix}$$

Analysis of Variance

$$USSTotal = Y'Y$$
$$USSRegression = B'X'Y$$
$$SSE = Y'Y - B'X'Y = UCSSTotal - UCSSReg$$

The ANOVA table calculated with matrix formulas is

Source       d.f.        SS
Regression   p-1 = 3     $B'X'Y - CF$
Error        n-p = 20    $Y'Y - B'X'Y$
Total        n-1 = 23    $Y'Y - CF$

where the correction factor is calculated as usual, $CF = \left(\sum Y_i\right)^2 / n = n\bar{Y}^2$.

The F test of the model is a joint test of the regression coefficients,

$$H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_{p-1} = 0$$

$$F = \frac{MSRegression}{MSError} = \frac{(B'X'Y - CF)/df_{Reg}}{(Y'Y - B'X'Y)/df_{Error}}$$

where, for the two-predictor case,

$$E(MSR) = \sigma^2 + \frac{\beta_1^2 \sum (X_{1i} - \bar{X}_1)^2 + \beta_2^2 \sum (X_{2i} - \bar{X}_2)^2 + 2\beta_1\beta_2 \sum (X_{1i} - \bar{X}_1)(X_{2i} - \bar{X}_2)}{2}$$

NOTE: E(MSR) departs from $\sigma^2$ as any $\beta_k$ increases in magnitude (+ or -) or as any $X_{ki}$ increases in distance from $\bar{X}_k$. The F test is a joint test for all $\beta_k$ jointly equal to 0.

To test any $\beta_k$ individually, we can still use

$$t = \frac{b_k - 0}{s_{b_k}}$$

where $s_{b_k}$ is obtained from the VARIANCE-COVARIANCE matrix (below). The confidence interval for any $\beta_k$ is given by

$$P\!\left(b_k - t_{\frac{\alpha}{2},\, n-p}\, s_{b_k} \le \beta_k \le b_k + t_{\frac{\alpha}{2},\, n-p}\, s_{b_k}\right) = 1 - \alpha$$

and the Bonferroni joint confidence interval for several $\beta_k$ parameters is given by

$$P\!\left(b_k - t_{\frac{\alpha}{2g},\, n-p}\, s_{b_k} \le \beta_k \le b_k + t_{\frac{\alpha}{2g},\, n-p}\, s_{b_k}\right) = 1 - \alpha$$

where g is the number of parameters.

The VARIANCE-COVARIANCE matrix is calculated from the $(X'X)^{-1}$ matrix:

$$(X'X)^{-1} MSE = MSE \begin{bmatrix} c_{00} & c_{01} & c_{02} & c_{03} \\ c_{10} & c_{11} & c_{12} & c_{13} \\ c_{20} & c_{21} & c_{22} & c_{23} \\ c_{30} & c_{31} & c_{32} & c_{33} \end{bmatrix} = \begin{bmatrix} MSE \cdot c_{00} & MSE \cdot c_{01} & MSE \cdot c_{02} & MSE \cdot c_{03} \\ MSE \cdot c_{10} & MSE \cdot c_{11} & MSE \cdot c_{12} & MSE \cdot c_{13} \\ MSE \cdot c_{20} & MSE \cdot c_{21} & MSE \cdot c_{22} & MSE \cdot c_{23} \\ MSE \cdot c_{30} & MSE \cdot c_{31} & MSE \cdot c_{32} & MSE \cdot c_{33} \end{bmatrix}$$

$$= \begin{bmatrix} s_{b_0}^2 & s_{b_0 b_1} & s_{b_0 b_2} & s_{b_0 b_3} \\ s_{b_1 b_0} & s_{b_1}^2 & s_{b_1 b_2} & s_{b_1 b_3} \\ s_{b_2 b_0} & s_{b_2 b_1} & s_{b_2}^2 & s_{b_2 b_3} \\ s_{b_3 b_0} & s_{b_3 b_1} & s_{b_3 b_2} & s_{b_3}^2 \end{bmatrix} = \begin{bmatrix} Var(b_0) & Cov(b_0, b_1) & \cdots & Cov(b_0, b_3) \\ Cov(b_1, b_0) & Var(b_1) & \cdots & Cov(b_1, b_3) \\ \vdots & & & \vdots \\ Cov(b_3, b_0) & Cov(b_3, b_1) & \cdots & Var(b_3) \end{bmatrix} = \text{Var-Cov matrix}$$

where the $c_{ij}$ values are called Gaussian multipliers. The VARIANCE-COVARIANCE matrix is calculated from the $(X'X)^{-1}$ matrix by multiplying by the MSError. The entries are unbiased estimates of the variances and covariances of the regression coefficients. The individual values then provide the variances and covariances such that

$MSE \cdot c_{00}$ = Variance of $b_0$ = VAR($b_0$)
$MSE \cdot c_{11}$ = Variance of $b_1$ = VAR($b_1$), so $s_{b_1} = \sqrt{MSE \cdot c_{11}}$
$MSE \cdot c_{01} = MSE \cdot c_{10}$ = Covariance of $b_0$ and $b_1$ = COV($b_0$, $b_1$)
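Standard errors and t statistics fall directly out of the variance-covariance matrix. A sketch with hypothetical two-predictor data:

```python
import numpy as np

# Hypothetical MLR data; s_bk comes from the diagonal of MSE*(X'X)^-1,
# i.e., the Gaussian multipliers c_kk scaled by MSE.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])
X  = np.column_stack([np.ones(6), X1, X2])
n, p = X.shape

XtX_inv = np.linalg.inv(X.T @ X)        # the matrix of Gaussian multipliers c_ij
b = XtX_inv @ X.T @ Y
resid = Y - X @ b
MSE = (resid @ resid) / (n - p)

var_cov = MSE * XtX_inv                 # variance-covariance matrix of b
s_b = np.sqrt(np.diag(var_cov))         # standard errors s_b0, s_b1, s_b2
t_stats = b / s_b                       # t = (b_k - 0) / s_bk, with df = n - p
```

Note the symmetry `var_cov[j, k] == var_cov[k, j]`, matching COV(b_j, b_k) = COV(b_k, b_j) in the table above.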

Prediction of mean response

For simple linear regression we got $\hat{Y}$ and its CI for some X. For multiple regression, we need a value for each $X_j$.

Given a vector $X_0 = \begin{bmatrix} 1 \\ X_1 \\ X_2 \\ \vdots \\ X_{p-1} \end{bmatrix}$,

$$E(Y_0) = X_0'\beta, \qquad \hat{Y}_0 = X_0'B$$

The variance estimates for mean responses are given by

$$MSE \cdot X_0'(X'X)^{-1}X_0$$

and for individual observations, add one MSE:

$$MSE + MSE \cdot X_0'(X'X)^{-1}X_0$$

$$P\!\left(\hat{Y}_0 - t_{\frac{\alpha}{2},\, n-p}\, s_{\hat{Y}} \le E(Y_0) \le \hat{Y}_0 + t_{\frac{\alpha}{2},\, n-p}\, s_{\hat{Y}}\right) = 1 - \alpha$$

Simultaneous estimates of several mean responses can employ either the Working-Hotelling approach,

$$\hat{Y} \pm W s_{\hat{Y}}, \quad \text{where } W^2 = p\, F_{1-\alpha;\; p,\, n-p}$$

or the Bonferroni approach,

$$\hat{Y} \pm B s_{\hat{Y}}, \quad \text{where } B = t_{1-\frac{\alpha}{2g};\; n-p}$$

For individual observations the prediction is the same, $\hat{Y} = X_0'B$, and the variance is one MSE larger than for the mean:

$$MSE + MSE \cdot X_0'(X'X)^{-1}X_0$$

For the mean of a new sample of size m, the variance is

$$\frac{MSE}{m} + MSE \cdot X_0'(X'X)^{-1}X_0$$

As with the SLR, confidence intervals for g new observations can be done with Scheffé limits,

$$\hat{Y} \pm S s_{\hat{Y}}, \quad \text{where } S^2 = g\, F_{1-\alpha;\; g,\, n-p}$$

or the Bonferroni approach,

$$\hat{Y} \pm B s_{\hat{Y}}, \quad \text{where } B = t_{1-\frac{\alpha}{2g};\; n-p}$$
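The three prediction variances (mean response, one new observation, mean of m new observations) differ only in the leading MSE term, which can be checked at a single $X_0$ vector. The data and the point $X_0$ are hypothetical:

```python
import numpy as np

# Hypothetical MLR fit; compares the three variance formulas at one X0.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y  = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])
X  = np.column_stack([np.ones(6), X1, X2])
n, p = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
resid = Y - X @ b
MSE = (resid @ resid) / (n - p)

X0 = np.array([1.0, 3.5, 4.0])        # [1, x1, x2] at the prediction point
h = X0 @ XtX_inv @ X0                 # X0'(X'X)^-1 X0

var_mean   = MSE * h                  # true mean response
var_single = MSE + MSE * h            # one new observation
m = 4
var_mean_m = MSE / m + MSE * h        # mean of m new observations
```

The ordering `var_mean < var_mean_m < var_single` holds for any m > 1, since MSE/m shrinks toward zero as m grows.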

Coefficient of Multiple Determination - the proportion of the SSTotal (usually corrected) accounted for by the Regression line (SSReg).

Models with an intercept:

$$R^2 = \frac{SS_R(b_1, b_2, \dots, b_k \mid b_0)}{SSTotal\ (corrected)} = \frac{B'X'Y - CF}{Y'Y - CF} = \frac{S_{YY} - SSE}{S_{YY}} = 1 - \frac{SSE}{S_{YY}}$$

Recalling the expressions for $S_{YY}$, we have

$$R^2 = \frac{B'X'Y - n\bar{Y}^2}{Y'Y - n\bar{Y}^2} = \frac{\sum(\hat{Y}_i - \bar{Y})^2}{\sum(Y_i - \bar{Y})^2} = 1 - \frac{SSE}{SSTotal\ (corrected)}$$

Some Properties of $R^2$ (same as for SLR):

1) $0 \le R^2 \le 1$
2) $R^2 = 1.0$ iff $\hat{Y}_i = Y_i$ for all i (perfect prediction, SSE = 0)
3) $R^2 = r_{XY}^2$ for simple linear regression
4) $R^2 = r_{Y\hat{Y}}^2$ for all models with intercepts
5) $R^2 \ne 1.0$ when there are different repeated values of $Y_i$ at some value of $X_i$ (no matter how well the model fits)
6) $R^2_{SubModel} \le R^2_{FullModel}$. New independent variables added to a model will increase $R^2$. The $R^2$ for the full model could be EQUAL to, but never less than, the $R^2$ for the submodel.

F test for Lack of Fit

$$E(Y) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_{p-1} X_{p-1,i}$$

To get true repeats in multiple regression, EVERY independent variable must remain the same from one observation to another. This can be calculated with full and reduced models.
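Property 6 can be demonstrated numerically: even adding a predictor that is pure noise cannot lower $R^2$. The data here are randomly generated for illustration only:

```python
import numpy as np

# Demonstrates property 6 with hypothetical data: adding a predictor
# (here, pure noise) never decreases R^2.
rng = np.random.default_rng(0)
n = 20
x1 = np.arange(n, dtype=float)
noise = rng.normal(size=n)                    # an irrelevant extra predictor
Y = 2.0 + 0.5 * x1 + rng.normal(size=n)

def r_squared(X, Y):
    # Least squares fit, then R^2 = 1 - SSE / SSTotal(corrected).
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    SSE = ((Y - X @ b)**2).sum()
    SSTot = ((Y - Y.mean())**2).sum()
    return 1 - SSE / SSTot

X_sub  = np.column_stack([np.ones(n), x1])          # submodel
X_full = np.column_stack([np.ones(n), x1, noise])   # full model, one more column

r2_sub  = r_squared(X_sub, Y)
r2_full = r_squared(X_full, Y)   # never smaller than r2_sub
```

The full model's SSE can only shrink (or stay equal) when a column is added, which is exactly why a large $R^2$ by itself does not justify a bigger model.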

New problems associated with MULTIPLE REGRESSION

The SLR fitted only a single slope, so we needed only one SSReg to describe it. With various slopes, we will need some other sums of squares to describe the various fitted slopes. We will actually see 2 types of SS.

Non-problems associated with multiple regression:

a) All previous definitions and notation apply.
b) The assumptions are basically the same (more $X_i$, each measured without error).

Many of the tests of hypothesis will be discussed in terms of the General Linear Test, with appropriate full and reduced models. This test does not really change with multiple regression; we still have the same table:

Model            d.f.     SS          MS          F
Reduced (Error)  n-p      SSE(Red)
Full (Error)     n-p-q    SSE(Full)
Difference       q        SS(Diff)    MS(Diff)
Full (Error)     n-p-q    SSE(Full)   MSE(Full)   MS(Diff)/MSE(Full)

Model            d.f.     SS          MS          F
Full (SSReg)     p+q      SSR(Full)
Reduced (SSReg)  p        SSR(Red)
Difference       q        SS(Diff)    MS(Diff)
Full (Error)     n-p-q    SSE(Full)   MSE(Full)   MS(Diff)/MSE(Full)
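The General Linear Test table above can be sketched numerically. The data and the single added predictor (q = 1) are hypothetical; the F statistic is the difference in error SS, scaled by its degrees of freedom, over MSE(Full):

```python
import numpy as np

# Sketch of the General Linear Test with hypothetical data: the extra
# sum of squares from q added predictors, tested against MSE(Full).
rng = np.random.default_rng(1)
n = 30
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
Y = 1.0 + 2.0 * x1 + rng.normal(size=n)   # x2 is truly irrelevant in this setup

def sse(X, Y):
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return ((Y - X @ b)**2).sum()

X_red  = np.column_stack([np.ones(n), x1])        # reduced model, p = 2 parameters
X_full = np.column_stack([np.ones(n), x1, x2])    # full model, p + q parameters, q = 1

SSE_red, SSE_full = sse(X_red, Y), sse(X_full, Y)
q = 1
df_full = n - X_full.shape[1]                     # n - p - q
F = ((SSE_red - SSE_full) / q) / (SSE_full / df_full)
```

The numerator is SS(Diff)/q from the "Difference" row; under H0 (the added coefficients are zero), F follows an F distribution with q and n-p-q degrees of freedom.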


More information

Intermediate Econometrics

Intermediate Econometrics Intermediate Econometrics Markus Haas LMU München Summer term 2011 15. Mai 2011 The Simple Linear Regression Model Considering variables x and y in a specific population (e.g., years of education and wage

More information

Biostatistics 380 Multiple Regression 1. Multiple Regression

Biostatistics 380 Multiple Regression 1. Multiple Regression Biostatistics 0 Multiple Regression ORIGIN 0 Multiple Regression Multiple Regression is an extension of the technique of linear regression to describe the relationship between a single dependent (response)

More information

Bivariate distributions

Bivariate distributions Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient

More information

1) Answer the following questions as true (T) or false (F) by circling the appropriate letter.

1) Answer the following questions as true (T) or false (F) by circling the appropriate letter. 1) Answer the following questions as true (T) or false (F) by circling the appropriate letter. T F T F T F a) Variance estimates should always be positive, but covariance estimates can be either positive

More information

Outline. Remedial Measures) Extra Sums of Squares Standardized Version of the Multiple Regression Model

Outline. Remedial Measures) Extra Sums of Squares Standardized Version of the Multiple Regression Model Outline 1 Multiple Linear Regression (Estimation, Inference, Diagnostics and Remedial Measures) 2 Special Topics for Multiple Regression Extra Sums of Squares Standardized Version of the Multiple Regression

More information

Inference for the Regression Coefficient

Inference for the Regression Coefficient Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates

More information

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)

STAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow) STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points

More information

Lecture 14 Simple Linear Regression

Lecture 14 Simple Linear Regression Lecture 4 Simple Linear Regression Ordinary Least Squares (OLS) Consider the following simple linear regression model where, for each unit i, Y i is the dependent variable (response). X i is the independent

More information

From last time... The equations

From last time... The equations This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Homework 2: Simple Linear Regression

Homework 2: Simple Linear Regression STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA

More information

STAT 511. Lecture : Simple linear regression Devore: Section Prof. Michael Levine. December 3, Levine STAT 511

STAT 511. Lecture : Simple linear regression Devore: Section Prof. Michael Levine. December 3, Levine STAT 511 STAT 511 Lecture : Simple linear regression Devore: Section 12.1-12.4 Prof. Michael Levine December 3, 2018 A simple linear regression investigates the relationship between the two variables that is not

More information

Regression Models - Introduction

Regression Models - Introduction Regression Models - Introduction In regression models there are two types of variables that are studied: A dependent variable, Y, also called response variable. It is modeled as random. An independent

More information

STAT5044: Regression and Anova

STAT5044: Regression and Anova STAT5044: Regression and Anova Inyoung Kim 1 / 25 Outline 1 Multiple Linear Regression 2 / 25 Basic Idea An extra sum of squares: the marginal reduction in the error sum of squares when one or several

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Ch 3: Multiple Linear Regression

Ch 3: Multiple Linear Regression Ch 3: Multiple Linear Regression 1. Multiple Linear Regression Model Multiple regression model has more than one regressor. For example, we have one response variable and two regressor variables: 1. delivery

More information

Statistical Modelling in Stata 5: Linear Models

Statistical Modelling in Stata 5: Linear Models Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does

More information

Analysis of Bivariate Data

Analysis of Bivariate Data Analysis of Bivariate Data Data Two Quantitative variables GPA and GAES Interest rates and indices Tax and fund allocation Population size and prison population Bivariate data (x,y) Case corr&reg 2 Independent

More information

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8 Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall

More information

LINEAR REGRESSION MODELS W4315

LINEAR REGRESSION MODELS W4315 LINEAR REGRESSION MODELS W431 HOMEWORK ANSWERS March 9, 2010 Due: 03/04/10 Instructor: Frank Wood 1. (20 points) In order to get a maximum likelihood estimate of the parameters of a Box-Cox transformed

More information

Regression Estimation Least Squares and Maximum Likelihood

Regression Estimation Least Squares and Maximum Likelihood Regression Estimation Least Squares and Maximum Likelihood Dr. Frank Wood Frank Wood, fwood@stat.columbia.edu Linear Regression Models Lecture 3, Slide 1 Least Squares Max(min)imization Function to minimize

More information

STA 4210 Practise set 2a

STA 4210 Practise set 2a STA 410 Practise set a For all significance tests, use = 0.05 significance level. S.1. A multiple linear regression model is fit, relating household weekly food expenditures (Y, in $100s) to weekly income

More information

STAT 4385 Topic 03: Simple Linear Regression

STAT 4385 Topic 03: Simple Linear Regression STAT 4385 Topic 03: Simple Linear Regression Xiaogang Su, Ph.D. Department of Mathematical Science University of Texas at El Paso xsu@utep.edu Spring, 2017 Outline The Set-Up Exploratory Data Analysis

More information

Chapter 10. Design of Experiments and Analysis of Variance

Chapter 10. Design of Experiments and Analysis of Variance Chapter 10 Design of Experiments and Analysis of Variance Elements of a Designed Experiment Response variable Also called the dependent variable Factors (quantitative and qualitative) Also called the independent

More information

Chapter 2 Multiple Regression I (Part 1)

Chapter 2 Multiple Regression I (Part 1) Chapter 2 Multiple Regression I (Part 1) 1 Regression several predictor variables The response Y depends on several predictor variables X 1,, X p response {}}{ Y predictor variables {}}{ X 1, X 2,, X p

More information

Linear Algebra Review

Linear Algebra Review Linear Algebra Review Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Linear Algebra Review 1 / 45 Definition of Matrix Rectangular array of elements arranged in rows and

More information

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

 M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2 Notation and Equations for Final Exam Symbol Definition X The variable we measure in a scientific study n The size of the sample N The size of the population M The mean of the sample µ The mean of the

More information

ECON The Simple Regression Model

ECON The Simple Regression Model ECON 351 - The Simple Regression Model Maggie Jones 1 / 41 The Simple Regression Model Our starting point will be the simple regression model where we look at the relationship between two variables In

More information

Overview Scatter Plot Example

Overview Scatter Plot Example Overview Topic 22 - Linear Regression and Correlation STAT 5 Professor Bruce Craig Consider one population but two variables For each sampling unit observe X and Y Assume linear relationship between variables

More information

STAT5044: Regression and Anova. Inyoung Kim

STAT5044: Regression and Anova. Inyoung Kim STAT5044: Regression and Anova Inyoung Kim 2 / 47 Outline 1 Regression 2 Simple Linear regression 3 Basic concepts in regression 4 How to estimate unknown parameters 5 Properties of Least Squares Estimators:

More information

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is. Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The

More information

CS 5014: Research Methods in Computer Science

CS 5014: Research Methods in Computer Science Computer Science Clifford A. Shaffer Department of Computer Science Virginia Tech Blacksburg, Virginia Fall 2010 Copyright c 2010 by Clifford A. Shaffer Computer Science Fall 2010 1 / 207 Correlation and

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B Simple Linear Regression 35 Problems 1 Consider a set of data (x i, y i ), i =1, 2,,n, and the following two regression models: y i = β 0 + β 1 x i + ε, (i =1, 2,,n), Model A y i = γ 0 + γ 1 x i + γ 2

More information

TMA4255 Applied Statistics V2016 (5)

TMA4255 Applied Statistics V2016 (5) TMA4255 Applied Statistics V2016 (5) Part 2: Regression Simple linear regression [11.1-11.4] Sum of squares [11.5] Anna Marie Holand To be lectured: January 26, 2016 wiki.math.ntnu.no/tma4255/2016v/start

More information

Correlation and Linear Regression

Correlation and Linear Regression Correlation and Linear Regression Correlation: Relationships between Variables So far, nearly all of our discussion of inferential statistics has focused on testing for differences between group means

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

Review of Statistics

Review of Statistics Review of Statistics Topics Descriptive Statistics Mean, Variance Probability Union event, joint event Random Variables Discrete and Continuous Distributions, Moments Two Random Variables Covariance and

More information

13 Simple Linear Regression

13 Simple Linear Regression B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity

More information

WEIGHTED LEAST SQUARES - used to give more emphasis to selected points in the analysis. Recall, in OLS we minimize Q =! % =!

WEIGHTED LEAST SQUARES - used to give more emphasis to selected points in the analysis. Recall, in OLS we minimize Q =! % =! WEIGHTED LEAST SQUARES - used to give more emphasis to selected poits i the aalysis What are eighted least squares?! " i=1 i=1 Recall, i OLS e miimize Q =! % =!(Y - " - " X ) or Q = (Y_ - X "_) (Y_ - X

More information