Chapter 5 Elements of Multiple Regression Analysis: Two Independent Variables
Moving from One Independent Variable to Multiple IVs
The simple linear regression equation with one IV is as follows:
$$Y = a + bX + e$$
Extending this to multiple IVs is simple, just add more $b$'s!
$$Y = a + b_1X_1 + b_2X_2 + \cdots + b_kX_k + e$$
Multiple Regression
Each independent variable has its own regression coefficient. Instead of the slope, we can think of each regression coefficient $b_i$ as follows: the amount of change in the predicted value of $Y$ as we increase $X_i$ by 1 unit, holding the other independent variables constant.
Calculation of Multiple Regression
If we have only two independent variables, calculation is tedious but not too difficult. Once we have more than 2 IVs, we will have to rely on matrix operations to perform the calculations. We will learn more about that next week. (YAY!)
Uncorrelated IVs
For this example, we are going to assume that $X_1$ and $X_2$ are independent, or $r_{12} = 0$. This is done primarily for calculation reasons; it makes the calculations much easier.
Data Set
20 students were given a test that measured Reading Achievement ($Y$), Verbal Aptitude ($X_1$), and Achievement Motivation ($X_2$). The data are given on p. 98 in the text.
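Preliminary Calculations
Raw sums ($N = 20$):
$$\sum Y = 117 \qquad \sum X_1 = 87 \qquad \sum X_2 = 110$$
$$\sum Y^2 = 825 \qquad \sum X_1^2 = 481 \qquad \sum X_2^2 = 658$$
$$\sum X_1Y = 604 \qquad \sum X_2Y = 702 \qquad \sum X_1X_2 = 517$$
Deviation sums of squares and cross products:
$$\sum y^2 = \sum Y^2 - \frac{(\sum Y)^2}{N} = 825 - \frac{(117)^2}{20} = 825 - 684.45 = 140.55$$
$$\sum x_1^2 = \sum X_1^2 - \frac{(\sum X_1)^2}{N} = 481 - \frac{(87)^2}{20} = 481 - 378.45 = 102.55$$
$$\sum x_2^2 = \sum X_2^2 - \frac{(\sum X_2)^2}{N} = 658 - \frac{(110)^2}{20} = 658 - 605 = 53$$
$$\sum x_1y = \sum X_1Y - \frac{(\sum X_1)(\sum Y)}{N} = 604 - \frac{(87)(117)}{20} = 604 - 508.95 = 95.05$$
$$\sum x_2y = \sum X_2Y - \frac{(\sum X_2)(\sum Y)}{N} = 702 - \frac{(110)(117)}{20} = 702 - 643.5 = 58.5$$
$$\sum x_1x_2 = \sum X_1X_2 - \frac{(\sum X_1)(\sum X_2)}{N} = 517 - \frac{(87)(110)}{20} = 517 - 478.5 = 38.5$$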
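The deviation sums on this slide can be reproduced from the raw sums with a few lines of Python. This is only a sketch for checking the arithmetic; the helper names `deviation_ss` and `deviation_sp` are made up for illustration and do not come from the text.

```python
# Raw sums as listed on the slide (N = 20 students).
N = 20
sum_y, sum_x1, sum_x2 = 117, 87, 110
sum_y2, sum_x1_2, sum_x2_2 = 825, 481, 658   # sums of squared raw scores
sum_x1y, sum_x2y, sum_x1x2 = 604, 702, 517   # sums of raw cross products


def deviation_ss(sum_sq, total, n):
    """Deviation sum of squares: sum(X^2) - (sum X)^2 / N."""
    return sum_sq - total ** 2 / n


def deviation_sp(sum_prod, total_a, total_b, n):
    """Deviation sum of cross products: sum(AB) - (sum A)(sum B) / N."""
    return sum_prod - total_a * total_b / n


ss_y = deviation_ss(sum_y2, sum_y, N)
ss_x1 = deviation_ss(sum_x1_2, sum_x1, N)
ss_x2 = deviation_ss(sum_x2_2, sum_x2, N)
sp_x1y = deviation_sp(sum_x1y, sum_x1, sum_y, N)
sp_x2y = deviation_sp(sum_x2y, sum_x2, sum_y, N)
sp_x1x2 = deviation_sp(sum_x1x2, sum_x1, sum_x2, N)

print(round(ss_y, 2), round(ss_x1, 2), round(ss_x2, 2))
print(round(sp_x1y, 2), round(sp_x2y, 2), round(sp_x1x2, 2))
```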
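Calculation of $b_1$
$$b_1 = \frac{(\sum x_2^2)(\sum x_1y) - (\sum x_1x_2)(\sum x_2y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2}$$
$$b_1 = \frac{(53)(95.05) - (38.5)(58.5)}{(102.55)(53) - (38.5)^2} = \frac{5037.65 - 2252.25}{5435.15 - 1482.25} = \frac{2785.4}{3952.9} = .7046$$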
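Calculation of $b_2$
$$b_2 = \frac{(\sum x_1^2)(\sum x_2y) - (\sum x_1x_2)(\sum x_1y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1x_2)^2}$$
$$b_2 = \frac{(102.55)(58.5) - (38.5)(95.05)}{(102.55)(53) - (38.5)^2} = \frac{5999.175 - 3659.425}{5435.15 - 1482.25} = \frac{2339.75}{3952.9} = .5919$$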
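As a check on the hand calculation, the two slope formulas can be sketched in Python. The function name `two_predictor_slopes` is an illustrative choice, and the inputs are the deviation sums from the preliminary calculations.

```python
# Deviation sums of squares and cross products from the preliminary calculations.
ss_x1, ss_x2 = 102.55, 53.0
sp_x1y, sp_x2y, sp_x1x2 = 95.05, 58.5, 38.5


def two_predictor_slopes(ss_x1, ss_x2, sp_x1y, sp_x2y, sp_x1x2):
    """Return (b1, b2) for a regression of Y on two predictors."""
    # Both slopes share the same denominator.
    denom = ss_x1 * ss_x2 - sp_x1x2 ** 2
    b1 = (ss_x2 * sp_x1y - sp_x1x2 * sp_x2y) / denom
    b2 = (ss_x1 * sp_x2y - sp_x1x2 * sp_x1y) / denom
    return b1, b2


b1, b2 = two_predictor_slopes(ss_x1, ss_x2, sp_x1y, sp_x2y, sp_x1x2)
print(round(b1, 4), round(b2, 4))
```

Because the denominator is shared, it only needs to be computed once, which is also a useful shortcut when working by hand.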
Calculation of $a$
$$a = \bar{Y} - b_1\bar{X}_1 - b_2\bar{X}_2$$
$$a = 5.85 - (.7046)(4.35) - (.5919)(5.5) = -.4705$$
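The intercept follows from the three means, which are just the raw sums divided by N = 20. A minimal sketch, using the rounded slopes from the previous slides:

```python
# Means from the raw sums (117, 87 and 110 over N = 20 students).
N = 20
mean_y, mean_x1, mean_x2 = 117 / N, 87 / N, 110 / N

# Rounded slopes from the hand calculation.
b1, b2 = 0.7046, 0.5919

# The regression plane passes through the point of means.
a = mean_y - b1 * mean_x1 - b2 * mean_x2
print(round(a, 4))
```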
The Final Regression Equation
After we finish all the calculations, we can put it all together:
$$\hat{Y} = a + b_1X_1 + b_2X_2$$
$$\hat{Y} = -.4705 + (.7046)X_1 + (.5919)X_2$$
Predicted Values
Once we have the final equation, we can use it for prediction. Let's try to predict 2 subjects:
Subject 1: $X_1 = 1$ and $X_2 = 3$
Subject 20: $X_1 = 4$ and $X_2 = 9$
$$\hat{Y}\ (\text{Subject 1}) = -.4705 + (.7046)(1) + (.5919)(3) = 2.0098$$
$$\hat{Y}\ (\text{Subject 20}) = -.4705 + (.7046)(4) + (.5919)(9) = 7.6750$$
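The prediction step can be sketched as a small function. The name `predict` is illustrative. The last lines also show the interpretation of a regression coefficient: raising $X_1$ by one unit while $X_2$ stays fixed changes the prediction by exactly $b_1$.

```python
# Coefficients of the final regression equation.
a, b1, b2 = -0.4705, 0.7046, 0.5919


def predict(x1, x2):
    """Predicted Y for given scores on X1 and X2."""
    return a + b1 * x1 + b2 * x2


y_hat_1 = predict(1, 3)    # Subject 1
y_hat_20 = predict(4, 9)   # Subject 20
print(round(y_hat_1, 4), round(y_hat_20, 4))

# One more unit of X1 with X2 held constant shifts the prediction by b1.
change = predict(5, 9) - predict(4, 9)
print(round(change, 4))
```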
Residuals
Once we have the predicted values, we can calculate residuals for these two subjects.
Residual for Subject 1:
$$e_1 = Y - \hat{Y} = 2 - 2.0098 = -.0098$$
Residual for Subject 20:
$$e_{20} = Y - \hat{Y} = 10 - 7.675 = 2.325$$
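A residual is simply the observed score minus the predicted score. The sketch below takes the observed values as $Y = 2$ for Subject 1 and $Y = 10$ for Subject 20, which is how the slide's arithmetic reads; check them against the data table in the text.

```python
# Coefficients of the final regression equation.
a, b1, b2 = -0.4705, 0.7046, 0.5919


def predict(x1, x2):
    """Predicted Y for given scores on X1 and X2."""
    return a + b1 * x1 + b2 * x2


def residual(y_observed, x1, x2):
    """Observed minus predicted: e = Y - Y_hat."""
    return y_observed - predict(x1, x2)


e_1 = residual(2, 1, 3)     # Subject 1 (observed Y assumed to be 2)
e_20 = residual(10, 4, 9)   # Subject 20 (observed Y assumed to be 10)
print(round(e_1, 4), round(e_20, 4))
```

A negative residual means the equation over-predicted that subject; a positive one means it under-predicted.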
Sum of Squares
$$ss_{reg} = b_1\sum x_1y + b_2\sum x_2y$$
$$ss_{reg} = (.7046)(95.05) + (.5919)(58.5) = 101.60$$
$$ss_{res} = \sum y^2 - ss_{reg}$$
$$ss_{res} = 140.55 - 101.60 = 38.95$$
Squared Multiple Regression Coefficient ($R^2$)
Remember from Ch. 2, $R^2$ was the proportion of variance accounted for by the independent variable:
$$R^2 = \frac{ss_{reg}}{\sum y^2}$$
From our previous example this would be:
$$R^2 = \frac{101.60}{140.55} = .723$$
72% of the total variance of the dependent variable ($Y$) was accounted for by the two independent variables ($X_1$ and $X_2$).
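The partition of the total sum of squares and the resulting $R^2$ can be checked with a short sketch. It uses the rounded coefficients, so the results agree with the slides to about two decimals.

```python
# Quantities from the earlier slides.
b1, b2 = 0.7046, 0.5919
sp_x1y, sp_x2y = 95.05, 58.5   # deviation cross products with Y
ss_y = 140.55                  # total (deviation) sum of squares of Y

# Regression and residual sums of squares.
ss_reg = b1 * sp_x1y + b2 * sp_x2y
ss_res = ss_y - ss_reg

# Proportion of variance in Y accounted for by X1 and X2 together.
r_squared = ss_reg / ss_y
print(round(ss_reg, 2), round(ss_res, 2), round(r_squared, 3))
```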
Alternative Methods of Calculation
We can calculate this by hand in terms of the correlation coefficients (example shown in the book). We can also perform the calculations with the click of a mouse in SPSS.
Performing this same analysis in SPSS
Tests of Significance
Once we have calculated the parameter values, it is important to determine whether they are significant. The following tests are used for the parameters.
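Test of $R^2$
$$F = \frac{R^2 / k}{(1 - R^2) / (N - k - 1)}$$
$$F = \frac{.723 / 2}{(1 - .723) / (20 - 2 - 1)} = \frac{.3615}{.0163} = 22.18$$
Or alternatively,
$$F = \frac{ss_{reg} / df_{reg}}{ss_{res} / df_{res}} = \frac{101.6 / 2}{38.95 / 17} = \frac{50.8}{2.29} = 22.18$$
$$df = k,\ N - k - 1 \qquad df = 2,\ 17$$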
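Both versions of the overall F test can be sketched side by side. The function names are illustrative. Because the slides round intermediate values, the two results differ slightly in the second decimal but both land near 22.18.

```python
N, k = 20, 2   # sample size and number of independent variables


def f_from_r2(r2, n, k):
    """Overall F computed from R^2, with df = (k, n - k - 1)."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def f_from_ss(ss_reg, ss_res, n, k):
    """Overall F computed from the regression and residual sums of squares."""
    return (ss_reg / k) / (ss_res / (n - k - 1))


f_r2 = f_from_r2(0.723, N, k)
f_ss = f_from_ss(101.6, 38.95, N, k)
df = (k, N - k - 1)
print(round(f_r2, 2), round(f_ss, 2), df)
```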
Test of $b$'s
We want to test whether $b$ is significantly different from 0. This is done in the same way as it was done with one variable: take the $b$ value and divide it by its standard error.
In our example with 2 IVs, the standard error for $b_1$ is:
$$s_{b_1} = \sqrt{\frac{s^2_{y.12}}{\sum x_1^2\,(1 - r_{12}^2)}}$$
The standard error for $b_2$ is:
$$s_{b_2} = \sqrt{\frac{s^2_{y.12}}{\sum x_2^2\,(1 - r_{12}^2)}}$$
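Test of $b$'s
Using our same example from before:
$$s^2_{y.12} = \frac{ss_{res}}{N - k - 1} = \frac{38.95}{20 - 2 - 1} = \frac{38.95}{17} = 2.29$$
$$\sum x_1^2 = 102.55 \qquad \sum x_2^2 = 53 \qquad b_1 = .7046 \qquad b_2 = .5919 \qquad r_{12} = .522$$
$$s_{b_1} = \sqrt{\frac{2.29}{(102.55)(1 - .522^2)}} = .1752 \qquad t = \frac{b_1}{s_{b_1}} = \frac{.7046}{.1752} = 4.02$$
$$s_{b_2} = \sqrt{\frac{2.29}{(53)(1 - .522^2)}} = .2437 \qquad t = \frac{b_2}{s_{b_2}} = \frac{.5919}{.2437} = 2.43$$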
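The standard errors and t ratios can be reproduced with the sketch below. Here $r_{12}^2$ is computed directly from the deviation sums rather than from the rounded correlation, so the last digit may differ slightly from the hand calculation. The function name `slope_se` is illustrative.

```python
import math

N, k = 20, 2
ss_res = 38.95
ss_x1, ss_x2, sp_x1x2 = 102.55, 53.0, 38.5
b1, b2 = 0.7046, 0.5919

# Variance of estimate and squared correlation between the two predictors.
s2_y12 = ss_res / (N - k - 1)
r12_sq = sp_x1x2 ** 2 / (ss_x1 * ss_x2)


def slope_se(s2, ss_x, r12_sq):
    """Standard error of a slope in a two-predictor regression."""
    return math.sqrt(s2 / (ss_x * (1 - r12_sq)))


se_b1 = slope_se(s2_y12, ss_x1, r12_sq)
se_b2 = slope_se(s2_y12, ss_x2, r12_sq)
t_b1 = b1 / se_b1
t_b2 = b2 / se_b2
print(round(se_b1, 4), round(t_b1, 2))
print(round(se_b2, 4), round(t_b2, 2))
```

Each t ratio is evaluated with $N - k - 1 = 17$ degrees of freedom.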
Test of $R^2$ vs. Test of $b$
The test of $R^2$ is the same as testing all the $b$'s simultaneously. When we test each $b$ individually, we are testing the given $b$ while controlling for all other independent variables.
Confidence Intervals
We can calculate confidence intervals in multiple regression similar to the way we did in simple regression:
$$b \pm t_{(\alpha/2,\ df)}\, s_b$$
Using our same example:
$$\text{CI for } b_1:\quad .7046 \pm (2.11)(.1752) = (.3349,\ 1.0743)$$
$$\text{CI for } b_2:\quad .5919 \pm (2.11)(.2437) = (.0777,\ 1.1061)$$
Our CIs do not include 0, again confirming that the regression coefficients significantly differ from 0.
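A minimal sketch of the interval calculation follows. The critical value 2.11 is the tabled two-tailed $t$ for $\alpha = .05$ with 17 df used on the slide; it is typed in by hand here rather than looked up by the code. The function name `confidence_interval` is illustrative.

```python
T_CRIT = 2.11   # tabled t for alpha = .05 (two-tailed), df = 17


def confidence_interval(b, se_b, t_crit=T_CRIT):
    """Return (lower, upper) limits of b +/- t * s_b."""
    margin = t_crit * se_b
    return b - margin, b + margin


ci_b1 = confidence_interval(0.7046, 0.1752)
ci_b2 = confidence_interval(0.5919, 0.2437)
print([round(v, 4) for v in ci_b1])
print([round(v, 4) for v in ci_b2])

# A coefficient differs significantly from 0 when its interval excludes 0.
excludes_zero = [not (lo <= 0 <= hi) for lo, hi in (ci_b1, ci_b2)]
print(excludes_zero)
```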
Confidence Intervals in SPSS
Test for Increment in Proportion of Variance Accounted For
This tests whether the proportion of variance accounted for increased significantly due to adding another independent variable. In our example, we can test the increment due to adding $X_2$ on top of the information we already have from $X_1$, along with testing the increment due to adding $X_1$ on top of $X_2$. This test is actually equivalent to testing the individual $b$ coefficient.
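Test for Increment in Proportion of Variance Accounted For
The test for $X_2$:
$$F = \frac{(R^2_{y.12} - R^2_{y.1}) / (2 - 1)}{(1 - R^2_{y.12}) / (N - 2 - 1)} = \frac{(.723 - .6273) / (1)}{(1 - .723) / (20 - 2 - 1)} = \frac{.0957}{.0163} = 5.87$$
The test for $X_1$:
$$F = \frac{(R^2_{y.12} - R^2_{y.2}) / (2 - 1)}{(1 - R^2_{y.12}) / (N - 2 - 1)} = \frac{(.723 - .4597) / (1)}{(1 - .723) / (20 - 2 - 1)} = \frac{.2633}{.0163} = 16.15$$
$$df = 1,\ N - 2 - 1 \qquad df = 1,\ 17$$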
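The increment test can be sketched as one function applied twice. The name `f_increment` and its parameter names are illustrative. The sketch reads .6273 as the squared correlation of $Y$ with $X_1$ alone and .4597 as that of $Y$ with $X_2$ alone, which is how the slide's numbers work out.

```python
N = 20
R2_FULL = 0.723     # Y on X1 and X2
R2_X1_ONLY = 0.6273  # Y on X1 alone (assumed reading of the slide)
R2_X2_ONLY = 0.4597  # Y on X2 alone (assumed reading of the slide)


def f_increment(r2_full, r2_reduced, k_full, k_reduced, n):
    """F for the gain in R^2 when going from the reduced to the full model."""
    numerator = (r2_full - r2_reduced) / (k_full - k_reduced)
    denominator = (1 - r2_full) / (n - k_full - 1)
    return numerator / denominator


f_x2 = f_increment(R2_FULL, R2_X1_ONLY, 2, 1, N)   # adding X2 to X1
f_x1 = f_increment(R2_FULL, R2_X2_ONLY, 2, 1, N)   # adding X1 to X2
print(round(f_x2, 2), round(f_x1, 2))

# With 1 numerator df, each F is close to the square of the matching t ratio.
print(round(2.43 ** 2, 2), round(4.02 ** 2, 2))
```

The small gaps between each $F$ and the corresponding $t^2$ come from rounding in the hand calculations, not from any real difference between the two tests.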
Relative Importance of Variables
The magnitude of $b$ is in part affected by the scale of measurement. For example, if you measure objects in inches instead of feet, the nature of the regression and the tests of significance will not change. What will change is the magnitude of $b$. Therefore remember, it isn't the size of $b$ that is important, it is its significance.
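The effect of a change of scale can be illustrated with the example data. Purely as a hypothetical, suppose $X_1$ had been recorded in feet and we convert it to inches (multiply by 12). Then $\sum x_1^2$ is multiplied by $12^2$ and every cross product involving $X_1$ by 12. The sketch below, with an illustrative function name, shows that $b_1$ shrinks by a factor of 12 while its $t$ ratio stays the same.

```python
import math


def fit_b1(ss_x1, ss_x2, ss_y, sp_x1y, sp_x2y, sp_x1x2, n, k=2):
    """Return (b1, standard error of b1, t ratio) from deviation sums."""
    denom = ss_x1 * ss_x2 - sp_x1x2 ** 2
    b1 = (ss_x2 * sp_x1y - sp_x1x2 * sp_x2y) / denom
    b2 = (ss_x1 * sp_x2y - sp_x1x2 * sp_x1y) / denom
    ss_res = ss_y - (b1 * sp_x1y + b2 * sp_x2y)
    s2 = ss_res / (n - k - 1)
    r12_sq = sp_x1x2 ** 2 / (ss_x1 * ss_x2)
    se_b1 = math.sqrt(s2 / (ss_x1 * (1 - r12_sq)))
    return b1, se_b1, b1 / se_b1


# Original scale: deviation sums from the example data.
b1_orig, se_orig, t_orig = fit_b1(102.55, 53.0, 140.55, 95.05, 58.5, 38.5, 20)

# Hypothetical rescaling of X1 by c = 12 (for example, feet to inches).
c = 12
b1_new, se_new, t_new = fit_b1(102.55 * c ** 2, 53.0, 140.55,
                               95.05 * c, 58.5, 38.5 * c, 20)

print(round(b1_orig, 4), round(b1_new, 4))   # the slope changes with the scale
print(round(t_orig, 2), round(t_new, 2))     # the t ratio does not
```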
Relative Importance of Variables
Thinking back to our example: $\beta_1 = .602$ and $\beta_2 = .364$. This does not mean that $\beta_1$ is twice as important as $\beta_2$. They simply represent different variables measured on different scales.