SIMPLE LINEAR REGRESSION AND CORRELATION ANALYSIS
INTRODUCTION
Many statistical investigations aim to determine whether there is a relationship among variables. Two analyses are used: (1) regression analysis; (2) correlation analysis.
Regression analysis: to find a regression equation that can be used to predict a variable Y (called the dependent or response variable) from variables $X_1, X_2, \ldots, X_k$ (called independent or predictor variables).
Correlation analysis: to find the strength of association between variables $X_1, X_2, \ldots, X_k$ and variable Y.
SIMPLE LINEAR REGRESSION
We build a linear model in such a way that the values of variable Y can be predicted from the values of one variable X.
There are n ordered pairs of independent observations $(X_i, Y_i)$, $i = 1, 2, \ldots, n$, where $X_i$ is the i-th value of the independent variable and $Y_i$ is the i-th value of the dependent variable.
Example
Scatter diagram
DATA MODEL ON POPULATION
$$Y_i = \alpha + \beta X_i + \varepsilon_i$$
$Y_i$ = the i-th value of variable Y
α = constant, the mean of Y if all X are zero (intercept)
β = constant, regression coefficient of Y on X
$\varepsilon_i$ = random disturbance (error) for individual i
α is estimated by a from the sample; β is estimated by b from the sample; the $\varepsilon_i$ are estimated by $e_i$.
REGRESSION LINE
$$\hat{Y} = a + bX$$
Ŷ is the predicted value of Y, predicted from a certain value of X.
Usually the predicted values of Y are not the same as the real values. The difference between the real value and the predicted value is called the residual or error and is denoted $e_i$ in the sample:
$$e_i = Y_i - \hat{Y}_i$$
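LEAST SQUARES METHOD
The estimates a and b are chosen to minimize the sum of squared residuals:
$$D = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - a - bX_i)^2$$
By using theorems of calculus (setting $\partial D/\partial a = 0$ and $\partial D/\partial b = 0$), D is at its minimum when a and b are as follows:
$$b = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}, \qquad a = \bar{Y} - b\bar{X}$$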
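As a rough sketch, the usual least-squares solution, b = [nΣXY − ΣXΣY] / [nΣX² − (ΣX)²] and a = Ȳ − bX̄, can be computed directly from raw totals. The function name `least_squares_fit` is hypothetical, and the totals are those of the worked correlation example in these slides, with n = 12 assumed.

```python
def least_squares_fit(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b) that minimise D = sum((Y - a - b*X)**2).

    Works from raw totals only, so no individual observations are needed.
    """
    # Slope: b = [n*SUM(XY) - SUM(X)*SUM(Y)] / [n*SUM(X^2) - SUM(X)^2]
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = mean(Y) - b * mean(X)
    a = sum_y / n - b * sum_x / n
    return a, b


# Totals assumed from the correlation example (n = 12).
a, b = least_squares_fit(n=12, sum_x=665, sum_y=951, sum_xy=53305, sum_x2=37525)
print(f"a = {a:.4f}, b = {b:.4f}")
```

The slope obtained this way matches the value of b used in the later correlation example (about 0.8972).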
EXAMPLE
EXAMPLE
SCATTER DIAGRAM
SUM OF SQUARES (SSs)
$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$$
$$SST = SSR + SSE$$
SST = total sum of squares
SSR = sum of squares due to regression
SSE = error sum of squares
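A short derivation may help explain why the total sum of squares splits into exactly these two parts. It assumes a and b are the least-squares estimates, so that the residuals $e_i = Y_i - \hat{Y}_i$ satisfy $\sum e_i = 0$ and $\sum X_i e_i = 0$.

```latex
\begin{aligned}
Y_i - \bar{Y} &= (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \\
\sum (Y_i - \bar{Y})^2
  &= \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2
     + 2\sum (\hat{Y}_i - \bar{Y})\, e_i \\
\sum (\hat{Y}_i - \bar{Y})\, e_i
  &= (a - \bar{Y})\sum e_i + b\sum X_i e_i = 0
\end{aligned}
```

Because the cross term vanishes, SST = SSR + SSE holds for any data set fitted by least squares.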
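COMPUTATION FORMULAS
$$SST = \sum Y_i^2 - \frac{\left(\sum Y_i\right)^2}{n}$$
$$SSR = a\sum Y_i + b\sum X_iY_i - \frac{\left(\sum Y_i\right)^2}{n}$$
$$SSE = \sum Y_i^2 - a\sum Y_i - b\sum X_iY_i$$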
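The computation formulas can be checked numerically with a small sketch. It assumes the standard forms SST = ΣY² − (ΣY)²/n, SSR = aΣY + bΣXY − (ΣY)²/n and SSE = ΣY² − aΣY − bΣXY, and uses the totals of the correlation example in these slides (n = 12 assumed); the variable names are illustrative only.

```python
# Raw totals assumed from the correlation example (n = 12).
n = 12
sum_x, sum_y = 665, 951
sum_xy, sum_x2, sum_y2 = 53305, 37525, 76095

# Least-squares estimates of the slope b and intercept a.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Correction term (SUM(Y))^2 / n shared by SST and SSR.
correction = sum_y ** 2 / n

sst = sum_y2 - correction                     # total sum of squares
ssr = a * sum_y + b * sum_xy - correction     # sum of squares due to regression
sse = sum_y2 - a * sum_y - b * sum_xy         # error sum of squares

print(f"SST = {sst:.4f}, SSR = {ssr:.4f}, SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
```

The last line should reproduce SST up to rounding, which is a quick sanity check on a and b.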
EXAMPLE
EXAMPLE
SIGNIFICANCE TEST OF REGRESSION LINE
$H_0$: the regression line of Y on X is not significant
$H_1$: the regression line of Y on X is significant
The null hypothesis is tested using the ANOVA approach.
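A minimal sketch of the ANOVA approach for simple linear regression follows. It assumes the usual layout (1 degree of freedom for regression, n − 2 for error, F = MSR / MSE) and uses SSR = Sxy² / Sxx, which is algebraically the same as b·Sxy. The function name `regression_anova`, the significance level 0.05 and the use of the correlation example's totals (n = 12) are assumptions for illustration.

```python
def regression_anova(n, sst, ssr):
    """Build the ANOVA quantities for a simple linear regression."""
    sse = sst - ssr
    df_reg, df_err = 1, n - 2
    ms_reg = ssr / df_reg          # mean square due to regression
    ms_err = sse / df_err          # mean square error
    return {
        "SSR": ssr, "SSE": sse, "SST": sst,
        "df_reg": df_reg, "df_err": df_err,
        "MSR": ms_reg, "MSE": ms_err,
        "F": ms_reg / ms_err,
    }


# Totals assumed from the correlation example (n = 12).
n = 12
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = 665, 951, 53305, 37525, 76095

sxy = sum_xy - sum_x * sum_y / n      # corrected sum of cross products
sxx = sum_x2 - sum_x ** 2 / n         # corrected sum of squares of X
sst = sum_y2 - sum_y ** 2 / n         # total sum of squares
ssr = sxy ** 2 / sxx                  # equals b * sxy

table = regression_anova(n, sst, ssr)

# Tabulated critical value F(0.05; 1, 10); alpha = 0.05 is an assumption.
F_CRIT = 4.96
print(f"F = {table['F']:.3f}, reject H0: {table['F'] > F_CRIT}")
```

H0 is rejected when the computed F exceeds the tabulated critical value for 1 and n − 2 degrees of freedom.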
EXAMPLE
THE SIMPLE CORRELATION COEFFICIENT
The product-moment correlation coefficient is a statistic that describes the magnitude of the relation between two variables.
Correlation coefficients are traditionally defined in such a way as to take values extending from −1 to +1.
THE SIMPLE CORRELATION COEFFICIENT
A negative value indicates a negative relation; that is, Y decreases as X increases, and Y increases as X decreases.
A positive value indicates a positive relation; that is, Y increases as X increases, and Y decreases as X decreases.
THE SIMPLE CORRELATION COEFFICIENT
The symbol r is commonly used to denote the sample value of the correlation coefficient.
The symbol ρ is commonly used to denote the population value of the correlation coefficient.
So, $r = \hat{\rho}$.
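THE DEFINITION OF r
The measure ρ of linear relationship between two variables X and Y is estimated by the sample correlation coefficient r, where
$$r_{xy} = b\sqrt{\frac{s_{xx}}{s_{yy}}} = \frac{s_{xy}}{s_x s_y}$$
$$s_{xy} = \frac{1}{n-1}\sum (X_i - \bar{X})(Y_i - \bar{Y}) \quad \text{(covariance between X and Y)}$$
$$s_{xx} = \frac{1}{n-1}\sum (X_i - \bar{X})^2 \quad \text{(variance of X)}$$
$$s_{yy} = \frac{1}{n-1}\sum (Y_i - \bar{Y})^2 \quad \text{(variance of Y)}$$
and $s_x = \sqrt{s_{xx}}$, $s_y = \sqrt{s_{yy}}$.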
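The two forms of r agree because the least-squares slope can be written as the covariance divided by the variance of X. The step below assumes that usual identity, $b = s_{xy}/s_{xx}$, together with $s_x = \sqrt{s_{xx}}$ and $s_y = \sqrt{s_{yy}}$.

```latex
\begin{aligned}
b &= \frac{s_{xy}}{s_{xx}} \\
r_{xy} &= b\sqrt{\frac{s_{xx}}{s_{yy}}}
        = \frac{s_{xy}}{s_{xx}} \cdot \frac{\sqrt{s_{xx}}}{\sqrt{s_{yy}}}
        = \frac{s_{xy}}{\sqrt{s_{xx}}\,\sqrt{s_{yy}}}
        = \frac{s_{xy}}{s_x s_y}
\end{aligned}
```

So r carries the same sign as b and rescales the slope by the ratio of the two standard deviations.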
COMPUTATION FORMULA
$$r_{xy} = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$$
where $x = X - \bar{X}$ and $y = Y - \bar{Y}$.
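COMPUTATION FORMULA
$$r_{xy} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
The above formula is called the Karl Pearson product-moment correlation coefficient. It is widely used because it can be computed directly from the raw data.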
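The raw-score formula, r = [nΣXY − ΣXΣY] / √{[nΣX² − (ΣX)²][nΣY² − (ΣY)²]}, translates almost line for line into code. The sketch below is illustrative: the function names are hypothetical, the totals are those of the worked example in these slides (n = 12 assumed), and the four-point data set is a made-up toy case with a perfect linear relation.

```python
import math


def pearson_r_from_totals(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson r from raw totals (raw-score computation formula)."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


def pearson_r(xs, ys):
    """Pearson r from paired observations, via their raw totals."""
    return pearson_r_from_totals(
        len(xs),
        sum(xs),
        sum(ys),
        sum(x * y for x, y in zip(xs, ys)),
        sum(x * x for x in xs),
        sum(y * y for y in ys),
    )


# Totals assumed from the worked example (n = 12).
r_example = pearson_r_from_totals(12, 665, 951, 53305, 37525, 76095)
print(f"r (example totals) = {r_example:.4f}")

# Hypothetical toy data with an exact linear relation Y = 2X.
r_toy = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
print(f"r (toy data) = {r_toy:.4f}")
```

Only five running totals and n are needed, which is why this form is convenient for hand or calculator work.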
THE INTERPRETATION
THE INTERPRETATION
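EXAMPLE
$$\sum X = 665;\quad \sum Y = 951;\quad \sum XY = 53305;\quad \sum X^2 = 37525;\quad \sum Y^2 = 76095$$
$$s_{xy} = \frac{1}{11}\left[53305 - \frac{(665)(951)}{12}\right] = 54.88636364$$
$$s_{xx} = \frac{1}{11}\left[37525 - \frac{665^2}{12}\right] = 61.17424242$$
$$s_{yy} = \frac{1}{11}\left[76095 - \frac{951^2}{12}\right] = 66.20454545$$
$$s_x = \sqrt{61.17424242} = 7.82139645$$
$$s_y = \sqrt{66.20454545} = 8.13661757$$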
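SIGNIFICANCE TEST USING t-test AND F-test
Using the t-test:
$$t = \frac{r_{xy}\sqrt{n-2}}{\sqrt{1 - r_{xy}^2}} \quad \text{with } (n-2) \text{ degrees of freedom}$$
Using the F-test:
$$F = \frac{r_{xy}^2\,(n-2)}{1 - r_{xy}^2} \quad \text{with } 1 \text{ and } (n-2) \text{ df}$$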
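The two test statistics are linked: squaring t gives F. A small sketch makes this concrete, assuming t = r√(n − 2) / √(1 − r²) and F = r²(n − 2) / (1 − r²). The function names are hypothetical, and r = 0.8625 with n = 12 is the worked example's correlation rounded to four decimals.

```python
import math


def t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def f_statistic(r, n):
    """F statistic for H0: rho = 0, with 1 and n - 2 degrees of freedom."""
    return r ** 2 * (n - 2) / (1 - r ** 2)


# Values assumed from the worked example (r rounded to four decimals).
r, n = 0.8625, 12
t = t_statistic(r, n)
f = f_statistic(r, n)

print(f"t = {t:.3f} with {n - 2} df")
print(f"F = {f:.3f} with 1 and {n - 2} df")
print(f"t squared = {t ** 2:.3f}")
```

Either statistic is then compared with its tabulated critical value; because F = t², the two tests always lead to the same decision for a two-sided alternative.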
SIGNIFICANCE TEST USING ANOVA
$H_0: \rho_{xy} = 0$
$H_1: \rho_{xy} \neq 0$
The null hypothesis is tested using the ANOVA approach.
SIGNIFICANCE TEST USING ANOVA
$$SST = \sum (Y_i - \bar{Y})^2$$
$$SSR = r_{xy}^2\, SST$$
$$SSE = (1 - r_{xy}^2)\, SST$$
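The sums of squares for the ANOVA test can be obtained from r alone, as the following sketch illustrates. It assumes SST = Σ(Y − Ȳ)² = (n − 1)·s_yy, SSR = r²·SST and SSE = (1 − r²)·SST, and takes n = 12, s_yy = 66.20454545 and r = 0.8625 (rounded) from the worked example; the variable names are illustrative.

```python
# Values assumed from the worked example.
n = 12
s_yy = 66.20454545   # sample variance of Y
r = 0.8625           # sample correlation, rounded to four decimals

sst = (n - 1) * s_yy          # total sum of squares
ssr = r ** 2 * sst            # part explained by the linear relation
sse = (1 - r ** 2) * sst      # unexplained (error) part

# ANOVA F ratio: mean square regression over mean square error.
f_anova = (ssr / 1) / (sse / (n - 2))

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"F = {f_anova:.3f} with 1 and {n - 2} df")
```

Since SST cancels in the ratio, this F equals r²(n − 2) / (1 − r²), so the ANOVA approach and the F-test on r are the same test.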
EXAMPLE
EXAMPLE
$$r_{xy} = b\sqrt{\frac{s_{xx}}{s_{yy}}} = (0.89721362)\,\frac{7.82139645}{8.13661757} = 0.862$$
$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{54.88636364}{(7.82139645)(8.13661757)} = 0.862$$
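EXAMPLE
$$\sum X = 665;\quad \sum Y = 951;\quad \sum XY = 53305;\quad \sum X^2 = 37525;\quad \sum Y^2 = 76095$$
$$r_{xy} = \frac{(12)(53305) - (665)(951)}{\sqrt{\left[(12)(37525) - 665^2\right]\left[(12)(76095) - 951^2\right]}} = \frac{7245}{8400.44195266} = 0.862$$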
Exercises