Sociology 405/805, Winter 2004. Revised February 4, 2004.

Summary of Formulae for Bivariate Regression and Correlation

Let X be an independent variable and Y a dependent variable, with n observations for each of the values of these two variables. Preferably, both X and Y are measured at the interval or ratio level, although it is also common to estimate correlation coefficients and regression lines when one or both variables are measured at only the ordinal level.

The first stage in obtaining the estimates of correlation and regression statistics is to compute ΣX, ΣY, ΣX², ΣY², and ΣXY. Each summation is across all n values of X and Y. Then use these summations to calculate the following expressions:

    S_XX = Σ(X − X̄)² = ΣX² − (ΣX)²/n

    S_XY = Σ(X − X̄)(Y − Ȳ) = ΣXY − (ΣX)(ΣY)/n

    S_YY = Σ(Y − Ȳ)² = ΣY² − (ΣY)²/n

Correlation coefficient

Using the above expressions, the correlation coefficient is

    r = S_XY / √(S_XX · S_YY)

Regression line

The slope b and the intercept a of the regression line are

    b = S_XY / S_XX        a = Ȳ − b·X̄
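These computing formulae can be checked numerically. The following Python sketch is not part of the original handout; the five-point data set is made up purely for illustration. It computes S_XX, S_XY, S_YY, the correlation coefficient r, and the regression coefficients b and a directly from the raw summations:

```python
import math

# Hypothetical illustrative data (not from the handout)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

sum_x = sum(X)
sum_y = sum(Y)
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)
sum_xy = sum(x * y for x, y in zip(X, Y))

# Computational forms of the sums of squares and cross-products
S_XX = sum_x2 - sum_x ** 2 / n
S_YY = sum_y2 - sum_y ** 2 / n
S_XY = sum_xy - sum_x * sum_y / n

# Correlation coefficient, slope, and intercept
r = S_XY / math.sqrt(S_XX * S_YY)
b = S_XY / S_XX
a = sum_y / n - b * sum_x / n

print(r, b, a)  # for this data: b = 0.6, a = 2.2
```

With these numbers, S_XX = 10, S_YY = 6, and S_XY = 6, so the hand calculation can be verified line by line against the formulae above.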
where Ȳ = ΣY/n and X̄ = ΣX/n. The estimate of the regression line expressing the relationship between the dependent variable Y and the independent variable X is

    Ŷ = a + bX

Standard errors

For this regression line, the standard error of estimate is

    s_e = √[(ΣY² − aΣY − bΣXY)/(n − 2)]

and the standard deviation of b is

    s_b = s_e / √S_XX

The standard deviation of the mean predicted value Ŷ is

    s_Ŷ = s_e √[1/n + (X − X̄)²/S_XX]

and the standard deviation for an individual predicted value, Ŷᵢ, is

    s_Ŷᵢ = s_e √[1 + 1/n + (X − X̄)²/S_XX]

Components of the Variation in the Dependent Variable Y

The total variation in the dependent variable is

    SS_t = Σ(Yᵢ − Ȳ)² = S_YY

This total variation can be broken into two components: the explained variation, or regression sum of squares, and the unexplained, or residual, sum of squares.
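The standard-error formulae can likewise be sketched in Python. This snippet is not part of the handout and reuses the same made-up five-point data set; the prediction point x0 = 4 is an arbitrary choice for illustration:

```python
import math

# Hypothetical illustrative data (same made-up set as above)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

S_XX = sum(x * x for x in X) - sum(X) ** 2 / n
S_XY = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
b = S_XY / S_XX
a = sum(Y) / n - b * sum(X) / n

# Standard error of estimate: s_e = sqrt((ΣY² − aΣY − bΣXY)/(n − 2))
sum_y2 = sum(y * y for y in Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
s_e = math.sqrt((sum_y2 - a * sum(Y) - b * sum_xy) / (n - 2))

# Standard deviation of the slope: s_b = s_e / sqrt(S_XX)
s_b = s_e / math.sqrt(S_XX)

# Standard deviations of the mean and individual predicted values at X = x0
x_bar = sum(X) / n
x0 = 4  # arbitrary prediction point for illustration
s_mean = s_e * math.sqrt(1 / n + (x0 - x_bar) ** 2 / S_XX)
s_indiv = s_e * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / S_XX)

print(s_e, s_b, s_mean, s_indiv)
```

Note that s_indiv is always larger than s_mean because an individual observation carries the extra residual variance term (the leading 1 under the square root).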
The regression sum of squares is

    SS_r = Σ(Ŷᵢ − Ȳ)² = b·S_XY

The unexplained variation, or the residual or error sum of squares, is

    SS_e = Σ(Yᵢ − Ŷᵢ)² = ΣY² − aΣY − bΣXY

These two components of the total variation can be used to determine R², the goodness of fit of the regression equation:

    R² = SS_r / SS_t

Tests of Statistical Significance

There are various ways of testing for the statistical significance of the regression line. For each test, the null hypothesis is that there is no relationship between X and Y. The alternative hypothesis can be constructed as either a one- or two-directional statement. These can be stated in general as:

    H0: No relationship between X and Y
    H1: Some relationship between X and Y

Alternatively, the research hypothesis can be stated as a one-directional relationship, either a positive or a negative relationship between X and Y.

For a hypothesis test about the goodness of fit R², the hypotheses are:

    H0: R² = 0
    H1: R² > 0

The test for R² is an F test with 1 and (n − 2) degrees of freedom and can be written as:

    F = SS_r / [SS_e/(n − 2)] = R²(n − 2)/(1 − R²)

The tests of significance for the correlation coefficient r, and for the slope of the line b, are usually constructed as one-directional tests. The null hypothesis is that there is no relationship between X and Y, and the research
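The decomposition SS_t = SS_r + SS_e and the two forms of the F statistic can be verified numerically. This sketch is not part of the handout and again uses the made-up five-point data set:

```python
# Hypothetical illustrative data (not from the handout)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

S_XX = sum(x * x for x in X) - sum(X) ** 2 / n
S_YY = sum(y * y for y in Y) - sum(Y) ** 2 / n
S_XY = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
b = S_XY / S_XX
a = sum(Y) / n - b * sum(X) / n
Y_hat = [a + b * x for x in X]

SS_t = S_YY               # total variation
SS_r = b * S_XY           # regression (explained) sum of squares
SS_e = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))  # residual sum of squares

R2 = SS_r / SS_t
F = SS_r / (SS_e / (n - 2))

# SS_t = SS_r + SS_e, and the two expressions for F agree
print(SS_t, SS_r + SS_e, F, R2 * (n - 2) / (1 - R2))
```

With this data, SS_t = 6 splits into SS_r = 3.6 and SS_e = 2.4, giving R² = 0.6 and F = 4.5 from either formula.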
hypothesis is either a positive relationship between X and Y, or a negative relationship between the two variables.

If the test is to determine whether there is a positive relationship between the two variables, the hypotheses for the test of significance on the Pearson correlation coefficient r would be as follows. Let ρ be the true correlation between X and Y.

    H0: ρ = 0
    H1: ρ > 0

The following t-test with (n − 2) degrees of freedom tests these hypotheses:

    t = r √(n − 2) / √(1 − r²)

To test for a positive slope for the regression line, b, the hypotheses are:

    H0: β = 0
    H1: β > 0

where β is the slope of the true regression line when Y is regressed on X. This test is usually written as a t-test with (n − 2) degrees of freedom, where

    t = (b − β) / s_b

If the null hypothesis is that β = 0, then this test is simply

    t = b / s_b

Note that for a bivariate relationship, involving only two variables, each of the above three tests is really the same test, so not more than one of these tests need be reported. That is,

    t = b/s_b = r √(n − 2) / √(1 − r²)    and    R²(n − 2)/(1 − R²) = t²
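The equivalence of the three tests can be demonstrated numerically. This sketch is not part of the handout; it reuses the same illustrative data and checks that the two t statistics coincide and that F equals t²:

```python
import math

# Hypothetical illustrative data (not from the handout)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

S_XX = sum(x * x for x in X) - sum(X) ** 2 / n
S_YY = sum(y * y for y in Y) - sum(Y) ** 2 / n
S_XY = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
r = S_XY / math.sqrt(S_XX * S_YY)
b = S_XY / S_XX
a = sum(Y) / n - b * sum(X) / n

sum_y2 = sum(y * y for y in Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
s_e = math.sqrt((sum_y2 - a * sum(Y) - b * sum_xy) / (n - 2))
s_b = s_e / math.sqrt(S_XX)

R2 = r ** 2
t_slope = b / s_b                                      # t = b / s_b
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)  # t from r
F = R2 * (n - 2) / (1 - R2)                            # F from R²

print(t_slope, t_corr, F)  # t_slope == t_corr, and F == t²
```

Because the three statistics are algebraically identical in the bivariate case, reporting any one of them conveys the same information.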
Analysis of variance

The decomposition of the variation of Y is presented as an analysis of variance table in Table 1. Recall that

    SS_r = Σ(Ŷᵢ − Ȳ)² = b·S_XY

    SS_e = Σ(Yᵢ − Ŷᵢ)² = ΣY² − aΣY − bΣXY

and the F test is

    F = SS_r / [SS_e/(n − 2)] = R²(n − 2)/(1 − R²)

Table 1: Analysis of Variance Table

    Source of                                 Degrees of
    Variation     Sum of Squares              Freedom      Mean Square     F
    ----------------------------------------------------------------------------------------
    Regression    SS_r = Σ(Ŷᵢ − Ȳ)²           1            SS_r            R²(n − 2)/(1 − R²)
    Residual      SS_e = Σ(Yᵢ − Ŷᵢ)²          n − 2        SS_e/(n − 2)
    Total         SS_t = Σ(Yᵢ − Ȳ)²           n − 1

Last edited February 4, 2004.
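The layout of Table 1 can be assembled programmatically. This sketch is not part of the handout; it builds the three ANOVA rows from the illustrative data used in the earlier snippets:

```python
# Hypothetical illustrative data (not from the handout)
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

S_XX = sum(x * x for x in X) - sum(X) ** 2 / n
S_YY = sum(y * y for y in Y) - sum(Y) ** 2 / n
S_XY = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / n
b = S_XY / S_XX

SS_r, SS_t = b * S_XY, S_YY  # regression and total sums of squares
SS_e = SS_t - SS_r           # residual sum of squares
MS_e = SS_e / (n - 2)        # residual mean square
F = SS_r / MS_e              # F with 1 and (n - 2) degrees of freedom

# (source, sum of squares, degrees of freedom, mean square, F)
rows = [
    ("Regression", SS_r, 1,     SS_r, F),
    ("Residual",   SS_e, n - 2, MS_e, ""),
    ("Total",      SS_t, n - 1, "",   ""),
]
for source, ss, df, ms, fval in rows:
    print(f"{source:<11} {ss:>6.2f} {df:>5} {ms!s:>8} {fval!s:>8}")
```

The degrees of freedom in the first two rows sum to the total, n − 1, just as SS_r and SS_e sum to SS_t.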