Correlation

Two variables: Which test?

Explanatory variable (X) categorical, response variable (Y) categorical:
  contingency table, grouped bar graph, mosaic plot; contingency analysis
Explanatory variable (X) numerical, response variable (Y) categorical:
  logistic regression
Explanatory variable (X) categorical, response variable (Y) numerical:
  multiple histograms, cumulative frequency distributions; t-test
Explanatory variable (X) numerical, response variable (Y) numerical:
  scatter plot; correlation, regression

Relationship Between Two Numerical Variables

Correlation

What is the tendency of two numerical variables to co-vary (change together)?
The correlation coefficient r measures the strength and direction of the linear association between two numerical variables.
Population parameter: $\rho$ (rho)
Sample estimate: r
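
$$ r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} $$

Numerator: sum of products. Denominator: square root of the product of the sums of squares for X and Y.

Shortcuts

$$ \sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = \sum_{i=1}^{n} X_i Y_i - \frac{\left(\sum_{i=1}^{n} X_i\right)\left(\sum_{i=1}^{n} Y_i\right)}{n} $$

$$ \sum_{i=1}^{n} (X_i - \bar{X})^2 = \sum_{i=1}^{n} X_i^2 - \frac{\left(\sum_{i=1}^{n} X_i\right)^2}{n} $$

$$ \sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} Y_i^2 - \frac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n} $$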
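
The shortcut formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sums_of_squares_and_products` and `pearson_r` and the small data set are made up for this example and are not part of the lecture.

```python
import math


def sums_of_squares_and_products(xs, ys):
    """Return (sum of products, SS for X, SS for Y) via the shortcut formulas."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    # Sum of products: sum(X_i * Y_i) - (sum X_i)(sum Y_i) / n
    sp = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    # Sums of squares: sum(X_i^2) - (sum X_i)^2 / n, and likewise for Y
    ss_x = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_y = sum(y * y for y in ys) - sum_y ** 2 / n
    return sp, ss_x, ss_y


def pearson_r(xs, ys):
    """Correlation coefficient r = SP / sqrt(SS_X * SS_Y)."""
    sp, ss_x, ss_y = sums_of_squares_and_products(xs, ys)
    return sp / math.sqrt(ss_x * ss_y)


# Hypothetical data, for illustration only (not from the lecture).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
print(round(pearson_r(xs, ys), 4))
```

The shortcut forms are convenient for hand calculation. With values that are large relative to their spread, the subtraction can lose floating-point precision, so the deviation form (subtracting the means first) may be the safer choice in software.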

Correlation assumes...

Random sample
X is normally distributed with equal variance for all values of Y
Y is normally distributed with equal variance for all values of X

Bivariate normal distribution

Correlation coefficient facts

$-1 \le \rho \le 1$; $-1 \le r \le 1$
Positive r: variables increase together
Negative r: when one variable increases, the other decreases, and vice versa

[Figure: three scatter plots labeled negative (r = -1), uncorrelated (r = 0), positive (r = 1)]

Coefficient of determination = $r^2$
Describes the proportion of variation in one variable that can be predicted from the other

Standard error of r

$$ SE_r = \sqrt{\frac{1 - r^2}{n - 2}} $$

Confidence Limits for r

$$ z = 0.5 \ln\left(\frac{1 + r}{1 - r}\right) $$

$$ \sigma_z = \sqrt{\frac{1}{n - 3}} $$

$$ z - Z_{0.05(2)} \, \sigma_z < \zeta < z + Z_{0.05(2)} \, \sigma_z $$

With $Z_{0.05(2)} = 1.96$:

$$ z - 1.96 \, \sigma_z < \zeta < z + 1.96 \, \sigma_z $$

Back-transform each limit to the correlation scale:

$$ r = \frac{e^{2z} - 1}{e^{2z} + 1} $$

Example

Are the effects of new mutations on mating success and productivity correlated?
Data from Drosophila melanogaster
n = 31 individuals
X is productivity, Y is the mating success

Sum of products: $\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y}) = 2.796$
Sum of squares for X: $\sum_{i=1}^{n} (X_i - \bar{X})^2 = 16.245$
Sum of squares for Y: $\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = 1.6289$

$$ r = \frac{2.796}{\sqrt{(16.245)(1.6289)}} = 0.5435 $$

$$ SE_r = \sqrt{\frac{1 - r^2}{n - 2}} = \sqrt{\frac{0.7045}{29}} = 0.1558 $$

Confidence Limits for r

$$ z = 0.5 \ln\left(\frac{1 + r}{1 - r}\right) = 0.5 \ln\left(\frac{1 + 0.5435}{1 - 0.5435}\right) = 0.609 $$

$$ \sigma_z = \sqrt{\frac{1}{n - 3}} = \sqrt{\frac{1}{31 - 3}} = 0.189 $$

$$ z - 1.96 \, \sigma_z < \zeta < z + 1.96 \, \sigma_z $$

$$ 0.609 - 1.96 \times 0.189 < \zeta < 0.609 + 1.96 \times 0.189 $$

$$ 0.239 < \zeta < 0.979 $$

Back-transform the limits with $r = \frac{e^{2z} - 1}{e^{2z} + 1}$:

$$ \frac{e^{2(0.239)} - 1}{e^{2(0.239)} + 1} < \rho < \frac{e^{2(0.979)} - 1}{e^{2(0.979)} + 1} $$

$$ 0.235 < \rho < 0.753 $$
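
The confidence-limit steps for the Drosophila example can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function name `correlation_ci` is an assumption, and it relies on `math.atanh` and `math.tanh`, which are the same transformations as $0.5 \ln\left(\frac{1+r}{1-r}\right)$ and $\frac{e^{2z}-1}{e^{2z}+1}$.

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, confidence=0.95):
    """Approximate confidence limits for rho via the z-transformation."""
    z = math.atanh(r)                   # 0.5 * ln((1 + r) / (1 - r))
    sigma_z = math.sqrt(1.0 / (n - 3))  # standard error of z
    # Two-sided critical value of the standard normal (about 1.96 for 95%)
    z_crit = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    lower_z = z - z_crit * sigma_z
    upper_z = z + z_crit * sigma_z
    # Back-transform: tanh(z) = (e^{2z} - 1) / (e^{2z} + 1)
    return math.tanh(lower_z), math.tanh(upper_z)


# Summary values from the Drosophila example in the slides.
r = 2.796 / math.sqrt(16.245 * 1.6289)
lower, upper = correlation_ci(r, n=31)
print(f"r = {r:.4f}, 95% CI for rho: {lower:.3f} to {upper:.3f}")
```

Because the code carries full precision through every step, its limits can differ from the hand calculation in the third decimal place, where the slides round z and its standard error to three decimals first.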

Example: Why Sleep?

10 experimental subjects
Measured increase in slow-wave activity during sleep
Measured improvement in task after sleep (hand-eye coordination activity)

Why sleep?

Sum of products: 1127.4
Sum of squares X: 2052.4
Sum of squares Y: 830.9

Calculate a 95% C.I. for $\rho$

Hypothesis Testing for Correlations

Can test hypotheses relating to correlations among variables
Closely related to regression, the topic for next Tuesday's lecture

$H_0: \rho = 0$
$H_A: \rho \ne 0$

If $\rho = 0$, r is approximately normally distributed with mean 0, and

$$ t = \frac{r}{SE_r} $$

with df = n - 2

Example

Are the effects of new mutations on mating success and productivity correlated?
Data from Drosophila melanogaster

Hypotheses

$H_0$: Mating success and productivity are not related ($\rho = 0$)
$H_A$: Mating success and productivity are correlated ($\rho \ne 0$)

X is productivity, Y is the mating success
Sum of products = 2.796
Sum of squares for X = 16.245
Sum of squares for Y = 1.6289

$$ r = \frac{2.796}{\sqrt{(16.245)(1.6289)}} = 0.5435 $$

$$ SE_r = \sqrt{\frac{1 - r^2}{n - 2}} = \sqrt{\frac{0.7045}{29}} = 0.1558 $$

$$ t = \frac{0.5435}{0.1558} = 3.49 $$

df = n - 2 = 31 - 2 = 29

t = 3.49 is greater than $t_{0.05(2),29} = 2.045$, so we can reject the null hypothesis and say that productivity and male mating success are correlated across genotypes.

Why sleep?

Sum of products: 1127.4
Sum of squares X: 2052.4
Sum of squares Y: 830.9

Test for a correlation different from zero in these data.

Checking Assumptions for Correlation

Bivariate normal distribution:
Relationship is linear (straight line)
Cloud of points in scatter plot is circular or elliptical
Frequency distributions of X and Y are normal

Linear Relationship?

[Figure: scatter plots labeled "Maximum correlation possible" and "Correlation of zero"]

Cloud of points elliptical?

[Figure: scatter plots labeled "Maximum correlation possible" and "Correlation of zero"]

X and Y normal?

Use usual techniques for both X and Y separately
Be wary of outliers

Quick Reference Guide - Correlation Coefficient

What is it for? Measuring the strength of a linear association between two numerical variables
What does it assume? Bivariate normality and random sampling
Parameter: $\rho$
Estimate: r
Formulae:

$$ r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \, \sum (Y_i - \bar{Y})^2}} $$

$$ SE_r = \sqrt{\frac{1 - r^2}{n - 2}} $$

Quick Reference Guide - t-test for zero linear correlation

What is it for? To test the null hypothesis that the population parameter, $\rho$, is zero
What does it assume? Bivariate normality and random sampling
Test statistic: t
Null distribution: t with n - 2 degrees of freedom
Formulae:

$$ t = \frac{r}{SE_r} $$

T-test for correlation

Sample -> Test statistic: $t = r / SE_r$
Null hypothesis: $\rho = 0$ -> Null distribution: t with n - 2 d.f.
Compare: How unusual is this test statistic?
P < 0.05: Reject $H_0$
P > 0.05: Fail to reject $H_0$
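
The t-test for zero linear correlation can be sketched in Python using the Drosophila summary values from the slides. This is a hedged illustration: the function name `correlation_t_test` is an assumption, and the critical value $t_{0.05(2),29} = 2.045$ is passed in from the slides rather than computed, because the Python standard library does not provide the t distribution.

```python
import math


def correlation_t_test(r, n, t_critical):
    """Return (t, df, reject_null) for H0: rho = 0.

    t_critical is the two-sided critical value for n - 2 degrees of
    freedom, looked up from a statistical table by the caller.
    """
    df = n - 2
    se_r = math.sqrt((1.0 - r * r) / df)  # standard error of r
    t = r / se_r
    reject_null = abs(t) > t_critical
    return t, df, reject_null


# Summary values from the Drosophila example in the slides.
r = 2.796 / math.sqrt(16.245 * 1.6289)
t, df, reject_null = correlation_t_test(r, n=31, t_critical=2.045)
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_null}")
```

Comparing |t| with the tabulated critical value mirrors the "P < 0.05" branch of the flow above. An exact P-value would need a t-distribution function from a statistics package.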