CLASS NOTES for PBAF 528: Quantitative Methods II
SPRING 2005
Instructor: Jean Swanson
Daniel J. Evans School of Public Affairs
University of Washington

Acknowledgement: The instructor wishes to thank Rachel Kleit, Assistant Professor, for allowing the use of her lecture notes.
Week 1

1. Correlation, Linear Relationships, and Causality

Example: Travel Time and Distance to Campus (Assignment 1)
Are travel time to campus and distance from campus correlated? Does one cause the other?

A. The purposes of research
What is it? Why do we undertake it?

B. Bivariate Relationships
You ended the last quarter having learned how to compare means of one variable. You either compared two groups to see if they were the same on that measure or looked to see if one group changed over time. Now we are starting to look at relationships among more than one variable. We begin with relationships between two variables. We can think about variables as correlated; whether one causes the other is another question.

A scattergram (scatterplot) is a two-dimensional graph.
We can use a scattergram to answer three questions:
1) Is the relationship positive or negative (direct or inverse)?
2) Is the relationship linear or non-linear?
3) How strong is the linear relationship?
Example 1: Assignment 1

Correlation (for continuous variables)
If we observe a linear relationship between two variables, we can quantify or measure the strength of the linear relationship. If X and Y are two random variables, then their correlation is a measure of the degree of linear association between them. The correlation coefficient measures how well they track each other. Do they change together?

Pearson Product Moment Coefficient of Correlation
To calculate a sample correlation coefficient use SPSS or Excel (see Assignment 1) or:

$$ r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{\sum xy - \frac{(\sum x)(\sum y)}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}, \qquad -1 \le r \le 1 $$

The closer the correlation coefficient is to -1 or +1, the stronger the linear relationship (negative or positive respectively). We can say they are strongly correlated. The closer it is to 0, the weaker the linear relationship.

r > 0 implies a positive relationship between the two variables
r < 0 implies a negative relationship between the two variables
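As a rough illustration of the hand calculation, the sketch below computes r from the sums of squares SS_xy, SS_xx and SS_yy. The function name `pearson_r` and the distance and travel-time values are hypothetical, made up for illustration only; they are not the Assignment 1 data.

```python
import math


def pearson_r(x, y):
    """Sample correlation r = SS_xy / sqrt(SS_xx * SS_yy)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the sample means
    ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_xx = sum((xi - mean_x) ** 2 for xi in x)
    ss_yy = sum((yi - mean_y) ** 2 for yi in y)
    if ss_xx == 0 or ss_yy == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return ss_xy / math.sqrt(ss_xx * ss_yy)


# Hypothetical data: distance from campus (miles) and travel time (minutes)
distance_miles = [1.0, 2.5, 4.0, 6.0, 8.5, 12.0]
travel_minutes = [10, 15, 20, 35, 30, 50]

r = pearson_r(distance_miles, travel_minutes)
print(f"r = {r:.3f}")
```

A positive value close to +1 here would indicate that longer distances tend to go with longer travel times in this made-up sample; it says nothing by itself about causation.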
Examples of different correlations
Let's look at some examples: page 538, McClave and Sincich (2000)

Correlations with Household Income

Variable                        Correlation with HH income
Poverty level                   .770
Hours worked per week           .069
# of phone lines                .139
# of people living in house     .041
Age                             -.00
Wages per week                  .357
Example 2: Assignment 1

Population Parameter versus Sample Statistic
ρ (say "rho"): population correlation coefficient; it lies between -1 and +1
r: sample correlation coefficient; it lies between -1 and +1

Hypothesis Test for ρ
Often we want to know whether the population value of the correlation, ρ, is different from 0, based on what we observe the correlation to be in our sample, r.

Null hypothesis: ρ = 0
Alternative hypothesis: ρ ≠ 0
Test statistic:

$$ t = \frac{r}{\sqrt{(1 - r^2)/(n - 2)}} $$

for t with n - 2 df

Could be one-tailed (is the correlation positive only? negative only?)
Could be two-tailed (is this correlation different from zero?)

Decision rule: Select a significance level (α). For a one-tailed test, if $t > t_\alpha$ (or $t < -t_\alpha$ for a negative alternative) we can reject the null hypothesis. For a two-tailed test, reject if $|t| > t_{\alpha/2}$.

Example 3: Assignment 1

C. Causation
Does A cause B? E.g., does distance to work cause travel time? Does one cause the other? Might there be other factors that contributed to the increase in travel time?

Idiographic Models
Can we find out all the unique causes?
Aims at a complete understanding of a particular event
Nomothetic Models
What are the most important causes? What are those factors that explain the most variation?
Aim at a general understanding of a particular event.

Conditions for establishing causation:
1) The cause precedes the effect in time.
2) The two variables are empirically correlated with each other.
3) The observed correlation cannot be explained in terms of some third variable that causes both of them.

Spurious Relationship
Coincidental statistical correlation between two variables caused by a third variable

Necessary cause
Sufficient cause

THINK CRITICALLY ABOUT CAUSATION! What else could be causing the relationship?

Types of Models
1) Deterministic
Used for an exact relationship between two variables
Y can be determined exactly when the value of X is known
There is no allowance for error in this model
TAX = 0.01(SALARY)

2) Probabilistic
Contains a deterministic component and a random error component
Random component caused by other variables (not included in the model) or random phenomena
DRIVING REACTION TIME = 0.01(AGE) + Random Error
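The difference between the two model types can be sketched in a few lines of Python. The deterministic TAX model returns the same Y every time for a given X, while the probabilistic REACTION TIME model does not. The normal distribution and the standard deviation of 0.05 used for the random error are assumptions made purely for illustration; the notes do not specify a distribution, and the function names are hypothetical.

```python
import random


def tax(salary):
    """Deterministic model: TAX = 0.01(SALARY), no error term."""
    return 0.01 * salary


def reaction_time(age, rng, error_sd=0.05):
    """Probabilistic model: 0.01(AGE) + random error.

    The error is drawn from a normal distribution with mean 0 and
    standard deviation error_sd (an assumed, illustrative choice).
    """
    return 0.01 * age + rng.gauss(0.0, error_sd)


rng = random.Random(528)  # fixed seed so the sketch is repeatable

# Same X always gives the same Y in the deterministic model
tax_first = tax(50000)
tax_second = tax(50000)
print(tax_first, tax_second)

# Same X gives different Y values in the probabilistic model
rt_first = reaction_time(40, rng)
rt_second = reaction_time(40, rng)
print(rt_first, rt_second)
```

In the probabilistic model only the mean of Y is pinned down by X; individual outcomes scatter around it.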
D. Computer directions for Correlation

EXCEL: See Dretzke p. 187-19 or:
Each of your variables needs to be in a separate column, with each row being an observation. Use the formula =CORREL(array1, array2) or the formula =PEARSON(array1, array2), where array1 is the range including all the observations for the first variable and array2 is the range including all the observations of the second variable.
SPSS: See Carver and Nash p. 5-54 or:
1) Select ANALYZE>CORRELATE>BIVARIATE
2) Select the first variable from the list. Click the arrow to move it to the Variables box.
3) Select the second variable from the list. Click the arrow to move it to the Variables box.
4) Make sure the box in front of Pearson is checked.
5) Make sure you have the tails box checked appropriately.
6) Click OK.
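For readers who want to check the hypothesis test for ρ by hand rather than in Excel or SPSS, the following is a minimal Python sketch of the test statistic t = r / sqrt((1 - r^2)/(n - 2)). The values r = 0.60 and n = 10 are hypothetical; the critical value 2.306 is the tabled two-tailed value for α = 0.05 with 8 degrees of freedom, and would need to be looked up again for any other α or sample size.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and +1")
    return r / math.sqrt((1.0 - r ** 2) / (n - 2))


r = 0.60   # hypothetical sample correlation
n = 10     # hypothetical sample size, so df = n - 2 = 8

t_stat = correlation_t_statistic(r, n)

# Two-tailed critical value from a t table: alpha = 0.05, df = 8
t_critical = 2.306
reject_null = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, df = {n - 2}, reject H0: {reject_null}")
```

With these made-up numbers the statistic falls short of the critical value, so a sample correlation that looks sizable is still not statistically distinguishable from zero when n is this small.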
2. Regression Analysis and the Research Process

A. Simple Linear Regression
Bivariate Relationships Continued
We are looking for the line through the scatter plot that minimizes the total squared vertical distance to all the points. The actual outcomes are spread around this "best guess" line.

Algebra equation for a line: $Y = MX + B$
In regression we write this as: $Y = \beta_1 X + \beta_0 + \varepsilon$, or usually $Y = \beta_0 + \beta_1 X + \varepsilon$ (Regress Y on X)

Y is the outcome variable: the dependent variable (response variable).
X is the explanatory or predictor variable: the independent variable.

$$ Y = \underbrace{\beta_0 + \beta_1 X}_{\text{deterministic}} + \underbrace{\varepsilon}_{\text{random}} $$

$\beta_0$ is where the line hits the vertical axis: the constant term, the y-intercept.
$\beta_1$ is $\Delta Y / \Delta X$, the amount of change in Y for a unit change in X: the slope.
$\varepsilon$ is random unexplained noise, other explanatory factors, measurement error, or another equation with a different form: the residual or error.

NB: Y is not known exactly as soon as we know X, but our estimate of it is narrowed down. We know the mean value of Y associated with that particular X, because of the relationship that we observe in the data.

There are two unknown population parameters, $\beta_0$ and $\beta_1$, out there, but we will have to rely on a sample to estimate them. So, we estimate an equation and produce estimated regression coefficients (beta-hats):

$$ Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_i + e_i = \hat{Y}_i + e_i $$
The subscript letter i takes values from 1 to n (there are n data points). Each value $e_i$ is the distance from the fitted regression line estimate of y to the i-th data point $y_i$. $\hat{Y}$ is the value of Y lying on the fitted regression line for a given value of X.

To estimate the coefficients for a regression line with just one independent variable and one dependent variable:

$$ \hat{\beta}_1 = \frac{SS_{xy}}{SS_{xx}} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} $$

$$ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} $$

Example 4: Assignment 1

Common mistake about regression and correlation
People often think that as the slope of the estimated regression line gets larger, so does r. But in fact r measures how close all the data points are to our estimated regression line, not how steep the slope of the regression line is. If the points lie exactly on the line, then the correlation is either +1 or -1, regardless of the slope, unless the slope is 0 (in which case Y does not vary and r is undefined). If there is no linear relationship, then r = 0.
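The estimation formulas and the point about slope versus r can be checked together with a short Python sketch. The helper names `fit_line` and `pearson_r` and all three data sets are hypothetical, chosen only to make the contrast visible: two sets of points lie exactly on lines with very different slopes, and a third scatters around a moderately steep line.

```python
import math


def _sums_of_squares(x, y):
    """Return SS_xy, SS_xx, SS_yy and the two sample means."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    ss_xx = sum((xi - mean_x) ** 2 for xi in x)
    ss_yy = sum((yi - mean_y) ** 2 for yi in y)
    return ss_xy, ss_xx, ss_yy, mean_x, mean_y


def fit_line(x, y):
    """Least-squares estimates: slope = SS_xy / SS_xx, intercept = ybar - slope * xbar."""
    ss_xy, ss_xx, _, mean_x, mean_y = _sums_of_squares(x, y)
    slope = ss_xy / ss_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def pearson_r(x, y):
    """Sample correlation r = SS_xy / sqrt(SS_xx * SS_yy)."""
    ss_xy, ss_xx, ss_yy, _, _ = _sums_of_squares(x, y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


x = [1, 2, 3, 4, 5]
steep = [10 * xi for xi in x]          # exactly on a line with slope 10
shallow = [0.1 * xi + 3 for xi in x]   # exactly on a line with slope 0.1
noisy = [2, 9, 4, 12, 8]               # scattered around an upward trend

for label, y in [("steep", steep), ("shallow", shallow), ("noisy", noisy)]:
    b0, b1 = fit_line(x, y)
    print(f"{label}: intercept = {b0:.2f}, slope = {b1:.2f}, r = {pearson_r(x, y):.3f}")
```

The steep and shallow data sets give the same r even though their slopes differ by a factor of 100, while the noisy data set has a larger slope than the shallow one but a noticeably smaller r.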