Unit 9 Regression and Correlation

PubHlth 54 - Fall 4

"Assume that a statistical model such as a linear model is a good first start only" - Gerald van Belle

Is higher blood pressure in the mom associated with a lower birth weight of her baby? Simple linear regression explores the relationship of one continuous outcome (Y = birth weight) with one continuous predictor (X = blood pressure). At the heart of statistics is the fitting of models to observed data, followed by an examination of how they perform.

-1- somewhat useful
The fitted model is a sufficiently good fit to the data if it permits exploration of hypotheses such as "higher blood pressure during pregnancy is associated with statistically significant lower birth weight" and it permits assessment of confounding, effect modification, and mediation. These are ideas that will be developed in the PubHlth 64 unit on Multivariable Linear Regression.

-2- more useful
The fitted model can be used to predict the outcomes of future observations. For example, we might be interested in predicting the birth weight of the baby born to a mom with systolic blood pressure 45 mm Hg.

-3- most useful
Sometimes, but not so much in public health, the fitted model derives from a physical equation. An example is Michaelis-Menten kinetics. A Michaelis-Menten model is fit to the data for the purpose of estimating the actual rate of a particular chemical reaction.

Hence, "a linear model is a good first start only."

Table of Contents

1. Unit Roadmap
2. Learning Objectives
3. Definition of the Linear Regression Model
4. Estimation
5. The Analysis of Variance Table
6. Assumptions for the Straight Line Regression
7. Hypothesis Testing
8. Confidence Interval Estimation
9. Introduction to Correlation
10. Hypothesis Test for Correlation

1. Unit Roadmap

Simple linear regression is used when there is one response (dependent, Y) variable and one explanatory (independent, X) variable and both are continuous. Examples of explanatory (independent) and response (dependent) variable pairs are height and weight, age and blood pressure, etc.

-1- A simple linear regression analysis begins with a scatterplot of the data to see if a straight line model is appropriate:

$$y = \beta_0 + \beta_1 x$$

where
Y = the response or dependent variable
X = the explanatory or independent variable
$\beta_1$ = slope (the change in y per unit change in x)
$\beta_0$ = intercept (the value of y when x = 0)

-2- The sample data are used to estimate the parameter values and their standard errors.

-3- The fitted model is then compared to the simpler model $y = \beta_0$, which says that y is not linearly related to x.

2. Learning Objectives

When you have finished this unit, you should be able to:

- Explain what is meant by independent versus dependent variable and what is meant by a linear relationship;
- Produce and interpret a scatterplot;
- Define and explain the intercept and slope parameters of a linear relationship;
- Explain the theory of least squares estimation of the intercept and slope parameters of a linear relationship;
- Calculate by hand least squares estimation of the intercept and slope parameters of a linear relationship;
- Explain the theory of the analysis of variance of simple linear regression;
- Calculate by hand the analysis of variance of simple linear regression;
- Explain, compute, and interpret $R^2$ in the context of simple linear regression;
- State and explain the assumptions required for estimation and hypothesis tests in regression;
- Explain, compute, and interpret the overall F-test in simple linear regression;
- Interpret the computer output of a simple linear regression analysis from a package such as Stata, SAS, SPSS, Minitab, etc.;
- Define and interpret the value of a Pearson Product Moment Correlation, r;
- Explain the relationship between the Pearson product moment correlation r and the linear regression slope parameter; and
- Calculate by hand confidence interval estimation and statistical hypothesis testing of the Pearson product moment correlation r.

3. Definition of the Linear Regression Model

Unit 8 considered two categorical (discrete) variables, such as smoking (yes/no) and low birth weight (yes/no). It was an introduction to chi-square tests of association. Unit 9 considers two continuous variables, such as age and weight. It is an introduction to simple linear regression and correlation.

A wonderful introduction to the intuition of linear regression can be found in the text by Freedman, Pisani, and Purves (Statistics. WW Norton & Co., 1978). The following is excerpted from pp 46 and 48 of their text:

"How is weight related to height? For example, there were 411 men aged 18 to 24 in Cycle I of the Health Examination Survey. Their average height was 5 feet 8 inches = 68 inches, with an overall average weight of 158 pounds. But those men who were one inch above average in height had a somewhat higher average weight. Those men who were two inches above average in height had a still higher average weight. And so on. On the average, how much of an increase in weight is associated with each unit increase in height? The best way to get started is to look at the scattergram for these heights and weights. The object is to see how weight depends on height, so height is taken as the independent variable and plotted horizontally ...

... The regression line is to a scatter diagram as the average is to a list. The regression line estimates the average value for the dependent variable corresponding to each value of the independent variable."

Linear Regression

Linear regression models the mean $\mu = E[Y]$ of one random variable Y as a linear function of one or more other variables (called predictors or explanatory variables) that are treated as fixed. The estimation and hypothesis testing involved are extensions of ideas and techniques that we have already seen. In linear regression, Y is the outcome or dependent variable that we observe. We observe its values for individuals with various combinations of values of a predictor or explanatory variable X. There may be more than one predictor X; this will be discussed in PubHlth 64. In simple linear regression the values of the predictor X are assumed to be fixed. Often, however, the variables Y and X are both random variables.

Correlation

Correlation considers the association of two random variables. The techniques of estimation and hypothesis testing are the same for linear regression and correlation analyses. Exploring the relationship begins with fitting a line to the points.

Development of a simple linear regression model analysis

Example. Source: Kleinbaum, Kupper, and Muller 1988

The following are observations of age (days) and weight (kg) for n = 11 chicken embryos.

Notation: WT = Y, AGE = X, LOGWT = Z

[Data table: WT = Y, AGE = X, LOGWT = Z for each of the 11 embryos]

The data are 11 pairs of $(X_i, Y_i)$ where X = AGE and Y = WT:
$(X_1, Y_1) = (6, 0.029)$, ..., $(X_{11}, Y_{11}) = (16, 2.812)$

This table also provides 11 pairs of $(X_i, Z_i)$ where X = AGE and Z = LOGWT:
$(X_1, Z_1) = (6, -1.538)$, ..., $(X_{11}, Z_{11}) = (16, 0.449)$

Research question

There are a variety of possible research questions:
(1) Does weight change with age?
(2) In the language of analysis of variance we are asking the following: Can the variability in weight be explained, to a significant extent, by variations in age?
(3) What is a "good" functional form that relates age to weight?

Tip! Begin with a scatter plot. Here we plot X = AGE versus Y = WT.

[Figure: scatter plot of WT (Y) versus AGE (X)]

We check and learn about the following:
- The average and median of X
- The range and pattern of variability in X
- The average and median of Y
- The range and pattern of variability in Y
- The nature of the relationship between X and Y
- The strength of the relationship between X and Y
- The identification of any points that might be influential

Example, continued

- The plot suggests a relationship between AGE and WT
- A straight line might fit well, but another model might be better
- We have adequate ranges of values for both AGE and WT
- There are no outliers

The "bowl" shape of our scatter plot suggests that perhaps a better model relates the logarithm of WT (Z = LOGWT) to AGE:

[Figure: scatter plot of LOGWT (Z) versus AGE (X)]

We might have gotten any of a variety of plots.

[Figure: scatter plot showing no relationship between X and Y]
[Figure: scatter plot showing a linear relationship between X and Y]
[Figure: scatter plot showing a non-linear relationship between X and Y]

[Figure: scatter plot with an outlying point] Note the outlying point. Here, a fit of a linear model will yield an estimated slope that is spuriously non-zero.

[Figure: scatter plot with an outlying point] Note the outlying point. Here, a fit of a linear model will yield an estimated slope that is spuriously near zero.

[Figure: scatter plot with an outlying point] Note the outlying point. Here, a fit of a linear model will yield an estimated slope that is spuriously high.

Review of the Straight Line

Way back when, in your high school days, you may have been introduced to the straight line function, defined as "y = mx + b" where m is the slope and b is the intercept. Nothing new here. All we're doing is changing the notation a bit:

(1) Slope: m → $\beta_1$
(2) Intercept: b → $\beta_0$

[Figure: three lines illustrating slope > 0, slope = 0, and slope < 0]

Definition of the Straight Line Model $Y = \beta_0 + \beta_1 X$

Population

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

$Y = \beta_0 + \beta_1 X$ is the relationship in the population. It is measured with error $\varepsilon$, defined by

$$\varepsilon = [Y] - [\beta_0 + \beta_1 X]$$

$\beta_0$, $\beta_1$ and $\varepsilon$ are unknown.

Sample

$$Y = \hat{\beta}_0 + \hat{\beta}_1 X + e$$

$\hat{\beta}_0$, $\hat{\beta}_1$ and $e$ are estimates of $\beta_0$, $\beta_1$ and $\varepsilon$. The residual $e$ is now the difference between the observed and the fitted (not the true):

$$e = [Y] - [\hat{\beta}_0 + \hat{\beta}_1 X]$$

$\hat{\beta}_0$, $\hat{\beta}_1$ and $e$ are known. Their values are obtained by the method of least squares estimation.

How close did we get? To see if $\hat{\beta}_0 \approx \beta_0$ and $\hat{\beta}_1 \approx \beta_1$ we perform regression diagnostics. Regression diagnostics are discussed in PubHlth 64.

Notation

Y = the outcome or dependent variable
X = the predictor or independent variable
$\mu_Y$ = the expected value of Y for all persons in the population
$\mu_{Y|X=x}$ = the expected value of Y for the sub-population for whom X = x
$\sigma^2_Y$ = variability of Y among all persons in the population
$\sigma^2_{Y|X=x}$ = variability of Y for the sub-population for whom X = x

4. Estimation

Least squares estimation is used to obtain guesses of $\beta_0$ and $\beta_1$. When the outcome Y is distributed normal, least squares estimation is the same as maximum likelihood estimation.

Note: If you are not familiar with maximum likelihood estimation, don't worry. This is introduced in PubHlth 64.

"Least Squares", "Close" and Least Squares Estimation

Theoretically, it is possible to draw many lines through an X-Y scatter of points. Which to choose? Least squares estimation is one approach to choosing a line that is "closest" to the data.

$d_i$: Perhaps we'd like $d_i = [\text{observed } Y - \text{fitted } \hat{Y}]$ = smallest possible. Note that this is a vertical distance, since it is a distance on the vertical axis.

$d_i^2$: Better yet, perhaps we'd like to minimize the squared difference: $d_i^2 = [\text{observed } Y - \text{fitted } \hat{Y}]^2$ = smallest possible.

We can't do this minimization separately for each X-Y pair. That is, it is not possible to choose common values of $\hat{\beta}_0$ and $\hat{\beta}_1$ that minimize $d_1^2 = (Y_1 - \hat{Y}_1)^2$ for subject 1, and minimize $d_2^2 = (Y_2 - \hat{Y}_2)^2$ for subject 2, and so on through $d_n^2 = (Y_n - \hat{Y}_n)^2$ for the nth subject.

So, instead, we choose values for $\hat{\beta}_0$ and $\hat{\beta}_1$ that, upon insertion, minimize the total

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2 = \sum_{i=1}^{n} \left( Y_i - [\hat{\beta}_0 + \hat{\beta}_1 X_i] \right)^2$$

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2 = \sum_{i=1}^{n} \left( Y_i - [\hat{\beta}_0 + \hat{\beta}_1 X_i] \right)^2$$

has a variety of names:
- residual sum of squares
- sum of squares about the regression line
- sum of squares due error (SSE)

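Least Squares Estimation of the Slope and Intercept

In case you're interested. Consider

$$SSE = \sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2 = \sum_{i=1}^{n} \left( Y_i - [\hat{\beta}_0 + \hat{\beta}_1 X_i] \right)^2$$

Step #1: Differentiate with respect to $\hat{\beta}_0$. Set the derivative equal to 0 and solve for $\hat{\beta}_0$.
Step #2: Differentiate with respect to $\hat{\beta}_1$. Set the derivative equal to 0, insert $\hat{\beta}_0$ and solve for $\hat{\beta}_1$.

Least Squares Estimation Solutions

Note: the estimates are denoted either using Greek letters with a caret or with Roman letters.

Estimate of slope, $\hat{\beta}_1$ or $b_1$:

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

Estimate of intercept, $\hat{\beta}_0$ or $b_0$:

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$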
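For readers who want the two differentiation steps written out, here is a sketch of the algebra. All sums run over i = 1, ..., n, and the only facts used beyond the definitions are that $\sum (Y_i - \bar{Y}) = 0$ and $\sum (X_i - \bar{X}) = 0$.

```latex
\begin{align*}
SSE(\hat{\beta}_0, \hat{\beta}_1)
  &= \sum \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i \right)^2 \\[4pt]
\text{Step 1:}\quad
\frac{\partial\, SSE}{\partial \hat{\beta}_0}
  &= -2 \sum \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i \right) = 0
  \;\Longrightarrow\;
  \hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X} \\[4pt]
\text{Step 2:}\quad
\frac{\partial\, SSE}{\partial \hat{\beta}_1}
  &= -2 \sum X_i \left( Y_i - \hat{\beta}_0 - \hat{\beta}_1 X_i \right) = 0 \\
\text{insert } \hat{\beta}_0:\quad
0 &= \sum X_i \left[ (Y_i - \bar{Y}) - \hat{\beta}_1 (X_i - \bar{X}) \right] \\
  &= \sum (X_i - \bar{X})(Y_i - \bar{Y})
     - \hat{\beta}_1 \sum (X_i - \bar{X})^2 \\[4pt]
\Longrightarrow\quad
\hat{\beta}_1
  &= \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
\end{align*}
```

The next-to-last line replaces $\sum X_i (Y_i - \bar{Y})$ by $\sum (X_i - \bar{X})(Y_i - \bar{Y})$, and $\sum X_i (X_i - \bar{X})$ by $\sum (X_i - \bar{X})^2$. Both replacements are valid because the subtracted terms are $\bar{X}$ times a sum of deviations, which is zero.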

A closer look

Some very helpful preliminary calculations:

$$S_{xx} = \sum (X - \bar{X})^2 = \sum X^2 - N\bar{X}^2$$
$$S_{yy} = \sum (Y - \bar{Y})^2 = \sum Y^2 - N\bar{Y}^2$$
$$S_{xy} = \sum (X - \bar{X})(Y - \bar{Y}) = \sum XY - N\bar{X}\bar{Y}$$

Note: These expressions make use of a special notation called the "summation notation". The capital "S" indicates summation. In $S_{xy}$, the first subscript "x" is saying $(x - \bar{x})$. The second subscript "y" is saying $(y - \bar{y})$.

Slope

$$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\widehat{\text{cov}}(X, Y)}{\widehat{\text{var}}(X)} = \frac{S_{xy}}{S_{xx}}$$

Intercept

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

Prediction of Y

$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X = b_0 + b_1 X$$
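The hand calculation can be checked with a few lines of code. The following is a minimal sketch in Python (chosen only for illustration; the notes themselves use Stata). The five data pairs and the function name `least_squares` are made up for the example and are not from the notes.

```python
# Hypothetical (X, Y) pairs, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def least_squares(x, y):
    """Return (b0, b1) computed from the Sxx and Sxy shortcut sums."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)                       # Sxx
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))  # Sxy
    b1 = sxy / sxx          # slope = Sxy / Sxx
    b0 = ybar - b1 * xbar   # intercept = Ybar - b1 * Xbar
    return b0, b1


b0, b1 = least_squares(x, y)
print(f"fitted line: Y-hat = {b0:.3f} + {b1:.3f} * X")
```

For these made-up numbers, $S_{xx} = 10$ and $S_{xy} = 6$, so the slope is 0.6 and the intercept is $4 - 0.6 \times 3 = 2.2$.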

Do these estimates make sense?

Slope

$$\hat{\beta}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{\widehat{\text{cov}}(X, Y)}{\widehat{\text{var}}(X)}$$

The linear movement in Y with linear movement in X is measured relative to the variability in X.

$\hat{\beta}_1 = 0$ says: With a unit change in X, overall there is a 50-50 chance that Y increases versus decreases.

$\hat{\beta}_1 \neq 0$ says: With a unit increase in X, Y increases also ($\hat{\beta}_1 > 0$) or Y decreases ($\hat{\beta}_1 < 0$).

Intercept

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

If the linear model is incorrect, or, if the true model does not have a linear component, we obtain $\hat{\beta}_1 = 0$ and $\hat{\beta}_0 = \bar{Y}$ as our best guess of an unknown Y.

Illustration in Stata

Y = WT and X = AGE

. regress y x

Partial listing of output, annotated:

y = WEIGHT            Coef.     Std. Err.   t   P>|t|   [95% Conf. Interval]
x = AGE               .3577 = b1
_cons = Intercept     = b0

The fitted line is therefore WT = b0 + b1*AGE. It says that each unit increase in AGE of 1 day is estimated to predict a .357 increase in weight, WT.

Here is an overlay of the fitted line on our scatterplot.

[Figure: scatter plot of WT vs AGE with the fitted line overlaid]

As we might have guessed, the straight line model may not be the best choice. The "bowl" shape of the scatter plot does have a linear component, however. Without the plot, we might have believed the straight line fit is okay.

Illustration in Stata, continued

Z = LOGWT and X = AGE

. regress z x

Partial listing of output, annotated:

Z = LOGWT             Coef.     Std. Err.   t   P>|t|   [95% Conf. Interval]
x = AGE               = b1
_cons = INTERCEPT     = b0

Thus, the fitted line is LOGWT = -2.68925 + 0.19589*AGE

Now the overlay plot looks better:

[Figure: scatter plot of LOGWT vs AGE with the fitted line overlaid]

Prediction of Weight from Height

Source: Dixon and Massey (1969)

Now You Try

[Data table: Individual, Height (X), Weight (Y)]

Preliminary calculations

$\bar{X}$ = , $\sum X^2$ = 49,68, $\sum XY$ = 9,38
$\bar{Y}$ = 4.667, $\sum Y^2$ = 46,
$S_{xx}$ = , $S_{yy}$ = 5, , $S_{xy}$ =

Slope

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}$$

Intercept

$$\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$$

5. The Analysis of Variance Table

Recall the sample variance introduced in the earlier unit on summarizing data. The numerator of the sample variance ($S^2$) of the Y data is $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$.

This same numerator $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$ is a central figure in regression. It has a new name, several actually:

$\sum_{i=1}^{n} (Y_i - \bar{Y})^2$ = "total variance of the Y's" = "total sum of squares" = "total, corrected" = "SSY".
(Note: "corrected" refers to subtracting the mean before squaring.)

The analysis of variance table is all about $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$ and partitioning it into two components:
1. Due residual (the individual Y about the individual prediction $\hat{Y}$)
2. Due regression (the prediction $\hat{Y}$ about the overall mean $\bar{Y}$)

Here is the partition (Note: look closely and you'll see that both sides are the same):

$$(Y_i - \bar{Y}) = (Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})$$

Some algebra (not shown) reveals a nice partition of the total variability:

$$\sum (Y_i - \bar{Y})^2 = \sum (Y_i - \hat{Y}_i)^2 + \sum (\hat{Y}_i - \bar{Y})^2$$

Total Sum of Squares = Due Error Sum of Squares + Due Model Sum of Squares

A closer look

Total Sum of Squares = Due Model Sum of Squares + Due Error Sum of Squares

$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$$

$(Y_i - \bar{Y})$ = deviation of $Y_i$ from $\bar{Y}$ that is to be explained
$(\hat{Y}_i - \bar{Y})$ = "due model", "signal", "systematic", "due regression"
$(Y_i - \hat{Y}_i)$ = "due error", "noise", or "residual"

We seek to explain the total variability $\sum (Y_i - \bar{Y})^2$ with a fitted model.

What happens when $\beta_1 \neq 0$? A straight line relationship is helpful.
- Best guess is $\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$
- Due model is LARGE because $(\hat{Y} - \bar{Y}) = ([\hat{\beta}_0 + \hat{\beta}_1 X] - \bar{Y}) = \bar{Y} - \hat{\beta}_1 \bar{X} + \hat{\beta}_1 X - \bar{Y} = \hat{\beta}_1 (X - \bar{X})$
- Due error has to be small
- due(model) / due(error) will be large

What happens when $\beta_1 = 0$? A straight line relationship is not helpful.
- Best guess is $\hat{Y} = \hat{\beta}_0 = \bar{Y}$
- Due error is nearly the TOTAL because $(Y - \hat{Y}) = (Y - [\hat{\beta}_0]) = (Y - \bar{Y})$
- Due regression has to be small
- due(model) / due(error) will be small

How to Partition the Total Variance

1. The "total" or "total, corrected" refers to the variability of Y about $\bar{Y}$
- $\sum (Y_i - \bar{Y})^2$ is called the "total sum of squares"
- Degrees of freedom = df = (n-1)
- Division of the "total sum of squares" by its df yields the "total mean square"

2. The "residual" or "due error" refers to the variability of Y about $\hat{Y}$
- $\sum (Y_i - \hat{Y}_i)^2$ is called the "residual sum of squares"
- Degrees of freedom = df = (n-2)
- Division of the "residual sum of squares" by its df yields the "residual mean square"

3. The "regression" or "due model" refers to the variability of $\hat{Y}$ about $\bar{Y}$
- $\sum (\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2 \sum (X_i - \bar{X})^2$ is called the "regression sum of squares"
- Degrees of freedom = df = 1
- Division of the "regression sum of squares" by its df yields the "regression mean square" or "model mean square". It is an example of a variance component.

Source              df       Sum of Squares                           Mean Square
Regression          1        SSR = $\sum (\hat{Y}_i - \bar{Y})^2$     SSR/1
Error               (n-2)    SSE = $\sum (Y_i - \hat{Y}_i)^2$         SSE/(n-2)
Total, corrected    (n-1)    SST = $\sum (Y_i - \bar{Y})^2$

Tip! Mean square = (Sum of squares)/(degrees of freedom, df)
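The partition can be verified numerically. The sketch below builds the three rows of the analysis of variance table in Python for a small made-up data set; the data and the function name `anova_table` are illustrative assumptions, not part of the notes.

```python
# Hypothetical (X, Y) pairs, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def anova_table(x, y):
    """Return {source: (sum of squares, df, mean square)} for a straight line fit."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    yhat = [b0 + b1 * xi for xi in x]

    ssr = sum((yh - ybar) ** 2 for yh in yhat)            # due regression
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # due error
    sst = sum((yi - ybar) ** 2 for yi in y)               # total, corrected
    return {
        "regression": (ssr, 1, ssr / 1),
        "error": (sse, n - 2, sse / (n - 2)),
        "total": (sst, n - 1, sst / (n - 1)),
    }


table = anova_table(x, y)
for source, (ss, df, ms) in table.items():
    print(f"{source:<12} df={df}  SS={ss:.4f}  MS={ms:.4f}")
```

With these numbers the total sum of squares (6.0) splits into 3.6 due regression and 2.4 due error, and the degrees of freedom add up in the same way (4 = 1 + 3).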

Be careful! The question we may ask from an analysis of variance table is a limited one.

Does the fit of the straight line model explain a significant portion of the variability of the individual Y about $\bar{Y}$? Is this fitted model better than using $\bar{Y}$ alone?

We are NOT asking: Is the choice of the straight line model correct? or Would another functional form be a better choice?

We'll use a hypothesis test approach (another "proof by contradiction").
- Start with the "nothing is going on" null hypothesis that says $\beta_1 = 0$ ("no linear relationship")
- Use least squares estimation to estimate a "closest" line
- The analysis of variance table provides a comparison of the due regression mean square to the residual mean square
- Recall that we reasoned the following:
  If $\beta_1 \neq 0$, then due(regression)/due(residual) will be LARGE
  If $\beta_1 = 0$, then due(regression)/due(residual) will be SMALL
- Our p-value calculation will answer the question: If the null hypothesis is true and $\beta_1 = 0$ truly, what were the chances of obtaining a value of due(regression)/due(residual) as large or larger than that observed?

To calculate "chances" we need a probability model. So far, we have not needed one.

6. Assumptions for a Straight Line Regression Analysis

In performing least squares estimation, we did not use a probability model. We were doing geometry. Hypothesis testing requires some assumptions and a probability model.

Assumptions

- The separate observations $Y_1, Y_2, \ldots, Y_n$ are independent.
- The values of the predictor variable X are fixed and measured without error.
- For each value of the predictor variable X = x, the distribution of values of Y follows a normal distribution with mean equal to $\mu_{Y|X=x}$ and common variance equal to $\sigma^2_{Y|x}$.
- The separate means $\mu_{Y|X=x}$ lie on a straight line; that is, $\mu_{Y|X=x} = \beta_0 + \beta_1 X$

[Figure: at each value of X, there is a population of Y for persons with X = x]

With these assumptions, we can assess the significance of the variance explained by the model.

$$F = \frac{\text{msq(model)}}{\text{msq(residual)}} \quad \text{with df} = 1, (n-2)$$

When $\beta_1 = 0$:
- Due model MSR has expected value $\sigma^2_{Y|X}$
- Due residual MSE has expected value $\sigma^2_{Y|X}$
- F = (MSR)/MSE will be close to 1

When $\beta_1 \neq 0$:
- Due model MSR has expected value $\sigma^2_{Y|X} + \beta_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2$
- Due residual MSE has expected value $\sigma^2_{Y|X}$
- F = (MSR)/MSE will be LARGER than 1

We obtain the analysis of variance table for the model of Z = LOGWT to X = AGE. Stata illustration with annotations:

Source      SS    df    MS
Model
Residual
Total

Number of obs = 11
F(1, 9)       = MSQ(model)/MSQ(residual)
Prob > F      = 0.0000 = p-value for Overall F Test
R-squared     = 0.9983 = SSQ(model)/SSQ(TOTAL)
Adj R-squared = 0.9981 = $R^2$ adjusted for n and # of X
Root MSE      = .87 = Square root of MSQ(residual)

This output corresponds to the following. Note: in this example our dependent variable is actually Z, not Y.

Source              df            Sum of Squares                              Mean Square
Regression          1             SSR = $\sum (\hat{Z}_i - \bar{Z})^2$ = 4.63      SSR/1 = 4.63
Error               (n-2) = 9     SSE = $\sum (Z_i - \hat{Z}_i)^2$ = .75           SSE/(n-2) = 7.838E-4
Total, corrected    (n-1) = 10    SST = $\sum (Z_i - \bar{Z})^2$

Other information in this output:

R-SQUARED = [(Sum of squares regression)/(Sum of squares total)] = proportion of the "total" that we have been able to explain with the fit.
- Be careful! As predictors are added to the model, R-SQUARED can only increase. Eventually, we need to "adjust" this measure to take this into account. See ADJUSTED R-SQUARED.

We also get an overall F test of the null hypothesis that the simple linear model does not explain significantly more variability in LOGWT than the average LOGWT.

F = MSQ(Regression)/MSQ(Residual) = 4.63/.7838, with df = 1, 9

Achieved significance < .0001. Reject $H_O$. Conclude that the fitted line explains statistically significantly more of the variability in Z = LOGWT than is explained by the null model that contains the average LOGWT only.

7. Hypothesis Testing

Straight Line Model: $Y = \beta_0 + \beta_1 X$

1) Overall F-Test

Research Question: Does the fitted model, the $\hat{Y}$, explain significantly more of the total variability of the Y about $\bar{Y}$ than does $\bar{Y}$?

Assumptions: As before.

$H_O$ and $H_A$:
$H_O: \beta_1 = 0$
$H_A: \beta_1 \neq 0$

Test Statistic:

$$F = \frac{\text{msq(regression)}}{\text{msq(residual)}}, \quad df = 1, (n-2)$$

Evaluation rule: When the null hypothesis is true, the value of F should be close to 1. Alternatively, when $\beta_1 \neq 0$, the value of F will be LARGER than 1. Thus, our p-value calculation answers: "What are the chances of obtaining our value of the F or one that is larger if we believe the null hypothesis that $\beta_1 = 0$?"

Calculations: For our data, we obtain

$$\text{p-value} = \Pr\left[ F_{1,(n-2)} \geq \frac{\text{msq(model)}}{\text{msq(residual)}} \;\middle|\; \beta_1 = 0 \right] = \Pr\left[ F_{1,9} \geq F_{\text{observed}} \right] \ll .0001$$

Evaluate: Assumption of the null hypothesis that $\beta_1 = 0$ has led to an extremely unlikely outcome (the observed F-statistic value), with chances of being observed less than 1 chance in 10,000. The null hypothesis is rejected.

Interpret: We have learned that, at least, the fitted straight line model does a much better job of explaining the variability in Z = LOGWT than a model that allows only for the average LOGWT. Later (PubHlth 64, Intermediate Biostatistics), we'll see that the analysis does not stop here.

2) Test of the Slope, $\beta_1$

Notes:
- The overall F test and the test of the slope are equivalent.
- The test of the slope uses a t-score approach to hypothesis testing.
- It can be shown that {t-score for slope}$^2$ = {overall F}

Research Question: Is the slope $\beta_1 = 0$?

Assumptions: As before.

$H_O$ and $H_A$:
$H_O: \beta_1 = 0$
$H_A: \beta_1 \neq 0$

Test Statistic: To compute the t-score, we need an estimate of the standard error of $\hat{\beta}_1$:

$$\widehat{SE}(\hat{\beta}_1) = \sqrt{ \text{msq(residual)} \left[ \frac{1}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] }$$

Our t-score is therefore:

$$t\text{-score} = \frac{(\text{observed}) - (\text{expected})}{\widehat{se}(\text{observed})} = \frac{\hat{\beta}_1 - 0}{\widehat{se}(\hat{\beta}_1)}, \quad df = (n-2)$$

We can find this information in our Stata output. Annotations are added:

z         Coef.    Std. Err.    t = Coef/Std. Err.    P>|t|    [95% Conf. Interval]
x         0.19589
_cons

Recall what we mean by a t-score: t = 73.38 says "the estimated slope is estimated to be 73.38 standard error units away from the null hypothesis expected value of zero".

Check that {t-score}$^2$ = {Overall F}: $[73.38]^2 = 5384.62$, which is close.

Evaluation rule: When the null hypothesis is true, the value of t should be close to zero. Alternatively, when $\beta_1 \neq 0$, the value of t will be DIFFERENT from 0. Here, our p-value calculation answers: "What are the chances of obtaining our value of the t or one that is farther away from 0 if we believe the null hypothesis that $\beta_1 = 0$?"
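The equivalence {t-score for slope}$^2$ = {overall F} holds exactly when both are computed from the same fit (any gap in a hand calculation is rounding). A minimal Python sketch, using made-up data and an illustrative function name `slope_test`, shows this:

```python
import math

# Hypothetical (X, Y) pairs, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def slope_test(x, y):
    """Return (b1, se_b1, t, F) for the straight line fit of y on x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    yhat = [b0 + b1 * xi for xi in x]

    msr = sum((yh - ybar) ** 2 for yh in yhat) / 1                  # regression mean square
    mse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat)) / (n - 2)  # residual mean square

    se_b1 = math.sqrt(mse / sxx)   # standard error of the slope
    t = (b1 - 0.0) / se_b1         # (observed - expected) / se
    f = msr / mse                  # overall F on 1, (n-2) df
    return b1, se_b1, t, f


b1, se_b1, t, f = slope_test(x, y)
print(f"t = {t:.4f}, t^2 = {t * t:.4f}, F = {f:.4f}")
```

Turning t or F into a p-value requires the Student t or F distribution, which the Python standard library does not provide; a statistical package or a printed table is needed for that step.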

Calculations: For our data, we obtain

$$\text{p-value} = 2 \Pr\left[ t_{(n-2)} \geq \left| \frac{\hat{\beta}_1 - 0}{\widehat{se}(\hat{\beta}_1)} \right| \right] = 2 \Pr\left[ t_9 \geq 73.38 \right] \ll .0001$$

Evaluate: Under the null hypothesis that $\beta_1 = 0$, the chances of obtaining a t-score value that is 73.38 or more standard error units away from the expected value of 0 are less than 1 chance in 10,000.

Interpret: The inference is the same as that for the overall F test. The fitted straight line model does a statistically significantly better job of explaining the variability in LOGWT than the sample mean.

3) Test of the Intercept, $\beta_0$

This addresses the question: Does the straight line relationship pass through the origin? It is rarely of interest.

Research Question: Is the intercept $\beta_0 = 0$?

Assumptions: As before.

$H_O$ and $H_A$:
$H_O: \beta_0 = 0$
$H_A: \beta_0 \neq 0$

Test Statistic: To compute the t-score for the intercept, we need an estimate of the standard error of $\hat{\beta}_0$:

$$\widehat{SE}(\hat{\beta}_0) = \sqrt{ \text{msq(residual)} \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] }$$

Our t-score is therefore:

$$t\text{-score} = \frac{(\text{observed}) - (\text{expected})}{\widehat{se}(\text{observed})} = \frac{\hat{\beta}_0 - 0}{\widehat{se}(\hat{\beta}_0)}, \quad df = (n-2)$$

Again, we can find this information in our Stata output. Annotations are added:

z         Coef.    Std. Err.    t = Coef/Std. Err.    P>|t|    [95% Conf. Interval]
x
_cons     -2.68925

Here, the t-score says how many standard error units the estimated intercept is away from its null hypothesis expected value of zero.

Evaluation rule: When the null hypothesis is true, the value of t should be close to zero. Alternatively, when $\beta_0 \neq 0$, the value of t will be DIFFERENT from 0. Our p-value calculation answers: "What are the chances of obtaining our value of the t or one that is farther away from 0 if we believe the null hypothesis that $\beta_0 = 0$?"

Calculations:

$$\text{p-value} = 2 \Pr\left[ t_{(n-2)} \geq \left| \frac{\hat{\beta}_0 - 0}{\widehat{se}(\hat{\beta}_0)} \right| \right] = 2 \Pr\left[ t_9 \geq |t_{\text{observed}}| \right] \ll .0001$$

Evaluate: Under the null hypothesis that $\beta_0 = 0$, the chances of obtaining a t-score value that is this many or more standard error units away from the expected value of 0 are less than 1 chance in 10,000, again prompting statistical rejection of the null hypothesis.

Interpret: The inference is that the straight line relationship between Z = LOGWT and X = AGE does not pass through the origin.

8. Confidence Interval Estimation

Straight Line Model: $Y = \beta_0 + \beta_1 X$

The confidence intervals here have 3 elements:
1) Best single guess (estimate)
2) Standard error of the best single guess (SE[estimate])
3) Confidence coefficient: This will be a percentile from the Student t distribution with df = (n-2)

We might want confidence interval estimates of the following 4 parameters:
(1) Slope
(2) Intercept
(3) Mean of subset of population for whom X = $x_0$
(4) Individual response for person for whom X = $x_0$

1) SLOPE

estimate = $\hat{\beta}_1$

$$\widehat{se}(\hat{\beta}_1) = \sqrt{ \text{msq(residual)} \left[ \frac{1}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] }$$

2) INTERCEPT

estimate = $\hat{\beta}_0$

$$\widehat{se}(\hat{\beta}_0) = \sqrt{ \text{msq(residual)} \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] }$$

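3) MEAN at X = $x_0$

estimate = $\hat{Y}_{X=x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0$

$$\widehat{se} = \sqrt{ \text{msq(residual)} \left[ \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] }$$

4) INDIVIDUAL with X = $x_0$

estimate = $\hat{Y}_{X=x_0} = \hat{\beta}_0 + \hat{\beta}_1 x_0$

$$\widehat{se} = \sqrt{ \text{msq(residual)} \left[ 1 + \frac{1}{n} + \frac{(x_0 - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \right] }$$

Example, continued: Z = LOGWT to X = AGE. Stata yielded the following fit:

z         Coef.      Std. Err.    t    P>|t|    [95% Conf. Interval]
x         0.19589    0.00268                    <- 95% CI for Slope $\beta_1$
_cons

95% Confidence Interval for the Slope, $\beta_1$

1) Best single guess (estimate) = $\hat{\beta}_1$ = 0.19589
2) Standard error of the best single guess (SE[estimate]) = $\widehat{se}(\hat{\beta}_1)$ = 0.00268
3) Confidence coefficient = 97.5th percentile of Student t = $t_{.975;\, df=9}$ = 2.26

95% Confidence Interval for Slope $\beta_1$ = Estimate ± (confidence coefficient)*SE
= 0.19589 ± (2.26)(0.00268)
= (0.1898, 0.2019)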
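All four intervals use the same recipe: estimate ± (t multiplier) × (standard error). The following Python sketch applies the four standard error formulas to a small made-up data set. The data, the value `x0 = 4.0`, and the helper names are assumptions for illustration; the t multiplier 3.182 is the 97.5th percentile of Student t on 3 df, read from a t table because the Python standard library has no t distribution.

```python
import math

# Hypothetical (X, Y) pairs, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# 97.5th percentile of Student t with df = n - 2 = 3 (from a t table).
T_MULT_DF3 = 3.182


def fit(x, y):
    """Return the pieces every interval needs: n, xbar, Sxx, b0, b1, msq(residual)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    mse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    return n, xbar, sxx, b0, b1, mse


def interval(estimate, se, t_mult):
    """Estimate +/- (confidence coefficient) * SE."""
    return (estimate - t_mult * se, estimate + t_mult * se)


n, xbar, sxx, b0, b1, mse = fit(x, y)
x0 = 4.0                      # value of X at which to predict (hypothetical)
yhat0 = b0 + b1 * x0

se_slope = math.sqrt(mse * (1 / sxx))
se_intercept = math.sqrt(mse * (1 / n + xbar ** 2 / sxx))
se_mean = math.sqrt(mse * (1 / n + (x0 - xbar) ** 2 / sxx))
se_indiv = math.sqrt(mse * (1 + 1 / n + (x0 - xbar) ** 2 / sxx))

ci_slope = interval(b1, se_slope, T_MULT_DF3)
ci_intercept = interval(b0, se_intercept, T_MULT_DF3)
ci_mean = interval(yhat0, se_mean, T_MULT_DF3)
ci_indiv = interval(yhat0, se_indiv, T_MULT_DF3)

print("slope     ", ci_slope)
print("intercept ", ci_intercept)
print("mean at x0", ci_mean)
print("individual", ci_indiv)
```

The interval for an individual response is always wider than the interval for the mean at the same $x_0$, because its standard error carries the extra "1 +" term for person-to-person variability.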

95% Confidence Interval for the Intercept, $\beta_0$

z         Coef.       Std. Err.    t    P>|t|    [95% Conf. Interval]
x
_cons     -2.68925    0.03064                    <- 95% CI for intercept $\beta_0$

1) Best single guess (estimate) = $\hat{\beta}_0$ = -2.68925
2) Standard error of the best single guess (SE[estimate]) = $\widehat{se}(\hat{\beta}_0)$ = 0.03064
3) Confidence coefficient = 97.5th percentile of Student t = $t_{.975;\, df=9}$ = 2.26

95% Confidence Interval for Intercept $\beta_0$ = Estimate ± (confidence coefficient)*SE
= -2.68925 ± (2.26)(0.03064)
= (-2.7585, -2.6200)

For the brave

Stata Example, continued: Confidence Intervals for MEAN of Z at Each Value of X

. regress z x
. predict zhat, xb
. ** Obtain SE for MEAN of Z given X
. predict semeanz, stdp
. ** Obtain confidence coefficient = 97.5th percentile of T on df=9
. generate tmult=invttail(9,.025)
. ** Generate lower and upper 95% CI limits for MEAN of Z at Each X
. generate lowmeanz=zhat-tmult*semeanz
. generate highmeanz=zhat+tmult*semeanz
. list x z zhat lowmeanz highmeanz, clean

[Listing: x, z, zhat, lowmeanz, highmeanz for each observation]

Stata Example, continued: Confidence Intervals for INDIVIDUAL PREDICTED Z at Each Value of X

. regress z x
. predict zhat, xb
. ** Obtain SE for INDIVIDUAL PREDICTION of Z at given X
. predict sepredictz, stdf
. ** Obtain confidence coefficient = 97.5th percentile of T on df=9
. generate tmult=invttail(9,.025)
. ** Generate lower and upper 95% CI limits for INDIVIDUAL PREDICTED Z at Each X
. generate lowpredictz=zhat-tmult*sepredictz
. generate highpredictz=zhat+tmult*sepredictz
. *** List Individual Predictions with 95% CI Limits
. list x z zhat lowpredictz highpredictz, clean

[Listing: x, z, zhat, lowpredictz, highpredictz for each observation]

9. Introduction to Correlation

Definition of Correlation

A correlation coefficient is a measure of the association between two paired random variables (e.g. height and weight). The Pearson product moment correlation, in particular, is a measure of the strength of the straight line relationship between the two random variables.

Another correlation measure (not discussed here) is the Spearman correlation. It is a measure of the strength of the monotone increasing (or decreasing) relationship between the two random variables. The Spearman correlation is a non-parametric (meaning model free) measure. It is introduced in PubHlth 64, Intermediate Biostatistics.

Formula for the Pearson Product Moment Correlation $\rho$

Population product moment correlation = $\rho$
Sample based estimate = r

Some preliminaries:

(1) Suppose we are interested in the correlation between X and Y.

(2) This is the covariance(X, Y):

$$\widehat{\text{cov}}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n-1)} = \frac{S_{xy}}{(n-1)}$$

(3)

$$\widehat{\text{var}}(X) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{(n-1)} = \frac{S_{xx}}{(n-1)}$$

(4) and similarly

$$\widehat{\text{var}}(Y) = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{(n-1)} = \frac{S_{yy}}{(n-1)}$$

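Formula for Estimate of Pearson Product Moment Correlation from a Sample

$$\hat{\rho} = r = \frac{\widehat{\text{cov}}(x, y)}{\sqrt{\widehat{\text{var}}(x)\, \widehat{\text{var}}(y)}} = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$

If you absolutely have to do it by hand, an equivalent (more calculator friendly) formula is

$$\hat{\rho} = r = \frac{\sum x_i y_i - \dfrac{\left( \sum x_i \right) \left( \sum y_i \right)}{n}}{\sqrt{ \left[ \sum x_i^2 - \dfrac{\left( \sum x_i \right)^2}{n} \right] \left[ \sum y_i^2 - \dfrac{\left( \sum y_i \right)^2}{n} \right] }}$$

- The correlation r can take on values between -1 and +1 only.
- Thus, the correlation coefficient is said to be dimensionless: it is independent of the units of x or y.
- Sign of the correlation coefficient (positive or negative) = sign of the estimated slope $\hat{\beta}_1$.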

There is a relationship between the slope of the straight line, $\hat{\beta}_1$, and the estimated correlation r.

Relationship between slope $\hat{\beta}_1$ and the sample correlation r

Because

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \quad \text{and} \quad r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$

a little algebra reveals that

$$r = \left[ \frac{\sqrt{S_{xx}}}{\sqrt{S_{yy}}} \right] \hat{\beta}_1$$

Thus, beware!!! It is possible for a very large (positive or negative) r to accompany a slope that is only slightly non-zero, inasmuch as
- A very large r might reflect a very large $S_{xx}$, all other things equal
- A very large r might reflect a very small $S_{yy}$, all other things equal.
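The warning is easy to see numerically: changing the units of X changes the slope but leaves r untouched. The Python sketch below uses made-up data; the rescaling by 1000 (for example, recording X in thousandths of its original unit) and the function names are illustrative assumptions.

```python
import math


def sums(x, y):
    """Return (Sxx, Syy, Sxy)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    return sxx, syy, sxy


def slope(x, y):
    sxx, _, sxy = sums(x, y)
    return sxy / sxx


def pearson_r(x, y):
    sxx, syy, sxy = sums(x, y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (X, Y) pairs, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# The same X values expressed in a unit 1000 times smaller.
x_rescaled = [1000.0 * xi for xi in x]

sxx, syy, _ = sums(x, y)
print("r from Sxy/sqrt(Sxx*Syy):   ", pearson_r(x, y))
print("r from sqrt(Sxx/Syy)*slope: ", math.sqrt(sxx / syy) * slope(x, y))
print("slope, original units:      ", slope(x, y))
print("slope, rescaled X:          ", slope(x_rescaled, y))
print("r, rescaled X:              ", pearson_r(x_rescaled, y))
```

After rescaling, $S_{xx}$ is a million times larger and the slope is a thousand times smaller, yet r is identical. A large r therefore says nothing by itself about how steep the line is.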

10. Hypothesis Test of Correlation

The null hypothesis of zero correlation is equivalent to the null hypothesis of zero slope.

Research Question: Is the correlation $\rho = 0$? Is the slope $\beta_1 = 0$?

Assumptions: As before.

$H_O$ and $H_A$:
$H_O: \rho = 0$
$H_A: \rho \neq 0$

Test Statistic: A little algebra (not shown) yields a very nice formula for the t-score that we need.

$$t\text{-score} = \frac{r \sqrt{(n-2)}}{\sqrt{1 - r^2}}, \quad df = (n-2)$$

We can find this information in our output. Recall the first example and the model of Z = LOGWT to X = AGE. The Pearson correlation, r, is the square root of the R-squared in the output (taking the sign of the estimated slope).

Source      SS    df    MS
Model
Residual
Total

Number of obs = 11
F(1, 9)       =
Prob > F      = 0.0000
R-squared     = 0.9983
Adj R-squared = 0.9981
Root MSE      = .87

Pearson Correlation, $r = \sqrt{0.9983} = 0.9991$

Substitution into the formula for the t-score yields

$$t\text{-score} = \frac{r \sqrt{(n-2)}}{\sqrt{1 - r^2}} = \frac{0.9991 \sqrt{9}}{\sqrt{1 - 0.9983}} = 72.69$$

Note: The value 0.9991 in the numerator is $r = \sqrt{R^2} = \sqrt{0.9983} = 0.9991$.

This is very close to the value of the t-score that was obtained for testing the null hypothesis of zero slope. The discrepancy is probably rounding error. I did the calculations on my calculator using 4 significant digits. Stata probably used more significant digits - cb.
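The substitution can be reproduced in a couple of lines. This Python sketch defines an illustrative helper, `t_from_r` (a name made up here), and applies it to the R-squared of 0.9983 with df = 9 reported in the output above, as well as to a small made-up data summary where the slope t-score is known exactly.

```python
import math


def t_from_r(r, n):
    """t-score for testing zero correlation: r * sqrt(n - 2) / sqrt(1 - r^2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r ** 2)


# Values taken from the LOGWT-on-AGE output: R-squared = 0.9983, n - 2 = 9.
r_logwt = math.sqrt(0.9983)
t_logwt = t_from_r(r_logwt, 11)
print(f"r = {r_logwt:.4f}, t-score = {t_logwt:.2f}")

# Hypothetical check: a fit with r^2 = 0.6 and n = 5 has overall F = 4.5,
# so the correlation t-score should equal sqrt(4.5).
t_small = t_from_r(math.sqrt(0.6), 5)
print(f"t-score = {t_small:.4f}, sqrt(F) = {math.sqrt(4.5):.4f}")
```

Carrying full precision instead of 4 significant digits moves the first result only in the second decimal place, which supports the rounding explanation given in the note.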


More information

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model

12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model 1. Estmatg Model parameters Assumptos: ox ad y are related accordg to the smple lear regresso model (The lear regresso model s the model that says that x ad y are related a lear fasho, but the observed

More information

Probability and. Lecture 13: and Correlation

Probability and. Lecture 13: and Correlation 933 Probablty ad Statstcs for Software ad Kowledge Egeers Lecture 3: Smple Lear Regresso ad Correlato Mocha Soptkamo, Ph.D. Outle The Smple Lear Regresso Model (.) Fttg the Regresso Le (.) The Aalyss of

More information

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions.

Ordinary Least Squares Regression. Simple Regression. Algebra and Assumptions. Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos

More information

Objectives of Multiple Regression

Objectives of Multiple Regression Obectves of Multple Regresso Establsh the lear equato that best predcts values of a depedet varable Y usg more tha oe eplaator varable from a large set of potetal predctors {,,... k }. Fd that subset of

More information

Statistics MINITAB - Lab 5

Statistics MINITAB - Lab 5 Statstcs 10010 MINITAB - Lab 5 PART I: The Correlato Coeffcet Qute ofte statstcs we are preseted wth data that suggests that a lear relatoshp exsts betwee two varables. For example the plot below s of

More information

Summary of the lecture in Biostatistics

Summary of the lecture in Biostatistics Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the

More information

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model

Lecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The

More information

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn:

Chapter Business Statistics: A First Course Fifth Edition. Learning Objectives. Correlation vs. Regression. In this chapter, you learn: Chapter 3 3- Busess Statstcs: A Frst Course Ffth Edto Chapter 2 Correlato ad Smple Lear Regresso Busess Statstcs: A Frst Course, 5e 29 Pretce-Hall, Ic. Chap 2- Learg Objectves I ths chapter, you lear:

More information

Multiple Linear Regression Analysis

Multiple Linear Regression Analysis LINEA EGESSION ANALYSIS MODULE III Lecture - 4 Multple Lear egresso Aalyss Dr. Shalabh Departmet of Mathematcs ad Statstcs Ida Isttute of Techology Kapur Cofdece terval estmato The cofdece tervals multple

More information

Econometric Methods. Review of Estimation

Econometric Methods. Review of Estimation Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators

More information

Example: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger

Example: Multiple linear regression. Least squares regression. Repetition: Simple linear regression. Tron Anders Moger Example: Multple lear regresso 5000,00 4000,00 Tro Aders Moger 0.0.007 brthweght 3000,00 000,00 000,00 0,00 50,00 00,00 50,00 00,00 50,00 weght pouds Repetto: Smple lear regresso We defe a model Y = β0

More information

Lecture 8: Linear Regression

Lecture 8: Linear Regression Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE

More information

Regresso What s a Model? 1. Ofte Descrbe Relatoshp betwee Varables 2. Types - Determstc Models (o radomess) - Probablstc Models (wth radomess) EPI 809/Sprg 2008 9 Determstc Models 1. Hypothesze

More information

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 00 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for the

More information

Simple Linear Regression

Simple Linear Regression Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uversty Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal

More information

Unit 2. Regression and Correlation

Unit 2. Regression and Correlation PubHlth 640 - Sprg 0. Regresso ad Correlato Page of 80 Ut. Regresso ad Correlato Do t let us quarrel, the Whte Quee sad a axous toe. What s the cause of lghtg? The cause of lghtg, Alce sad very decdedly,

More information

residual. (Note that usually in descriptions of regression analysis, upper-case

residual. (Note that usually in descriptions of regression analysis, upper-case Regresso Aalyss Regresso aalyss fts or derves a model that descres the varato of a respose (or depedet ) varale as a fucto of oe or more predctor (or depedet ) varales. The geeral regresso model s oe of

More information

Linear Regression with One Regressor

Linear Regression with One Regressor Lear Regresso wth Oe Regressor AIM QA.7. Expla how regresso aalyss ecoometrcs measures the relatoshp betwee depedet ad depedet varables. A regresso aalyss has the goal of measurg how chages oe varable,

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Exam: ECON430 Statstcs Date of exam: Frday, December 8, 07 Grades are gve: Jauary 4, 08 Tme for exam: 0900 am 00 oo The problem set covers 5 pages Resources allowed:

More information

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs

CLASS NOTES. for. PBAF 528: Quantitative Methods II SPRING Instructor: Jean Swanson. Daniel J. Evans School of Public Affairs CLASS NOTES for PBAF 58: Quattatve Methods II SPRING 005 Istructor: Jea Swaso Dael J. Evas School of Publc Affars Uversty of Washgto Ackowledgemet: The structor wshes to thak Rachel Klet, Assstat Professor,

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ " 1

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. x, where. = y - ˆ  1 STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Recall Assumpto E(Y x) η 0 + η x (lear codtoal mea fucto) Data (x, y ), (x 2, y 2 ),, (x, y ) Least squares estmator ˆ E (Y x) ˆ " 0 + ˆ " x, where ˆ

More information

ESS Line Fitting

ESS Line Fitting ESS 5 014 17. Le Fttg A very commo problem data aalyss s lookg for relatoshpetwee dfferet parameters ad fttg les or surfaces to data. The smplest example s fttg a straght le ad we wll dscuss that here

More information

Statistics: Unlocking the Power of Data Lock 5

Statistics: Unlocking the Power of Data Lock 5 STAT 0 Dr. Kar Lock Morga Exam 2 Grades: I- Class Multple Regresso SECTIONS 9.2, 0., 0.2 Multple explaatory varables (0.) Parttog varablty R 2, ANOVA (9.2) Codtos resdual plot (0.2) Exam 2 Re- grades Re-

More information

The equation is sometimes presented in form Y = a + b x. This is reasonable, but it s not the notation we use.

The equation is sometimes presented in form Y = a + b x. This is reasonable, but it s not the notation we use. INTRODUCTORY NOTE ON LINEAR REGREION We have data of the form (x y ) (x y ) (x y ) These wll most ofte be preseted to us as two colum of a spreadsheet As the topc develops we wll see both upper case ad

More information

Mean is only appropriate for interval or ratio scales, not ordinal or nominal.

Mean is only appropriate for interval or ratio scales, not ordinal or nominal. Mea Same as ordary average Sum all the data values ad dvde by the sample sze. x = ( x + x +... + x Usg summato otato, we wrte ths as x = x = x = = ) x Mea s oly approprate for terval or rato scales, ot

More information

Lecture 3. Sampling, sampling distributions, and parameter estimation

Lecture 3. Sampling, sampling distributions, and parameter estimation Lecture 3 Samplg, samplg dstrbutos, ad parameter estmato Samplg Defto Populato s defed as the collecto of all the possble observatos of terest. The collecto of observatos we take from the populato s called

More information

Chapter 14 Logistic Regression Models

Chapter 14 Logistic Regression Models Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as

More information

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model

ECON 482 / WH Hong The Simple Regression Model 1. Definition of the Simple Regression Model ECON 48 / WH Hog The Smple Regresso Model. Defto of the Smple Regresso Model Smple Regresso Model Expla varable y terms of varable x y = β + β x+ u y : depedet varable, explaed varable, respose varable,

More information

Midterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes

Midterm Exam 1, section 2 (Solution) Thursday, February hour, 15 minutes coometrcs, CON Sa Fracsco State Uverst Mchael Bar Sprg 5 Mdterm xam, secto Soluto Thursda, Februar 6 hour, 5 mutes Name: Istructos. Ths s closed book, closed otes exam.. No calculators of a kd are allowed..

More information

Chapter 8. Inferences about More Than Two Population Central Values

Chapter 8. Inferences about More Than Two Population Central Values Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha

More information

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes

Midterm Exam 1, section 1 (Solution) Thursday, February hour, 15 minutes coometrcs, CON Sa Fracsco State Uversty Mchael Bar Sprg 5 Mdterm am, secto Soluto Thursday, February 6 hour, 5 mutes Name: Istructos. Ths s closed book, closed otes eam.. No calculators of ay kd are allowed..

More information

Simple Linear Regression - Scalar Form

Simple Linear Regression - Scalar Form Smple Lear Regresso - Scalar Form Q.. Model Y X,..., p..a. Derve the ormal equatos that mmze Q. p..b. Solve for the ordary least squares estmators, p..c. Derve E, V, E, V, COV, p..d. Derve the mea ad varace

More information

ε. Therefore, the estimate

ε. Therefore, the estimate Suggested Aswers, Problem Set 3 ECON 333 Da Hugerma. Ths s ot a very good dea. We kow from the secod FOC problem b) that ( ) SSE / = y x x = ( ) Whch ca be reduced to read y x x = ε x = ( ) The OLS model

More information

Simple Linear Regression and Correlation. Applied Statistics and Probability for Engineers. Chapter 11 Simple Linear Regression and Correlation

Simple Linear Regression and Correlation. Applied Statistics and Probability for Engineers. Chapter 11 Simple Linear Regression and Correlation 4//6 Appled Statstcs ad Probablty for Egeers Sth Edto Douglas C. Motgomery George C. Ruger Chapter Smple Lear Regresso ad Correlato CHAPTER OUTLINE Smple Lear Regresso ad Correlato - Emprcal Models -8

More information

Statistics. Correlational. Dr. Ayman Eldeib. Simple Linear Regression and Correlation. SBE 304: Linear Regression & Correlation 1/3/2018

Statistics. Correlational. Dr. Ayman Eldeib. Simple Linear Regression and Correlation. SBE 304: Linear Regression & Correlation 1/3/2018 /3/08 Sstems & Bomedcal Egeerg Departmet SBE 304: Bo-Statstcs Smple Lear Regresso ad Correlato Dr. Ama Eldeb Fall 07 Descrptve Orgasg, summarsg & descrbg data Statstcs Correlatoal Relatoshps Iferetal Geeralsg

More information

Lecture 1: Introduction to Regression

Lecture 1: Introduction to Regression Lecture : Itroducto to Regresso A Eample: Eplag State Homcde Rates What kds of varables mght we use to epla/predct state homcde rates? Let s cosder just oe predctor for ow: povert Igore omtted varables,

More information

Multiple Choice Test. Chapter Adequacy of Models for Regression

Multiple Choice Test. Chapter Adequacy of Models for Regression Multple Choce Test Chapter 06.0 Adequac of Models for Regresso. For a lear regresso model to be cosdered adequate, the percetage of scaled resduals that eed to be the rage [-,] s greater tha or equal to

More information

ENGI 4421 Propagation of Error Page 8-01

ENGI 4421 Propagation of Error Page 8-01 ENGI 441 Propagato of Error Page 8-01 Propagato of Error [Navd Chapter 3; ot Devore] Ay realstc measuremet procedure cotas error. Ay calculatos based o that measuremet wll therefore also cota a error.

More information

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1

STA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1 STA 08 Appled Lear Models: Regresso Aalyss Sprg 0 Soluto for Homework #. Let Y the dollar cost per year, X the umber of vsts per year. The the mathematcal relato betwee X ad Y s: Y 300 + X. Ths s a fuctoal

More information

Simple Linear Regression and Correlation.

Simple Linear Regression and Correlation. Smple Lear Regresso ad Correlato. Correspods to Chapter 0 Tamhae ad Dulop Sldes prepared b Elzabeth Newto (MIT) wth some sldes b Jacquele Telford (Johs Hopks Uverst) Smple lear regresso aalss estmates

More information

Chapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance

Chapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance Chapter, Part A Aalyss of Varace ad Epermetal Desg Itroducto to Aalyss of Varace Aalyss of Varace: Testg for the Equalty of Populato Meas Multple Comparso Procedures Itroducto to Aalyss of Varace Aalyss

More information

TESTS BASED ON MAXIMUM LIKELIHOOD

TESTS BASED ON MAXIMUM LIKELIHOOD ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal

More information

Handout #8. X\Y f(x) 0 1/16 1/ / /16 3/ / /16 3/16 0 3/ /16 1/16 1/8 g(y) 1/16 1/4 3/8 1/4 1/16 1

Handout #8. X\Y f(x) 0 1/16 1/ / /16 3/ / /16 3/16 0 3/ /16 1/16 1/8 g(y) 1/16 1/4 3/8 1/4 1/16 1 Hadout #8 Ttle: Foudatos of Ecoometrcs Course: Eco 367 Fall/05 Istructor: Dr. I-Mg Chu Lear Regresso Model So far we have focused mostly o the study of a sgle radom varable, ts correspodg theoretcal dstrbuto,

More information

Functions of Random Variables

Functions of Random Variables Fuctos of Radom Varables Chapter Fve Fuctos of Radom Varables 5. Itroducto A geeral egeerg aalyss model s show Fg. 5.. The model output (respose) cotas the performaces of a system or product, such as weght,

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uverst Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal

More information

Logistic regression (continued)

Logistic regression (continued) STAT562 page 138 Logstc regresso (cotued) Suppose we ow cosder more complex models to descrbe the relatoshp betwee a categorcal respose varable (Y) that takes o two (2) possble outcomes ad a set of p explaatory

More information

Continuous Distributions

Continuous Distributions 7//3 Cotuous Dstrbutos Radom Varables of the Cotuous Type Desty Curve Percet Desty fucto, f (x) A smooth curve that ft the dstrbuto 3 4 5 6 7 8 9 Test scores Desty Curve Percet Probablty Desty Fucto, f

More information

Chapter Two. An Introduction to Regression ( )

Chapter Two. An Introduction to Regression ( ) ubject: A Itroducto to Regresso Frst tage Chapter Two A Itroducto to Regresso (018-019) 1 pg. ubject: A Itroducto to Regresso Frst tage A Itroducto to Regresso Regresso aalss s a statstcal tool for the

More information

Multivariate Transformation of Variables and Maximum Likelihood Estimation

Multivariate Transformation of Variables and Maximum Likelihood Estimation Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty

More information

Applied Statistics and Probability for Engineers, 5 th edition February 23, b) y ˆ = (85) =

Applied Statistics and Probability for Engineers, 5 th edition February 23, b) y ˆ = (85) = Appled Statstcs ad Probablty for Egeers, 5 th edto February 3, y.8.7.6.5.4.3.. -5 5 5 x b) y ˆ.3999 +.46(85).6836 c) y ˆ.3999 +.46(9).744 d) ˆ.46-3 a) Regresso Aalyss: Ratg Pots versus Meters per Att The

More information

Regression. Linear Regression. A Simple Data Display. A Batch of Data. The Mean is 220. A Value of 474. STAT Handout Module 15 1 st of June 2009

Regression. Linear Regression. A Simple Data Display. A Batch of Data. The Mean is 220. A Value of 474. STAT Handout Module 15 1 st of June 2009 STAT Hadout Module 5 st of Jue 9 Lear Regresso Regresso Joh D. Sork, M.D. Ph.D. Baltmore VA Medcal Ceter GRCC ad Uversty of Marylad School of Medce Claude D. Pepper Older Amercas Idepedece Ceter Reducg

More information

: At least two means differ SST

: At least two means differ SST Formula Card for Eam 3 STA33 ANOVA F-Test: Completely Radomzed Desg ( total umber of observatos, k = Number of treatmets,& T = total for treatmet ) Step : Epress the Clam Step : The ypotheses: :... 0 A

More information

Lecture 1: Introduction to Regression

Lecture 1: Introduction to Regression Lecture : Itroducto to Regresso A Eample: Eplag State Homcde Rates What kds of varables mght we use to epla/predct state homcde rates? Let s cosder just oe predctor for ow: povert Igore omtted varables,

More information

4. Standard Regression Model and Spatial Dependence Tests

4. Standard Regression Model and Spatial Dependence Tests 4. Stadard Regresso Model ad Spatal Depedece Tests Stadard regresso aalss fals the presece of spatal effects. I case of spatal depedeces ad/or spatal heterogeet a stadard regresso model wll be msspecfed.

More information

ECONOMETRIC THEORY. MODULE VIII Lecture - 26 Heteroskedasticity

ECONOMETRIC THEORY. MODULE VIII Lecture - 26 Heteroskedasticity ECONOMETRIC THEORY MODULE VIII Lecture - 6 Heteroskedastcty Dr. Shalabh Departmet of Mathematcs ad Statstcs Ida Isttute of Techology Kapur . Breusch Paga test Ths test ca be appled whe the replcated data

More information

CHAPTER VI Statistical Analysis of Experimental Data

CHAPTER VI Statistical Analysis of Experimental Data Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca

More information

Lecture 1 Review of Fundamental Statistical Concepts

Lecture 1 Review of Fundamental Statistical Concepts Lecture Revew of Fudametal Statstcal Cocepts Measures of Cetral Tedecy ad Dsperso A word about otato for ths class: Idvduals a populato are desgated, where the dex rages from to N, ad N s the total umber

More information

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS

UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted

More information

Chapter 11 The Analysis of Variance

Chapter 11 The Analysis of Variance Chapter The Aalyss of Varace. Oe Factor Aalyss of Varace. Radomzed Bloc Desgs (ot for ths course) NIPRL . Oe Factor Aalyss of Varace.. Oe Factor Layouts (/4) Suppose that a expermeter s terested populatos

More information

Reaction Time VS. Drug Percentage Subject Amount of Drug Times % Reaction Time in Seconds 1 Mary John Carl Sara William 5 4

Reaction Time VS. Drug Percentage Subject Amount of Drug Times % Reaction Time in Seconds 1 Mary John Carl Sara William 5 4 CHAPTER Smple Lear Regreo EXAMPLE A expermet volvg fve ubject coducted to determe the relatohp betwee the percetage of a certa drug the bloodtream ad the legth of tme t take the ubject to react to a tmulu.

More information

Homework Solution (#5)

Homework Solution (#5) Homework Soluto (# Chapter : #6,, 8(b, 3, 4, 44, 49, 3, 9 ad 7 Chapter. Smple Lear Regresso ad Correlato.6 (6 th edto 7, old edto Page 9 Rafall volume ( vs Ruoff volume ( : 9 8 7 6 4 3 : a. Yes, the scatter-plot

More information

STA 105-M BASIC STATISTICS (This is a multiple choice paper.)

STA 105-M BASIC STATISTICS (This is a multiple choice paper.) DCDM BUSINESS SCHOOL September Mock Eamatos STA 0-M BASIC STATISTICS (Ths s a multple choce paper.) Tme: hours 0 mutes INSTRUCTIONS TO CANDIDATES Do ot ope ths questo paper utl you have bee told to do

More information

Chapter Statistics Background of Regression Analysis

Chapter Statistics Background of Regression Analysis Chapter 06.0 Statstcs Backgroud of Regresso Aalyss After readg ths chapter, you should be able to:. revew the statstcs backgroud eeded for learg regresso, ad. kow a bref hstory of regresso. Revew of Statstcal

More information

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:

{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution: Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed

More information

Example. Row Hydrogen Carbon

Example. Row Hydrogen Carbon SMAM 39 Least Squares Example. Heatg ad combusto aalyses were performed order to study the composto of moo rocks collected by Apollo 4 ad 5 crews. Recorded c ad c of the Mtab output are the determatos

More information

Lecture 2: Linear Least Squares Regression

Lecture 2: Linear Least Squares Regression Lecture : Lear Least Squares Regresso Dave Armstrog UW Mlwaukee February 8, 016 Is the Relatoshp Lear? lbrary(car) data(davs) d 150) Davs$weght[d]

More information

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA

THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA THE ROYAL STATISTICAL SOCIETY EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA PAPER II STATISTICAL THEORY & METHODS The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for

More information

Chapter 5 Properties of a Random Sample

Chapter 5 Properties of a Random Sample Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample

More information

Chapter 2 Simple Linear Regression

Chapter 2 Simple Linear Regression Chapter Smple Lear Regresso. Itroducto ad Least Squares Estmates Regresso aalyss s a method for vestgatg the fuctoal relatoshp amog varables. I ths chapter we cosder problems volvg modelg the relatoshp

More information

Chapter 2 Supplemental Text Material

Chapter 2 Supplemental Text Material -. Models for the Data ad the t-test Chapter upplemetal Text Materal The model preseted the text, equato (-3) s more properl called a meas model. ce the mea s a locato parameter, ths tpe of model s also

More information

MEASURES OF DISPERSION

MEASURES OF DISPERSION MEASURES OF DISPERSION Measure of Cetral Tedecy: Measures of Cetral Tedecy ad Dsperso ) Mathematcal Average: a) Arthmetc mea (A.M.) b) Geometrc mea (G.M.) c) Harmoc mea (H.M.) ) Averages of Posto: a) Meda

More information

Module 7: Probability and Statistics

Module 7: Probability and Statistics Lecture 4: Goodess of ft tests. Itroducto Module 7: Probablty ad Statstcs I the prevous two lectures, the cocepts, steps ad applcatos of Hypotheses testg were dscussed. Hypotheses testg may be used to

More information

2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen.

2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen. .5 x 54.5 a. x 7. 786 7 b. The raked observatos are: 7.4, 7.5, 7.7, 7.8, 7.9, 8.0, 8.. Sce the sample sze 7 s odd, the meda s the (+)/ 4 th raked observato, or meda 7.8 c. The cosumer would more lkely

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Marquette Uverst Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Coprght 08 b Marquette Uverst Maxmum Lkelhood Estmato We have bee sag that ~

More information

Chapter 3 Sampling For Proportions and Percentages

Chapter 3 Sampling For Proportions and Percentages Chapter 3 Samplg For Proportos ad Percetages I may stuatos, the characterstc uder study o whch the observatos are collected are qualtatve ature For example, the resposes of customers may marketg surveys

More information

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek

Investigation of Partially Conditional RP Model with Response Error. Ed Stanek Partally Codtoal Radom Permutato Model 7- vestgato of Partally Codtoal RP Model wth Respose Error TRODUCTO Ed Staek We explore the predctor that wll result a smple radom sample wth respose error whe a

More information

"It is the mark of a truly intelligent person to be moved by statistics." George Bernard Shaw

It is the mark of a truly intelligent person to be moved by statistics. George Bernard Shaw Chapter 0 Chapter 0 Lear Regresso ad Correlato "It s the mark of a truly tellget perso to be moved by statstcs." George Berard Shaw Source: https://www.google.com.ph/search?q=house+ad+car+pctures&bw=366&bh=667&tbm

More information

Introduction to F-testing in linear regression models

Introduction to F-testing in linear regression models ECON 43 Harald Goldste, revsed Nov. 4 Itroducto to F-testg lear regso s (Lecture ote to lecture Frday 4..4) Itroducto A F-test usually s a test where several parameters are volved at oce the ull hypothess

More information

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model

( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model Chapter 3 Asmptotc Theor ad Stochastc Regressors The ature of eplaator varable s assumed to be o-stochastc or fed repeated samples a regresso aalss Such a assumpto s approprate for those epermets whch

More information

Johns Hopkins University Department of Biostatistics Math Review for Introductory Courses

Johns Hopkins University Department of Biostatistics Math Review for Introductory Courses Johs Hopks Uverst Departmet of Bostatstcs Math Revew for Itroductor Courses Ratoale Bostatstcs courses wll rel o some fudametal mathematcal relatoshps, fuctos ad otato. The purpose of ths Math Revew s

More information

Special Instructions / Useful Data

Special Instructions / Useful Data JAM 6 Set of all real umbers P A..d. B, p Posso Specal Istructos / Useful Data x,, :,,, x x Probablty of a evet A Idepedetly ad detcally dstrbuted Bomal dstrbuto wth parameters ad p Posso dstrbuto wth

More information

1. The weight of six Golden Retrievers is 66, 61, 70, 67, 92 and 66 pounds. The weight of six Labrador Retrievers is 54, 60, 72, 78, 84 and 67.

1. The weight of six Golden Retrievers is 66, 61, 70, 67, 92 and 66 pounds. The weight of six Labrador Retrievers is 54, 60, 72, 78, 84 and 67. Ecoomcs 3 Itroducto to Ecoometrcs Sprg 004 Professor Dobk Name Studet ID Frst Mdterm Exam You must aswer all the questos. The exam s closed book ad closed otes. You may use your calculators but please

More information

Unit 9 Regression and Correlation

Unit 9 Regression and Correlation BIOSTATS 540 - Fall 05 Regressio ad Correlatio Page of 44 Uit 9 Regressio ad Correlatio Assume that a statistical model such as a liear model is a good first start oly - Gerald va Belle Is higher blood

More information
