Statistics for EES Linear regression and linear models

Dirk Metzler, June 11, 2018

Contents
1 Univariate linear regression: how and why?
2 t-test for linear regression
3 log-scaling the data
4 Checking model assumptions
5 Linear regression example with scaling
6 Why it is called regression

1 Univariate linear regression: how and why?

References
[1] Prinzinger, R., E. Karl, R. Bögel, Ch. Walzer (1999): Energy metabolism, body temperature, and cardiac work in the Griffon vulture Gyps fulvus - telemetric investigations in the laboratory and in the field. Zoology 102, Suppl. II: 15

Data from Goethe University, group of Prof. Prinzinger.
The group developed a telemetric system for measuring the heart beats of flying birds.
Important for ecological questions: the metabolic rate.
The metabolic rate can only be measured in the lab.
Can we infer the metabolic rate from the heart beat frequency?

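[Figure: scatter plot for a griffon vulture at 16 degrees C (the date in the plot title was lost in the transcription); x-axis: heart beats [per minute], y-axis: metabolic rate [J/(g*h)]]

[Data frame vulture with columns day, heartbpm, metabol, mintemp, maxtemp, medtemp; the values were lost in the transcription]
(14 different days)

> model <- lm(metabol ~ heartbpm, data = vulture, subset = day == "17.05.")
> summary(model)

Call:
lm(formula = metabol ~ heartbpm, data = vulture, subset = day == "17.05.")

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]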

Coefficients: [Estimate, Std. Error and t value were lost in the transcription; Pr(>|t|) is of order e-08 for (Intercept) and of order e-14 for heartbpm, both marked ***]
Residual standard error: [value lost] on 17 degrees of freedom
F-statistic: [value lost] on 1 and 17 DF, p-value: 2.979e-14

[Figure: a straight line $y = a + b\,x$ with intercept $a$ and slope $b = \frac{y_2 - y_1}{x_2 - x_1}$]

[Figure: data points with their residuals $r_1, r_2, \dots, r_n$ relative to the line]

Residuals: $r_i = y_i - (a + b\,x_i)$

The line must minimize the sum of squared residuals $r_1^2 + r_2^2 + \dots + r_n^2$.

Define the regression line
$$y = \hat a + \hat b\,x$$

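by minimizing the sum of squared residuals:
$$(\hat a, \hat b) = \arg\min_{(a,b)} \sum_i \bigl(y_i - (a + b\,x_i)\bigr)^2$$

This is based on the model assumption that values $a$, $b$ exist such that, for all data points $(x_i, y_i)$, we have
$$y_i = a + b\,x_i + \varepsilon_i,$$
where all $\varepsilon_i$ are independent and normally distributed with the same variance $\sigma^2$.

Given data:

Y     X
y_1   x_1
y_2   x_2
y_3   x_3
...   ...
y_n   x_n

Model: there are values $a$, $b$, $\sigma^2$ such that
$$\begin{aligned}
y_1 &= a + b\,x_1 + \varepsilon_1\\
y_2 &= a + b\,x_2 + \varepsilon_2\\
y_3 &= a + b\,x_3 + \varepsilon_3\\
&\;\;\vdots\\
y_n &= a + b\,x_n + \varepsilon_n
\end{aligned}$$

$\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n$ are independent and $N(0, \sigma^2)$-distributed.
Consequently, $y_1, y_2, \dots, y_n$ are independent with $y_i \sim N(a + b\,x_i, \sigma^2)$.
$a$, $b$, $\sigma^2$ are unknown, but not random.

We estimate $a$ and $b$ by computing
$$(\hat a, \hat b) := \arg\min_{(a,b)} \sum_i \bigl(y_i - (a + b\,x_i)\bigr)^2.$$

Theorem 1. Compute $\hat a$ and $\hat b$ by
$$\hat b = \frac{\sum_i (y_i - \bar y)(x_i - \bar x)}{\sum_i (x_i - \bar x)^2} = \frac{\sum_i y_i\,(x_i - \bar x)}{\sum_i (x_i - \bar x)^2}$$
and
$$\hat a = \bar y - \hat b\,\bar x.$$

Please keep in mind: The line $y = \hat a + \hat b\,x$ goes through the center of gravity $(\bar x, \bar y)$ of the cloud of points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$.

Sketch of the proof of the theorem
Let $g(a, b) = \sum_i \bigl(y_i - (a + b\,x_i)\bigr)^2$. We optimize $g$ by setting the derivatives of $g$
$$\frac{\partial g(a, b)}{\partial a} = \sum_i 2\,\bigl(y_i - (a + b\,x_i)\bigr)\,(-1)$$
$$\frac{\partial g(a, b)}{\partial b} = \sum_i 2\,\bigl(y_i - (a + b\,x_i)\bigr)\,(-x_i)$$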

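to 0 and obtain
$$0 = \sum_i \bigl(y_i - (\hat a + \hat b\,x_i)\bigr)\,(-1)$$
$$0 = \sum_i \bigl(y_i - (\hat a + \hat b\,x_i)\bigr)\,(-x_i)$$
which is equivalent to
$$0 = \sum_i \bigl(y_i - (\hat a + \hat b\,x_i)\bigr)$$
$$0 = \sum_i \bigl(y_i - (\hat a + \hat b\,x_i)\bigr)\,x_i$$
This gives us
$$0 = \Bigl(\sum_i y_i\Bigr) - n\,\hat a - \hat b\,\Bigl(\sum_i x_i\Bigr)$$
$$0 = \Bigl(\sum_i y_i x_i\Bigr) - \hat a\,\Bigl(\sum_i x_i\Bigr) - \hat b\,\Bigl(\sum_i x_i^2\Bigr)$$
and the theorem follows by solving this for $\hat a$ and $\hat b$.

Regression and Correlation
If $s_x$ and $s_y$ are the bias-corrected (that is, computed with $n - 1$) standard deviations of the x and y values, and if
$$\mathrm{cov}(x, y) = \frac{1}{n-1} \sum_i (x_i - \bar x)(y_i - \bar y)$$
is the bias-corrected covariance, we obtain for the estimated slope of the regression line:
$$\hat b = \frac{\sum_i (x_i - \bar x)(y_i - \bar y)}{\sum_i (x_i - \bar x)^2} = \frac{\frac{1}{n-1}\sum_i (x_i - \bar x)(y_i - \bar y)}{\frac{1}{n-1}\sum_i (x_i - \bar x)^2} = \frac{\mathrm{cov}(x, y)}{s_x^2}.$$
Thus, provided $\mathrm{cov}(x, y) \neq 0$, $\hat b$ is equal to the correlation $\mathrm{cor}(x, y) = \frac{\mathrm{cov}(x, y)}{s_x\,s_y}$ if and only if $s_x = s_y$.

Optimizing the clutch size
Example: Cowpea weevil (also bruchid beetle) Callosobruchus maculatus
German: Erbsensamenkäfer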

References
[Wil94] Wilson, K. (1994) Evolution of clutch size in insects. II. A test of static optimality models using the beetle Callosobruchus maculatus (Coleoptera: Bruchidae). Journal of Evolutionary Biology 7:

How does survival probability depend on clutch size?
Which clutch size optimizes the expected number of surviving offspring?

[Figure: viability plotted against clutchsize]
[Figure: clutchsize * viability plotted against clutchsize]

2 t-test for linear regression

Example: red deer (Cervus elaphus)
Theory: females can influence the sex of their offspring.

Evolutionarily stable strategy: weak animals may tend to have female offspring, strong animals may tend to have male offspring.

References
[CAG86] Clutton-Brock, T. H., Albon, S. D., Guinness, F. E. (1986) Great expectations: dominance, breeding success and offspring sex ratios in red deer. Anim. Behav. 34,

> hind
[Data table with columns rank and ratiomales; the values were lost in the transcription]
CAUTION: Simulated data, inspired by the original paper

[Figure: scatter plot of hind$ratiomales against hind$rank]

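> mod <- lm(ratiomales ~ rank, data = hind)
> summary(mod)

Call:
lm(formula = ratiomales ~ rank, data = hind)

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]
Coefficients: [Estimate, Std. Error and t value were lost in the transcription; Pr(>|t|) is of order e-06 for (Intercept) and of order e-09 for rank, both marked ***]
Residual standard error: [value lost] on 52 degrees of freedom
F-statistic: [value lost] on 1 and 52 DF, p-value: 9.78e-09

Model: $Y = a + b\,X + \varepsilon$ with $\varepsilon \sim N(0, \sigma^2)$

How to compute the significance of a relationship between the explanatory trait X and the target variable Y?
In other words: How can we test the null hypothesis $b = 0$?
We have estimated $b$ by $\hat b \neq 0$. Could the true $b$ be 0?
How large is the standard error of $\hat b$?

$$y_i = a + b\,x_i + \varepsilon_i \quad\text{with}\quad \varepsilon_i \sim N(0, \sigma^2)$$
not random: $a$, $b$, $x_i$, $\sigma^2$
random: $\varepsilon_i$, $y_i$
$$\mathrm{var}(y_i) = \mathrm{var}(a + b\,x_i + \varepsilon_i) = \mathrm{var}(\varepsilon_i) = \sigma^2$$
and $y_1, y_2, \dots, y_n$ are stochastically independent.

$$\hat b = \frac{\sum_i y_i\,(x_i - \bar x)}{\sum_i (x_i - \bar x)^2}$$

Because the $y_i$ are independent and the $x_i$ are not random,
$$\mathrm{var}(\hat b) = \mathrm{var}\!\left(\frac{\sum_i y_i\,(x_i - \bar x)}{\sum_i (x_i - \bar x)^2}\right) = \frac{\mathrm{var}\bigl(\sum_i y_i\,(x_i - \bar x)\bigr)}{\bigl(\sum_i (x_i - \bar x)^2\bigr)^2} = \frac{\sum_i \mathrm{var}(y_i)\,(x_i - \bar x)^2}{\bigl(\sum_i (x_i - \bar x)^2\bigr)^2} = \sigma^2\,\frac{\sum_i (x_i - \bar x)^2}{\bigl(\sum_i (x_i - \bar x)^2\bigr)^2} = \frac{\sigma^2}{\sum_i (x_i - \bar x)^2}$$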
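The variance formula derived above can be checked by simulation. The following R sketch is only an illustration under assumed values: the intercept, slope, error standard deviation and x values are hypothetical and not taken from the lecture data. It repeatedly draws new errors, re-estimates the slope with the closed-form expression, and compares the empirical variance of the estimates with $\sigma^2 / \sum_i (x_i - \bar x)^2$.

```r
# Monte Carlo check of var(b_hat) = sigma^2 / sum((x - mean(x))^2).
# All parameter values are hypothetical and chosen for illustration only.
set.seed(1)
a <- 2; b <- 0.5; sigma <- 1.5        # assumed true intercept, slope, error sd
x <- seq(1, 20, length.out = 25)      # fixed (non-random) x values

# Closed-form slope estimate: sum(y_i (x_i - xbar)) / sum((x_i - xbar)^2)
slope_hat <- function(x, y) sum(y * (x - mean(x))) / sum((x - mean(x))^2)

# Re-estimate the slope for many independent draws of the errors
b_hats <- replicate(20000, {
  y <- a + b * x + rnorm(length(x), mean = 0, sd = sigma)
  slope_hat(x, y)
})

theoretical_var <- sigma^2 / sum((x - mean(x))^2)
empirical_var <- var(b_hats)
print(c(theoretical = theoretical_var, empirical = empirical_var))

# On any single data set the closed-form estimate should agree with lm()
y <- a + b * x + rnorm(length(x), sd = sigma)
print(c(closed_form = slope_hat(x, y), lm = unname(coef(lm(y ~ x))["x"])))
```

The x values are kept fixed across replicates because the model treats them as non-random; only the errors are redrawn.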

In fact, $\hat b$ is normally distributed with mean $b$ and
$$\mathrm{var}(\hat b) = \frac{\sigma^2}{\sum_i (x_i - \bar x)^2}.$$

Problem: We do not know $\sigma^2$. We estimate $\sigma^2$ by the residual variance:
$$s^2 := \frac{\sum_i \bigl(y_i - \hat a - \hat b\,x_i\bigr)^2}{n - 2}$$
Note that we divide by $n - 2$. The reason is that two model parameters, $a$ and $b$, have been estimated, which means that two degrees of freedom are lost.

With $\sigma^2$ estimated by $s^2$, the statistic
$$\frac{\hat b - b}{s \big/ \sqrt{\sum_i (x_i - \bar x)^2}}$$
is Student-t-distributed with $n - 2$ degrees of freedom, and we can apply the t-test to test the null hypothesis $b = 0$.

3 log-scaling the data

Data example: typical body weight [kg] and brain weight [g] of 62 mammal species (and 3 dinosaurs)

> data
[Data table with columns weight.kg., brain.weight.g, species, extinct; the numeric values were lost in the transcription. Legible rows include african elephant, asian elephant and cat, all with extinct = no]

[Data table, continued: legible rows include chimpanzee (extinct = no), Triceratops (yes) and Brachiosaurus (yes)]

[Figure: "typical values for 65 vertebrate species"; brain weight [g] against body weight [kg], both axes log-scaled. Labeled points include african elephant, asian elephant, human, giraffe, horse, chimpanzee, cow, donkey, potar monkey, grey wolf, goat, kangaroo, cat, rabbit, mountain beaver, guinea pig, mole rat, hamster, rhesus monkey, sheep, jaguar, pig, mouse, Brachiosaurus, Triceratops and Diplodocus]

> modell <- lm(brain.weight.g ~ weight.kg., subset = extinct == "no")
> summary(modell)

Call:
lm(formula = brain.weight.g ~ weight.kg., subset = extinct == "no")

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]
Coefficients: [Estimate, Std. Error and t value were lost in the transcription; (Intercept) is marked *, weight.kg. has Pr(>|t|) <2e-16 and is marked ***]
Residual standard error: [value lost] on 60 degrees of freedom
F-statistic: [value lost] on 1 and 60 DF, p-value: < 2.2e-16

> qqnorm(modell$residuals)

[Figure: Normal Q-Q Plot of modell$residuals; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]

> plot(modell$fitted.values, modell$residuals)
> plot(modell$fitted.values, modell$residuals, log = "x")

[Figure: two panels showing modell$residuals against modell$fitted.values, with linear and with log-scaled x-axis]

> plot(modell$model$weight.kg., modell$residuals)
> plot(modell$model$weight.kg., modell$residuals, log = "x")

[Figure: two panels showing modell$residuals against modell$model$weight.kg., with linear and with log-scaled x-axis]

We see that the residual variance depends on the fitted values (or the body weight): heteroscedasticity.

The model assumes homoscedasticity, i.e. the random deviations must be (almost) independent of the explanatory traits (body weight) and of the fitted values.

Variance-stabilizing transformation: can we rescale body and brain size to make the deviations independent of the variables?

Actually this is not so surprising: An elephant's brain of typically 5 kg can easily be 500 g lighter or heavier from individual to individual. This cannot happen for a mouse brain of typically 5 g. The latter will rather also vary by 10%, i.e. 0.5 g. Thus, the random variation is not additive but rather multiplicative:

brain mass = (expected brain mass) · random

We can convert this into something with additive randomness by taking the log:

log(brain mass) = log(expected brain mass) + log(random)

> logmodell <- lm(log(brain.weight.g) ~ log(weight.kg.), subset = extinct == "no")
> summary(logmodell)

Call:
lm(formula = log(brain.weight.g) ~ log(weight.kg.), subset = extinct == "no")

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]

Coefficients: [Estimate, Std. Error and t value were lost in the transcription; (Intercept) and log(weight.kg.) both have Pr(>|t|) <2e-16 and are marked ***]
Residual standard error: [value lost] on 60 degrees of freedom
F-statistic: [value lost] on 1 and 60 DF, p-value: < 2.2e-16

> qqnorm(logmodell$residuals)

[Figure: Normal Q-Q Plot; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]

> plot(logmodell$fitted.values, logmodell$residuals)
> plot(logmodell$fitted.values, logmodell$residuals, log = "x")

[Figure: two panels showing logmodell$residuals against logmodell$fitted.values, with linear and with log-scaled x-axis]

> plot(weight.kg.[extinct == "no"], logmodell$residuals)
> plot(weight.kg.[extinct == "no"], logmodell$residuals, log = "x")

[Figure: two panels showing logmodell$residuals against weight.kg.[extinct == "no"], with linear and with log-scaled x-axis]

4 Checking model assumptions

Is the model appropriate for the data? For example,
$$Y = a + b\,X + \varepsilon \quad\text{with}\quad \varepsilon \sim N(0, \sigma^2)$$

If the model fits, the residuals
$$y_i - \bigl(\hat a + \hat b\,x_i\bigr)$$
must look normally distributed and must not show obvious dependencies on $X$ or on $\hat a + \hat b\,X$.

Example: is the relation between X and Y sufficiently well described by the linear equation $Y_i = a + b\,X_i + \varepsilon_i$?

[Figure: scatter plot of Y against X]

> mod <- lm(Y ~ X)
> summary(mod)

Call:
lm(formula = Y ~ X)

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]
Coefficients: [Estimate, Std. Error and t value were lost in the transcription; X has Pr(>|t|) <2e-16 and is marked ***]
Residual standard error: [value lost] on 28 degrees of freedom
F-statistic: [value lost] on 1 and 28 DF, p-value: < 2.2e-16

> plot(X, residuals(mod))

[Figure: residuals(mod) against X]

Obviously, the residuals tend to be larger for very large and very small values of X than for intermediate values of X. That should not be!

Idea: fit a section of a parabola instead of a line to the $(x_i, y_i)$, i.e. a model of the form
$$Y = a + b\,X + c\,X^2 + \varepsilon.$$
Is this still a linear model? Yes: Let $Z = X^2$, then Y is linear in X and Z.

In R:
> Z <- X^2
> mod2 <- lm(Y ~ X + Z)
> summary(mod2)

Call:
lm(formula = Y ~ X + Z)

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]
Coefficients: [Estimate, Std. Error and t value were lost in the transcription; (Intercept) <2e-16 ***, X marked *, Z <2e-16 ***]

Residual standard error: [value lost] on 27 degrees of freedom
F-statistic: 7776 on 2 and 27 DF, p-value: < 2.2e-16

For this model there is no obvious dependence between X and the residuals:

> plot(X, residuals(mod2))

[Figure: residuals(mod2) against X]

Is the assumption of normality in the model $Y_i = a + b\,X_i + \varepsilon_i$ in accordance with the data? Are the residuals $r_i = Y_i - (\hat a + \hat b\,X_i)$ more or less normally distributed?

Graphical methods: compare the theoretical quantiles of the standard normal distribution $N(0, 1)$ with those of the residuals.

Background: If we plot the quantiles of $N(\mu, \sigma^2)$ against those of $N(0, 1)$, we obtain a line $y(x) = \mu + \sigma\,x$. (Reason: If X is standard-normally distributed and $Y = a + b\,X$, then Y is normally distributed with mean $a$ and variance $b^2$.)

A common misconception: "Before we fit the model with lm() we first have to check whether the model assumptions are fulfilled." This order is not possible. To check the assumptions underlying a linear model we need the residuals. To compute the residuals we first have to fit the model (in R with lm()). After that we can check the model assumptions and decide whether we stay with this model or still have to modify it.

> p <- seq(from = 0, to = 1, by = 0.01)
> plot(qnorm(p, mean = 0, sd = 1), qnorm(p, mean = 1, sd = 0.5),

       pch = 16, cex = 0.5)
> abline(v = 0, h = 0)

[Figure: qnorm(p, mean = 1, sd = 0.5) plotted against qnorm(p, mean = 0, sd = 1); the points lie on a straight line]

If we plot the empirical quantiles of a sample from a normal distribution against the theoretical quantiles of a standard normal distribution, the values are not precisely on the line but are scattered around a line.

If no systematic deviations from an imaginary line are recognizable: the normal distribution assumption is acceptable.

If systematic deviations from an imaginary line are obvious: the assumption of normality may be problematic. It may be necessary to rescale variables or to take additional explanatory variables into account.

5 Linear regression example with scaling

Data: For 301 US-American counties, the number of white female inhabitants in a certain age group in 1960 and the number of deaths by breast cancer in this group between 1950 and [end year lost in the transcription]. (Rice (2007) Mathematical Statistics and Data Analysis.)

> canc
[Data table with columns deaths and inhabitants; the values were lost in the transcription]

Is the average number of deaths proportional to population size, i.e.
$$E(\text{deaths}) = b \cdot \text{inhabitants},$$
or does the cancer risk depend on the size of the county, such that a different model fits better? For example,
$$E(\text{deaths}) = a + b \cdot \text{inhabitants} \quad\text{with}\quad a \neq 0.$$

> modell <- lm(deaths ~ inhabitants, data = canc)
> summary(modell)

Call:
lm(formula = deaths ~ inhabitants, data = canc)

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]
Coefficients: [most values were lost in the transcription; the estimate for inhabitants has mantissa 3.578 (exponent lost) with Pr(>|t|) <2e-16, marked ***; (Intercept) carries no significance mark]
Residual standard error: 13 on 299 degrees of freedom
F-statistic: 4315 on 1 and 299 DF, p-value: < 2.2e-16

The intercept estimate (value lost in the transcription) is not significantly different from 0. Thus we cannot reject the null hypothesis that the county size has no influence on the cancer risk.
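The county data themselves are not part of this transcription, so the following R sketch uses a simulated stand-in to show how the test of the intercept can be read from a fitted model. The data frame `sim`, the assumed risk of 0.0035 and the range of county sizes are all hypothetical. The counts are generated with an expected value exactly proportional to the population size, so the true intercept is 0 by construction.

```r
# Hypothetical stand-in for the county data (the real values are not in the transcription).
set.seed(42)
n_counties <- 301
inhabitants <- round(exp(runif(n_counties, log(500), log(50000))))  # assumed county sizes
risk <- 0.0035                           # assumed death probability per inhabitant
sim <- data.frame(inhabitants = inhabitants,
                  deaths = rpois(n_counties, lambda = risk * inhabitants))

fit <- lm(deaths ~ inhabitants, data = sim)

# Coefficient table: one row per coefficient,
# columns Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(fit))
print(coef_table)

# p-value of the t-test for the null hypothesis "intercept = 0"
p_intercept <- coef_table["(Intercept)", "Pr(>|t|)"]
# p-value of the t-test for the null hypothesis "slope = 0"
p_slope <- coef_table["inhabitants", "Pr(>|t|)"]
print(c(p_intercept = p_intercept, p_slope = p_slope))
```

A large p-value for the intercept only means that the data are compatible with a = 0; it does not prove proportionality, and it relies on the model assumptions being acceptable.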

But... does the model fit?

> qqnorm(modell$residuals)

[Figure: Normal Q-Q Plot; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]

> plot(modell$fitted.values, modell$residuals)
> plot(modell$fitted.values, modell$residuals, log = "x")

[Figure: two panels showing modell$residuals against modell$fitted.values, with linear and with log-scaled x-axis]

> plot(canc$inhabitants, modell$residuals, log = "x")

[Figure: modell$residuals against canc$inhabitants, log-scaled x-axis]

The variance of the residuals depends on the fitted values: heteroscedasticity. The linear model assumes homoscedasticity.

Variance-stabilizing transformation: How can we rescale the population size such that we obtain homoscedastic data?

Where does the variance come from? If $n$ is the number of white female inhabitants and $p$ the individual probability to die of breast cancer within 10 years, then $n\,p$ is the expected number of deaths and the variance is $n\,p\,(1 - p) \approx n\,p$ (one may approximate the binomial by a Poisson distribution). Standard deviation: $\sqrt{n\,p}$.

In this case we can approximately stabilize the variance by taking the square root on both sides of the equation.

Explanation:
$$\sqrt{y} = b\,\sqrt{x} + \varepsilon$$
$$\Rightarrow\quad y = \bigl(b\,\sqrt{x} + \varepsilon\bigr)^2 = b^2\,x + 2\,b\,\sqrt{x}\,\varepsilon + \varepsilon^2$$
The SD of $y$ is not exactly proportional to $\sqrt{x}$, but at least $2\,b\,\sqrt{x}\,\varepsilon$ has an SD proportional to $\sqrt{x}$, namely $2\,b\,\sqrt{x}\,\sigma$. The term $\varepsilon^2$ is the $\sigma^2$-fold of a $\chi^2_1$-distributed random variable and has SD $= \sigma^2 \sqrt{2}$. If $\sigma$ is small compared to $b\,\sqrt{x}$, the approximation
$$y \approx b^2\,x + 2\,b\,\sqrt{x}\,\varepsilon$$
is reasonable and the SD of $y$ is approximately proportional to $\sqrt{x}$.
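The effect of the square root can be illustrated with simulated counts. This R sketch assumes Poisson-distributed death counts, following the approximation mentioned above; the expected counts in `lambda` are hypothetical. It compares the standard deviation of the raw counts, which grows like the square root of the expected value, with the standard deviation of the square-rooted counts, which stays roughly constant.

```r
# Sketch: the square root approximately stabilizes the variance of Poisson counts.
# The expected counts below are hypothetical.
set.seed(7)
lambda <- c(10, 40, 160, 640)   # assumed expected numbers of deaths (n * p)
reps <- 50000                   # simulated counties per expected value

# SD of the raw counts: close to sqrt(lambda), so it depends on the mean
sd_raw <- sapply(lambda, function(l) sd(rpois(reps, l)))

# SD of the square-rooted counts: roughly constant across lambda
sd_sqrt <- sapply(lambda, function(l) sd(sqrt(rpois(reps, l))))

print(data.frame(lambda, sd_raw, sqrt_lambda = sqrt(lambda), sd_sqrt))
```

For small expected counts the stabilization is less accurate, which is one reason why the residual plots should still be inspected after the transformation.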

> modellsq <- lm(sqrt(deaths) ~ sqrt(inhabitants), data = canc)
> summary(modellsq)

Call:
lm(formula = sqrt(deaths) ~ sqrt(inhabitants), data = canc)

Residuals: [Min, 1Q, Median, 3Q, Max; values lost in the transcription]
Coefficients: [Estimate, Std. Error and t value were lost in the transcription; sqrt(inhabitants) has Pr(>|t|) <2e-16 and is marked ***; (Intercept) carries no significance mark]
Residual standard error: [value lost] on 299 degrees of freedom
F-statistic: 4051 on 1 and 299 DF, p-value: < 2.2e-16

> qqnorm(modellsq$residuals)

[Figure: Normal Q-Q Plot; x-axis: Theoretical Quantiles, y-axis: Sample Quantiles]

> plot(modellsq$fitted.values, modellsq$residuals, log = "x")
> plot(canc$inhabitants, modell [command truncated in the transcription]

[Figure: two panels showing modellsq$residuals against modellsq$fitted.values and against canc$inhabitants, both with log-scaled x-axis]

The qqnorm plot is not perfect, but at least the variance is stabilized.

The result remains the same: no significant relation between county size and breast cancer death risk.

6 Why it is called regression

Origin of the word "regression"

Sir Francis Galton: regression toward the mean.

Tall fathers tend to have sons that are slightly smaller than the fathers.

Sons of small fathers are on average larger than their fathers.

[Figure: three scatter plots of the sons' body heights against the fathers' body heights]
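Regression toward the mean can be reproduced with simulated heights. The following R sketch does not use Galton's data: the mean, the standard deviation and the father-son correlation are hypothetical values, and fathers and sons are assumed to have the same height distribution. Under these assumptions the regression slope is close to the correlation and therefore below 1, which is exactly what pulls the sons of extreme fathers toward the average.

```r
# Sketch of regression toward the mean with simulated father/son heights.
# Mean, sd and correlation are hypothetical values, not Galton's data.
set.seed(3)
n <- 10000
mu <- 175; s <- 7; rho <- 0.5     # assumed mean [cm], sd [cm], father-son correlation

father <- rnorm(n, mean = mu, sd = s)
# Sons have the same mean and sd as fathers and correlation rho with them
son <- mu + rho * (father - mu) + rnorm(n, mean = 0, sd = s * sqrt(1 - rho^2))

fit <- lm(son ~ father)
print(coef(fit))                  # slope close to rho, i.e. clearly below 1

tall  <- father > mu + s          # fathers more than one sd above the average
small <- father < mu - s          # fathers more than one sd below the average
print(c(tall_fathers = mean(father[tall]), their_sons = mean(son[tall])))
print(c(small_fathers = mean(father[small]), their_sons = mean(son[small])))
```

The sons of tall fathers are still taller than average, but on average less extreme than their fathers; the same holds in the other direction for small fathers.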

Similar effects
In sports: The champion of the season will tend to fall short of the high expectations in the next year.
In school: If the worst 10% of the students get extra lessons and are not the worst 10% in the next year, then this does not prove that the extra lessons are useful.

Some of what you should be able to explain
Model assumptions underlying linear regression
    Equation
    What is random, what is fixed?
Approach: minimize the sum of squared residuals
    optimal solution for slope and intercept
    slope vs. correlation
t-test for the slope (standard error, test statistic and df)
Scaling the data: when, why, how?
qqnorm plots
    theory
    how to use them to judge model assumptions
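As a recap of several items on this list, the following R sketch computes the least-squares estimates from Theorem 1, the residual variance with n - 2 degrees of freedom, the standard error of the slope and the t-test by hand, and compares the result with the output of lm(). The data are simulated and all parameter values are hypothetical.

```r
# Recap sketch: least-squares estimates and the t-test for the slope, computed by hand.
# The data are simulated; all parameter values are hypothetical.
set.seed(11)
n <- 30
x <- runif(n, 0, 10)
y <- 1 + 0.8 * x + rnorm(n, sd = 2)

sxx <- sum((x - mean(x))^2)
b_hat <- sum((y - mean(y)) * (x - mean(x))) / sxx   # slope (Theorem 1)
a_hat <- mean(y) - b_hat * mean(x)                  # intercept (Theorem 1)

res <- y - (a_hat + b_hat * x)                      # residuals
s2 <- sum(res^2) / (n - 2)                          # residual variance, n - 2 df
se_b <- sqrt(s2 / sxx)                              # standard error of the slope
t_stat <- b_hat / se_b                              # test statistic for H0: b = 0
p_value <- 2 * pt(-abs(t_stat), df = n - 2)         # two-sided p-value

# Slope vs. correlation: b_hat = cov(x, y) / s_x^2 = cor(x, y) * s_y / s_x
slope_from_cor <- cor(x, y) * sd(y) / sd(x)

# Compare with the row for x in the coefficient table of lm()
fit_table <- coef(summary(lm(y ~ x)))
print(rbind(by_hand = c(b_hat, se_b, t_stat, p_value), lm = fit_table["x", ]))
```

The two rows should agree up to floating-point error, since summary() of an lm fit reports the same t-statistic with n - 2 degrees of freedom for a model with one explanatory variable.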


More information

STAT 3340 Assignment 1 solutions. 1. Find the equation of the line which passes through the points (1,1) and (4,5).

STAT 3340 Assignment 1 solutions. 1. Find the equation of the line which passes through the points (1,1) and (4,5). (out of 15 ponts) STAT 3340 Assgnment 1 solutons (10) (10) 1. Fnd the equaton of the lne whch passes through the ponts (1,1) and (4,5). β 1 = (5 1)/(4 1) = 4/3 equaton for the lne s y y 0 = β 1 (x x 0

More information

Statistics Chapter 4

Statistics Chapter 4 Statstcs Chapter 4 "There are three knds of les: les, damned les, and statstcs." Benjamn Dsrael, 1895 (Brtsh statesman) Gaussan Dstrbuton, 4-1 If a measurement s repeated many tmes a statstcal treatment

More information

Tests of Exclusion Restrictions on Regression Coefficients: Formulation and Interpretation

Tests of Exclusion Restrictions on Regression Coefficients: Formulation and Interpretation ECONOMICS 5* -- NOTE 6 ECON 5* -- NOTE 6 Tests of Excluson Restrctons on Regresson Coeffcents: Formulaton and Interpretaton The populaton regresson equaton (PRE) for the general multple lnear regresson

More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng

More information

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the

Linear regression. Regression Models. Chapter 11 Student Lecture Notes Regression Analysis is the Chapter 11 Student Lecture Notes 11-1 Lnear regresson Wenl lu Dept. Health statstcs School of publc health Tanjn medcal unversty 1 Regresson Models 1. Answer What Is the Relatonshp Between the Varables?.

More information

Answers Problem Set 2 Chem 314A Williamsen Spring 2000

Answers Problem Set 2 Chem 314A Williamsen Spring 2000 Answers Problem Set Chem 314A Wllamsen Sprng 000 1) Gve me the followng crtcal values from the statstcal tables. a) z-statstc,-sded test, 99.7% confdence lmt ±3 b) t-statstc (Case I), 1-sded test, 95%

More information

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management

Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 212. Chapters 14, 15 & 16. Professor Ahmadi, Ph.D. Department of Management Lecture Notes for STATISTICAL METHODS FOR BUSINESS II BMGT 1 Chapters 14, 15 & 16 Professor Ahmad, Ph.D. Department of Management Revsed August 005 Chapter 14 Formulas Smple Lnear Regresson Model: y =

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

Chapter 15 Student Lecture Notes 15-1

Chapter 15 Student Lecture Notes 15-1 Chapter 15 Student Lecture Notes 15-1 Basc Busness Statstcs (9 th Edton) Chapter 15 Multple Regresson Model Buldng 004 Prentce-Hall, Inc. Chap 15-1 Chapter Topcs The Quadratc Regresson Model Usng Transformatons

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

Lecture 6 More on Complete Randomized Block Design (RBD)

Lecture 6 More on Complete Randomized Block Design (RBD) Lecture 6 More on Complete Randomzed Block Desgn (RBD) Multple test Multple test The multple comparsons or multple testng problem occurs when one consders a set of statstcal nferences smultaneously. For

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected.

ANSWERS CHAPTER 9. TIO 9.2: If the values are the same, the difference is 0, therefore the null hypothesis cannot be rejected. ANSWERS CHAPTER 9 THINK IT OVER thnk t over TIO 9.: χ 2 k = ( f e ) = 0 e Breakng the equaton down: the test statstc for the ch-squared dstrbuton s equal to the sum over all categores of the expected frequency

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Exam. Econometrics - Exam 1

Exam. Econometrics - Exam 1 Econometrcs - Exam 1 Exam Problem 1: (15 ponts) Suppose that the classcal regresson model apples but that the true value of the constant s zero. In order to answer the followng questons assume just one

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting.

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting. The Practce of Statstcs, nd ed. Chapter 14 Inference for Regresson Introducton In chapter 3 we used a least-squares regresson lne (LSRL) to represent a lnear relatonshp etween two quanttatve explanator

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Polynomial Regression Models

Polynomial Regression Models LINEAR REGRESSION ANALYSIS MODULE XII Lecture - 6 Polynomal Regresson Models Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Test of sgnfcance To test the sgnfcance

More information

Lecture 3 Stat102, Spring 2007

Lecture 3 Stat102, Spring 2007 Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

Systematic Error Illustration of Bias. Sources of Systematic Errors. Effects of Systematic Errors 9/23/2009. Instrument Errors Method Errors Personal

Systematic Error Illustration of Bias. Sources of Systematic Errors. Effects of Systematic Errors 9/23/2009. Instrument Errors Method Errors Personal 9/3/009 Sstematc Error Illustraton of Bas Sources of Sstematc Errors Instrument Errors Method Errors Personal Prejudce Preconceved noton of true value umber bas Prefer 0/5 Small over large Even over odd

More information

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University PHYS 45 Sprng semester 7 Lecture : Dealng wth Expermental Uncertantes Ron Refenberger Brck anotechnology Center Purdue Unversty Lecture Introductory Comments Expermental errors (really expermental uncertantes)

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

CHAPER 11: HETEROSCEDASTICITY: WHAT HAPPENS WHEN ERROR VARIANCE IS NONCONSTANT?

CHAPER 11: HETEROSCEDASTICITY: WHAT HAPPENS WHEN ERROR VARIANCE IS NONCONSTANT? Basc Econometrcs, Gujarat and Porter CHAPER 11: HETEROSCEDASTICITY: WHAT HAPPENS WHEN ERROR VARIANCE IS NONCONSTANT? 11.1 (a) False. The estmators are unbased but are neffcent. (b) True. See Sec. 11.4

More information

Correlation and Regression

Correlation and Regression Correlaton and Regresson otes prepared by Pamela Peterson Drake Index Basc terms and concepts... Smple regresson...5 Multple Regresson...3 Regresson termnology...0 Regresson formulas... Basc terms and

More information

Biostatistics. Chapter 11 Simple Linear Correlation and Regression. Jing Li

Biostatistics. Chapter 11 Simple Linear Correlation and Regression. Jing Li Bostatstcs Chapter 11 Smple Lnear Correlaton and Regresson Jng L jng.l@sjtu.edu.cn http://cbb.sjtu.edu.cn/~jngl/courses/2018fall/b372/ Dept of Bonformatcs & Bostatstcs, SJTU Recall eat chocolate Cell 175,

More information

Explaining the Stein Paradox

Explaining the Stein Paradox Explanng the Sten Paradox Kwong Hu Yung 1999/06/10 Abstract Ths report offers several ratonale for the Sten paradox. Sectons 1 and defnes the multvarate normal mean estmaton problem and ntroduces Sten

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experments- MODULE LECTURE - 6 EXPERMENTAL DESGN MODELS Dr. Shalabh Department of Mathematcs and Statstcs ndan nsttute of Technology Kanpur Two-way classfcaton wth nteractons

More information

4.1. Lecture 4: Fitting distributions: goodness of fit. Goodness of fit: the underlying principle

4.1. Lecture 4: Fitting distributions: goodness of fit. Goodness of fit: the underlying principle Lecture 4: Fttng dstrbutons: goodness of ft Goodness of ft Testng goodness of ft Testng normalty An mportant note on testng normalty! L4.1 Goodness of ft measures the extent to whch some emprcal dstrbuton

More information

Two-factor model. Statistical Models. Least Squares estimation in LM two-factor model. Rats

Two-factor model. Statistical Models. Least Squares estimation in LM two-factor model. Rats tatstcal Models Lecture nalyss of Varance wo-factor model Overall mean Man effect of factor at level Man effect of factor at level Y µ + α + β + γ + ε Eε f (, ( l, Cov( ε, ε ) lmr f (, nteracton effect

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

Lab 4: Two-level Random Intercept Model

Lab 4: Two-level Random Intercept Model BIO 656 Lab4 009 Lab 4: Two-level Random Intercept Model Data: Peak expratory flow rate (pefr) measured twce, usng two dfferent nstruments, for 17 subjects. (from Chapter 1 of Multlevel and Longtudnal

More information

Topic 7: Analysis of Variance

Topic 7: Analysis of Variance Topc 7: Analyss of Varance Outlne Parttonng sums of squares Breakdown the degrees of freedom Expected mean squares (EMS) F test ANOVA table General lnear test Pearson Correlaton / R 2 Analyss of Varance

More information

Some basic statistics and curve fitting techniques

Some basic statistics and curve fitting techniques Some basc statstcs and curve fttng technques Statstcs s the dscplne concerned wth the study of varablty, wth the study of uncertanty, and wth the study of decsonmakng n the face of uncertanty (Lndsay et

More information

β0 + β1xi. You are interested in estimating the unknown parameters β

β0 + β1xi. You are interested in estimating the unknown parameters β Ordnary Least Squares (OLS): Smple Lnear Regresson (SLR) Analytcs The SLR Setup Sample Statstcs Ordnary Least Squares (OLS): FOCs and SOCs Back to OLS and Sample Statstcs Predctons (and Resduals) wth OLS

More information

7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA

7.1. Single classification analysis of variance (ANOVA) Why not use multiple 2-sample 2. When to use ANOVA Sngle classfcaton analyss of varance (ANOVA) When to use ANOVA ANOVA models and parttonng sums of squares ANOVA: hypothess testng ANOVA: assumptons A non-parametrc alternatve: Kruskal-Walls ANOVA Power

More information

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis

NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION MTH352/MH3510 Regression Analysis NANYANG TECHNOLOGICAL UNIVERSITY SEMESTER I EXAMINATION 014-015 MTH35/MH3510 Regresson Analyss December 014 TIME ALLOWED: HOURS INSTRUCTIONS TO CANDIDATES 1. Ths examnaton paper contans FOUR (4) questons

More information

Properties of Least Squares

Properties of Least Squares Week 3 3.1 Smple Lnear Regresson Model 3. Propertes of Least Squares Estmators Y Y β 1 + β X + u weekly famly expendtures X weekly famly ncome For a gven level of x, the expected level of food expendtures

More information

( )( ) [ ] [ ] ( ) 1 = [ ] = ( ) 1. H = X X X X is called the hat matrix ( it puts the hats on the Y s) and is of order n n H = X X X X.

( )( ) [ ] [ ] ( ) 1 = [ ] = ( ) 1. H = X X X X is called the hat matrix ( it puts the hats on the Y s) and is of order n n H = X X X X. ( ) ( ) where ( ) 1 ˆ β = X X X X β + ε = β + Aε A = X X 1 X [ ] E ˆ β β AE ε β so ˆ = + = β s unbased ( )( ) [ ] ˆ Cov β = E ˆ β β ˆ β β = E Aεε A AE ε ε A Aσ IA = σ AA = σ X X = [ ] = ( ) 1 Ftted values

More information

T E C O L O T E R E S E A R C H, I N C.

T E C O L O T E R E S E A R C H, I N C. T E C O L O T E R E S E A R C H, I N C. B rdg n g En g neern g a nd Econo mcs S nce 1973 THE MINIMUM-UNBIASED-PERCENTAGE ERROR (MUPE) METHOD IN CER DEVELOPMENT Thrd Jont Annual ISPA/SCEA Internatonal Conference

More information

Chapter 11: I = 2 samples independent samples paired samples Chapter 12: I 3 samples of equal size J one-way layout two-way layout

Chapter 11: I = 2 samples independent samples paired samples Chapter 12: I 3 samples of equal size J one-way layout two-way layout Serk Sagtov, Chalmers and GU, February 0, 018 Chapter 1. Analyss of varance Chapter 11: I = samples ndependent samples pared samples Chapter 1: I 3 samples of equal sze one-way layout two-way layout 1

More information

Learning Objectives for Chapter 11

Learning Objectives for Chapter 11 Chapter : Lnear Regresson and Correlaton Methods Hldebrand, Ott and Gray Basc Statstcal Ideas for Managers Second Edton Learnng Objectves for Chapter Usng the scatterplot n regresson analyss Usng the method

More information

Laboratory 1c: Method of Least Squares

Laboratory 1c: Method of Least Squares Lab 1c, Least Squares Laboratory 1c: Method of Least Squares Introducton Consder the graph of expermental data n Fgure 1. In ths experment x s the ndependent varable and y the dependent varable. Clearly

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information