Statistics for EES 7. Linear regression and linear models


1 Statistics for EES 7. Linear regression and linear models Dirk Metzler May 2009

2 Contents 1 Univariate linear regression: how and why? 2 t-test for linear regression 3 log-scaling the data


4 Univariate linear regression: how and why? Griffon Vulture Gyps fulvus, German: Gänsegeier. Photo (c) by Jörg Hempel

5 Univariate linear regression: how and why?
Prinzinger, R., E. Karl, R. Bögel, Ch. Walzer (1999): Energy metabolism, body temperature, and cardiac work in the Griffon vulture Gyps fulvus - telemetric investigations in the laboratory and in the field. Zoology 102, Suppl. II: 15
Data from Goethe-University, Group of Prof. Prinzinger
Developed a telemetric system for measuring heart beats of flying birds
Important for ecological questions: metabolic rate.
Metabolic rate can only be measured in the lab.
Can we infer metabolic rate from heart beat frequency?

9 Univariate linear regression: how and why?
[Scatter plot: metabolic rate [J/(g*h)] against heart beats [per minute], griffon vulture at 16 degrees C]

11 Univariate linear regression: how and why?
[Table: vulture data with columns day, heartbpm, metabol, mintemp, maxtemp, medtemp; measurements from 14 different days]

12 Univariate linear regression: how and why?
> model <- lm(metabol ~ heartbpm, data = vulture, subset = day == "17.05.")
> summary(model)

Call:
lm(formula = metabol ~ heartbpm, data = vulture, subset = day == "17.05.")

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                               e-08 ***
heartbpm                                  e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error:  on 17 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic:  on 1 and 17 DF, p-value: 2.979e-14
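A minimal R sketch of how such a fit can be visualized (assuming the vulture data frame shown above is loaded, with the variable names used in the output):

# fit the model for one day and draw the data together with the regression line
d <- subset(vulture, day == "17.05.")
plot(d$heartbpm, d$metabol,
     xlab = "heart beats [per minute]",
     ylab = "metabolic rate [J/(g*h)]")
model <- lm(metabol ~ heartbpm, data = d)
abline(model)    # adds the fitted line y = a.hat + b.hat * x to the plot
coef(model)      # shows the estimated intercept and slope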


15 Univariate linear regression: how and why?
[Figure: three points (x_1, y_1), (x_2, y_2), (x_3, y_3) and a straight line y = a + b·x; a is the intercept and the slope is b = (y_2 − y_1)/(x_2 − x_1)]

20 Univariate linear regression: how and why?
[Figure: scatter plot with a candidate line; the vertical distances r_1, r_2, ..., r_n between the points and the line are the residuals]
residuals: r_i = y_i − (a + b·x_i)
the line must minimize the sum of squared residuals r_1² + r_2² + ... + r_n²

25 Univariate linear regression: how and why?
Define the regression line y = â + b̂·x by minimizing the sum of squared residuals:
(â, b̂) = argmin_(a,b) Σ_i ( y_i − (a + b·x_i) )²
This is based on the model assumption that values a, b exist such that, for all data points (x_i, y_i), we have
y_i = a + b·x_i + ε_i,
where all ε_i are independent and normally distributed with the same variance σ².
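To make the arg-min definition concrete, here is a small R sketch (with simulated data, not the vulture measurements) that minimizes the sum of squared residuals numerically and compares the result with lm():

set.seed(1)
x <- 1:20
y <- 3 + 0.5 * x + rnorm(20)                        # simulated data with a = 3, b = 0.5
rss <- function(p) sum((y - (p[1] + p[2] * x))^2)   # sum of squared residuals
optim(c(0, 0), rss)$par                             # numerical minimizer (a.hat, b.hat)
coef(lm(y ~ x))                                     # lm() gives (essentially) the same values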

26 Univariate linear regression: how and why?
Given data:
Y    X
y_1  x_1
y_2  x_2
y_3  x_3
...  ...
y_n  x_n

Model: there are values a, b, σ² such that
y_1 = a + b·x_1 + ε_1
y_2 = a + b·x_2 + ε_2
y_3 = a + b·x_3 + ε_3
...
y_n = a + b·x_n + ε_n

ε_1, ε_2, ..., ε_n are independent N(0, σ²).
Hence y_1, y_2, ..., y_n are independent, with y_i ~ N(a + b·x_i, σ²).
a, b, σ² are unknown, but not random.

31 Univariate linear regression: how and why?
We estimate a and b by computing
(â, b̂) := argmin_(a,b) Σ_i ( y_i − (a + b·x_i) )².

Theorem. â and b̂ are given by
b̂ = Σ_i (y_i − ȳ)(x_i − x̄) / Σ_i (x_i − x̄)² = Σ_i y_i (x_i − x̄) / Σ_i (x_i − x̄)²
and
â = ȳ − b̂·x̄.

Please keep in mind: the line y = â + b̂·x goes through the center of gravity of the cloud of points (x_1, y_1), (x_2, y_2), ..., (x_n, y_n).
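The formulas of the theorem can be checked directly in R; a sketch with the same kind of simulated data as above (x and y are generated again so the lines are self-contained):

set.seed(1); x <- 1:20; y <- 3 + 0.5 * x + rnorm(20)
b.hat <- sum((y - mean(y)) * (x - mean(x))) / sum((x - mean(x))^2)
a.hat <- mean(y) - b.hat * mean(x)
c(a.hat, b.hat)                        # agrees with coef(lm(y ~ x))
a.hat + b.hat * mean(x) - mean(y)      # 0: the line passes through (x.bar, y.bar)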

34 Univariate linear regression: how and why?
Sketch of the proof of the theorem: let g(a, b) = Σ_i ( y_i − (a + b·x_i) )². We optimize g by setting the derivatives of g to 0:
∂g(a, b)/∂a = Σ_i 2·( y_i − (a + b·x_i) )·(−1)
∂g(a, b)/∂b = Σ_i 2·( y_i − (a + b·x_i) )·(−x_i)
and obtain
0 = Σ_i ( y_i − (â + b̂·x_i) )·(−1)
0 = Σ_i ( y_i − (â + b̂·x_i) )·(−x_i)

35 Univariate linear regression: how and why?
0 = Σ_i ( y_i − (â + b̂·x_i) )
0 = Σ_i ( y_i − (â + b̂·x_i) )·x_i
gives us
0 = ( Σ_i y_i ) − n·â − b̂·( Σ_i x_i )
0 = ( Σ_i y_i·x_i ) − â·( Σ_i x_i ) − b̂·( Σ_i x_i² )
and the theorem follows by solving this for â and b̂.
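The two equations above form a linear 2x2 system in (a, b); as an illustration (again with simulated data), it can be solved directly and gives the same coefficients as lm():

set.seed(1); x <- 1:20; y <- 3 + 0.5 * x + rnorm(20)
n <- length(x)
A   <- rbind(c(n,      sum(x)),
             c(sum(x), sum(x^2)))    # coefficient matrix of the normal equations
rhs <- c(sum(y), sum(x * y))
solve(A, rhs)                        # (a.hat, b.hat), same values as coef(lm(y ~ x))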



38 Univariate linear regression: how and why? Optimizing clutch size
Example: Cowpea weevil (also bruchid beetle), Callosobruchus maculatus, German: Erbsensamenkäfer
Wilson, K. (1994) Evolution of clutch size in insects. II. A test of static optimality models using the beetle Callosobruchus maculatus (Coleoptera: Bruchidae). Journal of Evolutionary Biology 7
How does survival probability depend on clutch size?
Which clutch size optimizes the expected number of surviving offspring?

40 Univariate linear regression: how and why?
[Figure: viability plotted against clutchsize]

42 Univariate linear regression: how and why?
[Figure: clutchsize * viability, the expected number of surviving offspring, plotted against clutchsize]
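As a hedged sketch of how the second question could be answered in R: assume a data frame clutch with columns clutchsize and viability (these names are hypothetical; the original data are not reproduced here). If viability declines roughly linearly with clutch size, the expected number of surviving offspring clutchsize * viability is a downward parabola:

fit <- lm(viability ~ clutchsize, data = clutch)   # viability = a + b * clutchsize
a <- coef(fit)[1]
b <- coef(fit)[2]                                  # b is expected to be negative
# expected surviving offspring: x * (a + b*x), maximal at x = -a / (2*b) when b < 0
optimal.clutchsize <- -a / (2 * b)
optimal.clutchsize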

44 Contents t-test for linear regression 1 Univariate linear regression: how and why? 2 t-test for linear regression 3 log-scaling the data

45 t-test for linear regression
Example: red deer (Cervus elaphus)
Theory: females can influence the sex of their offspring.
Evolutionarily stable strategy: weak animals may tend to have female offspring, strong animals may tend to have male offspring.
Clutton-Brock, T. H., Albon, S. D., Guinness, F. E. (1986) Great expectations: dominance, breeding success and offspring sex ratios in red deer. Anim. Behav. 34

47 t-test for linear regression
> hind
[Table: hind data with columns rank and ratiomales]
CAUTION: Simulated data, inspired by original paper

48 t-test for linear regression
[Scatter plot: hind$ratiomales against hind$rank]

50 t-test for linear regression
> mod <- lm(ratiomales ~ rank, data = hind)
> summary(mod)

Call:
lm(formula = ratiomales ~ rank, data = hind)

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                               e-06 ***
rank                                      e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error:  on 52 degrees of freedom
Multiple R-squared: , Adjusted R-squared:

51 t-test for linear regression
Model: Y = a + b·X + ε with ε ~ N(0, σ²)
How to compute the significance of a relationship between the explanatory trait X and the target variable Y?
In other words: how can we test the null hypothesis b = 0?
We have estimated b by b̂ ≠ 0. Could the true b be 0?
How large is the standard error of b̂?

56 t-test for linear regression
y_i = a + b·x_i + ε_i with ε_i ~ N(0, σ²)
not random: a, b, x_i, σ²      random: ε_i, y_i
var(y_i) = var(a + b·x_i + ε_i) = var(ε_i) = σ², and y_1, y_2, ..., y_n are stochastically independent.

b̂ = Σ_i y_i (x_i − x̄) / Σ_i (x_i − x̄)²

var(b̂) = var( Σ_i y_i (x_i − x̄) / Σ_i (x_i − x̄)² )
       = var( Σ_i y_i (x_i − x̄) ) / ( Σ_i (x_i − x̄)² )²
       = Σ_i var(y_i)·(x_i − x̄)² / ( Σ_i (x_i − x̄)² )²
       = σ²·Σ_i (x_i − x̄)² / ( Σ_i (x_i − x̄)² )²
       = σ² / Σ_i (x_i − x̄)²
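The formula var(b̂) = σ² / Σ_i (x_i − x̄)² can be checked by simulation; a small R sketch (arbitrary true values for a, b and σ):

set.seed(1)
x <- 1:20; a <- 3; b <- 0.5; sigma <- 2
b.hats <- replicate(10000, {
  y <- a + b * x + rnorm(length(x), sd = sigma)
  sum(y * (x - mean(x))) / sum((x - mean(x))^2)   # the estimator b.hat
})
var(b.hats)                         # empirical variance over many simulated data sets
sigma^2 / sum((x - mean(x))^2)      # theoretical value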

61 t-test for linear regression
In fact, b̂ is normally distributed with mean b and
var(b̂) = σ² / Σ_i (x_i − x̄)².
Problem: we do not know σ².
We estimate σ² by considering the residual variance:
s² := Σ_i ( y_i − â − b̂·x_i )² / (n − 2)
Note that we divide by n − 2. The reason for this is that the two model parameters a and b have been estimated, which means that two degrees of freedom got lost.

65 t-test for linear regression
var(b̂) = σ² / Σ_i (x_i − x̄)²
Estimate σ² by
s² = Σ_i ( y_i − â − b̂·x_i )² / (n − 2).
Then
( b̂ − b ) / ( s / sqrt( Σ_i (x_i − x̄)² ) )
is Student-t-distributed with n − 2 degrees of freedom, and we can apply the t-test to test the null hypothesis b = 0.
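A sketch of the whole calculation in R (simulated data again, so the numbers themselves are arbitrary); the hand-computed t value and p-value match the slope row of summary(lm(...)):

set.seed(1); x <- 1:20; y <- 3 + 0.5 * x + rnorm(20)
fit   <- lm(y ~ x)
a.hat <- coef(fit)[1]; b.hat <- coef(fit)[2]
s2    <- sum((y - a.hat - b.hat * x)^2) / (length(x) - 2)   # residual variance s^2
se.b  <- sqrt(s2 / sum((x - mean(x))^2))                    # standard error of b.hat
t.val <- b.hat / se.b
p.val <- 2 * pt(-abs(t.val), df = length(x) - 2)
c(t.val, p.val)
summary(fit)$coefficients["x", ]    # same t value and Pr(>|t|)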

66 Contents log-scaling the data 1 Univariate linear regression: how and why? 2 t-test for linear regression 3 log-scaling the data

67 log-scaling the data
Data example: typical body weight [kg] and brain weight [g] of 62 mammal species (and 3 dinosaurs)
> data
[Table: columns weight.kg., brain.weight.g, species, extinct; rows include african elephant, asian elephant, cat, chimpanzee, ...]

68 log-scaling the data
[Scatter plot: brain weight [g] against body weight [kg], typical values for 62 mammal species]

69 log-scaling the data
[The same data on log-log axes for 65 species, now including Triceratops, Diplodocus and Brachiosaurus, with species such as mouse, human, horse and african elephant labelled]

70 log-scaling the data
[Log-log scatter plot: brain weight [g] against body weight [kg]]

71 log-scaling the data
> modell <- lm(brain.weight.g ~ weight.kg., subset = extinct == "no")
> summary(modell)

Call:
lm(formula = brain.weight.g ~ weight.kg., subset = extinct == "no")

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)                                      *
weight.kg.                                <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error:  on 60 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic:  on 1 and 60 DF, p-value: < 2.2e-16

72 log-scaling the data
qqnorm(modell$residuals)
[Normal Q-Q plot of the residuals: sample quantiles against theoretical quantiles]

73 log-scaling the data
plot(modell$fitted.values, modell$residuals)
[Scatter plot: residuals against fitted values]

74 log-scaling the data
plot(modell$fitted.values, modell$residuals, log = "x")
[Scatter plot: residuals against fitted values, x-axis on log scale]

75 log-scaling the data
plot(modell$model$weight.kg., modell$residuals)
[Scatter plot: residuals against body weight]

76 log-scaling the data
plot(modell$model$weight.kg., modell$residuals, log = "x")
[Scatter plot: residuals against body weight, x-axis on log scale]
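Instead of producing these residual plots one by one, R's built-in diagnostics for lm objects can be used; a minimal sketch, assuming the fitted model object modell from above:

par(mfrow = c(2, 2))   # 2 x 2 grid of diagnostic plots
plot(modell)           # residuals vs fitted, normal Q-Q, scale-location, leverage
par(mfrow = c(1, 1))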

77 log-scaling the data
We see that the residual variance depends on the fitted values (or on the body weight): heteroscedasticity.
The model assumes homoscedasticity, i.e. the random deviations must be (almost) independent of the explaining traits (body weight) and of the fitted values.
Variance-stabilizing transformation: we can rescale body and brain size to make the deviations independent of the variables.

80 log-scaling the data
Actually this is not so surprising: an elephant's brain of typically 5 kg can easily be 500 g lighter or heavier from individual to individual. This cannot happen for a mouse brain of typically 5 g. The latter will rather also vary by 10%, i.e. 0.5 g. Thus, the variance is not additive but rather multiplicative:
brain mass = (expected brain mass) · (random factor)
We can convert this into something with additive randomness by taking the log:
log(brain mass) = log(expected brain mass) + log(random factor)
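A small simulation sketch (not the real data set) showing the same effect: with multiplicative lognormal noise the residuals fan out on the original scale, but look homoscedastic after taking logs:

set.seed(1)
body  <- exp(runif(100, -2, 8))                      # body weights over several orders of magnitude
brain <- 10 * body^0.75 * exp(rnorm(100, sd = 0.3))  # multiplicative noise
fit.raw <- lm(brain ~ body)
fit.log <- lm(log(brain) ~ log(body))
plot(fit.raw$fitted.values, fit.raw$residuals)   # spread grows with the fitted values
plot(fit.log$fitted.values, fit.log$residuals)   # roughly constant spread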

81 log-scaling the data
> logmodell <- lm(log(brain.weight.g) ~ log(weight.kg.), subset = extinct == "no")
> summary(logmodell)

Call:
lm(formula = log(brain.weight.g) ~ log(weight.kg.), subset = extinct == "no")

Residuals:
    Min      1Q  Median      3Q     Max

Coefficients:
                Estimate Std. Error t value Pr(>|t|)
(Intercept)                                  <2e-16 ***
log(weight.kg.)                              <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error:  on 60 degrees of freedom
Multiple R-squared: , Adjusted R-squared:

82 log-scaling the data
qqnorm(modell$residuals)
[Normal Q-Q plot of the residuals]

83 log-scaling the data
plot(logmodell$fitted.values, logmodell$residuals)
[Scatter plot: residuals of the log-log model against its fitted values]

84 log-scaling the data
plot(logmodell$fitted.values, logmodell$residuals, log = "x")
[Scatter plot: residuals against fitted values, x-axis on log scale]

85 log-scaling the data
plot(weight.kg.[extinct == "no"], logmodell$residuals)
[Scatter plot: residuals against body weight]

86 log-scaling the data
plot(weight.kg.[extinct == "no"], logmodell$residuals, log = "x")
[Scatter plot: residuals against body weight, x-axis on log scale]
