Regression. Bret Larget. Department of Statistics. University of Wisconsin - Madison. Statistics 371, Fall Correlation Plots

Size: px
Start display at page:

Download "Regression. Bret Larget. Department of Statistics. University of Wisconsin - Madison. Statistics 371, Fall Correlation Plots"

Transcription

1 Correlation Plots r = 0.97 Regression Bret Larget Department of Statistics Statistics 371, Fall Universit of Wisconsin - Madison Correlation Plots r = 0.21 Correlation December 2, 2004 The correlation coefficient r is measure of the strength of the linear relationship between two variables. r = 1 n ( ) i ( ) i ȳ n 1 i=1 s s (i )( = i ȳ) (i ) 2 ( i ȳ) 2 Notice that the correlation is not affected b linear transformations of the data (such as changing the scale of measurement). Statistics 371, Fall 2004 Statistics 371, Fall Statistics 371, Fall

2 Correlation Plots Correlation Plots r = 1 r = Statistics 371, Fall Statistics 371, Fall Correlation Plots Correlation Plots r = 0 r = Statistics 371, Fall Statistics 371, Fall

3 Summar of Correlation Correlation Plots The correlation coefficient r measures the strength of the linear relationship between two quantitative variables, on a scale from 1 to 1. The correlation coefficient is 1 or 1 onl when the data lies perfectl on a line with negative or positive slope, respectivel. If the correlation coefficient is near one, this means that the data is tightl clustered around a line with a positive slope. Correlation coefficients near 0 indicate weak linear relationships. However, r does not measure the strength of nonlinear relationships. If r = 0, rather than X and Y being unrelated, it can be the case that the have a strong nonlinear relationshsip. If r is close to 1, it ma still be the case that a nonlinear relationship is a better description of the data than a linear relationship. Statistics 371, Fall r = Statistics 371, Fall Simple Linear Regression Correlation Plots Simplelinearregression isthestatisticalprocedurefordescribing the relationship between an quantitative eplanator variable X and a quantitative response variable Y with a straight line. In simple linear regression, the regression line is the line that minimizes the sum of the squared residuals r = Statistics 371, Fall Statistics 371, Fall

4 Rile Rile Rile Larget is m son. Below is a plot of his height versus his age, from birth to 8 ears. Height (inches) Age (months) Height (inches) Age (months) Statistics 371, Fall Statistics 371, Fall Finding a best linear fit An line we can use to predict Y from X will have the form Y = b 0 + b 1 X where b 0 is the intercept and b 1 will be the slope. The value ŷ = b 0 + b 1 is the predicted value of Y if the eplanator variable X =. In simple linear regression, the predicted values form a line. (In more advanced forms of regression, we can fit curves or fit functions of multiple eplanator variables.) For each data point ( i, i ), the residual is the difference between the observed value and the predicted value, i ŷ i. Graphicall, each residual is the positive or negative vertical distance from the point to the line. Simple linear regression identifies the line that minimizes the n residual sum of squares, ( i ŷ i ) 2. i=1 Rile The plot indicates that it is not reasonable to model the relationshipbetweenageandheight as linearovertheentireage range, but it is fairl linear from age 2 ears to 8 ears (24 96 months). Statistics 371, Fall Statistics 371, Fall

5 The General Case Let X = + zs, so X is z standard deviations above the mean. ŷ = b 0 + b 1 ( + zs ) = (ȳ b 1 ) + b 1 + b 1 zs ( = ȳ + r s ) zs s = ȳ + (rz) s Least Squares Regression We won t derive them, but there are simple formulas for the slope and intercept of the least squares line as a function of the sample means, standard deviations, and the correlation coefficient. Y = b 0 + b 1 X b 1 = r s s b 0 = ȳ b 1 Notice that if X is z SDs above the mean, we predict Y to be onl rz SDs above the mean. In the tpical situation, r < 1, so we predict the value of Y to beclosertothemean (instandardunits) than X. This iscalled the regression effect. Statistics 371, Fall Statistics 371, Fall Rile (cont.) > n = length(age2) > m = mean(age2) > s = sd(age2) > m = mean(height2) > s = sd(height2) > r = cor(age2, height2) > print(c(m, s, m, s, r, n)) [1] > b1 = r * s/s > b0 = m - b1 * m > print(c(b0, b1)) [1] (Rile s predicted height in inches) = (Rile s age in months) A Special Case Consider the predicted value of an observation X =. ŷ = b 0 + b 1 = (ȳ b 1 ) + b 1 = ȳ So, the regression line alwas goes through the point (, ȳ). Statistics 371, Fall Statistics 371, Fall

6 Rile A Residual Plot > plot(fitted(fit2), residuals(fit2)) > abline(h = 0) Using R > age2 = age[(age > 23) & (age < 97)] > height2 = height[(age > 23) & (age < 97)] > fit2 = lm(height2 ~ age2) > summar(fit2) Call: lm(formula = height2 ~ age2) residuals(fit2) Residuals: Min 1Q Median 3Q Ma Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) <2e-16 *** age <2e-16 *** --- Signif. codes: 0 *** ** 0.01 * fitted(fit2) Residual standard error: on 14 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: 7627 on 1 and 14 DF, p-value: < 2.2e-16 Statistics 371, Fall Statistics 371, Fall Rile Interpretation Rile Plot of Data We can interpret the slope to mean that from age 2 to 8 ears, Rile grew an average of about 0.25 inches per ear month, or about 3 inches per ear. The intercept is the predicted value when X = 0, or Rile s height (length) at birth. This interpretation ma not be reasonable if 0 is out of the range of the data. Height (inches) Age (months) Statistics 371, Fall Statistics 371, Fall

7 Residual Standard Deviation Rile Etrapolation The residual standard deviation, s Y X, is a measure of a tpical size of a residual. Its formula is s Y X = SS(resid) n 2 Notice that in simple linear regression, there are n 2 degrees of freedom, in contrast to our formula n 1. Height (inches) Thereasonisthat ourmodel for themean usestwo parameters it takes two points to determine a line Age (months) Statistics 371, Fall Statistics 371, Fall Frogs Rile Etrapolation In a stud on ooctes (developing egg cells) from the frog Xenopus laevis, a biologist injects individual ooctes from the same female with radioactive leucine, to measure the amount of leucine incorporated into protein as a function of time. (See Eercise 12.3 on page 536.) Height (inches) Age (months) Statistics 371, Fall Statistics 371, Fall

8 R Fitting a model > fit = lm(leucine ~ time) > summar(fit) Call: lm(formula = leucine ~ time) Residuals: Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) time e-06 *** --- Signif. codes: 0 *** ** 0.01 * Residual standard error: on 5 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: 340 on 1 and 5 DF, p-value: 8.628e-06 R Entering the data Here is some R code to create a data frame frog with the time and leucine variables. > frog = data.frame(time = seq(0, 60, b = 10), leucine = c(0.02, , 0.54, 0.69, 1.07, 1.5, 1.74)) > attach(frog) > frog time leucine Statistics 371, Fall Statistics 371, Fall R Verifing tetbook formulae > n = nrow(frog) > m = mean(time) > s = sd(time) > m = mean(leucine) > s = sd(leucine) > r = cor(time, leucine) > c(m, s, m, s, r) [1] > b1 = r * s/s > b0 = m - b1 * m > s = sqrt(sum(residuals(fit)^2)/(n - 2)) > seb1 = s/s/sqrt(n - 1) > ts = b1/seb1 > c(b0, b1, s, seb1, ts) [1] R Plotting the data > plot(time, leucine) leucine time Statistics 371, Fall Statistics 371, Fall

9 Fitting a Quadratic Model We add an additional parameters to fit a curve. Fitting a quadratic curve is a reasonable choice (unless there was a scientific reason to fit an eponential curve, for eample). R A Residual Plot > plot(fitted(fit), residuals(fit), lab = "Fitted Values", lab = "Residuals") > abline(h = 0) Here is a quadratic model. (leucine) = β 0 + β 1 (time) + β 2 (time) 2 The residual sum of squares will cannot be larger when fitting a quadratic model instead of a linear model lines are special cases of parabolas. But, we want to be careful not to overfit. Residuals Fitted Values There might be some nonlinearit.... Statistics 371, Fall Statistics 371, Fall R Fitting a Curve > fit2 = lm(leucine ~ time + I(time^2)) > summar(fit2) Call: lm(formula = leucine ~ time + I(time^2)) R Plotting the regression line > plot(time, leucine, pch = 16) > abline(fit) Residuals: Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) 2.452e e time 2.061e e * I(time^2) 1.441e e Signif. codes: 0 *** ** 0.01 * Residual standard error: on 4 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: on 2 and 4 DF, p-value: 5.359e-05 leucine time Statistics 371, Fall Statistics 371, Fall

10 Tree Eample We want to be able to estimate the age of trees in the Amazon b their diameter. Use the carbon dating as a pro for the actual age. (See eercise on page 575.) R Plotting the Curve > plot(time, leucine, pch = 16) > abline(fit) > new = predict(fit2, data.frame(time = 0:60)) > lines(0:60, new, lt = 2) leucine time Statistics 371, Fall Statistics 371, Fall Tree Eample > amazon = read.table("amazon.tt", header = T) > attach(amazon) > amazon diameter age Statistics 371, Fall R Residual Plot for Quadratic Model > plot(fitted(fit2), residuals(fit2), lab = "Fitted Values", lab = "Residuals") > abline(h = 0) Residuals Fitted Values Statistics 371, Fall

11 Amazon Residual Plot Amazon Plot > plot(fitted(fit), residuals(fit), lab = "Fitted Values", lab = "Residuals") > abline(h = 0) > plot(diameter, age, pch = 16) Residuals age Fitted Values diameter Statistics 371, Fall Statistics 371, Fall Amazon Normal Probabilit Plot Amazon Fitting a Line > qqnorm(residuals(fit)) Normal Q Q Plot > fit = lm(age ~ diameter) > summar(fit) Call: lm(formula = age ~ diameter) Sample Quantiles Residuals: Min 1Q Median 3Q Ma Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) diameter * --- Signif. codes: 0 *** ** 0.01 * Residual standard error: on 18 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: on 1 and 18 DF, p-value: Theoretical Quantiles Statistics 371, Fall Statistics 371, Fall

12 Amazon Normal Probabilit Plot > qqnorm(residuals(fit2)) Normal Q Q Plot Amazon Log Transformation > fit2 = lm(log(age) ~ diameter) > summar(fit2) Call: lm(formula = log(age) ~ diameter) Sample Quantiles Residuals: Min 1Q Median 3Q Ma Coefficients: Estimate Std. Error t value Pr(> t ) (Intercept) e-09 *** diameter * --- Signif. codes: 0 *** ** 0.01 * Theoretical Quantiles Residual standard error: on 18 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: on 1 and 18 DF, p-value: Statistics 371, Fall Statistics 371, Fall One Last Eample FEV (forced epirator volume) is a measure of lung capacit and strength. It increases as children grow larger. Amazon Residual Plot > plot(fitted(fit2), residuals(fit2), lab = "Fitted Values", lab = "Residuals") > abline(h = 0) The net page has two separate residual plots, one of the residuals versus height from a fit of FEV versus height and one of the residuals versus height from a fit of log(fev) versus height. Notice that the residuals in the first plot have a fan shape but those in the second are more evenl spread out. Also, the first plot shows a nonlinear trend that is not present in the second plot. Residuals Fitted Values Statistics 371, Fall Statistics 371, Fall

13 One Last Eample Original Data Transformed Data Residuals Residuals Fitted Values Fitted Values Statistics 371, Fall

Correlation. Bret Hanlon and Bret Larget. Department of Statistics University of Wisconsin Madison. December 6, Correlation 1 / 25

Correlation. Bret Hanlon and Bret Larget. Department of Statistics University of Wisconsin Madison. December 6, Correlation 1 / 25 Correlation Bret Hanlon and Bret Larget Department of Statistics Universit of Wisconsin Madison December 6, 2 Correlation / 25 The Big Picture We have just completed a length series of lectures on ANOVA

More information

Regression. Bret Hanlon and Bret Larget. December 8 15, Department of Statistics University of Wisconsin Madison.

Regression. Bret Hanlon and Bret Larget. December 8 15, Department of Statistics University of Wisconsin Madison. Regression Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison December 8 15, 2011 Regression 1 / 55 Example Case Study The proportion of blackness in a male lion s nose

More information

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example

The Big Picture. Model Modifications. Example (cont.) Bacteria Count Example The Big Picture Remedies after Model Diagnostics The Big Picture Model Modifications Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison February 6, 2007 Residual plots

More information

Model Modifications. Bret Larget. Departments of Botany and of Statistics University of Wisconsin Madison. February 6, 2007

Model Modifications. Bret Larget. Departments of Botany and of Statistics University of Wisconsin Madison. February 6, 2007 Model Modifications Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison February 6, 2007 Statistics 572 (Spring 2007) Model Modifications February 6, 2007 1 / 20 The Big

More information

Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov

Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT. Charlotte Wickham. stat511.cwick.co.nz. Nov Stat 411/511 ESTIMATING THE SLOPE AND INTERCEPT Nov 20 2015 Charlotte Wickham stat511.cwick.co.nz Quiz #4 This weekend, don t forget. Usual format Assumptions Display 7.5 p. 180 The ideal normal, simple

More information

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model

Lab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.

More information

Cuckoo Birds. Analysis of Variance. Display of Cuckoo Bird Egg Lengths

Cuckoo Birds. Analysis of Variance. Display of Cuckoo Bird Egg Lengths Cuckoo Birds Analysis of Variance Bret Larget Departments of Botany and of Statistics University of Wisconsin Madison Statistics 371 29th November 2005 Cuckoo birds have a behavior in which they lay their

More information

STAT Fall HW8-5 Solution Instructor: Shiwen Shen Collection Day: November 9

STAT Fall HW8-5 Solution Instructor: Shiwen Shen Collection Day: November 9 STAT 509 2016 Fall HW8-5 Solution Instructor: Shiwen Shen Collection Day: November 9 1. There is a gala dataset in faraway package. It concerns the number of species of tortoise on the various Galapagos

More information

Comparing Nested Models

Comparing Nested Models Comparing Nested Models ST 370 Two regression models are called nested if one contains all the predictors of the other, and some additional predictors. For example, the first-order model in two independent

More information

Quantitative Understanding in Biology 2.2 Fitting Model Parameters

Quantitative Understanding in Biology 2.2 Fitting Model Parameters Quantitative Understanding in Biolog 2.2 Fitting Model Parameters Jason Banfelder November 30th, 2017 Let us consider how we might fit a power law model to some data. simulating the power law relationship

More information

Introduction to Linear Regression

Introduction to Linear Regression Introduction to Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Introduction to Linear Regression 1 / 46

More information

Linear Modelling: Simple Regression

Linear Modelling: Simple Regression Linear Modelling: Simple Regression 10 th of Ma 2018 R. Nicholls / D.-L. Couturier / M. Fernandes Introduction: ANOVA Used for testing hpotheses regarding differences between groups Considers the variation

More information

ST430 Exam 1 with Answers

ST430 Exam 1 with Answers ST430 Exam 1 with Answers Date: October 5, 2015 Name: Guideline: You may use one-page (front and back of a standard A4 paper) of notes. No laptop or textook are permitted but you may use a calculator.

More information

Correlation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship

Correlation and regression. Correlation and regression analysis. Measures of association. Why bother? Positive linear relationship 1 Correlation and regression analsis 12 Januar 2009 Monda, 14.00-16.00 (C1058) Frank Haege Department of Politics and Public Administration Universit of Limerick frank.haege@ul.ie www.frankhaege.eu Correlation

More information

Examples of fitting various piecewise-continuous functions to data, using basis functions in doing the regressions.

Examples of fitting various piecewise-continuous functions to data, using basis functions in doing the regressions. Examples of fitting various piecewise-continuous functions to data, using basis functions in doing the regressions. David. Boore These examples in this document used R to do the regression. See also Notes_on_piecewise_continuous_regression.doc

More information

Example: Data from the Child Health and Development Study

Example: Data from the Child Health and Development Study Example: Data from the Child Health and Development Study Can we use linear regression to examine how well length of gesta:onal period predicts birth weight? First look at the sca@erplot: Does a linear

More information

Exam 3 Practice Questions Psych , Fall 9

Exam 3 Practice Questions Psych , Fall 9 Vocabular Eam 3 Practice Questions Psch 3101-100, Fall 9 Rather than choosing some practice terms at random, I suggest ou go through all the terms in the vocabular lists. The real eam will ask for definitions

More information

STAT 350. Assignment 4

STAT 350. Assignment 4 STAT 350 Assignment 4 1. For the Mileage data in assignment 3 conduct a residual analysis and report your findings. I used the full model for this since my answers to assignment 3 suggested we needed the

More information

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression

Dependence and scatter-plots. MVE-495: Lecture 4 Correlation and Regression Dependence and scatter-plots MVE-495: Lecture 4 Correlation and Regression It is common for two or more quantitative variables to be measured on the same individuals. Then it is useful to consider what

More information

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression. 10/3/011 Functional Connectivity Correlation and Regression Variance VAR = Standard deviation Standard deviation SD = Unbiased SD = 1 10/3/011 Standard error Confidence interval SE = CI = = t value for

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression ST 430/514 Recall: a regression model describes how a dependent variable (or response) Y is affected, on average, by one or more independent variables (or factors, or covariates).

More information

Correlation and simple linear regression S5

Correlation and simple linear regression S5 Basic medical statistics for clinical and eperimental research Correlation and simple linear regression S5 Katarzyna Jóźwiak k.jozwiak@nki.nl November 15, 2017 1/41 Introduction Eample: Brain size and

More information

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference.

Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals. Regression Output. Conditions for inference. Understanding regression output from software Nature vs. nurture? Lecture 18 - Regression: Inference, Outliers, and Intervals In 1966 Cyril Burt published a paper called The genetic determination of differences

More information

1 The Classic Bivariate Least Squares Model

1 The Classic Bivariate Least Squares Model Review of Bivariate Linear Regression Contents 1 The Classic Bivariate Least Squares Model 1 1.1 The Setup............................... 1 1.2 An Example Predicting Kids IQ................. 1 2 Evaluating

More information

Math 3339 Homework 2 (Chapter 2, 9.1 & 9.2)

Math 3339 Homework 2 (Chapter 2, 9.1 & 9.2) Math 3339 Homework 2 (Chapter 2, 9.1 & 9.2) Name: PeopleSoft ID: Instructions: Homework will NOT be accepted through email or in person. Homework must be submitted through CourseWare BEFORE the deadline.

More information

Coefficient of Determination

Coefficient of Determination Coefficient of Determination ST 430/514 The coefficient of determination, R 2, is defined as before: R 2 = 1 SS E (yi ŷ i ) = 1 2 SS yy (yi ȳ) 2 The interpretation of R 2 is still the fraction of variance

More information

Lecture 18: Simple Linear Regression

Lecture 18: Simple Linear Regression Lecture 18: Simple Linear Regression BIOS 553 Department of Biostatistics University of Michigan Fall 2004 The Correlation Coefficient: r The correlation coefficient (r) is a number that measures the strength

More information

Regression. Marc H. Mehlman University of New Haven

Regression. Marc H. Mehlman University of New Haven Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and

More information

L21: Chapter 12: Linear regression

L21: Chapter 12: Linear regression L21: Chapter 12: Linear regression Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 37 So far... 12.1 Introduction One sample

More information

Stat 401B Final Exam Fall 2015

Stat 401B Final Exam Fall 2015 Stat 401B Final Exam Fall 015 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning

More information

lm statistics Chris Parrish

lm statistics Chris Parrish lm statistics Chris Parrish 2017-04-01 Contents s e and R 2 1 experiment1................................................. 2 experiment2................................................. 3 experiment3.................................................

More information

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017

UNIVERSITY OF MASSACHUSETTS. Department of Mathematics and Statistics. Basic Exam - Applied Statistics. Tuesday, January 17, 2017 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics Tuesday, January 17, 2017 Work all problems 60 points are needed to pass at the Masters Level and 75

More information

Biostatistics in Research Practice - Regression I

Biostatistics in Research Practice - Regression I Biostatistics in Research Practice - Regression I Simon Crouch 30th Januar 2007 In scientific studies, we often wish to model the relationships between observed variables over a sample of different subjects.

More information

y i s 2 X 1 n i 1 1. Show that the least squares estimators can be written as n xx i x i 1 ns 2 X i 1 n ` px xqx i x i 1 pδ ij 1 n px i xq x j x

y i s 2 X 1 n i 1 1. Show that the least squares estimators can be written as n xx i x i 1 ns 2 X i 1 n ` px xqx i x i 1 pδ ij 1 n px i xq x j x Question 1 Suppose that we have data Let x 1 n x i px 1, y 1 q,..., px n, y n q. ȳ 1 n y i s 2 X 1 n px i xq 2 Throughout this question, we assume that the simple linear model is correct. We also assume

More information

Linear regression Class 25, Jeremy Orloff and Jonathan Bloom

Linear regression Class 25, Jeremy Orloff and Jonathan Bloom 1 Learning Goals Linear regression Class 25, 18.05 Jerem Orloff and Jonathan Bloom 1. Be able to use the method of least squares to fit a line to bivariate data. 2. Be able to give a formula for the total

More information

MODELS WITHOUT AN INTERCEPT

MODELS WITHOUT AN INTERCEPT Consider the balanced two factor design MODELS WITHOUT AN INTERCEPT Factor A 3 levels, indexed j 0, 1, 2; Factor B 5 levels, indexed l 0, 1, 2, 3, 4; n jl 4 replicate observations for each factor level

More information

Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall)

Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall) Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall) We will cover Chs. 5 and 6 first, then 3 and 4. Mon,

More information

Inferences for Regression

Inferences for Regression Inferences for Regression An Example: Body Fat and Waist Size Looking at the relationship between % body fat and waist size (in inches). Here is a scatterplot of our data set: Remembering Regression In

More information

Density Temp vs Ratio. temp

Density Temp vs Ratio. temp Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,

More information

STAT 3022 Spring 2007

STAT 3022 Spring 2007 Simple Linear Regression Example These commands reproduce what we did in class. You should enter these in R and see what they do. Start by typing > set.seed(42) to reset the random number generator so

More information

> Y ~ X1 + X2. The tilde character separates the response variable from the explanatory variables. So in essence we fit the model

> Y ~ X1 + X2. The tilde character separates the response variable from the explanatory variables. So in essence we fit the model Regression Analysis Regression analysis is one of the most important topics in Statistical theory. In the sequel this widely known methodology will be used with S-Plus by means of formulae for models.

More information

12.1 Systems of Linear equations: Substitution and Elimination

12.1 Systems of Linear equations: Substitution and Elimination . Sstems of Linear equations: Substitution and Elimination Sstems of two linear equations in two variables A sstem of equations is a collection of two or more equations. A solution of a sstem in two variables

More information

Chapter 11. Correlation and Regression

Chapter 11. Correlation and Regression Chapter 11 Correlation and Regression Correlation A relationship between two variables. The data can be represented b ordered pairs (, ) is the independent (or eplanator) variable is the dependent (or

More information

R 2 and F -Tests and ANOVA

R 2 and F -Tests and ANOVA R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.

More information

What s a good way to show how the results came out? The relationship between two variables can be represented visually by a SCATTER DIAGRAM.

What s a good way to show how the results came out? The relationship between two variables can be represented visually by a SCATTER DIAGRAM. COUNTING DOTS students were asked to count the dots in the square below without making an marks on their sheet, and then to count them again (There are 87 dots) What s a good wa to show how the results

More information

BIOL 458 BIOMETRY Lab 9 - Correlation and Bivariate Regression

BIOL 458 BIOMETRY Lab 9 - Correlation and Bivariate Regression BIOL 458 BIOMETRY Lab 9 - Correlation and Bivariate Regression Introduction to Correlation and Regression The procedures discussed in the previous ANOVA labs are most useful in cases where we are interested

More information

Week 7 Multiple factors. Ch , Some miscellaneous parts

Week 7 Multiple factors. Ch , Some miscellaneous parts Week 7 Multiple factors Ch. 18-19, Some miscellaneous parts Multiple Factors Most experiments will involve multiple factors, some of which will be nuisance variables Dealing with these factors requires

More information

Introduction and Single Predictor Regression. Correlation

Introduction and Single Predictor Regression. Correlation Introduction and Single Predictor Regression Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Correlation A correlation

More information

Example: 1982 State SAT Scores (First year state by state data available)

Example: 1982 State SAT Scores (First year state by state data available) Lecture 11 Review Section 3.5 from last Monday (on board) Overview of today s example (on board) Section 3.6, Continued: Nested F tests, review on board first Section 3.4: Interaction for quantitative

More information

MATH 644: Regression Analysis Methods

MATH 644: Regression Analysis Methods MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100

More information

Biostatistics 380 Multiple Regression 1. Multiple Regression

Biostatistics 380 Multiple Regression 1. Multiple Regression Biostatistics 0 Multiple Regression ORIGIN 0 Multiple Regression Multiple Regression is an extension of the technique of linear regression to describe the relationship between a single dependent (response)

More information

14.1 Systems of Linear Equations in Two Variables

14.1 Systems of Linear Equations in Two Variables 86 Chapter 1 Sstems of Equations and Matrices 1.1 Sstems of Linear Equations in Two Variables Use the method of substitution to solve sstems of equations in two variables. Use the method of elimination

More information

Lecture 2. Simple linear regression

Lecture 2. Simple linear regression Lecture 2. Simple linear regression Jesper Rydén Department of Mathematics, Uppsala University jesper@math.uu.se Regression and Analysis of Variance autumn 2014 Overview of lecture Introduction, short

More information

Lecture 10. Factorial experiments (2-way ANOVA etc)

Lecture 10. Factorial experiments (2-way ANOVA etc) Lecture 10. Factorial experiments (2-way ANOVA etc) Jesper Rydén Matematiska institutionen, Uppsala universitet jesper@math.uu.se Regression and Analysis of Variance autumn 2014 A factorial experiment

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

STAT 572 Assignment 5 - Answers Due: March 2, 2007

STAT 572 Assignment 5 - Answers Due: March 2, 2007 1. The file glue.txt contains a data set with the results of an experiment on the dry sheer strength (in pounds per square inch) of birch plywood, bonded with 5 different resin glues A, B, C, D, and E.

More information

Regression and Models with Multiple Factors. Ch. 17, 18

Regression and Models with Multiple Factors. Ch. 17, 18 Regression and Models with Multiple Factors Ch. 17, 18 Mass 15 20 25 Scatter Plot 70 75 80 Snout-Vent Length Mass 15 20 25 Linear Regression 70 75 80 Snout-Vent Length Least-squares The method of least

More information

Topics on Statistics 2

Topics on Statistics 2 Topics on Statistics 2 Pejman Mahboubi March 7, 2018 1 Regression vs Anova In Anova groups are the predictors. When plotting, we can put the groups on the x axis in any order we wish, say in increasing

More information

Chapter 8: Simple Linear Regression

Chapter 8: Simple Linear Regression Chapter 8: Simple Linear Regression Shiwen Shen University of South Carolina 2017 Summer 1 / 70 Introduction A problem that arises in engineering, economics, medicine, and other areas is that of investigating

More information

Inference for Regression

Inference for Regression Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu

More information

Nonlinear Models. Daphnia: Purveyors of Fine Fungus 1/30 2/30

Nonlinear Models. Daphnia: Purveyors of Fine Fungus 1/30 2/30 Nonlinear Models 1/30 Daphnia: Purveyors of Fine Fungus 2/30 What do you do when you don t have a straight line? 7500000 Spores 5000000 2500000 0 30 40 50 60 70 longevity 3/30 What do you do when you don

More information

Math 2311 Written Homework 6 (Sections )

Math 2311 Written Homework 6 (Sections ) Math 2311 Written Homework 6 (Sections 5.4 5.6) Name: PeopleSoft ID: Instructions: Homework will NOT be accepted through email or in person. Homework must be submitted through CourseWare BEFORE the deadline.

More information

Multiple Linear Regression (solutions to exercises)

Multiple Linear Regression (solutions to exercises) Chapter 6 1 Chapter 6 Multiple Linear Regression (solutions to exercises) Chapter 6 CONTENTS 2 Contents 6 Multiple Linear Regression (solutions to exercises) 1 6.1 Nitrate concentration..........................

More information

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model 1 Linear Regression 2 Linear Regression In this lecture we will study a particular type of regression model: the linear regression model We will first consider the case of the model with one predictor

More information

Stat 412/512 REVIEW OF SIMPLE LINEAR REGRESSION. Jan Charlotte Wickham. stat512.cwick.co.nz

Stat 412/512 REVIEW OF SIMPLE LINEAR REGRESSION. Jan Charlotte Wickham. stat512.cwick.co.nz Stat 412/512 REVIEW OF SIMPLE LINEAR REGRESSION Jan 7 2015 Charlotte Wickham stat512.cwick.co.nz Announcements TA's Katie 2pm lab Ben 5pm lab Joe noon & 1pm lab TA office hours Kidder M111 Katie Tues 2-3pm

More information

Linear Regression Model. Badr Missaoui

Linear Regression Model. Badr Missaoui Linear Regression Model Badr Missaoui Introduction What is this course about? It is a course on applied statistics. It comprises 2 hours lectures each week and 1 hour lab sessions/tutorials. We will focus

More information

Stat 5102 Final Exam May 14, 2015

Stat 5102 Final Exam May 14, 2015 Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8 1 11 2 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions

More information

a. plotting points in Cartesian coordinates (Grade 9 and 10), b. using a graphing calculator such as the TI-83 Graphing Calculator,

a. plotting points in Cartesian coordinates (Grade 9 and 10), b. using a graphing calculator such as the TI-83 Graphing Calculator, GRADE PRE-CALCULUS UNIT C: QUADRATIC FUNCTIONS CLASS NOTES FRAME. After linear functions, = m + b, and their graph the Quadratic Functions are the net most important equation or function. The Quadratic

More information

Analysis of Variance

Analysis of Variance Analysis of Variance Bret Hanlon and Bret Larget Department of Statistics University of Wisconsin Madison November 22 November 29, 2011 ANOVA 1 / 59 Cuckoo Birds Case Study Cuckoo birds have a behavior

More information

INFERENCE FOR REGRESSION

INFERENCE FOR REGRESSION CHAPTER 3 INFERENCE FOR REGRESSION OVERVIEW In Chapter 5 of the textbook, we first encountered regression. The assumptions that describe the regression model we use in this chapter are the following. We

More information

Chapter 12 - Part I: Correlation Analysis

Chapter 12 - Part I: Correlation Analysis ST coursework due Friday, April - Chapter - Part I: Correlation Analysis Textbook Assignment Page - # Page - #, Page - # Lab Assignment # (available on ST webpage) GOALS When you have completed this lecture,

More information

Unit 6 - Introduction to linear regression

Unit 6 - Introduction to linear regression Unit 6 - Introduction to linear regression Suggested reading: OpenIntro Statistics, Chapter 7 Suggested exercises: Part 1 - Relationship between two numerical variables: 7.7, 7.9, 7.11, 7.13, 7.15, 7.25,

More information

Statistics for Engineers Lecture 9 Linear Regression

Statistics for Engineers Lecture 9 Linear Regression Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April

More information

Chapter 12: Linear regression II

Chapter 12: Linear regression II Chapter 12: Linear regression II Timothy Hanson Department of Statistics, University of South Carolina Stat 205: Elementary Statistics for the Biological and Life Sciences 1 / 14 12.4 The regression model

More information

BIOSTATS 640 Spring 2018 Unit 2. Regression and Correlation (Part 1 of 2) R Users

BIOSTATS 640 Spring 2018 Unit 2. Regression and Correlation (Part 1 of 2) R Users BIOSTATS 640 Spring 08 Unit. Regression and Correlation (Part of ) R Users Unit Regression and Correlation of - Practice Problems Solutions R Users. In this exercise, you will gain some practice doing

More information

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim

Figure 1: The fitted line using the shipment route-number of ampules data. STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim 0.0 1.0 1.5 2.0 2.5 3.0 8 10 12 14 16 18 20 22 y x Figure 1: The fitted line using the shipment route-number of ampules data STAT5044: Regression and ANOVA The Solution of Homework #2 Inyoung Kim Problem#

More information

Answer to exercise 'height vs. age' (Juul)

Answer to exercise 'height vs. age' (Juul) Answer to exercise 'height vs. age' (Juul) Question 1 Fitting a straight line to height for males in the age range 5-20 and making the corresponding illustration is performed by writing: proc reg data=juul;

More information

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation Scatterplots and Correlation Name Hr A scatterplot shows the relationship between two quantitative variables measured on the same individuals. variable (y) measures an outcome of a study variable (x) may

More information

Generalised linear models. Response variable can take a number of different formats

Generalised linear models. Response variable can take a number of different formats Generalised linear models Response variable can take a number of different formats Structure Limitations of linear models and GLM theory GLM for count data GLM for presence \ absence data GLM for proportion

More information

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph.

Regression, Part I. - In correlation, it would be irrelevant if we changed the axes on our graph. Regression, Part I I. Difference from correlation. II. Basic idea: A) Correlation describes the relationship between two variables, where neither is independent or a predictor. - In correlation, it would

More information

Section 4.1 Increasing and Decreasing Functions

Section 4.1 Increasing and Decreasing Functions Section.1 Increasing and Decreasing Functions The graph of the quadratic function f 1 is a parabola. If we imagine a particle moving along this parabola from left to right, we can see that, while the -coordinates

More information

LESSON #42 - INVERSES OF FUNCTIONS AND FUNCTION NOTATION PART 2 COMMON CORE ALGEBRA II

LESSON #42 - INVERSES OF FUNCTIONS AND FUNCTION NOTATION PART 2 COMMON CORE ALGEBRA II LESSON #4 - INVERSES OF FUNCTIONS AND FUNCTION NOTATION PART COMMON CORE ALGEBRA II You will recall from unit 1 that in order to find the inverse of a function, ou must switch and and solve for. Also,

More information

SLR output RLS. Refer to slr (code) on the Lecture Page of the class website.

SLR output RLS. Refer to slr (code) on the Lecture Page of the class website. SLR output RLS Refer to slr (code) on the Lecture Page of the class website. Old Faithful at Yellowstone National Park, WY: Simple Linear Regression (SLR) Analysis SLR analysis explores the linear association

More information

Practice 2 due today. Assignment from Berndt due Monday. If you double the number of programmers the amount of time it takes doubles. Huh?

Practice 2 due today. Assignment from Berndt due Monday. If you double the number of programmers the amount of time it takes doubles. Huh? Admistrivia Practice 2 due today. Assignment from Berndt due Monday. 1 Story: Pair programming Mythical man month If you double the number of programmers the amount of time it takes doubles. Huh? Invention

More information

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories.

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories. Chapter Goals To understand the methods for displaying and describing relationship among variables. Formulate Theories Interpret Results/Make Decisions Collect Data Summarize Results Chapter 7: Is There

More information

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017 PDF file location: http://www.murraylax.org/rtutorials/regression_anovatable.pdf

More information

Chapter 27 Summary Inferences for Regression

Chapter 27 Summary Inferences for Regression Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test

More information

Variance Decomposition and Goodness of Fit

Variance Decomposition and Goodness of Fit Variance Decomposition and Goodness of Fit 1. Example: Monthly Earnings and Years of Education In this tutorial, we will focus on an example that explores the relationship between total monthly earnings

More information

Statistiek II. John Nerbonne. March 17, Dept of Information Science incl. important reworkings by Harmut Fitz

Statistiek II. John Nerbonne. March 17, Dept of Information Science incl. important reworkings by Harmut Fitz Dept of Information Science j.nerbonne@rug.nl incl. important reworkings by Harmut Fitz March 17, 2015 Review: regression compares result on two distinct tests, e.g., geographic and phonetic distance of

More information

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics Chapter 13 Student Lecture Notes 13-1 Department of Quantitative Methods & Information Sstems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analsis QMIS 0 Dr. Mohammad

More information

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013 UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Midterm Test, October 2013 STAC67H3 Regression Analysis Duration: One hour and fifty minutes Last Name: First Name: Student

More information

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc. Chapter 8 Linear Regression Copyright 2010 Pearson Education, Inc. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu: Copyright

More information

5. Linear Regression

5. Linear Regression 5. Linear Regression Outline.................................................................... 2 Simple linear regression 3 Linear model............................................................. 4

More information

No. For example, f(0) = 3, but f 1 (3) 0. Kent did not follow the order of operations when undoing. The correct inverse is f 1 (x) = x 3

No. For example, f(0) = 3, but f 1 (3) 0. Kent did not follow the order of operations when undoing. The correct inverse is f 1 (x) = x 3 Lesson 10.1.1 10-6. a: Each laer has 7 cubes, so the volume is 42 cubic units. b: 14 6 + 2 7 = 98 square units c: (1) V = 20 units 3, SA = 58 units 2 (2) V = 24 units 3, SA = 60 units 2 (3) V = 60 units

More information

Regression on Faithful with Section 9.3 content

Regression on Faithful with Section 9.3 content Regression on Faithful with Section 9.3 content The faithful data frame contains 272 obervational units with variables waiting and eruptions measuring, in minutes, the amount of wait time between eruptions,

More information

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018

UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018 UNIVERSITY OF MASSACHUSETTS Department of Mathematics and Statistics Basic Exam - Applied Statistics January, 2018 Work all problems. 60 points needed to pass at the Masters level, 75 to pass at the PhD

More information

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species

1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species Lecture notes 2/22/2000 Dummy variables and extra SS F-test Page 1 Crab claw size and closing force. Problem 7.25, 10.9, and 10.10 Regression for all species at once, i.e., include dummy variables for

More information

Linear Model Specification in R

Linear Model Specification in R Linear Model Specification in R How to deal with overparameterisation? Paul Janssen 1 Luc Duchateau 2 1 Center for Statistics Hasselt University, Belgium 2 Faculty of Veterinary Medicine Ghent University,

More information

The linear model. Our models so far are linear. Change in Y due to change in X? See plots for: o age vs. ahe o carats vs.

The linear model. Our models so far are linear. Change in Y due to change in X? See plots for: o age vs. ahe o carats vs. 8 Nonlinear effects Lots of effects in economics are nonlinear Examples Deal with these in two (sort of three) ways: o Polynomials o Logarithms o Interaction terms (sort of) 1 The linear model Our models

More information

Regression Analysis lab 3. 1 Multiple linear regression. 1.1 Import data. 1.2 Scatterplot matrix

Regression Analysis lab 3. 1 Multiple linear regression. 1.1 Import data. 1.2 Scatterplot matrix Regression Analysis lab 3 1 Multiple linear regression 1.1 Import data delivery

More information