Chapter 7. Linear Regression (Pt. 1) 7.1 Introduction. 7.2 The Least-Squares Regression Line


7.1 Introduction

Recall that r, the correlation coefficient, measures the linear association between two quantitative variables. Linear regression is the method of fitting a linear model to a scatterplot. As you have already learned, r is closely related to this model: it tells us both the strength and the direction of the linear relationship the model describes.

7.2 The Least-Squares Regression Line

Given a scatterplot, there are many different lines you could draw through the data points. Rather than using an arbitrary line to model the data, however, a more principled approach is preferable. In many applications, people use the least-squares regression line (hereafter, "regression line") to model linear relationships in scatterplots. To find the regression line, we perform some calculations (which we will go over later) and end up with a line of the form

    ŷ = f(x) = b1 x + b0

Recall from high school math that this equation is in slope-intercept form, where

b1 is the slope of the line and b0 is the y-intercept. Taken together, b0 and b1 are the parameters of our linear model. The following is a summary of the properties of the regression line:

1. The regression line minimizes the sum of squared residuals between the line and the actual data.
2. The slope (b1) of the line describes how the response variable (Y) changes with the predictor variable (X).
3. For a given value of x (whether or not it appears in our data), we can get a prediction ŷ = f(x) for the response variable.

Below is a scatterplot with its regression line drawn over the data.

[Figure: "Stopping Distance of Cars" — scatterplot of Stopping Distance (feet) versus Speed (mph), with the regression line overlaid.]
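As a sketch of how such a line is obtained in practice, the following Python snippet fits a least-squares line to hypothetical speed/distance values (the data points are made up for illustration; NumPy's polyfit performs the minimization described in property 1):

```python
import numpy as np

# Hypothetical example data: speed (mph) and stopping distance (feet).
speed = np.array([5.0, 10.0, 15.0, 20.0, 25.0])
distance = np.array([10.0, 25.0, 45.0, 60.0, 85.0])

# np.polyfit(x, y, 1) returns [b1, b0], the slope and intercept that
# minimize the sum of squared residuals.
b1, b0 = np.polyfit(speed, distance, 1)

# Property 3: we can predict y for any x, whether or not x appears in the data.
y_hat = b1 * 12.0 + b0
```

For these made-up numbers the fit gives b1 = 3.7 and b0 = -10.5, so the prediction at 12 mph is ŷ = 3.7(12) - 10.5 = 33.9 feet.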

7.2.1 Residuals

Suppose we have data {(x1, y1), ..., (xN, yN)}, from which we derive the following regression model:

    ŷ = f(x) = b1 x + b0

The ith residual is the difference between the actual value yi and the predicted value ŷi = f(xi). That is,

    res_i = yi − ŷi = (actual y) − (predicted y)

Residuals are sometimes called errors, as they represent the degree to which the model deviates from the data. As mentioned before, the least-squares line is chosen to minimize the sum of the squared residuals between data and model. That is, we choose b0 and b1 so that the line f(x) = ŷ = b1 x + b0 minimizes

    Sum of squared residuals = Σ_{i=1}^{N} (yi − ŷi)²

Example (Looking at Residual Plots)

The graphic below is the residual plot for the previous car example. The horizontal line marks points where the model's prediction is exactly the same as the data. Points below the line are cases where the model overestimates the data; points above the line are cases where the model underestimates it. The parameters of the model are b1 = 3.932 and b0 = −17.579 (think about these for a minute). The correlation coefficient is r = 0.807.

[Figure: "Residual Plot of Cars Data" — residuals (actual distance minus predicted distance, in feet) plotted against speed.]

1. Suppose we have a car driving at 33 mph. What is the model's prediction for its stopping distance?
2. Suppose we know that a car driving at 15 mph takes 45 feet to stop. Calculate the residual.
3. Suppose a car driving at 22 mph has a residual of 11 feet. How long did it take to stop?
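To make the prediction and residual calculations concrete, here is a small Python sketch using the model parameters given in the example. The predict and residual helper functions are names introduced here for illustration, and the 20-mph car is made-up data, not one of the exercise values:

```python
# Fitted cars model from the example: y_hat = b1 * x + b0.
b1, b0 = 3.932, -17.579

def predict(speed):
    """Predicted stopping distance (feet) for a given speed (mph)."""
    return b1 * speed + b0

def residual(speed, actual_distance):
    """Residual = actual y minus predicted y.

    Positive means the model underestimates; negative means it overestimates.
    """
    return actual_distance - predict(speed)

# For instance, a (hypothetical) car at 20 mph that actually took 50 feet to stop:
pred = predict(20)            # ≈ 61.061 feet
res = residual(20, 50.0)      # ≈ -11.061 feet: the model overestimates here
```

This mirrors the arithmetic the exercises ask for: plug the speed into the line for a prediction, then subtract the prediction from the actual value for the residual.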

7.2.2 Interpreting Slope and Intercept

Again, suppose we have the linear regression model

    ŷ = b1 x + b0

In general terms, we might interpret the model as follows:

1. For every 1-unit increase in x, y increases (or decreases) by b1.
2. When x = 0, then y = b0.

This is a template for writing your own interpretations. Remember to include the proper units when writing an interpretation. Note that sometimes, especially for the intercept, you will arrive at an absurd conclusion. Let us take a look at the next example.

Example (Interpreting Slope and Intercept)

New snowboarders (those who have snowboarded a year or less) often suffer from minor injuries. A random sample of seven new snowboarders produced data on the number of months snowboarding and the number of minor injuries in the last month that they snowboarded. The linear regression equation is

    Minor Injuries = 9.5904 − 0.7349 × (Months Snowboarding)    (r = −0.9614)

1. Identify the slope and write a one-sentence interpretation.
2. Identify the y-intercept and write a one-sentence interpretation.

3. If a new snowboarder has snowboarded for five (5) months, how many minor injuries would you predict he or she had in the last month of snowboarding?
4. If a new snowboarder had 4 minor injuries after snowboarding for only 5 months, what is the residual for this amount of time?

7.3 Determining the Regression Line

In this class, you will not be asked to compute the parameters of the regression line by hand. However, we give the formulas here to make some observations:

    b1 = r (sy / sx)        b0 = ȳ − b1 x̄

where, in case you don't remember, r is the correlation coefficient, and sx and sy are the sample standard deviations of x and y, respectively. Note that the correlation plays a role in creating the line. Also note that because the means and standard deviations appear in the formulas, the regression line is influenced by outliers, since those statistics are themselves influenced by outliers. Finally, note that most calculators compute the coefficients of the regression line as part of finding the correlation coefficient.

7.4 Preparation for the Quiz

Practice Problems, Chapter 8: 1, 5, 27, 33, 34, 38, 40
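As a quick numerical sanity check of the slope and intercept formulas in Section 7.3, the following Python sketch (with made-up data) computes b1 = r(sy/sx) and b0 = ȳ − b1 x̄ and compares them with a direct least-squares fit:

```python
import numpy as np

# Hypothetical data for checking b1 = r * (sy / sx) and b0 = ybar - b1 * xbar.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0])

r = np.corrcoef(x, y)[0, 1]            # correlation coefficient
sx = x.std(ddof=1)                     # sample standard deviation of x
sy = y.std(ddof=1)                     # sample standard deviation of y

b1 = r * (sy / sx)                     # slope from the formula
b0 = y.mean() - b1 * x.mean()          # intercept from the formula

# The same line obtained by direct least-squares fitting:
b1_fit, b0_fit = np.polyfit(x, y, 1)
```

The two routes agree (here b1 = 0.9 and b0 = 1.3), which is the point of the formulas: the least-squares slope is just the correlation rescaled by the ratio of the standard deviations, and the line always passes through the point (x̄, ȳ).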