Chapter 12 Summarizing Bivariate Data Linear Regression and Correlation

This chapter introduces an important method for making inferences about a linear correlation (or relationship) between two variables, and for describing such a relationship with an equation that can be used to predict the value of one variable given the value of the other. We will look only at linear relationships, which means that when graphed, the points approximate a straight-line pattern. We will also introduce a way of measuring the strength of a linear relationship.

Data that consist of ordered pairs are called bivariate data. A scatterplot is a display of ordered pairs plotted on a set of axes.

Correlation

Two variables have a linear relationship if the data tend to cluster around a ... when plotted on a scatterplot. A strong correlation means that the points in the scatterplot are closely clustered around a straight line.

Two variables are positively associated if large values of one variable are associated with ... values of the other.
Two variables are negatively associated if large values of one variable are associated with ... values of the other.

The linear correlation coefficient, denoted r, measures the strength of the linear relationship between the paired quantitative x- and y-values in a sample. The value of r is always between -1 and +1 inclusive. That is, -1 ≤ r ≤ 1.

If r = 1, there is a perfect positive linear correlation (or association).
If r is close to 1, there is a strong positive linear correlation.
If r is positive but close to zero, there is a weak positive linear correlation.
If r = 0, there is no linear correlation.
If r is negative but close to zero, there is a weak negative linear correlation.
If r is close to -1, there is a strong negative linear correlation.
If r = -1, there is a perfect negative linear correlation.

Interchanging all x- and y-values will not change r.

A common error is to conclude that correlation implies causality, meaning that one variable causes the results of the other. Correlation is NOT causation. There might be other variables, known or unknown, that affect both of the variables we are studying; these are called confounders.

Remember that r measures the strength of a linear relationship, not the strength of a non-linear one. The graphs below certainly seem to show variables with some sort of relationship, but not a linear one, so r will not be helpful in these cases.

In this class, we will not spend time on the tedious calculations required to find a correlation coefficient or the linear regression equation (next section) by hand; we will use our calculators instead. The emphasis will be on interpreting the results.
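The properties of r listed above can be checked numerically. The following is a minimal sketch, not the calculator procedure used in class: the helper name pearson_r and the small data sets are made up for illustration. It shows r = 1 for points on an increasing line, r = 0 for an exact but non-linear (parabolic) relationship, and the same r when x and y are interchanged.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient r for paired data (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up illustration data, not taken from the notes
xs = [-3, -2, -1, 0, 1, 2, 3]
line = [2 * x + 1 for x in xs]    # points exactly on an increasing line
parabola = [x ** 2 for x in xs]   # exact relationship, but not a linear one

r_line = pearson_r(xs, line)          # perfect positive linear correlation
r_parabola = pearson_r(xs, parabola)  # r does not detect the curved pattern
r_swapped = pearson_r(line, xs)       # x and y interchanged

print(r_line, r_parabola, r_swapped)
```

The parabola case is the point of the warning about non-linear relationships: y is completely determined by x, yet r gives no sign of it.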

Ex. Global Warming: The following data set shows the temperatures at different levels of CO2.

CO2 (in parts per million)   314   317   320   326   331   339   346   354   361   369
Temperature (in Celsius)     13.9  14.0  13.9  14.1  14.0  14.3  14.1  14.5  14.5  14.4

(a) Draw a scatterplot, first by hand, then using your calculator, where the amount of CO2 is the independent variable and the temperature is the dependent variable.
(b) Find the correlation coefficient.

The Least Squares Regression Line

If one can rent a car for $180/week plus $0.5/mile, we can write an equation for the cost of renting a car for a week as y = 180 + 0.5x, where y represents the cost per week and x represents the number of miles driven in a week. This linear equation gives an exact value of y for any given x.

Often, however, the variables do not have an exact relationship in which one variable is completely determined by the other. If they nevertheless appear to have a linear relationship, we can find the graph and equation of the straight line that best represents that relationship. This straight line is called the least squares regression line (or line of best fit).
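For reference, "least squares" can be stated precisely. The notation below is assumed here rather than taken from the notes: the data are n pairs (x_i, y_i) with means \bar{x} and \bar{y}, and the line is written as y = a + bx. The line of best fit is the one that makes the sum of the squared vertical distances from the points to the line as small as possible, which leads to the standard formulas for the slope b and the intercept a:

```latex
\[
\min_{a,\,b}\ \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2
\quad\Longrightarrow\quad
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}.
\]
```

A calculator's linear regression routine reports these same two numbers, so the formulas are only a description of what it computes.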

The general form of this linear regression equation is y = a + bx (compare this equation with the more familiar form y = mx + b). The regression line is seen as a measure of the mean value of y for a given value of x.

Describe the following:
x =
y =
a =
b =

The [variable] is [increasing/decreasing] by [insert number and units] for every [insert number and units] increase in [variable].

Positive linear relationship:
Negative linear relationship:

Describe how we graph a line y = a + bx:

Requirements for finding a linear regression line and its correlation coefficient:
1. The sample of paired data is a random sample of quantitative data.
2. Visual examination of the scatterplot shows that the points approximate a straight-line pattern.
3. Any outliers must be removed if they are known to be errors. Consider the effects of any outliers that are not known errors. (In a scatterplot, an outlier is a point lying far away from the other data points.)

Describe how to find a linear regression equation on a TI83/84:

More problems that apply to the Global Warming example on the previous page:

(c) Find the equation of the regression line, where the amount of CO2 is the independent variable and the temperature is the dependent variable.
(d) Draw the regression line in the same coordinate system as the scatterplot.
(e) Mark the errors (residuals) between the actual points and the corresponding points on the regression line. (In general I will not ask you to draw the residuals, only for this problem.)
(f) Interpret the slope for this particular problem, using correct units.
(g) Interpret the y-intercept for this particular problem, using correct units.
(h) Find the predicted temperature for a recent year in which the concentration of CO2 is 370.9. Is the predicted temperature close to the actual temperature of 14.5°C?
(i) If the CO2 increases by 15 parts per million, how much would you expect the temperature to change?
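As a way to check calculator answers for this example, here is a minimal Python sketch that fits the line by the standard least-squares formulas. It is not the class procedure, which uses the TI-83/84. The function name fit_line is made up for illustration, and the third and fourth CO2 readings are taken to be 320 and 326, an assumption about the data table.

```python
from math import sqrt

# Global Warming example: CO2 in parts per million, temperature in Celsius.
# Assumption: the third and fourth CO2 readings are 320 and 326.
co2 = [314, 317, 320, 326, 331, 339, 346, 354, 361, 369]
temp = [13.9, 14.0, 13.9, 14.1, 14.0, 14.3, 14.1, 14.5, 14.5, 14.4]

def fit_line(xs, ys):
    """Return (a, b, r) for the least-squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx             # slope: change in temperature per 1 ppm of CO2
    a = mean_y - b * mean_x   # y-intercept
    r = sxy / sqrt(sxx * syy) # linear correlation coefficient
    return a, b, r

a, b, r = fit_line(co2, temp)

# Residual = actual temperature minus the temperature on the regression line
residuals = [y - (a + b * x) for x, y in zip(co2, temp)]

predicted = a + b * 370.9   # predicted temperature at 370.9 ppm
change_for_15_ppm = b * 15  # expected change for a 15 ppm increase

print(f"r = {r:.3f}")
print(f"y = {a:.2f} + {b:.4f}x")
print(f"predicted temperature at 370.9 ppm: {predicted:.2f}")
print(f"expected change for +15 ppm: {change_for_15_ppm:.3f}")
```

With these data the fit should come out close to r ≈ 0.89 and y ≈ 10.5 + 0.0109x, which gives a prediction of about 14.5 at 370.9 ppm. Interpreting the slope and intercept in context is still left as the exercise asks.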

Regression lines are often useful for predicting the value of one variable, given some particular value of the other variable. If the regression line fits the data quite well, then it makes sense to use its equation for predictions (use a scatterplot and the correlation coefficient to determine how well the line fits). However, don't base predictions on values that are far beyond the boundaries of the known sample data, as the linear relationship may not hold true there.

An influential point is a point that, when included in a scatterplot, strongly affects the position of the least-squares regression line.

Ex. When a scatterplot contains outliers:
Compute the least-squares regression line both with and without each outlier to determine which outliers are influential.
Report the equations of the least-squares regression line both with and without each influential point.
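The "with and without" comparison can be sketched as follows. The data are hypothetical, chosen so that five points lie exactly on y = 2x and a sixth point sits far from that pattern. The helper name least_squares is also made up for illustration.

```python
def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data: the first five points lie on y = 2x; the last is an outlier.
xs = [1, 2, 3, 4, 5, 10]
ys = [2, 4, 6, 8, 10, 0]

# Fit once with every point, once with the outlier (the last point) left out.
a_with, b_with = least_squares(xs, ys)
a_without, b_without = least_squares(xs[:-1], ys[:-1])

print(f"with outlier:    y = {a_with:.2f} + {b_with:.2f}x")
print(f"without outlier: y = {a_without:.2f} + {b_without:.2f}x")
```

Here the single extra point turns a slope of 2 into a negative slope, so it is influential and both equations would be reported.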