AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

Scatterplots and Correlation

A scatterplot shows the relationship between two quantitative variables measured on the same individuals.

A response variable (y) measures an outcome of a study.
An explanatory variable (x) may help explain or influence changes in a response variable.

**Remember, the explanatory variable goes on the X-axis!**

EXAMPLE 1: Identify the explanatory and response variables:
Mrs. Sapp is interested in the relationship between the hours students spend studying for an exam and their score on the exam.
A researcher is interested in the effects a new drug has on reducing muscle spasms.

How to Make a Scatterplot:
1. Decide which variable is explanatory (x) and which is response (y).
2. Label and scale your axes. (Note: The axes don't need to intersect at (0, 0).)
3. Plot individual values.

EXAMPLE 2: Make a scatterplot of the relationship between body weight and pack weight.

Interpreting Scatterplots

As in any graph of data, look for the overall pattern (DSS: Direction, Shape, Strength) and for striking departures from that pattern.

On the AP Exam, you will need to mention three important characteristics of the scatterplot:

Direction: Two variables have a positive association when above-average values of one tend to accompany above-average values of the other, and when below-average values also tend to occur together. (i.e., Generally speaking, the y values tend to increase as the x values increase.) Two variables have a negative association when above-average values of one tend to accompany below-average values of the other. (i.e., Generally speaking, the y values tend to decrease as the x values increase.)

Shape: Do the data appear linear or curved?

Strength: If the points cluster closely around an imaginary line, the association is strong. If the points are scattered farther from the line, the association is weak.

Outliers or Influential Points:
Outlier: an individual value that falls outside the overall pattern of the relationship. Ask yourself: ________? In a regression setting, an outlier is a data point with a large residual.
Influential point: when removed, the ________ of the relationship significantly changes (it influences where the LSRL is located).
Typically, if an observation is an outlier, it will be influential.

Positive or Negative Relationship?
a) Minutes spent studying and exam score
b) Age of vehicle and value of vehicle
c) Age and bone density
d) Write down a positive example:

EXAMPLE 3:

EXAMPLE 4: Can Mrs. Sapp be bribed with chocolate?

Correlation

The correlation r measures the strength of the linear relationship between two quantitative variables.

r is always a number between -1 and 1.
r > 0 indicates a positive association.
r < 0 indicates a negative association.
Values of r near 0 indicate a very weak linear relationship.
The strength of the linear relationship increases as r moves away from 0 towards -1 or 1.
The extreme values r = -1 and r = 1 occur only in the case of a perfect linear relationship.

FACTS about correlation:
1. Correlation makes no distinction between explanatory and response variables.
2. r does not change when we change the units of measurement of x, y, or both.
3. The correlation r itself has no unit of measurement.
4. The correlation coefficient is -1 only when all the points lie on a downward-sloping line, and +1 only when all the points lie on an upward-sloping line.
5. The value of r is a measure of the extent to which x and y are linearly related.

IMPORTANT: A value of r close to zero does not rule out any strong relationship; it just rules out a linear one.
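The facts above can be checked numerically. The following is a minimal Python sketch of the usual formula r = (1/(n - 1)) Σ z_x z_y. The hours and scores are made-up numbers for illustration, not data from this packet, and the function name `correlation` is just a label chosen here.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    z_products = [((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical data (not from the packet): hours studied and exam score.
hours = [1, 2, 3, 4, 5, 6]
scores = [62, 70, 68, 80, 85, 91]

r = correlation(hours, scores)

# Fact 1: swapping explanatory and response variables leaves r unchanged.
r_swapped = correlation(scores, hours)

# Fact 2: changing units (hours to minutes) leaves r unchanged.
minutes = [60 * h for h in hours]
r_minutes = correlation(minutes, scores)

# Fact 4: points exactly on a line give r = +1 (upward) or r = -1 (downward).
r_up = correlation([1, 2, 3], [10, 20, 30])
r_down = correlation([1, 2, 3], [30, 20, 10])

# IMPORTANT note: a perfect curved (quadratic) relationship can still have r = 0.
r_curved = correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(f"r = {r:.3f}, swapped = {r_swapped:.3f}, minutes = {r_minutes:.3f}")
print(f"upward line: {r_up:.3f}, downward line: {r_down:.3f}, curve: {r_curved:.3f}")
```

The last case is the one to remember: y is completely determined by x, yet r is 0 because the relationship is not linear.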

CAUTIONS:
Correlation requires that both variables be quantitative.
Correlation does not describe curved relationships between variables, no matter how strong the relationship is.
Correlation is not resistant. (r is strongly affected by a few outlying observations.)
Correlation is not a complete summary of two-variable data.

EXAMPLE 5: Interpret the relationship between the variables.
a)
b)
x       y
1.2     23.3
2.5     21.5
6.5     12.2
13.1    3.9
24.2    4.0
34.1    18.0
20.8    1.7
37.5    26.1

Correlation and Causation
1. Correlation measures the extent of association (strong, moderate, or weak), but association does not imply causation!
2. It can frequently happen that two variables are highly correlated not because one is causally related to the other but because they are both strongly related to a third variable (called a confounding variable). For instance, why do high values of hot chocolate consumption tend to be paired with lower crime rates?
3. The only way to make a strong case for causation is by conducting a well-controlled scientific experiment!

Least Squares Regression

ŷ = a + bx

ŷ is the predicted value of the response variable y.
b is the slope, the amount by which y is predicted to change when x increases by one unit.
a is the y-intercept, the predicted value of y when x = 0.

EXAMPLE 6: Does Fidgeting Keep You Slim?
fat gain = 3.505 - 0.00344(NEA change)
Slope:
y-intercept:
Predict the fat gain when NEA = 400 calories:

EXAMPLE 7: The ages (in months) and heights (in inches) of seven children are given.
x   16   24   42   60   75   102   120
y   24   30   35   40   48   56    60
Determine the LSRL (Least Squares Regression Line):
Interpret the slope:
Interpret the y-int:
Predict the height of a child who is 4.5 years old:
Predict the height of someone who is 20 years old:
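As a check on calculator work, here is a minimal Python sketch that evaluates the Example 6 equation and fits the LSRL to the Example 7 data. It uses the standard least-squares formulas b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄. The function and variable names are chosen here for illustration only.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using the least-squares formulas."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx           # slope
    a = mean_y - b * mean_x   # intercept: the line passes through (x-bar, y-bar)
    return a, b

# Example 6: plug NEA change = 400 into the given equation.
fat_gain_at_400 = 3.505 - 0.00344 * 400

# Example 7: ages in months (x) and heights in inches (y).
ages_months = [16, 24, 42, 60, 75, 102, 120]
heights_in = [24, 30, 35, 40, 48, 56, 60]
a, b = least_squares_line(ages_months, heights_in)

# 4.5 years = 54 months, which is inside the range of the data.
height_at_54_months = a + b * 54
# 20 years = 240 months, far outside the data: this is extrapolation.
height_at_240_months = a + b * 240

print(f"Example 6 predicted fat gain: {fat_gain_at_400:.3f}")
print(f"Example 7 LSRL: y-hat = {a:.2f} + {b:.3f} x")
print(f"Predicted height at 54 months: {height_at_54_months:.1f} in")
print(f"Predicted height at 240 months: {height_at_240_months:.1f} in")
```

The fitted line is roughly ŷ = 20.40 + 0.342x. The 240-month prediction comes out to more than 100 inches (over 8 feet), which shows why predictions outside the range of the x-values should not be trusted.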

Extrapolation is the use of a regression line for predictions outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate.

RESIDUALS:
The vertical distance between the observations & the LSRL.
The sum of the residuals is always 0.
FORMULA: residual = observed y - predicted y = y - ŷ

Residual plots:
A scatterplot of the (x, residual) pairs.
Purpose is to tell if a linear association exists between the x & y variables.
If no pattern exists between the points in the residual plot, then the association is linear.
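To make the residual idea concrete, the sketch below computes residual = y - ŷ for each child in the Example 7 age and height data and confirms that the residuals add up to zero (apart from floating-point rounding). It is a minimal illustration with names chosen here. Drawing the residual plot itself would need a plotting library, so that step is left out.

```python
# Example 7 data: ages in months (x) and heights in inches (y).
ages_months = [16, 24, 42, 60, 75, 102, 120]
heights_in = [24, 30, 35, 40, 48, 56, 60]

# Fit the LSRL y-hat = a + b*x with the least-squares formulas.
n = len(ages_months)
mean_x = sum(ages_months) / n
mean_y = sum(heights_in) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ages_months, heights_in))
     / sum((x - mean_x) ** 2 for x in ages_months))
a = mean_y - b * mean_x

# residual = observed y - predicted y
predicted = [a + b * x for x in ages_months]
residuals = [y - y_hat for y, y_hat in zip(heights_in, predicted)]
residual_sum = sum(residuals)

# These (x, residual) pairs are the points of the residual plot.
for x, e in zip(ages_months, residuals):
    print(f"age {x:>3} months: residual {e:+.2f} in")
print(f"sum of residuals: {residual_sum:.10f}")
```

A positive residual means the child is taller than the line predicts; a negative residual means shorter.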

EXAMPLES:

EXAMPLE 8:
Determine the LSRL (Least Squares Regression Line):
Find the correlation coefficient (r):
Interpret the slope:
Interpret the y-int:
Predict the range of motion for a 29-year-old:
Predict the range of motion for a 50-year-old:
Make a residual plot. What does this plot tell you about the linearity of the data?
Calculate the residual for age 24:
Calculate the residual for age 14:

Outliers and Influential Points:

Standard deviation formula: s = sqrt( Σ(y - ŷ)² / (n - 2) )
Interpretation: The distance a typical value is from the LSRL.

COEFFICIENT OF DETERMINATION (r²): the percent of variation in y that can be explained by the least-squares regression line of y on x.
FORMULA: r² = 1 - SSResid / SSTo
MEMORIZE this statement!

Example: Referring to the age and range of motion data, how well does age predict the range of motion after knee surgery?
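The range-of-motion data are not reproduced in this transcript, so the sketch below uses the Example 7 age and height data instead. It computes the standard deviation of the residuals, s = sqrt(SSResid / (n - 2)), and the coefficient of determination, r² = 1 - SSResid / SSTo, where SSResid is the sum of squared residuals and SSTo is the sum of squared deviations of y from its mean. Variable names are illustrative choices.

```python
from math import sqrt

# Example 7 data: ages in months (x) and heights in inches (y).
ages_months = [16, 24, 42, 60, 75, 102, 120]
heights_in = [24, 30, 35, 40, 48, 56, 60]

# Fit the LSRL y-hat = a + b*x.
n = len(ages_months)
mean_x = sum(ages_months) / n
mean_y = sum(heights_in) / n
b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ages_months, heights_in))
     / sum((x - mean_x) ** 2 for x in ages_months))
a = mean_y - b * mean_x

# SSResid: squared vertical distances from the line.
ss_resid = sum((y - (a + b * x)) ** 2 for x, y in zip(ages_months, heights_in))
# SSTo: squared vertical distances from the mean of y.
ss_total = sum((y - mean_y) ** 2 for y in heights_in)

s = sqrt(ss_resid / (n - 2))        # typical distance of a point from the LSRL
r_squared = 1 - ss_resid / ss_total  # fraction of variation in y explained by the LSRL

print(f"s = {s:.2f} inches")
print(f"r^2 = {r_squared:.3f}")
```

In words: heights typically fall roughly 1.6 inches from the line, and close to 99% of the variation in height is explained by the least-squares regression line of height on age.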

COMPUTER OUTPUT

Example continued

Minitab output looks like:

Regression Analysis: % Fat y versus Age (x)

The regression equation is
% Fat y = 3.22 + 0.548 Age (x)

Predictor    Coef      SE Coef    T       P
Constant     3.221     5.076      0.63    0.535
Age (x)      0.5480    0.1056     5.19    0.000

S = 5.754    R-Sq = 62.7%    R-Sq(adj) = 60.4%

Analysis of Variance
Source            DF    SS         MS        F        P
Regression         1     891.87    891.87    26.94    0.000
Residual Error    16     529.66     33.10
Total             17    1421.54

Annotations:
Regression line: % Fat y = 3.22 + 0.548 Age (x)
Estimated y intercept a: 3.221 (Coef of Constant)
Estimated slope b: 0.5480 (Coef of Age (x))
Residual df = n - 2: 16
SSResid: 529.66
SSTo: 1421.54
s_e²: 33.10 (MS of Residual Error)

EXAMPLE 1:

EXAMPLE 2:
a) Write the equation of the LSRL.
b) State and interpret the correlation coefficient.
c) State and interpret the standard deviation.
d) State and interpret the slope.
e) State and interpret the coefficient of determination.
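Questions like these can be answered directly from computer output. As an illustration, the sketch below takes the numbers printed in the Minitab output for % fat versus age and shows how S, R-Sq, the T statistic for the slope, and r are related to the Analysis of Variance table. It is a consistency check written for this packet, not part of Minitab, and the variable names are chosen here.

```python
from math import sqrt

# Values copied from the Minitab output for % Fat versus Age.
coef_slope = 0.5480     # estimated slope b
se_slope = 0.1056       # SE Coef for the slope
ss_resid = 529.66       # SS for Residual Error (SSResid)
ss_total = 1421.54      # SS for Total (SSTo)
df_resid = 16           # residual df = n - 2

n = df_resid + 2                      # number of observations
s = sqrt(ss_resid / df_resid)         # standard deviation of the residuals (S)
r_squared = 1 - ss_resid / ss_total   # coefficient of determination (R-Sq)
t_slope = coef_slope / se_slope       # T statistic for the slope

# r has the same sign as the slope, which is positive here.
r = sqrt(r_squared)

print(f"n = {n}")
print(f"S = {s:.3f}")
print(f"R-Sq = {100 * r_squared:.1f}%")
print(f"T for slope = {t_slope:.2f}")
print(f"r = {r:.3f}")
```

The computed values agree with the printed S = 5.754, R-Sq = 62.7%, and T = 5.19, and they give a correlation of about r = 0.79.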