Bivariate Data Summary

Similar documents
Describing Bivariate Relationships

5.1 Bivariate Relationships

BIVARIATE DATA data for two variables

AP Statistics Two-Variable Data Analysis

Session 4 2:40 3:30. If neither the first nor second differences repeat, we need to try another

4.1 Introduction. 4.2 The Scatter Diagram. Chapter 4 Linear Correlation and Regression Analysis

Chapter 3: Examining Relationships

6.1.1 How can I make predictions?

Scatterplots and Correlation

Chapter 12 Summarizing Bivariate Data Linear Regression and Correlation

Prob/Stats Questions? /32

Chapter 3: Examining Relationships

The response variable depends on the explanatory variable.

Intermediate Algebra Summary - Part I

Least Squares Regression

Section 2.2: LINEAR REGRESSION

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

Mathematical Modeling

Chapter 3: Describing Relationships

Chapter 8. Linear Regression /71

Reteach 2-3. Graphing Linear Functions. 22 Holt Algebra 2. Name Date Class

Nov 13 AP STAT. 1. Check/rev HW 2. Review/recap of notes 3. HW: pg #5,7,8,9,11 and read/notes pg smartboad notes ch 3.

Summarizing Data: Paired Quantitative Data

Least-Squares Regression. Unit 3 Exploring Data

Reminder: Univariate Data. Bivariate Data. Example: Puppy Weights. You weigh the pups and get these results: 2.5, 3.5, 3.3, 3.1, 2.6, 3.6, 2.

Unit 8 - Polynomial and Rational Functions Classwork

THE PEARSON CORRELATION COEFFICIENT

a. Length of tube: Diameter of tube:

3.2: Least Squares Regressions

x is also called the abscissa y is also called the ordinate "If you can create a t-table, you can graph anything!"

CHAPTER 3 Describing Relationships

y n 1 ( x i x )( y y i n 1 i y 2

Scatterplots and Correlations

Chapter 3: Describing Relationships

AP Statistics Unit 6 Note Packet Linear Regression. Scatterplots and Correlation

Determine is the equation of the LSRL. Determine is the equation of the LSRL of Customers in line and seconds to check out.. Chapter 3, Section 2

Scatterplots. 3.1: Scatterplots & Correlation. Scatterplots. Explanatory & Response Variables. Section 3.1 Scatterplots and Correlation

Correlation A relationship between two variables As one goes up, the other changes in a predictable way (either mostly goes up or mostly goes down)

Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation. Day #1

Regression Using an Excel Spreadsheet Using Technology to Determine Regression

3.1 Scatterplots and Correlation

Chapter 4 Data with Two Variables

Chapter 6 Scatterplots, Association and Correlation

Algebra II Notes Quadratic Functions Unit Applying Quadratic Functions. Math Background

Chapter 9. Correlation and Regression

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships

Chapter 4 Data with Two Variables

Chapter 6: Exploring Data: Relationships Lesson Plan

2.1 Scatterplots. Ulrich Hoensch MAT210 Rocky Mountain College Billings, MT 59102

IF YOU HAVE DATA VALUES:

Relationships Regression

Chapter 7. Scatterplots, Association, and Correlation

appstats8.notebook October 11, 2016

Name. The data below are airfares to various cities from Baltimore, MD (including the descriptive statistics).

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

7. Do not estimate values for y using x-values outside the limits of the data given. This is called extrapolation and is not reliable.

S12 - HS Regression Labs Workshop. Linear. Quadratic (not required) Logarithmic. Exponential. Power

AP CALCULUS AB SUMMER ASSIGNMNET NAME: READ THE FOLLOWING DIRECTIONS CAREFULLY

Review of Section 1.1. Mathematical Models. Review of Section 1.1. Review of Section 1.1. Functions. Domain and range. Piecewise functions

Algebra I Notes Modeling with Linear Functions Unit 6

MINI LESSON. Lesson 2a Linear Functions and Applications

Related Example on Page(s) R , 148 R , 148 R , 156, 157 R3.1, R3.2. Activity on 152, , 190.

Linear Regression 3.2

INFERENCE FOR REGRESSION

Overview. 4.1 Tables and Graphs for the Relationship Between Two Variables. 4.2 Introduction to Correlation. 4.3 Introduction to Regression 3.

SCATTERPLOTS. We can talk about the correlation or relationship or association between two variables and mean the same thing.

Review of Regression Basics

Section I: Multiple Choice Select the best answer for each question.

Chapter 2: Looking at Data Relationships (Part 3)

AP STATISTICS Name: Period: Review Unit IV Scatterplots & Regressions

Example: Can an increase in non-exercise activity (e.g. fidgeting) help people gain less weight?

Linear Regression and Correlation. February 11, 2009

Unit 6 - Introduction to linear regression

Chapter 6. Exploring Data: Relationships

Steps to take to do the descriptive part of regression analysis:

Describing the Relationship between Two Variables

Graphing Equations in Slope-Intercept Form 4.1. Positive Slope Negative Slope 0 slope No Slope

Announcements: You can turn in homework until 6pm, slot on wall across from 2202 Bren. Make sure you use the correct slot! (Stats 8, closest to wall)

Connecticut Common Core Algebra 1 Curriculum. Professional Development Materials. Unit 8 Quadratic Functions

Fish act Water temp

MAC Module 2 Modeling Linear Functions. Rev.S08

Chapter 11. Correlation and Regression

x and y, called the coordinates of the point.

Module 1 Linear Regression

Chapter 10 Correlation and Regression

AP Stats ~ 3A: Scatterplots and Correlation OBJECTIVES:

Chapter 4 Describing the Relation between Two Variables

Pure Math 30: Explained!

Ch Inference for Linear Regression

If the roles of the variable are not clear, then which variable is placed on which axis is not important.

Chapter 5 Friday, May 21st

Chapter 7 Linear Regression

NUMB3RS Activity: How Does it Fit?

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

Lecture 4 Scatterplots, Association, and Correlation

OHS Algebra 2 Summer Packet

Regression, part II. I. What does it all mean? A) Notice that so far all we ve done is math.

Year 10 Mathematics Semester 2 Bivariate Data Chapter 13

s e, which is large when errors are large and small Linear regression model

EX: Simplify the expression. EX: Simplify the expression. EX: Simplify the expression

Transcription:

Bivariate Data Summary Bivariate data data that examines the relationship between two variables What individuals to the data describe? What are the variables and how are they measured Are the variables quantitative or categorical Types of bivariate data Response variable measures the outcome of a study Explanatory variables attempts to explain (not cause) the response variable to determine which is explanatory and which is response, think about which one seems to be a possible explanation of the other. if it is not obvious which one is explanatory and which is response, it very well be that it doesn t matter. Scatterplots shows the relationship between two quantitative variables measured on the same individuals. The values of one variable appears on the horizontal (x) axis and the other axis appears on the vertical (y) axis. Each individual appears as a point in the plot. The explanatory variable is placed on the x-axis and the response variable is placed on the y-axis. If there is no explanatoryresponse relationship, either variable can go on the horizontal axis. Scatterplots must be labeled (both axes). The intervals on each axis must be uniform. Interpreting scatterplots Look for the overall pattern and deviations from that pattern. Form does the data appear linear or curved or have distinct clusters? Direction the term used is association Positive association low values of the explanatory variable accompanies low values of the response variable and high values of the explanatory variable accompanies high values of the response variable Negative association low values of the explanatory variable accompanies high values of the response variable and high values of the explanatory variable accompanies low values of the response variable Strength of an association how closely the points follow a clear form. Both of the associations above are strong. Outliers a point that falls outside the overall pattern of the relationship. As usual, outliers should be eliminated if it can be justified. www.mastermathmentor.com - 1 - Stu Schwartz

Using the calculator: Place the data in your lists: STAT EDIT Set up your plots: 2nd Y = Then: ZOOM 9 : ZoomStat Be sure that you have no graphs in your Y = list. Correlation measures the direction and strength of the linear relationship between two quantitative variables. The variable to denote correlation is r. Facts about correlation: 1. r is a number between -1 and 1 inclusive. Positive values of r indicates a positive association between variables. Negative values of r indicates a negative association between variables. 2. Values of r near 1 or -1 mean a very strong association (the points are very close to forming a straight line). Values of r near 0 mean a very weak association. If r is exactly 1 or -1, the association is perfect (rarely happens). 3. Correlation makes no difference between explanatory and response variables. It makes no difference which variable is x and which is y when calculating r. 4. When calculating r, both variables must be quantitative. 5. r does not change when we change the units of measurements of x, y, or both. 6. Correlation measures the strength of a linear relationship between variable. It does not relationships that are curved no matter how strong they appear to be. 7. r is non-resistant it is affected strongly by outliers. 8. In general, we will make this claim: $ r ".9...Very strong association.7 # r <.9...Fairly strong association %.5 # r <.7...Moderately strong association.2 # r <.5...Fairly weak association ' r <.2...Very weak association www.mastermathmentor.com - 2 - Stu Schwartz

The formula for correlation: r = 1 # x " x # % ( y " y ) n "1 $ s % x ' $ s (. You are not responsible for this formula. y ' We will find r using the calculator (below). Least Squares Regression a method for finding a line that summarizes the relationship between two variables. Regression line a straight line that describes how a response variable y that changes as an explanatory variable x changes. Regression lines re use to predict the value of y for a given value of x. Regression requires an explanatory variable and a response variable. Least squares regression line (LSRL) the line that makes the sum of the squares of the vertical distances of the of the vertical distances of the data points from the line as small as possible. (Fathom Demo) Important: Every LSRL goes through the point ( x, y ) Equation of the LSRL The regression line is in the form y ) = a + bx (similar to what you used in algebra. The y ) (read y-hat) is used to denote the predicted value of y based on the value of x. The values of a and b are based on the means x and y and the standard deviations s x and s y as well as the correlation r. slope b = r s y s x y-intercept a = y " bx You will not have to find the LSRL using these formulas. Using the Calculator: Have your data in your lists as shown above and create a scatterplot. STAT CALC 4 : LinReg(ax + b) ENTER If you do not specify the lists, the calculator will automatically do regression on L1 and L2. If you want regression on lists other than L1 and L2 you must specify (ex: LinReg(ax+b) L2,L4) This is your result: www.mastermathmentor.com - 3 - Stu Schwartz

If you want to actually graph the line, do these steps: STAT CALC 4 : LinReg(ax + b) VARS Y - VARS Function Y1 Enter Press Y = and your equation will be in Y1:. Now GRAPH If you want to find the value of r (correlation), do this: 2nd CATALOG D DiagnosticOn ENTER. When you do LSRL on the calculator, the value of r will be present:. Note that r 2 (below is also given. Once you turn diagnostic on, it will stay on (so just leave it on it doesn t hurt anything). Using the LSRL: Suppose we want to predict the value of y when x = 4. You could plug x = 4 into the equation above, but it is easier to let the calculator do the work. Graphic way: 2nd CALC Value ENTER 4 ENTER Function method: Vars Y - Vars Function Y1 ENTER (4 Enter This says that the predicted value of y when x = 4 is 9.14. www.mastermathmentor.com - 4 - Stu Schwartz

Coefficient of determination - r 2. This represents the percentage of the variation in y that can be explained by the LSRL. Memorize these words. If for instance, you are measuring the relationship to the amount of studying one does compared to the grades in a test, and you get r =.9 (strong positive association), then r 2 =.81. You say that 81% of the variation in grades in the test can be explained by the LSRL of study time on grades. That means that 19% of the variation in grades can be explained by other factors other that study time. Residuals The different between an observed value of the response variable and the value predicted By the LSRL. The formula for a residual is: Residual = observed y " predicted y = y " ) y For any LSRL, the sum of the residuals is always zero. Residual Plots a scatterplot of the regression residuals against the explanatory variable x. Residual plots help us to assess how good a line is in describing the data. You look at the scatterplot of the residuals and determine information about the original data. Important: If a residual scatterplot shows no pattern, then the original data has a good linear fit. Original data Residual plot If a residual scatterplot shows a curved pattern, then the original data does not have a good linear fit. Original data Residual plot If the residual scatterplot shows no pattern but has an increasing or decreasing spread about the line, a linear fit can be used but the LSRL will be less accurate for larger values of y. Original data Residual plot www.mastermathmentor.com - 5 - Stu Schwartz

Using the Calculator: For the data above, the way you create a residual plot is to follow these steps: 1) Put the data in your lists 2) Generate your scatterplot 3) Perform Least-squares regression on your data and graph the line 4) Go to StatPlot and go to your plot. Change the Ylist: to LRESID. This list is found in your List menu 2nd STAT : 5) Then press Zoom 9 : ZoomStat (You might want to turn your Y1 off) Influential observations: Outliers observations that lies outside the overall pattern of the other observations. - the last data point is clearly an outlier. Influential observations an observation that, if removing it would markedly change the results of the calculation of the LSRL and r. - the last data point is an influential point. With the influential point: Without the influential point: An outlier is influential. An influential point is not necessarily an outlier. www.mastermathmentor.com - 6 - Stu Schwartz