Looking at Data Relationships. 2.1 Scatterplots W. H. Freeman and Company

Similar documents
Objectives. 2.1 Scatterplots. Scatterplots Explanatory and response variables Interpreting scatterplots Outliers

Objectives. 2.1 Scatterplots. Scatterplots Explanatory and response variables. Interpreting scatterplots Outliers

Chapter 6: Exploring Data: Relationships Lesson Plan

Scatterplots. STAT22000 Autumn 2013 Lecture 4. What to Look in a Scatter Plot? Form of an Association

Deskription. Exempel 1. Exempel 1 (lösning) Normalfördelningsmodellen (forts.)

Chapter 14. Statistical versus Deterministic Relationships. Distance versus Speed. Describing Relationships: Scatterplots and Correlation

Describing Bivariate Relationships

5.1 Bivariate Relationships

Scatterplots and Correlations

Chapter 7 Summary Scatterplots, Association, and Correlation

Looking at data: relationships

Chapter 5 Friday, May 21st

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

CHAPTER 3 Describing Relationships

Chapter 7. Scatterplots, Association, and Correlation. Copyright 2010 Pearson Education, Inc.

Relationships Regression

Chapter 6. September 17, Please pick up a calculator and take out paper and something to write with. Association and Correlation.

Friday, September 21, 2018

Chapter 3: Examining Relationships

Lecture 3. The Population Variance. The population variance, denoted σ 2, is the sum. of the squared deviations about the population

Lecture 4 Scatterplots, Association, and Correlation

Lecture 4 Scatterplots, Association, and Correlation

AP Statistics Unit 2 (Chapters 7-10) Warm-Ups: Part 1

Chapter 2: Looking at Data Relationships (Part 3)

Chi-square tests. Unit 6: Simple Linear Regression Lecture 1: Introduction to SLR. Statistics 101. Poverty vs. HS graduate rate

UNIT 12 ~ More About Regression

Chapter 8. Linear Regression /71

STA Module 5 Regression and Correlation. Learning Objectives. Learning Objectives (Cont.) Upon completing this module, you should be able to:

Correlation: basic properties.

2.1 Scatterplots. Ulrich Hoensch MAT210 Rocky Mountain College Billings, MT 59102

Objectives. 2.3 Least-squares regression. Regression lines. Prediction and Extrapolation. Correlation and r 2. Transforming relationships

Math 147 Lecture Notes: Lecture 12

appstats8.notebook October 11, 2016

y n 1 ( x i x )( y y i n 1 i y 2

The empirical ( ) rule

Chapter 7. Scatterplots, Association, and Correlation

Linear Regression and Correlation. February 11, 2009

Announcements. Unit 6: Simple Linear Regression Lecture : Introduction to SLR. Poverty vs. HS graduate rate. Modeling numerical variables

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

Scatterplots and Correlation

AP Statistics - Chapter 2A Extra Practice

Chapter 6 Scatterplots, Association and Correlation

(quantitative or categorical variables) Numerical descriptions of center, variability, position (quantitative variables)

SCATTERPLOTS. We can talk about the correlation or relationship or association between two variables and mean the same thing.

Chapter 8. Linear Regression. The Linear Model. Fat Versus Protein: An Example. The Linear Model (cont.) Residuals

The following formulas related to this topic are provided on the formula sheet:

1-1. Chapter 1. Sampling and Descriptive Statistics by The McGraw-Hill Companies, Inc. All rights reserved.

M 225 Test 1 B Name SHOW YOUR WORK FOR FULL CREDIT! Problem Max. Points Your Points Total 75

MEASURES OF LOCATION AND SPREAD

Unit 5: Regression. Marius Ionescu 09/22/2011

Chapter 3: Examining Relationships

ECLT 5810 Data Preprocessing. Prof. Wai Lam

3.1 Scatterplots and Correlation

Overview. Overview. Overview. Specific Examples. General Examples. Bivariate Regression & Correlation

Correlation and Regression

Scatterplots and Correlation

Announcements. Lecture 10: Relationship between Measurement Variables. Poverty vs. HS graduate rate. Response vs. explanatory

appstats27.notebook April 06, 2017

Correlation and Regression

Math 223 Lecture Notes 3/15/04 From The Basic Practice of Statistics, bymoore

MATH 1150 Chapter 2 Notation and Terminology

Chapter 10. Correlation and Regression. Lecture 1 Sections:

Correlation. Relationship between two variables in a scatterplot. As the x values go up, the y values go down.

Announcements. Lecture 18: Simple Linear Regression. Poverty vs. HS graduate rate

Statistical Concepts. Constructing a Trend Plot

SECTION I Number of Questions 42 Percent of Total Grade 50

9. Linear Regression and Correlation

6.2b Homework: Fit a Linear Model to Bivariate Data

Chapter 6. Exploring Data: Relationships. Solutions. Exercises:

Section Linear Correlation and Regression. Copyright 2013, 2010, 2007, Pearson, Education, Inc.

Chapter 3: Describing Relationships

Relationships between variables. Association Examples: Smoking is associated with heart disease. Weight is associated with height.

Probability Distributions

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Sem. 1 Review Ch. 1-3

Correlation and Regression Notes. Categorical / Categorical Relationship (Chi-Squared Independence Test)

Chapter 5 Least Squares Regression

23. Inference for regression

LAB 3 INSTRUCTIONS SIMPLE LINEAR REGRESSION

Sociology 6Z03 Review I

Chapter 3: Describing Relationships

Scatterplots and Correlation

Start with review, some new definitions, and pictures on the white board. Assumptions in the Normal Linear Regression Model

Basic Practice of Statistics 7th

Related Example on Page(s) R , 148 R , 148 R , 156, 157 R3.1, R3.2. Activity on 152, , 190.

Warm-up Using the given data Create a scatterplot Find the regression line

For instance, we want to know whether freshmen with parents of BA degree are predicted to get higher GPA than those with parents without BA degree.

Nov 13 AP STAT. 1. Check/rev HW 2. Review/recap of notes 3. HW: pg #5,7,8,9,11 and read/notes pg smartboad notes ch 3.

Lecture 16 - Correlation and Regression

Chapter 7. Association, and Correlation. Scatterplots & Correlation. Scatterplots & Correlation. Stat correlation.

Regression and Correlation

10.1 Simple Linear Regression

Review of Regression Basics

Watch TV 4 7 Read 5 2 Exercise 2 4 Talk to friends 7 3 Go to a movie 6 5 Go to dinner 1 6 Go to the mall 3 1

Chapter 27 Summary Inferences for Regression

Important note: Transcripts are not substitutes for textbook assignments. 1

The response variable depends on the explanatory variable.

CHAPTER 3 Describing Relationships

1. Create a scatterplot of this data. 2. Find the correlation coefficient.

CHAPTER 1 Exploring Data

Transcription:

Looking at Data Relationships 2.1 Scatterplots 2012 W. H. Freeman and Company

Here, we have two quantitative variables for each of 16 students. 1) How many beers they drank, and 2) Their blood alcohol level (BAC) We are interested in the relationship between the two variables: How is one affected by changes in the other one? Student Beers Blood Alcohol 1 5 0.1 2 2 0.03 3 9 0.19 6 7 0.095 7 3 0.07 9 3 0.02 11 4 0.07 13 5 0.085 4 8 0.12 5 3 0.04 8 5 0.06 10 5 0.05 12 6 0.1 14 7 0.09 15 1 0.01 16 4 0.05

Associations Between Variables 3 When you examine the relationship between two variables, a new question becomes important: 1.Is your purpose simply to explore the nature of the relationship? 2.Do you wish to show that one of the variables can explain variation in the other? A response variable measures an outcome of a study. An explanatory variable explains or causes changes in the response variable. 3

Looking at relationships Start with a graph Look for an overall pattern and deviations from the pattern Use numerical descriptions of the data and overall pattern (if appropriate)

Scatterplots In a scatterplot, one axis is used to represent each of the variables, and the data are plotted as points on the graph. Student Beers BAC 1 5 0.1 2 2 0.03 3 9 0.19 6 7 0.095 7 3 0.07 9 3 0.02 11 4 0.07 13 5 0.085 4 8 0.12 5 3 0.04 8 5 0.06 10 5 0.05 12 6 0.1 14 7 0.09 15 1 0.01 16 4 0.05

Interpreting scatterplots After plotting two variables on a scatterplot, we describe the relationship by examining the form, direction, and strength of the association. We look for an overall pattern Form: linear, curved, clusters, no pattern Direction: positive, negative, no direction Strength: how closely the points fit the form and deviations from that pattern. Outliers

Form and direction of an association Linear No relationship Nonlinear

Positive association: High values of one variable tend to occur together with high values of the other variable. Negative association: High values of one variable tend to occur together with low values of the other variable.

No relationship: X and Y vary independently. Knowing X tells you nothing about Y.

Strength of the association The strength of the relationship between the two variables can be seen by how much variation, or scatter, there is around the main form. With a strong relationship, you can get a pretty good estimate of y if you know x. With a weak relationship, for any x you might get a wide range of y values.

This is a weak relationship. For a particular state median household income, you can t predict the state per capita income very well. This is a very strong relationship. The daily amount of gas consumed can be predicted quite accurately for a given temperature value.

Outliers An outlier is a data value that has a very low probability of occurrence (i.e., it is unusual or unexpected). In a scatterplot, outliers are points that fall outside of the overall pattern of the relationship.

Not an outlier: Outliers The upper right-hand point here is not an outlier of the relationship It is what you would expect for this many beers given the linear relationship between beers/weight and blood alcohol. This point is not in line with the others, so it is an outlier of the relationship.

Looking at Data Relationships 2.2 Correlation 2012 W. H. Freeman and Company

The correlation coefficient "r" The correlation coefficient is a measure of the direction and strength of a linear relationship. It is calculated using the mean and the standard deviation of both the x and y variables. Correlation can only be used to describe quantitative variables. Categorical variables don t have means and standard deviations.

This image cannot currently be displayed. This image cannot currently be displayed. The correlation coefficient "r" r n 1 xi = n 1 i 1 sx x yi s = y y Time to swim: = 35, s x = 0.7 y Pulse rate: = 140 s y = 9.5 x You DON'T want to do this by hand. Make sure you learn how to use your calculator or software.

r does not distinguish x & y The correlation coefficient, r, treats x and y symmetrically. r n 1 xi = n 1 i 1 sx x yi s = y y r = -0.75 r = -0.75 "Time to swim" is the explanatory variable here, and belongs on the x axis. However, in either plot r is the same (r=-0.75).

"r" ranges from -1 to +1 "r" quantifies the strength and direction of a linear relationship between 2 quantitative variables. Strength: how closely the points follow a straight line. Direction: is positive when individuals with higher X values tend to have higher values of Y.

When variability in one or both variables decreases, the correlation coefficient gets stronger ( closer to +1 or -1).

Correlation only describes linear relationships No matter how strong the association, r does not describe curved relationships. Note: You can sometimes transform a non-linear association to a linear form, for instance by taking the logarithm. You can then calculate a correlation using the transformed data.

Review examples 1) What is the explanatory variable? Describe the form, direction, and strength of the relationship. Estimate r. (in 1000 s) 2) If women always marry men 2 years older than themselves, what is the correlation of the ages between husband and wife? age man = age woman + 2 equation for a straight line