AP Statistics: Scatterplots, Association, and Correlation (Chapter 6)


"The invalid assumption that correlation implies cause is probably among the two or three most serious and common errors of human reasoning."
Stephen Jay Gould (1941-2002)

Relationship between 2 Quantitative Variables
In the mid-20th century, Dr. Mildred Trotter, a forensic anthropologist, determined relationships between the dimensions of various bones and a person's height. The relationships she found are still in use today in the effort to identify missing persons based on skeletal remains (think C.S.I.). One relationship compares the length of the femur to the person's height. Measure the length of your femur along the outside of your leg (in inches). Also measure your height (in inches) if you don't already know it. Record both at the front of the room.

Draw a Picture, Draw a Picture, Draw a Picture
To display the relationship between two quantitative variables, we use a graphical display known as a scatterplot. For the femur-height data we just collected, plot femur length on the horizontal axis and height on the vertical axis (more on which variable goes on which axis later in the lesson). Make sure to label the axes and provide a scale for each. Note that you do not need to (and often should not) include the origin in your plot.

Describe the Association between Variables
When we described the distribution of a single quantitative variable (univariate data), we described the shape, center, spread, and unusual features. When we describe the association between two quantitative variables (bivariate data), we describe:
1. Direction: positive association or negative association
2. Form: linear, curved, or no particular pattern
3. Strength: amount of scatter around the form
4. Unusual features: outliers or subgroups of data
Describe the association between femur length and height. Don't forget the W's (or at least the Who and What) in the description.

TI Tips (p. 154)
See this TI Tips for one way to create a list with a name more meaningful than L1, L2, etc. It also has instructions for making a scatterplot.

Roles for Variables
Do heavier smokers develop lung cancer at younger ages than light smokers or non-smokers? Is birth order an important factor in predicting future income? Can we estimate a person's % body fat more simply by just measuring waist or wrist size? Notice that in each situation one variable plays the role of the predictor variable (also known as the explanatory variable), while the other is the response variable. In a scatterplot, the explanatory variable goes on the horizontal axis and the response variable on the vertical axis.

Correlation
Correlation is a statistic that measures the strength and direction of a linear association between two quantitative variables.

Correlation
[Scatterplot: femur length (mm) versus height (cm). Data from the Forensic Anthropology Data Bank (Tennessee).]

Correlation
Standardize both variables (find their z-scores):
$$z_x = \frac{x - \bar{x}}{s_x}, \qquad z_y = \frac{y - \bar{y}}{s_y}$$
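As a rough sketch of the standardizing step, the snippet below converts two lists of measurements to z-scores. The femur and height values are made up for illustration (the class data are not reproduced here), and the function name `standardize` is an assumption, not something from the textbook.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    center = mean(values)
    spread = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return [(v - center) / spread for v in values]

# Hypothetical measurements, invented for illustration only.
femur_mm = [430, 445, 460, 475, 490]
height_cm = [160, 166, 170, 177, 181]

z_femur = standardize(femur_mm)
z_height = standardize(height_cm)

# The z-scores carry no units: each list now has mean 0 and standard deviation 1.
print([round(z, 2) for z in z_femur])
print([round(z, 2) for z in z_height])
```

Because both lists end up on the same unitless scale, they can be compared directly even though one started in mm and the other in cm.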

Correlation
Note that standardizing puts both variables on the same scale, one with no units (rather than mm and cm). Also note that the direction, form, and strength are the same as in the original plot.
[Figure: scatterplots of the standardized variables and the original variables]

Correlation
Correlation is calculated as:
$$r = \frac{\sum z_x z_y}{n - 1}$$
Note: The numerator is the sum of a bunch of products. Recall that a product is positive if both numbers are positive or if both are negative. Notice that most of the standardized points are in the first or third quadrants, where both coordinates have the same sign and the product is therefore positive. So the sum of all of the products will be positive, and hence the correlation is positive.
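The calculation can be sketched in a few lines of Python: standardize each variable, multiply the paired z-scores, add the products, and divide by n - 1. The data are hypothetical values invented for illustration, and the function name `correlation` is an assumption.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson's r: sum of paired z-score products divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys)]
    return sum(products) / (len(xs) - 1)

# Hypothetical measurements, invented for illustration only.
femur_mm = [430, 445, 460, 475, 490]
height_cm = [160, 166, 170, 177, 181]

r = correlation(femur_mm, height_cm)
print(round(r, 3))  # positive, since most products of z-scores are positive
```

On a TI calculator the same value comes from the built-in regression routine; this sketch only mirrors the formula on the slide.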

Correlation
For correlation to be appropriate, you must have:
- Two quantitative variables
- A linear association between the variables
See TI Tips on p. 158 for calculator instructions on calculating correlation.

Correlation Properties (see p. 158)
- The sign of a correlation gives the direction.
- Correlation is always between -1 and +1, inclusive.
- It doesn't matter which variable is x and which is y; correlation is the same either way.
- Correlation has no units.
- Changing the scale or units of either variable has no effect on correlation.
- Correlation only measures linear relationships. If the association is linear, correlation tells its strength and direction.
- Correlation is sensitive to outliers.
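A few of these properties can be checked numerically. The sketch below uses hypothetical data (invented for illustration) and an assumed helper named `correlation`; it swaps x and y, then converts millimeters to inches, and compares the resulting values of r.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson's r: sum of paired z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical measurements, invented for illustration only.
femur_mm = [430, 445, 460, 475, 490]
height_cm = [160, 166, 170, 177, 181]

r_xy = correlation(femur_mm, height_cm)
r_yx = correlation(height_cm, femur_mm)          # swap the roles of x and y
femur_in = [mm / 25.4 for mm in femur_mm]        # change units: mm to inches
r_rescaled = correlation(femur_in, height_cm)

print(round(r_xy, 6), round(r_yx, 6), round(r_rescaled, 6))  # all three agree
```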

Correlation Properties (see p. 152)
How strong is strong? It depends.
In the natural sciences (chemistry, physics, engineering, etc.):
  Strong:   |r| of 0.9 or more
  Moderate: |r| from 0.7 to 0.9
  Weak:     |r| from 0 to 0.7
In the social sciences (sociology, psychology, education, etc.):
  Strong:   |r| of 0.5 or more
  Moderate: |r| from 0.25 to 0.5
  Weak:     |r| from 0 to 0.25
Better yet: show a scatterplot and use whatever adjectives you want!
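To make the "it depends" point concrete, here is a small sketch that applies the cutoffs listed on this slide. The function name `describe_strength`, the field labels, and the choice to put a boundary value in the stronger category are all assumptions made for illustration; the cutoffs are rules of thumb, not definitions.

```python
# Cutoffs on |r| taken from the slide: (strong, moderate) for each kind of field.
CUTOFFS = {
    "natural": (0.9, 0.7),
    "social": (0.5, 0.25),
}

def describe_strength(r, field):
    """Label a correlation as strong, moderate, or weak for the given field."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation must lie between -1 and +1")
    strong, moderate = CUTOFFS[field]
    size = abs(r)  # strength ignores the sign; the sign only gives direction
    if size >= strong:
        return "strong"
    if size >= moderate:
        return "moderate"
    return "weak"

# The same r earns different adjectives in different fields.
print(describe_strength(0.6, "natural"))  # weak
print(describe_strength(0.6, "social"))   # strong
```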

Straightening Scatterplots
Not all scatterplots show a linear pattern. But many statistical analysis techniques require a linear pattern. So, we straighten up our act.

Straightening Scatterplots
Camera lenses have an adjustable opening (aperture) whose size is referred to as the f/stop. Changing the aperture alters the amount of light let in through the lens. Here are the optimal aperture settings for different shutter speeds (in seconds):
Speed (s):  1/1000  1/500  1/250  1/125  1/60  1/30  1/15  1/8
f/stop:     2.8     4      5.6    8      11    16    22    32

Straightening Scatterplots
Plot f/stop as the response variable (y-axis) against shutter speed and notice the non-linear pattern. Transform f/stop by squaring, and create a new plot. Notice the linear association between (f/stop)^2 and shutter speed. See TI Tips p. 162.
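The effect of the transformation can also be seen numerically. This sketch uses the shutter speed and f/stop values from the table and compares the correlation before and after squaring f/stop; the helper name `correlation` is an assumption. A scatterplot is still the right way to judge form, since r alone cannot confirm that a pattern is linear.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson's r: sum of paired z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Shutter speeds (seconds) and f/stops from the table.
speed = [1/1000, 1/500, 1/250, 1/125, 1/60, 1/30, 1/15, 1/8]
f_stop = [2.8, 4, 5.6, 8, 11, 16, 22, 32]

# Re-express the response by squaring it.
f_stop_squared = [f ** 2 for f in f_stop]

r_raw = correlation(speed, f_stop)
r_squared_response = correlation(speed, f_stop_squared)

# The squared f/stop follows shutter speed much more closely in a straight line.
print(round(r_raw, 3), round(r_squared_response, 3))
```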

Correlation ≠ Causation
There is a roughly linear association between the number of storks and the human population (storks bring babies, right?). It turns out that storks like chimneys: more people, more chimneys, more storks. Data from Oldenburg, Germany (beginning in the 1930s).

Beware the Lurking Variable
A lurking variable is a hidden variable that actually affects the observed relationship between two other variables.
- The Japanese eat very little fat and suffer fewer heart attacks than the Americans.
- The Mexicans eat a lot of fat and suffer fewer heart attacks than the Americans.
- The Japanese drink very little red wine and suffer fewer heart attacks than the Americans.
- The French drink excessive amounts of red wine and suffer fewer heart attacks than the Americans.
- The Germans drink a lot of beer, eat lots of sausages and fats, and suffer fewer heart attacks than the Americans.
CONCLUSION: Eat and drink what you like. Speaking English is apparently what kills you!

Beware of Outliers
Note how the outlier creates the impression of a stronger linear association between the variables than is probably realistic.
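A small numerical sketch shows how much one point can matter. The data below are invented for illustration and are not the data from the slide's plot: five points with almost no linear pattern, plus a single far-away point. The helper name `correlation` is an assumption.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Pearson's r: sum of paired z-score products divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical cluster with very little linear association.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]

# The same cluster plus one hypothetical outlier far from the rest.
x_with_outlier = x + [20]
y_with_outlier = y + [20]

r_without = correlation(x, y)
r_with = correlation(x_with_outlier, y_with_outlier)

# One extreme point turns a weak correlation into one that looks very strong.
print(round(r_without, 2), round(r_with, 2))
```

Reporting r both with and without the suspect point, alongside the scatterplot, gives a more honest picture than either number alone.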

Assignment
Read Chapter 6.
Do Chapter 6 exercises #1-15 odd, 19, 25-31 odd, 35, 39.