Talking feet: Scatterplots and lines of best fit


Student worksheet

What does your foot say about your height? Can you predict people's height by how long their feet are? If a Grade 10 student's foot is 27 cm long, how tall is that student likely to be?

In this activity, you will use Census at School survey data to determine whether a relationship exists between foot length and height and, if so, what it is. Students from South Africa, the United Kingdom, Australia and New Zealand measured and recorded their height and foot length in centimetres and entered this information into the Census at School database. A random sample was selected from the combined data to create the scatterplot below, where each dot shows an ordered pair of height in relation to foot length.

[Figure: Scatterplot of height and foot length for 15-year-old students]

Part 1: Line of best fit

Look at the scatterplot. Is there a correlation between foot length and height? If so, is it weak or strong? Positive or negative?

[Figure 1. Height and foot length of 15-year-old students]

1. Manually draw a line that best fits the points shown on Figure 1.
2. For the line you have just drawn, determine
   a) its slope
   b) its equation
   c) its height intercept (Explain what this means and whether or not this is a reasonable scenario.)
3. How did you determine that your line was the best fit? (Note: The scales do not start at zero in Figure 1.)

4. Compare your line with the lines of several of your classmates. Did you all get the same line? (Note: Compare the lines visually and then compare the equations that you and your classmates computed for the lines.)
5. Is it important that different people analysing the same data should get the same result?
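For question 2, one way to get the slope and equation is to read two points off the hand-drawn line and compute from them. The sketch below assumes two hypothetical points, (22, 155) and (28, 179); they are illustrative values only, not readings from Figure 1, and the name `predict_height` is made up for this example.

```python
# Two points read off a hand-drawn line of best fit.
# Hypothetical values for illustration only, not taken from Figure 1.
x1, y1 = 22.0, 155.0   # (foot length in cm, height in cm)
x2, y2 = 28.0, 179.0

slope = (y2 - y1) / (x2 - x1)   # rise over run: cm of height per cm of foot length
intercept = y1 - slope * x1     # height the line gives at a foot length of 0 cm


def predict_height(foot_length_cm):
    """Height (cm) predicted by the hand-drawn line."""
    return slope * foot_length_cm + intercept


print(f"height = {slope:.2f} * foot_length + {intercept:.2f}")
print(predict_height(27.0))
```

With these made-up points the height intercept is the height the line assigns to a foot of length 0 cm. That value lies far outside the measured range, which is why question 2c asks whether it is a reasonable scenario.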

Part 2: Median-median line method

The median-median line is one method for determining a line of best fit. The following table gives us the height and foot length data for 30 randomly selected 15-year-old students. (These same data were used in Part 1.)

Table 1. Height and foot length for 15-year-old students

Country          Sex   Date of birth (dd/mm/yyyy)   Height (cm)   Foot length (cm)
Queensland       F     25/02/1986                   180           27.5
South Africa     F     07/09/1986                   163
United Kingdom   M     25/01/1985                   185           29.5
Queensland       M     26/03/1986                   166           26.5
South Africa     F     17/04/1986                   172           25.0
South Africa     M     10/05/1986                   178           27.0
United Kingdom   F     30/04/1985                   162           21.0
South Africa     F     20/06/1986                   148
South Africa     M     15/06/1986                   167           26.0
South Africa     M     08/08/1986                   179           26.0
South Africa     M     10/11/1985                   147           19.0
South Africa     F     15/07/1986                   158           24.0
United Kingdom   M     15/09/1985                   155           24.0
United Kingdom   M     29/12/1985                   172           28.0
South Africa     M     29/01/1986                   141
South Africa     M     23/07/1986                   150           25.0
South Africa     F     06/04/1986                   145
United Kingdom   F     10/08/1985                   171           25.0
South Africa     F     27/11/1985                   157
South Africa     F     26/11/1985                   157
Queensland       F     03/04/1986                   166           25.5
South Africa     M     24/01/1986                   159           27.0
United Kingdom   M     09/02/1985                   172           25.5
United Kingdom   M     09/07/1985                   168           26.0
South Africa     F     06/09/1986                   163
South Africa     M     22/07/1986                   150           25.0
South Africa     M     19/12/1985                   177           29.0
United Kingdom   M     08/02/1985                   172           25.5
United Kingdom   M     22/02/1985                   172           22.0
Queensland       F     09/11/1985                   165           26.0

1. In order to work with a more manageable number of cases, we will look at only the male students from Table 1.

Male students

Foot length (cm)   Height (cm)
19.0               147
22.0               173
                   141
24.0               155
25.0               150
25.0               150
25.5               172
25.5               172
26.0               167
26.0               179
26.0               168
26.5               166
27.0               159
27.0               178
28.0               172
29.0               177
29.5               185

Procedure:
   1. Arrange the data so the x-values (foot length) are in ascending order.
   2. Divide the data into three groups. If the number of data points does not divide evenly by three, be sure that the first and third groups contain exactly the same number of points and adjust the middle group so that it contains only one more or one less. Since there are 17 data points here, use groups of 6, 5 and 6.
   3. Plot the data using a different colour for each group.
   4. Look at the first group. What is the median x-value? What is the median y-value? Write these values as an ordered pair (x, y). This is the summary point for the first group. Call it S1. Plot the ordered pair using a plus sign (+) or a square instead of a dot, but in the same colour as the rest of the group.
   5. Repeat Step 4 for the middle and last groups of data to get the points S2 and S3, respectively.
   6. Draw a line (lightly) through S1 and S3. Find the slope of line S1S3.
   7. Calculate the equation of a second line, which passes through S2 and is parallel to S1S3.
   8. Move the line S1S3 one-third of the way towards S2. This is your median-median line.

Your teacher will work through the steps with you.

2. Follow the same procedure to construct the median-median line for the female students using the data from Table 1.

Female students

Foot length (cm)   Height (cm)
21.0               162
                   163
                   148
                   145
                   157
                   157
                   163
24.0               158
25.0               171
25.0               172
25.5               166
26.0               165
27.5               180

Fill in the following information:
   Group 1 contains ____ data pairs.
   Group 2 contains ____ data pairs.
   Group 3 contains ____ data pairs.
   S1 = ( ____ , ____ )
   S2 = ( ____ , ____ )
   S3 = ( ____ , ____ )
   The slope of the line through S1 and S3 is ____.
   The equation of the line through S1 and S3 is ____.
   The equation of the line through S2 parallel to the line through S1 and S3 is ____.
   The equation of the median-median line is ____.

3. To determine the median-median line for the male and female students together, go back to Table 1 and enter the foot length and height values into lists L1 and L2 of your graphing calculator. Then use the median-median option from the STAT menu on your calculator. (A graphing calculator such as the TI-83 or TI-84 can provide results more quickly, especially when there are a lot of data.)
4. Draw the median-median line on your scatterplot and label it clearly.
5. How does it compare with the first line of best fit that you drew manually on the same data in Figure 1? Does it fit the data better?
6. Based on the equation, if a student's foot is 27 cm in length, how tall is the student likely to be?
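The eight-step procedure from question 1 can be sketched in code. This is a minimal illustration, not the calculator's implementation: the function names `group_sizes` and `median_median_line` are made up for this example, the nine data pairs are hypothetical rather than taken from Table 1, and points with equal foot length are ordered by height, a tie-breaking choice the worksheet does not specify.

```python
from statistics import median


def group_sizes(n):
    """Split n points into three groups; the outer groups are always equal in size."""
    k, r = divmod(n, 3)
    if r == 0:
        return k, k, k
    if r == 1:
        return k, k + 1, k      # middle group gets one more
    return k + 1, k, k + 1      # middle group gets one less


def median_median_line(xs, ys):
    """Return (slope, intercept) of the median-median line."""
    pts = sorted(zip(xs, ys))                       # Step 1: ascending x
    n1, n2, _ = group_sizes(len(pts))               # Step 2: three groups
    groups = [pts[:n1], pts[n1:n1 + n2], pts[n1 + n2:]]
    # Steps 4-5: summary point = (median x, median y) of each group
    (x1, y1), (x2, y2), (x3, y3) = [
        (median([x for x, _ in g]), median([y for _, y in g])) for g in groups
    ]
    slope = (y3 - y1) / (x3 - x1)                   # Step 6: slope of S1S3
    b_outer = y1 - slope * x1                       # intercept of line S1S3
    b_middle = y2 - slope * x2                      # Step 7: parallel line through S2
    intercept = b_outer + (b_middle - b_outer) / 3  # Step 8: move one-third towards S2
    return slope, intercept


# Hypothetical foot lengths (cm) and heights (cm), for illustration only.
foot = [20, 21, 22, 24, 25, 26, 27, 28, 30]
height = [150, 155, 152, 160, 166, 163, 173, 170, 178]

m, b = median_median_line(foot, height)
print(f"height = {m:.2f} * foot_length + {b:.2f}")
print(f"predicted height at 27 cm: {m * 27 + b:.1f} cm")
```

For 17 points `group_sizes` returns (6, 5, 6), matching the split used for the male students in question 1.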

Part 3: Least squares regression method

Another method for determining a line of best fit is called least squares regression. This method is based on the mean. The least squares line of best fit is difficult to create without a tool such as a graphing calculator or a computer program, but it is still a useful method to know. The example below shows how it works.

Example of least squares regression

X: Foot length (cm)   Y: Height (cm)
21                    162
25                    171
26                    165

Mean point (x̄, ȳ): x̄ = 24, ȳ = 166

It seems reasonable that our line of best fit should pass through the point (x̄, ȳ). Many lines through (x̄, ȳ) are possible. The graph shows one of them. How do we determine the best line through (x̄, ȳ)? Find the vertical displacement (distance) from each point to the line. The line where the sum of these vertical displacements is as small as possible is the line of best fit.

In the adjacent graph, because point B is above the line, its vertical displacement from the line is positive. Points A and C are below the line, so their vertical displacements are negative. What is the sum of these vertical displacements? Unfortunately, any line through (x̄, ȳ) will give the same result, zero, because of the cancelling that occurs when some distances are positive and others are negative. Therefore, a more sophisticated approach is necessary.

To overcome this cancelling, we square the distances. The linear regression line, also called the least squares line, results when the sum of the squares of the vertical displacements from the line is at its minimum. Squaring the length AM gives the area of a square with each side equal to AM. In Box A below, one square is drawn with each side equal to AM. Similarly, a square is drawn with each side equal to BN and another with each side equal to CL. Our goal is to move the line through (x̄, ȳ) until the sum of these squares is at its minimum. Boxes B, C and D show the squares that are created when the line through (x̄, ȳ) is moved. Look only at the size of the squares. Can you see which box shows the squares with the smallest combined area?

[Figure: Boxes A, B, C and D]

Since the differences are so small, it may be difficult to see which has the smallest sum. Look again, but this time notice the values given for the sum of squares in each graph. Continued experimentation with the line shows that the least possible sum of squares is 25.93.
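The experimentation described above can be imitated numerically. The sketch below uses the three data pairs from the example, pivots a line about the mean point, and tries slopes from 0.00 to 3.00 in steps of 0.01. The slope range, the step size and the function names are assumptions chosen for illustration.

```python
# The three (foot length, height) pairs from the example.
xs = [21, 25, 26]
ys = [162, 171, 165]

x_bar = sum(xs) / len(xs)   # mean foot length
y_bar = sum(ys) / len(ys)   # mean height


def displacements(slope):
    """Vertical displacements from a line through (x_bar, y_bar) with this slope."""
    return [y - (y_bar + slope * (x - x_bar)) for x, y in zip(xs, ys)]


def sum_of_squares(slope):
    """Sum of the squared vertical displacements for this slope."""
    return sum(d * d for d in displacements(slope))


# Any line through the mean point has displacements that cancel to zero.
for trial_slope in (0.0, 0.5, 2.0):
    print(trial_slope, round(sum(displacements(trial_slope)), 9))

# Try many slopes and keep the one with the smallest sum of squares.
candidates = [i / 100 for i in range(0, 301)]
best_slope = min(candidates, key=sum_of_squares)
best_intercept = y_bar - best_slope * x_bar

print(best_slope, round(best_intercept, 1), round(sum_of_squares(best_slope), 2))
```

On this grid the smallest sum of squares rounds to 25.93, the minimum quoted above.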

When the sum of the squares is 25.93, the equation of the line is y = 1.07x + 140.3. This is called the least squares line or the linear regression line. A TI-83 calculator can also be used to determine this line.

If the coefficient of determination (r²) is close to 1, the correlation is strong. If r² is close to zero, then there is little correlation between the two variables. What conclusion can you make about the correlation between the two variables shown in the adjacent graph where r² = 0.38?

1. Use your graphing calculator or computer software (e.g., spreadsheet or Fathom) to compute the linear regression equation for the original dataset relating height to foot length (Table 1).
2. Draw this least squares regression line on your scatterplot and label it clearly.
3. Use it to predict the height of a person whose foot measures 27 cm in length. Is this answer much different from your answer using the median-median line? Which method do you think gives the better fit? Why?
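The least squares line can also be computed directly instead of by trial and error. The sketch below assumes the standard closed-form formulas (slope = Sxy / Sxx, intercept = ȳ − slope · x̄, r² = Sxy² / (Sxx · Syy)) and applies them to the three-point example from Part 3. The function name `least_squares` is illustrative, and this is not a description of how a TI-83 computes its result.

```python
def least_squares(xs, ys):
    """Return (slope, intercept, r_squared) of the least squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared and cross deviations from the means
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar    # forces the line through (x_bar, y_bar)
    r_squared = sxy ** 2 / (sxx * syy)   # coefficient of determination
    return slope, intercept, r_squared


# The three (foot length, height) pairs from the Part 3 example.
slope, intercept, r_squared = least_squares([21, 25, 26], [162, 171, 165])
print(f"y = {slope:.2f}x + {intercept:.1f}, r^2 = {r_squared:.2f}")
```

Rounded, these values agree with the line y = 1.07x + 140.3 and the r² of 0.38 quoted above. The same function could be applied to the complete pairs from Table 1 as a check on a calculator result.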

Part 4: Try this project on your own class data!

1. Use your class data from the Census at School survey to make a scatterplot showing foot length (y) in relation to hand span (x). Is the correlation strong or weak?
2. Determine the equation of the median-median line.
3. Draw it on your scatterplot and label it clearly.
4. Determine the equation of the least squares line.
5. Draw it on your scatterplot and label it clearly.
6. What foot length does each model predict for a hand span of 18 cm?

Contributed by Anna Spanik, Math teacher, Halifax West High School, Nova Scotia