PS2: Two Variable Statistics

LT2: Measuring correlation and line of best fit by eye.
LT3: Linear regression.
LT4: The χ² test of independence.

Pearson's Correlation Coefficient

In examinations you are expected to calculate r using technology: enter the data in your calculator and run LinReg(ax+b) to find r. This method is simple and shouldn't be too much of a worry on your paper. However, calculating r using the formula is recommended for the Internal Assessment to get full marks on the mathematical process portion. For practice, we will only look at a few points rather than a multitude, due to time.

The formula is:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

You may recognize the bottom. What does this look similar to? The top is closely related to the covariance (it is the covariance before dividing by the number of points), which tells us how each point sits relative to the mean in x and y together. So if we test how each point is associated with the mean, we come up with r, a measure of correlation from -1 to 1.

Pearson's Correlation Coefficient

An important side note:

$$r = \frac{S_{xy}}{S_x S_y}$$

$S_x$ = standard deviation of x
$S_y$ = standard deviation of y
$S_{xy}$ = covariance of x and y

Sometimes we are given the covariance, and all we need is to find the standard deviations of x and y to calculate r.

Calculating r by hand

Using the long equation

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

let's look at an example and find the values needed to find r.

Example: Daisy investigates how the volume of water in a pot affects the time it takes to boil on the stove. The results are given in the table. Find and interpret Pearson's correlation coefficient between the two variables.

Pot   Volume (x, L)   Time to boil (y, min)
A     1               2
B     2               4
C     4               7
D     6               9
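As a rough check on the two formulas, here is a minimal Python sketch that computes r for Daisy's data both ways. The function names are made up for illustration, and the sketch assumes the standard deviations and covariance all divide by n (dividing all three by n - 1 instead would give the same r).

```python
from math import sqrt

# Daisy's data from the example: pot volume (L) and time to boil (min).
volume = [1, 2, 4, 6]
time_to_boil = [2, 4, 7, 9]


def mean(values):
    return sum(values) / len(values)


def pearson_long(xs, ys):
    """r from the long formula: sum of products over the root of the sums of squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    top = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_dx_sq = sum((x - x_bar) ** 2 for x in xs)
    sum_dy_sq = sum((y - y_bar) ** 2 for y in ys)
    return top / sqrt(sum_dx_sq * sum_dy_sq)


def pearson_short(xs, ys):
    """r = S_xy / (S_x * S_y), dividing by n throughout (an assumption)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / n)
    return s_xy / (s_x * s_y)


r_long = pearson_long(volume, time_to_boil)
r_short = pearson_short(volume, time_to_boil)
print(round(r_long, 4), round(r_short, 4))
```

Both versions should agree, because the factors of n cancel between the top and the bottom.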

Calculating r by hand

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$

Each portion of the formula needs to be calculated.

Pot     Volume (x, L)   Time to boil (y, min)   x - x̄   y - ȳ   (x - x̄)(y - ȳ)   (x - x̄)²   (y - ȳ)²
A       1               2
B       2               4
C       4               7
D       6               9
Total

Calculating the mean of x and y should be a cinch!

$\bar{x} = \frac{\sum x}{4} =$ ____,  $\bar{y} = \frac{\sum y}{4} =$ ____

r = ____

Try One On Your Own

Period 7's test scores and IQs for 5 people are as follows. By hand, find what type of correlation there is by interpreting r.

Person   Score (x)   IQ (y)
1        66          124
2        49          126
3        55          130
4        68          168
5        58          101
Total

Try One On Your Own

Period 7's test scores and IQs for 5 people are as follows. By hand, find what type of correlation there is by interpreting r.

Person   Score (x)   IQ (y)   x - x̄    y - ȳ    (x - x̄)(y - ȳ)   (x - x̄)²   (y - ȳ)²
1        66          124      6.8      -5.8     -39.44           46.24      33.64
2        49          126      -10.2    -3.8     38.76            104.04     14.44
3        55          130      -4.2     0.2      -0.84            17.64      0.04
4        68          168      8.8      38.2     336.16           77.44      1459.24
5        58          101      -1.2     -28.8    34.56            1.44       829.44
Total    296         649                        369.2            246.8      2336.8

$$\bar{x} = \frac{\sum x}{5} = 59.2, \quad \bar{y} = \frac{\sum y}{5} = 129.8$$

$$r = \frac{369.2}{\sqrt{246.8 \times 2336.8}} \approx 0.4862$$

Now that you've done this once or twice:

Score (x)   IQ (y)   x - x̄   y - ȳ   (x - x̄)(y - ȳ)   (x - x̄)²   (y - ȳ)²
66          124
49          126
55          130
68          168
58          101
31          111
60          199
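The worked table for the five Period 7 students can be rebuilt column by column. The sketch below is only a checking aid with made-up variable names; it recomputes each deviation, product, and square, then the three totals and r.

```python
from math import sqrt

# Period 7 data from the worked table: test score (x) and IQ (y).
scores = [66, 49, 55, 68, 58]
iqs = [124, 126, 130, 168, 101]

x_bar = sum(scores) / len(scores)
y_bar = sum(iqs) / len(iqs)

# One row per person: x, y, x - x_bar, y - y_bar, product, squares.
rows = []
for x, y in zip(scores, iqs):
    dx, dy = x - x_bar, y - y_bar
    rows.append((x, y, dx, dy, dx * dy, dx ** 2, dy ** 2))

total_products = sum(row[4] for row in rows)
total_dx_sq = sum(row[5] for row in rows)
total_dy_sq = sum(row[6] for row in rows)

r = total_products / sqrt(total_dx_sq * total_dy_sq)

for row in rows:
    print("  ".join(f"{value:9.2f}" for value in row))
print("totals:", round(total_products, 2), round(total_dx_sq, 2), round(total_dy_sq, 2))
print("r =", round(r, 4))
```

Appending the two extra pairs from the longer table, (31, 111) and (60, 199), to the two lists reuses the same steps for seven people.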

r²: The Coefficient of Determination

To help describe the correlation between two variables, we can also calculate the coefficient of determination r². This is simply the square of Pearson's product-moment correlation coefficient r, and as such the direction of correlation is eliminated.

r describes the direction of the correlation and how strongly the two variables are correlated.

r² describes the proportion of the variation in one variable that is explained by its linear relationship with the other variable. In other words, how much of the change in y can be accounted for by the change in x?

Do not get these confused. The IA specifically states to dock points if students get these mixed up, as evidence that the student doesn't know what r stands for.
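One way to see r² as "variation explained" is to compare it with the leftover scatter around a fitted line. The sketch below uses Daisy's pot data and a least-squares line, which is an assumption here (least-squares fitting belongs to linear regression, LT3, and is not derived in these slides); the variable names are illustrative.

```python
from math import sqrt

# Daisy's data: pot volume (L) and time to boil (min).
volume = [1, 2, 4, 6]
time_to_boil = [2, 4, 7, 9]

x_bar = sum(volume) / len(volume)
y_bar = sum(time_to_boil) / len(time_to_boil)

sum_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(volume, time_to_boil))
sum_dx_sq = sum((x - x_bar) ** 2 for x in volume)
sum_dy_sq = sum((y - y_bar) ** 2 for y in time_to_boil)

# Pearson's r from the long formula.
r = sum_products / sqrt(sum_dx_sq * sum_dy_sq)

# Least-squares line (assumed technique): it passes through the mean point.
slope = sum_products / sum_dx_sq
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * x for x in volume]

# Variation in y left over after the line, compared with the total variation in y.
unexplained = sum((y - p) ** 2 for y, p in zip(time_to_boil, predicted))
explained_fraction = 1 - unexplained / sum_dy_sq

print("r =", round(r, 4))
print("r squared =", round(r ** 2, 4))
print("explained fraction =", round(explained_fraction, 4))
```

The last two printed values should match, which is the sense in which r² measures how much of the variation in y the linear relationship accounts for.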

3C

LINE OF BEST FIT BY EYE

What is the line of best fit? A line we can draw to best represent the relationship between two variables.

How do we find this line? It is drawn by eye, so it is never perfectly accurate, but something close will do. Here is how you do it!

LOBF: the calculations by hand

Step 1: Calculate the mean of the X values, $\bar{x}$, and the mean of the Y values, $\bar{y}$.
Step 2: Mark the mean point $(\bar{x}, \bar{y})$ on the scatter diagram.
Step 3: Draw a line through the mean point which fits the trend of the data, so that about the same number of data points are above the line as below it.

This process is an estimate and can therefore result in some discrepancies. Make sure you use a straight edge for all your lines of best fit by eye.

Example: A group of LCC students were surveyed on how much they run a week. The data was recorded in a table and the results were as follows.

1) Plot the points on a scatterplot (accuracy is important).
2) Find the line of best fit by eye (graph $(\bar{x}, \bar{y})$).
3) Describe the correlation (strength, direction, outliers, etc.).

Age (x)   Distance, miles (y)
16        11
66        3
49        21
23        17
22        11
55        1
71        2
58        6
31        14
60        8

[Blank scatter diagram: Distance Miles (y), 0 to 25, vs Age (x), 0 to 80]
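Steps 1 to 3 can be sanity-checked numerically. The sketch below finds the mean point for the running data and counts how many points fall above and below a candidate line through it. The slope of -0.2 is a hypothetical guess chosen only for illustration, not a value from the slides; the point is to try a few slopes and see how the balance changes.

```python
# Running survey data from the example: age (x) and weekly distance in miles (y).
ages = [16, 66, 49, 23, 22, 55, 71, 58, 31, 60]
miles = [11, 3, 21, 17, 11, 1, 2, 6, 14, 8]

# Step 1: the means, which give the mean point for Step 2.
x_bar = sum(ages) / len(ages)
y_bar = sum(miles) / len(miles)


def count_sides(slope):
    """Count points above, below, and on a line through the mean point."""
    above = below = on = 0
    for x, y in zip(ages, miles):
        line_y = y_bar + slope * (x - x_bar)
        if y > line_y:
            above += 1
        elif y < line_y:
            below += 1
        else:
            on += 1
    return above, below, on


# Step 3: a hypothetical slope drawn "by eye"; adjust it until the counts roughly balance.
candidate_slope = -0.2
print("mean point:", (round(x_bar, 1), round(y_bar, 1)))
print("above, below, on:", count_sides(candidate_slope))
```

A roughly even split is only a guide: the line should also follow the overall trend of the points.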

INTERPOLATION AND EXTRAPOLATION

Using the line of best fit, we can make predictions about values we don't know. For instance, on the previous graph we had 10 different ages. On a scale from 0 to 70+ there are many more possibilities.

Interpolation is an estimate of a data point between the lowest x value (lower pole) and the highest x value (upper pole), using the line of best fit.

Extrapolation is an estimate of a data point outside the lower pole and upper pole, using the line of best fit.

TOK: Think about this! Are there any limitations to interpolation or extrapolation? Think in terms of the previous slides. How many miles should a 1-year-old run? Do all people run the same at age 20?

Example: On a hot day, nine cars were left in the sun in a car parking lot. The length of time each car was left in the sun was recorded, as well as the temperature inside the car at the end of the period.

Car    A     B    C    D    E    F     G     H    I
Time   50    5    25   40   15   45    55    10   15
Temp   100   70   88   96   77   110   121   80   73

A. Calculate the mean of both variables.
B. Draw a scatter diagram of the data.
C. Plot the point $(\bar{x}, \bar{y})$ on the scatter diagram and then draw the line of best fit.
D. Predict the temperature at 35 minutes and the temperature at 75 minutes. Comment on the reliability of your predictions.
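The difference between the two kinds of prediction can be illustrated with the car data. Because a line drawn by eye cannot be reproduced in code, the sketch below substitutes a least-squares line through the mean point; that substitution is an assumption, so its predictions will differ somewhat from a hand-drawn line. The function names are made up for illustration.

```python
# Car data from the example: time in the sun (minutes) and temperature inside the car.
times = [50, 5, 25, 40, 15, 45, 55, 10, 15]
temps = [100, 70, 88, 96, 77, 110, 121, 80, 73]

x_bar = sum(times) / len(times)
y_bar = sum(temps) / len(temps)

# Least-squares slope (assumed stand-in for the line of best fit by eye).
sum_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(times, temps))
sum_dx_sq = sum((x - x_bar) ** 2 for x in times)
slope = sum_products / sum_dx_sq


def predict(minutes):
    """Temperature read off the line, which passes through the mean point."""
    return y_bar + slope * (minutes - x_bar)


def kind_of_estimate(minutes):
    """Inside the lower and upper poles is interpolation; outside is extrapolation."""
    if min(times) <= minutes <= max(times):
        return "interpolation"
    return "extrapolation"


for minutes in (35, 75):
    print(minutes, "min:", round(predict(minutes), 1), kind_of_estimate(minutes))
```

The 35-minute estimate lies inside the recorded times, so it is backed by nearby data; the 75-minute estimate relies on the trend continuing beyond anything that was measured, which makes it less reliable.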

Homework

11B.2: P. 325-326 #1-3 (Saputo page numbers 11-14)
11B.3: P. 327 #1-4
11C: P. 330 #1-3