Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Similar documents
Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Correlation and Regression

Correlation and Regression

Correlation and Regression

regression analysis is a type of inferential statistics which tells us whether relationships between two or more variables exist

Linear Correlation and Regression Analysis

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS

Correlation A relationship between two variables As one goes up, the other changes in a predictable way (either mostly goes up or mostly goes down)

Inferences for Correlation

Chapter 12 : Linear Correlation and Linear Regression

CORRELATION AND REGRESSION

While you wait: Enter the following in your calculator. Find the mean and sample variation of each group. Bluman, Chapter 12 1

y n 1 ( x i x )( y y i n 1 i y 2

Multiple Linear Regression

Can you tell the relationship between students SAT scores and their college grades?

Upon completion of this chapter, you should be able to:

Simple Linear Regression

Chapter 12 - Part I: Correlation Analysis

Regression Analysis and Forecasting Prof. Shalabh Department of Mathematics and Statistics Indian Institute of Technology-Kanpur

a) Graph the equation by the intercepts method. Clearly label the axes and the intercepts. b) Find the slope of the line.

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

determine whether or not this relationship is.

Correlation & Simple Regression

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

Chapter 4. Regression Models. Learning Objectives

CORRELATION AND REGRESSION

IB Questionbank Mathematical Studies 3rd edition. Bivariate data. 179 min 172 marks

Quantitative Bivariate Data

Regression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv).

Do not copy, post, or distribute

Simple Linear Regression

Practice Questions for Math 131 Exam # 1

Unit 4 Linear Functions

CORRELATION AND SIMPLE REGRESSION 10.0 OBJECTIVES 10.1 INTRODUCTION

IOP2601. Some notes on basic mathematical calculations

EXAM 3 Math 1342 Elementary Statistics 6-7

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

Module 8: Linear Regression. The Applied Research Center

Inference with Simple Regression

Lecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1

4.1 Introduction. 4.2 The Scatter Diagram. Chapter 4 Linear Correlation and Regression Analysis

Correlation and Regression

Chapters 9 and 10. Review for Exam. Chapter 9. Correlation and Regression. Overview. Paired Data

Correlation and Regression (Excel 2007)

Ch 13 & 14 - Regression Analysis

Six Sigma Black Belt Study Guides

Chapter 16. Simple Linear Regression and dcorrelation

Final Exam - Solutions

Inferences for Regression

Chapter 6. Exploring Data: Relationships

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

BNAD 276 Lecture 10 Simple Linear Regression Model

This document contains 3 sets of practice problems.

REVIEW 8/2/2017 陈芳华东师大英语系

CORELATION - Pearson-r - Spearman-rho

Statistics Revision Questions Nov 2016 [175 marks]

Ch. 16: Correlation and Regression

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

11 Correlation and Regression

Business Mathematics and Statistics (MATH0203) Chapter 1: Correlation & Regression

1)I have 4 red pens, 3 purple pens and 1 green pen that I use for grading. If I randomly choose a pen,

Example: Can an increase in non-exercise activity (e.g. fidgeting) help people gain less weight?

[ z = 1.48 ; accept H 0 ]

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. x )

Chapter 16. Simple Linear Regression and Correlation

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

CORRELATION ANALYSIS. Dr. Anulawathie Menike Dept. of Economics

Statistics 100 Exam 2 March 8, 2017

STATISTICS 110/201 PRACTICE FINAL EXAM

CHAPTER 5 LINEAR REGRESSION AND CORRELATION

Chapter 14 Simple Linear Regression (A)

Solutionbank S1 Edexcel AS and A Level Modular Mathematics

Ordinary Least Squares Regression Explained: Vartanian

This gives us an upper and lower bound that capture our population mean.

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Objectives. 2.1 Scatterplots. Scatterplots Explanatory and response variables. Interpreting scatterplots Outliers

Geometry - Summer 2016

NUMB3RS Activity: How Does it Fit?

Summer Review for Mathematical Studies Rising 12 th graders

MINI LESSON. Lesson 2a Linear Functions and Applications

Chapter 9. Correlation and Regression

Example #1: Write an Equation Given Slope and a Point Write an equation in slope-intercept form for the line that has a slope of through (5, - 2).

Final Exam - Solutions

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

LI EAR REGRESSIO A D CORRELATIO

Identify the scale of measurement most appropriate for each of the following variables. (Use A = nominal, B = ordinal, C = interval, D = ratio.

Midterm 2 - Solutions

Mathematics Level D: Lesson 2 Representations of a Line

Copyright 2012 Pearson Education, Inc. Publishing as Prentice Hall.

Inference About Means and Proportions with Two Populations. Chapter 10

BIOSTATISTICS NURS 3324

Interactions. Interactions. Lectures 1 & 2. Linear Relationships. y = a + bx. Slope. Intercept

Section 2.5 from Precalculus was developed by OpenStax College, licensed by Rice University, and is available on the Connexions website.

Midterm 2 - Solutions

6. 5x Division Property. CHAPTER 2 Linear Models, Equations, and Inequalities. Toolbox Exercises. 1. 3x = 6 Division Property

Simple Linear Regression: One Quantitative IV

Basic Fraction and Integer Operations (No calculators please!)

Unit #2: Linear and Exponential Functions Lesson #13: Linear & Exponential Regression, Correlation, & Causation. Day #1

Transcription:

Chapter 10 Correlation and Regression McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Example 10-2: Absences/Final Grades Please enter the data below in L1 and L2. The data appears on page 537 of your textbook. Also make sure to have the graphs A, B, and C ready with their corresponding data. Bluman, Chapter 10 2

Graph and Window: The next slides will outline how to calculate the regression model that may fit the data. The first time you ever run the regression you may have turn on the diagnostics. Once the diagnostic is turned on once, you may not have to do it again. Bluman, Chapter 10 3

Chapter 10 Overview Introduction 10-1 Scatter Plots and Correlation 10-2 Regression 10-3 Coefficient of Determination and Standard Error of the Estimate 10-4 Multiple Regression (Optional) Bluman, Chapter 10 4

Chapter 10 Objectives 1. Draw a scatter plot for a set of ordered pairs. 2. Compute the correlation coefficient. 3. Test the hypothesis H 0 : ρ = 0. 4. Compute the equation of the regression line. 5. Compute the coefficient of determination. 6. Compute the standard error of the estimate. 7. Find a prediction interval. 8. Be familiar with the concept of multiple regression. Bluman, Chapter 10 5

Introduction In addition to hypothesis testing and confidence intervals, inferential statistics involves determining whether a relationship between two or more numerical or quantitative variables exists. Bluman, Chapter 10 6

Introduction Correlation is a statistical method used to determine whether a linear relationship between variables exists. Regression is a statistical method used to describe the nature of the relationship between variables that is, positive or negative, linear or nonlinear. Bluman, Chapter 10 7

Introduction The purpose of this chapter is to answer these questions statistically: 1. Are two or more variables related? 2. If so, what is the strength of the relationship? 3. What type of relationship exists? 4. What kind of predictions can be made from the relationship? Bluman, Chapter 10 8

Introduction 1. Are two or more variables related? 2. If so, what is the strength of the relationship? To answer these two questions, statisticians use the correlation coefficient, a numerical measure to determine whether two or more variables are related and to determine the strength of the relationship between or among the variables. Bluman, Chapter 10 9

Introduction 3. What type of relationship exists? There are two types of relationships: simple and multiple. In a simple relationship, there are two variables: an independent variable (predictor variable) and a dependent variable (response variable). In a multiple relationship, there are two or more independent variables that are used to predict one dependent variable. Bluman, Chapter 10 10

Introduction 4. What kind of predictions can be made from the relationship? Predictions are made in all areas and daily. Examples include weather forecasting, stock market analyses, sales predictions, crop predictions, gasoline price predictions, and sports predictions. Some predictions are more accurate than others, due to the strength of the relationship. That is, the stronger the relationship is between variables, the more accurate the prediction is. Bluman, Chapter 10 11

10.1 Scatter Plots and Correlation A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable x and the dependent variable y. Bluman, Chapter 10 12

Chapter 10 Correlation and Regression Section 10-1 Example 10-1 Page #536 Bluman, Chapter 10 13

Example 10-1: Car Rental Companies Construct a scatter plot for the data shown for car rental companies in the United States for a recent year. Step 1: Draw and label the x and y axes. Step 2: Plot each point on the graph. Bluman, Chapter 10 14

Example 10-1: Car Rental Companies Positive Relationship Bluman, Chapter 10 15

Chapter 10 Correlation and Regression Section 10-1 Example 10-2 Page #537 Bluman, Chapter 10 16

Example 10-2: Absences/Final Grades Construct a scatter plot for the data obtained in a study on the number of absences and the final grades of seven randomly selected students from a statistics class. Step 1: Draw and label the x and y axes. Step 2: Plot each point on the graph. Bluman, Chapter 10 17

Example 10-2: Absences/Final Grades Negative Relationship Bluman, Chapter 10 18

Chapter 10 Correlation and Regression Section 10-1 Example 10-3 Page #538 Bluman, Chapter 10 19

Example 10-3: Exercise/Milk Intake Construct a scatter plot for the data obtained in a study on the number of hours that nine people exercise each week and the amount of milk (in ounces) each person consumes per week. Step 1: Draw and label the x and y axes. Step 2: Plot each point on the graph. Bluman, Chapter 10 20

Example 10-3: Exercise/Milk Intake Very Weak Relationship Bluman, Chapter 10 21

Correlation The correlation coefficient computed from the sample data measures the strength and direction of a linear relationship between two variables. There are several types of correlation coefficients. The one explained in this section is called the Pearson product moment correlation coefficient (PPMC). The symbol for the sample correlation coefficient is r. The symbol for the population correlation coefficient is. Bluman, Chapter 10 22

Correlation The range of the correlation coefficient is from 1 to 1. If there is a strong positive linear relationship between the variables, the value of r will be close to 1. If there is a strong negative linear relationship between the variables, the value of r will be close to 1. Bluman, Chapter 10 23

Correlation Bluman, Chapter 10 24

Correlation Coefficient The formula for the correlation coefficient is r n xy x y 2 2 2 2 n x x n y y where n is the number of data pairs. Rounding Rule: Round to three decimal places. Bluman, Chapter 10 25

Chapter 10 Correlation and Regression Section 10-1 Example 10-4 Page #540 Bluman, Chapter 10 26

Example 10-4: Car Rental Companies Compute the correlation coefficient for the data in Example 10 1. Company A B C D E F Cars x (in 10,000s) 63.0 29.0 20.8 19.1 13.4 8.5 Σ x = 153.8 Income y (in billions) xy x 2 y 2 7.0 3.9 2.1 2.8 1.4 1.5 Σ y = 18.7 441.00 113.10 43.68 53.48 18.76 2.75 Σ xy = 682.77 3969.00 841.00 432.64 364.81 179.56 72.25 Σ x 2 = 5859.26 49.00 15.21 4.41 7.84 1.96 2.25 Σ y 2 = 80.67 Bluman, Chapter 10 27

Example 10-4: Car Rental Companies Compute the correlation coefficient for the data in Example 10 1. Σx = 153.8, Σy = 18.7, Σxy = 682.77, Σx 2 = 5859.26, Σy 2 = 80.67, n = 6 r r r n xy x y 2 2 2 2 n x x n y y 6 682.77 153.8 18.7 6 5859.26 153.8 6 80.67 18.7 2 2 0.982 (strong positive relationship) Bluman, Chapter 10 28

Chapter 10 Correlation and Regression Section 10-1 Example 10-5 Page #541 Bluman, Chapter 10 29

Example 10-5: Absences/Final Grades Compute the correlation coefficient for the data in Example 10 2. Student A B C D E F Number of absences, x 6 2 15 9 12 5 Final Grade y (pct.) xy x 2 y 2 82 86 43 74 58 90 492 172 645 666 696 450 36 4 225 81 144 25 6,724 7,396 1,849 5,476 3,364 8,100 G 8 78 624 64 6,084 Σ x = 57 Σ y = 511 Σ xy = 3745 Σ x 2 = 579 Σ y 2 = 38,993 Bluman, Chapter 10 30

Example 10-5: Absences/Final Grades Compute the correlation coefficient for the data in Example 10 2. Σx = 57, Σy = 511, Σxy = 3745, Σx 2 = 579, Σy 2 = 38,993, n = 7 r r r n xy x y 2 2 2 2 n x x n y y 7 3745 57 511 7 579 57 7 38,993 511 2 2 0.944 (strong negative relationship) Bluman, Chapter 10 31

Press 2 nd 0 to access CATALOG Press the down arrow key to access DiagnosticOn Press ENTER twice. 32

Calculating Linear regression: y = 3.622x + 102.492 What is the independent variable? What is the dependent variable? What is the value of slope and what does it mean? What is the value of the y-intercept and what does it mean? Bluman, Chapter 10 33

Chapter 10 Correlation and Regression Section 10-1 Example 10-6 Page #542 Bluman, Chapter 10 34

Example 10-6: Exercise/Milk Intake Compute the correlation coefficient for the data in Example 10 3. Subject Hours, x Amount y xy x 2 y 2 A B C D E F 3 0 2 5 8 5 48 8 32 64 10 32 144 0 64 320 80 160 9 0 4 25 64 25 2,304 64 1,024 4,096 100 1,024 G 10 56 560 100 3,136 H 2 72 144 4 5,184 I 1 48 48 1 2,304 Σ x = 36 Σ y = 370 Σ xy = 1,520 Σ x 2 = 232 Σ y 2 = 19,236 Bluman, Chapter 10 35

Example 10-6: Exercise/Milk Intake Compute the correlation coefficient for the data in Example 10 3. Σx = 36, Σy = 370, Σxy = 1520, Σx 2 = 232, Σy 2 = 19,236, n = 9 r r r n xy x y 2 2 2 2 n x x n y y 7 1520 36 370 7 232 36 7 19, 236 370 2 2 0.067 (very weak relationship) Bluman, Chapter 10 36

Hypothesis Testing In hypothesis testing, one of the following is true: H 0 : 0 This null hypothesis means that there is no correlation between the x and y variables in the population. H 1 : 0 This alternative hypothesis means that there is a significant correlation between the variables in the population. Bluman, Chapter 10 37

t Test for the Correlation Coefficient t r n 2 2 1 r with degrees of freedom equal to n 2. Bluman, Chapter 10 38

Chapter 10 Correlation and Regression Section 10-1 Example 10-7 Page #544 Bluman, Chapter 10 39

Example 10-7: Car Rental Companies Test the significance of the correlation coefficient found in Example 10 4. Use α = 0.05 and r = 0.982. Step 1: State the hypotheses. H 0 : ρ = 0 and H 1 : ρ 0 Step 2: Find the critical value. Since α = 0.05 and there are 6 2 = 4 degrees of freedom, the critical values obtained from Table F are ±2.776. Bluman, Chapter 10 40

Example 10-7: Car Rental Companies Step 3: Compute the test value. n 2 6 2 t r 0.982 10.4 2 1 r 1 0.982 2 Step 4: Make the decision. Reject the null hypothesis. Step 5: Summarize the results. There is a significant relationship between the number of cars a rental agency owns and its annual income. Bluman, Chapter 10 41

Chapter 10 Correlation and Regression Section 10-1 Example 10-8 Page #545 Bluman, Chapter 10 42

Example 10-8: Car Rental Companies Using Table I, test the significance of the correlation coefficient r = 0.067, from Example 10 6, at α = 0.01. Step 1: State the hypotheses. H 0 : ρ = 0 and H 1 : ρ 0 There are 9 2 = 7 degrees of freedom. The value in Table I when α = 0.01 is 0.798. For a significant relationship, r must be greater than 0.798 or less than -0.798. Since r = 0.067, do not reject the null. Hence, there is not enough evidence to say that there is a significant linear relationship between the variables. Bluman, Chapter 10 43

Possible Relationships Between Variables When the null hypothesis has been rejected for a specific a value, any of the following five possibilities can exist. 1. There is a direct cause-and-effect relationship between the variables. That is, x causes y. 2. There is a reverse cause-and-effect relationship between the variables. That is, y causes x. 3. The relationship between the variables may be caused by a third variable. 4. There may be a complexity of interrelationships among many variables. 5. The relationship may be coincidental. Bluman, Chapter 10 44

Possible Relationships Between Variables 1. There is a reverse cause-and-effect relationship between the variables. That is, y causes x. For example, water causes plants to grow poison causes death heat causes ice to melt Bluman, Chapter 10 45

Possible Relationships Between Variables 2. There is a reverse cause-and-effect relationship between the variables. That is, y causes x. For example, Suppose a researcher believes excessive coffee consumption causes nervousness, but the researcher fails to consider that the reverse situation may occur. That is, it may be that an extremely nervous person craves coffee to calm his or her nerves. Bluman, Chapter 10 46

Possible Relationships Between Variables 3. The relationship between the variables may be caused by a third variable. For example, If a statistician correlated the number of deaths due to drowning and the number of cans of soft drink consumed daily during the summer, he or she would probably find a significant relationship. However, the soft drink is not necessarily responsible for the deaths, since both variables may be related to heat and humidity. Bluman, Chapter 10 47

Possible Relationships Between Variables 4. There may be a complexity of interrelationships among many variables. For example, A researcher may find a significant relationship between students high school grades and college grades. But there probably are many other variables involved, such as IQ, hours of study, influence of parents, motivation, age, and instructors. Bluman, Chapter 10 48

Possible Relationships Between Variables 5. The relationship may be coincidental. For example, A researcher may be able to find a significant relationship between the increase in the number of people who are exercising and the increase in the number of people who are committing crimes. But common sense dictates that any relationship between these two values must be due to coincidence. Bluman, Chapter 10 49

On Your own: Read section 10.1 Study the vocabulary words discussed in class the bold words in your reading. Assignment: Page 548 Sec 10.1 1-11 all, 13, 17, 19 50