Chapter 10. Correlation and Regression. Lecture 1 Sections:

Chapter 10 Correlation and Regression. Lecture 1. Sections: 10.1 10.

You will now be introduced to important methods for making inferences based on sample data that come in pairs. In the previous chapter, we discussed matched pairs, which dealt with differences between two population means. The objective of this chapter is to determine whether there is a relationship between two variables; paired data on two variables are sometimes referred to as bivariate data. If a relationship exists, we want to describe it with an equation that can be used for predictions.

We will begin by considering the concept of correlation, which is used to determine whether there is a statistically significant relationship between two variables. We can investigate correlation by using a scatterplot. We will consider only linear relationships, which means that when graphed, the points approximate a straight-line pattern. We can also investigate correlation with a measure of the direction and strength of the linear association between two variables, which is called the linear correlation coefficient.

Definitions:
1. A correlation exists between two variables when one of them is related to the other in some way.
2. A scatterplot, sometimes called a scatter diagram, is a graph in which the paired (x, y) sample data are plotted with a horizontal x-axis and a vertical y-axis. Each individual (x, y) pair is plotted as a single point.
3. The linear correlation coefficient, denoted r, measures the strength of the linear relationship between the paired x- and y-quantitative values in a sample. If we had every pair of population values for x and y, the result would be a population parameter, represented by the Greek letter rho, ρ.

Assumptions:
1. The sample of paired (x, y) data is a random sample of quantitative data.
2. The pairs of (x, y) data have a bivariate normal distribution. This assumption requires that for any fixed value of x, the corresponding values of y have a distribution that is normal. Furthermore, for any fixed value of y, the values of x have a distribution that is normal. This assumption is usually difficult to check, but a partial check can be made by determining whether the values of both x and y have distributions that are approximately normal.

Notation for the Linear Correlation Coefficient:
n: represents the number of pairs of data present.
$\sum$: denotes the addition of the items indicated.
$\sum x$: denotes the sum of all x-values.
$\sum x^2$: indicates that each x-value should be squared and then those squares added.
$(\sum x)^2$: indicates that the x-values should be added and the total then squared.
*NOTE: $\sum x^2 \neq (\sum x)^2$
$\sum xy$: indicates that each x-value should first be multiplied by its corresponding y-value. After obtaining all such products, find their sum.
r: represents the linear correlation coefficient for a sample.
ρ: represents the linear correlation coefficient for a population.
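The notation can be made concrete with a small numerical sketch. The data below are hypothetical values chosen only for illustration (they do not come from the lecture), and the variable names are assumptions of this sketch.

```python
# Hypothetical paired data, chosen only to illustrate the notation.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)                                   # n: number of (x, y) pairs
sum_x = sum(x)                               # sum of all x-values
sum_y = sum(y)                               # sum of all y-values
sum_x_sq = sum(v * v for v in x)             # square each x-value, then add
sq_sum_x = sum_x ** 2                        # add the x-values, then square the total
sum_xy = sum(a * b for a, b in zip(x, y))    # multiply each pair, then add the products

print("n =", n)
print("sum of x =", sum_x)
print("sum of x squared =", sum_x_sq)
print("(sum of x) squared =", sq_sum_x)
print("sum of xy =", sum_xy)
```

For these values the sum of squares is 55 while the squared sum is 225, which shows why the two quantities must not be confused.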

Properties of the Linear Correlation Coefficient r:

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}} $$

Round r to a fixed number of decimal places.

1. $-1 \le r \le +1$.
2. The value of r does not change if all values of either variable are converted to a different scale.
3. The value of r is not affected by the choice of x or y. Interchange all x- and y-values and the value of r will not change.
4. r measures the strength of a linear relationship. It is not designed to measure the strength of a relationship that is not linear.
5. The value of $r^2$ is the proportion of the variation in y that is explained by the linear relationship between x and y.

1. The accompanying table lists monthly incomes and the corresponding food expenditures for the month of December.

Income: $,00 $8,00 $,800 $,100 $,00 $,900 $,700
Food Expenditure: $1,00 $,00 $1,00 $1,00 $900 $1,00 $1,700

[Figure: Scatterplot of FoodEx vs income]

MINITAB Output
Regression Analysis: FoodEx versus income
The regression equation is FoodEx = 11 + 0. income

Predictor   Coef   SE Coef   T     P
Constant    10.7   17.       0.9   0.19
income      0.     0.0788    .     0.001

S = 19.08   R-Sq = 89.9%   R-Sq(adj) = 87.9%
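The formula for r and several of its properties can be checked numerically. The following is a minimal sketch using hypothetical data, not the income data from the example; the function name `linear_correlation` is an assumption of this sketch.

```python
import math

def linear_correlation(x, y):
    """Sample linear correlation coefficient r from the sums-based formula."""
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of values")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x_sq = sum(a * a for a in x)
    sum_y_sq = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x_sq - sum_x ** 2)
                   * math.sqrt(n * sum_y_sq - sum_y ** 2))
    return numerator / denominator

# Hypothetical paired data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = linear_correlation(x, y)

# Interchanging x and y leaves r unchanged.
r_swapped = linear_correlation(y, x)

# Converting x to a different scale (here 10x + 3) leaves r unchanged.
r_rescaled = linear_correlation([10 * v + 3 for v in x], y)

# A perfect but nonlinear relationship (y = x^2 on symmetric x) gives r = 0.
x_curved = [-2, -1, 0, 1, 2]
r_curved = linear_correlation(x_curved, [v * v for v in x_curved])

print(round(r, 3), round(r_swapped, 3), round(r_rescaled, 3), round(r_curved, 3))
print("proportion of variation in y explained:", round(r ** 2, 3))
```

For these hypothetical values r is about 0.775, so about 60% of the variation in y is explained by the linear relationship.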

Common Errors Involving Correlation:
1. A common error is to conclude that correlation implies causality. Using the example above, we can conclude that there is a correlation between income and food expenditure, but we cannot conclude that income causes food expenditure.
2. Another error arises with data based on averages. Averages suppress individual variation and may inflate the correlation coefficient.
3. The property of linearity. A relationship may exist between x and y even when there is no significant linear correlation. If we look at the figure at the right, r = 0. This is an indication of no linear correlation between the two variables. However, we can easily see that the figure has a pattern reflecting a very strong nonlinear relationship.

Formal Hypothesis Test: Results and Conclusions
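One commonly taught way to test the null hypothesis ρ = 0 against the alternative ρ ≠ 0, assumed here rather than taken from these notes, uses the test statistic t = r / sqrt((1 - r^2) / (n - 2)) with n - 2 degrees of freedom. The sketch below computes that statistic; the function name, the sample values of r and n, and the critical value taken from a t table are all assumptions made for illustration.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing rho = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("at least three pairs are needed")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r / math.sqrt((1 - r * r) / (n - 2))

# Hypothetical sample result: r = 0.8 from n = 10 pairs.
r_sample, n_pairs = 0.8, 10
t_stat = correlation_t_statistic(r_sample, n_pairs)
degrees_of_freedom = n_pairs - 2

# Two-tailed critical value for alpha = 0.01 and 8 degrees of freedom,
# read from a standard t table.
t_critical = 3.355
reject_null = abs(t_stat) > t_critical

print("t =", round(t_stat, 3), "df =", degrees_of_freedom)
print("significant linear correlation" if reject_null
      else "no significant linear correlation")
```

With these hypothetical inputs the statistic exceeds the critical value, so the sample would support a claim of linear correlation at the 0.01 level.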

2. Test the claim that there is a linear correlation between neck size and arm length. α = 0.01.

Neck (x): 1.8 18.0 1. 17. 1. 17. 1.
Arm Length (y): ..8..0.7.7.8

3. The U.S. Department of Education reports that there exists a linear correlation between SAT scores and GPA at the high school level. Ten high school students were randomly selected, and their SAT scores and GPAs are listed below.
a. Test the claim.
b. Find the proportion of the variation in y that is explained by the linear relationship between x and y.

SAT, GPA: 191 10 1 119 979 8 791 7 7..9.7.1.80.70....07

4. The accompanying table lists weights in pounds of paper discarded by a sample of households, along with the size of the household. Test the claim that ρ = 0.

Paper:.1 7.7 9. 8.8 8.7.9.8 11.
HSize : 1

[Figure: Fitted Line Plot, HSize = 0.1 + 0.979 Paper; S 1.007, R-Sq 9.%, R-Sq(adj) 9.%]
[Figure: Scatterplot of HSize vs Paper]

5. The following are the ages and corresponding blood pressures of 10 subjects randomly selected from a large city. Is there a significant linear correlation between age and blood pressure?

Age, Blood Pressure: 8 1 0 0 10 11 10 10 1 1 10 1 10 19

6. A study was conducted to investigate the relationship between age (in years) and BAC (blood alcohol concentration) measured when convicted DWI (driving while intoxicated) jail inmates were first arrested. Based on the data below, does the BAC seem to be correlated with the age of the person tested?

Age, BAC: 17.. 0.7.1 7. 1.0 7.. 0.19 0.0 0. 0.1 0. 0.0 0.18 0.