Correlation and Regression Analysis. Linear Regression and Correlation. Correlation and Linear Regression. Three Questions.

Similar documents
Linear Regression and Correla/on. Correla/on and Regression Analysis. Three Ques/ons 9/14/14. Chapter 13. Dr. Richard Jerz

Linear Regression and Correla/on

Linear Regression and Correlation

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapter 12 - Part I: Correlation Analysis

Chapter 14 Simple Linear Regression (A)

Review 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2

Correlation Analysis

BNAD 276 Lecture 10 Simple Linear Regression Model

Statistics for Managers using Microsoft Excel 6 th Edition

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal

Chapter 4. Regression Models. Learning Objectives

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania

Regression Models. Chapter 4

Ch 13 & 14 - Regression Analysis

Basic Business Statistics 6 th Edition

Association Between Variables Measured at the Interval-Ratio Level: Bivariate Correlation and Regression

Mathematics for Economics MA course

Ordinary Least Squares Regression Explained: Vartanian

Econ 3790: Business and Economics Statistics. Instructor: Yogesh Uppal

Inferences for Regression

Business Statistics. Lecture 9: Simple Regression

Basics of Experimental Design. Review of Statistics. Basic Study. Experimental Design. When an Experiment is Not Possible. Studying Relations

Forecasting. Dr. Richard Jerz rjerz.com

Business Statistics. Lecture 10: Correlation and Linear Regression

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

Stats Review Chapter 14. Mary Stangler Center for Academic Success Revised 8/16

The Multiple Regression Model

Correlation and Regression

Scatter plot of data from the study. Linear Regression

THE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS

Probability and Statistics Notes

Six Sigma Black Belt Study Guides

Chapter Eight: Assessment of Relationships 1/42

Measuring the fit of the model - SSR

Scatter plot of data from the study. Linear Regression

Module 8: Linear Regression. The Applied Research Center

Chapter 4: Regression Models

Confidence Intervals, Testing and ANOVA Summary

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

A Plot of the Tracking Signals Calculated in Exhibit 3.9

Chapter 9. Correlation and Regression

regression analysis is a type of inferential statistics which tells us whether relationships between two or more variables exist

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 12 : Linear Correlation and Linear Regression

Midterm 2 - Solutions

Chapter 16. Simple Linear Regression and Correlation

Chapter 27 Summary Inferences for Regression

Correlation and Regression

Linear Correlation and Regression Analysis

Psychology 282 Lecture #4 Outline Inferences in SLR

Lecture Prepared By: Mohammad Kamrul Arefin Lecturer, School of Business, North South University

Simple Linear Regression Using Ordinary Least Squares

Intro to Linear Regression

Simple Linear Regression

Econ 3790: Statistics Business and Economics. Instructor: Yogesh Uppal

Biostatistics 380 Multiple Regression 1. Multiple Regression

y ˆ i = ˆ " T u i ( i th fitted value or i th fit)

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

REVIEW 8/2/2017 陈芳华东师大英语系

Chapter 7: Correlation and regression

STAT Chapter 11: Regression

Sect Polynomial and Rational Inequalities

Regression. Estimation of the linear function (straight line) describing the linear component of the joint relationship between two variables X and Y.

Inference for Regression Inference about the Regression Model and Using the Regression Line

LI EAR REGRESSIO A D CORRELATIO

LOOKING FOR RELATIONSHIPS

Regression Analysis II

appstats27.notebook April 06, 2017

Example: Forced Expiratory Volume (FEV) Program L13. Example: Forced Expiratory Volume (FEV) Example: Forced Expiratory Volume (FEV)

Can you tell the relationship between students SAT scores and their college grades?

Modeling and Performance Analysis with Discrete-Event Simulation

Warm-up Using the given data Create a scatterplot Find the regression line

Simple Linear Regression

Computer Science, Informatik 4 Communication and Distributed Systems. Simulation. Discrete-Event System Simulation. Dr.

Topic 10 - Linear Regression

Inference for Regression Inference about the Regression Model and Using the Regression Line, with Details. Section 10.1, 2, 3

Inference for Regression Simple Linear Regression

Introduction and Single Predictor Regression. Correlation

Simple Linear Regression

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

Lecture # 31. Questions of Marks 3. Question: Solution:

Chapter 7 Student Lecture Notes 7-1

Correlation and Regression Theory 1) Multivariate Statistics

Ch 2: Simple Linear Regression

Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Business Statistics. Lecture 10: Course Review

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

Sect 2.4 Linear Functions

Unit 10: Simple Linear Regression and Correlation

Predicted Y Scores. The symbol stands for a predicted Y score

9 Correlation and Regression

Simple Linear Regression

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.

Exam Empirical Methods VU University Amsterdam, Faculty of Exact Sciences h, February 12, 2015

Ordinary Least Squares Regression Explained: Vartanian

Transcription:

10/8/18 Correlation and Regression Analysis Correlation Analysis is the study of the relationship between variables. It is also defined as group of techniques to measure the association between two variables. Regression Analysis is a technique used to express the relationship between two variables. If the relationship is assumed to be a straight line, this is called linear regression. Linear Regression and Correlation Dr. Rick Jerz 1 Three Questions Correlation and Linear Regression 1. Are two variables related? (correlation analysis). Is there a linear relationship between two variables? (linear regression analysis) 3. How strong are these relationships? 3 4 Correlation Correlation & Regression Example The sales manager of Copier Sales of America, which has a large sales force throughout the United States and Canada, wants to determine whether there is a relationship between the number of sales calls made in a month and the number of copiers sold that month. The manager selects a random sample of 10 representatives and determines the number of sales calls each representative made last month and the number of copiers sold. 5 6 1

Step 1: Look at the Data (Plot the Data) A Scatter Diagram is a chart that portrays the relationship between the two variables. It is the usual first step in correlation analysis The Dependent Variable is the variable being predicted or estimated. The Independent Variable provides the basis for estimation. It is the predictor variable. Step : Are they correlated? 7 8 The Coefficient of Correlation, r The Coefficient of Correlation (r) is a measure of the strength of the relationship between two variables. It requires interval or ratioscaled data. It can range from -1.00 to 1.00. Values of -1.00 or 1.00 indicate perfect and strong correlation. Values close to 0.0 indicate weak correlation. Negative values indicate an inverse relationship and positive values indicate a direct relationship. Correlation Coefficient Interpretation 9 10 Correlation Coefficient Equation, r Coefficient of Determination The coefficient of determination (r ) is the proportion of the total variation in the dependent variable (Y) that is explained or accounted for by the variation in the independent variable (X). It is the square of the coefficient of correlation. It ranges from 0 to 1. It does not give any information on the direction of the relationship between the variables. 11 1

Example: Correlation Coefficient How do we interpret a correlation of 0.759? First, it is positive, so we see there is a direct relationship between the number of sales calls and the number of copiers sold. The value of 0.759 is fairly close to 1.00, so we conclude that the association is strong. However, does this mean that more sales calls cause more sales? No, we have not demonstrated cause and effect here, only that the two variables sales calls and copiers sold are related. 13 Coefficient of Determination (r ) - Example The coefficient of determination, r,is 0.576, found by (0.759) This is a proportion or a percent; we can say that 57.6 percent of the variation in the number of copiers sold is explained, or accounted for, by the variation in the number of sales calls. 14 Step 3: How strong are they correlated? Testing the Significance of H 0 : r = 0 (the correlation in the population is 0) H 1 : r 0 (the correlation in the population is not 0) Reject H 0 if: t > t a/,n- or t < -t a/,n- 15 16 Example: Testing the Significance of H 0: r = 0 (the correlation in the population is 0) H 1: r 0 (the correlation in the population is not 0) Reject H 0 if: t > t a/,n- or t < -t a/,n- t > t 0.05,8 or t < -t 0.05,8 t >.306 or t < -.306 Testing the Significance of Equivalently, you can calculate the critical value for the correlation coefficient using This method gives a benchmark for the correlation coefficient. However, there is no p-value and is inflexible if you change your mind about α. 17 18 3

Testing Relationship with r Testing Relationship with r Step 1: State the Hypotheses Determine whether you are using a one or two-tailed test and the level of significance (a). H0 : ρ = 0 H 1 : ρ 0 Step : Calculate the Critical Value For degrees of freedom n = n -, determine t α then calculate Make the Decision If the sample correlation coefficient r exceeds the critical value r α, then reject H 0. If using the t statistic method, reject H 0 if t > t α or if the p-value α. 19 0 Ex: Testing the Significance of Step 4: Is there a linear relationship? Regression The computed t (3.97) is within the rejection region, therefore, we will reject H0. This means the correlation in the population is not zero. From a practical standpoint, it indicates to the sales manager that there is correlation with respect to the number of sales calls made and the number of copiers sold in the population of salespeople. 1 Example: Robot Repeatability Data 3 4 4

Bivariate Regression Analysis Linear Regression Bivariate Regression analyzes the relationship between two variables. It specifies one dependent (response) variable and one independent (predictor) variable. This hypothesized relationship may be linear, quadratic, or whatever. Unknown parameters are β 0 Intercept β 1 Slope The assumed model for a linear relationship is y i = β 0 + β 1x i + e i, for all observations (i = 1,,, n) The error term is not observable, is assumed normally distributed with mean of 0 and standard deviation σ. 5 6 Linear Regression Model Y ˆ = a + bx Differences between Actual Y Estimated Y Y Hat, is the estimate of Y given X a is the Y-intercept b is the slope of the line X is any value of the independent variable 7 8 Least Squares Principle Determining a regression equation by minimizing the sum of the squares (the variance) of the vertical distances between the actual Y values and the predicted values of Y. Calculating a and b (least squares) b = å XY å X - nxy - nx a = Y - bx 9 30 5

Example: Finding the Regression Equation Step 5: Plot the Estimated and the Actual Y s The regression equation is: Y = a + bx Y = 18.9476 + 1.184X Y = 18.9476 + 1.184(0) 31 Y = 4.6316 3 Computing the Estimates of Y Step 6: Predict other values Step 1 Using the regression equation, substitute the value of each X to solve for the estimated sales The regression equation is: Y = a + bx Y = 18.9476 + 1.184X Y = 18.9476 + 1.184(0) Tom Keller Y = 18.9476 + 1.184X Y = 18.9476 + 1.184(0) Soni Jones Y = 18.9476 + 1.184X Y = 18.9476 + 1.184(30) Y = 4.6316 33 Y = 4.6316 Y = 54.4736 34 Assumptions Underlying Linear Regression Assumptions: Graphic For each value of X, there is a group of Y values, and these Y values are normally distributed. The means of these normal distributions of Y values all lie on the straight line of regression. The standard deviations of these normal distributions are equal. The Y values are statistically independent. This means that in the selection of a sample, the Y values chosen for a particular X value do not depend on the Y values for any other X values. 35 36 6

Confidence Intervals for Slope and Intercept Confidence interval for the true slope: Ex: Confidence Interval Estimate Step 4 Use the formula above by substituting the numbers computed in previous slides Confidence interval for the true intercept: Thus, the 95 percent confidence interval for the average sales of all sales representatives who make 5 calls is from 40.9170 up to 56.188 copiers. 37 38 Ex: Prediction Interval Estimate Step Using the information computed earlier in the confidence interval estimation example, use the formula above. 39 If Sheila Baker makes 5 sales calls, the number of copiers she will sell will be between about 4 and 73 copiers 7