Chapter 12 - Part I: Correlation Analysis

Similar documents
Linear Regression and Correlation

Correlation and Regression Analysis. Linear Regression and Correlation. Correlation and Linear Regression. Three Questions.

Business Statistics. Lecture 10: Correlation and Linear Regression

Inferences for Regression

Ch 13 & 14 - Regression Analysis

Basic Business Statistics 6 th Edition

Regression Models. Chapter 4. Introduction. Introduction. Introduction

Correlation. What Is Correlation? Why Correlations Are Used

Topic 10 - Linear Regression

AMS 7 Correlation and Regression Lecture 8

Chapter 8. Linear Regression. Copyright 2010 Pearson Education, Inc.

REVIEW 8/2/2017 陈芳华东师大英语系

Chapter 16. Simple Linear Regression and Correlation

Chapter 4: Regression Models

Business Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal

Chapter 8. Linear Regression. The Linear Model. Fat Versus Protein: An Example. The Linear Model (cont.) Residuals

appstats27.notebook April 06, 2017

Correlation and simple linear regression S5

Chapter 4. Regression Models. Learning Objectives

Business Statistics. Lecture 9: Simple Regression

Two-Variable Analysis: Simple Linear Regression/ Correlation

Chapter 16. Simple Linear Regression and dcorrelation

Chapter 10. Regression. Understandable Statistics Ninth Edition By Brase and Brase Prepared by Yixun Shi Bloomsburg University of Pennsylvania

Mathematics for Economics MA course

CHAPTER 4 DESCRIPTIVE MEASURES IN REGRESSION AND CORRELATION

Keller: Stats for Mgmt & Econ, 7th Ed July 17, 2006

Six Sigma Black Belt Study Guides

Correlation Analysis

Lecture 14. Analysis of Variance * Correlation and Regression. The McGraw-Hill Companies, Inc., 2000

Lecture 14. Outline. Outline. Analysis of Variance * Correlation and Regression Analysis of Variance (ANOVA)

Lecture 15: Chapter 10

Chapter 27 Summary Inferences for Regression

Scatter plot of data from the study. Linear Regression

Regression. Estimation of the linear function (straight line) describing the linear component of the joint relationship between two variables X and Y.

Simple Linear Regression

Intro to Linear Regression

BNAD 276 Lecture 10 Simple Linear Regression Model

Lecture 3. The Population Variance. The population variance, denoted σ 2, is the sum. of the squared deviations about the population

Variance. Standard deviation VAR = = value. Unbiased SD = SD = 10/23/2011. Functional Connectivity Correlation and Regression.

Inference with Simple Regression

Unit 6 - Introduction to linear regression

Scatter plot of data from the study. Linear Regression

Statistics for Managers using Microsoft Excel 6 th Edition

Describing the Relationship between Two Variables

Chapter 10 Correlation and Regression

ECON3150/4150 Spring 2015

Lecture 30. DATA 8 Summer Regression Inference

4.1 Introduction. 4.2 The Scatter Diagram. Chapter 4 Linear Correlation and Regression Analysis

Linear Regression and Correla/on. Correla/on and Regression Analysis. Three Ques/ons 9/14/14. Chapter 13. Dr. Richard Jerz

Linear Regression and Correla/on

Lecture 11: Simple Linear Regression

Regression Models. Chapter 4

Section Linear Correlation and Regression. Copyright 2013, 2010, 2007, Pearson, Education, Inc.

STAT 350 Final (new Material) Review Problems Key Spring 2016

Chapter 12 - Lecture 2 Inferences about regression coefficient

Linear Correlation and Regression Analysis

Psych 10 / Stats 60, Practice Problem Set 10 (Week 10 Material), Solutions

Chapte The McGraw-Hill Companies, Inc. All rights reserved.

Chapter 6: Exploring Data: Relationships Lesson Plan

Chapter 14 Simple Linear Regression (A)

1. Create a scatterplot of this data. 2. Find the correlation coefficient.

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

+ Statistical Methods in

Correlation. A statistics method to measure the relationship between two variables. Three characteristics

AMS 315/576 Lecture Notes. Chapter 11. Simple Linear Regression

Chapter 4 Describing the Relation between Two Variables

Introduction to Linear Regression

Intro to Linear Regression

ES-2 Lecture: More Least-squares Fitting. Spring 2017

appstats8.notebook October 11, 2016

Linear Regression. Linear Regression. Linear Regression. Did You Mean Association Or Correlation?

Estimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.

Chapter 13 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics

Lecture 8 CORRELATION AND LINEAR REGRESSION

Chapter 10. Correlation and Regression. McGraw-Hill, Bluman, 7th ed., Chapter 10 1

Stat 101 Exam 1 Important Formulas and Concepts 1

CHAPTER 5 FUNCTIONAL FORMS OF REGRESSION MODELS

LAB 5 INSTRUCTIONS LINEAR REGRESSION AND CORRELATION

Unit 6 - Simple linear regression

Ordinary Least Squares Regression Explained: Vartanian

Correlation and Linear Regression

STATISTICS 110/201 PRACTICE FINAL EXAM

Mathematics Level D: Lesson 2 Representations of a Line

28. SIMPLE LINEAR REGRESSION III

Introduction and Single Predictor Regression. Correlation

Applied Regression Modeling: A Business Approach Chapter 2: Simple Linear Regression Sections

Regression Analysis II

Objectives Simple linear regression. Statistical model for linear regression. Estimating the regression parameters

Linear Regression. Simple linear regression model determines the relationship between one dependent variable (y) and one independent variable (x).

CHAPTER 5 LINEAR REGRESSION AND CORRELATION

Chapter 14 Student Lecture Notes Department of Quantitative Methods & Information Systems. Business Statistics. Chapter 14 Multiple Regression

Chapter 12 : Linear Correlation and Linear Regression

Can you tell the relationship between students SAT scores and their college grades?

Lecture 4 Scatterplots, Association, and Correlation

Regression With a Categorical Independent Variable

Lecture 2 Simple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: Chapter 1

Business Mathematics and Statistics (MATH0203) Chapter 1: Correlation & Regression

Analyzing Lines of Fit

Lecture (chapter 13): Association between variables measured at the interval-ratio level

Year 10 Mathematics Semester 2 Bivariate Data Chapter 13

Transcription:

ST coursework due Friday, April - Chapter - Part I: Correlation Analysis Textbook Assignment Page - # Page - #, Page - # Lab Assignment # (available on ST webpage) GOALS When you have completed this lecture, you will be able to: ONE Draw and interpret a scatter diagram. TWO Understand and interpret the terms dependent variable, independent variable and spurious correlation. THREE Calculate and interpret the coefficient of correlation and the coefficient of determination. FOUR Conduct a test of hypothesis to determine if the population correlation is zero (thereby checking for a significant linear relationship between two variables). Correlation Analysis Slide The Coefficient of Correlation, r Slide Correlation Analysis: A group of statistical techniques used to measure the strength of the relationship (correlation) between two variables. Scatter Diagram: A chart that portrays the relationship between the two variables of interest. Dependent Variable: The variable that is being predicted or estimated. Independent Variable: The variable that provides the basis for estimation. It is the predictor variable. The Coefficient of Correlation (r) is a sample statistic that measures the strength of linear relationship between two variables. It requires quantitative data It can range from -. to. Values of -. or. indicate perfect linear correlation. Values close to. indicate weak linear correlation. Negative values indicate an inverse linear relationship and positive values indicate a direct linear relationship.

Perfect Negative Correlation Slide Perfect Positive Correlation Slide Zero Correlation Slide Strong Positive Correlation Slide

Zero Correlation! Slide Formula for r & Correlation vs. Causation Slide r n( Σ ) ( Σ )( Σ ) [ ] [ n( Σ ) ( Σ ) ] n( Σ ) ( Σ ) Correlation does not imply causation: Some variables are linearly related, but the relationship is not a causal relationship. In other words, a change in the independent variable doesn t really cause a change in the dependent variable. We should be careful not to imply that a relationship between two variables is a causal relationship, unless we have conducted an experiment designed to detect such a relationship. Coefficient of Determination Slide EAMPLE Slide The Coefficient of Determination, r, measures the proportion of the total variation in the dependent variable that is explained or accounted for by a linear relationship between and the independent variable. The coefficient of determination is the square of the coefficient of correlation, and ranges from to. An ASCSU senator is concerned about the cost of textbooks. To provide insight into the problem she selects a sample of eight textbooks currently on sale in the bookstore. She is interested in whether the price of a book is related to the number of pages in the text.

Slide Slide : B o o k P a g e s P r i c e ( $ ) Scatter diagram of # pages versus price PRICE PAGES Slide Slide The dependent variable for the study is The independent variable is To measure the strength of the linear relationship between and, we compute the correlation coefficient. r n( Σ ) ( Σ )( Σ ) [ ] [ n( Σ ) ( Σ ) ] n( Σ ) ( Σ ) Based on the value of the sample correlation coefficient, what do you think about the strength of the linear relationship between the cost of textbooks and the number of pages in the book?

Slide The Population Correlation Coefficient Slide The sample coefficient of determination is r The value of the sample coefficient of determination indicates that In example, the sample correlation coefficient was computed based on a sample of only textbooks. If we had access to the entire population of textbooks, we could compute the correlation coefficient for the entire population. The sample correlation coefficient, r, is a point estimate for the population correlation coefficient, ρ. If the population correlation coefficient is zero, then there is no linear relationship between the two variables. Slide Slide Test the hypothesis that there is no linear relationship for the population. Use a. significance level. Step : H : ρ (no linear relationship) H : ρ (linear relationship) Step : α. r n t Step : Test statistic: r t-distribution with df n- Step : df, α. H is rejected if t > or if t < Step : t r n r Since, the decision is Conclusion:

- Chapter - Part II: Regression Analysis Regression Analysis Slide GOALS When you have completed this chapter, you will be able to: ONE Interpret the components of a regression equation and explain the concept of extrapolation. TWO Calculate the regression line, interpret the slope and intercept values, and use the regression equation to predict or estimate values for the dependent variable. THREE Calculate and interpret the standard error of estimate. FOUR Explain the assumptions required for regression analysis and know how to check them. Purpose: To determine the regression equation. The regression equation is used to predict the value of the dependent variable () based on the independent variable (). Procedure: Select a sample from the population and list the paired data for each observation; draw a scatter diagram to give a visual portrayal of the relationship; determine the regression equation. Regression Analysis Slide EAMPLE Slide the regression equation: ' a + b, where: ' is the predicted value of for a given value. a is the -intercept, or the estimated value when. It only makes sense to interpret the intercept when is in the range of the observed data. b is the slope of the line, or the average change in for each change of one unit in the least squares principle is used to obtain a & b: Develop a regression equation for the information given in EAMPLE from the previous lecture. Use the regression equation to predict the selling price of a book based on the number of pages. n( ) ( )( ) b n( ) ( ) b a n ( Σ ) ( Σ ) ( Σ ) n ( Σ ) ( Σ ) Σ Σ b n n a b n n

Slide The Standard Error of Estimate Slide So the regression equation is Interpretation of the coefficients: a: b: The standard error of estimate measures the scatter, or dispersion, of the observed values around the line of regression The formulas that are used to compute the standard error: S Σ ( ' ) n Does the regression equation describe the relationship between & perfectly for the sample data? Σ a ( Σ ) b ( Σ ) n Slide Slide Determine the standard error of the estimate for the regression equation s a( ) b( ) n Note: In this example, the values of the variable, number of pages, ranged from to. Caution: A sample regression equation should only be used for prediction or estimation within the range of -values that are present in the sample. To predict or estimate outside this range of values is called extrapolation. Use the regression equation obtained to estimate the price of a textbook with pages. Now estimate the price of a textbook with pages.

Residuals Slide Assumptions Underlying Linear Regression Slide A residual is the difference between the actual value for and the predicted value,. Residual - For observation # in Example, the number of pages was, and the price was $. Compute the value of the residual for this observation: Every observation in the sample has an associated residual. Assumption : For each value of, there is a group of values, and these values are normally distributed. Assumption : The means of these normal distributions of values all lie on the straight line of regression. Assumption : The standard deviations of these normal distributions are equal. Assumption : The values are statistically independent. Assumptions Underlying Linear Regression Slide Checking Regression Assumptions Slide To check Assumption #: use a normal probability plot of residuals. If the points fall approximately on a line, the assumption of normality seems reasonable. To check Assumption #: use the scatter diagram. To check Assumption #: use a residual plot. plot the residuals on the vertical axis versus the corresponding value of on the horizontal axis. If the amount of vertical spread appears to change as increases, then the assumption of equal variance is suspect. To check Assumption #: consider the design of the study. The assumption holds if: a simple random sample of n observations (x,y) was obtained, or if for each selected value of, or a simple random sample of y-values was obtained. Irwin/McGraw-Hill The McGraw-Hill Companies, Inc.,

Slide Slide Normal Probability Plot of the Residuals (response is PRICE) Residuals Versus PAGES (response is PRICE).. Normal Score.. -. Residual - -. - -. - - - - Residual PAGES