CORRELATION - Pearson-r - Spearman-rho


Scatter Diagram A scatter diagram is a graph that shows the relationship between two variables measured on the same individual. Each individual in the data set is represented by a point in the scatter diagram. The predictor variable is plotted on the horizontal axis and the response variable on the vertical axis. Do not connect the points when drawing a scatter diagram.

Scatterplot A scatterplot is a graph that shows the location of each data point formed by a pair of X-Y scores. In a positive linear relationship, as the X scores increase, the Y scores tend to increase. In a negative linear relationship, as the X scores increase, the Y scores tend to decrease. In a nonlinear relationship, as the X scores increase, the Y scores do not only increase or only decrease.

Types of relationship A horizontal scatterplot, with a horizontal regression line, indicates no relationship. Sloping scatterplots with regression lines oriented so that Y increases as X increases indicate a positive linear relationship. Sloping scatterplots with regression lines oriented so that Y decreases as X increases indicate a negative linear relationship. Scatterplots producing curved regression lines indicate nonlinear relationships.

Strength of relationship The strength of a relationship is the extent to which one value of Y is consistently paired with one and only one value of X. The strength of a relationship is also referred to as the degree of association between the two variables. The absolute value of the correlation coefficient (the size of the number we calculate, ignoring its sign) indicates the strength of the relationship. The largest absolute value you can obtain is 1.0 and the smallest is 0. The larger the absolute value, the stronger the relationship.

For example, on average, as height in people increases, so does weight.

No.   Height (in)   Weight (lbs)
1     60            102
2     62            120
3     63            130
4     65            150
5     65            120
6     68            145
7     69            175
8     70            170
9     72            185
10    74            210

Example of a Positive Correlation If the correlation is positive, when one variable increases, so does the other.

For example, as study time increases, the number of errors on an exam decreases.

No.   Study time (min)   No. errors on test
1     90                 25
2     100                28
3     130                20
4     150                20
5     180                15
6     200                12
7     220                13
8     300                10
9     350                8
10    400                6

Example of a negative correlation If the correlation is negative, when one variable increases, the other decreases.
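The sign of the correlation can be checked directly on the two example data sets above (height vs. weight, study time vs. errors). The sketch below uses the Pearson product-moment coefficient in its sum-of-products form; the helper name `pearson_r` is an illustrative choice, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed from sums of deviation cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Data copied from the two example tables
height = [60, 62, 63, 65, 65, 68, 69, 70, 72, 74]
weight = [102, 120, 130, 150, 120, 145, 175, 170, 185, 210]
study_time = [90, 100, 130, 150, 180, 200, 220, 300, 350, 400]
errors = [25, 28, 20, 20, 15, 12, 13, 10, 8, 6]

r_height_weight = pearson_r(height, weight)    # expected to be positive
r_study_errors = pearson_r(study_time, errors)  # expected to be negative

print(f"height vs weight:     r = {r_height_weight:.3f}")
print(f"study time vs errors: r = {r_study_errors:.3f}")
```

The first coefficient should come out positive and the second negative, matching the direction of each relationship.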

Example of a zero correlation If there is no relationship between the two variables, then as one variable increases, the other variable neither increases nor decreases. In this case, the correlation is zero. For example, if we measure the SAT-V scores of college freshmen and also measure the circumference of their right big toes, there will be a zero correlation.

What is the correlation coefficient? Linear means straight line. Correlation means co-relation, or the degree that two variables "go together". Linear correlation means to go together in a straight line. The correlation coefficient is a number that summarizes the direction and degree (closeness) of linear relations between two variables.

What is the correlation coefficient? The correlation coefficient is also known as the Pearson Product-Moment Correlation Coefficient. The sample value is called r, and the population value is called ρ (rho).

What is the correlation coefficient? The correlation coefficient can take values from -1 through 0 to +1. The sign (+ or -) of the correlation affects its interpretation. When the correlation is positive (r > 0), as the value of one variable increases, so does the other. When the correlation is negative (r < 0), as the value of one variable increases, the other decreases.

The correlation coefficient
1. Pearson correlation coefficient (both variables must be interval or ratio)
2. Spearman rank-order correlation coefficient (both variables are ordinal (ranked))
3. Point-biserial correlation coefficient (one variable is interval or ratio and one variable is nominal and dichotomous)
4. Phi (both variables are nominal and dichotomous)

Correlation & Association

Scale               Example
Interval-interval   Pearson r
Ordinal-ordinal     Spearman rank
Nominal-nominal     Phi, chi-square test of independence
Nominal-interval    Eta
Nominal-ordinal     Theta, Kruskal-Wallis H test
Ordinal-interval    Jaspen's M, F test

Measuring Associations: Pearson's correlation

Pearson correlation coefficient
The conceptual (definitional) formula of the correlation coefficient is:

$$r = \frac{\sum xy}{N\, S_X S_Y} \qquad (1.1)$$

where $x$ and $y$ are deviation scores, that is, $x = X - \bar{X}$ and $y = Y - \bar{Y}$, and $S_X$ and $S_Y$ are sample standard deviations, that is,

$$S_X = \sqrt{\frac{\sum x^2}{N}}, \qquad S_Y = \sqrt{\frac{\sum y^2}{N}}$$

Pearson correlation coefficient
Another way of defining correlation is:

$$r = \frac{\sum z_X z_Y}{N} \qquad (1.2)$$

where $z_X$ is X in z-score form, $z_Y$ is Y in z-score form, and $\sum$ and $N$ have their customary meaning. This says that r is the average cross-product of z-scores.

Pearson correlation coefficient
Sometimes you will see these formulas written as:

$$r = \frac{\sum xy}{(N-1)\, S_X S_Y} \qquad \text{and} \qquad r = \frac{\sum z_X z_Y}{N-1}$$

Pearson correlation coefficient These formulas are correct when the standard deviations used in the calculations are the estimated population standard deviations (computed with N-1) rather than the sample standard deviations (computed with N). The main point is to be consistent: either use N throughout or use N-1 throughout.
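The consistency rule can be illustrated numerically. The following is a minimal sketch using the ten height-weight pairs from the earlier example; the function name and its `cross_ddof` / `sd_ddof` parameters are hypothetical names chosen for this illustration (0 means divide by N, 1 means divide by N-1).

```python
import math

def pearson_r(xs, ys, cross_ddof, sd_ddof):
    """r = sum(xy) / ((N - cross_ddof) * S_X * S_Y), with the standard
    deviations themselves computed using the divisor N - sd_ddof."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]   # deviation scores for X
    dy = [y - mean_y for y in ys]   # deviation scores for Y
    s_x = math.sqrt(sum(d * d for d in dx) / (n - sd_ddof))
    s_y = math.sqrt(sum(d * d for d in dy) / (n - sd_ddof))
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    return sum_xy / ((n - cross_ddof) * s_x * s_y)

height = [60, 62, 63, 65, 65, 68, 69, 70, 72, 74]
weight = [102, 120, 130, 150, 120, 145, 175, 170, 185, 210]

r_all_n = pearson_r(height, weight, cross_ddof=0, sd_ddof=0)    # N throughout
r_all_n1 = pearson_r(height, weight, cross_ddof=1, sd_ddof=1)   # N-1 throughout
r_mixed = pearson_r(height, weight, cross_ddof=0, sd_ddof=1)    # inconsistent mix

print(r_all_n, r_all_n1, r_mixed)
```

The two consistent versions agree, while the mixed version is shrunk by the factor (N-1)/N, which is 0.9 for these ten pairs.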

Example:

Covariance Covariance ($\mathrm{cov}_{xy}$) represents the degree to which two variables change together:

$$\mathrm{cov}_{xy} = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1}$$

This says that the correlation is the average of cross-products (also called a covariance) standardized by dividing through by both standard deviations.

Height   Weight
72       190
66       135
69       155
72       165
71       155

Find (i) the covariance and (ii) the coefficient of correlation.
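One way to check a hand calculation for the five height-weight pairs above is the short sketch below. It assumes the n-1 divisor for both the covariance and the standard deviations; the variable names are illustrative.

```python
import math

height = [72, 66, 69, 72, 71]
weight = [190, 135, 155, 165, 155]

n = len(height)
mean_h = sum(height) / n   # 70.0
mean_w = sum(weight) / n   # 160.0

# (i) Covariance: sum of deviation cross-products divided by n - 1
sum_cross = sum((h - mean_h) * (w - mean_w) for h, w in zip(height, weight))
cov_xy = sum_cross / (n - 1)

# (ii) Correlation: covariance divided by both standard deviations (n - 1 divisor)
s_h = math.sqrt(sum((h - mean_h) ** 2 for h in height) / (n - 1))
s_w = math.sqrt(sum((w - mean_w) ** 2 for w in weight) / (n - 1))
r = cov_xy / (s_h * s_w)

print(f"covariance = {cov_xy}")   # 42.5
print(f"r = {r:.4f}")             # 0.8335
```

Here the cross-products sum to 170, so the covariance is 170 / 4 = 42.5, and the correlation is about 0.83.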

Interpretation of Pearson Coefficient

r            Interpretation
0.00-0.20    can be ignored
0.20-0.40    low
0.40-0.60    medium
0.60-0.80    high
0.80-1.00    very high

Strength of Pearson r

Coefficient   Strength
0.01-0.09     Trivial
0.10-0.29     Low to moderate
0.30-0.49     Moderate to substantial
0.50-0.69     Substantial to very strong
0.70-0.89     Very strong
>0.90         Near perfect

The coefficient of determination Correlation cannot be used to explain whether or not one variable causes another, but it can be used for predictive purposes. The coefficient of determination, computed by squaring the correlation coefficient, gives the proportion of the variability of one variable that can be explained by the other variable. Coefficient of determination = $r^2$

The coefficient of determination Suppose that the bird and whale migrations were correlated with r = 0.5. Then $r^2 = (0.5)^2 = 0.25$. This means that 0.25, or 25%, of the variance in the time of whale migration can be explained by the variance in the time of bird migration. The remaining 0.75, or 75%, of the variance is explained by other factors. Therefore, even if the birds were 2 weeks late in their migration, you would not expect the whales to be 2 weeks late, because 75% of the variation in whale migration is explained by factors other than bird migration.
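The arithmetic in the migration example can be written out as a tiny sketch; the value r = 0.5 is the one supposed in the text, and the variable names are illustrative.

```python
r = 0.5                        # correlation supposed in the migration example

r_squared = r ** 2             # coefficient of determination
unexplained = 1 - r_squared    # share of variance left to other factors

print(f"explained:   {r_squared:.0%}")    # 25%
print(f"unexplained: {unexplained:.0%}")  # 75%
```

Note how quickly the explained share falls off: a correlation of 0.5 accounts for only a quarter of the variance.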

Spearman's Coefficient of Rank Correlation, $r_s$

Spearman's rank-order correlation coefficient This correlation coefficient is used when one or more variables are measured on an ordinal (ranking) scale. It describes the linear relationship between two variables measured using ranked scores. The symbol used is $r_s$ (the subscript s stands for Spearman; Charles Spearman invented this one).

The computational formula for the Spearman rank-order correlation coefficient is:

$$r_s = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$$

where N is the number of pairs of ranks and D is the difference between the two ranks in each pair.

Running the Spearman Rank-Order Correlation Test
1. Determine the difference between the ranks for each subject.
2. Square each difference and sum them.
3. Calculate the rho statistic.
4. Compare the obtained rho value with the critical value.

Summary of the Spearman Rank-Order Correlation Test

Hypotheses:
  $H_0: \rho = 0$
  $H_a: \rho \neq 0$, or $\rho < 0$, or $\rho > 0$
Assumptions:
  Subjects are randomly selected
  Observations are rank ordered
Decision rules (n = number of pairs of ranks):
  If $\rho_{obt} \geq \rho_{crit}$, reject $H_0$
  If $\rho_{obt} < \rho_{crit}$, do not reject $H_0$
Formula:
  $$\rho = 1 - \frac{6\sum D^2}{n(n^2 - 1)}$$

Sample data

Participant   Observer A: X   Observer B: Y
1             4               3
2             1               2
3             9               8
4             8               6
5             3               5
6             5               4
7             6               7
8             2               1
9             7               9

Solution

Participant   Observer A: X   Observer B: Y   D     D^2
1             4               3               1     1
2             1               2               -1    1
3             9               8               1     1
4             8               6               2     4
5             3               5               -2    4
6             5               4               1     1
7             6               7               -1    1
8             2               1               1     1
9             7               9               -2    4
                                              ΣD^2 = 18

Solution

$$r_s = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{6(18)}{9(9^2 - 1)} = 1 - \frac{108}{720} = 1 - 0.15 = +0.85$$
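The worked solution above can be reproduced with a few lines of code. This is a minimal sketch of the D-squared formula applied to the two observers' rankings; the function name `spearman_rho` is an illustrative choice.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation via 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Rankings copied from the sample data table
observer_a = [4, 1, 9, 8, 3, 5, 6, 2, 7]
observer_b = [3, 2, 8, 6, 5, 4, 7, 1, 9]

rho = spearman_rho(observer_a, observer_b)
print(round(rho, 2))  # 0.85
```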

What does the value of $r_s$ tell you? Spearman's rank correlation coefficient is actually derived from the product-moment correlation coefficient, such that $-1 \leq r_s \leq 1$. $r_s = 0.85$ means that a child receiving a particular ranking from one observer tended to receive very close to the same ranking from the other observer. $r_s = +1$ means the rankings are in complete agreement. $r_s = 0$ means that there is no correlation between the rankings. $r_s = -1$ means that the rankings are in complete disagreement; in fact, they are in exact reverse order.

Exercise: The marks of eight candidates in English and Mathematics are:

Candidate     1    2    3    4    5    6    7    8
English (x)   50   58   35   86   76   43   40   60
Maths (y)     65   72   54   82   32   74   40   53

Rank the results and hence find Spearman's rank correlation coefficient between the two sets of marks. Comment on the value obtained.

Solution

English (x)   50   58   35   86   76   43   40   60
Maths (y)     65   72   54   82   32   74   40   53
Rank x        4    5    1    8    7    3    2    6
Rank y        5    6    4    8    1    7    2    3
D             -1   -1   -3   0    6    -4   0    3
D^2           1    1    9    0    36   16   0    9
ΣD^2 = 72

Solution

$$r_s = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{6(72)}{8(8^2 - 1)} = 1 - \frac{432}{504} = 1 - 0.857 = 0.143$$

Spearman's coefficient of rank correlation is 0.143. This appears to show a very weak positive correlation between the English and Mathematics rankings.
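The ranking step in this exercise can also be done in code. The sketch below ranks the raw marks (1 = lowest) and then applies the D-squared formula; it assumes there are no tied marks, which holds for this data, and the helper names are illustrative.

```python
def rank_without_ties(values):
    """Rank values from 1 (lowest) to N (highest); assumes all values differ."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation via 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

english = [50, 58, 35, 86, 76, 43, 40, 60]
maths = [65, 72, 54, 82, 32, 74, 40, 53]

rank_english = rank_without_ties(english)  # [4, 5, 1, 8, 7, 3, 2, 6]
rank_maths = rank_without_ties(maths)      # [5, 6, 4, 8, 1, 7, 2, 3]

rho = spearman_rho(rank_english, rank_maths)
print(round(rho, 3))  # 0.143
```

The exact value is 72/504 = 1/7, so the coefficient rounds to 0.143.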

Tied Ranks A tied rank occurs when two participants receive the same rank on the same variable (e.g., two people are tied for first on variable x). Tied ranks result in an incorrect value of $r_s$, so resolve (correct) any tied ranks before computing $r_s$. To do so, assign each participant at a tied rank the mean of the ranks that would have been used had there not been a tie.

Example

Runner   Race X   Race Y   To resolve ties                        New Y
A        4        1        Tie uses ranks 1 and 2, becomes 1.5    1.5
B        3        1        Tie uses ranks 1 and 2, becomes 1.5    1.5
C        2        2        Becomes 3rd                            3
D        1        3        Becomes 4th                            4

Example

Runner   Race X   New Y   D      D^2
A        4        1.5     2.5    6.25
B        3        1.5     1.5    2.25
C        2        3       -1     1
D        1        4       -3     9
                          ΣD^2 = 18.5

Solution

$$r_s = 1 - \frac{6\sum D^2}{N(N^2 - 1)} = 1 - \frac{6(18.5)}{4(4^2 - 1)} = 1 - \frac{111}{60} = 1 - 1.85 = -0.85$$
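The tie-resolution rule and the resulting coefficient can be sketched as follows. The helper `average_ranks` is a hypothetical name for this illustration; it gives each tied value the mean of the rank positions the tied group occupies, as described in the Tied Ranks slide.

```python
def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the mean of their positions."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # first rank position taken by this value
        count = ordered.count(v)       # how many entries share this value
        ranks.append(first + (count - 1) / 2)
    return ranks

def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation via 1 - 6*sum(D^2) / (N*(N^2 - 1))."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

race_x = [4, 3, 2, 1]   # runners A, B, C, D
race_y = [1, 1, 2, 3]   # A and B tied for first

new_x = average_ranks(race_x)   # [4.0, 3.0, 2.0, 1.0] (no ties, unchanged)
new_y = average_ranks(race_y)   # [1.5, 1.5, 3.0, 4.0]

rho = spearman_rho(new_x, new_y)
print(round(rho, 2))  # -0.85
```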