

Covariance and Pearson Correlation
Vartanian, SW 540

Both covariance and correlation indicate the relationship between two (or more) variables. Neither the covariance nor the correlation gives the slope between the X and Y variables; instead, each gives an indication of the strength and the direction of their relationship. Neither is used to test cause-and-effect relationships, only to determine whether or not there is a relationship between variables.

The covariance gives an unstandardized value for the relationship between X and Y. The formula for the covariance is

$$\mathrm{Cov}(X,Y) = \frac{1}{N-1}\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y}).$$

A positive value for the covariance indicates a positive relationship between the x and y variables. A negative number for the covariance indicates a negative relationship between the two variables. Generally, given two samples and the same two variables within each of these samples (such as hours of work and income), a larger absolute value for the covariance indicates a stronger linear relationship between the two variables. From the covariance, we can determine the following.

* If variable X and variable Y tend to increase together, then c(x,y) > 0
* If variable X tends to decrease when variable Y increases, then c(x,y) < 0
* If variable X and variable Y are independent, then c(x,y) = 0

Other than determining the direction of the relationship between the two variables, it is often difficult to interpret the relationship between the x and y variables with the covariance because the covariance is not standardized. In other words, the value of the covariance depends on the units of the variables being examined. For example, if you're examining the relationship between age (measured in years) and number of children from a sample of students, the covariance may be something like 5.8. If you were to then determine the covariance between hours of work (measured in hours) and income (measured in dollars), you may get a covariance of 150. This difference in covariance is in part due to the difference in unit of measurement of hours of work and income.
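To make the unit dependence concrete, here is a minimal Python sketch of the sample covariance with the N-1 denominator. The function name `sample_covariance` and the hours and income figures are hypothetical, chosen only for illustration; they do not come from the handout. Expressing the same incomes in cents instead of dollars multiplies the covariance by 100 even though the relationship itself has not changed.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-products of deviations over N - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


# Hypothetical data (an assumption for illustration only):
# weekly hours of work and weekly income in dollars.
hours = [20, 30, 40, 50, 60]
income_dollars = [300, 420, 610, 700, 880]

# The same incomes expressed in cents: only the unit changes.
income_cents = [100 * d for d in income_dollars]

cov_dollars = sample_covariance(hours, income_dollars)
cov_cents = sample_covariance(hours, income_cents)

print(cov_dollars, cov_cents)  # 3600.0 360000.0
```

The sign of the covariance is the same in both cases; only its magnitude depends on the unit of measurement.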
Because of these units, the variance for income and hours of work is also much greater than the variance for age and number of children. Because the covariance values are not standardized, it's difficult to determine the meaning of these values. If you were to take another sample of students to determine the covariance between age and number of children and found this value to be 6.2 (compared to 5.8 in the first sample), you could say that the first sample has a lower positive association, or a weaker relationship, between age and number of children than does the second sample. If you took a second sample of hours of work and income and found the covariance to be 140 (compared to 150 in the first sample), you could say that the covariance, or the strength of the relationship, between income and hours of work is stronger in the first sample than in the second.

F:\WP60_1\LECT1.PHD\OLSReg+Co\Coelation.wpd

The following is a variance-covariance matrix, with the variance of each of the variables located on the main diagonal of the matrix. The variables are age of the head of household (hdag), number of kids in the family (kds), and family income-to-needs (endfmns), which has a value of 1 if the family is at the poverty line, a value of 2 if the family is at twice the poverty line, and a value of 10 if the family has income at 10 times the poverty line (and has values above and below any of these values).

(obs=467)

             |     hdag       kds   endfmns
-------------+---------------------------
        hdag |  62.2676
         kds |  -2.4464   2.87998
     endfmns |  .766153  -.534888   3.45833

Thus, the variance for age of the head (hdag) is 62.2676 and the variance for kids is 2.87998. Because the units of analysis are different for each of these variables, it is difficult to measure the relationship between them. The covariance for age of the head and number of kids is -2.4464. All this tells us is that there is some negative relationship between the two variables. The covariance between kids and family income-to-needs is -.534888. Again, all we can truly see is that there is a negative relationship between the two variables.

The correlation coefficient is a much easier statistic to understand because it uses standardized values. It takes the covariance and divides it by the product of the standard deviations of the x and y variables. No matter what type of unit is being measured, all correlation coefficients will give measures in the same type of unit. The formula for the correlation coefficient is:

$$r = \frac{\sum_{i=1}^{N}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{N}(X_i-\bar{X})^2\,\sum_{i=1}^{N}(Y_i-\bar{Y})^2}}$$

or

$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - \left(\sum X\right)^2\right]\left[N\sum Y^2 - \left(\sum Y\right)^2\right]}}.$$

This second formula is a little easier to use for computations. Another way of computing r is by knowing the ordinary least squares coefficient estimate b and the standard deviations for the independent and dependent variables. This formula is

$$r = \left(\frac{s_x}{s_y}\right) b.$$

All values for the correlation coefficient are between -1 and 1. The closer the absolute value of r, the correlation coefficient, is to 1, the stronger is the relationship between the two variables. A value of 0 for r indicates that there is no linear relationship between the two variables being examined. Negative values for r indicate a negative linear relationship between x and y. Positive values indicate a positive relationship between x and y.

The Pearson r is measuring the amount of spread of the scattering of points around the ordinary least squares line. This line is the best fitting line given values for the two variables being examined, x and y. The greater the spread of the sample points around this line, the lower will be the value of r. If all of the points lie on the line, the value of r will be either +1 or -1, depending on whether the relationship between x and y is positive or negative.

If we again look at the variables that we used in the variance-covariance matrix above and determine their correlation coefficients, we get the following:

(obs=467)

             |     hdag       kds   endfmns
-------------+---------------------------
        hdag |   1.0000
         kds |  -0.1827    1.0000
     endfmns |   0.0522   -0.1695    1.0000

Along the main diagonal are the correlations of the variables with themselves, which will always be 1. The off-diagonal elements are the correlations of the variables with each other. Here, we see that the correlation between age of the head and number of kids is -.1827, a negative relationship, but we can now see that this is a stronger relationship in absolute value than the relationship between age of the head and family income-to-needs (where r = .0522). The 2nd strongest relationship in this matrix is the relationship between number of kids and family income-to-needs, which has an r value of -.1695.
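The step from the variance-covariance matrix to the correlation matrix can be sketched in a few lines of Python: each covariance is divided by the square root of the product of the two variances. The numbers below are the matrix entries as read from this handout, with the exact digits treated as an assumption of this sketch; the dictionary layout and the helper name `correlation` are illustrative choices, not part of the original.

```python
import math

# Lower triangle of the variance-covariance matrix for hdag, kds, endfmns.
# Values as read from the handout; the exact digits are an assumption.
cov = {
    ("hdag", "hdag"): 62.2676,
    ("kds", "hdag"): -2.4464,
    ("kds", "kds"): 2.87998,
    ("endfmns", "hdag"): 0.766153,
    ("endfmns", "kds"): -0.534888,
    ("endfmns", "endfmns"): 3.45833,
}


def correlation(cov, a, b):
    """r = Cov(a, b) / (s_a * s_b), with s the square root of the variance."""
    covariance = cov[(a, b)] if (a, b) in cov else cov[(b, a)]
    return covariance / math.sqrt(cov[(a, a)] * cov[(b, b)])


for a, b in [("kds", "hdag"), ("endfmns", "hdag"), ("endfmns", "kds")]:
    print(f"r({a}, {b}) = {correlation(cov, a, b):.4f}")
# r(kds, hdag) = -0.1827
# r(endfmns, hdag) = 0.0522
# r(endfmns, kds) = -0.1695
```

Once the covariances are standardized this way, the three relationships can be ranked directly by absolute value, which is not possible from the covariances alone.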

We could next determine if these relationships are statistically significant. The values below the correlations indicate the exact level of significance.

             |     hdag       kds   endfmns
-------------+---------------------------
        hdag |   1.0000
             |
         kds |  -0.1827    1.0000
             |   0.0000
             |
     endfmns |   0.0522   -0.1695    1.0000
             |   0.0004    0.0000

An Example of Calculating the Correlation Coefficient

Let's say we have 2 variables:

Respondent   Years of school (X)   Income (Y)
    1                 2                10
    2                 3                11
    3                 4                12
    4                 5                13
    5                 6                14

We could plot this out to see the relationship between the variables. The b coefficient for this relationship is 1. For each 1 unit increase in x, there is a 1 unit increase in y. We can see that all of the points fall on the least squares regression line, so we have a good idea what our r value will be. We will next determine the value of r. In the first example, we would have the following:

$$r = \frac{(5)(250)-(20)(60)}{\sqrt{[5(90)-20^2][5(730)-60^2]}} = \frac{50}{\sqrt{[50][50]}} = \frac{50}{50} = 1$$

We thus find that there is a perfect relationship between X and Y. We know that there is a strong relationship between X and Y and we know that there is a positive relationship between the two variables. The correlation coefficient does not give the slope between the variables, only the strength and the direction of the relationship.

Let's look at a second, less perfect relationship between X and Y.

Respondent   Years of school (X)   Income (Y)
    1                 2                10
    2                 5                11
    3                 6                12
    4                 4                13
    5                 6                14

$$r = \frac{(5)(283)-(23)(60)}{\sqrt{[5(117)-23^2][5(730)-60^2]}} = \frac{35}{\sqrt{[56][50]}} = \frac{35}{52.915} = .6614$$

In other words, there is still a positive relationship between x and y, but the relationship is no longer perfect, as it was in the first example. If we were to plot these points on a scattergram, we would see that not all of the points lie on the least squares line. In this case, b = .625 and a = 9.125.

R² Values

If we were to square the correlation coefficient, we would be given a value that indicates how much of the variation in the dependent variable is being explained by the independent variable. We could then multiply that value by 100 to determine the percentage of variation being explained by the independent variable. In the first example, $r^2 = 1^2 = 1$, and 1 * 100 = 100%, or all of the variation in the dependent variable is being explained by the independent variable. In the second example, $r^2 = (.6614)^2 = .4374$, and .4374 * 100 = 43.74% of the variation in the dependent variable is being explained by the independent variable. We can also determine this value by examining the explained, the unexplained, and the total sums of squares.

$$R^2 = \frac{\sum (Y_p-\bar{Y})^2}{\sum (Y_i-\bar{Y})^2} = \frac{\text{Explained SS}}{\text{Total SS}}$$

where $Y_p$ is the value of Y predicted by the least squares line and the Total Sums of Squares = Explained Sums of Squares + Unexplained Sums of Squares.

You can test to determine if the correlation coefficient is significant. Your null hypothesis will be that the population correlation coefficient (ρ, rho) is equal to 0. To test this null hypothesis, you will use an F test. The F test takes the form

$$F_{k,\,n-k-1} = \frac{R^2\,(n-k-1)}{(1-R^2)\,k}.$$

If the F value is greater than the critical value, you will reject the null hypothesis. For example, if r = .5 (so that $R^2 = .25$), n = 500 and k = 1, then

$$F_{1,498} = \frac{.25}{1-.25} * 498 = \frac{.25}{.75} * 498 = 166.$$

For a .05 test, the critical value is 3.84. Because the F value is greater than the critical value, reject the null hypothesis. In all likelihood, there is a relationship between X and Y in the population. We can also determine the value of F by using the following formula:

$$F_{k,\,n-k-1} = \frac{\sum (y_p-\bar{y})^2 / k}{\sum (y_i-y_p)^2 / (n-k-1)} = \frac{\text{Explained S.S.}/k}{\text{Unexplained S.S.}/(n-k-1)}$$

If the F value is greater than the critical value, reject the null hypothesis that the set of independent variables explains none of the variance in the dependent variable.
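The worked examples and the F test can be checked numerically. The following is a minimal Python sketch, not the handout's own procedure: the function names (`pearson_r`, `ols_line`, `f_statistic`, `f_from_sums_of_squares`) are illustrative, and the data are the two five-respondent examples of years of school (X) and income (Y) as read from the tables in this handout. It computes r from the raw-sums formula, the least squares intercept and slope, and F both from R² and from the explained and unexplained sums of squares.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from the computational (raw-sums) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def ols_line(xs, ys):
    """Intercept a and slope b of the least squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n
    return a, b


def f_statistic(r_squared, n, k):
    """F with (k, n - k - 1) degrees of freedom, computed from R squared."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


def f_from_sums_of_squares(xs, ys):
    """The same F for one predictor (k = 1), from explained and unexplained SS.

    Not defined for a perfect fit, where the unexplained SS is zero.
    """
    a, b = ols_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    predicted = [a + b * x for x in xs]
    explained = sum((p - mean_y) ** 2 for p in predicted)
    unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    k = 1
    return (explained / k) / (unexplained / (len(xs) - k - 1))


# The two worked examples, as read from the handout's tables.
x1, y1 = [2, 3, 4, 5, 6], [10, 11, 12, 13, 14]
x2, y2 = [2, 5, 6, 4, 6], [10, 11, 12, 13, 14]

r1 = pearson_r(x1, y1)
r2 = pearson_r(x2, y2)
a2, b2 = ols_line(x2, y2)

print(r1)                                   # 1.0
print(round(r2, 4))                         # 0.6614
print(a2, b2)                               # 9.125 0.625
print(round(f_statistic(0.25, 500, 1), 2))  # 166.0
```

For the second example, `f_statistic(r2 ** 2, 5, 1)` and `f_from_sums_of_squares(x2, y2)` agree (both are about 2.33), which illustrates that the two F formulas are the same test written in two ways. With only five respondents that F is far below the critical value, so the sample r of .6614 would not be judged significant.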