Covariance and Correlation


Statistics 3513, Fall 2008
Mike Anderson

Abstract: Covariance and correlation are measures of association; that is, how strongly one random variable is related to another.

1. Covariance

Covariance is a measure of association: how much one random variable changes with respect to another random variable. It is defined as

    Cov(X, Y) = E[(X − µ_x)(Y − µ_y)]

We often see covariance in the sums and differences of random variables:

    V[X + Y] = E[(X + Y − µ_x − µ_y)²]
             = E[(X − µ_x)²] + E[(Y − µ_y)²] + 2 E[(X − µ_x)(Y − µ_y)]
             = V[X] + V[Y] + 2 Cov(X, Y)

Similarly,

    V[X − Y] = V[X] + V[Y] − 2 Cov(X, Y)

and in general

    V[Σ_i X_i] = Σ_i V[X_i] + 2 Σ_{i<j} Cov(X_i, X_j)
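These identities are easy to check by simulation. The following is a minimal sketch, assuming NumPy is available (the original notes contain no code), that simulates two correlated variables and compares V[X + Y] and V[X − Y] with the formulas above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate two correlated random variables.
n = 100_000
x = rng.normal(0, 1, n)
y = 0.7 * x + rng.normal(0, 1, n)   # y is built to be correlated with x

cov_xy = np.cov(x, y)[0, 1]

# V[X + Y] versus V[X] + V[Y] + 2 Cov(X, Y)
print(np.var(x + y, ddof=1), np.var(x, ddof=1) + np.var(y, ddof=1) + 2 * cov_xy)

# V[X - Y] versus V[X] + V[Y] - 2 Cov(X, Y)
print(np.var(x - y, ddof=1), np.var(x, ddof=1) + np.var(y, ddof=1) - 2 * cov_xy)
```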

2.1. 2008 SAT Scores

The College Board publishes a lot of data on SAT scores every year, but some obvious statistics are missing. For example, its website gives the means and standard deviations for the various subject scores (math, critical reading, and writing) as well as for the composite scores (CR+M, CR+M+W), but nowhere is there a measure of association, say between the math and critical reading scores. Fortunately, there is enough data to calculate a covariance:

    subject    µ      σ
    M          515    116
    CR         502    112
    CR+M       1017   211

    σ²_CR+M = σ²_CR + σ²_M + 2 Cov(CR, M)

    Cov(CR, M) = ½ (σ²_CR+M − σ²_CR − σ²_M)

2.2. Refresher: A Normal Probability

Current admission standards at UTSA are such that a student with a combined SAT score, CR+M, of 1200 or better is eligible for admission, regardless of high school class standing. What proportion of students who took the 2008 SAT exam are eligible for admission to UTSA?
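As a sketch of both calculations, assuming NumPy and SciPy are available (neither appears in the notes), we can recover Cov(CR, M) from the published standard deviations and then approximate the eligible proportion with a normal model for CR+M.

```python
import numpy as np
from scipy.stats import norm

# Published 2008 figures used in these notes.
mu_m, sd_m = 515, 116        # Math
mu_cr, sd_cr = 502, 112      # Critical Reading
mu_sum, sd_sum = 1017, 211   # CR+M composite

# Cov(CR, M) = (sd_sum^2 - sd_cr^2 - sd_m^2) / 2
cov_cr_m = (sd_sum**2 - sd_cr**2 - sd_m**2) / 2
print(cov_cr_m)                                  # 9260.5

# Proportion with CR+M >= 1200, treating CR+M as N(1017, 211^2).
print(norm.sf(1200, loc=mu_sum, scale=sd_sum))   # about 0.19
```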

3. The Variance-Covariance Matrix

When dealing with multiple random variables, it is convenient to represent variance and covariance as a single mathematical object, the variance-covariance matrix. If we look at the SAT data and let X_m represent the math score and X_r represent the reading score, then we have

    E[X] = ( E[X_m] )  =  ( 515 )
           ( E[X_r] )     ( 502 )

    V[X] = ( V[X_m]          Cov(X_m, X_r) )  =  ( 116²      9260.5 )
           ( Cov(X_m, X_r)   V[X_r]        )     ( 9260.5    112²   )

3.1. Using the Matrix: Linear Combinations

The variance-covariance matrix simplifies variance calculations for linear combinations of random variables. If A is a linear transformation of the random vector X, then

    E[AX] = A E[X]
    V[AX] = A V[X] Aᵀ

The matrix A can be a single linear combination (a row vector), or it can be a set of linear combinations (a matrix). When A is a matrix, the result above will include the covariances between each pair of transformed variables. Consider the sum and difference of two random variables:

    Y = AX = ( 1    1 ) ( X1 )  =  ( X1 + X2 )
             ( 1   −1 ) ( X2 )     ( X1 − X2 )

The variance-covariance matrix for Y is

    Σ_Y = A Σ_X Aᵀ = ( 1    1 ) ( σ1²   σ12 ) ( 1    1 )
                     ( 1   −1 ) ( σ12   σ2² ) ( 1   −1 )
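The V[AX] = A V[X] Aᵀ rule is easy to verify numerically. Here is a short sketch (NumPy assumed, not part of the notes) that applies the sum/difference transformation to the SAT variance-covariance matrix.

```python
import numpy as np

# Variance-covariance matrix of (math, reading) from the SAT example.
sigma = np.array([[116.0**2, 9260.5],
                  [9260.5,   112.0**2]])

# Each row of A is one linear combination: the sum and the difference.
A = np.array([[1.0,  1.0],
              [1.0, -1.0]])

# V[AX] = A V[X] A^T
sigma_y = A @ sigma @ A.T
print(sigma_y)
# [[44521.   912.]
#  [  912.  7479.]]
```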

3.2. The Sum and Difference in SAT Scores

Consider the sum and difference of the Math and Critical Reading SAT scores. From our previous result we see that

    Σ_Y = ( σ1² + σ2² + 2σ12    σ1² − σ2²          )
          ( σ1² − σ2²           σ1² + σ2² − 2σ12   )

    X_m + X_r ~ N(1017, 211²)
    X_m − X_r ~ N(13, σ²)

Questions: What is σ²? What is the probability that a person's math and reading scores differ by more than 200 points?
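A sketch of the two answers (SciPy assumed): σ² is the lower-right entry of Σ_Y, and the tail probability follows from the normal model for the difference.

```python
import numpy as np
from scipy.stats import norm

# sigma^2 for X_m - X_r, the lower-right entry of Sigma_Y
var_diff = 116**2 + 112**2 - 2 * 9260.5
sd_diff = np.sqrt(var_diff)
print(var_diff, sd_diff)    # 7479.0, about 86.5

# P(|X_m - X_r| > 200) with X_m - X_r ~ N(13, sd_diff^2)
p = norm.sf(200, loc=13, scale=sd_diff) + norm.cdf(-200, loc=13, scale=sd_diff)
print(p)                    # roughly 0.02
```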

4. Correlation

4.1. Definition

Correlation is just covariance rescaled to the interval (−1, +1):

    ρ_XY = Cov(X, Y) / √(V[X] V[Y])

4.2. Examples

From the previous example about SAT scores, we found

    V[CR] = 112²,   V[M] = 116²,   Cov(CR, M) = 9260.5

so the correlation is

    ρ_CR,M = 9260.5 / (112 · 116) ≈ 0.71
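The same rescaling can be applied to the whole variance-covariance matrix: dividing by the outer product of the standard deviations turns it into a correlation matrix. A small sketch (NumPy assumed):

```python
import numpy as np

cov = 9260.5
sd_cr, sd_m = 112.0, 116.0

# Scalar version
rho = cov / (sd_cr * sd_m)
print(rho)                        # about 0.71

# Matrix version: rescale the variance-covariance matrix.
sigma = np.array([[sd_cr**2, cov],
                  [cov,      sd_m**2]])
d = np.sqrt(np.diag(sigma))
corr = sigma / np.outer(d, d)
print(corr)                       # 1s on the diagonal, rho off the diagonal
```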

5. Principal Components (OPTIONAL)

Take another look at the variance-covariance matrix for the sum and difference of the math and reading scores:

    Σ± = ( σ1² + σ2² + 2σ12    σ1² − σ2²          )     ( 44,521    912   )
         ( σ1² − σ2²           σ1² + σ2² − 2σ12   )  =  ( 912       7,479 )

The covariance is quite small compared to either of the two variances. Might it be possible to find two linear combinations, a weighted sum and difference, that have zero covariance?

5.1. Rotation Matrices

Yes we can. The key is to use an orthonormal transformation, or rotation matrix:

    R_θ = ( cos θ   −sin θ )
          ( sin θ    cos θ )

This is a length-preserving transformation in 2-D. To get a better idea of what R_θ does, answer these questions: What is R_π/4? On graph paper, plot the points (column vectors) X and R_π/4 X:

    X = ( 1    1   −1   −1 )
        ( 1   −1    1   −1 )
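For the plotting exercise, here is a sketch (NumPy and Matplotlib assumed) that rotates the four corner points by π/4 and draws both sets.

```python
import numpy as np
import matplotlib.pyplot as plt

theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# The four corner points as column vectors.
X = np.array([[1,  1, -1, -1],
              [1, -1,  1, -1]])

RX = R @ X   # each corner rotated 45 degrees counterclockwise

plt.scatter(X[0], X[1], label="X")
plt.scatter(RX[0], RX[1], label="R X", marker="x")
plt.gca().set_aspect("equal")
plt.legend()
plt.show()
```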

5.2. Rotating to Zero Covariance

Now apply the rotation to our known variance-covariance matrix:

    R_θ Σ R_θᵀ = ( cos θ   −sin θ ) ( σ1²   σ12 ) (  cos θ   sin θ )
                 ( sin θ    cos θ ) ( σ12   σ2² ) ( −sin θ   cos θ )

Then find θ such that the covariance term is zero. These identities might be useful:

    sin 2θ = 2 sin θ cos θ
    cos 2θ = cos²θ − sin²θ
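One way to finish the exercise, sketched below with NumPy (an assumption, since the notes stop at the question): multiplying out R_θ Σ R_θᵀ, the covariance term is ½(σ1² − σ2²) sin 2θ + σ12 cos 2θ, which vanishes when tan 2θ = 2σ12 / (σ2² − σ1²). The diagonal entries of the rotated matrix are then the eigenvalues of Σ, i.e. the variances of the principal components.

```python
import numpy as np

# Variance-covariance matrix of the (sum, difference) scores from above.
sigma = np.array([[44521.0,   912.0],
                  [  912.0,  7479.0]])

s1sq, s2sq, s12 = sigma[0, 0], sigma[1, 1], sigma[0, 1]

# Angle that zeroes the off-diagonal term: tan(2*theta) = 2*s12 / (s2sq - s1sq)
theta = 0.5 * np.arctan(2 * s12 / (s2sq - s1sq))

R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

rotated = R @ sigma @ R.T
print(np.round(rotated, 6))          # off-diagonal entries are ~0

# The diagonal entries match the eigenvalues of sigma (up to ordering).
print(np.linalg.eigvalsh(sigma))
```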