Covariance and Dot Product

1 Introduction.

As you learned in Calculus III and Linear Algebra, the dot product of two vectors $\vec{x} = (x_1, \dots, x_n)$ and $\vec{y} = (y_1, \dots, y_n)$ in $\mathbb{R}^n$ is the number

$\vec{x} \cdot \vec{y} := \sum_{i=1}^n x_i y_i,$

and as you learned in Probability, the covariance of two random variables $X$ and $Y$ [1] is the number

$\mathrm{Cov}(X, Y) := E[(X - \mu_X)(Y - \mu_Y)].$

These two quantities appear to have nothing in common, beyond the fact that each is a function that accepts two inputs of the same type and returns a numerical value. Appearances can be deceptive, though: the dot product and the covariance are actually twins. In this handout, I will show why this is so.

2 Dot Products and Covariance: Elementary Properties.

The table below lists each elementary property [2] of the dot product beside the corresponding elementary property of covariance. As you can see, except for a slight discrepancy between L2 and R2, the properties in each row correspond perfectly.

Dot Product | Covariance
L1. $\vec{x} \cdot \vec{x} \ge 0$ | R1. $\mathrm{Cov}(X, X) \ge 0$
L2. $\vec{x} \cdot \vec{x} = 0 \iff \vec{x} = \vec{0}$ | R2. $\mathrm{Cov}(X, X) = 0 \iff X$ is constant
L3. $\vec{x} \cdot \vec{y} = \vec{y} \cdot \vec{x}$ | R3. $\mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X)$
L4. $(\alpha \vec{x}) \cdot \vec{y} = \alpha(\vec{x} \cdot \vec{y}) = \vec{x} \cdot (\alpha \vec{y})$ | R4. $\mathrm{Cov}(\alpha X, Y) = \alpha\,\mathrm{Cov}(X, Y) = \mathrm{Cov}(X, \alpha Y)$
L5. $\vec{x} \cdot (\vec{y}_1 + \vec{y}_2) = (\vec{x} \cdot \vec{y}_1) + (\vec{x} \cdot \vec{y}_2)$ | R5. $\mathrm{Cov}(X, Y_1 + Y_2) = \mathrm{Cov}(X, Y_1) + \mathrm{Cov}(X, Y_2)$

[1] $X$ and $Y$ need to be defined on the same sample space. We will assume throughout that this is the case.
[2] These properties are deemed elementary because all of the other properties can be derived from them.
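The parallel between the two columns can be checked numerically. The sketch below is illustrative and not part of the handout's formal development: the concrete vectors and random variables are my own choices, and the random variables are taken to live on an equally likely finite sample space, so $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$ is a plain average.

```python
# Numerical sanity check of rows L1/R1 and L3-L5/R3-R5 (illustrative
# sketch; the data below are my own, not from the handout).

def dot(x, y):
    """Dot product of two coordinate lists."""
    return sum(a * b for a, b in zip(x, y))

def cov(X, Y):
    """Covariance E[(X - mu_X)(Y - mu_Y)] over an equally likely
    finite sample space, given as lists of values."""
    n = len(X)
    mX, mY = sum(X) / n, sum(Y) / n
    return sum((a - mX) * (b - mY) for a, b in zip(X, Y)) / n

x, y1, y2 = [1.0, 3.0, -5.0], [2.0, 7.0, 4.0], [0.5, -1.0, 2.0]
X, Y1, Y2 = [1.0, 2.0, 4.0], [3.0, 1.0, 5.0], [2.0, 2.0, 0.0]

assert dot(x, x) >= 0 and cov(X, X) >= 0                            # L1 / R1
assert dot(x, y1) == dot(y1, x)                                     # L3
assert abs(cov(X, Y1) - cov(Y1, X)) < 1e-12                         # R3
assert dot([2 * a for a in x], y1) == 2 * dot(x, y1)                # L4
assert abs(cov([2 * a for a in X], Y1) - 2 * cov(X, Y1)) < 1e-12    # R4
ysum = [a + b for a, b in zip(y1, y2)]
Ysum = [a + b for a, b in zip(Y1, Y2)]
assert dot(x, ysum) == dot(x, y1) + dot(x, y2)                      # L5
assert abs(cov(X, Ysum) - (cov(X, Y1) + cov(X, Y2))) < 1e-12        # R5
```

Bilinearity holds exactly for the dot product; for covariance the checks use a small tolerance only because of floating-point rounding.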

3 Dot Products and Covariance: Some Derived Properties.

As I mentioned in the introduction, many other properties follow from L1-L5 or, respectively, from R1-R5. In this section, I will discuss several important examples.

3.1 The Pythagorean Relation.

I will begin with a pair of parallel definitions: the length of a vector and the standard deviation of a random variable. [3]

$\|\vec{x}\| := \sqrt{\vec{x} \cdot \vec{x}}$ | $\sigma_X := \sqrt{\mathrm{Cov}(X, X)}$
L6. $\|\vec{x} + \vec{y}\|^2 = \|\vec{x}\|^2 + 2(\vec{x} \cdot \vec{y}) + \|\vec{y}\|^2$ | R6. $\sigma^2_{X+Y} = \sigma^2_X + 2\,\mathrm{Cov}(X, Y) + \sigma^2_Y$
L7. $\vec{x} \cdot \vec{y} = 0 \implies \|\vec{x} + \vec{y}\|^2 = \|\vec{x}\|^2 + \|\vec{y}\|^2$ | R7. $\mathrm{Cov}(X, Y) = 0 \implies \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$

Formulas L6/R6 can be derived by routine parallel calculations from L1-L5 and R1-R5 respectively, and properties L7/R7 follow immediately from formulas L6/R6 respectively. When the hypothesis of L7 (respectively R7) holds, the resulting formula can be interpreted as the Pythagorean Theorem for an appropriately labeled right triangle.

Exercise 1 (a): Derive formula L6 from properties L1-L5 and the definition of $\|\vec{x}\|$. (b): Use L6 to prove L7.

Exercise 2 (a): Derive formula R6 from properties R1-R5 and the definition of $\sigma_X$. (b): Use R6 to prove R7.

3.2 The Cauchy-Schwarz Inequality.

The most important property of the dot product is the formula (for nonzero vectors)

(1) $\vec{x} \cdot \vec{y} = \|\vec{x}\|\,\|\vec{y}\| \cos(\theta),$

[3] It is important to note that we are defining $\|\vec{x}\|$ from the dot product of $\vec{x}$ with itself, not from the coordinates of $\vec{x}$. Similarly, we are using covariance to define $\sigma_X$.

where $0 \le \theta \le \pi$ is the non-reflex angle between $\vec{x}$ and $\vec{y}$. [4] There is an immediate consequence that follows from (1). If you take the absolute value of both sides of (1) and use the fact that $|\cos(\theta)| \le 1$, you arrive very quickly at the Cauchy-Schwarz Inequality: [5]

(2) $|\vec{x} \cdot \vec{y}| \le \|\vec{x}\|\,\|\vec{y}\|.$

At first glance, it does not seem as though either (1) or (2) could possibly correspond to a property of covariance. Consider (1) first. The analogue of this equation for covariance would seem to be

(3) $\mathrm{Cov}(X, Y) = \sigma_X \sigma_Y \cos(\phi),$

for some angle $\phi$. This equation seems utterly meaningless: what angle could $\phi$ possibly represent? Now consider (2). In this case, the analogous inequality for covariance would seem to be

(4) $|\mathrm{Cov}(X, Y)| \le \sigma_X \sigma_Y.$

In contrast to equation (3), inequality (4) is definitely not meaningless; however, it is not clear (yet) whether it is true. At this point, we certainly do not have a proof of it: without an angle $\phi$ that makes (3) true, we cannot use $|\cos(\phi)| \le 1$ to deduce (4) from (3).

Remarkably, as it turns out, there is a different way to prove (1) and (2). The trick, as you will see (Theorem 3.1 below), is to reverse the logical order: first prove (2) without using $\cos(\theta)$, then use (2) to prove (1). It also turns out that this alternate approach does have a covariance parallel: the proof of (2) corresponds to a proof of (4), and the proof of (1) from (2) indicates what angle $\phi$ will make (3) true.

The cosine-free proof of (2) is based upon the properties of projections. Recall that if $\vec{x}$ is a nonzero vector and $\vec{y}$ is any vector, then the projection $\vec{p} = \mathrm{proj}_{\vec{x}}(\vec{y})$ of $\vec{y}$ onto $\vec{x}$ is the shadow that $\vec{y}$ casts upon the line $\ell$ containing $\vec{x}$, when the light rays are perpendicular to $\ell$. [6] Recall also that one can calculate $\vec{p}$ with the dot-product-based formula [7]

(5) $\vec{p} = \dfrac{\vec{x} \cdot \vec{y}}{\vec{x} \cdot \vec{x}}\,\vec{x}.$

Since (5) is dot-product based, it gives rise to an analogous covariance-based entity, namely the random variable

(6) $P := \dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Cov}(X, X)}\,X,$

where $X$ is a nonconstant random variable and $Y$ is any random variable.

Before discussing the details of Theorem 3.1, I will extend the table so as to include definitions (5) and (6) and equations (1), (2), (3), and (4); this should help you keep the larger picture in focus. [8]

[4] $\theta$ is contained in the (usually) unique plane $P$ containing $\vec{x}$ and $\vec{y}$.
[5] The importance of this inequality, which cannot be made clear in a Calculus III course, will emerge in the course of this discussion.
[6] The light rays are in the same plane $P$ mentioned above.
[7] Formula (5) should actually be viewed as the definition of $\mathrm{proj}_{\vec{x}}(\vec{y})$.
[8] You may find the complete table on the last page of this handout.

(5). $\vec{p} := \dfrac{\vec{x} \cdot \vec{y}}{\vec{x} \cdot \vec{x}}\,\vec{x}$ | (6). $P := \dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Cov}(X, X)}\,X$
L8. $\vec{p} \cdot (\vec{y} - \vec{p}) = 0$ | R8. $\mathrm{Cov}(P, Y - P) = 0$
L9. $|\vec{x} \cdot \vec{y}| \le \|\vec{x}\|\,\|\vec{y}\|$ | R9. $|\mathrm{Cov}(X, Y)| \le \sigma_X \sigma_Y$
L10. $\vec{x} \cdot \vec{y} = \|\vec{x}\|\,\|\vec{y}\| \cos(\theta)$ | R10. $\mathrm{Cov}(X, Y) = \sigma_X \sigma_Y \cos(\phi)$

The proofs of R8 and R9, which parallel the proofs of L8 and L9, will be left as exercises. Property R10, which is still meaningless at this point, will be discussed in section 3.3.

Theorem 3.1
L8: $\vec{p} \cdot (\vec{y} - \vec{p}) = 0$, where $\vec{p}$ is the vector $\mathrm{proj}_{\vec{x}}(\vec{y})$.
L9: $|\vec{x} \cdot \vec{y}| \le \|\vec{x}\|\,\|\vec{y}\|$.
L10: $\vec{x} \cdot \vec{y} = \|\vec{x}\|\,\|\vec{y}\| \cos(\theta)$.

Proof of L8. The proof will be easier to follow if I first put $\alpha := \dfrac{\vec{x} \cdot \vec{y}}{\vec{x} \cdot \vec{x}}$, which puts equation (5) in the simpler form

(7) $\vec{p} = \alpha \vec{x}.$

The proof is the following calculation:

$\vec{p} \cdot (\vec{y} - \vec{p}) = (\alpha \vec{x}) \cdot (\vec{y} - \alpha \vec{x})$
$= \alpha\,[\vec{x} \cdot (\vec{y} - \alpha \vec{x})]$ (L4)
$= \alpha\,[\vec{x} \cdot \vec{y} - \vec{x} \cdot (\alpha \vec{x})]$ (L5)
$= \alpha\,[\vec{x} \cdot \vec{y} - \alpha (\vec{x} \cdot \vec{x})]$ (L4)
$= \alpha\,\Big[\vec{x} \cdot \vec{y} - \dfrac{\vec{x} \cdot \vec{y}}{\vec{x} \cdot \vec{x}}\,(\vec{x} \cdot \vec{x})\Big]$ (definition of $\alpha$)
$= \alpha\,[\vec{x} \cdot \vec{y} - \vec{x} \cdot \vec{y}]$ (cancel)
$= 0.$

Proof of L9. The first step is to apply the $\implies$ direction of L7 to $\vec{p}$ and $\vec{y} - \vec{p}$; this allows me to express

(8) $\vec{p} \cdot (\vec{y} - \vec{p}) = 0 \implies \|\vec{p}\|^2 + \|\vec{y} - \vec{p}\|^2 = \|\vec{y}\|^2.$

Since $\|\vec{y} - \vec{p}\|^2 \ge 0$, (8) leads immediately to the inequality

(9) $\|\vec{p}\|^2 \le \|\vec{y}\|^2$, or, equivalently, $\vec{p} \cdot \vec{p} \le \|\vec{y}\|^2.$

Then, replacing $\vec{p}$ with $\alpha \vec{x}$ and calculating gives

(10) $(\alpha \vec{x}) \cdot (\alpha \vec{x}) \le \|\vec{y}\|^2 \implies \alpha^2 \|\vec{x}\|^2 \le \|\vec{y}\|^2 \implies |\alpha|\,\|\vec{x}\| \le \|\vec{y}\| \implies \dfrac{|\vec{x} \cdot \vec{y}|}{\|\vec{x}\|^2}\,\|\vec{x}\| \le \|\vec{y}\| \implies |\vec{x} \cdot \vec{y}| \le \|\vec{x}\|\,\|\vec{y}\|.$

Proof of L10. From L9, it follows that

$\left|\dfrac{\vec{x} \cdot \vec{y}}{\|\vec{x}\|\,\|\vec{y}\|}\right| \le 1,$

so that there is an angle $\hat{\theta}$ such that

$\cos(\hat{\theta}) = \dfrac{\vec{x} \cdot \vec{y}}{\|\vec{x}\|\,\|\vec{y}\|}.$

We also know from equation (1) [9] that

$\cos(\theta) = \dfrac{\vec{x} \cdot \vec{y}}{\|\vec{x}\|\,\|\vec{y}\|}.$

Hence, $\cos(\hat{\theta}) = \cos(\theta)$, so that $\hat{\theta} = \theta$ (both angles lie in $[0, \pi]$, where cosine is one-to-one).

As mentioned above, one can prove R8 and R9 by the same arguments used to prove L8 and L9. I will therefore leave the proof of Theorem 3.2 as an exercise.

Theorem 3.2
R8: $\mathrm{Cov}(P, Y - P) = 0$, where $P$ is the random variable defined in (6).
R9: $|\mathrm{Cov}(X, Y)| \le \sigma_X \sigma_Y$.

Exercise 3 Prove both parts of Theorem 3.2.

Exercise 4 One of the Calculus III handouts uses L9 to prove the triangle inequality

$\|\vec{x} + \vec{y}\| \le \|\vec{x}\| + \|\vec{y}\|.$

By a parallel argument, use R9 to prove

$\sigma_{X+Y} \le \sigma_X + \sigma_Y.$

[9] We know this equation is correct by the Law-of-Cosines proof from Calc III/Linear Algebra.
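The projection argument lends itself to a direct numerical check. The sketch below (my own illustrative data, not part of the handout) verifies the orthogonality properties L8/R8, both Cauchy-Schwarz rows L9/R9, and the covariance triangle inequality of Exercise 4.

```python
import math

# Check L8/R8 (projection orthogonality) and L9/R9 (Cauchy-Schwarz)
# on concrete data (illustrative sketch; data are my own choices).

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cov(X, Y):
    # Covariance over an equally likely finite sample space.
    n = len(X)
    mX, mY = sum(X) / n, sum(Y) / n
    return sum((a - mX) * (b - mY) for a, b in zip(X, Y)) / n

def norm(v):
    return math.sqrt(dot(v, v))

def sd(V):
    return math.sqrt(cov(V, V))

x, y = [1.0, 3.0, -5.0], [2.0, 7.0, 4.0]
X, Y = [1.0, 2.0, 4.0], [3.0, 1.0, 5.0]

p = [dot(x, y) / dot(x, x) * a for a in x]    # (5): p = proj_x(y)
P = [cov(X, Y) / cov(X, X) * a for a in X]    # (6): P = (Cov/Var) X

assert abs(dot(p, [a - b for a, b in zip(y, p)])) < 1e-9    # L8
assert abs(cov(P, [a - b for a, b in zip(Y, P)])) < 1e-9    # R8

assert abs(dot(x, y)) <= norm(x) * norm(y) + 1e-9           # L9
assert abs(cov(X, Y)) <= sd(X) * sd(Y) + 1e-9               # R9

# Exercise 4's covariance analogue: sigma_{X+Y} <= sigma_X + sigma_Y
XY = [a + b for a, b in zip(X, Y)]
assert sd(XY) <= sd(X) + sd(Y) + 1e-9
```

The small tolerances only absorb floating-point rounding; the identities themselves hold exactly, as the proofs above show.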

Exercise 5 (a): Show that if you get equality in L9 (that is, if $|\vec{x} \cdot \vec{y}| = \|\vec{x}\|\,\|\vec{y}\|$), then $\vec{y} = c\vec{x}$ for a certain scalar $c$. (Identify $c$.) (Hint: If you have equality in the last line of the proof of L9 [inequality (10)], then, working upwards, you also have equality in (9), so you can replace $\|\vec{p}\|^2$ with $\|\vec{y}\|^2$ in equation (8). Do so, and proceed from there.) (b): Show that if you get equality in R9 (that is, if $|\mathrm{Cov}(X, Y)| = \sigma_X \sigma_Y$), then $Y = mX + b$ for certain scalars $m$ and $b$. (Identify $m$.)

3.3 Property R10 and the Correlation Coefficient.

If we divide R9 through by $\sigma_X \sigma_Y$, we get

(11) $\dfrac{|\mathrm{Cov}(X, Y)|}{\sigma_X \sigma_Y} \le 1;$

there is therefore an angle $\phi$ such that

(12) $\dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y} = \cos(\phi).$

Multiplying (12) through by $\sigma_X \sigma_Y$ then gives

$\mathrm{Cov}(X, Y) = \sigma_X \sigma_Y \cos(\phi),$

so that R10 holds true for this angle $\phi$. This suggests that we define the angle between $X$ and $Y$ to be this angle $\phi$. [10]

As you are aware, the correlation coefficient $\rho(X, Y)$ of two nonconstant random variables $X$ and $Y$ is defined by the formula

(13) $\rho(X, Y) := \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}.$

Compare equations (13) and (12): $\rho(X, Y)$ is the $\cos(\phi)$ of equation (12). Now compare equation (13) and inequality (11): inequality (11) is actually the well-known theorem $|\rho(X, Y)| \le 1$.

As you are also aware, different values of $\rho(X, Y)$ imply different linear relationships between $X$ and $Y$: [11]

$\rho(X, Y)$ close to $1$: $Y$ is almost an increasing linear function of $X$.
$\rho(X, Y)$ close to $-1$: $Y$ is almost a decreasing linear function of $X$.
$\rho(X, Y)$ close to $0$: there is almost no linear relationship between $X$ and $Y$.

If you visualize $\rho(X, Y)$ as the cosine of the angle between $X$ and $Y$, you will form a complementary mental picture:

$\rho(X, Y)$ close to $1$: $X$ and $Y$ point in almost the same direction.
$\rho(X, Y)$ close to $-1$: $X$ and $Y$ point in almost opposite directions.
$\rho(X, Y)$ close to $0$: $X$ and $Y$ are almost perpendicular.

[10] Observe that the collection of random variables over a fixed sample space constitutes a vector space. You can visualize $X$, $Y$, and $\phi$ against this backdrop.
[11] Compare Exercise 5(b).
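For sample data the "angle between $X$ and $Y$" of footnote 10 can be made concrete: centering each sample turns it into a vector in $\mathbb{R}^n$, and $\rho(X, Y)$ is exactly the cosine of the angle between the two centered vectors (the $1/n$ weights in the covariance cancel). A sketch with illustrative data of my own:

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def cov(X, Y):
    # Covariance over an equally likely finite sample space.
    n = len(X)
    mX, mY = sum(X) / n, sum(Y) / n
    return sum((a - mX) * (b - mY) for a, b in zip(X, Y)) / n

X, Y = [1.0, 2.0, 4.0], [3.0, 1.0, 5.0]
n = len(X)

# (13): rho(X, Y) = Cov(X, Y) / (sigma_X sigma_Y)
rho = cov(X, Y) / math.sqrt(cov(X, X) * cov(Y, Y))

# Center the samples and treat them as vectors in R^n; the cosine of
# the angle between them coincides with rho.
mX, mY = sum(X) / n, sum(Y) / n
xc = [a - mX for a in X]
yc = [b - mY for b in Y]
cos_phi = dot(xc, yc) / math.sqrt(dot(xc, xc) * dot(yc, yc))

assert abs(rho - cos_phi) < 1e-12    # rho is the cos(phi) of (12)
assert -1.0 <= rho <= 1.0            # inequality (11)
```

This is one concrete realization of the vector-space picture in footnote 10; for general random variables the same identity holds with the dot product replaced by the covariance inner product.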

4 The Entire Table.

Dot Product | Covariance
L1. $\vec{x} \cdot \vec{x} \ge 0$ | R1. $\mathrm{Cov}(X, X) \ge 0$
L2. $\vec{x} \cdot \vec{x} = 0 \iff \vec{x} = \vec{0}$ | R2. $\mathrm{Cov}(X, X) = 0 \iff X$ is constant
L3. $\vec{x} \cdot \vec{y} = \vec{y} \cdot \vec{x}$ | R3. $\mathrm{Cov}(X, Y) = \mathrm{Cov}(Y, X)$
L4. $(\alpha \vec{x}) \cdot \vec{y} = \alpha(\vec{x} \cdot \vec{y}) = \vec{x} \cdot (\alpha \vec{y})$ | R4. $\mathrm{Cov}(\alpha X, Y) = \alpha\,\mathrm{Cov}(X, Y) = \mathrm{Cov}(X, \alpha Y)$
L5. $\vec{x} \cdot (\vec{y}_1 + \vec{y}_2) = (\vec{x} \cdot \vec{y}_1) + (\vec{x} \cdot \vec{y}_2)$ | R5. $\mathrm{Cov}(X, Y_1 + Y_2) = \mathrm{Cov}(X, Y_1) + \mathrm{Cov}(X, Y_2)$
$\|\vec{x}\| := \sqrt{\vec{x} \cdot \vec{x}}$ | $\sigma_X := \sqrt{\mathrm{Cov}(X, X)}$
L6. $\|\vec{x} + \vec{y}\|^2 = \|\vec{x}\|^2 + 2(\vec{x} \cdot \vec{y}) + \|\vec{y}\|^2$ | R6. $\sigma^2_{X+Y} = \sigma^2_X + 2\,\mathrm{Cov}(X, Y) + \sigma^2_Y$
L7. $\vec{x} \cdot \vec{y} = 0 \implies \|\vec{x} + \vec{y}\|^2 = \|\vec{x}\|^2 + \|\vec{y}\|^2$ | R7. $\mathrm{Cov}(X, Y) = 0 \implies \sigma^2_{X+Y} = \sigma^2_X + \sigma^2_Y$
(5). $\vec{p} := \dfrac{\vec{x} \cdot \vec{y}}{\vec{x} \cdot \vec{x}}\,\vec{x}$ | (6). $P := \dfrac{\mathrm{Cov}(X, Y)}{\mathrm{Cov}(X, X)}\,X$
L8. $\vec{p} \cdot (\vec{y} - \vec{p}) = 0$ | R8. $\mathrm{Cov}(P, Y - P) = 0$
L9. $|\vec{x} \cdot \vec{y}| \le \|\vec{x}\|\,\|\vec{y}\|$ | R9. $|\mathrm{Cov}(X, Y)| \le \sigma_X \sigma_Y$
L10. $\vec{x} \cdot \vec{y} = \|\vec{x}\|\,\|\vec{y}\| \cos(\theta)$ | R10. $\mathrm{Cov}(X, Y) = \sigma_X \sigma_Y \cos(\phi)$