STAT 5 Assignment Name Spring Reading Assignment: Johnson and Wichern, Chapter, Sections.5 and.6, Chapter, and Chapter. Review matrix operations in Chapter and Supplement A. Examine the matrix properties presented in exercises.,.,.,. and. at the end of Chapter. Written Assignment: Due Monday, January, in class. Please write your answers on this assignment. Consider a random vector (,, ) with mean vector µ (,, ) and covariance matrix Ó. The eigenvalues of Ó are ë 5, ë 9, ë 5 and the corresponding eigenvectors are ( 5 ) / ( ) ( ) 5 / / 5 e e 5 / / 5 5/ / e / Evaluate the following quantities. Parts (a), (b), (c), (d) and (g) can be easily done without using either a computer or a calculator. (a) Ó (b) trace(ó) (c) V(e ) (d) (e)
(f) / ' ' (g) Define Y e, Y e, and Y e. Find the mean vector and covariance matrix for Y (Y Y Y )' à ', where the i-th column of à is the i-th eigenvector for Ó. Evaluate ' (Y) E (Y) V Γ ' Γ (h) The linear transformation, Y Γ, examined in part (g) is often called a rotation because it corresponds to simply rotating the coordinate axes. Use the definition of eigenvectors to show that lengths of vectors are not changed by this transformation, i.e., show Y ' Y ' when Y Γ. (i) Let à be defined as in part (g), and let Y Γ, µ E( ), V (), µ Y Γ ' µ and Y V(Y) Γ ' Γ. Determine which of the following measures of variability or distance are unaffected by rotations. Justify your answers. (a) Is Ó Ó Y? (b) Is trace (Ó ) trace (Ó Y )? (c) Is Ó Ó Y?
(d) Is ( µ )' ( µ ) (Y µ )' (Y )? Y Y µ Y (e) Is ( µ )'( µ ) (Y µ )' (Y )? Y µ Y. Consider a bivariate normal population with mean vector where µ and 9 µ and covariance matrix Ó, 5 (a) Write down a formula for the joint density function. (b) Find the eigenvalues and eigenvectors for Ó. λ λ e e (c) Find values for Ó trace (Ó) (d) Write down the formula for the boundary of the smallest region such that there is probability.5 that a randomly selected observation will be inside the boundary. (e) Sketch the boundary of the region in part (d). - - - - 5 6 -
- (f) Determine the area of the region described in parts (d) and (e). (g) Determine the area of the smallest region such that there is probability.95 that a randomly selected observation will be in the region.. Let be a normally distributed random vector with µ and (a) Indicate which of the following are pairs of independent random variables by drawing a circle around the appropriate pairs. (i) and (iv) and + (ii) and (v) + and - (iii) and - (vi) + + and - + (b) What is the distribution of Y (, )? (c) What is the conditional distribution of Y (, ) given x? (d) Find the correlation between and and a formula for the partial correlation between and given x. ρ ρ (e) What is the conditional distribution of given x and x?
5. Suppose, N µ where 9 µ (a) What is the distribution of Z +? (b) What is the joint distribution of Z in part (a) and Z +. (c) Find the conditional distribution of given x, x. (d) Find the partial correlation between and given x and x.. ρ 5. Let be Y and, N µ be, N µ where and Y are independent and 8 6 µ µ (a) Find + Y Y, Cov
6 (b) Are Y and + Y independent random vectors? Explain. (c) Show that the joint distribution for the four dimensional random vector + Y Y is a multivariate normal distribution. 6. Let d(p,q) denote a measure of distance between p and q. Johnson and Wichern indicate that any distance measure should possess the following four properties: (i) d(p, q) d (q, p) (ii) d(p, q) > if (iii) d(p, q) if p p q q (iv) d(p, q) d(p, r) + d(r,q), where r ( r,r )'. These properties are satisfied, for example, by Euclidean distance, i.e.,. d(p, q) / [ (p q)' (p q)] (a) In this class we will often consider distance measures of the form / d(p, q) [ (p q)' A (p q)]. Sometimes A will be the inverse of a covariance matrix which implies that A is symmetric and positive definite. Show that properties (i) through (iv) are satisfied when A is symmetric and positive definite.
7 (b) Does A have to be both symmetric and positive definite for / d(p, q) [ (p q)' A (p q)] to satisfy properties (i) through (iv)? Present your proofs or explanations. (c) It is obvious that the measure d(p,q) max{ p q, p q,..., p q } satisfies properties (i), (ii), (iii). Either prove or disprove that it satisfies property (iv), the triangle in equality. k k 7. One frequently used technique in multivariate analysis first reduces a multivariate problem to a more familiar univariate problem by considering linear combinations of the elements of a random vector, and then defines the "optimal" linear combination as a multivariate measure. For example, consider a p-dimensional random vector (,,..., p) ' with mean vector µ ( µ, µ,..., µ p) ' and covariance matrix Ó. Using a vector of constants, a (a, a,..., ap) ', define a random variable
Y a' a + a +... + a. Then the mean and variance of Y are p p a' µ and a' a, respectively, and the standardized distance of Y from its mean is 8 d (a) / Y a' µ / a' a. Define D² maximum {d²( a ): ' a a and p a ε R }. as the maximum squared "standardized distance" of from µ. (a) Use the extended Cauchy-Schwarz inequality in Section.7 of Johnson and Wichern to show a c ( µ ) provides the maximum of d²( a ) in the definition of D², where c is a normalizing constant selected to make a '. a (b) Do you recognize D² as a measure presented in the lectures for STAT 5? If so, what is its name? (Note that we did not assume that has a multivariate normal distribution.) For additional practice you could do problems.6,.8,.9,.,.,.6 at the end of Chapter and problems.,.,.,.5,.,.5,.6,.7 at the end of Chapter. Problem.8 gives a trivial example of a non-normal bivariate distribution with normal marginal distributions. Do not hand in these additional problems, but answers will be given on the answer sheet for this assignment.