ANOVA: Analysis of Variance - Part I

The purpose of these notes is to discuss the theory behind the analysis of variance. They are a summary of the definitions and results presented in class, with a few exercises. The solutions of the exercises will follow in another file. In Part II, we will discuss linear models and their relationship with ANOVA models.

Symmetric Matrices

Symmetric Matrices: Let A be an m × m matrix. It is said to be symmetric if a_ij = a_ji for i = 1, ..., m and j = 1, ..., m. Equivalently, A = A', where A' is the transpose of A.

Spectral Decomposition: Let A be a symmetric matrix of size m × m. It has m real eigenvalues λ_1, ..., λ_m associated with eigenvectors e_1, ..., e_m, where {e_1, ..., e_m} form an orthonormal basis. That is,

e_i'e_j = 0 if i ≠ j, and e_i'e_i = 1,

that is, each e_i is a unit vector. Its spectral decomposition is

A = Σ_{i=1}^m λ_i e_i e_i'.

Recall: We say that λ is an eigenvalue of the matrix A if A v = λ v for some non-zero vector v. We say that v is the corresponding eigenvector.

Projection Matrix: Let P be a symmetric matrix of size n × n whose eigenvalues are all either 0 or 1. Then we say that P is a projection matrix, since

P Y = Σ_{i=1}^k e_i e_i' Y = Σ_{i=1}^k (e_i'Y) e_i

represents the orthogonal projection of the vector Y = [Y_1, ..., Y_n]' onto the subspace with basis {e_1, ..., e_k} of dimension k, where e_1, ..., e_k are the eigenvectors associated with the eigenvalue 1.
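A minimal numerical sketch of these definitions, assuming numpy is available (the particular 4 × 4 matrix and the random seed below are arbitrary illustrations): eigh returns the real eigenvalues and an orthonormal set of eigenvectors of a symmetric matrix, from which the spectral decomposition and the projection formula can be checked directly.

```python
import numpy as np

# Numerical sketch: spectral decomposition A = sum_i lambda_i e_i e_i' of a symmetric
# matrix, and the projection P Y = sum_{i<=k} (e_i'Y) e_i onto the span of e_1, ..., e_k.
rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = B + B.T                                   # an arbitrary symmetric 4 x 4 matrix

lam, E = np.linalg.eigh(A)                    # real eigenvalues, orthonormal eigenvectors (columns)
A_rebuilt = sum(lam[i] * np.outer(E[:, i], E[:, i]) for i in range(4))
print(np.allclose(A, A_rebuilt))              # True: the spectral decomposition recovers A

P = sum(np.outer(E[:, i], E[:, i]) for i in range(2))   # projection onto a 2-dim subspace
Y = rng.standard_normal(4)
proj = sum((E[:, i] @ Y) * E[:, i] for i in range(2))
print(np.allclose(P @ Y, proj))               # True: P Y is the orthogonal projection of Y
```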

Idempotent Symmetric Matrix: Consider a symmetric matrix A, i.e. A = A'. If A² = A, then it is said to be idempotent. A symmetric idempotent matrix is a projection matrix.

Exercise 1: Show that a symmetric idempotent matrix A must have eigenvalues equal to either 0 or 1. Therefore, A is a projection matrix.

Trace of a square matrix: Consider an n × n matrix A; its trace is defined as

tr(A) = Σ_{i=1}^n a_ii.

In other words, the trace is the sum of the elements on the diagonal.

Properties of the trace:

- tr(A_1 + A_2) = tr(A_1) + tr(A_2);
- tr(c A) = c tr(A);
- tr(AB) = tr(BA), as long as AB and BA are both square matrices.

Exercise 2: Consider A a symmetric matrix of size n × n.

- Show that tr(A) = Σ_{i=1}^n λ_i, where λ_1, ..., λ_n are the eigenvalues of A. Hint: Consider the spectral decomposition of A.
- Let P be a projection matrix. Show that tr(P) is the dimension of the subspace onto which the matrix is projecting. Hint: The eigenvalues of a projection matrix are either 0 or 1.
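A short numerical illustration of Exercise 2 (a check, not a proof; numpy assumed, with an arbitrary 5 × 5 example): the trace matches the sum of the eigenvalues, and a projector built from k orthonormal eigenvectors has trace k.

```python
import numpy as np

# Exercise 2, checked numerically (not a proof): tr(A) equals the sum of the eigenvalues
# of a symmetric matrix A, and the trace of a projection matrix equals the dimension of
# the subspace it projects onto.
rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B + B.T
lam, E = np.linalg.eigh(A)
print(np.isclose(np.trace(A), lam.sum()))                 # True

k = 3
P = E[:, :k] @ E[:, :k].T                                 # projects onto a k-dimensional subspace
print(np.allclose(P, P.T), np.allclose(P @ P, P))         # True True: symmetric and idempotent
print(np.isclose(np.trace(P), k))                         # True: tr(P) = k
```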

Multivariate Normal Distribution

Consider Y = [Y_1, ..., Y_n]', a vector of independent normal random variables, where Y_i ~ N(µ_i, σ_i²) for i = 1, ..., n. Let A be an m × n matrix. We say that the vector W = A Y follows a multivariate normal distribution.

Exercise 3: Let W = A Y be a multivariate normal random vector. Consider two components of W:

W_i = Σ_{k=1}^n a_ik Y_k and W_j = Σ_{k=1}^n a_jk Y_k.

Show that

σ{W_i, W_j} = Σ_{k=1}^n a_ik a_jk σ_k².

Theorem: Consider a multivariate normal random vector W = A Y.

- The components of W are normal random variables.
- Uncorrelated components of W are independent.

Proof (of the first statement): The ith component of W is Σ_{k=1}^n a_ik Y_k, which is normal since it is a linear combination of independent normals.

Sketch of the proof (of the second statement): The moment generating function for a random vector W = [W_1, ..., W_m]' is

M_W(t_1, ..., t_m) = E{e^{t'W}} = E{e^{t_1 W_1 + ... + t_m W_m}}.

It can be shown that a joint distribution is characterized by its moment generating function, assuming that it exists. In other words, two different joint distributions cannot have the same moment generating function.
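A Monte Carlo check of the covariance formula in Exercise 3 above (numpy assumed; the coefficient matrix, means, and variances below are arbitrary examples): the sample covariance of W = A Y should be close to A D A', where D = diag(σ_1², ..., σ_n²).

```python
import numpy as np

# Monte Carlo check of Exercise 3: for W = A Y with independent Y_k ~ N(mu_k, sigma_k^2),
# cov(W_i, W_j) = sum_k a_ik a_jk sigma_k^2, i.e. cov(W) = A D A' with D = diag(sigma_k^2).
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4))                      # an arbitrary 3 x 4 coefficient matrix
mu = np.array([1.0, -2.0, 0.5, 3.0])
sigma = np.array([1.0, 2.0, 0.5, 1.5])

Y = mu + sigma * rng.standard_normal((200_000, 4))   # each row is one draw of Y
W = Y @ A.T                                          # each row is one draw of W = A Y
print(np.round(np.cov(W, rowvar=False), 2))          # empirical covariance of W
print(np.round(A @ np.diag(sigma**2) @ A.T, 2))      # theoretical covariance A D A'
```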

We will consider two components:

W_i = Σ_{k=1}^n a_ik Y_k and W_j = Σ_{k=1}^n a_jk Y_k.

If W_i and W_j are independent, then

M_W(t_i, t_j) = E{e^{t_i W_i}} E{e^{t_j W_j}}. (1)

If we can show that W_i and W_j being uncorrelated implies (1), then this will mean that (W_i, W_j) have the same joint distribution as a pair of independent random variables. Thus, they must be independent.

So let us assume that W_i and W_j are uncorrelated; this means

σ{W_i, W_j} = Σ_{k=1}^n a_ik a_jk σ_k² = 0.

Recall: If X ~ N(µ, σ²), then M_X(t) = E{e^{tX}} = e^{tµ + t²σ²/2}.

We have

M(t_i, t_j) = E{e^{t_i W_i + t_j W_j}}
= E{e^{Σ_{k=1}^n (t_i a_ik + t_j a_jk) Y_k}}
= E{Π_{k=1}^n e^{(t_i a_ik + t_j a_jk) Y_k}}
= Π_{k=1}^n M_{Y_k}(t_i a_ik + t_j a_jk)   (by the independence of Y_1, ..., Y_n)
= Π_{k=1}^n e^{(t_i a_ik + t_j a_jk) µ_k + (t_i a_ik + t_j a_jk)² σ_k²/2}
= e^{t_i Σ_k a_ik µ_k + t_i² Σ_k a_ik² σ_k²/2} · e^{t_j Σ_k a_jk µ_k + t_j² Σ_k a_jk² σ_k²/2} · e^{t_i t_j Σ_k a_ik a_jk σ_k²}
= M_{W_i}(t_i) M_{W_j}(t_j),

since Σ_{k=1}^n a_ik a_jk σ_k² = 0.
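The factorization argument can be illustrated by simulation (numpy assumed; the coefficients below are chosen only so that Σ_k a_ik a_jk σ_k² = 0): the empirical joint MGF at a fixed point (t_i, t_j) should match the product of the empirical marginal MGFs up to Monte Carlo error.

```python
import numpy as np

# Simulation sketch of the MGF factorization: the rows a_i, a_j below satisfy
# sum_k a_ik a_jk sigma_k^2 = 0, so the empirical joint MGF at (t_i, t_j) should factor
# into the product of the marginal MGFs (up to Monte Carlo error).
rng = np.random.default_rng(3)
sigma = np.array([1.0, 2.0, 0.5])
a_i = np.array([1.0, 1.0, 2.0])
a_j = np.array([4.0, -1.0, 0.0])
print((a_i * a_j) @ sigma**2)                        # 0.0: W_i and W_j are uncorrelated

Y = sigma * rng.standard_normal((500_000, 3))        # mu = 0 for simplicity
W_i, W_j = Y @ a_i, Y @ a_j
t_i, t_j = 0.3, -0.2
joint = np.mean(np.exp(t_i * W_i + t_j * W_j))
product = np.mean(np.exp(t_i * W_i)) * np.mean(np.exp(t_j * W_j))
print(joint, product)                                # approximately equal
```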

Theorem: Let W = A Y and U = B Y be multivariate normal vectors, where Y_1, ..., Y_n are independent normal random variables with Y_i ~ N(µ_i, σ²). Note: We are assuming a common variance. If A B' = 0, then A Y and B Y are independent.

Proof: Consider the ith component of W = A Y,

W_i = Σ_{k=1}^n a_ik Y_k,

and the jth component of U = B Y,

U_j = Σ_{k=1}^n b_jk Y_k.

It suffices to show that W_i and U_j are uncorrelated. Their covariance is

σ{W_i, U_j} = σ² Σ_{k=1}^n a_ik b_jk = 0,

since A B' = 0.

Remark: If the variances are not equal, then the covariance is

σ{W_i, U_j} = Σ_{k=1}^n a_ik b_jk σ_k²,

which is not necessarily equal to zero when A B' = 0.
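A small sketch of this theorem and the remark (numpy assumed; the 1 × 4 matrices A and B below are arbitrary examples with A B' = 0): with a common variance the cross-covariance matrix σ² A B' vanishes, while with unequal variances A diag(σ_k²) B' need not.

```python
import numpy as np

# Sketch of the theorem and the remark: with a common variance, cov(A Y, B Y) = sigma^2 A B',
# so A B' = 0 removes every cross-covariance; with unequal variances the cross-covariance is
# A diag(sigma_k^2) B', which need not vanish even though A B' = 0.
A = np.array([[1.0, 1.0, 0.0, 0.0]])
B = np.array([[1.0, -1.0, 0.0, 0.0]])
print(A @ B.T)                                       # [[0.]]  ->  A B' = 0

sigma2_equal = np.ones(4)                            # common variance sigma^2 = 1
sigma2_unequal = np.array([1.0, 4.0, 1.0, 1.0])
print(A @ np.diag(sigma2_equal) @ B.T)               # [[0.]]  : A Y and B Y independent
print(A @ np.diag(sigma2_unequal) @ B.T)             # [[-3.]] : correlated despite A B' = 0
```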

Quadratic Forms

Definition: We say that Q is a quadratic form of Y = [Y_1, ..., Y_n]' if we can write it as

Q = Y'AY,

where A is symmetric. Equivalently, we have

Q = Σ_{i=1}^n Σ_{j=1}^n a_ij Y_i Y_j, where a_ij = a_ji.

Note: We say that A is the matrix of the quadratic form.

Properties of Quadratic Forms:

- If Q_1, ..., Q_k are quadratic forms of Y_1, ..., Y_n with matrices A_1, ..., A_k, then

  Q = c_1 Q_1 + ... + c_k Q_k = Y'(Σ_{i=1}^k c_i A_i) Y

  is also a quadratic form of Y_1, ..., Y_n.

- If the matrix of Q is a projection matrix P, then we can write Q as a sum of squares:

  Q = Y'PY = (PY)'(PY),

  since P' = P and P² = P. Another representation that is useful is

  Q = Y'(Σ_{i=1}^k e_i e_i')Y = Σ_{i=1}^k Y'(e_i e_i')Y = Σ_{i=1}^k (e_i'Y)²,

  where λ_1 = ... = λ_k = 1 and λ_i = 0 for i > k.
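A numerical check of the sum-of-squares representation (numpy assumed; the subspace below is generated from an arbitrary random basis): Y'PY, (PY)'(PY), and Σ_i (e_i'Y)² agree.

```python
import numpy as np

# Numerical check of the sum-of-squares representation: for a projection matrix P with
# orthonormal columns e_1, ..., e_k spanning its range, Y'PY = (PY)'(PY) = sum_i (e_i'Y)^2.
rng = np.random.default_rng(4)
n, k = 5, 2
E = np.linalg.qr(rng.standard_normal((n, k)))[0]     # orthonormal basis of a k-dim subspace
P = E @ E.T                                          # the projection matrix onto that subspace
Y = rng.standard_normal(n)

q1 = Y @ P @ Y
q2 = (P @ Y) @ (P @ Y)
q3 = ((E.T @ Y) ** 2).sum()
print(np.isclose(q1, q2), np.isclose(q1, q3))        # True True
```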

Theorem: Consider Y_1, ..., Y_n independent normal random variables with a common variance, i.e. σ²{Y_i} = σ² for i = 1, ..., n.

- If P_1 and P_2 are projection matrices such that P_1 P_2 = 0, then Q_1 = Y'P_1 Y and Q_2 = Y'P_2 Y are independent. This follows from the fact that they are functions of P_1 Y and P_2 Y, which are independent.

- The quadratic form Y'PY/σ² follows a χ² distribution. From above, we see that the quadratic form is a sum of squares of e_1'Y, e_2'Y, ..., e_k'Y. Since e_i'e_j = 0 for i ≠ j, these are independent normals. Furthermore, σ²{e_i'Y} = σ² e_i'e_i = σ², since e_i'e_i = 1 (i.e. e_i is a unit vector).

Remarks [Construction of χ²]: We are using the following construction of a χ² random variable. Let Y_1, ..., Y_n be independent normal random variables with a common variance σ²; then

Σ_{i=1}^n Y_i² / σ² ~ χ²,

with ν = n degrees of freedom and non-centrality parameter

nc = Σ_{i=1}^n µ_i² / σ².

The parameters come from the mean of the sum of squares, that is,

E{Σ_{i=1}^n Y_i²} = n σ² + Σ_{i=1}^n µ_i².

Expectation of a quadratic form: We showed in class that

E{Y'AY} = tr(A) σ² + µ'Aµ,

where µ = [µ_1, ..., µ_n]'. Therefore, if P is a projection matrix, then

E{Y'PY} = tr(P) σ² + µ'Pµ,

and Y'PY/σ² ~ χ² with ν = tr(P) degrees of freedom and nc = µ'Pµ/σ².

Exercise 4: Consider Q = Q_1 + Q_2, where Q, Q_1, Q_2 are quadratic forms of the independent random variables Y_1, ..., Y_n with a common variance. Show that ν = ν_1 + ν_2; in other words, show that the degrees of freedom are additive.
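A Monte Carlo check of the expectation formula above (numpy assumed; the symmetric matrix, mean vector, and σ below are arbitrary choices for the example): the simulated average of Y'AY should be close to tr(A) σ² + µ'Aµ.

```python
import numpy as np

# Monte Carlo check of E{Y'AY} = tr(A) sigma^2 + mu'A mu for a symmetric A and independent
# Y_i ~ N(mu_i, sigma^2) with a common variance (the matrix, mu, and sigma are arbitrary).
rng = np.random.default_rng(5)
n, sigma = 4, 1.5
mu = np.array([1.0, 0.0, -2.0, 0.5])
B = rng.standard_normal((n, n))
A = B + B.T

Y = mu + sigma * rng.standard_normal((300_000, n))
Q = np.einsum('ri,ij,rj->r', Y, A, Y)                # Y'AY for each simulated row of Y
print(Q.mean())                                      # close to the value below
print(np.trace(A) * sigma**2 + mu @ A @ mu)
```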

Exercise 5: Let Y_1, ..., Y_n be independent normals with a common variance; that is, Y_i ~ N(µ_i, σ²).

1. Show that the total variability satisfies SSTO/σ² ~ χ², by showing that SSTO = Y'PY, where P is a projection matrix, i.e. P' = P and P² = P. The total variability is

   SSTO = Σ_{i=1}^n (Y_i − Ȳ)², where Ȳ = (1/n) Σ_{i=1}^n Y_i.

2. Show that the degrees of freedom are ν = n − 1. Hint: Compute the trace of the matrix.

3. Give the form of the non-centrality parameter of SSTO.
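A numerical companion that can be used to check work on Exercise 5 (a sanity check, not the solution; numpy assumed): the centering matrix C = I − (1/n) 1 1' is one natural candidate for P, since C Y has components Y_i − Ȳ.

```python
import numpy as np

# Numerical companion to Exercise 5 (a check, not the solution): the centering matrix
# C = I - (1/n) 1 1' is a natural candidate for P, since (C Y)_i = Y_i - Ybar.
n = 6
C = np.eye(n) - np.ones((n, n)) / n
rng = np.random.default_rng(6)
Y = rng.standard_normal(n)

ssto = ((Y - Y.mean()) ** 2).sum()
print(np.allclose(C, C.T), np.allclose(C @ C, C))    # True True: symmetric and idempotent
print(np.isclose(Y @ C @ Y, ssto))                   # True: Y'CY = SSTO
print(np.isclose(np.trace(C), n - 1))                # True: tr(C) = n - 1
```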