Examensarbete (Degree Thesis): Explicit Estimators for a Banded Covariance Matrix in a Multivariate Normal Distribution. Emil Karlsson


Examensarbete: Explicit Estimators for a Banded Covariance Matrix in a Multivariate Normal Distribution. Emil Karlsson. LiTH - MAT - EX / SE


Explicit Estimators for a Banded Covariance Matrix in a Multivariate Normal Distribution
Mathematical Statistics, Linköpings Universitet
Emil Karlsson
LiTH - MAT - EX / SE
Examensarbete: 16 hp
Level: G2
Supervisor: Martin Singull, Mathematical Statistics, Linköpings Universitet
Examiner: Martin Singull, Mathematical Statistics, Linköpings Universitet
Linköping: January 2014


Abstract

The problem of estimating the mean and covariances of a multivariate normally distributed random vector has been studied in many forms. This thesis focuses on the estimators proposed in [15] for a banded covariance structure with m-dependence. It presents the previous results for the estimator and rewrites the estimator when m = 1, thus making it easier to analyze. This leads to an adjustment, and a proposition for an unbiased estimator can be presented. A new and easier proof of consistency is then given. This theory is later generalized to a general linear model, where corresponding theorems and propositions are stated to establish unbiasedness and consistency. In the last chapter, simulations with the previous and the new estimator verify that the theoretical results indeed make an impact.

Keywords: Banded covariance matrices, covariance matrix estimation, explicit estimators, multivariate normal distribution, general linear model.


Acknowledgements

First of all, I want to thank my supervisor Martin Singull. You introduced me to the world of mathematical statistics and have inspired me a lot. You have read, helped and commented on my work for a long time and have always been helpful when I ran into questions, whether they were about style, the thesis or mathematics in general. Further, I would like to thank my opponent Jonas Granholm for the comments regarding the thesis and for the time taken to improve it. I also want to thank Tahere for your encouragement and continuous support. I want to thank mum and dad as well. Lastly, I would like to thank the most important person in my life, Ghazaleh. You have always been an incredible source of inspiration and happiness for me and have helped me in ways I never could have imagined. You are the most incredible person I have ever met. Thank you for making me a better person. I love you.


Nomenclature

Most of the recurring abbreviations and symbols are described here.

Notation

Throughout the thesis, matrices and vectors are denoted by boldface type. Matrices are denoted by capital letters and vectors by small letters of the Latin or Greek alphabets. Scalars and matrix elements are denoted by ordinary letters of the Latin or Greek alphabets. Random variables are denoted by capital letters from the end of the Latin alphabet. The end of a proof is marked by □.

List of notation

A_{m,n} - matrix of size m × n
M_{m,n} - the set of all matrices of size m × n
a_ij - matrix element in the i-th row and j-th column
a_n - vector of size n
c - scalar
A' - transpose of the matrix A
I_n - identity matrix of size n
|A| - determinant of A
rank(A) - rank of A
tr(A) - trace of A
X - random matrix
x - random vector
X - random variable
E[X] - expectation
var(X) - variance
cov(X, Y) - covariance of X and Y
S - sample dispersion matrix
N_p(µ, Σ) - multivariate normal distribution
N_{p,n}(M, Σ, Ψ) - matrix normal distribution
MLE - maximum likelihood estimator


Contents

1 Introduction
  1.1 Chapter outline
2 Mathematical background
  2.1 Linear algebra
    General definitions and theorems
    Trace
    Idempotence
    Vectorization and Kronecker product
  2.2 Statistics
    General concepts
    The Maximum Likelihood Estimator
    The matrix normal distribution
    General Linear Model
    Quadratic and bilinear forms
3 Explicit estimators of a banded covariance matrix
  3.1 Previous results
  3.2 Remodeling of explicit estimators
  3.3 Proof of unbiasedness and consistency
4 Generalization to a general linear model
  4.1 Presentation
  4.2 B̂ instead of µ̂
  4.3 Proposed estimators
  4.4 Proof of unbiasedness and consistency
5 Simulations
  5.1 Simulations of the regular normal distribution
  5.2 Simulations of the estimators for a general linear model
6 Discussion and further research
  6.1 Discussion
  6.2 Further research
A Motivation of the estimator


Chapter 1
Introduction

Many tests, estimators, confidence intervals and regression models in the multivariate statistical literature are based on the assumption that the underlying distribution is normal [21, 11, 1]. The primary reason is that multivariate datasets often are, at least approximately, normally distributed. The multivariate normal distribution is also simpler to analyze than many other distributions; for example, all the information in a multivariate normal distribution is contained in its mean and covariances. Because of this, estimating the mean and the covariances is a subject of importance in statistics. This thesis studies an estimation procedure for a patterned covariance matrix.

Patterned covariance matrices arise in a variety of situations and applications and have been studied by many authors. A seminal paper from the 1940s [23] considered patterned covariance matrices in the study of psychological tests. This was the introduction of the covariance matrix with equal diagonal and equal off-diagonal elements. Two years later the author extended this structure into blocks with a certain pattern [22]. During the late 1960s, a covariance model in which the covariances depend on the distance between the variables, a circular stationary model, was developed in [17]. There have been several other developments and directions for structured covariance matrices, e.g., [16, 12, 3]. Extensions from a linear model with one error term to mixed linear models, growth curves and variance component models have also been made, e.g., [6, 20]. More recent results, e.g., on block structures of matrices, can be found in [13, 14] and others.

A special type of patterned covariance matrix, called a banded covariance matrix, is discussed in this thesis. Banded covariance matrices are common in applications and arise often in association with time series, for example in signal processing and for covariances of Gauss-Markov random processes or cyclostationary processes [24, 10, 5]. In this thesis we study a special case of banded matrices whose elements are unrestricted except that certain covariances are zero. These covariance matrices have a tridiagonal structure.

Originally, estimates of covariance matrices were obtained using non-iterative methods such as analysis of variance (ANOVA) and minimum norm quadratic unbiased estimation (MINQUE). The introduction of the modern computer changed a lot of things, and the cheap processing power made it possible to use iterative methods, which performed better.

With this came the rise of the maximum likelihood method and more general estimating equations. These methods have dominated in recent years. Nowadays, however, we see a shift back towards non-iterative methods, since datasets have grown tremendously; examples are EEG/EKG studies in medicine, QTL analysis in genetics and time series in meteorology. With such huge datasets, estimating with iterative methods can be a slow and tedious job.

This thesis discusses some properties of an explicit non-iterative estimator for a banded covariance matrix derived in [15] and presents some improvements to this estimator. The improvements give an unbiased and consistent estimator of the mean and the covariance matrix in the special case of first order dependence.

1.1 Chapter outline

Chapter 2: Introduces the necessary concepts and results in mathematics, mainly in linear algebra and statistics, which are required for studying this thesis. Some of these results are referenced throughout the thesis.

Chapter 3: Presents the explicit estimator and some results regarding it. From these results a new unbiased explicit estimator is suggested.

Chapter 4: The explicit estimator is generalized for estimating the covariance matrix in a general linear model. An unbiased estimator is then proposed.

Chapter 5: Simulations based on the new unbiased explicit estimator, in comparison to the previous explicit estimators, are presented.

Chapter 6: Discusses further improvements and directions of research which are interesting with regard to the explicit estimator.

Appendix: Presents a short motivation of the estimator given in [15].

Chapter 2
Mathematical background

This chapter introduces some of the mathematics needed to understand this thesis. Some of the material may be new and some not, but the aim of this chapter is to give a student in a Bachelor's programme in Mathematics sufficient background to understand this thesis. Most of the theorems will be given without proof, but the proofs can be found in the referenced sources of each section.

2.1 Linear algebra

Since this thesis makes extensive use of linear algebra and matrices, some of the more common definitions and theorems are introduced here.

General definitions and theorems

This first part contains some general definitions and theorems in real linear algebra and presents notation and expressions that are required later on. These results are given without proof but can be found in [7, 8, 2].

Definition 1. The set of all matrices with m rows and n columns is denoted M_{m,n}.

Definition 2. A matrix A ∈ M_{n,n} is called the identity matrix of size n, denoted I_n, if the diagonal elements are 1 and the off-diagonal elements are 0.

Definition 3. A matrix A ∈ M_{m,n} is called a tridiagonal matrix if a_ij = 0 for all |i − j| > 1, where a_ij is the element in row i and column j of the matrix A.

Definition 4. A vector 1_n ∈ M_{n,1} whose elements all equal 1 is called the one-vector of size n, e.g., 1_3' = (1, 1, 1), where ' denotes the transpose.

Definition 5. A matrix A ∈ M_{m,n} is called the zero-matrix, denoted 0_{m,n}, if all matrix elements of A equal 0.

Definition 6. The rank of a matrix A ∈ M_{m,n}, denoted rank(A), is defined as the number of linearly independent columns (or rows) of the matrix.

Definition 7. A matrix A ∈ M_{n,n} is called symmetric if A' = A.

Definition 8. A matrix A ∈ M_{n,n} is called normal if AA' = A'A.

Definition 9. A matrix A ∈ M_{n,n} is called orthogonal if A'A = I_n. In that case AA' = I_n as well.

Definition 10. A symmetric square matrix A ∈ M_{n,n} is positive (semi-)definite if x'Ax > (≥) 0 for any vector x ≠ 0.

Definition 11. The values λ_i which satisfy Ax_i = λ_i x_i are called eigenvalues of the matrix A. The vector x_i ≠ 0 which corresponds to the eigenvalue λ_i is called an eigenvector of A corresponding to λ_i.

And lastly a theorem regarding matrix decomposition, called the spectral theorem.

Theorem 1. Any normal matrix A ∈ M_{n,n} has an orthonormal basis of eigenvectors. In other words, any normal matrix A can be represented as A = UDU', where U ∈ M_{n,n} is an orthogonal matrix and D ∈ M_{n,n} is a diagonal matrix with the eigenvalues of A on the diagonal.

Trace

This section introduces the linear operator called the trace, which is frequently used in this thesis. These results can be found in [7, 2].

Definition 12. The trace of a square matrix A ∈ M_{n,n}, denoted tr(A), is defined to be the sum of its diagonal entries, that is, tr(A) = ∑_{i=1}^n a_ii.

Here follow some common properties of the trace of a matrix.

Lemma 1.
(i) tr(A + B) = tr(A) + tr(B) for A, B ∈ M_{n,n}.
(ii) tr(cA) = c tr(A) for all scalars c. Thus, the trace is a linear map.
(iii) tr(A') = tr(A).
(iv) tr(AB) = tr(BA) for A ∈ M_{n,m} and B ∈ M_{m,n}.
(v) Equivalently, the trace is invariant under cyclic permutations, i.e., tr(ABCD) = tr(BCDA) = tr(CDAB) = tr(DABC) for A, B, C and D of proper sizes. This is known as the cyclic property.

The trace of a matrix A can also be written as the sum of its eigenvalues.

Lemma 2. Let A ∈ M_{n,n} and let λ_1, ..., λ_n be the eigenvalues of A, listed according to their algebraic multiplicities. Then tr(A) = ∑_{i=1}^n λ_i.
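The trace identities above are easy to verify numerically. The following short sketch (assuming NumPy, with arbitrary random matrices of my own choosing) checks tr(AB) = tr(BA), the cyclic property of Lemma 1 and the eigenvalue sum of Lemma 2.

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(4, 3)), rng.normal(size=(3, 4))
C, D = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))

# tr(AB) = tr(BA) even though AB (4x4) and BA (3x3) have different sizes.
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))

# Cyclic property: tr(CDAB) = tr(DABC).
print(np.isclose(np.trace(C @ D @ A @ B), np.trace(D @ A @ B @ C)))

# The trace equals the sum of the eigenvalues (Lemma 2).
print(np.isclose(np.trace(C), np.sum(np.linalg.eigvals(C)).real))
```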

Idempotence

Here follow a definition and some results regarding idempotent matrices. This type of matrix arises frequently in multivariate statistics. These results rely on [7, 2].

Definition 13. A matrix A ∈ M_{n,n} is called idempotent if A² = AA = A.

In terms of Theorem 1, an idempotent matrix is always diagonalizable.

Lemma 3. An idempotent matrix is always diagonalizable and its eigenvalues are either 0 or 1.

Since an idempotent matrix only has eigenvalues 0 or 1, it is easy to express its trace.

Lemma 4. For an idempotent matrix A ∈ M_{n,n} it follows that tr(A) = rank(A).

Idempotent matrices have a strong connection to orthogonal projections in the following sense.

Theorem 2. A matrix A ∈ M_{n,n} is idempotent and symmetric if and only if A is an orthogonal projection.

It is also possible to construct new idempotent matrices by subtracting an idempotent matrix from the identity matrix.

Theorem 3. Let A ∈ M_{n,n} be an idempotent matrix. Then the matrix B = I_n − A is also idempotent.

Proof. BB = (I_n − A)(I_n − A) = I_n − 2A + A² = I_n − 2A + A = I_n − A = B. □

Vectorization and Kronecker product

In this section some definitions and results on vectorization and the Kronecker product are presented. These results simplify the notation for the matrix normal distribution and make it easier to interpret. The results rely on [8].

Definition 14. Let A = (a_ij) be a p × q matrix and B = (b_ij) an r × s matrix. Then the pr × qs matrix A ⊗ B = [a_ij B], i = 1, ..., p; j = 1, ..., q, is called the Kronecker product of the matrices A and B, where each block a_ij B is the r × s matrix with elements a_ij b_kl, k = 1, ..., r; l = 1, ..., s.

Definition 15. Let A = (a_1, ..., a_q) be a p × q matrix, where a_i, i = 1, ..., q, is the i-th column vector. The vectorization operator vec is an operator from R^{p×q} to R^{pq} that stacks the columns, vec A = (a_1', ..., a_q')'.
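As a small illustration of Definitions 14 and 15 (assuming NumPy; the matrices are arbitrary examples of my own), the sketch below forms a Kronecker product and the column-stacking vec of a matrix, and checks the standard identity vec(AXB) = (B' ⊗ A) vec X, which is the kind of identity behind the relation Ω = Ψ ⊗ Σ used for the matrix normal distribution later on.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(2, 3))   # p x q
B = rng.normal(size=(4, 2))   # r x s
X = rng.normal(size=(3, 4))

# Kronecker product (Definition 14): a (2*4) x (3*2) block matrix [a_ij * B].
K = np.kron(A, B)
print(K.shape)                # (8, 6)

def vec(M):
    # Column-stacking vec operator (Definition 15); NumPy is row-major,
    # so order='F' is needed to stack columns rather than rows.
    return M.flatten(order="F")

# Standard identity: vec(A X B) = (B' kron A) vec(X).
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
print(np.allclose(lhs, rhs))  # True
```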

2.2 Statistics

This section presents some general definitions in statistics as well as some results in multivariate statistics and on the general linear model. The ambition is to give enough substance to establish a basis for the understanding of this thesis. Most of the results will be given without proof but can be found in the references of each subsection.

General concepts

Here follow some definitions and properties regarding estimation in statistics. These definitions and theorems come from [4]. First we present the definition of an unbiased estimator.

Definition 16. The bias of a point estimator θ̂ of a parameter θ is the difference between the expected value of θ̂ and θ; that is, Bias_θ(θ̂) = E_θ[θ̂] − θ. An estimator whose bias is identically (in θ) equal to 0 is called unbiased and satisfies E[θ̂] = θ for all θ.

Here follows the definition of a consistent estimator.

Definition 17. A sequence of estimators θ̂_n = θ̂_n(X_1, ..., X_n) is a consistent sequence of estimators of the parameter θ if for every ε > 0 and every θ in the parameter space,
lim_{n→∞} P_θ(|θ̂_n − θ| < ε) = 1.

And lastly a theorem which is widely used to establish consistency when unbiasedness has already been proved.

Theorem 4. If θ̂_n is a sequence of estimators of a parameter θ satisfying
(i) lim_{n→∞} var(θ̂_n) = 0,
(ii) lim_{n→∞} Bias_θ(θ̂_n) = 0,
for every θ in the parameter space, then θ̂_n is a consistent sequence of estimators of θ.

The Maximum Likelihood Estimator

In this section the estimation procedure called maximum likelihood is examined more closely and some results regarding the properties of these estimators are presented. These results can be found in [4]. First we present a formal definition of the maximum likelihood estimator (MLE).

Definition 18. For each sample point x, let θ̂(x) be a parameter value at which L(θ | x) attains its maximum as a function of θ, with x held fixed. A maximum likelihood estimator (MLE) of the parameter θ based on a sample X is θ̂(X).

The main reasons why MLEs are so popular are the following theorems regarding their asymptotic properties.

Theorem 5. Let X_1, X_2, ... be independent and identically distributed with density f(x | θ), and let L(θ | x) = ∏_{i=1}^n f(x_i | θ) be the likelihood function. Let θ̂ denote the MLE of θ and let τ(θ) be a continuous function of θ. Under some regularity conditions on f(x | θ), and hence on L(θ | x), which can be found in the Miscellanea of [4], for every ε > 0 and every θ ∈ Θ,
lim_{n→∞} P_θ(|τ(θ̂) − τ(θ)| ≥ ε) = 0.
That is, τ(θ̂) is a consistent estimator of τ(θ).

Theorem 6. Let X_1, X_2, ... be independent and identically distributed with density f(x | θ), let θ̂ denote the MLE of θ, and let τ(θ) be a continuous function of θ. Under some regularity conditions on f(x | θ), and hence on L(θ | x), which can be found in the Miscellanea of [4],
√n (τ(θ̂) − τ(θ)) → N(0, var_LB(θ̂)),
where var_LB(θ̂) is the Cramér-Rao lower bound. That is, τ(θ̂) is a consistent, asymptotically unbiased and asymptotically efficient estimator of τ(θ).

Note 1. The regularity conditions in the two previous theorems are rather loose conditions on the probability density function. Since the underlying distribution in this thesis is the normal distribution, the regularity conditions impose no restriction on the development of the results in this thesis.

The matrix normal distribution

This section presents some definitions and estimators for the normal distribution generalized to the multivariate and matrix normal distributions. These results can be found in [4, 8, 9, 21, 11]. First the univariate normal distribution is defined.

Definition 19. A random variable X is said to have a univariate normal distribution, i.e., X ~ N(µ, σ²), if the probability density function of X is
f(x) = (1/√(2πσ²)) exp(−(x − µ)²/(2σ²)).

In order to generalize this to a multivariate setting we need to define the covariance of a vector, i.e., the covariance matrix.

Definition 20. If the random vector x of size m has mean µ, the covariance matrix of x is defined to be the m × m matrix
Σ = (σ_ij) = cov(x) = E[(x − µ)(x − µ)'],
where σ_ij = cov(X_i, X_j).

In this thesis a covariance matrix with a banded structure is studied. It is defined as follows.

Definition 21. A covariance matrix Σ of size k × k has a banded structure of order m, 1 ≤ m < k, denoted Σ^{(m)}(k), if σ_ij = 0 whenever |i − j| > m, i.e., only the main diagonal and the m closest sub- and superdiagonals may contain non-zero elements. For m = 1 the matrix is tridiagonal.

Here follows the definition of a multivariate (vector) normal distribution.

Definition 22. A random vector x of size m is said to have an m-variate normal distribution if, for every a ∈ R^m, the distribution of a'x is univariate normal.

Theorem 7. If x has an m-variate normal distribution, then both µ = E[x] and Σ = cov(x) exist, and the distribution of x is determined by µ and Σ.

In the standard case the following estimators are most frequently used.

Theorem 8. Let x_1, ..., x_n be a random sample from a multivariate normal population with mean µ and covariance Σ. Then
µ̂ = x̄ = (1/n) ∑_{i=1}^n x_i  and  Σ̂ = (1/(n − 1)) ∑_{i=1}^n (x_i − x̄)(x_i − x̄)'
are the MLE and the corrected MLE of µ and Σ, respectively.

Theorem 9. The estimators in Theorem 8 are sufficient, consistent and unbiased with respect to the multivariate normal distribution.

The matrix normal distribution adds another dimension of dependence in the data and is often used to model multivariate datasets. Here follows its definition.

Definition 23. Let Σ = ττ' and Ψ = γγ', where τ ∈ M_{p,r} and γ ∈ M_{n,s}. A matrix X ∈ M_{p,n} is said to be matrix normally distributed with parameters M, Σ and Ψ if it has the same distribution as M + τUγ', where M ∈ M_{p,n} is non-random and U ∈ M_{r,s} consists of s independent and identically distributed (i.i.d.) N_r(0, I) vectors U_i, i = 1, 2, ..., s. If X ∈ M_{p,n} is matrix normally distributed, this will be denoted X ~ N_{p,n}(M, Σ, Ψ).

This definition makes it possible to rewrite the likelihood in Theorem 8 in a matrix normal setting. Let X = (x_1, ..., x_n). Then X ~ N_{p,n}(M, Σ, I_n), where M = µ1_n'. This is a random sample from a matrix normal population, and the estimator of Σ can be written as
Σ̂ = (1/(n − 1)) XCX',
where C = I_n − (1/n) 1_n1_n'.
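A minimal sketch of Theorem 8 (assuming NumPy, with illustrative parameter values of my own choosing): the MLE of µ is the sample mean and the corrected MLE of Σ is the sample covariance matrix, which can equivalently be written with the centering matrix C just defined.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 3, 200
mu = np.array([1.0, 0.0, -1.0])
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# X holds the observations x_1, ..., x_n as columns (p x n), as in the thesis.
X = rng.multivariate_normal(mu, Sigma, size=n).T

mu_hat = X.mean(axis=1)                        # MLE of mu
Xc = X - mu_hat[:, None]
Sigma_hat = Xc @ Xc.T / (n - 1)                # corrected MLE of Sigma

# Equivalently, Sigma_hat = X C X' / (n - 1) with C = I_n - (1/n) 1 1'.
C = np.eye(n) - np.ones((n, n)) / n
print(np.allclose(Sigma_hat, X @ C @ X.T / (n - 1)))  # True
```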

The matrix normal distribution is nothing else than a multivariate normal distribution, i.e., if X ~ N_{p,n}(M, Σ, Ψ), then vec X = x ~ N_{pn}(vec M, Ω), where Ω = Ψ ⊗ Σ.

Here follows a theorem which shows a property of the matrix normal distribution; it is later used in the proof of the bilinear expectation.

Theorem 10. Let X ~ N_{p,n}(0, Σ, I) and let Q ∈ M_{n,n} be an orthogonal matrix which is independent of X. Then X and XQ have the same distribution.

General Linear Model

This section introduces the general linear model and some estimators frequently associated with it. The results are taken from [11, 21, 1]. The general linear model extends the multivariate linear model in the sense that it allows the vectors of observations, given by the rows of a matrix Y, to correspond to the rows of a known design matrix X. The multivariate model takes the form
Y = XB + E,
where Y ∈ M_{n,m} and E ∈ M_{n,m} are random matrices, X ∈ M_{n,p} is a known matrix, called the design matrix, and B ∈ M_{p,m} is an unknown matrix of parameters called regression coefficients. We will assume throughout this chapter that X has full rank p, that n > m + p, and that the rows of the error matrix E are independent N_m(0, Σ) random vectors. Using the notation introduced in the earlier subsection, this means that E is N_{n,m}(0, I_n, Σ), so that Y is N_{n,m}(XB, I_n, Σ).

Here follows a presentation of the MLEs and some of their properties.

Theorem 11. Let Y = XB + E ~ N_{n,m}(XB, I_n, Σ), where n > m + p and p = rank(X). Then the MLE of B and the corrected MLE of Σ are
B̂ = (X'X)^{-1}X'Y  and  Σ̂ = (1/(n − p)) (Y − XB̂)'(Y − XB̂) = (1/(n − p)) Y'HY,
where H = I_n − X(X'X)^{-1}X'.

The non-corrected MLE of Σ above is not unbiased; therefore the adjustment with the rank of X is required to establish unbiasedness. Here follows a theorem regarding the properties of the estimators.

Theorem 12. The estimators in Theorem 11 are unbiased, consistent and sufficient for (B, Σ).
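The estimators in Theorem 11 take only a few lines to compute. The sketch below (assuming NumPy, with illustrative values for the design matrix and the coefficients) computes B̂ = (X'X)^{-1}X'Y and the corrected Σ̂ = Y'HY/(n − p).

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, m = 100, 2, 3                 # observations, regressors, responses
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # n x p design matrix
B = np.array([[1.0, 0.5, -1.0],
              [2.0, 0.0,  1.0]])    # p x m regression coefficients
E = rng.normal(size=(n, m))         # i.i.d. N(0, I) errors for illustration
Y = X @ B + E

XtX_inv = np.linalg.inv(X.T @ X)
B_hat = XtX_inv @ X.T @ Y                         # MLE of B
H = np.eye(n) - X @ XtX_inv @ X.T                 # projection onto the residual space
Sigma_hat = Y.T @ H @ Y / (n - p)                 # corrected MLE of Sigma

print(B_hat.round(2))
print(Sigma_hat.round(2))
```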

Quadratic and bilinear forms

In this section some results regarding quadratic and bilinear forms in statistics are presented. These results are less standard, so the references are given within the section. First a short presentation of the quadratic form, taken from [9].

Definition 24. Let x ~ N_n(µ, Σ) and A ∈ M_{n,n}. Then x'Ax is called a quadratic form.

Here follow the first two moments of the quadratic form.

Theorem 13. Let x ~ N_n(µ, Σ) and A ∈ M_{n,n}. Then
(i) E[x'Ax] = tr(AΣ) + µ'Aµ,
(ii) var[x'Ax] = 2 tr((AΣ)²) + 4µ'AΣAµ.

Here follow some results on the bilinear form. This form has not been studied as much, so a definition and some results based on the same ideas as for the quadratic form are given.

Definition 25. Let x ~ N_n(µ_x, I_n), y ~ N_n(µ_y, I_n) and A ∈ M_{n,n}. Then x'Ay is called a bilinear form.

Here the first two moments of the bilinear form are presented.

Theorem 14. Let x ~ N_n(0, Σ_x), y ~ N_n(0, Σ_y) and A ∈ M_{n,n}. The bilinear form x'Ay has the following properties:
(i) E[x'Ay] = tr(A cov(x, y)),
(ii) var[x'Ay] = tr(A cov(x, y) A cov(x, y)) + tr(A var(y) A' var(x)). In particular, if cov(x, y) = σ_{xy} I_n, var(x) = σ_x I_n and var(y) = σ_y I_n, then var[x'Ay] = tr(A²) σ_{xy}² + tr(AA') σ_x σ_y.

Proof. Here follows a proof of property (i) for the slightly simpler case when A is normal. Since A is normal, it can be written as A = BDB' by the spectral theorem (Theorem 1), where B is orthogonal and D is a diagonal matrix. It is now possible to write the bilinear form as
Q = x'Ay = x'(BDB')y = (B'x)'D(B'y) = z'Dw,
where z = B'x and w = B'y. Since B is orthogonal, z and w are again jointly normally distributed with zero means (cf. Theorem 10). It is now possible to compute the expected value of Q. Since D is diagonal, and hence symmetric, z'Dw = tr(Dwz'), so
E[Q] = E[tr(Dwz')] = tr(D E[wz']) = tr(D cov(z, w)) = tr(D cov(B'x, B'y)) = tr(DB' cov(x, y)B) = tr(BDB' cov(x, y)) = tr(A cov(x, y)).
The proof for a general square matrix A, and of the second property, can be found in [19, 18]. □
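Theorem 13(i) can be verified by simulation. The following sketch (assuming NumPy, with arbitrary illustrative µ, Σ and A) compares the Monte Carlo mean of x'Ax with tr(AΣ) + µ'Aµ.

```python
import numpy as np

rng = np.random.default_rng(4)
n_dim, reps = 3, 200_000
mu = np.array([1.0, -0.5, 0.2])
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 2.0, 0.4],
                  [0.0, 0.4, 1.5]])
A = rng.normal(size=(n_dim, n_dim))

x = rng.multivariate_normal(mu, Sigma, size=reps)
quad = np.einsum('ij,jk,ik->i', x, A, x)          # x'Ax for every sample

print(quad.mean())                                 # Monte Carlo estimate
print(np.trace(A @ Sigma) + mu @ A @ mu)           # tr(A Sigma) + mu' A mu
```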

Chapter 3
Explicit estimators of a banded covariance matrix

In this chapter we introduce the estimator developed in [15]. This thesis focuses on the case where the covariance matrix is banded of order one.

3.1 Previous results

In [15] an explicit estimator of the covariance matrix of a multivariate normal distribution is presented for the case when the covariance matrix has an m-dependence structure. The authors propose estimators for the general case when m + 1 < p < n and establish some of their properties. Furthermore, the article considers the special case m = 1 in detail, and in this section some of the results from the article are presented.

Proposition 1. Let X ~ N_{p,n}(µ1_n', Σ^{(1)}(p), I_n). Explicit estimators are given by
µ̂_i = (1/n) x_i'1_n,
σ̂_ii = (1/n) x_i'Cx_i for i = 1, ..., p,
σ̂_{i,i+1} = (1/n) r̂_i'Cx_{i+1} for i = 1, ..., p − 1,
where r̂_1 = x_1 and r̂_i = x_i − ŝ_i r̂_{i−1} for i = 2, ..., p − 1, with
ŝ_i = (r̂_{i−1}'Cx_i) / (r̂_{i−1}'Cx_{i−1}),
and C = I_n − (1/n) 1_n1_n'.

Since these estimators are ad hoc it is important to establish some properties that motivate them, hence the following theorem.

Theorem 15. The estimator µ̂ = (µ̂_1, ..., µ̂_p)' given in Proposition 1 is unbiased and consistent, and the estimator Σ̂^{(1)}(p) = (σ̂_ij) is consistent.
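To make Proposition 1 concrete, here is a minimal NumPy transcription (a sketch of my own, not the authors' code; the function name is hypothetical): the data matrix X is p × n with the vectors x_i' as rows, C is the centering matrix, and the residual vectors r̂_i are built recursively.

```python
import numpy as np

def explicit_estimators_prop1(X):
    """Explicit estimators of Proposition 1 (illustrative sketch, not from the thesis).

    X is a p x n data matrix whose rows are the vectors x_1', ..., x_p'.
    Returns (mu_hat, Sigma_hat), where Sigma_hat is tridiagonal.
    """
    p, n = X.shape
    C = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    mu_hat = X.mean(axis=1)                      # mu_i = x_i' 1_n / n

    Sigma_hat = np.zeros((p, p))
    for i in range(p):                           # diagonal: sigma_ii = x_i' C x_i / n
        Sigma_hat[i, i] = X[i] @ C @ X[i] / n

    r = X[0].copy()                              # r_1 = x_1
    for i in range(p - 1):                       # off-diagonal elements
        if i > 0:
            s = (r @ C @ X[i]) / (r @ C @ X[i - 1])
            r = X[i] - s * r                     # r_i = x_i - s_i r_{i-1}
        Sigma_hat[i, i + 1] = r @ C @ X[i + 1] / n
        Sigma_hat[i + 1, i] = Sigma_hat[i, i + 1]
    return mu_hat, Sigma_hat
```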

The previous theorem establishes two crucial properties, unbiasedness and consistency, for the mean estimator; the covariance matrix estimator above, however, is only consistent and lacks the property of unbiasedness. The development of this desired property was one of the main goals of this thesis. Simulations comparing the estimators to standard methods, e.g., maximum likelihood, have shown great promise, which made it probable that an unbiased variation of the covariance estimator exists. The result of this goal is presented in the following sections.

3.2 Remodeling of explicit estimators

This section presents a rewriting of the estimator above that reveals some of its properties and makes the expression clearer and more suitable for interpretation and analysis. The estimators presented in Proposition 1 are partly composed of MLEs. The estimator of µ_i,
µ̂_i = (1/n) x_i'1_n,
is the sample mean, which is the MLE. The proposed estimator and the MLE also share a resemblance in the estimators of the diagonal elements of the covariance matrix,
σ̂_ii = (1/n) x_i'Cx_i for i = 1, ..., p,
which are the MLEs of the diagonal elements of an unstructured covariance matrix. Also, the first off-diagonal element of the covariance matrix is
σ̂_12 = (1/n) r̂_1'Cx_2, where r̂_1 = x_1,
so that
σ̂_12 = (1/n) r̂_1'Cx_2 = (1/n) x_1'Cx_2,
which is the MLE of the same element in an unstructured matrix. This can also be seen by looking at the banded covariance matrix Σ^{(1)}(p) as a whole: its upper left 2 × 2 block,
Σ^{(1)}(2) = ( σ_11  σ_12 ; σ_21  σ_22 ),

is an unstructured covariance matrix. However, the estimators of the off-diagonal elements of the covariance matrix, except σ̂_12, are not MLEs and are therefore harder to analyze. But when i > 1 we can write the estimators as
σ̂_{i,i+1} = (1/n) r̂_i'Cx_{i+1} = (1/n) (x_i − ŝ_i r̂_{i−1})'Cx_{i+1} = (1/n) (x_i − (r̂_{i−1}'Cx_i)/(r̂_{i−1}'Cr̂_{i−1}) r̂_{i−1})'Cx_{i+1} = (1/n) (x_i'Cx_{i+1} − (r̂_{i−1}'Cx_i)/(r̂_{i−1}'Cr̂_{i−1}) r̂_{i−1}'Cx_{i+1}),
where we have used that the denominator of ŝ_i may be written r̂_{i−1}'Cx_{i−1} = r̂_{i−1}'Cr̂_{i−1}. Since r̂_{i−1}'Cr̂_{i−1} is a scalar it is possible to write
σ̂_{i,i+1} = (1/n) (x_i'Cx_{i+1} − x_i'Cr̂_{i−1}(r̂_{i−1}'Cr̂_{i−1})^{-1} r̂_{i−1}'Cx_{i+1}) = (1/n) x_i'(C − Cr̂_{i−1}(r̂_{i−1}'Cr̂_{i−1})^{-1} r̂_{i−1}'C) x_{i+1},
which can be written as
σ̂_{i,i+1} = (1/n) x_i'A_{i−1}x_{i+1},  where A_i = C − Cr̂_i(r̂_i'Cr̂_i)^{-1} r̂_i'C.
This makes it suitable to propose another form of the estimator in Proposition 1, justified by the calculations above. Hence the following proposition.

Proposition 2. Let X ~ N_{p,n}(µ1_n', Σ^{(1)}(p), I_n). Explicit estimators are given by
µ̂_i = (1/n) x_i'1_n,
σ̂_ii = (1/n) x_i'Cx_i for i = 1, ..., p,
σ̂_{i,i+1} = (1/n) x_i'A_{i−1}x_{i+1} for i = 1, ..., p − 1,
where A_i = C − Cr̂_i(r̂_i'Cr̂_i)^{-1} r̂_i'C, with A_0 = C,
r̂_1 = x_1 and r̂_i = x_i − (r̂_{i−1}'Cx_i)/(r̂_{i−1}'Cx_{i−1}) r̂_{i−1} for i = 2, ..., p − 1,
and C = I_n − (1/n) 1_n1_n'.

Theorem 16. The matrix A_i in Proposition 2 is idempotent and symmetric with rank n − 2.

Proof. Idempotence: with A_i = C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C,
A_i² = (C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C)(C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C)
    = C² − CCr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'CC + Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'CCr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C
    = C − 2Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C + Cr̂_i(r̂_i'Cr̂_i)^{-1}(r̂_i'Cr̂_i)(r̂_i'Cr̂_i)^{-1}r̂_i'C
    = C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C = A_i,
where we have used that C is idempotent.

Symmetry:
A_i' = (C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C)' = C' − C'r̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C' = C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C = A_i.

Rank: Since A_i is idempotent, rank(A_i) = tr(A_i). This implies
rank(A_i) = tr(C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C) = tr(C) − tr(Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C) = (n − 1) − tr((r̂_i'CCr̂_i)(r̂_i'Cr̂_i)^{-1}) = n − 1 − tr(1) = n − 2. □
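Theorem 16 is also easy to check numerically. The following sketch (assuming NumPy, with an arbitrary vector standing in for r̂_i) verifies that A_i = C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C is symmetric, idempotent and has trace, hence rank, n − 2.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 12
C = np.eye(n) - np.ones((n, n)) / n
r = rng.normal(size=n)                       # stand-in for a residual vector r_i

A = C - np.outer(C @ r, r @ C) / (r @ C @ r)

print(np.allclose(A, A.T))            # symmetric
print(np.allclose(A @ A, A))          # idempotent
print(np.isclose(np.trace(A), n - 2)) # rank n - 2
```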

3.3 Proof of unbiasedness and consistency

The last section presented some alterations of the original estimator which make it possible to write it as a quadratic or bilinear form with an idempotent centering matrix, i.e.,
σ̂_ii = (1/n) x_i'Cx_i for i = 1, ..., p, and σ̂_{i,i+1} = (1/n) x_i'A_{i−1}x_{i+1} for i = 1, ..., p − 1.
The section on quadratic and bilinear forms presented results on the expectation and variance of such forms. It is now possible to combine those results with the results of the previous section to develop an unbiased estimator of the covariance matrix. It is also possible to present a new and much simpler proof of consistency for the covariance estimator compared to the proof in [15]. First we propose an unbiased estimator of the covariance matrix.

Proposition 3. Let X ~ N_{p,n}(µ1_n', Σ^{(1)}(p), I_n). Explicit estimators are given by
µ̂_i = (1/n) x_i'1_n,
σ̂_ii = (1/(n − 1)) x_i'Cx_i for i = 1, ..., p,
σ̂_12 = (1/(n − 1)) x_1'Cx_2,
σ̂_{i,i+1} = (1/(n − 2)) x_i'A_{i−1}x_{i+1} for i = 2, ..., p − 1,
where r̂_1 = x_1 and r̂_i = x_i − (r̂_{i−1}'Cx_i)/(r̂_{i−1}'Cx_{i−1}) r̂_{i−1} for i = 2, ..., p − 1,
A_i = C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C, and C = I_n − (1/n) 1_n1_n'.

Here follows the proof of unbiasedness of the previous estimator.

Theorem 17. The estimators from Proposition 3 are unbiased.

Proof. The estimators µ̂_i, σ̂_ii for i = 1, ..., p and σ̂_12 coincide with the (corrected) maximum likelihood estimators and are thus, according to Theorem 9, unbiased. Therefore it remains to prove that σ̂_{i,i+1} is unbiased for i = 2, ..., p − 1. When the estimators are derived in the Appendix, they are conditioned on the previous x's; that is, the calculation of σ̂_{i,i+1} assumes x_1, ..., x_{i−1} to be known constants. Therefore, the matrix A_{i−1} below can be considered a non-random matrix. We can then consider σ̂_{i,i+1} as a bilinear form and calculate its expectation:
E(σ̂_{i,i+1}) = E((1/(n − 2)) x_i'A_{i−1}x_{i+1}) = (1/(n − 2)) E(x_i'A_{i−1}x_{i+1} | x_1, ..., x_{i−1}).
According to Theorem 14,
E(σ̂_{i,i+1}) = (1/(n − 2)) tr(A_{i−1}) σ_{i,i+1} = (1/(n − 2)) rank(A_{i−1}) σ_{i,i+1} = (1/(n − 2)) (n − 2) σ_{i,i+1} = σ_{i,i+1},
where we have used Theorem 16 for the second to last equality. Thus E(σ̂_{i,i+1}) = σ_{i,i+1} and the theorem is proved. □

The proof of consistency follows the same structure as the proof above, but instead uses that the estimators are unbiased and studies their variance.

Theorem 18. The estimators from Proposition 3 are consistent.

Proof. The estimators µ̂_i, σ̂_ii for i = 1, ..., p and σ̂_12 coincide with the (corrected) maximum likelihood estimators and are thus, according to Theorem 9, consistent. Therefore it remains to prove that σ̂_{i,i+1} is consistent for i = 2, ..., p − 1. When the estimators are derived in the Appendix, they are conditioned on the previous x's; that is, the calculation of σ̂_{i,i+1} assumes x_1, ..., x_{i−1} to be known constants. Therefore, the matrix A_{i−1} below can be considered a non-random matrix. We can then consider σ̂_{i,i+1} as a bilinear form and calculate its variance:
var(σ̂_{i,i+1}) = var((1/(n − 2)) x_i'A_{i−1}x_{i+1}) = (1/(n − 2)²) var(x_i'A_{i−1}x_{i+1} | x_1, ..., x_{i−1}).
According to Theorem 14,
var(σ̂_{i,i+1}) = (1/(n − 2)²) (rank(A_{i−1}) σ_{i,i+1}² + tr(A_{i−1}²) σ_ii σ_{i+1,i+1}).
Since, by Theorem 16, A_{i−1} is idempotent with rank(A_{i−1}) = tr(A_{i−1}) = tr(A_{i−1}²) = n − 2, it is possible to simplify this further:
var(σ̂_{i,i+1}) = (σ_{i,i+1}² + σ_ii σ_{i+1,i+1}) / (n − 2) → 0
when the number of observations goes to infinity. Since the estimator is also unbiased, it is consistent according to Theorem 4. Thus the theorem is proved. □
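Relative to the sketch of Proposition 1 given earlier, the unbiased estimators of Proposition 3 only change the normalising constants. A hypothetical implementation (the function name is my own, and the code is an illustration rather than the thesis' implementation) could look as follows: σ̂_ii and σ̂_12 are divided by n − 1, and the remaining off-diagonal elements by n − 2.

```python
import numpy as np

def explicit_estimators_prop3(X):
    """Unbiased explicit estimators of Proposition 3 (illustrative sketch).

    X is a p x n data matrix whose rows are the vectors x_1', ..., x_p'.
    """
    p, n = X.shape
    C = np.eye(n) - np.ones((n, n)) / n
    mu_hat = X.mean(axis=1)

    Sigma_hat = np.zeros((p, p))
    for i in range(p):
        Sigma_hat[i, i] = X[i] @ C @ X[i] / (n - 1)   # corrected MLE of sigma_ii

    r = X[0].copy()                                    # r_1 = x_1
    for i in range(p - 1):
        if i == 0:
            Sigma_hat[0, 1] = X[0] @ C @ X[1] / (n - 1)   # sigma_12: corrected MLE
        else:
            s = (r @ C @ X[i]) / (r @ C @ X[i - 1])
            r = X[i] - s * r
            # A_{i-1} has rank n - 2 (Theorem 16), so dividing by n - 2 removes the bias.
            Sigma_hat[i, i + 1] = r @ C @ X[i + 1] / (n - 2)
        Sigma_hat[i + 1, i] = Sigma_hat[i, i + 1]
    return mu_hat, Sigma_hat
```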

Chapter 4
Generalization to a general linear model

In this chapter the estimator presented earlier is extended to a general linear model, and the results proved in the last chapter are then proved for this new estimator. Since the estimation procedure for the covariance matrix in a linear model remains similar to the estimation of an ordinary multivariate normal covariance matrix, the reasoning will be similar. Two differences of concern are the effect of estimating the regression parameters instead of the expected value, and the degrees of freedom of the covariance estimator, which are now related to the rank of the design matrix.

4.1 Presentation

In this section the assumptions and some notation for the estimators are given. We use assumptions and a model similar to those for the general linear model in Chapter 2. Thus the multivariate model takes the form
Y = XB + E,
where Y and E are n × m random matrices, X is a known n × p design matrix, and B is an unknown p × m matrix of regression coefficients. We assume throughout this chapter that X has full rank p, that n > m + p, and that the rows of the error matrix E are independent N_m(0, Σ) random vectors. We also assign Σ a certain structure as in the previous chapter; hence Σ is banded of order one.

4.2 B̂ instead of µ̂

This section contains a motivation of why B̂ maximizes the conditional likelihood function in the same way as µ̂ does. The main difference between an ordinary normal model and a general linear model is the regression parameters, which affect the mean; therefore B is estimated instead of µ.

In a general linear model the MLE of B is
B̂ = (X'X)^{-1}X'Y.
Since the general linear model combines several response variables, it is possible to determine the different b_i separately with the following expression:
b̂_i = (X'X)^{-1}X'y_i, for i = 1, ..., k.
The explicit estimator in this thesis is derived from a stepwise maximization of the likelihood function. This procedure gives the estimators in this thesis and can be found in the Appendix. The estimators, as we have seen, coincide with the mean for a normal distribution. The same principle applies to the general linear model in the following way. With B = (b_1, ..., b_k), the maximizer of the conditional likelihood satisfies
b̂_i | y_1, ..., y_{i−1} = b̂_i = (X'X)^{-1}X'y_i.
Since each individual b-vector can be determined independently, and because the estimator b̂_i above is the MLE, it maximizes the conditional distribution owing to the independence of the b_i's. It is then possible to write B̂ = (b̂_1, ..., b̂_k). The estimators B̂ and Σ̂ are sufficient for B and Σ. This altogether makes a good basis for proposing explicit estimators for a general linear model with a banded structure of order one.

4.3 Proposed estimators

In this section we propose explicit estimators for a general linear model. In the last section a motivation for B̂ = (X'X)^{-1}X'Y was given, and here follows a motivation for the covariance matrix. In the previous chapter we assumed X ~ N_{p,n}(µ1_n', Σ^{(1)}(p), I_n). We now study the general linear model Y ~ N_{n,p}(XB, I_n, Σ^{(1)}(p)), where X denotes the design matrix, and see that the transformation Y − XB yields the same model as in the last chapter, hence Y − XB ~ N_{n,p}(0, I_n, Σ^{(1)}(p)). Replacing the observation x_i by y_i − Xb̂_i in Proposition 3 gives the following form of the estimator.

Proposition 4. Let Y ~ N_{n,p}(XB, I_n, Σ^{(1)}(p)). Explicit estimators are given by
B̂ = (X'X)^{-1}X'Y,
σ̂_ii = (1/(n − 1)) (y_i − Xb̂_i)'C(y_i − Xb̂_i) for i = 1, ..., p,
σ̂_12 = (1/(n − 1)) (y_1 − Xb̂_1)'C(y_2 − Xb̂_2),
σ̂_{i,i+1} = (1/(n − 2)) (y_i − Xb̂_i)'A_{i−1}(y_{i+1} − Xb̂_{i+1}) for i = 2, ..., p − 1,
where A_i = C − Cr̂_i(r̂_i'Cr̂_i)^{-1}r̂_i'C, with A_0 = C,

and where r̂_1 = y_1 and r̂_i = (y_i − Xb̂_i) − (r̂_{i−1}'C(y_i − Xb̂_i))/(r̂_{i−1}'C(y_{i−1} − Xb̂_{i−1})) r̂_{i−1} for i = 2, ..., p − 1, with C = I_n − (1/n) 1_n1_n'.

This can be simplified by changing the main matrix C into D = I_n − X(X'X)^{-1}X'. This removes the need for the residuals y_i − Xb̂_i, since the projection is included in the matrix D. This leads us to the following proposition.

Proposition 5. Let Y = XB + E ~ N_{n,p}(XB, I_n, Σ^{(1)}(p)), where rank(X) = k. Explicit estimators are given by
B̂ = (X'X)^{-1}X'Y,
σ̂_ii = (1/(n − 1)) y_i'Dy_i for i = 1, ..., p,
σ̂_12 = (1/(n − 2)) y_1'Dy_2,
σ̂_{i,i+1} = (1/(n − 2)) y_i'A_{i−1}y_{i+1} for i = 2, ..., p − 1,
where r̂_1 = y_1 and r̂_i = y_i − (r̂_{i−1}'Dy_i)/(r̂_{i−1}'Dy_{i−1}) r̂_{i−1} for i = 2, ..., p − 1,
A_i = D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D with A_0 = D, and D = I_n − X(X'X)^{-1}X'.

Since we saw in the last chapter that the correction for unbiasedness depends on the matrix A_i above, of which C was a part, we need to study the properties of the new A_i to determine what kind of estimator will give us unbiasedness in a general linear model. Here follows a theorem regarding the properties of the matrix A_i above.

Theorem 19. The matrix A_i in Proposition 5 is idempotent and symmetric with rank n − k − 1.

Proof. Idempotence: with A_i = D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D,
A_i² = (D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D)(D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D)
    = D² − DDr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'DD + Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'DDr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D
    = D − 2Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D + Dr̂_i(r̂_i'Dr̂_i)^{-1}(r̂_i'Dr̂_i)(r̂_i'Dr̂_i)^{-1}r̂_i'D
    = D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D = A_i,
where we have used that D is idempotent.

Symmetry:
A_i' = (D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D)' = D' − D'r̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D' = D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D = A_i.

Rank: Since A_i is idempotent, rank(A_i) = tr(A_i). This implies
rank(A_i) = tr(D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D) = tr(D) − tr(Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D) = (n − k) − tr((r̂_i'DDr̂_i)(r̂_i'Dr̂_i)^{-1}) = n − k − tr(1) = n − k − 1. □

This gives us the possibility to present the following explicit estimators, which in the next section will be proved to be unbiased and consistent.

Proposition 6. Let Y = XB + E ~ N_{n,p}(XB, I_n, Σ^{(1)}(p)), where rank(X) = k. Explicit estimators are given by
B̂ = (X'X)^{-1}X'Y,
σ̂_ii = (1/(n − k)) y_i'Dy_i for i = 1, ..., p,
σ̂_12 = (1/(n − k)) y_1'Dy_2,
σ̂_{i,i+1} = (1/(n − k − 1)) y_i'A_{i−1}y_{i+1} for i = 2, ..., p − 1,
where A_i = D − Dr̂_i(r̂_i'Dr̂_i)^{-1}r̂_i'D, with A_0 = D,
r̂_1 = y_1 and r̂_i = y_i − (r̂_{i−1}'Dy_i)/(r̂_{i−1}'Dy_{i−1}) r̂_{i−1} for i = 2, ..., p − 1,
and D = I_n − X(X'X)^{-1}X'.
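A minimal sketch of Proposition 6 (assuming NumPy; the function name is my own and the code is an illustration, not the thesis' implementation): C is replaced by D = I_n − X(X'X)^{-1}X', the diagonal elements and σ̂_12 are divided by n − k, and the remaining off-diagonal elements by n − k − 1.

```python
import numpy as np

def explicit_estimators_glm(Y, X):
    """Unbiased explicit estimators of Proposition 6 (illustrative sketch).

    Y is n x p (columns y_1, ..., y_p); X is the n x k design matrix
    of full column rank k.
    """
    n, p = Y.shape
    k = X.shape[1]
    D = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T   # projection onto the residual space

    B_hat = np.linalg.inv(X.T @ X) @ X.T @ Y
    Sigma_hat = np.zeros((p, p))
    for i in range(p):
        Sigma_hat[i, i] = Y[:, i] @ D @ Y[:, i] / (n - k)

    r = Y[:, 0].copy()                                  # r_1 = y_1
    for i in range(p - 1):
        if i == 0:
            Sigma_hat[0, 1] = Y[:, 0] @ D @ Y[:, 1] / (n - k)
        else:
            s = (r @ D @ Y[:, i]) / (r @ D @ Y[:, i - 1])
            r = Y[:, i] - s * r
            # A_{i-1} built from D has rank n - k - 1 (Theorem 19).
            Sigma_hat[i, i + 1] = r @ D @ Y[:, i + 1] / (n - k - 1)
        Sigma_hat[i + 1, i] = Sigma_hat[i, i + 1]
    return B_hat, Sigma_hat
```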

4.4 Proof of unbiasedness and consistency

This section provides the proofs of unbiasedness and consistency of the estimators presented above. Since the structure of the estimators is similar to that of an ordinary normal model, the proofs are similar to the ones presented in Chapter 3. We start by proving that the estimators are unbiased.

Theorem 20. The estimators from Proposition 6 are unbiased.

Proof. The estimators B̂, σ̂_ii for i = 1, ..., p and σ̂_12 coincide with the (corrected) maximum likelihood estimators and are thus, according to Theorem 12, unbiased. Therefore it remains to prove that σ̂_{i,i+1} is unbiased for i = 2, ..., p − 1. When the estimators are derived in the Appendix, they are conditioned on the previous y's; that is, the calculation of σ̂_{i,i+1} assumes y_1, ..., y_{i−1} to be known constants. Therefore, the matrix A_{i−1} below can be considered a non-random matrix. We can consider σ̂_{i,i+1} as a bilinear form and calculate its expectation as
E(σ̂_{i,i+1}) = E((1/(n − k − 1)) y_i'A_{i−1}y_{i+1}) = (1/(n − k − 1)) E(y_i'A_{i−1}y_{i+1} | y_1, ..., y_{i−1}) = (1/(n − k − 1)) rank(A_{i−1}) σ_{i,i+1} = (1/(n − k − 1)) (n − k − 1) σ_{i,i+1} = σ_{i,i+1},
according to Theorem 14 and Theorem 19. Thus E(σ̂_{i,i+1}) = σ_{i,i+1} and the theorem is proved. □

The proof of consistency follows the same structure as the proof above, but instead uses that the estimators are unbiased and studies their variance.

Theorem 21. The estimators from Proposition 6 are consistent.

Proof. The estimators B̂, σ̂_ii for i = 1, ..., p and σ̂_12 coincide with the (corrected) maximum likelihood estimators and are thus, according to Theorem 12, consistent. Therefore it remains to prove that σ̂_{i,i+1} is consistent for i = 2, ..., p − 1. When the estimators are derived in the Appendix, they are conditioned on the previous y's; that is, the calculation of σ̂_{i,i+1} assumes y_1, ..., y_{i−1} to be known constants. Therefore, the matrix A_{i−1} below can be considered a non-random matrix. We can then consider σ̂_{i,i+1} as a bilinear form and calculate its variance as
var(σ̂_{i,i+1}) = var((1/(n − k − 1)) y_i'A_{i−1}y_{i+1}) = (1/(n − k − 1)²) var(y_i'A_{i−1}y_{i+1} | y_1, ..., y_{i−1})
            = (1/(n − k − 1)²) (rank(A_{i−1}) σ_{i,i+1}² + tr(A_{i−1}²) σ_ii σ_{i+1,i+1})
            = (1/(n − k − 1)²) (n − k − 1) (σ_{i,i+1}² + σ_ii σ_{i+1,i+1})
            = (σ_{i,i+1}² + σ_ii σ_{i+1,i+1}) / (n − k − 1),
according to Theorem 14, Theorem 19 and since rank(A_{i−1}) = tr(A_{i−1}) = tr(A_{i−1}²). One can see that var(σ̂_{i,i+1}) → 0 when n → ∞. Estimators which are both unbiased and whose variance tends to 0 when n → ∞ are consistent according to Theorem 4. Thus the theorem is proved. □

We have presented and proved that these explicit estimators can be used, with good properties, for a general linear model.

Chapter 5
Simulations

In this chapter we display some simulations of the unbiased covariance matrix estimators presented in Proposition 3 and Proposition 6 and compare them to the previous estimators in Proposition 1 and Proposition 5. This gives an idea of how much better the improved estimators perform.

5.1 Simulations of the regular normal distribution

In this section we assumed that x ~ N_4(µ, Σ^{(1)}(4)) for a fixed mean vector µ and a fixed tridiagonal covariance matrix Σ. In each simulation a sample of size n = 20 observations was randomly generated using MATLAB, and the unbiased explicit estimators and the previous explicit estimators were calculated. This was repeated a large number of times and the average values of the obtained estimates were computed, giving an average unbiased explicit estimate Σ̂_new and an average previous estimate Σ̂_prev.

In this simulation the unbiased estimators perform considerably better than the previous ones. This is in line with the theory presented.

5.2 Simulations of the estimators for a general linear model

In this section we assumed that Y = XB + E ~ N_{n,5}(XB, I_n, Σ^{(1)}(5)) for a fixed tridiagonal covariance matrix Σ. For each simulation the matrices X and B were randomly generated using MATLAB's randomizer routine, to avoid any effect on the estimation process. In each simulation a sample of size n = 80 observations was randomly generated using MATLAB, and the unbiased explicit estimators and the previous explicit estimators were calculated. This was repeated a large number of times and the average values of the obtained estimates were computed, giving an average unbiased explicit estimate Σ̂_new and an average previous estimate Σ̂_prev.

In this simulation the unbiased estimators perform vastly better than the previous estimators. This is also in line with the theory, since the adjustment for unbiasedness is larger than for an ordinary normal distribution.
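As an illustration, here is a minimal sketch (assuming NumPy; the tridiagonal Σ below is an arbitrary example of my own, not the thesis' exact matrix) of the kind of simulation used in Section 5.1: it compares the averages of the Proposition 1 and Proposition 3 estimates of σ_23 over repeated samples of size n = 20, showing the downward bias of the previous estimator.

```python
import numpy as np

rng = np.random.default_rng(6)
p, n, reps = 4, 20, 10_000
Sigma = np.array([[2.0, 0.5, 0.0, 0.0],     # illustrative tridiagonal Sigma
                  [0.5, 1.5, 0.4, 0.0],
                  [0.0, 0.4, 1.0, 0.3],
                  [0.0, 0.0, 0.3, 2.0]])
mu = np.zeros(p)
C = np.eye(n) - np.ones((n, n)) / n

prev, new = np.zeros(reps), np.zeros(reps)
for rep in range(reps):
    X = rng.multivariate_normal(mu, Sigma, size=n).T        # p x n sample
    r1 = X[0]                                                # r_1 = x_1
    s2 = (r1 @ C @ X[1]) / (r1 @ C @ X[0])
    r2 = X[1] - s2 * r1                                      # residual vector r_2
    prev[rep] = r2 @ C @ X[2] / n                            # Proposition 1 estimate of sigma_23
    new[rep] = r2 @ C @ X[2] / (n - 2)                       # Proposition 3 estimate of sigma_23

print(Sigma[1, 2], prev.mean(), new.mean())   # true value vs. the two averages
```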

Chapter 6
Discussion and further research

6.1 Discussion

This thesis presents unbiased and consistent estimators for a covariance matrix with a banded structure of order one. This progress, together with the generalization to a general linear model, makes the estimators more suitable for use in real situations, since unbiasedness is a highly desired property. The simulation study in Chapter 5 shows that the effect is prominent, and since the estimator in the earlier paper [15] was comparable to other commonly used methods, this improved version should be efficient as well.

6.2 Further research

The main direction for future research in this area is to extend these results to a banded covariance matrix of any order. This would provide explicit estimators with good properties. Another step is to compare these estimators with the MLEs for a banded covariance matrix and determine how well they perform in that comparison. Lastly, a direction of research could be to study the efficiency of the estimator, which would give a comparison against all other possible unbiased estimators of a banded covariance matrix.


Appendix A
Motivation of the estimator

In this chapter the motivation of the explicit estimator presented in [15] is given. Here follows some notation specific to the appendix. We define M^{ji}_{(k)} as the matrix obtained when the j-th row and the i-th column have been removed from Σ_{(k)}. Moreover, we partition the matrix Σ^{(m)}(k) as
Σ^{(m)}(k) = ( Σ^{(m)}(k − 1)  σ_{1k} ; σ_{k1}'  σ_{kk} ),
where σ_{k1}' = (0, ..., 0, σ_{k,k−m}, ..., σ_{k,k−1}). For a matrix X, the notation X_{i:j} is used for the matrix including rows i to j, i.e., X_{i:j} = (x_i, x_{i+1}, ..., x_j)'.

The estimators are based on the likelihood. However, instead of maximizing the complete likelihood, we factor the likelihood and maximize each term; in this way explicit estimators are obtained. By conditioning, the probability density equals
f(X) = f(x_p | X_{1:p−1}) ··· f(x_{m+2} | X_{1:m+1}) f(X_{1:m+1}).
Hence, for k = m + 2, ..., p, partition the covariance matrix Σ_{(k)} as above. We have
x_k | X_{1:k−1} ~ N_{1,n}(µ_{k|1:k−1}, σ_{k|1:k−1}, I_n),
where the conditional variance equals
σ_{k|1:k−1} = σ_{kk} − σ_{k1}'Σ^{-1}_{(k−1)}σ_{1k},
and where the conditional expectation equals
µ_{k|1:k−1} = µ_k1_n + σ_{k1}'Σ^{-1}_{(k−1)} (x_1 − µ_11_n, ..., x_{k−1} − µ_{k−1}1_n)' = r_{k0}1_n + ∑_{i=k−m}^{k−1} σ_{ki} ∑_{j=1}^{k−1} σ^{ij}_{(k−1)} x_j.   (A.1)
Here σ^{ij}_{(k−1)} are the elements of the inverse matrix
Σ^{-1}_{(k−1)} = (σ^{ij}_{(k−1)})_{i,j} = ((−1)^{i+j} |M^{ji}_{(k−1)}| / |Σ_{(k−1)}|)_{i,j}.
The first regression coefficient equals
r_{k0} = µ_k − ∑_{i=k−m}^{k−1} σ_{ki} ∑_{j=1}^{k−1} σ^{ij}_{(k−1)} µ_j = µ_k − ∑_{i=k−m}^{k−1} σ_{ki} ∑_{j=1}^{k−1} (−1)^{i+j} (|M^{ji}_{(k−1)}| / |Σ_{(k−1)}|) µ_j.
We may rewrite equation (A.1) as
µ_{k|1:k−1} = r_{k0}1_n + ∑_{i=k−m}^{k−1} σ_{ki} ∑_{j=1}^{k−1} (−1)^{i+j} (|M^{ji}_{(k−1)}| / |Σ_{(k−1)}|) x_j = r_{k0}1_n + ∑_{i=k−m}^{k−1} r_{ki} s_{k−1,i} = S_{k−1}r_k,
where
r_k = (r_{k0}, r_{k,k−m}, ..., r_{k,k−1})',
S_{k−1} = (1_n, s_{k−1,k−m}, ..., s_{k−1,k−1}),
r_{ki} = σ_{ki} |Σ_{(k−2)}| / |Σ_{(k−1)}|, for i = k − m, ..., k − 1,
s_{k−1,i} = ∑_{j=1}^{k−1} (−1)^{i+j} (|M^{ji}_{(k−1)}| / |Σ_{(k−2)}|) x_j, for i = k − m, ..., k − 1.
The proposed estimators of the regression coefficients in the k-th step are
r̂_k = (r̂_{k0}, r̂_{k,k−m}, ..., r̂_{k,k−1})' = (Ŝ_{k−1}'Ŝ_{k−1})^{-1}Ŝ_{k−1}'x_k,
where
Ŝ_{k−1} = (1_n, ŝ_{k−1,k−m}, ..., ŝ_{k−1,k−1}),
ŝ_{k−1,i} = ∑_{j=1}^{k−1} (−1)^{i+j} (|M̂^{ji}_{(k−1)}| / |Σ̂_{(k−2)}|) x_j, for i = k − m, ..., k − 1.
Here the estimators from the previous steps (1, 2, ..., k − 1) are inserted in ŝ_{k−1,i} for all i = k − m, ..., k − 1. The estimator of the conditional variance is given by
σ̂_{k|1:k−1} = (1/n) (x_k − µ̂_{k|1:k−1})'(x_k − µ̂_{k|1:k−1}) = (1/n) x_k'(I_n − Ŝ_{k−1}(Ŝ_{k−1}'Ŝ_{k−1})^{-1}Ŝ_{k−1}')x_k.
The estimators of the original parameters may then be calculated as
σ̂_{ki} = r̂_{ki} |Σ̂_{(k−1)}| / |Σ̂_{(k−2)}|, for i = k − m, ..., k − 1,
µ̂_k = r̂_{k0} + ∑_{i=k−m}^{k−1} σ̂_{ki} ∑_{j=1}^{k−1} (−1)^{i+j} (|M̂^{ji}_{(k−1)}| / |Σ̂_{(k−1)}|) µ̂_j,
and
σ̂_{kk} = (1/n) x_k'(I_n − Ŝ_{k−1}(Ŝ_{k−1}'Ŝ_{k−1})^{-1}Ŝ_{k−1}')x_k + σ̂_{k1}'Σ̂^{-1}_{(k−1)}σ̂_{1k}.
It remains to show that the estimator µ̂_k is the mean of x_k, i.e., µ̂_k = (1/n) x_k'1_n for all k = 1, ..., p, and a proof by induction is now presented.

Base step: For k = 1, 2, ..., m + 1 we have µ̂_k = (1/n) x_k'1_n, since the estimators are MLEs in a model with an unstructured covariance matrix.

Inductive step: For some k > m + 1, assume that µ̂_j = (1/n) x_j'1_n for all j < k. Then
µ̂_k = r̂_{k0} + ∑_{i=k−m}^{k−1} σ̂_{ki} ∑_{j=1}^{k−1} (−1)^{i+j} (|M̂^{ji}_{(k−1)}| / |Σ̂_{(k−1)}|) µ̂_j
    = r̂_{k0} + ∑_{i=k−m}^{k−1} r̂_{ki} ∑_{j=1}^{k−1} (−1)^{i+j} (|M̂^{ji}_{(k−1)}| / |Σ̂_{(k−2)}|) (1/n) x_j'1_n
    = r̂_{k0} + ∑_{i=k−m}^{k−1} r̂_{ki} (1/n) ŝ_{k−1,i}'1_n
    = (1/n) 1_n'Ŝ_{k−1}r̂_k = (1/n) 1_n'Ŝ_{k−1}(Ŝ_{k−1}'Ŝ_{k−1})^{-1}Ŝ_{k−1}'x_k.
Since Ŝ_{k−1}(Ŝ_{k−1}'Ŝ_{k−1})^{-1}Ŝ_{k−1}' is a projection onto a space which contains the vector 1_n, we have
µ̂_k = (1/n) 1_n'Ŝ_{k−1}(Ŝ_{k−1}'Ŝ_{k−1})^{-1}Ŝ_{k−1}'x_k = (1/n) 1_n'x_k.
Hence, by induction, all the estimators of the expectations are means, i.e., µ̂_k = (1/n) x_k'1_n.


Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

Mathematical Foundations of Applied Statistics: Matrix Algebra

Mathematical Foundations of Applied Statistics: Matrix Algebra Mathematical Foundations of Applied Statistics: Matrix Algebra Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/105 Literature Seber, G.

More information

Chapter 5 Matrix Approach to Simple Linear Regression

Chapter 5 Matrix Approach to Simple Linear Regression STAT 525 SPRING 2018 Chapter 5 Matrix Approach to Simple Linear Regression Professor Min Zhang Matrix Collection of elements arranged in rows and columns Elements will be numbers or symbols For example:

More information

STAT200C: Review of Linear Algebra

STAT200C: Review of Linear Algebra Stat200C Instructor: Zhaoxia Yu STAT200C: Review of Linear Algebra 1 Review of Linear Algebra 1.1 Vector Spaces, Rank, Trace, and Linear Equations 1.1.1 Rank and Vector Spaces Definition A vector whose

More information

Multivariate Gaussian Distribution. Auxiliary notes for Time Series Analysis SF2943. Spring 2013

Multivariate Gaussian Distribution. Auxiliary notes for Time Series Analysis SF2943. Spring 2013 Multivariate Gaussian Distribution Auxiliary notes for Time Series Analysis SF2943 Spring 203 Timo Koski Department of Mathematics KTH Royal Institute of Technology, Stockholm 2 Chapter Gaussian Vectors.

More information

MATRIX ALGEBRA. or x = (x 1,..., x n ) R n. y 1 y 2. x 2. x m. y m. y = cos θ 1 = x 1 L x. sin θ 1 = x 2. cos θ 2 = y 1 L y.

MATRIX ALGEBRA. or x = (x 1,..., x n ) R n. y 1 y 2. x 2. x m. y m. y = cos θ 1 = x 1 L x. sin θ 1 = x 2. cos θ 2 = y 1 L y. as Basics Vectors MATRIX ALGEBRA An array of n real numbers x, x,, x n is called a vector and it is written x = x x n or x = x,, x n R n prime operation=transposing a column to a row Basic vector operations

More information

Notes on the Multivariate Normal and Related Topics

Notes on the Multivariate Normal and Related Topics Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions

More information

Topic 7 - Matrix Approach to Simple Linear Regression. Outline. Matrix. Matrix. Review of Matrices. Regression model in matrix form

Topic 7 - Matrix Approach to Simple Linear Regression. Outline. Matrix. Matrix. Review of Matrices. Regression model in matrix form Topic 7 - Matrix Approach to Simple Linear Regression Review of Matrices Outline Regression model in matrix form - Fall 03 Calculations using matrices Topic 7 Matrix Collection of elements arranged in

More information

Foundations of Matrix Analysis

Foundations of Matrix Analysis 1 Foundations of Matrix Analysis In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text For most of the proofs as well as for the details, the

More information

Chapter 3. Matrices. 3.1 Matrices

Chapter 3. Matrices. 3.1 Matrices 40 Chapter 3 Matrices 3.1 Matrices Definition 3.1 Matrix) A matrix A is a rectangular array of m n real numbers {a ij } written as a 11 a 12 a 1n a 21 a 22 a 2n A =.... a m1 a m2 a mn The array has m rows

More information

1 Appendix A: Matrix Algebra

1 Appendix A: Matrix Algebra Appendix A: Matrix Algebra. Definitions Matrix A =[ ]=[A] Symmetric matrix: = for all and Diagonal matrix: 6=0if = but =0if 6= Scalar matrix: the diagonal matrix of = Identity matrix: the scalar matrix

More information

MATH 431: FIRST MIDTERM. Thursday, October 3, 2013.

MATH 431: FIRST MIDTERM. Thursday, October 3, 2013. MATH 431: FIRST MIDTERM Thursday, October 3, 213. (1) An inner product on the space of matrices. Let V be the vector space of 2 2 real matrices (that is, the algebra Mat 2 (R), but without the mulitiplicative

More information

[POLS 8500] Review of Linear Algebra, Probability and Information Theory

[POLS 8500] Review of Linear Algebra, Probability and Information Theory [POLS 8500] Review of Linear Algebra, Probability and Information Theory Professor Jason Anastasopoulos ljanastas@uga.edu January 12, 2017 For today... Basic linear algebra. Basic probability. Programming

More information

Massachusetts Institute of Technology Department of Economics Statistics. Lecture Notes on Matrix Algebra

Massachusetts Institute of Technology Department of Economics Statistics. Lecture Notes on Matrix Algebra Massachusetts Institute of Technology Department of Economics 14.381 Statistics Guido Kuersteiner Lecture Notes on Matrix Algebra These lecture notes summarize some basic results on matrix algebra used

More information

Exam 2. Jeremy Morris. March 23, 2006

Exam 2. Jeremy Morris. March 23, 2006 Exam Jeremy Morris March 3, 006 4. Consider a bivariate normal population with µ 0, µ, σ, σ and ρ.5. a Write out the bivariate normal density. The multivariate normal density is defined by the following

More information

Gaussian Models (9/9/13)

Gaussian Models (9/9/13) STA561: Probabilistic machine learning Gaussian Models (9/9/13) Lecturer: Barbara Engelhardt Scribes: Xi He, Jiangwei Pan, Ali Razeen, Animesh Srivastava 1 Multivariate Normal Distribution The multivariate

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators

Estimation theory. Parametric estimation. Properties of estimators. Minimum variance estimator. Cramer-Rao bound. Maximum likelihood estimators Estimation theory Parametric estimation Properties of estimators Minimum variance estimator Cramer-Rao bound Maximum likelihood estimators Confidence intervals Bayesian estimation 1 Random Variables Let

More information

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1

Inverse of a Square Matrix. For an N N square matrix A, the inverse of A, 1 Inverse of a Square Matrix For an N N square matrix A, the inverse of A, 1 A, exists if and only if A is of full rank, i.e., if and only if no column of A is a linear combination 1 of the others. A is

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Review of Linear Algebra Denis Helic KTI, TU Graz Oct 9, 2014 Denis Helic (KTI, TU Graz) KDDM1 Oct 9, 2014 1 / 74 Big picture: KDDM Probability Theory

More information

The Multivariate Gaussian Distribution

The Multivariate Gaussian Distribution The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance

More information

Theory of Statistics.

Theory of Statistics. Theory of Statistics. Homework V February 5, 00. MT 8.7.c When σ is known, ˆµ = X is an unbiased estimator for µ. If you can show that its variance attains the Cramer-Rao lower bound, then no other unbiased

More information

Bare minimum on matrix algebra. Psychology 588: Covariance structure and factor models

Bare minimum on matrix algebra. Psychology 588: Covariance structure and factor models Bare minimum on matrix algebra Psychology 588: Covariance structure and factor models Matrix multiplication 2 Consider three notations for linear combinations y11 y1 m x11 x 1p b11 b 1m y y x x b b n1

More information

Linear Algebra. Matrices Operations. Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0.

Linear Algebra. Matrices Operations. Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0. Matrices Operations Linear Algebra Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0 The rectangular array 1 2 1 4 3 4 2 6 1 3 2 1 in which the

More information

Duke University, Department of Electrical and Computer Engineering Optimization for Scientists and Engineers c Alex Bronstein, 2014

Duke University, Department of Electrical and Computer Engineering Optimization for Scientists and Engineers c Alex Bronstein, 2014 Duke University, Department of Electrical and Computer Engineering Optimization for Scientists and Engineers c Alex Bronstein, 2014 Linear Algebra A Brief Reminder Purpose. The purpose of this document

More information

Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX

Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX September 2007 MSc Sep Intro QT 1 Who are these course for? The September

More information

A matrix over a field F is a rectangular array of elements from F. The symbol

A matrix over a field F is a rectangular array of elements from F. The symbol Chapter MATRICES Matrix arithmetic A matrix over a field F is a rectangular array of elements from F The symbol M m n (F ) denotes the collection of all m n matrices over F Matrices will usually be denoted

More information

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model

Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Restricted Maximum Likelihood in Linear Regression and Linear Mixed-Effects Model Xiuming Zhang zhangxiuming@u.nus.edu A*STAR-NUS Clinical Imaging Research Center October, 015 Summary This report derives

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

STAT 730 Chapter 4: Estimation

STAT 730 Chapter 4: Estimation STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum

More information

Gaussian vectors and central limit theorem

Gaussian vectors and central limit theorem Gaussian vectors and central limit theorem Samy Tindel Purdue University Probability Theory 2 - MA 539 Samy T. Gaussian vectors & CLT Probability Theory 1 / 86 Outline 1 Real Gaussian random variables

More information

Linear Methods for Prediction

Linear Methods for Prediction Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we

More information

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8

Peter Hoff Linear and multilinear models April 3, GLS for multivariate regression 5. 3 Covariance estimation for the GLM 8 Contents 1 Linear model 1 2 GLS for multivariate regression 5 3 Covariance estimation for the GLM 8 4 Testing the GLH 11 A reference for some of this material can be found somewhere. 1 Linear model Recall

More information

Boolean Inner-Product Spaces and Boolean Matrices

Boolean Inner-Product Spaces and Boolean Matrices Boolean Inner-Product Spaces and Boolean Matrices Stan Gudder Department of Mathematics, University of Denver, Denver CO 80208 Frédéric Latrémolière Department of Mathematics, University of Denver, Denver

More information

A Probability Review

A Probability Review A Probability Review Outline: A probability review Shorthand notation: RV stands for random variable EE 527, Detection and Estimation Theory, # 0b 1 A Probability Review Reading: Go over handouts 2 5 in

More information

A Quick Tour of Linear Algebra and Optimization for Machine Learning

A Quick Tour of Linear Algebra and Optimization for Machine Learning A Quick Tour of Linear Algebra and Optimization for Machine Learning Masoud Farivar January 8, 2015 1 / 28 Outline of Part I: Review of Basic Linear Algebra Matrices and Vectors Matrix Multiplication Operators

More information

Random Vectors 1. STA442/2101 Fall See last slide for copyright information. 1 / 30

Random Vectors 1. STA442/2101 Fall See last slide for copyright information. 1 / 30 Random Vectors 1 STA442/2101 Fall 2017 1 See last slide for copyright information. 1 / 30 Background Reading: Renscher and Schaalje s Linear models in statistics Chapter 3 on Random Vectors and Matrices

More information

ECE 275A Homework 6 Solutions

ECE 275A Homework 6 Solutions ECE 275A Homework 6 Solutions. The notation used in the solutions for the concentration (hyper) ellipsoid problems is defined in the lecture supplement on concentration ellipsoids. Note that θ T Σ θ =

More information

Matrix Algebra: Summary

Matrix Algebra: Summary May, 27 Appendix E Matrix Algebra: Summary ontents E. Vectors and Matrtices.......................... 2 E.. Notation.................................. 2 E..2 Special Types of Vectors.........................

More information

Canonical Correlation Analysis of Longitudinal Data

Canonical Correlation Analysis of Longitudinal Data Biometrics Section JSM 2008 Canonical Correlation Analysis of Longitudinal Data Jayesh Srivastava Dayanand N Naik Abstract Studying the relationship between two sets of variables is an important multivariate

More information

Fundamentals of Engineering Analysis (650163)

Fundamentals of Engineering Analysis (650163) Philadelphia University Faculty of Engineering Communications and Electronics Engineering Fundamentals of Engineering Analysis (6563) Part Dr. Omar R Daoud Matrices: Introduction DEFINITION A matrix is

More information

Common-Knowledge / Cheat Sheet

Common-Knowledge / Cheat Sheet CSE 521: Design and Analysis of Algorithms I Fall 2018 Common-Knowledge / Cheat Sheet 1 Randomized Algorithm Expectation: For a random variable X with domain, the discrete set S, E [X] = s S P [X = s]

More information

The Hilbert Space of Random Variables

The Hilbert Space of Random Variables The Hilbert Space of Random Variables Electrical Engineering 126 (UC Berkeley) Spring 2018 1 Outline Fix a probability space and consider the set H := {X : X is a real-valued random variable with E[X 2

More information

Random Vectors and Multivariate Normal Distributions

Random Vectors and Multivariate Normal Distributions Chapter 3 Random Vectors and Multivariate Normal Distributions 3.1 Random vectors Definition 3.1.1. Random vector. Random vectors are vectors of random 75 variables. For instance, X = X 1 X 2., where each

More information

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A =

Matrices and Vectors. Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = 30 MATHEMATICS REVIEW G A.1.1 Matrices and Vectors Definition of Matrix. An MxN matrix A is a two-dimensional array of numbers A = a 11 a 12... a 1N a 21 a 22... a 2N...... a M1 a M2... a MN A matrix can

More information

Ch4. Distribution of Quadratic Forms in y

Ch4. Distribution of Quadratic Forms in y ST4233, Linear Models, Semester 1 2008-2009 Ch4. Distribution of Quadratic Forms in y 1 Definition Definition 1.1 If A is a symmetric matrix and y is a vector, the product y Ay = i a ii y 2 i + i j a ij

More information

Linear Models Review

Linear Models Review Linear Models Review Vectors in IR n will be written as ordered n-tuples which are understood to be column vectors, or n 1 matrices. A vector variable will be indicted with bold face, and the prime sign

More information

Testing Some Covariance Structures under a Growth Curve Model in High Dimension

Testing Some Covariance Structures under a Growth Curve Model in High Dimension Department of Mathematics Testing Some Covariance Structures under a Growth Curve Model in High Dimension Muni S. Srivastava and Martin Singull LiTH-MAT-R--2015/03--SE Department of Mathematics Linköping

More information

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n = Spring 2012 Math 541A Exam 1 1. (a) Let Z i be independent N(0, 1), i = 1, 2,, n. Are Z = 1 n n Z i and S 2 Z = 1 n 1 n (Z i Z) 2 independent? Prove your claim. (b) Let X 1, X 2,, X n be independent identically

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A. 1. Let P be a probability measure on a collection of sets A. (a) For each n N, let H n be a set in A such that H n H n+1. Show that P (H n ) monotonically converges to P ( k=1 H k) as n. (b) For each n

More information

CS 246 Review of Linear Algebra 01/17/19

CS 246 Review of Linear Algebra 01/17/19 1 Linear algebra In this section we will discuss vectors and matrices. We denote the (i, j)th entry of a matrix A as A ij, and the ith entry of a vector as v i. 1.1 Vectors and vector operations A vector

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

Brief Review on Estimation Theory

Brief Review on Estimation Theory Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on

More information

ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE)

ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE) 1 ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE) Jingxian Wu Department of Electrical Engineering University of Arkansas Outline Minimum Variance Unbiased Estimators (MVUE)

More information

Introduction to Matrix Algebra

Introduction to Matrix Algebra Introduction to Matrix Algebra August 18, 2010 1 Vectors 1.1 Notations A p-dimensional vector is p numbers put together. Written as x 1 x =. x p. When p = 1, this represents a point in the line. When p

More information

3 Multiple Linear Regression

3 Multiple Linear Regression 3 Multiple Linear Regression 3.1 The Model Essentially, all models are wrong, but some are useful. Quote by George E.P. Box. Models are supposed to be exact descriptions of the population, but that is

More information

CS281A/Stat241A Lecture 17

CS281A/Stat241A Lecture 17 CS281A/Stat241A Lecture 17 p. 1/4 CS281A/Stat241A Lecture 17 Factor Analysis and State Space Models Peter Bartlett CS281A/Stat241A Lecture 17 p. 2/4 Key ideas of this lecture Factor Analysis. Recall: Gaussian

More information

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian FE661 - Statistical Methods for Financial Engineering 2. Linear algebra Jitkomut Songsiri matrices and vectors linear equations range and nullspace of matrices function of vectors, gradient and Hessian

More information

Multivariate Time Series

Multivariate Time Series Multivariate Time Series Notation: I do not use boldface (or anything else) to distinguish vectors from scalars. Tsay (and many other writers) do. I denote a multivariate stochastic process in the form

More information

Chapter 2. Linear Algebra. rather simple and learning them will eventually allow us to explain the strange results of

Chapter 2. Linear Algebra. rather simple and learning them will eventually allow us to explain the strange results of Chapter 2 Linear Algebra In this chapter, we study the formal structure that provides the background for quantum mechanics. The basic ideas of the mathematical machinery, linear algebra, are rather simple

More information

Stat 206: Linear algebra

Stat 206: Linear algebra Stat 206: Linear algebra James Johndrow (adapted from Iain Johnstone s notes) 2016-11-02 Vectors We have already been working with vectors, but let s review a few more concepts. The inner product of two

More information

MAS223 Statistical Inference and Modelling Exercises

MAS223 Statistical Inference and Modelling Exercises MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,

More information

Hands-on Matrix Algebra Using R

Hands-on Matrix Algebra Using R Preface vii 1. R Preliminaries 1 1.1 Matrix Defined, Deeper Understanding Using Software.. 1 1.2 Introduction, Why R?.................... 2 1.3 Obtaining R.......................... 4 1.4 Reference Manuals

More information

Stat 206: Sampling theory, sample moments, mahalanobis

Stat 206: Sampling theory, sample moments, mahalanobis Stat 206: Sampling theory, sample moments, mahalanobis topology James Johndrow (adapted from Iain Johnstone s notes) 2016-11-02 Notation My notation is different from the book s. This is partly because

More information

Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε,

Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε, 2. REVIEW OF LINEAR ALGEBRA 1 Lecture 1 Review: Linear models have the form (in matrix notation) Y = Xβ + ε, where Y n 1 response vector and X n p is the model matrix (or design matrix ) with one row for

More information

Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data

Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data Applied Mathematical Sciences, Vol 3, 009, no 54, 695-70 Testing a Normal Covariance Matrix for Small Samples with Monotone Missing Data Evelina Veleva Rousse University A Kanchev Department of Numerical

More information