Linear Mixed Models: Methodology and Algorithms
David M. Allen
University of Kentucky
March 6, 2017
C Topics from Calculus

Maximum likelihood and REML estimation involve the minimization of a negative log likelihood function with respect to the parameters. Minimization is done using Newton's method, which requires the derivatives of the negative log likelihood function. The likelihood function is a function of functions, so the chain rule facilitates determining the derivatives. This chapter gives the required calculus.
Section C.1 The Chain Rule

The chain rule is a fundamental rule of differentiation. Some physicists claim that the chain rule is the most important theorem in all of mathematics (Hubbard and Hubbard [1]).
Statement of the Chain Rule

Let g be an m-vector of functions having n arguments and f be a p-vector of functions having m arguments. If g is differentiable at a and f is differentiable at g(a), then the composition f(g(x)) is differentiable at a, and its derivative is given by

\[
\left.\frac{d}{dx}\, f(g(x))\right|_{x=a} = \left.\frac{d}{du}\, f(u)\right|_{u=g(a)} \left.\frac{d}{dx}\, g(x)\right|_{x=a}
\]
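As a concrete check, the following NumPy sketch (all functions and names are illustrative, not from the text) verifies the matrix form of the chain rule: the Jacobian of a composition f(g(x)) at a point agrees with the product of the individual Jacobians, compared here against a finite-difference approximation.

```python
import numpy as np

# Illustrative g: R^2 -> R^3 and f: R^3 -> R^2 with hand-coded Jacobians.
def g(x):
    return np.array([x[0] * x[1], np.sin(x[0]), x[1] ** 2])

def Jg(x):
    # 3x2 Jacobian of g
    return np.array([[x[1], x[0]],
                     [np.cos(x[0]), 0.0],
                     [0.0, 2.0 * x[1]]])

def f(u):
    return np.array([u[0] + u[1] * u[2], np.exp(u[0])])

def Jf(u):
    # 2x3 Jacobian of f
    return np.array([[1.0, u[2], u[1]],
                     [np.exp(u[0]), 0.0, 0.0]])

a = np.array([0.7, -1.3])

# Chain rule: the Jacobian of the composition is the product of Jacobians.
J_chain = Jf(g(a)) @ Jg(a)

# Finite-difference Jacobian of f(g(x)) at a, one column per argument.
eps = 1e-6
J_fd = np.column_stack(
    [(f(g(a + eps * e)) - f(g(a - eps * e))) / (2 * eps) for e in np.eye(2)]
)

print(np.allclose(J_chain, J_fd, atol=1e-6))  # expected: True
```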
Section C.2 Some Matrix Derivatives

The likelihood function of the multivariate normal distribution involves both the determinant and inverse of the variance matrix. For application of Newton's algorithm, the first and second derivatives of the determinant and inverse of the variance matrix are required. Finding these derivatives is the subject of this section.
Notation

Any letter could be used to represent the matrix under discussion. I will use V since its use is in the context of a variance matrix. Assume the elements of V are functions of a vector of parameters θ. This is emphasized by writing it as V(θ).
Derivative of an Inverse Matrix

The derivative of an inverse is the simpler of the two cases considered. The defining relationship between a matrix and its inverse is

\[
V(\theta)\, V^{-1}(\theta) = I
\]

The derivative of both sides with respect to the kth element of θ is

\[
\frac{d\, V(\theta)}{d\theta_k}\, V^{-1}(\theta) + V(\theta)\, \frac{d\, V^{-1}(\theta)}{d\theta_k} = 0
\]

Straightforward manipulation gives

\[
\frac{d\, V^{-1}(\theta)}{d\theta_k} = -V^{-1}(\theta)\, \frac{d\, V(\theta)}{d\theta_k}\, V^{-1}(\theta) \tag{C.2.1}
\]
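A quick numerical sketch of (C.2.1), assuming an illustrative variance-components form V(θ) = A Aᵗ θ₁ + I θ₂ (the same shape used in the exercises below); the analytic right-hand side is compared with a finite-difference derivative of the inverse.

```python
import numpy as np

# Illustrative parametric form: V(theta) = theta_1 * A A^t + theta_2 * I.
A = np.array([[1.0, 2.0], [6.0, 1.0], [3.0, 3.0], [2.0, 1.0]])

def V(theta):
    return theta[0] * A @ A.T + theta[1] * np.eye(4)

theta = np.array([3.0, 2.0])
dV = A @ A.T                      # dV / d theta_1
Vinv = np.linalg.inv(V(theta))

rhs = -Vinv @ dV @ Vinv           # right-hand side of (C.2.1)

eps = 1e-6                        # finite-difference left-hand side
lhs = (np.linalg.inv(V(theta + [eps, 0.0]))
       - np.linalg.inv(V(theta - [eps, 0.0]))) / (2 * eps)

print(np.allclose(lhs, rhs, atol=1e-5))  # expected: True
```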
Analogies

There are two analogies to one-variable calculus in the derivative above: the derivative of a product and implicit differentiation.
The Derivative of a Determinant

For discussion of the derivative of a determinant, I temporarily suspend the dependence of V on θ and derive the derivative with respect to an element of V. The derivative with respect to an element of θ is brought in via the chain rule.
The Cofactor of a Matrix

For a square matrix V, the minor of its (i, j) entry is defined to be the determinant of the submatrix obtained by removing from V its ith row and jth column, and it is denoted by M_ij. Then

\[
C_{ij} = (-1)^{i+j} M_{ij}
\]

is called the (i, j) cofactor of V.
The Determinant of a Matrix

The determinant of V (n × n) may be expressed as

\[
\det(V) = \sum_{i=1}^{n} v_{ij}\, C_{ij} \quad \text{for any fixed } j,
\]

or

\[
\det(V) = \sum_{j=1}^{n} v_{ij}\, C_{ij} \quad \text{for any fixed } i.
\]

These are called column and row expansions respectively.
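The row expansion translates directly into a recursive routine. The sketch below is illustrative only, since Laplace expansion costs O(n!) and is never used for serious computation; it is checked against numpy.linalg.det.

```python
import numpy as np

def det_cofactor(V):
    """Determinant via row expansion along the first row (Laplace expansion)."""
    n = V.shape[0]
    if n == 1:
        return V[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(np.delete(V, 0, axis=0), j, axis=1)
        cofactor = (-1) ** j * det_cofactor(minor)   # (-1)^(0+j) * M_0j
        total += V[0, j] * cofactor
    return total

V = np.array([[4.0, 1.0, 2.0],
              [1.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
print(det_cofactor(V), np.linalg.det(V))  # both approximately 70.0
```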
Cofactor Matrix

For a matrix

\[
V = \begin{bmatrix}
v_{11} & v_{12} & \cdots & v_{1n} \\
v_{21} & v_{22} & \cdots & v_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
v_{n1} & v_{n2} & \cdots & v_{nn}
\end{bmatrix}
\]

the cofactor matrix is

\[
C = \begin{bmatrix}
C_{11} & C_{12} & \cdots & C_{1n} \\
C_{21} & C_{22} & \cdots & C_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
C_{n1} & C_{n2} & \cdots & C_{nn}
\end{bmatrix}
\]
The Adjugate and Inverse Matrices

The adjugate matrix is the transpose of the cofactor matrix, adj(V) = C^t. Provided det(V) ≠ 0, the inverse of V is

\[
V^{-1} = \frac{1}{\det(V)}\, \operatorname{adj}(V)
\]
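A short sketch, with a hypothetical helper name, that forms the cofactor matrix element by element and confirms V⁻¹ = adj(V)/det(V) for a small matrix.

```python
import numpy as np

def cofactor_matrix(V):
    """Matrix of cofactors C_ij = (-1)^(i+j) * M_ij (illustrative helper)."""
    n = V.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            M_ij = np.linalg.det(np.delete(np.delete(V, i, axis=0), j, axis=1))
            C[i, j] = (-1) ** (i + j) * M_ij
    return C

V = np.array([[4.0, 1.0, 2.0],
              [1.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
adj_V = cofactor_matrix(V).T      # adjugate = transpose of the cofactor matrix
print(np.allclose(np.linalg.inv(V), adj_V / np.linalg.det(V)))  # expected: True
```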
The Derivative With Respect to an Element

The derivative of the logarithm of the determinant of V with respect to an element is

\[
\frac{d}{d v_{ij}} \log(\det(V)) = \frac{1}{\det(V)}\, C_{ij} = \left[ V^{-1} \right]_{ji}
\]
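As a sanity check on this element-wise identity, the following sketch perturbs a single entry v_ij of a fixed matrix and compares the finite-difference derivative of log det(V) with [V⁻¹]_ji; the matrix and the chosen indices are illustrative.

```python
import numpy as np

V = np.array([[4.0, 1.0, 2.0],
              [1.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
i, j = 0, 2                        # which element to perturb
eps = 1e-7

Vp, Vm = V.copy(), V.copy()
Vp[i, j] += eps                    # perturb only v_ij, holding other entries fixed
Vm[i, j] -= eps
fd = (np.log(np.linalg.det(Vp)) - np.log(np.linalg.det(Vm))) / (2 * eps)

# The analytic derivative is the (j, i) entry of the inverse.
print(np.isclose(fd, np.linalg.inv(V)[j, i]))  # expected: True
```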
Derivative with Respect to θ

Bring back the dependency of V on θ and apply the chain rule:

\[
\frac{d}{d\theta_k} \log(\det(V(\theta)))
= \frac{1}{\det(V)} \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{d\, v_{ij}(\theta)}{d\theta_k}\, C_{ij}
= \sum_{i=1}^{n} \sum_{j=1}^{n} \left[ V^{-1} \right]_{ji} \frac{d\, v_{ij}(\theta)}{d\theta_k}
= \operatorname{tr}\!\left( V^{-1}\, \frac{d\, V(\theta)}{d\theta_k} \right) \tag{C.2.2}
\]
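The trace identity (C.2.2) is easy to validate numerically. The sketch below assumes the same illustrative form V(θ) = A Aᵗ θ₁ + I θ₂ and compares the trace expression with a finite difference of the log-determinant (computed via numpy.linalg.slogdet for stability).

```python
import numpy as np

# Illustrative form as before: V(theta) = theta_1 * A A^t + theta_2 * I.
A = np.array([[1.0, 2.0], [6.0, 1.0], [3.0, 3.0], [2.0, 1.0]])

def V(theta):
    return theta[0] * A @ A.T + theta[1] * np.eye(4)

def logdet(theta):
    return np.linalg.slogdet(V(theta))[1]   # stable log-determinant

theta = np.array([3.0, 2.0])
dV = A @ A.T                                 # dV / d theta_1

trace_form = np.trace(np.linalg.inv(V(theta)) @ dV)

eps = 1e-6
fd = (logdet(theta + [eps, 0.0]) - logdet(theta - [eps, 0.0])) / (2 * eps)

print(np.isclose(trace_form, fd, atol=1e-5))  # expected: True
```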
Exercises

The following exercises depend on these quantities:

\[
Y = \begin{bmatrix} 5.4 \\ 5.9 \\ 5.5 \\ 1.1 \end{bmatrix}, \quad
A = \begin{bmatrix} 1 & 2 \\ 6 & 1 \\ 3 & 3 \\ 2 & 1 \end{bmatrix}, \quad
\theta = [\theta_1, \theta_2]^t, \quad
V(\theta) = A A^t \theta_1 + I\, \theta_2.
\]

Y is a realization of N_4(0, V(θ)).

Exercise C.2.1. Let L(θ; Y) represent negative two times the log likelihood function of θ. Give the expression for L(θ; Y). You may ignore the constant term.

Exercise C.2.2. Find the derivative of L(θ; Y) with respect to θ_1 evaluated at [θ_1, θ_2] = [3, 2].
Exercise C.2.3. Find the derivative of L(θ; Y) with respect to θ_2 evaluated at [θ_1, θ_2] = [3, 2].
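For readers who want to verify their hand computations, here is a numerical helper, a sketch under the setup above: it forms L(θ; Y) = log det V(θ) + Yᵗ V⁻¹(θ) Y (constant dropped, per Exercise C.2.1) and evaluates a finite-difference gradient at [θ₁, θ₂] = [3, 2], against which the analytic answers to Exercises C.2.2 and C.2.3 can be checked.

```python
import numpy as np

Y = np.array([5.4, 5.9, 5.5, 1.1])
A = np.array([[1.0, 2.0], [6.0, 1.0], [3.0, 3.0], [2.0, 1.0]])

def V(theta):
    # Form from the exercise setup: V(theta) = A A^t theta_1 + I theta_2.
    return theta[0] * A @ A.T + theta[1] * np.eye(4)

def L(theta):
    # -2 log likelihood with the constant dropped: log det V + Y^t V^{-1} Y.
    Vt = V(theta)
    return np.linalg.slogdet(Vt)[1] + Y @ np.linalg.solve(Vt, Y)

theta = np.array([3.0, 2.0])
eps = 1e-6
grad = np.array([(L(theta + eps * e) - L(theta - eps * e)) / (2 * eps)
                 for e in np.eye(2)])
print(grad)  # finite-difference dL/d theta_1 and dL/d theta_2 at [3, 2]
```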
References

[1] John H. Hubbard and Barbara Burke Hubbard. Vector Calculus, Linear Algebra, and Differential Forms. Fifth edition. Ithaca, New York: Matrix Editions, 2015.