April 26, Applied mathematics PhD candidate, physics MA UC Berkeley. Lecture 4/26/2013. Jed Duersch. Spd matrices. Cholesky decomposition

Applied mathematics PhD candidate, physics MA UC Berkeley April 26, 2013 UCB 1/19

Symmetric positive-definite I Definition A symmetric matrix A R n n is positive definite iff x T Ax > 0 holds x 0 R n. Such naturally arise in a variety of math, science, and engineering applications. For example, The Hamiltonian operator of a physical system The Hessian of a convex function The covariance matrix of linearly independent random variables UCB 2/19

Symmetric positive-definite II Theorem The following statements are equivalent: 1 A is symmetric positive definite. 2 (x, y) := x T Ay is an inner product on R n. 3 All eigenvalues are positive and eigenvectors can be constructed to form an orthonormal basis. 4 A is symmetric with positive leading principal minors. 5 The of A exists. UCB 3/19

Existence I To prove the existence of the for symmetric positive definite, we apply induction. Induction hypothesis: Given any spd A n 1 R n 1 n 1 assume there exists lower-triangular L n 1 where. A n 1 = L n 1 L T n 1 Base case: A 1 = [a 1,1 ] = [ a 1,1 ][ a 1,1 ] T. Consistency: Show that for any spd A n R n n there exists L n such that A n = L n L T n. UCB 4/19

Existence II Let us represent A n as [ α ] c T A n = c B where α := a 1,1 is a scalar, c := a 2:n,1 is a column vector, and B = a 2:n,2:n is the remaining R n 1 n 1 lower-right sub-matrix. Then we have A n = [ ][ ][ ] α 0 1 0 α α 1 c T 1 α c I n 1 0 B 1 α cct 0 I n 1 UCB 5/19

Existence III Let y be an arbitrary nonzero column vector in R n 1. We can solve It follows [ ] α α 1 c T 0 I n 1 x = [ ] 0 y y T (B 1 α cc T )y = x T A n x > 0 Thus B 1 α cct is spd. From the induction hypothesis. B 1 α cct = L n 1 L T n 1 UCB 6/19

Existence IV Putting it all together, we have [ ][ ][ ] α 0 1 0 α α 1 c T 1 A n = α c I n 1 0 B 1 α cct 0 I n 1 = = [ ][ ][ ] α 0 1 0 α α 1 c T 1 α c I n 1 0 L n 1 L T n 1 0 I n 1 [ ][ ] α 0 α α 1 c T 1 α c L n 1 0 L T n 1 = L n L T n UCB 7/19

Implementation I Note that the proof of existence was constructive. That means we can follow the same procedure to calculate the. 1 Loop over columns. for k=1:n-1 2 Set diagonal element of L. L(k,k)=sqrt(A(k,k)) 3 Finish column. L(k+1:n,k)=A(k+1:n)/L(k,k) 4 Update lower right sub-matrix of A. A(k+1:n,k+1:n)=A(k+1:n,k+1:n)- L(k+1:n,k)*L(k+1:n,k) T 5 End for loop. end UCB 8/19

I We can similarly take advantage of the stucture of band to optimize the calculation of the LU. Definition A band matrix A R n n of band width 2m + 1 has nonzero entries only on the main diagonal, m super-diagonals, and m sub-diagonals. UCB 9/19

II For example, a tridiagonal matrix (band width 3) has the form: a 1,1 a 1,2 0 a 2,1 a 2,2 a 2,3 a 3,2 a 3,3 a 3,4. A = a 4,3 a.. 4,4...... an 1,n 0 a n,n 1 a n,n UCB 10/19

III We can easily show that the LU will have the form: 1 0 u 1,1 u 1,2 0 l 2,1 1. u.. 2,2 A =......... un 1,n 0 l n,n 1 1 0 u n,n Because we know so many entries must be zero, we can optimize the algorithm to avoid calculating them. Similarly, the procedures for forward and backward substitution can also be optimized in these cases. UCB 11/19

Diagonal dominance I Partial pivoting can be implemented, but it will increase the band width of U from m + 1 to 2m + 1. However, pivoting can sometimes be avoided. Definition A matrix A R n n is column diagonally dominant iff a j,j > i j a i,j We can show that a column diagonally dominant matrix never requires pivoting. UCB 12/19

Diagonal dominance II Proof: Let a (i) k,j represent element k, j after zeroing column i the standard LU algorithm. Assume the induction hypothesis: j,j > n k=i+1, j k,j When column i + 1 is zeroed below the diagonal, the elements of A update as follows: a (i+1) k,j = a (i) k,j a(i) k,i+1 a(i) i+1,j a (i) i+1,i+1 UCB 13/19

Diagonal dominance III a (i+1) j,j = j,j a(i) j,i+1 a(i) i+1,j a (i) i+1,i+1 j,j a(i) j,i+1 a(i) i+1,i+1 n > k,j a(i) = k=i+1, j n k=i+2, j i+1,j j,i+1 a(i) i+1,j i+1,i+1 k,j + a(i) i+1,j (1 a(i) j,i+1 i+1,i+1 ) UCB 14/19

Diagonal dominance IV To make further progress we observe: a (i+1) k,j k,j + a(i) k,i+1 a(i) i+1,j i+1,i+1 k,j a(i+1) k,j a(i) k,i+1 a(i) i+1,j i+1,i+1 UCB 15/19

Diagonal dominance V Together this gives: a (i+1) j,j > > n k=i+2, j n k=i+2, j a (i+1) k,j + a (i+1) k,j i+1,j 1 n k=i+2 k,i+1 i+1,i+1 This shows consistency for the induction hypothesis and the base case easily follows from the definition of diagonal dominance. The desired result immediate follows since the diagonal elements are always largest in magnitude. UCB 16/19

Implementation I We can follow the standard LU procedure and simply avoid elements that do not update. If A has band width 2m + 1, we have 1 Loop over columns. for k=1:n-1 2 Construct column k of L. L(k+1:k+m,k)=A(k+1:k+m,k)/A(k,k) 3 Set column k of A to zero. A(k+1:k+m,k)=0 4 Update lower right sub-matrix of A. A(k+1:k+m,k+1:k+m)=A(k+1:k+m,k+1:k+m)- L(k+1:k+m,k)*A(k,k+1:k+m) 5 End for loop. Identify U. end; U=A UCB 17/19

Implementation II In the case of a tridiagonal matrix, the procedure becomes exceptionally simple and efficient. Let us write the diagonals of A as vectors. Then L and U can be written as: d 1 e 1 0. c 1 d.. 2 A =...... en 1 0 c n 1 d n 1 0 u 1 e 1 0 l 1 1. u.. 2 =......... en 1 0 l n 1 1 0 u n UCB 18/19

Implementation III Follow the same procedure, but avoid updating A. Build U directly. Note that the super-diagonal doesn t change and can be copied directly into U. 1 Set first row of U u(1)=d(1) 2 Loop over columns. for k=1:n-1 3 Construct column k of L. l(k)=c(k)/u(k) 4 Update A and place result directly in U. u(k+1)=d(k+1)-l(k)*e(k) 5 End for loop. end UCB 19/19