Math 502 Fall 2005 Solutions to Homework 4

(1) It is easy to see that $A^*A$ is positive definite and hermitian, since $A$ is non-singular. Thus, according to Theorem 23.1 (page 174 in Trefethen and Bau), $A^*A$ has a unique Cholesky factorization $A^*A = U^*U$, with $U$ upper triangular and $u_{ii} > 0$, $1 \le i \le m$. By assumption $A = QR$, where $Q$ is unitary and $R$ is upper triangular with $r_{ii} > 0$, $1 \le i \le m$. Therefore
$$A^*A = R^*Q^*QR = R^*R,$$
so that $A^*A = R^*R$ is also a Cholesky factorization with the same characteristics. By uniqueness, $U = R$.

(2) (a) There are several ways to prove this. The first argument below was the one most people used.

(i) Let $\lambda \in \mathbb{C}$ and $B = A - \lambda I$. Since $A$ is tridiagonal with all of its subdiagonal and superdiagonal entries nonzero, $B$ also has these properties. Let $B$ be partitioned as
$$B = \begin{bmatrix} v^T & 0 \\ U & w \end{bmatrix},$$
where $v, w$ are column vectors of length $m-1$, and $U$ is $(m-1) \times (m-1)$. Since $B$ is tridiagonal with all subdiagonal entries nonzero, $U$ is upper triangular with nonzero diagonal entries. Therefore $\det(U) \ne 0$. It follows that if $\det(B) = 0$ then $\operatorname{rank}(B) = m-1$, and otherwise $\operatorname{rank}(B) = m$. In the former case, $\lambda$ is an eigenvalue and $\operatorname{null}(B)$ is the eigenspace. Since $\dim(\operatorname{null}(B)) = m - \operatorname{rank}(B) = 1$, the geometric multiplicity of $\lambda$ is one, which is also the algebraic multiplicity since $A$ is hermitian. Hence all eigenvalues are distinct.

(ii) Let $B$ be defined as above. For convenience let $d_i = b_{i,i}$ denote the diagonal entries of $B$, and $l_i = b_{i,i-1}$ and $u_i = b_{i,i+1}$ the subdiagonal and superdiagonal entries. Then $x \in \operatorname{null}(B)$ if and only if
$$d_1 x_1 + u_1 x_2 = 0, \qquad l_i x_{i-1} + d_i x_i + u_i x_{i+1} = 0, \quad i = 2, \ldots, m-1, \qquad l_m x_{m-1} + d_m x_m = 0.$$
Clearly, if $x_1, x_2$ satisfy the first equation then $x_2 = -(d_1/u_1) x_1$, since by assumption $u_1 \ne 0$. Let $c_1 = 1$, $c_2 = -d_1/u_1$, and inductively define
$$c_i = -\frac{1}{u_{i-1}} \left( l_{i-1} c_{i-2} + d_{i-1} c_{i-1} \right), \quad i = 3, \ldots, m.$$
Since $u_i \ne 0$ for all $i$, these numbers are well-defined. An easy induction argument shows that if $x \in \mathbb{R}^m$ satisfies the first $m-1$ equations in this system, then $x_i = c_i x_1$, $i = 1, \ldots, m$. That is, $x = x_1 c$, where $c = [c_1, \ldots, c_m]^T$. Hence $x \in \operatorname{null}(B)$ if and only if $x = x_1 c$ and
$$l_m x_{m-1} + d_m x_m = (l_m c_{m-1} + d_m c_m) x_1 = 0.$$
If $l_m c_{m-1} + d_m c_m = 0$ then $\dim(\operatorname{null}(B)) = 1$; otherwise $\operatorname{null}(B) = \{0\}$. In the former case $\lambda$ is an eigenvalue and its eigenspace is one dimensional. Since $A$ is hermitian it follows that all eigenvalues are distinct.

(b) A simple example is
$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix},$$
which has $\lambda = 1$ as a triple eigenvalue. The companion matrices described on page 192 of the text (see (25.3)) are also examples, if the companion polynomial $p(z)$ has a repeated root.

(3) An induction argument shows that
$$v^{(k)} = \alpha_1 q_1 + \sum_{j=2}^m \alpha_j \left( \frac{\lambda_j}{\lambda_1} \right)^k q_j, \quad k = 1, 2, \ldots,$$
and from this we deduce $v^{(k)} \to \alpha_1 q_1$ as $k \to \infty$, since $|\lambda_j/\lambda_1| < 1$ for $j = 2, \ldots, m$. To show the convergence is linear with asymptotic constant $C = |\lambda_2|/|\lambda_1|$ we need to verify the limit
$$\lim_{k\to\infty} \frac{\|e^{(k+1)}\|}{\|e^{(k)}\|} = \lim_{k\to\infty} \frac{\|v^{(k+1)} - \alpha_1 q_1\|}{\|v^{(k)} - \alpha_1 q_1\|} = \frac{|\lambda_2|}{|\lambda_1|}.$$
Using the orthonormality of the eigenvectors we have
$$\|e^{(k)}\|^2 = \sum_{j=2}^m |\alpha_j|^2 \left( \frac{\lambda_j}{\lambda_1} \right)^{2k} = \left( \frac{\lambda_2}{\lambda_1} \right)^{2k} \left\{ |\alpha_2|^2 + \sum_{j=3}^m |\alpha_j|^2 \left( \frac{\lambda_j}{\lambda_2} \right)^{2k} \right\}.$$
Similarly,
$$\|e^{(k+1)}\|^2 = \left( \frac{\lambda_2}{\lambda_1} \right)^{2k+2} \left\{ |\alpha_2|^2 + \sum_{j=3}^m |\alpha_j|^2 \left( \frac{\lambda_j}{\lambda_2} \right)^{2k+2} \right\}.$$
Using the assumption $|\lambda_2| > |\lambda_3| \ge |\lambda_j|$ for $j > 3$, we have
$$\lim_{k\to\infty} \left( |\alpha_2|^2 + \sum_{j=3}^m |\alpha_j|^2 \left( \frac{\lambda_j}{\lambda_2} \right)^{2k} \right) = |\alpha_2|^2.$$
Since $\alpha_2 \ne 0$ it follows that
$$\lim_{k\to\infty} \frac{\|e^{(k+1)}\|}{\|e^{(k)}\|} = \frac{|\lambda_2|}{|\lambda_1|} \cdot \frac{|\alpha_2|}{|\alpha_2|} = \frac{|\lambda_2|}{|\lambda_1|}.$$
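The rate in problem (3) is easy to observe numerically. The script below is a sketch for illustration only (it is not part of the original solutions, and the $3 \times 3$ test matrix with eigenvalues 4, 2, 1 is made up); it prints the successive error ratios, which approach $|\lambda_2|/|\lambda_1| = 1/2$.

% Sketch (not part of the original solutions): observe the linear
% convergence constant |lambda_2|/|lambda_1| from problem (3).
lams = [4 2 1];                  % test eigenvalues, so C = 2/4 = 0.5
[Q,R] = qr(randn(3,3));          % random orthonormal eigenvectors q_j
A = Q*diag(lams)*Q';             % symmetric matrix with known eigenpairs
v0 = randn(3,1);
a1 = Q(:,1)'*v0;                 % alpha_1, the q_1 coefficient of v0
v = v0;
e_prev = norm(v - a1*Q(:,1));    % ||e^(0)||
for k = 1:30
    v = A*v/lams(1);             % v^(k) = alpha_1 q_1 + sum_j alpha_j (lambda_j/lambda_1)^k q_j
    e = norm(v - a1*Q(:,1));     % ||e^(k)||
    fprintf('k = %2d  ratio = %.6f\n', k, e/e_prev);
    e_prev = e;
end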
(4) The MATLAB function below can be used as either Pwr1 or Pwr2 by uncommenting the appropriate line.

function [v,lam,k] = Pwr(A,v0)
% This function uses the power iteration method to compute the
% eigenvalue of largest magnitude of the input matrix A, and a
% normalized eigenvector. It uses the input vector v0 as the
% starting vector for the power iteration sequence. The computed
% normalized eigenvector v, eigenvalue lam and the number of
% iterations used are returned.
max_loops = 500;     % upper bound on the number of iterations
epsilon = 10^(-8);
% this version does a little error checking
[m,n] = size(A); [p,q] = size(v0);
if ((m ~= n) | (p ~= m) | (q ~= 1))
    disp('error - A must be square and v0 a compatible column vector')
    return
end
v = v0; lam0 = v'*A*v; k = 0; diff = 1;   % initializations for the while loop
while ((diff > epsilon) & (k < max_loops))
    k = k+1;
    w = A*v;
    s = 1/norm(w);
    v = s*w;
    lam = v'*A*v;
    diff = abs(lam0 - lam);    % if used, this is Pwr2
    % diff = norm(v - v0);     % if used, this is Pwr1
    v0 = v;
    lam0 = lam;
end
MATLAB diary (edited)

>> A = diag([-4,2,1,1,1]) + triu(rand(5,5),1);
>> v0 = ones(5,1);
>> [v,lam,k] = Pwr(A,v0)
>> format long
>> [v,lam,k] = Pwr(A,v0)        (This is Pwr1)
v =
   1.00000000000000
lam =
  -4
k =
   500
>> [v,lam,k] = Pwr(A,v0)        (This is Pwr2)
v =
  -0.99999999999999
   0.00000016335687
   0.00000000000012
   0.00000000000001
lam =
  -4.00000000249500
k =
   25
>> A = diag([9,2,1,5,-8]) + triu(rand(5,5),1);
>> v0 = ones(5,1);
>> [v,lam,k] = Pwr(A,v0)        (This is Pwr1)
v =
   1.00000000000000
  -0.00000000009851
  -0.00000000042682
  -0.00000000020326
   0.00000000465153
lam =
   9.00000000296730
k =
   160
>> [v,lam,k] = Pwr(A,v0)        (This is Pwr2)
v =
   1.00000000000000
   0.00000000014026
   0.00000000060772
   0.00000000028941
  -0.00000000662299
lam =
   8.99999999577507
k =
   157
Discussion

The sequence of approximate eigenvectors does not converge in general. The first set of data shows this, where the sequence can be shown to be essentially $\ldots, e_1, -e_1, e_1, -e_1, \ldots$, an alternating sequence of vectors. The alternating sign is a result of the dominant eigenvalue being negative. This problem does not occur with the second set of data. In general it must be realized that the convergence of the sequence of vectors is to the eigenspace of the eigenvalue, and not to a specific eigenvector unless some additional normalization is enforced.

With the second set of data we see similar accuracy with either stopping criterion. The order of convergence is the same for the sequence of approximate eigenvalues and eigenvectors since the matrix is not symmetric. (How does this compare with the symmetric case?)

The number of iterations for both sets of data is consistent with the linear convergence estimate given in problem 3. Suppose we want $\|e^{(n)}\| < \epsilon$ and we know $\|e^{(n)}\| \le C \|e^{(n-1)}\|$, so that $\|e^{(n)}\| \le C^n \|e^{(0)}\|$. Then $n$ should be chosen so that $C^n \|e^{(0)}\| \le \epsilon$, or
$$n \ln C \le \ln(\epsilon/\|e^{(0)}\|) \approx \ln(\epsilon),$$
assuming $\|e^{(0)}\| \approx 1$. If $\epsilon = 10^{-8}$ then this gives $n \approx 8 \ln 10 / \ln(1/C)$. For the first set of data $C = |\lambda_2|/|\lambda_1| = 2/4 = 1/2$, giving $n \approx 8 \ln 10 / \ln 2 \approx 27$. For the second set of data $C = |\lambda_2|/|\lambda_1| = 8/9$, giving $n \approx 8 \ln 10 / \ln(9/8) \approx 156$.
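These counts can be checked with a couple of lines of MATLAB (an illustration added here, not part of the original solutions):

% Predicted iteration counts n ~ 8*ln(10)/ln(1/C) for epsilon = 1e-8
C = [1/2, 8/9];              % |lambda_2|/|lambda_1| for the two data sets
n = 8*log(10) ./ log(1./C)   % yields approximately 26.6 and 156.4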
(5) The MATLAB function below is a solution for problem 5.

function [v,lam,k] = Inv(A,v0,mu)
% This function uses inverse iteration with the shift mu. It returns
% the eigenvalue lam of A closest to mu and a normalized eigenvector v.
max_loops = 500;
epsilon = 10^(-8);
% this error checking is a little different than that of Pwr
[m,n] = size(A); [p,q] = size(v0);
if (m ~= n)
    disp('error - A must be a square matrix')
    return
end
if ((p ~= m) | (q ~= 1))
    disp('error - v0 must be a column compatible with A')
    return
end
v = v0/norm(v0);                  % initializations for the while loop
lam0 = v'*A*v; k = 0; diff = 1;
B = A - mu*eye(m,m);
while ((diff > epsilon) & (k < max_loops))
    k = k+1;
    w = B\v;
    s = 1/norm(w);
    v = s*w;
    lam = v'*A*v;
    diff = abs(lam0 - lam);
    lam0 = lam;
end
MATLAB diary (edited)

>> A = diag([9,2,1,5,-8]) + triu(rand(5,5),1);
>> v0 = ones(5,1);
>> [v,lam,k] = Inv(A,v0,8.8)
v =
   1.00000000000000
   0.00000000000652
   0.00000000000330
   0.00000000004371
lam =
   9.00000000003505
k =
   8

Discussion

Obviously the convergence is much faster here than observed with Pwr2. An improvement in the accuracy is also apparent. This is due to a rapid reduction in error at each step, which can be predicted using the linear convergence analysis of problem 3. This time the power iteration involves the matrix $(A - \mu I)^{-1}$. If $\lambda$ is an eigenvalue of $A$ then $(\lambda - \mu)^{-1}$ is an eigenvalue of $(A - \mu I)^{-1}$. With $\mu = 8.8$ the two largest in magnitude are $(9 - 8.8)^{-1} = 5$ and $(5 - 8.8)^{-1} \approx -0.26$. Thus $C = (1/3.8)/(1/0.2) = 0.2/3.8 \approx 0.053$, and $n \approx 8 \ln 10 / \ln(3.8/0.2) \approx 6$ or $7$.
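The same estimate can be computed directly from the known diagonal entries of the test matrix (a sketch added for illustration, not part of the original solutions):

% Convergence constant for inverse iteration with shift mu = 8.8
lams = [9 2 1 5 -8];          % eigenvalues of the triangular test matrix
mu = 8.8;
shifted = sort(abs(1./(lams - mu)), 'descend');  % |(lambda - mu)^(-1)|
C = shifted(2)/shifted(1)     % approximately 0.053
n = 8*log(10)/log(1/C)        % approximately 6.3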
(6) The MATLAB function below is a solution for problem 6.

function [v,lam,k] = Ray(A,v0)
% This function uses Rayleigh quotient iteration. It returns the
% eigenvalue lam and a normalized eigenvector v; the eigenvector
% is initially approximated by v0.
max_loops = 500;
epsilon = 10^(-8);
% some error checking
[m,n] = size(A); [p,q] = size(v0);
if (m ~= n)
    disp('error - A must be a square matrix')
    return
end
if ((p ~= m) | (q ~= 1))
    disp('error - v0 must be a column compatible with A')
    return
end
v = v0/norm(v0);
lam0 = v'*A*v; k = 0; diff = 1; Id = eye(m,m);   % initializations for while loop
while ((diff > epsilon) & (k < max_loops))
    k = k+1;
    B = A - lam0*Id;
    w = B\v;
    s = 1/norm(w);
    v = s*w;
    lam = v'*A*v;
    diff = abs(lam0 - lam);
    lam0 = lam;
end
MATLAB diary (edited)

>> A = diag([9,2,1,5,-8]) + triu(rand(5,5),1);
>> v0 = ones(5,1);
>> [v,lam,k] = Ray(A,v0)
v =
  -0.09920793658093
   0.99506672405390
lam =
   2
k =
   6
>> v0 = rand(5,1)
v0 =
   0.70273991324038
   0.54657115182911
   0.44488020467291
   0.69456724042555
   0.62131013079541
>> [v,lam,k] = Ray(A,v0)
Warning: Matrix is singular to working precision.
> In .../Ray.m at line 36
v =
   NaN
   NaN
   NaN
   NaN
   NaN
lam =
   NaN
k =
   7

Discussion

The convergence in this case is very fast, and in fact can be too rapid, as the second run shows: the shift lam0 agrees with an eigenvalue to working precision, so the shifted matrix is numerically singular and the solve produces NaNs. Various modifications can be used to take care of this problem; one is sketched below. Which eigenvalue and eigenvector are found depends on the initial vector v0. Using one of the other methods to get an initial approximation would make this more selective.
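For example, one possible modification (a sketch of the idea only; the rcond guard below is an assumption, not the course's prescribed fix) is to stop the iteration when the shifted matrix becomes numerically singular, since at that point lam already matches an eigenvalue to working precision:

% Sketch: Rayleigh quotient iteration with a singularity guard.
A = diag([9,2,1,5,-8]) + triu(rand(5,5),1);
v = rand(5,1);  v = v/norm(v);
lam = v'*A*v;  Id = eye(5,5);
for k = 1:500
    B = A - lam*Id;
    if rcond(B) < eps        % guard: shift matches an eigenvalue to
        break                % working precision; accept lam and v
    end
    w = B\v;
    v = w/norm(w);
    lamnew = v'*A*v;
    if abs(lamnew - lam) < 10^(-8)
        lam = lamnew;
        break
    end
    lam = lamnew;
end
lam, k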