APPM 4720/5720 Problem Set 2 Solutions

This assignment is due at the start of class on Wednesday, February 9th. Minimal credit will be given for incomplete solutions or solutions that do not provide details on how the solution is found. Problems marked with a ⋆ are required only for students enrolled in APPM 5720. You may discuss the problems with your classmates, but all work (analysis and code) must be your own.

1. (Watkins 2.2.5) We know that a matrix $A$ is singular if and only if $\det(A) = 0$. We might expect that a very small determinant indicates a poorly conditioned matrix. This turns out not to be true, and you will prove it in this exercise.

(a) Let $\alpha$ be a positive real number and consider the matrix
$$A = \begin{bmatrix} \alpha & 0 \\ 0 & \alpha \end{bmatrix}.$$
Show that for any induced matrix norm, $\|A\| = \alpha$, $\|A^{-1}\| = 1/\alpha$, and $\kappa(A) = 1$, while $\det(A) = \alpha^2$.

Since the determinant of a diagonal matrix is the product of its diagonal entries, it is immediate that $\det(A) = \alpha^2$. Since $A = \alpha I$, for any induced norm we have
$$\|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} = \max_{x \neq 0} \frac{\|\alpha I x\|}{\|x\|} = \alpha \max_{x \neq 0} \frac{\|x\|}{\|x\|} = \alpha.$$
Similarly, $A^{-1} = \alpha^{-1} I$, so $\|A^{-1}\| = 1/\alpha$. Then
$$\kappa(A) = \|A\| \, \|A^{-1}\| = \alpha \cdot \frac{1}{\alpha} = 1.$$

(b) More generally, given any nonsingular matrix $A \in \mathbb{R}^{n \times n}$, discuss the condition number and determinant of $\alpha A$, where $\alpha$ is any positive real number.

For positive $\alpha$ we have $\det(\alpha A) = \alpha^n \det(A)$, which means we can make the determinant of a matrix arbitrarily large (or small) by multiplying it by a scalar. On the other hand, for any induced matrix norm we have $\|\alpha A\| = \alpha \|A\|$ and $\|(\alpha A)^{-1}\| = \alpha^{-1} \|A^{-1}\|$, so
$$\kappa(\alpha A) = \alpha \|A\| \cdot \alpha^{-1} \|A^{-1}\| = \kappa(A),$$
which implies that multiplying $A$ by a scalar does not affect its condition number. The main take-away is that the determinant of a matrix has very little to do with its conditioning: matrices with large determinants can be perfectly well conditioned, and matrices with small determinants can be terribly ill-conditioned.
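The scaling argument in part (b) is easy to confirm numerically. The following is a minimal sketch (my own, not part of the assignment): scaling a random matrix changes its determinant by $\alpha^n$ while leaving its 2-norm condition number unchanged.

```python
import numpy as np

# Scaling a matrix changes det dramatically but leaves the condition
# number fixed. The seed and matrix size are arbitrary choices.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

for alpha in (1e-3, 1.0, 1e3):
    B = alpha * A
    # det(alpha*A) = alpha^n det(A), while kappa_2(alpha*A) = kappa_2(A)
    print(f"alpha={alpha:8.0e}  det={np.linalg.det(B):12.4e}  "
          f"cond={np.linalg.cond(B):.6f}")
```

The printed condition number is identical (up to rounding) for all three scalings, while the determinant varies over many orders of magnitude.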
2. (Watkins 2.5.9) Let $\hat{x}$ be an approximation to the solution of $Ax = b$, and let $r = b - A\hat{x}$. Define $\delta A \in \mathbb{R}^{n \times n}$ by $\delta A = \alpha r \hat{x}^T$, where $\alpha = 1/\|\hat{x}\|_2^2$.

(a) Show that $\hat{x}$ is the exact solution of $(A + \delta A)\hat{x} = b$.

We have
$$(A + \delta A)\hat{x} = \left(A + \frac{r\hat{x}^T}{\hat{x}^T\hat{x}}\right)\hat{x} = A\hat{x} + r\,\frac{\hat{x}^T\hat{x}}{\hat{x}^T\hat{x}} = A\hat{x} + r = A\hat{x} + b - A\hat{x} = b.$$

(b) Show that $\|\delta A\|_2 = \|r\|_2 / \|\hat{x}\|_2$ and
$$\frac{\|\delta A\|_2}{\|A\|_2} = \frac{\|r\|_2}{\|A\|_2 \|\hat{x}\|_2}.$$
Thus if $\|r\|_2$ is tiny relative to $\|A\|_2 \|\hat{x}\|_2$, then the algorithm (whichever algorithm was used) is backward stable for this problem.

We have
$$\|\delta A\|_2 = \max_{y \neq 0} \frac{\|\delta A\, y\|_2}{\|y\|_2} = \max_{y \neq 0} \frac{\|\alpha r \hat{x}^T y\|_2}{\|y\|_2} = \max_{y \neq 0} \frac{|\hat{x}^T y|}{\|y\|_2\, \hat{x}^T \hat{x}}\, \|r\|_2.$$
We are free to select $y$ to maximize this quantity. Since this is the Euclidean norm, $|\hat{x}^T y|$ over unit vectors $y$ is maximized when $y$ points in the same direction as $\hat{x}$. Thus we pick $y = \hat{x}/\|\hat{x}\|_2$, which gives
$$\|\delta A\|_2 = \frac{\hat{x}^T\hat{x}}{\|\hat{x}\|_2\, \hat{x}^T\hat{x}}\, \|r\|_2 = \frac{\|r\|_2}{\|\hat{x}\|_2}.$$
Dividing both sides by $\|A\|_2$ then gives
$$\frac{\|\delta A\|_2}{\|A\|_2} = \frac{\|r\|_2}{\|A\|_2 \|\hat{x}\|_2}.$$
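Both identities are easy to verify numerically. Below is a quick sketch (my own construction, not part of the original solution): a perturbed exact solution stands in for the output of some solver, and the perturbation size $10^{-8}$ is an arbitrary choice.

```python
import numpy as np

# Verify that dA = r xhat^T / ||xhat||_2^2 makes xhat an exact solution of
# (A + dA) xhat = b, and that ||dA||_2 = ||r||_2 / ||xhat||_2.
rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# A "computed" solution: the true solution plus a small arbitrary perturbation.
xhat = np.linalg.solve(A, b) + 1e-8 * rng.standard_normal(n)

r = b - A @ xhat
dA = np.outer(r, xhat) / (xhat @ xhat)

print(np.linalg.norm((A + dA) @ xhat - b))                  # ~ machine precision
print(np.linalg.norm(dA, 2), np.linalg.norm(r) / np.linalg.norm(xhat))  # equal
```

The 2-norm of a rank-one matrix $rv^T$ is $\|r\|_2\|v\|_2$, which is why the second line prints two matching numbers.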
3. (Watkins 2.5.9) In this exercise you will assess the backward stability of Gaussian elimination by calculating the error in the LU decomposition: $E = LU - A$. Write a MATLAB/OCTAVE/PYTHON program that computes the LU decomposition of $A$ both with and without partial pivoting.

(a) Calculate the LU decomposition of random matrices without pivoting for several choices of $n$ (e.g. $n = 40, 80, 160$), and note $\|L\|$, $\|U\|$, and the norm of the backward error $E = LU - A$. On the same matrices compute the LU decomposition with partial pivoting and calculate the same quantities. Discuss the effectiveness of both methods in terms of stability, and comment on the effect of partial pivoting.

The following tables show the mean values of the quantities of interest over 100 random matrices, first for LU with no pivoting and then for LU with partial pivoting. Notice that without pivoting the norms of $L$ and $U$ can grow quite large, while with pivoting $\|L\|$ remains small since all entries in $L$ are smaller than or equal to 1 in magnitude. While the error between $A$ and the computed $LU$ stays fairly small even without pivoting, the error is much better with pivoting.

Without pivoting:

    n     ‖L‖       ‖U‖       ‖LU − A‖
    40    357.3     5764.1    1.32e−12
    80    1250.1    44885     3.75e−11
    160   1135.3    1.39e5    2.89e−11

With partial pivoting:

    n     ‖L‖     ‖U‖      ‖LU − A‖
    40    15.3    69.3     6.62e−15
    80    29.1    89.2     2.09e−14
    160   53.3    152.9    6.25e−14

(b) To demonstrate the weakness of the LU decomposition without pivoting, give it a matrix for which one of the pivots is guaranteed to be small. The easiest way to do this is to use matrices whose $(1,1)$ entry is tiny. Repeat the experiments from (a) using matrices for which $a_{11}$ is tiny.

Here we perform tests identical to those above, except this time we set $a_{11} = 10^{-13}$ for each matrix. Again the results without and with pivoting are shown in the two tables below. Here we see just how bad LU without pivoting can be on matrices with small leading pivots.
The entries in both $L$ and $U$ without pivoting are extremely large, and the error in the decomposition is quite bad. With pivoting, however, the entries of $L$ and $U$ remain fairly small, and the decomposition is very accurate.

Without pivoting:

    n     ‖L‖       ‖U‖        ‖LU − A‖
    40    2.4e13    2.38e14    0.052
    80    2.60e13   5.39e14    0.085
    160   2.89e13   8.93e14    0.2435

With partial pivoting:

    n     ‖L‖     ‖U‖      ‖LU − A‖
    40    15.9    68.9     6.67e−15
    80    28.7    86.2     2.07e−14
    160   52.3    157.8    6.27e−14
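A minimal version of the experiment can be sketched as follows. This is my own implementation, not the one used to produce the tables above; the seed, $n = 40$, and function names are arbitrary choices.

```python
import numpy as np

def lu_nopivot(A):
    """Doolittle LU without pivoting: returns L, U with A = L @ U."""
    U = A.astype(float).copy()
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        L[k+1:, k] = U[k+1:, k] / U[k, k]               # multipliers
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])   # eliminate below pivot
    return L, np.triu(U)

def lu_pivot(A):
    """LU with partial pivoting: returns piv, L, U with A[piv] = L @ U."""
    U = A.astype(float).copy()
    n = U.shape[0]
    piv = np.arange(n)
    L = np.eye(n)
    for k in range(n - 1):
        p = k + np.argmax(np.abs(U[k:, k]))             # largest pivot in column k
        U[[k, p]] = U[[p, k]]                           # swap rows of U, the
        piv[[k, p]] = piv[[p, k]]                       # permutation record, and
        L[[k, p], :k] = L[[p, k], :k]                   # the computed part of L
        L[k+1:, k] = U[k+1:, k] / U[k, k]
        U[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])
    return piv, L, np.triu(U)

rng = np.random.default_rng(2)
n = 40
A = rng.standard_normal((n, n))
A[0, 0] = 1e-13                  # tiny (1,1) entry, as in part (b)

L, U = lu_nopivot(A)
piv, Lp, Up = lu_pivot(A)
print("no pivot:", np.linalg.norm(L @ U - A))         # large backward error
print("pivoted :", np.linalg.norm(Lp @ Up - A[piv]))  # ~ machine precision
```

The tiny pivot forces multipliers of size roughly $10^{13}$ in the unpivoted factorization, which is what drives the backward error up to $O(1)$.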
4. (Watkins 3.2.46) The following steps lead to a proof of the uniqueness of the QR decomposition.

(a) Suppose $B \in \mathbb{R}^{n \times n}$ is both orthogonal and upper triangular. Prove that $B$ must be a diagonal matrix whose main-diagonal entries are $\pm 1$.

Writing the orthogonality condition $B^T B = I$ in component form, with $B$ upper triangular, we have
$$\begin{bmatrix} b_{11} & & & \\ b_{12} & b_{22} & & \\ \vdots & \vdots & \ddots & \\ b_{1n} & b_{2n} & \cdots & b_{nn} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ & b_{22} & \cdots & b_{2n} \\ & & \ddots & \vdots \\ & & & b_{nn} \end{bmatrix} = I.$$
Let's form the first row of the product $B^T B$ one entry at a time. Taking the inner product of the first row of $B^T$ with the first column of $B$, we have
$$b_{11}^2 = 1 \quad \Longrightarrow \quad b_{11} = \pm 1.$$
Then, taking the inner product of the first row of $B^T$ with the $j$th column of $B$ for $j \geq 2$, we have
$$b_{11} b_{1j} = 0 \quad \Longrightarrow \quad b_{1j} = 0 \quad \text{for } j = 2, \ldots, n.$$
Having determined the entries in the first row of $B$ (and the first column of $B^T$), we now have
$$\begin{bmatrix} \pm 1 & & & \\ 0 & b_{22} & & \\ \vdots & \vdots & \ddots & \\ 0 & b_{2n} & \cdots & b_{nn} \end{bmatrix} \begin{bmatrix} \pm 1 & 0 & \cdots & 0 \\ & b_{22} & \cdots & b_{2n} \\ & & \ddots & \vdots \\ & & & b_{nn} \end{bmatrix} = I.$$
From this equation it is clear that the upper triangular $B$ is orthogonal iff $\hat{B}^T \hat{B} = I_{n-1}$, where $\hat{B} = B_{2:n,2:n}$. We can prove the result by applying the same argument as above recursively to $\hat{B}$.

(b) Suppose $Q_1 R_1 = Q_2 R_2$, where $Q_1$ and $Q_2$ are both orthogonal, and $R_1$ and $R_2$ are both upper triangular and nonsingular. Show that there is a diagonal matrix $D$ with main-diagonal entries $\pm 1$ such that $R_2 = D R_1$ and $Q_1 = Q_2 D$.

Since $Q_2$ is orthogonal and $R_1$ is nonsingular, we can multiply both sides of $Q_1 R_1 = Q_2 R_2$ on the left by $Q_2^T$ and on the right by $R_1^{-1}$ to obtain
$$D := Q_2^T Q_1 = R_2 R_1^{-1}.$$
It is easy to check that the inverse of an upper triangular matrix is also upper triangular, and that the product of two upper triangular matrices is upper triangular, so $R_2 R_1^{-1}$ is upper triangular. It is also easy to verify that $Q_2^T Q_1$ is orthogonal. Then $D$ is a matrix that is both orthogonal and upper triangular, so by part (a) $D$ is diagonal with diagonal entries $\pm 1$. It remains to be shown that the matrices satisfy the relations $R_2 = D R_1$ and $Q_1 = Q_2 D$. We have
$$D R_1 = (R_2 R_1^{-1}) R_1 = R_2 \quad \text{and} \quad Q_2 D = Q_2 (Q_2^T Q_1) = Q_1,$$
as desired.
(c) Suppose that $A \in \mathbb{R}^{n \times n}$ is nonsingular and there exist orthogonal $Q \in \mathbb{R}^{n \times n}$ and upper triangular $R \in \mathbb{R}^{n \times n}$ with positive main-diagonal entries such that $A = QR$. Use parts (a) and (b) to prove that this decomposition is unique.

Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and assume there exist orthogonal $Q$ and upper triangular $R$ with positive main-diagonal entries such that $A = QR$. Note also that since the main-diagonal entries of $R$ are nonzero, $R$ is nonsingular. Assume that $A$ can also be written as $A = \hat{Q}\hat{R}$, where $\hat{Q}$ and $\hat{R}$ satisfy the same requirements as above. Then we have $QR = \hat{Q}\hat{R}$. From part (b) there exists a diagonal matrix $D$ with main-diagonal entries $\pm 1$ such that $\hat{R} = DR$. Since both $R$ and $\hat{R}$ have positive main-diagonal entries, the only $D$ for which this works is the identity matrix. Then $\hat{R} = R$ and $\hat{Q} = Q$, and the decomposition is unique.
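This normalization shows up in practice: a library QR routine makes no promise about the signs on the diagonal of $R$, and the matrix $D$ from part (b) is exactly what converts its output into the unique positive-diagonal factorization. A small sketch (my own, not part of the original solution):

```python
import numpy as np

# np.linalg.qr may return R with negative diagonal entries. Multiplying by
# D = diag(sign(r_ii)) recovers the unique factorization with r_ii > 0.
rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))

Q, R = np.linalg.qr(A)
D = np.diag(np.sign(np.diag(R)))   # orthogonal, diagonal, entries +-1
Q2, R2 = Q @ D, D @ R              # A = (Q D)(D R) since D @ D = I

print(np.all(np.diag(R2) > 0))     # canonical positive-diagonal form
print(np.linalg.norm(Q2 @ R2 - A)) # still a QR factorization of A
```

Because $D^2 = I$, inserting $DD$ between $Q$ and $R$ changes neither the product nor the orthogonality of the first factor nor the triangularity of the second.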
5. (Watkins 3.2.65) Some of the most important transforming matrices are rank-one updates of the identity matrix (including Householder reflectors).

(a) Recall that the rank of a matrix is equal to the number of linearly independent columns. Prove that $A \in \mathbb{R}^{m \times m}$ has rank one if and only if there exist nonzero vectors $u, v \in \mathbb{R}^m$ such that $A = uv^T$. To what extent is there flexibility in the choice of $u$ and $v$?

($\Leftarrow$) Assume that there exist nonzero vectors $u$ and $v$ in $\mathbb{R}^m$ such that $A = uv^T$. Then the $j$th column of $A$ has the form $v_j u$, a scalar multiple of $u$. Since each column of $A$ is a multiple of the same nonzero vector, and $v \neq 0$ guarantees at least one column is nonzero, it must be the case that $\operatorname{rank}(A) = 1$.

($\Rightarrow$) Assume that $\operatorname{rank}(A) = 1$ and let $A_j$ represent the $j$th column of $A$. Since $A$ has rank one it must be the case that $\dim(\operatorname{span}\{A_1, \ldots, A_m\}) = 1$. We can easily construct (non-unique) nonzero vectors $u$ and $v$ such that $A = uv^T$. For instance, we can let $u = A_1$ (taking the first column to be nonzero; otherwise use any nonzero column) and define $v$ by
$$v_j = a_{1j}/a_{11} \quad \text{for } j = 1, \ldots, m.$$
We have some small amount of flexibility in choosing $u$ and $v$: we can choose $u$ to be any nonzero vector in $\mathcal{R}(A)$. Once $u$ is fixed, the entries of $v$ are uniquely determined by $v_j = a_{1j}/u_1$ for $j = 1, \ldots, m$.

(b) A matrix of the form $G = I - uv^T$ is called a rank-one update of the identity for obvious reasons. Prove that if $G$ is singular, then $Gx = 0$ if and only if $x$ is a multiple of $u$. Prove that $G$ is singular if and only if $v^T u = 1$.

Let $G = I - uv^T$ be singular. Then there exists some $x \neq 0$ such that $Gx = 0$. Then we have
$$(I - uv^T)x = 0 \quad \Longrightarrow \quad x = uv^T x = (v^T x)u = \alpha u,$$
so every null vector of $G$ is a multiple of $u$. Conversely, since some $\alpha u$ with $\alpha \neq 0$ lies in the null space, we have $Gu = 0$, and hence every multiple of $u$ is also a null vector.

($\Rightarrow$) Let $G = I - uv^T$ be singular. We showed above that any vector in the null space of $G$ must have the form $\alpha u$ with $\alpha \neq 0$. Then we have
$$G(\alpha u) = 0 \quad \Longrightarrow \quad \alpha u - \alpha(v^T u)u = 0 \quad \Longrightarrow \quad u = (v^T u)u \quad \Longrightarrow \quad v^T u = 1,$$
where the last implication uses $u \neq 0$.

($\Leftarrow$) Assume that $v^T u = 1$. Then we have
$$Gu = u - u(v^T u) = u - u = 0,$$
and $G$ is singular.

(c) Show that if $G = I - uv^T$ is nonsingular, then $G^{-1}$ has the form $I - \beta uv^T$. Give a formula for $\beta$.

This one we'll prove by construction (by actually finding the desired $\beta$).
We have
$$I = (I - uv^T)(I - \beta uv^T) = I - \beta uv^T - uv^T + \beta u(v^T u)v^T,$$
which implies that
$$\left[1 + \beta - \beta(v^T u)\right] uv^T = 0.$$
Solving for the $\beta$ that makes the scalar coefficient zero, we find
$$\beta = \frac{1}{v^T u - 1}.$$
We showed in part (b) that if $G$ is nonsingular then $v^T u \neq 1$, so $\beta$ is well-defined. Thus $G^{-1} = I - \beta uv^T$ where $\beta$ is as given above.

(d) Prove that if $G = I - uv^T$ is orthogonal, then $u$ and $v$ are proportional, i.e. $v = \rho u$ for some nonzero scalar $\rho$. Show that if $\|u\|_2 = 1$, then $\rho = 2$.

If $G = I - uv^T$ is orthogonal then $G^{-1} = G^T$. We know the form of $G^{-1}$ from part (c), so we have
$$G^T = I - vu^T = G^{-1} = I - \beta uv^T.$$
Applying both sides to $u$, we have
$$v(u^T u) = \beta(v^T u)u \quad \Longrightarrow \quad v = \frac{\beta\, v^T u}{u^T u}\, u = \rho u, \quad \text{where } \rho = \frac{\beta\, v^T u}{u^T u}.$$
Now assume that $\|u\|_2 = 1$. Letting $v = \rho u$, the condition $G^T G = I$ becomes
$$(I - \rho uu^T)^T (I - \rho uu^T) = I - 2\rho\, uu^T + \rho^2 \|u\|_2^2\, uu^T = I.$$
Solving the resulting quadratic $\rho^2 - 2\rho = 0$ gives $\rho = 0$ or $\rho = 2$. Since $v$ is nonzero we know $\rho \neq 0$, so we conclude that $\rho = 2$.

(e) Show that if $G = I - uv^T$ and $W \in \mathbb{R}^{m \times n}$, then $GW$ can be computed in about $4mn$ flops if the arithmetic is done in an efficient way.

Notice that we can write $GW = (I - uv^T)W = W - u(v^T W)$. We first compute $x^T = v^T W$, which requires taking the inner product of $v$ with each of the $n$ columns of $W$. Each inner product costs approximately $2m$ flops, and there are $n$ of them, for a total of $2mn$ flops. We then form the $m \times n$ outer-product matrix $ux^T$, which costs $mn$ flops. Finally we subtract $W - ux^T$ for another $mn$ flops, for a grand total of $4mn$ flops.
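The flop count in (e) translates directly into code: never form $G$ explicitly. A minimal sketch (dimensions and names are my own choices):

```python
import numpy as np

# Apply G = I - u v^T to W in O(mn) as W - u (v^T W), and compare against
# the O(m^2 n) explicit product for correctness.
rng = np.random.default_rng(4)
m, n = 300, 120
u = rng.standard_normal(m)
v = rng.standard_normal(m)
W = rng.standard_normal((m, n))

x = v @ W                        # x^T = v^T W: n inner products, ~2mn flops
GW = W - np.outer(u, x)          # outer product + subtraction: ~2mn flops

G = np.eye(m) - np.outer(u, v)   # explicit G, for comparison only
print(np.linalg.norm(GW - G @ W))
```

This is exactly how Householder reflectors are applied inside QR algorithms: the reflector is stored as the vector $u$, never as an $m \times m$ matrix.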