AM205: Assignment 2

Question 1 [10 points]

(a) [4 points] For p ≥ 1, the p-norm of a vector x ∈ R^n is defined as

    ‖x‖_p ≡ ( Σ_{i=1}^{n} |x_i|^p )^{1/p}.   (*)

This definition is in fact meaningful for p < 1 as well, although it does not define a norm in that case because it does not satisfy the triangle inequality. The Matlab function norm(x,p) evaluates (*) for any real value of p, even when it is not technically a norm.

On a single figure, plot the p-norm unit circle in R^2, i.e. {v ∈ R^2 : ‖v‖_p = 1}, for p = 1, 1.5, 2, 5, 10. Also include the infinity-norm unit circle and the p = 0.5 unit circle. What do you see happening to the p-norm unit circles as p gets larger? Provide a specific counter-example that shows that the p = 0.5 norm does not satisfy the triangle inequality.

(b) [2 points] For p = 2, plot the image of the p-norm unit circle under the action of the matrix

    A = [3 1; 1 0].

(Note that the image of a set C under the action of A is {Ax : x ∈ C}.) Indicate a point on your plot such that the p-norm of the point is equal to the p-norm of A (the point need not be unique). Repeat the problem with p = ∞.

(c) [4 points] (No code required for this part.) Let A ∈ R^{m×n}. Prove that for v ∈ R^n we have ‖v‖_∞ ≤ ‖v‖_2 ≤ √n ‖v‖_∞, and hence show that ‖A‖_2 ≤ √m ‖A‖_∞.

Question 2 [20 points]

(a) [4 points] (No code required for this part.) Solve Ax = b for

    A = [2 1 1 0; 4 3 3 1; 8 7 9 5; 6 7 9 8],   b = [2; 0; 1; 3],

by first forming the LU factorization without pivoting and then applying forward and back substitution. You should show all the steps in this process in your write-up.
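For orientation before part (b) below, a forward-substitution loop for a lower-triangular system Lx = b typically looks like the following in Matlab. This is a minimal illustrative sketch only; the function name and interface are one possible choice, not a requirement of the assignment.

    % Minimal sketch of forward substitution for L*x = b, where L is lower
    % triangular with nonzero diagonal entries.
    function x = forwardsub(L, b)
        n = length(b);
        x = zeros(n, 1);
        for i = 1:n
            % subtract the contributions of the unknowns already computed
            x(i) = (b(i) - L(i, 1:i-1) * x(1:i-1)) / L(i, i);
        end
    end

Back substitution mirrors this loop, running from i = n down to 1 and using the entries to the right of the diagonal.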

(b) [10 points] Write three functions:

- forwardsub, which takes a lower-triangular matrix L and a right-hand-side vector b as arguments and returns the solution of the system Lx = b,
- backsub, which takes an upper-triangular matrix U and a right-hand-side vector b and returns the solution of the system Ux = b,
- mylu, which takes a matrix A as an argument, performs LU factorization with partial pivoting, and returns matrices L, U and P. (Do not use Matlab's built-in LU functions in mylu.)

Show that your mylu function accurately factorizes a set of ten randomly generated matrices A_n ∈ R^{n×n} for n = 10:10:100, by plotting the relative 2-norm error ‖P A_n - L U‖_2 / ‖A_n‖_2 as a function of n. Also, verify your answer from (a) using mylu, forwardsub and backsub.

(c) [6 points] Use the Matlab function hilb to generate the 12 × 12 Hilbert matrix. (Hilbert matrices are famously ill-conditioned.) Let x = [1, 1, ..., 1]^T ∈ R^12 and let b = Ax. Use mylu, forwardsub and backsub from (b) to solve Ax = b, and let x̂ denote the solution you obtain. Compute the 2-norm relative residual

    r(x̂) ≡ ‖b - A x̂‖_2 / (‖A‖_2 ‖x̂‖_2)

and the relative error ‖x - x̂‖_2 / ‖x̂‖_2, and discuss the relationship between these two values.

Question 3 [10 points]

Matrices of the form

    G_n = [  1   0   0   0   1
            -1   1   0   0   1
            -1  -1   1   0   1
            -1  -1  -1   1   1
            -1  -1  -1  -1   1 ]  ∈ R^{n×n},

with ones on the diagonal and in the last column, -1 below the diagonal, and zeros elsewhere (shown above for n = 5), are examples of the very rare situation in which Gaussian elimination is numerically unstable. Write a function generateg that returns G_n. Then, for n = 10:10:200, set x = [1, 1, ..., 1]^T ∈ R^n and construct a right-hand-side vector b = G_n x. Solve G_n x = b using backslash (which uses Gaussian elimination with partial pivoting), and let x̂ denote the computed solution. Plot the 2-norm relative error as a function of n, and also include a plot that shows that the inequality

    ‖x - x̂‖_2 / ‖x̂‖_2 ≤ cond(A) r(x̂),  where r(x̂) ≡ ‖b - A x̂‖_2 / (‖A‖_2 ‖x̂‖_2),

is satisfied. Based on your results, explain why Gaussian elimination with partial pivoting applied to this case is considered to be numerically unstable.

Question 4 [24 points]

(a) [2 points] (Written question, no code required.) Determine a Householder transformation matrix Q that annihilates all but the first entry of the vector a = [2, 1, 0, 2]^T (of the two possibilities, choose the Householder matrix Q recommended for numerical stability). What is the first entry of Qa?
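As a reminder of the construction involved in part (a), a Householder reflector that maps a vector to a multiple of e_1 can be built and checked numerically in a few lines of Matlab. The sketch below is illustrative only, uses a random vector rather than the one from the question, and is not a substitute for the hand computation requested above.

    % Illustrative sketch: build the Householder reflector Q = I - 2*v*v'/(v'*v)
    % that annihilates all but the first entry of a vector a.  Adding
    % sign(a(1))*norm(a) to a(1) avoids cancellation (assumes a(1) ~= 0).
    a = randn(4, 1);
    v = a;
    v(1) = v(1) + sign(a(1)) * norm(a);
    Q = eye(4) - 2 * (v * v') / (v' * v);
    Q * a        % all entries except the first are (numerically) zero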

(b) [7 points] (Written question, no code required.) An alternative method for constructing a QR factorization that we did not discuss in lectures is the method of Givens rotations. Givens rotations are generally less efficient than Householder reflections for factorization of dense matrices, but can have advantages for sparse matrices. A Givens rotation matrix is an orthogonal matrix which annihilates a single component of a vector, and it is represented by a matrix of the form (where i < j)

    G(i, j, θ) = [ 1   0   0   0
                   0   c   s   0
                   0  -s   c   0
                   0   0   0   1 ]   (shown here for i = 2, j = 3 in the 4 × 4 case),

with entries g_ii = c ≡ cos(θ), g_jj = c, g_ij = s ≡ sin(θ), g_ji = -s, g_kk = 1 for k ≠ i, j, and g_mn = 0 otherwise.

When acting from the left, G(i, j, θ) will only modify rows i and j of a matrix A, and we choose c and s in order to annihilate an entry of row j of A. In particular, let a_1 and a_2 denote entries in a column of A; then we can introduce a zero into the column as follows:

    [c s; -s c] [a_1; a_2] = [α; 0].

Since the matrix in the equation above is orthogonal, we must have α = √(a_1^2 + a_2^2), and then it also follows that c = a_1 / √(a_1^2 + a_2^2) and s = a_2 / √(a_1^2 + a_2^2).

The formulas above work, but they are more susceptible to underflow or overflow than necessary, due to the √(a_1^2 + a_2^2) factor. Hence it is better to formulate the matrix differently to avoid calculating this factor explicitly. To achieve this, c and s can be calculated in terms of the tangent or cotangent of the rotation angle (denoted t and τ below, respectively) as follows:

- If |a_1| > |a_2|, set t ≡ tan(θ) = a_2 / a_1, and hence c = 1 / √(1 + t^2), s = c t.
- If |a_2| > |a_1|, set τ ≡ cot(θ) = a_1 / a_2, and hence s = 1 / √(1 + τ^2), c = s τ.

With these definitions, |t| < 1 (resp. |τ| < 1), and hence there is no danger of overflow or underflow in the term 1 + t^2 (resp. 1 + τ^2).

Use Givens rotation matrices to compute the QR factorization by hand for the following matrix:

    A = [5 3 0; 3 3 9; 4 1 3].

Show each G_i explicitly, as well as the intermediate products G_1 A and G_2 G_1 A. Also, write out the Q and R factors that you obtain from this process.
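The tangent/cotangent formulation above is easy to package as a small helper, which may also be convenient for the givens routine in part (c) below. This is a minimal illustrative sketch; the function name is not prescribed by the assignment.

    % Illustrative sketch: stable computation of the Givens coefficients c, s
    % such that [c s; -s c] * [a1; a2] has a zero second component.
    function [c, s] = givens_cs(a1, a2)
        if a2 == 0
            c = 1; s = 0;
        elseif abs(a1) > abs(a2)
            t = a2 / a1;              % t = tan(theta), |t| < 1
            c = 1 / sqrt(1 + t^2);
            s = c * t;
        else
            tau = a1 / a2;            % tau = cot(theta), |tau| <= 1
            s = 1 / sqrt(1 + tau^2);
            c = s * tau;
        end
    end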

(c) [15 points] Now you will implement and compare several QR factorization algorithms:

- Write a Matlab function [Q_hat, R_hat] = mgs(A) that returns the reduced QR factorization of a matrix A ∈ R^{m×n}, m ≥ n, using the modified Gram-Schmidt method.
- Write a Matlab function [W, R] = householder(A) that computes an implicit representation of a full QR factorization of A ∈ R^{m×n}, m ≥ n, using Householder reflections. The output variables are a lower-triangular matrix W ∈ R^{m×n} whose columns are the vectors v_k defining the successive Householder reflections, and an upper-triangular matrix R ∈ R^{n×n}. Write a second function Q = formq(W) that takes the matrix W produced by householder and returns the corresponding orthogonal matrix Q.
- Write a Matlab function [Q, R] = givens(A) that computes the full QR factorization of a matrix A ∈ R^{m×n}, m ≥ n, using Givens rotations. This function should use the tan/cotan formulation of a Givens matrix.

Use the command vander(linspace(-1,1,n)) to generate a set of n × n Vandermonde matrices, for n = 10:10:100, and plot the error ‖A - Q R‖_2 / ‖A‖_2 for the QR factorizations obtained from mgs, householder/formq and givens. Also plot ‖Q^T Q - I‖_2, and hence discuss how well each method preserves the orthogonality of Q in this case.

Question 5 [16 points]

Suppose we are given a dataset represented by a matrix X ∈ R^{m×n}. One popular way to analyze such a dataset is to perform Principal Component Analysis (PCA), which consists of computing the eigenvalue decomposition of the matrix XX^T (in statistics this matrix is related to correlations between the data in the rows of X). The eigenvectors of XX^T are referred to as principal components, and the eigenvalues relate to the variance in the dataset associated with the principal components.

(a) [1 point] (Written question, no code required.) We discussed the relationship between the eigenvalue decomposition of XX^T and the SVD of X in lectures. Based on this relationship, show that the principal components are the same as the left singular vectors of X.

(b) [3 points] It can be less accurate to perform PCA by computing eigenvalues of XX^T as opposed to using the SVD. For example, consider the matrix

    X = [ 1   10^-8
          1   10^-8
          1  -10^-8
          1  -10^-8 ].

What are the exact eigenvalues of XX^T? (You may find these by hand or use a symbolic computing environment like Mathematica or Maple.) Also, using Matlab (or another numerical environment of your choosing), compute: (i) the eigenvalues of XX^T, (ii) the squares of the singular values of X. Which method, (i) or (ii), is better in this case?

(c) [12 points] In this question we will use PCA (or, equivalently, the SVD) to do some analysis of the data provided in the file q5_c.mat, which provides a matrix X ∈ R^{10×20}.

(i) Recall from linear algebra that the orthogonal projection of a vector x ∈ R^m onto a subspace W of R^m is defined as follows:

    v^T (P x - x) = 0  for all v ∈ W,   (1)

where P x ∈ W denotes the projection of x onto W. (This formula states that the projection error P x - x is orthogonal to W.)
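As a concrete illustration of (1), the sketch below projects a random vector onto the span of two orthonormal vectors and verifies that the projection error is orthogonal to that span. In part (i) the basis vectors would instead be the first two principal components (left singular vectors) of X; the names used here are purely illustrative.

    % Illustrative check of (1): project x onto the span W of the two
    % orthonormal columns of Q2, and confirm that P*x - x is orthogonal to W.
    m  = 10;
    [Q2, ~] = qr(randn(m, 2), 0);   % Q2: m x 2 orthonormal basis for W
    x  = randn(m, 1);
    Px = Q2 * (Q2' * x);            % orthogonal projection of x onto W
    norm(Q2' * (Px - x))            % ~ 0, as required by (1)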

The principal components of X provide an orthogonal basis that we can use to form a low-dimensional approximation to X via orthogonal projection. Compute the orthogonal projection of each column of X onto the first two principal components, and form a 2D scatter plot of the projected data. In your plot, the coefficients of the first and second principal components should correspond to the horizontal and vertical axes, respectively.

(ii) The total projection error from (i) is given by

    E_P = ( Σ_{j=1}^{20} ‖P x_j - x_j‖_2^2 )^{1/2},   (2)

where x_j denotes the j-th column of X. Give an analytical formula for E_P that involves the singular values of X. (Hint: for P denoting the orthogonal projection operator onto the span of the first two principal components, it can be shown that the matrix [P x_1, P x_2, ..., P x_n] ∈ R^{10×20} minimizes the total projection error, as defined in (2), among all rank-2 matrices.) Also, calculate E_P directly based on your results for P x_j, j = 1, ..., 20 from (i). Verify that your calculated value agrees with the result of evaluating your formula for E_P (i.e. the formula from above that involves singular values).

Question 6 [20 points]

As discussed in lectures, it is highly desirable to minimize fill-in when performing sparse matrix factorizations. The amount of fill-in that occurs when applying a factorization is very sensitive to the order in which the rows or columns of the matrix are processed. By re-ordering the matrix beforehand it is often possible to greatly reduce the amount of fill-in.

The reverse Cuthill-McKee algorithm (RCM) [1] is a popular method for finding a re-ordering of a sparse symmetric matrix that will reduce the matrix's bandwidth [2]. This bandwidth reduction leads to reduced fill-in when computing a matrix factorization.

Let A ∈ R^{n×n} denote a sparse symmetric matrix. The RCM algorithm constructs a re-ordering of A by considering a graph with n vertices which represents the sparsity pattern of A [3]. For example, in the figure below we have a graph (right-most figure) with an adjacency matrix (middle figure) that matches the sparsity pattern of a matrix (left-most figure).

[1] E. Cuthill and J. McKee. Reducing the bandwidth of sparse symmetric matrices. In Proceedings of the 1969 24th National Conference, pages 157-172, New York, NY, USA, 1969. ACM Press.
[2] Sometimes this is also called the bandwidth minimization problem. Exact minimization of bandwidth turns out to be an NP-complete problem, but heuristic approaches can find pretty good solutions in a reasonable amount of time.
[3] The re-ordering provided by the RCM algorithm is not necessarily unique. Also note that if the graph corresponding to A has any disconnected sub-graphs then the RCM algorithm must be applied to each of them individually.
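To make the fill-in phenomenon concrete, here is a tiny toy illustration in Matlab (an arrowhead matrix, not the assignment matrix): placing the dense row and column first produces a completely dense Cholesky factor, while simply reversing the ordering eliminates the fill-in entirely.

    % Toy illustration of fill-in: a symmetric positive-definite "arrowhead" matrix.
    n = 10;
    A = speye(n); A(1, :) = 1; A(:, 1) = 1; A(1, 1) = n;
    nnz(chol(A, 'lower'))          % dense row/column first: n*(n+1)/2 nonzeros
    p = n:-1:1;                    % reverse ordering puts the dense row/column last
    nnz(chol(A(p, p), 'lower'))    % no fill-in: 2*n - 1 nonzeros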

Note that an adjacency matrix for a graph indicates connections between vertices with non-zero entries. Let Adj(v) denote the adjacency set for the vertex v, i.e. the set of vertices connected to v. We can obtain Adj(v) by looking at the column (or row) of the adjacency matrix corresponding to v. Also, the degree of a vertex v in a graph is given by length(Adj(v)).

Due to the close connection between the graph and the sparsity pattern, we can calculate the bandwidth of the sparsity pattern directly from the labels of the graph, as follows:

    bw = max { |L_i - L_j| : vertices i and j are connected } + 1,  where L_i denotes the label of vertex i.

The RCM algorithm constructs a re-labeling of the vertices of the graph in order to reduce the bandwidth of the adjacency matrix. A re-labeling is given by a vector of n vertex labels, where the j-th entry indicates the new label of vertex j. Equivalently, this re-labeling vector yields a symmetric re-ordering of the matrix A, as you will see below.

The RCM algorithm is described in Algorithm 1. In general the choice of the initial vertex is important in RCM, and it can be provided by other heuristic algorithms. For our purposes, just use the simple initial choice in step 1 of Algorithm 1.

Algorithm 1: Reverse Cuthill-McKee Algorithm
1: Find a vertex p_1 with the lowest degree in the graph, and initialize the vector p = [p_1].
2: for i = 1, 2, ... do
3:   Let p_i denote the i-th entry of p. Construct the vector A_i, which is the adjacency set for p_i excluding any vertices that are already in p.
4:   Sort A_i in ascending degree order (i.e. the first vertex in A_i has lowest degree).
5:   Append A_i to p, i.e. in Matlab notation we have p = [p, A_i].
6:   if length(p) = n then
7:     Terminate the loop.
8:   end if
9: end for
10: (Reverse the ordering.) The RCM ordering is given by r = [p_n, p_{n-1}, ..., p_2, p_1].

(a) [12 points] (Written question, no code required.) Consider the following symmetric matrix A ∈ R^{9×9}:

    A = 40 3 9 8 3 45 5 1 9 45 5 45 5 45 45 8 1 45 3 1 5 3 45 1 39

Draw a labeled graph whose adjacency matrix matches the sparsity pattern of A. Use the RCM algorithm by hand to construct a re-ordering vector r. You should show what happens at each step of the algorithm. (Note that the reordering obtained from Algorithm 1 is not necessarily unique.)

Let Â be the re-ordered matrix, where Â_ij = A(r(i), r(j)). Plot the sparsity patterns of A and Â and report the bandwidth in each case.

Let b = [1, 2, 3, ..., 9]^T ∈ R^9. Solve Ax = b and Â x̂ = b̂, where x̂ and b̂ are the appropriately reordered versions of x and b, respectively. (You just need to report the results of these solves; you do not need to do the solves by hand.) Show that the solutions x and x̂ are the same when appropriately re-ordered by r.
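The final check in part (a) amounts to a one-line comparison once the reordering vector is in hand. The sketch below illustrates it with a random symmetric positive-definite system and a random permutation standing in for the RCM ordering; all names are illustrative.

    % Illustrative check: a symmetrically re-ordered system has the same
    % solution as the original, up to the same re-ordering of its entries.
    n = 9;
    B = randn(n);  A = B'*B + eye(n);   % a generic symmetric positive-definite matrix
    b = (1:n)';
    r = randperm(n);                    % stand-in for the RCM ordering
    x    = A \ b;                       % original system
    xhat = A(r, r) \ b(r);              % re-ordered system
    norm(xhat - x(r))                   % ~ 0: solutions agree after re-ordering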

(b) [8 points] Load the matrix q6_b.mat (provided with the assignment), which contains a symmetric positive-definite sparse matrix. Compute the bandwidth of A, and plot the sparsity pattern of its Cholesky factor L, where A = L L^T.

Next, use Matlab's implementation of RCM (the built-in function symrcm) to determine a re-ordering of A. Let Â denote the re-ordered matrix, and note that if r is a reordering vector, we can use the syntax A_hat = A(r,r) to obtain Â in Matlab. Compute the bandwidth of Â, and plot the sparsity pattern of its Cholesky factor L̂.

How many non-zero entries are there in the Cholesky factors of A and Â? Did RCM reduce fill-in for the Cholesky factorization?
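A minimal sketch of the workflow for part (b) is shown below. It assumes the .mat file stores the sparse matrix in a variable named A (adjust the name to whatever the file actually contains); all other names are illustrative.

    % Illustrative sketch for part (b): compare Cholesky fill-in before and
    % after the reverse Cuthill-McKee re-ordering.
    load q6_b.mat                     % assumed to provide the sparse SPD matrix A
    L    = chol(A, 'lower');          % Cholesky factor in the original ordering
    r    = symrcm(A);                 % RCM re-ordering vector
    Lhat = chol(A(r, r), 'lower');    % Cholesky factor after re-ordering
    subplot(1, 2, 1); spy(L);    title('chol(A)');
    subplot(1, 2, 2); spy(Lhat); title('chol(A(r,r))');
    [nnz(L), nnz(Lhat)]               % compare the number of non-zeros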