EE364a Review Session 7

Review session outline:
- derivatives and chain rule (Appendix A.4)
- numerical linear algebra (Appendix C)
- factor and solve method
- exploiting structure and sparsity

Derivative and gradient

When $f$ is real-valued (i.e., $f : \mathbf{R}^n \to \mathbf{R}$), the derivative $Df(x)$ is a $1 \times n$ matrix, i.e., a row vector.
- its transpose is called the gradient of the function: $\nabla f(x) = Df(x)^T$, which is a (column) vector, i.e., in $\mathbf{R}^n$
- its components are the partial derivatives of $f$: $\nabla f(x)_i = \partial f(x) / \partial x_i$, $i = 1, \ldots, n$
- the first-order approximation of $f$ at a point $x$ can be expressed as (the affine function of $z$) $f(x) + \nabla f(x)^T (z - x)$

Example: find the gradient of $g : \mathbf{R}^m \to \mathbf{R}$,
$$g(y) = \log \sum_{i=1}^m \exp y_i.$$
Solution:
$$\nabla g(y) = \frac{1}{\sum_{i=1}^m \exp y_i} \begin{bmatrix} \exp y_1 \\ \vdots \\ \exp y_m \end{bmatrix}$$

Chain rule

Suppose $f : \mathbf{R}^n \to \mathbf{R}^m$ and $g : \mathbf{R}^m \to \mathbf{R}^p$ are differentiable. Define $h : \mathbf{R}^n \to \mathbf{R}^p$ by $h(x) = g(f(x))$. Then
$$Dh(x) = Dg(f(x)) \, Df(x).$$

Composition with an affine function: suppose $g : \mathbf{R}^m \to \mathbf{R}^p$ is differentiable, $A \in \mathbf{R}^{m \times n}$, and $b \in \mathbf{R}^m$. Define $h : \mathbf{R}^n \to \mathbf{R}^p$ as $h(x) = g(Ax + b)$. The derivative of $h$ is
$$Dh(x) = Dg(Ax + b) \, A.$$
When $g$ is real-valued (i.e., $p = 1$),
$$\nabla h(x) = A^T \nabla g(Ax + b).$$

Example A.2: find the gradient of $h : \mathbf{R}^n \to \mathbf{R}$,
$$h(x) = \log \sum_{i=1}^m \exp(a_i^T x + b_i),$$
where $a_1, \ldots, a_m \in \mathbf{R}^n$ and $b_1, \ldots, b_m \in \mathbf{R}$.

Hint: $h$ is the composition of the affine function $Ax + b$, where $A \in \mathbf{R}^{m \times n}$ has rows $a_1^T, \ldots, a_m^T$, and the function $g(y) = \log(\sum_{i=1}^m \exp y_i)$.

Solution: since
$$\nabla g(y) = \frac{1}{\sum_{i=1}^m \exp y_i} \begin{bmatrix} \exp y_1 \\ \vdots \\ \exp y_m \end{bmatrix},$$
the composition formula gives
$$\nabla h(x) = \frac{1}{\mathbf{1}^T z} A^T z, \qquad \text{where } z_i = \exp(a_i^T x + b_i), \quad i = 1, \ldots, m.$$
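
To make this concrete, here is a small numerical check of the gradient formula (a minimal sketch; m, n, A, b, and the test point x are arbitrary illustrative data, not from the session):

    % check grad h(x) = A'*z/(1'*z) against a forward-difference approximation
    m = 5; n = 3;
    A = randn(m, n); b = randn(m, 1); x = randn(n, 1);
    f = @(x) log(sum(exp(A*x + b)));
    z = exp(A*x + b);
    g = A'*z / sum(z);               % gradient from the composition formula
    del = 1e-6; g_fd = zeros(n, 1);
    for i = 1:n
        ei = zeros(n, 1); ei(i) = 1;
        g_fd(i) = (f(x + del*ei) - f(x)) / del;
    end
    disp(norm(g - g_fd))             % small, on the order of del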

Second derivative

When $f$ is real-valued (i.e., $f : \mathbf{R}^n \to \mathbf{R}$), the second derivative or Hessian matrix $\nabla^2 f(x)$ is an $n \times n$ matrix, with components
$$\nabla^2 f(x)_{ij} = \frac{\partial^2 f(x)}{\partial x_i \partial x_j}, \qquad i = 1, \ldots, n, \quad j = 1, \ldots, n.$$
The second-order approximation of $f$, at or near $x$, is the quadratic function of $z$ defined by
$$\hat{f}(z) = f(x) + \nabla f(x)^T (z - x) + (1/2)(z - x)^T \nabla^2 f(x) (z - x).$$

Chain rule for second derivative

Composition with a scalar function: suppose $f : \mathbf{R}^n \to \mathbf{R}$, $g : \mathbf{R} \to \mathbf{R}$, and $h(x) = g(f(x))$. The second derivative of $h$ is
$$\nabla^2 h(x) = g'(f(x)) \nabla^2 f(x) + g''(f(x)) \nabla f(x) \nabla f(x)^T.$$

Composition with an affine function: suppose $g : \mathbf{R}^m \to \mathbf{R}$, $A \in \mathbf{R}^{m \times n}$, and $b \in \mathbf{R}^m$. Define $h : \mathbf{R}^n \to \mathbf{R}$ by $h(x) = g(Ax + b)$. The second derivative of $h$ is
$$\nabla^2 h(x) = A^T \nabla^2 g(Ax + b) \, A.$$

Example A.4: find the Hessian of $h : \mathbf{R}^n \to \mathbf{R}$,
$$h(x) = \log \sum_{i=1}^m \exp(a_i^T x + b_i),$$
where $a_1, \ldots, a_m \in \mathbf{R}^n$ and $b_1, \ldots, b_m \in \mathbf{R}$.

Hint: for $g(y) = \log(\sum_{i=1}^m \exp y_i)$, we have
$$\nabla g(y) = \frac{1}{\sum_{i=1}^m \exp y_i} \begin{bmatrix} \exp y_1 \\ \vdots \\ \exp y_m \end{bmatrix}, \qquad \nabla^2 g(y) = \mathop{\bf diag}(\nabla g(y)) - \nabla g(y) \nabla g(y)^T.$$

Solution: using the chain rule for composition with an affine function,
$$\nabla^2 h(x) = A^T \left( \frac{1}{\mathbf{1}^T z} \mathop{\bf diag}(z) - \frac{1}{(\mathbf{1}^T z)^2} z z^T \right) A, \qquad \text{where } z_i = \exp(a_i^T x + b_i), \quad i = 1, \ldots, m.$$
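
The same kind of finite-difference sanity check works for the Hessian (a sketch reusing the illustrative A, b, x, n from the gradient check above):

    % check the Hessian formula by differencing the gradient
    gradh = @(x) A'*exp(A*x + b) / sum(exp(A*x + b));
    z = exp(A*x + b);
    H = A' * (diag(z)/sum(z) - (z*z')/sum(z)^2) * A;
    del = 1e-6; H_fd = zeros(n, n);
    for i = 1:n
        ei = zeros(n, 1); ei(i) = 1;
        H_fd(:, i) = (gradh(x + del*ei) - gradh(x)) / del;
    end
    disp(norm(H - H_fd))             % small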

Numerical linear algebra

Factor-solve method for $Ax = b$:
- consider a set of $n$ linear equations in $n$ variables, i.e., $A$ is square
- computational cost is $f + s$, where $f$ is the flop count of the factorization and $s$ is the flop count of the solve step
- for a single factorization and $k$ solves, the computational cost is $f + ks$

LU factorization

A nonsingular matrix $A$ can be decomposed as $A = PLU$
- $f = (2/3)n^3$ (Gaussian elimination)
- $s = 2n^2$ (forward and back substitution)

For example, we can compute an $n \times n$ matrix inverse with cost $f + ns = (8/3)n^3$ (why?)

Solution: write $AX = I$ as $A [x_1 \cdots x_n] = [e_1 \cdots e_n]$, then solve $Ax_i = e_i$ for $i = 1, \ldots, n$ (see the sketch below).
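
In Matlab the factor-once, solve-n-times pattern might look as follows (a minimal sketch; the test matrix is arbitrary, and note that Matlab's lu returns P with P*A = L*U):

    % invert A with one LU factorization and n solves
    n = 100;
    A = randn(n) + n*eye(n);         % arbitrary well-conditioned test matrix
    [L, U, P] = lu(A);               % factor once: P*A = L*U
    X = zeros(n);
    for i = 1:n                      % n solve steps, one per column of I
        ei = zeros(n, 1); ei(i) = 1;
        X(:, i) = U \ (L \ (P*ei));
    end
    disp(norm(X*A - eye(n)))         % X equals inv(A) up to roundoff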

Cholesky factorization

A symmetric, positive definite matrix $A$ can be decomposed as $A = LL^T$
- $f = (1/3)n^3$
- $s = 2n^2$

Prob. 9.31a: we only factor once every $N$ iterations, but solve every iteration
- every $N$ steps, the computation is $f + s = (1/3)n^3 + 2n^2$ flops
- on all other steps, the computation is $s = 2n^2$ flops
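
The factor-once, solve-many pattern from prob. 9.31a might look like this in Matlab (a sketch with arbitrary illustrative data):

    % factor once, then reuse the Cholesky factor for many solves
    n = 500; k = 100;
    A = randn(n); A = A'*A + n*eye(n);   % arbitrary symmetric positive definite matrix
    L = chol(A, 'lower');                % f = (1/3)n^3 flops, done once
    for j = 1:k
        b = randn(n, 1);
        x = L' \ (L \ b);                % each solve costs s = 2n^2 flops
    end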

Exploiting structure

Computational costs for solving $Ax = b$:

    structure of A                          f               s
    none                                    (2/3)n^3        2n^2
    symmetric, positive definite            (1/3)n^3        2n^2
    lower triangular                        0               n^2
    k-banded (a_ij = 0 if |i - j| > k)      4nk^2           6nk
    block diagonal with m blocks            (2/3)n^3/m^2    2n^2/m
    DFT (using FFT to solve)                0               5n log n

Block elimination

Solve
$$\begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$
- the first equation, $A_{11} x_1 + A_{12} x_2 = b_1$, gives us $x_1 = A_{11}^{-1} (b_1 - A_{12} x_2)$
- the second equation then becomes $(A_{22} - A_{21} A_{11}^{-1} A_{12}) x_2 = b_2 - A_{21} A_{11}^{-1} b_1$
- speedup if $A_{11}$ and $S = A_{22} - A_{21} A_{11}^{-1} A_{12}$ are easy to invert (see the sketch below)
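
A direct transcription of these steps (a sketch with arbitrary illustrative blocks; in practice one would exploit structure in A11 rather than calling backslash on it, and would factor A11 once):

    % block elimination for [A11 A12; A21 A22][x1; x2] = [b1; b2]
    n1 = 4; n2 = 3;
    A11 = randn(n1) + n1*eye(n1); A12 = randn(n1, n2);
    A21 = randn(n2, n1); A22 = randn(n2) + n2*eye(n2);
    b1 = randn(n1, 1); b2 = randn(n2, 1);
    S  = A22 - A21*(A11\A12);            % Schur complement
    x2 = S \ (b2 - A21*(A11\b1));
    x1 = A11 \ (b1 - A12*x2);
    disp(norm([A11 A12; A21 A22]*[x1; x2] - [b1; b2]))   % residual check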

Example: solve the set of equations
$$\begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix} x = \begin{bmatrix} b \\ c \end{bmatrix}$$
where $A \in \mathbf{R}^{n \times n}$, $B \in \mathbf{R}^{n \times n}$, $b \in \mathbf{R}^n$, $c \in \mathbf{R}^n$, and the matrices $A$ and $B$ are nonsingular.

Flop count of the brute-force method? Solution: $(2/3)(2n)^3 = (16/3)n^3$

How can we exploit structure? Solution: partition $x = (x_1, x_2)$; then $x_1 = A^{-1} b$ and $x_2 = B^{-1} c$, with flop count $2 \cdot (2/3)n^3 = (4/3)n^3$

Example: solve the set of equations
$$\begin{bmatrix} I & B \\ C & I \end{bmatrix} x = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$
where $B \in \mathbf{R}^{m \times n}$, $C \in \mathbf{R}^{n \times m}$, and $m \gg n$; also assume that the whole matrix is nonsingular.

Flop count of the brute-force method? Solution: $(2/3)(m + n)^3$

How can we exploit structure? Solution: use block elimination to get the equations $(I - CB) x_2 = b_2 - C b_1$ and $x_1 = b_1 - B x_2$.

Flop count: forming $I - CB$ costs $2mn^2$, forming $b_2 - C b_1$ costs $2mn$, solving for $x_2$ costs $(2/3)n^3$, and computing $x_1$ costs $2mn$; the overall complexity is $2mn^2$ (see the sketch below).
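
A sketch of this specific case (B, C, b1, b2 are arbitrary illustrative data with m much larger than n):

    % exploit structure in [I B; C I] x = [b1; b2] with m >> n
    m = 2000; n = 50;
    B = randn(m, n); C = randn(n, m);
    b1 = randn(m, 1); b2 = randn(n, 1);
    x2 = (eye(n) - C*B) \ (b2 - C*b1);   % small n-by-n system; 2mn^2 flops dominate
    x1 = b1 - B*x2;
    % residuals of both block rows, without forming the big matrix
    disp(norm(x1 + B*x2 - b1) + norm(C*x1 + x2 - b2))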

Solving almost separable linear equations

Consider the following system of $2n + m$ equations:
$$\begin{aligned} Ax + By &= c \\ Dx + Ey + Fz &= g \\ Hy + Jz &= k \end{aligned}$$
where $A, J \in \mathbf{R}^{n \times n}$, $B, H \in \mathbf{R}^{n \times m}$, $D, F \in \mathbf{R}^{m \times n}$, $E \in \mathbf{R}^{m \times m}$, $c, k \in \mathbf{R}^n$, $g \in \mathbf{R}^m$, and $n \gg m$.

We need to solve the system
$$\begin{bmatrix} A & B & 0 \\ D & E & F \\ 0 & H & J \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} c \\ g \\ k \end{bmatrix}.$$
The naive way is to treat the matrix as dense. We can instead take advantage of the structure by first reordering the equations and variables:
$$\begin{bmatrix} A & 0 & B \\ 0 & J & H \\ D & F & E \end{bmatrix} \begin{bmatrix} x \\ z \\ y \end{bmatrix} = \begin{bmatrix} c \\ k \\ g \end{bmatrix}.$$
The system now looks like an arrow system, which we can solve efficiently by block elimination.

Since
$$\begin{bmatrix} A & 0 \\ 0 & J \end{bmatrix} \begin{bmatrix} x \\ z \end{bmatrix} + \begin{bmatrix} B \\ H \end{bmatrix} y = \begin{bmatrix} c \\ k \end{bmatrix},$$
we have
$$\begin{bmatrix} x \\ z \end{bmatrix} = \begin{bmatrix} A^{-1} c \\ J^{-1} k \end{bmatrix} - \begin{bmatrix} A^{-1} B \\ J^{-1} H \end{bmatrix} y.$$
We also know that
$$\begin{bmatrix} D & F \end{bmatrix} \begin{bmatrix} x \\ z \end{bmatrix} + E y = g,$$
so plugging in the expression derived above,
$$\begin{bmatrix} D & F \end{bmatrix} \left( \begin{bmatrix} A^{-1} c \\ J^{-1} k \end{bmatrix} - \begin{bmatrix} A^{-1} B \\ J^{-1} H \end{bmatrix} y \right) + E y = g,$$
and therefore
$$(E - D A^{-1} B - F J^{-1} H) \, y = g - D A^{-1} c - F J^{-1} k.$$

We can therefore solve the system of equations efficiently by taking advantage of structure in the following way (see the Matlab sketch below):
- form $M = A^{-1} B$, $n = A^{-1} c$, $P = J^{-1} H$, $q = J^{-1} k$
- compute $r = g - Dn - Fq$
- compute $S = E - DM - FP$
- find $y = S^{-1} r$, $x = n - My$, $z = q - Py$
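
In Matlab, the procedure might look like this (a sketch; the data are arbitrary illustrative choices with n much larger than m, and the slide's vector n is renamed nn to avoid clashing with the dimension n):

    % arrow system via block elimination
    n = 1000; m = 20;
    A = randn(n) + n*eye(n); J = randn(n) + n*eye(n);
    B = randn(n, m); H = randn(n, m);
    D = randn(m, n); F = randn(m, n); E = randn(m) + m*eye(m);
    c = randn(n, 1); g = randn(m, 1); k = randn(n, 1);
    M  = A \ B;  nn = A \ c;    % in practice, factor A once and reuse for both solves
    P  = J \ H;  q  = J \ k;    % likewise for J
    r  = g - D*nn - F*q;
    S  = E - D*M - F*P;         % small m-by-m Schur complement
    y  = S \ r;
    x  = nn - M*y;  z = q - P*y;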

Using sparsity in Matlab
- construct using sparse, spalloc, speye, spdiags, spones
- visualize and analyze using spy, nnz
- also available: sprand, sprandn, eigs, svds
- be careful not to accidentally make sparse matrices dense (see the sketch below)
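
For example (a small illustrative sketch), adding a nonzero scalar to a sparse matrix produces a full result:

    % accidentally densifying a sparse matrix
    n = 3000;
    A = speye(n);
    disp(issparse(A))      % 1: A is stored sparsely, nnz(A) = n
    B = A + 1;             % adding a nonzero scalar fills in every entry
    disp(issparse(B))      % 0: B is a full n-by-n matrix, n^2 stored entries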

Using sparsity in additional problem 2: the (sparse) tridiagonal matrix $D \in \mathbf{R}^{n \times n}$,
$$D = \begin{bmatrix}
1 & -1 & & & & \\
-1 & 2 & -1 & & & \\
& -1 & 2 & \ddots & & \\
& & \ddots & \ddots & -1 & \\
& & & -1 & 2 & -1 \\
& & & & -1 & 1
\end{bmatrix},$$
can be built in Matlab as follows:

    e = ones(n, 1);
    D = spdiags([-e 2*e -e], [-1 0 1], n, n);
    D(1,1) = 1; D(n,n) = 1;

The sparse identity matrix can be built using speye.

Sparse Cholesky factorization with permutations

Consider factorizing the downward arrow matrix $A_d = L_d L_d^T$, with $n = 3000$ and nnz(A_d) = 8998
- call using L_d = chol(A_d, 'lower') (use L_d = chol(A_d)' in older versions of Matlab)

[spy plots of A_d and L_d]

- nnz(L_d) = 5999; the factorization takes tf_d = 0.0022 seconds
- to solve $A_d x = b$, call x = L_d'\(L_d\b), which takes ts_d = 0.0020 seconds to run

Now look at factorizing the upward arrow matrix $A_u = L_u L_u^T$; again nnz(A_u) = 8998
- call using L_u = chol(A_u, 'lower')

[spy plots of A_u and L_u]

- nnz(L_u) = 4501500, and the factorization takes tf_u = 3.7288 seconds to compute
- calling x = L_u'\(L_u\b) is no longer efficient; it takes ts_u = 0.5673 seconds to run

Instead, factorize $A_u$ with permutations, $A_u = P L_p L_p^T P^T$
- call using [L_p, pp, P] = chol(A_u, 'lower')

[spy plots of L_p and P]

- $P$ is the Toeplitz matrix with 1's on the sub-diagonal and in the upper right corner, i.e., $P_{1,n} = 1$, $P_{k+1,k} = 1$ for $k = 1, \ldots, n-1$, and all other entries 0
- the factorization only takes tf_p = 0.0028 seconds to compute
- solve $A_u x = b$ using x = P'\(L_p'\(L_p\(P\b))); the solve takes ts_p = 0.0042 seconds
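
A script along the following lines reproduces the experiment (a sketch: the session does not give the exact arrow matrices, so the construction below is an assumed stand-in, chosen so that the nnz counts match those stated above):

    % downward vs. upward arrow matrices (assumed construction)
    n = 3000;
    A_d = spdiags([2*ones(n-1,1); n], 0, n, n);
    A_d(n, 1:n-1) = 1; A_d(1:n-1, n) = 1;   % dense last row/column: downward arrow
    A_u = A_d(n:-1:1, n:-1:1);              % reversed ordering: upward arrow
    b = randn(n, 1);
    L_d = chol(A_d, 'lower');               % no fill-in: nnz(L_d) = 2n - 1
    x = L_d' \ (L_d \ b);
    [L_p, pp, P] = chol(A_u, 'lower');      % permuted factorization of A_u
    x = P' \ (L_p' \ (L_p \ (P\b)));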