Equation Solving: $g(x) = 0$
Newton-Like Iterations: $x_{k+1} := x_k - J_k\, g(x_k)$, where $J_k \approx g'(x_k)^{-1}$.

3700 years ago the Babylonians used the secant method in 1D:
$$J_k = \frac{x_k - x_{k-1}}{g(x_k) - g(x_{k-1})}.$$

This gives the error
$$g'(x_k)^{-1} - J_k = \frac{g(x_{k-1}) - \left[ g(x_k) + g'(x_k)(x_{k-1} - x_k) \right]}{g'(x_k)\left[ g(x_{k-1}) - g(x_k) \right]}.$$

If $g'(\bar{x}) \neq 0$, then there exists $\alpha > 0$ such that
$$\left| g'(x_k)^{-1} - J_k \right| \le \frac{(L/2)\,|x_{k-1} - x_k|^2}{\alpha\, |g'(x_k)|\, |x_{k-1} - x_k|} \le K\, |x_{k-1} - x_k|$$
by the Quadratic Bound Lemma (with $L$ a Lipschitz constant for $g'$), so $x_k \to \bar{x}$ at a 2-step quadratic rate.
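As a quick illustration, here is a minimal sketch of the 1D secant iteration in Python; the test function, starting points, and tolerance are illustrative choices, not part of the notes.

```python
def secant(g, x0, x1, tol=1e-12, max_iter=50):
    """1D secant method: x_{k+1} = x_k - J_k g(x_k) with
    J_k = (x_k - x_{k-1}) / (g(x_k) - g(x_{k-1}))."""
    g0, g1 = g(x0), g(x1)
    for _ in range(max_iter):
        J = (x1 - x0) / (g1 - g0)   # secant approximation to g'(x_k)^{-1}
        x0, x1 = x1, x1 - J * g1
        g0, g1 = g1, g(x1)
        if abs(g1) < tol:
            break
    return x1

# The Babylonian problem: solve x^2 - 2 = 0, i.e. compute sqrt(2).
print(secant(lambda x: x**2 - 2, 1.0, 2.0))  # ~1.41421356...
```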
Can we apply the secant method in dimensions higher than 1?
$$J_k = \frac{x_k - x_{k-1}}{g(x_k) - g(x_{k-1})}$$
Multiplying on the right-hand side by $g(x_k) - g(x_{k-1})$ gives
$$J_k \left( g(x_k) - g(x_{k-1}) \right) = x_k - x_{k-1}.$$
This is called the matrix secant equation (MSE), or quasi-Newton equation. Here the matrix $J_k$ is unknown ($n^2$ unknowns) and satisfies the $n$ linear equations of the MSE. There are not enough equations to nail down all of the entries of $J_k$, so further conditions are required. What should they be?
To better understand what further conditions on $J_k$ are sensible, we revert to discussing the matrices $B_k = J_k^{-1}$, so the MSE becomes
$$B_k (x_k - x_{k-1}) = g(x_k) - g(x_{k-1}).$$

Key Idea: Think of the $J_k$'s as part of an iteration scheme $\{x_k, B_k\}$. As the iteration proceeds, it is hoped that $B_k$ becomes a better approximation to $g'(x_k)$. With this in mind, we assume that, in the long run, $B_k$ is already close to $g'(x_k)$, so $B_{k+1}$ should not differ from $B_k$ by too much, i.e. $B_k$ should be close to $B_{k+1}$.
What do we mean by "close" in this context? Should $\|B_k - B_{k+1}\|$ be small? We take "close" to mean algebraically close, in the sense that $B_{k+1}$ should be easy to compute from $B_k$ and yet satisfy the MSE. Let's start by assuming that $B_{k+1}$ is a rank one update to $B_k$; that is, there exist $u, v \in \mathbb{R}^n$ such that
$$B_{k+1} = B_k + uv^T.$$
Set $s_k := x_{k+1} - x_k$ and $y_k := g(x_{k+1}) - g(x_k)$. The MSE becomes $B_{k+1} s_k = y_k$. Multiplying the update by $s_k$ gives
$$y_k = B_{k+1} s_k = B_k s_k + u v^T s_k.$$
Hence, if $v^T s_k \neq 0$, we obtain
$$u = \frac{y_k - B_k s_k}{v^T s_k}, \qquad \text{so that} \qquad B_{k+1} = B_k + \frac{(y_k - B_k s_k)\, v^T}{v^T s_k}.$$
This equation determines a whole class of rank one updates that satisfy the MSE: choose any $v \in \mathbb{R}^n$ with $v^T s_k \neq 0$. One choice is $v = s_k$, giving
$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)\, s_k^T}{s_k^T s_k}.$$
This is known as Broyden's update. It turns out that the Broyden update is also analytically close to $B_k$.
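Before making that precise, here is a quick numerical sanity check (a sketch on arbitrary data, not from any particular $g$) that the rank one Broyden update does satisfy the MSE:

```python
import numpy as np

# Check that B+ = B + (y - B s) s^T / (s^T s) satisfies B+ s = y.
rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
s = rng.standard_normal(n)
y = rng.standard_normal(n)

B_plus = B + np.outer(y - B @ s, s) / (s @ s)
print(np.allclose(B_plus @ s, y))  # True
```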
Lemma. Let $A \in \mathbb{R}^{n \times n}$ and $s, y \in \mathbb{R}^n$ with $s \neq 0$. Then for any pair of matrix norms $\|\cdot\|$ and $\|\cdot\|'$ such that $\|AB\| \le \|A\|\, \|B\|'$ and $\left\| \frac{vv^T}{v^T v} \right\|' \le 1$, the solution to
$$\min_{B \in \mathbb{R}^{n \times n}} \left\{ \|B - A\| : Bs = y \right\}$$
is
$$A_+ = A + \frac{(y - As)\, s^T}{s^T s}.$$
In particular, $A_+$ solves the given optimization problem when $\|\cdot\|$ is the $\ell_2$ matrix norm, and $A_+$ is the unique solution when $\|\cdot\|$ is the Frobenius norm.
Proof: Let $B \in \{B \in \mathbb{R}^{n \times n} : Bs = y\}$, so that $y - As = (B - A)s$. Then
$$\|A_+ - A\| = \left\| \frac{(y - As)\, s^T}{s^T s} \right\| = \left\| (B - A) \frac{s s^T}{s^T s} \right\| \le \|B - A\| \left\| \frac{s s^T}{s^T s} \right\|' \le \|B - A\|. \qquad \square$$
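The Frobenius-norm least-change property can also be checked numerically. The sketch below builds feasible matrices $B$ (satisfying $Bs = y$) by adding corrections $C$ with $Cs = 0$ to $A_+$; all quantities are arbitrary example data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
A = rng.standard_normal((n, n))
s = rng.standard_normal(n)
y = rng.standard_normal(n)

A_plus = A + np.outer(y - A @ s, s) / (s @ s)

for _ in range(5):
    C = rng.standard_normal((n, n))
    C -= np.outer(C @ s, s) / (s @ s)   # project so that C s = 0
    B = A_plus + C                      # feasible: B s = y
    # np.linalg.norm on a matrix is the Frobenius norm by default
    assert np.linalg.norm(A_plus - A) <= np.linalg.norm(B - A)
print("A+ was the closest feasible matrix in every sampled case")
```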
Broyden's Methods

Algorithm: Broyden's Method
Initialization: $x_0 \in \mathbb{R}^n$, $B_0 \in \mathbb{R}^{n \times n}$.
Having $(x_k, B_k)$, compute $(x_{k+1}, B_{k+1})$ as follows: solve $B_k s_k = -g(x_k)$ for $s_k$ and set
$$x_{k+1} := x_k + s_k$$
$$y_k := g(x_{k+1}) - g(x_k)$$
$$B_{k+1} := B_k + \frac{(y_k - B_k s_k)\, s_k^T}{s_k^T s_k}.$$
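A minimal NumPy sketch of this algorithm follows; the test problem, starting point, and the choice of $B_0$ (the Jacobian of $g$ at $x_0$, a common initialization) are illustrative assumptions.

```python
import numpy as np

def broyden(g, x0, B0, tol=1e-10, max_iter=100):
    """Broyden's method (direct updating of B_k ≈ g'(x_k))."""
    x = np.asarray(x0, dtype=float)
    B = np.asarray(B0, dtype=float)
    gx = g(x)
    for _ in range(max_iter):
        s = np.linalg.solve(B, -gx)              # solve B_k s_k = -g(x_k)
        x = x + s
        g_new = g(x)
        y = g_new - gx
        B += np.outer(y - B @ s, s) / (s @ s)    # Broyden update
        gx = g_new
        if np.linalg.norm(gx) < tol:
            break
    return x

# Example: g(x) = (x1^2 + x2^2 - 2, x1 - x2), with roots (1,1) and (-1,-1).
g = lambda x: np.array([x[0]**2 + x[1]**2 - 2.0, x[0] - x[1]])
x0 = np.array([1.2, 0.9])
B0 = np.array([[2.4, 1.8], [1.0, -1.0]])  # g'(x0)
print(broyden(g, x0, B0))                 # -> approximately [1. 1.]
```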
Sherman-Morrison-Woodbury Formula for Matrix Inversion

Suppose $A \in \mathbb{R}^{n \times n}$ and $U, V \in \mathbb{R}^{n \times k}$ are such that both $A^{-1}$ and $(I + V^T A^{-1} U)^{-1}$ exist. Then
$$(A + UV^T)^{-1} = A^{-1} - A^{-1} U (I + V^T A^{-1} U)^{-1} V^T A^{-1}.$$
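A quick numerical verification of the formula on random data (the dimensions $n$ and $k$ are arbitrary choices for the sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 5, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)  # keep A safely invertible
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))

Ainv = np.linalg.inv(A)
lhs = np.linalg.inv(A + U @ V.T)
rhs = Ainv - Ainv @ U @ np.linalg.inv(np.eye(k) + V.T @ Ainv @ U) @ V.T @ Ainv
print(np.allclose(lhs, rhs))  # True
```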
Inverse Broyden Update

If $B_k^{-1} = J_k$ exists and $s_k^T J_k y_k = s_k^T B_k^{-1} y_k \neq 0$, then applying the Sherman-Morrison-Woodbury formula to the rank one Broyden update gives
$$J_{k+1} = \left[ B_k + \frac{(y_k - B_k s_k)\, s_k^T}{s_k^T s_k} \right]^{-1} = B_k^{-1} + \frac{(s_k - B_k^{-1} y_k)\, s_k^T B_k^{-1}}{s_k^T B_k^{-1} y_k} = J_k + \frac{(s_k - J_k y_k)\, s_k^T J_k}{s_k^T J_k y_k}.$$

Under suitable hypotheses, $x_k \to \bar{x}$ superlinearly.
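A sketch verifying on random data that the inverse update agrees with inverting the direct update:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
B = rng.standard_normal((n, n)) + n * np.eye(n)  # invertible B
s = rng.standard_normal(n)
y = rng.standard_normal(n)
J = np.linalg.inv(B)

B_plus = B + np.outer(y - B @ s, s) / (s @ s)           # direct update
J_plus = J + np.outer(s - J @ y, s @ J) / (s @ J @ y)   # inverse update
print(np.allclose(J_plus, np.linalg.inv(B_plus)))       # True
```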
Broyden Updates

Algorithm: Broyden's Method (Inverse Updating)
Initialization: $x_0 \in \mathbb{R}^n$, $J_0 \in \mathbb{R}^{n \times n}$.
Having $(x_k, J_k)$, compute $(x_{k+1}, J_{k+1})$ as follows:
$$s_k := -J_k\, g(x_k)$$
$$x_{k+1} := x_k + s_k$$
$$y_k := g(x_{k+1}) - g(x_k)$$
$$J_{k+1} := J_k + \frac{(s_k - J_k y_k)\, s_k^T J_k}{s_k^T J_k y_k}.$$
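The same NumPy sketch adapted to inverse updating; since $J_k$ approximates the inverse Jacobian directly, each iteration needs only matrix-vector products and a rank one update ($O(n^2)$ work) rather than a linear solve.

```python
import numpy as np

def broyden_inverse(g, x0, J0, tol=1e-10, max_iter=100):
    """Broyden's method with inverse updating: J_k ≈ g'(x_k)^{-1},
    so no linear system is solved at any iteration."""
    x = np.asarray(x0, dtype=float)
    J = np.asarray(J0, dtype=float)
    gx = g(x)
    for _ in range(max_iter):
        s = -J @ gx                                    # s_k := -J_k g(x_k)
        x = x + s
        g_new = g(x)
        y = g_new - gx
        J += np.outer(s - J @ y, s @ J) / (s @ J @ y)  # inverse Broyden update
        gx = g_new
        if np.linalg.norm(gx) < tol:
            break
    return x
```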