Quasi-Newton methods for minimization


1 Quasi-Newton methods for minimization. Lectures for the PhD course on Numerical Optimization. Enrico Bertolazzi, DIMS, Università di Trento. November 21 to December 14, 2011.

2 Outline
1 Quasi-Newton method
2 The symmetric rank one update
3 The Powell-symmetric-Broyden update
4 The Davidon Fletcher and Powell rank 2 update
5 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
6 The Broyden class

3 Quasi-Newton method
Algorithm (General quasi-Newton algorithm)
k ← 0; x_0 assigned; g_0 ← ∇f(x_0)^T; H_0 ← ∇²f(x_0)^{-1};
while ‖g_k‖ > ε do
    compute the search direction d_k ← −H_k g_k;
    approximate arg min_{α>0} f(x_k + α d_k) by a line search, obtaining α_k;
    perform the step x_{k+1} ← x_k + α_k d_k;
    g_{k+1} ← ∇f(x_{k+1})^T;
    update H_{k+1} ← some algorithm(H_k, x_k, x_{k+1}, g_k, g_{k+1});
    k ← k + 1;
end while
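
A rough Python/NumPy sketch of this driver loop is given below; the backtracking Armijo line search, the tolerances, and the names quasi_newton and update are assumptions made for the sketch, not part of the slides. Any of the update formulas derived in the following sections can be passed in as the update callback.

```python
import numpy as np

def quasi_newton(f, grad, x0, H0, update, eps=1e-8, max_iter=200):
    """Generic quasi-Newton driver; the inverse-Hessian update rule is a callback."""
    x, g, H = x0.astype(float), grad(x0), H0.copy()
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:
            break
        d = -H @ g                                   # search direction d_k = -H_k g_k
        alpha = 1.0                                  # crude backtracking (Armijo) line search
        while f(x + alpha * d) > f(x) + 1e-4 * alpha * (g @ d) and alpha > 1e-12:
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad(x_new)
        H = update(H, x_new - x, g_new - g)          # H_{k+1} from (H_k, s_k, y_k)
        x, g = x_new, g_new
    return x
```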

4 The symmetric rank one update

5 The symmetric rank one update
Let $B_k$ be an approximation of the Hessian of $f(x)$, and let $x_k$, $x_{k+1}$, $g_k$, $g_{k+1}$ be the points and gradients at the $k$-th and $(k+1)$-th iterates. Using the Broyden update formula to force the secant condition on $B_{k+1}$ we obtain
$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)\,s_k^T}{s_k^T s_k},\qquad s_k = x_{k+1} - x_k,\quad y_k = g_{k+1} - g_k.$$
Using the Sherman-Morrison formula and setting $H_k = B_k^{-1}$ we obtain the update of the inverse
$$H_{k+1} = H_k + \frac{(s_k - H_k y_k)\,s_k^T H_k}{s_k^T H_k y_k}.$$
This update does not maintain symmetry: even if $H_k$ is symmetric, $H_{k+1}$ is not necessarily symmetric.

6 The symmetric rank one update
To avoid the loss of symmetry we can consider an update of the form
$$H_{k+1} = H_k + u u^T.$$
Imposing the secant condition (on the inverse), $H_{k+1} y_k = s_k$, gives
$$H_k y_k + u\,(u^T y_k) = s_k.$$
Multiplying on the left by $y_k^T$,
$$y_k^T H_k y_k + (y_k^T u)^2 = y_k^T s_k \quad\Longrightarrow\quad y_k^T u = \left(y_k^T s_k - y_k^T H_k y_k\right)^{1/2},$$
and since $u\,(u^T y_k) = s_k - H_k y_k$ we obtain
$$u = \frac{s_k - H_k y_k}{u^T y_k} = \frac{s_k - H_k y_k}{\left(y_k^T s_k - y_k^T H_k y_k\right)^{1/2}}.$$

7 The symmetric rank one update
Substituting the expression for $u$,
$$u = \frac{s_k - H_k y_k}{\left(y_k^T s_k - y_k^T H_k y_k\right)^{1/2}},$$
in the update formula, we obtain
$$H_{k+1} = H_k + \frac{w_k w_k^T}{w_k^T y_k},\qquad w_k = s_k - H_k y_k.$$
This is the symmetric rank one formula (SR1). To be well defined the formula needs $w_k^T y_k \neq 0$. Moreover, if $w_k^T y_k < 0$ and $H_k$ is positive definite, then $H_{k+1}$ may lose positive definiteness. Having $H_k$ symmetric and positive definite is important for global convergence.
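
A minimal NumPy sketch of the SR1 update of $H$ follows; the function name sr1_update and the skip tolerance (anticipating the breakdown workaround discussed a few slides below) are assumptions of this sketch.

```python
import numpy as np

def sr1_update(H, s, y, eps=1e-8):
    """Symmetric rank one (SR1) update of the inverse-Hessian approximation."""
    w = s - H @ y
    denom = w @ y
    if abs(denom) < eps * np.linalg.norm(w) * np.linalg.norm(y):
        return H                      # skip: w_k^T y_k too close to zero
    return H + np.outer(w, w) / denom
```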

8 The symmetric rank one update
This lemma is used in the theorems that follow.
Lemma. Let $q(x) = \frac{1}{2}x^T A x - b^T x + c$ with $A\in\mathbb{R}^{n\times n}$ symmetric and positive definite. Then
$$y_k = g_{k+1} - g_k = (A x_{k+1} - b) - (A x_k - b) = A s_k,$$
where $g_k = \nabla q(x_k)^T$.

9 The symmetric rank one update
Theorem (property of the SR1 update). Let $q(x) = \frac{1}{2}x^T A x - b^T x + c$ with $A\in\mathbb{R}^{n\times n}$ symmetric and positive definite. Let $x_0$ and $H_0$ be assigned and let $x_k$ and $H_k$ be produced by
1. $x_{k+1} = x_k + s_k$;
2. $H_{k+1}$ updated by the SR1 formula
$$H_{k+1} = H_k + \frac{w_k w_k^T}{w_k^T y_k},\qquad w_k = s_k - H_k y_k.$$
If $s_0, s_1, \ldots, s_{n-1}$ are linearly independent, then $H_n = A^{-1}$.
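
The theorem is easy to check numerically on a small quadratic: apply the SR1 update along $n$ linearly independent steps and compare $H_n$ with $A^{-1}$. The random test matrix and the choice of unit-vector steps below are assumptions of this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)              # SPD Hessian of the quadratic
H = np.eye(n)                            # arbitrary symmetric H_0
for k in range(n):
    s = np.eye(n)[:, k]                  # n linearly independent steps
    y = A @ s                            # lemma on slide 8: y_k = A s_k
    w = s - H @ y
    H = H + np.outer(w, w) / (w @ y)     # SR1 update
print(np.allclose(H, np.linalg.inv(A)))  # expected: True, i.e. H_n = A^{-1}
```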

10 The symmetric rank one update
Proof (1/2). We prove by induction the hereditary property $H_i y_j = s_j$ for $j < i$.
BASE: for $i = 1$ it is exactly the secant condition of the update.
INDUCTION: suppose the relation holds up to $k > 0$; we prove that it holds for $k+1$. From the update formula
$$H_{k+1} y_j = H_k y_j + \frac{w_k^T y_j}{w_k^T y_k}\, w_k,\qquad w_k = s_k - H_k y_k.$$
By the induction hypothesis, for $j < k$, and using the lemma on slide 8,
$$w_k^T y_j = s_k^T y_j - y_k^T H_k y_j = s_k^T y_j - y_k^T s_j = s_k^T A s_j - s_k^T A s_j = 0,$$
so that $H_{k+1} y_j = H_k y_j = s_j$ for $j = 0, 1, \ldots, k-1$. For $j = k$ we have $H_{k+1} y_k = s_k$ trivially by construction of the SR1 formula.

11 The symmetric rank one update
Proof (2/2). To prove that $H_n = A^{-1}$ notice that
$$H_n y_j = s_j,\qquad A s_j = y_j,\qquad j = 0, 1, \ldots, n-1,$$
and combining the two equalities,
$$H_n A s_j = s_j,\qquad j = 0, 1, \ldots, n-1.$$
Due to the linear independence of the $s_j$ we have $H_n A = I$, i.e. $H_n = A^{-1}$.

12 The symmetric rank one update
Properties of the SR1 update (1/2)
1. The SR1 update possesses the natural quadratic termination property (like CG).
2. SR1 satisfies the hereditary property $H_k y_j = s_j$ for $j < k$.
3. SR1 maintains the positive definiteness of $H_k$ if and only if $w_k^T y_k > 0$. However, this condition is difficult to guarantee.
4. Sometimes $w_k^T y_k$ becomes very small or $0$. This results in serious numerical difficulty (roundoff) or even breakdown of the algorithm. We can avoid this breakdown by the following strategy.
Breakdown workaround for the SR1 update
1. If $|w_k^T y_k| \geq \epsilon\,\|w_k\|\,\|y_k\|$ (i.e. the angle between $w_k$ and $y_k$ is far from 90 degrees), then we update with the SR1 formula.
2. Otherwise we set $H_{k+1} = H_k$.

13 The symmetric rank one update
Properties of the SR1 update (2/2)
Theorem (convergence of the nonlinear SR1 update). Let $f(x)$ satisfy the standard assumptions and let $\{x_k\}$ be a sequence of iterates such that $\lim_{k\to\infty} x_k = x^\star$. Suppose we use the breakdown workaround for the SR1 update and that the steps $\{s_k\}$ are uniformly linearly independent. Then
$$\lim_{k\to\infty}\ \|H_k - \nabla^2 f(x^\star)^{-1}\| = 0.$$
A. R. Conn, N. I. M. Gould and P. L. Toint, Convergence of quasi-Newton matrices generated by the symmetric rank one update, Mathematics of Computation.

14 The Powell-symmetric-Broyden update

15 The Powell-symmetric-Broyden update
The SR1 update, although symmetric, does not have a minimum property like the one the Broyden update has in the nonsymmetric case. The Broyden update
$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)\,s_k^T}{s_k^T s_k}$$
solves the minimization problem
$$\|B_{k+1} - B_k\|_F \leq \|B - B_k\|_F \quad\text{for all } B \text{ such that } B s_k = y_k.$$
If we solve a similar problem in the class of symmetric matrices we obtain the Powell-symmetric-Broyden (PSB) update.

16 The Powell-symmetric-Broyden update
Lemma (Powell-symmetric-Broyden update). Let $A\in\mathbb{R}^{n\times n}$ be symmetric and $s, y\in\mathbb{R}^n$ with $s \neq 0$. Consider the set
$$\mathcal{B} = \{\, B\in\mathbb{R}^{n\times n} \mid B s = y,\ B = B^T \,\}.$$
If $s^T y \neq 0$ (this is true if a Wolfe line search is performed), then there exists a unique matrix $\overline{B}\in\mathcal{B}$ such that
$$\|A - \overline{B}\|_F \leq \|A - C\|_F \quad\text{for all } C\in\mathcal{B}.$$
Moreover $\overline{B}$ has the form
$$\overline{B} = A + \frac{\omega s^T + s\,\omega^T}{s^T s} - (\omega^T s)\,\frac{s s^T}{(s^T s)^2},\qquad \omega = y - A s,$$
so $\overline{B}$ is a rank-two perturbation of the matrix $A$.
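
A direct NumPy transcription of the PSB update of the Hessian approximation; psb_update is an assumed name for this sketch.

```python
import numpy as np

def psb_update(B, s, y):
    """Powell-symmetric-Broyden update: symmetric rank-two, enforces B_new @ s == y."""
    w = y - B @ s                        # omega = y - B s
    ss = s @ s
    return (B + (np.outer(w, s) + np.outer(s, w)) / ss
              - (w @ s) * np.outer(s, s) / ss**2)
```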

17 The Powell-symmetric-Broyden update
Proof (1/11). First of all notice that $\mathcal{B}$ is not empty; in fact
$$\frac{1}{s^T y}\, y y^T \in \mathcal{B},\qquad \left[\frac{1}{s^T y}\, y y^T\right] s = y,$$
so the problem is well posed. Next we reformulate the problem as a constrained minimum problem:
$$\arg\min_{B\in\mathbb{R}^{n\times n}}\ \frac{1}{2}\sum_{i,j=1}^{n} (A_{ij} - B_{ij})^2 \quad\text{subject to}\quad B s = y \ \text{ and } \ B = B^T.$$
The solution is a stationary point of the Lagrangian
$$g(B, \lambda, M) = \frac{1}{2}\|A - B\|_F^2 + \lambda^T (B s - y) + \sum_{i<j} \mu_{ij}\,(B_{ij} - B_{ji}).$$

18 The Powell-symmetric-Broyden update
Proof (2/11). Setting the gradient of $g$ with respect to $B$ to zero (after absorbing the sign of the multipliers) gives
$$A_{ij} - B_{ij} + \lambda_i s_j + M_{ij} = 0,\qquad M_{ij} = \begin{cases}\mu_{ij} & \text{if } i < j;\\ -\mu_{ij} & \text{if } i > j;\\ 0 & \text{if } i = j.\end{cases}$$
The previous equality can be written in matrix form as $B = A + \lambda s^T + M$, with $M$ antisymmetric.

19 The Powell-symmetric-Broyden update
Proof (3/11). Imposing symmetry for $B$:
$$A + \lambda s^T + M = A^T + s\lambda^T + M^T = A + s\lambda^T - M.$$
Solving for $M$ we have
$$M = \frac{s\lambda^T - \lambda s^T}{2},$$
and substituting in $B$ we have
$$B = A + \frac{s\lambda^T + \lambda s^T}{2}.$$

20 The Powell-symmetric-Broyden update
Proof (4/11). Imposing $s^T B s = s^T y$:
$$s^T A s + \frac{(s^T s)(\lambda^T s) + (s^T\lambda)(s^T s)}{2} = s^T y \quad\Longrightarrow\quad \lambda^T s = \frac{s^T\omega}{s^T s},\qquad \omega = y - A s.$$
Imposing $B s = y$:
$$A s + \frac{s(\lambda^T s) + \lambda(s^T s)}{2} = y \quad\Longrightarrow\quad \lambda = \frac{2\omega}{s^T s} - \frac{(s^T\omega)\,s}{(s^T s)^2}.$$
Next we compute the explicit form of $B$.

21 The Powell-symmetric-Broyden update
Proof (5/11). Substituting
$$\lambda = \frac{2\omega}{s^T s} - \frac{(s^T\omega)\,s}{(s^T s)^2}$$
in
$$B = A + \frac{s\lambda^T + \lambda s^T}{2},$$
we obtain
$$\overline{B} = A + \frac{\omega s^T + s\omega^T}{s^T s} - (\omega^T s)\,\frac{s s^T}{(s^T s)^2},\qquad \omega = y - A s.$$
Next we prove that $\overline{B}$ is the unique minimum.

22 The Powell-symmetric-Broyden update
Proof (6/11). We now show that $\overline{B}$ is a minimum. First we compute
$$\|\overline{B} - A\|_F = \left\|\frac{\omega s^T + s\omega^T}{s^T s} - (\omega^T s)\,\frac{s s^T}{(s^T s)^2}\right\|_F.$$
To compute this norm we need the following property of the Frobenius norm:
$$\|M - N\|_F^2 = \|M\|_F^2 + \|N\|_F^2 - 2\, M\cdot N,\qquad M\cdot N = \sum_{ij} M_{ij} N_{ij}.$$
Setting
$$M = \frac{\omega s^T + s\omega^T}{s^T s},\qquad N = (\omega^T s)\,\frac{s s^T}{(s^T s)^2},$$
we now compute $\|M\|_F$, $\|N\|_F$ and $M\cdot N$.

23 The Powell-symmetric-Broyden update
Proof (7/11).
$$M\cdot N = \frac{\omega^T s}{(s^T s)^3}\sum_{ij}(\omega_i s_j + \omega_j s_i)\, s_i s_j = \frac{\omega^T s}{(s^T s)^3}\sum_{ij}\left[(\omega_i s_i)\, s_j^2 + (\omega_j s_j)\, s_i^2\right] = \frac{\omega^T s}{(s^T s)^3}\left[(\omega^T s)(s^T s) + (\omega^T s)(s^T s)\right] = \frac{2(\omega^T s)^2}{(s^T s)^2}.$$

24 The Powell-symmetric-Broyden update
Proof (8/11). To compute $\|N\|_F^2$ and $\|M\|_F^2$ we need the following properties of the Frobenius norm:
$$\|u v^T\|_F^2 = (u^T u)(v^T v),\qquad \|u v^T + v u^T\|_F^2 = 2(u^T u)(v^T v) + 2(u^T v)^2.$$
Then we have
$$\|N\|_F^2 = \frac{(\omega^T s)^2}{(s^T s)^4}\,\|s s^T\|_F^2 = \frac{(\omega^T s)^2}{(s^T s)^4}\,(s^T s)^2 = \frac{(\omega^T s)^2}{(s^T s)^2},\qquad \|M\|_F^2 = \left\|\frac{\omega s^T + s\omega^T}{s^T s}\right\|_F^2 = \frac{2(\omega^T\omega)(s^T s) + 2(s^T\omega)^2}{(s^T s)^2}.$$
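
Both identities are easy to confirm numerically; the short check below uses random vectors (an illustration, not part of the proof).

```python
import numpy as np

rng = np.random.default_rng(1)
u, v = rng.standard_normal(6), rng.standard_normal(6)
lhs1 = np.linalg.norm(np.outer(u, v), 'fro')**2
lhs2 = np.linalg.norm(np.outer(u, v) + np.outer(v, u), 'fro')**2
print(np.isclose(lhs1, (u @ u) * (v @ v)))                  # ||u v^T||_F^2
print(np.isclose(lhs2, 2*(u @ u)*(v @ v) + 2*(u @ v)**2))   # ||u v^T + v u^T||_F^2
```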

25 The Powell-symmetric-Broyden update
Proof (9/11). Putting everything together,
$$\|\overline{B} - A\|_F^2 = \|M - N\|_F^2 = \frac{(\omega^T s)^2}{(s^T s)^2} + \frac{2(\omega^T\omega)(s^T s) + 2(s^T\omega)^2}{(s^T s)^2} - \frac{4(\omega^T s)^2}{(s^T s)^2} = \frac{2(\omega^T\omega)(s^T s) - (\omega^T s)^2}{(s^T s)^2}.$$
Using $\omega = y - A s$ and noticing that $y = C s$ for all $C\in\mathcal{B}$, we have
$$\omega = y - A s = C s - A s = (C - A)\, s.$$
It remains to show that the right-hand side above is a lower bound for $\|C - A\|_F^2$ over all $C\in\mathcal{B}$.

26 The Powell-symmetric-Broyden update
Proof (10/11). For any $C\in\mathcal{B}$ the matrix $E = C - A$ is symmetric and satisfies $E s = \omega$. Complete $t = s/\|s\|$ to an orthonormal basis and express $E$ in that basis: the Frobenius norm is invariant under the change of basis, the first column of $E$ is the coordinate vector of $E t = \omega/\|s\|$, and by symmetry the first row is its transpose. Summing the squares of the entries of the first row and first column (the diagonal entry $t^T E t = s^T\omega/(s^T s)$ is counted twice) gives
$$\|C - A\|_F^2 = \|E\|_F^2 \geq 2\,\|E t\|^2 - (t^T E t)^2 = \frac{2(\omega^T\omega)(s^T s) - (\omega^T s)^2}{(s^T s)^2} = \|\overline{B} - A\|_F^2,$$
i.e. $\|A - \overline{B}\|_F \leq \|A - C\|_F$ for all $C\in\mathcal{B}$.

27 The Powell-symmetric-Broyden update
Proof (11/11). Let $\overline{B}$ and $\overline{B}'$ be two different minimizers. Then $\frac{1}{2}(\overline{B} + \overline{B}')\in\mathcal{B}$ and
$$\left\|A - \tfrac{1}{2}(\overline{B} + \overline{B}')\right\|_F \leq \tfrac{1}{2}\|A - \overline{B}\|_F + \tfrac{1}{2}\|A - \overline{B}'\|_F.$$
If the inequality were strict we would have a contradiction with minimality. From the Cauchy-Schwarz inequality we have equality only when $A - \overline{B} = \lambda(A - \overline{B}')$, so that
$$\overline{B} - \lambda\overline{B}' = (1 - \lambda)A \quad\Longrightarrow\quad \overline{B} s - \lambda\overline{B}' s = (1 - \lambda)A s \quad\Longrightarrow\quad (1 - \lambda)\,y = (1 - \lambda)\,A s.$$
Since $y \neq A s$ (otherwise $\overline{B} = A$ is trivially the unique minimizer), this is true only when $\lambda = 1$, i.e. $\overline{B} = \overline{B}'$.

28 The Powell-symmetric-Broyden update
Algorithm (PSB quasi-Newton algorithm)
k ← 0; x assigned; g ← ∇f(x)^T; B ← ∇²f(x);
while ‖g‖ > ε do
    compute the search direction d ← −B^{-1} g;   [solve the linear system B d = −g]
    approximate arg min_{α>0} f(x + α d) by a line search;
    perform the step x ← x + α d;
    update B:
        ω ← ∇f(x)^T + (α − 1) g;
        g ← ∇f(x)^T;
        β ← (α d^T d)^{-1};
        γ ← β² α d^T ω;
        B ← B + β (d ω^T + ω d^T) − γ d d^T;
    k ← k + 1;
end while

29 The Davidon Fletcher and Powell rank 2 update

30 The Davidon Fletcher and Powell rank 2 update
The SR1 and PSB updates maintain the symmetry but do not maintain the positive definiteness of the matrix $H_{k+1}$. To recover this further property we can try an update of the form
$$H_{k+1} = H_k + \alpha\, u u^T + \beta\, v v^T.$$
Imposing the secant condition (on the inverse), $H_{k+1} y_k = s_k$,
$$H_k y_k + \alpha (u^T y_k)\, u + \beta (v^T y_k)\, v = s_k \quad\Longrightarrow\quad \alpha (u^T y_k)\, u + \beta (v^T y_k)\, v = s_k - H_k y_k.$$
Clearly this equation does not have a unique solution. A natural choice for $u$ and $v$ is
$$u = s_k,\qquad v = H_k y_k.$$

31 The Davidon Fletcher and Powell rank 2 update
Solving the equation
$$\alpha (s_k^T y_k)\, s_k + \beta (y_k^T H_k y_k)\, H_k y_k = s_k - H_k y_k$$
for $\alpha$ and $\beta$ we obtain
$$\alpha = \frac{1}{s_k^T y_k},\qquad \beta = -\frac{1}{y_k^T H_k y_k}.$$
Substituting in the update we obtain the Davidon Fletcher and Powell (DFP) rank 2 update formula
$$H_{k+1} = H_k + \frac{s_k s_k^T}{s_k^T y_k} - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k}.$$
Obviously this is only one of the possible choices, and other solutions yield different update formulas. Next we must prove that under suitable conditions the DFP update formula maintains positive definiteness.
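
A NumPy sketch of the DFP update of the inverse-Hessian approximation; dfp_update is an assumed name.

```python
import numpy as np

def dfp_update(H, s, y):
    """DFP rank-two update: H + s s^T/(s^T y) - H y y^T H/(y^T H y)."""
    Hy = H @ y
    return H + np.outer(s, s) / (s @ y) - np.outer(Hy, Hy) / (y @ Hy)
```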

32 The Davidon Fletcher and Powell rank 2 update
Positive definiteness of the DFP update
Theorem (positive definiteness of the DFP update). Given $H_k$ symmetric and positive definite, the DFP update
$$H_{k+1} = H_k + \frac{s_k s_k^T}{s_k^T y_k} - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k}$$
produces $H_{k+1}$ positive definite if and only if $s_k^T y_k > 0$.
Remark (with Wolfe line search the DFP update is SPD). Expanding $s_k^T y_k > 0$ we have $\nabla f(x_{k+1})\, s_k > \nabla f(x_k)\, s_k$. Remember that in a minimum search algorithm $s_k = \alpha_k p_k$ with $\alpha_k > 0$. The second Wolfe condition for the line search is $\nabla f(x_k + \alpha_k p_k)\, p_k \geq c_2\, \nabla f(x_k)\, p_k$ with $0 < c_2 < 1$, and since $\nabla f(x_k)\, p_k < 0$ for a descent direction this implies
$$\nabla f(x_{k+1})\, s_k \geq c_2\, \nabla f(x_k)\, s_k > \nabla f(x_k)\, s_k \quad\Longrightarrow\quad s_k^T y_k > 0.$$

33 The Davidon Fletcher and Powell rank 2 update
Proof (1/2). Let $s_k^T y_k > 0$ and consider $z \neq 0$; then
$$z^T H_{k+1} z = z^T\left(H_k - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k}\right) z + \frac{z^T s_k s_k^T z}{s_k^T y_k} = z^T H_k z - \frac{(z^T H_k y_k)(y_k^T H_k z)}{y_k^T H_k y_k} + \frac{(z^T s_k)^2}{s_k^T y_k}.$$
$H_k$ is SPD, so the Cholesky decomposition $L L^T = H_k$ exists. Defining $a = L^T z$ and $b = L^T y_k$ we can write
$$z^T H_{k+1} z = \frac{(a^T a)(b^T b) - (a^T b)^2}{b^T b} + \frac{(z^T s_k)^2}{s_k^T y_k}.$$
From the Cauchy-Schwarz inequality $(a^T a)(b^T b) \geq (a^T b)^2$, so $z^T H_{k+1} z \geq 0$.

34 The Davidon Fletcher and Powell rank 2 update
Proof (2/2). To prove strict inequality remember that in the Cauchy-Schwarz inequality $(a^T a)(b^T b) = (a^T b)^2$ if and only if $a = \lambda b$, i.e.
$$L^T z = \lambda L^T y_k \quad\Longrightarrow\quad z = \lambda y_k,\ \lambda\neq 0,$$
but in this case
$$\frac{(z^T s_k)^2}{s_k^T y_k} = \lambda^2\,\frac{(y_k^T s_k)^2}{s_k^T y_k} = \lambda^2\, s_k^T y_k > 0,$$
so $z^T H_{k+1} z > 0$.

35 The Davidon Fletcher and Powell rank 2 update
Algorithm (DFP quasi-Newton algorithm)
k ← 0; x assigned; g ← ∇f(x)^T; H ← ∇²f(x)^{-1};
while ‖g‖ > ε do
    compute the search direction d ← −H g;
    approximate arg min_{α>0} f(x + α d) by a line search;
    perform the step x ← x + α d;
    update H:
        y ← ∇f(x)^T − g;
        z ← H y;
        g ← ∇f(x)^T;
        H ← H + α (d d^T)/(d^T y) − (z z^T)/(y^T z);
    k ← k + 1;
end while

36 The Davidon Fletcher and Powell rank 2 update
Theorem (property of the DFP update). Let $q(x) = \frac{1}{2}(x - x^\star)^T A (x - x^\star) + c$ with $A\in\mathbb{R}^{n\times n}$ symmetric and positive definite. Let $x_0$ and $H_0$ be assigned and let $\{x_k\}$ and $\{H_k\}$ be produced by the sequence $\{s_k\}$:
1. $x_{k+1} = x_k + s_k$;
2. $H_{k+1} = H_k + \dfrac{s_k s_k^T}{s_k^T y_k} - \dfrac{H_k y_k y_k^T H_k}{y_k^T H_k y_k}$;
where $s_k = \alpha_k p_k$ and $\alpha_k$ is obtained by exact line search. Then for $j < k$ we have
1. $g_k^T s_j = 0$;   [orthogonality property]
2. $H_k y_j = s_j$;   [hereditary property]
3. $s_k^T A s_j = 0$;   [conjugate direction property]
4. The method terminates (i.e. $\nabla f(x_m) = 0$) at $x_m = x^\star$ with $m \leq n$. If $m = n$ then $H_n = A^{-1}$.

37 The Davidon Fletcher and Powell rank 2 update
Proof (1/4). Points (1), (2) and (3) are proved by induction. The base of the induction is obvious; assume the theorem is true up to $k > 0$. Due to the exact line search we have $g_{k+1}^T s_k = 0$; moreover, by induction, for $j < k$ we have $g_{k+1}^T s_j = 0$. In fact
$$g_{k+1}^T s_j = g_{j+1}^T s_j + \sum_{i=j+1}^{k} (g_{i+1} - g_i)^T s_j = 0 + \sum_{i=j+1}^{k} \bigl(A(x_{i+1} - x^\star) - A(x_i - x^\star)\bigr)^T s_j = \sum_{i=j+1}^{k} \bigl(A(x_{i+1} - x_i)\bigr)^T s_j = \sum_{i=j+1}^{k} s_i^T A s_j = 0.$$
[exact line search + induction + conjugacy property]

38 The Davidon Fletcher and Powell rank 2 update
Proof (2/4). Using $s_{k+1} = -\alpha_{k+1} H_{k+1} g_{k+1}$ we have $s_{k+1}^T A s_j = 0$; in fact
$$s_{k+1}^T A s_j = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} (A x_{j+1} - A x_j) = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} \bigl(A(x_{j+1} - x^\star) - A(x_j - x^\star)\bigr) = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} (g_{j+1} - g_j) = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} y_j = -\alpha_{k+1}\, g_{k+1}^T s_j = 0.$$
[induction + hereditary property] Notice that we have used $A s_j = y_j$.

39 The Davidon Fletcher and Powell rank 2 update
Proof (3/4). Due to the DFP construction we have $H_{k+1} y_k = s_k$. By the inductive hypothesis, for $j < k$ we have $s_k^T y_j = s_k^T A s_j = 0$, and from the DFP formula
$$H_{k+1} y_j = H_k y_j + \frac{s_k (s_k^T y_j)}{s_k^T y_k} - \frac{H_k y_k (y_k^T H_k y_j)}{y_k^T H_k y_k} = s_j + 0 - \frac{H_k y_k\,(y_k^T s_j)}{y_k^T H_k y_k} \quad [H_k y_j = s_j]$$
$$= s_j - \frac{H_k y_k\,(g_{k+1} - g_k)^T s_j}{y_k^T H_k y_k} \quad [y_k = g_{k+1} - g_k] \qquad = s_j. \quad\text{[induction + orthogonality property]}$$

40 The Davidon Fletcher and Powell rank 2 update
Proof (4/4). Finally, if $m = n$, the steps $s_j$, $j = 0, 1, \ldots, n-1$, are conjugate and linearly independent. From the hereditary property and the lemma on slide 8,
$$H_n A s_k = H_n y_k = s_k,\qquad\text{i.e.}\qquad H_n A s_k = s_k,\quad k = 0, 1, \ldots, n-1,$$
and due to the linear independence of $\{s_k\}$ it follows that $H_n = A^{-1}$.

41 The Broyden Fletcher Goldfarb and Shanno (BFGS) update

42 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Another update which maintains symmetry and positive definiteness is the Broyden Fletcher Goldfarb and Shanno (BFGS, 1970) rank 2 update. This update was independently discovered by the four authors. A convenient way to introduce BFGS is through the concept of duality. Consider an update for the Hessian, say $B_{k+1} = U(B_k, s_k, y_k)$, which satisfies $B_{k+1} s_k = y_k$ (the secant condition on the Hessian). Then by exchanging $B_k \leftrightarrow H_k$ and $s_k \leftrightarrow y_k$ we obtain the dual update for the inverse of the Hessian, i.e. $H_{k+1} = U(H_k, y_k, s_k)$, which satisfies $H_{k+1} y_k = s_k$ (the secant condition on the inverse of the Hessian).

43 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Starting from the Davidon Fletcher and Powell (DFP) rank 2 update formula
$$H_{k+1} = H_k + \frac{s_k s_k^T}{s_k^T y_k} - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k},$$
by duality we obtain the Broyden Fletcher Goldfarb and Shanno (BFGS) update formula
$$B_{k+1} = B_k + \frac{y_k y_k^T}{y_k^T s_k} - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k}.$$
The BFGS formula written in this way is not useful for large problems. We need an equivalent formula for the inverse of the approximate Hessian. This can be done with a generalization of the Sherman-Morrison formula.
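
The duality is mechanical: swapping the roles of $(H, y_k, s_k)$ and $(B, s_k, y_k)$ in the DFP formula gives the BFGS update of $B$. A sketch (bfgs_update_B is an assumed name):

```python
import numpy as np

def bfgs_update_B(B, s, y):
    """BFGS update of the Hessian approximation, dual to the DFP update of H."""
    Bs = B @ s
    return B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)
```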

44 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Sherman-Morrison-Woodbury formula (1/2)
The Sherman-Morrison-Woodbury formula permits us to write explicitly the inverse of a matrix modified by a rank-$k$ perturbation.
Proposition (Sherman-Morrison-Woodbury formula).
$$(A + U V^T)^{-1} = A^{-1} - A^{-1} U C^{-1} V^T A^{-1},\qquad C = I + V^T A^{-1} U,$$
$$U = [\,u_1, u_2, \ldots, u_k\,],\qquad V = [\,v_1, v_2, \ldots, v_k\,].$$
The Sherman-Morrison-Woodbury formula can be checked by a direct calculation.
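
The direct calculation can also be done numerically; the snippet below verifies the formula on a random rank-2 perturbation (the test data are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 6, 2
A = rng.standard_normal((n, n)) + n * np.eye(n)   # a well-conditioned invertible A
U = rng.standard_normal((n, k))
V = rng.standard_normal((n, k))
Ainv = np.linalg.inv(A)
C = np.eye(k) + V.T @ Ainv @ U
lhs = np.linalg.inv(A + U @ V.T)
rhs = Ainv - Ainv @ U @ np.linalg.inv(C) @ V.T @ Ainv
print(np.allclose(lhs, rhs))                      # expected: True
```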

45 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Sherman-Morrison-Woodbury formula (2/2)
Remark. The previous formula can be written as
$$\left(A + \sum_{i=1}^{k} u_i v_i^T\right)^{-1} = A^{-1} - A^{-1} U C^{-1} V^T A^{-1},$$
where
$$C_{ij} = \delta_{ij} + v_i^T A^{-1} u_j,\qquad i, j = 1, 2, \ldots, k.$$

46 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
The BFGS update for H
Proposition. Using the Sherman-Morrison-Woodbury formula, the BFGS update for $H$ becomes
$$H_{k+1} = H_k - \frac{H_k y_k s_k^T + s_k y_k^T H_k}{s_k^T y_k} + \frac{s_k s_k^T}{s_k^T y_k}\left(1 + \frac{y_k^T H_k y_k}{s_k^T y_k}\right),\qquad (A)$$
or, equivalently,
$$H_{k+1} = \left(I - \frac{s_k y_k^T}{s_k^T y_k}\right) H_k \left(I - \frac{y_k s_k^T}{s_k^T y_k}\right) + \frac{s_k s_k^T}{s_k^T y_k}.\qquad (B)$$
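
Form (B) translates directly into NumPy; bfgs_update_H is an assumed name for this sketch. Form (B) is usually preferred in code because it keeps the symmetry of the result explicit.

```python
import numpy as np

def bfgs_update_H(H, s, y):
    """BFGS update of the inverse-Hessian approximation, form (B)."""
    rho = 1.0 / (s @ y)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)          # I - s y^T / (s^T y)
    return V @ H @ V.T + rho * np.outer(s, s)
```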

47 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof (1/3). Write the BFGS update of $B_k$ as $B_{k+1} = B_k + u_1 v_1^T + u_2 v_2^T$ and apply the Sherman-Morrison-Woodbury formula with $k = 2$ and
$$u_1 = v_1 = \frac{y_k}{(s_k^T y_k)^{1/2}},\qquad u_2 = -v_2 = \frac{B_k s_k}{(s_k^T B_k s_k)^{1/2}}.$$
In this way (setting $H_k = B_k^{-1}$) we have
$$C_{11} = 1 + v_1^T B_k^{-1} u_1 = 1 + \frac{y_k^T H_k y_k}{s_k^T y_k},\qquad C_{22} = 1 + v_2^T B_k^{-1} u_2 = 1 - \frac{s_k^T B_k B_k^{-1} B_k s_k}{s_k^T B_k s_k} = 1 - 1 = 0,$$
$$C_{12} = v_1^T B_k^{-1} u_2 = \frac{y_k^T B_k^{-1} B_k s_k}{(s_k^T y_k)^{1/2}(s_k^T B_k s_k)^{1/2}} = \frac{(s_k^T y_k)^{1/2}}{(s_k^T B_k s_k)^{1/2}},\qquad C_{21} = v_2^T B_k^{-1} u_1 = -C_{12}.$$

48 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof (2/3). In this way the matrix $C$ and its inverse have the form
$$C = \begin{pmatrix} \beta & \alpha\\ -\alpha & 0 \end{pmatrix},\qquad C^{-1} = \frac{1}{\alpha^2}\begin{pmatrix} 0 & -\alpha\\ \alpha & \beta \end{pmatrix},\qquad \beta = 1 + \frac{y_k^T H_k y_k}{s_k^T y_k},\qquad \alpha = \frac{(s_k^T y_k)^{1/2}}{(s_k^T B_k s_k)^{1/2}}.$$
Setting $\widetilde{U} = H_k U$ and $\widetilde{V} = H_k V$, i.e. $\widetilde{u}_i = H_k u_i$ and $\widetilde{v}_i = H_k v_i$, $i = 1, 2$, we have
$$H_{k+1} = H_k - H_k U C^{-1} V^T H_k = H_k - \widetilde{U} C^{-1} \widetilde{V}^T.$$

49 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof (3/3). Notice that (the matrix product is $\mathbb{R}^{n\times 2}\cdot\mathbb{R}^{2\times 2}\cdot\mathbb{R}^{2\times n}$)
$$\widetilde{U} C^{-1} \widetilde{V}^T = \frac{1}{\alpha^2}\,\bigl(\widetilde{u}_1\ \ \widetilde{u}_2\bigr)\begin{pmatrix} 0 & -\alpha\\ \alpha & \beta \end{pmatrix}\begin{pmatrix} \widetilde{v}_1^T\\ \widetilde{v}_2^T \end{pmatrix} = \frac{1}{\alpha}\bigl(\widetilde{u}_2\widetilde{v}_1^T - \widetilde{u}_1\widetilde{v}_2^T\bigr) + \frac{\beta}{\alpha^2}\,\widetilde{u}_2\widetilde{v}_2^T.$$
Here $\widetilde{u}_1 = \widetilde{v}_1 = H_k y_k/(s_k^T y_k)^{1/2}$ and $\widetilde{u}_2 = -\widetilde{v}_2 = s_k/(s_k^T B_k s_k)^{1/2}$. Substituting the values of $\alpha$, $\beta$, the $\widetilde{u}$'s and the $\widetilde{v}$'s we obtain
$$H_{k+1} = H_k - \frac{H_k y_k s_k^T + s_k y_k^T H_k}{s_k^T y_k} + \frac{s_k s_k^T}{s_k^T y_k}\left(1 + \frac{y_k^T H_k y_k}{s_k^T y_k}\right).$$
At this point the update formula (B) is a straightforward calculation.

50 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Positive definiteness of the BFGS update
Theorem (positive definiteness of the BFGS update). Given $H_k$ symmetric and positive definite, the BFGS update
$$H_{k+1} = \left(I - \frac{s_k y_k^T}{s_k^T y_k}\right) H_k \left(I - \frac{y_k s_k^T}{s_k^T y_k}\right) + \frac{s_k s_k^T}{s_k^T y_k}$$
produces $H_{k+1}$ positive definite if and only if $s_k^T y_k > 0$.
Remark (with Wolfe line search the BFGS update is SPD). Expanding $s_k^T y_k > 0$ we have $\nabla f(x_{k+1})\, s_k > \nabla f(x_k)\, s_k$. Remember that in a minimum search algorithm $s_k = \alpha_k p_k$ with $\alpha_k > 0$. The second Wolfe condition for the line search is $\nabla f(x_k + \alpha_k p_k)\, p_k \geq c_2\, \nabla f(x_k)\, p_k$ with $0 < c_2 < 1$, and since $\nabla f(x_k)\, p_k < 0$ for a descent direction this implies
$$\nabla f(x_{k+1})\, s_k \geq c_2\, \nabla f(x_k)\, s_k > \nabla f(x_k)\, s_k \quad\Longrightarrow\quad s_k^T y_k > 0.$$

51 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof. Let $s_k^T y_k > 0$ and consider $z \neq 0$; then
$$z^T H_{k+1} z = w^T H_k w + \frac{(z^T s_k)^2}{s_k^T y_k},\qquad w = z - y_k\,\frac{s_k^T z}{s_k^T y_k}.$$
In order to have $z^T H_{k+1} z = 0$ we must have $w = 0$ and $z^T s_k = 0$. But $z^T s_k = 0$ implies $w = z$, and this implies $z = 0$; hence $z^T H_{k+1} z > 0$ for all $z\neq 0$.
Conversely, let $z^T H_{k+1} z > 0$ for all $z \neq 0$. Choosing $z = y_k$ we have
$$0 < y_k^T H_{k+1} y_k = \frac{(s_k^T y_k)^2}{s_k^T y_k} = s_k^T y_k,$$
and thus $s_k^T y_k > 0$.

52 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Algorithm (BFGS quasi-Newton algorithm)
k ← 0; x assigned; g ← ∇f(x)^T; H ← ∇²f(x)^{-1};
while ‖g‖ > ε do
    compute the search direction d ← −H g;
    approximate arg min_{α>0} f(x + α d) by a line search;
    perform the step x ← x + α d;
    update H:
        y ← ∇f(x)^T − g;
        z ← H y;
        g ← ∇f(x)^T;
        H ← H − (z d^T + d z^T)/(d^T y) + (α + (y^T z)/(d^T y)) (d d^T)/(d^T y);
    k ← k + 1;
end while

53 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Theorem (property of the BFGS update). Let $q(x) = \frac{1}{2}(x - x^\star)^T A (x - x^\star) + c$ with $A\in\mathbb{R}^{n\times n}$ symmetric and positive definite. Let $x_0$ and $H_0$ be assigned and let $\{x_k\}$ and $\{H_k\}$ be produced by the sequence $\{s_k\}$:
1. $x_{k+1} = x_k + s_k$;
2. $H_{k+1} = \left(I - \dfrac{s_k y_k^T}{s_k^T y_k}\right) H_k \left(I - \dfrac{y_k s_k^T}{s_k^T y_k}\right) + \dfrac{s_k s_k^T}{s_k^T y_k}$;
where $s_k = \alpha_k p_k$ and $\alpha_k$ is obtained by exact line search. Then for $j < k$ we have
1. $g_k^T s_j = 0$;   [orthogonality property]
2. $H_k y_j = s_j$;   [hereditary property]
3. $s_k^T A s_j = 0$;   [conjugate direction property]
4. The method terminates (i.e. $\nabla f(x_m) = 0$) at $x_m = x^\star$ with $m \leq n$. If $m = n$ then $H_n = A^{-1}$.

54 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof (1/4). Points (1), (2) and (3) are proved by induction. The base of the induction is obvious; assume the theorem is true up to $k > 0$. Due to the exact line search we have $g_{k+1}^T s_k = 0$; moreover, by induction, for $j < k$ we have $g_{k+1}^T s_j = 0$. In fact
$$g_{k+1}^T s_j = g_{j+1}^T s_j + \sum_{i=j+1}^{k} (g_{i+1} - g_i)^T s_j = 0 + \sum_{i=j+1}^{k} \bigl(A(x_{i+1} - x^\star) - A(x_i - x^\star)\bigr)^T s_j = \sum_{i=j+1}^{k} \bigl(A(x_{i+1} - x_i)\bigr)^T s_j = \sum_{i=j+1}^{k} s_i^T A s_j = 0.$$
[exact line search + induction + conjugacy property]

55 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof (2/4). Using $s_{k+1} = -\alpha_{k+1} H_{k+1} g_{k+1}$ we have $s_{k+1}^T A s_j = 0$; in fact
$$s_{k+1}^T A s_j = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} (A x_{j+1} - A x_j) = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} \bigl(A(x_{j+1} - x^\star) - A(x_j - x^\star)\bigr) = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} (g_{j+1} - g_j) = -\alpha_{k+1}\, g_{k+1}^T H_{k+1} y_j = -\alpha_{k+1}\, g_{k+1}^T s_j = 0.$$
[induction + hereditary property] Notice that we have used $A s_j = y_j$.

56 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof (3/4). Due to the BFGS construction we have $H_{k+1} y_k = s_k$. By the inductive hypothesis, for $j < k$ we have $s_k^T y_j = s_k^T A s_j = 0$, and from the BFGS formula
$$H_{k+1} y_j = \left(I - \frac{s_k y_k^T}{s_k^T y_k}\right) H_k \left(y_j - y_k\,\frac{s_k^T y_j}{s_k^T y_k}\right) + s_k\,\frac{s_k^T y_j}{s_k^T y_k} = \left(I - \frac{s_k y_k^T}{s_k^T y_k}\right) H_k y_j + 0 = s_j - s_k\,\frac{y_k^T s_j}{s_k^T y_k} \quad [H_k y_j = s_j]$$
$$= s_j,\qquad\text{since } y_k^T s_j = s_k^T A s_j = 0.$$

57 The Broyden Fletcher Goldfarb and Shanno (BFGS) update
Proof (4/4). Finally, if $m = n$, the steps $s_j$, $j = 0, 1, \ldots, n-1$, are conjugate and linearly independent. From the hereditary property and the lemma on slide 8,
$$H_n A s_k = H_n y_k = s_k,\qquad\text{i.e.}\qquad H_n A s_k = s_k,\quad k = 0, 1, \ldots, n-1,$$
and due to the linear independence of $\{s_k\}$ it follows that $H_n = A^{-1}$.

58 The Broyden class

59 The Broyden class
The BFGS update
$$H^{BFGS}_{k+1} = H_k - \frac{H_k y_k s_k^T + s_k y_k^T H_k}{s_k^T y_k} + \frac{s_k s_k^T}{s_k^T y_k}\left(1 + \frac{y_k^T H_k y_k}{s_k^T y_k}\right)$$
and the DFP update
$$H^{DFP}_{k+1} = H_k + \frac{s_k s_k^T}{s_k^T y_k} - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k}$$
both maintain symmetry and positive definiteness. The following update
$$H^{\theta}_{k+1} = (1 - \theta)\, H^{DFP}_{k+1} + \theta\, H^{BFGS}_{k+1}$$
maintains symmetry for any $\theta$, and for $\theta\in[0, 1]$ also positive definiteness.
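
Combining the two updates gives the whole Broyden class in a few lines; broyden_class_update is an assumed name, and the DFP and BFGS pieces are the same formulas sketched earlier, inlined here so the function is self-contained.

```python
import numpy as np

def broyden_class_update(H, s, y, theta):
    """Broyden class: (1 - theta)*DFP + theta*BFGS; theta in [0, 1] keeps H SPD."""
    Hy = H @ y
    dfp = H + np.outer(s, s) / (s @ y) - np.outer(Hy, Hy) / (y @ Hy)
    rho = 1.0 / (s @ y)
    V = np.eye(len(s)) - rho * np.outer(s, y)
    bfgs = V @ H @ V.T + rho * np.outer(s, s)
    return (1.0 - theta) * dfp + theta * bfgs
```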

60 The Broyden class
Positive definiteness of the Broyden class update
Theorem (positive definiteness of the Broyden class update). Given $H_k$ symmetric and positive definite, the Broyden class update
$$H^{\theta}_{k+1} = (1 - \theta)\, H^{DFP}_{k+1} + \theta\, H^{BFGS}_{k+1}$$
produces $H^{\theta}_{k+1}$ positive definite for any $\theta\in[0, 1]$ if and only if $s_k^T y_k > 0$.

61 The Broyden class
Theorem (property of the Broyden class update). Let $q(x) = \frac{1}{2}(x - x^\star)^T A (x - x^\star) + c$ with $A\in\mathbb{R}^{n\times n}$ symmetric and positive definite. Let $x_0$ and $H_0$ be assigned and let $\{x_k\}$ and $\{H_k\}$ be produced by the sequence $\{s_k\}$:
1. $x_{k+1} = x_k + s_k$;
2. $H^{\theta}_{k+1} = (1 - \theta)\, H^{DFP}_{k+1} + \theta\, H^{BFGS}_{k+1}$;
where $s_k = \alpha_k p_k$ and $\alpha_k$ is obtained by exact line search. Then for $j < k$ we have
1. $g_k^T s_j = 0$;   [orthogonality property]
2. $H_k y_j = s_j$;   [hereditary property]
3. $s_k^T A s_j = 0$;   [conjugate direction property]
4. The method terminates (i.e. $\nabla f(x_m) = 0$) at $x_m = x^\star$ with $m \leq n$. If $m = n$ then $H_n = A^{-1}$.

62 The Broyden class
The Broyden class update can be written as
$$H^{\theta}_{k+1} = H^{DFP}_{k+1} + \theta\, w_k w_k^T = H^{BFGS}_{k+1} + (\theta - 1)\, w_k w_k^T,$$
where
$$w_k = \left(y_k^T H_k y_k\right)^{1/2}\left[\frac{s_k}{s_k^T y_k} - \frac{H_k y_k}{y_k^T H_k y_k}\right].$$
For particular values of $\theta$ we obtain:
1. $\theta = 0$, the DFP update;
2. $\theta = 1$, the BFGS update;
3. $\theta = s_k^T y_k / \bigl((s_k - H_k y_k)^T y_k\bigr)$, the SR1 update;
4. $\theta = \bigl(1 \pm (y_k^T H_k y_k / s_k^T y_k)\bigr)^{-1}$, the Hoshino update.

63 References
J. Stoer and R. Bulirsch, Introduction to Numerical Analysis, Springer-Verlag, Texts in Applied Mathematics, 12.
J. E. Dennis, Jr. and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Classics in Applied Mathematics, 16.


More information

Quadratic Programming

Quadratic Programming Quadratic Programming Outline Linearly constrained minimization Linear equality constraints Linear inequality constraints Quadratic objective function 2 SideBar: Matrix Spaces Four fundamental subspaces

More information

Introduction. New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems

Introduction. New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems Z. Akbari 1, R. Yousefpour 2, M. R. Peyghami 3 1 Department of Mathematics, K.N. Toosi University of Technology,

More information

OPER 627: Nonlinear Optimization Lecture 14: Mid-term Review

OPER 627: Nonlinear Optimization Lecture 14: Mid-term Review OPER 627: Nonlinear Optimization Lecture 14: Mid-term Review Department of Statistical Sciences and Operations Research Virginia Commonwealth University Oct 16, 2013 (Lecture 14) Nonlinear Optimization

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

New class of limited-memory variationally-derived variable metric methods 1

New class of limited-memory variationally-derived variable metric methods 1 New class of limited-memory variationally-derived variable metric methods 1 Jan Vlček, Ladislav Lukšan Institute of Computer Science, Academy of Sciences of the Czech Republic, L. Lukšan is also from Technical

More information

Lecture Notes 6: Dynamic Equations Part C: Linear Difference Equation Systems

Lecture Notes 6: Dynamic Equations Part C: Linear Difference Equation Systems University of Warwick, EC9A0 Maths for Economists Peter J. Hammond 1 of 45 Lecture Notes 6: Dynamic Equations Part C: Linear Difference Equation Systems Peter J. Hammond latest revision 2017 September

More information

Constrained Nonlinear Optimization Algorithms

Constrained Nonlinear Optimization Algorithms Department of Industrial Engineering and Management Sciences Northwestern University waechter@iems.northwestern.edu Institute for Mathematics and its Applications University of Minnesota August 4, 2016

More information

MATH 4211/6211 Optimization Basics of Optimization Problems

MATH 4211/6211 Optimization Basics of Optimization Problems MATH 4211/6211 Optimization Basics of Optimization Problems Xiaojing Ye Department of Mathematics & Statistics Georgia State University Xiaojing Ye, Math & Stat, Georgia State University 0 A standard minimization

More information

Minimax Design of Complex-Coefficient FIR Filters with Low Group Delay

Minimax Design of Complex-Coefficient FIR Filters with Low Group Delay Minimax Design of Complex-Coefficient FIR Filters with Low Group Delay Wu-Sheng Lu Takao Hinamoto Dept. of Elec. and Comp. Engineering Graduate School of Engineering University of Victoria Hiroshima University

More information

An Alternative Three-Term Conjugate Gradient Algorithm for Systems of Nonlinear Equations

An Alternative Three-Term Conjugate Gradient Algorithm for Systems of Nonlinear Equations International Journal of Mathematical Modelling & Computations Vol. 07, No. 02, Spring 2017, 145-157 An Alternative Three-Term Conjugate Gradient Algorithm for Systems of Nonlinear Equations L. Muhammad

More information