Reduced-Hessian Methods for Constrained Optimization

1 Reduced-Hessian Methods for Constrained Optimization Philip E. Gill University of California, San Diego Joint work with: Michael Ferry & Elizabeth Wong 11th US & Mexico Workshop on Optimization and its Applications Huatulco, Mexico, January 8-12. UC San Diego Center for Computational Mathematics 1/45

2 Our honoree... UC San Diego Center for Computational Mathematics 2/45

3 Outline 1 Reduced-Hessian Methods for Unconstrained Optimization 2 Bound-Constrained Optimization 3 Quasi-Wolfe Line Search 4 Reduced-Hessian Methods for Bound-Constrained Optimization 5 Some Numerical Results UC San Diego Center for Computational Mathematics 3/45

4 Reduced-Hessian Methods for Unconstrained Optimization UC San Diego Center for Computational Mathematics 4/45

5 Definitions Minimize f : R^n → R, f ∈ C^2, with a quasi-Newton line-search method: Given x_k, let f_k = f(x_k), g_k = ∇f(x_k), and H_k ≈ ∇²f(x_k). Choose p_k such that x_k + p_k minimizes the quadratic model q_k(x) = f_k + g_k^T (x − x_k) + ½ (x − x_k)^T H_k (x − x_k). If H_k is positive definite then p_k satisfies H_k p_k = −g_k. (QN step) UC San Diego Center for Computational Mathematics 5/45

6 Definitions Define x_{k+1} = x_k + α_k p_k, where α_k is obtained from a line search on φ_k(α) = f(x_k + α p_k). Armijo condition: φ_k(α) ≤ φ_k(0) + η_A α φ_k′(0), η_A ∈ (0, ½). (Strong) Wolfe conditions: φ_k(α) ≤ φ_k(0) + η_A α φ_k′(0), η_A ∈ (0, ½); |φ_k′(α)| ≤ η_W |φ_k′(0)|, η_W ∈ (η_A, 1). UC San Diego Center for Computational Mathematics 6/45
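As an illustration of these conditions (not part of the slides), here is a minimal NumPy sketch that tests whether a trial step satisfies the Armijo and strong Wolfe conditions; the test function, the point, and the parameter values η_A = 1e-4 and η_W = 0.9 are illustrative assumptions.

```python
import numpy as np

def f(x):
    # Simple smooth test function (illustrative only).
    return 0.5 * float(x @ x) + np.sin(x[0])

def grad(x):
    g = x.copy()
    g[0] += np.cos(x[0])
    return g

def check_step(x, p, alpha, eta_A=1e-4, eta_W=0.9):
    """Check the Armijo and strong-Wolfe curvature conditions for phi(a) = f(x + a*p)."""
    phi0, dphi0 = f(x), float(grad(x) @ p)            # phi(0), phi'(0)
    phia, dphia = f(x + alpha * p), float(grad(x + alpha * p) @ p)
    armijo = phia <= phi0 + eta_A * alpha * dphi0
    curvature = abs(dphia) <= eta_W * abs(dphi0)      # strong Wolfe curvature condition
    return armijo, curvature

x = np.array([1.0, -2.0])
p = -grad(x)                                          # descent direction
for alpha in (1.0, 0.5, 0.1):
    print(alpha, check_step(x, p, alpha))
```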

7 Wolfe conditions [Figure: plot of f(x_k + α p_k) against the step length α, illustrating steps that satisfy the Wolfe conditions]

8 Quasi-Newton Methods Updating H_k: H_0 = σ I_n where σ > 0. Compute H_{k+1} as the BFGS update to H_k, i.e., H_{k+1} = H_k − (1 / s_k^T H_k s_k) H_k s_k s_k^T H_k + (1 / y_k^T s_k) y_k y_k^T, where s_k = x_{k+1} − x_k, y_k = g_{k+1} − g_k, and y_k^T s_k approximates the curvature of f along p_k. The Wolfe condition guarantees that H_k can be updated. One option to calculate p_k: store the upper-triangular Cholesky factor R_k, where R_k^T R_k = H_k. UC San Diego Center for Computational Mathematics 8/45
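A dense-matrix sketch of the update above, maintaining H_k itself rather than its Cholesky factor R_k (the factored form used in practice is not shown); the curvature-skip tolerance is an illustrative assumption.

```python
import numpy as np

def bfgs_update(H, s, y, tol=1e-12):
    """BFGS update of H using s = x_{k+1} - x_k and y = g_{k+1} - g_k."""
    Hs = H @ s
    sHs = float(s @ Hs)
    ys = float(y @ s)                 # curvature along the step; positive after a Wolfe step
    if ys <= tol * np.linalg.norm(s) * np.linalg.norm(y):
        return H                      # skip the update if the curvature is not usable
    return H - np.outer(Hs, Hs) / sHs + np.outer(y, y) / ys

# Example on a quadratic f(x) = 0.5 x^T A x: the updated H satisfies the secant condition H s = y.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
H = np.eye(2)                         # H_0 = sigma I with sigma = 1
x0, x1 = np.array([1.0, 1.0]), np.array([0.5, 0.2])
s, y = x1 - x0, A @ (x1 - x0)
H = bfgs_update(H, s, y)
print("secant condition residual:", np.linalg.norm(H @ s - y))
```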

10 Reduced-Hessian Methods (Fenelon, 1981 and Siegel, 1992) Let G_k = span(g_0, g_1, ..., g_k) and G_k^⊥ be the orthogonal complement of G_k in R^n. Result: Consider a quasi-Newton method with BFGS update applied to a general nonlinear function. If H_0 = σI (σ > 0), then: p_k ∈ G_k for all k; if z ∈ G_k and w ∈ G_k^⊥, then H_k z ∈ G_k and H_k w = σw. UC San Diego Center for Computational Mathematics 9/45
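This result is easy to check numerically. The sketch below (illustrative, not from the slides) runs a few BFGS iterations with H_0 = σI on a random convex quadratic, using unit steps, and verifies that p_k lies in G_k and that H_k acts as σI on G_k^⊥.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 6, 2.0
M = rng.standard_normal((n, n))
A = M.T @ M + np.eye(n)                      # SPD Hessian of a convex quadratic
b = rng.standard_normal(n)
grad = lambda x: A @ x - b

x, H = rng.standard_normal(n), sigma * np.eye(n)
G = []                                       # gradients g_0, ..., g_k
for k in range(3):
    g = grad(x); G.append(g)
    p = np.linalg.solve(H, -g)               # quasi-Newton step H_k p_k = -g_k
    Z, _ = np.linalg.qr(np.column_stack(G))  # orthonormal basis for G_k
    print("dist(p_k, G_k)      =", np.linalg.norm(p - Z @ (Z.T @ p)))
    w = rng.standard_normal(n)
    w -= Z @ (Z.T @ w)                       # w in the orthogonal complement of G_k
    print("||H_k w - sigma w|| =", np.linalg.norm(H @ w - sigma * w))
    s = p                                    # unit step; the identities hold for any step length
    x = x + s
    y = grad(x) - g
    H = H - np.outer(H @ s, H @ s) / (s @ H @ s) + np.outer(y, y) / (y @ s)
```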

14 Reduced-Hessian Methods Significance of p_k ∈ G_k: No need to minimize the quadratic model over the full space; search directions lie in an expanding sequence of subspaces. Significance of H_k z ∈ G_k and H_k w = σw: The curvature stored in H_k along any unit vector in G_k^⊥ is σ; all nontrivial curvature information in H_k can be stored in a smaller r_k × r_k matrix, where r_k = dim(G_k). UC San Diego Center for Computational Mathematics 10/45

16 Reduced-Hessian Methods Given a matrix B_k ∈ R^{n×r_k} whose columns span G_k, let B_k = Z_k T_k be the QR decomposition of B_k; W_k be a matrix whose orthonormal columns span G_k^⊥; Q_k = ( Z_k  W_k ). Then H_k p_k = −g_k is equivalent to (Q_k^T H_k Q_k) Q_k^T p_k = −Q_k^T g_k, where Q_k^T H_k Q_k = [ Z_k^T H_k Z_k  Z_k^T H_k W_k ; W_k^T H_k Z_k  W_k^T H_k W_k ] = [ Z_k^T H_k Z_k  0 ; 0  σ I_{n−r_k} ] and Q_k^T g_k = ( Z_k^T g_k ; 0 ). UC San Diego Center for Computational Mathematics 11/45

19 Reduced-Hessian Methods A reduced-Hessian (RH) method obtains p_k from p_k = Z_k q_k, where q_k solves Z_k^T H_k Z_k q_k = −Z_k^T g_k (RH step), which is equivalent to (QN step). In practice, we use a Cholesky factorization R_k^T R_k = Z_k^T H_k Z_k. The new gradient g_{k+1} is accepted iff ‖(I − Z_k Z_k^T) g_{k+1}‖ > ɛ. Store and update Z_k, R_k, Z_k^T p_k, Z_k^T g_k, and Z_k^T g_{k+1}. UC San Diego Center for Computational Mathematics 12/45
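A minimal sketch of one RH step, assuming for illustration that a dense H_k is available (the actual method never forms H_k and instead updates R_k directly); the acceptance tolerance eps and the toy problem data are assumptions.

```python
import numpy as np

def rh_step(H, gradients, g, eps=1e-8):
    """Reduced-Hessian step: expand the basis if g has a component outside span(Z),
    then solve Z^T H Z q = -Z^T g via a Cholesky factorization R^T R = Z^T H Z."""
    Z, _ = np.linalg.qr(np.column_stack(gradients))
    if np.linalg.norm(g - Z @ (Z.T @ g)) > eps:      # gradient-acceptance test
        gradients = gradients + [g]
        Z, _ = np.linalg.qr(np.column_stack(gradients))
    R = np.linalg.cholesky(Z.T @ H @ Z).T            # upper triangular
    q = np.linalg.solve(R, np.linalg.solve(R.T, -Z.T @ g))
    return Z @ q, gradients

# Toy usage on a quadratic with H_0 = sigma I (illustrative).
n, sigma = 5, 2.0
A = np.diag(np.arange(1.0, n + 1)); b = np.ones(n)
x = np.zeros(n)
H = sigma * np.eye(n)
grads = [A @ x - b]
p, grads = rh_step(H, grads, A @ x - b)
print("p_k =", p)
```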

21 H_k = Q_k (Q_k^T H_k Q_k) Q_k^T = ( Z_k  W_k ) [ Z_k^T H_k Z_k  0 ; 0  σ I_{n−r_k} ] ( Z_k  W_k )^T = Z_k (Z_k^T H_k Z_k) Z_k^T + σ(I − Z_k Z_k^T). Hence any z such that Z_k^T z = 0 satisfies H_k z = σz. UC San Diego Center for Computational Mathematics 13/45

22 Reduced-Hessian Method Variants Reinitialization: If g_{k+1} ∉ G_k, the Cholesky factor R_k is updated as R_k → R_{k+1} = [ R_k  0 ; 0  σ_{k+1}^{1/2} ], where σ_{k+1} is based on the latest estimate of the curvature, e.g., σ_{k+1} = y_k^T s_k / s_k^T s_k. Lingering: Restrict the search direction to a smaller subspace and allow the subspace to expand only when f is suitably minimized on that subspace. UC San Diego Center for Computational Mathematics 14/45
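A small sketch of the reinitialization step, under the assumption that the new curvature estimate enters the Cholesky factor as a diagonal entry σ_{k+1}^{1/2} (so that R_{k+1}^T R_{k+1} stores curvature σ_{k+1} along the new direction); the data below is illustrative.

```python
import numpy as np

def reinitialize(R, s, y):
    """Expand the upper-triangular factor R when a new gradient enters the basis,
    using sigma = y^T s / s^T s as the curvature estimate for the new direction."""
    sigma = float(y @ s) / float(s @ s)
    r = R.shape[0]
    R_new = np.zeros((r + 1, r + 1))
    R_new[:r, :r] = R
    R_new[r, r] = np.sqrt(sigma)      # so R_new^T R_new has sigma in the new diagonal entry
    return R_new

R = np.linalg.cholesky(np.array([[4.0, 1.0], [1.0, 3.0]])).T   # an existing reduced factor
s = np.array([0.5, -0.2, 0.1])                                  # latest step
y = np.array([1.0, -0.3, 0.4])                                  # latest gradient change
print(reinitialize(R, s, y))
```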

24 Reduced-Hessian Method Variants Limited-memory: Instead of storing the full approximate Hessian, keep information from only the last m steps (m ≪ n). Key differences: Form Z_k from search directions instead of the gradients. Must store T_k from B_k = Z_k T_k to update most quantities. Drop columns from B_k when necessary. UC San Diego Center for Computational Mathematics 15/45

26 Main idea: Maintain B_k = Z_k T_k, where B_k is the n × m matrix B_k = ( p_{k−m+1}  p_{k−m+2}  ···  p_{k−1}  p_k ) with m ≪ n (e.g., m = 5). B_k = Z_k T_k is not available until p_k has been computed. However, if Z_k is a basis for span(p_{k−m+1}, ..., p_{k−1}, p_k), then it is also a basis for span(p_{k−m+1}, ..., p_{k−1}, g_k), i.e., only the triangular factor associated with the (common) basis for each of these subspaces will be different. UC San Diego Center for Computational Mathematics 16/45

28 At the start of iteration k, suppose we have the factors of B_k^{(g)} = ( p_{k−m+1}  p_{k−m+2}  ···  p_{k−1}  g_k ). Compute p_k. Swap p_k with g_k to give the factors of B_k = ( p_{k−m+1}  p_{k−m+2}  ···  p_{k−1}  p_k ). Compute x_{k+1}, g_{k+1}, etc., using a Wolfe line search. Update and reinitialize the factor R_k. If g_{k+1} is accepted, compute the factors of B_{k+1}^{(g)} = ( p_{k−m+2}  ···  p_k  g_{k+1} ). UC San Diego Center for Computational Mathematics 17/45
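The "swap" can be illustrated numerically: if B^{(g)} = Z T and the new direction satisfies p = Z q (it lies in the span of the current basis), then replacing the last column g_k by p_k leaves Z unchanged and only alters the last column of the triangular factor. A small sketch of this observation, with random data standing in for the stored directions (all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 4
P = rng.standard_normal((n, m - 1))          # previous search directions
g = rng.standard_normal(n)                   # current gradient
Bg = np.column_stack([P, g])                 # B_k^{(g)} = (p_{k-m+1} ... p_{k-1} g_k)
Z, T = np.linalg.qr(Bg)

q = rng.standard_normal(m)                   # reduced step; in practice q solves R^T R q = -Z^T g
p = Z @ q                                    # p_k lies in span(Z) by construction
B = np.column_stack([P, p])                  # B_k = (p_{k-m+1} ... p_{k-1} p_k)

T_swapped = T.copy()
T_swapped[:, -1] = q                         # only the last column of the triangular factor changes
print("||Z T_swapped - B_k|| =", np.linalg.norm(Z @ T_swapped - B))
```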

34 Summary of key differences: Form Z_k from search directions instead of gradients. Must store T_k from B_k = Z_k T_k to update most quantities. Can store B_k instead of Z_k. Drop columns from B_k when necessary. Half the storage of conventional limited-memory approaches. Retains the finite-termination property for quadratic f (both with and without reinitialization). UC San Diego Center for Computational Mathematics 18/45

36 Bound-Constrained Optimization UC San Diego Center for Computational Mathematics 19/45

37 Bound Constraints Given l, u ∈ R^n, solve: minimize_{x ∈ R^n} f(x) subject to l ≤ x ≤ u. Focus on line-search methods that use the BFGS method: Projected-gradient [Byrd, Lu, Nocedal, Zhu (1995)]; Projected-search [Bertsekas (1982)]. These methods are designed to move on and off constraints rapidly and identify the active set after a finite number of iterations. UC San Diego Center for Computational Mathematics 20/45

38 Projected-Gradient Methods UC San Diego Center for Computational Mathematics 21/45

39 Definitions A(x) = {i : x_i = l_i or x_i = u_i}. Define the projection P(x) componentwise, where [P(x)]_i = l_i if x_i < l_i, u_i if x_i > u_i, and x_i otherwise. Given an iterate x_k, define the piecewise-linear paths x_{g_k}(α) = P(x_k − α g_k) and x_{p_k}(α) = P(x_k + α p_k). UC San Diego Center for Computational Mathematics 22/45
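A minimal sketch of the projection P and the projected paths (the bounds and data are illustrative):

```python
import numpy as np

def project(x, l, u):
    """Componentwise projection P(x) onto the box [l, u]."""
    return np.minimum(np.maximum(x, l), u)

def projected_path(x, d, l, u, alpha):
    """Piecewise-linear path P(x + alpha*d); use d = -g for x_g(alpha), d = p for x_p(alpha)."""
    return project(x + alpha * d, l, u)

l, u = np.zeros(3), np.ones(3)
x = np.array([0.2, 0.9, 0.5])
g = np.array([1.0, -2.0, 0.3])
for alpha in (0.0, 0.25, 0.5, 1.0):
    print(alpha, projected_path(x, -g, l, u, alpha))   # points on the projected-gradient path
```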

40 Algorithm L-BFGS-B Given x_k and q_k(x), a typical iteration of L-BFGS-B looks like: [Figure: iterates on the feasible box, built up over several frames] Move along the projected path x_{g_k}(α). Find x_k^c, the first point that minimizes q_k(x) along x_{g_k}(α). Find the minimizer of q_k(x) with x_i fixed for every i ∈ A(x_k^c). Project that minimizer onto the feasible region. Wolfe line search along p_k with α_max = 1 to ensure feasibility. UC San Diego Center for Computational Mathematics 23/45

45 Projected-Search Methods UC San Diego Center for Computational Mathematics 24/45

46 Definitions A(x) = {i : x_i = l_i or x_i = u_i}, W(x) = {i : (x_i = l_i and g_i > 0) or (x_i = u_i and g_i < 0)}. Given a point x and a vector p, the projected vector of p at x, P_x(p), is defined componentwise, where [P_x(p)]_i = 0 if i ∈ W(x), and p_i otherwise. Given a subspace S, the projected subspace of S at x is defined as P_x(S) = {P_x(p) : p ∈ S}. UC San Diego Center for Computational Mathematics 25/45
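A short sketch of these definitions (the bounds, gradient, and tolerance are illustrative):

```python
import numpy as np

def working_set(x, g, l, u, tol=0.0):
    """Indices i with (x_i = l_i and g_i > 0) or (x_i = u_i and g_i < 0)."""
    at_lower = np.abs(x - l) <= tol
    at_upper = np.abs(u - x) <= tol
    return (at_lower & (g > 0)) | (at_upper & (g < 0))

def project_vector(p, W):
    """Projected vector P_x(p): zero the components in the working set W(x)."""
    q = p.copy()
    q[W] = 0.0
    return q

l, u = np.zeros(4), np.ones(4)
x = np.array([0.0, 1.0, 0.3, 1.0])
g = np.array([2.0, -1.0, 0.5, 3.0])
W = working_set(x, g, l, u)
print(W, project_vector(-g, W))
```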

48 Projected-Search Methods A typical projected-search method updates x_k as follows: Compute W(x_k). Calculate p_k as the solution of min_{p ∈ P_{x_k}(R^n)} q_k(x_k + p). Obtain x_{k+1} from an Armijo-like line search on ψ(α) = f(x_{p_k}(α)). UC San Diego Center for Computational Mathematics 26/45
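A minimal backtracking sketch of an Armijo-like search on ψ(α) = f(x_{p_k}(α)); the sufficient-decrease test used here (decrease measured against g^T(x(α) − x), a common choice for projected searches) is a simplified assumption, not necessarily the precise condition used in the talk.

```python
import numpy as np

def project(x, l, u):
    return np.minimum(np.maximum(x, l), u)

def projected_armijo_search(f, g, x, p, l, u, eta_A=1e-4, shrink=0.5, max_iter=30):
    """Backtrack on psi(alpha) = f(P(x + alpha*p)):
    accept alpha if f(x(alpha)) <= f(x) + eta_A * g^T (x(alpha) - x)."""
    fx = f(x)
    alpha = 1.0
    for _ in range(max_iter):
        xa = project(x + alpha * p, l, u)
        if f(xa) <= fx + eta_A * float(g @ (xa - x)):
            return alpha, xa
        alpha *= shrink
    return alpha, project(x + alpha * p, l, u)

# Toy convex quadratic on the unit box (illustrative).
A = np.diag([1.0, 4.0, 9.0]); b = np.array([2.0, 2.0, 2.0])
f = lambda x: 0.5 * float(x @ (A @ x)) - float(b @ x)
l, u = np.zeros(3), np.ones(3)
x = np.array([0.5, 0.5, 0.5])
g = A @ x - b
alpha, x_new = projected_armijo_search(f, g, x, -g, l, u)
print(alpha, x_new)
```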

49 Projected-Search Methods Given x_k and f(x), a typical iteration looks like: [Figure: the direction p_k from x_k, with the path bent where P(x_k + α p_k) meets the bounds] Armijo-like line search along P(x_k + α p_k). UC San Diego Center for Computational Mathematics 27/45

52 Quasi-Wolfe Line Search UC San Diego Center for Computational Mathematics 28/45

53 Can't use the Wolfe conditions: ψ(α) is a continuous, piecewise-differentiable function with cusps where x_{p_k}(α) changes direction. As ψ(α) = f(x_{p_k}(α)) is only piecewise differentiable, it is not possible to know when an interval contains a Wolfe step. UC San Diego Center for Computational Mathematics 29/45

54 Definition Let ψ′_−(α) and ψ′_+(α) denote the left- and right-derivatives of ψ(α). Define a [strong] quasi-Wolfe step to be an Armijo-like step that satisfies at least one of the following conditions: |ψ′_−(α)| ≤ η_W |ψ′_+(0)|; |ψ′_+(α)| ≤ η_W |ψ′_+(0)|; ψ′_−(α) ≤ 0 ≤ ψ′_+(α). UC San Diego Center for Computational Mathematics 30/45
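A sketch of how the one-sided derivatives can be evaluated and the quasi-Wolfe test applied; ψ′_± are computed here by the chain rule, freezing the components of the path that are held at a bound just after/before α. This is an illustrative implementation with assumed parameter values, not the line search from the talk.

```python
import numpy as np

def path_one_sided_derivs(x, p, l, u, alpha, tol=1e-12):
    """One-sided derivatives of each component of the path x(alpha) = clip(x + alpha*p, l, u)."""
    t = x + alpha * p
    interior = (t > l + tol) & (t < u - tol)
    at_l = np.abs(t - l) <= tol
    at_u = np.abs(t - u) <= tol
    d_plus = np.where(interior | (at_l & (p > 0)) | (at_u & (p < 0)), p, 0.0)
    d_minus = np.where(interior | (at_l & (p < 0)) | (at_u & (p > 0)), p, 0.0)
    return d_minus, d_plus

def quasi_wolfe(f, grad, x, p, l, u, alpha, eta_A=1e-4, eta_W=0.9):
    """Armijo-like decrease plus at least one of the three strong quasi-Wolfe conditions."""
    xa = np.minimum(np.maximum(x + alpha * p, l), u)
    g0_plus = float(grad(x) @ path_one_sided_derivs(x, p, l, u, 0.0)[1])   # psi'_+(0)
    d_minus, d_plus = path_one_sided_derivs(x, p, l, u, alpha)
    ga = grad(xa)
    psi_minus, psi_plus = float(ga @ d_minus), float(ga @ d_plus)
    armijo = f(xa) <= f(x) + eta_A * alpha * g0_plus
    curv = (abs(psi_minus) <= eta_W * abs(g0_plus)
            or abs(psi_plus) <= eta_W * abs(g0_plus)
            or (psi_minus <= 0.0 <= psi_plus))
    return armijo and curv

# Toy example on the unit box (illustrative).
A = np.diag([2.0, 5.0]); b = np.array([3.0, -1.0])
f = lambda z: 0.5 * float(z @ (A @ z)) - float(b @ z)
grad = lambda z: A @ z - b
l, u = np.zeros(2), np.ones(2)
x = np.array([0.5, 0.5])
p = -grad(x)
print([quasi_wolfe(f, grad, x, p, l, u, a) for a in (0.05, 0.2, 0.5)])
```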

58 Theory Analogous conditions for the existence of a Wolfe step imply the existence of a quasi-Wolfe step. Quasi-Wolfe line searches are ideal for projected-search methods. If ψ is differentiable, the quasi-Wolfe and Wolfe conditions are identical. In rare cases, H_k cannot be updated after a quasi-Wolfe step. UC San Diego Center for Computational Mathematics 31/45

59 Reduced-Hessian Methods for Bound-Constrained Optimization UC San Diego Center for Computational Mathematics 32/45

60 Motivation Projected-search methods typically calculate p_k as the solution to min_{p ∈ P_{x_k}(R^n)} q_k(x_k + p). If N_k has orthonormal columns that span P_{x_k}(R^n): p_k = N_k q_k, where q_k solves (N_k^T H_k N_k) q_k = −N_k^T g_k. An RH method for bound-constrained optimization (RH-B) solves p_k = Z_k q_k, where q_k solves (Z_k^T H_k Z_k) q_k = −Z_k^T g_k, and the orthonormal columns of Z_k span P_{x_k}(G_k). UC San Diego Center for Computational Mathematics 33/45
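A minimal sketch of forming a basis for P_{x_k}(G_k) and solving the reduced system, assuming a dense H_k for illustration (the actual method works with the stored factors); rank detection via the diagonal of R is a simplification, and all data below is hypothetical.

```python
import numpy as np

def projected_subspace_basis(B, free):
    """Orthonormal basis for P_x(span(B)): zero the working-set rows of B, then re-orthogonalize."""
    PB = B.copy()
    PB[~free, :] = 0.0                       # apply P_x to each column
    Q, R = np.linalg.qr(PB)
    keep = np.abs(np.diag(R)) > 1e-10        # drop directions annihilated by the projection
    return Q[:, keep]

def rhb_step(H, B, g, free):
    """Solve (Z^T H Z) q = -Z^T g with Z spanning the projected subspace; return p = Z q."""
    Z = projected_subspace_basis(B, free)
    q = np.linalg.solve(Z.T @ H @ Z, -Z.T @ g)
    return Z @ q

# Illustrative data: 5 variables, the last one held by the working set.
n = 5
rng = np.random.default_rng(2)
H = 2.0 * np.eye(n)
B = rng.standard_normal((n, 3))              # columns spanning G_k
g = rng.standard_normal(n)
free = np.array([True, True, True, True, False])
p = rhb_step(H, B, g, free)
print(p)                                     # note p[-1] == 0: fixed variables do not move
```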

62 Details Builds on the limited-memory RH method framework, plus: Update values dependent on B_k when W_k ≠ W_{k−1}. Use projected-search trappings (working set, projections, ...). Use a line search compatible with projected-search methods. Update H_k with curvature in a direction restricted to range(Z_{k+1}). UC San Diego Center for Computational Mathematics 34/45

63 Cost of Changing the Working Set An RH-B method can be explicit or implicit: an explicit method stores and uses Z_k, and an implicit method stores and uses B_k. When W_k ≠ W_{k−1}, all quantities dependent on B_k are updated: Dropping n_d indices: 21 m² n_d flops if implicit. Adding n_a indices: 24 m² n_a flops if implicit. Using an explicit method: +6 m n (n_a + n_d) flops to update Z_k. UC San Diego Center for Computational Mathematics 35/45

64 Some Numerical Results UC San Diego Center for Computational Mathematics 36/45

65 Results LRH-B (v1.0) and LBFGS-B (v3.0). LRH-B implemented in Fortran 2003; LBFGS-B in Fortran. Problems from the CUTEst test set, with n between 1 and 192,627. Termination: ‖P_{x_k}(g_k)‖ < 10⁻⁵, or 300K iterations, or 1000 cpu seconds. UC San Diego Center for Computational Mathematics 37/45

66 Implementation Algorithm LRH-B (v1.0): Limited-memory: restricted to m preceding search directions. Reinitialization: included if n > min(6, m). Lingering: excluded. Method type: implicit. Updates: B_k, T_k, R_k updated for all W_k ≠ W_{k−1}. UC San Diego Center for Computational Mathematics 38/45

67 Default parameters [Figure: performance profiles comparing LBFGS-B (m = 5) and LRH-B (m = 5); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] Name / m / Failed: LBFGS-B / 5 / 78; LRH-B / 5 / —. UC San Diego Center for Computational Mathematics 39/45

68 [Figure: performance profiles comparing LBFGS-B (m = 5) and LRH-B (m = 5); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] Name / m / Failed: LBFGS-B / 5 / 78; LRH-B / 5 / 50. UC San Diego Center for Computational Mathematics 40/45

69 [Figure: performance profiles comparing LBFGS-B (m = 10) and LRH-B (m = 10); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] Name / m / Failed: LBFGS-B / 10 / —; LRH-B / 10 / —. UC San Diego Center for Computational Mathematics 41/45

70 [Figure: performance profiles comparing LBFGS-B (m = 5, 10) and LRH-B (m = 5, 10); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] UC San Diego Center for Computational Mathematics 42/45

71 Outstanding Issues Projected-search methods: When to update/factor? Better to refactor when there are lots of changes to A. Implement plane rotations via level-two BLAS. How complex should we make the quasi-Wolfe search? UC San Diego Center for Computational Mathematics 43/45

72 Thank you! UC San Diego Center for Computational Mathematics 44/45

73 References D. P. Bertsekas, Projected Newton methods for optimization problems with simple constraints, SIAM J. Control Optim., 20, 1982. R. Byrd, P. Lu, J. Nocedal and C. Zhu, A limited-memory algorithm for bound constrained optimization, SIAM J. Sci. Comput., 16, 1995. M. C. Fenelon, Preconditioned conjugate-gradient-type methods for large-scale unconstrained optimization, Ph.D. thesis, Department of Operations Research, Stanford University, Stanford, CA, 1981. M. W. Ferry, P. E. Gill and E. Wong, A limited-memory reduced Hessian method for bound-constrained optimization, Report CCoM 18-01, Center for Computational Mathematics, University of California, San Diego, 2018 (to appear). P. E. Gill and M. W. Leonard, Reduced-Hessian quasi-Newton methods for unconstrained optimization, SIAM J. Optim., 12. P. E. Gill and M. W. Leonard, Limited-memory reduced-Hessian methods for unconstrained optimization, SIAM J. Optim., 14. J. Morales and J. Nocedal, Remark on Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization, ACM Trans. Math. Softw., 38(1):7:1–7:4. D. Siegel, Modifying the BFGS update by a new column scaling technique, Math. Program., 66:45–78. UC San Diego Center for Computational Mathematics 45/45
