Reduced-Hessian Methods for Constrained Optimization

1 Reduced-Hessian Methods for Constrained Optimization Philip E. Gill University of California, San Diego Joint work with: Michael Ferry & Elizabeth Wong 11th US & Mexico Workshop on Optimization and its Applications Huatulco, Mexico, January 8-12. UC San Diego Center for Computational Mathematics 1/45

2 Our honoree... UC San Diego Center for Computational Mathematics 2/45

3 Outline 1 Reduced-Hessian Methods for Unconstrained Optimization 2 Bound-Constrained Optimization 3 Quasi-Wolfe Line Search 4 Reduced-Hessian Methods for Bound-Constrained Optimization 5 Some Numerical Results UC San Diego Center for Computational Mathematics 3/45

4 Reduced-Hessian Methods for Unconstrained Optimization UC San Diego Center for Computational Mathematics 4/45

5 Definitions Minimize f : R^n → R, f ∈ C^2, with a quasi-Newton line-search method: Given x_k, let f_k = f(x_k), g_k = ∇f(x_k), and H_k ≈ ∇²f(x_k). Choose p_k such that x_k + p_k minimizes the quadratic model q_k(x) = f_k + g_k^T (x − x_k) + ½ (x − x_k)^T H_k (x − x_k). If H_k is positive definite then p_k satisfies H_k p_k = −g_k. (QN step) UC San Diego Center for Computational Mathematics 5/45

6 Definitions Define x_{k+1} = x_k + α_k p_k, where α_k is obtained from a line search on φ_k(α) = f(x_k + α p_k). Armijo condition: φ_k(α) ≤ φ_k(0) + η_A α φ_k′(0), η_A ∈ (0, ½). (Strong) Wolfe conditions: φ_k(α) ≤ φ_k(0) + η_A α φ_k′(0), η_A ∈ (0, ½); |φ_k′(α)| ≤ η_W |φ_k′(0)|, η_W ∈ (η_A, 1). UC San Diego Center for Computational Mathematics 6/45
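As an illustration of these conditions (not part of the slides), here is a minimal NumPy sketch that tests whether a trial step satisfies the Armijo and strong Wolfe conditions; the test function, the point, and the parameter values η_A = 1e-4 and η_W = 0.9 are illustrative assumptions.

```python
import numpy as np

def f(x):
    # Simple smooth test function (illustrative only).
    return 0.5 * float(x @ x) + np.sin(x[0])

def grad(x):
    g = x.copy()
    g[0] += np.cos(x[0])
    return g

def check_step(x, p, alpha, eta_A=1e-4, eta_W=0.9):
    """Check the Armijo and strong-Wolfe curvature conditions for phi(a) = f(x + a*p)."""
    phi0, dphi0 = f(x), float(grad(x) @ p)            # phi(0), phi'(0)
    phia, dphia = f(x + alpha * p), float(grad(x + alpha * p) @ p)
    armijo = phia <= phi0 + eta_A * alpha * dphi0
    curvature = abs(dphia) <= eta_W * abs(dphi0)      # strong Wolfe curvature condition
    return armijo, curvature

x = np.array([1.0, -2.0])
p = -grad(x)                                          # descent direction
for alpha in (1.0, 0.5, 0.1):
    print(alpha, check_step(x, p, alpha))
```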

7 Wolfe conditions [Figure: plot of f(x_k + α p_k) against the step length α, illustrating steps that satisfy the Wolfe conditions]

8 Quasi-Newton Methods Updating H_k: H_0 = σ I_n where σ > 0. Compute H_{k+1} as the BFGS update to H_k, i.e., H_{k+1} = H_k − (1 / s_k^T H_k s_k) H_k s_k s_k^T H_k + (1 / y_k^T s_k) y_k y_k^T, where s_k = x_{k+1} − x_k, y_k = g_{k+1} − g_k, and y_k^T s_k approximates the curvature of f along p_k. The Wolfe condition guarantees that H_k can be updated. One option to calculate p_k: store the upper-triangular Cholesky factor R_k, where R_k^T R_k = H_k. UC San Diego Center for Computational Mathematics 8/45
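A dense-matrix sketch of the update above, maintaining H_k itself rather than its Cholesky factor R_k (the factored form used in practice is not shown); the curvature-skip tolerance is an illustrative assumption.

```python
import numpy as np

def bfgs_update(H, s, y, tol=1e-12):
    """BFGS update of H using s = x_{k+1} - x_k and y = g_{k+1} - g_k."""
    Hs = H @ s
    sHs = float(s @ Hs)
    ys = float(y @ s)                 # curvature along the step; positive after a Wolfe step
    if ys <= tol * np.linalg.norm(s) * np.linalg.norm(y):
        return H                      # skip the update if the curvature is not usable
    return H - np.outer(Hs, Hs) / sHs + np.outer(y, y) / ys

# Example on a quadratic f(x) = 0.5 x^T A x: the updated H satisfies the secant condition H s = y.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
H = np.eye(2)                         # H_0 = sigma I with sigma = 1
x0, x1 = np.array([1.0, 1.0]), np.array([0.5, 0.2])
s, y = x1 - x0, A @ (x1 - x0)
H = bfgs_update(H, s, y)
print("secant condition residual:", np.linalg.norm(H @ s - y))
```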

10 Reduced-Hessian Methods (Fenelon, 1981 and Siegel, 1992) Let G_k = span(g_0, g_1, ..., g_k) and G_k^⊥ be the orthogonal complement of G_k in R^n. Result: Consider a quasi-Newton method with BFGS update applied to a general nonlinear function. If H_0 = σI (σ > 0), then: p_k ∈ G_k for all k; if z ∈ G_k and w ∈ G_k^⊥, then H_k z ∈ G_k and H_k w = σw. UC San Diego Center for Computational Mathematics 9/45
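This result is easy to check numerically. The sketch below (illustrative, not from the slides) runs a few BFGS iterations with H_0 = σI on a random convex quadratic, using unit steps, and verifies that p_k lies in G_k and that H_k acts as σI on G_k^⊥.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 6, 2.0
M = rng.standard_normal((n, n))
A = M.T @ M + np.eye(n)                      # SPD Hessian of a convex quadratic
b = rng.standard_normal(n)
grad = lambda x: A @ x - b

x, H = rng.standard_normal(n), sigma * np.eye(n)
G = []                                       # gradients g_0, ..., g_k
for k in range(3):
    g = grad(x); G.append(g)
    p = np.linalg.solve(H, -g)               # quasi-Newton step H_k p_k = -g_k
    Z, _ = np.linalg.qr(np.column_stack(G))  # orthonormal basis for G_k
    print("dist(p_k, G_k)      =", np.linalg.norm(p - Z @ (Z.T @ p)))
    w = rng.standard_normal(n)
    w -= Z @ (Z.T @ w)                       # w in the orthogonal complement of G_k
    print("||H_k w - sigma w|| =", np.linalg.norm(H @ w - sigma * w))
    s = p                                    # unit step; the identities hold for any step length
    x = x + s
    y = grad(x) - g
    H = H - np.outer(H @ s, H @ s) / (s @ H @ s) + np.outer(y, y) / (y @ s)
```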

14 Reduced-Hessian Methods Significance of p_k ∈ G_k: No need to minimize the quadratic model over the full space; search directions lie in an expanding sequence of subspaces. Significance of H_k z ∈ G_k and H_k w = σw: The curvature stored in H_k along any unit vector in G_k^⊥ is σ; all nontrivial curvature information in H_k can be stored in a smaller r_k × r_k matrix, where r_k = dim(G_k). UC San Diego Center for Computational Mathematics 10/45

16 Reduced-Hessian Methods Given a matrix B_k ∈ R^{n×r_k} whose columns span G_k, let B_k = Z_k T_k be the QR decomposition of B_k; W_k be a matrix whose orthonormal columns span G_k^⊥; Q_k = ( Z_k  W_k ). Then H_k p_k = −g_k is equivalent to (Q_k^T H_k Q_k) Q_k^T p_k = −Q_k^T g_k, where Q_k^T H_k Q_k = [ Z_k^T H_k Z_k  Z_k^T H_k W_k ; W_k^T H_k Z_k  W_k^T H_k W_k ] = [ Z_k^T H_k Z_k  0 ; 0  σ I_{n−r_k} ] and Q_k^T g_k = ( Z_k^T g_k ; 0 ). UC San Diego Center for Computational Mathematics 11/45

19 Reduced-Hessian Methods A reduced-Hessian (RH) method obtains p_k from p_k = Z_k q_k, where q_k solves Z_k^T H_k Z_k q_k = −Z_k^T g_k (RH step), which is equivalent to (QN step). In practice, we use a Cholesky factorization R_k^T R_k = Z_k^T H_k Z_k. The new gradient g_{k+1} is accepted iff ‖(I − Z_k Z_k^T) g_{k+1}‖ > ɛ. Store and update Z_k, R_k, Z_k^T p_k, Z_k^T g_k, and Z_k^T g_{k+1}. UC San Diego Center for Computational Mathematics 12/45
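A minimal sketch of one RH step, assuming for illustration that a dense H_k is available (the actual method never forms H_k and instead updates R_k directly); the acceptance tolerance eps and the toy problem data are assumptions.

```python
import numpy as np

def rh_step(H, gradients, g, eps=1e-8):
    """Reduced-Hessian step: expand the basis if g has a component outside span(Z),
    then solve Z^T H Z q = -Z^T g via a Cholesky factorization R^T R = Z^T H Z."""
    Z, _ = np.linalg.qr(np.column_stack(gradients))
    if np.linalg.norm(g - Z @ (Z.T @ g)) > eps:      # gradient-acceptance test
        gradients = gradients + [g]
        Z, _ = np.linalg.qr(np.column_stack(gradients))
    R = np.linalg.cholesky(Z.T @ H @ Z).T            # upper triangular
    q = np.linalg.solve(R, np.linalg.solve(R.T, -Z.T @ g))
    return Z @ q, gradients

# Toy usage on a quadratic with H_0 = sigma I (illustrative).
n, sigma = 5, 2.0
A = np.diag(np.arange(1.0, n + 1)); b = np.ones(n)
x = np.zeros(n)
H = sigma * np.eye(n)
grads = [A @ x - b]
p, grads = rh_step(H, grads, A @ x - b)
print("p_k =", p)
```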

21 H_k = Q_k (Q_k^T H_k Q_k) Q_k^T = ( Z_k  W_k ) [ Z_k^T H_k Z_k  0 ; 0  σ I_{n−r_k} ] ( Z_k  W_k )^T = Z_k (Z_k^T H_k Z_k) Z_k^T + σ(I − Z_k Z_k^T). Hence any z such that Z_k^T z = 0 satisfies H_k z = σz. UC San Diego Center for Computational Mathematics 13/45

22 Reduced-Hessian Method Variants Reinitialization: If g_{k+1} ∉ G_k, the Cholesky factor R_k is updated as R_k → R_{k+1} = [ R_k  0 ; 0  σ_{k+1}^{1/2} ], where σ_{k+1} is based on the latest estimate of the curvature, e.g., σ_{k+1} = y_k^T s_k / s_k^T s_k. Lingering: Restrict the search direction to a smaller subspace and allow the subspace to expand only when f is suitably minimized on that subspace. UC San Diego Center for Computational Mathematics 14/45
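A small sketch of the reinitialization step, under the assumption that the new curvature estimate enters the Cholesky factor as a diagonal entry σ_{k+1}^{1/2} (so that R_{k+1}^T R_{k+1} stores curvature σ_{k+1} along the new direction); the data below is illustrative.

```python
import numpy as np

def reinitialize(R, s, y):
    """Expand the upper-triangular factor R when a new gradient enters the basis,
    using sigma = y^T s / s^T s as the curvature estimate for the new direction."""
    sigma = float(y @ s) / float(s @ s)
    r = R.shape[0]
    R_new = np.zeros((r + 1, r + 1))
    R_new[:r, :r] = R
    R_new[r, r] = np.sqrt(sigma)      # so R_new^T R_new has sigma in the new diagonal entry
    return R_new

R = np.linalg.cholesky(np.array([[4.0, 1.0], [1.0, 3.0]])).T   # an existing reduced factor
s = np.array([0.5, -0.2, 0.1])                                  # latest step
y = np.array([1.0, -0.3, 0.4])                                  # latest gradient change
print(reinitialize(R, s, y))
```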

24 Reduced-Hessian Method Variants Limited-memory: Instead of storing the full approximate Hessian, keep information from only the last m steps (m ≪ n). Key differences: Form Z_k from search directions instead of the gradients. Must store T_k from B_k = Z_k T_k to update most quantities. Drop columns from B_k when necessary. UC San Diego Center for Computational Mathematics 15/45

26 Main idea: Maintain B_k = Z_k T_k, where B_k is the n × m matrix B_k = ( p_{k−m+1}  p_{k−m+2}  ···  p_{k−1}  p_k ) with m ≪ n (e.g., m = 5). B_k = Z_k T_k is not available until p_k has been computed. However, if Z_k is a basis for span(p_{k−m+1}, ..., p_{k−1}, p_k), then it is also a basis for span(p_{k−m+1}, ..., p_{k−1}, g_k), i.e., only the triangular factor associated with the (common) basis for each of these subspaces will be different. UC San Diego Center for Computational Mathematics 16/45

28 At the start of iteration k, suppose we have the factors of B_k^{(g)} = ( p_{k−m+1}  p_{k−m+2}  ···  p_{k−1}  g_k ). Compute p_k. Swap p_k with g_k to give the factors of B_k = ( p_{k−m+1}  p_{k−m+2}  ···  p_{k−1}  p_k ). Compute x_{k+1}, g_{k+1}, etc., using a Wolfe line search. Update and reinitialize the factor R_k. If g_{k+1} is accepted, compute the factors of B_{k+1}^{(g)} = ( p_{k−m+2}  ···  p_k  g_{k+1} ). UC San Diego Center for Computational Mathematics 17/45
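The "swap" can be illustrated numerically: if B^{(g)} = Z T and the new direction satisfies p = Z q (it lies in the span of the current basis), then replacing the last column g_k by p_k leaves Z unchanged and only alters the last column of the triangular factor. A small sketch of this observation, with random data standing in for the stored directions (all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 4
P = rng.standard_normal((n, m - 1))          # previous search directions
g = rng.standard_normal(n)                   # current gradient
Bg = np.column_stack([P, g])                 # B_k^{(g)} = (p_{k-m+1} ... p_{k-1} g_k)
Z, T = np.linalg.qr(Bg)

q = rng.standard_normal(m)                   # reduced step; in practice q solves R^T R q = -Z^T g
p = Z @ q                                    # p_k lies in span(Z) by construction
B = np.column_stack([P, p])                  # B_k = (p_{k-m+1} ... p_{k-1} p_k)

T_swapped = T.copy()
T_swapped[:, -1] = q                         # only the last column of the triangular factor changes
print("||Z T_swapped - B_k|| =", np.linalg.norm(Z @ T_swapped - B))
```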

34 Summary of key differences: Form Z_k from search directions instead of gradients. Must store T_k from B_k = Z_k T_k to update most quantities. Can store B_k instead of Z_k. Drop columns from B_k when necessary. Half the storage of conventional limited-memory approaches. Retains the finite-termination property for quadratic f (both with and without reinitialization). UC San Diego Center for Computational Mathematics 18/45

36 Bound-Constrained Optimization UC San Diego Center for Computational Mathematics 19/45

37 Bound Constraints Given l, u ∈ R^n, solve: minimize_{x ∈ R^n} f(x) subject to l ≤ x ≤ u. Focus on line-search methods that use the BFGS method: Projected-gradient [Byrd, Lu, Nocedal, Zhu (1995)]; Projected-search [Bertsekas (1982)]. These methods are designed to move on and off constraints rapidly and identify the active set after a finite number of iterations. UC San Diego Center for Computational Mathematics 20/45

38 Projected-Gradient Methods UC San Diego Center for Computational Mathematics 21/45

39 Definitions A(x) = {i : x_i = l_i or x_i = u_i}. Define the projection P(x) componentwise, where [P(x)]_i = l_i if x_i < l_i, u_i if x_i > u_i, and x_i otherwise. Given an iterate x_k, define the piecewise-linear paths x_{g_k}(α) = P(x_k − α g_k) and x_{p_k}(α) = P(x_k + α p_k). UC San Diego Center for Computational Mathematics 22/45
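A minimal sketch of the projection P and the projected paths (the bounds and data are illustrative):

```python
import numpy as np

def project(x, l, u):
    """Componentwise projection P(x) onto the box [l, u]."""
    return np.minimum(np.maximum(x, l), u)

def projected_path(x, d, l, u, alpha):
    """Piecewise-linear path P(x + alpha*d); use d = -g for x_g(alpha), d = p for x_p(alpha)."""
    return project(x + alpha * d, l, u)

l, u = np.zeros(3), np.ones(3)
x = np.array([0.2, 0.9, 0.5])
g = np.array([1.0, -2.0, 0.3])
for alpha in (0.0, 0.25, 0.5, 1.0):
    print(alpha, projected_path(x, -g, l, u, alpha))   # points on the projected-gradient path
```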

40 Algorithm L-BFGS-B Given x_k and q_k(x), a typical iteration of L-BFGS-B looks like: [Figure: iterates on the feasible box, built up over several frames] Move along the projected path x_{g_k}(α). Find x_k^c, the first point that minimizes q_k(x) along x_{g_k}(α). Find the minimizer of q_k(x) with x_i fixed for every i ∈ A(x_k^c). Project that minimizer onto the feasible region. Wolfe line search along p_k with α_max = 1 to ensure feasibility. UC San Diego Center for Computational Mathematics 23/45

45 Projected-Search Methods UC San Diego Center for Computational Mathematics 24/45

46 Definitions A(x) = {i : x_i = l_i or x_i = u_i}, W(x) = {i : (x_i = l_i and g_i > 0) or (x_i = u_i and g_i < 0)}. Given a point x and a vector p, the projected vector of p at x, P_x(p), is defined componentwise, where [P_x(p)]_i = 0 if i ∈ W(x), and p_i otherwise. Given a subspace S, the projected subspace of S at x is defined as P_x(S) = {P_x(p) : p ∈ S}. UC San Diego Center for Computational Mathematics 25/45
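A short sketch of these definitions (the bounds, gradient, and tolerance are illustrative):

```python
import numpy as np

def working_set(x, g, l, u, tol=0.0):
    """Indices i with (x_i = l_i and g_i > 0) or (x_i = u_i and g_i < 0)."""
    at_lower = np.abs(x - l) <= tol
    at_upper = np.abs(u - x) <= tol
    return (at_lower & (g > 0)) | (at_upper & (g < 0))

def project_vector(p, W):
    """Projected vector P_x(p): zero the components in the working set W(x)."""
    q = p.copy()
    q[W] = 0.0
    return q

l, u = np.zeros(4), np.ones(4)
x = np.array([0.0, 1.0, 0.3, 1.0])
g = np.array([2.0, -1.0, 0.5, 3.0])
W = working_set(x, g, l, u)
print(W, project_vector(-g, W))
```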

48 Projected-Search Methods A typical projected-search method updates x_k as follows: Compute W(x_k). Calculate p_k as the solution of min_{p ∈ P_{x_k}(R^n)} q_k(x_k + p). Obtain x_{k+1} from an Armijo-like line search on ψ(α) = f(x_{p_k}(α)). UC San Diego Center for Computational Mathematics 26/45
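A minimal backtracking sketch of an Armijo-like search on ψ(α) = f(x_{p_k}(α)); the sufficient-decrease test used here (decrease measured against g^T(x(α) − x), a common choice for projected searches) is a simplified assumption, not necessarily the precise condition used in the talk.

```python
import numpy as np

def project(x, l, u):
    return np.minimum(np.maximum(x, l), u)

def projected_armijo_search(f, g, x, p, l, u, eta_A=1e-4, shrink=0.5, max_iter=30):
    """Backtrack on psi(alpha) = f(P(x + alpha*p)):
    accept alpha if f(x(alpha)) <= f(x) + eta_A * g^T (x(alpha) - x)."""
    fx = f(x)
    alpha = 1.0
    for _ in range(max_iter):
        xa = project(x + alpha * p, l, u)
        if f(xa) <= fx + eta_A * float(g @ (xa - x)):
            return alpha, xa
        alpha *= shrink
    return alpha, project(x + alpha * p, l, u)

# Toy convex quadratic on the unit box (illustrative).
A = np.diag([1.0, 4.0, 9.0]); b = np.array([2.0, 2.0, 2.0])
f = lambda x: 0.5 * float(x @ (A @ x)) - float(b @ x)
l, u = np.zeros(3), np.ones(3)
x = np.array([0.5, 0.5, 0.5])
g = A @ x - b
alpha, x_new = projected_armijo_search(f, g, x, -g, l, u)
print(alpha, x_new)
```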

49 Projected-Search Methods Given x_k and f(x), a typical iteration looks like: [Figure: the direction p_k from x_k, with the path bent where P(x_k + α p_k) meets the bounds] Armijo-like line search along P(x_k + α p_k). UC San Diego Center for Computational Mathematics 27/45

52 Quasi-Wolfe Line Search UC San Diego Center for Computational Mathematics 28/45

53 Can't use the Wolfe conditions: ψ(α) is a continuous, piecewise-differentiable function with cusps where x_{p_k}(α) changes direction. As ψ(α) = f(x_{p_k}(α)) is only piecewise differentiable, it is not possible to know when an interval contains a Wolfe step. UC San Diego Center for Computational Mathematics 29/45

54 Definition Let ψ′_−(α) and ψ′_+(α) denote the left- and right-derivatives of ψ(α). Define a [strong] quasi-Wolfe step to be an Armijo-like step that satisfies at least one of the following conditions: |ψ′_−(α)| ≤ η_W |ψ′_+(0)|; |ψ′_+(α)| ≤ η_W |ψ′_+(0)|; ψ′_−(α) ≤ 0 ≤ ψ′_+(α). UC San Diego Center for Computational Mathematics 30/45
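A sketch of how the one-sided derivatives can be evaluated and the quasi-Wolfe test applied; ψ′_± are computed here by the chain rule, freezing the components of the path that are held at a bound just after/before α. This is an illustrative implementation with assumed parameter values, not the line search from the talk.

```python
import numpy as np

def path_one_sided_derivs(x, p, l, u, alpha, tol=1e-12):
    """One-sided derivatives of each component of the path x(alpha) = clip(x + alpha*p, l, u)."""
    t = x + alpha * p
    interior = (t > l + tol) & (t < u - tol)
    at_l = np.abs(t - l) <= tol
    at_u = np.abs(t - u) <= tol
    d_plus = np.where(interior | (at_l & (p > 0)) | (at_u & (p < 0)), p, 0.0)
    d_minus = np.where(interior | (at_l & (p < 0)) | (at_u & (p > 0)), p, 0.0)
    return d_minus, d_plus

def quasi_wolfe(f, grad, x, p, l, u, alpha, eta_A=1e-4, eta_W=0.9):
    """Armijo-like decrease plus at least one of the three strong quasi-Wolfe conditions."""
    xa = np.minimum(np.maximum(x + alpha * p, l), u)
    g0_plus = float(grad(x) @ path_one_sided_derivs(x, p, l, u, 0.0)[1])   # psi'_+(0)
    d_minus, d_plus = path_one_sided_derivs(x, p, l, u, alpha)
    ga = grad(xa)
    psi_minus, psi_plus = float(ga @ d_minus), float(ga @ d_plus)
    armijo = f(xa) <= f(x) + eta_A * alpha * g0_plus
    curv = (abs(psi_minus) <= eta_W * abs(g0_plus)
            or abs(psi_plus) <= eta_W * abs(g0_plus)
            or (psi_minus <= 0.0 <= psi_plus))
    return armijo and curv

# Toy example on the unit box (illustrative).
A = np.diag([2.0, 5.0]); b = np.array([3.0, -1.0])
f = lambda z: 0.5 * float(z @ (A @ z)) - float(b @ z)
grad = lambda z: A @ z - b
l, u = np.zeros(2), np.ones(2)
x = np.array([0.5, 0.5])
p = -grad(x)
print([quasi_wolfe(f, grad, x, p, l, u, a) for a in (0.05, 0.2, 0.5)])
```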

58 Theory Analogous conditions for the existence of a Wolfe step imply the existence of a quasi-Wolfe step. Quasi-Wolfe line searches are ideal for projected-search methods. If ψ is differentiable, the quasi-Wolfe and Wolfe conditions are identical. In rare cases, H_k cannot be updated after a quasi-Wolfe step. UC San Diego Center for Computational Mathematics 31/45

59 Reduced-Hessian Methods for Bound-Constrained Optimization UC San Diego Center for Computational Mathematics 32/45

60 Motivation Projected-search methods typically calculate p_k as the solution to min_{p ∈ P_{x_k}(R^n)} q_k(x_k + p). If N_k has orthonormal columns that span P_{x_k}(R^n): p_k = N_k q_k, where q_k solves (N_k^T H_k N_k) q_k = −N_k^T g_k. An RH method for bound-constrained optimization (RH-B) solves p_k = Z_k q_k, where q_k solves (Z_k^T H_k Z_k) q_k = −Z_k^T g_k, and the orthonormal columns of Z_k span P_{x_k}(G_k). UC San Diego Center for Computational Mathematics 33/45
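A minimal sketch of forming a basis for P_{x_k}(G_k) and solving the reduced system, assuming a dense H_k for illustration (the actual method works with the stored factors); rank detection via the diagonal of R is a simplification, and all data below is hypothetical.

```python
import numpy as np

def projected_subspace_basis(B, free):
    """Orthonormal basis for P_x(span(B)): zero the working-set rows of B, then re-orthogonalize."""
    PB = B.copy()
    PB[~free, :] = 0.0                       # apply P_x to each column
    Q, R = np.linalg.qr(PB)
    keep = np.abs(np.diag(R)) > 1e-10        # drop directions annihilated by the projection
    return Q[:, keep]

def rhb_step(H, B, g, free):
    """Solve (Z^T H Z) q = -Z^T g with Z spanning the projected subspace; return p = Z q."""
    Z = projected_subspace_basis(B, free)
    q = np.linalg.solve(Z.T @ H @ Z, -Z.T @ g)
    return Z @ q

# Illustrative data: 5 variables, the last one held by the working set.
n = 5
rng = np.random.default_rng(2)
H = 2.0 * np.eye(n)
B = rng.standard_normal((n, 3))              # columns spanning G_k
g = rng.standard_normal(n)
free = np.array([True, True, True, True, False])
p = rhb_step(H, B, g, free)
print(p)                                     # note p[-1] == 0: fixed variables do not move
```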

62 Details Builds on the limited-memory RH method framework, plus: Update values dependent on B_k when W_k ≠ W_{k−1}. Use projected-search trappings (working set, projections, ...). Use a line search compatible with projected-search methods. Update H_k with curvature in a direction restricted to range(Z_{k+1}). UC San Diego Center for Computational Mathematics 34/45

63 Cost of Changing the Working Set An RH-B method can be explicit or implicit: an explicit method stores and uses Z_k, and an implicit method stores and uses B_k. When W_k ≠ W_{k−1}, all quantities dependent on B_k are updated: Dropping n_d indices: 21 m² n_d flops if implicit. Adding n_a indices: 24 m² n_a flops if implicit. Using an explicit method: +6 m n (n_a + n_d) flops to update Z_k. UC San Diego Center for Computational Mathematics 35/45

64 Some Numerical Results UC San Diego Center for Computational Mathematics 36/45

65 Results LRH-B (v1.0) and LBFGS-B (v3.0). LRH-B implemented in Fortran 2003; LBFGS-B in Fortran. Problems from the CUTEst test set, with n between 1 and 192,627. Termination: ‖P_{x_k}(g_k)‖ < 10⁻⁵, or 300K iterations, or 1000 cpu seconds. UC San Diego Center for Computational Mathematics 37/45

66 Implementation Algorithm LRH-B (v1.0): Limited-memory: restricted to m preceding search directions. Reinitialization: included if n > min(6, m). Lingering: excluded. Method type: implicit. Updates: B_k, T_k, R_k updated for all W_k ≠ W_{k−1}. UC San Diego Center for Computational Mathematics 38/45

67 Default parameters [Figure: performance profiles comparing LBFGS-B (m = 5) and LRH-B (m = 5); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] Name / m / Failed: LBFGS-B / 5 / 78; LRH-B / 5 / —. UC San Diego Center for Computational Mathematics 39/45

68 [Figure: performance profiles comparing LBFGS-B (m = 5) and LRH-B (m = 5); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] Name / m / Failed: LBFGS-B / 5 / 78; LRH-B / 5 / 50. UC San Diego Center for Computational Mathematics 40/45

69 [Figure: performance profiles comparing LBFGS-B (m = 10) and LRH-B (m = 10); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] Name / m / Failed: LBFGS-B / 10 / —; LRH-B / 10 / —. UC San Diego Center for Computational Mathematics 41/45

70 [Figure: performance profiles comparing LBFGS-B (m = 5, 10) and LRH-B (m = 5, 10); left panel: function evaluations for 374 UC/BC problems; right panel: times for 374 UC/BC problems] UC San Diego Center for Computational Mathematics 42/45

71 Outstanding Issues Projected-search methods: When to update/factor? Better to refactor when there are lots of changes to A. Implement plane rotations via level-two BLAS. How complex should we make the quasi-Wolfe search? UC San Diego Center for Computational Mathematics 43/45

72 Thank you! UC San Diego Center for Computational Mathematics 44/45

73 References D. P. Bertsekas, Projected Newton methods for optimization problems with simple constraints, SIAM J. Control Optim., 20, 1982. R. Byrd, P. Lu, J. Nocedal and C. Zhu, A limited-memory algorithm for bound constrained optimization, SIAM J. Sci. Comput., 16, 1995. M. C. Fenelon, Preconditioned conjugate-gradient-type methods for large-scale unconstrained optimization, Ph.D. thesis, Department of Operations Research, Stanford University, Stanford, CA, 1981. M. W. Ferry, P. E. Gill and E. Wong, A limited-memory reduced Hessian method for bound-constrained optimization, Report CCoM 18-01, Center for Computational Mathematics, University of California, San Diego, 2018 (to appear). P. E. Gill and M. W. Leonard, Reduced-Hessian quasi-Newton methods for unconstrained optimization, SIAM J. Optim., 12. P. E. Gill and M. W. Leonard, Limited-memory reduced-Hessian methods for unconstrained optimization, SIAM J. Optim., 14. J. Morales and J. Nocedal, Remark on Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound constrained optimization, ACM Trans. Math. Softw., 38(1):7:1–7:4. D. Siegel, Modifying the BFGS update by a new column scaling technique, Math. Program., 66:45–78. UC San Diego Center for Computational Mathematics 45/45
