Division of Scientific Computing
Department of Information Technology
Uppsala University

Optimization -- Written Examination, 202-2-20
Time: 14:00-19:00
Allowed tools: pocket calculator, one A4 paper with notes (machine written, font size minimum 10 pt)
Maximum number of points: 36 (18 points to pass)
All answers must be motivated to get full points.

1. Consider the linear program:

   max f(x) = 9x_1 + 7x_2
   s.t. -x_1 + 3x_2 ≤ 6                                          (1)
        3x_1 + 2x_2 ≤ 21
        x_1, x_2 ≥ 0.

a) Rewrite the program into standard form. [0.5 pt]

Solution

   min g(x) = -f(x) = -9x_1 - 7x_2
   s.t. -x_1 + 3x_2 + x_3 = 6                                    (2)
        3x_1 + 2x_2 + x_4 = 21
        x_1, x_2, x_3, x_4 ≥ 0.

b) Solve the problem by using the simplex method. Use the slack variables as initial basic variables and employ Bland's rule to determine the entering and leaving variables. Give the solution for the optimizer x* and the value of the objective function at the optimizer, f(x*). Note: state clearly what your basic and non-basic variables are in every step. [4 pt]

Solution

1.) First iteration

   x_B = (x_3, x_4)^T,  x_N = (x_1, x_2)^T,
   B = [[1, 0], [0, 1]],  N = [[-1, 3], [3, 2]],
   c_B = (0, 0)^T,  c_N = (-9, -7)^T,
   b̂ = B⁻¹b = (6, 21)^T,
   ŷ^T = c_B^T B⁻¹ = (0, 0),
   ĉ_r^T = c_N^T - ŷ^T N = (-9, -7).

The basis is not optimal. The entering variable is x_1 (the smallest index i with ĉ_{r,i} < 0). On the other hand,

   Â_1 = B⁻¹A_1 = [[1, 0], [0, 1]] (-1, 3)^T = (-1, 3)^T,

and then the leaving variable is x_4, since only Â_{1,2} > 0.

2.) Second iteration

   x_B = (x_3, x_1)^T,  x_N = (x_4, x_2)^T,
   B = [[1, -1], [0, 3]],  N = [[0, 3], [1, 2]],  B⁻¹ = [[1, 1/3], [0, 1/3]],
   c_B = (0, -9)^T,  c_N = (0, -7)^T,
   b̂ = B⁻¹b = (13, 7)^T,
   ŷ^T = c_B^T B⁻¹ = (0, -3),
   ĉ_r^T = c_N^T - ŷ^T N = (3, -1).

The basis is not optimal, since not all components of ĉ_r are nonnegative. The negative component corresponds to x_2, which is therefore the entering variable. On the other hand,

   Â_2 = B⁻¹A_2 = [[1, 1/3], [0, 1/3]] (3, 2)^T = (11/3, 2/3)^T,

and

   min_i { b̂_i / Â_{2,i} : Â_{2,i} > 0 } = min { 13/(11/3), 7/(2/3) } = min { 39/11, 21/2 } = 39/11,

and thus the leaving variable is x_3.

3.) Third iteration

   x_B = (x_1, x_2)^T,  x_N = (x_4, x_3)^T,
   B = [[-1, 3], [3, 2]],  N = [[0, 1], [1, 0]],  B⁻¹ = (1/11)[[-2, 3], [3, 1]],
   c_B = (-9, -7)^T,  c_N = (0, 0)^T,
   b̂ = B⁻¹b = (51/11, 39/11)^T,
   ŷ^T = c_B^T B⁻¹ = (-3/11, -34/11),
   ĉ_r^T = c_N^T - ŷ^T N = (34/11, 3/11).

The basis is now optimal, since ĉ_r > 0. Stop here. The optimizer is

   x* = b̂ = B⁻¹b = (51/11, 39/11)^T,

and the value of the objective function is f(x*) = 732/11 (that is, g(x*) = -732/11 with the problem written in standard form).

c) Write down the dual of the problem in (1) and give the solution of the dual problem (the optimizer y* and also the value of the dual objective). [1.5 pt]

Solution

The explicit expression of the dual problem is not unique (in the same way that the primal problem can also be expressed in different forms). For example, if we directly calculate the dual from the primal expressed as in (1), we obtain (see p. 4 of the slides for Chapter 6, "Duality and Sensitivity", for an identical example)

   Primal:                        Dual:
   max f(x) = 9x_1 + 7x_2         min w(y) = 6y_1 + 21y_2
   s.t. -x_1 + 3x_2 ≤ 6           s.t. -y_1 + 3y_2 ≥ 9            (3)
        3x_1 + 2x_2 ≤ 21               3y_1 + 2y_2 ≥ 7
        x_1, x_2 ≥ 0.                  y_1, y_2 ≥ 0.

If, on the other hand, we decide to first write problem (1) in canonical form, the primal and the dual problems can be equivalently expressed as (see p. 2 of the slides for Chapter 6, "Duality and Sensitivity")

   Primal:                              Dual:
   min g(x) = -f(x) = -9x_1 - 7x_2      max v(y) = -w(y) = -6y_1 - 21y_2
   s.t. x_1 - 3x_2 ≥ -6                 s.t. y_1 - 3y_2 ≤ -9       (4)
        -3x_1 - 2x_2 ≥ -21                   -3y_1 - 2y_2 ≤ -7
        x_1, x_2 ≥ 0.                        y_1, y_2 ≥ 0.

Another alternative is expressing the primal in standard form, as in (2). Then the dual problem is

   Primal:                              Dual:
   min g(x) = -f(x) = -9x_1 - 7x_2      max w(y) = 6y_1 + 21y_2
   s.t. -x_1 + 3x_2 + x_3 = 6           s.t. -y_1 + 3y_2 ≤ -9      (5)
        3x_1 + 2x_2 + x_4 = 21               3y_1 + 2y_2 ≤ -7
        x_1, x_2, x_3, x_4 ≥ 0.              y_1 ≤ 0, y_2 ≤ 0.

In this last formulation, as the primal constraints are equalities, the variables y_1 and y_2 would in principle be unrestricted, but the constraints y_1 ≤ 0 and y_2 ≤ 0 (coming from the nonnegativity of the primal slack variables x_3 and x_4) are more restrictive. Note also that the dual in (5) is, of course, equivalent to the dual in (4) simply by the change of variables y_1 → -y_1, y_2 → -y_2.

From all the previous equivalent formulations, we have to use the last one (with the primal expressed in standard form), as the theoretical results of the weak and strong duality theory are based on the primal problem being written in that way (see p. 2 of the slides for Chapter 6, "Duality and Sensitivity", and pp. 79-80 of the book by Griva, Nash and Sofer (2009)). Then, the optimizer is just the simplex multiplier from the last iteration, i.e.

   y* = ŷ = (-3/11, -34/11)^T,

and the objective value of the dual written as in (5) is the same -732/11 as for the primal problem in standard form, by the strong duality theorem. (Equivalently, for the dual (3) of the original maximization problem, y* = (3/11, 34/11)^T with w(y*) = 732/11.)

2. The time evolution of a cooling process in a harmonic quantum system can be described by

   n(t) = e^{-Wt}(1 - s) + s.

Here W is the effective cooling rate and s the steady state of the system. Based on the measured values for the average quantum number n in the table, the optimal values for W and s can be found.

   t     n
   0.5   0.5817
   2     0.1739
   6     0.1005

a) Formulate the problem as a non-linear least-squares problem. Write down the objective function f(x), the gradient ∇f(x), the Jacobian matrix F'(x), and the expression for the Hessian matrix (without putting in the numbers). [2 pt]

Solution

The problem is formulated in the common form for nonlinear least-squares problems:

   min_x f(x) = (1/2) Σ_{i=1}^3 f_i(x)² = (1/2) Σ_{i=1}^3 [n(t_i, x) - n_i]² = (1/2) F(x)^T F(x),

where x = (W, s)^T and F(x) = (f_1(x), f_2(x), f_3(x))^T. The residuals f_i(x) are defined as:

   f_i(x) = e^{-W t_i}(1 - s) + s - n_i.
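As a quick numerical sanity check (not part of the required exam answer), the residual vector can be evaluated directly. This sketch assumes the measured data t = (0.5, 2, 6), n = (0.5817, 0.1739, 0.1005) from the table and the initial guess W = 1.46, s = 0.09 used in part (b):

```python
import math

# Measured cooling data (t_i, n_i) and the model n(t) = exp(-W t)(1 - s) + s
t_data = [0.5, 2.0, 6.0]
n_data = [0.5817, 0.1739, 0.1005]

def residuals(W, s):
    """Residuals f_i(x) = n(t_i; W, s) - n_i of the least-squares problem."""
    return [math.exp(-W * t) * (1.0 - s) + s - n for t, n in zip(t_data, n_data)]

# Initial guess from part (b)
F0 = residuals(1.46, 0.09)
print([round(f, 5) for f in F0])  # [-0.05316, -0.03482, -0.01036]
```

The small deviations from the 4-digit hand calculation in part (b) come from intermediate rounding there.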
The partial derivatives with respect to the parameters are:

   ∂f_i/∂W = -t_i e^{-W t_i}(1 - s),   ∂f_i/∂s = -e^{-W t_i} + 1.

By using the Jacobian

   F'(x) = [[∂f_1/∂W, ∂f_1/∂s],
            [∂f_2/∂W, ∂f_2/∂s],
            [∂f_3/∂W, ∂f_3/∂s]],

the derivative of the objective function can be expressed in a more compact way as:

   ∇f(x) = F'(x)^T F(x).

The resulting Hessian matrix for the problem is, then:

   ∇²f(x) = F'(x)^T F'(x) + Σ_{i=1}^3 f_i(x) ∇²f_i(x).

b) Perform one iteration of the Gauss-Newton method with a step length of one. Start from the initial guess W = 1.46, s = 0.09 (use 4 digits in the calculations). Check if the Armijo condition for a relaxation parameter µ = 0.1 is fulfilled. Give the value of the objective function. [5 pt]

Solution

We calculate all the components first. The vector of residuals is:

   F(x^(0)) = (-0.05320, -0.03480, -0.01036)^T.

The value of the objective function at x^(0) is

   f(x^(0)) = (1/2) F(x^(0))^T F(x^(0)) = 0.0020740.

The Jacobian evaluated at x^(0) is:

   F'(x^(0)) = [[-0.2193,    0.5181],
                [-0.09816,   0.9461],
                [-0.0008566, 0.9998]].

The gradient evaluated at x^(0) is:

   ∇f(x^(0)) = F'(x^(0))^T F(x^(0)) = (0.01509, -0.07083)^T.
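The Jacobian and gradient entries can be reproduced with a short script. This is an illustrative check under the same data and initial guess; small deviations from the 4-digit hand calculation are expected:

```python
import math

# Same data and initial guess as in the text
t_data = [0.5, 2.0, 6.0]
n_data = [0.5817, 0.1739, 0.1005]
W, s = 1.46, 0.09

# Residual vector F(x)
F = [math.exp(-W * t) * (1.0 - s) + s - n for t, n in zip(t_data, n_data)]

# Analytic Jacobian: row i holds (df_i/dW, df_i/ds)
J = [[-t * math.exp(-W * t) * (1.0 - s), 1.0 - math.exp(-W * t)] for t in t_data]

# Gradient: grad f = F'(x)^T F(x)
grad = [sum(J[i][k] * F[i] for i in range(3)) for k in range(2)]
print([round(g, 5) for g in grad])  # close to (0.01509, -0.07083)
```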
The approximated Hessian, according to the Gauss-Newton algorithm, is

   ∇²f(x^(0)) ≈ H(x^(0)) = F'(x^(0))^T F'(x^(0)) = [[0.05772, -0.2073], [-0.2073, 2.1631]].

Now, having all the necessary components, we can calculate the (quasi-Newton) search direction

   p = -H(x^(0))⁻¹ ∇f(x^(0)) = -[F'(x^(0))^T F'(x^(0))]⁻¹ F'(x^(0))^T F(x^(0)) = (-0.2193, 0.01173)^T,

and obtain a new point x^(1) as

   x^(1) = x^(0) + p = (1.46, 0.09)^T + (-0.2193, 0.01173)^T = (1.2407, 0.1017)^T,

with a function value f(x^(1)) = 1.060·10⁻⁵.

Using the Armijo condition means making a relaxed linear prediction of the change of the function value:

   f(x^(1)) ≤ f(x^(0)) + µ p^T ∇f(x^(0)) = 1.660·10⁻³.

We see that the condition is fulfilled, so we have a sufficient decrease in the step.

3. Consider the following problem

   min f(x) = (1/4)x_1² + (1/8)x_2².

a) Write the algorithm for the steepest-descent method without line search. [0.5 pt]
b) Solve the problem using the algorithm in (3a).
   i. Perform the first five iterations of the method starting from x_0 = (1, 1)^T. Give a 6-digit approximation to the value of the norm of the gradient at each iteration. Write the results in a table like the one below. [1 pt]

      k   x   p   ‖∇f(x)‖
      0   .

   ii. What can you say about the convergence rate of the iterates? (Do not compute numbers here; just give an intuitive interpretation.) [1 pt]
   iii. Plot the iterates on the x_1-x_2 plane, together with a sketch of some contour lines of f(x). [1 pt]
c) Perform again the first iterate as in (3b), but using an exact line search. Discuss the optimum value obtained for α. [1.5 pt]
Solution

a) The algorithm for the steepest-descent method without line search is simply:

   Input: select an initial value for the variable x_0; k = 0
   repeat
      Step 1: if x_k is optimal, stop
      Step 2: determine the search direction p_k = -∇f(x_k) and update the
              iterate x_{k+1} = x_k + p_k; k = k + 1
   until convergence

b) The gradient of the objective function is

   ∇f(x) = ((1/2)x_1, (1/4)x_2)^T.

   i. So, the first iterate is easily computed as

      x_1 = x_0 + p_0 = x_0 - ∇f(x_0) = (1, 1)^T - (1/2, 1/4)^T = (1/2, 3/4)^T.

   The rest of the iterates are given in the table below.

      k   x                  p                    ‖∇f(x)‖
      0   (1, 1)             (-1/2, -1/4)         0.559017
      1   (1/2, 3/4)         (-1/4, -3/16)        0.312500
      2   (1/4, 9/16)        (-1/8, -9/64)        0.188150
      3   (1/8, 27/64)       (-1/16, -27/256)     0.122596
      4   (1/16, 81/256)     (-1/32, -81/1024)    0.085051
      5   (1/32, 243/1024)   (-1/64, -243/4096)   0.061349

   ii. The convergence rate is linear, which is typical of the steepest-descent method. (The exact calculation gives a convergence rate r = 1 with constant C = 3/4.)

   iii. The plot with the iterates and some contour lines is given below. [Figure: iterates on the x_1-x_2 plane with elliptical contour lines of f(x); not reproduced here.]
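The gradient norms in the table can be verified with a few lines of code; a minimal sketch of the method from (3a):

```python
# Steepest descent without line search for f(x) = x1^2/4 + x2^2/8,
# reproducing the gradient norms in the iteration table.
def grad(x):
    return (x[0] / 2.0, x[1] / 4.0)

x = (1.0, 1.0)
norms = []
for k in range(6):
    g = grad(x)
    norms.append((g[0] ** 2 + g[1] ** 2) ** 0.5)
    # fixed unit step: x_{k+1} = x_k - grad f(x_k)
    x = (x[0] - g[0], x[1] - g[1])

print([round(n, 6) for n in norms])
```

Note how the second component shrinks by 3/4 per step while the first shrinks by 1/2, so the slower factor 3/4 eventually dominates the (linear) convergence.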
c) We now perform the first iterate again, but using an exact line search. To obtain the optimum step length α_opt, we use the exact formula

   α = -∇f(x_0)^T p_0 / (p_0^T Q p_0),                           (6)

where Q is the Hessian matrix of the quadratic function (1/2)x^T Q x - c^T x. In this case,

   Q = ∇²f(x) = [[1/2, 0], [0, 1/4]],

so the computation is straightforward and α = 20/9. Should we take this whole step, the new point would be at (-1/9, 4/9) (red line on the plot), with a gradient norm equal to 0.1242, so we would be improving the performance of the algorithm. However, had we used a backtracking strategy, we would have started from the initial value α = 1, which corresponds to the first step already calculated in (3b).

Note: In general, when the objective function is not quadratic, (6) cannot be used, and we should solve the following one-dimensional unconstrained optimization problem to obtain α:

   min_α φ(α) = f(x_0 + α p_0),

with x_0 = (1, 1)^T and p_0 = (-1/2, -1/4)^T, as previously calculated. In this case, the expression for φ(α) is simple,

   φ(α) = (1/4)(1 - α/2)² + (1/8)(1 - α/4)²,

so the minimum is easily calculated at α = 20/9.
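The exact-line-search step can likewise be checked numerically; a small sketch of formula (6) for this quadratic:

```python
# Exact line search from x0 = (1, 1) for the quadratic
# f(x) = x1^2/4 + x2^2/8, whose Hessian is Q = diag(1/2, 1/4).
x0 = (1.0, 1.0)
g0 = (x0[0] / 2.0, x0[1] / 4.0)             # grad f(x0) = (1/2, 1/4)
p0 = (-g0[0], -g0[1])                        # steepest-descent direction

num = -(g0[0] * p0[0] + g0[1] * p0[1])       # -grad^T p = 5/16
den = 0.5 * p0[0] ** 2 + 0.25 * p0[1] ** 2   # p^T Q p = 9/64
alpha = num / den                            # = 20/9

x_new = (x0[0] + alpha * p0[0], x0[1] + alpha * p0[1])
print(alpha, x_new)  # 20/9 and the point (-1/9, 4/9)
```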
4. Consider an inequality-constrained problem with two constraints

   min f(x)
   s.t. g_1(x) = x_1² + x_2² + x_3² - 3 ≥ 0
        g_2(x) = 2x_1 - 4x_2 + x_3² + 1 ≥ 0.

a) State the necessary optimality conditions for this kind of problem, without using the explicit expressions for g_1 and g_2. [0.5 pt]

Consider now the points x_a = (1, 1, 1)^T and x_b = (3, 2, -1)^T.

b) Which of them (if any) is feasible? [0.5 pt]
c) Which of them (if any) is regular? [1 pt]
d) Compute a null-space matrix Z(x) (as stated in 4a) for each point. [1.5 pt]
e) What is the range of admissible values for the Lagrange multipliers if we want the necessary conditions to be fulfilled at x_a? And at x_b? Can degeneracy happen in any of these cases? [1.5 pt]

Solution

a) The necessary optimality conditions for inequality-constrained problems (the KKT conditions) are stated on p. 506 of the book by Griva, Nash and Sofer (2009).

Theorem (Necessary Conditions, Inequality Constraints). Let x* be a local minimum point of f subject to the constraints g(x) ≥ 0. Let the columns of Z(x*) form a basis for the null space of the Jacobian of the active constraints at x*. If x* is a regular point for the constraints, then there exists a vector of Lagrange multipliers λ* such that

   ∇_x L(x*; λ*) = 0,  or equivalently  Z(x*)^T ∇f(x*) = 0,
   λ* ≥ 0,
   (λ*)^T g(x*) = 0,

and Z(x*)^T ∇²_{xx} L(x*; λ*) Z(x*) is positive semidefinite.

b) x_a is feasible, since it satisfies both constraints:

   g_1(x_a) = 0 (active),  g_2(x_a) = 0 (active).

x_b is also feasible, since it also satisfies both constraints:

   g_1(x_b) = 11 > 0 (inactive),  g_2(x_b) = 0 (active).

c) To check for regularity, we need the Jacobian matrix of the active constraints only. The general expression of the Jacobian is

   ∇g(x)^T = [[2x_1, 2x_2, 2x_3],
              [2,    -4,   2x_3]].
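The feasibility and activity claims in part (b) are easy to confirm with a short script (an illustrative check, not part of the required answer):

```python
# Feasibility and activity of the two constraints at the candidate points.
def g1(x):
    return x[0] ** 2 + x[1] ** 2 + x[2] ** 2 - 3

def g2(x):
    return 2 * x[0] - 4 * x[1] + x[2] ** 2 + 1

xa = (1, 1, 1)
xb = (3, 2, -1)
print(g1(xa), g2(xa))  # 0 0   -> both constraints active at xa
print(g1(xb), g2(xb))  # 11 0  -> only g2 active at xb
```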
At x_a both constraints are active, so the Jacobian evaluated at x_a is

   ∇g(x_a)^T = [[2, 2, 2],
                [2, -4, 2]].

The rows of ∇g(x_a)^T (i.e. the columns of ∇g(x_a)) are linearly independent, so x_a is a regular point. Regarding x_b, only the second constraint is active, so we only need to check the corresponding row of the Jacobian, i.e.

   ∇g_2(x_b)^T = (2, -4, -2),

which is a nonzero vector; thus x_b is also a regular point.

d) Z is a basis for the null space of the Jacobian of the active constraints at x. If we use the variable reduction method, the matrices B and N for ∇g(x_a) will be e.g.

   B = [[2, 2], [2, -4]],  N = [[2], [2]],  B⁻¹ = (1/12)[[4, 2], [2, -2]],

   Z(x_a) = [-B⁻¹N; I] = (-1, 0, 1)^T.

Regarding ∇g(x_b), the matrices B, N and Z will be

   B = (2),  N = (-4, -2),  B⁻¹N = (-2, -1),

   Z(x_b) = [-B⁻¹N; I] = [[2, 1], [1, 0], [0, 1]].

e) As the two constraints are active at x_a, the Lagrange multipliers must be nonnegative (λ_1, λ_2 ≥ 0). Degeneracy happens if λ_1 and/or λ_2 are equal to zero. For x_b only the second constraint is active, so it must hold that λ_1 = 0 and λ_2 ≥ 0. Degeneracy arises if λ_2 is equal to zero.

5. Solve the following problem analytically, in two ways:

   min f(x) = x_1² + x_1 x_2 + x_2² - 2x_2
   s.t. x_1 + x_2 = 2

a) With the necessary conditions.
   i. What is x*? [0.5 pt]
   ii. What is λ*? [0.5 pt]
   iii. Is x* a strict local minimizer? [0.5 pt]
   iv. Is x* also a global minimizer? [0.5 pt]
Note: Solving the problem by simply expressing one variable in terms of the other from the constraint, plugging it into the objective function and minimizing this (now one-dimensional) function will NOT be evaluated.

b) Either with a logarithmic barrier function or with a quadratic penalty function. Motivate your choice.
   i. What is x(µ) (or x(ρ), depending on your choice)? [0.5 pt]
   ii. What is λ(µ) (or λ(ρ), depending on your choice)? [0.5 pt]
   iii. What is x*? [0.5 pt]
   iv. What is λ*? [0.5 pt]
c) Compute the Hessian matrix B of the logarithmic barrier function for µ = 10⁻⁴ (or of the quadratic penalty function for ρ = 10⁴, depending on your choice). What is the condition number of B? What is B⁻¹? [2 pt]
   Help: For a 2×2 nonsingular symmetric matrix Q, the condition number is cond(Q) = |λ_max| / |λ_min|, where λ_min, λ_max are, respectively, the smallest and largest (in moduli) eigenvalues of Q.

Solution

This is a linear equality-constrained problem with one single constraint. The gradient and the Hessian of the objective function are

   ∇f(x) = (2x_1 + x_2, x_1 + 2x_2 - 2)^T,  ∇²f(x) = [[2, 1], [1, 2]].

The constraint matrix is A = (1, 1), so we can choose (from the variable reduction method) a basis matrix for the null space of the constraint,

   Z = (-1, 1)^T.

a) If we now impose the first-order optimality conditions, we get

   ∇f(x*) = A^T λ*:  (2x_1 + x_2, x_1 + 2x_2 - 2)^T = (1, 1)^T λ*,

together with the constraint x_1 + x_2 - 2 = 0. We solve the gradient condition for x_1 and x_2 in terms of λ, and plug x_1(λ) and x_2(λ) into the constraint to obtain the value λ* = 2, which, in turn, yields x_1 = 0 and x_2 = 2. Hence
   i. x* = (0, 2)^T.
   ii. λ* = 2.
   iii. Yes, x* is a strict local minimizer, since the reduced Hessian at the solution,

        Z^T ∇²f(x*) Z = (-1, 1) [[2, 1], [1, 2]] (-1, 1)^T = 2 > 0,

        satisfies the second-order sufficient condition.
   iv. x* is also a global minimizer, since the objective function is convex and the feasible set is convex.

b) We solve the problem using a quadratic penalty function, because the problem has only equality constraints. With the penalty term

   ψ(x) = (1/2) Σ_{i=1}^m g_i(x)²,

the original constrained problem is transformed into the unconstrained problem

   min π_ρ(x) = f(x) + ρψ(x) = x_1² + x_1 x_2 + x_2² - 2x_2 + (ρ/2)(x_1 + x_2 - 2)².

We now impose the first-order optimality conditions on π_ρ(x):

   ∂π_ρ/∂x_1 = 2x_1 + x_2 + ρ(x_1 + x_2 - 2) = 0,
   ∂π_ρ/∂x_2 = x_1 + 2x_2 - 2 + ρ(x_1 + x_2 - 2) = 0.

Subtracting both equations, we get x_1 = x_2 - 2, and plugging this back into either of them we obtain:

   i. The solution of the unconstrained problem, expressed in terms of ρ, is

        x(ρ) = (x_1(ρ), x_2(ρ)) = (-2/(2ρ+3), (4ρ+4)/(2ρ+3)).

   ii. The estimate of the Lagrange multiplier for the constraint is

        λ(ρ) = -ρ g(x(ρ)) = -ρ(x_1 + x_2 - 2) = ρ · 4/(2ρ+3) = 4ρ/(2ρ+3).

   iii. In the limit ρ → ∞, x* = (0, 2).
   iv. In the limit ρ → ∞, λ* = 2.
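The closed-form expressions for x(ρ) and λ(ρ) can be verified numerically; an illustrative sketch (not part of the required answer):

```python
# Closed-form penalty minimizer x(rho) and multiplier estimate lambda(rho)
# for min x1^2 + x1*x2 + x2^2 - 2*x2  s.t.  x1 + x2 = 2.
def x_of_rho(rho):
    return (-2.0 / (2 * rho + 3), (4.0 * rho + 4.0) / (2 * rho + 3))

def lam_of_rho(rho):
    x1, x2 = x_of_rho(rho)
    return -rho * (x1 + x2 - 2.0)  # lambda(rho) = -rho * g(x(rho))

rho = 1e4
x1, x2 = x_of_rho(rho)
# Stationarity of pi_rho at x(rho): both partial derivatives should vanish
d1 = 2 * x1 + x2 + rho * (x1 + x2 - 2.0)
d2 = x1 + 2 * x2 - 2.0 + rho * (x1 + x2 - 2.0)
print(x1, x2, lam_of_rho(rho))
```

For large ρ the iterates approach x* = (0, 2) and λ* = 2, as stated in (iii) and (iv).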
c) We now need to compute the Hessian matrix of the quadratic penalty function π_ρ(x), obtaining

   B ≡ ∇²π_ρ(x) = [[2+ρ, 1+ρ], [1+ρ, 2+ρ]] = [[10002, 10001], [10001, 10002]]  for ρ = 10⁴.

The eigenvalues of this matrix are σ = {1, 2ρ+3} = {1, 20003}, so the condition number is

   cond(B) = 2ρ+3 = 20003.

Finally, the inverse of B is

   B⁻¹ = (1/(2ρ+3)) [[ρ+2, -(ρ+1)], [-(ρ+1), ρ+2]] = [[0.500025, -0.499975], [-0.499975, 0.500025]]  for ρ = 10⁴,

which is very close to being singular.

The graphical representation of the problem is given below. [Figure: contour lines of f(x) together with the constraint x_1 + x_2 = 2 and the solution x* = (0, 2); not reproduced here.]

6. A cardboard box for packing quantities of small foam balls is to be manufactured as shown in the figure. The top, bottom, and front faces must be of double weight (i.e., two pieces of cardboard). We want to find the dimensions of such a box that maximize the volume for a given amount of cardboard, equal to 72 m².
[Figure: a box with edges x_1 (width of the front face), x_2 (depth) and x_3 (height); the faces labelled Top, Front and Bottom are of double weight.]

a) Write the optimization problem. State clearly all the constraints (if any). [0.5 pt]
b) By geometric reasoning, the problem is guaranteed to have a solution x* = (x_1*, x_2*, x_3*) such that x_1*, x_2*, x_3* are strictly positive. Does this fact change your formulation of the optimization problem? Motivate. [0.5 pt]
c) Through a suitable change of variables, it is possible to transform the problem in (6b) into a linear equality-constrained problem. Formulate one such change of variables and the corresponding optimization problem. State again clearly all the constraints (if any), making use of the assumption in (6b). Hint: for f(x) > 0, minimizing √(f(x)) is equivalent to minimizing f(x). [1 pt]
d) State the first-order necessary conditions for the problem in (6c). [1 pt]
e) Find x* = (x_1*, x_2*, x_3*). [3 pt]
f) Verify the second-order sufficient condition for x*. [1 pt]

Solution

a) The problem can be formulated as

   max f(x) = x_1 x_2 x_3
   s.t. 4x_1 x_2 + 3x_1 x_3 + 2x_2 x_3 = 72                      (7)
        x_1, x_2, x_3 ≥ 0,

because we need double weight on the front face (which has an area equal to x_1 x_3) and on the top and bottom faces, each of which has an area of x_1 x_2.

b) As we know that the constraints x_1, x_2, x_3 ≥ 0 will not be active at the solution, we can remove them, and thus express problem (7) as

   max f(x) = x_1 x_2 x_3
   s.t. 4x_1 x_2 + 3x_1 x_3 + 2x_2 x_3 = 72.                     (8)
Problem (8) can be equivalently expressed as a minimization problem,

   min f(x) = -x_1 x_2 x_3
   s.t. 4x_1 x_2 + 3x_1 x_3 + 2x_2 x_3 - 72 = 0.                 (9)

c) The gradient and the Hessian for problem (9) are

   ∇f(x) = -(x_2 x_3, x_1 x_3, x_1 x_2)^T,
   ∇²f(x) = -[[0, x_3, x_2], [x_3, 0, x_1], [x_2, x_1, 0]].      (10)

The nonlinear constraint is

   g(x) ≡ 4x_1 x_2 + 3x_1 x_3 + 2x_2 x_3 - 72 = 0,               (11)

with Jacobian

   ∇g(x) = (4x_2 + 3x_3, 4x_1 + 2x_3, 3x_1 + 2x_2)^T.            (12)

However, we can write problem (9) in a different (and simpler) way by making use of the following change of variables:

   y_1 = x_1 x_2,  y_2 = x_1 x_3,  y_3 = x_2 x_3.                (13)

Of course, the point y* = (y_1*, y_2*, y_3*) corresponding to the original solution x* = (x_1*, x_2*, x_3*) will also be guaranteed to be strictly positive at the solution. Hence, problem (9) now reads

   min f(y) = -√(y_1 y_2 y_3)
   s.t. 4y_1 + 3y_2 + 2y_3 - 72 = 0.

Since the square root is a monotonically increasing function, we can equivalently minimize the radicand y_1 y_2 y_3, so we finally face the following linear equality-constrained optimization problem:

   min f(y) = -y_1 y_2 y_3
   s.t. 4y_1 + 3y_2 + 2y_3 - 72 = 0.                             (14)

The gradient and the Hessian of the objective function in problem (14) are

   ∇f(y) = -(y_2 y_3, y_1 y_3, y_1 y_2)^T,
   ∇²f(y) = -[[0, y_3, y_2], [y_3, 0, y_1], [y_2, y_1, 0]].

We have only one linear equality constraint,

   g(y) ≡ 4y_1 + 3y_2 + 2y_3 - 72 = 0,                           (15)
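A variable-reduction null-space matrix for this linear constraint, together with the stationary point y* = (6, 8, 12) derived in part (e), can be checked numerically; an illustrative sketch:

```python
# Null-space matrix from variable reduction for A = (4, 3, 2): with B = (4)
# and N = (3, 2), Z = [-B^{-1} N; I].
A = (4.0, 3.0, 2.0)
Z = [(-3.0 / 4.0, -1.0 / 2.0), (1.0, 0.0), (0.0, 1.0)]

# A Z should be the zero row vector
AZ = [sum(A[i] * Z[i][j] for i in range(3)) for j in range(2)]

# Reduced gradient Z^T grad f(y) at the stationary point y* = (6, 8, 12)
y = (6.0, 8.0, 12.0)
grad = (-y[1] * y[2], -y[0] * y[2], -y[0] * y[1])  # -(y2 y3, y1 y3, y1 y2)
reduced = [sum(Z[i][j] * grad[i] for i in range(3)) for j in range(2)]

print(AZ, reduced)  # both should be [0.0, 0.0]
```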
so the constraint matrix is A = (4, 3, 2), and a choice for Z (computed through the variable reduction method) would be

   Z = [[-3/4, -1/2],
        [1,    0],
        [0,    1]].

d) We can now move to the first-order optimality condition,

   ∇f(y*) = A^T λ*:  -(y_2 y_3, y_1 y_3, y_1 y_2)^T = (4, 3, 2)^T λ*,    (16)

which has to be solved together with the feasibility requirement (15) in order to obtain the stationary points of problem (14).

e) If we solve each equation of the gradient condition (16) for λ, we obtain

   λ = -y_2 y_3/4,  λ = -y_1 y_3/3,  λ = -y_1 y_2/2.             (17)

Comparing the right-hand sides of the first and second relationships in (17) yields

   y_2 y_3/4 = y_1 y_3/3  ⟹  y_2 = (4/3)y_1                      (18)

(here we can divide by y_3, as y_1, y_2, y_3 are guaranteed to be strictly positive). Similarly, comparing the right-hand sides of the first and third relationships in (17), we get

   y_2 y_3/4 = y_1 y_2/2  ⟹  y_3 = 2y_1.                         (19)

We can now plug (18) and (19) into the constraint (15) and solve for y_1:

   4y_1 + 3(4/3)y_1 + 2(2y_1) - 72 = 0  ⟹  12y_1 = 72  ⟹  y_1 = 6.   (20)

Replacing y_1 in (17), (18) and (19) yields y_2 = 8 and y_3 = 12, so we obtain the solution

   y* = (6, 8, 12),  λ* = -24.

The solution in terms of the original variables is obtained from the relationships (13),

   x_1 x_2 = 6,  x_1 x_3 = 8,  x_2 x_3 = 12,

dividing the second by the third equation (to solve for x_2 in terms of x_1: x_2 = (3/2)x_1) and the first by the third equation (to solve for x_3 in terms of x_1: x_3 = 2x_1), and plugging the result into the original constraint (11):

   4x_1(3/2)x_1 + 3x_1(2x_1) + 2(3/2)x_1(2x_1) - 72 = 0  ⟹  18x_1² = 72  ⟹  x_1² = 4  ⟹  x_1 = 2,

and, from here, x_2 = 3 and x_3 = 4, so the solution is x* = (2, 3, 4).

We now have to check that this point is actually a stationary point of the original problem (9). To do so, we check the first-order optimality condition for problem (9), and we observe that the gradient ∇f(x*) and the Jacobian ∇g(x*) are indeed parallel at the solution:

   ∇f(x*) = ∇g(x*) λ*:  -(12, 8, 6)^T = (24, 16, 12)^T λ*,

obtaining a Lagrange multiplier λ* = -1/2.

f) To check whether x* = (2, 3, 4) is actually a strict local minimizer, we now move to the second-order sufficient optimality condition for the original problem (9), which requires the reduced Hessian Z^T ∇²f(x*) Z to be positive definite at x*. Here, Z is a null-space matrix of the Jacobian ∇g(x*) evaluated at the solution, i.e.

   ∇g(x*) = (24, 16, 12)^T  ⟹  Z = [[-2/3, -1/2], [1, 0], [0, 1]].

With this, it is straightforward to check that the reduced Hessian

   Z^T ∇²f(x*) Z = [[-2/3, 1, 0], [-1/2, 0, 1]] · (-[[0, 4, 3], [4, 0, 2], [3, 2, 0]]) · [[-2/3, -1/2], [1, 0], [0, 1]] = [[16/3, 2], [2, 3]]

is positive definite, since its leading principal minors are 16/3 > 0 and det = (16/3)·3 - 2·2 = 12 > 0, so the point x* = (2, 3, 4) is a strict local minimizer of problem (9) and, in consequence, a strict local maximizer of problem (7).

Good luck!
Javier & Markus