Norwegian University of Science and Technology
Department of Mathematical Sciences

Page 1 of 11

Contact during exam: Anne Kværnø: 966384

Exam in TMA4180 Optimization Theory
Wednesday May 9. Time: 09:00–13:00.

Auxiliary materials: Simple calculator (Hewlett Packard HP30S or Citizen SR-270X), Rottmann: Matematisk formelsamling.

With preliminary solutions (not proofread).

Problem 1

Let $f(x) = \frac12 x^2 + x\cos y$, where $x = (x, y)$.

a) Find the gradient and the Hessian of $f$.

b) Find all minima of $f$.

For the remaining parts of this problem we discuss one step of a line search method, starting from $x_0 = (1, \pi/4)$, with search direction $p_0 = (-1, 0)$.

c) Confirm that $p_0$ is a descent direction from $x_0$.

d) State the Wolfe conditions. What is the purpose of these conditions? If $c_1 = 0.1$ and $c_2 = 0.8$, what are the admissible values for the step length $\alpha$?

e) Do one step of the line search method, using the exact value for the step length.

Solution:
a) Gradient: $\nabla f(x) = (x + \cos y,\; -x\sin y)^T$,

Hessian: $\nabla^2 f(x) = \begin{pmatrix} 1 & -\sin y \\ -\sin y & -x\cos y \end{pmatrix}$.

b) There are two sets of solutions of the first order condition $\nabla f(x) = 0$:

1. The first solution is $x = 0$, $y = (n + 1/2)\pi$ for $n = 0, \pm 1, \pm 2, \ldots$. In this case the Hessian is
$\nabla^2 f(x) = \begin{pmatrix} 1 & (-1)^{n+1} \\ (-1)^{n+1} & 0 \end{pmatrix}$
with eigenvalues $\lambda = \frac12(1 \pm \sqrt{5})$, one of which is negative, so these are not minima (they are saddle points).

2. The second solution is $x = (-1)^{n+1}$, $y = n\pi$, for $n = 0, \pm 1, \pm 2, \ldots$. The Hessian in these points is simply the identity matrix, which obviously is SPD, so these are strict local minima. Further
$f\big((-1)^{n+1}, n\pi\big) = \tfrac12 + (-1)^{n+1}\cos n\pi = \tfrac12 - 1 = -\tfrac12$,
so we can conclude that these are also global minima.

c) The direction $p_0$ is a descent direction if $\nabla f(x_0)^T p_0 < 0$. In our case this is
$\big(1 + \sqrt{2}/2,\; -\sqrt{2}/2\big)\,(-1, 0)^T = -1 - \sqrt{2}/2 < 0$,
so $p_0$ is a descent direction.

d) The Wolfe conditions are (in general)
$f(x_k + \alpha_k p_k) \le f(x_k) + c_1 \alpha_k \nabla f(x_k)^T p_k$,
$\nabla f(x_k + \alpha_k p_k)^T p_k \ge c_2 \nabla f(x_k)^T p_k$,
for two parameters $c_1$ and $c_2$ satisfying $0 < c_1 < c_2 < 1$. Choosing step lengths that satisfy the Wolfe conditions ensures sufficient decrease of the objective function from one iteration to the next, which in turn ensures convergence of the line search method; see Theorem 3.2 in N&W for details. This is critical when the one-dimensional minimization problem of a line search step is solved approximately. With the given $f$, $x_0$ and $p_0$, the conditions become
$\alpha \le (1 - c_1)(2 + \sqrt{2})$,
$\alpha \ge (1 - c_2)(1 + \sqrt{2}/2)$,
or, with $c_1 = 0.1$ and $c_2 = 0.8$,
$0.3414 \le \alpha \le 3.073$.
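As a quick numerical cross-check of the admissible interval in d), the sketch below evaluates both Wolfe inequalities for $\varphi(\alpha) = f(x_0 + \alpha p_0)$ just inside and outside the computed endpoints (the helper names are my own, not part of the exam):

```python
import math

c1, c2 = 0.1, 0.8
s = math.sqrt(2) / 2               # cos(pi/4) = sin(pi/4)

def phi(a):
    # phi(a) = f(x0 + a*p0) with x0 = (1, pi/4), p0 = (-1, 0)
    return 0.5 * (1 - a) ** 2 + s * (1 - a)

def dphi(a):
    # phi'(a)
    return -(1 - a) - s

def wolfe(a):
    decrease = phi(a) <= phi(0) + c1 * a * dphi(0)   # sufficient decrease
    curvature = dphi(a) >= c2 * dphi(0)              # curvature condition
    return decrease and curvature

lo = (1 - c2) * (1 + s)                # predicted lower endpoint
hi = (1 - c1) * (2 + math.sqrt(2))     # predicted upper endpoint
print(round(lo, 4), round(hi, 4))                       # 0.3414 3.0728
print(wolfe(lo - 1e-3), wolfe(1.0), wolfe(hi + 1e-3))   # False True False
```

Both inequalities hold strictly inside $[0.3414, 3.073]$ and fail just outside it, confirming the interval.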
e) One step with the line search method means to solve the following one-dimensional problem:
$\min_{\alpha > 0} f(x_0 + \alpha p_0) = \min_{\alpha > 0} \left\{ \tfrac12 (1 - \alpha)^2 + \tfrac{\sqrt{2}}{2}(1 - \alpha) \right\}$
with solution $\alpha_0 = 1 + \sqrt{2}/2$. The new iterate is then
$x_1 = x_0 + \alpha_0 p_0 = \left( -\tfrac{\sqrt{2}}{2},\; \tfrac{\pi}{4} \right)^T.$
The value of the objective function is then $f(x_1) = -\tfrac14$.
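A short check of the exact step, assuming the reconstructed values above:

```python
import math

s = math.sqrt(2) / 2

def f(x, y):
    # objective from Problem 1
    return 0.5 * x ** 2 + x * math.cos(y)

alpha = 1 + s                  # exact minimizer of f(x0 + alpha*p0)
x1 = (1 - alpha, math.pi / 4)  # new iterate x1 = x0 + alpha*p0
print(abs(x1[0] + s) < 1e-12, abs(f(*x1) + 0.25) < 1e-12)  # True True
```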
Problem 2

Given the linear problem
$\min\; x_1 - 3x_2$
subject to
$-x_1 + 2x_2 \le 6$
$x_1 + x_2 \le 5$
$x_1, x_2 \ge 0$

a) Make a sketch of the feasible domain, and solve the problem graphically.

b) Write the problem in standard form.

c) Write up the dual of this problem. State the relations between the primal and the dual problem (the duality theorem).

d) Explain the idea of the simplex method for linear problems. Perform one step of the method on the given problem, starting from $(x_1, x_2) = (0, 0)$.

Solution:

a) The minimum is at the solution of
$-x_1 + 2x_2 = 6$, $x_1 + x_2 = 5$,
that is, $x_1 = 4/3$ and $x_2 = 11/3$. (Sketch of the feasible domain omitted.)
b) The standard form of an LP problem is
$\min\; c^T x$ subject to $Ax = b$, $x \ge 0$.
In our case, using slack variables $x_3$ and $x_4$, this is
$\min\; x_1 - 3x_2$
subject to
$-x_1 + 2x_2 + x_3 = 6$
$x_1 + x_2 + x_4 = 5$
$x_1, x_2, x_3, x_4 \ge 0$
so
$A = \begin{pmatrix} -1 & 2 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{pmatrix}$, $b = (6, 5)^T$, $c^T = (1, -3, 0, 0)$.

c) The dual problem is
$\max\; b^T\lambda$ subject to $A^T\lambda \le c$,
where $\lambda$ is the vector of Lagrange multipliers for the equality constraints. In our case, this is:
$\max\; 6\lambda_1 + 5\lambda_2$
subject to
$-\lambda_1 + \lambda_2 \le 1$
$2\lambda_1 + \lambda_2 \le -3$
$\lambda_1 \le 0$, $\lambda_2 \le 0$.
For the relation between the primal and the dual problem, see Theorem 13.1 in N&W.

d) The rough idea of the simplex method is the following: The feasible domain $\Omega$ of an LP problem is a polytope. Start from a vertex (a basic feasible point) of this polytope. Move from vertex to vertex along edges, reducing the objective function $c^T x$. If a minimum exists, it is attained at a vertex, and will eventually be found.

How this can be done mathematically is better explained with the example. In our case we start with $x_1 = x_2 = 0$. Since the point has to be feasible, $Ax = b$ has to be satisfied, so $x_3 = 6$ and $x_4 = 5$. This fulfills the definition of a basic feasible point (p. 363 in N&W), with $\mathcal{B} = \{3, 4\}$. Do the corresponding splitting of the system, that is,
$x = [0, 0, 6, 5]^T = [v^T, x_B^T]^T$, $A = [N, B]$, $c^T = [c_N^T, c_B^T]$.
From the definition of a basic feasible point, $B$ has to be invertible. In this case, $B$ is simply the identity matrix. The idea is now to search for an $x(t)$
$\in \Omega$ such that $x(0) = [0^T, x_B^T]^T$ and $c^T x(t)$ is reduced as $t$ increases. Let $x(t) = [v(t)^T, y(t)^T]^T$. We get
$Ax(t) = Nv(t) + By(t) = b \;\Rightarrow\; y(t) = B^{-1}\big(b - Nv(t)\big) = x_B - B^{-1}Nv(t)$,
$c^T x(t) = c_N^T v(t) + c_B^T y(t) = c_B^T x_B + \big(c_N^T - c_B^T B^{-1}N\big)v(t)$.
In our case, the vector of reduced costs is $c_N^T - c_B^T B^{-1}N = [1, -3]$. So, by choosing $v(t) = [0, t]^T$, $c^T x(t)$ will decrease with increasing value of $t$. But the last two components are $y(t) = [6, 5]^T - [2, 1]^T t$. They should both be non-negative, so we can only increase $t$ up to $t = 3$, in which case $y(3) = [0, 2]^T$, $v(3) = [0, 3]^T$ and the new point is $x = [0, 3, 0, 2]^T$. This is what was expected from the figure in point a).
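The vertex reached by the simplex step, and the optimal vertex from a), can be cross-checked by brute-force enumeration of the vertices of the feasible domain (a sketch, not part of the original solution):

```python
from itertools import combinations

# Feasible domain written as a_i . x <= b_i:
#   -x1 + 2x2 <= 6,  x1 + x2 <= 5,  -x1 <= 0,  -x2 <= 0
A = [(-1.0, 2.0), (1.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
b = [6.0, 5.0, 0.0, 0.0]
c = (1.0, -3.0)  # minimize x1 - 3*x2

def feasible(x, tol=1e-9):
    return all(ai[0] * x[0] + ai[1] * x[1] <= bi + tol for ai, bi in zip(A, b))

vertices = []
for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
    det = a1[0] * a2[1] - a1[1] * a2[0]
    if abs(det) < 1e-12:
        continue                           # parallel pair, no vertex
    x = ((b1 * a2[1] - a1[1] * b2) / det,  # Cramer's rule for the 2x2 system
         (a1[0] * b2 - b1 * a2[0]) / det)
    if feasible(x):
        vertices.append(x)

best = min(vertices, key=lambda x: c[0] * x[0] + c[1] * x[1])
print(round(best[0], 4), round(best[1], 4))  # 1.3333 3.6667
```

The vertex $(0, 3)$ reached after one simplex step appears among the feasible vertices, and the overall minimizer is $(4/3, 11/3)$ as found graphically.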
Problem 3

a) Define a (strictly) convex function and a convex set.

b) Let the set $\Omega$ be defined by
$\Omega = \{x \in \mathbb{R}^n : c_i(x) \le 0,\; i = 1, \ldots, m\}.$
Show that $\Omega$ is convex if all the functions $c_i$, $i = 1, 2, \ldots, m$, are convex.

Consider the following constrained optimization problem:
$\min\; x_1$
subject to
$x_1 - x_2^2 - x_3^2 \ge 0$
$x_2 + x_3 - 1 \ge 0$

c) Set up the KKT conditions for this problem.

d) Find the KKT point(s). What can you say about the optimality of these points (if they exist)?

Solution:

a) A set $\Omega$ is convex if, for all $x, y \in \Omega$,
$\theta x + (1 - \theta)y \in \Omega$ for all $\theta \in (0, 1)$.
A function $f$ defined on a convex set $\Omega$ is convex if, for all $x, y \in \Omega$,
$f(\theta x + (1 - \theta)y) \le \theta f(x) + (1 - \theta)f(y)$ for all $\theta \in (0, 1)$.
It is strictly convex if the inequality above is strict whenever $x \ne y$.

b) We have
$c_i(\theta x + (1 - \theta)y) \le \theta c_i(x) + (1 - \theta)c_i(y) \le 0$
since $c_i$ is convex, $c_i(x) \le 0$, $c_i(y) \le 0$, and $\theta, 1 - \theta > 0$. This is true for all $i = 1, \ldots, m$, so $\theta x + (1 - \theta)y \in \Omega$ and we can conclude that $\Omega$ is convex.
c) The Lagrangian is
$L(x, \lambda) = x_1 - \lambda_1(x_1 - x_2^2 - x_3^2) - \lambda_2(x_2 + x_3 - 1)$
and the KKT conditions are
$\nabla_x L(x, \lambda) = 0$, (a)
$\lambda_i c_i(x) = 0$, $i = 1, 2$, (b)
$\lambda_i \ge 0$, $i = 1, 2$, (c)
$c_i(x) \ge 0$, $i = 1, 2$, (d)
which become
$1 - \lambda_1 = 0$, (a′)
$2\lambda_1 x_2 - \lambda_2 = 0$,
$2\lambda_1 x_3 - \lambda_2 = 0$,
$\lambda_1(x_1 - x_2^2 - x_3^2) = 0$, (b′)
$\lambda_2(x_2 + x_3 - 1) = 0$,
$\lambda_1 \ge 0$, $\lambda_2 \ge 0$, (c′)
$x_1 - x_2^2 - x_3^2 \ge 0$, $x_2 + x_3 - 1 \ge 0$. (d′)

d) Clearly, $\lambda_1 = 1$, so (c′) is satisfied for this multiplier. Further, from (b′), $x_1 = x_2^2 + x_3^2$. If $\lambda_2 = 0$, then from (a′) $x_2 = x_3 = 0$ and then $x_1 = 0$. But this solution does not satisfy the last inequality of (d′), so we conclude that $\lambda_2 \ne 0$. The last two equations of (a′) together with the last equation of (b′) are
$2x_2 - \lambda_2 = 0$, $2x_3 - \lambda_2 = 0$, $x_2 + x_3 - 1 = 0$,
with the solution $\lambda_2 = 1$ and $x_2 = x_3 = 1/2$. We then have the following solution:
$x_1 = x_2 = x_3 = \tfrac12$, $\lambda_1 = \lambda_2 = 1$. (*)
Finally, we have to verify that the LICQ condition holds, that is, that the $\nabla c_i(x)$ are linearly independent for all active constraints at this point. In our case, both constraints are active, and
$\nabla c_1(x) = (1, -2x_2, -2x_3) = (1, -1, -1)$, $\nabla c_2(x) = (0, 1, 1)$,
which obviously are linearly independent. So (*) is a KKT point. Since the objective function $x_1$ is linear and thus convex, the feasible domain $\Omega$ is convex (since $c_i$, $i = 1, 2$, are both concave, the sets $\{c_i(x) \ge 0\}$ are convex), and we have found a KKT point, we know that this is a global minimum. And since there was only one solution of the KKT conditions, we know that the minimum is unique.
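The KKT system at (*) can be verified directly (a minimal sketch using the constraints as reconstructed above):

```python
# KKT check at x = (1/2, 1/2, 1/2), lambda = (1, 1) for
#   min x1  s.t.  c1(x) = x1 - x2^2 - x3^2 >= 0,  c2(x) = x2 + x3 - 1 >= 0
x1 = x2 = x3 = 0.5
l1 = l2 = 1.0

grad_L = (1 - l1,            # dL/dx1
          2 * l1 * x2 - l2,  # dL/dx2
          2 * l1 * x3 - l2)  # dL/dx3
c1 = x1 - x2 ** 2 - x3 ** 2  # active constraint: equals 0
c2 = x2 + x3 - 1             # active constraint: equals 0
print(grad_L, c1, c2)        # (0.0, 0.0, 0.0) 0.0 0.0
```

The gradient of the Lagrangian vanishes and both constraints are active with non-negative multipliers, so all four KKT conditions hold.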
Problem 4

Let
$F(y) = \int_1^2 \frac{(y'(x))^2}{x}\, dx$
be defined on
$D = \{ y \in C^1[1, 2] \;;\; y(1) = 0,\; y(2) = 3 \}.$

a) Solve the problem $\min_{y \in D} F(y)$.

b) Solve the problem from a), but now with the additional constraint $\int_1^2 y(x)\, dx = 1$.

Solution:

a) First of all, notice that the integrand only depends on $y'$. On $D$ the function $f(x, z) = z^2/x$ is strongly convex in $z$, since
$f(x, z + w) - f(x, z) - f_z(x, z)w = \frac{1}{x}\big( (z + w)^2 - z^2 - 2zw \big) = \frac{w^2}{x} \ge 0$
with equality if and only if $w = 0$. The Euler–Lagrange equation $\frac{d}{dx} f_z[y(x)] = f_y[y(x)]$ becomes
$\frac{d}{dx}\, \frac{2y'(x)}{x} = 0$
so, for some constant $C$,
$\frac{2y'(x)}{x} = C \;\Rightarrow\; y'(x) = \tfrac12 Cx \;\Rightarrow\; y(x) = \tfrac14 Cx^2 + D.$
Inserting the boundary conditions from $D$, $y(1) = 0$ and $y(2) = 3$, gives $C = 4$ and $D = -1$, and the solution becomes
$y(x) = x^2 - 1,$
which is a minimizer, since $F$ is strictly convex on $D$.

b) In this case, we want to minimize
$\tilde F(y) = \int_1^2 \frac{(y'(x))^2}{x}\, dx + \lambda \int_1^2 y(x)\, dx$
over $D$. The Euler–Lagrange equation becomes
$\frac{d}{dx}\, \frac{2y'(x)}{x} = \lambda$
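A quick check that $y(x) = x^2 - 1$ satisfies the boundary conditions and the Euler–Lagrange equation (i.e., that $2y'(x)/x$ is constant on $[1, 2]$):

```python
def y(x):
    # candidate minimizer from part a)
    return x ** 2 - 1

def dy(x):
    # y'(x)
    return 2 * x

print(y(1), y(2))  # boundary values: 0 3
# Euler-Lagrange: 2*y'(x)/x should equal the constant C = 4 on [1, 2]
print(all(abs(2 * dy(x) / x - 4.0) < 1e-12 for x in (1.0, 1.2, 1.5, 2.0)))  # True
```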
with solution
$y(x) = \tfrac16 \lambda x^3 + \tfrac14 C x^2 + D.$
Inserting the boundary conditions gives $C = 4 - \tfrac{14}{9}\lambda$ and $D = -1 + \tfrac{2}{9}\lambda$, so
$y(x) = \tfrac16 \lambda x^3 + \left( 1 - \tfrac{7}{18}\lambda \right) x^2 - 1 + \tfrac{2}{9}\lambda.$
Finally, $\lambda$ is determined from the constraint:
$\int_1^2 y(x)\, dx = \tfrac43 - \tfrac{13}{216}\lambda = 1 \;\Rightarrow\; \lambda = \tfrac{72}{13},$
and the solution becomes
$y(x) = \tfrac{1}{13}\big(12x^3 - 15x^2 + 3\big).$
Since the constraint is linear and thus convex, and $\tilde F$ is strictly convex, $y(x)$ is the minimizer of the constrained problem.
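The final formula can be verified exactly with rational arithmetic (a sketch; the antiderivative `Y` below is my own helper, obtained by integrating $y$ term by term):

```python
from fractions import Fraction as F

def y(x):
    # candidate minimizer y(x) = (12x^3 - 15x^2 + 3)/13
    return (12 * x ** 3 - 15 * x ** 2 + 3) / 13

def Y(x):
    # exact antiderivative of y
    return (3 * x ** 4 - 5 * x ** 3 + 3 * x) / 13

# boundary values y(1) = 0, y(2) = 3, and the integral constraint over [1, 2]
print(y(F(1)), y(F(2)), Y(F(2)) - Y(F(1)))  # 0 3 1
```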
Problem 5

Consider the integral functional
$J(y) = \int_0^1 \left( y(x) + \tfrac12 (y'(x))^2 + y(x)\, y'(x) \right) dx$
for $y \in C^2[0, 1]$.

a) Find the Gâteaux derivative $\delta J(y; v)$ of $J$.

b) Find $y \in C^2[0, 1]$ such that $\delta J(y; v) = 0$ for all $v \in C^2[0, 1]$. Is this a minimizer of $J$? Justify your answer.

Solution:

a) Use
$\delta J(y; v) = \frac{\partial}{\partial\varepsilon} J(y + \varepsilon v) \Big|_{\varepsilon = 0},$
which becomes
$\delta J(y; v) = \int_0^1 \frac{\partial}{\partial\varepsilon} \left( (y + \varepsilon v) + \tfrac12 (y' + \varepsilon v')^2 + (y + \varepsilon v)(y' + \varepsilon v') \right) \Big|_{\varepsilon = 0} dx = \int_0^1 \big( v + y'v' + y'v + yv' \big)\, dx.$
By partial integration, $\int_0^1 y'v'\, dx = [y'v]_0^1 - \int_0^1 y''v\, dx$ and $\int_0^1 yv'\, dx = [yv]_0^1 - \int_0^1 y'v\, dx$, so the Gâteaux derivative can be rewritten as
$\delta J(y; v) = \int_0^1 (1 - y'')v\, dx + \big[ (y + y')v \big]_0^1.$ (1)

b) From (1), $\delta J(y; v) = 0$ for all $v \in C^2[0, 1]$ requires
$1 - y'' = 0$ and $(y + y')(0) = (y + y')(1) = 0.$
The general solution of $y'' = 1$ is $y(x) = \tfrac12 x^2 + ax + b$, and the natural boundary conditions give $a + b = 0$ and $\tfrac32 + 2a + b = 0$, so $a = -\tfrac32$, $b = \tfrac32$:
$y(x) = \tfrac12 \big( x^2 - 3x + 3 \big).$
This is not a minimizer, nor a maximizer. For instance, $y(x) = K$ (some constant) is a function in $C^2[0, 1]$, and for this choice of $y$, $J(y) = K$, which can be made as small or as large as we want. So $J$ does not have a minimizer.
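Both claims in b) can be checked numerically: the stationary function satisfies the natural boundary conditions $(y + y')(0) = (y + y')(1) = 0$, while constant functions give $J(K) = K$, so $J$ is unbounded below (a sketch using my own midpoint-rule quadrature):

```python
def J(yf, dyf, n=10000):
    # midpoint-rule approximation of J(y) on [0, 1]
    h = 1.0 / n
    return sum((yf(x) + 0.5 * dyf(x) ** 2 + yf(x) * dyf(x)) * h
               for i in range(n) for x in [(i + 0.5) * h])

def y(x):
    # stationary function y(x) = (x^2 - 3x + 3)/2
    return (x * x - 3 * x + 3) / 2

def dy(x):
    # y'(x)
    return x - 1.5

print(y(0) + dy(0), y(1) + dy(1))                    # natural boundary terms: 0.0 0.0
print(round(J(lambda x: -100.0, lambda x: 0.0), 6))  # J(K) = K: -100.0
```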