March 5, 202 MATH 408 FINAL EXAM SAMPLE Partial Solutions to Sample Questions (in progress) See the sample questions for the midterm exam, but also consider the following questions. Obviously, a final exam would not contain questions of this depth and scope for each question. That is, the final will contain some easy and some hard questions. The questions listed below are all on the hard side with some very hard.. (a) Is the function f(x, x 2 ) = e (x +x 2 ) 2 coercive? Is it convex? Justify your answers. Solution: It is not coercive, just consider the line {(t, ) t IR}. It is convex since the function φ(u) = e u2 is convex (φ (u) = 2e u2 ( + 2u 2 ) > 0), and f is just the composition of φ with the linear function (x, x 2 ) x + x 2. (b) For what values of the parameters α, β IR is the function f(x, x 2 ) = x 2 + 2αx x 2 + βx 2 2 a convex function. For what values is it coercive? Justify your answers. [ ] 2 2α Solution: det 2 f(x) = det = 4(β α 2α 2β 2 ). Therefore, 2 f(x) is positive definite if and only if β > α 2 and, by continuity, is positive semidefinite if β = α 2 ; otherwise, 2 f(x) is indefinite. Consequently, f is convex if and only if β α 2 and is strictly convex if and only if β > α 2. Finally, a quadratic function is coercive if and only if 2 f(x) is positive definite which in tis case occurs when β > α 2. (c) Compute and classify all critical points of the function f(x, x 2, x 3 ) = 2(ax ) 2 + (x x 2 2) 2 + (ax 2 + x 3 ) 2, where a > 0 is a given positive real number. Solution: Critical points are x 2a/(2a 2 + ) x 2 = x 3 0 0, /a / a, and a /a / a. a By considering 2 f(x) at each of these points we find that the first one is a saddle point and the last two are local solutions. By plugging the last two into f we see that they yield the same function value zero. Since f is everywhere non-negative, these last two points are global solutions. 2. Consider the function f(x) = 2 xt Qx + c T x, where Q IR n n is symmetric and c IR m. (a) Under what condition on the matrix Q IR n n is f convex? Justify your answer.
Solution: There are many ways to analyze this question. Below we establish the convexity condition from first principals. First observe that since α 2 = α α( α) we have (( λ)x + λy) T Q(( λ)x + λy) = ( λ) 2 x T Qx + 2λ( λ)x T Qy + λ 2 y T Qy Therefore, for all 0 λ and x, y IR n, = (( λ) ( λ)λ)x T Qx + 2λ( λ)x T Qy + (λ λ( λ))y T Qy = ( λ)x T Qx + λy T Qy λ( λ) [ x T Qx 2x T Qy + y T Qy ] = ( λ)x T Qx + λy T Qy λ( λ)(x y) T Q(x y). f(( λ)x + λy) = 2 (( λ)x + λy)t Q(( λ)x + λy) + c T (( λ)x + λy) [ ] [ ] = ( λ) 2 xt Qx + c T x + λ 2 yt Qy + c T λ( λ) y (x y) T Q(x y) 2 λ( λ) = ( λ)f(x) + λf(y) (x y) T Q(x y). 2 Consequently, f is convex if and only if (x y) T Q(x y) 0 for all x, y IR n, or equivalently, Q is positive semi-definite. (b) If Q is symmetric and positive definite, show that there is a nonsingular matrix L such that Q = LL T. Solution: This is easily shown either by an eigenvalue decomposition of a Cholesky factorization. (c) With Q and L as defined in the part (b), show that Solution: Just multiply it out: f(x) = 2 LT x + L c 2 2 2 ct Q c. 2 LT x + L c 2 2 2 ct Q c = [ (L T x) T (L T x) + 2(L c) T (L T x) + (L c) T (L c) ] 2 2 ct Q c = [ x T LL T x + 2c T x + c T L T L c ] 2 2 ct Q c = 2 xt Qx + c T x. (d) If f is convex, under what conditions is x IR n f(x) =? Solution: If Q has a negative eigenvalue, then the imum value is obviously so we can assume that Q is positive semi-definite. If Q is positive definite, then f is coercive so the value cannot be, so we can assume that Q has a non-trivial null-space. Since we are assug that Q is positive semi-definite, f is convex, so a necessary and sufficient condition for f to have a global solution is that there exist an x such that 0 = f(x) = Qx + c, or equivalently, c RanQ. Therefore, f = can only possibly occur is c / RanQ. In this case, we may write c = c +c 2 with c RanQ and c 2 (RanQ) = NulQ with c 2 0. Note that f(tc 2 ) = t c 2 2 and letting t sends f to. Putting all of this together we get that f = if and only if either Q has a negative eigenvalue or Q is positive semi-definite and c / NulQ.
(e) Assume that Q is symmetric and positive definite and let S be a subspace of IR n. Show that x solves the problem x S f(x) if and only if f( x) S. Solution: In lecture this was shown to be true for an arbitary convex function. 3. Let A IR m n and b IR m and consider the function (a) Show that Ran(A T A) = Ran(A T ). Solution: Done in class a couple of times. f(x) = 2 Ax b 2 2. (b) Show that for every b IR m there exists a solution to the equation A T Ax = A T b. Solution: Obvious from (i). (c) Show that there always exists a solution to the problem LS and describe the set of solutions to this problem. Solution: Obvious from (ii). x IR n 2 Ax b 2 2, (d) Under what conditions on A is the solution to the problem LS unique. Solution: Nul(A) = {0}. (e) Using the condition from part (iv) derive a formula for the unique solution to LS. Solution: d = (A T A) A T b. (f) If A T A is invertible, show that the matrix P = A(A T A) A T is the orthogonal projection onto Ran(A), that is, show that i. P 2 = P ii. P T = P iii. Ran(P ) = Ran(A). Solution: A and B are obvious. For C, clearly Ran(P ) Ran(A). On the other hand, if y Ran(A) then there is an x such that Ax = y. Consequently, P y = P Ax = A(A T A) A T Ax = Ax = y so Ran(A) Ran(P ). 4. (a) Matrix secant questions. i. Let u, v IR n with u 0 v. Show that the matrix uv T IR n n is rank. Solution: Since Ran(uv T ) = Span[u], uv T ha rank. ii. Let B IR n n and s, y IR n such that y Bs. For what values of u, v IR n is it true that the matrix B + = B + uv T satisfies B + s = y? Justify your answer. Solution: y = B + s = Bs + uv T s so v T s 0 since y Bs. But then u = y Bs v T s. iii. Let H IR n n be symmetric and s, y IR n. Under what conditions do there exist vectors u, v IR n such that the matrix H + = H + uv T is also symmetric and satisfies H + s = y? Justify your answer. Solution: From the previous problem we must have v T s 0 and for symmetry we must have u is a multiple of v. So v = y Hs and we must have (y Hs) T s 0 or equivalently y T s s T Hs.
iv. Let M IR n n be symmetric and positive definite and let s, y IR n be such that s T y > 0. The BFGS applied to M is M + = M + yyt s T y MssT M s T Ms. Show that M + can be rewritten in the form M + = M + UD U T, where U IR n 2 and D IR 2 2 is an invertible diagonal matrix, by deriving formulas for both U and D. Solution: U = [y, Ms] and D = diag(s T y, s T Ms). (b) (This problem is over the top. But parts might be trimmed down for a suitable final exam question.) Let F : IR n IR m be continuously differentiable, and let be any norm on IR m. In this problem we consider the function f(x) = F (x) and properties of the the associated Gauss- Newton direction for imizing f. Recall that x IR n is a first-order stationary point for f if 0 f ( x; d) for all d IR n. i. Given x, d IR n and t > 0 show that ii. Why is it true that F (x + td) F (x) + tf (x)d F (x + td) (F (x) + tf (x)d). Hint: What is the definition of F (x)? iii. Use parts i and ii to show that F (x + td) (F (x) + tf (x)d) lim = 0? t 0 t f (x; d) = lim t 0 F (x) + tf (x)d F (x) t iv. Use part iii and the convexity of the norm to show that f (x; d) F (x) + F (x)d F (x). Hint: F (x) + tf (x)d = ( t)f (x) + t(f (x) + F (x)d) v. Use parts iii and iv to show that 0 f (x; d) for all d if and only if F (x) F (x) + F (x)d for all d. vi. If x is not a first-order stationary point for f and d solves. GN F (x) + F (x)d, d IR n show that d is a descent direction for f at x by showning that vii. Show that the problem f (x; d) F (x) + F (x)d F (x) < 0. d IR n F (x) + F (x)d can be written as a linear program. viii. Suppose at a given point x IR n one solves the problem GN in Part vi above to obtain a direction d for which F (x) + F (x) d < F (x). Show that the following backtracking line search procedure is finitely terating in the sense that the solution t is not zero.
Line Search: Let 0 < c < and 0 < γ < and set t := γ s s.t. s {0,, 2, 3,...} and F (x + γ s d) ( γ s c) F (x) + cγ s F (x) + F (x) d. Hint: ( γ s c) F (x) + cγ s F (x) + F (x) d = F (x) + cγ s [ F (x) + F (x) d F (x) ]. Then use Part vi. 5. (a) Locate all of the KKT points for the following problem. Are these points local solutions? Are they global solutions? imize x 2 + x2 2 4x 4x 2 subject to x 2 x 2 x + x 2 2 (b) Let Q IR n n be symmetric and positive definite, g IR n, b IR m, and A IR m n with Nul(A T ) = {0}. i. Show that the matrix AQ A T is nonsingular. Solution: Since Q is positive definite w T Qw = 0 iff w = 0, so y T AQ A T y = 0 iff A T y = 0 iff y = 0 since Nul(A T ) = {0}. Therefore, AQ A T is symmetric and positive definite and so nonsingular. ii. Show that solves the problem x = Q A T (AQ A T ) b [I Q A T (AQ A T ) A]Q c imize 2 xt Qx + c T x subject to Ax = b Hint: Use the Lagrangian L(x, y) = f(x) + y T (b Ax) and show that the Lagrange multiplier ȳ must satisfy x = Q ( c + A T ȳ). Then multiply this expression through by A and solve for ȳ. Solution: First note that 0 = Nul(A T ) = Ran(A) so Ran(A) = IR m. In particular, this implies that this optimization problem is feasible for possible values of b. The KKT conditions are 0 = L( x, ȳ) = Q x + c A T ȳ, or equivalently, x = Q (A T ȳ c). Plugging this into Ax = b, we get b = A x = AQ (A T ȳ c) = AQ A T ȳ AQ c. Multiplying through by (AQ A T ) and re-arranging gives ȳ = (AQ A T ) (AQ c + b). Plugging this back into x = Q (A T ȳ c) gives the result. (c) Let Q IR n n be symmetric and positive definite, and let c IR n. Consider the optimization problem 0 x 2 xt Qx + c T x. i. What is the Lagrangian function for this problem? ii. Show that the Lagrangian dual is the problem max y c 2 yt Q y = y c. 2 yt Q y.
iii. Show that if x, ȳ IR n satisfy ȳ = Q x, then x solves the primal problem if and only if ȳ solves the dual problem, and the optimal values in the primal and dual coincide. (d) (A harder duality problem) Let Q IR n n be symmetric and positive definite. optimization problem P imize 2 xt Qx + c T x subject to x. i. Show that this problem is equivalent to the problem ˆP imize 2 xt Qx subject to e x e, where e is the vector of all ones. ii. What is the Lagrangian for ˆP? iii. Show that the Lagrangian dual for ˆP is the problem Consider the D max 2 (y c)t Q (y c) y = 2 (y c)t Q (y c) + y. This is also the Lagrangian dual for cp. iv. Show that if x, ȳ IR n satisfy ȳ = Q x + c, then x solves P if and only if ȳ solves D, and the optimal values in P and D coincide.