ECE580 Fall 2015 Solution to Problem Set 3 October 23, 2015

ECE580 Partial Solution to Problem Set 3

These problems are from the textbook by Chong and Zak, 4th edition, which is the textbook for the ECE580 Fall 2015 semester. As such, many of the problem statements are taken verbatim from the text; however, others have been reworded for reasons of efficiency or instruction. Solutions are mine. Any errors are mine and should be reported to me, skoskie@iupui.edu, rather than to the textbook authors.

8.1

Because the calculations are routine, I will not be providing a Matlab script for the calculations involved in solving this problem.

Starting from $x^{(0)} = 0$, perform two iterations of the steepest descent algorithm towards finding the minimizer of

$$f(x_1, x_2) = x_1 + x_2/2 + x_1^2/2 + x_2^2 + 3.$$

Also solve the problem analytically.

Solution: The steepest descent algorithm is

$$x^{(k+1)} = x^{(k)} - \alpha_k \nabla f(x^{(k)}), \tag{1}$$

where

$$\alpha_k = \arg\min_{\alpha > 0} f\left(x^{(k)} - \alpha \nabla f(x^{(k)})\right). \tag{2}$$

First we find

$$\nabla f(x) = \begin{bmatrix} 1 + x_1 & 1/2 + 2x_2 \end{bmatrix}^T.$$

Then we find $\alpha_0$. Let

$$\begin{aligned}
\phi_0(\alpha) &= f\left(x^{(0)} - \alpha \nabla f(x^{(0)})\right) = f\left(0 - \alpha \begin{bmatrix} 1 \\ 1/2 \end{bmatrix}\right) \\
&= 0 - \alpha - (\alpha/2)/2 + (-\alpha)^2/2 + (-\alpha/2)^2 + 3 \\
&= -5\alpha/4 + 3\alpha^2/4 + 3.
\end{aligned}$$

Then solving for the stationary point of $\phi_0$,

$$\frac{d\phi_0}{d\alpha} = -5/4 + 3\alpha/2 = 0$$

yields $\alpha = 5/6$. Since $d^2\phi_0/d\alpha^2 = 3/2 > 0$, the value is a minimum. The first step of the algorithm then yields

$$x^{(1)} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} - \frac{5}{6}\begin{bmatrix} 1 \\ 1/2 \end{bmatrix} = \begin{bmatrix} -5/6 \\ -5/12 \end{bmatrix}$$

and

$$\nabla f(x^{(1)}) = \begin{bmatrix} 1 - 5/6 & 1/2 - 2(5/12) \end{bmatrix}^T = \begin{bmatrix} 1/6 & -1/3 \end{bmatrix}^T.$$
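As a cross-check on the hand calculation for this problem, the following is a minimal Python sketch of two steepest descent iterations in exact rational arithmetic. It is not part of the original solution. It assumes the closed-form line search $\alpha_k = g^T g / (g^T Q g)$, which holds for a quadratic with Hessian $Q$ and gradient $g$; here $Q = \mathrm{diag}(1, 2)$. The names `grad`, `exact_step`, and `steepest_descent` are illustrative only.

```python
from fractions import Fraction as F

# Problem 8.1 written as f(x) = 1/2 x^T Q x - b^T x + 3 with Q = diag(1, 2).
Q_DIAG = (F(1), F(2))
B = (F(-1), F(-1, 2))


def f(x):
    x1, x2 = x
    return x1 + x2 / 2 + x1**2 / 2 + x2**2 + 3


def grad(x):
    # Gradient Q x - b = [1 + x1, 1/2 + 2 x2].
    return tuple(q * xi - bi for q, xi, bi in zip(Q_DIAG, x, B))


def exact_step(g):
    # Assumed closed form of argmin_alpha f(x - alpha g) for a quadratic:
    # alpha = g^T g / (g^T Q g). Requires g != 0.
    num = sum(gi * gi for gi in g)
    den = sum(q * gi * gi for q, gi in zip(Q_DIAG, g))
    return num / den


def steepest_descent(x0, iterations):
    xs, alphas = [tuple(x0)], []
    for _ in range(iterations):
        g = grad(xs[-1])
        alpha = exact_step(g)
        alphas.append(alpha)
        xs.append(tuple(xi - alpha * gi for xi, gi in zip(xs[-1], g)))
    return xs, alphas


xs, alphas = steepest_descent((F(0), F(0)), 2)
for k, x in enumerate(xs):
    print(k, [str(xi) for xi in x], str(f(x)))
print("step sizes:", [str(a) for a in alphas])
```

Using `Fraction` keeps every iterate exact, so the step sizes and iterates can be compared directly with the fractions obtained by hand.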
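The second step then gives us

$$x^{(2)} = x^{(1)} - \alpha_1 \nabla f(x^{(1)}), \tag{3}$$

where

$$\alpha_1 = \arg\min_{\alpha > 0} f\left(x^{(1)} - \alpha \nabla f(x^{(1)})\right). \tag{4}$$

With some help from Matlab, we solve for $\alpha_1$ as follows:

$$\begin{aligned}
\phi_1(\alpha) &= f\left(\begin{bmatrix} -5/6 \\ -5/12 \end{bmatrix} - \alpha \begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix}\right) = f\left(\begin{bmatrix} -5/6 - \alpha/6 \\ -5/12 + \alpha/3 \end{bmatrix}\right) \\
&= (-5/6 - \alpha/6) + (-5/12 + \alpha/3)/2 + (-5/6 - \alpha/6)^2/2 + (-5/12 + \alpha/3)^2 + 3 \\
&= \alpha^2/8 - 5\alpha/36 - 25/48 + 3,
\end{aligned}$$

so

$$\frac{d\phi_1}{d\alpha} = -5/36 + \alpha/4 = 0,$$

which yields $\alpha = 5/9$. Taking the second derivative we find $d^2\phi_1/d\alpha^2 = 1/4 > 0$, so $\alpha = 5/9$ is a minimum. Thus

$$x^{(2)} = x^{(1)} - \alpha_1 \nabla f(x^{(1)}) = \begin{bmatrix} -5/6 \\ -5/12 \end{bmatrix} - \frac{5}{9}\begin{bmatrix} 1/6 \\ -1/3 \end{bmatrix} = \begin{bmatrix} -50/54 \\ -25/108 \end{bmatrix}.$$

Thus we have $x^{(0)} = 0$,

$$x^{(1)} = \begin{bmatrix} -5/6 & -5/12 \end{bmatrix}^T \approx \begin{bmatrix} -0.8333 & -0.4167 \end{bmatrix}^T,$$

and

$$x^{(2)} = \begin{bmatrix} -25/27 & -25/108 \end{bmatrix}^T \approx \begin{bmatrix} -0.9259 & -0.2315 \end{bmatrix}^T.$$

Let's compare the values of $f$ obtained at $x^{(0)}$, $x^{(1)}$, and $x^{(2)}$, with some help from Matlab: $f(x^{(0)}) = 3$, $f(x^{(1)}) = 3 - 75/144 \approx 2.4792$, and $f(x^{(2)}) = 3 - 725/1296 \approx 2.4406$.

The analytical solution is obtained by solving

$$\nabla f(x) = \begin{bmatrix} 1 + x_1 \\ 1/2 + 2x_2 \end{bmatrix} = 0$$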
to obtain

$$x^* = \begin{bmatrix} -1 \\ -1/4 \end{bmatrix}$$

and

$$f(x^*) = -1 - (1/4)/2 + (-1)^2/2 + (-1/4)^2 + 3 = 3 - 9/16 = 2.4375.$$

Of course we must check the Hessian to make sure this is a minimum. Since the Hessian is diagonal with eigenvalues 1 and 2, the Hessian is positive definite and $x^*$ is a minimizer.

8.8 Global Convergence of Fixed Step Algorithm

Using the fixed-step gradient algorithm,

$$x^{(k+1)} = x^{(k)} - \alpha \nabla f(x^{(k)}), \tag{5}$$

find the maximum $\alpha_0$ such that the algorithm is globally convergent for all $\alpha \in (0, \alpha_0)$, when applied to the function

$$f(x) = 3(x_1^2 + x_2^2) + 4x_1x_2 + 5x_1 + 6x_2 + 7.$$

Solution: Theorem 8.3 says that for the fixed-step gradient algorithm, $x^{(k)}$ converges to $x^*$ for any $x^{(0)}$, i.e. the algorithm is globally convergent, if and only if (iff)

$$0 < \alpha < \frac{2}{\lambda_{\max}(Q)}.$$

The $Q$ in question is the $Q$ in

$$f(x) = \frac{1}{2} x^T Q x - b^T x.$$

We have to find a $Q$ such that

$$3(x_1^2 + x_2^2) + 4x_1x_2 = \frac{1}{2} x^T Q x.$$

The appropriate value is

$$Q = \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix},$$

which you can check by simply doing the multiplication. To apply the Theorem, we need the eigenvalues of $Q$, which we find as follows:

$$|sI - Q| = \begin{vmatrix} s - 6 & -4 \\ -4 & s - 6 \end{vmatrix} = s^2 - 12s + 36 - 16 = (s - 10)(s - 2).$$

Thus $\lambda_{\max}(Q) = 10$ and $\alpha_0 = 2/10 = 1/5$.
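The bound from Theorem 8.3 can be checked numerically. The sketch below is an illustration only, not part of the original solution: it computes the eigenvalues of the symmetric $2 \times 2$ matrix $Q$ in closed form and then runs the fixed-step iteration with one step size just below $\alpha_0$ and one just above. The helper names, the test step sizes 0.19 and 0.21, the starting point at the origin, and the iteration count are all assumptions chosen for the demonstration. The minimizer $x^* = (-0.3, -0.8)$ was obtained by solving $Qx = b$ with $b = (-5, -6)$.

```python
import math


def sym2_eigenvalues(Q):
    """Eigenvalues (ascending) of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    (a, b), (_, d) = Q
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius


def fixed_step(Q, b, x, alpha, iterations):
    """Iterate x <- x - alpha * (Q x - b) for f(x) = 1/2 x^T Q x - b^T x."""
    for _ in range(iterations):
        g = [Q[i][0] * x[0] + Q[i][1] * x[1] - b[i] for i in range(2)]
        x = [x[i] - alpha * g[i] for i in range(2)]
    return x


Q = [[6.0, 4.0], [4.0, 6.0]]
b = [-5.0, -6.0]        # f contains +5 x1 + 6 x2, so b = (-5, -6)
x_star = [-0.3, -0.8]   # solves Q x = b

lam_min, lam_max = sym2_eigenvalues(Q)
alpha0 = 2 / lam_max


def error_after(alpha, iterations=500):
    # Distance from the minimizer after a fixed number of steps from the origin.
    x = fixed_step(Q, b, [0.0, 0.0], alpha, iterations)
    return math.dist(x, x_star)


print("eigenvalues:", lam_min, lam_max, "alpha0:", alpha0)
print("error with alpha = 0.19:", error_after(0.19))
print("error with alpha = 0.21:", error_after(0.21))
```

With the smaller step size the error shrinks toward zero, while with the larger one it grows without bound, which is the behavior the theorem predicts on either side of $2/\lambda_{\max}(Q)$.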
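8.9 Zero Finding

Consider finding the zeros of

$$h(x) = \begin{bmatrix} 4 + 3x_1 + 2x_2 \\ 1 + 2x_1 + 3x_2 \end{bmatrix}$$

by applying the fixed-step algorithm

$$x^{(k+1)} = x^{(k)} - \alpha h(x^{(k)}).$$

(a) Find the solution of $h(x) = 0$.

Solution: We have two equations in two unknowns:

$$\begin{aligned} 4 + 3x_1 + 2x_2 &= 0 \\ 1 + 2x_1 + 3x_2 &= 0. \end{aligned}$$

We can rewrite this as

$$\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} x + \begin{bmatrix} 4 \\ 1 \end{bmatrix} = 0.$$

The determinant of the $2 \times 2$ matrix is $9 - 4 = 5 \neq 0$, so the system of equations has a unique solution, namely, by inverting the matrix,

$$x = -\frac{1}{5}\begin{bmatrix} 3 & -2 \\ -2 & 3 \end{bmatrix}\begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ 1 \end{bmatrix}.$$

(b) Find $\alpha_0$ such that the algorithm is globally convergent (i.e. converges regardless of the value of $x^{(0)}$ we use).

Solution: Since $h(x) = Qx + c$ with

$$Q = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}, \qquad c = \begin{bmatrix} 4 \\ 1 \end{bmatrix},$$

$h$ is the gradient of the quadratic $\frac{1}{2}x^T Q x + c^T x$, and the iteration is the fixed-step gradient algorithm for that quadratic. We find the eigenvalues of $Q$ so that we can use Theorem 8.3 as in the previous problem:

$$|sI - Q| = (s - 3)^2 - 4 = (s - 1)(s - 5).$$

Thus $\lambda_{\max}(Q) = 5$ and the algorithm is globally convergent for $\alpha \in (0, \alpha_0)$ where $\alpha_0 = 2/5$.

(c) Consider the value $\alpha = 1000$, which is well outside the range for global convergence. Find an initial condition $x^{(0)} = [x_1, 0]^T$ for which the algorithm does not satisfy the descent property.

Solution: Recall the definition $V(x) = f(x) + \frac{1}{2} x^{*T} Q x^*$ on page 142, and the assertion of Lemma 8.1 that for the iterative gradient algorithm,

$$V(x^{(k+1)}) = (1 - \gamma_k)\, V(x^{(k)})$$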
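when $\gamma_k$ is defined as on p. 142. The descent property fails at step $k$ when $\gamma_k < 0$, since then $V(x^{(k+1)}) > V(x^{(k)})$. Lemma 8.3 asserts that, in the steepest descent algorithm, if $g^{(k)} \neq 0$ for all $k$, then $\gamma_k = 1$ iff $g^{(k)}$ is an eigenvector of $Q$. This suggests looking at eigenvector directions. For the fixed-step algorithm, if $g^{(k)}$ is an eigenvector of $Q$ with eigenvalue $\lambda$, the definition of $\gamma_k$ reduces to

$$\gamma_k = \alpha\lambda(2 - \alpha\lambda),$$

which is negative when $\alpha = 1000$. So we should choose an initial condition for which $g^{(0)} = h(x^{(0)})$ is an eigenvector of $Q$. Here $Q = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix}$ is the matrix in $h(x) = Qx + c$, with eigenvalues 1 and 5, so we just need to find the eigenvectors:

$$\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} v = v$$

implies that $3x_1 + 2x_2 = x_1$ (first row), so we need $x_2 = -x_1$;

$$\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} v = 5v$$

implies that $3x_1 + 2x_2 = 5x_1$ (first row), so we need $x_2 = x_1$. Thus choosing $x^{(0)}$ such that

$$h(x^{(0)}) = \begin{bmatrix} 1 \\ \pm 1 \end{bmatrix}$$

will result in the algorithm failing to descend at the first step. Again we have two equations in two unknowns:

$$\begin{aligned} 4 + 3x_1 + 2x_2 &= 1 \\ 1 + 2x_1 + 3x_2 &= \pm 1. \end{aligned}$$

We can rewrite this as

$$\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} x + \begin{bmatrix} 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ \pm 1 \end{bmatrix}.$$

As before, the determinant of the $2 \times 2$ matrix is $9 - 4 = 5 \neq 0$, so the system of equations has a unique solution for each choice of sign, namely, by inverting the matrix,

$$x = \frac{1}{5}\begin{bmatrix} 3 & -2 \\ -2 & 3 \end{bmatrix}\left(\begin{bmatrix} 1 \\ \pm 1 \end{bmatrix} - \begin{bmatrix} 4 \\ 1 \end{bmatrix}\right).$$

Taking the minus sign gives $x^{(0)} = [-1, 0]^T$, which has the required form.

8.13 Descent and Global Convergence

Let $f(x) = (x - 1)^2$ for real $x$. Consider the following iterative algorithm for finding the minimizer of $f$:

$$x^{(k+1)} = x^{(k)} - \alpha 2^{-k} f'(x^{(k)}), \qquad \alpha \in (0, 1).$$

(a) Does the algorithm have the descent property?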
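Solution: We will apply Lemma 8.1, which states that for the quadratic $f$ and the iterative algorithm

$$x^{(k+1)} = x^{(k)} - \alpha_k \nabla f(x^{(k)}),$$

the function

$$V(x) := f(x) + \frac{1}{2} x^{*T} Q x^*$$

satisfies

$$V(x^{(k+1)}) = (1 - \gamma_k)\, V(x^{(k)}),$$

where

$$\gamma_k = \begin{cases} 1 & \text{if } \nabla f(x^{(k)}) = 0, \\[4pt] \alpha_k \dfrac{\nabla f(x^{(k)})^T Q\, \nabla f(x^{(k)})}{\nabla f(x^{(k)})^T Q^{-1} \nabla f(x^{(k)})} \left( 2\,\dfrac{\nabla f(x^{(k)})^T \nabla f(x^{(k)})}{\nabla f(x^{(k)})^T Q\, \nabla f(x^{(k)})} - \alpha_k \right) & \text{otherwise.} \end{cases}$$

Accordingly, for the given $f$ with $Q = 2$, $\nabla f(x) = 2x - 2$, and $\alpha_k = 2^{-k}\alpha$, we have

$$\gamma_k = 2^{-k}\alpha\, \frac{(2x^{(k)} - 2)\,2\,(2x^{(k)} - 2)}{(2x^{(k)} - 2)(1/2)(2x^{(k)} - 2)} \left( 2\,\frac{(2x^{(k)} - 2)(2x^{(k)} - 2)}{(2x^{(k)} - 2)\,2\,(2x^{(k)} - 2)} - 2^{-k}\alpha \right) = 2^{-k+2}\alpha\left(1 - 2^{-k}\alpha\right),$$

which is independent of $x^{(k)}$. To determine whether this value will be between zero and one, we proceed as follows. The first three $\gamma_k$ are

$$\gamma_0 = 4(\alpha - \alpha^2), \qquad \gamma_1 = 2\alpha - \alpha^2, \qquad \gamma_2 = \alpha - \alpha^2/4.$$

In general $\gamma_k = 1 - (1 - 2^{1-k}\alpha)^2$, and $0 < 2^{1-k}\alpha < 1$ for $k \geq 1$, so $\gamma_k \in (0, 1)$ for $k \geq 1$. It remains to examine $\gamma_0$. To determine the range of possible values that $\gamma_0$ can take, we take the derivative with respect to $\alpha$, obtaining

$$\frac{d\gamma_0}{d\alpha} = \frac{d}{d\alpha}\left(4\alpha - 4\alpha^2\right) = 4(1 - 2\alpha) = 0$$

when $\alpha = 1/2$. The second derivative is $d^2\gamma_0/d\alpha^2 = -8$, so $\alpha = 1/2$ is a maximizer of $\gamma_0$, with maximum value $\gamma_0 = 1$. Since $\gamma_0 = 4\alpha(1 - \alpha) > 0$ on $(0, 1)$, we have $\gamma_0 \in (0, 1]$. Thus $\gamma_k > 0$ for every $k$, so $V(x^{(k+1)}) = (1 - \gamma_k) V(x^{(k)}) < V(x^{(k)})$ whenever $x^{(k)}$ is not the minimizer, and the algorithm has the descent property. In the special case $\alpha = 1/2$, $\gamma_0 = 1$ gives $V(x^{(1)}) = 0$, i.e. $x^{(1)} = 1$ is already the minimizer.

(b) Is the algorithm globally convergent?

Solution: By Theorem 8.1, the algorithm is globally convergent iff $\gamma_k > 0$ for all $k$ and the infinite sum of the $\gamma_k$ is infinite. First we must determine whether $\gamma_k$ is positive for all $k$. From part (a) we obtained that

$$\gamma_k = 2^{-k+2}\alpha\left(1 - 2^{-k}\alpha\right) = 2^{-k+2}\alpha - 2^{-2k+2}\alpha^2.$$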
Because $\alpha \in (0, 1)$, $\alpha^2 < \alpha$. Also, for $k > 0$, $2^{-2k+2} < 2^{-k+2}$; for $k = 0$, they are equal. Thus $\gamma_k > 0$ for all $k$. Next we compute the sum:

$$\sum_{k=0}^{\infty} \gamma_k = \sum_{k=0}^{\infty} \left(2^{-k+2}\alpha - 2^{-2k+2}\alpha^2\right) \leq \sum_{k=0}^{\infty} 2^{-k+2}\alpha = 4\alpha \sum_{k=0}^{\infty} 2^{-k} = 8\alpha < \infty,$$

so the algorithm is not globally convergent.

8.20 Fixed Step-Size Algorithm

Consider the function

$$f(x) = x^T \begin{bmatrix} 3/2 & 2 \\ 0 & 3/2 \end{bmatrix} x + x^T \begin{bmatrix} 3 \\ -1 \end{bmatrix} - 22.$$

(a) Find the range of step-size values for which the fixed-step gradient algorithm converges to the minimizer.

Solution: First we must rewrite the function as

$$f(x) = \frac{1}{2} x^T \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} x + x^T \begin{bmatrix} 3 \\ -1 \end{bmatrix} - 22.$$

We find the eigenvalues of the $Q$ matrix:

$$\begin{vmatrix} s - 3 & -2 \\ -2 & s - 3 \end{vmatrix} = (s - 3)^2 - 4 = s^2 - 6s + 9 - 4 = s^2 - 6s + 5 = (s - 1)(s - 5).$$

Thus we have global convergence for $\alpha \in (0, 2/5)$.

(b) For a step size of 1000, find an initial condition $x^{(0)}$ for which the algorithm diverges.

Solution: We'll need the eigenvectors of the $Q$ matrix. We have

$$\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} v = v$$

implies that $3x_1 + 2x_2 = x_1$ (first row), so we need $x_2 = -x_1$;

$$\begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} v = 5v$$

implies that $3x_1 + 2x_2 = 5x_1$ (first row), so we need $x_2 = x_1$.
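Thus if we select $x^{(0)}$ such that $\nabla f(x^{(0)})$ is a multiple of either of these eigenvectors, the algorithm will diverge. Let's check. We want

$$Qx^{(0)} - b = \begin{bmatrix} 3 & 2 \\ 2 & 3 \end{bmatrix} x^{(0)} + \begin{bmatrix} 3 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$

This can be solved for

$$x^{(0)} = \frac{1}{5}\begin{bmatrix} 3 & -2 \\ -2 & 3 \end{bmatrix}\begin{bmatrix} -2 \\ 2 \end{bmatrix} = \begin{bmatrix} -2 \\ 2 \end{bmatrix}.$$

Using Matlab we find that using this initial condition we obtain

$$x^{(1)} = \begin{bmatrix} -1002 \\ -998 \end{bmatrix} \quad \text{and} \quad x^{(2)} = \begin{bmatrix} 4997998 \\ 4998002 \end{bmatrix}.$$

8.21

Find the largest $\alpha_0$ such that the fixed-step algorithm is globally convergent for $\alpha \in (0, \alpha_0)$.

(a) $f(x) = 1 + 2x_1 + 3(x_1^2 + x_2^2) + 4x_1x_2$

Solution: We rewrite the function as

$$f(x) = \frac{1}{2} x^T \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix} x + x^T \begin{bmatrix} 2 \\ 0 \end{bmatrix} + 1.$$

We already found the eigenvalues of this $Q$ matrix to be 2 and 10 and $\alpha_0 = 1/5$.

(b) $f(x) = x^T \begin{bmatrix} 3 & 3 \\ 1 & 3 \end{bmatrix} x + x^T \begin{bmatrix} 16 \\ 23 \end{bmatrix} + \pi^2$

Solution: We rewrite the function as

$$f(x) = \frac{1}{2} x^T \begin{bmatrix} 6 & 4 \\ 4 & 6 \end{bmatrix} x + x^T \begin{bmatrix} 16 \\ 23 \end{bmatrix} + \pi^2.$$

Again, $\alpha_0 = 1/5$.

© 2015 S. Koskie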
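Two steps in Problems 8.20 and 8.21 lend themselves to a quick numerical check: rewriting $x^T A x$ with a non-symmetric $A$ as $\frac{1}{2} x^T Q x$ with $Q = A + A^T$, and iterating the fixed-step algorithm with step size 1000. The Python sketch below is an illustration under stated assumptions, not the Matlab computation referred to in the solution. The helper names `symmetrize` and `fixed_step` are hypothetical, and the gradient is written as $Qx + c$ with $c = [3, -1]^T$ for Problem 8.20.

```python
from fractions import Fraction as F


def symmetrize(A):
    """Return Q = A + A^T, so that x^T A x = 1/2 x^T Q x."""
    n = len(A)
    return [[A[i][j] + A[j][i] for j in range(n)] for i in range(n)]


def fixed_step(Q, c, x, alpha, iterations):
    """Iterates of x <- x - alpha * (Q x + c), where Q x + c is the
    gradient of 1/2 x^T Q x + c^T x (plus any constant)."""
    path = [list(x)]
    n = len(x)
    for _ in range(iterations):
        g = [sum(Q[i][j] * x[j] for j in range(n)) + c[i] for i in range(n)]
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
        path.append(x)
    return path


# Problem 8.20: exact fractions avoid any rounding in the 3/2 entries.
Q_820 = symmetrize([[F(3, 2), F(2)], [F(0), F(3, 2)]])
# Problem 8.21(b).
Q_821b = symmetrize([[3, 3], [1, 3]])

# Problem 8.20(b): two iterations from x0 = [-2, 2] with step size 1000.
path = fixed_step(Q_820, [3, -1], [-2, 2], 1000, 2)

print([[int(q) for q in row] for row in Q_820])
print(Q_821b)
print([[int(v) for v in x] for x in path])
```

The iterates grow by a factor of $|1 - 1000 \cdot 5| = 4999$ per step because the initial gradient lies along the eigenvector with eigenvalue 5.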