ECE580 Partial Solution to Problem Set 3

ECE580 Fall 2015, October 23, 2015.

These problems are from the textbook by Chong and Zak, 4th edition, which is the textbook for the ECE580 Fall 2015 semester. As such, many of the problem statements are taken verbatim from the text; however, others have been reworded for reasons of efficiency or instruction. Solutions are mine. Any errors are mine and should be reported to me, skoskie@iupui.edu, rather than to the textbook authors.

8.1 Because the calculations are routine, I will not be providing a Matlab script for the calculations involved in solving this problem. Starting from x^(0) = 0, perform two iterations of the steepest descent algorithm toward finding the minimizer of

f(x_1, x_2) = x_1 + x_2/2 + x_1^2/2 + x_2^2 + 3.

Also solve the problem analytically.

Solution: The steepest descent algorithm is

x^(k+1) = x^(k) - α_k ∇f(x^(k)),   (1)

where

α_k = arg min_{α>0} f(x^(k) - α ∇f(x^(k))).   (2)

First we find

∇f(x) = [1 + x_1, 1/2 + 2x_2]^T.

Then we find α_0. Let

φ_0(α) = f(x^(0) - α ∇f(x^(0))) = f(0 - α [1, 1/2]^T)
       = (-α) + (-α/2)/2 + (-α)^2/2 + (-α/2)^2 + 3
       = -5α/4 + 3α^2/4 + 3.

Then solving for the stationary point of φ_0,

dφ_0/dα = -5/4 + 3α/2 = 0

yields α = 5/6. Since d^2 φ_0/dα^2 = 3/2 > 0, this value is a minimizer. The first step of the algorithm then yields

x^(1) = [0, 0]^T - (5/6)[1, 1/2]^T = [-5/6, -5/12]^T,
∇f(x^(1)) = [1 - 5/6, 1/2 - 2(5/12)]^T = [1/6, -1/3]^T,

so the second step gives us

x^(2) = x^(1) - α_1 ∇f(x^(1)),   (3)

where

α_1 = arg min_{α>0} f(x^(1) - α ∇f(x^(1))).   (4)

With some help from Matlab, we solve for α_1 as follows:

φ_1(α) = f([-5/6, -5/12]^T - α[1/6, -1/3]^T) = f([-5/6 - α/6, -5/12 + α/3]^T)
       = (-5/6 - α/6) + (-5/12 + α/3)/2 + (-5/6 - α/6)^2/2 + (-5/12 + α/3)^2
       = α^2/8 - 5α/36 - 25/48,

where the constant term 3 of f has been dropped since it does not affect the minimizer. Then

dφ_1/dα = -5/36 + α/4 = 0

gives α = 5/9. Taking the second derivative we find d^2 φ_1/dα^2 = 1/4 > 0, so α_1 = 5/9 is a minimizer. Thus

x^(2) = x^(1) - α_1 ∇f(x^(1)) = [-5/6, -5/12]^T - (5/9)[1/6, -1/3]^T = [-50/54, -25/108]^T.

Thus we have x^(0) = 0, x^(1) = [-5/6, -5/12]^T ≈ [-0.8333, -0.4167]^T, and x^(2) = [-25/27, -25/108]^T ≈ [-0.9259, -0.2315]^T.

Let's compare the values obtained at x^(0), x^(1), and x^(2), with some help from Matlab: f(x^(0)) = 3, f(x^(1)) = 3 - 75/144 ≈ 2.4792, and f(x^(2)) = 3 - 725/1296 ≈ 2.4406.
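Although the solution above intentionally omits a script, the two iterations are easy to reproduce numerically. The following MATLAB/Octave sketch is only an illustrative check, not part of the original solution; it uses the exact line-search step α_k = g^T g / (g^T Q g), valid for a quadratic with Hessian Q = diag(1, 2), and reproduces the iterates and function values quoted above.

% Illustrative check of Problem 8.1 (not part of the original solution).
f     = @(x) x(1) + x(2)/2 + x(1)^2/2 + x(2)^2 + 3;
gradf = @(x) [1 + x(1); 1/2 + 2*x(2)];
Q     = [1 0; 0 2];               % Hessian of f
x = [0; 0];
for k = 1:2
    g     = gradf(x);
    alpha = (g'*g)/(g'*Q*g);      % exact line search for a quadratic
    x     = x - alpha*g;
    fprintf('k=%d  alpha=%.4f  x=(%.4f, %.4f)  f=%.4f\n', ...
            k, alpha, x(1), x(2), f(x));
end
% Expected output: alpha = 0.8333, 0.5556 and f = 2.4792, 2.4406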

The analytical solution is obtained by solving

∇f(x) = [1 + x_1, 1/2 + 2x_2]^T = 0

to obtain

x* = [-1, -1/4]^T

and

f(x*) = -1 + (-1/4)/2 + (-1)^2/2 + (-1/4)^2 + 3 = 3 - 9/16 = 2.4375.

Of course we must check the Hessian to make sure this is a minimum. Since the Hessian is diagonal with eigenvalues 1 and 2, the Hessian is positive definite and x* is a minimizer.

8.8 Global Convergence of the Fixed-Step Algorithm. Using the fixed-step gradient algorithm,

x^(k+1) = x^(k) - α ∇f(x^(k)),   (5)

find the maximum α_0 such that the algorithm is globally convergent for all α ∈ (0, α_0), when applied to the function

f(x) = 3(x_1^2 + x_2^2) + 4x_1x_2 + 5x_1 + 6x_2 + 7.

Solution: Theorem 8.3 says that for the fixed-step gradient algorithm, x^(k) converges to x* for any x^(0), i.e. the algorithm is globally convergent, if and only if (iff)

0 < α < 2/λ_max(Q).

The Q in question is the Q in

f(x) = (1/2) x^T Q x - b^T x.

We have to find a Q such that

3(x_1^2 + x_2^2) + 4x_1x_2 = (1/2) x^T Q x.

The appropriate value is

Q = [6 4; 4 6],

which you can check by simply doing the multiplication. To apply the theorem, we need the eigenvalues of Q, which we find as follows:

det(sI - Q) = det([s-6, -4; -4, s-6]) = (s-6)^2 - 16 = s^2 - 12s + 20 = (s - 10)(s - 2).

Thus λ_max(Q) = 10 and α_0 = 2/10 = 1/5.
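As an illustrative numerical check (not part of the original solution; the test step sizes 0.19 and 0.21 and the starting point are arbitrary choices), the following MATLAB/Octave lines confirm the eigenvalues of Q and show the fixed-step iteration contracting for a step size just below 1/5 and diverging for one just above it.

% Illustrative check of Problem 8.8 (not part of the original solution).
Q = [6 4; 4 6];  b = -[5; 6];          % f(x) = (1/2)x'Qx - b'x + 7
alpha0 = 2/max(eig(Q));                % = 1/5
xstar  = Q\b;                          % unique minimizer of f
for alpha = [0.19 0.21]                % just inside / just outside the range
    x = [1; 1];
    for k = 1:200
        x = x - alpha*(Q*x - b);       % fixed-step gradient iteration
    end
    fprintf('alpha = %.2f   ||x - x*|| = %.3g\n', alpha, norm(x - xstar));
end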

8.9 Zero Finding. Consider finding the zeros of

h(x) = [4 + 3x_1 + 2x_2; 1 + 2x_1 + 3x_2]

by applying the fixed-step algorithm

x^(k+1) = x^(k) - α h(x^(k)).

(a) Find the solution of h(x) = 0.

Solution: We have two equations in two unknowns:

4 + 3x_1 + 2x_2 = 0
1 + 2x_1 + 3x_2 = 0.

We can rewrite this as

[3 2; 2 3] x + [4; 1] = 0.

The determinant of the 2 x 2 matrix is 9 - 4 = 5 ≠ 0, so the system of equations has a unique solution, namely, by inverting the matrix,

x* = -(1/5)[3 -2; -2 3][4; 1] = [-2, 1]^T.

(b) Find α_0 such that the algorithm is globally convergent (i.e. converges regardless of the value of x^(0) we use).

Solution: Write h(x) = Qx - b with Q = [3 2; 2 3] and b = -[4, 1]^T; then the fixed-step algorithm above is exactly the fixed-step gradient algorithm for the quadratic f(x) = (1/2) x^T Q x - b^T x, so we can use Theorem 8.3 as in the previous problem. The characteristic polynomial of Q is (s - 3)^2 - 4 = (s - 1)(s - 5), so λ_max(Q) = 5 and the algorithm is globally convergent for α ∈ (0, α_0) with α_0 = 2/5.

(c) Consider the value α = 1000, which is well outside the range for global convergence. Find an initial condition x^(0) = [x_1^(0), x_2^(0)]^T for which the algorithm does not satisfy the descent property.

Solution: Recall the definition V(x) = f(x) + (1/2) x*^T Q x* on page 142, and the assertion of Lemma 8.1 that for the iterative gradient algorithm,

V(x^(k+1)) = (1 - γ_k) V(x^(k))

where γ_k is defined as on p. 142. If g^(k) = h(x^(k)) is a nonzero eigenvector of Q with eigenvalue λ, the formula for γ_k reduces to γ_k = αλ(2 - αλ), which is negative whenever αλ > 2. With α = 1000 and either eigenvalue of Q this gives γ_0 < 0, hence V(x^(1)) > V(x^(0)); since V differs from f only by a constant, f(x^(1)) > f(x^(0)) as well, and the descent property fails. We should therefore choose an initial condition for which h(x^(0)) is an eigenvector of Q. We already know that the eigenvalues of Q = [3 2; 2 3] are 1 and 5, so we just need to find the eigenvectors:

[3 2; 2 3] v = v

implies that 3v_1 + 2v_2 = v_1 (first row), so we need v_2 = -v_1, and

[3 2; 2 3] v = 5v

implies that 3v_1 + 2v_2 = 5v_1 (first row), so we need v_2 = v_1. Thus any x^(0) for which h(x^(0)) is a nonzero multiple of [1, -1]^T or [1, 1]^T will cause the descent property to fail. To obtain, say, h(x^(0)) = [1, 1]^T, we again have two equations in two unknowns:

4 + 3x_1 + 2x_2 = 1
1 + 2x_1 + 3x_2 = 1.

We can rewrite this as

[3 2; 2 3] x = [-3; 0].

As before, the determinant of the 2 x 2 matrix is 9 - 4 = 5 ≠ 0, so the system of equations has a unique solution, namely, by inverting the matrix,

x^(0) = (1/5)[3 -2; -2 3][-3; 0] = [-9/5, 6/5]^T.
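The following MATLAB/Octave lines are an illustrative check (not part of the original solution) of the zero found in part (a), the step-size bound of part (b), and the failure of descent in part (c); the quadratic f below is simply an antiderivative of h, introduced only so that the descent property can be tested.

% Illustrative check of Problem 8.9 (not part of the original solution).
A = [3 2; 2 3];  c = [4; 1];           % h(x) = A*x + c
h = @(x) A*x + c;
xstar  = -A\c;                         % zero of h: expected (-2, 1)
alpha0 = 2/max(eig(A));                % expected 2/5
% Part (c): with alpha = 1000, descent fails when h(x0) is an eigenvector of A.
f  = @(x) 4*x(1) + x(2) + 1.5*x(1)^2 + 2*x(1)*x(2) + 1.5*x(2)^2;  % grad f = h
x0 = [-9/5; 6/5];                      % gives h(x0) = [1; 1]
x1 = x0 - 1000*h(x0);
fprintf('x* = (%g, %g), alpha0 = %g\n', xstar(1), xstar(2), alpha0);
fprintf('f(x0) = %.3f, f(x1) = %.3e  (f increased, so descent fails)\n', f(x0), f(x1));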

8.13 Descent and Global Convergence. Let f(x) = (x - 1)^2 for real x. Consider the following iterative algorithm for finding the minimizer of f:

x^(k+1) = x^(k) - α 2^(-k) f'(x^(k)),   α ∈ (0, 1).

(a) Does the algorithm have the descent property?

Solution: We will apply Lemma 8.1, which states that for the quadratic f(x) = (1/2) x^T Q x - b^T x and the iterative algorithm x^(k+1) = x^(k) - α_k ∇f(x^(k)), the function V(x) := f(x) + (1/2) x*^T Q x* satisfies

V(x^(k+1)) = (1 - γ_k) V(x^(k)),

where, writing g^(k) = ∇f(x^(k)),

γ_k = α_k (g^(k)T Q g^(k) / g^(k)T Q^(-1) g^(k)) (2 g^(k)T g^(k) / g^(k)T Q g^(k) - α_k)   if g^(k) ≠ 0,

and γ_k = 1 otherwise. Accordingly, for the given f, with Q = 2 (a scalar), f'(x) = 2x - 2, and α_k = α 2^(-k), we have

γ_k = 2^(-k) α [(2x^(k) - 2) 2 (2x^(k) - 2)] / [(2x^(k) - 2)(1/2)(2x^(k) - 2)]
      × (2 (2x^(k) - 2)(2x^(k) - 2) / [(2x^(k) - 2) 2 (2x^(k) - 2)] - 2^(-k) α)
    = 2^(2-k) α (1 - 2^(-k) α),

which is independent of x^(k). The first three values are

γ_0 = 4(α - α^2),   γ_1 = 2α - α^2,   γ_2 = α - α^2/4.

To determine whether γ_k lies between zero and one, write t_k = 2^(-k) α, so that γ_k = 4 t_k (1 - t_k) with t_k ∈ (0, 1) because α ∈ (0, 1). Taking the derivative with respect to t_k, d/dt_k [4 t_k (1 - t_k)] = 4(1 - 2t_k) = 0 at t_k = 1/2, and the second derivative is -8, so t_k = 1/2 is a maximizer, at which γ_k = 1. Hence 0 < γ_k ≤ 1 for every k, and

V(x^(k+1)) = (1 - γ_k) V(x^(k)) ≤ V(x^(k)),

with strict decrease whenever x^(k) ≠ x*. (The borderline value γ_k = 1 can occur only for k = 0 with α = 1/2; it gives V(x^(1)) = 0, i.e. the minimizer is reached in one step, which is still a descent step.) Thus the algorithm has the descent property.

(b) Is the algorithm globally convergent?

Solution: By Theorem 8.1, the algorithm is globally convergent iff γ_k > 0 for all k and the infinite sum of the γ_k is infinite. First we must determine whether γ_k is positive for all k. From part (a) we obtained

γ_k = 2^(2-k) α (1 - 2^(-k) α) = 2^(2-k) α - 2^(2-2k) α^2.

Because α ∈ (0, 1), α^2 < α. Also, for k > 0, 2^(2-2k) < 2^(2-k), and for k = 0 the two are equal. Thus γ_k > 0 for all k. Next we compute the sum:

Σ_{k=0}^∞ γ_k = Σ_{k=0}^∞ (2^(2-k) α - 2^(2-2k) α^2) ≤ Σ_{k=0}^∞ 2^(2-k) α = 4α Σ_{k=0}^∞ 2^(-k) = 8α < ∞,

so the sum is finite and the algorithm is not globally convergent.
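A brief numerical illustration (not part of the original solution; the particular values α = 0.3 and x^(0) = 5 are arbitrary choices): every step decreases f, but because the step sizes are halved at each iteration the iterates stall well short of the minimizer x* = 1, exactly as the finite sum of the γ_k predicts.

% Illustrative check of Problem 8.13 (not part of the original solution).
f  = @(x) (x - 1)^2;
df = @(x) 2*(x - 1);
alpha = 0.3;  x = 5;                   % any alpha in (0,1) and any x(0)
for k = 0:40
    xnew = x - alpha*2^(-k)*df(x);
    assert(f(xnew) < f(x));            % descent property holds at every step
    x = xnew;
end
fprintf('x after 41 steps = %.4f (minimizer is x* = 1)\n', x);
% x(k) - 1 = (x(0) - 1) * prod(1 - 2*alpha*2^(-j)); the product has a nonzero
% limit, so the iterates converge to a point other than x* (about 1.82 here).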

8.20 Fixed Step-Size Algorithm. Consider the function

f(x) = x^T [3/2 2; 0 3/2] x + x^T [3; -1] + 22.

(a) Find the range of step-size values for which the fixed-step gradient algorithm converges to the minimizer.

Solution: First we must rewrite the function as

f(x) = (1/2) x^T [3 2; 2 3] x + x^T [3; -1] + 22.

We find the eigenvalues of the Q matrix:

det(sI - Q) = (s - 3)^2 - 4 = s^2 - 6s + 5 = (s - 1)(s - 5).

Thus we have global convergence for α ∈ (0, 2/5).

(b) For a step size of 1000, find an initial condition x^(0) for which the algorithm diverges.

Solution: We'll need the eigenvectors of the Q matrix. We have

[3 2; 2 3] v = v,

which implies (first row) that 3v_1 + 2v_2 = v_1, so we need v_2 = -v_1, and

[3 2; 2 3] v = 5v,

which implies (first row) that 3v_1 + 2v_2 = 5v_1, so we need v_2 = v_1. Thus if we select x^(0) such that ∇f(x^(0)) is a nonzero multiple of either of these eigenvectors, the algorithm will diverge. Let's check. We want

∇f(x^(0)) = Q x^(0) + [3; -1] = [1; 1].

This can be solved for

x^(0) = (1/5)[3 -2; -2 3][-2; 2] = [-2; 2].

Using Matlab we find that with this initial condition we obtain

x^(1) = [-1002; -998]   and   x^(2) = [4997998; 4998002].

8.21 Find the largest α_0 such that the fixed-step algorithm is globally convergent for α ∈ (0, α_0).

(a) f(x) = 1 + 2x_1 + 3(x_1^2 + x_2^2) + 4x_1x_2.

Solution: We rewrite the function as

f(x) = (1/2) x^T [6 4; 4 6] x + x^T [2; 0] + 1.

We already found the eigenvalues of this Q matrix to be 2 and 10, so α_0 = 1/5.

(b) f(x) = x^T [3 3; 1 3] x + x^T [16; 23] + π^2.

Solution: We rewrite the function as

f(x) = (1/2) x^T [6 4; 4 6] x + x^T [16; 23] + π^2.

Again the eigenvalues are 2 and 10, so α_0 = 1/5.

© 2015 S. Koskie