Course Instructor: Dr. Raymond C. Rumpf
Office: A-337
Phone: (915) 747-6958
E-Mail: rcrumpf@utep.edu

Topic 8c: Multi-Variable Optimization
EE 4386/5301 Computational Methods in EE

Outline
- Mathematical Preliminaries
  - Multivariable functions
  - Scalar and vector fields
  - Gradients and Hessians
  - Revised derivative tests for multiple variables
- Powell's method
- Steepest ascent method
- Newton's method for multiple variables

Multivariable Optimization
Mathematical Preliminaries

Multivariable Functions
The general form for a multivariable function is
f = f(x₁, x₂, …, x_N)

Example #1: f(x,y) = 2xy + 2x − x² − 2y²
Example #2: f(A, x, …) = A·exp(…)
Scalar Field vs. Vector Field
A scalar field f(x,y,z) assigns a magnitude to every point (x,y,z) in space. A vector field v(x,y,z) assigns both a magnitude and a direction to every point (x,y,z) in space.

Isocontour Lines
Isocontour lines trace the paths of equal value. Closely spaced isocontours convey that the function is varying rapidly.
Gradient of a Scalar Field (1 of 3)
We start with a scalar field f(x,y).

Gradient of a Scalar Field (2 of 3)
We then plot the gradient on top of it. The color in the background is the original scalar field f(x,y).
Gradient of a Scalar Field (3 of 3)
The gradient is always perpendicular to the isocontour lines.

The Multidimensional Gradient
The standard 3D gradient can be written as
∇f(x,y,z) = (∂f/∂x)â_x + (∂f/∂y)â_y + (∂f/∂z)â_z
When more dimensions are involved, we write it as
∇f = [∂f/∂x₁, ∂f/∂x₂, …, ∂f/∂x_N]ᵀ
Properties of the Gradient
1. It only makes sense to calculate the gradient of a scalar field.*
2. ∇f points in the direction of the maximum rate of change in f.
3. ∇f at any point is perpendicular to the constant-f surface that passes through that point.
4. The gradient points toward large positive values in the scalar field.
* The gradient of a vector field is a tensor called the Jacobian. It is commonly used in coordinate transformations, but is outside the scope of this course.

Numerical Calculation of the Gradient
We may not always have a closed-form expression for our function, so it may not be possible to calculate an analytical expression for the gradient. When this is the case, we can calculate the gradient numerically with central differences, one component at a time:
∂f/∂xₙ ≈ [f(x₁, …, xₙ + Δxₙ, …, x_N) − f(x₁, …, xₙ − Δxₙ, …, x_N)] / (2Δxₙ),  n = 1, 2, …, N
This can be a very expensive computation!
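As an illustration, the central-difference gradient can be sketched in a few lines of Python (the names `numerical_gradient` and `delta` are my own, not from the course):

```python
import numpy as np

def numerical_gradient(f, x, delta=1e-6):
    """Approximate the gradient of a scalar field f at point x
    using central differences (costs 2N evaluations of f)."""
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for n in range(x.size):
        xp, xm = x.copy(), x.copy()
        xp[n] += delta          # f(..., xn + delta, ...)
        xm[n] -= delta          # f(..., xn - delta, ...)
        g[n] = (f(xp) - f(xm)) / (2.0 * delta)
    return g
```

For the example function used later in these notes, f(x,y) = 2xy + 2x − x² − 2y², this returns ∇f(2,0) ≈ [−2, 4].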
Derivative Tests in Multiple Dimensions
Suppose we have a 2D function f(x,y) for which
∂f/∂x = 0 and ∂f/∂y = 0
Does this indicate a minimum? No!
Figure borrowed from: Steven Chapra, Numerical Methods for Engineers, 7th Ed., McGraw-Hill.

The Hessian
The Hessian describes the curvature of multiple-variable functions. We will use it to determine whether we are really at a maximum or minimum. The Hessian is defined as
H = [ ∂²f/∂x₁²      ∂²f/∂x₁∂x₂   ⋯  ∂²f/∂x₁∂x_N ]
    [ ∂²f/∂x₂∂x₁    ∂²f/∂x₂²     ⋯  ∂²f/∂x₂∂x_N ]
    [      ⋮              ⋮       ⋱       ⋮      ]
    [ ∂²f/∂x_N∂x₁   ∂²f/∂x_N∂x₂  ⋯  ∂²f/∂x_N²   ]
In two dimensions, the Hessian is
H = [ ∂²f/∂x²    ∂²f/∂x∂y ]
    [ ∂²f/∂x∂y   ∂²f/∂y²  ]
and its determinant is
det H = (∂²f/∂x²)(∂²f/∂y²) − (∂²f/∂x∂y)²
Derivative Tests Revised for Multiple Dimensions
If det H > 0 and ∂²f/∂x² > 0, then f(x,y) has a local minimum.
If det H > 0 and ∂²f/∂x² < 0, then f(x,y) has a local maximum.
If det H < 0, then f(x,y) has a saddle point.

Conjugate Direction (1 of 2)
Suppose we start at two different points, x_a and x_b, and use 1D optimizations along parallel directions to arrive at the two extrema x₁ and x₂. The direction of the line connecting x₁ and x₂ is called a conjugate direction and is directed toward the maximum.
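These three tests can be collected into a small helper, sketched here in Python (`classify_critical_point` is my own name; the Hessian is assumed to be evaluated at a point where the gradient vanishes):

```python
import numpy as np

def classify_critical_point(H):
    """Second-derivative test in 2D: classify a critical point
    from the Hessian H evaluated at that point."""
    H = np.asarray(H, dtype=float)
    detH = H[0, 0] * H[1, 1] - H[0, 1] * H[1, 0]
    if detH < 0:
        return "saddle point"
    if detH > 0:
        return "local minimum" if H[0, 0] > 0 else "local maximum"
    return "inconclusive"   # det H = 0: the test gives no answer
```

For f(x,y) = 2xy + 2x − x² − 2y², the Hessian is [[−2, 2], [2, −4]] everywhere, so det H = 4 > 0 with ∂²f/∂x² = −2 < 0: a local maximum.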
Conjugate Direction (2 of 2)
Suppose we start at a common point x_a and use 1D optimizations along two different directions to arrive at the two extrema x₁ and x₂. The direction of the line connecting x₁ and x₂ is also a conjugate direction and is directed toward the maximum.

Univariate Search
Algorithm for Basic Univariate Search
1. Make an initial guess.
2. Loop over the independent variables.
   1. Fix all other independent variables except xᵢ.
   2. Perform a 1D optimization on f(xᵢ) to find the extremum.
3. If not converged, go back to Step 2.

Pattern Direction
After a few passes through all of the independent variables, an overall direction becomes apparent. It is the direction connecting the starting point to the end point. This direction points toward the extremum.
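A minimal sketch of the univariate search (the function names are mine; the 1D optimizer here is a golden-section search, but any of the 1D methods from earlier topics would work):

```python
import numpy as np

def golden_section_max(g, a, b, tol=1e-6):
    """Maximize a 1D unimodal function g on the interval [a, b]."""
    r = (np.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = b - r * (b - a), a + r * (b - a)
    while (b - a) > tol:
        if g(x1) > g(x2):
            b, x2 = x2, x1
            x1 = b - r * (b - a)
        else:
            a, x1 = x1, x2
            x2 = a + r * (b - a)
    return 0.5 * (a + b)

def univariate_search(f, x0, bounds, passes=20):
    """Maximize f by looping over the variables, optimizing one
    coordinate at a time with all the others held fixed."""
    x = np.asarray(x0, dtype=float)
    for _ in range(passes):
        for i in range(x.size):
            def gi(t, i=i):           # f with all variables but x_i fixed
                xi = x.copy()
                xi[i] = t
                return f(xi)
            x[i] = golden_section_max(gi, bounds[i][0], bounds[i][1])
    return x
```

A fixed number of passes stands in here for a real convergence test on the change in x between passes.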
Powell's Method

Pick a starting point x₀ and two different starting directions h₁ and h₂.
Starting at x₀, perform a 1D optimization along h₁ to find extremum x₁.

Starting at x₁, perform a 1D optimization along h₂ to find extremum x₂.
Define h₃ to be in the direction connecting x₀ to x₂.

Starting at x₂, perform a 1D optimization along h₃ to find extremum x₃.
Starting at x₃, perform a 1D optimization along h₂ to find extremum x₄.

Starting at x₄, perform a 1D optimization along h₃ to find extremum x₅.
Define h₄ to be in the direction connecting x₃ to x₅.

Starting at x₅, perform a 1D optimization along h₄ to find extremum x_opt. This last 1D optimization is guaranteed to find the maximum because Powell showed that h₃ and h₄ are both conjugate directions.
Algorithm Summary
1. Pick a starting point x₀ and two different starting directions h₁ and h₂.
2. Starting at x₀, perform a 1D optimization along h₁ to find extremum x₁.
3. Starting at x₁, perform a 1D optimization along h₂ to find extremum x₂.
4. Define h₃ to be in the direction connecting x₀ to x₂.
5. Starting at x₂, perform a 1D optimization along h₃ to find extremum x₃.
6. Starting at x₃, perform a 1D optimization along h₂ to find extremum x₄.
7. Starting at x₄, perform a 1D optimization along h₃ to find extremum x₅.
8. Define h₄ to be in the direction connecting x₃ to x₅.
9. Starting at x₅, perform a 1D optimization along h₄ to find extremum x_opt.

Convergence
Powell's method is quadratically convergent and extremely efficient. If iterated, it will converge in a finite number of iterations if the function is quadratic. Most functions are nearly quadratic near their extrema.
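The nine steps above can be sketched in 2D as follows (`line_max`, `powell_2d`, and the step interval `t_span` are my own assumptions, not from the course; each 1D optimization is a golden-section search on the step length along the given direction):

```python
import numpy as np

def line_max(f, x, h, t_span=2.0, tol=1e-6):
    """1D optimization of f along direction h starting at x.
    Assumes the optimum step length t lies in [-t_span, t_span]."""
    r = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = -t_span, t_span
    t1, t2 = b - r * (b - a), a + r * (b - a)
    while (b - a) > tol:
        if f(x + t1 * h) > f(x + t2 * h):
            b, t2 = t2, t1
            t1 = b - r * (b - a)
        else:
            a, t1 = t1, t2
            t2 = a + r * (b - a)
    return x + 0.5 * (a + b) * h

def powell_2d(f, x0, h1, h2):
    """One cycle of Powell's method in 2D (Steps 1-9 above)."""
    x0 = np.asarray(x0, dtype=float)
    x1 = line_max(f, x0, h1)    # Step 2: along h1
    x2 = line_max(f, x1, h2)    # Step 3: along h2
    h3 = x2 - x0                # Step 4: direction from x0 to x2
    x3 = line_max(f, x2, h3)    # Step 5
    x4 = line_max(f, x3, h2)    # Step 6
    x5 = line_max(f, x4, h3)    # Step 7
    h4 = x5 - x3                # Step 8: h3 and h4 are conjugate
    return line_max(f, x5, h4)  # Step 9
```

For a quadratic such as f(x,y) = 2xy + 2x − x² − 2y², a single cycle lands on the maximum at (2, 1), as the conjugacy argument above guarantees.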
Steepest Ascent Method

Concept of Steepest Ascent
Suppose we have a function f(x,y). How do we find the maximum?
Concept of Steepest Ascent, Step 1
Pick a starting point x₁.

Concept of Steepest Ascent, Step 2
Calculate the gradient ∇f(x₁), because it points toward increasing values of f.
Concept of Steepest Ascent, Step 3
Calculate the next point x₂ in the direction of the gradient. But how far? Moving too far along the gradient may cause the algorithm to go unstable and never find the maximum. Not moving far enough will require many iterations to find the maximum. For now, let's choose a constant step size Γ = 0.5:
x₂ = x₁ + Γ∇f(x₁)
Concept of Steepest Ascent, Step 4
Calculate the gradient ∇f(x₂) at the second point.

Concept of Steepest Ascent, Step 5
Calculate the next point x₃ along the gradient.
Concept of Steepest Ascent, Step 6
Calculate the gradient ∇f(x₃) at point x₃. The steep gradient made us overshoot the maximum. The choice of Γ is important!

Steepest Ascent Method
We wish to minimize the number of times the gradient is calculated. Let's calculate it once and then move in that direction until f(x) stops increasing. At that point, we reevaluate the gradient and repeat in the new direction.

Algorithm
1. Pick a starting point x.
2. Calculate the gradient at this point: g = ∇f(x).
3. If the gradient is zero, or less than some tolerance, we are done!
4. Otherwise, move in small increments in the direction of g until f(x) stops increasing: x = x + Γg.
   Note: if we treat the search along this direction as a 1D optimization, we can improve efficiency greatly.
5. Go back to Step 2.
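The five steps above can be sketched as follows (`steepest_ascent` and the parameter names are my own; this is the crude fixed-increment march, not the 1D-optimization refinement mentioned in the note):

```python
import numpy as np

def steepest_ascent(f, grad, x0, gamma=0.1, tol=1e-3, max_iter=1000):
    """Steepest ascent: march along the current gradient g in fixed
    increments gamma*g until f stops increasing, then recompute g."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)                        # Step 2
        if np.linalg.norm(g) < tol:        # Step 3: converged
            break
        while f(x + gamma * g) > f(x):     # Step 4: march while f increases
            x = x + gamma * g
    return x                               # Step 5 is the outer loop itself
```

The `max_iter` cap guards against functions where a fixed Γ never settles; for well-behaved functions the gradient norm drops below the tolerance first.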
Choice of Γ (1 of 5)
Values of Γ that are too large can cause the algorithm to jump away from the maximum. At best, the algorithm converges on a different maximum.

Choice of Γ (2 of 5)
Values of Γ that are too large can also cause the algorithm to oscillate about the maximum and never converge to it.
Choice of Γ (3 of 5)
For this example, Γ = 0.5 seems like a very good choice. The best choice of Γ depends on the properties of the function. If the function varies wildly, choose a small Γ. If the function is rather well behaved, larger values of Γ can converge faster.

Choice of Γ (4 of 5)
Γ = 0.1 seems like another good choice. This is typically the value that I choose at first if nothing else is known.
Choice of Γ (5 of 5)
Small values of Γ converge very slowly. This can be costly when evaluating the function is slow.

Example (1 of 6)
Problem: Find the maximum of the following function.
f(x,y) = 2xy + 2x − x² − 2y²,  0 ≤ x ≤ 4,  −0.5 ≤ y ≤ 2.5

Solution
Step 1: Make an initial guess at the position.
x₁ = [2; 0]
Example (2 of 6)
Step 2: Calculate the gradient.
∇f(x,y) = (∂f/∂x)â_x + (∂f/∂y)â_y = (2y + 2 − 2x)â_x + (2x − 4y)â_y
∇f(2,0) = [2(0) + 2 − 2(2)]â_x + [2(2) − 4(0)]â_y = −2â_x + 4â_y
Step 3: The gradient is not zero, so we are not done.

Example (3 of 6)
Step 4: Move in the direction of the gradient. Choose Γ = 0.1:
x₂ = x₁ + Γg₁ = [2; 0] + 0.1·[−2; 4] = [1.8; 0.4]
Example (4 of 6)
Step 5: Go back to Step 2.
Step 2: Calculate the gradient at the second point.
g₂ = [2y₂ + 2 − 2x₂; 2x₂ − 4y₂] = [2(0.4) + 2 − 2(1.8); 2(1.8) − 4(0.4)] = [−0.8; 2.0]
Step 3: We are still not done!

Example (5 of 6)
Step 4: Calculate the next point.
x₃ = x₂ + Γg₂ = [1.8; 0.4] + 0.1·[−0.8; 2.0] = [1.72; 0.6]
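The iterates in this example can be checked with a short script (a sketch; the analytic gradient is the one derived in Step 2, and the variable names are mine):

```python
import numpy as np

# f(x, y) = 2xy + 2x - x^2 - 2y^2 and its analytic gradient
grad = lambda v: np.array([2*v[1] + 2 - 2*v[0], 2*v[0] - 4*v[1]])

gamma = 0.1                        # step size chosen in Step 4
x = np.array([2.0, 0.0])           # initial guess x1
history = [x.copy()]
while True:
    step = gamma * grad(x)
    if np.linalg.norm(step) < 1e-3:   # stop when the update is tiny
        break
    x = x + step
    history.append(x.copy())
# history[1] ≈ [1.8, 0.4] and history[2] ≈ [1.72, 0.6], matching the
# slides; the final x is approximately [2.0, 1.0]
```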
Example (6 of 6)
Step 5: Go back to Step 2. And so on. After 77 iterations (tolerance 10⁻³), the answer converges to
x ≈ [2.00; 1.00]

Newton's Method for Multiple Variables
Newton's Method with Multiple Variables (1 of 2)
We can extend Newton's method to multiple variables using the Hessian. We can write a second-order Taylor series for f(x) near x = xᵢ:
f(x) ≈ f(xᵢ) + ∇f(xᵢ)ᵀ(x − xᵢ) + ½(x − xᵢ)ᵀHᵢ(x − xᵢ)
At an extremum, ∇f(x) = 0. To find this point, we take the gradient of the expression above:
∇f(x) = ∇f(xᵢ) + Hᵢ(x − xᵢ)

Newton's Method with Multiple Variables (2 of 2)
We set the gradient to zero and solve for x:
∇f(xᵢ) + Hᵢ(x − xᵢ) = 0
Hᵢ(x − xᵢ) = −∇f(xᵢ)
x = xᵢ − Hᵢ⁻¹∇f(xᵢ)
From this, the update equation for Newton's method is
xᵢ₊₁ = xᵢ − Hᵢ⁻¹∇f(xᵢ)
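The update equation can be sketched as follows (`newton_multivariable` is my own name; `np.linalg.solve` is applied to Hᵢ·Δx = −∇f(xᵢ) rather than forming Hᵢ⁻¹ explicitly):

```python
import numpy as np

def newton_multivariable(grad, hess, x0, tol=1e-8, max_iter=50):
    """Newton's method for multiple variables:
    x_{i+1} = x_i - H_i^{-1} * grad f(x_i)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:    # gradient ~ 0: at an extremum
            break
        x = x - np.linalg.solve(hess(x), g)
    return x
```

Because the example function f(x,y) = 2xy + 2x − x² − 2y² is exactly quadratic, a single Newton update jumps straight to its maximum at (2, 1) from any starting point.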