Optimization Part 1 P. Agius L2.1, Spring 2008


Contents: Terms and definitions, formulating the problem. Unconstrained optimization conditions: FONC, SONC, SOSC. Searching for the solution: which direction? what step size? Constrained optimization: the Lagrangian function, the dual, KKT conditions. Some optimization methods.

What is Optimization? "...arrive at the best possible decision in any given set of circumstances." [G.R. Walsh, Methods of Optimization] "Optimization models attempt to express, in mathematical terms, the goal of solving a problem in the best way." [Nash and Sofer, Linear and Nonlinear Programming]

Notation: min_x f(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ R^n. f is the objective function; g and h define the constraints. Note that min_x f(x) is equivalent to max_x −f(x). In unconstrained optimization, there are no constraints.

min_x f(x) subject to g(x) ≤ 0, h(x) = 0, x ∈ R^n. Here g(x) ≤ 0 are inequality constraints and h(x) = 0 are equality constraints. Constraints may be linear (e.g. x + 2 = 5, x1 − x2 > 6) or nonlinear (e.g. x1² + x2² ≤ 5). A function f is convex on a set S if f(αx + (1−α)y) ≤ αf(x) + (1−α)f(y) for all 0 ≤ α ≤ 1 and for all x, y ∈ S.

A function f is convex on a set S if f(αx + (1−α)y) ≤ αf(x) + (1−α)f(y) for all 0 ≤ α ≤ 1 and for all x, y ∈ S. [Figure: the chord value αf(x) + (1−α)f(y) lies above f(αx + (1−α)y) for points between x and y.] Convex quadratic function (more on this later): min_x 0.5 xᵀQx.
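
A small worked check of this definition (my own example, not from the slides): take f(x) = x² on S = R. For 0 ≤ α ≤ 1,

\[
\alpha f(x) + (1-\alpha) f(y) - f\big(\alpha x + (1-\alpha) y\big)
= \alpha x^2 + (1-\alpha) y^2 - \big(\alpha x + (1-\alpha) y\big)^2
= \alpha (1-\alpha)(x - y)^2 \ \ge\ 0 ,
\]

so the defining inequality holds and f(x) = x² is convex on R.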

Formulating the Problem. Example: Find two nonnegative numbers whose sum is 9 such that the product of one number and the square of the other number is a maximum. Let the numbers be x and 9 − x. Then: max_x x(9 − x)² subject to x ≥ 0.
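
As a quick numerical sanity check (not part of the slides; the grid and its range are my assumptions), one can scan the objective on [0, 9] in MATLAB/Octave:

x = linspace(0, 9, 1001);          % candidate values of the first number
f = x .* (9 - x).^2;               % objective: one number times the square of the other
[fbest, i] = max(f);
fprintf('largest product %.2f attained near x = %.2f\n', fbest, x(i));

The scan points to x = 3 (so the numbers are 3 and 6), matching the calculus solution on the later slides.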

Unimodal functions. A function is unimodal if it has only one maximum or minimum. [Figures: two example plots of unimodal functions.]

Stationary points: f′(x*) = 0. For unconstrained problems, a stationary point is a point where the first derivatives of f are equal to zero. [Figures: two example plots with their stationary points marked.]

Local and Global minimizers. A local minimizer is a point x* that satisfies the condition f(x*) ≤ f(x) for all x in a search region. A global minimizer is a point x* that satisfies the condition f(x*) ≤ f(x) for all x. A STRICT local/global minimizer satisfies f(x*) < f(x). [Figures: two plots contrasting a local and a global minimizer.]

Unconstrained Optimization - FONC. A [strict] local minimizer is a point x* that satisfies the condition f(x*) ≤ f(x) [f(x*) < f(x)] for all x in a search region. If x* is a local minimizer then it satisfies f′(x*) = 0 (first derivative equal to zero). This is known as the First Order Necessary Condition. Example: f(x) = x², x* = 0: f′(x) = 2x, f′(x*) = 0.

Back to the previous example (constrained example). Let the numbers be x and 9 − x: max_x x(9 − x)² subject to x ≥ 0. Since maximizing x(9 − x)² is equivalent to minimizing −x(9 − x)², set the first derivative of −x(9 − x)² to zero (dropping a common factor of 3): −x² + 12x − 27 = 0. Solutions: x = 3 or x = 9.

Unconstrained Optimization - SONC. For one-dimensional problems, define ∇²f(x*) (the second derivative of f w.r.t. x) to be positive semi-definite (psd) if ∇²f(x*) ≥ 0 (positive definite if ∇²f(x*) > 0). If x* is a local minimizer then ∇²f(x*) is psd; this is the Second Order Necessary Condition. Example: f(x) = x², x* = 0: ∇²f(x) = 2, so ∇²f(x*) = 2 ≥ 0 and the condition holds.

Example continued. Setting the (scaled) first derivative of −x(9 − x)² to zero: −x² + 12x − 27 = 0, with solutions x = 3 or x = 9. Second derivative of the same expression: −2x + 12. At x = 3: −2·3 + 12 = 6 > 0; at x = 9: −2·9 + 12 = −6 < 0. So x = 3 minimizes −x(9 − x)², i.e. it maximizes the product, while x = 9 does not. [Figure: plot of x(9 − x)² on [−2, 12].]

Unconstrained Optimization - SOSC. If f′(x*) = 0 and ∇²f(x*) is positive definite, then x* is a strict local minimizer of f. This is the Second Order Sufficiency Condition. Example: f(x) = x², x* = 0: f′(x*) = 0 and ∇²f(x) = 2 > 0, so x* = 0 is a strict local minimizer.

A two-dimensional example for SOSC. We need a little bit of matrix notation and definitions.
Matrix notation: http://www.purplemath.com/modules/matrices2.htm
Matrix operations: http://math.jct.ac.il/~naiman/linalg/lay/slides/c02/sec2_1ov.pdf
Vector dot products (from Wikipedia): http://en.wikipedia.org/wiki/dot_product
Some notation that I will use: [a b c d]′ is the transpose of the vector containing elements a, b, c and d (on occasion, T will be used to indicate the transpose). In matrices, ';' indicates a new row of elements.
Classwork Q1: If v1 = [5 4 7 1]′ and v2 = [3 1 2 9]′, find ⟨v1, v2⟩, ‖v1‖ and ‖v2‖.
Q2: Find the product of the following two matrices: [1 1 0; 2 1 5; 3 2 2] and [3 1 1; -2 0 8; -1 -3 0].
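
A short MATLAB/Octave sketch to check the classwork answers (the variable names are mine):

v1 = [5 4 7 1]';  v2 = [3 1 2 9]';
ip = dot(v1, v2);                   % inner product <v1, v2>
n1 = norm(v1);  n2 = norm(v2);      % Euclidean norms of v1 and v2
A  = [1 1 0; 2 1 5; 3 2 2];
B  = [3 1 1; -2 0 8; -1 -3 0];
P  = A*B;                           % matrix product for Q2
disp(ip); disp([n1 n2]); disp(P);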

A two-dimensional example for SOSC. Consider the function f(x1, x2) = (1/3)x1³ + (1/2)x1² + 2x1x2 + (1/2)x2² − x2 + 9. Stationary point: ∇f(x) = ( x1² + x1 + 2x2 ; 2x1 + x2 − 1 ) = 0, where the first component is the derivative w.r.t. x1 and the second is the derivative w.r.t. x2. Substituting the second component, x2 = 1 − 2x1, into the first component, we get two solutions: x_a = [1 −1]ᵀ and x_b = [2 −3]ᵀ.
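
Working the substitution out explicitly:

\[
x_1^2 + x_1 + 2(1 - 2x_1) \;=\; x_1^2 - 3x_1 + 2 \;=\; (x_1 - 1)(x_1 - 2) \;=\; 0 ,
\]

so x1 = 1 or x1 = 2, and x2 = 1 − 2x1 then gives x_a = [1 −1]ᵀ and x_b = [2 −3]ᵀ.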

SOSC example continued. The second derivative, also known as the Hessian: ∇²f(x) = ( 2x1 + 1  2 ; 2  1 ), where the (1,1) entry is the derivative of the 1st component of ∇f w.r.t. x1, the (1,2) entry the derivative of the 1st component w.r.t. x2, the (2,1) entry the derivative of the 2nd component w.r.t. x1, and the (2,2) entry the derivative of the 2nd component w.r.t. x2. Is ∇²f(x_a) psd? Is ∇²f(x_b) psd?

SOSC example continued. A matrix H is said to be positive semi-definite if either of the following is true: xᵀHx ≥ 0 for all vectors x, or all of the eigenvalues of H are ≥ 0 (more on eigenvalues later on). ∇²f(x_a) = ( 3 2 ; 2 1 ): try x = [−1 1]ᵀ and we get xᵀHx = 0, so it is not positive definite and SOSC fails at x_a. ∇²f(x_b) = ( 5 2 ; 2 1 ): this is positive definite, so x_b is a local minimizer.
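
A quick way to verify both claims, using the determinant and trace of a 2×2 symmetric matrix (my addition, anticipating the eigenvalue discussion):

\[
\nabla^2 f(x_a) = \begin{pmatrix} 3 & 2 \\ 2 & 1 \end{pmatrix}:
\quad \det = 3\cdot 1 - 2\cdot 2 = -1 < 0 ,
\]

so its two eigenvalues have opposite signs and the matrix is indefinite (in particular, not psd), while

\[
\nabla^2 f(x_b) = \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix}:
\quad \det = 5\cdot 1 - 2\cdot 2 = 1 > 0 , \quad \mathrm{trace} = 6 > 0 ,
\]

so both eigenvalues are positive and the matrix is positive definite.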

In MATLAB: ezsurfc('(1/3)*x^3 + 0.5*x^2 + 2*x*y + 0.5*y^2 - y + 9', [-5,5,-5,5], 35) [Figure: surface and contour plot of f over −5 ≤ x, y ≤ 5.]

Searching for x*. Let f(x) be unimodal. We wish to find x* by function evaluations and comparisons. How should we choose values of x that bring us as close as possible to x*? Can we choose those values such that the number of iterations is minimal? Which direction (p)? How big a step (α)? Steepest descent, Golden Section Search, Newton's Method?

Steepest descent (p). At the k-th iteration, the direction is defined to be p_k = −∇f(x_k), and then a line search is used to find x_{k+1} = x_k + α_k p_k, where α_k is the step size.

Exact line search (α). Choose α_k > 0 that is a local minimizer of F(α) = f(x_k + α p_k). Note that F′(α) = p_kᵀ ∇f(x_k + α p_k). Example: Consider the function f(x) = x² − 6x + 5 and suppose x_k = 4. Using steepest descent, p_k = ??? Using exact line search, α_k = ??? Answer: p_k = −2, α_k = 0.5.
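
A minimal MATLAB/Octave sketch of that single steepest-descent step (the closed-form α below relies on f being quadratic, which is my shortcut rather than a general-purpose line search):

f  = @(x) x.^2 - 6*x + 5;
df = @(x) 2*x - 6;
x  = 4;                              % starting point from the slide
p  = -df(x);                         % steepest-descent direction p_k = -f'(x_k)
alpha = (3 - x)/p;                   % solves F'(alpha) = p*(2*(x + alpha*p) - 6) = 0
xnew  = x + alpha*p;
fprintf('p = %g, alpha = %g, new x = %g, f(new x) = %g\n', p, alpha, xnew, f(xnew));

One step lands on x = 3, the minimizer of this quadratic.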

Fibonacci Sequence: 1 1 2 3 5 8 13 21 34 55 89 ..., with F_0 = F_1 = 1 and F_n = F_{n−1} + F_{n−2} for n ≥ 2.

Fibonacci numbers and Nature FLOWERS: On many plants, the number of petals is a Fibonacci number: buttercups have 5 petals; lilies and iris have 3 petals; some delphiniums have 8; corn marigolds have 13 petals; some asters have 21 whereas daisies can be found with 34, 55 or even 89 petals. Rabbits are able to mate at the age of one month so that at the end of its second month a female can produce another pair of rabbits. Suppose that our rabbits never die and that the female always produces one new pair (one male, one female) every month from the second month on. The puzzle that Fibonacci posed was... How many pairs will there be in one year? http://www.mcs.surrey.ac.uk/personal/r.knott/fibonacci/fibnat.html

Unravelling Fibonacci. Consider the sequence of the ratios of consecutive Fibonacci terms, r_n = F_n / F_{n−1}. Since F_n = F_{n−1} + F_{n−2}, we have r_n = 1 + 1/r_{n−1}, so in the limit the ratio r satisfies r = 1 + 1/r, i.e. r² − r − 1 = 0.

Unraveling more of Fibonacci. Solve the above equation; what do you get? r = (1 + √5)/2, aka the Golden Ratio.
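
A quick MATLAB/Octave check that the ratios do approach (1 + √5)/2 ≈ 1.618 (my own snippet; the 20 terms are an arbitrary choice):

F = [1 1];
for n = 3:20
    F(n) = F(n-1) + F(n-2);          % Fibonacci recurrence
end
ratios = F(2:end) ./ F(1:end-1);     % consecutive ratios F_n / F_(n-1)
fprintf('last ratio = %.6f, golden ratio = %.6f\n', ratios(end), (1 + sqrt(5))/2);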

The Golden Ratio. Two lengths a > b are in the golden ratio when (a + b)/a = a/b. Set b = 1: what do you get?

The other solution: a/b = (1 + √5)/2 ≈ 1.618 and b/a = (√5 − 1)/2 ≈ 0.618.

Golden Section search. Normalize so that x4 − x1 = 1. Goal: given a convex function, we want to shrink the interval as much as possible at each iteration, until we close in on x*. Suppose that at some iteration in our search we have points x1 < x2 < x3 < x4 with f(x2) < f(x3), so the interval [x1, x3] contains x*.

Golden Section search (continued). At the next iteration, define x1′, x2′, x3′, x4′ such that x1′ = x1, x3′ = x2, x4′ = x3. Place the interior points a fraction τ of the way across the interval, so x3 = x1 + τ(x4 − x1) and x2 = x4 − τ(x4 − x1). Requiring the reused point to sit at the same relative position, x3′ = x1′ + τ(x4′ − x1′), gives x4 − τ(x4 − x1) = x1 + τ(x1 + τ(x4 − x1) − x1) = x1 + τ²(x4 − x1), hence (x4 − x1) − τ(x4 − x1) = τ²(x4 − x1), i.e. τ² + τ − 1 = 0.
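
A compact MATLAB/Octave sketch of the resulting method, run here on the earlier function f(x) = x² − 6x + 5 (the bracket [0, 8] and the tolerance are my assumptions for illustration):

f   = @(x) x.^2 - 6*x + 5;
a = 0;  b = 8;  tol = 1e-6;
tau = (sqrt(5) - 1)/2;                  % ~0.618, the positive root of tau^2 + tau - 1 = 0
x2 = b - tau*(b - a);  f2 = f(x2);      % interior points, a < x2 < x3 < b
x3 = a + tau*(b - a);  f3 = f(x3);
while (b - a) > tol
    if f2 < f3                          % x* lies in [a, x3]; reuse x2 as the new x3
        b = x3;  x3 = x2;  f3 = f2;
        x2 = b - tau*(b - a);  f2 = f(x2);
    else                                % x* lies in [x2, b]; reuse x3 as the new x2
        a = x2;  x2 = x3;  f2 = f3;
        x3 = a + tau*(b - a);  f3 = f(x3);
    end
end
fprintf('x* is approximately %.6f\n', (a + b)/2);

Each pass shrinks the bracket by the factor τ ≈ 0.618 and needs only one new function evaluation, which is exactly what the condition τ² + τ − 1 = 0 buys.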

Newton's Method. Choose an initial point x_0 and set the precision to ε. x ← x_0; repeat x ← x − [∇²f(x)]⁻¹ ∇f(x) until ‖∇f(x)‖ < ε. In the classical Newton method, the step size α = 1. In other versions, x ← x − α_k [∇²f(x)]⁻¹ ∇f(x). In quasi-Newton methods, an approximation is used for ∇²f(x).

Newton's Method, a simple 1-dim example. f(x) = x² − 3x + 2. Apply Newton's method at x_0 = 1: f′(x) = 2x − 3, f′(1) = −1, ∇²f(x) = 2, so x ← 1 − 0.5·(−1) = 1.5. Now try x_0 = 2. [Figure: plot of f.]

Newton's Method, 2-dim example. Consider the function f(x1, x2) = (1/3)x1³ + (1/2)x1² + 2x1x2 + (1/2)x2² − x2 + 9, with ∇f(x) = ( x1² + x1 + 2x2 ; 2x1 + x2 − 1 ) and ∇²f(x) = ( 2x1 + 1  2 ; 2  1 ). At x = [2 3]ᵀ: ∇f(x) = ( 0 ; 5 ), ∇²f(x) = ( 5 2 ; 2 1 ). [∇²f(x)]⁻¹ = ???

Inverse of a matrix. For a square matrix of order 2, A = ( a b ; c d ) with ad − bc ≠ 0, the inverse is A⁻¹ = (1/(ad − bc)) ( d −b ; −c a ). For n > 2, things get more complicated. Here, ∇²f(x) = ( 5 2 ; 2 1 ) has determinant 5·1 − 2·2 = 1, which gives [∇²f(x)]⁻¹ = ( 1 −2 ; −2 5 ).
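
Putting the pieces together, a minimal MATLAB/Octave sketch of Newton's method on this two-dimensional example (the gradient and Hessian come from the slides; the starting point [3; 0] and the tolerance are my choices):

f    = @(x) (1/3)*x(1)^3 + 0.5*x(1)^2 + 2*x(1)*x(2) + 0.5*x(2)^2 - x(2) + 9;
grad = @(x) [x(1)^2 + x(1) + 2*x(2);  2*x(1) + x(2) - 1];
hess = @(x) [2*x(1) + 1, 2;  2, 1];
x   = [3; 0];                         % assumed starting point
tol = 1e-8;                           % precision
while norm(grad(x)) >= tol
    x = x - hess(x) \ grad(x);        % classical Newton step (alpha = 1)
end
fprintf('x* = [%.4f  %.4f], f(x*) = %.4f\n', x(1), x(2), f(x));

From this starting point the iterates converge to x_b = [2 −3]ᵀ, the local minimizer identified in the SOSC example.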

Steepest vs Newton. Steepest descent (19th century, Cauchy): simple; does not require second derivatives; low cost per iteration; slower (linear) rate of convergence; convergence can be so slow that sometimes x_{k+1} − x_k falls below computer precision, so even after many iterations the optimal solution may never be reached. Newton (17th century, Newton): expensive iterations, requiring second derivatives and their inverses; it can sometimes fail; fast (quadratic) rate of convergence.