Computational Optimization. Mathematical Programming Fundamentals 1/25 (revised)

"If you don't know where you are going, you probably won't get there." - from some book I read in eighth grade. "If you do get there, you won't know it." - Dr. Bennett's amendment. Mathematical programming theory tells us how to formulate a model, strategies for solving the model, how to know when we have found an optimal solution, and how hard it is to solve the model. Let's start with the basics.

Line Segment. Let x ∈ R^n and y ∈ R^n. The points on the line segment joining x and y are { z : z = λx + (1−λ)y, 0 ≤ λ ≤ 1 }.

Convex Sets. A set S is convex if the line segment joining any two points in the set is also in the set, i.e., for any x, y ∈ S, λx + (1−λ)y ∈ S for all 0 ≤ λ ≤ 1. (Figure: five example sets, labeled convex, not convex, convex, not convex, not convex.)

Favorite Convex Sets. The circle with center c and radius r: { x : ‖x − c‖ ≤ r }. Linear equalities (planes): for A ∈ R^{m×n} and b ∈ R^m, { x ∈ R^n : Ax = b }. Linear inequalities (polyhedra): for A ∈ R^{m×n} and b ∈ R^m, { x ∈ R^n : Ax ≤ b }.

Convex Sets. Is the intersection of two convex sets convex? Yes. Is the union of two convex sets convex? No: for example, the union of two disjoint intervals is not convex, since the segment joining a point in one interval to a point in the other leaves the union.

Convex Functions. A function f is (strictly) convex on a convex set S if and only if for any x, y ∈ S, f(λx + (1−λ)y) ≤ (<) λf(x) + (1−λ)f(y) for all 0 ≤ λ ≤ 1 (strict inequality for 0 < λ < 1 and x ≠ y in the strict case). (Figure: the chord from (x, f(x)) to (y, f(y)) lies above the graph of f at λx + (1−λ)y.)

Concave Functions. A function f is (strictly) concave on a convex set S if and only if −f is (strictly) convex on S. (Figure: graphs of f and −f.)

(Strictly) Convex, Concave, or none of the above? (Figure: five example graphs; answers: none of the above, concave, convex, concave, strictly convex.)

Favorite Convex Functions. Linear functions: f(x) = w'x = Σ_{i=1}^n w_i x_i, where x ∈ R^n; for example, f(x_1, x_2) = x_1 + x_2. Certain quadratic functions, depending on the choice of Q (the Hessian matrix): f(x) = x'Qx + w'x + c; for example, f(x_1, x_2) = x_1^2 + x_2^2.
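
A quadratic x'Qx + w'x + c is convex exactly when the symmetric part of Q is positive semidefinite, since x'Qx depends only on that part. A minimal sketch of a numerical check with NumPy (the example matrices are made up for illustration):

```python
import numpy as np

def is_convex_quadratic(Q, tol=1e-10):
    """x'Qx + w'x + c is convex iff the symmetric part of Q is positive semidefinite."""
    Qs = 0.5 * (Q + Q.T)                      # x'Qx depends only on the symmetric part of Q
    return bool(np.all(np.linalg.eigvalsh(Qs) >= -tol))

print(is_convex_quadratic(np.array([[2.0, 0.0], [0.0, 1.0]])))   # True: convex bowl
print(is_convex_quadratic(np.array([[1.0, 0.0], [0.0, -1.0]])))  # False: saddle, not convex
```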

Convexity of function affects optimization algorithm

Convexity of constraints affects optimization algorithm: min f(x) subject to x ∈ S. (Figure: the direction of steepest descent shown for a convex S and for a nonconvex S.)

Convex Program. min f(x) subject to x ∈ S, where f and S are convex. Convexity makes optimization nice: many practical problems are convex, and convex programs are used as subproblems for nonconvex programs.

Theorem: Global Solution of a Convex Program. If x* is a local minimizer of a convex programming problem, then x* is also a global minimizer. Furthermore, if the objective is strictly convex, then x* is the unique global minimizer. Proof: by contradiction (sketch: a point y with f(y) < f(x*)).

Proof by contradiction. Suppose x* is a local but not a global minimizer, i.e., there exists y ∈ S such that f(y) < f(x*). Then for all 0 < ε < 1, f(εx* + (1−ε)y) ≤ εf(x*) + (1−ε)f(y) < εf(x*) + (1−ε)f(x*) = f(x*). As ε → 1 the point εx* + (1−ε)y gets arbitrarily close to x* while having a lower objective value, contradicting the assumption that x* is a local minimizer. Try the uniqueness part yourself for the strictly convex case.

Problems with a nonconvex objective. min f(x) subject to x ∈ [a, b]. If f is strictly convex, the problem has a unique global minimum; if f is not convex, the problem can have two (or more) local minima. (Figure: two plots over [a, b], one strictly convex with a single minimizer x*, one nonconvex with two local minima.)

Problems with a nonconvex set. min f(x) subject to x ∈ [a, b] ∪ [c, d]. (Figure: a function over the two disjoint intervals, with a local minimum at x in [a, b] and the global minimum x* in [c, d].)

Multivariate Calculus. For x ∈ R^n, f(x) = f(x_1, x_2, x_3, x_4, ..., x_n). The gradient of f: ∇f(x) = ( ∂f(x)/∂x_1, ∂f(x)/∂x_2, ..., ∂f(x)/∂x_n )'. The Hessian of f: ∇²f(x) is the n×n matrix of second partial derivatives, with entries [∇²f(x)]_ij = ∂²f(x)/∂x_i∂x_j, i.e., its first row is ( ∂²f(x)/∂x_1², ∂²f(x)/∂x_1∂x_2, ..., ∂²f(x)/∂x_1∂x_n ) and its last row is ( ∂²f(x)/∂x_n∂x_1, ..., ∂²f(x)/∂x_n² ).

For example: f(x) = x_1^4 + x_2^3 + e^{3x_1} + 4x_1x_2. Then ∇f(x) = ( 4x_1^3 + 3e^{3x_1} + 4x_2, 3x_2^2 + 4x_1 )' and ∇²f(x) = [ 12x_1^2 + 9e^{3x_1}, 4 ; 4, 6x_2 ]. At x = [0, 1]': ∇f(x) = (7, 3)' and ∇²f(x) = [ 9, 4 ; 4, 6 ].
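
A hand-computed gradient like the one above can be sanity-checked by finite differences. A minimal sketch, using the example function as written above (the step size h is an arbitrary choice):

```python
import numpy as np

def f(x):
    # example function from the slide above: x1^4 + x2^3 + exp(3*x1) + 4*x1*x2
    return x[0]**4 + x[1]**3 + np.exp(3 * x[0]) + 4 * x[0] * x[1]

def num_grad(f, x, h=1e-6):
    """Central-difference approximation of the gradient."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

print(num_grad(f, np.array([0.0, 1.0])))   # approximately [7, 3], matching the hand computation
```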

Quadratic Functions. Form: for x ∈ R^n, Q ∈ R^{n×n}, b ∈ R^n, f(x) = (1/2) x'Qx − b'x = (1/2) Σ_{i=1}^n Σ_{j=1}^n Q_ij x_i x_j − Σ_{j=1}^n b_j x_j. Gradient: ∂f(x)/∂x_k = Q_kk x_k + (1/2) Σ_{i≠k} Q_ik x_i + (1/2) Σ_{j≠k} Q_kj x_j − b_k = Σ_j Q_kj x_j − b_k, assuming Q symmetric. Hence ∇f(x) = Qx − b and ∇²f(x) = Q.
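
A minimal sketch verifying ∇f(x) = Qx − b for f(x) = (1/2)x'Qx − b'x against a finite-difference gradient (the matrices and the test point are generated arbitrarily):

```python
import numpy as np

np.random.seed(0)
n = 4
A = np.random.randn(n, n)
Q = A @ A.T                      # symmetric Q
b = np.random.randn(n)
x = np.random.randn(n)

f = lambda z: 0.5 * z @ Q @ z - b @ z

h = 1e-6
fd_grad = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(n)])
print(np.allclose(fd_grad, Q @ x - b, atol=1e-4))   # True: gradient of (1/2)x'Qx - b'x is Qx - b
```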

Taylor Series Expansion about x* - 1D Case. Let x = x* + p. Then f(x) = f(x* + p) = f(x*) + p f'(x*) + (1/2!) p^2 f''(x*) + (1/3!) p^3 f'''(x*) + ... + (1/n!) p^n f^(n)(x*) + ... Equivalently, f(x) = f(x*) + (x − x*) f'(x*) + (1/2!)(x − x*)^2 f''(x*) + (1/3!)(x − x*)^3 f'''(x*) + ... + (1/n!)(x − x*)^n f^(n)(x*) + ...

Taylor Series Example. Let f(x) = exp(−x); compute the Taylor series expansion about x* = 0. f(x) = f(x*) + (x − x*) f'(x*) + (1/2!)(x − x*)^2 f''(x*) + (1/3!)(x − x*)^3 f'''(x*) + ... + (1/n!)(x − x*)^n f^(n)(x*) + ... = e^{−x*} − (x − x*) e^{−x*} + ((x − x*)^2/2!) e^{−x*} − ((x − x*)^3/3!) e^{−x*} + ... + (−1)^n ((x − x*)^n/n!) e^{−x*} + ... = 1 − x + x^2/2! − x^3/3! + ... + (−1)^n x^n/n! + ...
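
A small sketch comparing the truncated series 1 − x + x^2/2! − ... against exp(−x); the evaluation point and the numbers of terms are arbitrary choices:

```python
import math

def taylor_exp_neg(x, n_terms):
    """Partial sum of the Taylor series of exp(-x) about x* = 0."""
    return sum((-x) ** k / math.factorial(k) for k in range(n_terms))

x = 0.5
for n in (2, 4, 8):
    print(n, taylor_exp_neg(x, n), math.exp(-x))
# the partial sums approach exp(-0.5) ~ 0.6065 as more terms are kept
```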

First Order Taylor Series. Let x = x* + p. Approximation: f(x) = f(x* + p) = f(x*) + ∇f(x*)'p + ‖p‖ α(x*, p), where lim_{p→0} α(x*, p) = 0. This says that a linear approximation of a function works well locally: f(x) ≈ f(x*) + (x − x*)'∇f(x*). (Figure: f and its linear approximation at x*.)

Second Order Taylor Series. Let x = x* + p. Approximation: f(x) = f(x* + p) = f(x*) + ∇f(x*)'p + (1/2) p'∇²f(x*)p + ‖p‖^2 α(x*, p), where lim_{p→0} α(x*, p) = 0. This says that a quadratic approximation of a function works even better locally: f(x) ≈ f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*). (Figure: f and its quadratic approximation at x*.)

Theorem.1 Taylor s Theorem version Suppose f is cont diff, f ( x+ p) = f( x) + f( x+ tp)' p for some t [0,1]. If f is twice cont. diff, f ( x+ p) = f( x) + f( x)' p+ p' f( x+ tp)' p for some t [0,1]. 1 Also called Mean Value Theorem

Taylor Series Approximation Exercise. Consider the function f(x_1, x_2) = x_1^3 + 5x_1^2 x_2 + 7x_1 x_2^2 + 2x_2^2 and x* = [−2, 3]'. Compute the gradient and Hessian. What is the first order TSA about x*? What is the second order TSA about x*? Evaluate both TSAs at y = [−1.9, 3.2]' and compare with f(y).

Exercise. Function: f(x_1, x_2) = x_1^3 + 5x_1^2 x_2 + 7x_1 x_2^2 + 2x_2^2. Gradient: ∇f(x) = ______, ∇f(x*) = [ , ]'. Hessian: ∇²f(x) = ______, ∇²f(x*) = ______. First order TSA: g(x) = f(x*) + (x − x*)'∇f(x*) = ______. Second order TSA: h(x) = f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*) = ______. f(y) − g(y) = ______. f(y) − h(y) = ______.

Exercise. Function: f(x_1, x_2) = x_1^3 + 5x_1^2 x_2 + 7x_1 x_2^2 + 2x_2^2, with f(x*) = −56. Gradient: ∇f(x) = ( 3x_1^2 + 10x_1x_2 + 7x_2^2, 5x_1^2 + 14x_1x_2 + 4x_2 )', ∇f(x*) = [15, −52]'. Hessian: ∇²f(x) = [ 6x_1 + 10x_2, 10x_1 + 14x_2 ; 10x_1 + 14x_2, 14x_1 + 4 ], ∇²f(x*) = [ 18, 22 ; 22, −24 ].

Exercise. First order TSA: g(x) = f(x*) + (x − x*)'∇f(x*). Second order TSA: h(x) = f(x*) + (x − x*)'∇f(x*) + (1/2)(x − x*)'∇²f(x*)(x − x*). f(y) − g(y) = −64.811 − (−64.9) = 0.089. f(y) − h(y) = −64.811 − (−64.85) = 0.039.
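
The exercise numbers can be reproduced directly from the gradient and Hessian above; a minimal sketch:

```python
import numpy as np

f  = lambda x: x[0]**3 + 5*x[0]**2*x[1] + 7*x[0]*x[1]**2 + 2*x[1]**2
g_ = lambda x: np.array([3*x[0]**2 + 10*x[0]*x[1] + 7*x[1]**2,
                         5*x[0]**2 + 14*x[0]*x[1] + 4*x[1]])
H_ = lambda x: np.array([[6*x[0] + 10*x[1], 10*x[0] + 14*x[1]],
                         [10*x[0] + 14*x[1], 14*x[0] + 4]])

xs = np.array([-2.0, 3.0])
y  = np.array([-1.9, 3.2])
d  = y - xs

g = f(xs) + d @ g_(xs)             # first order TSA at y  -> -64.9
h = g + 0.5 * d @ H_(xs) @ d       # second order TSA at y -> -64.85
print(f(y), g, h)                  # approximately -64.811, -64.9, -64.85
print(f(y) - g, f(y) - h)          # approximately 0.089, 0.039
```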

General Optimization Algorithm. Specify some initial guess x^0. For k = 0, 1, ...: if x^k is optimal, then stop; determine a descent direction p^k; determine an improved estimate of the solution, x^{k+1} = x^k + λ_k p^k. The last step is a one-dimensional search problem called a line search.
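
One way to realize this template is steepest descent with a backtracking line search. A minimal sketch; the stopping tolerance, backtracking constants, and test function are arbitrary choices, not from the slides:

```python
import numpy as np

def descent(f, grad, x0, tol=1e-6, max_iter=1000):
    """Generic descent loop: stop when the gradient is small, take the negative
    gradient as the descent direction p_k, then backtrack for the step lambda_k."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:                           # "if x_k is optimal then stop"
            break
        p = -g                                                # descent direction p_k
        lam = 1.0
        while f(x + lam * p) > f(x) + 1e-4 * lam * (g @ p):   # backtracking line search
            lam *= 0.5
        x = x + lam * p                                       # x_{k+1} = x_k + lambda_k p_k
    return x

f    = lambda x: (x[0] - 1.0)**2 + 2.0 * (x[1] + 2.0)**2      # made-up convex test function
grad = lambda x: np.array([2.0 * (x[0] - 1.0), 4.0 * (x[1] + 2.0)])
print(descent(f, grad, [5.0, 5.0]))                           # approximately [1, -2]
```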

Descent Directions. If the directional derivative is negative, i.e., ∇f(x)'d < 0, then a line search along d will lead to a decrease in the function. (Figure: an example gradient and a direction d = [0, −1]'.)

Descent directions create decrease. Let d'∇f(x) < 0. Then there exists λ_0 > 0 such that f(x + λd) < f(x) for all 0 < λ ≤ λ_0. Proof: for λ > 0, f(x + λd) = f(x) + λ d'∇f(x) + λ‖d‖ α(x, λd), so [f(x + λd) − f(x)] / λ = d'∇f(x) + ‖d‖ α(x, λd) < 0 for λ sufficiently small, since d'∇f(x) < 0 and α(x, λd) → 0. Hence f(x + λd) − f(x) < 0.

Negative Gradient. An important fact to know is that the negative gradient always points downhill. Let d = −∇f(x), with ∇f(x) ≠ 0. Then there exists λ_0 > 0 such that f(x + λd) < f(x) for all 0 < λ ≤ λ_0. Proof: f(x + λd) = f(x) + λ d'∇f(x) + λ‖d‖ α(x, λd), so [f(x + λd) − f(x)] / λ = d'∇f(x) + ‖d‖ α(x, λd) < 0 for λ sufficiently small, since d'∇f(x) < 0 and α(x, λd) → 0.

Notes on negative gradient. If the gradient is nonzero, then the negative gradient defines a descent direction: by substituting d = −∇f(x), d'∇f(x) = −∇f(x)'∇f(x) = −‖∇f(x)‖^2 < 0 if ∇f(x) ≠ 0.
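
A quick numeric illustration that a small step along −∇f(x) decreases f; the test function, point, and step sizes are arbitrary choices:

```python
import numpy as np

f    = lambda x: x[0]**2 + 3.0 * x[1]**2 + np.sin(x[0])       # made-up smooth test function
grad = lambda x: np.array([2.0 * x[0] + np.cos(x[0]), 6.0 * x[1]])

x = np.array([1.0, -2.0])
d = -grad(x)                                 # negative gradient direction
for lam in (1e-1, 1e-2, 1e-3):
    print(lam, f(x + lam * d) < f(x))        # True: small steps along -grad f decrease f
```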

Directional Derivative. f'(x; d) = lim_{λ→0+} [f(x + λd) − f(x)] / λ = ∇f(x)'d (when f is differentiable). The directional derivative always exists when the function is convex.
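
For differentiable f the one-sided limit can be checked against ∇f(x)'d by a difference quotient; a minimal sketch with an arbitrary test function, point, and direction:

```python
import numpy as np

f    = lambda x: x[0]**2 + np.exp(x[1])       # made-up differentiable test function
grad = lambda x: np.array([2.0 * x[0], np.exp(x[1])])

x = np.array([1.0, 0.5])
d = np.array([1.0, -2.0])

analytic = grad(x) @ d                        # f'(x; d) = grad f(x)' d for differentiable f
for lam in (1e-2, 1e-4, 1e-6):
    fd = (f(x + lam * d) - f(x)) / lam        # one-sided difference quotient from the definition
    print(lam, fd, analytic)                  # fd approaches the analytic value as lam -> 0+
```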

Assignment Read chapter 3 in NW