0.1 N. L. P. — Katta G. Murty, IOE 611 Lecture slides, Introductory Lecture

NONLINEAR PROGRAMMING (NLP) deals with optimization models that contain at least one nonlinear function. NLP does not include everything other than linear: it does not include IP or Discrete Optimization, which need special enumerative techniques. NLP is also called Continuous Optimization or Smooth Optimization. Models are of the following form:

    Minimize θ(x)
    s. to h_i(x) = 0, i = 1 to m
          g_p(x) ≥ 0, p = 1 to t

All functions θ(x), h_i(x), g_p(x) are assumed to be smooth functions. A smooth function is one with all derivatives.
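As a concrete illustration (not part of the original slides), here is a minimal sketch of an NLP in exactly this form, solved numerically; the objective, constraints, and starting point are made-up examples, and SciPy's SLSQP solver is used:

    import numpy as np
    from scipy.optimize import minimize

    theta = lambda x: (x[0] - 2)**2 + (x[1] + 1)**2   # nonlinear objective
    h = lambda x: x[0]**2 + x[1]**2 - 4               # equality constraint h(x) = 0
    g = lambda x: x[0] - x[1]                         # inequality constraint g(x) >= 0

    res = minimize(theta, x0=np.array([1.0, 1.0]), method="SLSQP",
                   constraints=[{"type": "eq", "fun": h},
                                {"type": "ineq", "fun": g}])
    print(res.x, res.fun)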

Going beyond 2nd derivatives is impractical, particularly with many variables. So, for us, smooth means: continuously differentiable if using gradients only; twice continuously differentiable if using Hessians.

Inequality constraints include lower and upper bounds on the decision variables: l_j ≤ x_j ≤ u_j.

∇f(x̄) denotes (∂f(x)/∂x_j) at x = x̄, written as a row vector. It is also called the gradient of f(x) at x̄.

∇²f(x̄) denotes (∂²f(x)/∂x_i ∂x_j), the n×n Hessian matrix of f(x) at x̄.

If g(x) = (g_1(x), ..., g_m(x))^T is a vector of functions, then ∇g(x̄) = (∂g_i(x)/∂x_j : i = 1 to m, j = 1 to n), the m×n Jacobian matrix of g(x) at x̄. Each row vector of ∇g(x̄) is the gradient vector of one function in g(x).
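To make these objects concrete, here is a small sketch (not from the slides) computing a gradient, a Hessian, and a Jacobian symbolically with SymPy, for made-up functions:

    import sympy as sp

    x1, x2 = sp.symbols('x1 x2')
    f = x1**2 * x2 + sp.exp(x2)

    grad = sp.Matrix([f]).jacobian([x1, x2])   # gradient of f as a 1 x 2 row vector
    H = sp.hessian(f, (x1, x2))                # 2 x 2 Hessian of f

    g = sp.Matrix([x1 + x2**2, sp.sin(x1)])    # vector of functions
    J = g.jacobian([x1, x2])                   # 2 x 2 Jacobian; each row is a gradient

    print(grad, H, J, sep='\n')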

QUADRATIC FUNCTIONS: The simplest nonlinear functions.

A Quadratic Form in x = (x_1, ..., x_n)^T is of the form

    f(x) = Σ_(i=1 to n) q_ii x_i² + Σ_(i=1 to n) Σ_(j=i+1 to n) q_ij x_i x_j.

Define the square symmetric matrix D = (d_ij) of order n where

    d_ii = q_ii            for i = 1 to n
    d_ij = d_ji = q_ij / 2 for j > i.

Then f(x) = x^T D x.

Example: h(x) = 81x_1² − 7x_2² + 5x_1x_2 − 6x_1x_3 + 18x_2x_3.

A Quadratic Function is of the form Q(x) = x^T D x + cx + c_0.

A square matrix M of order n, whether symmetric or not, is:
Positive semidefinite (PSD) if x^T M x ≥ 0 for all x ∈ R^n;
Positive definite (PD) if x^T M x > 0 for all x ≠ 0.
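A short sketch (not from the slides) builds the symmetric D for the example h(x) above, with the signs as reconstructed here, and checks f(x) = x^T D x numerically:

    import numpy as np

    # d_ii = q_ii on the diagonal, d_ij = q_ij / 2 off the diagonal
    D = np.array([[81.0,  2.5, -3.0],
                  [ 2.5, -7.0,  9.0],
                  [-3.0,  9.0,  0.0]])

    def h(x):
        x1, x2, x3 = x
        return 81*x1**2 - 7*x2**2 + 5*x1*x2 - 6*x1*x3 + 18*x2*x3

    x = np.random.randn(3)
    assert np.isclose(h(x), x @ D @ x)   # the quadratic form equals x^T D x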

Convex, Concave Functions: A function f(x) defined over R^n, or some convex subset of R^n, is a Convex Function iff for all x_1, x_2 in that set, and all 0 ≤ α ≤ 1,

    f(αx_1 + (1−α)x_2) ≤ αf(x_1) + (1−α)f(x_2).

This inequality is called Jensen's Inequality, after the Danish mathematician who defined it in 1905. Geometric interpretation: the function lies beneath every chord. A function f(x) is Concave if the above inequality holds the other way, i.e., iff −f(x) is convex.

Some Properties of Convex Functions

1. Nonnegative combinations of convex functions are convex.
2. If f(x) is convex on a convex set Γ, then for all x_1, ..., x_r ∈ Γ and α_1, ..., α_r ≥ 0 satisfying Σ_(i=1 to r) α_i = 1,
   f(Σ_(i=1 to r) α_i x_i) ≤ Σ_(i=1 to r) α_i f(x_i).
3. f(x) is convex iff its Epigraph is a convex set.
4. If f(x) is convex, then for all α the set {x : f(x) ≤ α} is a convex set. The converse is not true.
5. The pointwise supremum of convex functions is convex.
6. A differentiable function defined on the real line is convex iff its 1st derivative is a monotonically increasing function, i.e., iff its 2nd derivative is a nonnegative function.
7. A quadratic function defined over R^n is convex over R^n iff its Hessian is PSD. A quadratic function whose Hessian is not PSD may still be convex over a subspace of R^n.
8. A twice continuously differentiable f(x) defined over R^n is convex iff its Hessian is PSD at all x.
9. Gradient Support Inequality: A differentiable function f(x) defined over R^n is convex iff for all x̄,
   f(x) ≥ l(x) = f(x̄) + ∇f(x̄)(x − x̄), for all x.

Lower bound property of LINEARIZATION: The function l(x) defined above is called the linearization of f(x) at x̄. For a convex function, the linearization at any point is a lower bound for the function at every point. So approximating a convex function by its linearization underestimates it at every point (a numerical check of property 9 is sketched below).
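The sketch below (not from the slides) spot-checks the gradient support inequality for the convex function f(x) = ‖x‖², whose gradient is 2x, at random points:

    import numpy as np

    f = lambda x: x @ x          # convex
    grad = lambda x: 2 * x       # its gradient

    xbar = np.random.randn(5)
    for _ in range(1000):
        x = np.random.randn(5)
        l = f(xbar) + grad(xbar) @ (x - xbar)   # linearization of f at xbar
        assert f(x) >= l - 1e-12                # lower-bound property holds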

How to check whether a given function is convex? One-variable functions like x_1², x_1⁴, e^(x_1), e^(−x_1), −log(x_1) [over x_1 > 0] are convex. For functions of many variables, checking convexity is hard, as it involves checking that the Hessian is PSD at every point!

Local Convexity: If the Hessian of f(x) at x̄ is PD, then f(x) is convex in a small neighborhood of x̄. In this case we say f(x) is locally convex at x̄.
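A small sketch (not from the slides) tests local convexity at a point by forming a finite-difference Hessian and checking that its smallest eigenvalue is positive; the test function is a made-up example:

    import numpy as np

    def hessian_fd(f, x, eps=1e-5):
        # symmetric finite-difference Hessian of a smooth f at x
        n = len(x)
        H = np.zeros((n, n))
        I = np.eye(n)
        for i in range(n):
            for j in range(n):
                H[i, j] = (f(x + eps*I[i] + eps*I[j]) - f(x + eps*I[i])
                           - f(x + eps*I[j]) + f(x)) / eps**2
        return (H + H.T) / 2

    f = lambda x: x[0]**4 + np.exp(x[1]) - np.log(x[2])
    x = np.array([1.0, 0.5, 2.0])
    H = hessian_fd(f, x)
    print(np.linalg.eigvalsh(H).min() > 0)   # True -> f is locally convex at x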

Types of NLPs

Unconstrained Minimization: No constraints. Real-world problems have constraints, but unconstrained minimization is very important because constrained problems can be transformed into unconstrained ones by penalty function methods.

LP: All functions affine.
QP: Objective function quadratic, all constraints affine.
Linearly constrained NLP: Objective function nonlinear, all constraints affine.
Equality constrained NLP: All constraints are equations, and there are no bounds on the variables.
Convex Programming Problem: Of the form

    Minimize θ(x)
    s. to h_i(x) = 0, i = 1 to m
          g_p(x) ≥ 0, p = 1 to t

where θ(x) is convex, all h_i(x) are affine, and all g_p(x) are concave. These are the nicest among NLPs: useful necessary and sufficient optimality conditions for a global minimum are known only for convex programming problems.

Nonconvex Programming Problems: Violate some of the conditions for convex programming.

Types of Solutions for an NLP

A feasible solution x̄ is a:
Local Minimum: If for some ε > 0, θ(x) ≥ θ(x̄) for all feasible x satisfying ‖x − x̄‖ < ε.
Strong local min: If θ(x) > θ(x̄) for all feasible x ≠ x̄ satisfying ‖x − x̄‖ < ε.
Weak local min: If a local minimum is not strong.
Global min: If θ(x) ≥ θ(x̄) for all feasible x.
Stationary Point: If it satisfies a necessary condition for a local minimum.
Local (strong, weak) maxima and global maxima are defined similarly.

Differences in Constructing LP Models & NLP Models

Variety of functional forms: In LP, all functions are affine. In NLP, there is an unlimited variety of functional forms.

Data: An LP with m constraints in n variables has (m + 1)(n + 1) − 1 coefficients as data elements. Large-scale LP refers to models with m in the 1000s and n in the 10,000s; such models are solved to global optimality in reasonable time. To construct an NLP model, one needs to determine functional forms for the objective and constraint functions. This usually involves curve fitting and parameter estimation, typically by the least squares method using special unconstrained minimization algorithms (a sketch follows below). So even constructing an NLP model needs NLP algorithms, and even a 200-variable NLP model is considered large scale.
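As an illustration of that model-construction step, here is a sketch (not from the slides) fitting a made-up nonlinear functional form to synthetic data by least squares with SciPy:

    import numpy as np
    from scipy.optimize import curve_fit

    model = lambda t, a, b: a * np.exp(b * t)    # assumed functional form

    t = np.linspace(0.0, 1.0, 20)
    y = 2.0 * np.exp(1.5 * t) + 0.05 * np.random.randn(20)   # synthetic data

    (a, b), _ = curve_fit(model, t, y, p0=(1.0, 1.0))  # unconstrained least squares
    print(a, b)   # estimates near 2.0 and 1.5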

Expectation on the solution: In LP we do not even talk about local minima, because we get a global minimum. In NLP, if the model is nonconvex, no efficient algorithm can guarantee finding a global minimum, so one compromises on the type of solution expected.

Convex Programs Are the Nicest NLPs!

Theorem: For a convex program, every local minimum is a global minimum. (Reason: if some feasible x* had θ(x*) < θ(x̄), then by convexity the points αx* + (1 − α)x̄ are feasible, lie arbitrarily close to x̄ as α → 0, and by Jensen's inequality have objective value below θ(x̄), so x̄ could not be a local minimum.) Hence for a convex program, any method that finds a local minimum will find a global minimum. Also, in convex programs every stationary point is a global minimum.

Properties of PSD, PD Matrices, and Algorithms to Check Them

Preserved under symmetrization: M is PD (PSD) iff D = (1/2)(M + M^T) is.

Signs of diagonal elements: Let M = (m_ij). If M is PD, all m_ii > 0. If M is PSD, all m_ii ≥ 0.

Skew-symmetry on a 0 diagonal element: If M = (m_ij) is PSD and m_ii = 0, then m_ij + m_ji = 0 for all j. So in a symmetric PSD matrix, if a diagonal element is 0, its row and column must be 0.

Preservation in the principal submatrix after a Gaussian pivot step: Let D = (d_ij) be symmetric of order n, and suppose d_11 ≠ 0. Perform a Gaussian pivot step with d_11 as the pivot element; after the pivot step, eliminate row 1 and column 1, resulting in a matrix D_1 of order (n − 1) × (n − 1). Then D is PD iff d_11 > 0 and D_1 is PD; D is PSD iff d_11 > 0 and D_1 is PSD.

Superdiagonalization Algorithm for checking whether M is PD:

Symmetrize: Let D = (1/2)(M + M^T).

Perform Gaussian pivot steps: Let D_0 = D. Do for i = 1 to n − 1: let D_(i−1) be the matrix from the previous operation. If any diagonal entry of D_(i−1) is ≤ 0, terminate; conclude M is not PD. Otherwise, carry out a Gaussian pivot step on D_(i−1) with its ith diagonal element as the pivot element, resulting in the matrix D_i. If all Gaussian pivot steps are carried out and all diagonal elements of the final matrix D_(n−1) are > 0, M is PD; terminate.

Examples (see the sketch below):

    3  1  2       4  4  5       1  0  2  0
    2  1  2       3  0  2       0  2  4  0
    0  2  0      13  6  3       2  4  4  5
                                0  0  5  3
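Here is a sketch (not from the slides) of this PD check in NumPy, run on the last example matrix above:

    import numpy as np

    def is_pd(M, tol=1e-12):
        D = (M + M.T) / 2.0                    # symmetrize
        n = D.shape[0]
        for i in range(n - 1):
            if np.any(np.diag(D)[i:] <= tol):  # nonpositive diagonal entry -> not PD
                return False
            # Gaussian pivot step on diagonal element (i, i)
            D[i+1:, :] -= np.outer(D[i+1:, i] / D[i, i], D[i, :])
        return D[n-1, n-1] > tol

    M = np.array([[1., 0., 2., 0.],
                  [0., 2., 4., 0.],
                  [2., 4., 4., 5.],
                  [0., 0., 5., 3.]])
    print(is_pd(M))   # False: the first pivot step creates a zero diagonal entry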

Superdiagonalization Algorithm for checking whether M is PSD:

Symmetrize: Let D = (1/2)(M + M^T).

Perform Gaussian pivot steps: Let E_0 = D. Do for i = 1 to n − 1: let E_(i−1) be the current matrix after the previous operation.

If any diagonal entry of E_(i−1) is < 0, terminate; conclude M is not PSD.

If the top diagonal entry of E_(i−1) is 0: if the top row or 1st column of E_(i−1) is nonzero, terminate and conclude M is not PSD; otherwise strike off the zero top row and 1st column of E_(i−1), and let the remaining matrix be the new current matrix E_i.

If the top diagonal entry of E_(i−1) is > 0: perform a Gaussian pivot step on E_(i−1) with the top diagonal element as the pivot element. After this pivot step, erase the top row and 1st column of the resulting matrix, and let the remaining matrix be the new current matrix E_i.

If there is no termination in the above steps, M is PSD; terminate.

Examples (see the sketch below):

    0  2  3  4  5       1  0  2  0
    2  3  3  0  0       0  2  4  0
    3  3  3  0  0       2  4  4  5
    4  0  0  8  4       0  0  5  3
    5  0  0  4  2
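A sketch (not from the slides) of this PSD check in NumPy; on the 5×5 example above it terminates immediately, since the top diagonal entry is 0 but the top row is nonzero:

    import numpy as np

    def is_psd(M, tol=1e-12):
        E = (M + M.T) / 2.0                   # symmetrize
        while E.shape[0] > 0:
            if np.any(np.diag(E) < -tol):     # negative diagonal entry -> not PSD
                return False
            if abs(E[0, 0]) <= tol:           # zero top diagonal entry:
                if np.any(np.abs(E[0]) > tol):   # nonzero top row -> not PSD
                    return False
                E = E[1:, 1:]                 # strike off the zero row and column
            else:
                # Gaussian pivot step on the top diagonal, then erase row and column 1
                E = E[1:, 1:] - np.outer(E[1:, 0], E[0, 1:]) / E[0, 0]
        return True

    M = np.array([[0., 2., 3., 4., 5.],
                  [2., 3., 3., 0., 0.],
                  [3., 3., 3., 0., 0.],
                  [4., 0., 0., 8., 4.],
                  [5., 0., 0., 4., 2.]])
    print(is_psd(M))   # False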

Other Properties of PSD, PD Matrices

Property preserved for principal submatrices: If M is PD (PSD), so are all principal submatrices of M.

Signs of determinants: If M is PD (PSD), whether symmetric or not, all its principal subdeterminants are > 0 (≥ 0). A square symmetric matrix is PD iff all its principal subdeterminants are > 0. A square symmetric matrix of order n is PD iff all n of its staircase (leading) principal subdeterminants are > 0 (see the sketch below).

P-matrix: A square matrix, whether symmetric or not, is a P-matrix iff all its principal subdeterminants are > 0. A symmetric P-matrix is PD. An asymmetric P-matrix may not be PD; it may be PSD, or even indefinite.

Linear dependence relation at an optimum: If M is PSD and x̄ minimizes x^T M x, then (M + M^T)x̄ = 0.
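A final sketch (not from the slides) applies the staircase-determinant test to a small made-up symmetric matrix:

    import numpy as np

    def is_pd_leading_minors(M):
        # symmetric M is PD iff all n leading principal minors are positive
        n = M.shape[0]
        return all(np.linalg.det(M[:k, :k]) > 0 for k in range(1, n + 1))

    M = np.array([[2., 1.],
                  [1., 2.]])         # leading minors: 2 and 3, both positive
    print(is_pd_leading_minors(M))   # True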