January 29: Introduction to Optimization and Complexity.
Outline: Introduction. Problem formulation. Convexity reminder. Optimality conditions. Computational complexity.


Olga Galinina (olga.galinina@tut.fi), ELT-53656 Network Analysis and Dimensioning II, Department of Electronics and Communications Engineering, Tampere University of Technology, Tampere, Finland. January 29, 2014.


A bit of history... Leonhard Euler (1707-1783): "nothing at all takes place in the Universe in which some rule of maximum or minimum does not appear".

Where Does Network Optimization Arise?
The optimization discipline deals with finding the maxima and minima of functions subject to some constraints.
- Transportation systems: transportation of goods over transportation networks; scheduling of fleets of airplanes.
- Manufacturing systems: scheduling of goods for manufacturing; flow of manufactured items within inventory systems.
- Communication systems: design and expansion of communication systems; flow of information across networks.
- Energy systems, financial systems, and much more.

Examples
- Portfolio optimization. Variables: amounts invested in different assets. Constraints: budget, max./min. investment per asset, minimum return. Objective: overall risk or return variance.
- Device sizing in electronic circuits. Variables: device widths and lengths. Constraints: manufacturing limits, timing requirements, maximum area. Objective: power consumption.
- Data fitting. Variables: model parameters. Constraints: prior information, parameter limits. Objective: measure of prediction error.

Conventional Design Method
Initial design, then system analysis, then ask: is the design satisfactory? If not, correct the design based on experience and repeat.
- Depends on the designer's intuition, experience, and skills.
- A trial-and-error method; not easy to apply to a complex system.
- Does not always lead to the best possible design (qualitative design).


Mathematical Model
Consider a (mathematical) optimization problem: minimize f(x), x ∈ R^n, subject to x ∈ Ω.
Definition. The function f(x) : R^n → R is a real-valued function called the objective function, or cost function.
Definition. The variables x = [x_1, ..., x_n] are the decision variables.
Definition. An optimal solution x^0 has the smallest value of f(x) among all feasible vectors.

Constraint Set
Definition. The set Ω ⊆ R^n is the constraint set or feasible set/region. Ω typically takes the form {x : h_i(x) = 0, g_j(x) ≤ 0}, where h_i(x), g_j(x) are constraint functions.
Definition. The above problem is the general form of a constrained optimization problem. If Ω = R^n, we refer to the problem as unconstrained.
Definition. If Ω = ∅, the problem is infeasible; otherwise it is feasible.
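To make the abstract formulation concrete, here is a minimal numerical sketch (not part of the original slides; the particular f, h, g below are hypothetical) of the general constrained problem minimize f(x) subject to h(x) = 0 and g(x) ≤ 0, solved with SciPy's SLSQP method.

```python
# A minimal numerical sketch of the general constrained problem
#   minimize f(x)  subject to  h(x) = 0,  g(x) <= 0
# using SciPy's SLSQP solver. The concrete f, h, g below are
# hypothetical, chosen only to illustrate the formulation.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1.0)**2 + (x[1] + 0.5)**2   # objective f(x)
h = lambda x: x[0] + x[1] - 1.0                    # equality constraint h(x) = 0
g = lambda x: x[0]**2 - 2.0                        # inequality constraint g(x) <= 0

constraints = [
    {"type": "eq",   "fun": h},
    {"type": "ineq", "fun": lambda x: -g(x)},      # SciPy expects fun(x) >= 0
]

res = minimize(f, x0=np.zeros(2), method="SLSQP", constraints=constraints)
print(res.x, res.fun)   # feasible minimizer x* and optimal value f(x*)
```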

Solving optimization problems
The general optimization problem is very difficult to solve; methods involve some compromise, e.g., very long computation time, or not always finding the solution.
Exceptions: certain problem classes can be solved efficiently and reliably: least-squares problems, linear programming problems, convex optimization problems.

Least-squares problem
minimize ||Ax - b||_2^2
Analytical solution: x^0 = (A^T A)^{-1} A^T b. Reliable and efficient algorithms and software. Computation time proportional to n^2 k (A ∈ R^{k×n}); less if structured. A mature technology.
Using least-squares: least-squares problems are easy to recognize; a few standard techniques increase flexibility (e.g., including weights, adding regularization terms).
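As a quick illustration of the analytical formula above, the following sketch (with made-up random data) compares x^0 = (A^T A)^{-1} A^T b with NumPy's library least-squares routine.

```python
# Least-squares: minimize ||Ax - b||_2^2.
# Compare the closed-form solution x0 = (A^T A)^{-1} A^T b with numpy's solver.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 4))        # A in R^{k x n}, k = 50, n = 4
b = rng.normal(size=50)

x_formula = np.linalg.solve(A.T @ A, A.T @ b)      # normal equations
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)    # library routine

print(np.allclose(x_formula, x_lstsq))             # True: both give the minimizer
```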

Linear programming
minimize c^T x subject to a_i^T x ≤ b_i, i = 1, ..., m
Solving linear programs: no analytical formula for the solution; reliable and efficient algorithms and software; computation time proportional to n^2 m if m ≥ n, less with structure; a mature technology.
Using linear programming: not as easy to recognize as least-squares problems; a few standard tricks are used to convert problems into linear programs (e.g., problems involving l_1- or l_∞-norms, or piecewise-linear functions).
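For concreteness, a small sketch of solving an LP of this form with SciPy's linprog; the particular c, a_i, b_i values are arbitrary illustration data.

```python
# Linear program: minimize c^T x subject to a_i^T x <= b_i, i = 1, ..., m, x >= 0.
# Solved with SciPy's linprog; the data below are arbitrary illustration values.
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])               # minimize -x1 - 2*x2
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])               # rows are a_i^T
b = np.array([4.0, 3.0])

res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print(res.x, res.fun)                    # expected: x = [0, 4], objective -8
```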

Convex optimization problem
minimize f(x) subject to g_i(x) ≤ b_i, i = 1, ..., m
The objective and constraint functions are convex: g_i(α_1 x + α_2 y) ≤ α_1 g_i(x) + α_2 g_i(y) whenever α_1 + α_2 = 1, α_1 ≥ 0, α_2 ≥ 0.
Includes least-squares problems and linear programs as special cases.

Convex optimization problems
Solving convex optimization problems: no analytical solution in general; reliable and efficient algorithms; computation time (roughly) proportional to max{n^3, n^2 m, F}, where F is the cost of evaluating f(x) and its first and second derivatives; almost a technology.
Using convex optimization: often difficult to recognize; many tricks for transforming problems into convex form; surprisingly many problems can be solved via convex optimization.
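If a modeling package such as CVXPY is available, a convex problem in this form can be stated almost verbatim; the sketch below is illustrative only (it assumes the cvxpy package is installed, and the data are random).

```python
# A tiny convex problem in the above form, modeled with CVXPY (if installed):
#   minimize ||Ax - b||_2^2  subject to  sum(x) = 1,  x >= 0.
# The data are random illustration values.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
A = rng.normal(size=(20, 5))
b = rng.normal(size=20)

x = cp.Variable(5)
objective = cp.Minimize(cp.sum_squares(A @ x - b))
constraints = [cp.sum(x) == 1, x >= 0]
prob = cp.Problem(objective, constraints)
prob.solve()                         # calls whichever convex solver is installed
print(prob.value, x.value)
```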

Nonlinear optimization
Traditional techniques for general nonconvex problems involve compromises.
Local optimization methods (nonlinear programming): find a point that minimizes f among feasible points near it; fast, can handle large problems; require an initial guess; provide no information about the distance to the (global) optimum.
Global optimization methods: find the (global) solution; worst-case complexity grows exponentially with problem size.
These algorithms are often based on solving convex subproblems.

Brief history of convex optimization
Theory: 1900-1970.
Algorithms: 1947: simplex algorithm for linear programming (Dantzig); 1960s: early interior-point methods (Fiacco and McCormick, Dikin, ...); 1970s: ellipsoid method and other subgradient methods; 1980s: polynomial-time interior-point methods for linear programming (Karmarkar 1984); late 1980s to now: polynomial-time interior-point methods for nonlinear convex optimization (Nesterov and Nemirovski 1994).
Applications: before 1990, mostly in operations research, few in engineering; since 1990, many new applications in engineering (control, signal processing, communications, circuit design, ...) and new problem classes.

Mental break
A kitten has a 50/50 chance to be male or female. My cat just delivered two adorable kittens. My veterinarian said that at least one of them is female. What is the probability that the other kitten is a boy?
There are 4 variants: female-female, female-male, male-female, male-male. The last possibility is ruled out. Answer: 2/(4 - 1) = 2/3.


Affine and Convex Sets
Definition. S ⊆ R^n is affine if for all x, y ∈ S and α ∈ R, αx + (1 - α)y ∈ S.
Definition. S ⊆ R^n is convex if for all x, y ∈ S and 0 < α < 1, z = αx + (1 - α)y ∈ S (z is a convex combination of x and y).
If x_1, ..., x_m ∈ R^n, Σ_j α_j = 1, and α_j > 0, then x = α_1 x_1 + ... + α_m x_m is a convex combination of x_1, ..., x_m.
The intersection of (any number of) convex sets is convex.

Compact Sets
Let B_δ(x^0) denote the open ball of radius δ centered at the point x^0: B_δ(x^0) = {x : ||x - x^0|| < δ}.
Definition. A set S ⊆ R^n is said to be open if for each point x^0 ∈ S there is δ > 0 such that B_δ(x^0) ⊆ S. A set S ⊆ R^n is said to be closed if its complement R^n \ S is open.
Definition. A set S is compact if each of its open covers has a finite subcover: for every {C_i}_{i ∈ A} with S ⊆ ∪_{i ∈ A} C_i there is a finite J ⊆ A such that S ⊆ ∪_{j ∈ J} C_j.
Alternative: every sequence in S has a convergent subsequence whose limit lies in S.
Note: If S ⊆ R^n is closed and bounded, then S is compact (Heine-Borel theorem).

Convex functions
Definition. Let C ⊆ R^n be a nonempty convex set. Then f : C → R is convex (on C) if for all x, y ∈ C and all α ∈ (0, 1): f(αx + (1 - α)y) ≤ αf(x) + (1 - α)f(y).
If strict inequality holds whenever x ≠ y, then f is said to be strictly convex. The negative of a (strictly) convex function is called a (strictly) concave function.

Convex functions
Operations that preserve convexity:
- nonnegative multiple: αf is convex if f is convex and α ≥ 0;
- sum: f_1 + f_2 is convex if f_1, f_2 are convex (extends to infinite sums, integrals);
- composition with an affine function: f(Ax + b) is convex if f is convex.
Some univariate convex functions:
1. exponential: f(x) = e^{αx} (for all real α);
2. powers: f(x) = x^p for x ≥ 0, if 1 ≤ p < ∞;
3. powers of absolute value: f(x) = |x|^p, for p ≥ 1.
Concave:
1. powers: f(x) = x^p for x ≥ 0, if 0 ≤ p ≤ 1;
2. logarithm: f(x) = log x for x > 0.
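As a quick numerical spot-check (not from the slides), the following sketch samples random points and verifies the defining inequality for the exponential and for its composition with an affine map.

```python
# Numerically spot-check the convexity inequality
#   f(a*x + (1-a)*y) <= a*f(x) + (1-a)*f(y)
# for f(x) = exp(x), and for its composition with an affine map g(x) = f(2x - 3).
import numpy as np

f = np.exp
g = lambda x: f(2.0 * x - 3.0)          # composition with an affine function

rng = np.random.default_rng(2)
ok = True
for _ in range(10_000):
    x, y = rng.uniform(-5, 5, size=2)
    a = rng.uniform(0, 1)
    for func in (f, g):
        ok &= func(a * x + (1 - a) * y) <= a * func(x) + (1 - a) * func(y) + 1e-12
print(ok)   # True: no counterexample found
```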

Differentials
f is differentiable if dom(f) is open and the gradient ∇f(x) = (∂f/∂x_1, ..., ∂f/∂x_n)^T exists for all x ∈ dom(f).
f is twice differentiable if dom(f) is open and the Hessian H = D^2 f(x), the n×n matrix with entries [D^2 f(x)]_{ij} = ∂^2 f/∂x_i ∂x_j, exists for all x ∈ dom(f).
Note: Not all convex functions are differentiable.
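To connect these definitions to computation, here is an illustrative sketch that approximates the gradient and Hessian by finite differences for a quadratic f(x) = 0.5 x^T Q x + c^T x, where the analytic answers ∇f = Qx + c and D^2 f = Q are known.

```python
# Finite-difference gradient and Hessian of f(x) = 0.5*x^T Q x + c^T x,
# compared with the analytic formulas grad f = Q x + c and Hess f = Q.
import numpy as np

Q = np.array([[3.0, 1.0], [1.0, 2.0]])      # symmetric positive definite
c = np.array([1.0, -1.0])
f = lambda x: 0.5 * x @ Q @ x + c @ x

def num_grad(f, x, h=1e-6):
    g = np.zeros(x.size)
    for i in range(x.size):
        e = np.zeros(x.size); e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)      # central difference
    return g

def num_hess(f, x, h=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei) - f(x + ej) + f(x)) / h**2
    return H

x0 = np.array([0.7, -0.3])
print(np.allclose(num_grad(f, x0), Q @ x0 + c, atol=1e-4))   # gradient check
print(np.allclose(num_hess(f, x0), Q, atol=1e-2))            # Hessian check
```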

First-order condition
Theorem (gradient inequality). A differentiable f is convex on a convex set C ⊆ R^n if and only if for all x, y ∈ C: f(y) ≥ f(x) + (∇f(x))^T (y - x).
Theorem. Minimizing a differentiable convex function f(x) s.t. x ∈ C is equivalent to finding x* ∈ C such that (∇f(x*))^T (y - x*) ≥ 0 for all y ∈ C (a variational inequality problem).

Second-order condition
Theorem. A twice differentiable f is convex on C ⊆ R^n if and only if the Hessian matrix ∇^2 f(x) is positive semidefinite for all x ∈ C.
Note: If ∇^2 f(x) is positive definite for all x ∈ C, then f is strictly convex on C. The converse is false. Example: the function f(x) = x^4 is strictly convex, yet f''(0) = 0, so its Hessian is not positive definite at x = 0.
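A small illustration (my own, not from the slides) of checking the second-order condition numerically, together with the f(x) = x^4 counterexample mentioned above.

```python
# Second-order condition in practice: a twice differentiable f is convex iff its
# Hessian is positive semidefinite (all eigenvalues >= 0) on the domain.
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])
hess_quadratic = Q                               # Hessian of f(x) = 0.5 x^T Q x
print(np.all(np.linalg.eigvalsh(hess_quadratic) >= 0))   # True -> convex

# Counterexample to the converse: f(x) = x**4 is strictly convex on R,
# yet its second derivative f''(x) = 12*x**2 vanishes at x = 0
# (positive semidefinite but not positive definite there).
f2 = lambda x: 12.0 * x**2
print(f2(0.0))                                   # 0.0
```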


Minimization
Find an optimal decision x* (a minimizer; WLOG we consider minimization, since maximizing f is minimizing -f).
Definition. x* ∈ Ω is a local minimizer (minimum) of f over Ω if there exists ε > 0 such that f(x) ≥ f(x*) for all x ∈ Ω ∩ N(x*), where N(x*) is a neighborhood of x*. Typically, N(x*) is just some open ball B_δ(x*).
Definition. x* ∈ Ω is a global minimizer (minimum) of f over Ω if f(x) ≥ f(x*) for all x ∈ Ω \ {x*}.
If we replace ≥ with > we obtain a strict local minimizer and a strict global minimizer, respectively. Then f(x*) is the global minimum value.

The Method of Lagrange Multipliers
minimize f(x) s.t. c_i(x) = 0, i = 1, ..., m, x ∈ R^n, m ≤ n.
Jacobian matrix of the mapping c(x) = (c_1(x), ..., c_m(x)): ∇c(x) is the m×n matrix with entries [∇c(x)]_{ij} = ∂c_i/∂x_j.
Lagrange theorem. For a local minimizer x* and continuously differentiable f, c_1, ..., c_m (under a regularity condition, e.g., linearly independent gradients ∇c_i(x*)), there exist multipliers y*_1, ..., y*_m such that
∇f(x*) - Σ_{i=1}^{m} y*_i ∇c_i(x*) = 0.

The Method of Lagrange Multipliers
Lagrange multipliers: y*_1, ..., y*_m.
Lagrange function (Lagrangian): L(x, y) = f(x) - Σ_{i=1}^{m} y_i c_i(x).
Partial gradients:
∇_x L = (∂L/∂x_1, ..., ∂L/∂x_n) = ∇f(x) - Σ_{i=1}^{m} y_i ∇c_i(x),
∇_y L = (∂L/∂y_1, ..., ∂L/∂y_m) = -c(x).
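A worked sketch of the method on a toy problem of my own choosing: minimize f(x) = x_1^2 + x_2^2 subject to c(x) = x_1 + x_2 - 1 = 0, solving the stationarity conditions of the Lagrangian symbolically with SymPy.

```python
# Lagrange multipliers by hand, done symbolically with SymPy:
#   minimize f(x1, x2) = x1**2 + x2**2  s.t.  c(x1, x2) = x1 + x2 - 1 = 0.
# Stationarity of L(x, y) = f(x) - y*c(x) gives the candidate minimizer.
import sympy as sp

x1, x2, y = sp.symbols('x1 x2 y', real=True)
f = x1**2 + x2**2
c = x1 + x2 - 1
L = f - y * c

stationarity = [sp.diff(L, v) for v in (x1, x2, y)]   # grad_x L = 0 and -c(x) = 0
sol = sp.solve(stationarity, [x1, x2, y], dict=True)
print(sol)   # [{x1: 1/2, x2: 1/2, y: 1}]: the minimizer with multiplier y* = 1
```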

The Karush-Kuhn-Tucker Theorem
minimize f(x) s.t. c_i(x) ≥ 0, i = 1, ..., m, and h_i(x) = 0, i = 1, ..., l.
A constraint is active (or binding) at x^0 if c_i(x^0) = 0.
Theorem. For a local minimizer x* and continuously differentiable f, c_i, h_i, there exist multipliers λ_0, λ_1, ..., λ_m, μ_1, ..., μ_l, not all zero, such that
λ_0 ∇f(x*) - Σ_{i=1}^{m} λ_i ∇c_i(x*) - Σ_{i=1}^{l} μ_i ∇h_i(x*) = 0,
λ_i c_i(x*) = 0, i = 1, ..., m (complementary slackness),
λ_0, λ_i ≥ 0, i = 1, ..., m (dual feasibility),
c_i(x*) ≥ 0, h_i(x*) = 0 (primal feasibility).
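To see the conditions in action, here is a hedged sketch on a toy problem (not from the slides): minimize f(x) = (x - 2)^2 subject to c(x) = 1 - x ≥ 0; the constraint is active at the solution and the multiplier can be read off by hand.

```python
# KKT conditions on a toy problem: minimize f(x) = (x - 2)**2 s.t. c(x) = 1 - x >= 0.
# The constraint is active at the minimizer x* = 1, and lambda* = 2 satisfies
#   f'(x*) - lambda*c'(x*) = -2 - 2*(-1) = 0   (stationarity, with lambda_0 = 1)
#   lambda* * c(x*) = 2 * 0 = 0                (complementary slackness)
#   lambda* >= 0, c(x*) >= 0                   (dual and primal feasibility)
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2.0)**2
c = lambda x: 1.0 - x[0]

res = minimize(f, x0=np.array([0.0]), method="SLSQP",
               constraints=[{"type": "ineq", "fun": c}])
x_star, lam = res.x[0], 2.0
print(x_star)                                   # approx 1.0
print(2 * (x_star - 2) - lam * (-1.0))          # stationarity residual, approx 0
print(lam * c(res.x))                           # complementary slackness, approx 0
```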

Mental break
How many apples at equal distances from each other can I have?
Place three apples on a plane and add one more below the plane; they form a tetrahedron. 3 + 1 = 4.


Computational complexity
Answers the questions: What is an efficient algorithm? How do we measure efficiency?
Computational complexity of an algorithm: a measure of how many steps the algorithm will require in the worst case for an input of a given size; or, classifying problems according to their tractability or intractability.

Algorithms
A problem, e.g., the Traveling Salesman problem: Given a graph with nodes and edges and costs associated with the edges, what is a least-cost closed walk (or tour) containing each of the nodes exactly once?
An instance of a problem: The graph contains nodes 1, 2, 3, 4, 5, 6, and edges (1, 2) with cost 10, (1, 3) with cost 14, ...
A problem can be thought of as a function p that maps an instance x to an output p(x) (an answer).
An algorithm: A finite procedure for computing p(x) for any given input x.

Measuring computational complexity
By counting the number of elementary operations: addition (a + b), subtraction (a - b), multiplication (a · b), finite-precision division (a/b), comparison of two numbers (a < b).
Or by the running time of the algorithm: a simple function of the input size that is a reasonably tight upper bound on the actual number of steps.
Big-O notation: We say that f(t) = O(g(t)), t ≥ 0, if there exists c > 0 such that f(t) ≤ c·g(t) for all t > 0.
Examples: 100(t^2 + t) = O(t^2), but 0.0001 t^3 ≠ O(t^2).
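A tiny illustrative example of operation counting: a dense matrix-vector product performs n(2n - 1) elementary operations, which is O(n^2), matching the kind of complexity estimates quoted earlier for least squares and linear programming.

```python
# Counting elementary operations: a dense matrix-vector product y = A @ x
# performs n multiplications and n - 1 additions per row, i.e. n*(2n - 1)
# elementary operations in total, which is O(n**2).
def matvec_op_count(n: int) -> int:
    ops = 0
    for _ in range(n):          # one output entry per row
        ops += n                # n multiplications a_ij * x_j
        ops += n - 1            # n - 1 additions to sum the products
    return ops

for n in (10, 100, 1000):
    print(n, matvec_op_count(n), 2 * n * n)   # count vs. the 2n^2 upper bound
```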

Complexity classes