ESE504 (Fall 2010)

Lecture 1: Introduction and overview

linear programming
example
course topics
software
integer linear programming

1-1

Linear program (LP)

minimize    sum_{j=1}^n c_j x_j
subject to  sum_{j=1}^n a_{ij} x_j <= b_i,  i = 1,...,m
            sum_{j=1}^n c_{ij} x_j = d_i,   i = 1,...,p

variables: x_j
problem data: the coefficients c_j, a_{ij}, b_i, c_{ij}, d_i

can be solved very efficiently (several 10,000 variables, constraints)
widely available general-purpose software
extensive, useful theory (optimality conditions, sensitivity analysis, ...)

Introduction and overview 1-2
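A minimal sketch (not part of the original slides) of how an LP of this form can be solved with SciPy's general-purpose solver; the numerical data below are made up purely for illustration.

# Solve: minimize c^T x  subject to  A x <= b,  G x = d  (made-up data)
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A = np.array([[-1.0, 0.0], [0.0, -1.0], [-1.0, -1.0]])   # x1 >= 0, x2 >= 0, x1 + x2 >= 1
b = np.array([0.0, 0.0, -1.0])
G = np.array([[1.0, -1.0]])                               # x1 = x2
d = np.array([0.0])

res = linprog(c, A_ub=A, b_ub=b, A_eq=G, b_eq=d,
              bounds=[(None, None)] * 2)                  # variables are free
print(res.x, res.fun)                                     # optimal point and value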

Example. Open-loop control problem

single-input/single-output system (with input u, output y):

    y(t) = h_0 u(t) + h_1 u(t-1) + h_2 u(t-2) + h_3 u(t-3) + ...

output tracking problem: minimize deviation from desired output y_des(t),

    max_{t=0,...,N} |y(t) - y_des(t)|

subject to input amplitude and slew rate constraints:

    |u(t)| <= U,   |u(t+1) - u(t)| <= S

variables: u(0), ..., u(M) (with u(t) = 0 for t < 0, t > M)

solution: can be formulated as an LP, hence easily solved (more later)

Introduction and overview 1-3

example

step response (s(t) = h_t + ... + h_0) and desired output y_des(t): [two plots over t = 0,...,200]

amplitude and slew rate constraint on u:

    |u(t)| <= 1.1,   |u(t) - u(t-1)| <= 0.25

Introduction and overview 1-4

optimal solution

[plots: output and desired output; input u(t) within the bound 1.1; slew rate u(t) - u(t-1) within the bound 0.25, for t = 0,...,200]

Introduction and overview 1-5

Brief history

1930s (Kantorovich): economic applications
1940s (Dantzig): military logistics problems during WW2; 1947: simplex algorithm
1950s-60s: discovery of applications in many other fields (structural optimization, control theory, filter design, ...)
1979 (Khachiyan): ellipsoid algorithm: more efficient (polynomial-time) than simplex in the worst case, but slower in practice
1984 (Karmarkar): projective (interior-point) algorithm: polynomial-time worst-case complexity, and efficient in practice
1984-today: many variations of interior-point methods (improved complexity or efficiency in practice), software for large-scale problems

Introduction and overview 1-6

Course outline the linear programming problem linear inequalities, geometry of linear programming engineering applications signal processing, control, structural optimization... duality algorithms the simplex algorithm, interior-point algorithms large-scale linear programming and network optimization techniques for LPs with special structure, network flow problems integer linear programming introduction, some basic techniques Introduction and overview 1 7

Software solvers: solve LPs described in some standard form modeling tools: accept a problem in a simpler, more intuitive, notation and convert it to the standard form required by solvers software for this course (see class website) platforms: Matlab, Octave, Python solvers: linprog (Matlab Optimization Toolbox), modeling tools: CVX (Matlab), YALMIP (Matlab), Thanks to Lieven Vandenberghe at UCLA for his slides Introduction and overview 1 8

Integer linear program

integer linear program

minimize    sum_{j=1}^n c_j x_j
subject to  sum_{j=1}^n a_{ij} x_j <= b_i,  i = 1,...,m
            sum_{j=1}^n c_{ij} x_j = d_i,   i = 1,...,p
            x_j ∈ Z

Boolean linear program

minimize    sum_{j=1}^n c_j x_j
subject to  sum_{j=1}^n a_{ij} x_j <= b_i,  i = 1,...,m
            sum_{j=1}^n c_{ij} x_j = d_i,   i = 1,...,p
            x_j ∈ {0, 1}

very general problems; can be extremely hard to solve
can be solved as a sequence of linear programs

Introduction and overview 1-9
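A minimal sketch (not from the slides): a tiny made-up Boolean LP solved with SciPy's mixed-integer solver. The function scipy.optimize.milp is assumed available (SciPy 1.9 or newer); real instances of this problem class can be extremely hard.

# Boolean LP: minimize c^T x  s.t.  A x <= b,  x in {0,1}^3  (made-up data)
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

c = np.array([-5.0, -4.0, -3.0])
A = np.array([[2.0, 3.0, 1.0],
              [4.0, 1.0, 2.0]])
b = np.array([5.0, 6.0])

res = milp(c,
           constraints=LinearConstraint(A, -np.inf, b),
           integrality=np.ones(3),        # all variables integer ...
           bounds=Bounds(0, 1))           # ... and restricted to {0, 1}
print(res.x, res.fun)                     # expected optimum: x = (1, 1, 0), value -9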

Example. Scheduling problem

scheduling graph V:

nodes represent operations (e.g., jobs in a manufacturing process, arithmetic operations in an algorithm)
an arc (i, j) ∈ V means operation j must wait for operation i to be finished
M identical machines/processors; each operation takes unit time

problem: determine the fastest schedule

Introduction and overview 1-10

Boolean linear program formulation

variables: x_{is}, i = 1,...,n, s = 0,...,T: x_{is} = 1 if job i starts at time s, x_{is} = 0 otherwise

constraints:

1. x_{is} ∈ {0, 1}

2. job i starts exactly once:

       sum_{s=0}^T x_{is} = 1

3. if there is an arc (i, j) in V, then

       sum_{s=0}^T s x_{js} - sum_{s=0}^T s x_{is} >= 1

Introduction and overview 1-11

4. limit on capacity (M machines) at time s:

       sum_{i=1}^n x_{is} <= M

cost function (start time of job n):

       sum_{s=0}^T s x_{ns}

Boolean linear program

minimize    sum_{s=0}^T s x_{ns}
subject to  sum_{s=0}^T x_{is} = 1,                               i = 1,...,n
            sum_{s=0}^T s x_{js} - sum_{s=0}^T s x_{is} >= 1,     (i, j) ∈ V
            sum_{i=1}^n x_{is} <= M,                              s = 0,...,T
            x_{is} ∈ {0, 1},                                      i = 1,...,n, s = 0,...,T

Introduction and overview 1-12

ESE504 (Fall 2010) Lecture 2 Linear inequalities vectors inner products and norms linear equalities and hyperplanes linear inequalities and halfspaces polyhedra 2 1

Vectors

(column) vector x ∈ R^n:

    x = [x_1; x_2; ...; x_n]

x_i ∈ R: ith component or element of x
also written as x = (x_1, x_2, ..., x_n)

some special vectors:

x = 0 (zero vector): x_i = 0, i = 1,...,n
x = 1: x_i = 1, i = 1,...,n
x = e_i (ith basis vector or ith unit vector): x_i = 1, x_k = 0 for k ≠ i

(n follows from context)

Linear inequalities 2-2

Vector operations multiplying a vector x R n with a scalar α R: αx = αx 1. αx n adding and subtracting two vectors x, y R n : x + y = x 1 + y 1. x n + y n, x y = x 1 y 1. x n y n y 1.5y 0.75x +1.5y 0.75x x Linear inequalities 2 3

Inner product

x, y ∈ R^n:

    <x, y> := x_1 y_1 + x_2 y_2 + ... + x_n y_n = x^T y

important properties:

<αx, y> = α <x, y>
<x + y, z> = <x, z> + <y, z>
<x, y> = <y, x>
<x, x> >= 0
<x, x> = 0  <=>  x = 0

linear function: f : R^n -> R is linear, i.e., f(αx + βy) = α f(x) + β f(y), if and only if f(x) = <a, x> for some a

Linear inequalities 2-4

Euclidean norm

for x ∈ R^n we define the (Euclidean) norm as

    ||x|| = sqrt(x_1^2 + x_2^2 + ... + x_n^2) = sqrt(x^T x)

||x|| measures the length of the vector (from the origin)

important properties:

||αx|| = |α| ||x|| (homogeneity)
||x + y|| <= ||x|| + ||y|| (triangle inequality)
||x|| >= 0 (nonnegativity)
||x|| = 0  <=>  x = 0 (definiteness)

distance between vectors: dist(x, y) = ||x - y||

Linear inequalities 2-5

Inner products and angles

angle between vectors in R^n:

    θ = angle(x, y) = cos^{-1} ( x^T y / (||x|| ||y||) )

i.e., x^T y = ||x|| ||y|| cos θ

x and y aligned: θ = 0; x^T y = ||x|| ||y||
x and y opposed: θ = π; x^T y = -||x|| ||y||
x and y orthogonal: θ = π/2 or -π/2; x^T y = 0 (denoted x ⊥ y)
x^T y > 0 means angle(x, y) is acute; x^T y < 0 means angle(x, y) is obtuse

Linear inequalities 2-6

Cauchy-Schwarz inequality:

    |x^T y| <= ||x|| ||y||

projection of x on y: the projection is given by

    ( x^T y / ||y||^2 ) y

Linear inequalities 2-7
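A minimal sketch (not from the slides) of these formulas in NumPy, for two made-up vectors: the Cauchy-Schwarz inequality, the angle between x and y, and the projection of x on y.

import numpy as np

x = np.array([1.0, 2.0, 0.0])
y = np.array([3.0, 0.0, 4.0])

# Cauchy-Schwarz: |x^T y| <= ||x|| ||y||
assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y) + 1e-12

theta = np.arccos((x @ y) / (np.linalg.norm(x) * np.linalg.norm(y)))
proj = (x @ y) / (y @ y) * y          # projection of x on y
print(np.degrees(theta), proj)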

Hyperplanes

hyperplane in R^n: {x | a^T x = b} (a ≠ 0)

solution set of one linear equation a_1 x_1 + ... + a_n x_n = b with at least one a_i ≠ 0
set of vectors that make a constant inner product with the vector a = (a_1, ..., a_n) (the normal vector)

[figure: the hyperplane {x | a^T x = a^T x_0} through x_0 with normal vector a]

in R^2: a line, in R^3: a plane, ...

Linear inequalities 2-8

Halfspaces

(closed) halfspace in R^n: {x | a^T x <= b} (a ≠ 0)

solution set of one linear inequality a_1 x_1 + ... + a_n x_n <= b with at least one a_i ≠ 0
a = (a_1, ..., a_n) is the (outward) normal

[figure: the halfspaces {x | a^T x <= a^T x_0} and {x | a^T x >= a^T x_0} determined by x_0 and a]

{x | a^T x < b} is called an open halfspace

Linear inequalities 2-9

Affine sets

solution set of a set of linear equations

    a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n = b_1
    a_{21} x_1 + a_{22} x_2 + ... + a_{2n} x_n = b_2
    ...
    a_{m1} x_1 + a_{m2} x_2 + ... + a_{mn} x_n = b_m

intersection of m hyperplanes with normal vectors a_i = (a_{i1}, a_{i2}, ..., a_{in}) (w.l.o.g., all a_i ≠ 0)

in matrix notation: Ax = b with

    A = [a_{11} a_{12} ... a_{1n}; a_{21} a_{22} ... a_{2n}; ...; a_{m1} a_{m2} ... a_{mn}],   b = (b_1, b_2, ..., b_m)

Linear inequalities 2-10

Polyhedra

solution set of a system of linear inequalities

    a_{11} x_1 + a_{12} x_2 + ... + a_{1n} x_n <= b_1
    ...
    a_{m1} x_1 + a_{m2} x_2 + ... + a_{mn} x_n <= b_m

intersection of m halfspaces, with normal vectors a_i = (a_{i1}, a_{i2}, ..., a_{in}) (w.l.o.g., all a_i ≠ 0)

[figure: a polyhedron bounded by five halfspaces with normals a_1, ..., a_5]

Linear inequalities 2-11

matrix notation Ax b with A = a 11 a 12 a 1n a 21. a 22. a 2n. a m1 a m2 a mn, b = b 1 b 2. b m Ax b stands for componentwise inequality, i.e., fory, z R n, y z y 1 z 1,...,y n z n Linear inequalities 2 12

Examples of polyhedra ahyperplane{x a T x = b}: a T x b, a T x b solution set of system of linear equations/inequalities a T i x b i, i =1,...,m, c T i x = d i, i =1,...,p aslab{x b 1 a T x b 2 } the probability simplex {x R n 1 T x =1, x i 0, i=1,...,n} (hyper)rectangle {x R n l x u} where l<u Linear inequalities 2 13

Linear inequalities 2 14

ESE504 (Fall 2010) Lecture 3 Geometry of linear programming subspaces and affine sets, independent vectors matrices, range and nullspace, rank, inverse polyhedron in inequality form extreme points degeneracy the optimal set of a linear program 3 1

Subspaces

S ⊆ R^n (S ≠ ∅) is called a subspace if

    x, y ∈ S, α, β ∈ R  =>  αx + βy ∈ S

αx + βy is called a linear combination of x and y

examples (in R^n):

S = R^n, S = {0}
S = {αv | α ∈ R} where v ∈ R^n (i.e., a line through the origin)
S = span(v_1, v_2, ..., v_k) = {α_1 v_1 + ... + α_k v_k | α_i ∈ R}, where v_i ∈ R^n
set of vectors orthogonal to given vectors v_1, ..., v_k: S = {x ∈ R^n | v_1^T x = 0, ..., v_k^T x = 0}

Geometry of linear programming 3-2

Independent vectors

vectors v_1, v_2, ..., v_k are independent if and only if

    α_1 v_1 + α_2 v_2 + ... + α_k v_k = 0  =>  α_1 = α_2 = ... = α_k = 0

some equivalent conditions:

coefficients of α_1 v_1 + ... + α_k v_k are uniquely determined, i.e.,

    α_1 v_1 + ... + α_k v_k = β_1 v_1 + ... + β_k v_k

implies α_1 = β_1, α_2 = β_2, ..., α_k = β_k

no vector v_i can be expressed as a linear combination of the other vectors v_1, ..., v_{i-1}, v_{i+1}, ..., v_k

Geometry of linear programming 3-3

Basis and dimension {v 1,v 2,...,v k } is a basis for a subspace S if v 1,v 2,...,v k span S, i.e., S = span(v 1,v 2,...,v k ) v 1,v 2,...,v k are independent equivalently: every v Scan be uniquely expressed as v = α 1 v 1 + + α k v k fact: for a given subspace S, the number of vectors in any basis is the same, and is called the dimension of S, denoted dim S Geometry of linear programming 3 4

Affine sets V R n (V ) is called an affine set if x, y V,α+ β =1 = αx + βy V αx + βy is called an affine combination of x and y examples (in R n ) subspaces V= b + S = {x + b x S}where S is a subspace V= {α 1 v 1 + + α k v k α i R, i α i =1} V= {x v T 1 x = b 1,...,v T k x = b k} (if V ) every affine set V canbewrittenasv = x 0 + S where x 0 R n, S a subspace (e.g., can take any x 0 V, S = V x 0 ) dim(v x 0 ) is called the dimension of V Geometry of linear programming 3 5

A = some special matrices: Matrices a 11 a 12 a 1n a 21. a 22. a 2n. a m1 a m2 a mn Rm n A =0(zero matrix): a ij =0 A = I (identity matrix): m = n and A ii =1for i =1,...,n, A ij =0 for i j A = diag(x) where x R n (diagonal matrix): m = n and A = x 1 0 0 0 x 2 0...... 0 0 x n Geometry of linear programming 3 6

Matrix operations addition, subtraction, scalar multiplication transpose: A T = a 11 a 21 a m1 a 12. a 22. a m2. a 1n a 2n a mn Rn m multiplication: A R m n, B R n q, AB R m q : AB = n i=1 a n 1ib i1 i=1 a n 1ib i2 i=1 a n 1ib iq i=1 a n 2ib i1 i=1 a 2ib i2 n i=1 a 2ib iq... n i=1 a n mib i1 i=1 a n mib i2 i=1 a mib iq Geometry of linear programming 3 7

Rows and columns rows of A R m n : A = a T 1 a T 2. with a i =(a i1,a i2,...,a in ) R n a T m columns of B R n q : B = [ b 1 b 2 b q ] with b i =(b 1i,b 2i,...,b ni ) R n for example, can write AB as AB = a T 1 b 1 a T 1 b 2 a T 1 b q a T 2 b 1. a T 2 b 2. a T 2 b q. a T mb 1 a T mb 2 a T mb q Geometry of linear programming 3 8

Range of a matrix

the range of A ∈ R^{m x n} is defined as

    R(A) = {Ax | x ∈ R^n} ⊆ R^m

a subspace
the set of vectors that can be hit by the mapping y = Ax
the span of the columns of A = [a_1 ... a_n]: R(A) = {a_1 x_1 + ... + a_n x_n | x ∈ R^n}
the set of vectors y such that Ax = y has a solution

R(A) = R^m
<=> Ax = y can be solved in x for any y
<=> the columns of A span R^m
<=> dim R(A) = m

Geometry of linear programming 3-9

Interpretations v R(A), w R(A) y = Ax represents output resulting from input x v is a possible result or output w cannot be a result or output R(A) characterizes the achievable outputs y = Ax represents measurement of x y = v is a possible or consistent sensor signal y = w is impossible or inconsistent; sensors have failed or model is wrong R(A) characterizes the possible results Geometry of linear programming 3 10

Nullspace of a matrix

the nullspace of A ∈ R^{m x n} is defined as

    N(A) = {x ∈ R^n | Ax = 0}

a subspace
the set of vectors mapped to zero by y = Ax
the set of vectors orthogonal to all rows of A:

    N(A) = {x ∈ R^n | a_1^T x = ... = a_m^T x = 0}

where A = [a_1 ... a_m]^T

zero nullspace: N(A) = {0}
<=> x can always be uniquely determined from y = Ax (i.e., the linear transformation y = Ax doesn't lose information)
<=> columns of A are independent

Geometry of linear programming 3-11

Interpretations suppose z N(A) y = Ax represents output resulting from input x z is input with no result x and x + z have same result N (A) characterizes freedom of input choice for given result y = Ax represents measurement of x z is undetectable get zero sensor readings x and x + z are indistinguishable: Ax = A(x + z) N (A) characterizes ambiguity in x from y = Ax Geometry of linear programming 3 12

Inverse

A ∈ R^{n x n} is invertible or nonsingular if det A ≠ 0

equivalent conditions:

columns of A are a basis for R^n
rows of A are a basis for R^n
N(A) = {0}
R(A) = R^n
y = Ax has a unique solution x for every y ∈ R^n
A has an inverse A^{-1} ∈ R^{n x n}, with AA^{-1} = A^{-1}A = I

Geometry of linear programming 3-13

Rank of a matrix we define the rank of A R m n as rank(a) =dimr(a) (nontrivial) facts: rank(a) =rank(a T ) rank(a) is maximum number of independent columns (or rows) of A, hence rank(a) min{m, n} rank(a)+dimn (A) =n Geometry of linear programming 3 14

Full rank matrices for A R m n we have rank(a) min{m, n} we say A is full rank if rank(a) =min{m, n} for square matrices, full rank means nonsingular for skinny matrices (m >n), full rank means columns are independent for fat matrices (m <n), full rank means rows are independent Geometry of linear programming 3 15

Sets of linear equations

given A ∈ R^{m x n}, y ∈ R^m:

    Ax = y

solvable if and only if y ∈ R(A)
unique solution if y ∈ R(A) and rank(A) = n
general solution set: {x_0 + v | v ∈ N(A)} where Ax_0 = y
A square and invertible: unique solution for every y: x = A^{-1} y

Geometry of linear programming 3-16
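A minimal sketch (not from the slides) of these facts computed numerically with NumPy, for a small made-up rank-deficient matrix: the rank, a basis of the nullspace (from the SVD), and a particular solution of A x = y.

import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
y = np.array([6.0, 12.0, 2.0])

print(np.linalg.matrix_rank(A))          # dim R(A) = 2 for this A

# nullspace basis: right singular vectors with (numerically) zero singular value
U, s, Vt = np.linalg.svd(A)
null_basis = Vt[np.sum(s > 1e-10):]      # rows spanning N(A)

# particular solution x0 (least squares); here y is in R(A), so A x0 = y
x0, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.allclose(A @ x0, y))            # True
# general solution set: {x0 + v : v in N(A)}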

Polyhedron (inequality form)

A = [a_1 ... a_m]^T ∈ R^{m x n}, b ∈ R^m:

    P = {x | Ax <= b} = {x | a_i^T x <= b_i, i = 1,...,m}

[figure: a polyhedron bounded by six halfspaces with normals a_1, ..., a_6]

P is convex:

    x, y ∈ P, 0 <= λ <= 1  =>  λx + (1-λ)y ∈ P

i.e., the line segment between any two points in P lies in P

Geometry of linear programming 3-17

Extreme points and vertices

x ∈ P is an extreme point if it cannot be written as

    x = λy + (1-λ)z   with 0 <= λ <= 1, y, z ∈ P, y ≠ x, z ≠ x

x ∈ P is a vertex if there is a c such that c^T x < c^T y for all y ∈ P, y ≠ x

[figure: a polyhedron P with an extreme point, and a hyperplane c^T x = constant touching P only at a vertex]

fact: x is an extreme point  <=>  x is a vertex (proof later)

Geometry of linear programming 3-18

Basic feasible solution

define I as the set of indices of the active or binding constraints (at x*):

    a_i^T x* = b_i, i ∈ I,    a_i^T x* < b_i, i ∉ I

define Ā as the matrix with rows a_i^T, i ∈ I = {i_1, ..., i_k}:

    Ā = [a_{i_1}^T; a_{i_2}^T; ...; a_{i_k}^T]

x* is called a basic feasible solution if rank Ā = n

fact: x* is a vertex (extreme point)  <=>  x* is a basic feasible solution (proof later)

Geometry of linear programming 3-19

Example

polyhedron {x | Ax <= b} with

    A = [-1 0; 2 1; 0 -1; 1 2],   b = (0, 3, 0, 3)

(1,1) is an extreme point
(1,1) is a vertex: unique minimum of c^T x with c = (-1, -1)
(1,1) is a basic feasible solution: I = {2, 4} and rank Ā = 2, where

    Ā = [2 1; 1 2]

Geometry of linear programming 3-20

Equivalence of the three definitions

vertex => extreme point

let x* be a vertex of P, i.e., there is a c ≠ 0 such that c^T x* < c^T x for all x ∈ P, x ≠ x*

let y, z ∈ P, y ≠ x*, z ≠ x*:

    c^T x* < c^T y,   c^T x* < c^T z

so, if 0 <= λ <= 1, then

    c^T x* < c^T (λy + (1-λ)z)

hence x* ≠ λy + (1-λ)z

Geometry of linear programming 3-21

extreme point => basic feasible solution

suppose x* ∈ P is an extreme point with

    a_i^T x* = b_i, i ∈ I,    a_i^T x* < b_i, i ∉ I

suppose x* is not a basic feasible solution; then there exists a d ≠ 0 with

    a_i^T d = 0, i ∈ I

and for small enough ε > 0,

    y = x* + εd ∈ P,   z = x* - εd ∈ P

we have x* = 0.5y + 0.5z, which contradicts the assumption that x* is an extreme point

Geometry of linear programming 3-22

basic feasible solution => vertex

suppose x* ∈ P is a basic feasible solution and

    a_i^T x* = b_i, i ∈ I,    a_i^T x* < b_i, i ∉ I

define c = -sum_{i ∈ I} a_i; then

    c^T x* = -sum_{i ∈ I} b_i

and for all x ∈ P, c^T x >= -sum_{i ∈ I} b_i, with equality only if a_i^T x = b_i, i ∈ I

however the only solution to a_i^T x = b_i, i ∈ I, is x*; hence c^T x* < c^T x for all x ∈ P, x ≠ x*

Geometry of linear programming 3-23

Degeneracy set of linear inequalities a T i x b i, i =1,...,m a basic feasible solution x with a T i x = b i, i I, a T i x <b i, i I is degenerate if #indices in I is greater than n a property of the description of the polyhedron, not its geometry affects the performance of some algorithms disappears with small perturbations of b Geometry of linear programming 3 24

Unbounded directions

P contains a half-line if there exist d ≠ 0, x_0 such that x_0 + td ∈ P for all t >= 0

equivalent condition for P = {x | Ax <= b}:

    Ax_0 <= b,   Ad <= 0

fact: P unbounded  <=>  P contains a half-line

P contains a line if there exist d ≠ 0, x_0 such that x_0 + td ∈ P for all t

equivalent condition for P = {x | Ax <= b}:

    Ax_0 <= b,   Ad = 0

fact: P has no extreme points  <=>  P contains a line

Geometry of linear programming 3-25

Optimal set of an LP

minimize c^T x subject to Ax <= b

optimal value: p* = min{c^T x | Ax <= b} (p* = ±∞ is possible)
optimal point: x* with Ax* <= b and c^T x* = p*
optimal set: X_opt = {x | Ax <= b, c^T x = p*}

example:

minimize    c_1 x_1 + c_2 x_2
subject to  -2x_1 + x_2 <= 1
            x_1 >= 0, x_2 >= 0

c = (1, 1): X_opt = {(0, 0)}, p* = 0
c = (1, 0): X_opt = {(0, x_2) | 0 <= x_2 <= 1}, p* = 0
c = (-1, -1): X_opt = ∅, p* = -∞

Geometry of linear programming 3-26

Existence of optimal points p = if and only if there exists a feasible half-line with c T d<0 {x 0 + td t 0} d x 0 p =+ if and only if P = p is finite if and only if X opt c Geometry of linear programming 3 27

property: if P has at least one extreme point and p is finite, then there exists an extreme point that is optimal c X opt Geometry of linear programming 3 28

ESE504 (Fall 2010) Lecture 4 The linear programming problem: variants and examples variants of the linear programming problem LP feasibility problem examples and some general applications linear-fractional programming 4 1

Variants of the linear programming problem

general form

minimize    c^T x
subject to  a_i^T x <= b_i,  i = 1,...,m
            g_i^T x = h_i,   i = 1,...,p

in matrix notation:

minimize    c^T x
subject to  Ax <= b
            Gx = h

where A = [a_1^T; a_2^T; ...; a_m^T] ∈ R^{m x n}, G = [g_1^T; g_2^T; ...; g_p^T] ∈ R^{p x n}

The linear programming problem: variants and examples 4-2

inequality form LP minimize c T x subject to a T i x b i, i =1,...,m in matrix notation: minimize subject to c T x Ax b standard form LP minimize c T x subject to gi T x = h i, i =1,...,m x 0 in matrix notation: minimize subject to c T x Gx = h x 0 The linear programming problem: variants and examples 4 3

Reduction of general LP to inequality/standard form

minimize    c^T x
subject to  a_i^T x <= b_i,  i = 1,...,m
            g_i^T x = h_i,   i = 1,...,p

reduction to inequality form:

minimize    c^T x
subject to  a_i^T x <= b_i,    i = 1,...,m
            g_i^T x <= h_i,    i = 1,...,p
            -g_i^T x <= -h_i,  i = 1,...,p

in matrix notation (where A has rows a_i^T, G has rows g_i^T):

minimize    c^T x
subject to  [A; G; -G] x <= (b, h, -h)

The linear programming problem: variants and examples 4-4

reduction to standard form:

minimize    c^T x^+ - c^T x^-
subject to  a_i^T x^+ - a_i^T x^- + s_i = b_i,  i = 1,...,m
            g_i^T x^+ - g_i^T x^- = h_i,        i = 1,...,p
            x^+, x^-, s >= 0

variables x^+, x^-, s
recover x as x = x^+ - x^-
s ∈ R^m is called a slack variable

in matrix notation: minimize c̃^T x̃ subject to G̃ x̃ = h̃, x̃ >= 0, where

    x̃ = (x^+, x^-, s),   c̃ = (c, -c, 0),   G̃ = [A -A I; G -G 0],   h̃ = (b, h)

The linear programming problem: variants and examples 4-5

LP feasibility problem

feasibility problem: find x that satisfies a_i^T x <= b_i, i = 1,...,m

solution via LP (with variables t, x):

minimize    t
subject to  a_i^T x <= b_i + t,  i = 1,...,m

if the minimizer x*, t* satisfies t* <= 0, then x* satisfies the inequalities

LP in matrix notation: minimize c̃^T x̃ subject to Ã x̃ <= b̃ with

    x̃ = (x, t),   c̃ = (0, 1),   Ã = [A -1],   b̃ = b

The linear programming problem: variants and examples 4-6

Piecewise-linear minimization

piecewise-linear minimization:

    minimize  max_{i=1,...,m} (c_i^T x + d_i)

[figure: max_i (c_i^T x + d_i) as the pointwise maximum of the affine functions c_i^T x + d_i]

equivalent LP (with variables x ∈ R^n, t ∈ R):

minimize    t
subject to  c_i^T x + d_i <= t,  i = 1,...,m

in matrix notation: minimize c̃^T x̃ subject to Ã x̃ <= b̃ with

    x̃ = (x, t),   c̃ = (0, 1),   Ã = [C -1],   b̃ = -d

The linear programming problem: variants and examples 4-7
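A minimal sketch (not from the slides) of this reformulation with scipy.optimize.linprog; the rows c_i^T and offsets d_i below are made up.

# minimize max_i (c_i^T x + d_i) via the LP in (x, t)
import numpy as np
from scipy.optimize import linprog

C = np.array([[1.0, 1.0], [-1.0, 2.0], [0.0, -1.0]])   # rows c_i^T (made up)
d = np.array([0.0, 1.0, 2.0])
m, n = C.shape

obj = np.r_[np.zeros(n), 1.0]            # minimize t
A_ub = np.c_[C, -np.ones(m)]             # c_i^T x - t <= -d_i
b_ub = -d
res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
x, t = res.x[:n], res.x[-1]
print(x, t, np.max(C @ x + d))           # t equals the achieved maximum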

Convex functions f : R n R is convex if for 0 λ 1 f(λx +(1 λ)y) λf(x)+(1 λ)f(y) λf(x) +(1 λ)f(y) x λx +(1 λ)y y The linear programming problem: variants and examples 4 8

Piecewise-linear approximation

assume f : R^n -> R is differentiable and convex

the 1st-order approximation at x_1 is a global lower bound on f:

    f(x) >= f(x_1) + ∇f(x_1)^T (x - x_1)

evaluating f, ∇f at several x_i yields a piecewise-linear lower bound:

    f(x) >= max_{i=1,...,K} ( f(x_i) + ∇f(x_i)^T (x - x_i) )

The linear programming problem: variants and examples 4-9

Convex optimization problem minimize (f i convex and differentiable) f 0 (x) LP approximation (choose points x j, j =1,...,K): minimize t subject to f 0 (x j )+ f 0 (x j ) T (x x j ) t, j =1,...,K (variables x, t) yields lower bound on optimal value can be extended to nondifferentiable convex functions more sophisticated variation: cutting-plane algorithm (solves convex optimization problem via sequence of LP approximations) The linear programming problem: variants and examples 4 10

Norms norms on R n : Euclidean norm x (or x 2 ) = x 2 1 + + x2 n l 1 -norm: x 1 = x 1 + + x n l - (or Chebyshev-) norm: x =max i x i x 2 1 1 x =1 x =1 x 1 =1 x 1 The linear programming problem: variants and examples 4 11

Norm approximation problems

    minimize  ||Ax - b||_p

x ∈ R^n is the variable; A ∈ R^{m x n} and b ∈ R^m are problem data; p = 1, 2, ∞

r = Ax - b is called the residual
r_i = a_i^T x - b_i is the ith residual (a_i^T is the ith row of A)
usually overdetermined, i.e., b ∉ R(A) (e.g., m > n, A full rank)

interpretations:

approximate or fit b with a linear combination of the columns of A
b is a corrupted measurement of Ax; find the least inconsistent value of x for the given measurements

The linear programming problem: variants and examples 4-12

examples:

||r|| = sqrt(r^T r): least-squares or l_2-approximation (a.k.a. regression)
||r||_∞ = max_i |r_i|: Chebyshev, l_∞, or minimax approximation
||r||_1 = sum_i |r_i|: absolute-sum or l_1-approximation

solution:

l_2: closed-form expression x_opt = (A^T A)^{-1} A^T b (assume rank(A) = n)
l_1, l_∞: no closed-form expression, but readily solved via LP

The linear programming problem: variants and examples 4-13

l_1-approximation problem

l_1-approximation via LP:

    minimize  ||Ax - b||_1

write as

minimize    sum_{i=1}^m y_i
subject to  -y <= Ax - b <= y

an LP with variables y, x: minimize c̃^T x̃ subject to Ã x̃ <= b̃ with

    x̃ = (x, y),   c̃ = (0, 1),   Ã = [A -I; -A -I],   b̃ = (b, -b)

The linear programming problem: variants and examples 4-14

l_∞-approximation problem

l_∞-approximation via LP:

    minimize  ||Ax - b||_∞

write as

minimize    t
subject to  -t1 <= Ax - b <= t1

an LP with variables t, x: minimize c̃^T x̃ subject to Ã x̃ <= b̃ with

    x̃ = (x, t),   c̃ = (0, 1),   Ã = [A -1; -A -1],   b̃ = (b, -b)

The linear programming problem: variants and examples 4-15
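A minimal sketch (not from the slides) of both LP reformulations, applied to small random made-up data with scipy.optimize.linprog.

# l1- and l_inf-approximation of A x ~ b via LP
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 20, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# l1: minimize 1^T y  s.t.  -y <= A x - b <= y   (variables x, y)
c1 = np.r_[np.zeros(n), np.ones(m)]
A1 = np.r_[np.c_[A, -np.eye(m)], np.c_[-A, -np.eye(m)]]
b1 = np.r_[b, -b]
x_l1 = linprog(c1, A_ub=A1, b_ub=b1, bounds=[(None, None)] * (n + m)).x[:n]

# l_inf: minimize t  s.t.  -t 1 <= A x - b <= t 1   (variables x, t)
ci = np.r_[np.zeros(n), 1.0]
Ai = np.r_[np.c_[A, -np.ones((m, 1))], np.c_[-A, -np.ones((m, 1))]]
bi = np.r_[b, -b]
x_inf = linprog(ci, A_ub=Ai, b_ub=bi, bounds=[(None, None)] * (n + 1)).x[:n]

print(np.linalg.norm(A @ x_l1 - b, 1), np.linalg.norm(A @ x_inf - b, np.inf))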

Example minimize Ax b p for p =1, 2, (A R 100 30 ) resulting residuals: ri (p =1) ri (p =2) ri (p = ) 2 0 2 2 0 2 2 0 2 10 20 30 40 50 60 70 80 90 100 i 10 20 30 40 50 60 70 80 90 100 i 10 20 30 40 50 60 70 80 90 100 i The linear programming problem: variants and examples 4 16

histogram of residuals:

[figure: histograms of the residuals r for p = 1, p = 2, and p = ∞]

p = ∞ gives the thinnest distribution; p = 1 gives the widest distribution
p = 1 gives the most very small (or even zero) r_i

The linear programming problem: variants and examples 4-17

Interpretation: maximum likelihood estimation

m linear measurements y_1, ..., y_m of x ∈ R^n:

    y_i = a_i^T x + v_i,   i = 1,...,m

v_i: measurement noise, IID with density p

y is a random variable with density p_x(y) = prod_{i=1}^m p(y_i - a_i^T x)

the log-likelihood function is defined as

    log p_x(y) = sum_{i=1}^m log p(y_i - a_i^T x)

the maximum likelihood (ML) estimate of x is

    x̂ = argmax_x sum_{i=1}^m log p(y_i - a_i^T x)

The linear programming problem: variants and examples 4-18

examples

v_i Gaussian: p(z) = (1/(sqrt(2π)σ)) e^{-z^2/(2σ^2)}
  ML estimate is the l_2-estimate x̂ = argmin_x ||Ax - y||_2

v_i double-sided exponential: p(z) = (1/(2a)) e^{-|z|/a}
  ML estimate is the l_1-estimate x̂ = argmin_x ||Ax - y||_1

v_i one-sided exponential: p(z) = (1/a) e^{-z/a} for z >= 0, p(z) = 0 for z < 0
  ML estimate is found by solving the LP
      minimize    1^T (y - Ax)
      subject to  y - Ax >= 0

v_i uniform on [-a, a]: p(z) = 1/(2a) for -a <= z <= a, 0 otherwise
  ML estimate is any x satisfying ||Ax - y||_∞ <= a

The linear programming problem: variants and examples 4-19

Linear-fractional programming

minimize    (c^T x + d) / (f^T x + g)
subject to  Ax <= b
            f^T x + g >= 0

(assume a/0 = +∞ if a > 0, a/0 = -∞ if a <= 0)

nonlinear objective function
like LP, can be solved very efficiently

equivalent form with linear objective (variables x, γ):

minimize    γ
subject to  c^T x + d <= γ(f^T x + g)
            f^T x + g >= 0
            Ax <= b

The linear programming problem: variants and examples 4-20

Bisection algorithm for linear-fractional programming

given: interval [l, u] that contains the optimal γ

repeat: solve the feasibility problem for γ = (u + l)/2

    c^T x + d <= γ(f^T x + g),   f^T x + g >= 0,   Ax <= b

if feasible, u := γ; if infeasible, l := γ

until u - l <= ε

each iteration is an LP feasibility problem
accuracy doubles at each iteration
number of iterations to reach accuracy ε starting with an initial interval of width u - l = ε_0:

    k = log_2(ε_0/ε)

The linear programming problem: variants and examples 4-21
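A minimal sketch (not from the slides) of this bisection loop, using linprog with a zero objective as the LP feasibility oracle; the problem data and the initial bracket [0, 10] are made up.

# minimize (c^T x + d)/(f^T x + g)  s.t.  A x <= b,  f^T x + g >= 0
import numpy as np
from scipy.optimize import linprog

c, d = np.array([1.0, 2.0]), 1.0
f, g = np.array([1.0, 0.0]), 1.0
A = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
b = np.array([0.0, 0.0, 2.0])              # x >= 0, x1 + x2 <= 2

def feasible(gamma):
    # feasibility of: (c - gamma f)^T x <= gamma g - d,  -f^T x <= g,  A x <= b
    A_ub = np.r_[[c - gamma * f], [-f], A]
    b_ub = np.r_[[gamma * g - d], [g], b]
    res = linprog(np.zeros(2), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * 2)
    return res.status == 0                 # 0 = solved (feasible), 2 = infeasible

l, u = 0.0, 10.0                            # assumed bracket on the optimal value
while u - l > 1e-6:
    gamma = (u + l) / 2
    if feasible(gamma):
        u = gamma
    else:
        l = gamma
print(u)                                    # optimal value (here 1 for this data)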

Generalized linear-fractional programming minimize subject to max i=1,...,k c T i x + d i f T i x + g i Ax b f T i x + g i 0, i =1,...,K equivalent formulation: minimize subject to γ Ax b c T i x + d i γ(fi T x + g i), i =1,...,K fi T x + g i 0, i =1,...,K efficiently solved via bisection on γ each iteration is an LP feasibility problem The linear programming problem: variants and examples 4 22

Von Neumann economic growth problem simple model of an economy: m goods, n economic sectors x i (t): activity of sector i in current period t a T i x(t): amount of good i consumed in period t b T i x(t): amount of good i produced in period t choose x(t) to maximize growth rate min i x i (t +1)/x i (t): maximize γ subject to Ax(t +1) Bx(t), x(t +1) γx(t), x(t) 1 or equivalently (since a ij 0): maximize γ subject to γax(t) Bx(t), x(t) 1 (linear-fractional problem with variables x(0), γ) The linear programming problem: variants and examples 4 23

Optimal transmitter power allocation m transmitters, mn receivers all at same frequency transmitter i wants to transmit to n receivers labeled (i, j), j =1,...,n transmitter k receiver (i, j) transmitter i A ijk is path gain from transmitter k to receiver (i, j) N ij is (self) noise power of receiver (i, j) variables: transmitter powers p k, k =1,...,m The linear programming problem: variants and examples 4 24

at receiver (i, j): signal power: S ij = A iji p i noise plus interference power: I ij = k i A ijkp k + N ij signal to interference/noise ratio (SINR): S ij /I ij problem: choose p i to maximize smallest SINR: maximize subject to A iji p min i i,j k i A ijkp k + N ij 0 p i p max a (generalized) linear-fractional program special case with analytical solution: m =1, no upper bound on p i (see exercises) The linear programming problem: variants and examples 4 25

The linear programming problem: variants and examples 4 26

ESE504 (Fall 2010) Lecture 5 Structural optimization minimum weight truss design truss topology design limit analysis design with minimum number of bars 5 1

Truss m bars with lengths l i and cross-sectional areas x i N nodes; nodes 1,...,n are free, nodes n +1,...,N are anchored external load: forces f i R 2 at nodes i =1,...,n design problems: given the topology (i.e., location of bars and nodes), find the lightest truss that can carry a given load (vars: bar sizes x k, cost: total weight) same problem, where cost #bars used find best topology find lightest truss that can carry several given loads analysis problem: for a given truss, what is the largest load it can carry? Structural optimization 5 2

Material characteristics

u_i ∈ R is the force in bar i (u_i > 0: tension, u_i < 0: compression)
s_i ∈ R is the deformation of bar i (s_i > 0: lengthening, s_i < 0: shortening)

we assume the material is rigid/perfectly plastic:

    s_i = 0         if -α < u_i/x_i < α
    u_i/x_i = α     if s_i > 0
    u_i/x_i = -α    if s_i < 0

(α is a material constant)

Structural optimization 5-3

Minimum weight truss for given load

force equilibrium for (free) node i:

    sum_{j=1}^m u_j (n_{ij,x}, n_{ij,y}) + (f_{i,x}, f_{i,y}) = 0

n_{ij} depends on the topology: n_{ij} = 0 if bar j is not connected to node i, n_{ij} = (cos θ_{ij}, sin θ_{ij}) otherwise

minimum weight truss design via LP:

minimize    sum_{i=1}^m l_i x_i
subject to  sum_{j=1}^m u_j n_{ij} + f_i = 0,   i = 1,...,n
            -α x_j <= u_j <= α x_j,             j = 1,...,m

(variables x_j, u_j)

Structural optimization 5-4

example bar 1 45 bar 3 node 1 45 bar 2 f mimimize l 1 x 1 + l 2 x 2 + l 3 x 3 subject to u 1 / 2 u 2 / 2 u 3 + f x =0 u 1 / 2 u 2 / 2+f y =0 αx 1 u 1 αx 1 αx 2 u 2 αx 2 αx 3 u 3 αx 3 Structural optimization 5 5

Truss topology design grid of nodes; bars between any pair of nodes design minimum weight truss: u i =0for most bars optimal topology: only use bars with u i 0 example: 20 11 grid, i.e., 220 (potential) nodes, 24,090 (potential) bars nodes a, b, c are fixed; unit vertical force at node d optimal topology has 289 bars a bc d Structural optimization 5 6

Multiple loading scenarios minimum weight truss that can carry M possible loads f 1 i,...,f M i : minimize subject to m i=1 l ix i m j=1 uk j n ij + fi k =0, i =1,...,n, k =1,...,M αx j u k j αx j, j =1,...,m, k =1,...,M (variables x j, u 1 j,...,um j ) adds robustness: truss can carry any load with λ k 0, k λ k 1 f i = λ 1 f 1 i + + λ M f M i Structural optimization 5 7

Limit analysis truss with given geometry (including given cross-sectional areas x i ) load f i is given up to a constant multiple: f i = γg i, with given g i R 2 and γ>0 find largest load that the truss can carry: maximize subject to γ m j=1 u jn ij + γg i =0, i =1,...,n αx j u j αx j, j =1,...,m an LP in γ, u j maximum allowable γ is called the safety factor Structural optimization 5 8

Design with smallest number of bars integer LP formulation (assume wlog x i 1): minimize subject to m j=1 z j m j=1 u jn ij + f i =0, i =1,...,n αx j u j αx j, j =1,...,m x j z j, j =1,...,m z j {0, 1}, j =1,...,m variables z j, x j, u j extremely hard to solve; we may have to enumerate all 2 m possible values of z heuristic: replace z j {0, 1} by 0 z j 1 yields an LP; at the optimum many (but not all) z j s will be 0 or 1 called LP relaxation of the integer LP Structural optimization 5 9

Structural optimization 5 10

ESE504 (Fall 2010) Lecture 6 FIR filter design FIR filters linear phase filter design magnitude filter design equalizer design 6 1

FIR filters

finite impulse response (FIR) filter:

    y(t) = sum_{τ=0}^{n-1} h_τ u(t-τ),   t ∈ Z

u : Z -> R is the input signal; y : Z -> R is the output signal
h_i ∈ R are called the filter coefficients; n is the filter order or length

filter frequency response: H : R -> C

    H(ω) = h_0 + h_1 e^{-jω} + ... + h_{n-1} e^{-j(n-1)ω}
         = sum_{t=0}^{n-1} h_t cos tω - j sum_{t=0}^{n-1} h_t sin tω

(j = sqrt(-1))

periodic, conjugate symmetric, so we only need to know/specify H for 0 <= ω <= π

FIR filter design problem: choose h so H and h satisfy/optimize specs

FIR filter design 6-2

example: (lowpass) FIR filter, order n = 21

[figure: impulse response h(t), t = 0,...,20; frequency response magnitude |H(ω)| and phase ∠H(ω) for 0 <= ω <= π]

FIR filter design 6-3

Linear phase filters

suppose n = 2N+1 is odd and the impulse response is symmetric about its midpoint:

    h_t = h_{n-1-t},   t = 0,...,n-1

then

    H(ω) = h_0 + h_1 e^{-jω} + ... + h_{n-1} e^{-j(n-1)ω}
         = e^{-jNω} (2h_0 cos Nω + 2h_1 cos(N-1)ω + ... + h_N)
         = e^{-jNω} H̃(ω)

the term e^{-jNω} represents an N-sample delay
H̃(ω) is real
|H(ω)| = |H̃(ω)|

called a linear phase filter (∠H(ω) is linear except for jumps of ±π)

FIR filter design 6-4

Lowpass filter specifications

[figure: magnitude specification with ripple bounds δ_1 and 1/δ_1 on the passband [0, ω_p] and bound δ_2 on the stopband [ω_s, π]]

specifications:

maximum passband ripple (±20 log10 δ_1 in dB):

    1/δ_1 <= |H(ω)| <= δ_1,   0 <= ω <= ω_p

minimum stopband attenuation (-20 log10 δ_2 in dB):

    |H(ω)| <= δ_2,   ω_s <= ω <= π

FIR filter design 6-5

Linear phase lowpass filter design

sample the frequency axis (ω_k = kπ/K, k = 1,...,K)
can assume w.l.o.g. H̃(0) > 0, so the ripple spec is 1/δ_1 <= H̃(ω_k) <= δ_1

design for maximum stopband attenuation:

minimize    δ_2
subject to  1/δ_1 <= H̃(ω_k) <= δ_1,   0 <= ω_k <= ω_p
            -δ_2 <= H̃(ω_k) <= δ_2,    ω_s <= ω_k <= π

passband ripple δ_1 is given
an LP in the variables h, δ_2
known (and used) since the 1960s
can add other constraints, e.g., |h_i| <= α

FIR filter design 6-6
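A minimal sketch (not from the slides, whose code is not given) of this design LP set up with scipy.optimize.linprog, using the specifications of the example that follows (n = 31, passband [0, 0.12π], stopband [0.24π, π], δ_1 = 1.059); the grid size K = 300 is an assumption.

# maximum stopband attenuation design; variables (h_0,...,h_N, delta2)
import numpy as np
from scipy.optimize import linprog

N, K, delta1 = 15, 300, 1.059
w = np.linspace(0, np.pi, K)
# H~(w) = 2 h_0 cos(N w) + 2 h_1 cos((N-1) w) + ... + h_N   (linear in h)
Hmat = np.cos(np.outer(w, np.arange(N, -1, -1)))
Hmat[:, :-1] *= 2.0
pb, sb = w <= 0.12 * np.pi, w >= 0.24 * np.pi
nh = N + 1

rows, rhs = [], []
for i in np.where(pb)[0]:                  # 1/delta1 <= H~(w) <= delta1
    rows.append(np.r_[ Hmat[i], 0.0]); rhs.append(delta1)
    rows.append(np.r_[-Hmat[i], 0.0]); rhs.append(-1.0 / delta1)
for i in np.where(sb)[0]:                  # -delta2 <= H~(w) <= delta2
    rows.append(np.r_[ Hmat[i], -1.0]); rhs.append(0.0)
    rows.append(np.r_[-Hmat[i], -1.0]); rhs.append(0.0)

c = np.r_[np.zeros(nh), 1.0]               # minimize delta2
res = linprog(c, A_ub=np.array(rows), b_ub=np.array(rhs),
              bounds=[(None, None)] * nh + [(0, None)])
print(-20 * np.log10(res.x[-1]))           # achieved stopband attenuation in dB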

example

linear phase filter, n = 31
passband [0, 0.12π]; stopband [0.24π, π]
max ripple δ_1 = 1.059 (±0.5 dB)
design for maximum stopband attenuation

[figure: impulse response h and frequency response magnitude |H(ω)|]

FIR filter design 6-7

Some variations H(ω) =2h 0 cos Nω +2h 1 cos(n 1)ω + + h N minimize passband ripple (given δ 2, ω s, ω p, N) minimize δ 1 subject to 1/δ 1 H(ω k ) δ 1, 0 ω k ω p δ 2 H(ω k ) δ 2, ω s ω k π minimize transition bandwidth (given δ 1,δ 2,ω p,n) minimize ω s subject to 1/δ 1 H(ω k ) δ 1, 0 ω k ω p δ 2 H(ω k ) δ 2, ω s ω k π FIR filter design 6 8

minimize filter order (given δ 1, δ 2, ω s, ω p ) minimize N subject to 1/δ 1 H(ω k ) δ 1, 0 ω k ω p δ 2 H(ω k ) δ 2, ω s ω k π can be solved using bisection each iteration is an LP feasibility problem FIR filter design 6 9

Filter magnitude specifications transfer function magnitude spec has form L(ω) H(ω) U(ω), ω [0,π] where L, U : R R + are given and H(ω) = n 1 t=0 h t cos tω j n 1 t=0 h t sin tω arises in many applications, e.g., audio, spectrum shaping not equivalent to a set of linear inequalities in h (lower bound is not even convex) can change variables and convert to set of linear inequalities FIR filter design 6 10

Autocorrelation coefficients

the autocorrelation coefficients associated with the impulse response h = (h_0, ..., h_{n-1}) ∈ R^n are

    r_t = sum_{τ=0}^{n-1-t} h_τ h_{τ+t}

(with h_k = 0 for k < 0 or k >= n)

r_{-t} = r_t and r_t = 0 for t >= n; hence it suffices to specify r = (r_0, ..., r_{n-1})

the Fourier transform of the autocorrelation coefficients is

    R(ω) = sum_τ e^{-jωτ} r_τ = r_0 + sum_{t=1}^{n-1} 2 r_t cos ωt = |H(ω)|^2

can express the magnitude specification as

    L(ω)^2 <= R(ω) <= U(ω)^2,   ω ∈ [0, π]

... linear inequalities in r

FIR filter design 6-11
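A minimal sketch (not from the slides) that numerically checks the identity R(ω) = |H(ω)|^2 for a made-up impulse response, with the autocorrelation coefficients computed by np.correlate.

import numpy as np

h = np.array([0.2, 0.5, 0.5, 0.2])               # made-up impulse response
n = len(h)
r = np.correlate(h, h, mode="full")[n - 1:]      # r_0, ..., r_{n-1}

w = np.linspace(0, np.pi, 200)
H = np.exp(-1j * np.outer(w, np.arange(n))) @ h  # H(w) = sum_t h_t e^{-j w t}
R = r[0] + 2 * np.cos(np.outer(w, np.arange(1, n))) @ r[1:]
print(np.allclose(R, np.abs(H) ** 2))            # True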

Spectral factorization question: when is r R n the autocorrelation coefficients of some h R n? answer (spectral factorization theorem): if and only if R(ω) 0 for all ω spectral factorization condition is convex in r (a linear inequality for each ω) many algorithms for spectral factorization, i.e., finding an h such that R(ω) = H(ω) 2 magnitude design via autocorrelation coefficients: use r as variable (instead of h) add spectral factorization condition R(ω) 0 for all ω optimize over r use spectral factorization to recover h FIR filter design 6 12

Magnitude lowpass filter design maximum stopband attenuation design with variables r becomes minimize δ2 subject to 1/ δ 1 R(ω) δ 1, ω [0,ω p ] R(ω) δ 2, ω [ω s,π] R(ω) 0, ω [0,π] ( δ i corresponds to δ 2 i in original problem) now discretize frequency: minimize δ2 subject to 1/ δ 1 R(ω k ) δ 1, 0 ω k ω p R(ω k ) δ 2, ω s ω k π R(ω k ) 0, 0 ω k π...anlpinr, δ 2 FIR filter design 6 13

Equalizer design

(time-domain) equalization: given

g (unequalized impulse response)
g_des (desired impulse response)

design (FIR equalizer) h so that g̃ = h * g ≈ g_des

common choice: pure delay D:

    g_des(t) = 1 if t = D,   g_des(t) = 0 if t ≠ D

as an LP:

minimize    max_{t ≠ D} |g̃(t)|
subject to  g̃(D) = 1

FIR filter design 6-14

example unequalized system G is 10th order FIR: 1 0.8 0.6 g(t) 0.4 0.2 0 0.2 0.4 0 1 2 3 4 5 6 7 8 9 t 10 1 3 2 G(ω) 10 0 G(ω) 1 0 1 2 10 1 0 0.5 1 1.5 2 2.5 3 ω 3 0 0.5 1 1.5 2 2.5 3 ω FIR filter design 6 15

design 30th order FIR equalizer with G(ω) e j10ω minimize max t 10 g(t) equalized system impulse response g 1 0.8 g(t) 0.6 0.4 0.2 0 0.2 0 5 10 15 20 25 30 35 t equalized frequency response magnitude G and phase G 10 1 3 2 G(ω) 10 0 G(ω) 1 0 1 2 10 1 0 0.5 1 1.5 2 2.5 3 ω 3 0 0.5 1 1.5 2 2.5 3 ω FIR filter design 6 16

Magnitude equalizer design H(ω) G(ω) given system frequency response G :[0,π] C design FIR equalizer H so that G(ω)H(ω) 1: minimize max ω [0,π] G(ω)H(ω) 2 1 use autocorrelation coefficients as variables: minimize α subject to G(ω) 2 R(ω) 1 α, ω [0,π] R(ω) 0, ω [0,π] when discretized, an LP in r, α,... FIR filter design 6 17

Multi-system magnitude equalization given M frequency responses G k :[0,π] C design FIR equalizer H so that G k (ω)h(ω) constant: minimize max k=1,...,m max ω [0,π] G k (ω)h(ω) 2 γ k subject to γ k 1, k =1,...,M use autocorrelation coefficients as variables: minimize α subject to G k (ω) 2 R(ω) γ k α, ω [0,π], k =1,...,M R(ω) 0, ω [0,π] γ k 1, k =1,...,M... when discretized, an LP in γ k, r, α FIR filter design 6 18

example. M =2, n =25, γ k 1 unequalized and equalized frequency responses 2.5 2.5 2 2 Gk(ω) 2 1.5 1 Gk(ω)H(ω) 2 1.5 1 0.5 0.5 0 0 0.5 1 1.5 2 2.5 3 ω 0 0 0.5 1 1.5 2 2.5 3 ω FIR filter design 6 19

FIR filter design 6 20

ESE504 (Fall 2010) Lecture 7 Applications in control optimal input design robust optimal input design pole placement (with low-authority control) 7 1

Linear dynamical system

single input/single output: input u(t) ∈ R, output y(t) ∈ R

    y(t) = h_0 u(t) + h_1 u(t-1) + h_2 u(t-2) + ...

h_i are called the impulse response coefficients
finite impulse response (FIR) system of order k: h_i = 0 for i > k

if u(t) = 0 for t < 0:

    (y(0), y(1), ..., y(N)) = T (u(0), u(1), ..., u(N))

where T is the lower triangular Toeplitz matrix with first column (h_0, h_1, ..., h_N)

a linear mapping from input to output sequence

Applications in control 7-2

Output tracking problem

choose inputs u(t), t = 0,...,M (M < N) that

minimize the peak deviation between y(t) and a desired output y_des(t), t = 0,...,N:

    max_{t=0,...,N} |y(t) - y_des(t)|

satisfy amplitude and slew rate constraints:

    |u(t)| <= U,   |u(t+1) - u(t)| <= S

as a linear program (variables: w, u(0),...,u(N)):

minimize    w
subject to  -w <= sum_{i=0}^t h_i u(t-i) - y_des(t) <= w,   t = 0,...,N
            u(t) = 0,                                       t = M+1,...,N
            -U <= u(t) <= U,                                t = 0,...,M
            -S <= u(t+1) - u(t) <= S,                       t = 0,...,M+1

Applications in control 7-3
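A minimal sketch (not from the slides, whose code is not given) of this tracking LP with scipy.optimize.linprog. The impulse response, horizon, and desired output below are all made up; the variables are (u(0),...,u(M), w).

import numpy as np
from scipy.linalg import toeplitz
from scipy.optimize import linprog

h = np.array([0.0, 0.3, 0.5, 0.2])                 # made-up impulse response
N, M, U, S = 60, 40, 1.1, 0.25
y_des = np.r_[np.zeros(10), np.ones(N + 1 - 10)]   # made-up step at t = 10

# y = T u; columns for t > M are dropped since u(t) = 0 there
T = toeplitz(np.r_[h, np.zeros(N + 1 - len(h))],
             np.r_[h[0], np.zeros(N)])[:, :M + 1]
nu = M + 1

rows = [np.c_[ T, -np.ones((N + 1, 1))],           #  T u - y_des <= w
        np.c_[-T, -np.ones((N + 1, 1))]]           # -(T u - y_des) <= w
rhs = [y_des, -y_des]
D = np.eye(nu) - np.eye(nu, k=-1)                  # differences u(t) - u(t-1), u(-1) = 0
rows += [np.c_[ D, np.zeros((nu, 1))], np.c_[-D, np.zeros((nu, 1))]]
rhs += [S * np.ones(nu), S * np.ones(nu)]

c = np.r_[np.zeros(nu), 1.0]                       # minimize w
res = linprog(c, A_ub=np.vstack(rows), b_ub=np.hstack(rhs),
              bounds=[(-U, U)] * nu + [(0, None)])
u, w = res.x[:nu], res.x[-1]
print(w)                                           # peak tracking error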

example. single input/output, N = 200 step response y des 1 1 0 0 0 100 200 1 0 100 200 constraints on u: input horizon M = 150 amplitude constraint u(t) 1.1 slew rate constraint u(t) u(t 1) 0.25 Applications in control 7 4

output and desired output: y(t), y des (t) 1 0 1 0 100 200 optimal input sequence u: 1.1 u(t) 0.25 u(t) u(t 1) 0.0 0.00 1.1 0 100 200 0.25 0 100 200 Applications in control 7 5

Robust output tracking (1) impulse response is not exactly known; it can take two values: (h (1) 0,h(1) 1,...,h(1) k ), (h(2) 0,h(2) 1,...,h(2) k ) design an input sequence that minimizes the worst-case peak tracking error minimize w subject to w t i=0 h(1) i u(t i) y des (t) w, t =0,...,N w t i=0 h(2) i u(t i) y des (t) w, t =0,...,N u(t) =0, t = M +1,...,N U u(t) U, t =0,...,M S u(t +1) u(t) S, t =0,...,M +1 an LP in the variables w, u(0),...,u(n) Applications in control 7 6

example step responses outputs and desired output 1 1 0 0 0 100 200 1 0 100 200 1.1 u(t) 0.25 u(t) u(t 1) 0.0 0.00 1.1 0 100 200 0.25 0 100 200 Applications in control 7 7

Robust output tracking (2) h i and v (j) i h 0 (s) h 1 (s). h k (s) = h 0 h 1. h k + s 1 v (1) 0 v (1) 1. v (1) k are given; s i [ 1, +1] is unknown + + s K v (K) 0 v (K) 1. v (K) k robust output tracking problem (variables w, u(t)): min. s.t. w w t i=0 h i(s)u(t i) y des (t) w, t =0,...,N, s [ 1, 1] K u(t) =0, t = M +1,...,N U u(t) U, t =0,...,M S u(t +1) u(t) S, t =0,...,M +1 straightforward (and very inefficient) solution: enumerate all 2 K extreme values of s Applications in control 7 8

simplification: we can express the 2 K+1 linear inequalities w t h i (s)u(t i) y des (t) w for all s { 1, 1} K i=0 as two nonlinear inequalities t K h i u(t i)+ i=0 j=1 t i=0 i u(t i) y des(t)+w v (j) t h i u(t i) i=0 K j=1 t i=0 i u(t i) y des(t) w v (j) Applications in control 7 9

proof: max s { 1,1} K = = t h i (s)u(t i) i=0 t h i u(t i)+ i=0 t h i u(t i)+ i=0 K t max s j s j { 1,+1} i=0 K t v (j) i u(t i) j=1 j=1 i=0 v (j) i u(t i) and similarly for the lower bound Applications in control 7 10

robust output tracking problem reduces to: min. s.t. w t h i=0 i u(t i)+ K j=1 t i=0 v(j) i u(t i) y des (t)+w, t h i=0 i u(t i) K j=1 t i=0 v(j) i u(t i) y des (t) w, u(t) =0, t = M +1,...,N U u(t) U, t =0,...,M S u(t +1) u(t) S, t =0,...,M +1 t =0,...,N t =0,...,N (variables u(t), w) to express as an LP: for t =0,...,N, j =1,...,K, introduce new variables p (j) (t) and constraints t p (j) (t) v (j) i u(t i) p (j) (t) i=0 replace i v(j) i u(t i) by p (j) (t) Applications in control 7 11

example (K =6) nominal and perturbed step responses 1 0 design for nominal system 0 20 40 60 output for nominal system output for worst-case system 1 1 0 1 0 1 0 50 100 0 50 100 Applications in control 7 12

robust design output for nominal system output for worst-case system 1 1 0 1 0 1 0 50 100 0 50 100 Applications in control 7 13

input-output description: State space description y(t) =H 0 u(t)+h 1 u(t 1) + H 2 u(t 2) + if u(t) =0, t<0: y(0) y(1) y(2). y(n) = H 0 0 0 0 H 1 H 0 0 0 H 2 H 1 H 0 0....... H N H N 1 H N 2 H 0 u(0) u(1) u(2). u(n) block Toeplitz structure (constant along diagonals) state space model: x(t +1)=Ax(t)+Bu(t), y(t) =Cx(t)+Du(t) with H 0 = D, H i = CA i 1 B (i >0) x(t) R n is state sequence Applications in control 7 14

alternative description: 0 0. 0 y(0) y(1). y(n) = A I 0 0 B 0 0 0 A I 0 0 B 0............. 0 0 0 I 0 0 B C 0 0 0 D 0 0 0 C 0 0 0 D 0............. 0 0 0 C 0 0 D x(0) x(1) x(2). x(n) u(0) u(1). u(n) we don t eliminate the intermediate variables x(t) matrix is larger, but very sparse (interesting when using general-purpose LP solvers) Applications in control 7 15

Pole placement

linear system

    ż(t) = A(x) z(t),   z(0) = z_0

where A(x) = A_0 + x_1 A_1 + ... + x_p A_p ∈ R^{n x n}

solutions have the form

    z_i(t) = sum_k β_{ik} e^{σ_k t} cos(ω_k t - φ_{ik})

where λ_k = σ_k ± jω_k are the eigenvalues of A(x)

x ∈ R^p is the design parameter

goal: place the eigenvalues of A(x) in a desired region by choosing x

Applications in control 7-16

Low-authority control

eigenvalues of A(x) are very complicated (nonlinear, nondifferentiable) functions of x

first-order perturbation: if λ_i(A_0) is simple, then

    λ_i(A(x)) = λ_i(A_0) + sum_{k=1}^p ( w_i^* A_k v_i / (w_i^* v_i) ) x_k + o(||x||)

where w_i, v_i are the left and right eigenvectors:

    w_i^* A_0 = λ_i(A_0) w_i^*,   A_0 v_i = λ_i(A_0) v_i

low-authority control: use the linear first-order approximations for λ_i

can place λ_i in a polyhedral region by imposing linear inequalities on x
we expect this to work only for small shifts in the eigenvalues

Applications in control 7-17

Example

truss with 30 nodes, 83 bars

    M d̈(t) + D ḋ(t) + K d(t) = 0

d(t): vector of horizontal and vertical node displacements
M = M^T > 0 (mass matrix): masses at the nodes
D = D^T > 0 (damping matrix); K = K^T > 0 (stiffness matrix)

to increase damping, we attach dampers to the bars:

    D(x) = D_0 + x_1 D_1 + ... + x_p D_p

x_i > 0: amount of external damping at bar i

Applications in control 7-18

eigenvalue placement problem minimize p i=1 x i subject to λ i (M,D(x),K) C, i =1,...,n x 0 an LP if C is polyhedral and we use the 1st order approximation for λ i 5 before 4 eigenvalues 5 after 4 3 3 2 2 1 1 0 0 1 1 2 2 3 3 4 4 5 0.01 0.009 0.008 0.007 0.006 0.005 0.004 0.003 0.002 0.001 0 5 0.01 0.009 0.008 0.007 0.006 0.005 0.004 0.003 0.002 0.001 0 Applications in control 7 19

location of dampers Applications in control 7 20

ESE504 (Fall 2010)

Lecture 8: Duality (part 1)

the dual of an LP in inequality form
weak duality
examples
optimality conditions and complementary slackness
Farkas lemma and theorems of alternatives
proof of strong duality

8-1

The dual of an LP in inequality form

LP in inequality form:

minimize    c^T x
subject to  a_i^T x <= b_i,  i = 1,...,m

n variables, m inequality constraints, optimal value p*
called the primal problem (in the context of duality)

the dual LP (with A = [a_1 a_2 ... a_m]^T):

maximize    -b^T z
subject to  A^T z + c = 0
            z >= 0

an LP in standard form with m variables, n equality constraints
optimal value denoted d*

main property: p* = d* (if primal or dual is feasible)

Duality (part 1) 8-2

Weak duality

lower bound property: if x is primal feasible and z is dual feasible, then

    c^T x >= -b^T z

proof:

    c^T x >= c^T x + sum_{i=1}^m z_i (a_i^T x - b_i) = (c + A^T z)^T x - b^T z = -b^T z

c^T x + b^T z is called the duality gap associated with x and z

weak duality: minimize over x, maximize over z:

    p* >= d*

always true (even when p* = +∞ and/or d* = -∞)

Duality (part 1) 8-3

example

primal problem:

minimize    -4x_1 - 5x_2
subject to  [-1 0; 2 1; 0 -1; 1 2] (x_1, x_2) <= (0, 3, 0, 3)

optimal point: x* = (1, 1), optimal value p* = -9

dual problem:

maximize    -3z_2 - 3z_4
subject to  [-1 2 0 1; 0 1 -1 2] (z_1, z_2, z_3, z_4) + (-4, -5) = 0
            z_1 >= 0, z_2 >= 0, z_3 >= 0, z_4 >= 0

z = (0, 1, 0, 2) is dual feasible with objective value -9

Duality (part 1) 8-4

conclusion (by weak duality): z is a certificate that x is (primal) optimal x is a certificate that z is (dual) optimal Duality (part 1) 8 5
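A minimal sketch (not from the slides) that solves both problems of this example with scipy.optimize.linprog and confirms that the two optimal values coincide at -9.

import numpy as np
from scipy.optimize import linprog

c = np.array([-4.0, -5.0])
A = np.array([[-1.0, 0.0], [2.0, 1.0], [0.0, -1.0], [1.0, 2.0]])
b = np.array([0.0, 3.0, 0.0, 3.0])

# primal: minimize c^T x  subject to  A x <= b
primal = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2)

# dual: maximize -b^T z  subject to  A^T z + c = 0, z >= 0
# (solved as minimization of b^T z)
dual = linprog(b, A_eq=A.T, b_eq=-c, bounds=[(0, None)] * 4)

print(primal.x, primal.fun)      # x* = (1, 1), p* = -9
print(dual.x, -dual.fun)         # z* = (0, 1, 0, 2), d* = -9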

Piecewise-linear minimization minimize max i=1,...,m (a T i x b i) lower bounds for optimal value p? LP formulation (variables x, t) minimize t subject to [ A 1 ] [ x t ] b dual LP maximize subject to (same optimal value) b T z [ A T 1 T z 0 ] z + [ 0 1 ] =0 Duality (part 1) 8 6

Interpretation lemma: if z 0, i z i =1,thenforally, max i y i i z i y i hence, max i (a T i x b i) z T (Ax b) this yields a lower bound on p : p =min x max(a T i x b i ) min z T (Ax b) = i x { b T z if A T z =0 otherwise to get best lower bound: maximize b T z subject to A T z =0 1 T z =1 z 0 Duality (part 1) 8 7

l -approximation LP formulation minimize Ax b minimize subject to [ t A A 1 1 ][ x t ] [ b b ] LP dual canbeexpressedas maximize b T w + b T v subject to A T w A T v =0 1 T w + 1 T v =1 w, v 0 maximize b T z subject to A T z =0 z 1 1 (1) (2) Duality (part 1) 8 8

proofofequivalenceof(1)and(2) assume w, v feasible in (1), i.e., w 0, v 0, 1 T (w + v) =1 z = v w is feasible in (2): z 1 = i v i w i 1 T v + 1 T w =1 same objective value: b T z = b T v b T w assume z is feasible in (2), i.e., A T z =0, z 1 1 w i =max{z i, 0} + α, v i =max{ z i, 0} + α, with α =(1 z 1 )/(2m), are feasible in (1): v, w 0, 1 T w + 1 T v =1 same objective value: b T v b T w = b T z Duality (part 1) 8 9

Interpretation lemma: u T v u 1 v hence, for every z with z 1 1, wehavealowerboundon Ax b : Ax b z T (Ax b) p =min x togetbestlowerbound Ax b min x z T (Ax b) = maximize b T z subject to A T z =0 z 1 1 { b T z if A T z =0 otherwise Duality (part 1) 8 10

Optimality conditions

primal feasible x is optimal if and only if there is a dual feasible z with

    c^T x = -b^T z

i.e., the associated duality gap is zero

complementary slackness: for x, z optimal,

    c^T x + b^T z = sum_{i=1}^m z_i (b_i - a_i^T x) = 0

hence for each i, a_i^T x = b_i or z_i = 0:

z_i > 0  =>  a_i^T x = b_i (ith inequality is active at x)
a_i^T x < b_i  =>  z_i = 0

Duality (part 1) 8-11

Geometric interpretation example in R 2 : a 1 a 1 c a 2 a 2 two active constraints at optimum (a T 1 x = b 1, a T 2 x = b 2 ) optimal dual solution satisfies c = A T z, z 0, z i =0for i 1, 2, i.e., c = a 1 z 1 + a 2 z 2 geometrically, c lies in the cone generated by a 1 and a 2 Duality (part 1) 8 12

Separating hyperplane theorem if S R n is a nonempty, closed, convex set, and x S, then there exists c 0such that c T x <c T x for all x S, i.e., for some value of d, the hyperplane c T x = d separates x from S x c p S (x ) S idea of proof: use c = p S (x ) x,wherep S (x ) is the projection of x on S, i.e., p S (x ) = argmin x x x S Duality (part 1) 8 13

Farkas lemma

given A, b, exactly one of the following two statements is true:

1. there is an x >= 0 such that Ax = b
2. there is a y such that A^T y >= 0, b^T y < 0

very useful in practice: any y in 2 is a certificate or proof that Ax = b, x >= 0 is infeasible, and vice-versa

proof (easy part): we have a contradiction if 1 and 2 are both true:

    0 = y^T (Ax - b) >= -b^T y > 0

Duality (part 1) 8-14

proof (difficult part): not-1 => 2

not-1 means b ∉ S = {Ax | x >= 0}
S is nonempty, closed, and convex (the image of the nonnegative orthant under a linear mapping)
hence there exists a y such that

    y^T b < y^T Ax for all x >= 0

this implies:

y^T b < 0 (choose x = 0)
A^T y >= 0 (if (A^T y)_k < 0 for some k, we can choose x_i = 0 for i ≠ k and x_k -> +∞; then y^T Ax -> -∞)

i.e., 2 is true

Duality (part 1) 8-15
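A minimal sketch (not from the slides): for a made-up infeasible system A x = b, x >= 0, a Farkas certificate y with A^T y >= 0, b^T y < 0 can be computed by solving an LP feasibility problem, normalizing b^T y = -1.

import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, -1.0])                  # A x = b, x >= 0 is clearly infeasible

# find y with A^T y >= 0 and b^T y = -1
res = linprog(np.zeros(2),
              A_ub=-A.T, b_ub=np.zeros(2),         # -A^T y <= 0
              A_eq=b.reshape(1, -1), b_eq=[-1.0],  # b^T y = -1
              bounds=[(None, None)] * 2)
y = res.x
print(A.T @ y >= -1e-9, b @ y)             # certificate of infeasibility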

Theorems of alternatives many variations on Farkas lemma: e.g., forgivena R m n b R m, exactly one of the following statements is true: 1. there is an x with Ax b 2. there is a y 0 with A T y =0, b T y<0 proof (easy half): 1 and 2 together imply 0 (b Ax) T y = b T y<0 (difficult half): if 1 does not hold, then b S = {Ax + s x R n,s R m,s 0} hence, there is a separating hyperplane, i.e., y 0subject to y T b<y T (Ax + s) for all x and all s 0 equivalent to b T y<0, A T y =0, y 0 (i.e., 2istrue) Duality (part 1) 8 16

Proof of strong duality

strong duality: p* = d* (except possibly when p* = +∞, d* = -∞)

suppose p* is finite, and x* is optimal with

    a_i^T x* = b_i, i ∈ I,    a_i^T x* < b_i, i ∉ I

we'll show there is a dual feasible z with -b^T z = c^T x*

x* optimal implies that the set of inequalities

    a_i^T d <= 0, i ∈ I,    c^T d < 0        (1)

is infeasible; otherwise we would have for small t > 0

    a_i^T (x* + td) <= b_i, i = 1,...,m,    c^T (x* + td) < c^T x*

Duality (part 1) 8-17

from Farkas lemma: (1) is infeasible if and only if there exist λ_i >= 0, i ∈ I, with

    sum_{i ∈ I} λ_i a_i = -c

this yields a dual feasible z: z_i = λ_i, i ∈ I, z_i = 0, i ∉ I

z is dual optimal:

    -b^T z = -sum_{i ∈ I} b_i z_i = -sum_{i ∈ I} (a_i^T x*) z_i = -z^T A x* = c^T x*

this proves: p* finite => d* = p*

exercise: p* = +∞ => d* = +∞ or d* = -∞

Duality (part 1) 8-18

Summary

possible cases:

p* = d* and finite: primal and dual optima are attained
p* = d* = +∞: primal is infeasible; dual is feasible and unbounded
p* = d* = -∞: primal is feasible and unbounded; dual is infeasible
p* = +∞, d* = -∞: primal and dual are infeasible

uses of duality:

dual optimal z provides a proof of optimality for primal feasible x
dual feasible z provides a lower bound on p* (useful for stopping criteria)
sometimes it is easier to solve the dual
modern interior-point methods solve primal and dual simultaneously

Duality (part 1) 8-19

Duality (part 1) 8 20

ESE504 (Fall 2010) Lecture 9 Duality (part 2) duality in algorithms sensitivity analysis via duality duality for general LPs examples mechanics interpretation circuits interpretation two-person zero-sum games 9 1