Mathematical Optimisation, Chpt 2: Linear Equations and inequalities


Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Mathematical Optimisation, Chpt 2: Linear Equations and inequalities Peter J.C. Dickinson p.j.c.dickinson@utwente.nl http://dickinson.website version: 09/04/18 Monday 5th February 2018 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 1/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Table of Contents 1 Introduction 2 Gauss-elimination [MO, 2.1] 3 Orthogonal projection, Least Squares, [MO, 2.2] 4 Linear Inequalities [MO, 2.4] 5 Integer Solutions of Linear Equations [MO, 2.3] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 2/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Organization and Material Lectures (hoor-colleges) and 3 exercise classes for motivation, geometric illustration, and proofs. Script Mathematical Optimization, cited as: [MO,?] Sheets of the course (on Blackboard) Mark based on written exam Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 3/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Chapter 1: Real vector spaces A collection of facts from Analysis and Linear Algebra, self-instruction Chapter 2: Linear equations, linear inequalities Gauss elimination and application Gauss elimination provides constructive proofs of main theorems in matrix theory. Fourier-Motzkin elimination This method provides constructive proofs of the Farkas Lemma (which is strong LP duality in disguise). Least square approximation, Fourier approximation Integer solutions of linear equations (discrete optimization) Chapter 3: Linear programs and applications LP duality, sensitivity analysis, matrix games. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 4/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Chapter 4: Convex analysis Properties of convex sets and convex functions. Applications in optimization Chapter 5: Unconstrained optimization optimality conditions algorithms: descent methods, Newton's method, Gauss-Newton method, Quasi-Newton methods, minimization of nondifferentiable functions. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 5/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Chapter 2. Linear equations, inequalities We start with some definitions: Definitions in matrix theory M = (m_{ij}) is said to be lower triangular if m_{ij} = 0 for i < j, and upper triangular if m_{ij} = 0 for i > j. P = (p_{ij}) ∈ R^{m×m} is a permutation matrix if p_{ij} ∈ {0, 1} and each row and each column of P contains exactly one coefficient 1. Note that P^T P = I, implying that P^{-1} = P^T. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 6/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions The set of symmetric matrices in R^{n×n} is denoted by S^n. Q ∈ S^n is called Positive Semidefinite (denoted Q ⪰ O or Q ∈ PSD_n) if x^T Q x ≥ 0 for all x ∈ R^n, and Positive Definite (denoted Q ≻ O) if x^T Q x > 0 for all x ∈ R^n \ {0}. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 7/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Table of Contents 1 Introduction 2 Gauss-elimination [MO, 2.1] Gauss-elimination (for solving Ax = b) Explicit Gauss Algorithm Implications of the Gauss algorithm Gauss-Algorithm for symmetric A 3 Orthogonal projection, Least Squares, [MO, 2.2] 4 Linear Inequalities [MO, 2.4] 5 Integer Solutions of Linear Equations [MO, 2.3] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 8/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Gauss-elimination (for solving Ax = b) In this Section 2.1 we briefly survey the Gauss algorithm and use it to give constructive proofs of important theorems in Matrix Theory. Motivation: We give a simple example which shows that successive elimination is equivalent to the Gauss algorithm. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 9/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions General Idea: Eliminating x_1, x_2, ... is equivalent to transforming Ax = b, i.e. the augmented matrix (A|b) with rows (a_{i1} a_{i2} ... a_{in} | b_i), i = 1, ..., m, to triangular normal form (Ã|b̃) (with the same solution set). Then solve Ãx = b̃ recursively. Transformation into the staircase form (Ã|b̃): for i = 1, ..., r, row i has its first nonzero entry ã_{i j_i} in column j_i, with j_1 < j_2 < ... < j_r; rows r+1, ..., m of Ã are zero, with right-hand sides b̃_{r+1}, ..., b̃_m. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 10/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions This Gauss elimination uses 2 types of row operations: (G1) (i, j)-pivot: for all k > i add λ_k = −a_{kj}/a_{ij} times row i to row k. (G2) interchange row i with row k. The matrix forms of these operations are: [MO, Ex.2.3] The matrix form of (G1), (A|b) → (Ã|b̃), is given by (Ã|b̃) = M(A|b) with a nonsingular lower triangular M ∈ R^{m×m}. [MO, Ex.2.4] The matrix form of (G2), (A|b) → (Ã|b̃), is given by (Ã|b̃) = P(A|b) with a permutation matrix P ∈ R^{m×m}. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 11/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Explicit Gauss Algorithm
1 Set i = 1 and j = 1.
2 While i ≤ m and j ≤ n do
3 If there exists k ≥ i such that a_{kj} ≠ 0 then
4 Interchange row i and row k. (G2)
5 Apply (i, j)-pivot. (G1)
6 Update i ← i + 1 and j ← j + 1.
7 else
8 Update j ← j + 1.
9 end if
10 end while
Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 12/40
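As a concrete illustration, here is a minimal NumPy sketch of the explicit Gauss algorithm above, transforming (A|b) into the staircase form; the function name and the exact-zero pivot test are my own choices, not from the script.

```python
import numpy as np

def gauss_echelon(A, b):
    """Transform (A|b) to staircase (row echelon) form using the pivoting
    rule of the explicit Gauss algorithm (a sketch; no numerical safeguards)."""
    Ab = np.hstack([A.astype(float), b.reshape(-1, 1).astype(float)])
    m, n = A.shape
    i, j = 0, 0
    while i < m and j < n:
        # look for a row k >= i with a nonzero entry in column j
        pivots = np.nonzero(Ab[i:, j])[0]
        if pivots.size:
            k = i + pivots[0]
            Ab[[i, k]] = Ab[[k, i]]                    # (G2) swap rows i and k
            for r in range(i + 1, m):                  # (G1) (i, j)-pivot
                Ab[r] -= (Ab[r, j] / Ab[i, j]) * Ab[i]
            i += 1
        j += 1
    return Ab
```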

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Implications of the Gauss algorithm Lemma 2.1 The set of lower triangular matrices is closed under addition, multiplication and inversion (provided square and nonsingular). Theorem 2.2 ([MO, Thm. 2.1]) For every A ∈ R^{m×n}, there exists an (m×m)-permutation matrix P and an invertible lower triangular matrix M ∈ R^{m×m} such that U = MPA is upper triangular. Corollary 2.3 (Cor. 2.1, LU-factorization) For A ∈ R^{m×n}, there exists an (m×m)-permutation matrix P, an invertible lower triangular L ∈ R^{m×m} and an upper triangular U ∈ R^{m×n} such that LU = PA. Rem.: Solve Ax = b by using the decomposition PA = LU! (How?) Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 13/40
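A short example of the remark, assuming SciPy is available: the factorisation PA = LU is reused to solve Ax = b via one forward and one backward substitution (the 3×3 matrix below is made up for illustration).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Solve Ax = b through PA = LU:
# first L y' = P b (forward substitution), then U x = y' (back substitution).
A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
b = np.array([1.0, 2.0, 3.0])

lu, piv = lu_factor(A)        # packed L, U and the row permutation
x = lu_solve((lu, piv), b)
print(np.allclose(A @ x, b))  # True
```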

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Corollary 2.4 ([MO, Cor. 2.2], Gale's Theorem) Exactly one of the following statements is true for A ∈ R^{m×n}, b ∈ R^m: (a) The system Ax = b has a solution x ∈ R^n. (b) There exists y ∈ R^m such that y^T A = 0^T and y^T b ≠ 0. Remark: In the normal form A → Ã, the number r gives the dimension of the space spanned by the rows of A. This equals the dimension of the space spanned by the columns of A. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 14/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Modified Gauss-Algorithm for symmetric A Perform the same row and column operations:
1 Set i = 1.
2 While i ≤ n do
3 If a_{kk} ≠ 0 for some k ≥ i
4 Apply (G2'): Interchange row i and row k, and interchange column i and column k, A ← PAP^T.
5 Apply (G1'): For all j > i add λ_j = −a_{ij}/a_{ii} times row i to row j, and λ_j times column i to column j, A ← MAM^T.
6 Update i ← i + 1.
7 Else if a_{ik} ≠ 0 for some k > i then
8 Add row k to row i and add column k to column i, A ← BAB^T.
9 Else
10 Update i ← i + 1.
11 End if
12 End while.
Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 15/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Implications of the symmetric Gauss algorithm Note: The symmetric Gauss algorithm destroys the solution set of Ax = b! But it is useful to obtain the following results. Theorem 2.5 ([MO, Thm. 2.2]) Let A ∈ R^{n×n} be symmetric. Then, with some nonsingular Q ∈ R^{n×n}, QAQ^T = D = diag(d_1, ..., d_n) for some d ∈ R^n. Look out: The d_i's are in general not the eigenvalues of A. Recall: Q ∈ PSD_n, i.e. Q is positive semidefinite (notation: Q ⪰ O), iff x^T Q x ≥ 0 for all x ∈ R^n. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 16/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Corollary 2.6 ([MO, Cor. 2.3]) Let A ∈ S^n and let Q ∈ R^{n×n} be nonsingular s.t. QAQ^T = diag(d_1, ..., d_n). Then (a) A ⪰ O ⟺ d_i ≥ 0, i = 1, ..., n; (b) A ≻ O ⟺ d_i > 0, i = 1, ..., n. Implication: Checking A ⪰ O can be done by the Gauss-algorithm. Corollary 2.7 ([MO, Cor. 2.4]) Let A ∈ S^n. Then (a) A ⪰ O ⟺ A = BB^T for some B ∈ R^{n×m}; (b) A ≻ O ⟺ A = BB^T for some nonsingular B ∈ R^{n×n}. Complexity of Gauss algorithm The number of ±, ×, / flops (floating point operations) needed to solve Ax = b, where A ∈ R^{n×n}, is less than or equal to n^3. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 17/40
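A sketch of how Corollary 2.6 can be turned into a test for A ⪰ O: pivot on the (1,1) entry and recurse on the Schur complement, which is exactly one symmetric elimination step QAQ^T. The function name and tolerance handling are my own assumptions for floating-point input.

```python
import numpy as np

def is_psd(A, tol=1e-12):
    """Check A >= O by symmetric elimination: pivot on a positive diagonal
    entry and recurse on the Schur complement (one step of QAQ^T)."""
    A = np.array(A, dtype=float)
    if A.size == 0:
        return True
    a, col = A[0, 0], A[1:, 0]
    if a < -tol:
        return False
    if abs(a) <= tol:
        # zero pivot: the whole first row/column must vanish for A >= O
        if np.any(np.abs(col) > tol):
            return False
        return is_psd(A[1:, 1:], tol)
    S = A[1:, 1:] - np.outer(col, col) / a   # eliminate first row and column
    return is_psd(S, tol)

print(is_psd([[2.0, -1.0], [-1.0, 2.0]]))   # True
print(is_psd([[1.0, 2.0], [2.0, 1.0]]))     # False
```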

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Table of Contents 1 Introduction 2 Gauss-elimination [MO, 2.1] 3 Orthogonal projection, Least Squares, [MO, 2.2] Projection and equivalent condition Constructing a solution Gram Matrix Gram-Schmidt Algorithm Eigenvalues 4 Linear Inequalities [MO, 2.4] 5 Integer Solutions of Linear Equations [MO, 2.3] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 18/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Projection and equivalent condition In this section: We see that the orthogonal projection problem is solved by solving a system of linear equations; and present some more results from matrix theory. Assumption: V is a linear vector space over R with inner product ⟨x, y⟩ and (induced) norm ‖x‖ = √⟨x, x⟩. Minimization Problem: Given x ∈ V and a subspace W ⊆ V, find x̄ ∈ W such that: ‖x − x̄‖ = min_{y ∈ W} ‖x − y‖ (1). The vector x̄ is called the projection of x onto W. Lemma 2.8 ([MO, Lm. 2.1]) x̄ ∈ W is the (unique) solution to (1) ⟺ ⟨x − x̄, w⟩ = 0 for all w ∈ W. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 19/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Constructing a solution via Lm 2.8 Let a_1, ..., a_m be a basis of W, i.e., a_1, ..., a_m are linearly independent and W = span{a_1, ..., a_m}. Write x̄ := Σ_{i=1}^m z_i a_i. Then ⟨x − x̄, w⟩ = 0 for all w ∈ W is equivalent to ⟨x − Σ_{i=1}^m z_i a_i, a_j⟩ = 0, j = 1, ..., m, or Σ_{i=1}^m ⟨a_i, a_j⟩ z_i = ⟨x, a_j⟩, j = 1, ..., m. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 20/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Gram Matrix Defining the Gram-matrix G = (⟨a_i, a_j⟩)_{i,j} ∈ S^m and considering b = (⟨x, a_i⟩)_i ∈ R^m, this leads to the linear equation (for z): (2.16) Gz = b with solution ẑ = G^{-1} b. Ex. 2.1 Prove that the Gram-matrix is positive definite, and thus non-singular (under our assumption). Summary Let W = span{a_1, ..., a_m}. Then the solution x̄ of the minimization problem (1) is given by x̄ := Σ_{i=1}^m ẑ_i a_i where ẑ is computed as the solution of G ẑ = b. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 21/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Special case 1: V = R^n, ⟨x, y⟩ = x^T y and a_1, ..., a_m a basis of W. Then with A := [a_1, ..., a_m] the projection of x onto W is given by ẑ = arg min_z ‖x − Az‖ = (A^T A)^{-1} A^T x, x̄ = arg min_y {‖x − y‖ : y ∈ A R^m} = Aẑ = A(A^T A)^{-1} A^T x. Special case 2: V = R^n, ⟨x, y⟩ = x^T y, a_1, ..., a_m ∈ R^n linearly independent and W = {w ∈ R^n : a_i^T w = 0, i = 1, ..., m}. Then the projection of x onto W is given by x̄ = arg min_y {‖x − y‖ : a_i^T y = 0 ∀i} = x − A(A^T A)^{-1} A^T x. Special case 3: W = span{a_1, ..., a_m} with {a_i} an orthonormal basis, i.e. (⟨a_i, a_j⟩)_{i,j} = I. Then the projection of x onto W is given by ẑ = (⟨a_j, x⟩)_j ∈ R^m ("Fourier coefficients"), x̄ = arg min_y {‖x − y‖ : y ∈ W} = Σ_{j=1}^m ẑ_j a_j. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 22/40
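A small NumPy illustration of special case 1; the vectors a_i and x below are made up, and np.linalg.lstsq solves the Gram system G ẑ = b implicitly.

```python
import numpy as np

# Projection of x onto W = span{a_1, ..., a_m}: xhat = A (A^T A)^{-1} A^T x,
# with A = [a_1, ..., a_m] having the a_i as columns.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 2.0]])      # two linearly independent vectors in R^3
x = np.array([1.0, 2.0, 3.0])

z_hat, *_ = np.linalg.lstsq(A, x, rcond=None)   # least squares = Gram system
x_hat = A @ z_hat

# the residual is orthogonal to W (Lemma 2.8)
print(np.allclose(A.T @ (x - x_hat), 0.0))      # True
```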

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Gram-Schmidt Algorithm Given a subspace W with basis a_1, ..., a_m, construct an orthonormal basis c_1, ..., c_m for W, i.e. (⟨c_i, c_j⟩)_{i,j} = I. Start with b_1 := a_1 and c_1 = b_1/‖b_1‖. For k = 2, ..., m let b_k = a_k − Σ_{i=1}^{k−1} ⟨c_i, a_k⟩ c_i and c_k = b_k/‖b_k‖. Gram-Schmidt in matrix form: With W ⊆ V := R^n. Put A = [a_1, ..., a_m]^T, B = [b_1, ..., b_m]^T, C = [c_1, ..., c_m]^T. Then the Gram-Schmidt steps are equivalent to: add a multiple of row j < k to row k (for B); multiply row k by a scalar (for C). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 23/40
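A direct Python sketch of the Gram-Schmidt loop above, assuming the input vectors are linearly independent; the function name and the two example vectors are my own.

```python
import numpy as np

def gram_schmidt(a_list):
    """Orthonormalise a basis a_1, ..., a_m (assumes linear independence)."""
    c_list = []
    for a in a_list:
        b = a - sum(np.dot(c, a) * c for c in c_list)   # subtract projections
        c_list.append(b / np.linalg.norm(b))
    return c_list

C = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                  np.array([1.0, 0.0, 1.0])])
print(np.round([[np.dot(ci, cj) for cj in C] for ci in C], 10))  # identity
```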

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Matrix form of Gram-Schmidt: Given A ∈ R^{m×n} with full row rank, there is a decomposition C = LA with a lower triangular nonsingular matrix L (l_{ii} = 1) such that the rows c_j of C are orthogonal, i.e. ⟨c_i, c_j⟩ = 0, i ≠ j. A corollary of this fact: Lemma 2.9 ([MO, Prop. 2.1]) Prop. 2.1 (Hadamard's inequality) Let A ∈ R^{m×n} with rows a_i^T. Then 0 ≤ det(AA^T) ≤ ∏_{i=1}^m a_i^T a_i. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 24/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Eigenvalues Definition. λ ∈ C is an eigenvalue of A ∈ R^{n×n} if there is an (eigenvector) 0 ≠ x ∈ C^n with Ax = λx. The results above (together with the Theorem of Weierstrass) allow a proof of: Theorem 2.10 ([MO, Thm. 2.3], Spectral theorem for symmetric matrices) Let A ∈ S^n. Then there exists an orthogonal matrix Q ∈ R^{n×n} (i.e. Q^T Q = I) and eigenvalues λ_1, ..., λ_n ∈ R such that Q^T A Q = D = diag(λ_1, ..., λ_n). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 25/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Table of Contents 1 Introduction 2 Gauss-elimination [MO, 2.1] 3 Orthogonal projection, Least Squares, [MO, 2.2] 4 Linear Inequalities [MO, 2.4] Fourier-Motzkin Algorithm Solvability of linear systems Application: Markov chains 5 Integer Solutions of Linear Equations [MO, 2.3] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 26/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Linear Inequalities Ax ≤ b In this Section we learn: Finding a solution to Ax ≤ b. What the Gauss algorithm is for linear equations, the Fourier-Motzkin algorithm is for linear inequalities. The Fourier-Motzkin algorithm leads to a constructive proof of the Farkas Lemma (the basis for strong duality in linear programming). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 27/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Fourier-Motzkin algorithm for solving Ax ≤ b. Eliminate x_1: the system reads a_{r1} x_1 + Σ_{j=2}^n a_{rj} x_j ≤ b_r (r = 1, ..., k), a_{s1} x_1 + Σ_{j=2}^n a_{sj} x_j ≤ b_s (s = k+1, ..., l), Σ_{j=2}^n a_{tj} x_j ≤ b_t (t = l+1, ..., m), with a_{r1} > 0 and a_{s1} < 0. Dividing by a_{r1} resp. |a_{s1}| leads to (for r and s): x_1 + Σ_{j=2}^n a'_{rj} x_j ≤ b'_r (r = 1, ..., k), −x_1 + Σ_{j=2}^n a'_{sj} x_j ≤ b'_s (s = k+1, ..., l). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 28/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions These two sets of inequalities have a solution x_1 iff Σ_{j=2}^n a'_{sj} x_j − b'_s ≤ x_1 ≤ b'_r − Σ_{j=2}^n a'_{rj} x_j for all r = 1, ..., k and s = k+1, ..., l, or equivalently: Σ_{j=2}^n (a'_{sj} + a'_{rj}) x_j ≤ b'_s + b'_r for all r = 1, ..., k, s = k+1, ..., l. Remark Explosion of the number of inequalities: before, the number was k + (l − k) + (m − l) = m; now the number is k·(l − k) + (m − l). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 29/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions So Ax ≤ b has a solution x = (x_1, ..., x_n) if and only if there is a solution x' = (x_2, ..., x_n) of Σ_{j=2}^n (a'_{sj} + a'_{rj}) x_j ≤ b'_r + b'_s, r = 1, ..., k, s = k+1, ..., l, and Σ_{j=2}^n a_{tj} x_j ≤ b_t, t = l+1, ..., m. In matrix form: Ax ≤ b has a solution x = (x_1, ..., x_n) if and only if there is a solution of the transformed system: A'x' ≤ b', or (0 A')x ≤ b'. Remark: Any row of (0 A'|b') is a positive combination of rows of (A|b): any row is of the form y^T(A|b), y ≥ 0. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 30/40
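A minimal NumPy sketch of one Fourier-Motzkin elimination step, pairing the inequalities with a_{r1} > 0 and a_{s1} < 0 exactly as above; no attempt is made to remove redundant inequalities, and the function name is mine.

```python
import numpy as np

def fm_eliminate(A, b):
    """One Fourier-Motzkin step: eliminate x_1 from Ax <= b."""
    pos = [i for i in range(len(b)) if A[i, 0] > 0]
    neg = [i for i in range(len(b)) if A[i, 0] < 0]
    zero = [i for i in range(len(b)) if A[i, 0] == 0]
    rows, rhs = [], []
    for r in pos:                      # pair every upper bound on x_1 ...
        for s in neg:                  # ... with every lower bound on x_1
            rows.append(A[r, 1:] / A[r, 0] + A[s, 1:] / (-A[s, 0]))
            rhs.append(b[r] / A[r, 0] + b[s] / (-A[s, 0]))
    for t in zero:                     # inequalities without x_1 are kept
        rows.append(A[t, 1:])
        rhs.append(b[t])
    return np.array(rows).reshape(-1, A.shape[1] - 1), np.array(rhs)
```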

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions By eliminating x_1, x_2, ..., x_n in this way we finally obtain an equivalent system Ã^{(n)} x ≤ b̃ where Ã^{(n)} = 0, which is solvable iff 0 ≤ b̃_i for all i. Theorem 2.11 (Projection Theorem, [MO, Thm. 2.5]) Let P = {x ∈ R^n : Ax ≤ b}. Then for all k = 1, ..., n, the projection P^{(k)} = {(x_{k+1}, ..., x_n) : (x_1, ..., x_k, x_{k+1}, ..., x_n) ∈ P for suitable x_1, ..., x_k ∈ R} is the solution set of a linear system A^{(k)} x^{(k)} ≤ b^{(k)} in the n − k variables x^{(k)} = (x_{k+1}, ..., x_n). In principle: Linear inequalities can be solved by FM. However this might be inefficient! (Why?) Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 31/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Solvability of linear systems Theorem 2.12 (Farkas Lemma, [MO, 2.6]) Exactly one of the following statements is true: (I) Ax ≤ b has a solution x ∈ R^n. (II) There exists y ∈ R^m such that y^T A = 0^T, y^T b < 0 and y ≥ 0. Ex. 2.2 Let A ∈ R^{m×n}, C ∈ R^{k×n}, b ∈ R^m, c ∈ R^k. Then precisely one of the following alternatives is valid: (I) There is a solution x of: Ax ≤ b, Cx = c. (II) There is a solution μ ∈ R^m, μ ≥ 0, λ ∈ R^k of: A^T μ + C^T λ = 0 and b^T μ + c^T λ = −1. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 32/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Corollary 2.13 (Gordan, [MO, Cor. 2.5]) Given A ∈ R^{m×n}, exactly one of the following alternatives is true: (I) Ax = 0, x ≥ 0 has a solution x ≠ 0. (II) y^T A < 0^T has a solution y. Remark: As we shall see in Chapter 3, the Farkas Lemma in the following form is the strong duality of LP in disguise. Corollary 2.14 (Farkas, implied inequalities, [MO, Cor. 2.6]) Let A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n, z ∈ R. Assume that Ax ≤ b is feasible. Then the following are equivalent: (a) Ax ≤ b ⟹ c^T x ≤ z; (b) y^T A = c^T, y^T b ≤ z, y ≥ 0 has a solution y. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 33/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Application: Markov chains (Existence of a steady state) Def. A vector π ∈ R^n_+ with 1^T π := Σ_{i=1}^n π_i = 1 is called a probability distribution on {1, ..., n}. A matrix P = (p_{ij}) where each row P_i is a probability distribution is called a stochastic matrix, i.e. P ∈ R^{n×n}_+ and P1 = 1. In a stochastic process (individuals in n possible states): π_i is the proportion of the population in state i, and p_{ij} is the probability of a transition from state i to state j. So the transition step k → k+1 is: π^{(k+1)} = P^T π^{(k)}. A probability distribution π is called a steady state if π = P^T π. As a corollary of Gordan's result: Each stochastic matrix P has a steady state π. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 34/40
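A small NumPy check of the corollary (the stochastic matrix P below is made up): a steady state solves the linear system π = P^T π, 1^T π = 1; its nonnegativity is guaranteed by the theory, not enforced in this sketch.

```python
import numpy as np

P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])      # rows are probability distributions

n = P.shape[0]
M = np.vstack([P.T - np.eye(n), np.ones((1, n))])   # (P^T - I) pi = 0, 1^T pi = 1
rhs = np.r_[np.zeros(n), 1.0]
pi, *_ = np.linalg.lstsq(M, rhs, rcond=None)
print(pi, np.allclose(P.T @ pi, pi))                # a steady state distribution
```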

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Table of Contents 1 Introduction 2 Gauss-elimination [MO, 2.1] 3 Orthogonal projection, Least Squares, [MO, 2.2] 4 Linear Inequalities [MO, 2.4] 5 Integer Solutions of Linear Equations [MO, 2.3] Two variables Lattices Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 35/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Integer Solutions of Linear Equations Example The equation 3x_1 − 2x_2 = 1 has the solution x = (1, 1) ∈ Z^2. But the equation 6x_1 − 2x_2 = 1 does not allow an integer solution x. Key remark: Let a_1, a_2 ∈ Z and let a_1 x_1 + a_2 x_2 = b have a solution x_1, x_2 ∈ Z. Then b = λc with λ ∈ Z, c = gcd(a_1, a_2). Here gcd(a_1, a_2) denotes the greatest common divisor of a_1, a_2. Lemma 2.15 (Euclid's Algorithm, [MO, Lm. 2.2]) Let c = gcd(a_1, a_2). Then L(a_1, a_2) := {a_1 λ_1 + a_2 λ_2 : λ_1, λ_2 ∈ Z} = {cλ : λ ∈ Z} =: L(c). (The proof of) this result allows us to solve (if possible) a_1 x_1 + a_2 x_2 = b (in Z). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 36/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Algorithm to solve a_1 x_1 + a_2 x_2 = b (in Z) Compute c = gcd(a_1, a_2). If λ := b/c ∉ Z, no integer solution exists. If λ := b/c ∈ Z, compute solutions λ_1, λ_2 ∈ Z of λ_1 a_1 + λ_2 a_2 = c. Then (λ_1 λ) a_1 + (λ_2 λ) a_2 = b. General problem: Given a_1, ..., a_n, b ∈ Z^m, find x = (x_1, ..., x_n) ∈ Z^n such that (*) a_1 x_1 + a_2 x_2 + ... + a_n x_n = b, or equivalently Ax = b where A := [a_1, ..., a_n]. Def. We introduce the lattice generated by a_1, ..., a_n: L = L(a_1, ..., a_n) = {Σ_{j=1}^n a_j λ_j : λ_j ∈ Z} ⊆ R^m. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 37/40
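A Python sketch of this two-variable algorithm via the extended Euclidean algorithm; the function names are my own.

```python
def extended_gcd(a1, a2):
    """Return (c, l1, l2) with c = gcd(a1, a2) = l1*a1 + l2*a2 (Euclid)."""
    if a2 == 0:
        return abs(a1), (1 if a1 >= 0 else -1), 0
    c, l1, l2 = extended_gcd(a2, a1 % a2)
    return c, l2, l1 - (a1 // a2) * l2

def solve_two_var(a1, a2, b):
    """Integer solution of a1*x1 + a2*x2 = b, or None if none exists."""
    c, l1, l2 = extended_gcd(a1, a2)
    if b % c != 0:
        return None                   # b is not a multiple of gcd(a1, a2)
    lam = b // c
    return l1 * lam, l2 * lam

print(solve_two_var(3, -2, 1))        # one integer solution of 3*x1 - 2*x2 = 1
print(solve_two_var(6, -2, 1))        # None: gcd(6, -2) = 2 does not divide 1
```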

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Assumption 1: rank A = m (m ≤ n); w.l.o.g. a_1, ..., a_m are linearly independent. To solve the problem: Find C = [c_1 ... c_m] ∈ Z^{m×m} such that (**) L(c_1, ..., c_m) = L(a_1, ..., a_n). Then (*) has a solution x ∈ Z^n iff λ := C^{-1} b ∈ Z^m. Bad news: As in the case of one equation, in general L(a_1, ..., a_m) ≠ L(a_1, ..., a_n). Lemma 2.16 ([MO, Lm. 2.3]) Let c_1, ..., c_m ∈ L(a_1, ..., a_n). Then L(c_1, ..., c_m) = L(a_1, ..., a_n) if and only if for all j = 1, ..., n, the system Cλ = a_j has an integral solution. Last step: Find such c_i's. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 38/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions Main Result: The algorithm Lattice Basis
INIT: C = [c_1, ..., c_m] = [a_1, ..., a_m];
ITER: Compute C^{-1};
If C^{-1} a_j ∈ Z^m for j = 1, ..., n, then stop;
If λ = C^{-1} a_j ∉ Z^m for some j, then
let a_j = Cλ = Σ_{i=1}^m λ_i c_i and compute c = Σ_{i=1}^m (λ_i − ⌊λ_i⌋) c_i = a_j − Σ_{i=1}^m ⌊λ_i⌋ c_i;
let k be the largest index i such that λ_i ∉ Z;
update C by replacing c_k with c in column k;
next iteration.
The algorithm stops after at most K = log_2(det[a_1, ..., a_m]) steps with a matrix C satisfying (**).
Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 39/40

Introduction Gauss-elimination Orthogonal projection Linear Inequalities Integer Solutions As an exercise: Explain how an integer solution x of (*) can be constructed with the help of the results from the algorithm above. Complexity From the algorithm above we see: Solving integer systems of equations is polynomial. Under additional inequalities (such as x ≥ 0) the problem becomes NP-hard. Theorem 2.17 ([MO, Thm. 2.4]) Let A ∈ Z^{m×n} and b ∈ Z^m be given. Then exactly one of the following statements is true: (a) There exists some x ∈ Z^n such that Ax = b. (b) There exists some y ∈ R^m such that y^T A ∈ Z^n and y^T b ∉ Z. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 2: Linear Equations and inequalities 40/40

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Mathematical Optimisation, Chpt 3: Linear Optimisation/Programming Peter J.C. Dickinson p.j.c.dickinson@utwente.nl http://dickinson.website version: 09/04/18 Wednesday 21st February 2018 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 1/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Table of Contents 1 Introduction 2 Recap 3 Primal and dual problems [MO, 3.1.1-2] 4 Shadow prices [MO, 3.1.3] 5 Matrix Games [MO, 3.1.4] 6 Algorithms Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 2/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms What do you recall from first year course? Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 3/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Table of Contents 1 Introduction 2 Recap 3 Primal and dual problems [MO, 3.1.1-2] Definitions Weak and strong duality Complementarity Equivalent LP's 4 Shadow prices [MO, 3.1.3] 5 Matrix Games [MO, 3.1.4] 6 Algorithms Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 4/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Linear Optimisation Given fixed parameters A ∈ R^{m×n}, b ∈ R^m and c ∈ R^n: max c^T x s.t. Ax ≤ b, x ∈ R^n. (LP_p) F_p := {x ∈ R^n : Ax ≤ b} is the feasible set of (LP_p), z*_p := max_{x ∈ F_p} c^T x is the optimal value of (LP_p), x ∈ F_p is an optimal solution of (LP_p) if c^T x = z*_p. http://ggbtu.be/m2677379 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 5/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Dual Problem Consider the problem max_{x ∈ R^n} c^T x s.t. Ax ≤ b. The dual problem to this is of the following form for some (1), (2), (3): (1)_y b^T y s.t. A^T y (2) c, y ∈ (3). What should be filled in for (1), (2), (3)? (1) (a) min (b) max; (2) (a) ≤ (b) ≥ (c) =; (3) (a) R^m (b) R^m_+ (c) −(R^m_+). N.B. y ∈ R^m_+ iff y ∈ R^m and y_i ≥ 0 for all i; y ∈ −(R^m_+) iff y ∈ R^m and y_i ≤ 0 for all i. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 6/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Lagrangian Function (not directly needed for exam) Theorem 3.1 (Min-Max Theorem) For L : (X × Y) → R we have max_{x ∈ X} min_{y ∈ Y} L(x; y) ≤ min_{y ∈ Y} max_{x ∈ X} L(x; y). Proof. min_{y ∈ Y} L(x; y) ≤ L(x; ȳ) for all ȳ ∈ Y, x ∈ X ⟹ max_{x ∈ X} min_{y ∈ Y} L(x; y) ≤ max_{x ∈ X} L(x; ȳ) for all ȳ ∈ Y ⟹ max_{x ∈ X} min_{y ∈ Y} L(x; y) ≤ min_{ȳ ∈ Y} max_{x ∈ X} L(x; ȳ). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 7/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Lagrangian Function (not directly needed for exam) Theorem 3.1 (Min-Max Theorem) For L : (X × Y) → R we have max_{x ∈ X} min_{y ∈ Y} L(x; y) ≤ min_{y ∈ Y} max_{x ∈ X} L(x; y). Example 1 Defining L : (R^n × R^m_+) → R as L(x; y) = c^T x + y^T(b − Ax) = b^T y + x^T(c − A^T y), we have max_x {c^T x : Ax ≤ b} = max_{x ∈ R^n} min_{y ∈ R^m_+} L(x; y) and min_{y ∈ R^m_+} max_{x ∈ R^n} L(x; y) = min_y {b^T y : A^T y = c, y ∈ R^m_+}. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 7/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Primal and dual problems Given fixed parameters A ∈ R^{m×n}, b ∈ R^m and c ∈ R^n: max c^T x s.t. Ax ≤ b, x ∈ R^n (LP_p); min b^T y s.t. A^T y = c, y ≥ 0, y ∈ R^m (LP_d). F_p := {x ∈ R^n : Ax ≤ b} is the feasible set of (LP_p), F_d := {y ∈ R^m : A^T y = c, y ≥ 0} is the feasible set of (LP_d), z*_p := max_{x ∈ F_p} c^T x is the optimal value of (LP_p), z*_d := min_{y ∈ F_d} b^T y is the optimal value of (LP_d), x ∈ F_p is an optimal solution of (LP_p) if c^T x = z*_p, y ∈ F_d is an optimal solution of (LP_d) if b^T y = z*_d. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 8/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Weak and strong duality Lemma 3.2 (Weak duality, [MO, Lm 3.1]) For all (x, y) ∈ F_p × F_d we have c^T x ≤ b^T y, and thus z*_p ≤ z*_d. Corollary 3.3 ([MO, Lm 3.1]) If (x, y) ∈ F_p × F_d with c^T x = b^T y then x and y are optimal solutions of (LP_p) and (LP_d) respectively. Theorem 3.4 (Strong Duality, [MO, Thm 3.1]) If both problems are feasible then z*_p = z*_d ∈ R; if the primal is feasible and the dual infeasible then z*_p = z*_d = +∞; if the primal is infeasible and the dual feasible then z*_p = z*_d = −∞; if both are infeasible then −∞ = z*_p < z*_d = +∞. If both (LP_p) and (LP_d) are feasible then there exist optimal solutions (x, y) ∈ F_p × F_d (satisfying c^T x = b^T y). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 9/23
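To make the duality pair concrete, here is a sketch using SciPy's linprog; the data A, b, c are made up. The primal is solved as min −c^T x, the dual exactly as in (LP_d), and strong duality says the two optimal values agree.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical data for max c^T x s.t. Ax <= b  (LP_p)
A = np.array([[1.0, 2.0], [3.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
b = np.array([4.0, 6.0, 0.0, 0.0])
c = np.array([3.0, 2.0])

# Primal: linprog minimises, so minimise -c^T x subject to Ax <= b.
primal = linprog(-c, A_ub=A, b_ub=b, bounds=[(None, None)] * 2, method="highs")

# Dual (LP_d): min b^T y  s.t.  A^T y = c,  y >= 0.
dual = linprog(b, A_eq=A.T, b_eq=c, bounds=[(0, None)] * len(b), method="highs")

z_p = -primal.fun
z_d = dual.fun
print(z_p, z_d)          # strong duality: the two optimal values coincide
```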

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Complementarity For (x, y) ∈ R^n × R^m, these are optimal solutions of (LP_p) and (LP_d) resp. if and only if they solve the following system: Ax ≤ b (1), A^T y = c (2), c^T x − b^T y = 0 (3), y ≥ 0 (4). Note that for x, y solving this system we have 0 = b^T y − c^T x = y^T(b − Ax) = Σ_{i=1}^m y_i (b − Ax)_i with y_i ≥ 0 and (b − Ax)_i ≥ 0. Theorem 3.5 (Complementarity condition, [MO, Eq (3.12)]) If (1), (2), (4) hold then (3) ⟺ y^T(b − Ax) = 0 ⟺ y_i (b_i − [Ax]_i) = 0 for all i. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 10/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Equivalent LP's, example What is the dual to max_x c^T x s.t. Ax ≤ b, x ≥ 0? (a) min_y b^T y s.t. A^T y = c, y ≥ 0, (b) min_y b^T y s.t. A^T y ≥ c, y ≥ 0, (c) min_y b^T y s.t. A^T y = c, (d) min_y b^T y s.t. A^T y ≥ c, (e) Other. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 11/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Equivalent LP's, [MO, 3.1.2] Example 2 Primal: max_x c^T x s.t. Ax ≤ b, x ∈ R^n. Dual: min_y b^T y s.t. y ≥ 0, A^T y = c. Primal: max_x c^T x s.t. Ax ≤ b, x ≥ 0. Dual: min_y b^T y s.t. y ≥ 0, A^T y ≥ c. Primal: max_x c^T x s.t. Ax = b, x ≥ 0. Dual: min_y b^T y s.t. y ∈ R^m, A^T y ≥ c. Primal: max_x c^T x s.t. Ax = b, x ∈ R^n. Dual: min_y b^T y s.t. y ∈ R^m, A^T y = c. Rules for primal dual pairs: Primal problem (max): free variable ↔ Dual problem (min): equality constraint; nonnegative variable ↔ ≥ constraint; equality constraint ↔ free variable; ≤ constraint ↔ nonnegative variable. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 12/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Table of Contents 1 Introduction 2 Recap 3 Primal and dual problems [MO, 3.1.1-2] 4 Shadow prices [MO, 3.1.3] Problem Shadow prices 5 Matrix Games [MO, 3.1.4] 6 Algorithms Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 13/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Shadow prices, [MO, 3.1.3] A factory makes products P_1, ..., P_n from resources R_1, ..., R_m. x_i: amount of P_i you choose to produce; c_i: profit per unit of P_i; a_{ji}: units of R_j required per unit of P_i; b_j: units of R_j available. max_x c^T x s.t. Ax ≤ b, x ≥ 0; min_y b^T y s.t. A^T y ≥ c, y ≥ 0. Assume the problem is feasible and the optimal value z*_0 is finite. Let y* be the optimal solution of the dual problem. How much would you pay to get more resources? Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 14/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Shadow prices, [MO, 3.1.3] How much would you pay to get more resources? max_x c^T x s.t. Ax ≤ b + t, x ≥ 0; min_y (b + t)^T y s.t. A^T y ≥ c, y ≥ 0. Lemma 3.6 Letting z*_t be their common optimal value, we have z*_t ≤ z*_0 + t^T y*. Corollary 3.7 (Shadow Price) Obtain higher profit only if the price per unit for R_j is smaller than y*_j. NB: If y* is the unique dual optimal solution, then there exists ε > 0 such that z*_t = z*_0 + t^T y* for all t ∈ [−ε, ε]^m. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 15/23
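A small numeric illustration with made-up production data, assuming SciPy: the dual optimal y* prices one extra unit of a resource, matching the increase of the optimal value for a small perturbation t.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical production data: 2 products, 2 resources.
A = np.array([[2.0, 1.0], [1.0, 3.0]])   # a_ji: units of R_j per unit of P_i
b = np.array([8.0, 9.0])                  # available resources
c = np.array([3.0, 2.0])                  # profit per unit

def z(b_vec):
    # max c^T x s.t. Ax <= b_vec, x >= 0  (linprog minimises -c^T x)
    res = linprog(-c, A_ub=A, b_ub=b_vec, bounds=[(0, None)] * 2, method="highs")
    return -res.fun

# Dual: min b^T y s.t. A^T y >= c, y >= 0  (rewritten as -A^T y <= -c).
dual = linprog(b, A_ub=-A.T, b_ub=-c, bounds=[(0, None)] * 2, method="highs")
y_star = dual.x                           # shadow prices y*_j

t = np.array([1.0, 0.0])                  # one extra unit of resource R_1
print(z(b + t) - z(b), y_star[0])         # marginal value vs. shadow price
```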

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Table of Contents 1 Introduction 2 Recap 3 Primal and dual problems [MO, 3.1.1-2] 4 Shadow prices [MO, 3.1.3] 5 Matrix Games [MO, 3.1.4] Pure strategy Mixed strategies Nash Equilibrium 6 Algorithms Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 16/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Matrix Games, [MO, 3.1.4] Matrix game is example of non-cooperative two player game. Payout matrix A R m n. Have two players R (rows) and C (columns). Players make one move simultaneously R chooses row i and C chooses column j R wins a ij from C. Pure strategy: Moves deterministic. May be no Nash-Equilibrium, i.e. no stable public strategies. Example 3 ( ) R chooses 1 C chooses 2 1 4 3 A = 2 3 5 C chooses 1 R chooses 2 C would never choose 3 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 17/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Mixed strategies, [MO, 3.1.4] Game R chooses row i with probability x; x i 0, i x i = 1. C chooses col. j with probability y; y i 0, i y i = 1. Expected payment to R from C is x T Ay. Given y: R plays solution of max x n x T Ay ( = max i (Ay) i ) Given x: C plays solution of min y m x T Ay ( = min j (A T x) j ) Example 4 A = Given y: ( 1 4 3 2 3 5 R plays x = ) 1/3, y = 1/3. 1/3 max x n x T Ay = max x n ( ) 0. 1 ( x1 x 2 ) T ( ) 0 = 2, 2 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 18/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Mixed strategies, [MO, 3.1.4] Game R chooses row i with probability x; x i 0, i x i = 1. C chooses col. j with probability y; y i 0, i y i = 1. Expected payment to R from C is x T Ay. Given y: R plays solution of max x n x T Ay ( = max i (Ay) i ) Given x: C plays solution of min y m x T Ay ( = min j (A T x) j ) Example 4 A = Given y: Given x: ) ( ) 7/10 1/2, x =, y = 3/10 1/2 0 ( ) T ( ) max x n x T x1 1/2 Ay = max x n = 1/2, x 2 1/2 T 1/2 y 1 min y m x T Ay = min y m 1/2 y 2 = 1/2. 4 y 3 ( 1 4 3 2 3 5 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 18/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Nash Equilibrium, [MO, 3.1.4] Assume your opponent plays the best possible strategy against you. For R: max_{x ∈ Δ_m} min_{y ∈ Δ_n} x^T A y. For C: min_{y ∈ Δ_n} max_{x ∈ Δ_m} x^T A y. (Here Δ_k denotes the probability vectors in R^k.) Lemma 3.8 max_{x ∈ F} min_{y ∈ G} f(x, y) ≤ min_{y ∈ G} max_{x ∈ F} f(x, y). Theorem 3.9 (minmax-theorem, [MO, Thm 3.2]) There exist feasible x̄, ȳ such that x̄^T A ȳ = max_{x ∈ Δ_m} x^T A ȳ = min_{y ∈ Δ_n} x̄^T A y = max_{x ∈ Δ_m} min_{y ∈ Δ_n} x^T A y = min_{y ∈ Δ_n} max_{x ∈ Δ_m} x^T A y. x̄, ȳ are a Nash equilibrium of the mixed strategy matrix game. We say a game is fair if x̄^T A ȳ = 0. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 19/23
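The row player's maximin strategy can be computed by a linear program (this is the standard LP formulation, not spelled out on the slide): maximise v subject to (A^T x)_j ≥ v for all columns j and x in the simplex. The payout matrix entries below are assumed purely for illustration, since the signs in the slides' Example 3 are not recoverable from the transcript; SciPy is assumed available.

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 4.0, -3.0],
              [2.0, -3.0, 5.0]])   # hypothetical 2x3 payout matrix
m, n = A.shape

# Variables (x_1, ..., x_m, v): maximise v s.t. v - (A^T x)_j <= 0, 1^T x = 1, x >= 0.
A_ub = np.hstack([-A.T, np.ones((n, 1))])
b_ub = np.zeros(n)
A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])
b_eq = np.array([1.0])
bounds = [(0, None)] * m + [(None, None)]            # x >= 0, v free
res = linprog(np.r_[np.zeros(m), -1.0], A_ub=A_ub, b_ub=b_ub,
              A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
x_star, game_value = res.x[:m], -res.fun
print(x_star, game_value)      # optimal mixed strategy for R and the game value
```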

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Table of Contents 1 Introduction 2 Recap 3 Primal and dual problems [MO, 3.1.1-2] 4 Shadow prices [MO, 3.1.3] 5 Matrix Games [MO, 3.1.4] 6 Algorithms Simplex Algorithm Interior Point Method Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 20/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Simplex Algorithm F_p = {x ∈ R^n : Ax ≤ b} is a polyhedron. In the simplex method we move from vertex to vertex of F_p, continually improving c^T x until we can improve it no more, and are thus at an optimal solution. Alternatively, in the dual simplex method we move from vertex to vertex of F_d. Mathematically equivalent, but in practice the dual simplex method tends to work better. Need to choose which vertex to travel to. Methods are available which work well in practice, although there always seem to be some nasty examples remaining. (Exponential) Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 21/23

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Interior Point Method Consider the following system in variables (x, y, s) ∈ R^n × R^m × R^m: Ax + s = b (m equalities), A^T y = c (n equalities), y_i s_i = ε for all i (m equalities), y, s ≥ 0. (x, y) is an optimal solution to (LP_p), (LP_d) resp. iff there exists s such that (x, y, s) is a solution to this system with ε = 0. Using Newton's method, we attempt to solve the system (excluding the inequalities) with ε > 0, decreasing ε towards zero as we go. The interior point method has better worst case behaviour (polynomial). Comparison in practice is a matter of debate. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 22/23
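A toy NumPy sketch of the scheme just described (not the course's implementation, and with no claim of robustness): assemble the Newton system for the residuals of Ax + s = b, A^T y = c, y_i s_i = ε, take a damped step that keeps y, s strictly positive, and shrink ε.

```python
import numpy as np

def ipm_lp(A, b, c, eps=1.0, max_iter=50):
    """Toy primal-dual interior point sketch for max c^T x s.t. Ax <= b."""
    m, n = A.shape
    x, y = np.zeros(n), np.ones(m)
    s = np.where(b > 1e-3, b, 1.0)                       # crude positive start
    for _ in range(max_iter):
        r_p = A @ x + s - b                              # primal residual
        r_d = A.T @ y - c                                # dual residual
        r_c = y * s - eps                                # relaxed complementarity
        J = np.block([
            [A,                np.zeros((m, m)), np.eye(m)],
            [np.zeros((n, n)), A.T,              np.zeros((n, m))],
            [np.zeros((m, n)), np.diag(s),       np.diag(y)],
        ])
        d = np.linalg.solve(J, -np.concatenate([r_p, r_d, r_c]))
        dx, dy, ds = d[:n], d[n:n + m], d[n + m:]
        alpha = 1.0                                      # damped step, keep y, s > 0
        for v, dv in ((y, dy), (s, ds)):
            neg = dv < 0
            if neg.any():
                alpha = min(alpha, 0.9 * np.min(-v[neg] / dv[neg]))
        x, y, s = x + alpha * dx, y + alpha * dy, s + alpha * ds
        eps *= 0.2                                       # drive the relaxation to zero
    return x, y
```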

Introduction Recap Primal and dual problems Shadow prices Matrix Games Algorithms Table of Contents 1 Introduction 2 Recap 3 Primal and dual problems [MO, 3.1.1-2] 4 Shadow prices [MO, 3.1.3] 5 Matrix Games [MO, 3.1.4] 6 Algorithms Peter J.C. Dickinson http://dickinson.website MO18, Chpt 3: Linear Optimisation 23/23

Introduction Convex sets Convex functions Reduction to R n variables Mathematical Optimisation, Chpt 4: Convexity Peter J.C. Dickinson p.j.c.dickinson@utwente.nl http://dickinson.website version: 09/04/18 Monday 5th March 2018 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 1/30

Introduction Convex sets Convex functions Reduction to R n variables Table of Contents 1 Introduction 2 Convex sets [MO, 4.1] 3 Convex functions [MO, 4.2] 4 Reduction to R [MO, 4.2.1] 5 n variables [MO, 4.2.2] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 2/30

Introduction Convex sets Convex functions Reduction to R n variables Table of Contents 1 Introduction 2 Convex sets [MO, 4.1] Recall Convex sets Hyperplanes and halfspaces Ellipsoid method Convex cones 3 Convex functions [MO, 4.2] 4 Reduction to R [MO, 4.2.1] 5 n variables [MO, 4.2.2] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 3/30

Introduction Convex sets Convex functions Reduction to R n variables Recall Definition 4.1 A ⊆ R^n is closed if for all convergent sequences {x_i : i ∈ N} ⊆ A we have lim_{i→∞} x_i ∈ A. A ⊆ R^n is bounded if there exists R ∈ R such that ‖x‖_2 ≤ R for all x ∈ A. A ⊆ R^n is compact if it is closed and bounded. f : R^n → R is a continuous function if f(c) = lim_{x→c} f(x) for all c ∈ R^n. Lemma 4.2 If A is a compact set and {x_i : i ∈ N} ⊆ A, then there is a convergent subsequence {x_{i_j} : j ∈ N}, i.e. i_{j+1} > i_j ∈ N for all j ∈ N, and lim_{j→∞} x_{i_j} is well defined. Ex. 4.1 Show that if f : R^n → R is a continuous function and A ⊆ R^n is a nonempty compact set, then the minimum and maximum of f over A are attained. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 4/30

Introduction Convex sets Convex functions Reduction to R n variables Convex sets Definition 4.3 A ⊆ R^n is a convex set if for all x, y ∈ A and θ ∈ [0, 1] we have (1 − θ)x + θy ∈ A. NB: If a set is not convex then it is called a nonconvex set. There is no such thing as a concave set! Ex. 4.2 Show that the intersection of (a) closed sets is closed; (b) convex sets is convex; (c) a bounded set with any other sets is bounded. Corollary 4.4 The intersection of compact sets is compact. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 5/30

Introduction Convex sets Convex functions Reduction to R n variables [Figure: six example subsets of R^2, defined by linear and quadratic inequalities and classified as (a) Nonconvex, (b) Convex, (c) Convex, (d) Convex, (e) Convex {x ∈ R^2 : x_1^2 ≤ x_2}, (f) Nonconvex {x ∈ R^2 : x_1^2 ≥ x_2}] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 6/30

Introduction Convex sets Convex functions Reduction to R n variables Hyperplanes and halfspaces Definition 4.5 H ⊆ R^n is a hyperplane if there exist a ∈ R^n \ {0} and α ∈ R such that H = {x ∈ R^n : a^T x = α}. Ĥ ⊆ R^n is a closed halfspace if there exist a ∈ R^n \ {0} and α ∈ R such that Ĥ = {x ∈ R^n : a^T x ≤ α}. Ex. 4.3 Prove that hyperplanes and closed halfspaces are closed convex sets. Corollary 4.6 The intersection of (infinitely many) halfspaces and hyperplanes is a closed convex set. Ex. 4.4 Show that the feasible sets F_p and F_d from the previous lecture (slide 9) are closed convex sets. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 7/30

Introduction Convex sets Convex functions Reduction to R n variables Hyperplanes and halfspaces Definition 4.7 The hyperplane H = {x ∈ R^n : a^T x = α} (with a ∈ R^n \ {0}, α ∈ R) is a separating hyperplane w.r.t. A ⊆ R^n and c ∈ R^n \ A if a^T x ≤ α < a^T c for all x ∈ A. Theorem 4.8 ([MO, Lm 4.1 & Cor 4.1]) For A ⊆ R^n, the following are equivalent: 1 A is an intersection of closed halfspaces. 2 For all c ∈ R^n \ A there exists a separating hyperplane w.r.t. A and c. 3 A is a closed convex set. Ex. 4.6 Prove (2) ⟹ (1) ⟹ (3). Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 8/30

Introduction Convex sets Convex functions Reduction to R n variables Proof (3) ⟹ (2) Let A ⊆ R^n be a closed convex set and let c ∈ R^n \ A. We will show there exist (a, α) ∈ R^n × R s.t. a^T x ≤ α < a^T c for all x ∈ A. 1 Let b ∈ arg min_z {‖c − z‖_2 : z ∈ A}. 2 Let a = c − b ∈ R^n \ {0}, α = a^T b. 3 a^T c = a^T(a + b) = ‖a‖_2^2 + α > α. 4 Assume for the sake of contradiction that there is y ∈ A such that β = a^T y − α > 0. 5 For θ ∈ [0, 1] we have x_θ = θy + (1 − θ)b ∈ A, and ‖x_θ − c‖_2^2 = ‖b − c‖_2^2 − 2θβ + θ^2 ‖y − b‖_2^2 < ‖b − c‖_2^2 for θ > 0 small enough, contradicting the choice of b. http://ggbtu.be/m2640679 Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 9/30


Introduction Convex sets Convex functions Reduction to R n variables Supporting hyperplanes Definition 4.9 The hyperplane H = {x ∈ R^n : a^T x = α} (with a ∈ R^n \ {0}, α ∈ R) is a supporting hyperplane of A ⊆ R^n at c ∈ A if a^T x ≤ α = a^T c for all x ∈ A. Theorem 4.10 ([MO, Thm 4.2]) For a closed convex set A and c ∈ A we have that c ∈ bd(A) if and only if there is a supporting hyperplane of A at c. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 10/30

Introduction Convex sets Convex functions Reduction to R n variables Ellipsoid method For a compact convex set A consider the problem min c T x s. t. x A. If we can check whether y A efficiently, then this problem can be solved efficiently using the ellipsoid method. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 11/30

Definition 4.11 K ⊆ R^n is a cone if R_{++} K ⊆ K. [Figure: the six example sets from before, now classified as (a) Cone, (b) Cone, (c) Not cone, (d) Cone, (e) Not cone, (f) Not cone]

Introduction Convex sets Convex functions Reduction to R n variables Convex cones Definition 4.12 K ⊆ R^n is a cone if R_{++} K ⊆ K, i.e. λx ∈ K for all x ∈ K and λ > 0. Theorem 4.13 K ⊆ R^n is a convex cone if and only if x, y ∈ K, λ_1, λ_2 > 0 ⟹ λ_1 x + λ_2 y ∈ K. e.g.: Some closed convex cones: Nonnegative vectors, R^n_+ = {x ∈ R^n : x ≥ 0}; Symmetric matrices, S^n = {X ∈ R^{n×n} : X = X^T}; Positive semidefinite cone, PSD_n = {X ∈ S^n : v^T X v ≥ 0 for all v ∈ R^n} = {X ∈ S^n : X ⪰ 0}. e.g.: Semidefinite Optimisation: max c^T x s.t. B − Σ_{i=1}^n A_i x_i ⪰ 0. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 13/30

Introduction Convex sets Convex functions Reduction to R n variables Table of Contents 1 Introduction 2 Convex sets [MO, 4.1] 3 Convex functions [MO, 4.2] Epigraph Formal definition Examples Convex Hull 4 Reduction to R [MO, 4.2.1] 5 n variables [MO, 4.2.2] Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 14/30

Introduction Convex sets Convex functions Reduction to R n variables Epigraphs Definition 4.14 For a function f : A R, let the epigraph epi(f ) A R be given by epi(f ) = {(x, z) A R z f (x)}. e.g.: f (x) = x sin(x) {(x, z) z = f (x)} Theorem 4.15 ([MO, Ex 4.6]) f is a convex function if and only if epi(f ) is a convex set. Corollary 4.16 f : A R convex {x A f (x) a} convex for all a R. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 15/30

Introduction Convex sets Convex functions Reduction to R n variables Epigraphs Definition 4.14 For a function f : A → R, let the epigraph epi(f) ⊆ A × R be given by epi(f) = {(x, z) ∈ A × R : z ≥ f(x)}. e.g.: f(x) = x sin(x), epi(f). Theorem 4.15 ([MO, Ex 4.6]) f is a convex function if and only if epi(f) is a convex set. Corollary 4.16 f : A → R convex ⟹ {x ∈ A : f(x) ≤ a} convex for all a ∈ R. Peter J.C. Dickinson http://dickinson.website MO18, Chpt 4: Convexity 15/30

Formal definition
Definition 4.17
A function f : A → R is defined to be a convex function if A is a convex set and for all x, y ∈ A, θ ∈ [0, 1] we have f(θx + (1 − θ)y) ≤ θf(x) + (1 − θ)f(y).
f : A → R is strictly convex if A is a convex set and for all x, y ∈ A, θ ∈ (0, 1) with x ≠ y we have f(θx + (1 − θ)y) < θf(x) + (1 − θ)f(y).
Ex. 4.7
Prove Theorem 4.15.
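A small sampling-based check of Definition 4.17 (illustrative only; a finite sample can refute convexity but never prove it):

import numpy as np

def seems_convex(f, xs, thetas=np.linspace(0.0, 1.0, 21), tol=1e-12):
    # tests f(th*x + (1-th)*y) <= th*f(x) + (1-th)*f(y) on sampled points
    for x in xs:
        for y in xs:
            for th in thetas:
                if f(th * x + (1 - th) * y) > th * f(x) + (1 - th) * f(y) + tol:
                    return False
    return True

xs = np.linspace(-3.0, 3.0, 31)
print(seems_convex(np.exp, xs))                 # True: exp is convex on R
print(seems_convex(lambda x: x ** 3 - x, xs))   # False: x^3 - x is not convex on [-3, 3]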

Examples
Affine functions (i.e. f(x) = a^T x + α) are convex.
f(x) = x², x⁴, exp(x), |x| are convex on R.
f(x) = x³ − x is convex on R_+, but not on [−1, 1].
f(x) = ax² + bx + c is convex on R if and only if a ≥ 0.
f(x) = x² if x ∈ [−1, 1), f(x) = a if x = 1, is convex iff a ≥ 1.
f(x) = ‖x‖ is convex on R^n for any (semi)norm.
http://ggbtu.be/m2729175

Convex Hull
Definition 4.18
For r ∈ N, a₁, ..., a_r ∈ R^n, θ₁, ..., θ_r ≥ 0 with Σ_{i=1}^r θ_i = 1, we say that v = Σ_{i=1}^r θ_i a_i is a convex combination of the a_i's.
Definition 4.19
The convex hull of A is the set of all convex combinations of vectors in A:
conv A = { Σ_{i=1}^r θ_i a_i | r ∈ N, a₁, ..., a_r ∈ A, θ₁, ..., θ_r ≥ 0, Σ_{i=1}^r θ_i = 1 }.
Theorem 4.20
conv A is the smallest convex set containing A.

Important auxiliary result
Ex. 4.8
Show for A ⊆ R^n and f : A → R that
(a) A is convex if and only if for all r ∈ N, a₁, ..., a_r ∈ A, θ₁, ..., θ_r ≥ 0 with Σ_{i=1}^r θ_i = 1 we have Σ_{i=1}^r θ_i a_i ∈ A.
(b) f is convex if and only if for all r ∈ N, a₁, ..., a_r ∈ A, θ₁, ..., θ_r ≥ 0 with Σ_{i=1}^r θ_i = 1 we have Σ_{i=1}^r θ_i a_i ∈ A and f(Σ_{i=1}^r θ_i a_i) ≤ Σ_{i=1}^r θ_i f(a_i).
Corollary 4.21 ([MO, Thm 4.3])
Let A = conv{a_i : i ∈ I} ⊆ R^n and consider a convex function f : A → R. Then max_{x∈A} f(x) = max_{i∈I} f(a_i).
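Corollary 4.21 can be observed numerically (a sketch; the generators and the convex function below are arbitrary illustrative choices): random convex combinations of the generators never exceed the maximum of f over the generators themselves.

import numpy as np

rng = np.random.default_rng(2)
verts = rng.standard_normal((6, 2))              # A = conv{a_1, ..., a_6} in R^2
f = lambda x: np.linalg.norm(x) ** 2 + x[0]      # a convex function on R^2

W = rng.random((5000, 6))
W /= W.sum(axis=1, keepdims=True)                # rows are convex-combination weights
samples = W @ verts                              # points of conv A
print(max(f(x) for x in samples) <= max(f(v) for v in verts) + 1e-9)   # True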

Table of Contents
1 Introduction
2 Convex sets [MO, 4.1]
3 Convex functions [MO, 4.2]
4 Reduction to R [MO, 4.2.1]
  Reduction to R
  Differentiation
  Second Derivative
  Continuity
5 n variables [MO, 4.2.2]

Reduction to R
Lemma 4.22 ([MO, Lm 4.2])
f : A → R is convex if and only if for every x₀ ∈ A and d ∈ R^n, the function p_{x₀,d}(t) := f(x₀ + td) is a convex function of t on the interval L = A_d(x₀) = {t ∈ R | x₀ + td ∈ A}.
Ex. 4.9
Prove Lemma 4.22.
e.g.: For A ∈ S^n, consider f(x) = x^T A x + a^T x + α. We have p_{x₀,d}(t) = t² d^T A d + t d^T(2A x₀ + a) + f(x₀), which is convex in t for fixed x₀, d ∈ R^n iff d^T A d ≥ 0. Therefore f is convex iff d^T A d ≥ 0 for all d ∈ R^n, or equivalently A ⪰ 0.
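This example can be checked directly in code (illustrative, not from [MO]): for an indefinite A, restricting f to a line with d^T A d < 0 already violates convexity.

import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, -1.0]])                     # indefinite, so f should not be convex
a, alpha = np.array([1.0, 1.0]), 0.5
f = lambda x: x @ A @ x + a @ x + alpha

x0 = np.zeros(2)
d = np.array([0.0, 1.0])                        # direction with d^T A d = -1 < 0
p = lambda t: f(x0 + t * d)                     # p_{x0,d}(t)
# the midpoint inequality fails, so p (and hence f) is not convex:
print(p(0.0) <= 0.5 * p(-1.0) + 0.5 * p(1.0))   # False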

Common Error
Corollary 4.23
For a convex function f : R^n → R we have that for every x₀ ∈ R^n and i ∈ {1, ..., n}, the function p_{x₀,e_i}(t) = f(x₀ + t e_i) is a convex function of t ∈ R.
e.g.: The converse is false. Consider the function f : R² → R given by f(x) = x₁x₂. For every x₀ ∈ R² and i ∈ {1, 2}, the function p_{x₀,e_i}(t) is an affine function of t, and thus convex. However
½ f(1, −1) + ½ f(−1, 1) = −1 < 0 = f(½(1, −1) + ½(−1, 1)),
and thus f is not convex on R².
https://ggbm.at/use2n8v2
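The same failure can be reproduced in a few lines (illustrative):

import numpy as np

f = lambda x: x[0] * x[1]
x, y = np.array([1.0, -1.0]), np.array([-1.0, 1.0])
mid = 0.5 * x + 0.5 * y                          # = (0, 0)
print(0.5 * f(x) + 0.5 * f(y), f(mid))           # -1.0 and 0.0: the convexity inequality fails
# yet along each coordinate direction, t -> f(x0 + t*e_i) is affine, hence convex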

Left and right derivatives
Lemma 4.24 ([MO, Lm 4.3])
For convex f : (a, b) → R and x₀ ∈ (a, b) we have that
φ_{x₀}(t) := (f(x₀ + t) − f(x₀)) / t,  t ∈ (a − x₀, b − x₀) \ {0},
is monotonically increasing in t. Thus
f'_-(x₀) := lim_{t→0⁻} φ_{x₀}(t) ≤ lim_{t→0⁺} φ_{x₀}(t) =: f'_+(x₀),
and f'_-(x₀), f'_+(x₀) ∈ R.
Remark
A convex function f : (a, b) → R need not be differentiable at all x ∈ (a, b), e.g. for f(x) = |x| we have f'_-(0) = −1 < 1 = f'_+(0).
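Lemma 4.24 and the remark can be seen numerically for f(x) = |x| at x₀ = 0 (illustrative):

import numpy as np

f = np.abs
x0 = 0.0
ts = np.array([-1.0, -0.5, -0.1, 0.1, 0.5, 1.0])
phi = (f(x0 + ts) - f(x0)) / ts        # difference quotients phi_{x0}(t)
print(phi)                             # [-1. -1. -1.  1.  1.  1.]: monotonically increasing in t
# f'_-(0) = -1 <= 1 = f'_+(0), so |x| is convex but not differentiable at 0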

Subderivative
Theorem 4.25 ([MO, Eq (4.1)])
Let f : (a, b) → R be convex and differentiable at x ∈ (a, b). Then f(y) ≥ f(x) + f'(x)(y − x) for all y ∈ (a, b).
Definition 4.26
For f : (a, b) → R, we call d_x ∈ R a subderivative of f at x ∈ (a, b) if f(y) ≥ f(x) + d_x(y − x) for all y ∈ (a, b). The set of all subderivatives of f at x is the subdifferential, denoted by ∂f(x).
Ex. 4.10
Show that if f : (a, b) → R is convex then for all x ∈ (a, b) we have ∂f(x) = {d ∈ R | f'_-(x) ≤ d ≤ f'_+(x)}.

Subderivative
Theorem 4.27 ([MO, Thm 4.6])
Let f : (a, b) → R. Then f is convex ⇔ ∂f(x) ≠ ∅ for all x ∈ (a, b).
Ex. 4.11
Prove Theorem 4.27.
Corollary 4.28
Let f ∈ C¹. Then f is convex ⇔ f(y) ≥ f(x) + f'(x)(y − x) for all x, y.

Second Derivative
Theorem 4.29 ([MO, Thm 4.7])
Let f : (a, b) → R be differentiable. Then f is convex ⇔ f'(x) is monotonically increasing on (a, b).
Corollary 4.30 ([MO, Cor 4.2])
Let f : (a, b) → R be twice differentiable. Then f is convex ⇔ f''(x) ≥ 0 for all x ∈ (a, b).
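Corollary 4.30 applied to the earlier example f(x) = x³ − x (a quick symbolic check; SymPy is used here only for differentiation):

import sympy as sp

x = sp.symbols('x')
f = x ** 3 - x
fpp = sp.diff(f, x, 2)                    # second derivative: 6*x
print(fpp)                                # 6*x >= 0 on R_+, so f is convex there
print(fpp.subs(x, -sp.Rational(1, 2)))    # -3 < 0, so f is not convex on [-1, 1]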

Lipschitz Continuous
Definition 4.31
f : R^n → R is Lipschitz continuous at x₀ if there exist ε, L > 0 such that |f(x) − f(x₀)| ≤ L‖x − x₀‖ for all x ∈ U_ε(x₀), where U_ε(x₀) := {x ∈ R^n | ‖x − x₀‖ ≤ ε}.
http://ggbtu.be/m2727987
Theorem 4.32
Let f : (a, b) → R be convex. Then f is Lipschitz continuous at all x₀ ∈ (a, b).
e.g.: A convex function f : A → R need not be continuous at its boundary, e.g. the following function on [0, 1]: f(x) = 0 if x ∈ [0, 1), f(x) = 1 if x = 1.

Table of Contents
1 Introduction
2 Convex sets [MO, 4.1]
3 Convex functions [MO, 4.2]
4 Reduction to R [MO, 4.2.1]
5 n variables [MO, 4.2.2]
  Subgradient
  Differentiability and continuity

Subgradient
Definition 4.33 ([MO, Eq (4.3)])
For f : A → R, we call d_x ∈ R^n a subgradient of f at x ∈ A if f(y) ≥ f(x) + d_x^T(y − x) for all y ∈ A. The set of all subgradients of f at x is the subdifferential, denoted by ∂f(x).
Theorem 4.34 ([MO, Thm 4.8])
For f : A → R with A an open convex set, we have: f is convex ⇔ ∂f(x) ≠ ∅ for all x ∈ A.
Theorem 4.35
For f : A → R convex, x ∈ A and d ∈ ∂f(x) we have {y ∈ A | f(y) ≤ f(x)} ⊆ {y | d^T y ≤ d^T x}.
Ex. 4.12
Show that if f : A → R is convex with A open, then ∂f(x) is a nonempty convex compact set for all x ∈ A.
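Definition 4.33 for the Euclidean norm f(x) = ‖x‖_2 (a numerical spot check, not from [MO]; the sampled points are arbitrary): every d₀ with ‖d₀‖_2 ≤ 1 is a subgradient at 0, and at x ≠ 0 the gradient x/‖x‖_2 is one.

import numpy as np

rng = np.random.default_rng(4)
f = np.linalg.norm
Y = rng.standard_normal((2000, 2))                          # test points y

d0 = np.array([0.3, -0.4])                                  # ||d0||_2 <= 1
print(np.all(f(Y, axis=1) >= Y @ d0 - 1e-12))               # f(y) >= f(0) + d0^T (y - 0): True

x = np.array([1.0, 2.0])
dx = x / f(x)                                               # subgradient (= gradient) at x != 0
print(np.all(f(Y, axis=1) >= f(x) + (Y - x) @ dx - 1e-12))  # True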

Differentiability and continuity
Corollary 4.36 (From Corollary 4.28, [MO, Thm 4.8])
Let f ∈ C¹. Then f is convex ⇔ f(y) ≥ f(x) + ∇f(x)^T(y − x) for all x, y.
Corollary 4.37 (From Corollary 4.30, [MO, Cor 4.3])
Let f : A → R be twice differentiable, with A convex. Then f is convex ⇔ ∇²f(x) ⪰ 0 for all x ∈ A.
Theorem 4.38 ([MO, Thm 4.9])
Let f : A → R be convex. Then f is continuous on int A. In fact f is Lipschitz continuous at all x₀ ∈ int A.

Mathematical Optimisation, Chpt 5: Unconstrained Optimisation
Peter J.C. Dickinson
p.j.c.dickinson@utwente.nl
http://dickinson.website
version: 09/04/18
Wednesday 7th March 2018

Table of Contents
1 Introduction [MO, 5.1]
  Definitions
  Descent methods
  Necessary Conditions and Sufficient Conditions
  Solving necessary conditions
2 Descent Methods [MO, 5.2 & 5.4]
3 Minimisation of f without derivatives [MO, 5.8]
4 Method of conjugate directions [MO, 5.3]
5 Newton's method [MO, 5.5–7]

Minimisation problem: Given F ⊆ R^n and f : F → R,
min_{x ∈ F} f(x)   (P)
(P) is an unconstrained program if F is open, e.g., F = R^n.
Def. x̄ ∈ F is a global minimiser of f (over F) if f(x̄) ≤ f(x) for all x ∈ F.
x̄ ∈ F is a local minimiser of f if there is an ε > 0 such that f(x̄) ≤ f(x) for all x ∈ F with ‖x − x̄‖ ≤ ε,
and a strict local minimiser if with an ε > 0: f(x̄) < f(x) for all x ∈ F with x ≠ x̄, ‖x − x̄‖ ≤ ε.
Rem. In nonlinear (nonconvex) optimisation we usually mean: find a local minimiser. Global minimisation is more difficult.

CONCEPTUAL ALGORITHM: Choose x₀ ∈ R^n. Iterate:
step k: Given x_k ∈ R^n, find a new point x_{k+1} with f(x_{k+1}) < f(x_k).
We want: x_k → x̄ with x̄ a local minimiser.
Definition
Let x_k → x̄ for k → ∞. The sequence (x_k) is:
linearly convergent if with a constant 0 ≤ C < 1 and some K ∈ N: ‖x_{k+1} − x̄‖ ≤ C ‖x_k − x̄‖ for all k ≥ K. C is called the convergence factor.
quadratically convergent if with a constant c ≥ 0: ‖x_{k+1} − x̄‖ ≤ c ‖x_k − x̄‖² for all k ∈ N.
superlinearly convergent if lim_{k→∞} ‖x_{k+1} − x̄‖ / ‖x_k − x̄‖ = 0.
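The three notions can be told apart by looking at error ratios (a toy illustration with synthetic error sequences, not an algorithm from [MO]):

lin = [2.0 * 0.5 ** k for k in range(10)]     # ||x_{k+1} - x*|| = 0.5 ||x_k - x*||: linear, C = 0.5
quad = [0.5]
for _ in range(5):
    quad.append(quad[-1] ** 2)                # ||x_{k+1} - x*|| = ||x_k - x*||^2: quadratic, c = 1

print([lin[k + 1] / lin[k] for k in range(9)])          # ratios constant at 0.5
print([quad[k + 1] / quad[k] ** 2 for k in range(5)])   # ratios constant at 1.0
print(quad)                                             # 0.5, 0.25, 0.0625, ...: also superlinear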

Geometry of min f(x): For f ∈ C¹(R^n, R) consider the level set N_α = {x | f(x) = α} (some α ∈ R) and a point x̄ ∈ N_α with ∇f(x̄) ≠ 0. Then:
In a neighbourhood of x̄ the solution set N_α is a C¹-manifold of dimension n − 1, and at x̄ we have ∇f(x̄) ⊥ N_α, i.e., ∇f(x̄) is perpendicular to N_α and points into the direction of maximal increase of f.
Notation: In this Chapter 5 of the sheets the gradient ∇f(x) is always a column vector!
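A finite-difference check of this geometry (illustrative; the quadratic f below is an arbitrary smooth choice):

import numpy as np

f = lambda x: x[0] ** 2 + 3.0 * x[1] ** 2
grad = lambda x: np.array([2.0 * x[0], 6.0 * x[1]])

x = np.array([1.0, 1.0])               # point on the level set N_alpha with alpha = f(x) = 4
g = grad(x)
t = np.array([-g[1], g[0]])            # tangent direction of N_alpha at x (rotate g by 90 degrees)
t /= np.linalg.norm(t)
h = 1e-6
print((f(x + h * t) - f(x)) / h)                       # ~ 0: moving along N_alpha keeps f constant
print((f(x + h * g / np.linalg.norm(g)) - f(x)) / h)   # ~ ||grad f(x)||: direction of maximal increase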

Example: http://ggbm.at/e3vayubw
f(x₁, x₂) = (1/100)(x₁² + x₂²)((x₁ − 5)² + (x₂ − 1)²)((x₁ − 2)² + (x₂ − 3)² + 1)
Two global minima and three strict local minima.