Mathematical Optimisation (MO18), Chapter 2: Linear Equations and Inequalities
Peter J.C. Dickinson
p.j.c.dickinson@utwente.nl
http://dickinson.website
version: 12/02/18, Monday 5th February 2018
Table of Contents
1 Introduction
2 Gauss-elimination [MO, 2.1]
3 Orthogonal projection, Least Squares [MO, 2.2]
4 Linear Inequalities [MO, 2.4]
5 Integer Solutions of Linear Equations [MO, 2.3]
Organization and Material
Lectures (hoorcolleges) and 3 exercise classes for motivation, geometric illustration, and proofs.
Script: Mathematical Optimization, cited as [MO, ?]
Sheets of the course (on Blackboard)
Mark based on written exam
Chapter 1: Real vector spaces. A collection of facts from Analysis and Linear Algebra; self-instruction.
Chapter 2: Linear equations, linear inequalities.
- Gauss elimination and applications: Gauss elimination provides constructive proofs of the main theorems in matrix theory.
- Fourier-Motzkin elimination: this method provides a constructive proof of the Farkas Lemma (which is strong LP duality in disguise).
- Least squares approximation, Fourier approximation.
- Integer solutions of linear equations (discrete optimization).
Chapter 3: Linear programs and applications. LP duality, sensitivity analysis, matrix games.
Chapter 4: Convex analysis. Properties of convex sets and convex functions; applications in optimization.
Chapter 5: Unconstrained optimization.
- optimality conditions
- algorithms: descent methods, Newton's method, Gauss-Newton method, Quasi-Newton methods, minimization of nondifferentiable functions.
Chapter 2. Linear equations, inequalities
We start with some definitions.
Definitions in matrix theory
M = (m_ij) is said to be
- lower triangular if m_ij = 0 for i < j,
- upper triangular if m_ij = 0 for i > j.
P = (p_ij) ∈ R^{m×m} is a permutation matrix if p_ij ∈ {0, 1} and each row and each column of P contains exactly one coefficient 1. Note that P^T P = I, implying that P^{-1} = P^T.
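The defining property P^T P = I can be checked directly. A small illustration in plain Python (the helper names are mine, not from the script):

```python
# Verify that a permutation matrix P satisfies P^T P = I, hence P^{-1} = P^T.

def transpose(M):
    return [list(row) for row in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]  # each row and each column contains exactly one 1

I = matmul(transpose(P), P)
assert I == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```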
The set of symmetric matrices in R^{n×n} is denoted by S^n. Q ∈ S^n is called
- positive semidefinite (denoted Q ⪰ O, or Q ∈ PSD_n) if x^T Q x ≥ 0 for all x ∈ R^n,
- positive definite (denoted Q ≻ O) if x^T Q x > 0 for all x ∈ R^n \ {0}.
Table of Contents
1 Introduction
2 Gauss-elimination [MO, 2.1]
  - Gauss-elimination (for solving Ax = b)
  - Explicit Gauss Algorithm
  - Implications of the Gauss algorithm
  - Gauss algorithm for symmetric A
3 Orthogonal projection, Least Squares [MO, 2.2]
4 Linear Inequalities [MO, 2.4]
5 Integer Solutions of Linear Equations [MO, 2.3]
Gauss-elimination (for solving Ax = b)
In Section 2.1 we briefly survey the Gauss algorithm and use it to give constructive proofs of important theorems in matrix theory.
Motivation: a simple example shows that successive elimination of variables is equivalent to the Gauss algorithm.
General idea: eliminating x_1, x_2, ... is equivalent to transforming Ax = b, i.e. the augmented matrix

  (A | b) =
  ( a_11 a_12 ... a_1n | b_1 )
  ( a_21 a_22 ... a_2n | b_2 )
  (  :    :        :   |  :  )
  ( a_m1 a_m2 ... a_mn | b_m ),

into a triangular (echelon) normal form (Ã | b̃) with the same solution set,

  ( ã_{1,j_1} ... ã_{1,j_2} ...     ...       ã_{1,j_r} ... | b̃_1     )
  (            ... ã_{2,j_2} ...     ...       ã_{2,j_r} ... | b̃_2     )
  (                           ...                            |  :       )
  (                               ã_{r-1,j_{r-1}} ...        | b̃_{r-1} )
  (                                             ã_{r,j_r} ... | b̃_r    )
  (                                                  0        | b̃_{r+1} )
  (                                                  :        |  :       )
  (                                                  0        | b̃_m    ),

with nonzero pivots ã_{i,j_i}, j_1 < j_2 < ... < j_r. Then solve Ãx = b̃ recursively.
This Gauss elimination uses 2 types of row operations:
(G1) (i, j)-pivot: for all k > i, add λ_k = −a_kj / a_ij times row i to row k.
(G2) interchange row i with row k.
The matrix forms of these operations are:
[MO, Ex. 2.3] The matrix form of (G1), (A | b) → (Ã | b̃), is given by (Ã | b̃) = M(A | b) with a nonsingular lower triangular M ∈ R^{m×m}.
[MO, Ex. 2.4] The matrix form of (G2), (A | b) → (Ã | b̃), is given by (Ã | b̃) = P(A | b) with a permutation matrix P ∈ R^{m×m}.
Explicit Gauss Algorithm
1  Set i = 1 and j = 1.
2  While i ≤ m and j ≤ n do
3    If there exists k ≥ i such that a_kj ≠ 0 then
4      Interchange row i and row k. (G2)
5      Apply the (i, j)-pivot. (G1)
6      Update i ← i + 1 and j ← j + 1.
7    else
8      Update j ← j + 1.
9    end if
10 end while
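The loop above can be sketched in Python with exact rational arithmetic; the function name and representation (a matrix as a list of rows) are mine, not from the script:

```python
# Minimal sketch of the explicit Gauss algorithm: reduce a matrix
# (e.g. the augmented matrix (A | b)) to echelon form, return it and the rank.
from fractions import Fraction

def gauss_echelon(M):
    """Row-reduce M to echelon form; return (echelon form, number of pivots)."""
    M = [[Fraction(x) for x in row] for row in M]
    m, n = len(M), len(M[0])
    i = j = 0
    while i < m and j < n:
        # (G2): look for a row k >= i with a nonzero entry in column j
        k = next((k for k in range(i, m) if M[k][j] != 0), None)
        if k is None:
            j += 1
            continue
        M[i], M[k] = M[k], M[i]
        # (G1): the (i, j)-pivot eliminates column j below row i
        for r in range(i + 1, m):
            lam = -M[r][j] / M[i][j]
            M[r] = [x + lam * y for x, y in zip(M[r], M[i])]
        i += 1
        j += 1
    return M, i   # i = number of pivot rows = rank

U, r = gauss_echelon([[2, 1, 3], [4, 3, 7], [2, 3, 5]])
```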
Implications of the Gauss algorithm
Lemma 2.1. The set of lower triangular matrices is closed under addition, multiplication and inversion (provided square and nonsingular).
Theorem 2.2 ([MO, Thm. 2.1]). For every A ∈ R^{m×n} there exist an (m×m) permutation matrix P and an invertible lower triangular matrix M ∈ R^{m×m} such that U = MPA is upper triangular.
Corollary 2.3 ([MO, Cor. 2.1], LU-factorization). For A ∈ R^{m×n} there exist an (m×m) permutation matrix P, an invertible lower triangular L ∈ R^{m×m} and an upper triangular U ∈ R^{m×n} such that LU = PA.
Remark: solve Ax = b by using the decomposition PA = LU! (How?)
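To answer the "How?": since PAx = Pb and PA = LU, first solve Ly = Pb by forward substitution, then Ux = y by back substitution. A hedged sketch in Python (my own helper; exact arithmetic, square nonsingular A assumed):

```python
# Sketch: solve Ax = b via PA = LU, then L y = P b (forward) and U x = y (back).
from fractions import Fraction

def solve_plu(A, b):
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]    # will be overwritten by U
    b = [Fraction(x) for x in b]
    perm = list(range(n))
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        k = next(k for k in range(j, n) if A[k][j] != 0)  # pivot row (nonsingular A)
        A[j], A[k] = A[k], A[j]
        perm[j], perm[k] = perm[k], perm[j]
        for i in range(j):            # keep earlier multipliers consistent with swap
            L[j][i], L[k][i] = L[k][i], L[j][i]
        for i in range(j + 1, n):
            L[i][j] = A[i][j] / A[j][j]
            A[i] = [x - L[i][j] * y for x, y in zip(A[i], A[j])]
    pb = [b[p] for p in perm]         # P b
    y = [Fraction(0)] * n
    for i in range(n):                # forward substitution: L y = P b
        y[i] = pb[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [Fraction(0)] * n
    for i in reversed(range(n)):      # back substitution: U x = y
        x[i] = (y[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

x = solve_plu([[0, 2], [3, 1]], [4, 5])   # forces a row exchange
```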
Corollary 2.4 ([MO, Cor. 2.2], Gale's Theorem). Exactly one of the following statements is true for A ∈ R^{m×n}, b ∈ R^m:
(a) The system Ax = b has a solution x ∈ R^n.
(b) There exists y ∈ R^m such that y^T A = 0^T and y^T b ≠ 0.
Remark: in the normal form A → Ã, the number r of pivots gives the dimension of the space spanned by the rows of A. This equals the dimension of the space spanned by the columns of A.
Modified Gauss-Algorithm for symmetric A
Perform the same row and column operations:
1  Set i = 1.
2  While i ≤ n do
3    If a_kk ≠ 0 for some k ≥ i then
4      Apply (G2'): interchange row i and row k, and interchange column i and column k; A → PAP^T.
5      Apply (G1'): for all j > i add λ_j = −a_ij / a_ii times row i to row j, and λ_j times column i to column j; A → MAM^T.
6      Update i ← i + 1.
7    Else if a_ik ≠ 0 for some k > i then
8      Add row k to row i and add column k to column i; A → BAB^T.
9    Else
10     Update i ← i + 1.
11   End if
12 End while
Implications of the symmetric Gauss algorithm
Note: the symmetric Gauss operations do not preserve the solution set of Ax = b! But they are useful for the following results.
Theorem 2.5 ([MO, Thm. 2.2]). Let A ∈ R^{n×n} be symmetric. Then, with some nonsingular Q ∈ R^{n×n},
  QAQ^T = D = diag(d_1, ..., d_n),  d ∈ R^n.
Look out: the d_i are in general not the eigenvalues of A.
Recall: Q is positive semidefinite (notation Q ⪰ O, Q ∈ PSD_n) if x^T Q x ≥ 0 for all x ∈ R^n.
Corollary 2.6 ([MO, Cor. 2.3]). Let A ∈ S^n and Q ∈ R^{n×n} nonsingular such that QAQ^T = diag(d_1, ..., d_n). Then
(a) A ⪰ O ⟺ d_i ≥ 0 for i = 1, ..., n;
(b) A ≻ O ⟺ d_i > 0 for i = 1, ..., n.
Implication: checking A ⪰ O can be done by the Gauss algorithm.
Corollary 2.7 ([MO, Cor. 2.4]). Let A ∈ S^n. Then
(a) A ⪰ O ⟺ A = BB^T for some B ∈ R^{n×m};
(b) A ≻ O ⟺ A = BB^T for some nonsingular B ∈ R^{n×n}.
Complexity of the Gauss algorithm: the number of flops (floating point operations ±, ×, /) needed to solve Ax = b, where A ∈ R^{n×n}, is at most n^3.
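Corollary 2.6(b) suggests a concrete test: for symmetric A, eliminating below each diagonal pivot (no row exchanges) leaves the symmetric Schur complement, and the pivots that appear are exactly the d_i. A sketch under that assumption (the function name is mine; a zero or negative pivot means A is not positive definite):

```python
# Pivot test for positive definiteness of a symmetric matrix (Cor. 2.6(b)).
from fractions import Fraction

def is_positive_definite(A):
    """A given as a symmetric list of rows; True iff all pivots d_i > 0."""
    A = [[Fraction(x) for x in row] for row in A]
    n = len(A)
    for i in range(n):
        if A[i][i] <= 0:        # pivot d_i must be strictly positive
            return False
        for j in range(i + 1, n):
            lam = A[j][i] / A[i][i]
            A[j] = [x - lam * y for x, y in zip(A[j], A[i])]
    return True

assert is_positive_definite([[2, 1], [1, 2]])        # d = (2, 3/2)
assert not is_positive_definite([[1, 2], [2, 1]])    # second pivot is -3
```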
Table of Contents
1 Introduction
2 Gauss-elimination [MO, 2.1]
3 Orthogonal projection, Least Squares [MO, 2.2]
  - Projection and equivalent condition
  - Constructing a solution
  - Gram matrix
  - Gram-Schmidt algorithm
  - Eigenvalues
4 Linear Inequalities [MO, 2.4]
5 Integer Solutions of Linear Equations [MO, 2.3]
Projection and equivalent condition
In this section we see that the orthogonal projection problem is solved by solving a system of linear equations, and we present some more results from matrix theory.
Assumption: V is a linear vector space over R with inner product ⟨x, y⟩ and (induced) norm ‖x‖ = √⟨x, x⟩.
Minimization problem: given x ∈ V and a subspace W ⊆ V, find x̂ ∈ W such that
  ‖x − x̂‖ = min_{y ∈ W} ‖x − y‖.   (1)
The vector x̂ is called the projection of x onto W.
Lemma 2.8 ([MO, Lm. 2.1]). x̂ ∈ W is the (unique) solution to (1) ⟺ ⟨x − x̂, w⟩ = 0 for all w ∈ W.
Constructing a solution via Lemma 2.8
Let a_1, ..., a_m be a basis of W, i.e., a_1, ..., a_m are linearly independent and W = span{a_1, ..., a_m}. Write x̂ := Σ_{i=1}^m z_i a_i. Then ⟨x − x̂, w⟩ = 0 for all w ∈ W is equivalent to
  ⟨x − Σ_{i=1}^m z_i a_i, a_j⟩ = 0,  j = 1, ..., m,
or
  Σ_{i=1}^m ⟨a_i, a_j⟩ z_i = ⟨x, a_j⟩,  j = 1, ..., m.
Gram matrix
Defining the Gram matrix G = (⟨a_i, a_j⟩)_{i,j} ∈ S^m and considering b = (⟨x, a_i⟩)_i ∈ R^m, this leads to the linear equation (for z):
  (2.16)  G z = b,  with solution ẑ = G^{-1} b.
Ex. 2.1: prove that the Gram matrix is positive definite, and thus nonsingular (under our assumption).
Summary: let W = span{a_1, ..., a_m}. Then the solution x̂ of the minimization problem (1) is given by x̂ := Σ_{i=1}^m ẑ_i a_i, where ẑ is computed as the solution of G ẑ = b.
Special case 1: V = R^n, ⟨x, y⟩ = x^T y and a_1, ..., a_m a basis of W. Then with A := [a_1, ..., a_m] the projection of x onto W is given by
  ẑ = argmin_z ‖x − Az‖ = (A^T A)^{-1} A^T x,
  x̂ = argmin_y { ‖x − y‖ : y ∈ AR^m } = Aẑ = A(A^T A)^{-1} A^T x.
Special case 2: V = R^n, ⟨x, y⟩ = x^T y, a_1, ..., a_m ∈ R^n linearly independent and W = {w ∈ R^n : a_i^T w = 0, i = 1, ..., m}. Then the projection of x onto W is given by
  x̂ = argmin_y { ‖x − y‖ : a_i^T y = 0 for all i } = x − A(A^T A)^{-1} A^T x.
Special case 3: W = span{a_1, ..., a_m} with {a_i} an orthonormal basis, i.e. (⟨a_i, a_j⟩)_{i,j} = I. Then the projection of x onto W is given by
  ẑ = (⟨a_j, x⟩)_j ∈ R^m  ("Fourier coefficients"),
  x̂ = argmin_y { ‖x − y‖ : y ∈ W } = Σ_{j=1}^m ẑ_j a_j.
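The summary above can be sketched directly: form the Gram system G z = b and return Σ ẑ_i a_i. A minimal Python sketch with exact arithmetic (the helper names are mine, not from the script; basis vectors are assumed linearly independent so that G is nonsingular):

```python
# Projection of x onto span{a_1, ..., a_m} by solving the Gram system G z = b.
from fractions import Fraction

def solve(A, b):
    """Solve a small nonsingular system by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for i in range(n):
        k = next(k for k in range(i, n) if M[k][i] != 0)
        M[i], M[k] = M[k], M[i]
        for r in range(n):
            if r != i and M[r][i] != 0:
                lam = M[r][i] / M[i][i]
                M[r] = [x - lam * y for x, y in zip(M[r], M[i])]
    return [M[i][n] / M[i][i] for i in range(n)]

def project(basis, x):
    """Return the projection of x onto span(basis) as sum_i z_i a_i."""
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    G = [[dot(u, v) for v in basis] for u in basis]   # Gram matrix
    b = [dot(x, u) for u in basis]
    z = solve(G, b)
    return [sum(z[i] * basis[i][k] for i in range(len(basis)))
            for k in range(len(x))]

xhat = project([[1, 0, 0], [0, 1, 0]], [3, 4, 5])   # -> [3, 4, 0]
```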
Gram-Schmidt Algorithm
Given a subspace W with basis a_1, ..., a_m, construct an orthonormal basis c_1, ..., c_m for W, i.e. (⟨c_i, c_j⟩)_{i,j} = I.
Start with b_1 := a_1 and c_1 = b_1 / ‖b_1‖. For k = 2, ..., m let
  b_k = a_k − Σ_{i=1}^{k−1} ⟨c_i, a_k⟩ c_i  and  c_k = b_k / ‖b_k‖.
Gram-Schmidt in matrix form: with W ⊆ V := R^n, put A = [a_1, ..., a_m]^T, B = [b_1, ..., b_m]^T, C = [c_1, ..., c_m]^T. Then the Gram-Schmidt steps are equivalent to:
- add a multiple of row j < k to row k (for B),
- multiply row k by a scalar (for C).
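The steps above can be sketched in a few lines of Python (helper names mine; floating point rather than exact arithmetic because of the square root, and linear independence of the input is assumed):

```python
# Sketch of Gram-Schmidt: orthonormalise a list of basis vectors.
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def gram_schmidt(vectors):
    """Return an orthonormal basis c_1, ..., c_m for span(vectors)."""
    ortho = []
    for a in vectors:
        b = list(a)
        for c in ortho:                      # b_k = a_k - sum_i <c_i, a_k> c_i
            coeff = dot(c, a)
            b = [x - coeff * y for x, y in zip(b, c)]
        norm = math.sqrt(dot(b, b))          # nonzero by linear independence
        ortho.append([x / norm for x in b])  # c_k = b_k / ||b_k||
    return ortho

c1, c2 = gram_schmidt([[3.0, 4.0], [1.0, 0.0]])
```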
Matrix form of Gram-Schmidt: given A ∈ R^{m×n} with full row rank, there is a decomposition C = LA with a lower triangular nonsingular matrix L (l_ii = 1) such that the rows c_j of C are orthogonal, i.e. ⟨c_i, c_j⟩ = 0 for i ≠ j.
A corollary of this fact:
Lemma 2.9 ([MO, Prop. 2.1], Hadamard's inequality). Let A ∈ R^{m×n} with rows a_i^T. Then
  0 ≤ det(AA^T) ≤ Π_{i=1}^m a_i^T a_i.
Eigenvalues
Definition: λ ∈ C is an eigenvalue of A ∈ R^{n×n} if there is an (eigenvector) x ∈ C^n, x ≠ 0, with Ax = λx.
The results above (together with the Theorem of Weierstrass) allow a proof of:
Theorem 2.10 ([MO, Thm. 2.3], spectral theorem for symmetric matrices). Let A ∈ S^n. Then there exist an orthogonal matrix Q ∈ R^{n×n} (i.e. Q^T Q = I) and eigenvalues λ_1, ..., λ_n ∈ R such that
  Q^T A Q = D = diag(λ_1, ..., λ_n).
Table of Contents
1 Introduction
2 Gauss-elimination [MO, 2.1]
3 Orthogonal projection, Least Squares [MO, 2.2]
4 Linear Inequalities [MO, 2.4]
  - Fourier-Motzkin algorithm
  - Solvability of linear systems
  - Application: Markov chains
5 Integer Solutions of Linear Equations [MO, 2.3]
Linear Inequalities Ax ≤ b
In this section we learn:
- how to find a solution to Ax ≤ b;
- that what the Gauss algorithm is for linear equations, the Fourier-Motzkin algorithm is for linear inequalities;
- that the Fourier-Motzkin algorithm leads to a constructive proof of the Farkas Lemma (the basis for strong duality in linear programming).
Fourier-Motzkin algorithm for solving Ax ≤ b
Eliminate x_1. Order the rows so that
  a_r1 x_1 + Σ_{j=2}^n a_rj x_j ≤ b_r,  r = 1, ..., k  (with a_r1 > 0),
  a_s1 x_1 + Σ_{j=2}^n a_sj x_j ≤ b_s,  s = k+1, ..., l  (with a_s1 < 0),
  Σ_{j=2}^n a_tj x_j ≤ b_t,  t = l+1, ..., m  (with a_t1 = 0).
Divide by a_r1 and |a_s1| respectively, leading to (for r and s)
  x_1 + Σ_{j=2}^n a'_rj x_j ≤ b'_r,  r = 1, ..., k,
  −x_1 + Σ_{j=2}^n a'_sj x_j ≤ b'_s,  s = k+1, ..., l.
These two sets of inequalities have a solution x_1 iff
  Σ_{j=2}^n a'_sj x_j − b'_s ≤ x_1 ≤ b'_r − Σ_{j=2}^n a'_rj x_j,  r = 1, ..., k; s = k+1, ..., l,
or equivalently
  Σ_{j=2}^n (a'_sj + a'_rj) x_j ≤ b'_s + b'_r,  r = 1, ..., k; s = k+1, ..., l.
Remark (explosion of the number of inequalities): before, the number was k + (l − k) + (m − l) = m; now the number is k(l − k) + (m − l).
So Ax ≤ b has a solution x = (x_1, ..., x_n) if and only if there is a solution x' = (x_2, ..., x_n) of
  Σ_{j=2}^n (a'_sj + a'_rj) x_j ≤ b'_r + b'_s,  r = 1, ..., k; s = k+1, ..., l,
  Σ_{j=2}^n a_tj x_j ≤ b_t,  t = l+1, ..., m.
In matrix form: Ax ≤ b has a solution x = (x_1, ..., x_n) if and only if there is a solution of the transformed system A'x' ≤ b', i.e. (0 A')x ≤ b'.
Remark: any row of (0 A' | b') is a positive combination of rows of (A | b): any row is of the form y^T (A | b) with y ≥ 0.
By eliminating x_1, x_2, ..., x_n in this way we finally obtain an equivalent system Ã^(n) x ≤ b̃ with Ã^(n) = 0, which is solvable iff 0 ≤ b̃_i for all i.
Theorem 2.11 (Projection Theorem, [MO, Thm. 2.5]). Let P = {x ∈ R^n : Ax ≤ b}. Then for all k = 1, ..., n, the projection
  P^(k) = {(x_{k+1}, ..., x_n) : (x_1, ..., x_k, x_{k+1}, ..., x_n) ∈ P for suitable x_1, ..., x_k ∈ R}
is the solution set of a linear system A^(k) x^(k) ≤ b^(k) in the n − k variables x^(k) = (x_{k+1}, ..., x_n).
In principle, linear inequalities can be solved by Fourier-Motzkin elimination. However, this might be inefficient! (Why?)
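One elimination step can be sketched in Python; the representation and names are my own (each inequality is a coefficient list plus a right-hand side, and x_1 is the first coordinate):

```python
# One Fourier-Motzkin step: eliminate x_1 from a system a . x <= beta.
from fractions import Fraction

def fm_eliminate(rows):
    """rows: list of (a, beta) meaning a . x <= beta; return system without x_1."""
    pos, neg, zero = [], [], []
    for a, beta in rows:
        a = [Fraction(v) for v in a]
        beta = Fraction(beta)
        if a[0] > 0:      # divide by a_r1:   x_1 + a'_r . x' <= b'_r
            pos.append(([v / a[0] for v in a[1:]], beta / a[0]))
        elif a[0] < 0:    # divide by |a_s1|: -x_1 + a'_s . x' <= b'_s
            neg.append(([v / -a[0] for v in a[1:]], beta / -a[0]))
        else:
            zero.append((a[1:], beta))
    new = []
    for ar, br in pos:            # pair every upper bound on x_1 ...
        for asj, bs in neg:       # ... with every lower bound on x_1
            new.append(([u + v for u, v in zip(ar, asj)], br + bs))
    return new + zero

# x_1 + x_2 <= 2, -x_1 <= 0, x_1 <= 1  ->  system in x_2 alone
reduced = fm_eliminate([([1, 1], 2), ([-1, 0], 0), ([1, 0], 1)])
```

Note the k(l − k) blow-up from the slide: with k upper bounds and (l − k) lower bounds on x_1, the pairing loop produces one inequality per pair.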
Solvability of linear systems
Theorem 2.12 (Farkas Lemma, [MO, Thm. 2.6]). Exactly one of the following statements is true:
(I) Ax ≤ b has a solution x ∈ R^n.
(II) There exists y ∈ R^m such that y^T A = 0^T, y^T b < 0 and y ≥ 0.
Ex. 2.2: let A ∈ R^{m×n}, C ∈ R^{k×n}, b ∈ R^m, c ∈ R^k. Then precisely one of the following alternatives is valid:
(I) There is a solution x of Ax ≤ b, Cx = c.
(II) There is a solution μ ∈ R^m, μ ≥ 0, λ ∈ R^k of
  [A^T; b^T] μ + [C^T; c^T] λ = [0; −1].
Corollary 2.13 (Gordan, [MO, Cor. 2.5]). Given A ∈ R^{m×n}, exactly one of the following alternatives is true:
(I) Ax = 0, x ≥ 0 has a solution x ≠ 0.
(II) y^T A < 0^T has a solution y.
Remark: as we shall see in Chapter 3, the Farkas Lemma in the following form is the strong duality of LP in disguise.
Corollary 2.14 (Farkas, implied inequalities, [MO, Cor. 2.6]). Let A ∈ R^{m×n}, b ∈ R^m, c ∈ R^n, z ∈ R. Assume that Ax ≤ b is feasible. Then the following are equivalent:
(a) Ax ≤ b implies c^T x ≤ z.
(b) y^T A = c^T, y^T b ≤ z, y ≥ 0 has a solution y.
Application: Markov chains (existence of a steady state)
Definition: a vector π ∈ R^n_+ with 1^T π := Σ_{i=1}^n π_i = 1 is called a probability distribution on {1, ..., n}. A matrix P = (p_ij) where each row P_i is a probability distribution is called a stochastic matrix, i.e. P ∈ R^{n×n}_+ and P1 = 1.
In a stochastic process (individuals in n possible states):
- π_i is the proportion of the population in state i,
- p_ij is the probability of a transition from state i to state j.
So the transition step k → k+1 is π^(k+1) = P^T π^(k).
A probability distribution π is called a steady state if π = P^T π.
As a corollary of Gordan's result: each stochastic matrix P has a steady state π.
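The transition step π^(k+1) = P^T π^(k) can be iterated to approximate a steady state numerically. A minimal sketch (my own helper; convergence of this power iteration is assumed here, it is not guaranteed for every stochastic matrix, only existence of a steady state is):

```python
# Approximate a steady state pi = P^T pi by iterating pi <- P^T pi.
def steady_state(P, iters=200):
    n = len(P)
    pi = [1.0 / n] * n                    # start from the uniform distribution
    for _ in range(iters):
        # (P^T pi)_j = sum_i p_ij * pi_i
        pi = [sum(P[i][j] * pi[i] for i in range(n)) for j in range(n)]
    return pi

P = [[0.9, 0.1],
     [0.5, 0.5]]      # rows are probability distributions
pi = steady_state(P)  # converges to (5/6, 1/6) for this P
```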
Table of Contents
1 Introduction
2 Gauss-elimination [MO, 2.1]
3 Orthogonal projection, Least Squares [MO, 2.2]
4 Linear Inequalities [MO, 2.4]
5 Integer Solutions of Linear Equations [MO, 2.3]
  - Two variables
  - Lattices
Integer Solutions of Linear Equations
Example: the equation 3x_1 − 2x_2 = 1 has the solution x = (1, 1) ∈ Z^2. But the equation 6x_1 − 2x_2 = 1 does not allow an integer solution x.
Key remark: let a_1, a_2 ∈ Z and let a_1 x_1 + a_2 x_2 = b have a solution x_1, x_2 ∈ Z. Then b = λc with λ ∈ Z, c = gcd(a_1, a_2). Here gcd(a_1, a_2) denotes the greatest common divisor of a_1, a_2.
Lemma 2.15 (Euclid's Algorithm, [MO, Lm. 2.2]). Let c = gcd(a_1, a_2). Then
  L(a_1, a_2) := {a_1 λ_1 + a_2 λ_2 : λ_1, λ_2 ∈ Z} = {cλ : λ ∈ Z} =: L(c).
(The proof of) this result allows us to solve (if possible) a_1 x_1 + a_2 x_2 = b in Z.
Algorithm to solve a_1 x_1 + a_2 x_2 = b (in Z):
Compute c = gcd(a_1, a_2). If λ := b/c ∉ Z, no integer solution exists. If λ := b/c ∈ Z, compute solutions λ_1, λ_2 ∈ Z of λ_1 a_1 + λ_2 a_2 = c. Then (λ_1 λ) a_1 + (λ_2 λ) a_2 = b.
General problem: given a_1, ..., a_n, b ∈ Z^m, find x = (x_1, ..., x_n) ∈ Z^n such that
  (∗)  a_1 x_1 + a_2 x_2 + ... + a_n x_n = b,  or equivalently Ax = b, where A := [a_1, ..., a_n].
Definition: we introduce the lattice generated by a_1, ..., a_n,
  L = L(a_1, ..., a_n) = { Σ_{j=1}^n a_j λ_j : λ_j ∈ Z } ⊆ R^m.
Assumption 1: rank A = m (m ≤ n); w.l.o.g. a_1, ..., a_m are linearly independent.
To solve the problem: find C = [c_1, ..., c_m] ∈ Z^{m×m} such that
  (∗∗)  L(c_1, ..., c_m) = L(a_1, ..., a_n).
Then (∗) has a solution x ∈ Z^n iff λ := C^{-1} b ∈ Z^m.
Bad news: as in the case of one equation, in general L(a_1, ..., a_m) ≠ L(a_1, ..., a_n).
Lemma 2.16 ([MO, Lm. 2.3]). Let c_1, ..., c_m ∈ L(a_1, ..., a_n). Then L(c_1, ..., c_m) = L(a_1, ..., a_n) if and only if for all j = 1, ..., n the system Cλ = a_j has an integral solution.
Last step: find such c_i's.
Main result: the algorithm Lattice Basis
INIT: C = [c_1, ..., c_m] = [a_1, ..., a_m];
ITER: Compute C^{-1};
  If C^{-1} a_j ∈ Z^m for j = 1, ..., n, then stop;
  If λ = C^{-1} a_j ∉ Z^m for some j, then
    let a_j = Cλ = Σ_{i=1}^m λ_i c_i and compute c = Σ_{i=1}^m (λ_i − [λ_i]) c_i = a_j − Σ_{i=1}^m [λ_i] c_i;
    let k be the largest index i such that λ_i ∉ Z;
    update C by replacing c_k with c in column k;
  next iteration.
The algorithm stops after at most K = log_2(det[a_1, ..., a_m]) steps with a matrix C satisfying (∗∗).
As an exercise: explain how an integer solution x of (∗) can be constructed with the help of the results from the algorithm above.
Complexity: from the algorithm above we see that solving integer systems of equations is polynomial. Under additional inequalities (such as x ≥ 0) the problem becomes NP-hard.
Theorem 2.17 ([MO, Thm. 2.4]). Let A ∈ Z^{m×n} and b ∈ Z^m be given. Then exactly one of the following statements is true:
(a) There exists some x ∈ Z^n such that Ax = b.
(b) There exists some y ∈ R^m such that y^T A ∈ Z^n and y^T b ∉ Z.