Complexity of linear programming: outline

- Assessing computational efficiency of algorithms
- Computational efficiency of the simplex method
- Ellipsoid algorithm for LP and its computational efficiency

IOE 610: LP II, Fall 2013 — Complexity of linear programming — Page 139

Problem, instance, algorithm

- Linear programming in inequality form is a problem.
- An instance of this problem is given by d = (A, b, c).

Definition 8.1
An instance of an optimization problem consists of a feasible set F and a cost function c : F → ℝ. An optimization problem is defined as a collection of instances.

- An algorithm is a finite set of instructions (arithmetic operations, conditional statements, read and write statements, etc.); the running time of the algorithm is the total number of steps involved in carrying out these instructions until termination is reached.
- Algorithms are described for problems, but then applied to individual instances.
- An algorithm solves the problem if it terminates in finite time and produces a correct answer for all instances of the problem.
Algorithms for solving (optimization) problems

- It is reasonable to expect the running time of an algorithm to depend on the size of the instance to which it is applied.

Definition 8.2
The size of an instance is defined as the number of bits used to describe the instance, according to a prespecified format.

... geared towards binary computation and integer input data:
- r ∈ ℤ, |r| ≤ U: need ⌊log₂ U⌋ + 2 bits (sign bit plus binary representation)
  r = a_k 2^k + a_(k−1) 2^(k−1) + ... + a_1 2^1 + a_0 2^0, where a_i ∈ {0, 1} ∀i and k ≤ ⌊log₂ U⌋
- An LP instance with d ∈ ℤ^(mn+m+n) and max{|a_ij|, |b_i|, |c_j|} = U: the size of d is O(mn log₂ U)

Computational efficiency analysis of algorithms

Question: given an algorithm that solves our problem and an instance of our problem of a particular size, how long will the algorithm take to solve it? Two types of answers:
- Average performance on a typical problem instance
  - Mathematically difficult to define or analyze rigorously
  - Observations from empirical experience give some hints
- Worst-case performance: how long for the most difficult (for this algorithm) instance of the given size?
  - Often too pessimistic to predict practical behavior
  - Relatively easy to define and analyze rigorously
  - Provides a worst-case guarantee
  - Important for the theory of complexity: suggests where the limits of our computational abilities are
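The bit-size bookkeeping above can be sketched in a few lines. This is not part of the lecture, just a minimal illustration; `bit_size` and `lp_instance_size` are names invented here, and the instance-size function is only the crude "every entry costs ⌊log₂ U⌋ + 2 bits" estimate from the slide.

```python
def bit_size(r: int) -> int:
    """Bits to store the integer r in the slide's format: one sign bit
    plus the binary digits of |r| (floor(log2 |r|) + 1 of them)."""
    return abs(r).bit_length() + 1

def lp_instance_size(m: int, n: int, U: int) -> int:
    """Upper bound on the size of an LP instance d = (A, b, c) with
    mn + m + n integer entries, each of magnitude at most U."""
    return (m * n + m + n) * bit_size(U)
```

For example, r = 5 needs ⌊log₂ 5⌋ + 2 = 4 bits, and the total count is O(mn log₂ U) since mn dominates mn + m + n.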
Worst-case analysis and efficient algorithms

- We will be looking at T(S): the worst-case running time of the algorithm on instances of size (at most) S.
- As before, we will use the arithmetic model of computation and count the number of arithmetic operations to estimate T(S).
- An algorithm is usually considered efficient if T(S) can be bounded above by a polynomial function.

Definition 8.3
An algorithm runs in polynomial time (poly-time) if ∃ k ∈ ℤ such that T(S) = O(S^k).

How can we measure the running time of an algorithm?

- Arithmetic model of computation: each operation (including arithmetic operations) takes unit time. (Easy to analyze.)
- Bit model of computation: each operation is decomposed into bit operations, and it is these elementary operations that are assumed to take unit time; for example, adding two numbers takes longer if the numbers are large. (Better estimate of the true running time.)

Fact. Suppose
- an algorithm takes poly-time under the arithmetic model, and
- on instances of size S, any integer produced in the course of the algorithm's execution has size bounded by a polynomial in S.
Then the algorithm runs in polynomial time under the bit model.
Computational efficiency of the simplex method

- An LP instance in n variables with m constraints:
  min cᵀx s.t. Ax ≥ b
- We have shown that each iteration takes polynomial time (O(mn) with the revised implementation).
- Total number of iterations?
  - On problems arising in practice, the simplex method is usually extremely fast; conventional wisdom suggests that the number of iterations in practice is about 3m.
  - Worst-case number of iterations is exponential in n (Klee and Minty, 1972).

Klee-Minty example

max Σ_(j=1..n) 2^(n−j) x_j
s.t. 2 Σ_(j=1..i−1) 2^(i−j) x_j + x_i ≤ 100^(i−1), i = 1, ..., n
     x_j ≥ 0, j = 1, ..., n

For n = 3:

max 4x₁ + 2x₂ + x₃
s.t.  x₁ ≤ 1
     4x₁ + x₂ ≤ 100
     8x₁ + 4x₂ + x₃ ≤ 10000
     x₁, x₂, x₃ ≥ 0
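A small generator for this family makes the pattern explicit. This is an illustration, not lecture material; `klee_minty` is an invented name, and the data is returned as (c, A, rhs) for the system A x ≤ rhs, x ≥ 0.

```python
def klee_minty(n):
    """Klee-Minty LP in n variables:
        max  sum_j 2^(n-j) x_j
        s.t. 2*sum_{j<i} 2^(i-j) x_j + x_i <= 100^(i-1),  i = 1..n,
             x >= 0.
    Returns (c, A, rhs) for the constraint system A x <= rhs."""
    c = [2 ** (n - j) for j in range(1, n + 1)]
    A, rhs = [], []
    for i in range(1, n + 1):
        # coefficient of x_j in constraint i: 2*2^(i-j) for j < i, 1 for j = i
        row = [2 * 2 ** (i - j) if j < i else (1 if j == i else 0)
               for j in range(1, n + 1)]
        A.append(row)
        rhs.append(100 ** (i - 1))
    return c, A, rhs
```

For n = 3 this reproduces exactly the instance on the slide: objective (4, 2, 1), constraint rows (1, 0, 0), (4, 1, 0), (8, 4, 1), right-hand sides 1, 100, 10000.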
Klee-Minty cube

[Figure: the Klee-Minty cube for n = 3, with vertex objective labels 1, 96, 100, 9592, 9600, 9992, 10000. Illustration courtesy of R. Vanderbei.]

The feasible region is a distortion of the stretched n-dimensional cube
0 ≤ x₁ ≤ 1, 0 ≤ x₂ ≤ 100, ..., 0 ≤ x_n ≤ 100^(n−1).

Starting at x = 0 and using the steepest pivoting rule, the simplex method will visit all 2^n vertices before finding the optimal solution... what if we use a different pivoting rule?

Simplex method and the diameter of polyhedra

Examples of bad polyhedra have been given for all known pivoting rules. Is there an exponential example for any pivoting rule, perhaps one not yet invented?
- If x and y are two vertices of a polyhedron, consider traveling from x to y via adjacent vertices; let d(x, y) be the number of steps in the shortest such path.
- Diameter of a polyhedron: max_(x,y) d(x, y)
- Δ(n, m): the maximum diameter over all bounded polyhedra in ℝ^n that can be represented by m inequalities
- With a perfect pivoting rule, simplex shouldn't need more than Δ(n, m) iterations
- Hirsch conjecture, proposed in 1957: Δ(n, m) ≤ m − n
  - True for n ≤ 3 and other special cases
  - Disproven by Francisco Santos, 2010: a polyhedron with n = 43 and m = 86 whose diameter is bigger than 43
- Still open: Δ(n, m) ≤ 2m? Known: Δ(n, m) ≤ m^(1 + log₂ n)

Even if we can bound Δ(n, m), we still need the right pivoting rule...
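One way to see the 2^n vertices (a sketch, not from the lecture): for this family, each vertex of the distorted cube is obtained by choosing, for every i, either x_i = 0 or the i-th constraint tight, and solving forward. The names below are invented for illustration.

```python
from itertools import product

def km_vertices(n):
    """Enumerate the 2^n vertices of the Klee-Minty cube: for each i,
    pick either x_i = 0 or the i-th constraint tight, i.e.
    x_i = 100^(i-1) - 2*sum_{j<i} 2^(i-j) x_j."""
    verts = []
    for choice in product([0, 1], repeat=n):
        x = []
        for i in range(1, n + 1):
            if choice[i - 1] == 0:
                x.append(0)
            else:
                x.append(100 ** (i - 1)
                         - 2 * sum(2 ** (i - j) * x[j - 1]
                                   for j in range(1, i)))
        verts.append(tuple(x))
    return verts

def km_objective(x, n):
    """Objective sum_j 2^(n-j) x_j of the Klee-Minty LP."""
    return sum(2 ** (n - j) * xj for j, xj in enumerate(x, start=1))
```

For n = 3 this produces 8 distinct vertices, with the optimum 10000 attained at (0, 0, 10000), matching the top label in the figure.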
Some LP history

- 1930s-1940s
  - (Specialized) LP models and solution approaches developed independently in the Soviet Union and the West for a variety of optimal resource allocation and planning applications
- Late 1940s
  - General LP theory (John von Neumann) and solution method (the simplex algorithm, George Dantzig) developed in the US
  - Simplex runs quite fast in practice; LP is used for military operations and gains widespread use after the war
- 1972
  - Klee and Minty show that the simplex algorithm is not efficient, i.e., it does not run in polynomial time
- 1975
  - The Nobel Prize in Economics is awarded to Leonid V. Kantorovich and Tjalling C. Koopmans for their contributions to the theory of optimum allocation of resources via LP

LP history, continued

- 1970s
  - THE BIG QUESTION: does there exist an algorithm for solving LPs that is poly-time in the worst case?
- 1979, in the Soviet Union...
  - Leonid G. Khachiyan: YES, THERE IS — the ellipsoid algorithm!
  - The NY Times publishes several articles about this discovery, makes some mathematical blunders about the implications of the result, and has to print retractions
Ellipsoid algorithm for LP: outline

- Develop general ideas for the ellipsoid algorithm
- Specify the algorithm for finding a point in P = {x ∈ ℝ^n : Ax ≥ b}
- Modify the algorithm for solving min cᵀx s.t. Ax ≥ b

Volumes and affine transformations

Definition
If L ⊆ ℝ^n, the volume of L is Vol(L) = ∫_(x ∈ L) dx.

Definition 8.6
If D ∈ ℝ^(n×n) is nonsingular and b ∈ ℝ^n, the mapping S(x) = Dx + b is an affine transformation.

- Note: by definition, an affine transformation is invertible.

Lemma 8.1
Let L ⊆ ℝ^n. If S(x) = Dx + b, then Vol(S(L)) = |det(D)| Vol(L).
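Lemma 8.1 can be checked numerically in the plane (a sketch, not lecture material; the helper names are invented): map the unit square by S(x) = Dx + b and compare its area, computed with the shoelace formula, to |det(D)| times the original area.

```python
def shoelace_area(pts):
    """Area of a simple polygon from its vertices in order."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def affine(D, b, p):
    """Apply S(x) = Dx + b to a point p in R^2."""
    return (D[0][0] * p[0] + D[0][1] * p[1] + b[0],
            D[1][0] * p[0] + D[1][1] * p[1] + b[1])

# unit square (area 1) mapped to a parallelogram
D = [[2.0, 1.0], [0.0, 3.0]]
b = [5.0, -1.0]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [affine(D, b, p) for p in square]
det_D = D[0][0] * D[1][1] - D[0][1] * D[1][0]   # = 6
```

The image parallelogram has area |det(D)| · 1 = 6, as Lemma 8.1 predicts; the translation b does not affect the volume.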
Assumptions for the ellipsoid algorithm

Goal of the algorithm: find a point in a suitably described convex set P ⊆ ℝ^n. Assumptions:
- P is bounded: ∃ a ball B(x₀, r) ⊇ P. Let V = Vol(B(x₀, r)).
- P is full-dimensional, i.e., has positive volume (Definition 8.7). Let v be such that Vol(P) > v > 0.
- P can be described via a separation oracle: given a vector y ∈ ℝ^n, the oracle either reports that y ∈ P, or finds a separating hyperplane: a ∈ ℝ^n such that aᵀy < aᵀx ∀x ∈ P.

Positive definite matrices and ellipsoids

Definition 8.4
Let Q be an n × n symmetric matrix, i.e., Qᵀ = Q. Q is called positive definite (notation: Q ≻ 0) if xᵀQx > 0 ∀x ≠ 0.

- Symmetric: Q ∈ Sⁿ; symmetric positive definite: Q ∈ Sⁿ₊, or Q ≻ 0
- Q ∈ Sⁿ has n eigenvalues λ₁, ..., λ_n; det(Q) = Π_(i=1..n) λ_i
- Q ∈ Sⁿ₊ iff λ_i > 0 ∀i

Definition 8.5
A set E ⊆ ℝ^n of the form
E = E(z, Q) = {x : (x − z)ᵀ Q⁻¹ (x − z) ≤ 1},
where Q ≻ 0, is an ellipsoid with center z ∈ ℝ^n.
Ellipsoids as affine transformations of balls

- A ball centered at z with radius r > 0:
  E(z, r²I) = {x : (x − z)ᵀ(x − z) ≤ r²} = {x : ‖x − z‖ ≤ r}
- Unit ball: the ball centered at 0 with radius 1: B(0, 1) = E(0, I)
- If Q ∈ Sⁿ₊, ∃ Q^(1/2) ∈ Sⁿ₊ with Q^(1/2) Q^(1/2) = Q and det(Q^(1/2)) = √det(Q)
- Note: an ellipsoid is an affine transformation of the unit ball:
  E(z, Q) = Q^(1/2) E(0, I) + z

Corollary: Let Q ≻ 0. Then Vol(E(z, Q)) = √det(Q) · Vol(B(0, 1)).

Central-cut ellipsoid algorithm: general idea

Input: ellipsoid E₀, constants V and v as above.
Initialization: E₀ = E(x₀, Q₀) ⊇ P, t = 0.
Iteration t (invariant: E_t = E(x_t, Q_t) ⊇ P):
- Call the separation oracle with x_t as input.
- If x_t ∈ P, terminate.
- Otherwise, the oracle returns a such that P ⊆ E_t ∩ {x : aᵀx ≥ aᵀx_t}. Construct an ellipsoid E_(t+1) = E(x_(t+1), Q_(t+1)) of smaller volume containing the set on the right.
- Set t ← t + 1 and continue.
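The corollary Vol(E(z, Q)) = √det(Q) · Vol(B(0, 1)) can be sanity-checked by Monte Carlo (an illustration under invented names, not from the lecture): sample a box that contains the ellipse {x : xᵀQ⁻¹x ≤ 1} — note |x_i| ≤ √(Q_ii) on the ellipse — and compare the hit-rate estimate to √det(Q) · π, since Vol(B(0,1)) = π in ℝ².

```python
import math, random

# ellipsoid E(0, Q) = {x : x^T Q^{-1} x <= 1} for a sample PD matrix Q
Q = [[4.0, 1.0], [1.0, 2.0]]
det_Q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]          # = 7
Qinv = [[Q[1][1] / det_Q, -Q[0][1] / det_Q],
        [-Q[1][0] / det_Q, Q[0][0] / det_Q]]

def in_ellipsoid(x):
    """Membership test x^T Q^{-1} x <= 1."""
    qx = [Qinv[0][0] * x[0] + Qinv[0][1] * x[1],
          Qinv[1][0] * x[0] + Qinv[1][1] * x[1]]
    return x[0] * qx[0] + x[1] * qx[1] <= 1.0

random.seed(0)
N, hits, half = 200_000, 0, 2.1   # |x_i| <= sqrt(Q_ii) <= 2, so 2.1 is safe
for _ in range(N):
    x = (random.uniform(-half, half), random.uniform(-half, half))
    hits += in_ellipsoid(x)
estimate = hits / N * (2 * half) ** 2
exact = math.sqrt(det_Q) * math.pi     # sqrt(det Q) * Vol(B(0, 1))
```

With 200 000 seeded samples the estimate lands within a fraction of a percent of √7 · π ≈ 8.31.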
Central-cut ellipsoid algorithm: iteration details

How to construct E_(t+1) ⊇ E_t ∩ {x : aᵀx ≥ aᵀx_t}?

Theorem 8.1
Let E = E(z, Q) be an ellipsoid in ℝ^n, and let 0 ≠ a ∈ ℝ^n. Consider the halfspace H = {x : aᵀx ≥ aᵀz}, and let
z̄ = z + (1/(n+1)) · Qa / √(aᵀQa),
Q̄ = (n²/(n²−1)) · (Q − (2/(n+1)) · Q a aᵀ Q / (aᵀQa)).
Then Q̄ ≻ 0, thus E′ = E(z̄, Q̄) is an ellipsoid. Moreover,
(a) E ∩ H ⊆ E′,
(b) Vol(E′) < e^(−1/(2(n+1))) Vol(E).
E′ is the smallest-volume ellipsoid that contains E ∩ H.

Central-cut ellipsoid algorithm: termination and running time

- At iteration t, Vol(E_t) < (e^(−1/(2(n+1))))^t Vol(E₀) ≤ e^(−t/(2(n+1))) V
- P ⊆ E_t, so v ≤ Vol(P) ≤ Vol(E_t)
- Hence v < e^(−t/(2(n+1))) V, so t < 2(n+1) ln(V/v)
- Since t is an integer, we have t ≤ ⌈2(n+1) ln(V/v)⌉ − 1 = t* − 1, i.e., we must have x_(t*−1) ∈ P
- So the ellipsoid algorithm will terminate in at most t* iterations, each iteration consisting of a call to the separation oracle and some arithmetic
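The update of Theorem 8.1 is short enough to code directly (a pure-Python sketch with invented names, not lecture material). For the unit disk cut by {x : x₁ ≥ 0}, the formulas give z̄ = (1/3, 0) and Q̄ = diag(4/9, 4/3), and the volume shrinks by √det(Q̄) ≈ 0.77 < e^(−1/6) ≈ 0.85, as part (b) guarantees.

```python
import math

def ellipsoid_update(z, Q, a, n):
    """Central-cut update of Theorem 8.1 for H = {x : a^T x >= a^T z}."""
    Qa = [sum(Q[i][j] * a[j] for j in range(n)) for i in range(n)]
    aQa = sum(a[i] * Qa[i] for i in range(n))
    s = math.sqrt(aQa)
    z_new = [z[i] + Qa[i] / ((n + 1) * s) for i in range(n)]
    coef = n * n / (n * n - 1.0)
    Q_new = [[coef * (Q[i][j] - 2.0 / (n + 1) * Qa[i] * Qa[j] / aQa)
              for j in range(n)] for i in range(n)]
    return z_new, Q_new

# unit disk E(0, I) in R^2, cut by the halfspace {x : x_1 >= 0}
z_new, Q_new = ellipsoid_update([0.0, 0.0],
                                [[1.0, 0.0], [0.0, 1.0]],
                                [1.0, 0.0], 2)
```

The new ellipse touches (1, 0), (0, 1), and (0, −1), the extreme points of the half-disk, which is consistent with E′ being the smallest ellipsoid containing E ∩ H.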
Ellipsoid algorithm for linear inequalities

Goal: find out whether P = ∅, where P = {x ∈ ℝ^n : Ax ≥ b}.
- Assume P is bounded and either empty or full-dimensional. Also presume E₀, v, and V are known.
- Separation oracle: for given y ∈ ℝ^n, check whether a_iᵀ y ≥ b_i, i = 1, ..., m.
  - If all constraints are satisfied, y ∈ P.
  - If the i-th constraint is violated, a_iᵀ x ≥ a_iᵀ y is a separating hyperplane.
  The oracle requires O(mn) arithmetic operations.
- We assume for now that calculations are made in infinite precision and that taking a square root takes the same time as any other arithmetic operation.

Central-cut ellipsoid algorithm for linear inequalities

Input: ellipsoid E₀, constants V and v as above.
Initialization: E₀ = E(x₀, Q₀) ⊇ P, t = 0. Let t* = ⌈2(n+1) ln(V/v)⌉.
Iteration t (invariant: E_t = E(x_t, Q_t) ⊇ P):
- If t = t*, stop; P = ∅.
- If x_t ∈ P, stop; P ≠ ∅.
- Otherwise, find a violated constraint: a_iᵀ x_t < b_i. Construct an ellipsoid E_(t+1) = E(x_(t+1), Q_(t+1)) ⊇ E_t ∩ {x : a_iᵀ x ≥ a_iᵀ x_t} (use Thm. 8.1).
- Set t ← t + 1 and continue.
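The whole feasibility loop fits in a page for n = 2 (a pure-Python sketch under the infinite-precision assumption above; the function name, starting radius, and iteration cap are invented for illustration — a real implementation would derive E₀ and t* from the data as in the slides).

```python
import math

def ellipsoid_feasibility(A, b, R=10.0, max_iter=500, tol=1e-9):
    """Find x in R^2 with A x >= b component-wise, starting from
    E_0 = ball of radius R about the origin. Returns None if the
    iteration budget is exhausted (suggesting P is empty)."""
    n = 2
    z = [0.0, 0.0]
    Q = [[R * R, 0.0], [0.0, R * R]]
    for _ in range(max_iter):
        # separation oracle: first violated constraint a_i^T z < b_i
        a = next((ai for ai, bi in zip(A, b)
                  if ai[0] * z[0] + ai[1] * z[1] < bi - tol), None)
        if a is None:
            return z                    # z satisfies every constraint
        # central-cut update (Thm. 8.1) for {x : a^T x >= a^T z}
        Qa = [Q[0][0] * a[0] + Q[0][1] * a[1],
              Q[1][0] * a[0] + Q[1][1] * a[1]]
        aQa = a[0] * Qa[0] + a[1] * Qa[1]
        s = math.sqrt(aQa)
        z = [z[i] + Qa[i] / ((n + 1) * s) for i in range(n)]
        coef = n * n / (n * n - 1.0)
        Q = [[coef * (Q[i][j] - 2.0 / (n + 1) * Qa[i] * Qa[j] / aQa)
              for j in range(n)] for i in range(n)]
    return None
```

For instance, the box 0.5 ≤ x₁, x₂ ≤ 1 written as Ax ≥ b (rows (1,0), (0,1), (−1,0), (0,−1) with b = (0.5, 0.5, −1, −1)) is resolved in a few dozen iterations, consistent with the t* bound.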
Assumptions revisited: V

Lemma 8.2 (modified)
Let A ∈ ℤ^(m×n) and b ∈ ℤ^m, and let U = max{|a_ij|, |b_i|}.
(a) Every basic solution of {x : Ax ≥ b} satisfies
    −(nU)^n ≤ x_j ≤ (nU)^n, j = 1, ..., n.
(b) Every basic solution of P = {x : Ax = b, x ≥ 0} satisfies
    −(mU)^m ≤ x_j ≤ (mU)^m, j = 1, ..., n.

Proof of (a):
- A basic solution is Â⁻¹b̂, where Â ∈ ℝ^(n×n) is a submatrix of A and b̂ ∈ ℝ^n is a subvector of b.
- Cramer's rule gives a formula for each component x_j.

If the rows of A span ℝ^n, then P ≠ ∅ iff it has extreme points, all of which are contained in E₀ = E(0, n(nU)^(2n) I), with Vol(E₀) < V = (2n)^n (nU)^(n²).

Assumptions revisited: v

Lemma 8.4
Let P = {x : Ax ≥ b} be full-dimensional and bounded. Assume A and b have integer entries of magnitude at most U. Then
Vol(P) ≥ n^(−n) (nU)^(−n²(n+1)).

Idea of the proof:
- P is the convex hull of its extreme points.
- If P ⊆ ℝ^n is full-dimensional, it has n + 1 extreme points that do not belong to a common hyperplane.
- Vol(P) ≥ the volume of the convex hull of these extreme points.
- The latter can be bounded by bounding the components of the extreme points.
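The Cramer's-rule step behind Lemma 8.2(a) can be made concrete for n = 2 (an invented toy example, not from the lecture): each component of Â⁻¹b̂ is a ratio of determinants of integer matrices with entries bounded by U, so its magnitude stays within (nU)^n.

```python
def cramer_2x2(A, b):
    """Solve the 2x2 system A x = b by Cramer's rule: each x_j is a
    ratio of two determinants built from integer data."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    x1 = (b[0] * A[1][1] - A[0][1] * b[1]) / det
    x2 = (A[0][0] * b[1] - b[0] * A[1][0]) / det
    return [x1, x2]

A = [[3, 1], [1, 2]]
b = [5, 5]
x = cramer_2x2(A, b)          # basic solution of the 2x2 subsystem
n, U = 2, 5
bound = (n * U) ** n          # Lemma 8.2(a): |x_j| <= (nU)^n = 100
```

Here x = (1, 2), comfortably inside the bound; the point of the lemma is that this bound holds for every basic solution, which justifies the choice of the initial ellipsoid E₀.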
Assumptions revisited: P full-dimensional

Lemma 8.3
Let P = {x : Ax ≥ b}. Assume A and b have integer entries of magnitude at most U. Let
ε = (1/(2(n+1))) ((n+1)U)^(−(n+1)),
and let P_ε = {x : Ax ≥ b − εe}, where e = (1, ..., 1)ᵀ.
(a) If P is empty, then P_ε is empty.
(b) If P is nonempty, then P_ε is full-dimensional.

Run-time of the ellipsoid algorithm: number of iterations

P = {x : Ax ≥ b}. Assume A, b have integer components with magnitude bounded by U, and the rows of A span ℝ^n.

Case 1: If we know that P is bounded, and either empty or full-dimensional, take
E₀ = E(0, n(nU)^(2n) I), v = n^(−n) (nU)^(−n²(n+1)), V = (2n)^n (nU)^(n²),
and the algorithm with these inputs will terminate in O(n⁴ log(nU)) iterations. (Recall: the instance size is O(mn log U).)

Case 2: If P is arbitrary,
- Construct P_B by adding bounds on the variables as in Lemma 8.2; P_B contains all extreme points of P.
- Construct P_(B,ε) as in Lemma 8.3.
- To decide whether P = ∅, apply the ellipsoid algorithm to P_(B,ε); it will terminate in O(n⁶ log(nU)) iterations.
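Plugging the Case-1 constants into t* = ⌈2(n+1) ln(V/v)⌉ gives concrete iteration counts (a sketch with an invented function name): ln(V/v) is O(n³ log(nU)), so t* is O(n⁴ log(nU)), polynomial in the instance size O(mn log U).

```python
import math

def iteration_bound(n, U):
    """t* = ceil(2(n+1) ln(V/v)) with the Case-1 constants
    V = (2n)^n (nU)^(n^2) and v = n^(-n) (nU)^(-n^2 (n+1))."""
    V = (2 * n) ** n * (n * U) ** (n * n)
    v = n ** (-n) * (n * U) ** (-n * n * (n + 1))
    return math.ceil(2 * (n + 1) * math.log(V / v))
```

For n = 2, U = 2 one gets V/v = 2^38 and t* = ⌈6 · 38 ln 2⌉ = 159; the bound grows only logarithmically in U, as the O(n⁴ log(nU)) estimate says.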
Run-time of the ellipsoid algorithm: overall running time

- We showed that the number of iterations is polynomial in the size of the problem.
- To show that the algorithm takes polynomial time, we also need to show that the number of operations (run-time) per iteration is polynomial.
- Issues:
  - Taking square roots cannot be done exactly in finite-precision calculations.
  - We need to make sure the numbers generated have polynomial size.
- These problems can be resolved by showing that computations need only be carried out with finite (polynomial) precision. The result:

Theorem 8.3
The LP feasibility problem with integer data can be solved in polynomial time.

Ellipsoid algorithm for LP optimization

(P) min cᵀx s.t. Ax ≥ b        (D) max pᵀb s.t. pᵀA = cᵀ, p ≥ 0

Option 1: Apply the ellipsoid algorithm to
Q = {(x, p) : bᵀp = cᵀx, Ax ≥ b, Aᵀp = c, p ≥ 0}.

Option 2: Sliding objective: start by finding x₀ ∈ P; for each t, apply the ellipsoid algorithm to P ∩ {x : cᵀx < cᵀx_t}. If this set is empty, x_t is an optimal solution. Otherwise, we find a point x_(t+1) ∈ P that is better than x_t, and continue.
Practical implications?

- Although a great theoretical accomplishment, the ellipsoid algorithm never became a method of choice for solving LPs in practice.
- Its observed running time is close to its worst-case running time...
- ...unlike the simplex method
- ...or the interior point (barrier) methods.