Optimization (168) Lecture 7-8-9 Jesús De Loera UC Davis, Mathematics Wednesday, April 2, 2012 1
DEGENERACY IN THE SIMPLEX METHOD 2
DEGENERACY z =2x 1 x 2 + 8x 3 x 4 =1 2x 3 x 5 =3 2x 1 + 4x 2 6x 3 x 6 =2 + x 1 3x 2 4x 3 x 1,x 2,x 3,x 4,x 5,x 6,x 0 0 Clearly x 3 enters the basis, but who leaves? All variables x 4,x 5,x 6 give the same increase! Choose any!! Say x 4 pivot. z =4 + 2x 1 x 2 4x 4 x 3 =0.5 0.5x 4 x 5 = 2x 1 + 4x 2 + 3x 4 x 6 =x 1 3x 2 2x 4 x 1,x 2,x 3,x 4,x 5,x 6,x 0 0 NOTE: x 5,x 6 are basic, but they are also equal to ZERO! DEGENERATE PROBLEM! 3
This has annoying consequences. For example if we pivot again, x 1 enters the basis and x 5 leaves (limit of increment is zero!). z =4 + 3x 2 x 4 x 5 x 1 =2x 2 + 1.5x 4 0.5x 5 x 3 =0.5 0.5x 4 x 6 = x 2 + 3.5x 4 0.5x 5 x 1,x 2,x 3,x 4,x 5,x 6,x 0 0 This does not change the solution at all!!! We did not move!!! Definition: We say a dictionary is degenerate if one or more of the basic variables vanish. Definition: A pivot is degenerate if the objective function value does not change. Sometimes the simplex method goes through a few degenerate pivots one after the other, sometimes even CYCLING CAN HAPPEN!! Namely, a sequence of pivots that returns to the dictionary from which the cycle began. 4
We use a concrete pivot rule: (i) The entering variable will be the nonbasic variable that has the largest coefficient in the z-row. (ii) If two or more basic variables compete for leaving the basis, then the candidate with the smallest subscript will be made to leave. z =10x 1 57x 2 9x 3 24x 4 x 5 = 0.5x 1 + 5.5x 2 + 2.5x 3 9x 4 x 6 = 0.5x 1 + 1.5x 2 + 0.5x 3 x 4 x 7 =1 x 1 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 What is the next dictionary? Who enters who leaves? z =53x 2 + 41x 3 204x 4 20x 5 x 1 =11x 2 + 5x 3 18x 4 2x 3 x 6 = 4x 2 2x 3 + 8x 4 + x 5 x 7 =1 11x 2 5x 3 + 18x 4 + 2x 5 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 5
z =14.5x 3 98x 4 6.75x 5 13.25x 6 x 2 = 0.5x 3 + 2x 4 + 0.25x 5 0.25x 6 x 1 = 0.5x 3 + 4x 4 + 0.75x 5 2.75x 6 x 7 =1 + 0.5x 3 4x 4 0.75x 5 13.25x 6 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 After the third pivot. z =18x 4 + 15x 5 93x 6 29x 1 x 3 =8x 4 + 1.5x 5 5.5x 6 2x 1 x 2 = 2x 4 0.5x 5 + 2.5x 6 + x 1 x 7 =1 x 1 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 6
After the fourth pivot: z = 10.5x 5 70.5x 6 20x 1 9x 2 x 3 = 0.5x 5 + 4.5x 6 + 2x 1 4x 2 x 4 = 0.25x 5 + 1.25x 6 + 0.5x 1 0.5x 2 x 7 = 1 x 1 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 After the fifth pivot: z = 24x 6 + 22x 1 93x 2 21x 3 x 5 = 9x 6 + 4x 1 8x 2 2x 3 x 4 = x 6 0.5x 1 + 1.5x 2 + 0.5x 3 x 7 = 1 x 1 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 Finally, if we do it one more time we go back to where we started!!! QUESTION How to deal with this potential problem? 7
Commercial Break: MATRIX NOTATION 8
The standard linear program is We start with a problem in the form max s.t. n j=1 n j=1 c j x j x j 0 a ij x j b i i = 1...m Turn it into a DICTIONARY by adding slack variables (one for each inequality). z = n j=1 x n+i = b i x j, 0 c j x j n j=1 a ij x j for i = 1...m We can rewrite it using matrices: maxc T x subject to Ax = b, x 0 where 9
a 11 a 12 a 1n 1 a 21 a 22 a 2n 1 A =........... a m1 a m2 a mn 1, x = x 1 x 2. x n x n+1. x n+m b = c = [ c 1 c 2 c n 0 0 ]. Each iteration the simplex method selects some basic variables B and nonbasic N. We separate the equation Ax as A B x B + A N x N where the indices B, N label the columns. involved. Thus we can write A B x B = b A N x N. Lemma At each step A B is invertible (WHY?) Multiply both sides by the inverse A 1 B : x B = A 1 B b (A 1 B )A Nx N. b 1 b 2. b m To obtain the new objective function row: c B x B + c N x N substitute blue equation:. z = c B (A 1 B b (A 1)A Nx N ) + c N x N = c B A 1 b + (c N c B A 1 A N)x N B B B 10
z = c B A 1 B b + (c N c B A 1 B A N)x N = c B B 1 b + (c N c B B 1 N)x N. x B = A 1 B b (A 1 B )A Nx N = B 1 b B 1 Nx N. 11
12
13
Let us compare!! The matrix formulas for the dictionary are z = c B B 1 b + (c N c B B 1 N)x N. x B = B 1 b B 1 Nx N. Did we get the same answer in our example? (x 2 enters, x 4 leaves). 14
From matrix notation commercial: A dictionary is uniquely determined by the choice of BASIC variables. Proposition If the simplex method fails to terminate, then it is because it cycled! Proof: Each dictionary is uniquely determined by the choice of BASIC and NON-BASIC variables. ( ) n+m If the problem has n variables, m inequalities, then there are m possible dictionaries. Thus if it fails to terminate, we must repeat a dictionary! Every pivot on a cycle must be degenerate!! WHY? There are two ways to fix the problem: Perturbation and Bland s Rule Definition: A Pivot rule is a method to choose which variable enters and which variable leaves. Largest coefficient rule: Choose the variable with the largest coefficient in the objective function. Greatest increase rule: Pick the entering/leaving pair so as to maximize the increase of the objective function over all possibilities. Random selection rule: Select uniformly at random from the set of possibilities. 15
Bland s pivot rule was the first ever pivot rule that did not cycle!!! First ever pivot rule that guaranteed simplex method will terminate!!! Theorem(Bland 1977) The simplex method terminates as long as the entering and leaving variables are selected by the Smallest-Subscript rule (Bland s rule): ALWAYS CHOOSE THE CANDIDATE x k THAT HAS THE SMALLEST SUBSCRIPT k 16
The Fundamental Theorem of Linear Programming Corollary Every LP in the standard form maxc T x subject to Ax = b, x 0 has the following three properties: Proof: If it has no optimal solution, then it is either infeasible or unbounded. If it has a feasible solution, then it has a basic feasible solution. If it has an optimal solution, then it has a basic optimal solution If the simplex method does not cycle, it must terminate. This can be achieved thanks to Bland s rule!. Phase I algorithm either proves that original problem is infeasible OR produces a basic feasible solution. Phase II either proves problem is unbounded OR reaches a basic optimal solution. 17
18
We start with the cycling example. z =10x 1 57x 2 9x 3 24x 4 x 5 = 0.5x 1 + 5.5x 2 + 2.5x 3 9x 4 x 6 = 0.5x 1 + 1.5x 2 + 0.5x 3 x 4 x 7 =1 x 1 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 We do a perturbation by adding ε i s. We think that 0 << ε 3 << ε 2 << ε 1 << 1. z =10x 1 57x 2 9x 3 24x 4 x 5 = ε 1 0.5x 1 + 5.5x 2 + 2.5x 3 9x 4 x 6 = ε 2 0.5x 1 + 1.5x 2 + 0.5x 3 x 4 x 7 =1 + ε 3 x 1 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 19
Sometimes the best way to think about the ε i is that for 0 < ε < 1 we set ε i = ε i. Who is entering now? x 1. The constraints that limit the increase of x 1 are: x 5,x 6,x 7, thus 2ε 1,2ε 2,1 + ε 3. HOW to tell which is bigger? Use the force... of the LEXICOGRAPHIC RULE! These numbers are like words in a dictionary r 0 + r 1 ε 1 + r 2 ε 2 + + r m ε m, s 0 + s 1 ε 1 + s 2 ε 2 + + s m ε m 2 + 21ε 1 + 19ε 2 + + 20ε 3 < 2 + 21ε 1 + 20ε 2 + + 15ε 3 We can see 2ε 2 < 2ε 1 < 1 + ε 3. We have the variable x 6 leaves!! 20
z =20ε 2 27x 2 + x 3 44x 4 20x 6 x 1 =2ε 2 +3x 2 + x 3 2x 4 2x 6 x 6 =ε 1 ε 2 +4x 2 + 2x 3 8x 4 + x 6 x 7 =1 2ε 2 + ε 3 3x 2 x 3 + 2x 4 + 2x 6 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 WHO ENTERS? WHO LEAVES? x 3 enters and x 7 leaves! z =1 + 18ε 2 + ε 3 30x 2 42x 4 18x 6 x 7 x 3 =1 2ε 2 + ε 3 3x 2 + 2x 4 + 2x 6 x 7 x 1 =1 + ε 3 x 7 x 5 =2 + ε 1 5ε 2 + 2ε 3 2x 2 4x 4 + 5x 6 2x 7 x 1,x 2,x 3,x 4,x 5,x 6,x 7 0 We have an optimum! This can be converted back to the original dictionary by simply disregarding the terms with ε i. 21
Efficiency Question: Given a problem of a certain size, how long will it take to solve it? How do you measuring duration? (e.g., Number of iterations, time per iteration, etc). Two Kinds of Answers: Average Case How long for a random problem. Worst Case How long for the hardest problem. Performance will depend on the size of the problem!! but... what is the size of a problem? Number of constraints m and/or number of variables n. Size in bytes for the formulation. We are looking for a function that describes the number of pivots. Some functions are faster than others! Some algorithms are better than others! 22
23
Theorem V. Klee and G. Minty (1972) showed that there are explicit linear programs with n constraints for which the largest-coefficient pivot rule can take 2 n 1 pivots to reach the optimal solution. But we know today there are many alternative methods to the SIMPLEX METHOD: Fourier-Motzkin Elimination (Goes back to Fourier, rediscovered by T. Motzkin. Interesting in theory but much slower than Simplex) Relaxation Methods (invented by T. Motzkin and his students in the 1930 s, 1940 s. Interesting in theory but much slower than Simplex). Kachiyan s Ellipsoid Method (invented in the late 1970 s. First ever polynomial time algorithm for solving linear programs. SLOW!) Karmarkar s Interior Point Methods (invented in the late 1980 s. Good theoretical and practical performance!! Compete with Simplex!). Many other algorithms exist, but they are mostly variations or improvements of those mentioned so far. 24