Duality in Linear Programming - PDF Free Download

Duality in Linear Programming Gary D. Knott Civilized Software Inc. 1219 Heritage Park Circle Silver Spring MD 296 phone:31-962-3711 email:knott@civilized.com URL:www.civilized.com May 1, 213.1 Duality in Linear Programming Consider the following standard primal linear programming problem. (Here we impose n 1 and m 1. Also note that in this section the symbol c is not reserved to be an (n + m)-covector; it may have differing numbers of components in differing contexts.) Determine x (R n ) to maximize c T x subject to A o x b and x where A o is an m n matrix, and c (R n ), and b (R m ). Now note that if y with y (R m ) then A o x b implies y T A o x y T b. This is because y T A o x y T b is just the sum of the non-negatively-scaled inequalities y i (A o row i, x) b i, i.e., 1 i m y i(a o row i, x) 1 i m y ib i. Now let P = {x (R n ) A o x b and x }. Suppose x is a feasible point for our primal linear programming problem, so x P, i.e., A o x b and x. Also suppose we can choose y 1,...,y m defining an m-covector y such that (y T A o ) i c i and y i for i = 1,...,n. Then c T x y T A o x, and we have y T A o x y T b, so altogether we have c T x y T A o x y T b. We see that y T A o x and y T b are both upper-bounds of our primal objective function ζ(x) = c T x for admissible choices of x and y, and y T b is independent of x, so for any admissible choice of y, y T b is an upper-bound of c T x for all feasible points x P. Exercise.1: Why must we assume both A o x b and x to conclude that c T x y T A o x y T b? If we want to make y T b = b T y a close upper-bound for c T x, we will want to choose y, subject to y T A o c T and y, to make b T y as small as possible. This is just a linear programming problem: determine y (R m ) to minimize b T y subject to (A o ) T y c and y where A o is an m n matrix, and c (R n ), and b (R m ). 1

2 This linear programming problem is called the standard dual problem associated with our standard primal linear programming problem. If this dual problem has a feasible point then it has an optimal point. Moreover, for every feasible point, y, of our dual problem, we have c T x b T y for all feasible points x of the corresponding primal problem. (Note if this dual problem is non-empty, i.e., has a feasible point, and the only optimal points for this dual problem are points at infinity, then the value of the objective function b T y will be at such an optimal point and the primal problem can have no solution. This is because b T y is minimized when y is an optimal point.) Exercise.2: Suppose our dual problem has a feasible point. Why must b T ŷ = when all the optimal points of our dual problem are points at infinity? Hint: consider how we might go from a finite feasible point to a point-at-infinity optimal point. Exercise.3: If b =, is the standard dual problem guaranteed to have a solution? Exercise.4: Show that the dual problem of the linear programming problem: determine y (R m ) to minimize b T y subject to (A o ) T y c and y where A o is an m n matrix, and c (R n ), and b (R m ) is just our original primal programming problem: determine x (R n ) to maximize c T x subject to A o x b and x. Solution.4: Constructing the dual problem of a given primal linear programming problem is a syntactic process. When we state a given linear programming problem in the standard form: determine x (R n ) to maximize c T x subject to A o x b and x where A o is an m n matrix, and c (R n ), and b (R m ). Then the dual problem is: determine y (R m ) to minimize b T y subject to (A o ) T y c and y. This dual problem is equivalent to the standard-form problem: determine y (R m ) to maximize b T y subject to ( A o ) T y c and y. Syntactically, we replace c T with b T, b with c, and A o with ( A o ) T in our standard primal linear programming problem in order to construct its dual problem in standard form. A o b In array form, the array c T describing our primal problem becomes the array ( A o ) T c b T describing our dual problem. This is sometimes summarized by saying the dual problem is the negative transpose of the primal problem. ( A Now note that the negative transpose of o ) T c A o b b T is just c T again. Thus the dual of the dual of our primal problem is our primal problem itself. Exercise.5: Show that the dual problem of the linear programming problem: determine x (R k ) to maximize c T x subject to Ax = b and x where A is an m k matrix, and c (R k ), and b (R m ) is the problem: determine w (R m ) to minimize b T w subject to A T w c. Note c and x are here k-covectors. This fluidity in the meanings of the symbols c and x is a potential source of confusion herein; beware. (We shall sometimes write c o to indicate c o (R n ), but this is not uniformly done because it is often counterproductive.)

A b Solution.5: Put this problem is standard form by writing Ax = b as x. A b A b Then the primal problem array is A b and the corresponding negative transpose array c T A for the dual problem is T A T c b T b T, and this specifies the dual problem as: determine y1 to maximize b y T b T y 1 subject to A 2 y T A T y 1 y1 c and 2 where y 1 and are m-covectors. And this problem can be restated as: determine w to minimize b T w subject to A T w c. This is because max b T b T y1 = max b T y y 1, y 1 + b T 2 y 1, where w = y 1. And A T A T y 1 = max y 1, b T y 1 + = max w bt ( w) = min bt w w c and y1 3 is equivalent to A T ( y 1 + ) c and y 1 and, which is equivalent to A T w c with w (R m ) otherwise unrestricted. (Given w, we may determine y 1 and as follows. Take (y 1 ) i = w i when w i and take (y 1 ) i = otherwise. And take ( ) i = w i when w i < and take ( ) i = otherwise.) c Note if k = n+m and A = A o I where A o o is an m n matrix and c = with c o (R n ), then this dual problem becomes determine w (R m ) to minimize b T w subject to (A o ) T w c o and w. Exercise.6: Show that the linear programming problem: determine w to minimize b T w subject to A T w c is equivalent to the linear programming problem: determine z to maximize b T z subject to A T z c. Exercise.7: Show that the dual problem of the slack-variable form of our primal linear programming problem is equivalent to the standard problem dual to our standard primal problem. Solution.7: Put the slack-variable form of our primal problem in standard form as: determine x to maximize (c o ) T o x A I x o b subject to. Here t t A I t b represents the slack variables. A o I b Then the corresponding array representation is A o I b and the corresponding neg- (c o ) T

ative transpose array is ( A o ) T (A o ) T c o ( I) T I T b T b T to maximize b T b T y 1, and this specifies the dual problem: deter- y1 ( A mine subject to o ) T (A o ) T y1 c o ( I) T I T. y1 And this problem is equivalent to determine to maximize b T (y 1 ) subject to ( A o ) T y 1 + (A o ) T c o and y 1 + which is in turn equivalent to determine w to maximize b T w subject to ( A o ) T w c o and w where w = y 1, and finally, this is equivalent to determine w to minimize b T w subject to (A o ) T w c o and w. Note the array form we used to introduce the negative transpose dual is not the form we use in the A o tableau representation of the simplex algorithm. This is because we require c T to transform to ( A o ) T c b corresponding to (A o ) T c, while we need to transform to b T ; if we were to change the sign of c, we would need to change the sign of b also. 4.1.1 The Strong Duality Theorem Let P be the polyhedron of our primal linear programming problem: determine x (R n ) to maximize c T x subject to A o x b and x where A o is an m n matrix, and c (R n ), and b (R m ) and let Q be the polyhedron of the associated dual linear programming problem: determine y (R m ) to minimize b T y subject to (A o ) T y c and y. Thus P = {x (R n ) A o x b and x } and Q = {y (R m ) (A o ) T y c and y }. We saw above that if x P and y Q, then c T x b T y. This is called the weak-duality theorem. Thus if c T x = b T y with x P and y Q, then x must be an optimal point of P that maximizes c T x and y must be an optimal point of Q that minimizes b T y. This is because b T y max{c T v v P }. And when b T y = c T x, b T y is the least possible upper-bound of {c T v v P }, and thus b T y = c T x = max{c T v v P }, so x must be a solution of our primal linear programming problem. Similarly, c T x is the greatest possible lower-bound of {b T w w Q}, and thus y must be a solution of our dual linear programming problem when c T x = b T y. Also we have the following corollary of the weak duality theorem. If the polyhedron P of our primal linear programming problem contains a point at infinity x for which the objective function value c Tˆx =, then, since b T y c Tˆx whenever y Q, b T y must also be for all y Q. But this is impossible if Q is non-empty, since Q then contains a finite feasible point y, i.e., Q cannot be composed entirely of points at infinity, and then the dual objective function value b T y is finite and the minimal value of b T y must be no greater than b T y. Hence Q must be empty. Thus if our primal linear programming problem is unbounded with an unbounded objective function value, then the dual linear programming problem must be empty i.e., Q =. Conversely, if Q is non-empty, our primal problem must have a finite optimal point. Similarly, if the polyhedron Q of our dual linear programming problem contains a point at infinity ŷ as an optimal point for which the objective function vaue b T ŷ =, then the primal linear

5 programming problem must be empty, i.e., P =. And conversely, if P is non-empty, our dual problem must have a finite optimal point. Remember a linear programming problem may have an associated unbounded polyhedron without its objective function being ±, but if the objective function is ± then the associated polyhedron is necessarily unbounded. If either our primal or dual linear programming problem has a finite optimal point, then so does the other problem. And in fact, the values of their respective objective functions must be identical! This fact is called the strong-duality theorem. Exercise.8: Give an example where P is unbounded but the dual problem is non-empty and has a finite optimal point. Exercise.9: Give an example where P = and Q =. Solution.9: 1 1 primal: determine x to maximize x 1 subject to x and x. 1 1 1 dual: determine y to minimize y 1 subject to y and y. 1 Altogether we have the following possiblities for our standard-form primal and dual problems. 1. Our primal problem has a finite optimal point ˆx and our dual problem has a finite optimal point ŷ and c Tˆx = b T ŷ. Also, if one problem has a face of finite solutions of dimension greater than, then the finite solution of the other problem is an overdetermined vertex. 2. Our primal problem has a point-at-infinity optimal point ˆx and c Tˆx =, and our dual problem has no solution, (Q is empty.) 3. Our dual problem has a point-at-infinity optimal point ŷ and b T ŷ =, and our primal problem has no solution, (P is empty.) 4. Neither our primal problem or our dual problem have solutions, (P is empty and Q is empty.) Recall we have a criterion that specifies exactly when P = {x (R n ) A o x b and x } is empty, namely: P is empty exactly when b K = {p (R m ) p = A o Ix and x } which (A is equivalent to the criterion: there exists y (R m ) such that o ) T y and b I T y <. The possibilities above are somewhat subtle when P or Q is unbounded. For example, consider the primal problem: find x (R 2 ) to maximize c T x subject to x P where 1 1 P = {x (R 2 ) A o x b and x } with A o = and b =, i.e., 1 { } 1 1 P = x (R 2 ) x1 and x. Note b = means P is a cone 1 x 2 with { apex. The dual problem is: find y } (R 2 ) to minimize subject to y Q where 1 y1 c1 Q = y and y. 1 1 c 2

In the case where c =, all the points in P, including points at infinity, are optimal points, but c T x = for x P; and the polyhedron Q is the singleton set {(, ) T } so the solution of the dual problem is the finite point ŷ = T =. Also note (, ) T is an overdetermined vertex of Q. (And (, ) T is an overdetermined vertex of P as well.) 1 In the case where c =, we have the set of points {(, x 2 ) T x 2 } P as optimal points of our primal problem with c T x = for x {(, x 2 ) T x 2 }; all these optimal points are points at infinity. And the dual problem has Q =. 1 In the case where c =, we have the set of points {(, x 2 ) T x 2 } P as optimal points of our primal problem with c T x = for x {(, x 2 ) T x 2 }; all these optimal points are finite (R 2 ) -points. And the dual problem has every point of Q as an optimal point, where Q = {y (R 2 ) y 1 1 and y 1 } and b T y = T y = for y Q. Note Q is bounded. Exercise.1: With respect to the example discussed above, what is the set Q, and the set of solutions of our primal problem, and the set of solutions of our dual problem, when c T = ( 1, 1)? What are the solution sets when c T = ( 1, 1)? The fact that, if ˆx P is a finite optimal point of our standard-form primal problem then there exists ŷ Q such that ŷ is a finite optimal point of our standard-form dual problem, and conversely, and moreover c Tˆx = b T ŷ, can be proven as follows. Suppose ˆx is a finite optimal point of our primal linear programming problem with ˆx a vertex of P. Then the gradient covector c belongs to the normal cone of ˆx with respect to P. Recall that P = H 1 H n H n+1 H n+m where H i = {x (R n ) ( e T i, x) } for i = 1,...,n, and H n+j = {x (R n ) (a T j, x) b j} for j = 1,...,m where a j = A o row j. Here e T 1,..., et n, a T 1,...,aT m are all outwardly-directed { normal vectors of the } half-spaces whose A intersection defines the polyhedron P. Thus P = x (R n ) o b x. I Let H i 1,...,H iq be the half-spaces whose boundaries contain the optimal vertex ˆx; this means the bounding hyperplanes H i1,..., H iq intersect at the vertex ˆx. Let v 1,...,v q denote the outwardly-directed normal vectors of the half-spaces H i1,...,h iq. Because H i1 H iq = {ˆx}, we have q n and dim({v 1,...,v q }) = n. The normal cone at ˆx with respect to P is the polyhedral cone Nˆx = cone (v 1,...,v q ), and we have seen that c Nˆx because ˆx is an optimal vertex of P that maximizes (c, x). Thus c is a coniccombination of the covectors v 1,...,v q. (Remember a conic-combination is a linear combination of vectors (or covectors) with non-negative scalar coefficients.) The covectors v 1,...,v q all appear as columns of the m (n + m) matrix (A o ) T I. (Recall Nˆx is just the polar dual cone of the -apex cone H i1 H i2 H iq ˆx.) Carathéodory s theorem for polyhedral cones asserts that we can write c as a conic-combination of 6

7 a subset of n of the covectors v 1,...,v q. Thus we can write c = (A o ) T I y z where y (R m ) and z (R n ) with y and z, and at most n of the components of the y covector are positive and the rest are zero. z Exercise.11: Show that the columns of the matrix (A o ) T I are the outwardlydirected normal covectors a T 1,...,aT m, e T 1,..., et n. Thus c = (A o ) T y z, and since y is a non-negative m-covector and z is a non-negative (n m)- covector, we have c (A o ) T y and y. Note this is valid in the special case where c =. Exercise.12: Explain why the relations c (A o ) T y and y are valid when P = {}. What if an entire facet of P is comprised of optimal points of P? Exercise.13: Show that if ˆx, then at least one facet of P containing ˆx lies in a nonorthant-boundary hyperplane among H n+1,..., H n+m, i.e., at least one of the inequalities Aˆx b is tight and at least one component of y is positive. The relations c (A o ) T y and y mean y Q (!) So our dual problem has y as a feasible point and therefore our dual problem is not empty and hence has an optimal point ŷ with b T ŷ b T y. And c Tˆx b T ŷ by the weak duality theorem, so b T ŷ > and hence ŷ is finite when ˆx is finite. Suppose ŷ is a finite optimal point of our dual linear programming problem with ŷ a vertex of Q. Then the dual problem gradient covector b belongs to the normal cone of ŷ with respect to Q. { (A Since Q = y (R m ) o ) T } c y and the gradient vector of our dual problem is I b, we may follow the same line of argument as used above to characterize b Nŷ by writing b = A o I x where x (R n ) and v (R m ) with x and v and at most m of the components of the x covector are positive and the rest are zero. Here the columns of A v o I are outwarddirected normals of the half-space boundaries whose intersection defines ŷ. Thus b = A o x v where x and v, so b = A o x + v and x and v, and thus A o x b with x. The relations A o x b and x mean x P so our primal problem has x as a feasible point and therefore our primal problem is not empty and hence also has an optimal point ˆx. And we have c Tˆx b T ŷ by the weak duality theorem, so c Tˆx < and hence ˆx is finite when ŷ is finite. Now let us return to the representation of the gradient covector c as a conic combination of n of the outwardly-directed covectors e T 1,..., et n, a T 1,...,aT m where a j = A o row j. Let j 1,...,j n be the indices in {1,...,n + m} of the hyperplanes H j1,..., H jn from among the hyperplanes v

8 H 1,..., H n+m such that H j1,..., H jn are linearly-independent hyperplanes that intersect at y {ˆx}. The corresponding constraints are tight at ˆx and hence we may choose the covector so z y that zero or more of the non-negative coefficients in the covector that multiply the outwardlydirected normals of the hyperplanes z H j1,..., H jn in the columns of (A o ) T I will be positive. y and all the other components of will be zero. z Exercise.14: Show that {j 1,...,j n } {i 1,...,i q }. Exercise.15: Explain why we said zero or more above. y Let p =, so that p row 1 : m = y z and p row (m + 1) : (m + n) = z. Let t = b A oˆx. Define the sequence J = j 1,...,j n and define the sequence K = 1,...,n+m J. (The ordering of the indices in J and K are not important.) We have p row J and p row K =. The hyperplane normals indexed by K include all the slack constraints. Thus (A o ) T I col K contains all the columns of (A o ) T I where (A o row i) Tˆx < b i for 1 i m or e T i ˆx < for 1 i n. The components of p corresponding to these slack columns lie in p row K and are thus all zero. The non-zero components of p all lie in p row J and correspond to tight columns of (A o ) T I where (A o row i, ˆx) = or ˆx i =. This means (b A oˆx) T ˆx T p =! That is (b A oˆx) T ˆx T y =. z Thus (b A oˆx) T y + ˆx T z =. Exercise.16: Show that t T y + ˆx T z = where t = b A oˆx, and hence t T y = and ˆx T z =. Recall we have c = (A o ) T y z. Thus b T y = ˆx T (A o ) T y ˆx T z = ˆx T ((A o ) T y z) = ˆx T c = c Tˆx. Therefore, as we have seen above, y must be an optimal point of our dual problem and we have b T ŷ = c Tˆx where ŷ = y. The conclusion that y is an optimal point of Q is a consequence of the corollary of the weak duality theorem discussed above. Similarly, we may follow the line of reasoning parallel to that used above to establish the identity (( A o ) T ŷ + c) T ŷ T x =. v Thus ŷ T A o x + c T x ŷ T v =. But then c T x = ŷ T (A o x + v) = ŷ T b = b T ŷ since we showed that v + A o x = b above. Therefore x must be an optimal point of our primal problem and again we have c Tˆx = b T ŷ where ˆx = x. (We also have t = v because x = ˆx.) QED

9 We have c T x y T A o x b T y when x P and y Q; and when ˆx P maximizes c T x subject to x P and ŷ Q minimizes b T y subject to y Q, we have c Tˆx = ŷ T A oˆx = b T ŷ. Now we can deduce ŷ T (b A oˆx) = from ŷ T A oˆx = b T ŷ. And we can deduce ˆx T ((A o ) T ŷ c) = from c Tˆx = ŷ T A oˆx. The identities ŷ T (b A oˆx) = and ˆx T ((A o ) T ŷ c) = are called the complementary slackness conditions. Exercise.17: We have already obtained the complementary slackness conditions above. Identify where they are derived in the proof of the strong duality theorem. Let t = b A oˆx; we have t because A oˆx b. The elements of the m-covector t are the slack variables of our primal linear programming problem. Similarly, Let z = (A o ) T ŷ c; we have z because (A o ) T ŷ c. The elements of the n-covector z are the slack variables of our dual linear programming problem. The complementary slackness conditions are then ŷt = and ˆxz =. Since ŷ, t, ˆx, and z, these conditions imply that if ŷ i, t i =, and if t i, ŷ i =. And similarly, if ˆx i, z i =, and if z i, ˆx i =. In fact, unless ˆx is an overdetermined vertex of P, exactly one of ˆx i and z i are zero, and unless ŷ is an overdetermined vertex of Q, exactly one of ŷ i and t i are zero.1.2 Computing a Solution of the Dual Problem The simplex algorithm can be applied to our dual problem recast as: determine y (R m ) to maximize b T y subject to ( A o ) T y c o and y where A o is an m n matrix, and c o (R n ), and b (R m ), but in fact the simplex algorithm applied to the corresponding primal problem, (i.e., to the dual of this problem,) also produces an optimal point of this dual problem when our primal problem has a finite optimal point (!) We can see this as follows. Recall our primal problem is: determine x o (R n ) to maximize (c o ) T x o subject to A o x o b and x o where A o is an m n matrix, and c o (R n ), x o (R n ), and b (R m ). This corresponds to the slack-variable form problem: determine x (R n+m ) to maximize c T x subject to Ax = b and x where A = A o I is an m (n + m) matrix, and c T = (c o ) T R n+m, x T = (x o ) T t T R n+m, (the elements of the covector x row (n + 1) : m = t constitute our slack variables,) and b (R m ). When the simplex algorithm applied to our slack-variable primal problem halts with a finite optimal point ˆx (R n+m ), we have a length-n basis sequence N and a corresponding orthant-boundary constraint sequence Z = 1,...,n + m N, and ˆx row N = w and and ˆx row Z = and Aˆx = b. We also have the reduced gradient covector d where d T = (c T col N)B 1 A c T and where B is the non-singular m m matrix A col N. And z = c Tˆx. Recall w = B 1 b. Now d Tˆx = (c T col N)B 1 Aˆx c Tˆx. We know d since this is the termination condition in the simplex algorithm for the discovery of a finite optimal point. Also d row N =. Moreover, c Tˆx = (c T col N)B 1 b since d Tˆx = and Aˆx = b. Alternatively, (c T col N)B 1 b = (c T col N)(ˆx row N)

1 = (c T col N)(ˆx row N) + (c T col Z)(ˆx row Z) = c Tˆx. Exercise.18: Show that d Tˆx =. Hint: look at d row N, d row Z, ˆx row N, and ˆx row Z. Thus we have c Tˆx = (c T col N)B 1 b. Let ŷ T = (c T col N)B 1. Then c Tˆx = ŷ T b = b T ŷ. So ŷ might potentially be an optimal point of our dual problem: determine y (R m ) to minimize b T y subject to (A o ) T y c and y where A o is an m n matrix, and c (R n ), and b (R m ). Indeed ŷ will be an optimal point if ŷ is a feasible point of our dual problem; this is a consequence of the strong duality theorem. c But, we have the termination condition: d T = ŷ T A c T, and A = A o o I and c =, so d T = ŷ T A o c o ŷ T. Thus ŷ T A o (c o ) T and ŷ T, i.e., (A o ) T ŷ c o and ŷ. Thus ŷ is a feasible point of our dual problem, so ŷ is in fact an optimal point of our dual problem. Note ŷ T = d T col (n+1) : (n+m). Thus our terminal tableau exhibits not only ˆx (as ˆx row N = w and ˆx row Z =,) but also ŷ (as ŷ T = d T col (n + 1) : (n + m),) and the common value of the primal and dual objective functions is given by z = c Tˆx = b T ŷ. Exercise.19: What are the values of the n slack variables associated with the dual problem? Also, we can obtain a constructive proof of the strong duality theorem from the above specification of ŷ. Given that ˆx is a finite optimal point of our primal problem and ŷ is a feasible point of our dual problem and b T ŷ = c Tˆx, we may conclude that ŷ is an optimal point of our dual problem. Since the weak duality theorem requires min y Q b T y max x P c T x = c Tˆx, and min y Q b T y b T ŷ = c Tˆx, we see that min y Q b T y c Tˆx = b T ŷ min y Q b T y so min y Q b T y = b T ŷ. And when ˆx is a finite optimal point of our primal problem, we see that ŷ = B 1T (c row N) is a feasible point of our dual problem with the associated objective function value b T ŷ = c Tˆx, so ŷ is an optimal point of our dual problem..1.3 The Geometry of Duality Recall our standard primal linear programming problem: determine x (R n ) to maximize c T x subject to A o x b and x where A o is an m n matrix, and c (R n ), and b (R m ). We may introduce m slack variables t 1,...,t m defined in terms of x by t = b A o x, necessarily with t. Now consider the following slack-variable formulation of our primal linear programming problem. (Recall we assume n 1 and m 1.) Determine x (R n+m ) to maximize c T x subject to Ax = b and x where A is an m (n+m) rank m matrix, and c (R n+m ), and b (R m ).

11 Here we have redefined x to be an (n + m)-covector with the m added slack-variable components x n+1 = t 1,..., x n+m = t m, and we have redefined c to be an (n + m)-covector by extending it with c row (n + 1) : (n + m) =, and we define A to be the m (n + m) matrix A o I. The corresponding polyhedron is P = {x (R n+m ) Ax b and x }. (In fact the following results do not depend on the matrix A being of the form A o I; the following is valid when A is an arbitrary m (n + m) rank m matrix.) The corresponding dual linear programming problem is: determine y (R m ) to minimize b T y subject to A T y c where A is an m (n + m) rank m matrix, and c (R n+m ), and b (R m ). Let k = n + m so that A is an m k matrix. The slack-variable primal problem above can be written as: determine x (R k ) to maximize c T x subject to A(x d) and x where d (R k ) and satisfies Ad = b. Note when A = A o I, we can choose d =. When there is no covector d such that Ad = b, b our slack-variable primal problem has no solution; otherwise the m equations Ax = Ad specify a (k m)-dimensional flat F in (R k ). Even if d exists such that Ad = b, our problem will only have a solution if d can be chosen such that d. Let us assume our slack-variable primal problem has a feasible point, i.e., the corresponding polyhedron P = F O + k is non-empty. Prabhakaran Pra2 gives a way to lift the dual problem from (R m ) to (R k ). The basic idea is that the m-dimensional polyhedron Q = {y (R m ) A T y c} of the dual problem can be mapped one-to-one into an m-dimensional flat G in (R k ) orthogonal to the (k m)-dimensional flat F (R k ) where F = {x (R k ) Ax = b} = {x (R k ) Ax = Ad} is the flat associated with the slack-variable formulation of our primal linear programming problem. First introduce k slack variables s 1,...,s k to write the dual problem as: determine s (R k ) and y (R m ) to minimize b T y subject to s = A T y c and s. Note that the constraints s = A T y c define s in terms of y (R m ) and thus confine s to an m-dimensional flat in (R k ). The map y A T y c is the map that carries Q into the flat G orthogonal to F. When there are no covectors s and y such that A T y = c + s with s, our dual problem has no solution. Let us assume our dual problem has a feasible point, i.e., the corresponding polyhedron Q is non-empty. Exercise.2: Show that if our primal slack-variable form problem has a feasible point and our dual slack-variable form problem also has a feasible point, then both our primal and our dual problem have finite optimal points. Now recall k m = n and define A as a n k rank n matrix whose rows form a basis of rowspace(a). Note A A T = O n m. (When A = A o I m m, A can be taken as I n n (A o ) T.) The dual problem constraint s + c = A T y can now be transformed to yield A (s + c) = because A (s + c) = A A T y and A A T = O n m and A has rank n.

12 Exercise.21: Show that when Q, s+c rowspace(a). And then, since rank(a) = m, there exists a unique m-covector y such that A T y = s + c. Exercise.22: Show that A T Q = {A T y (R k ) A T y c} = {s + c (R k ) s = A T y c and s } = {s + c (R k ) A (s + c) = and s }. Also the dual problem objective function b T y can be written as d T (c + s) since we have assumed there is a covector d (R k ) such that b T = d T A T, so b T y = d T A T y, and we have c + s = A T y. And thus b T y = d T (c + s) and we see that choosing y to minimize b T y is equivalent to choosing s to minimize d T s. Thus our dual problem is transformable to: determine s (R k ) to minimize d T s subject to A (s + c) = and s. The n equations A s = A c specify an m-dimensional flat G in (R k ) where G = {s (R k ) A s = A c}. Recall F = {x (R k ) Ax = Ad}. Note c G and d F. When our slack-variable primal and dual problems both have feasible points, we have ldim(f) = k m = n and ldim(g) = m, and in fact F G. Exercise.23: Show that G = flat(a T Q ) c. Since the rows of A are normal to the rows of A, the (R k ) -flats F and G are orthogonal, and since the subspaces F d and G + c are likewise orthogonal and their dimensions sum to k, they are complementary to one-another in (R k ), and (F d) (G + c) = {}. Hence there is a unique (R k ) -point a such that F G = {a}. The point a is computable as the unique solution of the k equations A A a = Ad A c. (Note A A is a k k non-singular matrix.) Exercise.24: Is the point a necessarily a feasible point of either the slack-varible form of our primal linear programming problem or of the slack-varible form of our dual linear programming problem? That is, does a necessarily hold? Now, A(x d) = A((x a) (d a)) = A(x a) since A(d a) =, (because a F.) And A (s+c) = A ((s a) ( c a)) = A (s a) since A ( c a) =, (because a G.) Moreover, the k-covector ˆx that maximizes c T x subject to A(x a) = and x also minimizes a T x subject to the same constraints. This is because c T x = (c a+a) T (x d+d) = a T x+(c+a) T (x d)+(c+a) T d, and A(x d) = and A (c+a) = implies x d and c+a are perpendicular, so (c a) T (x d) =. Thus c T x = a T x + (c + a) T d, so the covector ˆx that yields a constrained minimum of a T x is the same as the covector that yields a constrained maximum of c T x. In the same way, the k-covector ŝ that minimizes d T s subject to the constraints A (s + c) = A (s a) = and s also minimizes a T s subject to the same constraints. Exercise.25: Show that d T s = a T s + (d a) T a. Thus the above transformations produce the modified primal problem: determine x (R k ) to minimize a T x subject to A(x a) = and x.

13 And the modified dual problem: determine s (R k ) to minimize a T s subject to A (s a) = and s. Suppose that one of our modified primal and modified dual problems has a finite optimal point. Then the strong-duality theorem implies that both have finite solutions. Note if a, both our modified primal and modified dual problems have a as a feasible point. Let x and s denote feasible points of our modified primal and modified dual problems respectively. We have x F O + k and s G O + k. This means x d F d and s+c G+c, so (x d, s+c) = = xt (s+c) d T (s+c). And then b T y = d T (c + s) = x T (c + s) c T x since x and s. This is the weak-duality theorem. Exercise.26: Assume there is a covector y (R m ) such that A T y c. Given s (R k ) such that A (s a) = and s, explain how to compute the corresponding feasible point y such that A T y c. Hint: the affine transformation that maps Q into G is given by v A T v c Solution.26: Recall A a = A c. Thus we have A (s + c) =, so s + c rowspace(a), i.e., there exists a unique m-covector y such that A T y = s + c. (Note rtnullspace(a ) = rowspace(a).) Thus y = (A T ) + (s + c). The k equations A T y = s + c are overdetermined, but s is defined exactly so that these equations have a (unique) solution. And since s, A T y = s + c ensures that A T y c. The modified primal and modified dual problems obtained above have a geometrical interpretation where the solution to our modified primal problem, if it exists, is an (R k ) -covector ˆx such that ˆx is an extreme point of the (k m)-dimensional degenerate polyhedron F O + k lying in the intersection of some distinguished collection of m (and possibly more) of the hyperplanes defining the boundary of O + k, while the solution to our modified dual problem, if it exists, is an (Rk ) -covector ŝ such that ŝ is an extreme point of the m-dimensional degenerate polyhedron G O + k lying in the intersection of the other k m hyperplanes defining the boundary of O + k, and possibly more such O+ k -hyperplanes as well. And the connection between duality and orthogonality is clearly seen since the flats F and G are orthogonal. Exercise.27: Show that (ˆx a,ŝ a) =. If we start at the point a common to F and G, and move in F to the extreme point ˆx and also move in G to the extreme point ŝ, then as we move, the distance between the two moving points gets steadily larger until we reach the semi-corners of the boundary of O + k where the optimal points ˆx and ŝ are found. Note that the motion of these two points need not be consistent with the two gradient directions obtained from the common gradient direction a projected in F and in G; this is because a need not lie in O + k, and if so, both points will move from outside O+ k to the boundary of O + k. Exercise.28: Under what conditions does the angle between these two moving points also grow steadily larger, or at least not diminish, as we move until the maximum possible separating distance is attained at the two optimal points? Exercise.29: Let α, β R + and let p, q, a R 2 with p and q and p q. Let C denote the cone cone (p, q). Show that when a C ( C), the full -angle in, 2π) between

REFERENCES 14 the two vectors αp + a and βq + a increases as α and β increase, unless a =. (We need to admit full-angles greater than π for this to be generally true for a C.) References Pra2 Manoj Prabhakaran. Visualizing LP duality. Princeton CS Dept., 22.