Comparing Convex Relaxations for Quadratically Constrained Quadratic Programming Kurt M. Anstreicher Dept. of Management Sciences University of Iowa European Workshop on MINLP, Marseille, April 2010
The QCQP problem

Consider a quadratically constrained quadratic program:

(QCQP)  z* = min  f_0(x)
        s.t.  f_i(x) ≤ d_i,  i = 1, ..., q
              x ≥ 0, Ax ≤ b,

where f_i(x) = x^T Q_i x + c_i^T x, i = 0, 1, ..., q, each Q_i is an n×n symmetric matrix, and A is an m×n matrix. Let F = {x ≥ 0 : Ax ≤ b}; assume throughout that F is bounded.

If Q_i ⪰ 0 for each i, QCQP is a convex programming problem that can be solved in polynomial time, but in general the problem is NP-hard. QCQP is a fundamental global optimization problem.
Two convexifications of QCQP

A common approach to obtaining a lower bound on z* is to somehow convexify the problem. We consider two different approaches.

For the first, let f̂_i(·) be the convex lower envelope of f_i(·) on F,

  f̂_i(x) = max{v^T x + t : v^T x̂ + t ≤ f_i(x̂) ∀ x̂ ∈ F}.

Let Q̂CQP be the problem where f̂_i(·) replaces f_i(·), i = 0, ..., q, and let ẑ be the solution value in Q̂CQP.
The second approach to convexifying QCQP is based on linearizing the problem by adding additional variables. Let X denote a symmetric n×n matrix. Then QCQP can be written

(QCQP)  z* = min  Q_0 • X + c_0^T x
        s.t.  Q_i • X + c_i^T x ≤ d_i,  i = 1, ..., q
              x ≥ 0, Ax ≤ b, X = xx^T.

A convexification of QCQP can then be given in terms of the set

  C = Conv{ (1; x)(1; x)^T : x ∈ F },

where (1; x) denotes the (n+1)-vector obtained by prepending 1 to x. Let Q̃CQP denote the problem where the constraint X = xx^T is replaced by

  Y(x, X) := [ 1   x^T ]
             [ x   X   ]  ∈ C.
We now have two convexifications:

(Q̂CQP)  ẑ = min  f̂_0(x)
         s.t.  f̂_i(x) ≤ d_i,  i = 1, ..., q
               x ≥ 0, Ax ≤ b,

(Q̃CQP)  z̃ = min  Q_0 • X + c_0^T x
         s.t.  Q_i • X + c_i^T x ≤ d_i,  i = 1, ..., q
               Y(x, X) ∈ C.

Claim: ẑ ≤ z̃.

To prove the claim, we must relate the different convexifications used to construct Q̂CQP and Q̃CQP.
Theorem 1. For x ∈ F, let f(x) = x^T Q x + c^T x, and let f̂(·) be the convex lower envelope of f(·) on F. Then

  f̂(x) = min{Q • X + c^T x : Y(x, X) ∈ C}.

Proof: For x ∈ F, let g(x) = min{Q • X + c^T x : Y(x, X) ∈ C}. We must show that f̂(x) = g(x). First we show that g(·) is a convex function with g(x) ≤ f(x), x ∈ F, implying that g(x) ≤ f̂(x).

Assume that for i ∈ {1, 2}, x^i ∈ F and g(x^i) = Q • X^i + c^T x^i, where Y(x^i, X^i) ∈ C. For 0 ≤ λ ≤ 1, let x(λ) = λx^1 + (1−λ)x^2, X(λ) = λX^1 + (1−λ)X^2. Then Y(x(λ), X(λ)) = λY(x^1, X^1) + (1−λ)Y(x^2, X^2) ∈ C, since C is convex. It follows that

  g(x(λ)) ≤ Q • X(λ) + c^T x(λ) = λg(x^1) + (1−λ)g(x^2),

proving that g(·) is convex on F. That g(x) ≤ f(x) follows immediately from Y(x, xx^T) ∈ C and Q • xx^T + c^T x = f(x).
It remains to show that f̂(x) ≤ g(x). Assume that g(x) = Q • X + c^T x, where Y(x, X) ∈ C. From the definition of C, there exist x^i ∈ F and λ_i ≥ 0, i = 1, ..., k, with Σ_{i=1}^k λ_i = 1, such that Σ_{i=1}^k λ_i x^i = x and Σ_{i=1}^k λ_i x^i (x^i)^T = X. It follows that

  g(x) = Q • X + c^T x
       = Q • Σ_{i=1}^k λ_i x^i (x^i)^T + c^T Σ_{i=1}^k λ_i x^i
       = Σ_{i=1}^k λ_i f(x^i).

But f̂(·) is convex on F, and f̂(x) ≤ f(x) for all x ∈ F, so

  f̂(x) = f̂(Σ_{i=1}^k λ_i x^i) ≤ Σ_{i=1}^k λ_i f̂(x^i) ≤ Σ_{i=1}^k λ_i f(x^i) = g(x). ∎
The claimed relationship between Q̂CQP and Q̃CQP is an immediate consequence of Theorem 1. In particular, using Theorem 1, Q̂CQP could be rewritten in the form

(Q̂CQP)  ẑ = min  Q_0 • X_0 + c_0^T x
         s.t.  Q_i • X_i + c_i^T x ≤ d_i,  i = 1, ..., q
               Y(x, X_i) ∈ C,  i = 0, 1, ..., q,

so that Q̃CQP corresponds to Q̂CQP with the added constraints X_0 = X_1 = ... = X_q.

Corollary 1. Let ẑ and z̃ denote the solution values in the convex relaxations Q̂CQP and Q̃CQP, respectively. Then ẑ ≤ z̃.
Note that both the problem of computing an exact convex lower envelope f̂(·) and the problem of characterizing C may be intractable. However, C can be represented using the cone of completely positive matrices. Define

  Y⁺(x, X) = [ 1      x^T         s(x)^T  ]
             [ x      X           Z(x, X) ]
             [ s(x)   Z(x, X)^T   S(x, X) ],

where

  s(x) = b − Ax,
  S(x, X) = bb^T − Axb^T − bx^T A^T + AXA^T,
  Z(x, X) = xb^T − XA^T.

The matrices S(x, X) and Z(x, X) relax s(x)s(x)^T and x s(x)^T, respectively. Then (Burer 2009)

  C = {Y(x, X) : Y⁺(x, X) ∈ CP_{m+n+1}},

where CP_k is the cone of k×k completely positive matrices.
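As a numerical sanity check (an added sketch, not part of the original slides; all data and variable names are illustrative), the snippet below assembles Y⁺(x, X) blockwise for the rank-one case X = xx^T and confirms that it collapses to yy^T with y = (1, x, s(x)) ≥ 0, which is completely positive by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.uniform(0.0, 1.0, (m, n))
x = rng.uniform(0.0, 0.2, n)
b = A @ x + rng.uniform(0.1, 1.0, m)   # chosen so that s(x) = b - Ax > 0

X = np.outer(x, x)                     # rank-one case X = x x^T
s = b - A @ x
S = np.outer(b, b) - A @ np.outer(x, b) - np.outer(b, x) @ A.T + A @ X @ A.T
Z = np.outer(x, b) - X @ A.T

# Assemble Y+(x, X) blockwise.
Yp = np.block([[np.ones((1, 1)), x[None, :], s[None, :]],
               [x[:, None],      X,          Z],
               [s[:, None],      Z.T,        S]])

# For X = x x^T, Y+ is the rank-one matrix y y^T with y = (1, x, s(x)) >= 0,
# hence completely positive.
y = np.concatenate(([1.0], x, s))
assert np.allclose(Yp, np.outer(y, y)) and np.all(y >= 0)
print("Y+(x, xx^T) = y y^T with y >= 0")
```

Since C is the convex hull of these rank-one matrices, every extreme point of C lifts to a completely positive Y⁺, consistent with Burer's characterization.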
The distinction between Q̂CQP and Q̃CQP is already sharp for m = n = q = 1. Consider

  min  x_1^2
  s.t. x_1^2 ≥ 1/2
       0 ≤ x_1 ≤ 1.

Written in the form of QCQP, x_1^2 ≥ 1/2 is −x_1^2 ≤ −1/2, and the convex lower envelope of −x_1^2 on [0, 1] is −x_1. The relaxation Q̂CQP is then

  min  x_1^2
  s.t. x_1 ≥ 1/2
       0 ≤ x_1 ≤ 1.

Its solution value is ẑ = 1/4. The solution value for Q̃CQP is z̃ = z* = 1/2. For x_1 = 1/2, Y(x_1, x_11) ∈ C for x_11 ∈ [1/4, 1/2]. The solution of Q̂CQP corresponds to using x_1 = 1/2 along with x_11 = 1/4 for the objective, and x_11 = 1/2 for the single nonlinear constraint.
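The gap ẑ = 1/4 < z̃ = 1/2 can be reproduced by brute force. The sketch below (illustrative, not from the talk) grid-searches both relaxations, using the fact that for this one-dimensional example C = {(x_1, x_11) : x_1^2 ≤ x_11 ≤ x_1}.

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 100001)

# QCQP-hat: min x1^2 s.t. x1 >= 1/2, using the envelope -x1 of -x1^2 on [0,1].
z_hat = np.min(grid[grid >= 0.5] ** 2)

# QCQP-tilde: min x11 s.t. x11 >= 1/2 and (x1, x11) in C,
# where C = {(x1, x11) : x1^2 <= x11 <= x1}.
z_tilde = np.inf
for x1 in grid:
    lo, hi = x1 * x1, x1            # admissible x11 range at this x1
    if hi >= 0.5:                   # the constraint x11 >= 1/2 is attainable here
        z_tilde = min(z_tilde, max(lo, 0.5))

print(z_hat, z_tilde)               # approx 0.25 and 0.5
```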
Figure 1: The set C for the example
Two computable relaxations

For a quadratic function f(x) = x^T Q x + c^T x defined on F = {x : 0 ≤ x ≤ e}, the well-known αBB underestimator is

  f_α(x) = x^T (Q + Diag(α)) x + (c − α)^T x,

where α ∈ R^n_+ has Q + Diag(α) ⪰ 0. Since f_α(·) is convex, f_α(x) ≤ f̂(x) for 0 ≤ x ≤ e. A further relaxation of Q̂CQP is then given by:

(QCQP_αBB)  z_αBB = min  x^T (Q_0 + Diag(α_0)) x + (c_0 − α_0)^T x
            s.t.  x^T (Q_i + Diag(α_i)) x + (c_i − α_i)^T x ≤ d_i,  i = 1, ..., q
                  0 ≤ x ≤ e,

where each α_i is chosen so that Q_i + Diag(α_i) ⪰ 0.
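A minimal numerical illustration (an added sketch; the uniform choice α_i = max(0, −λ_min(Q)) is just one valid way to satisfy Q + Diag(α) ⪰ 0): build f_α for a random indefinite Q and confirm that it underestimates f on the box.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2   # symmetric, typically indefinite
c = rng.standard_normal(n)

# Uniform shift alpha_i = max(0, -lambda_min(Q)) makes Q + Diag(alpha) PSD.
alpha = np.full(n, max(0.0, -np.linalg.eigvalsh(Q).min()))
Qa = Q + np.diag(alpha)
assert np.linalg.eigvalsh(Qa).min() >= -1e-9         # convexity certificate

f  = lambda x: x @ Q @ x + c @ x
fa = lambda x: x @ Qa @ x + (c - alpha) @ x

# The gap f(x) - f_alpha(x) = sum_i alpha_i x_i (1 - x_i) >= 0 on [0, e].
samples = rng.uniform(0.0, 1.0, (1000, n))
assert all(fa(x) <= f(x) + 1e-9 for x in samples)
print("alpha-BB underestimation verified on 1000 box samples")
```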
For F = {x : 0 ≤ x ≤ e}, there are a variety of known constraints that are valid for Y(x, X) ∈ C:

1. The constraints from the Reformulation-Linearization Technique (RLT):

   max{0, x_i + x_j − 1} ≤ x_ij ≤ min{x_i, x_j}.

2. The semidefinite programming (SDP) constraint Y(x, X) ⪰ 0.

3. Constraints on the off-diagonal components of Y(x, X) coming from the Boolean Quadric Polytope (BQP), for example the triangle inequalities: for i < j < k,

   x_i + x_j + x_k ≤ x_ij + x_ik + x_jk + 1,
   x_ij + x_ik ≤ x_i + x_jk,
   x_ij + x_jk ≤ x_j + x_ik,
   x_ik + x_jk ≤ x_k + x_ij.
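Since the RLT and triangle inequalities are linear in (x, X), checking them on the rank-one generators X = xx^T, x ∈ F, of C verifies them on all of C. A small sampling check (an added sketch with invented data):

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
n = 4
for _ in range(500):
    x = rng.uniform(0.0, 1.0, n)
    X = np.outer(x, x)                     # a generator of C
    for i, j in combinations(range(n), 2):
        # RLT: max{0, x_i + x_j - 1} <= x_ij <= min{x_i, x_j}
        assert max(0.0, x[i] + x[j] - 1.0) - 1e-9 <= X[i, j] <= min(x[i], x[j]) + 1e-9
    for i, j, k in combinations(range(n), 3):
        # BQP triangle inequalities
        assert x[i] + x[j] + x[k] <= X[i, j] + X[i, k] + X[j, k] + 1.0 + 1e-9
        assert X[i, j] + X[i, k] <= x[i] + X[j, k] + 1e-9
        assert X[i, j] + X[j, k] <= x[j] + X[i, k] + 1e-9
        assert X[i, k] + X[j, k] <= x[k] + X[i, j] + 1e-9
print("RLT and triangle inequalities hold at 500 sampled generators of C")
```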
Consider the relaxation that imposes Y(x, X) ⪰ 0 together with the diagonal RLT constraints diag(X) ≤ x. Note that these conditions together imply the original bound constraints 0 ≤ x ≤ e. The result is:

(QCQP_SDP)  z_SDP = min  Q_0 • X + c_0^T x
            s.t.  Q_i • X + c_i^T x ≤ d_i,  i = 1, ..., q
                  Y(x, X) ⪰ 0, diag(X) ≤ x.

The goal is to relate QCQP_αBB and QCQP_SDP. The following theorem shows that there is a simple relationship between the convexifications used to construct these problems.
Theorem 2. For 0 ≤ x ≤ e, let f_α(x) = x^T (Q + Diag(α)) x + (c − α)^T x, where α ≥ 0 and Q + Diag(α) ⪰ 0. Assume that Y(x, X) ⪰ 0 and diag(X) ≤ x. Then f_α(x) ≤ Q • X + c^T x.

Proof: Let Q(α) = Q + Diag(α). Since Q(α) ⪰ 0,

  f_α(x) = (c − α)^T x + min{Q(α) • X : X ⪰ xx^T}
         = (c − α)^T x + min{Q(α) • X : X ⪰ xx^T, diag(X) ≤ x},

the last equality because diag(X) ≤ x holds automatically for X = xx^T, 0 ≤ x ≤ e. But then Y(x, X) ⪰ 0 and diag(X) ≤ x imply that

  f_α(x) ≤ Q(α) • X + (c − α)^T x = Q • X + c^T x + α^T (diag(X) − x) ≤ Q • X + c^T x. ∎
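A numerical check of Theorem 2 (an added sketch, not from the slides): the family X = xx^T + t·Diag(x − x∘x), t ∈ [0, 1], conveniently satisfies X ⪰ xx^T (equivalently Y(x, X) ⪰ 0, by the Schur complement) and diag(X) ≤ x on the box.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2
c = rng.standard_normal(n)
alpha = np.full(n, max(0.0, -np.linalg.eigvalsh(Q).min()))  # Q + Diag(alpha) PSD
Qa = Q + np.diag(alpha)

for _ in range(500):
    x = rng.uniform(0.0, 1.0, n)
    t = rng.uniform(0.0, 1.0)
    # X = xx^T + t Diag(x - x*x) has X - xx^T PSD and diag(X) <= x on [0, e].
    X = np.outer(x, x) + t * np.diag(x - x * x)
    assert np.all(np.diag(X) <= x + 1e-9)
    f_alpha = x @ Qa @ x + (c - alpha) @ x
    linearized = np.sum(Q * X) + c @ x      # Q . X + c^T x
    assert f_alpha <= linearized + 1e-9
print("f_alpha(x) <= Q.X + c^T x verified on 500 feasible samples")
```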
An immediate corollary proves a relationship between QCQP_αBB and QCQP_SDP first conjectured by Jeff Linderoth.

Corollary 2. Let z_αBB and z_SDP denote the solution values in the convex relaxations QCQP_αBB and QCQP_SDP, respectively. Then z_αBB ≤ z_SDP.
More general convexifications

Consider a quadratic function f(x) = x^T Q x + c^T x, and vectors v_j ∈ R^n, j = 1, ..., k. Assume that for x ∈ F, l_j ≤ v_j^T x ≤ u_j. It follows that (v_j^T x − l_j)(v_j^T x − u_j) ≤ 0, or

  (v_j^T x)^2 − (l_j + u_j) v_j^T x + l_j u_j ≤ 0.

For α ∈ R^k_+, define

  Q(α) = Q + Σ_{j=1}^k α_j v_j v_j^T,
  c(α) = c − Σ_{j=1}^k α_j (l_j + u_j) v_j,
  p(α) = Σ_{j=1}^k α_j l_j u_j,

and let f_α(x) = x^T Q(α) x + c(α)^T x + p(α).
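The construction can be sketched as follows (an added illustration with invented data; bounds l_j, u_j are computed for the box F = [0, 1]^n). Note that f_α ≤ f on F holds for any α ≥ 0; the choice of α additionally has to make Q(α) ⪰ 0 for f_α to be convex.

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 5, 3
Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2
c = rng.standard_normal(n)
V = rng.standard_normal((k, n))                 # rows are the v_j
# Valid bounds l_j <= v_j^T x <= u_j over the box F = [0, 1]^n:
l = np.minimum(V, 0).sum(axis=1)
u = np.maximum(V, 0).sum(axis=1)
alpha = rng.uniform(0.0, 1.0, k)

Qa = Q + sum(alpha[j] * np.outer(V[j], V[j]) for j in range(k))
ca = c - sum(alpha[j] * (l[j] + u[j]) * V[j] for j in range(k))
pa = float(np.dot(alpha, l * u))

f  = lambda x: x @ Q @ x + c @ x
fa = lambda x: x @ Qa @ x + ca @ x + pa

# f_alpha - f = sum_j alpha_j (v_j^T x - l_j)(v_j^T x - u_j) <= 0 on F.
samples = rng.uniform(0.0, 1.0, (1000, n))
assert all(fa(x) <= f(x) + 1e-9 for x in samples)
print("DC underestimation f_alpha <= f verified on 1000 box samples")
```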
If Q(α) ⪰ 0, f_α(·) is a convex underestimator for f(·) on F. Zheng, Sun and Li (2009) refer to functions of the form f_α(·) as D.C. underestimators, and apply them to convexify the objective in QCQP problems with only linear constraints (q = 0).

Note that the αBB underestimator on 0 ≤ x ≤ e corresponds to v_j = e_j, l_j = 0, u_j = 1, j = 1, ..., n. Additional possibilities for the v_j include:

- eigenvectors corresponding to negative eigenvalues of Q,
- transposed rows of the constraint matrix A.
Using underestimators of the form f_α(·), we obtain the convex relaxation

(QCQP_DC)  z_DC = min  x^T Q_0(α_0) x + c_0(α_0)^T x + p(α_0)
           s.t.  x^T Q_i(α_i) x + c_i(α_i)^T x + p(α_i) ≤ d_i,  i = 1, ..., q
                 x ≥ 0, Ax ≤ b,

where each α_i ∈ R^k_+ is chosen so that Q_i(α_i) ⪰ 0.

We will compare QCQP_DC to a relaxation of QCQP that combines the semidefiniteness condition Y(x, X) ⪰ 0 with the RLT constraints on (x, X) that can be obtained from the original linear constraints x ≥ 0, Ax ≤ b.
The RLT constraints can be described very succinctly using the matrix Y⁺(x, X); in fact these constraints correspond exactly to X ≥ 0, S(x, X) ≥ 0, Z(x, X) ≥ 0. Therefore the RLT constraints and Y(x, X) ⪰ 0 together are equivalent to Y⁺(x, X) being a doubly nonnegative matrix. We define the relaxation

(QCQP_DNN)  z_DNN = min  Q_0 • X + c_0^T x
            s.t.  Q_i • X + c_i^T x ≤ d_i,  i = 1, ..., q
                  Y⁺(x, X) ∈ DNN_{m+n+1},

where DNN_k is the cone of k×k doubly nonnegative matrices.

Note that the relaxation QCQP_DNN is entirely determined by the original data of the problem QCQP; in particular, QCQP_DNN does not involve the vectors v_j and bounds (l_j, u_j) used to construct the convexifications in QCQP_DC.
To compare QCQP_DC and QCQP_DNN we require a generalization of Theorem 2 that applies to the more general convexification f_α(·). This result naturally involves the RLT constraints

  v_j^T X v_j − (l_j + u_j) v_j^T x + l_j u_j ≤ 0,  j = 1, ..., k,   (1)

that are obtained from l_j ≤ v_j^T x ≤ u_j, j = 1, ..., k.

Theorem 3. For x ∈ F, let f_α(x) = x^T Q(α) x + c(α)^T x + p(α), where α ≥ 0 and Q(α) ⪰ 0. Assume that Y(x, X) ⪰ 0 and (x, X) satisfy (1). Then f_α(x) ≤ Q • X + c^T x.

Proof: The proof is very similar to that of Theorem 2. ∎
Theorem 4. Let z_DC and z_DNN denote the solution values in the convex relaxations QCQP_DC and QCQP_DNN, respectively. Then z_DC ≤ z_DNN.

Proof: Consider the convex relaxation

  z_V = min  Q_0 • X + c_0^T x
        s.t.  Q_i • X + c_i^T x ≤ d_i,  i = 1, ..., q
              v_j^T X v_j − (l_j + u_j) v_j^T x + l_j u_j ≤ 0,  j = 1, ..., k,
              x ≥ 0, Ax ≤ b, Y(x, X) ⪰ 0.

By Theorem 3 we immediately have z_DC ≤ z_V. However, the constraints l_j ≤ v_j^T x ≤ u_j are implied by the original constraints x ≥ 0, Ax ≤ b, and therefore (Sherali and Adams 1998, Proposition 8.2) the constraints (1) are implied by the RLT constraints X ≥ 0, S(x, X) ≥ 0, Z(x, X) ≥ 0. It follows that z_V ≤ z_DNN. ∎
Conclusion

The approach of approximating C has substantial advantages over the common methodology of approximating convex lower envelopes, provided the solver can handle the constraint Y(x, X) ⪰ 0.