Sum of Squares Relaxations for Polynomial Semi-definite Programming

C.W.J. Hol, C.W. Scherer
Delft University of Technology, Delft Center for Systems and Control (DCSC)
Mekelweg 2, 2628 CD Delft, The Netherlands
c.w.j.hol@dcsc.tudelft.nl

Abstract

We present an extension of scalar polynomial optimization by sum-of-squares decompositions [5] to optimization problems with a scalar polynomial objective and polynomial semi-definite constraints. We show that the values of these relaxations converge to the optimal value under a constraint qualification. Although this convergence property is well known for polynomial problems with scalar constraints [5], to the best of our knowledge this result is new for matrix-valued inequalities. (We are aware of the parallel independent work of Kojima [6], which presents the same result with a different proof.) The result allows for a systematic improvement of LMI relaxations of non-convex polynomial semi-definite programming problems with guaranteed reduction of the relaxation gap to zero. This can be applied to various very hard control problems that can be written as polynomial semi-definite programs (SDPs), such as static or fixed-order controller synthesis. We present a direct and compact description of the resulting linear SDPs with full flexibility in the choice of the underlying monomial basis. The method is illustrated with a non-convex quadratic SDP problem and with $H_2$-optimal static controller synthesis.

1 Introduction

Recent improvements of semi-definite programming solvers and developments in polynomial optimization have resulted in a large increase in research activity on applications of so-called sum-of-squares (SOS) techniques in control. In this approach non-convex polynomial optimization problems are approximated by a family of convex problems that are relaxations of the original problem [1, 5]. These relaxations are based on decompositions of certain polynomials into a sum of squares.
Using a theorem of Putinar [9] it can be shown (under suitable constraint qualifications) that the optimal values of these relaxed problems converge to the optimal value of the original problem. These relaxation schemes have recently been applied to various non-convex problems in control, such as Lyapunov stability of nonlinear dynamic systems [7, 2] and robust stability analysis [4]. Many problems in control, including some very hard non-convex problems, can be formulated as semi-definite polynomial problems. An example is the static or fixed-order $H_2$-synthesis problem, which can be written as a non-convex semi-definite polynomial optimization problem.

The research of this author is sponsored by Philips CFT. The research of this author is supported by the Technology Foundation STW, applied science division of NWO, and the technology programme of the Ministry of Economic Affairs.
In this paper we present an extension of scalar polynomial optimization by SOS decompositions [5] to optimization problems with a scalar polynomial objective and nonlinear semi-definite constraints. We show that the values of these relaxations converge to the optimal value under a constraint qualification. Although this convergence property is well known for polynomial problems with scalar constraints [5], to the best of our knowledge this result is, except for the independent work of Kojima [6], new for matrix-valued inequalities. In Section 2 we show that the optimal value of a suitably constructed sequence of matrix sum-of-squares relaxations converges to the optimal value of the original polynomial semi-definite optimization problem. In Section 3 we present a direct and compact description of the resulting linear semi-definite programs (SDPs) with full flexibility in the choice of the underlying monomial basis. We describe how these sum-of-squares relaxations can be solved as linear matrix inequality (LMI) problems and give their sizes in Section 4. Two examples are presented in Section 5: the first is a non-convex quadratic SDP and the second an $H_2$-optimal static controller synthesis problem.

2 A direct polynomial SDP approach

In this section we present an extension of scalar polynomial optimization by SOS decompositions [5] to optimization problems with a scalar polynomial objective and nonlinear semi-definite constraints. We formulate the relaxations in terms of Lagrange duality with SOS polynomials as multipliers, which seems somewhat more straightforward than the corresponding dual formulation based on the problem of moments [5].

2.1 Polynomial semi-definite programming

For $x \in \mathbb{R}^n$ let $f(x)$ and $G(x)$ denote scalar and symmetric-matrix-valued polynomials in $x$, where $G$ maps into $\mathbb{S}^m$, the set of symmetric $m \times m$ matrices.
Consider the following polynomial semi-definite optimization problem with optimal value $d_{\mathrm{opt}}$:
$$\text{infimize } f(x) \ \text{ subject to } \ G(x) \preceq 0. \tag{1}$$
With any matrix $S \succeq 0$, the value $\inf_{x \in \mathbb{R}^n} f(x) + \langle S, G(x)\rangle$ is a lower bound for $d_{\mathrm{opt}}$ by standard weak duality. However, not even the maximization of this lower bound over $S \succeq 0$ makes it possible to close the duality gap, due to the non-convexity of the problem. This is the reason for considering, instead, Lagrange multiplier matrices $S(x)$ which are globally positive semi-definite polynomial functions of $x$, i.e. polynomials satisfying $S(x) \succeq 0$ for all $x \in \mathbb{R}^n$. Still $\inf_{x \in \mathbb{R}^n} f(x) + \langle S(x), G(x)\rangle$ defines a lower bound of $d_{\mathrm{opt}}$, and the best lower bound achievable in this fashion is given by the supremal $t$ for which there exists a globally positive semi-definite polynomial matrix $S$ such that $f(x) + \langle S(x), G(x)\rangle - t > 0$ for all $x \in \mathbb{R}^n$. In order to render the determination of this lower bound computational we introduce the following concept. A symmetric $m \times m$ polynomial matrix $S(x)$ is said to be a (matrix) sum-of-squares if there exists a (not necessarily square and typically tall) polynomial matrix $T(x)$ such that $S(x) = T(x)^T T(x)$.
If $T_j(x)$, $j = 1,\dots,q$, denote the rows of $T(x)$, we infer $S(x) = \sum_{j=1}^q T_j(x)^T T_j(x)$. If $S(x)$ is a scalar then the $T_j(x)$ are scalars, which implies $S(x) = \sum_{j=1}^q T_j(x)^2$. This motivates our terminology, since we are dealing with a generalization of classical scalar SOS representations. Very similar to the scalar case, every SOS matrix is globally positive semi-definite, but the converse is not necessarily true. Let us now just replace all inequalities in the above-derived program for the lower bound computation by the requirement that the corresponding polynomials or polynomial matrices are SOS. This leads to the following optimization problem:
$$\text{supremize } t \ \text{ subject to } \ S(x) \text{ and } f(x) + \langle S(x), G(x)\rangle - t \text{ are SOS.} \tag{2}$$
When fixing upper bounds on the degree of the SOS matrix $S(x)$, the value of this problem can be computed by solving a standard linear SDP, as will be seen in Section 3. In this fashion one can construct a family of LMI relaxations for computing increasingly improving lower bounds. Under a suitable constraint qualification, due to Putinar for scalar problems, it is possible to prove that the value of (2) actually equals $d_{\mathrm{opt}}$. To the best of our knowledge, the generalization to matrix-valued problems as formulated in the following result has, except for the recent independent work of Kojima [6], not been presented anywhere else in the literature.

Theorem 1 Let the following constraint qualification hold true: there exist some $r > 0$ and some SOS matrix $R(x)$ such that
$$r - \|x\|^2 + \langle R(x), G(x)\rangle \ \text{is SOS.} \tag{3}$$
Then the optimal value of (2) equals $d_{\mathrm{opt}}$.

Proof. The value of (2) is not larger than $d_{\mathrm{opt}}$. Since this is trivial for $d_{\mathrm{opt}} = \infty$, we assume that $G(x) \preceq 0$ is feasible. Choose any $\epsilon > 0$ and some $\hat x$ with $G(\hat x) \preceq 0$ and $f(\hat x) \leq d_{\mathrm{opt}} + \epsilon$. Let us now suppose that $S(x)$ and $f(x) + \langle S(x), G(x)\rangle - t$ are SOS. Then
$$d_{\mathrm{opt}} + \epsilon - t \geq f(\hat x) - t \geq f(\hat x) + \langle S(\hat x), G(\hat x)\rangle - t \geq 0$$
and thus $d_{\mathrm{opt}} + \epsilon \geq t$. Since $\epsilon$ was arbitrary we infer $d_{\mathrm{opt}} \geq t$.
To prove the converse we first show that, due to the constraint qualification, we can replace $G(x)$ by $\hat G(x) = \mathrm{diag}(G(x), \|x\|^2 - r)$ in both (1) and (2) without changing their values. Indeed, if $G(x) \preceq 0$ we infer from (3) that $r - \|x\|^2 \geq r - \|x\|^2 + \langle R(x), G(x)\rangle \geq 0$. Therefore the extra constraint $\|x\|^2 - r \leq 0$ is redundant for (1). We show redundancy for (2) in two steps. If $S(x)$ and $f(x) - t + \langle S(x), G(x)\rangle$ are SOS we can define the SOS matrix $\hat S(x) = \mathrm{diag}(S(x), 0)$ to conclude that $f(x) - t + \langle \hat S(x), \hat G(x)\rangle$ is SOS (since it just equals $f(x) - t + \langle S(x), G(x)\rangle$). Conversely, suppose that $\hat S(x) = \hat T(x)^T \hat T(x)$ and $\hat t(x)^T \hat t(x) = f(x) - t + \langle \hat S(x), \hat G(x)\rangle$ are SOS. Partition $\hat T(x) = (T(x)\ u(x))$ according to the columns of $\hat G(x)$. With the SOS polynomial $v(x)^T v(x) = r - \|x\|^2 + \langle R(x), G(x)\rangle$ we infer
$$\hat t(x)^T \hat t(x) = f(x) - t + \langle T(x)^T T(x), G(x)\rangle + u(x)^T u(x)\,(\|x\|^2 - r)$$
$$= f(x) - t + \langle T(x)^T T(x), G(x)\rangle + u(x)^T u(x)\,\bigl(\langle R(x), G(x)\rangle - v(x)^T v(x)\bigr)$$
$$= f(x) - t + \langle T(x)^T T(x) + u(x)^T u(x)\,R(x),\ G(x)\rangle - u(x)^T u(x)\, v(x)^T v(x).$$
With $R(x) = R_f(x)^T R_f(x)$ we now observe that
$$S(x) := T(x)^T T(x) + u(x)^T u(x)\,R(x) = \begin{pmatrix} T(x) \\ u(x) \otimes R_f(x) \end{pmatrix}^T \begin{pmatrix} T(x) \\ u(x) \otimes R_f(x) \end{pmatrix}$$
and
$$s(x) := \hat t(x)^T \hat t(x) + u(x)^T u(x)\, v(x)^T v(x) = \begin{pmatrix} \hat t(x) \\ u(x) \otimes v(x) \end{pmatrix}^T \begin{pmatrix} \hat t(x) \\ u(x) \otimes v(x) \end{pmatrix}$$
are SOS. Due to $f(x) - t + \langle S(x), G(x)\rangle = s(x)$ the claim is proved. Hence from now on we can assume without loss of generality that there exists a standard unit vector $v_1$ with
$$v_1^T G(x) v_1 = \|x\|^2 - r. \tag{4}$$
Let us now choose a sequence of unit vectors $v_2, v_3, \dots$ such that $\{v_i\}_{i=1,2,\dots}$ is dense in the Euclidean unit sphere, and consider the family of scalar polynomial optimization problems
$$\text{infimize } f(x) \ \text{ subject to } \ v_i^T G(x) v_i \leq 0, \quad i = 1,\dots,N, \tag{5}$$
with optimal values $d_N$. Since any $x$ with $G(x) \preceq 0$ is feasible for (5), we infer $d_N \leq d_{\mathrm{opt}}$. Moreover it is clear that $d_N \leq d_{N+1}$, which implies $d_N \to d_0 \leq d_{\mathrm{opt}}$ for $N \to \infty$. Let us prove that $d_0 = d_{\mathrm{opt}}$. Due to (4) the feasible set of (5) is contained in $\{x \in \mathbb{R}^n : \|x\|^2 \leq r\}$ and hence compact. Therefore there exists an optimal solution $x_N$ of (5), and we can choose a subsequence $N_\nu$ with $x_{N_\nu} \to x_0$. Hence $d_0 = \lim_\nu d_{N_\nu} = \lim_\nu f(x_{N_\nu}) = f(x_0)$. Then $d_0 = d_{\mathrm{opt}}$ follows if we can show that $G(x_0) \preceq 0$. Otherwise there exists a unit vector $v$ with $\epsilon := v^T G(x_0) v > 0$. By convergence there exists some $K$ with $\|G(x_{N_\nu})\| \leq K$ for all $\nu$. By density there exists a sufficiently large $\nu$ such that $K\|v_i - v\|^2 + 2K\|v_i - v\| < \epsilon/2$ for some $i \in \{1,\dots,N_\nu\}$. We can take $\nu$ with $v^T G(x_{N_\nu}) v \geq \epsilon/2$ and arrive at
$$0 \geq v_i^T G(x_{N_\nu}) v_i = (v_i - v)^T G(x_{N_\nu})(v_i - v) + 2\,v^T G(x_{N_\nu})(v_i - v) + v^T G(x_{N_\nu}) v$$
$$\geq -K\|v_i - v\|^2 - 2K\|v_i - v\| + \epsilon/2 > 0,$$
a contradiction. Let us finally fix any $\epsilon > 0$ and choose $N$ with $d_N \geq d_{\mathrm{opt}} - \epsilon/2$. This implies $f(x) - d_{\mathrm{opt}} + \epsilon > 0$ for all $x$ with $v_i^T G(x) v_i \leq 0$ for $i = 1,\dots,N$. Due to (4) we can apply Putinar's scalar representation result [9] to infer that there exist polynomials $t_i(x)$ for which
$$f(x) - d_{\mathrm{opt}} + \epsilon + \sum_{i=1}^N t_i(x)^T t_i(x)\, v_i^T G(x) v_i \ \text{is SOS.} \tag{6}$$
With the SOS matrix
$$S_N(x) := \sum_{i=1}^N v_i\, t_i(x)^T t_i(x)\, v_i^T = \begin{pmatrix} t_1(x) v_1^T \\ \vdots \\ t_N(x) v_N^T \end{pmatrix}^T \begin{pmatrix} t_1(x) v_1^T \\ \vdots \\ t_N(x) v_N^T \end{pmatrix}$$
we conclude that $f(x) - d_{\mathrm{opt}} + \epsilon + \langle S_N(x), G(x)\rangle$ equals (6) and is thus SOS.
This implies that the optimal value of (2) is at least $d_{\mathrm{opt}} - \epsilon$, and since $\epsilon > 0$ was arbitrary the proof is finished.

Theorem 1 is a natural extension of a theorem of Putinar [9] for scalar polynomial problems to polynomial SDPs. Indeed, Lasserre's approach [5] for minimizing $f(x)$ over scalar polynomial constraints $g_i(x) \leq 0$, $i = 1,\dots,m$, is recovered with $G(x) = \mathrm{diag}(g_1(x),\dots,g_m(x))$.
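The diagonal embedding that recovers the scalar case can be checked numerically: stacking scalar constraints $g_i(x) \leq 0$ into $G(x) = \mathrm{diag}(g_1(x),\dots,g_m(x))$ gives $G(x) \preceq 0$ exactly when every $g_i(x) \leq 0$. A minimal numpy sketch (the polynomials $g_1, g_2$ and the test points are illustrative choices, not taken from the paper):

```python
import numpy as np

# Two illustrative scalar constraints g1(x) <= 0, g2(x) <= 0.
def g1(x): return x[0]**2 + x[1]**2 - 1.0   # inside the unit disk
def g2(x): return x[0] - 0.5                # a half-plane

def G(x):
    # Diagonal matrix-valued constraint G(x) = diag(g1(x), g2(x)).
    return np.diag([g1(x), g2(x)])

def is_neg_semidef(M, tol=1e-12):
    # G(x) <= 0 (in the semi-definite sense) iff its largest eigenvalue is <= 0.
    return np.max(np.linalg.eigvalsh(M)) <= tol

x_feas = np.array([0.2, 0.3])    # satisfies both scalar constraints
x_infeas = np.array([0.9, 0.0])  # violates g2

assert is_neg_semidef(G(x_feas)) == all(g(x_feas) <= 0 for g in (g1, g2))
assert is_neg_semidef(G(x_infeas)) == all(g(x_infeas) <= 0 for g in (g1, g2))
```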
Moreover the constraint qualification in Theorem 1 is a natural generalization of that used by Schweighofer [11].

Remark. It is a direct consequence of Theorem 1 that, as in the scalar case [10], the constraint qualification (3) can be equivalently formulated as follows: there exist an SOS matrix $R(x)$ and an SOS polynomial $s(x)$ such that $\{x \in \mathbb{R}^n : \langle R(x), G(x)\rangle - s(x) \leq 0\}$ is compact.

3 Verification of matrix SOS property

Let us now discuss how to construct a linear SDP representation of (2) if restricting the search of the SOS matrix $S(x)$ to an arbitrary subspace of polynomial matrices. The suggested description allows for complete flexibility in the choice of the corresponding monomial basis with a direct and compact description of the resulting linear SDP, even for problems that involve SOS matrices. Moreover it forms the basis for trying to reduce the relaxation sizes for specific problem instances. For all these purposes let us choose a polynomial vector $u(x) = \mathrm{col}(u_1(x),\dots,u_{n_u}(x))$ whose components $u_j(x)$ are pairwise different $x$-monomials. Then $S(x)$ of dimension $m \times m$ is said to be SOS with respect to the monomial basis $u(x)$ if there exist real matrices $T_j$, $j = 1,\dots,n_u$, such that
$$S(x) = T(x)^T T(x) \quad\text{with}\quad T(x) = \sum_{j=1}^{n_u} T_j\, u_j(x) = \sum_{j=1}^{n_u} T_j\,(u_j(x) I_m).$$
If $U = (T_1 \ \cdots \ T_{n_u})$ and if $P$ denotes the permutation that guarantees $u(x) \otimes I_m = P\,[I_m \otimes u(x)]$, we infer with $W = (UP)^T(UP) \succeq 0$ that
$$S(x) = [I_m \otimes u(x)]^T\, W\, [I_m \otimes u(x)]. \tag{7}$$
In order to render this relation more explicit let us continue with the following simple concepts. If $M \in \mathbb{R}^{nm \times nm}$ is partitioned into $n \times n$ blocks as $(M_{jk})_{j,k=1,\dots,m}$, define
$$\mathrm{Trace}_m(M) = \begin{pmatrix} \mathrm{Tr}(M_{11}) & \cdots & \mathrm{Tr}(M_{1m}) \\ \vdots & \ddots & \vdots \\ \mathrm{Tr}(M_{m1}) & \cdots & \mathrm{Tr}(M_{mm}) \end{pmatrix}$$
as well as, for real matrices $A$, $B$ with $nm$ columns and an equal number of rows, the bilinear mapping $\langle A, B\rangle_m = \mathrm{Trace}_m(A^T B) \in \mathbb{R}^{m \times m}$. One then easily verifies that
$$[I_m \otimes u(x)]^T\, W\, [I_m \otimes u(x)] = \langle W,\ I_m \otimes u(x)u(x)^T\rangle_m.$$
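The last identity is easy to verify numerically; the following numpy sketch checks it for a random symmetric $W$ and a random monomial-value vector $u$ (the sizes $m = 2$, $n_u = 3$ are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_u = 2, 3  # G(x) is m x m, the monomial basis u(x) has n_u entries

def trace_m(M, m, n):
    """Block trace Trace_m: M in R^{nm x nm} viewed as an m x m array of
    n x n blocks; returns the m x m matrix of block traces."""
    return np.array([[np.trace(M[j*n:(j+1)*n, k*n:(k+1)*n])
                      for k in range(m)] for j in range(m)])

# A symmetric Gram candidate W and the values u(x) at some point x.
W = rng.standard_normal((m*n_u, m*n_u))
W = W + W.T
u = rng.standard_normal(n_u)

Iu = np.kron(np.eye(m), u.reshape(-1, 1))   # I_m (x) u(x), size (m n_u) x m
lhs = Iu.T @ W @ Iu                          # [I_m (x) u]^T W [I_m (x) u]
rhs = trace_m(W @ np.kron(np.eye(m), np.outer(u, u)), m, n_u)

assert np.allclose(lhs, rhs)
```

Each $(j,k)$ entry of both sides equals $u^T W_{jk} u$ for the $(j,k)$ block $W_{jk}$ of $W$, which is why the identity holds.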
If we denote the pairwise different monomials in $u(x)u(x)^T$ by $w_j(x)$, $j = 1,\dots,n_w$, and if we determine the unique symmetric $Z_j$ with $u(x)u(x)^T = \sum_{j=1}^{n_w} Z_j w_j(x)$, we can conclude that
$$S(x) = \sum_{j=1}^{n_w} \langle W,\ I_m \otimes Z_j\rangle_m\, w_j(x). \tag{8}$$
This proves one direction of the complete characterization of $S(x)$ being SOS with respect to $u(x)$, to be considered as a flexible generalization of the Gram-matrix method to polynomial matrices.

Lemma 2 The matrix polynomial $S(x)$ is SOS with respect to the monomial basis $u(x)$ if and only if there exist symmetric $S_j$ such that $S(x) = \sum_{j=1}^{n_w} S_j w_j(x)$, and the linear system
$$\langle W,\ I_m \otimes Z_j\rangle_m = S_j, \quad j = 1,\dots,n_w, \tag{9}$$
has a solution $W \succeq 0$.

Proof. If $W \succeq 0$ satisfies (9) we can determine a Cholesky factorization of $PWP^T$ as $U^T U$ to obtain $W = (UP)^T(UP)$ and reverse the arguments.

4 Construction of LMI relaxation families

The constraint in (2) is equivalent to the existence of SOS polynomials $S(x)$ and $s(x)$ such that
$$f(x) + \langle S(x), G(x)\rangle - t = s(x) \quad\text{for all } x \in \mathbb{R}^n. \tag{10}$$
With a monomial vector $v(x) = (v_1(x)\ v_2(x)\ \dots\ v_{n_v}(x))^T$ let us represent the constraint function as
$$G(x) = \sum_{i=1}^{n_v} B_i\, v_i(x) = B\,(I_m \otimes v(x)),$$
where $B_i \in \mathbb{S}^m$, $i = 1,\dots,n_v$, and $B := (B_1\ B_2\ \dots\ B_{n_v})$. Moreover, let us choose monomial vectors $u(x)$ and $y(x)$ of length $n_u$ and $n_y$ to parameterize the SOS polynomials $S(x)$ and $s(x)$ with respect to $u(x)$ and $y(x)$ with $W \succeq 0$ and $V \succeq 0$ respectively, as in Section 3. We infer
$$\langle S(x), G(x)\rangle = \mathrm{Tr}(S(x)G(x)) = \mathrm{Tr}\bigl(\langle W,\ I_m \otimes [u(x)u(x)^T]\rangle_m\, [B(I_m \otimes v(x))]\bigr)$$
$$= \mathrm{Tr}\bigl(\langle W,\ (I_m \otimes [u(x)u(x)^T])\,([B(I_m \otimes v(x))] \otimes I_{n_u})\rangle_m\bigr)$$
$$= \mathrm{Tr}\bigl(\langle W,\ (B \otimes I_{n_u})\,(I_m \otimes [v(x) \otimes u(x)u(x)^T])\rangle_m\bigr).$$
Let us now choose the pairwise different monomials $w_0(x) = 1, w_1(x), \dots, w_{n_w}(x)$ to allow for the representations
$$v(x) \otimes u(x)u(x)^T = \sum_{j=0}^{n_w} P_j w_j(x), \qquad y(x)y(x)^T = \sum_{j=0}^{n_w} Q_j w_j(x), \qquad f(x) = \sum_{j=0}^{n_w} a_j w_j(x)$$
with $P_j \in \mathbb{R}^{(n_u n_v) \times n_u}$, $Q_j \in \mathbb{R}^{n_y \times n_y}$, $a_j \in \mathbb{R}$, $j = 0,1,\dots,n_w$. Then there exist SOS polynomials $S(x)$ and $s(x)$ with respect to $u(x)$ and $y(x)$ respectively, such that (10) holds true, if and only if there exists a solution to the following LMI system:
$$W \succeq 0, \quad V \succeq 0, \tag{11}$$
$$a_0 + \mathrm{Tr}\bigl(\langle W,\ (B \otimes I_{n_u})(I_m \otimes P_0)\rangle_m\bigr) - t = \mathrm{Tr}(V Q_0), \tag{12}$$
$$a_j + \mathrm{Tr}\bigl(\langle W,\ (B \otimes I_{n_u})(I_m \otimes P_j)\rangle_m\bigr) = \mathrm{Tr}(V Q_j), \quad j = 1,\dots,n_w. \tag{13}$$
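The data entering Lemma 2 and the LMI system above, namely the distinct monomials $w_j$ and the symmetric coefficient matrices $Z_j$ with $u(x)u(x)^T = \sum_j Z_j w_j(x)$, can be generated mechanically once a monomial basis is fixed. A small numpy sketch (the basis $u(x) = (1, x_1, x_2)$ and the encoding of monomials by exponent tuples are illustrative choices, not the paper's implementation):

```python
import numpy as np
from itertools import product

# Monomial basis u(x) = (1, x1, x2), encoded by exponent tuples.
u_exps = [(0, 0), (1, 0), (0, 1)]
n_u = len(u_exps)

# Collect the distinct monomials w_j in u(x) u(x)^T and build Z_j with
# u(x) u(x)^T = sum_j Z_j w_j(x).
prod_exp = {}
for j, k in product(range(n_u), repeat=2):
    prod_exp[(j, k)] = tuple(a + b for a, b in zip(u_exps[j], u_exps[k]))
w_exps = sorted(set(prod_exp.values()))
Z = {e: np.zeros((n_u, n_u)) for e in w_exps}
for (j, k), e in prod_exp.items():
    Z[e][j, k] = 1.0   # symmetric by construction: (k, j) maps to the same e

def mono(x, e):
    # Evaluate the monomial x1^e1 * x2^e2 at the point x.
    return x[0]**e[0] * x[1]**e[1]

# Sanity check at a random point: u(x) u(x)^T == sum_j Z_j w_j(x).
x = np.array([0.7, -1.3])
u_val = np.array([mono(x, e) for e in u_exps])
lhs = np.outer(u_val, u_val)
rhs = sum(Z[e] * mono(x, e) for e in w_exps)
assert np.allclose(lhs, rhs)
```

The same exponent bookkeeping extends to the matrices $P_j$ and $Q_j$, which collect the coefficients of $v(x) \otimes u(x)u(x)^T$ and $y(x)y(x)^T$ with respect to the $w_j$.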
Table 1: Lower bounds and optimal values of the quadratic SDP problem

                  alpha_1   alpha_2   alpha_3
  lower bound     0.5063    0.5071    0.3997
  optimal value   0.5097    0.5100    0.4100

We can hence easily supremize $t$ over these LMI constraints to determine a lower bound on the optimal value of (1). Moreover these lower bounds are guaranteed to converge to the optimal value of (1) if we choose $u(x)$ and $y(x)$ to comprise all monomials up to a certain degree, and if we let the degree bound grow to infinity. The size of the LMI relaxation for (1) is easily determined as follows. The constraints are (11), (12) and (13). The conditions in (11) that the matrices $W$ and $V$ be positive semi-definite are inequalities in $\mathbb{S}^{mn_u}$ and $\mathbb{S}^{n_y}$ respectively. On top of that, (12) and (13) add $1$ and $n_w$ scalar equality constraints respectively. The decision variables in the LMI relaxation are the lower bound $t$ and the matrices for the SOS representation, $W \in \mathbb{S}^{mn_u}$ and $V \in \mathbb{S}^{n_y}$. Since a symmetric matrix in $\mathbb{S}^n$ can be parameterized by a vector in $\mathbb{R}^{\frac12 n(n+1)}$, we end up with a total of $1 + \frac12 mn_u(mn_u+1) + \frac12 n_y(n_y+1)$ scalar variables in our LMI problem.

5 Applications

5.1 A quadratic SDP

We computed lower bounds for the SDP problem
$$\begin{aligned}
\text{minimize}\quad & -x_1 x_2 + \alpha\,\bigl(2/5 - (x_1)^2\bigr)^2 - 3x_2\\
\text{subject to}\quad & G_1(x) := \begin{pmatrix} -1 + x_2^2 & 2 + 3x_1 \\ 2 + 3x_1 & -1 + x_1(x_1+1) + x_2 \end{pmatrix} \preceq 0,\\
& g_2(x) := 0.5 - (x_1 - 0.4)^2 - (x_2 - 0.2)^2 \leq 0,\\
& g_3(x) := x_1^2 - 1 \leq 0, \qquad g_4(x) := x_2^2 - 1 \leq 0,
\end{aligned}$$
for three values of $\alpha$: $\alpha_1 = 0.8$, $\alpha_2 = 1.5$, $\alpha_3 = 0$. Here $G_1(x) \preceq 0$ is a matrix-valued constraint, and $g_2(x) \leq 0$ is chosen such that its feasible region is non-convex. The constraints $g_3(x) \leq 0$ and $g_4(x) \leq 0$ are just added to restrict the decision variables $x_1$ and $x_2$ to the interval $(-1, 1)$. For $\alpha = \alpha_1$ and $\alpha = \alpha_2$ the optimal solution lies at a point where the constraint $g_2(x) \leq 0$ with negative curvature is active, as shown in Figure 1. For $\alpha = \alpha_3$ the optimal solution lies at a point where $G_1(x) \preceq 0$ and $g_2(x) \leq 0$ are both active.
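The variable count derived above is straightforward to encode; the helper below (a hypothetical convenience function, not part of the paper's implementation) returns the number of scalar LMI variables $1 + \tfrac12 mn_u(mn_u+1) + \tfrac12 n_y(n_y+1)$:

```python
def n_lmi_variables(m, n_u, n_y):
    """Scalar decision variables of the relaxation: the bound t, plus the
    free entries of the symmetric matrices W in S^{m*n_u} and V in S^{n_y}."""
    mnu = m * n_u
    return 1 + mnu * (mnu + 1) // 2 + n_y * (n_y + 1) // 2

# Example: an m = 2 matrix constraint with n_u = 3 and n_y = 6
# gives 1 + 21 + 21 = 43 scalar variables.
assert n_lmi_variables(2, 3, 6) == 43
```

Note that actual implementations may exploit additional structure (e.g. separate multipliers per constraint block), so reported variable counts can differ from this raw formula.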
With SOS bases $u(x) = (1\ \ x_1\ \ x_2)^T$ and $y(x) = (1\ \ x_1\ \ x_2\ \ x_1x_2\ \ x_1^2\ \ x_2^2)^T$ we computed lower bounds for the three values of $\alpha$ as shown in Table 1. The number of variables and constraints in our implementation of the LMI relaxation is 41 and 87 respectively. By gridding we have found optimal solutions as shown in Figure 1; the corresponding optimal values are shown in Table 1. From the table we observe that in this example the algorithm finds the global optimal value with only a first-order SOS basis, even though the non-convex constraint is active. Since $G_1(x) \preceq 0$ is equivalent to $\det(G_1(x)) \geq 0$ and $G_1(x)(1,1) \leq 0$, where $G_1(x)(1,1)$ denotes the left upper element of $G_1(x)$, we can reduce this SDP problem to a scalar polynomial problem, which can be solved with the relaxation techniques for scalar polynomial optimization [5]. Indeed, the code GloptiPoly [3] gives the same results as in Table 1 with 27 LMI variables and 252 LMI constraints. We suspect however that for problems with a matrix-valued
polynomial $G(x)$ of large size, the polynomial $\det(G(x))$ will have high polynomial degree, such that the resulting LMI relaxations will be (much) larger in terms of decision variables and constraints than in our approach, since all monomials that occur in $\det(G(x))$ must be included in the monomial vector. Our future research is aimed at obtaining numerical evidence for this conjecture.

[Figure 1: Feasible region (grey filled area) and optimal solutions for $\alpha_i$, $i \in \{1, 2, 3\}$.]

5.2 Static $H_2$ controller synthesis

Static $H_2$ controller synthesis is a non-convex problem that is important for the practical implementation of controllers. Consider the following state-space system of a plant, where only the closed-loop $A_{cl}$-matrix depends affinely on the static controller matrix $K \in \mathbb{R}^{m_2 \times p_2}$:
$$\begin{pmatrix} A_{cl}(K) & B_{cl} \\ C_{cl} & D_{cl} \end{pmatrix} := \begin{pmatrix} A + B_2 K C_2 & B_1 \\ C_1 & 0 \end{pmatrix},$$
where $A \in \mathbb{R}^{n \times n}$, $B_1 \in \mathbb{R}^{n \times m_1}$, $B_2 \in \mathbb{R}^{n \times m_2}$, $C_1 \in \mathbb{R}^{p_1 \times n}$ and $C_2 \in \mathbb{R}^{p_2 \times n}$. The problem of finding the static controller with optimal closed-loop $H_2$-norm can be written as follows:
$$\text{minimize } \mathrm{Tr}(C_{cl} X C_{cl}^T) \ \text{ subject to } \ A_{cl}(K)X + XA_{cl}(K)^T + B_{cl}B_{cl}^T = 0, \quad X \succeq 0. \tag{14}$$
This is a semi-definite polynomial problem, which is non-convex due to the bilinear coupling of the variables $X$ and $K$. We computed lower bounds for randomly generated 4th-order plants with $n = 4$, $m_1 = 2$, $m_2 = 1$, $p_1 = 3$, $p_2 = 1$, and computed upper bounds by gridding, as shown in Table 2. To keep the size of the LMI problems small, we used the very simple SOS bases
$$u(x) = \begin{pmatrix} 1 & k_1 \end{pmatrix}^T, \qquad y(x) = \begin{pmatrix} 1 & k_1 & \mathrm{svec}(X)^T & k_1\,\mathrm{svec}(X)^T \end{pmatrix}^T,$$
where $\mathrm{svec}(X)$ denotes the symmetric vectorization of the symmetric matrix $X$. The number of decision variables in the LMI is 469. Table 2 reveals that the lower bound is equal to the upper bound for 6 out of 11 cases, such that the relaxation gap is zero. This indicates that small SOS bases are often sufficient to obtain exact relaxations.
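For a fixed stabilizing gain $K$, the objective in (14) is the squared closed-loop $H_2$ norm and can be evaluated by solving the Lyapunov equation directly. The sketch below does this in plain numpy via the standard vectorization identity; the plant data and gain are illustrative, not the randomly generated plants of the paper:

```python
import numpy as np

def h2_norm_sq(A, B, C):
    """Squared H2 norm Tr(C X C^T), where X solves the Lyapunov equation
    A X + X A^T + B B^T = 0 (A assumed Hurwitz). Solved by vectorization:
    (I (x) A + A (x) I) vec(X) = -vec(B B^T)."""
    n = A.shape[0]
    L = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    X = np.linalg.solve(L, -(B @ B.T).reshape(-1)).reshape(n, n)
    return np.trace(C @ X @ C.T)

# Illustrative data (not from the paper): n = 2, m1 = m2 = p1 = p2 = 1.
A  = np.array([[0.0, 1.0], [-2.0, -3.0]])
B1 = np.array([[0.0], [1.0]])
B2 = np.array([[0.0], [1.0]])
C1 = np.array([[1.0, 0.0]])
C2 = np.array([[1.0, 0.0]])
K  = np.array([[-1.0]])          # static gain; A + B2 K C2 must be Hurwitz

A_cl = A + B2 @ K @ C2
val = h2_norm_sq(A_cl, B1, C1)   # squared closed-loop H2 norm for this K
```

Evaluating this quantity over a grid of gains $K$ is one way to produce the upper bounds reported in Table 2.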
Table 2: Upper bounds, lower bounds and full-order (FO) $H_2$ performance for randomly generated 4th-order systems

  Upper bound   Lower bound   FO performance
  6.7391        5.7813        4.6051
  11.056        11.055        9.7174
  14.376        12.435        11.459
  8.7786        7.4020        7.3551
  7.9459        6.9811        2.6839
  8.878         8.8777        7.6834
  16.142        16.144        15.388
  10.56         10.383        8.1015
  5.0222        5.0222        2.1391
  44.073        44.076        42.675
  12.025        12.018        9.5985

Furthermore the lower bounds are in all cases larger than the trivial lower bound given by the full-order performance, shown in the third column. It is not yet clear whether there is a fundamental reason for the lower bounds being no worse than the full-order performance for this choice of bases.

6 Conclusions

We have shown that there exist sequences of SOS relaxations whose optimal values converge from below to the optimal value of polynomial SDPs. Furthermore we have discussed how these relaxations can be reformulated as LMI optimization problems with full flexibility in the choice of monomial bases. We have applied the method to two non-convex problems: an academic polynomial SDP problem and the fixed-order $H_2$ synthesis problem. The first example illustrated that the number of LMI constraints in our relaxation is smaller than in the relaxation obtained after scalarization of the matrix-valued constraint $G(x) \preceq 0$ using the principal minors. This difference in computational complexity is probably even larger for constraints on matrix-valued polynomials with many rows and columns. The $H_2$-synthesis example illustrated the applicability to non-convex control problems. We have computed good lower bounds with remarkably simple monomial bases. Apart from these applications, the presented convergence result is of value for a variety of other matrix-valued optimization problems.
Additional examples in control are input-output selection, where integer constraints of the type $p \in \{0, 1\}$ are replaced by a quadratic constraint $p(p-1) = 0$, and spectral factorization of multidimensional transfer functions to assess dissipativity of linear shift-invariant distributed systems [8].

References

[1] G. Chesi, A. Garulli, A. Tesi, and A. Vicino. An LMI-based approach for characterizing the solution set of polynomial systems. In Proc. 39th IEEE Conf. Decision and Control, Sydney, Australia, 2000.

[2] G. Chesi, A. Garulli, A. Tesi, and A. Vicino. Homogeneous Lyapunov functions for systems with structured uncertainties. Preprint, 2002.
[3] D. Henrion and J.B. Lasserre. Detecting global optimality and extracting solutions in GloptiPoly. Technical report, LAAS-CNRS, 2003.

[4] D. Henrion, M. Sebek, and V. Kucera. Positive polynomials and robust stabilization with fixed-order controllers. IEEE Transactions on Automatic Control, 48:1178–1186, 2003.

[5] J.B. Lasserre. Global optimization with polynomials and the problem of moments. SIAM Journal on Optimization, 11:796–817, 2001.

[6] M. Kojima. Sums of squares relaxations of polynomial semidefinite programs. Technical report, Tokyo Institute of Technology, 2003.

[7] P.A. Parrilo. Structured Semidefinite Programs and Semialgebraic Geometry Methods in Robustness and Optimization. PhD thesis, California Institute of Technology, 2000.

[8] H. Pillai and J.C. Willems. Lossless and dissipative distributed systems. SIAM J. Control Optim., 40(5):1406–1430, 2002.

[9] M. Putinar. Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J., 42:969–984, 1993.

[10] K. Schmüdgen. The K-moment problem for compact semi-algebraic sets. Math. Ann., 289(2):203–206, 1991.

[11] M. Schweighofer. Optimization of polynomials on compact semialgebraic sets. Preprint, 2003.