LOWER BOUNDS FOR POLYNOMIALS USING GEOMETRIC PROGRAMMING


MEHDI GHASEMI AND MURRAY MARSHALL

Abstract. We make use of a result of Hurwitz and Reznick [9] [21], and a consequence of this result due to Fidalgo and Kovacec [6], to establish, in Theorem 2.3, a new sufficient condition for a polynomial $f \in \mathbb{R}[X_1,\dots,X_n]$ of even degree to be a sum of squares. Theorem 2.3 yields as special cases the results of Ghasemi and Marshall in [7] and, consequently, also those of Fidalgo and Kovacec [6] and Lasserre [12]. We apply Theorem 2.3 to obtain a new lower bound $f_{\mathrm{gp}}$ for $f$, and we explain how $f_{\mathrm{gp}}$ can be computed using geometric programming. The lower bound $f_{\mathrm{gp}}$ is generally not as good as the lower bound $f_{\mathrm{sos}}$ introduced by Lasserre [11] and Parrilo and Sturmfels [17], which can be computed using semidefinite programming, but a run time comparison shows that, in practice, the computation of $f_{\mathrm{gp}}$ is faster, and larger problems can be handled. The computation is simplest when the highest degree component of $f$ has the form $\sum_{i=1}^n a_i X_i^{2d}$, $a_i > 0$, $i = 1,\dots,n$. The lower bounds for $f$ established in [7] are obtained by evaluating the objective function of the geometric program at appropriate feasible points.

1. Introduction

Fix a non-constant polynomial $f \in \mathbb{R}[X] = \mathbb{R}[X_1,\dots,X_n]$, where $n \ge 1$ is an integer, and let $f_*$ be the global minimum of $f$, defined by $f_* := \inf\{f(a) : a \in \mathbb{R}^n\}$. We say $f$ is positive semidefinite (PSD) if $f(a) \ge 0$ for all $a \in \mathbb{R}^n$. Clearly
$\inf\{f(a) : a \in \mathbb{R}^n\} = \sup\{r \in \mathbb{R} : f - r \text{ is PSD}\}$,
so finding $f_*$ reduces to determining when $f - r$ is PSD. Suppose that $\deg(f) = m$ and decompose $f$ as $f = f_0 + \dots + f_m$, where $f_i$ is a form with $\deg(f_i) = i$, $i = 0,\dots,m$. This decomposition is called the homogeneous decomposition of $f$. A necessary condition for $f_* \ne -\infty$ is that $f_m$ is PSD (hence $m$ is even). A form $g \in \mathbb{R}[X]$ is said to be positive definite (PD) if $g(a) > 0$ for all $a \in \mathbb{R}^n$, $a \ne 0$. A sufficient condition for $f_* \ne -\infty$ is that $f_m$ is PD; see [13, Theorems 5.1 and 5.3] for more general results of this sort.
It is known that deciding when a polynomial is PSD is NP-hard [1, Theorem 1.1]. Deciding when a polynomial is a sum of squares (SOS) is much easier. Actually,

1991 Mathematics Subject Classification. Primary 12D15; Secondary 14P99, 90C25.
Key words and phrases. Positive polynomials, sums of squares, optimization, geometric programming.

there is a polynomial time method, known as semidefinite programming (SDP), which can be used to decide when a polynomial $f \in \mathbb{R}[X]$ is SOS [11] [17]. Note that any SOS polynomial is obviously PSD, so it is natural to ask if the converse is true, i.e., is every PSD polynomial SOS? This question first appeared in Minkowski's thesis, and he guessed that in general the answer is NO. Later, in [8], Hilbert gave a complete answer to this question; see [2, Theorem 6.3.7]. Let us denote the cone of PSD forms of degree $2d$ in $n$ variables by $P_{2d,n}$ and the cone of SOS forms of degree $2d$ in $n$ variables by $\Sigma_{2d,n}$. Hilbert proved that $P_{2d,n} = \Sigma_{2d,n}$ if and only if ($n \le 2$) or ($d = 1$) or ($n = 3$ and $d = 2$). Let $\sum \mathbb{R}[X]^2$ denote the cone of all SOS polynomials in $\mathbb{R}[X]$ and, for $f \in \mathbb{R}[X]$, define $f_{\mathrm{sos}} := \sup\{r \in \mathbb{R} : f - r \in \sum \mathbb{R}[X]^2\}$. Since SOS implies PSD, $f_{\mathrm{sos}} \le f_*$. Moreover, if $f_{\mathrm{sos}} \ne -\infty$ then $f_{\mathrm{sos}}$ can be computed in polynomial time, as closely as desired, using SDP [11] [17]. However, in practice the computation of $f_{\mathrm{sos}}$ can only be carried out if the number of variables and the degree are relatively small.¹ We denote by $P_{2d,n}^{\circ}$ and $\Sigma_{2d,n}^{\circ}$ the interiors of $P_{2d,n}$ and $\Sigma_{2d,n}$ in the vector space of forms of degree $2d$ in $\mathbb{R}[X]$, equipped with the euclidean topology. A necessary condition for $f_{\mathrm{sos}} \ne -\infty$ is that $f_{2d} \in \Sigma_{2d,n}$. A sufficient condition for $f_{\mathrm{sos}} \ne -\infty$ is that $f_{2d} \in \Sigma_{2d,n}^{\circ}$ [15, Proposition 5.1].

In Section 2, we recall the Hurwitz-Reznick result (Theorem 2.1) and a corollary of the Hurwitz-Reznick result due to Fidalgo and Kovacec (Corollary 2.2). For the convenience of the reader we include proofs of these results. Using the latter result, we determine a new sufficient condition for a form $f$ of degree $2d$ to be SOS (Theorem 2.3). This new sufficient condition involves a lifting, i.e., the introduction of certain additional variables (the $a_{\alpha,i}$) which are required to satisfy certain constraints depending on the coefficients of $f$.
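The connection between SOS and SDP can be made concrete in a toy case: a polynomial is SOS exactly when it can be written as $z^T Q z$ for some positive semidefinite "Gram" matrix $Q$, where $z$ is a vector of monomials; SDP searches over all such $Q$. The sketch below is our own illustration (the example polynomial and the particular Gram matrix are our choices, not taken from the paper): it verifies by hand that $x^4 - x^2y^2 + y^4$ is SOS, checking positive definiteness of one Gram matrix via Sylvester's criterion.

```python
import random

# f(x, y) = x^4 - x^2*y^2 + y^4, monomial vector z = (x^2, x*y, y^2).
# Matching coefficients in f = z^T Q z forces Q[0][0] = Q[2][2] = 1,
# Q[0][1] = Q[1][2] = 0 and 2*Q[0][2] + Q[1][1] = -1; the choice
# Q[0][2] = -0.9, Q[1][1] = 0.8 turns out to be positive definite.
Q = [[1.0,  0.0, -0.9],
     [0.0,  0.8,  0.0],
     [-0.9, 0.0,  1.0]]

def leading_minors(M):
    """Leading principal minors, via cofactor expansion."""
    def det(A):
        if len(A) == 1:
            return A[0][0]
        return sum((-1) ** j * A[0][j]
                   * det([row[:j] + row[j + 1:] for row in A[1:]])
                   for j in range(len(A)))
    return [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]

# All leading minors positive => Q is positive definite (Sylvester's criterion).
assert all(m > 0 for m in leading_minors(Q))

def f(x, y):
    return x ** 4 - x ** 2 * y ** 2 + y ** 4

def gram_form(x, y):
    z = [x * x, x * y, y * y]
    return sum(Q[i][j] * z[i] * z[j] for i in range(3) for j in range(3))

# The identity f = z^T Q z holds at random points, so f is SOS.
random.seed(0)
for _ in range(100):
    x, y = random.uniform(-2, 2), random.uniform(-2, 2)
    assert abs(f(x, y) - gram_form(x, y)) < 1e-9
```

Finding a PSD choice of $Q$ by hand is exactly the step that SDP automates; the cost of that search is what limits $f_{\mathrm{sos}}$ to problems with few variables and low degree.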
We explain how Theorem 2.3 can be applied, by making appropriate choices for the lifted variables $a_{\alpha,i}$, to derive various concrete criteria, in terms of the coefficients, for a form to be SOS. These include results, and improvements of results, which were proved earlier by Lasserre [12, Theorem 3], Fidalgo and Kovacec [6, Theorem 4.3], and Ghasemi and Marshall [7, Section 2]. In Section 3, we use Theorem 2.3 to establish a new lower bound $f_{\mathrm{gp}}$ for $f_*$ which can be computed using geometric programming. Although the lower bound found by this method is typically not as good as the lower bound found using SDP, a practical comparison confirms that the computation is faster, and larger problems can be handled. In Section 4 we explain how our results in Section 3 imply and improve on the results in [7, Section 3]. We do this by showing that the lower bounds for $f_*$ established in [7] can be obtained by evaluating the objective function of the geometric program at appropriately chosen feasible points.

¹ As explained in [23], the situation is somewhat better if $f$ has structured sparsity.

We conclude from all of this that, with the lifting proposed in Theorem 2.3, one can get better lower bounds than those obtained in [7], at a computational cost which is much cheaper than that of the SDP-based method described in [11] and [17] (which also uses a (different) lifting).

In this paper we denote by $\mathbb{N}$ the set of nonnegative integers $\{0,1,2,\dots\}$. For $X = (X_1,\dots,X_n)$, $a = (a_1,\dots,a_n) \in \mathbb{R}^n$ and $\alpha = (\alpha_1,\dots,\alpha_n) \in \mathbb{N}^n$, define $X^\alpha := X_1^{\alpha_1} \cdots X_n^{\alpha_n}$, $|\alpha| := \alpha_1 + \dots + \alpha_n$ and $a^\alpha := a_1^{\alpha_1} \cdots a_n^{\alpha_n}$, with the convention $0^0 = 1$. Often we use $a$ to denote an element of $\mathbb{R}^n$. Clearly, using these notations, every polynomial $f \in \mathbb{R}[X]$ can be written as $f(X) = \sum_{\alpha \in \mathbb{N}^n} f_\alpha X^\alpha$, where $f_\alpha \in \mathbb{R}$ and $f_\alpha = 0$ except for finitely many $\alpha$.

Assume now that $f$ is nonconstant and has even degree $2d$. Let
$\Omega(f) = \{\alpha \in \mathbb{N}^n : f_\alpha \ne 0\} \setminus \{0, 2d\epsilon_1, \dots, 2d\epsilon_n\}$,
where $2d = \deg(f)$, $\epsilon_i = (\delta_{i1},\dots,\delta_{in})$, and $\delta_{ij} = 1$ if $i = j$, $\delta_{ij} = 0$ if $i \ne j$. We denote $f_0$ and $f_{2d\epsilon_i}$ by $f_0$ and $f_{2d,i}$ for short. Thus $f$ has the form
(1) $f = f_0 + \sum_{\alpha \in \Omega(f)} f_\alpha X^\alpha + \sum_{i=1}^n f_{2d,i} X_i^{2d}$.
Let
$\Delta(f) = \{\alpha \in \Omega(f) : f_\alpha X^\alpha \text{ is not a square in } \mathbb{R}[X]\} = \{\alpha \in \Omega(f) : \text{either } f_\alpha < 0 \text{ or } \alpha_i \text{ is odd for some } 1 \le i \le n \text{ (or both)}\}$.
Since the polynomial $f$ is usually fixed, we will often denote $\Omega(f)$ and $\Delta(f)$ just by $\Omega$ and $\Delta$ for short. Let $\tilde f(X,Y) = Y^{2d} f(\frac{X_1}{Y},\dots,\frac{X_n}{Y})$. From (1) it is clear that
$\tilde f(X,Y) = f_0 Y^{2d} + \sum_{\alpha \in \Omega} f_\alpha X^\alpha Y^{2d-|\alpha|} + \sum_{i=1}^n f_{2d,i} X_i^{2d}$
is a form of degree $2d$, called the homogenization of $f$. We have the following well-known result:

Proposition 1.1. $f$ is PSD if and only if $\tilde f$ is PSD. $f$ is SOS if and only if $\tilde f$ is SOS.

Proof. See [14, Proposition 1.2.4].

2. Sufficient conditions for a form to be SOS

We recall the following result, due to Hurwitz and Reznick.

Theorem 2.1 (Hurwitz-Reznick). Suppose $p(X) = \sum_{i=1}^n \alpha_i X_i^{2d} - 2d\, X_1^{\alpha_1} \cdots X_n^{\alpha_n}$, where $\alpha = (\alpha_1,\dots,\alpha_n) \in \mathbb{N}^n$, $|\alpha| = 2d$. Then $p$ is SOBS.
Here, SOBS is shorthand for a sum of binomial squares, i.e., a sum of squares of polynomials of the form $aX^\alpha - bX^\beta$. In his 1891 paper [9], Hurwitz uses symmetric polynomials in $X_1,\dots,X_{2d}$ to give an explicit representation of $X_1^{2d} + \dots + X_{2d}^{2d} - 2d\, X_1 \cdots X_{2d}$ as a sum of squares. Theorem 2.1 can be deduced from this representation. Theorem 2.1 can also be deduced

from results in [20] [21], especially from [21, Theorems 2.2 and 4.4]. Here is another, more direct proof.

Proof. By induction on $n$. If $n = 1$ then $p = 0$ and the result is clear. Assume now that $n \ge 2$. We can assume each $\alpha_i$ is strictly positive; otherwise, we reduce to a case with at most $n-1$ variables.

Case 1: Suppose that there exist $1 \le i_1, i_2 \le n$, $i_1 \ne i_2$, with $\alpha_{i_1} \le d$ and $\alpha_{i_2} \le d$. Decompose $\alpha = (\alpha_1,\dots,\alpha_n)$ as $\alpha = \beta + \gamma$ where $\beta, \gamma \in \mathbb{N}^n$, $\beta_{i_1} = 0$, $\gamma_{i_2} = 0$ and $|\beta| = |\gamma| = d$. Then
$(X^\beta - X^\gamma)^2 = X^{2\beta} - 2X^\beta X^\gamma + X^{2\gamma} = X^{2\beta} - 2X^\alpha + X^{2\gamma}$,
therefore,
$p(X) = \sum_{i=1}^n \alpha_i X_i^{2d} - 2d\, X^\alpha = \sum_{i=1}^n \alpha_i X_i^{2d} - d\,\big(X^{2\beta} + X^{2\gamma} - (X^\beta - X^\gamma)^2\big)$
$= \frac12\Big(\sum_{i=1}^n 2\beta_i X_i^{2d} - 2d\, X^{2\beta}\Big) + \frac12\Big(\sum_{i=1}^n 2\gamma_i X_i^{2d} - 2d\, X^{2\gamma}\Big) + d\,(X^\beta - X^\gamma)^2$.
Each term is SOBS, by the induction hypothesis (the first two forms involve at most $n-1$ variables, since $\beta_{i_1} = \gamma_{i_2} = 0$).

Case 2: Suppose we are not in Case 1, i.e., at most one $i$ satisfies $\alpha_i \le d$. Since $\alpha_1 + \dots + \alpha_n = 2d$ and each $\alpha_i$ is strictly positive, it follows that $n = 2$, so $p(X) = \alpha_1 X_1^{2d} + \alpha_2 X_2^{2d} - 2d\, X_1^{\alpha_1} X_2^{\alpha_2}$. We know $p \ge 0$ on $\mathbb{R}^2$, by the arithmetic-geometric inequality. Since $n = 2$ and $p$ is a PSD homogeneous polynomial, it follows that $p$ is SOS. Showing $p$ is SOBS requires more work. Denote by $\mathrm{AGI}(2,d)$ the set of all homogeneous polynomials of the form $p = \alpha_1 X_1^{2d} + \alpha_2 X_2^{2d} - 2d\, X_1^{\alpha_1} X_2^{\alpha_2}$, $\alpha_1, \alpha_2 \in \mathbb{N}$ and $\alpha_1 + \alpha_2 = 2d$. This set is finite. If $\alpha_1 = 0$ or $\alpha_1 = 2d$ then $p = 0$, which is trivially SOBS. If $\alpha_1 = \alpha_2 = d$ then $p(X) = d\,(X_1^d - X_2^d)^2$, which is also SOBS. Suppose now that $0 < \alpha_1 < 2d$, $\alpha_1 \ne d$ and $\alpha_1 > \alpha_2$ (the argument for $\alpha_1 < \alpha_2$ is similar). Decompose $\alpha = (\alpha_1, \alpha_2)$ as $\alpha = \beta + \gamma$, $\beta = (d, 0)$ and $\gamma = (\alpha_1 - d, \alpha_2)$. Expand $p$ as in the proof of Case 1 to obtain
$p(X) = \frac12\Big(\sum_{i=1}^2 2\beta_i X_i^{2d} - 2d\, X^{2\beta}\Big) + \frac12\Big(\sum_{i=1}^2 2\gamma_i X_i^{2d} - 2d\, X^{2\gamma}\Big) + d\,(X^\beta - X^\gamma)^2$.
Observe that $\sum_{i=1}^2 2\beta_i X_i^{2d} - 2d\, X^{2\beta} = 0$. Thus $p = \frac12 p_1 + d\,(X^\beta - X^\gamma)^2$, where $p_1 = \sum_{i=1}^2 2\gamma_i X_i^{2d} - 2d\, X^{2\gamma}$. If $p_1$ is SOBS then $p$ is also SOBS. If $p_1$ is not SOBS then we can repeat the argument to get $p_1 = \frac12 p_2 + d\,(X^{\beta'} - X^{\gamma'})^2$. Continuing in this way we get a sequence $p = p_0, p_1, p_2, \dots$, with each $p_i$ an element of the finite set $\mathrm{AGI}(2,d)$, so $p_i = p_j$ for some $i < j$.
Since $p_i = 2^{i-j} p_j + (\text{a sum of binomial squares})$ and $p_j = p_i$, this implies that $(1 - 2^{i-j})\, p_i$ is SOBS, and hence $p_i$ is SOBS, and therefore $p$ is SOBS. $\square$
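The splitting used in Case 1 is easy to check numerically. The instance below is our own choice of data, for illustration only: $2d = 6$, $\alpha = (3,2,1)$, with $\beta = (2,0,1)$, $\gamma = (1,2,0)$, so that $\beta_2 = \gamma_3 = 0$, $\beta + \gamma = \alpha$ and $|\beta| = |\gamma| = d = 3$. It verifies the identity $p = \frac12(\sum_i 2\beta_i X_i^{2d} - 2d X^{2\beta}) + \frac12(\sum_i 2\gamma_i X_i^{2d} - 2d X^{2\gamma}) + d(X^\beta - X^\gamma)^2$ at random points.

```python
import random

d, two_d = 3, 6
alpha = (3, 2, 1)                    # |alpha| = 2d = 6
beta, gamma = (2, 0, 1), (1, 2, 0)   # beta + gamma = alpha, |beta| = |gamma| = d

def mono(x, e):
    """x^e for a multi-exponent e."""
    out = 1.0
    for xi, ei in zip(x, e):
        out *= xi ** ei
    return out

def hr_form(x, e):
    """Hurwitz-Reznick form: sum_i e_i x_i^{2d} - 2d * x^e  (here |e| = 2d)."""
    return sum(ei * xi ** two_d for xi, ei in zip(x, e)) - two_d * mono(x, e)

random.seed(1)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(3)]
    lhs = hr_form(x, alpha)
    two_beta = tuple(2 * b for b in beta)
    two_gamma = tuple(2 * g for g in gamma)
    rhs = (0.5 * hr_form(x, two_beta)
           + 0.5 * hr_form(x, two_gamma)
           + d * (mono(x, beta) - mono(x, gamma)) ** 2)
    assert abs(lhs - rhs) < 1e-6
```

Note how the two half-terms on the right are again Hurwitz-Reznick forms, each missing one variable ($2\beta$ has a zero in slot 2, $2\gamma$ in slot 3), which is what drives the induction.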

In [6], Fidalgo and Kovacec prove the following result, which is a corollary of the Hurwitz-Reznick result.

Corollary 2.2 (Fidalgo-Kovacec). For a form $p(X) = \sum_{i=1}^n \beta_i X_i^{2d} - \mu X^\alpha$ such that $\alpha \in \mathbb{N}^n$, $|\alpha| = 2d$, $\beta_i \ge 0$ for $i = 1,\dots,n$, and $\mu \ge 0$ if all $\alpha_i$ are even, the following are equivalent:
(1) $p$ is PSD.
(2) $\mu \prod_{i=1}^n \alpha_i^{\alpha_i/2d} \le 2d \prod_{i=1}^n \beta_i^{\alpha_i/2d}$.
(3) $p$ is SOBS.
(4) $p$ is SOS.

Proof. See [6, Theorem 2.3]. (3) $\Rightarrow$ (4) and (4) $\Rightarrow$ (1) are trivial, so it suffices to show (1) $\Rightarrow$ (2) and (2) $\Rightarrow$ (3). If some $\alpha_i$ is odd then, making the change of variables $Y_i = -X_i$, $Y_j = X_j$ for $j \ne i$, $\mu$ gets replaced by $-\mu$. In this way, we can assume $\mu \ge 0$. If some $\alpha_i$ is zero, set $X_i = 0$ and proceed by induction on $n$. In this way, we can assume $\alpha_i > 0$, $i = 1,\dots,n$. If $\mu = 0$ the result is trivially true, so we can assume $\mu > 0$. If some $\beta_i$ is zero, then (2) fails. Setting $X_j = 1$ for $j \ne i$, and letting $X_i \to \infty$, we see that (1) also fails. Thus the claimed implications are trivially true in this case. Thus we can assume $\beta_i > 0$, $i = 1,\dots,n$.

(1) $\Rightarrow$ (2). Assume (1), so $p(x) \ge 0$ for all $x \in \mathbb{R}^n$. Taking
$x := \big( (\tfrac{\alpha_1}{\beta_1})^{1/2d}, \dots, (\tfrac{\alpha_n}{\beta_n})^{1/2d} \big)$,
we see that
$p(x) = \sum_{i=1}^n \alpha_i - \mu \prod_{i=1}^n \big(\tfrac{\alpha_i}{\beta_i}\big)^{\alpha_i/2d} = 2d - \mu \prod_{i=1}^n \big(\tfrac{\alpha_i}{\beta_i}\big)^{\alpha_i/2d} \ge 0$,
so $\mu \prod_{i=1}^n (\tfrac{\alpha_i}{\beta_i})^{\alpha_i/2d} \le 2d$. This proves (2).

(2) $\Rightarrow$ (3). Make the change of variables $X_i = (\tfrac{\alpha_i}{\beta_i})^{1/2d}\, Y_i$, $i = 1,\dots,n$. Let $\mu_1 := \tfrac{\mu}{2d} \prod_{i=1}^n (\tfrac{\alpha_i}{\beta_i})^{\alpha_i/2d}$, so, by (2), $\mu_1 \le 1$. Then
$p(X) = \sum_{i=1}^n \alpha_i Y_i^{2d} - 2d\,\mu_1 Y^\alpha = (1 - \mu_1) \sum_{i=1}^n \alpha_i Y_i^{2d} + \mu_1 \Big[ \sum_{i=1}^n \alpha_i Y_i^{2d} - 2d\, Y^\alpha \Big]$,
which is SOBS, by the Hurwitz-Reznick result. This proves (3). $\square$

Next, we prove our main new result of this section, which gives a new sufficient condition on the coefficients for a polynomial to be a sum of squares.

Theorem 2.3. Suppose $f$ is a form of degree $2d$. A sufficient condition for $f$ to be SOBS is that there exist nonnegative real numbers $a_{\alpha,i}$ for $\alpha \in \Delta$, $i = 1,\dots,n$ such that
(1) $(2d)^{2d}\, a_\alpha^\alpha = f_\alpha^{2d}\, \alpha^\alpha$ for each $\alpha \in \Delta$;
(2) $f_{2d,i} \ge \sum_{\alpha \in \Delta} a_{\alpha,i}$, $i = 1,\dots,n$.
Here, $a_\alpha := (a_{\alpha,1},\dots,a_{\alpha,n})$, $a_\alpha^\alpha := a_{\alpha,1}^{\alpha_1} \cdots a_{\alpha,n}^{\alpha_n}$ and $\alpha^\alpha := \alpha_1^{\alpha_1} \cdots \alpha_n^{\alpha_n}$.

Proof. Suppose that such real numbers exist. Then condition (1) together with Corollary 2.2 implies that $\sum_{i=1}^n a_{\alpha,i} X_i^{2d} + f_\alpha X^\alpha$ is SOBS for each $\alpha \in \Delta$, so $\sum_{i=1}^n \big( \sum_{\alpha \in \Delta} a_{\alpha,i} \big) X_i^{2d} + \sum_{\alpha \in \Delta} f_\alpha X^\alpha$ is SOBS. Combining with (2), it follows that $\sum_{i=1}^n f_{2d,i} X_i^{2d} + \sum_{\alpha \in \Delta} f_\alpha X^\alpha$ is SOBS. Since each $f_\alpha X^\alpha$ for $\alpha \in \Omega \setminus \Delta$ is a square, this implies $f(X)$ is SOBS. $\square$

Remark 2.4. (i) From condition (1) of Theorem 2.3 we see that $a_{\alpha,i} = 0 \Rightarrow \alpha_i = 0$. (ii) Let $a$ be an array of real numbers satisfying the conditions of Theorem 2.3, and define the array $a' = (a'_{\alpha,i})$ by $a'_{\alpha,i} = a_{\alpha,i}$ if $\alpha_i \ne 0$ and $a'_{\alpha,i} = 0$ if $\alpha_i = 0$. Then $a'$ also satisfies the conditions of Theorem 2.3. Thus we are free to require the converse condition $\alpha_i = 0 \Rightarrow a_{\alpha,i} = 0$ too, if we want.

We mention some corollaries of Theorem 2.3. Corollaries 2.5 and 2.6 were known earlier. Corollary 2.7 is an improved version of Corollary 2.6. Corollary 2.9 is a new result. Each corollary will be proved by applying Theorem 2.3 for a particular choice of the lifted variables $a_{\alpha,i}$.

Corollary 2.5 (see [12, Theorem 3] and [7, Corollary 2.2]). For any polynomial $f \in \mathbb{R}[X]$ of degree $2d$, if
(L1) $f_0 \ge \sum_{\alpha \in \Delta} |f_\alpha|\, \frac{2d - |\alpha|}{2d}$ and
(L2) $f_{2d,i} \ge \sum_{\alpha \in \Delta} |f_\alpha|\, \frac{\alpha_i}{2d}$, $i = 1,\dots,n$,
then $f$ is a sum of squares.

Proof. Apply Theorem 2.3 to the homogenization $\tilde f(X,Y)$ of $f$, taking $a_{\alpha,i} = |f_\alpha| \frac{\alpha_i}{2d}$, $i = 1,\dots,n$, and $a_{\alpha,Y} = |f_\alpha| \frac{2d - |\alpha|}{2d}$ for each $\alpha \in \Delta(f)$. For $\alpha \in \Delta(f)$, the corresponding element of $\Delta(\tilde f)$ is $\alpha' := (\alpha, 2d - |\alpha|)$, and
$(2d)^{2d}\, a_{\alpha'}^{\alpha'} = (2d)^{2d} \prod_{i=1}^n \Big( \frac{|f_\alpha|\, \alpha_i}{2d} \Big)^{\alpha_i} \cdot \Big( \frac{|f_\alpha|\, (2d - |\alpha|)}{2d} \Big)^{2d - |\alpha|} = f_\alpha^{2d}\, \alpha^\alpha\, (2d - |\alpha|)^{2d - |\alpha|} = f_\alpha^{2d}\, (\alpha')^{\alpha'}$.
This shows 2.3(1) holds for each $\alpha' \in \Delta(\tilde f)$ induced from $\alpha \in \Delta(f)$. (L1) and (L2) imply 2.3(2); therefore, by Theorem 2.3, $\tilde f$, and hence $f$, is SOBS. $\square$

Corollary 2.6 (see [6, Theorem 4.3] and [7, Theorem 2.3]). Suppose $f \in \mathbb{R}[X]$ is a form of degree $2d$ and
$\min_{i=1,\dots,n} f_{2d,i} \ge \frac{1}{2d} \sum_{\alpha \in \Delta} |f_\alpha|\, (\alpha^\alpha)^{1/2d}$.
Then $f$ is SOBS.

Proof. Apply Theorem 2.3 with $a_{\alpha,i} = \frac{1}{2d} |f_\alpha|\, (\alpha^\alpha)^{1/2d}$, $\alpha \in \Delta$, $i = 1,\dots,n$. $\square$
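For a concrete instance of Corollary 2.2 (our own illustration, not an example from the paper), take $n = 2$, $2d = 4$ and $p = \beta_1 X^4 + \beta_2 Y^4 - \mu X^2Y^2$, so $\alpha = (2,2)$. Condition (2) reads $\mu\,(2^2 2^2)^{1/4} \le 4\,(\beta_1^2 \beta_2^2)^{1/4}$, i.e. $\mu \le 2\sqrt{\beta_1\beta_2}$, the familiar AM-GM threshold. The check below confirms numerically that $p$ is nonnegative at the threshold and takes negative values just above it.

```python
import math

beta1, beta2 = 1.0, 4.0
threshold = 2 * math.sqrt(beta1 * beta2)   # condition (2): mu <= 4 here

def p(x, y, mu):
    return beta1 * x ** 4 + beta2 * y ** 4 - mu * x ** 2 * y ** 2

# By homogeneity it suffices to sample along y = 1 (plus the x-axis, where
# p >= 0 trivially) to probe PSD-ness.
xs = [i / 100.0 for i in range(-300, 301)]

min_at = min(p(x, 1.0, threshold) for x in xs)          # p(x,1) = (x^2 - 2)^2
min_above = min(p(x, 1.0, threshold + 0.2) for x in xs)  # dips below 0

assert min_at >= -1e-9   # PSD at the threshold mu = 2*sqrt(beta1*beta2)
assert min_above < 0     # not PSD just above it
```

At the threshold, $p = (X^2 - 2Y^2)^2$ is literally a binomial square, matching the equivalence (1) $\Leftrightarrow$ (3) of the corollary.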

Corollary 2.7. Suppose $f$ is a form of degree $2d$, $f_{2d,i} > 0$, $i = 1,\dots,n$, and
$\frac{1}{2d} \sum_{\alpha \in \Delta} |f_\alpha|\, (\alpha^\alpha)^{1/2d} \prod_{i=1}^n f_{2d,i}^{-\alpha_i/2d} \le 1$.
Then $f$ is SOBS.

Proof. Apply Theorem 2.3 with
$a_{\alpha,i} = \frac{1}{2d}\, |f_\alpha|\, (\alpha^\alpha)^{1/2d}\, \frac{f_{2d,i}}{\prod_{j=1}^n f_{2d,j}^{\alpha_j/2d}}$. $\square$

Remark 2.8. Corollary 2.7 is an improved version of Corollary 2.6. This requires some explanation. Suppose that $f_{2d,i} \ge \frac{1}{2d} \sum_{\alpha \in \Delta} |f_\alpha|\, (\alpha^\alpha)^{1/2d}$, $i = 1,\dots,n$, i.e., the hypothesis of Corollary 2.6 holds. Let $f_{2d,i_0} := \min\{f_{2d,i} : i = 1,\dots,n\}$. Then $\prod_{i=1}^n f_{2d,i}^{\alpha_i/2d} \ge \prod_{i=1}^n f_{2d,i_0}^{\alpha_i/2d} = f_{2d,i_0}$, so
$\frac{1}{2d} \sum_{\alpha \in \Delta} |f_\alpha|\, (\alpha^\alpha)^{1/2d} \prod_{i=1}^n f_{2d,i}^{-\alpha_i/2d} \le \frac{1}{2d} \sum_{\alpha \in \Delta} |f_\alpha|\, (\alpha^\alpha)^{1/2d}\, \frac{1}{f_{2d,i_0}} \le 1$,
i.e., the hypothesis of Corollary 2.7 holds.

We note yet another sufficient condition for SOS-ness.

Corollary 2.9. Let $f \in \mathbb{R}[X]$ be a form of degree $2d$. If
$f_{2d,i} \ge \sum_{\alpha \in \Delta,\ \alpha_i \ne 0} \Big( \frac{|f_\alpha|\, (\alpha^\alpha)^{1/2d}}{2d} \Big)^{2d/(\alpha_i n_\alpha)}$, $i = 1,\dots,n$,
then $f$ is SOBS. Here $n_\alpha := \#\{i : \alpha_i \ne 0\}$.

Proof. Apply Theorem 2.3 with
$a_{\alpha,i} = \Big( \frac{|f_\alpha|\, (\alpha^\alpha)^{1/2d}}{2d} \Big)^{2d/(\alpha_i n_\alpha)}$ if $\alpha_i \ne 0$, and $a_{\alpha,i} = 0$ if $\alpha_i = 0$. $\square$

The following example shows that the above corollaries are not as strong, either individually or collectively, as Theorem 2.3 itself.

Example 2.10. Let $f(X,Y,Z) = X^6 + Y^6 + Z^6 - 5X - 4Y - Z + 8$. Corollary 2.5 does not apply to $f$: (L1) translates into the false inequality $8 \ge \frac{25}{3}$, so (L1) fails. In a similar way, Corollaries 2.6, 2.7 and 2.9 do not apply to $\tilde f$, the homogenization of $f$, because the condition in Corollary 2.7 translates into the false inequality $1.126546 \le 1$ and the condition in Corollary 2.9 for $i = 4$ translates into the false inequality $8 \ge 10.108548$. We try to apply Theorem 2.3. Let $\alpha_1 = (1,0,0,5)$, $\alpha_2 = (0,1,0,5)$ and $\alpha_3 = (0,0,1,5)$; then $\Delta(\tilde f) = \{\alpha_1, \alpha_2, \alpha_3\}$.

Denoting $a_{\alpha_i,j}$ by $a_{ij}$, we have to find nonnegative reals $a_{11}, a_{22}, a_{33}, a_{14}, a_{24}, a_{34}$ such that the following hold:
$6^6\, a_{11}\, a_{14}^5 = 5^6 \cdot 5^5$, $\quad 1 \ge a_{11}$,
$6^6\, a_{22}\, a_{24}^5 = 4^6 \cdot 5^5$, $\quad 1 \ge a_{22}$,
$6^6\, a_{33}\, a_{34}^5 = 5^5$, $\quad 1 \ge a_{33}$,
$8 \ge a_{14} + a_{24} + a_{34}$.
Taking $a_{11} = a_{22} = a_{33} = 1$ and solving the equations in the above set of conditions, we get $a_{14} + a_{24} + a_{34} \approx 7.674 < 8$. This implies that $\tilde f$, and hence $f$, is SOBS.

3. Application to global optimization

Let $f \in \mathbb{R}[X]$ be a non-constant polynomial of degree $2d$. Recall that $f_{\mathrm{sos}}$ denotes the supremum of all real numbers $r$ such that $f - r \in \sum \mathbb{R}[X]^2$, $f_*$ denotes the infimum of the set $\{f(a) : a \in \mathbb{R}^n\}$, and $f_{\mathrm{sos}} \le f_*$. Suppose $\mathbf{f}$ denotes the array of coefficients of the non-constant terms of $f$ and $f_0$ denotes the constant term of $f$. Suppose $\Phi(\mathbf{f}, f_0)$ is a formula in terms of the coefficients of $f$ such that $\Phi(\mathbf{f}, f_0)$ implies $f$ is SOS. For such a criterion $\Phi$, we have $\forall r\ (\Phi(\mathbf{f}, f_0 - r) \Rightarrow r \le f_{\mathrm{sos}})$, so $f_\Phi := \sup\{r \in \mathbb{R} : \Phi(\mathbf{f}, f_0 - r)\}$ is a lower bound for $f_{\mathrm{sos}}$ and, consequently, for $f_*$. In this section we develop this idea, using Theorem 2.3, to find a new lower bound for $f_*$.

Theorem 3.1. Let $f$ be a non-constant polynomial of degree $2d$ and $r \in \mathbb{R}$. Suppose there exist nonnegative real numbers $a_{\alpha,i}$, $\alpha \in \Delta$, $i = 1,\dots,n$, with $a_{\alpha,i} = 0$ iff $\alpha_i = 0$, such that
(1) $(2d)^{2d}\, a_\alpha^\alpha = f_\alpha^{2d}\, \alpha^\alpha$ for each $\alpha \in \Delta$ such that $|\alpha| = 2d$,
(2) $f_{2d,i} \ge \sum_{\alpha \in \Delta} a_{\alpha,i}$ for $i = 1,\dots,n$, and
(3) $f_0 - r \ge \sum_{\alpha \in \Delta_<} (2d - |\alpha|) \Big[ \frac{f_\alpha^{2d}\, \alpha^\alpha}{(2d)^{2d}\, a_\alpha^\alpha} \Big]^{\frac{1}{2d - |\alpha|}}$.
Then $f - r$ is SOBS. Here $\Delta_< := \{\alpha \in \Delta : |\alpha| < 2d\}$.

Proof. Apply Theorem 2.3 to $g :=$ the homogenization of $f - r$. Since $f = f_0 + \sum_{i=1}^n f_{2d,i} X_i^{2d} + \sum_{\alpha \in \Omega} f_\alpha X^\alpha$, it follows that $g = (f_0 - r) Y^{2d} + \sum_{i=1}^n f_{2d,i} X_i^{2d} + \sum_{\alpha \in \Omega} f_\alpha X^\alpha Y^{2d - |\alpha|}$. We know $f - r$ is SOBS if and only if $g$ is SOBS. The sufficient condition for $g$ to be SOBS given by Theorem 2.3 is that there exist nonnegative real numbers $a_{\alpha,i}$ and $a_{\alpha,Y}$, with $a_{\alpha,i} = 0$ iff $\alpha_i = 0$ and $a_{\alpha,Y} = 0$ iff $|\alpha| = 2d$, such that
(1') $(2d)^{2d}\, a_\alpha^\alpha\, a_{\alpha,Y}^{2d - |\alpha|} = f_\alpha^{2d}\, \alpha^\alpha\, (2d - |\alpha|)^{2d - |\alpha|}$ for each $\alpha \in \Delta$, and
(2') $f_{2d,i} \ge \sum_{\alpha \in \Delta} a_{\alpha,i}$, $i = 1,\dots,n$, and $f_0 - r \ge \sum_{\alpha \in \Delta} a_{\alpha,Y}$.
Solving (1') for $a_{\alpha,Y}$ yields
$a_{\alpha,Y} = (2d - |\alpha|) \Big[ \frac{f_\alpha^{2d}\, \alpha^\alpha}{(2d)^{2d}\, a_\alpha^\alpha} \Big]^{\frac{1}{2d - |\alpha|}}$

if $|\alpha| < 2d$. Take $a_{\alpha,Y} = 0$ if $|\alpha| = 2d$. Conversely, defining $a_{\alpha,Y}$ in this way for each $\alpha \in \Delta$, it is easy to see that (1), (2) and (3) imply (1') and (2'). $\square$

Definition 3.2. For a non-constant polynomial $f$ of degree $2d$ we define
$f_{\mathrm{gp}} := \sup\{r \in \mathbb{R} : \exists\, a_{\alpha,i} \in \mathbb{R}_{\ge 0},\ \alpha \in \Delta,\ i = 1,\dots,n,\ a_{\alpha,i} = 0 \text{ iff } \alpha_i = 0, \text{ satisfying conditions (1), (2) and (3) of Theorem 3.1}\}$.
It follows, as a consequence of Theorem 3.1, that $f_{\mathrm{gp}} \le f_{\mathrm{sos}}$.

Example 3.3. Let $f(X,Y) = X^4 + Y^4 - X^2 Y^2 + X + Y$. Here $\Delta = \{\alpha_1, \alpha_2, \alpha_3\}$, where $\alpha_1 = (1,0)$, $\alpha_2 = (0,1)$ and $\alpha_3 = (2,2)$. We are looking for nonnegative reals $a_{ij} = a_{\alpha_i,j}$, $i = 1,2,3$, $j = 1,2$, satisfying
$a_{11} + a_{21} + a_{31} \le 1$, $\quad a_{12} + a_{22} + a_{32} \le 1$, $\quad a_{31}\, a_{32} = \tfrac14$.
Taking $a_{11} = a_{22} = a_{31} = a_{32} = \tfrac12$, $a_{12} = a_{21} = 0$, we see that $f_{\mathrm{gp}} \ge -\tfrac{3}{2^{4/3}}$. Taking $X = Y = -\tfrac{1}{2^{1/3}}$, we see that $f_* \le f(-\tfrac{1}{2^{1/3}}, -\tfrac{1}{2^{1/3}}) = -\tfrac{3}{2^{4/3}}$. Since $f_{\mathrm{gp}} \le f_{\mathrm{sos}} \le f_*$, it follows that $f_{\mathrm{gp}} = f_{\mathrm{sos}} = f_* = -\tfrac{3}{2^{4/3}}$.

Remark 3.4. If $|\Omega| = 1$ then $f_* = f_{\mathrm{sos}} = f_{\mathrm{gp}}$.

Proof. Say $\Omega = \{\alpha\}$, so $f = \sum_{i=1}^n f_{2d,i} X_i^{2d} + f_0 + f_\alpha X^\alpha$. We know $f_{\mathrm{gp}} \le f_{\mathrm{sos}} \le f_*$, so it suffices to show that, for each real number $r$, $f_* \ge r \Rightarrow f_{\mathrm{gp}} \ge r$. Fix $r$ and assume $f_* \ge r$. We want to show $f_{\mathrm{gp}} \ge r$, i.e., that $r$ satisfies the constraints of Theorem 3.1. Let $g$ denote the homogenization of $f - r$, i.e., $g = \sum_{i=1}^n f_{2d,i} X_i^{2d} + (f_0 - r) Y^{2d} + f_\alpha X^\alpha Y^{2d - |\alpha|}$. Thus $g$ is PSD. This implies, in particular, that $f_{2d,i} \ge 0$, $i = 1,\dots,n$, and $f_0 \ge r$. There are two cases to consider.

Case 1. Suppose $f_\alpha > 0$ and all $\alpha_i$ are even. Then $\alpha \notin \Delta$, so $\Delta = \emptyset$. In this case $r$ satisfies the constraints of Theorem 3.1 trivially, so $f_{\mathrm{gp}} \ge r$.

Case 2. Suppose either $f_\alpha < 0$ or not all of the $\alpha_i$ are even. Then $\alpha \in \Delta$, i.e., $\Delta = \Omega = \{\alpha\}$. In this case, applying Corollary 2.2 to $g$, we deduce that
(2) $f_\alpha^{2d}\, \alpha^\alpha\, (2d - |\alpha|)^{2d - |\alpha|} \le (2d)^{2d} \prod_{i=1}^n f_{2d,i}^{\alpha_i}\, (f_0 - r)^{2d - |\alpha|}$.
There are two subcases to consider. If $|\alpha| < 2d$ then $r$ satisfies the constraints of Theorem 3.1, taking $a_{\alpha,i} = f_{2d,i}$ if $\alpha_i \ne 0$ and $a_{\alpha,i} = 0$ if $\alpha_i = 0$. If $|\alpha| = 2d$ then (2) reduces to $f_\alpha^{2d}\, \alpha^\alpha \le (2d)^{2d} \prod_{i=1}^n f_{2d,i}^{\alpha_i}$. In this case, $r$ satisfies the constraints of Theorem 3.1, taking $a_{\alpha,i} = s f_{2d,i}$ if $\alpha_i \ne 0$ and $a_{\alpha,i} = 0$ if $\alpha_i = 0$, where
$s = \Big[ \frac{f_\alpha^{2d}\, \alpha^\alpha}{(2d)^{2d} \prod_{i=1}^n f_{2d,i}^{\alpha_i}} \Big]^{\frac{1}{2d}}$. $\square$
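The numbers in Example 3.3 can be double-checked with a few lines of arithmetic. The sketch below (our own illustration) evaluates the objective of Theorem 3.1(3) at the feasible point chosen in the example, and compares the resulting lower bound $f_0 - \phi_0(a)$ with the value of $f$ at $X = Y = -2^{-1/3}$; both equal $-3/2^{4/3}$.

```python
# Example 3.3: f(X,Y) = X^4 + Y^4 - X^2 Y^2 + X + Y, 2d = 4, f_0 = 0.
# Delta = {(1,0), (0,1), (2,2)}; feasible point a11 = a22 = a31 = a32 = 1/2.
# Objective terms come only from |alpha| < 2d, i.e. alpha_1 = (1,0) and
# alpha_2 = (0,1):
#   (2d-|alpha|) * [ f_alpha^{2d} alpha^alpha / ((2d)^{2d} a_alpha^alpha) ]^{1/(2d-|alpha|)}
a11 = a22 = 0.5
term = 3 * (1 / (4 ** 4 * a11)) ** (1 / 3)   # f_alpha^4 = 1, alpha^alpha = 1
phi0 = 2 * term                              # alpha_2 contributes the same

lower_bound = 0 - phi0        # f_gp >= f_0 - phi_0(a)

def f(x, y):
    return x ** 4 + y ** 4 - x ** 2 * y ** 2 + x + y

t = -2 ** (-1 / 3)
value = f(t, t)               # an upper bound for f_*

exact = -3 / 2 ** (4 / 3)
assert abs(lower_bound - exact) < 1e-9
assert abs(value - exact) < 1e-9
# Lower and upper bound coincide, so f_gp = f_sos = f_* = -3/2^(4/3).
```

That the two bounds meet is exactly the squeeze $f_{\mathrm{gp}} \le f_{\mathrm{sos}} \le f_* $ used in the example.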

If $f_{2d,i} > 0$, $i = 1,\dots,n$, then the computation of $f_{\mathrm{gp}}$ is a geometric programming problem. We explain this now.

Definition 3.5 (geometric program). (1) A function $\phi : \mathbb{R}^n_{>0} \to \mathbb{R}$ of the form $\phi(x) = c\, x_1^{a_1} \cdots x_n^{a_n}$, where $c > 0$, $a_i \in \mathbb{R}$ and $x = (x_1,\dots,x_n)$, is called a monomial function. A sum of monomial functions, i.e., a function of the form $\phi(x) = \sum_{i=1}^k c_i\, x_1^{a_{1i}} \cdots x_n^{a_{ni}}$, where $c_i > 0$ for $i = 1,\dots,k$, is called a posynomial function.

(2) An optimization problem of the form
Minimize $\phi_0(x)$ subject to $\phi_i(x) \le 1$, $i = 1,\dots,m$, and $\psi_j(x) = 1$, $j = 1,\dots,p$,
where $\phi_0,\dots,\phi_m$ are posynomials and $\psi_1,\dots,\psi_p$ are monomial functions, is called a geometric program (GP). The subset of $\mathbb{R}^n_{>0}$ defined by the constraints $\phi_i(x) \le 1$, $i = 1,\dots,m$, and $\psi_j(x) = 1$, $j = 1,\dots,p$, is called the feasible set of the GP. $\phi_0(x)$ is called the objective function. The output of the GP is the minimum, more precisely the infimum, of $\phi_0(x)$, taken as $x$ runs through the feasible set. In case the feasible set is empty the output is understood to be $+\infty$.

In [16] Nesterov and Nemirovskii introduce an interior-point method for solving GPs and prove worst-case polynomial-time complexity of the method. The method works very well in practice. See [3, Section 2.5] for some indication of how well the method works. See [4, Section 4.5] or [18, Section 5.3] for more about GPs.

Corollary 3.6. Let $f$ be a non-constant polynomial of degree $2d$ with $f_{2d,i} > 0$, $i = 1,\dots,n$. Then $f_{\mathrm{gp}} = f_0 - m$, where $m$ is the output of the GP
Minimize $\sum_{\alpha \in \Delta_<} (2d - |\alpha|) \Big[ \big( \tfrac{f_\alpha}{2d} \big)^{2d}\, \frac{\alpha^\alpha}{a_\alpha^\alpha} \Big]^{\frac{1}{2d - |\alpha|}}$
Subject to $\sum_{\alpha \in \Delta} \frac{a_{\alpha,i}}{f_{2d,i}} \le 1$, $i = 1,\dots,n$, and $\frac{(2d)^{2d}\, a_\alpha^\alpha}{f_\alpha^{2d}\, \alpha^\alpha} = 1$, $\alpha \in \Delta$, $|\alpha| = 2d$.
The variables in the program are the $a_{\alpha,i}$, $\alpha \in \Delta$, $i = 1,\dots,n$, $\alpha_i \ne 0$, the understanding being that $a_{\alpha,i} = 0$ iff $\alpha_i = 0$.

Proof. $f_{\mathrm{gp}} = f_0 - m$ is immediate from the definition of $f_{\mathrm{gp}}$. Observe that
$\phi_0(a) := \sum_{\alpha \in \Delta_<} (2d - |\alpha|) \Big[ \big( \tfrac{f_\alpha}{2d} \big)^{2d}\, \frac{\alpha^\alpha}{a_\alpha^\alpha} \Big]^{\frac{1}{2d - |\alpha|}}$

and $\phi_i(a) := \sum_{\alpha \in \Delta} \frac{a_{\alpha,i}}{f_{2d,i}}$, $i = 1,\dots,n$, are posynomials in the variables $a_{\alpha,i}$, and $\psi_\alpha(a) := \frac{(2d)^{2d}\, a_\alpha^\alpha}{f_\alpha^{2d}\, \alpha^\alpha}$, $\alpha \in \Delta$, $|\alpha| = 2d$, are monomial functions in the variables $a_{\alpha,i}$. $\square$

Addendum: If either $f_{2d,i} < 0$ for some $i$, or $f_{2d,i} = 0$ and $\alpha_i \ne 0$ for some $i$ and some $\alpha \in \Delta$, then $f_{\mathrm{gp}} = -\infty$. In all remaining cases, after deleting the columns of the array $(a_{\alpha,i})$ corresponding to the indices $i$ such that $f_{2d,i} = 0$, we are reduced to the case where $f_{2d,i} > 0$ for all $i$, i.e., we can apply GP to compute $f_{\mathrm{gp}}$. A special case occurs when $f_{2d,i} > 0$ for $i = 1,\dots,n$ and $\{\alpha \in \Delta : |\alpha| = 2d\} = \emptyset$. In this case, the equality constraints in the computation of $m$ are vacuous and the feasible set is always non-empty, so $f_{\mathrm{gp}} \ne -\infty$.

Corollary 3.7. If $|\alpha| < 2d$ for each $\alpha \in \Delta$ and $f_{2d,i} > 0$ for $i = 1,\dots,n$, then $f_{\mathrm{gp}} \ne -\infty$ and $f_{\mathrm{gp}} = f_0 - m$, where $m$ is the output of the GP
Minimize $\sum_{\alpha \in \Delta} (2d - |\alpha|) \Big[ \big( \tfrac{f_\alpha}{2d} \big)^{2d}\, \frac{\alpha^\alpha}{a_\alpha^\alpha} \Big]^{\frac{1}{2d - |\alpha|}}$
Subject to $\sum_{\alpha \in \Delta} a_{\alpha,i} \le f_{2d,i}$, $i = 1,\dots,n$.

Proof. Immediate from Corollary 3.6. $\square$

Example 3.8. (1) Let $f$ be the polynomial of Example 2.10. Then $f_{\mathrm{gp}} = f_{\mathrm{sos}} = f_* \approx 0.3265$.
(2) For $g(X,Y,Z) = X^6 + Y^6 + Z^6 + X^2YZ^2 - X^4 - Y^4 - Z^4 - YZ^3 - XY^2 + 2$, $g_* \approx 0.667$, and $g_{\mathrm{gp}} = g_{\mathrm{sos}} \approx -1.6728$.
(3) For $h(X,Y,Z) = g(X,Y,Z) + X^2$, we have $h_{\mathrm{gp}} \approx -1.6728 < h_{\mathrm{sos}} \approx -0.5028$ and $h_* \approx 0.839$.

To compare the running time of the computation of $f_{\mathrm{sos}}$ using SDP with that of $f_{\mathrm{gp}}$ using GP, we set up a test to keep track of the running times. We also computed the average, maximum and minimum gap between $f_{\mathrm{sos}}$ and $f_{\mathrm{gp}}$. All the polynomials were taken randomly of the form $f(X) = X_1^{2d} + \dots + X_n^{2d} + g(X)$, where $g \in \mathbb{R}[X]$ is of degree $2d - 1$. In each case the computation was done for 50 polynomials² with coefficients uniformly distributed in the interval $[-10, 10]$, using SosTools and GPposy for Matlab.³ Although typically there is a large gap between $f_{\mathrm{sos}}$ and $f_{\mathrm{gp}}$ (see Table 1), the running time comparison (see Table 2) shows that the computation of $f_{\mathrm{gp}}$ is much faster than that of $f_{\mathrm{sos}}$.

² Except for the case $n = 6$, $2d = 12$. In this case the algorithm we used to generate random coefficients was taking too long to run, so we used just 10 polynomials instead of 50.
³ Hardware and software specifications. Processor: Intel Core 2 Duo CPU P8400 @ 2.26GHz, Memory: 2 GB, OS: Ubuntu 11.10 32-bit, Matlab: 7.9.0.529 (R2009b).

Table 1. Average, minimum and maximum of $f_{\mathrm{sos}} - f_{\mathrm{gp}}$

          2d:     4       6       8       10       12
 n = 3   avg   12.4    82.6   204.7    592     1096.3
         min      0    23.5   109.5    311.2    808.6
         max   27.1   141.6   334.5    851.5   1492.8
 n = 4   avg   27.5   205.5   730.9   2663.0   6206.1
         min    6.9   100.8   298.3   2098.1   5003.9
         max   51.1   333.1  1044.7   3254.9   7306.3
 n = 5   avg   47.9   539.0  2369.0   9599.7      -
         min   19.9   336.6  1823.8   8001.0      -
         max  100.2   763.9  2942.2  11129.7      -
 n = 6   avg   84.4  1125.9  5963.1      -        -
         min   36.1   780.3  4637.3      -        -
         max  146.3  1424.1  7421.3      -        -

Table 2. Average running time (seconds)

          2d:      4      6       8       10      12
 n = 3   f_gp   0.08   0.09    0.11    0.23    0.33
         f_sos  0.73   1.00    1.64    2.81    6.27
 n = 4   f_gp   0.09   0.13    0.27    0.78    2.16
         f_sos  0.96   1.76    5.6    26.14  176.45
 n = 5   f_gp   0.10   0.23    0.76    3.44   15.41
         f_sos  1.42   4.13   45.18  673.63     -
 n = 6   f_gp   0.11   0.35    2.17   16.54  105.5
         f_sos  1.56  13.31  574.9      -       -

The main advantage of $f_{\mathrm{gp}}$ over $f_{\mathrm{sos}}$ is that in the case of many variables and/or large degree, the former is computable (at least it is computable if $n$, $2d$ and $|\Omega|$ are not too large) whereas the latter is not computable in general.

Example 3.9. Let $f(X,Y,Z) = X^{40} + Y^{40} + Z^{40} - XYZ$. According to Remark 3.4, $f_* = f_{\mathrm{sos}} = f_{\mathrm{gp}}$. The running time for computing $f_{\mathrm{gp}} \approx -0.686$ using GP was 0.18 seconds, but when we attempted to compute $f_{\mathrm{sos}}$ directly, using SDP, the machine ran out of memory and halted, after about 4 hours.

Table 3 shows the running time for the computation of $f_{\mathrm{gp}}$ for larger values of $n$ and $2d$, in cases where $|\Omega|$ is relatively small. This computation was done in Sage [22] using the CvxOpt package, on the same computer, for one (randomly chosen) polynomial in each case.
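Since $|\Omega| = 1$ in Example 3.9, Remark 3.4 gives $f_{\mathrm{gp}}$ in closed form: Corollary 2.2 applied to the homogenization of $f - r$ yields $37^{37} \le 40^{40}(-r)^{37}$, i.e. $f_{\mathrm{gp}} = -37/40^{40/37} \approx -0.686$. The cross-check below is ours, not the authors': minimizing $f$ along the diagonal $X = Y = Z = t$ attains exactly this value, consistent with $f_* = f_{\mathrm{gp}}$.

```python
# Example 3.9: f = X^40 + Y^40 + Z^40 - X*Y*Z, Omega = {(1,1,1)}.
f_gp = -37 / 40 ** (40 / 37)

def f(x, y, z):
    return x ** 40 + y ** 40 + z ** 40 - x * y * z

# On the diagonal x = y = z = t: f = 3t^40 - t^3, stationary where
# 120 t^39 = 3 t^2, i.e. t^37 = 1/40; the value there is -(37/40) t^3,
# which simplifies to -37/40^(40/37) = f_gp.
t = 40 ** (-1 / 37)
assert abs(f(t, t, t) - f_gp) < 1e-12
assert abs(f_gp + 0.686) < 1e-3   # matches the value reported in the text
```

Because the closed-form lower bound $f_{\mathrm{gp}}$ coincides with a value actually attained by $f$, it certifies the global minimum, something the out-of-memory SDP run could not deliver here.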

Table 3. Computation time for $f_{\mathrm{gp}}$ (seconds) for various sizes of $|\Omega|$

            |Ω|:  10    20    30    40     50     60     70    80    90   100
 n = 10  2d = 20  0.24   1.4   2.9  10.8    13    31.8   45.6  67.7   121   152
         2d = 40  0.28   1.5   4.3  14.8    29    43.5   70.3   133   170   220
         2d = 60  0.42   1.6   5.8    15    26    53.5   86.4   129   180   343
 n = 20  2d = 20  0.69   6.1  13.1    36   71.3    151    180   348   432   659
         2d = 40   1.0   7.4  33.6  78.1   154     255    512   749  1033  1461
         2d = 60   1.4  12.1  41.3   104   205     451    778  1101  1551  2130
 n = 30  2d = 20   1.5   9.1  37.6  80.6  153.4    290    462   717   984  1491
         2d = 40   4.4  31.3  82.1   175   416     683   1286  2024  3015  3999

4. Explicit lower bounds

We explain how the lower bounds for $f_*$ established in [7, Section 3] can be obtained by evaluating the objective function of the GP in Corollary 3.7 at suitably chosen feasible points. Recall that for a (univariate) polynomial of the form $p(t) = t^n - \sum_{i=0}^{n-1} a_i t^i$, where each $a_i$ is nonnegative and at least one $a_i$ is nonzero, $C(p)$ denotes the unique positive root of $p$ [19, Theorem 1.1.3]. See [5], [10, Ex. 4.6.2: 20] or [7, Proposition 1.2] for more details and upper bounds for $C(p)$.

Corollary 4.1. If $|\alpha| < 2d$ for each $\alpha \in \Delta$ and $f_{2d,i} > 0$ for $i = 1,\dots,n$, then $f_{\mathrm{gp}} \ge r_L$, where
$r_L := f_0 - \frac{1}{2d} \sum_{\alpha \in \Delta} (2d - |\alpha|)\, |f_\alpha|\, k^{|\alpha|}\, \hat f_\alpha^{\frac{|\alpha|}{2d - |\alpha|}}$, $\quad k := \max_{i=1,\dots,n} C\Big( t^{2d} - \sum_{\alpha \in \Delta} \frac{\alpha_i\, |f_\alpha|}{2d\, f_{2d,i}\, \hat f_\alpha}\, t^{|\alpha|} \Big)$.
Here, $\hat f_\alpha := \prod_{i=1}^n f_{2d,i}^{\alpha_i/2d}$.

Proof. For each $\alpha \in \Delta$ and $i = 1,\dots,n$, let
$a_{\alpha,i} = \frac{\alpha_i}{2d}\, k^{|\alpha| - 2d}\, |f_\alpha|\, \hat f_\alpha^{-1}$.
By the definition of $k$, for each $i$, $\sum_{\alpha \in \Delta} \frac{\alpha_i\, |f_\alpha|}{2d\, f_{2d,i}\, \hat f_\alpha}\, k^{|\alpha|} \le k^{2d}$, hence $\sum_{\alpha \in \Delta} a_{\alpha,i} \le f_{2d,i}$. This shows that the array $(a_{\alpha,i} : \alpha \in \Delta,\ i = 1,\dots,n)$ is a feasible point for the geometric program in the statement of Corollary 3.7. Plugging this into the

objective function of the program yields

    Σ_{α∈Ω} (2d − |α|) [ (|f_α|/2d)^{2d} Π_{α_i≠0} (α_i/a_{α,i})^{α_i} ]^{1/(2d−|α|)}
      = Σ_{α∈Ω} (2d − |α|) [ (|f_α|/2d)^{2d} Π_{α_i≠0} (2d k^{2d−|α|}/|f_α|)^{α_i} ]^{1/(2d−|α|)}
      = Σ_{α∈Ω} (2d − |α|) [ (|f_α|/2d)^{2d−|α|} k^{(2d−|α|)|α|} ]^{1/(2d−|α|)}
      = (1/2d) Σ_{α∈Ω} (2d − |α|) |f_α| k^{|α|},

so r_L = f_0 − (1/2d) Σ_{α∈Ω} (2d − |α|) |f_α| k^{|α|} ≤ f_gp.

Corollary 4.2. If |α| < 2d for each α ∈ Ω and f_{2d,i} > 0 for i = 1, ..., n, then f_gp ≥ r_FK, where

    r_FK := f_0 − k^{2d},    k := C( t^{2d} − Σ_{i=1}^{2d−1} b_i t^i ),

    b_i := (1/2d) (2d − i)^{(2d−i)/2d} Σ_{α∈Ω, |α|=i} |f_α| Π_{j=1}^{n} (α_j^{α_j}/f_{2d,j}^{α_j})^{1/2d},    i = 1, ..., 2d−1

(with the convention 0^0 := 1).

Proof. Define

    a_{α,i} := (1/2d) (2d − |α|)^{(2d−|α|)/2d} |f_α| Π_{j=1}^{n} (α_j^{α_j}/f_{2d,j}^{α_j})^{1/2d} f_{2d,i} k^{|α|−2d}.

Note that Σ_{i=1}^{2d−1} b_i k^i = k^{2d} and, for each i = 1, ..., n,

    Σ_{α∈Ω} a_{α,i} = Σ_{j=1}^{2d−1} Σ_{α∈Ω, |α|=j} a_{α,i} = f_{2d,i} k^{−2d} Σ_{j=1}^{2d−1} b_j k^j = f_{2d,i}.

Hence (a_{α,i} : α ∈ Ω, i = 1, ..., n) belongs to the feasible set of the geometric program in Corollary 3.7. Plugging into the objective function, one sees after some effort that

    Σ_{α∈Ω} (2d − |α|) [ (|f_α|/2d)^{2d} Π_{α_i≠0} (α_i/a_{α,i})^{α_i} ]^{1/(2d−|α|)} = Σ_{j=1}^{2d−1} b_j k^j = k^{2d},

so r_FK = f_0 − k^{2d} ≤ f_gp.

Corollary 4.3. If |α| < 2d for each α ∈ Ω and f_{2d,i} > 0 for i = 1, ..., n, then f_gp ≥ r_dmt, where

    r_dmt := f_0 − Σ_{α∈Ω} (2d − |α|) [ (|f_α|/2d)^{2d} t^{|α|} Π_{j=1}^{n} (α_j^{α_j}/f_{2d,j}^{α_j}) ]^{1/(2d−|α|)},

and t := |Ω|, the number of elements of Ω.
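The three explicit bounds are cheap to evaluate. As a concrete check, the following plain-Python sketch (our own illustration, not the paper's code, assuming the formulas of Corollaries 4.1-4.3 with C(p) computed by bisection; the helper names are ours) reproduces the values r_L ≈ −1.124, r_FK ≈ −0.99 and r_dmt ≈ −1.67 reported for f(X, Y) = X^6 + Y^6 + 7XY − 2X^2 + 7 in Example 4.5(a) below. Here 2d = 6, f_0 = 7, f_{6,1} = f_{6,2} = 1 and Ω = {(1, 1), (2, 0)} with |f_α| = 7 and 2.

```python
from math import prod

d2 = 6                              # 2d
f0 = 7.0                            # constant term of f
fd = [1.0, 1.0]                     # f_{2d,i}, coefficients of X^6 and Y^6
Omega = {(1, 1): 7.0, (2, 0): 2.0}  # alpha -> |f_alpha|

def C(p, hi=1e6):
    # unique positive root of p(t) = t^m - sum_i a_i t^i (a_i >= 0), by bisection;
    # such p is negative on (0, C(p)) and positive afterwards
    lo = 0.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if p(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def AG(a):
    # (alpha^alpha * f_{2d}^{-alpha})^(1/2d), with the convention 0^0 = 1
    return prod(ai ** ai / fd[i] ** ai for i, ai in enumerate(a)) ** (1 / d2)

# Corollary 4.1
k_L = max(C(lambda t, i=i: t ** (d2 - 1)
            - sum(a[i] * c / fd[i] * t ** (sum(a) - 1)
                  for a, c in Omega.items()) / d2)
          for i in range(len(fd)))
r_L = f0 - sum((d2 - sum(a)) * c * k_L ** sum(a) for a, c in Omega.items()) / d2

# Corollary 4.2
def b(j):
    return (d2 - j) ** ((d2 - j) / d2) / d2 * sum(
        c * AG(a) for a, c in Omega.items() if sum(a) == j)

k_FK = C(lambda t: t ** d2 - sum(b(j) * t ** j for j in range(1, d2)))
r_FK = f0 - k_FK ** d2

# Corollary 4.3, with t = |Omega|
t_gp = len(Omega)
r_dmt = f0 - sum((d2 - sum(a))
                 * ((c / d2) ** d2 * t_gp ** sum(a) * AG(a) ** d2)
                 ** (1 / (d2 - sum(a)))
                 for a, c in Omega.items())

print(round(r_L, 3), round(r_FK, 2), round(r_dmt, 2))   # -1.124 -0.99 -1.67
```

In this example k of Corollary 4.1 is attained at i = 1, where the polynomial is t^5 − (11/6)t and C gives (11/6)^{1/4}.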

Proof. Take a_{α,i} = f_{2d,i}/t and apply Corollary 3.7.

Remark 4.4. Let C be a cone in a finite dimensional real vector space V and let C° denote the interior of C. If a ∈ C° and b ∈ V, then b ∈ C° if and only if b − ɛa ∈ C for some real ɛ > 0 (see [14, Lemma 6.1.3] or [7, Remark 2.6]). Since Σ_{i=1}^{n} X_i^{2d} ∈ Σ°_{2d,n} [7, Corollary 2.5], for a polynomial f of degree 2d with f ∈ Σ°_{2d,n} there exists an ɛ > 0 such that g := f − ɛ Σ_{i=1}^{n} X_i^{2d} ∈ Σ_{2d,n}. The hypothesis of Corollary 3.7 holds for f − g. In this way, Corollaries 3.7, 4.1, 4.2 and 4.3 provide lower bounds for f_sos. Moreover, the lower bounds obtained in this way using Corollaries 4.1, 4.2 and 4.3 are exactly the lower bounds obtained in [7]. Note that the assumptions in Theorems 3.1, 3.2 and 3.3 of [7] are not quite the same as the assumptions in Corollaries 4.1, 4.2 and 4.3: to get the lower bounds obtained in [7], one needs to set f_{2d,i} = ɛ, i = 1, ..., n. The bounds r_L, r_FK and r_dmt provided by Corollaries 4.1, 4.2 and 4.3 are typically not as good as the bound f_gp provided by Corollary 3.7.

Example 4.5. (Compare [7, Example 4.2].)
(a) For f(X, Y) = X^6 + Y^6 + 7XY − 2X^2 + 7, we have r_L ≈ −1.124, r_FK ≈ −0.99, r_dmt ≈ −1.67 and f_sos = f_gp ≈ −0.4464, so f_gp > r_FK > r_L > r_dmt.
(b) For f(X, Y) = X^6 + Y^6 + 4XY + 10Y + 13, r_L ≈ −0.81, r_FK ≈ −0.93, r_dmt ≈ −0.69 and f_gp ≈ −0.15 ≤ f_sos, so f_gp > r_dmt > r_L > r_FK.
(c) For f(X, Y) = X^4 + Y^4 + XY − X^2 − Y^2 + 1, f_sos = f_gp = r_L = −0.125, r_FK ≈ −0.832 and r_dmt ≈ −0.875, so f_gp = r_L > r_FK > r_dmt.

Acknowledgement. The authors wish to thank the two anonymous referees, whose comments and suggestions led to improvements in the paper and in the clarity of the presentation.

References

[1] M. Bellare, P. Rogaway, The complexity of approximating a nonlinear program, Mathematical Programming 69 (429-441), 1993.
[2] J. Bochnak, M. Coste, M.-F. Roy, Géométrie algébrique réelle, Ergeb. Math. 12, Springer, 1987; Real algebraic geometry, Ergeb. Math. 36, Springer, 1998.
[3] S. Boyd, S.-J. Kim, L. Vandenberghe, A. Hassibi, A tutorial on geometric programming, Optim. Eng. 8 (67-127), 2007.
[4] S. Boyd, L. Vandenberghe, Convex optimization, Cambridge University Press, 2004.
[5] E. Deutsch, Bounds for the zeros of polynomials, Amer. Math. Monthly 88 (205-206), 1981.
[6] C. Fidalgo, A. Kovacec, Positive semidefinite diagonal minus tail forms are sums of squares, Math. Z. 269 (629-645), 2011.
[7] M. Ghasemi, M. Marshall, Lower bounds for a polynomial in terms of its coefficients, Arch. Math. (Basel) 95 (343-353), 2010.
[8] D. Hilbert, Über die Darstellung definiter Formen als Summe von Formenquadraten, Math. Ann. 32 (342-350), 1888.
[9] A. Hurwitz, Über den Vergleich des arithmetischen und des geometrischen Mittels, J. Reine Angew. Math. 108 (266-268), 1891. See also: Math. Werke, Basel (505-507), 1933.
[10] D. Knuth, The art of computer programming, Volume 2, Addison-Wesley, New York, 1969.

[11] J. B. Lasserre, Global optimization with polynomials and the problem of moments, SIAM J. Optim. 11(3) (796-817), 2001.
[12] J. B. Lasserre, Sufficient conditions for a real polynomial to be a sum of squares, Arch. Math. (Basel) 89 (390-398), 2007.
[13] M. Marshall, Optimization of polynomial functions, Canad. Math. Bull. 46(4) (575-587), 2003.
[14] M. Marshall, Positive polynomials and sums of squares, Mathematical Surveys and Monographs 146, AMS, 2008.
[15] M. Marshall, Representation of non-negative polynomials, degree bounds and applications to optimization, Canad. J. Math. 61 (205-221), 2009.
[16] Y. Nesterov, A. Nemirovskii, Interior-point polynomial algorithms in convex programming, SIAM Studies in Applied Mathematics 13, Philadelphia, 1994.
[17] P. Parrilo, B. Sturmfels, Minimizing polynomial functions, Algorithmic and quantitative real algebraic geometry (Piscataway, NJ, 2001), 88-99, DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 60, AMS, 2003.
[18] A. L. Peressini, F. E. Sullivan, J. J. Uhl, Jr., The mathematics of nonlinear programming, UTM series, Springer, 1987.
[19] V. V. Prasolov, Polynomials, Algorithms and Computation in Mathematics 11, Springer, 2004.
[20] B. Reznick, A quantitative version of Hurwitz's theorem on the arithmetic-geometric inequality, J. Reine Angew. Math. 377 (108-112), 1987.
[21] B. Reznick, Forms derived from the arithmetic-geometric inequality, Math. Ann. 283 (431-464), 1989.
[22] W. A. Stein et al., Sage Mathematics Software (Version 4.6.2), The Sage Development Team, 2011, http://www.sagemath.org.
[23] H. Waki, S. Kim, M. Kojima, M. Muramatsu, Sums of squares and semidefinite program relaxations for polynomial optimization problems with structured sparsity, SIAM J. Optim. 17 (218-242), 2006.

Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, SK S7N 5E6, Canada
E-mail address: mehdi.ghasemi@usask.ca, marshall@math.usask.ca