arxiv: v1 [math.oc] 28 Dec 2018


On semi-infinite systems of convex polynomial inequalities and polynomial optimization problems

Feng Guo (a), Xiaoxia Sun (b)

(a) School of Mathematical Sciences, Dalian University of Technology, Dalian, 116024, China
(b) School of Mathematics, Dongbei University of Finance and Economics, Dalian, 116025, China

arXiv:1812.10987v1 [math.OC] 28 Dec 2018

Abstract

We consider the semi-infinite system of polynomial inequalities of the form
$$K := \{x \in \mathbb{R}^m \mid p(x, y) \ge 0,\ \forall y \in S \subseteq \mathbb{R}^n\},$$
where $p(X, Y)$ is a real polynomial in the variables $X$ and the parameters $Y$, the index set $S$ is a basic semialgebraic set in $\mathbb{R}^n$, and $-p(X, y)$ is convex in $X$ for every $y \in S$. We propose a procedure to construct approximate semidefinite representations of $K$. These semidefinite representation sets are indexed by two indices which respectively bound the order of some moment matrices and the degree of sums of squares representations of some polynomials in the construction. As the two indices increase, these semidefinite representation sets expand and contract, respectively, and can approximate $K$ as closely as possible under some assumptions. Some special cases in which one or both of the two indices can be fixed are also investigated. Then, we consider the optimization problem of minimizing a convex polynomial over $K$. We present an SDP relaxation method for this optimization problem using strategies similar to those used in constructing approximate semidefinite representations of $K$. Under certain assumptions, some approximate minimizers of the optimization problem can also be obtained from the SDP relaxations. In some special cases, we show that the SDP relaxation for the optimization problem is exact and all minimizers can be extracted.

Keywords: semi-infinite systems, convex polynomials, semidefinite representations, semidefinite programming relaxations, sum of squares, polynomial optimization

MSC: 65K05, 90C22, 90C34.
1. Introduction

We consider the following semi-infinite system of polynomial inequalities
$$K := \{x \in \mathbb{R}^m \mid p(x, y) \ge 0,\ \forall y \in S \subseteq \mathbb{R}^n\}, \qquad (1)$$
where $p(X, Y) \in \mathbb{R}[X, Y] := \mathbb{R}[X_1, \ldots, X_m, Y_1, \ldots, Y_n]$, the polynomial ring in $X$ and $Y$ over the real field, and the index set $S$ is a basic semialgebraic set defined by
$$S := \{y \in \mathbb{R}^n \mid g_1(y) \ge 0, \ldots, g_s(y) \ge 0\}, \qquad (2)$$
where $g_j(Y) \in \mathbb{R}[Y]$, $j = 1, \ldots, s$. Lowercase letters (e.g. $x$, $y$) are hereinafter used to denote points in a space, while uppercase letters (e.g. $X$, $Y$) denote variables. In this paper, we assume that $-p(X, y) \in \mathbb{R}[X]$ is convex for every $y \in S$, and hence $K$ is a convex set in $\mathbb{R}^m$.

Email addresses: fguo@dlut.edu.cn (Feng Guo), xiaoxiasun@dufe.edu.cn (Xiaoxia Sun)

We say that a convex set $C$ in $\mathbb{R}^m$ is semidefinitely representable (or linear matrix inequality representable) if there exist integers $l$, $k$ and real $k \times k$ symmetric matrices $\{A_i\}_{i=0}^m$ and $\{B_j\}_{j=1}^l$ such that $C$ is identical with
$$\Big\{ x \in \mathbb{R}^m \ \Big|\ \exists\, w \in \mathbb{R}^l \ \text{s.t.}\ A_0 + \sum_{i=1}^m A_i x_i + \sum_{j=1}^l B_j w_j \succeq 0 \Big\}, \qquad (3)$$
and (3) is called a semidefinite representation (or linear matrix inequality representation) of $C$. Many interesting convex sets are semidefinitely representable; see the collection in Ben-Tal and Nemirovski (2001). Clearly, optimizing a linear function over a semidefinitely representable set can be cast as a semidefinite programming (SDP) problem, and SDP has an extremely wide range of applications and can be solved by interior-point methods to a given accuracy in polynomial time (cf. Wolkowicz et al. (2000)). Semidefinite representations of convex sets can help us to build SDP relaxations of many computationally intractable optimization problems.

Consequently, one of the basic issues in convex algebraic geometry is to characterize the convex sets in $\mathbb{R}^m$ which are semidefinitely representable and to give systematic procedures for obtaining their semidefinite representations. Clearly, if a set in $\mathbb{R}^m$ is semidefinitely representable, then it is convex and semialgebraic. Conversely, Nemirovski asked in his plenary address at the 2006 ICM whether each convex semialgebraic set is semidefinitely representable. A negative answer has recently been given by Scheiderer (2018). Hence, it is reasonable to study how to construct approximate semidefinite representations of $C$, that is, a sequence of semidefinite representation sets of the form (3) which converge to $C$ in some sense. For a given basic semialgebraic set in $\mathbb{R}^m$, Lasserre (2009b) and Gouveia et al. (2010) proposed methods to construct semidefinite outer approximations of the closure of its convex hull. These approaches are based on the sums of squares representation of linear functions which are nonnegative on a basic semialgebraic set.
If the basic semialgebraic set is compact, these approximations can be made arbitrarily close and become exact under some favorable conditions. Some extensions of these semidefinite approximations to noncompact basic semialgebraic sets are given in Guo et al. (2015). For a convex semialgebraic set, Helton and Nie (2009, 2010) proposed sufficient conditions for its semidefinite representability in terms of curvature conditions on the boundary. These conditions were recently modified and improved by Kriel and Schweighofer (2018).

In this paper, we first consider the construction of approximate semidefinite representations of the set $K$ in (1). This problem differs from those in the literature in that $K$ is defined by infinitely many convex real polynomials. As there is a quantifier in the definition (1), $K$ is in fact a semialgebraic set by the Tarski-Seidenberg principle (cf. Bochnak et al. (1998)). Theoretically, $K$ can be decomposed as a finite union of basic closed semialgebraic sets and hence, as proved in Helton and Nie (2009), semidefinite approximations of $K$ can be made by gluing together Lasserre relaxations (Lasserre, 2009b) of many small pieces of $K$. However, such a decomposition of $K$ may not be easily obtained and the approach given in Helton and Nie (2009) is not constructive. These obstacles make the problem studied in this paper nontrivial.

As the first contribution of this paper, we propose a procedure to construct approximate semidefinite representations of $K$. These semidefinite representation sets are indexed by two indices which respectively bound the order of some moment matrices and the degree of sums of squares representations of some polynomials in the construction. As the two indices increase, these semidefinite representation sets expand and contract, respectively, and can approximate $K$ as closely as possible under some assumptions. Some special cases in which one or both of the two indices can be fixed are also investigated.
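As a concrete instance of a representation of the form (3), the closed unit disk in $\mathbb{R}^2$ can be written with a single $2 \times 2$ linear matrix inequality and no lifting variables $w$. The following minimal Python sketch is an illustration only: the example set, the matrices and the function names are assumptions chosen for this illustration and do not come from the paper.

```python
def lmi_matrix(x1, x2):
    """Return A_0 + A_1*x1 + A_2*x2 with A_0 = I, A_1 = diag(1, -1),
    A_2 = [[0, 1], [1, 0]] (illustrative choice for the unit disk)."""
    return [[1.0 + x1, x2],
            [x2, 1.0 - x1]]


def is_psd_2x2(mat, tol=1e-12):
    """A symmetric 2x2 matrix is positive semidefinite iff both diagonal
    entries and the determinant are nonnegative."""
    a, b, d = mat[0][0], mat[0][1], mat[1][1]
    return a >= -tol and d >= -tol and a * d - b * b >= -tol


def in_disk_via_lmi(x1, x2):
    """Membership test for {x : x1^2 + x2^2 <= 1} through the LMI."""
    return is_psd_2x2(lmi_matrix(x1, x2))
```

Here the determinant of the matrix equals $1 - x_1^2 - x_2^2$, so the LMI holds exactly on the disk.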
In the second part of this paper, we consider the following convex minimization problem
$$\text{(P)} \qquad f^* := \inf_{x \in K} f(x),$$
where $K$ is defined in (1) and $f(X) \in \mathbb{R}[X]$ is convex. This problem is NP-hard. Indeed, the problem of minimizing a polynomial $h(Y) \in \mathbb{R}[Y]$ over $S$ can be regarded as a special case of (P). As is well known, the polynomial optimization problem is NP-hard even when $n > 1$, $h(Y)$ is a nonconvex quadratic polynomial and the $g_j(Y)$'s are linear (cf. Pardalos and Vavasis (1991)). Hence, in general, problem (P) cannot be expected to be solvable in polynomial time unless P=NP.
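To make the set $K$ in (1) concrete, the following minimal Python sketch tests membership for a toy instance by replacing the index set with finitely many sample points. Everything in it is an assumption made for illustration (the polynomial $p(x, y) = 1 - x^T y$, the unit circle as index set, the sample count and the function names); it is a plain sampling check, not the SDP-based construction developed in this paper. For this instance $K$ is the closed unit disk, and the sampled set is an outer approximation of it.

```python
import math


def p(x, y):
    """Toy constraint p(x, y) = 1 - <x, y>; -p is linear (hence convex) in x."""
    return 1.0 - (x[0] * y[0] + x[1] * y[1])


def sample_index_set(num_points):
    """Equally spaced points on the unit circle, used as a finite stand-in for S."""
    return [(math.cos(2 * math.pi * k / num_points),
             math.sin(2 * math.pi * k / num_points))
            for k in range(num_points)]


def lower_level_value(x, samples):
    """Approximate min over y in S of p(x, y) by the minimum over the samples."""
    return min(p(x, y) for y in samples)


def in_K_sampled(x, samples, tol=1e-9):
    """x passes if p(x, y) >= 0 at every sampled index point."""
    return lower_level_value(x, samples) >= -tol


samples = sample_index_set(360)
```

With these samples, `in_K_sampled((0.6, 0.6), samples)` accepts a point of norm about 0.85, while `in_K_sampled((0.8, 0.8), samples)` rejects a point of norm about 1.13.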

Problem (P) can be seen as a special branch of convex semi-infinite programming (SIP), in which the involved functions are not necessarily polynomials. Numerically, SIP problems can be solved by different approaches including, for instance, discretization methods, local reduction methods, exchange methods and simplex-like methods. See Hettich and Kortanek (1993); López and Still (2007); Goberna and López (2017) and the references therein for details. One of the main difficulties in the numerical treatment of general SIP problems is that testing the feasibility of a point $\bar{u} \in \mathbb{R}^m$ is equivalent to globally solving the lower level subproblem $\min_{y \in S} p(\bar{u}, y)$, which is generally nonlinear and nonconvex. To the best of our knowledge, few of the numerical methods mentioned above are specially designed to exploit features of polynomial optimization problems. Parpas and Rustem (2009) proposed a discretization-like method to solve minimax polynomial optimization problems, which can be reformulated as semi-infinite polynomial programming (SIPP) problems. Using polynomial approximation and an appropriate hierarchy of SDP relaxations, Lasserre presented an algorithm to solve generalized SIPP problems in Lasserre (2012). Based on an exchange scheme, an SDP relaxation method for solving SIPP problems was proposed in Wang and Guo (2013). By using representations of nonnegative polynomials in the univariate case, an SDP method was given in Xu et al. (2015) for linear SIPP problems (a special case of (P)) with $S$ being a closed interval.

As the second contribution of this paper, we present some SDP relaxation methods for problem (P) using strategies similar to those used in constructing approximate semidefinite representations of $K$. Under certain assumptions, some approximate minimizers of (P) can also be obtained from the SDP relaxations. In some special cases, we show that the SDP relaxation of (P) is exact and all minimizers can be extracted.

This paper is organized as follows.
In Section 2, we give some notation and preliminaries used in this paper. Approximate semidefinite representations of $K$ as well as some examples are proposed in Section 3. We study SDP relaxations of problem (P) in Section 4.

2. Notation and Preliminaries

Here is some notation used in this paper. The symbol $\mathbb{N}$ (resp., $\mathbb{R}$) denotes the set of nonnegative integers (resp., real numbers). For any $t \in \mathbb{R}$, $\lceil t \rceil$ (resp. $\lfloor t \rfloor$) denotes the smallest (resp. largest) integer that is not smaller (resp. larger) than $t$. For $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$, $\|y\|$ denotes the standard Euclidean norm of $y$. For $\alpha = (\alpha_1, \ldots, \alpha_n) \in \mathbb{N}^n$, $|\alpha| = \alpha_1 + \cdots + \alpha_n$. For $k \in \mathbb{N}$, denote $\mathbb{N}^n_k = \{\alpha \in \mathbb{N}^n \mid |\alpha| \le k\}$ and by $|\mathbb{N}^n_k|$ its cardinality. For $y \in \mathbb{R}^n$ and $\alpha \in \mathbb{N}^n$, $y^\alpha$ denotes $y_1^{\alpha_1} \cdots y_n^{\alpha_n}$. $\mathbb{R}[Y] = \mathbb{R}[Y_1, \ldots, Y_n]$ denotes the ring of polynomials in $(Y_1, \ldots, Y_n)$ with real coefficients. For $k \in \mathbb{N}$, denote by $\mathbb{R}[Y]_k$ the set of polynomials in $\mathbb{R}[Y]$ of total degree up to $k$. For a symmetric matrix $W$, $W \succeq 0$ ($W \succ 0$) means that $W$ is positive semidefinite (definite). For two symmetric matrices $A$, $B$ of the same size, $\langle A, B \rangle$ denotes the inner product of $A$ and $B$.

We say that the Slater condition holds for $K$ if there exists $u \in K$ such that $p(u, y) > 0$ for all $y \in S$; the point $u$ is then called a Slater point. Consider the semi-infinite convex polynomial optimization problem (P).

Theorem 2.1. (cf. Borwein (1981); Levin (1969)) Assume that the Slater condition holds for $K$ and the index set $S$ is compact. Then for any convex $f(X) \in \mathbb{R}[X]$, there exist points $y_1, \ldots, y_l \in S$ with $l \le n$ such that $f^*$ is equal to the optimal value of the discretization problem
$$\min_{x \in \mathbb{R}^m} f(x) \quad \text{s.t.} \quad p(x, y_1) \ge 0, \ldots, p(x, y_l) \ge 0. \qquad (4)$$

Corollary 2.2. Suppose that the assumptions in Theorem 2.1 hold. Then for any convex $f(X) \in \mathbb{R}[X]$, there exist points $y_1, \ldots, y_l \in S$ and nonnegative Lagrange multipliers $\lambda_1, \ldots, \lambda_l \in \mathbb{R}$ with $l \le n$ such that the Lagrangian
$$L_f(X) := f(X) - f^* - \sum_{i=1}^{l} \lambda_i\, p(X, y_i) \qquad (5)$$
is nonnegative on $\mathbb{R}^m$.

Next we recall some background about sums of squares (s.o.s) of polynomials and the dual theory of moment matrices.
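The Lagrangian (5) of Corollary 2.2 can be written out by hand on a small instance. The worked example below is an illustration only; the instance ($m = n = 2$, the unit circle as index set, $p(X, y) = 1 - y^T X$ and $f(X) = X_1$) is an assumption chosen here and is not taken from the paper.

```latex
% Toy instance (assumed for illustration): S = \{ y \in \mathbb{R}^2 : \|y\|^2 = 1 \},
% p(X, y) = 1 - y^T X, so K is the closed unit disk and u = 0 is a Slater point.
\begin{align*}
  f(X) &= X_1, \qquad f^* = \min_{x \in K} x_1 = -1 \quad \text{(attained at } x = (-1, 0)\text{)},\\
  y_1 &= (-1, 0) \in S, \qquad p(X, y_1) = 1 + X_1, \qquad \lambda_1 = 1,\\
  L_f(X) &= f(X) - f^* - \lambda_1\, p(X, y_1) = X_1 + 1 - (1 + X_1) = 0 \;\ge\; 0 .
\end{align*}
% The discretization problem (4) with the single point y_1 reads
% \min x_1 \ \text{s.t.}\ 1 + x_1 \ge 0, whose optimal value is again -1 = f^*.
```

In this instance the Lagrangian is identically zero, hence trivially a sum of squares; a single index point ($l = 1 \le n = 2$) already suffices.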
A polynomial $\varphi(X) \in \mathbb{R}[X]$ is said to be a sum of squares of polynomials if it can be written as $\varphi(X) = \sum_{i=1}^{t} \varphi_i(X)^2$ for some $\varphi_1(X), \ldots, \varphi_t(X) \in \mathbb{R}[X]$. The symbols $\Sigma^2[X]$ and $\Sigma^2[Y]$ denote the sets of polynomials that are s.o.s in $\mathbb{R}[X]$ and $\mathbb{R}[Y]$, respectively. Notice that not every nonnegative polynomial can be written as an s.o.s, see Reznick (2000). Lasserre and Netzer (2007) gave the following s.o.s approximations of nonnegative polynomials via simple high degree perturbations.

Theorem 2.3. (Lasserre and Netzer, 2007, cf. Theorem 3., 3. and Corollary 3.3) For a given $h \in \mathbb{R}[X]$, the following are true.
(i) For any $r \ge \lceil \deg(h)/2 \rceil$, there exists $\varepsilon^*_r \ge 0$ such that $h + \varepsilon\big(1 + \sum_{j=1}^m X_j^{2r}\big)$ is s.o.s if and only if $\varepsilon \ge \varepsilon^*_r$;
(ii) If $h$ is nonnegative on $[-1, 1]^m$, then $\varepsilon^*_r$ in (i) decreasingly converges to $0$ as $r$ tends to $\infty$;
(iii) For any $\varepsilon > 0$, if $h$ is nonnegative on $[-1, 1]^m$, then there exists some $r(h, \varepsilon) \in \mathbb{N}$ such that $h + \varepsilon\big(1 + \sum_{j=1}^m X_j^{2r}\big)$ is s.o.s for every $r \ge r(h, \varepsilon)$.

Moreover, $\varepsilon^*_r$ in Theorem 2.3 is computable by solving an SDP problem, see (Lasserre and Netzer, 2007, Theorem 3.).

Now we consider the cone $\mathcal{P}(S)$ of polynomials in $\mathbb{R}[Y]$ which are nonnegative on $S$. Let $G := \{g_1, \ldots, g_s\}$ be the set of polynomials that defines the semialgebraic set $S$ in (2). We denote by
$$Q(G) := \Big\{ \sum_{j=0}^{s} \sigma_j g_j \ \Big|\ g_0 = 1,\ \sigma_j \in \Sigma^2[Y],\ j = 0, 1, \ldots, s \Big\}$$
the quadratic module generated by $G$ and denote by
$$Q_k(G) := \Big\{ \sum_{j=0}^{s} \sigma_j g_j \ \Big|\ g_0 = 1,\ \sigma_j \in \Sigma^2[Y],\ \deg(\sigma_j g_j) \le 2k,\ j = 0, 1, \ldots, s \Big\}$$
its $k$-th quadratic module. It is clear that if $h \in Q(G)$, then $h(y) \ge 0$ for any $y \in S$. However, the converse is not necessarily true. Note that checking $h \in Q_k(G)$ for a fixed $k \in \mathbb{N}$ is an SDP feasibility problem, see Lasserre (2001); Parrilo and Sturmfels (2003).

Definition 2.4. We say that $Q(G)$ is Archimedean if there exists $\psi \in Q(G)$ such that the inequality $\psi(y) \ge 0$ defines a compact set in $\mathbb{R}^n$.

Note that the Archimedean property implies that $S$ is compact, but the converse is not necessarily true. However, for any compact set $S$ we can always force the associated quadratic module to be Archimedean by adding a redundant constraint $M - \|y\|^2 \ge 0$ to the description of $S$ for sufficiently large $M$.

Theorem 2.5.
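(Putinar, 1993, Putinar's Positivstellensatz) Suppose that $Q(G)$ is Archimedean. If a polynomial $h \in \mathbb{R}[Y]$ is positive on $S$, then $h \in Q_k(G)$ for some $k \in \mathbb{N}$.

Consequently, for any $d \in \mathbb{N}$, we have $\mathrm{closure}(Q(G) \cap \mathbb{R}[Y]_d) = \mathcal{P}(S) \cap \mathbb{R}[Y]_d$ if $Q(G)$ is Archimedean. For a polynomial $h(Y) = \sum_\alpha h_\alpha Y^\alpha \in \mathbb{R}[Y]$, define the norm
$$\|h\| := \max_\alpha \frac{|h_\alpha|}{\binom{|\alpha|}{\alpha}}. \qquad (6)$$
We have the following estimate of the order $k$ in Theorem 2.5.

Theorem 2.6. (Nie and Schweighofer, 2007, Theorem 6) Suppose that $Q(G)$ is Archimedean and $S \subseteq (-\tau_S, \tau_S)^n$ for some $\tau_S > 0$. Then there is some positive $c \in \mathbb{R}$ (depending only on the $g_j$'s) such that for all $h \in \mathbb{R}[Y]$ of degree $d$ with $\min_{y \in S} h(y) > 0$, we have $h \in Q_k(G)$ whenever
$$k \ge c \exp\left[ \left( d^2 n^d \frac{\|h\|\, \tau_S^d}{\min_{y \in S} h(y)} \right)^{c} \right].$$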
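Membership $h \in Q_k(G)$ is certified by Gram matrices of the multipliers $\sigma_j$, and verifying a given certificate only needs exact arithmetic. The minimal Python sketch below checks one such certificate for a univariate toy instance. The instance ($S = [-1, 1]$ described by $g_1 = 1 - Y^2$, $h = 2 + Y$, and the two Gram matrices) and all helper names are assumptions made for this illustration; finding a certificate in general requires an SDP solver, which is not shown.

```python
from fractions import Fraction as F


def poly_mul(a, b):
    """Multiply two univariate polynomials given as coefficient lists (low degree first)."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Add two coefficient lists, padding the shorter one with zeros."""
    n = max(len(a), len(b))
    a = a + [F(0)] * (n - len(a))
    b = b + [F(0)] * (n - len(b))
    return [ai + bi for ai, bi in zip(a, b)]


def gram_to_poly(Q):
    """Coefficients of m(y)^T Q m(y) for the monomial basis m(y) = (1, y, ..., y^d)."""
    d = len(Q)
    out = [F(0)] * (2 * d - 1)
    for i in range(d):
        for j in range(d):
            out[i + j] += Q[i][j]
    return out


def is_psd_small(Q):
    """PSD test for symmetric 1x1 or 2x2 matrices via diagonal entries and determinant."""
    if len(Q) == 1:
        return Q[0][0] >= 0
    return (Q[0][0] >= 0 and Q[1][1] >= 0
            and Q[0][0] * Q[1][1] - Q[0][1] ** 2 >= 0)


g1 = [F(1), F(0), F(-1)]                  # g_1(y) = 1 - y^2, so S = [-1, 1]
h = [F(2), F(1), F(0)]                    # h(y) = 2 + y, positive on S
Q0 = [[F(1), F(1, 2)], [F(1, 2), F(1)]]   # Gram matrix of sigma_0 = 1 + y + y^2
Q1 = [[F(1)]]                             # Gram matrix of sigma_1 = 1

sigma0 = gram_to_poly(Q0)
sigma1 = gram_to_poly(Q1)
certificate = poly_add(sigma0, poly_mul(sigma1, g1))
certificate_ok = is_psd_small(Q0) and is_psd_small(Q1) and certificate == h
```

Since $\deg(\sigma_0) = \deg(\sigma_1 g_1) = 2$, a successful check shows $2 + Y \in Q_1(G)$ for this instance.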

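A sequence of real numbers $z := (z_\alpha)_{\alpha \in \mathbb{N}^n} \in \mathbb{R}^{\mathbb{N}^n}$ whose elements are indexed by $n$-tuples $\alpha \in \mathbb{N}^n$ is called a moment sequence, and the truncation $(z_\alpha)_{\alpha \in \mathbb{N}^n_k} \in \mathbb{R}^{\mathbb{N}^n_k}$ is called a truncated moment sequence up to order $k$. For $z \in \mathbb{R}^{\mathbb{N}^n}$, if there exists a Borel measure $\mu$ on $\mathbb{R}^n$ such that
$$z_\alpha = \int Y^\alpha \, d\mu(Y), \quad \forall \alpha \in \mathbb{N}^n,$$
then we say that $z$ has a representing measure $\mu$. A basic problem in the theory of moments concerns the characterization of (infinite or truncated) sequences which have some representing measure. For any moment sequence $z$, the Riesz functional $L_z$ on $\mathbb{R}[Y]$ is defined by
$$L_z\Big( \sum_\alpha q_\alpha Y_1^{\alpha_1} \cdots Y_n^{\alpha_n} \Big) := \sum_\alpha q_\alpha z_\alpha, \quad \forall q(Y) \in \mathbb{R}[Y]. \qquad (7)$$
For bounded moment sequences, we have the following result for the moment problem.

Theorem 2.7. (Berg and Maserick, 1984, Theorem .) Let $z \in \mathbb{R}^{\mathbb{N}^n}$ be a moment sequence such that $L_z(h) \ge 0$ for all $h \in \Sigma^2[Y]$. If there exist $a, c > 0$ such that $|z_\alpha| \le c\, a^{|\alpha|}$ for every $\alpha \in \mathbb{N}^n$, then $z$ has exactly one representing measure $\mu$ on $\mathbb{R}^n$ with support contained in $[-a, a]^n$.

Denote by $\mathcal{M}(S)$ the set of those moment sequences which have some representing measure supported on $S$ in (2). To characterize the elements in $\mathcal{M}(S)$, we need to introduce moment matrices. The associated $k$-th moment matrix is the matrix $M_k(z)$ indexed by $\mathbb{N}^n_k$, with $(\alpha, \beta)$-th entry $z_{\alpha + \beta}$ for $\alpha, \beta \in \mathbb{N}^n_k$. Given a polynomial $h(Y) = \sum_\alpha h_\alpha Y^\alpha$, for $k \ge d_h := \lceil \deg(h)/2 \rceil$, the $(k - d_h)$-th localizing moment matrix $M_{k - d_h}(hz)$ is defined as the moment matrix of the shifted vector $((hz)_\alpha)_{\alpha \in \mathbb{N}^n_{2(k - d_h)}}$ with $(hz)_\alpha = \sum_\beta h_\beta z_{\alpha + \beta}$. For any $q(Y) \in \mathbb{R}[Y]_k$, let $\mathbf{q}$ denote its column vector of coefficients in the canonical monomial basis of $\mathbb{R}[Y]_k$. From the definition of the localizing moment matrix $M_{k - d_h}(hz)$, it is easy to check that
$$\mathbf{q}^T M_{k - d_h}(hz)\, \mathbf{q} = L_z\big(h(Y) q(Y)^2\big), \quad \forall q(Y) \in \mathbb{R}[Y]_{k - d_h}. \qquad (8)$$
Let
$$d_j := \lceil \deg(g_j)/2 \rceil, \ j = 1, \ldots, s, \qquad d_S := \max_j d_j. \qquad (9)$$
For any $v \in S$, let $\zeta_{2k,v} := [v^\alpha]_{\alpha \in \mathbb{N}^n_{2k}}$ be the Zeta vector of $v$ up to degree $2k$, i.e.,
$$\zeta_{2k,v} = [\,1 \ \ v_1 \ \cdots \ v_n \ \ v_1^2 \ \ v_1 v_2 \ \cdots \ v_n^{2k}\,].$$
Then, $M_k(\zeta_{2k,v}) \succeq 0$ and $M_{k - d_j}(g_j \zeta_{2k,v}) \succeq 0$ for $j = 1, \ldots, s$.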
In fact, let $g_0 = 1$; then for each $j = 0, 1, \ldots, s$,
$$\mathbf{q}^T M_{k - d_j}(g_j \zeta_{2k,v})\, \mathbf{q} = L_{\zeta_{2k,v}}\big(g_j(Y) q(Y)^2\big) = g_j(v)\, q(v)^2 \ge 0, \quad \forall q(Y) \in \mathbb{R}[Y]_{k - d_j}.$$
Haviland (1935) proved that the dual cone $(\mathcal{P}(S))^* = \mathcal{M}(S)$. Hence, in a dual view, Putinar's Positivstellensatz reads

Theorem 2.8. (Putinar, 1993, Putinar's Positivstellensatz) Suppose that $Q(G)$ is Archimedean. If $M_k(z) \succeq 0$ and $M_{k - d_j}(g_j z) \succeq 0$ for all $j = 1, \ldots, s$ and all $k = 0, 1, \ldots$, then $z \in \mathcal{M}(S)$.

For a truncated moment sequence $z = (z_\alpha)_{\alpha \in \mathbb{N}^n_{2k}}$, we have the following sufficient condition for $z \in \mathcal{M}(S)$.

Condition 2.9. A truncated moment sequence $z = (z_\alpha)_{\alpha \in \mathbb{N}^n_{2k}}$ satisfies the Rank Condition when $\mathrm{rank}\, M_{k - d_S}(z) = \mathrm{rank}\, M_k(z)$.

Theorem 2.10. (Curto and Fialkow, 2005, Theorem .) Suppose that a truncated moment sequence $z = (z_\alpha)_{\alpha \in \mathbb{N}^n_{2k}}$ satisfies $M_k(z) \succeq 0$ and $M_{k - d_j}(g_j z) \succeq 0$ for all $j = 1, \ldots, s$, and the Rank Condition 2.9 holds with $r := \mathrm{rank}\, M_k(z)$. Then $z$ has a unique $r$-atomic representing measure supported on $S$.
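The identity (8) and the rank-one structure of the moment matrix of a Zeta vector can be checked numerically in the univariate case. The following minimal Python sketch does so with exact rational arithmetic; the point $v = 1/2$, the constraint $g_1 = 1 - Y^2$, the test polynomial $q$ and all function names are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def zeta_vector(v, degree):
    """Moments of the Dirac measure at v: (1, v, v^2, ..., v^degree)."""
    return [v ** i for i in range(degree + 1)]


def moment_matrix(z, k):
    """M_k(z) with (i, j) entry z_{i+j}, 0 <= i, j <= k."""
    return [[z[i + j] for j in range(k + 1)] for i in range(k + 1)]


def localizing_matrix(g, z, k):
    """M_k(g z) with (i, j) entry sum_b g_b z_{i+j+b}; g is a coefficient list."""
    return [[sum(gb * z[i + j + b] for b, gb in enumerate(g))
             for j in range(k + 1)] for i in range(k + 1)]


def quad_form(q, M):
    """Evaluate q^T M q for a coefficient vector q."""
    return sum(q[i] * M[i][j] * q[j]
               for i in range(len(q)) for j in range(len(q)))


def evaluate(poly, v):
    """Evaluate a coefficient list (low degree first) at the point v."""
    return sum(c * v ** i for i, c in enumerate(poly))


v = F(1, 2)
g1 = [F(1), F(0), F(-1)]              # g_1(y) = 1 - y^2, so d_1 = 1
k = 2
z = zeta_vector(v, 2 * k)             # moments up to degree 2k = 4
M = moment_matrix(z, k)               # M_k(z)
L = localizing_matrix(g1, z, k - 1)   # M_{k - d_1}(g_1 z)

q = [F(3), F(-1)]                     # q(y) = 3 - y, degree <= k - d_1
lhs = quad_form(q, L)                 # q^T M_{k-d_1}(g_1 z) q
rhs = evaluate(g1, v) * evaluate(q, v) ** 2   # g_1(v) q(v)^2
```

Both sides evaluate to the same rational number, and every entry of `M` equals the product `z[i] * z[j]`, which is the rank-one structure behind $M_k(\zeta_{2k,v}) \succeq 0$.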

To end this section, let us recall a very interesting subclass of convex polynomials in $\mathbb{R}[Y]$ introduced by Helton and Nie (2010).

Definition 2.11. (Helton and Nie (2010)) A polynomial $h \in \mathbb{R}[Y]$ is s.o.s-convex if its Hessian $\nabla^2 h$ is an s.o.s, i.e., there is some integer $r$ and some matrix polynomial $H \in \mathbb{R}[Y]^{r \times n}$ such that $\nabla^2 h(Y) = H(Y)^T H(Y)$.

While checking the convexity of a polynomial is generally NP-hard (cf. Ahmadi et al. (2013)), s.o.s-convexity can be checked numerically by solving an SDP, see Helton and Nie (2010). The following result plays a significant role in this paper.

Lemma 2.12. (Helton and Nie, 2010, Lemma 8) Let $h \in \mathbb{R}[Y]$ be s.o.s-convex. If $h(u) = 0$ and $\nabla h(u) = 0$ for some $u \in \mathbb{R}^n$, then $h$ is s.o.s.

3. Approximate semidefinite representations of K

As we always assume in this paper that the index set $S$ in the definition of $K$ is compact, we first show that in the generic case a set $K$ with noncompact index set can be converted to the compact case.

3.1. Noncompact case

In this subsection, we consider the set $K$ in (1) with noncompact index set $S$. We use the homogenization technique proposed in Wang and Guo (2013) to convert a semi-infinite system (1) with a general noncompact index set into the compact case.

For a polynomial $g(Y) \in \mathbb{R}[Y]$, denote its homogenization by $g^{hom}(\tilde{Y}) \in \mathbb{R}[\tilde{Y}]$, where $\tilde{Y} = (Y_0, Y_1, \ldots, Y_n)$, i.e., $g^{hom}(\tilde{Y}) = Y_0^{\deg(g)}\, g(Y/Y_0)$. For the basic semialgebraic set $S$ in (2), define
$$\tilde{S}_{>} := \{\tilde{y} \in \mathbb{R}^{n+1} \mid g_1^{hom}(\tilde{y}) \ge 0, \ldots, g_s^{hom}(\tilde{y}) \ge 0,\ y_0 > 0,\ \|\tilde{y}\|^2 = 1\},$$
$$\tilde{S} := \{\tilde{y} \in \mathbb{R}^{n+1} \mid g_1^{hom}(\tilde{y}) \ge 0, \ldots, g_s^{hom}(\tilde{y}) \ge 0,\ y_0 \ge 0,\ \|\tilde{y}\|^2 = 1\}. \qquad (10)$$

Proposition 3.1. (Wang and Guo, 2013, Proposition 4.) For any $g(Y) \in \mathbb{R}[Y]$, $g(y) \ge 0$ on $S$ if and only if $g^{hom}(\tilde{y}) \ge 0$ on $\mathrm{closure}(\tilde{S}_{>})$.

Let $d_Y := \deg_Y(p(X, Y))$ and let $p^{hom}(X, \tilde{Y})$ be the homogenization of $p(X, Y)$ with respect to the variables $Y$. It follows that the set $K$ in (1) is equivalent to
$$\{x \in \mathbb{R}^m \mid p^{hom}(x, \tilde{y}) \ge 0,\ \forall \tilde{y} \in \mathrm{closure}(\tilde{S}_{>})\}.$$
Replacing $\mathrm{closure}(\tilde{S}_{>})$ by the basic semialgebraic set $\tilde{S}$, we get the following set
$$\tilde{K} := \{x \in \mathbb{R}^m \mid p^{hom}(x, \tilde{y}) \ge 0,\ \forall \tilde{y} \in \tilde{S}\}.$$
It is obvious that $\tilde{K} \subseteq K$ since $\mathrm{closure}(\tilde{S}_{>}) \subseteq \tilde{S}$.

Definition 3.2.
(Nie (2013)) $S$ is said to be closed at $\infty$ if $\mathrm{closure}(\tilde{S}_{>}) = \tilde{S}$.

Remark 3.3. Clearly, $\tilde{K} = K$ when $S$ is closed at $\infty$. Note that not every set $S$ of form (2) is closed at $\infty$ even when it is compact (Nie,, Example 5.). However, it is shown in (Wang and Guo, 2013, Theorem 4.) that closedness at $\infty$ is a generic property. Namely, if we consider the space of all coefficients of the generators $g_j$ of all possible sets of form (2) in the canonical monomial basis of $\mathbb{R}[Y]_d$ with $d = \max_j \deg(g_j)$, then the coefficients of the $g_j$'s of those sets which are not closed at $\infty$ lie in a Zariski closed set of the space. It follows that $\tilde{K} = K$ for general index sets $S$. Note that $\tilde{S}_{>}$ depends only on $S$, while $\tilde{S}$ depends not only on $S$ but also on the choice of the inequalities $g_1(y) \ge 0, \ldots, g_s(y) \ge 0$. In some cases, we can add redundant inequalities to the description of $S$ to force it to be closed at $\infty$ (cf. Guo et al. (2015)).
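The homogenization $g^{hom}(\tilde{Y}) = Y_0^{\deg(g)} g(Y/Y_0)$ used in this subsection amounts to padding every monomial with the power of $Y_0$ that raises it to the top degree. A minimal Python sketch, assuming a sparse dictionary representation of polynomials with nonzero coefficients (the representation, the example polynomial and the function names are illustrative assumptions):

```python
def homogenize(poly):
    """Homogenize a polynomial given as {(a_1, ..., a_n): coeff}.

    Returns {(a_0, a_1, ..., a_n): coeff} with a_0 = deg(g) - (a_1 + ... + a_n),
    which encodes g_hom(Y_0, Y) = Y_0^deg(g) * g(Y / Y_0).
    Assumes all stored coefficients are nonzero.
    """
    degree = max(sum(alpha) for alpha in poly)
    return {(degree - sum(alpha),) + alpha: c for alpha, c in poly.items()}


def evaluate(poly, point):
    """Evaluate a sparse polynomial at a point of matching dimension."""
    total = 0.0
    for alpha, c in poly.items():
        term = c
        for base, exp in zip(point, alpha):
            term *= base ** exp
        total += term
    return total


g = {(0, 0): 1, (2, 0): -1, (0, 1): 1}   # g(y) = 1 - y_1^2 + y_2
g_hom = homogenize(g)                    # y_0^2 - y_1^2 + y_0 * y_2
```

Setting $y_0 = 1$ recovers $g$, and scaling $\tilde{y}$ by $t$ scales $g^{hom}$ by $t^{\deg(g)}$, which is why the sign of $g^{hom}$ can be studied on the unit sphere as in (10).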

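For any polynomial $g(Y) \in \mathbb{R}[Y]$, denote by $\hat{g}(Y)$ its homogeneous part of the highest degree. Define
$$\hat{S} := \{y \in \mathbb{R}^n \mid \hat{g}_1(y) \ge 0, \ldots, \hat{g}_s(y) \ge 0,\ \|y\|^2 = 1\}. \qquad (11)$$
In particular, denote by $\hat{p}(X, Y)$ the homogeneous part of $p(X, Y)$ with respect to $Y$ of the highest degree $d_Y$.

Definition 3.4. We say that the extended Slater condition holds for $K$ if there exists a point $u \in K$ such that $p(u, y) > 0$ for all $y \in S$ and $\hat{p}(u, y) > 0$ for all $y \in \hat{S}$. We call $u$ an extended Slater point of $K$.

Proposition 3.5. The Slater condition holds for $\tilde{K}$ if and only if the extended Slater condition holds for $K$.

Proof. Suppose that $u$ is an extended Slater point of $K$. For any $\tilde{v} = (v_0, v) \in \tilde{S}$, we have $v \in \hat{S}$ if $v_0 = 0$ and $v/v_0 \in S$ otherwise. It is straightforward to verify that the Slater condition also holds for $\tilde{K}$ at $u$. Conversely, suppose that the Slater condition holds for $\tilde{K}$ at $u \in \mathbb{R}^m$. For any point $v \in \mathbb{R}^n$, we have $(0, v) \in \tilde{S}$ if $v \in \hat{S}$ and $\Big( \frac{1}{\sqrt{1 + \|v\|^2}}, \frac{v}{\sqrt{1 + \|v\|^2}} \Big) \in \tilde{S}$ if $v \in S$. Then similarly, it follows that the extended Slater condition holds for $K$ at $u$.

As a result of the above arguments, it is reasonable to make the following assumption in the rest of this paper.

Assumption 3.6. The set $S$ is compact, $-p(X, y) \in \mathbb{R}[X]$ is convex for any $y \in S$, and the Slater condition holds for $K$.

3.2. Approximate semidefinite representations of K

We assume that $K$ in (1) is compact and that a scalar $\tau_K$ such that $\|x\| \le \tau_K$ for any $x \in K$ is known. Define
$$\Theta_r(X) := \sum_{i=1}^{m} \Big( \frac{X_i}{\tau_K} \Big)^{2r} \in \mathbb{R}[X]$$
for any $r \in \mathbb{N}$. It is clear that $\Theta_r(x) \le 1$ for any $x \in K$ and $r \in \mathbb{N}$. For convenience, we write $p(X, Y) = \sum_\alpha p_{X,\alpha}(Y) X^\alpha = \sum_\beta p_{Y,\beta}(X) Y^\beta$, i.e., $p_{X,\alpha}(Y)$ and $p_{Y,\beta}(X)$ denote the coefficients of $X^\alpha$ and $Y^\beta$ in $p(X, Y)$ regarded as a polynomial in $\mathbb{R}[X]$ and $\mathbb{R}[Y]$, respectively. Denote by $B$ the unit ball in $\mathbb{R}^m$. Let $d_X = \deg_X(p(X, Y))$ and $d_Y = \deg_Y(p(X, Y))$. Recall the notation $d_S$ in (9) and the Riesz functional defined in (7). Let $d_K := \max\{\lceil d_Y/2 \rceil, d_S\}$.

Theorem 3.7. Suppose that $K$ is compact. For integers $r \ge \lceil d_X/2 \rceil$ and $t \ge d_K$, define
$$\Lambda_{r,t} := \left\{ x \in \mathbb{R}^m \ :\ 
\begin{array}{l}
\exists\, z = (z_\alpha)_{\alpha \in \mathbb{N}^m_{2r}} \in \mathbb{R}^{\mathbb{N}^m_{2r}},\ \sigma, \sigma_j \in \Sigma^2[Y],\ j = 1, \ldots, s,\ \text{s.t.}\\
z_0 = 1,\ M_r(z) \succeq 0,\ L_z(X_i) = x_i,\ i = 1, \ldots, m,\\
L_z(\Theta_k) \le 1,\ k = \lceil d_X/2 \rceil, \ldots, r,\\
\sum_\alpha p_{X,\alpha}(Y) L_z(X^\alpha) = \sigma + \sum_{j=1}^{s} \sigma_j g_j,\\
\deg(\sigma),\ \deg(\sigma_j g_j) \le 2t
\end{array}
\right\}. \qquad (12)$$
Then, $\Lambda_{r_2,t} \subseteq \Lambda_{r_1,t}$ for any $r_2 > r_1 \ge \lceil d_X/2 \rceil$ and $\Lambda_{r,t_1} \subseteq \Lambda_{r,t_2}$ for any $t_2 > t_1 \ge d_K$. If Assumption 3.6 holds, then the following are true.
(i) For any $\varepsilon > 0$, there exists an integer $r(\varepsilon) \ge \lceil d_X/2 \rceil$ such that for every $r \ge r(\varepsilon)$ and $t \ge d_K$, it holds that $\Lambda_{r,t} \subseteq K + \varepsilon B$. If $Q(G)$ is Archimedean, then there exists an integer $t(\varepsilon) \ge d_K$ such that for every $r \ge \lceil d_X/2 \rceil$ and $t \ge t(\varepsilon)$, it holds that $K \subseteq \Lambda_{r,t} + \varepsilon B$. Consequently, $\Lambda_{r,t}$ converges to $K$ as $r$ and $t$ both tend to $\infty$;
(ii) If the Lagrangian $L_f(X)$ as defined in (5) is s.o.s for every linear $f \in \mathbb{R}[X]$, then $K \supseteq \Lambda_{r,t_2} \supseteq \Lambda_{r,t_1}$ for any $r \ge \lceil d_X/2 \rceil$, $t_2 > t_1 \ge d_K$. If, moreover, $Q(G)$ is Archimedean, then for any $\varepsilon > 0$ there exists an integer $t(\varepsilon) \ge d_K$ such that $K \subseteq \Lambda_{r,t} + \varepsilon B$ for any $r \ge \lceil d_X/2 \rceil$, $t \ge t(\varepsilon)$. Consequently, $\Lambda_{r,t}$ converges to $K$ as $t$ tends to $\infty$ for any $r \ge \lceil d_X/2 \rceil$.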

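Proof. For a fixed $x \in \Lambda_{r_2,t}$, there exist $z = (z_\alpha)_{\alpha \in \mathbb{N}^m_{2r_2}} \in \mathbb{R}^{\mathbb{N}^m_{2r_2}}$ and $\sigma, \sigma_j \in \Sigma^2[Y]$ satisfying the conditions in (12) for $\Lambda_{r_2,t}$. Let $z' = (z_\alpha)_{\alpha \in \mathbb{N}^m_{2r_1}}$ be the truncation of $z$. Then, it is clear that $z', \sigma, \sigma_j$ satisfy all conditions in (12) for $\Lambda_{r_1,t}$ and thus $x \in \Lambda_{r_1,t}$. Similarly, if $x \in \Lambda_{r,t_1}$, then $x \in \Lambda_{r,t_2}$ for any $t_2 > t_1 \ge d_K$.

(i). Fix an $\varepsilon > 0$ and a point $v \notin K + \varepsilon B$. We prove that there is some integer $r(\varepsilon)$ that does not depend on $v$ such that $v \notin \Lambda_{r,t}$ for every $r \ge r(\varepsilon)$ and $t \ge d_K$, which implies that $\Lambda_{r,t} \subseteq K + \varepsilon B$. By (Lasserre, 2009b, Lemma 5), there exist $a \in \mathbb{R}^m$ and $b = \min_{x \in K} a^T x$ satisfying $\|a\| = 1$ and $|b| \le \tau_K$ such that $a^T x - b \ge 0$ for any $x \in K$ and $a^T v - b < -\varepsilon$. Consider the optimization problem $\min_{x \in K} a^T x - b$. By Corollary 2.2, the associated Lagrangian $L_{a,b}(X) := a^T X - b - \sum_{j=1}^{l} \lambda_j p(X, y_j)$ as defined in (5) is nonnegative on $\mathbb{R}^m$ for some $y_1, \ldots, y_l \in S$ and nonnegative $\lambda_1, \ldots, \lambda_l \in \mathbb{R}$. In particular, $L_{a,b}$ is nonnegative on $[-\tau_K, \tau_K]^m$. By Theorem 2.3 (iii), there is some integer $r(\varepsilon) \ge \lceil d_X/2 \rceil$ such that for any $r \ge r(\varepsilon)$, it holds that
$$a^T X - b + \frac{\varepsilon}{2}(1 + \Theta_r) = \tilde{\sigma} + \sum_{j=1}^{l} \lambda_j p(X, y_j) \qquad (13)$$
for some $\tilde{\sigma} \in \Sigma^2[X]$. As $r \ge r(\varepsilon) \ge \lceil d_X/2 \rceil$, we have $\deg(\tilde{\sigma}) \le 2r$.

Now we show that $r(\varepsilon)$ does not depend on $v$. According to (Lasserre and Netzer, 2007, Sec. 3.3), $r(\varepsilon)$ depends on $\varepsilon$, the dimension $m$ and the size of $a$, $b$, the $\lambda_j$'s and the coefficients of $p(X, y_j)$ regarded as polynomials in $\mathbb{R}[X]$. Fix a Slater point $u \in K$. Since $a^T u - b - \sum_{j=1}^{l} \lambda_j p(u, y_j) \ge 0$, as proved in (Lasserre, 2009b, Lemma 7), we have
$$\lambda_j \le \frac{a^T u - b}{p(u, y_j)} \le \frac{2\tau_K}{p(u, y_j)} \le \frac{2\tau_K}{\min_{j=1,\ldots,l} p(u, y_j)} \le \frac{2\tau_K}{\min_{y \in S} p(u, y)} = \frac{2\tau_K}{p^*_u},$$
where $p^*_u := \min_{y \in S} p(u, y) > 0$ since $u$ is a Slater point and $S$ is compact. Write $p(X, y_j) = \sum_\alpha p_{X,\alpha}(y_j) X^\alpha$; then $|p_{X,\alpha}(y_j)| \le \max_\alpha \max_{y \in S} |p_{X,\alpha}(y)|$. Hence, $a$, $b$, the $\lambda_j$'s and the $p_{X,\alpha}(y_j)$'s are all uniformly bounded, which means that $r(\varepsilon)$ does not depend on $v$.

For any $r \ge r(\varepsilon)$ and $t \ge d_K$, assume to the contrary that $v \in \Lambda_{r,t}$. Then, there exist $z$, $\sigma$ and $\sigma_j$'s satisfying the conditions in (12) for $\Lambda_{r,t}$. Let $\mu = \sum_{j=1}^{l} \lambda_j \delta_{y_j}$, where $\delta_{y_j}$ denotes the Dirac measure at $y_j$.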
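As $\deg(\tilde{\sigma}) \le 2r$, it holds that
$$\begin{aligned}
0 > a^T v - b + \varepsilon &= L_z(a^T X - b) + \varepsilon \ \ge\ L_z(a^T X - b) + \frac{\varepsilon}{2} L_z(1 + \Theta_r)\\
&= L_z\Big( \tilde{\sigma} + \int p(X, y)\, d\mu(y) \Big) = L_z(\tilde{\sigma}) + \int \sum_\alpha p_{X,\alpha}(y) L_z(X^\alpha)\, d\mu(y)\\
&= L_z(\tilde{\sigma}) + \int \Big( \sigma + \sum_{j=1}^{s} \sigma_j g_j \Big) d\mu(y) \ \ge\ 0,
\end{aligned} \qquad (14)$$
which is a contradiction. Thus, $v \notin \Lambda_{r,t}$ and $\Lambda_{r,t} \subseteq K + \varepsilon B$.

Fix a Slater point $u^* \in K$ and let $u \in K$ be arbitrary. We first prove that there exist a point $\bar{u} \in \mathbb{R}^m$ and an integer $t(\varepsilon)$ that does not depend on $u$ (in fact, it depends on $\varepsilon$, $K$, $S$, $u^*$, $p(X, Y)$ and the $g_j$'s) such that $\|u - \bar{u}\| \le \varepsilon$ and $\bar{u} \in \Lambda_{r,t}$ for every $r \ge \lceil d_X/2 \rceil$ and $t \ge t(\varepsilon)$, which implies that $K \subseteq \Lambda_{r,t} + \varepsilon B$. If $\|u - u^*\| \le \varepsilon$, then let $\bar{u} = u^*$; otherwise, let $\lambda = \varepsilon / \|u - u^*\|$ and $\bar{u} = \lambda u^* + (1 - \lambda) u$. Then we have $1 > \lambda \ge \frac{\varepsilon}{2\tau_K}$, $\|u - \bar{u}\| = \lambda \|u - u^*\| = \varepsilon$ and
$$\begin{aligned}
p(\bar{u}, y) &\ge \lambda p(u^*, y) + (1 - \lambda) p(u, y) && [\text{as } -p(X, y) \text{ is convex in } X]\\
&\ge \lambda p(u^*, y). && [\text{as } u \in K]
\end{aligned}$$
Let $\kappa(\varepsilon) := \min\{\frac{\varepsilon}{2\tau_K}, 1\}$. Then, in either case, it follows that
$$p(\bar{u}, y) \ge \kappa(\varepsilon)\, p(u^*, y) \ge \kappa(\varepsilon)\, p^*_{u^*} > 0$$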

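for any $y \in S$. Write $p(\bar{u}, Y) = \sum_\beta p_{Y,\beta}(\bar{u}) Y^\beta \in \mathbb{R}[Y]$. Recalling the norm defined in (6), we have
$$\|p(\bar{u}, Y)\| = \max_\beta \frac{|p_{Y,\beta}(\bar{u})|}{\binom{|\beta|}{\beta}} \le \max_\beta \frac{\max_{x \in K} |p_{Y,\beta}(x)|}{\binom{|\beta|}{\beta}} =: N_p.$$
As $K$ is compact, $N_p$ is well-defined. Note that $N_p$ does not depend on $u$ but only on $p$ and $K$. By Theorem 2.6, there exists some positive $c$ depending on the $g_j$'s such that $p(\bar{u}, Y) \in Q_t(G)$ whenever
$$t \ge c \exp\left[ \left( d_Y^2\, n^{d_Y} \frac{N_p\, \tau_S^{d_Y}}{\kappa(\varepsilon)\, p^*_{u^*}} \right)^{c} \right] =: t(\varepsilon).$$
For any $r \ge \lceil d_X/2 \rceil$, let $\zeta_{2r,\bar{u}}$ be the Zeta vector of $\bar{u}$ up to degree $2r$. Then, it is clear that $L_{\zeta_{2r,\bar{u}}}(X_i) = \bar{u}_i$ for $i = 1, \ldots, m$, $L_{\zeta_{2r,\bar{u}}}(\Theta_k) \le 1$ for $k = \lceil d_X/2 \rceil, \ldots, r$ and $M_r(\zeta_{2r,\bar{u}}) \succeq 0$. We have $\sum_\alpha p_{X,\alpha}(Y) L_{\zeta_{2r,\bar{u}}}(X^\alpha) = p(\bar{u}, Y)$. It follows that $\bar{u} \in \Lambda_{r,t}$ and thus $K \subseteq \Lambda_{r,t} + \varepsilon B$ for every $r \ge \lceil d_X/2 \rceil$ and $t \ge t(\varepsilon)$.

(ii). By (i), we only need to prove $\Lambda_{r,t} \subseteq K$ for any $r \ge \lceil d_X/2 \rceil$ and $t \ge d_K$. Fix a point $v \notin K$. By the Separation Theorem for convex sets, there exist $a \in \mathbb{R}^m$ and $b \in \mathbb{R}$ such that $a^T x - b \ge 0$ for any $x \in K$ and $a^T v - b < 0$. As proved in (i), there are some $y_1, \ldots, y_l \in S$ and nonnegative $\lambda_1, \ldots, \lambda_l \in \mathbb{R}$ such that $a^T X - b - \sum_{j=1}^{l} \lambda_j p(X, y_j)$ is nonnegative on $\mathbb{R}^m$. Since the associated Lagrangian $L_f(X)$ is s.o.s for every linear function $f$, we have
$$a^T X - b = \tilde{\sigma} + \sum_{j=1}^{l} \lambda_j p(X, y_j) \qquad (15)$$
for some $\tilde{\sigma} \in \Sigma^2[X]$. Assume to the contrary that $v \in \Lambda_{r,t}$. Then, there exist $z$, $\sigma$ and $\sigma_j$'s satisfying the conditions in (12) for $\Lambda_{r,t}$. Define $\mu$ as in (i). As in (14), we get
$$0 > a^T v - b = L_z(a^T X - b) = L_z(\tilde{\sigma}) + \int \Big( \sigma + \sum_{j=1}^{s} \sigma_j g_j \Big) d\mu(y) \ge 0, \qquad (16)$$
which is a contradiction. Thus, $v \notin \Lambda_{r,t}$ and hence $\Lambda_{r,t} \subseteq K$.

Remark 3.8. According to the proof, the conclusions (i) and (ii) in Theorem 3.7 remain true if we replace the condition $L_z(\Theta_k) \le 1$, $k = \lceil d_X/2 \rceil, \ldots, r$ in (12) by the single condition $L_z(\Theta_r) \le 1$. According to the proof of Theorem 3.7 (i), the equation
$$\sum_\alpha p_{X,\alpha}(Y) L_z(X^\alpha) = \sigma + \sum_{j=1}^{s} \sigma_j g_j \qquad (17)$$
in the definition of $\Lambda_{r,t}$ in (12) can be replaced by other representations of positive (nonnegative) polynomials if certain assumptions hold.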
For instance, if $S$ is compact but $Q(G)$ is not Archimedean, we can use Schmüdgen's Positivstellensatz (Schmüdgen (1991)) in the definition of $\Lambda_{r,t}$ to obtain the same results as in Theorem 3.7.

Now we consider the case when $n = 1$ and $S$ is a bounded interval. By some representation results of nonnegative polynomials in the univariate case, we shall see that analogous approximate semidefinite representations of $K$ as in Theorem 3.7 can be obtained with some fixed order $t$. Without loss of generality, we can assume that $S = [-1, 1]$. Let

$$[-1, 1] = \{y_1 \in \mathbb{R} \mid g_1(y_1) \ge 0\}, \quad \text{where } g_1(Y_1) = 1 - Y_1^2. \tag{18}$$

Recall the well-known result

Theorem 3.9. (c.f. Powers and Reznick (2000); Laurent (2009)) Let $h \in \mathbb{R}[Y_1]$ and $h \ge 0$ on $[-1, 1]$, then $h = \sigma_0 + \sigma_1 (1 - Y_1^2)$ where $\sigma_0, \sigma_1 \in \Sigma^2[Y_1]$ and $\deg(\sigma_0), \deg(\sigma_1 (1 - Y_1^2)) \le 2\lceil \deg(h)/2 \rceil$.
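As a small sanity check of the kind of certificate Theorem 3.9 provides, the following sketch verifies one hand-picked instance in exact arithmetic. It assumes the interval is $[-1, 1]$ with $g_1(Y_1) = 1 - Y_1^2$; the polynomial $h(Y) = 1 - Y$, the multipliers $\sigma_0 = \tfrac{1}{2}(1 - Y)^2$ and $\sigma_1 = \tfrac{1}{2}$, and all helper names are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import ceil

def poly_add(p, q):
    """Add two polynomials stored as coefficient lists, lowest degree first."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mul(p, q):
    """Multiply two polynomials stored as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def trim(p):
    """Drop trailing zero coefficients so that len(p) - 1 is the degree."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

half = Fraction(1, 2)
h = [1, -1]                # h(Y) = 1 - Y, nonnegative on [-1, 1]
g1 = [1, 0, -1]            # g_1(Y) = 1 - Y^2 (assumed description of [-1, 1])
one_minus_y = [1, -1]

# Hypothetical certificate: sigma0 = (1/2)(1 - Y)^2 and sigma1 = 1/2,
# both of which are sums of squares.
sigma0 = poly_mul([half], poly_mul(one_minus_y, one_minus_y))
sigma1 = [half]

# Right-hand side sigma0 + sigma1 * g1 and the degree bound 2 * ceil(deg(h) / 2).
rhs = trim(poly_add(sigma0, poly_mul(sigma1, g1)))
degree_bound = 2 * ceil((len(h) - 1) / 2)
```

Here `rhs` reproduces the coefficients of `h`, and both summands have degree 2, which matches the bound for a polynomial of odd degree 1. Finding such multipliers for a general `h` is what the semidefinite program does; this sketch only checks a given pair.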

Theorem 3.. Assume that is in the case of (8) and K is compact. Let t = d K and consider the sets Λ r,t in () for r d X /. Then, K Λ r,t Λ r,t for any r > r d X /. uppose that Assumption 3.6 holds, then the followings are true. (i) For any ε >, there exists an integer r(ε) d X / such that Λ r,t K+εB holds for every r r(ε). Consequently, Λ r,t converges to K as r tends to ; (ii) If the Lagrangian L f (X) as defined in (5) is s.o.s for every linear f R[X]. then K = Λ r,t r d X /. for any Proof. For any u K, let ζ r,u be the Zeta vector of u of degree up to r. By Theorem 3.9, there exists σ, σ Σ [Y ] such that ζ r,u, σ, σ satisfy the conditions in the definition of Λ r,t in (). Hence, K Λ r,t for any r d X /. (i) ee the first part of the proof of Theorem 3.7 (i); (ii) It is clear since Λ r,t K for any r d X / and t d K by the proof of Theorem 3.7 (ii). Note that Λ r,t in () is indeed a semidefinite representation set of the form (3) for every r d X / and t d K. In fact, for any t d K, let m t (Y ) be the column vector consisting of all the monomials in Y of degree up to t. Let s(t) = ( ) n+t n which is the dimension of mt (Y ). Recall the definitions in (9). There exist positive semidefinite matrices Z R s(t) s(t), Z j R s(t dj) s(t dj), j =,..., s, such that σ(y ) = m t (Y ) T Z m t (Y ), σ j (Y ) = m t dj (Y ) T Z j m t dj (Y ), j =,..., s. For each β N n t, we can find symmetric matrices C β, C j,β R s(t dj) s(t dj), j =,..., s, such that the coefficients of Y β in σ and σ j g j are equal to Z, C β and Z j, C j,β, j =,,..., s, respectively. Write p(x, Y ) = β N p n Y,β (X)Y β and let t E β = L z (p Y,β (X)) Z, C β s Z j, C j,β for each β N n t. Then, All E β are linear in z, Z and Z j s. For each i =,..., m, let e i be the vector whose i-th component is and the others are. Denote by M(z, Z, Z,..., Z s ) the block diagonal matrix whose diagonal elements are z, z, M r (z), Z, Z,..., Z s, L z (Θ k ), k = d X /,..., r, E β, E β, β N m t. 
Then, we have the semidefinite representation

$$\Lambda_{r,t} = \{(z_{e_1}, \ldots, z_{e_m}) \in \mathbb{R}^m \mid M(z, Z_0, Z_1, \ldots, Z_s) \succeq 0\}.$$

Note that the matrix $M(z, Z_0, Z_1, \ldots, Z_s)$ can be easily generated using Yalmip (Löfberg (2004)). For $m = 2$ and $3$, we can first generate $M(z, Z_0, Z_1, \ldots, Z_s)$ and then use the software package Bermeja (Rostalski (2010)) to draw the projected spectrahedron $\Lambda_{r,t}$.

Recall Theorem 3.7 (ii). We now strengthen Assumption 3.6 to

Assumption 3.11. The set $S$ is compact, $p(X, y) \in \mathbb{R}[X]$ is s.o.s-convex for any $y \in S$ and the Slater condition holds for $K$.

Lemma 3.12. Suppose that Assumption 3.11 holds for $K$, then for any s.o.s-convex $f \in \mathbb{R}[X]$, the Lagrangian $L_f(X)$ defined in (5) is s.o.s.

Proof. A Frank-Wolfe type theorem proved in Belousov (1977) states that the discretization problem (4) has a minimizer $u \in \mathbb{R}^m$ even when the feasible set of (4) is noncompact. Since $L_f(X)$ is s.o.s-convex and

L f (x) = L f (u) for any x R m, the conclusion follows from KKT optimality conditions for (4) and Lemma.. Consequently, if Assumption 3. holds, then L f (X) is s.o.s for every linear function f. In this case, instead of the sets Λ r,t in (), we can define z = (z ) N m dx R Nm d X, σ, σ j Σ [Y ], j =,..., s, Λ t := x R m s.t. z =, M dx / (z), L z (X i ) = x i, i =,..., m, : (9) s p X, (Y )L z (X ) = σ + σ j g j, deg(σ), deg(σ j g j ) t. and get the same results as in Theorem 3.7. That is, Theorem 3.3. uppose that Assumption 3. holds. Then for Λ t defined in (9), the followings are true. (i) K Λ t Λ t for any t > t d K ; (ii) If K is compact and Q(G) is Archimedean, then for any ε >, there exists integer t(ε) d K such that for every t t(ε), it holds that K Λ t + εb. Consequently, Λ t converges to K as t tends to. (iii) If is in the case of (8), then K = Λ t where t = d K. Proof. (i) Recall the proof of Theorem 3.7 (ii). Note that to show Λ r,t K for any r d X / and t d K, the constraints L z (Θ k ) in definition () are redundant. Moreover, it is clear that σ Σ[X] in (5) is of degree d X /. Hence, we can set r = d X / and define Λ t as in (9) to obtain (i); (ii) ee the proof of Theorem 3.7 (ii); (iii) imilar to the proof of Theorem 3., it is clear that K Λ t. Combining (i), K = Λ t follows. Corollary 3.4. Assume that the set is compact, p(x, y) R[X] is linear in X for any y and the later condition holds for K. For integer t d K, define σ, σ j Σ [Y ], j =,..., s, Λ t := x R m : s. () s.t. p(x, Y ) = σ + σ j g j, deg(σ), deg(σ j g j ) t. Then, the statements in Theorem 3.3 hold. Proof. Clearly, Assumption 3. holds. ince p(x, y) R[X] is linear in X for any y, it is easy to see that the sets Λ t defined in (9) and () are equal in this case. Remark 3.5. Note that we do not require K to be compact in Theorem 3.3 (i),(iii) and Corollary 3.4. 3.3. Illustrating examples Now we present some illustrating examples. 
As we shall see, the approximate semidefinite representations defined in this section are very tight for some given sets $K$.

Example 3.16. Consider the polynomial

$$\begin{aligned} f(X_1, X_2, X_3) ={}& 32X_1^8 + 118X_1^6X_2^2 + 40X_1^6X_3^2 + 25X_1^4X_2^4 - 43X_1^4X_2^2X_3^2 - 35X_1^4X_3^4 + 3X_1^2X_2^4X_3^2 \\ & - 16X_1^2X_2^2X_3^4 + 24X_1^2X_3^6 + 16X_2^8 + 44X_2^6X_3^2 + 70X_2^4X_3^4 + 60X_2^2X_3^6 + 30X_3^8. \end{aligned}$$

It is proved in Ahmadi and Parrilo (2012) that $f(X_1, X_2, 1) \in \mathbb{R}[X_1, X_2]$ is convex but not s.o.s-convex. Rotate the shape in the $(x_1, x_2)$-plane defined by $f(x_1, x_2, 1)$ continuously around the origin by $90^\circ$

Figure : The set K (left) and the semidefinite representation set Λ 4,4 (right) in Example 3.6..5.5.5.5 x x -.5 -.5 - - -.5 -.5 - - -.5 - -.5.5.5 x - - -.5 - -.5.5.5 x clockwise. Denote by K the common area of these shapes in this process. We illustrate K in the left of Figure by making a discrete rotation. In other words, the set K is defined by K = {(x, x ) R p(x, x, y, y ), where p(x, X, Y, Y ) = f(y X Y X, Y X + Y X, ) and y }, = {(y, y ) R y, y, y + y = }. It is clear that the assumptions in Theorem 3.7 holds for K and d X = d Y = 8, d K = 4. By the software Bermeja, the semidefinite representation set Λ 4,4 as defined in () is drawn in gray bounded by the red curve in the right of Figure. Example 3.7. Consider the set K = {(x, x ) R p(x, x, y, y ), y } where p(x, X, Y, Y ) = X Y X X Y X X X and = {(y, y ) R y, / y /, y y }. We illustrate K in the left of Figure by using some grid of. The Hessian matrix of p with respect to X and X is [ ] Y H = with det(h) = 4Y Y Y 4Y. Clearly, p(x, X, y, y ) is s.o.s-convex in (X, X ) for every y. We have d X =, d Y = and d K =. The semidefinite representation set Λ as defined in (9) is drawn in gray bounded by the red curve in the right of Figure. Example 3.8. Consider the ellipse which can be represented by where K = {(x, x ) R x + x + x x + x } {(x, x ) R p(x, x, y), y } p(x, X, Y ) = ( Y 4 Y 3 + 3Y + Y )X Y (Y )X + Y and = [, ] (ee Goberna and López (998)). As p(x, X, Y ) is linear in X and is an interval, we have K = Λ where Λ is defined as in () by Corollary 3.4. The set K and semidefinite representation set Λ are illustrated in Figure 3.

Figure 2: The set K (left) and the semidefinite representation set Λ (right) in Example 3.17. (Axes: $x_1$, $x_2$.)

Figure 3: The set K (left) and the semidefinite representation set Λ (right) in Example 3.18. (Axes: $x_1$, $x_2$.)
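To experiment with the first example of this section, the degree-8 polynomial of Ahmadi and Parrilo can be evaluated directly. The sketch below is only a numerical sanity check under stated assumptions: the coefficient table is a reading of the source that agrees with the published polynomial, and the names `COEFFS`, `f` and `midpoint_violations` are hypothetical. It samples random pairs of points and counts violations of the midpoint inequality $f((u+v)/2) \le (f(u)+f(v))/2$ for the dehomogenized polynomial $f(x_1, x_2, 1)$. Passing this check is consistent with convexity but proves nothing, and it says nothing about s.o.s-convexity.

```python
import random

# Exponents (a1, a2, a3) -> coefficient of X1^a1 * X2^a2 * X3^a3.
# Assumed reading of the polynomial in the first example of Section 3.3.
COEFFS = {
    (8, 0, 0): 32, (6, 2, 0): 118, (6, 0, 2): 40, (4, 4, 0): 25,
    (4, 2, 2): -43, (4, 0, 4): -35, (2, 4, 2): 3, (2, 2, 4): -16,
    (2, 0, 6): 24, (0, 8, 0): 16, (0, 6, 2): 44, (0, 4, 4): 70,
    (0, 2, 6): 60, (0, 0, 8): 30,
}

def f(x1, x2, x3=1.0):
    """Evaluate the form; with the default x3 = 1 this is f(x1, x2, 1)."""
    return sum(c * x1**a * x2**b * x3**d for (a, b, d), c in COEFFS.items())

def midpoint_violations(trials=2000, seed=0, box=2.0):
    """Count sampled pairs (u, v) with f(mid) > (f(u) + f(v)) / 2."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        u = (rng.uniform(-box, box), rng.uniform(-box, box))
        v = (rng.uniform(-box, box), rng.uniform(-box, box))
        mid = f((u[0] + v[0]) / 2, (u[1] + v[1]) / 2)
        avg = (f(*u) + f(*v)) / 2
        # Small relative tolerance to absorb floating-point rounding.
        if mid > avg + 1e-9 * (1 + abs(avg)):
            bad += 1
    return bad
```

For example, `f(1, 1, 1)` returns the sum of all coefficients, and `midpoint_violations()` can be rerun with a different `seed` or `box` to probe other regions of the plane.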

4. DP relaxations of convex semi-infinite polynomial programming For a convex polynomial f(x) R[X], consider the following convex semi-infinite polynomial programming problem (P) f := inf f(x) where K is defined in (). x K Let d P := max{deg(f), d X } and M () be the set of all (nonnagetive) Borel measures supported on. 4.. General case Consider the case when K is compact and Assumption 3.6 holds. Recall the Riesz funciton defined in (7). For any integer r d P /, we first convert (P) to the problem fr := sup ρ η ρ,η,µ,σ (P r ) s.t. f(x) ρ + η( + Θ r ) = p(x, y)dµ(y) + σ, () ρ R, η, µ M (), σ Σ [X] R[X] r, and its dual (P r) inf L z (f) z s.t. z =, L z (Θ r ), M r (z), p X, (y)l z (X ), y, z = (z ) N m r R Nm r. () Definition 4.. We call z (r) (r d P ) a nearly optimal solution of () if z (r) is feasible for () and lim r L z (r)(f) = lim r inf P r. Theorem 4.. uppose that f(x) is convex, K is compact and Assumption 3.6 holds. Let z (r) be a nearly optimal solution of () and ẑ (r) = {z (r) = }. (i) f r converges to f as r tends to ; (ii) f r is attainable in () and there is no dual gap between () and (); (iii) Assume that τ K = (possibly after scaling). Then, for any convergent subsequence {ẑ (ri) } of {ẑ (r) }, lim i ẑ (ri) is a minimizer of (P). Consequently, if x is the unique minimizer of (P), then lim r ẑ (r) = x ; (iv) If moreover, the Lagrangian L f (X) as defined in (5) is s.o.s, then f r = f for any r d P / and it is also attainable in (). Proof. (i) For any x K and y, we have Θ r (x) and p(x, y). Consequently, for any feasible point (ρ, η, µ, σ) of (), it holds that f(x) = ρ η( + Θ r (x)) + p(x, y)dµ(y) + σ(x) which implies that f r f. ρ η( + Θ r (x)) ρ η, 4

Conversely, by Corollary., there exist some y,..., y l and nonnegative Lagrange multipliers λ,..., λ l R such that f(x) f l j= λ j p(x, y l ) = f(x) f p(x, y)dµ(y), x R m, (3) where µ = l j= λ jδ yl M () and δ yl is the Dirac measure at y l. For any fixed r N with r d P, by Theorem.3 (i), there exists a ε r such that f(x) f p(x, y)dµ(y) + η( + Θ r ) Σ [X] R[X] r (4) if and only if η ε r. It means that () is feasible and fr f ε r. Moreover, by Theorem.3 (ii), ε r decreasingly converges to as r tends to. It then follows that fr converges to f as r tends to. (ii) Fix a later point u of K. ince is compact, there exists a neighborhood O u of u such that every point in O u is a later point of K. Let ν be the probability measure with uniform distribution in O u and set z = (z ) N m r where z = X dν. It is easy to see that z is strictly admissible for (). The conclusion follows due to the duality theorey in convex optimization. (iii) For any r d P, as τ K = and L z (r)(θ r ), it is clear that L z (r)(xi r ) for all i =,..., n. ince z (r) = and M r (z (r) ), we then deduce that z (r) for any r by (Lasserre and Netzer, 7, Lemma 4. and 4.3). Complete each z (r) with zeros to make it an infinite vector in R Nm indexed in the basis {X N m }. Then, it holds that {z (r) } [, ] Nm. Let {ẑ (ri) } be a convergent subsequence of {ẑ (r) }. By Tychonoff s theorem, there exists a convergent subsequence of the corresponding {z (ri) } in the product topology. Without loss of generality, we assume that the whole sequence {z (ri) } converges as i. That is, there exists z [, ] Nm such that lim i z (ri) = z holds for all N m. From the pointwise convergence, we have L z (h) for all h Σ [X]. As z [, ] Nm, by Theorem.7, z has exactly one representing measure ν with support contained in [, ] Nm. ince z (r) is nearly optimal solution of (), we obtain fdν(x) = f by (i) and (ii). Denote ( ) ẑ := X dν(x),..., X m dν(x). Then, lim i ẑ (ri) = ẑ. 
For any ε >, from the proof of Theorem 3.7 (i) and Remark 3.8, it is easy to see that there exists an integer r(ε) such that ẑ (ri) K + εb whenever r i r(ε). By the pointwise convergence, we deduce that ẑ K. Then, since f is convex, by Jensen s inequality, f f(ẑ ) fdν(x) = f. Hence, ẑ is indeed a minimizer of (). Assume that x is the unique minimizer of (). We have shown that {ẑ (r) } is contained in [, ] m and lim i ẑ (ri) = x for any convergent subsequence {ẑ (ri) }, therefore the whole sequence {ẑ (r) } converges to x. (iv) Under the assumption, (4) holds with η = and any r d P /. Hence, fr = f for any r d P / by the proof of (i). As K is compact, suppose that f is attainable in (P) at a minimizer x K. Let z R Nm r be the truncated moment sequence associated with the Dirac measure at x, then fr = f is attainable in () at z. Recall the definition of d j in (9). For any t d K, consider the DP relaxation of () fr,t psdp := sup ρ η ρ,η,w,σ s.t. f(x) ρ + η( + Θ r ) = p Y,β (X)w β + σ, β M t (w), M t dj (g j w), j =,..., s, ρ R, η, w = (w β ) β N n t, σ Σ [X] R[X] r. 5 (5)

Its dual is fr,t dsdp := inf L z (f) z,σ,σ j s.t. z =, L z (Θ r ), M r (z), s p X, (Y )L z (X ) = σ + σ j g j, σ, σ j Σ [Y ], j= z = (z ) N m r R Nm r, deg(σ), deg(σj g j ) t. (6) Theorem 4.3. For any integer r d P /, the followings are true. (i) If Q(G) is Archimedean and the later condition holds for K, then fr,t psdp to fr as t tends to ; and f dsdp r,t decreasingly converge (ii) For some order t d K, if Rank Condition.9 holds for w in the solution (ρ, η, w, σ ) of (5), then f psdp r,t = f r ; (iii) If is in the case of (8), then f psdp r,t = f dsdp r,t = f r where t = d K. Proof. (i) For any feasible point (ρ, η, µ, σ) of (), let w = (w β ) β N n t where w = Y β dµ, then (ρ, η, w, σ) is feasible for (5) and hence fr,t psdp fr for any t d K. Then by the weak duality and Theorem 4., we have fr fr,t psdp fr,t dsdp for any t d K. It is sufficient to prove that lim t fr,t dsdp = fr. Fixing an arbitrary ε >, we show that there is some t d K such that fr,t dsdp fr ε. Fix a later point u of K and let z = (z ) N m r where z = u. Then z is feasible for (6) for some t d K by Putinar s Positivstellensatz. If f z fr ε, then f dsdp r,t f r ε. Next, we assume that f z fr > ε. Then, we can choose another feasible point z of () such that f z f z > and f z fr ε/. Let δ := ε f (z z ) and ẑ = ( δ) z + δz. Then, we have < δ < and hence p X, (y)ẑ = ( δ) p X, (y) z + δ p X, (y)z >, y. Hence, ẑ is feasible for (6) for some ˆt d K by Putinar s Positivstellensatz. We have f dsdp r,ˆt f r f ẑ f r = ( δ) f z + δ f z f r = f z f r + δ f (z z ) ε + ε = ε. As ε is arbitrary, the conclusion follows. (ii) uppose that Rank Condition.9 holds for w in the solution (ρ, η, w, σ ) of (5) at some order t d K. Then, by Theorem., w admits some measure µ M (), i.e., wβ = Y β dµ for all β N n t. As fr fr,t psdp and (ρ, η, µ, σ ) is feasible for (), we conclude that fr = fr,t psdp. (iii) By the proof of (i), the conclusion follows due to Theorem 3.9 and Theorem 4. (ii). Corollary 4.4. 
Suppose $f(X)$ is convex, $K$ is compact and Assumption 3.6 holds. Then, for any $\varepsilon > 0$, the following are true.

(i) There exists a r(ε) N such that fr,t dsdp fr,t psdp f ε holds for any r r(ε) and t d K ; (ii) If Q(G) is Archimedean, then for any r d P /, there exists a t(r, ε) N such that fr,t psdp f + ε holds for any t t(r, ε); (iii) If is in the case of (8), then lim r f psdp r,t = lim r f dsdp r,t = f, where t = d K. Proof. (i) It is clear that fr fr,t psdp fr,t dsdp there exists a r(ε) N such that fr f ε holds for any r r(ε). Thus, (i) follows. f dsdp r,t holds for any r d P / and t d K. By Theorem 4. (i), (ii) Due to Theorem 4.3 (i), for any r d P /, there exists a t(r, ε) N such that f psdp r,t holds for any t t(r, ε). Then (ii) follows since f r f for any r d P / by Theorem 4. (i). (iii) It is clear by Theorem 4. (i) and Theorem 4.3 (iii). f dsdp r,t f r +ε Remark 4.5. (). Corollary 4.4 shows that we can approximate f by fr,t psdp and fr,t dsdp as closely as possible with r and t both large enough; (). Assume that τ K =. By Theorem 4.3 (i), for any r d P, there exists t(r) N such that f dsdp r,t(r) f r + /r. Denote by (z (r,t(r)), σ (r,t(r)), σ (r,t(r)) j ) a minimizer of f dsdp r,t(r), then {z (r,t(r)) } is a sequence of nearly optimal solutions of () and Theorem 4. (iii) holds for the corresponding truncated sequence {ẑ (r,t(r)) }. In particular, when (P) has an unique minizer x and r, t are large enough, we can expect that the truncation ẑ (r,t) of any approximate solution z (r,t) of (6) lies in a small neighborhood of x. 4...O.-Convex case If Assumption 3. holds and f(x) is s.o.s-convex, then the Lagrangian L f (X) as defined in (5) is s.o.s by Lemma 3. and now we can convert (P) to sup ρ ρ,µ,σ s.t. f(x) ρ = p(x, y)dµ(y) + σ, (7) ρ R, µ M (), σ Σ [X] R[X] dp /, and its dual inf L z (f) z s.t. z =, M dp / (z), p X, (y)l z (X ), y, z = (z ) N m dp R Nm d P. (8) Theorem 4.6. Assume that f(x) is s.o.s-convex and Assumption 3. holds, then the followings are true. (i) (7) is solvable with the optimal value equal to f and there is no dual gap between (7) and (8). 
Moreover, if (P) is solvable, then so is (28); (ii) If $z^*$ is a minimizer of (28), then $\hat z^* := \{z^*_\alpha\}_{|\alpha| = 1}$ is a minimizer of (P).

Proof. (i) Denote by $f^{sos}$ the optimal value of (27). Since Assumption 3.6 holds, recalling (23), there exists $\mu \in M(S)$ such that $L_f(x) = f(x) - f^* - \int_S p(x, y)\,d\mu(y)$ for all $x \in \mathbb{R}^m$. Note that the degree of $L_f(X)$ is even and at most $2\lceil d_P/2 \rceil$. As $L_f$ is s.o.s, it holds that

$$f(X) - f^* - \int_S p(X, y)\,d\mu(y) \in \Sigma^2[X] \cap \mathbb{R}[X]_{2\lceil d_P/2 \rceil},$$

which means that (7) is feasible and f sos f. For any x K and feasible point (ρ, µ, σ) of (7), it holds that f(x) ρ which implies that f sos f. Consequently, we have f sos = f. ince (8) is strictly feasible (see the proof of Theorem 4. (ii)), (7) is solvable and there is no dual gap between (7) and (8). uppose that f is attainable in (P) at a minimizer x K. Let z R Nm d P be the truncated moment sequence associated with the Dirac measure at x, then f is attainable in (8) at z. (ii) By Theorem 3.3 (i) and the proof of Theorem 3.7 (ii), it is easy to see that ẑ K. As f(x) is s.o.s-convex, by (Lasserre, 9a, Theorem.6), the extension of Jensen s inequality f(ẑ ) L z (f) = f holds, which implies that ẑ is a minimizer of (P). The corresponding DP relaxations of (7) and (8) are ft psdp := sup ρ ρ,w,σ s.t. f(x) ρ = p Y,β (X)w β + σ, β and its dual ft dsdp := inf L z (f) z,σ,σ j M t (w), M t dj (g j w), j =,..., s, ρ R, w = (w β ) β N n t, σ Σ [X] R[X] dp / s.t. z =, M dp / (z), p X, (Y )L z (X ) = σ + s σ j g j, σ, σ j Σ [Y ], j= z = (z ) N m dp R Nm d P, deg(σ), deg(σ j g j ) t. (9) (3) Theorem 4.7. Assume that f(x) is s.o.s-convex and Assumption 3. holds, then the followings are true. (i) If Q(G) is Archimedean, then lim t f psdp t = lim t f dsdp t = f ; (ii) For some order t d K, if Rank Condition.9 holds for w in the solution (ρ, w, σ ) of (9), then f psdp t = f ; (iii) Let {(z (t), σ (t), σ (t) j )} be a sequence of nearly optimal solutions of (3) and ẑ (t) := {z (t) = }. For any convergent subsequence {ẑ (ti) } of {ẑ (t) }, lim i ẑ (ti) is a minimizer of (P). Consequently, if {ẑ (t) } is bounded and x is the unique minimizer of (P), then lim t ẑ (t) = x. (iv) If is in the case of (8). then ft psdp = ft dsdp = f where t = d K. If (P) is solvable, then x is a minimizer of (P) if and only if there exists a minimizer (z, σ, σj ) of (3) with t = t such that ẑ := {z = } = x. Proof. (i) and (ii): imilar to the proof of Theorem 4.3. 
(iii): Since $f(X)$ is s.o.s-convex, due to the extended Jensen's inequality (Lasserre, 2009a, Theorem 2.6), it holds that $f(\hat z^{(t)}) \le L_{z^{(t)}}(f)$ and therefore $f(\lim_{t \to \infty} \hat z^{(t)}) \le f^*$. By Theorem 3.13, the sequence $\{\hat z^{(t)}\} \subseteq K$ and hence $\lim_{t \to \infty} \hat z^{(t)} \in K$. Thus, $\lim_{i \to \infty} \hat z^{(t_i)}$ is a minimizer of (P).

(iv): By Theorem 4.6 (i) and the weak duality, it holds that $f^* \le f_t^{psdp} \le f_t^{dsdp}$ for any $t \ge d_K$. For any $\varepsilon > 0$, there exists a point $x^{(\varepsilon)} \in K$ such that $f(x^{(\varepsilon)}) \le f^* + \varepsilon$. Let $z^{(\varepsilon)} \in \mathbb{R}^{\mathbb{N}^m_{2\lceil d_P/2 \rceil}}$ be the truncated moment sequence associated with the Dirac measure at $x^{(\varepsilon)}$. By Theorem 3.9, $z^{(\varepsilon)}$ is feasible to (30) with $t = d_K$, which implies that $f_t^{dsdp} \le f^* + \varepsilon$ for $t = d_K$. Since $\varepsilon$ is arbitrary, it holds that $f_t^{psdp} = f_t^{dsdp} = f^*$ for $t = d_K$.

Clearly, we only need to prove the "if" part. Since Assumption 3.11 holds, we have $\hat z^* \in K$ by Theorem 3.13 (iii). As $f(X)$ is s.o.s-convex, due to the extended Jensen's inequality (Lasserre, 2009a, Theorem 2.6), it holds that $f^* \le f(\hat z^*) \le L_{z^*}(f) = f^*$. Thus, $\hat z^*$ is a minimizer of (P).

Remark 4.8. Note that we do not require $K$ to be compact in Theorems 4.6 and 4.7.
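The proofs above repeatedly pass from a moment vector $z$ to the point $\hat z$ of its first-order moments and compare $f(\hat z)$ with $L_z(f)$. The following minimal sketch illustrates this for $m = 2$ with moment vectors that do come from measures. The dictionary encoding, the helper names and the quadratic $f(X) = (X_1 - 1)^2 + X_2^2$ are illustrative assumptions, not the paper's data. For a Dirac measure, $L_z(f)$ equals the point evaluation; for a mixture of two Dirac measures, the Jensen inequality $f(\hat z) \le L_z(f)$ holds.

```python
def riesz(moments, poly):
    """Riesz functional L_z(f) = sum over alpha of f_alpha * z_alpha."""
    return sum(c * moments[alpha] for alpha, c in poly.items())

def dirac_moments(point, degree):
    """Moments z_alpha = x^alpha, |alpha| <= degree, of the Dirac measure at x."""
    x1, x2 = point
    return {(a, b): x1**a * x2**b
            for a in range(degree + 1) for b in range(degree + 1 - a)}

def mix(z1, z2, weight):
    """Moments of the convex combination weight * mu1 + (1 - weight) * mu2."""
    return {k: weight * z1[k] + (1 - weight) * z2[k] for k in z1}

def evaluate(poly, point):
    """Evaluate a dict-encoded polynomial at a point of R^2."""
    x1, x2 = point
    return sum(c * x1**a * x2**b for (a, b), c in poly.items())

# Hypothetical convex quadratic: f(X) = (X1 - 1)^2 + X2^2.
f = {(2, 0): 1, (1, 0): -2, (0, 0): 1, (0, 2): 1}

# Dirac measure: the Riesz functional reduces to point evaluation.
x = (0.5, -0.25)
z_dirac = dirac_moments(x, 2)

# Mixture of two Dirac measures: z_hat collects the first-order moments.
z_mix = mix(dirac_moments((0.0, 0.0), 2), dirac_moments((1.0, 2.0), 2), 0.5)
z_hat = (z_mix[(1, 0)], z_mix[(0, 1)])
jensen_gap = riesz(z_mix, f) - evaluate(f, z_hat)
```

In the relaxations, $z$ is only required to satisfy moment-matrix constraints and need not come from a measure, which is why the extended Jensen inequality for s.o.s-convex $f$ is invoked rather than the classical one.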

4.3. Linear case

Now we consider the case when $f(X)$ and $p(X, y)$ are linear in $X$ for every $y \in S$. Let $f(X) = c^T X$ and $p(X, Y) = a(Y)^T X + b(Y)$ for some $c \in \mathbb{R}^m$, $a(Y) \in \mathbb{R}[Y]^m$ and $b(Y) \in \mathbb{R}[Y]$. Then, (P) becomes the following linear semi-infinite polynomial programming problem

$$f^* := \inf_{x \in \mathbb{R}^m} c^T x \quad \text{s.t.} \quad a(y)^T x + b(y) \ge 0,\ \forall y \in S. \tag{31}$$

As $\lceil d_P/2 \rceil = 1$ in (31), the reformulation (28) in this case is just (31). The dual (27) can be written as

$$\sup_{\mu \in M(S)} -\int_S b(y)\,d\mu(y) \quad \text{s.t.} \quad \int_S a_i(y)\,d\mu(y) = c_i,\ i = 1, \ldots, m. \tag{32}$$

In fact, according to the proofs of Theorem 4.2 (i) and 4.6 (i), the set $M(S)$ in (32) can be replaced by the set of atomic measures supported on $S$. Then, we get

$$\sup_{\lambda_y} -\sum_{y \in S} \lambda_y b(y) \quad \text{s.t.} \quad \sum_{y \in S} \lambda_y a(y) = c,\ \lambda_y \ge 0,\ y \in S, \tag{33}$$

where only finitely many dual variables $\lambda_y$, $y \in S$, take positive values. The problem (33) is known as the Haar dual problem (Charnes et al. (1963)) of (31). According to Theorem 4.6 (i), we reproduce the well-known result:

Proposition 4.9. (Charnes et al. (1965)) If $S$ is compact and the Slater condition holds for $K$, then (31) and (33) have the same optimal value which is attainable in (33).

Write $a_i(Y) = \sum_\alpha a_{i,\alpha} Y^\alpha$ and $b(Y) = \sum_\alpha b_\alpha Y^\alpha$. The corresponding SDP relaxations (29) and (30) become

$$\begin{aligned} f_t^{psdp} := \sup_{w \in \mathbb{R}^{\mathbb{N}^n_{2t}}}\ & -\sum_{\alpha \in \mathbb{N}^n_{2t}} b_\alpha w_\alpha \\ \text{s.t.}\ & \sum_{\alpha \in \mathbb{N}^n_{2t}} a_{i,\alpha} w_\alpha = c_i,\ i = 1, \ldots, m, \\ & M_t(w) \succeq 0,\ M_{t - d_j}(g_j w) \succeq 0,\ j = 1, \ldots, s, \end{aligned} \tag{34}$$

and

$$\begin{aligned} f_t^{dsdp} := \inf_{x, \sigma_0, \sigma_j}\ & c^T x \\ \text{s.t.}\ & a(Y)^T x + b(Y) = \sigma_0 + \sum_{j=1}^s \sigma_j g_j, \\ & \sigma_0, \sigma_j \in \Sigma^2[Y],\ \deg(\sigma_0), \deg(\sigma_j g_j) \le 2t,\ j = 1, \ldots, s. \end{aligned} \tag{35}$$

It follows from Theorem 4.7 that

Corollary 4.10. If $S$ is compact and the Slater condition holds for $K$, then the conclusions of Theorem 4.7 hold for (34) and (35).

Example 4.11. Now we consider three convex semi-infinite polynomial programming problems using the sets $K$ defined in Examples 3.16, 3.17 and 3.18. Notice that the constraints in the dual SDP relaxations (26), (30) and (35) can be easily generated by Yalmip. Hence, we solve the following problems using these corresponding dual SDP relaxations, which can also give us some information on the minimizers of the problems.

Figure 4: The sets K and contoure lines of f in Example 4...5.534.5.5.534.5.894.894 x x -.5 -.5 - -.894 -.5 -.5 - - -.5 - -.5.5.5 x - - -.5 - -.5.5.5 x. Recall the sets K and defined in Example 3.6 where the polynomial p(x, X, y, y ) R[X, X ] is convex but not s.o.s-convex for every y. Ahmadi and Parrilo (3) constructed a polynomial f(x, X ) =89 363X 4 X + 553 64 X6 95 4 X5 + 497 6 X4 + 7X 6X 3 4X 3 + 387 + 363X 4 9X 5 + 77X 6 + 36X X + 49X X 3 55X X 968X X + 7X X 4 + 794X 3 X + 769 X X 3 X5 X + 43 4 X4 X + 67 X3 X 3 + 49 6 X X 4 399 X X 5 385 X3 X 44 X X 3 364X + 48X. 4 X (see (Ahmadi and Parrilo, 3, (5.))) which is convex but not s.o.s-convex. In order to illustrate the efficiency of the DP relaxations (6) better, we shift and scale f to define f(x, X ) := f(x, X )/, which is still convex but not s.o.s-convex. Then, consider the problem min x K f(x, x ), where d P = d X = 8. Letting r = t = 4, we have f dsdp 4,4 =.534 and the truncation of DP relaxation minimizer ẑ (4,4) = (.445,.6373). To show the accuracy of the solution, we draw some contoure lines of f, including f(x, x ) =.534, and mark the point ẑ (4,4) by red + in Figure 4 (left). As we can see, the line f(x, x ) =.534 is almost tangent to K at the point ẑ (4,4).. Recall the sets K and defined in Example 3.7. Let f(x, X ) := (X ) + X, i.e., the square of the distance function of a point to (, ), and consider the problem min x K f(x, x ). Then, the polynomials f(x, X ) and p(x, X, y, y ) for all y are s.o.s-convex. As d K =, solving the DP relaxation (3) with t =, we get f dsdp =.894 and ẑ () = (.3,.335). The corresponding contoures and the minimizer ẑ () are shown in Figure 4 (right). 3. Recall the sets K and defined in Example 3.8. Let f(x, X ) = X + X and consider the linear semi-infinite programming problem min x K f(x, x ). As pointed in Example 3.7, the boundary of K is the curve g(x, X ) := X + X + X X + X =. Hence, the problem is equivalent to min f(x, x ) s.t. g(x, x ) =. 
x R By the method of Lagrange multipliers, we get f = with the minimizer (, ). olving the DP relaxation (35) with t =, we get f dsdp = and ẑ () = (, 3.6349 6 ).
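The linear case of Section 4.3 and its Haar dual (33) can be illustrated on a toy instance that is not one of the paper's examples. Assume $K$ is the unit disk, written as $a(y)^T x + b(y) \ge 0$ for all $y$ on the unit circle with $a(y) = -y$ and $b(y) = 1$, and take $c = (3, 4)$. The sketch below checks primal feasibility of $x^* = -c/\|c\|$ on a sampled grid of $S$, builds a one-atom dual solution, and compares the two objective values. The sign convention $\sup -\sum_y \lambda_y b(y)$ is the one forced by weak duality, since $c^T x = \sum_y \lambda_y a(y)^T x \ge -\sum_y \lambda_y b(y)$ for feasible pairs. All names are hypothetical.

```python
import math

def a(y):
    """Coefficient vector a(y) = -y of the toy constraint a(y)^T x + b(y) >= 0."""
    return (-y[0], -y[1])

def b(y):
    """Constant term b(y) = 1 of the toy constraint."""
    return 1.0

def primal_feasible(x, samples=3600, tol=1e-9):
    """Check a(y)^T x + b(y) >= 0 on a uniform grid of the unit circle."""
    for k in range(samples):
        t = 2 * math.pi * k / samples
        y = (math.cos(t), math.sin(t))
        ay = a(y)
        if ay[0] * x[0] + ay[1] * x[1] + b(y) < -tol:
            return False
    return True

c = (3.0, 4.0)
norm_c = math.hypot(*c)

# Primal candidate: the boundary point of the disk in direction -c.
x_star = (-c[0] / norm_c, -c[1] / norm_c)
primal_value = c[0] * x_star[0] + c[1] * x_star[1]

# Dual candidate: a single atom at y_star = x_star with weight ||c||.
atoms = [(x_star, norm_c)]
dual_residual = tuple(
    sum(lam * a(y)[i] for y, lam in atoms) - c[i] for i in range(2)
)
dual_value = -sum(lam * b(y) for y, lam in atoms)
```

Both values equal $-\|c\|$ up to rounding, so the finitely supported dual solution certifies optimality of `x_star`. The grid check is only a discretization of the semi-infinite constraint, whereas the relaxation (35) replaces it with an exact sum-of-squares identity.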