Moments and Positive Polynomials for Optimization II: LP- versus SDP-relaxations
LAAS-CNRS and Institute of Mathematics, Toulouse, France
EECI Course: February 2016
LP-relaxations
Recall the Global Optimization problem P:

$f^* := \min \{\, f(x) \;:\; g_j(x) \ge 0,\ j = 1,\dots,m \,\}$,

where $f$ and the $g_j$ are all POLYNOMIALS, and let

$K := \{\, x \in \mathbb{R}^n \;:\; g_j(x) \ge 0,\ j = 1,\dots,m \,\}$

be the feasible set (a compact basic semi-algebraic set).
Putinar's Positivstellensatz

Assumption 1: For some $M > 0$, the quadratic polynomial $M - \|X\|^2$ belongs to the quadratic module $Q(g_1,\dots,g_m)$.

Theorem (Putinar-Jacobi-Prestel)
Let K be compact and Assumption 1 hold. Then

[ $f \in \mathbb{R}[X]$ and $f > 0$ on K ] $\;\Longrightarrow\;$ $f \in Q(g_1,\dots,g_m)$, i.e.,

$f(x) = \sigma_0(x) + \sum_{j=1}^{m} \sigma_j(x)\, g_j(x), \qquad \forall x \in \mathbb{R}^n,$

for some s.o.s. polynomials $\{\sigma_j\}_{j=0}^{m}$.
If one fixes an a priori bound on the degree of the s.o.s. polynomials $\{\sigma_j\}$, checking $f \in Q(g_1,\dots,g_m)$ reduces to solving an SDP!!

Moreover, Assumption 1 holds true if e.g.:
- all the $g_j$'s are linear (hence K is a polytope), or if
- the set $\{\, x : g_j(x) \ge 0 \,\}$ is compact for some $j \in \{1,\dots,m\}$.

If $x \in K \Rightarrow \|x\| \le M$ for some (known) $M$, then it suffices to add the redundant quadratic constraint $M^2 - \|X\|^2 \ge 0$ in the definition of K.
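As a small numerical sanity check (an illustration, not from the slides, assuming numpy is available): an s.o.s. certificate for a fixed degree bound amounts to exhibiting a PSD Gram matrix $Q$ with $f(x) = z^\top Q z$ over a monomial vector $z$. For the univariate example $f(x) = x^4 - 2x^2 + 1 = (x^2-1)^2$ and $z = (1, x, x^2)$:

```python
import numpy as np

# f(x) = x^4 - 2x^2 + 1 = (x^2 - 1)^2, monomial basis z = (1, x, x^2)
# Gram matrix Q with f = z^T Q z: entry (i, j) contributes to x^(i+j)
Q = np.array([[ 1.0, 0.0, -1.0],
              [ 0.0, 0.0,  0.0],
              [-1.0, 0.0,  1.0]])

# Q PSD  <=>  this Gram matrix certifies that f is a sum of squares
assert np.all(np.linalg.eigvalsh(Q) >= -1e-9)

# recover the coefficients of f from Q: coeff of x^k = sum_{i+j=k} Q[i, j]
coeffs = np.zeros(5)                     # degrees 0..4, low to high
for i in range(3):
    for j in range(3):
        coeffs[i + j] += Q[i, j]
print(coeffs)                            # [1, 0, -2, 0, 1] = 1 - 2x^2 + x^4
```

In a real SDP the entries of $Q$ would be decision variables; here the certificate is written down by hand only to show what the SDP searches for.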
Krivine-Handelman-Vasilescu Positivstellensatz

Assumption I: With $g_0 = 1$, the family $\{g_0,\dots,g_m\}$ generates the algebra $\mathbb{R}[x]$, that is, $\mathbb{R}[x_1,\dots,x_n] = \mathbb{R}[g_0,\dots,g_m]$.

Assumption II: Recall that K is compact. Hence we also assume with no loss of generality (but possibly after scaling) that for every $j = 1,\dots,m$:

$0 \le g_j(x) \le 1 \qquad \forall x \in K.$
Notation: for $\alpha, \beta \in \mathbb{N}^m$, let

$g(x)^\alpha = g_1(x)^{\alpha_1} \cdots g_m(x)^{\alpha_m}, \qquad (1-g(x))^\beta = (1-g_1(x))^{\beta_1} \cdots (1-g_m(x))^{\beta_m}.$

Theorem (Krivine-Vasilescu Positivstellensatz)
Let Assumption I and Assumption II hold: If $f \in \mathbb{R}[x_1,\dots,x_n]$ is POSITIVE on K then

$f(x) = \sum_{\alpha,\beta \in \mathbb{N}^m} c_{\alpha\beta}\, g(x)^\alpha (1-g(x))^\beta, \qquad \forall x \in \mathbb{R}^n,$

for finitely many positive coefficients $(c_{\alpha\beta})$.
Testing whether

$f(x) = \sum_{\alpha,\beta \in \mathbb{N}^m} c_{\alpha\beta}\, g(x)^\alpha (1-g(x))^\beta, \qquad \forall x \in \mathbb{R}^n,$

for finitely many positive coefficients $(c_{\alpha\beta})$ with $\sum_i \alpha_i + \beta_i \le d$ ... reduces to solving an LP!

Indeed, recall that $f(x) = \sum_\gamma f_\gamma x^\gamma$. So expand and state that

$\sum_{\alpha,\beta \in \mathbb{N}^m} c_{\alpha\beta}\, g(x)^\alpha (1-g(x))^\beta \;=\; \sum_{\gamma \in \mathbb{N}^n} \theta_\gamma(c)\, x^\gamma$

$\Longrightarrow \qquad f_\gamma = \theta_\gamma(c),\ \ \gamma \in \mathbb{N}^n_{2d}; \qquad c \ge 0.$

... a linear system!
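The linear system can be made concrete in the simplest case (an illustration with $n = m = 1$, $K = [0,1]$, $g_1(x) = x$, assuming numpy): the products $x^\alpha (1-x)^\beta$ are expanded into monomials, their coefficient vectors form the columns of a matrix $A$, and $\theta_\gamma(c) = (Ac)_\gamma$. For $f(x) = 2x^2 - 2x + 1$, positive on $[0,1]$, one nonnegative solution is $f = x^2 + (1-x)^2$:

```python
import numpy as np

# K = [0, 1], g(x) = x; basis products x^a (1-x)^b with a + b <= 2,
# each stored by its monomial coefficients (degree 0..2, low to high)
def prod_coeffs(a, b):
    p = np.poly1d([1.0])
    for _ in range(a):
        p = p * np.poly1d([1.0, 0.0])      # multiply by x
    for _ in range(b):
        p = p * np.poly1d([-1.0, 1.0])     # multiply by (1 - x)
    c = np.zeros(3)
    c[: p.order + 1] = p.coeffs[::-1]      # low-degree first, padded to degree 2
    return c

pairs = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]
A = np.column_stack([prod_coeffs(a, b) for a, b in pairs])  # theta_gamma(c) = A @ c

f = np.array([1.0, -2.0, 2.0])                # f(x) = 1 - 2x + 2x^2, f > 0 on [0,1]
c = np.array([0.0, 0.0, 0.0, 1.0, 0.0, 1.0])  # f = 1*x^2 + 1*(1-x)^2, c >= 0
assert np.allclose(A @ c, f)                  # the linear system f_gamma = theta_gamma(c)
```

The LP simply searches for such a nonnegative $c$ (together with the objective on $\lambda$ in the relaxations below); here a feasible $c$ is written down by hand and verified.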
DUAL side: The K-moment problem

Let $\{X^\alpha\}$ be the canonical basis of $\mathbb{R}[X]$, and let $y := \{y_\alpha\}$ be a given sequence indexed in that basis. Recall the K-moment problem: Given $K \subseteq \mathbb{R}^n$, does there exist a measure $\mu$ on K such that

$y_\alpha = \int_K X^\alpha\, d\mu, \qquad \alpha \in \mathbb{N}^n \,?$

(where $X^\alpha = X_1^{\alpha_1} \cdots X_n^{\alpha_n}$).
Given $y = \{y_\alpha\}$, let $L_y : \mathbb{R}[X] \to \mathbb{R}$ be the linear functional

$f \;\Big(= \sum_\alpha f_\alpha X^\alpha\Big) \;\longmapsto\; L_y(f) := \sum_{\alpha \in \mathbb{N}^n} f_\alpha\, y_\alpha.$

Moment matrix $M_d(y)$, with rows and columns also indexed in the basis $\{X^\alpha\}$:

$M_d(y)(\alpha,\beta) := L_y(X^{\alpha+\beta}) = y_{\alpha+\beta}, \qquad \alpha, \beta \in \mathbb{N}^n,\ |\alpha|, |\beta| \le d.$
For instance, in $\mathbb{R}^2$, with rows and columns indexed by $1, X_1, X_2$:

$M_1(y) = \begin{pmatrix} y_{00} & y_{10} & y_{01} \\ y_{10} & y_{20} & y_{11} \\ y_{01} & y_{11} & y_{02} \end{pmatrix}$

Importantly ...

$M_d(y) \succeq 0 \quad\Longleftrightarrow\quad L_y(h^2) \ge 0, \quad \forall h \in \mathbb{R}[X]_d.$
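A quick numerical illustration (not on the slides, assuming numpy): take $\mu$ the uniform probability measure on $[0,1]^2$, whose moments are $y_{\alpha} = 1/((\alpha_1+1)(\alpha_2+1))$, build $M_1(y)$ exactly as above, and check that it is indeed PSD:

```python
import numpy as np

# moments of the uniform probability measure on [0,1]^2:
# y_ab = int x1^a x2^b dmu = 1 / ((a+1)(b+1))
y = lambda a, b: 1.0 / ((a + 1) * (b + 1))

# rows/columns indexed by the monomials 1, X1, X2
idx = [(0, 0), (1, 0), (0, 1)]
M1 = np.array([[y(a1 + b1, a2 + b2) for (b1, b2) in idx] for (a1, a2) in idx])

# M_1(y) must be PSD since y comes from an actual measure
assert np.all(np.linalg.eigvalsh(M1) >= -1e-12)
print(M1)
```

The first row is $(1, 1/2, 1/2)$ (mass and first moments), and positive semidefiniteness holds as the theory predicts for any sequence coming from a measure.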
Localizing matrix

The "localizing matrix" $M_d(\theta\, y)$ w.r.t. a polynomial $\theta \in \mathbb{R}[X]$ with $\theta(X) = \sum_\gamma \theta_\gamma X^\gamma$ has its rows and columns also indexed in the basis $\{X^\alpha\}$ of $\mathbb{R}[X]_d$, with entries:

$M_d(\theta\, y)(\alpha,\beta) = L_y(\theta\, X^{\alpha+\beta}) = \sum_{\gamma \in \mathbb{N}^n} \theta_\gamma\, y_{\alpha+\beta+\gamma}, \qquad \alpha, \beta \in \mathbb{N}^n,\ |\alpha|, |\beta| \le d.$
For instance, in R 2, and with X θ(x) := 1 X 2 1 X 2 2, M 1 (θ y) = 1 {}}{ y 00 y 20 y 02, X 1 {}}{ y 10 y 30 y 12, X 2 {}}{ y 01 y 21 y 03 y 10 y 30 y 12, y 20 y 40 y 22, y 11 y 21 y 12 y 01 y 21 y 03, y 11 y 21 y 12, y 02 y 22 y 04. Importantly... M d (θ y) 0 L y (h 2 θ) 0, h R[X] d
Putinar's dual conditions

Again $K := \{\, x \in \mathbb{R}^n : g_j(x) \ge 0,\ j = 1,\dots,m \,\}$.

Assumption 1: For some $M > 0$, the quadratic polynomial $M - \|X\|^2$ is in the quadratic module $Q(g_1,\dots,g_m)$.

Theorem (Putinar: dual side)
Let K be compact, and Assumption 1 hold. Then a sequence $y = (y_\alpha)$, $\alpha \in \mathbb{N}^n$, has a representing measure $\mu$ on K if and only if

$(**) \qquad L_y(f^2) \ge 0; \quad L_y(f^2\, g_j) \ge 0,\ j = 1,\dots,m; \qquad \forall f \in \mathbb{R}[X].$
Checking whether $(**)$ holds for all $f \in \mathbb{R}[X]$ of degree at most $d$ reduces to checking whether $M_d(y) \succeq 0$ and $M_d(g_j\, y) \succeq 0$ for all $j = 1,\dots,m$:

$m+1$ LMI conditions to verify!
Krivine-Vasilescu: dual side

Theorem
Let K be compact, and Assumptions I and II hold. Then the sequence $y = (y_\alpha)$, $\alpha \in \mathbb{N}^n$, has a representing measure $\mu$ on K if and only if

$L_y\big(g^\alpha (1-g)^\beta\big) \ge 0, \qquad \forall\, \alpha, \beta \in \mathbb{N}^m.$
LP-relaxations

With $f \in \mathbb{R}[x]$, consider the hierarchy of LP-relaxations

$\rho_d = \max_{\lambda,\, c_{\alpha\beta}} \Big\{\, \lambda \ :\ f - \lambda = \sum_{\alpha,\beta \in \mathbb{N}^m} c_{\alpha\beta}\, g^\alpha (1-g)^\beta;\ \ c_{\alpha\beta} \ge 0,\ |\alpha| + |\beta| \le 2d \,\Big\}$

and of course $\rho_d \le \rho_{d+1} \le f^*$ for all $d$.
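A concrete instance (an illustrative sketch assuming scipy is available, with $n = m = 1$): minimize $f(x) = (x - 1/2)^2$ on $K = [0,1]$ with $g_1(x) = x$. For $|\alpha| + |\beta| \le 2$, matching the monomial coefficients of $f - \lambda$ against $\sum c_{\alpha\beta}\, x^\alpha (1-x)^\beta$ gives the three equality constraints below, and the LP returns $\rho_1 = -1/4$, strictly below $f^* = 0$:

```python
import numpy as np
from scipy.optimize import linprog

# f(x) = (x - 1/2)^2 = 0.25 - x + x^2 on K = [0, 1], g(x) = x.
# Variables: [lambda, c00, c10, c01, c20, c11, c02] for the products
# 1, x, (1-x), x^2, x(1-x), (1-x)^2.  Matching coefficients of
# f - lambda = sum c_ab x^a (1-x)^b  gives three equalities:
A_eq = np.array([
    [1, 1, 0,  1, 0,  0,  1],   # constant: lambda + c00 + c01 + c02 = 0.25
    [0, 0, 1, -1, 0,  1, -2],   # x:        c10 - c01 + c11 - 2 c02 = -1
    [0, 0, 0,  0, 1, -1,  1],   # x^2:      c20 - c11 + c02        =  1
])
b_eq = np.array([0.25, -1.0, 1.0])

obj = np.array([-1.0, 0, 0, 0, 0, 0, 0])           # maximize lambda
bounds = [(None, None)] + [(0, None)] * 6          # lambda free, c >= 0
res = linprog(obj, A_eq=A_eq, b_eq=b_eq, bounds=bounds)

print(res.x[0])    # rho_1 = -0.25, while f* = 0
```

The gap is no accident: the minimizer $x^* = 1/2$ lies in the interior of K, which (per the remarks below on active constraints) rules out finite convergence of the LP hierarchy for this problem.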
Theorem
Assume that K is compact and Assumptions I and II hold. Then the LP-relaxations CONVERGE, that is, $\rho_d \uparrow f^*$ as $d \to \infty$.

The SHERALI-ADAMS RLT hierarchy is exactly this type of LP-relaxation. Its convergence for 0/1 programs was proved with ad-hoc arguments. In fact, the rationale behind such convergence is the Krivine-Vasilescu Positivstellensatz.
Some remarks on LP-relaxations

1. Notice the presence of binomial coefficients in both primal and dual LP-relaxations ... which yields numerical ill-conditioning for relatively large $d$.

2. Let $x^* \in K$ be a global minimizer, and for $x \in K$, let $J(x)$ be the set of active constraints, i.e., those with $g_j(x) = 0$ or $1 - g_k(x) = 0$. Then FINITE convergence CANNOT occur if there exists a nonoptimal $x \in K$ with $J(x) \supseteq J(x^*)$!

And so ... not possible for CONVEX problems in general! For instance, if K is a polytope then FINITE convergence is possible only if every global minimizer is a vertex of K!
LP- and SDP-relaxations with their duals

Primal LP-relaxation:
$\displaystyle \min_y\ L_y(f)\quad \text{s.t.}\quad L_y\big(g^\alpha (1-g)^\beta\big) \ge 0,\ \ \forall\, \alpha,\beta \in \mathbb{N}^m,\ |\alpha| + |\beta| \le 2d$

Primal SDP-relaxation:
$\displaystyle \min_y\ L_y(f)\quad \text{s.t.}\quad L_y(h^2\, g_j) \ge 0,\ \ j = 0,1,\dots,m\ (g_0 := 1),\ \ \forall h:\ \deg(h^2 g_j) \le 2d$

Dual LP-relaxation:
$\displaystyle \max_{\lambda,\{c_{\alpha\beta}\}}\ \lambda\quad \text{s.t.}\quad f - \lambda = \sum_{\alpha,\beta \in \mathbb{N}^m} c_{\alpha\beta}\, g^\alpha (1-g)^\beta;\ \ c_{\alpha\beta} \ge 0,\ |\alpha| + |\beta| \le 2d$

Dual SDP-relaxation:
$\displaystyle \max_{\lambda,\{\sigma_j\}}\ \lambda\quad \text{s.t.}\quad f - \lambda = \sum_{j=0}^{m} \sigma_j\, g_j\ (g_0 := 1);\ \ \sigma_j \text{ s.o.s.},\ \deg(\sigma_j g_j) \le 2d,\ j \le m$