Towards Solving Bilevel Optimization Problems in Quantum Information Theory

ICFO-The Institute of Photonic Sciences and University of Borås
22 January 2016
Workshop on Linear Matrix Inequalities, Semidefinite Programming and Quantum Information Theory

Outline
1 The Bilevel Problem
2 Polynomial Optimization Problems of Commuting Variables
3 Parametric Polynomial Optimization Problems of Commuting Variables
4 Polynomial Optimization Problems of Noncommuting Variables
5 Summary

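The Generic Form

$$\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} f(x, y) \quad \text{such that} \quad g_i(x, y) \le 0, \quad i = 1, \dots, s.$$

This is the upper level problem. The lower level problem is:

$$y \in Y(x) = \operatorname{argmin}_{w \in \mathbb{R}^m} \{G(x, w) : h_j(w) \le 0,\ j = 1, \dots, r\},$$

or:

$$y \in Y(x) = \operatorname{argmin}_{w \in \mathbb{R}^m} \{G(x, w) : h_j(x, w) \le 0,\ j = 1, \dots, r\}.$$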

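Min-Max Problems

Also known as pursuit-evasion games.

$$\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} f(x, y) \quad \text{such that} \quad g_i(x, y) \le 0, \quad i = 1, \dots, s,$$

$$y \in Y(x) = \operatorname{argmax}_{w \in \mathbb{R}^m} \{f(x, w) : h_j(w) \le 0,\ j = 1, \dots, r\}.$$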

The Two Types of Bilevel POPs

1 Convex lower level objective function. Solution: single-level relaxation with Lagrange multipliers.
2 Nonconvex lower level objective function. Solution: iterative approximate solution through parametric POPs and relaxations.

V. Jeyakumar, J.B. Lasserre, G. Li, T.S. Pham. Convergent Semidefinite Programming Relaxations for Global Bilevel Polynomial Optimization Problems. arXiv:1506.02099.
J.B. Lasserre. A joint+marginal approach to parametric polynomial optimization. arXiv:0905.2497.

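Convex Lower Level Objective

Nondegeneracy:

$$\forall y \in \{w : h_j(w) \le 0,\ j = 1, \dots, r\}: \quad h_j(y) = 0 \implies \nabla h_j(y) \ne 0$$

+ Slater condition: Equivalent single-level problem.

$$\begin{aligned}
\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} \quad & f(x, y) \\
\text{such that} \quad & g_i(x, y) \le 0, \quad i = 1, \dots, s, \\
& h_j(y) \le 0, \quad j = 1, \dots, r, \\
& \lambda_0 \nabla_y G(x, y) + \sum_{j=1}^{r} \lambda_j \nabla h_j(y) = 0,
\end{aligned}$$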
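To make the single-level reformulation concrete, here is a minimal sketch on a toy lower level problem that is not from the talk: $G(x, w) = (w - x)^2$ with one constraint $h(w) = w - 1 \le 0$. It assumes the normalization $\lambda_0 = 1$ (admissible under the Slater condition) and also checks complementary slackness $\lambda_1 h(y) = 0$, which is the standard KKT companion of the stationarity condition. All function names are hypothetical.

```python
# Toy check of the stationarity condition
#   lambda_0 * dG/dy + lambda_1 * dh/dy = 0
# for G(x, w) = (w - x)**2 and h(w) = w - 1 <= 0 (assumed toy problem).

LAMBDA_0 = 1.0  # assumed normalization of the objective multiplier


def lower_level_solution(x):
    """Closed-form minimizer of (w - x)**2 subject to w - 1 <= 0."""
    return min(x, 1.0)


def multiplier(x, y):
    """Solve the stationarity condition for lambda_1 (dh/dy = 1)."""
    return -LAMBDA_0 * 2.0 * (y - x)


def kkt_residuals(x, y, lam):
    """Return (stationarity, complementarity) residuals; both should vanish."""
    stationarity = LAMBDA_0 * 2.0 * (y - x) + lam * 1.0
    complementarity = lam * (y - 1.0)
    return stationarity, complementarity


def check_parameter(x):
    """True if the lower level solution at x satisfies all KKT conditions."""
    y = lower_level_solution(x)
    lam = multiplier(x, y)
    stat, comp = kkt_residuals(x, y, lam)
    feasible = (y - 1.0 <= 0.0) and (lam >= 0.0)
    return feasible and abs(stat) < 1e-12 and abs(comp) < 1e-12


results = {x: check_parameter(x) for x in (-0.5, 0.3, 1.0, 2.5)}
```

For $x > 1$ the constraint is active and the multiplier is strictly positive; otherwise it is zero. In the single-level problem these conditions become polynomial constraints in $(x, y, \lambda)$.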

Solving the Equivalent Problem

Use the usual hierarchy of relaxations. Convergence. Price to pay:
$m + 2r + 2$ more constraints.
$r + 1$ Lagrange multipliers.

A Case Study: Simulating POVMs

Qubit space with a 4-outcome POVM $M = \{M_1, M_2, M_3, M_4\}$, where $M_i \succeq 0$, $\sum_{i=1}^{4} M_i = I$.

Depolarizing noise:

$$(1 - t) M_i + t \operatorname{Tr}(M_i) \frac{I}{2}, \quad t \in [0, 1], \quad i = 1, \dots, 4.$$

A POVM is simulable by projective measurements if and only if $M = \sum_\alpha p_\alpha M_\alpha$, where $M_\alpha$ are projective measurements, $p_\alpha \ge 0$, and $\sum_\alpha p_\alpha = 1$.
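The depolarizing map above can be illustrated with a short sketch. It assumes the usual tetrahedron POVM $M_i = \frac{1}{4}(I + \vec{n}_i \cdot \vec{\sigma})$ with one particular choice of tetrahedron directions; the direction vectors and all helper names are assumptions for illustration, and $2 \times 2$ matrices are plain nested lists.

```python
import math


def tetrahedron_povm():
    """Effects M_i = (I + n_i . sigma) / 4 for four tetrahedron directions (assumed choice)."""
    s = 1.0 / math.sqrt(3.0)
    directions = [(s, s, s), (s, -s, -s), (-s, s, -s), (-s, -s, s)]
    return [[[0.25 * (1 + nz), 0.25 * complex(nx, -ny)],
             [0.25 * complex(nx, ny), 0.25 * (1 - nz)]]
            for nx, ny, nz in directions]


def depolarize(effect, t):
    """Apply M -> (1 - t) M + t Tr(M) I / 2 to a single 2x2 effect."""
    trace = (effect[0][0] + effect[1][1]).real
    noisy = [[(1 - t) * effect[r][c] for c in range(2)] for r in range(2)]
    for d in range(2):
        noisy[d][d] += t * trace / 2.0
    return noisy


def min_eigenvalue(h):
    """Smallest eigenvalue of a 2x2 Hermitian matrix from its trace and determinant."""
    trace = (h[0][0] + h[1][1]).real
    det = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    disc = max(trace * trace - 4.0 * det, 0.0)
    return 0.5 * (trace - math.sqrt(disc))


def effects_sum(povm):
    """Entrywise sum of all effects; equals the identity for a valid POVM."""
    return [[sum(e[r][c] for e in povm) for c in range(2)] for r in range(2)]


noisy_povm = [depolarize(e, 0.3) for e in tetrahedron_povm()]
total = effects_sum(noisy_povm)
smallest = min(min_eigenvalue(e) for e in noisy_povm)
```

The noisy effects still sum to the identity and stay positive semidefinite, so the map sends POVMs to POVMs. For this rank-one POVM the smallest eigenvalue of each noisy effect is $t/4$.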

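With a Fixed POVM

With the tetrahedron POVM, it is an SDP:

$$\begin{aligned}
\min_{t \in [0,1]} \quad & t \\
\text{such that} \quad & (1 - t) M_1 + t \operatorname{Tr}(M_1) \tfrac{I}{2} = N_{12}^{+} + N_{13}^{+} + N_{14}^{+}, \\
& (1 - t) M_2 + t \operatorname{Tr}(M_2) \tfrac{I}{2} = N_{12}^{-} + N_{23}^{+} + N_{24}^{+}, \\
& (1 - t) M_3 + t \operatorname{Tr}(M_3) \tfrac{I}{2} = N_{13}^{-} + N_{23}^{-} + N_{34}^{+}, \\
& (1 - t) M_4 + t \operatorname{Tr}(M_4) \tfrac{I}{2} = N_{14}^{-} + N_{24}^{-} + N_{34}^{-}, \\
& N_{ij}^{+} + N_{ij}^{-} = p_{ij} I, \quad N_{ij}^{\pm} \succeq 0, \quad \sum_{ij} p_{ij} = 1, \quad p_{ij} \ge 0.
\end{aligned}$$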

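Which POVM is the Most Robust to Noise?

We also optimize over POVMs:

$$\min_{t \in [0,1]} t \quad \text{such that} \quad M_i \succeq 0, \quad \sum_{i=1}^{4} M_i = I,$$

$$\begin{aligned}
M = \operatorname{argmax} \Big\{ t : \; & (1 - t) M_1 + t \operatorname{Tr}(M_1) \tfrac{I}{2} = N_{12}^{+} + N_{13}^{+} + N_{14}^{+}, \\
& (1 - t) M_2 + t \operatorname{Tr}(M_2) \tfrac{I}{2} = N_{12}^{-} + N_{23}^{+} + N_{24}^{+}, \\
& (1 - t) M_3 + t \operatorname{Tr}(M_3) \tfrac{I}{2} = N_{13}^{-} + N_{23}^{-} + N_{34}^{+}, \\
& (1 - t) M_4 + t \operatorname{Tr}(M_4) \tfrac{I}{2} = N_{14}^{-} + N_{24}^{-} + N_{34}^{-}, \\
& N_{ij}^{+} + N_{ij}^{-} = p_{ij} I, \quad N_{ij}^{\pm} \succeq 0, \quad \sum_{ij} p_{ij} = 1, \quad p_{ij} \ge 0 \Big\}.
\end{aligned}$$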

Quadratic Bilevel POP

Parametrize the POVM, and it becomes a min-max bilevel quadratic POP of commuting variables with a convex lower level objective function.

$$\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} f(x) \quad \text{such that} \quad g_i(x, y) \le 0, \quad i = 1, \dots, s,$$

$$y \in Y(x) = \operatorname{argmin}_{w \in \mathbb{R}^m} \{f(x) : h_j(x, w) \le 0,\ j = 1, \dots, r\}.$$

The problem:
Lower level has 55 (31) variables alone.
Bilevel has about 70, without Lagrange multipliers.
There is no sparsity structure.
Actual solution is ad hoc, via discretization.

The Parametric Problem

$$J(x) = \inf_{y \in \mathbb{R}^m} \{f(x, y) : h_j(y) \ge 0,\ j = 1, \dots, r\},$$

where

$$x \in X = \{x \in \mathbb{R}^n : h_k(x) \ge 0,\ k = r + 1, \dots, t\}.$$

J.B. Lasserre. A joint+marginal approach to parametric polynomial optimization. arXiv:0905.2497.

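Approximate J(x)
Primal Form

Let $\gamma_\beta = \int_X x^\beta \, d\phi(x)$, where $\phi$ is a Borel probability measure on $X$ with a positive density with respect to the Lebesgue measure on $\mathbb{R}^n$. The primal form reads as

$$\begin{aligned}
\inf_z \quad & L_z(f) \\
\text{such that} \quad & M_k(z) \succeq 0, \\
& M_{k - v_j}(h_j z) \succeq 0, \quad j = 1, \dots, t, \\
& L_z(x^\beta) = \gamma_\beta, \quad \forall \beta \in \mathbb{N}^n_k.
\end{aligned}$$

Positivity constraints fulfilled $\implies$ there exists a finite Borel representing measure $\mu$ on $K = \{h_j(y) \ge 0,\ j = 1, \dots, r\}$ such that $z_{\alpha\beta} = \int_K x^\alpha y^\beta \, d\mu$.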

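Approximate J(x)
Dual Form

$$\begin{aligned}
\sup_{p, \sigma_i} \quad & \int_X p \, d\phi \\
\text{such that} \quad & f - p = \sigma_0 + \sum_{j=1}^{t} \sigma_j h_j, \\
& p \in \mathbb{R}[x], \quad \sigma_j \in \Sigma[x, y], \quad j = 0, \dots, t, \\
& \deg p \le 2k, \quad \deg \sigma_0 \le 2k, \quad \deg \sigma_j h_j \le 2k, \quad j = 1, \dots, t.
\end{aligned}$$

The polynomial $p = \sum_{\beta=0}^{2k} \lambda_\beta x^\beta$ is the approximation $J_k(x)$.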

Example

$$X = [0, 1], \quad K = \{(x, y) : x y_1^2 + y_2^2 - x = 0,\ y_1^2 + x y_2^2 - x = 0\},$$

$$f(x, y) = (1 - 2x)(y_1 + y_2).$$

Analytical solution:

$$J(x) = -2\,|1 - 2x|\,\sqrt{x / (1 + x)}.$$
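The analytical solution can be checked directly. For $x \ne 1$, subtracting the two constraints gives $(x - 1)(y_1^2 - y_2^2) = 0$, hence $y_1^2 = y_2^2 = x / (1 + x)$, so the feasible set for fixed $x$ consists of at most four points. The sketch below enumerates them and compares the minimum of $f$ with the closed form; it restricts to $0 \le x < 1$ and uses hypothetical function names.

```python
import math
from itertools import product


def feasible_points(x):
    """Real solutions (y1, y2) of both constraints for a fixed 0 <= x < 1."""
    r = math.sqrt(x / (1.0 + x))
    return [(s1 * r, s2 * r) for s1, s2 in product((1, -1), repeat=2)]


def constraint_residuals(x, y):
    """Values of the two defining polynomials of K; both vanish on K."""
    y1, y2 = y
    return (x * y1 ** 2 + y2 ** 2 - x, y1 ** 2 + x * y2 ** 2 - x)


def f(x, y):
    """Objective of the parametric problem."""
    return (1 - 2 * x) * (y[0] + y[1])


def J_enumerated(x):
    """Optimal value obtained by enumerating the finite feasible set."""
    return min(f(x, y) for y in feasible_points(x))


def J_closed_form(x):
    """Analytical optimal value function."""
    return -2 * abs(1 - 2 * x) * math.sqrt(x / (1 + x))


xs = [i / 20.0 for i in range(20)]  # grid on [0, 1)
max_gap = max(abs(J_enumerated(x) - J_closed_form(x)) for x in xs)
max_residual = max(abs(res)
                   for x in xs
                   for y in feasible_points(x)
                   for res in constraint_residuals(x, y))
```

The absolute value in $J(x)$ comes from choosing the sign of $y_1 + y_2$ opposite to the sign of $1 - 2x$. The kink at $x = 1/2$ is what a polynomial approximation $J_k(x)$ has to resolve.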

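Bilevel Problem of Nonconvex Lower Level

We consider the $\epsilon$-approximation of this case:

$$\begin{aligned}
\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} \quad & f(x, y) \\
\text{such that} \quad & g_i(x, y) \le 0, \quad i = 1, \dots, s, \\
& h_j(y) \le 0, \quad j = 1, \dots, r, \\
& G(x, y) - \min_{w \in \mathbb{R}^m} \{G(x, w) : h_j(w) \le 0,\ j = 1, \dots, r\} \le \epsilon.
\end{aligned}$$

As $\epsilon \to 0$, the approximation gives an increasing lower bound (arXiv:1506.02099).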

Enter a Parametric POP

$$\min_{w \in \mathbb{R}^m} \{G(x, w) : h_j(w) \le 0,\ j = 1, \dots, r\}$$

is a parametric POP. Find $J(x)$, but ensure that the parameter set is compact: add a set of constraints on the coordinates of $x$: $\{M^2 - x_l^2 \ge 0,\ l = 1, \dots, n\}$ for some $M > 0$.

Iterative Algorithm to Solve Bilevel Problem of Nonconvex Lower Level

1 Fix level $k$ of a relaxation and $\epsilon$.
2 Find $J_k(x)$.
3 Is the semialgebraic set $\{(x, y) : g_i(x, y) \le 0,\ i = 1, \dots, s,\ h_j(y) \le 0,\ j = 1, \dots, r,\ G(x, y) - J_k(x) \le \epsilon\}$ non-empty?
4 Solve the polynomial optimization problem:

$$\begin{aligned}
\min_{x \in \mathbb{R}^n,\, y \in \mathbb{R}^m} \quad & f(x, y) \\
\text{such that} \quad & g_i(x, y) \le 0, \quad i = 1, \dots, s, \\
& h_j(y) \le 0, \quad j = 1, \dots, r, \\
& G(x, y) - J_k(x) \le \epsilon.
\end{aligned}$$
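The control flow of these four steps can be sketched on a toy problem. This is not the SDP-based method: the polynomial approximation $J_k(x)$ is replaced by a brute-force grid minimum, and the grid resolution `n` stands in for the relaxation level $k$. The toy functions `f`, `g`, `G`, `h` and every other name are assumptions made for illustration only.

```python
def f(x, y):
    """Upper level objective (toy choice)."""
    return (x - 0.5) ** 2 + (y + 0.5) ** 2


def g(x, y):
    """Upper level constraint, g <= 0 (toy choice)."""
    return x * x - 1.0


def G(x, w):
    """Lower level objective, nonconvex in w with minimizers w = +x and w = -x."""
    return (w * w - x * x) ** 2


def h(w):
    """Lower level constraint, h <= 0 (toy choice)."""
    return w * w - 1.0


def grid(lo, hi, n):
    """n equally spaced points on [lo, hi], endpoints included."""
    return [lo + (hi - lo) * i / (n - 1) for i in range(n)]


def solve_eps_approximation(eps, n=201):
    """Steps 1 to 4 with grid search standing in for the relaxations."""
    xs = grid(-1.0, 1.0, n)
    ws = [w for w in grid(-1.0, 1.0, n) if h(w) <= 0.0]
    # Step 2: value function of the lower level problem on the grid.
    J = {x: min(G(x, w) for w in ws) for x in xs}
    # Step 3: points of the eps-relaxed feasible set.
    feasible = [(x, y) for x in xs for y in ws
                if g(x, y) <= 0.0 and G(x, y) - J[x] <= eps]
    if not feasible:
        return None
    # Step 4: minimize the upper level objective over that set.
    best = min(feasible, key=lambda point: f(*point))
    return best, f(*best)


tight = solve_eps_approximation(1e-6)
loose = solve_eps_approximation(0.1)
```

A larger $\epsilon$ enlarges the feasible set, so its optimal value can only be lower or equal. This is the monotonicity behind the increasing lower bound as $\epsilon \to 0$.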

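Polynomial Optimization Problems of Noncommuting Variables

The generic form is:

$$\begin{aligned}
\inf_{X \in B(H),\, \varphi \in H} \quad & \langle \varphi, p(X) \varphi \rangle \\
\text{such that} \quad & \|\varphi\| = 1, \\
& g_i(X) \succeq 0, \quad i = 1, \dots, r, \\
& \langle \varphi, q_j(X) \varphi \rangle \ge 0, \quad j = 1, \dots, l.
\end{aligned}$$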

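Polynomial Optimization Problems of Noncommuting Variables

The generic form is:

$$\begin{aligned}
\inf \quad & \operatorname{Tr}(p(X) \rho) \\
\text{such that} \quad & \operatorname{Tr}(\rho) = 1, \quad \rho \succeq 0, \\
& g_i(X) \succeq 0, \quad i = 1, \dots, r, \\
& \operatorname{Tr}(q_j(X) \rho) \ge 0, \quad j = 1, \dots, l.
\end{aligned}$$

What is the measure here?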

The Relaxation

We replace the optimization problem by the following SDP:

$$\begin{aligned}
\min_y \quad & \sum_{|w| \le 2k} p_w y_w \\
\text{such that} \quad & M(y) \succeq 0, \\
& M(g_i y) \succeq 0, \quad i = 1, \dots, r, \\
& \sum_{|w| \le 2k} q_{j,w} y_w \ge 0, \quad j = 1, \dots, l.
\end{aligned}$$

Pironio, S.; Navascués, M. & Acín, A. Convergent relaxations of polynomial optimization problems with noncommuting variables. SIAM Journal on Optimization, 2010, 20, 2157-2180. arXiv:0903.4368

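Finding the Optimal Measurement for a Noisy State

$$\begin{aligned}
\max_{\{P_{\alpha\beta}\}} \quad & \sum_{\alpha\beta} P_{\alpha\beta}(\alpha\beta | xy) \\
\text{s.t.} \quad & \sum_{\alpha\beta} P_{\alpha\beta}(ab | xy) = P(ab | xy) \quad \forall a, b, x, y, \\
& P_{\alpha\beta} \text{ is quantum} \quad \forall \alpha, \beta.
\end{aligned}$$

Here $P(ab | xy)$ are observed values for a fixed noisy state and fixed measurement.

O. Nieto-Silleras, S. Pironio, J. Silman. Using complete measurement statistics for optimal device-independent randomness evaluation. New J. Phys. 16, 013035 (2014). arXiv:1309.3930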

New Bell Inequalities

The dual solution provides us with Bell coefficients $f_{abxy}$. Using these, we would like to obtain an updated set of measurements $M$ that minimize the Bell inequality:

$$\min_M \sum_{abxy} f_{abxy} P(ab | xy) \quad \text{s.t.} \quad M_A, M_B \text{ are POVMs.}$$

Solving the Problem

As POP of noncommuting variables?
Parametrizing the measurements?
Seesaw.

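Ground State Energy Problems

$$H_{\text{free}} = \sum_{\langle rs \rangle} \left[ c_r^\dagger c_s + c_s^\dagger c_r - \gamma (c_r^\dagger c_s^\dagger + c_s c_r) \right] - 2\lambda \sum_r c_r^\dagger c_r$$

The fermionic operators are subject to the following constraints:

$$\{c_r, c_s^\dagger\} = \delta_{rs} I_r, \quad \{c_r^\dagger, c_s^\dagger\} = 0, \quad \{c_r, c_s\} = 0.$$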

Bilevel Formulation to Improve Quality

Pauli operators: expensive in the NPA.
Jordan-Wigner transformation: establish connection between ladder operators and Pauli operators.
E.g., the Ising model on a chain in either formulation.
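The connection between ladder and Pauli operators can be checked numerically on a few sites. The sketch below assumes the common convention $c_r = Z^{\otimes (r-1)} \otimes \sigma^{-} \otimes I^{\otimes (n-r)}$ with $\sigma^{-} = (X + iY)/2$, and verifies the canonical anticommutation relations with exact integer arithmetic. The convention and all helper names are assumptions; matrices are plain nested lists.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def dagger(a):
    """Conjugate transpose; every matrix here is real, so this is a transpose."""
    return [list(col) for col in zip(*a)]


def anticommutator(a, b):
    """Return ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


I2 = identity(2)
Z = [[1, 0], [0, -1]]        # Pauli Z
LOWER = [[0, 1], [0, 0]]     # sigma^- = (X + iY) / 2


def jordan_wigner(n_sites):
    """Annihilation operators c_1, ..., c_n as 2^n x 2^n matrices."""
    ops = []
    for r in range(n_sites):
        op = [[1]]
        for factor in [Z] * r + [LOWER] + [I2] * (n_sites - r - 1):
            op = kron(op, factor)
        ops.append(op)
    return ops


def satisfies_car(ops):
    """Check {c_r, c_s^dagger} = delta_rs I and {c_r, c_s} = {c_r^dagger, c_s^dagger} = 0."""
    dim = len(ops[0])
    zero = [[0] * dim for _ in range(dim)]
    for r, c_r in enumerate(ops):
        for s, c_s in enumerate(ops):
            expected = identity(dim) if r == s else zero
            if anticommutator(c_r, dagger(c_s)) != expected:
                return False
            if anticommutator(c_r, c_s) != zero:
                return False
            if anticommutator(dagger(c_r), dagger(c_s)) != zero:
                return False
    return True


operators = jordan_wigner(3)
relations_hold = satisfies_car(operators)
```

The string of $Z$ factors is what turns commuting single-site operators into anticommuting ones, which is why the same chain model can be written in either formulation.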

The Problem of Generating SDPs

Ncpol2sdpa: converts a symbolic description of a (non)commutative polynomial optimization problem to a numerical SDP relaxation. Sparsest possible SDPs.
SDPA: parallel and distributed SDP solver. Arbitrary-precision variant.

Wittek, P. Algorithm 950: Ncpol2sdpa Sparse Semidefinite Programming Relaxations for Polynomial Optimization Problems of Noncommuting Variables. ACM Transactions on Mathematical Software, 2015, 41(3):21. arXiv:1308.6029

Summary

Bilevel optimization has many applications in quantum information theory.
Bilevel SDPs would be great.
POPs of commuting variables have not fared well thus far.
Theory for bilevel POPs of noncommuting variables is lagging behind.