Packing, coding, and ground states: from information theory to physics
Lecture III. Packing and energy minimization bounds in compact spaces
Henry Cohn, Microsoft Research New England

Pair correlations

For simplicity, we'll focus on finite point configurations in $S^{n-1}$. The distance distribution of a finite subset $C$ of $S^{n-1}$ measures how often each distance occurs between pairs of points. For $-1 \le t \le 1$, define
$$A_t = \#\{(x, y) \in C^2 : \langle x, y \rangle = t\}.$$
Recall that $|x - y|^2 = 2 - 2\langle x, y \rangle$, so $A_t$ counts the number of pairs at distance $\sqrt{2 - 2t}$. In physics terms, this is equivalent to the pair correlation function.

We can express the energy for a pair potential $f$ in terms of pair correlations:
$$\sum_{x \ne y} f(|x - y|^2) = \sum_{-1 \le t < 1} f(2 - 2t)\, A_t.$$
(The right side has only finitely many nonzero summands.)

Constraints on pair correlations

Because $\sum_{x \ne y} f(|x - y|^2) = \sum_{-1 \le t < 1} f(2 - 2t)\, A_t$, figuring out how low the energy can be amounts to understanding what the possible pair correlation functions are.

There are some obvious constraints for a configuration with $N$ points: $A_t \ge 0$ for all $t$, $A_1 = N$, and $\sum_t A_t = N^2$. These follow trivially from the definition $A_t = \#\{(x, y) \in C^2 : \langle x, y \rangle = t\}$.

There are also less obvious constraints, such as
$$\sum_t A_t\, t = \sum_{x, y \in C} \langle x, y \rangle = \Big|\sum_{x \in C} x\Big|^2 \ge 0.$$
This is the inequality we used to analyze simplices.
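As a concrete illustration (a minimal sketch, not from the slides), the following Python snippet computes the distance distribution of a configuration and checks these constraints, using the cross-polytope $\{\pm e_1, \pm e_2, \pm e_3\}$ as the example configuration.

```python
import numpy as np
from collections import Counter

def distance_distribution(points, decimals=9):
    """Return {t: A_t} for a configuration on the unit sphere.

    Inner products are rounded to `decimals` places so that equal
    distances are recognized despite floating-point noise.
    """
    gram = points @ points.T
    return Counter(np.round(gram, decimals).flatten().tolist())

# Example: the cross-polytope {±e_1, ±e_2, ±e_3} in R^3 (N = 6 points).
points = np.vstack([np.eye(3), -np.eye(3)])
N = len(points)
A = distance_distribution(points)
print(dict(A))  # {1.0: 6, 0.0: 24, -1.0: 6}

# Check the constraints from the slides.
assert A[1.0] == N
assert sum(A.values()) == N**2
assert all(v >= 0 for v in A.values())
assert sum(t * v for t, v in A.items()) >= -1e-9  # equals |sum of the points|^2
```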

Delsarte linear programming inequalities

Delsarte discovered (in a closely related context) an infinite sequence of linear inequalities generalizing this last one. They use special functions, namely Gegenbauer or ultraspherical polynomials. These are a family $P_k^n$ of polynomials in one variable, with $\deg(P_k^n) = k$, such that
$$\sum_t A_t P_k^n(t) \ge 0$$
for all $k$. In particular, $P_1^n(t) = t$, so we recover the previous inequality, and $P_0^n(t) = 1$.

Equivalently, for every finite set $C \subseteq S^{n-1}$,
$$\sum_{x, y \in C} P_k^n(\langle x, y \rangle) \ge 0.$$
We'll return shortly to what these polynomials are and why they have this property.
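Here is a small numerical check of this inequality (my own sketch, not from the slides). It uses SciPy's Gegenbauer polynomials $C_k^{(\alpha)}$ with $\alpha = (n-2)/2$, which agree with $P_k^n$ up to positive scaling for $n \ge 3$, so the sign of the double sum is unaffected.

```python
import numpy as np
from scipy.special import gegenbauer

rng = np.random.default_rng(0)
n, N = 4, 20  # dimension and number of points (arbitrary choices)

# Random points on S^{n-1}.
points = rng.standard_normal((N, n))
points /= np.linalg.norm(points, axis=1, keepdims=True)
gram = np.clip(points @ points.T, -1.0, 1.0)

# Gegenbauer C_k^{(alpha)} with alpha = (n-2)/2 is a positive multiple of P_k^n.
alpha = (n - 2) / 2
for k in range(1, 8):
    Pk = gegenbauer(k, alpha)
    total = Pk(gram).sum()  # sum over all ordered pairs (x, y) in C^2
    print(f"k = {k}: sum = {total:+.6f}")
    assert total >= -1e-8
```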

Linear programming bounds

Now we can try to minimize $\sum_{-1 \le t < 1} f(2 - 2t)\, A_t$ subject to these inequalities. This is a linear function of the $A_t$, and all our inequalities are linear as well. Thus, we get a lower bound for energy from the following infinite-dimensional linear programming problem:

Find $A_t$ for $-1 \le t \le 1$ to minimize
$$\sum_{-1 \le t < 1} A_t\, f(2 - 2t)$$
subject to:
$A_t \ge 0$ for all $t$,
$A_1 = N$,
$\sum_t A_t = N^2$, and
$\sum_t A_t P_k^n(t) \ge 0$ for $k \ge 1$.
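One can experiment with a discretized version of this LP; the sketch below (my own, with an arbitrary potential, dimension, and grid) uses scipy.optimize.linprog. Restricting the support of $A_t$ to a grid is only a heuristic: a rigorous bound comes from the dual formulation on the next slide, which does not require discretization.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.special import gegenbauer

n, N, K = 3, 6, 10           # dimension, number of points, number of constraints used
f = lambda r: 1.0 / r        # pair potential: f(|x - y|^2) = 1/|x - y|^2

# Grid of inner products t < 1; A_1 = N is fixed and contributes no energy.
ts = np.linspace(-1.0, 0.999, 2000)
alpha = (n - 2) / 2
P = np.array([gegenbauer(k, alpha)(ts) for k in range(1, K + 1)])
P1 = np.array([gegenbauer(k, alpha)(1.0) for k in range(1, K + 1)])

c = f(2.0 - 2.0 * ts)        # objective: sum_{t<1} A_t f(2 - 2t)
A_eq = np.ones((1, ts.size))
b_eq = [N**2 - N]            # sum_{t<1} A_t = N^2 - N
A_ub = -P                    # -sum_{t<1} A_t P_k(t) <= N * P_k(1)
b_ub = N * P1

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
print("LP value (heuristic lower bound on energy):", res.fun)
```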

Linear programming duality

We can formulate the dual linear program, in which we try to prove bounds on energy by taking linear combinations of the constraints. That amounts to the following theorem:

Theorem (Delsarte, ..., Yudin). Suppose $h = \sum_k h_k P_k^n$ with $h_k \ge 0$ for $k \ge 1$, and suppose $h(t) \le f(2 - 2t)$ for $t \in [-1, 1]$. Then every $N$-point configuration $C$ on $S^{n-1}$ satisfies
$$\sum_{x \ne y} f(|x - y|^2) \ge N^2 h_0 - N h(1).$$

In other words, we need a lower bound $h$ for the potential function $f$ such that $h$ has non-negative ultraspherical coefficients. Then we get a lower bound for the $f$-energy.

How do we choose $h$ to optimize this bound? Nobody knows in general, but we can do it in certain special cases.

Theorem (Delsarte, ..., Yudin). Suppose $h = \sum_k h_k P_k^n$ with $h_k \ge 0$ for $k \ge 1$, and suppose $h(t) \le f(2 - 2t)$ for $t \in [-1, 1]$. Then every $N$-point configuration $C$ on $S^{n-1}$ satisfies
$$\sum_{x \ne y} f(|x - y|^2) \ge N^2 h_0 - N h(1).$$

Proof: We have
$$\begin{aligned}
\sum_{x \ne y} f(|x - y|^2) &\ge \sum_{x \ne y} h(\langle x, y \rangle) \\
&= \sum_{x, y} h(\langle x, y \rangle) - N h(1) \\
&= N^2 h_0 - N h(1) + \sum_{k \ge 1} h_k \sum_{x, y} P_k^n(\langle x, y \rangle) \\
&\ge N^2 h_0 - N h(1).
\end{aligned}$$
Q.E.D.
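As a sanity check (my own worked example, not from the slides), take $f(r) = 1/r$ and the regular simplex in $\mathbb{R}^n$, whose distinct points all have inner product $-1/n$. The tangent line to $g(t) = f(2 - 2t)$ at $t = -1/n$ is an admissible $h$ of degree 1, and the resulting bound matches the simplex's energy exactly:

```python
import numpy as np

n = 3                                  # dimension (arbitrary choice)
N = n + 1                              # number of points in the regular simplex
g = lambda t: 1.0 / (2 - 2 * t)        # f(2 - 2t) with f(r) = 1/r
dg = lambda t: 1.0 / (2 * (1 - t)**2)  # derivative of g

t0 = -1.0 / n                          # inner product between distinct simplex points
# Tangent line h(t) = h0 + h1*t: since g is convex on [-1, 1), h <= g there,
# and h1 = g'(t0) >= 0, so h is admissible in the Delsarte--Yudin theorem.
h1 = dg(t0)
h0 = g(t0) - h1 * t0
h = lambda t: h0 + h1 * t

bound = N**2 * h0 - N * h(1.0)
energy = N * (N - 1) * g(t0)           # N(N-1) ordered pairs, all at t = t0
print(bound, energy)                   # both equal n^2 / 2 (here 4.5)
```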

This all rests on the fundamental inequality $\sum_{x, y \in C} P_k^n(\langle x, y \rangle) \ge 0$.

It might seem like an extraordinarily wasteful proof technique, since we are throwing away tons of terms. But in fact $P_k^n(\langle x, y \rangle)$ averages to zero over the whole sphere, so perhaps those terms aren't likely to be so large anyway.

Applying these bounds

LP bounds are behind almost every case in which universal optimality, or indeed any sharp bound on energy, is known.

When could the bound be sharp? We need $f(|x - y|^2) = h(\langle x, y \rangle)$ for all $x, y \in C$ with $x \ne y$, and we need $\sum_{x, y \in C} P_k^n(\langle x, y \rangle) = 0$ for all $k \ge 1$ for which $h_k > 0$.

In practice, we choose $h$ to be a polynomial of as low a degree as possible, subject to agreeing with $t \mapsto f(2 - 2t)$ to order 2 at each inner product that occurs between distinct points in $C$. Then you can check that everything works and treat it as an undeserved miracle, or explain it via the following theorem.
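To illustrate the recipe (again my own example, with $f(r) = 1/r$): for the octahedron $\{\pm e_1, \pm e_2, \pm e_3\}$ in $\mathbb{R}^3$, the inner products between distinct points are $-1$ and $0$, so we take the cubic $h$ tangent to $g(t) = f(2 - 2t)$ at both values. Numerically, its ultraspherical (here Legendre) coefficients are nonnegative and the bound matches the octahedron's energy:

```python
import numpy as np
from numpy.polynomial import polynomial as Ppoly

g = lambda t: 1.0 / (2 - 2 * t)         # f(2 - 2t) with f(r) = 1/r
dg = lambda t: 1.0 / (2 * (1 - t)**2)

# Cubic h with h(t0) = g(t0) and h'(t0) = g'(t0) at t0 = -1 and t0 = 0.
# Solving the four linear conditions gives h(t) = 1/2 + t/2 + 3t^2/8 + t^3/8.
coeffs = np.linalg.solve(
    [[1, -1, 1, -1], [0, 1, -2, 3], [1, 0, 0, 0], [0, 1, 0, 0]],
    [g(-1.0), dg(-1.0), g(0.0), dg(0.0)],
)  # power-basis coefficients [a0, a1, a2, a3] of h
h = lambda t: Ppoly.polyval(t, coeffs)

# Check h <= g on [-1, 1), and that the Legendre coefficients are nonnegative.
ts = np.linspace(-1.0, 0.999, 1000)
assert np.all(h(ts) <= g(ts) + 1e-12)
leg = np.polynomial.legendre.poly2leg(coeffs)
print("Legendre coefficients:", leg)    # [0.625, 0.575, 0.25, 0.05], all >= 0

N = 6
bound = N**2 * leg[0] - N * h(1.0)
energy = 24 * g(0.0) + 6 * g(-1.0)      # 24 ordered pairs at t = 0, 6 at t = -1
print(bound, energy)                    # both equal 13.5: the bound is sharp
```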

Theorem (Cohn and Kumar). Every $m$-distance set that is a spherical $(2m - 1)$-design is universally optimal.

$m$-distance set = set with $m$ distances between distinct points

spherical $k$-design = finite subset $D$ of the sphere $S^{n-1}$ such that for all polynomials $p \colon \mathbb{R}^n \to \mathbb{R}$ of total degree at most $k$, the average of $p$ over $D$ equals the average of $p$ over $S^{n-1}$. (I.e., averaging at these points gives exact numerical integration for polynomials up to degree $k$.)

This theorem handles every known universal optimum except the regular 600-cell.

H. Cohn and A. Kumar, Universally optimal distribution of points on spheres, Journal of the American Mathematical Society 20 (2007), 99–148.
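For instance (continuing the octahedron example; this check is mine, not from the slides), the octahedron is a 2-distance set, and comparing monomial averages confirms that it is a spherical 3-design, so the theorem explains the sharp bound seen above:

```python
import numpy as np
from itertools import combinations_with_replacement
from math import prod

n = 3
D = np.vstack([np.eye(n), -np.eye(n)])   # octahedron: a 2-distance set

def sphere_average_of_monomial(exponents, n):
    """Exact average of x_1^a_1 ... x_n^a_n over the uniform measure on S^{n-1}."""
    if any(a % 2 for a in exponents):
        return 0.0
    s = sum(exponents) // 2
    num = prod(np.arange(1, a, 2).prod() for a in exponents if a)  # (a - 1)!! factors
    den = prod(n + 2 * j for j in range(s))                        # n(n+2)...(n+2s-2)
    return num / den if s else 1.0

# Averaging over D reproduces every monomial of degree <= 3 exactly.
for deg in range(4):
    for combo in combinations_with_replacement(range(n), deg):
        a = [combo.count(i) for i in range(n)]
        design_avg = np.mean(np.prod(D ** np.array(a), axis=1))
        assert np.isclose(design_avg, sphere_average_of_monomial(a, n))
print("The octahedron is a spherical 3-design.")
```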

Packing bounds

Theorem. Suppose $h = \sum_k h_k P_k^n$ with $h_k \ge 0$ for $k \ge 0$ and $h_0 > 0$, and suppose $h(t) \le 0$ for $t \in [-1, \cos\theta]$. Then every configuration $C$ on $S^{n-1}$ with minimal angle at least $\theta$ satisfies
$$|C| \le h(1)/h_0.$$

Proof: We have
$$|C|\, h(1) \ge \sum_{x, y \in C} h(\langle x, y \rangle) = \sum_k h_k \sum_{x, y \in C} P_k^n(\langle x, y \rangle) \ge |C|^2 h_0.$$
Q.E.D.
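As a quick worked example (my own, not stated on the slide): for $\theta = \pi/2$, take $h(t) = t^2 + t = \tfrac{1}{n} P_0^n(t) + P_1^n(t) + \tfrac{n-1}{n} P_2^n(t)$, using $P_2^n(t) = (nt^2 - 1)/(n - 1)$. All coefficients are nonnegative, $h_0 = 1/n > 0$, and $h(t) \le 0$ on $[-1, 0]$, so the theorem gives $|C| \le h(1)/h_0 = 2n$, which is attained by $\{\pm e_1, \dots, \pm e_n\}$.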

So what are ultraspherical polynomials?

They are the orthogonal polynomials with respect to the measure $(1 - t^2)^{(n-3)/2}\, dt$ on $[-1, 1]$. I.e.,
$$\int_{-1}^{1} P_k^n(t)\, P_l^n(t)\, (1 - t^2)^{(n-3)/2}\, dt = 0$$
if $k \ne l$. Equivalently, $P_k^n$ is orthogonal to all polynomials of degree less than $k$ with respect to this measure. This uniquely determines them up to scaling (which is irrelevant for us, as long as we take $P_k^n(1) > 0$).

Just apply Gram-Schmidt orthogonalization to $1, t, t^2, \dots$ to compute them recursively.
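Here is a minimal SymPy sketch of that recursion (the dimension $n = 4$ and the normalization $P_k^n(1) = 1$ are my own arbitrary choices):

```python
import sympy as sp

t = sp.symbols('t')
n = 4                                    # example dimension (assumption)
w = (1 - t**2) ** sp.Rational(n - 3, 2)  # ultraspherical weight on [-1, 1]

def inner(p, q):
    return sp.integrate(p * q * w, (t, -1, 1))

# Gram-Schmidt applied to 1, t, t^2, ..., then rescaled so that P_k^n(1) = 1.
P = []
for k in range(4):
    p = t**k
    for q in P:
        p = sp.expand(p - inner(p, q) / inner(q, q) * q)
    P.append(p)
P = [sp.simplify(p / p.subs(t, 1)) for p in P]
print(P)   # e.g. P_2^4(t) = (4*t**2 - 1)/3 and P_3^4(t) = 2*t**3 - t
```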

Orthogonal polynomials

Orthogonality has many wonderful implications. For example, it shows that $P_k^n$ has $k$ distinct roots in $[-1, 1]$.

To see why, suppose $P_k^n$ changed sign at only $m$ points $r_1, \dots, r_m$ in $[-1, 1]$, with $m < k$. Then $P_k^n(t)(t - r_1) \cdots (t - r_m)$ would never change sign on $[-1, 1]$, which would contradict
$$\int_{-1}^{1} P_k^n(t)(t - r_1) \cdots (t - r_m)(1 - t^2)^{(n-3)/2}\, dt = 0$$
(which holds because $(t - r_1) \cdots (t - r_m)$ has degree less than $k$).

Recall that as a representation of $O(n)$, we can decompose $L^2(S^{n-1})$ as
$$L^2(S^{n-1}) = \bigoplus_{k \ge 0} W_k,$$
where $W_k$ consists of degree $k$ spherical harmonics.

Let $x \in S^{n-1}$, and consider the linear map that takes $f \in W_k$ to $f(x)$. This map must be the inner product with some unique element $w_{k,x}$ of $W_k$, called a reproducing kernel. That is,
$$f(x) = \langle w_{k,x}, f \rangle$$
for all $f \in W_k$.

The function $w_{k,x}$ is invariant under all rotations that fix $x$, since such rotations preserve $f(x)$. Thus, $w_{k,x}(y)$ must be a function of $\langle x, y \rangle$ alone. We define $P_k^n$ by $w_{k,x}(y) = P_k^n(\langle x, y \rangle)$.
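For $n = 3$ this is the classical addition theorem for spherical harmonics: $\sum_m Y_{km}(x)\overline{Y_{km}(y)} = \frac{2k+1}{4\pi} P_k(\langle x, y \rangle)$, where $P_k$ is the Legendre polynomial. A quick numerical check (my own sketch, using SciPy's angle conventions for sph_harm) follows:

```python
import numpy as np
from scipy.special import sph_harm, eval_legendre

rng = np.random.default_rng(1)

def angles(v):
    """Azimuthal and polar angles of a unit vector in R^3 (SciPy's convention)."""
    theta = np.arctan2(v[1], v[0])   # azimuth
    phi = np.arccos(v[2])            # polar angle in [0, pi]
    return theta, phi

x, y = rng.standard_normal((2, 3))
x /= np.linalg.norm(x); y /= np.linalg.norm(y)

k = 4
kernel = sum(sph_harm(m, k, *angles(x)) * np.conj(sph_harm(m, k, *angles(y)))
             for m in range(-k, k + 1))
legendre = (2 * k + 1) / (4 * np.pi) * eval_legendre(k, x @ y)
print(kernel.real, legendre)   # the two agree; the imaginary part is ~0
```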

The reproducing kernel $w_{k,x}$ is a polynomial of degree $k$ (because it is a spherical harmonic), and thus so is $P_k^n$. Furthermore, $w_{k,x}$ and $w_{l,x}$ are orthogonal in $L^2(S^{n-1})$ for $k \ne l$, since they are spherical harmonics of different degrees. Thus,
$$\int_{S^{n-1}} P_k^n(\langle x, y \rangle)\, P_l^n(\langle x, y \rangle)\, d\mu(y) = 0,$$
where $\mu$ is surface measure.

If we orthogonally project from the surface of the sphere onto the axis from $-x$ to $x$, then $\mu$ projects to a constant times the measure $(1 - t^2)^{(n-3)/2}\, dt$ on $[-1, 1]$. (This is a good multivariable calculus exercise.)

Thus, we have recovered all the properties of ultraspherical polynomials we needed except for the fundamental inequality $\sum_{x, y \in C} P_k^n(\langle x, y \rangle) \ge 0$.

As a side comment, we can now see that $W_k$ is an irreducible representation of $O(n)$. If it broke up further, then each summand would have its own reproducing kernel, which would give two different polynomials of degree $k$ that would be orthogonal to each other as well as to all lower-degree polynomials. That is impossible (the space of polynomials of degree at most $k$ has dimension too low to contain that).

The fundamental inequality

Recall that the reproducing kernel property means $\langle w_{k,x}, f \rangle = f(x)$ for all $f \in W_k$. In particular, taking $f = w_{k,y}$ yields
$$\langle w_{k,x}, w_{k,y} \rangle = w_{k,y}(x).$$
Recall also that $w_{k,y}(x) = P_k^n(\langle x, y \rangle)$. Now we have
$$\sum_{x, y \in C} P_k^n(\langle x, y \rangle) = \sum_{x, y \in C} w_{k,y}(x) = \sum_{x, y \in C} \langle w_{k,x}, w_{k,y} \rangle = \Big\|\sum_{x \in C} w_{k,x}\Big\|^2 \ge 0.$$

This is a perfect generalization of $\big|\sum_{x \in C} x\big|^2 \ge 0$, except instead of summing the vectors $x$, we are summing the vectors $w_{k,x}$ in the Hilbert space $W_k$.

One interpretation is that $x \mapsto w_{k,x}$ maps $S^{n-1}$ into the unit sphere in the higher-dimensional space $W_k$, and we're combining the trivial inequality $\big\|\sum_{x \in C} w_{k,x}\big\|^2 \ge 0$ with that nontrivial mapping.

When $n = 2$, the space $W_k$ has dimension 2 for $k \ge 1$, so we are mapping $S^1$ to itself. This map wraps $S^1$ around itself $k$ times, while the analogues for $n \ge 3$ are more subtle.

Do ultraspherical polynomials span all the functions $P$ satisfying $\sum_{x, y \in C} P(\langle x, y \rangle) \ge 0$ for all $C$?

No; see F. Pfender, Improved Delsarte bounds for spherical codes in small dimensions, J. Combin. Theory Ser. A 114 (2007), 1133–1147.

However, they span all the positive-definite kernels: functions $P$ such that for all $x_1, \dots, x_N \in S^{n-1}$, the $N \times N$ matrix with entries $P(\langle x_i, x_j \rangle)$ is positive semidefinite.

I. J. Schoenberg, Positive definite functions on spheres, Duke Math. J. 9 (1942), 96–108.
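As a quick numerical illustration (my own check, not from the slides), the matrix $\big(P_k^n(\langle x_i, x_j \rangle)\big)_{i,j}$ indeed has nonnegative eigenvalues for random points:

```python
import numpy as np
from scipy.special import gegenbauer

rng = np.random.default_rng(2)
n, N, k = 5, 30, 3                     # dimension, points, degree (arbitrary choices)

points = rng.standard_normal((N, n))
points /= np.linalg.norm(points, axis=1, keepdims=True)

# Gegenbauer C_k^{(alpha)} with alpha = (n-2)/2 is a positive multiple of P_k^n.
M = gegenbauer(k, (n - 2) / 2)(points @ points.T)
eigenvalues = np.linalg.eigvalsh(M)
print(eigenvalues.min())               # >= 0 up to roundoff: the kernel is PSD
```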

Semidefinite programming bounds

Generalizations put semidefinite constraints on higher correlation functions.

A. Schrijver, New code upper bounds from the Terwilliger algebra and semidefinite programming, IEEE Transactions on Information Theory 51 (2005), 2859–2866.

C. Bachoc and F. Vallentin, New upper bounds for kissing numbers from semidefinite programming, Journal of the American Mathematical Society 21 (2008), 909–924.

H. Cohn and J. Woo, Three-point bounds for energy minimization, Journal of the American Mathematical Society 25 (2012), 929–958.

D. de Laat and F. Vallentin, A semidefinite programming hierarchy for packing problems in discrete geometry, arXiv:1311.3789.

For more information

Papers are available from http://research.microsoft.com/~cohn

Specifically, see Order and disorder in energy minimization: http://arxiv.org/abs/1003.3053