Embeddings of finite metric spaces in Euclidean space: a probabilistic view


Embeddings of finite metric spaces in Euclidean space: a probabilistic view. Yuval Peres, May 11, 2006. Talk based on joint work with Assaf Naor, Oded Schramm and Scott Sheffield.

Definition: An invertible mapping f : X → Y, where (X, d_X) and (Y, d_Y) are metric spaces, is a C-embedding if there exists a number r > 0 such that for all x, y ∈ X,

r · d_X(x, y) ≤ d_Y(f(x), f(y)) ≤ C · r · d_X(x, y).

The infimum of the numbers C such that f is a C-embedding is called the distortion of f and is denoted dist(f). Equivalently, dist(f) = ‖f‖_Lip · ‖f⁻¹‖_Lip, where

‖f‖_Lip = sup { d_Y(f(x), f(y)) / d_X(x, y) : x, y ∈ X, x ≠ y }.
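For a finite metric space both suprema are maxima over pairs of points, so the distortion can be computed directly. A minimal sketch (the helper `distortion` and the cycle-to-line example are illustrations, not from the talk):

```python
import itertools

def distortion(points, d_X, d_Y, f):
    # dist(f) = ||f||_Lip * ||f^-1||_Lip; on a finite space both are maxima
    pairs = list(itertools.combinations(points, 2))
    lip = max(d_Y(f(x), f(y)) / d_X(x, y) for x, y in pairs)
    lip_inv = max(d_X(x, y) / d_Y(f(x), f(y)) for x, y in pairs)
    return lip * lip_inv

# identity map from the 4-cycle (graph metric) into the real line
points = [0, 1, 2, 3]
d_cycle = lambda x, y: min(abs(x - y), 4 - abs(x - y))
d_line = lambda x, y: abs(x - y)
print(distortion(points, d_cycle, d_line, lambda x: x))  # 3.0
```

The pair (0, 3) is at distance 1 on the cycle but distance 3 on the line, which is exactly where the distortion is attained.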

Example. Theorem (Enflo, 1969): Let Ω_k = {0, 1}^k be the k-dimensional hypercube with the ℓ_1 metric. Then any f : Ω_k → L_2 has distortion at least √k. Remark: This is tight, as is easily seen by taking the identity map from Ω_k to ℓ_2^k. Enflo's proof is algebraic, and is hence fragile: if one edge of the cube is removed, the proof breaks down.
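The tightness remark can be checked by brute force: under the identity map, the ℓ_2 distance between two cube points is the square root of their Hamming (ℓ_1) distance, so ‖f‖_Lip = 1 and ‖f⁻¹‖_Lip = √k. A quick sketch (the value k = 4 is an arbitrary choice):

```python
import itertools, math

k = 4
cube = list(itertools.product([0, 1], repeat=k))
l1 = lambda x, y: sum(abs(a - b) for a, b in zip(x, y))
l2 = lambda x, y: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

pairs = list(itertools.combinations(cube, 2))
lip = max(l2(x, y) / l1(x, y) for x, y in pairs)      # = 1, attained at h = 1
lip_inv = max(l1(x, y) / l2(x, y) for x, y in pairs)  # = sqrt(k), antipodal pair
print(lip * lip_inv, math.sqrt(k))  # both are 2.0 for k = 4
```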

Theorem (Bourgain, 1985): Every n-point metric space (X, d) can be embedded in Euclidean space with O(log n) distortion. Remark: Any embedding of an expander graph family into Euclidean space has distortion at least c log n (Linial, London and Rabinovich, 1995). Proof idea for Bourgain's theorem: For each cardinality k < n which is a power of 2, randomly pick α log n sets A ⊆ X independently, by including each x ∈ X in A with probability 1/k. We have thus drawn O(log² n) sets A_1, ..., A_{O(log² n)}. Map every point x ∈ X to the vector (1/log n)(d(x, A_1), d(x, A_2), ...). This mapping has distortion O(log n).
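A sketch of this random construction (the helper name `bourgain_embedding`, the constant α and the 1/log n normalization are taken at face value from the slide; this is an illustration, not a tuned implementation). Each coordinate x ↦ d(x, A) is 1-Lipschitz by the triangle inequality, which is the easy half of the distortion bound:

```python
import math, random

def bourgain_embedding(points, d, alpha=2.0, seed=0):
    rng = random.Random(seed)
    n = len(points)
    reps = max(1, round(alpha * math.log(n)))
    sets = []
    k = 2
    while k < n:  # one scale per power of two below n
        for _ in range(reps):
            A = [p for p in points if rng.random() < 1.0 / k]
            sets.append(A)
        k *= 2
    scale = 1.0 / math.log(n)
    def embed(x):  # each coordinate is a scaled distance to a random set
        return [scale * min((d(x, a) for a in A), default=0.0) for A in sets]
    return embed

# toy example: 16 points on a line with metric |x - y|
pts = list(range(16))
emb = bourgain_embedding(pts, lambda x, y: abs(x - y))
```

Since |d(x, A) − d(y, A)| ≤ d(x, y) for every set A, no coordinate of the embedding expands distances; the substance of Bourgain's proof is the matching lower bound on contraction.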

Theorem (Bourgain, 1986): There is no bounded-distortion embedding of the infinite binary tree into a Hilbert space. More precisely, any embedding of the binary tree of depth M, with n = 2^{M+1} − 1 vertices, into a Hilbert space has distortion Ω(√(log M)) = Ω(√(log log n)).

The Markov type of metric spaces. A Markov chain {Z_t}_{t≥0} with transition probabilities p_ij := Pr(Z_{t+1} = j | Z_t = i) on the state space {1, ..., n} is stationary if π_i := P(Z_t = i) does not depend on t, and it is (time) reversible if π_i p_ij = π_j p_ji for every i, j ∈ {1, ..., n}. Definition (Ball, 1992): Given a metric space (X, d), we say that X has Markov type 2 if there exists a constant M > 0 such that for every stationary reversible Markov chain {Z_t}_{t≥0} on {1, ..., n}, every mapping f : {1, ..., n} → X and every time t ∈ N,

E d(f(Z_t), f(Z_0))² ≤ M² t E d(f(Z_1), f(Z_0))².
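Reversibility is exactly the detailed-balance condition, which is easy to check numerically. A small sketch (the chains below are illustrative choices, not from the talk):

```python
import itertools

def is_reversible(P, pi, tol=1e-12):
    # detailed balance: pi_i * p_ij == pi_j * p_ji for all states i, j
    n = len(P)
    return all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) <= tol
               for i, j in itertools.product(range(n), repeat=2))

# simple random walk on the path 0 - 1 - 2; stationary pi is proportional
# to the degrees, and the walk is reversible
P = [[0.0, 1.0, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 1.0, 0.0]]
pi = [0.25, 0.5, 0.25]
print(is_reversible(P, pi))  # True

# a deterministic 3-cycle is stationary for the uniform distribution
# but not reversible
P_cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
print(is_reversible(P_cycle, [1 / 3] * 3))  # False
```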

Theorem (Ball, 1992): R has Markov type 2 with constant M = 1. Proof: Let P = (p_ij) be the transition matrix of the Markov chain. Time reversibility is equivalent to the assertion that P is a self-adjoint operator on L²(π); hence L²(π) has an orthogonal basis of eigenfunctions of P with real eigenvalues. Also, since P is a stochastic matrix, ‖Pf‖ ≤ ‖f‖, and thus if λ is an eigenvalue of P then |λ| ≤ 1. We have

E d(f(Z_t), f(Z_0))² = Σ_{i,j} π_i p_ij^{(t)} [f(i) − f(j)]² = 2⟨(I − P^t)f, f⟩,

and also E d(f(Z_1), f(Z_0))² = 2⟨(I − P)f, f⟩.

So we are left to prove that ⟨(I − P^t)f, f⟩ ≤ t⟨(I − P)f, f⟩. Indeed, if f is an eigenfunction with eigenvalue λ this reduces to proving (1 − λ^t) ≤ t(1 − λ). Since |λ| ≤ 1, this reduces to 1 + λ + ⋯ + λ^{t−1} ≤ t, which is obviously true. For any other f, write f = Σ_{j=1}^n a_j f_j where {f_j} is an orthonormal basis of eigenfunctions; then

⟨(I − P^t)f, f⟩ = Σ_{j=1}^n a_j² ⟨(I − P^t)f_j, f_j⟩ ≤ Σ_{j=1}^n a_j² t ⟨(I − P)f_j, f_j⟩ = t⟨(I − P)f, f⟩.

Corollary: Any Hilbert space has Markov type 2 with constant M = 1.
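The eigenvalue inequality at the heart of the proof can be spot-checked numerically (a sanity check, not a proof):

```python
import random

# 1 - lam**t <= t*(1 - lam) for |lam| <= 1 is the eigenvalue form of
# <(I - P^t)f, f> <= t <(I - P)f, f>: it holds because
# 1 - lam**t = (1 - lam)(1 + lam + ... + lam**(t-1)) and the sum is at most t.
rng = random.Random(0)
for _ in range(1000):
    lam = rng.uniform(-1.0, 1.0)
    for t in range(1, 25):
        assert 1 - lam ** t <= t * (1 - lam) + 1e-9
print("inequality verified on random samples")
```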

Corollary: Any embedding of the hypercube {0, 1}^k into Hilbert space has distortion at least √k/4. Proof: Let {X_j} be the simple random walk on the hypercube. It is easy to see that E d(X_0, X_j) ≥ j/2 for j ≤ k/4, which by Jensen's inequality implies E d²(X_0, X_j) ≥ j²/4. Let f : {0, 1}^k → L_2 be a map. Assume without loss of generality that f is non-expanding, i.e., ‖f‖_Lip = 1 (otherwise take f / ‖f‖_Lip). Since L_2 has Markov type 2 with constant M = 1, for any j we have E d²[f(X_0), f(X_j)] ≤ j. Taking j = k/4, together this yields ‖f⁻¹‖_Lip ≥ √k/4.
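The speed estimate E d(X_0, X_j) ≥ j/2 for j ≤ k/4 can be verified exactly: each step of the walk flips one uniformly random coordinate, and a coordinate differs from its starting value iff it has been flipped an odd number of times, which happens with probability (1 − (1 − 2/k)^j)/2. A sketch (the value k = 100 is an arbitrary choice):

```python
def expected_distance(k, j):
    # exact E d(X_0, X_j) for the simple random walk on the cube {0,1}^k:
    # k coordinates, each differing with probability (1 - (1 - 2/k)**j) / 2
    return k * (1 - (1 - 2.0 / k) ** j) / 2

k = 100
for j in range(1, k // 4 + 1):
    assert expected_distance(k, j) >= j / 2  # the speed bound used above
print("speed bound holds for all j <= k/4")
```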

Corollary: Any embedding of an (n, d, λ)-expander family has distortion at least C_{d,λ} log n. Proof: Let {X_j} be the simple random walk on the expander, with transition matrix P. Let g = 1 − λ, take α > 0 with gα < 1, and set t = α log n; it can be shown that P^t(x, y) ≤ 2e^{−(1−λ)α log n}. Fix γ > 0 small enough that d^γ e^{−(1−λ)α} < 1. We wish to show that up to time t = α log n, the random walk on the expander has positive speed. Indeed, for any x ∈ V, since the ball B(x, γ log n) of radius γ log n around x has at most d^{γ log n} vertices, it follows that

P_x[X_t ∈ B(x, γ log n)] ≤ d^{γ log n} · 2e^{−(1−λ)α log n} → 0.

This in turn implies that for large enough n,

E d²(X_0, X_t) > γ² log² n / 2.

Let f : V → L_2, and assume without loss of generality that ‖f‖_Lip = 1 (otherwise take f / ‖f‖_Lip). We proved in the previous theorem that

E d(f(X_t), f(X_0))² ≤ (1 + λ + λ² + ⋯ + λ^{t−1}) E d(f(X_1), f(X_0))².

This immediately implies that

E d²(f(X_0), f(X_t)) ≤ 1/(1 − λ).

Together this implies ‖f⁻¹‖_Lip ≥ γ √((1 − λ)/2) · log n.

In similar ways one can prove that if a family of graphs has all degrees at least 3 and girth (the length of the shortest cycle) at least g, then any embedding into Hilbert space has distortion at least Ω(√g). So if the girth is at least c log n we cannot embed with distortion smaller than c′√(log n), but we can with distortion O(log n) (Bourgain's theorem). Open question: What is the minimal possible distortion for an embedding of an n-vertex graph of girth g = c log n in Euclidean space?

Theorem (Naor, P., Schramm, Sheffield, 2004): L_p for p > 2, trees, and hyperbolic groups have Markov type 2. Open question: Do planar graphs (with the graph distance) have Markov type 2?

Key ideas in proofs. Stationary reversible Markov chains in R (and more generally, in any normed space) are a difference of two martingales: a forward martingale and a backward martingale; the squared norms of the martingale increments can be bounded using the original increments. For martingales, powerful inequalities due to Doob and Pisier are available. On a tree, the length of a path can be bounded by twice the difference between the maximum and the minimum distance to the root along the path. We will use a decomposition of a stationary reversible Markov chain into forward and backward martingales (inspired by a decomposition due to Lyons and Zhang for stochastic integrals).

A central Lemma. Lemma: Let {Z_t}_{t≥0} be a stationary time-reversible Markov chain on {1, ..., n} and f : {1, ..., n} → R. Then, for every time t > 0,

E max_{0≤s≤t} [f(Z_s) − f(Z_0)]² ≤ 15 t E[f(Z_1) − f(Z_0)]².

Proof: Let P : L²(π) → L²(π) be the Markov operator, i.e. (Pf)(i) = E[f(Z_{s+1}) | Z_s = i]. For any s ∈ {0, ..., t − 1} let D_s = f(Z_{s+1}) − (Pf)(Z_s), and D̃_s = f(Z_{s−1}) − (Pf)(Z_s). The D_s are martingale differences with respect to the natural filtration of Z_1, ..., Z_t, and the D̃_s, because of reversibility, are martingale differences with respect to the natural filtration of Z_t, ..., Z_1.
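The lemma's inequality can be checked exactly on a small chain by enumerating all trajectories together with their stationary probabilities (the chain and the test function below are illustrative choices):

```python
import itertools

P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]    # lazy walk on a 3-path, reversible
pi = [0.25, 0.5, 0.25]   # its stationary distribution (detailed balance holds)
f = [0.0, 1.0, 3.0]      # an arbitrary test function on the states
t = 4

lhs = rhs = 0.0
for path in itertools.product(range(3), repeat=t + 1):
    prob = pi[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= P[a][b]
    lhs += prob * max((f[s] - f[path[0]]) ** 2 for s in path)
    rhs += prob * (f[path[1]] - f[path[0]]) ** 2

print(lhs <= 15 * t * rhs)  # True
```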

Subtracting,

f(Z_{s+1}) − f(Z_{s−1}) = D_s − D̃_s.

Thus for any m,

f(Z_{2m}) − f(Z_0) = Σ_{k=1}^m D_{2k−1} − Σ_{k=1}^m D̃_{2k−1}.

So,

max_{0≤s≤t} |f(Z_s) − f(Z_0)| ≤ max_{m≤t/2} |Σ_{k=1}^m D_{2k−1}| + max_{m≤t/2} |Σ_{k=1}^m D̃_{2k−1}| + max_{l≤t/2} |f(Z_{2l+1}) − f(Z_{2l})|.

Take squares and use the fact that (a + b + c)² ≤ 3(a² + b² + c²), which is implied by the Cauchy-Schwarz inequality, to get

max_{0≤s≤t} |f(Z_s) − f(Z_0)|² ≤ 3 max_{m≤t/2} (Σ_{k=1}^m D_{2k−1})² + 3 max_{m≤t/2} (Σ_{k=1}^m D̃_{2k−1})² + 3 max_{l≤t/2} |f(Z_{2l+1}) − f(Z_{2l})|².

We will use Doob's L² maximum inequality for martingales (see, e.g., Durrett 1996):

E max_{0≤s≤t} M_s² ≤ 4 E M_t².

Consider M_{s+1} = Σ_{j≤s, j odd} D_j. Since M_s is still a martingale, we have

E max_{0≤s≤t} |f(Z_s) − f(Z_0)|² ≤ 12 E Σ_{k=1}^{t/2} D_{2k−1}² + 12 E Σ_{k=1}^{t/2} D̃_{2k−1}² + 3 Σ_{0≤l≤t/2} E |f(Z_{2l+1}) − f(Z_{2l})|².

Denote V = E[|f(Z_1) − f(Z_0)|²], and notice that

D_0 = f(Z_1) − f(Z_0) − E[f(Z_1) − f(Z_0) | Z_0],

which implies that D_0 is orthogonal to E[f(Z_1) − f(Z_0) | Z_0] in L²(π). So, by the Pythagorean law, for any s we have E[D_s²] = E[D_0²] ≤ V. Summing everything up gives

E max_{0≤s≤t} |f(Z_s) − f(Z_0)|² ≤ 6tV + 6tV + 3(t/2 + 1)V ≤ 15tV,

which concludes the proof of the Lemma.

Trees have Markov Type 2. Theorem (Naor, P., Schramm, Sheffield, 2004): Trees have Markov type 2. Proof: Let T be a weighted tree, {Z_j} a stationary reversible Markov chain on {1, ..., n}, and F : {1, ..., n} → T. Choose an arbitrary root and set ψ(v) = d(root, v) for every vertex v. If v_0, ..., v_t is a path in the tree, then

d(v_0, v_t) ≤ max_{0≤j≤t} ( |ψ(v_0) − ψ(v_j)| + |ψ(v_t) − ψ(v_j)| ),

since choosing the vertex on the path closest to the root yields equality.
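This path bound is easy to verify by brute force on a small tree: in a tree, the vertices on the unique path between u and w are exactly those v with d(u, v) + d(v, w) = d(u, w). A sketch (the tree below is an arbitrary example, rooted at 0):

```python
from collections import deque
import itertools

edges = [(0, 1), (1, 2), (1, 3), (0, 4), (4, 5)]
adj = {}
for a, b in edges:
    adj.setdefault(a, []).append(b)
    adj.setdefault(b, []).append(a)

def bfs(src):  # unweighted distances from src
    d = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in d:
                d[v] = d[u] + 1
                q.append(v)
    return d

dist = {v: bfs(v) for v in adj}
psi = dist[0]  # psi(v) = distance to the root 0

for u, w in itertools.combinations(adj, 2):
    # vertices on the unique u-w path in the tree
    on_path = [v for v in adj if dist[u][v] + dist[v][w] == dist[u][w]]
    bound = max(abs(psi[u] - psi[v]) + abs(psi[w] - psi[v]) for v in on_path)
    assert dist[u][w] <= bound  # equality holds at the vertex nearest the root
print("path bound verified on all pairs")
```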

Let X_j = F(Z_j). Connect X_i to X_{i+1} by the shortest path for each 0 ≤ i ≤ t − 1 to get a path between X_0 and X_t. Since now the vertex closest to the root can lie on any of the shortest paths between X_j and X_{j+1}, we get

d(X_0, X_t) ≤ max_{0≤j<t} ( |ψ(X_0) − ψ(X_j)| + |ψ(X_t) − ψ(X_j)| + 2 d(X_j, X_{j+1}) ).

Square, and use Cauchy-Schwarz again:

d(X_0, X_t)² ≤ 3 max_{0≤j≤t} ( |ψ(X_0) − ψ(X_j)|² + |ψ(X_t) − ψ(X_j)|² ) + 12 Σ_{0≤j<t} d²(X_j, X_{j+1}).

By our central Lemma applied with f = ψ ∘ F we get

E d(X_0, X_t)² ≤ 90 t E|ψ(X_0) − ψ(X_1)|² + 6 Σ_{0≤j<t} E d²(X_j, X_{j+1}).

Since in any metric space |ψ(X_1) − ψ(X_0)| ≤ d(X_0, X_1), and since the Markov chain is stationary, we have E d²(X_j, X_{j+1}) = E d²(X_0, X_1) for every j. So

E d(X_0, X_t)² ≤ 96 t E d(X_0, X_1)²,

which concludes our proof.