Metric structures in L : Dimension, snowflakes, and average distortion James R. Lee U.C. Berkeley and Microsoft Research jrl@cs.berkeley.edu Assaf Naor Microsoft Research anaor@microsoft.com Manor Mendel Hebrew University mendelma@cs.huji.ac.il Abstract We study the metric properties of finite subsets of L. The analysis of such metrics is central to a number of important algorithmic problems involving the cut structure of weighted graphs, including the Sparsest Cut Problem, one of the most compelling open problems in the field of approximation. We present some new observations concerning the relation of L to dimension, topology, and Euclidean distortion. In particular, we offer new insights into the four main open problems surrounding the metric structure of L.
Introduction This paper is devoted to the analysis of metric properties of finite subsets of L. Such metrics occur in many important algorithmic contexts, and their analysis is key to progress on some fundamental problems. For instance, an Olog n)-approximate max-flow/min-cut theorem proved elusive for many years until, in [5, ], it was shown to follow from a theorem of Bourgain stating that every metric on n points embeds into L with distortion Olog n). The importance of L metrics has given rise to many problems and conjectures that have attracted a lot of attention in recent years. Four basic problems of this type are as follows. I. Is there an L analog of the Johnson-Lindenstrauss dimension reduction lemma [9]? II. Are all n-point subsets of L O log n ) -embeddable into Hilbert space? III. Are all squared-l metrics O)-embeddable into L? IV. Are all planar graphs O)-embeddable into L? Each of these questions has been asked many times before; we refer to [8, 9, 4, 8], in particular. Despite an immense amount of interest and effort, the metric properties of L have proved quite elusive; hence the name The mysterious L appearing in a survey of Linial at the ICM in 00 [4]. In this paper, we offer new insights into each of the above problems and touch on some relationships between them.. Results and techniques Dimension reduction. In [3], and soon after in [3], it was shown that if the Newman- Rabinovich diamond graph on n vertices α-embeds into l d then d nω/α). The proof in [3] is based on a linear programming argument, while the proof in [3] uses a geometric argument which reduces the problem to bounding from below the distortion required to embed the diamond graph in l p, < p <. These results settle the long standing open problem whether there is an L analog of the Johnson-Lindenstrauss dimension reduction lemma [9]. In other words, they show that the answer to problem I) above is No.). In Section, we show that the method of proof in [3] can be used to provide an even more striking counter example to this problem. A metric space X is called doubling with constant C if every ball in X can be covered by C balls of half the radius. Doubling metrics with bounded doubling constants are widely viewed as low dimensional see [6, 0] for some practical and theoretical applications of this viewpoint). In fact, they have bounded Assouad dimension see [7] for the definition). On the other hand, the doubling constant of the diamond graphs is Ω n) where n is the number of points). Based on a fractal construction due to Laakso [] and the method developed in [3], we prove the following theorem, which shows a strong lower bound on the dimension required to represent uniformly doubling subsets of L. Theorem.. There are arbitrarily large n-point subsets X L which are doubling with constant 6 but such that every α-embedding of X into l d requires d nω/α). In [, 6] it was asked whether any subset of l which is doubling well-embeds into l d with bounds on the distortion and the dimension that depend only on the doubling constant). In [6], it was shown that a similar property cannot hold for l. Our lower bound exponentially strengthens this result.
Planar metrics. The next result addresses problems III) and IV). Our motivation was an attempt to generalize the argument in [3] to prove that dimension reduction is impossible in L p for any < p <. A natural approach to this problem is to consider the point set used in [3, 3] namely, a natural realization of the diamond graph, G, in L ) with the metric induced by the L p norm instead of the L norm. This is easily seen to amount to proving lower bounds on the dimension required to embed the metric space G, d /p G ) in ld p. Unfortunately, this approach cannot work since we show that, for any planar metric X, d) and any 0 < ε <, the metric space X, d ε ) embeds in Hilbert space with distortion O / ε). The proof of this interesting fact is a straightforward application of Assouad s classical embedding theorem [] and Rao s embedding method [3]. The O / ε) upper bound is shown to be tight for every value 0 < ε <. The case ε = / has been previously observed by A. Gupta in his unpublished) thesis. It follows that any planar metric embeds into squared-l with O) distortion so that a positive solution to problem III) above implies a positive solution to problem IV). Euclidean distortion. Our final result addresses problem II) stated above and for more subtle reasons, problem III)). We show that the answer to this question is positive on average, in the following sense. Theorem.. For every f,..., f n L there is a linear operator T : L L such that ) T fi ) T f j ) /3 min ) T fi ) T f j ) /3 8 log n) /3 i<j n f i f j n 0. ) f i f j i<j n In other words, for any n-point subset in L, there exists a map into L such that distances are contracted by at most O log n) and the average expansion is O). We remark that a different notion of average embedding was recently studied by Rabinovich []. In [] one tries to embed planar) metrics into the line such that the average distance does not change too much. The exponent /3 above has no significance, and we can actually obtain the same result for any power ε, ε > 0 we refer to Section 4 for details). The proof of Theorem. follows from the following probabilistic lemma, which is implicit in [6]. We believe that this result is of independent interest. Lemma.3. There exists a distribution over linear mappings T : L L such that for every x L \ {0} the random variable T x) x ) x π. has density e /4x In contrast to Theorem., we show that problem II) cannot be resolved positively using linear mappings. Specifically, we show that there are arbitrarily large n-point subsets of L such that any linear embedding of them into L incurs distortion Ω n). As a corollary we settle the problem left open by Charikar and Sahai in [4], whether linear dimension reduction is possible in L p, p / {, }. The case p = was proved in [4] via linear programming techniques, and it seems impossible to generalize their lower bound to arbitrary L p. We show that there are arbitrarily large n-point subsets X L p namely, the same point set used in [4] to handle the case p = ), such that any linear embedding of X into l d p incurs distortion Ω [ n/d) /p / ], thus linear dimension reduction is impossible in any L p, p. Additionally, we show that there are arbitrarily large n-point subsets X L such any linear embedding of X into any n/d ) d-dimensional normed space incurs distortion Ω. This generalizes the Charikar-Sahai result to arbitrary low dimensional norms. 3
An inherently high-dimensional doubling metric in L This section is devoted to the proof of Theorem.. The case p = of the proof below reduces to the argument in [,, ]. Proof of Theorem.. Consider the Laakso graphs, {G i } i=0, which are defined as follows. G 0 is the graph on two vertices with one edge. To construct G i, take six copies of G i and scale their metric by a factor of 4. We glue four of them cyclicly by identifying pairs of endpoints, and attach at two opposite gluing points the remaining two copies. See Figure below. G 0 G G G 3 Figure : The Laakso graphs. As shown in [], the graphs {G i } i=0 are uniformly doubling see also [], for a simple argument showing they are doubling with constant 6). Moreover, since the G i s are series parallel graphs, they embed uniformly in L see [5]). We will show below that any embedding of G i in L p, < p incurs distortion at least + p 4 i. We then conclude as in [3] by observing that ld is 3-isomorphic to ld p when p = + log d, so that if G i embeds with distortion α in l d then α i 40 log d. This implies the required result since i log G i. The proof of the lower bound for the distortion required to embed G i into L p is by induction on i. We shall prove by induction that whenever f : G i L p is non-contracting then there exist two adjacent vertices u, v G i such that fu) fv) p d Gi u, v) + p 4 i observe that for u, v G i, d Gi u, v) = d Gi u, v)). For i = 0 there is nothing to prove. For i, since G i contains an isometric copy of G i, there are u, v G i corresponding to two adjacent vertices in G i such that fu) fv) p d Gi u, v) + p 4 i ). Let a, b be the two 4
midpoints between u and v in G i. By Lemma. in [3], fu) fv) p + p ) fa) fb) p fu) fa) p + fa) fv) p + fv) fb) p + fb) fu) p. Hence: max{ fu) fa) p, fa) fv) p, fv) fb) p, fb) fu) p} 4 fu) fv) p + 4 p ) fa) fb) p + p ) i ) d Gi u, v) + p 4 4 4 d G i a, b) = + p ) 4 4 i d Gi u, v) = + p ) 4 i max{d Gi u, a), d Gi a, v), d Gi v, b), d Gi b, u) }. We end this section by observing that the above approach also gives a lower bound on the dimension required to embed expanders in l. Proposition.. Let G be an n-point constant degree expander which embeds in l d with distortion at most α. Then d n Ω/α). Proof. By Matoušek s lower bound for the distortion ) required to embed expanders in l p [7], any embedding of G into l p incurs distortion Ω. Since l d is O)-equivalent to l d log d, we deduce that α Ω log n log d ). log n p We can also obtain a lower bound on the dimension required to embed the Hamming cube {0, } k into l. Our proof uses a simple concentration argument. An analogous concentration argument yields an alternative proof of Proposition.. Proposition.. Assume that {0, } k embeds into l d with distortion α. Then d kω/α). Proof. Let f = f,..., f d ) : {0, } k l d be a contraction such that for every u, v {0, } d, fu) fv) αdu, v) where d, ) denotes the Hamming metric). Denote by P the uniform probability measure on {0, } k. Since for every i k, f i is -Lipschitz, the standard isoperimetric inequality on the hypercube implies that P f i u) Ef i k/4α)) e Ωk/α). On the other hand, if u, v {0, } k are such that du, v) = k then there exist i d for which f i u) f i v) k/α, implying that max{ f i u) Ef i, f i v) Ef i } > k/4α). By the union bound it follows that de Ωk/α), as required. 3 Snowflake versions of planar metrics The problem of whether there is an analog of the Johnson-Lindenstrauss dimension reduction lemma in L p, < p <, is an interesting one which remains open. In view of the above proof and the proof in [3], a natural point set which is a candidate to demonstrate the impossibility of dimension reduction in L p is the realization of the diamond graph in l which appears 5
in [3], equipped with the l p metric. Since this point set consists of 0, vectors, this amounts to considering the diamond graph with its metric raised to the power p. Unfortunately, this approach cannot work; we show below that any planar graph whose metric is raised to the power ε has Euclidean distortion O / ε). Given a metric space X, d) and ε > 0, the metric space X, d ε ) is known in geometric analysis see e.g. [7]) as the ε snowflake version of X, d). Assouad s classical theorem [] states that any snowflake version of a doubling metric space is bi-lipschitz equivalent to a subset of some finite dimensional Euclidean space. A quantitative version of this result with bounds on the distortion and the dimension) was obtained in [6]. The following theorem is proved by combining embedding techniques of Rao [3] and Assouad []. A similar analysis is also used in [6]. In what follows we call a metric K r -excluded if it is the metric on a subset of a weighted graph which does not admit a K r minor. In particular, planar metrics are all K 5 -excluded. Theorem 3.. For any r N there exists a constant Cr) such that for every 0 < ɛ <, a ε snowflake version of a K r -excluded metric embeds into l with distortion at most Cr)/ ε. Our argument is based on the following lemma, the proof of which is contained in [3]. Lemma 3.. For every r N there is a constant δ = δr) such that for every ρ > 0 and every K r -excluded metric X, d) there exists a finitely supported probability distribution µ on partitions of X with the following properties:. For every P suppµ), and for every C P, diamc) ρ.. For every x X, E µ C P dx, X \ C) δρ. Observe that the sum under the expectation in ) above actually consists of only one summand. Proof of Theorem 3.. Let X be a K r -excluded metric. For each n Z, we define a map φ n as follows. Let µ n be the probability distribution on partitions of X from Lemma 3. with ρ = n/ ε). Fix a partition P suppµ n ). For any σ {, +} P, consider σ to be indexed by C P so that σ C has the obvious meaning. Following Rao [3], define φ P x) = P σ C dx, X \ C), σ {,+} P and write φ n = P suppµ n) µn P ) φ P here the symbol refers to the concatenation operator). Now, following Assouad [], let {e i } i Z be an orthonormal basis of l, and set C P Φx) = n Z nε/ ε) φ n x) e n Claim 3.3. For every n Z, and x, y X, we have φ n x) φ n y) min { dx, y), n/ ε)}. Additionally, if dx, y) > n/ ε), then φ n x) φ n y) δ n/ ε). 6
Proof. For any partition P suppµ n ), let C x, C y be the clusters of P containing x and y, respectively. Note that since for every C P, diamc) n/ ε), when dx, y) > n/ ε), we have C x C y. In this case, It follows that φ P x) φ P y) = E σ {,+} P σ Cx dx, X \ C x ) σ Cy dy, X \ C y ) dx, X \ C x) + dy, X \ C y ). φ n x) φ n y) = E µn φ P x) φ P y) E µ n dx, X \ C x ) + E µn dy, X \ C y ) δ n/ ε)). On the other hand, for every x, y X, since dx, X \ C x ), dy, X \ C y ) n/ ε), we have that φ P x) φ P y) min { dx, y), n/ ε)}, hence φ n x) φ n y) min { dx, y), n/ ε)}. To finish the analysis, let us fix x, y X and let m be such that dx, y) ε m, m+]. In this case, Φx) Φy) = n Z nε/ ε) φ n x) φ n y) 4 n + 4dx, y) n<m n m nɛ/ ε) = m+ + 4dx, y) mε/ ε) ε/ ε) = O /ε) dx, y) ε) On the other hand, Φx) Φy) mɛ/ ε) φ m x) φ m y) δ m δ dx, y) ε. The proof is complete. Remark 3.4. The O / ε) upper bound in Theorem 3. is tight. In fact, for i /ε, the ε snowflake version of the Laakso graph G i presented in Section ) has Euclidean distortion Ω / ε). To see this, let f : G i l be any non-contracting embedding of G i, d ε G i ) into l. For j i denote by K j the Lipschitz constant of the restriction of f to G j, d ε G i ) as before, we think of G j as a subset of G i ). Clearly K 0 =, and the same reasoning as in the proof of Theorem. shows that for j, Kj K j 4 + ε 4. This implies that Ki 4 + 4 +... + = Ω/ε), as required. ε 4 iε 7
4 Average distortion Euclidean embedding of subsets of L The heart of our argument is the following lemma which is implicit in [6], and which seems to be of independent interest. Lemma 4.. For every 0 < p there is a probability space Ω, P ) such that for every ω Ω there is a linear operator T ω : L p L such that for every x L p \ {0} the random variable X = T ωx) x p satisfies for every a R, Ee ax = e ap/. In particular, for p = the density of X is e /4x ) x π. Proof. Consider the following three sequences of random variables, {Y j } j, {θ j } j, {g j } j, such that each variable is independent of the others. For each j, Y j is uniformly distributed on [0, ], g j is a standard Gaussian and θ j is an exponential random variable, i.e. for λ 0, P θ j > λ) = e λ. Set Γ j = θ + + θ j. By Proposition.5. in [6], there is a constant C = Cp) such that if we define for f L p V f) = C j g j Γ /p j fy j ), then Ee iv f) = e f p p. Assume that the random variables {Y j } j and {Γ j } j are defined on a probability space Ω, P ) and that {g j } j are defined on a probability space Ω, P ), in which case we use the notation V f) = V f; ω, ω ). Define for ω Ω a linear operator T ω : L p L Ω, P ) by T ω f) = V f; ω, ). Since for every fixed ω Ω the random variable V f; ω, ) is Gaussian with variance T ω f), for every a R, E P eiav s;ω, ) = e a T ω f). Taking expectation with respect to P we find that, E P e a T ωf) = e a p f p p. This implies the required identity. The explicit distribution in the case p = follows from the fact that the inverse Laplace transform of x e x is y e /4y) πy 3 see for example [4]). Theorem 4.. For every f,..., f n L there is a linear operator T : L L such that: ) T fi ) T f j ) /3 min ) T fi ) T f j ) /3 8 log n) /3 i<j n f i f j n 0. ) f i f j i<j n Proof. Using the notation of lemma 4. in the case p = ) we find that for every a > 0, Ee ax = e a. Hence, for every a, ε > 0 and every < i < j n, ) Tω f i ) T ω f j ) P ε = P e ax e aε) e aε a. f i f j Choosing a = 4ε 4 the above upper bound becomes e /4ε). Consider the set A = { } Tω f i ) T ω f j ) Ω. f i f j 8 log n i<j n By the union bound, P A) >, so that P A) E ) Tω f i ) T ω f j ) /3 n EX ) /3 = f i f j π i<j n 0 x /3 e /4x ) x dx < 0. It follows that there exists ω A for which the operator T = T ω has the desired properties. 8
Remark 4.3. There is nothing special about the choice of the the power /3 in Corollary 4.. When p =, EX = but EX ε < for every 0 < ε <, so we may write the above average with the power ε replacing the exponent /3. Obvious generalizations of Corollary 4. hold true for every < p <, in which case the average distortion is of order Cp)log n) /p / and the power can be taken to be ). 5 The impossibility of linear dimension reduction in L p, p The above method cannot yield a O log n ) bound on the Euclidean distortion of n-point subsets of L. In fact, there are arbitrarily large n-point subsets of L on which any linear n embedding into L incurs distortion at least. This follows from the following simple lemma: Lemma 5.. For every p there are arbitrarily large n-point subsets of L p on which any linear embedding into L incurs distortion at least ) n /p /. Proof. Let w,..., w k be the rows of the k k Walsh matrix. Write w i = k j= w ije j where e,..., e k are the standard unit vectors in R k. Consider the set A = {0} {w i } k i= {e i} k i= l p. Let T : l p L be any linear operator which is non contracting and L-Lipschitz on A. Assume first of all that p <. Then: k+/p) = k i= w i p = k i= k i= j= which implies that L k/p /) = with the inequalities reversed. T w i = k k i= k j= w ij T e j ) k w i, w j T e i ), T e j ) = k T e j ) 4 k L, j= ) A /p /. When p > apply the same reasoning, We remark that the above point set was also used by Charikar and Sahai [4] to give a lower bound on linear dimension reduction in L. Their proof used a linear programming argument, which doesn t seem to be generalizable to the the case of L p, p >. Lemma 5. formally implies their result with a significantly simpler proof), and in fact proves the impossibility of linear dimension reduction in any L p, p. Indeed, if there were a linear operator which embeds A into l d p with distortion D then it would also be a D d /p / embedding into ) /p /. Similarly, since by John s theorem see e.g. [0]) l d. It follows that D A d any d-dimensional normed space is d equivalent to Hilbert space, we deduce that there are arbitrarily large n-point subsetsof L, any embedding of which into any d-dimensional normed n space incurs distortion at least d. 9
References [] Patrice Assouad. Plongements lipschitziens dans R n. Bull. Soc. Math. France, 4):49 448, 983. [] Y. Aumann and Y. Rabani. An Olog k) approximate min-cut max-flow theorem and approximation algorithm. SIAM J. Comput., 7):9 30, 998. [3] B. Brinkman and M. Charikar. On the impossibility of dimension reduction in l. In Proceedings of the 44th Annual IEEE Conference on Foundations of Computer Science. ACM, 003. [4] M. Charikar and A. Sahai. Dimension reduction in the l norm. In Proceedings of the 43rd Annual IEEE Conference on Foundations of Computer Science. ACM, 00. [5] A. Gupta, I. Newman, Y. Rabinovich, and A. Sinclair. Cuts, trees and l embeddings. In Proceedings of the 40th Annual Symposium on Foundations of Computer Science, 999. [6] Anupam Gutpa, Robert Krauthgamer, and James R. Lee. Bounded geometries, fractals, and low-distortion embeddings. In Proceedings of the 44th Annual Symposium on Foundations of Computer Science, to appear, 003. [7] J. Heinonen. Lectures on analysis on metric spaces. Universitext. Springer-Verlag, New York, 00. [8] P. Indyk. Algorithmic applications of low-distortion geometric embeddings. In Proceedings of the 4nd Annual IEEE Symposium on Foundations of Computer Science, pages 0 33. October 00. [9] William B. Johnson and Joram Lindenstrauss. Extensions of Lipschitz mappings into a Hilbert space. In Conference in modern analysis and probability New Haven, Conn., 98), volume 6 of Contemp. Math., pages 89 06. Amer. Math. Soc., Providence, RI, 984. [0] R. Krauthgamer and J. R. Lee. Navigating nets: Simple algorithms for proximity search. Submitted, 003. [] T. J. Laakso. Ahlfors Q-regular spaces with arbitrary Q > admitting weak Poincaré inequality. Geom. Funct. Anal., 0): 3, 000. [] Urs Lang and Conrad Plaut. Bilipschitz embeddings of metric spaces into space forms. Geom. Dedicata, 87-3):85 307, 00. [3] J. R. Lee and A. Naor. Embedding the diamond graph in L p and dimension reduction in L. Geom. Funct. Anal., to appear. [4] N. Linial. Finite metric spaces - combinatorics, geometry and algorithms. In Proceedings of the International Congress of Mathematicians III, pages 573 586, 00. [5] N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some of its algorithmic applications. Combinatorica, 5):5 45, 995. 0
[6] M. B. Marcus and G. Pisier. Characterizations of almost surely continuous p-stable random Fourier series and strongly stationary processes. Acta Math., 53-4):45 30, 984. [7] J. Matoušek. On embedding expanders into l p spaces. Israel J. Math., 0:89 97, 997. [8] J. Matoušek. Lectures on discrete geometry, volume of Graduate Texts in Mathematics. Springer-Verlag, New York, 00. [9] J. Matoušek. Open problems, workshop on discrete metric spaces and their algorithmic appl ications. Haifa, March 00. [0] Vitali D. Milman and Gideon Schechtman. Asymptotic theory of finite-dimensional normed spaces, volume 00 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 986. With an appendix by M. Gromov. [] I. Newman and Y. Rabinovich. A lower bound on the distortion of embedding planar metrics into euclidean space. Discrete Computational Geometry, 9):77 8, 003. [] Y. Rabinovich. On average distorsion of embedding metrics into the line and into l. In Proceedings of the 35th Annual ACM Symposium on Theory of Computing. ACM, 003. [3] S. Rao. Small distortion and volume preserving embeddings for planar and Euclidean metrics. In Proceedings of the 5th Annual Symposium on Computational Geometry, pages 300 306. ACM, 999. [4] David Vernon Widder. The Laplace Transform. Princeton Mathematical Series, v. 6. Princeton University Press, Princeton, N. J., 94.