Subspaces and orthogonal decompositions generated by bounded orthogonal systems

Subspaces and orthogonal decompositions generated by bounded orthogonal systems Olivier GUÉDON Shahar MENDELSON Alain PAJOR Nicole TOMCZAK-JAEGERMANN August 3, 006 Abstract We investigate properties of subspaces of L spanned by subsets of a finite orthonormal system bounded in the L norm. We first prove that there exists an arbitrarily large subset of this orthonormal system on which the L and the L norms are close, up to a logarithmic factor. Considering for example the Walsh system, we deduce the existence of two orthogonal subspaces of L n of dimension roughly n/ spanned by ± vectors i.e. Kashin s splitting) and in logarithmic distance to the Euclidean space. The same method applies for p >, and, in connection with the Λ p problem solved by Bourgain), we study large subsets of this orthonormal system on which the L and the L p norms are close again, up to a logarithmic factor). 0 Introduction In this note we consider a space L of functions on a probability space and we investigate properties of its subspaces spanned by a finite subset of an orthonormal system which consists of functions bounded in L. Typical examples of such systems are the trigonometric and the Walsh systems. 0 000 MSC-classification: 46B07, 4A45, 94B75, 5B05, 6G99 Partially supported by an Australian Research Council Discovery grant. This author holds the Canada Research Chair in Geometric Analysis.

The question we study is whether there is a subspace spanned by a large subset of the orthonormal system on which the L and L p norms are close. The two cases we focus on are when p > and p =. Formally we address Question Let ϕ j ) n j= be an orthonormal system in L with ϕ j L K.. Let k < n. Is there a subset I {,..., n} of cardinality larger than n k and A, such that for every a i ) n C n, ) / a i A a i ϕ i i I i I L?. For p > is there a large set J {,..., n} and A p > 0 such that for every a i ) n C n, ) / a j A p a j ϕ j j J j J Lp? First part of Question is connected with the problem of selecting a proportion of characters. Improving a result of Bourgain, Talagrand [Ta] showed that there exists a set I of cardinality proportional to n that is, I δ 0 n, where δ 0 > 0 depending on K is a small constant) such that on the span of ϕ i ) i I, the L and L norms are equivalent with A CKlog n log log n) /. Here we show that a similar result holds for every k < n. More precisely, introducing the spaces L n p p ) in.) below, we proved: Theorem A. Let ϕ j ) n j= be an orthonormal system in L n with ϕ j L n K for j n. For any k < n there exists a subset I {,..., n} with I n k such that for every a = a i ) C n, ) / a i C K n/k log n log + k)) 3/ i I where C > 0 is a universal constant. i I a i ϕ i L n Although Talagrand proved a better estimate, it is applicable only when n k is a small proportion of n. Our result allows us to obtain a Kashin,

splitting of C n spanned by a subset of a bounded orthonormal system. Recall that a Kashin splitting of C n consists of two orthogonal subspaces E 0, E of C n, both of dimension n, such that on E 0 and E the L n and L n norms are equivalent. In other words, there is an absolute constant c independent of n) such that for every x E i, i = 0,, c x L n x L n x L n. We show in Theorem.4 that if ϕ i ) n is a bounded orthonormal system as above, then we can find a subset I of {,..., n}, roughly of cardinality n/, such that, setting I c = {,..., n} \ I, if E 0 = spanϕ i ) i I and E = spanϕ i ) i I c, then the L n and L n are close on E i. This applies in particular to the trigonometric or the Walsh system. In the latter case, it gives a Kashin splitting of R n spanned by ±-valued vectors. Second part of Question is connected with the Λ p problem which has been solved by Bourgain [B]. He proved that if ϕ i ) n is an orthonormal system of functions bounded by K in L, then one can find J {,..., n}, J CK)n /p, such that on the span of ϕ j ) j J the L and L p are equivalent with an absolute constant). We will prove here the following Theorem B. Let ϕ j ) n j= L be an orthonormal system bounded in L, ϕ j L K for j n. Let < p <. There exists a subset J {,..., n} of cardinality J c p K 4/p n /p log n) 3 such that for all scalars a,... a n, ) / a j ϕ j K C p log n) 3/ a j j J j J Lp where c p and C p are constants depending only on p. The estimates we present here do not reconstruct the full strength of Bourgain s solution. The equivalence constant we obtain is slightly weaker, namely, Cp, K) log 3/ n, but the set J is larger, with cardinality at least Cp, K)n /p log 3 n. The proofs of all these estimates are simple and are based on arguments coming from Empirical Processes Theory. Acknowledgement: Part of the work on this article was conducted during the Trimesters Phenomena in High Dimensions held at the Schrödinger Institute, Vienna in the summer of 005 and at the Centre Emile Borel, 3

Henri Poincaré Institute, Paris in the Spring of 006. We are grateful to these Institutes for their hospitality and the excellent working atmosphere it provided. Preliminaries We begin with a notational convention. Throughout, all constants are positive numbers denoted by c, C etc. Their values can change from line to line. We require two preliminary results. The first one is an estimate on the supremum of the following empirical process defined on the probability space Ω, µ), f X i ) Ef, k sup f F where F is a class of functions and X,..., X k are independent random variables distributed according to µ. The argument is due to Rudelson [R] and the formulation below was implicit in [RV] or [GR]. We present it for the sake of completeness. Recall that for a metric space T, d), an admissible sequence of T is a collection of subsets of T, {T s : s 0}, such that for every s, T s = s and T 0 =. Definition. [Ta] For a metric space T, d), define γ T, d) = inf sup t T s/ dt, T s ), where the infimum is taken with respect to all admissible sequences of T. Theorem. There exists an absolute constant c for which the following holds. Let X,..., X k be independent copies of X and let s=0 d,k f, g) = max i k fx i) gx i ) and set U k = EγF, d,k )) / and σ F = sup f F Ef X)) /. Then, E sup f X f F i ) Ef ) X)) c max kσf U k, Uk..) 4

Proof. Set A := E sup f X i ) Ef X)) f F, then by a symmetrization argument [GZ], A E X E ε sup ε i f X i ) f F, where ε i ) are iid ±) Bernoulli random variables. Since the Rademacher k ) process ε if X i ) is subgaussian with respect to the l k metric, the f F inner expectation is governed by the random metric df, g) = f X i ) g X i ) ) ) /. Recall that d,k f, g) = max i k fx i ) gx i ), and thus for every f, g F, df, g) d,k f, g) sup f F / f X i )). Therefore, by a generic chaining argument [Ta], for every X,..., X k / E ε sup ε i f X i ) f F cγ F, d) c sup f X i )) γ F, d,k ), implying that E X E ε sup ε i f X i ) f F c E X sup f F Since we get E X sup f F from which the claim easily follows. f F ) / EX f X i ) γf, d,k ) ) /. f X i ) A + k sup E X f X) f F A ca + kσ F ) / U k, 5

The second preliminary we require deals with the following situation. Let n and define a random variable Y by Y = j with probability /n, for j n. Let Y,..., Y n be independent copies of Y and for k n set α k := {Y i : i k}. Lemma.3 For k n, Eα k = n /n) k). In particular, letting λ := k/n, k e λ λ Eα k k e λ + e n) λ. λ Furthermore, for 3 k n and every 0 < δ <, P α k Eα k ) δk/ /δ. Proof. The argument is straightforward. First, for k n it is easy to calculate the conditional expectation, Eα k α k ) = α k /n + α k + ) α k /n) = + α k /n). Thus Eα k = + /n) Eα k implying the desired formula for Eα k, by iterating and noting that α =. The estimates for Eα k easily follow. Similarly, Eαk α k ) = + /n)α k + /n)αk, which yields Eα k = + /n)eα k + /n)eα k. Thus, Eα k Eα k) is equal to ) Eαk Eα k ) ) + n n Eα k n ) n Eα k ) Eα k Eα k ) ) + 4 Eα k Eα k ) ) + 4. By iterating, it follows that for k 3, 0 Eα k Eα k) +k )/4 k/. Chebyshev s inequality implies the required deviation estimate. 6

Sections of l n Here we use Theorem. to prove our results on uniformly bounded orthonormal systems in L. In all our applications we will consider a probability space Ω, ν) and, functions X,..., X k in L for k =,,...) with X i L K, where K is a fixed constant. In this setting define the semi-norm X,k on L by y X,k = max X i, y for y L..) i k The two main applications we present here are based on comparing the L - and L -norms on subspaces generated by subsets of bounded orthonormal systems. In this part, we work in a central although slightly restricted) setting. Namely, we consider the space C n as a function space on {,..., n} equipped with the normalized counting measure and by L n p with p ) we denote the corresponding L p space, with the norm denoted by L n p and the inner product.,.. That is, for p < and y = y i ) C n, y L n p = n ) /p n y i p..) By B L n p denote the unit ball in L n p. The case p = is treated by a standard modification.) Our first theorem in this section allows us to find subsets I of {,..., n} such that on the corresponding subspace of L n, the norms from L n and from L n are comparable. Moreover, cardinality of I can be taken arbitrarily close to n. Theorem. Let ϕ j ) n j= be an orthonormal system in L n with ϕ j L n K for j n. For any k < n there exists a subset I {,..., n} with I n k such that for every a = a i ) C n, ) / a i C K n/k log n log + k)) 3/ a i ϕ i i I i I L n where C > 0 is a universal constant..3) 7

Our theorem should be compared to a theorem by Talagrand and Bourgain [Ta]), which gives subsets I of cardinality that is a sufficiently small proportion of n that is, I δ 0 n, where δ 0 > 0 depending on K is a small constant), but with a better estimate log n log log n. Let us also note that for the trigonometric system, it is well known and easy to see from Szemerédi s theorem, that for any subset I whose cardinality is a fixed proportion of n say, I n/4), the upper bound must asymptotically go to infinity as n, and we have been informed by J. Bourgain that this upper bound involves a logarithmic term log n. The proof of Theorem. requires the following upper bound of the γ functional, obtained by entropy estimates. Recall that given bodies K, B in C n the covering number NK, B) is the smallest number of translates of B needed to cover K. For a positive integer l the entropy number e l K, B) is the smallest ε such that NK, εb) l. For a linear operator T : X Y between two finite-dimensional) normed spaces X, Y with the unit balls B X, B Y, respectively), we let e l T ) = e l T B X ), B Y ). Lemma. There exist a constant C > 0 such that for every k n and X,..., X k with X i L n K, γ B L n, X,k ) CK log nlog + k)) 3/. Proof. Recall that by its definition, the γ functional is bounded by an appropriate entropy integral see [Ta]). In turn, writing this integral in terms of entropy numbers of operators see e.g. [P] chapter 5), and noting that the norms on l n and L n coincide, we get γ B L n, X,k ) C l= e l S ) l, where C is an absolute constant, and the operator S : l k l n is defined by Se i ) = X i for every i =,..., k. Also, observe that although the above sum should be formally extended to infinity, the exponential decay of entropy numbers e l for l larger than the rank of the operator, allows one to stop the summation at l = k. The operator S : l n l k satisfies S K and, by Proposition 3 of Carl [C], for l k, log + k/l) log + n/l) e l S ) c K. l 8

A simple computation completes the proof. Proof of Theorem.. Let X be the random vector taking the value ϕ i with probability /n. Let k < n and let X,..., X k be independent copies of X. We shall show that, with positive probability, the set of vectors ϕ i ) i I defined by { ϕ j ) j n \ X i ) } k satisfies the required inequality. Denote by Γ the k n matrix with rows X,..., X k, and thus, for y C n Γy = k X i, y e i. For y C n set y = n y i ) / and put S L n = {y : n j= y j = n} to be the unit sphere in L n. Note that for any star-shaped subset T in C n the following implication holds: whenever ρ > 0 satisfies the inequality Xi, y E X i, y ) kρ.4) 3n then sup y T ρs L n diamker Γ T ) ρ..5) Indeed, since E X i, y is constant = ρ /n) on T ρs L n, then condition.4) implies that for all y T ρs L n, kρ 3n X i, y = Γy 4kρ 3n..6) The homogeneity of.6) and the fact that T is star-shaped implying that if the lower bound in.6) holds for all y T ρs L n, then the same lower bound also holds for all y T with y L n ρ. This in turn shows that if y ker Γ T then y L n ρ, as required in.5). Finally note that since the n n matrix whose rows are ϕ,..., ϕ n is of course { orthogonal, } then the subspace ker Γ is spanned by the vectors ϕ i ) i I = ϕj ) j n \ X i ) k. The cardinality of this set satisfies I n k, with the sharp inequality if some among values of X,..., X k are equal. For T = B L n, this shows that whenever ρ satisfies.4), then i I a i ϕ i L n ρ i I a i ϕ i L n for all scalars a i, proving that.3) is satisfied with the constant ρ. 9

In order to find ρ which satisfies.4) with positive probability, we use Theorem. and observe that σ F = ρ/ n. It follows that E sup Xi, y E X i, y ) ) k c max ρ n U k, Uk.7) y B L n ρs L n where U k = Eγ B L n ρs L n, X,k )) /. By Lemma., γ B L n ρs L n, X,k ) γ B L n, X,k ) ck log n log + k)) 3/. This gives an estimate for U k that we plug in the right-hand side of.7). To conclude, we use this estimate to show that for some numerical constant C and for the choice of.4) holds with positive probability. ρ = C K log n log + k)) 3/ n/k, Remark.3 An analogous result to Theorem. also holds with probability close to. For any 0 < δ <, first observe that by.7) and Chebyshev s inequality, for any ρ > 0 we get a set of probability δ on which sup Xi, y E X i, y ) ) k c/δ ) max ρ n U k, Uk. y B L n ρ S L n Setting ρ := ck log nlog + k)) 3/ / δ δ, a similar calculation as in the theorem above shows that on the set of probability δ corresponding to ρ ) we have an analogue of.4), sup y B L n ρ S L n Xi, y E X i, y ) kρ 3n..8) Again the same argument as before shows that on the same set we have as claimed. diamker Γ B L n ) ρ = ck log nlog + k)) 3/ / δ δ, 0

Let us recall an important theorem concerning Euclidean sections of convex bodies, so-called Kashin s decomposition of l k [K], see also [S] for another proof and [ST], [P] for important generalizations, primarily connected with the notion of volume ratio). Kashin s theorem says that R k can be decomposed as a sum of two orthogonal subspaces E 0, E R k with dim E 0 = dim E = k on which the L k - and L k -norms are equivalent, namely, 3 eπ) x L k x L k x L k, for all x E m and m = 0,. The theorem below gives an analogous result on a decomposition of L n into two orthogonal subspaces with much more structure. To be more exact, the spaces are spanned by complementary subsets of a bounded orthonormal system in L n, which are relatively close to a Euclidean space. Considering a multiple of) the Walsh system for n = l ) we get two orthogonal subspaces of L n of dimension n/ spanned by ± vectors and in the logarithmic distance to the Euclidean space. Let us recall again that for proportional-dimensional subspaces spanned by a subset of characters, the logarithmic term is necessary in general, at least for the trigonometric system. Theorem.4 Let ϕ j ) n j= be an orthonormal system in L n with ϕ j L n K for j n. There exists a subset I {,..., n} with n/ c n I n/ + c n for some absolute constant c > 0, such that for m = 0, and every a = a i ) C n, ) / a i CKlog n) a i ϕ i i J m i J m L n where J 0 = I and J = I c and C > 0 is a universal constant..9) Proof. Let k = [λn], with λ := log. We use the same notation and a part of the argument of the proof of Theorem.. In particular, random vectors X,..., X k and the subset I {,..., n} are the same as before and T = B L n. It was shown that.8) is satisfied with δ = /4 and probability 3/4. Pick any realization of the X i s that satisfies.8) hence.3) is valid i.e. I satisfies inequality.9). Let Γ be the same k n matrix as before, we will show that diam ker Γ) T ) ρ.0)

by using a similar argument to.5) with ker Γ replaced by ker Γ). Hence, the set I c satisfies.9) as well. Thus having.8) satisfied with probability 3/4, implies that with the same probability.9) will be satisfied for both I and I c. To complete the proof it will be then sufficient to note that I = n {X,..., X k }, and since by the choice of λ, e λ = /, then by Lemma.3, with probability 3/4 for some absolute constant c > 0. n/ c n I n/ + c n, In the proof of.0) below we will refer to.4) rather than to.8), to be able to use directly parts of the proof of Theorem.. Recall that {ϕ i } i I = { ϕ j ) j n \ X i ) k }. As before,.4) implies that.6) is valid for all y T ρs L n. In turn, the upper estimate in the latter inequality implies that for all y T ρs L n, ϕ j, y j I n ϕ j, y j= X i, y ρ 4k )..) 3n Exactly as before,.) holds for all y T for which y L n ρ. Observe that since λ = log < 3/4 then 4k/3n) > 0. Therefore, if y T and ϕ j, y = 0 for all j I, then y L n ρ. Thus diam ker Γ) T ) ρ by the definition of I. Let us also mention recent progress on random and non-random Euclidean sections of L n of proportional dimension generated by ± vectors. It has been initiated by G. Schechtman see [Sch]) and followed by results in [LPRT], [AFMS] and [R]. These results are based on different model than the one we adopted in this section. 3 Subsets of bounded orthonormal systems We now pass to the discussion of subspaces generated by subsets of bounded orthonormal systems in L p, p >. Our result is closely related to the Λ p problem solved by Bourgain [B]). Namely, we are loosing on the equivalence estimates logarithmic instead of constant), although on the other hand,

curiously enough, the sets we get have cardinality larger than expected, also by a logarithmic factor. We shall use a general setting introduced at the beginning of Section and in particular the semi-norm X,k on L defined in.). Theorem 3. Let ϕ j ) n j= L be an orthonormal system with ϕ j L K for j n and let < p <. There exists a subset J {,..., n} of cardinality J c p K 4/p n /p log n) 3 such that for all scalars a,... a n, a j ϕ j K C p log n) 3/ j J j J Lp where c p and C p depend only on p. The proof requires a lemma analogous to Lemma.. a j ) / Lemma 3. For every < q there exists a constant C q depending only on q such that for every k and X,..., X k with X i L K, γ B Lq, X,k ) KC q log + k)) 3/. Proof. The same argument as in Lemma. shows that γ B Lq, X,k ) C l= e l S ) l, 3.) where C is an absolute constant, the operator S : l k L p is defined by Se i ) = X i for every i =,..., k, and p = q/q ). Since for < p < L p is uniformly convex then a duality result from [BPST] yields that there is a constant c p depending only on p such that l= e l S ) l c p l= e l S) l. For p, L p has type with the type constant depending on p), so using Proposition of Carl [C], it follows that for l k, log + k/l) e l S) C p S. l 3

This inequality together with S max i k X i Lp K gives an estimate of the right-hand side of 3.). We conclude that γ B Lq, X,k ) K C p log + k)) 3/, where C p depends on p and hence on q) only. Proof of Theorem 3. Define the random vector X by X = ϕ j with probability /n for j n. Fix k n to be determined later and let X,..., X k be independent copies of X. For k set B k = {a C k : k a i }. We will prove that there exists k K 4/p c p n /p log n) 3 such that E sup a B k a i X i KC p log n) Lp 3/. 3.) The theorem will follow by applying Lemma.3 for k 0 = [K 4/p c p n /p log n) 3 ]+ and observing that in the notation of that lemma) with high probability, α k0 k 0 /. Thus, the set {X,..., X k0 } contains at least k 0 / distinct vectors ϕ j s. Passing to the proof of 3.), ) / E sup a a B i X i = E sup X i, y Lp. k y B L q Next, apply Theorem. for F = B Lq. Recall that d,k f, g) = f g X,k and note that the latter expectation is less than or equal to E sup y B Lq C X i, y k E X, y ) / + kσ F max ) ) / kσ F U k, Uk ) + k σf. By Lemma 3., U k KC p log k) 3/, and since the functions ϕ i are bounded in L it is evident that for every a i ) n C n, n n a i ϕ i K a i K L n ) / n a i. 4

Moreover, since ϕ i ) n are orthonormal in L then n n ) / a i ϕ i = a i L, and by Hölder inequality it follows that for every p and every a i ) n C n, n n ) / a i ϕ i K Lp /p n / /p a i. Considering the operator S : l n L p defined by Se i ) = ϕ i for i =,..., n the above inequality means that S K /p n / /p. By duality, σ F := sup y B Lq E X, y ) / = n / sup y B Lq n ) / ϕ i, y = n / S K /p n /p. Therefore, by Theorem., E sup a a B i X i k Lp C max k K /p c p n /p log k) 3/, K c plog k) 3 ) + k K /p n /p ). ) / Setting k to be the smallest integer greater than K 4/p C p n /p log n) 3, we obtain the claimed inequality 3.). References [AFMS] S. Artstein-Avidan, O. Friedland, V. Milman & S. Sodin, Polynomial bounds for large Bernoulli sections of the cross polytope, preprint. [B] J. Bourgain, Bounded orthogonal systems and the Λp)-set problem, Acta Math. 6 989), no. 3-4, 7 45. 5

[BPST] J. Bourgain, A. Pajor, S. Szarek & N. Tomczak-Jaegermann, On the duality problem for entropy numbers of operators, in Geometric aspects of functional analysis 987 88), 50 63, Lecture Notes in Math., 376, Springer, Berlin, 989. [C] B. Carl, Inequalities of Bernstein-Jackson-type and the degree of compactness of operators in Banach spaces, Ann. Inst. Fourier Grenoble) 35 985), no. 3, 79 8. [GZ] E. Giné and J. Zinn, Some limit theorems for empirical processes. With discussion, Ann. Probab. 984), no. 4, 99 998. [GR] O. Guédon & M. Rudelson, L p moments of random vectors via majorizing measures, to appear in Adv. Math. [LPRT] A. Litvak, A. Pajor, M. Rudelson & N. Tomczak-Jaegermann, Smallest singular value of random matrices and geometry of random polytopes, Adv. Math. 95 005), no., 49 53. [K] [P] B. Kashin, The widths of certain finite-dimensional sets and classes of smooth functions, Izv. Akad. Nauk SSSR Ser. Mat. 4 977), no., 334 35, 478. G. Pisier, The volume of convex bodies and Banach space geometry, Cambridge Univ. Press, Cambridge, 989. [R] M. Rudelson, Private communication [R] M. Rudelson, Lower estimates for the singular values of random matrices, C. R. Math. Acad. Sci. Paris 34 006), no. 4, 47 5. [RV] M. Rudelson & R. Vershynin, Sparse reconstruction by convex relaxation: Fourier and Gaussian measurements, CISS 006 40th Annual Conference on Information Sciences and Systems). [S] S. Szarek, On Kashin s almost Euclidean orthogonal decomposition of l n, Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys. 6 978), no. 8, 69 694. [ST] S. Szarek & N. Tomczak-Jaegermann, On nearly Euclidean decomposition for some classes of Banach spaces. Compositio Math. 40 980), no. 3, 367 385. [Sch] G. schechtman, Special orthogonal splittings of L k, Israel J. Math. 39 004), 337 347. [Ta] M. Talagrand, The generic chaining, Springer, Berlin, 005. [Ta] M. Talagrand, Selecting a proportion of characters, Israel J. Math. 08 998), 73 9. 6

O. Guédon Institut de Mathématiques de Jussieu, Université Pierre et Marie Curie, Paris 6, 4 place Jussieu, 75005 Paris, France e-mail: guedon@math.jussieu.fr S. Mendelson, Centre for Mathematics and its Applications, The Australian National University, Canberra, ACT 000, Australia e-mail: shahar.mendelson@anu.edu.au A. Pajor, Équipe d Analyse et Mathématiques Appliquées, Université de Marne-la-Vallée, 5, boulevard Descartes, Champs sur Marne, 77454 Marne-la-Vallée, Cedex, France e-mail: Alain.Pajor@univ-mlv.fr N. Tomczak-Jaegermann, Dept. of Math. and Stat. Sciences, University of Alberta, Edmonton, Alberta, Canada, T6G G. e-mail: nicole.tomczak@ualberta.ca 7