GAUSSIAN BOUNDS FOR NOISE CORRELATION OF FUNCTIONS. Elchanan Mossel


Geom. Funct. Anal. Vol. 19 (2010). Published online January 22. The Author(s). GAFA Geometric And Functional Analysis. This article is published with open access at Springerlink.com.

Abstract. In this paper we derive tight bounds on the expected value of products of low influence functions defined on correlated probability spaces. The proofs are based on extending Fourier theory to an arbitrary number of correlated probability spaces, on a generalization of an invariance principle recently obtained with O'Donnell and Oleszkiewicz for multilinear polynomials with low influences and bounded degree, and on properties of multi-dimensional Gaussian distributions.

We present two applications of the new bounds to the theory of social choice. We show that Majority is asymptotically the most predictable function among all low influence functions given a random sample of the voters. Moreover, we derive an almost tight bound in the context of Condorcet aggregation and low influence voting schemes on a large number of candidates. In particular, we show that for every low influence aggregation function, the probability that Condorcet voting on $k$ candidates will result in a unique candidate that is preferable to all others is $k^{-1+o(1)}$. This matches the asymptotic behavior of the majority function, for which the probability is $k^{-1-o(1)}$.

A number of applications in hardness of approximation in theoretical computer science were obtained using the results derived here in subsequent work by Raghavendra and by Austrin and Mossel.

A different type of application involves hypergraphs and arithmetic relations in product spaces. For example, we show that if $A \subset \mathbb{Z}_m^n$ is of low influences, then the number of $k$-tuples $(x_1,\dots,x_k) \in A^k$ satisfying $\sum_{i=1}^k x_i \in B^n \bmod m$, where $B \subset [m]$ satisfies $|B| \ge 2$, is $(1 \pm o(1))\,P[A]^k (m^{k-1}|B|)^n$, which is the same as if $A$ were a random set of probability $P[A]$. Our results also show that, for a general set $A$ without any restriction on the influences, there exists a set of coordinates $S \subset [n]$ with $|S| = O(1)$ such that if $C = \{x : \exists y \in A,\ y_{[n]\setminus S} = x_{[n]\setminus S}\}$ then the number of $k$-tuples $(x_1,\dots,x_k) \in C^k$ satisfying $\sum_{i=1}^k x_i \in B^n \bmod m$ is $(1 \pm o(1))\,P[C]^k (m^{k-1}|B|)^n$.

Keywords and phrases: Invariance, discrete harmonic analysis, voting, hardness of approximation, Gaussian isoperimetric inequalities.
2010 Mathematics Subject Classification: 60F05, 05A16.
Supported by a Sloan fellowship in Mathematics, by a BSF grant, an NSF Career Award (DMS) and an ONR award. Part of this work was carried out while the author was visiting IPAM, UCLA.

1 Introduction

1.1 Harmonic analysis of boolean functions. This paper studies low influence functions $f : \Omega^n \to [0,1]$, where $(\Omega^n, \mu^n)$ is a product probability space and where the influence of the $i$th coordinate on $f$, denoted by $\mathrm{Inf}_i(f)$, is defined by

$$\mathrm{Inf}_i(f) = E\big[\mathrm{Var}[f(X_1,\dots,X_n) \mid X_j,\ 1 \le j \le n,\ j \ne i]\big], \tag{1}$$

where for any set $S \subset [n]$ the conditional variance $\mathrm{Var}[f(X_1,\dots,X_n) \mid X_i,\ i \in S]$ is defined via

$$\mathrm{Var}\big[f(X_1,\dots,X_n) \mid X_i,\ i \in S\big] = E\big[(f(X_1,\dots,X_n) - E[f(X_1,\dots,X_n) \mid X_i,\ i \in S])^2 \mid X_i,\ i \in S\big].$$

The study of low influence functions is motivated by applications from the theory of social choice in mathematical economics, by applications in the theory of hardness of approximation in theoretical computer science and by problems in additive number theory. We refer the reader to some recent papers [KhKMO1,2], [MOO1,2], [DMR], [ST], [Gr] for motivation and general background.

The main theorems established here provide tight bounds on the expected value of the product of functions defined on correlated probability spaces. These in turn imply some new results in the theory of social choice and in the theory of hypergraphs. Applications to hardness of approximation in computer science were derived in subsequent work in [AM1] and [R].

In our main result we consider a probability measure $P$ defined on a space $\prod_{i=1}^k \Omega^{(i)}$. Letting $f_i : (\Omega^{(i)})^n \to [0,1]$, $1 \le i \le k$, be a collection of low influence functions, we derive tight bounds on $E[f_1 \cdots f_k]$ in terms of $E[f_1],\dots,E[f_k]$ and a measure of correlation between the spaces $\Omega^{(1)},\dots,\Omega^{(k)}$. The bounds are expressed in terms of extremal probabilities in Gaussian space, which can be calculated in the case $k = 2$. When $k \ge 2$ and $P$ is a pairwise independent distribution, our bounds show that $E[f_1 \cdots f_k]$ is close to $\prod_{i=1}^k E[f_i]$. We also apply a simple recursive argument in order to obtain results for general functions not necessarily of low influences. The results show that the bounds for low influence functions hold for general functions after the functions have been modified in a bounded number of coordinates.
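To make definition (1) concrete, here is a minimal brute-force sketch for the uniform measure on $\{-1,1\}^n$. The function names (`influence`, `majority`, `dictator`) are illustrative and not taken from the paper, the examples are $\pm 1$-valued rather than $[0,1]$-valued, and the enumeration is exponential in $n$, so it is only meant for tiny cases.

```python
from itertools import product

def influence(f, n, i):
    """Inf_i(f) = E[ Var[f(X) | X_j, j != i] ] under the uniform measure on {-1,1}^n."""
    total = 0.0
    for rest in product((-1, 1), repeat=n - 1):
        # Fix every coordinate except i, then let coordinate i range over {-1, 1}.
        vals = [f(rest[:i] + (xi,) + rest[i:]) for xi in (-1, 1)]
        mean = sum(vals) / 2
        total += sum((v - mean) ** 2 for v in vals) / 2  # conditional variance
    return total / 2 ** (n - 1)

def majority(x):
    # Intended for odd n, so that the sum is never zero.
    return 1 if sum(x) > 0 else -1

def dictator(x):
    return x[0]

if __name__ == "__main__":
    n = 5
    print([influence(majority, n, i) for i in range(n)])
    print([influence(dictator, n, i) for i in range(n)])
```

In this sketch every coordinate of the majority on five voters has the same influence, namely the probability that the other four votes are split evenly, while the dictator has influence 1 in its own coordinate and 0 elsewhere.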
The rest of the introduction is devoted to various applications followed by statements of the main technical results.

1.2 Prediction of low influence voting. Suppose $n$ voters are to make a binary decision. Assume that the outcome of the vote is determined by a social choice function $f : \{-1,1\}^n \to \{-1,1\}$, so that the outcome of the vote is $f(x_1,\dots,x_n)$, where $x_i \in \{-1,1\}$ is the vote of voter $i$. We assume that the votes are independent, each $\pm 1$ with probability $1/2$. It is natural to assume that the function $f$ satisfies $f(-x) = -f(x)$, i.e. it does not discriminate between the two candidates. Note that this implies that $E[f] = 0$ under the uniform distribution. A natural way to try and predict the outcome of the vote is to sample a subset of the voters, by sampling each voter independently with probability $\rho$. Conditioned on a vector $X$ of votes, the distribution of $Y$, the sampled votes, is i.i.d. where $Y_i = X_i$ with probability $\rho$ and $Y_i = *$ (for unknown) otherwise.
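The sampling model just described can be evaluated exactly in small cases. The sketch below is an illustration under stated assumptions, not the paper's calculation: it takes $f$ to be the majority on an odd number of voters, predicts the outcome by the sign of the conditional mean of the outcome given the sample, and returns the expected product of outcome and prediction. The names `majority_advantage` and `arcsine_value` are hypothetical; `arcsine_value` evaluates $\frac{2}{\pi}\arcsin\sqrt{\rho}$, the quantity that appears in Theorem 1.1.

```python
from math import comb, asin, pi, sqrt

def majority_advantage(n, rho):
    """Exact E[Maj_n(X) * prediction(Y)] when each voter is sampled with probability rho.

    The prediction is the sign of E[Maj_n(X) | Y], so the expected gain given a
    sample equals the absolute value of that conditional mean.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that the majority is never tied")
    total = 0.0
    for m in range(n + 1):                      # m = number of sampled voters
        p_m = comb(n, m) * rho ** m * (1 - rho) ** (n - m)
        r = n - m                               # number of unsampled voters
        for j in range(m + 1):                  # j = sampled votes equal to +1
            s = 2 * j - m                       # sum of the sampled votes
            p_s = comb(m, j) / 2 ** m
            # P[s + R > 0] - P[s + R < 0], where R is a sum of r fair +-1 votes.
            cond_mean = sum(
                comb(r, i) * (1 if s + 2 * i - r > 0 else -1)
                for i in range(r + 1)
            ) / 2 ** r
            total += p_m * p_s * abs(cond_mean)
    return total

def arcsine_value(rho):
    return 2 / pi * asin(sqrt(rho))

if __name__ == "__main__":
    rho = 0.25
    print(majority_advantage(101, rho), arcsine_value(rho), rho)
```

For a single voter the majority is the dictator, and the function returns $\rho$, which reflects the fact that the outcome is known exactly when that voter is sampled.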

Conditioned on $Y = y$, the vector of sampled votes, the optimal prediction of the outcome of the vote is given by $\mathrm{sgn}((Tf)(y))$, where

$$(Tf)(y) = E\big[f(X) \mid Y = y\big]. \tag{2}$$

This implies that the probability of correct prediction (also called predictability) is given by

$$P\big[f = \mathrm{sgn}(Tf)\big] = \tfrac{1}{2}\big(1 + E[f\,\mathrm{sgn}(Tf)]\big).$$

For example, when $f(x) = x_1$ is the dictator function, we have $E[f\,\mathrm{sgn}(Tf)] = \rho$, corresponding to the trivial fact that the outcome of the election is known when voter 1 is sampled and is $\pm 1$ with probability $1/2$ otherwise. The notion of predictability is natural in statistical contexts. It was also studied in a more combinatorial context in [O].

In the first application presented here we show that

Theorem 1.1 (Majority is most predictable). Let $0 \le \rho \le 1$ and $\epsilon > 0$ be given. Then there exists a $\tau > 0$ such that if $f : \{-1,1\}^n \to [-1,1]$ satisfies $E[f] = 0$ and $\mathrm{Inf}_i(f) \le \tau$ for all $i$, then

$$E\big[f\,\mathrm{sgn}(Tf)\big] \le \frac{2}{\pi}\arcsin\sqrt{\rho} + \epsilon, \tag{3}$$

where $T$ is defined in (2).

Moreover, it follows from the central limit theorem (see section 7.5; a version of this calculation also appears in [O]) that if $\mathrm{Maj}_n(x_1,\dots,x_n) = \mathrm{sgn}\big(\sum_{i=1}^n x_i\big)$, then

$$\lim_{n \to \infty} E\big[\mathrm{Maj}_n\,\mathrm{sgn}(T\,\mathrm{Maj}_n)\big] = \frac{2}{\pi}\arcsin\sqrt{\rho}.$$

Remark 1.2. Note that Theorem 1.1 proves a weaker statement than showing that Majority is the most predictable function. The statement only asserts that if a function has low enough influences then its predictability cannot be more than $\epsilon$ larger than the asymptotic predictability value achieved by the majority function when the number of voters $n \to \infty$. This slightly inaccurate title of the theorem is in line with previous results such as the Majority is stablest theorem (see below). Similar language may be used later when informally discussing statements of various theorems.

Remark 1.3. One may wonder if for a finite $n$, among all functions $f : \{-1,1\}^n \to \{-1,1\}$ with $E[f] = 0$, majority is the most predictable function. Note that the predictability of the dictator function $f(x) = x_1$ is given by $\rho$, and $\frac{2}{\pi}\arcsin\sqrt{\rho} > \rho$ for $\rho \to 0$.
Therefore when $\rho$ is small and $n$ is large the majority function is more predictable than the dictator function. However, note that when $\rho \to 1$ we have $\rho > \frac{2}{\pi}\arcsin\sqrt{\rho}$, and therefore for values of $\rho$ close to 1 and large $n$ the dictator function is more predictable than the majority function.

We note that the bound obtained in Theorem 1.1 is reminiscent of the Majority is stablest theorem [MOO1,2], as both involve the arcsin function. However, the two theorems are quite different. The Majority is stablest theorem asserts that under the same condition as in Theorem 1.1 it holds that

$$E\big[f(X) f(Y)\big] \le \frac{2}{\pi}\arcsin\rho + \epsilon,$$

where $(X_i, Y_i) \in \{-1,1\}^2$ are i.i.d. with $E[X_i] = E[Y_i] = 0$ and $E[X_i Y_i] = \rho$. Thus Majority is stablest considers two correlated voting vectors, while Majority is most predictable considers a sample of one voting vector. In fact, both results follow from the more general invariance principle presented here. We note a further difference between stability and predictability: it is well known that in the context of Majority is stablest, for all $0 < \rho < 1$, among all boolean functions with $E[f] = 0$, the maximum of $E[f(x)f(y)]$ is obtained for dictator functions of the form $f(x) = x_i$. As discussed above, for $\rho$ close to 0 and large $n$, the dictator is less predictable than the majority function.

We also note that the Ain't over until it's over theorem [MOO1,2] provides a bound under the same conditions on $P[Tf > 1 - \delta]$, for small $\delta$. However, this bound is not tight and does not imply Theorem 1.1. Similarly, Theorem 1.1 does not imply the Ain't over until it's over theorem. The bounds in Ain't over until it's over were derived using invariance of $Tf$, while the bound (3) requires the joint invariance of $f$ and $Tf$.

1.3 Condorcet paradoxes. Suppose $n$ voters rank $k$ candidates. It is assumed that each voter $i$ has a linear order $\sigma_i \in S(k)$ on the candidates. In Condorcet voting, the rankings are aggregated by deciding for each pair of candidates which one is superior among the $n$ voters.

More formally, the aggregation results in a tournament $G_k$ on the set $[k]$. Recall that $G_k$ is a tournament on $[k]$ if it is a directed graph on the vertex set $[k]$ such that for all $a, b \in [k]$ either $(a > b) \in G_k$ or $(b > a) \in G_k$. Given individual rankings $(\sigma_i)_{i=1}^n$ the tournament $G_k$ is defined as follows. Let $x^{a>b}(i) = 1$ if $\sigma_i(a) > \sigma_i(b)$, and $x^{a>b}(i) = -1$ if $\sigma_i(a) < \sigma_i(b)$. Note that $x^{b>a} = -x^{a>b}$. The binary decision between each pair of candidates is performed via an anti-symmetric function $f : \{-1,1\}^n \to \{-1,1\}$, so that $f(-x) = -f(x)$ for all $x \in \{-1,1\}^n$. The tournament $G_k = G_k(\sigma; f)$ is then defined by letting $(a > b) \in G_k$ if and only if $f(x^{a>b}) = 1$.

Note that there are $2^{\binom{k}{2}}$ tournaments while there are only $k! = 2^{\Theta(k \log k)}$ linear rankings. For the purposes of social choice, some tournaments make more sense than others.

Definition 1.4. We say that a tournament $G_k$ is linear if it is acyclic. We will write $\mathrm{Acyc}(G_k)$ for the logical statement that $G_k$ is acyclic. Non-linear tournaments are often referred to as non-rational in economics, as they represent an order where there are 3 candidates $a$, $b$ and $c$ such that $a$ is preferred to $b$, $b$ is preferred to $c$ and $c$ is preferred to $a$. We say that the tournament $G_k$ is a unique max tournament if there is a candidate $a \in [k]$ such that for all $b \ne a$ it holds that $(a > b) \in G_k$. We write $\mathrm{UniqMax}(G_k)$ for the logical statement that $G_k$ has a unique max. Note that the unique max property is weaker than linearity. It corresponds to the fact that there is a candidate that dominates all other candidates.
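The construction of $G_k(\sigma; f)$ and the two properties in Definition 1.4 can be sketched in a few lines. This is only an illustration: candidates are labelled $0,\dots,k-1$ rather than $1,\dots,k$, a ranking is stored as a tuple with `sigma[a]` larger when candidate `a` is preferred, and the helper names are hypothetical. The acyclicity test uses the standard fact that a tournament is acyclic exactly when its out-degrees are $0, 1, \dots, k-1$ in some order.

```python
from itertools import combinations

def majority(x):
    # Anti-symmetric when the number of voters is odd.
    return 1 if sum(x) > 0 else -1

def condorcet_tournament(rankings, k, f=majority):
    """Return the edge set of G_k(sigma; f); an edge (a, b) means (a > b)."""
    edges = set()
    for a, b in combinations(range(k), 2):
        # x^{a>b}(i) = 1 if voter i ranks a above b, and -1 otherwise.
        x = tuple(1 if sigma[a] > sigma[b] else -1 for sigma in rankings)
        edges.add((a, b) if f(x) == 1 else (b, a))
    return edges

def unique_max(edges, k):
    """True if some candidate beats every other candidate."""
    return any(all((a, b) in edges for b in range(k) if b != a) for a in range(k))

def acyclic(edges, k):
    """True if the tournament is linear, i.e. has out-degrees 0, 1, ..., k-1."""
    out_degrees = [sum(1 for (a, _) in edges if a == c) for c in range(k)]
    return sorted(out_degrees) == list(range(k))

if __name__ == "__main__":
    # Three voters with the cyclic preferences a>b>c, b>c>a, c>a>b.
    paradox = [(2, 1, 0), (0, 2, 1), (1, 0, 2)]
    g = condorcet_tournament(paradox, 3)
    print(sorted(g), unique_max(g, 3), acyclic(g, 3))
```

The three rankings in the usage example produce the classical Condorcet cycle, so the resulting tournament is neither linear nor a unique max tournament.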

Following [Ka1,2], we consider the probability distribution over $n$ voters, where the voters have independent preferences and each one chooses a ranking uniformly at random among all $k!$ orderings. Note that the marginal distribution on each vector $x^{a>b}$ is the uniform distribution over $\{-1,1\}^n$ and that if $f : \{-1,1\}^n \to \{-1,1\}$ is anti-symmetric then $E[f] = 0$.

The case that is now understood is $k = 3$. Note that in this case $G_3$ is a unique max if and only if it is linear. Kalai [Ka1] studied the probability of a rational outcome given that the $n$ voters vote independently and at random from the six possible rational rankings. He showed that the probability of a rational outcome in this case may be expressed as $\frac{3}{4}(1 + E[f\,Tf])$, where $T$ is the Bonami-Beckner operator with parameter $\rho = 1/3$. The Bonami-Beckner operator may be defined as follows. Let $(X_i, Y_i) \in \{-1,1\}^2$ be i.i.d. with $E[X_i] = E[Y_i] = 0$ and $E[X_i Y_i] = \rho$ for $1 \le i \le n$. For $f : \{-1,1\}^n \to \mathbb{R}$ and $x \in \{-1,1\}^n$, the Bonami-Beckner operator $T$ applied to $f$ is defined via

$$(Tf)(x_1,\dots,x_n) = E\big[f(Y_1,\dots,Y_n) \mid X_1 = x_1,\dots,X_n = x_n\big].$$

It is natural to ask which function $f$ with small influences is most likely to produce a rational outcome. Instead of considering small influences, Kalai considered the essentially stronger assumption that $f$ is monotone and transitive-symmetric, i.e. that for all $1 \le i < j \le n$ there exists a permutation $\sigma$ on $[n]$ with $\sigma(i) = j$ such that $f(x_1,\dots,x_n) = f(x_{\sigma(1)},\dots,x_{\sigma(n)})$ for all $(x_1,\dots,x_n)$. Kalai conjectured that, as $n \to \infty$, the maximum of $\frac{3}{4}(1 + E[f\,Tf])$ among all transitive-symmetric functions approaches the same limit as $\lim_{n\to\infty} \frac{3}{4}(1 + E[\mathrm{Maj}_n\,T\,\mathrm{Maj}_n])$. This was proven using the Majority is stablest theorem [MOO1,2]. Here we obtain similar results for any value of $k$. Our result is not tight, but almost tight. More specifically we show

Theorem 1.5 (Majority is best for Condorcet). Consider Condorcet voting on $k$ candidates.
Then for all $\epsilon > 0$ there exists $\tau = \tau(k, \epsilon) > 0$ such that if $f : \{-1,1\}^n \to \{-1,1\}$ is anti-symmetric and $\mathrm{Inf}_i(f) \le \tau$ for all $i$, then

$$P\big[\mathrm{UniqMax}(G_k(\sigma; f))\big] \le k^{-1+o_k(1)} + \epsilon. \tag{4}$$

Moreover, for $f = \mathrm{Maj}_n$ we have $\mathrm{Inf}_i(f) = O(n^{-1/2})$ and it holds that

$$P\big[\mathrm{UniqMax}(G_k(\sigma; f))\big] \ge k^{-1-o_k(1)} - o_n(1). \tag{5}$$

Interestingly, we are not able to derive similar results for Acyc. We do calculate the probability that Acyc holds for majority.

Proposition 1.6. We have

$$\lim_{n\to\infty} P\big[\mathrm{Acyc}(G_k(\sigma; \mathrm{Maj}_n))\big] = \exp\big(-\Theta(k^{5/3})\big). \tag{6}$$

We note that results in economics [Be] have shown that for a majority vote the probability that the outcome will contain a Hamiltonian cycle when the number of voters goes to infinity is $1 - o_k(1)$.

1.4 Hypergraph and additive applications. Here we discuss some applications concerning hypergraph problems. We let $\Omega$ be a finite set equipped with the uniform probability measure denoted $P$. We let $R \subset \Omega^k$ denote a $k$-wise relation. For sets $A_1,\dots,A_k \subset \Omega^n$ we will be interested in the number of $k$-tuples $x^1 \in A_1,\dots,x^k \in A_k$ satisfying the relation $R$ in all coordinates, i.e. $(x^1_i,\dots,x^k_i) \in R$ for all $i$. Assume below that $R$ satisfies the following two properties:

For all $a \in \Omega$ and all $1 \le j \le k$ it holds that $P[x_j = a \mid (x_1,\dots,x_k) \in R] = |\Omega|^{-1}$. (This assumption is actually not needed for the general statement; we state it for simplicity only.)

The relation $R$ is connected. This means that for all $x, y \in R$ there exists a path $x = y(0), y(1), \dots, y(r) = y$ in $R$ such that $y(i)$ and $y(i+1)$ differ in one coordinate only.

We will say that the relation $R \subset \Omega^k$ is pairwise smooth if for all $i, j \in [k]$ and $a, b \in \Omega$ it holds that

$$P\big[x_i = a,\ x_j = b \mid (x_1,\dots,x_k) \in R\big] = P\big[x_i = a \mid (x_1,\dots,x_k) \in R\big]\, P\big[x_j = b \mid (x_1,\dots,x_k) \in R\big].$$

As a concrete example, consider the case where $\Omega = \mathbb{Z}_m$ and $R$ consists of all $k$-tuples satisfying $\sum_{i=1}^k x_i \in B \bmod m$, where $B \subset \mathbb{Z}_m$. When $k \ge 2$ we have $P[x_i = a \mid R] = m^{-1}$ for all $i$ and $a$. When $k \ge 3$, we have pairwise smoothness. The connectivity condition holds whenever $|B| > 1$.

For a set $A \subset \mathbb{Z}_m^n$ and $S \subset [n]$ we define

$$\overline{A}^S = \big\{y : \exists x \in A,\ x_{[n]\setminus S} = y_{[n]\setminus S}\big\}, \qquad \underline{A}_S = \big\{y : \forall x \text{ s.t. } x_{[n]\setminus S} = y_{[n]\setminus S}, \text{ it holds that } x \in A\big\}.$$

Our main result in the context of hypergraphs is the following.

Theorem 1.7. Let $R$ be a connected relation on $\Omega^k$. Then there exist two continuous functions $\underline{\Gamma} : (0,1)^k \to (0,1)$ and $\overline{\Gamma} : (0,1)^k \to (0,1)$ such that for every $\epsilon > 0$ there exists a $\tau > 0$ such that if $A_1,\dots,A_k \subset \Omega^n$ are sets with $\mathrm{Inf}_i(A_j) \le \tau$ for all $i$ and $j$, then

$$\big(\underline{\Gamma}(P[A_1],\dots,P[A_k]) - \epsilon\big)\, P[R^n] \le P\big[R^n(A_1,\dots,A_k)\big] \le \big(\overline{\Gamma}(P[A_1],\dots,P[A_k]) + \epsilon\big)\, P[R^n].$$

If $R$ is pairwise smooth, then

$$\big|P[R^n(A_1,\dots,A_k)] - P[R^n]\,P[A_1] \cdots P[A_k]\big| \le \epsilon\, P[R^n].$$

Moreover, one can take $\tau = \epsilon^{O(m^{k+1}\log(1/\epsilon)/\epsilon)}$. For general sets $A_1,\dots,A_k$, not necessarily of low influences, there exists a set $S$ of coordinates such that $|S| \le O(1/\tau)$ and the statements above hold for $\overline{A_1}^S,\dots,\overline{A_k}^S$ and for $\underline{A_1}_S,\dots,\underline{A_k}_S$.

1.5 Correlated spaces. A central concept that is extensively studied and repeatedly used in the paper is that of correlated probability spaces.
The notion of correlation between two probability spaces used here is the same as the maximum correlation coefficient introduced by Hirschfeld and Gebelein [G]. We will later show how to relate correlated spaces to noise operators.

Definition 1.8. Given a probability measure $P$ defined on $\prod_{i=1}^k \Omega^{(i)}$, we say that $\Omega^{(1)},\dots,\Omega^{(k)}$ are correlated spaces. For $A \subset \Omega^{(i)}$ we let

$$P[A] = P\Big[\Big\{(\omega_1,\dots,\omega_k) \in \prod_{j=1}^k \Omega^{(j)} : \omega_i \in A\Big\}\Big],$$

and similarly $E[f]$ for $f : \Omega^{(i)} \to \mathbb{R}$. We will abuse notation by writing $P[A]$ for $P^n[A]$, for $A \subset \big(\prod_{i=1}^k \Omega^{(i)}\big)^n$ or $A \subset (\Omega^{(i)})^n$, and similarly for $E$.

Definition 1.9. Given two linear subspaces $A$ and $B$ of $L^2(P)$ we define the correlation between $A$ and $B$ by

$$\rho(A, B; P) = \rho(A, B) = \sup\big\{\mathrm{Cov}[f, g] : f \in A,\ g \in B,\ \mathrm{Var}[f] = \mathrm{Var}[g] = 1\big\}. \tag{7}$$

Let $\Omega = (\Omega^{(1)} \times \Omega^{(2)}, P)$. We define the correlation $\rho(\Omega^{(1)}, \Omega^{(2)}; P)$ by letting

$$\rho\big(\Omega^{(1)}, \Omega^{(2)}; P\big) = \rho\big(L^2(\Omega^{(1)}, P),\ L^2(\Omega^{(2)}, P);\ P\big). \tag{8}$$

More generally, let $\Omega = \big(\prod_{i=1}^k \Omega^{(i)}, P\big)$ and for a subset $S \subset [k]$ write $\Omega^{(S)} = \prod_{i \in S} \Omega^{(i)}$. The correlation vector $\vec{\rho}(\Omega^{(1)},\dots,\Omega^{(k)}; P)$ is a length $k-1$ vector whose $i$th coordinate is given by

$$\rho(i) = \rho\Big(\prod_{j=1}^{i} \Omega^{(j)},\ \prod_{j=i+1}^{k} \Omega^{(j)};\ P\Big),$$

for $1 \le i \le k-1$. The correlation $\rho(\Omega^{(1)},\dots,\Omega^{(k)}; P)$ is defined by letting

$$\rho\big(\Omega^{(1)},\dots,\Omega^{(k)}; P\big) = \max_{1 \le i \le k} \rho\Big(\prod_{j=1}^{i-1} \Omega^{(j)} \times \prod_{j=i+1}^{k} \Omega^{(j)},\ \Omega^{(i)};\ P\Big). \tag{9}$$

When the probability measure $P$ is clear from the context we will write $\rho(\Omega^{(1)},\dots,\Omega^{(k)})$ for $\rho(\Omega^{(1)},\dots,\Omega^{(k)}; P)$, etc.

Remark 1.10. It is easy to see that $\rho(\Omega^{(1)}, \Omega^{(2)}; P)$ is the second singular value of the conditional expectation operator mapping $f \in L^2(\Omega^{(2)}, P)$ to $g(x) = E[f(Y) \mid X = x] \in L^2(\Omega^{(1)}, P)$. Thus $\rho(\Omega^{(1)}, \Omega^{(2)}; P)$ is the second singular value of the matrix corresponding to the operator $T$ with respect to orthonormal bases of $L^2(\Omega^{(1)}, P)$ and $L^2(\Omega^{(2)}, P)$.

Definition 1.11. Given $\big(\prod_{i=1}^k \Omega^{(i)}, P\big)$, we say that $\Omega^{(1)},\dots,\Omega^{(k)}$ are $r$-wise independent if for all $S \subset [k]$ with $|S| \le r$ and for all $\prod_{i \in S} A_i \subset \prod_{i \in S} \Omega^{(i)}$ it holds that

$$P\Big[\prod_{i \in S} A_i\Big] = \prod_{i \in S} P[A_i].$$

The notion of $r$-wise independence is central in computer science and discrete mathematics, in particular in the context of randomized algorithms and computational complexity.

1.6 Gaussian stability. Our main result states bounds in terms of Gaussian stability measures, which we discuss next. Let $\gamma$ be the one-dimensional Gaussian measure.

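Definition 1.12. Given $\mu \in [0,1]$, define $\chi_\mu : \mathbb{R} \to \{0,1\}$ to be the indicator function of the interval $(-\infty, t]$, where $t$ is chosen so that $E_\gamma[\chi_\mu] = \mu$. Explicitly, $t = \Phi^{-1}(\mu)$, where $\Phi$ denotes the distribution function of a standard Gaussian. Furthermore, define

$$\overline{\Gamma}_\rho(\mu, \nu) = P\big[X \le \Phi^{-1}(\mu),\ Y \le \Phi^{-1}(\nu)\big],$$

$$\underline{\Gamma}_\rho(\mu, \nu) = P\big[X \le \Phi^{-1}(\mu),\ Y \ge \Phi^{-1}(1-\nu)\big],$$

where $(X, Y)$ is a two-dimensional Gaussian vector with covariance matrix $\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$.

Given $(\rho_1,\dots,\rho_{k-1}) \in [0,1]^{k-1}$ and $(\mu_1,\dots,\mu_k) \in [0,1]^k$ for $k \ge 3$ we define by induction

$$\overline{\Gamma}_{\rho_1,\dots,\rho_{k-1}}(\mu_1,\dots,\mu_k) = \overline{\Gamma}_{\rho_1}\big(\mu_1,\ \overline{\Gamma}_{\rho_2,\dots,\rho_{k-1}}(\mu_2,\dots,\mu_k)\big),$$

and similarly $\underline{\Gamma}$.

1.7 Statements of main results. We now state our main results. We state the results both for low influence functions and for general functions. For the latter it is useful to define the following notions:

Definition 1.13. Let $f : \Omega^n \to \mathbb{R}$ and $S \subset [n]$. We define

$$\overline{f}^S(x) = \sup\big(f(y) : y_{[n]\setminus S} = x_{[n]\setminus S}\big), \qquad \underline{f}_S(x) = \inf\big(f(y) : y_{[n]\setminus S} = x_{[n]\setminus S}\big).$$

Theorem 1.14. Let $\big(\prod_{j=1}^k \Omega_i^{(j)}, P_i\big)$, $1 \le i \le n$, be a sequence of finite probability spaces such that for all $1 \le i \le n$ the minimum probability of any atom in $\prod_{j=1}^k \Omega_i^{(j)}$ is at least $\alpha$. Assume furthermore that there exist $\vec{\rho} \in [0,1]^{k-1}$ and $0 \le \rho < 1$ such that

$$\rho\big(\Omega_i^{(1)},\dots,\Omega_i^{(k)}; P_i\big) \le \rho, \qquad \rho\big(\Omega_i^{(\{1,\dots,j\})},\ \Omega_i^{(\{j+1,\dots,k\})}; P_i\big) \le \vec{\rho}(j), \tag{10}$$

for all $i, j$. Then for all $\epsilon > 0$ there exists $\tau > 0$ such that if $f_j : \prod_{i=1}^n \Omega_i^{(j)} \to [0,1]$, for $1 \le j \le k$, satisfy

$$\max_{i,j} \mathrm{Inf}_i(f_j) \le \tau, \tag{11}$$

then

$$\underline{\Gamma}_{\vec{\rho}}\big(E[f_1],\dots,E[f_k]\big) - \epsilon \le E\Big[\prod_{j=1}^k f_j\Big] \le \overline{\Gamma}_{\vec{\rho}}\big(E[f_1],\dots,E[f_k]\big) + \epsilon. \tag{12}$$

If instead of (10) we assume that

$$\rho\big(\Omega_i^{(j)}, \Omega_i^{(j')}; P_i\big) = 0, \tag{13}$$

for all $i$ and $j \ne j'$, then

$$\prod_{j=1}^k E[f_j] - \epsilon \le E\Big[\prod_{j=1}^k f_j\Big] \le \prod_{j=1}^k E[f_j] + \epsilon. \tag{14}$$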

One may take

$$\tau = \epsilon^{O\left(\frac{\log(1/\epsilon)\,\log(1/\alpha)}{(1-\rho)\,\epsilon}\right)}.$$

A truncation argument allows one to relax the conditions on the influences.

Proposition 1.15. For statement (12) to hold in the case where $k = 2$ it suffices to require that

$$\max_i \min\big(\mathrm{Inf}_i(f_1), \mathrm{Inf}_i(f_2)\big) \le \tau \tag{15}$$

instead of (11). In the case where for each $i$ the spaces $\Omega_i^{(1)},\dots,\Omega_i^{(k)}$ are $s$-wise independent, for statement (14) to hold it suffices to require that for all $i$

$$\big|\{j : \mathrm{Inf}_i(f_j) > \tau\}\big| \le s. \tag{16}$$

An easy recursive argument allows one to conclude the following result, which does not require low influences (11).

Proposition 1.16. Consider the setting of Theorem 1.14 without the assumptions on low influences (11). Assuming (10), there exists a set $S$ of size $O(1/\tau)$ such that the functions $\overline{f_j}^S$ satisfy

$$E\Big[\prod_{j=1}^k \overline{f_j}^S\Big] \ge \underline{\Gamma}_{\vec{\rho}}\big(E[\overline{f_1}^S],\dots,E[\overline{f_k}^S]\big) - \epsilon \ge \underline{\Gamma}_{\vec{\rho}}\big(E[f_1],\dots,E[f_k]\big) - \epsilon,$$

and the functions $\underline{f_j}_S$ satisfy

$$E\Big[\prod_{j=1}^k \underline{f_j}_S\Big] \le \overline{\Gamma}_{\vec{\rho}}\big(E[\underline{f_1}_S],\dots,E[\underline{f_k}_S]\big) + \epsilon \le \overline{\Gamma}_{\vec{\rho}}\big(E[f_1],\dots,E[f_k]\big) + \epsilon.$$

Assuming (13), we have

$$E\Big[\prod_{j=1}^k \overline{f_j}^S\Big] \ge \prod_{j=1}^k E[\overline{f_j}^S] - \epsilon \ge \prod_{j=1}^k E[f_j] - \epsilon,$$

and similarly for $\underline{f}$.

1.8 Road map. Let us review some of the main techniques we use in this paper.

We develop a Fourier theory on correlated spaces in section 2. Previous work considered Fourier theory on one product space and reversible operators with respect to that space [DMR]. Our results here allow us to study non-reversible operators, which in turn allows us to study products of $k$ correlated spaces. An important fact we prove that is used repeatedly is that general noise operators respect the Efron-Stein decomposition. This fact in particular allows us to truncate functions to their low-degree parts when considering the expected value of the product of functions on correlated spaces.

In order to derive an invariance principle we need to extend the approach of [Ro], [MOO1,2] to prove the joint invariance of a number of multilinear polynomials. The proof of the extension appears in sections 3 and 4. The proof follows the same main steps as in [Ro], [MOO1,2], i.e. the Lindeberg strategy for proving the CLT [L], where invariance is established by switching one variable at a time.

In the Gaussian realm, we need to extend Borell's isoperimetric result [Bor] both in the case of two collections of Gaussians and in the case of $k > 2$ collections. This is done in section 5.

The proof of the main result, Theorem 1.14, follows in section 6. The proof of the extensions given in Proposition 1.15 uses a truncation argument for which $s$-wise independence plays a crucial role. The proof of Proposition 1.16 is based on a simple recursive argument.

In section 7 we apply the noise bounds in order to derive the social choice results. Some calculations with the majority function in the social choice setting, in particular showing the tightness of Theorems 1.1 and 1.5, are given in section 7.5. We conclude by discussing the applications to hypergraphs in section 8.

1.9 Subsequent work and applications in computer science. Subsequent to posting a draft of this paper on the arXiv, two applications of our results to hardness of approximation in computer science were established. Both results are in the context of the Unique Games conjecture in computational complexity [Kh]. Furthermore, both results consider an important problem in computer science, that is, the problem of solving constraint satisfaction problems (CSP).

Given a predicate $P : [q]^k \to \{0,1\}$, where $[q] = \{1,\dots,q\}$ for some integer $q$, we define Max CSP($P$) to be the algorithmic problem where we are given a set of variables $x_1,\dots,x_n$ taking values in $[q]$ and a set of constraints of the form $P(l_1,\dots,l_k)$, where each $l_i = x_j + a$, where $x_j$ is one of the variables and $a \in [q]$ is a constant (addition is mod $q$). More generally, in the problem of Max $k$-CSP$_q$ we are given a set of constraints each involving $k$ of the variables $x_1,\dots,x_n$. The most well-studied case is the case of $q = 2$, denoted Max $k$-CSP. The objective is to find an assignment to the variables satisfying as many of the constraints as possible.
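The definition of Max CSP($P$) can be made concrete with a small sketch. Everything here is illustrative: values are taken in $\{0,\dots,q-1\}$ instead of $\{1,\dots,q\}$ (which does not matter since addition is mod $q$), a constraint is stored as a tuple of (variable index, shift) pairs, and the function names are hypothetical. The baseline `random_assignment_value` assumes that the $k$ variables in each constraint are distinct, in which case a uniformly random assignment satisfies a $|P^{-1}(1)|/q^k$ fraction of the constraints in expectation.

```python
from itertools import product

def satisfied_fraction(P, q, constraints, assignment):
    """Fraction of constraints P(l_1, ..., l_k) satisfied, with l_i = x_j + a mod q."""
    hits = 0
    for literals in constraints:
        values = tuple((assignment[j] + a) % q for j, a in literals)
        hits += P(values)
    return hits / len(constraints)

def random_assignment_value(P, q, k):
    """Expected satisfied fraction under a uniform random assignment: |P^{-1}(1)| / q^k."""
    return sum(P(v) for v in product(range(q), repeat=k)) / q ** k

def optimum(P, q, n, constraints):
    """Best satisfied fraction over all q^n assignments (brute force, tiny n only)."""
    return max(
        satisfied_fraction(P, q, constraints, x)
        for x in product(range(q), repeat=n)
    )

def xor3(values):
    # Example predicate on {0,1}^3: accept when the sum is odd.
    return 1 if sum(values) % 2 == 1 else 0

if __name__ == "__main__":
    # Two contradictory parity constraints on the same three variables.
    constraints = [((0, 0), (1, 0), (2, 0)), ((0, 1), (1, 0), (2, 0))]
    print(random_assignment_value(xor3, 2, 3), optimum(xor3, 2, 3, constraints))
```

In the usage example the two constraints cannot hold simultaneously, so the optimum equals the value achieved by a random assignment.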
The problem of Max $k$-CSP$_q$ is NP-hard for any $k \ge 2$, $q \ge 2$, and as a consequence, a large body of research is devoted to studying how well the problem can be approximated. We say that a (randomized) algorithm has approximation ratio $\alpha$ if, for all instances, the algorithm is guaranteed to find an assignment which (in expectation) satisfies at least $\alpha \cdot \mathrm{Opt}$ of the constraints, where Opt is the maximum number of simultaneously satisfied constraints, over any assignment.

The results of [AM1] (see also [AM2]) show that, assuming the Unique Games conjecture, any predicate $P$ for which there exists a pairwise independent distribution over $[q]^k$ with uniform marginals, whose support is contained in $P^{-1}(1)$, is approximation resistant. In other words, there is no polynomial time algorithm which achieves a better approximation factor than assigning the variables at random. This result implies in turn that for general $k \ge 3$ and $q \ge 2$, the Max $k$-CSP$_q$ problem is UG-hard to approximate within $O(kq^2)/q^k + \epsilon$. Moreover, for the special case of $q = 2$, i.e. boolean variables, it gives hardness of $(k + O(k^{0.525}))/2^k + \epsilon$, improving upon the best previous bound [ST] of $2k/2^k + \epsilon$ by essentially a factor 2. Finally, again for $q = 2$, assuming that the famous Hadamard conjecture is true, the results are further improved, and the $O(k^{0.525})$ term can be replaced by the constant 4.

These results should be compared to prior work by Samorodnitsky and Trevisan [ST] who, using the Gowers norm, proved that the Max $k$-CSP problem has a hardness factor of $2^{\lceil \log_2 (k+1) \rceil}/2^k$, which is $(k+1)/2^k$ for $k = 2^r - 1$, but can be as large as $2k/2^k$ for general $k$.

From the quantitative point of view, [AM2] gives stronger hardness than [ST] for Max $k$-CSP$_q$, even in the already thoroughly explored $q = 2$ case. These improvements may seem very small, being an improvement only by a multiplicative factor 2. However, it is well known that it is impossible to get non-approximability results which are better than $(k+1)/2^k$, and thus, in this respect, the hardness of $(k+4)/2^k$ assuming the Hadamard conjecture is in fact optimal to within a very small additive factor. Also, the results of [AM2] give approximation resistance of Max CSP($P$) for a much larger variety of predicates (any $P$ containing a balanced pairwise independent distribution).

From a qualitative point of view, the analysis of [AM2] is very direct. Furthermore, it is general enough to accommodate any domain $[q]$ with virtually no extra effort. Also, their proof uses the main result of the current paper, i.e. bounds on expectations of products under certain types of correlation, putting it in the same general framework as many other UGC-based hardness results, in particular those for 2-CSPs.

In a second beautiful result by Raghavendra [R], the results of the current paper were used to obtain very general hardness results for Max CSP($P$).
In [R] it is shown that for every predicate $P$ and for every approximation factor which is smaller than the UG-hardness of the problem, there exists a polynomial time algorithm which achieves this approximation ratio. Thus for every $P$ the UG-hardness of Max CSP($P$) is sharp. The proof of the results uses the results obtained here in order to define and analyze the reduction from UG given the integrality gap of the corresponding convex optimization problem. We note that for most predicates the UG-hardness of Max CSP($P$) is unknown and therefore the results of [R] complement those of [AM1].

1.10 Acknowledgments. I would like to thank Noam Nisan for suggesting that generalization of the invariance principle should be useful in the social choice context and Gil Kalai for pointing out some helpful references. I would like to thank Terence Tao for helpful discussions and references on additive number theory and Szemerédi regularity. Thanks to Per Austrin, Nick Crawford, Marcus Issacson and Ryan O'Donnell for a careful reading of a draft of this paper. Finally, many thanks to an anonymous referee for many helpful comments.

2 Correlated Spaces and Noise

In this section we define and study the notion of correlated spaces and noise operators in a general setting.

2.1 Correlated probability spaces and noise operators. We begin by defining noise operators and giving some basic examples.

Definition 2.1. Let $(\Omega^{(1)} \times \Omega^{(2)}, P)$ be two correlated spaces. The Markov operator associated with $(\Omega^{(1)}, \Omega^{(2)})$ is the operator mapping $f \in L^p(\Omega^{(2)}, P)$ to $Tf \in L^p(\Omega^{(1)}, P)$ by

$$(Tf)(x) = E\big[f(Y) \mid X = x\big],$$

for $x \in \Omega^{(1)}$ and where $(X, Y) \in \Omega^{(1)} \times \Omega^{(2)}$ is distributed according to $P$.

Example 2.2. In order to define the Bonami-Beckner operator $T = T_\rho$ on a space $(\Omega, \mu)$, consider the space $(\Omega \times \Omega, \nu)$ where

$$\nu(x, y) = (1-\rho)\,\mu(x)\mu(y) + \rho\,\delta(x = y)\,\mu(x),$$

where $\delta(x = y)$ is the function on $\Omega \times \Omega$ which takes the value 1 when $x = y$, and 0 otherwise. In this case, the operator $T$ satisfies

$$(Tf)(x) = E\big[f(Y) \mid X = x\big], \tag{17}$$

where the conditional distribution of $Y$ given $X = x$ is $\rho\,\delta_x + (1-\rho)\mu$, where $\delta_x$ is the delta measure on $x$.

Remark 2.3. The construction above may be generalized as follows. Given any Markov chain on $\Omega$ that is reversible with respect to $\mu$, we may look at the measure $\nu$ on $\Omega \times \Omega$ defined by the Markov chain. In this case $T$ is the Markov operator determined by the chain. The same construction applies under the weaker condition that $T$ has $\mu$ as its stationary distribution.

It is straightforward to verify that

Proposition 2.4. Suppose that for each $1 \le i \le n$, $(\Omega_i^{(1)} \times \Omega_i^{(2)}, \mu_i)$ are correlated spaces and $T_i$ is the Markov operator associated with $\Omega_i^{(1)}$ and $\Omega_i^{(2)}$. Then $\big(\prod_{i=1}^n \Omega_i^{(1)} \times \prod_{i=1}^n \Omega_i^{(2)},\ \prod_{i=1}^n \mu_i\big)$ defines two correlated spaces and the Markov operator $T$ associated with them is given by $T = \otimes_{i=1}^n T_i$.

Example 2.5. For product spaces $\big(\prod_{i=1}^n \Omega_i, \prod_{i=1}^n \mu_i\big)$, the Bonami-Beckner operator $T = T_\rho$ is defined by

$$T = \otimes_{i=1}^n T_\rho^i, \tag{18}$$

where $T_\rho^i$ is the Bonami-Beckner operator on $(\Omega_i \times \Omega_i, \mu_i)$. This Markov operator is the one most commonly discussed in previous work, see e.g. [KKL], [KhKMO1], [MOO2]. In a more recent work [DMR] the case of $\Omega \times \Omega$ with $T$ a reversible Markov operator with respect to a measure $\mu$ on $\Omega$ was studied.
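As a small illustration of Examples 2.2 and 2.5, the sketch below builds the product Bonami-Beckner operator on $\{-1,1\}^n$ with the uniform measure directly from the one-coordinate kernel $\rho\,\delta_x + (1-\rho)\mu$. The names `bonami_beckner` and `character` are hypothetical, and the representation of $Tf$ as a dictionary over the cube is only practical for small $n$. The usage example checks the familiar fact that $T_\rho$ multiplies the character $\prod_{i \in S} x_i$ by $\rho^{|S|}$.

```python
from itertools import product
from math import prod

def bonami_beckner(f, n, rho):
    """Return T_rho f on {-1,1}^n (uniform measure) as a dict mapping x to (Tf)(x)."""
    cube = list(product((-1, 1), repeat=n))

    def kernel(x, y):
        # Tensor product of one-coordinate kernels: given X_i = x_i, the value
        # Y_i equals x_i with probability rho and is uniform otherwise.
        return prod(rho * (xi == yi) + (1 - rho) * 0.5 for xi, yi in zip(x, y))

    return {x: sum(kernel(x, y) * f(y) for y in cube) for x in cube}

def character(S):
    """The function x -> prod_{i in S} x_i."""
    return lambda x: prod(x[i] for i in S)

if __name__ == "__main__":
    rho, S = 0.4, (0, 2)
    Tf = bonami_beckner(character(S), 3, rho)
    print(all(abs(Tf[x] - rho ** len(S) * character(S)(x)) < 1e-12 for x in Tf))
```

Writing the kernel as a product over coordinates is the computational counterpart of the tensor formula $T = \otimes_{i=1}^n T_\rho^i$ in (18).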
Example 2.6. In the context of the Majority is most predictable theorem, Theorem 1.1, the underlying space is $\Omega = \{\pm 1\} \times \{0, \pm 1\}$, where an element $(x, y) \in \Omega$ corresponds to a voter with vote $x$ and a sampled vote $y$, where either $y = x$ if the vote is queried or $y = 0$ otherwise. The probability measure $\mu$ is given by

$$\mu(x, y) = \tfrac{1}{2}\big(\delta(x = 1) + \delta(x = -1)\big)\big(\rho\,\delta(y = x) + (1-\rho)\,\delta(y = 0)\big).$$

Note that the marginal distributions on $\Omega_S = \{0, \pm 1\}$ and $\Omega_V = \{\pm 1\}$ are given by

$$\mu = (1-\rho)\,\delta_0 + \tfrac{\rho}{2}(\delta_{-1} + \delta_1), \qquad \nu = \tfrac{1}{2}(\delta_{-1} + \delta_1),$$

and

$$\nu(\cdot \mid \pm 1) = \delta_{\pm 1}, \qquad \nu(\cdot \mid 0) = \tfrac{1}{2}(\delta_{-1} + \delta_1).$$

Given independent copies $\mu_i$ of $\mu$ and $\nu_i$ of $\nu$, the measure $\mu = \otimes_{i=1}^n \mu_i$ corresponds to the distribution of a sample of voters where each voter is sampled independently with probability $\rho$, and the distribution of the voters is given by $\nu = \otimes_{i=1}^n \nu_i$.

Example 2.7. The second non-reversible example is natural in the context of Condorcet voting. For simplicity, we first discuss the case of 3 possible outcomes. The general case is discussed later.

Let $\tau$ denote the uniform measure on the set of permutations on the set $[3]$, denoted $S_3$. Note that each element $\sigma \in S_3$ defines an element $f \in \{-1,1\}^{\binom{3}{2}}$ by letting $f(i, j) = \mathrm{sgn}(\sigma(i) - \sigma(j))$. The measure so defined defines 3 correlated probability spaces $\big(\{\pm 1\}^{\binom{3}{2}}, P\big)$. Note that the projection of $P$ to each coordinate is uniform and

$$P\big(f(3,1) = -1 \mid f(1,2) = f(2,3) = -1\big) = 0, \qquad P\big(f(3,1) = 1 \mid f(1,2) = f(2,3) = 1\big) = 0,$$

and

$$P\big(f(3,1) = \pm 1 \mid f(1,2) \ne f(2,3)\big) = 1/2.$$

2.2 Properties of correlated spaces and Markov operators. Here we derive properties of correlated spaces and Markov operators that will be repeatedly used below. We start with the following, which was already known to Rényi [Ré].

Lemma 2.8. Let $(\Omega^{(1)} \times \Omega^{(2)}, P)$ be two correlated spaces. Let $f$ be an $\Omega^{(2)}$ measurable function with $E[f] = 0$ and $E[f^2] = 1$. Then among all $g$ that are $\Omega^{(1)}$ measurable satisfying $E[g^2] = 1$, a maximizer of $|E[fg]|$ is given by

$$g = \frac{Tf}{\sqrt{E[(Tf)^2]}}, \tag{19}$$

where $T$ is the Markov operator associated with $(\Omega^{(1)}, \Omega^{(2)})$. Moreover,

$$|E[gf]| = \frac{|E[f\,Tf]|}{\sqrt{E[(Tf)^2]}} = \sqrt{E[(Tf)^2]}. \tag{20}$$

Proof. To prove (19) let $h$ be an $\Omega^{(1)}$ measurable function with $\|h\|_2 = 1$. Write $h = \alpha g + \beta h'$, where $\alpha^2 + \beta^2 = 1$ and $h'$, with $\|h'\|_2 = 1$, is orthogonal to $g$. From the properties of conditional expectation it follows that $E[f h'] = 0$. Therefore we may choose an optimizer satisfying $\alpha \in \pm 1$. Equation (20) follows since $Tf$ is a conditional expectation.
The same reasoning shows that $E[fg] = 0$ for every $\Omega^{(1)}$ measurable function $g$ if $Tf$ is identically 0.

The following lemma is useful in bounding $\rho(\Omega^{(1)}, \Omega^{(2)}; P)$ from Definition 1.9 in generic situations. Roughly speaking, it shows that connectivity of the support of $P$ on correlated spaces $\Omega^{(1)} \times \Omega^{(2)}$ implies that $\rho < 1$.
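Before turning to that lemma, the quantity $\rho(\Omega^{(1)}, \Omega^{(2)}; P)$ can be computed numerically for small finite spaces using the singular value description in Remark 1.10. The sketch below is one possible way to do this and is not taken from the paper: it normalizes the joint probability matrix by the square roots of the marginals, projects out the top singular vector (which corresponds to constant functions and singular value 1), and runs a plain power iteration. The function name, the starting vector and the iteration count are arbitrary choices, and convergence can be slow when the second and third singular values are close.

```python
from math import sqrt

def maximal_correlation(P, iterations=200):
    """Second singular value of Q[a][b] = P[a][b] / sqrt(p1[a] * p2[b]).

    P[a][b] is the joint probability of (a, b); both marginals must be positive.
    """
    rows, cols = len(P), len(P[0])
    p1 = [sum(P[a]) for a in range(rows)]
    p2 = [sum(P[a][b] for a in range(rows)) for b in range(cols)]
    Q = [[P[a][b] / sqrt(p1[a] * p2[b]) for b in range(cols)] for a in range(rows)]
    top = [sqrt(w) for w in p2]            # right singular vector for singular value 1
    v = [float(b + 1) for b in range(cols)]
    sigma = 0.0
    for _ in range(iterations):
        # Remove the component along constant functions, then normalize.
        dot = sum(vi * ti for vi, ti in zip(v, top))
        v = [vi - dot * ti for vi, ti in zip(v, top)]
        norm = sqrt(sum(vi * vi for vi in v))
        if norm < 1e-15:
            return 0.0
        v = [vi / norm for vi in v]
        u = [sum(Q[a][b] * v[b] for b in range(cols)) for a in range(rows)]
        sigma = sqrt(sum(ui * ui for ui in u))                 # ||Q v||
        v = [sum(Q[a][b] * u[a] for a in range(rows)) for b in range(cols)]
    return sigma

if __name__ == "__main__":
    rho = 0.25
    # Example 2.6: rows are votes (-1, +1), columns are samples (-1, 0, +1).
    sampling = [[rho / 2, (1 - rho) / 2, 0.0], [0.0, (1 - rho) / 2, rho / 2]]
    print(maximal_correlation(sampling))
```

For the sampling space of Example 2.6 the only mean-zero functions of the vote are multiples of $x$, and the correlation of $X$ with $Y$ is $\rho/\sqrt{\rho} = \sqrt{\rho}$, so the printed value should be close to $0.5$ when $\rho = 0.25$.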

Lemma 2.9. Let $(\Omega^{(1)} \times \Omega^{(2)}, P)$ be two correlated spaces such that the probability of the smallest atom in $\Omega^{(1)} \times \Omega^{(2)}$ is at least $\alpha > 0$. Define a bi-partite graph $G = (\Omega^{(1)}, \Omega^{(2)}, E)$ where $(a, b) \in \Omega^{(1)} \times \Omega^{(2)}$ satisfies $(a, b) \in E$ if $P(a, b) > 0$. Then if $G$ is connected,

$$\rho\big(\Omega^{(1)}, \Omega^{(2)}; P\big) \le 1 - \alpha^2/2.$$

Proof. For the proof it is useful to consider $G = (\Omega^{(1)} \cup \Omega^{(2)}, E)$, a weighted directed graph where the weight $W(a, b)$ of the directed edge from $a$ to $b$ is $P[b \mid a]$ and the weight of the directed edge from $b$ to $a$ is $W(b, a) = P[a \mid b]$. Note that the minimal non-zero weight must be at least $\alpha$ and that $W(a, b) > 0$ iff $W(b, a) > 0$. This latter fact implies that $G$ is strongly connected. Note furthermore that $G$ is bi-partite.

Let $A$ be the transition probability matrix defined by the weighted graph $G$. Since $G$ is connected and $W(a, b) \ge \alpha$ for all $a$ and $b$ such that $W(a, b) > 0$, it follows by Cheeger's inequality that the spectral gap of $A$ is at least $\alpha^2/2$. Since $G$ is connected and bi-partite, the multiplicities of the eigenvalues $\pm 1$ are both 1. Corresponding eigenfunctions are the constant 1 function and the function taking the value 1 on $\Omega^{(1)}$ and the value $-1$ on $\Omega^{(2)}$.

In order to bound $\rho$ it suffices by Lemma 2.8 to bound $\|Af\|_2$ for a function $f$ that is supported on $\Omega^{(2)}$ and satisfies $E[f] = 0$. Note that such a function is orthogonal to the eigenvectors of $A$ corresponding to the eigenvalues $-1$ and $1$. It therefore follows that $\|Af\|_2 \le (1 - \alpha^2/2)\,\|f\|_2$, as needed.

One nice property of Markov operators that will be used below is that they respect the Efron-Stein decomposition. Given a vector $x$ in an $n$-dimensional product space and $S \subset [n]$, we write $x_S$ for the vector $(x_i : i \in S)$. Given probability spaces $\Omega_1,\dots,\Omega_n$, we use the convention of writing $X_i$ for a random variable that is distributed according to the measure of $\Omega_i$ and $x_i$ for an element of $\Omega_i$. We will also write $X_S$ for $(X_i : i \in S)$.

Definition 2.10. Let $(\Omega_1, \mu_1),\dots,(\Omega_n, \mu_n)$ be discrete probability spaces and let $(\Omega, \mu) = \prod_{i=1}^n (\Omega_i, \mu_i)$.
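The Efron–Stein decomposition of $f : \Omega \to \mathbb{R}$ is given by
$$f(x) = \sum_{S \subseteq [n]} f_S(x_S), \tag{21}$$
where the functions $f_S$ satisfy the following: $f_S$ depends only on $x_S$; and for all $S \not\subseteq S'$ and all $x_{S'}$, it holds that $E[f_S \mid X_{S'} = x_{S'}] = 0$.

It is well known that the Efron–Stein decomposition exists and that it is unique [ES]. We quickly recall the proof of existence. The function $f_S$ is given by
$$f_S(x) = \sum_{S' \subseteq S} (-1)^{|S \setminus S'|} \, E[f(X) \mid X_{S'} = x_{S'}],$$
which implies
$$\sum_{S} f_S(x) = \sum_{S'} E[f \mid X_{S'} = x_{S'}] \sum_{S : S' \subseteq S} (-1)^{|S \setminus S'|} = E[f \mid X_{[n]} = x_{[n]}] = f(x).$$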
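The inclusion–exclusion formula for $f_S$ above can be evaluated directly on a small finite product space. The following is a minimal sketch, not taken from the paper: each $\Omega_i$ is assumed to be encoded as $\{0, \ldots, m_i - 1\}$ with its measure given as a list of probabilities, and the names `cond_exp`, `efron_stein`, `marginals` and the test function `f` are hypothetical choices made for illustration.

```python
from itertools import combinations, product


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = tuple(items)
    for r in range(len(items) + 1):
        for c in combinations(items, r):
            yield frozenset(c)


def cond_exp(f, marginals, S, x):
    """E[f(X) | X_S = x_S] under the product measure given by `marginals`."""
    free = [i for i in range(len(marginals)) if i not in S]
    total = 0.0
    # Average over the coordinates outside S; coordinates in S stay fixed at x.
    for vals in product(*(range(len(marginals[i])) for i in free)):
        y = list(x)
        weight = 1.0
        for i, v in zip(free, vals):
            y[i] = v
            weight *= marginals[i][v]
        total += weight * f(tuple(y))
    return total


def efron_stein(f, marginals):
    """Return {S: f_S} with f_S(x) = sum_{S' subset of S} (-1)^{|S \\ S'|} E[f | X_{S'} = x_{S'}]."""
    def component(S):
        def f_S(x):
            return sum((-1) ** len(S - Sp) * cond_exp(f, marginals, Sp, x)
                       for Sp in subsets(S))
        return f_S
    return {S: component(S) for S in subsets(range(len(marginals)))}


# Hypothetical example: three coordinates with non-uniform measures.
marginals = [[0.3, 0.7], [0.5, 0.5], [0.2, 0.3, 0.5]]
f = lambda x: x[0] * x[1] + 2 * x[2] + x[0] * x[2] ** 2
parts = efron_stein(f, marginals)

x0 = (1, 0, 2)
print(sum(p(x0) for p in parts.values()), f(x0))  # the two values agree up to rounding
```

The sketch enumerates all subsets, so it is only meant for very small $n$; it illustrates (21) and the vanishing conditional expectations rather than an efficient algorithm.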

Moreover, for $S \not\subseteq S'$ we have $E[f_S \mid X_{S'} = x_{S'}] = E[f_S \mid X_{S' \cap S} = x_{S' \cap S}]$, and for $S'$ that is a strict subset of $S$ we have
$$E[f_S \mid X_{S'} = x_{S'}] = \sum_{S'' \subseteq S} (-1)^{|S \setminus S''|} \, E[f(X) \mid X_{S'' \cap S'} = x_{S'' \cap S'}] = \sum_{S'' \subseteq S'} E[f(X) \mid X_{S''} = x_{S''}] \sum_{S'' \subseteq \tilde{S} \subseteq S'' \cup (S \setminus S')} (-1)^{|S \setminus \tilde{S}|} = 0.$$

We now prove that the Efron–Stein decomposition commutes with Markov operators.

Proposition 2.11. Let $(\Omega_i^{(1)} \times \Omega_i^{(2)}, P_i)$ be correlated spaces and let $T_i$ be the Markov operator associated with $\Omega_i^{(1)}$ and $\Omega_i^{(2)}$ for $1 \le i \le n$. Let
$$\Omega^{(1)} = \prod_{i=1}^n \Omega_i^{(1)}, \qquad \Omega^{(2)} = \prod_{i=1}^n \Omega_i^{(2)}, \qquad P = \prod_{i=1}^n P_i, \qquad T = \bigotimes_{i=1}^n T_i.$$
Suppose $f \in L^2(\Omega^{(2)})$ has Efron–Stein decomposition (21). Then the Efron–Stein decomposition of $Tf$ satisfies $(Tf)_S = T(f_S)$.

Proof. Clearly $T(f_S)$ is a function of $x_S$ only. Moreover, for all $S \not\subseteq S'$ and $x^0_{S'}$ it holds that
$$E[(T(f_S))(X) \mid X_{S'} = x^0_{S'}] = E[f_S(Y) \mid X_{S'} = x^0_{S'}] = E\big[ E[f_S(Y) \mid Y_{S'}] \,\big|\, X_{S'} = x^0_{S'} \big] = 0,$$
where the second equality follows from the fact that $Y$ is independent of $X_{S'}$ conditioned on $Y_{S'}$.

We next derive a useful bound showing that in the setting above, if $\rho(\Omega_i^{(1)}, \Omega_i^{(2)}; P_i) < 1$ for all $i$, then $Tf$ depends on the low degree expansion of $f$.

Proposition 2.12. Assume the setting of Proposition 2.11 and that further for all $i$ it holds that $\rho(\Omega_i^{(1)}, \Omega_i^{(2)}; P_i) \le \rho_i$. Then for all $f$ it holds that
$$\|T(f_S)\|_2 \le \Big( \prod_{i \in S} \rho_i \Big) \|f_S\|_2.$$

Proof. Without loss of generality it suffices to prove the statement of Proposition 2.12 for $S = [n]$. Thus our goal is to show that
$$\|Tf\|_2 \le \Big( \prod_{i=1}^n \rho_i \Big) \|f\|_2$$
(from now on $S$ will denote a set different from $[n]$). For each $0 \le r \le n$, let $T^{(r)}$ denote the following operator. $T^{(r)}$ maps a function $g$ of $z = (z_1, \ldots, z_n) = (x_1, \ldots, x_{r-1}, y_r, \ldots, y_n)$ to a function $T^{(r)} g$ of $w = (w_1, \ldots, w_n) = (x_1, \ldots, x_r, y_{r+1}, \ldots, y_n)$ defined as follows:
$$T^{(r)} g(w) = E[g(Z) \mid W = w].$$
(Here $Z = (X_1, \ldots, X_{r-1}, Y_r, \ldots, Y_n)$ and similarly $W$.)

Let $g$ be a function such that for any subset $S \subsetneq [n]$ and all $z_S$,
$$E[g(Z) \mid Z_S = z_S] = 0.$$
We claim that
$$\|T^{(r)} g\|_2 \le \rho_r \|g\|_2 \tag{22}$$
and that for all subsets $S \subsetneq [n]$ it holds that
$$E[(T^{(r)} g)(W) \mid W_S = w_S] = 0. \tag{23}$$
Note that (22) and (23) together imply the desired bound as $T = T^{(n)} \cdots T^{(1)}$.

For (22) note that if $S = [n] \setminus \{r\}$ and $f = T^{(r)} g$ then by Lemma 2.8
$$E[f^2(W) \mid Z_S = z_S] = E[g(Z) f(W) \mid Z_S = z_S] \le \rho_r \sqrt{E[f^2(W) \mid Z_S = z_S] \, E[g^2(Z) \mid Z_S = z_S]}.$$
So
$$E[f^2(W) \mid Z_S = z_S] \le \rho_r^2 \, E[g^2(Z) \mid Z_S = z_S],$$
which gives $\|f\|_2 \le \rho_r \|g\|_2$ by integration.

For (23) we note that if $S \subsetneq [n]$ then
$$E[f(W) \mid W_S = w_S] = E[g(Z) \mid W_S = w_S] = E\big[ E[g(Z) \mid Z_S] \,\big|\, W_S = w_S \big] = 0.$$
This concludes the proof.

Proposition 2.13. Assume the setting of Proposition 2.11. Then
$$\rho\Big( \prod_{i=1}^n \Omega_i^{(1)}, \prod_{i=1}^n \Omega_i^{(2)}; \prod_{i=1}^n P_i \Big) = \max_i \rho(\Omega_i^{(1)}, \Omega_i^{(2)}; P_i).$$

Proof. Write $\rho_i = \rho(\Omega_i^{(1)}, \Omega_i^{(2)}; P_i)$. Let $f \in L^2\big( \prod_{i=1}^n \Omega_i^{(2)} \big)$ with $E[f] = 0$ and $\mathrm{Var}[f] = 1$. Expand $f$ according to its Efron–Stein decomposition
$$f = \sum_{S \ne \emptyset} f_S$$
($f_\emptyset = 0$ since $E[f] = 0$). Then by Propositions 2.11 and 2.12
$$E[(Tf)^2] = E\Big[ \Big( T \sum_{S \ne \emptyset} f_S \Big)^2 \Big] = E\Big[ \Big( \sum_{S \ne \emptyset} T f_S \Big)^2 \Big] = \sum_{S \ne \emptyset} E[(T f_S)^2] \le \sum_{S \ne \emptyset} \prod_{i \in S} \rho_i^2 \, \|f_S\|_2^2 \le \max_i \rho_i^2 \sum_{S \ne \emptyset} \|f_S\|_2^2 = \max_i \rho_i^2.$$
The other inequality is trivial.

3 Background: Influences and Hypercontractivity

In this section we recall and generalize some definitions and results from [MOO2]. In particular, the generalizations allow us to study non-reversible Markov operators and correlated ensembles. For the reader who is familiar with [MOO2] it suffices to look at subsections 3.3 and 3.5.
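Looking back at Section 2, Lemma 2.9 and Proposition 2.13 can be checked numerically on small examples. The sketch below is not from the paper. It relies on the standard fact that, for finite spaces with strictly positive marginals, the maximal correlation $\rho$ equals the second largest singular value of the matrix $P(a,b)/\sqrt{P_1(a) P_2(b)}$. The function names and the two example distributions are hypothetical, NumPy is assumed to be available, and connectivity of the support is assumed rather than verified.

```python
import numpy as np


def max_correlation(P):
    """Second largest singular value of P(a,b) / sqrt(P1(a) P2(b)).

    Assumes P is a joint probability matrix with strictly positive marginals
    and at least two atoms on each side.
    """
    P = np.asarray(P, dtype=float)
    p1 = P.sum(axis=1)                      # marginal on Omega^(1)
    p2 = P.sum(axis=0)                      # marginal on Omega^(2)
    M = P / np.sqrt(np.outer(p1, p2))
    singular_values = np.linalg.svd(M, compute_uv=False)  # sorted, descending
    return singular_values[1]               # the largest one is always 1


def lemma_2_9_bound(P):
    """1 - alpha^2 / 2, with alpha the smallest nonzero atom of P."""
    P = np.asarray(P, dtype=float)
    alpha = P[P > 0].min()
    return 1.0 - alpha ** 2 / 2.0


# Hypothetical examples; both supports are connected as bipartite graphs.
P_sym = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
P_rect = np.array([[0.25, 0.25, 0.00],
                   [0.00, 0.25, 0.25]])

rho_sym = max_correlation(P_sym)
rho_rect = max_correlation(P_rect)
# np.kron builds the joint matrix of the product of the two correlated spaces.
rho_product = max_correlation(np.kron(P_sym, P_rect))

print(rho_sym, lemma_2_9_bound(P_sym))
print(rho_rect, lemma_2_9_bound(P_rect))
print(rho_product, max(rho_sym, rho_rect))
```

In both examples the bound of Lemma 2.9 holds with a lot of room, and the product space has the same $\rho$ as the larger of the two factors, as Proposition 2.13 predicts.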

3.1 Influences and noise stability in product spaces. Let $(\Omega_1, \mu_1), \ldots, (\Omega_n, \mu_n)$ be probability spaces and let $(\Omega, \mu)$ denote the product probability space. Let $f = (f_1, \ldots, f_k) : \Omega_1 \times \cdots \times \Omega_n \to \mathbb{R}^k$.

Definition 3.1. The influence of the $i$th coordinate on $f$ is
$$\mathrm{Inf}_i(f) = \sum_{1 \le j \le k} E\big[ \mathrm{Var}[f_j \mid x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n] \big].$$

3.2 Multilinear polynomials. In this subsection we recall and slightly generalize the setup and notation used in [MOO2]. Recall that we are interested in functions on products of finite probability spaces, $f : \Omega_1 \times \cdots \times \Omega_n \to \mathbb{R}$. For each $i$, the space of all functions $\Omega_i \to \mathbb{R}$ can be expressed as the span of a finite set of orthonormal random variables, $X_{i,0} = 1, X_{i,1}, X_{i,2}, X_{i,3}, \ldots$; then $f$ can be written as a multilinear polynomial in the $X_{i,j}$'s. In fact, it will be convenient for us to mostly disregard the $\Omega_i$'s and work directly with sets of orthonormal random variables; in this case, we can even drop the restriction of finiteness. We thus begin with the following definition:

Definition 3.2 [MOO2]. We call a collection of finitely many orthonormal real random variables, one of which is the constant $1$, an orthonormal ensemble. We will write a typical sequence of $n$ orthonormal ensembles as $\mathcal{X} = (\mathcal{X}_1, \ldots, \mathcal{X}_n)$, where $\mathcal{X}_i = \{X_{i,0} = 1, X_{i,1}, \ldots, X_{i,m_i}\}$. We call a sequence of orthonormal ensembles $\mathcal{X}$ independent if the ensembles are independent families of random variables.

We will henceforth be concerned only with independent sequences of orthonormal ensembles, and we will call these sequences of ensembles, for brevity. Similarly, when writing an ensemble we will always mean an orthonormal ensemble.

Remark 3.3 [MOO2]. Given a sequence of independent random variables $X_1, \ldots, X_n$ with $E[X_i] = 0$ and $E[X_i^2] = 1$, we can view them as a sequence of ensembles $\mathcal{X}$ by renaming $X_i = X_{i,1}$ and setting $X_{i,0} = 1$ as required.

Definition 3.4 [MOO2]. We denote by $\mathcal{G}$ the Gaussian sequence of ensembles, in which $\mathcal{G}_i = \{G_{i,0} = 1, G_{i,1}, G_{i,2}, \ldots, G_{i,m_i}\}$ and all $G_{i,j}$'s with $j \ge 1$ are independent standard Gaussians.
The Gaussian ensembles discussed in this paper will often have $m_i$ chosen to match the $m_i$ of a given ensemble.

As mentioned, we will be interested in multilinear polynomials over sequences of ensembles. By this we mean sums of products of the random variables, where each product is obtained by multiplying one random variable from each ensemble.

Definition 3.5 [MOO2]. A multi-index $\sigma$ is a sequence $(\sigma_1, \ldots, \sigma_n)$ in $\mathbb{N}^n$. The degree of $\sigma$, denoted $|\sigma|$, is $|\{i \in [n] : \sigma_i > 0\}|$. Given a doubly-indexed set of indeterminates $\{x_{i,j}\}_{i \in [n], j \in \mathbb{N}}$, we write $x_\sigma$ for the monomial $\prod_{i=1}^n x_{i,\sigma_i}$. We now define a multilinear polynomial over such a set of indeterminates to be any expression
$$Q(x) = \sum_{\sigma} c_\sigma x_\sigma \tag{24}$$

where the $c_\sigma$'s are real constants, all but finitely many of which are zero. The degree of $Q(x)$ is $\max\{|\sigma| : c_\sigma \ne 0\}$, at most $n$. We also use the notation
$$Q^{\le d}(x) = \sum_{|\sigma| \le d} c_\sigma x_\sigma$$
and, analogously, $Q^{=d}(x)$ and $Q^{>d}(x)$.

Naturally, we will consider applying multilinear polynomials $Q$ to sequences of ensembles $\mathcal{X}$; the distribution of these random variables $Q(\mathcal{X})$ is the subject of our invariance principle. Since $Q(\mathcal{X})$ can be thought of as a function on a product space $\Omega_1 \times \cdots \times \Omega_n$ as described at the beginning of this section, there is a consistent way to define the notions of influences, $T_\rho$, and noise stability from section 3.1. For example, the influence of the $i$th ensemble on $Q$ is
$$\mathrm{Inf}_i(Q(\mathcal{X})) = E\big[ \mathrm{Var}[Q(\mathcal{X}) \mid \mathcal{X}_1, \ldots, \mathcal{X}_{i-1}, \mathcal{X}_{i+1}, \ldots, \mathcal{X}_n] \big].$$
Using independence and orthonormality, it is easy to show the following formulas:

Proposition 3.6. Let $\mathcal{X}$ be a sequence of ensembles and $Q$ a multilinear polynomial as in (24). Then
$$E[Q(\mathcal{X})] = c_{\mathbf{0}}; \qquad E[Q(\mathcal{X})^2] = \sum_{\sigma} c_\sigma^2; \qquad \mathrm{Var}[Q(\mathcal{X})] = \sum_{|\sigma| > 0} c_\sigma^2;$$
and
$$\mathrm{Inf}_i(Q(\mathcal{X})) = \sum_{\sigma : \sigma_i > 0} c_\sigma^2.$$

For $\rho \in [0, 1]$ we define the operator $T_\rho$ as acting formally on multilinear polynomials $Q(x)$ as in (24) by
$$(T_\rho Q)(x) = \sum_{\sigma} \rho^{|\sigma|} c_\sigma x_\sigma. \tag{25}$$
We note that the definitions in (17) and (18) are consistent with the definition in (25) in the sense that for any ensemble $\mathcal{X}$ the two definitions result in the same function $(T_\rho Q)(\mathcal{X})$.

We finally recall the notion of low-degree influences, a notion that has proven crucial in the analysis of PCPs in hardness of approximation in computer science (see, e.g. [KhKMO1]).

Definition 3.7 [MOO2]. The $d$-low-degree influence of the $i$th ensemble on $Q(\mathcal{X})$ is
$$\mathrm{Inf}_i^{\le d}(Q(\mathcal{X})) = \mathrm{Inf}_i^{\le d}(Q) = \sum_{\sigma : |\sigma| \le d, \, \sigma_i > 0} c_\sigma^2.$$
Note that this gives a way to define low-degree influences $\mathrm{Inf}_i^{\le d}(f)$ for functions $f : \Omega_1 \times \cdots \times \Omega_n \to \mathbb{R}$ on finite product spaces.

There isn't an especially natural interpretation of $\mathrm{Inf}_i^{\le d}(f)$. However, the notion is important for PCPs due to the fact that a function with variance $1$ cannot have too many coordinates with substantial low-degree influence; this is reflected in the following easy proposition:

Proposition 3.8 [MOO2]. Suppose $Q$ is a multilinear polynomial as in (24). Then
$$\sum_i \mathrm{Inf}_i^{\le d}(Q) \le d \, \mathrm{Var}[Q].$$
The proof follows since
$$\sum_i \mathrm{Inf}_i^{\le d}(Q) = \sum_i \sum_{\sigma : |\sigma| \le d, \, \sigma_i > 0} c_\sigma^2 = \sum_{\sigma : 0 < |\sigma| \le d} |\sigma| \, c_\sigma^2 \le d \sum_{\sigma : 0 < |\sigma|} c_\sigma^2 = d \, \mathrm{Var}[Q].$$

3.3 Vector-valued multilinear polynomials. For the invariance principle discussed here we will need to consider vector-valued multilinear polynomials.

Definition 3.9. A $k$-dimensional multilinear polynomial over a set of indeterminates is given by
$$Q = (Q_1, \ldots, Q_k) \tag{26}$$
where each $Q_j$ is a multilinear polynomial as in (24). The degree of $Q$ is the maximal degree of the $Q_j$'s.

Definition 3.10. We adopt the standard notation and write $\|Q\|_q$ for $E^{1/q}\big[ \sum_{i=1}^k |Q_i|^q \big]$; we write $\mathrm{Var}[Q]$ for $E\big[ \|Q - E[Q]\|_2^2 \big]$ and
$$\mathrm{Inf}_i(Q(\mathcal{X})) = \sum_{j=1}^k \mathrm{Inf}_i(Q_j(\mathcal{X})).$$

Using these definitions, it is easy to see

Proposition 3.11. Let $\mathcal{X}$ be a sequence of ensembles and $Q = (Q_1, \ldots, Q_k)$ a $k$-dimensional multilinear polynomial where $Q_j$ is defined as in (24) with $c^j_\sigma$ as its coefficients. Then
$$E[Q(\mathcal{X})] = (c^1_{\mathbf{0}}, \ldots, c^k_{\mathbf{0}}); \qquad \|Q(\mathcal{X})\|_2^2 = \sum_{j, \sigma} |c^j_\sigma|^2; \qquad \mathrm{Var}[Q(\mathcal{X})] = \sum_{j, \, |\sigma| > 0} |c^j_\sigma|^2.$$

Finally, we recall the standard multi-index notation associated with $k$-dimensional multilinear polynomials. A multi-index $\mathbf{i}$ of dimension $k$ is a vector $(i_1, \ldots, i_k)$, where each $i_j$ is an integer. We write $|\mathbf{i}|$ for $i_1 + \cdots + i_k$ and $\mathbf{i}!$ for $i_1! \, i_2! \cdots i_k!$. Given a function $\psi$ of $k$ variables, we will write $\psi^{(\mathbf{i})}$ for the partial derivative of $\psi$ taken $i_1$ times with respect to the first variable, $i_2$ times with respect to the second, etc. (We will only consider functions $\psi$ that are smooth enough that the order of derivatives does not matter.) We will also write $Q^{\mathbf{i}}$ for the product $Q_1^{i_1} \cdots Q_k^{i_k}$.

3.4 Hypercontractivity. As in [MOO2] the invariance principle requires that the ensembles involved are hypercontractive. Recall that $Y$ is $(2, q, \eta)$-hypercontractive with some $\eta \in (0, 1)$ if and only if $E[Y] = 0$ and $E[|Y|^q] < \infty$. Also, if $Y$ is $(2, q, \eta)$-hypercontractive then $\eta \le (q-1)^{-1/2}$.

Definition 3.12. Let $\mathcal{X}$ be a sequence of ensembles. For $1 \le p \le q < \infty$ and $0 < \eta < 1$ we say that $\mathcal{X}$ is $(p, q, \eta)$-hypercontractive if
$$\|(T_\eta Q)(\mathcal{X})\|_q \le \|Q(\mathcal{X})\|_p$$
for every multilinear polynomial $Q$ over $\mathcal{X}$.
Since $T_\eta$ is a contractive semi-group, we have

Remark 3.13. If $\mathcal{X}$ is $(p, q, \eta)$-hypercontractive then it is $(p, q, \eta')$-hypercontractive for any $0 < \eta' \le \eta$.

There is a related notion of hypercontractivity for sets of random variables which considers all polynomials in the variables, not just multilinear polynomials; see, e.g. Janson [J]. We summarize some of the basic properties below; see [MOO2] for details.

Proposition 3.14 [MOO2]. Suppose $\mathcal{X}$ is a sequence of $n_1$ ensembles and $\mathcal{Y}$ is an independent sequence of $n_2$ ensembles. Assume both are $(p, q, \eta)$-hypercontractive. Then the sequence of ensembles $\mathcal{X} \cup \mathcal{Y} = (\mathcal{X}_1, \ldots, \mathcal{X}_{n_1}, \mathcal{Y}_1, \ldots, \mathcal{Y}_{n_2})$ is also $(p, q, \eta)$-hypercontractive.

Proposition 3.15 [MOO2]. Let $\mathcal{X}$ be a $(2, q, \eta)$-hypercontractive sequence of ensembles and $Q$ a multilinear polynomial over $\mathcal{X}$ of degree $d$. Then
$$\|Q(\mathcal{X})\|_q \le \eta^{-d} \|Q(\mathcal{X})\|_2.$$

We end this section by recording some hypercontractive estimates to be used later. The result for $\pm 1$ Rademacher variables is well known and due originally to Bonami [Bo] and independently Beckner [B]; the same result for Gaussian and uniform random variables is also well known and in fact follows easily from the Rademacher case. The optimal hypercontractivity constants for general finite spaces were recently determined by Wolff [W] (see also [Ol]):

Theorem 3.16. Let $X$ denote either a uniformly random $\pm 1$ bit, a standard one-dimensional Gaussian, or a random variable uniform on $[-\sqrt{3}, \sqrt{3}]$. Then $X$ is $(2, q, (q-1)^{-1/2})$-hypercontractive.

Theorem 3.17 (Wolff [W]). Let $X$ be any mean-zero random variable on a finite probability space in which the minimum nonzero probability of any atom is $\alpha \le 1/2$. Then $X$ is $(2, q, \eta_q(\alpha))$-hypercontractive, where
$$\eta_q(\alpha) = \left( \frac{A^{1/q'} - A^{-1/q'}}{A^{1/q} - A^{-1/q}} \right)^{-1/2} \quad \text{with} \quad A = \frac{1 - \alpha}{\alpha}, \quad \frac{1}{q} + \frac{1}{q'} = 1.$$

Note the following special case:

Proposition 3.18 [W].
$$\eta_3(\alpha) = \big( A^{1/3} + A^{-1/3} \big)^{-1/2} \underset{\alpha \to 0}{\sim} \alpha^{1/6},$$
and also $\tfrac{1}{2} \alpha^{1/6} \le \eta_3(\alpha) \le 2^{-1/2}$ for all $\alpha \in [0, 1/2]$.

3.5 Vector hypercontraction. For our purposes we will also need to obtain hypercontraction results in cases where $Q$ is a $k$-dimensional multilinear polynomial. We will need to consider vector-valued multilinear polynomials.
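The constants of Theorem 3.17 and Proposition 3.18 are easy to tabulate. The sketch below is an illustration rather than part of the paper: it evaluates the general formula for $\eta_q(\alpha)$, compares it at $q = 3$ with the closed form $(A^{1/3} + A^{-1/3})^{-1/2}$, and checks the two-sided bound. The function names are hypothetical, and the value returned at $\alpha = 1/2$ (where the formula reads $0/0$) is the limiting value $(q-1)^{-1/2}$, an added convention that agrees with Theorem 3.16.

```python
import math


def eta_q(alpha, q):
    """General constant eta_q(alpha) of Theorem 3.17, for 0 < alpha <= 1/2 and q > 1."""
    if not 0.0 < alpha <= 0.5:
        raise ValueError("alpha must lie in (0, 1/2]")
    if alpha == 0.5:
        # A = 1 makes the quotient 0/0; use the limit as A -> 1 (added convention).
        return (q - 1.0) ** -0.5
    A = (1.0 - alpha) / alpha
    q_conj = q / (q - 1.0)                  # conjugate exponent: 1/q + 1/q' = 1
    numerator = A ** (1.0 / q_conj) - A ** (-1.0 / q_conj)
    denominator = A ** (1.0 / q) - A ** (-1.0 / q)
    return (numerator / denominator) ** -0.5


def eta_3(alpha):
    """Closed form of Proposition 3.18."""
    A = (1.0 - alpha) / alpha
    return (A ** (1.0 / 3.0) + A ** (-1.0 / 3.0)) ** -0.5


for alpha in (1e-6, 0.01, 0.1, 0.25, 0.4, 0.5):
    lower = 0.5 * alpha ** (1.0 / 6.0)      # this is the eta used in hypothesis H3
    print(alpha, lower, eta_3(alpha), eta_q(alpha, 3))
```

The lower bound $\tfrac{1}{2}\alpha^{1/6}$ is the value of $\eta$ that hypothesis H3 works with, so the table also shows how much is given up by using the simpler expression.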

Proposition 3.19. Let $\mathcal{X}$ be a $(2, q, \eta)$-hypercontractive sequence of ensembles and $Q$ a multilinear polynomial over $\mathcal{X}$ of degree $d$ and dimension $k$. Assume $q$ is an integer and let $\mathbf{i}$ be a multi-index with $|\mathbf{i}| = q$. Then
$$E\big[ |Q^{\mathbf{i}}(\mathcal{X})| \big] \le \eta^{-dq} \prod_{j=1}^k E[Q_j^2]^{i_j/2}.$$

Proof.
$$E\big[ |Q^{\mathbf{i}}(\mathcal{X})| \big] \le \prod_{j=1}^k E\big[ |Q_j|^q \big]^{i_j/q} \le \eta^{-dq} \prod_{j=1}^k E[Q_j^2]^{i_j/2},$$
where the first inequality is Hölder and the second follows by hypercontractivity.

4 Multidimensional Invariance Principle

In this section we generalize the invariance principle from [MOO2] to the multidimensional setting. We omit some easy steps that are either identical to or easy adaptations of the proofs of [MOO2].

4.1 Hypotheses for invariance theorems. Below we will prove a generalization of the invariance principle [MOO2]. The invariance principle proven there concerns a multilinear polynomial $Q$ over two hypercontractive sequences of ensembles, $\mathcal{X}$ and $\mathcal{Y}$; furthermore, $\mathcal{X}$ and $\mathcal{Y}$ are assumed to satisfy a matching moments condition, described below.

It is possible to generalize the invariance principle to vector-valued multilinear polynomials under each of the hypercontractivity assumptions H1, H2, H3 and H4 of [MOO2]. However, since the proof of all generalizations is essentially the same and since for the applications studied here it suffices to consider the hypothesis H3, this is the only hypothesis that will be discussed in the paper. It is defined as follows:

H3 Let $\mathcal{X}$ be a sequence of $n$ ensembles in which the random variables in each ensemble $\mathcal{X}_i$ form a basis for the real-valued functions on some finite probability space $\Omega_i$. Further assume that the least nonzero probability of any atom in any $\Omega_i$ is $\alpha \le \tfrac{1}{2}$, and let $\eta = \tfrac{1}{2} \alpha^{1/6}$. Let $\mathcal{Y}$ be any $(2, 3, \eta)$-hypercontractive sequence of ensembles such that all the random variables in $\mathcal{Y}$ are independent. Finally, let $Q$ be a $k$-dimensional multilinear polynomial as in (26).

4.2 Functional setting. The essence of our invariance principle is that if $Q$ is of bounded degree and has low influences then the random variables $Q(\mathcal{X})$ and $Q(\mathcal{Y})$ are close in distribution.
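A one-dimensional toy case makes this statement concrete. The sketch below is an illustration and not part of the paper: it takes $Q(x) = (x_1 + \cdots + x_n)/\sqrt{n}$, so that $d = 1$, $k = 1$ and $\mathrm{Inf}_i(Q) = 1/n$, with $\mathcal{X}$ uniform $\pm 1$ bits ($\alpha = 1/2$) and $\mathcal{Y}$ standard Gaussians, and the test function $\Psi = \cos$, whose third derivative is bounded by $B = 1$. The left side is computed exactly from the binomial distribution and the Gaussian side uses $E[\cos G] = e^{-1/2}$. The helper names are hypothetical, and the last function evaluates the error term $2dBk^3(8\alpha^{-1/2})^d\tau^{1/2}$ of Theorem 4.1 for comparison.

```python
import math


def rademacher_expectation(psi, n):
    """Exact E[psi((x_1 + ... + x_n) / sqrt(n))] for independent uniform +-1 bits."""
    total = 0.0
    for k in range(n + 1):                  # k = number of coordinates equal to +1
        weight = math.comb(n, k) / 2 ** n   # binomial probability of that count
        total += weight * psi((2 * k - n) / math.sqrt(n))
    return total


GAUSSIAN_COS = math.exp(-0.5)               # E[cos(G)] for a standard Gaussian G


def gap(n):
    """|E[cos(Q(X))] - E[cos(Q(Y))]| for the normalized sum of n coordinates."""
    return abs(rademacher_expectation(math.cos, n) - GAUSSIAN_COS)


def theorem_4_1_bound(n, d=1, k=1, B=1.0, alpha=0.5):
    """Error term of Theorem 4.1 with tau = 1/n (each influence of the normalized sum)."""
    tau = 1.0 / n
    return 2 * d * B * k ** 3 * (8 * alpha ** -0.5) ** d * math.sqrt(tau)


for n in (4, 25, 100, 400):
    print(n, gap(n), theorem_4_1_bound(n))
```

In this example the actual gap shrinks much faster than the general error term, which is unsurprising since the theorem has to cover every low-influence polynomial of bounded degree.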
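The simplest way to formulate this conclusion is to say that if $\Psi : \mathbb{R}^k \to \mathbb{R}$ is a sufficiently nice test function then $\Psi(Q(\mathcal{X}))$ and $\Psi(Q(\mathcal{Y}))$ are close in expectation.

Theorem 4.1. Assume hypothesis H3. Further assume that $Q$ is a $k$-dimensional multilinear polynomial, that $\mathrm{Var}[Q] \le 1$, $\deg(Q) \le d$, and $\mathrm{Inf}_i(Q) \le \tau$ for all $i$. Let $\Psi : \mathbb{R}^k \to \mathbb{R}$ be a $C^3$ function with $|\Psi^{(\mathbf{i})}| \le B$ uniformly for every vector $\mathbf{i}$ with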

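$|\mathbf{i}| = 3$. Then
$$\big| E[\Psi(Q(\mathcal{X}))] - E[\Psi(Q(\mathcal{Y}))] \big| \le \epsilon := 2 d B k^3 (8 \alpha^{-1/2})^d \tau^{1/2}.$$

Proof. Note that by Proposition 3.18, the random variables satisfy $(2, 3, \eta)$-hypercontractivity with $\eta = \tfrac{1}{2} \alpha^{1/6}$.

We begin by defining intermediate sequences between $\mathcal{X}$ and $\mathcal{Y}$. For $i = 0, 1, \ldots, n$, let $\mathcal{Z}^{(i)}$ denote the sequence of $n$ ensembles $(\mathcal{Y}_1, \ldots, \mathcal{Y}_i, \mathcal{X}_{i+1}, \ldots, \mathcal{X}_n)$ and let $Q^{(i)} = Q(\mathcal{Z}^{(i)})$. Our goal will be to show
$$\big| E[\Psi(Q^{(i-1)})] - E[\Psi(Q^{(i)})] \big| \le 2 B k^3 \eta^{-3d} \, \mathrm{Inf}_i(Q)^{3/2} \tag{27}$$
for each $i \in [n]$. Summing this over $i$ will complete the proof since $\mathcal{Z}^{(0)} = \mathcal{X}$, $\mathcal{Z}^{(n)} = \mathcal{Y}$, and
$$\sum_{i=1}^n \mathrm{Inf}_i(Q)^{3/2} \le \tau^{1/2} \sum_{i=1}^n \mathrm{Inf}_i(Q) = \tau^{1/2} \sum_{i=1}^n \mathrm{Inf}_i^{\le d}(Q) \le d \, \tau^{1/2},$$
where we used Proposition 3.8 and $\sum_j \mathrm{Var}[Q_j] \le 1$.

Let us fix a particular $i \in [n]$ and proceed to prove (27). Given a multi-index $\sigma$, write $\sigma \setminus i$ for the same multi-index except with $\sigma_i = 0$. Now write
$$\tilde{Q} = \sum_{\sigma : \sigma_i = 0} c_\sigma \mathcal{Z}^{(i)}_\sigma, \qquad R = \sum_{\sigma : \sigma_i > 0} c_\sigma X_{i,\sigma_i} \mathcal{Z}^{(i)}_{\sigma \setminus i}, \qquad S = \sum_{\sigma : \sigma_i > 0} c_\sigma Y_{i,\sigma_i} \mathcal{Z}^{(i)}_{\sigma \setminus i}.$$
Note that $\tilde{Q}$ and the variables $\mathcal{Z}^{(i)}_{\sigma \setminus i}$ are independent of the variables in $\mathcal{X}_i$ and $\mathcal{Y}_i$ and that $Q^{(i-1)} = \tilde{Q} + R$ and $Q^{(i)} = \tilde{Q} + S$.

To bound the left side of (27), i.e. $\big| E[\Psi(\tilde{Q} + R) - \Psi(\tilde{Q} + S)] \big|$, we use Taylor's theorem: for all $x, y \in \mathbb{R}^k$,
$$\Big| \Psi(x + y) - \sum_{|\mathbf{k}| < 3} \frac{\Psi^{(\mathbf{k})}(x) \, y^{\mathbf{k}}}{\mathbf{k}!} \Big| \le \sum_{|\mathbf{k}| = 3} \frac{B}{\mathbf{k}!} |y|^{\mathbf{k}}.$$
In particular,
$$\Big| E[\Psi(\tilde{Q} + R)] - \sum_{|\mathbf{k}| < 3} E\Big[ \frac{\Psi^{(\mathbf{k})}(\tilde{Q}) \, R^{\mathbf{k}}}{\mathbf{k}!} \Big] \Big| \le \sum_{|\mathbf{k}| = 3} \frac{B}{\mathbf{k}!} E\big[ |R^{\mathbf{k}}| \big] \tag{28}$$
and similarly,
$$\Big| E[\Psi(\tilde{Q} + S)] - \sum_{|\mathbf{k}| < 3} E\Big[ \frac{\Psi^{(\mathbf{k})}(\tilde{Q}) \, S^{\mathbf{k}}}{\mathbf{k}!} \Big] \Big| \le \sum_{|\mathbf{k}| = 3} \frac{B}{\mathbf{k}!} E\big[ |S^{\mathbf{k}}| \big]. \tag{29}$$

We will see below that $R$ and $S$ have finite 3rd moments. Moreover, for $0 \le \mathbf{k} \le \mathbf{r}$ with $|\mathbf{r}| = 3$ it holds that $\big| \Psi^{(\mathbf{k})}(\tilde{Q}) \, R^{\mathbf{k}} \big| \le \big| \mathbf{k}! \, B \, \tilde{Q}^{\mathbf{r} - \mathbf{k}} R^{\mathbf{k}} \big|$ (and similarly for $S$). Thus all moments above are finite. We now claim that for all $0 \le |\mathbf{k}| < 3$ it holds that
$$E\big[ \Psi^{(\mathbf{k})}(\tilde{Q}) \, R^{\mathbf{k}} \big] = E\big[ \Psi^{(\mathbf{k})}(\tilde{Q}) \, S^{\mathbf{k}} \big]. \tag{30}$$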

This follows from the fact that the expressions in the expected values, when viewed as multilinear polynomials in the variables in $\mathcal{X}_i$ and $\mathcal{Y}_i$ respectively, are of degree $\le 2$ and each monomial term in $\mathcal{X}_i$ has the same coefficient as the corresponding monomial in $\mathcal{Y}_i$.

From (28), (29) and (30) it follows that
$$\big| E[\Psi(\tilde{Q} + R) - \Psi(\tilde{Q} + S)] \big| \le \sum_{|\mathbf{r}| = 3} \frac{B}{\mathbf{r}!} \Big( E\big[ |R^{\mathbf{r}}| \big] + E\big[ |S^{\mathbf{r}}| \big] \Big). \tag{31}$$
We now use hypercontractivity. By Proposition 3.14 each $\mathcal{Z}^{(i)}$ is $(2, 3, \eta)$-hypercontractive. Thus by Proposition 3.19,
$$E\big[ |R^{\mathbf{r}}| \big] \le \eta^{-3d} \prod_{j=1}^k E[R_j^2]^{r_j/2}, \qquad E\big[ |S^{\mathbf{r}}| \big] \le \eta^{-3d} \prod_{j=1}^k E[S_j^2]^{r_j/2}. \tag{32}$$
However,
$$E[S_j^2] = E[R_j^2] = \sum_{\sigma : \sigma_i > 0} (c^j_\sigma)^2 = \mathrm{Inf}_i(Q_j) \le \mathrm{Inf}_i(Q). \tag{33}$$
Combining (31), (32) and (33) it follows that
$$\big| E[\Psi(\tilde{Q} + R) - \Psi(\tilde{Q} + S)] \big| \le 2 B k^3 \eta^{-3d} \, \mathrm{Inf}_i(Q)^{3/2},$$
confirming (27) and completing the proof.

4.3 Invariance principle: other functionals, and smoothed version. The basic invariance principle shows that $E[\Psi(Q(\mathcal{X}))]$ and $E[\Psi(Q(\mathcal{Y}))]$ are close if $\Psi$ is a $C^3$ functional with bounded 3rd derivative. To show that the distributions of $Q(\mathcal{X})$ and $Q(\mathcal{Y})$ are close in other senses we need the invariance principle for less smooth functionals. This we can obtain using straightforward approximation arguments, see for example [MOO2]. For applications involving bounded functions, it will be important to bound the following functionals. We let $f_{[0,1]} : \mathbb{R} \to \mathbb{R}$ be defined by
$$f_{[0,1]}(x) = \max\big( \min(x, 1), 0 \big) = \min\big( \max(x, 0), 1 \big),$$
and $\zeta : \mathbb{R}^k \to \mathbb{R}$ be defined by
$$\zeta(x) = \sum_{i=1}^k \big( x_i - f_{[0,1]}(x_i) \big)^2. \tag{34}$$
Similarly, we define
$$\chi(x) = \prod_{i=1}^k f_{[0,1]}(x_i). \tag{35}$$
Repeating the proofs of [MOO2] one obtains

Theorem 4.2. Assume hypothesis H3. Further assume that $Q = (Q_1, \ldots, Q_k)$ is a $k$-dimensional multilinear polynomial with $\mathrm{Var}[Q] \le 1$ and $\mathrm{Inf}_i^{\le \log(1/\tau)/K}(Q) \le \tau$ for all $i$, where $K = \log(1/\alpha)$. Suppose further that for all $d$ it holds that $\mathrm{Var}[Q^{>d}] \le (1 - \gamma)^{2d}$ where $0 < \gamma < 1$.
