Sketching Algorithms for Big Data, Fall 2017
Prof. Jelani Nelson
Lecture 9, Sept 29, 2017
Scribe: Mitali Bafna

1 Fast JL transform

Typically we have some high-dimensional computational geometry problem, and we use JL to speed up our algorithm in two steps: (1) apply a JL map $\Pi$ to reduce the problem to low dimension $m$, then (2) solve the lower-dimensional problem. As $m$ is made smaller, typically (2) becomes faster. However, ideally we would also like step (1) to be as fast as possible. In this section, we investigate two approaches to speed up the computation of $\Pi x$. One of the analyses will make use of the following Chernoff bound.

Theorem 1 (Chernoff bound). Let $X_1, \ldots, X_n$ be independent random variables in $[0, \tau]$, and write $\mu := \mathbb{E} \sum_i X_i$. Then for all $\varepsilon > 0$,
$$\mathbb{P}\Big(\Big|\sum_i X_i - \mu\Big| > \varepsilon\mu\Big) < 2\left(\frac{e^{\varepsilon}}{(1+\varepsilon)^{1+\varepsilon}}\right)^{\mu/\tau}.$$

The approach we cover here was investigated by Ailon and Chazelle [AC09]. This approach gives a running time to compute $\Pi x$ of roughly $O(n \log n)$. They called their transformation the Fast Johnson-Lindenstrauss Transform (FJLT). A construction similar to theirs, which we will analyze here, is the $m \times n$ matrix $\Pi$ defined as
$$\Pi = \sqrt{\frac{n}{m}}\, S H D \qquad (1)$$
where $S$ is an $m \times n$ sampling matrix with replacement (each row has a 1 in a uniformly random location and zeroes elsewhere, and the rows are independent), $H$ is a bounded orthonormal system, and $D = \mathrm{diag}(\alpha)$ for a vector $\alpha$ of independent Rademachers. A bounded orthonormal system is a matrix $H \in \mathbb{C}^{n \times n}$ such that $H^* H = I$ and $\max_{i,j} |H_{i,j}| \le 1/\sqrt{n}$. For example, $H$ can be the Fourier matrix or Hadamard matrix.

The motivation for the construction (1) is speed: $D$ can be applied in $O(n)$ time, $H$ in $O(n \log n)$ time (e.g. using the Fast Fourier Transform, or divide and conquer in the case of the Hadamard matrix), and $S$ in $O(m)$ time. Thus, overall, applying $\Pi$ to any fixed vector $x$ takes $O(n \log n)$ time. Compare this with using a dense matrix of Rademachers, which takes $O(mn)$ time to apply.

We will now give some intuition behind why such a $\Pi$ works. Consider the sampling matrix $S$, each row of which samples a random coordinate of $x$. If the norm of $x$ is spread out among its coordinates, then the norm of the rescaled sample $\sqrt{n/m}\, Sx$ is close to the norm of $x$: the squared norm is correct in expectation, and spreading out the mass is what makes it concentrate.
But what do we do in the case where $x$ has mass only on a few coordinates? It is known that a Fourier matrix spreads out the mass of vectors with highly concentrated mass, and vice versa. So we multiply $S$ with $H$, and to handle the case where $H$ concentrates the mass of vectors whose mass is spread out, we first multiply $x$ by $D_\alpha$.
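To make the cost accounting for construction (1) concrete, here is a minimal NumPy sketch. It assumes that $H$ is the normalized Hadamard matrix and that the dimension $n$ is a power of two. The names `fwht` and `fjlt` are illustrative and do not come from [AC09], and the butterfly loop is written for clarity rather than speed.

```python
import numpy as np


def fwht(v):
    """Normalized fast Walsh-Hadamard transform (computes H v in O(n log n)).

    Assumes len(v) is a power of two. Returns a new array; the input is not modified.
    """
    v = np.array(v, dtype=float)  # work on a copy
    n = v.shape[0]
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Butterfly step: combine blocks of length h pairwise.
        for i in range(0, n, 2 * h):
            a = v[i:i + h].copy()
            b = v[i + h:i + 2 * h].copy()
            v[i:i + h] = a + b
            v[i + h:i + 2 * h] = a - b
        h *= 2
    return v / np.sqrt(n)  # entries of H are +-1/sqrt(n)


def fjlt(x, m, rng):
    """Apply Pi = sqrt(n/m) * S * H * D to x, drawing fresh randomness from rng."""
    n = x.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)  # D: independent Rademacher signs, O(n)
    y = fwht(signs * x)                      # H D x, O(n log n)
    rows = rng.integers(0, n, size=m)        # S: m coordinates sampled with replacement, O(m)
    return np.sqrt(n / m) * y[rows]
```

For the 1-sparse input $x = e_1$, every entry of $HDx$ has magnitude $1/\sqrt{n}$, so the sampled and rescaled vector has squared norm exactly 1 no matter which rows are drawn. This is the flattening effect described above in its most extreme form.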
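1.1 Analysis of [AC09]

We will show that for $m \gtrsim \varepsilon^{-2} \log(1/\delta) \log(n/\delta)$, the random $\Pi$ described in (1) provides DJL. We will consider the case of $H$ as the normalized Hadamard matrix, so that every entry of $H$ is in $\{-1/\sqrt{n}, 1/\sqrt{n}\}$.

Theorem 2. Let $x \in \mathbb{R}^n$ be an arbitrary unit norm vector, and suppose $0 < \varepsilon, \delta < 1/2$. Also let $\Pi = \sqrt{n/m}\, SHD$ as described above with a number of rows $m \gtrsim \varepsilon^{-2} \log(1/\delta) \log(n/\delta)$. Then
$$\mathbb{P}_\Pi\big(\big|\|\Pi x\|_2^2 - 1\big| > \varepsilon\big) < \delta.$$

Proof. Define $y = HDx$. The goal is to first show that $\|HDx\|_\infty = O(\sqrt{\log(n/\delta)/n})$ with probability $1 - \delta/2$, then, conditioned on this event, that
$$(1 - \varepsilon) \le \frac{n}{m} \|Sy\|_2^2 \le (1 + \varepsilon)$$
with probability $1 - \delta/2$.

For the first event, note
$$y_i = (HDx)_i = \sum_{j=1}^n \sigma_j \Big(\frac{1}{\sqrt{n}} \gamma_{i,j} x_j\Big) = \langle \sigma, z_i \rangle,$$
where $\sigma_j$ are the Rademacher signs on the diagonal of $D$, $|\gamma_{i,j}| = 1$, and $z_i$ is the vector with $(z_i)_j = \frac{1}{\sqrt{n}} \gamma_{i,j} x_j$, so that $\|z_i\|_2 = 1/\sqrt{n}$. Thus by Khintchine's inequality, for every $i$,
$$\mathbb{P}\Big(|y_i| > \sqrt{\frac{2 \log(4n/\delta)}{n}}\Big) < 2e^{-\log(4n/\delta)} = \frac{\delta}{2n}.$$
Thus by a union bound,
$$\mathbb{P}\Big(\|y\|_\infty > \sqrt{\frac{2 \log(4n/\delta)}{n}}\Big) = \mathbb{P}\Big(\exists i : |y_i| > \sqrt{\frac{2 \log(4n/\delta)}{n}}\Big) < \frac{\delta}{2}.$$

Now, let us condition on this event that $\|y\|_\infty^2 \le 2 \log(4n/\delta)/n =: \tau/n$. For $i \in [m]$, define $X_i = n (Sy)_i^2$. Then $X_i \in [0, \tau]$, and since $HD$ is orthogonal we have $\|y\|_2 = 1$, so $\mathbb{E} X_i = 1$ and $\sum_{i=1}^m X_i = n \|Sy\|_2^2 = m \|\Pi x\|_2^2$. By the Chernoff bound above with $\mu = m$,
$$\mathbb{P}\Big(\Big|\sum_{i=1}^m X_i - m\Big| > \varepsilon m\Big) < 2\left(\frac{e^{\varepsilon}}{(1+\varepsilon)^{1+\varepsilon}}\right)^{m/\tau},$$
which is at most $\delta/2$ for $m \gtrsim \varepsilon^{-2} \log(1/\delta) \log(n/\delta)$. Indeed, $e^{\varepsilon}/(1+\varepsilon)^{1+\varepsilon} \le e^{-\varepsilon^2/3}$ for $0 < \varepsilon < 1$, so it suffices to take $m \ge 3 \tau \varepsilon^{-2} \log(4/\delta)$.

Remark 1. Note that the FJLT as analyzed above provides suboptimal $m$. If one desired optimal $m$, one can instead use the embedding matrix $\Pi' \Pi$, where $\Pi$ is the FJLT and $\Pi'$ is, say, a dense matrix with Rademacher entries having the optimal $m' = O(\varepsilon^{-2} \log(1/\delta))$ rows. The downside is that the runtime to apply our embedding worsens by an additive $m \cdot m'$. [AC09] slightly improved this additive term (by an $\varepsilon^2$ multiplicative factor) by replacing the matrix $S$ with a random sparse matrix $P$.

Can a better analysis be given? Unfortunately not by much: the quadratic dependence $\log^2(1/\delta)$ needs to be there, by an example of Eric Price. The bad case is when $x$ has value $1/n^{1/4}$ on the first $\sqrt{n}$ coordinates and zero elsewhere, and imagine $\delta \approx 2^{-\sqrt{n}}$.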
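The first step of the proof (flattening by $HD$) can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds the normalized Hadamard matrix densely, so it is meant for small powers of two, and the helper names `normalized_hadamard`, `flatness_threshold` and `failure_rate` are hypothetical.

```python
import numpy as np


def normalized_hadamard(n):
    """Dense n x n Hadamard matrix with entries +-1/sqrt(n); n must be a power of two."""
    assert n > 0 and n & (n - 1) == 0, "n must be a power of two"
    H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(H2, H)  # Sylvester construction
    return H / np.sqrt(n)


def flatness_threshold(n, delta):
    """The bound sqrt(2 log(4n/delta) / n) on ||HDx||_inf used in the proof."""
    return np.sqrt(2.0 * np.log(4.0 * n / delta) / n)


def failure_rate(x, delta, trials, rng):
    """Fraction of random sign patterns D for which ||HDx||_inf exceeds the threshold."""
    n = x.shape[0]
    H = normalized_hadamard(n)
    thr = flatness_threshold(n, delta)
    signs = rng.choice([-1.0, 1.0], size=(trials, n))  # one sign pattern per row
    Y = (signs * x) @ H.T                              # row t holds H D_t x
    return float(np.mean(np.max(np.abs(Y), axis=1) > thr))
```

The proof guarantees that the returned failure rate is below $\delta/2$ in expectation for any unit vector $x$. For the vector from Eric Price's example with $n = 256$, every coordinate of $HDx$ is at most $n^{-1/4} = 0.25$ in absolute value, which lies below the threshold for $\delta = 0.1$, so no failures can occur at that value of $\delta$.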
1.2 Analysis based on RIP

Here we give a different analysis, based on combining the main results of [KW11] and [RV08], which use the method of chaining, as seen in the last lecture. First we have to give a definition.

Definition 1. We say a matrix $\Pi \in \mathbb{R}^{m \times n}$ satisfies the $(\varepsilon, k)$-restricted isometry property (or RIP for short) if for all $k$-sparse vectors $x$ of unit Euclidean norm,
$$1 - \varepsilon \le \|\Pi x\|_2^2 \le 1 + \varepsilon.$$

Using the fact that the operator norm of a symmetric matrix $M$ is equal to $\sup_{\|x\|_2 = 1} |x^\top M x|$, it follows that being $(\varepsilon, k)$-RIP is equivalent to
$$\sup_{T \subseteq [n],\ |T| = k} \big\| I_k - (\Pi^{(T)})^* \Pi^{(T)} \big\| \le \varepsilon,$$
where $\Pi^{(T)}$ is the $m \times |T|$ matrix obtained by restricting $\Pi$ to the columns in $T$.

As we will see later in the course, this notion of RIP is useful for compressed sensing, which is closely related to the heavy hitters problem. For now, we will just use it to obtain fast JL by combining it with the following theorem of [KW11].

Theorem 3. There exists a universal constant $C > 0$ such that the following holds. Suppose $A$ satisfies $(\varepsilon/C, k)$-RIP for $k \ge C \log(1/\delta)$, and let $\alpha \in \{-1, 1\}^n$ be chosen uniformly at random. Then for any $x \in \mathbb{R}^n$ of unit norm,
$$\mathbb{P}_\alpha\big(\big|\|A D_\alpha x\|_2^2 - 1\big| > \varepsilon\big) < \delta.$$
In other words, the probability distribution $\Pi = A D_\alpha$ over matrices, induced by $\alpha$, satisfies the distributional JL property.

We will not prove Theorem 3 here, but we will show that the matrix $\sqrt{n/m}\, SH$ satisfies RIP with positive probability for fairly small $m$. That is, there does exist some choice of few rows of a bounded orthonormal system that gives RIP (though unfortunately we do not know which explicit set; see however [BDF+11]). A number of bounds on the best $m$ to achieve RIP for sampling Fourier/Hadamard rows were given, starting with the work of Candès and Tao [CT06]. Subsequent works gave better bounds [RV08, Bou14, HR16]. An analysis was also given for a related construction in [NPW14]. We will give the analysis of [RV08] since it is most similar to what we saw in the last lecture.

Recall that for $T \subset \mathbb{R}^n$,
$$r(T) := \mathbb{E}_\sigma \sup_{x \in T} |\langle \sigma, x \rangle|.$$
Last lecture we did not include the absolute values, but it does not make much of a difference (the Khintchine tail bound only differs by a factor of two).
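As a concrete illustration of Definition 1 and its operator-norm reformulation, the following sketch computes the smallest $\varepsilon$ for which a given real matrix is $(\varepsilon, k)$-RIP by enumerating all column subsets of size $k$. The function name `rip_constant` is hypothetical, and the enumeration takes time exponential in $k$, so this is only usable for tiny examples.

```python
import itertools

import numpy as np


def rip_constant(Pi, k):
    """Return max over |T| = k of ||I_k - Pi_T^T Pi_T|| (spectral norm), for real Pi.

    This is the smallest eps such that Pi satisfies (eps, k)-RIP.
    Brute force over all C(n, k) column subsets.
    """
    n = Pi.shape[1]
    worst = 0.0
    for T in itertools.combinations(range(n), k):
        sub = Pi[:, list(T)]                 # m x k restriction to the columns in T
        gram = sub.T @ sub                   # k x k Gram matrix
        dev = np.linalg.norm(np.eye(k) - gram, ord=2)  # spectral norm of the deviation
        worst = max(worst, dev)
    return worst
```

By the equivalence stated above, every $k$-sparse unit vector $x$ then satisfies $|\|\Pi x\|_2^2 - 1| \le$ `rip_constant(Pi, k)`, which can be spot-checked by sampling random sparse vectors.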
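Also recall that we showed
$$r(T) \lesssim \gamma_2(T, \ell_2),$$
where for $T$ a set of vectors of at most unit norm,
$$\gamma_2(T, \|\cdot\|) \le \inf_{\{T_r\}_{r=1}^\infty} \sum_{r=1}^\infty 2^{r/2} \sup_{x \in T} d_{\|\cdot\|}(x, T_r) \lesssim \sum_{k=1}^\infty \frac{1}{2^k} \lg^{1/2} N\Big(T, \|\cdot\|, \frac{1}{2^k}\Big) \lesssim \int_0^\infty \lg^{1/2} N(T, \|\cdot\|, u)\, du,$$
with the infimum taken over admissible sequences as in the last lecture. This was the Dudley bound. Let us now show that for RIP, $m = \Omega(\varepsilon^{-2} k \log^4 n)$ suffices.

We will analyze a slightly different construction, just for ease of notation. Instead of sampling $m$ rows from $H$, we will simply keep each row with probability $m/n$, independently. Let $\mu_i$ be an indicator for whether we keep row $i$. Also, let us define $z_i$ to equal $\sqrt{n}$ times the $i$th row of $H$, so $z_i \in \{-1, 1\}^n$, and write $z_i^{(T)}$ for the restriction of $z_i$ to the coordinates in $T$. We let
$$\beta = \mathbb{E}_\mu \sup_{|T| = k} \Big\| I_k - \frac{1}{m} \sum_{i=1}^n \mu_i z_i^{(T)} (z_i^{(T)})^\top \Big\|,$$
and we will now get an upper bound for $\beta$ in terms of $\beta$. Let $\mu'$ be an independent copy of $\mu$ and let $\sigma$ be a vector of independent Rademachers. Since $\mathbb{E}_{\mu'} \frac{1}{m} \sum_i \mu'_i z_i^{(T)} (z_i^{(T)})^\top = I_k$,
$$\begin{aligned}
\beta &= \mathbb{E}_\mu \sup_{|T| = k} \Big\| \frac{1}{m} \sum_{i=1}^n \mu_i z_i^{(T)} (z_i^{(T)})^\top - \mathbb{E}_{\mu'} \frac{1}{m} \sum_{i=1}^n \mu'_i z_i^{(T)} (z_i^{(T)})^\top \Big\| \\
&\le \frac{1}{m}\, \mathbb{E}_{\mu, \mu'} \sup_{|T| = k} \Big\| \sum_{i=1}^n (\mu_i - \mu'_i) z_i^{(T)} (z_i^{(T)})^\top \Big\| && \text{(Jensen's inequality)} \\
&= \frac{1}{m}\, \mathbb{E}_{\mu, \mu', \sigma} \sup_{|T| = k} \Big\| \sum_{i=1}^n \sigma_i (\mu_i - \mu'_i) z_i^{(T)} (z_i^{(T)})^\top \Big\| && \text{(symmetrization over } \sigma\text{)} \\
&\le \frac{2}{m}\, \mathbb{E}_\mu \mathbb{E}_\sigma \sup_{|T| = k} \Big\| \sum_{i=1}^n \sigma_i \mu_i z_i^{(T)} (z_i^{(T)})^\top \Big\| && \text{(triangle inequality)} \\
&= \frac{2}{m}\, \mathbb{E}_\mu \mathbb{E}_\sigma \sup_{|T| = k}\ \sup_{x \in \mathbb{R}^T,\ \|x\|_2 = 1} \Big| \sum_{i=1}^n \sigma_i \mu_i \langle x, z_i^{(T)} \rangle^2 \Big| && \text{(definition of operator norm)} \\
&= \frac{2}{m}\, \mathbb{E}_\mu \mathbb{E}_\sigma \sup_{x \in D_2^{n,k}} \Big| \sum_{i=1}^n \sigma_i \mu_i \langle x, z_i \rangle^2 \Big|,
\end{aligned}$$
where $D_2^{n,k}$ is the set of all $k$-sparse unit vectors in $\mathbb{R}^n$.

We let
$$T_\mu = \big\{ (\mu_1 \langle x, z_1 \rangle^2, \ldots, \mu_n \langle x, z_n \rangle^2) : x \in D_2^{n,k} \big\}$$
and $r(T_\mu) = \mathbb{E}_\sigma \sup_{z \in T_\mu} |\langle \sigma, z \rangle|$, so that $\beta \le \frac{2}{m} \mathbb{E}_\mu\, r(T_\mu)$. Dudley's inequality gives us that $r(T_\mu) \lesssim \gamma_2(T_\mu, \ell_2)$. Let $g(x) = (\mu_1 \langle x, z_1 \rangle^2, \ldots, \mu_n \langle x, z_n \rangle^2)$, with $g(y)$ defined similarly. We have that
$$\|g(x) - g(y)\|_2 \le \max_{1 \le j \le n} |\langle z_j, x - y \rangle| \cdot 2 \sqrt{m}\, (\beta + 1)^{1/2}.$$
(More precisely, for fixed $\mu$ the last factor is $\big(1 + \sup_{|T| = k} \| I_k - \frac{1}{m} \sum_i \mu_i z_i^{(T)} (z_i^{(T)})^\top \|\big)^{1/2}$, whose expectation over $\mu$ is at most $(\beta + 1)^{1/2}$ by Jensen's inequality.) Writing $\|x\|_X := \max_{1 \le j \le n} |\langle z_j, x \rangle|$, we get that
$$\beta \lesssim \sqrt{\frac{\beta + 1}{m}}\ \gamma_2(D_2^{n,k}, \|\cdot\|_X),$$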
which implies that $\beta^2 - CR\beta - CR \le 0$, where $C$ is a constant and $R := \gamma_2^2/m$ for the $\gamma_2$ term in the preceding bound (square both sides of that bound). Solving the quadratic gives $\beta \le \frac{1}{2}\big(CR + \sqrt{C^2 R^2 + 4CR}\big) \le CR + \sqrt{CR}$, so $\beta \le \varepsilon$ as soon as $R \le c\varepsilon^2$ for a small enough constant $c > 0$.

References

[AC09] Nir Ailon and Bernard Chazelle. The fast Johnson-Lindenstrauss transform and approximate nearest neighbors. SIAM J. Comput., 39(1):302-322, 2009.

[BDF+11] Jean Bourgain, Stephen Dilworth, Kevin Ford, Sergei Konyagin, and Denka Kutzarova. Explicit constructions of RIP matrices and related problems. Duke Mathematical Journal, 159(1):145-185, 2011.

[Bou14] Jean Bourgain. An improved estimate in the restricted isometry problem. Geometric Aspects of Functional Analysis, 2116:65-70, 2014.

[CT06] Emmanuel J. Candès and Terence Tao. Near-optimal signal recovery from random projections: universal encoding strategies? IEEE Trans. Inform. Theory, 52(12):5406-5425, 2006.

[HR16] Ishay Haviv and Oded Regev. The restricted isometry property of subsampled Fourier matrices. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 288-297, 2016.

[KW11] Felix Krahmer and Rachel Ward. New and improved Johnson-Lindenstrauss embeddings via the Restricted Isometry Property. SIAM J. Math. Anal., 43(3):1269-1281, 2011.

[NPW14] Jelani Nelson, Eric Price, and Mary Wootters. New constructions of RIP matrices with fast multiplication and fewer rows. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1515-1528, January 2014.

[RV08] Mark Rudelson and Roman Vershynin. On sparse reconstruction from Fourier and Gaussian measurements. Comm. Pure Appl. Math., 61(8):1025-1045, 2008.