Supplementary Material for Limits on Sparse Support Recovery via Linear Sketching with Random Expander Matrices

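Jonathan Scarlett and Volkan Cevher

Supplementary Material for "Limits on Sparse Support Recovery via Linear Sketching with Random Expander Matrices" (AISTATS 2016, Jonathan Scarlett and Volkan Cevher)

Note that all citations here are to the bibliography in the main document, and similarly for many of the cross-references.

A Proof of Lemma 1

In the notation of Definition 1, let $E_\ell$ ($\ell = 1,\dots,k$) be the event that some set $S$ of cardinality $\ell$ fails to satisfy the expansion property, i.e., $|N_X(S)| < (1-\epsilon)d|S|$. We start with the following non-asymptotic bound given in [8]:

\[ P[E_\ell] \le \binom{p}{\ell}\binom{d\ell}{\epsilon d\ell}\Big(\frac{d\ell}{n}\Big)^{\epsilon d\ell}. \tag{44} \]

Applying the bounds $\log\binom{p}{\ell} \le \ell\log p$ and $\log\binom{d\ell}{\epsilon d\ell} \le d\ell H_2(\epsilon)$, we obtain

\[ \log P[E_\ell] \le \ell\log p + d\ell H_2(\epsilon) + \epsilon d\ell\log\frac{d\ell}{n} \tag{45} \]
\[ = \ell\log p - d\ell\Big(\epsilon\log\frac{n}{d\ell} - H_2(\epsilon)\Big). \tag{46} \]

Since $k = \Theta(1)$, we obtain from the union bound that $P\big[\bigcup_{\ell=1,\dots,k} E_\ell\big] \to 0$ provided that (46) tends to $-\infty$ for all $\ell$. This is true provided that (2) holds; the dominant condition is the one with $\ell = k$.

B Proof of Theorem 3

Recall the definitions of the random variables and of the information densities (25)-(27). We fix the constants $\gamma_1,\dots,\gamma_k$ arbitrarily, and consider a decoder that searches for the unique set $s \in \mathcal{S}$ such that

\[ \tilde\imath^n(x_{s_{\mathrm{dif}}}; y \mid x_{s_{\mathrm{eq}}}) > \gamma_{|s_{\mathrm{dif}}|} \tag{47} \]

for all $2^k - 1$ partitions $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ of $s$ with $s_{\mathrm{dif}} \ne \emptyset$. If no such $s$ exists, or if multiple exist, then an error is declared.

Since the joint distribution of $(\beta_s, X_s, Y \mid S = s)$ is the same for all $s$ in our setup (cf., Section 1.2), and the decoder that we have chosen exhibits a similar symmetry, we can condition on $S = s = \{1,\dots,k\}$. By the union bound, the error probability is upper bounded by

\[ P_e \le P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\big\{\tilde\imath^n(X_{s_{\mathrm{dif}}}; Y \mid X_{s_{\mathrm{eq}}}) \le \gamma_{|s_{\mathrm{dif}}|}\big\}\Big] + \sum_{\bar s \in \mathcal{S}\setminus\{s\}} P\big[\tilde\imath^n(X_{\bar s\setminus s}; Y \mid X_{\bar s\cap s}) > \gamma_{|s_{\mathrm{dif}}|}\big], \tag{48} \]

where here and subsequently we let the condition $s_{\mathrm{dif}} \ne \emptyset$ remain implicit. In the summand of the second term, we have upper bounded the probability of an intersection of $2^k - 1$ events by just one such event, namely, the one with the information density corresponding to $s_{\mathrm{dif}} = \bar s\setminus s$ and $s_{\mathrm{eq}} = \bar s\cap s$.

As mentioned previously, a key tool in the proof is the following change of measure (with $\ell := |s_{\mathrm{dif}}|$):

\[ P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{s_{\mathrm{eq}}}) = \sum_{x_{s_{\mathrm{dif}}}} \Big(\prod_{i\in s_{\mathrm{dif}}} P_X(x_i)\Big) P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(y \mid x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}}) \tag{49} \]
\[ \le (n+1)^{\ell} \sum_{x_{s_{\mathrm{dif}}}} \Big(\prod_{i\in s_{\mathrm{dif}}} \tilde P_X(x_i)\Big) P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(y \mid x_{s_{\mathrm{dif}}}, x_{s_{\mathrm{eq}}}) \tag{50} \]
\[ = (n+1)^{\ell}\, \tilde P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{s_{\mathrm{eq}}}), \tag{51} \]

where we have used the definitions (23)-(24), and (50) follows from (2).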
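The per-column constant in a change of measure of this kind can be sanity-checked numerically. The sketch below rests on two assumptions that are not stated in this supplement: each column is uniform over binary vectors with exactly d ones, and its i.i.d. counterpart draws every entry independently as Bernoulli(d/n). Under those assumptions the ratio of the two column laws equals the reciprocal of the probability that a Binomial(n, d/n) variable equals d, which is at most n + 1. The function name is hypothetical.

```python
import math

def change_of_measure_ratio(n, d):
    """Ratio (uniform-over-d-ones law) / (i.i.d. Bernoulli(d/n) law) at any
    binary vector with exactly d ones. Assumes 1 <= d < n."""
    q = d / n
    log_uniform = -math.log(math.comb(n, d))                    # log of 1 / C(n, d)
    log_iid = d * math.log(q) + (n - d) * math.log(1.0 - q)     # log of q^d (1-q)^(n-d)
    return math.exp(log_uniform - log_iid)

# Compare the ratio with n + 1 for a few (n, d) pairs.
for n, d in [(10, 3), (50, 5), (200, 20)]:
    print(n, d, change_of_measure_ratio(n, d), n + 1)
```

In practice the ratio grows only like the square root of d, so the bound n + 1 is loose but sufficient when only its logarithm matters.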
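By an identical argument, we have

\[ P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(y \mid x_{s_{\mathrm{eq}}}, b_s) \le (n+1)^{\ell}\, \tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(y \mid x_{s_{\mathrm{eq}}}, b_s), \tag{52} \]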

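where $\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}$ is defined in the same way as $P_{Y|X_{s_{\mathrm{eq}}}\beta_s}$, except that $X_{s_{\mathrm{dif}}}$ has an i.i.d. law.

We can weaken the second probability in (48) as follows (with $\ell := |\bar s\setminus s|$):

\[ P\big[\tilde\imath^n(X_{\bar s\setminus s}; Y \mid X_{\bar s\cap s}) > \gamma_\ell\big] = \sum_{x_{\bar s\cap s},\,x_{\bar s\setminus s}} P_X^{n\times(k-\ell)}(x_{\bar s\cap s})\, P_X^{n\times\ell}(x_{\bar s\setminus s}) \int_{\mathbb{R}^n} dy\, P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{\bar s\cap s})\, \mathbb{1}\Big\{\log\frac{P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(y \mid x_{\bar s\setminus s}, x_{\bar s\cap s})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{\bar s\cap s})} > \gamma_\ell\Big\} \tag{53} \]
\[ \le (n+1)^{\ell} \sum_{x_{\bar s\cap s},\,x_{\bar s\setminus s}} P_X^{n\times(k-\ell)}(x_{\bar s\cap s})\, P_X^{n\times\ell}(x_{\bar s\setminus s}) \int_{\mathbb{R}^n} dy\, \tilde P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{\bar s\cap s})\, \mathbb{1}\Big\{\log\frac{P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(y \mid x_{\bar s\setminus s}, x_{\bar s\cap s})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{\bar s\cap s})} > \gamma_\ell\Big\} \tag{54} \]
\[ \le (n+1)^{\ell} \sum_{x_{\bar s\cap s},\,x_{\bar s\setminus s}} P_X^{n\times(k-\ell)}(x_{\bar s\cap s})\, P_X^{n\times\ell}(x_{\bar s\setminus s}) \int_{\mathbb{R}^n} dy\, P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(y \mid x_{\bar s\setminus s}, x_{\bar s\cap s})\, e^{-\gamma_\ell} \tag{55} \]
\[ = (n+1)^{\ell} e^{-\gamma_\ell}, \tag{56} \]

where in (53) we used the fact that the output vector depends only on the columns of $x_{\bar s}$ corresponding to entries of $\bar s$ that are also in $s$, (54) follows from (51), and (55) follows by bounding $\tilde P_{Y|X_{s_{\mathrm{eq}}}}$ using the event within the indicator function, and then upper bounding the indicator function by one. Substituting (56) into (48) gives

\[ P_e \le P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\big\{\tilde\imath^n(X_{s_{\mathrm{dif}}}; Y \mid X_{s_{\mathrm{eq}}}) \le \gamma_{|s_{\mathrm{dif}}|}\big\}\Big] + \sum_{\ell=1}^{k}\binom{p-k}{\ell}\binom{k}{\ell}(n+1)^{\ell} e^{-\gamma_\ell}, \tag{57} \]

where the combinatorial terms arise from a standard counting argument [7].

We now fix the constants $\gamma'_1,\dots,\gamma'_k$ arbitrarily, and recall the following steps from [7] (again writing $\ell := |s_{\mathrm{dif}}|$):

\[ P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\big\{\tilde\imath^n(X_{s_{\mathrm{dif}}}; Y \mid X_{s_{\mathrm{eq}}}) \le \gamma_\ell\big\}\Big] = P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\Big\{\log\frac{P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{eq}}})} \le \gamma_\ell\Big\}\Big] \tag{58} \]
\[ \le P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\Big\{\log\frac{P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{eq}}})} \le \gamma_\ell \,\cap\, \log\frac{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s)} \le \gamma'_\ell\Big\}\Big] + P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\Big\{\log\frac{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s)} > \gamma'_\ell\Big\}\Big] \tag{59} \]
\[ \le P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\Big\{\log\frac{P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s)} \le \gamma_\ell + \gamma'_\ell\Big\}\Big] + P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\Big\{\log\frac{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s)} > \gamma'_\ell\Big\}\Big]. \tag{60} \]

The second term in (60) is upper bounded as

\[ P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\Big\{\log\frac{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s)} > \gamma'_\ell\Big\}\Big] \le \sum_{(s_{\mathrm{dif}},s_{\mathrm{eq}})} P\Big[\log\frac{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(Y \mid X_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s)} > \gamma'_\ell\Big] \tag{61} \]
\[ = \sum_{(s_{\mathrm{dif}},s_{\mathrm{eq}})} \sum_{b_s,\,x_{s_{\mathrm{eq}}}} P_{\beta_s}(b_s)\, P_X^{n\times(k-\ell)}(x_{s_{\mathrm{eq}}}) \int_{\mathbb{R}^n} dy\, P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(y \mid x_{s_{\mathrm{eq}}}, b_s)\, \mathbb{1}\Big\{\log\frac{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(y \mid x_{s_{\mathrm{eq}}}, b_s)} > \gamma'_\ell\Big\} \tag{62} \]
\[ \le \sum_{(s_{\mathrm{dif}},s_{\mathrm{eq}})} (n+1)^{\ell} \sum_{b_s,\,x_{s_{\mathrm{eq}}}} P_{\beta_s}(b_s)\, P_X^{n\times(k-\ell)}(x_{s_{\mathrm{eq}}}) \int_{\mathbb{R}^n} dy\, \tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(y \mid x_{s_{\mathrm{eq}}}, b_s)\, \mathbb{1}\Big\{\log\frac{\tilde P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{s_{\mathrm{eq}}})}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(y \mid x_{s_{\mathrm{eq}}}, b_s)} > \gamma'_\ell\Big\} \tag{63} \]
\[ \le \sum_{(s_{\mathrm{dif}},s_{\mathrm{eq}})} (n+1)^{\ell} \sum_{b_s,\,x_{s_{\mathrm{eq}}}} P_{\beta_s}(b_s)\, P_X^{n\times(k-\ell)}(x_{s_{\mathrm{eq}}}) \int_{\mathbb{R}^n} dy\, \tilde P_{Y|X_{s_{\mathrm{eq}}}}(y \mid x_{s_{\mathrm{eq}}})\, e^{-\gamma'_\ell} \tag{64} \]
\[ = \sum_{\ell=1}^{k}\binom{k}{\ell}(n+1)^{\ell} e^{-\gamma'_\ell}, \tag{65} \]

where (61) follows from the union bound, and the remaining steps follow the arguments used in (53)-(56) (with (52) used in place of (51)).

We now upper bound the first term in (60), again following [7]. The numerator in the first term in (60) equals $P_{Y|X_s}(Y \mid X_s)$ for all $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ (recall the definition (22)), and we can thus write the overall term as

\[ P\Big[\log P_{Y|X_s}(Y \mid X_s) \le \max_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\big\{\log\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s) + \gamma_\ell + \gamma'_\ell\big\}\Big]. \tag{66} \]

Using the same steps as those used in (58)-(60), we can upper bound this by

\[ P\Big[\log P_{Y|X_s\beta_s}(Y \mid X_s, \beta_s) \le \max_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\big\{\log\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s) + \gamma_\ell + \gamma'_\ell\big\} + \gamma\Big] + P\Big[\log\frac{P_{Y|X_s\beta_s}(Y \mid X_s, \beta_s)}{P_{Y|X_s}(Y \mid X_s)} > \gamma\Big] \tag{67} \]

for any constant $\gamma$. Reversing the step (66), this can equivalently be written as

\[ P\Big[\bigcup_{(s_{\mathrm{dif}},s_{\mathrm{eq}})}\Big\{\log\frac{P_{Y|X_{s_{\mathrm{dif}}}X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{dif}}}, X_{s_{\mathrm{eq}}}, \beta_s)}{\tilde P_{Y|X_{s_{\mathrm{eq}}}\beta_s}(Y \mid X_{s_{\mathrm{eq}}}, \beta_s)} \le \gamma_\ell + \gamma'_\ell + \gamma\Big\}\Big] + P\Big[\log\frac{P_{Y|X_s\beta_s}(Y \mid X_s, \beta_s)}{P_{Y|X_s}(Y \mid X_s)} > \gamma\Big]. \tag{68} \]

The first logarithm in the first term is the information density (26). Moreover, the choices

\[ \gamma_\ell = \log\Big(\frac{k}{\delta_1}\binom{p-k}{\ell}\binom{k}{\ell}(n+1)^{\ell}\Big) \tag{69} \]
\[ \gamma'_\ell = \log\Big(\frac{k}{\delta_1}\binom{k}{\ell}(n+1)^{\ell}\Big) \tag{70} \]

make (65) and the second term in (57) be upper bounded by $\delta_1$ each. Hence, and combining (60) with (65) and (68), and recalling that $\ell = |s_{\mathrm{dif}}|$, we obtain (28).

C Proof of Theorem 2

Fix $0 < b_{\min} < b_{\max} < \infty$, and let $B := \{b_s : \min_i |b_i| \ge b_{\min} \,\cap\, \max_i |b_i| \le b_{\max}\}$. The main step in proving Theorem 2 is extending the arguments of Section 4.5 to show that

\[ P_e \le P\Big[n \le \max_{(s_{\mathrm{dif}},s_{\mathrm{eq}})\,:\,s_{\mathrm{dif}}\ne\emptyset} \frac{|s_{\mathrm{dif}}|\log p}{I_{s_{\mathrm{dif}},s_{\mathrm{eq}}}(\beta_s)}(1+\eta) \,\cap\, \beta_s \in B\Big] + P[\beta_s \notin B] + o(1), \tag{71} \]

and

\[ P_e \ge P\Big[n \le \max_{(s_{\mathrm{dif}},s_{\mathrm{eq}})\,:\,s_{\mathrm{dif}}\ne\emptyset} \frac{|s_{\mathrm{dif}}|\log p}{I_{s_{\mathrm{dif}},s_{\mathrm{eq}}}(\beta_s)}(1-\eta) \,\cap\, \beta_s \in B\Big] + o(1). \tag{72} \]

Before proving these, we show how they yield the theorem. Using (6), it is readily verified that each $I_{s_{\mathrm{dif}},s_{\mathrm{eq}}}(\beta_s)$, with an i.i.d. Gaussian vector $\beta_s$, is a continuous random variable having no mass points. By taking $\eta \to 0$ sufficiently slowly and noting that we have restricted $\beta_s$ to the set $B$ (within which all of the $I_{s_{\mathrm{dif}},s_{\mathrm{eq}}}(\beta_s)$ are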
bounded away from zero and infinity), we conclude that (71)-(72) remain true when $\eta$ is replaced by zero, and its contribution is factored into the $o(1)$ terms. Hence, we obtain Theorem 2 by (i) dropping the condition $\beta_s \in B$ from the first probability in (71); (ii) using the identity $P[A_1 \cap A_2] \ge P[A_1] - P[A_2^c]$ to remove the same condition from the first probability in (72); (iii) noting that the remainder term $P[\beta_s \notin B]$ can be made arbitrarily small by choosing $b_{\min}$ sufficiently small and $b_{\max}$ sufficiently large.

It remains to establish (71)-(72). Recall the quantity defined following Lemma 3. The above choice of $B$ ensures that all of the non-zero entries are bounded away from $0$ and $\infty$, so that the mutual informations $I_{s_{\mathrm{dif}},s_{\mathrm{eq}}}(\beta_s)$ and variances $V_{s_{\mathrm{dif}},s_{\mathrm{eq}}}(\beta_s)$ are bounded away from zero and infinity, and hence this quantity is $\Theta(1)$.

Since $P_{\beta_s}$ is continuous, we must choose $\gamma$ and handle $P_0(\gamma)$ in (29) differently to the above. Similarly to the analysis of Gaussian measurements in [7], we fix $\delta_0 > 0$ and note that Chebyshev's inequality implies

\[ \gamma = I_0 + \sqrt{\frac{V_0}{\delta_0}} \implies P_0(\gamma) \le \delta_0, \tag{73} \]

where

\[ I_0 := I(\beta_s; Y \mid X_s) \tag{74} \]
\[ V_0 := \mathrm{Var}\Big[\log\frac{P_{Y|X_s\beta_s}(Y \mid X_s, \beta_s)}{P_{Y|X_s}(Y \mid X_s)}\Big]. \tag{75} \]

The following is a straightforward extension of [7, Prop. 4] to expander-based measurements.

Proposition 1. The quantities $I_0$ and $V_0$ defined in (74)-(75) satisfy

\[ I_0 \le \frac{k}{2}\log\Big(1 + \frac{d\sigma_\beta^2}{\sigma^2}\Big) \tag{76} \]
\[ V_0 \le 2n. \tag{77} \]

Proof. See Appendix E.

We can now obtain (71)-(72) using the steps of the previous subsection; the condition $\beta_s \in B$ arises in (35) and (39) due to the fact that this condition was used to obtain a bounded variance in (32), and the first two probabilities in (71) arise from the identity $P[A_1 \cup A_2] \le P[A_1 \cap A_2^c] + P[A_2]$. The only additional step is showing that we can simultaneously achieve $\gamma = o(\log p)$ and $P_0(\gamma) = o(1)$ in the achievability part whenever $n = \Theta(\log p)$, in the same way that we showed $2|s_{\mathrm{dif}}|\log n = o(\log p)$ in the previous subsection. This immediately follows by substituting (76)-(77) into (73) (along with $d = O(n) = O(\log p)$) to obtain $\gamma = O(\log\log p) + O(\sqrt{\log p}) = o(\log p)$ for any $\delta_0 > 0$, and noting that $\delta_0$ (and hence $P_0(\gamma)$) in (73) can be arbitrarily small.

D Proof of Lemma 3

We prove the lemma by characterizing the variance of a general function of $(X_s, Y)$ of the form $f^n(X_s, Y) := \sum_{i=1}^{n} f(X_s^{(i)}, Y^{(i)})$.
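Clearly all of the quantities $\imath^n$ for the various $(s_{\mathrm{dif}}, s_{\mathrm{eq}})$ can be written in this general form. We have

\[ \mathrm{Var}\big[f^n(X_s, Y)\big] = \mathrm{Var}\Big[\sum_{i=1}^{n} f(X_s^{(i)}, Y^{(i)})\Big] \tag{78} \]
\[ = \sum_{i=1}^{n}\sum_{j=1}^{n} \mathrm{Cov}\big[f(X_s^{(i)}, Y^{(i)}),\, f(X_s^{(j)}, Y^{(j)})\big] \tag{79} \]
\[ = n\,\mathrm{Var}\big[f(X_s, Y)\big] + (n^2 - n)\,\mathrm{Cov}\big[f(X_s, Y),\, f(X'_s, Y')\big], \tag{80} \]

where $(X_s, Y)$ and $(X'_s, Y')$ correspond to two different indices in $\{1,\dots,n\}$; here (80) follows by simple symmetry considerations for the cases $i = j$ and $i \ne j$.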
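The decomposition of the variance into n diagonal terms and n^2 - n identical off-diagonal covariances can be checked exactly on a tiny instance. The sketch below is illustrative only and makes simplifying assumptions: the function f depends on the row of the matrix alone (the output Y is omitted), each of the k columns is uniform over binary vectors with exactly d ones, and all helper names are hypothetical. Exact rational arithmetic avoids any Monte Carlo error.

```python
from fractions import Fraction
from itertools import combinations, product

def columns_with_d_ones(n, d):
    """All length-n 0/1 tuples with exactly d ones."""
    return [tuple(1 if i in ones else 0 for i in range(n))
            for ones in combinations(range(n), d)]

def exact_moments(n, d, k, f):
    """Return (Var[sum_i f(row_i)], Var[f(row)], Cov[f(row), f(row')]) exactly."""
    cols = columns_with_d_ones(n, d)
    matrices = list(product(cols, repeat=k))      # k independent columns
    weight = Fraction(1, len(matrices))           # uniform over all such matrices

    def row(m, i):
        return tuple(col[i] for col in m)

    def mean(g):
        return sum(weight * g(m) for m in matrices)

    def total(m):
        return sum(f(row(m, i)) for i in range(n))

    var_sum = mean(lambda m: total(m) ** 2) - mean(total) ** 2
    mu = mean(lambda m: f(row(m, 0)))
    var_one = mean(lambda m: f(row(m, 0)) ** 2) - mu ** 2
    cov_two = mean(lambda m: f(row(m, 0)) * f(row(m, 1))) - mu ** 2
    return var_sum, var_one, cov_two

# Small instance: n = 5 rows, d = 2 ones per column, k = 2 columns.
n, d, k = 5, 2, 2
var_sum, var_one, cov_two = exact_moments(
    n, d, k, lambda r: r[0] + 2 * r[1] + 3 * r[0] * r[1])
print(var_sum == n * var_one + (n * n - n) * cov_two)
```

With a single column and f equal to the entry itself, the row sum is always d, so the left-hand side is zero and the covariance equals minus the single-row variance divided by n - 1; this is the negative correlation induced by sampling without replacement.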

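To compute the covariance term in (80), we first find the joint distribution of $(X_s, Y)$ and $(X'_s, Y')$. As noted in [29, Sec. IV-B], a uniform permutation of a vector with $d$ ones and $n - d$ zeros can be interpreted as successively performing uniform sampling from a collection of symbols without replacement ($n$ times in total), where the initial collection contains $d$ ones and $n - d$ zeros. By considering the first two steps of this procedure, we obtain

\[ P[X_i = x_i] = P_X(x_i) \tag{81} \]
\[ P[X'_i = x'_i \mid X_i = x_i] = \frac{n P_X(x'_i) - \mathbb{1}\{x_i = x'_i\}}{n - 1} \tag{82} \]

for each $i \in s$, where $P_X(1) = 1 - P_X(0) = \frac{d}{n}$. Denoting the right-hand side of (82) by $P'_X(x'_i \mid x_i)$, and writing $\mu_f := \mathbb{E}[f(X_s, Y)]$, the covariance in (80) is given by

\[ \mathrm{Cov}\big[f(X_s, Y),\, f(X'_s, Y')\big] = \mathbb{E}\big[\big(f(X_s, Y) - \mu_f\big)\big(f(X'_s, Y') - \mu_f\big)\big] \tag{83} \]
\[ = \sum_{x_s}\sum_{x'_s} P_X^{k}(x_s) \prod_{i\in s} P'_X(x'_i \mid x_i)\; \mathbb{E}\big[\big(f(x_s, Y) - \mu_f\big)\big(f(x'_s, Y') - \mu_f\big) \,\big|\, X_s = x_s, X'_s = x'_s\big]. \tag{84} \]

We now consider the various terms arising by substituting (82) into (84) and performing a binomial-type expansion of the product:

There is a single term of the form (84) with each $P'_X(x'_i \mid x_i)$ replaced by $\frac{n}{n-1}P_X(x'_i)$. This yields an average of $\big(f(X_s, Y) - \mu_f\big)\big(f(X'_s, Y') - \mu_f\big)$ over independent random variables $X_s$ and $X'_s$, and therefore evaluates to zero.

There are $k$ terms in which one value $P'_X(x'_i \mid x_i)$ in (84) is replaced by $-\frac{1}{n-1}\mathbb{1}\{x_i = x'_i\}$ and the other $k - 1$ are replaced by $\frac{n}{n-1}P_X(x'_i)$. Each such term can be written as $-\frac{n^{k-1}}{(n-1)^{k}}\mathrm{Var}\big[\mathbb{E}[f(X_s, Y) \mid X_{s\setminus\{i\}}]\big]$, which in turn behaves as $-\frac{1}{n}\mathrm{Var}\big[\mathbb{E}[f(X_s, Y) \mid X_{s\setminus\{i\}}]\big] + O\big(\frac{1}{n^2}\big)$.

All of the remaining terms replace $P'_X(x'_i \mid x_i)$ in (84) by $-\frac{1}{n-1}\mathbb{1}\{x_i = x'_i\}$ for at least two values of $i$. All such terms are easily verified to behave as $O\big(\frac{1}{n^2}\big)$, and the number of such terms is finite and does not scale with $n$ (recall that $k$ is fixed by assumption).

Substituting these cases into (84) and recalling that $k = \Theta(1)$ and $d = \Theta(n)$, we obtain the statement of the lemma.

E Proof of Proposition 1

Here we characterize $I_0$ and $V_0$, defined in (74)-(75), via an extension of the analysis given in [7, App. B]. Since $Y = X_s\beta_s + Z$, we have

\[ I_0 = I(\beta_s; Y \mid X_s) = H(Y \mid X_s) - H(Y \mid X_s, \beta_s) \tag{85} \]
\[ = H(X_s\beta_s + Z \mid X_s) - H(Z). \tag{86} \]

From [25, Ch. 9], we have $H(Z) = \frac{n}{2}\log(2\pi e\sigma^2)$ and $H(X_s\beta_s + Z \mid X_s = x_s) = \frac{1}{2}\log\big((2\pi e)^n \det(\sigma^2 I + \sigma_\beta^2 x_s x_s^T)\big)$, where $I$ is the identity matrix. Averaging the latter over $X_s$ and substituting these into (86) gives

\[ I_0 = \frac{1}{2}\mathbb{E}\Big[\log\det\Big(I_n + \frac{\sigma_\beta^2}{\sigma^2}X_s X_s^T\Big)\Big] \tag{87} \]
\[ = \frac{1}{2}\mathbb{E}\Big[\log\det\Big(I_k + \frac{\sigma_\beta^2}{\sigma^2}X_s^T X_s\Big)\Big] \tag{88} \]
\[ = \frac{1}{2}\sum_{i=1}^{k}\mathbb{E}\Big[\log\Big(1 + \frac{\sigma_\beta^2}{\sigma^2}\lambda_i(X_s^T X_s)\Big)\Big] \tag{89} \]
\[ \le \frac{k}{2}\log\Big(1 + \frac{d\sigma_\beta^2}{\sigma^2}\Big), \tag{90} \]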

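where (88) follows from the identity $\det(I + AB) = \det(I + BA)$, (89) follows by writing the determinant as a product of eigenvalues (denoted by $\lambda_i(\cdot)$), and (90) follows from Jensen's inequality and the following calculation:

\[ \frac{1}{k}\sum_{i=1}^{k}\mathbb{E}\big[\lambda_i(X_s^T X_s)\big] = \frac{1}{k}\mathbb{E}\big[\mathrm{Tr}(X_s^T X_s)\big] = \mathbb{E}\big[X_1^T X_1\big] = d, \tag{91} \]

where $X_1$ denotes a single column of $X_s$, whose squared norm is $d$ almost surely. This concludes the proof of (76).

We now turn to the bounding of the variance. Again using the fact that $Y = X_s\beta_s + Z$, we have

\[ \log\frac{P_{Y|X_s\beta_s}(Y \mid X_s, \beta_s)}{P_{Y|X_s}(Y \mid X_s)} = \log\frac{P_Z(Z)}{P_{Y|X_s}(X_s\beta_s + Z \mid X_s)} \tag{92} \]
\[ = I_0 - \frac{1}{2\sigma^2}Z^T Z + \frac{1}{2}(X_s\beta_s + Z)^T\big(\sigma^2 I + \sigma_\beta^2 X_s X_s^T\big)^{-1}(X_s\beta_s + Z), \tag{93} \]

where $P_Z$ is the density of $Z$, and (93) follows by a direct substitution of the densities $P_Z \sim N(0, \sigma^2 I)$ and $P_{Y|X_s}(\cdot \mid x_s) \sim N(0, \sigma^2 I + \sigma_\beta^2 x_s x_s^T)$. Observe now that $\frac{1}{\sigma^2}Z^T Z$ is a sum of $n$ independent $\chi^2$ random variables with one degree of freedom (each having a variance of 2), and hence the second term in (93) has a variance of $\frac{n}{2}$. Moreover, by writing $M^{-1} = (M^{-1/2})^T M^{-1/2}$ for the symmetric positive definite matrix $M = \sigma^2 I + \sigma_\beta^2 X_s X_s^T$, we similarly observe that the final term in (93) is one half of a sum of $n$ independent $\chi^2$ variables (this is true conditioned on any $X_s = x_s$, and hence also true unconditionally), again yielding a variance of $\frac{n}{2}$. We thus obtain (77) using the identity $\mathrm{Var}[A + B] \le \mathrm{Var}[A] + \mathrm{Var}[B] + 2\max\{\mathrm{Var}[A], \mathrm{Var}[B]\}$.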
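The two ingredients behind the mutual information bound, the determinant identity and the concavity step, can be checked numerically on random matrices whose columns each contain exactly d ones. The sketch below is a minimal illustration under stated assumptions: `snr` stands for the ratio of the coefficient variance to the noise variance, all function names are hypothetical, and the log-determinant is computed by plain Gaussian elimination, which is adequate only for the small, well-conditioned matrices used here. Because the trace of the k-by-k Gram matrix equals kd for every such matrix, the concavity bound holds for each sample and not only on average.

```python
import math
import random

def expander_columns(n, d, k, rng):
    """Return k columns, each a length-n 0/1 list with exactly d ones at random rows."""
    cols = []
    for _ in range(k):
        ones = set(rng.sample(range(n), d))
        cols.append([1.0 if i in ones else 0.0 for i in range(n)])
    return cols

def logdet_spd(m):
    """log det of a symmetric positive definite matrix via Gaussian elimination."""
    a = [row[:] for row in m]
    size = len(a)
    total = 0.0
    for c in range(size):
        pivot = a[c][c]              # positive for SPD input, so no row swaps
        total += math.log(pivot)
        for r in range(c + 1, size):
            factor = a[r][c] / pivot
            for j in range(c, size):
                a[r][j] -= factor * a[c][j]
    return total

def half_logdet_both_ways(cols, snr):
    """Evaluate (1/2) log det in the n-by-n form and in the k-by-k form."""
    n, k = len(cols[0]), len(cols)
    big = [[(1.0 if i == j else 0.0) + snr * sum(c[i] * c[j] for c in cols)
            for j in range(n)] for i in range(n)]
    small = [[(1.0 if a == b else 0.0)
              + snr * sum(x * y for x, y in zip(cols[a], cols[b]))
              for b in range(k)] for a in range(k)]
    return 0.5 * logdet_spd(big), 0.5 * logdet_spd(small)

def jensen_bound(d, k, snr):
    """Upper bound (k/2) log(1 + d * snr)."""
    return 0.5 * k * math.log(1.0 + d * snr)

rng = random.Random(0)
for _ in range(3):
    cols = expander_columns(30, 5, 3, rng)
    big, small = half_logdet_both_ways(cols, 2.0)
    print(round(big, 6), round(small, 6), round(jensen_bound(5, 3, 2.0), 6))
```

The first two printed values should agree up to rounding, and both should sit at or below the third; the gap closes only when the k columns have disjoint supports, in which case the Gram matrix is d times the identity.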