Supplementary Materials for Statistical-Computational Phase Transitions in Planted Models: The High-Dimensional Setting
|
|
- Joan Hodges
- 6 years ago
- Views:
Transcription
1 Supplemetary Materials for Statistical-Computatioal Phase Trasitios i Plated Models: The High-Dimesioal Settig Yudog Che The Uiversity of Califoria, Berkeley yudog.che@eecs.berkeley.edu Jiamig Xu Uiversity of Illiois at Urbaa-Champaig jxu18@illiois.edu Abstract We provide the proofs for the theorems i the mai paper. 1 Proofs for Plated Clusterig I this sectio, Theorems 1 6 refer to the theorems i the mai paper. Equatios are umbered cotiuously from the mai paper. We let 1 := r ad 2 := r be the umbers of oisolated ad isolated odes, respectively. 1.1 Proof of Theorem 1 The proof relies o iformatio theoretical argumets ad the Fao s iequaliy [4]. We use D (Ber(p Ber(q to deote the L divergece betwee two Beroulli distributios with mea p ad q. We first state a upper boud o D (Ber(p Ber(q, which is used later i the proof: D (Ber(p Ber(q = p log p q + (1 p log 1 p 1 q (a p p q + (1 p q p (p q2 = q 1 q q(1 q, (16 where (a follows from the iequality log x x 1, x 0. Let P (Y,A be the joit distributio of Y ad A whe Y is sampled from Y uiformly at radom ad A is geerated accordig to the plated clusterig model. Because the supremum is lower bouded by the average, we have if Ŷ sup P [Ŷ ] Y if P (Y,A [Ŷ ] Y. (17 Y Y Ŷ Let H(X be the etropy of a radom variable X ad I(X; Z the mutual iformatio betwee X ad Z. By Fao s iequality, we have for ay Ŷ, P (Y,A(Ŷ Y 1 I(Y ; A + 1. (18 log Y Simple coutig gives that Y = ( 1! 1 r!(!. Note that ( r 1 ( 1 1 ad ( e! e ( e. It follows that Y (/ ( 1 /e 1 ( e r(r/e r e r r/2 (/e e(r. r 1
2 This implies log Y log uder the assumptio that 8 /2 ad 32. O the other had, ote that H(A ( 2 H(A12 because the A ij s are idetically distributed by symmetry. Furthermore,the A ij s are idepedet coditioed o Y, so H(A Y = ( 2 H(A12 Y12. It follows that I(Y ; A = H(A H(A Y ( 2 I(Y 12 ; A 12. We boud I(Y12 ; A 12 below. Observe that ( 2 ( ( P(Y12 2 r+ 1 (r 1! = 1 = = α := Y, ad thus P(A 12 = 1 = β := αp + (1 αq. It follows that I(Y 12; A 12 = αd (Ber(p Ber(β + (1 αd (Ber(q Ber(β (a (p β2 α β(1 β = α(1 α(p q2 β(1 β + (1 αq (q β2 β(1 β (b α(p q2 q(1 q, where (a follows from (16 ad (b follows because β(1 β αp(1 p + (1 αq(1 q due to the cocavity of x(1 x. Hece we have I(Y ; A 1( 1(p q 2 2q(1 q. Combiig with (18, we obtai P (Y,A(Y Y 1 1 ( 1(p q 2 q(1 q log 3/4 ( 1(p q2 q(1 q log, (19 where the last iequality holds because 1 log 8 whe /2 ad 32. The RHS above is at least 1/2 whe the first coditio (1 i the theorem holds. Substitutig ito (17 proves sufficiecy of (1. We ow tur to the third coditio (3. Sice p > q, we have β p by the defiitio of β. Moreover, β αp ad 1 β = 1 q α(p q (1 α(1 q. It follows that I(Y12; A 12 = αp log p (1 p + α(1 p log + (1 αd (Ber(q Ber(β β 1 β αp log 1 (q β2 + (1 α α β(1 β = αp log 1 α + α2 (1 α(p q 2 β(1 β αp log 1 α + α(p q2 p(1 q αp log e α. where the first iequality follows from p/β 1/α, 1 β 1 p ad (16, the secod iequality follows from β(1 β α(1 αp(1 q, ad the last iequality follows from p q p(1 q. By the defiitio of α, we have α ( 1 ( 1 e whe 8. It follows that I(Y 12 ; A 12 αp log e2, ad thus I(Y ; A 1 2 1p log e2 1. By equatio (19, if p 16, i.e., the coditio (3 i the theorem holds, the P(Y Y 1/2. It remais to prove the sufficiecy of the secod coditio (3. Let M = ad Ȳ = {Y 0, Y 1,..., Y M} be a subset of Y with cardiality M + 1, which is specified later. Let P (Y,A deote the joit distributio of Y ad A whe Y is sampled from Ȳ uiformly at radom ad the A is geerated accordig to the plated clusterig model. By Fao s iequality, we have if Ŷ sup P [Ŷ ] Y if P (Y,A [Ŷ ] Y if Y Y Ŷ Ŷ { 1 I(Y ; A + 1 log Ŷ }. (20 2
3 We costruct Ȳ as follows. Let Y 0 be the clusterig matrix such that the clusters {C l } r l=1 are give by C l = {(l 1 + 1,..., l}. Iformally, each Y i with i 1 is obtaied from Y 0 by swappig two odes i two differet clusters. More specifically, for each i [ M], (i if the ( + i-th ode belogs to cluster C l for some l, the Y i has the first right cluster as {1, 2,..., 1, + i} ad the l-th right cluster as D l \ { + i} {}, ad all the other clusters idetical to Y 0 ; (ii if the ( +i-th ode is a isolated ode i Y 0, the Y i has the first right cluster as {1, 2,..., 1, +i} ad ode as a isolated ode, ad all the other clusters idetical to Y 0. Let P i be the distributio of the graph A coditioed o Y = Y i. Note that each P i is a product of 1 2( 1 Beroulli distributios. We have the followig chai of iequalities: I(Y ; A (a 1 (M M i,i =0 max D (P i P i i,i =0,...,M D (P i P i (b 3D (Ber(p Ber(q + 3D (Ber(q Ber(p (c 3(p q 2 1 mi{p(1 p, q(1 q} where (a follows from the covexity of L divergece, (b follows by our costructio of {Y i }, ad (c follows from (16. If (2 holds, the I(Y ; A 1 4 log( = 1 4 log Ȳ. Sice log( log(/2 4 if 128, it follows from (20 that the miimax error probability is at least 1/ Proof of Theorem 2 Let X, Y := Tr(X Y deote the ier product betwee two matrices. Assume that p > q first. For ay feasible solutio Y Y of (4, we defie (Y := A, Y Y. To prove the theorem, it suffices to show that (Y > 0 for all feasible Y with Y Y. For simplicity, i this proof we use a differet covetio that Y ii = 0 ad Y ii = 0 for all i V. Note that (Y = E[A], Y Y + A E[A], Y Y, (21 where E[A] = q11 +(p qy qi, 1 is the all oe vector i R ad I is the idetity matrix. Let d(y := Y, Y Y ; sice i,j Y ij = i,j Y ij, we have O the other had, observe that A E[A], Y Y = 2 E[A], Y Y = (p qd(y. (22 (i<j: Y ij =1 Y ij =0 (A ij p 2 } {{ } T 1 (Y (i<j: Y ij =0 Y ij =1 (A ij q. } {{ } T 2 (Y Here T 1 (Y (T 2 (Y, resp. is the sum of 1 2d(Y i.i.d. cetered Beroulli radom variables with parameter p (q, resp.. Let δ 1 = (p q/(2p ad δ 2 = (p q/(2q. By the Berstei iequality, we have for each fix Y Y, { P T 1 (Y δ } 1 d(y p 2 { } P T 2 (Y δ 2 d(y q 2 ( exp ( exp δ 2 1 4(1 p + 4δ 1 /3 δ 2 2 4(1 q + 4δ 2 /3 (a d(y p exp ( (b d(y q exp ( (p q2 20p(1 q d(y (p q2 20p(1 q d(y,, 3
4 where (a ad (b hold because p > q ad p q p(1 q. It follows from the uio boud that { 1 P 2 A E[A], Y Y = T 1 (Y T 2 (Y 1 } (p q2 (p qd(y 2 exp ( 2 20p(1 q d(y. This implies (p q2 P { (Y 0} 2 exp ( 20p(1 q d(y (23 i view of (21 ad (22. This boud holds for each fixed Y Y. We proceed to boud the probability of the evet { Y Y : Y Y, (Y 0}. Note that 2( 1 d(y r 2 for ay feasible Y Y, where the lower boud is achieved by swappig a ode i V 1 with a ode i V 2, ad the upper boud follows from i,j Y ij r2. The key step is to upper-boud the cardiality of the set {Y Y : d(y = t} for each t. This is doe i the followig combiatorial lemma. Lemma 1.1. For each t [2( 1, r 2 ], we have {Y Y : d(y = t} 25t2 2 20t/. (24 We prove the lemma i Sectio Combiig the lemma with (23 ad the uio boud, we obtai 2 2 P { Y Y : Y Y, (Y 0} (a 50 r 2 t=2 2 r 2 t=2 2 r 2 t=2 2 r 2 t=2 2 P { Y Y : d(y = t, (Y 0} ( { Y Y : d(y = t} exp (p q2 t 20p(1 q 25t 2 ( 2 20t/ exp (p q2 t 20p(1 q 2 5t/ 50r , where (a follows from the assumptio that (p q 2 C p(1 q log for a large costat C. This meas Y is the uique optimal solutio with high probability. A similar argumet applies to the case with q > p. This completes the proof of the theorem Proof of Lemma 1.1 Recall that C 1,..., C r are the true clusters associated with Y. Fix a Y Y with d(y = t. The cluster matrix Y defies a ew ordered partitio (C 1,..., C r+1 of V accordig to the followig procedure. 1. Let C r+1 := {i : Y ij = 0, j}. 4
5 2. The odes i V \C r+1 ca be further partitioed ito r ew clusters of size, where odes i ad j are i the same cluster if ad oly if Y ij = 1; we defie a orderig C 1,..., C r of these r ew clusters as follows. (a For each ew cluster C, if there exists a k [r] such that C Ck > /2, the we label this ew cluster as C k ; this label is uique because the cluster size is. (b The remaiig clusters are labeled arbitrarily. This ew partitio has the followig properties: (A0 (C 1,..., C r, C r+1 is a partitio of V, ad C k = for all k [r]. (A1 For every k [r], either C k Ck > /2, or C k C k /2 for all k [r]; (A2 We have r C k C r+1 2 Ck C r+1 + Ck C k C k C k = t. k=1 Here, Properties (A0 ad (A1 are direct cosequeces of how we label the ew clusters, ad Property (A2 follows from the followig equalities t = d(y = {(i, j V V : Yij = 1, Y ij = 0} r = {(i, j : (i, j Ck C k, i j, Y ij = 0} = k=1 r {(i, j : (i, j Ck C k, i j, (i, j C r+1 C r+1 } k=1 + r k=1 {(i, j : (i, j C k C k, (i, j C k C k }. Sice these properties are satisfied by ay Y with d(y = t, we have {Y Y : d(y = t} {(C 1,..., C r, C r+1 : it has properties (A0 (A2}. (25 To prove the lemma, it suffices to upper boud the right had side of (25. Fix a ordered partitio C := (C 1,..., C r, C r+1 with properties (A0 (A2. We cosider the first true cluster, C 1. Defie m 1 := k [r+1],k 1 C k C 1, which is the umber of odes i C 1 that are misclassified by C. We cosider two cases. If C 1 C1 > /10, the C1 C k C1 C k 2 C1 C 1 k [r+1] k 1 C 1 C k > m 1 /5. If C 1 C 1 /10, the by coditio (A1 we must have C k C 1 /2 for all 1 k r 5
6 ad m 1 > 9/10. Hece, =m k r C 1 C k C 1 C k + C 1 C r+1 2 C 1 C r+1 I {k 1}I {k 1} C k C 1 C k C 1 + C 1 C r+1 2 C 1 C r+1 m m m 1. C k C 1 2 C 1 C r+1 We coclude that we always have C 1 C k C 1 C k + C 1 C r+1 2 C 1 C r m 1. The above iequality holds if we replace C1 by C k ad m 1 by m k (defied similarly for each k [r]. It follows that r C k C r+1 2 Ck C r+1 + Ck C k C k C k r m k. 5 k=1 Thus by Property (A2, we have k [r] m k 5t/, i.e., the total umber of misclassified odes i V 1 is upper bouded by 5t/. This meas that the total umber of misclassified odes i V 2 is also upper bouded by 5t/ because by our cluster size costrait, oe misclassified ode i V 2 must produce oe misclassified ode i V 1. Therefore, the total umber of misclassified odes i V 1 ca at most take 5t/ differet values ad the same is true for the total umber of misclassified odes i V 2. For a fixed umber of misclassified odes i V 1 ad V 2, there are at most 5t/ 1 5t/ 2 differet ways to choose these misclassified odes. Each misclassified ode i V 1 ca be assiged to oe of r 1 differet clusters or leave isolated, ad each misclassified ode i V 2 ca be assiged to oe of r differet clusters. Hece, the right had side of (25 is upper bouded by 25t 2 5t/ 2 1 5t/ 2 r 10t/ 25t2 20t/, which proves the lemma Proof of Theorem 3 The proof uses several matrix orms. The spectral orm X of a matrix X is the largest sigular value of X. The uclear orm X is the sum of sigular values. We also eed the L 1 orm X 1 = i,j X ij ad the L orm X = max i,j X ij. Let X, Y = Tr(X Y deote the ier product betwee two matrices. For a vector x, x 2 is the usual Euclidea orm. Defie (Y = Y Y, A. It suffices to show that (Y > 0 for all feasible solutio Y of the program (6 (8 with Y Y. Rewrite (Y as (Y = Ā, Y Y + A Ā, Y Y = Ā, Y Y + λ W, Y Y, (26 k=1 6
7 where Ā = q11 + (p qy ad W := (A Ā/λ. Because i,j Y ij = r 2 = i,j Y ij ad Y ij [0, 1], the first term i the RHS of (26 satisfies Ā, Y Y = (p q Y, Y Y = p q 2 Y Y 1. We ow cotrol the secod term i (26. Note that Var[A ij ] p(1 q. Defie σ 2 := max{p(1 q, q(1 p}. Note that Ā E[A] 1. By assumptio, σ2 C log / for a costat C. We eed the followig boud, which is proved i Sectio to follow. Lemma 1.2. Uder the otatio above, if σ 2 C log / for a costat C, the there exists a costat C such that with high probability A E[A] C σ 2 log + q(1 q. It follows that with high probability A Ā A E[A] + Ā E[A] C σ 2 log + q(1 q for a uiversal costat C. Defie λ := C σ log + q(1 q ad the ormalized oise matrix. Thus w.h.p. W 1. Let u k be the ormalized characteristic vector of cluster C k, i.e., u k(i = 1/ if ode i is i cluster C k ad u k(i = 0 otherwise. Let U = [u 1,..., u r ]. The Y = UU is the sigular value decompositio of Y. Defie the projectios P T (M = UU M + MUU UU MUU ad P T (M = M P T (M. Because P T (W W 1 w.h.p., UU + P T (W is a subgradiet of X at X = Y. Sice Y = 1, it follows that for ay feasible Y, 0 Y Y UU + P T (W, Y Y. Substitutig ito (26, we obtai that for ay feasible Y, (Y p q 2 Y Y 1 + λ P T (W UU, Y Y ( p q λ UU P T (λw Y Y 1 2 ( p q = λ 2 P T (λw Y Y 1, (27 where the last equality holds because UU = 1/. We proceed by boudig the term P T (λw i (27. From the defiitio of P T, we have P T (λw UU (λw + (λw UU + UU (λw UU 3 UU (λw. Assume ode i belogs to cluster k. The (UU (λw ij = (u k u k (λw ij = (1/ i C k (λw i j, which is the average of idepedet radom variables. By Berstei s iequality (Theorem A.1 we have with probability at least 1 4, i Ck 6σ 2 log + 2 log C 1 σ log, for some costat C 1, where the last iequality follows from the assumptio (9. It follows from the uio boud over all (i, j that that P T (λw 3 UU (λw 3C 1 σ log / with probability at least 1 2. Substitutig back to (27, we coclude that with probability at least 1 1, ( p q (Y λ 2 3C 1σ log / Y Y 1 > 0 for all feasible Y Y, where the last iequality follows from the assumptio (9. 7
8 1.3.1 Proof of Lemma 1.2 Let R := support(y ad P R ( : R R be the operator which sets the etries outside of R to be zero. Let B 1 = P R (A E[A] ad B 2 = A E[A] B 1. The B 1 is a block-diagoal matrix with r blocks of size ad has etries with variace bouded by σ 2. Applyig the matrix Berstei iequality [5], we get that with high probability, B 1 c 1 σ 2 log for a costat c 1. O the other had, B 2 has etries with variace bouded by max{q(1 q, c 2 log /} for a costat c 2. By Lemma 1.3 below, we obtai that B 2 c 3 max{ q(1 q, log } for a costat c 3. It follows that A E[A] B 1 + B 2 c 1 σ 2 log + c 3 max{ q(1 q, log } which completes the proof of the lemma. C σ 2 log + q(1 q, Lemma 1.3. Let M deote the symmetric matrix such that M ij (1 i < j are idepedet radom variables with P(M ij = 1 p ij = p ij ad P(M ij = p ij = 1 p ij, ad M ii = 0. Suppose that Var[M ij ] σ 2 with σ 2 C log / for a costat C, the with high probability M Cσ for a costat C. Proof. If σ 2 log7, the Theorem 8.4 i [1] implies that M 3σ w.h.p. If C log σ 2 log 7, the Lemma 2 i [2] implies that M Cσ w.h.p. for some uiversal costat C. 1.4 Proof of Theorem 4 We first claim that (p q c 2 p + q implies (p q c2 2q uder the assumptio that /2 ad q c 1 log. I fact, if p q, the the claim trivially holds. If p > q, the q < p/ p/2. It follows that p/2 < (p q c 2 p + q c2 2p. Thus, p < 8c 2 2 which cotradicts the assumptio that p > q c 1 log. Therefore, p > q caot hold. Hece, it suffices to show that if (p q c 2 2q, the Y is ot a optimal solutio of the covex program (6 (8. We do this by showig that the optimality of Y implies (p q > c 2 2q. Let I be the all-oe matrix. Let R := support(y ad A := support(a. Recall that Y = UU is the sigular value decompositio of Y, ad the orthogoal projectio oto the space T is give by P T (M = UU M + MUU UU MUU. Cosider the Lagragia L(Y ; λ, µ, F, G := A, Y + λ ( Y Y + η ( I, Y r 2 F, Y + G, Y I, where λ 0, η R, F ij 0 ad G ij 0, i, j are the Lagragia multipliers. Note that Y = r2 I 2 is strictly feasible so strog duality holds by Slater s Theorem. By stadard covex aalysis, if Y = Y is a optimal solutio, the there must exists some F, G ad λ for which the followig 8
9 T coditios hold: 0 L(Y ; λ, µ, F, G Y Y =Y, F ij 0, (i, j, G ij 0, (i, j, λ 0, F ij = 0, (i, j R, G ij = 0, (i, j R c. } Statioary coditio Dual feasibility } Complemetary slackess Recall that M R is a subgradiet of X at X = Y if ad oly if P T (M = UU ad P T (M 1. Let H = F G; the T coditios imply that there exist some λ, η, W ad H obeyig ( A λ UU + W ηi + H = 0, (28 λ 0, (29 P T W = 0, (30 W 1, (31 H ij 0, (i, j R, (32 H ij 0, (i, j R c. (33 Now observe that UU W UU = 0 by (30. We left ad right multiply (28 by UU to obtai Ā λuu ηi + H = 0, where for ay matrix X R, X := UU XUU is the matrix obtaied by takig the average i each block of X. Cosider the last display equatio o etries i R ad R c respectively. Applyig the Berstei iequality (Theorem A.1 for each etry of Ā ij, we get that with high probability, p λ η + H ij c 3 p(1 p log q(1 q log q η + H ij c 3 c 4 log c 4 log 2 2 (a ɛ 0, (i, j R (34 8 (b ɛ 0 8, (i, j Rc (35 for some costats c 3, c 4 > 0, where (a ad (b follow from the assumptio c 1 log with c1 sufficietly large. Usig (32 ad (33, we get q ɛ 0 8 q c 3 q(1 q log c 4 log 2 2 η p + c 3 p(1 p log + c 4 log 2 2 λ p + ɛ 0 8 λ. (36 It follows that λ (p q + c 3 ( p(1 p log + q(1 q log + c 4 log { } c 4 log 4 max (p q, c 3 p(1 p log, c3 q(1 q log,. (37 c 1 9
10 O the other had, (31, (30 ad (28 imply λ 2 = λ(uu + W 2 1 λ(uu + W 2 F = 1 A ηi + H 2 F 1 A R c ηi R c + H R c 2 F 1 (1 η 2 A ij, (i,j R c where X R c deotes that matrix obtaied from X by settig the etries outside R c to zero. Usig η ɛ 0, which is a cosequece of (36, (29 ad the assumptio p 1 ɛ 0, we obtai λ ɛ2 0 (i,j R c A ij,. (38 Note that (i,j R c A ij equals two times the sum of ( ( 2 r 2 i.i.d. Beroulli radom variables with parameter q. By the Cheroff boud of Biomial distributios ad the assumptio that q c 1 log, we kow with high probability (i<j R c A ij Cq 2 for some costat C. It follows from (38 that λ 2 C q for some costat C > 0. Combiig with (37 ad the assumptio that q c 1 log, we coclude that (p q C q for some costat C > 0. This completes the proof of the theorem. 1.5 Proof of Theorem 5 The degree d i of ode i is distributed as Bi( 1, p plus a idepedet Bi(, q if i V 1. Otherwise d i is distributed as Bi( 1, q if i V 2. It follows that E[d i ] = ( 1q +( 1(p q if i V 1 ad E[d i ] = ( 1q if i V 2. If we defie σ 2 = p(1 q + q(1 q, the we further have Var[d i ] σ 2. Set t := ( 1 p q /2 σ 2 ; the Berstei iequality (Theorem A.1 gives ( t 2 P{ d i E[d i ] t} 2 exp 2σ 2 2 exp ( ( 12 (p q 2 + 2t/3 12σ 2 2 2, where the last iequality follows from assumptio (12. By the uio boud, with probability at least 1 2 1, d i > (p q 2 + q for all odes i V 1 ad d i < (p q 2 + q for all odes i V 2. Therefore, all odes i V 2 are correctly declared to be isolated with high probability. The umber of commo eighbors S ij betwee the odes i ad j is distributed as Bi( 2, p 2 plus a idepedet Bi(, q 2 if i ad j are i the same cluster, ad it is distributed as Bi(2( 1, pq plus a idepedet Bi( 2, q 2 if i ad j are i differet clusters. Hece, E[S ij ] equals ( 2p 2 + ( q 2 if i ad j are i the same cluster ad 2( 1pq + ( 2q 2 otherwise. The differece i mea equals (p q 2 2p(p q. Let σ 2 = 2p 2 (1 q 2 +q 2 (1 q 2. The Var[S ij ] σ 2. Set t := (p q 2 /3 σ 2. Assumptio (13 implies that t > 2p(p q. Applyig the Berstei iequality (Theorem A.1, we obtai ( P{ S ij E[S ij ] t t 2 } 2 exp 2σ 2 2 exp ( 2 (p q 4 + 2t/3 27σ 2 2 3, where the last iequality follows from the assumptio (13. By uio boud, with probability at least 1 2 1, S ij > (p q pq + q 2 for all odes i, j from the same cluster ad S ij < (p q pq + q 2 for all odes i, j from two differet clusters. Therefore the simple algorithm returs the true clusters w.h.p. 10
11 1.6 Proof of Theorem 6 For simplicity we assume ad 2 are eve umbers. We partitio V 1 ito two equal-sized subsets V 1+ ad V 1 such that half of the odes of each cluster are i V 1+. Similarly, V 2 is partitioed ito two equal-sized subsets V 2+ ad V 2. To prove the theorem, we eed the followig aticocetratio iequality. Theorem 1.4 (Theorem i [3]. Let X 1,..., X be idepedet radom variables such that 0 X i 1 for all i. Suppose X = i=1 X i ad σ 2 := i=1 Var[X i] 200. The for all 0 t σ 2 /100 ad some uiversal costat c > 0, we have P [X E[X] + t] ce t2 /(3σ 2. Idetifyig isolated odes For each ode i i V 1+ V 2+, let d i+ be the umber of its eighbors i V 1+ V 2+ ad d i be the umber of its eighbors i V 1 V 2, so d i = d i+ + d i. We cosider two cases. Case 1: Suppose (p + ( q log 1 q log 2. I this case we have ( 1 2 (p q 2 2c 2 (p+q log 1 by (14. For each i V 1+, d i is distributed as Bi(/2, p plus a idepedet Bi(( /2, q. Let t := ( 1(p q + 2, γ 1 := E[d i ] t = q/2 + (p q/2 t ad σd 2 := Var[d i ] = 1 2 p(1 p ( q(1 q. Sice /2, p, q are bouded away from 1 ad p + q p 2 + q 2 c 1 log, we have σd 2 c log 200. Combiig wit (14, we further have σd 4 c 2 (p q 2 / log 1 c log t 2. We ca thus apply Theorem 1.4 ad get ( P [d i γ 1 ] c exp ( t2 (( 1(p q σd 2 = exp c1 c, 3(p(1 p + ( q(1 q for some costat c > 0 that ca be made small by choosig c 2 above sufficietly small. Let i := arg mi i V1+ d i. Sice the radom variables {d i : i V 1+ } are mutually idepedet, we have P [d i γ 1 ] = ( P [d i γ 1 ] (1 c1 c 1/2 exp c 1 c 1 /2 1/4. i V 1+ O the other had, for each i V 1+, d i+ is distributed as Bi(/2 1, p plus a idepedet Bi(( /2, q. Sice the media of Bi(, p is at most p + 1, we kow that with probability at least 1/2, d i+ γ 2 := q/2 + (p q/2 p + 2. Now observe that the two sets of radom variables {d i+, i V 1+ } ad {d i, i V 1+ } are idepedet of each other, so d i+ is idepedet of i for each i V 1+. It follows that P [d i + γ 2 ] = i V 1+ P [d i+ γ 2 i = i] P [i = i] = i V 1+ P [d i+ γ 2 ] P [i = i] 1 2. Combiig the last two display equatios by the uio boud, we obtai that with probability at least 1/4, d i = d i + d i + γ 1 + γ 2 = ( 1q, ad thus ode i will be icorrectly declared as a isolated ode. Case 2: Suppose (p + q log 1 q log 2. I this case we have ( 1 2 (p q 2 2c 2 q log 2 by assumptio. Defie i = arg max i V2+ d i. Usig the same argumet as i Case 1, we ca show that with probability at least 1/4, d i ( 1q + ( 1(p q ad thus ode i will icorrectly declared as a o-isolated ode. 11
12 Recoverig clusters For two odes i, j V 1, let S ij+ be the umber of their commo eighbors i V 1+ V 2+ ad S ij be the umber of their commo eighbors i V 1 V 2, so S ij+ = S ij+ +S ij. For each pair of odes i, j i V 1+ that are from the same cluster, S ij is distributed as Bi(/2, p 2 plus a idepedet Bi(( /2, q 2. Let t := (p q 2 + 4, γ 3 := E[S ij ] t = q 2 /2+(p 2 q 2 /2 t, ad σs 2 := Var[S ij ] = 1 2 p2 (1 p ( q2 (1 q 2. Sice /2, p, q are bouded away from 1 ad p 2 + q 2 c 1 log, we have that σs ad σ2 S 100t. Theorem 1.4 implies that there exists a costat c > 0 such that ( P [S ij γ 3 ] c exp ( t 2 ((p q = c exp 3(p 2 (1 p 2 + ( q 2 (1 q 2 c c 1, 3σ 2 S where the costat c > 0 ca be made sufficietly small by choosig c 2 sufficietly small i the statemet of the lemma. Without loss of geerality, we may re-label the odes such that V 1+ = {1, 2,..., 1 /2} ad for each k = 1,..., 1 /4, the odes 2k 1 ad 2k are i the same cluster. Note that the radom variables {S (2k 12k : k = 1, 2,..., 1 /4} are mutually idepedet. Let i = arg mi k=1,2,...,1 /4 S (2k 12k ad j = i + 1; it follows that P [S i j γ 3 ] (1 c c 1 1/4 exp( c 1 c 1 /4 1/4. O the other had, sice S ij+ is distributed as Bi(/2 2, p 2 plus a idepedet Bi(( /2, q 2, we use the media argumet to obtai that with probability at least 1/2, S ij+ γ 4 := q 2 /2 + (p 2 q 2 /2 2p Because {S ij+, i, j V 1+ } oly depeds o the edges betwee V 1+ ad V 1+ V 2+, ad (i, j oly depeds o the edges betwee V 1+ ad V 1 V 2, we kow {S ij+, i, j V 1+ } ad (i, j are idepedet of each other. It follows that S i j + γ 4 with probability at least 1/2. It follows that with probability at least 1/4, S i j = S i j + S i j + γ 3 + γ 4 = 2( 1pq + ( 2q 2 ad thus the odes i, j will be icorrectly assiged to two differet clusters. Appedices A Stadard Berstei Iequality Theorem A.1. Let X 1,..., X be idepedet radom variables such that X i M almost surely. Let σ 2 = i=1 Var(X i, the for ay t 0 [ ] ( t 2 P X i t exp 2σ Mt. i=1 A cosequet of the above iequality is P [ i=1 X i 2σ 2 u + 2Mu 3 ] e u for ay u > 0. 12
13 Refereces [1] S. Chattergee. Matrix estimatio by uiversal sigular value thresholdig. arxiv: , [2] L. Massoulié ad D.-C. Tomozei. Distributed user profilig via spectral methods. arxiv: , [3] J. Matoušek ad J. Vodrák. The probabilistic method, lecture otes. kam.mff.cui. cz/ matousek/ prob-l-2pp.ps.gz, [4]. Rohe, S. Chatterjee, ad B. Yu. Spectral clusterig ad the high-dimesioal stochastic blockmodel. The Aals of Statistics, 39(4, [5] J. Tropp. User-friedly tail bouds for sums of radom matrices. Foudatios of Computatioal Mathematics, 12(4: ,
Convergence of random variables. (telegram style notes) P.J.C. Spreij
Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit theorems Throughout this sectio we will assume a probability space (Ω, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationLecture 2. The Lovász Local Lemma
Staford Uiversity Sprig 208 Math 233A: No-costructive methods i combiatorics Istructor: Ja Vodrák Lecture date: Jauary 0, 208 Origial scribe: Apoorva Khare Lecture 2. The Lovász Local Lemma 2. Itroductio
More informationACO Comprehensive Exam 9 October 2007 Student code A. 1. Graph Theory
1. Graph Theory Prove that there exist o simple plaar triagulatio T ad two distict adjacet vertices x, y V (T ) such that x ad y are the oly vertices of T of odd degree. Do ot use the Four-Color Theorem.
More informationLinear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d
Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y
More information5.1 Review of Singular Value Decomposition (SVD)
MGMT 69000: Topics i High-dimesioal Data Aalysis Falll 06 Lecture 5: Spectral Clusterig: Overview (cotd) ad Aalysis Lecturer: Jiamig Xu Scribe: Adarsh Barik, Taotao He, September 3, 06 Outlie Review of
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 19 11/17/2008 LAWS OF LARGE NUMBERS II THE STRONG LAW OF LARGE NUMBERS
MASSACHUSTTS INSTITUT OF TCHNOLOGY 6.436J/5.085J Fall 2008 Lecture 9 /7/2008 LAWS OF LARG NUMBRS II Cotets. The strog law of large umbers 2. The Cheroff boud TH STRONG LAW OF LARG NUMBRS While the weak
More informationSupplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate
Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We
More information7.1 Convergence of sequences of random variables
Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite
More informationLecture 3: August 31
36-705: Itermediate Statistics Fall 018 Lecturer: Siva Balakrisha Lecture 3: August 31 This lecture will be mostly a summary of other useful expoetial tail bouds We will ot prove ay of these i lecture,
More informationOptimally Sparse SVMs
A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but
More informationSTAT Homework 1 - Solutions
STAT-36700 Homework 1 - Solutios Fall 018 September 11, 018 This cotais solutios for Homework 1. Please ote that we have icluded several additioal commets ad approaches to the problems to give you better
More informationProblem Set 2 Solutions
CS271 Radomess & Computatio, Sprig 2018 Problem Set 2 Solutios Poit totals are i the margi; the maximum total umber of poits was 52. 1. Probabilistic method for domiatig sets 6pts Pick a radom subset S
More informationECE 330:541, Stochastic Signals and Systems Lecture Notes on Limit Theorems from Probability Fall 2002
ECE 330:541, Stochastic Sigals ad Systems Lecture Notes o Limit Theorems from robability Fall 00 I practice, there are two ways we ca costruct a ew sequece of radom variables from a old sequece of radom
More informationRademacher Complexity
EECS 598: Statistical Learig Theory, Witer 204 Topic 0 Rademacher Complexity Lecturer: Clayto Scott Scribe: Ya Deg, Kevi Moo Disclaimer: These otes have ot bee subjected to the usual scrutiy reserved for
More information1 Duality revisited. AM 221: Advanced Optimization Spring 2016
AM 22: Advaced Optimizatio Sprig 206 Prof. Yaro Siger Sectio 7 Wedesday, Mar. 9th Duality revisited I this sectio, we will give a slightly differet perspective o duality. optimizatio program: f(x) x R
More informationSupplemental Material: Proofs
Proof to Theorem Supplemetal Material: Proofs Proof. Let be the miimal umber of traiig items to esure a uique solutio θ. First cosider the case. It happes if ad oly if θ ad Rak(A) d, which is a special
More informationA Hadamard-type lower bound for symmetric diagonally dominant positive matrices
A Hadamard-type lower boud for symmetric diagoally domiat positive matrices Christopher J. Hillar, Adre Wibisoo Uiversity of Califoria, Berkeley Jauary 7, 205 Abstract We prove a ew lower-boud form of
More informationLecture 7: Properties of Random Samples
Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ
More informationThis exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.
Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the
More informationIntroduction to Probability. Ariel Yadin
Itroductio to robability Ariel Yadi Lecture 2 *** Ja. 7 ***. Covergece of Radom Variables As i the case of sequeces of umbers, we would like to talk about covergece of radom variables. There are may ways
More informationAn Introduction to Randomized Algorithms
A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis
More informationLecture 10 October Minimaxity and least favorable prior sequences
STATS 300A: Theory of Statistics Fall 205 Lecture 0 October 22 Lecturer: Lester Mackey Scribe: Brya He, Rahul Makhijai Warig: These otes may cotai factual ad/or typographic errors. 0. Miimaxity ad least
More informationLecture 9: Expanders Part 2, Extractors
Lecture 9: Expaders Part, Extractors Topics i Complexity Theory ad Pseudoradomess Sprig 013 Rutgers Uiversity Swastik Kopparty Scribes: Jaso Perry, Joh Kim I this lecture, we will discuss further the pseudoradomess
More informationSupplementary Material for Fast Stochastic AUC Maximization with O(1/n)-Convergence Rate
Supplemetary Material for Fast Stochastic AUC Maximizatio with O/-Covergece Rate Migrui Liu Xiaoxua Zhag Zaiyi Che Xiaoyu Wag 3 iabao Yag echical Lemmas ized versio of Hoeffdig s iequality, ote that We
More informationNotes 5 : More on the a.s. convergence of sums
Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series
More informationEECS564 Estimation, Filtering, and Detection Hwk 2 Solns. Winter p θ (z) = (2θz + 1 θ), 0 z 1
EECS564 Estimatio, Filterig, ad Detectio Hwk 2 Sols. Witer 25 4. Let Z be a sigle observatio havig desity fuctio where. p (z) = (2z + ), z (a) Assumig that is a oradom parameter, fid ad plot the maximum
More informationSelf-normalized deviation inequalities with application to t-statistic
Self-ormalized deviatio iequalities with applicatio to t-statistic Xiequa Fa Ceter for Applied Mathematics, Tiaji Uiversity, 30007 Tiaji, Chia Abstract Let ξ i i 1 be a sequece of idepedet ad symmetric
More informationLECTURE 8: ORTHOGONALITY (CHAPTER 5 IN THE BOOK)
LECTURE 8: ORTHOGONALITY (CHAPTER 5 IN THE BOOK) Everythig marked by is ot required by the course syllabus I this lecture, all vector spaces is over the real umber R. All vectors i R is viewed as a colum
More informationNotes for Lecture 11
U.C. Berkeley CS78: Computatioal Complexity Hadout N Professor Luca Trevisa 3/4/008 Notes for Lecture Eigevalues, Expasio, ad Radom Walks As usual by ow, let G = (V, E) be a udirected d-regular graph with
More informationREGRESSION WITH QUADRATIC LOSS
REGRESSION WITH QUADRATIC LOSS MAXIM RAGINSKY Regressio with quadratic loss is aother basic problem studied i statistical learig theory. We have a radom couple Z = X, Y ), where, as before, X is a R d
More informationSolution. 1 Solutions of Homework 1. Sangchul Lee. October 27, Problem 1.1
Solutio Sagchul Lee October 7, 017 1 Solutios of Homework 1 Problem 1.1 Let Ω,F,P) be a probability space. Show that if {A : N} F such that A := lim A exists, the PA) = lim PA ). Proof. Usig the cotiuity
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 3 9/11/2013. Large deviations Theory. Cramér s Theorem
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/5.070J Fall 203 Lecture 3 9//203 Large deviatios Theory. Cramér s Theorem Cotet.. Cramér s Theorem. 2. Rate fuctio ad properties. 3. Chage of measure techique.
More informationLecture 7: October 18, 2017
Iformatio ad Codig Theory Autum 207 Lecturer: Madhur Tulsiai Lecture 7: October 8, 207 Biary hypothesis testig I this lecture, we apply the tools developed i the past few lectures to uderstad the problem
More informationLet us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.
Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,
More informationarxiv: v1 [math.pr] 13 Oct 2011
A tail iequality for quadratic forms of subgaussia radom vectors Daiel Hsu, Sham M. Kakade,, ad Tog Zhag 3 arxiv:0.84v math.pr] 3 Oct 0 Microsoft Research New Eglad Departmet of Statistics, Wharto School,
More informationLecture 27. Capacity of additive Gaussian noise channel and the sphere packing bound
Lecture 7 Ageda for the lecture Gaussia chael with average power costraits Capacity of additive Gaussia oise chael ad the sphere packig boud 7. Additive Gaussia oise chael Up to this poit, we have bee
More information6. Kalman filter implementation for linear algebraic equations. Karhunen-Loeve decomposition
6. Kalma filter implemetatio for liear algebraic equatios. Karhue-Loeve decompositio 6.1. Solvable liear algebraic systems. Probabilistic iterpretatio. Let A be a quadratic matrix (ot obligatory osigular.
More informationDimension-free PAC-Bayesian bounds for the estimation of the mean of a random vector
Dimesio-free PAC-Bayesia bouds for the estimatio of the mea of a radom vector Olivier Catoi CREST CNRS UMR 9194 Uiversité Paris Saclay olivier.catoi@esae.fr Ilaria Giulii Laboratoire de Probabilités et
More informationLecture 12: September 27
36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.
More informationGeneralized Semi- Markov Processes (GSMP)
Geeralized Semi- Markov Processes (GSMP) Summary Some Defiitios Markov ad Semi-Markov Processes The Poisso Process Properties of the Poisso Process Iterarrival times Memoryless property ad the residual
More informationLinear Support Vector Machines
Liear Support Vector Machies David S. Roseberg The Support Vector Machie For a liear support vector machie (SVM), we use the hypothesis space of affie fuctios F = { f(x) = w T x + b w R d, b R } ad evaluate
More informationHomework Set #3 - Solutions
EE 15 - Applicatios of Covex Optimizatio i Sigal Processig ad Commuicatios Dr. Adre Tkaceko JPL Third Term 11-1 Homework Set #3 - Solutios 1. a) Note that x is closer to x tha to x l i the Euclidea orm
More informationRandomized Algorithms I, Spring 2018, Department of Computer Science, University of Helsinki Homework 1: Solutions (Discussed January 25, 2018)
Radomized Algorithms I, Sprig 08, Departmet of Computer Sciece, Uiversity of Helsiki Homework : Solutios Discussed Jauary 5, 08). Exercise.: Cosider the followig balls-ad-bi game. We start with oe black
More informationMath 155 (Lecture 3)
Math 55 (Lecture 3) September 8, I this lecture, we ll cosider the aswer to oe of the most basic coutig problems i combiatorics Questio How may ways are there to choose a -elemet subset of the set {,,,
More informationAlgebra of Least Squares
October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal
More informationNotes 27 : Brownian motion: path properties
Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X
More informationExponential Families and Bayesian Inference
Computer Visio Expoetial Families ad Bayesia Iferece Lecture Expoetial Families A expoetial family of distributios is a d-parameter family f(x; havig the followig form: f(x; = h(xe g(t T (x B(, (. where
More informationAdvanced Stochastic Processes.
Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.
More informationTechnical Proofs for Homogeneity Pursuit
Techical Proofs for Homogeeity Pursuit bstract This is the supplemetal material for the article Homogeeity Pursuit, submitted for publicatio i Joural of the merica Statistical ssociatio. B Proofs B. Proof
More informationLecture 14: Graph Entropy
15-859: Iformatio Theory ad Applicatios i TCS Sprig 2013 Lecture 14: Graph Etropy March 19, 2013 Lecturer: Mahdi Cheraghchi Scribe: Euiwoog Lee 1 Recap Bergma s boud o the permaet Shearer s Lemma Number
More information18.657: Mathematics of Machine Learning
8.657: Mathematics of Machie Learig Lecturer: Philippe Rigollet Lecture 4 Scribe: Cheg Mao Sep., 05 I this lecture, we cotiue to discuss the effect of oise o the rate of the excess risk E(h) = R(h) R(h
More informationAPPENDIX A SMO ALGORITHM
AENDIX A SMO ALGORITHM Sequetial Miimal Optimizatio SMO) is a simple algorithm that ca quickly solve the SVM Q problem without ay extra matrix storage ad without usig time-cosumig umerical Q optimizatio
More informationThe log-behavior of n p(n) and n p(n)/n
Ramauja J. 44 017, 81-99 The log-behavior of p ad p/ William Y.C. Che 1 ad Ke Y. Zheg 1 Ceter for Applied Mathematics Tiaji Uiversity Tiaji 0007, P. R. Chia Ceter for Combiatorics, LPMC Nakai Uivercity
More information4. Partial Sums and the Central Limit Theorem
1 of 10 7/16/2009 6:05 AM Virtual Laboratories > 6. Radom Samples > 1 2 3 4 5 6 7 4. Partial Sums ad the Cetral Limit Theorem The cetral limit theorem ad the law of large umbers are the two fudametal theorems
More informationSupplement to Graph Sparsification Approaches for Laplacian Smoothing
Supplemet to Graph Sparsificatio Approaches for Laplacia Smoothig Veerajaeyulu Sadhaala Yu-Xiag Wag Rya J. Tibshirai Machie Learig Departmet Caregie Mello Uiversity Pittsburgh PA 53 Departmet of Statistics
More informationLecture 2: Concentration Bounds
CSE 52: Desig ad Aalysis of Algorithms I Sprig 206 Lecture 2: Cocetratio Bouds Lecturer: Shaya Oveis Ghara March 30th Scribe: Syuzaa Sargsya Disclaimer: These otes have ot bee subjected to the usual scrutiy
More informationb i u x i U a i j u x i u x j
M ath 5 2 7 Fall 2 0 0 9 L ecture 1 9 N ov. 1 6, 2 0 0 9 ) S ecod- Order Elliptic Equatios: Weak S olutios 1. Defiitios. I this ad the followig two lectures we will study the boudary value problem Here
More informationOnline hypergraph matching: hiring teams of secretaries
Olie hypergraph matchig: hirig teams of secretaries Rafael M. Frogillo Advisor: Robert Kleiberg May 29, 2008 Itroductio The goal of this paper is to fid a competitive algorithm for the followig problem.
More informationMath 451: Euclidean and Non-Euclidean Geometry MWF 3pm, Gasson 204 Homework 3 Solutions
Math 451: Euclidea ad No-Euclidea Geometry MWF 3pm, Gasso 204 Homework 3 Solutios Exercises from 1.4 ad 1.5 of the otes: 4.3, 4.10, 4.12, 4.14, 4.15, 5.3, 5.4, 5.5 Exercise 4.3. Explai why Hp, q) = {x
More informationMath 216A Notes, Week 5
Math 6A Notes, Week 5 Scribe: Ayastassia Sebolt Disclaimer: These otes are ot early as polished (ad quite possibly ot early as correct) as a published paper. Please use them at your ow risk.. Thresholds
More informationEE 4TM4: Digital Communications II Information Measures
EE 4TM4: Digital Commuicatios II Iformatio Measures Defiitio : The etropy H(X) of a discrete radom variable X is defied by We also write H(p) for the above quatity. Lemma : H(X) 0. H(X) = x X Proof: 0
More informationLecture 6: Coupon Collector s problem
Radomized Algorithms Lecture 6: Coupo Collector s problem Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Radomized Algorithms - Lecture 6 1 / 16 Variace: key features
More informationAsymptotic distribution of products of sums of independent random variables
Proc. Idia Acad. Sci. Math. Sci. Vol. 3, No., May 03, pp. 83 9. c Idia Academy of Scieces Asymptotic distributio of products of sums of idepedet radom variables YANLING WANG, SUXIA YAO ad HONGXIA DU ollege
More informationChapter 5. Inequalities. 5.1 The Markov and Chebyshev inequalities
Chapter 5 Iequalities 5.1 The Markov ad Chebyshev iequalities As you have probably see o today s frot page: every perso i the upper teth percetile ears at least 1 times more tha the average salary. I other
More informationThe random version of Dvoretzky s theorem in l n
The radom versio of Dvoretzky s theorem i l Gideo Schechtma Abstract We show that with high probability a sectio of the l ball of dimesio k cε log c > 0 a uiversal costat) is ε close to a multiple of the
More informationEntropy Rates and Asymptotic Equipartition
Chapter 29 Etropy Rates ad Asymptotic Equipartitio Sectio 29. itroduces the etropy rate the asymptotic etropy per time-step of a stochastic process ad shows that it is well-defied; ad similarly for iformatio,
More informationJournal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula
Joural of Multivariate Aalysis 102 (2011) 1315 1319 Cotets lists available at ScieceDirect Joural of Multivariate Aalysis joural homepage: www.elsevier.com/locate/jmva Superefficiet estimatio of the margials
More informationMathematics 170B Selected HW Solutions.
Mathematics 17B Selected HW Solutios. F 4. Suppose X is B(,p). (a)fidthemometgeeratigfuctiom (s)of(x p)/ p(1 p). Write q = 1 p. The MGF of X is (pe s + q), sice X ca be writte as the sum of idepedet Beroulli
More informationLecture 19. sup y 1,..., yn B d n
STAT 06A: Polyomials of adom Variables Lecture date: Nov Lecture 19 Grothedieck s Iequality Scribe: Be Hough The scribes are based o a guest lecture by ya O Doell. I this lecture we prove Grothedieck s
More informationDefinition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.
4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad
More informationECE534, Spring 2018: Final Exam
ECE534, Srig 2018: Fial Exam Problem 1 Let X N (0, 1) ad Y N (0, 1) be ideedet radom variables. variables V = X + Y ad W = X 2Y. Defie the radom (a) Are V, W joitly Gaussia? Justify your aswer. (b) Comute
More informationSymmetric Two-User Gaussian Interference Channel with Common Messages
Symmetric Two-User Gaussia Iterferece Chael with Commo Messages Qua Geg CSL ad Dept. of ECE UIUC, IL 680 Email: geg5@illiois.edu Tie Liu Dept. of Electrical ad Computer Egieerig Texas A&M Uiversity, TX
More informationApplication to Random Graphs
A Applicatio to Radom Graphs Brachig processes have a umber of iterestig ad importat applicatios. We shall cosider oe of the most famous of them, the Erdős-Réyi radom graph theory. 1 Defiitio A.1. Let
More informationLearning Theory: Lecture Notes
Learig Theory: Lecture Notes Kamalika Chaudhuri October 4, 0 Cocetratio of Averages Cocetratio of measure is very useful i showig bouds o the errors of machie-learig algorithms. We will begi with a basic
More informationChapter 3. Strong convergence. 3.1 Definition of almost sure convergence
Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i
More informationGeometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT
OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca
More informationMath 61CM - Solutions to homework 3
Math 6CM - Solutios to homework 3 Cédric De Groote October 2 th, 208 Problem : Let F be a field, m 0 a fixed oegative iteger ad let V = {a 0 + a x + + a m x m a 0,, a m F} be the vector space cosistig
More informationECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization
ECE 90 Lecture 4: Maximum Likelihood Estimatio ad Complexity Regularizatio R Nowak 5/7/009 Review : Maximum Likelihood Estimatio We have iid observatios draw from a ukow distributio Y i iid p θ, i,, where
More informationWeek 10. f2 j=2 2 j k ; j; k 2 Zg is an orthonormal basis for L 2 (R). This function is called mother wavelet, which can be often constructed
Wee 0 A Itroductio to Wavelet regressio. De itio: Wavelet is a fuctio such that f j= j ; j; Zg is a orthoormal basis for L (R). This fuctio is called mother wavelet, which ca be ofte costructed from father
More informationEstimation for Complete Data
Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of
More informationECE 6980 An Algorithmic and Information-Theoretic Toolbox for Massive Data
ECE 6980 A Algorithmic ad Iformatio-Theoretic Toolbo for Massive Data Istructor: Jayadev Acharya Lecture # Scribe: Huayu Zhag 8th August, 017 1 Recap X =, ε is a accuracy parameter, ad δ is a error parameter.
More informationAda Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities
CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We
More informationMachine Learning Brett Bernstein
Machie Learig Brett Berstei Week 2 Lecture: Cocept Check Exercises Starred problems are optioal. Excess Risk Decompositio 1. Let X = Y = {1, 2,..., 10}, A = {1,..., 10, 11} ad suppose the data distributio
More informationDiagonal approximations by martingales
Alea 7, 257 276 200 Diagoal approximatios by martigales Jaa Klicarová ad Dalibor Volý Faculty of Ecoomics, Uiversity of South Bohemia, Studetsa 3, 370 05, Cese Budejovice, Czech Republic E-mail address:
More informationChapter 6 Principles of Data Reduction
Chapter 6 for BST 695: Special Topics i Statistical Theory. Kui Zhag, 0 Chapter 6 Priciples of Data Reductio Sectio 6. Itroductio Goal: To summarize or reduce the data X, X,, X to get iformatio about a
More information1 Convergence in Probability and the Weak Law of Large Numbers
36-752 Advaced Probability Overview Sprig 2018 8. Covergece Cocepts: i Probability, i L p ad Almost Surely Istructor: Alessadro Rialdo Associated readig: Sec 2.4, 2.5, ad 4.11 of Ash ad Doléas-Dade; Sec
More informationLecture 2: April 3, 2013
TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 2: April 3, 203 Scribe: Shubhedu Trivedi Coi tosses cotiued We retur to the coi tossig example from the last lecture agai: Example. Give,
More informationDisjoint Systems. Abstract
Disjoit Systems Noga Alo ad Bey Sudaov Departmet of Mathematics Raymod ad Beverly Sacler Faculty of Exact Scieces Tel Aviv Uiversity, Tel Aviv, Israel Abstract A disjoit system of type (,,, ) is a collectio
More informationCS284A: Representations and Algorithms in Molecular Biology
CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by
More informationLecture 5: April 17, 2013
TTIC/CMSC 350 Mathematical Toolkit Sprig 203 Madhur Tulsiai Lecture 5: April 7, 203 Scribe: Somaye Hashemifar Cheroff bouds recap We recall the Cheroff/Hoeffdig bouds we derived i the last lecture idepedet
More informationLecture 6 Simple alternatives and the Neyman-Pearson lemma
STATS 00: Itroductio to Statistical Iferece Autum 06 Lecture 6 Simple alteratives ad the Neyma-Pearso lemma Last lecture, we discussed a umber of ways to costruct test statistics for testig a simple ull
More informationLecture 19: Convergence
Lecture 19: Covergece Asymptotic approach I statistical aalysis or iferece, a key to the success of fidig a good procedure is beig able to fid some momets ad/or distributios of various statistics. I may
More informationSieve Estimators: Consistency and Rates of Convergence
EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes
More informationSolution to Chapter 2 Analytical Exercises
Nov. 25, 23, Revised Dec. 27, 23 Hayashi Ecoometrics Solutio to Chapter 2 Aalytical Exercises. For ay ε >, So, plim z =. O the other had, which meas that lim E(z =. 2. As show i the hit, Prob( z > ε =
More informationDetailed proofs of Propositions 3.1 and 3.2
Detailed proofs of Propositios 3. ad 3. Proof of Propositio 3. NB: itegratio sets are geerally omitted for itegrals defied over a uit hypercube [0, s with ay s d. We first give four lemmas. The proof of
More informationLaw of the sum of Bernoulli random variables
Law of the sum of Beroulli radom variables Nicolas Chevallier Uiversité de Haute Alsace, 4, rue des frères Lumière 68093 Mulhouse icolas.chevallier@uha.fr December 006 Abstract Let be the set of all possible
More informationLarge holes in quasi-random graphs
Large holes i quasi-radom graphs Joaa Polcy Departmet of Discrete Mathematics Adam Mickiewicz Uiversity Pozań, Polad joaska@amuedupl Submitted: Nov 23, 2006; Accepted: Apr 10, 2008; Published: Apr 18,
More informationSlide Set 13 Linear Model with Endogenous Regressors and the GMM estimator
Slide Set 13 Liear Model with Edogeous Regressors ad the GMM estimator Pietro Coretto pcoretto@uisa.it Ecoometrics Master i Ecoomics ad Fiace (MEF) Uiversità degli Studi di Napoli Federico II Versio: Friday
More information