Collision-based Testers are Optimal for Uniformity and Closeness
|
|
- Jewel Marsh
- 5 years ago
- Views:
Transcription
1 Electronic Colloquiu on Coputational Coplexity, Report No. 178 (016) Collision-based Testers are Optial for Unifority and Closeness Ilias Diakonikolas Theis Gouleakis John Peebles Eric Price USC MIT MIT UT Austin Noveber 10, 016 Abstract We study the fundaental probles of (i) unifority testing of a discrete distribution, and (ii) closeness testing between two discrete distributions with bounded l -nor. These probles have been extensively studied in distribution testing and saple-optial estiators are known for the [Pan08, CDVV14, VV14, DKN15b]. In this work, we show that the original collision-based testers proposed for these probles [GR00, BFR + 00] are saple-optial, up to constant factors. Previous analyses showed saple coplexity upper bounds for these testers that are optial as a function of the doain size n, but suboptial by polynoial factors in the error paraeter ɛ. Our ain contribution is a new tight analysis establishing that these collision-based testers are inforation-theoretically optial, up to constant factors, both in the dependence on n and in the dependence on ɛ. 1 Introduction 1.1 Background and Our Results The generic inference proble in distribution property testing [BFR + 00, BFR + 13] (also see, e.g., [Rub1, Can15, Gol16b]) is the following: given saple access to one or ore unknown distributions, deterine whether they satisfy soe global property or are far fro satisfying the property. During the past couple of decades, distribution testing whose roots lie in statistical hypothesis testing [NP33, LR05] has developed into a ature field. One of the ost fundaental tasks in this field is deciding whether an unknown discrete distribution is approxiately unifor on its doain, known as the proble of unifority testing. Forally, we want to design an algorith that, given independent saples fro a discrete distribution p over [n] and a paraeter ɛ > 0, distinguishes (with high probability) the case that p is unifor fro the case that p is ɛ-far fro unifor, i.e., the total variation distance between p and the unifor distribution over [n] is at least ɛ. Unifority testing was the very first proble considered in this line of work: Goldreich and Ron [GR00], otivated by the question of testing the expansion of graphs, proposed a siple and natural unifority tester that relies on the collision probability of the unknown distribution. The collision probability of a discrete distribution p is the probability that two saples drawn according to p are equal. The key intuition here is that the unifor distribution has the iniu collision probability aong all distributions on the sae doain, and that any distribution that is ɛ-far Part of this research was perfored when the author was at the University of Edinburgh, and while visiting MIT. Supported in part by a Marie Curie Career Integration grant and an EPSRC grant. This aterial is based upon work supported by the NSF under Grant No This aterial is based upon work supported by the NSF Graduate Research Fellowship under Grant No , and by the NSF under Grant No ISSN
2 fro unifor has noticeably larger collision probability. Foralizing this intuition, Goldreich and Ron [GR00] showed that the collision-based unifority tester succeeds after drawing O(n 1/ /ɛ 4 ) saples fro the unknown distribution. An inforation-theoretic lower bound of Ω(n 1/ ) on the nuber of saples required by any unifority tester follows fro a siple birthday-paradox arguent [GR00, BFF + 01], even for constant values of the paraeter ɛ. In subsequent work, Paninski [Pan08] showed an inforation-theoretic lower bound of Ω(n 1/ /ɛ ), and also provided a atching upper bound of O(n 1/ /ɛ ) that holds under the assuption that ɛ = Ω(n 1/4 ) 1. This lower bound assuption on ɛ is not inherent: As shown in a nuber of recent works [VV14, DKN15b] (see also [ADJ + 1, CDVV14]), a variant of Pearson s χ -tester can test unifority with O(n 1/ /ɛ ) saples for all values of n, ɛ > 0. The chi-squared type testers of [CDVV14, VV14] are siple, but are also arguably slightly less natural than the original collision-based unifority tester [GR00]. Perhaps surprisingly, prior to this work, the saple coplexity of the collision unifority tester was not fully understood. In particular, it was not known whether the saple upper bound of O(n 1/ /ɛ 4 ) established in [GR00] is tight for this tester, or there exists an iproved analysis that can give a better upper bound. As our first ain contribution (Theore 1), we provide a new analysis of the collision unifority tester establishing a tight O(n 1/ /ɛ ) upper bound on its saple coplexity. That is, we show that the originally proposed unifority tester is in fact saple-optial, up to constant factors. A related testing proble of central iportance in the field is the following: Given saples fro two unknown distributions p, q over [n] with the proise that ax{ p, q } b, distinguish between the cases that p q ɛ/ and p q ɛ. That is, we want to test the closeness between two unknown distributions with sall l -nor. (We reark here that the assuption that both p and q have sall l -nor is critical in this context.) The seinal work of Batu et al. [BFR + 00] gave a collision-based tester for this proble that uses O(b /ɛ 4 +b 1/ /ɛ ) saples. Subsequent work by Chan, Diakonikolas, Valiant, and Valiant [CDVV14] gave a different chi-squared type tester that uses O(b 1/ /ɛ ); this saple bound was shown [CDVV14, VV14] to be optial, up to constant factors. Siilarly to the case of unifority testing, prior to this work, it was not known whether the analysis of the collision-based closeness tester in [BFR + 00] is tight. As our second contribution, we show (Theore 8) that (essentially) the collision-based tester of [BFR + 00] succeeds with O(b 1/ /ɛ ) saples, i.e., it is saple-optial, up to constants, for the corresponding proble. Reark. Unifority testing has been a useful algorithic priitive for several other distribution testing probles as well [BFF + 01, DDS + 13, DKN15b, DKN15a, CDGR16, Gol16a]. Notably, Goldreich [Gol16a] recently showed that the ore general proble of testing the identity of any explicitly given distribution can be reduced to unifority testing with only a constant factor loss in saple coplexity. The proble of l closeness testing for distributions with sall l nor has been identified as an iportant algorithic priitive since the original work of Batu et al. [BFR + 00] who exploited it to obtain the first l 1 closeness tester. Recently, Diakonikolas and Kane [DK16] gave a collection of reductions fro various distribution testing probles to the above l closeness testing proble. The approach of [DK16] shows that one can obtain saple-optial testers for a range of different properties of distributions by applying an optial tester for the above proble as a black-box. 1 The unifority tester of [Pan08] relies on the nuber of unique eleents, i.e., the eleents that appear in the saple set exactly once. Such a tester is only eaningful in the regie that the total nuber of saples is saller than the doain size.
3 1. Overview of Analysis We now provide a brief suary of previous analyses and a coparison with our work. The canonical way to construct and analyze distribution property testers roughly works as follows: Given independent saples s 1,..., s fro our distribution(s), we consider an appropriate rando variable (statistic) F (s 1,..., s ). If F (s 1,..., s ) exceeds an appropriately defined threshold T, our tester rejects; otherwise, it accepts. The canonical analysis proceeds by bounding the expectation and variance of F for the case that the distribution(s) satisfy the property (copleteness), and the case they are ɛ-far fro satisfying the property (soundness), followed by an application of Chebyshev s inequality. The ain difficulty is choosing the statistic F appropriately so that the expectations for the copleteness and soundness cases are sufficiently separated after a sall nuber of saples, and at the sae tie the variance of the statistic is not too large. Typically, the challenging step in the analysis is bounding fro above the variance of F in the soundness case. Our analysis follows this standard fraework. Roughly speaking, for both probles we consider, we provide a tighter analysis of the variance of the corresponding estiators, that in turn leads to the optial saple coplexity upper bound. More specifically, for the case of unifority testing, the arguent of [GR00] proceeds by showing that the collision tester yields a (1+γ)-ultiplicative approxiation of the l -nor of the unknown distribution with O(n 1/ /γ ) saples. Setting γ = ɛ gives a unifority testing under the l 1 distance that uses O(n 1/ /ɛ 4 ) saples. We note that the quadratic dependence on 1/γ in the ultiplicative approxiation of the l nor is tight in general. (For an easy exaple, consider the case that our distribution is either unifor over two eleents, or assigns probability ass 1/ γ, 1/ + γ to the eleents.) Roughly speaking, we show that we can do better when the l nor of the distribution in question is sall. More specifically, the collision unifority tester can distinguish between the case that p (1 + γ/)/n and p (1 + γ)/n with O(n1/ /γ) saples. This iediately yields the desired l 1 guarantee. For the closeness testing proble (under our bounded l nor assuption), Batu et al. [BFR + 00] construct a statistic whose expectation is proportional to the square of the l distance between the two distributions p and q. This statistic has three ters whose expectations are proportional to p, q, and p q respectively. Specifically, the first ter is obtained by considering the nuber of self-collisions of a set of saples fro p. Siilarly, the second ter is proportional to the nuber of self-collisions of a set of saples fro q. The third ter is obtained by considering the nuber of cross-collisions between soe saples fro p and q. In order to siplify the analysis, [BFR + 00] uses a separate set of fresh saples for the cross-collisions ter. This set is independent of the set of saples used for the two self-collisions ters. While this choice akes the analysis cleaner, it ends up increasing the variance of the estiator too uch leading to a sub-optial saple upper bound. We show that by reusing saples to calculate the nuber of cross-collisions, one achieves sufficiently good variance to get optial saple coplexity. This coes at the cost of a ore coplicated analysis involving a very careful calculation of the variance. 1.3 Notation We write [n] to denote the set {1,..., n}. We consider discrete distributions over [n], which are functions p : [n] [0, 1] such that n i=1 p i = 1. We use the notation p i to denote the probability of eleent i in distribution p. We will denote by U n the unifor distribution over [n]. For r 1, the l r nor of a distribution is identified with the l r nor of the corresponding vector, i.e., p r = ( n i=1 p i r ) 1/r. The l 1 (resp. l ) distance between distributions p and q is defined as the the l 1 (resp. l ) nor of the vector of their difference, i.e., p q 1 = n i=1 p i q i and p q = n i=1 (p i q i ). 3
4 Testing Unifority via Collisions In this section, we show that the natural collision unifority tester proposed in [GR00] is sapleoptial up to constant factors. More specifically, we are given saples fro a probability distribution p over [n], and we wish to distinguish (with high constant probability) between the cases that p is unifor versus ɛ-far fro unifor in l 1 -distance. The ain result of this section is that the collision-based unifority tester succeeds in this task with = O(n 1/ /ɛ ) saples. In fact, we prove the following stronger l -guarantee for the collisions tester: With = O(n 1/ /ɛ ) saples, it distinguishes between the cases that p U n ɛ /(n) (copleteness) versus p U n ɛ /n (soundness). The desired l 1 guarantee follows fro this l guarantee by an application of the Cauchy-Schwarz inequality in the soundness case. Forally, we analyze the following tester: Algorith Test-Unifority-Collisions(p, n, ɛ) Input: saple access to a distribution p over [n], and ɛ > 0. Output: YES if p U n ɛ /(n); NO if p U n ɛ /n. 1. Draw iid saples fro p.. Let σ ij be an indicator variable which is 1 if saples i and j are the sae and 0 otherwise. 3. Define the rando variable s = i<j σ ij and the threshold t = ( ) 1+3ɛ /4 4. If s t return NO ; otherwise, return YES. n The following theore characterizes the perforance of the above estiator: Theore 1. The above estiator, when given saples drawn fro a distribution p over [n] will, with probability at least 3/4, distinguish the case that p U n ɛ /(n) fro the case that p U n ɛ /n provided that 300n 1/ /ɛ. The rest of this section is devoted to the proof of Theore 1. Note that the condition of the theore is equivalent to testing whether p 1+ɛ / n versus p 1+ɛ n. Our tester takes = 300n1/ saples fro p and distinguishes between the two cases with probability at least 3/4. ɛ.1 Analysis of Test-Unifority-Collisions The analysis proceeds by bounding the expectation and variance of the estiator for the copleteness and soundness cases, and applying Chebyshev s inequality. The novelty here is a tight analysis of the variance which leads to the optial saple bound. We start by recalling the following siple closed forula for the expected value: Lea. We have that E[s] = ( ) p. Proof. For any i, j, the probability that saples i and j are equal is p. By this and linearity of expectation, we get E[s] = E ij σ ij = ij E[σ ij ] = ij p = ( ) p. 4
5 Thus, we see that in the copleteness case the expected value is at ost ( ) 1+ɛ / n. In the soundness case, the expected value is at least ( ) 1+ɛ n. This otivates our choice of the threshold t halfway between these expected values. In order to argue that the statistic will be close to its expected value, we bound its variance fro above and use Chebyshev s inequality. We bound the variance in two steps. First, we obtain the following bound: Lea 3. We have that Var[s] p + 3 ( p 3 3 p 4 ). Proof. The lea follows fro the following chain of (in-)equalities: Var[s] = E[s ] E[s] = E σ ij σ kl i<j = E k<l i<j; k<l all distinct ( )( = ( = σ ij σ kl + ) p 4 + ( ) p 4 i<j<l σ ij σ jl + σ ij σ kj + i<j i,k<j i k ( ) p ( 3 ) p ) ( p p 4 ) + ( 1)( ) ( p 3 3 p 4 ) p + 3 ( p 3 3 p 4 ). σ ij ( ( ) p ) p 4 ( ) p 4 Reark. We note that the upper bound of the previous lea is tight, up to constant factors. The 3 p 4 ter is critical for getting the optial dependence on ɛ in the saple bound. Continuing the analysis, we now derive an upper bound on the nuber of saples that suffices for the tester to have the desired success probability of 3/4. Lea 4. Let α satisfy p = 1+α n and σ be the standard deviation of s. The nuber of saples required by Test-Unifority-Collisions is at ost 5σn α 3ɛ /4, in order to get error probability at ost 1/4. Proof. By Chebyshev s inequality, we have that [ ( ) ] Pr s p kσ 1 k where σ Var[s]. 5
6 We want s to be closer to its expected value than the threshold is to its expected value because when this occurs, the tester outputs the right answer. Furtherore, to achieve our desired probability of error of at ost 1/4, we want this to happen with probability at least 3/4. So, we set k =, and then we want ( ) ( ) kσ E[s] t = p (1 + 3ɛ /4)/n = α 3ɛ /4 /n It suffices for the nuber of saples to satisfy the slightly stronger condition that So, it suffices to have σ α 3ɛ /4. 5n 5σn α 3ɛ /4. We ight as well take the sallest nuber of saples for which the tester works, which iplies the desired inequality. We are now ready to show an upper bound on the nuber of saples in the copleteness case, i.e., when p is the unifor distribution. Lea 5. In the copleteness case, the required nuber of saples is at ost in order to get error probability 1/4. 6n1/ ɛ, Proof. It is easy to see that p = 1/n and p 3 3 = p 4 = 1/n. Thus, by Lea 3, σ /n 1/. Also, we know α = 0 when p is unifor. Substituting these two facts into Lea 4 and solving for gives 6n1/ ɛ. We now turn to the soundness case, where p is far fro unifor. By Lea 4, it suffices to bound fro above the variance σ. We proceed by a case analysis based on whether the ter p or 3 ( p 3 3 p 4 ) contributes ore to the variance..1.1 Case when p is Larger Lea 6. Consider the soundness case, where p = (1 + α)/n for α ɛ. If p contributes ore to the variance, i.e., if p 3 ( p 3 3 p 4 ), then the required nuber of saples is at ost 48n1/ ɛ in order to get error probability 1/4. 6
7 Proof. Suppose that p 3 ( p 3 3 p 4 ). Then σ p = (1+α)/n. Substituting this into Lea 4 and solving for gives that the necessary nuber of saples is at ost 1 + α 8n 1/ (α 3ɛ /4). Using calculus to axiize this expression by varying α, one gets that α = ɛ axiizes the expression. Thus, 1 + ɛ 3n 1/ ɛ 3n 1/ ɛ < 48n1/ ɛ..1. Case when 3 ( p 3 3 p 4 ) is Larger Lea 7. Consider the soundness case, where p = (1 + α)/n for α ɛ. If 3 ( p 3 3 p 4 ) contributes ore to the variance, i.e., if 3 ( p 3 3 p 4 ) p, then the required nuber of saples is at ost in order to get error probability 1/4. 300n1/ ɛ Proof. Suppose that 3 ( p 3 3 p 4 ) p. Then σ 3 ( p 3 3 p 4 ). Substituting this into Lea 4 and solving for gives that the necessary nuber of saples is at ost 50n p 3 3 p 4 (α 3ɛ /4). Let us paraeterize p as p i = 1/n + a i for soe vector a. Then we have a = α/n, and we can write 50n p 3 3 p 4 (α 3ɛ /4) 50n p 3 3 p 4 (α/4) (since ɛ α) 50n p n (α/4) = 50n = 50n = 50n = 50n 50n = 50n [ n i=1 (1/n + a i) 3] 1 n (α/4) [ n n n i=1 a i + 3 n n i=1 a i + n i=1 a3 i (α/4) 3 n n i=1 a i + 3 n 3 n (α/4) n i=1 a i + n i=1 a3 i (α/4) 3 n a + a 3 3 (α/4) 50n n i=1 a i + n i=1 a3 i 3 n a + a 3 (α/4) 3 n (α/n) + (α/n)3/ (α/4) = 400 α + 800n1/ α 400 ɛ + 800n1/ ɛ 300n1/ ɛ. ] 1 n (since n a i = 0) i=1 (since ɛ α) 7
8 Note that, as entioned earlier, if we had ignored the p 4 ter, we would have had an Ω(1/ɛ 4 ) ter in our bound, which would have given us the wrong dependence on ɛ. Theore 1 now follows as an iediate consequence of these last three leas. Reark. It is worth noting that the collisions statistic analyzed in this section is very siilar to the chi-squared-like unifority tester in [DKN15b] itself a siplification of siilar testers in [CDVV14, VV14] which also achieves the optial saple coplexity of O(n 1/ /ɛ ). Specifically, if X i denotes the nuber of ties we see the i-th doain eleent in the saple, the [DKN15b] statistic is i (X i /n) X i = i<j σ ij n i X i + n. We note that the [DKN15b] analysis uses Poissonization; i.e., instead of drawing saples fro the distribution, we draw Poi() saples. Without Poissonization, the aforeentioned statistic siplifies to s n, where s is the collisions statistic. While the non-poissonized versions of the two testers are equivalent, the Poissonized versions are not. Specifically, the Poissonized version of the [DKN15b] unifority tester has sufficiently good variance to yield the saple-optial bound. On the other hand, the Poissonized version of the collisions statistic does not have good variance: Specifically, its variance does not have the p 4 ter which as noted earlier is necessary to get the optial ɛ dependence. 3 Testing Closeness via Collisions Given saples fro two unknown distributions p, q over [n] with the proise that ax{ p, q } b, we want to distinguish between the cases that p q ɛ/ versus p q ɛ. We show that a natural collisions-based tester succeeds in this task with O(b 1/ /ɛ ) saples. The estiator we analyze is a slight variant of the l tester in [BFR + 00], described in pseudocode below. We define the nuber of self-collisions in a sequence of saples fro a distribution as i<j σ ij, where σ ij is the indicator variable denoting whether saples i and j are the sae. Siilarly, we define the nuber of cross-collisions between two sequences of saples as i,j l ij, where l ij is the indicator variable denoting whether saple i fro the first sequence is the sae as saple j fro the second sequence. Algorith Test-Closeness-Collisions(p, q, n, b, ɛ) Input: saple access to distribution p, q over [n], ɛ, b > 0. Output: YES if p q ɛ/; NO if p q ɛ. 1. Draw two ultisets S p, S q of iid saples fro p, q. Let C 1 denote the nuber of self-collisions of S p, C denote the nuber of self-collisions of S q, and C 3 denote the nuber of cross-collisions between S p and S q.. Define the rando variable Z = C 1 + C 1 3. If Z t return NO ; otherwise, return YES. C 3 and the threshold t = ( ) ɛ /. The following theore characterizes the perforance of the above estiator: Theore 8. There exists an absolute constant c such that the above estiator, when given saples drawn fro each of two distributions, p, q over [n] will, with probability at least 3/4, distinguish the case p q ɛ/ fro the case that p q ɛ provided that c b1/, where b is an ɛ upper bound on p, q. 8
9 3.1 Analysis of Test-Closeness-Collisions Let X i, Y i be the nuber of ties we see the eleent i in each set of saples S p and S q, respectively. The above rando variables are distributed as follows: X i Bin(, p i ), Y i Bin(, q i ). Note that the statistic Z can be written as Z = 1 n i=1 [ (Xi Y i ) X i Y i ] + 1 n i=1 [X i (X i 1) + Y i (Y i 1)] = 1 A + 1 B, where A = n i=1 [ (Xi Y i ) X i Y i ] and B = n i=1 [X i(x i 1) + Y i (Y i 1)]. Note that { } ( 1) 1 Var[Z] 4 ax 4 Var[A], 4 Var[B] Note that B essentially corresponds to the nuber of collisions within two disjoint sets of saples, hence we already have an upper bound on its variance. The bulk of the analysis goes into bounding fro above the variance of A = n i=1 A i = n [ i=1 (Xi Y i ) ] X i Y i. Reark. The l collision-based tester we analyze here is closely related to the l -tester of [CDVV14]. Specifically, the A ter in the expression for Z has the sae forula as the l -tester of [CDVV14]. However, a key difference is that the statistic of [CDVV14] is Poissonized, which is crucial for its analysis. We now proceed to analyze the collision-based closeness tester. We start with a siple forula for its expectation: Lea 9. For the expectation of the statistic Z in the closeness tester, we have: ( ) E[Z] = p q. (1) Proof. Viewing p and q as vectors, we have ( E[Z] = E[C 1 + C 1 C 3] = ) (p p) + ( ) (q q) 1 (p q) =. ( ) p q. For the variance, we show the following upper bound: Lea 10. For the variance of the statistic Z in the closeness tester, we have: Var[Z] 116 b p q 4b 1/. To prove this lea, we will use the following proposition, whose proof is deferred to the following subsection. Proposition 11. We have that Var[A] 100 b i (p i q i )(p i q i ). Proof of Lea 10. Recall that by Lea 3 we have Var[B] 4 ( p + q ) ( p 3 3 p 4 + q 3 3 q 4 ). 9
10 Cobined with Proposition 11, we obtain: { } ( 1) 1 Var[Z] 4 ax 4 Var[A], 4 Var[B] ax{100 b (p i q i )(p i qi ), i 4( p + q ) + 4( p 3 3 p 4 + q 3 3 q 4 )}. The second ter in the ax stateent is at ost 16b. Thus, we have Var[Z] 116( 1) b + 8( 1) i (p i q i )(p i q i ) 116 b i (p i q i ) (p i + q i ) 116 b i (p i q i ) 4 i (p i + q i ) (by the Cauchy-Schwarz inequality) 116 b p q 4b 1/ (since i (p i + q i ) 4b). 3. Proof of Theore 8 By Lea 10, we have that Var[Z] 116 b p q 4b 1/ 116 b p q b 1/. We wish to show we can distinguish the copleteness case (i.e., p q ɛ/) fro the soundness case (i.e., p q ɛ). Set α = p q. Then we are proised that either α ɛ or α ɛ /4. Recall we chose t = ( )ɛ and that Lea 9 says that E[Z] = ( ) α. Since E[Z copleteness case] t E[Z soundness case], the only way we fail to distinguish the copleteness and soundness cases is if Z deviates fro its expectation additively by at least ( ) ɛ ( ) ( ) t E[Z] = α ax{α, ɛ }/4, where the last inequality follows by the proise on α in the copleteness and soundness cases. By Chebyshev s inequality, the probability this happens is at ost ( ) Pr[ Z E[Z] ax{α, ɛ }/4 ] Var[Z] [t E[Z]] 116 b αb ) 1/ ax{α, ɛ }/4] 3768 b ɛ 4 + [ ( 4096 b1/ 3768 b 4096 b1/ ɛ 4 + ɛ, { 1 in α, α } ɛ 4 where we siplified using the assuption that. Thus, if we set = O( b1/ ), we get a ɛ constant probability of error in both cases as desired. In the copleteness case where α ε /4 and E[Z] = ( ) ( α, Z has to deviate by at least ) ε /4 ( ) α to ɛ cross the threshold t = ( ) ε /. In the soundness case where α ε, Z has to deviate by at least ( ) α/ ɛ/ to cross the threshold t. 10
11 3.3 Proof of Proposition 11 Recall that A = n i=1 A i = n [ i=1 (Xi Y i ) ] X i Y i, hence Var(A) = n i=1 Var(A i) + i j Cov(A i, A j ). We proceed to bound fro above the individual variances and covariances via a sequence of eleentary but quite tedious calculations Bounding Var(A i ): Since we can write: A i = (X i Y i ) X i Y i = X i + Y i X i Y i X i Y i, Var(A i ) = Var(X i ) + Var(Y i ) + 4Var(X i Y i ) + Var(X i ) + Var(Y i ) + [ Cov(X i, X i Y i ) Cov(X i, X i ) Cov(Y i, X i Y i ) Cov(Y i, Y i ) + Cov(X i Y i, X i ) + Cov(X i Y i, Y i )]. We proceed to calculate the individual quantities: (a) Cov(Xi, X i ) = Cov([σ r = σ s = i], [σ t = i]) r,s,t [] = Cov([σ r = i], [σ r = i]) + r [] r,s [], r s Cov([σ r = σ s = i], [σ r = i]) = p i (1 p i ) + ( )(E[[σ r = σ s = i] [σ r = i]] p i p i ) = p i (1 p i ) + ( )(p i p 3 i ) = p i (1 p i )[1 + p i ( 1)] = p i (1 p i )[1 p i + p i ] = p i (1 p i )(1 p i ) + p i (1 p i ). (b) Cov(X i, X i Y i ) = E[X 3 i Y i ] E[X i ] E[X i Y i ] = Cov(X i, X i ) E[Y i ] = p i q i (1 p i )(1 p i )+ 3 p i q i (1 p i ). (c) Cov(X i, X i Y i ) = Var(X i ) E[Y i ] = p i (1 p i )q i. (d) Var(X i ) =E[X 4 i ] (E[X i ]) =p i (1 7p i + 7p i + 1p i 18p i + 6 p i 6p 3 i + 11p 3 i 6 p 3 i + 3 p 3 i ) (p i p i + p i ) =p i 7p i + 7 p i + 1p 3 i 18 p 3 i p 3 i 6p 4 i + 11 p 4 i 6 3 p 4 i + 4 p 4 i ( p i + p 4 i + 4 p 4 i p 3 i + 3 p 3 i 3 p 4 i ) =p i 7p i + 6 p i + 1p 3 i 16 p 3 i p 3 i 6p 4 i + 10 p 4 i 4 3 p 4 i =p i 7p i + 6 p i + 1p 3 i 6p 4 i 16 p 3 i p 3 i + 10 p 4 i 4 3 p 4 i. 11
12 (e) Var(X i Y i ) = E[X i Y i ] (E[X i Y i ]) = E[X i ]E[Y i ] (E[X i ]E[Y i ]) = (p i p i + p i ) (q i q i + q i ) 4 p i q i = p i q i + p i q i (p i q i + p i q i ) + 3 (p i q i + p i q i ) 3 p i q i. So, we get: Var(A i ) = p i 7p i + 6 p i + 1p 3 i 6p 4 i 16 p 3 i p 3 i + 10 p 4 i 4 3 p 4 i + q i 7q i + 6 q i + 6q 3 i 16 q 3 i q 3 i + 10 q 4 i 4 3 q 4 i + 4( (p i q i + p i q i p i q i p i q i ) + 3 (p i q i + p i q i ) 3 p i q i ) + p i (1 p i ) + q i (1 q i ) 4( p i (1 p i )(1 p i ) + 3 p i (1 p i ))q i (p i (1 p i )(1 p i ) + 4 p i (1 p i )) 4( q i (1 q i )(1 q i ) + 3 q i (1 q i ))p i q i (1 q i )(1 q i ) 4 q i (1 q i ) + 4 p i (1 p i )q i + 4 q i (1 q i )p i = [p i 7p i + 1p 3 i 6p 4 i + q i 7q i + 1q 3 i 6q 4 i + p i p i + q i q i p i (1 p i )(1 p i ) q i (1 q i )(1 q i )] + [ 4p i q i (1 p i )(1 p i ) 4p i q i (1 q i )(1 q i ) + 4p i q i ( p i q i ) + 6p i 16p 3 i + 10p 4 i + 6q i 16q 3 i + 10q 4 i + 4p i q i (1 + p i q i p i q i ) 4p i + 4p 3 i 4q i + 4q 3 i ] + 3 [4p 3 i 4p 4 i + 4q 3 i 4q 4 i + 4p i q i (p i + q i ) 8p i q i 8p 1q i 8p i q i + 8p 3 1q i + 8p i q 3 i ] = [ p i + 8p 3 i 6p 4 i q i + 8q 3 i 6q 4 i ] + [(p i + q i ) 1p 3 i + 10p 4 i + 4p i q i 8p 3 i q i + 4p i q i + 4p i q i 1q 3 i 8p i q 3 i + 10q 4 i )] (p i q i ) [p i (1 p i ) + q i (1 q i )] 8(p 3 i + q 3 i ) + 1 (p i + q i ) (p i q i ) (p i + q i ) 0 (p i + q i ) (p i q i ) (p i + q i ). 3.4 Bounding the Covariances It suffices to show that the covariances of A i and A j, for i j, are appropriately bounded fro above. Let i j. Note that if σ r is the result of saple k, we have: Cov(X i, X j ) = Cov([σ r = i], [σ u = j]) = Cov([σ r = i], [σ r = j]) = p i p j. r,u [] r [] Siilarly, Cov(Xi, X j ) = Cov([σ r = σ s = i], [σ t = j]) r,s,t [] = Cov([σ r = i], [σ r = j]) + r [] r,s [], r s = p i p j ( 1)p i p j. Cov([σ r = σ s = i], [σ r = j]) 1
13 And, Siilarly, Cov(X i, X j ) = = 4 r,s,t,u [] unique r,s,u [] Cov([σ r = σ s = i], [σ t = σ u = j]) unique r,s [] unique r,s [] unique r,t [] r [] Cov([σ r = σ s = i], [σ r = σ u = j]) Cov([σ r = σ s = i], [σ r = σ s = j]) Cov([σ r = σ s = i], [σ r = j]) Cov([σ r = i], [σ r = σ t = j]) Cov([σ r = i], [σ r = j]) = p i p j ( 1)(p i p j + p i p j + p i p j) 4( 1)( )p i p j = p i p j ( 1)(p i p j + p i p j) ( 1)( 3)p i p j. Cov(X i Y i, X j Y j ) = E[X i Y i X j Y j ] E[X i Y i ]E[X j Y j ] Also, = E[X i X j ]E[Y i Y j ] E[X i ]E[Y i ]E[X j ]E[Y j ] = (Cov(X i, X j ) + E[X i ]E[X j ]) (Cov(Y i, Y j ) + E[Y i ]E[Y j ]) E[X i ]E[X j ]E[Y i ]E[Y j ] = ( 3 )p i p j q i q j. Cov(X i Y i, X j ) = E[X i Y i X j ] E[X i Y i ]E[X j ] = E[X i X j ]E[Y i ] E[X i ]E[Y i ]E[X j ] = (Cov(X i, X j ) + E[X i ]E[X j ]) E[Y i ] E[X i ]E[X j ]E[Y i ] = Cov(X i, X j )E[Y i ]. Siilar equations hold if we swap i and j and/or we swap X and Y. Because covariance is bilinear, this gives us all the inforation we need in order to exactly copute Cov(A i, A j ). In particular, by setting W i = X i Y i, we have: Cov(A i, A j ) = Cov(W i X i Y i, W j X j Y j ) For the suands we have: (a) = Cov(X i, X j ) + Cov(Y i, Y j ) + Cov(X i, Y j ) + Cov(X j, Y i ) Cov(Wi, X j ) Cov(Wi, Y j ) Cov(Wj, X i ) Cov(Wj, Y i ) + Cov(Wi, Wj ). Cov(W i, X j ) = Cov((X i Y i ), X j ) = Cov(X i, X j ) Cov(X i Y i, X j ) = p i p j ( 1)p i p j + p i p j q i = p i p j (1 p i ) + p i p j (q i p i ). 13
14 (b) Cov(W i, Y j ) = q i q j (1 q i ) + q i q j (p i q i ). (c) Cov(W j, X i ) = p i p j (1 p j ) + p i p j (q j p j ). (d) Cov(W j, Y i ) = q i q j (1 q j ) + q i q j (p j q j ). (e) Cov(W i, W j ) = Cov(X i, X j ) + Cov(Y i, Y j ) + 4 Cov(X i Y i, X j Y j ) By substituting, we get: Cov(Xi, X j Y j ) Cov(Xj, X i Y i ) Cov(Yi, X j Y j ) Cov(Yj, X i Y i ) = Cov(Xi, Xj ) + Cov(Yi, Yj ) + 4 Cov(X i Y i, X j Y j ) Cov(X i, X j )E[Y j ] Cov(X j, X i )E[Y i ] Cov(Y i, Y j )E[X j ] Cov(Y j, Y i )E[X i ] = p i p j ( 1)(p i p j + p i p j) ( 1)( 3)p i p j q i q j ( 1)(q i q j + q i q j ) ( 1)( 3)q i q j + 4( 3 )p i p j q i q j + p i p j q j + 4 ( 1)p i p j q j + p i p j q i + 4 ( 1)p jp i q i + q i q j p j + 4 ( 1)q i q j p j + q i q j p i + 4 ( 1)q j q i p i. Cov(A i, A j ) = Cov(X i, X j ) + Cov(Y i, Y j ) Cov(W i, X j ) Cov(W i, Y j ) Cov(W j, X i ) Cov(W j, Y i ) + Cov(W i, W j ) = (p i p j + q i q j ) + p i p j (1 p i ) p i p j (q i p i ) + q i q j (1 q i ) q i q j (p i q i ) + p i p j (1 p j ) p i p j (q j p j ) + q i q j (1 q j ) q i q j (p j q j ) + Cov(W i, W j ) = [p i p j (q i + q j ) + q i q j (p i + p j )] + [p i p j (q i + q j ) + q i q j (p i + p j )] ( 1)( 3)p i p j ( 1)( 3)q i q j + 4( 3 )p i p j q i q j + 4 ( 1)(p i q j + p j q i )(p i p j + q i q j ) = 6(p i p j + q i q j ) + [10(p i p j + q i q j ) + 4p i p j q i q j 4(p i q j + p j q i )(p i p j + q i q j )] 3 [4(p i p j + q i q j ) + 8p i p j q i q j 4(p i q j + p j q i )(p i p j + q i q j )] = 6(p i p j + q i q j ) + [5(p i p j + q i q j ) + p i p j q i q j (p i q j + p j q i )(p i p j + q i q j )] 4 3 [(p i p j + q i q j ) (p i q j + p j q i )(p i p j + q i q j )]. 14
15 In suary, Cov(A i, A j ) = 6(p i p j + q i q j ) + [(5p i p j + 5q i q j ) 6p i p j q i q j p i q i (p j q j ) p j q j (p i q i ) ] 4 3 (p i q i )(p j q j )(p i p j + q i q j ). The total contribution of the covariances to the variance for all i j is i j Cov(A i, A j ). We consider the coefficients on each of the powers of separately. We have: [ 3 ] i j Cov(A i, A j ) = 4 i j (p i q i )(p j q j )(p i p j + q i q j ) = 4 i 4 i = 4 i 4 i (p i q i ) (p i + qi ) 4 (p i q i )(p j q j )(p i p j + q i q j ) i,j (p i q i ) (p i + q i ) 4 (p i q i )(p j q j )(p i p j + q i q j ) i,j (p i q i ) (p i + q i ) 4(p q) (pp + qq )(p q) (p i q i ) (p i + q i ). Also, [] i j Cov(A i, A j ) 0. Finally, we have [ ] i j Cov(A i, A j ) = i j [(5p i p j + 5q i q j ) 6p i p j q i q j p i q i (p j q j ) p j q j (p i q i ) ] 10 i j (p i p j + q i q j ) 10 i,j 3.5 Copleting the Proof (p i p j + q i q j ) = 10[p (pp )p + q (qq )q] = 10[(p p)(p p) + (q q)(q q)] = 10 p q 4 0b 0b. Var[A] = n Cov(A i, A j ) Var[A i ] + i=1 i j n ( ) 80 pi + q i (p i q i ) (p i + q i ) i=1 + 0 b i (p i q i ) (p i + q i ) 100 b i (p i q i )(p i q i ). 15
16 References [ADJ + 1] J. Acharya, H. Das, A. Jafarpour, A. Orlitsky, S. Pan, and A. Suresh. Copetitive classification and closeness testing. In COLT, 01. [BFF + 01] [BFR + 00] [BFR + 13] T. Batu, E. Fischer, L. Fortnow, R. Kuar, R. Rubinfeld, and P. White. Testing rando variables for independence and identity. In Proc. 4nd IEEE Syposiu on Foundations of Coputer Science, pages , 001. T. Batu, L. Fortnow, R. Rubinfeld, W. D. Sith, and P. White. Testing that distributions are close. In IEEE Syposiu on Foundations of Coputer Science, pages 59 69, 000. T. Batu, L. Fortnow, R. Rubinfeld, W. D. Sith, and P. White. Testing closeness of discrete distributions. J. ACM, 60(1):4, 013. [Can15] C. L. Canonne. A survey on distribution testing: Your data is big. but is it blue? Electronic Colloquiu on Coputational Coplexity (ECCC), :63, 015. [CDGR16] C. L. Canonne, I. Diakonikolas, T. Gouleakis, and R. Rubinfeld. Testing shape restrictions of discrete distributions. In 33rd Syposiu on Theoretical Aspects of Coputer Science, STACS, pages 5:1 5:14, 016. [CDVV14] S. Chan, I. Diakonikolas, P. Valiant, and G. Valiant. Optial algoriths for testing closeness of discrete distributions. In SODA, pages , 014. [DDS + 13] C. Daskalakis, I. Diakonikolas, R. Servedio, G. Valiant, and P. Valiant. Testing k-odal distributions: Optial algoriths via reductions. In SODA, pages , 013. [DK16] I. Diakonikolas and D. M. Kane. A new approach for testing properties of discrete distributions. CoRR, abs/ , 016. In FOCS 16. [DKN15a] I. Diakonikolas, D. M. Kane, and V. Nikishkin. Optial algoriths and lower bounds for testing closeness of structured distributions. In 56th Annual IEEE Syposiu on Foundations of Coputer Science, FOCS 015, 015. [DKN15b] I. Diakonikolas, D. M. Kane, and V. Nikishkin. Testing Identity of Structured Distributions. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Syposiu on Discrete Algoriths, SODA 015, 015. [Gol16a] O. Goldreich. The unifor distribution is coplete with respect to testing identity to a fixed distribution. Electronic Colloquiu on Coputational Coplexity (ECCC), 3:15, 016. [Gol16b] O. Goldreich. Lecture Notes on Property Testing of Distributions. Available at oded/pdf/pt-dist.pdf, March, 016. [GR00] O. Goldreich and D. Ron. On testing expansion in bounded-degree graphs. Electronic Colloqiu on Coputational Coplexity, 7(0), 000. [LR05] E. L. Lehann and J. P. Roano. Testing statistical hypotheses. Springer Texts in Statistics. Springer,
17 [NP33] [Pan08] J. Neyan and E. S. Pearson. On the proble of the ost efficient tests of statistical hypotheses. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Matheatical or Physical Character, 31( ):89 337, L. Paninski. A coincidence-based test for unifority given very sparsely-sapled discrete data. IEEE Transactions on Inforation Theory, 54: , 008. [Rub1] R. Rubinfeld. Taing big probability distributions. XRDS, 19(1):4 8, 01. [VV14] G. Valiant and P. Valiant. An autoatic inequality prover and instance optial identity testing. In FOCS, ECCC ISSN
On the Inapproximability of Vertex Cover on k-partite k-uniform Hypergraphs
On the Inapproxiability of Vertex Cover on k-partite k-unifor Hypergraphs Venkatesan Guruswai and Rishi Saket Coputer Science Departent Carnegie Mellon University Pittsburgh, PA 1513. Abstract. Coputing
More information13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices
CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay
More informationTesting Properties of Collections of Distributions
Testing Properties of Collections of Distributions Reut Levi Dana Ron Ronitt Rubinfeld April 9, 0 Abstract We propose a fraework for studying property testing of collections of distributions, where the
More informationE0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis
E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds
More informationTesting Distributions Candidacy Talk
Testing Distributions Candidacy Talk Clément Canonne Columbia University 2015 Introduction Introduction Property testing: what can we say about an object while barely looking at it? P 3 / 20 Introduction
More informationApproximating and Testing k-histogram Distributions in Sub-linear Time
Electronic Colloquiu on Coputational Coplexity, Revision 1 of Report No. 171 (011) Approxiating and Testing k-histogra Distributions in Sub-linear Tie Piotr Indyk Reut Levi Ronitt Rubinfeld July 31, 014
More informationApproximating and Testing k-histogram Distributions in Sub-linear time
Electronic Colloquiu on Coputational Coplexity, Report No. 171 011 Approxiating and Testing k-histogra Distributions in Sub-linear tie Piotr Indyk Reut Levi Ronitt Rubinfeld Noveber 9, 011 Abstract A discrete
More informationLecture 21. Interior Point Methods Setup and Algorithm
Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and
More information1 Distribution Property Testing
ECE 6980 An Algorithmic and Information-Theoretic Toolbox for Massive Data Instructor: Jayadev Acharya Lecture #10-11 Scribe: JA 27, 29 September, 2016 Please send errors to acharya@cornell.edu 1 Distribution
More informationA note on the multiplication of sparse matrices
Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani
More informationLecture October 23. Scribes: Ruixin Qiang and Alana Shine
CSCI699: Topics in Learning and Gae Theory Lecture October 23 Lecturer: Ilias Scribes: Ruixin Qiang and Alana Shine Today s topic is auction with saples. 1 Introduction to auctions Definition 1. In a single
More informationarxiv: v1 [cs.ds] 3 Feb 2014
arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/
More informationTight Information-Theoretic Lower Bounds for Welfare Maximization in Combinatorial Auctions
Tight Inforation-Theoretic Lower Bounds for Welfare Maxiization in Cobinatorial Auctions Vahab Mirrokni Jan Vondrák Theory Group, Microsoft Dept of Matheatics Research Princeton University Redond, WA 9805
More informationFeature Extraction Techniques
Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that
More informationThe Weierstrass Approximation Theorem
36 The Weierstrass Approxiation Theore Recall that the fundaental idea underlying the construction of the real nubers is approxiation by the sipler rational nubers. Firstly, nubers are often deterined
More informationBipartite subgraphs and the smallest eigenvalue
Bipartite subgraphs and the sallest eigenvalue Noga Alon Benny Sudaov Abstract Two results dealing with the relation between the sallest eigenvalue of a graph and its bipartite subgraphs are obtained.
More information3.8 Three Types of Convergence
3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to
More informationEstimating Entropy and Entropy Norm on Data Streams
Estiating Entropy and Entropy Nor on Data Streas Ait Chakrabarti 1, Khanh Do Ba 1, and S. Muthukrishnan 2 1 Departent of Coputer Science, Dartouth College, Hanover, NH 03755, USA 2 Departent of Coputer
More informationBlock designs and statistics
Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 11 10/15/2008 ABSTRACT INTEGRATION I
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 11 10/15/2008 ABSTRACT INTEGRATION I Contents 1. Preliinaries 2. The ain result 3. The Rieann integral 4. The integral of a nonnegative
More informationNon-Parametric Non-Line-of-Sight Identification 1
Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,
More informationA Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness
A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,
More informationComputable Shell Decomposition Bounds
Coputable Shell Decoposition Bounds John Langford TTI-Chicago jcl@cs.cu.edu David McAllester TTI-Chicago dac@autoreason.co Editor: Leslie Pack Kaelbling and David Cohn Abstract Haussler, Kearns, Seung
More informationStatistics and Probability Letters
Statistics and Probability Letters 79 2009 223 233 Contents lists available at ScienceDirect Statistics and Probability Letters journal hoepage: www.elsevier.co/locate/stapro A CLT for a one-diensional
More informationImproved Guarantees for Agnostic Learning of Disjunctions
Iproved Guarantees for Agnostic Learning of Disjunctions Pranjal Awasthi Carnegie Mellon University pawasthi@cs.cu.edu Avri Blu Carnegie Mellon University avri@cs.cu.edu Or Sheffet Carnegie Mellon University
More informationA Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science
A Better Algorith For an Ancient Scheduling Proble David R. Karger Steven J. Phillips Eric Torng Departent of Coputer Science Stanford University Stanford, CA 9435-4 Abstract One of the oldest and siplest
More informationarxiv: v1 [cs.ds] 17 Mar 2016
Tight Bounds for Single-Pass Streaing Coplexity of the Set Cover Proble Sepehr Assadi Sanjeev Khanna Yang Li Abstract arxiv:1603.05715v1 [cs.ds] 17 Mar 2016 We resolve the space coplexity of single-pass
More informationA Simple Regression Problem
A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where
More information1 Generalization bounds based on Rademacher complexity
COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #0 Scribe: Suqi Liu March 07, 08 Last tie we started proving this very general result about how quickly the epirical average converges
More informationLearnability and Stability in the General Learning Setting
Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu
More informationComputable Shell Decomposition Bounds
Journal of Machine Learning Research 5 (2004) 529-547 Subitted 1/03; Revised 8/03; Published 5/04 Coputable Shell Decoposition Bounds John Langford David McAllester Toyota Technology Institute at Chicago
More information3.3 Variational Characterization of Singular Values
3.3. Variational Characterization of Singular Values 61 3.3 Variational Characterization of Singular Values Since the singular values are square roots of the eigenvalues of the Heritian atrices A A and
More informationList Scheduling and LPT Oliver Braun (09/05/2017)
List Scheduling and LPT Oliver Braun (09/05/207) We investigate the classical scheduling proble P ax where a set of n independent jobs has to be processed on 2 parallel and identical processors (achines)
More information1 Proof of learning bounds
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #4 Scribe: Akshay Mittal February 13, 2013 1 Proof of learning bounds For intuition of the following theore, suppose there exists a
More informationQuantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search
Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths
More informationResearch Article On the Isolated Vertices and Connectivity in Random Intersection Graphs
International Cobinatorics Volue 2011, Article ID 872703, 9 pages doi:10.1155/2011/872703 Research Article On the Isolated Vertices and Connectivity in Rando Intersection Graphs Yilun Shang Institute for
More informatione-companion ONLY AVAILABLE IN ELECTRONIC FORM
OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer
More informationComputational and Statistical Learning Theory
Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher
More informationBest Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon Mohaad Ghavazadeh Alessandro Lazaric INRIA Lille - Nord Europe, Tea SequeL {victor.gabillon,ohaad.ghavazadeh,alessandro.lazaric}@inria.fr
More informationA Bernstein-Markov Theorem for Normed Spaces
A Bernstein-Markov Theore for Nored Spaces Lawrence A. Harris Departent of Matheatics, University of Kentucky Lexington, Kentucky 40506-0027 Abstract Let X and Y be real nored linear spaces and let φ :
More informationLecture 20 November 7, 2013
CS 229r: Algoriths for Big Data Fall 2013 Prof. Jelani Nelson Lecture 20 Noveber 7, 2013 Scribe: Yun Willia Yu 1 Introduction Today we re going to go through the analysis of atrix copletion. First though,
More informationOn Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40
On Poset Merging Peter Chen Guoli Ding Steve Seiden Abstract We consider the follow poset erging proble: Let X and Y be two subsets of a partially ordered set S. Given coplete inforation about the ordering
More informationSharp Time Data Tradeoffs for Linear Inverse Problems
Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used
More information1 Rademacher Complexity Bounds
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #10 Scribe: Max Goer March 07, 2013 1 Radeacher Coplexity Bounds Recall the following theore fro last lecture: Theore 1. With probability
More informationThe Frequent Paucity of Trivial Strings
The Frequent Paucity of Trivial Strings Jack H. Lutz Departent of Coputer Science Iowa State University Aes, IA 50011, USA lutz@cs.iastate.edu Abstract A 1976 theore of Chaitin can be used to show that
More informationHomework 3 Solutions CSE 101 Summer 2017
Hoework 3 Solutions CSE 0 Suer 207. Scheduling algoriths The following n = 2 jobs with given processing ties have to be scheduled on = 3 parallel and identical processors with the objective of iniizing
More informationThe degree of a typical vertex in generalized random intersection graph models
Discrete Matheatics 306 006 15 165 www.elsevier.co/locate/disc The degree of a typical vertex in generalized rando intersection graph odels Jerzy Jaworski a, Michał Karoński a, Dudley Stark b a Departent
More informationarxiv: v2 [math.co] 3 Dec 2008
arxiv:0805.2814v2 [ath.co] 3 Dec 2008 Connectivity of the Unifor Rando Intersection Graph Sion R. Blacburn and Stefanie Gere Departent of Matheatics Royal Holloway, University of London Egha, Surrey TW20
More informationarxiv: v1 [cs.ds] 29 Jan 2012
A parallel approxiation algorith for ixed packing covering seidefinite progras arxiv:1201.6090v1 [cs.ds] 29 Jan 2012 Rahul Jain National U. Singapore January 28, 2012 Abstract Penghui Yao National U. Singapore
More informationTHE AVERAGE NORM OF POLYNOMIALS OF FIXED HEIGHT
THE AVERAGE NORM OF POLYNOMIALS OF FIXED HEIGHT PETER BORWEIN AND KWOK-KWONG STEPHEN CHOI Abstract. Let n be any integer and ( n ) X F n : a i z i : a i, ± i be the set of all polynoials of height and
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October
More informationA Note on Online Scheduling for Jobs with Arbitrary Release Times
A Note on Online Scheduling for Jobs with Arbitrary Release Ties Jihuan Ding, and Guochuan Zhang College of Operations Research and Manageent Science, Qufu Noral University, Rizhao 7686, China dingjihuan@hotail.co
More informationFast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials
Fast Montgoery-like Square Root Coputation over GF( ) for All Trinoials Yin Li a, Yu Zhang a, a Departent of Coputer Science and Technology, Xinyang Noral University, Henan, P.R.China Abstract This letter
More informationUniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval
Unifor Approxiation and Bernstein Polynoials with Coefficients in the Unit Interval Weiang Qian and Marc D. Riedel Electrical and Coputer Engineering, University of Minnesota 200 Union St. S.E. Minneapolis,
More informationProbability Distributions
Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples
More informationUnderstanding Machine Learning Solution Manual
Understanding Machine Learning Solution Manual Written by Alon Gonen Edited by Dana Rubinstein Noveber 17, 2014 2 Gentle Start 1. Given S = ((x i, y i )), define the ultivariate polynoial p S (x) = i []:y
More informationThis article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and
This article appeared in a ournal published by Elsevier. The attached copy is furnished to the author for internal non-coercial research and education use, including for instruction at the authors institution
More informationIn this chapter, we consider several graph-theoretic and probabilistic models
THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions
More informationDetection and Estimation Theory
ESE 54 Detection and Estiation Theory Joseph A. O Sullivan Sauel C. Sachs Professor Electronic Systes and Signals Research Laboratory Electrical and Systes Engineering Washington University 11 Urbauer
More informationComputational and Statistical Learning Theory
Coputational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 2: PAC Learning and VC Theory I Fro Adversarial Online to Statistical Three reasons to ove fro worst-case deterinistic
More information. The univariate situation. It is well-known for a long tie that denoinators of Pade approxiants can be considered as orthogonal polynoials with respe
PROPERTIES OF MULTIVARIATE HOMOGENEOUS ORTHOGONAL POLYNOMIALS Brahi Benouahane y Annie Cuyt? Keywords Abstract It is well-known that the denoinators of Pade approxiants can be considered as orthogonal
More information16 Independence Definitions Potential Pitfall Alternative Formulation. mcs-ftl 2010/9/8 0:40 page 431 #437
cs-ftl 010/9/8 0:40 page 431 #437 16 Independence 16.1 efinitions Suppose that we flip two fair coins siultaneously on opposite sides of a roo. Intuitively, the way one coin lands does not affect the way
More informationESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics
ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents
More informationKeywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution
Testing approxiate norality of an estiator using the estiated MSE and bias with an application to the shape paraeter of the generalized Pareto distribution J. Martin van Zyl Abstract In this work the norality
More informationLower Bounds for Quantized Matrix Completion
Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &
More informationIN modern society that various systems have become more
Developent of Reliability Function in -Coponent Standby Redundant Syste with Priority Based on Maxiu Entropy Principle Ryosuke Hirata, Ikuo Arizono, Ryosuke Toohiro, Satoshi Oigawa, and Yasuhiko Takeoto
More informationOptimal Jamming Over Additive Noise: Vector Source-Channel Case
Fifty-first Annual Allerton Conference Allerton House, UIUC, Illinois, USA October 2-3, 2013 Optial Jaing Over Additive Noise: Vector Source-Channel Case Erah Akyol and Kenneth Rose Abstract This paper
More informationOn the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation
journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation
More informationFixed-to-Variable Length Distribution Matching
Fixed-to-Variable Length Distribution Matching Rana Ali Ajad and Georg Böcherer Institute for Counications Engineering Technische Universität München, Gerany Eail: raa2463@gail.co,georg.boecherer@tu.de
More informationCompression and Predictive Distributions for Large Alphabet i.i.d and Markov models
2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511
More informationBest Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence
Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon, Mohaad Ghavazadeh, Alessandro Lazaric To cite this version: Victor Gabillon, Mohaad Ghavazadeh, Alessandro
More informationNecessity of low effective dimension
Necessity of low effective diension Art B. Owen Stanford University October 2002, Orig: July 2002 Abstract Practitioners have long noticed that quasi-monte Carlo ethods work very well on functions that
More informationConvex Programming for Scheduling Unrelated Parallel Machines
Convex Prograing for Scheduling Unrelated Parallel Machines Yossi Azar Air Epstein Abstract We consider the classical proble of scheduling parallel unrelated achines. Each job is to be processed by exactly
More informationAlgorithms for parallel processor scheduling with distinct due windows and unit-time jobs
BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and
More informationDistance Optimal Target Assignment in Robotic Networks under Communication and Sensing Constraints
Distance Optial Target Assignent in Robotic Networks under Counication and Sensing Constraints Jingjin Yu CSAIL @ MIT/MechE @ BU Soon-Jo Chung Petros G. Voulgaris AE @ University of Illinois Supported
More informationM ath. Res. Lett. 15 (2008), no. 2, c International Press 2008 SUM-PRODUCT ESTIMATES VIA DIRECTED EXPANDERS. Van H. Vu. 1.
M ath. Res. Lett. 15 (2008), no. 2, 375 388 c International Press 2008 SUM-PRODUCT ESTIMATES VIA DIRECTED EXPANDERS Van H. Vu Abstract. Let F q be a finite field of order q and P be a polynoial in F q[x
More informationNew Bounds for Learning Intervals with Implications for Semi-Supervised Learning
JMLR: Workshop and Conference Proceedings vol (1) 1 15 New Bounds for Learning Intervals with Iplications for Sei-Supervised Learning David P. Helbold dph@soe.ucsc.edu Departent of Coputer Science, University
More informationEfficient Filter Banks And Interpolators
Efficient Filter Banks And Interpolators A. G. DEMPSTER AND N. P. MURPHY Departent of Electronic Systes University of Westinster 115 New Cavendish St, London W1M 8JS United Kingdo Abstract: - Graphical
More informationNew Slack-Monotonic Schedulability Analysis of Real-Time Tasks on Multiprocessors
New Slack-Monotonic Schedulability Analysis of Real-Tie Tasks on Multiprocessors Risat Mahud Pathan and Jan Jonsson Chalers University of Technology SE-41 96, Göteborg, Sweden {risat, janjo}@chalers.se
More informationA note on the realignment criterion
A note on the realignent criterion Chi-Kwong Li 1, Yiu-Tung Poon and Nung-Sing Sze 3 1 Departent of Matheatics, College of Willia & Mary, Williasburg, VA 3185, USA Departent of Matheatics, Iowa State University,
More informationModel Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon
Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE7C (Spring 018: Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee7c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee7c@berkeley.edu October 15,
More informationRobust polynomial regression up to the information theoretic limit
58th Annual IEEE Syposiu on Foundations of Coputer Science Robust polynoial regression up to the inforation theoretic liit Daniel Kane Departent of Coputer Science and Engineering / Departent of Matheatics
More informationOn Conditions for Linearity of Optimal Estimation
On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at
More information1 Identical Parallel Machines
FB3: Matheatik/Inforatik Dr. Syaantak Das Winter 2017/18 Optiizing under Uncertainty Lecture Notes 3: Scheduling to Miniize Makespan In any standard scheduling proble, we are given a set of jobs J = {j
More informationFairness via priority scheduling
Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation
More informationtime time δ jobs jobs
Approxiating Total Flow Tie on Parallel Machines Stefano Leonardi Danny Raz y Abstract We consider the proble of optiizing the total ow tie of a strea of jobs that are released over tie in a ultiprocessor
More informationCurious Bounds for Floor Function Sums
1 47 6 11 Journal of Integer Sequences, Vol. 1 (018), Article 18.1.8 Curious Bounds for Floor Function Sus Thotsaporn Thanatipanonda and Elaine Wong 1 Science Division Mahidol University International
More informationWeighted- 1 minimization with multiple weighting sets
Weighted- 1 iniization with ultiple weighting sets Hassan Mansour a,b and Özgür Yılaza a Matheatics Departent, University of British Colubia, Vancouver - BC, Canada; b Coputer Science Departent, University
More informationlecture 36: Linear Multistep Mehods: Zero Stability
95 lecture 36: Linear Multistep Mehods: Zero Stability 5.6 Linear ultistep ethods: zero stability Does consistency iply convergence for linear ultistep ethods? This is always the case for one-step ethods,
More informationDistributed Subgradient Methods for Multi-agent Optimization
1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions
More informationCharacterization of the Line Complexity of Cellular Automata Generated by Polynomial Transition Rules. Bertrand Stone
Characterization of the Line Coplexity of Cellular Autoata Generated by Polynoial Transition Rules Bertrand Stone Abstract Cellular autoata are discrete dynaical systes which consist of changing patterns
More informationIntelligent Systems: Reasoning and Recognition. Artificial Neural Networks
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial
More informationLost-Sales Problems with Stochastic Lead Times: Convexity Results for Base-Stock Policies
OPERATIONS RESEARCH Vol. 52, No. 5, Septeber October 2004, pp. 795 803 issn 0030-364X eissn 1526-5463 04 5205 0795 infors doi 10.1287/opre.1040.0130 2004 INFORMS TECHNICAL NOTE Lost-Sales Probles with
More informationClosed-form evaluations of Fibonacci Lucas reciprocal sums with three factors
Notes on Nuber Theory Discrete Matheatics Print ISSN 30-32 Online ISSN 2367-827 Vol. 23 207 No. 2 04 6 Closed-for evaluations of Fibonacci Lucas reciprocal sus with three factors Robert Frontczak Lesbank
More informationTesting equality of variances for multiple univariate normal populations
University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Inforation Sciences 0 esting equality of variances for ultiple univariate
More informationAsymptotics of weighted random sums
Asyptotics of weighted rando sus José Manuel Corcuera, David Nualart, Mark Podolskij arxiv:402.44v [ath.pr] 6 Feb 204 February 7, 204 Abstract In this paper we study the asyptotic behaviour of weighted
More informationAnalyzing Simulation Results
Analyzing Siulation Results Dr. John Mellor-Cruey Departent of Coputer Science Rice University johnc@cs.rice.edu COMP 528 Lecture 20 31 March 2005 Topics for Today Model verification Model validation Transient
More informationStatistical Logic Cell Delay Analysis Using a Current-based Model
Statistical Logic Cell Delay Analysis Using a Current-based Model Hanif Fatei Shahin Nazarian Massoud Pedra Dept. of EE-Systes, University of Southern California, Los Angeles, CA 90089 {fatei, shahin,
More informationTail estimates for norms of sums of log-concave random vectors
Tail estiates for nors of sus of log-concave rando vectors Rados law Adaczak Rafa l Lata la Alexander E. Litvak Alain Pajor Nicole Toczak-Jaegerann Abstract We establish new tail estiates for order statistics
More information