Testing Properties of Collections of Distributions


Testing Properties of Collections of Distributions

Reut Levi    Dana Ron    Ronitt Rubinfeld

April 9, 2011

Abstract

We propose a framework for studying property testing of collections of distributions, where the number of distributions in the collection is a parameter of the problem. Previous work on property testing of distributions considered single distributions or pairs of distributions. We suggest two models that differ in the way the algorithm is given access to samples from the distributions. In one model the algorithm may ask for a sample from any distribution of its choice, and in the other the choice of the distribution is random. Our main focus is on the basic problem of distinguishing between the case that all the distributions in the collection are the same (or very similar), and the case that it is necessary to modify the distributions in the collection in a non-negligible manner so as to obtain this property. We give almost tight upper and lower bounds for this testing problem, as well as study an extension to a clusterability property. One of our lower bounds directly implies a lower bound on testing independence of a joint distribution, a result which was left open by previous work.

School of Computer Science, Tel Aviv University. E-mail: reuti.levi@gmail.com. Research supported by the Israel Science Foundation grant nos. 47/09 and 46/08.
School of Electrical Engineering, Tel Aviv University. E-mail: danar@eng.tau.ac.il. Research supported by the Israel Science Foundation grant number 46/08.
CSAIL, MIT, Cambridge MA 02139, and the Blavatnik School of Computer Science, Tel Aviv University. E-mail: ronitt@csail.mit.edu. Research supported by NSF grants, the Marie Curie Reintegration grant PIRG03-GA, and the Israel Science Foundation grant nos. 47/09 and 675/09.

1 Introduction

In recent years, several works have investigated the problem of testing various properties of data that is most naturally thought of as samples of an unknown distribution. More specifically, the goal in testing a specific property is to distinguish the case that the samples come from a distribution that has the property from the case that the samples come from a distribution that is far (usually in terms of the ℓ_1 norm, but other norms have been studied as well) from any distribution that has the property. To give just a few examples, such tasks include testing whether a distribution is uniform [GR00, Pan08] or similar to another known distribution [BFR+00], and testing whether a joint distribution is independent [BFF+01]. Related tasks concern sublinear estimation of various measures of a distribution, such as its entropy [BDKR05, GMV09] or its support size [RRSS09]. Recently, general techniques have been designed to obtain nearly tight lower bounds on such testing and estimation problems [Val08a, Val08b]. These types of questions have arisen in several disparate areas, including physics [Ma81, SKSB98, NBS04], cryptography and pseudorandom number generation [Knu69], statistics [Csi67, Har75, WW95, Pan04, Pan08, Pan03], learning theory [Yam95], property testing of graphs and sequences (e.g., [GR00, CS07, KS08, NS07, RRRS07, FM08]), and streaming algorithms (e.g., [AMS99, FKSV99, FS00, GMV09, CMIM03, CK04, BYJK+02, IM08, BO10a, BO10b, BO08, IKOS09]). In these works, there has been significant focus on properties of distributions over very large domains, where standard statistical techniques based on learning an approximation of the distribution may be very inefficient.

In this work we consider the setting in which one receives data which is most naturally thought of as samples of several distributions, for example, when studying purchase patterns in several geographic locations, or the behavior of linguistic data among varied text sources.
Such data could also be generated when samples of the distributions come from various sensors that are each part of a large sensor-net. In these examples, it may be reasonable to assume that the number of such distributions might be quite large, even on the order of a thousand or more. However, for the most part, previous research has considered properties of at most two distributions [BFR+00, Val08a]. We propose new models of property testing that apply to properties of several distributions. We then consider the complexity of testing properties within these models, beginning with properties that we view as basic and expect to be useful in constructing building blocks for future work. We focus on quantifying the dependence of the sample complexities of the testing algorithms in terms of the number m of distributions that are being considered, as well as the size n of the domain of the distributions.

1.1 Our Contributions

1.1.1 The Models

We begin by proposing two models that describe possible access patterns to multiple distributions D_1, ..., D_m over the same domain [n]. In these models there is no explicit description of the distributions: the algorithm is only given access to the distributions via samples. In the first model, referred to as the sampling model, at each time step the algorithm receives a pair of the form (i, j), where i is selected uniformly in [m] and j ∈ [n] is distributed according to D_i. In the second model, referred to as the query model, at each time step the algorithm is allowed to specify i ∈ [m] and receives j that is distributed according to D_i. It is immediate that any algorithm in the sampling model can also be used in the query model. On the other hand, as is implied by our results, there are property testing problems which have a significantly larger sample complexity in the sampling model than in the query model. In both models the task is to distinguish between the case that the tested distributions have the property

and the case that they are ɛ-far from having the property, for a given distance parameter ɛ. Distance to the property is measured in terms of the average ℓ_1-distance between the tested distributions and the closest collection of distributions that have the property. In all of our results, the dependence of the algorithms on the distance parameter ɛ is inverse polynomial. Hence, for the sake of succinctness, in all that follows we do not mention this dependence explicitly. We note that the sampling model can be extended to allow the choice of the distribution (that is, the index i) to be non-uniform (i.e., be determined by a weight w_i), and the distance measure is adapted accordingly.

1.1.2 Testing Equivalence in the sampling model

One of the first properties of distributions studied in the property testing model is that of determining whether two distributions over domain [n] are identical (alternatively, very close) or far according to the ℓ_1-distance. In [BFR+00], an algorithm is given that uses Õ(n^{2/3}) samples and distinguishes between the case that the two distributions are ɛ-far and the case that they are O(ɛ/√n)-close. This algorithm has been shown to be nearly tight in terms of the dependence on n by Valiant [Val08b]. Valiant also shows that distinguishing between the case that the distributions are ɛ-far and the case that they are β-close, for two constants ɛ and β, requires an almost linear dependence on n. Our main focus is on a natural generalization, which we refer to as the equivalence property of distributions D_1, ..., D_m, in which the goal of the tester is to distinguish the case in which all distributions are the same (or, slightly more generally, that there is a distribution D* for which (1/m)·Σ_{i=1}^m ||D_i − D*||_1 ≤ poly(ɛ)/√n), from the case in which there is no distribution D* for which (1/m)·Σ_{i=1}^m ||D_i − D*||_1 ≤ ɛ.
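To make the two access models described above concrete, here is a minimal sketch in Python. The oracle names and interfaces are ours, not from the paper; distributions are represented as probability vectors over a domain {0, ..., n−1}.

```python
import random

def sampling_oracle(dists):
    """Uniform sampling model: return a pair (i, j) with i chosen uniformly
    among the m distributions and j drawn according to D_i."""
    i = random.randrange(len(dists))
    j = random.choices(range(len(dists[i])), weights=dists[i])[0]
    return i, j

def query_oracle(dists, i):
    """Query model: the tester chooses the index i and receives j ~ D_i."""
    return random.choices(range(len(dists[i])), weights=dists[i])[0]
```

Any sampling-model tester can be run with `query_oracle` by first picking i uniformly, which mirrors the observation that the query model is at least as strong as the sampling model.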
To solve this problem in the uniform sampling model with sample complexity Õ(m·n^{2/3}) (which ensures, with high probability, that each distribution is sampled Ω(n^{2/3}·log m) times), one can make m calls to the algorithm of [BFR+00] to check that every distribution is close to D_1.

OUR ALGORITHMS. We show that one can get a better sample complexity dependence on m. Specifically, we give two algorithms, one with sample complexity Õ(n^{2/3}·m^{1/3} + m) and the other with sample complexity Õ(n^{1/2}·m^{1/2} + n). The first result in fact holds for the case that for each sample pair (i, j), the distribution D_i which generated j is not necessarily selected uniformly, and furthermore, it is unknown according to what weight it is selected. The second result holds for the case where the selection is non-uniform, but the weights are known. Moreover, the second result extends to the case in which it is desired that the tester pass distributions that are close for each element, to within a multiplicative factor of (1 ± ɛ/c) for some constant c > 1, and for sufficiently large frequencies. Thus, starting from the known result for m = 2, as long as m ≤ n, the complexity grows as Õ(n^{2/3}·m^{1/3} + m) = Õ(n^{2/3}·m^{1/3}), and once m ≥ n, the complexity is Õ(n^{1/2}·m^{1/2} + n) = Õ(n^{1/2}·m^{1/2}), which is lower than the former expression when m ≥ n. Both of our algorithms build on the close relation between testing equivalence and testing independence of a joint distribution over [m] × [n], which was studied in [BFF+01]. The Õ(n^{2/3}·m^{1/3} + m) algorithm follows from [BFF+01] after we fill in a certain gap in the analysis of their algorithm due to an imprecision of a claim given in [BFR+00]. The Õ(n^{1/2}·m^{1/2} + n) algorithm exploits the fact that i is selected uniformly (or, more generally, according to a known weight w_i) to improve on the Õ(n^{2/3}·m^{1/3} + m) algorithm in the case that m ≥ n.

ALMOST MATCHING LOWER BOUNDS. We show that the behavior of the upper bound on the sample complexity of the problem is not just an artifact of our algorithms, but rather almost captures the complexity of the problem.
Namely, we give almost matching lower bounds of Ω(n^{2/3}·m^{1/3}) for n = Ω(m·log m), and Ω(n^{1/2}·m^{1/2}) for every n and m. The latter lower bound can be viewed as a generalization of a lower bound given in [BFR+00], but the analysis is somewhat more subtle. Our lower bound of Ω(n^{2/3}·m^{1/3}) consists of two parts. The first is a general theorem concerning testing symmetric properties of collections of distributions. This theorem extends a central lemma of Valiant [Val08b] on which he builds his lower bounds, and in particular the lower bound of Ω(n^{2/3}) for testing whether two distributions are identical or far from each other (i.e., the case of equivalence for m = 2). The second part is a construction of two collections of distributions to which the theorem is applied, where the construction is based on the one proposed in [BFF+01] for testing independence. As in [Val08b], the lower bound is shown by focusing on the similarity between the typical collision statistics of a family of collections of distributions that have the property and a family of collections of distributions that are far from having the property. However, since many more types of collisions are expected to occur in the case of collections of m distributions, our proof outline is more intricate and requires new ways of upper bounding the probabilities of certain types of events.

1.1.3 Testing Clusterability in the query model

The second property that we consider is a natural generalization of the equivalence property. Namely, we ask whether the distributions can be partitioned into at most k subsets (clusters), such that within each cluster the distance between every two distributions is very small. We study this property in the query model, and give an algorithm whose complexity does not depend on the number m of distributions and for which the dependence on n is Õ(n^{2/3}). The dependence on k is almost linear. The algorithm works by combining the diameter clustering algorithm of [ADPR03] for points in a general metric space (where the algorithm has access to the corresponding distance matrix) with the closeness-of-distributions tester of [BFR+00].
Note that the results of [Val08b] imply that this is tight (to within polylogarithmic factors in n).

1.1.4 Implications of our results

As noted previously, in the course of proving the lower bound of Ω(n^{2/3}·m^{1/3}) for the equivalence property, we prove a general theorem concerning testability of symmetric properties of collections of distributions, which extends a lemma in [Val08b]. This theorem may have applications to proving other lower bounds on collections of distributions. Further byproducts of our research regard the sample complexity of testing whether a joint distribution is independent. More precisely, the following question is considered in [BFF+01]: let Q be a distribution over pairs of elements drawn from [m] × [n] (without loss of generality, assume m ≥ n); what is the sample complexity, in terms of m and n, required to distinguish independent joint distributions from those that are far (in terms of ℓ_1 distance) from the nearest independent joint distribution? The lower bound claimed in [BFF+01] contains a known gap in the proof. Similar gaps in the lower bounds of [BFR+00] for testing the closeness of distributions and of [BDKR05] for estimating the entropy of a distribution were settled by the work of [Val08b], which applies to symmetric properties. Since independence is not a symmetric property, the work of [Val08b] cannot be directly applied here. In this work, we show that the lower bound of Ω(n^{2/3}·m^{1/3}) indeed holds. Furthermore, by the aforementioned correction of the upper bound of Õ(n^{2/3}·m^{1/3}) from [BFF+01], we get nearly tight bounds on the complexity of testing independence.

1.2 Other related work

Other works on testing and estimating properties of single or pairs of distributions include [Bat01, GMV09, BKR04, RS04, AAK+07, RX10, BNNR09, ACS10, AIOR09].

1.3 Open Problems and Further Research

There are several possible directions for further research on testing properties of collections of distributions, and we next give a few examples. One natural extension of our results is to give algorithms for testing the property of clusterability for k > 1 in the sampling model. One may also consider testing properties of collections of distributions that are defined by certain measures of distributions, and may be less sensitive to the exact form of the distributions. For example, a very basic measure is the mean (expected value) of the distribution, when we view the domain [n] as integers instead of element names, or when we consider other domains. Given this measure, we may consider testing whether the distributions all have similar means or whether they should be modified significantly so that this holds. It is not hard to verify that this property can be quite easily tested in the query model by selecting Θ(1/ɛ) distributions uniformly and estimating the mean of each. On the other hand, in the sampling model an Ω(√m) lower bound is quite immediate even for n = 2 and a constant ɛ. We are currently investigating whether the complexity of this problem in the sampling model is in fact higher, and it would be interesting to consider other measures as well.

1.4 Organization

We start by providing notation and definitions in Section 2. In Section 3 we give the lower bound of Ω(n^{2/3}·m^{1/3}) for testing equivalence in the uniform sampling model, which is the main technical contribution of this paper. In Section 4 we give our second lower bound of Ω(n^{1/2}·m^{1/2}) for testing equivalence, and our algorithms for the problem follow in Sections 5 and 6. We conclude with our algorithm for testing clusterability in the query model in Section 7.

2 Preliminaries

Let [n] := {1, ..., n}, and let D = (D_1, ..., D_m) be a list of m distributions, where D_i : [n] → [0, 1] and Σ_{j=1}^n D_i(j) = 1 for every i ∈ [m]. For a vector v = (v_1, ..., v_n) ∈ R^n, let ||v||_1 = Σ_{i=1}^n |v_i| denote the ℓ_1 norm of the vector v.
For a property P of lists of distributions and 0 ≤ ɛ ≤ 2, we say that D is ɛ-far from having P if (1/m)·Σ_{i=1}^m ||D_i − D*_i||_1 > ɛ for every list D* = (D*_1, ..., D*_m) that has the property P (note that ||D_i − D*_i||_1 is twice the statistical distance between the two distributions). Given a distance parameter ɛ, a testing algorithm for a property P should distinguish between the case that D has the property P and the case that it is ɛ-far from P. We consider two models within which this task is performed.

1. The Query Model. In this model the testing algorithm may indicate an index i of its choice and it gets a sample j distributed according to D_i.

2. The Sampling Model. In this model the algorithm cannot select (query) a distribution of its choice. Rather, it may obtain a pair (i, j) where i is selected uniformly (we refer to this as the Uniform sampling model) and j is distributed according to D_i.

We also consider a generalization in which there is an underlying weight vector w = (w_1, ..., w_m) with Σ_{i=1}^m w_i = 1, and the distribution D_i is selected according to w. In this case the notion of ɛ-far needs to be modified accordingly. Namely, we say that D is ɛ-far from P with respect to w if Σ_{i=1}^m w_i·||D_i − D*_i||_1 > ɛ for every list D* = (D*_1, ..., D*_m) that has the property P.
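The distance measure above is easy to compute for a fixed reference point. The sketch below (our own helper name) computes the average ℓ_1 distance of a collection to a single candidate distribution D*; for the equivalence property, minimizing this quantity over D* gives the distance to the property, and for any fixed candidate it is an upper bound on that distance.

```python
def avg_l1_to_candidate(dists, d_star):
    """Average ℓ1 distance (1/m)·Σ_i ||D_i − D*||_1 of a collection of
    probability vectors `dists` to a candidate distribution `d_star`."""
    m = len(dists)
    return sum(sum(abs(p - q) for p, q in zip(d, d_star)) for d in dists) / m
```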

We consider two variants of this non-uniform model: the Known-Weights sampling model, in which w is known to the algorithm, and the Unknown-Weights sampling model, in which w is unknown.

A main focus of this work is on the following property. We shall say that a list D = (D_1, ..., D_m) of distributions over [n] belongs to P^{eq}_{m,n} (or has the property P^{eq}_{m,n}) if D_i = D_{i'} for all i, i' ∈ [m].

3 A Lower Bound of Ω(n^{2/3}·m^{1/3}) for Testing Equivalence in the Uniform Sampling Model when n = Ω(m·log m)

In this section we prove the following theorem:

Theorem 1 Any testing algorithm for the property P^{eq}_{m,n} in the uniform sampling model, for every ɛ ≤ 1/20 and for n > c·m·log m where c is some sufficiently large constant, requires Ω(n^{2/3}·m^{1/3}) samples.

The proof of Theorem 1 consists of two parts. The first is a general theorem (Theorem 2) concerning testing symmetric properties of lists of distributions. This theorem extends a lemma of Valiant [Val08b], which leads to what Valiant refers to as the Wishful Thinking Theorem. The second part is a construction of two lists of distributions to which Theorem 2 is applied. Our analysis uses a technique called Poissonization [Szp01], which was used in the past in the context of lower bounds for testing and estimating properties of distributions in [RRSS09, Val08a, Val08b], and hence we first introduce some preliminaries concerning Poisson distributions. We later provide some intuition regarding the benefits of Poissonization.

3.1 Preliminaries concerning Poisson distributions

For a positive real number λ, the Poisson distribution poi(λ) takes the value x ∈ N (where N = {0, 1, 2, ...}) with probability poi(x; λ) = e^{−λ}·λ^x / x!. The expectation and variance of poi(λ) are both λ. For λ_1 and λ_2 we shall use the following bound on the ℓ_1 distance between the corresponding Poisson distributions (for a proof see, for example, [RRSS09]):

    ||poi(λ_1) − poi(λ_2)||_1 ≤ 2·|λ_1 − λ_2| .     (1)

For a vector λ̄ = (λ_1, ..., λ_d) of positive real numbers, the corresponding multivariate Poisson distribution poi(λ̄) is the product distribution poi(λ_1) × ... × poi(λ_d).
That is, poi(λ̄) assigns each vector x̄ = (x_1, ..., x_d) ∈ N^d the probability Π_{i=1}^d poi(x_i; λ_i). We shall sometimes consider vectors λ̄ whose coordinates are indexed by vectors a = (a_1, ..., a_m) ∈ N^m, and will use λ̄_a to denote the coordinate of λ̄ that corresponds to a. Thus, poi(λ̄_a) is a univariate Poisson distribution. With a slight abuse of notation, for a subset I ⊆ [d] (or I ⊆ N^m), we let poi(λ̄(I)) denote the multivariate Poisson distribution restricted to the coordinates of λ̄ in I. For any two d-dimensional vectors λ̄+ = (λ+_1, ..., λ+_d) and λ̄− = (λ−_1, ..., λ−_d) of positive real values, we get from the proof of [Val08b, Lemma 4.5.3] that

    ||poi(λ̄+) − poi(λ̄−)||_1 ≤ Σ_{j=1}^d ||poi(λ+_j) − poi(λ−_j)||_1 ;

for our purposes we shall use the following generalized lemma.

Lemma 1 For any two d-dimensional vectors λ̄+ = (λ+_1, ..., λ+_d) and λ̄− = (λ−_1, ..., λ−_d) of positive real values, and for any partition {I_i}_{i=1}^ℓ of [d],

    ||poi(λ̄+) − poi(λ̄−)||_1 ≤ Σ_{i=1}^ℓ ||poi(λ̄+(I_i)) − poi(λ̄−(I_i))||_1 .

Proof: Let {I_i}_{i=1}^ℓ be a partition of [d], and let ī denote (i_1, ..., i_d). By the triangle inequality we have that for every k ∈ [ℓ],

    | poi(ī; λ̄+) − poi(ī; λ̄−) | = | Π_{j∈[d]} poi(i_j; λ+_j) − Π_{j∈[d]} poi(i_j; λ−_j) |
        ≤ | Π_{j∈[d]} poi(i_j; λ+_j) − Π_{j∈[d]\I_k} poi(i_j; λ+_j) · Π_{j∈I_k} poi(i_j; λ−_j) |
        + | Π_{j∈[d]\I_k} poi(i_j; λ+_j) · Π_{j∈I_k} poi(i_j; λ−_j) − Π_{j∈[d]} poi(i_j; λ−_j) | .

Hence, summing over all ī ∈ N^d, we obtain that

    ||poi(λ̄+) − poi(λ̄−)||_1 = Σ_{ī∈N^d} | poi(ī; λ̄+) − poi(ī; λ̄−) |
        ≤ ||poi(λ̄+(I_k)) − poi(λ̄−(I_k))||_1 + ||poi(λ̄+([d]\I_k)) − poi(λ̄−([d]\I_k))||_1 .

Thus, the lemma follows by induction on ℓ.

We shall also make use of the following lemma.

Lemma 2 For any two d-dimensional vectors λ̄+ = (λ+_1, ..., λ+_d) and λ̄− = (λ−_1, ..., λ−_d) of positive real values,

    ||poi(λ̄+) − poi(λ̄−)||_1 ≤ √( 2·Σ_{j=1}^d (λ−_j − λ+_j)² / λ−_j ) .

Proof: In order to prove the lemma we shall use the KL-divergence between distributions. Namely, for two distributions p_1 and p_2 over a domain X,

    D_KL(p_1 || p_2) := Σ_{x∈X} p_1(x)·ln( p_1(x) / p_2(x) ) .

Let ī denote (i_1, ..., i_d). We have that

    ln( poi(ī; λ̄+) / poi(ī; λ̄−) ) = Σ_{j=1}^d ln( e^{−λ+_j}·(λ+_j)^{i_j} / ( e^{−λ−_j}·(λ−_j)^{i_j} ) )
        = Σ_{j=1}^d ( λ−_j − λ+_j + i_j·ln(λ+_j / λ−_j) )
        ≤ Σ_{j=1}^d ( λ−_j − λ+_j + i_j·(λ+_j / λ−_j − 1) ) ,

where in the last inequality we used the fact that ln x ≤ x − 1 for every x > 0. Therefore, we obtain that

    D_KL(poi(λ̄+) || poi(λ̄−)) = Σ_{ī∈N^d} poi(ī; λ̄+)·ln( poi(ī; λ̄+) / poi(ī; λ̄−) )
        ≤ Σ_{j=1}^d ( λ−_j − λ+_j + λ+_j·(λ+_j / λ−_j − 1) )     (2)
        = Σ_{j=1}^d (λ−_j − λ+_j)² / λ−_j ,

where in Equation (2) we used the facts that Σ_{i∈N} poi(i; λ) = 1 and Σ_{i∈N} i·poi(i; λ) = λ. The ℓ_1 distance is related to the KL-divergence by ||D_1 − D_2||_1 ≤ √(2·D_KL(D_1 || D_2)), and thus we obtain the lemma.

The next lemma bounds the probability that a Poisson random variable is significantly smaller than its expected value.

Lemma 3 Let X ~ poi(λ). Then Pr[X < λ/2] < (3/4)^{λ/4}.

Proof: Consider the matching between j and j + λ/2 for every j = 0, 1, ..., λ/2 − 1. We consider the ratio between poi(j; λ) and poi(j + λ/2; λ):

    poi(j + λ/2; λ) / poi(j; λ) = ( e^{−λ}·λ^{j+λ/2} / (j + λ/2)! ) / ( e^{−λ}·λ^j / j! )
        = λ^{λ/2} / ( (j + 1)(j + 2)···(j + λ/2) )
        ≥ λ^{λ/2} / ( (j + λ/4)^{λ/4}·(j + λ/2)^{λ/4} )
        ≥ λ^{λ/2} / ( (3λ/4)^{λ/4}·λ^{λ/4} )
        = (4/3)^{λ/4} ,

using j ≤ λ/2, so that j + λ/4 ≤ 3λ/4 and j + λ/2 ≤ λ. This implies that

    Pr[λ/2 ≤ X < λ] ≥ (4/3)^{λ/4}·Pr[X < λ/2] ,

and hence

    Pr[X < λ/2] ≤ Pr[X < λ/2] / ( Pr[X < λ/2] + Pr[λ/2 ≤ X < λ] ) ≤ 1 / ( 1 + (4/3)^{λ/4} ) < (3/4)^{λ/4} ,

and the proof is completed.
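The univariate quantities in this subsection are easy to compute exactly, which allows numeric spot-checks of the ℓ_1 bound 2·|λ_1 − λ_2| stated above and of the tail bound of Lemma 3. The helper names in this sketch are ours.

```python
import math

def poi_pmf(x, lam):
    """poi(x; λ) = e^{-λ} λ^x / x!"""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poi_l1(lam1, lam2, cutoff=100):
    """ℓ1 distance between poi(λ1) and poi(λ2), truncated at `cutoff`
    (the neglected tail mass is negligible for small λ)."""
    return sum(abs(poi_pmf(x, lam1) - poi_pmf(x, lam2)) for x in range(cutoff))

def poi_below(lam, k):
    """Pr[X < k] for X ~ poi(λ)."""
    return sum(poi_pmf(x, lam) for x in range(k))

# Spot-check ||poi(λ1) − poi(λ2)||_1 ≤ 2|λ1 − λ2| ...
assert poi_l1(3.0, 3.5) <= 2 * 0.5
# ... and Lemma 3: Pr[X < λ/2] < (3/4)^{λ/4}, for a few even values of λ.
for lam in (8, 16, 32):
    assert poi_below(lam, lam // 2) < (3 / 4) ** (lam / 4)
```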

The next two notations will play an important technical role in our analysis. For a list of distributions D = (D_1, ..., D_m), an integer κ, and a vector a = (a_1, ..., a_m) ∈ N^m, let

    p^{D,κ}(j; a) := Π_{i=1}^m poi(a_i; κ·D_i(j)) .     (3)

That is, for a fixed choice of a domain element j ∈ [n], consider performing m independent trials, one for each distribution D_i, where in trial i we select a non-negative integer according to the Poisson distribution poi(λ) for λ = κ·D_i(j). Then p^{D,κ}(j; a) is the probability of the joint event that we get an outcome of a_i in trial i, for each i ∈ [m]. Let λ̄^{D,κ} be a vector whose coordinates are indexed by all a ∈ N^m, such that

    λ̄^{D,κ}_a = Σ_{j=1}^n p^{D,κ}(j; a) .     (4)

That is, λ̄^{D,κ}_a is the expected number of times we get the joint outcome (a_1, ..., a_m) if we perform the probabilistic process defined above independently for every j ∈ [n].

3.2 Testability of symmetric properties of lists of distributions

In this subsection we prove the following theorem, which is used to prove Theorem 1.

Theorem 2 Let D+ and D− be two lists of m distributions over [n], all of whose frequencies are at most δ/(mκ), where κ is some positive integer and δ is a sufficiently small constant. If

    ||poi(λ̄^{D+,κ}) − poi(λ̄^{D−,κ})||_1 < δ/5 ,     (5)

then testing (in the uniform sampling model) any symmetric property of distributions such that D+ has the property, while D− is Ω(1)-far from having the property, requires Ω(κ·m) samples.

A HIGH-LEVEL DISCUSSION OF THE PROOF OF THEOREM 2. For an element j ∈ [n] and a distribution D_i, i ∈ [m], let α_{i,j} be the number of times the pair (i, j) appears in the sample (when the sample is selected according to some sampling model). Thus (α_{1,j}, ..., α_{m,j}) is the sample histogram of the element j. The histogram of the elements' histograms is called the fingerprint of the sample. That is, the fingerprint indicates, for every a ∈ N^m, the number of elements j such that (α_{1,j}, ..., α_{m,j}) = a. As shown in [BFR+00], when testing symmetric properties of distributions, it can be assumed without loss of generality that the testing algorithm is provided only with the fingerprint of the sample.
Furthermore, since the number, n, of elements is fixed, it suffices to give the tester the fingerprint of the sample without the 0̄ = (0, ..., 0) entry. For example, consider the distributions D_1 and D_2 over {1, 2, 3} such that D_1[j] = 1/3 for every j ∈ {1, 2, 3}, and D_2[1] = D_2[2] = 1/2, D_2[3] = 0. Assume that we sample (D_1, D_2) four times according to the uniform sampling model and we get the samples (1, 1), (2, 1), (2, 2), (1, 3), where the first coordinate denotes the distribution and the second coordinate denotes the element. Then the sample histogram of element 1 is (1, 1), because element 1 was selected once by D_1 and once by D_2. For the elements 2 and 3 we have the sample histograms (0, 1) and (1, 0), respectively. The fingerprint of the sample is (0, 1, 1, 0, 1, 0, 0, ...) for the following order of histograms: ((0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), ...).

In order to prove Theorem 2, we would like to show that the distributions of the fingerprints when the sample is generated according to D+ and when it is generated according to D− are similar, for a sample size that is below the lower bound stated in the theorem. For each choice of an element j ∈ [n] and a distribution D_i, the number of times the pair (i, j) appears in the sample, i.e., α_{i,j}, depends on the number of times the other pairs appear, simply because the total number of samples is fixed. Furthermore, for each histogram a, the number of elements with sample histogram identical to a is dependent on the number of times the other histograms appear, because the number of samples is fixed. For instance, in the example above, if we know that we have the histogram (0, 1) once and the histogram (1, 1) once, then we know that the third histogram cannot be (2, 0). In addition, it is dependent because the number of elements is fixed. We thus see that the distribution of the fingerprints is rather difficult to analyze, and therefore it is difficult to bound the statistical distance between two different such distributions.

Therefore, we would like to break as many of the above dependencies as possible. To this end we define a slightly different process for generating the samples that involves Poissonization [Szp01]. In the Poissonized process the number of samples we take from each distribution D_i, denoted κ'_i, is distributed according to the Poisson distribution. We prove that, while the overall number of samples the Poissonized process takes is bigger just by a constant factor than in the uniform process, with very high probability κ'_i > κ_i for every i, where κ_i is the number of samples taken from D_i. This implies that if we prove a lower bound for algorithms that receive samples generated by the Poissonized process, then we obtain a related lower bound for algorithms that work in the uniform sampling model. As opposed to the process that takes a fixed number of samples according to the uniform sampling model, the benefit of the Poissonized process is that the α_{i,j}'s determined by this process are independent. Therefore, the type of sample histogram that element j has is completely independent of the types of sample histograms the other elements have.
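The worked example above can be reproduced mechanically; the sketch below (our own helper name) computes per-element sample histograms and the resulting fingerprint.

```python
from collections import Counter

def fingerprint(samples, m, n):
    """Fingerprint of a sample: for each histogram a ∈ N^m, the number of
    elements j whose sample histogram (α_{1,j}, ..., α_{m,j}) equals a.
    `samples` is a list of (i, j) pairs with i in 1..m and j in 1..n."""
    hist = {j: [0] * m for j in range(1, n + 1)}
    for i, j in samples:
        hist[j][i - 1] += 1
    return Counter(tuple(h) for h in hist.values())

# The worked example: samples (1,1), (2,1), (2,2), (1,3) over {1, 2, 3}.
fp = fingerprint([(1, 1), (2, 1), (2, 2), (1, 3)], m=2, n=3)
# Element 1 has histogram (1,1); element 2 has (0,1); element 3 has (1,0).
```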
We get that the fingerprint distribution is a generalized multinomial distribution, which has been studied by Roos [Roo99] (the connection is due to Valiant [Val08a]).

Definition 1 In the Poissonized uniform sampling model with parameter κ (which we'll refer to as the κ-Poissonized model), given a list D = (D_1, ..., D_m) of distributions, a sample is generated as follows:
1. Draw κ'_1, ..., κ'_m ~ poi(κ), independently.
2. Return κ'_i samples distributed according to D_i, for each i ∈ [m].

Lemma 4 Assume there exists a tester T in the uniform sampling model for a property P of lists of distributions, that takes a sample of size s = κ·m where κ ≥ c for some sufficiently large constant c, works for every ɛ ≥ ɛ_0 where ɛ_0 is a constant, and has success probability at least 2/3. Then there exists a tester T' for P in the Poissonized uniform sampling model with parameter 4κ, that works for every ɛ ≥ ɛ_0 and whose success probability is at least 19/30.

Proof: Roughly speaking, the tester T' tries to simulate T if it has a sufficiently large sample, and otherwise it guesses the answer. More precisely, consider a tester T' that receives κ' samples, where κ' ~ poi(4κm). By Lemma 3 we have that Pr[κ' < 2κm] < (3/4)^{κm}. If κ' ≥ κm, then T' simulates T on the first κm samples that it got. Otherwise it outputs accept or reject with equal probability. The probability that κ' ≥ κm is at least 1 − (3/4)^{κm}, which is greater than 4/5 for κm larger than a sufficiently large constant. Therefore, the success probability of T' is at least (4/5)·(2/3) + (1/5)·(1/2) = 19/30, as desired.

Given Lemma 4, it suffices to consider samples that are generated in the Poissonized uniform sampling model. The process for generating a sample {α_{1,j}, ..., α_{m,j}}_{j∈[n]} (recall that α_{i,j} is the number of times that element j was selected by distribution D_i) in the κ-Poissonized model is equivalent to the following process: for each i ∈ [m] and j ∈ [n], independently select α_{i,j} according to poi(κ·D_i(j)) (see [Fel67]). Thus the probability of getting a particular histogram ā_j = (a_{1,j}, ..., a_{m,j}) for element j is p^{D,κ}(j; ā_j), as defined in Equation (3). We can represent the event that the histogram of element j is ā_j by a Bernoulli random vector b̄_j that is indexed by all a ∈ N^m, is 1 in the coordinate corresponding to ā_j, and is 0 elsewhere. Given this representation, the fingerprint of the sample corresponds to Σ_{j=1}^n b̄_j. In fact, we would like b̄_j to be of finite dimension, so we have to consider only a finite (sufficiently large) number of possible histograms. Under this relaxation, b̄_j = (0, ..., 0) would correspond to the case that the sample histogram of element j is not in the set of histograms we consider.

Roos's theorem, stated next, shows that the distribution of the fingerprints can be approximated by a multivariate Poisson distribution (the Poisson here is related to the fact that the fingerprint distributions are generalized multinomial distributions, and not related to the Poisson from the Poissonization process). For simplicity, the theorem is stated for vectors b̄_j that are indexed directly, that is, b̄_j = (b_{j,1}, ..., b_{j,h}).

Theorem 3 ([Roo99]) Let D(S_n) be the distribution of the sum S_n of n independent Bernoulli random vectors b̄_1, ..., b̄_n in R^h, where Pr[b̄_j = ē_l] = p_{j,l} and Pr[b̄_j = (0, ..., 0)] = 1 − Σ_{l=1}^h p_{j,l} (here ē_l is the vector that satisfies e_{l,l} = 1 and e_{l,l'} = 0 for every l' ≠ l). Suppose we define an h-dimensional vector λ̄ = (λ_1, ..., λ_h) as follows: λ_l = Σ_{j=1}^n p_{j,l}. Then

    ||D(S_n) − poi(λ̄)||_1 ≤ (88/5)·Σ_{l=1}^h ( Σ_{j=1}^n p_{j,l}² ) / ( Σ_{j=1}^n p_{j,l} ) .     (6)

We next show how to obtain a bound on sums of the form given in Equation (6) under appropriate conditions.
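The κ-Poissonized sampling process described above can be sketched directly; the helper names are ours, and the Poisson sampler uses Knuth's multiplication method, which is adequate for moderate κ.

```python
import math
import random

def poisson_draw(lam, rng):
    """Sample from poi(λ) via Knuth's multiplication method (moderate λ only)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def poissonized_sample(dists, kappa, rng=random):
    """Sketch of the κ-Poissonized uniform sampling model: draw
    κ'_i ~ poi(κ) independently, then return κ'_i samples from each D_i,
    as a list of (i, j) pairs."""
    out = []
    for i, d in enumerate(dists):
        for _ in range(poisson_draw(kappa, rng)):
            out.append((i, rng.choices(range(len(d)), weights=d)[0]))
    return out
```

Note how, unlike a fixed-size sample, the counts α_{i,j} produced by this process are mutually independent, which is exactly the benefit exploited in the analysis.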
Lemma 5 Given a list D = (D_1, ..., D_m) of distributions over [n] and a real number 0 < δ ≤ 1/2 such that for all i ∈ [m] and for all j ∈ [n], D_i(j) ≤ δ/(mκ) for some integer κ, we have that

    Σ_{a ∈ N^m \ {0̄}} ( Σ_{j=1}^n p^{D,κ}(j; a)² ) / ( Σ_{j=1}^n p^{D,κ}(j; a) ) ≤ 2δ .     (7)

Proof:

    Σ_{a ∈ N^m \ {0̄}} ( Σ_{j=1}^n p^{D,κ}(j; a)² ) / ( Σ_{j=1}^n p^{D,κ}(j; a) )
        ≤ Σ_{a ∈ N^m \ {0̄}} max_j p^{D,κ}(j; a)
        = Σ_{a ∈ N^m \ {0̄}} max_j Π_{i=1}^m poi(a_i; κ·D_i(j))
        ≤ Σ_{a : a_1+...+a_m ≥ 1} (δ/m)^{a_1+...+a_m}
        = (1 − δ/m)^{−m} − 1 ≤ δ/(1 − δ) ≤ 2δ ,     (8)

12 where the inequality in Equation 8 holds for δ / and the inequality in Equation 8 follows fro: and the proof is copleted. e κ Dij κ D i j a poia; κ D i j = a! κ D i j a δ a, Proof of Theore : By the first preise of the theore, D i + j, D+ i j δ κ for every i [] and j [n]. By Lea 5 this iplies that Equation 7 holds both for D = D + and for D = D. Cobining this with Theore 3 we get that the l distance between the fingerprint distribution when the saple is generated according to D + in the κ-poissonized odel, see Definition and the distribution poi λ D +,κ is at ost δ = 5 δ, and an analogous stateent holds for D. By applying the preise in Equation 5 concerning the l distance between poi λ D +,κ and poi λ D,κ and the triangle inequality, we get that the l distance between the two fingerprint distributions is saller than 76 5 δ δ 5 = 6 30, which iplies that the statistical difference is saller than 8 30, and thus it is not possible to distinguish between D+ and D in the κ-poissonized odel with success probability at least By Lea 4 we get the desired result. 3.3 Proof of Theore In this subsection we show how to apply Theore to two lists of distributions, D + and D, which we will define shortly, where D + P eq = P,n eq while D is /0-far fro P eq. Recall that by the preise of Theore, n c log for soe sufficiently large constant c >. In the proof it will be convenient to assue that is even and that n which corresponds in the lea to t is divisible by 4. It is not hard to verify that it is possible to reduce the general case to this case. In order to define D, we shall need the next lea. Lea 6 For every two even integers and t, there exists a 0/-valued atrix M with rows and t coluns for which the following holds:. In each row and each colun of M, exactly half of the eleents are and the other half are 0.. For every integer x < /, and for every subset S [] of size x, the nuber of coluns j such that M[i, j] = for every i S is at least t x x x ln t, and at ost t. 
$\frac{t}{2^x} + \sqrt{xt\ln t}$.

Proof: Consider selecting a matrix $M$ randomly as follows: Denote the first $t/2$ columns of $M$ by $F$. For each column in $F$, pick, independently from the other $t/2 - 1$ columns in $F$, a random half of its elements to be 1, and the other half of the elements to be 0. Columns $t/2+1,\dots,t$ are the negations of columns $1,\dots,t/2$, respectively. Thus, in each row and each column of $M$, exactly half of the elements are 1 and the other half are 0.

Consider a fixed choice of $x$. For each column $j$ between 1 and $t$, each subset $S \subseteq [m]$ of size $x$, and $b \in \{0,1\}$, define the indicator random variable $I_{S,j,b}$ to be 1 if and only if $M[i,j] = b$ for every $i \in S$. Hence,
$$\Pr[I_{S,j,b} = 1] = \frac{m/2}{m}\cdot\frac{m/2-1}{m-1}\cdots\frac{m/2-x+1}{m-x+1}.$$
Clearly, $\Pr[I_{S,j,b} = 1] < 2^{-x}$. On the other hand,
$$\Pr[I_{S,j,b} = 1] \ge \left(\frac{m/2-x}{m-x}\right)^x = \frac{1}{2^x}\left(1-\frac{x}{m-x}\right)^x \ge \frac{1}{2^x}\left(1-\frac{2x^2}{m}\right),$$
where the last inequality is due to Bernoulli's inequality, which states that $(1+y)^n \ge 1+ny$ for every real number $y > -1$ and integer $n \ge 1$ [MV70].

Let $E_{S,b}$ denote the expected value of $\sum_{j=1}^{t/2} I_{S,j,b}$. From the fact that columns $t/2+1,\dots,t$ are the negations of columns $1,\dots,t/2$, it follows that $\sum_{j=t/2+1}^{t} I_{S,j,1} = \sum_{j=1}^{t/2} I_{S,j,0}$. Therefore, the expected number of columns $1 \le j \le t$ such that $M[i,j] = 1$ for every $i \in S$ is simply $E_{S,1}+E_{S,0}$, that is, at most $\frac{t}{2^x}$ and at least $\frac{t}{2^x}\left(1-\frac{2x^2}{m}\right)$. By the additive Chernoff bound,
$$\Pr\left[\left|\sum_{j=1}^{t/2} I_{S,j,b} - E_{S,b}\right| > \frac{\sqrt{tx\ln t}}{2}\right] < 2\exp\left(-2\cdot\frac{tx\ln t/4}{t/2}\right) = 2t^{-x}.$$
Thus, by taking a union bound over $b \in \{0,1\}$,
$$\Pr\left[\left|\sum_{j=1}^{t} I_{S,j,1} - (E_{S,1}+E_{S,0})\right| > \sqrt{tx\ln t}\right] < 4t^{-x}.$$
By taking a union bound over all subsets $S$ we get that $M$ has the desired properties with positive probability, and hence such a matrix exists.

We first define $D^+$, in which all distributions are identical. Specifically, for each $i \in [m]$:
$$D_i^+(j) \;\stackrel{\rm def}{=}\; \begin{cases} \frac{1}{2n^{2/3}m^{1/3}} & \text{if } j \le n^{2/3}m^{1/3}\\[2pt] \frac{1}{n} & \text{if } n/2 < j \le n\\[2pt] 0 & \text{o.w.}\end{cases} \qquad (9)$$

We now turn to defining $D^-$. Let $M$ be a matrix as in Lemma 6 for $t = n/2$. For every $i \in [m]$:
$$D_i^-(j) \;\stackrel{\rm def}{=}\; \begin{cases} \frac{1}{2n^{2/3}m^{1/3}} & \text{if } j \le n^{2/3}m^{1/3}\\[2pt] \frac{2}{n} & \text{if } n/2 < j \le n \text{ and } M[i, j-n/2] = 1\\[2pt] 0 & \text{o.w.}\end{cases} \qquad (10)$$
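As a concrete sanity check on this construction, the sketch below (Python; the helper names `sample_matrix` and `build_lists` are illustrative, not from the paper) samples a balanced matrix as in the proof of Lemma 6 and builds the lists of Equations (9) and (10) for a small instance in which $n^{2/3}m^{1/3}$ is an exact integer, verifying that every $D_i^+$ and $D_i^-$ is a probability distribution:

```python
import random
from fractions import Fraction

def sample_matrix(m, t):
    """Sample an m x t 0/1 matrix as in the proof of Lemma 6: the first t/2
    columns are independent uniformly random balanced columns (m/2 ones each),
    and columns t/2+1,...,t are their negations, so rows are balanced too."""
    assert m % 2 == 0 and t % 2 == 0
    cols = []
    for _ in range(t // 2):
        col = [1] * (m // 2) + [0] * (m // 2)
        random.shuffle(col)
        cols.append(col)
    cols += [[1 - b for b in c] for c in cols[: t // 2]]
    return [[cols[j][i] for j in range(t)] for i in range(m)]  # M[i][j]

def build_lists(n, m, M):
    """Build D+ (Eq. 9) and D- (Eq. 10); M is an m x (n/2) balanced matrix.
    Assumes n^{2/3} m^{1/3} is an integer and m <= n/8, so the heavy range
    does not overlap the light range n/2 < j <= n."""
    h = round((n ** 2 * m) ** (1 / 3))     # number of heavy elements
    heavy_w = Fraction(1, 2 * h)           # weight 1/(2 n^{2/3} m^{1/3})
    Dplus, Dminus = [], []
    for i in range(m):
        dp, dm = [Fraction(0)] * n, [Fraction(0)] * n
        for j in range(h):                 # heavy elements: identical weight
            dp[j] = dm[j] = heavy_w
        for j in range(n // 2, n):         # light elements
            dp[j] = Fraction(1, n)
            if M[i][j - n // 2] == 1:
                dm[j] = Fraction(2, n)
        Dplus.append(dp)
        Dminus.append(dm)
    return Dplus, Dminus

n, m = 64, 8                               # heavy count = 64^{2/3} * 8^{1/3} = 32
M = sample_matrix(m, n // 2)
assert all(sum(row) == n // 4 for row in M)                       # balanced rows
assert all(sum(M[i][j] for i in range(m)) == m // 2 for j in range(n // 2))
Dplus, Dminus = build_lists(n, m, M)
assert all(sum(d) == 1 for d in Dplus + Dminus)                   # distributions
```

The exact rational arithmetic (`Fraction`) makes the "sums to 1" checks exact rather than approximate.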

For both $D^+$ and $D^-$, we refer to the elements $j \le n^{2/3}m^{1/3}$ as the heavy elements, and to the elements $n/2 < j \le n$ as the light elements. Observe that each heavy element has exactly the same probability weight, $\frac{1}{2n^{2/3}m^{1/3}}$, in all distributions $D_i^+$ and $D_i^-$. On the other hand, for each light element $j$, while $D_i^+(j) = \frac1n$ for every $i$, in $D^-$ we have that $D_i^-(j) = \frac2n$ for half of the distributions (the distributions selected by $M$), and $D_i^-(j) = 0$ for the other half (the distributions not selected by $M$). We later use the properties of $M$ to bound the $\ell_1$ distance between the fingerprint distributions of $D^+$ and $D^-$.

A HIGH-LEVEL DISCUSSION. To gain some intuition before delving into the detailed proof, consider first the special case $m = 2$, which was studied by Valiant [Val08a]; indeed the construction is the same as the one he analyzes, and it was initially proposed in [BFR+00]. In this case each heavy element has probability weight $\Theta(1/n^{2/3})$ and we would like to establish a lower bound of $\Omega(n^{2/3})$ on the number of samples required to distinguish between $D^+$ and $D^-$. That is, we would like to show that the corresponding fingerprint distributions, when the sample is of size $o(n^{2/3})$, are very similar. The first main observation is that since the probability weight of light elements is $\Theta(1/n)$ in both $D^+$ and $D^-$, the probability that a light element will appear more than twice in a sample of size $o(n^{2/3})$ is very small. That is (using the fingerprints-of-histograms notation we introduced previously), for each $\bar a = (a_1, a_2)$ such that $a_1 + a_2 > 2$, with high probability the sample will not include any light element $j$ such that $\alpha_{1,j} = a_1$ and $\alpha_{2,j} = a_2$, both for $D^+$ and for $D^-$. Moreover, for every $x \in \{1,2\}$, the expected number of elements $j$ such that $(\alpha_{1,j}, \alpha_{2,j}) = (x, 0)$ is the same in $D^+$ and $D^-$, as well as the variance (from symmetry, the same applies to $(0, x)$). Thus, most of the difference between the fingerprint distributions is due to the numbers of elements $j$ such that $(\alpha_{1,j}, \alpha_{2,j}) = (1,1)$.
For this setting we do expect to see a non-negligible difference for light elements between $D^+$ and $D^-$; in particular, we cannot get the $(1,1)$ histogram for light elements in $D^-$, as opposed to $D^+$. Here is where the heavy elements come into play. Recall that in both $D^+$ and $D^-$ the heavy elements have the same probability weight, so that the expected number of heavy elements $j$ such that $(\alpha_{1,j}, \alpha_{2,j}) = (1,1)$ is the same for $D^+$ and $D^-$. However, intuitively, the variance of these numbers for the heavy elements swamps the differences between the light elements, so that it is not possible to distinguish between $D^+$ and $D^-$. The actual proof, which formalizes and quantifies this intuition, considers the difference between the values of the vectors $\lambda^{D^+,\kappa}$ and $\lambda^{D^-,\kappa}$ (as defined in Equation (4)) in the coordinates corresponding to $\bar a$ such that $a_1 + a_2 = 2$. We can then apply Lemmas 1 and 2 to obtain Equation (5) in Theorem 2.

Turning to $m > 2$, it is no longer true that in a sample of size $o(n^{2/3}m^{1/3})$ we will not get histogram vectors $\bar a$ such that $\sum_{i=1}^m a_i > 2$ for light elements. Thus we have to deal with many more vectors $\bar a$, now of dimension $m$, and to bound the total contribution of all of them to the difference between the fingerprints of $D^+$ and of $D^-$. To this end we partition the set of all possible histogram vectors into several subsets according to their Hamming weight $\sum_{i=1}^m a_i$, and depending on whether all $a_i$'s are in $\{0,1\}$ or there exists at least one $a_i$ such that $a_i \ge 2$. In particular, to deal with the former, whose number for each choice of Hamming weight $x$ is relatively large (i.e., roughly $m^x$), we use the properties of the matrix $M$ based on which $D^-$ is defined. We note that from the analysis we see that, similarly to the case $m = 2$, we need the variance of the heavy elements to play a role only for the cases where $\sum_{i=1}^m a_i = 2$, while in the other cases the total contribution of the light elements is rather small. In the remainder of this section we provide the details of the analysis.
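The objects discussed above can be made concrete by simulation. The sketch below (Python; `fingerprint_draw` and `poisson_draw` are illustrative helper names, and the code is a simulation aid rather than part of the proof) draws one fingerprint in the $\kappa$-Poissonized model: each $D_i$ is sampled a $poi(\kappa)$-distributed number of times, $\alpha_{i,j}$ counts how often element $j$ was drawn from $D_i$, and the fingerprint maps each nonzero histogram vector $\bar a$ to the number of elements realizing it:

```python
import math
import random
from collections import Counter

def poisson_draw(lam, rng):
    # Knuth's method; adequate for the small lambda used here
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def fingerprint_draw(D, kappa, rng):
    """One fingerprint draw in the kappa-Poissonized model for a list D of
    m distributions over [n]: returns a Counter mapping each nonzero
    histogram vector a = (alpha_{1,j}, ..., alpha_{m,j}) to the number of
    elements j whose sample histogram equals a."""
    m, n = len(D), len(D[0])
    alpha = [[0] * n for _ in range(m)]
    for i in range(m):
        for _ in range(poisson_draw(kappa, rng)):
            j = rng.choices(range(n), weights=D[i])[0]
            alpha[i][j] += 1
    fp = Counter()
    for j in range(n):
        a = tuple(alpha[i][j] for i in range(m))
        if any(a):
            fp[a] += 1
    return fp

rng = random.Random(0)
D = [[1 / 8] * 8 for _ in range(3)]   # three identical uniform distributions
fp = fingerprint_draw(D, 2.0, rng)
assert sum(fp.values()) <= 8          # at most n elements are ever observed
```

Repeating such draws for $D^+$ and $D^-$ gives an empirical view of how close the two fingerprint distributions are.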
Before establishing that $D^-$ is indeed $\Omega(1)$-far from $P^{eq}$, we introduce some more notation, which will be used throughout the remainder of the proof of Theorem 1. Let $S_x$ be the set of histogram vectors that contain exactly $x$ coordinates that are 1, while all the rest are 0 (which corresponds to an element that was sampled once or 0 times by each distribution). Let $A_x$ be the set of vectors whose coordinates sum to $x$ but that must

contain at least one coordinate that is at least 2 (which corresponds to an element that was sampled at least twice by at least one distribution). More formally, for any integer $x$, we define the following two subsets of $N^m$:
$$S_x \stackrel{\rm def}{=} \left\{\bar a \in N^m : \sum_{i=1}^m a_i = x \text{ and } \forall i \in [m],\ a_i < 2\right\} \quad\text{and}\quad A_x \stackrel{\rm def}{=} \left\{\bar a \in N^m : \sum_{i=1}^m a_i = x \text{ and } \exists i \in [m],\ a_i \ge 2\right\}.$$
For $\bar a \in N^m$, let ${\rm supp}(\bar a) \stackrel{\rm def}{=} \{i : a_i \ne 0\}$ denote the support of $\bar a$, and let $I_M(\bar a) \stackrel{\rm def}{=} \{j : D_i^-(j) = \frac2n\ \ \forall i \in {\rm supp}(\bar a)\}$. Note that in terms of the matrix $M$ based on which $D^-$ is defined, $I_M(\bar a)$ consists of the columns in $M$ whose restriction to the support of $\bar a$ contains only 1s. In terms of $D^-$, it corresponds to the set of light elements that might have a sample histogram of $\bar a$ when sampling according to $D^-$.

Lemma 7 For every $m > 5$ and for $n \ge c\ln n$ for some sufficiently large constant $c$, we have that $\sum_{i=1}^m \|D_i^- - D^*\|_1 > m/10$ for every distribution $D^*$ over $[n]$. That is, the list $D^-$ is $(1/10)$-far from $P^{eq}$.

Proof: Consider any $\bar a \in S_2$. By Lemma 6, setting $t = n/2$, the size of $I_M(\bar a)$, i.e., the number of light elements $l$ such that $D_i^-(l) = \frac2n$ for every $i \in {\rm supp}(\bar a)$, is at most $\frac n8 + \sqrt{n\ln n}$. The same upper bound holds for the number of light elements $l$ such that $D_i^-(l) = 0$ for every $i \in {\rm supp}(\bar a)$. This implies that for every $i \ne i'$ in $[m]$, for at least $\frac n4 - 2\sqrt{n\ln n}$ of the light elements $l$ we have that $D_i^-(l) = \frac2n$ while $D_{i'}^-(l) = 0$, or that $D_{i'}^-(l) = \frac2n$ while $D_i^-(l) = 0$. Therefore, $\|D_i^- - D_{i'}^-\|_1 \ge \frac12 - 4\sqrt{\frac{\ln n}{n}}$, which for $n \ge c\ln n$ and a sufficiently large constant $c$ is at least $\frac14$. Thus, by the triangle inequality we have that for every $D^*$, $\sum_{i=1}^m \|D_i^- - D^*\|_1 \ge \frac m2\cdot\frac14 = \frac m8$, which is greater than $m/10$ (and $m > 5$ ensures the light elements exist in sufficient number).

In what follows we work towards establishing that Equation (5) in Theorem 2 holds for $D^+$ and $D^-$. Set $\kappa = \delta_1 n^{2/3}/m^{2/3}$, where $\delta_1$ is a constant to be determined later. We shall use the shorthand $\lambda^+$ for $\lambda^{D^+,\kappa}$, and $\lambda^-$ for $\lambda^{D^-,\kappa}$ (recall that the notation $\lambda^{D,\kappa}$ was introduced in Equation (4)). By the definition of $\lambda^+$, for each $\bar a \in N^m$,
$$\lambda^+_{\bar a} = \sum_{j=1}^n \prod_{i=1}^m \frac{e^{-\kappa D_i^+(j)}\left(\kappa D_i^+(j)\right)^{a_i}}{a_i!} = n^{2/3}m^{1/3}\, e^{-\delta_1/2}\prod_{i=1}^m \frac{\left(\delta_1/(2m)\right)^{a_i}}{a_i!} \;+\; \frac n2\, e^{-\delta_1 m^{1/3}/n^{1/3}}\prod_{i=1}^m \frac{\left(\delta_1/(n^{1/3}m^{2/3})\right)^{a_i}}{a_i!}. \qquad (11)$$
By the construction of $M$, for every light element $j$, $\sum_{i=1}^m D_i^-(j) = \frac m2\cdot\frac2n = \frac mn$. Therefore,
$$\lambda^-_{\bar a} = n^{2/3}m^{1/3}\, e^{-\delta_1/2}\prod_{i=1}^m \frac{\left(\delta_1/(2m)\right)^{a_i}}{a_i!} \;+\; |I_M(\bar a)|\; e^{-\delta_1 m^{1/3}/n^{1/3}}\prod_{i=1}^m \frac{\left(2\delta_1/(n^{1/3}m^{2/3})\right)^{a_i}}{a_i!}. \qquad (12)$$
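The vector $\lambda^{D,\kappa}$ of Equation (4) can be evaluated directly on small instances, which is a useful cross-check on Equations (11) and (12). The sketch below (Python; `poi` and `lam_vec` are illustrative names) computes $\lambda^{D,\kappa}_{\bar a} = \sum_j \prod_i poi(a_i;\kappa D_i(j))$:

```python
import math

def poi(a, lam):
    """Poisson probability mass: poi(a; lam) = e^{-lam} lam^a / a!"""
    return math.exp(-lam) * lam ** a / math.factorial(a)

def lam_vec(D, kappa, a):
    """lambda^{D,kappa}_a = sum_j prod_i poi(a_i; kappa * D_i(j)): the
    expected number of elements whose sample histogram equals a in the
    kappa-Poissonized model."""
    n = len(D[0])
    total = 0.0
    for j in range(n):
        p = 1.0
        for i, ai in enumerate(a):
            p *= poi(ai, kappa * D[i][j])
        total += p
    return total

# Two identical uniform distributions over [4]: by symmetry the
# coordinates for the histograms (1,0) and (0,1) must coincide.
D = [[0.25] * 4, [0.25] * 4]
assert abs(lam_vec(D, 2.0, (1, 0)) - lam_vec(D, 2.0, (0, 1))) < 1e-12
```

For this example, each of the four elements contributes $poi(1;0.5)\,poi(0;0.5) = 0.5e^{-1}$, so both coordinates equal $2e^{-1}$.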

Hence, $\lambda^+_{\bar a}$ and $\lambda^-_{\bar a}$ differ only in the term which corresponds to the contribution of the light elements. Equations (11) and (12) demonstrate why we choose $M$ with the specific properties defined in Lemma 6. First of all, in order for every $D_i^-$ to be a probability distribution, we want each row of $M$ to sum to exactly $t/2 = n/4$. We also want each column of $M$ to sum to exactly $m/2$, in order to get $\prod_{i=1}^m e^{-\kappa D_i^+(j)} = \prod_{i=1}^m e^{-\kappa D_i^-(j)}$ for every light $j$. Finally, we would have liked $|I_M(\bar a)|\cdot 2^{\sum_i a_i}$ to equal $n/2$ for every $\bar a$. This would imply that $\lambda^+_{\bar a}$ and $\lambda^-_{\bar a}$ are equal. As we show below, this is in fact true for every $\bar a \in S_1$. For vectors $\bar a \in S_x$ where $x > 1$, the second condition in Lemma 6 ensures that $|I_M(\bar a)|$ is sufficiently close to $\frac{n}{2^{x+1}}$. This property of $M$ is not necessary in order to bound the contribution of the vectors in $A_x$. The bound that we give for those vectors is less tight, but since there are fewer such vectors, it suffices.

We start by considering the contribution to Equation (5) of histogram vectors $\bar a \in S_1$ (i.e., vectors of the form $(0,\dots,0,1,0,\dots,0)$), which correspond to the number of elements that are sampled by only one distribution, once. We prove that in the Poissonized uniform sampling model, for every $\bar a \in S_1$ the number of elements with such a sample histogram is distributed exactly the same in $D^+$ and in $D^-$.

Lemma 8 $\sum_{\bar a \in S_1} \left\|poi(\lambda^+_{\bar a}) - poi(\lambda^-_{\bar a})\right\|_1 = 0.$

Proof: For every $\bar a \in S_1$, the size of $I_M(\bar a)$ is $\frac n4$; thus,
$$|I_M(\bar a)|\prod_{i=1}^m \frac{\left(2\delta_1/(n^{1/3}m^{2/3})\right)^{a_i}}{a_i!} = \frac n2\prod_{i=1}^m \frac{\left(\delta_1/(n^{1/3}m^{2/3})\right)^{a_i}}{a_i!}.$$
By Equations (11) and (12), it follows that $\lambda^+_{\bar a} - \lambda^-_{\bar a} = 0$ for every $\bar a \in S_1$. The lemma follows by applying Equation (2).

We now turn to bounding the contribution to Equation (5) of histogram vectors $\bar a \in A_2$ (i.e., vectors of the form $(0,\dots,0,2,0,\dots,0)$), which correspond to the number of elements that are sampled by only one distribution, twice.

Lemma 9 $\left\|poi(\lambda^+_{A_2}) - poi(\lambda^-_{A_2})\right\|_1 \le 3\delta_1.$

Proof: For every $\bar a \in A_2$, the size of $I_M(\bar a)$ is $\frac n4$; thus,
$$\sum_{j \in I_M(\bar a)}\prod_{i=1}^m \frac{\left(2\delta_1/(n^{1/3}m^{2/3})\right)^{a_i}}{a_i!} = \frac n4\cdot\frac{\left(2\delta_1/(n^{1/3}m^{2/3})\right)^2}{2}.$$
By Equations (11) and (12) it follows that
$$\lambda^-_{\bar a} - \lambda^+_{\bar a} = e^{-\delta_1 m^{1/3}/n^{1/3}}\left(\frac n4\cdot\frac{\left(2\delta_1/(n^{1/3}m^{2/3})\right)^2}{2} - \frac n2\cdot\frac{\left(\delta_1/(n^{1/3}m^{2/3})\right)^2}{2}\right) = \frac{n^{1/3}\,\delta_1^2}{4\, e^{\delta_1 m^{1/3}/n^{1/3}}\, m^{4/3}}, \qquad (13)$$

and that
$$\lambda^-_{\bar a} \ge n^{2/3}m^{1/3}\, e^{-\delta_1/2}\cdot\frac{\left(\delta_1/(2m)\right)^2}{2} = \frac{n^{2/3}\,\delta_1^2}{8\, e^{\delta_1/2}\, m^{5/3}}. \qquad (14)$$
By Equations (13) and (14) we have that
$$\frac{\lambda^-_{\bar a} - \lambda^+_{\bar a}}{\lambda^-_{\bar a}} \le \frac{n^{1/3}\delta_1^2}{4m^{4/3}}\cdot\frac{8\, e^{\delta_1/2}\, m^{5/3}}{n^{2/3}\delta_1^2} = \frac{2\, e^{\delta_1/2}\, m^{1/3}}{n^{1/3}}. \qquad (15)$$
By Equations (13) and (15) and the fact that $|A_2| = m$ we get
$$\sum_{\bar a \in A_2}\frac{\left(\lambda^-_{\bar a} - \lambda^+_{\bar a}\right)^2}{\lambda^-_{\bar a}} \le m\cdot\frac{n^{1/3}\delta_1^2}{4m^{4/3}}\cdot\frac{2\, e^{\delta_1/2}\, m^{1/3}}{n^{1/3}} = \frac{e^{\delta_1/2}}{2}\,\delta_1^2 \le \delta_1^2.$$
The lemma follows by applying Lemma 2.

Recall that for a subset $I$ of $N^m$, $poi(\lambda_I)$ denotes the multivariate Poisson distribution restricted to the coordinates of $\lambda$ that are indexed by the vectors in $I$. We separately deal with $S_x$ for $x < m/2$ and for $x \ge m/2$, where our main efforts are with respect to the former, as the latter corresponds to very low probability events.

Lemma 10 For $m \ge 6$, $n \ge cm\ln n$ where $c$ is a sufficiently large constant, and for $\delta_1 \le 1/16$,
$$\left\|poi\big(\lambda^+_{\cup_{x=2}^{m/2}S_x}\big) - poi\big(\lambda^-_{\cup_{x=2}^{m/2}S_x}\big)\right\|_1 \le 32\delta_1.$$

Proof: Let $\bar a$ be a vector in $S_x$; then by the definition of $S_x$, every coordinate of $\bar a$ is 0 or 1. Therefore we may simplify Equation (11) as follows: for each $\bar a \in \cup_{x=2}^{m/2}S_x$,
$$\lambda^+_{\bar a} = n^{2/3}m^{1/3}\, e^{-\delta_1/2}\left(\frac{\delta_1}{2m}\right)^x + \frac n2\, e^{-\delta_1 m^{1/3}/n^{1/3}}\left(\frac{\delta_1}{n^{1/3}m^{2/3}}\right)^x.$$
By Lemma 6, for every $\bar a \in \cup_{x=2}^{m/2}S_x$ the size of $I_M(\bar a)$ is at most $\frac{n}{2^{x+1}} + \sqrt{xn\ln n}$ and at least $\frac{n}{2^{x+1}}\left(1 - \frac{2x^2}{m}\right) - \sqrt{xn\ln n}$. By Equation (12) this implies that
$$\lambda^-_{\bar a} = n^{2/3}m^{1/3}\, e^{-\delta_1/2}\left(\frac{\delta_1}{2m}\right)^x + (1+\eta_x)\,\frac n2\, e^{-\delta_1 m^{1/3}/n^{1/3}}\left(\frac{\delta_1}{n^{1/3}m^{2/3}}\right)^x,$$

where $|\eta_x| \le \frac{2x^2}{m} + 2^{x+1}\sqrt{\frac{x\ln n}{n}}$. So we have that
$$\left|\lambda^-_{\bar a} - \lambda^+_{\bar a}\right| = |\eta_x|\,\frac n2\, e^{-\delta_1 m^{1/3}/n^{1/3}}\left(\frac{\delta_1}{n^{1/3}m^{2/3}}\right)^x, \quad\text{and}\quad \lambda^-_{\bar a} \ge n^{2/3}m^{1/3}\, e^{-\delta_1/2}\left(\frac{\delta_1}{2m}\right)^x.$$
Then we get, for $\delta_1 \le 1/2$, that
$$\frac{\left(\lambda^-_{\bar a} - \lambda^+_{\bar a}\right)^2}{\lambda^-_{\bar a}} \le \eta_x^2\cdot\frac{e^{\delta_1/2}\, n^{4/3}}{4m^{1/3}}\left(\frac{2\delta_1}{n^{2/3}m^{1/3}}\right)^x.$$
Summing over all $\bar a \in \cup_{x=2}^{m/2}S_x$, of which there are $\binom mx \le m^x$ for each $x$, and expanding $\eta_x^2$, we get
$$\sum_{x=2}^{m/2}\sum_{\bar a \in S_x}\frac{\left(\lambda^-_{\bar a} - \lambda^+_{\bar a}\right)^2}{\lambda^-_{\bar a}} \le \frac{e^{\delta_1/2}\, n^{4/3}}{4m^{1/3}}\sum_{x=2}^{m/2}\eta_x^2\left(\frac{2\delta_1 m^{2/3}}{n^{2/3}}\right)^x \le 64\delta_1^2\left(\frac1m + \frac{m\ln n}{n}\right)\sum_{x\ge 0}\left(\frac{8\delta_1 m^{2/3}}{n^{2/3}}\right)^x \le 256\,\delta_1^2, \qquad (16,\,17)$$
where in Equation (16) we used the fact that $n > m$, and Equation (17) holds for $\delta_1 \le 1/16$ and $n \ge cm\ln n$ with $c$ sufficiently large. The lemma follows by applying Lemma 2.

Lemma 11 For $n \ge m \ge 13$ and $\delta_1 \le 1/4$,
$$\sum_{x \ge m/2}\ \sum_{\bar a \in S_x}\left\|poi(\lambda^+_{\bar a}) - poi(\lambda^-_{\bar a})\right\|_1 \le 32\delta_1^3.$$

Proof: We first observe that $|S_x| \le \left(\frac{em}{x}\right)^x$ for every $x$. To see why this is true, observe that $|S_x|$ equals the number of possibilities of arranging $x$ indistinguishable balls in $m$ bins with at most one ball per bin, i.e.,
$$|S_x| = \binom mx = \frac{m(m-1)\cdots(m-x+1)}{x!} \le \frac{m^x}{x!} \le \left(\frac{em}{x}\right)^x,$$
where we have used $x! \ge (x/e)^x$. Since the heavy terms of $\lambda^+_{\bar a}$ and $\lambda^-_{\bar a}$ in Equations (11) and (12) coincide, and by the fact that $|y - z| \le \max\{y, z\}$ for all positive real numbers $y, z$ (applied to the light terms),
$$\sum_{x \ge m/2}\sum_{\bar a \in S_x}\left|\lambda^+_{\bar a} - \lambda^-_{\bar a}\right| \le \sum_{x \ge m/2}\left(\frac{em}{x}\right)^x n\left(\frac{2\delta_1}{n^{1/3}m^{2/3}}\right)^x \le \sum_{x \ge \max\{m/2,\,3\}}(2\delta_1)^x \le 8\delta_1^3\sum_{x\ge0}(2\delta_1)^x \le 16\delta_1^3, \qquad (28,\,29)$$
where in Equation (28) we used the facts that $n \ge m \ge 13$ (so that $\frac{em^{1/3}}{x} \le \frac{2e}{m^{2/3}} \le 1$ for $x \ge m/2$) and that $n^{1-x/3} \le 1$ for $x \ge 3$, and Equation (29) holds for $\delta_1 \le 1/4$. The lemma follows by applying Equation (2).

We finally turn to the contribution of $\bar a \in A_x$ such that $x \ge 3$.

Lemma 12 For $n \ge m$ and $\delta_1 \le 1/4$,
$$\sum_{x \ge 3}\ \sum_{\bar a \in A_x}\left\|poi(\lambda^+_{\bar a}) - poi(\lambda^-_{\bar a})\right\|_1 \le 16\delta_1^3.$$

Proof: We first observe that $|A_x| \le x\,m^{x-1}$ for every $x$. To see why this is true, observe that $|A_x|$ is at most the number of possibilities of arranging $x$ balls, where one ball is a special "double" ball, in $m$ bins.

By Equations (11) and (12), the fact that the heavy terms coincide, and the fact that $|y - z| \le \max\{y, z\}$ for all positive real numbers $y, z$,
$$\sum_{x \ge 3}\sum_{\bar a \in A_x}\left|\lambda^+_{\bar a} - \lambda^-_{\bar a}\right| \le \sum_{x \ge 3} x\, m^{x-1}\, n\left(\frac{2\delta_1}{n^{1/3}m^{2/3}}\right)^x \le \sum_{x \ge 3} x\,(2\delta_1)^x \le 4\delta_1^3\sum_{x\ge0}(x+3)(2\delta_1)^x \le 8\delta_1^3, \qquad (30,\,31)$$
where in Equation (30) we used the fact that $n \ge m$ (so that $n^{1-x/3}m^{x/3-1} \le 1$ for $x \ge 3$), and Equation (31) holds for $\delta_1 \le 1/4$. The lemma follows by applying Equation (2).

We are now ready to finalize the proof of Theorem 1.

Proof of Theorem 1: Let $D^+$ and $D^-$ be as defined in Equations (9) and (10), respectively, and recall that $\kappa = \delta_1 n^{2/3}/m^{2/3}$, where $\delta_1$ will be set subsequently. By the definition of the distributions in $D^+$ and $D^-$, the probability weight assigned to each element is at most $\frac{1}{2n^{2/3}m^{1/3}} = \frac{\delta}{m\kappa}$ for $\delta = \delta_1/2$, as required by Theorem 2. By Lemma 7, $D^-$ is $(1/10)$-far from $P^{eq}$. Therefore, it remains to establish that Equation (5) holds for $D^+$ and $D^-$. Consider the following partition of $N^m\setminus\{\bar 0\}$:
$$\{\bar a\}_{\bar a \in S_1},\quad A_2,\quad \bigcup_{x=2}^{m/2}S_x,\quad \{\bar a\}_{\bar a \in S_x,\, x \ge m/2},\quad \{\bar a\}_{\bar a \in A_x,\, x \ge 3},$$
where $\{\bar a\}_{\bar a \in T}$ denotes the list of all singletons of elements in $T$. By Lemma 1 it follows that
$$\left\|poi(\lambda^+) - poi(\lambda^-)\right\|_1 \le \sum_{\bar a \in S_1}\left\|poi(\lambda^+_{\bar a}) - poi(\lambda^-_{\bar a})\right\|_1 + \left\|poi(\lambda^+_{A_2}) - poi(\lambda^-_{A_2})\right\|_1 + \left\|poi\big(\lambda^+_{\cup_{x=2}^{m/2}S_x}\big) - poi\big(\lambda^-_{\cup_{x=2}^{m/2}S_x}\big)\right\|_1 + \sum_{x \ge m/2}\sum_{\bar a \in S_x}\left\|poi(\lambda^+_{\bar a}) - poi(\lambda^-_{\bar a})\right\|_1 + \sum_{x \ge 3}\sum_{\bar a \in A_x}\left\|poi(\lambda^+_{\bar a}) - poi(\lambda^-_{\bar a})\right\|_1.$$

For $\delta_1 < 1/16$ we get by Lemmas 8-12 that $\left\|poi(\lambda^+) - poi(\lambda^-)\right\|_1 \le 35\delta_1 + 48\delta_1^3$, which is less than $\frac15$ for $\delta_1 = 1/212$.

3.4 A Lower Bound for Testing Independence

Corollary 4 Given a joint distribution $Q$ over $[m]\times[n]$, it is impossible to test whether $Q$ is independent or $1/48$-far from independent using $o(n^{2/3}m^{1/3})$ samples.

Proof: Follows directly from Lemma 5 and Theorem 1.

4 A Lower Bound of $\Omega(n^{1/2}m^{1/2})$ for Testing Equivalence in the Uniform Sampling Model

In this section we prove the following theorem:

Theorem 5 Testing the property $P^{eq}_{m,n}$ in the uniform sampling model, for every $\epsilon \le 1/2$ and $m \ge 64$, requires $\Omega(n^{1/2}m^{1/2})$ samples.

We assume without loss of generality that $n$ is even (or else, we set the probability weight of the element $n$ to 0 in all distributions considered, and work with $n - 1$, which is even). Define $H_n$ to be the set of all distributions over $[n]$ that have probability $\frac2n$ on exactly half of the elements and 0 on the other half. Define $H_n^m$ to be the set of all possible lists of $m$ distributions from $H_n$. Define $U_n^m$ to consist of only a single list of $m$ distributions, each of which is identical to $U_n$, where $U_n$ denotes the uniform distribution over $[n]$. Thus the single list in $U_n^m$ belongs to $P^{eq}_{m,n}$. On the other hand we show, in Lemma 13, that $H_n^m$ consists mostly of lists of distributions that are $\Omega(1)$-far from $P^{eq}_{m,n}$. However, we also show, in Lemma 14, that any tester in the uniform sampling model that takes less than $n^{1/2}m^{1/2}/6$ samples cannot distinguish between $D$ that was uniformly drawn from $H_n^m$ and $D = (U_n,\dots,U_n) \in U_n^m$. Details follow.

Lemma 13 For every $m \ge 64$, with probability at least $1/2$ over the choice of $D \in H_n^m$ we have that $D$ is $(1/2)$-far from $P^{eq}_{m,n}$.

Proof: We need to prove that with probability at least $1/2$ over the choice of $D \in H_n^m$, for every $\bar v = (v_1,\dots,v_n) \in R^n$ which corresponds to a distribution (i.e., $v_j \ge 0$ for every $j \in [n]$ and $\sum_{j=1}^n v_j = 1$),
$$\sum_{i=1}^m \|D_i - \bar v\|_1 > \frac m2. \qquad (22)$$
We shall actually prove a slightly more general statement, namely, that Equation (22) holds for every vector $\bar v \in R^n$.
We define the function ${\rm med}_D : [n] \to [0, \frac2n]$ such that ${\rm med}_D(j) = \mu_{1/2}(D_1(j),\dots,D_m(j))$, where $\mu_{1/2}(x_1,\dots,x_m)$ denotes the median of $x_1,\dots,x_m$ (where, if $m$ is even, it is the value in position $m/2$

in sorted non-decreasing order). The sum $\sum_{i=1}^m |x_i - c|$ is minimized when $c = \mu_{1/2}(x_1,\dots,x_m)$. Therefore, for every $D$ and every vector $\bar v \in R^n$,
$$\sum_{i=1}^m \left\|D_i - {\rm med}_D\right\|_1 \le \sum_{i=1}^m \left\|D_i - \bar v\right\|_1. \qquad (23)$$

Recall that for every $D = (D_1,\dots,D_m)$ in $H_n^m$, and for each $(i,j) \in [m]\times[n]$, we have that either $D_i(j) = \frac2n$ or $D_i(j) = 0$. Thus, ${\rm med}_D(j) = 0$ when $D_i(j) = 0$ for at least half of the $i$'s in $[m]$, and ${\rm med}_D(j) = \frac2n$ otherwise. We next show that for every $(i,j) \in [m]\times[n]$, the probability over $D \in H_n^m$ that $D_i(j)$ has the same value as ${\rm med}_D(j)$ is at most a bit bigger than half. More precisely, we show that for every $(i,j) \in [m]\times[n]$:
$$\Pr_{D \in H_n^m}\left[D_i(j) = {\rm med}_D(j)\right] \le \frac12 + \frac1{\sqrt m}. \qquad (24)$$
Fix $(i,j) \in [m]\times[n]$, and consider selecting $D$ uniformly at random from $H_n^m$. Suppose we first determine the values $D_{i'}(j)$ for $i' \ne i$, and set $D_i(j)$ in the end. For each $i' \ne i$, the probability that $D_{i'}(j) = 0$ is $1/2$, and the probability that $D_{i'}(j) = \frac2n$ is $1/2$. If more than $m/2$ of the outcomes are 0, or more than $m/2$ are $\frac2n$, then the value of ${\rm med}_D(j)$ is already determined. Conditioned on this we have that the probability that $D_i(j) = {\rm med}_D(j)$ is exactly $1/2$. On the other hand, if at most $m/2$ are 0 and at most $m/2$ are $\frac2n$ (that is, for odd $m$ there are $(m-1)/2$ that are 0 and $(m-1)/2$ that are $\frac2n$, and for even $m$ there are $m/2$ of one kind and $m/2 - 1$ of the other), then necessarily ${\rm med}_D(j) = D_i(j)$. We thus bound the probability of this event. First consider the case that $m$ is odd, so that $m - 1$ is even:
$$\Pr\left[{\rm Bin}\left(m-1,\tfrac12\right) = \tfrac{m-1}2\right] = \binom{m-1}{(m-1)/2}\cdot 2^{-(m-1)} = \frac{(m-1)!}{\left(\frac{m-1}2\right)!\left(\frac{m-1}2\right)!}\cdot 2^{-(m-1)}. \qquad (25)$$
By Stirling's approximation, $k! = \sqrt{2\pi k}\,(k/e)^k\, e^{\lambda_k}$, where $\lambda_k$ is a parameter that satisfies $\frac1{12k+1} < \lambda_k < \frac1{12k}$; thus,
$$\frac{(m-1)!}{\left(\frac{m-1}2\right)!\left(\frac{m-1}2\right)!}\cdot 2^{-(m-1)} < \sqrt{\frac{2}{\pi(m-1)}} \le \frac1{\sqrt m}, \qquad (26\text{-}29)$$
where we used the fact that $\lambda_{m-1} - 2\lambda_{(m-1)/2} < 0$, and the last inequality holds for $m \ge 3$. In case $m$ is even, the probability over the choice of $D_{i'}(j)$ for $i' \ne i$ that ${\rm med}_D(j)$ is determined by $D_i(j)$ is
$$\Pr\left[{\rm Bin}\left(m-1,\tfrac12\right) = \tfrac m2\right] + \Pr\left[{\rm Bin}\left(m-1,\tfrac12\right) = \tfrac m2 - 1\right] \le \frac2{\sqrt m}.$$
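Both facts used above can be spot-checked numerically. The sketch below (Python; `median_low` is an illustrative name for the lower median $\mu_{1/2}$) verifies on small examples that the median minimizes $c \mapsto \sum_i |x_i - c|$, and that the central binomial probability is at most $2/\sqrt m$:

```python
import math

def median_low(xs):
    """Lower median: the value in position m/2 (1-indexed, for even m) of
    the sorted list, i.e., mu_{1/2} as defined above."""
    return sorted(xs)[(len(xs) - 1) // 2]

# The median minimizes c -> sum_i |x_i - c| (exhaustive check on a grid).
xs = [0, 2, 3, 7, 9, 10]
best = min(range(-5, 16), key=lambda c: sum(abs(x - c) for x in xs))
assert sum(abs(x - median_low(xs)) for x in xs) == sum(abs(x - best) for x in xs)

# Pr[Bin(m-1, 1/2) = (m-1)/2] <= 2/sqrt(m) for odd m >= 3.
for m in range(3, 40, 2):
    p = math.comb(m - 1, (m - 1) // 2) / 2 ** (m - 1)
    assert p <= 2 / math.sqrt(m)
```

The grid check is exhaustive only over integer candidates, which suffices here since the $x_i$ are integers and the objective is piecewise linear.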

Hence, Equation (24) holds for all $m$, and we obtain the following bound on the expectation:
$$E_{D \in H_n^m}\left[\sum_{i=1}^m \left\|D_i - {\rm med}_D\right\|_1\right] = \sum_{i=1}^m\sum_{j=1}^n \frac2n\cdot\Pr_{D \in H_n^m}\left[D_i(j) \ne {\rm med}_D(j)\right] \ge 2m\left(\frac12 - \frac1{\sqrt m}\right) = m - 2\sqrt m, \qquad (30\text{-}33)$$
while the maximum value is bounded as
$$\sum_{i=1}^m \left\|D_i - {\rm med}_D\right\|_1 = \sum_{j=1}^n\sum_{i=1}^m \left|D_i(j) - {\rm med}_D(j)\right| \le \sum_{j=1}^n \frac2n\cdot\frac m2 = m, \qquad (34\text{-}36)$$
since for each $j$, at most half of the values $D_1(j),\dots,D_m(j)$ differ from their median. Assume, for the sake of contradiction, that
$$\Pr_{D \in H_n^m}\left[\sum_{i=1}^m \left\|D_i - {\rm med}_D\right\|_1 \le \frac m2\right] \ge \frac12; \qquad (37)$$
then by Equations (36) and (37) we have
$$E_{D \in H_n^m}\left[\sum_{i=1}^m \left\|D_i - {\rm med}_D\right\|_1\right] \le \frac12\cdot\frac m2 + \frac12\cdot m = \frac{3m}4, \qquad (38,\,39)$$
which contradicts Equation (33) for $m \ge 64$.

Recall that for an element $j \in [n]$ and a distribution $D_i$, $i \in [m]$, we let $\alpha_{i,j}$ denote the number of times the pair $(i,j)$ appears in the sample when the sample is selected in the uniform sampling model. Thus $(\alpha_{1,j},\dots,\alpha_{m,j})$ is the sample histogram of the element $j$. Since the sample points are selected independently, a sample is simply the union of the histograms of the different elements, or equivalently, a matrix $M' \in N^{m\times n}$.

Lemma 14 Let $U$ be the distribution of the histogram of $q$ samples taken from the uniform distribution over $[m]\times[n]$, and let $H$ be the distribution of the histogram of $q$ samples taken from a random list of distributions in $H_n^m$. Then
$$\|U - H\|_1 \le \frac{4q^2}{mn}. \qquad (40)$$

Proof: For every matrix $M' \in N^{m\times n}$, let $A_{M'}$ be the event of getting the histogram $M'$ (i.e., $M'[i,j] = x$ if element $j$ is chosen exactly $x$ times from distribution $D_i$ in the sample). For every $\bar x = (x_1,\dots,x_m) \in N^m$, let $B_{\bar x}$ be the event of getting a histogram $M'$ with exactly $x_i$ samples from distribution $D_i$ (i.e., such that


More information

Interactive Markov Models of Evolutionary Algorithms

Interactive Markov Models of Evolutionary Algorithms Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

Topic 5a Introduction to Curve Fitting & Linear Regression

Topic 5a Introduction to Curve Fitting & Linear Regression /7/08 Course Instructor Dr. Rayond C. Rup Oice: A 337 Phone: (95) 747 6958 E ail: rcrup@utep.edu opic 5a Introduction to Curve Fitting & Linear Regression EE 4386/530 Coputational ethods in EE Outline

More information

E0 370 Statistical Learning Theory Lecture 5 (Aug 25, 2011)

E0 370 Statistical Learning Theory Lecture 5 (Aug 25, 2011) E0 370 Statistical Learning Theory Lecture 5 Aug 5, 0 Covering Nubers, Pseudo-Diension, and Fat-Shattering Diension Lecturer: Shivani Agarwal Scribe: Shivani Agarwal Introduction So far we have seen how

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

Lecture 20 November 7, 2013

Lecture 20 November 7, 2013 CS 229r: Algoriths for Big Data Fall 2013 Prof. Jelani Nelson Lecture 20 Noveber 7, 2013 Scribe: Yun Willia Yu 1 Introduction Today we re going to go through the analysis of atrix copletion. First though,

More information

Detection and Estimation Theory

Detection and Estimation Theory ESE 54 Detection and Estiation Theory Joseph A. O Sullivan Sauel C. Sachs Professor Electronic Systes and Signals Research Laboratory Electrical and Systes Engineering Washington University 11 Urbauer

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Lecture 9 November 23, 2015

Lecture 9 November 23, 2015 CSC244: Discrepancy Theory in Coputer Science Fall 25 Aleksandar Nikolov Lecture 9 Noveber 23, 25 Scribe: Nick Spooner Properties of γ 2 Recall that γ 2 (A) is defined for A R n as follows: γ 2 (A) = in{r(u)

More information

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science A Better Algorith For an Ancient Scheduling Proble David R. Karger Steven J. Phillips Eric Torng Departent of Coputer Science Stanford University Stanford, CA 9435-4 Abstract One of the oldest and siplest

More information

The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters

The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters journal of ultivariate analysis 58, 96106 (1996) article no. 0041 The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Paraeters H. S. Steyn

More information

Chaotic Coupled Map Lattices

Chaotic Coupled Map Lattices Chaotic Coupled Map Lattices Author: Dustin Keys Advisors: Dr. Robert Indik, Dr. Kevin Lin 1 Introduction When a syste of chaotic aps is coupled in a way that allows the to share inforation about each

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

arxiv: v2 [math.co] 3 Dec 2008

arxiv: v2 [math.co] 3 Dec 2008 arxiv:0805.2814v2 [ath.co] 3 Dec 2008 Connectivity of the Unifor Rando Intersection Graph Sion R. Blacburn and Stefanie Gere Departent of Matheatics Royal Holloway, University of London Egha, Surrey TW20

More information

Generalized eigenfunctions and a Borel Theorem on the Sierpinski Gasket.

Generalized eigenfunctions and a Borel Theorem on the Sierpinski Gasket. Generalized eigenfunctions and a Borel Theore on the Sierpinski Gasket. Kasso A. Okoudjou, Luke G. Rogers, and Robert S. Strichartz May 26, 2006 1 Introduction There is a well developed theory (see [5,

More information

Estimating Entropy and Entropy Norm on Data Streams

Estimating Entropy and Entropy Norm on Data Streams Estiating Entropy and Entropy Nor on Data Streas Ait Chakrabarti 1, Khanh Do Ba 1, and S. Muthukrishnan 2 1 Departent of Coputer Science, Dartouth College, Hanover, NH 03755, USA 2 Departent of Coputer

More information

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,

More information

Randomized Recovery for Boolean Compressed Sensing

Randomized Recovery for Boolean Compressed Sensing Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch

More information

The Transactional Nature of Quantum Information

The Transactional Nature of Quantum Information The Transactional Nature of Quantu Inforation Subhash Kak Departent of Coputer Science Oklahoa State University Stillwater, OK 7478 ABSTRACT Inforation, in its counications sense, is a transactional property.

More information

Chapter 6 1-D Continuous Groups

Chapter 6 1-D Continuous Groups Chapter 6 1-D Continuous Groups Continuous groups consist of group eleents labelled by one or ore continuous variables, say a 1, a 2,, a r, where each variable has a well- defined range. This chapter explores:

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

Reed-Muller Codes. m r inductive definition. Later, we shall explain how to construct Reed-Muller codes using the Kronecker product.

Reed-Muller Codes. m r inductive definition. Later, we shall explain how to construct Reed-Muller codes using the Kronecker product. Coding Theory Massoud Malek Reed-Muller Codes An iportant class of linear block codes rich in algebraic and geoetric structure is the class of Reed-Muller codes, which includes the Extended Haing code.

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Note on generating all subsets of a finite set with disjoint unions

Note on generating all subsets of a finite set with disjoint unions Note on generating all subsets of a finite set with disjoint unions David Ellis e-ail: dce27@ca.ac.uk Subitted: Dec 2, 2008; Accepted: May 12, 2009; Published: May 20, 2009 Matheatics Subject Classification:

More information

. The univariate situation. It is well-known for a long tie that denoinators of Pade approxiants can be considered as orthogonal polynoials with respe

. The univariate situation. It is well-known for a long tie that denoinators of Pade approxiants can be considered as orthogonal polynoials with respe PROPERTIES OF MULTIVARIATE HOMOGENEOUS ORTHOGONAL POLYNOMIALS Brahi Benouahane y Annie Cuyt? Keywords Abstract It is well-known that the denoinators of Pade approxiants can be considered as orthogonal

More information

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition

More information

Supplement to: Subsampling Methods for Persistent Homology

Supplement to: Subsampling Methods for Persistent Homology Suppleent to: Subsapling Methods for Persistent Hoology A. Technical results In this section, we present soe technical results that will be used to prove the ain theores. First, we expand the notation

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a ournal published by Elsevier. The attached copy is furnished to the author for internal non-coercial research and education use, including for instruction at the authors institution

More information

arxiv: v3 [cs.sy] 29 Dec 2014

arxiv: v3 [cs.sy] 29 Dec 2014 A QUADRATIC LOWER BOUND FOR THE CONVERGENCE RATE IN THE ONE-DIMENSIONAL HEGSELMANN-KRAUSE BOUNDED CONFIDENCE DYNAMICS EDVIN WEDIN AND PETER HEGARTY Abstract. Let f k n) be the axiu nuber of tie steps taken

More information

New Bounds for Learning Intervals with Implications for Semi-Supervised Learning

New Bounds for Learning Intervals with Implications for Semi-Supervised Learning JMLR: Workshop and Conference Proceedings vol (1) 1 15 New Bounds for Learning Intervals with Iplications for Sei-Supervised Learning David P. Helbold dph@soe.ucsc.edu Departent of Coputer Science, University

More information

Ocean 420 Physical Processes in the Ocean Project 1: Hydrostatic Balance, Advection and Diffusion Answers

Ocean 420 Physical Processes in the Ocean Project 1: Hydrostatic Balance, Advection and Diffusion Answers Ocean 40 Physical Processes in the Ocean Project 1: Hydrostatic Balance, Advection and Diffusion Answers 1. Hydrostatic Balance a) Set all of the levels on one of the coluns to the lowest possible density.

More information

Lean Walsh Transform

Lean Walsh Transform Lean Walsh Transfor Edo Liberty 5th March 007 inforal intro We show an orthogonal atrix A of size d log 4 3 d (α = log 4 3) which is applicable in tie O(d). By applying a rando sign change atrix S to the

More information

INDEPENDENT SETS IN HYPERGRAPHS

INDEPENDENT SETS IN HYPERGRAPHS INDEPENDENT SETS IN HYPERGRAPHS Abstract. Many iportant theores and conjectures in cobinatorics, such as the theore of Szeerédi on arithetic progressions and the Erdős-Stone Theore in extreal graph theory,

More information

Fundamental Limits of Database Alignment

Fundamental Limits of Database Alignment Fundaental Liits of Database Alignent Daniel Cullina Dept of Electrical Engineering Princeton University dcullina@princetonedu Prateek Mittal Dept of Electrical Engineering Princeton University pittal@princetonedu

More information

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition Upper bound on false alar rate for landine detection and classification using syntactic pattern recognition Ahed O. Nasif, Brian L. Mark, Kenneth J. Hintz, and Nathalia Peixoto Dept. of Electrical and

More information

Solutions of some selected problems of Homework 4

Solutions of some selected problems of Homework 4 Solutions of soe selected probles of Hoework 4 Sangchul Lee May 7, 2018 Proble 1 Let there be light A professor has two light bulbs in his garage. When both are burned out, they are replaced, and the next

More information

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution Testing approxiate norality of an estiator using the estiated MSE and bias with an application to the shape paraeter of the generalized Pareto distribution J. Martin van Zyl Abstract In this work the norality

More information

A Theoretical Analysis of a Warm Start Technique

A Theoretical Analysis of a Warm Start Technique A Theoretical Analysis of a War Start Technique Martin A. Zinkevich Yahoo! Labs 701 First Avenue Sunnyvale, CA Abstract Batch gradient descent looks at every data point for every step, which is wasteful

More information

A Bernstein-Markov Theorem for Normed Spaces

A Bernstein-Markov Theorem for Normed Spaces A Bernstein-Markov Theore for Nored Spaces Lawrence A. Harris Departent of Matheatics, University of Kentucky Lexington, Kentucky 40506-0027 Abstract Let X and Y be real nored linear spaces and let φ :

More information

lecture 36: Linear Multistep Mehods: Zero Stability

lecture 36: Linear Multistep Mehods: Zero Stability 95 lecture 36: Linear Multistep Mehods: Zero Stability 5.6 Linear ultistep ethods: zero stability Does consistency iply convergence for linear ultistep ethods? This is always the case for one-step ethods,

More information

Exact tensor completion with sum-of-squares

Exact tensor completion with sum-of-squares Proceedings of Machine Learning Research vol 65:1 54, 2017 30th Annual Conference on Learning Theory Exact tensor copletion with su-of-squares Aaron Potechin Institute for Advanced Study, Princeton David

More information

Improved Guarantees for Agnostic Learning of Disjunctions

Improved Guarantees for Agnostic Learning of Disjunctions Iproved Guarantees for Agnostic Learning of Disjunctions Pranjal Awasthi Carnegie Mellon University pawasthi@cs.cu.edu Avri Blu Carnegie Mellon University avri@cs.cu.edu Or Sheffet Carnegie Mellon University

More information

On the Use of A Priori Information for Sparse Signal Approximations

On the Use of A Priori Information for Sparse Signal Approximations ITS TECHNICAL REPORT NO. 3/4 On the Use of A Priori Inforation for Sparse Signal Approxiations Oscar Divorra Escoda, Lorenzo Granai and Pierre Vandergheynst Signal Processing Institute ITS) Ecole Polytechnique

More information

Characterization of the Line Complexity of Cellular Automata Generated by Polynomial Transition Rules. Bertrand Stone

Characterization of the Line Complexity of Cellular Automata Generated by Polynomial Transition Rules. Bertrand Stone Characterization of the Line Coplexity of Cellular Autoata Generated by Polynoial Transition Rules Bertrand Stone Abstract Cellular autoata are discrete dynaical systes which consist of changing patterns

More information

Tail estimates for norms of sums of log-concave random vectors

Tail estimates for norms of sums of log-concave random vectors Tail estiates for nors of sus of log-concave rando vectors Rados law Adaczak Rafa l Lata la Alexander E. Litvak Alain Pajor Nicole Toczak-Jaegerann Abstract We establish new tail estiates for order statistics

More information

Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5,

Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5, Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5, 2015 31 11 Motif Finding Sources for this section: Rouchka, 1997, A Brief Overview of Gibbs Sapling. J. Buhler, M. Topa:

More information

Statistics and Probability Letters

Statistics and Probability Letters Statistics and Probability Letters 79 2009 223 233 Contents lists available at ScienceDirect Statistics and Probability Letters journal hoepage: www.elsevier.co/locate/stapro A CLT for a one-diensional

More information

A := A i : {A i } S. is an algebra. The same object is obtained when the union in required to be disjoint.

A := A i : {A i } S. is an algebra. The same object is obtained when the union in required to be disjoint. 59 6. ABSTRACT MEASURE THEORY Having developed the Lebesgue integral with respect to the general easures, we now have a general concept with few specific exaples to actually test it on. Indeed, so far

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

On Conditions for Linearity of Optimal Estimation

On Conditions for Linearity of Optimal Estimation On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at

More information

OBJECTIVES INTRODUCTION

OBJECTIVES INTRODUCTION M7 Chapter 3 Section 1 OBJECTIVES Suarize data using easures of central tendency, such as the ean, edian, ode, and idrange. Describe data using the easures of variation, such as the range, variance, and

More information

A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics, EPFL, Lausanne Phone: Fax:

A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics, EPFL, Lausanne Phone: Fax: A general forulation of the cross-nested logit odel Michel Bierlaire, EPFL Conference paper STRC 2001 Session: Choices A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics,

More information

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon Mohaad Ghavazadeh Alessandro Lazaric INRIA Lille - Nord Europe, Tea SequeL {victor.gabillon,ohaad.ghavazadeh,alessandro.lazaric}@inria.fr

More information

Supplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion

Supplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion Suppleentary Material for Fast and Provable Algoriths for Spectrally Sparse Signal Reconstruction via Low-Ran Hanel Matrix Copletion Jian-Feng Cai Tianing Wang Ke Wei March 1, 017 Abstract We establish

More information