arxiv: v1 [cs.ds] 17 Mar 2016

Size: px
Start display at page:

Download "arxiv: v1 [cs.ds] 17 Mar 2016"

Transcription

1 Tight Bounds for Single-Pass Streaing Coplexity of the Set Cover Proble Sepehr Assadi Sanjeev Khanna Yang Li Abstract arxiv: v1 [cs.ds] 17 Mar 2016 We resolve the space coplexity of single-pass streaing algoriths for approxiating the classic set cover proble. For finding an α-approxiate set cover (for any α = o( n using a single-pass streaing algorith, we show that Θ(n/α space is both sufficient and necessary (up to an O(log n factor; here denotes nuber of the sets and n denotes size of the universe. This provides a strong negative answer to the open question posed by Indyk et al. [17] regarding the possibility of having a single-pass algorith with a sall approxiation factor that uses sub-linear space. We further study the proble of estiating the size of a iniu set cover (as opposed to finding the actual sets, and establish that an additional factor of α saving in the space is achievable in this case and that this is the best possible. In other words, we show that Θ(n/α 2 space is both sufficient and necessary (up to logarithic factors for estiating the size of a iniu set cover to within a factor of α. Our algorith in fact works for the ore general proble of estiating the optial value of a covering integer progra. On the other hand, our lower bound holds even for set cover instances where the sets are presented in a rando order. Departent of Coputer and Inforation Science, University of Pennsylvania. Supported in part by National Science Foundation grants CCF , CCF , and IIS and an Adobe research award. Eail: {sassadi,sanjeev,yangli2}@cis.upenn.edu.

2 1 Introduction The set cover proble is a fundaental optiization proble with any applications in coputer science and related disciplines. The input is a universe [n] and a collection of subsets of [n], S = S 1,..., S, and the goal is to find a subset of S with the sallest cardinality that covers [n], i.e., whose union is [n]; we call such a collection of sets a iniu set cover and throughout the paper denote its cardinality by opt := opt(s. The set cover proble can be forulated in the well-established streaing odel [1, 23], whereby the sets in S are presented one by one in a strea and the goal is to solve the set cover proble using a space-efficient algorith. The streaing setting for the set cover proble has been studied in several recent work, including [7, 10, 12, 17, 27]. We refer the interested reader to these references for any applications of the set cover proble in the streaing odel. In this paper, we focus on algoriths that ake only one pass over the strea (i.e., single-pass streaing algoriths, and our goal is to settle the space coplexity of single-pass streaing algoriths that approxiate the set cover proble. Two versions of the set cover proble are considered in this paper: (i coputing a iniu set cover, and (ii coputing the size of a iniu set cover. Forally, Definition 1 (α-approxiation. An algorith A is said to α-approxiate the set cover proble iff on every input instance S, A outputs a collection of (the indices of at ost α opt sets that covers [n], along with a certificate of covering which, for each eleent e [n], specifies the set used for covering e. If A is a randoized algorith, we require that the certificate corresponds to a valid set cover w.p. 1 at least 2/3. We reark that the requireent of returning a certificate of covering is standard in the literature (see, e.g., [7, 12]. Definition 2 (α-estiation. An algorith A is said to α-estiate the set cover proble iff on every input instance S, A outputs an estiate for the cardinality of a iniu set cover in the range [opt, α opt]. If A is a randoized algorith, we require that: ( Pr A(S [opt, α opt] 2/3 1.1 Our Results We resolve the space coplexities of both versions of the set cover proble. Specifically, we show that for any α = o( n/ log n and any = poly(n, There is a deterinistic single-pass streaing algorith that α-approxiates the set cover proble using space Õ(n/α bits and oreover, any single-pass streaing algorith (possibly randoized that α-approxiates the set cover proble ust use Ω(n/α bits of space. There is a randoized single-pass streaing algorith that α-estiates the set cover proble using space Õ(n/α 2 bits and oreover, any single-pass streaing algorith (possibly randoized that α-estiates the set cover proble ust use Ω(n/α 2 bits of space. We should point out right away that in this paper, we are not concerned with poly-tie coputability, though, our algoriths for set cover can be ade coputationally efficient for any α log n by allowing an extra log n factor in the space requireent 2. 1 Throughout, we use w.p. and w.h.p. to abbreviate with probability and with high probability, respectively. 2 Set cover adits a classic log n-approxiation algorith [19, 28], and unless P = NP, there is no polynoial tie α-approxiation for the set cover proble for α < (1 ǫ log n (for any constant ǫ > 0 [11, 13, 22]. 1

3 We establish our upper bound result for α-estiation for a uch ore general proble: estiating the optial value of a covering integer linear progra (see Section 4 for a foral definition. Moreover, the space lower bound for α-estiation (for the original set cover proble holds even if the sets are presented in a rando order. We now describe each of these two sets of results in ore details. Approxiating Set Cover. There is a very siple deterinistic α-approxiation algorith for the set cover proble using space Õ(n/α bits which we ention in Section 1.2 for copleteness. Perhaps surprisingly, we establish that this siple algorith is essentially the best possible; any α-approxiation algorith for the set cover proble requires Ω(n/α bits of space (see Theore 1 for a foral stateent. Prior to our work, the best known lower bounds for single-pass streas ruled out (3/2 ǫ-approxiation using o(n space [17] (see also [15], o( n-approxiation in o( space [7, 12], and O(1-approxiation in o(n space [10] (only for deterinistic algoriths; see Section 1.3 for ore detail on different existing lower bounds. Note that these lower bound results leave open the possibility of a single-pass randoized 3/2-approxiation or even a deterinistic O(log n-approxiation algorith for the set cover proble using only Õ( space. Our result on the other hand, rules out the possibility of having any non-trivial tradeoff between the approxiation factor and the space requireent, answering an open question raised by Indyk et al. [17] in the strongest sense possible. We should also point out that the bound of α = o( n/ log n in our lower bound is tight up to an O(log n factor since an O( n-approxiation is known to be achievable in Õ(n space (essentially independent of for = poly(n [7, 12]. Estiating Set Cover Size. We present an Õ(n/α 2 space algorith for α-estiating the set cover proble, and in general any covering integer progra (see Theore 3 for a foral stateent. Our upper bound suggests that if one is only interested in α-estiating the size of a iniu set cover (instead of knowing the actual sets, then an additional α factor saving in the space (copare to the best possible α-approxiation algorith is possible. To the best of our knowledge, this is the first non-trivial gap between the space coplexity of α-approxiation and α-estiation for the set cover proble. We further show that the space coplexity of our Õ(n/α 2 space α-estiation algorith for the set cover proble is essentially tight (up to logarithic factors. In other words, any α-estiation algorith for set cover (possibly randoized requires Ω(n/α 2 space (see Theore 4 for a foral stateent. There are exaples of classic optiization probles in the streaing literature for which size estiation sees to be distinctly easier in the rando arrival streas 3 copare to the adversarial streas (see, e.g., [20]. However, we show that this is not the case for the set cover proble, i.e., the lower bound of Ω(n/α 2 for α-estiation continues to hold even for rando arrival streas. We note in passing two other results also: (i our bounds for α-approxiation and α- estiation also prove tight bounds on the one-way counication coplexity of the two-player counication odel of the set cover proble (see Theore 2 and Theore 5, previously studied in [7, 10, 24]; (ii the use of randoization in our α-estiation algorith is inevitable: any deterinistic α-estiation algorith for the set cover requires Ω(n/α bits of space (see Theore 6. 3 In rando arrival streas, the input (in our case, the collection of sets is randoly peruted before being presented to the algorith 2

4 1.2 Our Techniques Upper bounds. An α-approxiation using Õ(n/α bits of space can be siply achieved as follows. Merge (i.e., take the union of every α sets in the strea into a single set; at the end of the strea, solve the set cover proble over the erged sets. To recover a certificate of covering, we also record for each eleent e in each erged set, any set in the erge that covers e. It is an easy exercise to verify that this algorith indeed achieves an α-approxiation and can be ipleented in space Õ(n/α bits. Our Õ(n/α 2 -space α-estiation algorith is ore involved and in fact works for any covering integer progra (henceforth, a covering ILP for short. We follow the line of work in [10] and [17] by perforing diensionality reduction over the sets (in our case, coluns of the constraint atrix A and storing their projection over a randoly sapled subset of the universe (here, constraints during the strea. However, the goal of our constraint sapling approach is entirely different fro the ones in [10,17]. The eleent sapling approach of [10, 17] ais to find a sall cover of the sapled universe which also covers the vast ajority of the eleents in the original universe. This allows the algorith to find a sall set cover of the sapled universe in one pass while reducing the nuber of reaining uncovered eleents for the next pass; hence, applying this approach repeatedly over ultiple passes on the input allows one to obtain a coplete cover. On the other hand, the goal of our constraint sapling approach is to create a saller instance of set cover (in general, covering ILP with the property that the iniu set cover size of the sapled instance is a proxy for the iniu set cover size of the original instance. We crucially use the fact that the algorith does not need to identify the actual cover and hence it can estiate the size of the solution based on the optiu set cover size in the sapled universe. At the core of our approach is a siple yet very general lea, referred to as the constraint sapling lea (Lea 4.1 which ay be of independent interest. Inforally, this lea states that for any covering ILP instance I, the optial value of a sub-sapled instance I R, obtained by picking roughly 1/α fraction of the constraints uniforly at rando fro I, is an α estiator of the optiu value of I whenever no constraint is too hard to satisfy. Nevertheless, the constraint sapling is not enough for reducing the space to eet the desired Õ(n/α 2 bound (see Theore 3. Hence, we cobine it with a pruning step, siilar to the set filtering step of [17] for (unweighted set cover (see also GreedyPass algorith of [7] to sparsify the coluns in the input atrix A before perforing the sapling. We point out that as the variables in I can have different weights in the objective function (e.g. for weighted set cover, our pruning step needs to be sensitive to the weights. Lower bounds. As is typical in the streaing literature, our lower bounds are obtained by establishing counication coplexity lower bounds; in particular, in the one-way two-player counication odel. To prove these bounds, we use the inforation coplexity paradig, which allows one to reduce the proble, via a direct su type arguent, to ultiple instances of a sipler proble. For our lower bound for α-estiation, this sipler proble turned out to be a variant of the well-known Set Disjointness proble. However, for the lower bound of α-approxiation algoriths, we introduce and analyze a new interediate proble, called the Trap proble. The Trap proble is a non-boolean proble defined as follows: Alice is given a set S, Bob is given a set E such that all eleents of E belong to S except for a special eleent e, and the goal of the players is to trap this special eleent, i.e., to find a sall subset of E which contains e. For our purpose, Bob only needs to trap e in a set of cardinality E /2. To prove a lower bound for the Trap proble, we design a novel reduction fro the well-known Index proble, which requires Alice and Bob to use the protocol for the Trap proble over non- 3

5 legal inputs (i.e., the ones for which the Trap proble is not well-defined, while ensuring that they are not being fooled by the output of the Trap protocol over these inputs. To prove our lower bound for α-estiation in rando arrival streas, we follow the approach of [5] in proving the counication coplexity lower bounds when the input data is randoly allocated between the players (as opposed to adversarial partitions. However, the distributions and the proble considered in this paper are different fro the ones in [5]. 1.3 Related Work Counication coplexity of the set cover proble was first studied by Nisan [24]. Aong other things, Nisan showed that the two-player counication coplexity of ( 1 2 ǫ log n- estiating the set cover is Ω(. In particular, this iplies that any constant-pass streaing algorith that ( 1 2 ǫ log n-estiates the set cover ust use Ω( bits of space. Saha and Getoor [27] initiated the study of set cover in the sei-streaing odel [14] where the sets are arriving in a strea and the algoriths are required to use Õ(n space, and obtained an O(log n-approxiation via an O(log n-pass algorith that uses O(n log n space. A siilar perforance was also achieved by [8] in the context of disk friendly algoriths. As designed, the algorith of [8] achieves (1 + β ln n-approxiation by aking O(log β n passes over the strea using O(n log n space. The single-pass sei-streaing setting for set cover was initially and throughly studied by Eek and Rosén [12]. They provided an O( n-approxiation using Õ(n space (which also extends to the weighted set cover proble and a lower bound that states that no seistreaing algorith (i.e., an algorith using only Õ(n space that O(n 1/2 ǫ -estiates set cover exists. Recently, Chakrabarti and Wirth [7] generalized the space bounds in [12] to ultipass algoriths, providing an alost coplete understanding of the pass/approxiation tradeoff for sei-streaing algoriths. In particular, they developed a deterinistic p- pass (p + 1 n 1/(p+1 -approxiation algorith in Õ(n space and prove that any p-pass n 1/(p+1 /(c (p+1 2 -estiation algorith requires Ω(n c /p 3 space for soe constant c > 1 ( in their hard instances is Θ(n cp. This, in particular, iplies that any single-pass o( n- estiation algorith requires Ω( space. Deaine et al. [10] studied the trade-off between the nuber of passes, the approxiation ratio, and the space requireent of general streaing algoriths (i.e., not necessarily seistreaing for the set cover proble and developed an algorith that for any δ = Ω(1/ log n, akes O(4 1/δ passes over the strea and achieves an O(4 1/δ ρ-approxiation using Õ(n δ space; here ρ is the approxiation factor of the off-line algorith for solving the set cover proble. The authors further showed that any constant-pass deterinistic O(1-estiation algorith for the set cover requires Ω(n space. Very recently, Indyk et al. [17] (see also [15] ade a significant iproveent on the trade-off achieved by [10]: they presented an algorith that for any δ > 0, akes O(1/δ passes over the strea and achieves an O(ρ/δ- approxiation using Õ(n δ space. The authors also established two lower bounds: for ulti-pass algoriths, any algorith that coputes an optial set cover solution while aking only( 1 2δ 1 passes ust use Ω(n δ space. More relevant to our paper, they also showed that any single-pass streaing algorith (possibly randoized that can distinguish between the instances with set cover size of 2 and 3 w.h.p., ust use Ω(n bits. Organization. We introduce in Section 2 soe preliinaries needed for the rest of the paper. In Section 3, we present our Ω(n/α space lower bound for coputing an α-approxiate set cover in a single-pass. In Section 4, we present a single-pass streaing algorith for estiating the optial value of a covering integer progra and prove an Õ(n/α 2 upper bound on the space coplexity of α-estiating the (weighted set cover proble. In Section 5, we present our Ω(n/α 2 space lower bound for α-estiating set cover in a single-pass. 4

6 Finally, Section 6 contains our Ω(n/α space lower bound for deterinistic α-estiation algoriths. 2 Preliinaries Notation. We use bold face letters to represent rando variables. For any rando variable X, supp(x denotes its support set. We define X := log supp(x. For any k- diensional tuple X = (X 1,..., X k and any i [k], we define X <i := (X 1,..., X i 1, and X i := (X 1,..., X i 1, X i+1,..., X k. The notation X R U indicates that X is chosen uniforly at rando fro a set U. Finally, we use upper case letters (e.g. M to represent atrices and lower case letter (e.g. v to represent vectors. Concentration Bounds. We use an extension of the Chernoff-Hoeffding bound for negatively correlated rando variables. Rando variables X 1,..., X n are negatively correlated if for every set S [n], Pr( i S X i = 1 i S Pr(X i = 1. It was first proved in [25] that the Chernoff-Hoeffding bound continues to hold for the case of rando variables that satisfy this generalized version of negative correlation (see also [16]. 2.1 Tools fro Inforation Theory We briefly review soe basic concepts fro inforation theory needed for establishing our lower bounds. For a broader introduction to the field, we refer the reader to the excellent text by Cover and Thoas [9]. In the following, we denote the Shannon Entropy of a rando variable A by H(A and the utual inforation of two rando variables A and B by I(A; B = H(A H(A B = H(B H(B A. If the distribution D of the rando variables is not clear fro the context, we use H D (A (resp. I D (A; B. We use H 2 to denote the binary entropy function where for +(1 δ log 1 1 δ. any real nuber 0 < δ < 1, H 2 (δ = δ log 1 δ We use the following basic properties of entropy and utual inforation (proofs can be found in [9], Chapter 2. Clai 2.1. Let A, B, and C be three rando variables H(A A. H(A = A iff A is uniforly distributed over its support. 2. I(A; B 0. The equality holds iff A and B are independent. 3. Conditioning on a rando variable reduces entropy: H(A B, C H(A B. The equality holds iff A and C are independent conditioned on B. 4. Subadditivity of entropy: H(A, B C H(A C+ H(B C. 5. The chain rule for utual inforation: I(A, B; C = I(A; C+ I(B; C A. 6. For any event E independent of A and B, H(A B, E = H(A B. 7. For any event E independent of A, B and C, I(A; B C, E = I(A; B C. The following clai (Fano s inequality states that if a rando variable A can be used to estiate the value of another rando variable B, then A should consue ost of the entropy of B. Clai 2.2 (Fano s inequality. For any binary rando variable B and any (possibly randoized function f that predicts B based on A, if Pr(f(A = B = δ, then H(B A H 2 (δ. 5

7 We also use the following siple clai, which states that conditioning on independent rando variables can only increase the utual inforation. Clai 2.3. For any rando variables A, B, C, and D, if A and D are independent conditioned on C, then I(A; B C I(A; B C, D. Proof. Since A and D are independent conditioned on C, by Clai 2.1-(3, H(A C = H(A C, D and H(A C, B H(A C, B, D. We have, I(A; B C = H(A C H(A C, B = H(A C, D H(A C, B H(A C, D H(A C, B, D = I(A; B C, D 2.2 Counication Coplexity and Inforation Coplexity Counication coplexity and inforation coplexity play an iportant role in our lower bound proofs. We now provide necessary definitions for copleteness. Counication coplexity. Our lowers bounds for single-pass streaing algoriths are established through counication coplexity lower bounds. Here, we briefly provide soe context necessary for our purpose; for a ore detailed treatent of counication coplexity, we refer the reader to the excellent text by Kushilevitz and Nisan [21]. We focus on the two-player one-way counication odel. Let P be a relation with doain X Y Z. Alice receives an input X X and Bob receives Y Y, where (X, Y are chosen fro a joint distribution D over X Y. In addition to private randoness, the players also have an access to a shared public tape of rando bits R. Alice sends a single essage M(X, R and Bob needs to output an answer Z := Z(M(X, R, Y, R such that (X, Y, Z P. We use Π to denote a protocol used by the players. Unless specified otherwise, we always assue that the protocol Π can be randoized (using both public and private randoness, even against a prior distribution D of inputs. For any 0 < δ < 1, we say Π is a δ-error protocol for P over a distribution D, if the probability that for an input (X, Y, Bob outputs soe Z where (X, Y, Z / P is at ost δ (the probability is taken over the randoness of both the distribution and the protocol. Definition 3. The counication cost of a protocol Π for a proble P on an input distribution D, denoted by Π, is the worst-case size of the essage sent fro Alice to Bob in the protocol Π, when the inputs are chosen fro the distribution D. The counication coplexity CC δ D (P of a proble P with respect to a distribution D is the iniu counication cost of a δ-error protocol Π over D. Inforation coplexity. There are several possible definitions of inforation coplexity of a counication proble that have been considered depending on the application (see, e.g., [2 4, 6]. Our definition is tuned specifically for one-way protocols, siilar in the spirit of [3] (see also [18]. Definition 4. Consider an input distribution D and a protocol Π (for soe proble P. Let X be the rando variable for the input of Alice drawn fro D, and let Π := Π(X be the rando variable denoting the essage sent fro Alice to Bob concatenated with the public randoness R used by Π. The inforation cost ICost D (Π of a one-way protocol Π with respect to D is I D (Π; X. The inforation coplexity IC δ D (P of P with respect to a distributiondis the iniu ICost D(Π taken over all one-way δ-error protocols Π for P over D. 6

8 Note that any public coin protocol is a distribution over private coins protocols, run by first using public randoness to saple a rando string R = R and then running the corresponding private coin protocol Π R. We also use Π R to denote the rando variable of the essage sent fro Alice to Bob, assuing that the public randoness is R = R. We have the following well-known clai. Clai 2.4. For any distribution D and any protocol ] Π, let R denote the public randoness used in Π; then, ICost D (Π = E R R [I D (Π R ; X R = R. Proof. Let Π = (M, R, where M denotes the essage sent by Alice and R is the public randoness. We have, ICost D (Π = I(Π; X = I(M, R; X = I(R; X+ I(M; X R (the chain rule for utual inforation, Clai 2.1-(5 [ ] = E I D (Π R ; X R = R R R (M = Π R whenever R = R and I(R; X = 0 by Clai 2.1-(2 The following well-known proposition (see, e.g., [6] relates counication coplexity and inforation coplexity. Proposition 2.5. For every 0 < δ < 1 and every distribution D: CC δ D (P ICδ D (P. Proof. Let Π be a protocol with the iniu counication coplexity for P on D and R denotes the public randoness of Π; using Clai 2.4, we can write, [ IC δ D (P = E R R [ Π R E R R ] I D (Π R ; X R = R ] Π = CC δ D (P [ ] E H D (Π R R = R R R 3 An Ω(n/α-Space Lower bound for α-approxiate Set Cover In this section, we prove that the siple α-approxiation algorith described in Section 1.2 is in fact optial in ters of the space requireent. Forally, n Theore 1. For any α = o( log n and = poly(n, any randoized single-pass streaing algorith that α-approxiates the set cover proble with probability at least 2/3 requires Ω(n/α bits of space. Fix a (sufficiently large value for n, = poly(n (also = Ω(α log n, and α = o( log n ; throughout this section, SetCover apx refers to the proble of α-approxiating the set cover proble for instances with +1 sets 4 defined over the universe [n] in the one-way counication odel, whereby the sets are partitioned between Alice and Bob. 4 To siplify the exposition, we use +1 instead of as the nuber of sets. n 7

9 Overview. We design a hard input distribution D apx for SetCover apx, whereby Alice is provided with a collection of sets S 1,..., S, each of size (roughly n/α and Bob is given a single set T of size (roughly n 2α. The input to the players are correlated such that there exists a set S i in Alice s collection (i is unknown to Alice, such that S i T covers all eleents in [n] except for a single special eleent. This in particular ensures that the optial set cover size in this distribution is at ost 3 w.h.p. On the other hand, we hide this special eleent aong the 2α eleents in T in a way that if Bob does not have (essentially full inforation about Alice s collection, he cannot even identify a set of α eleents fro T that contain this special eleent (w.p strictly ore than half. This iplies that in order for Bob to be sure that he returns a valid set cover, he should additionally cover a ajority of T with sets other than S i. We design the distribution in a way that the sets in Alice s collection are far fro each other and hence Bob is forced to use a distinct set for (roughly each eleent in T that he needs to cover with sets other than S i. This iplies that Bob needs to output a set cover of size α (i.e., an (α/3-approxiation to ensure that every eleent in [n] is covered. 3.1 A Hard Input Distribution for SetCover apx Consider the following distribution. Distribution D apx. A hard input distribution for SetCover apx. Notation. LetF be the collection of all subsets of[n] with cardinality n 10α, andl := 2α log. Alice. The input of Alice is a collection of sets S = (S 1,..., S, where for any i [], S i is a set chosen independently and uniforly at rando fro F. Bob. Pick an i [] (called the special index uniforly at rando; the input to Bob is a set T = [n]\e, where E is chosen uniforly at rando fro all subsets of [n] with E = l and E\S i = 1. a a Since α = o( n/ log n and = poly(n, the size of E is strictly saller than the size of S i. The clais below suarize soe useful properties of the distribution D apx. Clai 3.1. For any instance (S, T D apx, with probability 1 o(1, opt(s, T 3. Proof. Let e denote the eleent in E\S i. S i contains 1 rando subsets of [n] of size n/10α, drawn independent of the choice of e. Therefore, each set in S i covers e with probability 1/10α. The probability that none of these 1 sets covers e is at ost (1 1/10α 1 (1 1/10α Ω(α log n exp( Ω(α log n/10α = o(1 Hence, with probability 1 o(1, there is at least one set S S i that covers e. Now, it is straightforward to verify that (S i, T, S for a valid set cover. Lea 3.2. With probability 1 o(1, no collection of 3α sets fro S i covers ore than l/2 eleents of E. Proof. Recall that the sets in S i and the set E are chosen independent of each other. For each set S S i and for each eleent e E, we define an indicator binary rando variable X e, where X e = 1 iff e S. Let X := e X e, which is the nuber of eleents in E covered by S. We have, E[X] = E[X e ] = E e 10α = log 5 8

10 Moreover, the variables X e are negatively correlated since for any set S E, Pr ( e S X e = 1 = ( n S n 10α S ( n n 10α Hence, by the extended Chernoff bound (see Section 2, ( Pr X log = o( 1 3 = ( n ( n 10α 10α 1... ( n 10α S + 1 (n (n 1...(n S +1 ( 1 S = 10α Pr(X e = 1 e S Therefore, using union bound over all 1 sets in S i, with probability 1 o(1, no set in S i covers ore than log /3 eleents in E, which iplies that any collection of 3α sets can only cover up to 3α log /3 = l/2 eleents in E. 3.2 The Lower Bound for the Distribution D apx In order to prove our lower bound for SetCover apx on D apx, we define an interediate counication proble which we call the Trap proble. Proble 1 (Trap proble. Alice is given a set S [n] and Bob is given a set E [n] such that E\S = {e }; Bob needs to output a set L E with L E /2 such that e L. In the following, we use Trap to refer to the trap proble with S = n/10α and E = l = 2α log (notice the siilarity to the paraeters in D apx. We define the following distribution D Trap for Trap. Alice is given a set S R F (recall that F is the collection of all subsets of [n] of size n/10α and Bob is given a set E chosen uniforly at rando fro all sets that satisfy E\S = 1 and E = 2α log. We first use a direct su style arguent to prove that under the distributions D apx and D Trap, inforation coplexity of solving SetCover apx is essentially equivalent to solving copies of Trap. Forally, Lea 3.3. For any constant δ < 1/2, IC δ D apx (SetCover apx IC δ+o(1 D Trap (Trap. Proof. Let Π SC be a δ-error protocol for SetCover apx ; we design a δ -error protocol Π Trap for solving Trap over D Trap with paraeter δ = δ+o(1 such that the inforation cost of Π Trap on D Trap is at ost 1 ICost D apx (Π SC. The protocol Π Trap is as follows. Protocol Π Trap. The protocol for solving Trap using a protocol Π SC for SetCover apx. Input: An instance (S, E D Trap. Output: A set L with L E /2, such that e L. 1. Using public randoness, the players saple an index i [] uniforly at rando. 2. Alice creates a tuple S = (S 1,..., S by assigning S i = S and sapling each reaining set uniforly at rando fro F using private randoness. Bob creates a set T := E. 3. The players run the protocol Π SC over the input (S, T. 4. Bob coputes the set L of all eleents in E = T whose certificate (i.e., the set used to cover the is not S i, and outputs L. 9

11 We first argue the correctness of Π Trap and then bound its inforation cost. To argue the correctness, notice that the distribution of instances of SetCover apx constructed in the reduction is exactly D apx. Consequently, it follows fro Clai 3.1 that, with probability 1 o(1, any α-approxiate set cover can have at ost 3α sets. Let Ŝ be the set cover coputed by Bob inus the sets S i and T. As e E = T and oreover is not in S i, it follows that e should be covered by soe set in Ŝ. This eans that the set L that is output by Bob contains e. Moreover, by Lea 3.2, the nuber of eleents in E covered by the sets in Ŝ is at ost l/2 w.p. 1 o(1. Hence, L l/2 = E /2. This iplies that: ( ( Pr Π Trap errs Pr Π SC errs + o(1 δ+o(1 D Trap D apx We now bound the inforation cost of Π Trap. Let I be the rando variable for the choice of i [] in the protocol Π Trap (which is unifor in []. Using Clai 2.4, we have, [ ( ICost DTrap (Π Trap = E I DTrap i I = 1 Π i Trap ; S I = i ] = 1 I Dapx (Π SC ; S i I = i = 1 I DTrap (Π SC ; S i I = i I Dapx ( Π SC ; S i where the last two equalities hold since (i the joint distribution of Π SC and S i conditioned on I = i under D Trap is equivalent to the one under D apx, and (ii the rando variables Π SC and S i are independent of the event I = i (by the definition of D apx and hence we can drop the conditioning on this event (by Clai 2.1-(7. We can further derive, ICost DTrap (Π Trap = 1 ( I Dapx Π SC ; S i 1 I Dapx ( Π SC ; S i S <i The inequality holds since S i and S <i are independent and conditioning on independent variables can only increase the utual inforation (i.e., Clai 2.3. Finally, ICost DTrap (Π Trap 1 ( I Dapx Π SC ; S i S <i = 1 I D apx (Π SC ;S = 1 ICost D apx (Π SC where the first equality is by the chain rule for utual inforation (see Clai 2.1-(5. Having established Lea 3.3, our task now is to lower bound the inforation coplexity of Trap over the distribution D Trap. We prove this lower bound using a novel reduction fro the well-known Index proble, denoted by Index n k. In Indexn k over the distribution D Index, Alice is given a set A [n] of size k chosen uniforly at rando and Bob is given an eleent a such that w.p. 1/2 a R A and w.p. 1/2 a R [n]\ A; Bob needs to deterine whether a A (the YES case or not (the NO case. We reark that siilar distributions for Index n k have been previously studied in the literature (see, e.g., [26], Section 3.3. For the sake of copleteness, we provide a self-contained proof of the following lea in Appendix A. Lea 3.4. For any k < n/2, and any constant δ < 1/2, IC δ D Index (Index n k = Ω(k. Using Lea 3.4, we prove the following lea, which is the key part of the proof. Lea 3.5. For any constant δ < 1/2, IC δ D Trap (Trap = Ω(n/α. 10

12 Proof. Let k = n/10α; we design a δ -error protocol Π Index for Index n k protocol Π Trap (over D Trap as a subroutine, for soe constant δ < 1/2. using any δ-error Protocol Π Index. The protocol for reducing Index n k to Trap. Input: An instance (A, a D Index. Output: YES if a A and NO otherwise. 1. Alice picks a set B A with B = l 1 uniforly at rando using private randoness. 2. To invoke the protocol Π Trap, Alice creates a set S := A and sends the essage Π Trap (S, along with the set B to Bob. 3. If a B, Bob outputs YES and terinates the protocol. 4. Otherwise, Bob constructs a set E = B {a} and coputes L := Π Trap (S, E using the essage received fro Alice. 5. If a L, Bob outputs NO, and otherwise outputs YES. We should note right away that the distribution of instances for Trap defined in the previous reduction does not atch D Trap. Therefore, we need a ore careful arguent to establish the correctness of the reduction. We prove this lea in two clais; the first clai establishes the correctness of the reduction and the second one proves an upper bound on the inforation cost of Π Index based on the inforation cost of Π Trap. Clai 3.6. Π Index is a δ -error protocol for Index n k over D Index for the paraeter k = n/10α and a constant δ < 1/2. Proof. Let R denote the private coins used by Alice to construct the set B. Also, definedindex Y (resp. DIndex N as the distribution of YES instances (resp. NO instances of D Index. We have, ( Pr Π Index errs = 1 ( D Index,R 2 Pr Π Index errs + 1 ( DIndex Y,R 2 Pr Π Index errs DIndex N,R Note that we do not consider the randoness of the protocol Π Trap (used in construction of Π Index as it is independent of the randoness of the distribution D Index and the private coins R. We now bound each ter in Equation (1 separately. We first start with the easier case which is the second ter. The distribution of instances (S, E for Trap created in the reduction by the choice of (A, a D N Index and the randoness of R, is the sae as the distribution D Trap. Moreover, in this case, the output of Π Index would be wrong iff a E\S (corresponding to the eleent e in Trap does not belong to the set L output by Π Trap. Hence, ( ( Pr Π Index errs = Pr Π Trap errs δ (2 DIndex N,R D Trap We now bound the first ter in Equation (1. Note that when (A, a DIndex Y, there is a sall chance that Π Index is lucky" and a belongs to the set B (see Line (3 of the protocol. Let this event be E. Conditioned on E, Bob outputs the correct answer with probability 1; however note that probability of E happening is only o(1. Now suppose E does not happen. In this case, the distribution of instances (S, E created by the choice of (A, a DIndex Y (and randoness of R does not atch the distribution D Trap. However, we have the following (1 11

13 iportant property: Given that (S, E is the instance of Trap created by choosing (A, a fro DIndex Y and sapling l 1 rando eleents of A (using R, the eleent a is unifor over the set E. In other words, knowing (S, E does not reveal any inforation about the eleent a. Note that since (S, E is not chosen according to the distribution D Trap (actually it is not even a legal input for Trap, it is possible that Π Trap terinates, outputs a non-valid set, or outputs a set L E. Unless L E (and satisfies the cardinality constraint, Bob is always able to deterine that Π Trap is not functioning correctly and hence outputs YES (and errs with probability at ost δ < 1/2. However, if L E, Bob would not know whether the input to Π Trap is legal or not. In the following, we explicitly analyze this case. In this case, L is a subset of E chosen by the (inner randoness of Π Trap for a fixed S and E and oreover L E /2 (by definition of Trap. The probability that Π Index errs in this case is exactly equal to the probability that a L. However, as stated before, for a fixed (S, E, the choice of L is independent of the choice of a and oreover, a is unifor over E; hence a L happens with probability at ost 1/2. Forally, (here, R Trap denotes the inner randoness of Π Trap ( Pr (Π Index errs E = Pr a L = Π Trap (S, E E DIndex Y,R DIndex Y,R ] = E E Pr a L S = S, E = E, R Trap = R Trap,E [ ( (S,E (S,E E R Trap R Trap DIndex Y,R = E (S,E (S,E E (L = Π Trap (S, E is a fixed set conditioned on (S, E, R Trap [ L ] E R Trap R Trap E (a is unifor on E conditioned on (S, E, R Trap and E Hence, we have, Pr D Y Index,R (Π Index errs E 1 2, since by definition, for any output set L, L E /2. As stated earlier, whenever E happens, Π Index akes no error; hence, Pr (Π Index errs = Pr (E Pr (Π Index errs E 1 o(1 DIndex Y,R DIndex Y,R DIndex Y,R 2 (3 Finally, by plugging the bounds in Equations (2,3 in Equation (1 and assuing δ is bounded away fro 1/2, we have, Pr (Π Index errs 1 D Index,R 2 1 o( δ = 1 o(1 + δ ǫ for soe constant ǫ bounded away fro 0. We now bound the inforation cost of Π Index under D Index. Clai 3.7. ICost DIndex (Π Index ICost DTrap (Π Trap + O(l log n. Proof. We have, ICost DIndex (Π Index = I DIndex (Π Index (A; A = I DIndex (Π Trap (S, B; A ( = I DIndex (Π Trap (S; A + I DIndex B; A Π Trap (S (the chain rule for utual inforation, Clai 2.1-(5 ( I DIndex (Π Trap (S; A + H DIndex B Π Trap (S 12

14 I DIndex (Π Trap (S; A + O(l log n ( B = O(l log n and Clai 2.1-(1 = I DIndex (Π Trap (S; S + O(l log n (A = S as defined in Π Index = I DTrap (Π Trap (S; S + O(l log n (the joint distribution of (Π Trap (S, S is identical under D Index and D Trap = ICost DTrap (Π Trap +O(l log n The lower bound now follows fro Clais 3.6 and 3.7, and Lea 3.4 for the paraeters k = S = n 10α and δ < 1/2, and using the fact that α = o( n/ log n, l = 2α log, and = poly(n, and hence Ω(n/α = ω(l log n. To conclude, by Lea 3.3 and Lea 3.5, for any set of paraeters δ < 1/2, α = o( and = poly(n, ( IC δ D apx (SetCover apx Ω(n/α = Ω(n/α n log n, Since the inforation coplexity is a lower bound on the counication coplexity (Proposition 2.5, we have, Theore 2. For any constant δ < 1/2, α = o( n log n, and = poly(n, CC δ D apx (SetCover apx = Ω(n/α Finally, since one-way counication coplexity is also a lower bound on the space coplexity of single-pass streaing algoriths, we obtain Theore 1 as a corollary of Theore 2. 4 An Õ(n/α 2 -Space Upper Bound for α-estiation In this section, we show that if we are only interested in estiating the size of a iniu set cover (instead of finding the actual sets, we can bypass the Ω(n/α lower bound established in Section 3. In fact, we prove this upper bound for the ore general proble of estiating the optial solution of a covering integer progra (henceforth, covering ILP in the streaing setting. A covering ILP can be forally defined as follows. in c x s.t. Ax b where A is a atrix with diension n, b is a vector of diension n, c is a vector of diension, and x is an -diensional vector of non-negative integer variables. Moreover, all coefficients in A, b, and c are also non-negative integers. We denote this linear progra by ILP Cover (A, b, c. We use a ax, b ax, and c ax, to denote the largest entry of, respectively, the atrix A, the vector b, and the vector c. Finally, we define the optial value of the I := ILP Cover (A, b, c as c x where x is the optial solution to I, and denote it by opt := opt(i. We consider the following streaing setting for covering ILPs. The input to a streaing algorith for an instance I := ILP Cover (A, b, c is the n-diensional vector b, and a strea of the coluns of A presented one by one, where the i-th colun of A, A i, is presented along with the i-th entry of c, denoted by c i (we will refer to c i as the weight of the i-th colun. It is easy to see that this streaing setting for covering ILPs captures, as special cases, the set cover proble, the weighted set cover proble, and the set ulti-cover proble. We prove the following theore for α-estiating the optial value of a covering ILPs in the streaing setting. 13

15 Theore 3. There is a randoized algorith that given a paraeter α 1, for any instance I := ILP Cover (A, b, c with poly(n-bounded entries, akes a single pass over a strea of coluns of A (presented in an arbitrary order, and outputs an α-estiation to opt(i w.h.p. using space Õ ( (n/α 2 b ax + +nb ax bits. In particular, for the weighted set cover proble with poly(n bounded weights and α n, the space coplexity of this algorith is Õ(n/α 2 + n. 5 To prove Theore 3, we design a general approach based on sapling constraints of a covering ILP instance. The goal is to show that if we saple (roughly 1/α fraction of the constraints fro an instance I := ILP Cover (A, b, c, then the optiu value of the resulting covering ILP, denoted by I R, is a good estiator of opt(i. Note that in general, this ay not be the case; siply consider a weighted set cover instance that contains an eleent e which is only covered by a singleton set of weight W (for W and all the reaining sets are of weight 1 only. Clearly, opt(i R opt(i as long as e is not sapled in I R, which happens w.p. 1 1/α. To circuvent this issue, we define a notion of cost for covering ILPs which, inforally, is the iniu value of the objective function if the goal is to only satisfy a single constraint (in the above exaple, the cost of that weighted set cover instance is W. This allows us to bound the loss incurred in the process of estiation by sapling based on the cost of the covering ILP. Constraint sapling alone can only reduce the space requireent by a factor of α, which is not enough to eet the bounds given in Theore 3. Hence, we cobine it with a pruning step to sparsify the coluns in A before perforing the sapling. We should point out that as coluns are weighted, the pruning step needs to be sensitive to the weights. In the rest of this section, we first introduce our constraint sapling lea (Lea 4.1 and prove its correctness, and then provide our algorith for Theore Covering ILPs and Constraint Sapling Lea In this section, we provide a general result for estiating the optial value of a Covering ILP using a sapling based approach. For a vector v, we will use v i to denote the i-th diension of v. For a atrix A, we will use A i to denote the i-th colun of A, and use a j,i to denote the entry of A at the i-th colun and the j-th row (to atch the notation with the set cover proble, we use a j,i instead of the standard notation a i,j. For each constraint j [n] (i.e., the j-th constraint of a covering ILP instance I := ILP Cover (A, b, c, we define the cost of the constraint j, denoted by Cost(j, as, Cost(j := in x c x s.t a j,i x i b j which is the iniu solution value of the objective function for satisfying the constraint j. Furtherore, the cost of I, denoted by Cost(I, is defined to be Clearly, Cost(I is a lower bound on opt(i. Cost(I := ax j [n] Cost(j Constraint Sapling. Given any instance of covering ILP I := ILP Cover (A, b, c, let I R be a covering ILP instance ILP Cover (A, b, c obtained by setting b j := b j with probability p, and bj := 0 with probability 1 p, for each diension j [n] of b independently. Note that setting 5 Note that Ω(n space is necessary to even deterine whether or not a given instance is feasible. 14

16 bj := 0 in I R is equivalent to reoving the j-th constraint fro I, since all entries in I are non-negative. Therefore, intuitively, I R is a covering ILP obtained by sapling (and keeping the constraints of I with a sapling rate of p. We establish the following lea which asserts that opt(i R is a good estiator of opt(i (under certain conditions. As opt(i R opt(i trivially holds (reoving constraints can only decrease the optial value, it suffices to give a lower bound on opt(i R. Lea 4.1 (Constraint Sapling Lea. Fix an α 32 ln n; for any covering ILP I with n constraints, suppose I R is obtained fro I by sapling each constraint with probability p := 4 ln n α ; then ( Pr opt(i R +Cost(I opt(i 3 8α 4 Proof. Suppose by contradiction that the lea stateent is false and throughout the proof let I be any instance where w.p. at least 1/4, opt(i R +Cost(I < opt(i 8α (we denote this event by E 1 (I R, or shortly E 1. We will show that in this case, I has a feasible solution with a value saller than opt(i. To continue, define E 2 (I R (or E 2 in short as the event that opt(i R < opt(i 8α. Note that whenever E 1 happens, then E 2 also happens, hence E 2 happens w.p. at least 1/4. For the sake of analysis, suppose we repeat, for 32α ties, the procedure of sapling each constraint of I independently with probability p, and obtain 32α covering ILP instances S := { IR 1 },...,I 32α R. Since E2 happens with probability at least 1/4 on each instance I R, the expected nuber of ties that E 2 happens for instances in S is at least 8α > 12 ln n. Hence, by the Chernoff bound, with probability at least 1 1/n, E 2 happens on at least 4α of instances in S. Let T S be a set of 4α sapled instances for which E 2 happens. In the following, we show that if I has the property that Pr(E 1 (I R 1/4, then w.p. at least 1 1/n, every constraint in I appears in at least one of the instances in T. Since each of these 4α instances adits a solution of value at ost opt(i 8α (by the definition of E 2, the ax of their solutions, i.e., the vector obtained by setting the i-th entry to be the largest value of x i aong all these solutions, gives a feasible solution to I with value at ost 4α opt(i 8α = opt(i 2 ; a contradiction. We use j I R to denote the event that the constraint j of I is sapled in I R, and we need to show that w.h.p. for all j, there exists an instance I R T where j I R. We establish the following clai. ( Clai 4.2. For any j [n], Pr j I R E 2 (I R ln n 2α. Before proving Clai 4.2, we show how this clai would iply the lea. By Clai 4.2, for each of the 4α instances I R T, and for any j [n], the probability that the constraint j is sapled in I R is at least ln n 2α. Then, the probability that j is sapled in none of the 4α instances of T is at ost: ( 1 ln n 4α 1 exp( 2 ln n = 2α n 2 Hence, by union bound, w.p. at least 1 1/n, every constraint appears in at least one of the instances in T, and this will coplete the proof. It reains to prove Clai 4.2. Proof of Clai 4.2. Fix any j [n]; by Bayes rule, ( Pr (E 2 (I R j I R Pr(j I R Pr j I R E 2 (I R = ( Pr E 2 (I R 15

17 ( Since Pr E 2 (I R 1 and Pr(j I R = p = 4 ln n α, we have, ( Pr j I R E 2 (I R Pr (E 2 (I R j I R 4 ln n α and it suffices to establish a lower bound of 1/8 for Pr (E 2 (I R j I R. Consider the following probabilistic process (for a fixed j [n]: we first reove the constraint j fro I (w.p. 1 and then saple each of the reaining constraints of I w.p. p. Let I R be an instance created by this process. We prove Pr(E 2(I R j I R 1/8 in two steps by first showing that the probability that E 1 happens to I R (i.e., Pr(E 1(I R is at least 1/8, and then use a coupling arguent to prove that Pr(E 2 (I R j I R Pr(E 1 (I R. We first show that Pr(E 1 (I R (which by definition is the probability that opt(i R + Cost(I opt(i 16α is at least 1/8. To see this, note that the probability that E 1 happens to I R is equal to the probability that E 1 happens to I R conditioned on j not being sapled (i.e., Pr(E 1 (I R j / I R. Now, if we expand Pr(E 1 (I R, ( Pr E 1 (I R = Pr(j I R Pr (E 1 (I R j I R + Pr(j / I R Pr (E 1 (I R j / I R ( Pr(j I R +Pr (E 1 (I R j / I R = p+pr E 1 (I R As Pr(E 1 (I R 1/4 and p = 4 ln n α 1/8 (since α 32 ln n, we have, ( 1/4 1/8+Pr E 1 (I R and therefore, Pr(E 1 (I R 1/8. It reains to show that Pr(E 2 (I R j I R Pr(E 1 (I R. To see this, note that conditioned on j I R, the distribution of sapling all constraints other than j is exactly the sae as the distribution of I R. Therefore, for any instance I R drawn fro this distribution, there is a unique instance I R sapled fro the original constraint sapling distribution conditioned on j I R. For any such (I R,I R pair, we have opt(i R opt(i R +Cost(j ( opt(i R +Cost(I since satisfying the constraint j I R requires increasing the value of the objective function in I R by at ost Cost(j. Therefore if opt(i opt R +Cost(I 8α (i.e., E 1 happens to I R, then opt(i R opt 8α (i.e., E 2 happens to I R conditioned on j I R. Hence, ( Pr (E 2 (I R j I R Pr E 1 (I R 1/8 Plugging in this bound in Equation (4, we obtain that Pr(j I R E 2 ln n 2α. 4.2 An α-estiation of Covering ILPs in the Streaing Setting We now prove Theore 3. Throughout this section, for siplicity of exposition, we assue that α 32 ln n (otherwise the space bound in Theore 3 is enough to store the whole input and solve the proble optially, the value of c ax is provided to the algorith, and x is a vector of binary variables, i.e., x {0, 1} (hence covering ILP instances are always referring to covering ILP instances with binary variables; in Section 4.3, we describe how to eliinate the later two assuptions. Algorith overview. For any covering ILP instance I := ILP Cover (A, b, c, our algorith estiates opt := opt(i in two parts running in parallel. In the first part, the goal is siply to (4 16

18 copute Cost(I (see Clai 4.3. For the second part, we design a tester algorith (henceforth, Tester that given any guess k of the value of opt, if k opt, Tester accepts k w.p. 1 and for any k where Cost(I k opt 32α, w.h.p. Tester rejects k. Let K := {2 γ } γ [ log(cax ]; for each k K (in parallel, we run Tester(k. At the end of the strea, the algorith knows Cost(I (using the output of the part one, and hence it can identify aong all guesses that are at least Cost(I, the sallest guess accepted by Tester (denoted by k. On one hand, k opt since for any guess k opt, k Cost(I also (since opt Cost(I and Tester accepts k. On the other hand, k opt 32α w.h.p. since (i if Cost(I opt 32α, k Cost(I opt opt opt 32α and (ii if Cost(I < 32α, the guess 32α will be rejected w.h.p. by Tester. Consequently, 32α k is an O(α-estiation of opt(i. We first show that one can copute the Cost of a covering ILP presented in a strea using a siple dynaic prograing algorith. Clai 4.3. For anyi := ILP Cover (A, b, c presented by a strea of coluns, Cost(I can be coputed in space O(nb ax log c ax bits. Proof. We aintain arrays Cost j [y] + for any j [n] and y [b ax ]: Cost j [y] is the iniu cost for achieving a coverage of y for the j-th constraint. Define Cost j [y] = 0 for y 0. Upon arrival of A i, c i, copute Cost j [y] = in(cost j [y], Cost j [y a j,i ]+c i, for every y fro b j down to 1. We then have Cost(I = ax j Cost j [b j ]. To continue, we need the following notation. For any vector v with diension d and any set S [d], v(s denotes the projection of v onto the diensions indexed by S. For any two vectors u and v, let in(u, v denote a vector w where at the i-th diension: w i = in(u i, v i, i.e., the coordinate-wise iniu. We now provide the aforeentioned Tester algorith. Tester(k: An algorith for testing a guess k of the optial value of a covering ILP. Input: An instance I := ILP Cover (A, b, c presented as a strea A 1, c 1,..., A, c, a paraeter α 32 ln n, and a guess k K. Output: ACCEPT if k opt and REJECT if Cost(I k opt 32α. The answer could be either ACCEPT or REJECT if opt 32α < k < opt. 1. Preprocessing: (i Maintain an n-diensional vector b res b, an -diensional vector c 0, and an n diensional atrix à 0 n. (ii Let V be a subset of [n] obtained by sapling each eleent in [n] independently with probability p := 4 ln n/α. 2. Streaing: when a pair A i, c i arrives: (i If c i > k, directly continue to the next input pair of the strea. Otherwise: (ii Prune step: Let u i := in(b res, A i (the coordinate-wise iniu. If u i 1 n b ax α, update b res b res u i (we say A i, c i is pruned by Tester in this case. Otherwise, assign à i u i (V, and c i c i. 3. At the end of the strea, solve the following covering ILP (denoted by I tester : in c x s.t. Ãx b res (V If opt(i tester is at ost k, ACCEPT; otherwise REJECT. 17

Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem

Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem Sepehr Assadi Department of Computer and Information Science University of Pennsylvania Philadelphia, PA, USA sassadi@cis.upenn.edu

More information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay

More information

Handout 7. and Pr [M(x) = χ L (x) M(x) =? ] = 1.

Handout 7. and Pr [M(x) = χ L (x) M(x) =? ] = 1. Notes on Coplexity Theory Last updated: October, 2005 Jonathan Katz Handout 7 1 More on Randoized Coplexity Classes Reinder: so far we have seen RP,coRP, and BPP. We introduce two ore tie-bounded randoized

More information

Estimating Entropy and Entropy Norm on Data Streams

Estimating Entropy and Entropy Norm on Data Streams Estiating Entropy and Entropy Nor on Data Streas Ait Chakrabarti 1, Khanh Do Ba 1, and S. Muthukrishnan 2 1 Departent of Coputer Science, Dartouth College, Hanover, NH 03755, USA 2 Departent of Coputer

More information

1 Proof of learning bounds

1 Proof of learning bounds COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #4 Scribe: Akshay Mittal February 13, 2013 1 Proof of learning bounds For intuition of the following theore, suppose there exists a

More information

Block designs and statistics

Block designs and statistics Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent

More information

A note on the multiplication of sparse matrices

A note on the multiplication of sparse matrices Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani

More information

On the Inapproximability of Vertex Cover on k-partite k-uniform Hypergraphs

On the Inapproximability of Vertex Cover on k-partite k-uniform Hypergraphs On the Inapproxiability of Vertex Cover on k-partite k-unifor Hypergraphs Venkatesan Guruswai and Rishi Saket Coputer Science Departent Carnegie Mellon University Pittsburgh, PA 1513. Abstract. Coputing

More information

List Scheduling and LPT Oliver Braun (09/05/2017)

List Scheduling and LPT Oliver Braun (09/05/2017) List Scheduling and LPT Oliver Braun (09/05/207) We investigate the classical scheduling proble P ax where a set of n independent jobs has to be processed on 2 parallel and identical processors (achines)

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths

More information

1 Generalization bounds based on Rademacher complexity

1 Generalization bounds based on Rademacher complexity COS 5: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #0 Scribe: Suqi Liu March 07, 08 Last tie we started proving this very general result about how quickly the epirical average converges

More information

Polygonal Designs: Existence and Construction

Polygonal Designs: Existence and Construction Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G

More information

Testing Properties of Collections of Distributions

Testing Properties of Collections of Distributions Testing Properties of Collections of Distributions Reut Levi Dana Ron Ronitt Rubinfeld April 9, 0 Abstract We propose a fraework for studying property testing of collections of distributions, where the

More information

1 Identical Parallel Machines

1 Identical Parallel Machines FB3: Matheatik/Inforatik Dr. Syaantak Das Winter 2017/18 Optiizing under Uncertainty Lecture Notes 3: Scheduling to Miniize Makespan In any standard scheduling proble, we are given a set of jobs J = {j

More information

Tight Information-Theoretic Lower Bounds for Welfare Maximization in Combinatorial Auctions

Tight Information-Theoretic Lower Bounds for Welfare Maximization in Combinatorial Auctions Tight Inforation-Theoretic Lower Bounds for Welfare Maxiization in Cobinatorial Auctions Vahab Mirrokni Jan Vondrák Theory Group, Microsoft Dept of Matheatics Research Princeton University Redond, WA 9805

More information

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,

More information

Bipartite subgraphs and the smallest eigenvalue

Bipartite subgraphs and the smallest eigenvalue Bipartite subgraphs and the sallest eigenvalue Noga Alon Benny Sudaov Abstract Two results dealing with the relation between the sallest eigenvalue of a graph and its bipartite subgraphs are obtained.

More information

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds

More information

Randomized Recovery for Boolean Compressed Sensing

Randomized Recovery for Boolean Compressed Sensing Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch

More information

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate The Siplex Method is Strongly Polynoial for the Markov Decision Proble with a Fixed Discount Rate Yinyu Ye April 20, 2010 Abstract In this note we prove that the classic siplex ethod with the ost-negativereduced-cost

More information

arxiv: v1 [cs.ds] 3 Feb 2014

arxiv: v1 [cs.ds] 3 Feb 2014 arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/

More information

arxiv: v1 [cs.ds] 29 Jan 2012

arxiv: v1 [cs.ds] 29 Jan 2012 A parallel approxiation algorith for ixed packing covering seidefinite progras arxiv:1201.6090v1 [cs.ds] 29 Jan 2012 Rahul Jain National U. Singapore January 28, 2012 Abstract Penghui Yao National U. Singapore

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October

More information

Understanding Machine Learning Solution Manual

Understanding Machine Learning Solution Manual Understanding Machine Learning Solution Manual Written by Alon Gonen Edited by Dana Rubinstein Noveber 17, 2014 2 Gentle Start 1. Given S = ((x i, y i )), define the ultivariate polynoial p S (x) = i []:y

More information

Improved Guarantees for Agnostic Learning of Disjunctions

Improved Guarantees for Agnostic Learning of Disjunctions Iproved Guarantees for Agnostic Learning of Disjunctions Pranjal Awasthi Carnegie Mellon University pawasthi@cs.cu.edu Avri Blu Carnegie Mellon University avri@cs.cu.edu Or Sheffet Carnegie Mellon University

More information

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation

More information

Lean Walsh Transform

Lean Walsh Transform Lean Walsh Transfor Edo Liberty 5th March 007 inforal intro We show an orthogonal atrix A of size d log 4 3 d (α = log 4 3) which is applicable in tie O(d). By applying a rando sign change atrix S to the

More information

A Simple Regression Problem

A Simple Regression Problem A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where

More information

Lecture 21. Interior Point Methods Setup and Algorithm

Lecture 21. Interior Point Methods Setup and Algorithm Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul

More information

Computable Shell Decomposition Bounds

Computable Shell Decomposition Bounds Coputable Shell Decoposition Bounds John Langford TTI-Chicago jcl@cs.cu.edu David McAllester TTI-Chicago dac@autoreason.co Editor: Leslie Pack Kaelbling and David Cohn Abstract Haussler, Kearns, Seung

More information

Supplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion

Supplementary Material for Fast and Provable Algorithms for Spectrally Sparse Signal Reconstruction via Low-Rank Hankel Matrix Completion Suppleentary Material for Fast and Provable Algoriths for Spectrally Sparse Signal Reconstruction via Low-Ran Hanel Matrix Copletion Jian-Feng Cai Tianing Wang Ke Wei March 1, 017 Abstract We establish

More information

Optimal Jamming Over Additive Noise: Vector Source-Channel Case

Optimal Jamming Over Additive Noise: Vector Source-Channel Case Fifty-first Annual Allerton Conference Allerton House, UIUC, Illinois, USA October 2-3, 2013 Optial Jaing Over Additive Noise: Vector Source-Channel Case Erah Akyol and Kenneth Rose Abstract This paper

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

Interactive Markov Models of Evolutionary Algorithms

Interactive Markov Models of Evolutionary Algorithms Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary

More information

OPTIMIZATION in multi-agent networks has attracted

OPTIMIZATION in multi-agent networks has attracted Distributed constrained optiization and consensus in uncertain networks via proxial iniization Kostas Margellos, Alessandro Falsone, Sione Garatti and Maria Prandini arxiv:603.039v3 [ath.oc] 3 May 07 Abstract

More information

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and

More information

On Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40

On Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40 On Poset Merging Peter Chen Guoli Ding Steve Seiden Abstract We consider the follow poset erging proble: Let X and Y be two subsets of a partially ordered set S. Given coplete inforation about the ordering

More information

Reed-Muller codes for random erasures and errors

Reed-Muller codes for random erasures and errors Reed-Muller codes for rando erasures and errors Eanuel Abbe Air Shpilka Avi Wigderson Abstract This paper studies the paraeters for which Reed-Muller (RM) codes over GF (2) can correct rando erasures and

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

Convex Programming for Scheduling Unrelated Parallel Machines

Convex Programming for Scheduling Unrelated Parallel Machines Convex Prograing for Scheduling Unrelated Parallel Machines Yossi Azar Air Epstein Abstract We consider the classical proble of scheduling parallel unrelated achines. Each job is to be processed by exactly

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory Proble sets 5 and 6 Due: Noveber th Please send your solutions to learning-subissions@ttic.edu Notations/Definitions Recall the definition of saple based Radeacher

More information

Physics 215 Winter The Density Matrix

Physics 215 Winter The Density Matrix Physics 215 Winter 2018 The Density Matrix The quantu space of states is a Hilbert space H. Any state vector ψ H is a pure state. Since any linear cobination of eleents of H are also an eleent of H, it

More information

time time δ jobs jobs

time time δ jobs jobs Approxiating Total Flow Tie on Parallel Machines Stefano Leonardi Danny Raz y Abstract We consider the proble of optiizing the total ow tie of a strea of jobs that are released over tie in a ultiprocessor

More information

Hybrid System Identification: An SDP Approach

Hybrid System Identification: An SDP Approach 49th IEEE Conference on Decision and Control Deceber 15-17, 2010 Hilton Atlanta Hotel, Atlanta, GA, USA Hybrid Syste Identification: An SDP Approach C Feng, C M Lagoa, N Ozay and M Sznaier Abstract The

More information

Homework 3 Solutions CSE 101 Summer 2017

Homework 3 Solutions CSE 101 Summer 2017 Hoework 3 Solutions CSE 0 Suer 207. Scheduling algoriths The following n = 2 jobs with given processing ties have to be scheduled on = 3 parallel and identical processors with the objective of iniizing

More information

Reed-Muller Codes. m r inductive definition. Later, we shall explain how to construct Reed-Muller codes using the Kronecker product.

Reed-Muller Codes. m r inductive definition. Later, we shall explain how to construct Reed-Muller codes using the Kronecker product. Coding Theory Massoud Malek Reed-Muller Codes An iportant class of linear block codes rich in algebraic and geoetric structure is the class of Reed-Muller codes, which includes the Extended Haing code.

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Multi-Dimensional Hegselmann-Krause Dynamics

Multi-Dimensional Hegselmann-Krause Dynamics Multi-Diensional Hegselann-Krause Dynaics A. Nedić Industrial and Enterprise Systes Engineering Dept. University of Illinois Urbana, IL 680 angelia@illinois.edu B. Touri Coordinated Science Laboratory

More information

1 Rademacher Complexity Bounds

1 Rademacher Complexity Bounds COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #10 Scribe: Max Goer March 07, 2013 1 Radeacher Coplexity Bounds Recall the following theore fro last lecture: Theore 1. With probability

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science A Better Algorith For an Ancient Scheduling Proble David R. Karger Steven J. Phillips Eric Torng Departent of Coputer Science Stanford University Stanford, CA 9435-4 Abstract One of the oldest and siplest

More information

Computable Shell Decomposition Bounds

Computable Shell Decomposition Bounds Journal of Machine Learning Research 5 (2004) 529-547 Subitted 1/03; Revised 8/03; Published 5/04 Coputable Shell Decoposition Bounds John Langford David McAllester Toyota Technology Institute at Chicago

More information

Probability and Stochastic Processes: A Friendly Introduction for Electrical and Computer Engineers Roy D. Yates and David J.

Probability and Stochastic Processes: A Friendly Introduction for Electrical and Computer Engineers Roy D. Yates and David J. Probability and Stochastic Processes: A Friendly Introduction for Electrical and oputer Engineers Roy D. Yates and David J. Goodan Proble Solutions : Yates and Goodan,1..3 1.3.1 1.4.6 1.4.7 1.4.8 1..6

More information

On Constant Power Water-filling

On Constant Power Water-filling On Constant Power Water-filling Wei Yu and John M. Cioffi Electrical Engineering Departent Stanford University, Stanford, CA94305, U.S.A. eails: {weiyu,cioffi}@stanford.edu Abstract This paper derives

More information

The Weierstrass Approximation Theorem

The Weierstrass Approximation Theorem 36 The Weierstrass Approxiation Theore Recall that the fundaental idea underlying the construction of the real nubers is approxiation by the sipler rational nubers. Firstly, nubers are often deterined

More information

Approximation in Stochastic Scheduling: The Power of LP-Based Priority Policies

Approximation in Stochastic Scheduling: The Power of LP-Based Priority Policies Approxiation in Stochastic Scheduling: The Power of -Based Priority Policies Rolf Möhring, Andreas Schulz, Marc Uetz Setting (A P p stoch, r E( w and (B P p stoch E( w We will assue that the processing

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval Unifor Approxiation and Bernstein Polynoials with Coefficients in the Unit Interval Weiang Qian and Marc D. Riedel Electrical and Coputer Engineering, University of Minnesota 200 Union St. S.E. Minneapolis,

More information

Birthday Paradox Calculations and Approximation

Birthday Paradox Calculations and Approximation Birthday Paradox Calculations and Approxiation Joshua E. Hill InfoGard Laboratories -March- v. Birthday Proble In the birthday proble, we have a group of n randoly selected people. If we assue that birthdays

More information

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials

Fast Montgomery-like Square Root Computation over GF(2 m ) for All Trinomials Fast Montgoery-like Square Root Coputation over GF( ) for All Trinoials Yin Li a, Yu Zhang a, a Departent of Coputer Science and Technology, Xinyang Noral University, Henan, P.R.China Abstract This letter

More information

Exact tensor completion with sum-of-squares

Exact tensor completion with sum-of-squares Proceedings of Machine Learning Research vol 65:1 54, 2017 30th Annual Conference on Learning Theory Exact tensor copletion with su-of-squares Aaron Potechin Institute for Advanced Study, Princeton David

More information

Fixed-to-Variable Length Distribution Matching

Fixed-to-Variable Length Distribution Matching Fixed-to-Variable Length Distribution Matching Rana Ali Ajad and Georg Böcherer Institute for Counications Engineering Technische Universität München, Gerany Eail: raa2463@gail.co,georg.boecherer@tu.de

More information

Lecture 9 November 23, 2015

Lecture 9 November 23, 2015 CSC244: Discrepancy Theory in Coputer Science Fall 25 Aleksandar Nikolov Lecture 9 Noveber 23, 25 Scribe: Nick Spooner Properties of γ 2 Recall that γ 2 (A) is defined for A R n as follows: γ 2 (A) = in{r(u)

More information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer

More information

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes Graphical Models in Local, Asyetric Multi-Agent Markov Decision Processes Ditri Dolgov and Edund Durfee Departent of Electrical Engineering and Coputer Science University of Michigan Ann Arbor, MI 48109

More information

The Transactional Nature of Quantum Information

The Transactional Nature of Quantum Information The Transactional Nature of Quantu Inforation Subhash Kak Departent of Coputer Science Oklahoa State University Stillwater, OK 7478 ABSTRACT Inforation, in its counications sense, is a transactional property.

More information

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding

Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding IEEE TRANSACTIONS ON INFORMATION THEORY (SUBMITTED PAPER) 1 Design of Spatially Coupled LDPC Codes over GF(q) for Windowed Decoding Lai Wei, Student Meber, IEEE, David G. M. Mitchell, Meber, IEEE, Thoas

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

Solutions of some selected problems of Homework 4

Solutions of some selected problems of Homework 4 Solutions of soe selected probles of Hoework 4 Sangchul Lee May 7, 2018 Proble 1 Let there be light A professor has two light bulbs in his garage. When both are burned out, they are replaced, and the next

More information

Generalized Alignment Chain: Improved Converse Results for Index Coding

Generalized Alignment Chain: Improved Converse Results for Index Coding Generalized Alignent Chain: Iproved Converse Results for Index Coding Yucheng Liu and Parastoo Sadeghi Research School of Electrical, Energy and Materials Engineering Australian National University, Canberra,

More information

Chapter 6 1-D Continuous Groups

Chapter 6 1-D Continuous Groups Chapter 6 1-D Continuous Groups Continuous groups consist of group eleents labelled by one or ore continuous variables, say a 1, a 2,, a r, where each variable has a well- defined range. This chapter explores:

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic

More information

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel 1 Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel Rai Cohen, Graduate Student eber, IEEE, and Yuval Cassuto, Senior eber, IEEE arxiv:1510.05311v2 [cs.it] 24 ay 2016 Abstract In

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

Nonclairvoyant Scheduling to Minimize the Total Flow Time on Single and Parallel Machines

Nonclairvoyant Scheduling to Minimize the Total Flow Time on Single and Parallel Machines Nonclairvoyant Scheduling to Miniize the Total Flow Tie on Single and Parallel Machines LUCA BECCHETTI AND STEFANO LEONARDI University of Roe La Sapienza, Roe, Italy Abstract. Scheduling a sequence of

More information

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon Mohaad Ghavazadeh Alessandro Lazaric INRIA Lille - Nord Europe, Tea SequeL {victor.gabillon,ohaad.ghavazadeh,alessandro.lazaric}@inria.fr

More information

Deflation of the I-O Series Some Technical Aspects. Giorgio Rampa University of Genoa April 2007

Deflation of the I-O Series Some Technical Aspects. Giorgio Rampa University of Genoa April 2007 Deflation of the I-O Series 1959-2. Soe Technical Aspects Giorgio Rapa University of Genoa g.rapa@unige.it April 27 1. Introduction The nuber of sectors is 42 for the period 1965-2 and 38 for the initial

More information

Randomized Accuracy-Aware Program Transformations For Efficient Approximate Computations

Randomized Accuracy-Aware Program Transformations For Efficient Approximate Computations Randoized Accuracy-Aware Progra Transforations For Efficient Approxiate Coputations Zeyuan Allen Zhu Sasa Misailovic Jonathan A. Kelner Martin Rinard MIT CSAIL zeyuan@csail.it.edu isailo@it.edu kelner@it.edu

More information

CSE525: Randomized Algorithms and Probabilistic Analysis May 16, Lecture 13

CSE525: Randomized Algorithms and Probabilistic Analysis May 16, Lecture 13 CSE55: Randoied Algoriths and obabilistic Analysis May 6, Lecture Lecturer: Anna Karlin Scribe: Noah Siegel, Jonathan Shi Rando walks and Markov chains This lecture discusses Markov chains, which capture

More information

Computational and Statistical Learning Theory

Computational and Statistical Learning Theory Coputational and Statistical Learning Theory TTIC 31120 Prof. Nati Srebro Lecture 2: PAC Learning and VC Theory I Fro Adversarial Online to Statistical Three reasons to ove fro worst-case deterinistic

More information

Tight Complexity Bounds for Optimizing Composite Objectives

Tight Complexity Bounds for Optimizing Composite Objectives Tight Coplexity Bounds for Optiizing Coposite Objectives Blake Woodworth Toyota Technological Institute at Chicago Chicago, IL, 60637 blake@ttic.edu Nathan Srebro Toyota Technological Institute at Chicago

More information

1 Bounding the Margin

1 Bounding the Margin COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost

More information

A Theoretical Analysis of a Warm Start Technique

A Theoretical Analysis of a Warm Start Technique A Theoretical Analysis of a War Start Technique Martin A. Zinkevich Yahoo! Labs 701 First Avenue Sunnyvale, CA Abstract Batch gradient descent looks at every data point for every step, which is wasteful

More information

Lecture 20 November 7, 2013

Lecture 20 November 7, 2013 CS 229r: Algoriths for Big Data Fall 2013 Prof. Jelani Nelson Lecture 20 Noveber 7, 2013 Scribe: Yun Willia Yu 1 Introduction Today we re going to go through the analysis of atrix copletion. First though,

More information

Tight Bounds for Maximal Identifiability of Failure Nodes in Boolean Network Tomography

Tight Bounds for Maximal Identifiability of Failure Nodes in Boolean Network Tomography Tight Bounds for axial Identifiability of Failure Nodes in Boolean Network Toography Nicola Galesi Sapienza Università di Roa nicola.galesi@uniroa1.it Fariba Ranjbar Sapienza Università di Roa fariba.ranjbar@uniroa1.it

More information

Probability Distributions

Probability Distributions Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a ournal published by Elsevier. The attached copy is furnished to the author for internal non-coercial research and education use, including for instruction at the authors institution

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Fairness via priority scheduling

Fairness via priority scheduling Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation

More information

New Bounds for Learning Intervals with Implications for Semi-Supervised Learning

New Bounds for Learning Intervals with Implications for Semi-Supervised Learning JMLR: Workshop and Conference Proceedings vol (1) 1 15 New Bounds for Learning Intervals with Iplications for Sei-Supervised Learning David P. Helbold dph@soe.ucsc.edu Departent of Coputer Science, University

More information

Constant-Space String-Matching. in Sublinear Average Time. (Extended Abstract) Wojciech Rytter z. Warsaw University. and. University of Liverpool

Constant-Space String-Matching. in Sublinear Average Time. (Extended Abstract) Wojciech Rytter z. Warsaw University. and. University of Liverpool Constant-Space String-Matching in Sublinear Average Tie (Extended Abstract) Maxie Crocheore Universite de Marne-la-Vallee Leszek Gasieniec y Max-Planck Institut fur Inforatik Wojciech Rytter z Warsaw University

More information

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 60, NO. 2, FEBRUARY ETSP stands for the Euclidean traveling salesman problem.

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 60, NO. 2, FEBRUARY ETSP stands for the Euclidean traveling salesman problem. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 60, NO., FEBRUARY 015 37 Target Assignent in Robotic Networks: Distance Optiality Guarantees and Hierarchical Strategies Jingjin Yu, Meber, IEEE, Soon-Jo Chung,

More information

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points

More information

Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Time-Varying Jamming Links

Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Time-Varying Jamming Links Vulnerability of MRD-Code-Based Universal Secure Error-Correcting Network Codes under Tie-Varying Jaing Links Jun Kurihara KDDI R&D Laboratories, Inc 2 5 Ohara, Fujiino, Saitaa, 356 8502 Japan Eail: kurihara@kddilabsjp

More information

ma x = -bv x + F rod.

ma x = -bv x + F rod. Notes on Dynaical Systes Dynaics is the study of change. The priary ingredients of a dynaical syste are its state and its rule of change (also soeties called the dynaic). Dynaical systes can be continuous

More information

1 Proving the Fundamental Theorem of Statistical Learning

1 Proving the Fundamental Theorem of Statistical Learning THEORETICAL MACHINE LEARNING COS 5 LECTURE #7 APRIL 5, 6 LECTURER: ELAD HAZAN NAME: FERMI MA ANDDANIEL SUO oving te Fundaental Teore of Statistical Learning In tis section, we prove te following: Teore.

More information

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements

Compressive Distilled Sensing: Sparse Recovery Using Adaptivity in Compressive Measurements 1 Copressive Distilled Sensing: Sparse Recovery Using Adaptivity in Copressive Measureents Jarvis D. Haupt 1 Richard G. Baraniuk 1 Rui M. Castro 2 and Robert D. Nowak 3 1 Dept. of Electrical and Coputer

More information