Characterizing the Sample Complexity of Private Learners


Amos Beimel (Dept. of Computer Science, Ben-Gurion University), Kobbi Nissim (Dept. of Computer Science, Ben-Gurion University & Harvard University), Uri Stemmer (Dept. of Computer Science, Ben-Gurion University)

ABSTRACT

In 2008, Kasiviswanathan et al. defined private learning as a combination of PAC learning and differential privacy [16]. Informally, a private learner is applied to a collection of labeled individual information and outputs a hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. gave a generic construction of private learners for (finite) concept classes, with sample complexity logarithmic in the size of the concept class. This sample complexity is higher than what is needed for non-private learners, hence leaving open the possibility that the sample complexity of private learning may be sometimes significantly higher than that of non-private learning.

We give a combinatorial characterization of the sample size sufficient and necessary to privately learn a class of concepts. This characterization is analogous to the well known characterization of the sample complexity of non-private learning in terms of the VC dimension of the concept class. We introduce the notion of probabilistic representation of a concept class, and our new complexity measure RepDim corresponds to the size of the smallest probabilistic representation of the concept class. We show that any private learning algorithm for a concept class C with sample complexity m implies RepDim(C) = O(m), and that there exists a private learning algorithm with sample complexity m = O(RepDim(C)). We further demonstrate that a similar characterization holds for the database size needed for privately computing a large class of optimization problems and also for the well studied problem of private data release.

Categories and Subject Descriptors: K.4.1 [Computers and Society]: Public Policy Issues (Privacy); F.2 [Analysis of Algorithms and Problem Complexity]: Miscellaneous

Research partially supported by the Israel Science Foundation (grants No. 938/09 and 2761/12) and by the Frankel Center for Computer Science.

ITCS'13, January 9-12, 2013, Berkeley, California, USA. Copyright 2013 ACM.

General Terms: Theory

Keywords: Differential privacy, PAC learning, Sample complexity, Probabilistic representation

1. INTRODUCTION

Motivated by the observation that learning generalizes many of the analyses applied to large collections of data, Kasiviswanathan et al. [16] defined in 2008 private learning as a combination of probably approximately correct (PAC) learning [19] and differential privacy [11]. A PAC learner is given a collection of labeled examples (sampled according to an unknown probability distribution and labeled according to an unknown concept) and generalizes the labeled examples into a hypothesis h that should predict with high accuracy the labeling of fresh examples taken from the same unknown distribution and labeled with the same unknown concept. The privacy requirement is that the choice of h preserves differential privacy of the sample points. Intuitively, this means that the choice should not be significantly affected by any particular sample. Differential privacy is increasingly accepted as a standard for rigorous privacy, and recent research has shown that differentially private variants exist for many analyses. We refer the reader to the surveys of Dwork [9, 10].

The sample complexity required for learning a concept class C determines the amount of labeled data needed for learning a concept c ∈ C. It is well known that the sample complexity of learning a concept class C (non-privately) is proportional to a complexity measure of the class C known as the VC-dimension [20, 6, 13]. Kasiviswanathan et al. [16] proved that a private learner exists for every finite concept class. The proof is via a generic construction that exhibits sample complexity logarithmic in the size of the concept class. The VC-dimension of a concept class is bounded by this quantity (and is significantly lower for some interesting concept classes), and hence the results of [16] left open the possibility that the sample complexity of private learning may be significantly higher than that of non-private learning.

In analogy to the characterization of the sample complexity of (non-private) PAC learners via the VC-dimension, we give a combinatorial characterization of the sample size sufficient and necessary for private PAC learners. Towards obtaining this characterization, we introduce the notion of probabilistic representation of a concept class.

We note that our characterization, like the VC-dimension characterization, ignores the computation required by the learner. Some of our algorithms are, however, computationally efficient.

1.1 Related Work

We start with a short description of prior work on the sample complexity of private learning. To simplify the exposition, we ignore dependencies on the error, confidence and privacy parameters by treating them as constants in this and the following section. The dependency on these parameters is made explicit in the later sections of the paper.

Recall that the sample complexity of non-private learners for a class of functions C is proportional to the VC-dimension of the class [6, 13], a combinatorial measure of the class that is equal to the size of the largest set of inputs that is shattered by the class. This characterization, like ours, ignores the computation required by the learner. Kasiviswanathan et al. [16] showed, informally, that every finite concept class C can be learned privately (ignoring computational complexity). Their construction is based on the exponential mechanism of McSherry and Talwar [17], and the O(ln |C|) bound on the sample complexity results from the union bound argument used in the analysis of the exponential mechanism. Computationally efficient learners were shown to exist by Blum et al. for all concept classes that can be efficiently learned in the statistical queries model. Kasiviswanathan et al. [16] showed an example of a concept class, the class of parity functions, that is not learnable in the statistical queries model but can be learned privately and efficiently. These positive results suggest that many natural computational learning tasks that are efficiently learned non-privately can be learned privately and efficiently.

Beimel et al. [3] studied the sample complexity of private learning. They examined the concept class of point functions POINT_d, where each concept evaluates to one on exactly one point of the domain and to zero otherwise. Note that the VC-dimension of POINT_d is one. Beimel et al. proved lower bounds on the sample complexity of properly and privately learning the class POINT_d (and related classes), implying that the VC-dimension of a class does not characterize the sample complexity of private proper learning. On the other hand, they observed that the sample complexity can be improved for improper private learners whenever there exists a smaller hypothesis class H that represents C in the sense that for every concept c ∈ C and for every distribution on the examples, there is a hypothesis h ∈ H that is close to c. Using the exponential mechanism to choose among the hypotheses in H instead of C, the sample complexity is reduced to ln |H| (this is why the size of the representation H is defined to be ln |H|). For some classes this can dramatically improve the sample complexity; e.g., for the class POINT_d (defined in Example 3.2), the sample complexity is improved from O(ln |POINT_d|) = O(d) to O(ln d). Using other techniques, Beimel et al. showed that the sample complexity of learning POINT_d can be reduced even further to O(1), hence showing the largest possible gap between proper and improper private learning. Such a gap does not exist for non-private learning.

Chaudhuri and Hsu [7] studied the sample complexity needed for privately learning infinite concept classes when the data is drawn from a continuous distribution. They showed that under these settings there exists a simple concept class for which any proper learner that uses a finite number of examples and guarantees differential privacy fails to satisfy the accuracy guarantee for at least one data distribution. This implies that the results of Kasiviswanathan et al. [16] do not extend to infinite hypothesis classes. Interestingly, our results imply an improper private algorithm for an infinite extension of the class POINT (that is, a class over the natural numbers of all boolean functions that return 1 on exactly one number). Chaudhuri and Hsu [7] also study learning algorithms that are only required to protect the privacy of the labels (and do not necessarily protect the privacy of the examples themselves). They prove upper bounds and lower bounds on the sample complexity of such algorithms. In particular, they prove a lower bound on the sample complexity using the doubling dimension of the disagreement metric of the hypothesis class with respect to the unlabeled data distribution. This result does not imply our characterization, as the privacy requirement of protecting only the labels is much weaker than protecting both the sample point and the label.

A line of research (started in [18]) that is very relevant to our paper is boosting learning algorithms, that is, taking learning algorithms that have a big classification error and producing a learning algorithm with small error. Dwork et al. [12] show how to privately boost accuracy, that is, given a private learning algorithm that has a big classification error, they produce a private learning algorithm with small error. In Lemma 3.14, we show how to boost the accuracy α for probabilistic representations. This gives an alternative private boosting, whose proof is simpler. However, as it uses the exponential mechanism, it is (generally) not computationally efficient.

1.2 Our Results

Beimel et al. [3] showed how to use a representation of a class to privately learn it. We make an additional step in improving the sample complexity by considering a probabilistic representation of a concept class C. Instead of one collection H representing C, we consider a list of collections H_1, ..., H_r such that for every c ∈ C and every distribution on the examples, if we sample a collection H_i from the list, then with high probability there is a hypothesis h ∈ H_i that is close to c. To privately learn C, the learning algorithm first samples i ∈ {1, ..., r} and then uses the exponential mechanism to select a hypothesis from H_i. This reduces the sample complexity to O(max_i ln |H_i|); the size of the probabilistic representation is hence defined to be max_i ln |H_i|.

We show that for POINT_d there exists a probabilistic representation of size O(1). This results in a private learning algorithm with sample complexity O(1), matching a different private algorithm for POINT_d presented in [3]. Our new algorithm offers some improvement in the sample complexity compared to the algorithm of [3] when considering the learning and privacy parameters. Furthermore, our algorithm can be made computationally efficient without making any computational hardness assumptions, while the efficient version in [3] assumes the existence of one-way functions. Finally, it is conceptually simpler and in particular it avoids the sub-sampling technique used in [3].

One can ask if there are private learning algorithms with smaller sample complexity than the size of the smallest probabilistic representation. We show that the answer is no:

the size of the smallest probabilistic representation is a lower bound on the sample complexity. Thus, the size of the smallest probabilistic representation of a class C, which we call the representation dimension and denote by RepDim(C), characterizes (up to constants) the sample size necessary and sufficient for privately learning the class C.

We also show that for concepts defined over a finite domain, the difference between the sizes of the best deterministic and probabilistic representations is bounded. Namely, if C is a concept class over the domain {0,1}^d, then there exists a deterministic representation of C of size O(RepDim(C) + ln d). Thus, for classes whose smallest deterministic representation is of size ω(ln d), the size of the smallest deterministic representation characterizes the sample complexity of private learning of the class.

The notion of probabilistic representation applies not only to private learning, but also to optimization problems. We consider a scenario where there is a domain X, a database S of m records, each taken from the domain X, a set of solutions F, and a quality function q : X^m × F → [0,1] that we wish to maximize. If the exponential mechanism is used for (approximately) solving the problem, then the size of the database should be Ω(ln |F|) in order to achieve a reasonable approximation. Using our notions of a representation of F and of a probabilistic representation of F, one can reduce the size of the minimal database without paying too much in the quality of the solution. Interestingly, a notion similar to representation, called solution list algorithms, was considered in [2] for constructing secure protocols for search problems while leaking only a few bits on the input. Curiously, their notion of leakage is very different from that of differential privacy.

We give two examples of such optimization problems. First, an example inspired by [2]: each record in the database is a clause with exactly 3 literals and we want to find an assignment satisfying at least a 7/8 fraction of the clauses while protecting the privacy of the clauses. A construction of [2] yields a deterministic representation for this problem where the size of the database can be much smaller. Using a probabilistic representation, we can give a good assignment even for databases of constant size. This example is a simple instance of a scenario where each individual has a preference on the solution and we want to choose a solution maximizing the number of individuals whose preferences are met, while protecting the privacy of the preferences. Another example of optimization is sanitization, where given a database we want to publish a synthetic database that gives similar utility as the original database while protecting the privacy of the individual records of the database. Using our techniques, we study the minimal database size for which sanitization gives reasonable performance with respect to a given family of queries.

Open Problem. We still do not know the relation between this dimension and the VC-dimension. By Sauer's Lemma, if C is a concept class over {0,1}^d, then the number of functions in C is at most exp(d * VC(C)). By [16], there is a private learning algorithm for C whose sample size is O(d * VC(C)); thus, the probabilistic representation dimension of C is O(d * VC(C)). We do not know if there is a class C such that RepDim(C) is significantly larger than VC(C). A candidate for such a separation appears in [1].

2. PRELIMINARIES

Notation. We use O_γ(g(n)) as a shorthand for O(h(γ) * g(n)) for some non-negative function h. Given a set B of cardinality r, and a distribution P on {1, 2, ..., r}, we use the notation b ∼_P B to denote a random element of B chosen according to P.

2.1 Preliminaries from Privacy

A database is a vector S = (z_1, ..., z_m) over a domain X, where each entry z_i in S represents information contributed by one individual. Databases S_1 and S_2 are called neighboring if they differ in exactly one entry. An algorithm preserves differential privacy if neighboring databases induce nearby distributions on its outcomes. Formally,

Definition 2.1 (Differential Privacy [11]). A randomized algorithm A is ε-differentially private if for all neighboring databases S_1, S_2, and for all sets F of outputs,

    Pr[A(S_1) ∈ F] ≤ exp(ε) * Pr[A(S_2) ∈ F].    (1)

The probability is taken over the random coins of A.

An immediate consequence of the definition is that for any two databases S_1, S_2 ∈ X^m, and for all sets F of outputs, Pr[A(S_1) ∈ F] ≥ exp(-εm) * Pr[A(S_2) ∈ F].

2.2 Preliminaries from Learning Theory

Let X_d = {0,1}^d. A concept c : X_d → {0,1} is a function that labels examples taken from the domain X_d by either 0 or 1. A concept class C over X_d is a class of concepts mapping X_d to {0,1}. PAC learning algorithms are given examples sampled according to an unknown probability distribution D over X_d, and labeled according to an unknown target concept c ∈ C. The generalization error of a hypothesis h : X_d → {0,1} is defined as

    error_D(c, h) = Pr_{x∼D}[h(x) ≠ c(x)].

For a labeled sample S = (x_i, y_i)_{i=1}^m, the empirical error of h is

    error_S(h) = (1/m) * |{i : h(x_i) ≠ y_i}|.

Definition 2.2. An α-good hypothesis for c and D is a hypothesis h such that error_D(c, h) ≤ α.

Definition 2.3 (PAC Learning [19]). Algorithm A is an (α, β)-PAC learner for a concept class C over X_d using hypothesis class H and sample size m if for all concepts c ∈ C and all distributions D on X_d, given an input of m samples S = (z_1, ..., z_m), where z_i = (x_i, c(x_i)) and the x_i are drawn i.i.d. from D, algorithm A outputs a hypothesis h ∈ H satisfying

    Pr[error_D(c, h) ≤ α] ≥ 1 - β.

The probability is taken over the random choice of the examples in S according to D and the coin tosses of the learner A.

Definition 2.4. An algorithm satisfying Definition 2.3 with H ⊆ C is called a proper PAC learner; otherwise it is called an improper PAC learner.
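For concreteness, the two error notions above can be written as a small Python sketch. The function names and the Monte-Carlo estimate of the generalization error are illustrative choices made here, not constructions from the paper.

```python
import random

def empirical_error(hypothesis, labeled_sample):
    """error_S(h) of Section 2.2: the fraction of sample points that h mislabels."""
    return sum(1 for x, y in labeled_sample if hypothesis(x) != y) / len(labeled_sample)

def estimate_generalization_error(hypothesis, concept, draw_example, trials=10000, rng=None):
    """Monte-Carlo estimate of error_D(c, h) = Pr_{x~D}[h(x) != c(x)],
    where draw_example(rng) samples a point x from the (unknown) distribution D."""
    rng = rng or random.Random()
    points = (draw_example(rng) for _ in range(trials))
    return sum(1 for x in points if hypothesis(x) != concept(x)) / trials
```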

2.3 Private Learning

As a private learner is a PAC learner, its outcome hypothesis should also be a good predictor of labels. Hence, the privacy requirement from a private learner is not that applying the hypothesis h to a new sample (pertaining to an individual) should leak no information about that sample; rather, the requirement is that the choice of h preserves differential privacy of the training samples.

Definition 2.5 (Private PAC Learning [16]). Let A be an algorithm that gets an input S = (z_1, ..., z_m). Algorithm A is an (α, β, ε)-PPAC learner for a concept class C over X_d using hypothesis class H and sample size m if

  Privacy. Algorithm A is ε-differentially private (as formulated in Definition 2.1);
  Utility. Algorithm A is an (α, β)-PAC learner for C using H and sample size m (as formulated in Definition 2.3).

2.4 The Exponential Mechanism

We next describe the exponential mechanism of McSherry and Talwar [17]. We present its private learning variant; however, it can be used in more general scenarios. The goal here is to choose a hypothesis h ∈ H approximately minimizing the empirical error. The choice is probabilistic, where the probability mass assigned to each hypothesis decreases exponentially with its empirical error.

Inputs: a privacy parameter ε, a hypothesis class H, and m labeled samples S = (x_i, y_i)_{i=1}^m.
  1. For every h ∈ H define q(S, h) = |{i : h(x_i) = y_i}|.
  2. Randomly choose h ∈ H with probability

        exp(ε * q(S, h) / 2) / Σ_{f∈H} exp(ε * q(S, f) / 2).

Proposition 2.6. Denote ê = min_{f∈H} {error_S(f)}. The probability that the exponential mechanism outputs a hypothesis h such that error_S(h) > ê + Δ is at most |H| * exp(-εΔm/2). Moreover, the exponential mechanism is ε-differentially private.
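The selection rule above can be made concrete with a short Python sketch; the function name, the representation of hypotheses as callables, and the use of numpy are illustrative assumptions of this sketch, not part of the paper.

```python
import numpy as np

def exponential_mechanism(sample, hypotheses, epsilon, rng=None):
    """Private-learning variant of the exponential mechanism (Section 2.4).

    sample: list of (x, y) pairs; hypotheses: list of callables h(x) -> {0, 1}.
    Utility q(S, h) = number of correctly labeled points; h is chosen with
    probability proportional to exp(epsilon * q(S, h) / 2).
    """
    rng = rng or np.random.default_rng()
    scores = np.array([sum(1 for x, y in sample if h(x) == y) for h in hypotheses],
                      dtype=float)
    # Subtracting the maximum score only rescales all weights, so the output
    # distribution is unchanged; it avoids numerical overflow in exp().
    weights = np.exp(epsilon * (scores - scores.max()) / 2.0)
    probs = weights / weights.sum()
    return hypotheses[rng.choice(len(hypotheses), p=probs)]
```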
2.5 Concentration Bounds

Let X_1, ..., X_n be independent random variables where Pr[X_i = 1] = p and Pr[X_i = 0] = 1 - p for some 0 < p < 1. Clearly, E[Σ_i X_i] = pn. Chernoff and Hoeffding bounds show that the sum is concentrated around this expected value:

    Pr[Σ_i X_i > (1 + δ)pn] ≤ exp(-pnδ²/3)   for δ > 0,
    Pr[Σ_i X_i < (1 - δ)pn] ≤ exp(-pnδ²/2)   for 0 < δ < 1,
    Pr[|Σ_i X_i - pn| > δ] ≤ 2 exp(-2δ²/n)   for δ ≥ 0.

The first two inequalities are known as the multiplicative Chernoff bounds [8], and the last inequality is known as the Hoeffding bound [15].

3. THE SAMPLE COMPLEXITY OF PRIVATE LEARNERS

In this section we present a combinatorial measure of a concept class C that characterizes the sample complexity necessary and sufficient for privately learning C. The measure is a probabilistic representation of the class C. We start with the notion of deterministic representation from [3].

Definition 3.1 ([3]). A hypothesis class H is an α-representation for a class C if for every c ∈ C and every distribution D on X_d there exists a hypothesis h ∈ H such that error_D(c, h) ≤ α.

Example 3.2 (POINT_d). For j ∈ X_d, define c_j : X_d → {0,1} as c_j(x) = 1 if x = j, and c_j(x) = 0 otherwise. Define POINT_d = {c_j}_{j∈X_d}. In [3] it was shown that for α < 1/2, every α-representation for POINT_d must be of cardinality at least d, and that an α-representation H_d for POINT_d exists where |H_d| = O(d/α²).

The above representation can be used for non-private learning, by taking a big enough sample and finding a hypothesis h ∈ H_d minimizing the empirical error. For private learning it was shown in [3] that a sample of size O_{α,β,ε}(log |H_d|) suffices, with a learner that employs the exponential mechanism to choose a hypothesis from H_d.

Definition 3.3. For a hypothesis class H we denote size(H) = ln |H|. We define the Deterministic Representation Dimension of a concept class C as

    DRepDim(C) = min { size(H) : H (1/4)-represents C }.

Example 3.4. By the results of [3], stated in the previous example, DRepDim(POINT_d) = Θ(ln d).

We are now ready to present the notion of a probabilistic representation. The idea behind this notion is that we have a list of hypothesis classes, such that for every concept c and distribution D, if we sample a hypothesis class from the list, then with high probability it contains a hypothesis that is close to c.

Definition 3.5. Let P be a distribution over {1, 2, ..., r}, and let H = {H_1, H_2, ..., H_r} be a family of hypothesis classes (every H_i ∈ H is a set of boolean functions). We say that (H, P) is an (α, β)-probabilistic representation for a class C if for every c ∈ C and every distribution D on X_d:

    Pr_P[ ∃ h ∈ H_i s.t. error_D(c, h) ≤ α ] ≥ 1 - β,

where the probability is over randomly choosing a set H_i ∼_P H.

Example 3.6 (POINT_d). In Section 7 we construct, for every α and every β, a pair (H, P) that (α, β)-probabilistically represents the class POINT_d, where H contains all the sets of at most (4/α)·ln(1/β) boolean functions.

Definition 3.7. Let H = {H_1, H_2, ..., H_r} be a family of hypothesis classes. We denote |H| = r and size(H) = max{ ln |H_i| : H_i ∈ H }. We define the Representation Dimension of a concept class C as

    RepDim(C) = min { size(H) : there exists P s.t. (H, P) is a (1/4, 1/4)-probabilistic representation for C }.

Example 3.8 (POINT_d). The size of the probabilistic representation mentioned in Example 3.6 is ln( (4/α)·ln(1/β) ). Setting α = β = 1/4, we see that the Representation Dimension of POINT_d is constant.

3.1 Equivalence of (α, β)-Probabilistic Representation and Private Learning

We now show that RepDim(C) characterizes the sample complexity of private learners. We start by showing in Lemma 3.9 that an (α, β)-probabilistic representation of C implies a private learning algorithm whose sample complexity is the size of the representation. We then show in Lemma 3.12 that if there is a private learning algorithm with sample complexity m, then there is a probabilistic representation of C of size O(m); this lemma implies that RepDim(C) is a lower bound on the sample complexity. Recall that RepDim(C) is the size of the smallest probabilistic representation for α = β = 1/4. Thus, to complete the proof we show in Lemma 3.14 that a probabilistic representation with α = β = 1/4 implies a probabilistic representation for arbitrary α and β.

Lemma 3.9. If there exists a pair (H, P) that (α, β)-probabilistically represents a class C, then for every ε there exists an algorithm A that (6α, 4β, ε)-PPAC learns C with a sample size m = O( (1/(αε)) * (size(H) + ln(1/β)) ).

Proof. Let (H, P) be an (α, β)-probabilistic representation for the class C, and consider the following algorithm A:

  Inputs: S = (x_i, y_i)_{i=1}^m, and a privacy parameter ε.
  1. Randomly choose H_i ∼_P H.
  2. Choose h ∈ H_i using the exponential mechanism with ε.

By the properties of the exponential mechanism, A is ε-differentially private. We will show that with sample size m = O( (1/(αε)) * (size(H) + ln(1/β)) ), algorithm A is a (6α, 4β)-PAC learner for C. Fix some c ∈ C and D, and define the following 3 good events:

  E_1: The set H_i chosen in step 1 contains at least one hypothesis h s.t. error_S(h) ≤ 2α.
  E_2: For every h ∈ H_i s.t. error_S(h) ≤ 3α, it holds that error_D(c, h) ≤ 6α.
  E_3: The exponential mechanism chooses an h such that error_S(h) ≤ α + min_{f∈H_i}{error_S(f)}.

We first show that if those 3 good events happen, algorithm A returns a 6α-good hypothesis. Event E_1 ensures the existence of a hypothesis f ∈ H_i with error_S(f) ≤ 2α. Thus, the event E_1 ∩ E_3 ensures that algorithm A chooses (using the exponential mechanism) a hypothesis h ∈ H_i with error_S(h) ≤ 3α. Event E_2 then ensures that this h obeys error_D(c, h) ≤ 6α.

We now show that those 3 events happen with high probability. As (H, P) is an (α, β)-probabilistic representation for the class C, the chosen H_i contains a hypothesis h with error_D(c, h) ≤ α with probability at least 1 - β; by the Chernoff bound, with probability at least 1 - exp(-αm/3) this hypothesis has empirical error at most 2α. Event E_1 happens with probability at least (1 - β)(1 - exp(-αm/3)) > 1 - (β + exp(-αm/3)), which is at least 1 - 2β for m ≥ (3/α) * ln(1/β).

Using the Chernoff bound, the probability that a hypothesis h with error_D(c, h) > 6α has empirical error at most 3α is less than exp(-3αm/4). Using the union bound, the probability that there is such a hypothesis in H_i is at most |H_i| * exp(-3αm/4). Therefore, Pr[E_2] ≥ 1 - |H_i| * exp(-3αm/4), which is at least 1 - β for m ≥ (4/(3α)) * ln(|H_i|/β).

The exponential mechanism ensures that the probability of event E_3 is at least 1 - |H_i| * exp(-εαm/2) (see Section 2.4), which is at least 1 - β for m ≥ (2/(αε)) * ln(|H_i|/β).

All in all, taking m large enough to satisfy the three conditions above, i.e., m = O( (1/(αε)) * (size(H) + ln(1/β)) ), ensures that the probability of A failing to output a 6α-good hypothesis is at most 4β.
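The two-step algorithm A from the proof above amounts to one sampling step that touches no data followed by one call to the exponential mechanism. A minimal sketch, reusing the hypothetical exponential_mechanism helper from the Section 2.4 sketch (all names and data-structure choices are illustrative):

```python
import numpy as np

def private_learner_from_representation(sample, hypothesis_classes, P, epsilon, rng=None):
    """Sketch of algorithm A of Lemma 3.9.

    hypothesis_classes is the family H = [H_1, ..., H_r] (each a list of callables)
    and P is a probability vector over the family. Step 1 samples H_i according to P
    and is independent of the data, so the whole algorithm inherits the
    epsilon-differential privacy of the exponential mechanism in step 2.
    """
    rng = rng or np.random.default_rng()
    H_i = hypothesis_classes[rng.choice(len(hypothesis_classes), p=P)]  # step 1
    return exponential_mechanism(sample, H_i, epsilon, rng=rng)         # step 2
```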
We demonstrate the above lemma with two examples.

Example 3.10 (Efficient learner for POINT_d). As described in Example 3.6, there exists an (H, P) that (α/6, β/4)-probabilistically represents the class POINT_d, where size(H) = O_{α,β,ε}(1). By Lemma 3.9, there exists an algorithm that (α, β, ε)-PPAC learns POINT_d with sample size m = O_{α,β,ε}(1). The existence of an algorithm with sample complexity O(1) was already proven in [3]. Moreover, assuming the existence of one-way functions, their learner is efficient. Our construction yields an efficient learner, without assumptions. To see this, consider again algorithm A presented in the above proof, and note that as size(H) is constant, step 2 can be done in constant time. Step 1 can be done efficiently as we can efficiently sample a set H_i ∼_P H. In Claim 7.1 we initially construct a probabilistic representation in which the description of every hypothesis is exponential in d. The representation is then revised using pairwise independence to yield a representation in which every hypothesis h has a short description, and given x the value h(x) can be computed efficiently.

Example 3.11 (POINT_N). Consider the class POINT_N, which is exactly like POINT_d, only over the natural numbers. By results of [7, 3], it is impossible to properly PPAC learn the class POINT_N. Our construction yields an (inefficient) improper private learner for POINT_N with O_{α,β,ε}(1) samples. The details are deferred to Section 7.

The next lemma shows that a private learning algorithm implies a probabilistic representation. This lemma can be used to lower bound the sample complexity of private learners.

Lemma 3.12. Let α ≤ 1/4. If there exists an algorithm A that (α, 1/2, ε)-PPAC learns a concept class C with a sample size m, then there exists a pair (H, P) that (1/4, 1/4)-probabilistically represents the class C such that size(H) = O(εαm).

Proof. Let A be an (α, 1/2, ε)-PPAC learner for the class C using hypothesis class F whose sample size is m. Without loss of generality, we can assume that m ≥ (3/α) * ln(4) (since A can ignore part of the sample). For a target concept c ∈ C and a distribution D on X_d, we define G^α_{c,D} = {h ∈ F : error_D(c, h) ≤ α}.

Fix some c ∈ C and a distribution D on X_d, and define the following distribution D' on X_d:

    Pr_{D'}[x] = 1 - 4α + 4α * Pr_D[x]   if x = 0^d,
    Pr_{D'}[x] = 4α * Pr_D[x]            if x ≠ 0^d.

Note that for every x ∈ X_d,

    Pr_{D'}[x] ≥ 4α * Pr_D[x].    (2)

As A is an (α, 1/2)-PAC learner, it holds that Pr_{D',A}[A(S) ∈ G^α_{c,D'}] ≥ 1/2, where the probability is over A's randomness and over sampling the examples in S according to D'. In addition, by inequality (2), every hypothesis h with error_D(c, h) > 1/4 has error strictly greater than α under D':

    error_{D'}(c, h) ≥ 4α * error_D(c, h) > α.

So, every α-good hypothesis for c and D' is a (1/4)-good hypothesis for c and D. That is, G^α_{c,D'} ⊆ G^{1/4}_{c,D}. Therefore, Pr_{D',A}[A(S) ∈ G^{1/4}_{c,D}] ≥ 1/2.

We say that a database S of m labeled examples is good if the unlabeled example 0^d appears in S at least (1 - 8α)m times. Let S be a database constructed by taking m i.i.d. samples from D', labeled by c. By the Chernoff bound, S is good with probability at least 1 - exp(-αm/3). Hence,

    Pr_{D',A}[ (A(S) ∈ G^{1/4}_{c,D}) and (S is good) ] ≥ 1/2 - exp(-αm/3) ≥ 1/4.

Therefore, there exists a database S_good of m samples that contains the unlabeled example 0^d at least (1 - 8α)m times, and Pr_A[A(S_good) ∈ G^{1/4}_{c,D}] ≥ 1/4, where the probability is only over the randomness of A. All of the examples in S_good (including the example 0^d) are labeled by c.

For σ ∈ {0,1}, denote by 0^m_σ a database containing m copies of the example 0^d labeled as σ. As A is ε-differentially private, and as the target concept c labels the example 0^d by either 0 or 1, for at least one σ ∈ {0,1} it holds that

    Pr_A[A(0^m_σ) ∈ G^{1/4}_{c,D}] ≥ exp(-8αεm) * Pr_A[A(S_good) ∈ G^{1/4}_{c,D}] ≥ exp(-8αεm)/4.    (3)

That is, Pr_A[A(0^m_σ) ∉ G^{1/4}_{c,D}] ≤ 1 - (1/4) * e^{-8αεm}.

Now, consider a set H containing the outcomes of 4·ln(4)·e^{8αεm} executions of A(0^m_0), and the outcomes of 4·ln(4)·e^{8αεm} executions of A(0^m_1). The probability that H does not contain a (1/4)-good hypothesis for c and D is at most

    (1 - (1/4) * e^{-8αεm})^{4·ln(4)·e^{8αεm}} ≤ 1/4.

Thus, H = { H ⊆ F : |H| ≤ 8·ln(4)·e^{8αεm} }, and P, the distribution induced by A(0^m_0) and A(0^m_1), are a (1/4, 1/4)-probabilistic representation for the class C. Note that the value c(0^d) is unknown, and can be either 0 or 1; therefore the construction uses the two possible values (one of them correct). It holds that

    size(H) = max{ ln |H| : H ∈ H } = ln(8·ln 4) + 8αεm = O(εαm).

Lemma 3.14 shows how to construct a probabilistic representation for arbitrary α and β from a probabilistic representation with α = β = 1/4; in other words, we boost α and β. The proof of this lemma is combinatorial. It allows us to start with a private learning algorithm with constant α and β, move to a representation, use the combinatorial boosting, and move back to a private algorithm with small α and β. This should be contrasted with the private boosting of [12], which is algorithmic and more complicated (however, the algorithm of Dwork et al. [12] is computationally efficient). We first show how to construct a probabilistic representation for arbitrary β from a probabilistic representation with β = 1/4.

Claim 3.13. For every concept class C and for every β, there exists a pair (H, P) that (1/4, β)-probabilistically represents C where size(H) ≤ RepDim(C) + ln ln(1/β).

Proof. Let β < 1/4, and let (H^0, P^0) be a (1/4, 1/4)-probabilistic representation for C with size(H^0) = RepDim(C), denoted k_0 (that is, for every H^0_i ∈ H^0 it holds that |H^0_i| ≤ e^{k_0}). Denote H^0 = {H^0_1, H^0_2, ..., H^0_r}, and consider the following family of hypothesis classes:

    H^1 = { H^0_{i_1} ∪ ... ∪ H^0_{i_{ln(1/β)}} : 1 ≤ i_1 ≤ ... ≤ i_{ln(1/β)} ≤ r }.

Note that for every H^1_i ∈ H^1 it holds that |H^1_i| ≤ ln(1/β)·e^{k_0}, and so size(H^1) ≤ k_1, where k_1 = k_0 + ln ln(1/β). We will now show an appropriate distribution P^1 on H^1 such that (H^1, P^1) is a (1/4, β)-probabilistic representation for C. To this end, consider the following process for randomly choosing an H^1 ∈ H^1:

  1. Denote M = ln(1/β).
  2. For i = 1, ..., M: randomly choose H^0_i ∼_{P^0} H^0.
  3. Return H^1 = ∪_{i=1}^M H^0_i.

The above process induces a distribution on H^1, denoted P^1. As (H^0, P^0) is a (1/4, 1/4)-probabilistic representation for C, we have that

    Pr_{P^1}[ ∃ h ∈ H^1 s.t. error_D(c, h) ≤ 1/4 ] = 1 - Pr[ for every i, H^0_i contains no such h ] ≥ 1 - (1/4)^M ≥ 1 - β.
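The random process in the proof above is easy to state in code: draw roughly ln(1/β) classes independently from (H^0, P^0) and return their union. A minimal Python sketch with illustrative names, assuming the same list-of-classes encoding as in the earlier sketches:

```python
import math
import numpy as np

def boost_confidence(hypothesis_classes, P, beta, rng=None):
    """Random process from the proof of Claim 3.13.

    Draws M = ceil(ln(1/beta)) classes independently according to P and returns
    their union; the union misses a 1/4-good hypothesis only if every draw does,
    which happens with probability at most (1/4)^M <= beta.
    """
    rng = rng or np.random.default_rng()
    M = max(1, math.ceil(math.log(1.0 / beta)))
    indices = rng.choice(len(hypothesis_classes), size=M, p=P)  # i.i.d. draws, with replacement
    union = []
    for i in indices:
        union.extend(hypothesis_classes[i])
    return union
```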
Lemma 3.14. For every concept class C, every α, and every β, there exists (H, P) that (α, β)-probabilistically represents C where

    size(H) = O( ln(1/α) * ( RepDim(C) + ln ln ln(1/α) + ln ln(1/β) ) ).

Proof. Let C be a concept class, and let (H^1, P^1) be a (1/4, β/T)-probabilistic representation for C (where T will be set later). By Claim 3.13, such a representation exists with size(H^1) ≤ k_1, where k_1 = RepDim(C) + ln ln(T/β). We use H^1 and P^1 to create an (α, β)-probabilistic representation for C. We begin with two notations:

  1. For T hypotheses h_1, ..., h_T we denote by maj_{h_1,...,h_T} the majority hypothesis. That is, maj_{h_1,...,h_T}(x) = 1 if and only if |{h_i : h_i(x) = 1}| ≥ T/2.
  2. For T hypothesis classes H_1, ..., H_T we denote MAJ(H_1, ..., H_T) = { maj_{h_1,...,h_T} : h_i ∈ H_i for every 1 ≤ i ≤ T }.

Consider the following family of hypothesis classes:

    H = { MAJ(H_{i_1}, ..., H_{i_T}) : H_{i_1}, ..., H_{i_T} ∈ H^1 }.

Moreover, denote by P the distribution on H induced by the following random process: for j = 1, ..., T, randomly choose H_{i_j} ∼_{P^1} H^1; return MAJ(H_{i_1}, ..., H_{i_T}).

Next we show that (H, P) is an (α, β)-probabilistic representation for C. For a fixed pair of a target concept c and a distribution D, randomly choose H_{i_1}, ..., H_{i_T} ∼_{P^1} H^1. We show that with probability at least 1 - β the set MAJ(H_{i_1}, ..., H_{i_T}) contains at least one α-good hypothesis for c, D. To this end, denote D_1 = D and consider the following thought experiment, inspired by the AdaBoost algorithm of [14]:

For t = 1, ..., T:
  1. Fail if H_{i_t} does not contain a (1/4)-good hypothesis for c, D_t.
  2. Denote by h_t ∈ H_{i_t} a (1/4)-good hypothesis for c, D_t.
  3. Define D_{t+1}(x) = 2 * D_t(x) if h_t(x) ≠ c(x), and
     D_{t+1}(x) = ( (1 - 2*error_{D_t}(c, h_t)) / (1 - error_{D_t}(c, h_t)) ) * D_t(x) otherwise.

Note that as D_1 is a probability distribution on X_d, the same is true for D_2, D_3, ..., D_T. As (H^1, P^1) is a (1/4, β/T)-probabilistic representation for C, the failure probability of every iteration is at most β/T. Thus (using the union bound), with probability at least 1 - β the whole thought experiment succeeds, and in this case we show that the error of h_fin = maj_{h_1,...,h_T} is at most α.

Consider the set R = {x : h_fin(x) ≠ c(x)} ⊆ X_d. This is the set of points on which at least T/2 of h_1, ..., h_T err. Next consider the partition of R into the following sets:

    R_t = { x ∈ R : h_t(x) ≠ c(x) and h_i(x) = c(x) for every i > t },

that is, R_t contains the points x ∈ R on which h_t is the last to err. Clearly D_t(R_t) ≤ 1/4, as R_t is a subset of the set of points on which h_t errs. Moreover, for every x ∈ R_t, during the first t - 1 rounds the weight of x was doubled at least T/2 - 1 times (once for every h_i with i < t that errs on x) and was multiplied by (1 - 2*error_{D_i}(c, h_i)) / (1 - error_{D_i}(c, h_i)) ≥ 2/3 in each of the at most T/2 remaining rounds. Hence

    D_t(R_t) ≥ D_1(R_t) * 2^{T/2 - 1} * (2/3)^{T/2},

so

    D_1(R_t) ≤ 2 * (3/4)^{T/2} * D_t(R_t) ≤ (1/2) * (3/4)^{T/2}.

Finally,

    error_D(c, h_fin) = D_1(R) = Σ_{t=T/2}^{T} D_1(R_t) ≤ (T/2 + 1) * (1/2) * (3/4)^{T/2}.

Choosing T = O(ln(1/α)), we get that error_D(c, h_fin) ≤ α. Hence, (H, P) is an (α, β)-probabilistic representation for C. Moreover, for every H_i ∈ H we have that |H_i| ≤ (e^{k_1})^T, and so

    size(H) ≤ k_1 * T = ( RepDim(C) + ln ln(T/β) ) * T = O( ln(1/α) * ( RepDim(C) + ln ln ln(1/α) + ln ln(1/β) ) ).

The next theorem states the main result of this section: RepDim characterizes the sample complexity of private learning.

Theorem 3.15. Let C be a concept class. Θ_β( RepDim(C) / (αε) ) samples are necessary and sufficient for the private learning of the class C.

Proof. Fix some α ≤ 1/4, β ≤ 1/2, and ε. By Lemma 3.14, there exists a pair (H, P) that (α/6, β/4)-probabilistically represents the class C, where size(H) = O( ln(1/α) * (RepDim(C) + ln ln ln(1/α) + ln ln(1/β)) ). Therefore, by Lemma 3.9, there exists an algorithm A that (α, β, ε)-PPAC learns the class C with a sample size

    m = O_β( (1/(αε)) * ln(1/α) * ( RepDim(C) + ln ln ln(1/α) ) ).

For the lower bound, let A be an (α, β, ε)-PPAC learner for the class C with a sample size m, where α ≤ 1/4 and β ≤ 1/2. By Lemma 3.12, there exists an (H, P) that (1/4, 1/4)-probabilistically represents the class C with size(H) = ln(8·ln 4) + 8αεm. Therefore, by definition, RepDim(C) ≤ ln(8·ln 4) + 8αεm. Thus,

    m ≥ (1/(8αε)) * ( RepDim(C) - ln(8·ln 4) ) = Ω( RepDim(C) / (αε) ).

4. FROM A PROBABILISTIC REPRESENTATION TO A DETERMINISTIC REPRESENTATION

In this section we establish a connection between the (probabilistic) representation dimension of a class and its deterministic representation dimension.

Observation 4.1. Let (H, P) be an (α, β)-probabilistic representation for a concept class C. Then B = ∪_{H_i∈H} H_i is an α-representation of C.

Proof. As (H, P) is an (α, β)-probabilistic representation for C, for every c and every D,

    Pr_P[ ∃ h ∈ H_i s.t. error_D(c, h) ≤ α ] ≥ 1 - β > 0,

where the probability is over choosing a set H_i ∼_P H. In particular, for every c and every D there exists an H_i ∈ H that contains an α-good hypothesis.
The simple construction in Observation 4.1 may result in a very large deterministic representation. For example, in Claim 7.1 we show an (H, P) that (α, β)-probabilistically represents the class POINT_d, where H contains all the sets of at most (4/α)·ln(1/β) boolean functions. While ∪_{H_i∈H} H_i, which is the set of all boolean functions over X_d, is indeed an α-representation for POINT_d, it is extremely over-sized. We will show that it is not necessary to take the union of all the H_i's in H in order to get an α-representation for C. As (H, P) is an (α, β)-probabilistic representation, for every c and every D, with probability at least 1 - β a randomly chosen H_i ∼_P H contains an α-good hypothesis. The straightforward strategy here is to first boost β as in Claim 3.13, and then use the union bound over all possible c ∈ C and over all possible distributions D on X_d. Unfortunately, there are infinitely many such distributions, and the proof is somewhat more complicated.

Definition 4.2. Let H = {H_1, H_2, ..., H_r} be a family of hypothesis classes, and P be a distribution over {1, ..., r}. We denote the following non-private algorithm as Learner(H, P, m, γ):

  Input: a sample S = (x_i, y_i)_{i=1}^m.
  1. Randomly choose H_i ∼_P H.
  2. If error_S(h) > γ for every h ∈ H_i, then fail.
  3. Return h ∈ H_i minimizing error_S(h).

We will say that Learner(H, P, m, γ) is β-successful for a class C over X_d if for every c ∈ C and every distribution D on X_d, given an input sample of size m drawn i.i.d. according to D and labeled by c, algorithm Learner fails with probability at most β.
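Definition 4.2 translates almost verbatim into Python. In this illustrative sketch (names are assumptions of the sketch), returning None plays the role of the "fail" outcome:

```python
import numpy as np

def learner(sample, hypothesis_classes, P, gamma, rng=None):
    """Non-private algorithm Learner(H, P, m, gamma) of Definition 4.2."""
    rng = rng or np.random.default_rng()
    H_i = hypothesis_classes[rng.choice(len(hypothesis_classes), p=P)]  # step 1

    def emp_error(h):
        return sum(1 for x, y in sample if h(x) != y) / len(sample)

    best = min(H_i, key=emp_error)   # step 3: empirical-error minimizer in H_i
    if emp_error(best) > gamma:      # step 2: fail if every h in H_i exceeds gamma
        return None
    return best
```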
Claim 4.3. If (H, P) is an (α, β)-probabilistic representation for a class C, then, for m ≥ (3/α)·ln(1/β), algorithm Learner(H, P, m, 2α) is 2β-successful for C.

Proof. We show that with probability at least 1 - 2β, the set H_i (chosen in step 1) contains at least one hypothesis h with error_S(h) ≤ 2α. As (H, P) is an (α, β)-probabilistic representation for the class C, the chosen H_i will contain a hypothesis h with error_D(c, h) ≤ α with probability at least 1 - β; by the Chernoff bound, with probability at least 1 - exp(-αm/3) this hypothesis has empirical error at most 2α. The set H_i therefore contains a hypothesis h with error_S(h) ≤ 2α with probability at least (1 - β)(1 - exp(-αm/3)) > 1 - (β + exp(-αm/3)), which is at least 1 - 2β for m ≥ (3/α)·ln(1/β).

Claim 4.4. Let H be a family of hypothesis classes, and P a distribution on it. Let γ, β and m be such that m ≥ (4/γ) * (size(H) + ln(1/β)). If Learner(H, P, m, γ) is β-successful for a class C over X_d, then there exists Ĥ ⊆ H and a distribution P̂ on it, such that Learner(Ĥ, P̂, m, γ) is a (2γ, 3β)-PAC learner for C and |Ĥ| = dm/β².

Proof. For every input S = (x_i, y_i)_{i=1}^m, denote by p_S the probability that Learner(H, P, m, γ) fails on step 2 (the probability is only over the choice of H_i ∼_P H in the first step). As Learner(H, P, m, γ) is β-successful,

    Pr_{P,D}[ Learner(H, P, m, γ) fails ] = Σ_S Pr_D[S] * p_S ≤ β.

Consider the following process, denoted Proc, for randomly choosing a multiset Ĥ of size t (t will be set later): for i = 1, ..., t, randomly choose H_i ∼_P H; return Ĥ = (H_1, H_2, ..., H_t).

Denote by U_t the uniform distribution on {1, 2, ..., t}. As before, for every input S = (x_i, y_i)_{i=1}^m, denote by p̂_S the probability that Learner(Ĥ, U_t, m, γ) fails on its second step (again, the probability is only over the choice of H_i ∼_{U_t} Ĥ in the first step). Using those notations:

    Pr_{U_t,D}[ Learner(Ĥ, U_t, m, γ) fails ] = Σ_S Pr_D[S] * p̂_S.

Fix a sample S. As the choice of H_i ∼_{U_t} Ĥ is uniform,

    p̂_S = |{ H_i ∈ Ĥ : error_S(h) > γ for every h ∈ H_i }| / t.

Using the Hoeffding bound,

    Pr_Proc[ |p̂_S - p_S| ≥ β ] ≤ 2e^{-2tβ²},

where the probability is over choosing the multiset Ĥ. There are at most 2^{(d+1)m} samples of size m (as every entry in the sample is an element of X_d concatenated with a label bit). Using the union bound over all possible samples S,

    Pr_Proc[ ∃ S s.t. |p̂_S - p_S| ≥ β ] ≤ 2^{(d+1)m} * 2e^{-2tβ²}.

For t ≥ dm/β² the above probability is strictly less than 1. This means that for t = dm/β² there exists a multiset Ĥ such that p̂_S ≤ p_S + β for every sample S. We will show that for this Ĥ, Learner(Ĥ, U_t, m, γ) is a (2γ, 3β)-PAC learner. Fix a target concept c ∈ C and a distribution D on X_d. Define the following two good events:

  E_1: Learner(Ĥ, U_t, m, γ) outputs a hypothesis h such that error_S(h) ≤ γ.
  E_2: For every h ∈ H_i with error_S(h) ≤ γ, it holds that error_D(c, h) ≤ 2γ.

Note that if those two events happen, Learner(Ĥ, U_t, m, γ) returns a 2γ-good hypothesis for c and D. We show that those two events happen with high probability. We start by bounding the failure probability of Learner(Ĥ, U_t, m, γ):

    Pr_{U_t,D}[ Learner(Ĥ, U_t, m, γ) fails ] = Σ_S Pr_D[S] * p̂_S ≤ Σ_S Pr_D[S] * (p_S + β) = Pr_{P,D}[ Learner(H, P, m, γ) fails ] + β ≤ 2β.

When Learner(Ĥ, U_t, m, γ) does not fail, it returns a hypothesis h with empirical error at most γ. Thus, Pr[E_1] ≥ 1 - 2β.

Using the Chernoff bound, the probability that a hypothesis h with error_D(c, h) > 2γ has empirical error at most γ is less than exp(-γm/4). Using the union bound, the probability that there is such a hypothesis in H_i is at most |H_i| * exp(-γm/4). Therefore, Pr[E_2] ≥ 1 - |H_i| * exp(-γm/4), which is at least 1 - β for m ≥ (4/γ) * ln(|H_i|/β).

All in all, the probability of Learner(Ĥ, U_t, m, γ) failing to output a 2γ-good hypothesis is at most 3β.

Theorem 4.5. If there exists a pair (H, P) that (α, β)-probabilistically represents a class C over X_d (where H might be very big), then there exists a pair (Ĥ, P̂) that (4α, 6β)-probabilistically represents C, where Ĥ ⊆ H and

    |Ĥ| = 3d * ( size(H) + ln(1/β) ) / (αβ²).

Proof. Let (H, P) be an (α, β)-probabilistic representation for a class C. Set m = (3/α) * (size(H) + ln(1/β)). By Claim 4.3, Learner(H, P, m, 2α) is 2β-successful for the class C. By Claim 4.4, there exists Ĥ ⊆ H and a distribution P̂ on it, such that Learner(Ĥ, P̂, m, 2α) is a (4α, 6β)-PAC learner for C and |Ĥ| ≤ dm/β² = 3d(size(H) + ln(1/β))/(αβ²).

Assume towards contradiction that (Ĥ, P̂) does not (4α, 6β)-represent C. Then there exist a concept c ∈ C and a distribution D such that, with probability strictly greater than 6β, a randomly chosen H_i ∼_{P̂} Ĥ does not contain a 4α-good hypothesis for c, D. Therefore, for those c and D, Learner(Ĥ, P̂, m, 2α) fails to return a 4α-good hypothesis with probability strictly greater than 6β, a contradiction.

Theorem 4.6. For every class C over X_d there exists a (1/4)-representation B such that size(B) = O(ln(d) + RepDim(C)).

Proof. By Lemma 3.14, there exists a pair (H, P) that (1/16, 1/12)-probabilistically represents C such that size(H) = O(RepDim(C)). Using Theorem 4.5, there exists a pair (Ĥ, P̂) that (1/4, 1/2)-probabilistically represents C, such that size(Ĥ) ≤ size(H) and |Ĥ| = O(d * size(H)). We can now use Observation 4.1 and construct the set B = ∪_{H_i∈Ĥ} H_i, which is a (1/4)-representation for the class C. In addition, |B| = O( |Ĥ| * e^{size(H)} ) = O( d * size(H) * e^{size(H)} ). Thus, size(B) = ln|B| = O( ln(d) + RepDim(C) ).

Corollary 4.7. For every concept class C over X_d, DRepDim(C) = O( ln(d) + RepDim(C) ).

Corollary 4.8. There exists a constant N such that for every concept class C over X_d where RepDim(C) ≥ N·log(d), the sample complexity that is necessary and sufficient for privately learning C is Θ_{α,β}(DRepDim(C)).

5. PROBABILISTIC REPRESENTATION FOR PRIVATELY SOLVING OPTIMIZATION PROBLEMS

The notion of probabilistic representation applies not only to private learning, but also to the broader task of optimization problems. We consider the following scenario:

Definition 5.1. An optimization problem OPT over a universe X and a set of solutions F is defined by a quality function q : X^m × F → [0, 1]. Given a database S, the task is to choose a solution f ∈ F such that q(S, f) is maximized.

Notation. We will refer to the optimization problem defined by a quality function q as OPT_q.

Definition 5.2. An α-good solution for a database S is a solution s such that q(S, s) ≥ max_{f∈F}{q(S, f)} - α.

Given an optimization problem OPT_q, one can use the exponential mechanism to choose a solution s ∈ F. In general, this method achieves a reasonable solution only for databases of size m = Ω(log|F| / ε). To see this, consider a case where there exists a database S of m records such that exactly one solution t ∈ F has quality q(S, t) = 1, and every other f ∈ F has quality q(S, f) = 1/2. The probability of the exponential mechanism choosing t is

    Pr[t is chosen] = exp(εm/2) / ( (|F| - 1) * exp(εm/4) + exp(εm/2) ).

Unless

    m ≥ (4/ε) * ln(|F| - 1) = Ω( (1/ε) * ln|F| ),    (4)

the above probability is strictly less than 1/2.
Using our notions of probabilistic representation, it might be possible to reduce the necessary database size. Consider using the exponential mechanism for choosing a solution s, not out of F, but rather from a smaller set of solutions B. Roughly speaking, the factor of ln|F| in requirement (4) will now be replaced with ln|B|, which corresponds to the size of the representation. Therefore, the database size m should be at least ln|B|/ε; that is, m needs to be bigger than the size of the representation by at least a factor of 1/ε. In the following analysis we will denote this required gap, i.e., m/ln|B|, by Δ. We will see that the existence of an ε-private approximation algorithm implies a probabilistic representation with Δ slightly below 1/ε, and that a probabilistic representation with Δ > 1 implies a private approximation algorithm. A bigger Δ corresponds to better privacy; however, it might be harder to achieve.

Definition 5.3. Let OPT_q be an optimization problem over a universe X and a set of solutions F. Let B be a set of solutions, and denote size(B) = ln|B|. We say that B is an α-deterministic representation of OPT_q for databases of m elements if for every S ∈ X^m there exists a solution s ∈ B such that q(S, s) ≥ max_{f∈F}{q(S, f)} - α.

Definition 5.4. Let B be an α-deterministic representation of OPT_q for databases of m elements, and denote Δ = m/size(B). If Δ > 1, then we say that the ratio of B is Δ.

An α-deterministic representation B with ratio Δ is required to support all the databases of m = Δ * size(B) elements. That is, for every S ∈ X^m, the set B is required to contain at least one α-good solution. Fix S ∈ X^m. Intuitively, Δ controls the relation between m and the number of bits needed to represent an α-good solution for S: as B contains an α-good solution for S, and assuming B is publicly known, this solution can be represented with ln|B| = size(B) = m/Δ bits.

Definition 5.5. Let OPT_q be an optimization problem over a universe X and a set of solutions F. Let P be a distribution over {1, 2, ..., r}, and let B = {B_1, B_2, ..., B_r} be a family of solution sets for OPT_q. We denote size(B) = max{ ln|B_i| : B_i ∈ B }. We say that (B, P) is an (α, β)-probabilistic representation of OPT_q for databases of m elements if for every S ∈ X^m:

    Pr_P[ ∃ s ∈ B_i s.t. q(S, s) ≥ max_{f∈F}{q(S, f)} - α ] ≥ 1 - β.

Definition 5.6. Let (B, P) be an (α, β)-probabilistic representation of OPT_q for databases of m elements, and denote Δ = m/size(B). If Δ > 1, then we say that the ratio of the representation is Δ.

Definition 5.7. An optimization problem OPT_q is bounded if |q(S_1, f) - q(S_2, f)| ≤ 1/m for every solution f and every two neighboring databases S_1, S_2.

We are interested in approximating bounded optimization problems while guaranteeing differential privacy:

Definition 5.8. Let OPT_q be a bounded optimization problem over a universe X and a set of solutions F. An algorithm A is an (α, β, ε)-private approximation algorithm for OPT_q with a database of m records if:

  1. Algorithm A is ε-differentially private (as formulated in Definition 2.1);
  2. For every S ∈ X^m, algorithm A outputs with probability at least 1 - β a solution s such that q(S, s) ≥ max_{f∈F}{q(S, f)} - α.

Example 5.9 (Sanitization). Consider a class of predicates C over X. A database S contains m points taken from X. A predicate query Q_c for c ∈ C is defined as

    Q_c(S) = |{x_i ∈ S : c(x_i) = 1}| / |S|.

Blum et al. [5] defined a sanitizer (or data release mechanism) as a differentially private algorithm that, on input a database S, outputs another database Ŝ with entries taken from X. A sanitizer A is (α, β)-useful for predicates in the class C if for every database S it holds that

    Pr_A[ |Q_c(S) - Q_c(Ŝ)| ≤ α for every c ∈ C ] ≥ 1 - β.

This scenario can be viewed as a bounded optimization problem: the solutions are sanitized databases. For an input database S and a sanitized database Ŝ, the quality function is

    q(S, Ŝ) = 1 - max_{c∈C} |Q_c(S) - Q_c(Ŝ)|.

To see that this optimization problem is bounded, note that for every two neighboring databases S_1, S_2 of m elements, and every c ∈ C, it holds that |Q_c(S_1) - Q_c(S_2)| ≤ 1/m. Therefore, for every sanitized database f,

    |q(S_1, f) - q(S_2, f)| = | max_{c∈C}{|Q_c(S_1) - Q_c(f)|} - max_{c∈C}{|Q_c(S_2) - Q_c(f)|} | ≤ 1/m.
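For concreteness, the predicate queries and the quality function of Example 5.9 can be written as the following sketch (names are illustrative; predicates are modeled as boolean callables):

```python
def predicate_query(database, predicate):
    """Q_c(S): the fraction of records in S on which the predicate evaluates to 1."""
    return sum(1 for x in database if predicate(x) == 1) / len(database)

def sanitization_quality(S, S_hat, predicates):
    """q(S, S_hat) = 1 - max_c |Q_c(S) - Q_c(S_hat)| from Example 5.9.

    A sanitizer is (alpha, beta)-useful if, with probability at least 1 - beta,
    its output keeps this quality above 1 - alpha.
    """
    return 1.0 - max(abs(predicate_query(S, c) - predicate_query(S_hat, c))
                     for c in predicates)
```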
The next two lemmas establish an equivalence between a private approximation algorithm and a probabilistic representation for a bounded optimization problem.

Lemma 5.10. Let OPT_q be a bounded optimization problem over a universe X. If there exists a pair (B, P) that (α, β)-probabilistically represents OPT_q for databases of m elements, such that the ratio of (B, P) is Δ > 1, then for every α̂, β̂, ε satisfying

    Δ ≥ (2/(εα̂)) * ( 1 + ln(1/β̂)/size(B) ),

there exists an (α + α̂, β + β̂, ε)-approximation algorithm for OPT_q with a database of size m.

Proof. Consider the following algorithm A:

  Inputs: a database S ∈ X^m, and a privacy parameter ε.
  1. Randomly choose B_i ∼_P B.
  2. Choose s ∈ B_i using the exponential mechanism, that is, with probability

        exp(εm * q(S, s)/2) / Σ_{f∈B_i} exp(εm * q(S, f)/2).

By the properties of the exponential mechanism, A is ε-differentially private. Fix a database S ∈ X^m, and define the following 2 bad events:

  E_1: The set B_i chosen in step 1 does not contain a solution s with q(S, s) ≥ max_{f∈F}{q(S, f)} - α.
  E_2: The solution s chosen in step 2 is such that q(S, s) < max_{t∈B_i}{q(S, t)} - α̂.

Note that if those two bad events do not occur, algorithm A outputs a solution s such that q(S, s) ≥ max_{f∈F}{q(S, f)} - α - α̂. As (B, P) is an (α, β)-probabilistic representation of OPT_q for databases of size m, event E_1 happens with probability at most β. By the properties of the exponential mechanism, the probability of event E_2 is bounded by |B_i| * exp(-εα̂m/2). As m = Δ * size(B) and |B_i| ≤ e^{size(B)}, this probability is at most

    Pr[E_2] ≤ e^{size(B)} * exp(-εα̂ * Δ * size(B)/2)
            ≤ e^{size(B)} * exp( -(1 + ln(1/β̂)/size(B)) * size(B) )
            = e^{size(B)} * e^{-size(B)} * e^{-ln(1/β̂)} = β̂.

Therefore, algorithm A outputs an (α + α̂)-good solution with probability at least 1 - β - β̂.

Lemma 5.11. Let OPT_q be an optimization problem. If there exists an (α, β, ε)-private approximation algorithm for OPT_q with a database of m records, then for every β̂ satisfying

    m / ( ln(1/(1 - β)) + ln ln(1/β̂) + εm ) > 1,

there exists a pair (B, P) that (α, β̂)-probabilistically represents OPT_q for databases of m elements, where the ratio of the representation is m / ( ln(1/(1 - β)) + ln ln(1/β̂) + εm ).

Proof. Let A be an (α, β, ε)-private approximation algorithm for OPT_q with a database size m. Fix an arbitrary input database S ∈ X^m. Define G_S as the set of all solutions s, possibly outputted by A, such that q(S, s) ≥ max_{f∈F}{q(S, f)} - α. As A is an (α, β, ε)-approximation algorithm, Pr_A[A(S) ∈ G_S] ≥ 1 - β. As A is ε-differentially private, Pr_A[A(0^m) ∈ G_S] ≥ (1 - β) * e^{-εm}, where 0^m is a database with m zeros. That is, Pr_A[A(0^m) ∉ G_S] ≤ 1 - (1 - β) * e^{-εm}.

Now, consider a set B containing the outcomes of Γ = (1/(1 - β)) * ln(1/β̂) * e^{εm} executions of A(0^m). The probability that B does not contain a solution s ∈ G_S is at most

    ( 1 - (1 - β) * e^{-εm} )^Γ ≤ β̂.

Thus, B = { B ⊆ support(A) : |B| ≤ Γ }, and P, the distribution induced by A(0^m), are an (α, β̂)-probabilistic representation of OPT_q for databases with m elements. Moreover, the ratio of the representation is

    m / size(B) = m / max{ ln|B| : B ∈ B } = m / ( ln(1/(1 - β)) + ln ln(1/β̂) + εm ).

5.1 Exact 3SAT

Consider the following bounded optimization problem, denoted OPT_E3SAT: the universe X is the set of all possible clauses with exactly 3 different literals over n variables, and the set of solutions F is the set of all possible 2^n assignments. Given a database S = (σ_1, σ_2, ..., σ_m) containing m E3CNF clauses, the quality of an assignment a ∈ F is q(S, a) = (1/m) * |{i : a(σ_i) = 1}|.

Aiming at the (very different) objective of secure protocols for search problems, Beimel et al. [2] defined the notion of solution-list algorithms, which corresponds to our notion of deterministic representation. We next rephrase their results using our notations.

(R1) For every α > 0 and every Δ > 1, there exists a set B that (α + 1/8)-deterministically represents OPT_E3SAT for databases of size m = O( Δ * (ln ln(n) + ln(1/α)) ), with ratio Δ.

(R2) Let α < 1/2 and Δ > 1. For every set B that α-deterministically represents OPT_E3SAT for databases of size m with ratio Δ, it holds that m = Ω( Δ * ln ln(n) ).

Using (R1) and a deterministic version of Lemma 5.10, for every α, β, ε > 0 there exists a (1/8 + α, β, ε)-approximation algorithm for OPT_E3SAT with a database of m = O_{α,β,ε}(ln ln(n)) clauses. By (R2), this is the best possible using a deterministic representation.

We can reduce the necessary database size using a probabilistic representation. Fix a clause with three different literals. If we pick an assignment at random, then with probability at least 7/8 it satisfies the clause. Now, fix any exact 3CNF formula. If we pick an assignment at random, then the expected fraction of satisfied clauses is at least 7/8. Moreover, for every 0 < α < 7/8, the fraction of satisfied clauses is at least 7/8 - α with probability at least α/(α + 1/8). So, if we pick t = ln(1/β) / ln( (α + 1/8)/(1/8) ) random assignments, the probability that none of them satisfies at least a (7/8 - α) fraction of the clauses is at most ( (1/8)/(α + 1/8) )^t = β.

So, for every Δ > 1, B = {B : B is a set of at most t assignments}, and P, the distribution induced on B by randomly picking t assignments, are a (1/8 + α, β)-probabilistic representation of OPT_E3SAT for databases of size Δ * ln(t), with ratio Δ. By Lemma 5.10, for every ε there exists a (1/8 + α, β, ε)-approximation algorithm for OPT_E3SAT with a database of m = O_{α,β,ε}(1) clauses.
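A small sketch of the probabilistic representation for OPT_E3SAT, namely t uniformly random assignments, together with the quality function q(S, a). The clause encoding (tuples of signed, non-zero variable indices) and all names are assumptions of this illustration:

```python
import random

def satisfied_fraction(clauses, assignment):
    """q(S, a): fraction of exact-3CNF clauses satisfied by the assignment.
    A literal v > 0 stands for x_v and v < 0 for its negation."""
    def satisfied(clause):
        return any((lit > 0) == assignment[abs(lit)] for lit in clause)
    return sum(1 for clause in clauses if satisfied(clause)) / len(clauses)

def random_assignment_representation(n_vars, t, rng=None):
    """t uniformly random assignments over variables 1..n_vars.

    For any fixed E3CNF database, at least one of them satisfies close to a 7/8
    fraction of the clauses, except with the small failure probability computed above."""
    rng = rng or random.Random()
    return [{v: rng.random() < 0.5 for v in range(1, n_vars + 1)} for _ in range(t)]
```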
6. EXTENSIONS

6.1 (ε, δ)-Differential Privacy

The notion of ε-differential privacy was generalized to (ε, δ)-differential privacy, where the requirement in inequality (1) is changed to

    Pr[A(S_1) ∈ F] ≤ exp(ε) * Pr[A(S_2) ∈ F] + δ.

The proof of Lemma 3.12 remains valid even if algorithm A is only (ε, δ)-differentially private for

    δ ≤ (1/8) * e^{-8αεm} * (1 - e^{-ε}).    (5)

To see this, note that inequality (3) changes to

    Pr_A[A(0^m_σ) ∈ G^{1/4}_{c,D}] ≥ e^{-8αεm} * Pr_A[A(S_good) ∈ G^{1/4}_{c,D}] - δ * Σ_{i=0}^{8αm - 1} e^{-iε}
                                   ≥ (1/4) * e^{-8αεm} - δ/(1 - e^{-ε})
                                   ≥ (1/8) * e^{-8αεm}.

The rest of the proof remains almost intact (only minor changes in the constants). With that in mind, we see that the lower bound shown in Theorem 3.15 for ε-differentially private (that is, with δ = 0) learners also applies to (ε, δ)-differentially private learners satisfying inequality (5). That is, every such learner for a class C must use Ω( RepDim(C)/(αε) ) samples. When using (ε, δ)-differential privacy, δ should be negligible in the security parameter, that is, in d, the representation length of elements in X_d. Therefore, using (ε, δ)-differential privacy instead of ε-differential privacy cannot reduce the sample complexity for PPAC learning a concept class C whenever RepDim(C) = O(log(d)).

6.2 Probabilistic Representation Using a Hypothesis Class

We now consider a generalization of our representation notions that can be useful when considering PPAC learners that use a specific hypothesis class. In particular, these notions can be useful when considering proper PPAC learners, that is, learners that learn a class C using a hypothesis class B ⊆ C.

Definition 6.1. We define the α-Deterministic Representation Dimension of a concept class C using a hypothesis class


More information

Block designs and statistics

Block designs and statistics Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent

More information

A Simple Regression Problem

A Simple Regression Problem A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where

More information

CSE525: Randomized Algorithms and Probabilistic Analysis May 16, Lecture 13

CSE525: Randomized Algorithms and Probabilistic Analysis May 16, Lecture 13 CSE55: Randoied Algoriths and obabilistic Analysis May 6, Lecture Lecturer: Anna Karlin Scribe: Noah Siegel, Jonathan Shi Rando walks and Markov chains This lecture discusses Markov chains, which capture

More information

Improved Guarantees for Agnostic Learning of Disjunctions

Improved Guarantees for Agnostic Learning of Disjunctions Iproved Guarantees for Agnostic Learning of Disjunctions Pranjal Awasthi Carnegie Mellon University pawasthi@cs.cu.edu Avri Blu Carnegie Mellon University avri@cs.cu.edu Or Sheffet Carnegie Mellon University

More information

Handout 7. and Pr [M(x) = χ L (x) M(x) =? ] = 1.

Handout 7. and Pr [M(x) = χ L (x) M(x) =? ] = 1. Notes on Coplexity Theory Last updated: October, 2005 Jonathan Katz Handout 7 1 More on Randoized Coplexity Classes Reinder: so far we have seen RP,coRP, and BPP. We introduce two ore tie-bounded randoized

More information

Lecture 21. Interior Point Methods Setup and Algorithm

Lecture 21. Interior Point Methods Setup and Algorithm Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and

More information

On the Inapproximability of Vertex Cover on k-partite k-uniform Hypergraphs

On the Inapproximability of Vertex Cover on k-partite k-uniform Hypergraphs On the Inapproxiability of Vertex Cover on k-partite k-unifor Hypergraphs Venkatesan Guruswai and Rishi Saket Coputer Science Departent Carnegie Mellon University Pittsburgh, PA 1513. Abstract. Coputing

More information

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness

A Note on Scheduling Tall/Small Multiprocessor Tasks with Unit Processing Time to Minimize Maximum Tardiness A Note on Scheduling Tall/Sall Multiprocessor Tasks with Unit Processing Tie to Miniize Maxiu Tardiness Philippe Baptiste and Baruch Schieber IBM T.J. Watson Research Center P.O. Box 218, Yorktown Heights,

More information

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science

A Better Algorithm For an Ancient Scheduling Problem. David R. Karger Steven J. Phillips Eric Torng. Department of Computer Science A Better Algorith For an Ancient Scheduling Proble David R. Karger Steven J. Phillips Eric Torng Departent of Coputer Science Stanford University Stanford, CA 9435-4 Abstract One of the oldest and siplest

More information

Learnability and Stability in the General Learning Setting

Learnability and Stability in the General Learning Setting Learnability and Stability in the General Learning Setting Shai Shalev-Shwartz TTI-Chicago shai@tti-c.org Ohad Shair The Hebrew University ohadsh@cs.huji.ac.il Nathan Srebro TTI-Chicago nati@uchicago.edu

More information

New Bounds for Learning Intervals with Implications for Semi-Supervised Learning

New Bounds for Learning Intervals with Implications for Semi-Supervised Learning JMLR: Workshop and Conference Proceedings vol (1) 1 15 New Bounds for Learning Intervals with Iplications for Sei-Supervised Learning David P. Helbold dph@soe.ucsc.edu Departent of Coputer Science, University

More information

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine CSCI699: Topics in Learning and Gae Theory Lecture October 23 Lecturer: Ilias Scribes: Ruixin Qiang and Alana Shine Today s topic is auction with saples. 1 Introduction to auctions Definition 1. In a single

More information

Combining Classifiers

Combining Classifiers Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/

More information

On Constant Power Water-filling

On Constant Power Water-filling On Constant Power Water-filling Wei Yu and John M. Cioffi Electrical Engineering Departent Stanford University, Stanford, CA94305, U.S.A. eails: {weiyu,cioffi}@stanford.edu Abstract This paper derives

More information

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Soft Coputing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Beverly Rivera 1,2, Irbis Gallegos 1, and Vladik Kreinovich 2 1 Regional Cyber and Energy Security Center RCES

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE7C (Spring 018: Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee7c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee7c@berkeley.edu October 15,

More information

Randomized Recovery for Boolean Compressed Sensing

Randomized Recovery for Boolean Compressed Sensing Randoized Recovery for Boolean Copressed Sensing Mitra Fatei and Martin Vetterli Laboratory of Audiovisual Counication École Polytechnique Fédéral de Lausanne (EPFL) Eail: {itra.fatei, artin.vetterli}@epfl.ch

More information

Polygonal Designs: Existence and Construction

Polygonal Designs: Existence and Construction Polygonal Designs: Existence and Construction John Hegean Departent of Matheatics, Stanford University, Stanford, CA 9405 Jeff Langford Departent of Matheatics, Drake University, Des Moines, IA 5011 G

More information

Bounds on the Minimax Rate for Estimating a Prior over a VC Class from Independent Learning Tasks

Bounds on the Minimax Rate for Estimating a Prior over a VC Class from Independent Learning Tasks Bounds on the Miniax Rate for Estiating a Prior over a VC Class fro Independent Learning Tasks Liu Yang Steve Hanneke Jaie Carbonell Deceber 01 CMU-ML-1-11 School of Coputer Science Carnegie Mellon University

More information

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

16 Independence Definitions Potential Pitfall Alternative Formulation. mcs-ftl 2010/9/8 0:40 page 431 #437

16 Independence Definitions Potential Pitfall Alternative Formulation. mcs-ftl 2010/9/8 0:40 page 431 #437 cs-ftl 010/9/8 0:40 page 431 #437 16 Independence 16.1 efinitions Suppose that we flip two fair coins siultaneously on opposite sides of a roo. Intuitively, the way one coin lands does not affect the way

More information

Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5,

Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5, Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5, 2015 31 11 Motif Finding Sources for this section: Rouchka, 1997, A Brief Overview of Gibbs Sapling. J. Buhler, M. Topa:

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and

More information

A Smoothed Boosting Algorithm Using Probabilistic Output Codes

A Smoothed Boosting Algorithm Using Probabilistic Output Codes A Soothed Boosting Algorith Using Probabilistic Output Codes Rong Jin rongjin@cse.su.edu Dept. of Coputer Science and Engineering, Michigan State University, MI 48824, USA Jian Zhang jian.zhang@cs.cu.edu

More information

arxiv: v3 [cs.lg] 7 Jan 2016

arxiv: v3 [cs.lg] 7 Jan 2016 Efficient and Parsionious Agnostic Active Learning Tzu-Kuo Huang Alekh Agarwal Daniel J. Hsu tkhuang@icrosoft.co alekha@icrosoft.co djhsu@cs.colubia.edu John Langford Robert E. Schapire jcl@icrosoft.co

More information

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization

Support Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization Recent Researches in Coputer Science Support Vector Machine Classification of Uncertain and Ibalanced data using Robust Optiization RAGHAV PAT, THEODORE B. TRAFALIS, KASH BARKER School of Industrial Engineering

More information

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay

A Low-Complexity Congestion Control and Scheduling Algorithm for Multihop Wireless Networks with Order-Optimal Per-Flow Delay A Low-Coplexity Congestion Control and Scheduling Algorith for Multihop Wireless Networks with Order-Optial Per-Flow Delay Po-Kai Huang, Xiaojun Lin, and Chih-Chun Wang School of Electrical and Coputer

More information

Chaotic Coupled Map Lattices

Chaotic Coupled Map Lattices Chaotic Coupled Map Lattices Author: Dustin Keys Advisors: Dr. Robert Indik, Dr. Kevin Lin 1 Introduction When a syste of chaotic aps is coupled in a way that allows the to share inforation about each

More information

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic

More information

Fairness via priority scheduling

Fairness via priority scheduling Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation

More information

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition

Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition

More information

arxiv: v1 [cs.ds] 29 Jan 2012

arxiv: v1 [cs.ds] 29 Jan 2012 A parallel approxiation algorith for ixed packing covering seidefinite progras arxiv:1201.6090v1 [cs.ds] 29 Jan 2012 Rahul Jain National U. Singapore January 28, 2012 Abstract Penghui Yao National U. Singapore

More information

Supplement to: Subsampling Methods for Persistent Homology

Supplement to: Subsampling Methods for Persistent Homology Suppleent to: Subsapling Methods for Persistent Hoology A. Technical results In this section, we present soe technical results that will be used to prove the ain theores. First, we expand the notation

More information

Bounds on the Sample Complexity for Private Learning and Private Data Release

Bounds on the Sample Complexity for Private Learning and Private Data Release Bounds on the Sample Complexity for Private Learning and Private Data Release Amos Beimel 1,, Shiva Prasad Kasiviswanathan 2, and Kobbi Nissim 1,3, 1 Dept. of Computer Science, Ben-Gurion University 2

More information

Feature Extraction Techniques

Feature Extraction Techniques Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that

More information

Tight Information-Theoretic Lower Bounds for Welfare Maximization in Combinatorial Auctions

Tight Information-Theoretic Lower Bounds for Welfare Maximization in Combinatorial Auctions Tight Inforation-Theoretic Lower Bounds for Welfare Maxiization in Cobinatorial Auctions Vahab Mirrokni Jan Vondrák Theory Group, Microsoft Dept of Matheatics Research Princeton University Redond, WA 9805

More information

Bipartite subgraphs and the smallest eigenvalue

Bipartite subgraphs and the smallest eigenvalue Bipartite subgraphs and the sallest eigenvalue Noga Alon Benny Sudaov Abstract Two results dealing with the relation between the sallest eigenvalue of a graph and its bipartite subgraphs are obtained.

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths

More information

Tight Bounds for Maximal Identifiability of Failure Nodes in Boolean Network Tomography

Tight Bounds for Maximal Identifiability of Failure Nodes in Boolean Network Tomography Tight Bounds for axial Identifiability of Failure Nodes in Boolean Network Toography Nicola Galesi Sapienza Università di Roa nicola.galesi@uniroa1.it Fariba Ranjbar Sapienza Università di Roa fariba.ranjbar@uniroa1.it

More information

List Scheduling and LPT Oliver Braun (09/05/2017)

List Scheduling and LPT Oliver Braun (09/05/2017) List Scheduling and LPT Oliver Braun (09/05/207) We investigate the classical scheduling proble P ax where a set of n independent jobs has to be processed on 2 parallel and identical processors (achines)

More information

1 Identical Parallel Machines

1 Identical Parallel Machines FB3: Matheatik/Inforatik Dr. Syaantak Das Winter 2017/18 Optiizing under Uncertainty Lecture Notes 3: Scheduling to Miniize Makespan In any standard scheduling proble, we are given a set of jobs J = {j

More information

Curious Bounds for Floor Function Sums

Curious Bounds for Floor Function Sums 1 47 6 11 Journal of Integer Sequences, Vol. 1 (018), Article 18.1.8 Curious Bounds for Floor Function Sus Thotsaporn Thanatipanonda and Elaine Wong 1 Science Division Mahidol University International

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

A Theoretical Framework for Deep Transfer Learning

A Theoretical Framework for Deep Transfer Learning A Theoretical Fraewor for Deep Transfer Learning Toer Galanti The School of Coputer Science Tel Aviv University toer22g@gail.co Lior Wolf The School of Coputer Science Tel Aviv University wolf@cs.tau.ac.il

More information

Exact tensor completion with sum-of-squares

Exact tensor completion with sum-of-squares Proceedings of Machine Learning Research vol 65:1 54, 2017 30th Annual Conference on Learning Theory Exact tensor copletion with su-of-squares Aaron Potechin Institute for Advanced Study, Princeton David

More information

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley osig 1 Winter Seester 2018 Lesson 6 27 February 2018 Outline Perceptrons and Support Vector achines Notation...2 Linear odels...3 Lines, Planes

More information

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes

Graphical Models in Local, Asymmetric Multi-Agent Markov Decision Processes Graphical Models in Local, Asyetric Multi-Agent Markov Decision Processes Ditri Dolgov and Edund Durfee Departent of Electrical Engineering and Coputer Science University of Michigan Ann Arbor, MI 48109

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a ournal published by Elsevier. The attached copy is furnished to the author for internal non-coercial research and education use, including for instruction at the authors institution

More information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub

More information

Computational Learning Theory

Computational Learning Theory CS 446 Machine Learning Fall 2016 OCT 11, 2016 Computational Learning Theory Professor: Dan Roth Scribe: Ben Zhou, C. Cervantes 1 PAC Learning We want to develop a theory to relate the probability of successful

More information

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition

Upper bound on false alarm rate for landmine detection and classification using syntactic pattern recognition Upper bound on false alar rate for landine detection and classification using syntactic pattern recognition Ahed O. Nasif, Brian L. Mark, Kenneth J. Hintz, and Nathalia Peixoto Dept. of Electrical and

More information

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

The Simplex Method is Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate The Siplex Method is Strongly Polynoial for the Markov Decision Proble with a Fixed Discount Rate Yinyu Ye April 20, 2010 Abstract In this note we prove that the classic siplex ethod with the ost-negativereduced-cost

More information

On Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40

On Poset Merging. 1 Introduction. Peter Chen Guoli Ding Steve Seiden. Keywords: Merging, Partial Order, Lower Bounds. AMS Classification: 68W40 On Poset Merging Peter Chen Guoli Ding Steve Seiden Abstract We consider the follow poset erging proble: Let X and Y be two subsets of a partially ordered set S. Given coplete inforation about the ordering

More information

Interactive Markov Models of Evolutionary Algorithms

Interactive Markov Models of Evolutionary Algorithms Cleveland State University EngagedScholarship@CSU Electrical Engineering & Coputer Science Faculty Publications Electrical Engineering & Coputer Science Departent 2015 Interactive Markov Models of Evolutionary

More information

VC Dimension and Sauer s Lemma

VC Dimension and Sauer s Lemma CMSC 35900 (Spring 2008) Learning Theory Lecture: VC Diension and Sauer s Lea Instructors: Sha Kakade and Abuj Tewari Radeacher Averages and Growth Function Theore Let F be a class of ±-valued functions

More information

Algorithmic Stability and Sanity-Check Bounds for Leave-One-Out Cross-Validation

Algorithmic Stability and Sanity-Check Bounds for Leave-One-Out Cross-Validation Algorithic Stability and Sanity-Check Bounds for Leave-One-Out Cross-Validation Michael Kearns AT&T Labs Research Murray Hill, New Jersey kearns@research.att.co Dana Ron MIT Cabridge, MA danar@theory.lcs.it.edu

More information

Solutions of some selected problems of Homework 4

Solutions of some selected problems of Homework 4 Solutions of soe selected probles of Hoework 4 Sangchul Lee May 7, 2018 Proble 1 Let there be light A professor has two light bulbs in his garage. When both are burned out, they are replaced, and the next

More information

PAC-Bayes Analysis Of Maximum Entropy Learning

PAC-Bayes Analysis Of Maximum Entropy Learning PAC-Bayes Analysis Of Maxiu Entropy Learning John Shawe-Taylor and David R. Hardoon Centre for Coputational Statistics and Machine Learning Departent of Coputer Science University College London, UK, WC1E

More information

arxiv: v3 [quant-ph] 18 Oct 2017

arxiv: v3 [quant-ph] 18 Oct 2017 Self-guaranteed easureent-based quantu coputation Masahito Hayashi 1,, and Michal Hajdušek, 1 Graduate School of Matheatics, Nagoya University, Furocho, Chikusa-ku, Nagoya 464-860, Japan Centre for Quantu

More information

Ensemble Based on Data Envelopment Analysis

Ensemble Based on Data Envelopment Analysis Enseble Based on Data Envelopent Analysis So Young Sohn & Hong Choi Departent of Coputer Science & Industrial Systes Engineering, Yonsei University, Seoul, Korea Tel) 82-2-223-404, Fax) 82-2- 364-7807

More information

Boosting with log-loss

Boosting with log-loss Boosting with log-loss Marco Cusuano-Towner Septeber 2, 202 The proble Suppose we have data exaples {x i, y i ) i =... } for a two-class proble with y i {, }. Let F x) be the predictor function with the

More information

A Note on the Applied Use of MDL Approximations

A Note on the Applied Use of MDL Approximations A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention

More information

Support recovery in compressed sensing: An estimation theoretic approach

Support recovery in compressed sensing: An estimation theoretic approach Support recovery in copressed sensing: An estiation theoretic approach Ain Karbasi, Ali Horati, Soheil Mohajer, Martin Vetterli School of Coputer and Counication Sciences École Polytechnique Fédérale de

More information

A Probabilistic and RIPless Theory of Compressed Sensing

A Probabilistic and RIPless Theory of Compressed Sensing A Probabilistic and RIPless Theory of Copressed Sensing Eanuel J Candès and Yaniv Plan 2 Departents of Matheatics and of Statistics, Stanford University, Stanford, CA 94305 2 Applied and Coputational Matheatics,

More information

time time δ jobs jobs

time time δ jobs jobs Approxiating Total Flow Tie on Parallel Machines Stefano Leonardi Danny Raz y Abstract We consider the proble of optiizing the total ow tie of a strea of jobs that are released over tie in a ultiprocessor

More information

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon Mohaad Ghavazadeh Alessandro Lazaric INRIA Lille - Nord Europe, Tea SequeL {victor.gabillon,ohaad.ghavazadeh,alessandro.lazaric}@inria.fr

More information

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel

Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel 1 Iterative Decoding of LDPC Codes over the q-ary Partial Erasure Channel Rai Cohen, Graduate Student eber, IEEE, and Yuval Cassuto, Senior eber, IEEE arxiv:1510.05311v2 [cs.it] 24 ay 2016 Abstract In

More information

lecture 36: Linear Multistep Mehods: Zero Stability

lecture 36: Linear Multistep Mehods: Zero Stability 95 lecture 36: Linear Multistep Mehods: Zero Stability 5.6 Linear ultistep ethods: zero stability Does consistency iply convergence for linear ultistep ethods? This is always the case for one-step ethods,

More information

A := A i : {A i } S. is an algebra. The same object is obtained when the union in required to be disjoint.

A := A i : {A i } S. is an algebra. The same object is obtained when the union in required to be disjoint. 59 6. ABSTRACT MEASURE THEORY Having developed the Lebesgue integral with respect to the general easures, we now have a general concept with few specific exaples to actually test it on. Indeed, so far

More information

Randomized Accuracy-Aware Program Transformations For Efficient Approximate Computations

Randomized Accuracy-Aware Program Transformations For Efficient Approximate Computations Randoized Accuracy-Aware Progra Transforations For Efficient Approxiate Coputations Zeyuan Allen Zhu Sasa Misailovic Jonathan A. Kelner Martin Rinard MIT CSAIL zeyuan@csail.it.edu isailo@it.edu kelner@it.edu

More information

On Process Complexity

On Process Complexity On Process Coplexity Ada R. Day School of Matheatics, Statistics and Coputer Science, Victoria University of Wellington, PO Box 600, Wellington 6140, New Zealand, Eail: ada.day@cs.vuw.ac.nz Abstract Process

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

Note on generating all subsets of a finite set with disjoint unions

Note on generating all subsets of a finite set with disjoint unions Note on generating all subsets of a finite set with disjoint unions David Ellis e-ail: dce27@ca.ac.uk Subitted: Dec 2, 2008; Accepted: May 12, 2009; Published: May 20, 2009 Matheatics Subject Classification:

More information