CMPUT651: Differential Privacy



Homework assignment #2
Due date: Apr. 3rd, 2018

Discussion and the exchange of ideas are essential to doing academic work. For assignments in this course, you are encouraged to consult with your classmates as you work on problem sets. However, after discussions with peers, make sure that you can work through the problems yourself and ensure that any answers you submit for evaluation are the result of your own efforts. In addition, you must cite any books, articles, websites, lectures, etc. that have helped you with your work, using appropriate citation practices. Similarly, you must list the names of students with whom you have collaborated on problem sets. You should solve all questions.

1 Warm-Up Questions

1. Based on the BLR-mechanism, give a differentially private mechanism for answering the set of all quantile-queries. Specifically, the universe $U$ consists of the $T$ integers ranging from $1$ to $T$. A quantile-query is characterized by a parameter $q \in [0, 1]$. Given a database $D \in \{1, 2, \ldots, T\}^n$ we denote its values in sorted order as $x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}$, and so we can answer the exact quantile query by outputting any value $v$ s.t. $x_{(j)} \le v$ for any $j = 1, 2, \ldots, qn$, and $x_{(j)} \ge v$ for any $j = qn, \ldots, n$.

Recall our definition from HW1: A mechanism $(\alpha, \beta)$-approximates quantile queries if for any $q$ the mechanism returns a value $v$ s.t. with probability $\ge 1 - \beta$ it holds that $x_{(j)} \le v$ for any $j = 1, 2, \ldots, (q - \alpha)n$ and $x_{(j)} \ge v$ for any $j = (q + \alpha)n, \ldots, n$.

Given $\alpha, \beta, \varepsilon$, your goal is to give one $\varepsilon$-differentially private mechanism that $(\alpha, \beta)$-approximates all quantile queries in time $O(n \log(n)) + T^{O(1/\alpha)}$, provided $n$ is sufficiently large. Specify the lower bound on $n$ as a function of $\varepsilon, \alpha, \beta$.

Outline: Argue first that it suffices to look solely at the $1/\alpha$ quantiles $q \in \{0, \alpha, 2\alpha, 3\alpha, \ldots, 1 - \alpha, 1\}$ and make sure your mechanism $(\frac{\alpha}{2}, \beta)$-approximates all these queries. Then use a BLR-type idea to approximate this particular (small) set of quantile queries with a synthetic dataset of size $O(\frac{1}{\alpha})$.

2. Show that in the $k$-fold composition of Gaussian mechanisms, the $\delta$s don't add up! Let $f : U^n \to \mathbb{R}$ be a real-valued function with $GS(f) = 1$. Let $M_1, M_2, \ldots, M_k$ be $k$ independent runs of the Gaussian mechanism that preserve $(\varepsilon, \delta)$-differential privacy. That is, in each mechanism $M_j$ we output $z_j$, which is an independent sample from $N(f(D), \sigma^2)$ with $\sigma = \frac{\sqrt{2\ln(2/\delta)}}{\varepsilon}$. Let $M$ be the mechanism that concatenates all $k$ outputs of $M_1, M_2, \ldots, M_k$. Show that $M$ preserves $(O(\varepsilon\sqrt{k} + \varepsilon^2 k), \delta)$-differential privacy. You may use the following two facts about Gaussians without proof:

For a r.v. $X$ sampled from a standard Gaussian, $X \sim N(0, 1)$, it holds that $\Pr[|X| > \sqrt{2\ln(2/\delta)}] < \delta$.

For any two independent random variables $x_1 \sim N(\mu_1, \sigma_1^2)$, $x_2 \sim N(\mu_2, \sigma_2^2)$ it holds that $x_1 + x_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$. (I.e., the sum of independent Gaussians behaves like a sample from a Gaussian centered around the sum of the means and with variance equal to the sum of the variances.)
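To make the setup of the Gaussian composition question concrete, here is a minimal Python sketch of the mechanism being composed. It assumes the noise scale $\sigma = \sqrt{2\ln(2/\delta)}/\varepsilon$ for a query of global sensitivity 1; the function names and the `seed` parameter are illustrative assumptions, not part of the assignment. The sketch only runs the mechanism and restates the second fact; it does not carry out the privacy analysis that the question asks for.

```python
import math
import random


def gaussian_sigma(eps, delta):
    """Noise scale sqrt(2 ln(2/delta)) / eps for a query with GS(f) = 1."""
    return math.sqrt(2.0 * math.log(2.0 / delta)) / eps


def k_fold_gaussian(f_value, k, eps, delta, seed=0):
    """Concatenate k independent Gaussian-mechanism outputs z_1, ..., z_k."""
    rng = random.Random(seed)  # hypothetical seeding, only for reproducibility
    sigma = gaussian_sigma(eps, delta)
    return [rng.gauss(f_value, sigma) for _ in range(k)]


def sum_distribution(f_value, k, sigma):
    """Mean and variance of z_1 + ... + z_k (sum of independent Gaussians)."""
    return k * f_value, k * sigma ** 2
```

One reason to look at the sum: for two neighboring databases the $k$ outputs differ only in their common mean, so the sum $z_1 + \ldots + z_k$ is a natural statistic to study when comparing the two output distributions.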

2 Questions

2.1 The Private Multiplicative Weights Algorithm

Complete the full details of the Private Multiplicative Weights algorithm from class.

1. Suppose there exists a non-negative hyperplane $x \in \mathbb{R}^d$ such that $\|x\|_1 = 1$ and a collection of examples $a_1, a_2, \ldots, a_T$ such that for each $j$, $\|a_j\|_\infty \le 1$. Fix $\gamma > 0$. Our goal is to give for all $a_j$ a numeric label $l_j \in [-1, 1]$ such that $|l_j - \langle a_j, x \rangle| \le \gamma$. Each time we fail to do so, we incur a mistake and we are also told if we overshot (i.e., $l_j > \langle a_j, x \rangle + \gamma$) or we undershot (i.e., $l_j < \langle a_j, x \rangle - \gamma$). Show that the Multiplicative Weights algorithm (with parameter $\frac{\gamma}{2}$ and suitably chosen cost vectors) makes at most $\frac{4\ln(d)}{\gamma^2}$ mistakes (and no more than the same number of updates).

Consider now the following version of the Private MW, where the dataset is represented as a probability distribution $D$ over $U$ (with $T$ elements). You are given $k$ queries where each query $q_j$ is in the form of a vector $v_j \in \mathbb{R}^T$ with $\|v_j\|_\infty \le 1$, whose answer is $\langle v_j, D \rangle$. So $q_j$ has sensitivity $\frac{1}{n}$.

procedure Private MW (Taking parameters: $\alpha$, $\sigma$.)
  Init: $t \leftarrow 0$, $D_0 = (\frac{1}{T}, \ldots, \frac{1}{T})$
  Sample $T_t \sim Lap(\sigma)$
  For each query $q_j$:
    (*) Sample $X_j, Y_j \sim Lap(\sigma)$ ind.
    if ($\langle v_j, D_t \rangle + X_j > \langle v_j, D \rangle + \frac{3}{4}\alpha + T_t$) then
      Update $D_t$ using MW with parameter $\frac{\alpha}{2}$ and cost $= v_j$ to get $D_{t+1}$.
      Increment($t$) and if ($t > \frac{16\ln(T)}{\alpha^2}$) abort
      Resample $T_t \sim Lap(\sigma)$
      Goto (*) (repeat the same loop until we answer $q_j$)
    else if ($\langle v_j, D_t \rangle + Y_j < \langle v_j, D \rangle - \frac{3}{4}\alpha + T_t$) then
      Update $D_t$ using MW with parameter $\frac{\alpha}{2}$ and cost $= -v_j$ to get $D_{t+1}$.
      Increment($t$) and if ($t > \frac{16\ln(T)}{\alpha^2}$) abort
      Resample $T_t \sim Lap(\sigma)$
      Goto (*) (repeat the same loop until we answer $q_j$)
    else answer $\langle v_j, D_t \rangle$

2. Given $0 < \beta < \frac{1}{e}$, show that w.p. $\ge 1 - \beta$ all random variables in this algorithm (namely all $X_j$s and $Y_j$s and $T_t$s, a total of $3k$ random variables) are always upper bounded in magnitude by $\tau = 6\sigma\ln(k/\beta)$.
Infer that if $\tau < \alpha/8$ then (a) all of our answers to the queries are always within a bound of $\alpha$, and that (b) updates of the MW-algorithm are over examples where we overshot or undershot by at least $\frac{\alpha}{2}$, and thus (c) our algorithm doesn't abort.

3. Denote $c = \frac{16\ln(T)}{\alpha^2}$. Given $\varepsilon$, set $\sigma$ so that the $c$-fold composition of the sparse vector over $2k$ queries of sensitivity $\frac{1}{n}$ is $\varepsilon$-DP overall (using basic composition). Find the smallest $\alpha$ for which we have that $\tau \le \alpha/8$. You are free to omit constants and use solely asymptotic notation.

4. Denote $c = \frac{16\ln(T)}{\alpha^2}$. Given $\varepsilon$, set $\sigma$ so that the $c$-fold composition of the sparse vector over $2k$ queries of sensitivity $\frac{1}{n}$ is $(\varepsilon, \delta)$-DP overall (using advanced composition). Find the smallest $\alpha$ for which we have that $\tau < \alpha/8$. You are free to omit constants and use solely asymptotic notation.
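The control flow of the Private MW procedure can be traced with a short Python sketch. Several details here are assumptions rather than statements from the assignment: the exponential form of the MW update (`mw_update`), the helper names, the `seed` parameter, thresholds of $\pm\frac{3}{4}\alpha$, cost vector $-v_j$ on an undershoot, and an update budget of $16\ln(T)/\alpha^2$. The sketch says nothing about how to choose $\sigma$, which is what questions 3 and 4 ask.

```python
import math
import random


def laplace(rng, scale):
    """Sample Lap(scale) as a scaled difference of two Exp(1) variables."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def inner(v, d):
    """Inner product <v, d>."""
    return sum(vi * di for vi, di in zip(v, d))


def mw_update(dist, cost, eta):
    """One MW step (assumed exponential form): down-weight costly coordinates."""
    weights = [p * math.exp(-eta * c) for p, c in zip(dist, cost)]
    total = sum(weights)
    return [w / total for w in weights]


def private_mw(queries, true_dist, alpha, sigma, seed=0):
    """Answer each query v_j from D_t; return None if the procedure aborts."""
    rng = random.Random(seed)
    size = len(true_dist)
    d_t = [1.0 / size] * size                 # D_0 is uniform over U
    max_updates = 16.0 * math.log(size) / alpha ** 2
    t = 0
    thresh = laplace(rng, sigma)              # T_t
    answers = []
    for v in queries:
        truth = inner(v, true_dist)
        while True:                           # the (*) loop for query q_j
            x, y = laplace(rng, sigma), laplace(rng, sigma)
            est = inner(v, d_t)
            if est + x > truth + 0.75 * alpha + thresh:
                cost = v                      # overshot
            elif est + y < truth - 0.75 * alpha + thresh:
                cost = [-vi for vi in v]      # undershot
            else:
                answers.append(est)           # answer <v_j, D_t>
                break
            d_t = mw_update(d_t, cost, alpha / 2.0)
            t += 1
            if t > max_updates:
                return None                   # abort
            thresh = laplace(rng, sigma)      # resample T_t
    return answers
```

Setting `sigma` to `0.0` removes all noise, which is a convenient way to check that the update direction is right: the synthetic distribution should move toward the true answer until the query is answered within $\alpha$.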

3 Problems

3.1 Combinatorial Optimization and Approximation

Consider the problem of Max-k-Coverage, where an instance $I$ of the problem is composed of a ground set $U$ of $n$ elements, and $m$ subsets $S_j \subseteq U$. Our goal is to find a set $T \subseteq [m]$ of size $k$ that attains $\max_{T : |T| = k} \left|\bigcup_{j \in T} S_j\right|$, where we denote the value of the optimal solution as $OPT$.

Here, we will consider algorithms for solving the Max-k-Coverage problem that are also $(\varepsilon, \delta)$-differentially private. We call two instances $I$ and $I'$ neighbors if they are based on the same ground set $U$ and both have $m$ subsets, and there exists at most one element $e \in U$ such that for any $S_j \in I$ and $S'_j \in I'$ it holds that $S_j$ and $S'_j$ are either identical or differ only on $e$.

Often, we represent the problem with a bipartite graph, with $n$ nodes on the right (each representing an element in the ground set) and $m$ nodes on the left (each representing a subset). We put an edge from an element $e$ node to a subset $S_j$ node if $e \in S_j$. Our goal is therefore to find a set of $k$ nodes on the left that are adjacent to as many nodes on the right as possible. In that respect, $I$ and $I'$ are neighbors if there exists a right node $e$ such that the two bipartite graphs that $I$ and $I'$ induce differ only on the edges that are incident to $e$.

1. Give an $O(m^k)$-time algorithm that is $(\varepsilon, \delta)$-differentially private and uses the Laplace mechanism to approximate $OPT$. What is its utility guarantee?

2. Give an $O(m^k)$-time algorithm that is $\varepsilon$-differentially private and uses the exponential mechanism. What is its utility guarantee?

3. Whereas both previous algorithms have exponential dependency on $k$, here is a simple greedy algorithm that runs in time $O(mk)$ and gives a $(1 - 1/e)$-approximation for the Max-k-Coverage problem (without concern for privacy):
(i) Initialize $T_0 = \emptyset$.
(ii) For each time $t = 1, 2, 3, \ldots, k$ find the set $S_{j_t}$ that covers the most elements out of the set of elements that are not yet covered by $\bigcup_{j \in T_{t-1}} S_j$, and set $T_t \leftarrow T_{t-1} \cup \{j_t\}$.
To analyze this algorithm we denote $c_t = OPT - \left|\bigcup_{j \in T_t} S_j\right|$. One can show that the following ratio holds:
$c_t \le \left(1 - \frac{1}{k}\right)^t OPT$.
And so $c_k \le \left(1 - \frac{1}{k}\right)^k OPT \le \frac{1}{e} OPT$, hence $\left|\bigcup_{j \in T_k} S_j\right| \ge \left(1 - \frac{1}{e}\right) OPT$.

Give an $(\varepsilon, \delta)$-differentially private version of the greedy algorithm, and analyze its utility. (Aim to optimize its utility.) That is, bound w.h.p. the gap between the private version of the greedy algorithm and the non-private version. Compare this gap to the gap you got in the previous item (question 2).
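For reference, the non-private greedy algorithm described in steps (i) and (ii) can be sketched in Python as follows. The function name, the representation of subsets as Python `set` objects, and the tie-breaking rule (lowest index wins) are assumptions made for illustration; the private version asked for above is deliberately not shown.

```python
def greedy_max_k_coverage(subsets, k):
    """Non-private greedy for Max-k-Coverage.

    subsets: list of Python sets S_0, ..., S_{m-1} over the ground set.
    Returns (indices chosen in pick order, set of covered elements).
    """
    chosen = []
    covered = set()
    for _ in range(k):
        best_j, best_gain = None, -1
        for j, s in enumerate(subsets):
            if j in chosen:
                continue
            gain = len(s - covered)        # elements of S_j not yet covered
            if gain > best_gain:           # strict: ties go to the lowest index
                best_j, best_gain = j, gain
        if best_j is None:                 # fewer than k subsets available
            break
        chosen.append(best_j)
        covered |= subsets[best_j]
    return chosen, covered
```

For example, with `subsets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]` and `k = 2`, the first pick is index 2 (four new elements) and the second is index 0 (three new elements), covering all seven elements. Each of the $k$ rounds scans all $m$ subsets, which matches the $O(mk)$ count of marginal-gain evaluations.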

3.2 Randomized-Response for Many Types

In class, we showed the Randomized-Response mechanism in the case where there were only two possible types. We now extend it to a universe of size $T$, namely $U = \{1, 2, 3, \ldots, T\}$. In this question, we examine the cost of extending the universe from 2 types to $T$ types. However, we will maintain the main property of the Randomized-Response mechanism: we apply the same mechanism $M$ to each user independently, and so it runs in the local model, without a trusted authority.

Naïve Extension of the Randomized-Response to $T$ types. The straightforward way to extend the Randomized-Response mechanism to $T$ types is to use the following mechanism on each user, whose true type is denoted as $t$:
$$\Pr[M(t) = t'] = \begin{cases} x + y, & \text{if } t' = t \\ x, & \text{for any } t' \ne t. \end{cases}$$

1. Naturally, the best utility is obtained when $x + y$ is as large as possible and $x$ is as small as possible. What are the best values of $x$ and $y$ that we can set and still have that $M$ is $\varepsilon$-DP? Assuming $\varepsilon < 1$, show that the $y$ you derive is approximately $\frac{\varepsilon}{T}$.

2. What estimator $\theta_t$ will you use to estimate the number of people of type $t$ in the dataset?

3. Show that this mechanism has a standard deviation of $O\left(\frac{\sqrt{nT}}{\varepsilon}\right)$.

The Mechanism of Bassily-Smith [STOC15]. A recent work has proposed a novel mechanism in the local model. The mechanism considers each person of type $t$ as a vector in a $T$-dimensional space of the form $(0, 0, \ldots, 0, 1, 0, \ldots, 0)$ (namely, all coordinates are 0 except for the $t$-th coordinate, which is 1). Each user then runs the Randomized-Response mechanism with parameter $p$ on each coordinate independently, where $p$ is set such that $\frac{\frac{1}{2} + p}{\frac{1}{2} - p} \le e^{\varepsilon/2}$. Thus, the signal the user sends is composed of $T$ bits, each chosen w.r.t. the corresponding bit of the user's representative vector.

4. Prove that the above algorithm is $\varepsilon$-DP.

5. Observe that, like in the standard Randomized-Response, for each type $t$, if the dataset is composed of $n_t$ people of type $t$ then the expected number of users whose report has the $t$-th coordinate set to 1 is $(\frac{1}{2} + p) \cdot n_t + (\frac{1}{2} - p) \cdot (n - n_t)$. We thus use the same estimator $\theta_t$ from standard Randomized-Response to approximate $n_t$ (only for each $t$ we look at a different coordinate). Given $\beta > 0$, based on HW1.2.1, show that there exists a constant $c$ such that
$$\Pr\left[\text{for all } t,\ |\theta_t - n_t| < \frac{c}{\varepsilon}\sqrt{n\ln(T/\beta)}\right] \ge 1 - \beta,$$
reducing our dependency on $T$ to logarithmic.
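The per-coordinate mechanism and its estimator can be sketched in Python as below. This is a sketch under assumptions: each bit of the one-hot vector is reported truthfully with probability $\frac{1}{2} + p$ and flipped otherwise, $p$ is taken as the largest value with $(\frac{1}{2} + p)/(\frac{1}{2} - p) = e^{\varepsilon/2}$, and the estimator is obtained by inverting the expectation $(\frac{1}{2} + p) n_t + (\frac{1}{2} - p)(n - n_t)$. All function names are illustrative.

```python
import math
import random


def rr_bias(eps):
    """Largest p with (1/2 + p) / (1/2 - p) = e^(eps / 2)."""
    r = math.exp(eps / 2.0)
    return (r - 1.0) / (2.0 * (r + 1.0))


def randomize_user(t, num_types, p, rng):
    """Report every bit of the one-hot vector of type t (1-indexed).

    Each bit is kept w.p. 1/2 + p and flipped w.p. 1/2 - p, independently.
    """
    report = []
    for coord in range(1, num_types + 1):
        bit = 1 if coord == t else 0
        keep = rng.random() < 0.5 + p
        report.append(bit if keep else 1 - bit)
    return report


def estimate_counts(reports, p):
    """Unbiased estimates of n_t from the number of 1s seen in coordinate t."""
    n = len(reports)
    num_types = len(reports[0])
    ones = [sum(r[c] for r in reports) for c in range(num_types)]
    return [(cnt - (0.5 - p) * n) / (2.0 * p) for cnt in ones]
```

The division by $2p$ is where the noise is amplified: for small $\varepsilon$, $p$ is on the order of $\varepsilon$, so each estimate scales the binomial fluctuation of the coordinate count by roughly $1/\varepsilon$.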

The Mechanism of Bassily-Nissim-Stemmer-Thakurta [NIPS2017]. A recent work has proposed a novel mechanism in the local model in which each user doesn't need to run the Randomized-Response $T$ times but rather just once (and thus sends just a single bit rather than $T$). The mechanism, per user $i$, works as follows:

i. We pick $T$ random variables $Z_i^1, \ldots, Z_i^T \in \{-1, 1\}$, each chosen uniformly and independently. We publish $Z_i = (Z_i^1, \ldots, Z_i^T)$.

ii. User $i$, whose true type is $t_i$, looks solely at the $t_i$ coordinate of $Z_i$, and then returns a bit $b_i \in \{-1, 1\}$ such that
$$\Pr[b_i = Z_i^{t_i}] = \frac{1 + \varepsilon/4}{2}, \qquad \Pr[b_i = -Z_i^{t_i}] = \frac{1 - \varepsilon/4}{2}.$$

6. Show that this mechanism is $\varepsilon$-DP.

7. We now show how to use this mechanism to approximate the counts $n_t \stackrel{\text{def}}{=}$ # users of type $t$. Denote $b$ as the $n$-dimensional vector of responses from all users, and $Z^t$ as the $n$-dimensional vector of the random bits chosen for all users had they been of type $t$. Show that for any type $t$ we have $E[\langle b, Z^t \rangle] = E[\sum_i b_i Z_i^t] = \frac{\varepsilon}{4} n_t$, where the expectation is taken over both the choice of the variables $Z_i^t$ and the responses $b_i$.

8. We thus set our $t$-th estimator as $\theta_t = \frac{4}{\varepsilon}\langle b, Z^t \rangle$. Show that there exists a constant $c > 0$ such that for any $\beta > 0$ we have that
$$\Pr\left[\text{for all } t,\ |\theta_t - n_t| \le \frac{c}{\varepsilon}\sqrt{n\ln(T/\beta)}\right] \ge 1 - \beta.$$
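The single-bit mechanism and the estimator $\theta_t = \frac{4}{\varepsilon}\langle b, Z^t \rangle$ can be sketched in Python as follows. The function names, the 1-indexed type convention, and the response probabilities $(1 \pm \varepsilon/4)/2$ are assumptions of this sketch, which illustrates the data flow only and proves nothing about privacy or accuracy.

```python
import random


def sample_public_signs(num_types, rng):
    """Public vector Z_i: one uniform sign in {-1, +1} per type."""
    return [rng.choice((-1, 1)) for _ in range(num_types)]


def user_response(t_i, z_i, eps, rng):
    """Single bit b_i: equals Z_i^{t_i} w.p. (1 + eps/4)/2, else its negation."""
    sign = z_i[t_i - 1]              # types are 1-indexed, lists 0-indexed
    agree = rng.random() < (1.0 + eps / 4.0) / 2.0
    return sign if agree else -sign


def estimate_count(t, responses, public_signs, eps):
    """theta_t = (4 / eps) * <b, Z^t>, where Z^t collects coordinate t of each Z_i."""
    dot = sum(b * z[t - 1] for b, z in zip(responses, public_signs))
    return 4.0 * dot / eps
```

Users whose type is not $t$ contribute $b_i Z_i^t$ with an independent uniform sign $Z_i^t$, so their terms have mean zero; only the $n_t$ users of type $t$ contribute a bias of $\varepsilon/4$ each, which the factor $4/\varepsilon$ rescales.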
