CMPUT651: Differential Privacy

Homework assignment #2
Due date: Apr. 3rd, 2018

Discussion and the exchange of ideas are essential to doing academic work. For assignments in this course, you are encouraged to consult with your classmates as you work on problem sets. However, after discussions with peers, make sure that you can work through the problems yourself and ensure that any answers you submit for evaluation are the result of your own efforts. In addition, you must cite any books, articles, websites, lectures, etc. that have helped you with your work using appropriate citation practices. Similarly, you must list the names of students with whom you have collaborated on problem sets. You should solve all questions.

1 Warm-Up Questions

1. Based on the BLR-mechanism, give a differentially private mechanism for answering the set of all quantile-queries. Specifically, the universe $U$ consists of the $T$ integers ranging from $1$ to $T$. A quantile-query is characterized by a parameter $q \in [0, 1]$. Given a database $D \in \{1, 2, \ldots, T\}^n$ we denote its values in sorted order as $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$, and so we can answer the exact quantile query by outputting any value $v$ s.t. $x_{(j)} \le v$ for any $j = 1, 2, \ldots, qn$, and $x_{(j)} \ge v$ for any $j = qn, \ldots, n$.

Recall our definition from HW1: a mechanism $(\alpha, \beta)$-approximates quantile queries if for any $q$ the mechanism returns a value $v$ s.t. with probability $\ge 1 - \beta$ it holds that $x_{(j)} \le v$ for any $j = 1, 2, \ldots, (q - \alpha)n$ and $x_{(j)} \ge v$ for any $j = (q + \alpha)n, \ldots, n$.

Given $\alpha, \beta, \epsilon$, your goal is to give one $\epsilon$-differentially private mechanism that $(\alpha, \beta)$-approximates all quantile queries in time $O(n \log n) + T^{O(1/\alpha)}$, provided $n$ is sufficiently large. Specify the lower bound on $n$ as a function of $\epsilon, \alpha, \beta$.

Outline: Argue first that it suffices to look solely at the $\alpha$-quantiles $q \in \{0, \alpha, 2\alpha, 3\alpha, \ldots, 1 - \alpha, 1\}$ and make sure your mechanism $(\alpha/2, \beta)$-approximates all these queries. Then use a BLR-type idea to approximate this particular (small) set of quantile queries with a synthetic dataset of size $O(1/\alpha)$.

2. Show that in the $k$-fold composition of Gaussian mechanisms, the $\delta$s don't add up! Let $f : U^n \to \mathbb{R}$ be a real-valued function with $GS(f) = 1$. Let $M_1, M_2, \ldots, M_k$ be $k$ independent runs of the Gaussian mechanism that preserve $(\epsilon, \delta)$-differential privacy. That is, in each mechanism $M_j$ we output $z_j$, which is an independent sample from $N(f(D), \sigma^2)$ with $\sigma = \sqrt{2\ln(2/\delta)} / \epsilon$. Let $M$ be the mechanism that concatenates all $k$ outputs of $M_1, M_2, \ldots, M_k$. Show that $M$ preserves $(O(\epsilon\sqrt{k} + \epsilon^2 k), \delta)$-differential privacy.

You may use the following two facts about Gaussians without proof:

For a r.v. $X$ sampled from a standard Gaussian, $X \sim N(0, 1)$, it holds that $\Pr[|X| > \sqrt{2\ln(2/\delta)}] < \delta$.

For any two independent random variables $x_1 \sim N(\mu_1, \sigma_1^2)$, $x_2 \sim N(\mu_2, \sigma_2^2)$ it holds that $x_1 + x_2 \sim N(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$. (I.e., the sum of independent Gaussians behaves like a sample from a Gaussian centered around the sum of the means and with variance which is the sum of the variances.)
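To make the setup of the Gaussian composition question concrete, here is a minimal Python sketch of the $k$-fold mechanism it describes. The function names `gaussian_sigma` and `k_fold_gaussian` are illustrative and not from the assignment, and the sketch assumes $GS(f) = 1$ and the noise scale $\sigma = \sqrt{2\ln(2/\delta)}/\epsilon$ as stated in the question. It only generates the outputs; it does not prove the privacy claim.

```python
import math
import random

def gaussian_sigma(eps, delta):
    """Noise scale sqrt(2 ln(2/delta)) / eps, assuming sensitivity GS(f) = 1."""
    return math.sqrt(2.0 * math.log(2.0 / delta)) / eps

def k_fold_gaussian(true_value, eps, delta, k, rng=None):
    """Return the k concatenated outputs z_1..z_k, each drawn from N(f(D), sigma^2)."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(eps, delta)
    return [rng.gauss(true_value, sigma) for _ in range(k)]

if __name__ == "__main__":
    # Hypothetical example: f(D) = 10, eps = 0.5, delta = 1e-5, k = 8 runs.
    print(k_fold_gaussian(10.0, 0.5, 1e-5, 8, random.Random(0)))
```

Averaging the $k$ outputs is one way to see why the composition behaves like a single Gaussian release: by the second fact above, the sum of the noise terms is itself Gaussian with variance $k\sigma^2$.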

2 Questions

2.1 The Private Multiplicative Weights Algorithm

Complete the full details of the Private Multiplicative Weights algorithm from class.

1. Suppose there exists a non-negative hyperplane $x \in \mathbb{R}^d$ such that $\|x\|_1 = 1$ and a collection of examples $a_1, a_2, \ldots, a_T$ such that for each $j$, $\|a_j\|_\infty \le 1$. Fix $\gamma > 0$. Our goal is to give for all $a_j$ a numeric label $l_j \in [-1, 1]$ such that $|l_j - \langle a_j, x \rangle| \le \gamma$. Each time we fail to do so, we incur a mistake and we are also told if we overshot (i.e., $l_j > \langle a_j, x \rangle + \gamma$) or we undershot (i.e., $l_j < \langle a_j, x \rangle - \gamma$). Show that the Multiplicative Weights algorithm (with parameter $\gamma/2$ and suitably chosen cost vectors) makes at most $\frac{4\ln(d)}{\gamma^2}$ mistakes (and no more than the same number of updates).

Consider now the following version of the Private MW, where the dataset is represented as a probability distribution over $U$ (with $T$ elements). You are given $k$ queries where each query $q_j$ is in the form of a vector $v_j \in \mathbb{R}^T$ with $\|v_j\|_\infty \le 1$, whose answer is $\langle v_j, D \rangle$. So $q_j$ has sensitivity $1/n$.

procedure Private MW (Taking parameters: $\alpha$, $\sigma$.)
  Init: $t \leftarrow 0$, $D_0 = (\frac{1}{T}, \ldots, \frac{1}{T})$
  Sample $T_t \sim Lap(\sigma)$
  For each query $q_j$:
    (*) Sample $X_j, Y_j \sim Lap(\sigma)$ ind.
    if ($\langle v_j, D_t \rangle + X_j > \langle v_j, D \rangle + \frac{3}{4}\alpha + T_t$) then
      Update $D_t$ using MW with parameter $\alpha/2$ and cost $= v_j$ to get $D_{t+1}$.
      Increment($t$) and if ($t > \frac{16\ln(T)}{\alpha^2}$) abort
      Resample $T_t \sim Lap(\sigma)$
      Goto (*) (repeat the same loop until we answer $q_j$)
    else if ($\langle v_j, D_t \rangle + Y_j < \langle v_j, D \rangle - \frac{3}{4}\alpha + T_t$) then
      Update $D_t$ using MW with parameter $\alpha/2$ and cost $= -v_j$ to get $D_{t+1}$.
      Increment($t$) and if ($t > \frac{16\ln(T)}{\alpha^2}$) abort
      Resample $T_t \sim Lap(\sigma)$
      Goto (*) (repeat the same loop until we answer $q_j$)
    else answer $\langle v_j, D_t \rangle$

2. Given $0 < \beta < 1/e$, show that w.p. $\ge 1 - \beta$ all random variables in this algorithm (namely all $X_j$s, $Y_j$s and $T_t$s, a total of $3k$ random variables) are always upper bounded in magnitude by $\tau = 6\sigma\ln(k/\beta)$. Infer that if $\tau < \alpha/8$ then (a) all of our answers to the queries are always within a bound of $\alpha$, (b) updates of the MW algorithm are over examples where we overshot or undershot by at least $\alpha/2$, and thus (c) our algorithm doesn't abort.

3. Denote $c = \frac{16\ln(T)}{\alpha^2}$. Given $\epsilon$, set $\sigma$ so that the $c$-fold composition of the sparse vector over $2k$ queries of sensitivity $1/n$ is $\epsilon$-DP overall (using basic composition). Find the smallest $\alpha$ for which we have that $\tau \le \alpha/8$. You are free to omit constants and use solely asymptotic notation.

4. Denote $c = \frac{16\ln(T)}{\alpha^2}$. Given $\epsilon$, set $\sigma$ so that the $c$-fold composition of the sparse vector over $2k$ queries of sensitivity $1/n$ is $(\epsilon, \delta)$-DP overall (using advanced composition). Find the smallest $\alpha$ for which we have that $\tau < \alpha/8$. You are free to omit constants and use solely asymptotic notation.
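The Private MW procedure above can be sketched in Python as follows. This is a rough illustration under several assumptions, not the version from class: the MW step uses the exponential-weights form $D_{t+1}(i) \propto D_t(i)\, e^{-\eta \cdot cost_i}$, the two test thresholds are taken to be $\pm\frac{3}{4}\alpha$, the undershoot branch uses cost $-v_j$, and the update cap is $16\ln(T)/\alpha^2$. All function names (`laplace`, `mw_update`, `private_mw`) are hypothetical.

```python
import math
import random

def laplace(scale, rng):
    """Sample Lap(scale) as the scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mw_update(dist, cost, eta):
    """Exponential-weights step (an assumed form): shrink high-cost coordinates."""
    weights = [p * math.exp(-eta * c) for p, c in zip(dist, cost)]
    total = sum(weights)
    return [w / total for w in weights]

def private_mw(true_dist, queries, alpha, sigma, rng=None):
    """Answer each query vector v with <v, D_t>, updating D_t on noisy failures."""
    rng = rng or random.Random()
    T = len(true_dist)
    d = [1.0 / T] * T                      # D_0 is uniform over the universe
    t = 0
    max_updates = 16.0 * math.log(T) / alpha ** 2
    threshold_noise = laplace(sigma, rng)  # plays the role of T_t
    answers = []
    for v in queries:
        while True:                        # the "Goto (*)" loop
            x, y = laplace(sigma, rng), laplace(sigma, rng)
            est, truth = dot(v, d), dot(v, true_dist)
            if est + x > truth + 0.75 * alpha + threshold_noise:
                cost = v                   # overshot: penalize along v
            elif est + y < truth - 0.75 * alpha + threshold_noise:
                cost = [-c for c in v]     # undershot: penalize along -v
            else:
                answers.append(est)
                break
            d = mw_update(d, cost, alpha / 2.0)
            t += 1
            if t > max_updates:
                raise RuntimeError("abort: update budget exceeded")
            threshold_noise = laplace(sigma, rng)
    return answers
```

Setting `sigma = 0.0` removes all noise and gives a deterministic, non-private run, which is a convenient way to check the control flow before reasoning about $\tau$ and the privacy budget.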

3 Problems

3.1 Combinatorial Optimization and Approximation

Consider the problem of Max-k-Coverage, where an instance $I$ of the problem is composed of a ground set $U$ of $n$ elements, and $m$ subsets $S_j \subseteq U$. Our goal is to find a set $T \subseteq [m]$ of size $k$ so as to maximize

\[ \max_{T : |T| = k} \Big| \bigcup_{j \in T} S_j \Big| \]

where we denote the value of the optimal solution as $OPT$.

Here, we will consider algorithms for solving the Max-k-Coverage problem that are also $(\epsilon, \delta)$-differentially private. We call two instances $I$ and $I'$ neighbors if they are based on the same ground set $U$ and both have $m$ subsets, and there exists at most one element $e \in U$ such that for any $S_j \in I$ and $S'_j \in I'$ it holds that $S_j$ and $S'_j$ are either identical or differ only on $e$.

Often, we represent the problem with a bipartite graph, with $n$ nodes on the right (each representing an element in the ground set) and $m$ nodes on the left (each representing a subset). We put an edge from an element $e$'s node to a subset $S_j$'s node if $e \in S_j$. Our goal is therefore to find a set of $k$ nodes on the left that are adjacent to as many nodes on the right as possible. In that respect, $I$ and $I'$ are neighbors if there exists a right node $e$ such that the two bipartite graphs that $I$ and $I'$ induce differ only on the edges that are incident to $e$.

1. Give an $O(m^k)$-time algorithm that is $(\epsilon, \delta)$-differentially private and uses the Laplace mechanism to approximate $OPT$. What is its utility guarantee?

2. Give an $O(m^k)$-time algorithm that is $\epsilon$-differentially private and uses the exponential mechanism. What is its utility guarantee?

3. Whereas both previous algorithms have exponential dependency on $k$, here's a simple greedy algorithm that runs in time $O(mk)$ and gives a $(1 - 1/e)$-approximation for the Max-k-Coverage problem (without concern for privacy):
(i) Initialize $T_0 = \emptyset$.
(ii) For each time $t = 1, 2, 3, \ldots, k$ find the set $S_{j_t}$ that covers the most elements out of the set of elements that are not yet covered by $\bigcup_{j \in T_{t-1}} S_j$, and set $T_t \leftarrow T_{t-1} \cup \{j_t\}$.

To analyze this algorithm we denote $c_t = OPT - \big| \bigcup_{j \in T_t} S_j \big|$. One can show that the following ratio holds:

\[ c_t \le \Big(1 - \frac{1}{k}\Big)^t OPT \]

And so $c_k \le (1 - \frac{1}{k})^k\, OPT \le \frac{1}{e}\, OPT$, hence $\big| \bigcup_{j \in T_k} S_j \big| \ge (1 - \frac{1}{e})\, OPT$.

Give an $(\epsilon, \delta)$-differentially private version of the greedy algorithm, and analyze its utility. (Aim to optimize its utility.) That is, bound w.h.p. the gap between the private version of the greedy algorithm and the non-private version. Compare this gap to the gap you got in the previous item (question 2).
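For reference, the non-private greedy procedure in steps (i) and (ii) can be sketched in Python as below. The function name `greedy_max_k_coverage` and the representation of the instance as a list of Python sets are assumptions made for illustration, and the sketch assumes $k \le m$. It is the baseline only; it carries no privacy guarantee.

```python
def greedy_max_k_coverage(subsets, k):
    """Pick k subset indices greedily, each time maximizing newly covered elements.

    subsets: list of sets over the ground set; assumes k <= len(subsets).
    Returns (chosen indices in pick order, set of covered elements).
    """
    covered = set()
    chosen = []
    for _ in range(k):
        # Among unchosen subsets, take the one adding the most uncovered elements.
        best_j = max(
            (j for j in range(len(subsets)) if j not in chosen),
            key=lambda j: len(subsets[j] - covered),
        )
        chosen.append(best_j)
        covered |= subsets[best_j]
    return chosen, covered

if __name__ == "__main__":
    instance = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
    print(greedy_max_k_coverage(instance, 2))
```

The selection step (the `max` over marginal gains) is the natural place where a private variant would replace the exact argmax with a randomized choice.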

3.2 Randomized-Response for Many Types

In class, we showed the Randomized-Response mechanism in the case where there were only two possible types. We now extend it to a universe of size $T$, namely $U = \{1, 2, 3, \ldots, T\}$. In this question, we examine the cost of extending the universe from 2 types to $T$ types. However, we will maintain the main property of the randomized-response mechanism: we apply the same mechanism $M$ to each user independently, and so it runs in the local model, without a trusted authority.

Naïve Extension of the Randomized-Response to $T$ types. The straightforward way to extend the randomized-response mechanism to $T$ types is to use the following mechanism on each user, whose true type is denoted as $t$:

\[ \Pr[M(t) = t'] = \begin{cases} x + y, & \text{if } t' = t \\ x, & \text{for any } t' \neq t. \end{cases} \]

1. Naturally, the best utility is obtained when $x + y$ is as large as possible and $x$ is as small as possible. What are the best values of $x$ and $y$ we can set such that $M$ is $\epsilon$-DP? Assuming $\epsilon < 1$, show that the $y$ you derive is $\Theta(\epsilon/T)$.

2. What estimator $\theta_t$ will you use to estimate the number of people of type $t$ in the dataset?

3. Show that this mechanism has a standard deviation of $O(\sqrt{nT}/\epsilon)$.

The Mechanism of Bassily-Smith [STOC15]. A recent work has proposed a novel mechanism in the local model. The mechanism considers each person of type $t$ as a vector in a $T$-dimensional space which is of the form $(0, 0, \ldots, 0, 1, 0, \ldots, 0)$ (namely, all coordinates are 0 except for the $t$-th coordinate, which is 1). Each user then runs the Randomized-Response mechanism with parameter $p$ on each coordinate independently, where $p$ is set such that $\frac{1/2 + p}{1/2 - p} = e^{\epsilon/2}$. Thus, the signal the user sends is composed of $T$ bits, each chosen w.r.t. the $t$-th bit of the user's representative vector.

4. Prove that the above algorithm is $\epsilon$-DP.

5. Observe that like in the standard Randomized-Response, for each type $t$, if the dataset is composed of $n_t$ people of type $t$ then the expected number of users whose report has the $t$-th coordinate set to 1 is $(\frac{1}{2} + p)\, n_t + (\frac{1}{2} - p)(n - n_t)$. We thus use the same estimator $\theta_t$ from standard Randomized-Response to approximate $n_t$ (only for each $t$ we look at a different coordinate). Given $\beta > 0$, based on HW1.2.1, show that there exists a constant $c$ such that

\[ \Pr\Big[ \text{for all } t,\ |\theta_t - n_t| < \frac{c}{\epsilon} \sqrt{n \ln(T/\beta)} \Big] \ge 1 - \beta \]

reducing our dependency on $T$ to logarithmic.
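The per-coordinate mechanism described above can be sketched in Python as follows. This is an illustration under stated assumptions: $p$ is chosen so that $(1/2 + p)/(1/2 - p) = e^{\epsilon/2}$, types are indexed from 0 rather than 1, and the estimator simply inverts the expectation formula $(\frac{1}{2} + p)\, n_t + (\frac{1}{2} - p)(n - n_t)$. The function names are hypothetical.

```python
import math
import random

def flip_probability_offset(eps):
    """Solve (1/2 + p) / (1/2 - p) = exp(eps / 2) for p (assumed calibration)."""
    r = math.exp(eps / 2.0)
    return 0.5 * (r - 1.0) / (r + 1.0)

def unary_report(true_type, T, p, rng):
    """Randomize each bit of the one-hot vector: keep w.p. 1/2 + p, else flip."""
    bits = []
    for t in range(T):
        bit = 1 if t == true_type else 0
        if rng.random() >= 0.5 + p:
            bit = 1 - bit
        bits.append(bit)
    return bits

def estimate_counts_unary(reports, p):
    """Invert E[ones_t] = 2 p n_t + (1/2 - p) n to estimate each n_t."""
    n = len(reports)
    T = len(reports[0])
    ones = [sum(r[t] for r in reports) for t in range(T)]
    return [(c - (0.5 - p) * n) / (2.0 * p) for c in ones]

if __name__ == "__main__":
    rng = random.Random(0)
    p = flip_probability_offset(1.0)
    users = [0] * 600 + [1] * 300 + [2] * 100   # hypothetical true types
    reports = [unary_report(u, 3, p, rng) for u in users]
    print(estimate_counts_unary(reports, p))
```

Each user sends $T$ bits here, which is the communication cost that the next mechanism reduces to a single bit.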

The Mechanism of Bassily-Nissim-Stemmer-Thakurta [NIPS 2017]. A recent work has proposed a novel mechanism in the local model in which each user doesn't need to run the Randomized-Response $T$ times but rather just once (and thus sends just a single bit rather than $T$). The mechanism, per user $i$, works as follows:

i. We pick $T$ random variables $Z_i^1, \ldots, Z_i^T$, each chosen uniformly and independently among $\{-1, 1\}$. We publish $Z_i = (Z_i^1, \ldots, Z_i^T)$.

ii. User $i$, whose true type is $t_i$, looks solely at the $t_i$-th coordinate of $Z_i$, and then returns a bit $b_i \in \{-1, 1\}$ such that

\[ \Pr[b_i = Z_i^{t_i}] = \frac{1 + \epsilon/4}{2}, \qquad \Pr[b_i = -Z_i^{t_i}] = \frac{1 - \epsilon/4}{2}. \]

6. Show that this mechanism is $\epsilon$-DP.

7. We now show how to use this mechanism to approximate the counts $n_t \stackrel{\text{def}}{=}$ # users of type $t$. Denote $b$ as the $n$-dimensional vector of responses from all users, and $Z^t$ as the $n$-dimensional vector of the random bits chosen for all users had they been of type $t$. Show that for any type $t$ we have $E[\langle b, Z^t \rangle] = E[\sum_i b_i Z_i^t] = \frac{\epsilon}{4}\, n_t$, where the expectation is taken over both the choice of the variables $Z_i^t$ and the responses $b_i$.

8. We thus set our $t$-th estimator as $\theta_t = \frac{4}{\epsilon} \langle b, Z^t \rangle$. Show that there exists a constant $c > 0$ such that for any $\beta > 0$ we have that

\[ \Pr\Big[ \text{for all } t,\ |\theta_t - n_t| \le \frac{c}{\epsilon} \sqrt{n \ln(T/\beta)} \Big] \ge 1 - \beta \]
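The single-bit mechanism and its estimator $\theta_t = \frac{4}{\epsilon}\langle b, Z^t \rangle$ can be sketched in Python as below. The sketch assumes the response probabilities $\frac{1 + \epsilon/4}{2}$ and $\frac{1 - \epsilon/4}{2}$, uses 0-based type indices, and uses hypothetical function names; it illustrates the data flow only and is not a privacy proof.

```python
import random

def sample_public_signs(T, rng):
    """Public randomness Z_i for one user: T independent uniform +/-1 values."""
    return [rng.choice((-1, 1)) for _ in range(T)]

def respond(true_type, signs, eps, rng):
    """Return Z_i^{t_i} with probability (1 + eps/4) / 2, otherwise its negation."""
    z = signs[true_type]
    return z if rng.random() < (1.0 + eps / 4.0) / 2.0 else -z

def estimate_counts_one_bit(responses, all_signs, eps):
    """theta_t = (4 / eps) * sum_i b_i * Z_i^t for every type t."""
    T = len(all_signs[0])
    return [
        (4.0 / eps) * sum(b * z[t] for b, z in zip(responses, all_signs))
        for t in range(T)
    ]

if __name__ == "__main__":
    rng = random.Random(0)
    eps, T = 1.0, 3
    users = [0] * 6000 + [1] * 3000 + [2] * 1000   # hypothetical true types
    all_signs = [sample_public_signs(T, rng) for _ in users]
    responses = [respond(u, z, eps, rng) for u, z in zip(users, all_signs)]
    print(estimate_counts_one_bit(responses, all_signs, eps))
```

For users whose type differs from $t$, the response is independent of $Z_i^t$, so their contribution to $\langle b, Z^t \rangle$ has mean zero; only the $n_t$ users of type $t$ contribute the $\epsilon/4$ bias that the estimator rescales.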


More information

Name: Matriculation Number: Tutorial Group: A B C D E

Name: Matriculation Number: Tutorial Group: A B C D E Name: Matriculation Number: Tutorial Group: A B C D E Question: 1 (5 Points) 2 (6 Points) 3 (5 Points) 4 (5 Points) Total (21 points) Score: General instructions: The written test contains 4 questions

More information

Homework 1 Submission

Homework 1 Submission Homework Submission Sample Solution; Due Date: Thursday, May 4, :59 pm Directions: Your solutions should be typed and submitted as a single pdf on Gradescope by the due date. L A TEX is preferred but not

More information

The expansion of random regular graphs

The expansion of random regular graphs The expansion of random regular graphs David Ellis Introduction Our aim is now to show that for any d 3, almost all d-regular graphs on {1, 2,..., n} have edge-expansion ratio at least c d d (if nd is

More information

Heavy Hitters and the Structure of Local Privacy

Heavy Hitters and the Structure of Local Privacy Heavy Hitters and the Structure of Local Privacy Mark Bun Jelani Nelson Uri Stemmer November 3, 207 Abstract We present a new locally differentially private algorithm for the heavy hitters problem which

More information

Algorithms Reading Group Notes: Provable Bounds for Learning Deep Representations

Algorithms Reading Group Notes: Provable Bounds for Learning Deep Representations Algorithms Reading Group Notes: Provable Bounds for Learning Deep Representations Joshua R. Wang November 1, 2016 1 Model and Results Continuing from last week, we again examine provable algorithms for

More information

Sampling. Everything Data CompSci Spring 2014

Sampling. Everything Data CompSci Spring 2014 Sampling Everything Data CompSci 290.01 Spring 2014 2 Announcements (Thu. Mar 26) Homework #11 will be posted by noon tomorrow. 3 Outline Simple Random Sampling Means & Proportions Importance Sampling

More information

New Statistical Applications for Differential Privacy

New Statistical Applications for Differential Privacy New Statistical Applications for Differential Privacy Rob Hall 11/5/2012 Committee: Stephen Fienberg, Larry Wasserman, Alessandro Rinaldo, Adam Smith. rjhall@cs.cmu.edu http://www.cs.cmu.edu/~rjhall 1

More information

Lecture 20: Introduction to Differential Privacy

Lecture 20: Introduction to Differential Privacy 6.885: Advanced Topics in Data Processing Fall 2013 Stephen Tu Lecture 20: Introduction to Differential Privacy 1 Overview This lecture aims to provide a very broad introduction to the topic of differential

More information

HW 3: Heavy-tails and Small Worlds. 1 When monkeys type... [20 points, collaboration allowed]

HW 3: Heavy-tails and Small Worlds. 1 When monkeys type... [20 points, collaboration allowed] CS/EE/CMS 144 Assigned: 01/18/18 HW 3: Heavy-tails and Small Worlds Guru: Yu Su/Rachael Due: 01/25/18 by 10:30am We encourage you to discuss collaboration problems with others, but you need to write up

More information

FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016)

FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016) FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016) The final exam will be on Thursday, May 12, from 8:00 10:00 am, at our regular class location (CSI 2117). It will be closed-book and closed-notes, except

More information

Reading 10 : Asymptotic Analysis

Reading 10 : Asymptotic Analysis CS/Math 240: Introduction to Discrete Mathematics Fall 201 Instructor: Beck Hasti and Gautam Prakriya Reading 10 : Asymptotic Analysis In the last reading, we analyzed the running times of various algorithms.

More information

E190Q Lecture 10 Autonomous Robot Navigation

E190Q Lecture 10 Autonomous Robot Navigation E190Q Lecture 10 Autonomous Robot Navigation Instructor: Chris Clark Semester: Spring 2015 1 Figures courtesy of Siegwart & Nourbakhsh Kilobots 2 https://www.youtube.com/watch?v=2ialuwgafd0 Control Structures

More information

CS 161: Design and Analysis of Algorithms

CS 161: Design and Analysis of Algorithms CS 161: Design and Analysis of Algorithms Greedy Algorithms 3: Minimum Spanning Trees/Scheduling Disjoint Sets, continued Analysis of Kruskal s Algorithm Interval Scheduling Disjoint Sets, Continued Each

More information

CSC 373: Algorithm Design and Analysis Lecture 12

CSC 373: Algorithm Design and Analysis Lecture 12 CSC 373: Algorithm Design and Analysis Lecture 12 Allan Borodin February 4, 2013 1 / 16 Lecture 12: Announcements and Outline Announcements Term test 1 in tutorials. Need to use only two rooms due to sickness

More information

More Approximation Algorithms

More Approximation Algorithms CS 473: Algorithms, Spring 2018 More Approximation Algorithms Lecture 25 April 26, 2018 Most slides are courtesy Prof. Chekuri Ruta (UIUC) CS473 1 Spring 2018 1 / 28 Formal definition of approximation

More information

Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds

Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds Randomized Algorithms Lecture 5: The Principle of Deferred Decisions. Chernoff Bounds Sotiris Nikoletseas Associate Professor CEID - ETY Course 2013-2014 Sotiris Nikoletseas, Associate Professor Randomized

More information

1 Hoeffding s Inequality

1 Hoeffding s Inequality Proailistic Method: Hoeffding s Inequality and Differential Privacy Lecturer: Huert Chan Date: 27 May 22 Hoeffding s Inequality. Approximate Counting y Random Sampling Suppose there is a ag containing

More information

A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis

A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis Moritz Hardt Center for Computational Intractability Department of Computer Science Princeton University Email: mhardt@cs.princeton.edu

More information

Midterm: CS 6375 Spring 2015 Solutions

Midterm: CS 6375 Spring 2015 Solutions Midterm: CS 6375 Spring 2015 Solutions The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run out of room for an

More information

Algorithm Design and Analysis

Algorithm Design and Analysis Algorithm Design and Analysis LECTURE 7 Greedy Graph Algorithms Shortest paths Minimum Spanning Tree Sofya Raskhodnikova 9/14/016 S. Raskhodnikova; based on slides by E. Demaine, C. Leiserson, A. Smith,

More information

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet.

The exam is closed book, closed calculator, and closed notes except your one-page crib sheet. CS 188 Fall 2018 Introduction to Artificial Intelligence Practice Final You have approximately 2 hours 50 minutes. The exam is closed book, closed calculator, and closed notes except your one-page crib

More information

A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis

A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis Moritz Hardt Guy N. Rothblum Abstract We consider statistical data analysis in the interactive setting. In this setting a trusted

More information

Algorithms: Lecture 12. Chalmers University of Technology

Algorithms: Lecture 12. Chalmers University of Technology Algorithms: Lecture 1 Chalmers University of Technology Today s Topics Shortest Paths Network Flow Algorithms Shortest Path in a Graph Shortest Path Problem Shortest path network. Directed graph G = (V,

More information

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming

princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming princeton univ. F 17 cos 521: Advanced Algorithm Design Lecture 6: Provable Approximation via Linear Programming Lecturer: Matt Weinberg Scribe: Sanjeev Arora One of the running themes in this course is

More information

Introduction to Computer Science and Programming for Astronomers

Introduction to Computer Science and Programming for Astronomers Introduction to Computer Science and Programming for Astronomers Lecture 8. István Szapudi Institute for Astronomy University of Hawaii March 7, 2018 Outline Reminder 1 Reminder 2 3 4 Reminder We have

More information

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run

More information

6.045: Automata, Computability, and Complexity (GITCS) Class 17 Nancy Lynch

6.045: Automata, Computability, and Complexity (GITCS) Class 17 Nancy Lynch 6.045: Automata, Computability, and Complexity (GITCS) Class 17 Nancy Lynch Today Probabilistic Turing Machines and Probabilistic Time Complexity Classes Now add a new capability to standard TMs: random

More information

Enabling Accurate Analysis of Private Network Data

Enabling Accurate Analysis of Private Network Data Enabling Accurate Analysis of Private Network Data Michael Hay Joint work with Gerome Miklau, David Jensen, Chao Li, Don Towsley University of Massachusetts, Amherst Vibhor Rastogi, Dan Suciu University

More information

Homework 1 Solutions Probability, Maximum Likelihood Estimation (MLE), Bayes Rule, knn

Homework 1 Solutions Probability, Maximum Likelihood Estimation (MLE), Bayes Rule, knn Homework 1 Solutions Probability, Maximum Likelihood Estimation (MLE), Bayes Rule, knn CMU 10-701: Machine Learning (Fall 2016) https://piazza.com/class/is95mzbrvpn63d OUT: September 13th DUE: September

More information

Topics in Theoretical Computer Science April 08, Lecture 8

Topics in Theoretical Computer Science April 08, Lecture 8 Topics in Theoretical Computer Science April 08, 204 Lecture 8 Lecturer: Ola Svensson Scribes: David Leydier and Samuel Grütter Introduction In this lecture we will introduce Linear Programming. It was

More information

UC Berkeley Department of Electrical Engineering and Computer Science Department of Statistics. EECS 281A / STAT 241A Statistical Learning Theory

UC Berkeley Department of Electrical Engineering and Computer Science Department of Statistics. EECS 281A / STAT 241A Statistical Learning Theory UC Berkeley Department of Electrical Engineering and Computer Science Department of Statistics EECS 281A / STAT 241A Statistical Learning Theory Solutions to Problem Set 2 Fall 2011 Issued: Wednesday,

More information

8.1 Concentration inequality for Gaussian random matrix (cont d)

8.1 Concentration inequality for Gaussian random matrix (cont d) MGMT 69: Topics in High-dimensional Data Analysis Falll 26 Lecture 8: Spectral clustering and Laplacian matrices Lecturer: Jiaming Xu Scribe: Hyun-Ju Oh and Taotao He, October 4, 26 Outline Concentration

More information

Scribes: Po-Hsuan Wei, William Kuzmaul Editor: Kevin Wu Date: October 18, 2016

Scribes: Po-Hsuan Wei, William Kuzmaul Editor: Kevin Wu Date: October 18, 2016 CS 267 Lecture 7 Graph Spanners Scribes: Po-Hsuan Wei, William Kuzmaul Editor: Kevin Wu Date: October 18, 2016 1 Graph Spanners Our goal is to compress information about distances in a graph by looking

More information