Database Privacy: k-anonymity and de-anonymization attacks
1 18734: Foundations of Privacy Database Privacy: k-anonymity and de-anonymization attacks Piotr Mardziel or Anupam Datta CMU Fall 2018
2 Publicly Released Large Datasets } Useful for improving recommendation systems, collaborative research } Contain personal information } Mechanisms to protect privacy, e.g. anonymization by removing names } Yet, private information leaked by attacks on anonymization mechanisms 2
3 Non-Interactive Linking } Background/auxiliary information + DB1 + DB2 } Algorithm to link information } De-identified record re-identified
4 Roadmap } Motivation } Privacy definitions } Netflix-IMDb attack } Theoretical analysis } Empirical verification of assumptions } Conclusion 4
5 Sanitization of Databases } Real database (health records, census data) → add noise, delete names, etc. → sanitized database } Goals: protect privacy and provide useful information (utility)
6 Database Privacy } Releasing sanitized databases 1. k-anonymity [Samarati 2001; Sweeney 2002] 2. Differential privacy [Dwork et al. 2006] (future lecture) 6
7 Re-identification by Linking } Linking two sets of data on shared attributes may uniquely identify some individuals: 87% of the US population is uniquely identifiable by 5-digit ZIP, gender, and DOB
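A minimal sketch of such a linkage attack, joining a "de-identified" table with a public list on the quasi-identifiers (ZIP, gender, DOB). All names, values, and field names here are illustrative, not taken from any real dataset.

```python
# Toy linkage attack: join a "de-identified" table with a public list
# on quasi-identifiers (ZIP, gender, DOB). All records are made up.
medical = [  # names removed, but quasi-identifiers retained
    {"zip": "02138", "gender": "F", "dob": "1945-07-21", "diagnosis": "flu"},
    {"zip": "02139", "gender": "M", "dob": "1971-03-04", "diagnosis": "asthma"},
]
voters = [  # public records with names
    {"name": "Alice", "zip": "02138", "gender": "F", "dob": "1945-07-21"},
    {"name": "Bob",   "zip": "02145", "gender": "M", "dob": "1980-01-01"},
]

def link(anon, public, keys=("zip", "gender", "dob")):
    """Return (name, record) pairs where the quasi-identifiers match uniquely."""
    index = {}
    for p in public:
        index.setdefault(tuple(p[k] for k in keys), []).append(p["name"])
    matches = []
    for rec in anon:
        names = index.get(tuple(rec[k] for k in keys), [])
        if len(names) == 1:          # unique match => re-identified
            matches.append((names[0], rec))
    return matches

print(link(medical, voters))  # Alice's diagnosis is re-identified
```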
8 K-anonymity } Quasi-identifier: Set of attributes that can be linked with external data to uniquely identify individuals } Make every record in the table indistinguishable from at least k-1 other records with respect to quasi-identifiers } Linking on quasi-identifiers yields at least k records for each possible value of the quasi-identifier 8
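The definition above can be checked mechanically: a table is k-anonymous with respect to a quasi-identifier set if every combination of quasi-identifier values occurs at least k times. A hedged sketch, with illustrative generalized values:

```python
from collections import Counter

def is_k_anonymous(table, quasi_ids, k):
    """True iff every quasi-identifier value combination appears in >= k records."""
    counts = Counter(tuple(row[a] for a in quasi_ids) for row in table)
    return all(c >= k for c in counts.values())

# Generalized table: ZIP truncated, age bucketed (illustrative values).
rows = [
    {"zip": "021**", "age": "40-49", "disease": "flu"},
    {"zip": "021**", "age": "40-49", "disease": "cancer"},
    {"zip": "023**", "age": "20-29", "disease": "flu"},
    {"zip": "023**", "age": "20-29", "disease": "flu"},
]
print(is_k_anonymous(rows, ("zip", "age"), 2))  # True
print(is_k_anonymous(rows, ("zip", "age"), 3))  # False
```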
9 K-anonymity and Beyond } Provides some protection: linking on ZIP, age, nationality yields 4 records } Limitations: lack of diversity in sensitive attributes, background knowledge, subsequent releases on the same data set } Refinements: l-diversity, m-invariance, t-closeness, ...
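The "lack of diversity" limitation (the homogeneity attack) has an equally mechanical check: l-diversity requires every equivalence class to contain at least l distinct sensitive values. A sketch with illustrative data — the table below is 2-anonymous yet leaks the disease of everyone in the second class:

```python
from collections import defaultdict

def is_l_diverse(table, quasi_ids, sensitive, l):
    """True iff every equivalence class (group sharing quasi-identifier
    values) contains at least l distinct sensitive values."""
    classes = defaultdict(set)
    for row in table:
        classes[tuple(row[a] for a in quasi_ids)].add(row[sensitive])
    return all(len(vals) >= l for vals in classes.values())

# 2-anonymous table that still leaks: the "023**" class is homogeneous.
rows = [
    {"zip": "021**", "age": "40-49", "disease": "flu"},
    {"zip": "021**", "age": "40-49", "disease": "cancer"},
    {"zip": "023**", "age": "20-29", "disease": "flu"},
    {"zip": "023**", "age": "20-29", "disease": "flu"},
]
print(is_l_diverse(rows, ("zip", "age"), "disease", 2))  # False
```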
10 Re-identification Attacks in Practice } Examples: Netflix-IMDb, MovieLens attack, Twitter-Flickr, recommendation systems (Amazon, Hunch, ...) } Goal of de-anonymization: to find information about a record in the released dataset
11 Roadmap } Motivation } Privacy definitions } Netflix-IMDb attack } Theoretical analysis } Empirical verification of assumptions } Conclusion 11
12 Anonymization Mechanism } Delete name identifiers and add noise } Each row corresponds to an individual (Bob, Alice, Charlie); each column corresponds to an attribute, e.g. a movie (Gladiator, Titanic, Heidi) } In the anonymized Netflix DB, names are replaced by record identifiers r1, r2, r3
13 De-anonymization Attacks Still Possible } Isolation attacks } Recover an individual's record from the anonymized database } E.g., find a user's record in the anonymized Netflix movie database } Information amplification attacks } Find more information about an individual in the anonymized database } E.g., find a user's ratings for a specific movie in the Netflix database
14 Netflix-IMDb Empirical Attack [Narayanan et al. 2008] } Anonymized Netflix DB: records r1, r2, r3 with ratings for Gladiator, Titanic, Heidi } Publicly available IMDb ratings (noisy), e.g. Bob rated Titanic 2 and Heidi 1, used as auxiliary information } Weighted scoring algorithm → isolation attack!
15 Problem Statement } Anonymized database (records r1, r2, r3; movies Gladiator, Titanic, Heidi) } Noisy auxiliary information about a record, e.g. Bob rated Titanic 2 and Heidi 1 } Attacker's goal: use an algorithm to find Bob's record r1, or a record similar to it } Aim: enhance theoretical understanding of why empirical de-anonymization attacks work
16 Research Goal Characterize classes of auxiliary information and properties of database for which re-identification is possible 16
17 Roadmap } Motivation } Privacy definitions } Netflix-IMDb attack } Theoretical analysis } Empirical verification of assumptions } Conclusion 17
18 Netflix-IMDb Empirical Attack [Narayanan et al. 2008] } Anonymized Netflix DB; publicly available IMDb ratings (noisy) used as auxiliary information; weighted scoring algorithm } How do you measure similarity of an IMDb record with Bob's record? (similarity metric) } What does auxiliary information about a record mean?
19 Definition: Asymmetric Similarity Metric } Individual attribute similarity: T(y(i), r(i)) = 1 - |y(i) - r(i)| / p(i), where p(i) is the range of attribute i. Intuition: measures how closely two people's ratings match on one movie. Example: y(Gladiator) = 5, r(Gladiator) = 0 gives T = 1 - 5/5 = 0; on Titanic, T(y(i), r(i)) = 0.6; on Heidi, 0 } Similarity metric: S(y, r) = Σ_{i ∈ supp(y)} T(y(i), r(i)) / |supp(y)|, where supp(y) is the set of non-null attributes in y. Intuition: measures how closely two people's ratings match overall. Example: S(y, r) = 0.6/2 = 0.3
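The two definitions can be sketched directly; the ratings and movie names below are illustrative, chosen to reproduce the slide's worked numbers (T = 0 on Gladiator, 0.6 on Titanic, S = 0.3):

```python
def attr_sim(yi, ri, p):
    """T(y(i), r(i)) = 1 - |y(i) - r(i)| / p(i); p(i) = range of attribute i."""
    return 1 - abs(yi - ri) / p

def record_sim(y, r, p):
    """S(y, r): average of T over supp(y), the non-null attributes of y."""
    supp = [i for i, v in y.items() if v is not None]
    return sum(attr_sim(y[i], r[i], p[i]) for i in supp) / len(supp)

# Illustrative ratings with range p(i) = 5 for every movie.
p = {"Gladiator": 5, "Titanic": 5}
y = {"Gladiator": 5, "Titanic": 3}
r = {"Gladiator": 0, "Titanic": 1}
print(attr_sim(y["Gladiator"], r["Gladiator"], 5))  # 0.0, as on the slide
print(record_sim(y, r, p))                          # 0.3, as on the slide
```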
20 Definition: Auxiliary Information } Intuition: aux about y should be a subset of record y; aux can be noisy (e.g. Netflix record y → sample → perturb → aux) } aux captures information available outside the normal data release process, e.g. IMDb } Bound the level of perturbation in aux by γ ∈ [0,1] } (m, γ)-perturbed auxiliary information: ∀ i ∈ supp(aux). T(y(i), aux(i)) ≥ 1 - γ, where |supp(aux)| = m is the number of non-null attributes in aux
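The (m, γ)-perturbation condition is a per-attribute check on T; a sketch, with Bob's "true" and "IMDb" ratings invented for illustration:

```python
def is_m_gamma_perturbed(aux, y, gamma, p):
    """Check the definition: for every i in supp(aux),
    T(y(i), aux(i)) >= 1 - gamma. Returns (holds, m)."""
    supp = [i for i, v in aux.items() if v is not None]
    holds = all(1 - abs(y[i] - aux[i]) / p[i] >= 1 - gamma for i in supp)
    return holds, len(supp)

# Bob's true Netflix ratings vs. his noisy IMDb ratings (illustrative).
p = {"Titanic": 5, "Heidi": 5}
y = {"Titanic": 3, "Heidi": 2}
aux = {"Titanic": 2, "Heidi": 1}              # each rating off by 1
print(is_m_gamma_perturbed(aux, y, 0.2, p))   # (True, 2): T = 0.8 >= 1 - 0.2
print(is_m_gamma_perturbed(aux, y, 0.1, p))   # (False, 2)
```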
21 Weighted Scoring [Narayanan et al. 2008; Frankowski et al. 2006] } Intuition: the fewer the people who watched a movie, the rarer it is } Weight of attribute i: w(i) = 1 / log(|supp(i)|), where |supp(i)| is the number of non-null entries in column i; weight serves as an indicator of rarity } Score(aux, r_j) = Σ_{i ∈ supp(aux)} w(i) · T(aux(i), r_j(i)) / |supp(aux)|, where |supp(aux)| = m is the number of non-null attributes in aux } Score gives a weighted average of how closely two people match on every movie, giving higher weight to rare movies } Compute the Score for every record r in the anonymized DB to find which one is closest to the target record y
22 Weighted Scoring Algorithm [Narayanan et al. 2008] } Compute Score(aux, r_j) for every r in D } One of the records r in the anonymized database is y; which row is it? } Eccentricity measure: e(aux, D) = max_{r ∈ D} Score(aux, r) - max2_{r ∈ D} Score(aux, r), the gap between the highest and second-highest scores } If eccentricity > threshold, output the record with the max Score } Score(aux, r) is used to predict S(y, r)
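The algorithm above can be sketched end to end. This is not the authors' implementation; the toy database, support counts, and threshold are assumptions chosen so that the rare movie (Heidi) dominates the score:

```python
import math

def weight(col_support):
    """w(i) = 1 / log(|supp(i)|): rarer movies get larger weight."""
    return 1 / math.log(col_support)

def score(aux, rec, col_support, p):
    """Score(aux, r) = sum over supp(aux) of w(i) * T(aux(i), r(i)),
    divided by |supp(aux)|."""
    supp = [i for i, v in aux.items() if v is not None]
    total = sum(weight(col_support[i]) * (1 - abs(aux[i] - rec[i]) / p[i])
                for i in supp)
    return total / len(supp)

def isolate(aux, db, col_support, p, threshold):
    """Output the top-scoring record only if the eccentricity (gap between
    the best and second-best score) exceeds the threshold."""
    ranked = sorted(((score(aux, r, col_support, p), idx)
                     for idx, r in enumerate(db)), reverse=True)
    eccentricity = ranked[0][0] - ranked[1][0]
    return db[ranked[0][1]] if eccentricity > threshold else None

# Toy database: Heidi is rare (3 viewers), so matching on it counts most.
p = {"Gladiator": 5, "Titanic": 5, "Heidi": 5}
col_support = {"Gladiator": 100, "Titanic": 1000, "Heidi": 3}
db = [
    {"Gladiator": 5, "Titanic": 4, "Heidi": 5},   # the target record y
    {"Gladiator": 5, "Titanic": 4, "Heidi": 1},
    {"Gladiator": 1, "Titanic": 2, "Heidi": 3},
]
aux = {"Gladiator": None, "Titanic": 4, "Heidi": 5}   # noisy subset of y
print(isolate(aux, db, col_support, p, 0.1))          # recovers db[0]
```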
23 Where Do the Theorems Fit? } Computed: Score of all records r in D against aux } Desired: guarantee about similarity S(y, r) } The theorems bridge the gap
24 Theorems } Theorem 1: When do isolation attacks work? } Theorem 2: Why do information amplification attacks work?
25 Theorem 1: When Do Isolation Attacks Work? } Intuition: if eccentricity is high, the algorithm always finds the record corresponding to the auxiliary information } If aux is (m, γ)-perturbed and the eccentricity threshold > γM, then Score(aux, r) = Score(aux, y) } Eccentricity: highest score minus second-highest score; γ: indicator of perturbation in aux; M: average of weights in aux; r: record output by the algorithm; y: target record } If r is the only record with the highest score, then r = y
26 Isolation Attack: Theorem } A. Datta, D. Sharma and A. Sinha. Provable De-anonymization of Large Datasets with Sparse Dimensions. In Proceedings of the ETAPS First Conference on Principles of Security and Trust (POST 2012)
27 Theorems } Theorem 1: When do isolation attacks work? } Theorem 2: Why do information amplification attacks work?
28 Intuition: Why Do Information Amplification Attacks Work? } If two records agree on rare attributes, then with high probability they agree on other attributes too } Use this intuition to find a record r similar to aux on many rare attributes (using aux as a proxy for y)
29 Intuition: Why Do Information Amplification Attacks Work? } If a high fraction of the attributes in aux are rare, then any record r that is similar to aux is also similar to y } Example from the slide: for > 90% of records, similarity > 0.75
30 Theorem 2: Why Do Information Amplification Attacks Work? } Define a function f_D(η1, η2, η3) measuring overall similarity between the target record y and r, which depends on: η1, the fraction of rare attributes in aux; η2, a lower bound on the similarity between r and aux; η3, the fraction of target records for which the guarantee holds } If a high fraction of the attributes in aux are rare, then any record r similar to aux is similar to y: S(y, r) ≥ f_D(η1, η2, η3)
31 Theorem 2: Why Do Information Amplification Attacks Work? } Using the function f_D(η1, η2, η3) and the bound S(y, r) ≥ f_D(η1, η2, η3), the theorem gives a guarantee about the similarity of the record output by the algorithm to the target record
32 Roadmap } Motivation } Privacy definitions } Netflix-IMDb attack } Theoretical analysis } Empirical verification of assumptions } Conclusion 32
33 Empirical Verification } Use the `anonymized' Netflix database with 480,189 users and 17,770 movies } Percentage values claimed in our results = percentage of records not filtered out because of } insufficient attributes required to form aux, OR } insufficient rare or non-rare attributes required to form aux
34 Do the Assumptions Hold over the Netflix Database? } Plot: % of records for which the Theorem 1 assumptions hold, against the perturbation measure γ, for m = 10 and m = 20 (m: number of attributes in aux) } Averaged over a sample of records chosen with replacement
35 Does the Intuition about f_D Hold for the Netflix Database? } f_D(η1, η2, η3) can be evaluated given D } Plot: f(η1, η2, 0.9) against η1 (fraction of rare attributes in aux) and η2 (minimum value of S(aux, r)); η3 is the fraction of target records for which the guarantee S(y, r) ≥ f_D(η1, η2, η3) holds } For the Netflix DB, f_D(η1, η2, 0.9) is monotonically increasing in η1 and η2, and tends to 1 as η2 increases to 1 } Intuition verified
36 Roadmap } Motivation } Privacy definitions } Netflix-IMDb attack } Theoretical analysis } Empirical verification of assumptions } Conclusion 36
37 Conclusion } Naïve anonymization mechanisms do not work } We obtain provable bounds about, and verify empirically, why some de-anonymization attacks work in practice } Even perturbed auxiliary information can be used to launch de-anonymization attacks if: } Database has many rare dimensions and } Auxiliary information has information about these rare dimensions 37
38 Summary } Anonymity via sanitization } Offline sanitization } Online sanitization (next lecture) } Privacy definitions } k-anonymity } l-diversity }... 38
39 Summary } Deanonymization attacks } Isolation } Amplification } Measuring attack success without ground truth } Assumptions } (m, γ)-perturbation } Measurables } similarity } eccentricity } η1, η2, η3
40 Deanonymization } Ground truth Y (record y) } Sanitization process → sanitized R (record r) } Modeling assumption → auxiliary Aux (record aux)
41 Isolation Attack } Ground truth Y (record y) } Sanitization process → sanitized R (record r) } Modeling assumption: (m, γ)-perturbation → auxiliary Aux (record aux) } The isolation attack links aux back to a record r in R
42 Amplification Attack } Ground truth Y (record y) } Sanitization process → sanitized R: actual identity r*, assumed identity r } Modeling assumption: (m, γ)-perturbation → auxiliary aux } The amplification attack uses r to predict the values of attributes missing from aux } η1: fraction of rare attributes in aux; η2: minimal similarity between r and aux; η3: fraction of records satisfying η1, η2
43 Anonymization Settings and Privacy Definitions } Settings: offline/non-interactive (release sanitized dataset) vs. online/interactive (sanitize queries) } Definitions: k-anonymity (minimum anonymity set size) vs. l-diversity (minimum sensitive range size)
44 Modeling } Y: ground truth records (NOT KNOWN); R: sanitized records = Obf(Y); Aux: auxiliary records } Assumptions and experimental measurements } (m, γ)-perturbation: ground-truth/aux relationship } Measurements: eccentricity e (best isolated r vs. second-best r); η1: fraction of rare attributes in aux; η2: minimal similarity between r and aux; η3: fraction of records satisfying η1, η2 } Deanonymization attacks } Given aux in Aux, isolate the r in R closest to it } Isolation: link auxiliary aux in Aux to r in R; is aux the same identity as the ground-truth y behind r? } Amplification: use R to find values of fields not in aux; are the predicted values close to the ground-truth y? } Success estimation } Theorem 1 (isolation success): (m, γ) and eccentricity → successful isolation } Theorem 2 (amplification success): η1, η2, η3 → the isolated r is close to the ground-truth y
More informationStatistical Privacy For Privacy Preserving Information Sharing
Statistical Privacy For Privacy Preserving Information Sharing Johannes Gehrke Cornell University http://www.cs.cornell.edu/johannes Joint work with: Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh
More informationarxiv: v4 [cs.db] 1 Sep 2017
Composing Differential Privacy and Secure Computation: A case study on scaling private record linkage arxiv:70.00535v4 [cs.db] Sep 07 ABSTRACT Xi He Duke University hexi88@cs.duke.edu Cheryl Flynn AT&T
More informationCS249: ADVANCED DATA MINING
CS249: ADVANCED DATA MINING Recommender Systems Instructor: Yizhou Sun yzsun@cs.ucla.edu May 17, 2017 Methods Learnt: Last Lecture Classification Clustering Vector Data Text Data Recommender System Decision
More informationDifferential Privacy: An Economic Method for Choosing Epsilon
Differential Privacy: An Economic Method for Choosing Epsilon Justin Hsu Marco Gaboardi Andreas Haeberlen Sanjeev Khanna Arjun Narayan Benjamin C. Pierce Aaron Roth University of Pennsylvania, USA University
More informationSpectral k-support Norm Regularization
Spectral k-support Norm Regularization Andrew McDonald Department of Computer Science, UCL (Joint work with Massimiliano Pontil and Dimitris Stamos) 25 March, 2015 1 / 19 Problem: Matrix Completion Goal:
More informationCommunity Detection. fundamental limits & efficient algorithms. Laurent Massoulié, Inria
Community Detection fundamental limits & efficient algorithms Laurent Massoulié, Inria Community Detection From graph of node-to-node interactions, identify groups of similar nodes Example: Graph of US
More informationRecommender Systems EE448, Big Data Mining, Lecture 10. Weinan Zhang Shanghai Jiao Tong University
2018 EE448, Big Data Mining, Lecture 10 Recommender Systems Weinan Zhang Shanghai Jiao Tong University http://wnzhang.net http://wnzhang.net/teaching/ee448/index.html Content of This Course Overview of
More informationCS120, Quantum Cryptography, Fall 2016
CS10, Quantum Cryptography, Fall 016 Homework # due: 10:9AM, October 18th, 016 Ground rules: Your homework should be submitted to the marked bins that will be by Annenberg 41. Please format your solutions
More informationIdentification, Data Combination and the Risk of Disclosure 1 Tatiana Komarova, 2 Denis Nekipelov, 3 Evgeny Yakovlev. 4
Identification, Data Combination and the Risk of Disclosure 1 Tatiana Komarova, 2 Denis Nekipelov, 3 Evgeny Yakovlev. 4 This version: September 24, 2012 ABSTRACT Data combination is routinely used by researchers
More informationk-points-of-interest Low-Complexity Privacy-Preserving k-pois Search Scheme by Dividing and Aggregating POI-Table
Computer Security Symposium 2014 22-24 October 2014 k-points-of-interest 223-8522 3-14-1 utsunomiya@sasase.ics.keio.ac.jp POIs Points of Interest Lien POI POI POI POI Low-Complexity Privacy-Preserving
More informationLecture 10. Public Key Cryptography: Encryption + Signatures. Identification
Lecture 10 Public Key Cryptography: Encryption + Signatures 1 Identification Public key cryptography can be also used for IDENTIFICATION Identification is an interactive protocol whereby one party: prover
More informationDifferentially Private Password Frequency Lists
Differentially Private Password Frequency Lists Or, How to release statistics from 70 million passwords on purpose Jeremiah Blocki Microsoft Research Email: jblocki@microsoftcom Anupam Datta Carnegie Mellon
More informationExtremal Mechanisms for Local Differential Privacy
Extremal Mechanisms for Local Differential Privacy Peter Kairouz Sewoong Oh 2 Pramod Viswanath Department of Electrical & Computer Engineering 2 Department of Industrial & Enterprise Systems Engineering
More informationCSCI 5520: Foundations of Data Privacy Lecture 5 The Chinese University of Hong Kong, Spring February 2015
CSCI 5520: Foundations of Data Privacy Lecture 5 The Chinese University of Hong Kong, Spring 2015 3 February 2015 The interactive online mechanism that we discussed in the last two lectures allows efficient
More informationCS246 Final Exam, Winter 2011
CS246 Final Exam, Winter 2011 1. Your name and student ID. Name:... Student ID:... 2. I agree to comply with Stanford Honor Code. Signature:... 3. There should be 17 numbered pages in this exam (including
More informationPrivacy Preserving Neighbourhood-based Collaborative Filtering Recommendation Algorithms
Privacy Preserving Neighbourhood-based Collaborative Filtering Recommendation Algorithms Zhigang Lu School of Computer Science The University of Adelaide A thesis submitted for the degree of Master of
More informationVIDEO Intypedia008en LESSON 8: SECRET SHARING PROTOCOL. AUTHOR: Luis Hernández Encinas. Spanish Scientific Research Council in Madrid, Spain
VIDEO Intypedia008en LESSON 8: SECRET SHARING PROTOCOL AUTHOR: Luis Hernández Encinas Spanish Scientific Research Council in Madrid, Spain Hello and welcome to Intypedia. We have learned many things about
More informationDAN FELDMAN, UNIVERSITY OF HAIFA
PRIVATE CORE-SETS FOR FACE IDENTIFICATION DAN FELDMAN, UNIVERSITY OF HAIFA A JOINT WORK WITH Rita Osadchy: SOME SLIDES FROM: KOBBI NISSIM [STOC 11] PRIVATE CORESETS [OAKLAND 10] - SECURE COMPUTATION OF
More informationLocal Differential Privacy
Local Differential Privacy Peter Kairouz Department of Electrical & Computer Engineering University of Illinois at Urbana-Champaign Joint work with Sewoong Oh (UIUC) and Pramod Viswanath (UIUC) / 33 Wireless
More informationECash and Anonymous Credentials
ECash and Anonymous Credentials CS/ECE 598MAN: Applied Cryptography Nikita Borisov November 9, 2009 1 E-cash Chaum s E-cash Offline E-cash 2 Anonymous Credentials e-cash-based Credentials Brands Credentials
More informationUtility-Privacy Tradeoff in Databases: An Information-theoretic Approach
1 Utility-Privacy Tradeoff in Databases: An Information-theoretic Approach Lalitha Sankar, Member, IEEE, S. Raj Rajagopalan, and H. Vincent Poor, Fellow, IEEE arxiv:1102.3751v4 [cs.it] 21 Jan 2013 Abstract
More informationInteract with Strangers
Interact with Strangers RATE: Recommendation-aware Trust Evaluation in Online Social Networks Wenjun Jiang 1, 2, Jie Wu 2, and Guojun Wang 1 1. School of Information Science and Engineering, Central South
More informationBounded Privacy: Formalising the Trade-Off Between Privacy and Quality of Service
H. Langweg, H. Langweg, M. Meier, M. Meier, B.C. B.C. Witt, Witt, D. Reinhardt D. Reinhardt et al. (Hrsg.): Sicherheit 2018, Lecture Notes in ininformatics (LNI), Gesellschaft für fürinformatik, Bonn 2018
More information