Differentially Private Real-time Data Release over Infinite Trajectory Streams
|
|
- Louise Atkins
- 5 years ago
- Views:
Transcription
1 Differentially Private Real-time Data Release over Infinite Trajectory Streams Kyoto University, Japan Department of Social Informatics Yang Cao, Masatoshi Yoshikawa 1
2 Outline Motivation: opportunity & privacy risk Problem definition and analysis Proposed solution Experiment results Conclusion & Future work 2
3 Motivation: Opportunity A great opportunity to utilize personal real-life data Easy to collect Life log data, along with people s trajectory. <uid,time,loc, > Trajectory streams are consisting of many people s trajectories Statistics of trajectory streams are useful e.g., Q count {how many people at Pittsburgh station now?} A trajectory: time sequenced locations 3
4 Motivation: Opportunity A great opportunity to utilize personal real-life data Easy to collect Life log data, along with people s trajectory. <uid,time,loc, > Trajectory streams are consisting of many people s trajectories Statistics of trajectory streams are useful e.g., Count: How many people at Pittsburgh station now? A trajectory: time sequenced locations e.g., Count: How many people at Pittsburgh station with Heart Rate>100 now? Health-aware Navigation System Marketing Analysis Intelligent Transportation System Leverage statistics of trajectory streams to data-based innovations! 4
5 Motivation: Privacy Risk Publish Statistics (of personal data) is risky 5
6 Motivation: Privacy Risk Publish Statistics (of personal data) is risky E.g., Release the data of Q1: COUNT(Sex=Female)=A. Q2: COUNT(Sex=Female OR What if B=A+1? Name Age Sex Dis. u1 40 M HIV u2 30 M - (Age=40 & Sex=Male & Employer= u1 )) = B 6
7 Motivation: Privacy Risk Publish Statistics (of personal data) is risky E.g., Release the data of Q1: COUNT(Sex=Female)=A. Q2: COUNT(Sex=Female OR If B=A+1 (Age=40 & Sex=Male & Employer= u1 )) = B Q3: COUNT(Sex=Female OR (Age=42 & Sex=Male & Employer= u1 ) & Diagnosis= HIV ) = C C = 1 or 0 Name Age Sex Dis. u1 40 M HIV u2 30 M - 7
8 Motivation: Privacy Risk Publish Statistics (of personal data) is risky E.g., Release the data of Q1: COUNT(Sex=Female)=A. Q2: COUNT(Sex=Female OR If B=A+1 (Age=40 & Sex=Male & Employer= u1 )) = B Q3: COUNT(Sex=Female OR (Age=42 & Sex=Male & Employer= u1 ) & Diagnosis= HIV ) = C C = 1 or 0 Positively or negatively compromised! Name Age Sex Dis. u1 40 M HIV u2 30 M - 8
9 Motivation: Privacy Risk Publish Statistics (of personal data) is risky Linkage attack on anonymized data [1][2] Released database An attacker s database [1] L. Sweeney, Simple demographics often identify people uniquely, Health (San Francisco), [2] C. Dwork, A Firm Foundation for Private Data Analysis, Commun. ACM, Jan
10 Motivation: Privacy Risk Publish Statistics (of personal data) is risky Linkage attack [1][2] Personal Trajectory data is highly sensitive! Four spatiotemporal data points can identify 95% of the individuals [3] Goal: to publish statistics of trajectory streams by a Privacy Preserving Data Publishing (PPDP) method, for open data utilizing untrusted cloud services data mining outsourcing [1] L. Sweeney, Simple demographics often identify people uniquely, Health (San Francisco), [2] C. Dwork, A Firm Foundation for Private Data Analysis, Commun. ACM, Jan [3]Y.-A. de Montjoye et al, Unique in the Crowd: The privacy bounds of human mobility, Sci. Rep.,Mar
11 Our contributions A rigorous and flexible PPDP framework over infinite trajectory streams The first definition of personalized privacy model for spatiotemporal data The protection is based on ε-differential Privacy. rigorous Designed algorithms to publish counts in real-time. E.g., Counts: How many people at Pittsburgh station now? Published data utility is better than the previous results. flexible real-time trajectory data users privacy preferences Privacy Model & PPDP algorithm sensitive data raw statistics in real-time noisy data publishable! Private Data 11
12 Our contributions A rigorous and flexible PPDP framework over infinite trajectory streams The first definition of personalized privacy model for spatiotemporal data The protection is based on ε-differential Privacy. rigorousness Designed algorithms to publish counts in real-time. E.g., Counts: How many people at Pittsburgh station now Personlized privacy & Published data utility is better. flexibility real-time trajectory data users privacy preferences Privacy Model & PPDP algorithm sensitive data raw statistics in real-time noisy data publishable! Private Data 12
13 Outline Motivation: opportunity & privacy risk Problem definition and analysis Proposed solution Experiment result Conclusion & Future work 13
14 Problem Definition PPDP over infinite trajectory streams Data collection process as follows: Trusted Server uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (b) trajectory representation of raw data safely publish risk locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics (a) raw data 14
15 Problem Definition PPDP over infinite trajectory streams How to transform (c) to safe version (c), while keeping it similar to (c) as much as possible? Trusted Server safely publish uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (b) trajectory representation of raw data risk locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics 15
16 Problem Definition PPDP over infinite trajectory streams How to transform (c) to safe version (c), while keeping it similar to (c) as much as possible? Ad-hoc methods CANNOT provide a reliable privacy guarantee e.g., method of deleting the values of 1 It is hard to model the attacker s background knowledge in this big data locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk Ad-hoc PPDP algorithm locs t1 t2 t3 t4 t5 park office bar gym (c) safe(?) statistics 16
17 Differential Privacy: A rigorous privacy definition ε-differential Privacy (ε-dp) [4] As a de facto privacy standard for statistical data publishing. Randomized algorithm A achieve ε-dp, if it satisfies Pr[A(Q(D))] Pr[A(Q(D*))] eε, ε > 0 D*:database except any one individual s data then A achieves ε-dp ; ε is a given positive parameter. D Q( ) D* Q( ) ε: Privacy budget unitary privacy level control. Robust under attacker with arbitrary background knowl. Laplace Mechanism & Exponential Mechanism can be used as sub-procedure of A [4] C. Dwork, et al., Calibrating Noise to Sensitivity in Private Data Analysis, TCC [5] F. McSherry and K. Talwar, Mechanism Design via Differential Privacy, FOCS,
18 Differential Privacy: A rigorous privacy definition ε-differential Privacy (ε-dp) [4] As a de facto privacy standard for statistical data publishing. Randomized algorithm A achieve ε-dp, if it satisfies Pr[A(Q(D))] Pr[A(Q(D*))] eε, ε > 0 D*:database except any one individual s data then A achieves ε-dp ; ε is a given positive parameter. D A( Q( )) D* A( Q( )) ε: Privacy budget unitary privacy level control. Robust under attacker with arbitrary background knowl. Laplace Mechanism & Exponential Mechanism can be used as sub-procedure of A [4] C. Dwork, et al., Calibrating Noise to Sensitivity in Private Data Analysis, TCC [5] F. McSherry and K. Talwar, Mechanism Design via Differential Privacy, FOCS, Less noise More High Data utility Low Low Privacy Level High ε 0 18
19 Differential Privacy: A rigorous privacy definition ε-differential Privacy (ε-dp) [4] As a de facto privacy standard for statistical data publishing. Randomized algorithm A achieve ε-dp, if it satisfies Pr[A(Q(D))] Pr[A(Q(D*))] eε, ε > 0 D*:database except any one individual s data then A achieves ε-dp ; ε is a given positive parameter. D A( Q( )) D* A( Q( )) ε: Privacy budget unitary privacy level control. Robust under Linkage attack. Laplace Mechanism [4] & Exponential Mechanism [5] Less noise High Data utility can be used as sub-procedure of A [4] C. Dwork, et al., Calibrating Noise to Sensitivity in Private Data Analysis, TCC [5] F. McSherry and K. Talwar, Mechanism Design via Differential Privacy, FOCS, More Low Low Privacy Level High ε 0 19
20 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! 20
21 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! 21
22 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! Two naive methods: (1) to protect data of each one timestamp. (2) to protect each user s data of all timestamps. However, it is not safe that protecting only 1 data point uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data ε ε ε ε ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (b) trajectory representation of raw data risk locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics 22
23 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! Two naive methods: (1) to protect data of each one timestamp. (2) to protect each user s data of all timestamps. However, it is unrealistic for infinite trajectory streams uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data ε ε ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (b) trajectory representation of raw data risk locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics 23
24 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! Two naive methods: (1) to protect data of each one timestamp. (2) to protect each user s data of all timestamps. Using Laplace Mechanism [4] as sub-procedure: uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data ε ε ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (b) trajectory representation of raw data risk locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics [4] C. Dwork, et al., Calibrating Noise to Sensitivity in Private Data Analysis, TCC
25 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! Two naive methods: (1) to protect data of each one timestamp. (2) to protect each user s data of all timestamps. Using Laplace Mechanism [4] as sub-procedure: risk locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics adding special Laplace noise to counts PPDP algo. A locs t1 t2 t3 t4 t5 park office bar gym (c) ε-dp statistics [4] C. Dwork, et al., Calibrating Noise to Sensitivity in Private Data Analysis, TCC
26 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! Two naive methods: (1) to protect data of each one timestamp. (2) to protect each user s data of all timestamps. Our observation: In real-life, individuals may have different requirements on privacy. ε ε ε ε ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (1) protecting any one patiotemporal data point ε ε ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (2) protecting all length of trajectories 26
27 Problem Analysis PPDP over infinite trajectory streams How to apply ε-dp to PPDP of infinite trajectory streams? depends on what we want to protect! Two naive methods: (1) to protect data of each one timestamp. (2) to protect each user s data of all timestamps. Our observation: In real-life, individuals may have different requirements on privacy. 2-trajectory 3-trajectory ll1 ll2 ll3 privacy preference of user i overall privacy : ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (our method) to protect any ll-trajectory ll successive spatio-temporal data points 27
28 Related work Differential Privacy on finite streams (1) to protect data of each one timestamp. event-level privacy [8] (2) to protect each user s data of all timestamps. user-level privacy [8] FAST [7]: Laplace + Kalman filter (predict/correct the noisy data) Differential Privacy on infinite streams w-event privacy [6] [6]G. Kellaris et al, Differentially Private Event Sequences over Infinite Streams, VLDB 14. [7]L. Fan et al, FAST: Differentially Private Real-time Aggregate Monitor with Filtering and Adaptive Sampling, SIGMOD 13. [8] C. Dwork, Differential Privacy in New Settings., SODA, pp ,
29 Related work Differential Privacy on finite streams (1) to protect data of each one timestamp. event-level privacy [8] (2) to protect each user s data of all timestamps. user-level privacy [8] FAST [7]: Laplace + Kalman filter (predict/correct the noisy data) Differential Privacy on infinite streams w-event privacy [6] ε ε ε ε ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (1) event-level privacy ε ε ε t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar (2) user-level privacy [7]L. Fan et al, FAST: Differentially Private Real-time Aggregate Monitor with Filtering and Adaptive Sampling, SIGMOD 13. [8] C. Dwork, Differential Privacy in New Settings., SODA, pp ,
30 Related work Differential Privacy on finite streams (1) to protect data of each one timestamp. event-level privacy [8] (2) to protect each user s data of all timestamps. user-level privacy [8] FAST [7]: Laplace + Kalman filter (predict/correct the noisy data) Differential Privacy on infinite streams w-event privacy [6] t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar ll-trajectory privacy(ll=3) (this study) t1 t2 t3 t4 t5 u1 park bar u2 bar park u3 park office gym bar w-event privacy (w=3) [6] [6]G. Kellaris et al, Differentially Private Event Sequences over Infinite Streams, VLDB
31 Outline Motivation: opportunity & privacy risk Problem definition and analysis Proposed solution Experiment result Conclusion & Future work 31
32 Overview of our solution A rigorous and flexible privacy model: ll-trajectory privacy Challenge: proof of how to achieve it ll1 ll2 ll3 PPDP algorithm to publish counts in real-time. u 1 u 2 u 3 Challenge: real-time & infinite streams design a Greedy Algorithm (GA) to dynamically add noise at each timestamp; Challenge: to much noise re-publish the noisy data with Minimum Manhattan Distance (MMD) to the current data ll-trajectory PPDP algorithm privacy model t1 t2 t3 t4 t5 pa ba rk r ba pa r rk pa rk offi ce gy ba m r (b) Infinite trajectories risk loc t t t t t par offi bar gy (c) raw statistics Laplace Mechanism re-publish strategy loc t t t t t par offi bar gy (c) l-trajectory private data 32
33 Overview of our solution A rigorous and flexible privacy model: ll-trajectory privacy Challenge: proof of how to achieve it by DP mechanism ll1 ll2 ll3 PPDP algorithm to publish counts in real-time. u 1 u 2 u 3 Challenge: real-time & infinite streams design a Greedy Algorithm (GA) to dynamically add noise at each timestamp; Challenge: to much noise re-publish the noisy data with Minimum Manhattan Distance (MMD) to the current data ll-trajectory PPDP algorithm privacy model t1 t2 t3 t4 t5 pa ba rk r ba pa r rk pa rk offi ce gy ba m r (b) Infinite trajectories risk loc t t t t t par offi bar gy (c) raw statistics Laplace Mechanism re-publish strategy loc t t t t t par offi bar gy (c) l-trajectory private data 33
34 Overview of our solution A rigorous and flexible privacy model: ll-trajectory privacy Challenge: proof of how to achieve it by DP mechanism ll1 ll2 ll3 PPDP algorithm to publish counts in real-time. u 1 u 2 u 3 Challenge: real-time & infinite streams design a Greedy Algorithm (GA) to dynamically add noise at each timestamp; Challenge: to much noise re-publish the noisy data with Minimum Manhattan Distance (MMD) to the current data ll-trajectory PPDP algorithm privacy model t1 t2 t3 t4 t5 pa ba rk r ba pa r rk pa rk offi ce gy ba m r (b) Infinite trajectories risk loc t t t t t par offi bar gy (c) raw statistics Laplace Mechanism GA re-publish strategy loc t t t t t par offi bar gy (c) l-trajectory private data 34
35 Overview of our solution A rigorous and flexible privacy model: ll-trajectory privacy Challenge: proof of how to achieve it by DP mechanism ll1 ll2 ll3 PPDP algorithm to publish counts in real-time. u 1 u 2 u 3 Challenge: real-time & infinite streams design a Greedy Algorithm (GA) to dynamically add noise at each timestamp; Challenge: to much noise re-publish the noisy data with Minimum Manhattan Distance (MMD) to the current data ll-trajectory PPDP algorithm privacy model t1 t2 t3 t4 t5 pa ba rk r ba pa r rk pa rk offi ce gy ba m r (b) Infinite trajectories risk loc t t t t t par offi bar gy (c) raw statistics Laplace Mechanism GA re-publish strategy MMD loc t t t t t par offi bar gy (c) l-trajectory private data 35
36 Overview of our solution A rigorous and flexible privacy model: ll-trajectory privacy Challenge: proof of how to achieve it by DP mechanism ll1 ll2 ll3 PPDP algorithm to publish counts in real-time. u 1 u 2 u 3 Challenge: real-time & infinite streams design a Greedy Algorithm (GA) to dynamically add noise at each timestamp; Challenge: to much noise re-publish the noisy data with Minimum Manhattan Distance (MMD) to the current data ll-trajectory PPDP algorithm privacy model t1 t2 t3 t4 t5 pa ba rk r ba pa r rk pa rk offi ce gy ba m r (b) Infinite trajectories risk loc t t t t t par offi bar gy (c) raw statistics Laplace Mechanism GA re-publish strategy MMD loc t t t t t par offi bar gy (c) l-trajectory private data 36
37 How to achieve ll-trajectory privacy We proved that how to achieve it by conventional DP mechanisms (e.g., Laplace/Exponential mechanism) ε i is privacy budget variables at each ti the sum of ε i at timestamp i of any ll-trajectory should be less than ε uid tim loc. u1 t1 e park t1 t2 t3 t4 t5 u3 t1 park u2 t2 bar u1 park bar ll1 u3 t2 office u2 bar park ll2 u3 t3 gym u1 t5 bar u3 park office gym bar ll3 u2 t5 park u3 t5 bar (a) raw data privacy budget variables: ε1 ε2 ε3 ε4 ε5 (b) trajectory representation of raw data locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk 37
38 How to achieve ll-trajectory privacy We proved that how to achieve it by conventional DP mechanisms (e.g., Laplace/Exponential mechanism) ε i is privacy budget variables at each ti the sum of ε i at timestamp i of any ll-trajectory should be less than ε ε1+ε5 ε ε1+ε2+ε3 ε uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data privacy budget variables: ε1 ε2 ε3 ε4 ε5 t1 t2 t3 t4 t5 u1 park bar ll1 u2 bar park ll2 u3 park office gym bar ll3 (b) trajectory representation of raw data ε2+ε5 ε ε2+ε3+ε5 ε locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk 38
39 uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data How to achieve ll-trajectory privacy We proved that how to achieve it by conventional DP mechanisms (e.g., Laplace/Exponential mechanism) If we know all data in advance: ε i is privacy budget variables at each ti using Linear Programming the sum of ε i at timestamp i of any Maximize: ll-trajectory should be less than ε ε1+ε5 ε ε1+ε2+ε3 ε privacy budget variables: ε2+ε5 ε ε1 ε2 ε3 ε4 ε5 ε2+ε3+ε5 ε t1 t2 t3 t4 t5 u1 park bar ll1 u2 bar park ll2 u3 park office gym bar ll3 (b) trajectory representation of raw data approximate Maximum Utility locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk 39
40 How to achieve ll-trajectory privacy We proved that how to achieve it by conventional DP mechanisms (e.g., Laplace/Exponential mechanism) If we know all data in advance: privacy budget variables ε However, we need: i at each ti using Real-time Linear publishing Programming & the sum of ε i at timestamp i of any for Maximize: ll-trajectory Infinite streams should be less than ε ε1+ε5 ε ε1+ε2+ε3 ε privacy budget variables: ε2+ε5 ε ε1 ε2 ε3 ε4 ε5 ε2+ε3+ε5 ε uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data t1 t2 t3 t4 t5 u1 park bar ll1 u2 bar park ll2 u3 park office gym bar ll3 (b) trajectory representation of raw data locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk 40
41 How to achieve ll-trajectory privacy We proved that how to achieve it by conventional DP mechanisms (e.g., Laplace/Exponential mechanism) privacy budget variables ε i at each ti the sum of ε i at timestamp i of any ll-trajectory should be less than ε ε1+ε5 ε? ε1+ε2+ε3 ε privacy budget variables: ε2+ε5 ε? ε1 ε2 ε3 ε4 ε5 ε2+ε3+ε5 ε? uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data t1 t2 t3 t4 t5 u1 park bar ll1 u2 bar park ll2 u3 park office gym bar ll3 (b) trajectory representation of raw data locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk 41
42 PPDP algorithm: GA GA: Greedy Algorithm to get approximately optimal ε i by incomplete information uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data privacy budget variables: ε1 ε2 ε3 ε4 ε5 t1 t2 t3 t4 t5 u1 park bar ll1 u2 bar park ll2 u3 park office gym bar ll3 (b) trajectory representation of raw data ε1+ε5 ε? ε1+ε2+ε3 ε ε2+ε5 ε? ε2+ε3+ε5 ε? locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk 42
43 PPDP algorithm: GA GA: Greedy Algorithm to get approximately optimal ε i by incomplete information idea: exponential decay (1) set? =0, then compute ε3=w (2) use w/2 as the value of ε3 (reserve w/2 for? ) uid tim loc. u1 t1 e park u3 t1 park u2 t2 bar u3 t2 office u3 t3 gym u1 t5 bar u2 t5 park u3 t5 bar (a) raw data privacy budget variables: ε1 ε2 ε3 ε4 ε5 t1 t2 t3 t4 t5 u1 park bar ll1 u2 bar park ll2 u3 park office gym bar ll3 (b) trajectory representation of raw data ε1+ε5 ε? ε1+ε2+ε3 ε ε2+ε5 ε? ε2+ε3+ε5 ε? locs t1 t2 t3 t4 t5 park office bar gym (c) raw statistics risk 43
44 PPDP algorithm: MMD Republishing strategy to improve data utility of counts idea: in real-life data, the data has periodically repeated pattern re-publish Adjacent noisy (Adj) data (has been studied in [6]) re-publish noisy data who is holding Minimum Manhattan Distance (MMD) to real data of current timestamp (using Exponential Mechanism[5]) published noisy data n1 ~ nt-1 the current real data rt counts time= nmmd or nadj [5] F. McSherry and K. Talwar, Mechanism Design via Differential Privacy, FOCS, [6]G. Kellaris et al, Differentially Private Event Sequences over Infinite Streams, VLDB
45 Personalized privacy control Proposed framework dynamic budget allocation (GA) Private Approximation strategy PPDP framework collected database at timestamp t ε 1 ε t-1 us 1 us t n 1 n t-1 r t GA Dynamic Budget Allocation set of uids us t real statistics r t ε t,1 ε t,2 D t Private Approximation Strategy Private Publishing stream Adj or MMD Private publishing ε t private statistics n t capable of publishing diverse statistical data over infinite trajectory streams Adj / MMD is optimized for counts publishing 45
46 Outline Motivation: opportunity & privacy risk Problem definition and analysis Proposed solution Experiment result Future work 46
47 Experiments Four real-life trajectory datasets Desc. PeopleFlow 1 Geolife 2 T-Drive 3 WorldCup98 4 people moving; people moving; diverse type of mobility; sparse;beijing taxis moving; Beijing webpage click streams Locs. Amt ,000 users Amt. 11, , ,762 TimeStamps Amt. 1,694 1, Interval of TS 5 mins 1 min 10 mins 1 hour Length of TS 6 days 24 hours* ~7 days ~35 days Datapoints Amt. 102, ,990 37,255 1,258, *.Aggregating 50,176 hours timestamps to 24 hours by omitting the date. 47
48 Experiments Budget Allocation Approximation Strategy realtime Utility Evaluation Uniform uniformly X O bad LP Approximately global optimised X X normal GA+Adj dynamically republish Adj O better GA+MMD dynamically republish MMD O Best FAST fix [7] uniformly _ O better,not stable [7]L. Fan et al, FAST: Differentially Private Real-time Aggregate Monitor with Filtering and Adaptive Sampling, SIGMOD
49 Experiments Real data v.s. Noisy data(ll=20,ε=1) real data GA+MMD FAST Uniform PeopleFlow counts real data GA+MMD FAST [7] timestamp
50 Experiments Metrics Mean of Absolute Error (MAE) MAE(R, N) = 1 T * locs T locs i=1 j=1 r i [ j] n i [ j] All of them are the lower, the better Mean of Square Error (MSE) sensitivity to large error MSE(R, N) = 1 T * locs T locs i=1 j=1 r i [ j] n i [ j] 2 KL-divergence similarity of two distribution (the lower the more similar ) D KL (R N) = 1 T T locs! ln r i[ j] * $ # & r " n i [ j] * i [ j] * % i=1 j=1 50
51 Experiments MAE/MSE by varying ll (ε=1) PeopleFlow Geolife T-Drive WorldCup98 1E+03 1E+02 UNIFORM FAST GA+Adj GA+MMD 1E+03 1E+02 UNIFORM FAST GA+Adj GA+MMD 1E+03 1E+02 UNIFORM FAST GA+Adj GA+MMD 1E+03 1E+02 UNIFORM FAST GA+Adj GA+MMD MAE 1E+01 1E+01 1E+01 1E+01 1E+00 1E E E ll ll ll ll PeopleFlow Geolife T-Drive WorldCup98 1E+06 1E+05 Uniform GA+Adj FAST GA+MMD 1E+06 1E+05 Uniform GA+Adj FAST GA+MMD 1E+06 1E+05 Uniform GA+Adj FAST GA+MMD 1E+06 1E+05 Uniform GA+Adj FAST GA+MMD MSE 1E+04 1E+03 1E+04 1E+03 1E+04 1E+03 1E+04 1E+03 1E+02 1E+02 1E+02 1E+02 1E+01 1E ll 1E ll 1E ll ll 51
52 Experiments MAE/MSE by varying ε (ll=20) PeopleFlow Geolife T-Drive WorldCup98 1E+06 1E+05 1E+04 UNIFORM FAST GA+Adj GA+MMD 1E+06 1E+04 UNIFORM FAST GA+Adj GA+MMD 1E+06 1E+04 UNIFORM FAST GA+Adj GA+MMD 1E+06 1E+04 UNIFORM FAST GA+Adj GA+MMD MAE 1E+03 1E+02 1E+02 1E+02 1E+02 1E+01 1E+00 1E+00 1E+00 1E E E E E ε ε ε ε PeopleFlow Geolife T-Drive WorldCup98 1E+12 1E+09 UNIFORM FAST GA+Adj GA+MMD 1E+12 1E+08 UNIFORM FAST GA+Adj GA+MMD 1E+12 1E+09 UNIFORM FAST GA+Adj GA+MMD 1E+12 1E+09 UNIFORM FAST GA+Adj GA+MMD MSE 1E+06 1E+06 1E+06 1E+03 1E+04 1E+03 1E+03 1E+00 1E+00 1E+00 1E E E E E ε ε ε ε 52
53 Outline Motivation: opportunity & privacy risk Problem definition and analysis Proposed solution Experiment result Conclusion & Future work 53
54 Conclusion Proposed a rigorous and flexible privacy model for spatio-temporal data: l-trajectory privacy; Personal Data application: PPDM Privacy Models business model: privacy as services/money PPDP algorithms PPDM algorithms PPDP Framework for spatio-temporal data; safe OpenData safe MiningResult Algorithms GA+MMD for publishing private counts with high utility in real-time. 54
55 Future Work Improve algorithm GA+MMD for private counts the counts should be non-negative & integer More flexible privacy models ll-trajectory privacy Location-based privacy model PPDP/ PPDM for mining personal spatio-temporal data 55
56 Thank you! Any question? 56
Modeling Data Correlations in Private Data Mining with Markov Model and Markov Networks. Yang Cao Emory University
Modeling Data Correlations in Private Data Mining with Markov Model and Markov Networks Yang Cao Emory University 207..5 Outline Data Mining with Differential Privacy (DP) Scenario: Spatiotemporal Data
More informationThe Optimal Mechanism in Differential Privacy
The Optimal Mechanism in Differential Privacy Quan Geng Advisor: Prof. Pramod Viswanath 11/07/2013 PhD Final Exam of Quan Geng, ECE, UIUC 1 Outline Background on Differential Privacy ε-differential Privacy:
More informationMaryam Shoaran Alex Thomo Jens Weber. University of Victoria, Canada
Maryam Shoaran Alex Thomo Jens Weber University of Victoria, Canada Introduction Challenge: Evidence of Participation Sample Aggregates Zero-Knowledge Privacy Analysis of Utility of ZKP Conclusions 12/17/2015
More informationA Framework for Protec/ng Worker Loca/on Privacy in Spa/al Crowdsourcing
A Framework for Protec/ng Worker Loca/on Privacy in Spa/al Crowdsourcing Nov 12 2014 Hien To, Gabriel Ghinita, Cyrus Shahabi VLDB 2014 1 Mo/va/on Ubiquity of mobile users 6.5 billion mobile subscrip/ons,
More informationThe Optimal Mechanism in Differential Privacy
The Optimal Mechanism in Differential Privacy Quan Geng Advisor: Prof. Pramod Viswanath 3/29/2013 PhD Prelimary Exam of Quan Geng, ECE, UIUC 1 Outline Background on differential privacy Problem formulation
More informationEnabling Accurate Analysis of Private Network Data
Enabling Accurate Analysis of Private Network Data Michael Hay Joint work with Gerome Miklau, David Jensen, Chao Li, Don Towsley University of Massachusetts, Amherst Vibhor Rastogi, Dan Suciu University
More informationCMPUT651: Differential Privacy
CMPUT65: Differential Privacy Homework assignment # 2 Due date: Apr. 3rd, 208 Discussion and the exchange of ideas are essential to doing academic work. For assignments in this course, you are encouraged
More informationHuman resource data location privacy protection method based on prefix characteristics
Acta Technica 62 No. 1B/2017, 437 446 c 2017 Institute of Thermomechanics CAS, v.v.i. Human resource data location privacy protection method based on prefix characteristics Yulong Qi 1, 2, Enyi Zhou 1
More informationLecture 11- Differential Privacy
6.889 New Developments in Cryptography May 3, 2011 Lecture 11- Differential Privacy Lecturer: Salil Vadhan Scribes: Alan Deckelbaum and Emily Shen 1 Introduction In class today (and the next two lectures)
More informationDifferentially Private Event Sequences over Infinite Streams
Differentially Private Event Sequences over Infinite Streams Georgios Kellaris 1 Stavros Papadopoulos 2 Xiaokui Xiao 3 Dimitris Papadias 1 1 HKUST 2 Intel Labs / MIT 3 Nanyang Technological University
More informationPufferfish Privacy Mechanisms for Correlated Data. Shuang Song, Yizhen Wang, Kamalika Chaudhuri University of California, San Diego
Pufferfish Privacy Mechanisms for Correlated Data Shuang Song, Yizhen Wang, Kamalika Chaudhuri University of California, San Diego Sensitive Data with Correlation } Data from social networks, } eg. Spreading
More informationDatabase Privacy: k-anonymity and de-anonymization attacks
18734: Foundations of Privacy Database Privacy: k-anonymity and de-anonymization attacks Piotr Mardziel or Anupam Datta CMU Fall 2018 Publicly Released Large Datasets } Useful for improving recommendation
More informationRényi Differential Privacy
Rényi Differential Privacy Ilya Mironov Google Brain DIMACS, October 24, 2017 Relaxation of Differential Privacy (ε, δ)-differential Privacy: The distribution of the output M(D) on database D is (nearly)
More informationLocally Differentially Private Protocols for Frequency Estimation. Tianhao Wang, Jeremiah Blocki, Ninghui Li, Somesh Jha
Locally Differentially Private Protocols for Frequency Estimation Tianhao Wang, Jeremiah Blocki, Ninghui Li, Somesh Jha Differential Privacy Differential Privacy Classical setting Differential Privacy
More informationPrivacy of Numeric Queries Via Simple Value Perturbation. The Laplace Mechanism
Privacy of Numeric Queries Via Simple Value Perturbation The Laplace Mechanism Differential Privacy A Basic Model Let X represent an abstract data universe and D be a multi-set of elements from X. i.e.
More informationReport on Differential Privacy
Report on Differential Privacy Lembit Valgma Supervised by Vesal Vojdani December 19, 2017 1 Introduction Over the past decade the collection and analysis of personal data has increased a lot. This has
More informationPersonalized Social Recommendations Accurate or Private
Personalized Social Recommendations Accurate or Private Presented by: Lurye Jenny Paper by: Ashwin Machanavajjhala, Aleksandra Korolova, Atish Das Sarma Outline Introduction Motivation The model General
More informationDifferential Privacy and its Application in Aggregation
Differential Privacy and its Application in Aggregation Part 1 Differential Privacy presenter: Le Chen Nanyang Technological University lechen0213@gmail.com October 5, 2013 Introduction Outline Introduction
More informationOn Node-differentially Private Algorithms for Graph Statistics
On Node-differentially Private Algorithms for Graph Statistics Om Dipakbhai Thakkar August 26, 2015 Abstract In this report, we start by surveying three papers on node differential privacy. First, we look
More informationCalibrating Noise to Sensitivity in Private Data Analysis
Calibrating Noise to Sensitivity in Private Data Analysis Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith Mamadou H. Diallo 1 Overview Issues: preserving privacy in statistical databases Assumption:
More informationDifferential Privacy: a short tutorial. Presenter: WANG Yuxiang
Differential Privacy: a short tutorial Presenter: WANG Yuxiang Some slides/materials extracted from Aaron Roth s Lecture at CMU The Algorithmic Foundations of Data Privacy [Website] Cynthia Dwork s FOCS
More informationPrivacy in Statistical Databases
Privacy in Statistical Databases Individuals x 1 x 2 x n Server/agency ) answers. A queries Users Government, researchers, businesses or) Malicious adversary What information can be released? Two conflicting
More informationWhat Can We Learn Privately?
What Can We Learn Privately? Sofya Raskhodnikova Penn State University Joint work with Shiva Kasiviswanathan Homin Lee Kobbi Nissim Adam Smith Los Alamos UT Austin Ben Gurion Penn State To appear in SICOMP
More informationAccuracy First: Selecting a Differential Privacy Level for Accuracy-Constrained Empirical Risk Minimization
Accuracy First: Selecting a Differential Privacy Level for Accuracy-Constrained Empirical Risk Minimization Katrina Ligett HUJI & Caltech joint with Seth Neel, Aaron Roth, Bo Waggoner, Steven Wu NIPS 2017
More informationInformation-theoretic foundations of differential privacy
Information-theoretic foundations of differential privacy Darakhshan J. Mir Rutgers University, Piscataway NJ 08854, USA, mir@cs.rutgers.edu Abstract. We examine the information-theoretic foundations of
More informationLecture 20: Introduction to Differential Privacy
6.885: Advanced Topics in Data Processing Fall 2013 Stephen Tu Lecture 20: Introduction to Differential Privacy 1 Overview This lecture aims to provide a very broad introduction to the topic of differential
More informationDifferentially Private Linear Regression
Differentially Private Linear Regression Christian Baehr August 5, 2017 Your abstract. Abstract 1 Introduction My research involved testing and implementing tools into the Harvard Privacy Tools Project
More informationPoS(CENet2017)018. Privacy Preserving SVM with Different Kernel Functions for Multi-Classification Datasets. Speaker 2
Privacy Preserving SVM with Different Kernel Functions for Multi-Classification Datasets 1 Shaanxi Normal University, Xi'an, China E-mail: lizekun@snnu.edu.cn Shuyu Li Shaanxi Normal University, Xi'an,
More informationDifferentially Private Publication of Location Entropy
Differentially Private Publication of Location Entropy Hien To hto@usc.edu Kien Nguyen kien.nguyen@usc.edu yrus Shahabi shahabi@usc.edu Department of omputer Science, University of Southern alifornia Los
More informationSemantic Security: Privacy Definitions Revisited
185 198 Semantic Security: Privacy Definitions Revisited Jinfei Liu, Li Xiong, Jun Luo Department of Mathematics & Computer Science, Emory University, Atlanta, USA Huawei Noah s Ark Laboratory, HongKong
More informationDifferential Privacy and Pan-Private Algorithms. Cynthia Dwork, Microsoft Research
Differential Privacy and Pan-Private Algorithms Cynthia Dwork, Microsoft Research A Dream? C? Original Database Sanitization Very Vague AndVery Ambitious Census, medical, educational, financial data, commuting
More informationDifferentially Private ANOVA Testing
Differentially Private ANOVA Testing Zachary Campbell campbza@reed.edu Andrew Bray abray@reed.edu Anna Ritz Biology Department aritz@reed.edu Adam Groce agroce@reed.edu arxiv:1711335v2 [cs.cr] 20 Feb 2018
More informationAnalysis Based on SVM for Untrusted Mobile Crowd Sensing
Analysis Based on SVM for Untrusted Mobile Crowd Sensing * Ms. Yuga. R. Belkhode, Dr. S. W. Mohod *Student, Professor Computer Science and Engineering, Bapurao Deshmukh College of Engineering, India. *Email
More informationExploring the Patterns of Human Mobility Using Heterogeneous Traffic Trajectory Data
Exploring the Patterns of Human Mobility Using Heterogeneous Traffic Trajectory Data Jinzhong Wang April 13, 2016 The UBD Group Mobile and Social Computing Laboratory School of Software, Dalian University
More informationNew Statistical Applications for Differential Privacy
New Statistical Applications for Differential Privacy Rob Hall 11/5/2012 Committee: Stephen Fienberg, Larry Wasserman, Alessandro Rinaldo, Adam Smith. rjhall@cs.cmu.edu http://www.cs.cmu.edu/~rjhall 1
More informationDifferential Privacy with Bounded Priors: Reconciling Utility and Privacy in Genome-Wide Association Studies
Differential Privacy with ounded Priors: Reconciling Utility and Privacy in Genome-Wide Association Studies ASTRACT Florian Tramèr Zhicong Huang Jean-Pierre Hubaux School of IC, EPFL firstname.lastname@epfl.ch
More informationEncapsulating Urban Traffic Rhythms into Road Networks
Encapsulating Urban Traffic Rhythms into Road Networks Junjie Wang +, Dong Wei +, Kun He, Hang Gong, Pu Wang * School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan,
More informationChallenges in Privacy-Preserving Analysis of Structured Data Kamalika Chaudhuri
Challenges in Privacy-Preserving Analysis of Structured Data Kamalika Chaudhuri Computer Science and Engineering University of California, San Diego Sensitive Structured Data Medical Records Search Logs
More informationHow Well Does Privacy Compose?
How Well Does Privacy Compose? Thomas Steinke Almaden IPAM/UCLA, Los Angeles CA, 10 Jan. 2018 This Talk Composition!! What is composition? Why is it important? Composition & high-dimensional (e.g. genetic)
More informationJun Zhang Department of Computer Science University of Kentucky
Application i of Wavelets in Privacy-preserving Data Mining Jun Zhang Department of Computer Science University of Kentucky Outline Privacy-preserving in Collaborative Data Analysis Advantages of Wavelets
More informationDiagnosing New York City s Noises with Ubiquitous Data
Diagnosing New York City s Noises with Ubiquitous Data Dr. Yu Zheng yuzheng@microsoft.com Lead Researcher, Microsoft Research Chair Professor at Shanghai Jiao Tong University Background Many cities suffer
More informationTASK ASSIGNMENT WITH RIGOROUS PRIVACY PROTECTION IN SPATIAL CROWDSOURCING. Hien To. Submitted in Partial Fulfillment of the Requirements
TASK ASSIGNMENT WITH RIGOROUS PRIVACY PROTECTION IN SPATIAL CROWDSOURCING by Hien To Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science Viterbi
More information1 Differential Privacy and Statistical Query Learning
10-806 Foundations of Machine Learning and Data Science Lecturer: Maria-Florina Balcan Lecture 5: December 07, 015 1 Differential Privacy and Statistical Query Learning 1.1 Differential Privacy Suppose
More informationWhom to Ask? Jury Selection for Decision Making Tasks on Micro-blog Services
Whom to Ask? Jury Selection for Decision Making Tasks on Micro-blog Services Caleb Chen CAO, Jieying SHE, Yongxin TONG, Lei CHEN The Hong Kong University of Science and Technology Is Istanbul the capital
More informationLocal Differential Privacy
Local Differential Privacy Peter Kairouz Department of Electrical & Computer Engineering University of Illinois at Urbana-Champaign Joint work with Sewoong Oh (UIUC) and Pramod Viswanath (UIUC) / 33 Wireless
More informationA Computational Movement Analysis Framework For Exploring Anonymity In Human Mobility Trajectories
A Computational Movement Analysis Framework For Exploring Anonymity In Human Mobility Trajectories Jennifer A. Miller 2015 UT CID Report #1512 This UT CID research was supported in part by the following
More informationTuan V. Dinh, Lachlan Andrew and Philip Branch
Predicting supercomputing workload using per user information Tuan V. Dinh, Lachlan Andrew and Philip Branch 13 th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. Delft, 14 th -16
More informationMobility Analytics through Social and Personal Data. Pierre Senellart
Mobility Analytics through Social and Personal Data Pierre Senellart Session: Big Data & Transport Business Convention on Big Data Université Paris-Saclay, 25 novembre 2015 Analyzing Transportation and
More informationCSCI 5520: Foundations of Data Privacy Lecture 5 The Chinese University of Hong Kong, Spring February 2015
CSCI 5520: Foundations of Data Privacy Lecture 5 The Chinese University of Hong Kong, Spring 2015 3 February 2015 The interactive online mechanism that we discussed in the last two lectures allows efficient
More informationSpatial Data Science. Soumya K Ghosh
Workshop on Data Science and Machine Learning (DSML 17) ISI Kolkata, March 28-31, 2017 Spatial Data Science Soumya K Ghosh Professor Department of Computer Science and Engineering Indian Institute of Technology,
More informationExploring Human Mobility with Multi-Source Data at Extremely Large Metropolitan Scales. ACM MobiCom 2014, Maui, HI
Exploring Human Mobility with Multi-Source Data at Extremely Large Metropolitan Scales Desheng Zhang & Tian He University of Minnesota, USA Jun Huang, Ye Li, Fan Zhang, Chengzhong Xu Shenzhen Institute
More informationEstimating Large Scale Population Movement ML Dublin Meetup
Deutsche Bank COO Chief Data Office Estimating Large Scale Population Movement ML Dublin Meetup John Doyle PhD Assistant Vice President CDO Research & Development Science & Innovation john.doyle@db.com
More informationLecture 1 Introduction to Differential Privacy: January 28
Introduction to Differential Privacy CSE711: Topics in Differential Privacy Spring 2016 Lecture 1 Introduction to Differential Privacy: January 28 Lecturer: Marco Gaboardi Scribe: Marco Gaboardi This note
More informationAn Overview of Traffic Matrix Estimation Methods
An Overview of Traffic Matrix Estimation Methods Nina Taft Berkeley www.intel.com/research Problem Statement 1 st generation solutions 2 nd generation solutions 3 rd generation solutions Summary Outline
More informationDifferential Privacy in an RKHS
Differential Privacy in an RKHS Rob Hall (with Larry Wasserman and Alessandro Rinaldo) 2/20/2012 rjhall@cs.cmu.edu http://www.cs.cmu.edu/~rjhall 1 Overview Why care about privacy? Differential Privacy
More informationNorthrop Grumman Concept Paper
Northrop Grumman Concept Paper A Comprehensive Geospatial Web-based Solution for NWS Impact-based Decision Support Services Glenn Higgins April 10, 2014 Northrop Grumman Corporation Information Systems
More information1 Hoeffding s Inequality
Proailistic Method: Hoeffding s Inequality and Differential Privacy Lecturer: Huert Chan Date: 27 May 22 Hoeffding s Inequality. Approximate Counting y Random Sampling Suppose there is a ag containing
More informationArcGIS is Advancing. Both Contributing and Integrating many new Innovations. IoT. Smart Mapping. Smart Devices Advanced Analytics
ArcGIS is Advancing IoT Smart Devices Advanced Analytics Smart Mapping Real-Time Faster Computing Web Services Crowdsourcing Sensor Networks Both Contributing and Integrating many new Innovations ArcGIS
More informationPrivacy-Preserving Data Mining
CS 380S Privacy-Preserving Data Mining Vitaly Shmatikov slide 1 Reading Assignment Evfimievski, Gehrke, Srikant. Limiting Privacy Breaches in Privacy-Preserving Data Mining (PODS 2003). Blum, Dwork, McSherry,
More informationExtremal Mechanisms for Local Differential Privacy
Extremal Mechanisms for Local Differential Privacy Peter Kairouz Sewoong Oh 2 Pramod Viswanath Department of Electrical & Computer Engineering 2 Department of Industrial & Enterprise Systems Engineering
More informationPrivacy-preserving Data Mining
Privacy-preserving Data Mining What is [data] privacy? Privacy and Data Mining Privacy-preserving Data mining: main approaches Anonymization Obfuscation Cryptographic hiding Challenges Definition of privacy
More informationCHARTING SPATIAL BUSINESS TRANSFORMATION
CHARTING SPATIAL BUSINESS TRANSFORMATION An in-depth look at the business patterns of GIS and location intelligence adoption in the private sector EXECUTIVE SUMMARY The global use of geographic information
More informationH E L S I N K I 3D+ City Models and Smart Projects. Project Manager/ Architect/MSc (Civ.Eng) Jarmo Suomisto
H E L S I N K I 3D+ 3D City Models and Smart Projects Project Manager/ Architect/MSc (Civ.Eng) Jarmo Suomisto Project Manager/ MSc (Civ.Eng) Kari Kaisla Coordinator/ MSc (Civ.Eng) Enni Airaksinen Project
More informationSpatial Crowdsourcing: Challenges and Applications
Spatial Crowdsourcing: Challenges and Applications Liu Qiyu, Zeng Yuxiang 2018/3/25 We get some pictures and materials from tutorial spatial crowdsourcing: Challenges, Techniques,and Applications in VLDB
More informationTowards information flow control. Chaire Informatique et sciences numériques Collège de France, cours du 30 mars 2011
Towards information flow control Chaire Informatique et sciences numériques Collège de France, cours du 30 mars 2011 Mandatory access controls and security levels DAC vs. MAC Discretionary access control
More informationTime Series Data Cleaning
Time Series Data Cleaning Shaoxu Song http://ise.thss.tsinghua.edu.cn/sxsong/ Dirty Time Series Data Unreliable Readings Sensor monitoring GPS trajectory J. Freire, A. Bessa, F. Chirigati, H. T. Vo, K.
More informationJun Zhang Department of Computer Science University of Kentucky
Jun Zhang Department of Computer Science University of Kentucky Background on Privacy Attacks General Data Perturbation Model SVD and Its Properties Data Privacy Attacks Experimental Results Summary 2
More informationDetecting Origin-Destination Mobility Flows From Geotagged Tweets in Greater Los Angeles Area
Detecting Origin-Destination Mobility Flows From Geotagged Tweets in Greater Los Angeles Area Song Gao 1, Jiue-An Yang 1,2, Bo Yan 1, Yingjie Hu 1, Krzysztof Janowicz 1, Grant McKenzie 1 1 STKO Lab, Department
More informationData-driven characterization of multidirectional pedestrian trac
Data-driven characterization of multidirectional pedestrian trac Marija Nikoli Michel Bierlaire Flurin Hänseler TGF 5 Delft University of Technology, the Netherlands October 8, 5 / 5 Outline. Motivation
More informationVisualisation of Spatial Data
Visualisation of Spatial Data VU Visual Data Science Johanna Schmidt WS 2018/19 2 Visual Data Science Introduction to Visualisation Basics of Information Visualisation Charts and Techniques Introduction
More informationCollaborative topic models: motivations cont
Collaborative topic models: motivations cont Two topics: machine learning social network analysis Two people: " boy Two articles: article A! girl article B Preferences: The boy likes A and B --- no problem.
More informationDifferentially Private Sequential Data Publication via Variable-Length N-Grams
Differentially Private Sequential Data Publication via Variable-Length N-Grams Rui Chen Concordia University Montreal, Canada ru_che@encs.concordia.ca Gergely Acs INRIA France gergely.acs@inria.fr Claude
More informationTRAITS to put you on the map
TRAITS to put you on the map Know what s where See the big picture Connect the dots Get it right Use where to say WOW Look around Spread the word Make it yours Finding your way Location is associated with
More informationDifferential Privacy and Verification. Marco Gaboardi University at Buffalo, SUNY
Differential Privacy and Verification Marco Gaboardi University at Buffalo, SUNY Given a program P, is it differentially private? P Verification Tool yes? no? Proof P Verification Tool yes? no? Given a
More informationDemographic Data in ArcGIS. Harry J. Moore IV
Demographic Data in ArcGIS Harry J. Moore IV Outline What is demographic data? Esri Demographic data - Real world examples with GIS - Redistricting - Emergency Preparedness - Economic Development Next
More informationDifferentially Private Sequential Data Publication via Variable-Length N-Grams
Author manuscript, published in "ACM Computer and Communication Security (CCS) (2012)" Differentially Private Sequential Data Publication via Variable-Length N-Grams Rui Chen Concordia University Montreal,
More informationPublishing Search Logs A Comparative Study of Privacy Guarantees
Publishing Search Logs A Comparative Study of Privacy Guarantees Michaela Götz Ashwin Machanavajjhala Guozhang Wang Xiaokui Xiao Johannes Gehrke Abstract Search engine companies collect the database of
More informationCrime Analysis. GIS Solutions for Intelligence-Led Policing
Crime Analysis GIS Solutions for Intelligence-Led Policing Applying GIS Technology to Crime Analysis Know Your Community Analyze Your Crime Use Your Advantage GIS aids crime analysis by Identifying and
More information[Title removed for anonymity]
[Title removed for anonymity] Graham Cormode graham@research.att.com Magda Procopiuc(AT&T) Divesh Srivastava(AT&T) Thanh Tran (UMass Amherst) 1 Introduction Privacy is a common theme in public discourse
More informationPAC-learning, VC Dimension and Margin-based Bounds
More details: General: http://www.learning-with-kernels.org/ Example of more complex bounds: http://www.research.ibm.com/people/t/tzhang/papers/jmlr02_cover.ps.gz PAC-learning, VC Dimension and Margin-based
More informationWishart Mechanism for Differentially Private Principal Components Analysis
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-6) Wishart Mechanism for Differentially Private Principal Components Analysis Wuxuan Jiang, Cong Xie and Zhihua Zhang Department
More informationNotes on the Exponential Mechanism. (Differential privacy)
Notes on the Exponential Mechanism. (Differential privacy) Boston University CS 558. Sharon Goldberg. February 22, 2012 1 Description of the mechanism Let D be the domain of input datasets. Let R be the
More information* Abstract. Keywords: Smart Card Data, Public Transportation, Land Use, Non-negative Matrix Factorization.
Analysis of Activity Trends Based on Smart Card Data of Public Transportation T. N. Maeda* 1, J. Mori 1, F. Toriumi 1, H. Ohashi 1 1 The University of Tokyo, 7-3-1 Hongo Bunkyo-ku, Tokyo, Japan *Email:
More informationDPT: Differentially Private Trajectory Synthesis Using Hierarchical Reference Systems
DPT: Differentially Private Trajectory Synthesis Using Hierarchical Reference Systems Xi He Duke University hexi88@cs.duke.edu Cecilia M. Procopiuc AT&T Labs-Research mpro@google.com Graham Cormode University
More informationCognitive Engineering for Geographic Information Science
Cognitive Engineering for Geographic Information Science Martin Raubal Department of Geography, UCSB raubal@geog.ucsb.edu 21 Jan 2009 ThinkSpatial, UCSB 1 GIScience Motivation systematic study of all aspects
More informationBiggest problem with data analytics. tons of data, not much insights.
Biggest problem with data analytics tons of data, not much insights. Mobile Marketing CAC KPI Advertisers Organic Which customer segment should we go after? Owned Media S E O What ad copy should I create?
More informationDifferentially Private Password Frequency Lists
Differentially Private Password Frequency Lists Or, How to release statistics from 70 million passwords on purpose Jeremiah Blocki Microsoft Research Email: jblocki@microsoftcom Anupam Datta Carnegie Mellon
More informationUncertain Time-Series Similarity: Return to the Basics
Uncertain Time-Series Similarity: Return to the Basics Dallachiesa et al., VLDB 2012 Li Xiong, CS730 Problem Problem: uncertain time-series similarity Applications: location tracking of moving objects;
More informationAnswering Many Queries with Differential Privacy
6.889 New Developments in Cryptography May 6, 2011 Answering Many Queries with Differential Privacy Instructors: Shafi Goldwasser, Yael Kalai, Leo Reyzin, Boaz Barak, and Salil Vadhan Lecturer: Jonathan
More informationDifferential Privacy Models for Location- Based Services
Differential Privacy Models for Location- Based Services Ehab Elsalamouny, Sébastien Gambs To cite this version: Ehab Elsalamouny, Sébastien Gambs. Differential Privacy Models for Location- Based Services.
More informationLesson 16: Technology Trends and Research
http://www.esri.com/library/whitepapers/pdfs/integrated-geoenabled-soa.pdf GEOG DL582 : GIS Data Management Lesson 16: Technology Trends and Research Overview Learning Objective Questions: 1. Why is integration
More informationOn Differentially Private Frequent Itemsets Mining
On Differentially Private Frequent Itemsets Mining Chen Zeng University of Wisconsin-Madison zeng@cs.wisc.edu Jeffrey F. Naughton University of Wisconsin-Madison naughton@cs.wisc.edu Jin-Yi Cai University
More informationAn Industry Perspective. Bryn Fosburgh Vice President Trimble
An Industry Perspective Bryn Fosburgh Vice President Trimble Who are we? Professionals & Consultants Geospatial Professionals working at or with: AEC Consultants Transportation Departments Construction
More informationRAPPOR: Randomized Aggregatable Privacy- Preserving Ordinal Response
RAPPOR: Randomized Aggregatable Privacy- Preserving Ordinal Response Úlfar Erlingsson, Vasyl Pihur, Aleksandra Korolova Google & USC Presented By: Pat Pannuto RAPPOR, What is is good for? (Absolutely something!)
More informationOman NSDI Supporting Economic Development. Saud Al-Nofli Director of Spatial Data Directorate General of NSDI, NCSI
Oman NSDI Supporting Economic Development 2017 Saud Al-Nofli Director of Spatial Data Directorate General of NSDI, NCSI "It s critical to make correct decisions the first time to optimize the Investments
More informationRelease Connection Fingerprints in Social Networks Using Personalized Differential Privacy
Release Connection Fingerprints in Social Networks Using Personalized Differential Privacy Yongkai Li, Shubo Liu, Jun Wang, and Mengjun Liu School of Computer, Wuhan University, Wuhan, China Key Laboratory
More informationLessons From the Trenches: using Mobile Phone Data for Official Statistics
Lessons From the Trenches: using Mobile Phone Data for Official Statistics Maarten Vanhoof Orange Labs/Newcastle University M.vanhoof1@newcastle.ac.uk @Metti Hoof MaartenVanhoof.com Mobile Phone Data (Call
More informationUncovering the Digital Divide and the Physical Divide in Senegal Using Mobile Phone Data
Uncovering the Digital Divide and the Physical Divide in Senegal Using Mobile Phone Data Song Gao, Bo Yan, Li Gong, Blake Regalia, Yiting Ju, Yingjie Hu STKO Lab, Department of Geography, University of
More informationProgress in Data Anonymization: from k-anonymity to the minimality attack
Progress in Data Anonymization: from k-anonymity to the minimality attack Graham Cormode graham@research.att.com Tiancheng Li, Ninghua Li, Divesh Srivastava 1 Why Anonymize? For Data Sharing Give real(istic)
More informationAccessibility as an Instrument in Planning Practice. Derek Halden DHC 2 Dean Path, Edinburgh EH4 3BA
Accessibility as an Instrument in Planning Practice Derek Halden DHC 2 Dean Path, Edinburgh EH4 3BA derek.halden@dhc1.co.uk www.dhc1.co.uk Theory to practice a starting point Shared goals for access to
More information