Diagnosing New York City's Noises with Ubiquitous Data
Dr. Yu Zheng, yuzheng@microsoft.com
Lead Researcher, Microsoft Research; Chair Professor at Shanghai Jiao Tong University
Background
- Many cities suffer from noise pollution
  - Sources: traffic, loud music, construction, AC
  - Compromises working efficiency
  - Reduces sleep quality
  - Impairs both physical and mental health
- Urban noise is difficult to model
  - Changes over time very quickly
  - Varies by location significantly
  - Depends on both sound level and people's tolerance
  - The composition of noises is hard to analyze
Yu Zheng, et al. Diagnosing New York City's Noises with Ubiquitous Data. UbiComp 2014.
311 in NYC
- 311 data: a platform for citizens' non-emergency complaints
- Each complaint is associated with a location, a timestamp, and a category
- Human as a sensor: crowd sensing
- Implies people's reaction and tolerance to noises
Goal
- Reveal the noise situation of each region in each hour
  - A noise indicator denoting the noise level
  - Composition of noises in each location
- Data sources: 311 data about noises, POIs, road networks, check-ins
[Figure: A) Overall noises; B) Construction; C) Noise of different categories in Times Square. Panels: Weekday 0-5am, Weekday 6am-6pm, Weekend 7pm-11pm.]
Methodology
- Partition NYC into regions by major roads
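Build a 3D tensor to model the noises
- $\mathcal{A} \in \mathbb{R}^{N \times M \times L}$, with one dimension each for regions $[r_1, r_2, \ldots, r_i, \ldots, r_N]$, noise categories $[c_1, c_2, \ldots, c_j, \ldots, c_M]$, and $L$ time slots
- An entry $\mathcal{A}(i, j, k)$ holds the 311 noise complaints of category $c_j$ in region $r_i$ during time slot $t_k$
- Decomposition: $\mathcal{A} \approx S \times_R R \times_C C \times_T T$, with core tensor $S \in \mathbb{R}^{d_R \times d_C \times d_T}$ and factor matrices $R \in \mathbb{R}^{N \times d_R}$, $C \in \mathbb{R}^{M \times d_C}$, $T \in \mathbb{R}^{L \times d_T}$
- Challenge: sparsity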
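As a rough illustration of the tensor described above, the sketch below counts complaint records into a region x category x time-slot array. The record layout (integer indices for region, category, and hour) and the function name `build_noise_tensor` are assumptions made for this example, not details from the slides.

```python
import numpy as np

def build_noise_tensor(complaints, n_regions, n_categories, n_slots):
    """Count complaints into A[region, category, time slot]."""
    A = np.zeros((n_regions, n_categories, n_slots))
    for region, category, slot in complaints:
        A[region, category, slot] += 1
    return A

# Hypothetical toy records: (region index, category index, hour of day).
complaints = [(0, 1, 23), (0, 1, 23), (2, 0, 8), (1, 3, 14)]
A = build_noise_tensor(complaints, n_regions=3, n_categories=4, n_slots=24)

# Fraction of entries with no complaint at all.
sparsity = 1.0 - np.count_nonzero(A) / A.size
```

Even in this toy case almost every entry is zero, which is the sparsity problem the decomposition is meant to address.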
Data sparseness of 311 in NYC
- 90% of regions receive 1 complaint per day on weekdays
- 75% of regions receive 1 complaint per day on weekends
Figure 6. Proportion of regions with complaints smaller than a number
Figure 7. The proportion of regions with complaints of a noise category
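Methodology
- Inputs: 311 complaints about noises (tensor $\mathcal{A}$), POIs and road networks (feature matrix $X$), user check-ins (matrix $Y$), and the correlation between noise categories (matrix $Z$)
- Context-aware factorization shares factors across the data sets: $X = R \times U$ (regions by features), $Y = T \times R^T$ (time slots by regions), $Z$ (categories by categories)
- Objective:
$$\mathcal{L}(S, R, C, T, U) = \frac{1}{2}\|\mathcal{A} - S \times_R R \times_C C \times_T T\|^2 + \frac{\lambda_1}{2}\|X - RU\|^2 + \frac{\lambda_2}{2}\,\mathrm{tr}(C^T L_Z C) + \frac{\lambda_3}{2}\|Y - TR^T\|^2 + \frac{\lambda_4}{2}\left(\|S\|^2 + \|R\|^2 + \|C\|^2 + \|T\|^2 + \|U\|^2\right)$$
[Figure legend: road intersection, 311 complaint, POIs, user check-in, small streets, major roads]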
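Methodology: terms of the objective $\mathcal{L}(S, R, C, T, U)$
- $\frac{1}{2}\|\mathcal{A} - S \times_R R \times_C C \times_T T\|^2$: error of decomposing the noise tensor $\mathcal{A} \in \mathbb{R}^{N \times M \times L}$ into a core tensor $S \in \mathbb{R}^{d_R \times d_C \times d_T}$ and factors $R \in \mathbb{R}^{N \times d_R}$, $C \in \mathbb{R}^{M \times d_C}$, $T \in \mathbb{R}^{L \times d_T}$
- $\frac{\lambda_1}{2}\|X - RU\|^2$: error of factorizing the region-feature matrix $X$, sharing the region factor $R$
- $\frac{\lambda_2}{2}\,\mathrm{tr}(C^T L_Z C)$: correlated categories get similar latent factors, since $\frac{1}{2}\sum_{i,j}\|c_i - c_j\|^2 Z_{ij} = \mathrm{tr}(C^T (D - Z) C) = \mathrm{tr}(C^T L_Z C)$, where $c_i$ is the $i$-th row of $C$, $D$ is the diagonal matrix with $D_{ii} = \sum_j Z_{ij}$, and $L_Z = D - Z$
- $\frac{\lambda_3}{2}\|Y - TR^T\|^2$: error of factorizing the time-slot-by-region check-in matrix $Y$, sharing $T$ and $R$
- $\frac{\lambda_4}{2}\left(\|S\|^2 + \|R\|^2 + \|C\|^2 + \|T\|^2 + \|U\|^2\right)$: regularization to avoid over-fitting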
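To make the objective concrete, here is a minimal NumPy sketch that evaluates each term for given factors. It only computes the loss value, not the optimization procedure. The norms are assumed to be squared Frobenius norms, and the latent sizes of $R$ and $T$ are assumed equal so that $TR^T$ is defined. The function names and toy sizes are illustrative only.

```python
import numpy as np

def tucker_reconstruct(S, R, C, T):
    """S x_R R x_C C x_T T for S:(dR,dC,dT), R:(N,dR), C:(M,dC), T:(L,dT)."""
    return np.einsum("abc,ia,jb,kc->ijk", S, R, C, T)

def objective(A, X, Y, Lz, S, R, C, T, U, lams):
    """Evaluate the five terms of the loss (squared Frobenius norms assumed)."""
    l1, l2, l3, l4 = lams
    fit = 0.5 * np.sum((A - tucker_reconstruct(S, R, C, T)) ** 2)
    feat = 0.5 * l1 * np.sum((X - R @ U) ** 2)
    cat = 0.5 * l2 * np.trace(C.T @ Lz @ C)
    chk = 0.5 * l3 * np.sum((Y - T @ R.T) ** 2)
    reg = 0.5 * l4 * sum(np.sum(F ** 2) for F in (S, R, C, T, U))
    return fit + feat + cat + chk + reg

# Toy sizes (hypothetical): N=5 regions, M=4 categories, L=6 slots, 7 features.
rng = np.random.default_rng(0)
S = rng.normal(size=(2, 3, 2))
R = rng.normal(size=(5, 2))
C = rng.normal(size=(4, 3))
T = rng.normal(size=(6, 2))
U = rng.normal(size=(2, 7))

# Data generated exactly from the factors, with no category correlation.
A, X, Y = tucker_reconstruct(S, R, C, T), R @ U, T @ R.T
Lz = np.zeros((4, 4))

loss_exact = objective(A, X, Y, Lz, S, R, C, T, U, (1.0, 1.0, 1.0, 0.0))
loss_reg = objective(A, X, Y, Lz, S, R, C, T, U, (1.0, 1.0, 1.0, 1.0))
```

With data generated from the factors themselves, every fitting term vanishes, so only the regularization term contributes once $\lambda_4$ is non-zero.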
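Methodology: geographical features
- Road networks
  - Number of intersections $f_s$
  - Length of road segments at different levels $f_r$
- POIs
  - Total number of POIs $f_n$ and density of POIs $f_d$
  - Distribution over different categories $f_c$
- Feature matrix $X$: one row per region $r_1, \ldots, r_i, \ldots, r_N$; the columns are the road features $F_r$ ($f_s$, $f_r$) and the POI features $F_p$ ($f_n$, $f_d$, $f_c$)
[Figure legend: user check-in, road intersection, 311 complaint, POIs, small streets, major roads]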
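The feature groups above can be assembled into one row of $X$ per region. The sketch below is only an illustration: the input layout, the function name `region_features`, and the use of region area to normalize POI density are assumptions, since the slides do not say how $f_d$ is computed.

```python
import numpy as np

def region_features(n_intersections, road_len_by_level, poi_counts, area_km2):
    """Build one row of X: [f_s, f_r..., f_n, f_d, f_c...]."""
    f_s = float(n_intersections)                     # number of intersections
    f_r = np.asarray(road_len_by_level, dtype=float) # road length per level
    counts = np.asarray(poi_counts, dtype=float)     # POIs per category
    f_n = counts.sum()                               # total number of POIs
    f_d = f_n / area_km2                             # density (assumed per km^2)
    f_c = counts / f_n if f_n > 0 else np.zeros_like(counts)
    return np.concatenate(([f_s], f_r, [f_n, f_d], f_c))

# Two hypothetical regions, each with 2 road levels and 2 POI categories.
row_a = region_features(12, [1.5, 3.0], [4, 6], area_km2=2.0)
row_b = region_features(3, [0.2, 0.9], [0, 0], area_km2=1.0)
X = np.vstack([row_a, row_b])
```

In practice the columns would likely need scaling before factorization, because counts, lengths, and proportions live on very different ranges.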
Methodology: check-in data in NYC
- Gowalla: 17,558 check-ins (5/5/2008 to 7/3/2011)
- Foursquare: 173,75 check-ins (4/4/2009 to 10/13/2013)
- Check-ins correlate with noises
- Matrix $Y$: rows are time slots $t_1, \ldots, t_k, \ldots, t_L$, columns are regions $r_1, \ldots, r_i, \ldots, r_N$, with entries $d_{ki}$ (from $d_{11}$ to $d_{LN}$) derived from check-ins
- $Y$ implies the correlation between different time slots and the correlation between different regions
[Figure: proportion by time of day (0 to 24), two plots: "Noise - Vehicles" vs. "Check-in - Art & Entertainment", and "Noise - Loud music/party" vs. "Check-in - Nightlife spot". Maps: Check-ins: Entertainment; Check-ins: Nightlife Spot; Noise: Loud Music/Party.]
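The time-of-day plots compare the hourly share of complaints with the hourly share of check-ins. A minimal sketch of that comparison follows, using made-up hours and Pearson correlation as the similarity measure. The choice of measure is an assumption; the slides only show the curves.

```python
import numpy as np

def hourly_proportion(hours, n_slots=24):
    """Share of events falling in each hour of the day."""
    counts = np.bincount(np.asarray(hours), minlength=n_slots).astype(float)
    return counts / counts.sum()

# Hypothetical event hours (0-23) for one noise category and one check-in category.
noise_hours = [22, 23, 23, 0, 1, 23, 22, 14]
checkin_hours = [21, 22, 23, 23, 0, 22, 23, 12]

noise = hourly_proportion(noise_hours)
checkin = hourly_proportion(checkin_hours)

# Pearson correlation between the two 24-slot profiles.
corr = np.corrcoef(noise, checkin)[0, 1]
```

A high value here would suggest that check-ins of that category carry information about when the corresponding noise occurs, which is the motivation for including $Y$ in the model.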
Methodology: correlation between different noise categories

Categories                        %
c1.  Loud Music/Party             4.
c2.  Construction                 17.
c3.  Loud Talking                 14.6
c4.  Vehicle                      13.7
c5.  AC/Ventilation equipment     3.9
c6.  Banging/Pounding             .1
c7.  Jack Hammering               .1
c8.  Alarms                       1.7
c9.  Private carting noise        0.8
c10. Manufacturing                0.3
c11. Lawn care equipment          0.3
c12. Horn Honking                 0.
c13. Loud Television              0.1
c14. Others                       0.8

- Matrix $Z$: rows and columns are the categories $c_1, \ldots, c_j, \ldots, c_M$; each entry is the correlation between a pair of categories
[Figure: 14 x 14 category correlation matrices, A) weekday, B) weekend]
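The category matrix $Z$ enters the model through the Laplacian $L_Z = D - Z$. The sketch below builds a toy $Z$ and checks numerically that the pairwise smoothness penalty equals the trace form. Estimating $Z$ as the Pearson correlation of per-region category counts is an assumption made for this example, and the counts are random placeholders.

```python
import numpy as np

def category_laplacian(Z):
    """L_Z = D - Z, where D is diagonal with D_ii = sum_j Z_ij."""
    return np.diag(Z.sum(axis=1)) - Z

rng = np.random.default_rng(0)
n_regions, M, d_C = 30, 4, 2

# Placeholder complaint counts per region and category (not real data).
counts = rng.poisson(3.0, size=(n_regions, M))
Z = np.corrcoef(counts, rowvar=False)   # symmetric M x M correlation matrix

# Placeholder latent category factors, one row per category.
C = rng.normal(size=(M, d_C))

pairwise = 0.5 * sum(
    Z[i, j] * np.sum((C[i] - C[j]) ** 2)
    for i in range(M) for j in range(M)
)
trace_form = np.trace(C.T @ category_laplacian(Z) @ C)
```

The two quantities agree for any symmetric $Z$, which is why the penalty can be written compactly as a trace.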
Experiment: datasets

Data set          Period                  Scale
311 noise data    5/3/2013 - 1/31/2014    67,378; 14 categories
Foursquare        4/4/2009 - 10/13/2013   173,75
Gowalla           5/5/2008 - 7/3/2011     17,558
POIs              2013                    6,884; 15 categories
Road network      2013                    87,898 nodes, 91,649 edges
Evaluation of tensor: accuracy of the inferences
- Remove 30% of the non-zero entries and infer them back
- Metrics: RMSE and MAE

Methods           Weekdays (RMSE MAE)   Weekends (RMSE MAE)
AWR               4.736.58              4.446.599
AWH               4.631.461             4.4.5
MF                4.600.474             4.393.516
Kriging           4.59.44               4.53.495
TD                4.391.381             4.141.393
TD + X            4.85.79               4.155.36
TD + X + Y        4.160.110             4.003.198
TD + X + Y + Z    4.010.013             3.930.07
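The evaluation protocol (hide 30% of the non-zero entries, then score the inferred values) can be sketched as follows. The helper names, the random seed, and the toy tensor are hypothetical; any model's reconstruction can be passed in as `A_pred`.

```python
import numpy as np

def holdout_nonzero(A, frac=0.3, seed=0):
    """Zero out a random fraction of the non-zero entries; return their indices."""
    rng = np.random.default_rng(seed)
    idx = np.argwhere(A != 0)
    n_hold = int(round(frac * len(idx)))
    chosen = idx[rng.choice(len(idx), size=n_hold, replace=False)]
    held = tuple(chosen.T)          # index arrays usable as A[held]
    A_train = A.copy()
    A_train[held] = 0
    return A_train, held

def rmse_mae(A_true, A_pred, held):
    """RMSE and MAE over the held-out entries only."""
    err = A_pred[held] - A_true[held]
    return float(np.sqrt(np.mean(err ** 2))), float(np.mean(np.abs(err)))

# Toy tensor with 24 non-zero entries.
A = np.arange(1, 25, dtype=float).reshape(2, 3, 4)
A_train, held = holdout_nonzero(A, frac=0.3)

# Trivial baseline: predict zero for everything that was removed.
rmse_zero, mae_zero = rmse_mae(A, A_train, held)
rmse_perfect, mae_perfect = rmse_mae(A, A, held)
```

Scoring only the held-out positions keeps the comparison fair across methods, since every method sees the same training tensor.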
Experiments: relative ranking performance
- Locations are ranked by the inferred noise indicators
- Ground truth: measured by a mobile phone running a client program
- Metric: NDCG
[Figure: NDCG@2, NDCG@4, and NDCG@6 (y-axis 0 to 1) for 311 data vs. inferred data, daytime and nighttime]
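For reference, NDCG@k compares the ranking induced by predicted scores with the ideal ranking by ground truth. The sketch below uses the common exponential-gain variant, $(2^{rel} - 1) / \log_2(\text{rank} + 1)$. The slides do not say which variant was used, so treat this as one plausible form, with made-up scores.

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items (exponential gain)."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum((2 ** rel - 1) / discounts))

def ndcg_at_k(predicted_scores, true_relevance, k):
    """Rank items by predicted score, then normalize DCG by the ideal DCG."""
    order = np.argsort(predicted_scores)[::-1]
    ranked_rel = np.asarray(true_relevance)[order]
    ideal = dcg_at_k(np.sort(true_relevance)[::-1], k)
    return dcg_at_k(ranked_rel, k) / ideal if ideal > 0 else 0.0

# Hypothetical measured noise grades for three locations.
true_levels = [3, 1, 2]
good = ndcg_at_k([0.9, 0.1, 0.5], true_levels, k=2)   # same order as the truth
bad = ndcg_at_k([0.1, 0.9, 0.5], true_levels, k=2)    # reversed order
```

A ranking that matches the measured order scores 1, and any misordering near the top of the list pulls the score down.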
311 in NYC: correlation between 311 complaints and real noise levels
- Measured the real noise levels of 36 locations in Manhattan
- People's tolerance varies with the time of day
[Figure: A) Locations (locations with few complaints vs. locations with sufficient complaints); B) Correlation in 6am-6pm; C) Correlation in 7pm-11pm]
311 vs. Inferences: results
- 2.75% of entries on weekdays are from 311 data (97.25% by inference)
- 1.83% of entries on weekends are from 311 data (98.17% by inference)
[Figure: 311 data vs. inference, A) Vehicles (6am-6pm); B) Loud Talking (0-5am)]
Results: overall noise situation in NYC
- Weekdays and weekends
- Different times of day: 0-5am, 6am-6pm, 7pm-11pm
- Examples: Times Square, Wall Street, Central Park, New York University (NYU), Columbia University
[Figure: noise pollution indicator per example location, weekdays vs. weekends, broken down by category: Loud Music/Party, Loud Talking, Banging/Pounding, Vehicle, Loud Television, Construction, Manufacturing, AC/Ventilation, Alarms, Horn Honking, Jack Hammering, Lawn care equipment, Private carting noise, Others]
Results: noises of particular categories
- Loud Music/Party, Loud Talking, Vehicle, Construction
[Figure: maps for each category, weekdays vs. weekends]
Conclusion
- 311 data helps reveal the noise situation of the city, but not accurately enough on its own
- Fusing other data sources is important
- Spatio-temporal correlation exists between the noises of different locations and time slots