S C I E N C E P A S S I O N T E C H N O L O G Y Understanding the Impact of Weather for POI Recommendations Christoph Trattner, Alex Oberegger, Lukas Eberhard, Denis Parra, Leandro Marinho, Know-Center@Graz University of Technology http://know-center.tugraz.at/
2 Understanding the Impact of Weather for POI Recommendations, Content 1. Motivation 2. Dataset 3. Empirical Analysis 4. Weather Aware POI Recommender (WPOI) 5. Conclusions & Future Work
Motivation 3 Motivation Geolocation services on portable devices Location-based social networks (LBSN) Yelp Foursquare Vast amount of check-in data Comments on places Recommendations on places Ratings on places Use data to assist users
Motivation 4 Context aware recommender systems (CARS) concentrated on Social context Geographical context Time context No research on the impact of weather Pool not popular at rainy conditions Ice cream shop not popular at cold temperatures
5 Motivation Problem Statement Definition Given a user u, the user s check-in history L u, i.e., the POIs that the user has visited in the past, and the current weather context c the aim is to predict the POIs ˆL u = {l 1,..., l L } that the user will likely visit in the future that are not in L u.
6 Motivation Research Questions RQ1 Are the users mobility patterns influenced by weather? RQ2 How can weather context information be incorporated into existent recommender systems? RQ3 To what extent can weather information be used to increase recommender accuracy? RQ4 Which weather feature provides the highest impact on the recommender accuracy?
7 Dataset Dataset Foursquare Check-in Dataset from Dingqi Yang [3][4] Worldwide check-ins from April 2012 to September 2013 Filtered by U.S. cities 3,000,000 Check-ins 500,000 Venues 50,000 Users 60 Cities
8 Dataset Dataset Weather Data from forecast.io weather API [1] One API call per <venue, city, time> triple 27,000 API calls Eight weather attributes Visibility, Precipitation intensity, Humidity, Cloud cover, Pressure, Windspeed, Temperature, Moonphase Outdated category information needed recrawling
9 Dataset Check-in Distributions Categories User 12 10 4 10 10 3 Frequency 8 6 4 Frequency (log) 10 2 10 1 2 0 10 0 10 1 10 2 10 3 10 4 10 5 10 6 Check in Frequency (log) 200 main categories 10 0 10 0 10 1 10 2 10 3 10 4 Check in Frequency (log) 11,000 user with >100 check-ins
Empirical Analysis 10 Weather Distributions Temperature Moonphase σ = 9.36 µ = 16.69 Data σ = 0.29 µ = 0.53 Data 0.045 1.4 0.040 1.2 0.035 Check in probability 0.030 0.025 0.020 0.015 0.010 Check in probability 1.0 0.8 0.6 0.4 0.005 0.2 0.000 30 20 10 0 10 20 30 40 50 Temperature Normal distributed 0.0 0.0 0.2 0.4 0.6 0.8 1.0 Moonphase Uniform distributed
11 Empirical Analysis Correlation between weather features
Empirical Analysis 12 Impact on Travel Distance Cloud Cover Pressure 10 1 10 1 10 2 10 2 Probability (log) 10 3 10 4 Probability (log) 10 3 10 4 10 5 10 5 10 6 Low Cloud cover High Cloud cover 10 0 10 1 10 2 Distance(log(KM)) 10 6 Low Pressure High Pressure 10 0 10 1 10 2 Distance(log(KM))
13 Empirical Analysis Impact on Category Popularity Ice Cream Shop Ski Area Check in probability 0.07 0.06 0.05 0.04 0.03 0.02 0.01 0.00 0.01 0.02 30 20 10 0 10 20 30 40 50 Temperature Check in probability 0.12 0.10 0.08 0.06 0.04 0.02 0.00 0.02 20 10 0 10 20 30 40 Temperature
14 Empirical Analysis User Mobility Patterns Mobility pattern of a frosty user Mobility pattern of a heated user 0.10 0.10 0.08 0.08 Check in probability 0.06 0.04 0.02 Check in probability 0.06 0.04 0.02 0.00 0.00 0.02 20 10 0 10 20 30 40 Temperature 0.02 20 10 0 10 20 30 40 Temperature
15 Empirical Analysis Seasonal Impact on Categories Check in probability 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Winter Spring Summer Autumn 0.0 All Categories National Park Ski Area Austrian Restaurant Hawaiian Restaurant Hotel Pool Zoo Ice Cream Shop Baseball Stadium Mexican Restaurant Gym
16 Empirical Analysis Regional Weather Variability 0.9 0.8 South cities North cities Total variability normed over all weather features Standard deviation 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 Honolulu Oakland Phoenix San Diego Miami Los Angeles San Jose Tampa El Paso Seattle Portland Memphis New Orleans Sacramento Nashville San Antonio Dallas Charlotte Atlanta Jacksonville Raleigh Tallahassee Houston Kansas City St. Louis Denver Richmond Norfolk Baltimore Austin Washington D.C. Indianapolis Pittsburgh Salt Lake City Cleveland Rochester Chicago Oklahoma City Cincinnati Detroit Buffalo Philadelphia Columbus Columbia Milwaukee Chester New York Lansing Boston Harrisburg Saint Paul Hartford Des Moines Providence Newark Trenton Minneapolis Brooklyn Madison
17 Weather Aware POI Recommender (WPOI) Weather Aware POI Recommender (WPOI) Many basic recommender algorithms available Most Popular (MP) KNN-Algorithms (User, Item) Matrix Factorization (MF) Enrich existing recommender system with weather context Test recommender accuracy in four cities representing variety of climate (cardinal directions) Minneapolis Boston Miami Honolulu
Weather Aware POI Recommender (WPOI) 18 Rank-GeoFM by Li et al. (2015) [2] Based on MF Context aware (Geo and Time) Easy to extend High accuracy Based on iterative learning of latent model parameters (weights) Learns model parameters with stochastic gradient descent (SGD)
Weather Aware POI Recommender (WPOI) 19 y ult = u u (1) l (1) l }{{} user preference score + u (2) u l N k (l) + t (1) t l (2) l }{{} w ll l (1) l }{{} geographical influence score time preference score + l (3) l t T + m tt t (1) t }{{} temporal influence score U (1,2), L (1,...,3), T (1) latent model parameters to learn wll is the probability that l is visited given that l has been visited in terms of geographical position and mtt is the probability that the popularity of a POI in time slot t is influenced by those in time slot t.
Weather Aware POI Recommender (WPOI) 20 Replace time with weather feature Start algorithm for each weather feature separately Measure accuracy and compare with RankGeoFM and base algorithms y ulc = u u (1) l (1) l }{{} user preference score + u (2) u l N k (l) + f (1) c l (2) l }{{} w ll l (1) l }{{} geographical influence score weather preference score + l (3) l c FC + wi cc f (1) c }{{} weather influence score
Weather Aware POI Recommender (WPOI) 21
Weather Aware POI Recommender (WPOI) 22 Algorithm 1: WPOI Input: check-in data D, hyperparameters that steer context influence and learning rate 1 Init: Initialize Θ with N (0, 0.01); Shuffle D 2 repeat 3 for (u, l, c) D do 4 Compute y ulc and n = 0 5 repeat 6 Sample a POI l and feature class c, Compute y ul c and set n++ 7 until (u,l,c ) ranked inccorrectly everything ranked correct 8 if (u,l,c ) ranked inccorrectly then 9 update latent model parameters Θ 10 according to the gradient of the error function. 11 end 12 end 13 until convergence 14 return Θ = {L (1), L (2), L (3), U (1), U (2), F (1) }
Weather Aware POI Recommender (WPOI) 23 0.06 Boston Boston, WPOI (Temperature) NDCG@20 0.05 NDCG@20 Miami Miami, WPOI (Temperature) 0.05 0.04 0.04 Measure 0.03 Measure 0.03 0.02 0.02 0.01 0.01 Measure 0.00 0.06 0.05 0.04 0.03 0.02 0 1000 2000 3000 4000 5000 Iteration Minneapolis Minneapolis, WPOI (Temperature) NDCG@20 Measure 0.00 0.07 0.06 0.05 0.04 0.03 0.02 0 1000 2000 3000 4000 5000 Iteration Honolulu Honolulu, WPOI (Temperature) NDCG@20 0.01 0.01 0.00 0 1000 2000 3000 4000 5000 Iteration 0.00 0 1000 2000 3000 4000 5000 Iteration
Weather Aware POI Recommender (WPOI) 24 0.07 Boston 0.06 Miami 0.06 0.05 0.05 0.04 NDCG@20 0.04 0.03 0.02 NDCG@20 0.03 0.02 0.01 0.01 0.00 Rank GeoFM WPOI (Press.) Rank GeoFM T WPOI (Hum.) WPOI (Windsp.) WPOI (Vis.) WPOI (Cld. Cov.) WPOI (Temp.) WPOI (Moonph.) WPOI(Prec. Int.) 0.00 Rank GeoFM WPOI (Hum.) Rank GeoFM T WPOI (Cld. Cov.) WPOI (Moonph.) WPOI (Windsp.) WPOI (Press.) WPOI (Temp.) WPOI (Vis.) WPOI(Prec. Int.) Minneapolis Honolulu 0.08 0.10 0.07 0.08 0.06 NDCG@20 0.05 0.04 0.03 NDCG@20 0.06 0.04 0.02 0.02 0.01 0.00 Rank GeoFM WPOI (Windsp.) WPOI (Hum.) WPOI (Temp.) WPOI (Press.) WPOI(Prec. Int.) WPOI (Moonph.) Rank GeoFM T WPOI (Cld. Cov.) WPOI (Vis.) 0.00 Rank GeoFM WPOI (Press.) WPOI (Hum.) WPOI (Moonph.) WPOI (Windsp.) WPOI (Temp.) Rank GeoFM T WPOI (Cld. Cov.) WPOI (Vis.) WPOI(Prec. Int.)
25 Conclusions & Future Work Conclusions Empirical Analysis Weather features have little impact on travel distance Seasonality has an impact on category popularity Category popularity is dependent on the region Higher weather variability in the north
Conclusions & Future Work 26 WPOI Weather context more useful than time WPOI better in regions closer to tropical zone Precipitation intensity and visibility improve accuracy best Weather context is indeed a useful contextual information in POI recommender systems
27 Conclusions & Future Work Future Work Only one weather feature at a time Incorporate travel distance probabilities under different weather conditions Incorporate user sensitivity to weather conditions
Conclusions & Future Work 28 Questions?
29 Conclusions & Future Work References I [1] FORECAST.IO. forecast.io weather api, 2015. [2] LI, X., CONG, G., LI, X.-L., PHAM, T.-A. N., AND KRISHNASWAMY, S. Rank-geofm: A ranking based geographical factorization method for point of interest recommendation. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (New York, NY, USA, 2015), SIGIR 15, ACM, pp. 433 442.
30 Conclusions & Future Work References II [3] YANG, D., ZHANG, D., CHEN, L., AND QU, B. Nationtelescope: Monitoring and visualizing large-scale collective behavior in lbsns. Journal of Network and Computer Applications 55 (2015), 170 180. [4] YANG, D., ZHANG, D., AND QU, B. Participatory cultural mapping based on collective behavior in location based social networks. ACM Transactions on Intelligent Systems and Technology (2015). in press.
Conclusions & Future Work 31 Weather feature Properties Range in dataset Precipitation intensity Precipitation intensity measured in milimeters of liquid water/hour. Temperature Temperature measured in degree Celsius Wind speed Wind speed measured in meters/second Cloud cover Value between 0 and 1 displaying the percentage of the sky covered by clouds. 0mm/h 34, 29mm/h 24, 48 46, 58 0m/s 19, 13m/s 0 1
32 Conclusions & Future Work Humidity Value between 0 and 1 representing the Percentage relative humidity is defined as the partial pressure of water vapor in air divided by the vapor pressure of water at the given temperature. Pressure Atmospheric pressure measured in hectopascals. Visibility Moonphase Value representing the average visibility in kilometers capped at 16,09 Value from 0 to 1 representing the range between new moon and full moon 0, 02φ 1, 00φ 957, 11hPa 1046, 05hPa 0km 16, 09km 0 1
33 Conclusions & Future Work Correlations Humidity Temperature -0.99 Relative humidity represents the saturation of moisture in the air and cold air does not need that much moisture to be saturated. Windspeed Temperature 0.93 Foehn-effect, Windchill factor not included Humidity Windspeed -0.93 See positive correlation between windspeed and temperature and negative between humidity and temperature.
34 Conclusions & Future Work Correlations Temperature Visibility 0.77 At colder temperatures saturation is reached earlier and therefore fog occurs more often that blurs visibility. Humidity Visibility -0.77 High humidity blurs visibility Cloud cover Visibility -0.67 Clouds cover the sun visibility is low
Conclusions & Future Work 35 0.56 0.55 Moonphase 0.54 0.53 0.52 0.51 Category
Conclusions & Future Work 36 Algorithm 2: WPOI Input: check-in data D, geographical influence matrix W, weather influence matrix WI, hyperparameters ɛ, C, α, β and learning rate γ g and γ w Output: parameters of the model Θ = {L (1), L (2), L (3), U (1), U (2), F (1) } 1 init: Initialize Θ with N (0, 0.01); Shuffle D 2 repeat 3 for (u, l, c) D do 4 Compute y ulc and n = 0 5 repeat 6 Sample a POI l and feature class c, Compute y ul c and set n++ 7 until I(x ulc > x ul c )I(y ulc < y ul c + ɛ) = 1 or n > L 8 if I(x ulc > x ul ( c )I(y ) ulc < y ul c + ɛ) = 1 then 9 η = E L δ n ull ( 10 g = c wi FCf c c f c (1) ) c + wi FC f cc + f (1) c+ 11 f (1) c f (1) c γ w η(l (2) l l (2) ) l 12 l (3) l 13 l (2) l (3) l l l (2) l γ w ηg γ w ηf c 14 l (2) l (2) + γ l l w ηf c 15 end 16 end 17 Project updated factors to accomplish constraints 18 until convergence 19 return Θ = {L (1), L (2), L (3), U (1), U (2), F (1) }
Conclusions & Future Work 37 Symbol U L FC f Description set of users u 1, u 2,..., u U set of POIs l 1, l 2,..., l L set of classes for feature f F set of weather feature classes f 1, f 2,..., f FCf Θ latent model parameters containing the learned weights {L (1), L (2), L (3), U (1), U (2), F (1) } for locations, users and weather features X ul X ulc U x L matrix containing the check-ins of users at POIs U x L x FC f matrix containing the check-ins of users at POIs at a specific feature class c D 1 user-poi pairs: (u, l) x ul > 0 D 2 user-poi-feature class triples: (u, l, c) x ulc > 0 d(l, l ) geo distance between the latitude and longitude of l and l
Conclusions & Future Work 38 W geographical probability matrix of size L x L where w ll contains the probability of l being visited after l has been visited according to their geographical distance. w ll = (0.5 + d(l, l )) 1) WI N k (l) y ul y ulc probability that a weather feature class c is influenced by feature class c. wi cc = set of k nearest neighbors of POI l u U l L x ulcx ulc u U l L ulc x 2 u U l L x 2 ulc the recommendation score of user u and POI l the recommendation score of user u, POI l and weather feature class c I( ) indicator function returning I(a) = 1 when a is true and 0 otherwise ɛ Incomp(y ulc, ɛ) margin to soften ranking incompatibility a function that counts the number of locations l L that should be ranked lower than l at the current weather context c and user u but are ranked higher by the model. γ w learning rate for updates on weather latent parameters.
Conclusions & Future Work 39 γ g learning rate for updates on latent parameters from base approach. E( ) a function that turns the rating incompatibility Incomp(y ulc, ɛ) into a loss. E(r) = r i=1 O 1 i objective function to minimize during the iterative learning. O = (u,l,c) D2 E(Incomp(y ulc, ɛ)) s(a) sigmoid function s(a) = δ ucll L n E 1 1+exp( a) function to approximate the indicator function with a continuous sigmoid function. δ ucll = s(y ul c + ɛ y ulc )(1 s(y ul c + ɛ y ulc )) if the nth location l was ranked incorrect by the model the expactation is that overall L locations are ranked incorrect. n calculation of stochastic gradient = Θ ( ) E L n δ (y ucl ucll +ɛ y ucl ) Θ Θ Θ E Θ SGD based optimization of the latent model parameters
40 Conclusions & Future Work Dataset Statistics City #Users #Venues #Check-ins Sparsity Minneapolis 436 797 37,737 89.1% Boston 637 1141 42,956 94.3% Miami 410 796 29,222 91.0% Honolulu 173 410 16,042 77.4%
41 Conclusions & Future Work Formulas SE x = SD n (1)