Predicting Neighbor Goodness in Collaborative Filtering

Similar documents
Query Performance Prediction: Evaluation Contrasted with Effectiveness

A Neural Passage Model for Ad-hoc Document Retrieval

Circle-based Recommendation in Online Social Networks

A Modified PMF Model Incorporating Implicit Item Associations

Language Models, Smoothing, and IDF Weighting

Predicting the Performance of Collaborative Filtering Algorithms

Language Models and Smoothing Methods for Collections with Large Variation in Document Length. 2 Models

Preliminaries. Data Mining. The art of extracting knowledge from large bodies of structured data. Let s put it to use!

Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of Data

Impact of Data Characteristics on Recommender Systems Performance

Hub, Authority and Relevance Scores in Multi-Relational Data for Query Search

Recommendation Systems

Natural Language Processing. Topics in Information Retrieval. Updated 5/10

The Combination and Evaluation of Query Performance Prediction Methods

A Linked Data Recommender System using a Neighborhood-based Graph Kernel

The Axiometrics Project

On the Foundations of Diverse Information Retrieval. Scott Sanner, Kar Wai Lim, Shengbo Guo, Thore Graepel, Sarvnaz Karimi, Sadegh Kharazmi

Probabilistic Field Mapping for Product Search

Lecture 13: More uses of Language Models

Algorithms for Collaborative Filtering

Can Vector Space Bases Model Context?

Collaborative topic models: motivations cont

Ontology-Based News Recommendation

Modeling the Score Distributions of Relevant and Non-relevant Documents

COT: Contextual Operating Tensor for Context-Aware Recommender Systems

NCDREC: A Decomposability Inspired Framework for Top-N Recommendation

Recommender Systems EE448, Big Data Mining, Lecture 10. Weinan Zhang Shanghai Jiao Tong University

A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank

Motivation. User. Retrieval Model Result: Query. Document Collection. Information Need. Information Retrieval / Chapter 3: Retrieval Models

Information Retrieval

Prediction of Citations for Academic Papers

Recommendation Systems

Probabilistic Neighborhood Selection in Collaborative Filtering Systems

Time-aware Point-of-interest Recommendation

arxiv: v1 [cs.ir] 16 Oct 2013

Rfuzzy: An expressive simple fuzzy compiler

Probabilistic Partial User Model Similarity for Collaborative Filtering

Collaborative Filtering on Ordinal User Feedback

Modeling User Rating Profiles For Collaborative Filtering

Citation for published version (APA): Andogah, G. (2010). Geographically constrained information retrieval Groningen: s.n.

Why Language Models and Inverse Document Frequency for Information Retrieval?

FUSION METHODS BASED ON COMMON ORDER INVARIABILITY FOR META SEARCH ENGINE SYSTEMS

Similarity Measures for Link Prediction Using Power Law Degree Distribution

A Gradient-based Adaptive Learning Framework for Efficient Personal Recommendation

Asymmetric Correlation Regularized Matrix Factorization for Web Service Recommendation


Scaling Neighbourhood Methods

Ranked Retrieval (2)

Measuring the Variability in Effectiveness of a Retrieval System

BEWARE OF RELATIVELY LARGE BUT MEANINGLESS IMPROVEMENTS

Collaborative Filtering with Aspect-based Opinion Mining: A Tensor Factorization Approach

Department of Computer Science, Guiyang University, Guiyang , GuiZhou, China

Retrieval by Content. Part 2: Text Retrieval Term Frequency and Inverse Document Frequency. Srihari: CSE 626 1

Comparing Relevance Feedback Techniques on German News Articles

Modeling Controversy Within Populations

Collaborative Filtering. Radek Pelánek

Collaborative Recommendation with Multiclass Preference Context

Information-driven learning, distributed fusion, and planning

Towards Collaborative Information Retrieval

Lecture 3: Pivoted Document Length Normalization

Learning MN Parameters with Alternative Objective Functions. Sargur Srihari

Towards modeling implicit feedback with quantum entanglement

Collaborative Filtering with Entity Similarity Regularization in Heterogeneous Information Networks

An Empirical Study on Dimensionality Optimization in Text Mining for Linguistic Knowledge Acquisition

CS 646 (Fall 2016) Homework 3

A Survey on Spatial-Keyword Search

Point-of-Interest Recommendations: Learning Potential Check-ins from Friends

Large-scale Collaborative Ranking in Near-Linear Time

Optimal Mixture Models in IR

Exploring the Ratings Prediction Task in a Group Recommender System that Automatically Detects Groups

Collaborative Filtering Using Orthogonal Nonnegative Matrix Tri-factorization

Collaborative Filtering with Temporal Dynamics with Using Singular Value Decomposition

A Study of the Dirichlet Priors for Term Frequency Normalisation

CS276A Text Information Retrieval, Mining, and Exploitation. Lecture 4 15 Oct 2002

RETRIEVAL MODELS. Dr. Gjergji Kasneci Introduction to Information Retrieval WS

Research Article Normalizing Item-Based Collaborative Filter Using Context-Aware Scaled Baseline Predictor

Fast LSI-based techniques for query expansion in text retrieval systems

Information Retrieval and Organisation

Additive Co-Clustering with Social Influence for Recommendation

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang

CS 277: Data Mining. Mining Web Link Structure. CS 277: Data Mining Lectures Analyzing Web Link Structure Padhraic Smyth, UC Irvine

Term Filtering with Bounded Error

Local Low-Rank Matrix Approximation with Preference Selection of Anchor Points

arxiv: v2 [cs.ir] 14 May 2018

Using the Euclidean Distance for Retrieval Evaluation

APPLICATIONS OF MINING HETEROGENEOUS INFORMATION NETWORKS

CMAP: Effective Fusion of Quality and Relevance for Multi-criteria Recommendation

Learning in Probabilistic Graphs exploiting Language-Constrained Patterns

ChengXiang ( Cheng ) Zhai Department of Computer Science University of Illinois at Urbana-Champaign

III.6 Advanced Query Types

Lecture 9: Probabilistic IR The Binary Independence Model and Okapi BM25

Ranking and Filtering

Semantic Term Matching in Axiomatic Approaches to Information Retrieval

Collaborative Hotel Recommendation based on Topic and Sentiment of Review Comments

The Use of Dependency Relation Graph to Enhance the Term Weighting in Question Retrieval

Web Movie Recommendation Using Reviews on the Web

What are the Necessity Rules in Defeasible Reasoning?

Large-Scale Social Network Data Mining with Multi-View Information. Hao Wang

Matrix Factorization with Content Relationships for Media Personalization

ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 3 Centrality, Similarity, and Strength Ties

Transcription:

Predicting Neighbor Goodness in Collaborative Filtering Alejandro Bellogín and Pablo Castells {alejandro.bellogin, pablo.castells}@uam.es Universidad Autónoma de Madrid Escuela Politécnica Superior

Introduction: Recommender Systems Recommender Systems (RS) goal objects i 1 i k i m u 1 r 11 r 1k r 1m users u j r j1? r jm u n r n1 r nk r nm how is predicted the value of r jk?

Introduction: Recommender Systems Collaborative filtering (CF) based on users items i 1 i k i m u 1 r 11 r 1k r 1m users u j r j1? r jm u n r n1 r nk r nm Producing recommendations in UBCF: g u m, i n u N[ u ] j j m u N[ u ] sim u, u r m m j j, n sim u, u m j

Introduction: Recommender Systems Collaborative filtering (CF) based on items items i 1 i k i m u 1 r 11 r 1k r 1m users u j r j1? r jm u n r n1 r nk r nm Producing recommendations in IBCF: g u m, i n i N[ i ] j n i N[ i ] j sim i, i n n j m, j sim i, i n r j

Motivation Proliferation and variety of input information in IR systems Performance prediction in IR: adjust retrieval strategies according to the value of the prediction function In classic retrieval: query effectiveness Applications: query expansion, meta-search, distributed IR, etc. In CF, each neighbor can be seen as a different source of information with different weight? (besides the similarity value)

Performance prediction in IR Mostly addressed: query performance prediction in ad-hoc retrieval Approaches (Hauff et al. 2008) Pre-retrieval Pros: The prediction can be taken into account to improve the retrieval process itself Cons: Effectiveness cues available after the retrieval are not exploited Post-retrieval Pros: Better prediction accuracy Cons: Problem with computational efficiency

Clarity score in ad-hoc retrieval Distance (relative entropy) between query and collection language models (Cronen-Townsend et al. 2002) q Pw q clarity log wv It captures the (lack of) ambiguity in a query with respect to the collection Queries whose likely relevant documents are a mix of disparate topics receive a lower score than those with a topically-coherent result set. Strong correlation between this score and the performance (average precision) of the result set, q 1 2 P w q coll dr w q ml P P w q P w d P d q P q d P w d P w d P w d P w w coll q

Performance prediction in Recommender Systems Rating prediction as dynamic aggregation Each user s neighbor can be seen as a retrieval subsystem whose output is to be combined to form the final system output Utility of an item for a user:, sim,, v r u i r u C u v r v i r v Nu Using dynamic aggregation:,,, sim,, r u i r u C v u i u v r v i r v v Nu where (v,u,i) is a predictor of the performance of neighbor v.

Predicting good neighbors The predictor can be sensitive to the specific target user u, the item i, or any other input of the system. In this work: (v,u,i) = (v), i.e., only the neighbor We define two predictors, inspired by the clarity score: Item-based user clarity (IUC) User-based user clarity (UUC) pi v pw v γ γ( v) UUC v 2 pw vlog2 ii pc i wu pc w rat v, i pw v pw i pi v 1 c i: rat( v, i) 0 5 rat w, i 1 pw i 1 pc w i 5 I 1 p v γ γ( v) IUC v p i v log p i v p i p c c U

MovieLens dataset (100k) Evaluation Two variables: Neighborhood size Sparsity (number of available ratings) Measure final performance improvements (in terms of MAE) when dynamic weights are introduced

Results I Performance comparison for different rating density

Results II Performance comparison for different neighbourhood sizes

Conclusions Performance improvements in MAE using dynamic weights for neighbors Higher difference for small neighborhoods In the future: Other variants of the clarity-based predictor (He & Ounis 2004, Zhou & Croft 2007) Correlation analysis (as in IR) It involves defining user-level performance metrics Extension to other areas: hybrid recommender systems (Adomavicious & Tuzhilin 2005), personalized IR (Castells et al. 2005), rank fusion (Fox & Shaw 1993)

Thank you

Bibliography Adomavicius, G., Tuzhilin, A.: Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering 17(6), 734--749 (2005) Castells, P., Fernández, M., Vallet, D., Mylonas, P. & Avrithis, Y. (2005), Self-tuning personalized information retrieval in an ontology-based framework, in R. Meersman, Z. Tari & P. Herrero, eds, SWWS 05: In proceedings of the 1st International Workshop on Web Semantics, Vol. 3762, Springer, pp. 977 986. Cronen-Townsend, S., Zhou, Y. & Croft, B. W. (2002), Predicting query performance, in SIGIR 02: Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, ACM Press, New York, NY, USA, pp. 299 306. Fox, E. A. & Shaw, J. A. (1993), Combination of multiple searches, in TREC, pp. 243 252. Hauff, C., Hiemstra, D., de Jong, F.: A survey of pre-retrieval query performance predictors. In: 17 th ACM conference on Information and knowledge management (CIKM 2008), pp. 1419 1420. ACM Press, New York (2008) He, B., Ounis, I.: Inferring query performance using pre-retrieval predictors. In: Apostolico, A., Melucci, M. (eds.) String Processing and Information Retrieval, LNCS, vol. 2346, pp. 43--54. Springer, Heidelberg (2004) Zhou, Y., Croft, B.W.: Query performance prediction in web search environments. In: 30 th annual international ACM SIGIR conference on Research and development in information retrieval (SIGIR 2007), pp. 543--550, ACM Press, New York (2007)