A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank


A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank

Shoaib Jameel 1, Wai Lam 2, Steven Schockaert 1, and Lidong Bing 3
1 School of Computer Science and Informatics, Cardiff University. 2 Dept. of Systems Engineering and Engineering Management, The Chinese University of Hong Kong. 3 Machine Learning Department, Carnegie Mellon University.
October 12, 2015

ONE-LINE SUMMARY OF THE WORK
Incorporating a latent topic similarity feature in the regularized topic space improves standard learning-to-rank results. This work will be presented at the 24th ACM International Conference on Information and Knowledge Management (CIKM 2015) on October 20, 2015 in Melbourne, Australia.

TABLE OF CONTENTS
1 Background
2 Introduction
3 Related Topic Models: Unsupervised Latent Dirichlet Allocation (LDA); Supervised Latent Dirichlet Allocation (sLDA); Maximum Margin Models; Maximum Margin Supervised Topic Models (MedLDA)
4 Our Pairwise Maximum Margin Learning-to-Rank Model: Posterior Regularization; Bayesian Posterior Regularization; LDA as a Regularized Bayesian Model; Regularized Bayesian Learning to Rank; Parameter Estimation
5 Experiments and Results: Comparative Models and Evaluation Metric; NDCG Results
6 Conclusions

Background: Ad-Hoc Retrieval

INTRODUCTION
Maximum margin learning-to-rank models for document retrieval are well known for their generalizability. Probabilistic topic models have been shown to perform well on ad-hoc document retrieval [Wei and Croft, 2006]. Why not combine both frameworks in a single model?

A Common Approach (sketched in code below)
1 Train an existing topic model such as Latent Dirichlet Allocation (LDA).
2 Compute a topic-based similarity between the query and a document.
3 Use the similarity score as one of the features.
4 Train the learning-to-rank models.

A Problem
Such two-stage techniques are inherently sub-optimal due to error propagation: mistakes made by the topic model cannot be corrected by the ranker.
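A minimal sketch of this decoupled pipeline, assuming a toy corpus and scikit-learn's LatentDirichletAllocation; the document texts, queries, and the stand-in feature matrix below are hypothetical, not the paper's setup:

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus and query; the real experiments use TREC collections.
docs = ["cheap flights to melbourne australia",
        "melbourne hotel deals and reviews",
        "topic models for information retrieval"]
queries = ["flights melbourne"]

vec = CountVectorizer()
X = vec.fit_transform(docs + queries)

# Step 1: train an unsupervised LDA, ignoring relevance labels entirely.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)                     # document-topic proportions
doc_theta, query_theta = theta[:len(docs)], theta[len(docs):]

# Step 2: topic-based query-document similarity becomes one frozen feature.
topic_sim = cosine_similarity(query_theta, doc_theta).ravel()  # one score per doc

# Step 3: append it to the standard learning-to-rank features and train any
# off-the-shelf ranker; errors made in step 1 can no longer be corrected.
other_features = np.random.rand(len(docs), 45)   # stand-in for e.g. 45 OHSUMED features
ltr_features = np.hstack([other_features, topic_sim[:, None]])
```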

LEARNING-TO-RANK FRAMEWORK
Three approaches to learning to rank:
1 Pointwise - ranking as a regression problem.
2 Pairwise - ranking as a binary classification problem.
3 Listwise - ranking as a list optimization problem.
Query-document pairs are represented by a set of features. These features are computed prior to conducting the learning.

UNSUPERVISED LATENT DIRICHLET ALLOCATION (LDA)
[Plate diagram: priors α and β; per-document topic proportions θ_d; topic assignment z and word w inside the N_d plate, nested in the M document plate; word-topic matrix φ in the K topic plate.]
LDA [Blei et al., 2003] posits that each document corresponds to a mixture of topics. A topic is a probability distribution over words. Each word w has a latent topic assignment z. Topic weights for each document are stored in θ_d. φ is a matrix of word-topic distributions. α and β are parameters of the prior distributions. A generative sketch follows below.
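A minimal sketch of the LDA generative story in NumPy; the sizes K, V and the hyperparameters α, β are illustrative choices, not values from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, alpha, beta = 3, 50, 0.1, 0.01        # illustrative sizes, not from the paper

phi = rng.dirichlet(np.full(V, beta), size=K)   # K word distributions, one per topic

def generate_document(n_words):
    theta_d = rng.dirichlet(np.full(K, alpha))  # per-document topic weights theta_d
    z = rng.choice(K, size=n_words, p=theta_d)  # latent topic assignment per word
    w = np.array([rng.choice(V, p=phi[t]) for t in z])  # observed words
    return theta_d, z, w

theta_d, z, w = generate_document(20)
```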

UNSUPERVISED LATENT DIRICHLET ALLOCATION (LDA)
[Figure: the model unrolled for a single document, with one latent topic assignment z_1, ..., z_7 per observed word w_1, ..., w_7, all drawn from θ_d, with φ shared across the K topics.]

UNSUPERVISED LATENT DIRICHLET ALLOCATION MODEL
[Figure: the LDA plate diagram annotated with distributions: θ_d ~ Dirichlet(α), z ~ Multinomial(θ_d), φ ~ Dirichlet(β), w ~ Multinomial(φ_z).]

LDA AS A MATRIX FACTORIZATION MODEL
The terms-by-documents count matrix factorizes approximately into a terms-by-topics matrix times a topics-by-documents matrix:

Terms × Topics, P(w | z):
        k1     k2     k3
v1      0.50   0.11   0.10
v2      0.44   0.27   0.34
...
vW      0.00   0.00   0.47

Topics × Documents, P(z | d):
        d1     d2     ...   dD
k1      0.19   0.05   ...   0.10
k2      0.01   0.43   ...   0.52
k3      0.03   0.15   ...   0.24

A small numeric check follows below.
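A tiny numeric check of this factorization view, with randomly drawn φ and θ standing in for learned parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
V, K, D = 6, 3, 4                              # tiny illustrative sizes
phi = rng.dirichlet(np.ones(V), size=K).T      # V x K: P(w | z)
theta = rng.dirichlet(np.ones(K), size=D).T    # K x D: P(z | d)

p_w_given_d = phi @ theta                      # V x D: P(w|d) = sum_z P(w|z) P(z|d)
assert np.allclose(p_w_given_d.sum(axis=0), 1.0)  # each document column is a distribution
```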

SUPERVISED LATENT DIRICHLET ALLOCATION (SLDA)
[Plate diagram: as in LDA, with a per-document response y_d generated from the topic assignments via parameters (η, σ²).]
sLDA [Mcauliffe and Blei, 2008] posits that each document corresponds to a mixture of topics. A topic is a probability distribution over words. Each word w has a latent topic assignment z. Topic weights for each document are stored in θ_d. φ is a matrix of word-topic distributions. α and β are parameters of the prior distributions. sLDA additionally incorporates document meta-data y_d, for example classification labels or sentiment labels, in generating the topics.

MAXIMUM MARGIN MODELS
[Figure: illustration of maximum margin separation of points belonging to topics T5 and T6.]

LINEAR MAXIMUM MARGIN MODELS

Linear Maximum Margin Classifier
$$\min_{\eta}\ \frac{1}{2}\|\eta\|^2 + C \sum \xi_{ik} \quad \text{subject to} \quad \xi_{ik} \geq 0,\quad y^u_{ik}\, \eta^\top \psi^u_i \geq 1 - \xi_{ik}, \quad \forall u, i, k$$

Pairwise Maximum Margin Classifier
$$\min_{\eta}\ \frac{1}{2}\|\eta\|^2 + C \sum \xi_{ik} \quad \text{subject to} \quad \xi_{ik} \geq 0,\quad y^u_{ik}\, \eta^\top (\psi^u_i - \psi^u_k) \geq 1 - \xi_{ik}, \quad \forall u, i, k$$

Notation: $\psi^u_i$ - feature vector; $C$ - regularization parameter; $\eta$ - weight vector; $y^u_{ik}$ - class label; $\xi_{ik}$ - non-negative slack variables. A pairwise solver sketch follows below.
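Since the pairwise constraint is exactly a linear SVM on the difference vectors ψ_i − ψ_k, a standard hinge-loss solver recovers η; a sketch on synthetic pairs (all data hypothetical):

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)
n_pairs, n_features = 200, 10
psi_i = rng.normal(size=(n_pairs, n_features))   # features of the first document
psi_k = rng.normal(size=(n_pairs, n_features))   # features of the second document
y = rng.choice([-1, 1], size=n_pairs)            # pairwise preference labels y_ik

# The constraint y_ik * eta^T (psi_i - psi_k) >= 1 - xi_ik is a linear SVM
# on difference vectors, so an off-the-shelf solver recovers eta.
svm = LinearSVC(C=1.0, fit_intercept=False).fit(psi_i - psi_k, y)
eta = svm.coef_.ravel()                          # the learned weight vector
```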

MAXIMUM MARGIN SUPERVISED TOPIC MODELS (MEDLDA)
A topic model for document classification [Zhu et al., 2012]. It regularizes the topic space with maximum margin constraints, so topic model parameter learning and maximum margin parameter learning are tightly integrated into a single model. It uses a latent linear discriminant function with a random weight vector that encapsulates the topic feature weights. Learning alternates between two steps: solving the maximum margin problem, and updating the parameters of the topic model.

AD-HOC RETRIEVAL USING LDA [WEI AND CROFT, 2006]
Wei and Croft study the effectiveness of LDA in the ad-hoc retrieval task. They proposed an LDA-based document model within the language modeling framework.

Basic language modeling for ad-hoc IR (Dirichlet smoothing):
$$P(w \mid d) = \frac{N_d}{N_d + \mu} P_{ML}(w \mid d) + \left(1 - \frac{N_d}{N_d + \mu}\right) P_{ML}(w \mid \text{collection})$$

LDA-based language modeling for ad-hoc IR:
$$P(w \mid d) = \lambda \left( \frac{N_d}{N_d + \mu} P_{ML}(w \mid d) + \left(1 - \frac{N_d}{N_d + \mu}\right) P_{ML}(w \mid \text{collection}) \right) + (1 - \lambda)\, P_{LDA}(w \mid d)$$

$$P_{LDA}(w \mid d, \theta, \phi) = \sum_{z=1}^{K} P(w \mid z, \phi)\, P(z \mid \theta, d)$$

A scoring sketch follows below.
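A minimal scoring sketch of this interpolated model; the smoothing parameter µ, the mixing weight λ, and the sample counts are illustrative defaults, not the paper's tuned values:

```python
def p_w_given_d(tf_wd, n_d, cf_w, n_coll, p_lda_wd, mu=1000.0, lam=0.7):
    """Dirichlet-smoothed query likelihood interpolated with the LDA model
    P_LDA(w|d) = sum_z P(w|z) P(z|d); mu and lam are illustrative defaults."""
    p_ml_d = tf_wd / n_d                   # maximum-likelihood P(w|d)
    p_ml_coll = cf_w / n_coll              # collection language model
    smoothed = (n_d / (n_d + mu)) * p_ml_d + (mu / (n_d + mu)) * p_ml_coll
    return lam * smoothed + (1.0 - lam) * p_lda_wd

# Hypothetical counts: term frequency 3 in a 200-word document, etc.
print(p_w_given_d(tf_wd=3, n_d=200, cf_w=500, n_coll=10_000_000, p_lda_wd=0.004))
```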

OUR MODEL - PAIRWISE REGULARIZED BAYESIAN LEARNING-TO-RANK MODEL
Our Model = Generative Learning + Discriminative Learning
We propose an optimization algorithm that conducts learning-to-rank with topic models. We integrate the mechanism behind a basic topic model, LDA, with a pairwise maximum margin learning-to-rank classifier. The result is a unified learning-to-rank model that conducts regularized Bayesian inference: the posterior distribution of the LDA model is regularized with a pairwise maximum margin classifier, which allows us to directly control the posterior distribution.

POSTERIOR REGULARIZATION - HIGH-LEVEL IDEA
Bayes' theorem expresses the posterior in terms of the likelihood, the prior, and the evidence of $M$ given $x$:
$$P(M \mid x) = \frac{P(x \mid M)\, P(M)}{\int P(x \mid M)\, P(M)\, dM}$$

BAYESIAN INFERENCE AS AN OPTIMIZATION PROBLEM
Let $M \in \mathcal{M}$ be the model space. The objective is to find the posterior distribution that we intend to infer. The posterior given by Bayes' theorem is the solution to the following optimization problem:

Bayes Optimization Model
$$\min_{P(M \mid x_1, \ldots, x_N)}\ \mathrm{KL}\big[P(M \mid x_1, x_2, \ldots, x_N) \,\|\, P(M)\big] - \sum_{n=1}^{N} \int \log P(x_n \mid M)\, P(M \mid x_1, \ldots, x_N)\, dM$$
$$\text{subject to} \quad P(M \mid x_1, \ldots, x_N) \in \mathcal{P}$$

The feasible set $\mathcal{P}$ places constraints on the posterior distribution. A brute-force numeric check follows below.
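A brute-force numeric check, on a hypothetical three-model space, that the minimizer of this KL objective coincides with the Bayes posterior:

```python
import numpy as np

prior = np.array([0.5, 0.3, 0.2])     # P(M) over a hypothetical 3-model space
lik = np.array([0.02, 0.10, 0.30])    # P(x | M) for a single observation x

bayes = prior * lik / np.sum(prior * lik)   # Bayes' theorem directly

def objective(q):
    # KL[q || prior] minus the expected log-likelihood under q
    return np.sum(q * np.log(q / prior)) - np.sum(q * np.log(lik))

# Brute-force search over candidate posteriors on the probability simplex.
candidates = np.random.default_rng(0).dirichlet(np.ones(3), size=50_000)
best = candidates[np.argmin([objective(q) for q in candidates])]
print(bayes, best)    # best approaches the Bayes posterior as the search densifies
```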

BAYESIAN REGULARIZATION
Bayes Optimization Model
$$\min_{P(M \mid x_1, \ldots, x_N)}\ \mathrm{KL}\big[P(M \mid x_1, \ldots, x_N) \,\|\, P(M)\big] - \sum_{n=1}^{N} \int \log P(x_n \mid M)\, P(M \mid x_1, \ldots, x_N)\, dM \quad \text{s.t.} \quad P(M \mid x_1, \ldots, x_N) \in \mathcal{P}$$

Bayesian regularization adds a convex function $f(\chi)$ to the objective and lets the feasible set depend on $\chi$:
$$\min_{P(M \mid x_1, \ldots, x_N)}\ \mathrm{KL}\big[P(M \mid x_1, \ldots, x_N) \,\|\, P(M)\big] - \sum_{n=1}^{N} \int \log P(x_n \mid M)\, P(M \mid x_1, \ldots, x_N)\, dM + f(\chi) \quad \text{s.t.} \quad P(M \mid x_1, \ldots, x_N) \in \mathcal{P}(\chi)$$

This convex term can encode rich information or knowledge.

LDA AS A REGULARIZED BAYESIAN MODEL
Posterior of LDA:
$$P(\Theta, Z, \Phi \mid W, \alpha, \beta) = \frac{P_0(\Theta, Z, \Phi \mid \alpha, \beta)\, P(W \mid \Theta, Z, \Phi)}{P(W \mid \alpha, \beta)}$$
where $P_0(\Theta, Z, \Phi \mid \alpha, \beta)$ is the prior probability.

Notation: $w_d = \{w^d_n\}_{n=1}^{N_d}$, $W = \{w_d\}_{d=1}^{D}$; $z_d = \{z^d_n\}_{n=1}^{N_d}$, $Z = \{z_d\}_{d=1}^{D}$; $\Theta = \{\theta_d\}_{d=1}^{D}$; $\Phi = \{\phi_1, \phi_2, \ldots, \phi_K\}$, a $V \times K$ matrix of topic distribution parameters; $\alpha$ - parameter of the Dirichlet prior on the per-document topic distributions; $\beta$ - parameter of the Dirichlet prior on the per-topic word distributions.
[Plate diagram: the standard LDA graphical model with α, θ_d, z, w, φ, β.]

REGULARIZED BAYESIAN LEARNING TO RANK

Design Details
We combine ideas from pairwise maximum margin learning to rank with topic models. The convex function is borrowed from the pairwise maximum margin learning-to-rank model. We choose the basic topic model, LDA. We form an aggregated document set consisting of queries and documents, and incorporate the latent topic information, in the form of a latent topic similarity between the document and the query, during learning.

$$\min_{P(\Theta, Z, \Phi \mid \alpha, \beta) \in \mathcal{P},\, \xi}\ \underbrace{\mathrm{KL}\big[P(\Theta, Z, \Phi \mid W, \alpha, \beta) \,\|\, P_0(\Theta, Z, \Phi \mid \alpha, \beta)\big] - \mathbb{E}_P[\log P(W \mid \Theta, Z, \Phi)]}_{\text{LDA model}} + \underbrace{M(\xi)}_{\text{convex function}}$$
$$\text{subject to} \quad P(\Theta, Z, \Phi \mid \alpha, \beta) \in \mathcal{P}_{new}(\xi), \qquad \xi \geq 0$$

FULL REGULARIZED BAYESIAN LEARNING TO RANK MODEL
$$\min_{P(\Theta, Z, \Phi \mid \alpha, \beta),\, \eta,\, \eta_t,\, \xi}\ \mathrm{KL}\big[P(\Theta, Z, \Phi \mid W, \alpha, \beta) \,\|\, P_0(\Theta, Z, \Phi \mid \alpha, \beta)\big] - \mathbb{E}_P[\log P(W \mid \Theta, Z, \Phi)] + \underbrace{\frac{1}{2}\big(\|\eta\|^2 + \|\eta_t\|^2\big) + C \sum_{u=1}^{x} \frac{1}{|B^u|} \sum_{(i,k) \in B^u} \xi^u_{(i,k)}}_{\text{convex function - pairwise maximum margin}}$$
$$\text{subject to} \quad \xi^u_{(i,k)} \geq h^u_i - h^u_k - y^u_{ik}\Big[\eta^\top(\psi^u_i - \psi^u_k) + \eta_t \underbrace{(\Upsilon^u_i - \Upsilon^u_k)}_{\text{dynamic topic similarity}}\Big], \quad \forall u, i, k,$$
$$P(\Theta, Z, \Phi \mid \alpha, \beta) \in \mathcal{P}\big(\xi^u_{(i,k)}\big), \qquad \xi^u_{(i,k)} \geq 0, \quad \forall u, i, k.$$

MODELING ILLUSTRATION
[Figure: modeling illustration of the maximum margin separation in the latent topic space (topics T5 and T6).]

SOLVING THE OPTIMIZATION MODEL

Iterative Procedure
1 Monte Carlo method - collapsed Gibbs sampling.
2 Iteratively compute: (a) the maximum margin separation of the points, i.e. determine $\eta$ and $\eta_t$ given $P(\Theta, Z, \Phi \mid W, \alpha, \beta)$; (b) $P(\Theta, Z, \Phi \mid W, \alpha, \beta)$ given $(\eta, \eta_t)$.
3 Both steps are repeated for a given number of iterations, or until the sampler converges to a steady state.

$$P(Z \mid \alpha) \propto \underbrace{\frac{P(W, Z \mid \alpha, \beta)}{\Omega}}_{\text{topic model}} \cdot \underbrace{e^{-\frac{1}{2}\Psi}}_{\text{pairwise classifier}}$$
where $\Omega$ is the normalization constant.

SAMPLER FORMULATIONS
$$P(Z \mid \alpha) \propto \underbrace{\prod_{z=1}^{K} \frac{\Gamma\big(\sum_{s=1}^{V} \beta_s\big)}{\prod_{s=1}^{V} \Gamma(\beta_s)} \int \prod_{w=1}^{V} \phi_{zw}^{\,n_{zw} + \beta_w - 1}\, d\phi_z}_{\text{expanding using Dirichlet - word-topic statistics}} \cdot \underbrace{\prod_{d=1}^{D} \frac{\Gamma\big(\sum_{s=1}^{K} \alpha_s\big)}{\prod_{s=1}^{K} \Gamma(\alpha_s)} \int \prod_{z=1}^{K} \theta_{dz}^{\,p_{dz} + \alpha_z - 1}\, d\theta_d}_{\text{expanding using Dirichlet - document-topic statistics}} \cdot \underbrace{e^{-\frac{1}{2}\Psi}}_{\text{regularizer}}$$

Pairwise Maximum Margin Regularizer
$$\Psi = \sum_{u=1}^{x} \sum_{(d,k) \in B^u} \lambda^u_{dk}\,(h^u_d - h^u_k)$$
whose expansion couples all pairs through the dual variables $\lambda$:
$$\sum_{u=1}^{x} \sum_{(d,k) \in B^u} \sum_{\hat{u}=1}^{x} \sum_{(\hat{d},\hat{k}) \in B^{\hat{u}}} \lambda^u_{dk}\, \lambda^{\hat{u}}_{\hat{d}\hat{k}} \big[(\psi^u_d - \psi^u_k) + (\Upsilon^u_d - \Upsilon^u_k)\big]^\top \big[(\psi^{\hat{u}}_{\hat{d}} - \psi^{\hat{u}}_{\hat{k}}) + (\Upsilon^{\hat{u}}_{\hat{d}} - \Upsilon^{\hat{u}}_{\hat{k}})\big]$$

TRANSITION EQUATIONS
$$P(z^d_n = t \mid Z_{\neg n}, w = v, W_{\neg n}, \alpha, \beta) \propto \underbrace{\frac{n_{tv} + \beta_v - 1}{\big[\sum_{v=1}^{V} (n_{tv} + \beta_v)\big] - 1}}_{\text{word-topic statistics}} \cdot \underbrace{(p_{dt} + \alpha_t - 1)}_{\text{document-topic statistics}} \cdot \underbrace{e^{-\frac{1}{2}\Psi}}_{\text{regularizer}}$$

Posterior distributions are computed as follows.

Building the word-topic matrix:
$$\phi_{zv} = \frac{n_{zv} + \beta_v}{\sum_{v=1}^{V} (n_{zv} + \beta_v)}$$

Building the document-topic matrix:
$$\theta_{dz} = \frac{p_{dz} + \alpha_z - 1}{\sum_{z=1}^{K} (p_{dz} + \alpha_z)}\, e^{-\frac{1}{2}\Psi}$$

A sampling sketch follows below.
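A sketch of one transition step, written with the standard collapsed-Gibbs count ratios (the slide's −1 offsets come from excluding the current token from the counts) and the exp(−Ψ/2) regularizer factor; psi_per_topic is a hypothetical precomputed input, and the sample counts below are made up:

```python
import numpy as np

def sample_topic(n_tv, n_t, p_dt, alpha, beta, V, psi_per_topic, rng):
    """Draw the new topic of one token from the regularized conditional:
    (word-topic term) x (document-topic term) x exp(-Psi/2).
    n_tv: counts of the current word under each topic (current token excluded);
    n_t: total tokens per topic; p_dt: topic counts in the current document;
    psi_per_topic[t]: hypothetical precomputed value of Psi if the token took topic t."""
    word_term = (n_tv + beta) / (n_t + V * beta)
    doc_term = p_dt + alpha
    w = word_term * doc_term * np.exp(-0.5 * psi_per_topic)
    w = w / w.sum()
    return rng.choice(len(w), p=w)

rng = np.random.default_rng(0)
print(sample_topic(np.array([2.0, 0.0, 5.0]), np.array([40.0, 35.0, 60.0]),
                   np.array([3.0, 1.0, 6.0]), 0.1, 0.01, 50,
                   np.array([0.4, 0.1, 0.2]), rng))
```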

EXPERIMENTS AND RESULTS

Datasets
We need the original documents to build a topic model, and many standard learning-to-rank test collections do not come with the original documents; OHSUMED is an exception. We used the Terrier Information Retrieval platform to compute query-document features, and experimented on standard TREC collections:
1 OHSUMED - 45 normalized features
2 AQUAINT - 39 normalized features
3 WT2G - 39 normalized features
4 ClueWeb09 - 91 normalized features

Train-Validate-Test Splits
Training set - approximately 60% of query-document pairs
Testing set - approximately 20% of query-document pairs
Validation set - approximately 20% of query-document pairs

COMPARATIVE MODELS AND EVALUATION METRIC

Comparative Models
Listwise - ListNet, AdaRank-MAP, AdaRank-NDCG, Coordinate Ascent, LambdaMART, SVMMAP, and MART
Pairwise - RankNet, RankBoost, RankSVM-Struct, and LambdaRank
Pointwise - Linear Regression (L2 norm) and Random Forests
DirectRank - recently proposed [Tan et al., 2013]

Evaluation Metric
Normalized Discounted Cumulative Gain (NDCG):
$$W(q_s) = \frac{1}{T_n} \sum_{i=1}^{n} \frac{2^{r(i)} - 1}{\log(1 + i)}$$
$T_n$ - normalization constant; $r(i)$ - rank label of the document at position $i$; $n$ - length of the ranked list. An implementation sketch follows below.
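A small implementation of this metric, assuming the common log-base-2 discount (the slide does not fix the logarithm base); the input labels are hypothetical:

```python
import numpy as np

def ndcg_at_k(rank_labels, k=10):
    """NDCG@k with gain 2^r - 1 and a log2(1 + position) discount.
    rank_labels: graded relevance labels in the order the model ranked them."""
    r = np.asarray(rank_labels, dtype=float)[:k]
    pos = np.arange(1, len(r) + 1)
    dcg = np.sum((2.0 ** r - 1.0) / np.log2(1.0 + pos))
    ideal = np.sort(np.asarray(rank_labels, dtype=float))[::-1][:k]
    idcg = np.sum((2.0 ** ideal - 1.0) / np.log2(1.0 + pos))  # T_n, the ideal DCG
    return dcg / idcg if idcg > 0 else 0.0

print(ndcg_at_k([2, 0, 1, 2], k=10))   # hypothetical graded labels
```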

NDCG RESULTS - NDCG@10, WITHOUT TOPICS
[Table: NDCG@10 on AQUAINT, WT2G, ClueWeb09, and OHSUMED for ListNet, AdaRank-NDCG, AdaRank-MAP, SVMMAP, Coordinate Ascent, LambdaMART, LambdaRank, Linear Regression, RankSVM, RankBoost, DirectRank, MART, RankNet, Random Forests, Semantic Search, and Our Model, with the baseline rankers trained without the latent topic similarity feature.]

NDCG RESULTS - NDCG@10, WITH TOPICS
[Table: NDCG@10 on AQUAINT, WT2G, ClueWeb09, and OHSUMED for the same sixteen models when the latent topic similarity score is added as a feature.]

QUERY-LEVEL PERFORMANCE

Query length       1       2       3       >3
AQUAINT            54.33   59.74   69.12   78.66
WT2G               59.64   67.35   73.92   73.92
ClueWeb            61.23   72.98   79.76   83.21
LETOR OHSUMED      57.34   63.65   69.44   75.61

Table: Query-level performance in terms of the percentage of winning numbers for different query lengths.

The winning number is defined as
$$W_i = \sum_{j=1}^{n} \sum_{k=1}^{m} I\big(\mathrm{NDCG}_i(j) > \mathrm{NDCG}_k(j)\big)$$
where $n$ is the number of datasets, $k$ indexes the $m$ compared models, and $I(\cdot)$ is the indicator function. A computation sketch follows below.
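A minimal implementation of the winning number over an m-models-by-n-datasets score matrix; the scores below are hypothetical:

```python
import numpy as np

def winning_number(ndcg, i):
    """Winning number W_i: across all n datasets j and all m compared models k,
    count how often model i's NDCG exceeds model k's (k == i contributes 0)."""
    m, n = ndcg.shape
    return sum(bool(ndcg[i, j] > ndcg[k, j]) for j in range(n) for k in range(m))

# Hypothetical scores: 3 models x 2 datasets.
scores = np.array([[0.30, 0.53], [0.22, 0.49], [0.27, 0.51]])
print(winning_number(scores, 0))   # model 0 beats both others on both datasets -> 4
```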

CONCLUSIONS
We presented a novel LTR model that combines latent topic information with maximum margin learning in a unified way. The latent topic representation is directly regularized with a pairwise maximum margin constraint, and topic similarity is computed in the regularized latent topic space. As our experiments show, this integrated approach outperforms the existing two-stage decoupled approach to incorporating latent topics in LTR models.

REFERENCES
[Blei et al., 2003] Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. JMLR, 3:993–1022.
[Mcauliffe and Blei, 2008] Mcauliffe, J. D. and Blei, D. M. (2008). Supervised topic models. In NIPS, pages 121–128.
[Tan et al., 2013] Tan, M., Xia, T., Guo, L., and Wang, S. (2013). Direct optimization of ranking measures for learning to rank models. In KDD, pages 856–864.
[Wei and Croft, 2006] Wei, X. and Croft, W. B. (2006). LDA-based document models for ad-hoc retrieval. In SIGIR, pages 178–185.
[Zhu et al., 2012] Zhu, J., Ahmed, A., and Xing, E. P. (2012). MedLDA: Maximum margin supervised topic models. JMLR, 13(1):2237–2278.