A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank

1 A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank
Shoaib Jameel (1), Wai Lam (2), Steven Schockaert (1), and Lidong Bing (3)
(1) School of Computer Science and Informatics, Cardiff University.
(2) Dept. of Systems Engineering and Engineering Management, The Chinese University of Hong Kong.
(3) Machine Learning Department, Carnegie Mellon University.
October 12, 2015
ONE LINE SUMMARY OF THE WORK
Incorporating the latent topic similarity feature in the regularized topic space helps improve standard learning-to-rank results.
This work will be presented at the 24th ACM International Conference on Information and Knowledge Management (CIKM 2015) on October 20, 2015 in Melbourne, Australia.
Shoaib Jameel et al., Cardiff University | Learning-to-rank with Topic Models | October 12, 2015

2 TABLE OF CONTENTS
1 Background
2 Introduction
3 Related Topic Models
    Unsupervised Latent Dirichlet Allocation (LDA)
    Supervised Latent Dirichlet Allocation (sLDA)
    Maximum Margin Models
    Maximum Margin Supervised Topic Models (MedLDA)
4 Our Pairwise Maximum Margin Learning to Rank Model
    Posterior Regularization
    Bayesian Posterior Regularization
    LDA as a Regularized Bayesian Model
    Regularized Bayesian Learning to Rank
    Parameter Estimation
5 Experiments and Results
    Comparative Models and Evaluation Metric
    NDCG Results
6 Conclusions

3 Background: Ad-Hoc Retrieval

4 INTRODUCTION
What is?
Maximum margin learning-to-rank models for document retrieval are well known for their generalizability.
Probabilistic topic models have been shown to perform well on ad-hoc document retrieval [Wei and Croft, 2006].
Why not incorporate both frameworks in a single model?
A Common Approach
1 Train an existing topic model such as Latent Dirichlet Allocation (LDA).
2 Compute a topic-based similarity between the query and a document.
3 Use the similarity score as one of the features.
4 Train the learning-to-rank model.
A Problem
Such two-stage techniques are inherently sub-optimal due to error propagation: the topic model is fixed before the ranker is trained, so its errors carry over into ranking.
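
The common two-stage approach can be sketched as follows. This is a minimal illustration, assuming that topic proportions for the query and for each document have already been inferred by some LDA implementation and that cosine similarity is used as the topic-based similarity. The function names and the toy vectors are hypothetical and do not come from the slides.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two topic-proportion vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def add_topic_feature(features, query_topics, doc_topics):
    """Step 3: append the topic similarity to the existing feature vector."""
    return features + [cosine_similarity(query_topics, doc_topics)]

# Toy inputs (assumed): 3 topics, 2 precomputed features per query-document pair.
query_topics = [0.7, 0.2, 0.1]
docs = {
    "d1": {"features": [0.5, 0.1], "topics": [0.6, 0.3, 0.1]},
    "d2": {"features": [0.4, 0.9], "topics": [0.1, 0.1, 0.8]},
}

# The augmented vectors would then be passed to any learning-to-rank trainer (step 4).
augmented = {
    name: add_topic_feature(d["features"], query_topics, d["topics"])
    for name, d in docs.items()
}
```

In this sketch the topic proportions are computed once and never revisited while the ranker is trained, which is the decoupling that the slide identifies as the source of error propagation.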

5 LEARNING-TO-RANK FRAMEWORK
Three approaches to learning to rank:
1 Pointwise - Ranking as a regression problem.
2 Pairwise - Ranking as a binary classification problem over document pairs.
3 Listwise - Ranking as a list optimization problem.
Query-document pairs are represented by a set of features.
These features are computed before learning is conducted.
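
To make the pairwise view concrete, the sketch below turns the documents retrieved for one query into binary classification examples on feature differences. It is an illustration under assumptions: `make_pairs` is a hypothetical helper, ties are skipped, and the feature values are toy numbers.

```python
def make_pairs(docs):
    """Build pairwise examples for ONE query.

    docs: list of (feature_vector, relevance_label).
    Returns a list of (difference_vector, label), where label is +1 if the
    first document of the pair is more relevant and -1 otherwise.
    """
    pairs = []
    for i, (x_i, y_i) in enumerate(docs):
        for x_k, y_k in docs[i + 1:]:
            if y_i == y_k:
                continue  # equally relevant documents carry no preference
            diff = [a - b for a, b in zip(x_i, x_k)]
            pairs.append((diff, 1 if y_i > y_k else -1))
    return pairs

# Toy data (assumed): three documents with graded relevance labels.
query_docs = [([0.9, 0.2], 2), ([0.4, 0.1], 1), ([0.3, 0.7], 1)]
pairs = make_pairs(query_docs)
```

A binary classifier trained on `pairs` learns which document of a pair should be ranked higher, which is the reduction the pairwise approach relies on.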

6 UNSUPERVISED LATENT DIRICHLET ALLOCATION (LDA)
[Figure: LDA plate diagram with nodes α, θ_d, z, w, φ, β and plates K, N_d, M]
LDA [Blei et al., 2003] posits that documents correspond to mixtures of topics.
A topic is a probability distribution over words.
Each word w has a latent topic assignment z.
Topic weights for each document are stored in θ_d.
φ is a matrix of word-topic distributions.
α and β are the parameters of the Dirichlet priors.
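
The generative story described on this slide can be sketched in a few lines. The code below is a toy simulation, not an inference procedure: it assumes symmetric Dirichlet priors (a single scalar for α and for β) and fixed document lengths, and all function names are hypothetical.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]

def generate_corpus(num_docs, doc_len, num_topics, vocab_size, alpha, beta, seed=0):
    rng = random.Random(seed)
    # phi[k]: word distribution of topic k, drawn from Dirichlet(beta).
    phi = [sample_dirichlet([beta] * vocab_size, rng) for _ in range(num_topics)]
    corpus, thetas = [], []
    for _ in range(num_docs):
        # theta: topic mixture of this document, drawn from Dirichlet(alpha).
        theta = sample_dirichlet([alpha] * num_topics, rng)
        doc = []
        for _ in range(doc_len):
            z = rng.choices(range(num_topics), weights=theta)[0]   # topic assignment
            w = rng.choices(range(vocab_size), weights=phi[z])[0]  # word from topic z
            doc.append((w, z))
        corpus.append(doc)
        thetas.append(theta)
    return corpus, thetas, phi

corpus, thetas, phi = generate_corpus(
    num_docs=3, doc_len=5, num_topics=2, vocab_size=4, alpha=0.5, beta=0.5
)
```

Inference in LDA runs this process in reverse: only the words are observed, and θ, φ, and z have to be recovered.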

7 UNSUPERVISED LATENT DIRICHLET ALLOCATION (LDA)
[Figure: unrolled LDA graphical model for one document, with topic assignments z_1, ..., z_7 and words w_1, ..., w_7, nodes α, θ_d, β, φ, and plates M and K]

8 Unsupervised Latent Dirichlet Allocation (LDA)
UNSUPERVISED LATENT DIRICHLET ALLOCATION MODEL
[Figure: LDA plate diagram annotated with distributions: θ_d is Dirichlet with parameter α, z is Multinomial, φ is Dirichlet with parameter β, w is Multinomial; plates K, N_d, M]

9 Unsupervised Latent Dirichlet Allocation (LDA)
LDA AS A MATRIX FACTORIZATION MODEL
[Figure: the Terms x Documents matrix (terms v_1, ..., v_W; documents d_1, ..., d_D) written as the product of a Terms x Topics matrix and a Topics x Documents matrix over topics k_1, ..., k_K]
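
The matrix factorization reading can be checked numerically. The sketch below assumes toy values for a Terms x Topics matrix and a Topics x Documents matrix whose columns each sum to one; `matmul` is a small hypothetical helper so that the example needs no external library.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x m) and b (m x p)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Terms x Topics: each column is one topic's distribution over 3 terms.
phi = [
    [0.7, 0.1],
    [0.2, 0.1],
    [0.1, 0.8],
]
# Topics x Documents: each column is one document's topic mixture.
theta = [
    [0.9, 0.5],
    [0.1, 0.5],
]

# Terms x Documents: entry (w, d) is P(w | d) = sum_k P(w | k) P(k | d).
term_doc = matmul(phi, theta)
column_sums = [sum(row[j] for row in term_doc) for j in range(2)]
```

Because both factors have columns that sum to one, every column of `term_doc` is again a probability distribution over terms.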

10 Supervised Latent Dirichlet Allocation (sLDA)
SUPERVISED LATENT DIRICHLET ALLOCATION (SLDA)
[Figure: sLDA plate diagram with nodes α, θ_d, z, w, φ, β, the response y_d with parameters (η, σ^2), and plates K, N_d, M]
sLDA [Mcauliffe and Blei, 2008] posits that documents correspond to mixtures of topics.
A topic is a probability distribution over words.
Each word w has a latent topic assignment z.
Topic weights for each document are stored in θ_d.
φ is a matrix of word-topic distributions.
α and β are the parameters of the Dirichlet priors.
Incorporates the document meta-data y_d, for example classification labels or sentiment labels, when generating topics.

11 Maximum Margin Models
MAXIMUM MARGIN MODELS
[Figure: illustration of a maximum margin model; axis labels T 5 and T 6]

12 Maximum Margin Models
LINEAR MAXIMUM MARGIN MODELS
Linear Maximum Margin Classifier
\[
\begin{aligned}
\underset{\eta}{\text{minimize}} \quad & \frac{1}{2} \|\eta\|^2 + C \sum \xi_{ik} \\
\text{subject to} \quad & \xi_{ik} \ge 0, \\
& y^u_{ik}\, \eta^\top (\psi^u_i) \ge 1 - \xi_{ik}, \quad \forall u, i, k
\end{aligned}
\]
Pairwise Maximum Margin Classifier
\[
\begin{aligned}
\underset{\eta}{\text{minimize}} \quad & \frac{1}{2} \|\eta\|^2 + C \sum \xi_{ik} \\
\text{subject to} \quad & \xi_{ik} \ge 0, \\
& y^u_{ik}\, \eta^\top (\psi^u_i - \psi^u_k) \ge 1 - \xi_{ik}, \quad \forall u, i, k
\end{aligned}
\]
Notations
ψ^u_i - Feature vector.
C - Regularization parameter.
η - Weight vector.
y^u_{ik} - Class label.
ξ_{ik} - Non-negative slack variables.
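
As a worked example of the pairwise objective, the sketch below evaluates it for a given weight vector, setting each slack to its smallest feasible value max(0, 1 - y η·(ψ_i - ψ_k)). This only evaluates the objective and does not solve the optimization problem; `pairwise_objective` and the toy pairs are hypothetical.

```python
def pairwise_objective(eta, pairs, C):
    """0.5 * ||eta||^2 + C * (sum of hinge slacks over document pairs).

    pairs: list of (psi_i, psi_k, y) with y in {+1, -1}.
    """
    regularizer = 0.5 * sum(w * w for w in eta)
    slack_sum = 0.0
    for psi_i, psi_k, y in pairs:
        score = sum(w * (a - b) for w, a, b in zip(eta, psi_i, psi_k))
        slack_sum += max(0.0, 1.0 - y * score)  # smallest slack satisfying the constraint
    return regularizer + C * slack_sum

# Toy pairs (assumed): the first document should rank higher in pair 1, lower in pair 2.
pairs = [([1.0, 0.0], [0.0, 1.0], 1), ([0.2, 0.4], [0.6, 0.1], -1)]

value_at_zero = pairwise_objective([0.0, 0.0], pairs, C=1.0)
value_at_eta = pairwise_objective([1.0, -1.0], pairs, C=1.0)
```

With `eta = [1.0, -1.0]` the first pair is separated with margin 2 and needs no slack, while the second pair reaches a margin of only 0.7 and contributes a slack of 0.3.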

13 Maximum Margin Supervised Topic Models (MedLDA)
MAXIMUM MARGIN SUPERVISED TOPIC MODELS (MEDLDA)
Topic model for document classification [Zhu et al., 2012].
Regularizes the topic space with maximum margin constraints.
Topic model parameter learning and maximum margin parameter learning are tightly integrated into a single model.
Uses a latent linear discriminant function with a random weight vector.
Encapsulates the topic feature weights.
Learning is conducted iteratively, alternating between:
1 The maximum margin solver.
2 The parameters of the topic model.

14 Maximum Margin Supervised Topic Models (MedLDA)
AD-HOC RETRIEVAL USING LDA ([WEI AND CROFT, 2006])
Wei and Croft study the effectiveness of the LDA model in the ad-hoc retrieval task.
They propose an LDA-based document model within the language modeling framework.
Basic Language Modeling for Ad-hoc IR
\[
P(w \mid d) = \frac{N_d}{N_d + \mu} \underbrace{P_{ML}(w \mid d)}_{\text{Maximum Likelihood}} + \left(1 - \frac{N_d}{N_d + \mu}\right) \underbrace{P_{ML}(w \mid \text{collection})}_{\text{Maximum Likelihood}}
\]
LDA-based Language Modeling for Ad-hoc IR
\[
P(w \mid d) = \lambda \left( \frac{N_d}{N_d + \mu} P_{ML}(w \mid d) + \left(1 - \frac{N_d}{N_d + \mu}\right) P_{ML}(w \mid \text{collection}) \right) + (1 - \lambda)\, P_{LDA}(w \mid d)
\]
\[
P_{LDA}(w \mid d, \theta, \phi) = \sum_{z=1}^{K} P(w \mid z, \phi)\, P(z \mid \theta, d)
\]
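
The two document models can be compared on a single word with a few lines of arithmetic. The sketch below follows the formulas on this slide; the function names and all numeric inputs (term frequency, document length, μ, λ, and the topic probabilities) are toy assumptions.

```python
def p_lm(tf, doc_len, p_collection, mu):
    """Dirichlet-smoothed language model estimate of P(w | d)."""
    weight = doc_len / (doc_len + mu)
    p_ml = tf / doc_len if doc_len else 0.0
    return weight * p_ml + (1.0 - weight) * p_collection

def p_lda(p_w_given_z, p_z_given_d):
    """P_LDA(w | d) = sum over topics of P(w | z) * P(z | d)."""
    return sum(pw * pz for pw, pz in zip(p_w_given_z, p_z_given_d))

def p_combined(tf, doc_len, p_collection, mu, lam, p_w_given_z, p_z_given_d):
    """Linear interpolation of the smoothed LM estimate and the LDA estimate."""
    lm = p_lm(tf, doc_len, p_collection, mu)
    return lam * lm + (1.0 - lam) * p_lda(p_w_given_z, p_z_given_d)

# Toy numbers (assumed): the word occurs twice in a 100-word document.
lm_only = p_lm(tf=2, doc_len=100, p_collection=0.001, mu=100)
lda_only = p_lda([0.05, 0.001], [0.3, 0.7])
combined = p_combined(2, 100, 0.001, 100, 0.7, [0.05, 0.001], [0.3, 0.7])
```

Here the LDA estimate is higher than the smoothed estimate because the document puts weight on a topic that favors the word, so the interpolated value lies between the two.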

15 OUR MODEL - PAIRWISE REGULARIZED BAYESIAN LEARNING-TO-RANK MODEL
Our Model = Generative Learning + Discriminative Learning
We propose an optimization algorithm to conduct learning-to-rank using topic models.
We integrate the mechanism behind a basic topic model, LDA, with a maximum-margin pairwise learning-to-rank classifier.
As a result, we obtain a unified learning-to-rank model, which conducts regularized Bayesian inference.
In our model, the posterior distribution of the LDA model is regularized with a pairwise maximum margin classifier.
This allows us to directly control the posterior distribution.

16 Posterior Regularization
POSTERIOR REGULARIZATION - HIGH-LEVEL IDEA
Bayes theorem
\[
\underbrace{P(M \mid x)}_{\text{Posterior of } M \text{ given } x}
= \frac{\overbrace{P(x \mid M)}^{\text{Likelihood}} \; \overbrace{P(M)}^{\text{Prior}}}{\underbrace{\int P(x \mid M)\, P(M)\, dM}_{\text{Evidence}}}
\]

17 Bayesian Posterior Regularization
BAYESIAN INFERENCE AS AN OPTIMIZATION PROBLEM
Let $M \in \mathcal{M}$ be the model space.
The objective is to find the posterior distribution which we intend to infer.
The posterior distribution given by Bayes' theorem is the solution to the following optimization problem:
Bayes Optimization Model
\[
\begin{aligned}
\underset{P(M),\; M \in \mathcal{M}}{\text{minimize}} \quad & \underbrace{\mathrm{KL}\left[ P(M \mid x_1, x_2, \ldots, x_N) \,\|\, P(M) \right]}_{\text{Kullback-Leibler Divergence}} - \sum_{n=1}^{N} \int \log P(x_n \mid M)\, P(M)\, dM \\
\text{subject to} \quad & \underbrace{P(M) \in \mathcal{P}}_{\text{Constraints on the posterior distribution}}
\end{aligned}
\]
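
The claim that Bayes' theorem solves an optimization problem can be checked numerically on a small discrete model space. The sketch below assumes the objective is read in its usual variational form, KL(q || prior) minus the expected log-likelihood under q, minimized over all distributions q. The three-model prior and the likelihood values are toy assumptions.

```python
import math

def objective(q, prior, log_lik):
    """KL(q || prior) - E_q[log-likelihood] on a discrete model space."""
    total = 0.0
    for q_m, p_m, ll in zip(q, prior, log_lik):
        if q_m > 0:
            total += q_m * (math.log(q_m / p_m) - ll)
    return total

def bayes_posterior(prior, log_lik):
    """Posterior and evidence computed directly from Bayes' theorem."""
    joint = [p_m * math.exp(ll) for p_m, ll in zip(prior, log_lik)]
    evidence = sum(joint)
    return [j / evidence for j in joint], evidence

prior = [0.5, 0.3, 0.2]
log_lik = [math.log(0.1), math.log(0.4), math.log(0.7)]  # log P(data | M) per model
posterior, evidence = bayes_posterior(prior, log_lik)

at_posterior = objective(posterior, prior, log_lik)
at_prior = objective(prior, prior, log_lik)
at_uniform = objective([1 / 3, 1 / 3, 1 / 3], prior, log_lik)
```

At the Bayes posterior the objective equals the negative log evidence, and the other candidate distributions give strictly larger values. Posterior regularization keeps this objective and restricts or penalizes the set of admissible distributions.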

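18 Bayesian Posterior Regularization
BAYESIAN REGULARIZATION
Bayes Optimization Model
\[
\begin{aligned}
\underset{P(M),\; M \in \mathcal{M}}{\text{minimize}} \quad & \mathrm{KL}\left[ P(M \mid x_1, x_2, \ldots, x_N) \,\|\, P(M) \right] - \sum_{n=1}^{N} \int \log P(x_n \mid M)\, P(M)\, dM \\
\text{subject to} \quad & P(M) \in \mathcal{P}
\end{aligned}
\]
Bayesian Regularization
\[
\begin{aligned}
\underset{P(M),\; M \in \mathcal{M}}{\text{minimize}} \quad & \mathrm{KL}\left[ P(M \mid x_1, x_2, \ldots, x_N) \,\|\, P(M) \right] - \sum_{n=1}^{N} \int \log P(x_n \mid M)\, P(M)\, dM + \underbrace{f(\chi)}_{\text{Convex Function}} \\
\text{subject to} \quad & P(M) \in \mathcal{P}(\chi)
\end{aligned}
\]
The convex function $f(\chi)$ can encode rich information or knowledge.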

19 LDA as a Regularized Bayesian Model
LDA AS A REGULARIZED BAYESIAN MODEL
Posterior of LDA
\[
P(\Theta, Z, \Phi \mid W, \alpha, \beta) = \frac{P_0(\Theta, Z, \Phi \mid \alpha, \beta)\, P(W \mid \Theta, Z, \Phi)}{P(W \mid \alpha, \beta)}
\]
where
$P_0(\Theta, Z, \Phi \mid \alpha, \beta)$ - Prior probability.
$w_d = \{w_{dn}\}_{n=1}^{N_d}$, $W = \{w_d\}_{d=1}^{D}$
$z_d = \{z_{dn}\}_{n=1}^{N_d}$, $Z = \{z_d\}_{d=1}^{D}$
$\Theta = \{\theta_d\}_{d=1}^{D}$
$\Phi = \{\phi_1, \phi_2, \ldots, \phi_K\}$ - $V \times K$ matrix of topic distribution parameters.
α - Parameter of the Dirichlet prior on the per-document topic distributions.
β - Parameter of the Dirichlet prior on the per-topic word distributions.
[Figure: LDA plate diagram with nodes α, θ_d, z, w, φ, β and plates K, N_d, D]

20 Regularized Bayesian Learning to Rank
REGULARIZED BAYESIAN LEARNING TO RANK
Design Details
Combine ideas from pairwise maximum margin learning to rank with topic models.
The convex function can be borrowed from the maximum margin pairwise learning-to-rank model.
We choose the basic topic model, LDA.
We form an aggregated document set consisting of queries and documents.
We incorporate the latent topic information in the form of a latent topic similarity between the document and the query during learning.
\[
\begin{aligned}
\underset{P(\Theta, Z, \Phi \mid \alpha, \beta) \in \mathcal{P},\; \xi}{\text{minimize}} \quad & \underbrace{\mathrm{KL}\left[ P(\Theta, Z, \Phi \mid W, \alpha, \beta) \,\|\, P_0(\Theta, Z, \Phi \mid \alpha, \beta) \right] - \mathbb{E}_P\left[\log P(W \mid \Theta, Z, \Phi)\right]}_{\text{LDA Model}} + \underbrace{M(\xi)}_{\text{Convex Function}} \\
\text{subject to} \quad & P(\Theta, Z, \Phi \mid \alpha, \beta) \in \mathcal{P}_{new}(\xi), \quad \xi \ge 0
\end{aligned}
\]

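21 Regularized Bayesian Learning to Rank
FULL REGULARIZED BAYESIAN LEARNING TO RANK MODEL
\[
\begin{aligned}
\underset{P(\Theta, Z, \Phi \mid \alpha, \beta),\; \eta,\, \eta_t,\, \xi}{\text{minimize}} \quad
& \mathrm{KL}\left[ P(\Theta, Z, \Phi \mid W, \alpha, \beta) \,\|\, P_0(\Theta, Z, \Phi \mid \alpha, \beta) \right] - \mathbb{E}_P\left[\log P(W \mid \Theta, Z, \Phi)\right] \\
& + \underbrace{\left( \|\eta\|^2 + \|\eta_t\|^2 \right) + \frac{C}{x} \sum_{u=1}^{x} \frac{1}{|B_u|} \sum_{(i,k) \in B_u} \xi^u_{(i,k)}}_{\text{Convex Function - Pairwise Maximum Margin}} \\
\text{subject to} \quad
& \xi^u_{(i,k)} \ge (h^u_i - h^u_k) - y^u_{ik} \Big[ \eta^\top (\psi^u_i - \psi^u_k) + \underbrace{\eta_t^\top (\Upsilon^u_i - \Upsilon^u_k)}_{\text{Dynamic Topic Similarity}} \Big], \quad \forall u, i, k, \\
& P(\Theta, Z, \Phi \mid \alpha, \beta) \in \mathcal{P}\big(\xi^u_{(i,k)}\big), \\
& \xi^u_{(i,k)} \ge 0, \quad \forall u, i, k.
\end{aligned}
\]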
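
The role of the dynamic topic similarity term in the margin constraint can be illustrated with a small calculation. The sketch below assumes the constraint is read as ξ ≥ (h_i - h_k) - y [η·(ψ_i - ψ_k) + η_t·(Υ_i - Υ_k)] and that Υ is a vector of topic similarity features (here of length one). The function name and all values are hypothetical.

```python
def pair_slack(h_i, h_k, y, eta, psi_i, psi_k, eta_t, ups_i, ups_k):
    """Smallest non-negative slack satisfying the assumed margin constraint."""
    feature_score = sum(w * (a - b) for w, a, b in zip(eta, psi_i, psi_k))
    topic_score = sum(w * (a - b) for w, a, b in zip(eta_t, ups_i, ups_k))
    return max(0.0, (h_i - h_k) - y * (feature_score + topic_score))

# Toy pair (assumed): document i is more relevant (labels 2 vs 1) and also
# closer to the query in topic space (similarity 0.9 vs 0.3).
common = dict(h_i=2, h_k=1, y=1, eta=[1.0, 0.5], psi_i=[0.6, 0.2], psi_k=[0.1, 0.4],
              ups_i=[0.9], ups_k=[0.3])

slack_without_topics = pair_slack(eta_t=[0.0], **common)
slack_with_topics = pair_slack(eta_t=[2.0], **common)
```

In this toy case the standard features alone leave a margin violation of 0.6, while adding the topic similarity difference removes it. In the unified model the topic similarities themselves change as the topic posterior is updated.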

22 Regularized Bayesian Learning to Rank
MODELING ILLUSTRATION
[Figure: modeling illustration; axis labels T 5 and T 6]

23 Regularized Bayesian Learning to Rank
SOLVING THE OPTIMIZATION MODEL
Iterative Procedure
1 Monte Carlo method - Collapsed Gibbs sampling.
2 Iteratively compute:
    1 The maximum margin separation of the points, i.e. we determine $\eta$ and $\eta_t$ given $P(\Theta, Z, \Phi \mid W, \alpha, \beta)$.
    2 $P(\Theta, Z, \Phi \mid W, \alpha, \beta)$ given $(\eta, \eta_t)$.
3 Both steps are repeated for a given number of iterations or until the sampler converges to a steady state.
\[
P(Z \mid \alpha) = \underbrace{\frac{P(W, Z \mid \alpha, \beta)}{\Omega}}_{\text{Topic model}} \; \underbrace{e^{\frac{1}{2}\Psi}}_{\text{Pairwise classifier}}
\]
$\Omega$ is the normalization constant.

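24 Parameter Estimation
SAMPLER FORMULATIONS
\[
P(Z \mid \alpha) \propto
\underbrace{\prod_{z=1}^{K} \left( \frac{\Gamma\left(\sum_{s=1}^{|\beta|} \beta_s\right)}{\prod_{s=1}^{|\beta|} \Gamma(\beta_s)} \int \prod_{w=1}^{V} \phi_{zw}^{\,n_{zw} + \beta_w - 1} \, d\phi_z \right)}_{\text{Expanding using Dirichlet - Word-Topic statistics}}
\times
\underbrace{\prod_{d=1}^{D} \left( \frac{\Gamma\left(\sum_{s=1}^{|\alpha|} \alpha_s\right)}{\prod_{s=1}^{|\alpha|} \Gamma(\alpha_s)} \int \prod_{z=1}^{K} \theta_{dz}^{\,p_{dz} + \alpha_z - 1} \, d\theta_d \right)}_{\text{Expanding using Dirichlet - Document-Topic statistics}}
\times
\underbrace{e^{\frac{1}{2}\Psi}}_{\text{Regularizer}}
\]
Pairwise Maximum Margin Regularizer
The regularizer involves the following quadratic form (its left-hand side is missing from the transcription):
\[
\cdots = \sum_{u=1}^{x} \sum_{(d,k) \in B_u} \sum_{\hat{u}=1}^{x} \sum_{(\hat{d},\hat{k}) \in B_{\hat{u}}} \lambda^{u}_{dk}\, \lambda^{\hat{u}}_{\hat{d}\hat{k}} \left[ (\psi^u_d - \psi^u_k) + (\Upsilon^u_d - \Upsilon^u_k) \right]^\top \left[ (\psi^{\hat{u}}_{\hat{d}} - \psi^{\hat{u}}_{\hat{k}}) + (\Upsilon^{\hat{u}}_{\hat{d}} - \Upsilon^{\hat{u}}_{\hat{k}}) \right]
\]
\[
\Psi = \sum_{u=1}^{x} \sum_{(d,k) \in B_u} \lambda^u_{dk} \,(h^u_d - h^u_k)
\]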

25 Parameter Estimation
TRANSITION EQUATIONS
\[
P(z_{dn} = t \mid Z_{\neg n}, w = v, W_{\neg n}, \alpha, \beta) \propto
\underbrace{\frac{n_{tv} + \beta_{w_{dn}} - 1}{\left[\sum_{v=1}^{V} n_{tv} + \beta_v\right] - 1}}_{\text{Word-topic statistics}}
\;
\underbrace{(p_{dt} + \alpha_t - 1)}_{\text{Document-topic statistics}}
\;
\underbrace{e^{\frac{1}{2}\Psi}}_{\text{Regularizer}}
\]
Posterior distributions are computed as:
Building Word-Topic Matrix
\[
\phi_{zv} = \frac{n_{zv} + \beta_v}{\sum_{v=1}^{V} \left( n_{zv} + \beta_v \right)}
\]
Building Document-Topic Matrix
\[
\theta_{dz} = \frac{p_{dz} + \alpha_z}{\sum_{z=1}^{K} \left( p_{dz} + \alpha_z \right)} \, e^{\frac{1}{2}\Psi}
\]
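
The structure of the transition equation, a word-topic term times a document-topic term times a regularizer factor, can be sketched as a single sampling step. This is not the authors' sampler: it uses the standard collapsed Gibbs conditional for LDA, assumes the count tables already exclude the token being resampled (so no "-1" corrections appear), and takes the regularizer as optional per-topic multiplicative factors `reg` supplied by the caller. How such factors would be derived from Ψ is not shown, and all names are hypothetical.

```python
def topic_conditional(v, d, n_tv, p_dt, alpha, beta, reg=None):
    """Normalized P(z = t | rest) for word v in document d.

    n_tv[t][v]: topic-word counts, p_dt[d][t]: document-topic counts,
    both assumed to EXCLUDE the token currently being resampled.
    reg[t]: optional per-topic factor standing in for the regularizer term.
    """
    weights = []
    for t in range(len(n_tv)):
        word_term = (n_tv[t][v] + beta[v]) / (sum(n_tv[t]) + sum(beta))
        doc_term = p_dt[d][t] + alpha[t]
        factor = reg[t] if reg is not None else 1.0
        weights.append(word_term * doc_term * factor)
    total = sum(weights)
    return [w / total for w in weights]

# Toy counts (assumed): 2 topics, 2 vocabulary words, 1 document.
n_tv = [[3, 1], [0, 4]]
p_dt = [[2, 1]]
alpha, beta = [0.5, 0.5], [0.1, 0.1]

plain = topic_conditional(0, 0, n_tv, p_dt, alpha, beta)
regularized = topic_conditional(0, 0, n_tv, p_dt, alpha, beta, reg=[0.01, 100.0])
```

Without the factor, topic 0 dominates because it has generated word 0 before. A sufficiently strong factor can overturn that preference, which is how a margin-based term can pull topic assignments toward a ranking-friendly representation.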

26 EXPERIMENTS AND RESULTS
Datasets
We need the original documents to build a topic model.
Many standard learning-to-rank test collections do not come with the original documents, except OHSUMED.
We used the Terrier Information Retrieval platform to compute query-document features.
We experimented on standard TREC collections:
1 OHSUMED - 45 normalized features
2 AQUAINT - 39 normalized features
3 WT2G - 39 normalized features
4 ClueWeb - normalized features
Train-Validate-Test Splits
Training set - approximately 60% of query-document pairs
Testing set - approximately 20% of query-document pairs
Validation set - approximately 20% of query-document pairs

27 Comparative Models and Evaluation Metric
COMPARATIVE MODELS AND EVALUATION METRIC
Comparative Models
Listwise - ListNet, AdaRank-MAP, AdaRank-NDCG, Coordinate Ascent, LambdaMART, SVMMAP, and MART
Pairwise - RankNet, RankBoost, RankSVM-Struct, and LambdaRank
Pointwise - Linear Regression (L2 norm) and Random Forests
DirectRank - recently proposed [Tan et al., 2013].
Evaluation Metric
Normalized Discounted Cumulative Gain (NDCG):
\[
W(q_s) = \frac{1}{T_n} \sum_{i=1}^{n} \frac{2^{r(i)} - 1}{\log(1 + i)}
\]
T_n - Normalization constant.
r(i) - Rank label.
n - Length of ranked list.
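
The NDCG formula can be computed directly. The sketch below assumes that the normalization constant T_n is the discounted cumulative gain of the ideal (label-sorted) ordering, which is the usual convention, and uses the natural logarithm (the base cancels in the ratio). The function names are hypothetical.

```python
import math

def dcg(labels, n):
    """Sum of (2^r(i) - 1) / log(1 + i) over the top-n positions i = 1..n."""
    return sum(
        (2 ** r - 1) / math.log(1 + i)
        for i, r in enumerate(labels[:n], start=1)
    )

def ndcg(labels, n):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(labels, reverse=True), n)
    return dcg(labels, n) / ideal if ideal > 0 else 0.0

# labels are the relevance grades of the documents in ranked order.
perfect = ndcg([2, 1, 0], n=3)
reversed_order = ndcg([0, 1, 2], n=3)
no_relevant = ndcg([0, 0], n=2)
```

A ranking that places the most relevant documents first scores 1, and any other ordering of the same documents scores lower.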

28 NDCG Results
RESULTS - NDCG@10
[Table: NDCG@10 results WITHOUT TOPICS on AQUAINT, WT2G, ClueWeb09, and OHSUMED for the models ListNet, AdaRankNDCG, AdaRankMAP, SVMMAP, Coordinate Ascent, LambdaMART, LambdaRank, Linear Regression, RankSVM, RankBoost, DirectRank, MART, RankNet, Random Forests, Semantic Search, and Our Model; the numeric values were not preserved in the transcription.]

29 NDCG Results
RESULTS - NDCG@10
[Table: NDCG@10 results WITH TOPICS on AQUAINT, WT2G, ClueWeb09, and OHSUMED for the models ListNet, AdaRankNDCG, AdaRankMAP, SVMMAP, Coordinate Ascent, LambdaMART, LambdaRank, Linear Regression, RankSVM, RankBoost, DirectRank, MART, RankNet, Random Forests, Semantic Search, and Our Model; the numeric values were not preserved in the transcription.]

30 NDCG Results
QUERY-LEVEL PERFORMANCE
[Table: query-level performance in terms of the percentage of winning numbers for different query lengths (including a ">3" column) on AQUAINT, WT2G, ClueWeb, and LETOR OHSUMED; the numeric values were not preserved in the transcription.]
Winning Number is defined as follows:
\[
W_i = \sum_{j=1}^{n} \sum_{k=1}^{m} I\big(\mathrm{NDCG}_i(j) > \mathrm{NDCG}_k(j)\big)
\]
Notations
n - Number of datasets.
k - Index of the compared models.
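
The winning number counts, for one model, how many (dataset, competitor) combinations it beats. The sketch below implements the definition above on a made-up score table. The values are illustrative only and are not results from the experiments.

```python
def winning_number(scores, i):
    """W_i: number of (dataset j, model k) pairs where model i has a higher NDCG.

    scores[k][j] is the NDCG of model k on dataset j.
    """
    return sum(
        1
        for j in range(len(scores[i]))
        for k in range(len(scores))
        if scores[i][j] > scores[k][j]
    )

# Made-up NDCG values: 3 models (rows) on 3 datasets (columns).
scores = [
    [0.40, 0.35, 0.50],  # model 0
    [0.42, 0.36, 0.55],  # model 1
    [0.38, 0.33, 0.60],  # model 2
]
wins = [winning_number(scores, i) for i in range(len(scores))]
```

A model never beats itself under the strict inequality, so with m models and n datasets the maximum attainable winning number is n(m - 1).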

31 CONCLUSIONS
A novel LTR model that combines latent topic information with maximum margin learning in a unified way.
The latent topic representation is directly regularized with a pairwise maximum margin constraint.
Topic similarity is computed in the regularized latent topic space.
As our experiments show, such an integrated approach outperforms the existing two-stage decoupled approach to incorporating latent topics in LTR models.

32 REFERENCES
[Blei et al., 2003] Blei, D. M., Ng, A. Y., and Jordan, M. I. (2003). Latent Dirichlet allocation. JMLR, 3.
[Mcauliffe and Blei, 2008] Mcauliffe, J. D. and Blei, D. M. (2008). Supervised topic models. In NIPS.
[Tan et al., 2013] Tan, M., Xia, T., Guo, L., and Wang, S. (2013). Direct optimization of ranking measures for learning to rank models. In KDD.
[Wei and Croft, 2006] Wei, X. and Croft, W. B. (2006). LDA-based document models for ad-hoc retrieval. In SIGIR.
[Zhu et al., 2012] Zhu, J., Ahmed, A., and Xing, E. P. (2012). MedLDA: Maximum margin supervised topic models. JMLR, 13(1).


More information

Topic Models. Charles Elkan November 20, 2008

Topic Models. Charles Elkan November 20, 2008 Topic Models Charles Elan elan@cs.ucsd.edu November 20, 2008 Suppose that we have a collection of documents, and we want to find an organization for these, i.e. we want to do unsupervised learning. One

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang

Matrix Factorization & Latent Semantic Analysis Review. Yize Li, Lanbo Zhang Matrix Factorization & Latent Semantic Analysis Review Yize Li, Lanbo Zhang Overview SVD in Latent Semantic Indexing Non-negative Matrix Factorization Probabilistic Latent Semantic Indexing Vector Space

More information

Machine Learning Summer School

Machine Learning Summer School Machine Learning Summer School Lecture 3: Learning parameters and structure Zoubin Ghahramani zoubin@eng.cam.ac.uk http://learning.eng.cam.ac.uk/zoubin/ Department of Engineering University of Cambridge,

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Yuriy Sverchkov Intelligent Systems Program University of Pittsburgh October 6, 2011 Outline Latent Semantic Analysis (LSA) A quick review Probabilistic LSA (plsa)

More information

Hidden Markov Models

Hidden Markov Models 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Hidden Markov Models Matt Gormley Lecture 22 April 2, 2018 1 Reminders Homework

More information

LDA with Amortized Inference

LDA with Amortized Inference LDA with Amortied Inference Nanbo Sun Abstract This report describes how to frame Latent Dirichlet Allocation LDA as a Variational Auto- Encoder VAE and use the Amortied Variational Inference AVI to optimie

More information

Modeling User Rating Profiles For Collaborative Filtering

Modeling User Rating Profiles For Collaborative Filtering Modeling User Rating Profiles For Collaborative Filtering Benjamin Marlin Department of Computer Science University of Toronto Toronto, ON, M5S 3H5, CANADA marlin@cs.toronto.edu Abstract In this paper

More information

Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details THIS IS AN EARLY DRAFT. YOUR FEEDBACKS ARE HIGHLY APPRECIATED.

Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details THIS IS AN EARLY DRAFT. YOUR FEEDBACKS ARE HIGHLY APPRECIATED. Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details THIS IS AN EARLY DRAFT. YOUR FEEDBACKS ARE HIGHLY APPRECIATED. Yi Wang yi.wang.2005@gmail.com August 2008 Contents Preface 2 2 Latent

More information

Discriminative Training of Mixed Membership Models

Discriminative Training of Mixed Membership Models 18 Discriminative Training of Mixed Membership Models Jun Zhu Department of Computer Science and Technology, State Key Laboratory of Intelligent Technology and Systems; Tsinghua National Laboratory for

More information

Online Bayesian Passive-Aggressive Learning

Online Bayesian Passive-Aggressive Learning Online Bayesian Passive-Aggressive Learning Full Journal Version: http://qr.net/b1rd Tianlin Shi Jun Zhu ICML 2014 T. Shi, J. Zhu (Tsinghua) BayesPA ICML 2014 1 / 35 Outline Introduction Motivation Framework

More information

Learning with Noisy Labels. Kate Niehaus Reading group 11-Feb-2014

Learning with Noisy Labels. Kate Niehaus Reading group 11-Feb-2014 Learning with Noisy Labels Kate Niehaus Reading group 11-Feb-2014 Outline Motivations Generative model approach: Lawrence, N. & Scho lkopf, B. Estimating a Kernel Fisher Discriminant in the Presence of

More information

Non-Parametric Bayes

Non-Parametric Bayes Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian

More information

Part 1: Expectation Propagation

Part 1: Expectation Propagation Chalmers Machine Learning Summer School Approximate message passing and biomedicine Part 1: Expectation Propagation Tom Heskes Machine Learning Group, Institute for Computing and Information Sciences Radboud

More information

Tsuyoshi; Shibata, Yuichiro; Oguri, management - CIKM '09, pp ;

Tsuyoshi; Shibata, Yuichiro; Oguri, management - CIKM '09, pp ; NAOSITE: 's Ac Title Author(s) Citation Dynamic hyperparameter optimization Masada, Tomonari; Fukagawa, Daiji; Tsuyoshi; Shibata, Yuichiro; Oguri, Proceeding of the 18th ACM conferen management - CIKM

More information

ECE 5984: Introduction to Machine Learning

ECE 5984: Introduction to Machine Learning ECE 5984: Introduction to Machine Learning Topics: (Finish) Expectation Maximization Principal Component Analysis (PCA) Readings: Barber 15.1-15.4 Dhruv Batra Virginia Tech Administrativia Poster Presentation:

More information

Latent Dirichlet Allocation

Latent Dirichlet Allocation Outlines Advanced Artificial Intelligence October 1, 2009 Outlines Part I: Theoretical Background Part II: Application and Results 1 Motive Previous Research Exchangeability 2 Notation and Terminology

More information

Fast Inference and Learning for Modeling Documents with a Deep Boltzmann Machine

Fast Inference and Learning for Modeling Documents with a Deep Boltzmann Machine Fast Inference and Learning for Modeling Documents with a Deep Boltzmann Machine Nitish Srivastava nitish@cs.toronto.edu Ruslan Salahutdinov rsalahu@cs.toronto.edu Geoffrey Hinton hinton@cs.toronto.edu

More information

Fast Supervised LDA for Discovering Micro-Events in Large-Scale Video Datasets

Fast Supervised LDA for Discovering Micro-Events in Large-Scale Video Datasets Fast Supervised LDA for Discovering Micro-Events in Large-Scale Video Datasets Angelos Katharopoulos, Despoina Paschalidou, Christos Diou, Anastasios Delopoulos Multimedia Understanding Group ECE Department,

More information

Understanding Comments Submitted to FCC on Net Neutrality. Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014

Understanding Comments Submitted to FCC on Net Neutrality. Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014 Understanding Comments Submitted to FCC on Net Neutrality Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014 Abstract We aim to understand and summarize themes in the 1.65 million

More information

Introduction To Machine Learning

Introduction To Machine Learning Introduction To Machine Learning David Sontag New York University Lecture 21, April 14, 2016 David Sontag (NYU) Introduction To Machine Learning Lecture 21, April 14, 2016 1 / 14 Expectation maximization

More information

Undirected Graphical Models

Undirected Graphical Models Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Properties Properties 3 Generative vs. Conditional

More information

Applying hlda to Practical Topic Modeling

Applying hlda to Practical Topic Modeling Joseph Heng lengerfulluse@gmail.com CIST Lab of BUPT March 17, 2013 Outline 1 HLDA Discussion 2 the nested CRP GEM Distribution Dirichlet Distribution Posterior Inference Outline 1 HLDA Discussion 2 the

More information

Latent Dirichlet Bayesian Co-Clustering

Latent Dirichlet Bayesian Co-Clustering Latent Dirichlet Bayesian Co-Clustering Pu Wang 1, Carlotta Domeniconi 1, and athryn Blackmond Laskey 1 Department of Computer Science Department of Systems Engineering and Operations Research George Mason

More information

Term Filtering with Bounded Error

Term Filtering with Bounded Error Term Filtering with Bounded Error Zi Yang, Wei Li, Jie Tang, and Juanzi Li Knowledge Engineering Group Department of Computer Science and Technology Tsinghua University, China {yangzi, tangjie, ljz}@keg.cs.tsinghua.edu.cn

More information

Applying LDA topic model to a corpus of Italian Supreme Court decisions

Applying LDA topic model to a corpus of Italian Supreme Court decisions Applying LDA topic model to a corpus of Italian Supreme Court decisions Paolo Fantini Statistical Service of the Ministry of Justice - Italy CESS Conference - Rome - November 25, 2014 Our goal finding

More information

Machine Learning using Bayesian Approaches

Machine Learning using Bayesian Approaches Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Logistic Regression Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574

More information

Latent Variable Models Probabilistic Models in the Study of Language Day 4

Latent Variable Models Probabilistic Models in the Study of Language Day 4 Latent Variable Models Probabilistic Models in the Study of Language Day 4 Roger Levy UC San Diego Department of Linguistics Preamble: plate notation for graphical models Here is the kind of hierarchical

More information

Deep Poisson Factorization Machines: a factor analysis model for mapping behaviors in journalist ecosystem

Deep Poisson Factorization Machines: a factor analysis model for mapping behaviors in journalist ecosystem 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

Latent Dirichlet Allocation Based Multi-Document Summarization

Latent Dirichlet Allocation Based Multi-Document Summarization Latent Dirichlet Allocation Based Multi-Document Summarization Rachit Arora Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai - 600 036, India. rachitar@cse.iitm.ernet.in

More information

Inference Methods for Latent Dirichlet Allocation

Inference Methods for Latent Dirichlet Allocation Inference Methods for Latent Dirichlet Allocation Chase Geigle University of Illinois at Urbana-Champaign Department of Computer Science geigle1@illinois.edu October 15, 2016 Abstract Latent Dirichlet

More information

Approximate Inference using MCMC

Approximate Inference using MCMC Approximate Inference using MCMC 9.520 Class 22 Ruslan Salakhutdinov BCS and CSAIL, MIT 1 Plan 1. Introduction/Notation. 2. Examples of successful Bayesian models. 3. Basic Sampling Algorithms. 4. Markov

More information

Measuring Topic Quality in Latent Dirichlet Allocation

Measuring Topic Quality in Latent Dirichlet Allocation Measuring Topic Quality in Sergei Koltsov Olessia Koltsova Steklov Institute of Mathematics at St. Petersburg Laboratory for Internet Studies, National Research University Higher School of Economics, St.

More information

Exploiting Geographic Dependencies for Real Estate Appraisal

Exploiting Geographic Dependencies for Real Estate Appraisal Exploiting Geographic Dependencies for Real Estate Appraisal Yanjie Fu Joint work with Hui Xiong, Yu Zheng, Yong Ge, Zhihua Zhou, Zijun Yao Rutgers, the State University of New Jersey Microsoft Research

More information

Variational Autoencoders

Variational Autoencoders Variational Autoencoders Recap: Story so far A classification MLP actually comprises two components A feature extraction network that converts the inputs into linearly separable features Or nearly linearly

More information

Latent Dirichlet Alloca/on

Latent Dirichlet Alloca/on Latent Dirichlet Alloca/on Blei, Ng and Jordan ( 2002 ) Presented by Deepak Santhanam What is Latent Dirichlet Alloca/on? Genera/ve Model for collec/ons of discrete data Data generated by parameters which

More information

Large margin optimization of ranking measures

Large margin optimization of ranking measures Large margin optimization of ranking measures Olivier Chapelle Yahoo! Research, Santa Clara chap@yahoo-inc.com Quoc Le NICTA, Canberra quoc.le@nicta.com.au Alex Smola NICTA, Canberra alex.smola@nicta.com.au

More information

Lecture 6: Graphical Models: Learning

Lecture 6: Graphical Models: Learning Lecture 6: Graphical Models: Learning 4F13: Machine Learning Zoubin Ghahramani and Carl Edward Rasmussen Department of Engineering, University of Cambridge February 3rd, 2010 Ghahramani & Rasmussen (CUED)

More information

Generative v. Discriminative classifiers Intuition

Generative v. Discriminative classifiers Intuition Logistic Regression Machine Learning 070/578 Carlos Guestrin Carnegie Mellon University September 24 th, 2007 Generative v. Discriminative classifiers Intuition Want to Learn: h:x a Y X features Y target

More information

Topic Models and Applications to Short Documents

Topic Models and Applications to Short Documents Topic Models and Applications to Short Documents Dieu-Thu Le Email: dieuthu.le@unitn.it Trento University April 6, 2011 1 / 43 Outline Introduction Latent Dirichlet Allocation Gibbs Sampling Short Text

More information

Replicated Softmax: an Undirected Topic Model. Stephen Turner

Replicated Softmax: an Undirected Topic Model. Stephen Turner Replicated Softmax: an Undirected Topic Model Stephen Turner 1. Introduction 2. Replicated Softmax: A Generative Model of Word Counts 3. Evaluating Replicated Softmax as a Generative Model 4. Experimental

More information

Pachinko Allocation: DAG-Structured Mixture Models of Topic Correlations

Pachinko Allocation: DAG-Structured Mixture Models of Topic Correlations : DAG-Structured Mixture Models of Topic Correlations Wei Li and Andrew McCallum University of Massachusetts, Dept. of Computer Science {weili,mccallum}@cs.umass.edu Abstract Latent Dirichlet allocation

More information

arxiv: v1 [stat.ml] 25 Dec 2015

arxiv: v1 [stat.ml] 25 Dec 2015 Histogram Meets Topic Model: Density Estimation by Mixture of Histograms arxiv:5.796v [stat.ml] 5 Dec 5 Hideaki Kim NTT Corporation, Japan kin.hideaki@lab.ntt.co.jp Abstract Hiroshi Sawada NTT Corporation,

More information

Lecture 8: Graphical models for Text

Lecture 8: Graphical models for Text Lecture 8: Graphical models for Text 4F13: Machine Learning Joaquin Quiñonero-Candela and Carl Edward Rasmussen Department of Engineering University of Cambridge http://mlg.eng.cam.ac.uk/teaching/4f13/

More information

Topic Models. Advanced Machine Learning for NLP Jordan Boyd-Graber OVERVIEW. Advanced Machine Learning for NLP Boyd-Graber Topic Models 1 of 1

Topic Models. Advanced Machine Learning for NLP Jordan Boyd-Graber OVERVIEW. Advanced Machine Learning for NLP Boyd-Graber Topic Models 1 of 1 Topic Models Advanced Machine Learning for NLP Jordan Boyd-Graber OVERVIEW Advanced Machine Learning for NLP Boyd-Graber Topic Models 1 of 1 Low-Dimensional Space for Documents Last time: embedding space

More information