SUPERVISED MULTI-MODAL TOPIC MODEL FOR IMAGE ANNOTATION
Thu Hoai Tran (2) and Seungjin Choi (1,2)
(1) Department of Computer Science and Engineering, POSTECH, Korea
(2) Division of IT Convergence Engineering, POSTECH, Korea
thtlamson@gmail.com, seungjin@postech.ac.kr

ABSTRACT

Multi-modal topic models are probabilistic generative models where hidden topics are learned from data of different types. In this paper we present supervised multi-modal latent Dirichlet allocation (smmLDA), where we incorporate the class label (a global description) into the joint modeling of visual words and caption words (local descriptions) for the image annotation task. We derive a variational inference algorithm to approximately compute posterior distributions over latent variables. Experiments on a subset of the LabelMe dataset demonstrate the useful behavior of our model compared to existing topic models.

Index Terms: Image annotation, latent Dirichlet allocation, topic models

1. INTRODUCTION

Latent Dirichlet allocation (LDA) is a widely-used topic model which was originally developed to model text corpora [5]. It is a hierarchical Bayesian model in which each observed item is modeled as a finite mixture over an underlying set of topics, and each topic is characterized by a distribution over words. The basic idea of LDA, when it is applied to model a set of images by treating an image as a collection of visual words, is shown in Fig. 1. The same intuition in the case of documents can be found in [2].

Multi-modal extensions of LDA, referred to as multi-modal topic models, have been proposed to jointly model data of different types. These models were mainly applied to image annotation, where the goal is to assign a set of keywords to an image, learning underlying topics from a set of image-annotation pairs. Earlier work in this direction is correspondence LDA (cLDA) [3], which finds conditional relationships between latent variable representations of visual words and caption words. The conditional distribution of the annotation given visual descriptors is modeled for automatic image annotation. Topic
regression multi-modal LDA (tr-mmLDA) [10] is an alternative method for capturing the statistical association between image and text. Unlike cLDA, tr-mmLDA learns two separate sets of hidden topics and counts on a regression module to allow a set of caption topics to be linearly predicted from the set of image topics. It was motivated by the regression-based latent factor model [1], which was further elaborated in the hierarchical Bayesian framework [9]. It was shown in [10] that tr-mmLDA is more flexible than cLDA in the sense that the former allows the number of image topics to be different from the number of caption topics.

A class label is a global description of an image, while annotated keywords are local descriptions of image patches. Class label and annotations are related to each other. For instance, an image labeled as a highway scene is more likely to be annotated with cars and road rather than apple and desk.

Fig. 1. A codeword is assigned to each image patch to represent an image as a collection of visual words. We assume that some number of topics, which are distributions over words, exist for the set of images. An illustration of how an image is generated by an LDA model is shown here. We first choose a distribution over topics (the histogram at right). Then, for each visual word, we choose a topic assignment (the circles with patterns filled in) and choose the visual word from the corresponding topic.

In this paper we present supervised multi-modal latent Dirichlet allocation (smmLDA), where we incorporate the class label into tr-mmLDA so that two sets of hidden topics, which are related via linear regression, are learned from data of two types as well as from the class label. Several extensions of LDA to incorporate supervision have been developed in the literature. Most of these existing methods are limited to learning from data of a single type. The model tr-mmLDA outperforms most previous methods in the task of image annotation but is an unsupervised method. Our model smmLDA extends the previous state of the art in this domain, tr-mmLDA, by incorporating
the supervision of the class label.

2. LATENT DIRICHLET ALLOCATION

We briefly give an overview of LDA [5]. LDA is a generative probabilistic model of a corpus in which documents are represented
as random mixtures over latent topics, where each topic is described by a distribution over words. Each document w_{d,1:N} is a sequence of N words, for d = 1, ..., D (D is the size of the corpus), and each word w_dn ∈ R^V (V is the size of the vocabulary) is a unit vector that has a single entry equal to one and all other entries equal to zero. For instance, if w_dn is the v-th word in the vocabulary, then w_{dn,v} = 1 and w_{dn,j} = 0 for j ≠ v. The graphical model for LDA is shown in Fig. 2, where each document w_{d,1:N} is assumed to be generated as follows:

- Draw a vector of topic proportions θ_d ∈ R^K: θ_d ~ Dir(α_1, ..., α_K).
- For each word n:
  - Draw a topic assignment z_dn ∈ R^K from a multinomial distribution: z_dn | θ_d ~ Mult(θ_d).
  - Draw a word w_dn ∈ R^V: w_dn | z_dn, φ_{1:K} ~ p(w_dn | z_dn, φ_{1:K}).

Given the parameters α and φ_{1:K}, the joint distribution of hidden and observed variables is given by

  p(θ_d, z_{d,1:N}, w_{d,1:N} | α, φ_{1:K}) = p(θ_d | α) ∏_{n=1}^N p(z_dn | θ_d) p(w_dn | z_dn, φ_{1:K}).

Integrating over θ_d and summing over z_{d,1:N}, the marginal distribution of a document is given by

  p(w_{d,1:N} | α, φ_{1:K}) = ∫ p(θ_d | α) ∏_{n=1}^N Σ_{z_dn} p(z_dn | θ_d) p(w_dn | z_dn, φ_{1:K}) dθ_d.

Taking the product of the marginal probabilities of single documents, the probability of a corpus (the marginal likelihood) is given by

  p(w_{1:D,1:N} | α, φ_{1:K}) = ∏_{d=1}^D p(w_{d,1:N} | α, φ_{1:K}).   (1)

Variational inference allows us to calculate approximate posterior distributions over the hidden variables {θ_d, z_dn} by maximizing the variational lower bound on the log marginal likelihood.

Fig. 2. Graphical model for LDA.

3. SUPERVISED MULTI-MODAL LDA

In this section we present the main result, discriminative multi-modal LDA (smmLDA), where we incorporate the class label into the joint modeling of visual words {r_dn} and caption words {w_dm}, whose latent variable representations are related via linear regression. The graphical model for smmLDA is shown in Fig. 3.

Fig. 3. Graphical model for discriminative multi-modal LDA (smmLDA).

3.1. Model

The generative process for each visual word {r_dn} and caption word {w_dm} is as follows.

- Choose a category label c_d ∈ R^C:
  c_d ~ Mult(η) = ∏_{j=1}^C η_j^{c_dj},

  where c_d is a C-dimensional unit vector. If c_d is the class label j, then c_dj = 1 and c_di = 0 for i ≠ j.

- Draw a vector of image topic proportions θ_d ∈ R^K:

  θ_d ~ ∏_{j=1}^C Dir(θ_d | α_j)^{c_dj}.

- For each visual word r_dn, draw an image topic assignment z_dn ∈ R^K:

  z_dn ~ Mult(θ_d) = ∏_{k=1}^K θ_dk^{z_dnk}.
- Draw a visual word:

  r_dn | z_dn, c_d ~ Mult(Ω^r) = ∏_{i=1}^C ∏_{k=1}^K ∏_{j=1}^{V_r} (Ω^r_{ikj})^{c_di z_dnk r_dnj},

  where V_r is the size of the visual-word vocabulary.

- Given the empirical image topic frequency z̄_d = (1/N) Σ_{n=1}^N z_dn, sample a real-valued topic proportion variable for the caption text:

  x_d | z̄_d, A, µ, Λ ~ N(x_d | A z̄_d + µ, Λ^{-1}).

- Compute the caption topic proportions:

  v_dl = exp(x_dl) / Σ_{l'=1}^L exp(x_dl').

- For each caption word w_dm:
  - Draw a caption topic assignment:

    y_dm ~ Mult(v_d) = ∏_{l=1}^L v_dl^{y_dml}.

  - Draw a caption word:

    w_dm | y_dm, c_d ~ Mult(Ω^w) = ∏_{i=1}^C ∏_{l=1}^L ∏_{j=1}^{V_w} (Ω^w_{ilj})^{c_di y_dml w_dmj},

    where V_w is the size of the caption-word vocabulary.

We define the sets of variables R = {r_dn}, W = {w_dm}, Z = {z_dn}, Y = {y_dm}, C = {c_d}, Θ = {θ_d}, and X = {x_d}. Then the joint distribution over these variables obeys the following factorization:

  p(R, W, C, Θ, X, Z, Y) = p(C | η) p(Θ | C, α) p(Z | Θ) p(R | Z, Ω^r, C) p(X | Z, A, µ, Λ) p(Y | X) p(W | Y, C),   (2)

where

  p(C | η) = ∏_{d=1}^D Mult(c_d | η),
  p(Θ | C, α) = ∏_{d=1}^D ∏_{j=1}^C Dir(θ_d | α_j)^{c_dj},
  p(Z | Θ) = ∏_{d=1}^D ∏_{n=1}^N ∏_{k=1}^K θ_dk^{z_dnk},
  p(R | Z, Ω^r, C) = ∏_{d=1}^D ∏_{n=1}^N ∏_{i=1}^C ∏_{k=1}^K ∏_{j=1}^{V_r} (Ω^r_{ikj})^{c_di z_dnk r_dnj},
  p(X | Z, A, µ, Λ) = ∏_{d=1}^D N(x_d | A z̄_d + µ, Λ^{-1}),
  p(Y | X) = ∏_{d=1}^D ∏_{m=1}^M p(y_dm | x_d) = ∏_{d=1}^D ∏_{m=1}^M ∏_{l=1}^L v_dl^{y_dml},
  p(W | Y, C) = ∏_{d=1}^D ∏_{m=1}^M ∏_{i=1}^C ∏_{l=1}^L ∏_{j=1}^{V_w} (Ω^w_{ilj})^{c_di y_dml w_dmj}.

3.2. Variational Inference

The log marginal likelihood is given by

  log p(R, W | C) = log ∫∫ Σ_{Z,Y} p(R, W, C, Θ, X, Z, Y) dΘ dX
                  ≥ ∫∫ Σ_{Z,Y} q(Θ, X, Z, Y) log [ p(R, W, C, Θ, X, Z, Y) / q(Θ, X, Z, Y) ] dΘ dX ≡ F(q),   (3)

where q(Θ, X, Z, Y) denotes the variational distribution and Jensen's inequality is used to reach the variational lower bound F(q). We assume that the variational distribution factorizes as

  q(Θ, X, Z, Y) = q(Θ) q(X) q(Z) q(Y),   (4)

where each distribution is assumed to be of the form in Table 1. The variational parameters {α_d}, {x̃_d, Γ_d^{-1}}, {τ_dn}, and {ρ_dml} are determined by maximizing the variational lower bound

  F(q) = E_q[ log p(C | η) + log p(Θ | C, α) + log p(Z | Θ) + log p(R | Z, Ω^r, C) + log p(X | Z, A, µ, Λ) + log p(Y | X) + log p(W | Y, C) ]
       − E_q[ log q(Θ) + log q(X) + log q(Z) + log q(Y) ],

where E_q denotes the statistical expectation with respect to the variational distribution q(·). Detailed derivations for variational inference are omitted here due to the space limitation; in fact, the derivations can be done in a similar manner to [10].
Table 1. Variational posterior distributions and updating equations for the variational parameters.

  q(Θ) = ∏_d Dir(θ_d | α_d),  with  α_dk = Σ_{i=1}^C c_di α_ik + Σ_{n=1}^N τ_dnk
  q(Z) = ∏_d ∏_n ∏_{k=1}^K τ_dnk^{z_dnk}
  q(Y) = ∏_d ∏_{m=1}^M ∏_{l=1}^L ρ_dml^{y_dml}
  q(X) = ∏_d N(x_d | x̃_d, Γ_d^{-1})

  log τ_dnk ∝ ψ(α_dk) − ψ(α_d1 + ··· + α_dK) + Σ_{i=1}^C Σ_{j=1}^{V_r} c_di r_dnj log Ω^r_{ikj}
            + (1/N) [A^T Λ (x̃_d − µ)]_k − (1/(2N²)) [diag(A^T Λ A) + 2 A^T Λ A Σ_{m≠n} τ_dm]_k

  log ρ_dml ∝ x̃_dl + Σ_{i=1}^C Σ_{j=1}^{V_w} c_di w_dmj log Ω^w_{ilj}

  ξ_d = Σ_{l=1}^L exp(x̃_dl + γ_dl / 2);  x̃_d and γ_dl are determined by the Newton-Raphson method.

Since E_q[log v_dl] = E_q[ x_dl − log Σ_{l'=1}^L exp(x_dl') ] cannot be computed in closed form, the bound F(q) is not directly maximized. Instead, as in [10], its convex lower bound is maximized:

  E_q[log v_dl] ≥ E_q[x_dl] − (1/ξ_d) Σ_{l'=1}^L exp(x̃_dl' + γ_dl'/2) − log ξ_d + 1,

where γ_dl denotes the l-th diagonal entry of Γ_d^{-1}.

3.3. Parameter Estimation

Coordinate ascent algorithms for updating the variational parameters are summarized in Table 1. The regression parameters {A, µ, Λ} are updated as

  A = [ Σ_d (x̃_d − µ) τ̄_d^T ] [ Σ_d (1/N²) ( Σ_n diag(τ_dn) + Σ_n Σ_{m≠n} τ_dn τ_dm^T ) ]^{-1},
  µ = (1/D) Σ_d ( x̃_d − A τ̄_d ),
  Λ^{-1} = (1/D) Σ_d [ (x̃_d − µ)(x̃_d − µ)^T + Γ_d^{-1} − A τ̄_d (x̃_d − µ)^T ],

where τ̄_d = (1/N) Σ_{n=1}^N τ_dn. The multinomial parameters {Ω^r, Ω^w} are updated as

  Ω^r_{ikj} = ( Σ_d Σ_n c_di τ_dnk r_dnj ) / ( Σ_{j'=1}^{V_r} Σ_d Σ_n c_di τ_dnk r_dnj' ),
  Ω^w_{ilj} = ( Σ_d Σ_m c_di ρ_dml w_dmj ) / ( Σ_{j'=1}^{V_w} Σ_d Σ_m c_di ρ_dml w_dmj' ).

The Dirichlet parameters {α_c} are updated using the Newton-Raphson method, as in LDA [5]. Given a test image r_{1:N}, the class label and annotations are determined by choosing the most probable ones among the conditional probabilities p(c_d | r_{1:N}) and p(w_dm | r_{1:N}).

4. EXPERIMENTS

We use the 8-category subset of the LabelMe dataset [12] to perform image annotation experiments. The categories are coast, forest, highway, inside city, mountain, open country, street, and tall building. This subset has 2686 images with complete annotations. We use 128-dimensional SIFT descriptors [8] computed on image patches, where each image patch is obtained by sliding a window with a 20-pixel interval. Then we run k-means clustering on the 128-dimensional descriptors to learn a 256-word visual codebook. For the annotation words, we remove the words appearing less than 3 times in the whole data. Finally, we have a complete set of triples (visual words, caption words, class label). The whole data is separated into a training set of size 2000 and a test set of size 686. We evaluate the performance in terms of caption perplexity, defined as

  Perplexity = exp{ − Σ_d Σ_{m=1}^{M_d} log p(w_dm | r_{d,1:N}) / Σ_d M_d },

where p(w_dm | r_{d,1:N}) is the conditional probability of a caption word given an image r_{d,1:N}, and M_d is the number of caption words in document d. A higher conditional likelihood leads to a lower perplexity. The performance comparison to the previous state of the art, tr-mmLDA, is summarized in Table 2, where our smmLDA outperforms tr-mmLDA.

Table 2. Perplexity comparison between tr-mmLDA and smmLDA (our method) for K = 25 and K = 30 topics.

5. CONCLUSIONS

In this paper we have presented a multi-modal extension of LDA with supervision, leading to smmLDA. We have developed variational inference algorithms to approximately compute posterior distributions over the variables of interest in smmLDA. Applications to image annotation demonstrated the high performance of smmLDA compared to the previous state of the art.

Acknowledgments: This work was supported by the National Research Foundation (NRF) of Korea (NRF-2013R1A2A2A) and the POSTECH Rising Star Program.
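For readers reimplementing the evaluation, the caption perplexity defined in Section 4 reduces to a few lines of NumPy. This is an illustrative sketch only: the probability arrays below are made up, whereas a real evaluation would plug in p(w_dm | r_{d,1:N}) computed from a trained model.

```python
import numpy as np

def caption_perplexity(cond_probs):
    """cond_probs: one array per test image, holding the model's
    conditional probabilities p(w_dm | r_d) for that image's caption words."""
    log_lik = sum(np.log(p).sum() for p in cond_probs)  # sum_d sum_m log p(w_dm | r_d)
    n_words = sum(len(p) for p in cond_probs)           # total caption words, sum_d M_d
    return float(np.exp(-log_lik / n_words))

# Toy check: a uniform model over a 2-word vocabulary has perplexity 2,
# and a higher conditional likelihood gives a lower perplexity.
uniform = [np.array([0.5, 0.5])]
sharp = [np.array([0.9, 0.8])]
print(caption_perplexity(uniform))  # ~2.0
print(caption_perplexity(sharp))
```

Lower is better: a perplexity equal to the caption vocabulary size corresponds to a model that is no better than uniform guessing.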
6. REFERENCES

[1] D. Agarwal and B.-C. Chen, "Regression-based latent factor models," in Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), Paris, France, 2009.
[2] D. M. Blei, "Probabilistic topic models," Communications of the ACM, vol. 55, no. 4, pp. 77-84, 2012.
[3] D. M. Blei and M. I. Jordan, "Modeling annotated data," in Proceedings of the ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), Toronto, Canada, 2003.
[4] D. M. Blei and J. McAuliffe, "Supervised topic models," in Advances in Neural Information Processing Systems (NIPS), vol. 20, MIT Press, 2008.
[5] D. M. Blei, A. Y. Ng, and M. I. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993-1022, 2003.
[6] L. Fei-Fei and P. Perona, "A Bayesian hierarchical model for learning natural scene categories," in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), San Diego, CA, 2005.
[7] S. Lacoste-Julien, F. Sha, and M. I. Jordan, "DiscLDA: Discriminative learning for dimensionality reduction and classification," in Advances in Neural Information Processing Systems (NIPS), vol. 21, 2008.
[8] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[9] S. Park, Y.-D. Kim, and S. Choi, "Hierarchical Bayesian matrix factorization with side information," in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), Beijing, China, 2013.
[10] D. Putthividhya, H. T. Attias, and S. S. Nagarajan, "Topic regression multi-modal latent Dirichlet allocation for image annotation," in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 2010.
[11] D. Ramage, D. Hall, R. Nallapati, and C. D. Manning, "Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora," in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), Singapore, 2009.
[12] B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman, "LabelMe: A database and web-based tool for image annotation," International Journal of Computer Vision, vol. 77, pp. 157-173, 2008.
[13] C. Wang, D. Blei, and L. Fei-Fei, "Simultaneous image classification and annotation," in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, USA, 2009.
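As a closing illustration, the generative process of Section 3.1 can be sketched in NumPy. Everything below is a toy instantiation: the dimensions, the random seed, and the randomly drawn parameters (η, α, Ω^r, Ω^w, A, µ, and the isotropic noise standing in for Λ^{-1}) are arbitrary placeholders, not values estimated in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

C, K, L = 3, 5, 4        # classes, image topics, caption topics
V_r, V_w = 30, 20        # visual-word and caption-word vocabulary sizes
N, M = 50, 8             # visual words and caption words per image

eta = np.full(C, 1.0 / C)                           # class prior
alpha = rng.uniform(0.5, 2.0, size=(C, K))          # per-class Dirichlet parameters
Omega_r = rng.dirichlet(np.ones(V_r), size=(C, K))  # visual-word topics, per class
Omega_w = rng.dirichlet(np.ones(V_w), size=(C, L))  # caption-word topics, per class
A = rng.normal(size=(L, K))                         # regression matrix
mu = np.zeros(L)
noise_std = 0.1                                     # isotropic stand-in for Lambda^{-1}

def generate_image():
    c = rng.choice(C, p=eta)                 # choose a category label c_d
    theta = rng.dirichlet(alpha[c])          # image topic proportions theta_d
    z = rng.choice(K, p=theta, size=N)       # image topic assignments z_dn
    r = np.array([rng.choice(V_r, p=Omega_r[c, k]) for k in z])  # visual words
    z_bar = np.bincount(z, minlength=K) / N  # empirical topic frequency
    x = A @ z_bar + mu + noise_std * rng.normal(size=L)  # regression step
    v = np.exp(x - x.max())
    v /= v.sum()                             # caption topic proportions (softmax)
    y = rng.choice(L, p=v, size=M)           # caption topic assignments y_dm
    w = np.array([rng.choice(V_w, p=Omega_w[c, l]) for l in y])  # caption words
    return c, r, w

c, r, w = generate_image()
```

Inference would then invert this process; the variational updates of Table 1 operate on exactly these quantities.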
More informationMeasuring Topic Quality in Latent Dirichlet Allocation
Measuring Topic Quality in Sergei Koltsov Olessia Koltsova Steklov Institute of Mathematics at St. Petersburg Laboratory for Internet Studies, National Research University Higher School of Economics, St.
More informationSTATS 306B: Unsupervised Learning Spring Lecture 3 April 7th
STATS 306B: Unsupervised Learning Spring 2014 Lecture 3 April 7th Lecturer: Lester Mackey Scribe: Jordan Bryan, Dangna Li 3.1 Recap: Gaussian Mixture Modeling In the last lecture, we discussed the Gaussian
More informationBayesian Hidden Markov Models and Extensions
Bayesian Hidden Markov Models and Extensions Zoubin Ghahramani Department of Engineering University of Cambridge joint work with Matt Beal, Jurgen van Gael, Yunus Saatci, Tom Stepleton, Yee Whye Teh Modeling
More informationClustering with k-means and Gaussian mixture distributions
Clustering with k-means and Gaussian mixture distributions Machine Learning and Category Representation 2012-2013 Jakob Verbeek, ovember 23, 2012 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.12.13
More informationINTERPRETING THE PREDICTION PROCESS OF A DEEP NETWORK CONSTRUCTED FROM SUPERVISED TOPIC MODELS
INTERPRETING THE PREDICTION PROCESS OF A DEEP NETWORK CONSTRUCTED FROM SUPERVISED TOPIC MODELS Jianshu Chen, Ji He, Xiaodong He, Lin Xiao, Jianfeng Gao, and Li Deng Microsoft Research, Redmond, WA 9852,
More informationIntroduction to Bayesian Statistics
School of Computing & Communication, UTS January, 207 Random variables Pre-university: A number is just a fixed value. When we talk about probabilities: When X is a continuous random variable, it has a
More informationMixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate
Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means
More informationTOPIC models, such as latent Dirichlet allocation (LDA),
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. X, NO. X, XXXX Learning Supervised Topic Models for Classification and Regression from Crowds Filipe Rodrigues, Mariana Lourenço, Bernardete
More informationChris Bishop s PRML Ch. 8: Graphical Models
Chris Bishop s PRML Ch. 8: Graphical Models January 24, 2008 Introduction Visualize the structure of a probabilistic model Design and motivate new models Insights into the model s properties, in particular
More informationMixture Models & EM. Nicholas Ruozzi University of Texas at Dallas. based on the slides of Vibhav Gogate
Mixture Models & EM icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Previously We looed at -means and hierarchical clustering as mechanisms for unsupervised learning -means
More informationLatent Dirichlet Bayesian Co-Clustering
Latent Dirichlet Bayesian Co-Clustering Pu Wang 1, Carlotta Domeniconi 1, and athryn Blackmond Laskey 1 Department of Computer Science Department of Systems Engineering and Operations Research George Mason
More informationLatent Variable Models and Expectation Maximization
Latent Variable Models and Expectation Maximization Oliver Schulte - CMPT 726 Bishop PRML Ch. 9 2 4 6 8 1 12 14 16 18 2 4 6 8 1 12 14 16 18 5 1 15 2 25 5 1 15 2 25 2 4 6 8 1 12 14 2 4 6 8 1 12 14 5 1 15
More informationTopic Models. Brandon Malone. February 20, Latent Dirichlet Allocation Success Stories Wrap-up
Much of this material is adapted from Blei 2003. Many of the images were taken from the Internet February 20, 2014 Suppose we have a large number of books. Each is about several unknown topics. How can
More informationProbablistic Graphical Models, Spring 2007 Homework 4 Due at the beginning of class on 11/26/07
Probablistic Graphical odels, Spring 2007 Homework 4 Due at the beginning of class on 11/26/07 Instructions There are four questions in this homework. The last question involves some programming which
More informationConditional Random Field
Introduction Linear-Chain General Specific Implementations Conclusions Corso di Elaborazione del Linguaggio Naturale Pisa, May, 2011 Introduction Linear-Chain General Specific Implementations Conclusions
More informationSequence labeling. Taking collective a set of interrelated instances x 1,, x T and jointly labeling them
HMM, MEMM and CRF 40-957 Special opics in Artificial Intelligence: Probabilistic Graphical Models Sharif University of echnology Soleymani Spring 2014 Sequence labeling aking collective a set of interrelated
More informationAN INTRODUCTION TO TOPIC MODELS
AN INTRODUCTION TO TOPIC MODELS Michael Paul December 4, 2013 600.465 Natural Language Processing Johns Hopkins University Prof. Jason Eisner Making sense of text Suppose you want to learn something about
More informationDensity Propagation for Continuous Temporal Chains Generative and Discriminative Models
$ Technical Report, University of Toronto, CSRG-501, October 2004 Density Propagation for Continuous Temporal Chains Generative and Discriminative Models Cristian Sminchisescu and Allan Jepson Department
More informationMulti-Layer Boosting for Pattern Recognition
Multi-Layer Boosting for Pattern Recognition François Fleuret IDIAP Research Institute, Centre du Parc, P.O. Box 592 1920 Martigny, Switzerland fleuret@idiap.ch Abstract We extend the standard boosting
More informationLecture 13 Visual recognition
Lecture 13 Visual recognition Announcements Silvio Savarese Lecture 13-20-Feb-14 Lecture 13 Visual recognition Object classification bag of words models Discriminative methods Generative methods Object
More informationA Continuous-Time Model of Topic Co-occurrence Trends
A Continuous-Time Model of Topic Co-occurrence Trends Wei Li, Xuerui Wang and Andrew McCallum Department of Computer Science University of Massachusetts 140 Governors Drive Amherst, MA 01003-9264 Abstract
More informationLatent Variable Models and Expectation Maximization
Latent Variable Models and Expectation Maximization Oliver Schulte - CMPT 726 Bishop PRML Ch. 9 2 4 6 8 1 12 14 16 18 2 4 6 8 1 12 14 16 18 5 1 15 2 25 5 1 15 2 25 2 4 6 8 1 12 14 2 4 6 8 1 12 14 5 1 15
More informationLatent Dirichlet Allocation
Latent Dirichlet Allocation 1 Directed Graphical Models William W. Cohen Machine Learning 10-601 2 DGMs: The Burglar Alarm example Node ~ random variable Burglar Earthquake Arcs define form of probability
More informationExpectation Maximization
Expectation Maximization Bishop PRML Ch. 9 Alireza Ghane c Ghane/Mori 4 6 8 4 6 8 4 6 8 4 6 8 5 5 5 5 5 5 4 6 8 4 4 6 8 4 5 5 5 5 5 5 µ, Σ) α f Learningscale is slightly Parameters is slightly larger larger
More informationProbabilistic classification CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2016
Probabilistic classification CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2016 Topics Probabilistic approach Bayes decision theory Generative models Gaussian Bayes classifier
More informationIEOR E4570: Machine Learning for OR&FE Spring 2015 c 2015 by Martin Haugh. The EM Algorithm
IEOR E4570: Machine Learning for OR&FE Spring 205 c 205 by Martin Haugh The EM Algorithm The EM algorithm is used for obtaining maximum likelihood estimates of parameters when some of the data is missing.
More informationWeb Search and Text Mining. Lecture 16: Topics and Communities
Web Search and Tet Mining Lecture 16: Topics and Communities Outline Latent Dirichlet Allocation (LDA) Graphical models for social netorks Eploration, discovery, and query-ansering in the contet of the
More information