Topic Models. Advanced Machine Learning for NLP. Jordan Boyd-Graber. OVERVIEW.


Transcription:


Low-Dimensional Space for Documents
Last time: an embedding space for words.
This time: an embedding space for documents, a generative story, and new inference techniques.

Why topic models?
Suppose you have a huge number of documents and want to know what's going on, but can't read them all (e.g., every New York Times article from the 90s). Topic models offer a way to get a corpus-level view of the major themes, and they are unsupervised.

Roadmap
What topic models are, and how to go from raw data to topics.

Embedding Space
From an input corpus and a number of topics K: words to topics.
[Figure: example New York Times headlines, including "Forget the Bootleg, Just Download the Movie Legally"; "Multiplex Heralded As Linchpin To Growth"; "The Shape of Cinema, Transformed At the Click of a Mouse"; "A Peaceful Crew Puts Muppets Where Its Mouth Is"; "Stock Trades: A Better Deal For Investors Isn't Simple"; "The three big Internet portals begin to distinguish among themselves as shopping malls"; "Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens".]
TOPIC 1: computer, technology, system, service, site, phone, internet, machine
TOPIC 2: sell, sale, store, product, business, advertising, market, consumer
TOPIC 3: play, film, movie, theater, production, star, director, stage

Conceptual Approach
For each document, what topics are expressed by that document?
[Figure: each example headline — "Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens"; "The three big Internet portals begin to distinguish among themselves as shopping malls"; "Stock Trades: A Better Deal For Investors Isn't Simple"; "Forget the Bootleg, Just Download the Movie Legally"; "The Shape of Cinema, Transformed At the Click of a Mouse"; "Multiplex Heralded As Linchpin To Growth"; "A Peaceful Crew Puts Muppets Where Its Mouth Is" — linked to the topics (TOPIC 1, TOPIC 2, TOPIC 3) it expresses.]

Topics from Science
[Figure: example topics learned from a corpus of Science articles.]

Why should you care?
Topic models are a neat way to explore and understand corpus collections: e-discovery, social media, scientific data. NLP applications include word sense disambiguation, discourse segmentation, and machine translation. In psychology, they model word meaning and polysemy. And inference is (relatively) simple.

Matrix Factorization Approach
Factor the M × V dataset matrix into an M × K topic-assignment matrix times a K × V topics matrix, where K is the number of topics, M the number of documents, and V the size of the vocabulary.
If you use singular value decomposition (SVD), this technique is called latent semantic analysis. It is popular in information retrieval.
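As a minimal sketch of this factorization (the 5-word, 4-document count matrix below is made up for illustration), latent semantic analysis is just a truncated SVD of the term-document matrix:

```python
import numpy as np

# Toy term-document count matrix: V = 5 word types, M = 4 documents.
# Rows (hypothetical): "movie", "film", "stock", "trade", "internet".
X = np.array([
    [2, 3, 0, 0],   # movie
    [1, 2, 0, 0],   # film
    [0, 0, 3, 1],   # stock
    [0, 0, 2, 2],   # trade
    [1, 0, 1, 2],   # internet
], dtype=float)

# Latent semantic analysis: truncated SVD with K = 2 latent dimensions.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
K = 2
word_space = U[:, :K] * s[:K]   # each word type as a K-dim vector
doc_space = Vt[:K, :].T         # each document as a K-dim vector

def cos(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Documents 1 and 2 (movie-themed) end up closer to each other
# than document 1 is to document 3 (finance-themed).
print(cos(doc_space[0], doc_space[1]))   # high similarity
print(cos(doc_space[0], doc_space[2]))   # low similarity
```

The K-dimensional rows of `doc_space` play the role of the topic-assignment matrix, and `word_space` the role of the topics matrix, in the factorization above.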

Alternative: Generative Model
How your data came to be: a sequence of probabilistic steps, followed by posterior inference.
Blei, Ng, and Jordan. "Latent Dirichlet Allocation." JMLR, 2003.

Multinomial Distribution
A distribution over discrete outcomes, represented by a non-negative vector that sums to one, e.g. (1, 0, 0), (0, 0, 1), (0, 1, 0), (1/3, 1/3, 1/3), (1/4, 1/4, 1/2), (1/2, 1/2, 0). These vectors can be pictured as points on a simplex. In this model, they come from a Dirichlet distribution.
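A quick numpy sketch of this relationship (the parameter values are arbitrary): a Dirichlet draw lives on the simplex and can serve directly as the parameters of a multinomial.

```python
import numpy as np

rng = np.random.default_rng(0)

# A draw from a Dirichlet distribution is a non-negative vector
# that sums to one, i.e. valid multinomial parameters.
alpha = np.array([1/3, 1/3, 1/3])   # symmetric prior, arbitrary choice
phi = rng.dirichlet(alpha)
assert phi.min() >= 0 and np.isclose(phi.sum(), 1.0)

# That vector then parameterizes draws of discrete outcomes.
counts = rng.multinomial(100, phi)
print(phi, counts)   # counts roughly track phi
```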

Dirichlet Distribution
[Figures: Dirichlet densities over the simplex for several parameter settings.]

Dirichlet Distribution
If φ ∼ Dir(α), w ∼ Mult(φ), and n_k = |{w_i : w_i = k}|, then

p(φ | α, w) ∝ p(w | φ) p(φ | α)                      (1)
            ∝ ∏_k φ_k^{n_k} ∏_k φ_k^{α_k − 1}        (2)
            = ∏_k φ_k^{α_k + n_k − 1}                (3)

Conjugacy: this posterior has the same form as the prior.
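This conjugacy can be checked numerically. In the sketch below (α and the counts n are arbitrary), the unnormalized posterior p(w | φ) p(φ | α) agrees with the unnormalized Dir(α + n) density at any point on the simplex, since φ^n · φ^(α−1) = φ^(α+n−1) term by term.

```python
import numpy as np

alpha = np.array([0.5, 0.5, 0.5])   # Dirichlet prior parameters (arbitrary)
n = np.array([3, 1, 0])             # word counts n_k from the data (arbitrary)

def unnorm_posterior(phi):
    likelihood = np.prod(phi ** n)        # prod_k phi_k^{n_k}
    prior = np.prod(phi ** (alpha - 1))   # prod_k phi_k^{alpha_k - 1}
    return likelihood * prior

def unnorm_dirichlet(phi, a):
    return np.prod(phi ** (a - 1))        # Dir(a) density, up to a constant

# At any point phi on the simplex, the two expressions coincide.
phi = np.array([0.6, 0.3, 0.1])
print(unnorm_posterior(phi), unnorm_dirichlet(phi, alpha + n))
```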

Generative Model
TOPIC 1: computer, technology, system, service, site, phone, internet, machine
TOPIC 2: sell, sale, store, product, business, advertising, market, consumer
TOPIC 3: play, film, movie, theater, production, star, director, stage
[Figure: the example headlines linked to the topics they express.]
Example document, with each word generated by one of the topics: "Hollywood studios are preparing to let people download and buy electronic copies of movies over the Internet, much as record labels now sell songs for 99 cents through Apple Computer's iTunes music store and other online services..."

Generative Model Approach
[Plate diagram: λ → β_k (repeated K times); α → θ_d → z_n → w_n (N words in each of M documents).]
For each topic k ∈ {1, ..., K}, draw a multinomial distribution β_k from a Dirichlet distribution with parameter λ.
For each document d ∈ {1, ..., M}, draw a multinomial distribution θ_d from a Dirichlet distribution with parameter α.
For each word position n ∈ {1, ..., N}, select a hidden topic z_n from the multinomial distribution parameterized by θ_d.
Choose the observed word w_n from the distribution β_{z_n}.
We use statistical inference to uncover the most likely unobserved variables.
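The four steps above can be sketched directly in numpy (the corpus sizes and hyperparameter values below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
K, M, V, N = 3, 5, 8, 20   # topics, documents, vocabulary size, words/doc
lam, alpha = 0.1, 0.5      # Dirichlet hyperparameters (lambda, alpha)

# Step 1: for each topic k, draw a distribution beta_k over the vocabulary.
beta = rng.dirichlet(np.full(V, lam), size=K)   # shape (K, V)

docs = []
for d in range(M):
    # Step 2: for each document d, draw a distribution theta_d over topics.
    theta = rng.dirichlet(np.full(K, alpha))    # shape (K,)
    words = []
    for n in range(N):
        z = rng.choice(K, p=theta)              # step 3: hidden topic z_n
        w = rng.choice(V, p=beta[z])            # step 4: observed word w_n
        words.append(w)
    docs.append(words)

print(docs[0])   # one synthetic document, as word-type indices
```

Running the story forward like this produces synthetic documents; inference runs the story backward, recovering β, θ, and z from observed words alone.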

Topic Models: What's Important
Topic models (latent variables): a multinomial distribution from topics to word types, and a multinomial distribution from documents to topics.
Focus in this talk: statistical methods. Model: a story of how your data came to be. Latent variables: the missing pieces of your story. Statistical inference: filling in those missing pieces.
We use latent Dirichlet allocation (LDA), a fully Bayesian version of pLSI, the probabilistic version of LSA.
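The slides leave the choice of inference method open. As one common illustration (not necessarily the method used in this course), here is a minimal collapsed Gibbs sampler for LDA, which fills in the hidden topic assignments by repeatedly resampling each token's topic given all the others; the toy documents and hyperparameters are made up:

```python
import numpy as np

rng = np.random.default_rng(2)

def gibbs_lda(docs, K, V, alpha=0.5, lam=0.1, iters=50):
    """Collapsed Gibbs sampling for LDA: resample each token's topic
    from its conditional distribution given all other assignments."""
    z = [[int(rng.integers(K)) for _ in doc] for doc in docs]
    ndk = np.zeros((len(docs), K))   # document-topic counts
    nkv = np.zeros((K, V))           # topic-word counts
    nk = np.zeros(K)                 # topic totals
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkv[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove this token's current assignment from the counts.
                k = z[d][i]
                ndk[d, k] -= 1; nkv[k, w] -= 1; nk[k] -= 1
                # Conditional: (n_dk + alpha) * (n_kw + lam) / (n_k + V*lam).
                p = (ndk[d] + alpha) * (nkv[:, w] + lam) / (nk + V * lam)
                k = int(rng.choice(K, p=p / p.sum()))
                z[d][i] = k
                ndk[d, k] += 1; nkv[k, w] += 1; nk[k] += 1
    return ndk, nkv

# Two tiny "documents" over a 4-word vocabulary (word-type indices).
docs = [[0, 0, 1, 0], [2, 3, 3, 2]]
ndk, nkv = gibbs_lda(docs, K=2, V=4)
print(ndk)   # document-topic counts after sampling
```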
