
1 Topic Models: Introduction. Material adapted from David Mimno, University of Maryland.

2 Why topic models? Suppose you have a huge number of documents and want to know what's going on, but can't read them all (e.g., every New York Times article from the 90s). Topic models offer an unsupervised way to get a corpus-level view of the major themes.

4 Roadmap. What topic models are; how to know if you have a good topic model; how to go from raw data to topics.

5 What are Topic Models? Conceptual Approach. From an input corpus and a number of topics K: words to topics. Corpus: "Forget the Bootleg, Just Download the Movie Legally"; "Multiplex Heralded As Linchpin To Growth"; "The Shape of Cinema, Transformed At the Click of a Mouse"; "A Peaceful Crew Puts Muppets Where Its Mouth Is"; "Stock Trades: A Better Deal For Investors Isn't Simple"; "The three big Internet portals begin to distinguish among themselves as shopping malls"; "Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens".

6 What are Topic Models? Conceptual Approach. From an input corpus and a number of topics K: words to topics. TOPIC 1: computer, technology, system, service, site, phone, internet, machine. TOPIC 2: sell, sale, store, product, business, advertising, market, consumer. TOPIC 3: play, film, movie, theater, production, star, director, stage.

7 What are Topic Models? Conceptual Approach. For each document, what topics are expressed by that document? [figure: each headline linked to the topics it expresses, among TOPIC 1, TOPIC 2, TOPIC 3: "Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens"; "The three big Internet portals begin to distinguish among themselves as shopping malls"; "Stock Trades: A Better Deal For Investors Isn't Simple"; "Forget the Bootleg, Just Download the Movie Legally"; "The Shape of Cinema, Transformed At the Click of a Mouse"; "Multiplex Heralded As Linchpin To Growth"; "A Peaceful Crew Puts Muppets Where Its Mouth Is"]

8 What are Topic Models? Topics from Science [figure]

9 What are Topic Models? Why should you care? A neat way to explore and understand corpus collections: e-discovery, social media, scientific data. NLP applications: word sense disambiguation, discourse segmentation, machine translation. Psychology: word meaning, polysemy. Inference is (relatively) simple.

10 What are Topic Models? Matrix Factorization Approach. The M x V dataset matrix is factored into an M x K topic-assignment matrix times a K x V topics matrix, where K is the number of topics, M the number of documents, and V the size of the vocabulary. If you use singular value decomposition (SVD), this technique is called latent semantic analysis, which is popular in information retrieval.

12 What are Topic Models? Alternative: Generative Model. A story of how your data came to be, told as a sequence of probabilistic steps, followed by posterior inference. (Blei, Ng, and Jordan. Latent Dirichlet Allocation. JMLR.)

14 What are Topic Models? Multinomial Distribution. A distribution over discrete outcomes, represented by a non-negative vector that sums to one. Examples on the three-outcome simplex: (1,0,0), (0,0,1), (0,1,0), (1/3,1/3,1/3), (1/4,1/4,1/2), (1/2,1/2,0). These parameter vectors come from a Dirichlet distribution.
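A draw from a Dirichlet can be sketched in a few lines of plain Python, using the standard construction of normalizing independent Gamma draws (the function name is ours):

```python
import random

def sample_dirichlet(alpha, rng=random):
    """Draw a non-negative vector summing to one (a multinomial parameter)
    from a Dirichlet distribution by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]

# Small alpha concentrates mass on a few outcomes; large alpha spreads it out.
phi = sample_dirichlet([0.1, 0.1, 0.1])
print(phi, sum(phi))
```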

16 What are Topic Models? Dirichlet Distribution [figures: Dirichlet densities over the simplex for several parameter settings]

20 What are Topic Models? Dirichlet Distribution. If $\phi \sim \mathrm{Dir}(\alpha)$, $w_i \sim \mathrm{Mult}(\phi)$, and $n_k = |\{w_i : w_i = k\}|$, then
$p(\phi \mid \alpha, \vec{w}) \propto p(\vec{w} \mid \phi)\, p(\phi \mid \alpha)$ (1)
$\propto \prod_k \phi_k^{n_k} \prod_k \phi_k^{\alpha_k - 1}$ (2)
$= \prod_k \phi_k^{\alpha_k + n_k - 1}$ (3)
Conjugacy: this posterior has the same form as the prior.
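Equations (1)-(3) say the posterior is itself a Dirichlet with parameters $\alpha_k + n_k$, so the Bayesian update is pure counting. A tiny sketch (the data values are ours):

```python
from collections import Counter

# Conjugacy in code: a Dir(alpha) prior plus observed counts n_k gives a
# Dir(alpha + n) posterior, so updating is just addition.
alpha = [1.0, 1.0, 1.0]            # symmetric prior over 3 outcomes
words = [0, 0, 1, 0, 2, 0, 1]      # observed draws w_i from Mult(phi)
counts = Counter(words)
posterior = [alpha[k] + counts[k] for k in range(len(alpha))]
# The posterior mean of phi_k is (alpha_k + n_k) / sum_i (alpha_i + n_i).
mean = [p / sum(posterior) for p in posterior]
print(posterior)  # [5.0, 3.0, 2.0]
print(mean)       # [0.5, 0.3, 0.2]
```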

22 What are Topic Models? Generative Model. TOPIC 1: computer, technology, system, service, site, phone, internet, machine. TOPIC 2: sell, sale, store, product, business, advertising, market, consumer. TOPIC 3: play, film, movie, theater, production, star, director, stage.

23 What are Topic Models? Generative Model. [figure: each headline drawn as a mixture over TOPIC 1, TOPIC 2, TOPIC 3: "Red Light, Green Light: A 2-Tone L.E.D. to Simplify Screens"; "The three big Internet portals begin to distinguish among themselves as shopping malls"; "Stock Trades: A Better Deal For Investors Isn't Simple"; "Forget the Bootleg, Just Download the Movie Legally"; "The Shape of Cinema, Transformed At the Click of a Mouse"; "Multiplex Heralded As Linchpin To Growth"; "A Peaceful Crew Puts Muppets Where Its Mouth Is"]

24 What are Topic Models? Generative Model. [animation: generating an example document word by word from the topics] TOPIC 1: computer, technology, system, service, site, phone, internet, machine. TOPIC 2: sell, sale, store, product, business, advertising, market, consumer. TOPIC 3: play, film, movie, theater, production, star, director, stage. Example document: "Hollywood studios are preparing to let people download and buy electronic copies of movies over the Internet, much as record labels now sell songs for 99 cents through Apple Computer's iTunes music store and other online services..."

28 What are Topic Models? Generative Model Approach. [plate diagram: λ → β_k (K topics); α → θ_d → z_n → w_n (N words in each of M documents)]
For each topic k ∈ {1, ..., K}, draw a multinomial distribution β_k from a Dirichlet distribution with parameter λ.
For each document d ∈ {1, ..., M}, draw a multinomial distribution θ_d from a Dirichlet distribution with parameter α.
For each word position n ∈ {1, ..., N}, select a hidden topic z_n from the multinomial distribution parameterized by θ_d.
Choose the observed word w_n from the distribution β_{z_n}.
We use statistical inference to uncover the most likely unobserved variables.
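The four generative steps above can be sketched end-to-end in a few lines (toy sizes; the function names and hyperparameter defaults here are ours):

```python
import random

def dirichlet(alpha):
    # Normalize independent Gamma draws to get a point on the simplex.
    g = [random.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

def categorical(p):
    # Draw an index from a multinomial parameter vector p.
    r, acc = random.random(), 0.0
    for i, pi in enumerate(p):
        acc += pi
        if r < acc:
            return i
    return len(p) - 1

def generate_corpus(K=3, M=5, N=10, V=8, lam=0.1, alpha=0.5):
    # One beta_k per topic (over the vocabulary), one theta_d per document.
    betas = [dirichlet([lam] * V) for _ in range(K)]
    corpus = []
    for _ in range(M):
        theta = dirichlet([alpha] * K)
        doc = []
        for _ in range(N):
            z = categorical(theta)      # hidden topic for this position
            w = categorical(betas[z])   # observed word drawn from beta_z
            doc.append(w)
        corpus.append(doc)
    return corpus

print(generate_corpus())
```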

33 What are Topic Models? Topic Models: What's Important. Topic models posit latent variables linking topics to word types (a multinomial distribution) and documents to topics (a multinomial distribution). Focus in this talk: statistical methods. Model: a story of how your data came to be. Latent variables: the missing pieces of your story. Statistical inference: filling in those missing pieces. We use latent Dirichlet allocation (LDA), a fully Bayesian version of pLSI, the probabilistic version of LSA.

35 Evaluation. [figure: models A, B, and C are trained on the headline corpus, then scored by held-out log likelihood on held-out documents: "Sony Ericsson's Infinite Hope for a Turnaround"; "For Search, Murdoch Looks to a Deal With Microsoft"; "Price War Brews Between Amazon and Wal-Mart"] Held-out likelihood measures predictive power, not what the topics are. How you compute it is important too (Wallach et al. 2009).

37 Evaluation: Word Intrusion. TOPIC 1: computer, technology, system, service, site, phone, internet, machine. TOPIC 2: sell, sale, store, product, business, advertising, market, consumer. TOPIC 3: play, film, movie, theater, production, star, director, stage.

38 Evaluation: Word Intrusion. 1. Take the highest-probability words from a topic. Original topic: dog, cat, horse, pig, cow. 2. Take a high-probability word from another topic and add it. Topic with intruder: dog, cat, apple, horse, pig, cow. 3. Ask users to find the word that doesn't belong. Hypothesis: if the topics are interpretable, users will consistently choose the true intruder.
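The resulting metric, model precision, is just the fraction of users who find the intruder; a minimal sketch (the user clicks below are made up for illustration):

```python
def model_precision(selections, intruder):
    """Fraction of users who correctly identified the intruder word
    in one topic's intrusion task."""
    hits = sum(1 for word in selections if word == intruder)
    return hits / len(selections)

# Hypothetical clicks for the topic {dog, cat, apple, horse, pig, cow}:
clicks = ["apple", "apple", "horse", "apple", "cat", "apple"]
print(model_precision(clicks, "apple"))  # 4 of 6 users found the intruder
```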

41 Evaluation: Word Intrusion. [figure: the intrusion task as presented to users] The order of words was shuffled, and which word users selected as the intruder varied. Model precision: the percentage of users who clicked on the intruder.

43 Evaluation: Word Intrusion: Which Topics are Interpretable? [figure: model precision per topic for 50 LDA topics trained on the New York Times; example topics include "committee legislation proposal republican taxis", "fireplace garage house kitchen list", "americans japanese jewish states terrorist", and "artist exhibition gallery museum painting"] Model precision: percentage of correct intruders found.

44 Evaluation: Interpretability and Likelihood. [figures: model precision and topic log odds plotted against held-out predictive log likelihood for CTM, LDA, and pLSI on the New York Times and Wikipedia, at varying numbers of topics] Within a model, higher likelihood does not imply higher interpretability; across models, higher likelihood does not imply higher interpretability either.

46 Evaluation Takeaway. Measure what you care about: if you care about prediction, likelihood is good; if you care about a particular task, measure that.

47 Inference. We are interested in the posterior distribution $p(Z \mid X, \Theta)$ (4). Here, the latent variables are the topic assignments $\vec{z}$, the topics $\beta$, and the per-document topic proportions $\theta$; $X$ is the words (divided into documents), and $\Theta$ are the hyperparameters of the Dirichlet distributions: $\alpha$ for topic proportions, $\lambda$ for topics. So we want $p(\vec{z}, \beta, \theta \mid \vec{w}, \alpha, \lambda)$ (5), where the joint distribution is
$p(\vec{w}, \vec{z}, \theta, \beta \mid \alpha, \lambda) = \prod_k p(\beta_k \mid \lambda) \prod_d \Big( p(\theta_d \mid \alpha) \prod_n p(z_{d,n} \mid \theta_d)\, p(w_{d,n} \mid \beta_{z_{d,n}}) \Big)$

50 Gibbs Sampling. A form of Markov chain Monte Carlo. The chain is a sequence of random variable states. Given a state $\{z_1, \ldots, z_N\}$ and certain technical conditions, repeatedly drawing $z_k \sim p(z_k \mid z_1, \ldots, z_{k-1}, z_{k+1}, \ldots, z_N, X, \Theta)$ for all $k$ results in a Markov chain whose stationary distribution is the posterior. For notational convenience, call $\vec{z}$ with $z_{d,n}$ removed $\vec{z}_{-d,n}$.
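Before the LDA specifics, the idea can be illustrated on a tiny example (the two-variable joint table below is our own toy, not from the slides):

```python
import random

# Toy Gibbs sampler: two binary variables whose joint distribution puts
# weight 4 on agreeing states and weight 1 on disagreeing ones,
# so p(x == y) = 8/10 under the target distribution.
weights = {(0, 0): 4.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}

def draw_x_given_y(y, rng):
    w0, w1 = weights[(0, y)], weights[(1, y)]
    return 1 if rng.random() < w1 / (w0 + w1) else 0

def draw_y_given_x(x, rng):
    w0, w1 = weights[(x, 0)], weights[(x, 1)]
    return 1 if rng.random() < w1 / (w0 + w1) else 0

rng = random.Random(1)
x = y = 0
agree = 0
for _ in range(20000):
    x = draw_x_given_y(y, rng)   # resample each variable from its full
    y = draw_y_given_x(x, rng)   # conditional, holding the other fixed
    agree += (x == y)
print(agree / 20000)  # close to 0.8, the stationary mass on x == y
```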

51 Inference. [animation: iteratively reassigning each word of the example document among the three topics] TOPIC 1: computer, technology, system, service, site, phone, internet, machine. TOPIC 2: sell, sale, store, product, business, advertising, market, consumer. TOPIC 3: play, film, movie, theater, production, star, director, stage. Example document: "Hollywood studios are preparing to let people download and buy electronic copies of movies over the Internet, much as record labels now sell songs for 99 cents through Apple Computer's iTunes music store and other online services..."

58 Gibbs Sampling. For LDA, we will sample the topic assignments. Thus, we want:
$p(z_{d,n} = k \mid \vec{z}_{-d,n}, \vec{w}, \alpha, \lambda) = \frac{p(z_{d,n} = k, \vec{z}_{-d,n} \mid \vec{w}, \alpha, \lambda)}{p(\vec{z}_{-d,n} \mid \vec{w}, \alpha, \lambda)}$
The topics and per-document topic proportions are integrated out / marginalized. Let $n_{d,i}$ be the number of words taking topic $i$ in document $d$, and $v_{k,w}$ the number of times word $w$ is used in topic $k$ (both counts excluding the token being sampled). The conditional is then proportional to
$\frac{\int \theta_{d,k} \prod_i \theta_{d,i}^{\alpha_i + n_{d,i} - 1}\, d\theta_d}{\int \prod_i \theta_{d,i}^{\alpha_i + n_{d,i} - 1}\, d\theta_d} \cdot \frac{\int \beta_{k,w_{d,n}} \prod_i \beta_{k,i}^{\lambda_i + v_{k,i} - 1}\, d\beta_k}{\int \prod_i \beta_{k,i}^{\lambda_i + v_{k,i} - 1}\, d\beta_k}$

60 Gibbs Sampling. For LDA, we will sample the topic assignments. The topics and per-document topic proportions are integrated out / marginalized / Rao-Blackwellized. Thus, we want:
$p(z_{d,n} = k \mid \vec{z}_{-d,n}, \vec{w}, \alpha, \lambda) = \frac{n_{d,k} + \alpha_k}{\sum_i^K n_{d,i} + \alpha_i} \cdot \frac{v_{k,w_{d,n}} + \lambda_{w_{d,n}}}{\sum_i^V v_{k,i} + \lambda_i}$

61 Gibbs Sampling. The integral is the normalizer of a Dirichlet distribution:
$\int \prod_i \beta_{k,i}^{\lambda_i + v_{k,i} - 1}\, d\beta_k = \frac{\prod_i \Gamma(\lambda_i + v_{k,i})}{\Gamma\big(\sum_i^V \lambda_i + v_{k,i}\big)}$
So we can simplify each ratio of integrals into a ratio of Gamma functions:
$\frac{\Gamma(\alpha_k + n_{d,k} + 1)}{\Gamma(\alpha_k + n_{d,k})} \cdot \frac{\Gamma\big(\sum_i^K \alpha_i + n_{d,i}\big)}{\Gamma\big(\sum_i^K \alpha_i + n_{d,i} + 1\big)} \cdot \frac{\Gamma(\lambda_{w_{d,n}} + v_{k,w_{d,n}} + 1)}{\Gamma(\lambda_{w_{d,n}} + v_{k,w_{d,n}})} \cdot \frac{\Gamma\big(\sum_i^V \lambda_i + v_{k,i}\big)}{\Gamma\big(\sum_i^V \lambda_i + v_{k,i} + 1\big)}$

63 Gamma Function Identity. $\frac{\Gamma(z + 1)}{\Gamma(z)} = z$ (6). Applying this identity to each Gamma ratio on the previous slide gives
$\frac{n_{d,k} + \alpha_k}{\sum_i^K n_{d,i} + \alpha_i} \cdot \frac{v_{k,w_{d,n}} + \lambda_{w_{d,n}}}{\sum_i^V v_{k,i} + \lambda_i}$
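The identity is easy to sanity-check numerically with the standard library:

```python
import math

# Check Gamma(z + 1) / Gamma(z) = z numerically; this identity is what
# collapses each ratio of Gamma functions in the sampler to a plain count.
for z in [0.5, 1.0, 3.7, 12.0]:
    assert abs(math.gamma(z + 1) / math.gamma(z) - z) < 1e-9
print("Gamma(z + 1) / Gamma(z) == z verified")
```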

64 Gibbs Sampling Equation.
$p(z_{d,n} = k \mid \vec{z}_{-d,n}, \vec{w}, \alpha, \lambda) \propto \frac{n_{d,k} + \alpha_k}{\sum_i^K n_{d,i} + \alpha_i} \cdot \frac{v_{k,w_{d,n}} + \lambda_{w_{d,n}}}{\sum_i^V v_{k,i} + \lambda_i}$ (7)
$n_{d,k}$: number of times document $d$ uses topic $k$. $v_{k,w_{d,n}}$: number of times topic $k$ uses word type $w_{d,n}$. $\alpha_k$: Dirichlet parameter for the document-to-topic distribution. $\lambda_{w_{d,n}}$: Dirichlet parameter for the topic-to-word distribution. The first factor says how much this document likes topic $k$; the second says how much topic $k$ likes word $w_{d,n}$.
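Equation (7) maps directly onto count arrays. A sketch with made-up counts (all names and numbers are ours), dropping the document-side denominator since it is constant in $k$:

```python
def gibbs_weights(n_d, v, w, alpha, lam):
    """Unnormalized p(z_{d,n} = k | ...) for one word token, following Eq. (7).
    n_d[k]: topic counts for this document; v[k][i]: per-topic word counts;
    w: the current word type. All counts must exclude the token being sampled.
    The document-side denominator is constant in k, so it is dropped."""
    lam_sum = sum(lam)
    return [(n_d[k] + alpha[k]) * (v[k][w] + lam[w]) / (sum(v[k]) + lam_sum)
            for k in range(len(alpha))]

# Toy counts: 2 topics, 3 word types, one document that mostly uses topic 0.
weights = gibbs_weights(n_d=[3, 1], v=[[5, 1, 0], [0, 2, 4]], w=1,
                        alpha=[0.5, 0.5], lam=[0.1, 0.1, 0.1])
print(weights)  # topic 0 wins: the document likes it, though topic 1 likes word 1 more
```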

70 Sample Document [figure: an example document]

72 Randomly Assign Topics [figure: each word given a random initial topic assignment]

74 Total Topic Counts. [figure: total counts of each topic across the corpus] Sampling equation: $\frac{n_{d,k} + \alpha_k}{\sum_i^K n_{d,i} + \alpha_i} \cdot \frac{v_{k,w_{d,n}} + \lambda_{w_{d,n}}}{\sum_i^V v_{k,i} + \lambda_i}$

77 We want to sample this word... [figure: one word token highlighted]

79 Decrement its count

80 What is the conditional distribution for this topic?

81 Part 1: How much does this document like each topic? [figure] This is the first factor of the sampling equation: $\frac{n_{d,k} + \alpha_k}{\sum_i^K n_{d,i} + \alpha_i}$

85 Part 2: How much does each topic like the word? [figure] This is the second factor of the sampling equation: $\frac{v_{k,w_{d,n}} + \lambda_{w_{d,n}}}{\sum_i^V v_{k,i} + \lambda_i}$

88 Geometric interpretation [figures]

91 Update counts [figure: counts incremented for the newly sampled topic]

94 Details: how to sample from a distribution. Compute the unnormalized weight $\frac{n_{d,k} + \alpha_k}{\sum_i^K n_{d,i} + \alpha_i} \cdot \frac{v_{k,w_{d,n}} + \lambda_{w_{d,n}}}{\sum_i^V v_{k,i} + \lambda_i}$ for each topic, normalize the weights so they sum to one, then draw a uniform random number and select the topic whose cumulative interval contains it. [figure: bar chart of weights for Topics 1 through 5, before and after normalization]
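The normalize-then-walk-the-cumulative-distribution step looks like this in code (a minimal sketch; the weight values are made up):

```python
import random

def sample_from_weights(weights, rng=random):
    """Sample an index in proportion to unnormalized non-negative weights:
    scale a uniform draw by the total (equivalent to normalizing first),
    then walk the cumulative sums until the draw falls inside an interval."""
    total = sum(weights)
    r = rng.random() * total
    acc = 0.0
    for k, wt in enumerate(weights):
        acc += wt
        if r < acc:
            return k
    return len(weights) - 1   # guard against floating-point slop

counts = [sample_from_weights([0.1, 0.5, 0.2, 1.2, 0.4]) for _ in range(10000)]
print(counts.count(3) / len(counts))  # the heaviest topic is chosen most often
```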

95 Algorithm. 1. For each iteration i: 1.1 For each document d and word n currently assigned to $z_{old}$: decrement $n_{d,z_{old}}$ and $v_{z_{old},w_{d,n}}$; sample $z_{new} = k$ with probability proportional to $\frac{n_{d,k} + \alpha_k}{\sum_i^K n_{d,i} + \alpha_i} \cdot \frac{v_{k,w_{d,n}} + \lambda_{w_{d,n}}}{\sum_i^V v_{k,i} + \lambda_i}$; increment $n_{d,z_{new}}$ and $v_{z_{new},w_{d,n}}$.
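The whole decrement-sample-increment loop fits in a short function. A minimal sketch, assuming symmetric scalar hyperparameters (the function name, defaults, and toy corpus are ours; no convergence checks or burn-in):

```python
import random

def lda_gibbs(docs, K, alpha=0.5, lam=0.1, iters=200, seed=0):
    """Collapsed Gibbs sampler for LDA following the algorithm above.
    docs: list of documents, each a list of word-type indices."""
    rng = random.Random(seed)
    V = max(w for doc in docs for w in doc) + 1
    n = [[0] * K for _ in docs]        # n[d][k]: topic counts per document
    v = [[0] * V for _ in range(K)]    # v[k][w]: word counts per topic
    v_sum = [0] * K                    # total tokens assigned to each topic
    z = []                             # current assignment of every token
    for d, doc in enumerate(docs):     # random initialization
        zs = []
        for w in doc:
            k = rng.randrange(K)
            n[d][k] += 1; v[k][w] += 1; v_sum[k] += 1
            zs.append(k)
        z.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                old = z[d][i]          # decrement counts for this token
                n[d][old] -= 1; v[old][w] -= 1; v_sum[old] -= 1
                weights = [(n[d][k] + alpha) * (v[k][w] + lam) /
                           (v_sum[k] + lam * V) for k in range(K)]
                r = rng.random() * sum(weights)
                acc, new = 0.0, K - 1
                for k, wt in enumerate(weights):
                    acc += wt
                    if r < acc:
                        new = k
                        break
                z[d][i] = new          # increment counts for the new topic
                n[d][new] += 1; v[new][w] += 1; v_sum[new] += 1
    return z, n, v

# Toy corpus: word types {0,1} co-occur, as do {2,3}.
docs = [[0, 1, 0, 1], [0, 1, 1, 0], [2, 3, 2, 3], [3, 2, 2, 3]]
z, n, v = lda_gibbs(docs, K=2, iters=50)
print(n)  # per-document topic counts after sampling
```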

97 Desiderata. Hyperparameters: sample them too (slice sampling). Initialization: random. Sampling: until the likelihood converges. Lag / burn-in: difference of opinion on this. Number of chains: should do more than one.

98 Available implementations: Mallet; LDA-C (blei/lda-c); Topicmod.

99 Wrapup. Topic models: tools to uncover themes in large document collections. Another example of Gibbs sampling. In class: a Gibbs sampling example.



More information

Applying LDA topic model to a corpus of Italian Supreme Court decisions

Applying LDA topic model to a corpus of Italian Supreme Court decisions Applying LDA topic model to a corpus of Italian Supreme Court decisions Paolo Fantini Statistical Service of the Ministry of Justice - Italy CESS Conference - Rome - November 25, 2014 Our goal finding

More information

AN INTRODUCTION TO TOPIC MODELS

AN INTRODUCTION TO TOPIC MODELS AN INTRODUCTION TO TOPIC MODELS Michael Paul December 4, 2013 600.465 Natural Language Processing Johns Hopkins University Prof. Jason Eisner Making sense of text Suppose you want to learn something about

More information

Introduction to Bayesian inference

Introduction to Bayesian inference Introduction to Bayesian inference Thomas Alexander Brouwer University of Cambridge tab43@cam.ac.uk 17 November 2015 Probabilistic models Describe how data was generated using probability distributions

More information

Generative Clustering, Topic Modeling, & Bayesian Inference

Generative Clustering, Topic Modeling, & Bayesian Inference Generative Clustering, Topic Modeling, & Bayesian Inference INFO-4604, Applied Machine Learning University of Colorado Boulder December 12-14, 2017 Prof. Michael Paul Unsupervised Naïve Bayes Last week

More information

Topic Models. Brandon Malone. February 20, Latent Dirichlet Allocation Success Stories Wrap-up

Topic Models. Brandon Malone. February 20, Latent Dirichlet Allocation Success Stories Wrap-up Much of this material is adapted from Blei 2003. Many of the images were taken from the Internet February 20, 2014 Suppose we have a large number of books. Each is about several unknown topics. How can

More information

Statistical Debugging with Latent Topic Models

Statistical Debugging with Latent Topic Models Statistical Debugging with Latent Topic Models David Andrzejewski, Anne Mulhern, Ben Liblit, Xiaojin Zhu Department of Computer Sciences University of Wisconsin Madison European Conference on Machine Learning,

More information

Gaussian Mixture Model

Gaussian Mixture Model Case Study : Document Retrieval MAP EM, Latent Dirichlet Allocation, Gibbs Sampling Machine Learning/Statistics for Big Data CSE599C/STAT59, University of Washington Emily Fox 0 Emily Fox February 5 th,

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING Text Data: Topic Model Instructor: Yizhou Sun yzsun@cs.ucla.edu December 4, 2017 Methods to be Learnt Vector Data Set Data Sequence Data Text Data Classification Clustering

More information

Study Notes on the Latent Dirichlet Allocation

Study Notes on the Latent Dirichlet Allocation Study Notes on the Latent Dirichlet Allocation Xugang Ye 1. Model Framework A word is an element of dictionary {1,,}. A document is represented by a sequence of words: =(,, ), {1,,}. A corpus is a collection

More information

Lecture 13 : Variational Inference: Mean Field Approximation

Lecture 13 : Variational Inference: Mean Field Approximation 10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 13 : Variational Inference: Mean Field Approximation Lecturer: Willie Neiswanger Scribes: Xupeng Tong, Minxing Liu 1 Problem Setup 1.1

More information

Gibbs Sampling. Héctor Corrada Bravo. University of Maryland, College Park, USA CMSC 644:

Gibbs Sampling. Héctor Corrada Bravo. University of Maryland, College Park, USA CMSC 644: Gibbs Sampling Héctor Corrada Bravo University of Maryland, College Park, USA CMSC 644: 2019 03 27 Latent semantic analysis Documents as mixtures of topics (Hoffman 1999) 1 / 60 Latent semantic analysis

More information

Note for plsa and LDA-Version 1.1

Note for plsa and LDA-Version 1.1 Note for plsa and LDA-Version 1.1 Wayne Xin Zhao March 2, 2011 1 Disclaimer In this part of PLSA, I refer to [4, 5, 1]. In LDA part, I refer to [3, 2]. Due to the limit of my English ability, in some place,

More information

Topic Models. Charles Elkan November 20, 2008

Topic Models. Charles Elkan November 20, 2008 Topic Models Charles Elan elan@cs.ucsd.edu November 20, 2008 Suppose that we have a collection of documents, and we want to find an organization for these, i.e. we want to do unsupervised learning. One

More information

Lecture 22 Exploratory Text Analysis & Topic Models

Lecture 22 Exploratory Text Analysis & Topic Models Lecture 22 Exploratory Text Analysis & Topic Models Intro to NLP, CS585, Fall 2014 http://people.cs.umass.edu/~brenocon/inlp2014/ Brendan O Connor [Some slides borrowed from Michael Paul] 1 Text Corpus

More information

Latent Dirichlet Allocation Introduction/Overview

Latent Dirichlet Allocation Introduction/Overview Latent Dirichlet Allocation Introduction/Overview David Meyer 03.10.2016 David Meyer http://www.1-4-5.net/~dmm/ml/lda_intro.pdf 03.10.2016 Agenda What is Topic Modeling? Parametric vs. Non-Parametric Models

More information

Topic Models and Applications to Short Documents

Topic Models and Applications to Short Documents Topic Models and Applications to Short Documents Dieu-Thu Le Email: dieuthu.le@unitn.it Trento University April 6, 2011 1 / 43 Outline Introduction Latent Dirichlet Allocation Gibbs Sampling Short Text

More information

topic modeling hanna m. wallach

topic modeling hanna m. wallach university of massachusetts amherst wallach@cs.umass.edu Ramona Blei-Gantz Helen Moss (Dave's Grandma) The Next 30 Minutes Motivations and a brief history: Latent semantic analysis Probabilistic latent

More information

Mr. LDA: A Flexible Large Scale Topic Modeling Package using Variational Inference in MapReduce

Mr. LDA: A Flexible Large Scale Topic Modeling Package using Variational Inference in MapReduce Mr. LDA: A Flexible Large Scale Topic Modeling Package using Variational Inference in MapReduce Ke Zhai, Jordan Boyd-Graber, Nima Asadi, and Mohamad Alkhouja Mr. LDA: A Flexible Large Scale Topic Modeling

More information

Latent Dirichlet Allocation

Latent Dirichlet Allocation Latent Dirichlet Allocation 1 Directed Graphical Models William W. Cohen Machine Learning 10-601 2 DGMs: The Burglar Alarm example Node ~ random variable Burglar Earthquake Arcs define form of probability

More information

Efficient Tree-Based Topic Modeling

Efficient Tree-Based Topic Modeling Efficient Tree-Based Topic Modeling Yuening Hu Department of Computer Science University of Maryland, College Park ynhu@cs.umd.edu Abstract Topic modeling with a tree-based prior has been used for a variety

More information

Information retrieval LSI, plsi and LDA. Jian-Yun Nie

Information retrieval LSI, plsi and LDA. Jian-Yun Nie Information retrieval LSI, plsi and LDA Jian-Yun Nie Basics: Eigenvector, Eigenvalue Ref: http://en.wikipedia.org/wiki/eigenvector For a square matrix A: Ax = λx where x is a vector (eigenvector), and

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Yuriy Sverchkov Intelligent Systems Program University of Pittsburgh October 6, 2011 Outline Latent Semantic Analysis (LSA) A quick review Probabilistic LSA (plsa)

More information

Applying hlda to Practical Topic Modeling

Applying hlda to Practical Topic Modeling Joseph Heng lengerfulluse@gmail.com CIST Lab of BUPT March 17, 2013 Outline 1 HLDA Discussion 2 the nested CRP GEM Distribution Dirichlet Distribution Posterior Inference Outline 1 HLDA Discussion 2 the

More information

Distributed ML for DOSNs: giving power back to users

Distributed ML for DOSNs: giving power back to users Distributed ML for DOSNs: giving power back to users Amira Soliman KTH isocial Marie Curie Initial Training Networks Part1 Agenda DOSNs and Machine Learning DIVa: Decentralized Identity Validation for

More information

A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank

A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank A Unified Posterior Regularized Topic Model with Maximum Margin for Learning-to-Rank Shoaib Jameel Shoaib Jameel 1, Wai Lam 2, Steven Schockaert 1, and Lidong Bing 3 1 School of Computer Science and Informatics,

More information

Modeling Environment

Modeling Environment Topic Model Modeling Environment What does it mean to understand/ your environment? Ability to predict Two approaches to ing environment of words and text Latent Semantic Analysis (LSA) Topic Model LSA

More information

Mixture Models and Expectation-Maximization

Mixture Models and Expectation-Maximization Mixture Models and Expectation-Maximiation David M. Blei March 9, 2012 EM for mixtures of multinomials The graphical model for a mixture of multinomials π d x dn N D θ k K How should we fit the parameters?

More information

Lecture 19, November 19, 2012

Lecture 19, November 19, 2012 Machine Learning 0-70/5-78, Fall 0 Latent Space Analysis SVD and Topic Models Eric Xing Lecture 9, November 9, 0 Reading: Tutorial on Topic Model @ ACL Eric Xing @ CMU, 006-0 We are inundated with data

More information

Bayesian Inference for Dirichlet-Multinomials

Bayesian Inference for Dirichlet-Multinomials Bayesian Inference for Dirichlet-Multinomials Mark Johnson Macquarie University Sydney, Australia MLSS Summer School 1 / 50 Random variables and distributed according to notation A probability distribution

More information

Measuring Topic Quality in Latent Dirichlet Allocation

Measuring Topic Quality in Latent Dirichlet Allocation Measuring Topic Quality in Sergei Koltsov Olessia Koltsova Steklov Institute of Mathematics at St. Petersburg Laboratory for Internet Studies, National Research University Higher School of Economics, St.

More information

Dirichlet Enhanced Latent Semantic Analysis

Dirichlet Enhanced Latent Semantic Analysis Dirichlet Enhanced Latent Semantic Analysis Kai Yu Siemens Corporate Technology D-81730 Munich, Germany Kai.Yu@siemens.com Shipeng Yu Institute for Computer Science University of Munich D-80538 Munich,

More information

Latent Dirichlet Bayesian Co-Clustering

Latent Dirichlet Bayesian Co-Clustering Latent Dirichlet Bayesian Co-Clustering Pu Wang 1, Carlotta Domeniconi 1, and athryn Blackmond Laskey 1 Department of Computer Science Department of Systems Engineering and Operations Research George Mason

More information

Web Search and Text Mining. Lecture 16: Topics and Communities

Web Search and Text Mining. Lecture 16: Topics and Communities Web Search and Tet Mining Lecture 16: Topics and Communities Outline Latent Dirichlet Allocation (LDA) Graphical models for social netorks Eploration, discovery, and query-ansering in the contet of the

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

arxiv: v6 [stat.ml] 11 Apr 2017

arxiv: v6 [stat.ml] 11 Apr 2017 Improved Gibbs Sampling Parameter Estimators for LDA Dense Distributions from Sparse Samples: Improved Gibbs Sampling Parameter Estimators for LDA arxiv:1505.02065v6 [stat.ml] 11 Apr 2017 Yannis Papanikolaou

More information

Topic Modeling Using Latent Dirichlet Allocation (LDA)

Topic Modeling Using Latent Dirichlet Allocation (LDA) Topic Modeling Using Latent Dirichlet Allocation (LDA) Porter Jenkins and Mimi Brinberg Penn State University prj3@psu.edu mjb6504@psu.edu October 23, 2017 Porter Jenkins and Mimi Brinberg (PSU) LDA October

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin

More information

LDA with Amortized Inference

LDA with Amortized Inference LDA with Amortied Inference Nanbo Sun Abstract This report describes how to frame Latent Dirichlet Allocation LDA as a Variational Auto- Encoder VAE and use the Amortied Variational Inference AVI to optimie

More information

Probablistic Graphical Models, Spring 2007 Homework 4 Due at the beginning of class on 11/26/07

Probablistic Graphical Models, Spring 2007 Homework 4 Due at the beginning of class on 11/26/07 Probablistic Graphical odels, Spring 2007 Homework 4 Due at the beginning of class on 11/26/07 Instructions There are four questions in this homework. The last question involves some programming which

More information

CS 572: Information Retrieval

CS 572: Information Retrieval CS 572: Information Retrieval Lecture 11: Topic Models Acknowledgments: Some slides were adapted from Chris Manning, and from Thomas Hoffman 1 Plan for next few weeks Project 1: done (submit by Friday).

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Probabilistic Latent Semantic Analysis Denis Helic KTI, TU Graz Jan 16, 2014 Denis Helic (KTI, TU Graz) KDDM1 Jan 16, 2014 1 / 47 Big picture: KDDM

More information

Content-based Recommendation

Content-based Recommendation Content-based Recommendation Suthee Chaidaroon June 13, 2016 Contents 1 Introduction 1 1.1 Matrix Factorization......................... 2 2 slda 2 2.1 Model................................. 3 3 flda 3

More information

Collapsed Variational Bayesian Inference for Hidden Markov Models

Collapsed Variational Bayesian Inference for Hidden Markov Models Collapsed Variational Bayesian Inference for Hidden Markov Models Pengyu Wang, Phil Blunsom Department of Computer Science, University of Oxford International Conference on Artificial Intelligence and

More information

Latent Dirichlet Allocation (LDA)

Latent Dirichlet Allocation (LDA) Latent Dirichlet Allocation (LDA) D. Blei, A. Ng, and M. Jordan. Journal of Machine Learning Research, 3:993-1022, January 2003. Following slides borrowed ant then heavily modified from: Jonathan Huang

More information

PROBABILISTIC LATENT SEMANTIC ANALYSIS

PROBABILISTIC LATENT SEMANTIC ANALYSIS PROBABILISTIC LATENT SEMANTIC ANALYSIS Lingjia Deng Revised from slides of Shuguang Wang Outline Review of previous notes PCA/SVD HITS Latent Semantic Analysis Probabilistic Latent Semantic Analysis Applications

More information

Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details THIS IS AN EARLY DRAFT. YOUR FEEDBACKS ARE HIGHLY APPRECIATED.

Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details THIS IS AN EARLY DRAFT. YOUR FEEDBACKS ARE HIGHLY APPRECIATED. Distributed Gibbs Sampling of Latent Topic Models: The Gritty Details THIS IS AN EARLY DRAFT. YOUR FEEDBACKS ARE HIGHLY APPRECIATED. Yi Wang yi.wang.2005@gmail.com August 2008 Contents Preface 2 2 Latent

More information

Understanding Comments Submitted to FCC on Net Neutrality. Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014

Understanding Comments Submitted to FCC on Net Neutrality. Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014 Understanding Comments Submitted to FCC on Net Neutrality Kevin (Junhui) Mao, Jing Xia, Dennis (Woncheol) Jeong December 12, 2014 Abstract We aim to understand and summarize themes in the 1.65 million

More information

Replicated Softmax: an Undirected Topic Model. Stephen Turner

Replicated Softmax: an Undirected Topic Model. Stephen Turner Replicated Softmax: an Undirected Topic Model Stephen Turner 1. Introduction 2. Replicated Softmax: A Generative Model of Word Counts 3. Evaluating Replicated Softmax as a Generative Model 4. Experimental

More information

Mixed-membership Models (and an introduction to variational inference)

Mixed-membership Models (and an introduction to variational inference) Mixed-membership Models (and an introduction to variational inference) David M. Blei Columbia University November 24, 2015 Introduction We studied mixture models in detail, models that partition data into

More information

Fast Inference and Learning for Modeling Documents with a Deep Boltzmann Machine

Fast Inference and Learning for Modeling Documents with a Deep Boltzmann Machine Fast Inference and Learning for Modeling Documents with a Deep Boltzmann Machine Nitish Srivastava nitish@cs.toronto.edu Ruslan Salahutdinov rsalahu@cs.toronto.edu Geoffrey Hinton hinton@cs.toronto.edu

More information

Applying Latent Dirichlet Allocation to Group Discovery in Large Graphs

Applying Latent Dirichlet Allocation to Group Discovery in Large Graphs Lawrence Livermore National Laboratory Applying Latent Dirichlet Allocation to Group Discovery in Large Graphs Keith Henderson and Tina Eliassi-Rad keith@llnl.gov and eliassi@llnl.gov This work was performed

More information

Online Bayesian Passive-Agressive Learning

Online Bayesian Passive-Agressive Learning Online Bayesian Passive-Agressive Learning International Conference on Machine Learning, 2014 Tianlin Shi Jun Zhu Tsinghua University, China 21 August 2015 Presented by: Kyle Ulrich Introduction Online

More information

Note 1: Varitional Methods for Latent Dirichlet Allocation

Note 1: Varitional Methods for Latent Dirichlet Allocation Technical Note Series Spring 2013 Note 1: Varitional Methods for Latent Dirichlet Allocation Version 1.0 Wayne Xin Zhao batmanfly@gmail.com Disclaimer: The focus of this note was to reorganie the content

More information

Latent Dirichlet Alloca/on

Latent Dirichlet Alloca/on Latent Dirichlet Alloca/on Blei, Ng and Jordan ( 2002 ) Presented by Deepak Santhanam What is Latent Dirichlet Alloca/on? Genera/ve Model for collec/ons of discrete data Data generated by parameters which

More information

Graphical Models for Collaborative Filtering

Graphical Models for Collaborative Filtering Graphical Models for Collaborative Filtering Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Sequence modeling HMM, Kalman Filter, etc.: Similarity: the same graphical model topology,

More information

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis

(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis Summarizing a posterior Given the data and prior the posterior is determined Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests Most of these computations are integrals

More information

Bayesian Nonparametric Models

Bayesian Nonparametric Models Bayesian Nonparametric Models David M. Blei Columbia University December 15, 2015 Introduction We have been looking at models that posit latent structure in high dimensional data. We use the posterior

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

Gentle Introduction to Infinite Gaussian Mixture Modeling

Gentle Introduction to Infinite Gaussian Mixture Modeling Gentle Introduction to Infinite Gaussian Mixture Modeling with an application in neuroscience By Frank Wood Rasmussen, NIPS 1999 Neuroscience Application: Spike Sorting Important in neuroscience and for

More information

Interactive Topic Modeling

Interactive Topic Modeling The 49 th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies Interactive Topic Modeling Yuening Hu, Jordan Boyd-Graber, Brianna Satinoff {ynhu, bsonrisa}@cs.umd.edu,

More information

Pachinko Allocation: DAG-Structured Mixture Models of Topic Correlations

Pachinko Allocation: DAG-Structured Mixture Models of Topic Correlations : DAG-Structured Mixture Models of Topic Correlations Wei Li and Andrew McCallum University of Massachusetts, Dept. of Computer Science {weili,mccallum}@cs.umass.edu Abstract Latent Dirichlet allocation

More information

Mixtures of Multinomials

Mixtures of Multinomials Mixtures of Multinomials Jason D. M. Rennie jrennie@gmail.com September, 25 Abstract We consider two different types of multinomial mixtures, () a wordlevel mixture, and (2) a document-level mixture. We

More information

Latent Dirichlet Allocation Based Multi-Document Summarization

Latent Dirichlet Allocation Based Multi-Document Summarization Latent Dirichlet Allocation Based Multi-Document Summarization Rachit Arora Department of Computer Science and Engineering Indian Institute of Technology Madras Chennai - 600 036, India. rachitar@cse.iitm.ernet.in

More information

Latent Variable View of EM. Sargur Srihari

Latent Variable View of EM. Sargur Srihari Latent Variable View of EM Sargur srihari@cedar.buffalo.edu 1 Examples of latent variables 1. Mixture Model Joint distribution is p(x,z) We don t have values for z 2. Hidden Markov Model A single time

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 7 Approximate

More information

Part 1: Expectation Propagation

Part 1: Expectation Propagation Chalmers Machine Learning Summer School Approximate message passing and biomedicine Part 1: Expectation Propagation Tom Heskes Machine Learning Group, Institute for Computing and Information Sciences Radboud

More information

Kernel Density Topic Models: Visual Topics Without Visual Words

Kernel Density Topic Models: Visual Topics Without Visual Words Kernel Density Topic Models: Visual Topics Without Visual Words Konstantinos Rematas K.U. Leuven ESAT-iMinds krematas@esat.kuleuven.be Mario Fritz Max Planck Institute for Informatics mfrtiz@mpi-inf.mpg.de

More information

Scaling Neighbourhood Methods

Scaling Neighbourhood Methods Quick Recap Scaling Neighbourhood Methods Collaborative Filtering m = #items n = #users Complexity : m * m * n Comparative Scale of Signals ~50 M users ~25 M items Explicit Ratings ~ O(1M) (1 per billion)

More information

Probabilistic Matrix Factorization

Probabilistic Matrix Factorization Probabilistic Matrix Factorization David M. Blei Columbia University November 25, 2015 1 Dyadic data One important type of modern data is dyadic data. Dyadic data are measurements on pairs. The idea is

More information

Language Information Processing, Advanced. Topic Models

Language Information Processing, Advanced. Topic Models Language Information Processing, Advanced Topic Models mcuturi@i.kyoto-u.ac.jp Kyoto University - LIP, Adv. - 2011 1 Today s talk Continue exploring the representation of text as histogram of words. Objective:

More information

Bayesian Methods for Machine Learning

Bayesian Methods for Machine Learning Bayesian Methods for Machine Learning CS 584: Big Data Analytics Material adapted from Radford Neal s tutorial (http://ftp.cs.utoronto.ca/pub/radford/bayes-tut.pdf), Zoubin Ghahramni (http://hunch.net/~coms-4771/zoubin_ghahramani_bayesian_learning.pdf),

More information

COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017

COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017 COMS 4721: Machine Learning for Data Science Lecture 18, 4/4/2017 Prof. John Paisley Department of Electrical Engineering & Data Science Institute Columbia University TOPIC MODELING MODELS FOR TEXT DATA

More information

AUTOMATIC DETECTION OF WORDS NOT SIGNIFICANT TO TOPIC CLASSIFICATION IN LATENT DIRICHLET ALLOCATION

AUTOMATIC DETECTION OF WORDS NOT SIGNIFICANT TO TOPIC CLASSIFICATION IN LATENT DIRICHLET ALLOCATION AUTOMATIC DETECTION OF WORDS NOT SIGNIFICANT TO TOPIC CLASSIFICATION IN LATENT DIRICHLET ALLOCATION By DEBARSHI ROY A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

A Tutorial on Learning with Bayesian Networks

A Tutorial on Learning with Bayesian Networks A utorial on Learning with Bayesian Networks David Heckerman Presented by: Krishna V Chengavalli April 21 2003 Outline Introduction Different Approaches Bayesian Networks Learning Probabilities and Structure

More information

Bayesian Learning. CSL603 - Fall 2017 Narayanan C Krishnan

Bayesian Learning. CSL603 - Fall 2017 Narayanan C Krishnan Bayesian Learning CSL603 - Fall 2017 Narayanan C Krishnan ckn@iitrpr.ac.in Outline Bayes Theorem MAP Learners Bayes optimal classifier Naïve Bayes classifier Example text classification Bayesian networks

More information

Streaming Variational Bayes

Streaming Variational Bayes Streaming Variational Bayes Tamara Broderick, Nicholas Boyd, Andre Wibisono, Ashia C. Wilson, Michael I. Jordan UC Berkeley Discussion led by Miao Liu September 13, 2013 Introduction The SDA-Bayes Framework

More information

Decoupling Sparsity and Smoothness in the Discrete Hierarchical Dirichlet Process

Decoupling Sparsity and Smoothness in the Discrete Hierarchical Dirichlet Process Decoupling Sparsity and Smoothness in the Discrete Hierarchical Dirichlet Process Chong Wang Computer Science Department Princeton University chongw@cs.princeton.edu David M. Blei Computer Science Department

More information

Mixed Membership Matrix Factorization

Mixed Membership Matrix Factorization Mixed Membership Matrix Factorization Lester Mackey 1 David Weiss 2 Michael I. Jordan 1 1 University of California, Berkeley 2 University of Pennsylvania International Conference on Machine Learning, 2010

More information

Modeling User Rating Profiles For Collaborative Filtering

Modeling User Rating Profiles For Collaborative Filtering Modeling User Rating Profiles For Collaborative Filtering Benjamin Marlin Department of Computer Science University of Toronto Toronto, ON, M5S 3H5, CANADA marlin@cs.toronto.edu Abstract In this paper

More information

Gaussian Models

Gaussian Models Gaussian Models ddebarr@uw.edu 2016-04-28 Agenda Introduction Gaussian Discriminant Analysis Inference Linear Gaussian Systems The Wishart Distribution Inferring Parameters Introduction Gaussian Density

More information

Non-parametric Clustering with Dirichlet Processes

Non-parametric Clustering with Dirichlet Processes Non-parametric Clustering with Dirichlet Processes Timothy Burns SUNY at Buffalo Mar. 31 2009 T. Burns (SUNY at Buffalo) Non-parametric Clustering with Dirichlet Processes Mar. 31 2009 1 / 24 Introduction

More information

Present and Future of Text Modeling

Present and Future of Text Modeling Present and Future of Text Modeling Daichi Mochihashi NTT Communication Science Laboratories daichi@cslab.kecl.ntt.co.jp T-FaNT2, University of Tokyo 2008-2-13 (Wed) Present and Future of Text Modeling

More information

Time-Sensitive Dirichlet Process Mixture Models

Time-Sensitive Dirichlet Process Mixture Models Time-Sensitive Dirichlet Process Mixture Models Xiaojin Zhu Zoubin Ghahramani John Lafferty May 25 CMU-CALD-5-4 School of Computer Science Carnegie Mellon University Pittsburgh, PA 523 Abstract We introduce

More information

Topic Discovery Project Report

Topic Discovery Project Report Topic Discovery Project Report Shunyu Yao and Xingjiang Yu IIIS, Tsinghua University {yao-sy15, yu-xj15}@mails.tsinghua.edu.cn Abstract In this report we present our implementations of topic discovery

More information

Probabilistic Topic Models Tutorial: COMAD 2011

Probabilistic Topic Models Tutorial: COMAD 2011 Probabilistic Topic Models Tutorial: COMAD 2011 Indrajit Bhattacharya Assistant Professor Dept of Computer Sc. & Automation Indian Institute Of Science, Bangalore My Background Interests Topic Models Probabilistic

More information

Hierarchical Bayesian Nonparametrics

Hierarchical Bayesian Nonparametrics Hierarchical Bayesian Nonparametrics Micha Elsner April 11, 2013 2 For next time We ll tackle a paper: Green, de Marneffe, Bauer and Manning: Multiword Expression Identification with Tree Substitution

More information

Recent Advances in Bayesian Inference Techniques

Recent Advances in Bayesian Inference Techniques Recent Advances in Bayesian Inference Techniques Christopher M. Bishop Microsoft Research, Cambridge, U.K. research.microsoft.com/~cmbishop SIAM Conference on Data Mining, April 2004 Abstract Bayesian

More information

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text

Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text Yi Zhang Machine Learning Department Carnegie Mellon University yizhang1@cs.cmu.edu Jeff Schneider The Robotics Institute

More information